Nvidia celebrates its 2024 AI research milestones

Text-prompt 3D modeling, advanced motion reconstruction, and more

A demo of Nvidia LATTE3D, a model that converts text prompts into 3D models. The prompt in this case is "Cinnamon spice latte with whipped cream."

Yesterday, Nvidia Research published a blog post highlighting its numerous AI research advancements throughout 2024. Some of these advancements are pretty typical generative AI fare. Still, some are a little more interesting than simply generating content with other people's copyrighted material. Examples include "StormCast" and "MaskedMimic," which handle more advanced weather forecasting and motion inpainting (reconstructing full motion from partially visible motion) for robotics, respectively.

There's also "GluFormer," which uses AI to predict blood sugar levels up to four years out, though it does require past glucose monitoring data. GluFormer also makes it easier to determine how dietary changes will impact long-term blood sugar, with studies indicating high accuracy for people with conditions up to and including diabetes.

Some other touted improvements are slightly less impressive and more ethically dubious than the rest of the generative AI space. However, they do still show the evolution of technology. "ConsiStory" allows for multiple AI image prompts with a consistent subject, improving the utility for those trying to make something narratively consistent with these tools.

Meanwhile, "Edify3D" and "LATTE3D" are generative AI tools for easily creating 3D models. Existing 3D modelers aren't huge fans of this. However, they point out that AI retopology and UV mapping would be pretty useful to existing 3D art workloads without removing all the fun and/or billable labor of making the models yourself. There's also "Fugatto," a generative AI model geared toward creating new sound files (including music) or modifying existing ones (such as removing background music).

Finally, toward the end of its highlight blog post, Nvidia Research summarizes numerous improvements and benchmarking victories. These victories include "Hydra-MDP," an autonomous driving framework that won the Autonomous Grand Challenge at CVPR 2024, Nvidia Blackwell's leading performance in MLPerf industry benchmarks, and "FoundationPose," which obtained first place on the BOP leaderboard for model-based pose estimation of unseen objects.

In its original blog post, Nvidia links relevant research papers for more detailed overviews of these advancements.
