Nvidia has revealed GR00T (Generalist Robot 00 Technology), a new artificial intelligence model designed to give robots the ability to carry out tasks without requiring each action to be pre-programmed.
The new model is part of a “moonshot mission” from Nvidia’s GEAR lab to solve the problem of embodied AGI in the physical world. That is essentially giving legs to a future superintelligence so it isn't stuck in a black box.
Nvidia CEO Jensen Huang announced GR00T at GTC, where he brought a pair of cute Disney-made robots on stage that had learnt to walk using Nvidia AI.
Huang said GR00T was part of a new industrial revolution and that being general purpose will allow robots to learn tasks by watching humans. In turn, this means they can repeat those tasks on an assembly line or in a space that is hazardous to humans.
What is embodied AI?
GR00T and other projects like it are part of the solution to the Embodied AI problem. That is a growing area of research aimed at giving foundation AI models like ChatGPT or Google Gemini the ability to interact with the real world.
Some of this work takes place in virtual environments, and Google's recent SIMA gaming agent is one example. But the field also includes bringing these models into the physical world.
We are already seeing companies building in this space, creating a variety of robots that could one day operate unsupervised in hazardous environments, doing jobs too dangerous, or too boring, for humans.
Unicorn startup Figure has partnered with OpenAI to give its Figure 01 humanoid robot the ability to both reason and learn as it interacts with humans.
Google’s DeepMind is working on training robots to think for themselves, giving them the ability to determine movements, actions and how to complete a task.
Nvidia’s plan is to create the underlying framework that could be used by a wide range of companies building humanoid robots.
What is GR00T?
GR00T in this scenario is an AI framework, not a living tree travelling the universe with a talking raccoon. Developed by the GEAR lab, it is built on Nvidia's deep technology stack, which includes other models for training, scaling and powering AI models on the move.
Jim Fan, GEAR Group lead for Nvidia, wrote on X that GR00T will “enable a robot to understand multimodal instructions, such as language, video, and demonstration, and perform a variety of useful tasks.”
He said the team is collaborating with a number of leading humanoid robot companies so that GR00T can work across a range of body types and ecosystems.
Yuke Zhu, co-lead of GEAR with Fan, wrote on X: “GR00T will enable the robots to follow natural language instructions and learn new skills from human videos and demonstrations.”
The goal is to create generalist robots with both a versatile body and an intelligent mind. This will allow millions of these bots to be deployed to real-world tasks in the future without having to programme or specifically train each machine for every task.