Nvidia Unveils Project GR00T: A Cutting-Edge Multimodal AI for the Humanoids of Tomorrow

Nvidia has launched Project GR00T, a multimodal foundation model intended to serve as the general-purpose AI behind future humanoid robots.

Unveiled during the GTC conference at the San Jose McEnery Convention Center, Project GR00T is a general-purpose foundation model that lets humanoid robots take input from text, speech, video, and live demonstrations and turn it into specific actions. The project builds on Nvidia's Isaac Robotics Platform, including the new Isaac Lab for reinforcement learning.

“Building foundation models for general humanoid robots is one of the most exciting challenges in AI today,” stated Nvidia CEO Jensen Huang. He emphasized that the convergence of enabling technologies offers roboticists worldwide the potential for significant advancements in artificial general robotics.

To help enterprises put GR00T to use, Nvidia has also introduced Jetson Thor, a computing platform built for humanoid robots. The company also announced significant upgrades for developing AI-powered industrial manipulation arms and robots that can navigate unstructured environments.

What to Expect from Nvidia Project GR00T?

Although the name evokes Marvel's Groot, it actually stands for Generalist Robot 00 Technology. According to Nvidia, GR00T is designed to understand natural language text, speech, video, and live demonstrations, so that robots running it can emulate human movements, learning coordination, dexterity, and other skills needed to navigate and interact with the real world.

This not only expands what humanoid robots can do but also simplifies development and deployment. Because the model accepts inputs such as text and demonstrations, anyone with access to it can direct a robot.

In his GTC keynote, Huang demonstrated various tasks performed by GR00T-powered humanoid robots from companies such as Agility Robotics, Apptronik, Fourier Intelligence, and Unitree Robotics. Deepu Talla, who briefed journalists on GR00T, indicated that the project capitalizes on the latest advancements in generative AI and transformers, although specifics on its full range of capabilities remain limited for now.

OpenAI, a leader in generative AI, is also venturing into embodied AI, supporting startups like 1X Technologies and Figure. Recently, Figure showcased one of its robots executing routine chores, including picking up litter, using a large vision-language model developed by OpenAI.

Project GR00T serves as the intelligence behind humanoid robots, equipping them with the ability to learn skills for various useful tasks.

Responding to a question from the media, Talla said that details of the internal architecture are not yet available and that more about GR00T's capabilities will be shared later. For now, only select humanoid developers have early access to the model, but Nvidia plans to open access to additional developers soon.

To ensure humanoid robots can run complex multimodal models like GR00T, Nvidia has introduced the Jetson Thor computing platform. Built on the Thor SoC, the system combines a high-performance CPU cluster with a next-generation GPU based on Nvidia's Blackwell architecture and delivers 800 teraflops of 8-bit floating-point AI performance. Talla said its GPU performance is eight times that of the previous-generation Jetson Orin and its CPU performance is 2.6 times higher.

New Isaac Robotics Tools at the Core of GR00T

Nvidia is utilizing its Isaac Robotics Platform to bring Project GR00T to fruition, offering developers a comprehensive, end-to-end framework for designing, simulating, and deploying AI-powered robots.

The project uses the new Isaac Lab, a GPU-accelerated virtual environment that runs simulations in parallel, to train and test the model. In addition, the OSMO compute orchestration service manages training and simulation workloads across Nvidia DGX and OVX systems.

The Isaac Robotics Platform is also expanding its offerings with two targeted solutions: Isaac Manipulator and Isaac Perceptor.

Isaac Manipulator provides GPU-accelerated libraries and foundation models to enhance robotic arms with advanced motion and dexterity. This includes models for detecting objects, estimating their 6D pose, tracking, and making detailed predictions for grasping.

Isaac Perceptor, for its part, helps robots navigate unstructured environments using multi-camera, 360-degree vision driven by AI algorithms for 3D perception and awareness of their surroundings. Nvidia is making the technology available through its Nova Orin DevKit and is working with partners such as ArcBest, BYD, and KION Group to improve their autonomous mobile robots in manufacturing and fulfillment.

“Integrating the Isaac Perceptor platform into our Vaux Smart Autonomy AMR forklifts and reach trucks enables improved perception, semantic-aware navigation, and 3D mapping for obstacle detection in material handling at warehouses and distribution centers,” said Michael Newcity, Chief Innovation Officer at ArcBest.

The new capabilities of the Isaac platform are expected to launch in the second quarter of this year, while Project GR00T remains in early access. Nvidia is accepting applications from additional humanoid developers, with plans for broader public release yet to be announced.
