Agility: Harnessing Large Language Models for Enhanced Communication with Humanoid Robots

Over the past year, I've engaged in numerous discussions with robotics experts about the impact of generative AI and large language models (LLMs) on the field. It has become evident that these technologies are poised to transform how robots communicate, learn, perceive their surroundings, and are programmed.

Consequently, leading universities, research institutions, and innovative companies are investigating the best ways to harness these AI-powered solutions. One notable player is Oregon-based startup Agility Robotics, which has been experimenting with the technology on its bipedal robot, Digit.

Today, Agility released a brief video on its social channels to highlight some of this work. The company states, “We were intrigued to explore what integrating this technology into Digit could achieve.” The demonstration featured a series of numbered towers of varying heights and three distinct boxes. Digit was given information about the scenario but no specific task instructions, only natural language commands of varying complexity to test its ability to carry them out.
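To make the idea concrete, here is a minimal, illustrative sketch (in Python) of how an LLM could turn a scene description and a natural-language command into a structured plan for a robot controller. This is not Agility's actual stack: the LLM call is mocked out, and the function names, plan format, and scene layout are assumptions for illustration only.

```python
# Illustrative sketch (not Agility's system): an LLM takes a scene
# description plus a natural-language command and returns a structured
# plan. The LLM call is mocked; names and formats are assumptions.
from dataclasses import dataclass


@dataclass
class Step:
    action: str   # e.g. "pick" or "place"
    target: str   # object or location name


SCENE = {
    "towers": {"tower_1": 0.5, "tower_2": 1.2, "tower_3": 0.9},  # heights in meters
    "boxes": ["red_box", "green_box", "blue_box"],
}


def mock_llm_plan(command: str, scene: dict) -> list[Step]:
    """Stand-in for a language-model call.

    A real system would send `command` and a serialized `scene` to an LLM
    and parse its structured reply; here the plan is hard-coded so the
    example runs end to end.
    """
    tallest = max(scene["towers"], key=scene["towers"].get)
    return [Step("pick", "red_box"), Step("place", tallest)]


def execute(plan: list[Step]) -> None:
    """Stub executor: a real robot would map each step to motion primitives."""
    for step in plan:
        print(f"{step.action} -> {step.target}")


if __name__ == "__main__":
    command = (
        "Pick up the box that's the color of Darth Vader's lightsaber "
        "and move it to the tallest tower."
    )
    execute(mock_llm_plan(command, SCENE))
```

In a real deployment, the mocked function would be replaced by an actual language-model call, and the executor would hand each step to the robot's perception and motion-planning stack.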

In the video, Digit is instructed to pick up the box that is the color of “Darth Vader’s lightsaber” and move it to the tallest tower. The execution isn't instantaneous; it unfolds slowly and deliberately, as is typical of an early-stage demonstration. Nevertheless, the robot successfully completes the task.

Agility remarks, “Our innovation team created this interactive demo to illustrate how LLMs can enhance our robots' versatility and speed up their deployment. This demo allows users to communicate with Digit using natural language, offering a glimpse of the future of robotics.”

Natural language interaction is a crucial potential application for this technology, along with low- and no-code programming capabilities. During my discussion at the Disrupt panel, Gill Pratt shared insights on how the Toyota Research Institute employs generative AI to accelerate robotic learning. "We’ve discovered a method to leverage modern generative AI techniques that allow human demonstration of both position and force to teach a robot using just a few examples. Our code remains unchanged, and this approach utilizes what we call diffusion policy, developed in collaboration with Columbia and MIT. So far, we have taught 60 different skills.”
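For context on that last term: a diffusion policy generates robot actions much as an image diffusion model generates pictures, starting from random noise and iteratively denoising it, conditioned on what the robot currently observes. The sketch below is a toy illustration of that loop only, not TRI's implementation; the noise predictor is a placeholder, and the dimensions and step counts are arbitrary assumptions.

```python
# Toy sketch of the core loop behind a diffusion policy (assumed details,
# not TRI's code): an action trajectory starts as Gaussian noise and is
# iteratively refined ("denoised") conditioned on the robot's observation.
import numpy as np

ACTION_DIM = 7      # e.g. joint targets for a 7-DoF arm (assumption)
HORIZON = 16        # how many future actions are predicted at once
DENOISE_STEPS = 50  # number of reverse-diffusion iterations


def predict_noise(noisy_actions, observation, step):
    """Placeholder for the learned noise-prediction network.

    A real diffusion policy trains a neural network conditioned on camera
    images and proprioception; here we simply treat the current sample as
    the noise estimate so the loop runs end to end.
    """
    return noisy_actions


def sample_action_sequence(observation, rng):
    """Reverse diffusion: start from pure noise and refine step by step."""
    actions = rng.standard_normal((HORIZON, ACTION_DIM))
    for step in reversed(range(DENOISE_STEPS)):
        noise_estimate = predict_noise(actions, observation, step)
        # Simplified update rule: remove a fraction of the estimated noise
        # each step (real implementations use a proper noise schedule).
        actions = actions - noise_estimate / DENOISE_STEPS
    return actions


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    observation = rng.standard_normal(32)   # stand-in for sensor input
    plan = sample_action_sequence(observation, rng)
    print(plan.shape)  # (16, 7): a short "chunk" of actions to execute
```

The appeal Pratt describes is that the same training recipe works across skills: each new behavior comes from a handful of human demonstrations rather than new code.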

Additionally, Daniela Rus from MIT CSAIL mentioned, “Generative AI is proving to be quite effective in addressing motion planning challenges. It can provide significantly faster and more fluid, human-like control solutions compared to traditional model predictive approaches. This is revolutionary, as future robots will be designed to exhibit less robotic behavior and more fluid, human-like motions.”

The possibilities for applications here are vast and promising. Digit, an advanced commercially available robot that is already being tested in Amazon fulfillment centers and other practical environments, is a prime candidate for this technology. As robots adapt to working alongside humans, the ability to listen and respond effectively will be essential to their success.
