Watch a Robot Skillfully Navigate Google DeepMind Offices with Gemini Technology

Generative AI is making significant strides in the field of robotics, showcasing a variety of applications such as natural language processing, robotic learning, no-code programming, and design innovation. This week, Google's DeepMind Robotics team is highlighting an exciting intersection of these domains: navigation.

In their latest research paper titled “Mobility VLA: Multimodal Instruction Navigation with Long-Context VLMs and Topological Graphs,” the team illustrates how they have implemented Google Gemini 1.5 Pro to enable a robot to understand commands and navigate effectively around an office setting. Notably, DeepMind has utilized some of the Everyday Robots that were part of a project halted last year amid broader layoffs.

In a series of engaging demonstration videos, DeepMind employees initiate interactions with a smart assistant-like prompt: “OK, Robot.” They then request the robot to carry out various tasks within a spacious 9,000-square-foot office environment.

In one instance, a Googler instructs the robot to take them to a place where they can draw. “OK,” the robot replies, sporting a cheerful yellow bowtie, “give me a minute. Thinking with Gemini …” The robot promptly guides the user to a wall-sized whiteboard. In another example, a different individual directs the robot to follow instructions displayed on the whiteboard. A straightforward map directs the robot to the “Blue Area,” and after a brief moment of contemplation, the robot opts for a longer route, ultimately arriving at a robotics testing area. “I’ve successfully followed the directions on the whiteboard,” it declares with a confidence that many humans would envy.

Before these demonstrations, the robots became acclimated to their environment through a process called “Multimodal Instruction Navigation with demonstration Tours (MINT).” This involves guiding the robot through the office while verbally identifying various landmarks. The team then applies hierarchical Vision-Language-Action (VLA) techniques, which merge environmental awareness with common-sense reasoning. By integrating these approaches, the robot gains the ability to respond to written and drawn commands, along with hand gestures.
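At a high level, the system pairs a long-context VLM stage (which matches the user's instruction against frames from the demonstration tour) with classical planning over a topological graph of those frames. A minimal sketch of that two-stage idea, with hypothetical location labels and a keyword stub standing in for the Gemini call:

```python
from collections import deque

# Topological graph built from a demonstration tour: nodes are tour
# frames, edges connect physically adjacent locations. The labels here
# are hypothetical stand-ins for frames of an office walkthrough.
TOUR_GRAPH = {
    "lobby":      ["hallway"],
    "hallway":    ["lobby", "whiteboard", "kitchen"],
    "whiteboard": ["hallway", "blue_area"],
    "kitchen":    ["hallway"],
    "blue_area":  ["whiteboard"],
}

def pick_goal_frame(instruction: str) -> str:
    """Stub for the VLM stage: in Mobility VLA, Gemini 1.5 Pro picks the
    goal frame from the tour video given a multimodal instruction. This
    keyword lookup is purely illustrative."""
    keywords = {"draw": "whiteboard", "blue": "blue_area", "coffee": "kitchen"}
    for word, frame in keywords.items():
        if word in instruction.lower():
            return frame
    return "lobby"

def shortest_path(graph, start, goal):
    """Breadth-first search over the topological graph; a low-level
    policy would then drive the robot waypoint to waypoint."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in graph[path[-1]]:
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None

goal = pick_goal_frame("Take me somewhere I can draw")
print(shortest_path(TOUR_GRAPH, "lobby", goal))
# → ['lobby', 'hallway', 'whiteboard']
```

The real system operates on video frames rather than named rooms, but the split is the same: the expensive multimodal reasoning only has to choose a goal node, while the navigation itself is cheap graph search.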

Google reports that the robot achieved a success rate of roughly 90% across interactions with more than 50 employees, underscoring the effectiveness of this navigation system.