Google Unveils Project Astra at I/O 2024: AI Takes Center Stage Again
When Google introduced its Duplex voice assistant technology at its 2018 developer conference, it sparked both admiration and concern. At I/O 2024, the company stirred similar reactions by unveiling Project Astra, a new AI application.
Before the keynote began, Google teased Project Astra on social media with a video showcasing a camera-based AI app. During the keynote, DeepMind CEO Demis Hassabis shared that his team has "always wanted to develop universal AI agents that can be helpful in everyday life," positioning Project Astra as a significant step in that direction.
What is Project Astra?
In a video presented during a media briefing, Project Astra appeared as an interactive app featuring a viewfinder interface. Users could point their phone’s camera at various objects in an office while speaking prompts like, "Tell me when you see something that makes sound." When a speaker was identified, Gemini responded, "I see a speaker, which makes sound."
The user then inquired, "What is that part of the speaker called?" to which Gemini accurately replied, "That is the tweeter. It produces high-frequency sounds."
In a single-take demonstration, the tester asked Gemini for "a creative alliteration about these," while pointing to a cup of crayons. Gemini responded, "Creative crayons color cheerfully. They certainly craft colorful creations."
Are Those Project Astra Glasses? Is Google Glass Making a Comeback?
The video further demonstrated Gemini's capabilities in identifying software code and determining the user's neighborhood based on the view outside. Impressively, when asked, "Do you remember where you saw my glasses?" Gemini accurately replied, locating them on a desk near a red apple, even though they were out of frame.
After finding the glasses, the tester put them on, and the video shifted to the glasses' point of view. Using an onboard camera, the glasses scanned the surroundings, including a diagram on a whiteboard. When the tester asked, "What can I add here to make this system faster?" an onscreen waveform indicated that Gemini was listening, and it then responded, "Adding a cache between the server and database could improve speed."
In a playful exchange, when asked, "What does this remind you of?" while looking at doodled cats, Astra replied, "Schrödinger's cat." Finally, when presented with a plush tiger next to a golden retriever, Astra humorously suggested "Golden stripes" as a band name for the duo.
How Does Project Astra Work?
Project Astra processes visual information in real time while retaining context from earlier in the session. As Hassabis explained, these AI agents are designed to process information efficiently by continuously encoding video frames, combining video and speech inputs into a coherent timeline, and caching information for quick recall.
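Google has not published how Astra implements this, so the following Python sketch is only an illustration of the general idea Hassabis described: merging video and speech events into a single time-ordered timeline and keeping a bounded cache of recent events that can be searched later. Every name here (Event, build_timeline, recall) is hypothetical, and short text strings stand in for encoded video frames and speech transcripts.

```python
import heapq
from collections import deque
from dataclasses import dataclass
from typing import Deque, Iterable, Optional


@dataclass(frozen=True)
class Event:
    timestamp: float  # seconds since the session started
    kind: str         # "frame" or "speech"
    content: str      # stand-in for an encoded frame or a transcript


def build_timeline(frames: Iterable[Event], speech: Iterable[Event],
                   max_events: int = 1000) -> Deque[Event]:
    """Merge two time-sorted streams into one timeline, keeping only
    the most recent max_events entries (a simple bounded cache)."""
    merged = heapq.merge(frames, speech, key=lambda e: e.timestamp)
    return deque(merged, maxlen=max_events)


def recall(timeline: Deque[Event], keyword: str) -> Optional[Event]:
    """Return the most recent cached frame whose description mentions
    the keyword, or None if nothing in the cache matches."""
    for event in reversed(timeline):
        if event.kind == "frame" and keyword in event.content:
            return event
    return None


# Hypothetical session loosely modeled on the demo described above.
frames = [
    Event(1.0, "frame", "speaker on shelf"),
    Event(4.0, "frame", "glasses on desk near a red apple"),
    Event(9.0, "frame", "whiteboard diagram"),
]
speech = [
    Event(2.5, "speech", "What is that part of the speaker called?"),
    Event(10.0, "speech", "Do you remember where you saw my glasses?"),
]

timeline = build_timeline(frames, speech)
hit = recall(timeline, "glasses")
print(hit.content if hit else "not found")
```

In this toy version, answering the glasses question means walking the cache backward from the most recent event until a matching frame turns up, which is why the object can be "remembered" even after it has left the frame. A real system would presumably search learned representations rather than keywords, but that detail is not something the keynote spelled out.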
Astra's responses in the demo were notably fast, a feat that, as Hassabis noted, involves overcoming significant engineering challenges in conversational AI.
Google has also been working on the vocal expressiveness of its AI systems, refining their intonation to make dialogue sound more human, reminiscent of Duplex's subtle verbal cues.
When Will Project Astra Be Available?
Currently, Project Astra is an early-stage feature without a specific launch timeline. However, Hassabis mentioned that future iterations of these assistants could be accessed "through your phone or glasses." While it's unclear if these glasses will be a new product or a successor to Google Glass, he confirmed that "some of these capabilities are coming to Google products, like the Gemini app, later this year."