Google is making significant strides in generative AI, recently announcing Gemini Live, a voice mode for its Gemini AI model that is available through the Gemini mobile app. The feature lets users hold a fluid, conversational dialogue and even interrupt the AI, which responds in a natural, human-like voice and cadence. As Google stated on X, “You can now have a free-flowing conversation and change topics like a regular phone call.”
Gemini Live is currently available in English in the Google Gemini app for Android devices as part of the $19.99-per-month Gemini Advanced subscription, with an iOS version and additional language support planned for the coming weeks.
OpenAI showcased a similar feature, “Advanced Voice Mode” for ChatGPT, earlier this year, but has since rolled it out only selectively after internal security tests identified potential risks, such as the model mimicking a user's voice without consent, a vulnerability that could be exploited for fraud.
How is Google addressing potential risks associated with this technology? Details remain sparse, but we have reached out to the company for more information.
So, what can users expect from Gemini Live? Google describes it as a tool for natural conversation, ideal for brainstorming, preparing for important discussions, or casual chatting. The feature responds and adapts in real time and can operate hands-free, allowing users to keep talking while their device is locked or while other apps are running.
Moreover, Google has integrated the Gemini AI model into the Android user experience, enhancing context-aware assistance. Users can access Gemini by long-pressing the power button or saying, “Hey Google,” enabling the AI to engage with on-screen content, such as providing information about YouTube videos or generating restaurant recommendations to add directly to Google Maps.
Sissie Hsiao, Vice President and General Manager of Gemini Experiences and Google Assistant, highlighted that the evolution of AI is reshaping personal assistance, making Gemini a more intuitive and conversational partner for managing complex tasks.