Google unveiled its eagerly awaited artificial intelligence system, Gemini, on Wednesday, citing benchmark results that suggest it could rival OpenAI’s leading GPT-4 model in reasoning abilities. However, the launch was quickly met with criticism that Google had exaggerated Gemini’s capabilities.
In a polished video demonstration, Google showcased Gemini interacting with visual data, using a camera positioned above a desk to engage in problem-solving while a human assistant manipulated various objects. This presentation suggested that Gemini could act as an advanced digital assistant, capable of nuanced conversations and support with everyday tasks.
Despite the enthusiasm, tech experts scrutinizing the technology behind Gemini have identified potential shortcomings. Google is releasing Gemini in three versions: Gemini Nano, Gemini Pro, and Gemini Ultra. Early reviews of the mid-range Pro version have raised concerns, indicating it struggles with tasks that should be manageable for a cutting-edge AI system.
“I’m extremely disappointed with Gemini Pro on Bard,” remarked Victor de Lucca, an early tester, highlighting its failure to accurately list the 2023 Oscar winners. “It still gives very, very bad results to questions that shouldn’t be hard anymore with RAG.”
Others noted inconsistencies between Google’s benchmark claims and the actual capabilities of the Pro version. Developer Nick Dobos questioned the benchmarks themselves, asking in a widely shared post, “Google Gemini Ultra is only 4% better…using different prompts versus GPT-4-0613?” and suggesting the comparison may be misleading.
The video demonstration also faced scrutiny after a Google spokesperson confirmed to Bloomberg that it was pre-recorded and narrated, rather than a live interaction, raising questions about its authenticity.
This controversy highlights the challenges Google encounters in marketing AI to consumers. While tech enthusiasts analyze benchmark data, the wider public is often swayed by inspirational videos that promise transformative experiences.
Such disconnects are not new; in 2016, for instance, Microsoft’s Tay chatbot was taken offline within about a day of its launch after it began posting offensive content it had picked up from Twitter users. Nor is this the first time Google Bard has been criticized for falling short of expectations. A media report in September noted that Bard continued to struggle despite significant updates.
Google aims to recover swiftly, pledging to make Gemini more accessible to developers and researchers for extensive evaluation. However, the rocky rollout indicates that the tech giant needs to address several challenges to ensure its AI assistant lives up to its ambitious promise.