This week, Google unveiled Gemini, its new flagship generative AI model, which is designed to power a range of Google products and services, including Bard, its competitor to ChatGPT. In its blog posts and press releases, Google praised Gemini’s architecture and capabilities, asserting that it matches or outperforms other leading generative AI models such as OpenAI’s GPT-4. Early user feedback suggests otherwise.
A “lite” version of Gemini, known as Gemini Pro, began its rollout to Bard yesterday, and it wasn’t long before users took to X (formerly Twitter) to express their dissatisfaction. The model frequently struggles with basic factual accuracy, as demonstrated by its incorrect claim that Brendan Gleeson won the Best Actor Oscar last year instead of Brendan Fraser, the actual recipient.
When I posed the same question, Gemini Pro, rather surprisingly, gave a different incorrect answer. It also stumbled on other 2023 Oscar categories: for the record, “Navalny” won Best Documentary Feature and “All Quiet on the Western Front” won Best International Feature Film, and the model made further mistakes in the Best Adapted Screenplay and Best Animated Feature categories. That’s a significant number of errors.
Science fiction writer Charlie Stross documented further inaccuracies in a recent blog post, noting that the model falsely stated he had contributed to the Linux kernel, something he has never done.
Translation also seems to be a weak point for Gemini Pro. When I asked for a six-letter word in French, the model returned a seven-letter word, which is consistent with reports of its subpar multilingual capabilities.
Curious about its ability to summarize current news, I asked Gemini Pro for an update on a current news story. Instead of delivering a concise recap, it suggested that I search for the information myself, which is not much help. ChatGPT, by contrast, provided a bullet-point summary complete with citations from news sources.
When prompted about the ongoing conflict in Ukraine, Gemini Pro did give a summary, but the information was more than a month out of date.
Google has also touted Gemini’s improved coding abilities. Yet posts from users online indicate that Gemini Pro struggles with fundamental coding tasks, such as writing basic Python functions.
Furthermore, like other generative AI models, Gemini Pro is not immune to “jailbreaks,” prompts crafted to circumvent a model’s safety filters and steer it into sensitive territory. AI security researchers at Robust Intelligence were able to jailbreak Gemini Pro and get it to suggest unethical actions, such as stealing from a charity.
To be fair, Gemini Pro is not the most capable version of Gemini; that title goes to Gemini Ultra, which is expected to be released next year. Google has compared Gemini Pro’s performance to that of the older GPT-3.5 model, which has been around for about a year, rather than to GPT-4. Still, despite Google’s assurances of improved reasoning, planning, and comprehension, Gemini Pro clearly needs significant work in several areas.