Engaging with the tech and AI community on X (formerly Twitter) this week has provided valuable insights into the capabilities and shortcomings of Google’s latest AI chatbot, Gemini.
Tech professionals, influencers, and writers have shared screenshots of their interactions with Gemini, highlighting instances of bizarre, inaccurate, and ahistorical image generation. Critics argue that these outputs often lean into notions of diversity and "wokeness."
Just before this article was published, Jack Krawczyk, Google’s Senior Director of Product, responded on X, acknowledging that the company is aware of the inaccuracies in Gemini's historical image generation and is actively working to address them.
Krawczyk stated:
“We are aware that Gemini is offering inaccuracies in some historical image generation depictions, and we are working to fix this immediately. As part of our AI principles, we design our image generation capabilities to reflect our global user base and take representation and bias seriously. We will continue to do this for open-ended prompts. Historical contexts require more nuance, and we will improve our systems accordingly. Thank you for your feedback, and please keep it coming!”
Gemini was unveiled last year amid considerable anticipation, with Google positioning it as a leading AI model and a potential rival to OpenAI’s GPT-4, which still dominates most third-party benchmarks. However, independent evaluations found that Gemini underperformed even OpenAI’s older GPT-3.5 model, prompting Google to release more capable versions, Gemini Advanced and Gemini 1.5, earlier this year and to retire its previous Bard chatbot.
Despite these updates, Gemini now faces criticism for refusing to generate certain historical imagery, such as depictions of German soldiers from the 1930s, and for inserting Native Americans and darker-skinned individuals into settings where their presence is historically inaccurate. The model appears, for instance, to mistakenly emphasize diversity when depicting earlier European cultures.
Users have expressed concerns about Gemini's perceived adherence to “wokeness,” a term that has evolved from denoting awareness of racial inequality in the U.S. to being used pejoratively to critique organizations that seem overly politically correct.
Interestingly, some users have noted real-time adjustments in Gemini’s outputs, suggesting that Google is actively refining its image generation capabilities. A Google spokesperson reiterated Krawczyk’s commitment to improvement, acknowledging the need for more accurate depictions while emphasizing the importance of diversity.
Yann LeCun, Meta’s chief AI scientist, highlighted a concerning instance in which Gemini refused to generate an image of the 1989 Tiananmen Square protests. LeCun argued that such omissions underscore the need for open-source AI, which gives users broader control over model outputs.
The scrutiny surrounding Gemini's image outputs underscores a larger debate about how AI should navigate sensitive topics, including diversity, historical injustices, and oppression.
Google has encountered similar controversies before, such as the 2015 incident in which Google Photos’ image-labeling algorithm mistakenly tagged photos of Black people as gorillas. The 2017 firing of engineer James Damore over his criticisms of Google’s diversity practices further illustrates the challenges tech companies face in these discussions.
Other companies have also stumbled here: Microsoft’s AI chatbot Tay, for example, had to be taken offline in 2016 after users goaded it into producing harmful and racist outputs.
In attempting to avoid such blunders, Google took a restrictive approach with Gemini, and that approach has now drawn backlash of its own for distorting history in favor of modern sensibilities, inviting comparisons to George Orwell's 1984, in which an authoritarian regime suppresses the truth.
Similar criticisms have followed OpenAI’s ChatGPT, with users attempting to "jailbreak" the system to elicit restricted information, reflecting the ongoing tension between the desire for free expression and the need to mitigate harmful content.
AI developers find themselves in a precarious position: the restrictions meant to keep outputs acceptable draw backlash from users who feel those constraints produce historical inaccuracies. This tension has fueled calls for open-source models that grant individuals more autonomy, while also raising concerns about the risks of unrestricted content.
The surge of generative AI only intensifies the debate over freedom of expression versus the need to prevent socially damaging behavior. As the technology advances, the divide over its role in society will likely continue to spark contentious discussion.