Google is optimistic that it will soon “unpause” the image generation capabilities of its multimodal generative AI tool, Gemini. Google DeepMind co-founder and CEO Demis Hassabis said today that the feature, which allows Gemini to create images of people, should be back online in the “next few weeks.”
Last week, Google temporarily halted the capability after users surfaced examples of Gemini producing historically inaccurate images, such as depicting the U.S. Founding Fathers as a racially diverse group.
Hassabis addressed the episode during an interview at the Mobile World Congress in Barcelona. When Wired's Steven Levy asked what caused the issue, Hassabis stopped short of a detailed technical explanation, saying the problem stemmed from the model failing to distinguish prompts where users want a “universal depiction” of a subject from those, like historical figures, that call for specificity. He noted, “This highlights the nuances that accompany advanced AI.”
"When you request a prompt like ‘show me a person walking a dog’ or ‘a nurse in a hospital,’ you’re often looking for a general representation," he elaborated. "Given that Google operates in over 200 countries, we can't predict the user's background or context, so it’s necessary to provide a broad array of options.”
Hassabis said the feature's intent, to enhance diversity in Gemini's image outputs, was well-meaning, but acknowledged that it was applied “too bluntly across the board.” He emphasized that prompts for historical figures should yield a “narrower distribution” of images, an adjustment Gemini will make going forward.
“We prioritize historical accuracy, so we’ve taken that feature offline while we address this issue. We expect it to be back online very soon, within the next couple of weeks,” he added.
In response to a follow-up question about safeguarding generative AI tools from misuse by authoritarian regimes intent on spreading propaganda, Hassabis indicated that there is no straightforward solution. The complexity of the issue likely requires collective societal engagement to establish and enforce appropriate boundaries.
“There’s important dialogue and research that must take place—not just within tech companies, but also involving civil society and governments,” he stated. “This is a societal question that transcends individual interests. It encompasses what values we want these systems to embody, what they should represent, and how to prevent misuse by bad actors for harmful purposes.”
When discussing the challenges associated with open-source generative AI models, which Google also provides, he noted: “Users often prefer open-source systems they can completely control. The critical concern is ensuring that such systems aren’t used for harmful applications as they grow increasingly powerful.”
“Currently, this is not a pressing issue since these systems are still in their early stages. However, looking ahead three to five years to next-generation systems with enhanced planning abilities, society must seriously consider the ramifications of their proliferation and the potential misuse by individuals and rogue states,” he warned.
Hassabis also shared his vision for the evolution of AI-assisted devices in the mobile market. He speculated about a surge of “next-generation smart assistants” that enhance daily life, moving away from the “gimmicky” features of previous AI iterations. He suggested this evolution could even transform the types of mobile hardware people use.
“There will be discussions about the most suitable device forms,” he mused. “In five years, will a traditional phone still be ideal? Perhaps we will need glasses or alternative devices that enable AI to better understand our surroundings, thereby increasing its usefulness in our everyday lives. The potential for innovation is limitless.”