Google Introduces Enhanced Generative AI Tools: Imagen 3 and Gems
Google is expanding its generative AI tools, starting this week with the launch of Imagen 3, the next generation of its image generator. The update brings back the ability to generate AI images of people, which Google had removed after controversy over the feature. In addition, Google’s Gemini chatbot now includes “Gems,” which let users create customized bots with their own instructions, akin to ChatGPT’s custom GPTs.
Imagen 3: Enhanced Image Generation
Imagen 3 is designed to set a new standard in image quality and includes built-in guardrails intended to prevent the diversity-related problems that plagued earlier versions. According to Gemini Product Manager Dave Citron, “Across a wide range of benchmarks, Imagen 3 performs favorably compared to other image generation models.” Users can also refine an image with follow-up prompts if the initial output doesn’t meet expectations.
The new model also uses Google’s SynthID technology to watermark images, marking them as AI-generated so they are not mistaken for real photographs. Citron said the ability to generate images of people will soon be available to paid users, with safeguards that prohibit the creation of “photorealistic, identifiable individuals,” as well as images featuring children or any graphic, violent, or sexual content. While acknowledging that Gemini’s images may not be perfect, he said the company will continue to refine the model based on feedback.
Introducing Gems: Custom Chatbots for Enhanced Functionality
Gems, first previewed at Google I/O 2024, let users create custom chatbots tailored to specific tasks, such as helping with projects, brainstorming, or writing social media captions. Citron said, “Your Gem can remember a detailed set of instructions to help you save time on tedious, repetitive, or challenging tasks.”
Gemini will also offer prebuilt Gems designed to inspire creativity and streamline work. They include:
- Learning Coach: Assists in understanding complex topics.
- Brainstormer: Sparks new ideas for projects.
- Career Guide: Aids in skill upgrades and career decisions.
- Writing Editor: Provides constructive feedback on grammar and structure.
- Coding Partner: Helps developers improve their coding skills and inspires new projects.
Gems are rolling out today on desktop and mobile, but only to Gemini Advanced, Business, and Enterprise subscribers, so using them requires a paid plan.
With these updates, Google aims to make its AI tools more intuitive and effective in response to growing demand for sophisticated generative technologies.