Google has launched Gemini 1.5, the latest version of its conversational AI system, boasting significant enhancements in efficiency, performance, and long-form reasoning capabilities.
In a blog post, Google DeepMind CEO Demis Hassabis highlighted key architectural improvements that enable Gemini 1.5 Pro to match the performance of the company's largest model, Gemini 1.0 Ultra, while consuming fewer computing resources. Gemini 1.0 Ultra became publicly available just last week.
The most notable advancement is the introduction of a million-token context window, which represents a breakthrough in long-context understanding. Gemini 1.5 Pro comes with a standard 128,000-token context window. With the million-token upgrade, it can process a much larger volume of continuous information before generating a response.
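To make the two context sizes concrete, here is a minimal Python sketch that estimates whether a document would fit in each window. The four-characters-per-token ratio is a rough assumption for English text, not the behavior of Gemini's actual tokenizer, and the function names are hypothetical.

```python
# Rough check of whether a document fits in a given context window.
# CHARS_PER_TOKEN is an assumed heuristic, not Gemini's real tokenizer ratio.
CHARS_PER_TOKEN = 4

STANDARD_CONTEXT = 128_000    # standard window cited in the article
EXTENDED_CONTEXT = 1_000_000  # million-token window cited in the article


def estimate_tokens(text: str) -> int:
    """Estimate the token count from character length (ceiling division)."""
    return -(-len(text) // CHARS_PER_TOKEN)


def fits(text: str, window: int, reserved_for_output: int = 0) -> bool:
    """Return True if the estimated prompt plus reserved output fits the window."""
    return estimate_tokens(text) + reserved_for_output <= window


if __name__ == "__main__":
    document = "x" * 2_000_000  # stand-in for a transcript of about 2 million characters
    print(estimate_tokens(document))           # estimated prompt size in tokens
    print(fits(document, STANDARD_CONTEXT))    # too large for the standard window?
    print(fits(document, EXTENDED_CONTEXT))    # fits in the extended window?
```

Under this assumed ratio, a two-million-character document comes to roughly 500,000 tokens, which overflows the standard window but fits comfortably in the extended one. A real application would count tokens with the model's own tokenizer rather than a character heuristic.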
This million-token context makes it possible to reason over very long inputs. Google CEO Sundar Pichai showcased the capability with examples in which Gemini 1.5 summarizes the full Apollo 11 mission transcript and analyzes a 44-minute silent film featuring Buster Keaton.
Hassabis explained that the extended context allows Gemini 1.5 to analyze, classify, and summarize substantial content seamlessly. Early results indicate that performance remains strong even with the expanded context.
It remains unclear when the million-token version will become publicly available. For now, Google is offering a limited preview to select developers and enterprise users through its Vertex AI platform.
This release follows Google’s recent rebranding of its conversational AI from Bard to Gemini, along with the launch of a paid Gemini Advanced tier built on the Ultra 1.0 model. Gemini Advanced is positioned as a competitor to OpenAI’s ChatGPT Plus.
Hassabis noted that the efficiency improvements in Gemini 1.5 will enable Google teams to "iterate, train, and deliver more advanced versions of Gemini faster than ever."
Pichai emphasized Google’s commitment to developing Gemini responsibly, adhering to its AI principles. The company has conducted extensive ethics and safety testing for Gemini 1.5, focusing on content safety and representation.
The pace of progress in conversational AI has accelerated significantly since the launch of ChatGPT in late 2022. Experts attribute this to reduced training costs and innovations like Google’s Sparsely-Gated Mixture-of-Experts architecture, which facilitates rapid development of new iterations.
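For readers unfamiliar with the term, the following toy Python sketch illustrates the general idea behind sparsely gated mixture-of-experts routing: a gate scores every expert, only the top k are run, and their outputs are mixed by softmax weights. It is a simplified illustration under stated assumptions, not Gemini's implementation; the scalar input, the linear gate, and all names here are hypothetical, and the noise term used in the original technique is omitted.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_gate(gate_scores, k):
    """Keep the k highest-scoring experts and normalise their weights."""
    ranked = sorted(range(len(gate_scores)),
                    key=lambda i: gate_scores[i], reverse=True)
    top = ranked[:k]
    weights = softmax([gate_scores[i] for i in top])
    return list(zip(top, weights))


def moe_forward(x, experts, gate_weights, k=2):
    """Route scalar input x through only k experts and mix their outputs."""
    gate_scores = [w * x for w in gate_weights]  # toy linear gate (assumption)
    routed = top_k_gate(gate_scores, k)
    output = sum(weight * experts[i](x) for i, weight in routed)
    return output, routed


if __name__ == "__main__":
    # Four toy "experts"; only two of them are evaluated per input.
    experts = [lambda x: x + 1, lambda x: 2 * x, lambda x: x * x, lambda x: -x]
    gate_weights = [0.1, 0.5, -0.3, 0.9]
    output, routed = moe_forward(2.0, experts, gate_weights, k=2)
    print([i for i, _ in routed])  # indices of the experts that were used
    print(output)
```

Because only k experts run for each input, the compute per input grows with k rather than with the total number of experts, which is the efficiency argument usually made for this family of architectures.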
With Gemini 1.5, Google aims to solidify its leadership in the AI sector. The pressing question is when these advanced long-context reasoning capabilities will be integrated into Google’s consumer products.