As OpenAI reaches remarkable heights, Google is launching a strategic response. On December 6, Google introduced Gemini, its most advanced large language model, alongside Cloud TPU v5p, a new generation of its Tensor Processing Unit, and an AI supercomputer architecture from Google Cloud. The v5p substantially outperforms the previous Cloud TPU v4 across a wide range of workloads.
On the MMLU (Massive Multitask Language Understanding) benchmark, Gemini Ultra achieved an impressive score of 90.0%, marking the first time a machine has surpassed human experts in this evaluation.
Gemini 1.0, detailed by Interface News on December 7, is the culmination of a year's development aimed at competing with GPT-4. The model comes in three variants: Gemini Ultra, Gemini Pro, and Gemini Nano. Ultra leads in capability, excelling at complex multimodal tasks; Pro targets versatile general-purpose use; and Nano is tailored for mobile devices, demonstrating Gemini's adaptability across platforms.
With extensive training on diverse datasets, Gemini effectively understands and generates text, images, and audio, tackling complex queries in various fields. Its strengths include advanced reasoning capabilities in mathematics and physics, as well as proficiency in popular programming languages like Python, Java, C++, and Go. Furthermore, Google has developed a specialized coding model, AlphaCode 2, based on Gemini, which has shown over a 50% improvement compared to earlier versions.
Gemini's multimodal strengths enable it to extract insights from lengthy novels and analyze extensive financial reports, providing significant advantages for professionals in finance, technology, and healthcare. In a recent demo, Sundar Pichai showcased Gemini's exceptional capability to navigate between different media formats, unlocking new opportunities for diverse applications.
The results presented in Google’s demo highlight Gemini's superiority over existing multimodal models, including Meta's ImageBind AI, launched earlier this year.
Just over a year after OpenAI introduced ChatGPT, Google found itself reacting to the AI landscape’s rapid evolution. Now, with Gemini, Google seeks to reclaim its position as a leader. At the launch event, Demis Hassabis, CEO of Google DeepMind, emphasized that Google conducted a thorough comparison of GPT-4 and Gemini. He stated, “We executed 32 benchmark tests across various metrics, and I believe we lead significantly in 30 of them.”
Starting today, Gemini will be integrated into Bard and the Pixel 8 Pro, with plans for inclusion in additional Google services like Chrome and search. Developers and enterprise clients will have access to Gemini Pro via the Gemini API in Google AI Studio or Google Cloud Vertex AI beginning December 13, while Android developers can utilize Gemini Nano for app development.
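For developers planning around the December 13 API availability, the sketch below shows what a request to the Gemini API might look like. It only constructs and inspects the JSON request body rather than sending it; the payload shape (a `contents` list of `parts` with `text` fields) is an assumption based on Google's public REST interface, and the prompt text is purely illustrative.

```python
import json

# Minimal sketch of a Gemini API generateContent request body.
# The payload shape is assumed from Google's public REST interface;
# field names here are illustrative, not authoritative.
payload = {
    "contents": [
        {"parts": [{"text": "Summarize the TPU v5p announcement in one sentence."}]}
    ]
}

# Serialize the body as it would be POSTed to the endpoint.
body = json.dumps(payload)
print(body)
```

Actually sending the request would additionally require an API key issued through Google AI Studio and the model-specific endpoint URL.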
Gemini Ultra has become the first model to exceed human-expert performance on the MMLU benchmark, which spans 57 subjects, including math, physics, history, law, medicine, and ethics, to evaluate both knowledge and problem-solving ability. Google has indicated that Gemini Pro performs better than GPT-3.5 but has not disclosed its relative performance against GPT-4.
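The MMLU score cited above is an average of per-subject accuracies over the benchmark's 57 subjects. A minimal sketch of that aggregation, using made-up accuracies for three subjects purely for illustration:

```python
# Hypothetical per-subject accuracies (illustrative values, not Gemini's
# actual results). MMLU reports the mean accuracy across all 57 subjects;
# the same averaging is shown here over three subjects for brevity.
subject_accuracy = {
    "math": 0.82,
    "law": 0.88,
    "medicine": 0.91,
}

mmlu_score = sum(subject_accuracy.values()) / len(subject_accuracy)
print(f"MMLU-style macro average: {mmlu_score:.1%}")
```

Each subject contributes equally regardless of how many questions it contains, so a model must be broadly strong, not just good at a few large subjects, to post a high score.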
Sissie Hsiao, Bard's general manager, indicated that Google is currently focused on enhancing user experience and has no immediate plans for monetization concerning Bard Advanced.
In conjunction with Gemini, Google introduced the new TPU v5p chip, designed to improve the training efficiency of large language models. First launched in 2016, the TPU has evolved into a vital tool for neural network workloads. The TPU v5p delivers twice the floating-point performance of its predecessor and three times the memory bandwidth. Google's configuration links 8,960 v5p accelerators in a single pod, yielding substantial gains during model training.
According to Google, TPU v5p is its most powerful chip yet, delivering 459 teraFLOPS of bfloat16 performance, supported by high-bandwidth memory and fast interconnects. These improvements enable TPU v5p to train large language models such as GPT-3 considerably faster than previous TPU generations.
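Combining the two figures the article quotes, 459 bfloat16 teraFLOPS per chip and 8,960 chips per pod, gives a rough upper bound on a pod's aggregate compute. This is peak-throughput arithmetic only; real training runs achieve some fraction of it:

```python
# Back-of-the-envelope aggregate compute for one TPU v5p pod,
# using the per-chip and pod-size figures quoted in the article.
per_chip_tflops = 459      # peak bfloat16 teraFLOPS per v5p chip
chips_per_pod = 8960       # v5p accelerators per pod

pod_tflops = per_chip_tflops * chips_per_pod
pod_exaflops = pod_tflops / 1_000_000  # 1 exaFLOPS = 1,000,000 teraFLOPS

print(f"{pod_exaflops:.2f} exaFLOPS peak bf16 per pod")  # ≈ 4.11 exaFLOPS
```

Peak figures like this assume every chip runs its matrix units at full utilization with no communication stalls, which interconnect bandwidth and model parallelism overheads make unattainable in practice.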
Additionally, Google introduced the concept of an "AI supercomputer," an integrated system built on open software, optimized hardware, and machine learning frameworks designed for high efficiency. Mark Lohmeyer, Google's VP of Computing and Machine Learning Infrastructure, explained that this system-level design raises the overall productivity of AI training and serving, avoiding the inefficiencies of assembling components piecemeal. The goal of this holistic hardware-software integration is to improve efficiency across the entire AI stack.