After debuting in Bard and the Pixel 8 Pro, Gemini, Google’s cutting-edge generative AI (GenAI) model family, is coming to Google Cloud customers using Vertex AI.
Gemini Pro, a streamlined version of the more advanced Gemini Ultra, is now available in public preview on Vertex AI through the new Gemini Pro API. The API, currently free to use within certain limits, is available in 38 languages and regions, including Europe, and offers features like chat functionality and content filtering.
“Gemini is a state-of-the-art multimodal model with advanced reasoning and coding capabilities,” stated Google Cloud CEO Thomas Kurian during a press briefing. “Now developers can build tailored applications using this powerful tool.”
Gemini Pro API
By default, the Gemini Pro API accepts text as input and generates text as output, similar to generative text model APIs from providers such as Anthropic, AI21, and Cohere. Gemini Pro Vision, also launching in preview, can process both text and imagery, including photos and video, and output text, much like OpenAI’s GPT-4 with Vision model.
This image processing capability addresses one of the main criticisms of Gemini since its unveiling last week. Although Gemini is multimodal, meaning it was trained on a range of data types including text, images, video, and audio, the version used by Bard, a fine-tuned Gemini Pro model, does not accept images. A misleading product demonstration also left users with lingering questions about how well Gemini actually analyzes images. With Gemini Pro Vision, they can now test the model’s image comprehension firsthand.
Developers can customize Gemini Pro for specific needs using the same fine-tuning tools available for other Vertex-hosted models, such as Google’s PaLM 2. Gemini Pro can also be connected to external APIs to carry out specific tasks, and it can draw on third-party data from applications, databases, the web, and Google Search to improve the relevance and accuracy of its responses.
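To make the idea of connecting a model to external APIs concrete, here is a minimal sketch of the general pattern: the model emits a structured request naming a function and its arguments, and application code runs the matching function and returns the result. The JSON shape, the `get_order_status` tool, and the `dispatch` helper are all hypothetical and are not taken from the Vertex AI SDK.

```python
import json
from typing import Any, Callable, Dict


def get_order_status(order_id: str) -> Dict[str, Any]:
    """Hypothetical tool; a real one might query a database or a web API."""
    orders = {"A100": "shipped", "A101": "processing"}
    return {"order_id": order_id, "status": orders.get(order_id, "unknown")}


# Registry of functions the application is willing to run on the model's behalf.
TOOLS: Dict[str, Callable[..., Dict[str, Any]]] = {
    "get_order_status": get_order_status,
}


def dispatch(function_call_json: str) -> Dict[str, Any]:
    """Run the tool named in a model-produced call and return its result."""
    call = json.loads(function_call_json)
    name = call["name"]
    if name not in TOOLS:
        # Never execute a function the application has not registered.
        return {"error": f"unknown function: {name}"}
    return TOOLS[name](**call.get("args", {}))


# Stand-in for what a model might return (assumed format, for illustration only).
model_output = '{"name": "get_order_status", "args": {"order_id": "A100"}}'
result = dispatch(model_output)
print(result)
```

In a real application, the result would typically be passed back to the model so it can phrase a final answer for the user.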
The existing citation-checking feature in Vertex AI, now enhanced for Gemini Pro, provides an additional layer of fact-checking by highlighting the sources that contributed to Gemini Pro’s responses.
“Grounding allows us to compare Gemini-generated answers against a dataset within a company’s systems or web sources,” Kurian explained. “This comparison is key to enhancing the quality of the model’s outputs.”
Kurian emphasized Gemini Pro’s control, moderation, and governance features, apparently pushing back on suggestions that Gemini Pro is not the strongest model on the market. Google is also using pricing to win over developers.
Input for Gemini Pro on Vertex AI will be priced at $0.0025 per character, while output will cost $0.00005 per character—both significantly lower than the costs of Gemini Pro’s predecessor. Furthermore, for a limited time until early next year, Gemini Pro will be available free of charge to Vertex AI customers.
“Our goal is to attract developers with competitive pricing,” Kurian added candidly.
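For a rough sense of how per-character billing adds up, the sketch below multiplies character counts by the input and output rates exactly as quoted in this article. The function name and sample strings are illustrative, and the rates are defaults that can be overridden; actual charges should be checked against Google’s current pricing page.

```python
from decimal import Decimal

# Rates as quoted in the article (USD per character); treat them as assumptions.
INPUT_USD_PER_CHAR = Decimal("0.0025")
OUTPUT_USD_PER_CHAR = Decimal("0.00005")


def estimate_cost(
    input_chars: int,
    output_chars: int,
    input_rate: Decimal = INPUT_USD_PER_CHAR,
    output_rate: Decimal = OUTPUT_USD_PER_CHAR,
) -> Decimal:
    """Return the estimated USD cost of one request under per-character billing."""
    if input_chars < 0 or output_chars < 0:
        raise ValueError("character counts must be non-negative")
    return input_chars * input_rate + output_chars * output_rate


# Hypothetical prompt and reply, used only to obtain character counts.
prompt = "Summarize the key points of the attached quarterly report."
reply = "Revenue grew, costs held steady, and the outlook is cautiously positive."
print(f"Estimated cost: ${estimate_cost(len(prompt), len(reply))}")
```

Using `Decimal` rather than floats keeps the small per-character rates exact when they are multiplied by large character counts.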
Enhancing Vertex AI
In its quest to draw developers away from rivals like AWS’s Bedrock, Google is introducing other new features to Vertex AI, several of them built on Gemini Pro. Soon, customers will be able to use Gemini Pro to build custom conversational voice and chat agents capable of dynamic interactions and more advanced reasoning. Gemini Pro will also power search summarization, recommendation, and answer generation features in Vertex AI, drawing information from diverse document types such as PDFs and images stored across platforms like OneDrive and Salesforce.
Kurian anticipates that the Gemini Pro-powered conversational and search features will debut very early in 2024.
In addition, Google has launched Automatic Side by Side (Auto SxS) evaluation for Vertex AI, which it pitches as a fast and cost-efficient way for developers to assess models. The feature is Google’s counterpart to AWS’s recently announced Model Evaluation for Bedrock.
Google is also expanding Vertex AI’s offerings by incorporating models from third-party providers such as Mistral and Meta and introducing “step-by-step” distillation—a process that creates smaller, specialized models from larger ones. Additionally, Google will extend its indemnification policy to cover outputs from the PaLM 2 and Imagen models, ensuring that eligible customers are legally defended in potential lawsuits over IP disputes involving those models’ outputs.
Generative AI models can sometimes inadvertently reproduce training data, raising concerns for corporate customers. If it emerges that a vendor like Google trained a model on copyrighted material without securing the necessary licenses, that vendor’s customers might face liability for including infringing content in their projects.
While some vendors lean on fair use as a defense, a growing number are enhancing their indemnification policies concerning GenAI offerings, acknowledging enterprises' concerns.
However, Google has not yet expanded its Vertex AI indemnification policy to cover customers using the Gemini Pro API, which is still in preview. The company says it will do so once the Gemini Pro API launches publicly.