On October 5, the technology news site Neowin reported that Google is preparing to commercialize the Gemini 1.5 Flash 8B model, which will be its most affordable AI offering to date. In August, Google released three experimental models under the Gemini brand, among them Gemini 1.5 Flash 8B, a more compact version of Gemini 1.5 Flash with 8 billion parameters. The model is designed for multimodal tasks such as high-volume analysis and long-document summarization.
Notably, Gemini 1.5 Flash 8B offers lower latency, making it particularly well-suited for chat applications, transcription, and long-context translation. Its other standout feature is aggressive pricing, which takes effect on October 14:
- Input tokens: $0.0375 per million tokens for prompts up to 128K tokens.
- Output tokens: $0.15 per million tokens for prompts up to 128K tokens.
- Cached prompts (context caching): $0.01 per million tokens for prompts up to 128K tokens.
By comparison, the original Gemini 1.5 Flash model charges $0.30 per million output tokens under pricing that took effect on August 12, 2024. The new Gemini 1.5 Flash 8B therefore cuts the output cost in half relative to its larger sibling.
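To make the price difference concrete, here is a minimal back-of-the-envelope cost calculator using only the per-million-token rates quoted above; the constant and function names are illustrative and not part of any Google SDK.

```python
# Rates in USD per million tokens, for prompts up to 128K tokens, as quoted above.
FLASH_8B_RATES = {"input": 0.0375, "output": 0.15, "cached": 0.01}  # Gemini 1.5 Flash 8B (effective Oct 14)
FLASH_OUTPUT_RATE = 0.30                                            # Gemini 1.5 Flash (effective Aug 12, 2024)


def estimate_cost(tokens: int, rate_per_million: float) -> float:
    """Return the USD cost for `tokens` billed at `rate_per_million` USD per 1M tokens."""
    return tokens / 1_000_000 * rate_per_million


if __name__ == "__main__":
    # Example: a workload that generates 10 million output tokens.
    out_tokens = 10_000_000
    cost_8b = estimate_cost(out_tokens, FLASH_8B_RATES["output"])  # $1.50
    cost_flash = estimate_cost(out_tokens, FLASH_OUTPUT_RATE)      # $3.00
    print(f"Gemini 1.5 Flash 8B output cost: ${cost_8b:.2f}")
    print(f"Gemini 1.5 Flash output cost:    ${cost_flash:.2f}")
    print(f"Savings: {1 - cost_8b / cost_flash:.0%}")              # 50%
```

Running this sketch shows the 50% reduction on output tokens directly; input and cached-prompt costs can be estimated the same way with the other rates in the table.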