Frontier AI models are arriving in quick succession. On Wednesday, Mistral unveiled its latest flagship model, Large 2, which it claims rivals the most advanced models from OpenAI and Meta in code generation, mathematics, and reasoning.
The launch came just one day after Meta released its latest open-source model, Llama 3.1 405B. Mistral says Large 2 raises the bar on both performance and cost for open models, and it backs that claim with several benchmarks.
Large 2 has 123 billion parameters, less than a third as many as Llama 3.1 405B, yet Mistral reports that it outperforms the larger model on both coding tasks and mathematical accuracy. Mistral also says it focused during training on reducing hallucinations. According to the company, Large 2 was trained to be more discerning, admitting when it does not know something rather than fabricating plausible but incorrect information.
The Paris-based AI startup recently secured $640 million in a Series B funding round led by General Catalyst, achieving a valuation of $6 billion. Although Mistral is one of the newer players in the artificial intelligence landscape, it is rapidly delivering cutting-edge AI models.
However, Mistral's models, like many in the industry, are not open source in the traditional sense: any commercial use of the model requires a paid license. And while Large 2 is more accessible than a closed model like GPT-4o, very few organizations have the expertise or infrastructure to run a model of this size, let alone Llama 3.1 with its 405 billion parameters.
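To give a rough sense of the infrastructure involved, the sketch below estimates how much memory the weights alone would occupy for each model. It assumes 16-bit weights (2 bytes per parameter), which is an illustrative assumption and not a figure published by Mistral or Meta, and it ignores everything else a deployment needs, such as activations, caches, and serving overhead.

```python
# Back-of-the-envelope estimate of weight storage only.
# ASSUMPTION: 2 bytes per parameter (16-bit weights); not a vendor-stated figure.
BYTES_PER_PARAM = 2


def weight_memory_gb(num_params: int, bytes_per_param: int = BYTES_PER_PARAM) -> float:
    """Return the approximate size of the weights in gigabytes (1 GB = 10**9 bytes)."""
    return num_params * bytes_per_param / 1e9


# Parameter counts as reported in the article.
models = {
    "Mistral Large 2": 123_000_000_000,
    "Llama 3.1 405B": 405_000_000_000,
}

for name, params in models.items():
    print(f"{name}: ~{weight_memory_gb(params):.0f} GB of weights")

# Large 2 has a little under a third of Llama 3.1 405B's parameters.
ratio = models["Mistral Large 2"] / models["Llama 3.1 405B"]
print(f"Parameter ratio: {ratio:.2f}")
```

Under that assumption, either model needs far more memory than a single typical accelerator provides, which is one reason self-hosting remains out of reach for most organizations.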
One feature absent from both Mistral Large 2 and Meta's Llama 3.1 release is multimodal capability. OpenAI currently leads in multimodal AI systems, which can process images and text at the same time, a capability many startups are eager to build.
Large 2 has a 128,000-token context window, which lets it take in a large amount of input in a single prompt (128,000 tokens is roughly the length of a 300-page book). Mistral's new model also improves multilingual support: it understands English, French, German, Spanish, Italian, Portuguese, Arabic, Hindi, Russian, Chinese, Japanese, and Korean, along with 80 programming languages. Mistral also claims that Large 2 produces more concise responses than leading models, which often give overly verbose answers.
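The 300-page comparison can be sanity-checked with a quick calculation. The sketch below assumes about 0.75 English words per token, a common rule of thumb that is not taken from Mistral and that varies with the tokenizer and the language, and it works out how many words per page the comparison implies.

```python
# ASSUMPTION: ~0.75 English words per token. This is a rough rule of thumb,
# not a property of Mistral's tokenizer; real ratios vary by text and language.
WORDS_PER_TOKEN = 0.75


def words_per_page(context_tokens: int, pages: int,
                   words_per_token: float = WORDS_PER_TOKEN) -> float:
    """Return the words per page implied by fitting `context_tokens` into `pages` pages."""
    return context_tokens * words_per_token / pages


# Figures from the article: a 128,000-token window and a 300-page book.
implied = words_per_page(128_000, 300)
print(f"Implied density: about {implied:.0f} words per page")
```

A few hundred words per page is in line with a typical printed book, so the comparison holds up as a rough approximation.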
Mistral Large 2 can be accessed on several platforms, including Google Vertex AI, Amazon Bedrock, Azure AI Studio, and IBM watsonx.ai. Users can also try the model for free on Mistral’s platform, under the name “mistral-large-2407,” or explore its functionalities through the startup’s ChatGPT alternative, Le Chat.