The most widely used language models are accessible via API, but open models, despite some debate over the term, are increasingly making their mark. Mistral, a French AI startup that secured significant funding in June, has released its first model, which it claims outperforms others of its size, and it is free to use with almost no restrictions.
The Mistral 7B model is now available for download through various channels, including a 13.4-gigabyte torrent that already has several hundred seeders. The company has also established a GitHub repository and a Discord channel to foster collaboration and support.
Crucially, this model is released under the Apache 2.0 license, which is highly permissive, placing no restrictions on use or reproduction aside from proper attribution. This means that anyone from a hobbyist to a multinational corporation—or even the Pentagon—can utilize it, provided they have the capability to run it locally or the budget to acquire the necessary cloud resources.
Mistral 7B is an advance over other small language models like Llama 2, offering comparable capabilities (according to some standard benchmarks) at considerably lower compute cost. Larger foundation models like GPT-4 can do more, but they are far costlier to operate, which is why they are typically offered only through APIs or other remote access.
“Our mission is to lead and support the open generative AI community by developing open models that achieve state-of-the-art performance,” stated Mistral’s team in a blog post accompanying the release. “Mistral 7B illustrates the potential of smaller models when built with conviction. This achievement is the culmination of three months of intensive effort in which we assembled the Mistral AI team, established a top-tier MLops stack, and devised a sophisticated data processing pipeline from the ground up.”
To many, that may sound like far more than three months of work. However, the founders had a head start: they had worked on similar models at Meta and Google DeepMind.
Although it is publicly downloadable, the model is not “open source” in the sense discussed at Disrupt last week. The Apache 2.0 license is highly permissive, but the model itself was developed in private, funded by private investment, and the training datasets and process remain undisclosed.
This appears to be the foundation of Mistral’s business model: while the model is free for anyone to use, those interested in deeper access will likely need to explore paid offerings. “[Our commercial offering] will be available as white-box solutions, providing both the weights and source code. We are actively developing hosted solutions and dedicated deployment options for enterprises,” the blog post noted.
I reached out to Mistral for more detail on its plans. CEO Arthur Mensch said that not all future models will be released under the Apache 2.0 license, although “some will.” Larger models are expected to be available through a paid API rather than as downloads. He declined for now to share specifics about training or how the dataset was assembled, citing their proprietary nature.