Mistral CEO Confirms 'Leak' of New Open Source AI Model Approaching GPT-4 Performance

The past few days have been a whirlwind for the open-source AI community, even by its typically fast-paced standards.

Chronology of Events:

On January 28, a user named “Miqu Dev” uploaded a collection of files to HuggingFace, a premier platform for open-source AI models. This upload introduced the “miqu-1-70b,” a seemingly new large language model (LLM).

The HuggingFace entry, still available at the time of writing, noted that this LLM used the same prompt format as models from Mistral, the prominent Paris-based AI company behind Mixtral 8x7b, a sparse mixture-of-experts model that many consider the top-performing open-source LLM currently available.
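For reference, Mistral's instruct-tuned models wrap user messages in an [INST] ... [/INST] template, which is the prompt format the miqu-1-70b card pointed to. The snippet below is a minimal sketch of that convention, not an excerpt from the leaked files; exact special tokens can vary between releases, so the model card remains the authority.

```python
# Minimal sketch of the Mistral-style instruction template.
# Assumption: the model expects "<s>[INST] ... [/INST]" framing; exact special
# tokens may differ between releases, so check the specific model card before use.
def build_prompt(user_message: str) -> str:
    return f"<s>[INST] {user_message} [/INST]"

print(build_prompt("Summarize the miqu-1-70b story in one sentence."))
```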

A Viral Discovery:

On the same day, an anonymous user on 4chan (potentially “Miqu Dev”) shared a link to the miqu-1-70b files. As awareness spread, users on X (formerly Twitter) began discussing the model's impressive benchmark results on common LLM tasks, with scores approaching OpenAI's GPT-4 on EQ-Bench, a benchmark of emotional intelligence.

Community Reactions:

Machine learning researchers took notice on LinkedIn as well. Maxime Labonne, a machine learning scientist at JPMorgan Chase, speculated that "Miqu" might stand for "MIstral QUantized." He noted, “Thanks to @152334H, we now have an unquantized version of miqu available,” suggesting that future fine-tuned versions of the model could approach GPT-4-level performance.

Quantization is a technique that shrinks AI models and lets them run on less powerful hardware by storing their weights as lower-precision numbers, for example 8-bit or 4-bit integers instead of 16- or 32-bit floating-point values, at a small cost in accuracy.
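As a rough illustration of the idea (not Mistral's actual pipeline, and simpler than the grouped or mixed-precision schemes used for real LLMs), the sketch below quantizes a weight matrix to 8-bit integers with a single scale factor, then dequantizes it to show the approximation error:

```python
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Symmetric per-tensor int8 quantization: map floats to [-127, 127]."""
    scale = np.abs(weights).max() / 127.0  # one scale factor for the whole tensor
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximation of the original float weights."""
    return q.astype(np.float32) * scale

# A toy weight matrix shrinks 4x (fp32 -> int8) while losing only a little precision.
w = np.random.randn(4, 4).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
print("max reconstruction error:", np.abs(w - w_hat).max())
```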

Speculation and Confirmation:

Speculation arose that "Miqu" might be a newly leaked Mistral model, given the company's discreet approach to releasing updates. Mistral co-founder and CEO Arthur Mensch confirmed this theory, announcing on X that an overzealous employee of an early access customer had leaked a quantized version of an old model they had openly trained. Mensch explained, “We retrained this model from Llama 2 the day we accessed our cluster.”

Rather than demanding a takedown of the HuggingFace post, Mensch left a comment suggesting that the poster might consider proper attribution.

Implications for the AI Landscape:

Mensch’s note to "stay tuned!" suggests that Mistral is developing a version of the "Miqu" model that could rival GPT-4. This could mark a pivotal moment not just for open-source generative AI but for the entire AI landscape. Since its release in March 2023, GPT-4 has been recognized as the most advanced LLM available, surpassing even the long-anticipated Gemini models from Google.

The emergence of an open-source model akin to GPT-4 could exert substantial competitive pressure on OpenAI, especially as enterprises increasingly look to blend open-source and proprietary models in their applications. Although OpenAI may maintain an edge with its faster GPT-4 Turbo and GPT-4V (vision), the open-source AI community is rapidly closing the gap. The looming question remains: will OpenAI's head start and unique offerings be enough to keep it at the forefront of LLMs?
