Mistral Unveils Pixtral 12B: Its First Cutting-Edge Multimodal AI Model

French AI startup Mistral has unveiled its first model capable of processing both images and text. Dubbed Pixtral 12B, this model boasts 12 billion parameters and has a size of approximately 24GB. The number of parameters indicates the model’s problem-solving capabilities, with a higher count generally correlating to better performance.

Pixtral 12B is built on Mistral’s existing text model, Nemo 12B, and it can respond to inquiries about a limitless number of images, regardless of their size, using either image URLs or base64-encoded images. Like other multimodal models, including Anthropic’s Claude series and OpenAI’s GPT-4, Pixtral 12B is theoretically equipped to handle tasks such as image captioning and object counting in photos.

Pixtral 12B is accessible via a torrent link on GitHub and the AI and machine learning platform Hugging Face. Users can download, fine-tune, and utilize the model under an Apache 2.0 license without any restrictions. A Mistral representative confirmed this licensing via email.

Unfortunately, I was unable to test Pixtral 12B directly, as there were no working web demos available at the time of publication. However, Sophia Yang, head of Mistral developer relations, noted in a post on X that Pixtral 12B will soon be available for testing through Mistral’s platforms, Le Chat and Le Plateforme.

The specific image data used in the development of Pixtral 12B remains unclear. Similar to Mistral’s other models, most generative AI models are trained on extensive amounts of public data sourced from the internet, which often includes copyrighted material. While some model developers argue that their practices fall under “fair use,” many copyright holders contend otherwise, leading to lawsuits against major vendors like OpenAI and Midjourney.

The launch of Pixtral 12B comes shortly after Mistral completed a $645 million funding round led by General Catalyst, valuing the company at $6 billion. Founded just over a year ago and minority-owned by Microsoft, Mistral is widely regarded as Europe’s response to OpenAI. The startup's current strategy includes releasing free, open models, offering paid managed versions of these models, and providing consulting services to corporate clients.

Updated 9/11 at 8:11 a.m. Pacific: Clarified that Pixtral 12B is distributed under an Apache 2.0 license, distinct from Mistral’s standard development license, which imposes certain restrictions on commercial usage.

Most people like

Find AI tools in YBX