Microsoft Launches Phi-3 for General Use and Unveils Phi-3-Vision: A Multimodal Small Language Model Preview

Microsoft is expanding developer access to its lightweight Phi-3 model family, nearly a month after the initial announcement. The family includes Phi-3-medium, Phi-3-small, and Phi-3-mini, the last of which is now integrated into Azure AI. Microsoft also introduced a multimodal variant, Phi-3-vision, which has 4.2 billion parameters.

Phi-3 Overview

Developed by Microsoft Research, Phi-3 is a compact language model, with 3.8 billion parameters in its smallest variant (Phi-3-mini), designed to deliver reasoning capabilities comparable to those of larger models at a lower cost. It is the fourth iteration of Microsoft’s compact language models, following Phi-1, Phi-1.5, and Phi-2.

AI Agents and Smaller Models

The increasing demand for AI solutions that operate locally or on devices encourages developers to explore more efficient and smaller models. Microsoft’s Phi-3 family includes three options: Phi-3-mini (3.8 billion parameters), Phi-3-small (7 billion parameters), and Phi-3-medium (14 billion parameters). According to the company, Phi-3 demonstrates performance on par with OpenAI’s GPT-3.5 in a more lightweight format.

The release of Phi-3 coincides with the upcoming introduction of AI capabilities in PCs. Developers can now leverage these variants to enhance AI functionality across laptops, mobile devices, and wearables.

Insights on Phi-3-vision

Alongside the Phi-3 models, Microsoft is unveiling Phi-3-vision, a 4.2 billion parameter model that supports general visual reasoning tasks, including analyzing charts, graphs, and tables. Users can interact with it by asking questions about data visualizations or specific images.

Notably, Google introduced its own lightweight multimodal model, PaliGemma, at its recent developer conference. At 3 billion parameters, it is slightly smaller than Microsoft’s Phi-3-vision.

The ability of a model to process diverse input types is crucial for developers. A model that pairs the efficiency of a lightweight architecture with performance close to that of larger language models could significantly broaden adoption.

Phi-3-vision is currently in preview, and Microsoft has yet to announce when it will be generally available.
