Microsoft Launches Phi-3.5 Models, Outpacing Google, Meta, and OpenAI on Key Benchmarks

Microsoft is pushing the boundaries of AI innovation beyond its partnership with OpenAI. Today, the tech giant unveiled three new models in its Phi series of language and multimodal AI, positioning itself as a formidable player in the AI landscape.

Introducing the Phi-3.5 Models

The newly released Phi-3.5 models include:

- Phi-3.5-mini-instruct: 3.82 billion parameters

- Phi-3.5-MoE-instruct: 41.9 billion total parameters (about 6.6 billion active per token)

- Phi-3.5-vision-instruct: 4.15 billion parameters

Each model is optimized for specific tasks: Phi-3.5-mini for basic and rapid reasoning, Phi-3.5-MoE for advanced reasoning, and Phi-3.5-vision for image and video analysis. Developers can download, customize, and fine-tune these models on Hugging Face, all under a Microsoft-branded MIT License permitting unrestricted commercial usage.
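
As a quick illustration, the sketch below loads Phi-3.5-mini-instruct from Hugging Face with the transformers library and runs a single chat turn. The repository ID matches the published model; the prompt and generation settings are illustrative only.

```python
# pip install torch transformers accelerate
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/Phi-3.5-mini-instruct"  # Hugging Face repository ID

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",      # bf16/fp16 where the hardware supports it
    device_map="auto",       # place weights on a GPU when one is available
    trust_remote_code=True,  # Phi models ship custom modeling code
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

messages = [{"role": "user", "content": "Write a Python one-liner that reverses a string."}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

output = model.generate(**inputs, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```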

Remarkably, these models deliver near state-of-the-art performance across various third-party benchmarks, outperforming notable competitors like Google's Gemini 1.5, Meta's Llama 3.1, and even OpenAI's GPT-4o in certain tests. This impressive performance has sparked praise for Microsoft across social media platforms.

Model Overviews

1. Phi-3.5 Mini Instruct: For Compute-Constrained Environments

The Phi-3.5 Mini Instruct model, with its 3.8 billion parameters, is designed for environments with limited memory and computing power. It supports a 128k token context length, making it ideal for tasks such as code generation, mathematical problem-solving, and logic-based reasoning. Despite its smaller size, it exhibits competitive performance in multilingual and multi-turn conversations, outperforming similar models in long-context code understanding.
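
Because the model is instruction-tuned for multi-turn chat, conversation state is simply a growing message list. Here is a minimal sketch reusing the model and tokenizer loaded above; the math prompt is an arbitrary example.

```python
# Prior turns are appended as role-tagged messages; the chat template
# folds the whole history into a single prompt for the next turn.
messages = [
    {"role": "user", "content": "Factor x^2 - 5x + 6."},
    {"role": "assistant", "content": "x^2 - 5x + 6 = (x - 2)(x - 3)."},
    {"role": "user", "content": "Using that factorization, solve x^2 - 5x + 6 = 0."},
]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```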

2. Phi-3.5 MoE: Mixture of Experts

The Phi-3.5 MoE model represents Microsoft's foray into the Mixture-of-Experts architecture, in which a lightweight router sends each token through a small subset of specialized sub-networks ("experts") rather than through the full model. With roughly 42 billion total parameters, about 6.6 billion of which are active per token, and a 128k token context length, the model delivers scalable performance across a range of reasoning tasks. It frequently surpasses larger models in benchmarks, including strong results in STEM and humanities subjects on the MMLU (Massive Multitask Language Understanding) test.
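
For intuition about how a Mixture-of-Experts layer works, here is a toy top-2 router in PyTorch. It is a conceptual sketch of the routing idea only, with illustrative sizes; it is not Microsoft's implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyMoELayer(nn.Module):
    """Top-k MoE feed-forward layer: each token activates only k experts."""

    def __init__(self, d_model=64, d_ff=256, n_experts=16, top_k=2):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)  # scores each expert per token
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        ])
        self.top_k = top_k

    def forward(self, x):  # x: (tokens, d_model)
        gate = F.softmax(self.router(x), dim=-1)
        weights, idx = gate.topk(self.top_k, dim=-1)   # pick top-k experts per token
        weights = weights / weights.sum(dim=-1, keepdim=True)
        out = torch.zeros_like(x)
        for k in range(self.top_k):                    # only chosen experts run, so
            for e, expert in enumerate(self.experts):  # most parameters stay idle
                mask = idx[:, k] == e
                if mask.any():
                    out[mask] += weights[mask, k:k + 1] * expert(x[mask])
        return out

layer = ToyMoELayer()
print(layer(torch.randn(8, 64)).shape)  # torch.Size([8, 64])
```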

3. Phi-3.5 Vision Instruct: Advanced Multimodal Reasoning

The Phi-3.5 Vision Instruct model combines text and image processing, excelling at tasks such as image comprehension, optical character recognition, and video summarization. Like its counterparts, it supports a 128k token context length, allowing it to handle complex visual tasks. Microsoft trained this model on a mix of synthetic and publicly available datasets, emphasizing high-quality, reasoning-rich data.
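
The sketch below follows the usage pattern published on the model's Hugging Face card; exact options (such as num_crops or the attention implementation) can vary by transformers version, and the image URL is a placeholder.

```python
# pip install torch transformers accelerate pillow requests
import requests
from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor

model_id = "microsoft/Phi-3.5-vision-instruct"
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="auto",
    trust_remote_code=True,
    _attn_implementation="eager",  # or "flash_attention_2" on supported GPUs
)
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True, num_crops=4)

# Placeholder URL: substitute any image you want analyzed.
image = Image.open(requests.get("https://example.com/chart.png", stream=True).raw)

# Phi vision models reference images with numbered <|image_N|> tags in the prompt.
messages = [{"role": "user", "content": "<|image_1|>\nSummarize what this image shows."}]
prompt = processor.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = processor(prompt, [image], return_tensors="pt").to(model.device)

output = model.generate(**inputs, max_new_tokens=256)
print(processor.batch_decode(output[:, inputs["input_ids"].shape[1]:], skip_special_tokens=True)[0])
```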

Training the Phi Trio

- Phi-3.5 Mini Instruct: Trained on 3.4 trillion tokens over 10 days using 512 H100-80G GPUs.

- Phi-3.5 Vision Instruct: Trained on 500 billion tokens over 6 days with 256 A100-80G GPUs.

- Phi-3.5 MoE: Trained on 4.9 trillion tokens over 23 days using 512 H100-80G GPUs.
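
Taken at face value, those figures imply the rough per-GPU throughput computed below. This back-of-the-envelope check ignores restarts, evaluation, and data-pipeline overhead, so treat it as a sense of scale rather than a measured number.

```python
# tokens, training days, GPU count, as reported above
runs = {
    "Phi-3.5-mini-instruct":   (3.4e12, 10, 512),
    "Phi-3.5-vision-instruct": (5.0e11,  6, 256),
    "Phi-3.5-MoE-instruct":    (4.9e12, 23, 512),
}
for name, (tokens, days, gpus) in runs.items():
    per_gpu = tokens / (days * 86_400 * gpus)  # tokens per second per GPU
    print(f"{name}: ~{per_gpu:,.0f} tokens/s per GPU")
# Prints roughly 7,700 / 3,800 / 4,800 tokens/s per GPU respectively.
```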

Open Source Commitment

All three Phi-3.5 models are released under the MIT license, underscoring Microsoft's commitment to the open-source community. The license lets developers use, modify, and distribute the software freely, while noting that it is provided “as is,” without warranty of any kind.

Microsoft's introduction of the Phi-3.5 series marks a pivotal advancement in multilingual and multimodal AI, equipping developers to integrate cutting-edge capabilities into their applications and driving innovation in both commercial and research sectors.
