Microsoft's Orca-Math AI Outperforms Models 10x Its Size on Math Word Problems

Exciting News for Students and STEM Researchers!

If you’ve ever struggled with math—like I did growing up—or if you're simply eager to enhance your skills, Microsoft has fantastic news for you.

Arindam Mitra, a senior researcher at Microsoft Research and leader of the Orca AI initiative, recently announced Orca-Math on X. The model is a fine-tuned variant of French startup Mistral's Mistral 7B, designed specifically to excel at solving math word problems while staying small enough to train and deploy efficiently. The work is part of the Microsoft Orca team's overarching goal: enhancing the capabilities of smaller large language models (LLMs).

Orca-Math: Performance Meets Efficiency

Orca-Math has achieved impressive results, outperforming models with ten times as many parameters (the learned numerical weights that determine a model's behavior) at solving complex math word problems. Mitra shared a chart showing Orca-Math beating most other AI models in the 7-to-70-billion-parameter range on GSM8K, a benchmark of 8,500 diverse grade-school math word problems designed to be solvable by a bright middle schooler.
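
For readers who want to poke at the benchmark themselves, here is a minimal sketch of GSM8K-style evaluation using the Hugging Face datasets and transformers libraries. The model ID is a placeholder (Orca-Math's exact checkpoint name is not given here), and the string-match scoring is deliberately crude.

```python
# A minimal sketch of GSM8K-style evaluation, assuming the Hugging Face
# `datasets` and `transformers` libraries. The model ID is a placeholder;
# substitute whatever checkpoint you want to test.
from datasets import load_dataset
from transformers import pipeline

dataset = load_dataset("gsm8k", "main", split="test")
generator = pipeline("text-generation", model="mistralai/Mistral-7B-v0.1")

def extract_final_answer(reference: str) -> str:
    # GSM8K reference solutions put the final number after "####".
    return reference.split("####")[-1].strip()

sample = dataset.select(range(100))  # a small slice, for illustration
correct = 0
for example in sample:
    prompt = f"Question: {example['question']}\nAnswer:"
    output = generator(prompt, max_new_tokens=256)[0]["generated_text"]
    # Crude scoring: check whether the gold number appears in the output.
    # Real evaluations parse the model's final answer more carefully.
    if extract_final_answer(example["answer"]) in output:
        correct += 1

print(f"Accuracy on sample: {correct / len(sample):.2%}")
```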

Remarkably, the 7-billion-parameter Orca-Math performs nearly on par with much larger models such as OpenAI's GPT-4 and Google's Gemini Ultra, while clearly outpacing bigger math-specialized models like MetaMath (70B) and Llemma (34B).

Creating Orca-Math: A Collaborative Approach

How did the Orca team accomplish this feat? Starting from 36,217 math problems sourced from open datasets, with answers provided by OpenAI's GPT-4, they used a collaboration of specialized AI agents, including student and teacher roles that generated, checked, and corrected solutions, to expand the collection into a new dataset of 200,000 problems. Fine-tuning Mistral 7B on this dataset produced Orca-Math.
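
As a loose illustration of the student/teacher pattern, and not Microsoft's actual pipeline (the paper builds its agent flows with frameworks such as AutoGen), here is a sketch in Python. The ask_llm helper is hypothetical, a stand-in for whatever chat-LLM API you use.

```python
# A loose sketch of a student/teacher correction loop, NOT Microsoft's
# actual pipeline. `ask_llm` is a hypothetical stand-in for any chat-LLM call.
def ask_llm(prompt: str) -> str:
    raise NotImplementedError("wire this up to your LLM API of choice")

def generate_verified_answer(problem: str, max_rounds: int = 3) -> str:
    # Student agent drafts a step-by-step solution.
    answer = ask_llm(f"Solve this math word problem step by step:\n{problem}")
    for _ in range(max_rounds):
        # Teacher agent checks the work.
        critique = ask_llm(
            f"Problem:\n{problem}\n\nProposed solution:\n{answer}\n\n"
            "Check every step. Reply 'CORRECT' if the solution is right; "
            "otherwise explain the error."
        )
        if critique.strip().startswith("CORRECT"):
            break  # the teacher accepts the solution
        # Student revises using the teacher's feedback.
        answer = ask_llm(
            f"Problem:\n{problem}\n\nFlawed solution:\n{answer}\n\n"
            f"Reviewer feedback:\n{critique}\n\nWrite a corrected solution."
        )
    return answer
```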

The team also implemented a "Suggester and Editor" agent pair to produce more complex questions for the training set: the Suggester proposes ways to make a problem harder, and the Editor rewrites the problem accordingly. According to their paper on arXiv.org, this iterative problem-enhancement process contributes significantly to creating the challenging questions that drive higher accuracy during training.
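
A similarly hedged sketch of that Suggester-and-Editor loop, reusing the hypothetical ask_llm stub from the previous sketch:

```python
# A hedged sketch of a Suggester-and-Editor loop, reusing the hypothetical
# ask_llm stub defined above; an illustration of the pattern, not the
# team's exact prompts or agent framework.
def make_harder(seed_problem: str, iterations: int = 2) -> str:
    problem = seed_problem
    for _ in range(iterations):
        # Suggester agent proposes concrete ways to raise difficulty.
        suggestions = ask_llm(
            f"Problem:\n{problem}\n\nSuggest specific ways to make this "
            "problem harder (extra steps, added constraints, indirection)."
        )
        # Editor agent rewrites the problem to apply them.
        problem = ask_llm(
            f"Problem:\n{problem}\n\nSuggestions:\n{suggestions}\n\n"
            "Rewrite the problem applying these suggestions. Keep it "
            "self-contained and solvable. Return only the new problem."
        )
    return problem
```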

Machine-generated synthetic data has proven valuable in boosting the capabilities of LLMs, helping address concerns about model improvement stagnating. To learn from it, the Orca team used "Kahneman-Tversky Optimization" (KTO), a preference-tuning method that requires only a binary judgment of whether each output is desirable or undesirable, rather than full pairwise preference comparisons. KTO, applied alongside traditional supervised fine-tuning, further refined Orca-Math's performance.
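
As a rough illustration of what KTO optimizes, here is a simplified per-example KTO loss in PyTorch, following the prospect-theoretic formulation of Ethayarajh et al. (2024). This is not Microsoft's training code, and the batch-mean reference point used below is a simplification of the paper's shuffled-pair KL estimate.

```python
# A simplified per-example KTO loss in PyTorch. Illustration only; the
# batch-mean reference point z0 stands in for the KTO paper's
# shuffled-pair KL estimate.
import torch

def kto_loss(policy_logps: torch.Tensor,   # sum log p(completion) under policy
             ref_logps: torch.Tensor,      # same, under the frozen reference model
             desirable: torch.Tensor,      # bool: was this output labeled good?
             beta: float = 0.1,
             lambda_d: float = 1.0,
             lambda_u: float = 1.0) -> torch.Tensor:
    # Implied reward: log-ratio of policy to reference likelihood.
    rewards = policy_logps - ref_logps
    # Reference point: detached so no gradient flows through it.
    z0 = rewards.mean().clamp(min=0).detach()
    # Desirable outputs are pushed above z0; undesirable ones below it.
    loss_good = lambda_d * (1 - torch.sigmoid(beta * (rewards - z0)))
    loss_bad = lambda_u * (1 - torch.sigmoid(beta * (z0 - rewards)))
    return torch.where(desirable, loss_good, loss_bad).mean()
```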

Open-Source Resource: 200,000 Math Problems for Innovation

The Orca team has generously made its AI-generated dataset of 200,000 math problems available on Hugging Face under the permissive MIT license. This opens the door for startups and companies to explore, build on, and even use the dataset commercially.
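
Getting started takes a few lines with the Hugging Face datasets library. The repository ID below is the one the team published; the field names match the public dataset card at the time of writing.

```python
# Loading the released dataset with the Hugging Face `datasets` library.
# Repository ID and field names per the public dataset card for
# microsoft/orca-math-word-problems-200k at the time of writing.
from datasets import load_dataset

ds = load_dataset("microsoft/orca-math-word-problems-200k", split="train")
print(len(ds))                  # ~200,000 problem/answer pairs
print(ds[0]["question"][:200])  # a math word problem
print(ds[0]["answer"][:200])    # a GPT-4-generated step-by-step solution
```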

Since the release of the original Orca 13B in June 2023, which used GPT-4 as a teaching model, followed by Orca 2 in November 2023, the Orca family has continued to expand and evolve, consistently delivering smarter, more compact models.

With these advancements, Microsoft is set to transform the landscape of math education and AI-driven learning tools.
