Apple Showcases AI Capabilities: New DCLM Models Outperform Mistral and Hugging Face Counterparts

As excitement builds around the capabilities of OpenAI's new GPT-4o mini, Apple has expanded its family of compact AI models by releasing several open DataComp for Language Models (DCLM) models on Hugging Face.

The release includes two notable models: one with 7 billion parameters and another with 1.4 billion. Both perform strongly on benchmarks, particularly the larger model, which outperforms Mistral-7B and approaches the performance of other leading open models such as Llama 3 and Gemma.

Vaishaal Shankar from the Apple ML team refers to these models as the “best-performing” open-source options available. Notably, the project has fully embraced open-source principles by releasing model weights, training code, and the pretraining dataset.

Overview of Apple DCLM Models

The DataComp project is a collaborative initiative involving researchers from Apple, the University of Washington, Tel Aviv University, and the Toyota Research Institute. Its goal is to create high-quality datasets for training AI models, particularly in the multimodal domain. The team uses a standardized framework, with fixed model architectures, training code, hyperparameters, and evaluations, to test how different data curation strategies affect model performance.

Early experiments revealed that model-based filtering—where machine learning models filter and select high-quality data from larger datasets—plays a critical role in assembling superior training sets. Using this curation technique, the team developed the DCLM-Baseline dataset, which was instrumental in training the 7 billion and 1.4 billion parameter decoder-only transformer models from scratch.
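To make the idea concrete, here is a minimal, illustrative sketch of model-based filtering: score each candidate document with a learned quality classifier and keep only those above a threshold. The Document type, the toy scoring function, and the threshold below are assumptions for demonstration only, not Apple's actual curation pipeline.

```python
# Illustrative sketch of model-based data filtering: score each document with a
# quality classifier and keep the high-scoring ones for the training set.
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Document:
    doc_id: str
    text: str


def filter_by_quality(
    docs: Iterable[Document],
    score_fn: Callable[[str], float],  # e.g. a fastText or small transformer classifier
    threshold: float = 0.5,
) -> List[Document]:
    """Keep documents whose predicted quality score meets the threshold."""
    return [d for d in docs if score_fn(d.text) >= threshold]


if __name__ == "__main__":
    # Stand-in scorer for demonstration; a real pipeline would load a trained classifier.
    def toy_score(text: str) -> float:
        return min(1.0, len(text.split()) / 100.0)

    corpus = [
        Document("a", "short snippet"),
        Document("b", "a much longer, well-formed passage " * 20),
    ]
    kept = filter_by_quality(corpus, toy_score, threshold=0.6)
    print([d.doc_id for d in kept])  # -> ['b']
```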

The 7B model, trained on 2.5 trillion tokens using OpenLM pretraining recipes, features a 2K context window and achieves 63.7% 5-shot accuracy on the MMLU benchmark. This marks a 6.6 percentage point improvement over MAP-Neo, the previous leader in open data language models, while utilizing 40% less computing power during training.

Crucially, its MMLU performance is close to that of leading models that offer open weights but closed data, such as Mistral-7B-v0.3 (62.7%), Llama 3 8B (66.2%), Google's Gemma (64.3%), and Microsoft's Phi-3 (69.9%).
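For readers who want to reproduce a 5-shot MMLU number like those above, a common approach is EleutherAI's lm-evaluation-harness. The sketch below assumes its Python API (lm_eval.simple_evaluate, available in recent versions) and a repo id of "apple/DCLM-7B"; verify both against the harness documentation and the model card before running.

```python
# Sketch of measuring 5-shot MMLU accuracy with EleutherAI's lm-evaluation-harness.
# Assumptions: a recent lm-eval release exposing simple_evaluate, and the
# Hugging Face repo id "apple/DCLM-7B".
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",                             # Hugging Face transformers backend
    model_args="pretrained=apple/DCLM-7B",  # assumed repo id
    tasks=["mmlu"],
    num_fewshot=5,                          # 5-shot, matching the reported numbers
    batch_size=8,
)

# Result keys vary by harness version; print the MMLU entry if present.
print(results["results"].get("mmlu"))
```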

Additionally, when researchers extended the model's context window to 8K and trained it on an additional 100 billion tokens using the Dataset Decomposition technique, they observed further improvements on the Core and Extended benchmarks, while MMLU results remained about the same.

“Our findings underscore the significance of dataset design in training language models and serve as a foundation for ongoing research in data curation,” the researchers stated in a paper on DataComp-LM.

Impressive Performance of the Smaller Model

Like DCLM-7B, the smaller 1.4B model, developed jointly with the Toyota Research Institute and trained on 2.6 trillion tokens, also performs impressively on the MMLU, Core, and Extended tests. In the 5-shot MMLU assessment, it achieved 41.9%, surpassing other models in its category, including Hugging Face's SmolLM, which scored 39.97%. Qwen-1.5B and Phi-1.5B followed with 37.87% and 35.90%, respectively.

Currently, the 7B model is available under Apple's Sample Code License, while the 1.4B model has been released under Apache 2.0, which permits commercial use, distribution, and modification. An instruction-tuned version of the 7B model is also available on the Hugging Face Hub.
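As a minimal sketch of trying the checkpoints from the Hugging Face Hub, the snippet below uses the standard transformers AutoModel classes. The repo id "apple/DCLM-7B" and the need for trust_remote_code are assumptions; confirm both on the model card, which may also require extra dependencies.

```python
# Minimal sketch of loading a DCLM checkpoint with Hugging Face transformers.
# The repo id "apple/DCLM-7B" is an assumption; check the model card for the
# exact identifier and whether custom model code (trust_remote_code) is needed.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "apple/DCLM-7B"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

inputs = tokenizer("Machine learning is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```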

It is important to note that this release represents early research demonstrating the effectiveness of data curation. The models are not intended for Apple devices and may exhibit biases inherited from their training data or produce potentially harmful responses.
