As tech giants like Google, Samsung, and Microsoft enhance their generative AI capabilities on PCs and mobile devices, Apple is entering the arena with OpenELM, a new suite of open-source large language models (LLMs) designed to operate fully on standalone devices without the need for cloud connectivity.
Launched recently on the AI code community Hugging Face, OpenELM encompasses small models optimized for efficient text generation tasks.
Overview of OpenELM
The OpenELM family includes eight models (four pre-trained and four instruction-tuned) ranging in size from 270 million to 3 billion parameters. Parameters are the learned weights that govern the connections among an LLM's artificial neurons; a higher count generally correlates with stronger performance.
Pre-training teaches a model to produce coherent text, but only by predicting the most likely continuation of a prompt. Instruction tuning, by contrast, trains the model to give relevant, specific answers to requests. Asked to "teach me how to bake bread," a pre-trained model might simply continue the sentence with "in a home oven," whereas an instruction-tuned model would walk through the actual steps.
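The difference is easy to see by prompting a base checkpoint and its instruct counterpart side by side. The sketch below is a minimal, hypothetical example using the transformers library; it assumes the OpenELM repositories load through transformers' trust_remote_code path and that the gated Llama-2 tokenizer is paired with them, as Apple's sample generation script does. Repo ids and generation settings may need adjusting.

```python
# Minimal sketch: prompting a pre-trained vs. an instruction-tuned OpenELM
# checkpoint from Hugging Face. Repo ids and tokenizer choice are assumptions
# based on Apple's published model cards, not details from this article.
from transformers import AutoModelForCausalLM, AutoTokenizer

prompt = "Teach me how to bake bread"

# Apple's example code pairs OpenELM with the (gated) Llama-2 tokenizer.
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")

for repo_id in ("apple/OpenELM-270M", "apple/OpenELM-270M-Instruct"):
    # OpenELM ships custom modeling code, so trust_remote_code is required.
    model = AutoModelForCausalLM.from_pretrained(repo_id, trust_remote_code=True)
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=128)
    print(repo_id)
    print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

In practice the base model tends to continue the prompt as if completing a sentence, while the instruct variant answers it, which is exactly the distinction described above.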
Apple has made the weights of its OpenELM models available under a “sample code license,” which permits commercial use and modification, provided that any unmodified redistributions retain the accompanying notice and disclaimers. However, Apple warns users that these models may produce outputs that are inaccurate, harmful, biased, or objectionable.
This release marks a significant shift for Apple, traditionally known for its secrecy and closed technology ecosystems. Previously, the company introduced Ferret, an open-source language model with multimodal capabilities, underscoring its commitment to the open-source AI community.
Key Features of OpenELM
OpenELM, which stands for Open-source Efficient Language Models, targets on-device applications, paralleling the strategies of competitors like Google, Samsung, and Microsoft. Microsoft's recently announced Phi-3 Mini, for example, can run entirely on a smartphone, underscoring the industry's push toward portable AI.
Apple's development of OpenELM was led by Sachin Mehta, with significant contributions from Mohammad Rastegari and Peter Zatloukal. The models come in four sizes: 270 million, 450 million, 1.1 billion, and 3 billion parameters, all smaller than many leading models, which typically exceed 7 billion parameters. They were trained on publicly available datasets totaling roughly 1.8 trillion tokens, including text drawn from sources such as Reddit, Wikipedia, and arXiv.org, giving the models broad, general-purpose language coverage.
Performance Insights
OpenELM's benchmark results are solid, with the 450 million-parameter instruct variant standing out. Notably, the 1.1 billion-parameter OpenELM outperforms OLMo, a comparably sized model recently released by the Allen Institute for AI, despite being pre-trained on significantly fewer tokens.
On various benchmarks, the pre-trained OpenELM-3B has shown the following accuracies:
- ARC-C: 42.24%
- MMLU: 26.76%
- HellaSwag: 73.28%
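Readers who want to sanity-check a checkpoint against these benchmarks can do so with EleutherAI's lm-evaluation-harness. The sketch below uses its Python API; the repo id, task names, and the assumption that OpenELM loads through the harness's Hugging Face backend with trust_remote_code are illustrative choices, not details from the article, and the exact scores will depend on harness version and evaluation settings.

```python
# Hypothetical sketch: scoring an OpenELM checkpoint on the benchmarks cited
# above with EleutherAI's lm-evaluation-harness (pip install lm-eval).
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",  # Hugging Face causal-LM backend
    model_args="pretrained=apple/OpenELM-3B,trust_remote_code=True",
    tasks=["arc_challenge", "hellaswag", "mmlu"],
    batch_size=8,
)

# Per-task accuracy metrics live under results["results"].
for task, metrics in results["results"].items():
    print(task, metrics)
```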
Initial user feedback suggests that OpenELM produces reliable, well-aligned outputs but is not especially creative and tends to steer away from unconventional or NSFW topics. By comparison, Microsoft's Phi-3 Mini, with its larger parameter count and longer context window, still leads on most performance metrics.
Conclusion
As the OpenELM models are tested and refined, they hold promise for enhancing on-device AI applications. It will be intriguing to observe how the community leverages this open-source initiative, especially given the excitement surrounding Apple’s commitment to transparency and collaboration in the AI space.