SambaNova Systems has unveiled what may be one of the largest large language models (LLMs) to date: the one-trillion-parameter Samba-1. Unlike OpenAI’s GPT-4, Samba-1 is not a single model. Instead, it integrates more than 50 high-quality AI models through an approach called the Composition of Experts architecture, which allows the system to be customized and optimized for specific enterprise applications.
In September, SambaNova announced its SN40L AI chip, designed to compete with Nvidia by providing an efficient training and inference solution. The Samba-1 model will be incorporated into the SambaNova Suite, enabling organizations to tailor and deploy models effectively.
Rodrigo Liang, co-founder and CEO of SambaNova, emphasized the value of providing pre-composed, pre-trained, and pre-optimized models. This feature allows businesses to achieve high-performance deployments without the extensive fine-tuning typically required.
How Samba-1 Uses Composition of Experts to Build a Massive LLM
Samba-1 consists of more than 50 individually trained AI models that have been optimized to work together. These include both SambaNova’s proprietary models and curated open-source models suited to specific tasks, such as Llama 2, Mistral, DeepSeek Coder, Falcon, DePlot, CLIP, and Llava.
“We’ve taken the best models, optimized them, and combined them into a single trillion parameter model,” Liang stated. The models within Samba-1 can interact seamlessly, allowing responses from one to serve as inputs for others.
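As a rough illustration of this chaining, the sketch below passes one expert’s response to the next. The `Expert` class, the `compose` helper, and the model names are hypothetical stand-ins for illustration, not SambaNova’s actual interfaces.

```python
# Minimal sketch of composing experts: the output of one model becomes the
# input of the next. All names here are hypothetical placeholders.

class Expert:
    def __init__(self, name, generate_fn):
        self.name = name
        self.generate = generate_fn  # callable: prompt -> response

def compose(experts, prompt):
    """Run experts in sequence, feeding each response into the next expert."""
    text = prompt
    for expert in experts:
        text = expert.generate(text)
    return text

# Example: a code-focused expert drafts a solution, then a documentation
# expert explains it. In practice each lambda would wrap a real model call.
coder = Expert("deepseek-coder", lambda p: f"[code for: {p}]")
explainer = Expert("llama-2-docs", lambda p: f"[explanation of: {p}]")

print(compose([coder, explainer], "parse a CSV file"))
```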
Chaining LLMs together to produce an output is not new; popular open-source technologies like LangChain already do this. However, Liang argues that Samba-1’s Composition of Experts approach offers significant advantages. Unlike LangChain, which requires users to predefine model chains, Samba-1’s experts can be connected dynamically based on prompts and responses, making the system more flexible.
Moreover, Samba-1 allows users to gain various perspectives by drawing on models trained on different datasets. “It can dynamically create 50 LangChain equivalents to explore diverse results,” he noted.
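The difference from a predefined chain can be sketched as a router that inspects each prompt and selects an expert on the fly. The keyword-based routing and expert names below are illustrative assumptions; a production system would presumably rely on a learned router rather than keyword matching.

```python
# Hypothetical sketch of dynamic expert selection: instead of a chain fixed
# in advance, a router looks at each prompt and decides which expert should
# handle it. Keyword routing stands in for whatever routing a real system
# would actually use.

EXPERTS = {
    "code": lambda p: f"[code answer: {p}]",
    "legal": lambda p: f"[legal answer: {p}]",
    "general": lambda p: f"[general answer: {p}]",
}

def route(prompt):
    """Pick an expert based on the prompt itself, not a predefined chain."""
    lowered = prompt.lower()
    if "function" in lowered or "bug" in lowered:
        return "code"
    if "contract" in lowered or "clause" in lowered:
        return "legal"
    return "general"

def answer(prompt):
    return EXPERTS[route(prompt)](prompt)

print(answer("Review this contract clause for termination rights"))
print(answer("Write a function that deduplicates a list"))
```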
Composition of Experts vs. Mixture of Experts
It’s important to distinguish the Composition of Experts from the Mixture of Experts approach used by some LLMs, such as Mistral. Liang explained that a Mixture of Experts is a single model trained across multiple datasets, which can put data privacy at risk because those datasets are blended together during training.
In contrast, the Composition of Experts maintains the security of each model by training them on separate, secure datasets. This approach ensures that security protocols during training extend to deployment and inference.
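In rough terms, the contrast comes down to where the training data lives. The snippet below is a conceptual sketch only; `train_single_model` and the dataset names are placeholders for illustration, not either architecture’s real training code.

```python
def train_single_model(records):
    # Stand-in for a real training loop; returns a trivial "model" summary.
    return {"trained_on_records": len(records)}

# Mixture of Experts (conceptually): one model, one pooled corpus. Any
# per-dataset access control is lost once the data is blended.
def train_mixture_of_experts(named_datasets):
    pooled = [record for ds in named_datasets.values() for record in ds]
    return train_single_model(pooled)

# Composition of Experts (conceptually): each expert is trained only on its
# own secured dataset, so dataset boundaries survive into deployment, where
# a router decides which expert may answer a given request.
def train_composition_of_experts(named_datasets):
    return {name: train_single_model(ds) for name, ds in named_datasets.items()}

datasets = {"hr_records": ["rec1", "rec2"], "support_tickets": ["rec3"]}
print(train_mixture_of_experts(datasets))      # one blended model
print(train_composition_of_experts(datasets))  # one model per secured dataset
```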
Tailored Solutions Over a Trillion Parameters
While Samba-1 boasts a trillion parameters, organizations may not always require this scale for deployment. By leveraging multiple specialized models, Samba-1 offers broad capabilities more efficiently.
“Not every prompt requires activating the entire trillion parameters at once,” Liang explained. This leads to improved efficiency, reduced power and bandwidth usage, and a lighter operational footprint, as only the necessary expert is employed.
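As a back-of-the-envelope illustration of that efficiency argument, the sketch below compares the parameters exercised by a single routed expert against the aggregate parameter count. The expert names and sizes are hypothetical round numbers, not Samba-1’s actual composition.

```python
# Illustration only: parameters touched per prompt when just the selected
# expert runs. Expert sizes below are hypothetical round numbers.

EXPERT_SIZES_B = {  # parameters, in billions
    "llama-2-70b": 70,
    "mistral-7b": 7,
    "deepseek-coder-33b": 33,
    "falcon-40b": 40,
    # ...dozens more experts would bring the aggregate toward a trillion
}

total_params_b = sum(EXPERT_SIZES_B.values())

def params_activated(expert_name):
    """Only the routed expert's weights are exercised for a given prompt."""
    return EXPERT_SIZES_B[expert_name]

print(f"Aggregate parameters across listed experts: ~{total_params_b}B")
print(f"Activated for a coding prompt: {params_activated('deepseek-coder-33b')}B")
```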
SambaNova empowers customers to train models on their proprietary data, allowing businesses to develop unique, optimized assets. “With Samba-1, you can have your own private trillion-parameter model, and once it’s trained on your data, it belongs to you indefinitely,” Liang stated.