Transformers play a vital role in the generative AI landscape, but they are not the only way to build models.
AI21 launched new iterations of its Jamba model today, combining transformers with a structured state space model (SSM) approach. The Jamba 1.5 Mini and Large versions build on the innovations introduced with Jamba 1.0 in March. Jamba uses an SSM architecture called Mamba, aiming to merge the strengths of transformers and SSMs. The name "Jamba" stands for Joint Attention and Mamba architecture, and the combination promises performance and accuracy beyond what either approach achieves alone.
"We received amazing feedback from the community; this was the first—and remains one of the only—production-scale models based on Mamba," said Or Dagan, VP of Product at AI21. "This novel architecture has sparked discussions about the future of LLM architectures and the role of transformers."
The Jamba 1.5 series introduces enhanced functionality, including function calling, JSON mode, structured document objects, and citation mode. These additions position the models as strong candidates for building agentic AI systems. Both versions feature a 256K-token context window and use a mixture-of-experts (MoE) architecture: Jamba 1.5 Mini has 52 billion total parameters (12 billion active), while Jamba 1.5 Large has 398 billion total parameters (94 billion active).
These models are available under an open license, with AI21 providing commercial support and services. The company has established partnerships with AWS, Google Cloud, Microsoft Azure, Snowflake, Databricks, and Nvidia.
New Features of Jamba 1.5: Accelerating Agentic AI
The Jamba 1.5 Mini and Large models feature several new capabilities aimed at meeting the evolving requirements of AI developers:
- JSON Mode for efficient structured data handling
- Citations to enhance accountability
- Document API for improved context management
- Function Calling capabilities
According to Dagan, these enhancements are crucial for developers building agentic AI systems. JSON (JavaScript Object Notation) is widely used to wire together application workflows, and native JSON mode gives complex AI pipelines structured outputs that downstream code can parse directly, extending well beyond basic language model usage, as the sketch below illustrates.
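As a minimal sketch of what JSON mode enables, the request below assumes AI21's OpenAI-style chat completions endpoint; the URL, field names, and model identifier mirror AI21's published API but should be verified against current documentation:

```python
import json
import os

import requests

# Assumed endpoint for Jamba chat completions; check AI21's docs.
API_URL = "https://api.ai21.com/studio/v1/chat/completions"

payload = {
    "model": "jamba-1.5-mini",
    "messages": [
        {
            "role": "user",
            "content": "Extract the product name and price from: "
                       "'The Acme X200 widget costs $49.99.' "
                       "Respond as JSON with keys 'name' and 'price'.",
        }
    ],
    # JSON mode constrains the model to emit valid JSON, so the
    # workflow code below can parse the reply without guesswork.
    "response_format": {"type": "json_object"},
}

resp = requests.post(
    API_URL,
    headers={"Authorization": f"Bearer {os.environ['AI21_API_KEY']}"},
    json=payload,
    timeout=60,
)
resp.raise_for_status()

# Because of JSON mode, this parse should not choke on free-form prose.
structured = json.loads(resp.json()["choices"][0]["message"]["content"])
print(structured["name"], structured["price"])
```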
"We teach the model to attribute relevant content to the documents provided during generation," Dagan explained.
Distinguishing Citation Mode from RAG
It’s important to differentiate citation mode from Retrieval Augmented Generation (RAG), though both aim to ground AI outputs in reliable data.
Dagan clarified that Jamba 1.5's citation mode is designed for seamless integration with the document API, offering a more holistic approach than traditional RAG. In standard RAG setups, developers wire a language model to a vector database and rely on the model to incorporate the retrieved data effectively into its outputs.
In contrast, citation mode is built into Jamba 1.5 itself: the model ingests the provided documents, integrates them into its answer, and explicitly cites the sources it used. That makes outputs more transparent and traceable than in conventional LLM workflows, where the model's reasoning can be opaque.
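For contrast, a conventional RAG loop looks something like the sketch below, in which retrieval and prompt assembly happen entirely outside the model; the vector_db and llm objects are hypothetical stand-ins, not a real library API:

```python
# Hypothetical conventional RAG loop, for contrast with citation mode.
def answer_with_rag(query: str, vector_db, llm) -> str:
    # 1. Retrieve: nearest-neighbor search over pre-embedded chunks.
    chunks = vector_db.search(query, top_k=5)

    # 2. Stuff: paste the retrieved text into the prompt by hand.
    context = "\n\n".join(chunk.text for chunk in chunks)
    prompt = (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )

    # 3. Generate: the model never sees the chunks as structured
    #    documents, so any "citations" are unverified prose.
    return llm.generate(prompt)
```

Here the grounding lives in glue code and prompt engineering; in Jamba 1.5's citation mode, the attribution behavior is trained into the model and the documents are first-class inputs.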
AI21 also supports RAG solutions and provides an end-to-end managed service that includes document retrieval and indexing.
Looking ahead, Dagan emphasized AI21's commitment to evolving its models to meet customer demands, with an ongoing focus on advancing agentic AI capabilities. "We recognize the necessity to innovate in agentic AI systems, especially regarding planning and execution," he stated.