Today, the Abu Dhabi-backed Technology Innovation Institute (TII) announced the launch of Falcon Mamba 7B, an innovative open-source model designed for advanced text-generation tasks. This new addition joins TII's lineup, following the successful releases of Falcon 180B, Falcon 40B, and Falcon 2.
Available on Hugging Face, Falcon Mamba 7B employs the Mamba State Space Language Model (SSLM) architecture. The decoder-only model excels at text-generation tasks, outperforming models in its size class, including Meta's Llama 3 8B and Llama 3.1 8B and Mistral 7B, on several key benchmarks.
As the first SSLM model from TII, Falcon Mamba 7B offers an exciting new alternative to traditional transformer-based large language models. The model is released under the permissive Falcon License 2.0, based on Apache 2.0.
What sets Falcon Mamba 7B apart?
Transformers are popular for generative AI, but their attention mechanism compares every token against every other token, so compute grows quadratically with sequence length and memory grows with the size of the context window. Without adequate resources, inference slows, limiting the text length that can be processed.
In contrast, the SSLM architecture maintains a dynamic “state” of fixed size, continuously updating it as the model processes each token. Input-dependent parameters let the model selectively retain or discard information, achieving an effect similar to transformer attention while handling lengthy texts, even entire books, without its memory or compute requirements growing with sequence length. This makes Falcon Mamba suitable for a range of applications, including enterprise-scale machine translation, text summarization, and audio processing.
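To make that contrast concrete, here is a minimal, illustrative sketch of the recurrence at the heart of a selective state-space layer. It is not TII's implementation; the shapes and names (selective_scan, the state dimension n) are assumptions chosen for clarity, following the general Mamba-style update h_t = A_t * h_{t-1} + B_t * x_t, y_t = C_t * h_t.

```python
import numpy as np

def selective_scan(x, A, B, C):
    """Minimal selective state-space recurrence (illustrative only).

    x: (seq_len, d)  input token features
    A: (seq_len, n)  per-token state decay (input-dependent, hence "selective")
    B: (seq_len, n)  per-token input projection
    C: (seq_len, n)  per-token output projection
    Returns y: (seq_len, d), computed with a fixed-size state.
    """
    seq_len, d = x.shape
    n = A.shape[1]
    h = np.zeros((n, d))  # the model's entire "memory": constant size,
                          # no matter how long the sequence grows
    y = np.empty_like(x)
    for t in range(seq_len):
        # Update the state: decay old information, fold in the new token.
        h = A[t][:, None] * h + B[t][:, None] * x[t][None, :]
        # Read out: project the state back to feature space.
        y[t] = C[t] @ h
    return y
```

The key point is that h never grows: each new token is folded into the same fixed-size state, whereas a transformer's key-value cache expands with every token it has seen.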
Testing against industry giants
To evaluate Falcon Mamba 7B's capabilities, TII tested it against leading transformer models on a single 24GB A10 GPU. The results indicated that Falcon Mamba can fit longer sequences than state-of-the-art transformer models and, because it processes input token by token with a fixed-size state, can in principle handle arbitrarily long contexts.
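For readers who want to try the model themselves, a minimal sketch using the Hugging Face transformers library might look like the following. The checkpoint ID tiiuae/falcon-mamba-7b reflects the Hugging Face release, while the dtype and generation settings shown are illustrative assumptions, not TII's recommended configuration.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tiiuae/falcon-mamba-7b"  # checkpoint as published on Hugging Face

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half precision fits more easily on a 24GB card
    device_map="auto",
)

prompt = "State space models differ from transformers in that"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Generation proceeds token by token; the model's fixed-size state means
# peak memory stays flat however long the generated sequence gets.
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```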
In throughput tests, Falcon Mamba 7B also outperformed Mistral 7B, maintaining a consistent token-generation speed with no increase in peak CUDA memory usage. On industry benchmarks such as ARC, TruthfulQA, and GSM8K, Falcon Mamba scored 62.03%, 53.42%, and 52.54%, respectively, convincingly beating Llama 3 8B and its peers, though it trailed those models slightly on the MMLU and HellaSwag benchmarks.
This initial release marks a significant milestone for TII, which plans to further enhance the model's design and expand its application areas. “This release represents a major advancement, innovation, and fresh perspectives in the quest for intelligent systems,” stated Dr. Hakim Hacid, TII's acting chief researcher of the AI cross-center unit.
Overall, TII’s Falcon models have seen over 45 million downloads, establishing themselves as a leading force in generative AI from the UAE.