MIT Spinoff Liquid Launches State-of-the-Art Non-Transformer AI Models

Liquid AI, a startup co-founded by former MIT researchers from the Computer Science and Artificial Intelligence Laboratory (CSAIL), has launched its first multimodal AI models: the Liquid Foundation Models (LFMs).

Unlike most current generative AI models, which rely on the transformer architecture introduced in the landmark 2017 paper "Attention Is All You Need," Liquid AI is pursuing alternatives to Generative Pre-trained Transformers (GPTs). The company says the LFMs are built from "first principles," much as engineers design engines and aircraft.

Liquid AI reports that the LFMs outperform comparably sized transformer-based models such as Meta's Llama 3.1-8B and Microsoft's Phi-3.5-mini (3.8B). The models come in three sizes: LFM-1.3B, LFM-3B, and the larger LFM-40B MoE (a Mixture-of-Experts model), where the "B" denotes billions of parameters. Typically, a higher parameter count indicates greater capability across diverse tasks.

The LFM-1.3B model has already surpassed Meta's Llama 3.2-1.2B and Microsoft's Phi-1.5 on several third-party benchmarks, including the Massive Multitask Language Understanding (MMLU) test, a notable achievement for a non-GPT architecture. All three models balance performance with memory efficiency: Liquid's LFM-3B requires only 16 GB of memory, whereas Meta's Llama-3.2-3B needs more than 48 GB.

Maxime Labonne, Head of Post-Training at Liquid AI, expressed pride in the LFMs on social media, highlighting their ability to outperform transformer-based models on benchmarks while using significantly less memory. The models are optimized for a range of applications, from enterprise solutions in finance, biotechnology, and consumer electronics to deployment on edge devices.

Note, however, that the LFMs are not open source. Users must access them through Liquid's inference playground, Lambda Chat, or Perplexity AI.

Liquid built the LFMs from a mix of computational units grounded in dynamical systems theory, signal processing, and numerical linear algebra. The result is a family of general-purpose AI models that can handle many kinds of sequential data, including video, audio, text, and time series.

Last year, reports highlighted Liquid AI's focus on Liquid Neural Networks (LNNs), an architecture developed at CSAIL to make artificial neurons more efficient and adaptable. Where traditional deep learning models need large numbers of neurons for complex tasks, LNNs showed that fewer neurons, combined with novel mathematical formulations, can achieve comparable results.
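To make that idea concrete, here is a minimal sketch of a liquid time-constant (LTC) cell of the kind described in the published CSAIL research behind LNNs (Hasani et al., AAAI 2021). The key property is that the hidden state follows an ordinary differential equation whose effective time constant depends on the current input, so the neuron adapts its dynamics as data streams in. This is an illustrative sketch of the published equations, not Liquid AI's actual LFM code; all names, sizes, and weights are hypothetical.

```python
import numpy as np

# Minimal liquid time-constant (LTC) cell, after Hasani et al. (AAAI 2021).
# The hidden state x follows dx/dt = -[1/tau + f(x, I)] * x + f(x, I) * A,
# so the effective time constant 1 / (1/tau + f) shrinks or grows with the
# input I. Everything here (sizes, weights, names) is illustrative; it is
# not Liquid AI's LFM implementation.

class LTCCell:
    def __init__(self, input_size, hidden_size, tau=1.0, seed=0):
        rng = np.random.default_rng(seed)
        self.W_in = rng.normal(0.0, 0.1, (hidden_size, input_size))
        self.W_rec = rng.normal(0.0, 0.1, (hidden_size, hidden_size))
        self.bias = np.zeros(hidden_size)
        self.A = np.ones(hidden_size)  # equilibrium the gated term pulls toward
        self.tau = tau                 # base time constant

    def gate(self, x, inputs):
        # Positive, input- and state-dependent gate; the sigmoid keeps the
        # conductance nonnegative, as in the LTC formulation.
        z = self.W_rec @ x + self.W_in @ inputs + self.bias
        return 1.0 / (1.0 + np.exp(-z))

    def step(self, x, inputs, dt=0.05):
        # One fused (semi-implicit) Euler update of the LTC ODE.
        f = self.gate(x, inputs)
        return (x + dt * f * self.A) / (1.0 + dt * (1.0 / self.tau + f))


# Usage: drive a 16-neuron cell with a slowly varying 4-dimensional signal.
cell = LTCCell(input_size=4, hidden_size=16)
x = np.zeros(16)
for t in range(100):
    x = cell.step(x, np.sin(0.1 * t) * np.ones(4))
print(np.round(x[:4], 3))
```

The fused update in `step` is the semi-implicit solver proposed in the LTC paper: because the gate also appears in the denominator, the state stays bounded without requiring a tiny step size, which helps make such cells cheap to run continuously at inference time.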

The LFMs build on this adaptability, allowing real-time adjustment during inference with minimal computational overhead. For instance, the LFM-3B model excels at long-context processing while maintaining a smaller memory footprint than models like Google's Gemma-2, Microsoft's Phi-3, and Meta's Llama-3.2.

Through this multimodal capability, Liquid AI aims to address industry challenges across financial services, biotechnology, and consumer electronics.

The models are currently in preview, and Liquid AI encourages early adopters to test them and share feedback. A full launch event is scheduled for October 23, 2024, at MIT's Kresge Auditorium in Cambridge, MA, with RSVPs being accepted. In the lead-up, Liquid AI plans to release a series of technical blog posts and to invite red-teaming, encouraging users to stress-test the models to inform future improvements.

With the launch of the Liquid Foundation Models, Liquid AI aims to establish itself as a serious contender in the foundation model space, pairing strong benchmark performance with markedly lower memory requirements.
