AI21 CEO Claims Transformers Are Unsuitable for AI Agents Because of Error Propagation Issues

As enterprise organizations pursue the agentic future, the architecture of AI models poses a significant challenge. Ori Goshen, CEO of AI21, emphasizes the need for alternative model architectures to create more efficient AI agents, as the prevailing Transformer models present limitations that hinder the establishment of a multi-agent ecosystem.

In a recent interview, Goshen highlighted the drawbacks of the Transformer architecture: its computational cost grows rapidly with context length, slowing responses and escalating serving costs. "Agents require multiple calls to LLMs with extensive context at each step, making the Transformer a bottleneck," he noted.
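To make that scaling concern concrete, here is a back-of-envelope sketch (not AI21's figures; the layer count, model width, and token counts are illustrative assumptions) of how self-attention cost grows as an agent loop keeps appending tool results to its context:

```python
# Illustrative sketch: self-attention over n tokens costs roughly O(n^2) per layer,
# so an agent loop that re-sends a growing context pays an ever larger bill each step.

def attention_cost(n_tokens: int, n_layers: int = 32, d_model: int = 4096) -> float:
    """Very rough FLOP estimate for one forward pass of self-attention."""
    return n_layers * (n_tokens ** 2) * d_model

context = 2_000          # initial prompt tokens (assumed)
per_step_growth = 1_500  # tokens added by each tool result / model reply (assumed)

for step in range(1, 6):
    flops = attention_cost(context)
    print(f"step {step}: context={context:>6} tokens, ~{flops:.2e} attention FLOPs")
    context += per_step_growth
```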

AI21 advocates a more flexible approach to model architecture, arguing that while Transformers remain a viable option, they should not be the default. The company's Jamba architecture, short for Joint Attention and Mamba, combines Transformer attention layers with the Mamba state-space architecture developed by researchers at Princeton and Carnegie Mellon to speed up inference and extend context capacity.

Goshen explains that Mamba-based models make more efficient use of memory, which helps agents, especially those that coordinate with other models. In his view, the limitations of Transformer-based LLMs are a major reason why, despite the recent surge of interest, AI agents have been slow to move into production.

"The primary reason agents remain in development—and have not yet seen widespread production—is reliability. Since LLMs are inherently stochastic, additional measures must be implemented to ensure the necessary reliability," Goshen stated.

AI agents have surfaced as a leading trend in enterprise AI this year, with several companies launching new platforms for agent development. For instance, ServiceNow upgraded its Now Assist AI platform to include a library of AI agents, while Salesforce introduced its Agentforce. Meanwhile, Slack is enabling users to integrate agents from various companies, including Salesforce, Cohere, and Adobe.

Goshen believes that with the right mix of models and architectures, interest in AI agents will escalate. "Current use cases, like chatbot question-and-answer functions, mainly resemble enhanced search. True intelligence lies in the ability to connect and retrieve diverse information from multiple sources," he commented. AI21 is actively developing its offerings around AI agents to meet this demand.

As the Mamba architecture gains traction, Goshen remains a vocal supporter, asserting that the cost and complexity of Transformers limit their practical applications. Unlike Transformers, which attend over the entire context at every step, Mamba carries a compact, fixed-size state forward, reducing memory usage and making better use of GPU processing power.
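A rough way to see the difference (illustrative sizes only, not benchmarks of Jamba or any specific model): a Transformer's key-value cache grows with every token in the context, while a Mamba-style state-space model carries a fixed-size state regardless of sequence length.

```python
# Rough memory comparison with assumed layer count, model width, and state size.

def kv_cache_bytes(n_tokens, n_layers=32, d_model=4096, bytes_per_val=2):
    # keys + values stored for every layer and every token seen so far
    return 2 * n_layers * n_tokens * d_model * bytes_per_val

def ssm_state_bytes(n_layers=32, d_model=4096, state_dim=16, bytes_per_val=2):
    # fixed-size recurrent state, independent of how many tokens were processed
    return n_layers * d_model * state_dim * bytes_per_val

for n in (4_000, 32_000, 256_000):
    print(f"{n:>7} tokens | KV cache ~{kv_cache_bytes(n)/1e9:6.1f} GB "
          f"| SSM state ~{ssm_state_bytes()/1e9:6.3f} GB")
```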

Demand for Mamba is on the rise, with other developers releasing Mamba-based models such as Mistral's Codestral Mamba 7B and TII's Falcon Mamba 7B. Nevertheless, Transformers remain the standard choice for foundation models, including OpenAI's GPT series.

Ultimately, Goshen notes that enterprises care more about reliability than about any particular architecture, though he warns organizations not to be swayed by impressive demos. "We're in a phase where captivating demonstrations are prevalent, but we are still transitioning towards an applicable product phase," he cautioned. "While enterprise AI is valuable for research, it is not yet ready to inform critical business decisions."
