The New Tech Arms Race: A Billion-Dollar Quest to Develop Advanced AI

During recent testing, a newly released large language model (LLM) appeared to recognize that it was being evaluated, a behavior some interpreted as a hint of metacognition, or an awareness of its own thought processes. The observation sparked debate about AI self-awareness, but the more important takeaway is the model's sheer capability, a reflection of the gains that have come with ever-larger LLMs.

As LLMs grow, so do both their emergent abilities and development costs. Training costs for leading models now reach approximately $200 million, raising concerns about the industry's future accessibility. Much like the semiconductor industry, where only a few companies can afford cutting-edge chip fabrication plants, the AI sphere may soon be dominated by major tech corporations with the resources to develop leading foundation models such as GPT-4 and Claude 3.

The rapid surge in training costs, arriving alongside capabilities that approach or surpass human performance on some benchmarks, poses a significant challenge. Anthropic, a prominent player in the field, reports that training its flagship model, Claude 3, cost around $100 million. Models now in development and expected in 2024 or early 2025 may approach billion-dollar price tags.

Understanding these rising costs requires looking at the growing scale of LLMs. Each new generation packs in more parameters in pursuit of deeper understanding, and more parameters mean more training data and more computing time. By 2025 or 2026, training expenses could reach between $5 billion and $10 billion, limiting development to a handful of large corporations and their partners.
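
To make the scaling argument concrete, here is a rough back-of-the-envelope sketch in Python using the common approximation that training a dense transformer takes about 6 × N × D floating-point operations for N parameters and D training tokens. Every specific number in it (parameter count, token count, accelerator throughput, utilization, and price per GPU-hour) is an illustrative assumption, not a figure reported in this article.

```python
def training_cost_estimate(
    params: float,                 # model parameters, N
    tokens: float,                 # training tokens, D
    flops_per_gpu_per_s: float,    # peak throughput of one accelerator
    utilization: float,            # fraction of peak actually sustained
    dollars_per_gpu_hour: float,   # rental price per accelerator-hour
) -> float:
    """Rough compute-cost estimate using the ~6 * N * D FLOPs approximation."""
    total_flops = 6 * params * tokens
    effective_flops_per_s = flops_per_gpu_per_s * utilization
    gpu_hours = total_flops / effective_flops_per_s / 3600
    return gpu_hours * dollars_per_gpu_hour


# Illustrative assumptions only: a 1-trillion-parameter model trained on
# 10 trillion tokens, on accelerators with ~1e15 FLOP/s peak running at
# 40% utilization, rented at $2 per GPU-hour.
cost = training_cost_estimate(
    params=1e12,
    tokens=1e13,
    flops_per_gpu_per_s=1e15,
    utilization=0.4,
    dollars_per_gpu_hour=2.0,
)
print(f"Estimated compute cost: ${cost / 1e6:,.0f} million")
```

Even under these assumptions the figure covers raw compute only; staff, data acquisition, and failed experimental runs push real budgets considerably higher, and each step up in parameters and tokens multiplies the bill.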

The AI industry's trajectory mirrors that of the semiconductor sector, which saw a shift from companies manufacturing their own chips to outsourcing fabrication as costs soared. Today, only three companies—TSMC, Intel, and Samsung—can build advanced fabrication plants, with TSMC estimating a new state-of-the-art semiconductor fab could cost around $20 billion.

Not every AI application demands a cutting-edge LLM, so the impact of rising costs is uneven. In computing, the central processing unit (CPU) is often built on the most advanced semiconductors, but it operates alongside slower chips that don't need leading-edge technology. Similarly, smaller open models such as Mistral and Llama 3, with parameter counts in the single-digit billions to tens of billions, can provide effective solutions at far lower cost. Microsoft's Phi-3-mini, a small language model (SLM) with 3.8 billion parameters, demonstrates this approach, reducing costs by relying on a smaller training dataset than its larger counterparts.
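
As a sketch of what working with such a model looks like in practice, the snippet below loads a Phi-3-mini checkpoint through the Hugging Face transformers library and asks it a narrow, company-specific question. The hub identifier, precision settings, and prompt are assumptions for illustration, and depending on the library version additional loading options may be required.

```python
# Minimal sketch of running a small language model locally, assuming the
# Hugging Face hub ID "microsoft/Phi-3-mini-4k-instruct"; requires the
# transformers, torch, and accelerate packages.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/Phi-3-mini-4k-instruct"  # assumed hub identifier

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half-precision weights keep memory modest
    device_map="auto",           # place layers on GPU/CPU automatically
)

# Chat-style prompt for a narrow, domain-specific task (hypothetical example).
messages = [
    {"role": "user", "content": "Summarize our Q3 support-ticket categories."}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=200)
# Strip the prompt tokens and print only the newly generated text.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

A model of this size fits comfortably on a single consumer-grade GPU, which is where much of its cost advantage over frontier models comes from.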

These smaller models may be ideal for specific tasks that don't require comprehensive knowledge across various domains. For instance, they can be tailored to address company-specific data or industry needs, generating accurate responses or detailed research outputs. As senior AI analyst Rowan Curran of Forrester Research aptly stated, “You don’t need a sports car all the time. Sometimes you need a minivan or a pickup truck.”

However, the rising costs of AI development risk creating a landscape dominated by a few major players, much as high-end semiconductor manufacturing is today. This consolidation could stifle innovation and diversity, limiting contributions from startups and smaller companies. To counteract the trend, it is essential to promote specialized language models for niche applications and to support open-source projects and collaborative efforts. An inclusive approach will help ensure that AI technologies remain accessible and beneficial to a broader range of communities, fostering equitable opportunities for innovation.

Gary Grossman is the EVP of technology practice at Edelman and serves as the global lead of the Edelman AI Center of Excellence.
