Meta Launches Advanced Custom AI Chip in a Bid to Compete With Rivals

Meta is determined to close the gap with its competitors in the generative AI sector, investing billions in its AI initiatives. A significant portion of this investment is directed towards attracting top AI researchers. However, an even larger share is allocated to developing the hardware needed to train and run Meta's AI models, particularly custom chips.

Today, Meta introduced the latest advancement from its chip development efforts, just a day after Intel launched its new AI accelerator. This chip, named the "next-gen" Meta Training and Inference Accelerator (MTIA), succeeds last year's MTIA v1 and is designed to optimize the performance of various models, including those used for ranking and recommending display ads across Meta's platforms such as Facebook.

In terms of specifications, the next-gen MTIA uses a cutting-edge 5nm manufacturing process, a significant improvement over the 7nm process used for MTIA v1. This newer chip features a larger physical design with more processing cores. Although it consumes more power—90W compared to the previous 25W—it offers enhanced internal memory (128MB versus 64MB) and operates at a higher average clock speed (1.35GHz, up from 800MHz).
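To put the generation-over-generation changes side by side, here is a minimal Python sketch that computes the ratios from the figures above. The dictionary keys are informal labels of ours, not Meta's spec names, and the values are simply the numbers quoted in this article:

```python
# Generation-over-generation ratios, using only the figures quoted above.
# Note: for the process node, a smaller number is the improvement.
specs = {
    "process_nm": (7, 5),      # MTIA v1 -> next-gen MTIA
    "tdp_watts": (25, 90),
    "sram_mb": (64, 128),
    "clock_ghz": (0.8, 1.35),
}

for name, (v1, next_gen) in specs.items():
    print(f"{name}: {v1} -> {next_gen} ({next_gen / v1:.2f}x)")
# tdp_watts comes out to 3.60x, clock_ghz to 1.69x, sram_mb to 2.00x
```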

According to Meta, the next-gen MTIA is already live in 16 of its data center regions and is delivering up to three times the overall performance of its predecessor, MTIA v1. Meta is vague about how it arrived at that "3x" figure, saying only that it comes from testing the performance of "four key models" across both chips. In a recent blog post, the company argues that because it controls the entire technology stack, it can achieve greater efficiency than commercially available GPUs.
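Read alongside the power figures above, that claim allows a rough efficiency estimate. The sketch below divides the claimed 3x performance gain by the 3.6x increase in TDP; the resulting performance-per-watt figure is a back-of-the-envelope inference from the published numbers, not something Meta states:

```python
# Back-of-the-envelope perf-per-watt comparison. The 3x throughput figure
# is Meta's claim; the per-watt inference is ours, not Meta's.
perf_ratio = 3.0        # next-gen MTIA vs. MTIA v1, per Meta's blog post
power_ratio = 90 / 25   # 90W TDP vs. 25W TDP = 3.6x

print(f"Relative perf/watt: {perf_ratio / power_ratio:.2f}x")  # ~0.83x
```

If both published numbers hold, the new chip trades a small amount of raw efficiency for a large jump in absolute throughput per chip.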

The unveiling of Meta's hardware follows closely on the heels of a press briefing showcasing the company's ongoing generative AI projects, and it is notable for a couple of reasons. First, the blog post indicates that the next-gen MTIA is not currently being used for generative AI training workloads, although Meta says it has multiple such projects underway. Second, the company acknowledges that the chip will not replace GPUs for running or training models; instead, it will complement them.

While it's clear that Meta is making strides, the company appears to be treading cautiously, possibly more slowly than it would like. There is mounting pressure on Meta's AI teams to cut expenses, especially given the company's forecast spending of around $18 billion on GPUs for generative AI work by the end of 2024. With training costs for leading generative models running into the tens of millions of dollars, in-house hardware presents a more economically attractive alternative.

As Meta navigates its hardware development challenges, competitors are advancing rapidly, which is likely a source of concern for the company's leadership. Recently, Google announced its fifth-generation custom chip for training AI models, the TPU v5p, which is now available to Google Cloud customers. Additionally, Amazon has several proprietary AI chip families, and Microsoft recently entered the competition with its Azure Maia AI Accelerator and Azure Cobalt 100 CPU.

In the same blog post, Meta reported that it took under nine months to transition from initial silicon to production models of the next-gen MTIA, a notably faster timeline than the typical development period for Google’s TPUs. However, Meta has significant work ahead to reduce reliance on third-party GPUs and compete effectively against its rivals in the AI landscape.
