Intel has unveiled its next-generation AI processing chip, the Gaudi 3 AI accelerator, designed to enhance AI development by streamlining workflows, simplifying infrastructure, and accelerating enterprise workloads.
The Gaudi 3 retains its predecessor's architecture but delivers substantially more headroom: four times the computing power, double the network bandwidth, and 1.5 times the high-bandwidth memory (HBM) capacity, enabling it to keep pace with the growing demands of large language models (LLMs).
With parallel processing capabilities rooted in graphics processing unit (GPU) technology and a multi-tile architecture, the Gaudi 3 is well suited to the role of AI accelerator. The launch is part of Intel's strategy to compete with Nvidia and AMD in the AI accelerator market.
Intel CEO Patrick Gelsinger previewed the Gaudi 3 at the AI Everywhere event and announced that while the chip officially launches today, general availability is set for the third quarter of 2024, with some customers already receiving samples.
According to Jeni Barovian, Intel’s vice president for data center AI solutions, “Generative AI represents a foundational transformation of compute.” She emphasized that Gaudi 3 will deliver the performance, scalability, and efficiency required to build future AI systems.
Intel Gaudi 3: Specifications and Performance
Eitan Medina, COO of Intel’s Habana Labs, describes the Gaudi 3 as featuring a heterogeneous compute architecture that includes 64 fifth-generation Tensor processor cores, 8 Matrix Math Engines, 128 GB of HBM capacity with 3.7 TB/s of bandwidth, and 24 ports of 200 GbE RoCE Ethernet.
Building solutions with Gaudi 3 is designed to be as straightforward as with Gaudi 2. Intel has doubled the network bandwidth per accelerator, allowing for extensive cluster configurations based on workload needs—be it inference, fine-tuning, or training.
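As a rough illustration of what those networking figures imply for scale-out, the arithmetic below converts the 24 ports of 200 GbE quoted above into an aggregate per-accelerator bandwidth. This is a raw line-rate calculation that ignores protocol overhead, so real-world throughput would be lower:

```python
# Back-of-the-envelope aggregate Ethernet bandwidth per Gaudi 3 accelerator,
# from the figures quoted above: 24 ports at 200 GbE each (raw line rate).
ports = 24
gbit_per_port = 200  # 200 GbE per port

total_gbit_s = ports * gbit_per_port  # 4800 Gb/s aggregate
total_gbyte_s = total_gbit_s / 8      # 600 GB/s (8 bits per byte)

print(f"Aggregate: {total_gbit_s} Gb/s ≈ {total_gbyte_s:.0f} GB/s per accelerator")
```

Doubling this per-accelerator bandwidth over Gaudi 2 is what lets clusters be sized to the workload, since scale-out traffic for training and fine-tuning rides over standard Ethernet rather than a proprietary interconnect.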
Comparison with Nvidia GPUs
When compared with Nvidia’s H100, a leading GPU for training large language models such as Llama 2 and GPT-3, the Gaudi 3 is projected to train up to 1.7 times faster. In inference tests using models such as Llama-7B and Falcon 180B, Gaudi 3 reportedly runs 1.5 times faster than the H100 and 1.3 times faster than the newer H200. Notably, Intel claims up to 2.3 times the power efficiency of the H100 on inference workloads.
Extensive Product Lineup
Intel is not only launching the Gaudi 3 chip but also three complementary products:
1. Gaudi 3 AI Accelerator Card (HL-325L): OAM-compliant, delivering 1,835 TFLOPS with 128 GB of HBM2e.
2. Universal Baseboard (HLB-325): Offers 14.6 PFLOPS and over 1 TB of HBM2e.
3. PCI Express Add-in Card: A dual-slot, passively cooled design with performance comparable to the OAM card.
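The board-level numbers line up with the per-card figures. A quick sanity check, assuming for illustration that the universal baseboard aggregates eight accelerator cards (a count not stated in the list above):

```python
# Cross-checking the HLB-325 baseboard figures against the HL-325L card specs.
# The eight-card count is an assumption for illustration, not stated above.
cards = 8
tflops_per_card = 1835   # HL-325L accelerator card
hbm_gb_per_card = 128    # HL-325L HBM2e capacity

board_pflops = cards * tflops_per_card / 1000  # TFLOPS -> PFLOPS
board_hbm_gb = cards * hbm_gb_per_card

print(f"{board_pflops:.2f} PFLOPS")  # 14.68, close to the quoted 14.6
print(f"{board_hbm_gb} GB HBM2e")    # 1024 GB, i.e. just over 1 TB
```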
The Future of AI in Enterprises
Intel’s Gaudi 3 addresses enterprise-level concerns, with Sachin Katti, senior VP for the network and edge group, asserting that we are entering an era of AI agents that can autonomously handle complex workflows. The next phase of AI will see these agents leveraging proprietary data, setting the stage for a significant transformation across industries.
Katti highlights the challenge of integrating unstructured, proprietary data into AI systems, which often remain CPU-dependent and scattered across various formats. He advocates for a modular, secure ecosystem where enterprises can choose from a range of compatible AI solutions, focusing on responsible deployment to ensure trustworthiness and mitigate bias.
Intel aims to leverage Gaudi's enhanced capabilities to attract customers away from the Nvidia ecosystem, especially as AI costs rise. With the AI chip market projected to grow substantially, Intel is positioning itself as a viable alternative, emphasizing an open and collaborative approach to AI solutions.
Conclusion
As generative AI marks a pivotal moment in computing, Intel's Gaudi 3 introduces competitive performance and efficiency aimed at transforming enterprise AI deployment. The company’s commitment to open standards and system compatibility highlights its dedication to supporting the evolving AI landscape, promising to meet the needs of diverse enterprises seeking to harness the power of AI.