Nvidia has introduced a powerful upgrade to its flagship AI chip, the H100, as it seeks to maintain its stronghold in the competitive GPU market. The newly launched H200 incorporates high bandwidth memory (HBM3e), setting a new standard for handling extensive datasets used in generative AI and demanding computational tasks.
The H200 is the first GPU to use HBM3e, offering 141GB of memory and delivering data at 4.8 terabytes per second. That is nearly double the memory capacity and 2.4 times the bandwidth of the earlier A100, which offers up to 80GB of memory. Nvidia claims that the H200 will nearly double inference speed on Meta’s 70 billion-parameter Llama 2 open-source large language model compared to the H100, with further gains expected through subsequent software updates.
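The comparison with the A100 can be checked with a few lines of arithmetic. The sketch below is illustrative only: the 80GB A100 capacity is an assumption taken from Nvidia's published specifications for the larger A100 variant, and the baseline bandwidth is simply what the quoted 2.4x figure implies, not a measured value.

```python
# Figures quoted in the article for the H200.
H200_MEMORY_GB = 141
H200_BANDWIDTH_TBS = 4.8
QUOTED_BANDWIDTH_GAIN = 2.4  # "2.4 times the bandwidth" of the A100

# Assumption: the 80GB A100 variant is the comparison baseline.
A100_MEMORY_GB = 80

# How many times more memory the H200 carries than the assumed baseline.
memory_ratio = H200_MEMORY_GB / A100_MEMORY_GB

# Baseline bandwidth implied by the quoted 2.4x gain.
implied_a100_bandwidth_tbs = H200_BANDWIDTH_TBS / QUOTED_BANDWIDTH_GAIN

print(f"memory ratio: {memory_ratio:.2f}x")
print(f"implied A100 bandwidth: {implied_a100_bandwidth_tbs:.1f} TB/s")
```

Under these assumptions the memory ratio comes out at about 1.76, which is consistent with "nearly double", and the implied baseline bandwidth is about 2 TB/s.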
Major cloud service providers, including AWS, Microsoft Azure, Google Cloud, Oracle Cloud, CoreWeave, Lambda, and Vultr, are set to deploy the H200 starting in 2024, which is expected to significantly bolster cloud-based AI capabilities.
In the face of this announcement, competitors like AMD and Intel are gearing up to challenge Nvidia's dominance in the GPU sector. AMD plans to release its MI300X chip this year, which offers up to 192GB of memory, a capacity that matters for large AI models whose weights must fit in accelerator memory. AMD demonstrated the MI300X by running a Falcon model with 40 billion parameters.
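To see why memory capacity matters for models of this size, a rough weights-only estimate helps. The sketch below assumes 16-bit weights (2 bytes per parameter) and ignores activations, the key-value cache, and framework overhead, so real requirements are higher. The helper name `weight_memory_gb` is hypothetical and not part of any vendor tool.

```python
# Bytes needed to store one parameter at common precisions.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}


def weight_memory_gb(num_params: float, dtype: str = "fp16") -> float:
    """Approximate memory for model weights only, in GB (10^9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


# Model sizes mentioned in the article.
falcon_40b_gb = weight_memory_gb(40e9)   # Falcon, 40 billion parameters
llama2_70b_gb = weight_memory_gb(70e9)   # Llama 2, 70 billion parameters

# Memory capacities quoted in the article.
MI300X_MEMORY_GB = 192
H200_MEMORY_GB = 141

print(f"Falcon 40B weights:  {falcon_40b_gb:.0f} GB")
print(f"Llama 2 70B weights: {llama2_70b_gb:.0f} GB")
print("Falcon 40B fits on one MI300X:", falcon_40b_gb <= MI300X_MEMORY_GB)
print("Llama 2 70B fits on one H200:", llama2_70b_gb <= H200_MEMORY_GB)
```

By this estimate, the 40 billion-parameter Falcon model needs about 80GB for its weights at 16-bit precision, and the 70 billion-parameter Llama 2 needs about 140GB, which leaves very little headroom on a single 141GB device once runtime memory is counted.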
Intel, on the other hand, is preparing to launch its AI-driven Meteor Lake chips in December, which are designed with a chiplet System on Chip (SoC) architecture. This innovative design allows for modularity and scalability, featuring Intel's first dedicated AI engine integrated directly into the SoC, aiming to deliver AI capabilities to personal computing on a broader scale.
Moreover, SambaNova Systems has introduced its SN40L chip, which the company says can handle models with up to 5 trillion parameters while supporting sequence lengths of more than 256k. SambaNova claims this yields higher-quality results and faster outcomes at a more accessible price point.
Despite the competition, Nvidia's H100 remains a premium product in the market, often retailing for more than $30,000 per chip, reflecting its advanced capabilities and the high demand for powerful AI hardware. As rival companies continue to innovate, the landscape of AI computing is set for significant transformation.