Despite their steep price of over $30,000, Nvidia’s H100 GPUs are in high demand and often back-ordered. Earlier this year, Google Cloud introduced a private preview of its A3 GPU virtual machines powered by H100 chips, integrated with Google’s advanced 200 Gbps Infrastructure Processing Units (IPUs). Now, at its Cloud Next conference, Google has announced that the A3 will be available to the public next month.
It remains to be seen if Google Cloud can meet the demand for these powerful chips, especially as they are tailored for training and deploying generative AI models and large language models.
When Google Cloud unveiled the A3 earlier this year, it highlighted the machine's ability to deliver up to 26 exaflops of AI performance. Thanks to the custom IPUs, the A3 offers up to 10 times more network bandwidth than the previous-generation A2 machines.
“A3 is specifically designed to train, fine-tune, and serve highly demanding and scalable generative AI workloads and large language models,” said Mark Lohmeyer, VP and GM of compute and machine learning infrastructure at Google Cloud, in a press conference ahead of today’s announcement. “It incorporates a variety of unique Google innovations, including our networking technologies and infrastructure processing, which are essential to supporting the immense scale and performance required for these tasks.”