Nvidia is making significant strides in computing at the ongoing GTC conference in San Jose.
CEO Jensen Huang, dressed in a black leather jacket, captivated a concert-like crowd during his keynote address. He unveiled the highly anticipated GB200 Grace Blackwell Superchip, which promises up to a 30 times performance increase for large language model (LLM) inference workloads. Huang also highlighted advancements in automotive, robotics, Omniverse, and healthcare, generating considerable buzz online.
No GTC event is complete without showcasing industry partnerships. Nvidia revealed how it is enhancing its collaborations with major tech companies, integrating its new AI computing infrastructure, software, and services. Here’s a summary of key partnerships announced:
AWS
Nvidia announced that AWS will offer its new Blackwell platform, featuring the GB200 NVL72 with 72 Blackwell GPUs and 36 Grace CPUs, on EC2 instances. The integration will let customers build and run real-time inference on multi-trillion-parameter LLMs at greater scale and lower cost than with earlier Nvidia GPUs. Additionally, the companies are bringing 20,736 GB200 superchips to Project Ceiba, an AI supercomputer developed exclusively on AWS, and will integrate Amazon SageMaker with Nvidia NIM inference microservices.
Google Cloud
Following suit, Google Cloud will incorporate Nvidia’s Grace Blackwell platform and NIM microservices into its cloud infrastructure. The company also announced support for JAX, a Python-native framework for high-performance LLM training, on Nvidia H100 GPUs, and is making it easier to deploy the Nvidia NeMo framework through Google Kubernetes Engine (GKE) and the Google Cloud HPC Toolkit. Additionally, Vertex AI will support Google Cloud A3 VMs powered by Nvidia H100 GPUs and G2 VMs powered by Nvidia L4 Tensor Core GPUs.
Microsoft
Microsoft confirmed plans to add NIM microservices and Grace Blackwell to Azure, alongside the new Quantum-X800 InfiniBand networking platform. The company is also integrating DGX Cloud with Microsoft Fabric to simplify custom AI model development and is making the newly launched Omniverse Cloud APIs available on the Azure platform. In healthcare, Azure will use Nvidia’s Clara suite of microservices and DGX Cloud to support rapid innovation in clinical research and care delivery.
Oracle
Oracle plans to leverage the Grace Blackwell computing platform across OCI Supercluster and OCI Compute instances, adopting both the Nvidia GB200 superchip and the B200 Tensor Core GPU. Oracle also announced that Nvidia NIM and CUDA-X microservices, including NeMo Retriever for RAG inference deployments, will help OCI customers bring greater insight and accuracy to their generative AI applications.
SAP
SAP is partnering with Nvidia to embed generative AI into its cloud solutions, including SAP Datasphere, SAP Business Technology Platform, and RISE with SAP. The company is also developing additional generative AI capabilities within SAP BTP using Nvidia’s generative AI foundry service, which includes DGX Cloud AI supercomputing and Nvidia AI Enterprise software.
IBM
IBM Consulting plans to combine its technology and industry expertise with Nvidia’s AI Enterprise software stack, including new NIM microservices and Omniverse technologies. This collaboration aims to accelerate AI workflows for clients, enhance use case optimization, and facilitate the development of industry-specific AI solutions, including digital twin applications for supply chain and manufacturing.
Snowflake
Snowflake has expanded its partnership with Nvidia to include integration with NeMo Retriever, a generative AI microservice that connects custom LLMs to enterprise data. This enhancement will improve the performance and scalability of chatbot applications developed with Snowflake Cortex. Additionally, the collaboration includes low-latency Nvidia TensorRT software for deep learning inference applications.
Besides Snowflake, other data platform providers, including Box, Dataloop, Cloudera, Cohesity, DataStax, and NetApp, have committed to using Nvidia microservices, particularly the new NIM technology, to optimize RAG pipelines and integrate proprietary data into generative AI applications.
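Several of these announcements center on RAG (retrieval-augmented generation), in which an LLM’s prompt is augmented with documents retrieved from an enterprise data store. As a rough illustration of the pattern such microservices implement, here is a minimal, self-contained sketch; the bag-of-words embedding and every function name are invented for illustration and bear no relation to the actual NeMo Retriever or NIM APIs:

```python
# Conceptual sketch of the retrieval step in a RAG pipeline:
# embed the corpus, rank documents by similarity to the query,
# and stitch the top hits into the prompt sent to the LLM.
# NOTE: this is a generic illustration, not Nvidia's API; the
# "embedding" is a toy bag-of-words, not a neural encoder.
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding' (real systems use neural encoders)."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(corpus, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(query: str, corpus: list[str]) -> str:
    """Assemble retrieved context plus the question into one prompt."""
    context = "\n".join(retrieve(query, corpus))
    return f"Context:\n{context}\n\nQuestion: {query}"

corpus = [
    "Q4 revenue grew 12% driven by cloud subscriptions.",
    "The cafeteria menu changes every Monday.",
    "Cloud subscription churn fell to 3% in Q4.",
]
print(build_prompt("How did cloud revenue do in Q4?", corpus))
```

Production pipelines swap the toy embedding for a neural encoder and a vector database, but the shape is the same: embed, retrieve the top-k documents, assemble the prompt.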
Nvidia GTC 2024 runs from March 18 to 21 in San Jose and online.