Ampere Computing Expands CPU Family and Focuses on AI Efficiency
Ampere Computing has announced that its AmpereOne chip family will expand to 256 cores by next year. The company is also collaborating with Qualcomm to develop cloud AI accelerators.
According to Jeff Wittich, Chief Product Officer, the new central processing unit (CPU) will deliver 40% more performance than any other CPU available today.
Collaboration with Qualcomm
Based in Santa Clara, California, Ampere will partner with Qualcomm Technologies to create a solution for AI inferencing by leveraging Qualcomm's high-performance, low-power Cloud AI 100 inference solutions alongside Ampere CPUs.
Ampere CEO Renee James emphasized the pressing energy challenges posed by AI advancements. "We initiated this journey six years ago because we recognized its importance," James said. "Low power no longer equates to low performance. Ampere has redefined the efficiency frontier of computing, providing superior performance within an efficient framework."
Addressing Data Center Energy Efficiency
The energy consumption of data centers is a growing concern. James highlighted that the rapid shift to AI has intensified the industry's energy challenges. "The current trajectory is unsustainable. Future data center infrastructures must retrofit existing air-cooled setups and build environmentally sustainable new facilities compatible with grid power availability," she stated.
Wittich supported this perspective, noting that the need for a new CPU was driven by the increasing power consumption in data centers, particularly due to AI. “It’s imperative that we develop solutions that enhance efficiency across general-purpose computing and AI,” Wittich added.
Ampere's Vision for AI Compute
Ampere is pioneering a comprehensive approach referred to as “AI Compute,” which encompasses both cloud-native capabilities and AI functionality. "Our CPUs can support a wide array of workloads from popular cloud-native applications to AI, integrating AI into traditional applications like data processing and media delivery," Wittich explained.
Future Roadmap
Ampere has set an ambitious roadmap for its data center CPUs. Key upcoming developments include a 12-channel, 256-core CPU manufactured on TSMC's N3 process. The previously announced 192-core CPU is already in production and available in the market.
Ampere and Qualcomm are collaborating to enhance their joint solution featuring Ampere CPUs and Qualcomm Cloud AI 100 Ultra, targeting large language model (LLM) inferencing in generative AI.
Wittich described their partnership as a commitment to creating highly efficient CPUs optimized for AI applications. “This solution will simplify customer adoption and provide innovative capabilities for AI inferencing,” he remarked.
Performance Enhancements
With the expansion of the 12-channel platform to include the new 256-core AmpereOne CPU, users can expect significant performance gains without complicated redesigns. The 192-core model on the 12-channel platform remains on track to launch later this year, marking an evolution from eight to twelve channels of memory.
Notably, Ampere's CPU technology is already being used to run Meta's Llama 3 on Oracle Cloud. Remarkably, Llama 3 runs on the 128-core Ampere Altra CPU without a GPU, matching the performance of an Nvidia A10 GPU paired with an x86 CPU while consuming only a third of the power.
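As a rough back-of-the-envelope sketch, the claim above (throughput parity at a third of the power) implies a 3x advantage in performance per watt. The wattage figures below are hypothetical round numbers chosen only to illustrate the ratio; the article reports ratios, not absolute power draw.

```python
# Sketch: parity in throughput at one third of the power implies
# 3x the performance per watt. Wattages here are placeholders,
# not published specifications.

def perf_per_watt(throughput: float, watts: float) -> float:
    """Throughput delivered per watt of power drawn."""
    return throughput / watts

# Assume equal inference throughput ("parity"), normalized to 1.0.
gpu_combo = perf_per_watt(throughput=1.0, watts=300.0)  # hypothetical A10 + x86 draw
cpu_only = perf_per_watt(throughput=1.0, watts=100.0)   # one third of that power

print(f"CPU-only perf/watt advantage: {cpu_only / gpu_combo:.1f}x")
```

Because only the ratio matters, any baseline wattage gives the same 3x result.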
UCIe Working Group and Competitive Edge
Recently, Ampere formed a UCIe working group as part of the AI Platform Alliance to enhance the flexibility of its CPUs, allowing for the integration of customer IP in future designs.
Ampere competes directly with AMD, highlighting its performance advantages. AmpereOne CPUs lead in performance per watt, outperforming AMD’s Genoa by 50% and Bergamo by 15%. For data centers looking to upgrade infrastructure, AmpereOne can deliver 34% more performance per rack.
The new AmpereOne OEM and ODM platforms are set to ship in the coming months.
Additionally, Ampere has partnered with NETINT to develop a solution using NETINT's Quadra T1U video processing chips, enabling the simultaneous transcoding of 360 live channels and real-time subtitling of 40 streams using OpenAI's Whisper model.
Ampere aims to be the backbone of computing in the AI era. Recent enhancements, including Memory Tagging, QoS Enforcement, and Mesh Congestion Management, culminate in the new FlexSKU feature, which lets customers use the same SKU for both scale-out and scale-up use cases.
By collaborating with Oracle, Ampere has successfully reduced operational costs by 28% while using just a third of the power required by competing Nvidia solutions. This approach enables users to operate with 15% fewer servers, 33% less rack space, and 35% reduced power consumption, aligning with Ampere’s commitment to efficiency and performance in AI computing.