Selling High-End GPUs Like Luxury Goods: The Race for AI Supremacy
In 2023, the surge in large AI models and the evolution of AI 2.0 have transformed the tech landscape into a lucrative arena dominated by power dynamics and fierce competition. Central to this escalating race is a critical question: Are you connected enough to secure high-end NVIDIA graphics cards?
Wang Fei, a sales manager at a server customization firm in East China, described the overwhelming demand for NVIDIA graphics cards: “Demand is skyrocketing. Some are reselling A100 80GB chips obtained before the export ban.” Even used, these chips now command prices exceeding 100,000 yuan because supply remains severely constrained. “You need to pay in cash for them to even consider selling to you.”
In contrast, the less powerful A100 40GB struggles to attract buyers even when new units are available. “We receive many inquiries, but actual sales are rare.” The in-demand cards are part of NVIDIA's "Tesla" series. Designed specifically for AI model training, these high-end GPUs significantly outperform traditional CPUs at that task, yet they are in short supply. NVIDIA launched the A100 and H100 chips in 2020 and 2022, respectively, with larger memory capacities that allow larger neural networks to run.
Since September 2022, the U.S. has banned NVIDIA and other companies from exporting high-end GPUs to China. To work around the ban, NVIDIA introduced the A800 and H800 models, tailored for the Chinese market, in 2022 and 2023, respectively. However, even the A800 now faces longer supply cycles and rising prices. Wang Fei noted, “Before the rise of large models, the A800 could be delivered in about two weeks. Now, I estimate the wait will be at least eight weeks.”
Guo Lijie, who works for a GPU distribution company, confirmed that the A800 now sells for 87,500 yuan, up from between 80,000 and 85,000 yuan in November 2022, and that the price is expected to keep rising. “I advise clients to contact me only when they are ready to buy, as our inventory is limited.” Another supplier has quoted the A800 at 89,500 yuan with a shorter wait of about two weeks, which suggests that faster delivery carries a premium.
The competition for high-end NVIDIA GPUs has evolved into a strategic "catch-up game." One AI startup leader remarked, “A key factor for an AI startup's success is the ability to mobilize at least 100 NVIDIA graphics cards.” Until August 2022, A100 cards were accessible through official channels, but that supply is now cut off, and customs is blocking purchases from abroad.
Even the A800 and H800 chips, which are ostensibly available in China, remain out of reach for many small AI firms that urgently need computing power. Wang Fei observed, “NVIDIA products typically go first to the general agent in China, and we distribute them to clients engaged in AIGC projects. Although the H800 has launched, its starting price is expected to be around 200,000 yuan, which is beyond the reach of most.” Currently, only major companies like Alibaba Cloud and Tencent Cloud, which can procure directly from NVIDIA, have access to H800s.
In this environment of scarcity, high-end chips like the H800 have already been reserved by big players. Tencent Cloud, for example, has announced a large-scale computing cluster based on the H800, claiming a "national first." Reports indicate that ByteDance has placed orders exceeding $1 billion for GPUs from NVIDIA this year, potentially nearing the total value of NVIDIA's commercial GPU sales in China last year.
Large companies often negotiate procurement directly with NVIDIA, and success depends increasingly on established business relationships. Amid soaring demand, rumors suggest that NVIDIA may be adopting a "Hermès-like" strategy, requiring additional product purchases to gain priority for limited GPU supplies.
In the race for AI development, smaller companies often find themselves on waiting lists as larger corporations acquire available cards. Regardless of who ultimately prevails, NVIDIA remains the primary supplier, poised to secure consistent orders throughout this evolution.
The Luck of Jensen Huang
In the business landscape, it's often said that when gold miners flock in droves, the pursuit of wealth can devolve into a game of chance. More commonly, those "selling shovels" to these miners reap the real rewards. A tech investor noted, “During the AI sector's downtrend, when the bubble burst, investors and startups quickly understood that the high costs of algorithmic innovation led to losses, while upstream players continued to thrive.”
True to that pattern, NVIDIA has quickly emerged as a frontrunner in the booming AI market. On an earnings call in May, founder and CEO Jensen Huang remarked that demand for new products since January has been "unbelievably steep," with orders flooding in.
On May 30th, NVIDIA became the first chip design company to exceed a market cap of $1 trillion. When asked whether this surge was a matter of luck or foresight, Huang candidly acknowledged the role of “luck” in the company’s ascent: “We believed something new would eventually unfold, but the rest was fortuitous.”
Before that, NVIDIA had faced disappointing revenues and a declining stock price, which makes its recent resurgence all the more astonishing. Without the upsurge in AIGC (AI-generated content), its story might have read like one of countless cautionary tales of entrepreneurial pitfalls.
Historically, NVIDIA's high-end GPUs primarily served the gaming and cryptocurrency sectors. The pandemic initially sparked a surge in the stock price on the back of a booming consumer electronics market, and shares peaked at over $300 in 2021. However, as the gaming market cooled and cryptocurrency volatility continued in the post-pandemic phase, NVIDIA's revenues declined through the first three quarters of fiscal year 2023, and the stock fell to $108 in August 2022, only about a third of its peak value.
Yet in late 2022, with the launch of ChatGPT, NVIDIA reversed its fortunes and became a globally sought-after name. By June 23, 2023, NVIDIA's stock had jumped to $422.90.
NVIDIA's rise has prompted discussion of its strategic foresight. Huang framed it differently: “It’s not about foresight; it’s about accelerated computing.” Indeed, NVIDIA has significantly shaped the history of training AI models with GPUs.
In 2010, NVIDIA’s chief scientist Bill Dally spoke with AI researcher Andrew Ng, who faced severe computing shortages for his work while at Google. To meet his computational needs, Ng would have needed to link 16,000 CPUs at a cost of about $27 million, an unmanageable expense.
Dally suggested using NVIDIA's GPUs instead. The switch enabled Ng’s team to accomplish its goals with just 48 GPUs linked together. Following this breakthrough, numerous research teams adopted NVIDIA’s chips for AI training, prompting Huang to make AI a key focus area in 2012.
Looking back, committing to an application area that still seemed nebulous a decade ago illustrates Huang’s strategic insight. One employee at an AI chip company noted, “Investing in deep learning in 2012 seemed unwise amidst a thriving internet database market.” NVIDIA’s extensive investments in deep learning, coupled with its development of supporting software such as programming-language design and developer resources, showed its commitment.
China's GPU Aspirations
Huang once summarized NVIDIA’s success by stating, “About 10 years ago, we realized that AI could change everything. We've transformed our company from the ground up.” Since 2016, NVIDIA's accelerated computing GPU series has focused on AI training tasks, beginning with the P100 chip, which features NVLink technology to improve the speed and efficiency of CPU-GPU communication.
This robust ecosystem has been crucial in establishing NVIDIA as a leader in the current AI wave and in attracting significant capital investment. However, because export restrictions limit access to the high-end A100 and H100 chips, NVIDIA’s dominance faces hurdles in its largest market, which accounts for 47% of total global sales.
Huang has expressed clear ambitions for the Chinese market. In an interview, he lamented the detrimental impact of U.S. export restrictions on NVIDIA's growth prospects, emphasizing the significant risk posed to the American tech sector by the escalating U.S.-China chip conflict. “Without the Chinese market, there is no contingency plan. There is only one China in the world.”
As NVIDIA grapples with export challenges, many Chinese customers are seeking alternatives. Wang Fei noted that some AIGC project clients are turning to the domestically manufactured DCU Z100L from Hygon (Haiguang). Another startup leader said that, with A800 chips unavailable, their team had resorted to buying NVIDIA’s consumer-grade GeForce RTX 4090 cards while also considering other options and cloud computing resources.
Chinese GPU manufacturers, such as Iluvatar CoreX (Tianshu Zhixin) and Moore Threads, are actively working to deploy products for AI training. On June 10, Iluvatar CoreX announced that its general-purpose GPU "Tiangai 100" could train large models with hundreds of billions of parameters, significantly optimizing the “Aquila” language model developed by the Beijing Academy of Artificial Intelligence.
In early June, rumors pointed to major asset management firms selling off significant NVIDIA holdings, hinting at potential vulnerabilities within the "Computing Empire." On June 6, Edmond de Rothschild Asset Management disclosed that it had sold part of its NVIDIA holdings, citing concerns over elevated AI valuations and mounting uncertainty.
In truth, Huang may be one of Silicon Valley's most crisis-aware CEOs, frequently stating, “I’ve always believed we are 30 days away from bankruptcy.” Recently, he voiced concerns that during NVIDIA’s "absence" from the market due to policy restrictions, Chinese GPU startups could swiftly gain ground.