From Selling Shovels to Building Gold Mines: Is AI Infra the Biggest Beneficiary Behind Large Model Applications?

A well-known saying in the industry goes, "When everyone rushes to mine for gold, those selling shovels make the most money." During the mid-19th-century gold rush, most miners went bust, while the merchants selling shovels and jeans reaped steady profits. AI Infrastructure (AI Infra) plays the shovel seller's role in today's era of AI-generated content (AIGC). The large-model gold rush has only just begun and the AI "miners" have yet to turn a profit, yet NVIDIA, the leading hardware provider, has already reaped enormous rewards: its market capitalization has topped $3 trillion, surpassing Apple to become the world's second-largest company after Microsoft.

AI Infra refers to the foundational infrastructure that connects computing power to applications in the large-model ecosystem. It encompasses hardware, software, toolchains, and optimization methods, providing an integrated solution analogous to the Platform as a Service (PaaS) layer in cloud computing. Below it, computing power, algorithms, and data constitute the Infrastructure as a Service (IaaS) layer; above it, the various open-source and closed-source models play the role that Software as a Service (SaaS) plays in cloud computing, recast in the large-model era as Model as a Service (MaaS). As large-model applications accelerate, the potential value of AI Infra grows with them.

According to predictions from CICC Research, the AI Infra industry is in an early stage of rapid growth, with its segmented markets projected to grow more than 30% annually over the next three to five years. Capital is following suit: SiliconFlow, a startup focused on inference frameworks, recently closed a new angel-round financing of nearly ¥100 million. Counting the angel round it completed earlier this year, the company has raised two rounds within just six months.

Lepton AI, founded last year by former Alibaba vice president Jia Yangqing, has likewise completed an angel round, with investment from Fusion Fund and CRV. As large models enter a phase of widespread application, the infrastructure for training, deploying, and using them becomes indispensable, making AI Infra the backbone of this booming market.

Large Models: A Booming Market for AI Infrastructure

The consensus within the industry is that AI applications are becoming more valuable than the models themselves. Baidu founder Robin Li (Li Yanhong) believes that countless applications will emerge on top of foundational models, and that their transformation of existing industries may create far more value than disruptive innovation alone. The supply of AI applications is expanding rapidly: IDC forecasts that more than 500 million new applications will be created by 2024, equivalent to the total built over the past 40 years.

Recent market shifts underscore the trend, with an influx of video generation models such as Kuaishou's Kling, ByteDance's Jimeng, and SenseTime's Vimi, alongside a wave of AI search products and AI companionship tools. Large models are clearly headed for mass adoption: according to InfoQ, the global market for AGI applications is expected to reach ¥454.36 billion by 2030, drawing participants from many sectors. Underpinning all of this growth is AI Infra.

From a development perspective, building a large-model application involves data preparation, model training and optimization, deployment, and ongoing monitoring, and each step leans heavily on AI Infra for computing power and tooling. If developing an AI application is like constructing a building, AI Infra is the supplier of construction materials and equipment.
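
To make that pipeline concrete, here is a minimal sketch in Python. Every function is a stand-in of my own naming, not any vendor's API; in practice each stage is delegated to a layer of AI Infra (data tooling, GPU training clusters, inference serving, observability), but the shape of the flow is the same.

```python
def prepare_data(raw_corpus: list[str]) -> list[str]:
    # Data preparation: clean and deduplicate the raw text.
    return sorted({doc.strip().lower() for doc in raw_corpus})

def train_and_optimize(dataset: list[str]) -> dict:
    # Training/fine-tuning plus optimization (quantization, distillation)
    # would run on a GPU cluster; here we only record metadata.
    return {"base": "open-source-model", "docs": len(dataset), "quantized": True}

def deploy(model: dict) -> str:
    # Deployment: in reality this provisions an inference endpoint.
    return f"https://inference.example.com/models/{model['base']}"

def monitor(endpoint: str) -> None:
    # Ongoing monitoring: latency, cost per token, output quality.
    print(f"monitoring {endpoint}: latency, cost/token, quality")

if __name__ == "__main__":
    data = prepare_data(["Hello world ", "hello world", "AI Infra"])
    endpoint = deploy(train_and_optimize(data))
    monitor(endpoint)
```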

The value of AI Infra lies in an integrated platform that connects the underlying compute layer with the application layer, enabling rapid deployment and improving cost-effectiveness and efficiency without sacrificing model performance. The larger the market for AI applications grows, the more opportunity there is for AI Infra providers.

In the age of large models, optimizing training and inference for efficiency, cost, and performance is increasingly vital. Inference in particular offers a larger market opportunity than training, which is dominated by tech giants such as Google and Microsoft, each of which has built out its own complete AI infrastructure. Specialized AI Infra providers face tough competition from those giants in training, but demand for inference is universal, coming from model companies and from every industry seeking transformation.

For instance, training a large model is a one-time job over a vast corpus, on the order of tens of trillions of tokens, whereas inference demand compounds continuously: OpenAI alone generates billions of tokens daily, and cumulative serving volume quickly exceeds that of the training phase. According to MarketsandMarkets, the global market for large model training and inference is set to rise from $12.5 billion in 2023 to $56.3 billion by 2028, highlighting the immense potential of AI Infra as a business.
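
A back-of-the-envelope calculation shows why cumulative inference volume dwarfs the one-time training corpus. Both figures below are illustrative order-of-magnitude assumptions, anchored only loosely to the numbers cited above:

```python
# Back-of-the-envelope comparison of training vs. inference token volume.
# Both figures are illustrative order-of-magnitude assumptions.

TRAINING_TOKENS = 15e12          # one-time corpus: ~15 trillion tokens
DAILY_INFERENCE_TOKENS = 100e9   # assumed serving load per day

days_to_match = TRAINING_TOKENS / DAILY_INFERENCE_TOKENS
yearly_ratio = DAILY_INFERENCE_TOKENS * 365 / TRAINING_TOKENS

print(f"Serving matches the whole training corpus in {days_to_match:.0f} days")
print(f"One year of serving is {yearly_ratio:.1f}x the training corpus")
```

Under these assumptions, serving traffic overtakes the entire training corpus in about five months, and keeps growing every year thereafter.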

Reducing Deployment Costs by 10,000 Times

Speaking at the 2024 Rare Earth Developers Conference, Yuan Jinhui posed a challenge: "How can we reduce the deployment costs of large models by 10,000 times?" AI Infra exists to supply the infrastructure for large-model training, deployment, and application while addressing three core demands: speed, cost-effectiveness, and quality, all without compromising model performance. The challenge for AI Infra players is navigating a landscape in which cost, efficiency, and performance must be balanced at once.
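
Where do orders-of-magnitude savings come from? One lever is simply representing model weights with fewer bits. The sketch below works through the memory arithmetic for a hypothetical 70-billion-parameter model; the model size and precision choices are illustrative, not tied to any specific product:

```python
# Rough memory footprint of the weights of a 70B-parameter model at
# different numeric precisions. Quantization is only one cost lever;
# batching, caching, and better kernels compound on top of it.

PARAMS = 70e9  # illustrative model size, not any specific product

BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}

for precision, nbytes in BYTES_PER_PARAM.items():
    gib = PARAMS * nbytes / 2**30
    print(f"{precision}: ~{gib:,.0f} GiB of weights")
```

Quantization alone buys roughly 8x between fp32 and int4; multiplied by gains from batching, caching, compilation, and cheaper hardware, the compounded savings are how the industry approaches targets like Yuan's 10,000x.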

While the model and application layers may already be crowded markets, AI Infra remains a blue ocean of opportunity. Few companies in China focus squarely on AI Infra; notable players include SiliconFlow and Infinigence AI (Wuwen Xinqiong). Overseas, the field includes NVIDIA, Amazon, and Lepton AI.

SiliconFlow's founder, Yuan Jinhui, is a serial AI entrepreneur: before founding SiliconFlow to focus on inference frameworks, he created the deep learning framework OneFlow. The company has since launched the SiliconCloud service platform, offering free access to multiple open-source models, alongside tools such as the OneDiff acceleration library.
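
SiliconCloud exposes hosted open-source models behind a standard chat API. As a hedged sketch, the snippet below assumes an OpenAI-compatible endpoint; the base URL and model name are placeholders to verify against the provider's current documentation.

```python
# Hedged sketch: querying a hosted open-source model through an
# OpenAI-compatible endpoint. The base URL and model name below are
# assumptions to be checked against the provider's current docs.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.siliconflow.cn/v1",  # assumed SiliconCloud endpoint
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="Qwen/Qwen2-7B-Instruct",  # placeholder open-source model id
    messages=[{"role": "user", "content": "In one sentence, what is AI Infra?"}],
)
print(response.choices[0].message.content)
```

Compatibility with the de facto OpenAI interface is itself part of the value proposition: applications can switch between hosted models without rewriting their client code.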

Infinigence AI, established just three months before SiliconFlow, focuses on integrated software-hardware solutions spanning algorithms, chips, and applications. This year the company launched its Infini-AI platform, which pools heterogeneous computing resources, breaking the bottleneck of dependence on any single chip and sharply reducing deployment costs.
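
The core idea behind pooling heterogeneous accelerators can be shown with a toy scheduler. The sketch below is a conceptual illustration of least-loaded routing across mixed hardware, not a description of Infini-AI's actual design; all names and capacities are invented:

```python
# Toy illustration of heterogeneous-compute scheduling: each request is
# routed to whichever accelerator pool has the most spare capacity, so
# no single chip family becomes the bottleneck. Conceptual sketch only.
from dataclasses import dataclass

@dataclass
class Pool:
    name: str        # e.g. a GPU family or a domestic accelerator
    capacity: int    # concurrent requests the pool can absorb
    in_flight: int = 0

    def headroom(self) -> int:
        return self.capacity - self.in_flight

def route(pools: list[Pool]) -> Pool:
    # Least-loaded routing: pick the pool with the most free capacity.
    best = max(pools, key=Pool.headroom)
    best.in_flight += 1
    return best

pools = [Pool("vendor-A-gpu", 8), Pool("vendor-B-npu", 4), Pool("vendor-C-gpu", 6)]
for i in range(10):
    print(f"request {i} -> {route(pools).name}")
```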

As the AI Infra market grows, technology giants such as Alibaba, Tencent, and Baidu are leveraging their capital and technical depth to stake out significant positions. Alibaba Cloud, for instance, offers an array of AI Infra products that span the entire development process, from underlying infrastructure to model deployment, strengthening its support for large-model training and inference.

Despite their different strategies, startups such as SiliconFlow and Infinigence AI and the larger incumbents share a common goal: to lower deployment costs and speed the adoption of large-model applications. By pooling resources efficiently and providing robust tooling, they drive the broader evolution of AI technology.

Challenges Ahead for AI Infrastructure Ventures

Although AI Infra holds immense potential amid the booming large-model application market, companies in this space remain vulnerable even when they excel in their specialties. The landscape is defined by entrenched competitors, above all NVIDIA, whose CUDA ecosystem has been built up for nearly two decades. CUDA gives developers a single, stable programming interface across NVIDIA's hardware, and the mainstream frameworks are built on top of it, so developers naturally gravitate toward it for model development. As a result, the CUDA ecosystem commands over 90% of the AI computing market.

However, as AI model standardization continues and the structural differences between models decrease, the reliance on NVIDIA's ecosystem is weakening. Despite this shift, NVIDIA is projected to maintain a leading position in the AI hardware market for the next 3 to 5 years, capturing a market share of no less than 80%.

For companies focused on AI Infra, competing against such giants is a considerable challenge. Because enterprise customers in China have traditionally been reluctant to pay for standalone software, selling software or hardware alone rarely leads to commercial success; many firms therefore bundle hardware and software as their main monetization strategy. As Xia Lixue, founder of Infinigence AI, puts it, the firm positions itself as a resource provider that packages the required tools with the compute, so customers can trust the value they are paying for.

Overseas markets offer another significant growth route. Statistics suggest that demand for generative AI and large models abroad can be dozens to hundreds of times greater than domestic demand. That global perspective matters for AI Infra companies, since international B2B software services often face fewer barriers to entry. Following its new financing, SiliconFlow is exploring overseas partnerships, and Jia Yangqing built Lepton AI around serving foreign enterprises from the start.

As AI models standardize and application scenarios expand, comprehensive, efficient one-stop deployment solutions will be crucial. These solutions not only address computational shortages and optimize data handling but also allow organizations to concentrate on application-level challenges while reducing model development costs. However, the initial investment and maintenance costs associated with AI Infra solutions still pose challenges for startups.

Ultimately, AI Infra companies must remain agile in adapting to evolving demands and enhancing scalability. The future may favor those who can deliver personalized, one-stop deployment solutions tailored for diverse application scenarios. As Xia Lixue aptly expresses, "When we turn on the tap, we don’t need to know which river the water comes from. Similarly, when using various AI applications in the future, we'll be oblivious to the foundational models and powerful accelerators at work. That will exemplify the ideal AI-native infrastructure."
