The emergence of large-scale AI models has reshaped the competitive landscape in technology, with over 20 Chinese companies entering this sector since March. Notable launches include Baidu's "Wenxin Yiyan," Alibaba's "Tongyi Qianwen," and Tencent's "Hunyuan." These tech giants are eager to maintain their dominance in this large-model race.
As the race matures, the focus is shifting from general-purpose models to specialized applications. Companies face challenges such as the need for substantial computing power and large datasets, as well as the high cost of acquiring talent. Yet the demand for customization and diverse applications is fueling the development of vertical models across China. Many small and medium-sized enterprises in healthcare, finance, education, and the arts are leveraging user data to build tailored models on top of established large models.
If general-purpose models represent the initial stage of development, then specialized applications are the "mid-game," where practical needs drive the creation of vertical models that deliver real-world value across industries. This shift has become apparent in recent months. At a technology exchange meeting on Baidu's Wenxin model, general manager Xin Zhou unveiled the "Wenxin Qianfan" platform, designed to support enterprise-level model production. The platform provides tools for model development, allowing businesses to build proprietary models using a range of existing frameworks.
Experts have identified three main categories in the Chinese large-model landscape: general-purpose models analogous to GPT technology, companies focused on developing vertical models using open-source frameworks, and application-driven companies that rely on existing models without in-house development capabilities.
Li Changliang, an AI entrepreneur and former Vice President at Kingsoft Software, noted, “Initially, there was a rush towards general-purpose models, but now a clear distinction is emerging.” He emphasized that creating a commercially viable general model necessitates extensive training, R&D capabilities, practical experience, AI safety measures, and an open ecosystem.
Zhu Yong from Baidu Smart Cloud indicated that only a few companies are capable of developing foundational models, but many specialized models will be built on these foundations. Training a foundational model is resource-intensive, often requiring thousands of GPUs. In contrast, developing a domain-specific model is far cheaper, making it an attractive option for companies that hold valuable scenario-specific data.
Vertical models are designed to meet specific industry needs by leveraging significant data assets. For example, Bloomberg developed "BloombergGPT," a large language model for finance trained on its extensive financial datasets.
The push toward vertical models is driven by their lower cost and by the need for specialized industry expertise, which makes ownership of proprietary data a crucial competitive advantage. Stanford's Alpaca, fine-tuned from the LLaMA-7B base model, illustrates how strong results in specific applications can be achieved with relatively little compute.
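To make the Alpaca example concrete, the sketch below shows the general pattern of building a vertical model by fine-tuning an open-source base model on a small, domain-specific instruction dataset with low-rank adapters. The base checkpoint name, data file, prompt format, and hyperparameters are illustrative assumptions, not details taken from the article or the original Alpaca release.

```python
# Minimal sketch: Alpaca-style supervised fine-tuning of an open-source base
# model on a small domain-specific instruction dataset, using LoRA adapters.
# Model name, data file, and hyperparameters are illustrative placeholders.

from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling,
                          Trainer, TrainingArguments)

BASE_MODEL = "huggyllama/llama-7b"       # hypothetical base checkpoint
DATA_FILE = "domain_instructions.json"   # hypothetical instruction data

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(BASE_MODEL)

# Low-rank adapters keep the trainable parameter count small, which is why
# this kind of vertical fine-tuning is cheap relative to pre-training a
# foundation model from scratch.
lora = LoraConfig(r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"],
                  lora_dropout=0.05, task_type="CAUSAL_LM")
model = get_peft_model(model, lora)

def format_example(example):
    # Alpaca-style prompt: an instruction followed by the expected response.
    prompt = (f"### Instruction:\n{example['instruction']}\n\n"
              f"### Response:\n{example['output']}")
    return tokenizer(prompt, truncation=True, max_length=512)

dataset = load_dataset("json", data_files=DATA_FILE)["train"].map(format_example)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="vertical-model", num_train_epochs=3,
                           per_device_train_batch_size=4, learning_rate=2e-4),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
model.save_pretrained("vertical-model-lora")
```

Because only the adapter weights are updated, a run like this can fit on a handful of GPUs rather than the thousands needed to pre-train a foundation model.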
Companies with abundant industry data and sector expertise are better positioned to develop successful vertical models. This trend embodies a "dual approach" strategy: companies integrate general-purpose large model APIs for broad capabilities while refining their own models on open-source architectures for domain tasks. The two reinforce each other, creating a feedback loop that improves capabilities and spurs new solutions.
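One simple form this dual approach can take is routing: in-domain requests go to an in-house fine-tuned model, and everything else goes to a hosted general-purpose API. In the sketch below, the local model path, endpoint URL, keyword-based routing rule, and response format are all hypothetical, chosen only to make the idea concrete.

```python
# Minimal sketch of a "dual approach" router: a local vertical model handles
# in-domain queries, a hosted general-purpose API handles the rest.
# All names, URLs, and the routing rule are illustrative assumptions.

import requests
from transformers import pipeline

LOCAL_MODEL = "my-org/vertical-finance-model"          # hypothetical fine-tuned model
GENERAL_API_URL = "https://api.example.com/v1/chat"    # hypothetical hosted API
DOMAIN_KEYWORDS = {"underwriting", "premium", "credit risk", "settlement"}

local_model = pipeline("text-generation", model=LOCAL_MODEL)

def answer(query: str) -> str:
    """Route domain-specific queries to the in-house model, the rest to the API."""
    if any(keyword in query.lower() for keyword in DOMAIN_KEYWORDS):
        return local_model(query, max_new_tokens=128)[0]["generated_text"]
    resp = requests.post(GENERAL_API_URL, json={"prompt": query}, timeout=30)
    resp.raise_for_status()
    return resp.json()["text"]   # assumed response schema
```

In practice the routing logic would be more sophisticated than keyword matching, but the division of labor is the point: the general-purpose API supplies breadth, while the fine-tuned vertical model supplies depth on the company's own data.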
The investment community is also turning its attention to vertical models. At the 2023 China Investment Annual Conference, Wang Wei of Silicon Valley Paradise underscored the importance of domain-specific data and expertise.
In conclusion, as the landscape of large models continues to evolve, specialized models are emerging as a powerful tool across industries, offering the potential for transformative impacts.