At the 2023 REAL Technology Conference, Lin Yonghua, Deputy Director of the Beijing Academy of Artificial Intelligence, delivered a keynote titled "Building the 'Linux' of Large Model Technology for a Strong Foundation in AI Development Over the Next Decade." As generative artificial intelligence gains momentum, large models and their applications are entering a new phase of research and development.
Looking back at the evolution of the AI industry, Lin identified quality, meaning the ability to meet industrial and operational demands, as the key challenge in putting AI into practical use. She pointed out that large models present greater challenges than smaller ones: the training data, data proportions, input sequences, and hyperparameter configurations used during training are not fully disclosed, making it difficult to reproduce either the capabilities or the defects of a large model. Moreover, the substantial investment required for algorithm modification and retraining makes it costly to remedy shortcomings after the fact.
These factors make it clear that a foundation model strongly shapes the performance of every model built on top of it, so the organizations developing these base models must refine them continuously. Lin emphasized that while large models will undoubtedly steer AI toward promising developments over the next decade, the industry currently faces numerous challenges: the high cost of foundation models, difficulty obtaining training datasets, inconsistent evaluation methods, fragmented tooling, and increasingly scarce computational resources.
In her view, open-source initiatives allow practitioners to build on the work of their predecessors, and the Academy's recent moves reflect its commitment to using open source to tackle these industry challenges. Specifically, the Academy recently announced major upgrades to its Wudao·Tianying Aquila series of large language models, notably Aquila2-34B, a 34-billion-parameter model that has posted strong results on multiple benchmark leaderboards. The Academy has fully open-sourced the Aquila2 model series together with its training stack, including the FlagScale framework, the FlagAttention operator set, and the BGE semantic vector model.
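To give a concrete sense of what a semantic vector model like BGE does in practice, here is a minimal sketch of computing sentence embeddings with the sentence-transformers library. The model identifier "BAAI/bge-large-zh" and the API calls are illustrative assumptions for this sketch, not details taken from the keynote.

```python
# A minimal sketch of using a BGE-style embedding model for semantic vectors.
# The model identifier below is an assumption for illustration.
from sentence_transformers import SentenceTransformer

# Load a BGE embedding model from the Hugging Face Hub (assumed identifier).
model = SentenceTransformer("BAAI/bge-large-zh")

sentences = ["大模型正在走向多模态", "Large models are becoming multimodal"]

# Encode sentences into dense vectors; normalizing the embeddings lets
# cosine similarity be computed as a simple dot product.
embeddings = model.encode(sentences, normalize_embeddings=True)

# Cosine similarity between the two sentences.
similarity = embeddings[0] @ embeddings[1]
print(f"cosine similarity: {similarity:.4f}")
```

Embedding models of this kind are typically used for semantic search and retrieval: documents are encoded once, stored in a vector index, and queried by nearest-neighbor lookup against the query embedding.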
Lin also revealed that the Academy has released 200GB of low-risk data from WuDaoCorpora, reportedly the world's largest Chinese-language dataset, which has already been downloaded thousands of times. On evaluation, Lin noted that assessing large models' generative capabilities, cognitive abilities, and human-like reasoning remains difficult: apart from a handful of generative tasks, evaluation still relies largely on human scoring, the boundaries of cognitive assessment are hard to define, and testing human-like reasoning will require new, more complex test sets and methodologies.
In terms of computational resources, differences in architecture and development toolchains among domestic chip vendors, the multitude of AI frameworks, and the emergence of diverse applications make it hard to adapt software to heterogeneous chips, drive up development complexity, and hinder the standardization of evaluation metrics. To address these challenges, the Academy has launched FlagEval, an evaluation system for large models, and FlagPerf, an open-source project for AI chip assessment.
"Large models are now transitioning from language models to multimodal capabilities, marking a crucial stage in technological application." Lin stated, looking ahead to a future where large models will extend beyond internet applications and permeate various industries. "We aspire to see large models step out of the digital realm and into the physical world, impacting areas like autonomous driving and robotics."