When OpenAI introduced its GPT-3 model in May 2020, it was hailed as a groundbreaking achievement in artificial intelligence, capable of producing text often indistinguishable from human writing. The landscape has shifted dramatically since then. Researchers at the Beijing Academy of Artificial Intelligence (BAAI) recently unveiled their own generative deep learning model, Wu Dao, which surpasses GPT-3 in sheer scale and in the breadth of tasks it can handle.
Wu Dao is a monumental advance, with 1.75 trillion parameters, ten times as many as GPT-3's 175 billion and 150 billion more than Google's Switch Transformer. This second iteration arrived just three months after the initial version's release, a pace made possible by FastMoE, an open-source Mixture-of-Experts training system. FastMoE runs on PyTorch and can train on both supercomputer clusters and ordinary GPUs, giving it more flexibility than Google's approach, which is tied to proprietary TPU hardware.
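Parameter counts in the trillions become feasible because a Mixture-of-Experts model activates only a small subset of its weights for any given token. The sketch below is a plain-PyTorch illustration of that routing idea, not FastMoE's actual API; the class name `TinyMoE` and all dimensions are hypothetical choices for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyMoE(nn.Module):
    """Minimal mixture-of-experts feed-forward layer with top-1 gating.
    Illustrative sketch only; FastMoE's real implementation differs."""
    def __init__(self, d_model: int, d_hidden: int, num_experts: int):
        super().__init__()
        # One small feed-forward network per expert.
        self.experts = nn.ModuleList([
            nn.Sequential(
                nn.Linear(d_model, d_hidden),
                nn.GELU(),
                nn.Linear(d_hidden, d_model),
            )
            for _ in range(num_experts)
        ])
        # The gate scores each token and decides which expert handles it.
        self.gate = nn.Linear(d_model, num_experts)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model) flattened into individual tokens.
        tokens = x.reshape(-1, x.shape[-1])
        gate_probs = F.softmax(self.gate(tokens), dim=-1)
        top_prob, top_idx = gate_probs.max(dim=-1)  # top-1 routing
        out = torch.zeros_like(tokens)
        for i, expert in enumerate(self.experts):
            mask = top_idx == i
            if mask.any():
                # Only tokens routed to expert i pass through it, so the
                # compute per token stays roughly constant as experts grow.
                out[mask] = top_prob[mask].unsqueeze(-1) * expert(tokens[mask])
        return out.reshape_as(x)

layer = TinyMoE(d_model=64, d_hidden=256, num_experts=8)
y = layer(torch.randn(2, 10, 64))
print(y.shape)  # torch.Size([2, 10, 64])
```

Because each token touches only one expert, adding experts grows the total parameter count without a proportional increase in per-token compute, which is the trade-off that makes trillion-parameter models practical.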
That scale translates into an impressive range of capabilities. Unlike traditional deep learning models that typically focus on a single task, such as text generation or image recognition, Wu Dao is multi-modal. BAAI researchers demonstrated its proficiency in natural language processing, text generation, image recognition, and photorealistic image generation. The model can compose essays, poems, and couplets in traditional Chinese, generate descriptive alt text for images, and create striking visuals from natural language prompts. It can also power virtual idols and predict the 3D structures of proteins, much as AlphaFold does.
Dr. Zhang Hongjiang, chairman of BAAI, emphasized the importance of large models and significant computing power in the pursuit of artificial general intelligence. He stated, “What we are building is a power plant for the future of AI. With mega data, mega computing power, and mega models, we can transform data to fuel the AI applications of the future.”