Embodied Intelligence: The Future of AI Investment
Embodied intelligence has emerged as a key focus of artificial intelligence (AI) investment this year. At the recent Zhiyuan Conference, Wang Zhongyuan, director of the Beijing Zhiyuan Artificial Intelligence Research Institute, urged stakeholders to take an objective view of the current investment surge in embodied intelligence and humanoid robotics. While enthusiasm is at an all-time high, Wang cautioned that the humanoid robotics sector could face a downturn in the coming years. He emphasized the importance of maintaining confidence and patience in research until the technology breaks through its current limitations and ushers in a genuine industry boom. Without that persistence, he warned, China may again find itself asking why it has not produced an organization like OpenAI.
The Early Stages of Embodied Intelligence Development
Wang identified embodied intelligence as a crucial direction for the future of AI and a strategic focus for resource allocation. Established in late 2018, the Beijing Zhiyuan Artificial Intelligence Research Institute is a pioneer of large AI model research in China. Its flagship event, the Zhiyuan Conference, has grown into a significant industry gathering, and this year several advances in embodied intelligence were presented there. These include the SAGE model for manipulating hinged (articulated) objects, the Open6DOR model for six-degree-of-freedom task control, the NaVid model for end-to-end multimodal navigation, and Cradle, a general computer control framework.
He predicts that agents could become a transformative application of large AI models, acting as personal assistants. Wang stated, "Once agents achieve a sufficient level of intelligence and usability, it’s as if everyone has their own personal assistant, driving societal progress and industrial transformation. Agents can operate within mobile devices or PCs and can also integrate into robots, contributing to embodied intelligence."
Challenges and Collaborative Efforts in Humanoid Robotics
Wang acknowledged that embodied intelligence is still in its infancy. One notable challenge is that hardware innovation lags behind AI models: models see significant updates monthly, while hardware typically improves on an annual cycle. Moreover, many critical questions remain unresolved, including how to integrate "brain" and "cerebellum" models (broadly, high-level reasoning and low-level motion control) and which application scenarios suit embodied intelligence. The lack of a comprehensive dataset, comparable to what ImageNet provided for earlier AI breakthroughs, is another major hurdle.
Looking ahead, Wang outlined the institute's plans to advance embodied intelligence, stating, "We aim to leverage our expertise in large models, particularly multimodal models, to enhance the capabilities of embodied intelligence." He described embodied intelligence as the intersection of AI technology, especially large models, with physical devices in a variety of forms, such as robotic arms, quadrupeds, hexapods, wheeled robots, and humanoid robots.
Since last year, collaborative innovation has gained traction within China's humanoid robotics sector. Cities and provinces including Beijing, Shanghai, Zhejiang, Guangdong, and Chengdu have established humanoid robotics innovation centers. Wang advocates for this collaborative approach, noting that the complexity of the field requires cooperation among hardware manufacturers, suppliers, research institutions, and application developers. Key areas for collaboration include data collection, model training, deeper application development, and hardware cost reduction.
The Beijing Zhiyuan Institute aims to build a collaborative platform for innovation in embodied intelligence. Wang announced plans to team up with leading universities such as Tsinghua and Peking University, along with humanoid robotics firms, to jointly address challenges in data, models, and applications.
He anticipates that integrating humanoid robots into industrial and domestic environments will likely require another 3 to 5 years of development. Wang asserted, "The Zhiyuan Institute is committed to advancing research with patience and persistence until the technology evolves and sparks a true industry breakthrough. Otherwise, as humanoid robots approach their 'ChatGPT moment,' we may again face the question: 'Why hasn't China produced an OpenAI?'"