Following the launch of Kuaishou's Kling video model, Luma responded with its latest video model, Dream Machine, and Runway made waves with Gen-3. Amid the rising FOMO (Fear of Missing Out), several more players are diving headfirst into this competitive field. Alibaba's Damo Academy is developing the "Xun Guang" video creation platform, ByteDance's Dream AI is exploring "generative film and drama," Meitu is focusing on AI short films, and Haiper AI is concentrating on creative expression.
On July 5th, the "2024 WAIC Video Generation Frontiers Technology Forum" took place in Shanghai. Co-hosted by the World Artificial Intelligence Conference and organizations such as Machine Heart and Donghao Lansheng, the forum brought together industry leaders and experts to discuss the latest advances in video generation technology and its applications.
The emergence of video generation technologies, particularly after ChatGPT's debut, has captivated the tech community. Despite being in its infancy, this field is rapidly advancing and expanding the boundaries of digital content creation. Key figures, including Chen Weihua from Alibaba, Professor Ni Bingbing from Shanghai Jiao Tong University, Chen Jianyi from Meitu, and Miao Yishu from Haiper AI, shared insights during the forum.
Chen Weihua highlighted the incredible potential of AI video generation showcased by Sora, particularly in high-definition and high-fidelity outputs. However, he noted that controlling the generation process remains challenging, with character consistency often requiring extensive manual editing. He stated, "Controlling video content is the primary demand in creation and the biggest challenge our algorithms face today." Damo Academy's latest AIGC product, the Xun Guang platform, aims to enhance video production efficiency and streamline post-production editing, allowing users precise control over video content while maintaining consistency across characters and scenes.
Professor Ni Bingbing discussed the challenges in vectorized media content generation. He emphasized that current algorithms still make structural and detail-level errors, such as missing or extra elements in generated content. He explained that better training data can improve content quality, but the high-dimensional nature of video remains a significant obstacle. He proposed a new approach that uses vector representation frameworks to ensure generated content obeys the physical rules of the world.
Chen Jianyi provided a product manager's perspective on the application scenarios and challenges of AI video generation. He observed two interesting phenomena from user research: first, while insiders marvel at AI-generated videos, most everyday users focus on whether the content is engaging rather than its generation method. "This means that regardless of the visual capabilities of AI video generation, we must prioritize the inherent values and narratives conveyed by the content," he noted. Second, many ordinary users are unfamiliar with technical terms like "text-to-image" and "text-to-video." He emphasized the need to make these processes relatable and intuitive for better understanding.
Meitu's MOKI AI short film platform addresses these challenges by giving creators a single workflow that runs from scriptwriting and visual design through AI-assisted material generation to a coherent finished short film.
Miao Yishu from Haiper AI discussed the broader implications of video generation technology. He argued that while language learning is vital for knowledge acquisition, it is not the only pathway to achieving Artificial General Intelligence (AGI). He suggested that video generation models can also offer valuable learning experiences through multi-modal integration. As these models evolve, they will increasingly possess perceptual capabilities akin to human understanding of the world.
The forum showcased not just technological advancements but also the shifting landscape of video creation and its critical role in the future of digital content. It reinforced the idea that video generation technology transcends mere content creation, marking a significant step toward the development of AGI and enriching the creative process in innovative ways.