Stability AI Enters a New Generation of AI with Stable Video 4D Technology

Stability AI is enhancing its lineup of generative AI models with the introduction of Stable Video 4D, a significant advancement in video generation technology.

While numerous generative AI tools for video creation exist, such as OpenAI's Sora, Runway, Haiper, and Luma AI, Stable Video 4D distinguishes itself by building upon Stability AI's existing Stable Video Diffusion model, which transforms images into videos. The new model not only accepts video as input but also generates novel-view videos of the subject from eight different perspectives.

Varun Jampani, team lead for 3D Research at Stability AI, shared, “We envision Stable Video 4D being utilized in movie production, gaming, AR/VR, and various applications where there’s a need to view dynamically moving 3D objects from different camera angles.”

Advancing Beyond 2D: From Stable Video 3D to Stable Video 4D

Stable Video 4D represents a leap beyond Stability AI’s previous offering, Stable Video 3D, introduced in March. This earlier model allowed users to create short 3D videos from image or text prompts. In contrast, Stable Video 4D incorporates an additional dimension: time.

Jampani clarified that the four dimensions consist of width (x), height (y), depth (z), and time (t), enabling Stable Video 4D to render a moving 3D object from various angles and within different timeframes.
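As a rough mental model, the output of such a system can be pictured as a grid of images indexed by camera view and time. The sketch below is illustrative only; the frame count and resolution are assumptions for the example, not published specifications.

```python
# Illustrative only: a hypothetical output grid for a model like Stable Video 4D,
# organized as (views, frames, height, width, channels). The sizes below are
# assumptions for the sketch, not the model's actual output dimensions.
import numpy as np

num_views = 8     # novel camera angles, per the announcement
num_frames = 5    # time steps; the frame count here is an assumption
height, width, channels = 576, 576, 3  # resolution is an assumption

# Each (view, frame) cell holds one rendered image of the moving object:
# moving along axis 0 changes the camera angle, along axis 1 advances time.
output_grid = np.zeros((num_views, num_frames, height, width, channels), dtype=np.uint8)

print(output_grid.shape)  # (8, 5, 576, 576, 3)
```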

“Our innovation stems from merging the capabilities of our Stable Video Diffusion and Stable Video 3D models, fine-tuned with a meticulously curated dynamic 3D object dataset,” Jampani explained.

Unlike existing approaches that rely on separate models for these tasks, Stable Video 4D performs novel-view synthesis and video frame generation within a single network. It also employs enhanced attention mechanisms that let each video frame attend to neighboring frames across different camera angles and timestamps, resulting in improved 3D coherence and temporal smoothness.
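To make the idea of attending across both views and time concrete, here is a hypothetical toy layer. It is not Stability AI's implementation; the tensor layout, feature size, and head count are assumptions chosen for readability.

```python
# Minimal sketch of attention applied across views and across time, in the spirit
# of the mechanism described above. NOT Stability AI's implementation; shapes and
# hyperparameters are assumptions for illustration.
import torch
import torch.nn as nn

class ViewTimeAttention(nn.Module):
    def __init__(self, dim: int = 64, heads: int = 4):
        super().__init__()
        self.view_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.time_attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, views, frames, dim) -- one feature vector per (view, frame) cell
        b, v, t, d = x.shape

        # Attend across views: each time step looks at the same moment from all angles.
        xv = x.permute(0, 2, 1, 3).reshape(b * t, v, d)
        xv, _ = self.view_attn(xv, xv, xv)
        x = xv.reshape(b, t, v, d).permute(0, 2, 1, 3) + x  # residual connection

        # Attend across time: each view looks at its own past and future frames.
        xt = x.reshape(b * v, t, d)
        xt, _ = self.time_attn(xt, xt, xt)
        return xt.reshape(b, v, t, d) + x  # residual connection

features = torch.randn(1, 8, 5, 64)         # 8 views, 5 frames, 64-dim features
print(ViewTimeAttention()(features).shape)  # torch.Size([1, 8, 5, 64])
```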

Differentiating Generative AI Techniques

While generative AI for 2D images often relies on infill and outfill techniques to complete partial images, Stable Video 4D works differently: rather than filling in missing regions of partial input data, it synthesizes all eight novel-view videos in full, using the input video as a guide.

“Stable Video 4D synthesizes these videos from scratch without transferring explicit pixel data from input to output. Instead, the network relies on implicit information flow,” Jampani stated.

Stable Video 4D is currently accessible for research evaluation on Hugging Face, with commercial offerings yet to be announced.
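For readers who want to inspect the research release, the weights can typically be fetched with the huggingface_hub client, as in the hedged sketch below. The repository id and checkpoint filename are assumptions and should be verified against the model card; access may also require accepting the license on the Hugging Face page first.

```python
# Hypothetical sketch of downloading the research checkpoint from Hugging Face.
# The repo id and filename are assumptions and may differ; check the model card.
from huggingface_hub import hf_hub_download

checkpoint_path = hf_hub_download(
    repo_id="stabilityai/sv4d",    # assumed repository id
    filename="sv4d.safetensors",   # assumed checkpoint filename
)
print(checkpoint_path)  # local path to the downloaded checkpoint
```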

According to Jampani, “Stable Video 4D can handle single-object videos several seconds long against plain backgrounds. We aim to extend its capabilities to longer videos and more complex scenes.”
