OpenAI continues to showcase clips of Sora, its advanced photorealistic generative AI video model, which remains strictly internal for now. In the rapidly evolving video AI landscape, competitors like Pika are capitalizing on this moment.
Recently, Pika introduced a new feature that lets users automatically generate sound effects for their AI-generated videos on its platform, pika.art. This adds a crucial dimension to AI videos, which are typically silent and previously required users to source sound files and add them in separate editing software. Now, Pika users can generate sound directly within the app, streamlining the creative process.
This update follows the launch of Pika’s lip-syncing capabilities, further enhancing AI-generated content for individual creators and enterprises. With features like lip synchronization, sound effects, voiceovers, and visuals all integrated, Pika positions itself as an all-in-one generative AI video creation platform. This allows users to create entire projects without needing separate cinematographers or sound designers.
These capabilities could appeal to filmmakers by removing the need to source assets from multiple places. Instead of searching through stock databases, users can simply describe their vision and quickly generate every element they need directly from their imagination.
Currently, Pika's new sound effects feature is available only to members of its super-collaborators program or users with a $58/month Pro subscription. However, plans are underway to expand availability beyond beta testing.
How Will Pika’s AI Videos Incorporate Sound Effects?
Pika has confirmed that users can obtain sound effects in two main ways.
1. Contextual Generation: The AI model automatically selects audio that best complements the generated video based on the text prompt. Users simply activate the “sound effects” toggle when entering their prompt, and Pika’s model delivers a complete audiovisual output in seconds.
2. Follow-Up Approach: Users can add AI-generated sounds post-creation. After generating or uploading an audio-less clip, they can click ‘Edit’ and select ‘Sound Effects’ to describe the desired sounds. The model will then generate multiple options for users to choose from.
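Pika exposes both workflows through its web UI and has not published a public API for them. Purely as an illustrative sketch, the Python below shows how the two paths might map onto hypothetical endpoints; the base URL, endpoint paths, parameters (`prompt`, `sound_effects`, `description`), and response fields are all assumptions for illustration, not Pika’s actual interface.

```python
# Hypothetical sketch only: Pika has not published this API.
# All endpoint paths, parameters, and response fields below are illustrative assumptions.
import requests

BASE_URL = "https://api.example-pika.invalid/v1"  # placeholder, not a real endpoint
HEADERS = {"Authorization": "Bearer YOUR_API_KEY"}


def generate_with_sound(prompt: str) -> dict:
    """Workflow 1 (contextual generation): request video and audio in one pass
    by enabling a hypothetical 'sound_effects' toggle alongside the text prompt."""
    resp = requests.post(
        f"{BASE_URL}/videos",
        headers=HEADERS,
        json={"prompt": prompt, "sound_effects": True},
        timeout=60,
    )
    resp.raise_for_status()
    return resp.json()  # assumed to contain an id and a URL for the finished audiovisual clip


def add_sound_afterwards(video_id: str, sound_prompt: str) -> list[dict]:
    """Workflow 2 (follow-up approach): describe the desired sounds for an
    existing silent clip and receive several candidate audio tracks to choose from."""
    resp = requests.post(
        f"{BASE_URL}/videos/{video_id}/sound-effects",
        headers=HEADERS,
        json={"description": sound_prompt},
        timeout=60,
    )
    resp.raise_for_status()
    return resp.json()["options"]  # assumed list of candidate sound tracks


if __name__ == "__main__":
    clip = generate_with_sound("A campfire crackling on a windy beach at night")
    candidates = add_sound_afterwards(
        video_id=clip.get("id", "demo-id"),
        sound_prompt="gentle waves and crackling fire",
    )
    print(f"Generated clip {clip.get('id')} with {len(candidates)} sound options")
```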
The introduction of generated audio is set to significantly streamline the creative process, removing the inefficiency of sourcing audio from external libraries. Pika claims to be the first AI video platform to embed generated audio as part of the video output.
Other companies are also exploring sound generation: ElevenLabs recently announced early signups for a text-to-sound AI feature, and Meta offers a similar technology called AudioGen. However, neither pairs it with an integrated generative AI video model the way Pika does.
Gradual Rollout Expected
The new sound effects feature will be rolled out gradually, starting with those in the super-collaborators program or Pika's Pro subscribers. Feedback from these early users will help refine the feature, allowing for future enhancements accessible to all platform users.
Since its launch in December 2023, Pika has aggressively strengthened its offerings to compete with players like OpenAI’s forthcoming Sora. Recent collaborations, such as the lip-sync functionality developed with ElevenLabs, empower users to add AI voices and synchronized animations to their videos. The added sound effects will further enrich these immersive experiences.
As Pika evolves, it aims to introduce more features, having raised $55 million in funding at a nearly $200 million valuation. The company is positioning itself to challenge not only OpenAI but also other major players in the creative AI space, including Adobe, Runway, Stability AI, and the recently launched Haiper.