If you've been following AI influencers or creators on social media, you may have noticed a surge of excitement surrounding a new AI video generation model called “Kling.”
Kling generates highly realistic videos from text prompts and in-app settings, with results comparable to those of Sora, OpenAI’s invitation-only AI video model. Sora is currently in closed beta and has been shared selectively with a handful of artists and filmmakers for testing, including exploring its potentially controversial uses.
Kling recently demonstrated its capabilities on YouTube by replicating “air head,” one of the initial videos created with Sora by the creative agency Shy Kids.
What is Kling and Its Origins?
According to the South China Morning Post (SCMP), Kling was developed by Kuaishou Technology, the company behind Kuaishou, China's second most popular short video app (branded Kwai outside China) with 400 million daily active users (DAUs). It trails only Douyin, the Chinese version of TikTok, which boasts 600 million DAUs. That large user base gives Kling a ready audience and could strengthen Kuaishou’s standing against Douyin.
SCMP notes that the Kling AI model, currently in trial, can transform text into video clips up to 2 minutes long at 1080p resolution, supporting various aspect ratios. It can interpret prompts to create videos that reflect real-world scenarios or imaginative scenes.
According to sources cited by Perplexity, Kling utilizes a unique 3D Variational Autoencoder (VAE) for facial and body reconstruction, capturing detailed expressions and movements from a single full-body image. This is enhanced by a 3D spatiotemporal joint attention mechanism, allowing the model to handle complex scenes while adhering to the laws of physics.
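To make the "joint" part of that description more concrete, here is a minimal, generic sketch of attention applied across space and time at once. It is not Kling's implementation, whose details are not described here. The function name `joint_spacetime_attention` is hypothetical, and the sketch assumes identity query/key/value projections and tiny hand-written patch vectors purely for illustration.

```python
import math

def joint_spacetime_attention(video):
    """Illustrative only: `video` is a list of frames, each a list of patch
    vectors (lists of floats). Every patch attends to every patch in every
    frame, rather than attending within one frame at a time."""
    # Flatten (frame, patch) into a single token sequence so that attention
    # spans space and time jointly.
    tokens = [patch for frame in video for patch in frame]
    dim = len(tokens[0])
    scale = 1.0 / math.sqrt(dim)

    outputs = []
    for query in tokens:
        # Scaled dot-product scores against all tokens
        # (assumption: identity Q/K/V projections, for brevity).
        scores = [scale * sum(q * k for q, k in zip(query, key)) for key in tokens]
        peak = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        weights = [w / total for w in weights]
        # Each output is a weighted average of all token vectors.
        outputs.append(
            [sum(w * value[d] for w, value in zip(weights, tokens)) for d in range(dim)]
        )

    # Restore the original (frame, patch) layout.
    per_frame = len(video[0])
    return [outputs[i:i + per_frame] for i in range(0, len(outputs), per_frame)]

# Two frames, two patches per frame, two features per patch.
video = [[[1.0, 0.0], [0.0, 1.0]], [[1.0, 1.0], [0.0, 0.0]]]
mixed = joint_spacetime_attention(video)
```

In this toy setup, each output patch blends information from both frames, which is the intuition behind treating motion and appearance together instead of frame by frame.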
How to Access Kling and Its Cost
Kling is free through the Kuaishou, Kwai, and KwaiCut apps (the latter being a video editing competitor to TikTok's CapCut). However, potential users outside China may face accessibility issues; reports suggest that a Chinese phone number is required to download and use the app.
Justine Moore, a partner at venture capital firm a16z, suggested a workaround: signing up through the KwaiCut app with a burner phone number. U.S. filmmaker Dustin Hollywood also recommended using ChatGPT to translate the app's menus and interfaces for non-Chinese speakers.
Capabilities of Kling
Early users have reported that Kling excels at creating immersive, realistic high-resolution videos across various genres, from action sequences to first-person shooter recreations and high-fantasy scenarios reminiscent of House of the Dragon or Game of Thrones.
Dustin Hollywood mentions that generating a video from a prompt of "intermediate" complexity takes about two minutes. However, he notes some limitations, particularly in accurately depicting race and skin color, similar to the challenges Google’s Gemini AI image generator has faced.
Despite these drawbacks, Kling is making waves in the filmmaking community, prompting many, including Dustin Hollywood, to reconsider their views on Sora and OpenAI's cautious distribution strategy.
The Impact of Kling on the AI Video Landscape
Kling’s emergence raises questions about its potential to push U.S.-based AI video model providers, such as OpenAI, Runway, and Pika, to enhance their offerings in terms of quality and resolution. It remains to be seen whether they can quickly adapt to meet or exceed what Kling offers.
For anyone interested in AI filmmaking or the broader film industry, the introduction of Kling is certainly an exciting development. Here’s hoping for a full release in the U.S. without the current Chinese phone number requirement.