A new AI video model called Kling appears to offer many of the same features that made OpenAI's Sora stand out when it was first announced earlier this year.
Built by the Chinese video platform company Kuaishou, Kling offers longer video generations, improved movement, better prompt following and multi-shot sequences. Unlike Sora, Kling already appears to be available to users through a waitlist.
OpenAI unveiled Sora in February and we have started to see a growing number of creators using it, including five award-winning filmmakers set to premiere Sora-made shorts at the Tribeca Film Festival next week. Despite that, it still isn’t widely available to the public.
The clips shared from Kling so far include a boy riding a bicycle, a horse in the desert, someone eating noodles and a photorealistic video of a young boy enjoying a burger.
What do we know about Kling?
"Chinese new DiT Video AI Generation model 【KLING】 Open access! Generate 120s video with 30fps 1080p, understand physics better, model complex motion accurately. Prompt: Traveling by train, viewing all sorts of landscapes through the window." https://t.co/hTwIEHRza2 pic.twitter.com/nBvnAsqd1O (June 6, 2024)
According to Kuaishou, Kling can generate up to two minutes of video from a single prompt in 1080p at 30 frames per second. It also "accurately simulates real-world physics", something most AI models struggle with.
Like Sora, it is a diffusion transformer model, and it uses a proprietary architecture that can support a range of aspect ratios and shot types.
In addition to the generative features, Kling is capable of advanced 3D face and body reconstruction to improve full expression and limb movement within the video, the company explained on its website.
What we don't know yet is whether Kling, or even Vidu, the other big Chinese AI video model, will ever be released outside of China. That could be OpenAI’s saving grace in the West.
What do Kling videos look like?
The most impressive aspect of the videos is their photorealism. Some clips suffer from the same blurring we see in other AI videos, but not to the same degree.
There is one clip of a parrot you'd struggle to tell isn't real, and I’m still not sure they haven’t faked the burger video.
One type of video I've tested many times is pouring liquid: most models struggle with it, but Kling seemed to get it right, at least in one demo of milk being poured into a glass of coffee.
Overall, Kling seems able to create accurate motion, model real-world movement and physics more faithfully, and render a photorealistic depiction of the world.
What does this mean for Sora?
I hope the company considers a broader release outside of China, as competition is good for creativity and innovation. A wider launch would hopefully push OpenAI to release Sora faster than currently planned.
OpenAI is also facing competition from existing players like Runway and Pika Labs, both of which are stepping up their game. There are also newcomers like Haiper, LTX Studio and Higgsfield, each taking a different approach to AI video, and Google has its new Veo model.
The reality is that OpenAI’s delay isn’t about technical readiness but safety. The company says it wants to ensure the model can’t be used for misinformation or malicious purposes before offering it to the general public. It also needs to make the model faster and cheaper to run.