A new interview with the director behind the viral Sora clip Air Head has revealed that AI played a smaller part in its production than was originally claimed.
In an interview with Fxguide, Patrick Cederberg (who did the post-production for the viral video) confirmed that OpenAI's text-to-video program was far from the only force involved in its production. The 1-minute and 21-second clip was made with a combination of traditional filmmaking techniques and post-production editing to achieve the look of the final picture.
Air Head was made by ShyKids and tells the short story of a man with a literal balloon for a head. While the clip uses a human voiceover, the way OpenAI was pushing it on social channels such as YouTube certainly left the impression that the visuals were purely powered by AI. That's not entirely true.
As revealed in the behind-the-scenes clip, a ton of work was done by ShyKids, who took the raw output from Sora and cleaned it up into the finished product. This included manually rotoscoping the backgrounds, removing the faces that would occasionally appear on the balloons, and color correcting.
Then there's the fact that Sora takes a ton of time to actually get things right. Cederberg explains that there were "hundreds of generations at 10 to 20 seconds a piece" which were then tightly edited in what the team described as a "300:1" ratio of what was generated versus what was primed for further touch-ups.
Such manual work also included editing out the head, which would appear and reappear, and even changing the color of the balloon itself, which would sometimes appear red instead of yellow. While Sora was used to generate the initial imagery with good results, there was clearly a lot more happening behind the scenes to make the finished product look as good as it does, so we're still a long way off from instantly generated movie-quality productions.
Sora remains tightly under wraps save for a handful of carefully curated projects that have been allowed to surface, with Air Head among the most popular. The clip has over 120,000 views at the time of writing, with OpenAI touting it as "experimentation" with the program, downplaying the obvious work that went into the final product.
Sora is impressive but we're not convinced
While OpenAI has done a decent job of showcasing what its text-to-video model can do, the lack of transparency is worrying.
Air Head is an impressive clip by a talented team, but it was subject to a ton of editing to get the final product to where it is in the short.
It's not quite the one-click-and-you're-done approach that many of the tech's boosters have represented it as. It turns out that it is merely a tool that could be used to enhance imagery rather than create it from scratch, which is already common enough in video production, making Sora seem less revolutionary than it first appeared.