OpenAI revealed a tool on Thursday that can generate videos from text prompts.
The new model, nicknamed Sora after the Japanese word for “sky”, can produce realistic footage up to a minute long that adheres to a user’s instructions on both subject matter and style. According to a company blogpost, the model is also able to create a video based on a still image or extend existing footage with new material.
“We’re teaching AI to understand and simulate the physical world in motion, with the goal of training models that help people solve problems that require real-world interaction,” the blogpost reads.
One video included among several initial examples from the company was based on the prompt: “A movie trailer featuring the adventures of the 30-year-old space man wearing a red wool knitted motorcycle helmet, blue sky, salt desert, cinematic style, shot on 35mm film, vivid colors.”
The company announced it had opened access to Sora to a few researchers and video creators. The experts would “red team” the product – test whether it can be made to skirt OpenAI’s terms of service, which prohibit “extreme violence, sexual content, hateful imagery, celebrity likeness, or the IP of others”, per the company’s blogpost. Access remains limited to those researchers, visual artists and film-makers, though CEO Sam Altman responded to users’ prompts on Twitter after the announcement with video clips he said were made by Sora. The videos bear a watermark to show they were made by AI.
OpenAI debuted the still image generator Dall-E in 2021 and the generative AI chatbot ChatGPT, which quickly accrued 100 million users, in November 2022. On Wednesday, the company announced an experiment with adding deeper memory to ChatGPT so that it could remember more of its users’ chats. Other AI companies have debuted video generation tools, though those models have only been able to produce a few seconds of footage that often bears little relation to their prompts. Google and Meta have said they are in the process of developing generative video tools, though they have not released them to the public.
OpenAI did not disclose how much footage was used to train Sora or where the training videos may have originated, other than telling the New York Times that the corpus contained videos that were both publicly available and licensed from copyright owners. The company has been sued multiple times for alleged copyright infringement in the training of its generative AI tools, which digest gargantuan amounts of material scraped from the internet and imitate the images or text contained in those datasets.