Despite ongoing concern that generative artificial intelligence will replace artists and writers, I hold a more optimistic view, and I'm not alone. I see a future in which humans leverage generative A.I. to increase their productivity, automating the boring parts of the work so they can focus on the creative process.
Beyond augmenting creative output, harnessing the power of A.I. also translates into lower budgets and shorter post-production times for the moviemaking industry: a huge win for film producers, especially those leading smaller productions like Everything Everywhere All at Once (EEAAO).
The film was the big winner of this year's awards season, snatching SAG, BAFTA, and Golden Globe wins, as well as a whopping seven Academy Awards, including Best Picture, Best Director, and Best Actress. While the movie is said to herald a new dawn for Hollywood, one that celebrates diversity and the Asian community, EEAAO has also ushered in another big change for the movie industry: the use of A.I. to deliver better, more cost-effective visual effects.
While recent developments in A.I.-powered chatbots have taken the Internet by storm, another class of generative model is quietly revolutionizing filmmaking. Generative diffusion models are unlocking powerful image creation and editing tools, enhancing the creativity of visual effects artists, and delivering a new era of movie magic. Trained on billions of captioned images, diffusion models learn to extend an existing image beyond its boundaries, transfer one image's style to another, and create entirely new images from simple text descriptions.
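For readers curious what this looks like in practice, here is a minimal sketch of text-to-image generation using the open-source diffusers library and a publicly available Stable Diffusion checkpoint. The model name, prompt, and settings are illustrative assumptions, not the specific tools used on any film.

```python
# Minimal text-to-image sketch with a public Stable Diffusion checkpoint.
# Assumes a CUDA-capable GPU; model name and prompt are illustrative only.
import torch
from diffusers import StableDiffusionPipeline

# Download a pretrained diffusion model from the Hugging Face Hub.
pipe = StableDiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1", torch_dtype=torch.float16
)
pipe = pipe.to("cuda")

# Turn a plain-text description into a brand-new image.
prompt = "rocks drifting across a windswept desert at golden hour, cinematic"
image = pipe(prompt, num_inference_steps=30, guidance_scale=7.5).images[0]
image.save("generated_frame.png")
```

A few lines of text in, a finished frame out: that is the workflow shift these tools bring to visual effects.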
In the case of EEAAO, a small team of visual effects artists was tasked with creating a multiverse against tight deadlines, leading them to rely on A.I. tools to automate tedious aspects of editing. Editors used a popular suite of A.I. “magic tools” from Runway, an A.I. content creation startup and one of the research teams behind Stable Diffusion, to create footage that would have been too costly and time-consuming to produce on a movie set or as a CGI effect. For one scene in particular, a VFX artist used a rotoscoping tool to get a quick, clean cut of rocks moving through sand as dust swirled around the shot. Days’ worth of painstaking work was slashed to mere minutes. The result? Oscar-quality moviemaking magic.
There is a swell of innovative startups in the space helping filmmakers bring their visions to life in exciting new ways. Metaphysic harnesses generative A.I. to create photorealistic video, and its technology will soon help Tom Hanks and Robin Wright portray younger characters through de-aging with higher fidelity than previous attempts: more Harrison Ford in the latest Indiana Jones than Jeff Bridges in the Tron sequel some years back. Synthesia helps anyone with a computer create professional videos (for corporate training, product marketing, and educational purposes) from simple text prompts in 120 languages, no film degree required.
Krikey, a startup led by two sisters, uses generative A.I. to make it easier for creators to breathe life into animations by automating character motion. Artists can either create a video with custom 3D avatars provided by the tool (complete with body and hand motions, facial expressions, 3D backgrounds, and camera angles) or export a "skeleton animation" file and apply it to their own characters with a couple of clicks. That means studios and gaming companies can protect their intellectual property, which is never shared with Krikey. The company also offers a “Canva-like” app that makes it easier for anyone to create animated videos with just a few clicks, a welcome break for corporate and educational video producers.
The possibilities are endless. Composition, stylization, inpainting, motion tracking, you name it: A.I. can make it all easier, faster, and less painful for creators, freeing them to focus on concepts and iterate more quickly. Existing footage of a train pulling out of a station can be transformed into a clay animation. Images of a man running on snow can be recomposed to look like he is running on the surface of Mars. Aerial footage of a city mock-up built with Legos can be rendered as a real, vibrant cityscape at dawn. A model walking the runway can have her real hair color masked to match her dress, as in the sketch below. All of this can now be generated in seconds from simple text or image prompts, while retaining high quality and flexibility.
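To illustrate one of these edits, the hair-recoloring example could be done with inpainting, where the model repaints only a masked region of an image. The sketch below uses the open-source diffusers library; the model name, file paths, and prompt are assumptions for the example, not the pipeline used by any particular studio.

```python
# Illustrative inpainting sketch: repaint only the masked region of an image.
# Model name, file paths, and prompt are placeholders, not a production pipeline.
import torch
from diffusers import StableDiffusionInpaintPipeline
from PIL import Image

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-inpainting", torch_dtype=torch.float16
)
pipe = pipe.to("cuda")

# The mask is white where the image should change (the model's hair)
# and black everywhere that must stay untouched.
source = Image.open("runway_frame.png").convert("RGB")
mask = Image.open("hair_mask.png").convert("RGB")

result = pipe(
    prompt="model on a runway with emerald-green hair matching her dress",
    image=source,
    mask_image=mask,
).images[0]
result.save("runway_frame_recolored.png")
```

Because only the masked pixels are regenerated, the rest of the frame stays untouched, which is exactly what makes these edits fast enough to run on a post-production deadline.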
As more models and refining tools hit the market and interest grows, sustaining and scaling them will require enormous computing power, a prime example of the power of the cloud. The first version of Stable Diffusion was trained on 100,000 GB worth of images and labels and could generate an image in 5.6 seconds. Today, newer releases have cut that time to 0.9 seconds while adding capabilities that upscale image resolution and infer depth information.
We can all rejoice in the triumph of EEAAO, the first big hit for A.I.-assisted filmmaking. As more studios, editors, and artists embrace A.I. tools, those tools will become more accessible and help unlock the potential of amateur filmmakers everywhere. One thing is for certain: Those cat videos the Internet loves so much are about to become way more interesting.
Howard Wright is the VP and Global Head of Startups at AWS.
The opinions expressed in Fortune.com commentary pieces are solely the views of their authors and do not necessarily reflect the opinions and beliefs of Fortune.