Tom’s Guide
Ryan Morrison

AI video just took a big leap forward — Pika Labs adds lip syncing

Pika Labs lip sync video.

Pika Labs, one of the leading AI video platforms, has added a new feature that can bring voice to generated characters. 

Lip Sync was built in partnership with AI audio platform ElevenLabs and lets you give words to people in generated videos and sync their lip movements to the sound.

Until now, filmmakers who wanted characters in their generated video to hold a conversation had to accept the lack of lip movement, or intercut real actors with generated clips.

Lip Sync changes that. The new tool is a significant moment in the generative AI video space, which itself is barely a year old. I'd argue that, once it is properly deployed and the initial issues are ironed out, it will be as big a moment as the launch of OpenAI's Sora.

What is Lip Sync from Pika Labs?

Until now, most AI-generated video clips have been just that: clips showing a scene, a person or a situation. They haven't had the interactivity of a character speaking to the camera or to someone else on screen.

Without the ability to have realistic characters speak to the audience, most videos have been glorified slideshows or have been used for music videos.

I've done both, and I've also made fictional trailers for TV shows and commercials, all using voice-over rather than giving specific characters a voice in the video.

I haven't tried Lip Sync myself yet, as it's currently only available to users subscribed to the Pro plan or above. From what I've seen of others' generations, it isn't perfect, but it is very close to being production-ready. At the very least, it will offer a cheap way to get a pilot off the ground quickly.

The feature accepts either text, which is converted to speech using a voice provided by ElevenLabs, or a direct audio upload if you already have your own sound, such as a podcast or audiobook.

Similar functionality is already available from tools like Synthesia, but that has more of an enterprise customer service focus and generates talking heads rather than characters.

Why is Lip Sync in AI videos a big deal?

Runway and Pika Labs have been the dominant platforms for true generative video for the past few months. Both were early to market and have iterated quickly, with Runway revealing its synthetic voice-over service last year, although it was not synced to video.

Competition is starting to heat up, though, with all the big players exploring generative video and OpenAI revealing its very impressive Sora AI video platform.

Stability AI also has a new version of Stable Video Diffusion, and Leonardo is offering motion for any of its AI-generated images. Google has Lumiere and Meta has Emu, forcing the early players to add new features before everyone else catches up.

What comes next?

Up until now we've seen silos in generative AI: tools that make images, tools that create videos, services for writing a script and something else to add sound. The next step will be greater convergence, with platforms emerging that offer full end-to-end production from a simple text prompt.

ElevenLabs is also working on a sound effects library. Combine that with an AI music generator like Suno and we could soon see a single platform where you can say "take this script written by ChatGPT and turn it into a short film".

A few minutes later you'd have a timeline with a series of videos, parts spoken by characters using ElevenLabs' synthetic voices, and appropriate sound effects and music to bring the full production to life.

There was concern we'd see AI turn into Skynet and control our lives, but the evidence (so far) seems to suggest it just wants to entertain.
