Get all your news in one place.
100’s of premium titles.
One app.
Start reading
Tom’s Guide
Tom’s Guide
Technology
Nigel Powell

I tried the new ElevenLabs Video to Sound Effects demo — and it's pretty amazing

ElevenLabs logo on phone sitting on top of keyboard.

Eleven Labs has done it again. The pioneer in top quality AI generated voice and SFX audio, has just unveiled a new text to sound effects API. 

To celebrate the occasion the company also released a very cool open source demo called Video to Sound Effects to showcase what the tech can do. It’s available online and at Github, and it’s pretty awesome.

Just take your generated video, upload it to the ElevenLabs demo webpage, and wait while the platform analyzes the video, and returns a choice of four different sound effect audio tracks to choose from. 

Select the version you like and hit the download button to grab the video clip along with the new audio. Super simple. The whole process takes around 5 minutes from uploading a 5 second clip.

This is a new area of AI known as video-to-audio (V2A). Google recently announced a research project promising similar technology but that isn't yet available to try.

Putting ElevenLabs to the test

I tested it out using Luna Dream Machine (LDM) as my video generation tool. I tried five different video prompts with mixed results, but hey, it’s early days. Anyhoo, I eventually succeeded in getting a clip of a gorilla riding a Harley Davison motorbike, and uploaded it to the ElevenLabs demo page.

The company is not only targeting sound effects with the tech, but also on-demand samples for music production, and dynamic sound for video games.

Within 20 seconds or so I had four audio samples to audition, chose one and started the download process. I have to say that despite some dodgy iterations the final result is actually pretty great. The video is hilarious, and the audio gives it a whole new dimension.

The tech works by sampling 4 frames at 1 second intervals from the uploaded video, which is sent to ChatGPT-4o to create a custom text-to-sound-effects prompt. 

The prompt is then sent back to the ElevenLabs API to create the final SFX. It’s crude, but surprisingly effective. The results will never win an Oscar, or indeed a Golden Reels award, but as a quick and dirty way to give some life to a dull AI generated video clip, it works well.

While the demo is clearly aimed at the general public, the new API is aimed at serious business use. 

The company is not only targeting sound effects with the tech, but also on-demand samples for music production, and dynamic sound for video games.

To deploy the API, customers will need an ElevenLabs account with an API key, and every generation will cost 100 characters, or 25 characters per second for set durations.

More from Tom's Guide

Sign up to read this article
Read news from 100’s of titles, curated specifically for you.
Already a member? Sign in here
Related Stories
Top stories on inkl right now
One subscription that gives you access to news from hundreds of sites
Already a member? Sign in here
Our Picks
Fourteen days free
Download the app
One app. One membership.
100+ trusted global sources.