Synthesia has launched a new feature called Personal Avatars. These are digital versions of yourself that can be created within minutes and resemble you so closely they could fool your friends, and probably even your mother if she isn’t paying close attention.
The generative AI video startup has had the option to use digital avatars, of varying degrees of realism, for some time within its PowerPoint-style presentation software. They act like virtual presenters reading a script to run alongside the text and graphics on the slide.
You’ve also been able to create personalized avatars but these have been flat and not particularly expressive — unless you take the time to visit the Synthesia offices and record a digital version of yourself. But this latest update takes personalization to a whole new level, and makes it much easier to create a realistic twin.
To create a digital version of yourself you sit in front of a webcam, read a 60-second block of text and wait a day for the avatar to generate. It's important to note, this is only available with a paid Synthesia subscription and you can only access your virtual likeness from your Synthesia account.
What is the point of a personal avatar?
With the rise of video-based social media platforms like TikTok and Instagram Reels it is becoming increasingly important to be able to present yourself visually.
Not everyone has that spark in front of a camera, or even the desire to do so. Now you can type and have a digital version of yourself read the words you’ve typed — no speaking needed.
The main use though is within an office environment to bring new life to a PowerPoint presentation — or create a video teaching your boss how to save a document as a PDF.
Teachers could use it to easily create custom lessons for different children, or you could use it to offer a presentation or TikTok in multiple languages, even if you personally only speak English.
How do Synthesia personal avatars work?
7 out of 10 people get this one wrong: How many humans are in this video? 👀 Watch until the end to find out … and join us tomorrow to learn more about our next-gen Personal Avatars! 👉 https://t.co/TnoY8SSM7Y pic.twitter.com/3uCHjeF21o (July 30, 2024)
The most exciting part of the personal avatar technology is actually the impressively accurate voice cloning, able to capture a mixture of inflection, emotion and style. It then essentially adds animation and lip-synching to a photograph taken with the webcam during setup.
It isn’t creating a ‘true’ avatar that could be moved to different environments, put in different outfits or even changed on a whim, but what is clever is the audio and seamless looping.
Personal Avatars use “auto alignment” for the looping. This is a form of AI that understands the difference between moments of silence and moments of speech. It can then appropriately coordinate body movements aligned to the script to make it more natural.
What is it like having a personal avatar?
When I sat down in front of a webcam and recorded a few minutes of what seemed like fairly rambling text I expected a slightly jittery, slightly off version of myself. I’d seen the demos but didn’t expect it would match that realism. I was wrong.
It took about a day for the system to finish training the model and creating my avatar. It was then available within the Synthesia system — all it took was writing a few lines of text. I could literally put words into my own mouth and it took less than 10 minutes to make a video.
If you look closely, or watch it on a larger monitor while paying attention, it's clear the avatar isn’t real. Its speech is very impressive but I tend to look in one direction, movement is a little bit off compared to what you’d expect naturally, and it corrected my eyes. I’ve had a condition since childhood called amblyopia where my left eye and my brain don’t communicate particularly well, so I have minimal vision in that eye and it always points off to the left. In my avatar, both eyes are facing exactly where you’d expect.
Seeing this slightly corrected version of myself was a shock. It was much better than I expected, and while I’m used to using AI every day, I was taken aback by how advanced the tech has become and how accurately we can recreate people. I recommend checking my guide to signs that something might be a deepfake.
How do you start using personal avatars?
According to Synthesia, personal avatars are available on the Starter, Creator and Enterprise plans. There is a free plan available where you can use the inbuilt avatars but the first plan with customization is $22 per month, about the same price as ChatGPT or Claude.
“At Synthesia, we believe that AI avatars will revolutionize how we interact in digital spaces,” a spokesperson explained, adding that they wanted to see how users utilize Personal Avatars to “express yourselves, connect with others, and push the boundaries of digital communication.”
Admittedly, there is something a little creepy about having a very accurate recreation of yourself, but I think Synthesia has taken the right approach. Because the avatar is locked to a verified user account, not just anybody can put words into a virtual mouth.
I do wonder if the next logical evolution of this is a digital twin you create once and can then use anywhere. There are other companies building tech in this space, including Captions with an app-based digital twin where you can change the background and add customizations, but none have come close to the realism of Synthesia Personal Avatars.
Captions' approach is more flexible and mobile-friendly, but Synthesia captured my voice even more accurately than my ElevenLabs clone, and that one fooled my children.