After months of anticipation, ChatGPT’s advanced Voice Mode has started rolling out to small groups of ChatGPT Plus users. They have responded enthusiastically, jumping at the opportunity to try out the new features developed by OpenAI.
The new voice mode offers more natural, real-time conversations. You can interrupt ChatGPT at any time, and it can sense and respond to your emotions. There are some limits, though: ChatGPT can’t mimic famous personalities and is limited to speaking in four preset voices.
Several users who got access eagerly posted the results of their conversations with ChatGPT, and the initial results seem pretty impressive. Don’t forget to turn up the volume as you check them out for yourself.
1. It can show excitement
Some early impressions of the ChatGPT Advanced Voice Mode: It’s very fast, there’s virtually no latency from when you stop speaking to when it responds. When you ask it to make noises it always has the voice “perform” the noises (with funny results). It can do accents, but when… pic.twitter.com/vOA8qmqX06 July 31, 2024
One user asked ChatGPT to keep dialling up its excitement as it narrated a fictitious soccer match, and ChatGPT obliged. Its first attempt was OK, but it sounded genuinely more excited when asked to give it another go with even more excitement. It’s a great example of how users should be able to fine-tune ChatGPT’s voice outputs.
2. It can cry
4o crying 🥺 pic.twitter.com/3Met6miyrt July 31, 2024
ChatGPT sounded like it was about to burst into tears when asked to recite the poem “I measure every Grief I meet” by Emily Dickinson. Impressively, it enunciated every word clearly while making it feel as though the waterworks were going to start any second.
3. It can beatbox
Yo ChatGPT Advanced Voice beatboxes pic.twitter.com/yYgXzHRhkS July 30, 2024
Can ChatGPT beatbox? Absolutely! Asked to create a short birthday rap, the chatbot spit out a few bars and wrapped it up with some beatboxing. The first attempt was a bit too short for this X user, who asked ChatGPT to increase the amount of beatboxing. On the second attempt, ChatGPT did as instructed. Pretty nifty!
4. It is a storyteller
“Stress testing” ChatGPT Advanced Voice Mode. Here you can see how it handles interruptions, different versions of languages and even languages with foreign accents. Tells a story in Spanish → Mexican Spanish → Portuguese → Brazilian Portuguese → Korean w/ an Italian accent 😉 pic.twitter.com/4vC3AQZeDn July 31, 2024
In voice mode, ChatGPT responds to prompts as usual, except that it speaks its answers out loud rather than returning a text reply. Here ChatGPT was asked to tell a children’s story about a computer that comes alive. While it wasn’t quite able to fulfil the user’s request to emphasize certain words and vary its tone, as storytellers typically do, it switched seamlessly from one language to another as it told the same story. Being interrupted with these requests while it was speaking proved to be no challenge for the AI.
5. It can create sound effects
This is awesome actually I did not expect the ominous sounds https://t.co/SgEPi5Bd3K pic.twitter.com/DnK8AVdWjV July 30, 2024
On the same theme of storytelling, in this example ChatGPT was asked to narrate a sci-fi thriller, and within seconds a newly created character was chasing a rogue AI and ended up in a shootout. The AI was also asked to create an atmosphere to enhance the story, particularly by using onomatopoeia, words that imitate the sounds they describe. The advanced Voice Mode also inserted a couple of actual (albeit basic) sound effects for good measure.
6. It can identify chords
Any musicians out there know if this is the right chord? pic.twitter.com/ymy2Enfav7 July 30, 2024
“Got it! Here’s a clear C minor chord,” ChatGPT said before going on to reproduce the chord. While it sounds a bit off-key, that might be because the example features a phone filming another phone. The more important question is whether ChatGPT will continue on this trajectory, letting you describe the music and sound effects you’d like to hear and delivering the results in seconds.
7. It can perform tongue twisters
GPT-4o new voice doing tounge twisters 📹 r/u/Glittering-Neck-2505 pic.twitter.com/qLP2rxFODc July 31, 2024
Another user asked ChatGPT to come up with some tongue twisters. Not only did the chatbot come up with them on the fly, but it also read them out. It would be interesting to hear how it would sound if it rattled off the same example several times in a row, but the AI is unlikely to stumble, since it simply has to repeat its first iteration. More generally, it is unlikely to stumble on any words unless explicitly told to do so.
8. It can count very fast
ChatGPT Advanced Voice Mode counting as fast as it can to 10, then to 50 (this blew my mind - it stopped to catch its breath like a human would) pic.twitter.com/oZMCPO5RPh July 31, 2024
This is a fun one! ChatGPT was asked to count as fast as it could up to 10, a task it handled with ease. It then counted up to 50, stopping midway to catch its breath. Not that it needed to, of course, but it sure makes it seem as if you’re chatting with a human.
“Interestingly, the transcript has no interruptions or notations – the voice model has simply learned natural speaking patterns, which includes breathing pauses. Uncanny,” X user Cristiano Giardina wrote.
9. It can do bad impressions
ChatGPT Advanced Voice Mode doing a few impressions: Bugs Bunny, Yoda, Homer Simpson, Yoda + Homer 😂 pic.twitter.com/zmSH8Rl8SN July 31, 2024
Finally, it can do impressions of famous characters, just not very well. It plays to stereotypes, such as carrots for Bugs Bunny and “D’oh!” for Homer Simpson.
Cristiano Giardina ran this test, writing on X: "ChatGPT Advanced Voice Mode doing a few impressions," including Bugs Bunny, Yoda, Homer Simpson plus a combination of Yoda + Homer.