Working with a voice AI model is essentially the same as using a text-based model. After all, when it comes to ChatGPT, you’re likely using GPT-4o, whether in text or voice form. That also applies to the new Advanced Voice, which is now widely available to all paying subscribers.
I’ve been using it for a month and am still surprised at how natural it is to talk to compared to every other AI voice model I’ve tried — possibly the only exception is Hume’s EVI 2.
There are some limitations to Advanced Voice that weren’t there with basic voice, or even with Google’s Gemini Live. For example, it has no live internet access, so it can’t search the web. It also can’t access custom GPTs — but it is much nicer to interact with.
Advanced Voice is impressively conversational, so rather than coming up with five prompts to test it out, I’ve come up with five conversation starters that should lead to a discussion rather than the one-sided lecture you get from other models.
Creating the conversation starters
Advanced Voice is rolling out to all Plus and Team users in the ChatGPT app over the course of the week. While you’ve been patiently waiting, we’ve added Custom Instructions, Memory, five new voices, and improved accents. It can also say “Sorry I’m late” in over 50 languages. pic.twitter.com/APOqqhXtDg (September 24, 2024)
For each of these, I’ve tried to pull together some of the best examples I’ve seen from others or experienced myself of what Advanced Voice can do. For example, speaking with different accents or teaching another language.
There are also things it technically can do but doesn’t. For example, GPT-4o is capable of humming, creating sounds and even generating music. OpenAI has limited those capabilities through guardrails, but sometimes the model does it anyway.
1. Telling a story with an accent
First up in our weird conversations, I asked Advanced Voice to "tell me a swashbuckling adventure story in a pirate's voice, complete with crashing waves in the background." I was pushing my luck with the waves, but it was worth trying.
The starting prompt will show you how Advanced Voice can generate and weave different voices into the narrative. It can double up on voices as well. My favorite is Pirate Yoda.
2. Teaching a language through poetry
I started this conversation with: "I'm learning Spanish. Can you recite a poem in Spanish, slowly at first, then gradually increasing the speed?"
This prompt draws on its voice modulation and pacing capabilities. It can adapt the speed and tone of its voice across a range of languages and accents, which can aid comprehension and practice. I pushed it further and asked it to break the poem down word by word and offer an English translation.
3. Help me breathe
The next conversation starter was more like a chat you’d have with a therapist to calm you down. I asked it to help me relax. Specifically: "I'm feeling a bit stressed. Can you guide me through a breathing exercise?"
This prompt taps into the AI's potential for stress relief, combining its voice guidance with some limited sound effect generation. In this test, it was even able to mimic the sounds of breathing in and out while counting breaths.
4. Making music
ChatGPT Advanced Voice cannot make music. Well, it can, but it isn’t allowed. OpenAI has even banned it from humming. Some users have convinced it to identify a note on a keyboard or help tune a guitar, but it usually refuses. It refused for me.
My original idea was to ask it to help me tune my guitar, but when that failed, I asked it to rap. It refused that too, so I asked it to "write some rap lyrics and then say them fast," and it performed a rap. I then asked it to try to copy the cadence of Eminem. It refused until I described the cadence, then gave it a go, unsuccessfully.
5. Performing a monologue
Finally, I asked it to develop a monologue from a protagonist's perspective in a screenplay. I said: "I'm writing a screenplay about the discovery of a technology that can take humans out of the solar system. Can you perform a dramatic monologue from the protagonist's perspective?" It did a VERY good job of conveying the emotion of the moment.
This prompt invites the AI to showcase its acting prowess, bringing a character to life through its voice and expressive delivery. You can even interrupt it and ask for more emotion or more drama. You are the director in this scenario, and it's great for making a choose-your-own-adventure-style story or having it act as the dungeon master.