AI developers have been trying to crack the digital personal assistant nut for a while, offering a service that's smart, easy to interact with and always on call. Gemini Live, announced at Made by Google earlier this week, is Google's new attempt to do this, so I gave this AI a 24-hour trial to see how close it gets to being truly useful.
While I'm not used to chatting directly with AI assistants beyond asking them to set timers while I'm cooking, I wanted to see what the benefit of having an open-ended conversation with one like Gemini could be. And after this day of testing, I'm at least confident in the value of speaking with AI like this, even if I have less faith in some of the answers it gives at the moment.
While my experiments with Gemini Live are far from a formal test of its abilities, the breadth of the questions it fielded from me gives us a good impression of what it does well and what it doesn't. So I'm confident in my assessment that Gemini Live is going to make a good addition to the Gemini package, and perhaps a big-enough reason for some free users to become paid users of Gemini Advanced for $20 a month, even if it's not achieving all its aims just yet.
Thursday afternoon — The setup
Gemini Live comes as part of the Gemini Advanced subscription, but while it's being rolled out as I write, it's not available to all users yet. Fortunately I've had a Google Pixel 9 Pro XL to try it out with. If you want to know more about the phone, you can check out our Google Pixel 9 Pro XL review, as we'll be focusing on Gemini Live exclusively here.
Another hitch is that currently you need to set your Gemini language to US English to use it. Fortunately, even after I did this, I could still pick a British voice for Gemini Live, named "Capella," from the ten offered. All sound quite natural, just with differing levels of enthusiasm and vocal pitch. Even when you start asking questions, it's rare to get a particularly egregious mispronunciation or oddly phrased sentence.
Thursday evening — Getting home
With everything set up, my first big interaction with Gemini Live was to ask it for directions home. Gemini Live didn’t initially tell me what it found once I had told it my transport method of choice and confirmed the stations I wanted to go between. After a long wait, I prompted it to actually tell me what it found, and it described the route to me.
I probably would have got home with the route. However, it wouldn’t have been the smoothest of journeys. Gemini misidentified one of the train lines and one of the stations, neglected to note that one of my changes would technically require walking between two stations and then seemed to invent a train out of whole cloth. All of which is strange, because Gemini claimed it had checked the Transport for London website for its info.
This is more a problem with the underlying AI model than with Gemini Live, but having an authoritative-sounding voice (with a British accent no less) suggest a route could have led someone less familiar with London public transport to get very lost. Seems like you're better off sticking with Google Maps for this sort of thing.
Friday morning — News briefing
The following day, I asked Gemini to take me through the day's breaking news as I got ready for work. With just a single prompt, it was able to tell me a lot about the shifting presenters on Good Morning Britain and This Morning, plus a brief reference to the recent stabbing in Leicester Square. But when I asked for tech news, things got weirder.
Google Gemini initially told me that Microsoft had announced a Surface Duo 3, a device that hasn't been confirmed and has in fact been rumored for several months to have been canceled. The PS5 Slim is real, but came out last fall, and we can assume its final comment referred to the CrowdStrike outage from last month.
I then asked Gemini Live to home in on iPhone rumors, but initially its answers all related to the iPhone 15 lineup that's currently available. With further prompting, it described some iPhone 16 camera rumors, but not in much detail.
Friday mid-morning — Brewing guide
After a couple of hours of work, it was time for a coffee break, so I tried to get Gemini Live to guide me through brewing a V60 pourover.
I was hoping to get step-by-step instructions out of the AI, but the problem here is that you need to continuously prompt or interrupt Gemini Live to effectively force it to give its answers as steps. However, it was able to hold up its end of the conversation, offering cogent-sounding answers despite the transcript showing it initially misheard my prompts.
Knowledge-wise, Gemini was a mixed bag. It offered some enthusiast-level tips such as filtering my water prior to boiling it. The overall recipe, albeit simple, did result in a drinkable cup. But Gemini Live also gave me a suggested coffee weight in tablespoons of beans rather than grams or ounces, which isn't a typical measurement when you're brewing. With an extra prompt, though, I was able to get a gram amount.
Friday lunchtime — Fighting talk
With some time to spare over lunch, I had a quick chat with Gemini Live about Street Fighter 6, the game I’m playing the most at the moment. It correctly named the Evo 2024 champion for SF6, as well as their opponent, but again didn't give a huge amount of initial detail.
I moved the conversation on to training advice (I tend to overly rely on certain moves), where I got some suggestions on how to rethink my approach in a match. Easier said than done when your opponent's throwing fireballs at you, but it was valid advice all the same.
I also tried to get some guidance on where to find in-person meet-ups, but this didn't work quite as well. It tried to check the official website for details, but found it didn't have anything outside of Capcom's official tournaments. It then found a nearby Facebook group for me, but it couldn’t give me a link to access in the transcript later.
Friday afternoon — Writing advice
As a final task for Gemini, I decided to go meta, and no, we're not talking about Llama 3. I asked it to help me draft the introduction for this very article.
After experiencing Gemini neglect to give me much detail in its previous answers, I was surprised at how much more willing it was to suggest specific wordings. As I asked it to include more pieces of information or change its angle, it responded in logical ways. And as Google proudly pointed out during its Made by Google demo, Gemini Live is capable of dealing with interruptions and adjusting its answers on the fly.
This was the best that Gemini Live felt, as iterating on an idea out loud feels perfectly natural, even as you talk into a glowing waveform on your phone. In the end I did write this article's intro from scratch. But you can probably see the echoes of its final suggestion if you scroll back up and compare the intro to what it gave me.
Google Gemini Live: Final thoughts
You may assume from this article that I don't think highly of Gemini Live, but that's not quite true. The worst of my criticisms are directed at the Gemini Advanced model running it, as it seemed to misunderstand what it was looking for in several of the test scenarios. Amusingly, a recent Gemini vs. Gemini Advanced face-off we conducted shows I may have been better off sticking to basic Gemini.
Meanwhile, Gemini Live by itself very much impressed. Being able to hold a continued conversation with a chatbot, provided you're willing to be specific and interrupt if it goes off-course, seems a much better way to interact than via text or image prompts. You can ask regular digital assistants follow-up questions, but it's still not as seamless as Gemini Live proved to be. And it's that seamlessness that allows it to be practical, helping answer questions and provide guidance not just hands-free, but eyes-free, allowing you to focus on something else as you and the chatbot speak.
The big question of how this compares to the upcoming ChatGPT Voice remains, though, especially since Gemini Live relies on interpreting speech as text before making its response while ChatGPT Voice can process speech directly. But even with the usual AI caveats, it feels like Google's on the right kind of path in pursuit of the digital personal assistant dream.