What you need to know
- OpenAI launched GPT-4o today, a new AI model that can interact with users through audio, vision, and text.
- Google appears to have a counter to GPT-4o ready for Google I/O 2024, which begins tomorrow.
- In a teaser, Google previewed an unreleased multimodal AI interface, and said to expect announcements related to AI, Google Search, and more.
OpenAI beat Google to the punch today, hosting an event where the company released a new AI model called GPT-4o. The timing of OpenAI's event was no accident — it came about a day before Google was set to announce artificial intelligence plans and features at the Google I/O 2024 developer conference. However, it looks like OpenAI might spend only a day as the leader in multimodal AI. In a teaser posted to X (formerly Twitter), Google gave us a look at an unreleased artificial intelligence interface running on an Android phone.
The main part of the teaser is a video showing a multimodal AI interface running on an Android phone. Visually, it looks a bit like the Pixel Camera app, and the demo shows someone asking it questions about their surroundings — in this case, the Google I/O stage. After correctly identifying that setup was happening for an event, the AI tool also recognized the Google I/O wordmark and explained the details of the developer conference. However, since it's a pre-recorded video, the results have to be taken with a grain of salt.
Alongside the video, Google says to expect "the latest news about AI, Search, and more" at Google I/O 2024. The main keynote is set for 10 a.m. PT, and we'll be tracking all the developments in our live blog.
"One more day until #GoogleIO! We’re feeling 🤩. See you tomorrow for the latest news about AI, Search and more." pic.twitter.com/QiS1G8GBf9 (May 13, 2024)
Gemini has already appeared as a voice assistant, and as a chatbot that can consume images, screenshots, and more. What's new in Google's teaser is that the interface now supports vision as an alternative to voice and text input. By looking through your device's camera, this AI interface can answer questions about your surroundings. It's similar to what the Humane AI Pin and the Rabbit R1 tried to do with standalone devices.
However, there's still a lot we don't know, like whether this functionality would be built into Gemini or some other app. The demo seems impressive enough that a supercharged version of Gemini might be ready to replace Google Assistant, but that's only speculation for now.
We'll gain more clarity on what Google has been working on during Tuesday's keynote, but for now, this teaser seems to confirm a GPT-4o competitor is coming.