Project Astra stole the show at Google I/O, giving us a glimpse at what our interactions with the world will look like powered by Gemini 1.5 — Google's next-generation AI model. Though It could be some time until a public version of Project Astra trickles down to devices, I had the chance to demo its different abilities while attending Google's annual developer conference.
In short, Project Astra is real-time, camera-based AI that can do anything from identify an object in frame to craft an fictional story about said object to rewrite that story using an obnoxious amount of alliteration. No seriously, when prompted with a plastic apple, it romanticized the toy (presumably nabbed from a children's play set) as "pretty produce positioned perfectly."
@tomsguide ♬ original sound - Tom’s Guide
For the purpose of the demo, Google hooked up a stationary top-down camera to a machine running Gemini 1.5. The camera feed alone was used for this alliteration game, though it also showed off the model's object identification chops. When presented with an array of dinosaur figurines, Gemini not only named each's classification but came up with names and adventurous storylines that seemed surprisingly suitable.
Posed with a less pre-meditated challenge, a fellow reporter asked the agent to read the relatively small tattoo printed on their forearm and state which TV show it's a nod to. Though Gemini incorrectly guessed "Game of Thrones" at first, it landed on "Battlestar Galactica" on the second try. (In case you're wondering, the quote was "so say we all.")
Google had a touchscreen display fed to the Gemini model as well, outfitted for friendly rounds of Pictionary. I stepped up to challenge Project Astra, delivering my best attempt at a certain ball-shaped droid from the Star Wars universe to stay with the sci-fi theme. Though this doodle definitely didn't deserve a spot on the fridge, when asked, "what do you see?" the agent nailed it — BB-8 from the sequel trilogy.
While the demo had a quiz-like nature, the idea is that it proves how Gemini could be helpful with it's sight abilities. Google said it will initially come to Android phones in the form of Gemini Live, but this official demo video shows it action with a "prototypes glasses device," suggesting a fresh form factor is in the works.
In the ideal scenario, Gemini Live will be able to see what you see to answer questions, inspire creativity, or even help you find a missing object hiding in plain sight. Doing so through the Gemini app or camera app on a smartphone makes enough sense, but I think a glasses design like the Ray-Ban Meta Smart Glasses would ultimately provide less friction.
As a "look and tell me"-style tool, Project Astra and Gemini appear to live up to the hype. There are competing versions of both available now, but if one company knows how to do search right, it's Google.