A student project has revealed yet another power of artificial intelligence — it can be extremely good at geolocating where photos are taken.
The project, known as Predicting Image Geolocations (or PIGEON, for short), was designed by three Stanford graduate students to identify locations on Google Street View.
But when presented with a few personal photos it had never seen before, the program was, in the majority of cases, able to make accurate guesses about where the photos were taken.
Like so many applications of AI, this new power is likely to be a double-edged sword: It may help people identify the locations of old snapshots from relatives, or allow field biologists to conduct rapid surveys of entire regions for invasive plant species, to name but a few of many likely beneficial applications.
But it also could be used to expose information about individuals that they never intended to share, says Jay Stanley, a senior policy analyst at the American Civil Liberties Union who studies technology. Stanley worries that similar technology, which he feels will almost certainly become widely available, could be used for government surveillance, corporate tracking or even stalking.
"From a privacy point of view, your location can be a very sensitive set of information," he says.
AI has arrived at your destination
It all began with a class at Stanford: Computer Science 330, Deep Multi-task and Meta Learning.
Three friends, Michal Skreta, Silas Alberti and Lukas Haas, needed a project, and they shared a common hobby:
"During that time we were actually big players of a Swedish game called GeoGuessr," says Skreta.
GeoGuessr is an online game that challenges players to geolocate photos. It has a pretty straightforward setup, Skreta says: "You enter the game, you're placed somewhere in the world on Google Street View, and you're supposed to place a pin on the map, that is your best guess of the location."
The game has over 50 million players who compete in world championships, adds Silas Alberti, another member of the project. "It has YouTubers, Twitch streamers, pro players."
The students wanted to see if they could build an AI player that could do better than humans. They started with an existing system for analyzing images called CLIP. It's a neural network program that can learn about visual images just by reading text about them, and it's built by OpenAI, the same company that makes ChatGPT.
The Stanford students trained their version of the system with images from Google Street View.
"We created our own dataset of around 500,000 street view images," Alberti says. "That's actually not that much data, [and] we were able to get quite spectacular performance."
The team added other pieces to the program, including one that helped the AI classify images by their position on the globe. When completed, the PIGEON system could identify the location of a Google Street View image anywhere on Earth. It guesses the correct country 95% of the time and can usually pick a location within about 25 miles of the actual site.
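For readers curious what "classifying images by their position on the globe" and "within about 25 miles" might look like in practice, here is a minimal sketch in Python. It is not the students' code. It assumes a simple fixed grid of one-degree cells, which is only an illustrative stand-in for however PIGEON actually divides the map, and the function names are hypothetical. It shows two ideas: turning a latitude and longitude into a discrete class label a model could predict, and measuring how far a guess lands from the true spot using the standard haversine great-circle formula.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean radius of the Earth


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def grid_cell(lat, lon, cell_deg=1.0):
    """Map a coordinate to a (row, col) cell label.

    Assumption: a uniform grid of cell_deg-sized cells, chosen here only
    for illustration. The article does not say how PIGEON splits the globe.
    """
    row = math.floor((lat + 90.0) / cell_deg)
    col = math.floor((lon + 180.0) / cell_deg)
    return row, col


def cell_center(row, col, cell_deg=1.0):
    """Return the (lat, lon) at the center of a grid cell."""
    lat = (row + 0.5) * cell_deg - 90.0
    lon = (col + 0.5) * cell_deg - 180.0
    return lat, lon


if __name__ == "__main__":
    # Approximate coordinates for San Francisco, used as the "true" location.
    true_lat, true_lon = 37.77, -122.42
    cell = grid_cell(true_lat, true_lon)
    guess_lat, guess_lon = cell_center(*cell)
    error = haversine_miles(true_lat, true_lon, guess_lat, guess_lon)
    print(f"cell label: {cell}, guess error: {error:.1f} miles")
```

In this toy setup, a model that picks the right cell and answers with the cell's center is still off by some miles, which is one reason a coarse classification step is typically paired with some further refinement.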
Next, they pitted their algorithm against a human. Specifically, a really good human named Trevor Rainbolt. Rainbolt is a legend in geoguessing circles (he recently geolocated a photo of a random tree in Illinois, just for kicks), but he met his match with PIGEON. In a head-to-head competition, he lost multiple rounds.
"We weren't the first AI that played against Rainbolt," Alberti says. "We're just the first AI that won against Rainbolt."
Noticing the little things
PIGEON excels because it can pick up on all the little clues humans can, and many more subtle ones, like slight differences in foliage, soil, and weather.
The group says the technology has all kinds of potential applications. It could identify roads or power lines that need fixing, help monitor for biodiversity, or be used as a teaching tool.
Skreta believes ordinary people will also find it useful: "You like this destination in Italy; where in the world could you go if you want to see something similar?"
To test PIGEON's performance, I gave it five personal photos from a trip I took across America years ago, none of which have been published online. Some photos were snapped in cities, but a few were taken in places nowhere near roads or other easily recognizable landmarks.
That didn't seem to matter much.
It guessed a campsite in Yellowstone to within around 35 miles of the actual location. The program placed another photo, taken on a street in San Francisco, to within a few city blocks.
Not every photo was an easy match: The program mistakenly linked one photo taken on the front range of Wyoming to a spot along the front range of Colorado, more than a hundred miles away. And it guessed that a picture of the Snake River Canyon in Idaho was of the Kawarau Gorge in New Zealand (in fairness, the two landscapes look remarkably similar).
The ACLU's Jay Stanley thinks that, despite these stumbles, the program clearly shows the potential power of AI.
"The fact that this was done as a student project makes you wonder what could be done, by, for example, Google," he says.
In fact, Google already has a feature known as "location estimation," which uses AI to guess a photo's location. Currently, it only uses a catalog of roughly a million landmarks, rather than the 220 billion street view images that Google has collected. The company told NPR that users can disable the feature.
Stanley worries that companies might soon use AI to track where you've traveled, or that governments might check your photos to see if you've visited a country on a watchlist. Stalking and abuse are also obvious threats, he says. In the past, Stanley says, people have been able to remove GPS location tagging from photos they post online. That may not work anymore.
The Stanford graduate students are well aware of the risks. They've written a paper on their technique, which they co-authored along with their professor, Chelsea Finn — but they've held back from making their full model publicly available, precisely because of these concerns, they say.
But Stanley thinks the use of AI for geolocation will become even more powerful going forward. He doubts there's much to be done, except to be aware of what's in the background of photos you post online.