PC Gamer
Jeremy Laird

New MIT study shows what you already knew about AI: it doesn't actually understand anything

The OpenAI logo is being displayed on a smartphone with an AI brain visible in the background, in this photo illustration taken in Brussels, Belgium, on January 2, 2024. (Photo illustration by Jonathan Raa/NurPhoto via Getty Images).

The latest generative AI models are capable of astonishing, magical human-like output. But do they actually understand anything? That'll be a big, fat no according to the latest study from MIT (via Techspot).

More specifically, the key question is whether the LLMs, or large language models, at the core of the most powerful chatbots are capable of constructing accurate internal models of the world. And the answer the MIT researchers largely came up with is no, they can't.

To find out, the MIT team developed new metrics for testing AI that go beyond simple measures of response accuracy and hinge on what are known as deterministic finite automata, or DFAs.

A DFA is a problem with a sequence of interdependent steps governed by a fixed set of rules. Among other tasks, the researchers chose navigating the streets of New York City.
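To make that abstract definition concrete, here's a minimal toy sketch of a DFA in Python. This example is ours, not the paper's: the states are intersections on an imaginary two-block grid, the inputs are turns, and a fixed transition table plays the role of the rules. An LLM with a faithful internal "map" would effectively have to recover a table like this from text alone.

```python
# Toy DFA: states are intersections, inputs are turns, and the
# transition table encodes which moves are legal. All names here
# are illustrative, not taken from the MIT study.
TRANSITIONS = {
    ("A", "left"): "B",
    ("A", "right"): "C",
    ("B", "right"): "D",
    ("C", "left"): "D",
}

def drive(start, turns):
    """Follow a sequence of turns; return None on an illegal move
    (the DFA equivalent of turning down a closed street)."""
    state = start
    for turn in turns:
        state = TRANSITIONS.get((state, turn))
        if state is None:
            return None
    return state

print(drive("A", ["left", "right"]))  # reaches intersection D
print(drive("A", ["left", "left"]))   # None: no such street
```

Because every transition is explicit, a DFA gives researchers a ground-truth world model to compare an LLM's behaviour against, which is exactly why it makes a useful test bed.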

The MIT team found some generative AI models are capable of very accurate turn-by-turn driving directions in New York City, but only in ideal circumstances. When researchers closed some streets and added detours, performance plummeted. In fact, the internal maps implicitly generated by the LLMs during training were full of nonexistent streets and other inconsistencies.

“I was surprised by how quickly the performance deteriorated as soon as we added a detour. If we close just 1 percent of the possible streets, accuracy immediately plummets from nearly 100 percent to just 67 percent,” says lead author on the research paper, Keyon Vafa.

The core lesson here is that the remarkable accuracy of LLMs in certain contexts can be misleading. "Often, we see these models do impressive things and think they must have understood something about the world. I hope we can convince people that this is a question to think very carefully about, and we don’t have to rely on our own intuitions to answer it," says senior paper author Ashesh Rambachan.

More broadly, this research is a reminder of what's really going on with the latest LLMs. All they are actually doing is predicting what word to put next in a sequence based on having scraped, indexed and correlated gargantuan quantities of text. Reasoning and understanding are not inherent parts of that process.
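For a rough sense of what "predicting the next word from correlations" means, here is a deliberately crude toy sketch: a bigram model that picks the most common follower of a word in some training text. Real LLMs are vastly more sophisticated neural networks, but the basic objective, predict the next token from patterns in seen text, is the same; nothing here is drawn from the MIT paper.

```python
# Toy next-word predictor (a bigram model), purely illustrative.
# It "learns" only which word most often follows which.
from collections import Counter, defaultdict

corpus = "the cat sat on the mat the cat ate".split()

# Count successors for every word in the training text.
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def predict_next(word):
    """Return the most frequently observed successor of `word`."""
    counts = following.get(word)
    return counts.most_common(1)[0][0] if counts else None

print(predict_next("the"))  # "cat" follows "the" most often here
```

Notice the model has no notion of what a cat or a mat is; it reproduces statistical regularities, which is the point the researchers are making at a much larger scale.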


What this new MIT research showed is that LLMs can do remarkably well without actually understanding any rules. At the same time, that accuracy can break down rapidly in the face of real-world variables.

Of course, this won't entirely come as news to anyone familiar with using chatbots. We've all experienced how quickly a cogent interaction with a chatbot can degrade into hallucination or just borderline gibberish following a certain kind of interrogative prodding.

But this MIT study is useful for crystallizing that anecdotal experience into a more formal explanation. We all knew that chatbots just predict words. But the incredible accuracy of some of the responses can sometimes begin to convince you that something magical might just be happening.

This latest study is a reminder that it's almost certainly not. Well, not unless incredibly accurate but ultimately mindless word prediction is your idea of magic.
