Is ChatGPT getting smarter, or is it getting better at seeming smart? According to Apple, it’s the latter.
A team of AI researchers at Apple published a paper this weekend claiming that most leading large language models aren’t actually capable of advanced reasoning, despite how intelligent they might seem.
Large language models, or LLMs, like ChatGPT appear to be getting more advanced and “intelligent” every year. Under the hood, though, their logical reasoning hasn’t improved much. According to Apple’s research, current LLMs’ capabilities “may resemble sophisticated pattern matching more than true logical reasoning.”
What does this research mean for the reality of today’s top AI models? It might be time to focus on creating safer AI models before trying to build smarter ones.
Apple AI researchers find that LLMs struggle with grade-school math
Apple’s AI researchers have revealed the findings of a new benchmark, GSM-Symbolic, designed to pose a fresh challenge to large language models.
The results suggest that today’s top AI models have far more limited reasoning capabilities than their seemingly intelligent answers imply.
In fact, the models in the study struggled with basic grade-school math problems, and the more complex the questions became, the worse they performed.
The researchers explain in their paper, “Adding seemingly relevant but ultimately inconsequential information to the logical reasoning of the problem led to substantial performance drops of up to 65% across all state-of-the-art models. Importantly, we demonstrate that LLMs struggle even when provided with multiple examples of the same question or examples containing similar irrelevant information.”
In other words, today’s leading AI models are easily thrown off by logic-based questions such as math problems. They rely on reproducing patterns from the math problems in their training data rather than working through the logic the way a human would. Large language models only appear to be smart; in reality, they’re just very good at acting smart.
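To make that concrete, here is a minimal sketch of the kind of test the paper describes: take a simple word problem, then append a detail that sounds relevant but changes nothing about the correct answer. The problem below is invented for illustration, not taken from Apple’s benchmark, and the code simply builds both versions and computes the unchanged answer.

```python
# Illustrative sketch only: an invented word problem in the spirit of the
# "irrelevant information" variants Apple's paper describes, not an actual
# item from the GSM-Symbolic benchmark.

base_problem = (
    "A farmer picks 44 apples on Friday and 58 apples on Saturday. "
    "On Sunday he picks twice as many as he did on Friday. "
    "How many apples does he have in total?"
)

# A detail that sounds relevant but does not affect the arithmetic.
distractor = "Five of Sunday's apples were a bit smaller than average. "
noop_problem = base_problem.replace("How many", distractor + "How many")

# The correct answer is identical for both versions: 44 + 58 + 2 * 44 = 190.
correct_answer = 44 + 58 + 2 * 44

# A model that merely pattern-matches might subtract the 5 "smaller" apples
# and answer 185, which is the kind of performance drop the study measures.
print(noop_problem)
print("Correct answer:", correct_answer)
```

Whether a given model actually stumbles on the distractor is an empirical question; the point of the sketch is only to show how small and logically irrelevant the added text can be.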
This echoes remarks from OpenAI CEO Sam Altman, who has called AI “incredibly dumb” in its current state. OpenAI is the company behind ChatGPT, and Altman has been ambitious in his pursuit of artificial general intelligence, which would be capable of true logical reasoning.
Apple’s study seems to agree. It concludes, “We believe further research is essential to develop AI models capable of formal reasoning, moving beyond pattern recognition to achieve more robust and generalizable problem-solving skills.”
AI might not be smart yet, but it can still be dangerous
If the research published by Apple’s AI team is accurate, today’s leading large language models would struggle to hold their own on an episode of Are You Smarter Than a 5th Grader? However, that doesn’t mean AI can’t still be a powerful tool, one that can be incredibly helpful… or harmful. In fact, the Apple study highlights a core strength of AI that is also a potential danger: its ability to mimic.
LLMs like ChatGPT may seem capable of reasoning the way humans do, but as this study points out, that’s really just the AI copying human language and patterns. That may be less impressive than genuine logical reasoning, but AI has become extremely good at mimicking people. Unfortunately, bad actors have been quick to take advantage of every advancement.
For example, tech YouTuber Marques Brownlee announced on X this weekend that a company had used AI to replicate his voice in an ad for a product he has no affiliation with.

“It’s happening. There are real companies who will just use an AI-created rip of my voice to promote their stuff. And there’s really no repercussions for it other than being known as this scummy shady company that is willing to stoop that low to sell some product,” Brownlee wrote on October 14, 2024.
The AI-generated clone sounds shockingly similar to Brownlee’s real voice, and the ad was clearly intended to deceive viewers into thinking he was endorsing the product.
Unfortunately, incidents like this are becoming more common, from fake presidential endorsements attributed to Taylor Swift to Scarlett Johansson’s claim that OpenAI copied her voice without her permission.
Outlook
Average users might not think these controversies affect them, but they arguably point to the most critical issue facing the AI industry. It’s great that everyday tools like ChatGPT or Gemini are useful to many people.
However, the ways AI is also being misused for deepfakes, deception, and scams pose a serious risk to the safety of this technology and everyone who interacts with it, knowingly or otherwise.
Addressing that should be far more important than making an AI that can talk in more human-like voices or answer longer questions.
Apple’s research calls out the man behind the curtain, highlighting that AI is still little more than pattern recognition. As developers strive to make their AI models ever more advanced, they need to do more to protect them from misuse.
Otherwise, the future of AI could be overshadowed by a minefield of misinformation, scams, and deepfakes that exploit this technology instead of using it to solve real problems.