LiveScience
Nicholas Fearn

'Master of deception': Current AI models already have the capacity to expertly manipulate and deceive humans


Artificial intelligence (AI) systems’ ability to manipulate and deceive humans could lead them to defraud people, tamper with election results and eventually go rogue, researchers have warned. 

Peter S. Park, a postdoctoral fellow in AI existential safety at the Massachusetts Institute of Technology (MIT), and his colleagues have found that many popular AI systems, even those designed to be honest and useful digital companions, are already capable of deceiving humans, which could have huge consequences for society.

In an article published May 10 in the journal Patterns, Park and his colleagues analyzed dozens of empirical studies on how AI systems fuel and disseminate misinformation through "learned deception," which occurs when AI systems systematically acquire the skills to manipulate and deceive.

They also explored the short- and long-term risks posed by manipulative and deceitful AI systems, urging governments to urgently clamp down on the problem with more stringent regulations.


The researchers found one striking example of learned deception in CICERO, an AI system developed by Meta to play Diplomacy, a popular war-themed strategy board game in which up to seven players form and break military pacts in the years before World War I.

Although Meta trained CICERO to be "largely honest and helpful" and not to betray its human allies, the researchers found that CICERO was dishonest and disloyal. They described the AI system as an "expert liar" that betrayed its comrades and engaged in "premeditated deception," forming pre-planned, dubious alliances that deceived other players and left them open to attack from their enemies.

"We found that Meta's AI had learned to be a master of deception," Park said in a statement provided to Science Daily. "While Meta succeeded in training its AI to win in the game of Diplomacy — CICERO placed in the top 10% of human players who had played more than one game — Meta failed to train its AI to win honestly."

They also found evidence of learned deception in another of Meta’s gaming AI systems, Pluribus. The poker bot can bluff human players and convince them to fold.

Meanwhile, DeepMind's AlphaStar, designed to excel at the real-time strategy video game StarCraft II, tricked its human opponents by faking troop movements and planning different attacks in secret.

Huge ramifications 

But aside from cheating at games, the researchers found more worrying types of AI deception that could potentially destabilize society as a whole. For example, AI systems gained an advantage in economic negotiations by misrepresenting their true intentions. 

Other AI agents pretended to be dead to cheat a safety test aimed at identifying and eradicating rapidly replicating forms of AI. 

"By systematically cheating the safety tests imposed on it by human developers and regulators, a deceptive AI can lead us humans into a false sense of security,” Park said.

Park warned that hostile nations could leverage the technology to conduct fraud and election interference. But if these systems continue to increase their deceptive and manipulative capabilities over the coming years and decades, humans might not be able to control them for long, he added. 

"We as a society need as much time as we can get to prepare for the more advanced deception of future AI products and open-source models," said Park. "As the deceptive capabilities of AI systems become more advanced, the dangers they pose to society will become increasingly serious."

Ultimately, AI systems learn to deceive and manipulate humans because they have been designed, developed and trained by human developers to do so, Simon Bain, CEO of data-analytics company OmniIndex, told Live Science.

"This could be to push users towards particular content that has paid for higher placement even if it is not the best fit, or it could be to keep users engaged in a discussion with the AI for longer than they may otherwise need to," Bain said. "This is because at the end of the day, AI is designed to serve a financial and business purpose. As such, it will be just as manipulative and just as controlling of users as any other piece of tech or business.
