'Student of Games' is the 1st AI that can master different types of games, like chess and poker

"Student of Games" can master both information-perfect games like Go and information-imperfect games like Scotland Yard. (Image credit: Getty Images/Kirillm)

Researchers have built "Student of Games," the first general-purpose artificial intelligence (AI) algorithm that can master a wide variety of games.

Game algorithms are normally designed to master either information-perfect games like Go or chess — in which each player has all the information — or information-imperfect games like poker, in which some information is hidden from other players. This is because the process of training the algorithms has historically been different for the two types of games: The former uses search and learning while the latter uses game-theoretic reasoning and learning.

But the new Student of Games algorithm gets around this limitation by combining guided search, self-play learning and game-theoretic reasoning, according to a new paper describing the algorithm, published Nov. 15 in the journal Science Advances.

When tested, Student of Games held its own in the information-perfect games chess and Go, as well as in the information-imperfect games Texas Hold'em and Scotland Yard. However, it couldn't quite beat the best specialized AI algorithms in head-to-head matchups.

"This is a step towards making even more general algorithms," study lead author Martin Schmid, CEO and co-founder of EquiLibre Technologies, told Live Science in an email.

"One takeaway is that one can indeed design a technique that can work for both perfect and imperfect information games, rather than having specialized algorithms. Another interesting observation was that one of the important steps was to come up with a new formalism, allowing for truly general design of search based algorithm."

Games have long served as a benchmark for progress in the field of AI. For instance, in 2016, DeepMind's AlphaGo beat a professional human Go player. The following year, the Libratus system beat the world's best human poker players in a 20-day Texas Hold'em tournament.

"Games are a well-defined benchmark, and there is a long history of AI progress being tied to milestones in AI for games," Schmid explained. "Games are sometimes referred to as fruit flies of AI, allowing for quick development and gradual progress."

But there has always been a divide between information-perfect and information-imperfect games. To get around this, the team trained its general-purpose algorithm using what's known as growing-tree counterfactual regret minimization (GT-CFR), a variation of counterfactual regret minimization, a widely used algorithm in which an AI system learns by playing against itself repeatedly.
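To give a feel for what "learning by playing against itself" means here, the sketch below runs regret matching, the update rule at the heart of counterfactual regret minimization, on rock-paper-scissors. It is an illustration only and not the GT-CFR algorithm from the paper: the game, the function names and the lopsided starting regrets are all assumptions chosen to keep the example small.

```python
ACTIONS = 3  # 0 = rock, 1 = paper, 2 = scissors

# PAYOFF[a][b] is the payoff to a player choosing a against an opponent choosing b.
PAYOFF = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]


def strategy_from_regrets(regrets):
    """Play each action in proportion to its positive accumulated regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    # No positive regret yet: fall back to a uniform random strategy.
    return [1.0 / ACTIONS] * ACTIONS


def train(iterations):
    """Two regret-matching players repeatedly play each other (self-play).

    Returns each player's average strategy over all iterations.
    """
    # Assumption for illustration: player 0 starts biased toward rock,
    # so the players have something to adapt to.
    regrets = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    strategy_sums = [[0.0] * ACTIONS for _ in range(2)]

    for _ in range(iterations):
        strategies = [strategy_from_regrets(r) for r in regrets]
        for player in range(2):
            opponent = strategies[1 - player]
            # Expected payoff of each action against the opponent's current mix.
            action_values = [
                sum(PAYOFF[a][b] * opponent[b] for b in range(ACTIONS))
                for a in range(ACTIONS)
            ]
            # Expected payoff of the strategy the player actually used.
            expected = sum(
                strategies[player][a] * action_values[a] for a in range(ACTIONS)
            )
            for a in range(ACTIONS):
                # Regret: how much better action a would have done.
                regrets[player][a] += action_values[a] - expected
                strategy_sums[player][a] += strategies[player][a]

    return [[s / iterations for s in sums] for sums in strategy_sums]


if __name__ == "__main__":
    for player, average in enumerate(train(20000)):
        print(player, [round(p, 3) for p in average])
```

Each player keeps a running tally of how much it regrets not having played each move, then leans toward the moves it regrets most. In a two-player zero-sum game like this one, the players' average strategies approach an equilibrium, which for rock-paper-scissors means playing each move about a third of the time. Full counterfactual regret minimization applies the same idea at every decision point of a game with hidden information.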

The team combined techniques used to build a variety of game-playing algorithms, from AlphaZero, a more advanced version of AlphaGo, to DeepStack, the first computer program to outplay human professionals in heads-up no-limit Texas Hold'em poker.

In the information-perfect category, the team found that Student of Games performed as well as human experts or professionals, but it was substantially weaker in head-to-head play than specialized algorithms like AlphaZero.