Poker has long been considered a benchmark for testing artificial intelligence (AI), due to the random and hidden nature of the game. But now researchers at Facebook and Carnegie Mellon University (CMU) say they have developed an AI bot called Pluribus that defeated five top players in the popular poker game Texas Hold’em for the first time.
The results of the unprecedented win, published in the journal Science on Thursday, showed that AI can have “strategic reasoning that supersedes humans, even in multiplayer settings,” said CMU computer science professor Tuomas Sandholm in an interview.
In Texas Hold’em, the world’s most popular form of poker, each player is dealt two facedown cards, and players bet money by guessing their opponents’ next moves. The psychological game involves a level of strategy that isn’t found in games such as Go and chess, in which both players can see every piece on the board.
For that reason, Sandholm calls poker an “imperfect information” game with unknown variables, much like real-world scenarios including business deals predicated on market variability.
AI in gaming has come a long way since IBM’s Deep Blue computer famously beat world chess champion Garry Kasparov in a six-game match in 1997.
But the strategic reasoning involved in defeating multiple top human players in Texas Hold’em has stumped AI bots until now.
The experiment used two six-player formats: one pitting five professional poker players against Pluribus, and another pitting one human against five copies of the AI bot. The list of professionals included four-time World Poker Tour champion Darren Elias and 14 other players who have won world championships or taken home more than $1 million in prizes.
“The bot wasn’t just playing against some middle-of-the-road pros,” said Elias in a Facebook blog post. “It was playing some of the best players in the world.”
Over the past 16 years, Sandholm has been researching poker as an insight into AI strategy at his lab. The development of an AI player that beats human pros could have wide-ranging implications in fields like negotiation, trade optimization, investment banking and politics, Sandholm said.
“The research is not about going to Vegas and sitting down at a poker table,” said Facebook AI research scientist Noam Brown in an interview. “It’s about advancing fundamental AI, and in particular this question of dealing with hidden information in a multi-agent environment.” When asked about the company’s plans for applying the research to its own platform, he said Facebook does not have a particular product in mind: “It’s about making a long-term investment in AI.”
All of the previous benchmark games that AI bots have mastered were two-sided contests with one winner and one loser, while high-stakes situations in the real world customarily involve multiple people. The research will be relevant to future developments in cybersecurity, fraud detection and navigating traffic with self-driving cars, said Brown.
Since the collaboration between CMU and Facebook began in December, researchers have built on the code that they had worked on for decades and added a self-play algorithm in which Pluribus played against itself, continually improving on its past moves to arrive at better strategies.
“If the AI wants to know what would have happened if some other action had been chosen, then it need only ask itself what it would have done in response to that action,” Brown wrote in a Facebook blog post.
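The idea Brown describes, asking what would have happened had another action been chosen, can be illustrated with a toy example. The sketch below is not Pluribus code, and the article does not name the algorithm Pluribus uses. It swaps in a simple technique called regret matching and applies it to rock-paper-scissors rather than poker. All function names are hypothetical and chosen only for illustration.

```python
import random

ACTIONS = 3  # rock, paper, scissors: a toy game standing in for poker


def payoff(a, b):
    """Payoff to the player choosing a against b: +1 win, 0 tie, -1 loss."""
    return [0, 1, -1][(a - b) % 3]


def strategy_from_regrets(regrets):
    """Play each action in proportion to its positive accumulated regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    return [1.0 / ACTIONS] * ACTIONS  # no regrets yet: play uniformly


def self_play(iterations, seed=0):
    """Two copies of the same learner play each other and update regrets."""
    rng = random.Random(seed)
    regrets = [[0.0] * ACTIONS for _ in range(2)]
    strategy_sums = [[0.0] * ACTIONS for _ in range(2)]
    for _ in range(iterations):
        strategies = [strategy_from_regrets(r) for r in regrets]
        moves = [rng.choices(range(ACTIONS), weights=s)[0] for s in strategies]
        for p in range(2):
            opp = moves[1 - p]
            actual = payoff(moves[p], opp)
            for a in range(ACTIONS):
                # "What if I had played a instead?" against the same reply.
                regrets[p][a] += payoff(a, opp) - actual
                strategy_sums[p][a] += strategies[p][a]
    # The average strategy over all rounds is the one that settles down.
    return [[s / iterations for s in sums] for sums in strategy_sums]


if __name__ == "__main__":
    for player, avg in enumerate(self_play(50000)):
        print(player, [round(x, 3) for x in avg])
```

In this toy game, each player’s average strategy drifts toward playing rock, paper and scissors about equally often, which is the balanced strategy for rock-paper-scissors. Poker is far larger and involves hidden cards, so this shows only the flavor of learning from self-play, not the scale of the real system.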
Sandholm has founded two companies to apply the technology behind Pluribus to real-world settings. One, called Strategic Machine, pursues business and gaming applications. These include optimizing strategies in investment banking and determining how political candidates can obtain campaign donations in different states.
Sandholm said Strategic Machine can also help companies like Amazon, Target and Walmart set prices predictively rather than reactively, the traditional approach in which companies base their prices on those of their competitors. “This would allow you to think ahead and to drive the market,” said Sandholm.
The other company, called Strategy Robot, uses the research from his lab to simulate military strategy. He clarified that the code created in collaboration with Facebook for Pluribus can only be used in poker, and therefore has no military applications.
According to Brown, the final major milestone for AI bots was progressing from a two-player poker game to multiple players.
“Now that we’ve achieved that,” said Brown, “it’s really about going beyond poker to other domains.”