In another computer v human challenge, the program beat two professionals: Darren Elias, who holds the record for the most World Poker Tour titles, and Chris Ferguson, who has won six World Series of Poker tournaments.
Over 10,000 hands of poker, Pluribus was tested in two formats: five copies of the program against one top-class professional player, and one copy of the program against five top-class professional players.
One thing that makes this triumph so special is the secretive nature of poker.
In the likes of chess and Go, everything is laid out in the open. Poker is a bigger challenge because it is an incomplete-information game: players can't be certain which cards are in play, and opponents can and will bluff. That is why a game like poker, specifically six-player Texas Hold 'em, has been too tough for a machine to master - until now. The AI poker player's superhuman ability comes from the "limited-lookahead" search algorithm it is equipped with.
Take, for example, the Nash equilibrium: a set of strategies in which no player can benefit from changing their own strategy so long as the opponents' strategies remain the same.
"In a game that will, more often than not, reward you when you exhibit mental discipline, focus, and consistency, and certainly punish you when you lack any of the three, competing for hours on end against an AI bot that obviously doesn't have to worry about these shortcomings is a gruelling task".
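The Nash equilibrium condition mentioned above can be checked by hand in a tiny game. The following is a minimal sketch, not anything taken from Pluribus: it uses rock-paper-scissors as a stand-in two-player game, and the function names (`expected_payoff`, `best_deviation_gain`, `is_nash`) are hypothetical helpers introduced for illustration. A pair of strategies counts as an equilibrium when neither player can gain by switching while the other stays put.

```python
# Payoffs to player 1 in rock-paper-scissors (zero-sum: player 2 gets the negative).
# Rows are player 1's move, columns are player 2's move, ordered rock, paper, scissors.
PAYOFF = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]

def expected_payoff(p1, p2):
    """Expected payoff to player 1 when both players use mixed strategies."""
    return sum(p1[i] * p2[j] * PAYOFF[i][j] for i in range(3) for j in range(3))

def best_deviation_gain(p1, p2):
    """Largest gain either player could get by switching to a single move
    while the other player's strategy stays fixed."""
    current = expected_payoff(p1, p2)
    pure = [[1.0 if k == i else 0.0 for k in range(3)] for i in range(3)]
    gain1 = max(expected_payoff(s, p2) for s in pure) - current
    gain2 = max(-expected_payoff(p1, s) for s in pure) - (-current)
    return max(gain1, gain2)

def is_nash(p1, p2, tol=1e-9):
    """True if neither player can improve by deviating unilaterally."""
    return best_deviation_gain(p1, p2) <= tol

uniform = [1 / 3, 1 / 3, 1 / 3]
always_rock = [1.0, 0.0, 0.0]

print(is_nash(uniform, uniform))      # True
print(is_nash(always_rock, uniform))  # False: player 2 gains by switching to paper
```

Checking only single-move deviations is enough here, because the best response to a fixed strategy can always be found among the pure moves.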
"It was incredibly fascinating getting to play against the poker bot and seeing some of the strategies it chose", poker player Michael Gagliano said. You know nothing about the humans, but can try to pick up on their play styles during the game. Each player received at least $0.40 per hand just for playing and as much as $1.60 per hand, depending upon performance.
"We think of bluffing as this very human trait". Pluribus would, in some situations, bet much larger amounts of money than humans tend to, a move that pros indicated could be smart in some cases.
If built upon, the algorithms found in Pluribus could also be applied to self-driving vehicle routing, Wall Street trading, cybersecurity, and more. Training Pluribus on a cloud service would cost just $150, while running it required a computer with two processors and 128 GB of memory, which was enough to play twice as fast as an average human player.
Rather than searching all the way to the end of the game, Pluribus looks only a few moves ahead. At the leaves of that subgame, the AI considers five possible continuation strategies that it and each opponent might adopt for the rest of the game.
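The idea of stopping the search early and letting the players pick among a few continuation strategies at the leaves can be sketched in code. This is a heavily simplified toy, and every detail in it is an assumption: it uses plain minimax on a two-player, perfect-information tree instead of the search Pluribus actually runs, the payoff tables are made up, and the tables are 2x2 for brevity where the article mentions five strategies. The names `leaf_value`, `search` and `TREE` are hypothetical.

```python
def leaf_value(payoffs):
    """Value of a leaf for the searching player.

    payoffs[i][j] is the searcher's payoff if it follows continuation
    strategy i and the opponent follows continuation strategy j for the
    rest of the game. The searcher picks the row whose worst case is best.
    """
    return max(min(row) for row in payoffs)

def search(node, depth, searcher_to_move=True):
    """Depth-limited minimax: stop at the depth limit and use leaf_value."""
    children = node.get("children", [])
    if depth == 0 or not children:
        return leaf_value(node["continuations"])
    values = [search(child, depth - 1, not searcher_to_move) for child in children]
    return max(values) if searcher_to_move else min(values)

# Hypothetical game tree: each node carries a made-up table of continuation payoffs.
TREE = {
    "continuations": [[0, 0], [0, 0]],
    "children": [
        {"continuations": [[1, -2], [0, 0]]},
        {"continuations": [[3, -1], [2, 1]]},
    ],
}

print(search(TREE, depth=1))  # 1: the second child has the better guaranteed value
print(search(TREE, depth=0))  # 0: with no lookahead, only the root's table is used
```

The point of the leaf tables is that the opponent is not assumed to follow one fixed plan beyond the search horizon, which makes the searcher's choice harder to exploit.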
Pluribus' developers, Noam Brown and Tuomas Sandholm, attribute part of this winning streak to its unpredictable gameplay.
If it only made large bets when holding very good hands, its opponents would quickly catch on and throw in their cards. Until now, poker had largely resisted the relentless efforts of AI, despite some notable achievements such as Libratus back in 2017.
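The point about mixing bluffs into large bets can be made concrete with standard pot-odds arithmetic. The sketch below assumes a simplified single decision, not the full game Pluribus plays: a player bets `bet` into a pot of `pot`, a fraction `bluff_fraction` of those bets are bluffs, and a caller wins the pot plus the bet against a bluff and loses the call otherwise. The function names are hypothetical.

```python
def caller_ev(pot, bet, bluff_fraction):
    """Expected value of calling a bet, from the caller's point of view."""
    win = bluff_fraction * (pot + bet)   # the bet was a bluff: caller wins pot + bet
    lose = (1 - bluff_fraction) * bet    # the bet was for value: caller loses the call
    return win - lose

def indifference_bluff_fraction(pot, bet):
    """Bluff fraction at which calling and folding are worth the same (EV of 0)."""
    return bet / (pot + 2 * bet)

pot, bet = 100, 100
print(caller_ev(pot, bet, 0.0))                # -100.0: never bluffing makes folding correct
print(indifference_bluff_fraction(pot, bet))   # one third of the bets should be bluffs
```

With no bluffs at all, calling always loses money, so opponents fold and the bettor never gets paid on strong hands. Bluffing at roughly the indifference fraction removes that easy fold.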
This milestone victory could bring AI closer to solving many real-world problems involving multiple parties and missing information.