Tuomas Sandholm and Noam Brown spent the past year building an AI that plays Texas Hold ’Em. The two Carnegie Mellon researchers call their creation Libratus, and they believe it can top the world’s best players at no-limit Hold ’Em, a version of the classic poker game that allows any bet at any time. No machine has ever reached such heights with this unusually complex card game. Although AI systems have topped the best players at checkers, chess, Othello, and even Go, no-limit Hold ’Em presents a different obstacle. In contrast to those other games of intellect, a poker player can know only part of what’s happening during each hand. Poker is an imperfect information game. So many of the cards are hidden, and so much luck is involved.
To prove the powers of this new AI, the two researchers recently arranged for Libratus to challenge four of the world’s best players at a casino in Pittsburgh, not far from Carnegie Mellon, where Sandholm is a professor and Brown is a PhD student. Sandholm did much the same thing last year with another AI, and though his earlier attempt failed, as the machine’s opponents exploited particularly telling quirks in the way it played, he felt that his latest creation, drawing on more than a decade of research, had reached a new level of smarts that could finally eclipse human competition. Then, last week, just days before the match, Sandholm was hit with competition of a different kind. A rival team of researchers based at the University of Alberta published a paper claiming that their new AI, DeepStack, had already beaten some top human poker players.
As usual in the world of high-stakes AI research, it’s not just AI versus human. It’s AI versus AI. And it’s human versus human. Carnegie Mellon and Alberta have competed in poker AI for more than a decade, and now, they’re finally reaching the finish line.
The AlphaGo Analogy
At the moment, the end result of this multi-faceted competition is still in doubt. Led by University of Alberta professor Michael Bowling, a notable figure in the recent AI revolution who did his PhD work at Carnegie Mellon, the Alberta team isn’t discussing its paper because, as one of Bowling’s students told us, it hasn’t yet been peer-reviewed. And as their rival Sandholm says, the paper doesn’t settle the matter because DeepStack merely played against good poker players, not great ones. But we’re certainly approaching a point where no-limit Texas Hold ’Em, and similar imperfect information games, are finally cracked by artificial intelligence. Libratus started its match against four of the very best poker players on Wednesday, winning both the first and second days, and this competition will play out by the end of the month.
What may be even more interesting, however, is that its rival, DeepStack, is successfully using deep neural networks to mimic the very human intuition that poker players rely on, echoing the design of AlphaGo, the AI that recently cracked the ancient game of Go, the most complex of the perfect information games. “It’s analogous to AlphaGo,” says University of Michigan professor Michael Wellman, who specializes in game theory and closely follows the world of AI poker. “They found a way to integrate deep learning in a novel way—and that made the big difference.”
This poker competition isn’t nearly as important as AlphaGo topping Lee Sedol, the best Go player of the last decade. AlphaGo was built by Google, and Google is already using many of the same technologies to reinvent its online empire, not to mention healthcare and robotics. But an AI that wins at Texas Hold ’Em could eventually prove quite useful in other areas, like auctions, financial markets, physical security, and even global politics: hardcore negotiation, deciding what to do when you don’t quite know what the person across the table is going to do. “The reason I follow AI poker is that I also work with financial trading, which involves imperfect information,” Wellman says. “Some of these ideas could find traction in real-world domains.”
Know When to Hold ‘Em
Texas Hold ’Em, the main event at the World Series of Poker, is an enormously complex card game. The dealer lays two “hole” cards in front of each player, cards only that player can see, before dealing three communal cards face up on the table. Then a fourth. And then a fifth. Players place bets after each stage of the deal, and in no-limit Texas Hold ’Em, they can bet as much as they want at any stage. But players aren’t necessarily trying to win every hand. They’re trying to win the most money, and as the game progresses across hand after hand, each player tries to guess what cards opponents are holding based not just on the bet that was just made, but on all the bets made over the course of the match. Plus, they’re all trying to fool their opponents through their own bets. It’s all about game theory.
‘This estimate can be thought of as DeepStack’s intuition’
That’s why it’s so hard for machines to play. But machines do have one big advantage over humans: in seconds, they can play out myriad different scenarios of a game on their own and use this to decide the best way to play. This is what Libratus does. In essence, it builds a rather complex “game tree” to determine the likely outcome of a particular play, running its calculations on a supercomputer at the Pittsburgh Supercomputing Center. “We look ahead to the end of the game,” says Sandholm.
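The core idea of playing out a game tree to its end can be sketched in a few lines. Here is a toy illustration in Python: a recursive search that evaluates every line of a tiny two-player game down to its terminal payoffs. This is only a teaching sketch, not the actual Libratus algorithm, which must also reason about hidden cards and trees far too large to search exhaustively.

```python
# Toy sketch of exhaustive game-tree search: recursively play out every
# line of a tiny two-player zero-sum game to its end, then back the
# values up the tree. This illustrates "look ahead to the end of the
# game"; it is NOT the real Libratus algorithm, which also handles
# hidden information and vastly larger trees.

def solve(tree, maximizing=True):
    """Return the value of a game tree for the player to move.

    `tree` is either a number (a terminal payoff to the maximizer)
    or a list of subtrees (the moves available at this node).
    """
    if isinstance(tree, (int, float)):   # terminal state: payoff known
        return tree
    values = [solve(child, not maximizing) for child in tree]
    return max(values) if maximizing else min(values)

# A tiny hand-built tree: the maximizer chooses a branch, then the
# opponent chooses, and the game ends with a payoff.
game = [[3, -2], [1, 4]]
print(solve(game))  # → 1: the maximizer picks the branch whose worst case is best
```

The expense is obvious even here: the search visits every leaf, and in no-limit poker the number of leaves is astronomical, which is why Libratus runs on a supercomputer.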
But that is a very hard thing to do, even for the most powerful machines. There are just so many scenarios to examine. So, DeepStack takes a different tack. It builds a game tree too, but it doesn’t necessarily look all the way to the end of the game. Instead, Bowling and his team trained a neural network to guess where each play will end up. Just as Facebook trains neural networks to recognize faces in photos by feeding them millions of existing snapshots, the Alberta team trained this DeepStack neural net on thousands of random poker situations, taking into account not just the cards but the bets. In this way, the neural network learns to recognize which bets will be successful. It needn’t play out every possible outcome of every hand.
“It avoids reasoning about the entire remainder of the game by substituting the computation beyond a certain depth with a fast approximate estimate,” Bowling and his team write. “This estimate can be thought of as DeepStack’s intuition: a gut feeling of the value of holding any possible private cards in any possible poker situation.”
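The depth-limited idea the paper describes can be sketched by modifying the exhaustive search: below a depth cutoff, the subtree is replaced with a fast approximate estimate instead of being searched. In this illustrative Python sketch, the `estimate` heuristic is invented purely for illustration; DeepStack’s real estimate comes from a trained deep neural network, not a simple average.

```python
# Hedged sketch of depth-limited search: the same backup as full
# game-tree search, except that below a depth cutoff the subtree is
# replaced by a fast approximate estimate -- the "intuition" the paper
# describes. The `estimate` function here is a stand-in invented for
# illustration; DeepStack's actual estimate is a trained neural network.

def estimate(tree):
    """Stand-in for a learned value network: average the payoffs
    reachable from this position instead of searching them."""
    if isinstance(tree, (int, float)):
        return tree
    leaves = [estimate(child) for child in tree]
    return sum(leaves) / len(leaves)

def solve_limited(tree, depth, maximizing=True):
    """Search only `depth` plies ahead, then fall back on the estimate."""
    if isinstance(tree, (int, float)):   # terminal state: payoff known
        return tree
    if depth == 0:                       # cutoff: trust the gut feeling
        return estimate(tree)
    values = [solve_limited(c, depth - 1, not maximizing) for c in tree]
    return max(values) if maximizing else min(values)

game = [[3, -2], [1, 4]]
print(solve_limited(game, depth=1))  # → 2.5: cutoff at depth 1 uses estimates
```

The trade-off is exactly the one the paper names: the cutoff search is far cheaper than playing every hand to the end, and it is only as good as the estimate it leans on, which is why the quality of the trained network matters so much.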
The Big Ideas
Sandholm downplays the importance of the neural network, saying his team of Carnegie Mellon researchers has built this kind of “evaluation function” using other techniques, and that deep learning hasn’t proven all that useful with poker in the past. But the successful use of a deep neural network is what makes DeepStack so interesting. Not because it’s a deep neural network, but because this general route could open up a much wider range of possibilities. As Wellman explains, this could expand the possibilities not just with Texas Hold ’Em, where the game becomes more and more complex as you add more and more hands, but with things like auctions and negotiations, which are even more complex.
This mirrors the shift across the world of AI. Increasingly, companies like Google and Facebook and Microsoft are turning to deep neural networks and other machine learning technologies, and in many cases, by analyzing vast amounts of data and learning tasks on their own, these algorithms are outperforming existing systems that were hand-coded for the task—and they’re pushing these fields forward at much faster speeds. This has happened with image recognition, speech recognition, and machine translation, and it’s beginning to happen with natural language understanding, the effort to build machines that can understand the natural way you and I talk.
Over the next twenty days, in Pittsburgh, we’ll see whether an AI can beat some of the world’s top poker players. But the real test will come later, when this AI pushes beyond poker. Wellman says that the algorithms used by Libratus and DeepStack may not hold up in the real world. But the big ideas behind them are another matter.