Nothing Either Good or Bad: You've got to know when to hold 'em,

Original post: Jan 14, 2015

Know when to fold 'em,

"The Gambler" by Kenny Rogers is a famous song that discusses a simplified strategy of poker. The rules of the game are fairly straightforward. All players are trying to create the best possible hand as determined by the rules of the game being played. With 52 cards, there are staggering numbers of potential variations that change with every hand.

In an effort to stretch their limits, computer programmers have "solved" many games. Ars Technica explains this phenomenon further:

Computers have made remarkable progress when it comes to beating their programmers at a number of games. Checkers, Chess, and Jeopardy have all seen their champions fall to silicon opponents. In fact, for two games—checkers and Connect Four—computers have calculated the optimal move for every single possible board combination. The best a human can hope for is a draw.

But the games computers have done well with are what are called "perfect information games," where both players have full access to all the knowledge there is to have about the game. (Think of it this way: by looking at a chess board, you know precisely where every piece on the board is.) Lots of games that attract players have imperfect information: players know things their opponents don't. The classic examples here are card games, where some fraction of the cards is typically known only to the player holding them.

It's not possible to solve these in the same sense; you can't know the ideal path forward from a given state because you simply don't fully know what the state is. But it is possible to figure out strategies that make it very difficult for an opponent to exploit them. And, for a specific form of poker, researchers have now done so. Given a set of strategies, no human is likely to come out ahead while playing a computer.

There have been a number of triumphant headlines declaring that computers can now defeat humans at poker. The programming effort was intensive and impressive:

This creates a serious computation challenge, as heads-up limit Texas hold’em needs 3.19 x 10¹⁴ individual pieces of data to store the combination of strategies and regret values. The authors estimate that alone would take up over 260TB of disk space. To cut this down, the authors multiplied all of these floating point values by a scaling factor (to make them larger numbers) and then converted them to integers. They then devised a storage scheme that kept these values stored in a manner that's easily compressed. In the end, this cut things down to 17TB of storage.

The program then has to iterate through all possible strategies and assign them regret values. To speed the process of identifying winning approaches, any strategy that had never been chosen before but resulted in a win was immediately tried again. Beyond that, strategies are chosen with a probability proportional to their regret values. The authors also simplified computations by dividing up the huge number of possibilities into a bit over 100,000 subgames based on things like the initial bets and the flop (first face-up cards).

Even with all these optimizations, it took a cluster of 200 2.1GHz AMD cores (each with 32GB of RAM) 900 core-years to reach a point where improvements in outcomes slowed down. At that point, the strategies were optimized such that a person could play it for 70 years, 12 hours a day, using its own optimized strategy, and it would be overwhelmingly unlikely for them to end up ahead by a statistically significant margin. The authors call this situation "weakly solved."

While this is an amazing achievement, it also points out how staggeringly complex card games involving imperfect information can be. The version of poker that the computer "weakly solved" only involves two players at a time in the simplest form of Texas Hold-'Em where bets are limited with each hand. The type of poker typically shown on TV actually involves up to nine players at a time with unlimited bets on every round. This adds multiple layers of complexity which would be well beyond the current program's capabilities.

The Economist wrote another article that sums this up best:

Whether computers will ever be able to solve other forms of poker remains doubtful. Merely removing the betting restrictions on HULHE, for instance, boosts the range of possibilities to 6.38x10¹⁶¹, a figure so mind-bogglingly big that it far exceeds the number of subatomic particles in the observable universe. No amount of improvement in computer hardware will ever make such a problem tractable. The only hope is an enormous, and unlikely, conceptual breakthrough in how to attack the question.

There are, of course, poker-playing programs out there already that play more complicated versions of the game than HULHE. The best are better than most humans. But they, like chess-playing programs, do not actually solve the game in a mathematically rigorous sense. They just process more data that a human brain can cope with, and thus arrive at a better answer than most such brains can manage.

The most interesting computational solution to poker, though, would be one that did work more like a human brain, for instance by looking for the famous “tells” that experienced players claim give away their opponent’s state of mind, or even bluffing those opponents about its own intentions. When computers can do that, mere humans—and not just poker players—should really start worrying.

Here is the link to Ars Technica: Computers used to “solve” two-person, limit Texas hold’em | Ars Technica

Here is the link to the Economist: Computer poker: The perfect card sharp | The Economist

Nothing Either Good or Bad

Wednesday, July 1, 2015

You've got to know when to hold 'em,

No comments:

Post a Comment

Blog Archive