Scientists have calculated the optimal strategy for playing poker ∀ x, y, z
Texas hold'em is the most popular type of poker
At the beginning of each draw, players receive cards (pocket cards)They look at their cards, and then the first round of trading takes place. The player who starts the trade is called the dealer (or button player, see Button (poker). after each draw, the next player in the circle becomes the dealer.
During the trade, the player can raise the bet (raise), equalize the opponent's bet (call), or refuse to continue participating in the draw and fold the cards.
As a result, after a round of trading, each remaining player in the draw I put the same amount of money on the line. Next, three community cards (flop) are opened for everyone, after which the second round of trading takes place. After that, another card is opened(turn), and the third round of trading takes place. Finally, the fifth community card (river) is revealed, and the last, fourth round of trading takes place. If at any point only one player remains in the game, he takes the entire pot. If more than one player remains in the game after the fourth round of betting, they reveal their pocket cards and compare the resulting -card combinations, which everyone can build from personal and community cards. The one with the best combination takes the pot. The combination consists of two pocket cards, which are dealt to players at the beginning of the game, and five community cards, which are laid out on the table during the game. The combination includes five cards from these games. Combinations are listed in descending order of seniority. Royal flush a special case of a straight flush, the highest of all possible ones, consists of high cards (ACE, king, Queen,Jack, ten) of the same suit. Heads up (heads up)means that only two players are playing.
Limit poker is a version of the game in which bets can be raised by a fixed amount, and you can raise the bet no more than a pre-agreed number of times.
Therefore, limit Texas hold'em is the ultimate game. Sequential games in game theory it is customary to set through the trees.The vertices of the tree will correspond to different game States. Each vertex is assigned the name of the player who owns the move at that vertex. The edges coming from this vertex correspond to the actions that this player can perform. One of the participants in the game is "nature" so in game theory they call an artificial player who performs the role of a random number generator. "Nature" randomly decides which card to deal to players or open on the table. Sequential games can be divided into two types: games with perfect information (see Perfect information)and games with perfect information. imperfect information.
In games with perfect information, each player always knows which top of the tree they are at and what happened before that.
In games with imperfect information, the player may not be sure what state the game is in. Poker is an example of a game with imperfect information: the player does not know what cards are in the hands of his opponent. Everyone can see the community cards and actions performed at the time of trading, but the opponent's cards will not be known at the time of trading. Any finite sequential game with perfect information can be calculated from the end using the reverse induction algorithm. By looking at one subgame of the most recent level (i.e, a subgame where the game ends after any decision is made and the players count the payments received), you can find the optimal action of the player who owns the turn on this subgame.
Next, you can also find the optimal actions of players on all sub-games of the last level.
The game is played with a standard -card deck
Thereafter, knowing how rational players will behave on the last-level sub-games, you can go on to analyze the games of the penultimate level, and so on. Sooner or later, you will definitely get to a subgame that coincides with the entire game, and then you can find the optimal action of the player who owns the first move in the game.
Thus, the optimal behavior of all players in any possible situation will be found and it will be found out how the game ends with the correct actions of all players.
This is how checkers were calculated in it turned out that if both sides played checkers correctly, the game would necessarily end in a draw (J. Schaeffer et al. Checkers is solved). Poker is smaller than checkers in terms of the number of possible game States. However, poker, unlike checkers, is a game with imperfect information.
This makes it impossible to directly apply the reverse induction algorithm: if a player does not know at some point which vertex he is at, then he will not be able to find the unambiguously optimal one the solution.
Nevertheless, such a game can be rewritten as a matrix (the normal form of the game).):horizontally, you can write out all the strategies of the first player, vertically-all the strategies of the second player, after which you can find the Nash equilibrium in the resulting matrix.Theoretically. Here we have another problem: the resulting matrix for poker will be very large. The difficulty of finding a Nash equilibrium using a linear programming algorithm increases exponentially as the number of game States increases, so the method is not applicable for complex games like poker. We have to abandon the idea of directly reducing the tree to a matrix. Instead, the authors use a special modification of the savage criterion (see Regret (decision theory),designed to solve games with imperfect information in linear time from the number of game States. The algorithm scans information sets from the end and assigns them one or another penalty depending on the strategy played. After that, the algorithm minimizes the amount of data collected.
Another difficulty in solving poker was that in it, the expected payments of players are not necessarily expressed in integers-compare with checkers, in which only outcomes are possible! Since we are talking about calculating payments by computer, the authors had to approximate infinite decimals with a given level of accuracy ε.
But then you can't use the standard definition of Nash equilibrium, because the calculation error may prevent you from answering the question of whether it is profitable for any of the players to deviate from a particular game profile. The authors use the concept of an e-Nash equilibrium,according to which a strategy profile is called an e-Nash equilibrium if none of the players deviating from this strategy profile can increase their utility by more than e. In particular, any Nash equilibrium is an e-Nash equilibrium. Finally, we come to the result obtained by the authors of the article in Science.For some sufficiently small e, the authors present the e-Nash equilibrium (e so small that a human life is not enough to test the difference between the Nash e-equilibrium and the Nash equilibrium). Figure shows the actions of players on the first move in this strategy profile.On the left, for any starting combination of two cards, the first actions of the dealer are indicated (green cell "raise", red - "reset"), on the right, the second player's answer is shown if the dealer raised the bet on the first move (green color "raise", blue - "equalize", red "reset", mixed colors correspond to the ability to mix several strategies with different probabilities). In this profile, the dealer often bluffs-raises the bet with a bad card, and the second player is often forced to fold his card, not being able to recognize whether the dealer is bluffing. This allows the dealer to beat the second player over a long distance. In the game we are considering, there may also exist other Nash e-equilibria. However, it should be borne in mind that in a zero-sum game, which is poker,all the equilibria are Nashes bring players the same payments. Therefore, finding a single Nash equilibrium means that strategies have been found that allow players to guarantee the best possible outcome for themselves. Can I earn money by playing the strategy I found? Yes, if you can reproduce the actions that the strategy requires you to perform in each position.
It is unlikely that a person will be able to do this there will not be enough memory.
But against the computer, playing limit heads-up is now useless. Most likely, this means that limit heads-up poker will soon disappear from poker sites it will be very difficult to check that a person does not use special programs that help you find the best answers. However, it's too early for poker players to get upset. Even if all the variations of limit poker become known one day, there will still be no limit poker (you can place bets of any size), which is not the end game. Because of this, it is unlikely that you will be able to solve unlimited poker with modifications of the reverse induction algorithm.