Abstract:
This thesis advances statistical exploitation in Texas Hold’em poker and demonstrates that the method is viable for building a world-class poker-playing agent. Because skill in poker is measured by a player’s win rate over a range of opponents of varying skill levels, a world-class player must be both resilient to exploitation and able to exploit opponents. Resilience ensures that the player does not lose heavily to any opponent, while exploitation lets the player achieve a high win rate against some opponents; together they yield a high total win rate over the entire range. The resilience is provided by an approximate Nash equilibrium base strategy, which the agent plays in every game state where it is uncertain whether the opponent is exploitable. The exploitation is provided by a statistical exploitation module that models the opponent and supplies exploitive actions in game states where it is certain that the opponent is exploitable. Our evaluations show that the strategy obtained by combining the approximate Nash equilibrium strategy with the improved statistical exploitation module is as resilient to exploitation as the approximate Nash equilibrium strategy on its own, even against dynamic and strong opponents, and that it is capable of exploiting some opponents. This is made possible by the module’s safe exploitation, which does not increase the exploitability of the resulting strategy. We therefore conclude that this approach is a viable way to create a world-class poker player.