Five years ago I wrote a post entitled Is Backgammon in P? It was based on conversations with Peter Bro Miltersen and Uri Zwick (shown together in the picture above) about the computational complexity of computing the values (and equilibrium points) of various stochastic games, and also on things I learned over the years from my game-theory friends about proving that values exist for some related games. A few weeks ago two former students of Peter, Rasmus Ibsen-Jensen and Kristoffer Arnsfelt Hansen, visited Israel, and I had a chance to chat with them and learn about some recent exciting advances.
In what way is Backgammon harder than chess?
Is there a polynomial time algorithm for chess? Well, if we measure the complexity of chess in terms of the board size, then it is fair to think that the answer is “no”. But if we measure the complexity in terms of the number of all possible positions, then it is easy to work backward over all positions and determine the outcome of optimal play starting from each given position.
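The backward pass can be sketched in a few lines. The game graph below is a made-up toy (real chess has astronomically many positions), but the memoized backward-induction recursion is the standard idea:

```python
from functools import lru_cache

# A made-up finite game: each position lists its successors; terminal
# positions carry an outcome from White's viewpoint (+1 win, 0 draw, -1 loss).
SUCCESSORS = {"start": ["a", "b"], "a": ["a1", "a2"], "b": ["b1"]}
TERMINAL = {"a1": -1, "a2": 0, "b1": 1}

@lru_cache(maxsize=None)  # each position is solved only once
def solve(position, white_to_move=True):
    """Backward induction: a position's value is the best successor
    value for the player to move."""
    if position in TERMINAL:
        return TERMINAL[position]
    values = [solve(nxt, not white_to_move) for nxt in SUCCESSORS[position]]
    return max(values) if white_to_move else min(values)

print(solve("start"))  # White forces the outcome worth 1 by moving to "b"
```

Since every position is solved only once, the running time is polynomial in the number of positions, which is exactly the point of the observation above.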
Now, what about backgammon? Like chess, backgammon is a game of complete information. The difference between backgammon and chess is the element of luck: at each position your possible moves are determined by a roll of two dice. This element of luck increases the computational difficulty of playing backgammon well compared to chess. It can easily be seen that an optimal strategy for the players in backgammon need not involve any randomness.
Problem 1: Is there a polynomial time algorithm to find the optimal strategy (and thus the value) of a stochastic zero-sum game with perfect information (like backgammon)?
This question (raised by Anne Condon in 1998) represents one of the most fundamental open problems in algorithmic game theory.
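For comparison, value iteration (one of the classical approaches, mentioned in Kristoffer's abstract below) approximates the value but is not known to do so in polynomial time. Here is a minimal sketch on a made-up turn-based stochastic game; all positions, actions, and probabilities are invented for illustration:

```python
# A made-up turn-based stochastic game: at each non-terminal position the
# player to move picks an action, and the next position is drawn from that
# action's lottery. Terminal positions carry a payoff to the "max" player.
GAME = {
    "M": ("max", {"safe":  [(1.0, "D")],
                  "risky": [(0.5, "W"), (0.5, "N")]}),
    "N": ("min", {"pass":  [(1.0, "W")],
                  "block": [(0.8, "L"), (0.2, "W")]}),
}
TERMINAL = {"W": 1.0, "L": 0.0, "D": 0.5}

def value_iteration(iterations=100):
    """Repeatedly back up expected values; converges to the game's value."""
    v = dict(TERMINAL)
    v.update({s: 0.0 for s in GAME})
    for _ in range(iterations):
        for state, (player, actions) in GAME.items():
            best = max if player == "max" else min
            v[state] = best(sum(p * v[nxt] for p, nxt in lottery)
                            for lottery in actions.values())
    return v

v = value_iteration()
print(v["M"], v["N"])  # values of positions M and N
```

In this toy game the iteration settles on value 0.2 for the minimizer's position and 0.6 for the maximizer's, whose "risky" action is optimal.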
In what way is heads-up poker harder than backgammon?
Heads-up poker is simply poker with two players; to make it concrete, you may think of heads-up Texas hold’em. This is not a game with complete information, but according to the minmax theorem it still has a value. The optimal strategies are mixed and involve randomness.
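As a tiny illustration of what the minmax theorem gives, the value of a 2×2 zero-sum game can be computed in closed form; the matrices below are illustrative toys, not a model of poker:

```python
def value_2x2(a, b, c, d):
    """Value, for the row player, of the zero-sum game [[a, b], [c, d]].
    If some entry is a saddle point (minimum of its row, maximum of its
    column), it is the value and a pure optimal strategy exists; otherwise
    both optimal strategies are mixed and v = (a*d - b*c)/(a + d - b - c)."""
    rows = [[a, b], [c, d]]
    for i in range(2):
        for j in range(2):
            entry = rows[i][j]
            if entry == min(rows[i]) and entry == max(rows[0][j], rows[1][j]):
                return entry
    # No saddle point: the denominator is then nonzero.
    return (a * d - b * c) / (a + d - b - c)

print(value_2x2(2, 1, 0, -1))   # saddle point: value 1, pure optimal strategies
print(value_2x2(1, -1, -1, 1))  # matching pennies: value 0, mixed strategies
```

For larger matrices one solves a linear program instead, but the 2×2 case already shows why randomness is unavoidable: in matching pennies no pure strategy guarantees the value.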
Problem 2: Is there a polynomial time algorithm to find the optimal strategy (and thus the value) of a stochastic zero-sum game with incomplete information? (like heads-up Texas hold’em poker).
It would be very nice to find even a subexponential algorithm for stochastic zero-sum games with incomplete information like poker.
Problem 2′: Is there a subexponential-time algorithm to find the optimal strategy (and thus the value) of a stochastic zero-sum game with incomplete information?
For games with complete information like backgammon, a subexponential algorithm was found by Walter Ludwig, and in greater generality by Sergei Vorobyov, Henrik Björklund, and Sven Sandberg. It is related to a subexponential simplex-type algorithm for linear programming called RANDOM-FACET, found in the early 90s by Matoušek, Sharir, and Welzl and by myself.
Poker-like games with a bounded number of states are in P!
Kristoffer Arnsfelt Hansen (see abstract below) presented a polynomial-time algorithm for two-person zero-sum stochastic games when the games have a bounded number of states. (Earlier algorithms were exponential.) The paper is: Exact Algorithms for Solving Stochastic Games by Kristoffer Arnsfelt Hansen, Michal Koucky, Niels Lauritzen, Peter Bro Miltersen, and Elias P. Tsigaridas. Slides of the talk are linked here.
Practical algorithms for heads-up poker – The Alberta group
As with backgammon, there are very good computer programs for heads-up poker. (We talked about chess-playing computers in this guest post by Amir Ban, and since that time Go-playing computers have also become available.) The Cepheus Poker Project site and the Science paper Heads-up limit hold’em poker is solved are good sources on the major achievements of a group of researchers from Alberta regarding two-player poker.
What about poker with more players?
Problem 3: Is there a polynomial time algorithm to find a Nash equilibrium point (or another form of optimal strategy) of a stochastic n-player game with incomplete information (like Texas hold’em poker)? Here n is fixed and small.
I think that people are optimistic that the answer even to Problem 3 is yes. (There are hardness results for finding equilibrium points in matrix games, but their relevance to our case is not clear.) If we want an algorithm that plays poker optimally, it is not clear that finding a Nash equilibrium is the way to go.
Problem 4: Find an algorithm for playing Texas hold’em poker when there are more than two players.
When the objective is to maximize revenue against human players, I expect that it will be possible to develop computer programs that play poker better than humans.
Problem 5: How to play the game MEDIAN of the previous post?
Matching pennies is the name of a simple game used in game theory. It is played between two players, Even and Odd. Each player has a penny and must secretly turn the penny to heads or tails. The players then reveal their choices simultaneously. If the pennies match (both heads or both tails), Even keeps both pennies, so he wins one from Odd (+1 for Even, −1 for Odd). If the pennies do not match (one heads and one tails), Odd keeps both pennies, so he receives one from Even (−1 for Even, +1 for Odd). (Source: Wikipedia.)
Variants of this game have been played since ancient times. In Hebrew, matching pennies is called ZUG O PERET (even or odd; זוג או פרט). It is played like this: there are two players, and each player in turn announces “even” or “odd”. Then the two players simultaneously show some number of fingers, and the announcing player wins if the sum of the fingers has the announced parity.
The Big Match
The big match is a drastic repeated version of matching pennies, played between players Even and Odd. Each player has a penny, and in each stage must secretly turn the penny to heads or tails; the stage payoffs are the same as in matching pennies. If Even plays heads, the game continues to the next stage. However, if Even plays tails (or “goes for the big match”, as it is called), then the payoff of that round is repeated in all future rounds: namely, if the pennies match, Even gets 1 in every future round, and if the pennies do not match, Even pays 1 in every future round.
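A small helper makes the absorption rule concrete; the move encoding and function name here are just for illustration:

```python
def big_match_average(even_moves, odd_moves):
    """Limiting-average payoff to Even along a play of the Big Match,
    given equal-length strings of 'H'/'T' moves. Once Even plays 'T'
    the game is absorbed: that round's payoff repeats forever, so it
    IS the limiting average; otherwise we average the given prefix."""
    total = 0
    for rounds, (e, o) in enumerate(zip(even_moves, odd_moves), start=1):
        payoff = 1 if e == o else -1   # match pays Even, mismatch pays Odd
        if e == "T":
            return payoff              # absorbed for all future rounds
        total += payoff
    return total / rounds

print(big_match_average("HHHH", "HTHT"))  # no absorption: average is 0.0
print(big_match_average("HHT", "HTT"))    # Even stops on a match: +1 forever
```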
By playing heads with probability 1/2 and tails with probability 1/2, Odd can guarantee an expected payoff of 0. But what about Even? Can he also guarantee an expected payoff of 0? This was an open question for quite some time. The big match was introduced in 1957 by Dean Gillette, who asked whether the game has a value, namely whether Even can also guarantee an expected payoff of 0 (or arbitrarily close to it).
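Odd's half of the claim is a one-line check: against a 50/50 coin, each of Even's actions has expected stage payoff 0, and since an absorbed payoff is just one stage payoff repeated forever, the same bound holds for the long-run average. A sketch in exact arithmetic (the names are mine):

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def stage_payoff(even, odd):
    """Stage payoff to Even: +1 if the pennies match, -1 otherwise."""
    return 1 if even == odd else -1

def expected_vs_uniform_odd(even_action):
    """Even's expected stage payoff when Odd mixes heads/tails 50/50."""
    return (HALF * stage_payoff(even_action, "H")
            + HALF * stage_payoff(even_action, "T"))

# Whether Even continues ('H') or goes for the big match ('T'),
# the expectation against Odd's coin flip is exactly 0.
for action in ("H", "T"):
    print(action, expected_vs_uniform_odd(action))
```

The hard direction, of course, is Even's side of the game, which is the subject of the next paragraphs.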
Problem 7: Does the big match have a value?
In 1968, David Blackwell and Thomas S. Ferguson settled Gillette’s question and proved that Even can guarantee an expected payoff arbitrarily close to zero, and thus the big match does in fact have a value. This was the first step toward showing that all zero-sum stochastic games have a value under the limiting-average payoff, which was proven in 1981 by Mertens and Neyman.
The Big Match in small space
Rasmus Ibsen-Jensen presented both positive and negative results on attaining the value for the big match with limited types of strategies and also on complexity issues regarding other stochastic games. Here are the slides for Rasmus’ talk (see full abstract below). Part of the talk is based on the paper The Big Match in Small Space by Kristoffer Arnsfelt Hansen, Rasmus Ibsen-Jensen, and Michal Koucky.
Values and equilibrium points for stochastic games
This is a remarkable story with very important results and open questions. See the Wikipedia article on stochastic games and this short paper by Eilon Solan. I see now that the post is becoming too long, so I will have to discuss this in a different post.
Problem 8 (informal): Does every stochastic game have a value (an equilibrium)?
Following a major step by Truman Bewley and Elon Kohlberg (1976), Jean-François Mertens and Abraham Neyman (1981) proved that every two-person zero-sum stochastic game with finitely many states and actions has a uniform value. Nicolas Vieille (2000) has shown that all two-person stochastic games with finite state and action spaces have a limiting-average equilibrium payoff. The big question is to extend Vieille’s result to games with many players.
Kristoffer, Rasmus and Abraham (Merale) Neyman.
The new advances: Kristoffer’s lecture
Exact algorithms for solving stochastic games
Speaker: Kristoffer Arnsfelt Hansen, Aarhus University
In this talk we consider two-player zero-sum stochastic games with finite state and action spaces from an algorithmic perspective. Prior to our work, algorithms for solving stochastic games relied either on generic reductions to decision procedures for the first-order theory of the reals or on value or strategy iteration. For all these algorithms, the complexity is at least exponential even when the number of positions is a constant and even when only a crude approximation is required.

We will present an exact algorithm for solving these games based on a simple recursive bisection pattern. The algorithm runs in polynomial time when the number of positions is constant, and it is the first algorithm with this property. While the algorithm is not based directly on real algebraic geometry, it depends heavily on results from that field.

Based on joint work with Michal Koucký, Niels Lauritzen, Peter Bro Miltersen, and Elias P. Tsigaridas, published at STOC’11.
Rasmus’s lecture: Complexity of Good Strategies in Stochastic Games
Abstract: The talk will attempt to characterize good strategies for some special cases of stochastic games. For instance, the talk will argue that there might always be a good strategy with a certain property for all games in a special case of stochastic games, or that no good strategy with some property exists for some game. Concretely,
1) for the stochastic game the Big Match, no good strategy (for lim inf) exists that depends only on how long the game has been played and a finite amount of extra memory (when the extra memory is updated deterministically).
2) for the Big Match there is a good strategy that uses only a single coin flip per round and exponentially less space than previously known good strategies.
3) let x be the greatest reward in a stochastic game. The talk will next give a simple characterization of the states of value equal to x for which there exists either (a) an optimal strategy; (b) for each epsilon>0, a stationary epsilon-optimal strategy; or (c) for each epsilon>0, a finite-memory epsilon-optimal strategy (when the memory is updated deterministically). The characterization also gives the corresponding strategy.
4) the talk will then consider stochastic games in which epsilon-optimal stationary strategies exist for all epsilon>0. It will argue that the smallest positive probability in a stationary epsilon-optimal strategy must be at least doubly exponentially small for some subclasses of stochastic games, while for other classes exponentially small probabilities suffice.
1) and 2) are based on “The Big Match in Small Space”; 3) is based on “The Value 1 Problem Under Finite-memory Strategies for Concurrent Mean-payoff Games”; 4) is based on “Strategy Complexity of Concurrent Stochastic Games with Safety and Reachability Objectives” and “The complexity of ergodic mean-payoff games”. All the papers can be found at http://Rasmus.Ibsen-Jensen.com