Iterated Prisoner’s Dilemma contains strategies that dominate any evolutionary opponent
William H. Press and Freeman J. Dyson
PNAS 109(26), June 26, 2012
This was an interesting little paper that I learned about and am still digesting; as I’ve said before, I’m not a game theory guy. They start out with a 2×2 game, the classical Prisoner’s Dilemma with players X and Y: if both cooperate (C), the payoffs are $(R, R)$; if the first (resp. second) defects (D), they are $(T, S)$ (resp. $(S, T)$); and if they both defect, then $(P, P)$. The usual conditions are $T > R > P > S$ and $2R > T + S$ to keep it interesting.
In an iterated game the players play over and over again and try to develop strategies to win more than their opponent. The first interesting (to me) observation in this paper is that strategies which maintain a long memory have no advantage over those which maintain a short memory, in the sense that for any long-memory strategy of Y, the payoff to a short-memory strategy of X is the same as it would be against a certain short-memory strategy of Y. So if your opponent is playing a short-memory strategy, you may as well play a similar short-memory strategy.
So with that idea in mind they look at first-order Markov strategies: given the previous state (CC, CD, DC, DD), player X randomly chooses whether to cooperate or not in the next round. Thus each player’s strategy is specified by 4 probabilities of cooperating, one conditioned on each previous state, and you can compute a 4×4 Markov matrix of transition probabilities out of the randomized strategies of the two players.
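To make the construction concrete for myself, here is a minimal Python sketch of that 4×4 matrix and the long-run payoffs it implies. This is my own illustration, not code from the paper. The function names, the payoff values (5, 3, 1, 0 for the temptation, reward, punishment and sucker payoffs, which are the conventional choice) and the use of power iteration for the stationary distribution are all assumptions, and the iteration only settles down when the chain is well behaved, for example when every probability is strictly between 0 and 1.

```python
STATES = ("CC", "CD", "DC", "DD")  # X's move first, then Y's


def transition_matrix(p, q):
    """Build the row-stochastic 4x4 matrix over STATES.

    p[i] is the probability that X cooperates after STATES[i].
    q[i] is the probability that Y cooperates after STATES[i] as seen
    by Y (Y's own move first), so CD and DC swap roles for Y.
    """
    q_in_x_order = (q[0], q[2], q[1], q[3])
    rows = []
    for px, qy in zip(p, q_in_x_order):
        rows.append([px * qy, px * (1 - qy), (1 - px) * qy, (1 - px) * (1 - qy)])
    return rows


def stationary(matrix, iterations=10_000):
    """Approximate the stationary distribution by repeated multiplication.

    Assumption: the chain is ergodic, so the iteration converges.
    """
    v = [0.25] * 4
    for _ in range(iterations):
        v = [sum(v[i] * matrix[i][j] for i in range(4)) for j in range(4)]
    return v


def expected_payoffs(p, q, reward=3, sucker=0, temptation=5, punishment=1):
    """Long-run average payoffs (X, Y) under the stationary distribution."""
    v = stationary(transition_matrix(p, q))
    payoff_x = (reward, sucker, temptation, punishment)
    payoff_y = (reward, temptation, sucker, punishment)
    score_x = sum(vi * s for vi, s in zip(v, payoff_x))
    score_y = sum(vi * s for vi, s in zip(v, payoff_y))
    return score_x, score_y
```

As a sanity check, if both players cooperate with probability 0.5 regardless of history, all four states are equally likely and `expected_payoffs((0.5,) * 4, (0.5,) * 4)` gives each player (3 + 0 + 5 + 1)/4 = 2.25.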
Now the game becomes how the payoffs behave under different choices of strategy, which boils down to spectral / structural properties of this Markov matrix. To be honest, I found the discussion in the rest of the paper more confusing than enlightening because the mathematics was a little obscured by the discussion of implications. I realize this is a PNAS paper but it felt a little discursive at times and the mathematical intuition was not entirely clear to me.
Still, it’s a short read and probably worth a peruse.