Part Of: Algorithmic Game Theory sequence
Content Summary: 600 words, 6 min read
Setting The Stage
The Prisoner’s Dilemma is a thought experiment central to game theory. It goes like this:
Two members of a criminal gang are arrested and imprisoned. Each prisoner is in solitary confinement with no means of speaking to or exchanging messages with the other. The police admit they don’t have enough evidence to convict the pair on the principal charge. They plan to sentence both to a year in prison on a lesser charge.
Simultaneously, the police offer each prisoner a Faustian bargain. Each prisoner is given the opportunity either to betray the other, by testifying that the other committed the crime, or to cooperate with the other by remaining silent:
- If A and B both betray the other, each of them serves 2 years in prison
- If A betrays B but B remains silent, A will be set free and B will serve 3 years in prison (and vice versa)
- If A and B both remain silent, both of them will only serve 1 year in prison (on the lesser charge)
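The three rules above can be written down as a small lookup table. This is only an illustrative sketch: the names `YEARS` and `utility` are my own, and treating utility as negative years served is an assumption made for convenience, not something the thought experiment dictates.

```python
# Years in prison for (A, B), indexed by (A's choice, B's choice).
# "C" = cooperate (remain silent), "D" = defect (betray).
YEARS = {
    ("C", "C"): (1, 1),  # both silent: lesser charge only
    ("C", "D"): (3, 0),  # A silent, B betrays
    ("D", "C"): (0, 3),  # A betrays, B silent
    ("D", "D"): (2, 2),  # both betray
}

def utility(a_choice, b_choice):
    """Return (utility_A, utility_B), assuming utility = negative years served."""
    years_a, years_b = YEARS[(a_choice, b_choice)]
    return (-years_a, -years_b)
```

With this convention, higher numbers are better for each prisoner, which makes comparisons between outcomes easier to read.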
Do you “get” the dilemma? Both prisoners do better if they cooperate with one another. But they are held in separate rooms, where neither can see the other’s decision. The question becomes one of trust.
This parable can be drawn in strategy-space:
Consider Person A’s perspective:
One line of analysis might run as follows:
- If the other person cooperates (top rectangle), Player A would do better defecting (rightmost cell).
- If the other person defects (bottom rectangle), Player A would do better defecting (rightmost cell).
Thus, no matter what B chooses, defection leads to a better result for A. Let us call this strategic dominance.
Person B’s perspective, in the figure below, is analogous:
- If the other person cooperates (left rectangle), Player B would do better defecting (bottom cell).
- If the other person defects (right rectangle), Player B would do better defecting (bottom cell).
Thus, the strategically dominant outcome is Defect-Defect, or (D, D).
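The case-by-case reasoning for both players can be checked mechanically. The sketch below assumes utility is the negative of years served, and the helper names (`payoff`, `dominant_choice`) are hypothetical, chosen only for this illustration.

```python
CHOICES = ("C", "D")

# Utilities (A, B), assumed to be negative years in prison,
# indexed by (A's choice, B's choice).
UTILITY = {
    ("C", "C"): (-1, -1),
    ("C", "D"): (-3, 0),
    ("D", "C"): (0, -3),
    ("D", "D"): (-2, -2),
}

def payoff(player, own, other):
    """Utility for `player` (0 = A, 1 = B) given their own and the other's choice."""
    profile = (own, other) if player == 0 else (other, own)
    return UTILITY[profile][player]

def dominant_choice(player):
    """Return the choice that strictly beats every alternative against every
    opposing choice, or None if no such choice exists."""
    for mine in CHOICES:
        if all(
            payoff(player, mine, theirs) > payoff(player, alt, theirs)
            for alt in CHOICES if alt != mine
            for theirs in CHOICES
        ):
            return mine
    return None

print(dominant_choice(0), dominant_choice(1))  # D D
```

Both players arrive at D independently, which is exactly how (D, D) emerges without either prisoner knowing what the other will do.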
If the prisoners could coordinate their responses, would they select (D, D)? Surely not.
How might we express our distaste for mutual defection rigorously? One option would be to notice that both players prefer (C, C) to (D, D). Is there any outcome that both players prefer to mutual cooperation? No.
Let us call the movement from (D, D) to (C, C) a Pareto improvement, and the outcome (C, C) Pareto optimal (that is, no player’s utility can be improved without reducing another’s).
It turns out that (C, D) and (D, C) are also Pareto optimal: in each, one prisoner already goes free, and any change would leave that prisoner worse off. If we map all outcomes in utility-space, we notice that the Pareto optimal outcomes form a “fence” (also called a Pareto frontier).
The crisis of the Prisoner’s Dilemma can be put as follows: (D, D) doesn’t reside on the Pareto frontier. More generally: strategically-dominant outcomes need not reside on the Pareto frontier.
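To make the frontier concrete, here is a minimal sketch that filters out every outcome for which a Pareto improvement exists. As before, utilities are assumed to be negative years served, and the function names are hypothetical.

```python
# Utilities (A, B), assumed to be negative years in prison,
# indexed by (A's choice, B's choice).
UTILITY = {
    ("C", "C"): (-1, -1),
    ("C", "D"): (-3, 0),
    ("D", "C"): (0, -3),
    ("D", "D"): (-2, -2),
}

def pareto_dominates(u, v):
    """True if u is at least as good as v for everyone and strictly better for someone."""
    return all(x >= y for x, y in zip(u, v)) and any(x > y for x, y in zip(u, v))

def pareto_frontier(outcomes):
    """Return the strategy profiles whose utilities no other outcome Pareto-dominates."""
    return sorted(
        profile
        for profile, u in outcomes.items()
        if not any(pareto_dominates(v, u) for v in outcomes.values())
    )

print(pareto_frontier(UTILITY))  # [('C', 'C'), ('C', 'D'), ('D', 'C')]
```

Only (D, D) is excluded, because (C, C) gives both prisoners a shorter sentence.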
While there are other, more granular, ways to express “good outcomes”, the Pareto frontier is a useful way to think about the utility landscape.
Let me close with an observation I wish I had encountered sooner: utility-space does not exist a priori. It is an artifact of causal processes. One might even question the ethics of artificially inducing “Prisoner’s Dilemma” landscapes, given their penchant for provoking antisocial behaviors.
Takeaways
- A strategy is called dominant when it always outperforms alternatives, irrespective of competitor behavior.
- Pareto optimal outcomes are those for which there is no “pain-free” way to improve the outcome of any participant. Together, these outcomes make up the Pareto frontier.
- The Prisoner’s Dilemma illustrates that strategically-dominant outcomes need not reside on the Pareto frontier, or more informally, that acting in one’s self-interest can lead to situations where everyone loses.