An Introduction To Prisoner’s Dilemma

Part Of: Algorithmic Game Theory sequence
Content Summary: 600 words, 6 min read

Setting The Stage

The Prisoner’s Dilemma is a thought experiment central to game theory. It goes like this:

Two members of a criminal gang are arrested and imprisoned. Each prisoner is in solitary confinement with no means of speaking to or exchanging messages with the other. The police admit they don’t have enough evidence to convict the pair on the principal charge. They plan to sentence both to a year in prison on a lesser charge.

Simultaneously, the police offer each prisoner a Faustian bargain. Each prisoner is given the opportunity either to betray the other, by testifying that the other committed the crime, or to cooperate with the other by remaining silent:

  • If A and B both betray the other, each of them serves 2 years in prison
  • If A betrays B but B remains silent, A will be set free and B will serve 3 years in prison (and vice versa)
  • If A and B both remain silent, both of them will only serve 1 year in prison (on the lesser charge)

Do you “get” the dilemma? Both prisoners do better if they each cooperate with one another. But they are taken to separate rooms, where the decisions of the other are no longer visible. The question evolving towards one of trust…

This parable can be drawn in strategy-space:

Prisoner's Dilemma- Overview

Strategic Dominance

Consider Person A’s perspective:

Prisoner's Dilemma- Dominance Player A

One line of analysis might run as follows:

  • If the other person cooperates (top rectangle), Player A would do better defecting (rightmost cell).
  • If the other person defects (bottom rectangle), Player A would do better defecting (rightmost cell).

Thus, no matter what B’s choice, defection leads to a superior result. Let us call this strategic dominance.

Person B’s perspective, in the below figure, is analogous:

Prisoner's Dilemma- Dominance Person B

  • If the other person cooperates (left rectangle), Player A would do better defecting (bottom cell).
  • If the other person defects (right rectangle), Player A would do better defecting (bottom cell).

Thus, the strategically dominant outcome is Defect-Defect, or (D, D).

Pareto Optimality

If the prisoners could coordinate their responses, would they select (D,D)? Surely not.

How might we express our distaste for mutual defection rigorously? One option would be to notice that (C, C) is preferred by both players. Is there anything better than mutual cooperation, in this sense? No.

Let us call the movement from (D,D) to (C,C) a Pareto improvement, and the outcome (C,C) Pareto optimal (that is, no one player’s utility can be improved without harming that of another).

It turns out that (C,D) and (D, C) are also Pareto optimal. If we map all outcomes in utility-space, we notice that Pareto optimal outcomes comprise a “fence” (also called a Pareto frontier).

Prisoner's Dilemma- Pareto Optimality

The crisis of the Prisoner’s Dilemma can be put as follows: (D, D) doesn’t reside on the Pareto frontier. More generally: strategically-dominant outcomes need not reside on the Pareto frontier.

While there are other, more granular, ways to express “good outcomes”, the Pareto frontier is a useful way to think about the utility landscape.

Let me close with an observation I wish I had encountered sooner: utility-space does not exist a priori. It is an artifact of causal processes. One might even question the ethics of artificially inducing “Prisoner’s Dilemma” landscapes, given their penchant for provoking antisocial behaviors.


  • A strategy is called dominant when it always outperforms alternatives, irrespective of competitor behavior.
  • Pareto optimal outcomes are those for which there is no “pain-free” way to improve the outcome of any participant. All such outcomes comprise the Pareto frontier.
  • The Prisoner’s Dilemma illustrates that strategically-dominant outcomes need not reside on the Pareto frontier, or more informally, that acting in one’s self-interest can lead to situations where everyone loses.

Iterated Schizophrenic’s Dilemma

Part Of: Breakdown of Will sequence
Followup To: Rule Feedback Loops
Content Summary: 1300 words, 13 min read


Here’s where we landed last time:

  • Preference bundling is mentally implemented via a database of personal rules (“I will do X in situations that involve Y).
  • Personal rules constitute a feedback loop, whereby rule-compliance strengthen (and rule-circumvention weakens) the circuit.

Today’s post will draw from the following:

  1. An Introduction To Hyperbolic Discounting discusses warfare between successive selves (e.g., OdysseusNOW restricts the freedoms of OdysseusFUTURE in order to survive the sirens).
  2. An Introduction To Prisoner’s Dilemma discusses how maths can be used to describe the outcome of games between competitors.

Time to connect these threads! 🙂 Let’s ground our warfare narrative in the formal systems of game theory.

Schizophrenic’s Dilemma

First, we interpret { larger-later (LL) vs. smaller-sooner (SS) } rewards in terms of { cooperation vs. defection } decisions.

Second, we name our actors:

  • P represents your present-self
  • F represents your future-self
  • FF represents your far-future-self, etc.

Does this game suffer from the same issue as classical PD? Only one way to find out!

ISD- Simple Example (1)

Here’s how I would explain what’s going on:

  • Bold arrows represent decisions, irreversible choices in the real world. Decisions lock in final scores (yellow boxes).
  • Skinny arrows represent intentions, the construction of rules for future decisions.
  • The psychological reward you get from setting intentions is just as real as that produced by decisions.
  • Reward is accretive: your brain seeks to maximize the sum total reward it receives over time.

Take some time to really ingest this diagram.

Iterated Schizophrenic’s Dilemma (ISD)

Okay, so we now understand how to model intertemporal warfare as a kind of “Schizophrenic’s Dilemma“. But we “have to live with ourselves” — the analogy ought to be extended across more than one decision. Here’s how:

ISD - From Flat Behavior To IPD

Time flows from left to right. In the first row, we see that an organism’s choices are essentially linear. For each choice, the reasoner selects either an LL (larger-longer) or an SS (smaller-sooner) reward. How can this be made into a two-dimensional game? We can achieve this graphically by “stretching each choice point up and to the left”. This move ultimately implements the following:

  • Temporal Continuity: “the future-self of one time step must be equivalent to the present-self of the next time step”.

It is by virtue of this rule that we are able to conceptualize the present-self competing against the future-self (second row). The third row simply compresses the second into one state-space.

Let us call this version of the Iterated Prisoner’s Dilemma, the Iterated Schizophrenic’s Dilemma. For the remainder of this article, I will use IPD and ISD to distinguish between the two.

The Lens Of State-Space

We have previously considered decision-space, and outcome-space. Now we shall add a third space, what I shall simply call state-space (very similar to Markov Decision Processes…)

ISD- Three IPD Spaces (1)

You’ll notice that the above state-space is a fully-connected graph: in IPD, any decision can be made at any time.

But this is not true in ISD, which must honor the rule of Temporal Continuity. For example, (C, C) -> (D, C) violates that “the future-self of one time step must be equivalent to the present-self of the next time step”.  The ISD State-Space results from trimming all transitions that violate Temporal Continuity:

ISD- ISD vs IPD state spaces

Let me briefly direct your attention to three interesting facts:

  1. Half of the edges are gone. This is because information now flows in only one direction.
  2. Every node is reachable; no node has been stranded by the edge pruning.
  3. (C,D) and (D,C) are necessarily transient states; only (C, C) and (D,D) can recur indefinitely.

Intertemporal Retaliation

Four qualities of successful IPD strategies are: nice, forgiving, non-envious, and retaliating. How does retaliation work in the “normal” IPD?

ISD - Traditional IPD retaliation

Here we have two cases of retaliation:

  1. At the first time-step, Player A defects “unfairly”. Incensed, Player B retaliates (defects) next turn.
  2. At the fourth time-step, Player B takes advantage of A’s goodwill. Outraged, Player A punishes B next turn.

Note the easy two-way symmetry. But in the ISD case, the question of retaliation becomes very complex:

  1. Present-selves may “punish” future-selves by taking an earlier reward, now.
  2. But future-selves cannot punish present-selves, obviously, because time does not flow in that direction.

What, then, is to motivate our present-selves to “keep the faith”?  To answer this, we need only appeal to the feedback nature of personal rules (explored last time)!

I’ll let Ainslie explain:

As Bratman has correctly argued (Bratman 1999, pp. 35-57), a present “person-stage” can’t retaliate against the defection of a prior one, a difference that disqualifies the prisoner’s dilemma in its classical form as a rationale for consistency. However, insofar as a failure to cooperate will induce future failures, a current decision-maker contemplating defection faces a danger of the same kind as retaliation…

With the question of retaliation repaired, the analogy between IPD and ISD seems secure. Ainslie has earned the right to invoke IPD explananda; for example:

The rules of this market are the internal equivalent of “self-enforcing contracts” made by traders who will be dealing with each other repeatedly, contracts that let them do business on the strength of handshakes (Klein & Leffler 1981; Macaulay 1963).

Survival Of The Salient

After casting warfare of successive selves into game theoretic terms, we are in a position to import other concepts. Consider the Schelling point, the notion that salient choices function as an attractor between competitors. Here’s an example of the Schelling point:

Consider a simple example: two people unable to communicate with each other are each shown a panel of four squares and asked to select one; if and only if they both select the same one, they will each receive a prize.Three of the squares are blue and one is red. Assuming they each know nothing about the other player, but that they each do want to win the prize, then they will, reasonably, both choose the red square. Of course, thered square is not in a sense a better square; they could win by both choosing any square. And it is only the “right” square to select if a player can be sure that the other player has selected it; but by hypothesis neither can. However, it is the most salient and notable square, so—lacking any other one—most people will choose it, and this will in fact (often) work.

By virtue of the feedback mechanism discussed above, rules are adapt over time via a kind of variation on natural selection (“survival of the salient“):

Intertemporal cooperation is most threatened by rationalizations that permit exceptions for the choice at hand, and is most stabilized by finding bright lines to serve as criteria for what constitutes cooperation. A personal rule never to drink alcohol, for instance, is more stable than a rule to have only two drinks a day, because the line between some drinking and no drinking is unique (bright), while the two-drinks rule does not stand out from some other number, or define the size of the drinks, and is thus susceptible to reformulation. However, skill at intertemporal bargaining will let you attain more flexibility by using lines that are less bright. This skill is apt to be a key component of the control processes that get called ego functions.

In the vocabulary of our model, it is the peculiarities of the preference bundling module whereby Schelling points are more effectual.

Let us close by, in passing, noting that this line of argument can be generalized into a justification for slippery slope arguments.


  • Warfare of successive selves can be understood in terms of the Prisoner’s Dilemma, where cooperation with oneself is selecting LL, and defection is SS.
  • In fact, intertemporal bargaining can be successfully explained via a modified form of the Iterated Prisoner’s Dilemma, since retaliation works via a unique feedback mechanism.
  • Because our model of willpower spans both a game-theoretic and cognitive grammar, we can make sense of a “survival of the salient” effect, whereby memorable rules persist longer.