Counterfactual Simulation: Presumed Similar Until Proven Different

Part Of: Sociality sequence
Content Summary: 1400 words, 14 min read

False Belief Blindness

Consider the Sally-Anne test:

A child is in a room, watching Sally and Anne who are also in the room. The room has two objects in it: a basket and a box.

Sally has a marble. Sally puts the marble in the basket and leaves the room. While she is gone, Anne moves the marble to the box.

Sally comes back, wanting to play with her marble. Where will she look for it?

To answer the question correctly, it is not enough to model Sally and Anne as having desires and beliefs (to deploy the intentional stance). The child must also be able to differentiate her own knowledge from the knowledge of others (the child is correct, but Sally is wrong). This is an instance of cognitive decoupling: building firewalls around the beliefs/desires of individual agents.

It turns out that 3yo children get the answer wrong, but 5yo children (except autistic ones) get it right. What happens at age four?

Why Belief Inference Is Blind …

At the end of the test, the child believes that the marble is in the box. Sally believes that the marble is in the basket. Their respective minds might look something like this:

ToM- Sally-Anne v1

In Awakening To A Social World, we learned that, at twelve months, children begin to think about other people as having beliefs and goals:

ToM- Sally-Anne v2

If the above picture is how your brain works, it would be puzzling to explain how a child would ever be tempted to conclude that Sally thinks the marble is in the box.

But this is not how your brain encodes second-order beliefs. Here’s what actually happens:

ToM- Sally-Anne v3

Crucially, relational second-order beliefs (“Sally thinks that”) point towards first-order beliefs (“Marble’s in basket”) which live in your world model. This mental library comes equipped with a librarian, who flags the incompatibility between “Marble’s in box” and “Marble’s in basket”, and removes the latter.

Architecturally, the reason why three year olds suffer from false belief blindness is that all beliefs funnel through one world model. There are simply no separate memory spaces in which to evaluate the world models of other people. For the child to represent another person’s belief at all, that belief must be compatible with the facts the child already knows.
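
Here is a toy sketch of this single-library architecture in Python. Every name in it (`WorldModel`, `attribute`, the `marble_location` slot) is an assumption made up for illustration, not part of any established model. The only point is to show how one shared store plus a consistency check produces the three year old's answer.

```python
class WorldModel:
    """A single mental library: one value per slot, no per-agent storage."""

    def __init__(self):
        self.beliefs = {}       # slot -> value, e.g. "marble_location" -> "box"
        self.attributions = {}  # agent -> slot (a pointer into self.beliefs)

    def observe(self, slot, value):
        # The librarian keeps one value per slot: a new fact evicts the old one.
        self.beliefs[slot] = value

    def attribute(self, agent, slot, value):
        # "Agent thinks that ..." stores only a pointer into the shared library.
        # If I already hold a belief for this slot, the attributed value is
        # dropped, so the pointer lands on my own belief.
        if slot not in self.beliefs:
            self.beliefs[slot] = value
        self.attributions[agent] = slot

    def predict(self, agent):
        # Follow the pointer: the answer is whatever *I* currently believe.
        return self.beliefs[self.attributions[agent]]


child = WorldModel()
child.observe("marble_location", "basket")             # Sally hides the marble
child.attribute("Sally", "marble_location", "basket")  # Sally saw this too
child.observe("marble_location", "box")                # Anne moves it; Sally is away
print(child.predict("Sally"))                          # prints "box"
```

Nothing in this sketch can hold “basket” and “box” at the same time, which is the architectural claim above in miniature.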

… And Recognizing Falsehood Is No Cure

In Gullible By Default we discussed how negating beliefs is optional, effortful, and prone to failure. So you might think that the three year old child hasn’t yet developed the ability to negate Sally’s belief. But you’d be wrong. If you look carefully at the mechanics of negation, you will realize that negation cannot help.

Negating claims is implemented by adding an “It is false that” tag in front of the belief in question. We can negate Sally’s belief in two different ways:

ToM- Negation Cannot Model False Beliefs

Both negations fail. It is simply not true that Sally recognizes the marble isn’t in the basket. Likewise, we cannot say that Sally does not think that the marble is in the basket. Even exotic combinations (e.g., double negatives) are of no use.

But shouldn’t recognizing falsehood in one’s self enable us to recognize it in others?

No. It may help to notice that lie detection immediately corrects false beliefs (by introducing an “it is false that” mental prefix). This repair is essential to protect the mental ecosystem from contamination. Put simply, negation evolved as a protection against deception; it is simply not equipped to recognize honest mistakes.

The Birth Of Creativity

Quite independently of the evolution of lie detectors, the hominid line has also acquired a different kind of ability: the ability to pretend. Did you play the “floor is lava” game as a child? What is going on in the mind of a child when they pretend that couches, etc. are the only refuge from a sea of molten rock?

You might be tempted to say that there exists a “floor is lava” belief in your mental library during such games. Or that the falsehood-detector is exploring the possibility of appending “It is false that…” to the traditional belief that “the floor is carpet”. But something more sophisticated is going on. As Leslie puts it:

If I jump up suddenly because I mistakenly think I see a spider on the table, I act as if a spider were there. But I certainly do not pretend a spider is there.

Instead of replacing belief, the child is alternating between competing beliefs. More specifically, the child is building a tiny little scaffold, which hovers over their actual belief that “the floor is carpet” and simulates a world in which that belief is replaced. We may call this counterfactual simulation.

ToM- Sally-Anne v4 Counterfactual Maps

To operate effectively, your counterfactual simulation must:

  • Retain the original belief and its relationships to the rest of your memory. Damage to any of these forms of knowledge is irreversible.
  • Construct maps between the original belief and the counterfactual. It is not enough to imagine “Floor is Lava”, you must know which belief it overwrites.
  • Distance the prediction machine from the sensorimotor river. The floor’s perceptual signature doesn’t evoke visceral fear, for example.
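
The three requirements above can be sketched as a read-only overlay on top of the world model. This is a minimal illustration in Python, assuming beliefs are simple slot/value pairs; `CounterfactualSimulation`, `lookup`, and `overwritten` are hypothetical names invented for this example.

```python
from types import MappingProxyType


class CounterfactualSimulation:
    """A scaffold that hovers over the world model without touching it."""

    def __init__(self, world_model, replacements):
        # Requirement 1: retain the original. The simulation only gets a
        # read-only view, so it cannot damage the underlying beliefs.
        self._base = MappingProxyType(world_model)
        # Requirement 2: keep an explicit map from each slot to its
        # pretend value, so we know which belief is being overwritten.
        self.overrides = dict(replacements)

    def lookup(self, slot):
        # Inside the game, pretend values shadow the real ones.
        if slot in self.overrides:
            return self.overrides[slot]
        return self._base[slot]

    def overwritten(self):
        # slot -> (actual belief, pretend belief)
        return {s: (self._base.get(s), v) for s, v in self.overrides.items()}


def feels_visceral_fear(world_model):
    # Requirement 3: gut reactions read the real world model only,
    # never the simulation.
    return world_model.get("floor") == "lava"


world = {"floor": "carpet", "couch": "soft"}
game = CounterfactualSimulation(world, {"floor": "lava"})

print(game.lookup("floor"))        # prints "lava"
print(game.lookup("couch"))        # prints "soft"
print(world["floor"])              # prints "carpet"
print(feels_visceral_fear(world))  # prints "False"
```

Ending the game is just discarding `game`; the world model never changed.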

Why has a counterfactual simulator evolved in the hominid line?

Pretense originates in play, but is far more significant than that. With the ability to simulate different worlds, our minds are able to “try on” new beliefs as if they were hats. If we locate a belief that explains the world better than our world model, we upgrade our world model. Counterfactuals allow our prediction machines to upgrade themselves. They are the algorithmic bedrock of creativity.

How Counterfactuals Restored Our Sight …

A timeline of mind-relevant developmental milestones:

  • At 12 months, the intentional stance emerges
  • At 20 months, pretending behavior emerges.
  • At 48 months, false belief blindness is overcome.

Despite the large time gap between pretense and the blindness reversal, I claim that pretense is the cure. What gives?

The answer lies in the fallback mechanism for representing false beliefs. Recall that the 13 month old’s mental librarian rejects “Marble’s In Basket” as incompatible, and as a fallback, re-routes “Sally thinks that” towards the true belief “Marble’s In Box”. When the counterfactual simulator comes online at twenty months, it isn’t yet involved in this failure mode.

It is only slowly that the child’s brain discovers that these two technologies can be productively combined. Passing the Sally-Anne test requires a novel modification to the error processing algorithm:

ToM- Sally-Anne v5 Counterfactual Redemption

Not only are false beliefs encoded counterfactually, but the novel data are stored in the relationship model for later reuse. This is how we become aware of the fallibility of our peers.
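
As a rough sketch of that modified error handling, consider the Python below. It assumes the only change from the three year old's architecture is what happens on a conflict: instead of discarding the incompatible belief, the child files it as a counterfactual under the relevant person. The names `Child`, `relationship_model`, and `attribute` are hypothetical.

```python
class Child:
    def __init__(self):
        self.world_model = {}         # my own first-order beliefs
        self.relationship_model = {}  # agent -> {slot: belief I think is false}

    def observe(self, slot, value):
        self.world_model[slot] = value

    def attribute(self, agent, slot, value):
        if self.world_model.get(slot) == value:
            # Compatible with what I know: no extra storage needed,
            # and any stale counterfactual for this slot is cleared.
            self.relationship_model.get(agent, {}).pop(slot, None)
            return
        # Conflict: rather than letting the librarian discard the belief,
        # encode it counterfactually and keep it for later reuse.
        self.relationship_model.setdefault(agent, {})[slot] = value

    def predict(self, agent, slot):
        # Use the stored counterfactual if there is one;
        # otherwise fall back to my own belief.
        overrides = self.relationship_model.get(agent, {})
        if slot in overrides:
            return overrides[slot]
        return self.world_model[slot]


child = Child()
child.observe("marble_location", "box")                # Anne has moved the marble
child.attribute("Sally", "marble_location", "basket")  # Sally last saw it here
print(child.predict("Sally", "marble_location"))       # prints "basket"
print(child.predict("Anne", "marble_location"))        # prints "box"
```

Note that Anne, about whom nothing counterfactual is stored, still gets the child's own belief by default.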

… At The Cost Of Self-Anchoring

In Epistemic Topography, I said this:

The curse of knowledge expects short inferential distances. Why does this bias (not another) live in our brains?

As we have seen, estimating [epistemic] location is expensive. So the brain takes a shortcut: it uses a location it already knows about (its own) and employs differences between the Self and the Other to estimate distance. Call this self-anchoring. But the brain isn’t aware of all differences, only those it observes. Hence the process of “pushing out” one’s estimation of Other Locations typically doesn’t go far enough… the birthplace of the curse.

We now have a mechanical explanation for this mental shortcut. The more differences there are between our beliefs and another person’s, the more data we must encode counterfactually. But counterfactuals are not prediction machines in their own right; they only facilitate tinkering with our own machinery. Other people are thus presumed similar until proven different.
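
To make “presumed similar until proven different” concrete, here is a small Python sketch. It assumes another mind is estimated as a copy of your own beliefs with only the observed differences patched in; the function names and the example beliefs are invented for illustration.

```python
def estimate_other(own_beliefs, observed_differences):
    # Self-anchoring: start from my own mind, then apply only the
    # differences I have actually noticed.
    estimate = dict(own_beliefs)
    estimate.update(observed_differences)
    return estimate


def misjudged(estimate, actual):
    # Slots where my estimate of the other person is wrong.
    return sorted(slot for slot in actual if estimate.get(slot) != actual[slot])


me = {"knows_jargon": True, "read_prior_posts": True, "knows_sally_anne": True}
reader = {"knows_jargon": False, "read_prior_posts": False, "knows_sally_anne": False}

# I have noticed only one of the three differences.
estimate = estimate_other(me, {"read_prior_posts": False})
print(misjudged(estimate, reader))  # prints "['knows_jargon', 'knows_sally_anne']"
```

Every unobserved difference silently inherits my own value, so the estimate never moves far enough from the anchor.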


Executive Summary:

  • Three year old children cannot conceive of other people being wrong. This is long after they become mind-aware, even after they become able to recognize deceit. What gives?
  • First, children are blind to false beliefs because all beliefs (mine and yours) are based in the same location: the mental library, or world model.
  • Second, falsehood detection ultimately evolved as a deception-detector; why should we expect it to also function as a wrongness-detector?
  • The ability to pretend (e.g., “This Floor Is Lava”) allows our minds to test out new beliefs, rejecting the failures and integrating the successes.
  • The ability to simulate counterfactuals opens up a new pathway to encode false beliefs.
  • However, this pathway doesn’t let us imagine other people independently: other minds are always self-anchored; that is, imagined as deviations from your own mind.


  • [Leslie 1987] Pretense and Representation: The Origins of “Theory of Mind”