An Overview of Moral Cognition

Part Of: Demystifying Ethics sequence
Content Summary: 1000 words, 10 min read

Propriety Frames

Main Article: An Introduction To Propriety Frames

The primate brain contains a diverse set of memory structures. For example, episodic and semantic memory store narratives and facts, respectively.

Propriety frames are a memory format that retains social intuitions. This form of memory permits normative judgments: assessments of which behaviors are appropriate or inappropriate.

Propriety frames are organized by situational context. A Restaurant frame provides the social intuitions for restaurant interactions. Frames are the substrate of rituals.

Frames are organized hierarchically to promote reuse. For example, the Eating frame is relevant in contexts besides restaurant dining.
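As a loose software analogy (not a claim about how the brain stores frames), hierarchical reuse can be sketched as a lookup that falls back to a parent frame. The `Frame` class and the specific norms listed here are hypothetical, chosen only for illustration.

```python
# Hypothetical sketch: propriety frames as a hierarchy with inherited norms.
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Frame:
    name: str
    parent: Optional["Frame"] = None
    # Maps a behavior to "appropriate" or "inappropriate" (illustrative values).
    norms: dict = field(default_factory=dict)

    def judge(self, behavior: str) -> str:
        """Look up a behavior in this frame, then fall back to ancestor frames."""
        frame = self
        while frame is not None:
            if behavior in frame.norms:
                return frame.norms[behavior]
            frame = frame.parent
        return "no intuition"


eating = Frame("Eating", norms={"chewing with mouth open": "inappropriate"})
restaurant = Frame("Restaurant", parent=eating, norms={"tipping the server": "appropriate"})
picnic = Frame("Picnic", parent=eating)
```

Here both `restaurant` and `picnic` reuse the Eating norm without storing it twice, while the tipping norm stays local to the Restaurant frame.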

Propriety Frames- Restaurant Example (2)

When a mother instructs her son to not yell in the store, the child installs an update to his Shopping frame. When a family exchanges gossip around a campfire, they are doing so in part to synchronize their propriety frames.

Intuition Generators

Main Article: Moral Foundations Theory

The contents of our social intuitions are not arbitrary. Our environment does not fully determine our moral sense. The brain also possesses innate social intuition generators, which contribute to the contents of social judgments.

As outlined in Moral Foundations theory, there are six such generators: authority, care, loyalty, fairness, autonomy, and sanctity.

People are genetically and environmentally disposed to respond to certain generators more strongly than others. Let social matrices be the emotional intensities elicited by each generator.

People with similar matrices tend to gravitate towards similar political parties. When you measure the social matrices of American citizens, you can see large differences between the social intuitions of Democrats and Republicans.

Theory of Normatives- Social Matrices by Political Party (1)

These differences in social matrices explain much of American politics.

  • Why do Democrats praise entitlements, but Republicans denounce them? Because Democrats heavily emphasize Care for the poor, whereas Republicans respond more strongly to questions of Fairness (exploitation and moral hazard).
  • Why are Democrats more skeptical of patriotism than their Republican counterparts? Perhaps because they respond to Loyalty to country less.

For more information, see Graham et al (2009). Liberals and conservatives rely on different sets of moral foundations.

Generators vs Frames

Main Article: A Dual-Process Theory of Moral Judgment

Two mechanisms contribute to social intuitions: propriety frames, and intuition generators.

Theory of Normatives- Frame vs Generator (1)

Generators are located subcortically, and thus produce intuitions quickly and automatically. Frames are stored in the cortical mantle, and are thus slower, and more amenable to conscious awareness.

As the work of Joshua Greene shows, deontological (rule-based) attitudes are generated rapidly, whereas consequentialist judgments are slower and more vulnerable to distraction. This suggests a relatively straightforward mapping:

  • People with a deontological style rely more heavily on their Generators.
  • People with a consequentialist style lean instead on their Frames.

Frames and Generators can influence one another, albeit slowly. Moral argumentation can change one’s mind, and these frame updates can percolate down to change one’s social matrices (intuition weights). Call this frame internalization.

Similarly, if one’s private intuitions diverge from a culturally inherited norm, that frame can be updated to be more consistent with one’s personality. Call this frame refinement.

For more information, see Cushman, Young, Greene (2010). Our multi-system moral psychology: Towards a consensus view

Cultural Regimes

Main Article: The Relational Sphere Hypothesis

Human communities generate emergent networks known as cultural regimes. These emerge as distinct categories of frames and intuition generators.

regime categories

Language made regime accretion possible. No longer is expertise confined to the skull of the individual. No longer is death a Full Restart button. Our species sends cultural expertise down through the generations. This knowledge – these frames – has become increasingly sophisticated over the course of human history.

Hominid evolution has also seen the advent of regime diversification. Modern religions (of the Axial Age) were successful because they improved on our ability to trust strangers. This in turn dramatically increased the size of feasible social groups.

Theory of Normativity- Evolution of Regimes (3)

Moral Tagging

We have so far only been speaking about social attitudes (should / should not). What about moral attitudes (good / evil)?

Moral attitudes are nothing more than a kind of social attitude. I know of no moral attitude that can be divorced from a social context.

Moral attitudes are constructed by moral tagging, which endows a subset of social attitudes with anger and disgust reactions (as opposed to the more typical reputation appraisal, gossiping, shaming).

Further, moral tagging produces appraisal inflexibility. Moral violations are viewed as wrong everywhere, in every context. This is in contrast to other social violations, for which it is easier to see circumstances in which the behavior would seem less bad.

The boundary between Virtuous and Tolerable is smooth, reflecting the flexibility of our intuition generators. In contrast, the boundary between Tolerable and Intolerable is sharp.

Moral Flinch- Latitudes of Acceptance (2)

A similar distinction appears amidst disagreement. There are two kinds of ways people’s judgments can diverge:

  • Moral disagreement: is a given behavior Evil?
  • Social disagreement: is a given behavior Inappropriate?

moral vs social disagreement

For more information, see Tetlock et al (2000). The psychology of the unthinkable: taboo trade-offs, forbidden base rates, and heretical counterfactuals.

Norm Synchronization

The space of possible social intuitions is vast. However, group living only became possible with relatively homogeneous norms. How do individual brains synchronize propriety frames and social matrices within a group?

At least four mechanisms provide norm synchronization.

  • Natural language and facial expressions are used to communicate social and emotional information.
  • Reputation systems benefit social beings who are especially adept at learning & conforming to the frames of their peers.
  • First-order punishment emotions (anger at a transgressor) incentivize people to not violate implicit societal expectations.
  • Second-order punishment (anger at those who are accepting of a transgressor) incentivizes a community to respond to, and develop a unified response to, dyadic disagreements.


  • In Propriety Frames, we saw how the brain retains social information.
  • Limbic machinery, such as the Care module, generates normative intuitions.
  • Generators and frames interact to facilitate both top-down and bottom-up learning.
  • A subset of social violations are tagged as morally intolerable.
  • Several emotion devices work to consolidate normatives within a group. 

Related Work

  • Graham, Haidt, Nosek (2009). Liberals and conservatives rely on different sets of moral foundations.
  • Cushman, Young, Greene (2010). Our multi-system moral psychology: Towards a consensus view
  • Tetlock et al (2000). The psychology of the unthinkable: taboo trade-offs, forbidden base rates, and heretical counterfactuals.

Two Cybernetic Loops

Part Of: Neuroanatomy sequence
Content Summary: 800 words, 8 min read

What Is Perception About?

Consider Aristotle’s five senses: vision, hearing, smell, touch, and taste. We know that senses are windows into physical reality. But what aspects of reality do these represent?

Vision and hearing have a special property: despite receptors being located within the body (proximal), they carry information about phenomena outside of the body (distal). They carry information about the world. In contrast, smell, touch, and taste only represent events close to the body; these encode the interaction between body and world.

This distinction is a neural primitive: the brain encodes World and Interaction in extrapersonal and peripersonal space, respectively.

However, there is a significant lacuna within this binary system: none of these concern the body. Body sensation is a crucial “sixth sense”:


Making Sense of Anatomy

We spend a lot of time discussing the nervous system. But the body houses eight other anatomical systems: reproductive, integumentary (skin), muscular, skeletal, endocrine (hormones), digestive (incl. urinary and excretory subsystems), circulatory (incl. immune and lymphatic subsystems), and respiratory.

To regulate these systems, your brain recruits the following peripheral nervous systems:

  1. Somatic, which contains spinal nerves and cranial nerves
  2. Autonomic, incl. the sympathetic “fight/flight” and parasympathetic “rest/digest”
  3. Neuroendocrine, incl. the HPA, HPG, HPT, and Neurohypophyseal axes
  4. Enteric, also called the “second brain”, a large mass of digestion-oriented neurons
  5. Neuroenteric, connects enteric nervous system via microbiome-gut-brain axis
  6. Neuroimmune, recently discovered, primarily mediated by glial cells
  7. Glymphatic, recently discovered, which removes metabolites via CSF during sleep
  8. Neurogaseous, recently discovered, mediated by gasotransmission

The CNS must coordinate all of these to respond to sense data and regulate anatomical systems. A complex undertaking. How might we understand such a process?

With the above trichotomy { world, interaction, body }, anatomical and sensory systems can be organized into meaningful categories:


The Interlocking Loop Hypothesis posits the existence of two perception-action loops, inhabiting a gradient of abstraction:

  1. The somatic “cold” loop, world- and interaction-oriented, from exteroception to movement.
  2. The visceral “hot” loop, body-oriented, from interoception to body regulation.

Loops As Organizing Principle

Evidence for the Interlocking Loop Hypothesis comes from two anatomical principles of organisation:

First, the Bell-Magendie Law is based on the observation that, in all chordates, sensory information is processed at the back of the brain, and behavioral processes are at the front (“posterior perception, anterior action”):

Cybernetics- Posterior Perception, Anterior Action

Second, the Medial Viscera Principle is the observation that visceral processes tend to reside in the center of the brain (medial regions):


Thus we can see our loops clustering at different levels of the abstraction hierarchy.

We can also see our loops’ primary site of convergence:

Anatomically, the two loops converge on the basal ganglia, in which both somatic and visceral processes are blended to yield coherent behavior.


The above quote & image are from Panksepp (1998), Affective Neuroscience.

The Basis of Motivation

Why should our two loops converge on the basal ganglia? The basal ganglia is the substrate of motivation, or “wanting”. It also participates in reinforcement learning, which is formalized mathematically using Markov Decision Processes (MDPs).

Historically, the reward function in MDPs has proven difficult to interpret biologically; however, this task becomes straightforward on the Interlocking Loop Hypothesis. Of course the cold loop would tune its behavior to promote the hot loop’s efforts to keep the organism alive.
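One way to read this claim is to define the MDP reward as the reduction in the hot loop's homeostatic error. The following is a minimal sketch under that assumption; the body variables, setpoints, and action effects are made up for illustration and are not taken from the literature.

```python
# Hypothetical sketch: an MDP reward defined by the hot loop's homeostatic error.
# The variables, setpoints, and action effects are illustrative assumptions.

SETPOINT = {"energy": 1.0, "hydration": 1.0}

# Effect of each cold-loop action on the body state (made-up numbers).
ACTION_EFFECTS = {
    "eat": {"energy": 0.3, "hydration": 0.0},
    "drink": {"energy": 0.0, "hydration": 0.3},
    "rest": {"energy": -0.05, "hydration": -0.05},
}


def drive(body):
    """Total deviation of the body state from its setpoints."""
    return sum(abs(SETPOINT[k] - body[k]) for k in SETPOINT)


def step(body, action):
    """Apply an action; return the next body state and the resulting reward."""
    nxt = {k: min(SETPOINT[k], body[k] + ACTION_EFFECTS[action][k]) for k in body}
    reward = drive(body) - drive(nxt)  # positive when the action reduces need
    return nxt, reward


def greedy_action(body):
    """Pick the action with the highest immediate reward."""
    return max(ACTION_EFFECTS, key=lambda a: step(body, a)[1])
```

With a hungry body state such as `{"energy": 0.4, "hydration": 0.9}`, `greedy_action` selects `"eat"`: the cold loop's choice is driven entirely by what the hot loop currently needs.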


The Basis of Consciousness

In Can Consciousness Be Explained?, I wrote:

Let me put forward a metaphor. Consciousness feels like the movies. More specifically, it comprises:

  1. The Mental Movie. What is the content of the movie? It includes data captured by your eyes, ears, and other senses.
  2. The Mental Subject. Who watches the movie? Only one person, with your goals and your memories – you!

On this view, to explain consciousness one must explain the origins, mechanics, and output of both Movie and Subject. (Of course, one must be careful that the Subject is not a homunculus, on pain of recursion!)

The Interlocking Loop hypothesis offers an obvious foothold in the science of consciousness:

  • The world-centric cold loop generates the Mental Movie (“a world appears”). 
  • The body-centric hot loop creates the Subject (“narrative center of gravity”)

Thus, we are no longer surprised that opioid anomalies (a visceral loop instrument) are linked to depersonalization disorders; whereas dopamine (the promoter of somatic behavior) is associated with subjective time dilation effects.


First, we introduced the Interlocking Loop Hypothesis:

  • Some perceptions are about the world, others are about the body.
  • The CNS comprises a visceral, body-centric hot loop and a somatic, world-centric cold loop.
  • Bell-Magendie Law: perception for both loops is posterior, action is anterior.
  • Medial Viscera Principle: hot loop is located medially, while cold loop is more lateral.

Then, we examined its implications:

  • Motivation, as generated by the basal ganglia, is loop communication software; it allows the hot loop to influence cold loop behavior.
  • Consciousness has two components: the Mental Movie and Mental Subject. These are supported by cold and hot loops, respectively.

Until next time.

Relevant Materials

  • Northoff & Panksepp (2008). The trans-species concept of self and the subcortical–cortical midline system


Evolution of the Basal Ganglia

Part Of: [Neuroeconomics] sequence
Followup To: [An Introduction to the Basal Ganglia]

Natural History

The Earth accreted from a protoplanetary disc 4.5 billion years ago (Ga). Geologists break up Earth’s history into four eons: the Hadean, Archaean, Proterozoic, and Phanerozoic.

At 3.8 Ga, abiogenesis occurred, and the sea was awash with bacteria. Since then, there have been five major events in the history of life.

  1. At 1.85 Ga, the merging of prokaryotic cells (symbiogenesis) led to the advent of eukaryotes, whose organelles improved cellular flexibility.
  2. At 800 Ma, the advent of multicellularity: some eukaryotes discovered ways to act meaningfully in groups.
  3. At 580 Ma, animal-like adaptations, such as motility and ability to consume other living matter (heterotrophy), set off the Cambrian Explosion.
  4. At 380 Ma, some animals developed four limbs (tetrapods) and the ability to become terrestrial animals.
  5. At 320 Ma, some terrestrial animals developed mammary glands, and saw the spark of the mammals.


We can use the tree of life to better understand these anatomical milestones. Since all life on this planet is related (common descent), we can represent familial relations just as you would on a family tree. Key innovations in organism body-plans can be embedded in such graphics, as follows:


When confronted with some biological structure, we can employ comparative anatomy to discover its origin. If an adaptation is shared across multiple species, we can infer either homology (the innovation of some common ancestor) or homoplasy (an adaptation appearing independently, a.k.a “convergent evolution”).  

For example, the spine is a homology; whereas homeothermy (warm-bloodedness) and multicellularity are homoplasies. 

Full Circuit in Vertebrates

Last time, we discussed the basal ganglia, a brain structure that is intimately involved in motivation and behavior. Here, we use comparative anatomy to discover the evolutionary origin of the basal ganglia. By dissecting brains from eight representative species, we can infer that the basal ganglia dates back to the origin of vertebrates:


Specifically, here are the frontal sections of the eight species. By employing sophisticated histochemistry techniques such as TH-immunostaining, we are able to directly visualize the striatal and pallidal regions of the representative basal ganglia.

This investigation was conducted by Anton Reiner in his aptly-titled 2009 paper, You cannot have a vertebrate brain without a basal ganglia. The basal ganglia is not the “reptile brain”, contra the triune brain hypothesis. It is, in fact, much older.

Ancient Subcortical Loops

One of the key structures in the midbrain is the corpora quadrigemina (Latin for “four bodies”). It is composed of bilateral expressions of the superior colliculus (SC), and the inferior colliculus (IC). Anatomically, these structures are four bumps at the posterior of the midbrain; for this reason, the corpora quadrigemina is also called the tectum (Latin for “roof”).


The SC receives inputs from the retina, via the LGN of the thalamus. The IC receives input from the auditory system, and projects to the MGN of the thalamus. For this reason, it is tempting to describe these structures as a vision center and an auditory center, respectively.

However, there is more to the story. SC and IC represent space topographically, and densely innervate one another. They seem to participate in coordinate transformations, which integrate multimodal sensory information. The SC and IC are actually composed of distinct anatomical regions, each of which performs specialized tasks. Importantly, the SC Deep Layer functions as a control center: basically, a predecessor of the motor cortex.


We have seen the basal ganglia processing information from the neocortex. But the neocortex is a mammalian innovation. What did the basal ganglia do before the invention of the neocortex? If you look carefully at the basal ganglia, you can actually see afferents from the GPi / SNr / VP node into the superior colliculus (SC). It turns out that the SC drives its own loop through the basal ganglia:


The basal ganglia evolved as a general-purpose reinforcement learning device, assisting behavioral computations of the superior colliculus. As motor cortex M1 began to complement and compete with the SC for motor control, it was also built on top of basal ganglia loops.

For more details, see McHaffie et al (2005). Subcortical Loops in the Basal Ganglia  

Simplified Circuit in Arthropods

Insects (arthropods) have been around since long before vertebrates, evolving around the Cambrian period. We saw above that insects (arthropods) have a nerve cord: a predecessor of the spinal cord. Each segment of the body corresponds with a nerve bundle called a ganglion. The head segment of insects, called the cephalon, is particularly important insofar as its associated ganglion (the cerebral ganglion) is the direct predecessor of the brain.

Within the cerebral ganglion, we find structures called neuropiles (analogous to modern-day nuclei) which perform specific functions:


One such structure (located in the protocerebrum), is the central complex (in above diagram, called the central body, CB). The central complex contains a fan-shaped body which strikingly resembles the mammalian striatum:


The similarities do not stop there. The basal ganglia and central complex share homologous circuitry, and are even created by the same genetic material. In fact, we can conclude that they are the same structure, with different names. 


Recall that the basal ganglia contains two pathways: direct and indirect. The central complex does not have an indirect pathway! This suggests that the indirect pathway evolved later, as an elaboration of more primitive motivation circuitry.

For more information, see Strausfeld & Hirth (2013). Deep Homology of Arthropod Central Complex and Vertebrate Basal Ganglia. See this response, however, for a critique.

The Evolution of Dopamine

Dopamine plays a key role in behavioral readiness. The basal ganglia contains ten times more dopamine receptors than any other brain area. When did dopamine evolve? Recall that, as a catecholamine, dopamine (DA) is heavily related to norepinephrine (NE) and epinephrine (EPI):


In order for these neurotransmitters to influence the nervous system, neurons must have receptors responsive to the aforementioned chemicals. Our question becomes, when did these receptors evolve?

By genomic analysis, we can confirm that DA transporters (DAT) came into existence with the invention of bilateral symmetry. This basal bilaterian also contained transporters for serotonin (SERT) and a highly flexible transporter for monoamines (MAT).

In protostomes, the MAT gene was destroyed via mutation, and replaced with the octopamine transporter (OAT). Let me repeat that. Dopamine is not used by insects etc: instead, related chemicals tyramine and octopamine (bolded above) are used in its place. 

History was not much kinder for the deuterostomes, whose dopamine transporter was destroyed. However, this clade duplicated the MAT gene to resurrect dopamine receptivity in subsequent chordates (cDAT).


The above analysis clearly demonstrates the volatility of natural selection, and how natural selection uses the resources at its disposal to construct neurotransmitter systems like dopamine. For more information, see Caveney et al (2006). Ancestry of neuronal monoamine transporters in the Metazoa.


  • Comparative anatomy dates the emergence of the basal ganglia to at least as early as the vertebrate clade.
  • The basal ganglia also supports the “control center” of the Deep Layer of the SC, which predates its support of neocortex.
  • Incredibly, the basal ganglia predate the brain, having originated prior to arthropods (insects)! The central complex is the arthropod counterpart of the vertebrate basal ganglia.
  • The arthropod version of the basal ganglia does not include an indirect pathway. This innovation happened later.
  • Prior to the creation of the basal ganglia, dopamine assumed its role in promoting behavior near the invention of the core animal body-plan.

We will close by condensing these discoveries into a single graphic:


Until next time.

An Introduction To Behaviorism

Part Of: Neuroeconomics sequence
Content Summary: 1100 words, 11 min read

Historical Context

William James (1842-1910), the “father of American psychology”, was also a world-renowned philosopher who, together with CS Peirce and John Dewey, founded the philosophical school of pragmatism. This illustrates that, in its early days, psychology (along with many other sciences) was more closely interwoven with philosophy.

This connection can be seen into the 1930s, when two related movements conquered the intellectual zeitgeist. In analytic philosophy, logical positivism from the Vienna Circle quickly gained traction. Logical positivism partitioned language into two components: synthetic statements (“this leaf is green”) and analytic statements (“all bachelors are unmarried”). Further, it relied on the verification principle, that concepts are only meaningful by virtue of the operations used to measure them.

Concurrently, BF Skinner inaugurated the research tradition of behaviorism, which focused on relationships between Stimulus and Response (SR associations). Influenced by positivism, Skinner also promoted radical behaviorism, which insisted that talk about subjective experience, and all mental phenomena, had no meaning.

In 1951, Quine penned his Two Dogmas of Empiricism, which marked the death knell of logical positivism. And in 1959, Noam Chomsky published his Review Of Skinner’s Verbal Behavior, a devastating takedown of radical behaviorism. Chomsky helped usher in the cognitive revolution, with its key metaphor of brain as computer. Importantly, just as we can inspect the inner workings of a computer, we can also hope to learn about events that transpire between our ears.

I disagree with the philosophy of radical behaviorism. But the empirical results of behaviorism persist, and demand explanation. So today, let’s explore what this research programme learned some sixty years ago.

Classical Conditioning

The most important result from behaviorist experiments is conditioning, a robust form of associative learning. It comes in two flavors:

  1. Classical conditioning is about learning in behavior-irrelevant situations (where e.g., a rat’s behavior doesn’t affect foot shock).
  2. Instrumental conditioning is about learning in behavior-relevant situations (where e.g., a rat’s behavior affects foot shock).

To illustrate the former, let’s turn to a famous experiment by Pavlov. Some behaviors seem innate: you don’t need to teach a rat to dislike pain, and you don’t need to teach a dog to salivate when presented with food. In such an innate association, call the trigger the unconditioned stimulus (US) and the reaction the unconditioned response (UR). In contrast, neutral stimuli fail to elicit meaningful behavior.


Pavlov’s insight: if you consistently ring a bell before providing food, the food-salivation association will change! The salivation reflex will travel forward in time, towards the conditioned stimulus, or CS (in this case, the bell).


Note the CS bell serves as a predictor of reward. Since salivation readies the mouth for digestion, it makes sense for the brain to initiate this preparation as soon as it learns of an imminent meal.
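To make "predictor of reward" concrete, here is a minimal sketch using the Rescorla-Wagner update, a standard textbook model of classical conditioning that this post does not itself discuss. The learning rate `alpha` and the reward magnitude are arbitrary assumptions.

```python
def rescorla_wagner(trials, alpha=0.3, reward=1.0):
    """Return the bell's associative strength V after each bell-then-food pairing.

    alpha (learning rate) and reward (food magnitude) are illustrative values.
    """
    v = 0.0  # the bell starts out as a neutral stimulus
    history = []
    for _ in range(trials):
        prediction_error = reward - v  # food is surprising at first
        v += alpha * prediction_error  # the bell absorbs some of that surprise
        history.append(v)
    return history


strengths = rescorla_wagner(20)
```

The strength climbs quickly at first and then levels off near the reward magnitude: once the bell fully predicts the food, there is nothing left to learn.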

On Carrots And Sticks

Have you ever heard the expression “should I use a carrot or stick”? This idiom derives from a cart driver dangling a carrot in front of a mule and holding a stick behind it. The claim is that there are basically two routes to alter behavior: positive and negative feedback.

But this idiom is incomplete. We must also consider the effect of removing carrots & sticks. So there are four ways to alter behavior:


Let us reorganize this taxonomy, by grouping actions that increase or decrease behavior, respectively:
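The four-way taxonomy can be sketched as a small lookup table. The labels follow standard operant-conditioning terminology; the function name and the "carrot"/"stick" encoding are assumptions made for illustration.

```python
def classify(stimulus: str, change: str) -> str:
    """Name the procedure given a stimulus and what is done with it.

    stimulus: "carrot" (pleasant) or "stick" (unpleasant)
    change:   "add" or "remove"
    """
    table = {
        # These two increase the preceding behavior.
        ("carrot", "add"): "positive reinforcement",
        ("stick", "remove"): "negative reinforcement",
        # These two decrease the preceding behavior.
        ("stick", "add"): "positive punishment",
        ("carrot", "remove"): "negative punishment",
    }
    return table[(stimulus, change)]
```

Note that "positive" and "negative" describe whether something is added or removed, not whether the outcome is pleasant.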


Instrumental Conditioning

Of course, proverbs can be wrong! For example, the “sugar high” myth remains alive and well, at least in my social circles. Do “carrots and sticks” really alter behavior?

Back in 1898, Thorndike conducted his Puzzle Box experiment, which trapped a cat in a small space, with a hidden lever that facilitated escape. On the first trial, the cat relied heavily on its innate escape behaviors: scratching at bars, pushing at the ceiling, etc. However, the only behavior that facilitated escape was pressing the lever. After several repetitions of this experiment, the cat learned not to waste time scratching etc: it pressed the lever immediately.


This is evidence of negative reinforcement: after pressing a lever, an unpleasant state (confinement) was removed, and frequency of lever-pressing subsequently increased.

By now, strong evidence supports instrumental learning from all four kinds of reinforcements and punishments. The Law of Effect expresses this succinctly:

Responses that produce a satisfying effect in a particular situation become more likely to occur again in that situation, and responses that produce a discomforting effect become less likely to occur again in that situation.
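The Law of Effect can be illustrated with a toy simulation of the puzzle box. This is a sketch, not a model of Thorndike's data: the starting strengths, step size, and floor value are arbitrary assumptions.

```python
import random


def puzzle_box(trials=200, step=0.2, seed=0):
    """Simulate response strengths for a cat in a puzzle box.

    Returns the final probability of each response.
    """
    rng = random.Random(seed)
    # All responses start equally likely (hypothetical starting strengths).
    strength = {"scratch bars": 1.0, "push ceiling": 1.0, "press lever": 1.0}
    for _ in range(trials):
        responses = list(strength)
        weights = [strength[r] for r in responses]
        choice = rng.choices(responses, weights=weights)[0]
        if choice == "press lever":
            strength[choice] += step  # satisfying effect: escape
        else:
            # Discomforting effect: still trapped. Keep a small floor so the
            # response never vanishes entirely.
            strength[choice] = max(0.1, strength[choice] - step)
    total = sum(strength.values())
    return {r: s / total for r, s in strength.items()}


probs = puzzle_box()
```

Because only lever-pressing is ever strengthened, its share of the cat's behavior grows from trial to trial while scratching and pushing fade.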

This effect occurs in nearly all biological organisms! This suggests that the brain mechanisms underlying this ability are highly conserved across species. But I will leave my remarks on the biological substrate of conditioning for another post. 🙂


Reinforcement and punishment are powerful learning tools. One successful research programme of behaviorism is behavioral control: engineering the right sequence of positive & negative outcomes to dictate an organism’s behavior.

Shaping is an important vehicle for behavioral control. If you desire an animal to exhibit some behavior (even one that would never occur naturally), you simply apply differential reinforcement of successive approximations. Take, for example, rat basketball:

Animal training relies heavily on shaping techniques. In the words of BF Skinner,

By reinforcing a series of successive approximations, we bring a rare response to a very high probability in a short time. … The total act of turning toward the spot from any point in the box, walking toward it, raising the head, and striking the spot may seem to be a functionally coherent unit of behavior; but it is constructed by a continual process of differential reinforcement from undifferentiated behavior, just as the sculptor shapes his figure from a lump of clay.

The topic of behavioral control is often met with discomfort: don’t such findings empower manipulative people? I want to add two comments here:

  • You don’t see primates shaping each other’s behaviors, despite its obvious adaptive value. Why? I suspect primates like ourselves possess emotional software that detects and punishes social manipulation. Specifically, I suspect our intuitions about personal autonomy and moral inflexibility evolved for precisely this purpose.
  • Shaping, and associative learning, are not unlimited in scope. You can shape a rat to play basketball, but shaping will completely fail to produce e.g., self-starvation. The brain is not a tabula rasa, and it cannot be stretched beyond its biological constraints.


  • Philosophically, radical behaviorism collapsed in the 1970s. But it left behind important empirical results.
  • A wide swathe of animals exhibit classical conditioning, which is learning to associate innate responses with (previously meaningless) predictors.
  • Extensive evidence also suggests the brain can perform instrumental conditioning, a more behaviorally-relevant form of learning.
  • Specifically, the Law of Effect states that satisfying outcomes increase the preceding behavior, and vice versa.
  • The instrumental conditioning technique of shaping is still used today by animal trainers to install utterly novel behaviors in animals, such as rats playing basketball. 🙂

For another look at conditioning, I recommend this video.

Until next time.

An Introduction to Prospect Theory

Part Of: [Neuroeconomics] sequence
Content Summary: 1500 words, 15 min reading time


Decisions are bridges between perception and action. Not all decisions are cognitive. Instead, they occur at all levels of the abstraction hierarchy, and include things like reflexes. 

Theories of decision tend to constrain themselves to cognitive phenomena. They come in two flavors: descriptive (“how does it happen”) and normative (“how should it happen”).

Decision making often occurs in the context of imperfect knowledge. We may use probability theory as a language to reason about uncertainty.

Let risk denote variance in the probability distribution of possible outcomes. Risk can exist regardless of whether a potential loss is involved. For example, a prospect that offers a 50-50 chance of paying $100 or nothing is more risky than a prospect that offers $50 for sure – even though the risky prospect entails no possibility of losing money.
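The two prospects in this example can be checked directly. The sketch below represents a prospect as a list of (probability, payoff) pairs, a representation chosen here for convenience.

```python
def expected_value(prospect):
    """prospect: list of (probability, payoff) pairs whose probabilities sum to 1."""
    return sum(p * x for p, x in prospect)


def variance(prospect):
    """Risk, as defined above: the variance of the payoff distribution."""
    mean = expected_value(prospect)
    return sum(p * (x - mean) ** 2 for p, x in prospect)


gamble = [(0.5, 100.0), (0.5, 0.0)]  # 50-50 chance of $100 or nothing
sure_thing = [(1.0, 50.0)]           # $50 for sure
```

Both prospects have an expected value of $50, but the gamble has a variance of 2500 while the sure thing has a variance of 0. The gamble is riskier even though neither prospect can lose money.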

Today, we will explore the history of decision theory, and the emergence of prospect theory. As the cornerstone of behavioral economics, prospect theory provides an important theoretical surface to the emerging discipline of neuroeconomics.

Maximizing Profit with Expected Value

Decision theories date back to the 17th century, and a correspondence between Pascal and Fermat. There, consumers were expected to maximize expected value (EV), which is defined as probability p multiplied by outcome value x.

EV = px

To illustrate, consider the following lottery tickets:


Suppose each ticket costs 50 cents, and you have one million dollars to spend. Crucially, it doesn’t matter which ticket you buy! Each of these tickets has the same expected value: $1. Thus, it doesn’t matter if you spend the million dollars on A, B, or C – each leads to the same amount of expected profit.

The above tickets have equal expected value, but they do not have equal risk. We call people who prefer choice A risk averse; whereas someone who prefers C is risk seeking.

Introducing Expected Utility

Economic transactions can be difficult to evaluate. When trading an apple for an orange, which is more valuable? That depends on a person’s unique tastes. In other words, value is subjective.

Let utility represent subjective value. We can treat utility as a function u() that operates on objective outcome x. Expected utility, then, is highly analogous to expected value:

EU = pu(x)

Most economists treat utility functions as abstractions: people act as if motivated by a utility function. Neuroeconomic research, however, suggests that utility functions are physically constructed by the brain.

Every person’s utility function may be different. If a person’s utility curve is linear, then expected utility converges onto expected value:

EU \rightarrow EV \mid u(x) = x

Recall, in the above lottery, the behavioral distinction between risk-averse (preferring ticket A) and risk-seeking (preferring C). Well, in practice most people prefer A. Why?

We can explain this behavior by appealing to the shape of the utility curve! Utility concavity produces risk aversion:

Prospect Theory- Utility Convexity & Risk Aversion

In the above, we see the first $50 (first vertical line) produces more utility (first horizontal line) than the second $50.

Intuitively, the first $50 is needed more than the second $50. The larger your wealth, the less your need. This phenomenon is known as diminishing marginal utility.
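A small numerical sketch may help here. It assumes a square-root utility function, which is purely an illustrative choice of concave curve (nothing in the theory singles it out), and reuses the $50-for-sure versus 50-50-for-$100 prospects from the introduction.

```python
import math


def u(x):
    # Assumed concave utility; the square root is an illustrative choice only.
    return math.sqrt(x)


def expected_utility(prospect, utility):
    """Sum of probability times utility over (probability, outcome) pairs."""
    return sum(p * utility(x) for p, x in prospect)


gamble = [(0.5, 100.0), (0.5, 0.0)]
sure_thing = [(1.0, 50.0)]

# Diminishing returns: the first $50 adds more utility than the second $50.
first_50 = u(50) - u(0)
second_50 = u(100) - u(50)
print(first_50 > second_50)  # True

# Risk aversion: the sure $50 beats the gamble, despite equal expected value.
print(expected_utility(sure_thing, u) > expected_utility(gamble, u))  # True
```

Any concave curve would give the same qualitative result; only the size of the gap depends on the particular function chosen.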

Neoclassical Economics

In 1947, von Neumann and Morgenstern formulated a set of axioms that are both necessary and sufficient for representing a decision-maker’s choices by the maximization of expected utility.

Specifically, if you assume an agent’s preference set accommodates these axioms…

1. Completeness. People have preferences over all lotteries.

\forall L_1, L_2 \in L either L_1 \leq L_2 or L_1 \geq L_2 or L_1 = L_2

2. Transitivity. Preferences are expressed consistently.

\forall L_1, L_2, L_3 \in L if L_1 \leq L_2 and L_2 \leq L_3 then L_1 \leq L_3

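3. Continuity. Preferences can be expressed in terms of probabilities: an intermediate lottery can be bracketed by probability mixtures of a better and a worse lottery.

\forall L_1, L_2, L_3 \in L with L_1 \geq L_2 \geq L_3, \exists \alpha, \beta \in [0, 1] s.t. \alpha L_1 + (1-\alpha)L_3 \geq L_2 \geq \beta L_1 + (1 - \beta)L_3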

4. Independence of Irrelevant Alternatives (IIA). Binary preferences don’t change by injecting a third lottery.

… then those preferences always maximize expected utility.

L_1 \geq L_2 iff \sum p_1 u(x_1) \geq \sum p_2 u(x_2)

The above axioms constitute expected utility theory, and form the cornerstone for neoclassical economics.  Expected utility theory bills itself as both a normative and descriptive theory: that we understand human decision making, and have a language to explain why it is correct.

Challenges To Independence Axiom

In the 1970s, expected utility theory came under heavy fire for failing to predict human behavior. The emerging school of behavioral economics gathered empirical evidence that the von Neumann-Morgenstern axioms were routinely violated in practice, especially the Independence Axiom (IIA).

For example, the Allais paradox asks our preferences for the following choices:


Most people prefer A (“certain win”) and D (“bigger number”). But these preferences are inconsistent, because C = 0.01A and D = 0.01B. The independence axiom instead predicts that A ≽ B if and only if C ≽ D.
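To see why no expected-utility maximizer can prefer both A and D, here is a hedged sketch. The payoffs are hypothetical stand-ins (a sure $1,000,000 versus a 98% chance of $5,000,000), not the figures from the choices above. The three utility functions are likewise illustrative assumptions. Lotteries C and D are built by scaling every winning probability by 0.01.

```python
import math


def expected_utility(lottery, u):
    """Sum of probability times utility over (probability, prize) pairs."""
    return sum(p * u(x) for p, x in lottery)


def scale(lottery, k):
    """Multiply each prize probability by k; the remaining 1 - k pays nothing."""
    return [(k * p, x) for p, x in lottery] + [(1.0 - k, 0.0)]


# Hypothetical payoffs, chosen only to illustrate the structure of the paradox.
A = [(1.0, 1_000_000.0)]
B = [(0.98, 5_000_000.0), (0.02, 0.0)]
C = scale(A, 0.01)
D = scale(B, 0.01)

# Assumed utility curves, from risk neutral to strongly risk averse.
utilities = {
    "linear": lambda x: x,
    "square root": math.sqrt,
    "saturating": lambda x: 1.0 - math.exp(-x / 200_000.0),
}

results = {}
for name, u in utilities.items():
    prefers_A = expected_utility(A, u) >= expected_utility(B, u)
    prefers_C = expected_utility(C, u) >= expected_utility(D, u)
    results[name] = (prefers_A, prefers_C)
    print(name, "| A over B:", prefers_A, "| C over D:", prefers_C)
```

Whatever the curve, the two answers agree: scaling both lotteries by the same factor scales the gap in expected utility by that factor, so it can never flip the ranking. Preferring A and D together therefore cannot be produced by any choice of u(x).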

The Decoy effect is best illustrated with popcorn:


Towards a Value Function

Concurrently with these criticisms of the independence axiom, the heuristics and biases literature (led by Kahneman and Tversky) began to discover new behaviors that demanded explanation:

  • Risk Aversion. In most decisions, people tend to prefer smaller variance in outcomes.
  • Loss Aversion. Everyone prefers gains over losses, of course. But losses are also felt more intensely than gains of equal magnitude.
  • The Endowment Effect. Things you own are intrinsically valued more highly. Framing decisions as gains or as losses affects choice behavior.

Prospect Theory- Behavioral Effects Economic Biases (1)

Each of these behavioral findings violates the Independence Axiom (IIA), and cumulatively they demanded a new theory. And in 1979, Kahneman and Tversky put forward prospect theory to explain all of the above effects.

Their biggest innovation was to rethink the utility function. Do you recall how neoclassical economics appealed to u(x) concavity to explain risk aversion? Prospect theory takes this approach yet further, and seeks to explain all of the above behaviors using a more complex shape of the utility function.

Let value function \textbf{v(x)} represent our updated notion of utility. We can define the expected prospect \textbf{EP} as probability multiplied by the value function:

EP = pv(x)

Terminology aside, each theory only differs in the shape of its outcome function.

Prospect Theory- Evolution of Utility Function (3)

Let us now look closer at the shape of v(x):

Prospect Theory- Value Function.png

This shape allows us to explain the above behaviors:

The endowment effect captures the fact that we value things we own more highly. The reference point in v(x), where x = 0, captures the status quo. Thus, the reference point allows us to differentiate gains and losses, thereby producing the endowment effect.

Loss aversion captures the fact that losses are felt more strongly than gains.  The magnitude of v(x) is larger in the losses dimension. This asymmetry explains loss aversion.

We have already explained risk aversion by concavity of the utility function u(x). v(x) retains concavity for material gains. Thus, we have retained our ability to explain risk aversion in situations of possible gains. For losses, v(x) convexity predicts risk seeking.
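These three properties can be checked numerically. The sketch below assumes a power-law value function with the parameter estimates commonly quoted from Tversky and Kahneman's 1992 follow-up work (exponent 0.88, loss multiplier 2.25). Both the functional form and the numbers are assumptions brought in for illustration, not something this article derives.

```python
def v(x, alpha=0.88, lam=2.25):
    """Assumed value function, measured from the reference point x = 0.

    Gains are valued as x**alpha; losses as -lam * (-x)**alpha, so the
    loss limb is steeper than the gain limb.
    """
    if x >= 0:
        return x ** alpha
    return -lam * (-x) ** alpha


# Loss aversion: losing $100 hurts more than winning $100 pleases.
loss_aversion = abs(v(-100)) > v(100)

# Gains: a sure $50 beats a 50-50 shot at $100 (risk aversion).
risk_averse_gains = v(50) > 0.5 * v(100)

# Losses: a 50-50 shot at losing $100 beats a sure loss of $50 (risk seeking).
risk_seeking_losses = 0.5 * v(-100) > v(-50)

print(loss_aversion, risk_averse_gains, risk_seeking_losses)  # True True True
```

Note that the probabilities here are still used at face value, exactly as in EP = pv(x).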

Towards a Weight Function

Another behavioral discovery, however, immediately put prospect theory in doubt:

  • The Fourfold Pattern. For situations that involve very high or very low probabilities, participants often switch their approaches to risk.

To be specific, here are the four situations and their resultant behaviors:

  1. Fear of Disappointment. With a 95% chance to win $100, most people are risk averse.
  2. Hope To Avoid Loss. With a 95% chance to lose $100, most people are risk seeking.
  3. Hope Of Large Gain. With a 5% chance to win $100, most people are risk seeking.
  4. Fear of Large Loss. With a 5% chance to lose $100, most people are risk averse.

Crucially, v(x) fails to predict this behavior. As we saw in the previous section, it predicts risk aversion for gains, and risk seeking for losses:

Prospect Theory- Fourfold Pattern Actual vs Expected (2)

Failed predictions are not a death knell to a theory. Under certain conditions, they can inspire a theory to become stronger!

Prospect theory was improved by incorporating a more flexible weight function.

EP = pv(x) \rightarrow EP = w(p)v(x)

Where w(p) has the following shape:

Prospect Theory- Weight Function (1)

These are in fact two weight functions:

  1. Explicit weights represent probabilities learned through language; e.g., when reading the sentence “there is a 5% chance of reward”.
  2. Implicit weights represent probabilities learned through experience, e.g., when the last 5 out of 100 trials yielded a reward.

This change adds some mathematical muscle to the ancient proverb:

Humans don’t handle extreme probabilities well.

And indeed, the explicit weight function successfully recovers the fourfold pattern:
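As a rough check, the sketch below compares each gamble against receiving its expected value for sure. It assumes the one-parameter weighting function and parameter estimates usually attributed to Tversky and Kahneman (1992): gamma = 0.61, with the same value-function parameters as before. Using a single gamma for both gains and losses is a further simplification. None of these specifics come from this article.

```python
def v(x, alpha=0.88, lam=2.25):
    """Assumed value function: power law, with losses scaled up by lam."""
    return x ** alpha if x >= 0 else -lam * (-x) ** alpha


def w(p, gamma=0.61):
    """Assumed weight function: overweights small p, underweights large p."""
    return p ** gamma / (p ** gamma + (1 - p) ** gamma) ** (1 / gamma)


def attitude(p, x):
    """Compare 'x with probability p' against its expected value for sure."""
    gamble = w(p) * v(x)
    sure = v(p * x)
    return "risk seeking" if gamble > sure else "risk averse"


fourfold = {
    "95% chance to win $100": attitude(0.95, 100),
    "95% chance to lose $100": attitude(0.95, -100),
    "5% chance to win $100": attitude(0.05, 100),
    "5% chance to lose $100": attitude(0.05, -100),
}

for situation, result in fourfold.items():
    print(situation, "->", result)
```

Under these assumed parameters the four results come out as risk averse, risk seeking, risk seeking, and risk averse, matching the four situations listed earlier.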



Today we have reviewed theories of expected value, expected utility (neoclassical economics), and prospect theory. Each theory corresponds to a particular set of conceptual commitments, as well as a particular formula:

EV = px

EU = pu(x)

EP = w(p)v(x)

However, we can unify these into a single value formula V:

V = w(p)v(x)

In this light, EV and EU have the same structure as prospect theory. Prospect theory distinguishes itself by using empirically motivated shapes:
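One way to see the shared structure is to write V once and pass in different functions. In this sketch the square-root utility, the power-law value function, and the weighting function (with parameters 0.88, 2.25, and 0.61) are all illustrative assumptions rather than part of the unified formula itself.

```python
import math


def V(prospect, w, v):
    """Unified formula: sum of w(p) * v(x) over (probability, outcome) pairs."""
    return sum(w(p) * v(x) for p, x in prospect)


def identity(z):
    return z


def pt_value(x, alpha=0.88, lam=2.25):
    # Assumed prospect-theory value function (illustrative parameters).
    return x ** alpha if x >= 0 else -lam * (-x) ** alpha


def pt_weight(p, gamma=0.61):
    # Assumed prospect-theory weight function (illustrative parameter).
    return p ** gamma / (p ** gamma + (1 - p) ** gamma) ** (1 / gamma)


gamble = [(0.5, 100.0), (0.5, 0.0)]

ev = V(gamble, identity, identity)    # expected value: w(p) = p, v(x) = x
eu = V(gamble, identity, math.sqrt)   # expected utility: w(p) = p, v(x) = u(x)
ep = V(gamble, pt_weight, pt_value)   # prospect theory: both functions curved

print(ev)  # 50.0
print(eu)  # 5.0
print(ep)
```

The three numbers are on different scales, so they should not be compared with one another. What matters is how each theory ranks competing prospects.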

Prospect Theory- Evolution of Both Functions

With these tools, prospect theory successfully recovers a wide swathe of economic behaviors.


Until next time.

Glymphatic System: Why We Sleep

Part Of: Demystifying Consciousness sequence
Content Summary: 1000 words, 10 min read


At some point tonight, your movements will become lethargic, your eyes droop, and you will lose consciousness for eight hours.

This won’t be a one-time thing. You’ll spend twenty years of your life in this zombie state.

Together with reproduction and feeding, sleep appears to be one of the fundamental requirements of all vertebrates.

Why? Let’s find out!

Three Modes of Existence

An EEG takes electrical recordings of the scalp. If you use an EEG during sleep, you can distinguish different kinds of sleep:


There seem to be three modes of existence: wakeful consciousness, REM sleep, and non-REM (NREM) sleep. These modes switch back and forth abruptly during a typical night’s sleep. Consider the following, a sleep architecture diagram:


Chronobiological Influences

The rotation of the earth has profound implications on biological life. The temporal distribution of bodily functions is highly structured. Chronobiology studies biological rhythms.

Circadian rhythms are those that reset every 24 hours. The suprachiasmatic nucleus (SCN) drives your circadian clock. It is also a central hub of the hypothalamus, passing information to the dorsomedial hypothalamus (DMH) to distribute to systems of feeding, stress, thermoregulation, and sleep:


The pineal gland exists in nearly all vertebrates. It originally evolved with a third eye, which measured light intensity. However, mammals have long since lost their third eye, and instead splice retinal signals from the optic nerve to the hypothalamus, delivering light intensity data.

As long as there is light, the pineal gland is inert. However, with the onset of darkness, it produces melatonin. Melatonin thereby synchronizes the hypothalamus to the day-night cycle.

Ultradian rhythms are biological rhythms that reset more frequently than every 24 hours. Of these, the basic rest-activity cycle (BRAC) is most important. BRAC duration varies across species:

Cats exhibit a 20-minute rhythm in rate of responding. Likewise, it has been found that if one unobtrusively observes humans, they tend to show invigorated periods of facial grooming (e.g., touching the face, including nose picking) approximately every 90 minutes.

90 minute cycles have also been observed in heart rate, urine flow, eating, vigilance tasks, and tests for verbal and spatial intelligence. Importantly, these cycles need not start at the same time. Your peak time for verbal intelligence does not necessarily correspond with heightened face touching, but both will reset every ninety minutes.

Sleep is driven by the ventrolateral preoptic area (VLPO) of the hypothalamus. The VLPO incorporates the following systems:

  1. circadian rhythms (sleep tends to occur at regular intervals);
  2. ultradian rhythms (the REM-NREM cycle is 90min); and
  3. melatonin production (sleep tends to be facilitated by darkness).

The Purpose of Sleep

Cellular metabolism uses adenosine triphosphate (ATP) to produce energy. This yields protein waste (metabolites), which floats around outside of cells (in the interstitial space).

In the body, the lymphatic system is responsible for removing this waste. But in the brain, the blood-brain barrier (BBB) removes access to the lymphatic system. So, how does the brain remove metabolites?

Your brain does not rest against the base of your skull (that would destroy brain tissue). Instead, it is immersed in a fluid bath. This fluid is called cerebrospinal fluid (CSF).

In addition to surrounding the brain and inhabiting your ventricles, CSF also participates in the blood-brain barrier. Specifically, CSF inhabits paravascular space (outside the blood vessel, but inside the astrocyte processes).

Neuroendocrine- BBB

The cerebrospinal fluid flows between arteries & veins, creating a current that sweeps away metabolic waste. This is the glymphatic system:

In vein diameter, we see a tradeoff between metabolism (which uses blood, and produces waste) and glymphatics (which uses CSF, and removes waste). 

During sleep, the brain consumes about 40% less energy. This means smaller vascular diameter, which in turn expands the paravascular channel. Therefore, we would predict that sleep would be favorable to glymphatic processes. And in 2013, it was confirmed that the glymphatic system is indeed 60% more effective during sleep.

Let me say that again. In 2013, we discovered why we sleep: to remove neural metabolites.

This discovery unifies two previously separate theories of sleep:

  1. That sleep is metabolic (more difficult to catch food at night)
  2. That sleep is restorative (that something is replenished by the act of sleep)

Homeostatic Influences

As we have seen, sleep urge involves more than simple neural oscillators. If a predator keeps an animal awake all night, that animal will feel an increased need to sleep.  Excess metabolites induce a stronger urge to sleep.

A central organizing feature of the human body is the homeostatic setpoint, which regulates some quantity. For example, the brains of warm-blooded organisms represent and maintain blood temperature at a fixed value (in humans, 98.6 degrees Fahrenheit).

In this way, the part of the brain that regulates the body – the “hot loop” – can be conceived as a fairly elaborate kind of thermostat. And one such knob on this thermostat is sleep debt. But how does the brain represent sleep debt?

Adenosine is an inhibitory neuromodulator, ubiquitous in the vertebrate brain. Concentrations in the basal forebrain seem to represent sleep debt. Sleep-deprived individuals have unusually high levels of adenosine, which are restored to normal only after a recovery sleep.

Along with biological oscillators, adenosine seems to induce sleep urgency. This is why caffeine works: it is an adenosine antagonist.

Importantly, adenosine is a metabolite. As adenosine triphosphate (ATP) is converted into cellular energy, adenosine (a byproduct of the reaction) is ejected from the cell into the interstitial space.  

Adenosine not only measures time spent awake, but also directly represents the levels of toxins within your skull. Adenosine concentration is reduced during sleep because the glymphatic system removes it, along with other metabolites.

Adenosine thus provides another glimpse at the deep relationship between sleep and metabolism.


  • Mammals inhabit three modes of existence: wakeful consciousness, REM sleep, and non-REM sleep.
  • Sleep is a consequence of metabolism: the brain uses sleep to remove metabolic waste via the glymphatic system.
  • Sleep is heavily influenced by circadian rhythms (reset every 24hr), ultradian rhythms (reset every 90min) and melatonin production.
  • Sleep is also influenced by adenosine, which is a more direct representation of ambient metabolites (and also, of course, sleep debt).


  • Saper et al (2005). Hypothalamic regulation of sleep and circadian rhythms
  • Kleitman (1982). Basic Rest-Activity Cycle-22 Years Later  
  • Xie et al (2013). Sleep Drives Metabolite Clearance from the Adult Brain

The Three Stream Hypothesis

Part Of: Neuroanatomy sequence
Content Summary: 1000 words, 10 min read


Have you heard of blindsight before?

This man has no conscious experience of vision. And yet, he can avoid obstacles while walking!  Blindsight is not a trick, and hundreds of cases are documented. How can a person see, but not know that he sees?

To answer, we turn to the brain.

Two Visual Streams

Recall our concept of flat map. Cortex is like a sheet: its wrinkles can be stretched flat.

Three Streams- Left Hemisphere Flat Map (1)

Green areas are primary areas: landing sites of different sensorimotor modalities. The primary visual area is V1, the primary somatosensory (touch) area is S1, and the primary motor area is M1.

Visual information is pumped from the retina to V1. From there, the Dual Stream Hypothesis suggests that it follows two distinct pathways:

Three Streams- Original Dual Streams (3)

The traditional graphic illustrates how visual information passes up into the dorsal stream, and down into the ventral stream.

Things become more clear with a flat map (right). Consider the phenomenon of hand-eye coordination involved in, e.g., playing table tennis. Moving the paddle to return a serve requires close collaboration between visual information (about the ball) and motor information (from the hand). Here, it seems clear that hand-eye coordination is provided by the dorsal stream, which comprises the shortest distance between V1 and M1.

Blindsight patients have damage to V1. They are not conscious of visual experiences, yet their ability to maneuver around obstacles suggests that their hand-eye coordination (Dorsal Stream) is still intact. How can this be?

Well, the optic nerve also passes information directly to the dorsal stream. While the Ventral Stream of V1-damaged blindsight patients receives no information, the Dorsal Stream remains functional.

Inspired by these blindsight patients, Milner & Goodale gave these streams nicknames.

  • The Dorsal How Stream performs hand-eye coordination.
  • The Ventral What Stream produces conscious perception.

Our subjective experience of unitary vision is wrong!  If you step back, though, it makes perfect sense for natural selection to carve out two flavors of vision, each with different purposes. Action-based vision needs to happen quickly; perceptual-based vision is more analytical and doesn’t have the same speed requirements. 

Tension In The Dorsal Stream

The Dual Stream Hypothesis is grounded in strong neuroscientific and behavioral evidence. It also has enjoyed consensus support from neuroscientists for nearly two decades. However, there are also some notes of tension.

Conceptual tension to a researcher is blood to a shark.

Did you know there are actually two versions of the Dual Stream Hypothesis? Goodale & Milner’s 1992 version (explored above) is the most well-known, but in 1982 Mishkin et al also proposed a Dual Stream model, based on monkey lesion studies. Both accounts agree about the Ventral What Stream. But Mishkin’s model gives a spatial processing role to the Dorsal Stream (“Where”). Why should these research traditions view the Dorsal Stream so differently? Note well such cases of functional tension.

Let me advance two observations of my own.

First, consider the width (angular spread) of the two streams. In the above flat map, notice how much wider the Dorsal Stream is than the Ventral Stream (nearly three times wider!).  How can the Dorsal Stream be so wide, yet retain functional coherence? This is structural tension. While such hints are less damning, they destabilize our already queasy relationship with the Dorsal Stream.

Three Streams- Dorsal Stream Tension (3)

Second, it is curious that roughly ⅓ of the cortical area around V1 is simply not recruited in the Dual Stream Hypothesis. Surely such regions (e.g., Lingual and Cuneus cortex) have some role in the processing of visual information. Could these medial regions comprise a fourth stream?

Call this the Medial Stream Conjecture. The medial stream seems to project directly to the hippocampus. I suspect that in five years, we shall speak of the Medial Experiential Stream, which supports autobiographical memory.

The Lateral Stream: Healing The Divide

A parable of three disciplines:

  • The cognitive psychologist speaks the language of function. How does the mind create behavior?
  • The neurophysiologist speaks the language of structure. How do anatomical minutiae participate in neural circuitry?
  • The cognitive neuroscientist serves as translator. She pursues structure-function maps; the connective tissue between biology and information processing.

Sometimes these maps go awry, and must be repaired. The Dorsal Stream → { How or Where } map is such a case.

Several researchers have proposed that the Dorsal Stream be split into two separate streams (e.g., Rizzolatti & Matelli, 2003). Call this the Three Stream Hypothesis. The “true” Dorsal Stream is more narrow, and retains its How functionality. However, the remaining cortical surface is now called the Lateral Stream, which performs spatial processing (Where). 

Three Streams- Fractionating Streams (4)

Milner’s Dorsal How Stream was much too wide. Conversely, Mishkin’s “Dorsal” Where Stream is actually located more laterally.

Three Streams- Stream Localization (2)

The Lateral and Dorsal streams both connect posterior parietal to premotor cortex. Two decades ago, these areas were largely referred to by name  (“the premotor cortex does X”). Modern treatments, however, parcellate these regions at a much finer granularity. This also helps explain how Milner and Mishkin conflated the streams.

Integrating Audition

Originally, the dual stream hypothesis was viewed as only relevant to vision. However, lately there has been an influx of interest in auditory streams. The seminal paper is Hickok & Poeppel (2007), The cortical organization of speech processing. It proposes two streams:

Three Streams- Auditory Streams

  1. The antero-ventral stream (green, left) performs auditory classification, and assists in speech comprehension.
  2. The postero-dorsal stream (red, right), in contrast, is not bilaterally symmetric.
    1. The postero-dorsal stream in the right hemisphere localizes auditory stimuli both spatially and temporally.
    2. The postero-dorsal stream in the left hemisphere performs speech production.

The auditory stream model integrates cleanly with the Three Stream Hypothesis. The auditory antero-ventral stream shares real estate with the Ventral stream. The auditory postero-dorsal stream overlaps the Lateral stream. 

Why is “deafhearing” (an auditory version of blindsight) impossible?  Because ear-hand coordination is not a thing. Auditory information simply does not participate in the unconscious Dorsal stream.

Summing Up

The Three Stream Hypothesis describes how auditory and visual information are carried as far as the prefrontal cortex. We may carve each stream into three discrete phases:

Three Streams- Three Phases

Finally, we can also visualize these same relationships more abstractly, as follows:

Three Streams- Topology (1)

Until next time.