A Dual-Process Theory of Moral Judgment

Part Of: Demystifying Ethics sequence
Content Summary: 900 words, 9 min read


An ethical theory is an attempt to explain what goodness is: to ground ethics in some feature of the world. We have discussed five such theories, including:

  • Consequentialism, which claims that goodness stems from the consequences of an action
  • Deontology, which claims that goodness stems from absolute obligations (discoverable in light of the categorical imperative)

For most behaviors (e.g., theft), both theories agree (in this case, both label the act as evil). But there do exist key scenarios in which these theories disagree. Consider the switch dilemma:

Consider a trolley barreling down a track that will kill five people unless diverted. However, on the other track a single person has been similarly immobilized. Should you pull the lever to divert the trolley?

Our ethical theories produce the following advice:

  • Consequentialism says: pull the lever! One death is awful, but better than five.
  • Deontology says: don’t pull the lever! Any action that takes innocent life is wrong. The five deaths are awful, but not your fault.

Would you pull the lever? Good people disagree. However, if you are like most people, you will probably say “yes”. 

Things get interesting if we modify the problem as follows. Consider the footbridge dilemma:

Consider a trolley barreling down a track that will kill five people unless diverted. You are standing on a bridge with a fat man. If you push the fat man onto the track, the trolley will derail, sparing the five people.

Notice that the consequences of the action remain the same. Thus, consequentialism says "push the fat man!", and deontology says "don't push!"

What about you? Would you push the fat man? Good people disagree; however, most people confess they would not push the fat man off the bridge. 

Our contrasting intuitions are quite puzzling. After all, the only thing that differs between the two cases is how close the violence is: far away (switch) vs up close (shoving).

Towards a Dual-Process Theory

Recall that there are two kinds of ethical theories.

  1. Prescriptive theories make claims about what goodness objectively is.
  2. Descriptive theories tell us how the brain produces moral judgments.

Consequentialism and deontology are prescriptive theories. However, we can also conceive of consequentialism and deontology as descriptive theories. Let’s call these descriptive variants folk consequentialism and folk deontology, respectively.

Sometimes, people's judgments are better explained by folk consequentialism; other times, folk deontology enjoys more predictive success. We might entertain two hypotheses to explain this divergence:

  1. Different boundary conditions of a single neural process
  2. Two competing processes for moral judgment

As we will see, the evidence suggests that the second hypothesis, the dual-process theory of moral judgment, is correct.

Dissociation-Based Evidence

Consider the crying baby dilemma:

It's wartime. You and your fellow villagers are hiding from nearby enemy soldiers in a basement. Your baby starts to cry, and you cover your baby's mouth to block the sound. If you remove your hand, your baby will cry loudly, and the soldiers will hear. They will find you… and they will kill all of you. If you do not remove your hand, your baby will smother to death. Is it morally acceptable to smother your baby to death in order to save yourself and the other villagers?

Here, people take a long time to answer, and show no consensus in their answers. If the dual-process theory of moral judgment is correct, then we expect the following:

  1. Everyone exhibits increased activity in the dorsal anterior cingulate (dACC). This region is known to reliably respond when two or more incompatible behavioral responses are simultaneously activated. 
  2. Those who eventually choose the folk consequentialist answer (save the most lives) should exhibit comparatively more activity in brain regions associated with working memory and cognitive control.

Both predictions turn out to be true. Here, then, is the circuit diagram of our dual-process theory, organized in the two cybernetic loops framework:

Dual-Process Morality- System Architecture

Four other streams of evidence corroborate our dual-process theory:

  • Deontological judgments are produced more quickly than consequentialist ones.
  • Cognitive distractions slow down consequentialist but not deontological judgments.
  • Patients with dementia or lesions that cause “emotional blunting” are disproportionately likely to approve of consequentialist action. 
  • People who are high in "need for cognition" and low in "faith in intuition", or who have unusually high working memory capacity, tend to produce more consequentialist judgments.
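As a toy illustration of the evidence above (not a claim about actual neural timing – every number here is made up), the dual-process story can be caricatured as a race: an intuitive system answers quickly, and a slower controlled system overrides it only if it finishes before a response is demanded. Cognitive load delays the controlled system, recovering the distraction finding.

```python
# Toy race sketch of dual-process moral judgment.
# All latencies and verdicts are illustrative assumptions, not data.

def moral_judgment(deadline_ms, cognitive_load=False):
    """Return the verdict of whichever system 'wins' by the deadline."""
    # Fast, automatic, emotionally-driven (folk deontological) response.
    intuitive = {"verdict": "don't push", "latency_ms": 300}
    # Slow, controlled (folk consequentialist) response; load slows it further.
    controlled_latency = 900 + (600 if cognitive_load else 0)
    controlled = {"verdict": "push", "latency_ms": controlled_latency}
    # The controlled verdict wins only if it arrives in time.
    if controlled["latency_ms"] <= deadline_ms:
        return controlled["verdict"]
    return intuitive["verdict"]
```

With a generous deadline the consequentialist answer prevails; under load (or with a snap deadline), the deontological intuition wins by default.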

Relation To Other Disciplines

We have previously distinguished two kinds of moral machinery:

  • Propriety frames are a memory format that retains social intuitions.
  • Social intuition generators are processes that contribute to the contents of social judgments.

These machines map onto the dual-process theory of judgment. Propriety frames are housed in the cerebral cortex, which performs folk consequentialist analysis. Social intuition generators are located within the limbic system, and contribute folk deontological intuitions.

Recall that Kantian deontology attempted to ground moral facts in pure reason (the categorical imperative). While surely valuable as a philosophical exercise, in practice folk deontological judgments have little to do with reason. They are instead driven by autonomic emotional responses. It is folk consequentialist judgments which depend more on reason (cortical reasoning).

This is not to say that people who prefer consequentialist reasoning are strictly superior moral judges. But I will address the question "which reasoning system should I trust more?" on another day.


  • For the switch dilemma, most people reason consequentially (“save the most lives”)
  • For the footbridge dilemma, most people reason deontologically (“murder is always wrong”)
  • These contrasting styles emerge because the brain has two systems of judgment.
  • Folk consequentialist reasoning is performed in the cerebral cortex.
  • Folk deontology intuitions are generated from within the limbic system.

Until next time.


Rule Feedback Loops

Part Of: Breakdown of Will sequence
Followup To: Willpower As Preference Bundling
Content Summary: 900 words, 9 min reading time


When “in the moment”, humans are susceptible to bad choices. Last time, we introduced willpower as a powerful solution to such akrasia. More specifically:

  • Willpower is nothing more, and nothing less, than preference bundling.
  • Inasmuch as your brain can sustain preference bundling, it has the potential to redeem its fits of akrasia.

But this only explained how preference bundling works at the level of utility curves. Today, we will learn how preference bundling is mentally implemented, and this mental model will in turn provide us with predictive power.

Building Mental Models

Time to construct a model! 🙂 You ready?!

In our last post, we discussed three distinct phases that occur during preference bundling. We can then imagine three separate modules (think: software programs) that implement these phases.

Personal Rule- Crude Decision Process (2)

This diagram provides a high-level, functional account of how our minds make decisions. The three modules can be summarized as follows:

  • The Utility Transducer module is responsible for identifying affordances within sensory phenomena, and compressing a multi-dimensional description into a one-dimensional value.
  • The Preference Bundler module can aggregate utility representations that are sufficiently similar. Such a technique is useful for combating akrasia.
  • The Choice Implementer module selects Choice1 if Preference1 > Preference2. It is also responsible for computing when and how to execute a preference-selection.
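To make the division of labor concrete, here is a minimal Python sketch of the three modules. All function names, weights, and utility values are hypothetical illustrations, not claims about neural implementation:

```python
# Toy sketch of the three-module decision pipeline described above.
# Every name and number here is an illustrative assumption.

def utility_transducer(percepts):
    """Compress a multi-dimensional description into a one-dimensional value."""
    return sum(value * weight for value, weight in percepts)

def preference_bundler(utilities, similarity_groups):
    """Aggregate utilities whose choices fall in the same similarity group."""
    bundled = {}
    for choice, utility in utilities.items():
        group = similarity_groups[choice]
        bundled[group] = bundled.get(group, 0.0) + utility
    return bundled

def choice_implementer(preferences):
    """Select whichever bundle carries the highest aggregate preference."""
    return max(preferences, key=preferences.get)

# Bundling can flip the decision: "late_tv" beats either disciplined
# option alone, but loses to the disciplined options bundled together.
utilities = {"sleep_early": 3.0, "study": 2.5, "late_tv": 4.0}
groups = {"sleep_early": "discipline", "study": "discipline",
          "late_tv": "indulgence"}
winner = choice_implementer(preference_bundler(utilities, groups))
```

In this made-up example `winner` is the "discipline" bundle, which is exactly the akrasia-combating effect the Preference Bundler is posited to provide.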

The above diagram is, of course, merely a germinating seed of a more precise mental architecture (it turns out that mind-space is rather complex 🙂 ). Let us now refine our account of the Preference Bundler.

Personal Rules

Consider what it means for a brain to implement preference bundling. Your brain must receive anticipated-utility information from an arbitrary number of choice valuations, and aggregate similar decisions into a single measure.

Obviously, the mathematics of such a computation lies beneath your awareness (your unconscious superpower is math). However, does the process entirely fail to register in the small room of consciousness?

This seems unlikely, given the common phenomenal experience of personal rules. Is it not likely that the conscious experience of "I will never stay up past midnight on a weeknight" in some way correlates with the actions of the Preference Bundler?

Let’s generalize this question a bit. In the context of personal rules, we are inquiring about the meaning of quale-module links. This type of question is relevant in many other contexts as well. It seems to me that such links can be roughly modeled in the vocabulary of dual-process theory, where System 1 (parallel modules) data bubbles up into System 2 (sequential introspection) experience.

Let us now assume that the quale of personal rules correlates to some variety of mental substance. What would that substance have to include?

In terms of complexity analysis, it seems to me that a Preference Bundler need not generate relevant rules on the fly. Instead, it could more efficiently rely on a form of rule database, which tracks a set of rules proven useful in the past. Our mental architecture, then, looks something like this (qualia are in pink):

Personal Rule- Rules Subserving Bundling

In his book, Ainslie presents intriguing connections between this idea of a rule database and similar notions in the history of ideas:

The bundling phenomenon implies that you will serve your long-range interest if you obey a personal rule to behave alike towards all members of a category. This is the equivalent of Kant’s categorical imperative, and echoes the psychologist Lawrence Kohlberg’s sixth and highest principle of moral reasoning, deciding according to principle. It also explained how people with fundamentally hyperbolic discount curves may sometimes learn to choose as if their curves were exponential.
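Ainslie's point about hyperbolic discounters choosing "as if exponential" can be made concrete with a small sketch. Assuming the standard hyperbolic form V = A / (1 + kD), with an illustrative k = 1 and made-up reward sizes and delays:

```python
# Minimal sketch of Ainslie-style preference bundling under hyperbolic
# discounting. All parameter values are illustrative assumptions.

def hyperbolic(amount, delay, k=1.0):
    """Hyperbolic discounting: value falls as 1 / (1 + k * delay)."""
    return amount / (1 + k * delay)

def series_value(amount, first_delay, interval, n, k=1.0):
    """Discounted value of n repeated rewards, one every `interval` days.
    Bundling means evaluating the whole series as a single category."""
    return sum(hyperbolic(amount, first_delay + i * interval, k)
               for i in range(n))

# A single choice: small-soon reward (6 units in 1 day) vs
# large-later reward (10 units in 3 days).
single_ss = hyperbolic(6, 1)    # 3.0 -- the impulsive option wins alone
single_ll = hyperbolic(10, 3)   # 2.5

# The same choice bundled over ten weekly repetitions.
bundled_ss = series_value(6, 1, 7, 10)
bundled_ll = series_value(10, 3, 7, 10)  # now the patient series wins
```

Singly, the small-soon reward wins (3.0 vs 2.5); summed over ten weekly repetitions, the large-later series comes out ahead. That preference reversal is what lets a fundamentally hyperbolic discounter behave as if her curve were exponential.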

Recursive Feedback Loops

Personal rules, of course, do not spontaneously appear within your mind. They are constructed by cognitive processes. Let us again expand our model to capture this nuance:

Personal Rule- Preference Regulation

Describing our new components:

  • The Rule Controller module is responsible both for generating new rules (e.g., “I will not stay up past midnight on a weeknight”), and re-factoring existing ones.
  • The “Honored?” checkpoint conveys information on how well a given personal rule was followed. The Rule Controller module may use this information to update the rule database.

A feedback loop exists in our mental model. Observe:

Personal Rule- Feedback

Feedback loops can explain a host of strange behavior. Ainslie describes the torment of a dieter:

Even if [a food-conscious person] figures, from the perspective of distance, that dieting is better, her long-range perspective will be useless to her unless she can avoid making too many rationalizations. Her diet will succeed only insofar as she thinks that each act of compliance will be both necessary and effective – that is, that she can’t get away with cheating, and that her current compliance will give her enough reason not to cheat subsequently. The more she is doubtful of success, the more likely it will be that a single violation will make her lose this expectation and wreck her diet. Personal rules are a recursive mechanism; they continually take their own pulse, and if they feel it falter, that very fact will cause further faltering.
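One way to caricature this "taking its own pulse" dynamic is a toy feedback model, where confidence in a rule is simply nudged up by each act of compliance and down by each lapse, and where compliance in turn depends on confidence. All parameters here are illustrative assumptions:

```python
# Toy model of a recursive personal rule. Parameters are made up.

def next_confidence(confidence, complied, rate=0.2):
    """Move confidence toward 1 after compliance, toward 0 after a lapse."""
    target = 1.0 if complied else 0.0
    return confidence + rate * (target - confidence)

def simulate_rule(confidence, days, threshold=0.5):
    """Comply whenever confidence clears the threshold; feed the
    outcome back into confidence (the rule 'takes its own pulse')."""
    trajectory = [confidence]
    for _ in range(days):
        complied = confidence > threshold
        confidence = next_confidence(confidence, complied)
        trajectory.append(confidence)
    return trajectory
```

Starting above the threshold, compliance compounds confidence upward; starting below it, the rule collapses. And a single forced lapse near the threshold (`next_confidence(0.55, False)` lands at roughly 0.44, below threshold) is enough to tip the loop into the downward spiral Ainslie describes.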


And that’s a wrap! 🙂 I am hoping to walk away from this article with two concepts firmly installed:

  • Preference bundling is mentally implemented via a database of personal rules ("I will do X in situations that involve Y").
  • Personal rules constitute a feedback loop, whereby rule-compliance strengthens (and rule-circumvention weakens) the circuit.

Next Up: [Iterated Schizophrenic’s Dilemma]

Construal Level Theory: Musings

Table Of Contents

  • Introduction
    • Context
    • Overview of CLT
  • Insight Scratchpad
    • Mental Health
    • Skills
    • Thinking Modes
  • Application
    • Self-Diagnosis
    • Theory Integration



This weekend, I started digging into Construal Level Theory (CLT). This post is ultimately a snapshot of my learning process. It is neither comprehensive nor polished. I do not yet have a place for it within the theoretical apparatus of my mind & this blog: it contains more questions than answers.

Anyways, I hope you enjoy some of the quotes at least; I found some of them to be extremely thought-provoking. (Unless noted otherwise, quotes are taken from this Psychlopedia review article).

Overview of CLT

Construal Level Theory (CLT) arises from noticing the interchangeability of traits. Psychologists have begun to notice two distinct modes of human thought:

  • Near Mode: All of these bring each other more to mind: here, now, me, us; trend-deviating likely real local events; concrete, context-dependent, unstructured, detailed, goal-irrelevant incidental features; feasible safe acts; secondary local concerns; socially close folks with unstable traits.
  • Far Mode: Conversely, all these bring each other more to mind: there, then, them; trend-following unlikely hypothetical global events; abstract, schematic, context-freer, core, coarse, goal-related features; desirable risk-taking acts, central global symbolic concerns, confident predictions, polarized evaluations, socially distant people with stable traits.

In their review article, theorists Trope and Liberman summarize:

The fact that something happened long ago does not necessarily mean that it took place far away, that it occurred to a stranger, or that it is improbable. Nevertheless, as the research reviewed here demonstrates, there is marked commonality in the way people respond to the different distance dimensions. [Construal level theory] proposes that the commonality stems from the fact that responding to an event that is increasingly distant on any of those dimensions requires relying more on mental construal and less on direct experience of the event. … [We show] that (a) the various distances are cognitively related to each other, such that thinking of an event as distant on one dimension leads one to thinking about it as distant on other dimensions, (b) the various distances influence and are influenced by level of mental construal, and (c) the various distances are, to some extent, interchangeable in their effects on prediction, preference, and self-control.

Insight Scratchpad

Mental Health

After individuals experience a negative event, such as the death of a family member, they might ruminate about this episode. That is, they might, in essence, relive this event many times, as if they were experiencing the anguish and distress again. These ruminations tend to be ineffective, compromising well-being (e.g., Smith & Alloy, 2009).

In contrast, after these events, some individuals reflect more systematically and adaptively on these episodes. These reflections tend to uncover insights, ultimately facilitating recovery (e.g., Wilson & Gilbert, 2008).

When individuals distance themselves from some event, they are more inclined to reflect on this episode rather than ruminate, enhancing their capacity to recover. That is, if individuals consider this event from the perspective of someone else, as if detached from the episode themselves, reflection prevails and coping improves. In contrast, if individuals feel immersed in this event as they remember the episode, rumination prevails and coping is inhibited. Indeed a variety of experimental (e.g., Kross & Aydyk, 2008) and correlational studies (e.g., Ayduk & Kross, 2010) have substantiated this proposition.

When processing trauma, then, inducing abstract construals is desirable.

Arguably, depressed individuals tend to adopt an abstract, rather than concrete, construal. Consequently, memories of positive events are not concrete and, therefore, do not seem salient or recent. Indeed, people may feel these positive events seem distant, highlighting the difference between past enjoyment and more recent distress.

As this argument implies, memory of positive events could improve the mood of depressed individuals, provided they adopt a concrete construal. A study that was conducted by Werner-Seidler and Moulds (2012) corroborates this possibility. In this study, individuals who reported elevated levels of depression watched an upsetting film clip. Next, they were told to remember a positive event in their lives, such as an achievement. In addition, they were told to consider the causes, consequences, and meaning of this event, purportedly evoking an abstract construal, or to replay the scene in their head like a movie, purportedly evoking a concrete construal. As predicted, positive memories improved mood, but only if a concrete construal had been evoked.

In contrast, in the context of clinical depression, inducing concrete construals may be desirable.

In short, an abstract construal may diminish anxiety, but a concrete construal can diminish dejection and dysphoria



Action identification theory specifies the settings in which abstract and concrete construals–referred to as high and low levels–are most applicable (Vallacher, Wegner, & Somoza, 1989; Wegner & Vallacher, 1986). One of the key principles of this theory is the optimality hypothesis. According to this principle, when tasks are difficult, complex, or unfamiliar, lower level action identifications, or a concrete construal, are especially beneficial. When tasks are simple and familiar, higher level action identifications, or an abstract construal, are more beneficial.

To illustrate, when individuals develop a skill, such as golf, they should orient their attention to tangible details on how to perform some act, such as “I will ensure my front arm remains straight”. If individuals are experienced, however, they should orient their attention to intangible consequences or motivations, such as “I will outperform my friends”

Applying CLT to games. Perhaps I could use this while playing chess. 🙂

Similarly, as De Dreu, Giacomantonio, Shalvi, and Sligte (2009) showed, a more abstract or global perspective may enhance the capacity of individuals to withstand and to overcome obstacles during negotiations. If individuals need to negotiate about several issues, they are both more likely to be satisfied with the outcome of this negotiation if they delay the most contentious topics. To illustrate, when a manager and employee needs to negotiate about work conditions, such as vacation leave, start date, salary, and annual pay rise, they could begin with the issues that are vital to one person but not the other person. These issues can be more readily resolved, because the individuals can apply a technique called logrolling. That is, the individuals can sacrifice their position on the issues they regard as unimportant to gain on issues they regard as very important. Once these issues are resolved, trust improves, and a positive mood prevails. When individuals experience this positive mood, their thoughts focus on more abstract, intangible possibilities, which can enhance flexibility. Because flexibility has improved, they can subsequently resolve some of the more intractable issues.

Applying CLT to negotiation. This would seem to overlap the latitudes of acceptance construct from social judgment theory.

Thinking Modes

When individuals adopt an abstract construal, they experience a sense of self clarity (Wakslak & Trope, 2009). That is, they become less cognizant of contradictions and conflicts in their personality. Presumably, after an abstract construal is evoked, individuals orient their attention towards more enduring, unobservable traits (cf. Nussbaum, Trope, & Liberman, 2000). As a consequence, individuals become more aware of their own core, enduring qualities–shifting attention away from their peripheral, and sometimes conflicting, characteristics.

This matches my experience.

Attentional tuning theory (Friedman & Forster, 2008), which is underpinned by construal level theory, was formulated to explain the finding that an abstract construal enhances creative thinking and a concrete construal enhances analytic thinking (e.g., Friedman & Forster, 2005; Ward, 1995).

Makes me wonder whether metaphor (engine of creative thinking) is powered by System1 processes.



Over time, therefore, people tend to behave politely when they feel a sense of distance. According to construal level theory, this distance coincides with an abstract construal. Therefore, politeness and an abstract construal should be associated with each other.

I tend to be very polite…

An abstract construal can also amplify the illusion of explanatory depth–the tendency of individuals to overestimate the extent to which they understand a concept (Alter, Oppenheimer, & Zemla, 2010)

I often suffer from this particular illusion.

When individuals adopt an abstract construal, they tend to be more hypocritical. That is, they might judge an offence as more acceptable if they, rather than someone else, committed this act.

I am guilty of this more often than most.

Taken together, one could make the case that I, Kevin, gravitate towards “far mode” (i.e., finding distance between my concept of self & my surroundings).

Theory Integration

To evoke a concrete construal, participants are instructed to specify an exemplar of each word, such as poodle or Ford.

An interesting link between CLT and Machery’s Heterogeneity Hypothesis.

In my deserialization series (e.g., Deserialized Cognition), I gestured towards two processing modes: authority and inference. Perhaps this could be simply hooked into CLT, with Near Mode triggering social processing, and Far Mode triggering inference processing.

Most crucially of all, I need to see how CLT can be reconciled with dual-process theory (DPT). One weakness of dual-process theory, in my view, is its difficulty producing an explanation for the context-dependent cognitive styles of Eastern cultures, versus the context-independent cognitive styles of Western cultures. Perhaps difficulties such as these could be dissolved by knitting the two theories together.

But how to begin stitching? If you’ll recall, dual-process theory is also grounded in, motivated by, dissociations:

CLT- Dual-Process Theory Dissociations

Notice the partial overlap: both CLT and DPT claim ownership of the “contextualized vs. abstract” dimension. But despite this partial overlap, when writing these theories back into mental architecture diagrams, the dissociations are produced by radically different things. System2 is – arguably – the product of a serialized “virtual machine” sitting on top of our inborn evolutionary-old modules. But Near Mode and Far Mode, they seem to be the product of an identity difference vector: how far a current thing is from one’s identity. (In fact, CLT might ultimately prove a staging grounds for investigations into the nature of personality). But this all makes me wonder how identity integrates into our mental architecture…

The entire process of integrating CLT and DPT is a formidable challenge… I’m unclear the extent to which the social psychological literature has already pursued this path. I also wonder whether any principles can be extracted by such integration attempts. Both CLT and DPT are – at their core – behavioral property bundlers – finding commonalities & interchangeabilities within human behaviors and descriptions. In general, do property-bundling theories produce sufficient theoretical constraint? And how does one, in principle, move from property-bundling to abducing causal mechanisms?


As a parting gift, a fun summary from Overcoming Bias:

CLT- Dissociations

Deserialization: Hazards & Control

Part Of: [Deserialized Cognition] sequence
Followup To: [Deserialized Cognition]


Two major differences exist between conceptiation and deserialization:

  1. Deserialization Delay: A time barrier exists between concept birth & use.
  2. Deserialization Reuse: The brain is able to “get more” out of its concepts.

Inference Deserialization: Obsolescence Hazard

Let’s consider the deserialization delay within inference cognition modes:

Deserialization- Inference Cognition

If you think of an idea, and a couple hours later deserialize & leverage it, risk will (presumably) be minimal. But what about ideas conceived decades ago?

Your inference engines change over time. Here’s a fun example: Santa Claus. It is easy to imagine even a very bright child believing in Santa, given a sufficiently persuasive parent. The cognitive sophistication to reject Santa Claus only comes with time. However, even after this ability is acquired, this belief may be loaded from semantic memory for months before it is actively re-evaluated.

The problem is that every time your inference engines are upgraded ("versioned"), their past creations are not tagged as obsolete. What's worse, you are often ignorant of upgrades to the engine itself – you typically fail to notice them (cf. Curse Of Knowledge).
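A software analogy may help here. This is purely my sketch, not anything the brain actually does – indeed, the point of the passage is that the brain lacks such machinery. Imagine beliefs tagged with the version of the engine that produced them, the way cache entries are tagged for invalidation:

```python
# Hypothetical sketch: beliefs tagged with the engine version that
# produced them, so stale ones can be flagged at deserialization time.
# The names and design are invented for illustration.

class BeliefStore:
    def __init__(self, engine_version):
        self.engine_version = engine_version
        self._store = {}  # name -> (belief, version it was serialized under)

    def serialize(self, name, belief):
        """Store a belief, stamped with the current engine version."""
        self._store[name] = (belief, self.engine_version)

    def deserialize(self, name):
        """Retrieve a belief plus a flag: was it conceived under an older engine?"""
        belief, version = self._store[name]
        stale = version < self.engine_version
        return belief, stale

    def upgrade(self):
        """Model an inference-engine upgrade (e.g., a child's growing sophistication)."""
        self.engine_version += 1
```

In the Santa Claus example, the belief would be serialized under engine version 1, the engine upgraded, and every later deserialization would come back flagged stale. Your actual semantic memory, as far as I know, performs no such tagging, which is exactly why the obsolescence hazard exists.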

Potential Research Vector: The fact that deserialization decouples your beliefs from your belief-engines has interesting implications for psychotherapy, and the mind-hacking industries of the future. I can imagine moral fictionalism (moral talk is untrue, but useful to talk about) leveraging such a finding, for example.

Social Deserialization: Epistemic Bypass Hazard

Let’s now consider deserialization reuse within social cognition modes:

Deserialization- Social Cognition

Let me zoom into how social conceptiation is actually implemented in your brain. Do people believe every claim they hear?

The answer turns out to be… yes. Of course, you may disbelieve a claim; but to do so requires a later, optional process to analyze, and make an erasure decision about, the original claim. If you interrupt a person immediately after exposure to a social claim, you interrupt this posterior process and thereby increase acceptance irrespective of the content of the claim!

Social conceptiation, therefore, is less epistemically robust than inference conceptiation. Deserialization simply compounds this problem, by allowing the reuse of concepts that fail to be truth-tracking.

Potential Research Vector: Memetic theory postulates that your belief-generation systems have a shape: certain properties of the beliefs themselves influence cognition. I imagine that this distinction between concept acquisition modes would have interesting implications for memetic theory.

How To Select Away From Hazardous Deserialization

Unfortunately, from the subjective/phenomenological perspective, there is precious little you can do to feel the difference between hazardous and truth-bearing deserializations. The brain simply fails to tag its beliefs in any way that would be helpful.

Before proceeding, I want to underscore one point: the process of selecting away from hazards cannot be usefully divided into a noticing step and a selection step. If you notice a hazard, you don't need "tips" on how to select away from it: your brain is already hardwired with an action-guiding desire for truth-tracking beliefs. No, these steps remain together; your challenge is "merely" to learn how to raise hazardous patterns to your attention.

Let’s get specific. When I say “raise X to your attention”, what I mean is “when X is perceived, your analytic system (System 2) overrides your autonomic system (System 1) response”. If this does not make sense to you, I’d recommend reading about dual process theory.

How does one encourage a domain-general stance favorable to such overrides? It turns out that there exists an observable personality trait – the need for cognition – which facilitates an increased override rate. Three suggestions that may help:

  1. Reward yourself when you feel curiosity.
  2. Inculcate an attitude of distrust when you notice yourself experiencing familiarity.
  3. Take advantage of your social mirroring circuit by surrounding yourself with others who possess high needs for cognition.

How can you encourage a domain-specific stance favorable to such overrides? In other words: how can you trigger overrides in hazardous conditions, in conditions where obsolescence or epistemic bypassing has occurred? So far, two approaches seem promising to me:

  1. Keep track of areas where you have been learning rapidly. Be more skeptical about deserializing concepts close to this domain.
  2. Train yourself to be skeptical of memes originating outside of yourself: whenever possible, try to reproduce the underlying logic yourself.

Of course, these suggestions won't work exceptionally well, for the same reason self-help books aren't particularly useful. In my language, your mind has a kind of volition resistance that tends to render such mind hacks temporary and/or ineffectual ("people don't change"). But I'll leave a discussion of why this might be so, and what can be done, for another day…


In this post, we explored how the brain recycles concepts in order to save time, via the deserialization technique discussed earlier. Such recycling brings with it two risks:

  1. Obsolescence: The concepts you resurrect may be inconsistent with your present beliefs.
  2. Epistemic Bypass: The concepts you resurrect may not have been evaluated at all.

We then identified two ways this mindware might enrich our lives:

  1. Getting precise about how concepts & conceptiation diverge will give us more control over our mental lives.
  2. Getting precise about how deserialization complements epistemic overrides will allow us to expand memetic accounts of culture.

Finally, we explored several ways in which we might encourage our minds to override hazardous deserialization patterns.

Machery: Précis of Doing without Concepts

Content Summary: 2600 words, 26 minute read.


It is no secret that the academic field of concepts is in disarray. In this article, Machery attempts to weave these disparate traditions into a compelling whole. But first, a quote which serves to motivate what follows:

Why do cognitive scientists want a theory of concepts? Theories of concepts are meant to explain the properties of our cognitive competences. People categorize the way they do, they draw the inductions they do, and so on, because of the properties of the concepts they have. Thus, providing a good theory of concepts could go a long way towards explaining some important higher cognitive competences.

Summarization text is grayscale, my commentary is in orange.

Article Metadata

  • Article: Précis of Doing without Concepts
  • Author: Edouard Machery
  • Published: 11/2009
  • Citations: 178 (note: as of 04/2014)
  • Link: Here (note: not a permalink)

Section 1. Regimenting the use of concept in cognitive science

We start with definitions!

The world is not an undifferentiated sea of chaos. It has statistically noticeable patterns – "joints". Let us call each such delightful pattern in nature a category (or a natural kind). But categories are things in the world, and your mind must somehow learn these categories for itself. Plato once described the act of reasoning as: "That of dividing things again by classes, where the natural joints are, and not trying to break any part, after the manner of a bad carver." (Phaedrus, 265e). This analogy – to carve nature at its joints – captures what concept processes do. Concepts represent categories in your brain.

Let’s get specific about the properties of concepts. Machery defines concept as something that:

  1. It can be about a class, event, substance, or individual.
  2. It is nonproprietary: not constrained by the underlying type of represented information.
  3. Its constitutive elements can vary over time and across individuals.
  4. Some elements of information about X may not fit into the concept of X; let us call these data background knowledge.
  5. It is used by Default (I will define this in Section 3).

Section 2. Individuating concepts

Is it possible for an individual to possess different concepts of the same category?
Can Kevin possess two concepts of the category of chair?
How do we individuate two related pieces of information, that would otherwise fall under the same concept?

I propose [that] when two elements of information about x, A and B, fulfill either of these [individuation] criteria, they belong to distinct concepts:

  • Connection Criterion: If retrieving A (e.g., water is typically transparent) from LTM and using it in a cognitive process (e.g., a categorization process) does not facilitate the retrieval of B (e.g., water is made of molecules of H2O) from LTM and its use in some cognitive process, then A and B belong to two distinct concepts (WATER1 and WATER2).
  • Coordination Criterion: If A and B yield conflicting judgments (e.g., the judgment that some liquid is water and the judgment that this very liquid is not water) and if I do not view either judgment as defeasible in light of the other judgment (i.e., if I hold both judgments to be equally authoritative), then A and B belong to two distinct concepts (WATER1 and WATER2).

Section 3. Defending the proposed notion of concept

Time to explore our last property of concepts, “used by Default”. Default is a name for “the assumption that some bodies of knowledge are retrieved by default when one is categorizing, reasoning, drawing analogies, and making inductions”. Say you are given a word problem involving counting apples and oranges. Default is the claim that a flood of concepts – including but not limited to arithmetic, the apple, the orange, trees, and fruit – will be drawn from long-term memory (LTM) stores, and made available to your mental processes automatically.
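Default, so described, lends itself to a toy illustration. The sketch below is my own (the LTM contents and function names are invented for the apples-and-oranges example; they are not Machery’s): every cue in a task automatically pulls its associated concepts out of long-term memory, with no selective, attention-driven query.

```python
# A toy sketch of "Default" (my illustration, not Machery's).
# The LTM contents below are invented for the apples-and-oranges example.
LTM = {
    "apple": {"APPLE", "FRUIT", "TREE"},
    "orange": {"ORANGE", "FRUIT", "TREE"},
    "count": {"ARITHMETIC", "NUMBER"},
}

def default_retrieval(task_cues):
    """Retrieve, automatically and indiscriminately, every concept
    associated with any cue in the task: no conscious query is issued."""
    retrieved = set()
    for cue in task_cues:
        retrieved |= LTM.get(cue, set())
    return retrieved

# The word problem floods the workspace with concepts:
print(default_retrieval(["count", "apple", "orange"]))
```

The rival traditions discussed below would replace this automatic union with an attention-gated lookup, or with a fresh, task-tailored construction from background knowledge.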

At least two research traditions go against this claim:

  1. Concepts are not retrieved from LTM automatically; rather, they are summoned via conscious attention.
  2. Concepts are drawn from LTM automatically, but they are constructed on-the-fly. When you see an apple, you do not load a concept of apple that was hashed out long ago; instead, your mind queries your LTM for apple-related background knowledge, constructing transient concepts especially tailored to the peculiarities of the task at hand.

Machery makes three counterpoints:

  1. Only a pronounced amount of recall variability (e.g., highly divergent results for tweaking minor parameters of a word problem) would falsify Default in favor of on-the-fly concept construction.
  2. Empirical investigations only reveal moderate levels of recall variability.
  3. A substantial amount of evidence supports Default.

Section 4. Developing a psychological theory of concepts

A psychological theory of concepts must address the following concerns:

  • The nature of the information constitutive of concepts
  • The nature of the processes that use concepts
  • The nature of the vehicle of concepts
  • The brain areas that are involved in possessing concepts
  • The processes of concept acquisition

Section 5. Concept in cognitive science and in philosophy

The gist of the section:

Although both philosophers and cognitive scientists use the term concept, they are not talking about the same things. Cognitive scientists are talking about a certain kind of bodies of knowledge, they attempt to explain the properties of our categorization, inductions etc; whereas philosophers are talking about that which allows people to have propositional attitudes. Many controversies between philosophers and psychologists about the nature of concepts are thus vacuous.

An amusing aside: I desire to explicitly ground the definition of vacuous in some theory of concepts when I come to treat pragmatism.

Anyways, my tentative attempt to restate the above: Philosophers concern themselves with category-concept fidelity, whereas cognitive scientists concern themselves with the lifecycle of the concept within the mental ecosystem.

Section 6. The heterogeneity hypothesis versus the received view

Machery defines the received view as the assumption that, beyond differences within concept subject-matter, concepts share many properties that are scientifically interesting. Machery suggests that this is a mistake, and that the evidence suggests the existence of several distinct types of concept. Concept, in other words, is itself not a category (natural kind). A nuanced sentence if you’ve ever heard one. 🙂

The Heterogeneity Hypothesis, in contrast, claims that processes that produce concepts are distinct, that they share little in common.

Section 7. What kind of evidence could support the heterogeneity hypothesis?

Three kinds of evidence are predicted:

  1. When the conceptualization processes fire individually, we expect each to receive strong confirmation in just those experiments.
  2. When the conceptualization processes fire together, outputs may be incongruent, requiring mediation; we thus expect processing delays.
  3. Although the epistemology of dissociations is intricate, we should expect confirmation from neuropsychological data analysis.

Section 8. The fundamental kinds of concepts

Three different kinds of concepts exist in your cognitive architecture:

  1. Prototypes are bodies of statistical knowledge about a category, a substance, a type of event, and so on. For example, a prototype of dogs could store some statistical knowledge about the properties that are typical of dogs and/or the properties that are diagnostic of the class of dogs… Prototypes are typically assumed to be used in cognitive processes that compute similarity linearly.
  2. Exemplars are bodies of knowledge about individual members of a category (e.g., Fido, Rover), particular samples of a substance, and particular instances of a kind of event (e.g., my last visit to the dentist). Exemplars are typically assumed to be used in cognitive processes that compute similarity nonlinearly.
  3. Theories are bodies of causal, functional, generic, and nomological knowledge about categories, substances, types of events, etc. A theory of dogs would consist of some such knowledge about dogs. Theories are typically assumed to be used in cognitive processes that engage in causal reasoning.
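The linear/nonlinear contrast in items 1 and 2 can be made concrete. Here is a minimal sketch of my own (the feature vectors, weights, and sensitivity parameter are illustrative, and the exemplar computation only loosely follows Nosofsky’s generalized context model): prototype processes score similarity as a weighted, additive feature match, while exemplar processes sum an exponential, hence nonlinear, function of the distance to each stored exemplar.

```python
import math

def prototype_similarity(item, prototype, weights):
    """Linear similarity: a weighted, additive count of matching features."""
    return sum(w for x, p, w in zip(item, prototype, weights) if x == p)

def exemplar_similarity(item, exemplars, sensitivity=2.0):
    """Nonlinear similarity (GCM-style): exponential decay over the
    distance to each stored exemplar, summed across exemplars."""
    total = 0.0
    for ex in exemplars:
        mismatches = sum(1 for x, e in zip(item, ex) if x != e)
        total += math.exp(-sensitivity * mismatches)
    return total

# Features: [barks, has_fur, wags_tail] -- purely illustrative.
novel = (1, 1, 0)
dog_prototype = (1, 1, 1)
dog_exemplars = [(1, 1, 1), (1, 1, 0), (0, 1, 1)]  # e.g., Fido, Rover, ...

print(prototype_similarity(novel, dog_prototype, [1.0, 1.0, 1.0]))  # → 2.0
print(round(exemplar_similarity(novel, dog_exemplars), 3))          # → 1.154
```

Because the exemplar score decays exponentially with distance, one very close stored instance can dominate many distant ones, behavior a linear prototype match cannot reproduce.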

Some phenomena are well explained if the concepts elicited by some experimental tasks are prototypes; some phenomena are well explained if the concepts elicited by other experimental tasks are exemplars; and yet other phenomena are well explained if the concepts elicited by yet other experimental tasks are theories. As already noted, if one assumes that experimental conditions prime the reliance on one type of concept (e.g., prototypes) instead of other types (e.g., exemplars and theories), this provides evidence for the heterogeneity hypothesis.

Let’s illustrate this situation with the work on categorical induction – the capacity to conclude that the members of a category possess a property from the fact that the members of another category possess it and to evaluate the probability of this generalization… the fact that different properties of our inductive competence are best explained by theories positing different theoretical entities constitutes evidence for the existence of distinct kinds of concepts used in distinct processes. Strikingly, this conclusion is consistent with the emerging consensus among psychologists that people rely on several distinct induction processes.

These arguments seem quite powerful at first glance. Even after reviewing peer-reviewed criticisms, their strength does not feel much diminished. Pending my own research into the forest of citations embedded within this section, I will proceed with my theorizing as though the Heterogeneity Hypothesis is true.

Section 9. Neo-empiricism

In contrast, neo-empiricism can be summarized with the following two theses:

  1. The knowledge that is stored in a concept is encoded in several perceptual and motor representational formats.
  2. Conceptual processing involves essentially re-enacting some perceptual and motor states and manipulating those states.

Amidst broader empirical concerns, Machery outlines three problems for the neo-empiricist school:

  1. Anderson’s problem: many competing versions of amodal concept theories exist, and neo-empiricists tend to assert victory over weaker versions of amodal theorizing.
  2. Imagery problem: it is hard to affirm that imagery is the only type of process people have; people seem to have amodal concepts that are used in non-perceptual processes.
  3. Generality problem: some concepts (magnitude of classes, tonal sequences) have been empirically shown to be amodal, but neo-empiricists are bound to assume that all concepts are perceptual.

However, despite these concerns, Machery is happy to concede that there may “be something to” neo-empiricist arguments. In that case, a fourth, perceptual process would be added to the hypothesis. But the author suggests that, at this time, there is simply not enough evidence to justify this fourth concept-engine.

Machery seems not to appreciate an obvious implication here. Recall that all concepts are “conceived” and “reared” under perceptual supervision. What is to prevent a daisy-chaining effect, whereby concepts are recalled which drag with them perceptual reconstructions, which in turn permit new conceptual manipulations, and so on? This information pathway could explain phenomena such as Serial Associative Cognition, a Stanovichian term. One weakness of Machery’s account is that it does not draw enough constraints from the broader decision-making literature; Serial Associative Cognition must be explained in the language of concepts just as much as Similarity Judgments.

Speaking generally, the manner in which percepts influence concept modification is severely under-explored. The exact same percept of a dog could be the first draft of an exemplar-concept (e.g., in an infant), could subliminally modify a prototype-concept (e.g., in an adult), or could explicitly falsify a theory-concept (e.g., in a veterinarian). In the final analysis, it strikes me as unlikely that a perceptual concept-constructor module would simply be a cousin to the other three. I would expect neo-empiricist arguments ultimately to be housed in some larger framework, with a more complete description of perceptual processing.

Section 10. Hybrid theories of concepts.

Hybrid theories of concepts grant the existence of several types of bodies of knowledge, but deny that these form distinct concepts; rather, these bodies of knowledge are the parts of concepts. Some hybrid theories have proposed that one part of a concept of x might store some statistical information about the x’s, while another part stores some information about specific members of the class of x’s, and a third part some causal, nomological, or functional information about the x’s…. [but] evidence tentatively suggests that prototypes, set of exemplars, and theories are not coordinated [in this way].

Section 11. Multi-process theories

While Machery is quick to concede that the evidence for many cognitive processes is incontrovertible, he retorts that dual-process theories traditionally fail to answer the following two questions:

  1. In what conditions are the cognitive processes underlying a given [module] triggered?
  2. If the cognitive processes are [simultaneously] triggered, how does the mind [coordinate] their outputs?

A legitimate criticism of dual-process theories.

What is known [regarding concepts and dual-process theories] can be presented briefly. It appears that the categorization processes can be triggered simultaneously, but that some circumstances prime reliance on one of the categorization processes. Reasoning out loud seems to prime people to rely on a theory-based process of categorization. Categorizing objects into a class with which one has little acquaintance seems to prime people to rely on exemplars. The same is true of these classes whose members appear to share few properties in common. Very little is known about the induction processes except for the fact that expertise seems to prime people to rely on theoretical knowledge about the classes involved.

This is irrelevant to dual-process theory… dual-process theory is concerned with how some mental processes become conscious, decontextualized, slow, and effortful, etc. The above quote is instead an unrelated (albeit interesting) glimpse at how the different conceptualization modules may interact.

Section 12. Open questions

Machery identifies three directions for future inquiry:

  1. There are several prototype theories, several exemplar theories, and several theory theories. It remains unclear which theory [of each type] is correct. Too little attention has been given to investigating the nature of prototypes, exemplars, and theories.
  2. What factors determine whether an element of knowledge about x is part of the concept of x, rather than part of the background knowledge about x?
  3. How might conceptualization cohere with dual-process theories?

Dual-process theory is actually more expansive than Machery allows. The concept of Default, defined in section 3, is a System1 behavior. Thus, the questions of Default vs. Manual Override, Concept vs. Background Knowledge… these swiftly become absorbed into the need for dual-process theorizing…

Section 13. Concept eliminativism

Machery finally advances tentative philosophical and sociological reasons one might banish concept from our professional vocabulary.

Theoretical terms are often rejected when it is found that they fail to pick out natural kinds. To illustrate, some philosophers have proposed to eliminate the term emotion from the theoretical vocabulary of psychology on these grounds. The proposal here is that concept should be eliminated from the vocabulary of cognitive science for the same reason.

The continued use of concept in cognitive science might invite cognitive scientists to look for commonalities… if the heterogeneity hypothesis is correct, these efforts would be wasted. By contrast, replacing concept with prototype, exemplar, and theory would bring to the fore urgent open questions.

Interesting suggestions. However, I think it is clear that more theoretical weight lies in Machery’s heterogeneity hypothesis.

Concluding Thoughts

Three different kinds of concepts must imply three different kinds of conceptualization modules.

Novel prediction: damage to any one of these modules must inhibit only one kind of conceptualization.

Much, much more work is needed…

One counterargument made in the responses to this Précis caught my eye. David Danks of CMU argues that all three conceptualization modules can be modeled as special cases of a single graphical-model representation. His paper, Theory Unification and Graphical Models in Human Categorization (2007), argues to this effect. Machery’s reply to this counterpoint is brief, pointing to its disconnect from biological evidence, although Machery elsewhere allows that causal models might underlie concept-theory construction (cf. A Theory of Causal Learning in Children: Causal Maps and Bayes Nets (2004)).

I will close with a quote from Couchman et al., in a response to this Précis:

Our task is to carve nature at its joints using the psychological knife called concepts. It is true, it is profoundly important to know, and it is all right for the progress of science that the knife is Swiss-Army issue with multiple blades.

Kahneman: Thinking, Fast and Slow

The best popular science book I have encountered to date.

Extremely well-organized with short, self-contained chapters. Kahneman is an intellectual giant, and it shows in his writing. The book surveys an impressive amount of material. His seminal paper on prospect theory – the most widely cited article in the social sciences – is explained in detail. Every chapter ends with “water cooler quotes”, which I found to be a surprisingly useful way to recap new material.

Kahneman’s book is organized into three distinctions:

1. Cognitive Systems. “System1” is the first system discussed, and is summarized as “fast thinking”. It is associative, subconscious, and heuristic-oriented. “System2” is a more recent biological phenomenon; it is more analytical, abstract, purposive, and effortful, and also lazier. This distinction is not merely the theoretical construct of one researcher; it is the basis of dual-process theories in psychology, one of the most active areas of psychological research today.

2. Behavioral Agents. “Econs” are the decision making agents found in classical economic textbooks. Their preferences are constant, consistent, and geared towards maximizing utility. “Humans” are the decision making agents in the real world, cognitively driven by System1 and System2. Their preferences change, are manipulable, and are deeply inconsistent. This distinction is rooted in the modern field of behavioral economics (which Kahneman helped found).

3. Phenomenological Selves. The “experiencing self” is the self that experiences the moment, the self that perceives the world as it flies by. The “remembering self” is the self that constructs a narrative of personhood; it draws on memory to reconstruct the experiencing self, but its restorative work is – as with everything else in the book – subject to error. Here is a TED talk Kahneman gave on the subject: http://www.ted.com/talks/daniel_kahne…

I highly recommend this book.

Evans: In Two Minds


In Two Minds surveys state-of-the-art dual-process theories, which have become enormously influential within modern psychology.

Dual-process theory holds that our mental lives are the result of activity of, and interactions between, two separate minds. The first mind – System 1 – is the mind that drives your car when you daydream. It is fast, implicit, subconscious, associative, and evolutionarily ancient. System 2, in contrast, is the mind that generates directions. It is cognitively taxing, language-oriented, conscious, abstract, and a relative newcomer on the ecological scene.

Numerous flavors of dual-process theory have emerged over the years. The theory has emerged, relatively independently, from the following traditions: social psychology, cultural psychology, psychometrics, developmental psychology, behavioral economics, and artificial intelligence. While such independent convergence suggests a common biological substrate, little effort had been made to synthesize these different perspectives until now. This anthology, itself written to complement an interdisciplinary academic conference, represents a significant step towards such a harmonization.

One of my few complaints is that the lack of a canonical vocabulary made comparative analysis between chapters difficult. That said, the breadth of subjects treated was astonishing, and the writing quality was generally excellent. I should mention that most chapters have been made publicly available via their originating universities. The following chapters struck me as especially significant:

Ch 2: Evans: How many dual process theories do we need?
Book editor, and leader of the dual-process synthesis movement, Jonathan Evans presents his vision for dual-process theory development. He begins by presenting the clusters of properties associated with either mind. Insights from a diverse set of traditions are collected, with a particular interest taken in mediating inter-system communications. The chapter closes with a hybrid model of mental architectures. An ideal one-shot introduction to the field.

Ch 3: Stanovich: Distinguishing the reflective, algorithmic, and autonomous minds: Is it time for a tri-process theory?
Psychometric legend Keith Stanovich rocks the boat with his proposal to bifurcate System 2 into reflective and algorithmic minds. The algorithmic mind is what IQ tests measure, and is correlated with working memory. It is also thought to be a measure of cognitive decoupling, echoing Aristotle’s famous dictum: “It is the mark of an educated mind to be able to entertain a thought without accepting it.” The reflective mind is, in contrast, driven by thinking dispositions. It explains how otherwise extremely intelligent people flounder – smarts need to be complemented by work ethic, mental resource-management, innovation, and other properties. The chapter closes with a stunning taxonomy of thinking errors, which explores in great detail how the heuristics-and-biases literature motivates the movement.

Ch 5: Carruthers: An architecture for dual reasoning
Philosopher of mind Peter Carruthers explores concepts developed in his acclaimed work, The Architecture of the Mind. His argument, inspired by the massive modularity thesis of evolutionary psychology, moves at a brisk pace. Breathtaking structural diagrams are presented, and grounded in wide swathes of empirical data. Carruthers’ main thesis is that this architecture is shared between System 1 and System 2: when consciousness takes over, it disconnects the modules from the action production systems in order to simulate various outcomes.

Ch 8: Thompson: Dual-process theories: a metacognitive perspective
While theorists have much to say about the different roles of either system, little is known about how the two interact. Thompson seeks to fill this gap with an account of the emotional payload people experience when, say, they solve a riddle. Such Feelings of Rightness (FORs, also known as yedasentience) are transmitted from System 1 to System 2, which tends to intervene only when the FOR is insufficiently strong.

Ch 10: Buchtel, Norenzayan: Thinking across cultures: implications for dual processes
It is an unfortunate truth that many psychological studies generalize their conclusions even though their subject pools consist entirely of American psychology undergraduates. In this important chapter, Buchtel and Norenzayan explain why such a narrow scope conceals the true breadth of human cognitive diversity. Cross-cultural studies are analyzed, with the conclusion that the System 2 characteristics of East Asian subjects consistently diverge from those of Western students. Subjects immersed in East Asian culture tend to focus more attention on contextual features of problems. The implications of this difference – for theory modification, and for an account of how culture shapes the ontogeny of cognition – are explored.

Ch 11: Sun, Lane, Matthews: The two systems of learning: architectural perspective
Artificial intelligence researcher Ron Sun reviews his architectural innovation, CLARION. Since this computational innovation is well-documented at length elsewhere (including Wikipedia), Sun zeroes in on its relationship with dual-process theorizing. Specifically, and in contrast with Carruthers above, his software posits two distinct computational entities, and is able to recreate human idiosyncrasies via an exploration of the systems’ interactions.

Ch 13: Lieberman: What zombies can’t do: a social cognitive neuroscience approach to the irreducibility of reflective consciousness
For centuries, academics have countenanced philosophical zombies: what would it mean if a human being could behave normally but lack conscious experience? Lieberman here harnesses dual-process theories and neuroimaging data to explore the more focused question of whether such a phenomenon is nomologically possible.

Ch 15: Saunders: Reason and intuition in the moral life: a dual-process account of moral justification
Saunders examines the phenomenon of moral dumbfounding, via one of its manifestations regarding incest. Most people, when asked whether a short story about incest represents something morally wrong, will answer affirmatively and provide their reasons. However, when the storyteller removes the offending reasons (both parties are psychologically unharmed, there is no risk of pregnancy, etc), subjects generally maintain that the behavior is wrong, yet they cannot explain why they think so. The author goes on to explain how such moral dumbfounding is the result of clashing moral conclusions between the still-outraged System 1 and the deprived-of-reasons System 2.

This is my favorite cognitive science text to date.