Nietzsche: God is Dead

Context

It is no secret that human beings are terrible at thinking about politics and religion. Every in-group has its own collection of pet ideas (its own semiosphere). And every single one of these massive cultural apparatuses is apt to go awry:

  • Arguably, some pro-intellectual groups have gotten history wrong (cf. the Galileo affair).
  • Arguably, some anti-intellectual groups have gotten biology wrong (cf. natural selection).
  • Arguably, some anti-religion groups have gotten philosophy wrong (cf. logical positivism).

Today, I’d like to illustrate a “culture war” meme whose origins some pro-religion groups have gotten wrong: the phrase “God is dead”. I do this for three reasons:

  1. The meme is still misused frequently, yet its correction is less well known than the corrections to the three errors above.
  2. Nietzsche is an exceptionally interesting writer.
  3. To explain why I view the cognitive science of religion as a subject worthy of our attention.

Below I reproduce where “God is Dead” comes from: a parable.

From Nietzsche’s The Gay Science

The Madman

Have you not heard of that madman who lit a lantern in the bright morning hours, ran to the marketplace, and cried incessantly, “I seek God! I seek God!” As many of those who do not believe in God were standing around just then, he provoked much laughter. Why, did he get lost? said one. Did he lose his way like a child? said another. Or is he hiding? Is he afraid of us? Has he gone on a voyage? Thus they yelled and laughed. The madman jumped into their midst and pierced them with his glances.

“Whither is God” he cried. “I shall tell you. We have killed him – you and I. All of us are his murderers. But how have we done this? How were we able to drink up the sea? Who gave us the sponge to wipe away the entire horizon? What did we do when we unchained this earth from its sun? Whither is it moving now? Whither are we moving now? Away from all suns? Are we not plunging continually? Backward, sideward, forward, in all directions? Is there any up or down left? Are we not straying as through an infinite nothing? Do we not feel the breath of empty space? Has it not become colder? …

Do we not smell anything yet of God’s decomposition? Gods too decompose. God is dead. God remains dead. And we have killed him. How shall we, the murderers of all murderers, comfort ourselves? What was the holiest and most powerful of all that the world has yet owned has bled to death under our knives. Who will wipe this blood off us? What water is there to clean ourselves? What festivals of atonement, what sacred games shall we have to invent? Is not the greatness of this deed too great for us? Must not we ourselves become gods simply to seem worthy of it? There has never been a greater deed; and whoever will be born after us – for the sake of this deed he will be part of a higher history than all history hitherto.”

Here the madman fell silent and looked again at his listeners; and they too were silent and stared at him in astonishment. At last he threw his lantern on the ground, and it broke and went out. “I came too early,” he said then; “my time has not come yet. This tremendous event is still on its way – it has not yet reached the ears of man. Lightning and thunder require time, the light of the stars requires time, deeds require time even after they are done, before they can be seen and heard.”

Analysis

Before reading on, please ask yourself:

  1. Who is Nietzsche addressing in this passage?
  2. Is Nietzsche discussing theology, or sociology?
  3. What is Nietzsche’s point?

Give yourself a minute to get comfortable with your answers.

Here’s my summary of what Nietzsche scholars think:

Nietzsche, speaking through the madman, is addressing atheists rather than theists. The theist is thus in a position to observe an inner dispute, in the midst of the “other team”. Nietzsche is in no way making a theological claim; rather, he is calling attention to the social and cultural consequences of atheism. “God is dead” refers to how the tides of secularism are affecting the idea of God.

What is Nietzsche’s message? Nietzsche is offering an (extremely) sharp condemnation of uncritical atheism. His core message is that those who idly hope that the secularization thesis is true, without considering its consequences, are hopelessly naive. Religion, according to Nietzsche, is much too important to public life to pass away without impact. He begs, he pleads, he cajoles nonbelievers to consider the implications of their disbelief. (What does calling religious belief “the entire horizon” say about his views of the importance of religion?)

Concluding Thoughts

The above interpretation, in addition to being prima facie compelling, is fairly closely aligned with what you’ll hear from nearly all professional philosophers (cf. this article). Of course, Nietzsche had many negative things to say about religion elsewhere (and yet, I have had conversations with Nietzschean Christians).

The first reason for this post was a simple correction. Perhaps misleading posters such as the following will now raise a few more eyebrows:

nietzsche_vs_god

The second reason for this post was to present Nietzsche’s artistic talent. Perhaps you’ll find yourself sufficiently motivated to step through my summary of his Genealogy of Morals. (I should eventually get around to outlining how Nietzsche has influenced my thinking.)

The third reason for this post was to draw attention to the cognitive science of religion. One of my great pleasures in Nietzsche is his psychological incisiveness (e.g., he influenced later theorizing about the subconscious). Here, Nietzsche puts his finger on the peculiar power religion has over the mind of man, particularly in his search for meaning. The religious impulse of our species is undeniably strong; you can even witness it within secular communities (Atheism 2.0 is an interesting illustration of this).

I have two books on my wish list that I hope will help accelerate my theorizing about religious cognition:

Ultimately, this research will find a home within my larger project of building a mental architecture!

The Causal Inverse Problem

Part Of: Causal Inference sequence
Content Summary: 1000 words, 10 min read.

A Riddle

We begin with a riddle!

riddle

We will arrive at an answer by the end of this article. 🙂  Our journey will begin with a survey of a field within visual processing.

The Mystery Of Stereopsis

Stereopsis is the computational construction of depth from visual data. Physics is embedded in three spatial dimensions, yet your retinae are essentially 2D (imagine wrapping a sheet of paper around half of a sphere). Depth information can be gleaned from comparing the disparities between two similar images, and applying geometric principles to compute depth.  The dual images do not have to come from two eyes, either!  Close one eye, and the brain can still infer depth from motion (by comparing two images from the same eye across time).

However, stereopsis is plagued by the problem of underdetermination. The following diagram motivates this nicely:

depth_matrix

The inverse projection is your mental model of the environment. However, your brain only possesses 2D retinal images.  To recreate the environment, we consider image matches:

  1. Grey hexes are matches (the left image color matches the right image color).
  2. White hexes are non-matches.

The grey hexes are possible 3D interpretations of the 2D images. The black hexes are the correct 3D interpretation. The brain must select a subset of grey hexes to be black hexes (that is, decide which possible interpretation is veridical). This is the visual inverse problem.

The Secret To Depth Reconstruction

Visual data alone provides no obvious solution to the visual inverse problem. How then do we explain interpretation consensus (that mammals almost always agree on one particular depth-interpretation), and interpretation veracity (that the consensus is almost always correct)?

Consider the inverse projection again. Do you notice that the black hexes (correct answers) tend to be side-by-side?

In general, we might prefer interpretations (grey hexes) that are spatially continuous. The brain in fact uses cues like spatial continuity to solve the visual inverse problem.
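To make the selection problem concrete, here is a minimal Python sketch of a one-dimensional version. Everything in it is an assumption made for illustration: the pixel rows are made up, a candidate match is taken to be any pair of same-colored pixels, and `roughness` is a hypothetical stand-in for spatial continuity (it is not Marr’s algorithm).

```python
from itertools import product

# Two made-up "retinal" rows; the left row is the right row shifted by one pixel.
left = ["r", "g", "r", "b"]
right = ["w", "r", "g", "r", "b"]

def candidates(left_row, right_row):
    """For each left pixel, list the right-row positions with the same color."""
    return [
        [j for j, color in enumerate(right_row) if color == left_color]
        for left_color in left_row
    ]

def roughness(assignment):
    """Sum of disparity jumps between neighboring left pixels (0 is perfectly smooth)."""
    disparities = [i - j for i, j in enumerate(assignment)]
    return sum(abs(a - b) for a, b in zip(disparities, disparities[1:]))

# Every way of picking one candidate per left pixel is a possible interpretation.
interpretations = list(product(*candidates(left, right)))
smoothest = min(interpretations, key=roughness)

print(len(interpretations), "possible interpretations")
print("smoothest:", smoothest, "roughness:", roughness(smoothest))
```

The two red pixels each have two candidate partners, so the images alone leave four interpretations open. Preferring the smoothest one picks out the single interpretation in which every pixel has the same disparity.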

Spatial continuity helps us begin to understand interpretation consensus. But it alone is insufficient for selecting only one possible interpretation. The brain relies on a total of six cognitive assumptions:

  1. Existence Of Surfaces: The visible world can be regarded as being composed of smooth surfaces having reflectance functions whose spatial structure may be elaborate.
  2. Hierarchical Organization: A surface’s reflectance function is often generated by a number of different processes, each operating at a different scale.
  3. Similarity: The items generated on a given surface by a reflectance-generating process acting at a given scale tend to be more similar to one another in their size, local contrast, color, and spatial organization than to other items on that surface.
  4. Spatial Continuity: Markings generated on a surface by a single process are often spatially organized – they are arranged in curves or lines and possibly create more complex patterns.
  5. Continuity Of Discontinuities: The loci of discontinuities in depth or in surface orientation are smooth almost everywhere.
  6. Continuity Of Flow: If direction of motion is ever discontinuous at more than one point – along a line, for example – then an object boundary is present.

In his book, Marr shows how these assumptions can be expressed in computational algorithms that solve the visual inverse problem. Further, neurobiological evidence suggests that one of these algorithms is the actual mechanism used by our brains.

The Nature Of Cognitive Assumptions

Why do these cognitive assumptions work? Because Earth’s photic environment features important statistical regularities. We assume similarity because visual characteristics tend to be more homogeneous within an object than between objects.

These six assumptions also explain many optical illusion phenomena. Most optical illusions represent statistical deviations that violate our reliance on the above assumptions. For example, the depth illusion at the beginning of the article violates our brain’s natural intuitions about perspective. Such illusions are therefore not a misfiring of an individual human vision system. They are a design consequence.

How do our brains know about these statistical regularities? Two vehicles suggest themselves:

  1. Natural Selection. Since the world is rife with statistical regularities, organisms that encode this structure more efficiently will tend to outperform their peers.
  2. Developmental Learning. In addition to short-term episodes of visual inference, the visual system might itself learn to retain information about statistical regularities. This is suggested, e.g., by recent research on visual normalization.

If physics were different, the statistics of everyday vision would be different, and thus a different collection of cognitive assumptions would have emerged.

Crossing The Bridge To Causal Inference

Gopnik et al. suggest that cognitive assumptions are not unique to vision. Causal inference also relies on statistical regularities of causation. Specifically, the brain relies on the following causal assumptions:

  1. Markov Assumption. The conditional probability distribution of future states of the process (conditional on both past and present values) depends only upon the present state; that is, given the present, the future does not depend on the past. In a causal graph, this means each variable is independent of its non-descendants given its direct causes.
  2. Faithfulness Assumption. In the joint distribution on the variables in the graph, all conditional independencies are consequences of the Markov assumption applied to the graph.

The Markov assumption says that there will be certain conditional independencies if the graph has a particular structure; the faithfulness assumption says that there will be those conditional independencies only if the graph has a particular structure. The faithfulness assumption supplies the other half of the biconditional.
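A small simulation can show both halves of the biconditional at work. This is a sketch under assumptions of my own: linear relationships, Gaussian noise, and made-up edge strengths (0.8 and -0.64); none of these come from Gopnik et al.

```python
import random
from math import sqrt

def corr(xs, ys):
    """Pearson correlation of two equal-length samples."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def partial_corr(xs, ys, zs):
    """Correlation of xs and ys after linearly controlling for zs."""
    rxy, rxz, ryz = corr(xs, ys), corr(xs, zs), corr(ys, zs)
    return (rxy - rxz * ryz) / sqrt((1 - rxz ** 2) * (1 - ryz ** 2))

rng = random.Random(0)
n = 20_000

# Graph 1: a chain X -> Y -> Z.
x = [rng.gauss(0, 1) for _ in range(n)]
y = [0.8 * xi + rng.gauss(0, 1) for xi in x]
z = [0.8 * yi + rng.gauss(0, 1) for yi in y]
chain_marginal = corr(x, z)          # X and Z are dependent ...
chain_given_y = partial_corr(x, z, y)  # ... but independent once Y is known

# Graph 2: the same chain plus a direct edge X -> Z whose strength (-0.64)
# exactly cancels the indirect path (0.8 * 0.8).
z_cancel = [0.8 * yi - 0.64 * xi + rng.gauss(0, 1) for xi, yi in zip(x, y)]
cancel_marginal = corr(x, z_cancel)  # looks independent despite the X -> Z edge

print(round(chain_marginal, 2), round(chain_given_y, 2), round(cancel_marginal, 2))
```

In the chain, the independence of X and Z given Y is exactly what the Markov assumption predicts from the graph. In the second graph, X and Z look unrelated even though X causes Z; that independence comes from a numerical coincidence rather than from the graph's structure, which is the kind of case faithfulness rules out.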

Solving The Riddle

Statisticians have long known about Simpson’s Paradox: “a paradox in which a trend that appears in different groups of data disappears when these groups are combined, and the reverse trend appears for the aggregate data”.

Image 2 summarizes this effect well: only when you disaggregate by gender can you see the deleterious effect of the drug on recovery probability.

riddle

These two figures are similar in virtue of the fact that they violate cognitive assumptions embedded in all neurotypical adults:

  • Image 1 violates visual assumptions (perspective assumptions)
  • Image 2 violates causal assumptions (faithfulness assumption)
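The reversal is easy to check by hand. The counts below are hypothetical, chosen in the style of textbook presentations of Simpson’s Paradox; they are not read off Image 2.

```python
# (recovered, total) for each gender and treatment; hypothetical counts.
counts = {
    "men": {"drug": (18, 30), "no drug": (7, 10)},
    "women": {"drug": (2, 10), "no drug": (9, 30)},
}

def group_rate(group, treatment):
    """Recovery rate within one gender."""
    recovered, total = counts[group][treatment]
    return recovered / total

def pooled_rate(treatment):
    """Recovery rate after combining both genders."""
    recovered = sum(counts[group][treatment][0] for group in counts)
    total = sum(counts[group][treatment][1] for group in counts)
    return recovered / total

for group in counts:
    print(group, group_rate(group, "drug"), "vs", group_rate(group, "no drug"))
print("pooled", pooled_rate("drug"), "vs", pooled_rate("no drug"))
```

Within each gender the drug group recovers less often (0.6 vs 0.7 for men, 0.2 vs 0.3 for women), yet the pooled data favor the drug (0.5 vs 0.4). The reversal arises because, in these counts, men both recover more often and take the drug more often.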

References

  • Marr (1982). Vision.
  • Gopnik et al. (2004). A Theory of Causal Learning in Children: Causal Maps and Bayes Nets.

Causal Inference with pcalg

Part Of: Causal Inference sequence
Content Summary: 2200 words, 22 min read

Introduction

In this post, we’re going to explore one way to do causal inference, as described in the following article:

Title: More Causal Inference with Graphical Models in R Package pcalg
Authors: Kalisch, M. et al.
Published: 2014
Citations: 49 (note: as of 04/2014)
Link: Here (note: not a permalink)

Setting The Stage

Statistics is haunted by soundbites like few other professions. “Lies, damned lies, and statistics” needs to die. The way to mitigate deceit is not ignorance; it is the promotion of statistical literacy. “Correlation does not imply causation” should also be expunged. There must be a way to affirm the significance of spurious correlations without blinding people to the fact that causation can be learned from correlation.

When you read someone like C.S. Peirce, you will hear claims that causality is dead. Causality is, indeed, a very ancient topic. In the medieval period, the Aristotelian story about causality – a quadripartite distinction of Material Cause, Formal Cause, Efficient Cause, Final Cause – dominated the intellectual landscape. The moderns, however, were largely dissatisfied with this story; with the Newtonian introduction of forces, the above distinction began to fade into the background. So why are scientists now trying to reclaim causality from the annals of philosophy?

Enter Judea Pearl, champion of Bayesian networks and belief propagation. Dissatisfied with his near-godlike contributions to humanity, he proceeded to found modern causal theory with this text, appropriately named Causality. The reason that causality has reclaimed its sexiness is that Pearl found a way to quantify it, and to update one’s beliefs about it, from raw data. Pearl grounds his version of causality in counterfactual reasoning, and borrows heavily from modal logic (cf. possible worlds). He also introduces the notion of do-calculus, noting that probability theory needs operators that model action (just as “|” models observation). This SEP section explores the philosophical underpinnings of the theory in more depth.

Pearl’s movement is picking up speed. Today, you’ll find causal inference journals, conferences bent on exploring the state of the art, and business leaders trying to harness its powers to make a profit. Causal inference will be the next wave of the big data movement. It explains how human brains create concepts. It is the future of politics.

Put on your seatbelts. We’re going to take causal inference software – an R package named pcalg – out for a drive. If you want the steering wheel, you can have it: install RStudio, and refer to the step-by-step tutorial in the paper (or, see Appendix below). This article won’t attempt to instill a complete understanding of causal models; I am content to build up your vocabulary.

The causal inference process can thus be modeled as three causal artifacts (data, models, measures), and two algorithm categories (modelling, do-calculus).

Causal Models- Overview

Causal Artifacts

Subtleties With Data

By data we normally mean observational data, which consists of random variables that are independent and identically distributed (iid assumption). However, sometimes our algorithms must process interventional data. What is the difference?

We often have to deal with interventional data in causal inference. In cell biology for example, data is often measured in different mutants, or collected from gene knockdown experiments, or simply measured under different experimental conditions. An intervention, denoted by Pearl’s do-calculus, changes the joint probability distribution of the system; therefore, data samples collected from different intervention experiments are not identically distributed (although still independent).
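Here is a toy simulation of that difference. The model is an assumption of mine for illustration: a hidden binary variable Z drives both X and Y, X has no effect on Y, and the 0.9 copy probability is made up.

```python
import random

def draw(rng, do_x=None):
    """One sample from Z -> X and Z -> Y; `do_x` overrides X's usual mechanism."""
    z = rng.random() < 0.5
    if do_x is None:
        x = z if rng.random() < 0.9 else not z  # X normally copies Z, with 10% noise
    else:
        x = do_x  # intervention: X is set by fiat, ignoring Z
    y = z if rng.random() < 0.9 else not z  # Y copies Z; X never influences Y
    return x, y

rng = random.Random(1)
n = 20_000

# Observational data: condition on having seen X = True.
observed = [draw(rng) for _ in range(n)]
y_when_x_seen = [y for x, y in observed if x]
p_y_given_x = sum(y_when_x_seen) / len(y_when_x_seen)

# Interventional data: force X = True in every sample.
intervened = [draw(rng, do_x=True) for _ in range(n)]
p_y_given_do_x = sum(y for _, y in intervened) / n

print(round(p_y_given_x, 2), round(p_y_given_do_x, 2))
```

Seeing X = True makes Y = True likely (analytically 0.9 × 0.9 + 0.1 × 0.1 = 0.82), because it is evidence about Z. Setting X = True tells us nothing about Z, so the probability of Y stays near 0.5. The two datasets come from different joint distributions, which is why they cannot be pooled as if identically distributed.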

How do we get from raw data to causal relationships? The secret lies in conditional independence: “Can I use this variable to predict that one, given that I know the value of this third data point?”. Specifically, conditional independence is used to infer a property known as d-separation. D-separation enables us to prune away edges that represent spurious correlations.

We only deal with distributions whose list of conditional independencies perfectly matches the list of d-separation relations of some DAG; such distributions are called faithful. It has been shown that the set of distributions that are faithful is the overwhelming majority [7], so that the assumption does not seem to be very strict in practice.

How do we learn conditional independencies? From a conditional-independence oracle, a black box that unfailingly gives us the correct answers. While such a thing is not realized in the real world, an approximation of it is, in fact, leveraged by our causal algorithms:

In practice, the conditional independence oracle is replaced by a statistical test for conditional independence. For… the PC algorithm, [this replacement] is computationally feasible and consistent even for very high-dimensional sparse DAGs.
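For Gaussian data, one common choice of statistical test is Fisher’s z-transform of the partial correlation. The sketch below assumes that test; the function names, the example correlations, and the sample size are hypothetical.

```python
from math import atanh, sqrt
from statistics import NormalDist

def fisher_z_pvalue(partial_correlation, sample_size, conditioning_set_size):
    """Two-sided p-value for the hypothesis that the partial correlation is zero."""
    z = atanh(partial_correlation)
    statistic = sqrt(sample_size - conditioning_set_size - 3) * abs(z)
    return 2 * (1 - NormalDist().cdf(statistic))

def keep_edge(partial_correlation, sample_size, conditioning_set_size, alpha=0.01):
    """Keep an edge only if conditional independence is rejected at level alpha."""
    p_value = fisher_z_pvalue(partial_correlation, sample_size, conditioning_set_size)
    return p_value < alpha

# A strong marginal correlation: the edge survives.
strong = keep_edge(0.45, sample_size=1000, conditioning_set_size=0)
# A near-zero correlation after conditioning on one variable: the edge is pruned.
weak = keep_edge(0.01, sample_size=1000, conditioning_set_size=1)
print(strong, weak)
```

The significance level alpha plays the role of the oracle’s certainty: a smaller alpha makes the test slower to declare dependence, so more edges get pruned.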

But hold on, you may say, data does not just drop into our lap. Data in the real world is incomplete: there may be variables we simply are not tracking (hidden variables). Worse, the subset of data that materializes in front of us is often not random but the product of observer bias: selection variables are at work behind the scenes. As you will see, we can and we will account for these.

A Hierarchy Of Graphical Models

I will present four different types of graphical models.

A DAG (directed acyclic graph) is our language of causality. There exists only one type of edge in a DAG:

  1. Blank-Arrow. These two edgemarks together represent the direction of causation.

Let’s break down the meaning of the acronym. A DAG is:

  • directed due to its arrows
  • acyclic in virtue of the fact that you can’t follow the arrows around in a circle.
  • graphical because it has nodes and edges

An example:

Causal Models- DAG

Notice that this diagram makes the distinction between correlation and causation quite clear. SAT may be highly correlated with Grade, but it has no causal effect on it. In contrast, Class Difficulty is highly correlated with Grade, and it has a causal effect on it. We tell the difference by d-separation.

Two requisite concepts before we go further.

  1. A skeleton is basically a graph with its edgemarks removed.
  2. An equivalence class is a set of graphs with the same skeleton but with different edgemarks. It is the set of all possible graphs consistent with the data.
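In code, a DAG can be as simple as a set of (cause, effect) pairs. The sketch below is illustrative only: the edges are my guess at a graph like the one pictured (the Intelligence node in particular is an assumption), and the helper names are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter

# Each pair reads (cause, effect). Assumed edges, for illustration only.
dag = {
    ("Difficulty", "Grade"),
    ("Intelligence", "Grade"),
    ("Intelligence", "SAT"),
}

def is_acyclic(edges):
    """True if following the arrows can never lead back to the starting node."""
    predecessors = {}
    for cause, effect in edges:
        predecessors.setdefault(effect, set()).add(cause)
    try:
        list(TopologicalSorter(predecessors).static_order())
    except CycleError:
        return False
    return True

def skeleton(edges):
    """Drop the edgemarks: keep only which pairs of nodes are connected."""
    return {frozenset(edge) for edge in edges}

reversed_dag = {(effect, cause) for cause, effect in dag}
print(is_acyclic(dag))
print(skeleton(dag) == skeleton(reversed_dag))
```

Reversing every arrow changes the causal story completely but leaves the skeleton untouched, which is why the skeleton alone cannot tell us the direction of causation.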

Here’s a skeleton of our DAG:

Causal Models- DAG Skeleton

A CPDAG (completed partially directed acyclic graph) [1] is an equivalence class of DAGs. There exist two types of edges in a CPDAG:

  1. Blank-Arrow. The causal direction is displayed clearly if all members of the equivalence class agree.
  2. Arrow-Arrow. The causal direction is ambiguous if there is internal disagreement between members of the equivalence class.

An example:

cpdag

From the above two observations, we see that all DAGs in this equivalence class agree on the V6-V7 relation, but disagree about the V1-V2 relation.

Why would we even need to conceive of such a graph, if DAGs are enough to represent the state of the world? Because, typically, our algorithms can only produce CPDAGs:

Finding a unique DAG from an independence oracle is in general impossible. Therefore, one only reports on the equivalence class of DAGs in which the true DAG must lie. The equivalence class is visualized using a CPDAG.
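For tiny graphs, the equivalence class can be found by brute force. The sketch below relies on a standard criterion that this post does not state: two DAGs imply the same conditional independencies exactly when they share a skeleton and the same colliders (a → b ← c with a and c not adjacent). The function names are hypothetical, and this is not how pcalg computes CPDAGs.

```python
from graphlib import CycleError, TopologicalSorter
from itertools import product

def is_acyclic(edges):
    predecessors = {}
    for cause, effect in edges:
        predecessors.setdefault(effect, set()).add(cause)
    try:
        list(TopologicalSorter(predecessors).static_order())
    except CycleError:
        return False
    return True

def v_structures(edges):
    """Colliders a -> b <- c where a and c are not adjacent."""
    adjacent = {frozenset(edge) for edge in edges}
    found = set()
    for a, b in edges:
        for c, b2 in edges:
            if b == b2 and a < c and frozenset((a, c)) not in adjacent:
                found.add((a, b, c))
    return found

def equivalence_class(true_dag):
    """All acyclic orientations of the skeleton with the same colliders."""
    pairs = sorted(tuple(sorted(edge)) for edge in true_dag)
    target = v_structures(true_dag)
    members = []
    for flips in product([False, True], repeat=len(pairs)):
        candidate = {(b, a) if flip else (a, b) for (a, b), flip in zip(pairs, flips)}
        if is_acyclic(candidate) and v_structures(candidate) == target:
            members.append(candidate)
    return members

def cpdag(true_dag):
    """Report a direction where all members agree, 'ambiguous' otherwise."""
    members = equivalence_class(true_dag)
    marks = {}
    for a, b in sorted(tuple(sorted(edge)) for edge in true_dag):
        directions = {(a, b) if (a, b) in member else (b, a) for member in members}
        marks[(a, b)] = directions.pop() if len(directions) == 1 else "ambiguous"
    return marks

chain = {("A", "B"), ("B", "C")}      # A -> B -> C
collider = {("A", "B"), ("C", "B")}   # A -> B <- C
print(cpdag(chain))
print(cpdag(collider))
```

The chain has three equivalent DAGs, so both of its edges come out ambiguous. The collider is alone in its class, so both of its arrows are reported with confidence.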

But even CPDAGs cannot accommodate those pesky hidden and selection variables!

Suppose, we have a DAG including observed, latent and selection variables and we would like to visualize the conditional independencies among the observed variables only. We could marginalize out all latent variables and condition on all selection variables. It turns out that the resulting list of conditional independencies can in general not be represented by a DAG, since DAGs are not closed under marginalization or conditioning. A class of graphical independence models that is closed under marginalization and conditioning and that contains all DAG models is the class of ancestral graphs.

A MAG (maximal ancestral graph) [8] thus accommodates hidden and selection variables. There exist three types of edges in a MAG:

  1. Blank-Arrow. Roughly, these edges come from observed variables.
  2. Arrow-Arrow. Roughly, these edges come from hidden variables.
  3. Blank-Blank. Roughly, these edges come from selection variables.

Let me note in passing that MAGs rely on m-separation, a generalization of d-separation.

The same [motivation for CPDAGs holds] for MAGs: Finding a unique MAG from an independence oracle is in general impossible. One only reports on the equivalence class in which the true MAG lies (a PAG).

A PAG (partial ancestral graph) [11] is an equivalence class of MAGs. There exist six kinds of edges in a PAG:

  1. Circle-Circle
  2. Circle-Blank
  3. Circle-Arrow
  4. Blank-Arrow
  5. Arrow-Arrow
  6. Blank-Blank

PAG edgemarks have the following interpretation:

  • Blank: this blank is present in all MAGs in the equivalence class.
  • Arrow: this arrow is present in all MAGs in the equivalence class.
  • Circle: there is at least one MAG in the equivalence class where the edgemark is a Blank, and at least one where the edgemark is an Arrow.
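The interpretation rule above is mechanical enough to write down. In this sketch, each MAG is a dictionary from an edge to the pair of edgemarks at its two ends; the representation and the two example MAGs are made up, and they merely stand in for a full equivalence class.

```python
def pag_marks(mags):
    """Combine MAGs (same edges, possibly different edgemarks) into PAG edgemarks."""
    pag = {}
    for edge in mags[0]:
        ends = []
        for position in (0, 1):
            marks = {mag[edge][position] for mag in mags}
            # Agreement keeps the mark; disagreement becomes a circle.
            ends.append(marks.pop() if len(marks) == 1 else "circle")
        pag[edge] = tuple(ends)
    return pag

# ("X", "Y"): ("blank", "arrow") reads as X -> Y; ("arrow", "arrow") reads as X <-> Y.
mag_1 = {("X", "Y"): ("blank", "arrow"), ("Y", "Z"): ("blank", "arrow")}
mag_2 = {("X", "Y"): ("arrow", "arrow"), ("Y", "Z"): ("blank", "arrow")}

pag = pag_marks([mag_1, mag_2])
print(pag)
```

The two MAGs disagree only about the mark at X, so the X-Y edge becomes Circle-Arrow, while the Y-Z edge stays Blank-Arrow.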

Causal Measures

Okay, let’s rewind. Suppose we are in possession of the following CPDAG (whose equivalence class consists of two DAGs):

cpdag

This diagram allows us to, at a glance, evaluate the relationships between variables. However, it does not address the following question: how strong are the causal relationships? Suppose we wish to quantify the causal strength V1 has over V4, V5, and V6. It turns out that this can be done with the application of Pearl’s methods (including do-calculus). With these techniques in hand, we feed this CPDAG to our do-calculus algorithm, and receive the answer!

effects

I’ll let the authors explain what this matrix means:

Each row in the output shows the estimated set of possible causal effects on the target variable indicated by the row names. The true values for the causal effect are 0, 0.0, and 0.52 for variables V4, V5 and V6, respectively. The first row, corresponding to variable V4, quite accurately indicates a causal effect that is very close to zero or no effect at all. The second row of the output, corresponding to variable V5, is rather uninformative: although one entry comes close to the true value, the other estimate is close to zero. Thus, we cannot be sure if there is a causal effect at all. The third row is [like V4 in that it is clear].

Causal inference algorithms, therefore, do not completely liberate us from ambiguity: we are still uncertain of the character of the V1-V5 relation. But, in the V1-V4 and V1-V6 links, we see a different kind of theme: equivalence-class consensus.
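To see where a set of possible effects comes from, here is a rough sketch of the idea as I understand it: for every DAG in the equivalence class, regress the target on the cause plus the cause’s parents in that DAG, and collect the coefficients. The data, the variable names, and the true effect (0.5) are made up, and the helper only handles zero or one parent; pcalg’s own routines are more general.

```python
import random

def cov(xs, ys):
    """Sample covariance of two equal-length samples."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def effect_given_parents(data, cause, target, parents):
    """Coefficient of `cause` when regressing `target` on `cause` and its parents."""
    if target in parents:
        return 0.0  # in this DAG the target causes `cause`, so nothing flows back
    x, y = data[cause], data[target]
    if not parents:
        return cov(x, y) / cov(x, x)
    (parent,) = parents  # this sketch supports at most one parent
    p = data[parent]
    numerator = cov(x, y) * cov(p, p) - cov(x, p) * cov(p, y)
    denominator = cov(x, x) * cov(p, p) - cov(x, p) ** 2
    return numerator / denominator

# Simulated data from the true DAG P -> X -> Y.
rng = random.Random(2)
n = 20_000
p = [rng.gauss(0, 1) for _ in range(n)]
x = [0.7 * pi + rng.gauss(0, 1) for pi in p]
y = [0.5 * xi + rng.gauss(0, 1) for xi in x]
data = {"P": p, "X": x, "Y": y}

# Parents of X in each member of the class: P -> X -> Y, P <- X -> Y, P <- X <- Y.
parent_sets = [{"P"}, set(), {"Y"}]
effects = [effect_given_parents(data, "X", "Y", parents) for parents in parent_sets]
print([round(effect, 2) for effect in effects])
```

Two members of the class agree on an effect near the true 0.5, while the third says the effect is zero. That mix of values is the same kind of ambiguity as the uninformative V5 row.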

Algorithm Categories

Inference Algorithms

  • The PC (Peter-Clark) algorithm [10] takes observational, complete data and outputs a CPDAG.
  • The GES (Greedy Equivalence Search) algorithm [2] performs the same function, but is faster in virtue of its greediness.
  • The GIES (Greedy Interventional Equivalence Search) algorithm [4] generalizes the GES to accommodate interventional data.
  • The FCI (Fast Causal Inference) algorithm [9] [10] accepts observational data with an arbitrary number of hidden or selection variables, and produces a PAG.
  • The RFCI (Really Fast Causal Inference) algorithm [3] does approximately the same thing, faster!

Do-Calculus Algorithms

  • The IDA (Intervention calculus when DAG is Absent) algorithm [5] accepts CPDAGs, and produces a causal measure.
  • The GBC (Generalized Backdoor Criterion) algorithm [6] is able to handle hidden variables, but cannot handle selection variables. It takes PAG, MAG, CPDAG, or DAG models and checks whether a causal measure can be estimated. If it can, it goes ahead and gathers precisely that information.

In passing, the authors note that, in [5], “IDA was validated on a large-scale biological system”.

Conclusion

The Causal Landscape

Time to tie everything together!

Causal Models- Landscape

The State Of The Art

This field is expanding very rapidly. I had the opportunity to read an earlier version of this paper in 2012. To give you a taste of the rate of change, it appears to me that the authors have both produced the mathematics for the GIES and GBC algorithms, and implemented them in R, during the intervening months.

It is useful to gauge a field’s progress in terms of theory constraint – what can we say No to, with these new methods?

  • We can say No to non-quantitative rhetoric.
  • We can say No to appeals to unconstrained ambiguity.
  • We can say No to erroneous causal skeletons.
  • We can say No to denials of equivalence-class consensus.

I have a dream that policy makers will pull up CPDAGs of, say, national economics, and use the mathematics to quantitatively identify points-of-agreement. I have a dream that the strengths of our Nos will clear away the smoke from our rhetorical battlefields long enough to find a Yes.

It is such an exciting time to be alive.

References

[1] Andersson et al. (1997). “A characterization of Markov equivalence classes for acyclic digraphs”.
[2] Chickering (2002). “Optimal structure identification with greedy search”.
[3] Colombo et al. (2012). “Learning High-Dimensional directed acyclic graphs with latent and selection variables”.
[4] Hauser and Bühlmann (2012). “Characterization and greedy learning of interventional Markov equivalence classes of directed acyclic graphs”.
[5] Maathuis et al. (2010). “Predicting Causal Effects in Large-Scale Systems from Observational Data”.
[6] Maathuis and Colombo (2013). “A generalized backdoor criterion”.
[7] Meek (1995). “Strong completeness and Faithfulness in Bayesian Networks”.
[8] Richardson and Spirtes (2002). “Ancestral Graph Markov Models”.
[9] Spirtes et al. (1999). “An Algorithm for Causal Inference in the Presence of Latent Variables and Selection Bias”.
[10] Spirtes et al. (2000). Causation, Prediction, and Search. Adaptive Computation and Machine Learning, second edition. MIT Press, Cambridge.
[11] Zhang (2008). “On the completeness of orientation rules for causal discovery in the presence of latent confounder and selection bias”.

Appendix: Example Commands

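> install.packages("pcalg")
> source("http://bioconductor.org/biocLite.R")
> biocLite("RBGL")
> biocLite("Rgraphviz")
> library("pcalg")
> data("gmG")
> # sufficient statistics for the conditional independence test (this call and the next were garbled; arguments assumed from pcalg's documented usage)
> suffStat <- list(C = cor(gmG$x), n = nrow(gmG$x))
> # run the PC algorithm to estimate the CPDAG
> pc.gmG <- pc(suffStat, indepTest = gaussCItest, p = ncol(gmG$x), alpha = 0.01)
> stopifnot(require(Rgraphviz))
> par(mfrow = c(1,2))
> # true DAG next to the estimated CPDAG
> plot(gmG$g, main = "") ; plot(pc.gmG, main = "")
> # estimated causal effects of V1 on V4, V5 and V6
> idaFast(1, c(4,5,6), cov(gmG$x), pc.gmG@graph)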

Modularity & The Argument From Design

Part Of: Cognitive Modularity sequence
See Also: Fodor: Modularity of Mind
Content Summary: 1600 words, 16min read

Introduction

This post represents an argument for a particular thesis, known as massive modularity. This thesis, particularly popular among evolutionary psychologists, states that the mind is rife with mental modules, and that the cognitive life is the interplay between them.

What is a mental module? If you don’t have a clear grasp on what that means, I recommend just glancing at my summary of Fodorian modularity. Bear in mind, though, that here the term is used somewhat differently: modules here may have only some subset of the listed properties.

The following argument is not my own; it is rather an interpretation of Carruthers’ argument, which is presented in this text, under Section 1.3.

Motivators From Biology

Carruthers starts by surveying the biological literature for instances of modularity. And he finds it, by the truckload:

There is a great deal of evidence from across many levels in biology to the effect that complex functional systems are built up out of assemblies of sub-components. This is true for the operations of genes, of cells, of cellular assemblies, of whole organs, of whole organisms, and of multi-organism units like a bee colony. And by extension, we should expect it to be true of cognition also, provided that it is appropriate to think of cognitive systems as biological ones, which have been subject to natural selection.

Amongst other sources, he cites the following research:

  • West-Eberhard, 2003. Developmental Plasticity and Evolution.
  • Seeley, 1995. The Wisdom of the Hive: the social physiology of honey bee colonies.

We thus possess considerable biological reason to believe that:

(3) Natural selection selects for modularity at a variety of different levels.

A Role For Evolvability

It’s one thing to observe natural selection promoting modularity; it is another to understand why it is doing so. To do this, we must appeal to the concept of evolvability.

Biological populations tend to conform themselves to ecological niches. That is, a species tends to adopt a particular survival strategy that exploits a certain subset of the local biosphere. Let me here introduce a concept I like to call niche distance: two species are in direct competition when the niche distance between them is short. Thus, we could say that the niche distance between two types of weeds in your backyard is small, and the niche distance between the weed and the bald eagle is large.

The fact that niches change is one of the drivers for biological evolution. For example, as the earth warms in the coming centuries, mammalian species will need to acclimate to a different climate, which entails a changed vegetative response, which entails a need for change in eating patterns, etc. Such niche fluctuations are ubiquitous.

We know that evolution is driven by the engine of mutation. But mutation is simply a stochastic, quantum mechanical phenomenon:  there is no way to “speed it up”. Species typically cannot keep pace with niche fluctuations by directly modulating the rate of mutation. Rather, the genetic infrastructure of species must be able to harness mutations to keep pace with niche fluctuations. To put this concept of evolvability very crudely: natural selection does not only select for number of muscles, but also the ability to grow new ones.

(1) Evolvability is selected to allow for fluctuations within an ecological niche.

This video is a cute exploration of how evolvability may be supported in microorganisms by direct tampering with the genetic replication engine. But for larger organisms, the locus of behavior is trans-cellular. The sheer geometry of size compelled cells to become heterogeneous, to constitute interdependent systems. The question of mutation containment, then, becomes central: is it possible for evolution to improve upon one function of an organism, without simultaneously affecting other functions?

Here, finally, is where modularity comes into play. One of the most important features of modularity is encapsulation: the hiding of information within specific containers. Rather than all functions affecting all other functions, computational processes erect walls around themselves, and communicate through them in a controlled fashion. Modular encapsulation is thus seen as a prerequisite for mutation containment:

(2) Modular subsystems are a necessary ingredient for evolvability.

Taken together, premises (1) and (2) support (3) in the following way:

Massive Modularity- Argument From Design- Evolvability

Motivators From Computer Science

In the above section, we were given a nice intuition regarding Premise 2: that modularity affords mutation containment. But perhaps this intuition can be bolstered with evidence from somewhere else entirely:

The basic reason why biological systems are organized hierarchically in modular fashion is a constraint of evolvability. Evolution needs to be able to add new functions without disrupting those that already exist; and it needs to be able to tinker with the operations of a given functional sub-system – either debugging it, or altering its processing in response to changes in external circumstances – without affecting the functionality of the remainder. Human software engineers have hit upon the same problem, and the same solution.

Two of the most widely used languages nowadays are C++ and Java. Languages in this class are often described as ‘object-oriented’. Many programming languages now require a total processing system to treat some of its parts as ‘objects’ which can be queried and informed, but where the processing that takes place within those objects isn’t accessible elsewhere. This enables the code within the ‘objects’ to be altered without having to make alterations in code elsewhere, with all the attendant risks that this would bring; and it likewise allows new ‘objects’ to be added to the system without necessitating wholesale re-writings of code elsewhere. And the resulting architecture is regarded as well nigh inevitable (irrespective of the programming language used) once a certain threshold in the overall degree of complexity of the system gets passed.

Interestingly, since the need for modular organization increases with increasing complexity, we can predict that the human mind will be the most modular amongst animal minds. This is the reverse of the intuition shared by many philosophers and social scientists, who would be prepared to allow that animal minds might be organized along modular lines, while believing that with the appearance of the human mind most of that organization was somehow superseded and swept away.
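To make the encapsulation point in the quoted passage concrete, here is a minimal sketch in Python. The class names, the thermostat scenario, and the temperature readings are hypothetical, chosen only for illustration. The internals of one module are "mutated" while the caller, which relies only on the public method, is left untouched.

```python
class Thermostat:
    """A module with a small public interface and hidden internals."""

    def __init__(self, target_celsius):
        self._target = target_celsius  # leading underscore: internal by convention

    def should_heat(self, current_celsius):
        return current_celsius < self._target


class HysteresisThermostat:
    """Same public interface, altered internals (adds a dead band)."""

    def __init__(self, target_celsius, band=0.5):
        self._target = target_celsius
        self._band = band
        self._heating = False

    def should_heat(self, current_celsius):
        # Switch on below the band, off above it, otherwise keep the last state.
        if current_celsius < self._target - self._band:
            self._heating = True
        elif current_celsius > self._target + self._band:
            self._heating = False
        return self._heating


def run_furnace(thermostat, readings):
    """The caller depends only on should_heat(), never on the internals."""
    return [thermostat.should_heat(r) for r in readings]


readings = [18.0, 19.8, 20.2, 21.0]
simple = run_furnace(Thermostat(20.0), readings)
mutated = run_furnace(HysteresisThermostat(20.0), readings)
```

The behavior of the swapped-in module differs, but run_furnace needed no rewriting. That is the sense in which a change is "contained" behind the module wall.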

We extract the following argument from the above appeal to object-oriented programming (OOP):

(4) Software engineering suggests that OOP (modularization) is necessary to manage increasing complexity.
(5) Biological systems are very complex.

These premises bolster our Premise 2.

(2) Modular subsystems are a necessary ingredient for evolvability.

Massive Modularity- Argument From Design- OOP

I particularly enjoyed the originality of this argument. Even though software engineering is notoriously bad at quantifying its practices, its trajectory surely sheds some light on other disciplines. As a computer scientist, this argument made me speculate what other trends, current or future, could be brought to bear on such questions. The interchange between computer science and cognitive neuroscience is broad… with things like neuromorphic computing flowing in one direction, and information theory flowing in the other…

Is Mind Subject To Natural Selection

This phase of the argument is the most philosophical. The question is whether mental processes are subject to the forces of natural selection.

Carruthers begins with a fairly uncontroversial premise:

(6) The central nervous system is subject to natural selection.

So much, so obvious. But the crux of the issue is how to relate mind and brain. Carruthers wants to argue that:

(7) The central nervous system underwrites the mind.

However, this premise falls squarely into a philosophy of mind morass. Carruthers suggests that a way forward is to notice that most mainstream approaches (“anyone who is neither an epiphenomenalist nor an eliminativist about the mind”) support such a premise (see this post for some definitions).

If we find ourselves sympathetic to 7, we are led by the nose to Proposition 8:

(8) Mental processes are subject to natural selection.

Massive Modularity- Argument From Design- Mental Evolution

How Many Minds

While the weight of this argument labors to support the reality of computational modules, we must also spare some words to motivate massive modularity. Carruthers, leveraging Herbert Simon’s 1962 paper The Architecture of Complexity, points out that the question is one of degree. Let us try to imagine a modularity thesis that is non-massive:

Moderate Modularity

The x-axis captures the number of modules; the y-axis leverages David Marr’s concept of Tri-Level Analysis. The concave shape of the curve represents the claim that, while the number of neurological functions may be large, the number of computational processes (e.g., belief, desire, motivation) is small.

In contrast, the shape of the massive modularity thesis is convex:

Massive Modularity

While Carruthers elsewhere motivates massive modularity by way of task analysis and ethological surveys, he here defends this latter thesis by appealing to the empirically-robust observation that the brain appears to process its algorithms in parallel, and this would be impossible without a relatively plentiful number of processing units. So we have stumbled upon our last premise:

(9) In the mind, massive modularity is computationally superior to moderate modularity.

Putting It All Together

All that remains is to glue together the sub-conclusions of the above arguments. Specifically, take the following propositions:

(3) Natural selection selects for modularity at a variety of different levels.
(8) Mental processes are subject to natural selection.
(9) Within the mind, massive modularity is computationally superior to moderate modularity.

From these, it is clear we have successfully motivated our thesis:

(10) Natural selection selects for massive modularity in the mind.

The entire argument, then, is pictured below.

Massive Modularity- Argument From Design- Summary

Concluding Thoughts

While I happen to affirm Premise 8, I feel like Carruthers – and even more so myself – do a poor job of motivating it. This observation is particularly painful because it is arguably the central thesis of evolutionary psychology. Mental note-to-self: revisit that section of the argument.

All told, I find this argument fairly compelling, although I would like to get more clear on several of its distinctions.

Peirce: The Fixation Of Belief


Metadata

Article: The Fixation of Belief
Author: C.S. Peirce
Published: 11/1877
Citations: 1048 (note: as of 04/2014)
Link: Here (note: not a permalink)
Other Resources: Outline (found this via Google Search, but quality isn’t bad)

Summarization text is grayscale, review text (my take) is orange.

Preliminaries

Peirce kicks off this article with a historical survey, nicely showcasing the fact that the scientific enterprise is a quite recent phenomenon on this earth. This suggests that the construction of personal epistemologies is susceptible to cultural influences. As Peirce puts it, “We come to the full possession of our power of drawing inferences, the last of all our faculties; for it is not so much a natural gift as a long and difficult art.”

Peirce also gestures towards the following question: can the reliability of our faculties be evaluated on a domain-by-domain basis? Peirce invokes evolutionary theory in an important move:

Logicality in regard to practical matters is the most useful quality an animal can possess, and might, therefore result from the action of natural selection; but outside of these it is probably of more advantage to the animal to have his mind filled with pleasing and encouraging visions, independently of their truth; and thus, upon unpractical subjects, natural selection might occasion a fallacious tendency of thought.

While many of Peirce’s views on evolution show their age, this particular insight is remarkably prescient: modern philosophers are currently exploring precisely this vein. Once this research is cast to cognitive science, neuroscientists will find themselves in a position to speak quantitatively on the matter. If the above argument is borne out by data, this would be a real victory for the pragmatist camp.

Doubt vs. Belief

Doubt is a singularly important notion to Peirce: he conceives it as the primary motivator for critical thinking.

Doubt is an uneasy and dissatisfied state from which we struggle to free ourselves and pass into the state of belief; while the latter is a calm and satisfactory state which we do not wish to avoid, or to change to a belief in anything else. On the contrary, we cling tenaciously, not merely to believing, but to believing just what we do believe.

The self-preservation of belief is of particular interest to me. This phenomenon is explored in detail within social psychology and memetics.

The irritation of doubt causes a struggle to attain a state of belief. When doubt ceases, mental action on the subject comes to an end.

With this definition of doubt in place, Peirce goes on to rebut three erroneous conceptions of proof:

  1. The mere putting of a proposition into the interrogative form does not stimulate the mind to any struggle after belief. There must be a real and living doubt, and without this all discussion is idle.
  2. The premises of an argument need not be grounded in some firm metaphysical strata: they merely should be free from doubt.
  3. There is, then, no practical value in arguing a point after all the world is fully convinced of it.

This contextual backdrop resonates with anyone who has tried to persuade someone not subject to real and living doubt. There are times when words move the human heart, and times when they are “just words”. However, I am largely disappointed in this dichotomy as it stands. Questions concerning what underlies, motivates, or justifies doubt are unattended. Peirce may not have been in an empirical position to cognitively explain doubt, but surely he could have afforded to provide a more detailed sketch.

Peirce goes on to detail four methods for the fixation of belief. I will summarize each in turn.

Belief Fixation Method #1: Method of Tenacity

Peirce uses examples to shed light on this way of being:

I remember once being entreated not to read a certain newspaper lest it might change my opinion upon free-trade. “Lest I might be entrapped by its fallacies and misstatements,” was the form of expression. “You are not,” my friend said, “a special student of political economy. You might, therefore, easily be deceived by fallacious arguments upon the subject. You might, then, if you read this paper, be led to believe in protection. But you admit that free-trade is the true doctrine; and you do not wish to believe what is not true.” A similar consideration seems to have weight with many persons in religious topics, for we frequently hear it said, “Oh, I could not believe so-and-so, because I should be wretched if I did.” A man may go through life, systematically keeping out of view all that might cause a change in his opinions…

How, then, are we to evaluate such a method?

It would be an egotistical impertinence to object that his procedure is irrational, for that only amounts to saying that his method of settling belief is not ours. But this method of fixing beliefs will be unable to hold its ground in practice. The man who adopts it will find that other men think differently from him, and it will be apt to occur to him, in some saner moment, that their opinions are quite as good as his own, and this will shake his confidence in his belief. This conception, that another man’s thought or sentiment may be equivalent to one’s own, is a distinctly new step, and a highly important one.

Yes! In my language, I call this Symmetry Debiasing. I intend to write more on this; it has played a role in my own worldview maturation.

Now, this first method is typically localized to the individual. The second method solves the problem of socializing belief acquisition.

Belief Fixation Method #2: Method of Authority

Here, belief is a group activity. Doxastic content is something one inherits, and its contents are to be trusted.

Uniformity of opinion will be secured by a moral terrorism to which the respectability of society will give its thorough approval. Following the method of authority is the path of peace. Certain non-conformities are permitted; certain others (deemed unsafe) are forbidden. These are different in different countries and in different ages; but, whoever you are, let it be known that you seriously hold a tabooed belief, and you may be perfectly sure of being treated with a cruelty less brutal but more refined than hunting you like a wolf.

Evidence for this sort of thing is ubiquitous, with political infighting serving as a nice example.

Thus, the greatest intellectual benefactors of mankind have never dared, and dare not now, to utter the whole of their thought; and thus a shade of prima facie doubt is cast upon every proposition which is considered essential to the security of society.

I like to play a game when I read pre-modern philosophers discussing religion: count the number of sentences wasted in defensive posturing (“always remember that when I say X I do not mean Y”). Keeping your eye tuned to this kind of historical artifact, which I call the Placating Price, is a good heuristic for approximating the degree of fear behind the artful prose of the academic. And it has been considerable. Reviewing the posthumous publication of Hume’s Dialogues Concerning Natural Religion underscores this point nicely.

For the mass of mankind, then, there is perhaps no better method than this. If it is their highest impulse to be intellectual slaves, then slaves they ought to remain.

Fighting words.

Belief Fixation Method #3: A Priori Method

Systems of this sort have been chiefly adopted because their fundamental propositions seemed “agreeable to reason”. This is an apt expression; it does not mean that which agrees with experience, but that which we find ourselves inclined to believe. Plato, for example, finds it agreeable to reason that the distances of the celestial spheres from one another should be proportional to the different lengths of strings which produce harmonious chords. Many philosophers have been led to their main conclusions by considerations like this…

This method is far more intellectual and respectable from the point of view of reason than either of the others… but its failure has been the most manifest. It makes of inquiry something similar to the development of taste; but taste, unfortunately, is always more or less a matter of fashion.

Philosophy has acquired a poor reputation in many intellectual circles for precisely this reason. Certain strains of metaphysics constitute, arguably, Diseased Disciplines. Peirce also manages to anticipate modern arguments towards the refactoring of analytic philosophy.

Belief Fixation Method #4: Scientific Method

Peirce goes on to sketch a method familiar to our modern ears: the scientific method.

[Figure: the scientific method]

He also makes the interesting move of tying the method to scientific realism (the belief that things like atoms really exist and are really embedded in spacetime). His defense of scientific realism is as follows:

It may be asked how I know that there are any Reals. The reply is this:

  1. If investigation cannot be regarded as proving that there are Real things, it at least does not lead to a contrary conclusion; but the method and the conception on which it is based remain ever in harmony. No doubts of the method, therefore, necessarily arise from its practice, as is the case with all the others.
  2. The feeling which gives rise to any method of fixing belief is a dissatisfaction at two repugnant propositions. But here already is a vague concession that there is some one thing which a proposition should represent. Nobody, therefore, can really doubt that there are Reals, for, if he did, doubt would not be a source of dissatisfaction. The hypothesis, therefore, is one which every mind admits. So that the social impulse does not cause men to doubt it.
  3. Everyone uses the scientific method about a great many things, and only ceases to use it when he does not know how to apply it.
  4. Experience of the method has not led us to doubt it, but on the contrary, scientific investigation has had the most wonderful triumphs in the way of settling opinion. These afford the explanation of my not doubting the method or the hypothesis which it supposes; and not having any doubt, nor believing that anybody else whom I could influence has, it would be the merest babble for me to say more about it.

If there be anybody with a living doubt upon the subject, let him consider it.

Peirce and I even share a similar sense of humor. 🙂 I love this parody of Mark 4:23!

Concluding Thoughts

After reading this essay, I do not see myself walking around and categorizing people with Method 1, 2, 3, or 4 (nor even some linear superposition of all four). I am simply not persuaded that these epistemological preferences represent natural kinds.

One might imagine combining Methods 3 and 4, and then casting the three resultant categories to personal dispositions: one based on fear/simplicity/opportunism, another on social belonging, a third on the need for cognition. With this tripartite division of epistemology based on disposition, one could then layer on cultural distinctions, such as intuitive vs quantitative philosophizing. But even this, more sophisticated, account doesn’t feel precise enough for my liking.

Why spend time on this essay if I don’t agree with its central thesis? For one, it brings key questions to the fore:

  • How does doubt affect belief construction?
  • How do individuals go about constructing personal epistemologies?

But, more importantly, the journey to our destination was interesting.

I’ll close with Peirce explaining his preference for the scientific method.

Yes, the methods [besides the scientific method] do have their merits: a clear logical conscience does cost something – just as any virtue, just as all that we cherish, costs us dear. But we should not desire it to be otherwise. The genius of a man’s logical method should be loved and reverenced as his bride, whom he has chosen from all the world. He need not condemn the others; on the contrary, he may honor them deeply, and in so doing only honors her the more. But she is the one that he has chosen, and he knows that he was right in making that choice. And having made it, he will work and fight for her, and will not complain that there are blows to take, and will strive to be a worthy knight and champion of her from the blaze of whose splendors he draws his inspiration and his courage.

A man on fire…


[Sequence] C.S. Peirce & Pragmatism

Charles Sanders Peirce (1839-1914) has been called “the father of pragmatism”, “America’s greatest logician”, and “the most original thinker of his time”. He founded the field of semiotics (the study of signs, which I touch on here), invented abduction (inference to the best explanation), and anticipated the work of geniuses like Georg Cantor (mathematics of infinity), Claude Shannon (information theory), and Ernst Zermelo (set theory) by decades.

Peirce met with a fate not unusual for thinkers of caliber: much of his work only came to be fully appreciated posthumously. His writings were never consolidated in book form, and remained largely disorganized until collated into various anthologies.

An autobiographical snippet from a paper entitled Concerning The Author:

My book will have no instruction to impart to anybody. Like a mathematical treatise, it will suggest certain ideas and certain reasons for holding them true; but then, if you accept them, it must be because you like my reasons, and the responsibility lies with you. Man is essentially a social animal: but to be social is one thing, to be gregarious is another: I decline to serve as shepherd. My book is meant for people who want to find out; people who want philosophy ladled out to them can go elsewhere. There are philosophy soup shops at every corner, thank God!

The development of my ideas has been the industry of thirty years. I did not know as I ever should get to publish them, their ripening seemed so slow. But the harvest time has come, at last, and to me that harvest seems a wild one, but of course it is not I who have to pass judgment. It is not quite you, either, individual reader; it is experience and history.

For years in the course of this ripening process, I used to collect my ideas under the designation fallibilism; and indeed the first step toward finding out is to acknowledge you do not satisfactorily know already; so that no blight can so surely arrest all intellectual growth as the blight of cocksureness; and ninety-nine out of every hundred good heads are reduced to impotence by that malady – of whose inroads they are most strangely unaware!

Indeed, out of a contrite fallibilism, combined with a high faith in the reality of knowledge, and an intense desire to find things out, all my philosophy has always seemed to me to grow ….

In many ways, Peirce and I march to the beat of the same drum…

Reviewed essays:

Machery: Précis of Doing without Concepts

Content Summary: 2600 words, 26 minute read.

Introduction

It is no secret that the academic study of concepts is in disarray. In this article, Machery attempts to weave these disparate traditions into a compelling whole. But first, a quote which serves to motivate what follows:

Why do cognitive scientists want a theory of concepts? Theories of concepts are meant to explain the properties of our cognitive competences. People categorize the way they do, they draw the inductions they do, and so on, because of the properties of the concepts they have. Thus, providing a good theory of concepts could go a long way towards explaining some important higher cognitive competences.

Summarization text is grayscale, my commentary is in orange.

Article Metadata

  • Article: Précis of Doing without Concepts
  • Author: Edouard Machery
  • Published: 11/2009
  • Citations: 178 (note: as of 04/2014)
  • Link: Here (note: not a permalink)

Section 1. Regimenting the use of concept in cognitive science

We start with definitions!

The world is not an undifferentiated sea of chaos. It has statistically noticeable patterns – “joints”. Let us call each of these delightful patterns in nature a category (or a natural kind). But categories are things in the world, and your mind must somehow learn these categories for itself. Plato once described the act of reasoning as: “That of dividing things again by classes, where the natural joints are, and not trying to break any part, after the manner of a bad carver.” (Phaedrus, 265e). This image of carving nature at its joints describes what conceptual processes do. Concepts represent categories in your brain.

Let’s get specific about the properties of concepts. Machery defines a concept as something that:

  1. Can be about a class, event, substance, or individual.
  2. Is nonproprietary: it is not constrained by the underlying type of represented information.
  3. Has constitutive elements that can vary over time and across individuals.
  4. Need not contain every element of information about X; let us call the data left out background knowledge.
  5. Is used by Default (I will define this in Section 3).

Section 2. Individuating concepts

Is it possible for an individual to possess different concepts of the same category?
Can Kevin possess two concepts of the category of chair?
Yes.
How do we individuate two related pieces of information, that would otherwise fall under the same concept?

I propose [that] when two elements of information about x, A and B, fulfill either of these [individuation] criteria, they belong to distinct concepts:

  • Connection Criterion: If retrieving A (e.g., water is typically transparent) from LTM and using it in a cognitive process (e.g., a categorization process) does not facilitate the retrieval of B (e.g., water is made of molecules of H2O) from LTM and its use in some cognitive process, then A and B belong to two distinct concepts (WATER1 and WATER2).
  • Coordination Criterion: If A and B yield conflicting judgments (e.g., the judgment that some liquid is water and the judgment that this very liquid is not water) and if I do not view either judgment as defeasible in light of the other judgment (i.e., if I hold both judgments to be equally authoritative), then A and B belong to two distinct concepts (WATER1 and WATER2).
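The logical shape of the two criteria is easy to lose in prose: either one alone is sufficient, and the Coordination Criterion is itself a conjunction. A minimal sketch follows. The function name and the boolean parameters are my own hypothetical encoding, and deciding the inputs is of course the hard empirical part.

```python
def belong_to_distinct_concepts(a_facilitates_b, judgments_conflict, either_defeasible):
    """Apply the two individuation criteria to a pair of information elements A and B.

    a_facilitates_b:    retrieving and using A facilitates retrieving and using B
    judgments_conflict: A and B yield conflicting judgments
    either_defeasible:  one judgment is held to be defeasible in light of the other
    """
    connection_criterion = not a_facilitates_b
    coordination_criterion = judgments_conflict and not either_defeasible
    # Fulfilling either criterion is enough to individuate two concepts.
    return connection_criterion or coordination_criterion


# "Water is transparent" does not prime "water is H2O": distinct concepts.
unconnected = belong_to_distinct_concepts(False, False, False)

# Connected, but conflicting and equally authoritative: still distinct.
uncoordinated = belong_to_distinct_concepts(True, True, False)

# Connected, conflicting, but one judgment yields to the other: same concept.
same_concept = belong_to_distinct_concepts(True, True, True)
```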

Section 3. Defending the proposed notion of concept

Time to explore our last property of concepts, “used by Default”. Default is a name for “the assumption that some bodies of knowledge are retrieved by default when one is categorizing, reasoning, drawing analogies, and making inductions”.  Say you are given a word problem involving counting apples and oranges. Default is the claim that a flood of concepts – including but not limited to arithmetic, the apple, the orange, trees, and fruit – will be drawn from long term memory (LTM) stores, and made available to your mental processes automatically.

At least two research traditions go against this claim:

  1. Concepts are not retrieved from LTM automatically, they are rather summoned via conscious attention.
  2. Concepts are drawn from LTM automatically, but they are constructed on-the-fly. When you see an apple, you do not load a concept of apple that was hashed out long ago; rather, your mind queries your LTM for apple-related background knowledge, constructing transient concepts especially tailored for the peculiarities of the task at hand.

Machery makes three counterpoints:

  1. Only a pronounced amount of recall variability (e.g., highly divergent results for tweaking minor parameters of a word problem) would falsify Default in favor of on-the-fly concept construction.
  2. Empirical investigations only reveal moderate levels of recall variability.
  3. A substantial amount of evidence supports Default.

Section 4. Developing a psychological theory of concepts

A psychological theory of concepts must treat the following concerns:

  • The nature of the information constitutive of concepts
  • The nature of the processes that use concepts
  • The nature of the vehicle of concepts
  • The brain areas that are involved in possessing concepts
  • The processes of concept acquisition

Section 5. Concept in cognitive science and in philosophy

The gist of the section:

Although both philosophers and cognitive scientists use the term concept, they are not talking about the same things. Cognitive scientists are talking about a certain kind of bodies of knowledge, they attempt to explain the properties of our categorization, inductions etc; whereas philosophers are talking about that which allows people to have propositional attitudes. Many controversies between philosophers and psychologists about the nature of concepts are thus vacuous.

An amusing aside: I would like to explicitly ground the definition of vacuous in some theory of concepts when I come to treat pragmatism.

Anyways, my tentative attempt to restate the above: Philosophers concern themselves with category-concept fidelity, whereas cognitive scientists concern themselves with the lifecycle of the concept within the mental ecosystem.

Section 6. The heterogeneity hypothesis versus the received view

Machery defines the received view as the assumption that, beyond differences within concept subject-matter, concepts share many properties that are scientifically interesting. Machery suggests that this is a mistake, and that the evidence suggests the existence of several distinct types of concept. Concept, in other words, is itself not a category (natural kind). A nuanced sentence if you’ve ever heard one. 🙂

The Heterogeneity Hypothesis, in contrast, claims that processes that produce concepts are distinct, that they share little in common.

Section 7. What kind of evidence could support the heterogeneity hypothesis?

Three kinds of evidence are predicted:

  1. When the conceptualization processes fire individually, we expect each to receive strong confirmation in just those experiments.
  2. When the conceptualization processes fire together, outputs may be incongruent, requiring mediation; we thus expect processing delays.
  3. Although the epistemology of dissociations is intricate, we should expect confirmation from neuropsychological data analysis.

Section 8. The fundamental kinds of concepts

Three different kinds of concepts exist in your cognitive architecture:

  1. Prototypes are bodies of statistical knowledge about a category, a substance, a type of event, and so on. For example, a prototype of dogs could store some statistical knowledge about the properties that are typical of dogs and/or the properties that are diagnostic of the class of dogs… Prototypes are typically assumed to be used in cognitive processes that compute similarity linearly.
  2. Exemplars are bodies of knowledge about individual members of a category (e.g., Fido, Rover), particular samples of a substance, and particular instances of a kind of event (e.g., my last visit to the dentist). Exemplars are typically assumed to be used in cognitive processes that compute the similarity nonlinearly.
  3. Theories are bodies of causal, functional, generic, and nomological knowledge about categories, substances, types of events, etc. A theory of dogs would consist of some such knowledge about dogs. Theories are typically assumed to be used in cognitive processes that engage in causal reasoning.
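The linear versus nonlinear contrast between prototypes and exemplars can be sketched in a few lines. Everything here is an illustrative assumption on my part: the binary features, the weights, and the exponential-decay rule (loosely in the spirit of exemplar models such as Nosofsky’s Generalized Context Model) are not taken from Machery’s text.

```python
import math


def prototype_similarity(item, prototype, weights):
    """Linear: a weighted sum over the features the item shares with the prototype."""
    return sum(w for f, p, w in zip(item, prototype, weights) if f == p)


def exemplar_similarity(item, exemplars, sensitivity=2.0):
    """Nonlinear: summed similarity to each stored exemplar, where similarity
    decays exponentially with the number of mismatching features."""
    total = 0.0
    for exemplar in exemplars:
        mismatches = sum(1 for f, e in zip(item, exemplar) if f != e)
        total += math.exp(-sensitivity * mismatches)
    return total


# Hypothetical binary features: (barks, has_fur, four_legs, fetches)
dog_prototype = (1, 1, 1, 1)
dog_weights = (0.4, 0.2, 0.2, 0.2)
dog_exemplars = [(1, 1, 1, 1), (1, 1, 1, 0)]  # say, Fido and Rover

novel_animal = (1, 1, 1, 0)
proto_score = prototype_similarity(novel_animal, dog_prototype, dog_weights)
exemplar_score = exemplar_similarity(novel_animal, dog_exemplars)
```

In the prototype score each additional mismatch subtracts a fixed weight. In the exemplar score each additional mismatch multiplies an exemplar’s contribution by a constant factor, so a single near-identical exemplar can dominate the total.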

Some phenomena are well explained if the concepts elicited by some experimental tasks are prototypes; some phenomena are well explained if the concepts elicited by other experimental tasks are exemplars; and yet other phenomena are well explained if the concepts elicited by yet other experimental tasks are theories. As already noted, if one assumes that experimental conditions prime the reliance on one type of concept (e.g., prototypes) instead of other types (e.g., exemplars and theories), this provides evidence for the heterogeneity hypothesis.

Let’s illustrate this situation with the work on categorical induction – the capacity to conclude that the members of a category possess a property from the fact that the members of another category possess it and to evaluate the probability of this generalization… the fact that different properties of our inductive competence are best explained by theories positing different theoretical entities constitutes evidence for the existence of distinct kinds of concepts used in distinct processes. Strikingly, this conclusion is consistent with the emerging consensus among psychologists that people rely on several distinct induction processes.

These arguments seem quite powerful at first glance. Even after reviewing peer-reviewed criticisms, their strength does not feel much diminished. Pending my own research into the forest of citations embedded within this section, I will proceed with my theorizing as though the Heterogeneity Hypothesis is true.

Section 9. Neo-empiricism

In contrast, neo-empiricism can be summarized with the following two theses:

  1. The knowledge that is stored in a concept is encoded in several perceptual and motor representational formats.
  2. Conceptual processing involves essentially re-enacting some perceptual and motor states and manipulating those states.

Amidst broader empirical concerns, Machery outlines three problems for the neo-empiricist school:

  1. Anderson’s problem: many competing versions of amodal concept theories exist, and neo-empiricists tend to assert victory over weaker versions of amodal theorizing.
  2. Imagery problem: it is hard to affirm that imagery is the only type of processes people have; people seem to have amodal concepts that are used in non-perceptual processes.
  3. Generality problem: some concepts (magnitude of classes, tonal sequences) have been empirically shown to be amodal, but neo-empiricists are bound to assume that all concepts are perceptual.

However, despite these concerns, Machery is happy to concede that there may “be something to” neo-empiricist arguments. In that case, a fourth, perceptual process would be added to the hypothesis. But the author suggests that, at this time, there is simply not enough evidence to justify this fourth concept-engine.

Machery seems not to appreciate an obvious implication here. Recall that all concepts are “conceived” and “reared” under perceptual supervision. What is there to prevent a daisy-chaining effect, whereby concepts are recalled which drag with them perceptual reconstructions, which permit new conceptual manipulations, and so on? This information pathway could explain phenomena such as Serial Associative Cognition, a Stanovichian term. One weakness of Machery’s account is that it does not draw enough constraints from the broader decision-making literature; Serial Associative Cognition must be explained in the language of concepts just as much as Similarity Judgments.

Speaking generally, the manner in which percepts influence concept modification is severely under-explored. The exact same percept of a dog could be the first draft of an exemplar-concept (e.g., for an infant), could subliminally modify a prototype-concept (e.g., for an adult), or could explicitly falsify a theory-concept (e.g., for a veterinarian). In the final analysis, it strikes me as unlikely that a perceptual concept-constructor module would simply be a cousin to the other three. I would expect neo-empiricist arguments to ultimately be housed in some larger framework, with a more complete description of perceptual processing.

Section 10. Hybrid theories of concepts.

Hybrid theories of concepts grant the existence of several types of bodies of knowledge, but deny that these form distinct concepts; rather, these bodies of knowledge are the parts of concepts. Some hybrid theories have proposed that one part of a concept of x might store some statistical information about the x’s, while another part stores some information about specific members of the class of x’s, and a third part some causal, nomological, or functional information about the x’s…. [but] evidence tentatively suggests that prototypes, set of exemplars, and theories are not coordinated [in this way].

Section 11. Multi-process theories

While Machery is quick to concede that the evidence for many cognitive processes is incontrovertible, he objects that dual-process theories traditionally fail to answer the following two questions:

  1. In what conditions are the cognitive processes underlying a given [module] triggered?
  2. If the cognitive processes are [simultaneously] triggered, how does the mind [coordinate] their outputs?

A legitimate criticism of dual-process theories.

What is known [regarding concepts and dual-process theories] can be presented briefly. It appears that the categorization processes can be triggered simultaneously, but that some circumstances prime reliance on one of the categorization processes. Reasoning out loud seems to prime people to rely on a theory-based process of categorization. Categorizing objects into a class with which one has little acquaintance seems to prime people to rely on exemplars. The same is true of these classes whose members appear to share few properties in common. Very little is known about the induction processes except for the fact that expertise seems to prime people to rely on theoretical knowledge about the classes involved.

This is irrelevant to dual-process theory… dual-process theory is concerned with how some mental processes become conscious, decontextualized, slow, and effortful, etc. The above quote is instead an unrelated (albeit interesting) glimpse at how the different conceptualization modules may interact.

Section 12. Open questions

Machery identifies three directions for future inquiry:

  1. There are several prototype theories, several exemplar theories, and several theory theories. It remains unclear which theory [of each type] is correct. Too little attention has been given to investigating the nature of prototypes, exemplars, and theories.
  2. The factors that determine whether an element of knowledge about x is part of the concept of x rather than being part of the background knowledge about x.
  3. How conceptualization may cohere with dual-process theories.

Dual-process theory is actually more expansive than Machery allows. The concept of Default, defined in section 3, is a System 1 behavior. Thus, the questions of Default vs. Manual Override, Concept vs. Background Knowledge… these swiftly become absorbed into the need for dual-process theorizing…

Section 13. Concept eliminativism

Finally, Machery advances tentative philosophical and sociological reasons why one might banish the term concept from our professional vocabulary.

Theoretical terms are often rejected when it is found that they fail to pick out natural kinds. To illustrate, some philosophers have proposed to eliminate the term emotion from the theoretical vocabulary of psychology on these grounds. The proposal here is that concept should be eliminated from the vocabulary of cognitive science for the same reason.

The continued use of concept in cognitive science might invite cognitive scientists to look for commonalities… if the heterogeneity hypothesis is correct, these efforts would be wasted. By contrast, replacing concept with prototype, exemplar, and theory would bring to the fore urgent open questions.

Interesting suggestions. However, I think it is clear that more theoretical weight lies in Machery’s heterogeneity hypothesis.

Concluding Thoughts

Three different kinds of concepts must imply three different kinds of conceptualization modules.

Novel prediction: damage to any one of these modules must inhibit only one kind of conceptualization.

Much, much more work is needed…

One counterargument made in the responses to this Précis caught my eye. David Danks of CMU argues that all three conceptualization modules can be modeled as special cases of a single graphical model representation. His paper, Theory Unification and Graphical Models in Human Categorization (2007), makes this case. Machery’s reply to this counterpoint is brief, pointing to its disconnect from biological evidence, although Machery elsewhere allows that causal models might underlie concept-theory construction (c.f., A Theory of Causal Learning in Children: Causal Maps and Bayes Nets (2004)).

I will close with a quote from Couchman et al., in a response to this Précis:

Our task is to carve nature at its joints using the psychological knife called concepts. It is true, it is profoundly important to know, and it is all right for the progress of science that the knife is Swiss-Army issue with multiple blades.

[Sequence] Evans-Pritchard: Witchcraft, Oracles & Magic Among The Azande

I read this classic text several years ago, and it left a lasting effect on me.

The Zande people are primarily a small-scale farming population located in central Africa. Their population is split among the Democratic Republic of the Congo, South Sudan, and the Central African Republic:

[Map: location of Zandeland]

Evans-Pritchard briefly sketches Azande life in general, before zooming in on their complex religious system. At the time of writing, 1937, these traditions had already begun to erode in the wake of European cultural imperialism. Racing against the clock, as it were, Evans-Pritchard managed to document the essence of these practices before they faded from the memories of the community.

Evans-Pritchard is a consummate professional, and this shows in his ethnographies. Azande culture and mysticism are explored in detail, and their customs, foreign to our ears, are treated largely without distracting judgment. The Azande seek spiritual answers from three kinds of oracles, each with increasing power: rubbing board, termite, and poison oracles. This practice was enmeshed in their legal system, their social structure, and their metaphysical beliefs. Azande culture further complemented these oracles by means of complex, interlocking theories of magic, and the social and medicinal contributions of a witch-doctor population:

Link: Summary

For me, the most interesting part of the book had to do with the relationship between mysticism and attention. Most of the following quotes relate to this.

Link: Quotes

To understand why it is that Azande do not draw from their observations the conclusions we would draw from the same evidence, we must realize that their attention is fixed on the mystical properties of the poison oracle and that its natural properties are of so little interest to them that they simply do not bother to consider them.

Observations such as the above suggest that disinterest in certain question-categories is not some random phenomenon that can be taken at face value. Azande individuals systematically experience disinterest in doubt-provoking challenges to their mystical ideology, and this “attention funnel” is anything but premeditated. Thus, attentional habits are not solely artifacts of personality: they can also be subpersonal, they are influenced by culture, and they do not necessarily serve the interests of their owners.

Finally, it would seem myopic to suppose that this quirk of human psychology is confined to this one culture. Perhaps this is enough to drive home my takeaway: treat disinterest with suspicion.