An Introduction To Category Theory [Part One]

What’s So Important About Graphs?

Of all the conceptual devices in my toolkit, hyperpoints and graphs are employed the most frequently. I have explained hyperpoints previously; let me today explain What’s So Important About Graphs?

Human beings tend to divide the world into states versus actions. This metaphor is so deeply ingrained in our psyche that it tends to be taken for granted. Graphs are important because they visualize this state-action dichotomy: states are dots, actions are lines.

Put crudely, graphs are little more than connect-the-dots drawings. Now, dear reader, perhaps you do not yet regularly contemplate connect-the-dots in your daily life. Let me paint some examples to whet your appetite.

  1. Maps are graphs. Locations are nodes, paths are the edges that connect them.
  2. Academic citations are a graph. Papers are nodes, citations are the edges that connect them.
  3. Facebook is a graph. People are nodes, friendships are the edges that connect them.
  4. Concept-space is a graph. Propositions are nodes, inferences are the edges that connect them.
  5. Causality is a graph. Effects are nodes, causes are the edges that connect them.
  6. Brains are graphs. Neurons are nodes, axons are the edges that connect them.

The above examples are rather diverse, yet we can reason about them within a single framework. This is what it means to consolidate knowledge.
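Each of these examples reduces to the same data structure. Here is a minimal sketch in Python (the node names are hypothetical, chosen only for illustration):

```python
# A directed graph as an adjacency list: states are nodes, actions are edges.
graph = {
    "home": ["office", "gym"],  # edges (actions) leaving the "home" node
    "office": ["home"],
    "gym": [],
}

def neighbors(g, node):
    """Return the nodes reachable in one step (one action) from `node`."""
    return g.get(node, [])

print(neighbors(graph, "home"))  # ['office', 'gym']
```

Swap in papers and citations, or people and friendships, and the code is unchanged; that sameness is the consolidation at work.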

Graph Theory: Applications & Limitations

It turns out that a considerable amount of literature lurks beneath each of our examples. Let’s review the highlights.

Once upon a time, Karl Popper argued that a scientific theory is only as strong as it is vulnerable to falsification. General relativity made the crazy prediction of gravitational lensing, which was only later confirmed experimentally. One reason to call astrology “pseudoscience” is its reluctance to produce such vulnerable predictions.

But simple falsification doesn’t fully describe science: healthy theories can survive certain kinds of refutations. How? W. V. O. Quine appealed to the fact that beliefs cannot be evaluated in isolation; we must instead view scientific theories as a “web of belief”. And this web can be interpreted graphically! Armed with this interpretation, one can successfully evaluate philosophical arguments involving Quine’s doctrine (called confirmation holism) based on technical constraints on graphical algorithms.

The modern incarnation of confirmation holism occurs when you replace beliefs with degrees-of-belief. These probabilistic graphical models are powerful enough to formally describe belief propagation.
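To make “belief propagation” concrete, here is the simplest possible case in Python: pushing a degree-of-belief along a single edge of a two-node model. The probabilities are hypothetical, chosen only for illustration:

```python
# Belief propagation on the simplest graphical model: a two-node chain A -> B.
p_a = {"rain": 0.3, "dry": 0.7}     # prior degrees-of-belief over A
p_b_given_a = {                      # the edge: a conditional probability table
    "rain": {"wet": 0.9, "not_wet": 0.1},
    "dry":  {"wet": 0.2, "not_wet": 0.8},
}

# Propagate belief along the edge: P(b) = sum over a of P(a) * P(b|a)
p_b = {}
for a, pa in p_a.items():
    for b, pb in p_b_given_a[a].items():
        p_b[b] = p_b.get(b, 0.0) + pa * pb

print(p_b)  # ≈ {'wet': 0.41, 'not_wet': 0.59}
```

Real belief-propagation algorithms do exactly this, repeatedly, as messages passed across every edge of a larger graph.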

But even probabilistic graphical models don’t fully describe cognition: humans possess several memory systems. Our procedural memory (“muscle memory”) is a graph of belief, and our episodic memory (“story memory”) is a separate graph of belief.

How to merge two different graphs? Graph theory cannot compose graphs horizontally.

Switching gears to two other examples:

  • The formal name for brains-are-graphs is neural networks. They are the lifeblood of computational neuroscientists.
  • The formal name for Facebook-is-a-graph is social networks. They are vital to the research of sociologists.

How might a neuroscientist talk to a sociologist? One’s network represents the mental life of a person; the other, the aggregate lives of many people. What we want to say is that every node in a social graph contains an entire neural graph.

How to nest graphs-within-graphs?  Graph theory cannot compose graphs vertically.

The Categorical Landscape

What are categories? For us, categories generalize graphs.

A directed graph can be expressed G = (V, E); that is, it contains a set of nodes, and a set of edges between those nodes.

A small category can be expressed C = (O, M); that is, it contains a set of objects, and a set of morphisms between those objects.

Categories endow graphs with additional structure.

  1. Category Theory requires the notion of self-loops: actions that result in no change.
  2. In Graph Theory, there is a notion of paths from one node to another. Category Theory promotes every path into its very own morphism.
  3. Category Theory formalizes the notion of context: objects and morphisms live within a common environment; this context is called a “category”.

As an aside, simple directed graphs permit at most one edge between any pair of nodes, and no self-loops. We could tighten the analogy yet further by comparing categories to quivers (directed graphs that permit parallel edges and self-loops).

But enough technicalities. Time to meet your first category! In honor of category theory’s mathematical background, allow me to introduce Set. In Set, objects are sets, and morphisms are functions. Here is one small corner of the category:

[Image: a corner of Set, with objects A, B, C and morphisms f : A → B and g : B → C]

Self-loops 1A, 1B, and 1C change nothing; they are a special kind of self-loop called identity morphisms. Since such things exist on every object, we will henceforth consider their existence implicit.

Recall our requirement that every path has a morphism. The above example contains three paths:

  • π1 = A → B
  • π2 = B → C
  • π3 = A → B → C

The first two paths are claimed by f and g, respectively, but the third is unaffiliated. For this to qualify as a category, we must construct for it a new morphism. How? We create h : A → C via function composition:

h(x) = g(f(x)) = f(x) - 1 = 2x - 1.
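In code, the composite morphism is just function composition. A sketch in Python, assuming f(x) = 2x and g(y) = y − 1 (consistent with h(x) = 2x − 1):

```python
def f(x):          # f : A -> B
    return 2 * x

def g(y):          # g : B -> C
    return y - 1

def compose(g, f):
    """Build the morphism for the path A -> B -> C: first f, then g."""
    return lambda x: g(f(x))

h = compose(g, f)  # h : A -> C
print(h(3))        # g(f(3)) = g(6) = 5
```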

We also require morphism composition to be associative. Another example should make this clear:

[Image: a corner of Set, with objects A, B, C, D and morphisms f : A → B, g : B → C, h : C → D, plus composites i, j, k]

Besides g(f(x)), we can also write function composition as f●g (“f then g”). In the above, we have:

  • i = f●g
  • j = g●h
  • k = f●g●h

With this notation, we express our requirement for morphism associativity as

  • k = (f●g)●h = f●(g●h).
  • That is, k = i●h = f●j.
  • That is, “you can move from A to D by travelling either along i-then-h, or along f-then-j”.
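The bullet points above can be checked directly in code. A sketch with hypothetical functions f, g, h (not the ones from the Set diagram):

```python
def then(p, q):
    """Diagrammatic composition: 'p then q'."""
    return lambda x: q(p(x))

f = lambda x: x + 1
g = lambda x: x * 2
h = lambda x: x - 3

i = then(f, g)            # i = f then g
j = then(g, h)            # j = g then h
k1 = then(then(f, g), h)  # (f then g) then h  =  i then h
k2 = then(f, then(g, h))  # f then (g then h)  =  f then j

# Associativity: both groupings give the same A -> D morphism.
assert all(k1(x) == k2(x) for x in range(-5, 6))
print(k1(4), k2(4))  # both: ((4 + 1) * 2) - 3 = 7
```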

Harden The Definition

Let me be precise. Categories contain two types of data:

  1. A set of objects O.
  2. A set of morphisms M, where each morphism m ∈ M has a source object and a target object: m : A → B

The set of morphisms is required to contain the following two sets:

  1. A set of identity morphisms 1 such that 1A : A → A
  2. A set of composition morphisms such that for every f : A → B and g : B → C, there is a composite morphism h = f●g : A → C.

The set of morphisms is required to respect the following two rules:

  1. Identity: composing a morphism with an identity does nothing.  f●1 = f = 1●f
  2. Associativity: the order in which you compose functions doesn’t matter. (f●g)●h = f●(g●h)
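Both laws can be verified mechanically on finite samples. A minimal sketch in Python (the functions and the domain are hypothetical), comparing morphisms pointwise:

```python
# Checking the two category laws in Set, where morphisms are functions.
DOMAIN = range(10)  # a small sample on which to compare functions

def then(p, q):
    """Diagrammatic composition: 'p then q'."""
    return lambda x: q(p(x))

def equal(p, q):
    """Pointwise equality of two morphisms over our sample domain."""
    return all(p(x) == q(x) for x in DOMAIN)

identity = lambda x: x
f = lambda x: x + 1
g = lambda x: 3 * x
h = lambda x: x - 2

# Law 1 (identity): composing with an identity does nothing.
assert equal(then(f, identity), f) and equal(then(identity, f), f)

# Law 2 (associativity): the grouping of composition doesn't matter.
assert equal(then(then(f, g), h), then(f, then(g, h)))
print("both category laws hold on this sample")
```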

Connecting To Group Theory

Recall that groups define axioms governing set operations. More generally, a group-like structure might accept or reject any of the following axioms:

  1. Closure
  2. Associativity
  3. Identity Element
  4. Inverse Element
  5. Commutativity

We can now better appreciate our taxonomy from An Introduction To Group Theory:

[Image: taxonomy of group-like structures (Abelian groups and other group types)]

Take a second to locate the Category entry. Do you see how its two axioms align with our previous definition of a category?

Notice that, from the perspective of group theory, a category is a kind of degenerate monoid. Why have we removed the requirement for closure?

We can answer this question by considering the category SomeMonoid. It turns out that this category has only one object (with each of the monoid’s elements as self-loops on that object). This is the categorical consequence of closure. We do not require closure in category theory because we want license to consider categories with a multitude of objects.
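This one-object picture is easy to make concrete. A sketch in Python using the monoid of integers under addition (the encoding is my own, purely illustrative):

```python
# A monoid as a one-object category: every element is a self-loop on the
# single object, and closure guarantees composition never leaves that object.
OBJECT = "*"  # the category's single object

def loop(n):
    """The monoid element n, viewed as a morphism * -> *."""
    return (OBJECT, n, OBJECT)

def compose(m1, m2):
    (_, a, _), (_, b, _) = m1, m2
    return loop(a + b)  # closure: the result is again a self-loop on *

e = loop(0)                       # the identity morphism
print(compose(loop(2), loop(5)))  # ('*', 7, '*')
```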


In this article, I introduced graphs, and used human memory and social vs. neural networks to motivate two limitations of graph theory:

  1. How to merge two different graphs? Graph theory cannot compose graphs horizontally.
  2. How to nest graphs-within-graphs?  Graph theory cannot compose graphs vertically.

I then introduced you to Category Theory, showing how it both generalizes graph theory and connects to abstract algebra.

In the second half of this introduction, I will discharge my above two commitments. Specifically, next time I will show how:

  1. The concept of functor permits horizontal composition of categories.
  2. The distinction between “inside” and “outside” views facilitates vertical composition of categories.

The Structure Of Physics

Part Of: Philosophy of Science sequence
Followup To: An Introduction To Structural Realism
Content Summary: 800 words, 8 min read


Recall the takeaways from last time:

  1. Realists advance the no-miracles argument: the predictive power of science seems too implausible unless its theories somehow refer to reality.
  2. Anti-realists counter with pessimistic meta-induction: previously successful theories have been discarded; who are we to say that our current theories won’t meet the same fate?
  3. The approximation hypothesis is where these two arguments connect meaningfully: isn’t it more accurate to call older theories approximations rather than worthless?
  4. It is notoriously difficult to describe what “approximation” means.
  5. Some realists have conceded that scientific narratives tend to fail, but produce compelling evidence that scientific equations tend to persist. This position is known as structural realism (where the structure of the formulae matters more than the meaning of the variables).

This summary is all fine, until you start to wonder… what precisely does “formulae structure” mean? And how is such a thing approximated?

Vocabulary Sharpening

Before we begin, consider the word “approximate”. It is directional: while we can say that Newtonian Physics approximates General Relativity, such a statement casts the older theory as the actor. This temporal confusion helps no one. What’s worse, in my view, is that “approximation” hints at the end of science, as though our current theories are causally derived from some Ultimate Structure, some Theory Of Everything.

Are physicists in the business of approximating quantum mechanics? No! Progress runs in the other direction.

Are new theories “pulled up from” a Theory Of Everything? No! Theories are fueled by the earth… data is their breath.

Let us discard the notion of approximation and consider the reverse direction. We might cite “generalization” or “disaggregation” for this purpose. But let me instead use theory decompression.  Here is a sharper expression of the two tasks that lie before us:

  1. What is structure?
  2. How do scientists decompress structure?

The Language Of Categories

Category theory seems the best candidate for a realizer of structuralism.

What is category theory? I’m glad you asked!

  • Category theory divides the world in terms of objects and processes (and meta-processes, and meta-meta-processes, etc).
  • “Processes” are called morphisms, and “meta-processes” are called functors.
  • Meta-processes that inject new information into their target categories are called free functors; those that eject information are called forgetful functors.
  • A category, then, looks a lot like a network graph – with morphisms connecting various objects together.
    • How can a formula become a graph-like thing? Operations (*, +, etc) become morphisms, variables (x, y, etc) become objects.
  • One way to describe categories is by looking for patterns in the underlying “graph”. These patterns are known as universal constructions.
  • Categories earn adjectives for different combinations of universal constructions.
    • For example, any category equipped with a pattern known as “exponential” is called a closed category.
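The operations-become-morphisms idea from the list above can be sketched in a few lines of Python (the encoding and names are my own, purely illustrative):

```python
# Sketch: reading the formula y = 2x + 3 as a graph-like category.
# Variables become objects; operations become morphisms between them.
objects = ["x", "t", "y"]         # t is the intermediate value 2x
morphisms = {
    ("x", "t"): lambda v: 2 * v,  # the operation *2
    ("t", "y"): lambda v: v + 3,  # the operation +3
}

def follow(path, value):
    """Apply the morphisms along a path of objects, in order."""
    for src, dst in zip(path, path[1:]):
        value = morphisms[(src, dst)](value)
    return value

print(follow(["x", "t", "y"], 5))  # 2*5 + 3 = 13
```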

Categorical Interpretations Of Physics

In what follows, I will leverage some ideas from (Baez/Stay 2009) Physics, Topology, Logic and Computation: A Rosetta Stone.

  • Of course, modern physics is composed of two separate theories: Quantum Field Theory and General Relativity.
  • Quantum Field Theory (QFT) is modeled by the category Hilb, a categorical interpretation of Hilbert spaces.
  • General Relativity (GR) is modeled by the category nCob, a categorical interpretation of the topological notion of cobordism.
  • QFT and GR share the same set of patterns, and hence they share the same adjectives. Both are closed symmetric monoidal categories.
    • Noticing the pattern overlap is now motivating work towards a unified [Physics] Theory Of Everything.
    • As an aside, do you know what other categories share this moniker? Computation and linear logic.
  • Newtonian Physics is the category Vec, which is a categorical interpretation of vector space. It is not closed symmetric monoidal.
    • Note: unlike the rest of this section, take the above line with buckets of salt. It is my own conjecture, used for illustrative purposes.
  • In this setting, what is the meaning of theory decompression? Such a thing might be the construction of free functors, i.e., decompression functors.

Structural Realism: Disapproximation

I intend to flesh out this present section in the coming years. But for now, here is where we leave it.

Harden Your Query

We have successfully hardened the concept of structure, and used it to harden our theory of physics. Consider again our second question:

  • How do scientists decompress structure?

With our newfound understanding of structure, we can make this research question more precise:

  • How does Vec produce the same predictions as Hilb + nCob?
  • How do scientists go about constructing decompression functors?

Category theory is not yet powerful enough to answer these questions. They are, I submit, the most important unsolved questions in all of philosophy of science.

Wrapping Up

Let me close on a note of poetry. The following quote is one of the most beautiful thoughts I have ever encountered. It is attributed to Stephen Hawking.

What is it that breathes fire into the equations and makes a universe for them to describe?

For us, our query has become:

What is it that breathes fire into these categories and makes a universe for them to describe?