Complementary Learning Systems

Part Of: Demystifying Memory sequence
Content Summary: 1000 words, 10 min read


Your brain is constantly keeping track of the world and your body. It represents these ever-changing environments by patterns of neural activation. Knowledge is not kept in the neurons themselves, but in the connections between neurons.

Sometimes, the brain will discover useful regularities in the environment, and store these patterns for later use. This is long-term memory. We shall concern ourselves with five kinds of long-term memory:

  1. Episodic: ability to remember events or episodes (e.g., dinner last Tuesday night)
  2. Semantic: ability to remember facts and concepts (e.g., hands have five fingers)
  3. Procedural: ability to develop skills (e.g., playing the piano)
  4. Behavioral: ability to remember stimulus-outcome pairs (e.g., bell means food)
  5. Emotional: ability to remember emotional information (e.g., she is always angry)

These memory systems are computed in different areas of the brain.

  1. Episodic memories are computed by the hippocampus
  2. Semantic memories are computed by the association neocortex
  3. Procedural memories are computed by the somatosensory neocortex
  4. Behavioral memories are computed by the basal ganglia
  5. Emotional memories are computed by the central amygdala

Only episodic and semantic memory are directly accessible to consciousness (i.e., working memory). The others are available only to the autonomous mind.

CLS- Categories of Long-Term Memory (1)


We have previously described conscious experience as a mental movie. But, unlike a normal theater, consciousness has several screens, each playing a different sense modality: visual information, auditory information, and so on. Call this the multimodal movie.

Semantic memory comes in two forms: encyclopedic memory (abstract descriptions of events) and conceptual memory (concepts and their inter-relationships). Both abstractions are derived from the movie by removing redundant information.

CLS- Episodic vs Semantic Memory

Mind wandering is the tendency of animals to recall past experiences. But why does mind wandering resurrect the details of what was seen, heard, smelled, touched? Why not simply use the plot summary (encyclopedic memory) instead?

Why does episodic memory exist at all?


Henry Molaison was born on February 26, 1926. As a child, he suffered from epilepsy.

CLS- Patient HM (2)

His doctors removed what they thought to be the source of the seizures: the hippocampus. After the surgery, Henry still recognized objects, was able to solve puzzles, and even had the same IQ. He had a rich emotional life, and could learn new skills (e.g., to play the piano). But he was completely incapable of forming new episodic memories. Henry (i.e., Patient HM) was locked in a 5-minute loop, never remembering prior events.

Let’s imagine different kinds of amnesia Henry might have experienced.

Scenario 1. Henry has no retrograde amnesia (old memories were unperturbed), but suffers severe anterograde amnesia (unable to create new memories). From this data, we might conclude that the hippocampus creates, but does not store, episodic memories.

CLS- HM Amnesia Pattern v1 (1)

Scenario 2. Henry experiences both severe retrograde and anterograde amnesia. From this data, we might conclude that the hippocampus creates and stores episodic memories.

CLS- HM Amnesia Pattern v2 (2)

Neither scenario actually happened. Instead, Henry experienced temporally graded retrograde amnesia:

CLS- HM Amnesia Pattern v3 (2)

This shows that, while the hippocampus creates and stores episodic memories, these memories are eventually copied elsewhere. This process is called consolidation. Hippocampal damage destroys memories that have not yet been consolidated.

But why should the brain copy memories? This seems inefficient. And why does this process take years, even decades?


The connectionist paradigm models the brain as a neural network. The AB-AC task illustrates a challenge for connectionism. It goes as follows:

You want to associate stimulus A with response B. For example, when you hear “chair”, you should say “map”. There are many such associations (Chair-Map, Book-Dog, Car-Idea). This is the AB list.

After you achieve 100% recall on the AB list, a new set of stimulus-response words is given: the AC list. You want to learn both. However, the AB and AC lists have the same stimuli paired with novel responses (e.g., Chair-Printer, Book-Flower, Car-Shirt).
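To make the structure of the task concrete, here is a minimal sketch in Python. The word pairs come from the examples above; the two "learners" and the helper names (`flat_memory`, `context_memory`, `recall`) are hypothetical, introduced only for illustration. The point is that a memory holding one response per stimulus cannot retain both lists, while a memory that also keys on which list is being tested can.

```python
# Word pairs taken from the examples in the text.
AB_LIST = {"chair": "map", "book": "dog", "car": "idea"}
AC_LIST = {"chair": "printer", "book": "flower", "car": "shirt"}

# Both lists use exactly the same stimuli; only the responses differ.
assert AB_LIST.keys() == AC_LIST.keys()

# Hypothetical learner 1: one response per stimulus.
# Learning the AC list overwrites every AB association.
flat_memory = dict(AB_LIST)
flat_memory.update(AC_LIST)

# Hypothetical learner 2: responses are keyed by (list name, stimulus),
# so the two lists no longer compete for the same slot.
context_memory = {("AB", s): r for s, r in AB_LIST.items()}
context_memory.update({("AC", s): r for s, r in AC_LIST.items()})


def recall(lookup, answer_key):
    """Fraction of stimuli for which lookup(stimulus) gives the keyed response."""
    hits = sum(1 for s, r in answer_key.items() if lookup(s) == r)
    return hits / len(answer_key)


flat_ab = recall(flat_memory.get, AB_LIST)
flat_ac = recall(flat_memory.get, AC_LIST)
ctx_ab = recall(lambda s: context_memory.get(("AB", s)), AB_LIST)
ctx_ac = recall(lambda s: context_memory.get(("AC", s)), AC_LIST)

print(flat_ab, flat_ac)  # 0.0 1.0
print(ctx_ab, ctx_ac)    # 1.0 1.0
```

Neither learner is meant as a model of the brain. They only show why the task needs some notion of list context before it can be solved at all.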

How well do humans and connectionist models do on this task? Let’s find out! The following graphs show performance after the AB list has been learned perfectly. The y-axis is percent correct; the x-axis is the number of exposures to the AC list.

CLS- Catastrophic Interference (2)

Consider the left graph. The dotted line is AC recall over time: humans were able to learn the AC list. The solid line shows AB list performance. As humans learned AC associations, their AB performance suffered a little, from 100% to 60%. This is moderate interference.

Consider the right graph. The dotted line shows that the model is able to learn the AC list, just like the human. But the solid line shows that AB recall very quickly drops to 0%. This is catastrophic interference.

Catastrophic interference occurs when the AB list and AC list are learned separately (focused learning). But what if you learn them at the same time? More specifically, what if you train against a shuffled set of AB and AC associations (interleaved learning)?

CLS- Interleaved vs Focused Learning (2)

On the left, focused learning (black squares) shows catastrophic interference against AB memories, as before. But interleaved learning (white dots) shows zero interference!

On the right, we see another consequence of interleaved learning: new memories are acquired much more slowly.
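The contrast between focused and interleaved learning can be reproduced in a toy simulation. The sketch below is not the model from [M95]. It assumes a much simpler setup: a single-layer linear network trained with the delta rule, three word pairs per list, one input unit per stimulus plus two "list context" units, and one output unit per possible response. The learning rate, epoch count, and all function names are illustrative choices.

```python
import random

N_ITEMS = 3                 # word pairs per list (assumed, for illustration)
N_IN = N_ITEMS + 2          # stimulus units + two list-context units (AB, AC)
N_OUT = 2 * N_ITEMS         # response units: the B words, then the C words


def make_pattern(item, context):
    """Return (input vector, index of the correct response unit)."""
    x = [0.0] * N_IN
    x[item] = 1.0                   # which stimulus (A word) is shown
    x[N_ITEMS + context] = 1.0      # which list is being tested
    return x, context * N_ITEMS + item


AB_LIST = [make_pattern(i, 0) for i in range(N_ITEMS)]
AC_LIST = [make_pattern(i, 1) for i in range(N_ITEMS)]


def outputs_for(weights, x):
    """Linear activation of every response unit for input x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def train(weights, patterns, epochs, rng, lr=0.05):
    """Delta-rule training; patterns are reshuffled on every epoch."""
    for _ in range(epochs):
        order = list(patterns)
        rng.shuffle(order)
        for x, target in order:
            out = outputs_for(weights, x)
            for j, row in enumerate(weights):
                error = (1.0 if j == target else 0.0) - out[j]
                for i in range(N_IN):
                    row[i] += lr * error * x[i]


def accuracy(weights, patterns):
    """Fraction of patterns whose most active response unit is the correct one."""
    correct = 0
    for x, target in patterns:
        out = outputs_for(weights, x)
        correct += out.index(max(out)) == target
    return correct / len(patterns)


def run(interleaved, epochs=300, seed=0):
    rng = random.Random(seed)
    net = [[0.0] * N_IN for _ in range(N_OUT)]
    if interleaved:
        train(net, AB_LIST + AC_LIST, epochs, rng)   # both lists, shuffled together
    else:
        train(net, AB_LIST, epochs, rng)             # focused: AB first...
        train(net, AC_LIST, epochs, rng)             # ...then AC only
    return accuracy(net, AB_LIST), accuracy(net, AC_LIST)


focused_ab, focused_ac = run(interleaved=False)
interleaved_ab, interleaved_ac = run(interleaved=True)
print("focused:     AB", focused_ab, "AC", focused_ac)
print("interleaved: AB", interleaved_ab, "AC", interleaved_ac)
```

Under these assumptions, the focused run should end up answering AB probes with AC responses, while the interleaved run keeps both lists. The reason is visible in the weights: during focused AC training, every update lands on stimulus units that the AB associations also depend on, and nothing in the training set pushes back.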


We are ready to put the puzzle together.

Catastrophic interference is an inevitable consequence of systems that employ highly-overlapping distributed representations, despite the fact that such systems have a number of highly desirable properties (e.g., the ability to perform generalization and inference).

This problem can be addressed by employing a structurally distinct system with complementary learning properties: sparse, non-overlapping representations that are highly robust to interference from subsequent learning. Such a sparse system by itself would be like an autistic savant: good at memorization but unable to perform everyday inferences. But when paired with the highly overlapping system, a much more versatile overall system can be achieved.
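A rough way to see why sparseness protects against interference is to count how many active units two unrelated patterns share. The sketch below assumes random binary codes over 100 units, with 50 active units per pattern for the "overlapping" system and 5 for the "sparse" one. These numbers and function names are illustrative assumptions, not figures from [M95] or [O11].

```python
import random
from itertools import combinations


def random_code(n_units, n_active, rng):
    """A pattern, represented as the set of indices of its active units."""
    return frozenset(rng.sample(range(n_units), n_active))


def mean_overlap(codes):
    """Average fraction of a pattern's active units shared with another pattern."""
    pairs = list(combinations(codes, 2))
    return sum(len(a & b) / len(a) for a, b in pairs) / len(pairs)


rng = random.Random(0)
N_UNITS, N_PATTERNS = 100, 20

# Overlapping code: half of all units are active in every pattern.
dense_codes = [random_code(N_UNITS, 50, rng) for _ in range(N_PATTERNS)]
# Sparse code: only 5% of units are active in any one pattern.
sparse_codes = [random_code(N_UNITS, 5, rng) for _ in range(N_PATTERNS)]

dense_overlap = mean_overlap(dense_codes)
sparse_overlap = mean_overlap(sparse_codes)
print(f"dense code:  {dense_overlap:.2f} of active units shared")
print(f"sparse code: {sparse_overlap:.2f} of active units shared")
```

With random codes like these, the expected shared fraction is simply the activity level (about one half versus about one twentieth). Learning a new pattern changes connections on its active units, so fewer shared units means fewer old memories disturbed. The same property is what makes the sparse system poor at generalization: patterns that share nothing cannot inform one another.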

The neocortex and hippocampus comprise these learning systems:

CLS- Two Component Model

First introduced in 1995, Complementary Learning Systems (CLS) theory accounts for a wide range of extant biological, neuropsychological, and behavioral data. It explains why the hippocampus exists, why it performs consolidation, and why consolidation takes years to complete.

The CLS theory was first presented in [M95]. The data in section 4 are taken from that paper. Section 5 quotes liberally from [O11].

  • [M95] McClelland et al. (1995). Why There Are Complementary Learning Systems in the Hippocampus and Neocortex: Insights From the Successes and Failures of Connectionist Models of Learning and Memory
  • [O11] O’Reilly et al. (2011). Complementary Learning Systems
