# Rule Feedback Loops

Part Of: Breakdown of Will sequence
Followup To: Willpower As Preference Bundling
Content Summary: 900 words, 9 min reading time

Context

When “in the moment”, humans are susceptible to bad choices. Last time, we introduced willpower as a powerful solution to such akrasia. More specifically:

• Willpower is nothing more, and nothing less, than preference bundling.
• Inasmuch as your brain can sustain preference bundling, it has the potential to redeem its fits of akrasia.

But this only explained how preference bundling works at the level of utility curves. Today, we will learn how preference bundling is mentally implemented, and this mental model will in turn provide us with predictive power.

Building Mental Models

Time to construct a model! 🙂 You ready?!

In our last post, we discussed three distinct phases that occur during preference bundling. We can then imagine three separate modules (think: software programs) that implement these phases.

This diagram provides a high-level, functional account of how our minds make decisions. The three modules can be summarized as follows:

• The Utility Transducer module is responsible for identifying affordances within sensory phenomena, and compressing a multi-dimensional description into a one-dimensional value.
• The Preference Bundler module can aggregate utility representations that are sufficiently similar. Such a technique is useful for combating akrasia.
• The Choice Implementer module selects Choice1 if Preference1 > Preference2. It is also responsible for computing when and how to execute a preference-selection.
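To make the three modules a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the `Option` type, the weighted-sum transducer, the use of a `category` label to decide which options count as "sufficiently similar", and the example numbers. It also leaves out the "when and how" part of the Choice Implementer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Option:
    name: str
    category: str   # hypothetical label the bundler uses to judge similarity
    features: dict  # multi-dimensional description of the option


def transduce(option, weights):
    """Utility Transducer: compress many feature values into one number."""
    return sum(weights.get(key, 0.0) * value for key, value in option.features.items())


def bundle(options, weights):
    """Preference Bundler: add up the utilities of options in the same category."""
    totals = {}
    for option in options:
        totals[option.category] = totals.get(option.category, 0.0) + transduce(option, weights)
    return totals


def choose(preferences):
    """Choice Implementer: select the category with the highest preference."""
    return max(preferences, key=preferences.get)


weights = {"pleasure": 1.0, "health": 2.0}  # assumed weights, for illustration only
options = [
    Option("dessert tonight", "indulge", {"pleasure": 3.0, "health": -1.0}),
    Option("dessert tomorrow", "indulge", {"pleasure": 3.0, "health": -1.0}),
    Option("fruit tonight", "abstain", {"pleasure": 1.0, "health": 1.0}),
    Option("fruit tomorrow", "abstain", {"pleasure": 1.0, "health": 1.0}),
]
preferences = bundle(options, weights)
choice = choose(preferences)
```

With these made-up numbers the bundled preference for "abstain" comes out ahead. A real Utility Transducer would presumably be far less tidy than a weighted sum.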

The above diagram is, of course, merely a germinating seed of a more precise mental architecture (it turns out that mind-space is rather complex 🙂 ). Let us now refine our account of the Preference Bundler.

Personal Rules

Consider what it means for a brain to implement preference bundling. Your brain must receive utility-anticipated information from an arbitrary number of choice valuations, and aggregate similar decisions into a single measure.

Obviously, the mathematics of such a computation lies underneath your awareness (your superpower is math). However, does the process entirely fail to register in the small room of consciousness?

This seems unlikely, given the common phenomenal experience of personal rules. Is it not likely that the conscious experience of “I will never stay up past midnight on a weeknight” correlates in some way with the actions of the Preference Bundler?

Let’s generalize this question a bit. In the context of personal rules, we are inquiring about the meaning of quale-module links. This type of question is relevant in many other contexts as well. It seems to me that such links can be roughly modeled in the vocabulary of dual-process theory, where System 1 (parallel modules) data bubbles up into System 2 (sequential introspection) experience.

Let us now assume that the quale of personal rules correlates to some variety of mental substance. What would that substance have to include?

In terms of complexity analysis, it seems to me that a Preference Bundler need not generate relevant rules on the fly. Instead, it could more efficiently rely on a form of rule database, which tracks a set of rules proven useful in the past. Our mental architecture, then, looks something like this (qualia are in pink):

In his book, Ainslie presents intriguing connections between this idea of a rule database and similar notions in the history of ideas:

The bundling phenomenon implies that you will serve your long-range interest if you obey a personal rule to behave alike towards all members of a category. This is the equivalent of Kant’s categorical imperative, and echoes the psychologist Lawrence Kohlberg’s sixth and highest principle of moral reasoning, deciding according to principle. It also explains how people with fundamentally hyperbolic discount curves may sometimes learn to choose as if their curves were exponential.
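The last sentence of that quote can be checked with a little arithmetic. The sketch below assumes the common hyperbolic form value = amount / (1 + k × delay); the amounts, delays, spacing, and k are invented for illustration and are not taken from the book. It compares a single choice between a smaller-sooner (SS) and a larger-later (LL) reward with the same choice treated as a rule covering five weekly repetitions.

```python
def hyperbolic_value(amount, delay, k=1.0):
    """Present value of a reward under (assumed) hyperbolic discounting."""
    return amount / (1.0 + k * delay)


def bundled_value(amount, first_delay, spacing, repeats, k=1.0):
    """Sum the discounted values of the same reward repeated at regular intervals."""
    return sum(
        hyperbolic_value(amount, first_delay + i * spacing, k)
        for i in range(repeats)
    )


# One isolated choice: SS (5 units) arrives in 1 day, LL (10 units) in 4 days.
single_ss = hyperbolic_value(5, delay=1)   # 2.5
single_ll = hyperbolic_value(10, delay=4)  # 2.0

# The same choice treated as a rule covering 5 weekly repetitions.
rule_ss = bundled_value(5, first_delay=1, spacing=7, repeats=5)   # about 3.75
rule_ll = bundled_value(10, first_delay=4, spacing=7, repeats=5)  # about 4.05

print("single choice favors:", "SS" if single_ss > single_ll else "LL")
print("bundled rule favors:", "SS" if rule_ss > rule_ll else "LL")
```

Under these assumptions the isolated choice favors SS, while the bundled series favors LL. The later repetitions are all far enough away that LL dominates them, and their sum outweighs the one near-term temptation.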

Recursive Feedback Loops

Personal rules, of course, do not spontaneously appear within your mind. They are constructed by cognitive processes. Let us again expand our model to capture this nuance:

Describing our new components:

• The Rule Controller module is responsible both for generating new rules (e.g., “I will not stay up past midnight on a weeknight”), and re-factoring existing ones.
• The “Honored?” checkpoint conveys information on how well a given personal rule was followed. The Rule Controller module may use this information to update the rule database.
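As a rough illustration of these two components, here is a small Python sketch. The `Rule` and `RuleController` names, the numeric `strength` field, and the fixed update step are all hypothetical; the model above only says that the “Honored?” signal may be used to update the rule database, not how.

```python
from dataclasses import dataclass, field


@dataclass
class Rule:
    text: str
    strength: float = 0.5  # hypothetical confidence in [0, 1] that the rule will hold


@dataclass
class RuleController:
    database: dict = field(default_factory=dict)  # the rule database, keyed by a short name
    step: float = 0.25                            # assumed size of each update

    def add_rule(self, key, text):
        """Generate a new personal rule and store it."""
        self.database[key] = Rule(text)

    def record(self, key, honored):
        """Apply the "Honored?" signal: strengthen on compliance, weaken on violation."""
        rule = self.database[key]
        delta = self.step if honored else -self.step
        rule.strength = min(1.0, max(0.0, rule.strength + delta))
        return rule.strength


controller = RuleController()
controller.add_rule("bedtime", "I will not stay up past midnight on a weeknight")
after_compliance = controller.record("bedtime", honored=True)   # 0.75
after_violation = controller.record("bedtime", honored=False)   # back to 0.5
```

A fuller Rule Controller would also re-factor rules (for example, by narrowing a rule that keeps getting violated), which this sketch does not attempt.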

A feedback loop exists in our mental model. Observe:

Feedback loops can explain a host of strange behavior. Ainslie describes the torment of a dieter:

Even if [a food-conscious person] figures, from the perspective of distance, that dieting is better, her long-range perspective will be useless to her unless she can avoid making too many rationalizations. Her diet will succeed only insofar as she thinks that each act of compliance will be both necessary and effective – that is, that she can’t get away with cheating, and that her current compliance will give her enough reason not to cheat subsequently. The more she is doubtful of success, the more likely it will be that a single violation will make her lose this expectation and wreck her diet. Personal rules are a recursive mechanism; they continually take their own pulse, and if they feel it falter, that very fact will cause further faltering.
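The recursive quality Ainslie describes can be sketched as a toy simulation. The assumptions here are mine, not his: confidence in the rule is a single number, the rule is honored only when confidence exceeds the current temptation, and a violation costs more confidence than a compliance earns.

```python
def simulate(confidence, temptations, gain=0.05, loss=0.3):
    """Return, for each temptation, whether the rule was honored.

    Confidence feeds the decision, and the decision feeds back into confidence.
    """
    honored_history = []
    for temptation in temptations:
        honored = confidence > temptation
        if honored:
            confidence = min(1.0, confidence + gain)
        else:
            confidence = max(0.0, confidence - loss)
        honored_history.append(honored)
    return honored_history


# Both dieters face the same sequence, including one irresistible temptation (1.0).
temptations = [0.5, 0.5, 1.0, 0.5, 0.5]

confident_dieter = simulate(0.9, temptations)
doubtful_dieter = simulate(0.6, temptations)

print(confident_dieter)
print(doubtful_dieter)
```

In this toy setup both dieters lapse once, at the irresistible temptation. The confident dieter recovers afterwards, while the doubtful dieter's single lapse drags confidence below the level of ordinary temptations and every later choice fails too.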

Takeaways

And that’s a wrap! 🙂 I am hoping to walk away from this article with two concepts firmly installed:

• Preference bundling is mentally implemented via a database of personal rules (“I will do X in situations that involve Y”).
• Personal rules constitute a feedback loop, whereby rule-compliance strengthens (and rule-circumvention weakens) the circuit.

Next Up: [Iterated Schizophrenic’s Dilemma]

# [Sequence] Breakdown Of Will

In this sequence, we will explore a précis of George Ainslie’s book Breakdown of Will. Specifically, we will explore the implications of akrasia (the act of behaving against one’s own desires).

Preliminary Posts

Content Summary

1. An Introduction To Hyperbolic Discounting. Based on Chapters 1-3. Introduces the concepts of akrasia and utility, then proceeds to model akrasia as a symptom of discount curves shaped like hyperbolas.
2. Willpower As Preference Bundling. Based on Chapter 5. Discusses how willpower (a therapy against akrasia) comes to make our successive selves consistent with one another. Willpower is presented as the brain subtly manipulating how it instantiates hyperbolic discount functions.
3. Personal Rule Feedback Loops. Based on Chapter 6. Builds a mental model of preference bundling, and explores the recursive nature of personal rules.
4. Iterated Schizophrenic’s Dilemma. Based on Chapter 6. Grounds Ainslie’s account of willpower (and preference bundling) in a modified form of Iterated Prisoner’s Dilemma.
5. Against Willpower. Based on Chapter 9. If willpower is preference bundling, then its mechanisms become available for scrutiny. Ainslie here locates four surprising implications of his theory of willpower, which suggest that it is not the uniformly beneficial tool that we might suspect.

# An Introduction To Hyperbolic Discounting

Part Of: [Breakdown of Will] sequence

• What Is Akrasia?
• Utility Curves, In 200 Words Or Less!
• Choosing Marshmallows
• Devil In The (Hyperbolic) Details
• The Self As A Population
• Takeaways

What Is Akrasia?

Do you agree or disagree with the following?

In a prosperous society, most misery is self-inflicted. We smoke, eat and drink to excess, and become addicted to drugs, gambling, credit card abuse, destructive emotional relationships, and simple procrastination, usually while attempting not to do so.

It would seem that behavior contradicting one’s own desires is, at least, a frustratingly common human experience. Aristotle called this kind of experience akrasia. Here’s the apostle Paul’s description:

I do not understand what I do. For what I want to do I do not do, but what I hate I do. (Romans 7:15)

The phenomenon of akrasia, and the entire subject of willpower generally, is controversial (a biasing attractor). Nevertheless, both its description and underlying mechanisms are empirically tractable. Let us now proceed to help Paul understand, from a cognitive perspective, the contradictions emerging from his brain.

We begin our journey with the economic concept of utility.

Utility Curves, In 200 Words Or Less!

Let utility here represent the strength with which a person desires a thing. This value may change over time. A utility curve, then, simply charts the relationship between utility and time. For example:

Let’s zoom in on this toy example, and name three temporal locations:

• Let tbeginning represent the time I inform you about a future reward.
• Let treward represent the time you receive the reward.
• Let tmiddle represent some intermediate time, between the above.

Consider the case when NOW = tbeginning. At that time, we see that the choice is valued at 5 “utils”.

Consider what happens as the knife edge of the present (the red line) advances. At NOW = tmiddle, the utility of the choice (the strength of our preference for it) doubles:

Increasing utility curves also go by the name discounted utility, which stems from a different view of the x-axis (at the decision point looking towards the past, or setting x to be in units of time delay). Discounted utility reflects something of human psychology: given a fixed reward, other things equal, receiving it more quickly is more valuable.

This concludes our extremely complicated foray into economic theory. 😛 As you’ll see, utility curves present a nice canvas on which we can paint human decision-making.

Choosing Marshmallows

Everyday instances of akrasia tend to be rather involved. Consider the decision to maintain destructive emotional relationships: the underlying causal graph is rather difficult to parse.

Let’s simplify. Ever heard of the Stanford Marshmallow Experiment?

In these studies, a child was offered a choice between one small reward (sometimes a marshmallow) provided immediately or two small rewards if he or she waited until the tester returned (after an absence of approximately 15 minutes). In follow-up studies, the researchers found that children who were able to wait longer for the preferred rewards tended to have better life outcomes, as measured by SAT scores, educational attainment, body mass index (BMI) and other life measures.

Naming the alternatives:

• SS reward: Call the immediate, one-marshmallow option the SS (smaller-sooner) reward.
• LL reward: Call the delayed, two-marshmallow option the LL (larger-later) reward.

Marshmallows are simply a playful vehicle to transport concepts. Why are we tempted to reach for SS despite knowing our long-term interests lie with LL?

Here’s one representation of the above experiment (LL is the orange curve, SS is green):

Our definition of utility was very simple: a measure of preference strength. This article’s model of choice will be equally straightforward: humans always select the choice with higher utility.

Which option will people select? Always the orange curve. No matter how far the knife edge of the present advances, the utility of LL always exceeds that of SS:

Shockingly, economists like to model utility curves like these with mathematical formulas, rather than Google Drawings. These utility relationships can be produced with exponential functions; let us call them exponential discount curves.
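For the curious, here is a minimal Python sketch of such curves. The formula value = amount × e^(−rate × delay) is the standard exponential form; the reward times, the rate, and the function names are assumptions picked for illustration (SS is one marshmallow at minute 20, LL is two marshmallows 15 minutes later).

```python
import math


def exponential_value(amount, delay, rate=0.03):
    """Present value of a reward that is `delay` minutes away (assumed rate)."""
    return amount * math.exp(-rate * delay)


SS_TIME, SS_AMOUNT = 20, 1  # one marshmallow at minute 20 (made-up timing)
LL_TIME, LL_AMOUNT = 35, 2  # two marshmallows, 15 minutes after that

# Ratio of LL's value to SS's value at each minute up to the SS reward.
ratios = []
for now in range(0, SS_TIME + 1):
    ss = exponential_value(SS_AMOUNT, SS_TIME - now)
    ll = exponential_value(LL_AMOUNT, LL_TIME - now)
    ratios.append(ll / ss)

print(min(ratios), max(ratios))
```

Because both rewards are discounted at the same rate, the ratio between them does not depend on NOW: it stays at 2 × e^(−0.45), which is greater than 1. Whichever option is ahead at the start stays ahead, so exponential curves with a shared rate never cross.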

Devil In The (Hyperbolic) Details

But the above utility curves are not the only ones that could be implemented in the brain. Even if we held Utility(tbeginning) and Utility(treward) constant, the rate at which Utility(NOW) increases may vary. Consider what happens when most of the utility obtains close to reward-time (when the utility curves form a “hockey stick”):

Let us quickly ground this alternative in a mathematical formalism. A function that fits our “hockey stick” criteria is the hyperbolic function; so we will name the above a hyperbolic discount curve.

Notice that the above “overlap” is highly significant – it indicates different choices at different times:

This is the birthplace of akrasia – the cradle of “sin nature” – where SS (smaller-sooner) rewards temporarily outweigh LL (larger-later) rewards.
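A small numerical sketch shows the reversal. It assumes the common hyperbolic form value = amount / (1 + k × delay), with made-up numbers: SS is one marshmallow at minute 20, LL is two marshmallows at minute 35, and k = 0.2 per minute.

```python
def hyperbolic_value(amount, delay, k=0.2):
    """Present value of a reward that is `delay` minutes away (assumed k)."""
    return amount / (1.0 + k * delay)


SS_TIME, SS_AMOUNT = 20, 1  # smaller-sooner reward (made-up timing)
LL_TIME, LL_AMOUNT = 35, 2  # larger-later reward


def preferred(now):
    """Return which reward has the higher discounted value at time `now`."""
    ss = hyperbolic_value(SS_AMOUNT, SS_TIME - now)
    ll = hyperbolic_value(LL_AMOUNT, LL_TIME - now)
    return "SS" if ss > ll else "LL"


early = preferred(0)   # far from both rewards
late = preferred(19)   # one minute before the SS reward
print(early, late)
```

With these numbers, LL is preferred from a distance and SS is preferred once it is close. Working through the inequality, SS overtakes LL once NOW passes 1/k + 5 = 10 minutes, so the curves cross exactly once before the SS reward arrives.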

The Self As A Population

Consider the story of Odysseus and the sirens:

Odysseus was curious as to what the Sirens sang to him, and so, on the advice of Circe, he had all of his sailors plug their ears with beeswax and tie him to the mast. He ordered his men to leave him tied tightly to the mast, no matter how much he would beg. When he heard their beautiful song, he ordered the sailors to untie him but they bound him tighter.

With this powerful illustration of akrasia, we are tempted to view Odysseus as two separate people. Pre-siren Odysseus is intent on sailing past the sirens, but post-siren Odysseus is desperate to approach them. We even see pre-siren Odysseus restricting the freedoms of post-siren Odysseus…

How can identity be divided against itself? This becomes possible if we are, in part, the sum of our preferences. I am me because my utility for composing this article exceeds my utility attached to watching a football game.

Hyperbolic discounting provides a tool to quantify this concept of competing selves. Consider again the above image. The person you are between t1 and t2 makes choices differently than the You of all other times.

Another example, using this language of warfare between successive selves:

Looking at a day a month from now, I’d sooner feel awake and alive in the morning than stay up all night reading Wikipedia. But when that evening comes, it’s likely my preferences will reverse; the distance to the morning will be relatively greater, and so my happiness then will be discounted more strongly compared to my present enjoyment, and another groggy morning will await me. To my horror, my future self has different interests to my present self. Consider, too, the alcoholic who moves to a town in which alcohol is not sold, anticipating a change in desires and deliberately constraining their own future self.

Takeaways

• Behavior contradicting your desires (akrasia) can be explained by appealing to the rate at which preferences diminish over time (utility discount curve).
• A useful way of reasoning about hyperbolic discount curves is warfare between successive “yous”.

Next Up: [Willpower As Preference Bundling]