The Machinery of Morality

Part Of: Demystifying Ethics sequence
Content Summary: 1000 words, 10 min read

Propriety Frames

Main Article: An Introduction To Propriety Frames

The primate brain contains a diverse set of memory structures. For example, episodic and semantic memory store narratives and facts, respectively.

Propriety frames are a memory format that retains social intuitions. This form of memory supports normative judgments: assessments of which behaviors are appropriate or inappropriate.

Propriety frames are organized by situational context. A Restaurant frame provides the social intuitions for restaurant interactions. Frames are the substrate of rituals.

Frames are organized hierarchically to promote reuse. For example, the Eating frame is relevant in contexts besides restaurant dining.

[Figure: Propriety frames, restaurant example]

When a mother instructs her son to not yell in the store, the child installs an update to his Shopping frame. When a family exchanges gossip around a campfire, they are doing so in part to synchronize their propriety frames.

Intuition Generators

The contents of our social intuitions are not arbitrary, and our environment does not fully determine our moral sense. The brain also possesses innate social intuition generators, which contribute to the content of social judgments.

As outlined in Moral Foundations Theory, there are six such generators: care, fairness, loyalty, authority, sanctity, and liberty.

People are genetically and environmentally disposed to respond to certain generators more strongly than others. Call the profile of emotional intensities elicited by each generator a person's social matrix.

People with similar matrices tend to gravitate towards similar political parties. When you measure the social matrices of American citizens, you can see large differences between the social intuitions of Democrats and Republicans.

[Figure: Social matrices by political party]

These differences in social matrices explain much of American politics.

  • Why do Democrats praise entitlements, but Republicans denounce them? Because Democrats heavily emphasize Care for the poor, whereas Republicans resonate more strongly with questions of Fairness (exploitation and moral hazard).
  • Why are Democrats more skeptical of patriotism than their Republican counterparts? Perhaps because they respond less strongly to Loyalty to country.

For more information, see Graham et al (2009). Liberals and conservatives rely on different sets of moral foundations.

Generators vs Frames

Main Article: A Dual-Process Theory of Moral Judgment

Two mechanisms contribute to social intuitions: propriety frames, and intuition generators.

[Figure: Frames vs. generators]

Generators are located subcortically, and thus produce intuitions quickly and automatically. Frames are stored in the cortical mantle, and are thus slower and more amenable to conscious awareness.

As the work of Joshua Greene shows, deontological (rule-based) attitudes are generated rapidly, whereas consequentialist (outcome-based) judgments are produced more slowly and are more vulnerable to distraction. This suggests a relatively straightforward mapping:

  • People with a deontological style rely more heavily on their Generators.
  • People with a consequentialist style lean instead on their Frames.

Frames and Generators can influence one another, albeit slowly. Moral argumentation can change one’s mind, and these frame updates can percolate down to change one’s social matrices (intuition weights). Call this frame internalization.

Similarly, if one’s private intuitions diverge from a culturally inherited norm, that frame can be updated to be more consistent with one’s personality. Call this frame refinement.

For more information, see Cushman, Young, Greene (2010). Our multi-system moral psychology: Towards a consensus view

Cultural Regimes

Human communities generate emergent networks of shared norms known as cultural regimes. Regimes cluster into distinct categories, each built from its own characteristic frames and intuition generators.

[Figure: Regime categories]

Language made regime accretion possible. No longer is expertise confined to the skull of the individual. No longer is death a Full Restart button. Our species sends cultural expertise down through the generations. This knowledge – these frames – has become increasingly sophisticated over the course of human history.

Hominid evolution has also seen the advent of regime diversification. Modern religions (of the Axial Age) were successful because they improved our ability to trust strangers. This in turn dramatically increased the size of feasible social groups.

[Figure: Evolution of cultural regimes]

Moral Tagging

We have so far only been speaking about social attitudes (should / should not). What about moral attitudes (good / evil)?

Moral attitudes are nothing more than a kind of social attitude. I know of no moral attitude that can be divorced from a social context.

Moral attitudes are constructed by moral tagging, which endows a subset of social attitudes with anger and disgust reactions (as opposed to the more typical responses of reputation appraisal, gossip, and shaming).

Further, moral tagging produces appraisal inflexibility. Moral violations are viewed as wrong everywhere, in every context. This is in contrast to other social violations, for which it is easier to see circumstances in which the behavior would seem less bad.

The boundary between Virtuous and Tolerable is smooth, reflecting the flexibility of our intuition generators. In contrast, the boundary between Tolerable and Intolerable is sharp.

[Figure: Latitudes of acceptance]

A similar distinction appears amidst disagreement. There are two ways people’s judgments can diverge:

  • Moral disagreement: is a given behavior Evil?
  • Social disagreement: is a given behavior Inappropriate?

[Figure: Moral vs. social disagreement]

For more information, see Tetlock et al (2000). The psychology of the unthinkable: taboo trade-offs, forbidden base rates, and heretical counterfactuals.

Norm Synchronization

The space of possible social intuitions is vast. However, group living only became possible with relatively homogeneous norms. How do individual brains synchronize propriety frames and social matrices within a group?

At least four mechanisms provide norm synchronization.

  • Natural language and facial expressions are used to communicate social and emotional information.
  • Reputation systems benefit social beings who are especially adept at learning & conforming to the frames of their peers.
  • First-order punishment emotions (anger at the transgressor) incentivize people not to violate implicit societal expectations.
  • Second-order punishment emotions (anger at those who tolerate the transgressor) incentivize a community to develop a unified response to dyadic disagreements.

Takeaways

  • In Propriety Frames, we saw how the brain retains social information.
  • Limbic machinery, such as the Care module, generates normative intuitions.
  • Generators and frames interact to facilitate both top-down and bottom-up learning.
  • A subset of social violations are tagged as morally intolerable.
  • Several emotional mechanisms work to consolidate norms within a group.

Related Work

  • Graham, Haidt, Nosek (2009). Liberals and conservatives rely on different sets of moral foundations.
  • Cushman, Young, Greene (2010). Our multi-system moral psychology: Towards a consensus view.
  • Tetlock et al. (2000). The psychology of the unthinkable: Taboo trade-offs, forbidden base rates, and heretical counterfactuals.

 

Singular Value Decomposition

Part Of: Algebra sequence
Followup To: Eigenvalues and Eigenvectors
Content Summary: 1300 words, 13 min read.

Limitations of Eigendecomposition

Last time, we learned how to locate the eigenvalues and eigenvectors of a given matrix. Diagonalization factors a square matrix into a matrix Q (with eigenvectors along the columns), a diagonal matrix \Lambda (with eigenvalues along the diagonal), and the inverse Q^{-1}, which for symmetric matrices equals Q^T:

A = Q \Lambda Q^T

But as we saw with the spectral theorem, eigendecomposition only works well against square, symmetric matrices. If a matrix isn’t symmetric, it is easy to run into complex eigenvalues. And if a matrix isn’t square, you are out of luck entirely!

[Figure: The spectral theorem]

Can we generalize eigendecomposition to apply to a wider family of matrices? Can we diagonalize matrices of any size, even if they don’t have “nice” properties?

Yes. Self-transposition is the key insight of our “eigendecomposition 2.0”. We define the self-transpositions of A as AA^{T} and A^{T}A.

Suppose A \in \mathbb{R}^{m \times n}. Then AA^T \in \mathbb{R}^{m \times m} and A^TA \in \mathbb{R}^{n \times n}. So these matrices are square. But they are also symmetric, since (AA^T)^T = (A^T)^T A^T = AA^T.

To illustrate, consider the following.

A = \begin{bmatrix}  4 & 4 \\ -3 & 3 \\ \end{bmatrix}

Since A is not symmetric, we have no guarantee that its eigenvalues are real. Indeed, its eigenvalues turn out to be complex:

\det(A - \lambda I) = \begin{bmatrix} 4 - \lambda & 4 \\ -3 & 3 - \lambda \\ \end{bmatrix} = 0

(12 - 7 \lambda + \lambda^2) + 12 = 0 \Rightarrow \lambda^2 -7 \lambda + 24 = 0

\lambda = \frac{7 \pm \sqrt{(-7)^2 - 4*1*24}}{2*1} = \frac{7}{2} \pm \frac{\sqrt{47}i}{2}

Eigendecomposition on A sucks. Are the self-transposed matrices any better?

A^TA = \begin{bmatrix} 4 & -3 \\ 4 & 3 \\ \end{bmatrix} \begin{bmatrix} 4 & 4 \\ -3 & 3 \\ \end{bmatrix} = \begin{bmatrix} 25 & 7 \\ 7 & 25 \\ \end{bmatrix}

AA^T = \begin{bmatrix} 4 & 4 \\ -3 & 3 \\ \end{bmatrix} \begin{bmatrix} 4 & -3 \\ 4 & 3 \\ \end{bmatrix} = \begin{bmatrix} 32 & 0 \\ 0 & 18 \\ \end{bmatrix}

These matrices are symmetric! Thus, they are better candidates for eigendecomposition.
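To see this numerically, here is a minimal sketch (assuming numpy is available; this check is not part of the original derivation) confirming that A itself has complex eigenvalues while its self-transpositions are symmetric with real eigenvalues:

```python
import numpy as np

A = np.array([[4.0, 4.0],
              [-3.0, 3.0]])

# Eigenvalues of A itself are complex (A is not symmetric).
print(np.linalg.eig(A)[0])          # ~ [3.5+3.43j, 3.5-3.43j]

# The self-transpositions are square and symmetric...
AtA, AAt = A.T @ A, A @ A.T
assert np.allclose(AtA, AtA.T) and np.allclose(AAt, AAt.T)

# ...so their eigenvalues are real (and non-negative): 18 and 32.
print(np.linalg.eigvalsh(AtA))      # ~ [18., 32.]
print(np.linalg.eigvalsh(AAt))      # ~ [18., 32.]
```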

Towards Singular Value Decomposition

Singular Value Decomposition (SVD) is based on the principle that all matrices are eigendecomposable after self-transposition. It is essentially a bug fix:

[Figure: Self-transposition]

An important way to picture SVD is with the idea of orthogonal bases. It is relatively easy to find any number of orthogonal bases for a given rowspace. Call the matrix of orthogonal vectors V.

We desire an orthogonal basis V of the row space such that AV is an orthogonal basis of the column space. Most orthogonal bases, once passed through A, lose their orthogonality! We want that particular basis V whose image remains orthogonal; call the resulting (normalized) column-space basis U.

We won’t require the vectors in AV to have the same length as those in V. Instead, we normalize V, so that its basis vectors are orthonormal (orthogonal, with unit length). The image Av_i of each basis vector then differs from the unit vector u_i by a scaling factor.

As we will soon see, these scaling factors are not eigenvalues. We therefore use sigmas rather than lambdas:

  • Scaling factors \sigma , analogous to eigenvalues \lambda.
  • Diagonal matrix \Sigma , analogous to diagonal matrix \Lambda

Our full picture, then, looks like this:

[Figure: Orthogonal basis transfer]

Let us now translate this image into matrix language.

A \begin{bmatrix} \vdots & \vdots & \vdots & \vdots \\ v_1 & v_2 & \dots & v_n \\ \vdots & \vdots & \vdots & \vdots \\ \end{bmatrix} = \begin{bmatrix} \vdots & \vdots & \vdots & \vdots \\ \sigma_1u_1 & \sigma_2u_2 & \dots & \sigma_nu_n \\ \vdots & \vdots & \vdots & \vdots \\ \end{bmatrix}

But we can easily factorize the right-hand side:

\begin{bmatrix} \vdots & \vdots & \vdots & \vdots \\ \sigma_1u_1 & \sigma_2u_2 & \dots & \sigma_nu_n \\ \vdots & \vdots & \vdots & \vdots \\ \end{bmatrix} = \begin{bmatrix} \vdots & \vdots & \vdots & \vdots \\ u_1 & u_2 & \dots & u_n \\ \vdots & \vdots & \vdots & \vdots \\ \end{bmatrix} * \begin{bmatrix} \sigma_1 & 0 & \dots & 0 \\ 0 & \sigma_2 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & \sigma_n \\ \end{bmatrix}

So we have that:

AV = U \Sigma

Since both V and U are orthogonal, inversion of either is equivalent to transposition:

A = U \Sigma V^{-1} = U \Sigma V^T

This strongly resembles our diagonalization equation A = Q \Lambda Q^T. SVD distinguishes itself by considering two orthogonal eigenmatrices U and V, not just one (Q).
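As a sanity check, numpy's np.linalg.svd returns exactly these three pieces. A hedged sketch (numpy hands back V^T directly, and the signs of singular vectors are only fixed up to consistent pairing):

```python
import numpy as np

A = np.array([[4.0, 4.0],
              [-3.0, 3.0]])

U, s, Vt = np.linalg.svd(A)   # s holds the singular values (the sigmas)
Sigma = np.diag(s)

# A V = U Sigma, and A = U Sigma V^T.
assert np.allclose(A @ Vt.T, U @ Sigma)
assert np.allclose(A, U @ Sigma @ Vt)

# Both U and V are orthogonal: their transposes are their inverses.
assert np.allclose(U.T @ U, np.eye(2))
assert np.allclose(Vt @ Vt.T, np.eye(2))

print(s)   # ~ [5.657, 4.243] = [sqrt(32), sqrt(18)]
```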

Recalling that A = U \Sigma V^T ,

A^TA = (V \Sigma^T U^T) (U \Sigma V^T)

But now the innermost term U^TU cancels, since U is orthogonal. And since \Sigma is a square diagonal matrix, \Sigma^T \Sigma is simply equal to \Sigma^{2}. So,

A^TA = V \Sigma^2 V^T

Since A^{T}A is a square, symmetric matrix, our diagonalization theorem applies!

A^TA = V \Sigma ^2 V^T = Q \Lambda Q^T

To find U, a similar trick works:

AA^T =  (U \Sigma V^T)(V \Sigma^T U^T) = U \Sigma^2 U^T = Q \Lambda Q^T

The relationships between SVD and eigendecomposition are as follows:

  • V is the eigenvectors of A^TA
  • U is the eigenvectors of AA^T
  • \Sigma is the square root of the eigenvalues matrix \Lambda

If any eigenvalue is negative, the corresponding sigma factor would be complex. But A^TA and AA^T are positive-semidefinite, which guarantees non-negative eigenvalues. This assures us that \Sigma contains only real values.
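These relationships are easy to verify numerically. Below is a minimal sketch with a random rectangular matrix (assuming numpy; np.linalg.eigvalsh applies because the self-transpositions are symmetric):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))          # any rectangular matrix works

U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Sigma^2 matches the eigenvalues of A^T A.
evals = np.linalg.eigvalsh(A.T @ A)       # ascending order
assert np.allclose(np.sort(s**2), evals)

# Those eigenvalues are non-negative (positive semi-definiteness),
# so the singular values sqrt(eigenvalue) are real.
assert np.all(evals >= -1e-12)

# Columns of V are eigenvectors of A^T A: (A^T A) v = sigma^2 v.
V = Vt.T
assert np.allclose(A.T @ A @ V, V @ np.diag(s**2))

# Columns of U are eigenvectors of A A^T: (A A^T) u = sigma^2 u.
assert np.allclose(A @ A.T @ U, U @ np.diag(s**2))
```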

In contrast to eigendecomposition, every matrix has an SVD. Geometrically, U and V^T act as rotational transformations, and \Sigma acts as a scaling transformation. In other words, every linear transformation comprises a rotation, then a scaling, then another rotation.

[Figure: Rotation, scaling, rotation]

A Worked Example

Let’s revisit A. Recall that:

A = \begin{bmatrix} 4 & 4 \\ -3 & 3 \\ \end{bmatrix}, A^TA = \begin{bmatrix} 25 & 7 \\ 7 & 25 \\ \end{bmatrix}, AA^T = \begin{bmatrix} 32 & 0 \\ 0 & 18 \\ \end{bmatrix}

Eigendecomposition against A is unpleasant because A is not symmetric. But A^{T}A is guaranteed to be positive semi-definite; that is, to have non-negative eigenvalues. Let’s see this in action.

det(A^TA - \lambda I) = \begin{bmatrix} 25 - \lambda & 7 \\ 7 & 25 - \lambda \\ \end{bmatrix} = 0

\lambda^2 - 50 \lambda + 576 = 0 \Rightarrow (\lambda - 32)(\lambda - 18) = 0

\lambda_1 = 32, \lambda_2 = 18

trace(A^TA) = 50 = \sum{\lambda_i}

det(A^TA) = 625-49 = 576 = 18 * 32 = \prod{\lambda_i}

These are positive, real eigenvalues. Perfect! Let’s now derive the corresponding (normalized) eigenvectors.

A^TA - 32I = \begin{bmatrix} -7 & 7 \\ 7 & -7 \\ \end{bmatrix}, A^TA - 18I = \begin{bmatrix} 7 & 7 \\ 7 & 7 \\ \end{bmatrix}

rref(A^TA - 32I) = \begin{bmatrix} 1 & -1 \\ 0 & 0 \\ \end{bmatrix}, rref(A^TA - 18I) = \begin{bmatrix} 1 & 1 \\ 0 & 0 \\ \end{bmatrix}

v_1 = \begin{bmatrix} \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} \\ \end{bmatrix}, v_2 = \begin{bmatrix} \frac{1}{\sqrt{2}} \\ \frac{-1}{\sqrt{2}} \\ \end{bmatrix}

SVD intends to decompose A into U \Sigma V^{T}. The above findings give us two of these ingredients.

V^{T} = \begin{bmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & \frac{-1}{\sqrt{2}} \\ \end{bmatrix}

\Sigma =\begin{bmatrix} \sqrt{32} & 0 \\ 0 & \sqrt{18} \\ \end{bmatrix}

What’s missing? U! The eigenvectors of AA^{T} supply the directions of U, but their signs are not free: each column must satisfy u_i = Av_i / \sigma_i so that the pairing with V stays consistent. Since AA^{T} is already a diagonal matrix, its eigenvectors are the coordinate axes, and applying u_i = Av_i / \sigma_i fixes the signs.

AA^T = \begin{bmatrix} 32 & 0 \\ 0 & 18 \\ \end{bmatrix}

u_1 = \frac{Av_1}{\sigma_1} = \begin{bmatrix} 1 \\ 0 \\ \end{bmatrix}, u_2 = \frac{Av_2}{\sigma_2} = \begin{bmatrix} 0 \\ -1 \\ \end{bmatrix}, U = \begin{bmatrix} 1 & 0 \\ 0 & -1 \\ \end{bmatrix}

We have arrived at our first Singular Value Decomposition.

A = U \Sigma V^T = \begin{bmatrix} 1 & 0 \\ 0 & -1 \\ \end{bmatrix} \begin{bmatrix} \sqrt{32} & 0 \\ 0 & \sqrt{18} \\ \end{bmatrix} \begin{bmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & - \frac{1}{\sqrt{2}} \\ \end{bmatrix}

Okay, so let’s check our work. 😛

A = \begin{bmatrix} 1 & 0 \\ 0 & -1 \\ \end{bmatrix} \left( \begin{bmatrix} \sqrt{32} & 0 \\ 0 & \sqrt{18} \\ \end{bmatrix} \begin{bmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & - \frac{1}{\sqrt{2}} \\ \end{bmatrix} \right) = \begin{bmatrix} 1 & 0 \\ 0 & -1 \\ \end{bmatrix} \begin{bmatrix} 4 & 4 \\ 3 & -3 \\ \end{bmatrix} = \begin{bmatrix} 4 & 4 \\ -3 & 3 \\ \end{bmatrix}

These matrices are a viable factorization: multiplication successfully recovers A.
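Here is a hedged numpy check of the hand computation above (numpy's own svd may flip the signs of paired singular vectors, so we compare reconstructions and singular values rather than the raw factors):

```python
import numpy as np

A = np.array([[4.0, 4.0],
              [-3.0, 3.0]])

# The hand-derived factors.
U = np.array([[1.0, 0.0],
              [0.0, -1.0]])
Sigma = np.diag([np.sqrt(32), np.sqrt(18)])
Vt = np.array([[1.0, 1.0],
               [1.0, -1.0]]) / np.sqrt(2)

assert np.allclose(U @ Sigma @ Vt, A)        # factorization recovers A

# numpy agrees on the singular values (vector signs may differ).
assert np.allclose(np.linalg.svd(A)[1], np.diag(Sigma))
```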

Takeaways

  • Eigendecomposition only works for a subclass of matrices; SVD decomposes all matrices.
  • SVD relies on self-transposition to convert any arbitrary matrix into one that works well against eigendecomposition (guarantees square m = n and symmetric A = A^{T}).
  • Another way to interpret SVD is by taking a special kind of orthogonal basis that, once passed through the linear transformation, preserves its orthogonality.
  • Every matrix A = U \Sigma V^{T}. That is, every linear transformation can be conceived as rotation + scaling + rotation.

Until next time.

Eigenvalues and Eigenvectors

Part Of: Algebra sequence
Followup To: An Introduction to Linear Algebra
Next Up: Singular Value Decomposition
Content Summary: 1300 words, 13 min read

Geometries of Eigenvectors

Matrices are functions that act on vectors, mapping input vectors to output vectors. Consider two examples:

  1. Reflection matrices, which reflect vectors across some basis.
  2. Rotation matrices, which rotate vectors clockwise by \theta degrees.

[Figure: Transformation geometries (reflection and rotation)]

The set of eigenvectors of a matrix A is a special set of input vectors for which the matrix behaves as a scaling transformation. In other words, we desire the set of vectors \vec{x} whose output vectors A\vec{x} differ from \vec{x} only by a scaling factor.

Eigenvectors have a straightforward geometric interpretation:

  1. Reflection eigenvectors are orthogonal or parallel to the reflecting surface. In the left image above, these are the top two pairs of vectors.
  2. Rotation eigenvectors do not exist (more formally, no real vector in \mathbb{R}^2 keeps its direction under a nonzero rotation).

Algebra of Eigenvectors

We can express our “parallel output” property as:

A\vec{x} = \lambda \vec{x}

Thus \vec{x} and A\vec{x} point in the same direction, but differ by scaling factor \lambda.

Scaling factor \lambda is the eigenvalue. There can be many \left( x, \lambda \right) pairs that satisfy the above equality.

For an \mathbb{R}^{n \times n} matrix, there are n eigenvalues. These eigenvalues can be difficult to find. However, two facts aid our search:

  • The sum of eigenvalues equals the trace (sum of values along the diagonal).
  • The product of eigenvalues equals the determinant.

To solve, subtract \lambda \vec{x} from both sides:

A\vec{x} = \lambda \vec{x}

(A - \lambda I)\vec{x} = 0

We would like to identify n independent eigenvectors. But if the new matrix (A - \lambda I) has only the trivial nullspace (just the zero vector), there are no eigenvectors for that \lambda. So we desire this new matrix to be singular.

How to accomplish this?  By finding eigenvalues that satisfy the characteristic equation \det(A - \lambda I) = 0. Matrices are singular iff their determinants equal zero.

Let’s work through an example! What is the eigendecomposition for matrix A:

A = \begin{bmatrix} 3 & 1 \\ 1 & 3 \\ \end{bmatrix}

We need to find eigenvalues that solve the characteristic equation.

\det(A - \lambda I) = \begin{vmatrix} 3-\lambda & 1 \\ 1 & 3-\lambda \\ \end{vmatrix} = 0

(3 - \lambda)^2 - 1^2 = \lambda^2 -6\lambda + 8 = (\lambda-2)(\lambda-4) = 0

\lambda_1 = 2, \lambda_2 = 4

Are these eigenvalues correct? Let’s check our work:

trace(A) = 6 = \sum{\lambda_i}

det(A) = 8 = \prod{\lambda_i}

How to find our eigenvectors? By solving the nullspace given each eigenvalue.

For \lambda_1=2 :

A - 2I = \begin{bmatrix} 1 & 1 \\ 1 & 1 \\ \end{bmatrix} \Rightarrow rref(A - 2I) = \begin{bmatrix} 1 & 1 \\ 0 & 0 \\ \end{bmatrix}

(\lambda_1, \vec{x}_1) = (2, \begin{bmatrix} 1 \\ -1 \\ \end{bmatrix})

For \lambda_2=4 :

A - 4I = \begin{bmatrix} -1 & 1 \\ 1 & -1 \\ \end{bmatrix} \Rightarrow rref(A - 4I) = \begin{bmatrix} 1 & -1 \\ 0 & 0 \\ \end{bmatrix}

(\lambda_2, \vec{x}_2) = (4, \begin{bmatrix} 1 \\ 1 \\ \end{bmatrix})
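A minimal numpy sketch verifying this worked example (note that numpy normalizes eigenvectors to unit length, so (1, -1) shows up as roughly (0.707, -0.707)):

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 3.0]])

evals, evecs = np.linalg.eig(A)
print(evals)                  # eigenvalues 2 and 4 (order may vary)
print(evecs)                  # columns are the (unit-length) eigenvectors

# Trace and determinant checks.
assert np.isclose(np.trace(A), evals.sum())        # 6 = 2 + 4
assert np.isclose(np.linalg.det(A), evals.prod())  # 8 = 2 * 4

# Each pair satisfies A x = lambda x.
for lam, x in zip(evals, evecs.T):
    assert np.allclose(A @ x, lam * x)
```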

Desirable Matrix Properties

The above example was fairly straightforward. But eigendecomposition can “go awry”, as we shall see. Consider a rotation matrix, which in two dimensions has the following form:

R = \begin{bmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \\ \end{bmatrix}

What are the eigenvalues for rotation \theta = 90^{\circ} ?

R = \begin{bmatrix} 0 & -1 \\ 1 & 0 \\ \end{bmatrix}

\det(R - \lambda I) = \begin{vmatrix} - \lambda & -1 \\ 1 & - \lambda \\ \end{vmatrix} = 0

(- \lambda)^2 + 1 = 0 \Rightarrow \lambda^2 = -1

\lambda_1 = i, \lambda_2 = -i

We can check our work:

trace(R) = 0 = \sum{\lambda_i}

det(R) = 1 = \prod{\lambda_i}

We saw earlier that rotation eigenvectors have no geometric interpretation in \mathbb{R}^2. Here, we have shown algebraically that the eigenvalues are complex.
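A quick numpy confirmation (a sketch, not part of the original argument; the eigenvalues come back as complex numbers):

```python
import numpy as np

R = np.array([[0.0, -1.0],
              [1.0, 0.0]])   # 90-degree rotation

evals = np.linalg.eig(R)[0]
print(evals)                                       # ~ [0.+1.j, 0.-1.j]
assert np.isclose(evals.sum(), np.trace(R))        # i + (-i) = 0
assert np.isclose(evals.prod(), np.linalg.det(R))  # i * (-i) = 1
```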

A = \left[ \begin{smallmatrix} 3 & 1 \\ 1 & 3 \\ \end{smallmatrix} \right] has real eigenvalues, but R = \left[ \begin{smallmatrix} 0 & -1 \\ 1 & 0 \\ \end{smallmatrix} \right] has less-desirable complex eigenvalues.

We can generalize the distinction between A and R as follows:

Spectral Theorem. Any matrix that is symmetric (A = A^T) is guaranteed to have real eigenvalues. Its n eigenvectors can be chosen to be orthogonal.

In other words, eigendecomposition works best against symmetric matrices.

[Figure: The spectral theorem]

Diagonalization

Let us place each eigenvector in the column of a matrix S. What happens when you multiply the original matrix A by this new matrix? Since S contains eigenvectors, multiplication by A reduces to multiplication by the associated eigenvalues:

AS = \begin{bmatrix} \vdots & \vdots & \vdots & \vdots \\ \lambda_1x_1 & \lambda_2x_2 & \dots & \lambda_nx_n \\ \vdots & \vdots & \vdots & \vdots \\ \end{bmatrix}

We see the product contains a mixture of eigenvalues and eigenvectors. We can separate these by “pulling out” the eigenvalues into a diagonal matrix. Call this matrix \Lambda (“capital lambda”).

AS = \begin{bmatrix} \vdots & \vdots & \vdots & \vdots \\ x_1 & x_2 & \dots & x_n \\ \vdots & \vdots & \vdots & \vdots \\ \end{bmatrix} * \begin{bmatrix} \lambda_1 & 0 & \dots & 0 \\ 0 & \lambda_2 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & \lambda_n \\ \end{bmatrix} = S \Lambda

Most matrices have the property that their eigenvectors are linearly independent. For such matrices, S is invertible. Given this fact, we can solve for A:

\Lambda = S^{-1} A S

A = S \Lambda S^{-1}

Matrices that can be factorized in this way are said to be diagonalizable. We can see that both elimination and eigendecomposition are performing the same type of work: factorizing matrices into their component parts.

If A is symmetric, then we know \Lambda is real, and its eigenvectors in S are orthogonal. Let us rename S to Q, to reflect this additional property. But orthogonal matrices have the property that transposition equals inversion: Q^T = Q^{-1}. Thus, if A is symmetric, we can simplify the diagonalization formula to:

A = Q \Lambda Q^{-1} = Q \Lambda Q^T
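A minimal numpy sketch of diagonalization, reusing the symmetric matrix from the worked example (np.linalg.eigh is used because it returns orthonormal eigenvectors for symmetric input):

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 3.0]])

evals, Q = np.linalg.eigh(A)   # symmetric input: real evals, orthogonal Q
Lam = np.diag(evals)

assert np.allclose(Q @ Lam @ Q.T, A)          # A = Q Lambda Q^T
assert np.allclose(Q.T, np.linalg.inv(Q))     # orthogonal: Q^T = Q^{-1}
```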

Asymptotic Interpretations

This diagonalization approach illustrates an important use case of eigenvectors: matrix powers. What happens when A is applied arbitrarily many times? What does the output look like in the limit?

We can use the diagonalization equation to represent A^k:

A^k = (Q \Lambda Q^{-1})^k = (Q \Lambda Q^{-1})(Q \Lambda Q^{-1})\dots(Q \Lambda Q^{-1})

We can simplify by canceling the inner terms Q^{-1}Q:

A^k = Q \Lambda^k Q^{-1}

This equation tells us that the eigenvectors are invariant to how many times A is applied. In contrast, the eigenvalue matrix \Lambda has important implications for ongoing processes (a numerical sketch follows the list below):

  • If each eigenvalue has magnitude less than one, the output will trend towards zero.
  • If each eigenvalue has magnitude greater than one, the output will trend to infinity.
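Here is the promised sketch: a hedged numpy check that A^k computed directly matches Q \Lambda^k Q^{-1}, and that eigenvalues larger than one in magnitude make repeated application blow up:

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 3.0]])
evals, Q = np.linalg.eigh(A)

k = 5
Ak_direct = np.linalg.matrix_power(A, k)
Ak_eigen = Q @ np.diag(evals**k) @ Q.T       # A^k = Q Lambda^k Q^{-1} = Q Lambda^k Q^T

assert np.allclose(Ak_direct, Ak_eigen)

# Largest eigenvalue is 4 (> 1), so repeated application explodes.
print(np.abs(np.linalg.matrix_power(A, 20)).max())   # roughly 4**20 / 2 ~ 5.5e11
```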

Fibonacci Eigenvalues

The powers interpretation of eigenvalues sheds light on the behavior of linear recurrences. This includes number sequences such as the Fibonacci numbers, where each term is the sum of the previous two.

Recall the Fibonacci numbers are 0,1,1,2,3,5,8,13,... What is F_{100} ?

Eigenvalues can answer this question. We must first express the Fibonacci generator as a linear equation:

F(k+2) = 1F(k+1) + 1F(k)

In order to translate this into a meaningful matrix, we must add a “redundant” equation

F(k+1) = 1F(k+1) + 0F(k)

With these equations, we can create a 2×2 Fibonacci matrix F.

F = \begin{bmatrix} 1 & 1 \\ 1 & 0 \\ \end{bmatrix}

Applying this matrix to the state vector u_k = (F(k+1), F(k)) advances the sequence one step at a time, so it generates the Fibonacci numbers.

u_1 = Fu_0 = \begin{bmatrix} 1 & 1 \\ 1 & 0 \\ \end{bmatrix} \begin{bmatrix} 1 \\ 0 \\ \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \\ \end{bmatrix}

u_4 = F^4u_0 = \begin{bmatrix} 1 & 1 \\ 1 & 0 \\ \end{bmatrix}^4 \begin{bmatrix} 1 \\ 0 \\ \end{bmatrix} = \begin{bmatrix} 5 \\ 3 \\ \end{bmatrix}

To discover the rate at which Fibonacci numbers grow, we decompose F into its eigenvalues:

\det(F - \lambda I) = \begin{vmatrix} 1 - \lambda & 1 \\ 1 & - \lambda \\ \end{vmatrix} = 0

 \lambda^2 - \lambda - 1 = 0

 \lambda_1 = \frac{1 + \sqrt{5}}{2}, \lambda_2 = \frac{1 - \sqrt{5}}{2}

 \lambda_1 = 1.61803, \lambda_2 = -0.61803

trace(F) = 1 = \sum{\lambda_i}

det(F) = -1 = \prod{\lambda_i}

We can go on to discover the eigenvectors x_1 and x_2. We can then express the initial state u_0 as a combination of eigenvectors:

u_0 = c_1 x_1 + c_2 x_2

u_k = F^k u_0 = c_1 \lambda_1^k x_1 + c_2 \lambda_2^k x_2

As k goes to infinity, the second term goes to zero (since |\lambda_2| < 1). Growth is therefore dominated by the larger eigenvalue, 1.61803, and the ratio of successive Fibonacci numbers converges to it.

Mathematicians in the audience will recognize this number as the golden ratio.

[Figure: Fibonacci and the golden ratio]

We have long known that the ratio of successive Fibonacci numbers converges to 1.61803. Eigenvalues provide a mechanism to derive this value analytically.
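As a closing sketch (assuming numpy; the fib helper below is purely illustrative), we can compute Fibonacci numbers through the eigendecomposition of F and watch the ratio of successive terms converge to the golden ratio:

```python
import numpy as np

F = np.array([[1.0, 1.0],
              [1.0, 0.0]])

evals, S = np.linalg.eig(F)
print(evals)                       # ~ [1.618..., -0.618...]

# F^k = S Lambda^k S^{-1}; apply to u0 = (F(1), F(0)) = (1, 0).
def fib(k):
    Fk = S @ np.diag(evals**k) @ np.linalg.inv(S)
    return (Fk @ np.array([1.0, 0.0]))[1]   # bottom entry of u_k is F(k)

print(round(fib(10)))              # 55
print(fib(100))                    # ~ 3.54e20 (floating-point approximation)

# Ratios of successive Fibonacci numbers converge to the golden ratio.
print(fib(30) / fib(29))           # ~ 1.61803
```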

Until next time.