Counting: The Fourfold Way

Part Of: Statistics, Algebra sequences
Content Summary: 1100 words, 11 min read

The Fundamental Principle of Counting

We often care to count the number of possible outcomes for multiple events.

Example 1. Consider purchasing a lunch with the following components:

  • Burger b \in \{ Chicken, Beef \}
  • Side s \in \{ Fries, Chips \}
  • Drink d \in \{ Fanta, Coke, Sprite \}

How many lunch outcomes are possible?

Three approaches to counting suggest themselves. We might make a list. But this process can be error prone. Other representations are more systematic: we can build a tree, or imagine a (hyper)-volume. Each strategy converges on the same answer: 12 possible lunches.

Permutation_ Trees of Events (1)

Can we generalize? Yes, with the help of the fundamental principle of counting (aka the rule of multiplication). For any event A with a possible outcomes, and another event B with b possible outcomes, the number of possible outcomes for the composite event (A, B) is a \times b.
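As a quick check of Example 1, the sketch below lists every lunch with Python's `itertools.product` and compares the count with the product of the option counts. The variable names are illustrative and do not come from the original example.

```python
from itertools import product

burgers = ["Chicken", "Beef"]
sides = ["Fries", "Chips"]
drinks = ["Fanta", "Coke", "Sprite"]

# Enumerate every (burger, side, drink) triple: the "make a list" approach.
lunches = list(product(burgers, sides, drinks))

# The rule of multiplication predicts the same count without listing anything.
predicted = len(burgers) * len(sides) * len(drinks)

print(len(lunches), predicted)  # 12 12
```

Listing and multiplying agree here, but only the multiplication scales comfortably once the menus grow.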

How does counting work for a repeating event? If an event with n possibilities occurs k times, there are n^k possible outcomes.

Example 2. How many numbers can be represented by a byte (8 bits)?

Each bit has two possible assignments: zero or one. For eight such “bit events”, we have 2^8 = 256 possible outcomes.
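The byte count can be confirmed by brute force. This is only a sketch: it treats each bit as one draw from the two possibilities 0 and 1, and the name `bit_patterns` is made up for illustration.

```python
from itertools import product

# Each of the 8 bits is an independent event with 2 possibilities.
bit_patterns = list(product([0, 1], repeat=8))

print(len(bit_patterns))  # 256
print(2 ** 8)             # 256
```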

Permutations

Example 3. A trifecta bet guesses which horse will place first, which second, and which third.  How many such bets are possible in a 9-horse race?

Each medal has nine possible assignments. For three such “medal events”, we have 9^3 = 729 possible outcomes.

This answer is completely wrong. To understand why, consider the lottery machine:

powerball

  • In Example 2, a single value (e.g., 0) can freely be assigned to multiple bits. Every time you draw a bit from the “possibility machine”, it is replaced when the next bit is drawn. Sampling with replacement means that each event is exactly the same.
  • In Example 3, a single horse (e.g., Secretariat) cannot be assigned multiple medals. Every time you draw a horse from the “possibility machine”, it cannot be drawn for subsequent events. Sampling without replacement means that each event has diminishing numbers of possibilities.

Definition 4. A permutation is a list of outcomes drawn without replacement.

For the trifecta bet, how many permutations exist? Well, 9 different horses can earn the gold. Given that one horse won the gold, 8 different horses can earn the silver. Then there are 7 different horses that can earn bronze. Thus, there are 9 \times 8 \times 7 = 504 possible trifecta bets.
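The gap between the two counts can be seen by enumerating both kinds of draw. In this sketch the horses are simply numbered 1 to 9, which is an assumption made for illustration.

```python
from itertools import permutations, product

horses = range(1, 10)  # nine horses, labeled 1..9 (labels are illustrative)

# Sampling with replacement: the same horse may fill several places.
with_replacement = list(product(horses, repeat=3))

# Sampling without replacement: each horse fills at most one place.
without_replacement = list(permutations(horses, 3))

print(len(with_replacement))     # 729
print(len(without_replacement))  # 504
```

The 225 missing entries are exactly the triples in which some horse appears more than once.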

Permutation_ 9 perm 3

Similar to how exponentiation is defined as repeated multiplication, a factorial is defined as slowly-decrementing multiplication.

9! = 9 \times 8 \times 7 \times 6 \times 5 \times 4 \times 3 \times 2 \times 1

But we only want 9 \times 8 \times 7. How can we get rid of the other terms? By division, of course!

9 \times 8 \times 7 = \dfrac{9 \times 8 \times 7 \times 6 \times 5 \times 4 \times 3 \times 2 \times 1}{6 \times 5 \times 4 \times 3 \times 2 \times 1} = \dfrac{9!}{6!}

Why did we use the number 6? Because if three of our nine horses place, six do not place. So a more general way to write this equation is:

\dfrac{9!}{(9-3)!}

More generally, if you have n items and want to find the number of ways k of them can be ordered:

Equation 5: Permutation. P(n, k) = \dfrac{n!}{(n-k)!}
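Equation 5 translates directly into code. The helper `P` below is a hypothetical name chosen to match the notation; it is checked against `math.perm`, which the standard library provides from Python 3.8 onward.

```python
from math import factorial, perm

def P(n: int, k: int) -> int:
    """Number of ordered selections of k items out of n, via n! / (n-k)!."""
    return factorial(n) // factorial(n - k)

print(P(9, 3))     # 504
print(P(4, 3))     # 24
print(perm(9, 3))  # 504
```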

Combinations

In contrast to permutations, for combinations, order doesn’t matter. A permutation is a list, a combination is a set.

A boxed trifecta bet requires correctly guessing which three horses will fill the top three places (order doesn’t matter). A trifecta bet selects a permutation; a boxed trifecta bet selects a combination.

Imagine only four horses in the race. That’s P(4,3) = \dfrac{4!}{(4-3)!} = 24 possible trifecta bets. But how many boxed trifecta bets are possible?

Combinations treat duplicates as a single entry. For example, abc and acb are equivalent for a boxed trifecta bet. We can identify four groups, with six equivalent permutations each:

Permutation_ 4 choose 3

In general, how many winner duplicates exist? How many ways can we shuffle k winners? Well, if you have k winners and are wondering how many permutations exist for that entire set… that’s P(k,k), which is simply k!

Equation 6: Combination. C(n, k) = \dfrac{P(n,k)}{P(k,k)} = \dfrac{P(n,k)}{k!} = \dfrac{n!}{(n-k)! k!}
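The division by k! can be watched in action on the four-horse race. The sketch below groups the 24 ordered bets by the set of horses they name; the labels a, b, c, d follow the text, while the variable names are assumptions.

```python
from itertools import combinations, permutations
from math import comb

horses = "abcd"  # four horses, labeled as in the text
k = 3

# Group the ordered trifecta bets by the set of horses they name.
groups = {}
for bet in permutations(horses, k):
    groups.setdefault(frozenset(bet), []).append(bet)

boxed_bets = list(combinations(horses, k))

print(len(groups))                        # 4
print({len(g) for g in groups.values()})  # {6}
print(len(boxed_bets), comb(4, 3))        # 4 4
```

Every group has the same size, 3! = 6, which is why a single division turns the permutation count into the combination count.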

For an example of combinations used to solve a real problem, I recommend this post.

The Fourth Way: Stars and Bars

Example 7. You have k=3 identical cookies to give to n=4 kids a, b, c, d. How many possible ways are there to do so?

In the case of medals and horses, we claimed four solutions: \{ a, b, c\}, \{ a, b, d\}, \{ a, c, d\}, and \{ b, c, d\}. But there is an important difference: horse-less medals are impossible, but cookie-less children are not! So we need to account for situations like \{ a, a, a\}, with one child getting all of the cookies.

We can use the traditional bins-as-containers metaphor to visualize outcomes (top row). Or we can instead visualize bin boundaries (bottom row). This visualization strategy is called stars and bars.

Combinatorics_ Stars and Bars (2)

How many kid-cookie outcomes are possible? The answer becomes apparent only if we use stars and bars  (bottom row). Every possible shuffling of the stars in those squares produces a valid event. That is, \binom{6}{3}.

How many outcomes are possible in general? There are k stars (cookies). Since bars represent bin boundaries between the n kids, there are n-1 bars. That gives k+(n-1) positions, of which we choose k to hold stars. Thus:

Equation 8: Multi-Combination. C(n, k) = \binom{n+(k-1)}{k}
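The cookie count can be checked by enumeration. This sketch assumes the cookies are interchangeable and represents each outcome as the unordered list of kids who receive a cookie; `itertools.combinations_with_replacement` generates exactly those multisets.

```python
from itertools import combinations_with_replacement
from math import comb

kids = "abcd"  # n = 4 kids
cookies = 3    # k = 3 interchangeable cookies

# Each outcome names the kid receiving each cookie, ignoring order,
# e.g. ('a', 'a', 'a') means kid a gets all three cookies.
outcomes = list(combinations_with_replacement(kids, cookies))

n, k = len(kids), cookies
print(len(outcomes))       # 20
print(comb(n + k - 1, k))  # 20
```

Both numbers equal \binom{6}{3}, matching the stars-and-bars count.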

The Fourfold Way

Every example we have seen differentiates possibilities and outcomes. We will use the metaphor of balls for outcomes (something concrete) and bins for possibilities (something to “clothe” outcomes).

Combinatorics_ Possibilities vs Outcomes

Equation 9. An event is a function that maps outcomes to possibilities:

Event : Outcomes \rightarrow Possibilities

Combinatorics_ Events as Functions (2)

This function can be compactly represented as bbd.

Functions require every element of the domain to map to the codomain. Event functions require no unrealized outcomes. That is: every outcome manifests a possibility. Every ball is given a bin.

We saw previously that combinations and permutations don’t allow events like aab and ccc. A single horse cannot win multiple medals. Multiply-realized possibilities are not allowed.

Recall the definitions of injective, surjective and bijective functions. This requirement is the injective property. Sampling without replacement is the same thing as injectivity.

Tuple, permutation, combination, and multi-combination. This is the fourfold way.  The Way can be made more general by counting situations where the possibilities are unlabeled, and the event function meets the surjection property. But for details on the more complete twelvefold way, I recommend this post.

Combinatorics_ Fourfold Way

Towards a Rosetta Stone

Now, consider all possible functions for 4 bins and 3 balls:

Combinatorics_ Rosetta Stone (4, 3)

What do the equations above have to do with this shape? Well, each way of counting corresponds with a different subset of this broader shape:

Combinatorics_ Shape of The Way (1)
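One way to make this correspondence concrete is to enumerate all functions from 3 balls to 4 bins and then filter or deduplicate them. This is a sketch under assumptions: bins are labeled a to d, an event is written as a string-like tuple such as ('b', 'b', 'd'), and unordered cases are represented by their sorted form.

```python
from itertools import product

bins = "abcd"  # possibilities (n = 4)
balls = 3      # outcomes (k = 3)

# Tuples: ordered, repeats allowed (every function from balls to bins).
tuples = list(product(bins, repeat=balls))

# Permutations: ordered, no bin used twice (the injective functions).
perms = [t for t in tuples if len(set(t)) == balls]

# Multi-combinations: unordered, repeats allowed (sorted representatives).
multicombs = {tuple(sorted(t)) for t in tuples}

# Combinations: unordered, no bin used twice.
combs = {tuple(sorted(t)) for t in perms}

print(len(tuples), len(perms), len(combs), len(multicombs))  # 64 24 4 20
```

The four totals match 4^3, P(4,3), C(4,3) and \binom{6}{3} respectively.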

I leave it to the interested reader to ponder how such a jagged shape can be represented by these four relatively clean formulae.

Until next time.

Related Resources

https://www.ece.utah.edu/eceCTools/Probability/Combinatorics/ProbCombEx15.pdf

https://wizardofodds.com/games/poker/

http://www.math.hawaii.edu/~ramsey/Probability/PokerHands.html

Jesus, disciple of John

Why do the Gospels care about John?

In the 20s CE, at least two prophets were active in the Israelite highlands: John the Baptist and Jesus of Nazareth. Both were killed on political grounds. Jesus left behind disciples that remained loyal to him in some sense. So did John. In fact, these two religious groups interacted (vied for influence?) after the deaths of their leaders.

Ultimately, John’s religious group died out; Jesus’ following did not. With the exception of Josephus and a few other secular sources, the Christian gospels are our best source of information about the religious climate of this time period.

These Christian gospels spend an astonishing amount of time describing John: both his independent ministry, and his relationship with Jesus. John’s message that a powerful Son of Man will judge the world is interpreted by Christians as referring to Jesus.

Why should the gospels lavish John with such attention and theological import? Two hypotheses suggest themselves:

  1. The early Christians shared a broader Jewish respect for John’s ministry, and that reverence led to the attention & theological significance.
  2. The early Christians crafted the gospels partially in effort to convert John’s disciples.

As we shall see, neither of these hypotheses is adequate. Instead, we shall see evidence suggesting that Jesus began his ministry as a disciple of John.

On Jesus’ Baptism

The gospels record that John baptized Jesus. This event is prima facie embarrassing for two reasons:

  1. Implications of imperfection. John’s baptism was clearly and consistently described as “for the forgiveness of sins”.
  2. Implications of subordination.  This is the reason Matthew has the Baptist say “I need to be baptized by you, yet you come to me?”

Mark and Matthew combat these implications by describing a theophany where God calls Jesus his Son. In contrast, Luke makes the Baptist a relative of Jesus, and has John imprisoned before Jesus’ baptism. We are never explicitly told who baptizes Jesus. And in the fourth gospel, John is not “the Baptist” at all: the title is never used of him. He even denies that he is Elijah, even though in Matthew, Jesus flatly affirms that he is.

This incredible diversity of interpretations is due to a simple fact. At the beginning of Jesus’ ministry stands an independent Baptist, a Jewish prophet who won great popularity and reverence before and apart from Jesus, who also won the reverence and submission of Jesus to his baptism of repentance for the forgiveness of sins, and who left behind a religious group that continued to exist apart from Christianity.

The Baptist constituted a stone of stumbling right at the beginning of the story of Jesus, a stone too well known to be ignored or denied, a stone that each evangelist had to come to terms with as best he could. The embarrassment of the evangelists is illustrated by the diverse, not to say contradictory ways in which they try to bend the independent Baptist to a dependent position within the story of Jesus.

A Common Vision

The gospels record that Jesus was baptized by this prophet. But why would he go? Since nobody compelled him, he must have gone to John because he agreed with John’s message.

There were lots of other groups vying for Jewish attention. Jesus did not join the Pharisees, who emphasized scrupulous observance of the Torah. He did not align himself with the Sadducees, who focused on the worship of God through the Temple cult. Nor did he associate with the Essenes, who formed monastic communities to maintain their own ritual purity. Nor did he subscribe to the teaching of the “fourth philosophy”, which advocated a violent rejection of Roman domination.

No, Jesus associated with an ascetic prophet who proclaimed an imminent end of history. As we will see later, this fact will shed light on the ministry of the historical Jesus.

A Common Practice

Was Jesus’ baptism a singular event? Did he spend much time with John? Was he admitted into John’s inner circle?

Jesus’ first disciples were John’s disciples. If some disciples of the Baptist came to transfer their allegiance to him while they were still in the company of the Baptist, that suggests that Jesus had stayed in the Baptist’s orbit long enough for some of the latter’s disciples to come to know him and be impressed by him.

The fourth gospel admits that Jesus’ ministry included baptism. True, not ten sentences later that claim is baldly contradicted. However, several pieces of evidence suggest the contradiction is the (rather clumsy) work of a Johannine redactor.

That Jesus practiced baptism is further reinforced by Mark 11:27-30: “The chief priests asked Jesus, ‘Who gave You this authority to do these things?’ Jesus replied, ‘One question, then I will tell you. Was John’s baptism from heaven or from men?’”

The Sadducees were keen to admit John’s religious authority, and deny Jesus’. So why would Jesus invoke John’s baptism? A likely explanation is that it was an area of ministry overlap: the Sadducees couldn’t well admit John’s baptism was divine, yet criticize Jesus’ ministry which included that very baptism.

Jesus as Disciple

A picture is slowly emerging. Jesus began his public life as one of John’s disciples. This is the best explanation for his a) being baptized by John, b) taking John’s disciples, c) practicing John’s baptism. He slowly differentiated himself with the following teachings:

  • Non-asceticism. John was renowned for his minimal lifestyle. Jesus was no stranger to parties, so to speak.
  • Miraculous works. John’s ministry did not feature miracles. Jesus’ did, and he used this to illustrate his end-times message.

Yet despite these divergences, Jesus and John operated largely complementary ministries. Consider Matthew 11:16-19:

To what should I compare this generation? It’s like children who call out to each other: “We played the flute for you, but you didn’t dance; we sang a lament, but you didn’t mourn!”

For John did not come eating or drinking, and they say, ‘He has a demon!’ Jesus came eating and drinking, and they say, ‘Look, a glutton and a drunkard, a friend of tax collectors and sinners!’

Yet wisdom is vindicated by her children.

This passage is remarkable because it places John and Jesus’ ministry side by side. Absent are theological claims of Jesus’ superiority.  To be sure, John’s asceticism and Jesus’ non-asceticism are contrasted. Yet John (lamenter) and Jesus (flute player) are both children of wisdom.

Jesus after John

What was the relationship like between John and Jesus? Did they always function collaboratively, or competitively?

The details of this relationship are largely lost to history. Some evidence of tension can be inferred in how frequently Jesus was asked to clarify his relationship to John.

One of our most compelling clues, however, lies in the moving plea from Jesus to his former rabbi:

When John heard in prison what the Messiah was doing, he sent a message by his disciples and asked Him, “Are You the One who is to come, or should we expect someone else?” Jesus replied to them, “Go and report to John what you hear and see: the blind see, the lame walk, those with skin diseases are healed, the deaf hear, the dead are raised, and the poor are told the good news. And if anyone is not offended because of Me, he is blessed.”

Absent are the polemics so typical of Jesus’ sayings.  This beatitude has an audience of one. It is a delicate appeal to his former rabbi: “please do not be offended because of [my origin]”. And yet here, tellingly, the conversation stops. We are not told John’s reply. The relationship is left ambiguous, as John heads for his execution by Herod Antipas.

After the execution of the Baptist, Jesus’ ministry developed by itself. And yet, as we will see, Jesus never fully emerges from the shadow of John. Their common ministry and message pervade the remaining years of Jesus’ ministry.

Polytheistic Roots of Israelite Religion

Part Of: Demystifying Religion sequence
Followup To: Yahweh and the Levites
Content Summary: 2000 words, 10min read.

Introduction

Is the Hebrew Bible monotheistic?  

We might be tempted to say yes after reading Isaiah 44:6 “I am the first and I am the last; besides me there is no God”.

But the situation is more complicated. The Hebrew Bible is also replete with polytheism. A few examples:

  • “Do you not possess that which Chemosh, your god, has given you? So shall we possess what Yahweh has given us.” Judges 11:24
  • “Who is like Yahweh among the gods?” Exodus 15:11
  • “The people of Judah have as many gods as they have towns.” Jeremiah 11:13

We also see middle ground staked out between these two positions. For example, the original audience of the book of Deuteronomy is often exhorted not to follow after other gods, without it ever being asserted that these gods did not exist or were not real. This is known as monolatrism (“single worship”).

Which belief came first?

Last time, we showed how Yahweh was originally a god of metallurgy in northwest Saudi Arabia. Today, we will work with the framework that Yahweh was introduced to Israel in a five-stage process:

  1. Traditional Polytheism. The earliest Israelites worshipped creator god El, his wife Asherah, and his sons e.g., Baal.
  2. Incorporation. Yahweh was incorporated as a 2nd tier god in El’s pantheon.
  3. Elevation. Yahweh and El are identified as the same deity.
  4. Monolatrism. A new Yahweh-only movement emerges, and the gods of the second tier are denied.
  5. Monotheism. Gods of other nations are denied, Yahweh’s power is deemed universal in scope.

Why did Yahweh worship progress along this trajectory? As we shall explore next time, as with the theocracies of surrounding nations, changes in the religious landscape have strong, robust correlates in the sociopolitical life.

Today I’d like to focus on a different, simpler topic. We shall turn to archaeology and cultural anthropology to explore expressions of polytheism within the Hebrew Bible. Many of my readers already know that the text acknowledges (polemicizes against) polytheistic practices. Less well-known are examples of celebration (bald assertions of polytheistic beliefs) and assimilation (Yahweh “adopts” the roles and characteristics of rival deities). 

Monotheism_ Five Stages (2)

Let’s review the deities in El’s pantheon, and their appearance in the Hebrew Bible.

A Disclaimer

For many modern readers, polytheism is a term loaded with negative connotation. Partisans use it as a weapon. Attackers point to continuities between Israelite religion & polytheism, and defenders point to instances where Israelite rhetoric polemicizes against polytheism. But all ideological innovations have both features.

More to the point, those who spend time interacting with polytheism understand how earnestly it grapples with the same aspects of the human condition as other strands of religious expression. Polytheism must be encountered on its own terms. To weaponize is to misunderstand.

The important thing to bear in mind in the following is that underneath the images and icons of religious expression lies a particular group of people, responding to social and political pressures in thoroughly understandable ways. My experience has been that the more time you spend in someone else’s culture, the easier it becomes to empathize with their plight.

El

Israelite Polytheism_ El

At some point in Israelite history, El was identified with Yahweh as the same god.

This equation is expressed clearly in Exodus 6:2-3: “And God said to Moses, ‘I am Yahweh. I appeared to the patriarchs as El, but by my name Yahweh I did not make myself known to them.’” Other Biblical material also asserts this equation. Joshua 22:22 states “the god of gods is Yahweh”. Judges 9:46 refers to “El of the covenant”.

The Yahweh-alone movement vigorously condemns prominent Canaanite gods… except El. There are zero condemnations of El in the Hebrew Bible. This makes sense if Yahweh was ultimately identified with this Canaanite creator-god. What’s more, archaeological evidence suggests that the Yahweh religious centers in Shiloh and Bethel were originally places of El worship.

El and Yahweh are attributed the same characteristics. El is depicted as a wise old man with a beard, e.g., “You are great, O El, and your hoary beard instructs you”. Yahweh is described in the same terms (Daniel 7:9, Job 36:26, Habakkuk 3:6). Like “Kind El, the Compassionate”, Yahweh is a “merciful and gracious god”. The description of Yahweh’s dwelling place as a tent (Psalms 15:1, 27:6, 91:10) recalls the tent of El in the Canaanite narrative of Elkunirsa. Finally, both Yahweh and El are said to dwell amidst cosmic waters (Isaiah 33:20-22, Ezekiel 47:1-12, Zechariah 14:8).

Just as Zeus had a council, or assembly, of other gods, so too does Yahweh. The Hebrew Bible is overflowing with references to Yahweh’s (El’s) assembly. See for example Psalm 89:6-8, Zechariah 14:5, 1 Kings 22:19, Isaiah 6:1-8, and Jeremiah 23:18,22.

Baal

Israelite Polytheism_ Baal

Worship of Baal can be dated back to the foundation of Israelite societies. This can be seen in onomastics, the study of proper names. Names in the Ancient Near East tend to have a theophoric component: usually a suffix that honors a deity. Yahwistic names include Josiah, Jehu (note the “J” sound); Baal-oriented names include e.g., “Zerubbabel”. In addition to hundreds of icons devoted to Baal worship, we also see Ba’al theophoric names as common in the Levant in this time period.

Yahwistic prophets of this period reserve the most vitriol for Baal worship. Why? Because the Omride dynasty (including King Ahab & Jezebel) erected a temple to Ba’al. While the cult of Yahweh continued in the northern kingdom, Baal was perhaps elevated as the patron god of the northern monarchy, thus creating some sort of theopolitical unity between the kingdom of the north and the city of Tyre.

Indeed, there is some evidence that the cults of Baal and Yahweh were conflated in the north. Hosea 2:16-24 suggests that some northern Israelites did not distinguish between Yahweh and Baal. The religious sanctuaries in the Israelite cities of Dan and Bethel centered around golden calves; this iconography strongly parallels that of Baal. Finally, the redundancy in 1 Kings 16:32 was almost certainly a scribe glossing over the original text, “altar for Baal in temple of Yahweh”.

To induce the Israelites to stop worshipping Baal, the imagery of Baal was adopted by the Yahweh cult. The Baal Cycle, ancient mythology on the scale of the Epic of Gilgamesh, has four literary themes for the storm god. Here are those themes, along with the Biblical text which mirrors them.

  1. The march of the divine warrior (Psalm 104:3 “He makes the clouds his chariot, and travels along on the wings of the wind”)
  2. The convulsions of nature as the divine warrior manifests his power (Judges 5:5, Hab 3:10)
  3. The return of the divine warrior to his holy mountain to assume divine kingship (Isaiah 31:4)
  4. The utterance of the divine warrior’s voice from his palace provides rains that fertilize the earth (Jeremiah 10:13)

Yahweh is also depicted as defeating Baal’s classic enemies:

  • Baal/Yahweh defeats a seven headed dragon, Leviathan, and River (CAT 5.1, Psalm 74:13-15).
  • Baal/Yahweh defeats Sea (KTU 1.14, Psalm 89:10).
  • Baal/Yahweh defeats Death/Mot (KTU 1.4 VIII-1.6, Isaiah 25:8).

Asherah

Israelite Polytheism_ Asherah

El’s wife was named Asherah. When Yahweh was identified with El, did he also inherit his wife? In the blessings of Joseph, Genesis 49:25 contains language specific to the Asherah cult: “blessings from Breast-and-Womb”. The Bible further admits that the Israelites frequently worshipped a “Queen of Heaven” (Jeremiah 7:18, 44:17-25). Indeed, 2 Kings 21:7 tells us that worship of Asherah happened within the Temple itself. Finally, archaeology has uncovered several icons with the inscription “Yahweh and his Asherah”. This evidence cumulatively suggests that, in early forms of Israelite religion, Yahweh was believed to have a wife.

Israelite polytheism_ Yahweh and his Asherah

The push towards monolatrism led to the eviction of the Asherah cult, whose memory may be preserved in Zechariah 5:5-11. But this eviction created a deficit of femininity in Israelite religious expression. To compensate, the Biblical writers began attributing feminine attributes to Yahweh (Isaiah 49:15, 46:3, 44:2,24, 42:14). Asherah-like characteristics also appear in the goddess of Wisdom in Proverbs 8.

Astral-ification

There is extensive evidence for worship of an astral deity (sun god) in Jerusalem.  And Jerusalem is presumably the site where Yahweh was identified with El. Since the Ugaritic texts hint that El’s family was astral in character, it is not unthinkable that Yahweh was viewed similarly.

  • Proper names. A certain number of proper names are constructed from the root ‘-w-r (“shine, gleam, light”). These include Uriyyah (“Yhwh is my light”) the name of one of David’s generals, Neriyahu “Yhwh is my lamp”, Yizrayah “Yhwh gleams”, minister of Hezekiah, and dozens more.
  • Archaeology. Many pieces of material evidence exist, including many seals found in Jerusalem bearing the image of the sun, or of the sun god in the form of a winged scarab.
  • Biblical affirmations. Job 38:6-7 may attest to Israelite recognition of astral deities: “Who sets its cornerstone when the morning stars sang together, and all the divine beings shouted for joy?” Similarly, Judges 5:20 features conflict in the astral plane: “the stars fought in the heavens”.
  • Biblical acknowledgements. Ezekiel 8:16 has Israelites worshipping sun gods. So do 2 Kings 23:5,10-11 and Zephaniah 1:4-5.
  • Biblical Incorporation. The story of Sodom and Gomorrah reflects astral themes, where the divine punishment is meted out at the moment when the sun rises. It is even possible that the two messengers and the deity in the story represent the sun god and his two acolytes. Psalm 19:4-6 and Psalm 84:11 also show Yahweh taking on astral qualities.

Other Deities

The Ugaritic texts mention hundreds of Canaanite gods. The Bible only criticizes two of them: Ba’al and Asherah. What gives?

The Biblical authors conflate Asherah and Astarte, and conflate multiple male gods as “the Baals”.  Despite this, there is only evidence of ~10 gods worshipped in early Israel. This is also true amongst Israel’s neighbors. It appears that the religious landscape of Iron Age Canaan was simply less diverse than Bronze Age Ugarit.

Do we see evidence for these gods in the Bible, despite their not being named in that text?

Anat. Known for her savagery, Anat worship involves a celebration of gore. “Knee-deep she gleans in warrior blood, neck-deep in the gore of soldiers, until she [Anat] is sated with fighting.”  While no evidence of Anat-worship exists in ancient Israel, these divine themes have strong parallels in the Biblical text. The Bible describes heaps of corpses, drinking blood, devouring flesh, and swords dripping with viscera.

Astarte. In the Bible, the Name of Yahweh is described in personal terms. The divine name acts as a warrior (Isaiah 30:27) and possesses martial qualities such as radiance and strength (Psalm 29:1-2). The warrior goddess Astarte bears the title “name of Baal”. This designation of Astarte and her martial character and special relationship to the god Baal approximate the martial character of the name, and its special relationship to Yahweh as warrior god. Further evidence for this hypothesis has been adduced from the Elephantine papyri.

Similar lines of argument can be made for entities like Light and Truth of Psalm 43:3.

Angels. The lowest tier of the Israelite pantheon also went through alterations. As the Ugaritic texts show, the lowest tier involved a number of deities who served in menial capacities. A common task for such gods was to act as messenger, the literal meaning of the English word “angel”. Certainly angels are not regarded in later traditions as gods. But they were in early traditions.

Takeaways

This post provides evidence for a simple point. Polytheistic expression (not just condemnation!) occurs in the Hebrew Bible.

These expressions are best explained by the Yahweh cult shifting away from its traditional pagan roots, and towards a monolatrist (worship one god) and later monotheist (acknowledge one god) understanding.

As we will see next time, the reasons why Yahweh worship proceeded in this interesting (but not original) trajectory are fairly easy to understand.

Yahweh and the Levites

Part Of: Demystifying Religion sequence
Related To: Who Wrote The Bible?
Content Summary: 4000 words, 20min read.

Exodus 3:14 has God saying to Moses, “I Am that I Am.” And he said, “You must say this to the Israelites, ‘I Am has sent me to you.’” The Hebrew initials for “I am that I am” are YHWH (pronounced “Yahweh”). This tetragrammaton is the name of the god of Judaism.

But where, and by whom, was Yahweh first worshipped?

Today, we shall see that Yahweh was originally a god of metallurgy in northwest Saudi Arabia. The Levites brought worship of him to Israel via a “mini-Exodus”.

A Disclaimer

The historicity of the exodus is a fairly partisan topic. Many uninformed people like to give their opinions, and many opinions are uninformed. 

None of my material comes from Christian or atheistic apologetic websites. I made a point to only draw material from academic sources. Specifically, I draw from the following books (and journal articles and lecture videos, not pictured):

Yahweh_ Books

People familiar with this field will note that my sources do not see eye to eye. For example, Friedman and Romer leverage conservative and liberal approaches, respectively. Yet despite the range of expression, my sources converge on complementary solutions to the origin of Yahweh. My task today is to weld their insights together into a coherent whole.

Researching this post has felt a little like digging into a mystery novel. I hope reading it provides you with a similar experience.

Stage 1: El’s Pantheon in Israel

1.1) Certain aspects of Israelite prehistory as given by the Bible are non-historical.

First, a mass exodus of two million people (six hundred thousand fighting-age men) is vanishingly unlikely. If it had actually happened, we would expect:

  1. physical debris from the pilgrimage, at any of the thirty locations they are said to have stopped.
  2. archaeological evidence of a dramatic demographic shift in the highlands of Israel.
  3. inclusion in the (otherwise quite voluminous) records of the Egyptian border guards
  4. Egyptian texts discussing the new political situation (since the Egyptians had control over, and military outposts throughout Canaan)

And how much evidence do we have in each of these four dimensions? Literally zero evidence, in all of them. Recall that absence of evidence can (and in this case does) mean evidence of absence. The very first piece of evidence confirming the Biblical text is from 1000 BCE, where the Tel Dan stele affirms the existence of the “house of David”.

Second, the conquest narrative is non-historical. Most cities listed as razed in the Joshua narrative show evidence of uninterrupted prosperity in the archaeological record. And the three (out of thirty-one!) cities that do show interruption have not been localized to Israelite violence.

Third, until 700 BCE Judah is a much smaller political force than it makes itself out to be. One demonstration of the small scale of this society is the request, in one of the Amarna letters sent by the king of Jerusalem to the pharaoh, that he supply fifty men “to protect the land.” Another letter asks the pharaoh for one hundred soldiers to guard Megiddo from an attack by his aggressive neighbor, the king of Shechem (Finkelstein, p. 78). These letters date to the 14th century BCE. But the population in the intervening time period does not change much. Until 700 BCE, Judah’s population totaled no more than twenty settlements with a population of roughly 30,000. Only after the fall of Israel did Judah experience a population boom and full statehood.

1.2) The Israelite people were indigenous Canaanites.

So where did the Israelite people come from? The Israelite people were originally Canaanite pastoralists who, in 1300 BCE, changed their economic strategy in response to worsening conditions. We have a wealth of evidence supporting this positive hypothesis, including:

  • Ecological: we now know that the Late Bronze Age collapse (a dark age from 1200 – 900 BCE) was caused primarily by climate change-driven famine. The pastoralist strategy can only be successful if neighboring agriculturalists have surplus wheat available to trade. When that surplus dried up, former pastoralists were forced to grow their own wheat, and adopt a hybrid lifestyle.
  • Linguistic: the Hebrew and Canaanite languages are increasingly indistinguishable the further back you go in the Iron Age.
  • Material culture: Israelites and Canaanites shared the same building plans, pottery designs, village layouts, cooking habits …
  • Historic repetition: Canaanite pastoralists had twice before settled the highlands, but the previous two attempts had eventually failed.

We can also see when these highland settlements began to slowly differentiate themselves from their “parent” lowland cities. First, the highland settlements did not consume pork (pigs were available for food in all regions of Canaan). Second, the highland peoples seem to have identified themselves by the name “Israelite”, the earliest mention of which is in the Merneptah stele (1204 BCE).

Since Israelites were indigenous Canaanites, we know they shared the same culture. But did they start out worshipping the same gods?

1.3) The Israelites and the Canaanites shared the same religion: the pantheon of El.

In Egyptian mythology, the most powerful god was Ra. In Babylon, it was Marduk. In Greece, it was Chronus.

Monotheism_ Greek Pantheon

In Canaan, the chief god was El. El’s wife was Asherah, and his children include Ba’al and Anat. The Canaanite pantheon is well-understood from the discovery of the Ugaritic texts.

In most English translations of the Hebrew Bible, you will see frequent use of the words “God” and “Lord”. The Hebrew terms for these phrases are more literally translated “El” and “Yahweh”. They are used so interchangeably in the Hebrew Bible that you would think them synonyms.

  • Names. The very name “Israel” means “house of El”. In contrast, later Israelite names have “Yahweh”-based suffixes e.g., Jehu. Further, most Israelite cities were named after the gods in El’s assembly.  The god Anat was honored in the city of Anathoth, the place of origin of the prophet Jeremiah. The god Dagan in Beth-Dagan. The god El in Beth-El. The god Shamash in Beth-Shamash. The god Shalimu in Jerusalem.
  • Ritual systems. The priestly system laid out in Leviticus is very nearly copy-and-pasted from the Ugaritic sacrificial system.
  • Legal codes. The Covenant, Holiness, and Deuteronomic law codes share strong parallels with surrounding Canaanite legal systems.
  • Iconography. A seal found in Jerusalem in a tomb of the seventh century shows a solar god flanked by two minor gods: “Righteousness” and “Justice”

There are also expressions of polytheism throughout the Hebrew Bible. For example,

  • “Do you not possess that which Chemosh, your god, has given you? So shall we possess what Yahweh has given us.” Judges 11:24
  • “Who is like Yahweh among the gods?” Exodus 15:11
  • “The people of Judah have as many gods as they have towns.” Jeremiah 11:13

In part two of this series, we will see hundreds more data points establishing Israel’s traditional religion as polytheism.

Stage 2: Yahwism in Edom

2.1) The original Yahweh cult was a Shasu religion located in southern Edom (northwest Saudi Arabia). (video)

Recognized for their goatees and hair held back in a hairband, the Shasu nomads were well-known to the Egyptian authorities. They conducted copper mining in the wilderness, and also were quite successful camel breeders. The Bible uses the terms Edom, Teman, and Midianite interchangeably. Egyptian descriptions of the Shasu geographically overlap the Biblical land of the Midianites.

Okay. So how do we know that the Yahweh cult originated with the Shasu people?

  • Four of the oldest texts in the Bible tell us so. See Deut 33:2, Judges 5:4-5, Habakkuk 3:3 and Isaiah 63:1.
  • Special treatment of Edom. The Bible repeatedly condemns the gods of the Ammonites, the Moabites, and the Sidonians, but never the god of Edom. Deut 23:7 calls Edomites the “brothers” of the Israelites. Edom’s patriarch Esau is said to be the brother of Israel’s patriarch Jacob.
    • The Bible makes a point of not mentioning Qos, the national god of Edom. We have evidence that Qos was a rather late theological development in Edom. Given this evidence, it is plausible to assume that Yahweh was worshipped in Edom and Qos stepped in only when Yahweh became the national god of Israel/Judah.
  • Archaeology.  Two Egyptian inscriptions, one dated to the period of Amenhotep III (14th century BCE), the other to the age of Ramesses II (13th century BCE), refer to “Yahweh in the land of the Shasu”. We also have one 9th century BCE text at Kuntillet Ajrud which refers to “Yahweh of Teman”.

2.2) Who was Yahweh? A god of metallurgy.  (paper)

Gods in the ancient world were given a specific set of powers. For reasons we will get into next time, Yahweh in the Bible is given the attributes of many kinds of gods: he exhibits power of the storm, of the sun, and even of femininity. But if we limit our search to descriptions of God in Midianite territory, we see the following picture:

Stage 3. The Levite Encounter

The Bible was written by four authors: J, E, P and D. Of these, E, P and D are traced to Levite priestly authors. There exist startling differences across Levite and non-Levite texts.

3.1) There was no mass exodus. But there was a mini-exodus of a group of Levites from Egypt (article, video).

Textual evidence:

  • The two oldest things in the Bible are the Song of the Sea, and the Song of Deborah. The Song of the Sea is a Levite text that does not mention Israel. The Song of Deborah, meanwhile, lists all ten tribes of Israel (Judah and Simeon were a separate community at this time and not part of Israel) but doesn’t mention Levi. Similarly, all twelve tribes are mentioned in the Blessings of Moses, but Levi is the only tribe associated with the exodus.
  • Detail in Egyptian stories. It is only the Levite sources (E, P, and also D) that tell the entire story of the plagues and exodus from Egypt. J, the non-Levite source, doesn’t tell it. If you read J, it jumps from Moses’ saying “Let my people go” in Exodus 5:1f to the people’s already having departed Egypt in Exodus 13:21.
  • Name of God. If the Levites brought Yahweh into Israel, they should be keen to describe the relationship between Yahweh and El. And only our Levite sources do this: J presumes the name is Yahweh from the beginning of her document.
  • It is likewise the Levite sources that concentrate on the Tabernacle.  E mentions it a little; P treats it a lot. There is more about the Tabernacle than about anything else in the Torah.  But the non-Levite source J never mentions it at all.

Egypt was known to host many Semitic peoples over the years. It is not unthinkable that some small group escaped. The Shasu people were allowed by Merneptah to bring their herds into Egyptian territory. The absence of evidence only militates against a massive exodus. It is silent on the question of an exodus on a small scale.

  • Names of the Levites. Hophni, Hur, Phinehas, Merari, Pashhur and above all Moses are Egyptian names. No one else, in all the names mentioned in the Bible, has an Egyptian name. If Egyptian names were invented, why only attribute them to the Levites? Further, the story of Moses’ name suggests the Biblical redactors did not know these names were Egyptian.
  • Cultural derivatives. There are strong parallels between the Levite priests’ description of the Ark and Egyptian barks. Likewise, the Seraphim that occupy the First Temple come from Egypt (the uraeus) IG.151. The serpent on Aaron’s staff mirrors Egyptian mythology. Professor Michael Homan showed that the Tabernacle has architectural parallels with the battle tent of Pharaoh Ramses II.
  • Circumcision. Only texts written by Levites (11/11)  give the requirement to practice circumcision — which was a known practice in Egypt.  So Egyptian cultural influences are present, but only in the Levite texts!

3.2) Moses was a Midianite.

  • Moses is described as having settled down with the Midianite people (the Shasu). His wife Zipporah and two sons were Midianite. What’s more: Moses’ father-in-law Jethro is called a priest. A priest of what god? Well, in Exodus 18:12, Jethro (and not Moses) is portrayed initiating a sacrifice to Yahweh. The Biblical editors seem uncomfortable with this tradition, for they later interjected a confession of faith on Jethro’s lips, which very much mirrors other such confessions. All of this suggests that Moses’ Midianite father-in-law was a priest of Yahweh. In fact, he seems to have spiritual authority over Moses in this passage.
  • The E source is replete with this kind of claim. We first meet Moses in Midian (no claims of him being born in Egypt, in this document). Moses’ response to Yahweh’s call, “Who am I that I should bring the Israelites out of Egypt?” would be a fair question for a man in Midian. E also claims he cannot go to Egypt because he is “heavy of tongue”. Traditionally interpreted as a speech defect, this phrase only occurs in one other place in the Hebrew Bible, where it means being unable to speak the language. Finally, E also claims that the Midianites are direct descendants of Abraham.
  • While two Levite sources admit Moses’ Midianite connection, P actively tried to hide it. The P source has absolutely nothing about his ever being in Midian. Nothing about a Midianite wife, a priest father-in-law, nothing about his sons. Two books later, the P source injects a (blood-curdling) story designed to vilify the Midianites. Moses himself gives the order to kill all of the Midianite women. And this source does not include the little fact that Moses has a wife who happens to be a Midianite woman. The fact that the P source tries to deny the Midianite connection suggests the underlying claim is historical.

There are a couple problems with this theory. First, if Moses was Midianite, why did he have an Egyptian name? Further, why would he come to be in Egypt? There are ways around these difficulties (perhaps his name was retrofitted, or perhaps he didn’t come to Egypt, or …).

These problems illustrate that, of all the theories in this post, this particular hypothesis carries the most uncertainty. Fortunately, we can fairly easily swap it out with alternative theories (Moses as enslaved Levite, Moses as Egyptian royalty, etc) without harming the overall thesis. The key point in all of this is that the Levites left Egypt and encountered Yahweh in Midian.

3.3) The Levites came into contact with the Shasu cult, and accelerated Yahweh’s introduction to Israel and Judah.

  • We need some account of how Yahweh was introduced into El’s pantheon. It is possible that Yahweh was slowly introduced to Israel via trade with its southern neighbors. However, the Levite emigration to Israel explains how the Yahweh cult became so influential.
  • Location of Sinai. Religious thinking in that era strongly associated gods with locations. In fact, deities were commonly thought to reside in sacred mountains. Mount Olympus was the home of Zeus & his pantheon. Mount Sapan was the home of Ba’al and his pantheon. Mount Sinai (aka Mount Horeb) was the house of Yahweh. This mountain was located in southern Edom, and the Levites regularly traveled to that location to worship him.
  • Exodus 24:8 features Moses splashing blood on his followers in a ritual ceremony. This kind of blood covenant was unknown to Canaan, but common in pre-Islamic Arabia.

Stage 4: El’s Adoption of Yahweh

4.1) On arrival into Israel, Yahweh was introduced as a second tier deity (a member of El’s family).

This can be seen in Deuteronomy 32:8-9, where El gives each of his sons a nation to rule over:

When El gave the nations their inheritance, when he divided all mankind, he set up boundaries for the peoples according to the number of the sons of El. For Yahweh’s portion is his people, Jacob his allotted inheritance.

In Psalm 82, we see Yahweh not at the head of the pantheon, but later asked to assume the job of all gods. “Yahweh stands in the divine assembly of El. Among the divinities, he pronounces judgment… Arise O Yahweh, judge the world; for You inherit all the nations.” Genesis 49:24-25 and Numbers 23-24 also view YHWH and El existing as distinct deities.

Again, we will see more evidence for this particular proposition in part two of this series.

4.2) The Levites “attached” themselves as a priestly class

  • The Levites claim responsibility for the massacres in Genesis 34, Exodus 32:26-29, and Numbers 25:6-15. Consider also Jacob’s blessing: “Levi’s knives are vicious weapons. May I never enter their council. For in their anger they kill men, and on a whim they hamstring oxen. Their anger is cursed, for it is strong, and their fury, for it is cruel!” While the bloody purges specified in the conquest narrative are non-historical, they too speak to the bloody zeal of the Levite people. All of this is to say: when they did arrive in Israel asking for refuge, they were not a people the Israelites could easily say no to.
  • In the book of Exodus, there are myriad references to “the people” and very few (retro-fitted) references to the Israelites. It is very plausible that “the people” referred exclusively to militant Levites. Deut 33:2-5 seems to support this distinction: “his people assembled with the tribes of Israel”.
  • On arrival, the Levites are not given territory. Instead, they are given a 10% tithe as priests. This fits into William Propp’s commentary on Exodus, which makes a strong case on the etymology of the very word “Levi” that its most probable meaning is an “attached person” in the sense of resident alien.
  • Over and over, the Levite sources command that one must not mistreat an alien. Why? “Because we were aliens in Egypt”. In the three Levite sources, the command to treat aliens fairly comes up 52 times! And how many times in the non-Levite source, J? None. Compared to legal texts of surrounding nations, this aspect is unique to the Israelite law code.

4.3) The Levites wrote the national history.

Those who accept the (very) strong reasons to think the mass exodus non-historical (section 1.1) need to explain how the story of the Exodus made it into the Bible. But we are not being asked to explain how it was invented whole-cloth. Rather, we must explain why and how memory of the mini-exodus (section 3.1) became stretched and aggrandized over time.

Why did the Levites invent the mass-exodus narrative?

  1. Promoting worship of Yahweh. The Levites were convinced that Yahweh had saved them from Egypt. What better way to have Israel worship Yahweh, than create a new history?
  2. Simple power politics. Political influence is easier to hold & retain if your group is the only “outsider”.
  3. Political unification. Iron age Israel was theocratic. The priests and kings shared (and sometimes competed for) power. A common origin story is a powerful tool for unification and shared identity. Similarly, the demonization of lowland city-states (cultural & ethnic siblings) as “Canaanite” served to support campaigns against them.

How did they accomplish this? By the production and dissemination of an origin story.  

While we are investigating the historicity of the Biblical narrative, we should also consider: why do these texts exist at all? The Hebrew Bible is humanity’s first attempt at prose, and of history. This intermingling of religion and history was unique to the ancient world. Instead of cyclic episodes of mythological combat, the Israelite religious imagination was fixated on events of their material past. Its structure is entirely unique, and cries out for an explanation. The Bible was written to create a written tradition (much more stable than oral traditions) of national identity.

In addition to violence, the Levites also had a reputation for teaching. We can see this in verses like Deuteronomy 6:20-23, which reads,

When your children ask you later on, “What are these laws that Yahweh commanded you?” you must say to them, “We were Pharaoh’s slaves in Egypt, but the Lord brought us out of Egypt in a powerful way. And he brought signs and great, devastating wonders on Egypt, on Pharaoh, and on his whole family before our very eyes. He delivered us from there so that he could give us the land he had promised our ancestors.

What specifically did the Levites fabricate?

They started with their own experience (an actual event), and added the following:

First, to make a mini-exodus massive, you need large numbers. You can actually “watch” the estimates grow as we move from earlier to later sources. J doesn’t mention numbers at all. E estimates a total of around 600,000, and P estimates a total of 600,000 fighting-age males (for a total population of two million).

Second, the Exodus, without the conquest, would never have survived as a story. You need to explain how a nomadic nation came to reside in someone else’s territory. The conquest does this (and also stokes political sentiment of a later time period).

Why did the Israelites believe this story?

Don’t we all evaluate our personal origin stories with a bit too much credulity? Many Romans literally believed a wolf raised their patriarchs. Even in American culture, many people I’ve spoken with conceive of the Founding Fathers in mythic, rather than human, terms.

But why didn’t the first recipients of the mass exodus story reject it? Imagine the Levites waited ten or twenty generations before telling the story, and the mini-exodus narrative expansion happened only gradually. Israelites would only have distant inklings of the remembered past to go on. It is true that, for the exodus story to take root in early Israel, it had to connect to the remembered past of settlers who did not emigrate from Egypt. And this is in fact the case: Egypt did control and oppress Canaan during the mini-exodus.

Takeaways

Today we learned that Yahweh was originally a god of metallurgy in northwest Saudi Arabia. The Levites brought worship of him to Israel.

More specifically:

  • Certain aspects of Israelite prehistory as given by the Bible cannot be read literally. We have strong evidence that the Israelite people were indigenous Canaanites. The Israelites and the Canaanites shared the same religion: the pantheon of El. The earliest Israelites worshipped creator god El, his wife Asherah, and his sons e.g., Baal.
  • The original Yahweh cult was located in south Edom (northwest Saudi Arabia). Yahweh was there worshipped by the Shasu people as a god of metallurgy.
  • There was no mass exodus. But there was a mini-exodus of a group of Levite priests from Egypt. The Biblical evidence suggests that Moses was a Midianite, and his encounter with Yahweh occurred in Midian.
  • On arrival at Israel, the Levites were incorporated into the Israelite population. Instead of land, they were ceded priestly roles, which included a 10% tithe. Their deity Yahweh was introduced as a second tier god: a member of El’s family. The national history created by the Levites thus helped unify Israel around her new pre-history.

Until next time.

[Sequence] Demystifying Religion

Most of my efforts focus on various aspects of science and mathematics. Why write about religion?

I am not particularly interested in evaluating theological claims. But this blog is very interested in the computational and biological bases of primate sociality. And religion plays a key role in our evolved social capacities.

This post is meant as an executive summary of various positions I have come to accept over the years. As with my other overview posts, the positions laid out here are a moving target. I’m hoping to eventually motivate each topic; to give you the evidence rather than summarizing the belief. If you want to hear more about a particular topic, don’t hesitate to let me know!

Social Theses

At a social level, religious belief brings communities together. This explains the special attention many faiths place on ethics: ethical norms are the frame on which social institutions rest. It also explains why most conversion experiences tend to occur at a deeper, more emotional place of the mind (not so much in the cold light of reason).

  1. The Relational Sphere Hypothesis. Social institutions come in three flavors. There is the political sphere, economic sphere, and social sphere. Religious institutions are an extension of (a buttressing of) the social sphere.
  2. Generator of Social Capital. The reason why religion became institutionalized is that, with the triumph of market economies over gift economies, religious structure provided an alternative mode for promoting social bonds within a community.
  3. Monotheistic Cohesion Hypothesis. Monotheistic cultures tend to treat strangers more fairly than polytheistic ones. Monotheism was successful in part because it facilitated larger group size (strangers could identify as the same team).

Cognitive Theses

At a cognitive level, religious experience sits at the nexus of animism, mythology, and ritual. Occasionally it is accentuated by numinous (altered) states of consciousness. Only very recently has belief played a role in some forms of religious participation. Here, I survey the cognitive machinery that drives these aspects of religiosity.

  1. Animism as Hyperactive Agency Detection. Mammals are good at differentiating events caused by inanimate nature from those caused by animate agents. Due to the asymmetry of false positives vs false negatives, our Agency Detectors are built on a hair trigger: we are often too quick to attribute agency. Humans are prone to invoke supernatural agents whenever emotionally eruptive events arise that have superficial characteristics of agency in the absence of a corresponding agent.
  2. Mythology as Counterintuitive Narratives
  3. Ritual as Paradox-Based Social Bonding
  4. The Numinous as Altered States of Consciousness
  5. Two Faces of Meaning. Beliefs are like clothes; they serve two purposes. The first purpose is functional: beliefs can constrain expectations of physical experience. The second purpose is signaling: beliefs can signal group membership, ethical values, and personality. Most beliefs serve both purposes, at least to some extent. Religious belief is notable in that its content is mostly the latter. That is, religious belief typically does not constrain expectation of physical experience.

Historical Theses

It is admittedly strange to discuss 1st century Palestine in depth. Why pay so much attention here, as opposed to 7th century Saudi Arabia, or 19th century US state of Utah?

To this I must disclose that most of my friends and family self-describe as evangelical Christian. 1st century Palestine is brought to my attention literally once a week. I am hoping these conversations become more interesting after I construct positive theories that go beyond “I don’t know”.

Thus, the following historical theses are rightly viewed as less interesting than more universal topics on my blog. That said, perhaps you will find value in them.

Judaism subsequence

  • A Secret In The Ark. Presents the linguistic evidence that the story of Noah was not authored by Moses, but was instead produced by the interweaving of two (surprisingly divergent) narratives.
  • Who Wrote The Bible? Introduces the Documentary Hypothesis as a generalization of observations such as the above.

Christianity subsequence

  • Jesus as Apocalyptic Prophet.

Who Wrote The Bible?

Part Of: Demystifying Religion sequence
Followup To: A Secret in The Ark
Content Summary: 1900 words, 19min read.

Who Wrote The Hebrew Bible?

A close reading of the Hebrew Bible reveals the existence of doublets: two stories that describe the same event. A few examples:

  • Abraham’s covenant (Genesis 15:1-21 and 17:1-27),
  • Jacob becoming Israel (Genesis 32:25-33 and 35:9-15),
  • Yahweh summons Moses (Exodus 3-4 and 6:2-30)
  • Water in the wilderness (Exodus 15:22b-25a and 17:1-7)

Dozens of these doublets appear throughout the first five books of the Hebrew Bible (also known as the Torah). Traditionally, the Torah is thought to have a single author, and doublets like these were explained as either a) different events, or b) same event but with different emphases.

But what if these doublets exist because the Torah has multiple authors?

Let’s look deeper.

Source Identification as Unsupervised Learning

In principle, how might we discern between a single- and a multi-author book?

The Clustering Method. Let’s conjecture two sources (clusters) and, for each sentence, assign it to either Cluster 1 or Cluster 2. We have complete freedom in our assignments. We want to choose clusters that maximize the coherence within each source, and also maximize the difference between the sources.

  • If the clusters are not very different, there is probably only one author.
  • If they are very different, we can safely conclude two authors.

For readers familiar with machine learning: this is unsupervised learning – searching for latent variables that best explain our data.
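To make the Clustering Method concrete, here is a minimal sketch in Python. It rests on my own simplifying assumptions, not on how scholars actually separate sources: each sentence is reduced to a bag of words, "coherence" is measured by cosine similarity, and the assignment is found with a tiny two-cluster k-means-style loop. The function names and the toy sentences are invented for illustration, and the code expects at least one sentence.

```python
from collections import Counter
import math


def bag_of_words(sentence):
    """Lowercased word counts for one sentence."""
    return Counter(sentence.lower().split())


def cosine(a, b):
    """Cosine similarity between two word-count vectors (0.0 if either is empty)."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def two_cluster(sentences, iterations=10):
    """Assign each sentence to cluster 0 or 1 by repeated nearest-centroid updates."""
    vectors = [bag_of_words(s) for s in sentences]
    # Seed the clusters with the first sentence and the sentence least similar to it.
    far = min(range(len(vectors)), key=lambda i: cosine(vectors[0], vectors[i]))
    centroids = [Counter(vectors[0]), Counter(vectors[far])]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        labels = [
            0 if cosine(v, centroids[0]) >= cosine(v, centroids[1]) else 1
            for v in vectors
        ]
        # Recompute each centroid as the summed word counts of its members.
        centroids = [Counter(), Counter()]
        for v, label in zip(vectors, labels):
            centroids[label].update(v)
    return labels


def separation(sentences, labels):
    """Mean within-cluster similarity minus mean between-cluster similarity."""
    vectors = [bag_of_words(s) for s in sentences]
    within, between = [], []
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            sim = cosine(vectors[i], vectors[j])
            (within if labels[i] == labels[j] else between).append(sim)

    def mean(xs):
        return sum(xs) / len(xs) if xs else 0.0

    return mean(within) - mean(between)


# Invented toy sentences, not real source text.
sentences = [
    "the lord spoke to moses on the mountain",
    "the lord spoke to aaron in the tent",
    "god appeared to jacob in a dream",
    "god appeared to abraham in a vision",
]
labels = two_cluster(sentences)
score = separation(sentences, labels)
```

A `score` near zero would suggest the two clusters are not very different (probably one author); a clearly positive score means sentences resemble their own cluster more than the other one.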

A Tale of Two Books

Suppose you encounter a book you have never read before, originally written in English by a single author. Call this Book A.

But you don’t know if Book A has one or two authors! To find out, you might use the Clustering Method.

What happens if you look at every sentence in Book A, and try to make each source-cluster as different as possible? Even for books written by a single author, the resultant source-clusters could be contrived to be truly different. For example, you could put all optimistic sentences in one bucket, and all pessimistic sentences in the other. But even though the texts feel a little different, they don’t differ that much (after all, a single person wrote both!)

In contrast, imagine you come across another book, Book B, replete with doublets. You break those doublets into clusters, and discover the following facts:

  1. Dialect. One cluster uses an antiquated dialect of English (e.g., Shakespearean), the other a modern dialect (e.g., African-American Vernacular English).
  2. Terminology. One cluster consistently uses the word “soda”, the other consistently uses the alternative, “pop”. 
  3. Consistent Content. One cluster is very interested in economic issues. The other is more interested in rehashing political debates.
  4. Narrative Flow.  Reading each cluster as a standalone book tends to smooth out non-sequiturs, and generally improve the sense of narrative flow.
  5. Inter-Source Relationships.  Imagine Book B is situated in an anthology with other books (B2 and B3) of unknown authorship. These other books are kinda dissimilar  from B. But B2 has lots in common with Cluster 1, and B3 sounds like it shares an author with Cluster 2. 
  6. Historical Grounding. Given the above information, we can make a pretty good guess as to the identity of both authors, and why they got merged into a single anonymous volume.

On this evidence, it seems very unlikely that there is a single author of Book B. Instead, most people would indeed accept that this document has two different authors.
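The Terminology test from the list above is easy to sketch once sentences have been assigned to clusters. The snippet below is only an illustration under my own assumptions: the helper name `term_exclusivity` and the sample sentences are hypothetical, and words are matched by simple whitespace splitting.

```python
from collections import Counter


def term_exclusivity(sentences, labels, term):
    """Return (count in cluster 0, count in cluster 1, share held by the dominant cluster)."""
    counts = Counter()
    for sentence, label in zip(sentences, labels):
        counts[label] += sentence.lower().split().count(term)
    total = counts[0] + counts[1]
    share = max(counts[0], counts[1]) / total if total else 0.0
    return counts[0], counts[1], share


# Invented Book B sentences, already split into two clusters.
book_b = [
    "she bought a soda at the shop",
    "he spilled his pop on the floor",
    "the soda was cold",
    "pop is cheaper here",
]
clusters = [0, 1, 0, 1]

soda = term_exclusivity(book_b, clusters, "soda")
pop = term_exclusivity(book_b, clusters, "pop")
common = term_exclusivity(book_b, clusters, "the")
```

A distinctive term such as “soda” lands entirely in one cluster (share 1.0), while a common word like “the” is spread across both, so it tells us little about authorship.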

The Hypothesis: Five Sources

The Hebrew Bible is like Book B. Only, instead of two distinct authors, we have identified five. These are the Jahwist source (J), the Elohist source (E), the Priestly source (P), the Deuteronomist source (D), and the Redactor (R). This is the Documentary Hypothesis.

We will explore the different personalities of these authors in more detail next section; for now, I want to briefly describe their contributions to the Torah from a textual perspective:

Documentary Hypothesis_ Source Distribution

And here is the timeline on which our source documents were authored, where the final redactor R (Ezra) compiled the final JEPD product.

Documentary Hypothesis_ Composition Timeline (2)

Evidence For The Hypothesis

How do we know all of this? On the following grounds:

  1. Dialect. Sources J and E are written in the Hebrew of the 10th century BCE. In contrast, P and D are written in the Hebrew of the 8th century BCE.
  2. Terminology. A couple of examples. Source D alone uses the phrase “with all your heart and with all your soul”. Source P contains all 100 instances of the word “congregation”, and 67 out of 69 instances of the word “chieftain”. Here are more examples:

Documentary Hypothesis_ Terminology (1)

  3. Consistent Content.
    • The Revelation of God’s Name. According to J, the name YHWH was known since the earliest generations of humans. But in E and P it is stated just as explicitly that YHWH does not reveal this name until the generation of Moses.
    • Sacred Objects.
      • Tabernacle: P discusses the Tabernacle 200 times; it receives more attention than any other subject. It is never mentioned in J or D. E mentions it three times.
      • The Ark: J identifies the ark as crucial to Israel’s travels and military successes; it is never mentioned in E.
      • Urim and Thummim: P mentions Urim and Thummim. J, E, and D never do.
      • Cherubs: P and J invoke cherubs. E and D never do.
      • Miracles: E has miracles performed by Moses’ staff. P uses Aaron’s staff.
    • Priestly Leadership. In P, access to the divine is limited to Aaronid priests. There is no talk of dreams, angels, talking animals, or judges, and very few mentions of prophets. These themes are developed almost exclusively in J, E, and D.
  4. Narrative Flow. Reading J, E, D, and P as standalone narratives tends to remove non-sequiturs and contradictions, and generally improve the sense of narrative flow. Want to see this for yourself? Compare the composite story of Noah with the two original stories (which were woven together by a later redactor).
  5. Inter-Source Relationships. Source D shares the same tone, emphases, and worldview as the book of Jeremiah. Source P resonates strongly with the book of Ezekiel. Finally, Sources J and E mirror the book of Hosea.
  6. Historical Grounding. This is the most exciting piece of evidence, for reasons I will more fully explore next time. Suffice to say that we can localize each source to the historical context in which it was written. We have evidence suggesting that J and E were composed during the divided monarchy, before Israel fell in 722 BCE. J is written from a Southern perspective (in Judah), E is written from a Northern perspective (in Israel). After the fall of the northern kingdom, many Israelites fled to Judah. Because the old tribal disputes had faded in importance, J and E were combined into a JE narrative. The Priestly source P was an alternative telling of JE written in 8th century Judah. Finally, the first iteration of Deuteronomy was composed during the reign of King Josiah (641 BCE), just 20 years before the Babylonian exile (622 BCE).

I’ll let Richard Elliott Friedman wrap up this section.

Above all, the strongest evidence establishing the Documentary Hypothesis is that several different lines of evidence converge. There are more than thirty cases of doublets: stories or laws that are repeated in the Torah. The existence of so many overlapping texts is noteworthy itself. But their mere existence is not the strongest argument. One could respond, after all, that this is just a matter of style or narrative strategy. Similarly, there are hundreds of apparent contradictions in the text, but one could respond that we can take them one by one and find some explanation for each contradiction. And, similarly, there is a matter of the texts that consistently call the deity God while other texts consistently call God by the name YHWH, to which one could respond that this is simply like calling someone sometimes by his name and sometimes by his title.

The powerful argument is not any one of these matters. It is that all these matters converge. When we separate the doublets, this also results in the resolution of nearly all the contradictions. And when we separate the doublets, the name of God divides consistently in all but three out of more than two thousand occurrences. And when we separate the doublets, the terminology of each source remains consistent within the source. And when we separate the sources, this produces continuous narratives that flow with only a rare break. And when we separate the sources, this fits with the linguistic evidence, where the Hebrew of each source fits consistently with what we know of the Hebrew in each period. And so on for each of the categories that precede this section.

The name of God and doublets were the starting-points of the investigation into the formation of the Bible. But they are not major arguments or evidence in themselves. The most compelling argument is that all this evidence of so many kinds comes together so consistently. To this day, no one known to me who challenged the hypothesis has ever addressed this fact.

Open Questions

Most scholars agree with the broad picture of four sources (J, E, P, D) and two redactions (JE and JEPD). There does exist considerable controversy at finer levels of detail. The four most contentious mini-debates I know of are as follows:

  • While there is consensus on the dating of J, E, and D, the dating of P is somewhat controversial (700 vs 500 BCE).
  • The exact relationship of J and E is at times hard to work out, particularly because E has less material than J. Were parts of E ejected during the redaction process of JE? Or was E composed as a supplement to J, and not a standalone work?
  • It is hard to make out how the two redaction processes actually worked. The Hebrew Bible is the very first example of prose writing in the entire world (earlier writing was entirely poetic).
  • There is consensus that J, E, P, and D were authored long after the events that they describe. They were undoubtedly influenced by early oral traditions. However, the extent of continuity and historical memory transferred from these oral traditions is in some doubt.

Takeaways

I was raised in an evangelical household, which means that growing up, I read the Hebrew Bible (known to Christians as the Old Testament) cover-to-cover several times. I found such reading difficult. Some of this was mere cultural distance: a kid in the 20th century CE is three millennia removed from the Canaanite culture in which the Bible was written.

But for me, the Hebrew Bible feels much easier to understand in light of the Documentary Hypothesis.

  • Contradictions are explained.
  • The within-source stories flow much better.
  • It is easier to understand the narrative discontinuities in the composite.
  • The diverse perspectives can be situated within their originating cultural milieu.

I wish more people knew about the Documentary Hypothesis for these reasons. Or better yet, could look at the labelled sources of the Hebrew Bible online. But for now, if you’d like to read the Hebrew Bible yourself with labelled sources, the best way is simply to purchase a book like The Bible With Sources Revealed, which gives the complete Torah, color coded by authorship.

Until next time.

An Introduction to Language Models

Part Of: Language sequence
Content Summary: 1500 words, 15 min read

Why Language Models?

In the English language, ‘e’ appears more frequently than ‘z’. Similarly,  “the” occurs more frequently than “octopus”. By examining large volumes of text, we can learn the probability distributions of characters and words.

Language Models_ Letter and Word Frequency
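
To make the counting concrete, here is a minimal sketch of how such frequency tables might be estimated. The sample text and the function name are my own stand-ins, not anything from a real corpus; meaningful estimates need far more text than this.

```python
from collections import Counter

def relative_frequencies(items):
    """Map each distinct item to its share of the total count."""
    counts = Counter(items)
    total = sum(counts.values())
    return {item: n / total for item, n in counts.items()}

# A tiny stand-in text (an assumption for illustration only).
sample_text = "the cat saw the dog and the dog saw the cat"

# Character-level distribution, ignoring spaces.
letter_freq = relative_frequencies(ch for ch in sample_text if ch.isalpha())

# Word-level distribution.
word_freq = relative_frequencies(sample_text.split())

print(sorted(word_freq.items(), key=lambda kv: kv[1], reverse=True))
```

The same function serves both cases because a distribution over characters and a distribution over words are both just normalized counts.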

Roughly speaking, statistical structure is distance from maximal entropy. The fact that the above distributions are non-uniform means that English is internally recoverable: if noise corrupts part of a message, the surrounding context can be used to recover the original signal. Statistical structure is also used to reverse engineer secret codes such as the Caesar cipher.

We can illustrate the predictability of English by generating text based on the above probability distributions. As you factor in more of the surrounding context, the utterances begin to sound less alien, and more like natural language.

Language Model_ Structure of English
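
If you want to try this yourself, the following is a rough sketch of generating text one word at a time, where each word is drawn according to how often it followed the previous word. The training text, function names, and fixed random seed are all assumptions for illustration.

```python
import random
from collections import Counter, defaultdict

def train_bigram_successors(words):
    """For each word, count which words follow it and how often."""
    successors = defaultdict(Counter)
    for prev, word in zip(words, words[1:]):
        successors[prev][word] += 1
    return successors

def generate(successors, first_word, length, rng):
    """Sample up to `length` words, each conditioned on the previous one."""
    out = [first_word]
    while len(out) < length:
        options = successors.get(out[-1])
        if not options:  # dead end: this word was never followed by anything
            break
        words, weights = zip(*options.items())
        out.append(rng.choices(words, weights=weights)[0])
    return out

text = "the cat saw the dog and the dog saw the cat".split()
successors = train_bigram_successors(text)
sample = generate(successors, "the", 8, random.Random(0))
print(" ".join(sample))
```

Conditioning on more context (two previous words, three, and so on) only changes the keys of the successor table; the sampling loop stays the same.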

A language model exploits the statistical structure of a language to express the following:

  • Assign a probability to a sentence P(w_1, w_2, w_3, \ldots w_N)
  • Assign probability of an upcoming word P(w_4 \mid w_1, w_2, w_3)

Language models are particularly useful in language perception, because they can help interpret ambiguous utterances. Three such applications might be,

  • Machine Translation: P(\text{high winds tonight}) > P(\text{large winds tonight})
  • Spelling correction: P(\text{fifteen minutes from}) > P(\text{fifteen minuets from})
  • Speech Recognition: P(\text{I saw a van}) > P(\text{eyes awe of an})

Language models can also aid in language production. One example of this is autocomplete-based typing assistants, commonly displayed within text messaging applications. 

Towards N-Grams

A sentence is a sequence of words \textbf{w} = (w_1, w_2, \ldots, w_N). To model the joint probability over this sequence, we use the chain rule:

p(\text{this is the house})

= p(\text{this})p(\text{is}\mid\text{this})p(\text{the}\mid\text{this is})p(\text{house}\mid\text{this is the})

As the number of words grows, the size of our conditional probability tables (CPTs) quickly becomes intractable. What is to be done? Well, recall the Markov assumption we introduced in Markov chains.

markov_assumption

The Markov assumption constrains the size of our CPTs. However, sometimes we want to condition on more (or less!) than just one previous word. Let v denote how many variables we admit in our context. A variable order Markov model (VOM) allows v elements in its context: p(s_{t+1} \mid s_{t-v+1}, \ldots, s_{t}). Each CPT entry then spans n = v+1 variables, because we must also count the variable being predicted. Thus an N-gram model is a Markov model of order v = N-1. By far, the most common choices are trigrams, bigrams, and unigrams:

Language Models_ Ngram comparison (1)
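
As a small illustration of what these units look like in practice, here is a sketch that slides a window of size n over a tokenized sentence. The helper name ngrams is my own choice.

```python
def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

tokens = "this is the house".split()

unigrams = ngrams(tokens, 1)  # context of 0 previous words
bigrams = ngrams(tokens, 2)   # context of 1 previous word
trigrams = ngrams(tokens, 3)  # context of 2 previous words

print(bigrams)
```

In each tuple, the last element plays the role of the predicted word and everything before it is the context.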

We have already discussed Markov Decision Processes, used in reinforcement learning applications.  We haven’t yet discussed MRFs and HMMs. VOMs represent a fourth extension: the formalization of N-grams. Hopefully you are starting to appreciate the  richness of this “formalism family”. 🙂

Language Model_ Markov Formalisms (1)

Estimation and Generation

How can we estimate these probabilities? By counting!

ngram_v2

Let’s consider a simple bigram language model. Imagine training on this corpus:

This is the cheese.

That lay in the house that Alice built.

Suppose our trained LM encounters the new sentence “this is the house”. It estimates its probability as:

p(\text{this is the house})

= p(\text{this})p(\text{is} \mid \text{this})p(\text{the} \mid \text{is})p(\text{house} \mid \text{the}) 

= \dfrac{1}{12} * 1 * 1 * \dfrac{1}{2} = \dfrac{1}{24}
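
The arithmetic above can be checked with a short sketch. I am assuming lowercased text with punctuation stripped, and the function names are mine; exact fractions are used so the result can be compared directly with the hand calculation.

```python
from collections import Counter
from fractions import Fraction

# The toy corpus, lowercased and without punctuation (an assumption).
corpus = [
    "this is the cheese".split(),
    "that lay in the house that alice built".split(),
]

# Unigram counts pool all words; bigrams are counted within each sentence.
unigram_counts = Counter(w for sent in corpus for w in sent)
bigram_counts = Counter(
    (prev, word) for sent in corpus for prev, word in zip(sent, sent[1:])
)
total_words = sum(unigram_counts.values())

def p_unigram(word):
    """Share of all word tokens that are `word`."""
    return Fraction(unigram_counts[word], total_words)

def p_bigram(word, prev):
    """Maximum-likelihood estimate: count(prev, word) / count(prev)."""
    return Fraction(bigram_counts[(prev, word)], unigram_counts[prev])

def sentence_probability(words):
    """First word by its unigram probability, the rest by bigram probabilities."""
    prob = p_unigram(words[0])
    for prev, word in zip(words, words[1:]):
        prob *= p_bigram(word, prev)
    return prob

p = sentence_probability("this is the house".split())
print(p)
```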

How many problems do you see with this model? Let me discuss two.

First, we have estimated that p(\text{this}) = \dfrac{1}{12}. And it is true that “this” occurs only once in our toy corpus above. But out of two sentences, “this” leads half of them. We can express this fact by adding a special START token into our vocabulary.

Second, recall what happens when language models generate speech. Once they begin a sentence, they are unable to end it! Adding a new END token will allow our model to terminate a sentence, and begin a new one.

With these new tokens in hand, we update our products as follows:

Language Models_ Sentence Estimation (1)
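
Here is a sketch of the same bigram estimate with the two special tokens added. The token spellings "<s>" and "</s>" and the function names are my own assumptions, and the corpus is again lowercased with punctuation removed.

```python
from collections import Counter
from fractions import Fraction

START, END = "<s>", "</s>"  # hypothetical spellings for the special tokens

def pad(sentence):
    """Wrap a tokenized sentence in START and END."""
    return [START] + sentence + [END]

corpus = [
    "this is the cheese".split(),
    "that lay in the house that alice built".split(),
]
padded = [pad(s) for s in corpus]

# A token counts as a context whenever something follows it, so END is excluded.
context_counts = Counter(w for sent in padded for w in sent[:-1])
bigram_counts = Counter(pair for sent in padded for pair in zip(sent, sent[1:]))

def p_bigram(word, prev):
    """Maximum-likelihood estimate: count(prev, word) / count(prev)."""
    return Fraction(bigram_counts[(prev, word)], context_counts[prev])

def sentence_probability(words):
    """Product of bigram probabilities over the padded sentence."""
    sent = pad(words)
    prob = Fraction(1)
    for prev, word in zip(sent, sent[1:]):
        prob *= p_bigram(word, prev)
    return prob

print(p_bigram("this", START))
print(sentence_probability("this is the cheese".split()))
```

Note that “this” now receives probability 1/2 as a sentence opener, and that any sentence ending in “house” scores zero, because “house” never ends a sentence in this corpus.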

A couple other “bug fixes” I’ll mention in passing:

  • Out-of-vocabulary words are given zero probability. It helps to add an unknown  (UNK) pseudoword and assign it some probability mass.
  • LMs prefer very short sentences (a running product of probabilities can only shrink as words are added). We can address this by, e.g., normalizing by sentence length.
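
Both fixes can be sketched in a few lines. The "<unk>" spelling, the function names, and the per-word probabilities below are invented for illustration; averaging log-probabilities is one common way to normalize by length, not the only one.

```python
import math

def map_unknowns(words, vocabulary, unk="<unk>"):
    """Replace any word outside the vocabulary with the UNK pseudoword."""
    return [w if w in vocabulary else unk for w in words]

def length_normalized_logprob(step_probs):
    """Average log-probability per word (the log of the geometric mean)."""
    return sum(math.log(p) for p in step_probs) / len(step_probs)

# Invented per-word probabilities for a short and a long sentence.
short_steps = [0.5, 0.5]
long_steps = [0.6, 0.6, 0.6, 0.6]

# The raw product favors the short sentence...
print(math.prod(short_steps), math.prod(long_steps))
# ...while the per-word average favors the sentence whose words are each likelier.
print(length_normalized_logprob(short_steps), length_normalized_logprob(long_steps))

print(map_unknowns(["the", "octopus"], {"the", "house"}))
```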

Smoothing

In the last sentence in the image above, we estimate p(END|house) = 0, because we have no instances of this two-word sequence in our toy corpus. But this causes our language model to fail catastrophically: the sentence is deemed impossible (0% probability).

This problem of zero probability increases as we increase the complexity of our N-grams. Trigram models are more accurate than bigrams, but produce more p=0 events. You’ll notice echoes of the bias-variance (accuracy-generalization) tradeoff.

How can we remove zero counts? Why not add one to every count? Of course, we’d then need to increase the size of our denominator (by the vocabulary size), to ensure the probabilities still sum to one. This is Laplace smoothing.

Language Model_ Laplace Smoothing
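
A minimal sketch of add-one smoothing on the toy corpus follows. As before, the token spellings, the lowercasing, and the function name are my assumptions. The vocabulary here is every token the model may predict next: each word type plus END.

```python
from collections import Counter
from fractions import Fraction

START, END = "<s>", "</s>"  # hypothetical spellings for the special tokens

corpus = [
    "this is the cheese".split(),
    "that lay in the house that alice built".split(),
]
padded = [[START] + sent + [END] for sent in corpus]

context_counts = Counter(w for sent in padded for w in sent[:-1])
bigram_counts = Counter(pair for sent in padded for pair in zip(sent, sent[1:]))

# Everything that can be predicted: each word type plus END (never START).
vocab = {w for sent in padded for w in sent[1:]}

def p_laplace(word, prev):
    """Add-one estimate: (count(prev, word) + 1) / (count(prev) + |vocab|)."""
    return Fraction(bigram_counts[(prev, word)] + 1,
                    context_counts[prev] + len(vocab))

print(p_laplace(END, "house"))                     # no longer zero
print(sum(p_laplace(w, "the") for w in vocab))     # still sums to one
```

Notice the cost: p(house | the) drops from 1/2 to 2/13, because probability mass has been moved onto every unseen continuation.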

In a later post, we will explore how (in a Bayesian framework) such smoothing algorithms can be interpreted as a form of regularization (MAP vs MLE).

Due to its simplicity, Laplace smoothing is well-known. But several algorithms achieve better performance.  How do they approach smoothing?

Recall that an event with zero count in an N-gram model is much less likely to have zero count in the (N-1)-gram model. For example, it is very possible that the phrase “dancing were thought” hasn’t been seen before, even if its component words and word pairs have.

Language Model_ Backoff Smoothing

While a trigram model may balk at the above sentence, we can fall back on the bigram and/or unigram models. This technique underlies the Stupid Backoff algorithm.
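
Here is a rough sketch of the back-off idea as I understand it: use the longest context that has been seen, and multiply by a fixed penalty each time the context is shortened. The penalty of 0.4 is a commonly quoted default that I am treating as an assumption, and the function names are my own. Note that the outputs are scores rather than true probabilities, since they are not renormalized.

```python
from collections import Counter

def train_counts(sentences, max_order=3):
    """Count every n-gram of order 1..max_order."""
    counts = Counter()
    for sent in sentences:
        for n in range(1, max_order + 1):
            for i in range(len(sent) - n + 1):
                counts[tuple(sent[i:i + n])] += 1
    return counts

def stupid_backoff(word, context, counts, total_words, alpha=0.4):
    """Score `word` given a tuple of preceding words, shortening the context as needed."""
    if not context:
        return counts[(word,)] / total_words       # unigram base case
    seen = counts[context + (word,)]
    if seen > 0:
        return seen / counts[context]              # relative frequency at this order
    # Unseen at this order: drop the oldest context word and pay the penalty.
    return alpha * stupid_backoff(word, context[1:], counts, total_words, alpha)

corpus = [
    "this is the cheese".split(),
    "that lay in the house that alice built".split(),
]
counts = train_counts(corpus)
total_words = sum(len(sent) for sent in corpus)

for word in ["house", "cheese", "built"]:
    print(word, stupid_backoff(word, ("in", "the"), counts, total_words))
```

“house” is scored by the trigram, “cheese” falls back to the bigram, and “built” falls all the way back to the unigram.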

As another variant on this theme, some smoothing algorithms train multiple N-grams, and essentially use interpolation as an ensembling method. Such models include Good-Turing and Kneser-Ney algorithms.

Beam Search

We have so far seen examples of language perception, which assigns probabilities to text. Let us now consider language production, which generates text from the probabilistic model. Consider machine translation. For a French sentence \textbf{x}, we want to produce the English sentence \textbf{y}^* = \text{argmax}_{\textbf{y}} \; p(\textbf{y} \mid \textbf{x}).

This seemingly innocent expression conceals a truly monstrous search space. Deterministic (exhaustive) search has us examine every possible English sentence. For a vocabulary size V, there are V^2 possible two-word sentences. For sentences of length n, the time complexity of our brute-force algorithm is O(V^n).
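
To get a feel for the growth, this sketch enumerates every candidate for a toy vocabulary and then simply counts the candidates for a more realistic one. The vocabulary and the sizes (10,000 words, 10-word sentences) are assumptions chosen for illustration.

```python
from itertools import product

def all_sentences(vocabulary, length):
    """Enumerate every word sequence of the given length (brute force)."""
    return list(product(vocabulary, repeat=length))

tiny_vocab = ["jane", "is", "visiting", "africa"]
candidates = all_sentences(tiny_vocab, 3)
print(len(candidates))  # V^n with V = 4, n = 3

# For realistic sizes we can only count the candidates, not enumerate them.
realistic_count = 10_000 ** 10
print(realistic_count)
```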

Since deterministic search is so costly, we might consider greedy search instead. Consider an example French sentence \textbf{x} “Jane visite l’Afrique en Septembre”. Three candidate translations might be,

  • y^A: Jane is visiting Africa in September
  • y^B: Jane is going to Africa in September
  • y^C: In September, Jane went to Africa

Of these, y^A is the best (most probable) translation: p(y^A \mid x) is the largest of the three. We would like greedy search to recover it.

Greedy search generates the English translation, one word at a time. If “Jane” is the most probable first word \text{argmax } p(w_1 \mid x), then the next word generated is \text{argmax } p(w_2 \mid \text{Jane}, x). However, it is not difficult to contemplate p(\text{going}\mid\text{Jane is}) > p(\text{visiting}\mid\text{Jane is}), since the word “going” is used so much more frequently in everyday conversation. These problems of local optima happen surprisingly often.
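
The trap can be reproduced with a toy model. In the sketch below, the table of next-word probabilities stands in for p(w_t \mid w_1, \ldots, w_{t-1}, x); every number in it is invented, as are the shortened sentences, the "</s>" end marker, and the function name.

```python
END = "</s>"  # hypothetical end-of-sentence marker

# Invented next-word probabilities, keyed by the words generated so far.
NEXT = {
    (): {"Jane": 1.0},
    ("Jane",): {"is": 1.0},
    ("Jane", "is"): {"going": 0.6, "visiting": 0.4},
    ("Jane", "is", "going"): {"to": 1.0},
    ("Jane", "is", "going", "to"): {"Africa": 0.55, "sleep": 0.45},
    ("Jane", "is", "going", "to", "Africa"): {END: 1.0},
    ("Jane", "is", "going", "to", "sleep"): {END: 1.0},
    ("Jane", "is", "visiting"): {"Africa": 1.0},
    ("Jane", "is", "visiting", "Africa"): {END: 1.0},
}

def greedy_decode(next_word_probs, max_len=10):
    """Always take the single most probable next word."""
    prefix, prob = (), 1.0
    while len(prefix) < max_len:
        options = next_word_probs[prefix]
        word = max(options, key=options.get)
        prob *= options[word]
        if word == END:
            break
        prefix += (word,)
    return prefix, prob

greedy_sentence, greedy_prob = greedy_decode(NEXT)
print(" ".join(greedy_sentence), greedy_prob)
```

Greedy search commits to “going” because it wins locally, and ends with a sentence of probability 0.33, even though “Jane is visiting Africa” has probability 0.4 under the same table.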

The deterministic search space is too large, and greedy search is too confining. Let’s look for a middle ground.

Beam search resembles greedy search in that it generates words sequentially. Whereas greedy search only drills one such path in the search tree, beam search drills a fixed number of paths in parallel. Consider the following example with beam width b=3:

beam_search

As you can see, beam search elects to explore y^A as a “second rate” translation candidate despite y^B initially receiving the most probability mass. Only later in the sentence does the language model discover the virtues of the y^A translation. 🙂
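
Here is a minimal sketch of the procedure, using an invented table of next-word probabilities in place of a real translation model. The numbers, the shortened sentences, the "</s>" end marker, and the function name are all assumptions for illustration.

```python
END = "</s>"  # hypothetical end-of-sentence marker

# Invented next-word probabilities, keyed by the words generated so far.
NEXT = {
    (): {"Jane": 1.0},
    ("Jane",): {"is": 1.0},
    ("Jane", "is"): {"going": 0.6, "visiting": 0.4},
    ("Jane", "is", "going"): {"to": 1.0},
    ("Jane", "is", "going", "to"): {"Africa": 0.55, "sleep": 0.45},
    ("Jane", "is", "going", "to", "Africa"): {END: 1.0},
    ("Jane", "is", "going", "to", "sleep"): {END: 1.0},
    ("Jane", "is", "visiting"): {"Africa": 1.0},
    ("Jane", "is", "visiting", "Africa"): {END: 1.0},
}

def beam_search(next_word_probs, beam_width=3, max_len=10):
    """Keep the `beam_width` most probable partial sentences at every step."""
    beams = [((), 1.0)]  # (words so far, probability so far)
    finished = []
    for _ in range(max_len):
        candidates = []
        for prefix, prob in beams:
            for word, p in next_word_probs[prefix].items():
                if word == END:
                    finished.append((prefix, prob * p))
                else:
                    candidates.append((prefix + (word,), prob * p))
        if not candidates:
            break
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]  # prune everything outside the beam
    return max(finished, key=lambda c: c[1])

print(beam_search(NEXT, beam_width=1))  # width 1 behaves like greedy search
print(beam_search(NEXT, beam_width=3))  # a wider beam keeps "visiting" alive
```

With a width of 1 the search follows “going” and finishes at probability 0.33. With a wider beam, the “visiting” path survives its weak start and wins at 0.4.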

Strengths and Weaknesses

Language models have three very significant weaknesses.

First, language models are blind to syntax. They don’t even have a concept of nouns vs. verbs!  You have to look elsewhere to find representations of pretty much any latent structure discovered by linguistic and psycholinguistic research.

Second, language models are blind to semantics and pragmatics. This is particularly evident in the case of language production: try having your SMS autocomplete write out an entire sentence for you. In the real world, communication is more constrained: we choose the most likely word given the semantic content we wish to express right now.

Third, the Markov assumption is problematic due to long-distance dependencies. Compare the phrases “dog runs” vs “dogs run”. Clearly, the verb suffix depends on the noun suffix (and vice versa). Trigram models are able to capture this dependency. However, if you center-embed a relative clause, e.g., “dog/s that live on my street and bark incessantly at night run/s”, the noun and its verb fall outside any small window, and N-grams fail to capture this dependency.

Despite these limitations, language models “just work” in a surprising diversity of applications. These models are particularly relevant today because it turns out that Deep Learning sequence models like LSTMs share much in common with VOMs. But that is a story we shall have to take up next time.

Until then.