The Documentary Hypothesis

Part Of: Demystifying Religion sequence
Followup To: A Secret in The Ark
Content Summary: 1900 words, 19min read.

Who Wrote The Hebrew Bible?

A close reading of the Hebrew Bible reveals the existence of doublets: two stories that describe the same event. A few examples:

  • Abraham’s covenant (Genesis 15:1-21 and 17:1-27)
  • Jacob becoming Israel (Genesis 32:25-33 and 35:9-15)
  • Yahweh summoning Moses (Exodus 3-4 and 6:2-30)
  • Water in the wilderness (Exodus 15:22b-25a and 17:1-7)

Dozens of these doublets appear throughout the first five books of the Hebrew Bible (also known as the Torah). Traditionally, the Torah is thought to have a single author, and doublets like these were explained as either a) different events, or b) same event but with different emphases.

But what if these doublets exist because the Torah has multiple authors?

Let’s look deeper.

Source Identification as Unsupervised Learning

In principle, how might we discern between a single- and a multi-author book?

The Clustering Method. Let’s conjecture two sources (clusters) and assign each sentence to either Cluster 1 or Cluster 2. We have complete freedom in our assignments. We want to choose clusters that maximize the coherence within each source, and also maximize the difference between the sources.

  • If the clusters are not very different, there is probably only one author.
  • If they are very different, we can safely conclude two authors.

For readers familiar with machine learning: this is unsupervised learning – searching for latent variables that best explain our data.
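
For the curious, here is a minimal sketch of the Clustering Method in Python. It is a toy, not how biblical scholars actually work: the bag-of-words features, the cosine-similarity two-cluster loop, and the names two_source_split and separation are all illustrative assumptions, and the four-sentence "book" is made up.

```python
from collections import Counter
import math


def vectorize(sentence):
    """Bag-of-words counts for one sentence (lowercased, split on whitespace)."""
    return Counter(sentence.lower().split())


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def centroid(vectors):
    """Summed word counts; cosine ignores scale, so no averaging is needed."""
    total = Counter()
    for v in vectors:
        total.update(v)
    return total


def two_source_split(sentences, max_rounds=20):
    """Assign each sentence to cluster 0 or 1; return (labels, separation)."""
    vecs = [vectorize(s) for s in sentences]
    # Deterministic seeds: the first sentence, and the sentence least like it.
    far = min(range(len(vecs)), key=lambda i: cosine(vecs[0], vecs[i]))
    centers = [vecs[0], vecs[far]]
    labels = None
    for _ in range(max_rounds):
        new_labels = [
            0 if cosine(v, centers[0]) >= cosine(v, centers[1]) else 1
            for v in vecs
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for k in (0, 1):
            members = [v for v, lab in zip(vecs, labels) if lab == k]
            if members:
                centers[k] = centroid(members)
    # 0.0 means the clusters share a vocabulary profile; 1.0 means no overlap.
    separation = 1.0 - cosine(centers[0], centers[1])
    return labels, separation


# Hypothetical four-sentence "book" with a soda/pop vocabulary split.
toy_book = [
    "we drank soda by the river",
    "they sold pop at the market",
    "we drank soda at the river",
    "they sold pop by the market",
]
labels, separation = two_source_split(toy_book)
print(labels, round(separation, 2))
```

A high separation score proves little on its own. As the next section argues, even a single-author book can be split into different-looking piles, so the score only means something relative to what you get from books of known authorship.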

A Tale of Two Books

Suppose you encounter a book you have never read before, originally written in English by a single author. Call this Book A.

But you don’t know if Book A has one or two authors! To find out, you might use the Clustering Method.

What happens if you look at every sentence in Book A and try to make the two source-clusters as different as possible? Even for a book written by a single author, the resulting clusters can be contrived to look different. For example, you could put all optimistic sentences in one bucket, and all pessimistic sentences in the other. But even though the two texts feel a little different, they don’t differ that much (after all, a single person wrote both!).

In contrast, imagine you come across another book, Book B, replete with doublets. You break those doublets into clusters, and discover the following facts:

  1. Dialect. One cluster uses an antiquated dialect of English (e.g., Shakespearean), the other a modern dialect (e.g., African-American Vernacular English).
  2. Terminology. One cluster consistently uses the word “soda”, the other consistently uses the alternative, “pop”. 
  3. Consistent Content. One cluster is very interested in economic issues. The other is more interested in rehashing political debates.
  4. Narrative Flow. Reading each cluster as a standalone book tends to smooth out non-sequiturs, and generally improves the sense of narrative flow.
  5. Inter-Source Relationships. Imagine Book B is situated in an anthology with other books (B2 and B3) of unknown authorship. These other books are kinda dissimilar from B as a whole. But B2 has lots in common with Cluster 1, and B3 sounds like it shares an author with Cluster 2.
  6. Historical Grounding. Given the above information, we can make a pretty good guess as to the identity of both authors, and why they got merged into a single anonymous volume.

On this evidence, it seems very unlikely that there is a single author of Book B. Instead, most people would indeed accept that this document has two different authors.

The Hypothesis: Five Sources

The Hebrew Bible is like Book B. Only, instead of two distinct authors, we have identified five. These are the Jahwist source (J), the Elohist source (E), the Priestly source (P), the Deuteronomist source (D), and the Redactor (R). This is the Documentary Hypothesis.

We will explore the different personalities of these authors in more detail next section; for now, I want to briefly describe their contributions to the Torah from a textual perspective:

[Figure: Documentary Hypothesis, source distribution]

And here is the timeline on which our source documents were authored, where the final redactor R (Ezra) compiled the final JEPD product.

[Figure: Documentary Hypothesis, composition timeline]

Evidence For The Hypothesis

How do we know all of this? On the following grounds:

  1. Dialect. Sources J and E are written in the Hebrew of the 10th century BCE. In contrast, P and D are written in the Hebrew of the 8th century BCE.
  2. Terminology. A couple of examples: Source D alone uses the phrase “with all your heart and with all your soul”. Source P accounts for all 100 instances of the word “congregation”, and 67 out of 69 instances of the word “chieftain”. Here are more examples:

[Figure: Documentary Hypothesis, terminology by source]

  3. Consistent Content.
    • The Revelation of God’s Name. According to J, the name YHWH was known since the earliest generations of humans. But E and P state just as explicitly that YHWH does not reveal this name until the generation of Moses.
    • Sacred Objects.
      • Tabernacle: P discusses the Tabernacle 200 times; it receives more attention than any other subject. It is never mentioned in J or D. E mentions it three times.
      • The Ark: J identifies the ark as crucial to Israel’s travels and military successes; it is never mentioned in E.
      • Urim and Thummim: P mentions Urim and Thummim. J, E, and D never do.
      • Cherubs: P and J invoke cherubs. E and D never do.
      • Miracles: E has miracles performed by Moses’ staff. P uses Aaron’s staff.
    • Priestly Leadership. In P, access to the divine is limited to Aaronid priests. There is no talk of dreams, angels, talking animals, or judges, and there are very few mentions of prophets. These themes are developed almost exclusively in J, E, and D.
  4. Narrative Flow. Reading J, E, D, and P as standalone narratives tends to remove non-sequiturs and contradictions, and generally improves the sense of narrative flow. Want to see this for yourself? Compare the composite story of Noah with the two original stories (which a later redactor wove together).
  5. Inter-Source Relationships. Source D shares the same tone, emphases, and worldview as the book of Jeremiah. Source P resonates strongly with the book of Ezekiel. Finally, Sources J and E mirror the book of Hosea.
  6. Historical Grounding. This is the most exciting piece of evidence, for reasons I will more fully explore next time. Suffice to say that we can localize each source to the historical context in which it was written. We have evidence suggesting that J and E were composed during the divided monarchy, before Israel fell in 722 BCE. J is written from a Southern perspective (in Judah); E is written from a Northern perspective (in Israel). After the fall of the northern kingdom, many Israelites fled to Judah. Because the old tribal disputes had faded in importance, J and E were combined into a JE narrative. The Priestly source P was an alternative telling of JE written in 8th century Judah. Finally, the first iteration of Deuteronomy was composed during the reign of King Josiah (c. 640-609 BCE), a few decades before the Babylonian exile (586 BCE).
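
To make the Terminology numbers concrete, here is a small sketch that turns the counts quoted above into per-source shares. The term_counts layout and the source_share helper are hypothetical names for illustration. The "other" bucket is also an assumption: the counts above do not say which source holds the two remaining uses of "chieftain".

```python
from fractions import Fraction

# Counts quoted in the Terminology item above. "other" lumps together every
# non-P source, because the text does not say where the remaining uses fall.
term_counts = {
    "congregation": {"P": 100, "other": 0},
    "chieftain": {"P": 67, "other": 2},
}


def source_share(counts, source):
    """Exact fraction of a term's occurrences that fall in one source."""
    total = sum(counts.values())
    return Fraction(counts[source], total) if total else Fraction(0)


for term, counts in term_counts.items():
    share = source_share(counts, "P")
    print(f"{term}: {float(share):.1%} of occurrences fall in P")
# congregation: 100.0% of occurrences fall in P
# chieftain: 97.1% of occurrences fall in P
```

A share near 100% for one word is suggestive rather than decisive. The argument gets its force when many terms line up with the same division of the text.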

I’ll let Richard Elliott Friedman wrap up this section.

Above all, the strongest evidence establishing the Documentary Hypothesis is that several different lines of evidence converge. There are more than thirty cases of doublets: stories or laws that are repeated in the Torah. The existence of so many overlapping texts is noteworthy itself. But their mere existence is not the strongest argument. One could respond, after all, that this is just a matter of style or narrative strategy. Similarly, there are hundreds of apparent contradictions in the text, but one could respond that we can take them one by one and find some explanation for each contradiction. And, similarly, there is a matter of the texts that consistently call the deity God while other texts consistently call God by the name YHWH, to which one could respond that this is simply like calling someone sometimes by his name and sometimes by his title.

The powerful argument is not any one of these matters. It is that all these matters converge. When we separate the doublets, this also results in the resolution of nearly all the contradictions. And when we separate the doublets, the name of God divides consistently in all but three out of more than two thousand occurrences. And when we separate the doublets, the terminology of each source remains consistent within the source. And when we separate the sources, this produces continuous narratives that flow with only a rare break. And when we separate the sources, this fits with the linguistic evidence, where the Hebrew of each source fits consistently with what we know of the Hebrew in each period. And so on for each of the categories that precede this section.

The name of God and doublets were the starting-points of the investigation into the formation of the Bible. But they are not major arguments or evidence in themselves. The most compelling argument is that all this evidence of so many kinds comes together so consistently. To this day, no one known to me who challenged the hypothesis has ever addressed this fact.

Open Questions

Most scholars agree with the broad picture of four sources (J, E, P, D) and two redactions (JE and JEPD). There does exist considerable controversy at finer levels of detail. The four most contentious mini-debates I know of are as follows:

  • While there is consensus on the dating of J, E, and D, the dating of P is somewhat controversial (700 vs 500 BCE).
  • The exact relationship of J and E is at times hard to work out, particularly because E has less material than J. Were parts of E ejected during the redaction process of JE? Or was E composed as a supplement to J, and not a standalone work?
  • It is hard to make out how the two redaction processes actually worked. The Hebrew Bible is among the earliest long works of prose known (earlier literature was largely poetic), so there is little comparable material to show how such editing was done.
  • There is consensus that J, E, P, and D were authored long after the events that they describe. They were undoubtedly influenced by early oral traditions. However, the extent of continuity and historical memory transferred from these oral traditions is in some doubt.


I was raised in an evangelical household, which means that growing up, I read the Hebrew Bible (known to Christians as the Old Testament) cover-to-cover several times. I found such reading difficult. Some of this was mere cultural distance: a kid in the 20th century CE is three millennia removed from the Canaanite culture in which the Bible was written.

But for me, the Hebrew Bible feels much easier to understand in light of the Documentary Hypothesis.

  • Contradictions are explained.
  • The within-source stories flow much better.
  • It is easier to understand the narrative discontinuities in the composite.
  • The diverse perspectives can be situated within their originating cultural milieu.

I wish more people knew about the Documentary Hypothesis for these reasons. Or better yet, could look at the labelled sources of the Hebrew Bible online. But for now, if you’d like to read the Hebrew Bible yourself with labelled sources, the best way is simply to purchase a book like The Bible With Sources Revealed, a copy of the complete Torah color-coded by authorship.

Until next time.


3 thoughts on “The Documentary Hypothesis”

  1. Nice summary. It is not unusual to encounter claims that the Documentary Hypothesis is dead, often as a not so thinly veiled attempt to reinforce traditional notions of Mosaic / divine authorship. That said, it does seem that there is a legitimate rift in the scholarship, which is nicely summarized in this Reddit thread, and in the introduction to a compilation of essays on the topic. However, the general postulate that the DH originally illuminated – that the received text originates from multiple sources across time, combined by later redactors – remains intact in all cases. This basic precept is completely ignored by those who celebrate claims regarding the death of the Documentary Hypothesis. So even if some specific, rigid formulation of the DH does not stand the test of time, I think this more basic precept still supports the takeaways you enumerate as a means to radically clarify our understanding of the text.


  2. Last time I checked, most scholars date the bulk of J, E, D, and P to be post-700 BC (putting aside a few older poems). J is even debated to have been a product of an exilic sect; I tend to agree, considering its Babylonian-inspired creation myth. I think both J and P were exilic sects, D were the landowners (and propaganda historians) in Judah who never left, and E I believe could date as far back as the 720 BC Israelite refugees, or a later sect of them. Many scholars seem to bundle J and E into the same source, but I have my hesitations. But any of these could and do have a long editing and compositional history in themselves and could span a couple of centuries. Perhaps there is an ancient core to J that predates Genesis 2-3.

