Part Of: Language sequence
Content Summary: 900 words, 9 min read
Syntax vs Semantics
In language, we distinguish between syntax (structure) and semantics (meaning).
Compare the following:
- “Colorless green ideas sleep furiously”
- “Sleep ideas colorless green furiously”
Both sentences are nonsensical (a semantic transgression). But the first is grammatically correct, whereas the second is malformed.
The brain responds differently to errors of syntax and semantics, as measured by EEG. Semantic errors produce a negative voltage deflection about 400 milliseconds after the error (the “N400”); syntactic errors produce a positive deflection after about 600 milliseconds (the “P600”).
Parts of Speech
To understand syntax more precisely, we must differentiate parts of speech. Consider the following categories:
- Noun (N). cat, book, computer, peace, …
- Verb (V). jump, chase, eat, sleep, …
- Adjective (A). long, purple, young, old, …
- Determiner (D). the, this, many, all, …
- Preposition (P). in, on, to, for, with, …
Nouns and verbs correspond to perception- and action-representations, respectively. They are an expression of the perception-action cycle. But to study syntax, it helps to set aside semantic content and explore how parts of speech relate to one another.
Phrases as Color Patterns
To understand syntax intuitively, start by adding color to sentences. Then try to find patterns of color unique to well-formed sentences.
Let’s get started!
“Noun-like” groups of words appear on either side of the verb. Let noun phrase (NP) denote such a group. Optional parts of speech are indicated by parentheses. Thus, our grammar contains the following rules:
- S → NP V NP
- NP → (D) (A) N
These rules explain why the following sentences feel malformed:
- “Chase dogs cats” (violates rule 1)
- “Old some dogs chase cats” (violates rule 2)
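With only these two rules, grammaticality amounts to matching a part-of-speech pattern. Here is a minimal Python sketch; the toy lexicon and the function names are illustrative assumptions, not a standard library:

```python
import re

# Toy lexicon mapping words to part-of-speech letters (an illustrative assumption).
LEXICON = {
    "dogs": "N", "cats": "N", "chase": "V",
    "some": "D", "the": "D", "old": "A",
}

def tag(sentence):
    """Convert a sentence into its part-of-speech string, e.g. 'NVN'."""
    return "".join(LEXICON[w] for w in sentence.lower().split())

# Rule 2: NP -> (D) (A) N becomes the pattern D?A?N.
NP = r"D?A?N"
# Rule 1: S -> NP V NP.
SENTENCE = re.compile(rf"^{NP}V{NP}$")

def is_grammatical(sentence):
    return bool(SENTENCE.match(tag(sentence)))

print(is_grammatical("Dogs chase cats"))           # True: well-formed
print(is_grammatical("Chase dogs cats"))           # False: violates rule 1
print(is_grammatical("Old some dogs chase cats"))  # False: violates rule 2
```

The regex works only because these two rules are non-recursive; we will need something stronger shortly.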
But these rules don’t capture regularities in how verbs are expressed. Consider sentences like “She ran” or “Dogs chase cats to the park.”
A verb phrase (VP) contains a verb, optionally followed by a noun phrase, and/or a preposition with its own noun phrase.
- S → NP VP
- NP → (D) (A) N
- VP → V (NP) (P NP)
This is better. Did you notice how we improved our sentence (S) rule? 🙂 Subject-only sentences (e.g. “She ran”) are now recognized as legal.
Prepositions are not limited to verb phrases, though. They also occur in noun phrases. Consider “the book on the table”: here, the preposition is “attached to” a noun phrase. We express this as a prepositional phrase (PP), which includes a preposition (e.g., “on”) and an optional noun phrase (e.g., “the table”).
- S → NP VP
- NP → (D) (A) N (PP)
- VP → V (NP) (PP)
- PP → P (NP)
Notice how we cleaned up the VP rule, and improved the NP rule.
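Notice that NP now contains PP, and PP contains NP: the grammar has become recursive, which a regular expression cannot capture in general. A recursive-descent recognizer handles it naturally. A minimal sketch, assuming the input is already a list of toy POS tags (each function returns the position after the phrase it consumed, or None):

```python
def parse_np(tags, i):
    # NP -> (D) (A) N (PP); greedily consume each optional element if present.
    if i < len(tags) and tags[i] == "D": i += 1
    if i < len(tags) and tags[i] == "A": i += 1
    if i < len(tags) and tags[i] == "N":
        i += 1
        j = parse_pp(tags, i)
        return j if j is not None else i
    return None

def parse_pp(tags, i):
    # PP -> P (NP)
    if i < len(tags) and tags[i] == "P":
        i += 1
        j = parse_np(tags, i)
        return j if j is not None else i
    return None

def parse_vp(tags, i):
    # VP -> V (NP) (PP)
    if i < len(tags) and tags[i] == "V":
        i += 1
        j = parse_np(tags, i)
        if j is not None: i = j
        j = parse_pp(tags, i)
        if j is not None: i = j
        return i
    return None

def is_sentence(tags):
    # S -> NP VP, and the whole input must be consumed.
    i = parse_np(tags, 0)
    if i is None:
        return False
    return parse_vp(tags, i) == len(tags)

# "the dog chased the cat on the mat" -> D N V D N P D N
print(is_sentence("D N V D N P D N".split()))  # True
print(is_sentence("V N N".split()))            # False: no initial NP
```

This greedy recognizer accepts or rejects strings; it does not decide *where* a PP attaches, a point we return to below.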
Congratulations! You have discovered the core rules of English. Of course, a complete grammar must also include pronouns (e.g., “yours”), conjunctions (e.g., “and”), and interjections (e.g., “wow!”). But these are fairly straightforward extensions to the above system.
These grammatical rules should interest more than just English speakers. As we will see later, a variant of these rules appears in all known human languages. This remarkable finding is known as universal grammar. Language acquisition is not about reconstructing syntax rules from scratch. Rather, it is about learning the parameters by which your particular natural language (English, Chinese, Egyptian) varies from the universal script.
From Rules to Trees
Our four rules are polymorphic: each permits more than one kind of structure (an NP may or may not contain a determiner, and so on). Unique rules are easier to analyze, so let’s expand every optional element, translating each rule into a set of rules with no optional parts.
Importantly, we can conceive of these unique rules as directions to construct a tree. The sentence “Dogs chase cats” becomes a tree with S at the root, branching into NP and VP, and so on down to the words themselves.
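One lightweight way to make this concrete is to encode each tree as nested tuples; an illustrative sketch (the encoding is my own, not a standard one):

```python
# Each node is (label, children...); a leaf pairs a part of speech with a word.
tree = ("S",
        ("NP", ("N", "dogs")),
        ("VP", ("V", "chase"),
               ("NP", ("N", "cats"))))

def leaves(node):
    """Read the sentence back off the tree's leaves, left to right."""
    label, *children = node
    if len(children) == 1 and isinstance(children[0], str):
        return [children[0]]  # lexical leaf, e.g. ("N", "dogs")
    return [word for child in children for word in leaves(child)]

print(" ".join(leaves(tree)))  # dogs chase cats
```

Walking the leaves left to right recovers the original sentence, which is exactly what serial speech production requires.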
Sentences are trees. These trees are not merely used to verify grammatical correctness. They also play a role in speech production, which transforms the language of thought (Mentalese) into natural language (e.g., English). For more on this, see my discussion of the Tripartite Mind.
How can (massively parallel) conscious thought be made into (painfully serial) speech utterances? With syntax! Simply take the concepts you wish to communicate, and construct a tree based on (a common set of) syntactic rules.
Tree construction also clarifies the phenomenon of linguistic ambiguity (wordplay). Consider the sentence “I shot a wolf in my pajamas”. Was the gun fired while you were wearing pajamas? Or was the wolf dressed in pajamas?
Both interpretations agree on the parts of speech (the colors). It is the higher-order structure that admits multiple choices. In practice, semantics constrains syntax: we tend to select the interpretation that feels most intuitive.
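The two readings can be written down as two distinct trees over the same words, differing only in where the PP “in my pajamas” attaches (nested-tuple encoding; an illustrative sketch):

```python
def leaves(node):
    """Read the words back off a (label, children...) tree."""
    label, *children = node
    if len(children) == 1 and isinstance(children[0], str):
        return [children[0]]
    return [w for c in children for w in leaves(c)]

# Reading 1: the PP attaches to the VP (the shooting happened in pajamas).
vp_attached = ("S",
    ("NP", ("N", "I")),
    ("VP", ("V", "shot"),
           ("NP", ("D", "a"), ("N", "wolf")),
           ("PP", ("P", "in"), ("NP", ("D", "my"), ("N", "pajamas")))))

# Reading 2: the PP attaches inside the NP (the wolf wore the pajamas).
np_attached = ("S",
    ("NP", ("N", "I")),
    ("VP", ("V", "shot"),
           ("NP", ("D", "a"), ("N", "wolf"),
                  ("PP", ("P", "in"), ("NP", ("D", "my"), ("N", "pajamas"))))))

print(leaves(vp_attached) == leaves(np_attached))  # True: same words, different trees
```

Since both trees flatten to the identical word sequence, the ambiguity lives entirely in the structure, not in the words.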
The Sociology of Linguistics
The above presentation uses a simple grammar for pedagogical reasons. I will at some point explain the popular X’ theory (pronounced “X bar”), which explores similarities between different phrase structures (e.g., NP vs PP). Indeed, there is a wide swathe of possible grammars that we will explore.
Generative grammar is part of the Symbolist tribe of machine learning. As such, this field has rich connections with algebra, production systems, and logic. For example, propositional logic was designed as the logic of sentences; predicate logic is the logic of phrases.
Other tribes besides the Symbolists care about language and grammar, of course. Natural Language Processing (NLP) and computational linguistics have been heavily influenced by the Bayesian tribe, and use probabilistic grammars (e.g., probabilistic context-free grammars, or PCFGs).
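The core idea of a PCFG can be sketched in a few lines: each rule carries a probability, and a parse tree's score is the product of the probabilities of the rules it uses. The rule probabilities below are made up purely for illustration:

```python
import math

# Toy rule probabilities (invented for illustration; a real PCFG estimates
# these from a treebank). Keyed by (left-hand side, right-hand-side labels).
RULE_PROB = {
    ("S", ("NP", "VP")): 1.0,
    ("NP", ("N",)): 0.7,
    ("NP", ("D", "N")): 0.3,
    ("VP", ("V", "NP")): 0.6,
    ("VP", ("V",)): 0.4,
}

def tree_prob(node):
    """Score a (label, children...) tree by multiplying its rule probabilities."""
    label, *children = node
    if len(children) == 1 and isinstance(children[0], str):
        return 1.0  # lexical leaf; word probabilities omitted for brevity
    rhs = tuple(child[0] for child in children)
    return RULE_PROB[(label, rhs)] * math.prod(tree_prob(c) for c in children)

tree = ("S", ("NP", ("N", "dogs")),
             ("VP", ("V", "chase"), ("NP", ("N", "cats"))))
print(tree_prob(tree))  # 1.0 * 0.7 * 0.6 * 0.7 ≈ 0.294
```

Given an ambiguous sentence, a PCFG parser scores every candidate tree this way and picks the most probable one, which is one formal answer to how “semantics constrains syntax.”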
More recently, the Connectionist tribe (and its deep learning technologies) is taking a swing at producing language. In fact, I suspect neural network interpretability will only be achieved once a Connectionist account of language production has matured.
Takeaways
- Language can be understood via syntax (structure) and semantics (meaning).
- Syntax requires delineating parts of speech (e.g., nouns vs verbs).
- Parts of speech occur in patterns called phrases. We can express these patterns as the rules of syntax.
- Sentences are trees. Syntax rules are instructions for tree construction.
- Sentence-trees provide insight into problems like sentence ambiguity.
Until next time.