Introduction to Theoretical Linguistics

Lecture notes

Linguistics is the scientific study of language, concerned with the question:

What do we know when we know language?

The crucial thesis that underlies modern linguistics (and one I will try to present and defend in this lecture) is as follows:

Language is a combinatory system with multiple levels of representation.

It can be compared to Lego: while different languages might look very diverse, the nature of the building blocks they use and the method of putting these blocks together are always the same.

Only such a system allows a use of language that is productive and creative, so that new utterances can be systematically constructed on the spot out of the piece-parts made available by the language being spoken. Thus, while we know only a finite number of words, we have the potential to construct an infinite number of sentences out of them.

Linguist Steven Pinker illustrates this point in his book "The Language Instinct" by improving on the Guinness Book of Records entry for the longest sentence in English (1,300 words, from William Faulkner's "Absalom, Absalom!"): They both bore it as though in deliberate flagellant exaltation...

Steven Pinker’s new submission: Faulkner wrote, "They both bore it as though in deliberate flagellant exaltation..."

Here’s an example of linguistic analysis for a sentence, which will identify some of the building blocks we postulate for language, and tell you something about the ways of combining them.

Let us consider the utterance

Pete saw a table, and he liked it.

We don't just perceive a stream of sounds, like a bird's song - we perceive a sequence of sound elements, like p or a, which linguists call phonemes.

These come structured in syllables, which always have a sonorous nucleus - usually a vowel, but sometimes a consonant, like the "l" in "table". Often syllables also contain consonants at the beginning - the onset - or at the end - the coda. The nucleus together with the coda forms the rhyme - the part that stays the same in rhyming words.
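The onset/nucleus/coda division can be sketched as a small data structure. This is only a toy illustration (the field names are ours, and it operates on letters rather than real phonemes):

```python
from dataclasses import dataclass

@dataclass
class Syllable:
    onset: str    # consonants before the nucleus (may be empty)
    nucleus: str  # the sonorous core, usually a vowel
    coda: str     # consonants after the nucleus (may be empty)

    @property
    def rhyme(self) -> str:
        # nucleus + coda: the part that stays the same in rhyming words
        return self.nucleus + self.coda

# "stand" and "band" rhyme: same nucleus and coda, different onsets
assert Syllable("st", "a", "nd").rhyme == Syllable("b", "a", "nd").rhyme
```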


We can describe the rules for stress placement by referring to syllable structure. In fact, there are lots of generalisations about the sound structure of a language which can be easily stated by referring to phonemes, onsets, rhymes, and syllables.


A part of linguistics that studies the sound structure of language is called phonology.


Another way of thinking about building blocks of language is to think about the smallest meaningful units in it, called morphemes. For example, the word liked consists of two morphemes – one pronounced [laık] and describing the act of liking, and another pronounced [t] in this case and denoting past tense.


It matters how we put morphemes together into words – for example, the word "unlockable" consists of three morphemes: "un", "lock", and "able", but has two incompatible meanings depending on how these are put together:


un + lockable = doesn't have a lock, so we cannot lock it.

unlock + able = has a key, so we can unlock it.
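The two readings can be computed mechanically from the two bracketings. Here is a toy sketch in Python; the tuple encoding and the gloss strings are our own invention, not standard notation:

```python
def gloss(tree):
    """Compute a (category, paraphrase) pair bottom-up from a morpheme tree.
    Toy affix semantics: V + able -> Adj 'can be V-ed';
    un + Adj -> negation; un + V -> reversal of the action."""
    if tree == "lock":
        return ("V", "lock")
    head, arg = tree
    if arg == "able":                      # V + able -> Adj
        _, stem = gloss(head)
        return ("Adj", f"can be {stem}ed")
    if head == "un":
        cat, meaning = gloss(arg)
        if cat == "Adj":                   # un + Adj: negation
            return ("Adj", f"not ({meaning})")
        return ("V", f"un{meaning}")       # un + V: reversal
    raise ValueError(tree)

# un + [lock + able]: cannot be locked
assert gloss(("un", ("lock", "able"))) == ("Adj", "not (can be locked)")
# [un + lock] + able: can be unlocked
assert gloss((("un", "lock"), "able")) == ("Adj", "can be unlocked")
```

The same three morphemes yield two meanings purely because they are grouped differently - exactly the point of the example.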


Going to larger units of linguistic structure, we can talk about words being combined into phrases, which in turn combine into sentences. Here’s how we might analyse a sentence “Pete saw a table”:


          S
         / \
       NP   VP
       |   /  \
     Pete V    NP
          |   /  \
        saw Det   N
             |    |
             a  table
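The same tree can be encoded as nested tuples, and the constituents read off it mechanically. A sketch (the encoding is ours, not a standard library format):

```python
# A phrase-structure tree as (label, children...) tuples; leaves are words.
tree = ("S",
        ("NP", "Pete"),
        ("VP",
         ("V", "saw"),
         ("NP", ("Det", "a"), ("N", "table"))))

def constituents(t):
    """Return (words, spans): the yield of t, plus a (label, yield) pair
    for every node -- each one a constituent of the sentence."""
    if isinstance(t, str):
        return [t], []
    words, spans = [], []
    for child in t[1:]:
        w, s = constituents(child)
        words += w
        spans += s
    spans.append((t[0], " ".join(words)))
    return words, spans

_, spans = constituents(tree)
assert ("NP", "a table") in spans
assert ("VP", "saw a table") in spans
assert ("S", "Pete saw a table") in spans
```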


At the next level of linguistic representation, we can look at sentence meanings and try to determine under what conditions a sentence is true. More importantly, how does each part of the sentence contribute to the meaning of the whole? The subfield of linguistics that considers these questions is called semantics.


There is more to meaning than that, of course. Pragmatics and the study of discourse address what remains of the meaning. They consider, for example, the questions:

What is the relationship between “Pete saw a table” and “he liked it”?

How do we compute the meaning of he and it? 

You'll hear much more about this in the next several days.


When we learn, perceive, or use language, we compute representations like the ones illustrated above.

How is our knowledge of language manifested in what we do when we use it? What constrains this computation?


There are some obvious answers to that – the nature of the ideas we wish to communicate, the nature of our motor/perception organs, etc.


In fact, the computation in learning, perceiving and producing language is also constrained by linguistic structure, i.e. by the architecture of language-knowledge itself.


For example, sounds of language are computed differently from ordinary sounds. While in general we are capable of perceiving very subtle differences between sounds, this ability becomes constrained when it comes to distinguishing phonemes. In an experiment where researchers manipulated a sound so that it changed continuously from b to p, the subjects kept hearing "b" until a certain threshold point, after which they heard only "p".


We can see this at work in experiments, by looking at our perception of sounds in foreign languages.

For instance, in Russian there is only one vowel similar to English [i:] and [i]. So, Russian speakers past the acquisition stage are unable to tell the difference between "sheet" and "shit".


This is actually a case in point showing that the levels and categories of linguistic structure apply across languages.


We can see the linguistic representations being computed by looking at speech errors – what happens when the computation goes wrong.

A speaker is under time pressure, typically choosing about three words per second out of a vocabulary of 40,000 or more, while at the same time producing perhaps five syllables and a dozen phonemes per second, using more than 100 finely-coordinated muscles, none of which has a maximum gestural repetition rate of more than about three cycles per second.

Word choices are being made, and sentences constructed, at the same time that earlier parts of the same phrase are being spoken.

Given the complexities of speaking, it's not surprising that about one slip of the tongue occurs, on average, per thousand words spoken.

Reverend William A. Spooner, Dean and Warden of New College, Oxford, during Victoria's reign, had an alleged propensity for exchange errors that gave the name spoonerism to this class of speech error. The term came into general use within his lifetime, referring most often to a phoneme exchange.

Some of the exchanges attributed (apocryphally) to Spooner are:

Work is the curse of the drinking classes.

(drink is the curse of the working classes)

stem morpheme exchange

You have tasted the whole worm.

(wasted the whole term)

phoneme exchange (onset)

...queer old dean...

(dear old queen, referring to Queen Victoria)

onset exchange

We can describe what happens in these speech errors by referring to units of linguistic structure: a feature of a sound, a sound (phoneme), a piece of a syllable, a syllable, a piece of morphology (morpheme), a word, or a phrase (a group of words that function together in a sentence).
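An onset exchange can be modelled directly: split each word into onset and rhyme, then swap the onsets. A rough Python sketch, operating on spelling rather than real phonemes (so it fails wherever spelling and sound diverge):

```python
VOWELS = set("aeiou")

def split_onset(word):
    """Split a word into onset and rhyme, approximated over letters:
    the onset is everything before the first vowel letter."""
    for i, ch in enumerate(word):
        if ch in VOWELS:
            return word[:i], word[i:]
    return word, ""          # no vowel letter: treat it all as onset

def spoonerize(w1, w2):
    """Exchange the onsets of two words, leaving the rhymes in place."""
    o1, r1 = split_onset(w1)
    o2, r2 = split_onset(w2)
    return o2 + r1, o1 + r2

# the classic "crushing blow" -> "blushing crow"
assert spoonerize("crushing", "blow") == ("blushing", "crow")
```

Notice that the error keeps each piece in its own structural slot: onsets trade places with onsets, never with rhymes - exactly the constraint discussed below.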

In a review article entitled "Speaking and Misspeaking" (published in Gleitman and Liberman, eds., An Invitation to Cognitive Science), Gary Dell gives the following made-up examples, all related to the target utterance "I wanted to read the letter to my grandmother."

  1. phrase (exchange):
  2. word (substitution): "I wanted to read the envelope to my grandmother."
  3. inflectional morpheme (shift):
  4. stem morpheme (exchange):
  5. syllable (anticipation):
  6. syllable onset (anticipation):
  7. phoneme (exchange):
  8. phonological feature (anticipation or perseveration):

Slips can occur at each of these levels. Linguists have accumulated large collections of speech errors, and used the statistical distribution of such errors to evaluate models of linguistic structure and the process of speaking. For instance, the distribution of unit sizes in a corpus of exchanges, reproduced below, has been argued to tell us that words, morphemes and phonemes are especially important units in the process of speaking:



Two observations are important for us here: ALL errors (in all languages!) are constrained by


(1) levels (that is, there are no phoneme/word exchanges, no word/phrase exchanges)

(2) categories (that is, there are no onset/rhyme exchanges, no noun/verb exchanges)


You will never hear:

- Vowel with Consonant:
  "Hauow thld" (hallow thud)

- Onset with Rhyme:
  "Udallow thud" (hallow thud)

- Phoneme Level with Word Level:
  "The cl is marketosed." (the market is closed)


So, the levels and categories of linguistic structure are very real. We actually manipulate them when we talk and listen, no matter what language we speak.


As we have seen so far, the structure of our linguistic knowledge constrains our computations during communication. Linguistic units like features, phonemes, morphemes, words, and phrases are the building blocks of that structure.


Now let us take a closer look at how some of these are put together.

The familiar linguistic term for the method of combining linguistic units is grammar.


There's more to language than meets the eye - sentences are not just bags of words.

For example, the meaning of a sentence depends not only on what units are present in it, but also on how these units are put together (like the meaning of “unlockable” depends on its morphological structure).

Here's an example from Groucho Marx in Animal Crackers, where the sentence is structurally ambiguous:

One morning I shot an elephant in my pajamas.

How he got into my pajamas I dunno.


The ambiguity here centers on the prepositional phrase in my pajamas: does it modify the noun elephant, or the entire verb phrase?

We can express these different meanings by grouping the words together in a particular way:

One morning I (shot (an elephant in my pajamas)). → I shot the elephant that was in my pajamas

One morning I (shot (an elephant) in my pajamas). → I shot the animal while wearing my pajamas

This grouping is visually represented by phrase structure trees – more convenient than just bracketing:


            S
           / \
         NP   VP
         |   /  \
         I  V    NP
            |   /  \
         shot  NP   PP
               |     |
        an elephant  in my pajamas


Notice that there is a node in the tree (the higher NP) that has "an elephant" and "in my pajamas" (but nothing else) together under it. This expresses the claim that the string of words "an elephant in my pajamas" functions as a unit in this sentence.
In a way, this is saying that "an elephant in my pajamas" is just as good as any other unit of the same type. The way it functions in the sentence is the same as any other NP (noun phrase), for instance, a pronoun or a name:
            S
           / \
         NP   VP
         |   /  \
         I  V    NP
            |    |
         shot  it/John
Compare this with the structure below, where there is no such node:
            S
           / \
         NP   VP
         |   /  \
         I  VP   PP
           /  \    \
          V    NP   in my pajamas
          |    |
       shot   an elephant


Instead, here "in my pajamas" combines with the entire verb phrase, while "shot an elephant" forms a unit to the exclusion of the prepositional phrase (as the lower VP node indicates). So, we cannot replace "an elephant in my pajamas" with anything without affecting the verb. But we can replace the VP (verb phrase) "shot an elephant" with an appropriate kind of unit:


            S
           / \
         NP   VP
         |   /  \
         I  VP   PP
            |     \
            V      in my pajamas
            |
          sleep
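The constituency claim behind the two readings can be checked mechanically: encode each parse as nested tuples and ask whether a given word string is the exact yield of some node. A toy sketch (the encoding and function names are ours):

```python
# The two parses of the ambiguous sentence; leaves are words.
np_attach = ("S", ("NP", "I"),
             ("VP", ("V", "shot"),
              ("NP", ("NP", "an", "elephant"),
               ("PP", "in", "my", "pajamas"))))
vp_attach = ("S", ("NP", "I"),
             ("VP", ("VP", ("V", "shot"), ("NP", "an", "elephant")),
              ("PP", "in", "my", "pajamas")))

def is_constituent(tree, words):
    """True iff some node in the tree spans exactly `words`."""
    def leaves(t):
        return [t] if isinstance(t, str) else [w for c in t[1:] for w in leaves(c)]
    def nodes(t):
        yield t
        if not isinstance(t, str):
            for c in t[1:]:
                yield from nodes(c)
    return any(leaves(n) == words for n in nodes(tree))

target = "an elephant in my pajamas".split()
assert is_constituent(np_attach, target)       # a unit: the elephant wears them
assert not is_constituent(vp_attach, target)   # not a unit: the shooter wears them
assert is_constituent(vp_attach, "shot an elephant".split())
```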


Newspaper headlines are a good source of amusing structural ambiguities.

Dr. Ruth to talk about sex with newspaper editors

Enraged cow injures farmer with ax

Killer sentenced to die for second time in 10 years

Lawyers give poor free legal advice

Think about the source of these ambiguities, in general terms: which syntactic relation is the source of the problem?

Just to make sure we get the connection between the structure and the meaning of a sentence, try grouping words into constituent phrases in the following sentences, so that your grouping corresponds to a particular picture (don’t try to label the phrases, just draw the chief lines for the trees). As a method of checking your grouping, try replacing phrases with “John” and “sleep” as I did above:


I saw an owl with a telescope.                I saw an owl with a telescope.



The grouping of words into constituents of a sentence isn't just a convenient way of representing structural ambiguities. It is, in fact, the only way to express some processes in a language:

A simple (and inadequate) rule for creating a question from an English sentence with the verb “to be” is that you take that verb and move it to the front of the sentence.

the dog is in the yard

is the dog __ in the yard?

But what if there's more than one such verb in the sentence? You can't just take the first one you find.

the dog that is in the yard is named Rex

*is the dog that __ in the yard is named Rex?

Rather, we need to identify the verb which is serving as the head of the main clause of the sentence:

the dog that is in the yard is named Rex

is the dog that is in the yard __ named Rex?

We can't understand this distinction without knowing how the pieces of the sentence fit together, i.e. their constituency.

In this case, we need to understand that the subject of the sentence might be complex, potentially containing one or more verbs of its own (in relative clauses that modify a noun).

[ my dog ] is named Rex

[ that dog ] is named Rex

[ the dog you just saw ] is named Rex

[ the dog that is in the yard ] is named Rex

[ the dog whose owner was arrested yesterday by the police for using him in a drug-running scheme ] is named Rex

All these sentences have the same structure except for the contents of the subject; for operations that ignore the internal structure of the subject, such as inversion of the subject and the auxiliary, they all behave the same.
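The structure-dependence of inversion can be sketched in a few lines of Python: a naive rule that fronts the first "is" in the flat word string produces the starred sentence, while a rule handed the subject as one bracketed constituent fronts the right one. A toy illustration (the function names and interfaces are ours):

```python
def naive_question(words):
    """Front the FIRST 'is' found in the flat string -- the wrong rule."""
    i = words.index("is")
    return " ".join(["is"] + words[:i] + words[i + 1:]) + "?"

def question(subject, rest):
    """Front the auxiliary of the main clause. The subject arrives as one
    bracketed unit, so any 'is' inside its relative clause is invisible
    to the rule -- inversion sees constituents, not words."""
    assert rest[0] == "is"
    return " ".join(["is"] + subject + rest[1:]) + "?"

sentence = "the dog that is in the yard is named Rex".split()

# the flat-string rule yields the ungrammatical starred sentence
assert naive_question(sentence) == "is the dog that in the yard is named Rex?"

# the constituent-aware rule yields the grammatical question
assert question("the dog that is in the yard".split(),
                "is named Rex".split()) \
       == "is the dog that is in the yard named Rex?"
```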

This generalisation is true for all languages. We can capture patterns in a grammar of a language, and also patterns that hold cross-linguistically by using structural representations like phrase structure trees.


To conclude, we have seen that language is like a multi-layered Lego, involving linguistic units and a system for combining them on many different levels. Only such a system can explain how users who know only a limited number of phonemes, morphemes, words, and ways of combining them can be so productive and creative in using language. We have also taken a closer look at how words are put together into sentences in syntax, and how we can use trees to represent that. Take that away now, to use in the days to come!

Materials in this lecture relied heavily on the lecture notes for LING 001: Introduction to Linguistics, by Professor Gene Buckley. The discussion was also inspired by the introductory chapter of "Foundations of Language", a 2002 book by Professor Ray Jackendoff.