Linguismstics: /lɪŋˈgwɪzm̩stɪks/
Let me introduce you to my relatives

I wrote this for another blog - essentially trying to introduce and work through some of the motivations for certain analyses of relative clauses without too much jargon…


            My research focuses on the syntax of relative clauses. A typical relative clause is a type of subordinate sentence which modifies a noun. For example,

(1)        a.         the book that I’m reading

            b.         that blog post you’ve written

            c.         the man who saw me

            d.         a type of subordinate sentence which modifies a noun

The clause following the noun in each example, e.g. ‘that I’m reading’ in (1a), is traditionally called the relative clause. Relative clauses are theoretically interesting for a number of reasons. Some syntactic ones are: they are optional, i.e. nouns do not require a relative clause; the noun being modified seems to play a role in both the main clause and the relative clause; relative clauses resemble other constructions to a greater or lesser extent, e.g. interrogatives, possessives, etc.

The head of the relative

            One of the major debates in the syntax of relative clauses lies in where we say the noun being modified originates in the syntactic structure (I will call this noun the relative head from now on). Consider the following example:

(2)        You wrote the book that I’m reading.

Intuitively the relative head ‘book’ is the direct object of the main clause verb ‘write’. We also understand that ‘book’ is the direct object of the relative clause verb ‘read’. How can it be two things at once?

            One option is to say that ‘book’ is base-generated, i.e. enters the syntactic structure, as the direct object of ‘write’ and is co-indexed with a relative pronoun in the relative clause (if two items are co-indexed, it basically means they refer to the same thing). This relative pronoun may be ‘who’, ‘which’ or silent (or ‘that’ depending on your analysis). Adopting the silent option and symbolising this silent pronoun as REL.PRO (for ‘relative pronoun’), the sentence in (2) would have the structure in (3) (the relative clause is placed in square brackets and the co-indexing is symbolised by the subscript ‘i’).

(3)        You wrote the book_i [REL.PRO_i that I’m reading]

But how does this capture the idea that ‘book’ is also the direct object of ‘read’? For this we say that the REL.PRO has moved from the direct object position of ‘read’ to the left edge of the relative clause. This gives the structure in (4).

(4)        You wrote the book_i [REL.PRO_i that I’m reading REL.PRO_i]

This captures our intuitions about how ‘book’ relates to the main clause and the relative clause. This is the sort of analysis found in Chomsky (1977) and Sauerland (2003), for example.

            Another option would be to abandon co-indexing and say that ‘book’ is base-generated as the direct object of ‘read’. Instead of having a silent REL.PRO move to the left edge of the relative clause, the head of the relative itself moves (I use a subscript ‘1’ to symbolise that the two occurrences of ‘book’ are two copies of a single item rather than two independent items).

(5)        You wrote the [book_1 that I’m reading book_1]

We would then say that the copy of ‘book’ in the direct object position of ‘read’ is not pronounced but is nonetheless present in the structure since we are able to interpret ‘book’ as being the direct object of ‘read’. The copy at the left edge of the relative clause is pronounced, giving the sentence in (2). This is the sort of analysis found in Kayne (1994).
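For readers who find it easier to see this mechanically, here is a minimal Python sketch of the two analyses. It is only a toy: the names `Token`, `pronounce`, `structure_4` and `structure_5` are my own inventions, the structures are flattened into word lists, and I assume that the leftmost copy is the one that gets pronounced. The co-indexing between ‘book’ and REL.PRO is not modelled at all. The point is simply that both analyses yield the same spoken string, namely (2).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Token:
    word: str
    copy_index: Optional[int] = None  # tokens sharing an index are copies of one item
    silent: bool = False              # silent items (e.g. REL.PRO) are never pronounced


def pronounce(tokens):
    """Spell out a flattened structure.

    Assumption for this sketch: silent items are skipped, and of several
    copies of one item only the leftmost one is pronounced.
    """
    seen_copies = set()
    spoken = []
    for token in tokens:
        if token.silent:
            continue
        if token.copy_index is not None:
            if token.copy_index in seen_copies:
                continue  # a lower copy: present in the structure, but unpronounced
            seen_copies.add(token.copy_index)
        spoken.append(token.word)
    return " ".join(spoken)


# Structure (4): 'book' is the object of 'wrote'; a silent REL.PRO has moved
# from the object position of 'reading' to the left edge of the relative clause.
structure_4 = [
    Token("You"), Token("wrote"), Token("the"), Token("book"),
    Token("REL.PRO", silent=True), Token("that"), Token("I’m"),
    Token("reading"), Token("REL.PRO", silent=True),
]

# Structure (5): 'book' itself has moved, leaving an unpronounced copy behind.
structure_5 = [
    Token("You"), Token("wrote"), Token("the"), Token("book", copy_index=1),
    Token("that"), Token("I’m"), Token("reading"), Token("book", copy_index=1),
]

print(pronounce(structure_4))
print(pronounce(structure_5))
```

Both calls print the sentence in (2), which is exactly why the choice between the two analyses has to be argued on other grounds.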

The head, the ‘the’ and the relative clause

            The type of relative clause we have been looking at is called a restrictive relative because it restricts the possible denotation of the noun. For example, (6) means that you wrote something and that something is a book AND that something is being read by me. In other words, the direct object of ‘write’ has to satisfy both the condition of being a book and being something that I’m reading. It allows you to distinguish this book from one that I’m not reading.

(6)        You wrote the book that I’m reading.

To capture this, we say that the head of the relative and the relative clause are in the scope of the determiner ‘the’.

(7)        [the [book that I’m reading]]

This can be captured in the syntactic structure by saying that [book that I’m reading] forms a constituent which excludes the determiner ‘the’. Now we have an interesting problem: ‘the’ appears with nouns, not clauses, which might suggest the following structure.

(8)        [the [book [that I’m reading]]]

In this structure, ‘the’ requires a noun and so selects ‘book’. The relative clause modifies ‘book’ and so attaches to ‘book’. But there is evidence suggesting that the presence of ‘the’ is tied to the presence of the relative clause (a * means that the sentence is ungrammatical).

(9)        a.         London is beautiful.

            b.         *The London is beautiful.

            c.         The London that I remember is beautiful.

            d.         *London that I remember is beautiful.

A proper name, for example, ‘London’, cannot ordinarily appear with ‘the’ (hence the difference between (9a) and (9b)). However, when a proper name is modified by a relative clause, ‘the’ must appear (hence the difference between (9c) and (9d)). This suggests that ‘the’ requires the relative clause and not the noun! The following structure captures this idea (see Kayne, 1994).

(10)      [the [[book] that I’m reading]]

Now, we have to come up with a way of relating ‘the’ to the head of the relative ‘book’, unless we want to abandon the idea that ‘the’ typically appears with nouns (an idea which might not be as crazy as it sounds). We could say that ‘the’ and ‘book’, by virtue of being close enough to each other in some non-technical sense, can enter into a relationship. Note that ‘book’ does not have a determiner of any kind. This is unusual in English.

(11)      a.         *I like book.

            b.         *Book is good.

We could therefore say that ‘book’ has an empty position for a determiner (I’ll call it D) that enters into a relationship with ‘the’ (see Bianchi, 2000).

(12)      [the [[D book] that I’m reading]]

            We can now make a prediction: if some other element occupies this D position, ‘the’ cannot form the required relationship and the sentence will be ungrammatical. A preposed genitive competes with ‘the’ in English, as seen in (13).

(13)      a.         the book

            b.         Bob’s book

            c.         *the Bob’s book

Now, if a preposed genitive occupies the D position that ‘the’ is aiming to form a relationship with, there will be trouble because ‘the’ and a preposed genitive cannot both be related to this same position, as seen in (13c). If ‘Bob’s’ is present, ‘the’ cannot be, but if ‘the’ is absent, the relative clause must be absent too. This accounts for why (14) is ungrammatical.

(14)      *You wrote Bob’s book that I’m reading.

The only way to say what (14) intends to say is not to prepose the genitive, as in (15).

(15)      You wrote the book of Bob’s that I’m reading.

Since ‘Bob’s’ no longer occupies D, ‘the’ is free to form a relationship with D and the sentence is grammatical.
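The reasoning in this section can be summarised as a small checklist. The Python sketch below is a toy encoding of that reasoning for a singular count noun like ‘book’ and nothing more; the function name and its three parameters are hypothetical labels of mine, not part of any of the analyses cited.

```python
def is_predicted_grammatical(has_the: bool,
                             has_preposed_genitive: bool,
                             has_relative_clause: bool) -> bool:
    """Toy version of the predictions discussed above (singular count nouns only)."""
    # 'the' and a preposed genitive cannot both be related to D, cf. (13c).
    if has_the and has_preposed_genitive:
        return False
    # The relative clause is tied to the presence of 'the', cf. (14).
    if has_relative_clause and not has_the:
        return False
    # A singular count noun cannot be completely bare, cf. (11).
    return has_the or has_preposed_genitive


# (example label, has_the, has_preposed_genitive, has_relative_clause)
examples = [
    ("(13a) the book", True, False, False),
    ("(13b) Bob’s book", False, True, False),
    ("(13c) the Bob’s book", True, True, False),
    ("(14) Bob’s book that I’m reading", False, True, True),
    # In (15) the genitive is not preposed, so it does not occupy D.
    ("(15) the book of Bob’s that I’m reading", True, False, True),
]

for label, the, genitive, relative in examples:
    mark = "" if is_predicted_grammatical(the, genitive, relative) else "*"
    print(f"{mark}{label}")
```

The asterisks the loop prints line up with the judgements given in (13) to (15).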


            That concludes this introduction to the syntax of relative clauses. We have seen that relative clauses are complex and have quite a counter-intuitive structure once we delve into the systematic patterns of grammaticality and ungrammaticality manifested in English. But that is the way of things – language is a part of the natural world and, just as theoretical physics is dumbfounding us with discoveries into the weird and wonderful nature of the physical universe, so too can theoretical linguistics make discoveries about the underlying structures of our linguistic universe (and all that without a Large Hadron Collider … for now).


Bianchi, V. (2000). The raising analysis of relative clauses: a reply to Borsley. Linguistic Inquiry, 31(1), 123–140.

Chomsky, N. (1977). On Wh-Movement. In P. Culicover, T. Wasow, & A. Akmajian (Eds.), Formal Syntax (pp. 71–132). New York: Academic Press.

Kayne, R. S. (1994). The Antisymmetry of Syntax. Cambridge, MA: MIT Press.

Sauerland, U. (2003). Unpronounced heads in relative clauses. In K. Schwabe & S. Winkler (Eds.), The Interfaces: Deriving and interpreting omitted structures (pp. 205–226). Amsterdam/Philadelphia: John Benjamins.

This language is so conservative it’s dead!
Seminar soundbite (the language being referred to is Middle English, btw, but I can easily think of other languages to which this could be applied)
Plato’s Problem

When children learn their native language(s), they receive very little in the way of explicit instruction if any at all. The utterances a child is exposed to are not perfect - they contain false starts, repetitions, and various other mistakes and errors. Furthermore, children are not told what is grammatical and what is not. This latter point is especially important because it means that there is no negative evidence. Yet somehow a child is able to glean from such data the grammatical rules of their language(s). Consider the following:

(1) What did you say that Bill thought that John saw?

(2) *What did you say that Bill met the man who saw?

Despite perhaps never coming across utterances like (1) or (2), English speakers know that (1) is a grammatical sentence of English whilst (2) is not. But where did this knowledge come from? Another way of putting this question is to ask how we can know so much given how little we have to learn from. This is Plato’s Problem.

In modern linguistics, the solution to Plato’s Problem is to say that humans come equipped (i.e. it is in our genetics) with certain bits of knowledge, e.g. we instinctively/innately know how to analyse certain types of data in our environment. If we believe that any of these certain bits of knowledge that we make use of in language acquisition is specific to language, we arrive at the idea of Universal Grammar (UG). 

A Plan for the Improvement of English Spelling

In Year 1 that useless letter “c” would be dropped to be replased either by “k” or “s”, and likewise “x” would no longer be part of the alphabet. The only kase in which “c” would be retained would be the “ch” formation, which will be dealt with later. Year 2 might reform “w” spelling, so that “which” and “one” would take the same konsonant, wile Year 3 might well abolish “y” replasing it with “i” and Iear 4 might fiks the “g/j” anomali wonse and for all.

Jenerally, then, the improvement would kontinue iear bai iear with Iear 5 doing awai with useless double konsonants, and Iears 6-12 or so modifaiing vowlz and the rimeining voist and unvoist konsonants. Bai Iear 15 or sou, it wud fainali bi posibl tu meik ius ov thi ridandant letez “c”, “y” and “x” — bai now jast a memori in the maindz ov ould doderez — tu riplais “ch”, “sh”, and “th” rispektivli.

Fainali, xen, aafte sam 20 iers ov orxogrefkl riform, wi wud hev a lojikl, kohirnt speling in ius xrewawt xe Ingliy-spiking werld.

A Plan for the Improvement of English Spelling by M. J. Shields

The full letter may be read here.

(via unconsciousplots)

…a lifetime of linguistic study assured Ransom almost at once that these were articulate noises. The creature was *talking*. It had language. If you are not a philologist, I am afraid you must take on trust the prodigious emotional consequences of this realisation in Ransom’s mind. A new world he had already seen - but a new, an extraterrestrial, a non-human language was a different matter… The love of knowledge is a kind of madness. In the fraction of a second which it took Ransom to decide that the creature was really talking, and while he still knew that he might be facing instant death, his imagination had leaped over every fear and hope and probability of his situation to follow the dazzling project of making a Malacandrian grammar. ‘An Introduction to the Malacandrian Language’ - ‘The Lunar Verb’ - ‘A Concise Martian-English Dictionary’ … the titles flitted through his mind. And what might one not discover from the speech of a non-human race? The very form of language itself, the principle behind all possible languages, might fall into his hands.
C.S. Lewis’ Out of the Silent Planet (chapter 9), when Ransom (a Cambridge philologist, and based on J.R.R. Tolkien) first encounters a hross on the planet Malacandra.
Three Factors

I haven’t written anything for a while since I’ve been so busy recently (been working a lot on the typology of relative clauses - perhaps I’ll post something about that soon). This evening I watched an interview (on YouTube) from the late 1970s (1977, I think) with Chomsky. The interview is from a series called “Men of Ideas” produced by the BBC.

It’s a great interview - stimulating and perceptive questions and, of course, stimulating and perceptive answers! Many things caught my attention, one of which was that Chomsky spoke of two factors playing a role in language design, namely the biological endowment (i.e. Universal Grammar (UG) - the species- and domain-specific cognitive ‘organ’ dealing with language) and linguistic experience (i.e. the primary linguistic data from which we acquire our native language(s)). The idea was that all humans are born with a capacity for language, i.e. UG is innate in humans, provided by our genetic makeup. The data we encounter as children is so scant and degenerate (full of false starts, sentence fragments, etc.) that it would be virtually impossible to acquire a grammar in the short amount of time that it takes any normal child to do so the world over…unless we came pre-programmed for such a task. The idea was that UG was this pre-programming. UG was thought to be richly specified with linguistic principles (all genetically encoded) that would help children in the task of language acquisition by severely constraining the hypotheses that a child could postulate when acquiring a grammar to generate the data they were exposed to. That was then.

Nowadays, Chomsky speaks not of two factors, but of three factors of language design. UG and the primary linguistic data are the first and second factors respectively. The third factor is made up of general principles of data analysis and efficient computation. The idea is that children can bring these domain-general (i.e. not exclusively related to language) tools to language acquisition. The third factor allows the first factor, i.e. UG, to be made much smaller. In other words, UG is no longer thought to be as richly specified as it once was. In fact, the aim is to make UG as small as possible. This is desirable for a number of reasons, but a particularly pertinent reason concerns the evolution of language, i.e. the evolution of the capacity for language in humans. As an ‘organ’ of the mind, UG is a biological entity, and as such it must have evolved (though not necessarily through direct selection, as Chomsky points out in the interview!). Given that chimpanzees do not have UG, UG must have evolved some time in the last 5-7 million years or so. It is therefore unlikely that something as rich and complex as UG as it was originally conceived could have evolved in such an evolutionarily short space of time. The third factors, however, need not be specific to language, nor do they need to be specific to humans. Therefore, it is conceptually desirable if we can explain the design of language in terms of third factors. This is, in fact, viewed as the only source of principled explanation in Chomskyan syntax nowadays.

Importantly, although UG is far smaller than it was and may only consist of very few things (a recursive structure building operation at the very least), it is nevertheless still thought to exist. The UG hypothesis in its modern incarnation is thus still very different from approaches which deny the existence of UG altogether.

Anyway, if you’re interested, I suggest reading Chomsky’s (2005) paper:

Chomsky, N. (2005). Three Factors in Language Design. Linguistic Inquiry, 36(1), 1–22.


Quick question: I understand when to use être vs avoir as the helping verb for the French passé composé. However, what are the linguistic reasons as to why être is used in some cases and why avoir is used in others? At first glance, it seems that one can decide based on transitivity (avoir used…

I think that in French the auxiliary is largely lexically determined, i.e. ‘être’ is used with the 13 MRS VAN DE TRAMP verbs when these are used intransitively. If one of these verbs is used transitively then ‘avoir’ is used as the auxiliary instead. So, descriptively, we can say ‘avoir’ is used as the auxiliary for all transitive verbs and all non-MRS VAN DE TRAMP intransitive verbs. In other languages, such as Italian and German, I think the relationship between auxiliary selection and (in)transitivity is more transparent than in French. Bear in mind too that many linguists will subdivide intransitives into unergative and unaccusative verbs based on whether the language treats the argument of an intransitive as an external or an internal argument. If I remember correctly, auxiliary selection in Italian depends on whether the verb is considered unergative or not, but don’t quote me on that!
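The descriptive rule just stated is simple enough to write down as a lookup. Here is a minimal Python sketch of it, with two caveats: the verb list is the usual expansion of the mnemonic as I remember it, and the sketch encodes only the rule as stated here (it says nothing about pronominal verbs, for instance). The names `MRS_VAN_DE_TRAMP` and `passe_compose_auxiliary` are just my own labels.

```python
# Assumed expansion of the mnemonic: one verb per letter of MRS VAN DE TRAMP.
MRS_VAN_DE_TRAMP = frozenset({
    "monter", "rester", "sortir",
    "venir", "aller", "naître",
    "descendre", "entrer",
    "tomber", "retourner", "arriver", "mourir", "partir",
})


def passe_compose_auxiliary(infinitive: str, transitive: bool) -> str:
    """Return the auxiliary predicted by the descriptive rule above."""
    # 'être' only for a listed verb that is used intransitively...
    if infinitive.lower() in MRS_VAN_DE_TRAMP and not transitive:
        return "être"
    # ...and 'avoir' for everything else.
    return "avoir"


print(passe_compose_auxiliary("sortir", transitive=False))  # intransitive use of a listed verb
print(passe_compose_auxiliary("sortir", transitive=True))   # transitive use of the same verb
print(passe_compose_auxiliary("dormir", transitive=False))  # intransitive verb not on the list
```

Note that the function needs the list as well as the transitivity flag, which is just another way of saying that the choice is lexically determined rather than following from (in)transitivity alone.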

I always start off groaning about puns, but by the time I got to the end I think I was genuinely amused!



I went to a talk given by the man who developed Parseltongue for the Harry Potter films, Prof Francis Nolan. Just a few ‘facts’ about the language with some of the ‘explanations’ given:


It’s got no rounded vowels or labial consonants (because snake lips aren’t very flexible)

It’s got pharyngeal consonants (because some snakes like to constrict things)

It’s got a large number of fricatives, which also exhibit a length contrast (because…snakes)


It’s got basic VSO order

It’s got postpositions (typologically highly unusual for a VSO language)

It’s ergative


The word ‘muggle’ has been borrowed into English from Parseltongue ‘ŋaʔalas’ - obviously!

The male counterpart to the map I reblogged a few days ago!