Reading an article about the latest apparent murder by the North Korean royal family in this week’s Economist (February 18, 2017), I came upon this remarkable example of English hypotaxis:

Kim Kwang-jin, a defector who once worked in North Korea’s “royal court” economy, says that even if rumours that China had hoped to install Jong-nam if Jong-un fell from power are far-fetched, China would nonetheless have seen Jong-nam as useful leverage.

Hypotaxis is packing clauses inside other clauses as subparts (as opposed to parataxis, where you just string clauses together one after another, leaving the interlocutor to figure out how to connect them up semantically). When I saw the remarkably hypotactic example above, I first thought it might be a case of triple center-embedding, which would be exciting.

A subordinate clause is center-embedded if it has parts of its containing clause both before it and after it. Understanding center-embedded clauses causes the human sentence-processing mechanism some difficulty (whereas coping with subordinate clauses tucked on the ends of their containing clauses is quite easy — and of course much more like parataxis). A double center-embedding has a clause center-embedded inside another center-embedded clause, and causes much worse difficulty.

Triple center-embeddings are incredibly rare, even in writing. Fred Karlsson and others have been searching intensively for them in Danish, English, Finnish, French, German, Latin, and Swedish for more than a quarter of a century, and by 2007 had only found 13 (see Karlsson’s “Constraints on multiple center-embedding of clauses,” Journal of Linguistics 43.2: 365–392, 2007). At least one of the few he found is in an English criminal-law statute, which hardly counts as human language at all.

Let me show you the Economist sentence again with its clauses marked by numbered brackets (clause number n, for n ≥ 0, where clause 0 is the whole sentence, will appear between ‘[n’ and ‘n]’).

[Note for linguists only: I'm minimizing the clause count in three ways: [i] I don’t assume an extra clause node for that + Σ, I just count Σ; [ii] I ignore nonfinite clauses such as complements of auxiliary or lexical verbs; and [iii] I don’t treat Jong-nam as useful leverage as a verbless clause, despite its semantic equivalence to that Jong-nam could be useful leverage. These decisions are not crucial.]

Here it is:
[0 Kim Kwang-jin, a defector [1 who once worked in North Korea’s “royal court” economy, 1] says that [2 even if [3 rumours that [4 China had hoped to install Jong-nam if [5 Jong-un fell from power 5] 4] are far-fetched 3] [6 China would nonetheless have seen Jong-nam as useful leverage 6] 2] 0]

As you should be able to see from the bracketing,

  • Clause 6 is on the end of Clause 2;
  • Clause 5 is on the end of Clause 4;
  • Clause 4 is center-embedded in Clause 3;
  • Clause 3 is center-embedded in Clause 2;
  • Clause 2 is on the end of Clause 0; and
  • Clause 1 is center-embedded in Clause 0.

So the sentence I found has only double center-embedding: Clause 4 is center-embedded in Clause 3, which is center-embedded in Clause 2. (The extra center-embedding of Clause 1 in Clause 0 is not relevant because Clause 2 isn’t embedded in Clause 1, so there’s no chain of three center-embeddings.) I’m still looking for a triple.
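For readers who would rather not count brackets by eye, the check can be mechanised. What follows is only a minimal Python sketch, not anything taken from Karlsson's paper. The function names (parse, is_center_embedded, chain_depth) are invented for illustration, and the sketch assumes the simple criterion given earlier: a clause counts as center-embedded if its containing clause has words both before it and after it. The bracketed sentence is typed in as a string, with each bracket label separated from the words by spaces.

```python
import re

# The Economist sentence in the numbered-bracket notation used in this post.
SENTENCE = (
    "[0 Kim Kwang-jin, a defector [1 who once worked in North Korea’s "
    "“royal court” economy, 1] says that [2 even if [3 rumours that "
    "[4 China had hoped to install Jong-nam if [5 Jong-un fell from power "
    "5] 4] are far-fetched 3] [6 China would nonetheless have seen "
    "Jong-nam as useful leverage 6] 2] 0]"
)

def parse(text):
    """Map clause number -> [start, end, parent], measured in word positions."""
    clauses, stack, pos = {}, [], 0
    for tok in text.split():
        opened = re.fullmatch(r"\[(\d+)", tok)
        closed = re.fullmatch(r"(\d+)\]", tok)
        if opened:
            n = int(opened.group(1))
            parent = stack[-1] if stack else None
            clauses[n] = [pos, None, parent]
            stack.append(n)
        elif closed:
            n = int(closed.group(1))
            if not stack or stack.pop() != n:
                raise ValueError(f"unbalanced bracket for clause {n}")
            clauses[n][1] = pos
        else:
            pos += 1  # an ordinary word
    if stack:
        raise ValueError(f"unclosed clauses: {stack}")
    return clauses

def is_center_embedded(clauses, n):
    """True if the parent clause has words both before and after clause n."""
    start, end, parent = clauses[n]
    if parent is None:
        return False
    p_start, p_end, _ = clauses[parent]
    return p_start < start and end < p_end

def chain_depth(clauses, n):
    """Length of the unbroken chain of center-embeddings ending at clause n."""
    depth = 0
    while is_center_embedded(clauses, n):
        depth += 1
        n = clauses[n][2]  # move up to the containing clause
    return depth

clauses = parse(SENTENCE)
centered = sorted(n for n in clauses if is_center_embedded(clauses, n))
deepest = max(chain_depth(clauses, n) for n in clauses)
print("center-embedded clauses:", centered)
print("longest chain:", deepest)
```

Run on the sentence above, the sketch reports Clauses 1, 3, and 4 as center-embedded and a longest chain of 2, which is the double center-embedding described in the text. The word-position test is deliberately crude: it treats any following material inside the containing clause, including another subordinate clause, as "parts after".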

Triple center-embedding seems to be the absolute maximum that actually turns up in practice. This is theoretically significant, because a celebrated argument that Noam Chomsky gave in 1956 to show that English does not belong to a computationally important class called the finite state languages depends on the claim (which perhaps we should assume on theoretical grounds) that clauses can be center-embedded n deep for any positive integer n. Karlsson’s survey of the concrete evidence suggests a real-life upper bound of n ≤ 3 — disappointingly low even if you accept that (as it were) the linguistic spirit is willing but the psycholinguistic flesh is weak.
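The role the bound plays in that argument can be illustrated with a toy example. The Python sketch below rests on an assumption that is not in the Economist data or in Karlsson's survey: it uses the familiar abstraction in which a center-embedding of depth n is modelled as the string a^n b^n (n openers followed by n matching closers). The names MAX_DEPTH, nested_bounded, and nested_unbounded are hypothetical.

```python
import re

MAX_DEPTH = 3  # the real-life upper bound suggested by Karlsson's survey

# With a fixed bound the language is a finite list of strings
# (ab, aabb, aaabbb), so an ordinary regular expression, and hence
# a finite-state device, can recognize it.
BOUNDED = re.compile(
    "|".join("a" * n + "b" * n for n in range(1, MAX_DEPTH + 1))
)

def nested_bounded(s):
    """Accept a^n b^n only for 1 <= n <= MAX_DEPTH."""
    return BOUNDED.fullmatch(s) is not None

def nested_unbounded(s):
    """Accept a^n b^n for every n >= 1.

    No finite-state device can do this for all n; this function
    sidesteps the problem by measuring the length of the input.
    """
    n = len(s) // 2
    return n >= 1 and s == "a" * n + "b" * n

print(nested_bounded("aaabbb"), nested_bounded("aaaabbbb"))
print(nested_unbounded("aaaabbbb"), nested_unbounded("aab"))
```

The two recognizers agree up to depth 3 and part company at depth 4. That is the sense in which the claim about arbitrary n carries the weight of the argument: if nesting really stopped at three, a finite list would be enough.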

Double center-embeddings can just about be conversationally understood. Here is a 19-word sentence with the same crucial clause structure as the Economist one, but shorter and with a more homely topic (only the center-embedded clauses and their containing clause are bracketed):

John says that [ even if [ rumours that [ Mary would die if you left ] are overstated, ] she does love you ].

You could understand that if someone said it to you. But probably no one will. Double center-embeddings are rare (Karlsson and half a dozen other diligent hunters had only found 132 by 2007), and in speech they are fantastically rare. Listen to eloquent speakers and see if you can catch one.

[Note: My colleague Lucy Ferriss will draw you a Reed-Kellogg diagram of the Economist sentence if you request it; but she will expect a donation to the charity of her choice.]

