Yoda-Speak: A Study of Yoda's Speaking Patterns and Their Frequencies

For many fans familiar with the original trilogy, the appearance of Yoda in the prequels brought some unexpected apprehension. While his new physical appearance, whether puppet or CG, received criticism, there is another area where we can observe subtle change: his speaking patterns. This area has stirred some mild controversy: some fans charge that Yoda's speaking patterns in the prequels sound much different from the original trilogy, while others maintain that he sounds basically the same. Examples can readily be shown illustrating both similarities and differences.

The contention of most critics is that Yoda in the prequels speaks in a more backwards, convoluted manner--"around the survivors, a perimeter create"--whereas in the originals his lines are much more normal than we would remember--"I cannot teach him. The boy has no patience." Keep in mind that many of the most ridiculous lines in the originals come from the early part of Empire Strikes Back, where Yoda is pretending to be crazy and speaks in a very inverted manner. Yoda still inverts his speech in the later sections, but in a way which recalls archaic forms of English. Screenwriter Lawrence Kasdan has claimed in Annotated Screenplays that he deliberately made Yoda sound medieval in the inversions.

Of course, many would say that the sentence structure of Yoda in the prequels, at least for the most part, is consistent with this presentation. So, what is the case? Well, this is one area where a systematic study can help us understand how Lucas and associates wrote dialogue for the character. This was done by my friend Drew Stewart, who first compiled a spreadsheet of all of Yoda's lines throughout the original trilogy and prequel trilogy, with each line flagged as either normal or "odd" (that is, containing inverted word order). He at first considered it by blocks of text, but because there were normal sentences surrounded by odd ones, he did it on a sentence-by-sentence basis. His results were as follows:

ESB, "crazy swamp creature": 13/23 lines are odd, 57%
ESB, serious Yoda: 30/73 lines are odd, 41%
ESB Total: 43/96 lines are odd, 45%
RotJ Total: 18/33 lines are odd, 55%
TPM Total: 19/26 lines are odd, 73%
AotC Total: 31/56 lines are odd, 55%
RotS Total: 46/66 lines are odd, 70%

Discounting the "crazy swamp creature" persona, the OT figures of 41% and 55% average to 48%, while the PT's figures average out to 66%, which means the Yoda of the prequels shows an increase of nearly 20 percentage points in the share of odd-sounding lines. It is tempting to say that the very large drop for Attack of the Clones may be due to co-screenwriter Jonathan Hales's influence, as this is the only film of the trio with an official co-writer credit, which happens to bring the odd-sentence frequency back to the levels of ESB/ROTJ, both of which were also co-written by someone other than Lucas (Lawrence Kasdan). This seems too big a coincidence to avoid pointing out. The Lucas-only screenplays are in the 70% range, while the Lucas+co-writer screenplays are all around the 50% range.
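For anyone who wants to recheck the arithmetic, here is a minimal Python sketch that recomputes the averages from the counts in the tabulation above. The names counts, mean_of_film_pcts and pooled_pct are illustrative choices of mine, not part of Drew Stewart's spreadsheet. The pooled figure (all odd lines divided by all lines) is an alternative way of averaging that the article does not use; it is included only for comparison.

```python
# Odd-line counts per film as (odd, total), copied from the tabulation above.
# "ESB serious" excludes the "crazy swamp creature" lines.
counts = {
    "ESB serious": (30, 73),
    "RotJ": (18, 33),
    "TPM": (19, 26),
    "AotC": (31, 56),
    "RotS": (46, 66),
}
OT = ["ESB serious", "RotJ"]
PT = ["TPM", "AotC", "RotS"]


def pct(odd, total):
    """Percentage of lines flagged as odd."""
    return 100.0 * odd / total


def mean_of_film_pcts(films):
    """Average the per-film percentages (the method used in the text)."""
    return sum(pct(*counts[f]) for f in films) / len(films)


def pooled_pct(films):
    """Alternative (my assumption): pool all lines, then take one percentage."""
    odd = sum(counts[f][0] for f in films)
    total = sum(counts[f][1] for f in films)
    return pct(odd, total)


for name, films in (("OT", OT), ("PT", PT)):
    print(name, round(mean_of_film_pcts(films)), round(pooled_pct(films)))
```

Averaging the per-film percentages weights each film equally, while pooling weights each line equally, so the two methods give slightly different gaps between the trilogies.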

But moving on: the above tabulation means that Yoda does indeed speak in a noticeably more backwards way in the prequels. However, at a difference of 18 percentage points, this may not be as extreme as some may be expecting; I would have expected the OT to be only 40% odd, with the prequels closer to 80% (a 40-point difference, roughly twice the reality). So, these are surprising and interesting results (comparison on a one-on-one basis makes it less surprising, however, as ESB's 41% and TPM's 73% show a pretty wide disparity). However, and this is a big however, this is not the end of the matter. We have yet to actually examine the sentence structures themselves. While Yoda shows only an 18-point overall difference in "backwards" sentences between the trilogies, we shall find upon examination that the nature of those sentences changes quite drastically.

To aid in this examination, Drew Stewart recruited his friend Tim MacSaveny, a professional linguist. His findings may be a bit dense with technical linguistic terminology, but I will try to keep things clear.

Both sets of films have normal Yoda lines as well as lines with inverted or out-of-order grammatical elements. The difference is that the inverted lines in the original trilogy are archaically structured but nonetheless grammatically sound. This probably reflects screenwriter Lawrence Kasdan's desire that they sound medieval. On the other hand, many of the prequel trilogy lines are both inverted and grammatically incorrect. This will be elaborated on in a moment, and it largely has to do with the position and order of the subject, object and verbs in the sentences.

This accounts for the debate in the first place. Detractors have accurately noticed that Yoda's speaking patterns do indeed change in the newer films and are less grammatically correct. The other side is also correct in asserting that there is not a gigantic difference in the overall frequency of inverted lines. The latter camp has made one mistake, though, which is looking at the matter in the simpler manner that Drew Stewart initially took when organizing his spreadsheet; with a fuller examination of the grammar we find real differences within the "odd" sounding lines, and this is where the real PT-OT difference lies. Let's look at these.

After reviewing Empire Strikes Back, linguist Tim MacSaveny noticed that Yoda makes errors common for non-native speakers of English, specifically in the way he inverts the word order to object-subject-verb, "which," MacSaveny says, "while very uncommon in languages of the Earth, is possible." He continues: "It is almost universally dispreferred to put object before subject, but some very few languages in South America do it, and marked forms of some other languages (like Mandarin) do it. Presumably, Yoda's native tongue (is there a name for this yet? If not, I may call it "Yodish", or perhaps "Dagobarista") has an OSV word order, and he is accidentally code-switching."

He also made this list of observations where Yoda makes grammatical choices that are either unusual or incorrect in modern English.

Temporal adverbs: Placed sentence-initial, sentence-medial, sentence-final, and in their proper place.

Auxiliary verbs: Placed inconsistently or left out (especially with "do")

Negatives placed word finally: this is okay in archaic English, as in "size matters not"

Equative clauses are almost always structured OVS (with O being the predicate complement)

Possible hypothesis for future study -- is Yoda employing topic and focus (as per Halliday; see http://en.wikipedia.org/wiki/Topic%E2%80%93comment for a primer)?

This is an interesting analysis, but it doesn't quite help us with the subject of this article: how does his speech pattern change for the second trilogy? After watching both Phantom Menace and Attack of the Clones, MacSaveny came to an interesting conclusion:

"I've found two interesting linguistic deviations between Original Trilogy Yoda and Prequel Yoda.

The first: OT Yoda uses OVS construction in equative clauses. This means he says the Object (or the complement in this case) first, then the Verb, then the Subject in sentences where the main verb is "be". This is acceptable, and even euphonious, to speakers of American English, because it follows rules accepted in an older English style. Take these examples:

"Luminous beings are we, not this crude matter."

"Always in motion is the future."

"Strong is Vader."

"Anger...fear...aggression. The dark side of the Force are they."

"Strong am I with the Force... but not that strong."

In these examples, and in fact in the entire OT corpus, 100% of equative clause sentences carried an OVS construction.

Now, let's look at the second set of data (the prequels):

"Hard to see, the dark side is."

"Revealed your opinion is."

"Clouded this boy's future is."

"The chosen one the boy may be."

"But for certain, Senator, in grave danger you are."

"Truly wonderful the mind of a child is."

Notice that in every instance, 100% of equative clauses are OSV, not OVS. This is representative of the entire corpus. This means that every one is verb-final, which is more unusual in standard English, antiquated or otherwise, and generally gives the lines a more foreign, "incorrect" sound. At the very least, the speech pattern is markedly and wildly different from the original movies.
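The OVS/OSV contrast in the quoted examples can be illustrated with a rough Python sketch. It rests on a crude assumption of mine, not on MacSaveny's method: an equative line counts as verb-final (OSV) if its last word is a form of "be", and as OVS otherwise. This is a toy heuristic that happens to separate the examples quoted above; it is not a parser and would misjudge many ordinary sentences.

```python
import re

# Forms of "be" treated as the main verb of an equative clause.
BE_FORMS = {"is", "are", "am", "was", "were", "be", "been"}


def is_verb_final(sentence):
    """Crude check: does the sentence end on a form of 'be'?"""
    words = re.findall(r"[a-z']+", sentence.lower())
    return bool(words) and words[-1] in BE_FORMS


ot_examples = [
    "Luminous beings are we, not this crude matter.",
    "Always in motion is the future.",
    "Strong is Vader.",
    "Anger...fear...aggression. The dark side of the Force are they.",
    "Strong am I with the Force... but not that strong.",
]
pt_examples = [
    "Hard to see, the dark side is.",
    "Revealed your opinion is.",
    "Clouded this boy's future is.",
    "The chosen one the boy may be.",
    "But for certain, Senator, in grave danger you are.",
    "Truly wonderful the mind of a child is.",
]

for line in ot_examples + pt_examples:
    label = "OSV (verb-final)" if is_verb_final(line) else "OVS"
    print(f"{label:17} {line}")
```

On these eleven lines the heuristic labels every original-trilogy example OVS and every prequel example OSV, which matches the split described in the quotation.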

This starts to highlight more specific, intra-sentence structural changes in the dialogue between trilogies. However, the differences are even stronger. After viewing Revenge of the Sith, MacSaveny delivered his final observations.

"I wanted to finish what I started and supply you with the second issue of Yoda's differing speech patterns in the trilogies, which produced some of the most awkward lines delivered by Yoda. Consider this offensive datum:

"Confer on you the level of Jedi knight the council does, but agree with your taking this boy as your Padawan learner I do not."

Even in text I found this line hard to process, because this line exhibits two things that English hates: verb fronting and cleft constructions. One is rare: the other is simply ungrammatical.

Verb fronting is exactly that: the verb is placed in the initial position in the sentence. Japanese does it; really, a bunch of languages do. Even English might have this construction in some interrogative statements (which I excluded from this study, because of this exact fact). For English, though, it's a little harder to do because of the auxiliary verbs. Verbs like be, do, have and will have a special function of defining tense or aspect for the matrix verb in many sentences, and splitting a verb from its auxiliary creates a cleft construction, which in English is a big no-no. The further the cleft, the more odd it sounds.

Unfortunately, when I examined the OT and the PT more closely, I could find little difference in the mechanics of these constructions when they occurred. The differences were in how egregiously the verb was split from its auxiliary, and the frequency with which Yoda uses it. In the OT, Yoda may say things like:

Told you, I did.

Stay and help you, I will.

Take you to him, I will.

...suffer your father's fate, you will.

These four examples are all of the non-interrogative data I could find exhibiting the verb fronting with cleft construction. In the prequels, he uses this construction much more:

Confer on you the level of Jedi knight the council does, but agree with your taking this boy as your Padawan learner I do not.

...find Obi-Wan's wayward planet we will.

Allow this appointment lightly the council does not.

Hiding in the Outer Rim Grievous is.

Heard from no one have we.

Received a coded retreat message we have.

With many more examples not listed. Notice how, in each of the examples, the auxiliaries are as far away as possible from the main verbs (not in this case can be viewed as an auxiliary also, because it is part of the verb phrase).

So there isn't a huge, qualitative difference between the trilogies in this case as there was with the OSV vs. OVS constructions in equative clauses mentioned earlier. But the frequency with which this construction is applied (35/150, or 23% of all lines, in the prequels vs. 4/131, or 3% of all lines, in the originals) and how far away the verb is from its auxiliary in some particularly heinous sentences force me to conclude that this is another major difference in speech pattern between trilogies.

If you take into account that Yoda was feigning...something in the early parts of his relationship with Luke on Dagobah, the number in the OT is cut to 1 out of 131, with that one being an unclear example because it is an embedded sentence -- the whole sentence, "Do not underestimate the powers of the Emperor, or suffer your father's fate you will." does not exhibit the same construction. So, with some minor manipulation of the data you could say that this mode of speech is standard for Yoda only in the prequels.

That was a lot of information, and perhaps more jargon than anyone needed, but I hope it was helpful in some way. I know I enjoyed the study."
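The verb-fronting figures quoted above, and the idea of measuring how far the fronted verb sits from its auxiliary, can be sketched in a few lines of Python. The function names and the AUXILIARIES set are my own illustrative assumptions. The gap measure assumes that the first word of the line is the fronted verb and that the last auxiliary in the line is the one it belongs to, which only holds for simple one-clause lines like those quoted.

```python
import re

# Auxiliaries considered by this sketch (an assumed, incomplete list).
AUXILIARIES = {"do", "does", "did", "will", "have", "has", "had",
               "is", "are", "am", "was", "were"}


def rate(count, total):
    """Share of lines, in percent."""
    return 100.0 * count / total


def aux_gap(sentence):
    """Words between the first word and the last auxiliary, or None."""
    words = re.findall(r"[a-z'-]+", sentence.lower())
    positions = [i for i, w in enumerate(words) if w in AUXILIARIES]
    if not positions or positions[-1] == 0:
        return None
    return positions[-1] - 1


# Counts as quoted from MacSaveny: prequels, originals, and originals
# once the early Dagobah lines are set aside.
print(f"PT: {rate(35, 150):.0f}%")
print(f"OT: {rate(4, 131):.0f}%")
print(f"OT without early Dagobah: {rate(1, 131):.1f}%")

for line in ("Told you, I did.",
             "Received a coded retreat message we have."):
    print(aux_gap(line), line)
```

A larger gap means more words separate the fronted verb from its auxiliary, which is a rough stand-in for how hard the line is to process.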

So, to sum up:

-An average of 48% of OT Yoda's lines have inverted elements in them, versus 66% for the PT.

-Of these, there are differences in what makes them "odd" sounding between the trilogies.

-OT Yoda uses OVS construction in equative clauses (sentences where the verb is "to be") in his inverted sentences. This is grammatically acceptable, if slightly archaic in style. This means most of his inverted sentences are inverted in a way which is still grammatically correct. This may have to do with Kasdan's desire to medievalize the dialogue, rather than make it "backwards" as Lucas has in recent years described the character's speech.

-PT Yoda, on the other hand, while speaking in an inverted manner more frequently, also uses OSV rather than OVS construction in many of these instances. This is considered less grammatically acceptable, whether in archaic or contemporary forms of English. 

-PT Yoda uses incorrect grammar in another way that OT Yoda largely does not: verb fronting with cleft construction. He uses this in the OT mainly when he is in the "crazy swamp creature" persona, perhaps deliberately by the writers so as to make him seem more off-balance and unusual (there is one potential example in ROTJ, but it is not clear cut). "Crazy swamp creature Yoda" also speaks with more inverted lines, 57% compared to 41% for "serious Yoda" in the same film, again showing that the writers deliberately gave Yoda more incorrect grammar in this persona, while in the PT "serious Yoda" often speaks in this manner.

So I suppose this gives us a spoiler to some future EU story which details what Yoda was doing for 20 years on Dagobah and explains his improved grammar: he was taking ESL lessons!


