The mess that is the English spelling system

The latest XKCD reads:


The first part is a reference to the cheesy pick-up line ‘If I could rearrange the alphabet, I’d put U and I together’.

The irregularity of the English spelling system is part of folklore now. There are countless poems, articles and whatnots, all trying to hammer in the fact that English pronunciation is the work of the Devil.

English wasn’t always this way. Let’s travel back to the 14th century and listen to a recitation of the General Prologue to The Canterbury Tales by Geoffrey Chaucer in Middle English. Note that no one alive today has heard Middle English being spoken, but this seems to be a reasonable reconstruction. Notice how the spellings of the words aren’t significantly different from what we have in Modern English, while the pronunciation is markedly different.

The Wikipedia page on English orthography summarizes the situation well: ‘This is largely due to the complex history of the English language, together with the absence of systematic spelling reforms implemented in English, in contrast to the position in a number of other languages.’

In short, the pronunciation has undergone significant changes, some of them very systematic (the Great Vowel Shift), but the orthography hasn’t followed suit. So what was once pronounced as it was written now sounds very different from its spelling. For instance, the word ‘I’ wasn’t pronounced ‘aye’; it was pronounced ‘ee’, the way you would expect in a speak-as-you-spell language. The ‘gh’ in ‘night’ wasn’t silent; it was actually pronounced. And the ‘i’ in ‘night’ wasn’t ‘aye’; it was ‘ee’.

Apart from this, there is the whole issue of loanwords. English usually just retains the spelling of the loanword and modifies its pronunciation slightly, while still retaining a bit of the original. When loanwords come in from all kinds of languages, you are bound to end up with a mess. You are effectively borrowing words that were written using completely different spelling conventions. So you have words like ‘Czech’ and ‘fjord’. When there are too many conventions, there isn’t any convention at all. That, my friend, is English. Of course, with enough experience, you do begin to see patterns in English words, and can often predict the spelling of a word with reasonable accuracy, but it is true that our orthography remains a veritable mess.

As Mark Liberman wrote on Language Log, the English writing system is a complex pattern of overlapping historical layers with sporadic intrusions of reform, for which the appropriate mode of analysis is more geological than logical. 

While English orthography is indeed a colossal mess (and no linguist would claim otherwise), there are some bright sides to it once you manage to get over the terrible spelling. Often you can tell where a word has come from merely by looking at it, because it carries a signature of its original language. The letter pair ‘ch’ pronounced as ‘sh’? It has probably come via French. The same pair pronounced as ‘k’? We are mostly looking at a gift from the Greeks. And so on…

Finally, I hope you’ve seen this excellent video!