How to pronounce the “th” sound in English

Or, as I like to call it, hunting the wild Eth and Thorn (which are old letters that can be difficult to typeset), because back in the day, English had two dedicated letters for “th”: “ð” (eth) and “þ” (thorn). Old English scribes actually used the two more or less interchangeably, but Icelandic still keeps them apart: “ð” for the sound where you vibrate your vocal folds (that’s called ‘voiced’) and “þ” for the one where you don’t (unvoiced, or voiceless). It’s a bit like the difference between “s” and “z” in English today. Try it: you can say both “s” and “z” without moving your tongue a millimeter. Unfortunately, while the voiced and voiceless “th” sounds remain distinct in English, they’re now both spelled with the same “th” sequence. The difference between “thy” and “thigh”, for example, is the first sound, but the spelling doesn’t reflect that. (Yet another example of why English orthography is horrible.)

Used with permission from the How To Be British Collection, copyright LGP.

The fact that they’re written with the same letters even though they’re different sounds is only part of why they’re so hard to master. (That goes for native English speakers as well as those who are learning it as their second language: it’s one of the last sounds children learn.) The other part is that they’re relatively rare across languages. Standard Arabic, Greek, some varieties of Spanish, Welsh and a smattering of other languages have them. If your native language doesn’t have them, though, they’re tough to hear and harder to say. Don’t worry, though, linguistics can help!

I’m afraid the cartoon above may accurately express the difficulty of producing the “th” for non-native speakers of English, but the technique is somewhat questionable. So, the fancy technical term for the “th” sounds is “interdental fricatives”. Why? Because there are two parts to making them. The first is the place of articulation, which means where you put your tongue. In this case, as you can probably guess (“inter-” between and “-dental” teeth), it goes in between your teeth. Gently!

The important thing about your tongue placement is that your tongue tip needs to be pressed lightly against the bottom of your top teeth. You need to create a small space to push air through, small enough that it makes a hissing sound as it escapes. That’s the “fricative” part. Fricatives are sounds where you force air through a small space and the air molecules start jostling each other and make a high-frequency hissing noise. Now, it won’t be as loud when you’re forcing air between your upper teeth and tongue as it is, for example, when you’re making an “s”, but it should still be noticeable.

So, to review, put the tip of your tongue against the bottom of your top teeth. Blow air through the thin space between your tongue and your teeth so that it creates a (not very loud) hissing sound. Now try voicing the sound (vibrating your vocal folds) as you do so. That’s it! You’ve got both of the English “th” sounds down.

If you’d like some more help, I really like this video, which has some super-cool slow-motion footage. The lady who made it has a website focusing on English pronunciation, which has some great resources. Good luck!

Flap that!

Imagine you’re walking down a sunny street in Chicago and pass by a construction site. Someone yells out, “Adam, the ladder, pick it up!” Congratulations, you’ve just found the elusive wild flap in its natural environment! And not just once, but three times. Where was it? “Adam, the ladder, pick it up!” Try saying it aloud. If you’re a native speaker of American English, you’ll say the “d” in “Adam”, the “dd” in “ladder” and the “t” in “it” the same way.

Construction worker at Westlake Center, 1988
Come on, Adam, Lulu's having to pick up your slack!

Unless you’re already pretty familiar with linguistics, you’ve probably never heard of the flap (or tap, as some linguists call it), but that doesn’t mean that you’re not already acquainted. In fact, the flap is one of the most common sounds of the English language, especially American English. It’s produced by a very quick movement of the tongue against the little ridge of bone just behind your top teeth. This video will give you an idea of just how quick:

It’s a little difficult to see, but did you notice that bit in the middle where the tongue suddenly jumped? That was the flap. It’s so fast that it makes the production of most other sounds seem like the proverbial tortoise. A flap takes an average of 20 milliseconds to produce; by contrast, the schwa vowel (it’s an ‘uh’ sound, the most common in the English language) lasts an average of 64 milliseconds, more than three times as long. You can see why the flap is such a favorite; it’s a huge time saver.

It’s a little difficult to spot a flap without specialized training because it doesn’t have its own letter, or make any minimal pairs. (A minimal pair is a pair of words that differ by only one sound, like “cat” and “cap”. Because you need to be able to tell the sounds apart in order to tell the words apart, you’re really good at distinguishing the sounds that make minimal pairs, at least in your native language[s].) Usually, it replaces the ‘t’ or ‘d’ sound in the middle of a word, but when you start speaking more quickly, more and more of your ‘t’s and ‘d’s end up coming out as flaps. And that makes sense. When you’re speaking more quickly, you still want to be understood, but you just don’t have as much time to articulate carefully. Since most people will hear the flap as a ‘t’ or a ‘d’, switching one for the other is just easier for everyone.
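If it helps to see that spelled out, here’s a minimal sketch in Python of both ideas: checking for a minimal pair, and swapping a ‘t’ or ‘d’ for a flap. Everything in it is an assumption made for illustration: the function names, the tiny vowel list, the hand-written transcriptions, and especially the “between two vowels” rule, which ignores stress and speaking rate.

```python
# Toy vowel inventory: only the symbols used in the examples below
# (an assumption, not a full list of English vowels).
VOWELS = {"æ", "ɚ", "ə", "ɪ"}

def is_minimal_pair(a, b):
    """True if two transcriptions are the same length and differ in exactly one sound."""
    if len(a) != len(b):
        return False
    return sum(x != y for x, y in zip(a, b)) == 1

def flap(phones):
    """Replace 't' or 'd' with the flap 'ɾ' when it sits between two vowels.

    Simplification: real flapping also depends on stress and how fast you talk.
    """
    out = list(phones)
    for i in range(1, len(phones) - 1):
        between_vowels = phones[i - 1] in VOWELS and phones[i + 1] in VOWELS
        if phones[i] in ("t", "d") and between_vowels:
            out[i] = "ɾ"
    return out

# Hand-written transcriptions, one symbol per sound.
cat, cap = ["k", "æ", "t"], ["k", "æ", "p"]
ladder, latter = ["l", "æ", "d", "ɚ"], ["l", "æ", "t", "ɚ"]

print(is_minimal_pair(cat, cap))        # True
print(is_minimal_pair(ladder, latter))  # True, when spoken carefully
print(flap(ladder) == flap(latter))     # True: both come out with a flap
```

In careful speech “ladder” and “latter” differ by one sound, but once the toy rule turns both middle consonants into flaps they come out identical, which is why the flap never gets to make a minimal pair of its own.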

So that’s the flap, a shy, unassuming sound that you often mistake for one of its more glamorous siblings. Now that you’ve been introduced, though, try to keep an eye out for the little guy. You just might be surprised how often it pops up!

Indiscreet words, Part II: Son of Sounds

Ok, so in my last post about how the speech stream is far from discrete, I talked about how difficult it is to pick apart words. But I didn’t really talk that much about phonemes, and since I promised you phonetics and phonology and phun, I thought I should cover that. Besides, it’s super interesting.

It’s not just that language is continuous, it’s that language that’s been made discrete is actually nearly impossible to understand. I ran across this YouTube video a while back that’s a great example of this phenomenon.

What the balls of yarn is he saying? It’s actually the Preamble to the Constitution, but it took me well over half the video to pick up on it, and I spend a dumb amount of time listening to phonemes in isolation.

You probably find this troubling on some level. After all, you’re a literate person, and as a literate person you’re really, really used to thinking about words as being easy to break down into “letter sounds”. If you’ve ever tried to fiddle around with learning Mandarin or Cantonese, you know just how table-flippingly frustrating it is to memorize a writing system where the graphemes (the smallest unit of writing, just as a morpheme is the smallest unit of meaning, a phoneme is the smallest unit of sound and a dormeme is the smallest amount of space you can legally house a person in) have almost no relation to the series of sounds they represent.

Fun fact: It’s actually pretty easy to learn to speak Mandarin or Cantonese once you get past the tones. They’re syntactically a lot like English, don’t have a lot of fussy agreement markers or grammatical gender and have a pretty small core vocabulary. It’s the characters that will make you tear your hair out.

Hm. Well, it kinda looks like me sitting on a chair hunched over my laptop while wearing a little hat and ARGH WHAT AM I DOING THAT LOOKS NOTHING LIKE A BIRD.

But. Um. Sorry, got a little off track there. Point was, you’re really used to thinking about words as being further segmented. Like oranges. Each orange is an individual, and then there are neat little segments inside the orange so you don’t get your hands sticky. And, because you’re already familiar with the spelling system of your language (which is, let’s face it, probably English), you probably have a fond idea that it’s pretty easy to divide words that way. But it’s not. If it were, things like instantaneous computational voice-to-voice translation would be common.

It’s hard because the edges of our sounds blur together like your aunt’s watercolor painting that you accidentally spilled lemonade on. So let’s say you’re saying “round”. Well, for the “n” you’re going to open up your nasal passages and put your tongue against the little ridge right behind your teeth. But wait! That’s where your tongue needs to be to make the “d” sound! To make it super clear, you should close off your nasal passages before you flick your tongue down and release that little packet of air that you were storing behind it. You’re totally not going to, though. I mean, your tongue’s already where you need it to be; why would you take the extra time to make sure your nasal passages are fully closed before releasing the “d”? That’s just a waste of time. And if you did it, you’d sound weird. So the “d” gets some of that nasally goodness and neither you nor your listener gives a flying Fluco.

But, if you’re a computer who’s been told, “If it’s got this nasal sound, it’s an ‘n’”, then you’re going to be super confused. Maybe you’ll be all like, “Um, ok. It kinda sounds like an ‘n’, but then it’s got that little pop of air coming out that I’ve been told to look for with the ‘p’, ‘b’, ‘t’, ‘d’, ‘k’, ‘g’ set… so… let’s go with ‘rounp’. That’s a word, right?” Obviously, this is a vast oversimplification, but you get my point; computers are easily confused by the smearing around of sounds in words. They’re getting better, but humans are still the best.
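Here’s that confused computer as a few lines of Python, just to make the problem concrete. It’s a toy under loud assumptions: the cue names, the two rules and the function name are all made up for this post, and real speech recognizers don’t work from a lookup table like this.

```python
# Hypothetical rulebook: one acoustic cue per category of sound.
RULES = {
    "nasal (like 'n')": "nasal hum",
    "stop (like 'p', 'b', 't', 'd', 'k', 'g')": "pop of air",
}

def matching_categories(cues):
    """Return every category whose cue shows up in this stretch of sound."""
    return sorted(category for category, cue in RULES.items() if cue in cues)

# A "d" said super carefully, with the nasal passages closed off first.
careful_d = {"pop of air"}

# The "d" in a normal "round": some nasal sound from the "n" has leaked in.
blurred_d = {"pop of air", "nasal hum"}

print(matching_categories(careful_d))  # one match: the stop category
print(matching_categories(blurred_d))  # two matches: the rulebook can't decide
```

The careful “d” matches exactly one rule, but the everyday one matches both, and a rulebook that expects each sound to arrive in its own tidy segment has no way to pick.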

So just remember: when you’re around the robot overlords, be sure to run your phonemes together as much as possible. It might confuse them enough for you to have time to run away.