What counts as a word?

A lot of us, as literate English speakers, have probably experienced that queasy moment of dread when you’re writing something on the computer and suddenly get a squiggly red line under a word you use all the time. You look at the suggested spellings… and none of them are the word you wanted. If you’re like me, at this point you hop online really quickly to make sure the word means what you thought it did and that you’re not butchering the spelling too horribly. Or maybe you turn to the dictionary you keep on your desk. Or maybe you turn to someone sitting next to you and ask “Is this a real word?”

Latin dictionary
Oh, this? It’s just the pocket edition. The full one is three hundred volumes and comes with an elephant named George to carry it around your house. And it’s covered in gold. This edition is only bound in unicorn skin but it’s fine for a quick desk reference.
The underlying assumption behind the search to see if someone else uses the word is that, if they don’t, you can’t either. It’s not a “real word”. Which raises the question: what makes a word real? Is there a moment of Pinocchio-like transformation where the hollow wooden word someone created suddenly takes on life and joins the ranks of the English language to much back-slapping and cigar-handing from the other vetted words? Is there a little graduation party where the word gets a diploma from the OED and suddenly it’s okay to use it whenever you want? Or does it get hired by the spelling board and get to work right away?

OK, so that was getting a bit silly, but my point is that most people have the vague notion that there’s a distinction between “real” words and “fake” words that’s pretty hard and fast. Like most slang words and brand names are fake words. I like to call this the Scrabble distinction. If you can play it in Scrabble, it counts and you can put it in a paper or e-mail and no one will call you on it. If you can’t, it’s a fake word and you use it at your own risk. Dictionaries play a large part in determining which is which, right? The official Scrabble dictionary is pretty conservative: it doesn’t have “d’oh” in it, for example. But it’s also not without controversy. The first official Scrabble dictionary, for example, didn’t have “granola” in it, which the Oxford English Dictionary (the great grand-daddy of English dictionaries and probably the most complete record ever compiled of the lexicon of any language ever) notes was first used in 1886 and I think most of us would agree is a “real” word.

The line is even blurrier than that, though. English is a language with a long and rich written tradition. In some ways, that’s great. We’ve got a lot more information on how words used to be pronounced than we would have otherwise and a lot of diachronic information. (That’s information about how the language has changed over time. 😛 ) But if you’ve been exposed mainly to the English tradition, as I have, you tend to forget that writing isn’t inseparable from spoken language. They’re two different things and there are a lot of traditions that aren’t writing-based. Consider, for example, the Odù Ifá, an entirely oral divination text from Nigeria that sometimes gets compared to the Bible or the Qur’an. In the cultures I was raised in, the thought of a sacred text that you can’t read is strange, but that’s just part of the cultural lens that I see the world through; I shouldn’t project that bias onto other cultures.

So non-literate cultures still need to add words to their lexicons, right? But how do they know which words are “real” without dictionaries? It depends. Sometimes it just sort of happens organically. We see this in English too. Think about words associated with texting or IMing like “lol” or “brb” (that’s “laughing out loud” and “be right back” for those of you who are still living under rocks). I’ve noticed people saying these in oral conversations more and more and I wouldn’t be surprised if in fifty years “burb” started showing up in dictionaries. But even cultures which have only had writing systems for a very short amount of time have gatekeepers. Navajo, which has only been written since around 1940, is a great example. Peter Ladefoged shares the following story in Phonetic Data Analysis:

One of our former UCLA linguistics students who is a Navajo tells how she was once giving a talk in a Navajo community. She was showing how words could be put together to create new words (such as sweet + heart creates a word with an entirely new meaning). When she was explaining this an elder called out: ‘Stop this blasphemy! Only the gods can create words.’ The Navajo language is holy in a way that is very foreign to most of us (p. 13).

So in Navajo you have elders and religious leaders who are the guardians of the language and serve as the final authorities. (FUN FACT: “authority” comes from the same root as “author”. See how writing-dependent English is?) There are always gray areas though. Language is, after all, incredibly complex. I’ll leave you one case to think about.

“Rammaflagit.” That’s ɹæm.ə.flæʒ.ɪt in the International Phonetic Alphabet. (I remember how thrilled my dad was when I told him I was studying IPA in college.) I hear it all the time and it means something like “gosh darn it”, sort of a bowdlerized curse word. Real word or not? The dictionaries say “no”, but the people who I’ve heard using it would clearly say “yes”. What do you think?

What words are easy to say?

Ok, so in the last couple posts I’ve been throwing around terms like “easy to say” without giving a whole lot of explanation. And that’s a pity, because the study of what words are “easy” and what words are “hard” is, in my opinion, one of the greatest sub-disciplines in linguistics: phonotactics.

Imperial Russian soldier with phone
No, that's phone tactics, not phonotactics. They're completely different.
Phonotactics is like your great-aunt who always arranges the seating at family reunions because she remembers who fought with whom twenty years ago and knows not to sit them together. Basically, some sounds really like to be next to others. Like vowels. Vowels like to be next to everyone. In Japanese, for example, with a couple of exceptions, most syllables have to be made of a consonant plus a vowel. (In ling speak, this is known as “CV”. C for consonant, V for vowel. Yeah, unlike physicists, we like to keep things simple.) What’s even more amazing is that within six months of birth, Japanese infants prefer sounds that are CVCV to those that are CVCCV or CVCVC.
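If you'd like to play with CV patterns yourself, here's a rough Python sketch that turns a word into its C/V skeleton. Big caveat: it works on letters, not sounds, so it's only a toy. It assumes the vowels are just a, e, i, o and u, treats "y" as a consonant, and will happily count a two-letter spelling of a single sound (like "sh") as two consonants. The function names `cv_skeleton` and `is_strict_cv` are made up for this example.

```python
# Toy assumption: these five letters are the only vowels; "y" counts as a consonant.
VOWELS = set("aeiou")


def cv_skeleton(word):
    """Map each letter to 'V' (vowel) or 'C' (consonant), skipping non-letters."""
    return "".join(
        "V" if ch in VOWELS else "C"
        for ch in word.lower()
        if ch.isalpha()
    )


def is_strict_cv(word):
    """Return True if the skeleton is one or more consonant+vowel pairs (CV, CVCV, ...)."""
    skeleton = cv_skeleton(word)
    pairs, leftover = divmod(len(skeleton), 2)
    return pairs > 0 and leftover == 0 and skeleton == "CV" * pairs


if __name__ == "__main__":
    for word in ["karate", "kimono", "strengths"]:
        print(word, cv_skeleton(word), is_strict_cv(word))
```

Romanized loanwords like "karate" and "kimono" come out as neat CVCVCV strings, while English "strengths" comes out as CCCVCCCCC. Just keep in mind that this counts letters rather than sounds, so a real phonotactic analysis would start from a phonemic transcription instead of the spelling.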

Polish, on the other hand, notoriously plays fast and loose with syllable structure. You can have consonant clusters up to five sounds long in Polish that, most weirdly, don’t follow the same sorts of rules that other languages do. Like English. English can have pretty big consonant clusters… but they’ll only get really big if the first or last sound in the word is ‘s’. (Protip: That’s why ‘s’ is such a great letter in Scrabble; there’s a bunch of things you can slap it on to piggyback off someone else’s word, even outside of its morpheme status.) If you’ve ever stumbled over a Polish last name, there’s a sound linguistic reason you found it hard.

Why is this useful? Well, besides its obvious use in language teaching and being great cocktail party conversation material, if you want to make a plausibly difficult-to-pronounce alien language, screw up your phonotactics and you’ll leave audiobook readers in tears.