What types of emoji do people want more of?

So if you’re a weird internet nerd like me, you might already know that Unicode 9.0 was released today. The deets are here, but they’re fairly boring unless you really care about typography. What’s more interesting to me, as someone who studies visual, spoken and written language, is that there is a whole batch of new emoji. And it’s led to lots of interesting speculation about, for example, what the most popular new emoji is going to be (tldr: probably the ROFL face. People have a strong preference for using positive face emojis.) This led me to wonder: what obvious lexical gaps are there?

[I]n some cases it is useful to refer to the words that are not part of the vocabulary: the nonexisting words. Instead of referring to nonexisting words, it is common to speak about lexical gaps, since the nonexisting words are indications of “holes” in the lexicon of the language that could be filled.

Janssen, M. 2012. “Lexical Gaps”. The Encyclopedia of Applied Linguistics.

This question is pretty easy to answer about emoji–we can just find out what words people are most likely to use when they’re complaining about not being able to use emoji. There’s even a Twitter bot that collects these kinds of tweets. I decided to do something similar, but with a twist. I wanted to know what kinds of emoji people complain about wanting the most.

Boring technical details 💤

  1. Yesterday, I grabbed 4817 recent tweets that contained both the words “no” and “emoji”. (You can find the R script I used for this on my GitHub.)
  2. For each tweet, I took the two words occurring directly in front of the word “emoji” and created a corpus from them using the tm (text mining) package.
  3. I tidied up the corpus–removing super-common words like “the”, making everything lower-case, and so on. (The technical term is “cleaning”, but I like the sound of tidying better. It sounds like you’re getting comfy with your data, not delousing it.)
  4. I ranked these words by frequency, or how often they showed up. There were 1888 distinct words, but the vast majority (1280) showed up only once. This is completely normal for word frequency data and is modelled by Zipf’s law.
  5. I then took all words that occurred more than three times and did a content analysis.
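For the curious, here is a minimal Python sketch of steps 2 through 5. It is not the R script mentioned above, which used the tm package; the sample tweets, the tiny stopword list and the function names are all made up for illustration.

```python
import re
from collections import Counter

# Hypothetical stand-in stopword list; a real one would be much longer.
STOPWORDS = {"the", "a", "an", "is", "no", "there", "why", "still", "and", "i"}

def words_before_emoji(tweet, n=2):
    """Return the n words directly in front of each 'emoji' token, lower-cased."""
    tokens = re.findall(r"[a-z']+", tweet.lower())
    found = []
    for i, token in enumerate(tokens):
        if token in ("emoji", "emojis"):
            found.extend(tokens[max(0, i - n):i])
    return found

def frequency_table(tweets):
    """Count the non-stopword words that precede 'emoji' across all tweets."""
    counts = Counter()
    for tweet in tweets:
        counts.update(w for w in words_before_emoji(tweet) if w not in STOPWORDS)
    return counts

# Made-up example tweets, not taken from the real dataset.
tweets = [
    "Why is there no giraffe emoji",
    "still no bacon emoji?!",
    "There is NO shark emoji and I am upset",
    "no redhead emoji, no giraffe emoji",
]

counts = frequency_table(tweets)
ranked = counts.most_common()                          # step 4: rank by frequency
hapaxes = [w for w, c in counts.items() if c == 1]     # words that show up only once
frequent = [w for w, c in counts.items() if c > 3]     # step 5: more than three times
```

With four toy tweets nothing clears the "more than three times" bar, which is the same long-tail effect described in step 4, just in miniature.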


Exciting results! 😄

At the end of my content analysis, I arrived at nine distinct categories. I’ve listed them below, with the most popular four terms from each. One thing I noticed right off is how many of these are emoji that either already exist or are in the Unicode update. To highlight this, I’ve italicized terms in the list below that don’t have an emoji.

  • animal: shark, giraffe, butterfly, duck
  • color: orange, red, white, green
  • face: crying, angry, love, hate
  • (facial) feature: mustache, redhead, beard, glasses
  • flag: flag, England, Welshpride
  • food: bacon, avocado, salt, carrot
  • gesture: peace, finger, middle, crossed
  • object: rifle, gun, drum, spoon
  • person: mermaid, pirate, clown, chef

(One note: the rifle is in Unicode 9.0, but isn’t an emoji. This has been the topic of some discussion, and is probably why it’s so frequent.)

Based on these categories, where are the lexical gaps? The three categories that have the most different items in them are, in order, 1) food, 2) animals and 3) objects. These are also the three categories with the most mentions across all items.

So, given that so many people are talking about emojis for animals, food and objects, why aren’t the bulk of emojis in these categories? We can see why this might be by comparing how many different items get mentioned in each category to how many times each item is mentioned.

Yeah, people talk about food a lot… but they also talk about a lot of different types of food. On the other hand you have categories like colors, which aren’t talked about as much but where the same colors come up over and over again.

As you can see from the figure above, the most popular categories have a lot of different things in them, but each thing is mentioned relatively rarely. So while there is an impassioned zebra emoji fanbase, it only comes up three times in this dataset. On the other hand, “red” is fairly common but shows up because of discussion of, among other things, flowers, shoes and hair color. Some categories, like flags, fall in a happy medium–lots of discussion and fairly few suggestions for additions.
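The comparison behind that figure boils down to two numbers per category: how many distinct items it contains, and how often each item gets mentioned on average. A rough Python sketch, using invented counts (only the three zebra mentions come from the text above; every other number and the `summarize` helper are hypothetical):

```python
from collections import defaultdict

# Invented (category, term, mentions) rows for illustration only.
rows = [
    ("animal", "shark", 6),
    ("animal", "giraffe", 5),
    ("animal", "butterfly", 4),
    ("animal", "duck", 4),
    ("animal", "zebra", 3),
    ("color", "red", 9),
    ("color", "orange", 7),
]

def summarize(rows):
    """Per category: number of distinct items, total mentions, mentions per item."""
    items = defaultdict(int)
    mentions = defaultdict(int)
    for category, _term, n in rows:
        items[category] += 1
        mentions[category] += n
    return {
        category: {
            "items": items[category],
            "mentions": mentions[category],
            "per_item": mentions[category] / items[category],
        }
        for category in items
    }

summary = summarize(rows)
```

In this toy data "animal" has more items and more total mentions, but "color" has more mentions per item, which is the pattern described above.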

Based on this teeny data set, I’d say that if the Unicode Consortium continues to be in charge of emoji standardization it’ll have its hands full for quite some time to come. There’s a lot of room for growth, and most of it is in food, animals and objects, which all have a lot of possible items, rather than gestures or facial expressions, which have far fewer.

Why do Canadians say ‘eh’?

Perhaps it’s because Seattle is so close to Canada, but for some reason when I ask classes of undergraduate students what they want to know about language and language use, one question I tend to get a lot is:

 Why do Canadians say ‘eh’?


Flag of Canada
Blog post about Canada, eh?

Fortunately for my curious students, this is actually an active area of inquiry. (It’s actually one of those research questions where there was a flurry of work–in this case in the 1970s–and then a couple quiet decades followed by a resurgence in interest. The ‘eh’ renaissance started in the mid-2000s and continues today. For some reason, at least in linguistics, this sort of thing tends to happen a lot. I’ll leave discussing why this particular pattern is so common to the sociologists of science.) So what do we know about ‘eh’?

Is ‘eh’ actually Canadian?

‘Eh’ has quite the pedigree–it’s first attested in Middle English and even shows up in Chaucer. Canadian English, however, boasts a more frequent use of ‘eh’, which can fill the same role as ‘right?’, ‘you know?’ or ‘innit?’ for speakers of other varieties of English.

What does ‘eh’ mean?

The real thing that makes an ‘eh’ Canadian, though, is how it’s used. Despite some claims to the contrary, “eh” is far from meaningless. It has a limited number of uses (Elaine Gold identified an even dozen in her 2004 paper) some of which aren’t found outside of Canada. Walter Avis described two of these uniquely Canadian uses in his 1972 paper, “So eh? is Canadian, eh” (it’s not available anywhere online as far as I can tell):

  1. Narrative use: Used to punctuate a story, in the same way that an American English speaker (south of the border, that is) might use “right?” or “you know?”
    1. Example: I was walking home from school, eh?  I was right by that construction site where there’s a big hole in the ground, eh? And I see someone toss a piece of trash right in it.
  2. Miscellaneous/exclamation use: Tacked on to the end of a statement. (Although more recent work, presented by Martina Wiltschko and Alex D’Arcy at last year’s NWAV, suggests that there’s really a limited number of ways to use this type of ‘eh’ and that they can be told apart by the way the speaker uses pitch.)
    1. Example: What a litterbug, eh?

And these uses seem to be running strong. Gold found that use of ‘eh’ in a variety of contexts has either increased or remained stable since 1980.

That’s not to say there’s no change going on, though. D’Arcy and Wiltschko found that younger speakers of Canadian English are more likely than older speakers to use ‘right?’ instead of ‘eh?’. Does this mean that ‘eh’ may be going the way of the dodo or ‘sliver’ to mean ‘splinter’ in British English?

Probably not–but it may show up in fewer places than it used to. In particular, in their 2006 study Elaine Gold and Mireille Tremblay found that almost half of their participants felt negatively about the narrative use of ‘eh’ and only 16% actually used it themselves. This suggests this type of uniquely-Canadian usage may be on its way out.

Should you go to grad school for linguistics?

So I’ve had this talk, in different forms, with lots of different people over the last couple of years. Mainly undergrads thinking about applying to PhD programs in linguistics but, occasionally, people in industry thinking about going back to school as well. Every single one of these people was smart, cool, dedicated, hard-working, a great linguist and would have been an asset to the field. And when they asked me, a current linguistics graduate student, whether it was a good idea to go to grad school in linguistics, I gave them all the same answer:

The word no made from jigsaw puzzle pieces - Flickr horiavarlan
“But Rachael,” you say, “you’re going to grad school in linguistics and having all sorts of fun. Why are you trying to keep me from doing the same thing?” Two big reasons.

The Job Market for Linguistics PhDs

What do you want to do when you get out of grad school? If you’re like most people, you’ll probably say you want to teach linguistics at the college or university level. What you should know is that this is an increasingly unsustainable career path.

In 1975, 30 percent of college faculty were part-time. By 2011, 51 percent of college faculty were part-time, and another 19 percent were non–tenure track, full-time employees. In other words, 70 percent were contingent faculty, a broad classification that includes all non–tenure track faculty (NTTF), whether they work full-time or part-time.

More Than Half of College Faculty Are Adjuncts: Should You Care? by Dan Edmonds.

And most of these part-time faculty, or adjuncts, are very poorly paid. This survey from 2015 found that 62% of adjuncts made less than $20,000 a year. This is even more upsetting when you consider that you need a PhD and scholarly publications to even be considered for one of these posts.

(“But what about being paid for your research publications?” you ask. “Surely you can make a few bucks by publishing in those insanely expensive academic journals.” While I understand where you’re coming from–in almost any other professional publishing context it’s completely normal to be paid for your writing–authors of academic papers are not paid. Nor are the reviewers. Furthermore, authors are often charged fees by the publishers. One journal I was recently looking at charges $2,900 per article, which is about three times the funding my department gives us for research over our entire degree. Not a scam journal, either–an actual reputable venue for scholarly publication.)

Yes, there are still tenure-track positions available in linguistics, but they are by far the minority. What’s more, even including adjunct positions, there are still fewer academic posts than graduating linguists with PhDs. It’s been that way for a while, too, so even for a not-so-great adjunct position you’ll be facing stiff competition. Is it impossible to find a good academic post in linguistics? No. Are the odds in your (or my, or any other current grad student’s) favor? Also no. But don’t take it from me. In Surviving Linguistics: A Guide for Graduate Students (which I would highly recommend) Monica Macaulay says:

[It] is common knowledge that we are graduating more PhDs than there are faculty positions available, resulting in certain disappointment for many… graduates. The solution is to think creatively about job opportunities and keep your options open.

As Dr. Macaulay goes on to outline, there are jobs for linguists outside academia. Check out the LSA’s Linguistics Beyond Academia special interest group or the Linguists Outside Academia mailing list. There are lots of things you can do with a linguistics degree, from data science to forensic linguistics.

That said, there are degrees that will better prepare you for a career than a PhD in theoretical linguistics. A master’s degree in Speech Language Pathology (SLP) or Computational Linguistics or Teaching English to Speakers of Other Languages (TESOL) will prepare you for those careers far better than a general PhD.

Even if you’re 100% dead set on teaching post-secondary students, you should look around and see what linguists are doing outside of universities. Sure, you might win the job-lottery, but at least some of your students probably won’t, and you’ll want to make sure they can find well-paying, fulfilling work.

Grad School is Grueling

Yes, grad school can absolutely be fun. On a good day, I enjoy it tremendously. But it’s also work. (And don’t give me any nonsense about it not being real work because you do it sitting down. I’ve had jobs that required hard physical and/or emotional labor, and grad school is exhausting.) I feel like I probably have a slightly better than average work/life balance–partly thanks to my fellowship, which means I have limited teaching duties and don’t need a second job any more–and I’m still actively trying to get better about stopping work when I’m tired. I fail, and end up all tearful and exhausted, about once a week.

It’s also emotionally draining. Depression runs absolutely rampant among grad students. This 2015 report from Berkeley, for example, found that over two thirds of PhD students in the arts and sciences were depressed. The main reason? Point number one above–the stark realities of the job market. It can be absolutely gutting to see a colleague do everything right, from research to teaching, and end up not having any opportunity to do the job they’ve been preparing for. Especially since you know the same lies in wait for you.

And “doing everything right” is pretty Herculean in and of itself. You have to have very strong personal motivation to finish a PhD. Sure, your committee is there to provide oversight and you have drop-dead due dates. But those deadlines are often very far away and, depending on your committee, you may have a lot of independence. That means motivating yourself to work steadily while managing several ongoing projects in parallel (you’re publishing papers in addition to writing your dissertation, right?) and not working yourself to exhaustion in the process. Basically you’re going to need a big old double helping of executive functioning.

And oh by the way, to be competitive in the job market you’ll also need to demonstrate you can teach and perform service for your school/discipline. Add in time to sleep, eat, get at least a little exercise and take breaks (none of which are optional!) and you’ve got a very full plate indeed. Some absolutely iron-willed people even manage all of this while having/raising kids and I have nothing but respect for them.

Main take-away

Whether inside or outside of academia, it’s true that a PhD does tend to correlate with higher salary–although the boost isn’t as much as you’d get from a related professional degree. BUT in order to get that higher salary you’ll need to give up some of your most productive years. My spouse (who also has a bachelor’s in linguistics) got a master’s degree, found a good job, got promoted and has cultivated a professional social network in the time it’s taken me just to get to the point of starting my dissertation. The opportunity cost of spending five more years (at a minimum–I’ve heard of people who took more than a decade to finish) in school, probably in your twenties, is very, very high. And my spouse can leave work at work, come home on weekends and just chill. This month I’ve got four full weekends of either conferences or outreach. Even worse, no matter how hard I try to stamp it out, I’ve got a tiny little voice in my head that’s very quietly screaming “you should be working” literally all the time.

I’m being absolutely real right now: going to grad school for linguistics is a bad investment of your time and labor. I knew that going in–heck, I knew that before I even applied–and I still went in. Why? Because I decided that, for me, it was a worthwhile trade-off. I really like doing research. I really like being part of the scientific community. Grad school is hard, yes, but overall I’m enjoying myself. And even if I don’t end up being able to find a job in academia (although I’m still hopeful and still plugging away at it) I really, truly believe that the research I’m doing now is valuable and interesting and, in some small way, helping the world. What can I say? I’m a nerdy idealist.

But this is 100% a personal decision. It’s up to you as an individual to decide whether the costs are worth it to you. Maybe you’ll decide, as I have, that they are. But maybe you won’t. And to make that decision you really do need to know what those costs are. I hope I’ve helped to begin making them clear. 

One final thought: Not going to grad school doesn’t mean you’re not smart. In fact, considering everything I’ve discussed above, it probably means you are.

What is linguistic discrimination?

Recently, UC Berkeley student Khairuldeen Makhzoomi was removed from his flight. The reason: he was speaking Arabic. And this isn’t the first time this has happened. Nor the second. These are all, in addition to being deeply disturbing and illegal, examples of linguistic discrimination.

What is linguistic discrimination?

Linguistic discrimination is discrimination based on someone’s language use. And it’s not restricted to the instances I discussed above:

As I’ve talked about before, linguistic discrimination can be a way to discriminate against a specific group of people without saying so in so many words. Linguistic discrimination, in addition to being morally repugnant, is illegal in the U.S. under Titles VI and VII of the Civil Rights Act of 1964.

These are important legal protections and the number of people affected by them is huge: There are over 350 different languages spoken in the United States. In Seattle, where I live, over a fifth of people over age five speak a language other than English at home. That’s a lot of people! Further, most of these individuals are bilingual or multilingual; 90% of second-generation immigrants speak English. And since multilingualism has both neurological benefits for individuals and larger positive impacts on society, I see this as no bad thing. And I’m hardly the only one: how many people that you know are learning or want to learn another language?

Unfortunately, linguistic discrimination threatens this rich diversity, and every person who speaks anything other than the standardized variety of the dominant language.

What can you do?

  • Don’t participate in linguistic discrimination. It can be hard to retrain yourself to reduce the impact of negative stereotypes but, especially if you’re in a position of privilege (as I am), it’s literally the least you can do. Don’t make assumptions about people based on their language use.
  • Stand up for people who may be facing linguistic discrimination. If you see someone being discriminated against in the workplace (like being given lower performance evaluations for having a non-native accent), point out that this is illegal, and back up people who are being discriminated against.
  • Be patient with non-native speakers. Appreciate that they’ve gone through a lot of effort to learn your language. If possible, try and arrange for an interpreter (for face-to-face communication) or translator (for written communications). Sometimes non-native speakers are more comfortable with reading and writing than speaking; offer to communicate through e-mails or other written correspondence.


What’s the difference between frosting and icing?

Fair warning: this post is full of pictures of baked goods. I can’t claim responsibility for any impulsive cake-baking that may result from reading further.

This is the second post in this series. The first half, here, focused on responses to whether “frosting” and “icing” were different things, or different words for the same thing. This post gets a little more in-depth. In the first part, I was just asking people what they thought they said. In the second part, I was asking them to pick words for specific pictures. It’s not a perfect design–by asking people what they think they say first I primed them pretty heavily–but it does reveal some interesting patterns of usage.

The main thing I was interested in was this–did people who said frosting and icing were interchangeable for them actually use them as if they were the same? Why is this a good question to ask? Because it turns out that a lot of the time people aren’t the best judges of how they use language. Especially if there’s some sort of “rule” about how you’re “supposed” to do it. For example, there’s something of a running joke among linguists about how often people will use the passive voice while they’re telling people not to! I don’t think anyone would intentionally lie about their usage, but it’s possible that respondents aren’t always doing exactly what they think they are.

I split my dataset into people who said they thought the words “frosting” and “icing” meant the same thing and those who thought they were different. In the charts below these groups are labelled “same” and “different” respectively. For this stage of analysis, I left out people who weren’t sure; there weren’t a whole lot of them anyway.
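As a rough illustration of that split (the actual analysis was done in R, and this is not that code), here is a small Python sketch. The survey rows and the `shares_by_group` helper are invented for the example.

```python
from collections import Counter, defaultdict

# Invented rows: (answer to "same or different?", word chosen for one picture).
responses = [
    ("different", "frosting"),
    ("different", "frosting"),
    ("different", "frosting"),
    ("different", "icing"),
    ("same", "frosting"),
    ("same", "icing"),
    ("not sure", "icing"),
]

def shares_by_group(responses, groups=("same", "different")):
    """Share of each word within each group; anyone outside `groups` is left out."""
    counts = defaultdict(Counter)
    for group, word in responses:
        if group in groups:  # drops the "not sure" respondents
            counts[group][word] += 1
    return {
        group: {word: n / sum(tally.values()) for word, n in tally.items()}
        for group, tally in counts.items()
    }

shares = shares_by_group(responses)
```

Working with shares rather than raw counts keeps the two groups comparable even though they are different sizes.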

Matcha-cupcakes (6453300119)

So this picture was a pretty canonical example of what people brought up a lot–it’s on a cake, and it’s been both whipped and piped. For a lot of people, then, this should be “frosting”. So what did people say?

The results here were pretty much what I expected. (Whew!) People who thought the words meant different things pretty much all thought this was “frosting”. And there was a pretty strong difference between the groups. But this still doesn’t answer some of my questions. Is it the texture that makes it “frosting” or, as the AP Stylebook suggests, the fact that it’s on a cake? After all, you can definitely put buttercream on a cookie, as evinced by Lofthouse.



Next I had some doughnuts. A lot of people, when I first started asking around, brought up doughnuts as something that they thought were iced rather than frosted. So what did people say?


That does seem to hold true. There was no strong difference between the groups, but there were also a lot of write-in answers. (“Glaze” was especially popular, which, for the record, is probably what I’d say.) So there seems to be more variety in what people call doughnut toppings but there is a tendency towards “icing”.

Cake with fondant

Sao Valentim 2013 (5)

Ok, so this image was a bit of a trick. The cake here is covered in fondant. Which, to me, isn’t really frosting or icing. But if it’s really “being on a cake” that makes something “frosting”, we should see a strong “frosting” bias from people with a distinction.

And that’s just not the case. There’s also a pretty big difference between the groups here. Interestingly, people who thought “frosting” and “icing” are different things were more likely to write in “fondant”. (Remember that level of baking knowledge had no effect on whether people said there was a difference or not, so it’s probably not just specialized knowledge.)

Bundt Cake

Lemon bundt cake (2), January 2010

I included this image for a couple of reasons. Again, I’m poking at this “on a cake” idea. But I also had a lot of people tell me that, for them, the distinction between the words was texture-based. So responses here could have gone two ways: If anything on a cake is frosting, then we’d expect frosting to win. But, if frosting has to be fluffy/whipped, then we’d expect icing to win.


And icing wins! This is no surprise, given the written results summarized in my previous post and the responses for the cake pictures above, but for me it really puts the nail in the coffin of the “on cakes” argument. (Take note, AP Stylebook!) Even on this one, though, people with no distinction are much more likely to be able to use “frosting”.

Sweet Roll

Delicious orange roll

So this is an interesting one. I included it because, for me, cinnamon rolls are synonymous with cream cheese frosting/icing. Since several people I talked to said specifically that cream cheese had to be frosting and not icing, I was expecting a large “frosting” response on this one.


That was definitely not what I saw, though. (Although people with no distinction were much more likely to be able to say “frosting”, so I guess I came by it natural.) Most people, and especially people with a distinction, thought it was “icing”.


So there are two main takeaways here:

  • There’s a strong difference in usage between people who say that “frosting” and “icing” are different things and those who say they aren’t. (For most of the pictures, these groups responded significantly differently.)
  • If there is a difference, it’s got everything to do with texture and nothing to do with cake.
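The post doesn’t say which test sits behind “significantly differently”. One common choice for this kind of group-by-answer table is Pearson’s chi-square test of independence, so here is a hedged Python sketch of that test, written out by hand with no continuity correction. The counts are invented, and the `chi_square` helper is hypothetical rather than anything from the original R code.

```python
def chi_square(table):
    """Pearson chi-square statistic for a table of observed counts (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if group and word choice were unrelated.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat

# Invented counts. Rows: "same" group, "different" group.
# Columns: picked "frosting", picked "icing".
table = [
    [10, 20],
    [40, 10],
]

stat = chi_square(table)

# Critical value for 1 degree of freedom at the 0.05 level.
CRITICAL_05_DF1 = 3.841
significant = stat > CRITICAL_05_DF1
```

A 2x2 table has one degree of freedom, which is why the single critical value is enough here; bigger tables (say, with a “glaze” write-in column) would need a different threshold.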

That’s not to say that these things will always hold true; no one knows better than linguists that language is in a constant state of flux. But for now, these generalizations seem to hold for most of the people surveyed. So if you’re going to make a usage distinction between these words, please make one that’s based on the actual usage and not some completely made-up rule!

A final note: if you’re interested in seeing the (slightly sanitized) data and the R code I used for analysis, both are available here.


Is there a difference between frosting and icing?

So recently, the Associated Press Stylebook posted this on Twitter:

This struck me as 1) kind of a petty usage distinction and 2) completely at odds with my personal usage and what I knew about the dialectal research. The Dictionary of American Regional English, for example, notes that “Frosting” is “widespread, but chiefly North, North Midland, West”. “Icing”, on the other hand, is found all over, “but less freq North, Pacific”. As someone from Virginia but currently living in Seattle, I have no problem using either frosting or icing for a nice buttercream. I’m hardly the only one, either. This baking blog post even says “I use lots of different icings to frost cupcakes”.

Chai white chocolate cupcakes (2)
Frosting or icing, I’ll take a dozen.

BUT when I posted about this on Twitter, some people replied that they did have a very strong distinction between the two words. And the same thing happened when I brought it up with different groups of friends. A lot of people brought up texture, or that they’d say that some things are frosted and others are iced. This was really fascinating to me, both as a baker and a linguist, so I did what any social scientist would and set out to collect some data to get a better idea of what’s going on.

I set up a survey on Google Forms and got 109 responses. First I collected info on where speakers were from, how old they were and how knowledgeable they were about baking. Then I asked them for their general impression of use, and then used pictures to ask what they’d call the sweet topping on a variety of baked goods. To avoid making this blog post absolutely huge, I’m going to split up data discussion. The first half (this one) will look at whether people make a distinction between frosting and icing and whether that’s related to any of their social characteristics. The second half (I’ll link it here when it’s done) will focus on responses to specific images.

Are “frosting” and “icing” different, or are they different words for the same thing?

The first question I asked people was whether frosting and icing were different, or just different words for the same thing. Most people (over 60%) thought that they were different things, while just over a quarter (27%) thought they were different words for the same thing, and the rest weren’t sure. So it does look like there’s some difference in how people use these words. But in and of itself, that’s not very interesting. What I want to know is this: how do people with different social characteristics use these words? (You may remember that I wrote a while ago that this is the central question in sociolinguistics.)


The first thing I wanted to look at was region. I was expecting to see a pretty big difference here, and I wasn’t disappointed. Once I broke down the data by the states people were from, I found a definite pattern: people from the South were far more likely to say that frosting and icing were different words for the same thing. (Virginia isn’t really patterning with the rest of the South, here, but that may be due to a bit of sampling bias–I recruited participants through my social network, and a lot of my friends are from Northern Virginia, which tends not to pattern with the South.)

Most people in the South thought frosting and icing were the same thing, while outside of the South more people thought they were different things. (The darker the blue, the more likely someone from that state was to say that they were different things–black states I didn’t get any respondents from.)

Why is there a distinction? Honestly, I’m not really sure. My intuition, though, is that people from the South probably have pretty wide exposure to both terms. (Since books, TV and movies tend to come from outside of the South, there are plenty of chances to come across other dialectal variants.) However, people from outside the South historically had less exposure to one of the terms–icing–so when they started to come across it they decided that it must refer to something different. As a result, the meanings of both words changed to become more narrow. (This is actually a pretty common process in languages.) I don’t have strong evidence for this theory right now, though, so take it with a couple shakes of salt!


Another thing I wanted to look at was whether the age of respondents played a role in how they used these words. If younger respondents seem to use the word differently than older respondents, it might be because there’s a change happening in the language. Given time, everyone might end up doing the same thing as the younger people.

While it looks like there’s a slight tendency for younger participants to say there’s a difference between frosting and icing, the effect isn’t strong enough to be reliable.

I didn’t find a strong pattern, though. Again, this might be due to sampling problems, since most of my respondents were roughly the same age (21-30). But it could also be that there’s simply not anything to find–that this is neither an ongoing change, nor one where younger people and older people do things differently.

Baking Knowledge

Ok, so it looks like people are varying by region, but not by age… but what about by level of baking knowledge? Maybe you don’t care about the difference if you almost never make or eat baked goods. It could be that people who know a lot about baking make a distinction, and it’s only people who don’t know a beater from a dough hook that are lumping things together.

Baking knowledge also isn’t closely tied to how people use these words. So it’s not just that people who don’t know a lot about baking say they’re the same.

But that’s not what I found. People at all levels of baking knowledge tended to have a pretty even balance between the two uses of the words.


I also collected comments from people, to get more information on what people thought in their own words. Two big themes emerged. One was that the most consistent thing people pointed to as the difference was texture. The other was that people tended to say that one of them was for the cake and the other wasn’t… but which one was which was pretty much random.

Just under half of the comments mentioned texture. I’ve compiled some of the differences below, but the general consensus seems to be that frosting is thick, fluffy and soft, while icing is thin and hard. Take note, AP Stylebook!

Frosting | Icing
creamy or buttery | syrupy, like a glaze
plasticy looking
spread | squeezed or piped
thick and creamy | thin, hardens as it dries
thicker | clear crust, dried
fluffy | thin
thin layer, smooth, glossy
more solid, less flowing | watery, gooey
stays soft | hardens once it sets
thicker, softer | thinner, harder
thick, textured | thin, flat

Six people did specifically mention how the words could be used for cake toppings in their comments. Two people said cakes could be either frosted or iced, two said that cakes could only be iced, and two said that cakes could only be frosted. Here’s an example of an “icing is for cakes” comment:

icing is for cakes! frosting is for all the other deliciousness. usually.

And someone who suggests frosting is for cakes:

I usually apply the word frosting solely to cakelike goods (cupcakes, regular cake) and then icing to everything else.

So… if you are going to claim there’s a difference between frosting and icing, pulling the “it goes on cakes” card is pretty likely to start a fight.  You’re much safer talking about texture. Unless you’re in the South, of course; then you can pretty much say what you like.

Is there a difference between frosting and icing? It looks like the answer mainly depends on where you are. But there were also some pretty interesting differences between different baked goods, so stay tuned for that part of the analysis.

P.S. If you’re interested in seeing the (slightly sanitized) data and the R code I used for analysis, both are available here.

Does white noise really help you study?

So midterms have started here at the University of Washington (already, I know!) and I’m starting to notice more stressed-out study sessions. Around this time of year I always think about all the crazy study hints and tips I’ve heard over the years. (My personal favorite tip is to drink sage tea while I’m reading over notes–it’s been shown to help improve memory.) But one tip that people often share is that listening to white noise can help you concentrate while studying. Being the sort of person I am (read: huge nerd) I decided to set out and see what the research has to say about it.

Ok, with the lab report done, we’ve just got two more twenty-page papers to write before we can sleep. Anyone got some coffee? 

First things first: some noises can definitely be bad for learning. For example, one study compared schools near major airports (which are a big source of noise pollution) with schools that were not, and found that children in the noisier environment had reduced reading comprehension. An earlier, similar study showed that students in classrooms near a very noisy train track did worse academically than those who were not.

And noisy environments are bad for concentration, too. One survey of office workers found that 99% of participants were bothered by noises like ringing telephones and conversations, and that the negative effects of these noises didn’t fade over time. And we know that some types of speech noise–especially half of a telephone conversation–are incredibly distracting.

Ok, so we know that some noise can hurt both learning and concentration… so why fight fire with fire? Wouldn’t listening to white noise just be more of the same? Or even worse?

Well, not necessarily. The really distracting thing about noise is that it’s not predictable. It’s pretty easy to “tune out” a clock ticking because your brain can figure out when it’s going to tick again. When a new noise suddenly starts, however, or keeps happening in an unpredictable way, like a faucet dripping juuuust out of rhythm, your attention snaps to it. There’s actually a special set of “novelty detector neurons” that are looking for any new types of sounds that might show up. There are two ways to avoid this happening. One is to make sure that all your environmental sounds are ones you can easily ignore; the other is to cover them up. And white noise is very effective at covering up other noises.

White noise is random noise that covers a wide frequency spectrum, usually 20 to 20,000 Hz. That means that other sounds that are the same volume or quieter than the white noise can’t “get through”. As a result, you don’t hear anything surprising, your novelty detector neurons stay quiet, and you can focus on what you’re doing. And don’t take my word for it: this study shows that students who listened to a recording of office noises masked with white noise performed much better on tasks than those who listened to the office noises unmasked.
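If you’d like to see what “random noise that covers a wide frequency spectrum” looks like in practice, here’s a minimal Python sketch that makes a white noise recording using only the standard library. This isn’t from any of the studies above; it’s just an illustration. The function names, the default amplitude and the 44,100 Hz sample rate are all my own assumptions. The idea is that every sample is drawn independently at random, so on average no frequency gets more power than any other, and a 44,100 Hz sample rate can represent frequencies up to about 22,050 Hz, which covers the 20 to 20,000 Hz range.

```python
import random
import struct
import wave


def white_noise(n_samples, amplitude=0.1, seed=0):
    """Return n_samples independent random values in [-amplitude, amplitude].

    Independent samples have (on average) equal power at every frequency,
    which is what makes the noise "white". The low default amplitude is an
    assumption chosen to keep the volume down.
    """
    rng = random.Random(seed)
    return [rng.uniform(-amplitude, amplitude) for _ in range(n_samples)]


def write_wav(path, samples, sample_rate=44100):
    """Write float samples in [-1, 1] to a 16-bit mono WAV file."""
    frames = b"".join(struct.pack("<h", int(s * 32767)) for s in samples)
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)       # mono
        wav.setsampwidth(2)       # 2 bytes = 16-bit samples
        wav.setframerate(sample_rate)
        wav.writeframes(frames)
```

Calling write_wav("white_noise.wav", white_noise(44100 * 5)) should give you about five seconds of quiet static (the file name is just a placeholder).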

Now, keep in mind, just because a noise is “white” doesn’t mean it’s good for you. Volume, for one thing, is very important. Exposing rats to 100-dB white noise for 45 minutes was enough for them to undergo measurable stress-induced neurological changes. To be fair, that’s about as loud as a power mower, but it does take you out of the “relaxed concentration” range. So grab your headphones and favorite white noise source (if you’ve no other options, a radio set to static will work just fine) but remember to keep the volume down!

How to Read a Linguistics Article in 8 Easy Steps

Disclaimer: this mostly applies to experimental or quantitative articles, since those are what are common in my field. Your mileage, especially in more formal fields like syntax or semantics, may vary dramatically.


Ok, so you’re not a professional linguist or anything, but you’ve come across an article in a linguistics journal and it sounds interesting. Or maybe you’ve just taken your first linguistics class and you heard about something really cool you want to learn more about. But when you start reading you’re quickly swamped by terms you don’t understand, IPA symbols you’ve never seen before and all sorts of statistics. You’re tempted to just throw in the towel.

Don’t panic! I’m here to help you out with Rachael’s patented* guide to reading linguistics articles.

The first thing to do is take a deep breath and accept that you may not understand everything right away. That’s ok! If you could easily read scientific literature in a field it would mean you were already an expert. Academic writing is designed to be read by other academics, and so it’s full of terms that have very specific meanings in the field. It’s a sort of time-saving code and it takes time to learn. Don’t beat yourself up for being at the beginning of your journey!

With that in mind, here’s the steps I like to follow when I’m starting a new article, especially if it’s in a field I’m less familiar with.

  1. Read the abstract. This will give you a broad outline of what the paper will be about and help you know if the whole article would be interesting or relevant for you.
  2. I like to call this the “sandwich step”. I read the introduction and then the conclusion. Why? Again, this gives me an idea about what will be in the article. Sure, there may be spoilers, but knowing the answer will make it easier to understand how questions were asked.
    1. Notice any new terms that are both in the introduction and the abstract but don’t get explained? This might be a good time to look them up, since the author might be assuming you already know about them.
    2. Some places to look up terms:
      1. The SIL linguistics glossary can be a good place to start.
      2. Linguistics topics on Wikipedia are also a good choice. Linguists even get together at professional events to edit and add to linguistics-related pages.
      3. For a bit more in-depth introduction, Language and Linguistics Compass publishes short articles written by experts that are designed to be introductions to whatever topic they’re on.
  3. Flip through and look for any charts or figures and read their captions. These will be where the author(s) highlight their results. Now that you have a general idea about what’s going on you’ll have a better chance of interpreting these.
  4. Next, read the background section. This is where the author will talk about things that other people have done and how their work fits into the big picture of the field. This is the second place you’re likely to find new terms you’re unfamiliar with. If they’re only used once or twice, don’t worry about looking them up. Your aim is to understand the general thrust of the article, not every little detail! (Now, if you’re a grad student, on the other hand… 😉 )
  5. Now read the methods section. You can probably skim this; unless you’re interested in replicating the study or reviewing its merit, you won’t need a full grasp of all the nitty-gritty nuances of item design and participant recruitment.
  6. Finally, read the results. Unless you have some stats background, you’re probably safe in skipping over the statistical analyses. Again, you just want to understand the general point.
  7. Extra credit: Go back and read the abstract again. This is a very condensed version of what was in the article and is a good way to review/check your understanding.
  8. Sit back and enjoy having read a linguistics article!

Grats on making it through! Now that you’ve caught the bug, what are some ways to find more stuff to read?

  • Go find one of the articles referenced in the one you just read. Since you’re already familiar with similar work, you’ll probably have an easier time understanding the new article.
  • Or read something more recent that cites the article you’ve read. You can look up articles that cite the one you’ve read on Google Scholar, as this video explains.
  • Look up other issues of the journal your paper was in. Most journals publish in a pretty narrow range of topics so you’ll have a leg up on understanding the new articles.
  • Ask a linguist! We’re a friendly bunch and pretty responsive to e-mail. You might even see if you can find the contact info of the author(s) of the article you read to ask them for suggestions for other stuff to read.

I hope this has been helpful and piqued your interest about diving into linguistics research. Now get out there and get reading!

*Not actually patented.

Why can you mumble “good morning” and still be understood?

I got an interesting question on Facebook a while ago and thought it might be a good topic for a blog post:

I say “good morning” to nearly everyone I see while I’m out running. But I don’t actually say “good”, do I? It’s more like “g’ morning” or “uh morning”. Never just morning by itself, and never a fully articulated good. Is there a name for this grunt that replaces a word? Is this behavior common among English speakers, only southeastern speakers, or only pre-coffee speakers?

This sort of thing is actually very common in speech, especially in conversation. (Or “in the wild” as us laboratory types like to call it.) The fancy-pants name for it is “hypoarticulation”. That’s less (hypo) speech-producing movements of the mouth and throat (articulation). On the other end of the spectrum you have “hyperarticulation” where you very. carefully. produce. each. individual. sound.

Ok, so you can change how much effort you put into producing speech sounds, fair enough. But why? Why don’t we just sort of find a happy medium and hang out there? Two reasons:

  1. Humans are fundamentally lazy. To clarify: articulation costs energy, and energy is a limited resource. More careful articulation also takes more time, which, again, is a limited resource. So the most efficient speech will be very fast and made with very small articulator movements. Reducing the word “good” to just “g” or “uh” is a great example of this type of reduction.
  2. On the other hand, we do want to communicate clearly. As my advisor’s fond of saying, we need exactly enough pointers to get people to the same word we have in mind. So if you point behind someone and say “er!” and it could be either a tiger or a bear, that’s not very helpful. And we’re very aware of this in production: there’s evidence that we’re more likely to hyperarticulate words that are harder to understand.

So we want to communicate clearly and unambiguously, but with as little effort as possible. But how does that tie in with this example? “G” could be “great” or “grass” or “génial”, and “uh” could be any number of things. For this we need to look outside the linguistic system.

The thing is, language is a social activity and when we’re using language we’re almost always doing so with other people. And whenever we interact with other people, we’re always trying to guess what they know. If we’re pretty sure someone can get to the word we mean with less information, for example if we’ve already said it once in the conversation, then we will expend less effort in producing the word. These contexts where things are really easily guessable are called “low entropy”. And in a social context like jogging past someone in the morning, phrases like “good morning” have very low entropy. Much lower than, for example, “Could you hand me that pickle?”–if you jogged past someone and said that you’d be very likely to hyperarticulate to make sure they understood.
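To make “low entropy” a little more concrete, here’s a small Python sketch of the standard Shannon entropy calculation, H = -Σ p·log2(p), measured in bits. The probabilities are completely made up for illustration (I don’t have real counts of what joggers say to each other); the point is just that when one phrase soaks up most of the probability, the entropy is much lower than when lots of things are equally likely.

```python
import math


def entropy_bits(probabilities):
    """Shannon entropy in bits: H = -sum(p * log2(p)) over nonzero p."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


# Hypothetical probabilities of what a passing jogger says in the morning.
# These numbers are invented for illustration, not measured.
jogging_context = {
    "good morning": 0.90,
    "hello": 0.05,
    "nice day": 0.04,
    "anything else": 0.01,
}

# Hypothetical context where four different utterances are equally likely.
open_context = {
    "utterance A": 0.25,
    "utterance B": 0.25,
    "utterance C": 0.25,
    "utterance D": 0.25,
}

low = entropy_bits(jogging_context.values())
high = entropy_bits(open_context.values())  # exactly 2 bits for 4 equal options
print(f"jogging context: {low:.2f} bits, open context: {high:.2f} bits")
```

The more predictable the context, the fewer bits the listener needs from the speech signal itself, which is why a mumbled “g’ morning” can still do the job.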

Do you tweet the way you speak?

So one of my side projects is looking at what people are doing when they choose to spell something differently–what sort of knowledge about language are we encoding when we decide to spell “talk” like “tawk”, or “playing” like “pleying”? Some of these variant spellings probably don’t have anything to do with pronunciation, like “gawd” or “dawg”, which I think are more about establishing a playful, informal tone. But I think that some variant spellings absolutely are encoding specific pronunciation. Take a look at this tweet, for example (bolding mine):

There are three different variant spellings here, two of which look like th-stopping (where the “th” sound as in “that” is produced as a “d” sound instead) and one that looks like r-lessness (where someone doesn’t produce the r sound in some words). But unfortunately I don’t have a recording of the person who wrote this tweet; there’s no way I can know if they produce these words in the same way in their speech as they do when typing.

Fortunately, I was able to find someone who 1) uses variant spellings in their Twitter and 2) I could get a recording of:

This let me directly compare how this particular speaker tweets to how they speak. So what did I find? Do they tweet the same way they speak? It turns out that that actually depends.

  • Yes! For some things (like the th-stopping and r-lessness I mentioned above) this person does tweet and speak in pretty much the same way. They won’t use an “r” in spelling where they wouldn’t say an “r” sound and vice versa.
  • No! But for other things (like saying “ing” words “in” or saying words like “coffin” and “coughing” with a different vowel in the first syllable) while this person does them a lot in their speech, they aren’t using variant spellings at the same level in their tweets. So they’ll say “runnin” 80% of the time, for example, but type it as “running” 60% of the time (rather than 20%, which is what we’d expect if the Twitter and speech data were showing the same thing).
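The arithmetic behind that last comparison can be sketched in a few lines of Python. The counts below are hypothetical, picked only so that they match the example percentages in the text (they are not my actual data), and the function names and the 10-point tolerance are assumptions for illustration rather than the analysis I actually ran.

```python
def variant_rate(variant_count, total_count):
    """Share of tokens that use the variant form (e.g. "runnin")."""
    if total_count <= 0:
        raise ValueError("total_count must be positive")
    return variant_count / total_count


def tweets_like_speech(speech_rate, tweet_rate, tolerance=0.1):
    """True if the two rates are within `tolerance` of each other.

    The tolerance is an arbitrary illustrative cutoff, not a statistical test.
    """
    return abs(speech_rate - tweet_rate) <= tolerance


# Hypothetical counts matching the example percentages in the text.
speech_rate = variant_rate(80, 100)  # says "runnin" 80% of the time
tweet_rate = variant_rate(40, 100)   # types "running" 60% of the time, so "runnin" 40%

gap = speech_rate - tweet_rate
print(f"speech: {speech_rate:.0%}, tweets: {tweet_rate:.0%}, gap: {gap:.0%}")
print("tweets like speech?", tweets_like_speech(speech_rate, tweet_rate))
```

If the tweets mirrored the speech, the variant would show up about 80% of the time in both, and the gap would be close to zero.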

So what’s going on? Why are only some things being used in the same way on Twitter and in speech? To answer that we’ll need to dig a little deeper into the way these things are used in speech.

  • How are th-stopping and r-lessness being used in speech? So when you compare the video above to one of the sports radio announcer who’s being parodied (try this one) you’ll find that they’re actually used more in the video above than they are in the speech that’s being parodied. This is pretty common in situations where someone’s really laying on a particular accent (even one they speak natively), which sociolinguists call a performance register.
  • What about the other things? The things that aren’t being used as often on Twitter as they are in speech, on the other hand, actually show up at the same levels in speech, both for the parody and the original. This speaker isn’t overshooting their use of these features; instead they’re just using them in the way that another native speaker of a dialect would.

So there’s a pretty robust pattern showing up here. This person is only tweeting the way they speak for a very small set of things: those things that are really strongly associated with this dialect and that they’re really playing up in their speech. In other words, they tend to use the things that they’re paying a lot of attention to in the same way both in speech and on Twitter. That makes sense. If you’re very careful not to do something when you’re talking–splitting an infinitive or ending a sentence with a preposition, maybe–you’re probably not going to do it when you’re writing, either. But if there’s something that you do all the time when you’re talking and aren’t really aware of, then it probably won’t show up in your writing. For example, there are lots of little phrases I’ll use in my speech (like “no worries”, for example) that I don’t think I’ve ever written down, even in really informal contexts. (Except for here, obviously.)

So the answer to whether tweets and speech act the same way is… it depends. Which is actually really useful! Since it looks like it’s only the things that people are paying a lot of attention to that get overshot in speech and Twitter, this can help us figure out what things people think are really important by looking at how they use them on Twitter. And that can help us understand what it is that makes a dialect sound different, which is useful for things like dialect coaching, language teaching and even helping computers understand multiple dialects well.

(BTW, if you’re interested in more details on this project, you can see my poster, which I’ll be presenting at NWAV44 this weekend, here.)