Are there differences in automatic caption error rates due to pitch or speech rate?

So after my last blog post went up, a couple people wondered if the difference in classification error rates between men and women might be due to pitch, since men tend to have lower voices. I had no idea, so, being experimentally inclined, I decided to find out.

First, I found the longest list of words that I could from the accent tag. Pretty much every video I looked at used a subset of these words.

Aunt, Roof, Route, Wash, Oil, Theater, Iron, Salmon, Caramel, Fire, Water, Sure, Data, Ruin, Crayon, New Orleans, Pecan, Marriage, Both, Again, Probably, Spitting Image, Alabama, Guarantee, Lawyer, Coupon, Mayonnaise, Ask, Potato, Three, Syrup, Cool Whip, Pajamas, Caught, Catch, Naturally, Car, Aluminium, Envelope, Arizona, Waffle, Auto, Tomato, Figure, Eleven, Atlantic, Sandwich, Attitude, Officer, Avocado, Saw, Bandana, Oregon, Twenty, Halloween, Quarter, Muslim, Florida, Wagon

Then I recorded myself reading them at a natural pace, with list intonation. In order to better match the speakers in the other YouTube videos, I didn’t go into the lab and break out the good microphones; I just grabbed my gaming headset and used that mic. Then, I used Praat (a free, open source software package for phonetics) to shift the pitch of the whole file up and down 60 Hertz in 20 Hertz intervals. That left me with seven sound files in total: the original one; three files that were 20, 40 and 60 Hertz higher; and three that were 20, 40 and 60 Hertz lower. You can listen to all the files individually here.
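If you want to batch this sort of manipulation yourself, here’s a rough sketch of how you might drive Praat from R. The script name (shift_pitch.praat) and its argument order are placeholders for illustration, not my actual setup; it assumes a Praat script that takes an input wav, a shift in Hertz, and an output path.

```r
# Hypothetical batch pitch-shifting: assumes Praat is on your PATH and that
# "shift_pitch.praat" (a placeholder name) takes input wav, shift in Hz, output wav
shifts <- seq(-60, 60, by = 20)            # -60 ... +60 Hz in 20 Hz steps
for (s in shifts) {
  out <- sprintf("wordlist_%+03d.wav", s)  # e.g. wordlist_+20.wav
  system2("praat", c("--run", "shift_pitch.praat", "wordlist.wav", s, out))
}
```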

The original recording had a mean of 192 Hz and a median of 183 Hz, which means that my voice is slightly lower pitched than average for an American English-speaking woman. For reference, Pepiot 2014 found a mean pitch of 210 Hz for female American English speakers. The same paper also lists a mean pitch of 119 Hz for male American English speakers. This means that my lowest pitch manipulation (mean of 132 Hz) is still higher than the average American English-speaking male. I didn’t want to go too much lower with my pitch manipulations, though, because the sound files were starting to sound artifact-y and robotic.

Why did I do things this way?

  • Only using one recording. This lets me control 100% for demographic information. I’m the same person, with the same language background, saying the same words in the same way. If I’d picked a bunch of speakers with different pitches, they’d also have different language backgrounds and voices. Plus I’m not getting effects from using different microphones.
  • Manipulating pitch both up and down. This was for two reasons. First, it means that the original recording isn’t the end-point for the pitch continuum. Second, it means that we can pick apart whether accuracy is a function of pitch or just the file having been manipulated.

Results:

You can see for yourself how well the auto-captions did by checking out this video. Make sure to hit the CC button in the lower left-hand corner.


The first thing I noticed was that I had really, really good results with the auto captions. Waaayyyy better than any of the other videos I looked at. There were nine errors across 434 tokens, for a total error rate of only 2%, which I’d call pretty much at ceiling. There was maaayybe a slight effect of the pitch manipulation, with higher pitches having slightly higher error rates, as you can see:

[Figure: percent error by pitch manipulation]

BUT there’s also sort of a U-shaped curve, which suggests to me that the recognizer is doing worse with the files that have been messed with the most. (Although, weirdly, only the file that had had its pitch shifted up by 20 Hz had no errors.) I’m going to go ahead and say that I’m not convinced that pitch is a determining factor.

So why were these captions so much better than the ones I looked at in my last post? It could just be that I was talking very slowly and clearly. To check that out, I looked at the auto-captions for the most recent video posted by someone who’s fairly similar to me in terms of social and vocal characteristics: a white woman who speaks standardized American English with Southern features. Ideally I’d match for socioeconomic class, education and rural/urban background as well, but those are harder to get information about.

I chose Bunny Meyer, who posts videos as Grav3yardgirl. In this video her speech style is fast and conversational, as you can hear for yourself:

To make sure I had roughly the same amount of data as I had before, I checked the captions for the first 445 words, which was about two minutes’ worth of video (you can check my work here). There was an overall error rate of approximately 8%, if you count skipped words as errors. Which, considering that recognizing words in fast/connected speech is generally more error-prone, is pretty good. It’s definitely better than in the videos I analyzed for my last post. It’s also a fairly small difference from my careful speech: definitely less than the 13% difference I found for gender.

So it looks like neither the speed of speech nor the pitch is strongly affecting the recognition rate (at least for videos captioned recently). There are a couple of other things that I think may be going on here that I’m going to keep poking at:

  • ASR has gotten better over time. It’s totally possible that more women just did the accent tag challenge earlier, and thus had higher error rates because the speech recognition system was older and less accurate. I’m going to go back and tag my dataset for date, though, and see if that shakes out some of the gender differences.
  • Being louder may be important, especially in less clear recordings. I used a head-mounted microphone in a quiet room to make my recordings, and I’m assuming that Bunny uses professional recording equipment. If you’re recording outside or with a device microphone, though, there’s going to be a lot more noise. If your voice is louder, and men’s voices tend to be, it should be easier to understand in noise. My intuition is that, since there are gender differences in how loud people talk, some of the error may be due to intensity differences in noisy recordings. Then again, an earlier study found no difference in speech recognition rates for men and women in airplane cockpits, which are very noisy, so who knows? Testing that out will have to wait for another day, though.

Google’s speech recognition has a gender bias

In my last post, I looked at how Google’s automatic speech recognition worked with different dialects. To get this data, I hand-checked annotations for more than 1500 words from fifty different accent tag videos.

Now, because I’m a sociolinguist and I know that it’s important to stratify your samples, I made sure I had an equal number of male and female speakers for each dialect. And when I compared performance on male and female talkers, I found something deeply disturbing: YouTube’s auto captions consistently performed better on male voices than female voices (t(47) = -2.7, p < 0.01). (You can see my data and analysis here.)
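For anyone who wants to see the shape of that test, here’s a minimal sketch (not my exact analysis script; the data frame and column names are made up for illustration):

```r
# `captions` has one row per speaker: `gender` plus `prop_correct`,
# the proportion of that speaker's words captioned correctly
t.test(prop_correct ~ gender, data = captions)
```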

[Figure: accuracy by gender]

On average, less than half (47%) of each female speaker’s words were captioned correctly. The average male speaker, on the other hand, was captioned correctly 60% of the time.

It’s not that there’s a consistent but small effect size, either: 13% is a pretty big effect. The Cohen’s d was 0.7, which means, in non-math-speak, that if you pick a random man and a random woman from my sample, there’s an almost 70% chance the transcriptions will be more accurate for the man. That’s pretty striking.
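If you’re curious where that “almost 70%” comes from, it’s the common language effect size, which you can get straight from Cohen’s d:

```r
# For two normal distributions whose means differ by d standard deviations,
# the chance that a random draw from one beats a random draw from the other:
pnorm(0.7 / sqrt(2))   # ≈ 0.69 for d = 0.7
```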

What it is not, unfortunately, is shocking. There’s a long history of speech recognition technology performing better for men than women.

This is a real problem with real impacts on people’s lives. Sure, a few incorrect Youtube captions aren’t a matter of life and death. But some of these applications have a lot higher stakes. Take the medical dictation software study. The fact that men enjoy better performance than women with these technologies means that it’s harder for women to do their jobs. Even if it only takes a second to correct an error, those seconds add up over the days and weeks to a major time sink, time your male colleagues aren’t wasting messing with technology. And that’s not even touching on the safety implications of voice recognition in cars.


So where is this imbalance coming from? First, let me make one thing clear: the problem is not with how women talk. The suggestion that, for example, “women could be taught to speak louder, and direct their voices towards the microphone” is ridiculous. In fact, women use speech strategies that should make it easier for voice recognition technology to work on women’s voices. Women tend to be more intelligible (for people without high-frequency hearing loss), and to talk slightly more slowly. In general, women also favor more standard forms and make less use of stigmatized variants. Women’s vowels, in particular, lend themselves to classification: women produce longer vowels that are more distinct from each other than men’s are. One thing that may be making a difference is that women also tend not to be as loud, partly as a function of just being smaller, and cepstral features (the fancy math that’s under the hood of most automatic speech recognition) are sensitive to differences in intensity. None of this means that women’s voices are inherently more difficult; I’ve trained classifiers on speech data from women and they worked just fine, thank you very much. What it does mean is that women’s voices are different from men’s voices, so a system designed around men’s voices just won’t work as well for women’s.

Which leads right into where I think this bias is coming from: unbalanced training sets. Like car crash dummies, voice recognition systems were designed for (and largely by) men. Over two thirds of the authors in the Association for Computational Linguistics Anthology Network are male, for example. Which is not to say that there aren’t truly excellent female researchers working in speech technology (Mari Ostendorf and Gina-Anne Levow here at the UW and Karen Livescu at TTI-Chicago spring immediately to mind) but they’re outnumbered. And that imbalance seems to extend to the training sets, the annotated speech that’s used to teach automatic speech recognition systems what things should sound like. Voxforge, for example, is a popular open source speech dataset that “suffers from major gender and per speaker duration imbalances.” I had to get that info from another paper, since Voxforge doesn’t have speaker demographics available on their website. And it’s not the only popular corpus that doesn’t include speaker demographics: neither does the AMI meeting corpus, nor the Numbers corpus. And when I could find the numbers, they weren’t balanced for gender. TIMIT, which is the single most popular speech corpus in the Linguistic Data Consortium, is just over 69% male. I don’t know what speech database the Google speech recognizer is trained on, but based on the speech recognition rates by gender I’m willing to bet that it’s not balanced for gender either.

Why does this matter? It matters because there are systematic differences between men’s and women’s speech. (I’m not going to touch on the speech of other genders here, since that’s a very young research area. If you’re interested, the Journal of Language and Sexuality is a good jumping-off point.) And machine learning works by making computers really good at dealing with things they’ve already seen a lot of. If they get a lot of speech from men, they’ll be really good at identifying speech from men. If they don’t get a lot of speech from women, they won’t be that good at identifying speech from women. And it looks like that’s the case. Based on my data from fifty different speakers, Google’s speech recognition (which, if you remember, is probably the best-performing proprietary automatic speech recognition system on the market) just doesn’t work as well for women as it does for men.

Which accents does automatic speech recognition work best for?

If your primary dialect is something other than Standardized American English (that sort of from-the-US-but-not-anywhere-in-particular type of English you hear a lot of on the news) you may have noticed that speech recognition software doesn’t generally work very well for you. You can see the sort of thing I’m talking about in this clip:

This clip is a little old, though (2010). Surely voice recognition technology has improved since then, right? I mean, we’ve got more data and more computing power than ever. Surely somebody’s gotten around to making sure that the current generation of voice-recognition software deals equally well with different dialects of English. Especially given that those self-driving cars that everyone’s so excited about are probably going to use voice-based interfaces.

To check, I spent some time on YouTube looking at the accuracy of automatic captions for videos of the accent tag challenge, which was developed by Bert Vaux. I picked YouTube automatic captions because they’re done with Google’s automatic speech recognition technology–which is one of the most accurate commercial systems out there right now.

Data: I picked videos with accents from Maine (U.S.), Georgia (U.S.), California (U.S.), Scotland and New Zealand. I picked these locations because they’re pretty far from each other and also have pretty distinct regional accents. All speakers from the U.S. were (by my best guess) white and all looked to be young-ish. I’m not great at judging age, but I’m pretty confident no one was above fifty or so.

What I did: For each location, I checked the accuracy of the automatic captions on the word-list part of the challenge for five male and five female speakers. So I have data for a total of 50 people across 5 dialect regions. For each word in the word list, I marked it as “correct” if the entire word was correctly captioned on the first try. Anything else was marked wrong. To be fair, the words in the accent tag challenge were specifically chosen because they have a lot of possible variation. On the other hand, they’re single words spoken in isolation, which is pretty much the best case scenario for automatic speech recognition, so I think it balances out.

Ok, now the part you’ve all been waiting for: the results. Which dialects fared better and which worse? Does dialect even matter? First the good news: based on my (admittedly pretty small) sample, the effect of dialect is so weak that you’d have to be really generous to call it reliable. A linear model that estimated number of correct classifications based on total number of words, speaker’s gender and speaker’s dialect area fared only slightly better (p = 0.08) than one that didn’t include dialect area. Which is great! No effect means dialect doesn’t matter, right?
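In case you want to see what that model comparison looks like, here’s a rough reconstruction (the data frame and column names are illustrative, not my actual code):

```r
# per-speaker counts: words captioned correctly, total words, gender, dialect
m_full    <- lm(correct ~ total + gender + dialect, data = captions)
m_reduced <- lm(correct ~ total + gender, data = captions)
anova(m_reduced, m_full)   # does adding dialect area improve the fit?
```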

Weellll, not really. Based on a power analysis, I really should have sampled forty people from each dialect, not ten. Unfortunately, while I love y’all and also the search for knowledge, I’m not going to hand-annotate two hundred YouTube videos for a side project. (If you’d like to add data, though, feel free to fork the dataset on GitHub here. Just make sure to check the URL for the video you’re looking at so we don’t double dip.)
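Here’s one way to run that sort of power analysis with the pwr package; the medium effect size (f = 0.25) is an assumption I’m plugging in for illustration, not a measured value:

```r
library(pwr)
# balanced one-way comparison across the 5 dialect groups, 80% power
pwr.anova.test(k = 5, f = 0.25, sig.level = 0.05, power = 0.80)
# gives n ≈ 39 speakers per dialect group -- roughly forty, not ten
```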

So while I can’t confidently state there is an effect, based on the fact that I’m sort of starting to get one with only a quarter of the amount of data I should be using, I’m actually pretty sure there is one. No one’s enjoying stellar performance (there’s a reason that they tend to be called AutoCraptions in the Deaf community) but some dialect areas are doing better than others. Look at this chart of accuracy by dialect region:

[Figure: accuracy by dialect]

Proportion of correctly recognized words by dialect area, color coded by country.

There’s variation, sure, but in general the recognizer seems to be working best on people from California (which just happens to be where Google is headquartered) and worst on Scottish English. The big surprise for me is how well the recognizer works on New Zealand English, especially compared to Scottish English. It’s not a function of country population (NZ = 4.4 million, Scotland = 5.2 million). My guess is that it might be due to sampling bias in the training sets, especially if, say, there were some ’90s TV shows in there; there’s a lot of captioned New Zealand English in Hercules, Xena and related spin-offs. There’s also a Google outreach team in New Zealand, but not Scotland, so that might be a factor as well.

So, unfortunately, it looks like the lift skit may still be current. ASR still works better for some dialects than others. And, keep in mind, these are all native English speakers! I didn’t look at non-native English speakers, but I’m willing to bet the system is also letting them down. Which is a shame: how well voice recognition works for you still depends on where you’re from. Maybe in another six years I’ll be able to write a blog post that says it doesn’t.

What types of emoji do people want more of?

So if you’re a weird internet nerd like me, you might already know that Unicode 9.0 was released today. The deets are here, but they’re fairly boring unless you really care about typography. What’s more interesting to me, as someone who studies visual, spoken and written language, is that there’s a whole batch of new emoji. And it’s led to lots of interesting speculation about, for example, what the most popular new emoji is going to be (tldr: probably the ROFL face; people have a strong preference for using positive face emojis). This led me to wonder: what obvious lexical gaps are there?

[I]n some cases it is useful to refer to the words that are not part of the vocabulary: the nonexisting words. Instead of referring to nonexisting words, it is common to speak about lexical gaps, since the nonexisting words are indications of “holes” in the lexicon of the language that could be filled.

Janssen, M. 2012. “Lexical Gaps”. The Encyclopedia of Applied Linguistics.

This question is pretty easy to answer for emoji–we can just find out what words people are most likely to use when they’re complaining about not being able to use an emoji. There’s even a Twitter bot that collects these kinds of tweets. I decided to do something similar, but with a twist: I wanted to know what kinds of emoji people complain about wanting the most.

Boring technical details 💤

  1. Yesterday, I grabbed 4817 recent tweets that contained both the words “no” and “emoji”. (You can find the R script I used for this on my Github.)
  2. For each tweet, I took the two words occurring directly in front of the word “emoji” and created a corpus from them using the tm (text mining) package.
  3. I tidied up the corpus–removing super-common words like “the”, making everything lower-case, and so on. (The technical term is “cleaning”, but I like the sound of tidying better. It sounds like you’re getting comfy with your data, not delousing it.)
  4. I ranked these words by frequency, or how often they showed up. There were 1888 distinct words, but the vast majority (1280) showed up only once. This is completely normal for word frequency data and is described by Zipf’s law.
  5. I then took all words that occurred more than three times and did a content analysis. (A rough R sketch of steps 2–4 is below.)
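Here’s a rough sketch of what steps 2 through 4 might look like with the tm package. This is a reconstruction for illustration, not my actual script (which is on my Github); `tweets` stands in for the text of the tweets from step 1.

```r
library(tm)

# step 2: grab the two words immediately before each "emoji"
words_before <- unlist(lapply(tweets, function(text) {
  toks <- strsplit(tolower(text), "\\s+")[[1]]
  hits <- which(toks == "emoji")
  unlist(lapply(hits[hits > 2], function(i) toks[(i - 2):(i - 1)]))
}))

# step 3: build a corpus and tidy it up
corpus <- VCorpus(VectorSource(words_before))
corpus <- tm_map(corpus, removePunctuation)
corpus <- tm_map(corpus, removeWords, stopwords("en"))

# step 4: rank words by how often they showed up
tdm   <- TermDocumentMatrix(corpus)
freqs <- sort(rowSums(as.matrix(tdm)), decreasing = TRUE)
head(freqs, 20)
```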


Exciting results! 😄

At the end of my content analysis, I arrived at nine distinct categories. I’ve listed them below, with the most popular four terms from each. One thing I noticed right off is how many of these are emoji that either already exist or are in the Unicode update. To highlight this, I’ve italicized terms in the list below that don’t have an emoji.

  • animal: shark, giraffe, butterfly, duck
  • color: orange, red, white, green
  • face: crying, angry, love, hate
  • (facial) feature: mustache, redhead, beard, glasses
  • flag: flag, England, Welshpride
  • food: bacon, avocado, salt, carrot
  • gesture: peace, finger, middle, crossed
  • object: rifle, gun, drum, spoon
  • person: mermaid, pirate, clown, chef

(One note: the rifle is in Unicode 9.0, but isn’t an emoji. This has been the topic of some discussion, and is probably why it’s so frequent.)

Based on these categories, where are the lexical gaps? The three categories with the most distinct items in them are, in order: 1) food, 2) animals and 3) objects. These are also the three categories with the most mentions across all items.

So, given that so many people are talking about emojis for animals, food and objects, why aren’t the bulk of emojis in these categories? We can see why this might be by comparing how many different items get mentioned in each category to how many times each item is mentioned.

[Figure: number of distinct items in each category vs. mentions per item]

Yeah, people talk about food a lot… but they also talk about a lot of different types of food. On the other hand you have categories like colors, which aren’t talked about as much but where the same colors come up over and over again.

As you can see from the figure above, the most popular categories have a lot of different things in them, but each thing is mentioned relatively rarely. So while there is an impassioned zebra emoji fanbase, it only comes up three times in this dataset. On the other hand, “red” is fairly common but shows up because of discussion of, among other things, flowers, shoes and hair color. Some categories, like flags, fall in a happy medium–lots of discussion and fairly few suggestions for additions.
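If you want to make that comparison concrete, here’s a sketch assuming a data frame `coded` with one row per distinct word and columns `word`, `category` and `count` (how many tweets mentioned it); the names are illustrative:

```r
types  <- tapply(coded$word,  coded$category, length)  # distinct items per category
tokens <- tapply(coded$count, coded$category, sum)     # total mentions per category
round(tokens / types, 1)   # average mentions per item, by category
```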

Based on this teeny data set, I’d say that if the Unicode Consortium continues to be in charge of emoji standardization, it’ll have its hands full for quite some time to come. There’s a lot of room for growth, and most of it is in food, animals and objects, which all have a lot of possible items, rather than gestures or facial expressions, which have far fewer.

Why do Canadians say ‘eh’?

Perhaps it’s because Seattle is so close to Canada, but for some reason when I ask classes of undergraduate students what they want to know about language and language use, one question I tend to get a lot is:

 Why do Canadians say ‘eh’?


Fortunately for my curious students, this is actually an active area of inquiry. (It’s one of those research questions where there was a flurry of work–in this case in the 1970s–and then a couple of quiet decades followed by a resurgence in interest. The ‘eh’ renaissance started in the mid-2000s and continues today. For some reason, at least in linguistics, this sort of thing tends to happen a lot. I’ll leave discussing why this particular pattern is so common to the sociologists of science.) So what do we know about ‘eh’?

Is ‘eh’ actually Canadian?

‘Eh’ has quite the pedigree–it’s first attested in Middle English and even shows up in Chaucer. Canadian English, however, boasts a more frequent use of ‘eh’, which can fill the same role as ‘right?’, ‘you know?’ or ‘innit?’ for speakers of other varieties of English.

What does ‘eh’ mean?

The real thing that makes an ‘eh’ Canadian, though, is how it’s used. Despite some claims to the contrary, “eh” is far from meaningless. It has a limited number of uses (Elaine Gold identified an even dozen in her 2004 paper) some of which aren’t found outside of Canada. Walter Avis described two of these uniquely Canadian uses in his 1972 paper, “So eh? is Canadian, eh” (it’s not available anywhere online as far as I can tell):

  1. Narrative use: Used to punctuate a story, in the same way that an American English speaker (south of the border, that is) might use “right?” or “you know?”
    1. Example: I was walking home from school, eh? I was right by that construction site where there’s a big hole in the ground, eh? And I see someone toss a piece of trash right in it.
  2. Miscellaneous/exclamation use: Tacked on to the end of a statement. (Although more recent work, presented by Martina Wiltschko and Alex D’Arcy at last year’s NWAV, suggests that there’s really a limited number of ways to use this type of ‘eh’ and that they can be told apart by the way the speaker uses pitch.)
    1. Example: What a litterbug, eh?

And these uses seem to be going strong. Gold found that use of ‘eh’ in a variety of contexts has either increased or remained stable since 1980.

That’s not to say there’s no change going on, though. D’Arcy and Wiltschko found that younger speakers of Canadian English are more likely than older speakers to use ‘right?’ instead of ‘eh?’. Does this mean that ‘eh’ may be going the way of the dodo or ‘sliver’ to mean ‘splinter’ in British English?

Probably not–but it may show up in fewer places than it used to. In particular, in their 2006 study Elaine Gold and Mireille Tremblay found that almost half of their participants felt negatively about the narrative use of ‘eh’ and that only 16% actually used it themselves. This suggests that this type of uniquely Canadian usage may be on its way out.

Should you go to grad school for linguistics?

So I’ve had this talk, in different forms, with lots of different people over the last couple of years. Mainly undergrads thinking about applying to PhD programs in linguistics but, occasionally, people in industry thinking about going back to school as well. Every single one of these people was smart, cool, dedicated, hard-working, a great linguist and would have been an asset to the field. And when they asked me, a current linguistics graduate student, whether it was a good idea to go to grad school in linguistics, I gave them all the same answer: don’t.

“But Rachael,” you say, “you’re going to grad school in linguistics and having all sorts of fun. Why are you trying to keep me from doing the same thing?” Two big reasons.

The Job Market for Linguistics PhDs

What do you want to do when you get out of grad school? If you’re like most people, you’ll probably say you want to teach linguistics at the college or university level. What you should know is that this is an increasingly unsustainable career path.

In 1975, 30 percent of college faculty were part-time. By 2011, 51 percent of college faculty were part-time, and another 19 percent were non–tenure track, full-time employees. In other words, 70 percent were contingent faculty, a broad classification that includes all non–tenure track faculty (NTTF), whether they work full-time or part-time.

More Than Half of College Faculty Are Adjuncts: Should You Care? by Dan Edmonds.

And most of these part-time faculty, or adjuncts, are very poorly paid. This survey from 2015 found that 62% of adjuncts made less than $20,000 a year. This is even more upsetting when you consider that you need a PhD and scholarly publications to even be considered for one of these posts.

(“But what about being paid for your research publications?” you ask. “Surely you can make a few bucks by publishing in those insanely expensive academic journals.” While I understand where you’re coming from–in almost any other professional publishing context it’s completely normal to be paid for your writing–authors of academic papers are not paid. Nor are the reviewers. Furthermore, authors are often charged fees by the publishers. One journal I was recently looking at charges $2,900 per article, which is about three times the funding my department gives us for research over our entire degree. Not a scam journal, either–an actual reputable venue for scholarly publication.)

Yes, there are still tenure-track positions available in linguistics, but they are by far the minority. What’s more, even including adjunct positions, there are still fewer academic posts than graduating linguists with PhDs. It’s been that way for a while, too, so even for a not-so-great adjunct position you’ll be facing stiff competition. Is it impossible to find a good academic post in linguistics? No. Are the odds in your (or my, or any other current grad student’s) favor? Also no. But don’t take it from me. In Surviving Linguistics: A Guide for Graduate Students (which I would highly recommend) Monica Macaulay says:

[It] is common knowledge that we are graduating more PhDs than there are faculty positions available, resulting in certain disappointment for many… graduates. The solution is to think creatively about job opportunities and keep your options open.

As Dr. Macaulay goes on to outline, there are jobs for linguists outside academia. Check out the LSA’s Linguistics Beyond Academia special interest group or the Linguists Outside Academia mailing list. There are lots of things you can do with a linguistics degree, from data science to forensic linguistics.

That said, there are degrees that will better prepare you for a career than a PhD in theoretical linguistics. A master’s degree in Speech Language Pathology (SLP) or Computational Linguistics or Teaching English to Speakers of Other Languages (TESOL) will prepare you for those careers far better than a general PhD.

Even if you’re 100% dead set on teaching post-secondary students, you should look around and see what linguists are doing outside of universities. Sure, you might win the job-lottery, but at least some of your students probably won’t, and you’ll want to make sure they can find well-paying, fulfilling work.

Grad School is Grueling

Yes, grad school can absolutely be fun. On a good day, I enjoy it tremendously. But it’s also work. (And don’t give me any nonsense about it not being real work because you do it sitting down. I’ve had jobs that required hard physical and/or emotional labor, and grad school is exhausting.) I feel like I probably have a slightly better than average work/life balance–partly thanks to my fellowship, which means I have limited teaching duties and don’t need a second job any more–and I’m still actively trying to get better about stopping work when I’m tired. I fail, and end up all tearful and exhausted, about once a week.

It’s also emotionally draining. Depression runs absolutely rampant among grad students. This 2015 report from Berkeley, for example, found that over two thirds of PhD students in the arts and sciences were depressed. The main reason? Point number one above–the stark realities of the job market. It can be absolutely gutting to see a colleague do everything right, from research to teaching, and end up not having any opportunity to do the job they’ve been preparing for. Especially since you know the same lies in wait for you.

And “doing everything right” is pretty Herculean in and of itself. You have to have very strong personal motivation to finish a PhD. Sure, your committee is there to provide oversight and you have drop-dead due dates. But those deadlines are often very far away and, depending on your committee, you may have a lot of independence. That means motivating yourself to work steadily while managing several ongoing projects in parallel (you’re publishing papers in addition to writing your dissertation, right?) and not working yourself to exhaustion in the process. Basically you’re going to need a big old double helping of executive functioning.

And oh by the way, to be competitive in the job market you’ll also need to demonstrate you can teach and perform service for your school/discipline. Add in time to sleep, eat, get at least a little exercise and take breaks (none of which are optional!) and you’ve got a very full plate indeed. Some absolutely iron-willed people even manage all of this while having/raising kids and I have nothing but respect for them.

Main take-away

Whether inside or outside of academia, it’s true that a PhD does tend to correlate with a higher salary–although the boost isn’t as big as you’d get from a related professional degree. BUT in order to get that higher salary you’ll need to give up some of your most productive years. My spouse (who also has a bachelor’s in linguistics) got a master’s degree, found a good job, got promoted and has cultivated a professional social network in the time it’s taken me just to get to the point of starting my dissertation. The opportunity cost of spending five more years (at a minimum–I’ve heard of people who took more than a decade to finish) in school, probably in your twenties, is very, very high. And my spouse can leave work at work, come home on weekends and just chill. This month I’ve got four full weekends of either conferences or outreach. Even worse, no matter how hard I try to stamp it out, I’ve got a tiny little voice in my head that’s very quietly screaming “you should be working” literally all the time.

I’m being absolutely real right now: going to grad school for linguistics is a bad investment of your time and labor. I knew that going in–heck, I knew that before I even applied–and I still went in. Why? Because I decided that, for me, it was a worthwhile trade-off. I really like doing research. I really like being part of the scientific community. Grad school is hard, yes, but overall I’m enjoying myself. And even if I don’t end up being able to find a job in academia (although I’m still hopeful and still plugging away at it) I really, truly believe that the research I’m doing now is valuable and interesting and, in some small way, helping the world. What can I say? I’m a nerdy idealist.

But this is 100% a personal decision. It’s up to you as an individual to decide whether the costs are worth it to you. Maybe you’ll decide, as I have, that they are. But maybe you won’t. And to make that decision you really do need to know what those costs are. I hope I’ve helped to begin making them clear. 

One final thought: Not going to grad school doesn’t mean you’re not smart. In fact, considering everything I’ve discussed above, it probably means you are.

What is linguistic discrimination?

Recently, UC Berkeley student Khairuldeen Makhzoomi was removed from his flight. The reason: he was speaking Arabic. And this isn’t the first time this has happened. Nor the second. These are all, in addition to being deeply disturbing and illegal, examples of linguistic discrimination.

What is linguistic discrimination?

Linguistic discrimination is discrimination based on someone’s language use, and it’s not restricted to the instances I discussed above.

As I’ve talked about before, linguistic discrimination can be a way to discriminate against a specific group of people without saying so in so many words. Linguistic discrimination, in addition to being morally repugnant, is illegal in the U.S. under Titles VI and VII of the Civil Rights Act of 1964.

These are important legal protections and the number of people affected by them is huge: There are over 350 different languages spoken in the United States. In Seattle, where I live, over a fifth of people over age five speak a language other than English at home. That’s a lot of people! Further, most of these individuals are bilingual or multilingual; 90% of second-generation immigrants speak English. And since multilingualism has both neurological benefits for individuals and larger positive impacts on society, I see this as no bad thing. And I’m hardly the only one: how many people that you know are learning or want to learn another language?

Unfortunately, linguistic discrimination threatens this rich diversity, and every person who speaks anything other than the standardized variety of the dominant language.

What can you do?

  • Don’t participate in linguistic discrimination. It can be hard to retrain yourself to reduce the impact of negative stereotypes but, especially if you’re in a position of privilege (as I am), it’s literally the least you can do. Don’t make assumptions about people based on their language use.
  • Stand up for people who may be facing linguistic discrimination. If you see someone being discriminated against in the workplace (like being given lower performance evaluations for having a non-native accent), point out that this is illegal, and back up people who are being discriminated against.
  • Be patient with non-native speakers. Appreciate that they’ve gone through a lot of effort to learn your language. If possible, try and arrange for an interpreter (for face-to-face communication) or translator (for written communications). Sometimes non-native speakers are more comfortable with reading and writing than speaking; offer to communicate through e-mails or other written correspondence.


What’s the difference between frosting and icing?

Fair warning: this post is full of pictures of baked goods. I can’t take responsibility for any impulsive cake-baking that may result from reading further.

This is the second post in this series. The first half, here, focused on responses to whether “frosting” and “icing” were different things, or different words for the same thing. This post gets a little more in-depth. In the first part, I was just asking people what they thought they said. In the second part, I was asking them to pick words for specific pictures. It’s not a perfect design–by asking people what they think they say first, I primed them pretty heavily–but it does reveal some interesting patterns of usage.

The main thing I was interested in was this: did people who said frosting and icing were interchangeable for them actually use them as if they were the same? Why is this a good question to ask? Because it turns out that a lot of the time people aren’t the best judges of how they use language. Especially if there’s some sort of “rule” about how you’re “supposed” to do it. For example, there’s something of a running joke among linguists about how often people will use the passive voice while they’re telling people not to! I don’t think anyone would intentionally lie about their usage, but it’s possible that respondents aren’t always doing exactly what they think they are.

I split my dataset into people who said they thought the words “frosting” and “icing” meant the same thing and those who thought they were different. In the charts below these groups are labelled “same” and “different” respectively. For this stage of analysis, I left out people who weren’t sure; there weren’t a whole lot of them anyway.

Cupcake
[Image: matcha cupcakes with piped frosting]

So this picture was a pretty canonical example of what people brought up a lot–it’s on a cake, and it’s been both whipped and piped. For a lot of people, then, this should be “frosting”. So what did people say?

[Figure: cupcake responses by group]

The results here were pretty much what I expected. (Whew!) People who thought the words meant different things pretty much all thought this was “frosting”. And there was a pretty strong difference between the groups. But this still doesn’t answer some of my questions. Is it the texture that makes it “frosting” or, as the AP Stylebook suggests, the fact that it’s on a cake? After all, you can definitely put buttercream on a cookie, as evidenced by Lofthouse.

Doughnuts

[Image: doughnuts]

Next I had some doughnuts. A lot of people, when I first started asking around, brought up doughnuts as something that they thought were iced rather than frosted. So what did people say?

[Figure: doughnut responses by group]

That does seem to hold true. There was no strong difference between the groups, but there were also a lot of write-in answers. (“Glaze” was especially popular, which, for the record, is probably what I’d say.) So there seems to be more variety in what people call doughnut toppings, but there is a tendency towards “icing”.

Cake with fondant

[Image: cake covered in fondant]

Ok, so this image was a bit of a trick. The cake here is covered in fondant. Which, to me, isn’t really frosting or icing. But if it’s really “being on a cake” that makes something “frosting”, we should see a strong “frosting” bias from people with a distinction.

[Figure: fondant cake responses by group]

And that’s just not the case. There’s also a pretty big difference between the groups here. Interestingly, people who thought “frosting” and “icing” are different things were more likely to write in “fondant”. (Remember that level of baking knowledge had no effect on whether people said there was a difference or not, so it’s probably not just specialized knowledge.)

Bundt Cake

[Image: lemon bundt cake]

I included this image for a couple of reasons. Again, I’m poking at this “on a cake” idea. But I also had a lot of people tell me that, for them, the distinction between the words was texture-based. So responses here could have gone two ways: If anything on a cake is frosting, then we’d expect frosting to win. But, if frosting has to be fluffy/whipped, then we’d expect icing to win.

[Figure: bundt cake responses by group]

And icing wins! This is no surprise, given the written results summarized in my previous post and the responses for the cake pictures above, but for me it really puts the nail in the coffin of the “on cakes” argument. (Take note, AP Stylebook!) Even on this one, though, people with no distinction are much more likely to be able to use “frosting”.

Sweet Roll

[Image: orange sweet roll]

So this is an interesting one. I included it because, for me, cinnamon rolls are synonymous with cream cheese frosting/icing. Since several people I talked to said specifically that cream cheese had to be frosting and not icing, I was expecting a large “frosting” response on this one.

[Figure: sweet roll responses by group]

That was definitely not what I saw, though. (Although people with no distinction were much more likely to be able to say “frosting”, so I guess I came by it natural.) Most people, and especially people with a distinction, thought it was “icing”.

Overview

So there are two main takeaways here:

  • There’s a strong difference in usage between people who say that “frosting” and “icing” are different things and those who say they aren’t. (For most of the pictures, these groups responded significantly differently; a sketch of how you might test that is below.)
  • If there is a difference, it’s got everything to do with texture and nothing to do with cake.
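Here’s a rough sketch of how you might run that test for any one picture; the data frame `resp` and its columns are illustrative names, not my actual analysis code:

```r
# one row per respondent: their group ("same" vs. "different") and what
# they called the topping in this picture
chisq.test(table(resp$group, resp$label))
# with small cell counts, fisher.test(table(resp$group, resp$label)) is safer
```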

That’s not to say that these things will always hold true; no one knows better than linguists that language is in a constant state of flux. But for now, these generalizations seem to hold for most of the people surveyed. So if you’re going to make a usage distinction between these words, please make one that’s based on the actual usage and not some completely made-up rule!

A final note: if you’re interested in seeing the (slightly sanitized) data and the R code I used for analysis, both are available here.

 

Is there a difference between frosting and icing?

So recently, the Associated Press Stylebook posted this on Twitter:

This struck me as 1) kind of a petty usage distinction and 2) completely at odds with my personal usage and what I knew about the dialectal research. The Dictionary of American Regional English, for example, notes that “frosting” is “widespread, but chiefly North, North Midland, West”. “Icing”, on the other hand, is found all over, “but less freq North, Pacific”. As someone from Virginia but currently living in Seattle, I have no problem using either frosting or icing for a nice buttercream. I’m hardly the only one, either. This baking blog post even says “I use lots of different icings to frost cupcakes”.

[Image: chai white chocolate cupcakes]

Frosting or icing, I’ll take a dozen.

BUT when I posted about this on Twitter, some people replied that they did have a very strong distinction between the two words. And the same thing happened when I brought it up with different groups of friends. A lot of people brought up texture, or said that some things are frosted and others are iced. This was really fascinating to me, both as a baker and a linguist, so I did what any social scientist would and set out to collect some data to get a better idea of what’s going on.

I set up a survey on Google Forms and got 109 responses. First I collected info on where speakers were from, how old they were and how knowledgeable they were about baking. Then I asked them for their general impression of usage, and then used pictures to ask what they’d call the sweet topping on a variety of baked goods. To avoid making this blog post absolutely huge, I’m going to split up the data discussion. The first half (this one) will look at whether people make a distinction between frosting and icing and whether that’s related to any of their social characteristics. The second half (I’ll link it here when it’s done) will focus on responses to specific images.

Are “frosting” and “icing” different, or are they different words for the same thing?

The first question I asked people was whether frosting and icing were different, or just different words for the same thing. Most people (over 60%) thought that they were different things, while about a quarter (27%) thought they were different words for the same thing, and the rest weren’t sure. So it does look like there’s some difference in how people use these words. But in and of itself, that’s not very interesting. What I want to know is this: how do people with different social characteristics use these words? (You may remember that I wrote a while ago that this is the central question in sociolinguistics.)

Region

The first thing I wanted to look at was region. I was expecting to see a pretty big difference here, and I wasn’t disappointed. Once I broke down the data by the states people were from, I found a definite pattern: people from the South were far more likely to say that frosting and icing were different words for the same thing. (Virginia isn’t really patterning with the rest of the South here, but that may be due to a bit of sampling bias–I recruited participants through my social network, and a lot of my friends are from Northern Virginia, which tends not to pattern with the South.)

[Figure: map of responses by state]

Most people in the South thought frosting and icing were the same thing, while outside of the South more people thought they were different things. (The darker the blue, the more likely someone from that state was to say that they were different things–black states I didn’t get any respondents from.)

Why is there a distinction? Honestly, I’m not really sure. My intuition, though, is that people from the South probably have pretty wide exposure to both terms. (Since books, TV and movies tend to come from outside of the South, there are plenty of chances to come across other dialectal variants.) However, people from outside the South historically had less exposure to one of the terms–icing–so when they started to come across it, they decided that it must refer to something different. As a result, the meanings of both words changed to become more narrow. (This is actually a pretty common process in languages.) I don’t have strong evidence for this theory right now, though, so take it with a couple shakes of salt!

Age

Another thing I wanted to look at was whether the age of respondents played a role in how they used these words. If younger respondents seem to use the word differently than older respondents, it might be because there’s a change happening in the language. Given time, everyone might end up doing the same thing as the younger people.

[Figure: responses by age group]

While it looks like there’s a slight tendency for younger participants to say there’s a difference between frosting and icing, the effect isn’t strong enough to be reliable.

I didn’t find a strong pattern, though. Again, this might be due to sampling problems, since most of my respondents were roughly the same age (21-30). But it could also be that there’s simply not anything to find–that this is neither an ongoing change, nor one where younger people and older people do things differently.
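If you want to poke at this kind of question yourself, one quick check looks something like this; `survey`, `different` and `age_group` are illustrative names, not my actual code:

```r
# logistic regression: does age group predict reporting a distinction?
summary(glm(different ~ age_group, data = survey, family = binomial))
# a weak, non-significant age coefficient is what "no reliable effect" looks like
```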

Baking Knowledge

Ok, so it looks like people are varying by region, but not by age… but what about by level of baking knowledge? Maybe you don’t care about the difference if you almost never make or eat baked goods. It could be that people who know a lot about baking make a distinction, and it’s only people who don’t know a beater from a dough hook that are lumping things together.

[Figure: responses by baking knowledge]

Baking knowledge also isn’t closely tied to how people use these words. So it’s not just that people who don’t know a lot about baking say they’re the same.

But that’s not what I found. People at all levels of baking knowledge tended to have a pretty even balance between the two uses of the words.

Comments

I also collected comments from people, to get more information on what people thought in their own words. Two big themes emerged. One was that the most consistent thing people pointed to as the difference was texture. The other was that people tended to say that one of them was for the cake and the other wasn’t… but which one was which was pretty much random.

Just under half of the comments mentioned texture. I’ve compiled some of the differences below, but the general consensus seems to be that frosting is thick, fluffy and soft, while icing is thin and hard. Take note, AP Stylebook!

| Frosting | Icing |
| --- | --- |
| creamy or buttery | syrupy, like a glaze |
| | plasticy looking |
| spread | squeezed or piped |
| thick and creamy | thin, hardens as it dries |
| thicker | |
| thicker | clear crust, dried |
| fluffy | thin |
| | thin layer, smooth, glossy |
| more solid, less flowing | watery, gooey |
| stays soft | hardens once it sets |
| thicker, softer | thinner, harder |
| thick, textured | thin, flat |

Six people did specifically mention how the words could be used for cake toppings in their comments. Two people said cakes could be either frosted or iced, two said that cakes could only be iced, and two said that cakes could only be frosted. Here’s an example of an “icing is for cakes” comment:

icing is for cakes! frosting is for all the other deliciousness. usually.

And someone who suggests frosting is for cakes:

I usually apply the word frosting solely to cakelike goods (cupcakes, regular cake) and then icing to everything else.

So… if you are going to claim there’s a difference between frosting and icing, pulling the “it goes on cakes” card is pretty likely to start a fight. You’re much safer talking about texture. Unless you’re in the South, of course; then you can pretty much say what you like.

Is there a difference between frosting and icing? It looks like the answer mainly depends on where you are. But there were also some pretty interesting differences between different baked goods, so stay tuned for that part of the analysis.

P.S. If you’re interested in seeing the (slightly sanitized) data and the R code I used for analysis, both are available here.

Does white noise really help you study?

So midterms have started here at the University of Washington (already, I know!) and I’m starting to notice more stressed-out study sessions. Around this time of year I always think about all the crazy study hints and tips I’ve heard over the years. (My personal favorite tip is to drink sage tea while I’m reading over notes–it’s been shown to help improve memory.) But one tip that people often share is that listening to white noise can help you concentrate while studying. Being the sort of person I am (read: huge nerd) I decided to set out and see what the research has to say about it.

[Image: a study group]

Ok, with the lab report done, we’ve just got two more twenty-page papers to write before we can sleep. Anyone got some coffee? 

First things first: some noises can definitely be bad for learning. For example, one study that compared schools near major airports (a big source of noise pollution) with schools that weren’t found that children in the noisier environments had reduced reading comprehension. An earlier, similar study showed that students in classrooms near a very noisy train track did worse academically than those who were not.

And noisy environments are bad for concentration, too. One survey of office workers found that 99% of participants were bothered by noises like ringing telephones and conversations, and that the negative effects of these noises didn’t fade over time. And we know that some types of speech noise–especially half of a telephone conversation–are incredibly distracting.

Ok, so we know that some noise can hurt both learning and concentration… so why fight fire with fire? Wouldn’t listening to white noise just be more of the same? Or even worse?

Well, not necessarily. The really distracting thing about noise is that it’s not predictable. It’s pretty easy to “tune out” a clock ticking because your brain can figure out when it’s going to tick again. When a new noise suddenly starts, however, or keeps happening in an unpredictable way, like a faucet dripping juuuust out of rhythm, your attention snaps to it. There’s actually a special set of “novelty detector neurons” that are looking for any new types of sounds that might show up. There are two ways to avoid this happening. One is to make sure that all your environmental sounds are ones you can easily ignore… or you can cover them up. And white noise is very effective at covering up other noises.

White noise is random noise that covers a wide frequency spectrum, usually 20 to 20,000 Hz. That means that other sounds that are the same volume or quieter than the white noise can’t “get through”. As a result, you don’t hear anything surprising, your novelty detector neurons stay quiet, and you can focus on what you’re doing. And don’t take my word for it: this study shows that students who listened to a recording of office noises masked with white noise performed much better on tasks than those who listened to the office noises unmasked.
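If you’d rather roll your own than trust a random website, here’s a minimal sketch of generating a white noise file in R with the tuneR package (the approach and file name are mine for illustration; any decent white noise source works just as well):

```r
library(tuneR)
# Gaussian samples have a flat average spectrum -- that's what makes it "white"
samples <- rnorm(44100 * 600)                    # ten minutes at 44.1 kHz
wn <- Wave(left = samples, samp.rate = 44100, bit = 16)
wn <- normalize(wn, unit = "16")                 # rescale into the 16-bit range
writeWave(wn, "white_noise.wav")                 # remember: keep the volume down!
```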

Now, keep in mind, just because a noise is “white” doesn’t mean it’s good for you. Volume, for one thing, is very important. Exposing rats to 100-dB white noise for 45 minutes was enough for them to undergo measurable stress-induced neurological changes. To be fair, that’s about as loud as a power mower, but it does take you out of the “relaxed concentration” range. So grab your headphones and favorite white noise source (if you’ve no other options, a radio set to static will work just fine) but remember to keep the volume down!