Preference for wake words varies by user gender

I recently read a very interesting article on the design of aspects of choosing a wake word, the word you use to turn on a voice-activated system. In Star Trek it’s “Computer”, but these days two of the more popular ones are “Alexa” and “OK Google”. The article’s author was a designer and noted that she found “Ok Google” or “Hey Google” to be more pleasant to use than “Alexa”. As I was reading the comments (I know, I know) I noticed that a lot of the people who strongly protested that they preferred “Alexa” had usernames or avatars that I would associate with male users. It struck me that there might be an underlying social pattern here.

So, being the type of nerd I am, I whipped up a quick little survey to look at the interaction between user gender and their preference for wake words. The survey only had two questions:

  • What is your gender?
    • Male
    • Female
    • Other
  • If Google Home and the Echo offered identical performance in all ways except for the wake word (the word or phrase you use to wake the device and begin talking to it), which wake word would you prefer?
    • “Ok Google” or “Hey Google”
    • “Alexa”

I included only those options becuase those are the defaults–I am aware you can choose to change the Echo’s wake word. (And probably should, given recent events.) 67 people responded to my survey. (If you were one of them, thanks!)

So what were the results? They were actually pretty strongly in line with my initial observations: as a group, only men preferred “Alexa” to “Ok Google”. Furthermore, this preference was far weaker than people of other genders’ for “Ok Google”. Women preferred “Ok Google” at a rate of almost two-to-one, and no people of other genders preferred “Alexa”.

I did have a bit of a skewed sample, with more women than men and people of other genders, but the differences between genders were robust enough to be statistically significant (c2(2, N = 67) = 7.25, p = 0.02)).


Women preferred “Ok Google” to “Alexa” 27:11, men preferred “Alexa” to “Ok Google” 14:11, and the four people of other genders in my survey all preferred “Ok Google”.

So what’s the take-away? Well, for one, Johna Paolino (the author of the original article) is by no means alone in her preference for a non-gendered wake word. More broadly, I think that, like the Clippy debacle, this is excellent evidence that there are strong gendered differences in how users’ gender affects their interaction with virtual agents. If you’re working to create virtual agents, it’s important to consider all types of users or you might end up creating something that rubs more than half of your potential customers the wrong way.

My code and data are available here.


Acoustics Documentaries on Netflix

Happy New Year’s Eve! Have you made any resolutions? Perhaps a resolution to learn something new in the new year? If so, you’re in luck! I’ve recently run across a number of different Netflix documentaries that touch on differents aspects of acoustics that readers of this blog might enjoy. (Yes, I’ve spent a lot of my winter break watching documentaries. Why do you ask?)


Sure, I guess they have, like, movies and stuff, but really I’m here for the documentaries.

  • Sanrachna (Hindi with English subtitles)
    • This series focuses on the architecture of ancient India. The second episode is all about the architectural acoustics of Golconda fort and Gol Gumbaz. Through careful design and construction, a handclap in the foyer of Golconda fort can be heard half a mile away!
  • The Lion in Your Living Room (English)
    • This Canadian documentary is about domestic house cats. In addition to some discussion of the ins and outs of cat’s ears, there’s a really cool segment by Karen McComb where she talks about the acoustic qualities of different types of purrs.
    • Bonus: Some sweet examples of the Canadian vowel shift.
  • Ocean Giants (English)
    • This BBC documentary about whales and dolphins has three hour-long episodes, and each includes a lot of underwater acoustics and animal communication. If you’ve only got time for one episode, the third episode “Voices of the Sea” is all about whale and dolphin vocalizations.
  • Do I sound gay? (English)
    • This documentary by David Thorpe explores the stereotype of “a gay voice” and does include some cameos by linguists. From a sociolinguisitcs standpoint, I think it’s a bit simplistic (to be fair, probably becuase I’m a sociolinguist) but it’s still an interesting discussion of speech and identity.
    • Bonus: If you want to get a more linguistics-y perspective, this post on Language Log (and the comments) go into a lot of depth.

Oh, and if you don’t have Netflix, I’ve got you covered too. Here are two Youtube channels with linguistics contents you might like:

  • Lingthusiasm (English)
    • This is a brand-new podcast by Gretchen McCulloch and Lauren Gawne (two of my favorite internet linguistics people), and it’s a ton of fun. You should check it out!
  • The Ling Space (English)
    • This channel has been around for a while and has little bite-sized videos about a range of linguistics topics. They have a new video every Wednesday.

Do you know of any other good documentaries about linguistics or acoustics? Leave a comment and let me know!

Do emojis have their own syntax?

So a while ago I got into a discussion with someone on Twitter about whether emojis have syntax. Their original question was this:

As someone who’s studied sign language, my immediate thought was “Of course there’s a directionality to emoji: they encode the spatial relationships of the scene.” This is just fancy linguist talk for: “if there’s a dog eating a hot-dog, and the dog is on the right, you’re going to use 🌭🐕, not 🐕🌭.” But the more I thought about it, the more I began to think that maybe it would be better not to rely on my intuitions in this case. First, because I know American Sign Language and that might be influencing me and, second, because I am pretty gosh-darn dyslexic and I can’t promise that my really excellent ability to flip adjacent characters doesn’t extend to emoji.

So, like any good behavioral scientist, I ran a little experiment. I wanted to know two things.

  1. Does an emoji description of a scene show the way that things are positioned in that scene?
  2. Does the order of emojis tend to be the same as the ordering of those same concepts in an equivalent sentence?

As it turned out, the answers to these questions are actually fairly intertwined, and related to a third thing I hadn’t actually considered while I was putting together my stimuli (but probably should have): whether there was an agent-patient relationship in the photo.

Agent: The entity in a sentence that’s affecting a changed, the “doer” of the action.

  • The dog ate the hot-dog.
  • The raccoons pushed over all the trash-bins.

Patient: The entity that’s being changed, the “receiver” of the action.

  • The dog ate the hot-dog.
  • The raccoons pushed over all the trash-bins.


To get data, I showed people three pictures and asked them to “pick the emoji sequence that best describes the scene” and then gave them two options that used different orders of the same emoji. Then, once they were done with the emoji part, I asked them to “please type a short sentence to describe each scene”. For all the language data, I just went through and quickly coded the order that the same concepts as were encoded in the emoji showed up.


  • “The dog ate a hot-dog”  -> dog hot-dog
  • “The hot-dog was eaten by the dog” -> hot-dog dog
  • “A dog eating” -> dog
  • “The hot-dog was completely devoured” -> hot-dog

So this gave me two parallel data sets: one with emojis and one with language data.

All together, 133 people filled out the emoji half and 127 people did the whole thing, mostly in English (I had one person respond in Spanish and I went ahead and included it). I have absolutely no demographics on my participants, and that’s by design; since I didn’t go through the Institutional Review Board it would actually be unethical for me to collect data about people themselves rather than just general information on language use. (If you want to get into the nitty-gritty this is a really good discussion of different types of on-line research.)

Picture one – A man counting money

Watch, movie schedule, poster, telephone, cashier machine, cash register Fortepan 6680

I picked this photo as sort of a sanity-check: there’s no obvious right-to-left ordering of the man and the money, and there’s one pretty clear way of describing what’s going on in this scene. There’s an agent (the man) and a patient (the money), and since we tend to describe things as agent first, patient second I expected people to pretty much all do the same thing with this picture. (Side note: I know I’ve read a paper about the cross-linguistic tendency for syntactic structures where the agent comes first, but I can’t find it and I don’t remember who it’s by. Please let me know if you’ve got an idea what it could be in the comments–it’s driving me nuts!)


And they did! Pretty much everyone described this picture by putting the man before the money, both with emoji and words. This tells us that, when there’s no information about orientation you need to encode (e.g. what’s on the right or left), people do tend to use emoji in the same order as they would the equivalent words.

Picture two – A man walking by a castle

Château de Canisy (5)

But now things get a little more complex. What if there isn’t a strong agent-patient relationship and there is a strong orientation in the photo? Here, a man in a red shirt is walking by a castle, but he shows up on the right side of the photo. Will people be more likely to describe this scene with emoji in a way that encodes the relationship of the objects in the photo?


I found that they were–almost four out of five participants described this scene by using the emoji sequence “castle man”, rather than “man castle”. This is particularly striking because, in the sentence writing part of the experiment, most people (over 56%) wrote a sentence where “man/dude/person etc.” showed up before “castle/mansion/chateau etc.”.

So while people can use emoji to encode syntax, they’re also using them to encode spatial information about the scene.

Picture three – A man photographing a model

Photographing a model

Ok, so let’s add a third layer of complexity: what about when spatial information and the syntactic agent/patient relationships are pointing in opposite directions? For the scene above, if you’re encoding the spatial information then you should use an emoji ordering like “woman camera man”, but if you’re encoding an agent-patient relationship then, as we saw in the picture of the man counting money, you’ll probably want to put the agent first: “man camera woman”.

(I leave it open for discussion whether the camera emoji here is representing a physical camera or a verb like “photograph”.)


For this chart I removed some data to make it readable. I kicked out anyone who picked another ordering of the emoji, and any word order that fewer than ten people (e.g. less than 10% of participants) used.

So people were a little more divided here. It wasn’t quite a 50-50 split, but it really does look like you can go either way with this one. The thing that jumped out at me, though, was how the word order and emoji order pattern together: if your sentence is something like “A man photographs a model”, then you are far more likely to use the “man camera woman” emoji ordering. On the other hand, if your sentence is something like “A woman being photographed by the sea” or “Photoshoot by the water”, then it’s more likely that your emoji ordering described the physical relation of the scene.

So what?

So what’s the big takeaway here? Well, one thing is that emoji don’t really have a fixed syntax in the same way language does. If they did, I’d expect that there would be a lot more agreement between people about the right way to represent a scene with emoji. There was a lot of variation.

On the other hand, emoji ordering isn’t just random either. It is encoding information, either about the syntactic/semantic relationship of the concepts or their physical location in space. The problem is that you really don’t have a way of knowing which one is which.

Edit 12/16/2016: The dataset and the R script I used to analyze it are now avaliable on Github.

What’s the difference between & and +?

So if you’re like me, you sometimes take notes on the computer and end up using some shortcuts so you can keep up with the speed of whoever’s talking. One of the short cuts I use a lot is replacing the word “and” with punctuation. When I’m handwriting things I only ever use “+” (becuase I can’t reliably write an ampersand), but in typing I use both “+” and “&”. And I realized recently, after going back to change which one I used, that I had the intuition that they should be used for different things.


I don’t use Ampersands when I’m handwriting things becuase they’re hard to write.

Like sometimes happens with linguistic intuitions, though, I didn’t really have a solid idea of how they were different, just that they were. Fortunately, I had a ready-made way to figure it out. Since I use both symbols on Twitter quite a bit, all I had to do was grab tweets of mine that used either + or & and figure out what the difference was.

I got 450 tweets from between October 7th and November 11th of this year from my own account (@rctatman). I used either & or + in 83 of them, or roughly 18%. This number is a little bit inflated because I was livetweeting a lot of conference talks in that time period, and if a talk has two authors I start every livetweet from that talk with “AuthorName1 & AuthorName2:”. 43 tweets use & in this way. If we get rid of those, only around 8% of my tweets contain either + or &. They’re still a lot more common in my tweets than in writing in other genres, though, so it’s still a good amount of data.

So what do I use + for? See for yourself! Below are all the things I conjoined with + in my Twitter dataset. (Spelling errors intact. I’m dyslexic, so if I don’t carefully edit text—and even sometimes when I do, to my eternal chagrin—I tend to have a lot of spelling errors. Also, a lot of these tweets are from EMNLP so there’s quite a bit of jargon.)

  • time + space
  • confusable Iberian language + English
  • Data + code
  • easy + nice
  • entity linking + entity clustering
  • group + individual
  • handy-dandy worksheet + tips
  • Jim + Brenda, Finn + Jake
  • Language + action
  • linguistic rules + statio-temporal clustering
  • poster + long paper
  • Ratings + text
  • static + default methods
  • syntax thing + cattle
  • the cooperative principle + Gricean maxims
  • Title + first author
  • to simplify manipulation + preserve struture

If you’ve had some syntactic training, it might jump out to you that most of these things have the same syntactic structure: they’re noun phrases! There are just a couple of exception. The first is “static + default methods”, where the things that are being conjoined are actually adjectives modifying a single noun. The other is “to simplify manipulation + preserve struture”. I’m going to remain agnostic about where in the verb phrase that coordination is taking place, though, so I don’t get into any syntax arguments ;). That said, this is a fairly robust pattern! Remember that I haven’t been taught any rules about what I “should” do, so this is just an emergent pattern.

Ok, so what about &? Like I said, my number one use is for conjunction of names. This probably comes from my academic writing training. Most of the papers I read that use author names for in-line citations use an & between them. But I do also use it in the main body of tweets. My use of & is a little bit harder to characterize, so I’m going to go through and tell you about each type of thing.

First, I use it to conjoin user names with the @ tag. This makes sense, since I have a strong tendency to use & with names:

  • @uwengineering & @uwnlp
  • @amazon @baidu @Grammarly & @google

In some cases, I do use it in the same way as I do +, for conjoining noun phrases:

  • Q&A
  • the entities & relations
  • these features & our corpus
  • LSTM & attention models
  • apples & concrete
  • context & content

But I also use it for comparatives:

  • Better suited for weak (bag-level) labels & interpretable and flexible
  • easier & faster

And, perhaps more interestingly, for really high-level conjugation, like at the level of the sentence or entire verb phrase (again, I’m not going to make ANY claims about what happens in and around verbs—you’ll need to talk to a syntactician for that!).

  • Classified as + or – & then compared to polls
  • in 30% of games the group performance was below average & in 17% group was worse than worst individual
  • math word problems are boring & kids learn better if they’re interested in the theme of the problem
  • our system is the first temporal tagger designed for social media data & it doesn’t require hand tagging
  • use a small labeled corpus w/ small lexicon & choose words with high prob. of 1 label

And, finally, it gets used in sort of miscellaneous places, like hashtags and between URLs.

So & gets used in a lot more places than + does. I think that this is probably because, on some subconscious level I consider & to be the default (or, in linguistics terms, “unmarked“). This might be related to how I’m processing these symbols when I read them. I’m one of those people who hears an internal voice when reading/writing, so I tend to have canonical vocalizations of most typed symbols. I read @ as “at”, for example, and emoticons as a prosodic beat with some sort of emotive sound. Like I read the snorting emoji as the sound of someone snorting. For & and +, I read & as “and” and + as “plus”. I also use “plus” as a conjunction fairly often in speech, as do many of my friends, so it’s possible that it may pattern with my use in speech (I don’t have any data for that, though!). But I don’t say “plus” nearly as often as I say “and”. “And” is definitely the default and I guess that, by extension, & is as well.

Another thing that might possibly be at play here is ease of entering these symbols. While I’m on my phone they’re pretty much equally easy to type, on a full keyboard + is slightly easier, since I don’t have to reach as far from the shift key. But if that were the only factor my default would be +, so I’m fairly comfortable claiming that the fact that I use & for more types of conjunction is based on the influence of speech.

A BIG caveat before I wrap up—this is a bespoke analysis. It may hold for me, but I don’t claim that it’s the norm of any of my language communities. I’d need a lot more data for that! That said, I think it’s really neat that I’ve unconsciously fallen into a really regular pattern of use for two punctuation symbols that are basically interchangeable. It’s a great little example of the human tendency to unconsciously tidy up language.

How loud would a million dogs barking be?

So a friend of mine who’s a reference librarian (and has a gaming YouTube channel you should check out) recently got an interesting question: how loud would a million dogs barking be?

This is an interesting question because it gets at some interesting properties of how sound work, in particular the decibel scale.

So, first off, we need to establish our baseline. The loudest recorded dog bark clocked in at 113.1 dB, and was produced by a golden retriever named Charlie. (Interestingly, the loudest recorded human scream was 129 dB, so it looks like Charlie’s got some training to do to catch up!) That’s louder than a chain saw, and loud enough to cause hearing damage if you heard it consonantly.

Now, let’s scale our problem down a bit and figure out how loud it would be if ten Charlies barked together. (I’m going to use copies of Charlie and assume they’ll bark in phase becuase it makes the math simpler.) One Charlie is 113 dB, so your first instinct may be to multiply that by ten and end up 1130 dB. Unfortunately, if you took this approach you’d be (if you’ll excuse the expression) barking up the wrong tree. Why? Because the dB scale is logarithmic. This means that a 1130 dB is absolutely ridiculously loud. For reference, under normal conditions the loudest possible sound (on Earth) is 194 dB.  A sound of 1000 dB would be loud enough to create a black hole larger than the galaxy. We wouldn’t be able to get a bark that loud even if we covered every inch of earth with clones of champion barker Charlie.

Ok, so we know what one wrong approach is, but what’s the right one? Well, we have our base bark at 113 dB. If we want a bark that is one million times as powerful (assuming that we can get a million dogs to bark as one) then we need to take the base ten log of one million and multiply it by ten (that’s the deci part of decibel). (If you want more math try this site.) The base ten log of one million is six, so times ten that’s sixty decibels. But it’s sixty decibels louder than our original sound of 113dB, for a grand total of 173dB.

Now, to put this in perspective, that’s still pretty durn loud. That’s loud enough to cause hearing loss in our puppies and everyone in hearing distance. We’re talking about the loudness of a cannon, or a rocket launch from 100 meters away. So, yes, very loud, but not quite “destroying the galaxy” loud.

A final note: since the current world record for loudest barking group of dogs is a more modest 124 dB from group of just 76 dogs, if you could get a million dogs to bark in unison you’d definitely set a new world record! But, considering that you’d end up hurting the dogs’ hearing (and having to scoop all that poop) I’m afraid I really can’t recommend it.

Can a computer write my blog posts?

This post is pretty special: it’s the 100th post I’ve made since starting my blog! It’s hard to believe I’ve been doing this so long. I started blogging in 2012, in my final year of undergrad, and now I’m heading into my last year of my PhD. Crazy how fast time flies.

Ok, back on topic. As I was looking back over everything I’ve written, it struck me that 99 posts worth of text on a very specific subject domain (linguistics) in a very specific register (informal) should be enough text to train a simple text generator.

So how did I go about building a blog bot? It was pretty easy! All I needed was:

  • 67,000 words of text (all blog posts before this one)
  • 1 R script to tidy up the text
  • 1 Python script to train a Markov Chain  text generator

A Markov Whatnow?

A Markov Chain is a type of simple (but surprisingly powerful) statistical model that tells you, given the item you’re currently on, what item you’re likely to see next. Today we’re going to apply it to whole words in a text.

How does it work? Basically, for each word in your text, you count how many different words occur after it, how many time each shows up and figure out the probability of each transition. So if your text is “The dog ate the apple.”, then there’s a 50% chance that “the” will be followed by “apple”, but a 100% chance that “apple” will be followed by “.”. You can then use these probabilities to generate new sentences, like so:

  • Input: The dog ate the apple.
  • Possible outputs:
    • The apple.
    • The dog ate the apple.
    • The dog ate the dog ate the apple.
    • The dog ate the dog ate the dog ate the dog ate the dog ate the dog ate the dog ate the dog ate the dog ate the dog ate the dog ate the apple. (This may seem silly, but remember that we’re only looking at two words at a time. This model doesn’t “remember” what came earlier or “know” that it’s repeating itself.)

OK, now for the actual training of the model. If you want to play around with this yourself, all my code and text is up on GitHub here.

  1. First, grab your text. I downloaded all my blog posts as an XML file and used a quick little R script to snag just the text I was interested in.
  2. Next, train your model. I used PyMarkovChain, but you can always write your own if you’re feeling it.
  3. Finally, use the model to generate text.

Alright, now I’ve got (admittedly very, very dumb) little program that kinda knows what I write like when I’m blogging. How well does it do at pretending to be me?

First try:

I’m going to break eye contact, look down at your own personalized ASR system

Well that’s… vaguely disturbing. But also not really very human-sounding, and certainly not something I’d write in a blog post. Let’s give it a couple more tries and see if it does any better.

  • But, if frosting has to have a career where you learned it from Clarice
  • We only perceive the things that our bodies and sensory organs and brains can perceive, and we really don’t care about to tell the sounds in the second part, I was expecting to see whether something is grammatical is to put up a bunch of historical stuff that we’ve all heard about the structure in the arts and sciences were depressed
  • (Which is really interesting–I’d love to learn your language doesn’t make you very, very good a recognizing speech
  • Believe me, I know what you can uncontract them and what’s the take-away
    People with short face syndrome–yes, that’s a classical description of what a language does, along with parallels drawn to another, related, languages
  • Short answer: they’re all correct
  • And those speakers are aware of
  • The Job Market for Linguistics PhDsWhat do you much

Hmm, not so great. The only output that sounded good to me was “Short answer: they’re all correct”. And that was just the exact same phrasing I used in this blog post. No wonder it sounds just like me; it is me!

So it looks like I won’t be handing the reins over to Making Noise and Hearing Things bot any time soon. True, I could have used a fancier tool, like a Recurrent Neural Network. But to be perfectly honest, I have yet to see any text generation system that can produce anything even close to approximating a human-written blog post. Will we get there? Maybe. But natural language generation, especially outside of really formulaic things like weather or sports reporting, is a super hard problem. Heck, we still haven’t gotten to point where computers can reliably solve third-grade math word problems.

The very complexities that make language so useful (and interesting to study) also make it so hard to model. Which is good news for me! It means there’s still plenty of work to do in language modelling and blogging.

Six Linguists of Color (who you can follow on Twitter!)

In the light of some recent white supremacist propaganda showing up on my campus, I’ve decided to spotlight a tiny bit of the amazing work being done around the country by linguists of color. Each of the scholars below is doing interesting, important linguistics research and has a Twitter account that I personally enjoy following. If you’re on this blog, you probably will as well! I’ll give you a quick intro to their research and, if it piques your interest, you can follow them on Twitter for all the latest updates.

(BTW, if you’re wondering why I haven’t included any grad students on this list, it’s becuase we generally don’t have as well developed of a research trajectory and I want this to be a useful resource for at least a few years.)

Anne Charity Hudley

Dr. Charity Hudley is professor at the College of William and Mary (Go Tribe!). Her research focuses on language variation, especially the use of varieties such as African American English, in the classroom. If you know any teachers, they might find her two books on language variation in the classroom a useful resource. She and Christine Mallinson have even released an app to go with them!

Michel DeGraff

Dr. Michel DeGraff is a professor at MIT. His research is on Haitian Creole, and he’s been very active in advocating for the official recognition of Haitian Creole as a distinct language. If you’re not sure what Haitian Creole looks like, go check out his Twitter; many of his tweets are in the language! He’s also done some really cool work on using technology to teach low-resource languages.

Nelson Flores

Dr. Nelson Flores is a professor at the University of Pennsylvania. His work focuses on how we create the ideas of race and language, as well as bilingualism/multilingualism and bilingual education. I really enjoy his thought-provoking discussions of recent events on his Twitter account. He also runs a blog, which is a good resource for more in-depth discussion.

Nicole Holliday

Dr. Nicole Holliday is (at the moment) Chau Mellon Postdoctoral Scholar at Pomona College. Her research focuses on language use by biracial speakers. I saw her talk on how speakers use pitch differently depending on who they’re talking to at last year’s LSA meeting and it was fantastic: I’m really looking forwards to seeing her future work! She’s also a contributor to Word., an online journal about African American English.

Rupal Patel

Dr. Rupal Patel is a professor at Northeastern University, and also the founder and CEO of VocaliD. Her research focuses on the speech of speakers with developmental  disabilities, and how technology can ease communication for them. One really cool project she’s working on that you can get involved with is The Human Voicebank. This is collection of voices from all over the world that is used to make custom synthetic voices for those who need them for day-to-day communication. If you’ve got a microphone and a quiet room you can help out by recording and donating your voice.

John R. Rickford

Last, but definitely not least, is Dr. John Rickford, a professor at Stanford. If you’ve taken any linguistics courses, you’re probably already familiar with his work. He’s one of the leading scholars working on African American English and was crucial in bringing a research-based evidence to bare on the Ebonics controversy. If you’re interested, he’s also written a non-academic book on African American English that I would really highly recommend; it even won the American Book Award!

What’s a “bumpus”?

So I recently had a pretty disconcerting experience. It turns out that almost no one else has heard of a word that I thought was pretty common. And when I say “no one” I’m including dialectologists; it’s unattested in the Oxford English Dictionary and the Dictionary of American Regional English. Out of the twenty two people who responded to my Twitter poll (which was probably mostly other linguists, given my social networks) only one other person said they’d even heard the word and, as I later confirmed, it turned out to be one of my college friends.

So what is this mysterious word that has so far evaded academic inquiry? Ladies, gentlemen and all others, please allow me to introduce you to…


Pronounced ‘bʌm.pɪs or ‘bʌm.pəs. You can hear me say the word and use it in context by listening to this low quality recording.

The word means something like “fool” or “incompetent person”. To prove that this is actually a real word that people other than me use, I’ve (very, very laboriously) found some examples from the internet. It shows up in the comments section of this news article:

THAT is why people are voting for Mr Trump, even if he does act sometimes like a Bumpus.

I also found it in a smattering of public tweets like this one:

If you ever meet my dad, please ask him what a “bumpus” is

And this one:

Having seen horror of war, one would think, John McCain would run from war. No, he runs to war, to get us involved. What a bumpus.

And, my personal favorite, this one:

because the SUN(in that pic) is wearing GLASSES god karen ur such a bumpus

There’s also an Urban Dictionary entry which suggests the definition:

A raucous, boisterous person or thing (usually african-american.)

I’m a little sceptical about the last one, though. Partly because it doesn’t line up with my own intuitions (I feel like a bumpus is more likely to be silent than rowdy) and partly becuase less popular Urban Dictionary entries, especially for words that are also names, are super unreliable.

I also wrote to my parents (Hi mom! Hi dad!) and asked them if they’d used the word growing up, in what contexts, and who they’d learned it from. My dad confirmed that he’d heard it growing up (mom hadn’t) and had a suggestion for where it might have come from:

I am pretty sure my dad used it – invariably in one of the two phrases [“don’t be a bumpus” or “don’t stand there like a bumpus”]….  Bumpass, Virginia is in Lousia County …. Growing up in Norfolk, it could have held connotations of really rural Virginia, maybe, for Dad.

While this is definitely a possibility, I don’t know that it’s definitely the origin of the word. Bumpass, Virginia, like  Bumpass Hell (see this review, which also includes the phrase “Don’t be a bumpass”), was named for an early settler. Interestingly, the college friend mentioned earlier is also from the Tidewater region of Virginia, which leads me to think that the word may have originated there.

My mom offered some other possible origins, that the term might be related to “country bumpkin” or “bump on a log”. I think the latter is especially interesting, given that “bump on a log” and “bumpus” show up in exactly the same phrase: standing/sitting there like a _______.

She also suggested it might be related to “bumpkis” or “bupkis”. This is a possibility, especially since that word is definitely from Yiddish and Norfolk, VA does have a history of Jewish settlement and Yiddish speakers.

A usage of “Bumpus” which seems to be the most common is in phrases like “Bumpus dog” or “Bumpus hound”. I think that this is probably actually a different use, though, and a direct reference to a scene from the movie A Christmas Story:

One final note is that there was a baseball pitcher in the late 1890’s who went by the nickname “Bumpus”: Bumpus Jones. While I can’t find any information about where the nickname came from, this post suggests that his family was from Virginia and that he had Powhatan ancestry.

I’m really interesting in learning more about this word and its distribution. My intuition is that it’s mainly used by older, white speakers in the South, possibly centered around the Tidewater region of Virginia.

If you’ve heard of or used this word, please leave a comment or drop me a line letting me know 1) roughly how old you are, 2) where you grew up and 3) (if you can remember) where you learned it. Feel free to add any other information you feel might be relevant, too!


Can you configure speech recognition for a specific speaker?

James had an interesting question based on one of my earlier posts on gender differences in speech recognition:

Is there a voice recognition product that is focusing on women’s voices or allows for configuring for women’s voices (or the characteristics of women’s voices)?

I don’t know of any ASR systems specifically designed for women. But the answer to the second half of your question is yes!

BSPC 19 i Nyborg Danmark 2009 (4)

There are two main types of automatic speech recognition, or ASR, systems. The first is speaker independnet. These are systems, like YouTube automatic captions or  Apple’s Siri, that should work equally well across a large number of different speakers. Of course, as many other researchers have found and I corroborated in my own investigation, that’s not always the case. A major reason for this is socially-motivated variation between speakers. This is something we all know as language users. You can guess (with varying degrees of accuracy) a lot about someone from just their voice: thier sex, whether they’re young or old, where they grew up, how educated they are, how formal or casual they’re being.

So what does this mean for speech recognition? Well, while different speakers speak in a lot of different ways, individual speakers tend to use less variation. (With the exception of bidialectal speakers, like John Barrowman.) Which brings me nicely to the second type of speech recognition: speaker dependent. These are systems that are designed to work for one specific speaker, and usually to adapt and get more accurate for that speaker over time.

If you read some of my earlier posts, I suggested that the different performance for between dialects and genders was due to imbalances in the training data. The nice thing about speaker dependent systems is that the training data is made up of one voice: yours. (Although the system is usually initialized based on some other training set.)

So how can you get a speaker dependent ASR system?

  • By buying software such as Dragon speech recognition. This is probably the most popular commercial speaker-dependent voice recognition software (or at least the one I hear the most about). It does, however, cost real money.
  • Making your own! If you’re feeling inspired, you can make your own personalized ASR system. I’d recommend the CMU Sphinx toolkit; it’s free and well-documented. To make your own recognizer, you’ll need to build your own language model using text you’ve written as well as adapt the acoustic model using your recorded speech. The former lets the recognizer know what words you’re likely to say, and the latter how you say things. (If you’re REALLY gung-ho you can even build your own acoustic model from scratch, but that’s pretty involved.)

In theory, the bones of any ASR system should work equally well on any spoken human language. (Sign language recognition is a whole nother kettle of fish.) The difficulty is getting large amounts of (socially stratified) high-quality training data. By feeding a system data without a lot of variation, for example by using only one person’s voice, you can usually get more accurate recognition more quickly.



What sounds you can feel but not hear?

I got a cool question from Veronica the other day: 

Which wavelength someone would use not to hear but feel it on the body as a vibration?

So this would depend on two things. The first is your hearing ability. If you’ve got no or limited hearing, most of your interaction with sound will be tactile. This is one of the reasons why many Deaf individuals enjoy going to concerts; if the sound is loud enough you’ll be able to feel it even if you can’t hear it. I’ve even heard stories about folks who will take balloons to concerts to feel the vibrations better. In this case, it doesn’t really depend on the pitch of the sound (how high or low it is), just the volume.

But let’s assume that you have typical hearing. In that case, the relationship between pitch, volume and whether you can hear or feel a sound is a little more complex. This is due to something called “frequency response”. Basically, the human ear is better tuned to hearing some pitches than others. We’re really sensitive to sounds in the upper ranges of human speech (roughly 2k to 4k Hz). (The lowest pitch in the vocal signal can actually be much lower [down to around 80 Hz for a really low male voice] but it’s less important to be able to hear it because that frequency is also reflected in harmonics up through the entire pitch range of the vocal signal. Most telephones only transmit signals between  300 Hz to 3400 Hz, for example, and it’s only really the cut-off at the upper end of the range that causes problems–like making it hard to tell the difference between “sh” and “s”.)

The takeaway from all this is that we’re not super good at hearing very low sounds. That means they can be very, very loud before we pick up on them. If the sound is low enough and loud enough, then the only way we’ll be able to sense it is by feeling it.

How low is low enough? Most people can’t really hear anything much below 20 Hz (like the lowest note on a really big organ). The older you are and the more you’ve been exposed to really loud noises in that range, like bass-heavy concerts or explosions, the less you’ll be able to pick up on those really low sounds.

What about volume? My guess for what would be “sufficiently loud”, in this case, is 120+ Db. 120 Db is as loud as a rock concert, and it’s possible, although difficult and expensive, to get out of a home speaker set-up. If you have a neighbor listening to really bass-y music or watching action movies with a lot of low, booming sound effects on really expensive speakers, it’s perfectly possible that you’d feel those vibrations rather than hearing them. Especially if there are walls between the speakers and you. While mid and high frequency sounds are pretty easy to muffle, low-frequency sounds are much more difficult to sound proof against.

Are there any health risks? The effects of exposure to these types of low-frequency noise is actually something of an active research question. (You may have heard about the “brown note“, for example.) You can find a review of some of that research here. One comforting note: if you are exposed to a very loud sound below the frequencies you can easily hear–even if it’s loud enough to cause permanent damage at much higher frequencies–it’s unlikely that you will suffer any permanent hearing loss. That doesn’t mean you shouldn’t ask your neighbor to turn down the volume, though; for their ears if not for yours!