How to Read a Linguistics Article in 8 Easy Steps

Disclaimer: this mostly applies to experimental or quantitative articles, since those are what are common in my field. Your mileage, especially in more formal fields like syntax or semantics, may vary dramatically.

 

Ok, so you’re not a professional linguist or anything, but you’ve come across an article in a linguistics journal and it sounds interesting. Or maybe you’ve just taken your first linguistics class and you heard about something really cool you want to learn more about. But when you start reading you’re quickly swamped by terms you don’t understand, IPA symbols you’ve never seen before and all sorts of statistics. You’re tempted to just throw in the towel.


Don’t panic! I’m here to help you out with Rachael’s patented* guide to reading linguistics articles.

The first thing to do is take a deep breath and accept that you may not understand everything right away. That’s ok! If you could easily read scientific literature in a field it would mean you were already an expert. Academic writing is designed to be read by other academics, and so it’s full of terms that have very specific meanings in the field. It’s a sort of time-saving code and it takes time to learn. Don’t beat yourself up for being at the beginning of your journey!

With that in mind, here’s the steps I like to follow when I’m starting a new article, especially if it’s in a field I’m less familiar with.

  1. Read the abstract. This will give you a broad outline of what the paper will be about and help you know if the whole article would be interesting or relevant for you.
  2. I like to call this the “sandwich step”. I read the introduction and then the conclusion. Why? Again, this gives me an idea of what will be in the article. Sure, there may be spoilers, but knowing the answer will make it easier to understand how the questions were asked.
    1. Notice any new terms that are in both the introduction and the abstract but don’t get explained? This might be a good time to look them up, since the author is probably assuming you already know them.
    2. Some places to look up terms:
      1. The SIL linguistics glossary can be a good place to start.
      2. Linguistics topics on Wikipedia are also a good choice. Linguists even get together at professional events to edit and add to linguistics-related pages.
      3. For a bit more in-depth introduction, Language and Linguistics Compass publishes short articles written by experts that are designed to be introductions to whatever topic they’re on.
  3. Flip through and look for any charts or figures and read their captions. These will be where the author(s) highlight their results. Now that you have a general idea about what’s going on you’ll have a better chance of interpreting these.
  4. Next, read the background section. This is where the author will talk about things that other people have done and how their work fits into the big picture of the field. This is the second place you’re likely to find new terms you’re unfamiliar with. If they’re only used once or twice, don’t worry about looking them up. Your aim is to understand the general thrust of the article, not every little detail! (Now, if you’re a grad student, on the other hand… ;) )
  5. Now read the methods section. You can probably skim this; unless you’re interested in replicating the study or reviewing its merit you’re not going to have to have a full grasp of all the nitty-gritty nuances of item design and participant recruitment.
  6. Finally read the results. Unless you have some stats background, you’re probably safe in skipping over the statistical analyses. Again, you just want to understand the general point.
  7. Extra credit: Go back and read the abstract again. This is a very condensed version of what was in the article and is a good way to review/check your understanding.
  8. Sit back and enjoy having read a linguistics article!

Grats on making it through! Now that you’ve caught the bug, what are some ways to find more stuff to read?

  • Go find one of the articles referenced in the one you just read. Since you’re already familiar with similar work, you’ll probably have an easier time understanding the new article.
  • Or read something more recent that cites the article you’ve read. You can look up articles that cite the one you’ve read on Google Scholar, as this video explains.
  • Look up other issues of the journal your paper was in. Most journals publish in a pretty narrow range of topics so you’ll have a leg up on understanding the new articles.
  • Ask a linguist! We’re a friendly bunch and pretty responsive to e-mail. You might even see if you can find the contact info of the author(s) of the article you read to ask them for suggestions for other stuff to read.

I hope this has been helpful and piqued your interest about diving into linguistics research. Now get out there and get reading!

*Not actually patented.

Why can you mumble “good morning” and still be understood?

I got an interesting question on Facebook a while ago and thought it might be a good topic for a blog post:

I say “good morning” to nearly everyone I see while I’m out running. But I don’t actually say “good”, do I? It’s more like “g’ morning” or “uh morning”. Never just morning by itself, and never a fully articulated good. Is there a name for this grunt that replaces a word? Is this behavior common among English speakers, only southeastern speakers, or only pre-coffee speakers?

This sort of thing is actually very common in speech, especially in conversation. (Or “in the wild” as us laboratory types like to call it.) The fancy-pants name for it is “hypoarticulation”. That’s less (hypo) speech-producing movements of the mouth and throat (articulation). On the other end of the spectrum you have “hyperarticulation” where you very. carefully. produce. each. individual. sound.

Ok, so you can change how much effort you put into producing speech sounds, fair enough. But why? Why don’t we just sort of find a happy medium and hang out there? Two reasons:

  1. Humans are fundamentally lazy. To clarify: articulation costs energy, and energy is a limited resource. More careful articulation also takes more time, which, again, is a limited resource. So the most efficient speech will be very fast and made with very small articulator movements. Reducing the word “good” to just “g” or “uh” is a great example of this type of reduction.
  2. On the other hand, we do want to communicate clearly. As my advisor’s fond of saying, we need exactly enough pointers to get people to the same word we have in mind. So if you point behind someone and say “er!” and it could be either a tiger or a bear, that’s not very helpful. And we’re very aware of this in production: there’s evidence that we’re more likely to hyperarticulate words that are harder to understand.

So we want to communicate clearly and unambiguously, but with as little effort as possible. But how does that tie in with this example? “G” could be “great” or “grass” or “génial”, and “uh” could be any number of things. For this we need to look outside the linguistic system.

The thing is, language is a social activity and when we’re using language we’re almost always doing so with other people. And whenever we interact with other people, we’re always trying to guess what they know. If we’re pretty sure someone can get to the word we mean with less information, for example if we’ve already said it once in the conversation, then we will expend less effort in producing the word. These contexts where things are really easily guessable are called “low entropy”. And in a social context like jogging past someone in the morning, phrases like “good morning” have very low entropy. Much lower than, for example, “Could you hand me that pickle?”–if you jogged past someone and said that you’d be very likely to hyperarticulate to make sure they understood.
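If you’re curious what “low entropy” looks like as an actual number, here’s a minimal sketch using Shannon entropy. Fair warning: the probabilities below are completely made up for illustration; they don’t come from any real speech data.

```python
import math

def entropy(probs):
    """Shannon entropy, in bits, of a probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Hypothetical next-word probabilities after "good ___" while jogging:
# listeners expect "morning" almost every time.
greeting = [0.9, 0.05, 0.03, 0.02]

# Hypothetical distribution for an out-of-the-blue request: lots of
# words are about equally likely, so the entropy is much higher.
request = [0.1] * 10

print(entropy(greeting))  # low: easy to guess, safe to hypoarticulate
print(entropy(request))   # high: better hyperarticulate to be safe
```

The intuition: the lower the entropy, the less acoustic information your listener needs from you, so the more you can get away with reducing.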

Do you tweet the way you speak?

So one of my side projects is looking at what people are doing when they choose to spell something differently–what sort of knowledge about language are we encoding when we decide to spell “talk” like “tawk”, or “playing” like “pleying”? Some of these variant spellings probably don’t have anything to do with pronunciation, like “gawd” or “dawg”, which I think are more about establishing a playful, informal tone. But I think that some variant spellings absolutely are encoding specific pronunciation. Take a look at this tweet, for example (bolding mine):

There are three different spellings here, two of which look like th-stopping (where the “th” sound as in “that” is produced as a “d” sound instead) and one that looks like r-lessness (where someone doesn’t produce the r sound in some words). But unfortunately I don’t have a recording of the person who wrote this tweet; there’s no way I can know if they produce these words in the same way in their speech as they do when typing.

Fortunately, I was able to find someone who 1) uses variant spellings in their Twitter and 2) I could get a recording of:

This let me directly compare how this particular speaker tweets to how they speak. So what did I find? Do they tweet the same way they speak? It turns out that it actually depends.

  • Yes! For some things (like the th-stopping and r-lessness like I mentioned above) this person does tweet and speak in pretty much the same way. They won’t use an “r” in spelling where they wouldn’t say an “r” sound and vice versa.
  • No! But for other things (like saying “ing” words as “in” or saying words like “coffin” and “coughing” with a different vowel in the first syllable), while this person does them a lot in their speech, they aren’t using variant spellings at the same level in their tweets. So they’ll say “runnin” 80% of the time, for example, but type it as “running” 60% of the time (rather than 20%, which is what we’d expect if the Twitter and speech data were showing the same thing).
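For readers who like to check numbers: here’s a minimal sketch of how you could test whether a speech rate and a tweet rate really differ, using a two-proportion z-test in plain Python. The token counts below are hypothetical (the post only reports percentages), so treat this as a method illustration rather than my actual analysis.

```python
import math

def two_proportion_z(hits_a, n_a, hits_b, n_b):
    """z statistic comparing two proportions, using the pooled
    standard error under the null hypothesis of equal rates."""
    p_a, p_b = hits_a / n_a, hits_b / n_b
    pooled = (hits_a + hits_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_a - p_b) / se

# Hypothetical counts matching the rates in the post: "runnin" in
# 80 of 100 spoken tokens, but only 40 of 100 tweeted tokens
# (i.e. "running" typed 60% of the time).
z = two_proportion_z(80, 100, 40, 100)
print(z)  # |z| > 1.96 suggests the two rates genuinely differ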

So what’s going on? Why are only some things being used in the same way on Twitter and in speech? To answer that we’ll need to dig a little deeper into the way these things are used in speech.

  • How are th-stopping and r-lessness being used in speech? When you compare the video above to a recording of the sports radio announcer that’s being parodied (try this one), you’ll find that these features are actually used more in the video above than they are in the speech that’s being parodied. This is pretty common in situations where someone’s really laying on a particular accent (even one they speak natively), which sociolinguists call a performance register.
  • What about the other things? The things that aren’t being used as often on Twitter as they are in speech, on the other hand, actually show up at the same levels in speech, both in the parody and the original. This speaker isn’t overshooting their use of these features; instead they’re just using them the way another native speaker of the dialect would.

So there’s a pretty robust pattern showing up here. This person is only tweeting the way they speak for a very small set of things: those things that are really strongly associated with this dialect and that they’re really playing up in their speech. In other words, they tend to use the things that they’re paying a lot of attention to in the same way both in speech and on Twitter. That makes sense. If you’re very careful to avoid doing something when you’re talking–splitting an infinitive or ending a sentence with a preposition, maybe–you’re probably not going to do it when you’re writing either. But if there’s something that you do all the time when you’re talking and aren’t really aware of, then it will probably show up in your writing. For example, there are lots of little phrases I’ll use in my speech (like “no worries”, for example) that I don’t think I’ve ever written down, even in really informal contexts. (Except for here, obviously.)

So the answer to whether tweets and speech act the same way is… it depends. Which is actually really useful! Since it looks like it’s only the things that people are paying a lot of attention to that get overshot in both speech and on Twitter, this can help us figure out what things people think are really important by looking at how they use them on Twitter. And that can help us understand what it is that makes a dialect sound different, which is useful for things like dialect coaching, language teaching and even helping computers understand multiple dialects well.

(BTW, If you’re interested in more details on this project, you can see my poster, which I’ll be presenting at NWAV44 this weekend, here.)

Does your dialect affect the way you hear things?

Most people know that your dialect affects the way that you say things (saying “tin” and “ten” the same way, for example). But did you know that your dialect also affects the way you hear things? Even more interesting, you probably have some mismatches between the way you hear language sounds and the way you say them. This came as a surprise to linguists, since back in the day we used to think that you pretty much heard and said things the same way.


Do you have an ear for your dialect?

One of the earliest studies to find differences between production and perception of dialect forms was carried out in the early seventies by William Labov, Malcah Yaeger, and Richard Steiner (you can read a discussion here on page 266). They found that speakers from Essex, in England, produced words like “line” and “loin” slightly differently. However, when they played recordings and asked speakers which word they heard, the participants weren’t able to reliably hear a difference. And it wasn’t just those two words or even that one dialect that they found this happening in: people reported hearing lots of mergers that they weren’t, in fact, producing as mergers. They found the same effect for “source” and “sauce” in New York City, “hock” and “hawk” in Pennsylvania, “full” and “fool” in Albuquerque and “too” and “toe” in Norwich. And this pattern keeps cropping up in continuing work. Alan Yu, for example, found evidence of a near-merger between two tones in Cantonese in 2007.

So you have pretty strong evidence of a split between dialectal perception and production here. This is pretty weird, since we tend to think of both production and perception as facets of one thing: capital-L Language.

But there’s a second side to the story as well. On the one side you have people that have a difference in production but no difference in perception. But on the other side you have people who can perceive and remember dialectal features effortlessly, but who don’t produce them at all. Sumner and Samuel called these people “fluent listeners”. (You can read the whole paper here–experiment three has some interesting investigation of how fluent listeners store things in short and long term memory.) We’ve all probably run across someone who was a fluent listener: they’re usually surprised that you can’t understand their friend whose accent is impenetrable to you, and who sounds nothing like the person who can understand them so easily.

So if perception and production can have such marked mismatches, does this mean that we have to entirely abandon the idea that they’re related? Not necessarily. Even though they may not perfectly mirror each other, dialect differences in perception and production do seem to be linked. Tyler Kendall and Valerie Fridland, for instance, looked at perception and production in the Southern Vowel Shift (a type of ongoing sound change in the Southern United States). They found that, while individuals differed in how they heard and said these vowels, there was also a general trend: the more someone produced shifted vowels, the more likely they were to hear vowels as shifted. So there’s no guarantee that someone will hear and produce things in the same way… but there is a relationship between them.

It’s not a solved problem by any means. There’s a lot that we don’t understand about the way that people perceive speech sounds, and a lot of work to be done. We can, however, make one robust observation: someone’s dialect is likely to be related to the way they hear things.

How to make STEM classrooms more inclusive

This post is a bit of a departure from my usual content. I’m assuming two things about you, the reader:

  1. You teach/learn in a STEM classroom
  2. You’d like to be more inclusive

If that’s not you, you might want to skip this one. Sorry; I’ll be back to my usual haunts with the next post.

If you’re still with me, you may be wondering what triggered this sudden departure from fun facts about linguistics. The answer is that I recently had an upsetting experience, and it’s been niggling at me. I’m a member of an online data analysis community that’s geared towards people who program professionally. Generally, it’s very helpful and a great way to find out about new packages and tricks I can apply in my work. The other day, though, someone posted a link to a project designed to sort women by their physical attractiveness. I commented that it was not really appropriate for a professional environment, and was especially off-putting to the women in the group. I’m not upset that I spoke out, but I’m a little unhappy that I had to. I’m also upset that at least one person thought my criticisms were completely unnecessary. (And, yes, both the person who originally posted the link and the aforementioned commenter are male.)

It got me thinking about inclusiveness in professional spaces, though. Am I really doing all I can to ensure that the field of linguistics is being inclusive? While linguistics as a whole is not horribly skewed male, professional linguists are more likely to be male, especially in computational linguistics. And we are definitely lacking in racial diversity; as the Linguistics Society of America (our main professional organization) puts it:

The population of ethnic minorities with advanced degrees in linguistics is so low in the U.S. that none of the federal agencies report data for these groups.

If you’re like me, you see that as a huge problem and you want to know what you can do to help fix it. That’s why I’ve put together this list of concrete strategies you can use in your classroom and interactions with students to be more inclusive, especially towards women. (Since I’m not disabled or a member of an ethnic minority group, I can’t speak to those experiences, but I invite anyone who can and has additional suggestions to either comment below or contact me anonymously.) The suggestions below are drawn from my experience as both a teacher and a student, as well as input from the participants and other facilitators in last year’s Including All Students: Teaching in the Diverse Classroom workshops.

For Teachers: 

  • If someone calls you on non-inclusive behavior, acknowledge it, apologize and don’t do it again. I know this seems like an obvious one, but it can be really, really important. For example, a lot of linguistics teaching materials are really geared towards native English speakers. The first quarter I taught I used a problem set in class that required native knowledge of English. When a student (one of several non-native speakers) mentioned it, I was mortified and tempted to just ignore the problem. If I had, though, that student would have felt even more alienated. If someone has the courage to tell you about a problem with your teaching you should acknowledge that, admit your wrong-doing and then make sure it doesn’t happen again.
  • Have space for anonymous feedback. That said, it takes a lot of courage to confront an authority figure–especially if you’re already feeling uncomfortable or like you’re not wanted or valued. To combat that, I give my students a way to contact me anonymously (usually through a webform of some kind). While it may seem risky, all the anonymous feedback I have ever received has been relevant and useful.
  • Group work. This may seem like an odd thing to have on the list, but I’ve found that group work in the classroom is really valuable, both as an instructor and as a student. I may not feel comfortable speaking up or asking questions in front of the class as a whole, but small groups are much less scary. My favorite strategy for group work is to put up a problem or discussion question and then drift from group to group, asking students for their thoughts and answering questions.
  • Structure interactive portions of the class. Sometimes small group work doesn’t work well for your material. It’s still really helpful to provide a structure for students to interact and ask questions, because it lets you ensure that all students are included (it has the additional benefit of keeping everyone awake during those drowsy after-lunch classes). Talbot Taylor, for example, would methodically go around in the classroom in order and ask every single student a question during class. Or you could have every student write a question about the course content to give to you at the end of class that you address at the beginning of the next class. Or, if you have readings, you can assign one or two students to lead the discussion for each reading.
  • Don’t tokenize. This is something that one of the workshop participants brought up, and I realized that it’s totally something I’ve been guilty of doing (especially if I know one of my students speaks a rare language). If there is only one student of a certain group in your class, don’t ask them to speak for or represent their group. So if you have one African American student, don’t turn to them every time you discuss AAE. If they volunteer to speak about it, great! But it’s not fair to expect them to, and it can make students feel uncomfortable.
  • If someone asks you to speak to someone else for them, don’t mention the person who asked you. I know this one is oddly specific, but it’s another thing that came out of the workshop. One student had asked their advisor to ask another faculty member to stop telling sexist jokes in class. Their advisor did so, but also mentioned that it was the student who’d complained, and the second faculty member then ridiculed the student during the next class. (This wasn’t in linguistics, but still–yikes!) If someone’s asking you to pass something on for them, there’s probably a very good reason why they’re not confronting that person directly.
  • Don’t objectify minority students. This one mainly applies to women. Don’t treat women, or women’s bodies, like things. That’s what was so upsetting for me about the machine learning example I brought up at the beginning of the article: the author was literally treating women like objects. Another example comes from geoscience, where a student tells about their experience at a conference where “lecturers… included… photo[s] of a woman in revealing clothing…. I got the feeling that female bodies were shown not only to illustrate a point, but also because they were thought to be pretty to look at” (Women in the Geosciences: Practical, Positive Practices Toward Parity, Holmes et al., p. 4).

For Everybody: 

  • Actively advocate for minority students. If you’re outside of a minority that you notice is not receiving equal treatment, please speak up about it. For example, if you’re a man and you notice that all the example sentences in a class are about John–a common problem–suggest a sentence with Mei-Ling, or another female name, instead. It’s not fair to ask students who are being discriminated against to be the sole advocates for themselves. We should all be on the lookout for sneaky prejudices.  
  • Don’t speak for/over minority students. That said, don’t put words in people’s mouths. If you’re speaking up about something, don’t say something like, “I think x is making Sanelle uncomfortable”. It may very well be making Sanelle uncomfortable, but that’s up for Sanelle to say. Try something like “I’m not sure that’s an appropriate example”, instead.

Those are some of my pointers. What other strategies do you have to help make the classroom more inclusive?

What’s the best way to block the sound of a voice?

Atif asked:

My neighbor talks loudly on the phone and I can’t sleep. What is the best method to block his voice noise?

Great question Atif! There are few things more distracting than hearing someone else’s conversation, and only hearing one side of a phone conversation is even worse. Even if you don’t want it to, your brain is trying to fill in the gaps and that can definitely keep you awake. So what’s the best way to avoid hearing your neighbor? Well, probably the very best way is to try talking to them. Failing that, though, you have three main options: isolation, damping and masking.

So what’s the difference between them, and what’s the best option for you? Before we get down to the nitty gritty I think it’s worth a quick reminder of what sound actually is: sound waves are just that–waves. Just like waves in a lake or ocean. Imagine you and a neighbor share a small pond and you like to go swimming every morning. Your neighbor, on the other hand, has a motorboat that they drive around on their side. The waves the motorboat makes keep hitting you as you try to swim and you want to avoid them. This is very similar to your situation: your neighbor’s voice is making waves and you want to avoid being hit by them.

Isolation: So one way to avoid feeling the effects of waves in a pond, to use our example, is to build a wall down the center of the pond. As long as there are no holes in the wall for the waves to diffract through, you should be able to avoid feeling the effects of the waves. Noise isolation works much the same way. You can use earplugs that are firmly mounted in your ears to form a seal, and that should prevent any sound waves from reaching your eardrums, right? Well, not quite. The wrinkle is that sound can travel through solids as well. It’s as if we built the wall in our pond out of something flexible, like rubber, instead of something solid, like brick. As waves hit the wall, the wall itself will move with the wave and then transmit it to your side. So you may still end up hearing some noises, even with well-fitted headphones.

Techniques: earplugs, noise-isolating headphones or earbuds, noise-isolating architecture.

Damping: So in our pond example we might imagine doing something that makes it harder for waves to move through the water. If you replaced all the water with molasses or honey, for example, it would take a lot more energy for the sound waves to move through it and they’d dissipate more quickly.

Techniques: acoustic tiles, covering the intervening wall (with a fabric wall-hanging, foam, empty egg cartons, etc.), covering vents, placing a rolled-up towel under any doors, hanging heavy curtains over windows, putting down carpeting

Masking: Another way to avoid noticing our neighbor’s waves is to start making our own waves. We can either make waves that are exactly the same size as our neighbor’s but out of phase (so when theirs are at their highest peak, ours are at our lowest), so they end up cancelling each other out. That’s basically what noise-cancelling headphones do. Or we can make a lot of our own waves that all feel enough like our neighbor’s that when their wave arrives we don’t even notice it. Of course, if the point is to hear no sound at all, that won’t work quite as well. But if the point is to avoid abrupt, distracting changes in sound then this can work quite nicely.

Techniques: Listening to white noise or music, using noise-cancelling headphones or earbuds
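The phase-cancellation trick behind noise-cancelling headphones is easy to demonstrate numerically. Here’s a minimal Python sketch: two identical sine waves, half a cycle out of phase, sum to (almost exactly) silence.

```python
import math

def sine(freq, phase, n=100, rate=8000):
    """n samples of a sine wave at the given frequency (Hz) and phase."""
    return [math.sin(2 * math.pi * freq * t / rate + phase)
            for t in range(n)]

neighbor = sine(440, 0.0)        # the unwanted sound
anticancel = sine(440, math.pi)  # same wave, shifted half a cycle

# Adding the out-of-phase copy cancels the original almost exactly.
combined = [a + b for a, b in zip(neighbor, anticancel)]
print(max(abs(s) for s in combined))  # ~0, down at floating-point noise
```

Real noise-cancelling headphones have to estimate the incoming wave with a microphone and produce the inverted copy in real time, which is much harder, but the underlying math is exactly this.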


So what would I do? Well, first I’d take as many steps as I could to sound-proof my environment. Try to cover as many of the surfaces in your bedroom with absorbent, ideally fluffy, materials as you can. (If it can absorb water it will probably help absorb sound.) Wall hangings, curtains and a throw rug can all help a great deal.

Then you have a couple options for masking. A fan helps to provide both a bit of acoustic masking and a nice breeze. Personally, though, I like a white noise machine that gives you some control over the frequency (how high or low the pitch is) and intensity (loudness) of the sounds it makes. That lets you tailor it so that it best masks the sounds that are bothering you. I also prefer the ones with the fans rather than those that loop recorded sounds, since I often find the loop jarring. If you don’t want to or can’t buy one, though, myNoise has a number of free generators that let you tailor the frequency and intensity of a variety of sounds and don’t have annoying loops. (There are a bunch of additional features available that you can access for a small donation as well.)

If you can wear earbuds in bed, try playing a non-distracting noise at around 200-1000 Hertz, which will cover a lot of the speech sounds you can’t easily dampen. Make sure your earbuds are well-fitted in the ear canal so that as much noise is isolated as possible. In addition, limiting the amount of exposed hard surface on them will also increase noise isolation. You can knit little cozies, try to find earbuds with a nice thick silicon/rubber coating or even try coating your own.
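If you’d rather roll your own masking noise, here’s a rough Python sketch that approximates band-limited noise in the 200–1000 Hz range by summing sinusoids at random frequencies and phases, then writes it to a WAV file using only the standard library. (A real noise generator would use proper filtering; this is just to show the idea, and the filename is arbitrary.)

```python
import math
import random
import struct
import wave

RATE = 16000           # samples per second
SECONDS = 2
LOW, HIGH = 200, 1000  # the band that covers a lot of speech energy

# Sum sinusoids at random frequencies in the band, each with a random
# phase, to approximate band-limited noise without any DSP libraries.
random.seed(0)
components = [(random.uniform(LOW, HIGH), random.uniform(0, 2 * math.pi))
              for _ in range(40)]

samples = []
for t in range(RATE * SECONDS):
    s = sum(math.sin(2 * math.pi * f * t / RATE + ph)
            for f, ph in components)
    samples.append(s / len(components))  # keep amplitude within [-1, 1]

with wave.open("masking_noise.wav", "wb") as wav:
    wav.setnchannels(1)   # mono
    wav.setsampwidth(2)   # 16-bit samples
    wav.setframerate(RATE)
    wav.writeframes(b"".join(struct.pack("<h", int(s * 32767))
                             for s in samples))
```

Play the resulting file on loop and you get a steady, speech-band hiss with no jarring seam, since nothing in it changes abruptly.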

By using many different strategies together you can really reduce unwanted noises. I hope this helps and good luck!

Does reading a story affect the way you talk afterwards? (Or: do linguistic tasks have carryover effects?)

So tomorrow is my generals exam (the title’s a bit misleading: I’m actually going to be presenting research I’ve done so my committee can decide if I’m ready to start work on my dissertation–fingers crossed!). I thought it might be interesting to discuss some of the research I’m going to be presenting in a less formal setting first, though. It’s not at the same level of general interest as the Twitter research I discussed a couple weeks ago, but it’s still kind of a cool project. (If I do say so myself.)


Shhhh. I’m listening to linguistic data. “Plush bunny with headphones”. Licensed under Public Domain via Wikimedia Commons.

Basically, I wanted to know whether there are carryover effects for some of the most commonly used linguistics tasks. A carryover effect is when you do something and whatever it was you were doing continues to affect you after you’re done. This comes up a lot when you want to test multiple things on the same person.

An example might help here. So let’s say you’re testing two new malaria treatments to see which one works best. You find some malaria patients, they agree to be in your study, and you give them treatment A and record their results. Afterwards, you give them treatment B and again record their results. But if it turns out that treatment A cures malaria (yay!) it’s going to look like treatment B isn’t doing anything, even if it is helpful, because everyone’s already been cured. So their behavior in the second condition (treatment B) is affected by their participation in the first condition (treatment A): the effects of treatment A have carried over.

There are a couple of ways around this. The easiest one is to split your group of participants in half and give half of them A first and half of them B first. However, a lot of times when people are using multiple linguistic tasks in the same experiment, they won’t do that. Why? Because one of the things that linguists–especially sociolinguists–want to control for is speech style. And there’s a popular idea in sociolinguistics that you can make someone talk more formally, but it’s really hard to make them talk less formally. So you tend to end up with a fixed task order going from informal tasks to more formal tasks.

So, we have two separate ideas here:

  • The idea that one task can affect the next, and so we need to change task order to control for that
  • The idea that you can only go from less formal speech to more formal speech, so you need to not change task order to control for that

So what’s a poor linguist to do? Balance task order to prevent carryover effects but risk not getting the informal speech they’re interested in? Or keep task order fixed to get informal and formal speech but at the risk of carryover effects? Part of the problem is that, even though they’re really well-studied in other fields like psychology, sociology or medicine, carryover effects haven’t really been studied in linguistics before. As a result, we don’t know how bad they are–or aren’t!

Which is where my research comes in. I wanted to see if there were carryover effects and what they might look like. To do this, I had people come into the lab and do a memory game that involved saying the names of weird-looking things called Fribbles aloud. No, not the milkshakes, one of the little purple guys below (although I could definitely go for a milkshake right now). Then I had them do one linguistic elicitation task (reading a passage, doing an interview, reading a list of words or, to control for the effects of just sitting there for a bit, an arithmetic task). Then I had them repeat the Fribble game. Finally, I compared a bunch of measures from speech I recorded during the two Fribble games to see if there were any differences.

Greeble designed by Scott Yu and hosted by the Tarr Lab wiki (click for link).


What did I find? Well, first, I found the same thing a lot of other people have found: people tend to talk differently while doing different things. (If I hadn't found that, then it would be pretty good evidence that I'd done something wrong when designing my experiment.) But the really exciting thing is that I found that, for some specific measures, there weren't any carryover effects. I didn't find any carryover effects for speech speed, loudness or changes in pitch. So if you're looking at those things you can safely reorder your experiments to help avoid other effects, like fatigue.
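To give a feel for the kind of before/after comparison involved, here's a toy version with invented numbers (not my actual data): compute each speaker's change in a measure between the two Fribble games and look at the average shift.

```python
# Hypothetical speech-rate measurements (syllables per second) for five
# speakers: once in the first Fribble game, once in the second.
first_game  = [4.1, 3.8, 4.5, 4.0, 4.3]
second_game = [4.2, 3.7, 4.6, 4.0, 4.4]

# Per-speaker change; a carryover effect would show up as a consistent
# non-zero shift in the same direction across speakers.
diffs = [after - before for before, after in zip(first_game, second_game)]
mean_shift = sum(diffs) / len(diffs)
```

In the real analysis you'd want an actual statistical test (for example a paired t-test) rather than just eyeballing the mean shift, but the logic is the same.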

But I did find that something a little more interesting was happening with the way people were saying their vowels. I’m not 100% sure what’s going on with that yet. The Fribble names were funny made-up words (like “Kack” and “Dut”) and I’m a little worried that what I’m seeing may be a result of that weirdness… I need to do some more experiments to be sure.

Still, it’s pretty exciting to find that there are some things it looks like you don’t need to worry about carryover effects for. That means that, for those things, you can have a static order to maintain the style continuum and it doesn’t matter. Or, if you’re worried that people might change what they’re doing as they get bored or tired, you can switch the order around to avoid having that affect your data.

Tweeting with an accent

I’m writing this blog post from a cute little tea shop in Victoria, BC. I’m up here to present at the Northwest Linguistics Conference, which is a yearly conference for both Canadian and American linguists (yes, I know Canadians are Americans too, but United Statsian sounds weird), and I thought that my research project may be interesting to non-linguists as well. Basically, I investigated whether it’s possible for Twitter users to “type with an accent”. Can linguists use variant spellings in Twitter data to look at the same sort of sound patterns we see in different speech communities?

Picture of a bird saying “Let’s Tawk”. Taken from the website of the Center for the Psychology of Women in Seattle. Click for link.

So if you’ve been following the Great Ideas in Linguistics series, you’ll remember that I wrote about sociolinguistic variables a while ago. If you didn’t, sociolinguistic variables are sounds, words or grammatical structures that are used by specific social groups. So, for example, in Southern American English (representing!) the sound in “I” is produced with only one sound, so it’s more like “ah”.

Now, in speech these sociolinguistic variables are very well studied. In fact, the Dictionary of American Regional English was just finished in 2013 after over fifty years of work. But in computer mediated communication–which is the fancy term for internet language–they haven’t been really well studied. In fact, some scholars suggested that it might not be possible to study speech sounds using written data. And on the surface of it, that does make sense. Why would you expect to be able to get information about speech sounds from a written medium? I mean, look at my attempt to explain an accent feature in the last paragraph. It would be far easier to get my point across using a sound file. That said, I’d noticed in my own internet usage that people were using variant spellings, like “tawk” for “talk”, and I had a hunch that they were using variant spellings in the same way they use different dialect sounds in speech.

While hunches have their place in science, they do need to be verified empirically before they can be taken seriously. And so before I submitted my abstract, let alone gave my talk, I needed to see if I was right. Were Twitter users using variant spellings in the same way that speakers use different sound patterns? And if they are, does that mean that we can investigate sound patterns using Twitter data?

Since I’m going to present my findings at a conference and am writing this blog post, you can probably deduce that I was right, and that this is indeed the case. How did I show this? Well, first I picked a really well-studied sociolinguistic variable called the low back merger. If you don’t have the merger (most African American speakers and speakers in the South don’t) then you’ll hear a strong difference between the words “cot” and “caught” or “god” and “gaud”. Or, to use the example above, you might have a difference between the words “talk” and “tock”. “Talk” is little more backed and rounded, so it sounds a little more like “tawk”, which is why it’s sometimes spelled that way. I used the Twitter public API and found a bunch of tweets that used the “aw” spelling of common words and then looked to see if there were other variant spellings in those tweets. And there were. Furthermore, the other variant spellings used in tweets also showed features of Southern American English or African American English. Just to make sure, I then looked to see if people were doing the same thing with variant spellings of sociolinguistic variables associated with Scottish English, and they were. (If you’re interested in the nitty-gritty details, my slides are here.)

Ok, so people will sometimes spell things differently on Twitter based on their spoken language dialect. What’s the big deal? Well, for linguists this is pretty exciting. There’s a lot of language data available on Twitter and my research suggests that we can use it to look at variation in sound patterns. If you’re a researcher looking at sound patterns, that’s pretty sweet: you can stay home in your jammies and use Twitter data to verify findings from your field work. But what if you’re not a language researcher? Well, if we can identify someone’s dialect features from their Tweets then we can also use those features to make a pretty good guess about their demographic information, which isn’t always available (another problem for sociolinguists working with internet data). And if, say, you’re trying to sell someone hunting rifles, then it’s pretty helpful to know that they live in a place where they aren’t illegal. It’s early days yet, and I’m nowhere near that stage, but it’s pretty exciting to think that it could happen at some point down the line.

So the big take away is that, yes, people can tweet with an accent, and yes, linguists can use Twitter data to investigate speech sounds. Not all of them–a lot of people aren’t aware of many of their dialect features and thus won’t spell them any differently–but it’s certainly an interesting area for further research.

“Men” vs. “Females” and sexist writing

So, I have a confession to make. I actually set out to write a completely different blog post. In searching Wikimedia Commons for a picture, though, I came across something that struck me as odd. I was looking for pictures of people writing, and I noticed that there were two gendered sub-categories, one for men and one for women. Leaving aside the question of having only two genders, what really stuck out to me were the names. The category with pictures of men was called “Men Writing” and the category with pictures of women was called “Females Writing”.

Family 3

According to this sign, the third most common gender is “child”.

So why did that bother me? It is true that male humans are men and that women are female humans. Sure, a writing professor might nag about how the two terms lack parallelism, but does it really matter?

The thing is, it wouldn't matter if this was just a one-off thing. But it's not. Let's look at Category: Males and Category: Females*. At the top of the category page for men, it states "This category is about males in general. For human males, see Category:Male humans". And the male humans category is, conveniently, the first subcategory. Which is fine, no problem there. BUT. There is no equivalent disclaimer at the top of Category: Females, and the first subcategory is not female humans but female animals. So even though "Females" is used to refer specifically to female humans when talking about writing, when talking about females in general it looks as if at least one editor has decided that it's more relevant for referring to female animals. And that also gels with my own intuitions. I'm more likely to ask "How many females?" when looking at a bunch of baby chickens than I am when looking at a bunch of baby humans. Assuming the editors responsible for these distinctions are also native English speakers, their intuitions are probably very similar.

So what? Well, it makes me uncomfortable to be referred to with a term that is primarily used for non-human animals while men are referred to with a term that I associate with humans. (Or, perhaps, women are being referred to as “female men”, but that’s equally odd and exclusionary.)

It took me a while to come to that conclusion. I felt that there was something off about the terminology, but I had to turn and talk it over with my officemate for a couple minutes before finally getting at the kernel of the problem. And I don't think it's a conscious choice on the part of the editors–it's probably something they don't even realize they're doing. But I definitely do think that it's related to the gender imbalance of the editors of Wikimedia. According to recent statistics, over ninety percent (!) of Wikipedia editors are male. And this type of sexist language use probably perpetuates that imbalance. If I feel, even if it's for reasons that I have a hard time articulating, that I'm not welcome in a community then I'm less likely to join it. And that's not just me. Students who are presented with job descriptions in language that doesn't match their gender are less likely to be interested in those jobs. Women are less likely to respond to job postings if "he" is used to refer to both men and women. I could go on citing other studies, but we could end up being here all day.

My point is this: sexist language affects the behaviour and choices of those who hear it. And in this case, it makes me less likely to participate in this on-line community because I don’t feel as if I would be welcomed and respected there. It’s not only Wikipedia/Wikimedia, either. This particular usage pattern is also something I associate with Reddit (a good discussion here). The gender breakdown of Reddit? About 70% male.

For some reason, the idea that we should avoid sexist language usage seems to really bother people. I was once a TA for a large lecture class where, in the middle of a discussion of the effects of sexist language, a male student interrupted the professor to say that he didn't think it was a problem. I've since thought about it quite a bit (it was pretty jarring) and I've come to the conclusion that the reason the student felt that way is that, for him, it really wasn't a problem. Since sexist language is almost always exclusionary to women, and he was not a woman, he had not felt that moment of discomfort before.

Further, I think he may have felt that, because this type of language tends to benefit men, we were blaming him. I want to be clear here: I'm not blaming anyone for their unconscious biases. And I'm not saying that only men use sexist language. The Wikimedia editors who made this choice may very well have been women. What I am saying is that we need to be aware of these biases and strive to correct them. It's hard, and it takes constant vigilance, but it's an important and relatively simple step that we can all take in order to help eliminate sexism.

*As they were on Wednesday, April 8, 2015. If they've been changed, I'd recommend the Wayback Machine.

What affects tongue length?

People tend to be surprised when they learn that there is a lot of variation in the vocal tract (all those parts of your head and neck that you use to produce speech sounds). For example, the epiglottis, that little flap that keeps you from swallowing your food into your lungs, comes in five or six completely different shapes. It can be thin and flat with serrated edges, thick with rounded edges, or a mixture of the two. If looking at it didn't involve sticking cameras down the throat via the mouth or nose, it would actually be pretty useful for biometrics.

The tongue doesn’t have quite as much variation in shape as the epiglottis, but there is one bit of variation that seems to get quite a bit of interest: tongue length.

Gray1019.png

Now hold that while I get a measuring tape.

So what can affect tongue length? Well, the biggest factor is probably how you measure it. The Guinness Book of World Records, for example, measures the length of the tongue from the tip of the extended tongue to the middle of the top lip. (The current record holder, Nick Stoeberl, can extend his tongue almost four inches past his top lip.) But, as you'll notice looking at the diagram above, the amount of the tongue that can stick out past your lips is actually pretty limited. The tongue itself goes all the way down to the hyoid bone, in your throat. So if you want to measure the entire tongue, probably the most accurate way is to measure from the tongue tip to the epiglottis (down in the throat) while the tongue is at rest. The downside to this, of course, is that it will trigger gagging and it's hard to see what you're doing at the back of someone's throat. Plus it has the definite potential to block the airway. As a result, tongue measurements of this type tend to be done on cadavers. There are also some imaging techniques like x-ray, ultrasound or MRI. But let's assume that you don't have a couple hundred thousand dollars' worth of equipment or a medical cadaver just lying around and just focus on that first measurement–although be warned that it doesn't have very strong inter-rater reliability.

Now that that’s out of the way, we can get down to business: what affects how far you can stick your tongue out? There are actually a lot of factors at work here:

  • Frenulum: The lingual frenulum, that is. This is the little bit of tissue that connects the bottom of your tongue to the floor of your mouth. For most people this actually won't affect tongue extension, but for some people it's a big problem. Have you ever heard the expression "tongue tied"? This actually refers to a lingual frenulum that's too short and extends too far towards the tip of the tongue. This condition, which is called ankyloglossia, is especially problematic when trying to produce speech sounds or for babies who are trying to nurse. In some cases, doctors may actually cut the lingual frenulum in order to free the tongue. For most people, though, cutting the frenulum would not increase freedom of movement or length of extension in the tongue. Plus, the risks associated with oral surgery are substantial.
  • Bone structure and tooth placement: Bone structure and tooth placement can also affect how far the tongue can be extended. People with short face syndrome–yes, that’s a real medical diagnosis–and overjet tend to have smaller tongues. Other factors such as incisor position and whether a line drawn between the upper and lower sets of teeth tilts or not also co-vary with tongue length.
  • Age: One obvious factor that affects tongue size is age. Adults' tongues are approximately twice the size of infants'. This is surprising, given that the infant's skull makes up 1/4 of its height whereas for adults that figure is only 1/7. As a result, an adult skull is only roughly 1.75 times as large as an infant skull.
  • Biological sex: Finally, there is a slight effect of biological sex. During puberty, high levels of testosterone and human growth hormone trigger growth, especially in the jaw and chin, and this effect is more pronounced in individuals with testes. As a result, their tongues tend to be longer. Too much human growth hormone–acromegaly–can cause growth to continue well past the point of comfort. It also causes the tongue to enlarge and shift forwards in the mouth.

You may notice all these factors have one thing in common: they're not something you can change. Like your height or body shape, tongue length isn't really something you can change about yourself. The good news, though, is that you can produce speech perfectly well with pretty much any length of tongue.