An emoji dance notation system for TikTok dance tutorials 👀💃

This blog post is more of a quick record of my thoughts than a full in-depth analysis, because when I saw this I immediately wanted to start writing about it. Basically, TikTok is a social media app for short form video (RIP Vine, forever in our hearts) and one of the most popular genres of content is short dances; you may already be familiar with the concept.

HOWEVER, what’s particularly intriguing to me is this sort of video here, where someone creates a tutorial for a specific dance and includes an emoji-based dance notation:

Example of a dance with an emoji notation system by

Back in grad school, when I was studying signed languages, I probably spent more time than I should have reading about writing systems for signed languages and also dance notations. To roughly sum up an entire field of study: representing movements of the human body in time and space using a writing system, or even a more specialized notation, is extremely difficult. There are a LOT of other notations out there, and you probably haven’t run into them for a reason: they’re complex, hard to learn, necessarily miss nuances and are a bit redundant given that the vast majority of dance is learned through watching & copying movement. Probably the most well-known type of dance notation is for ballroom dance where the footwork patterns are represented on the floor using images of footsteps, like so:

Langsamer Walzer Grundschritt

I think part of the reason that this notation in particular tends to work well is that it’s completely iconic: the image of a shoe print is where your shoe print should go. It also captures a large part of the relevant information; the upper body position can be inferred from the position of the feet (and in many cases will more or less remain the same throughout).

I think that’s true to some degree of these emoji notations as well. The fact that they work at all may be due in part to the constraints of the TikTok dance genre. In most TikTok dances, the dancer faces in a single direction for the dance, there is minimal movement around the space and the feet move minimally if at all. The performance format itself helps as well: the videos are short and easy to repeat, and you can still see the movements being performed in full with the notation being used as a shorthand.

And it’s clear that this style of notation isn’t idiosyncratic; this compilation has a variety of tutorials from different creators that use variations on the same style of notation.

A selection of tiktok dance tutorials, some of which include emoji notation

Some of the types of ways emoji are used here are similar to the ways that things like Stokoe notation are, to indicate handshape and movement (although not location). A few other types of ways that emoji are used that stick out:

  • Articulator (hands with handshape, peach emoji for the hips)
  • Manner of articulation/movement (“explosive”, a specific number of repetitions, direction of movement using arrows)
  • Iconic representation of a movement using an object (helicopter = hands make the motion of helicopter blades, mermaid = bodywaves, as if a mermaid swimming)
  • Iconic representation of a shape to be traced (a house emoji = tracing a house shape with the hands, heart = trace a heart shape)
  • (Not emoji) Written shorthand for a (presumably) already known dance, for example “WOAH” for the woah

To sum up: I think this is a cool idea, it’s an interesting new type of dance notation that is clearly useful to a specific artistic community. It’s also another really good piece of evidence in the bucket of “emoji are gestures”: these are clearly not a linguistic system and are used in a variety of ways by different users that don’t seem entirely systematic.

Buuuut there’s also the way that the emojis are grouped into phrases for a specific set of related motions, which smells like some sort of shallow parsing even if it’s not a full constituency structure, and I’d say that’s definitely linguistic-ish. I think I’d need to spend more time on analysis to have any more firmly held opinion than that.

Who all studies language? 🤔 A brief disciplinary tour

red and yellow bus photo
Buckle up friends, we’re going on a tour!

One of the nice things about human language is that no matter what your question about it might be, someone, somewhere has almost certainly already asked the same thing… and probably found at least part of an answer! The downside of this wealth of knowledge is that, even if you restrict yourself to just looking at the Western academic tradition, 1) there’s a lot of it and 2) it’s scattered across a lot of disciplines which can make it very hard to find.

An academic discipline is a field of study but also a social network of scholars with shared norms and vocabulary. While people do do “interdisciplinary” work that draws on more than one discipline, the majority of academic life is structured around working in a single discipline. This is reflected in everything from departments to journals and conferences to how research funding is divided.

As a result, even if you study human language in some capacity yourself it can be very hard to form a good idea of where else people are doing related work if it falls into another discipline you don’t have contact with. You won’t see them at your conferences, you probably won’t cite each other in your papers and even if you are studying the exact same thing you’ll probably use different words to describe it and have different research goals. So even many researchers working in language may not know what’s happening in the discipline next door.

For better or worse, though, I’ve always been very curious about disciplinary boundaries and talk to and read a lot of folks and, as a result, have ended up learning a lot about different disciplines. (Note: I don’t know that I’d recommend this to other junior scholars. It made me a bit of a “neither fish nor fowl” when I was on the faculty job market. I did have fun though. 😉) The upside of this is that I’ve had at least three discussions with people where the gist of it was “here are the academic fields that are relevant to your interest” and so I figured it was time to write it up as a blog post to save myself some time in the future.

Disciplines where language is the main focus

These fields study language itself. While people working in these fields may use different tools and have different goals, these are fields where people are likely to say that language is their area of study.

Linguistics

This is the field that studies Language and how it works. Sometimes you’ll hear people talk about “capital L language” to distinguish it from the study of a specific language. Whatever tools, methods or theories linguists use, their main object of study is language itself. There are a lot of fields within linguistics and they vary a lot, but generally if a field has “linguistics” on the end, they’re going to be focusing on language itself.

For more information about linguistics, check out the Linguistic Society of America or my friend Gretchen’s blog.

Language-specific disciplines (classics, English, literature, foreign language departments etc.)

This is a collection of disciplines that study particular languages and specific instances of language use (like specific documents or pieces of oral literature). These fields generally focus on language teaching or applying frameworks like critical theory to better understand texts. Oh, or they produce new texts themselves. If you ask someone in one of these fields what they study, they’ll probably say the name of the specific language or family of languages they work on.

There are a lot of different fields that fall under this umbrella, so I’d recommend searching for “[whatever language you want to know about] studies” and taking it from there.

Speech language pathology/Audiology/Speech and hearing

I’m grouping these disciplines together because they generally focus on language in a medical context. The main focus of researchers in this field is studying how the human body produces and receives language input. A lot of the work here focuses on identifying and treating instances when these processes break down.

A good place to learn more is the American Speech-Language-Hearing Association.

Computer science (Specifically natural language processing, computational linguistics)

This field (more likely to be called NLP these days) focuses on building and understanding computational systems where language data, usually text, is part of either the input or output. Currently the main focus of the field (in terms of press coverage and $$ at any rate) is in applying machine learning methods to various problems. A lot of work in NLP is focused around particular tasks which generally have an associated dataset and shared metric and where the aim is to outperform other systems on the same problem. NLP does use some methods from other fields of machine learning (like computer vision) but the majority of the work uses techniques specific to, or at least developed for, language data.

To learn more, I’d check out the Association for Computational Linguistics. (Note that “NLP” is also an acronym for a pseudoscience thing so I’d recommend searching #NLProc or “Natural Language Processing” instead.)

For reference, I would say that currently my main field is applied NLP, but my background is primarily in linguistics with a sprinkling of language-specific studies, especially English and American Sign Language. (Although I’ve taken course work and been a co-author on papers in speech & hearing.)

Disciplines where language is sometimes studied

There are also a lot of related fields where language data is used, or language is used as a tool to study a different object of inquiry.

  • Data Science. You would be shocked how much of data science is working with text data (or maybe you’re a data scientist and you wouldn’t be). Pretty much every organization has some sort of text they would like to learn about without having to read it all.
  • Computational social science, which uses language data but also frequently other types of data produced by human interaction with computational systems. The aim is usually more to model or understand society rather than language use.
  • Anthropology, where language data is often used to better understand humans. (As a note, early British anthropology in particular is straight up racist imperial apologism, so be ye warned. There have been massive changes in the field, thankfully.) A lot of language documentation used to happen in anthropology departments, although these days I think it tends to be more linguistics. The linguistic-focused subdisciplines are anthropological linguistics or linguistic anthropology (they’re slightly different).
  • Sociology, the study of society. Sociolinguistics is more sociologically-informed linguistics, and in the US historically has been slightly more macro focused.
  • Psychology/Cognitive science. Non-physical brain stuff, like the mind and behavior. The linguistic part is psycholinguistics. This is where a lot of the work on language learning goes on.
  • Neurology. Physical brain stuff. The linguistic part is neurolinguistics. They tend to do a lot of imaging.
  • Education. A lot of the literature on language learning is in education. (Language learning is not to be confused with language acquisition; that’s only for the process by which children naturally acquire a language without formal instruction.)
  • Electrical engineering (Signal processing). This is generally the field of folks who are working on telephony and automatic speech recognition. NLP historically hasn’t done as much with voices, that’s been in electrical engineering/signal processing.
  • Disability studies. A lot of work on signed languages will be in disability studies departments if they don’t have their own department.
  • Historians. While they aren’t primarily studying the changes in linguistic systems, historians interact with older language data a lot and provide context for things like language contact, shift and historical usage.
  • Informatics/information science/library science. Information science is broader than linguistics (including non-linguistic information as well) but often dovetails with it, especially in semantics (the study of meaning) and ontologies (a formal representation of categories and their relations).
  • Information theory. This field is superficially focused on how digital information is encoded. Usually linguistics draws from it rather than vice-versa because it’s lower level, but if you’ve heard of entropy, compression or source-channel theory those are all from information theory.
  • Philosophy. A lot of early linguistics scholars, like Ferdinand de Saussure, would probably have considered themselves primarily philosophers and there was this whole big thing in the early 1900’s. The language-specific branch is philosophy of language.
  • Semiotics. This is a field I haven’t interacted with too much (I get the impression that it’s more popular in Europe than the US) but they study “signs”, which as I understand it is any way of referring to a thing in any medium without using the actual thing, which by that definition does include language.
  • Design studies. Another field I’m not super familiar with, but my understanding is that it includes studying how users of a designed thing interact with it, which may include how they use or interpret language. Also: good design is so important and I really don’t think designers get enough credit/kudos.

Are “a female” and “a male” used differently?

In the first part of this two-post series, I looked at how “a male” and “a female” were used on Twitter. I found that one part of speech tagger tagged “male” as a proper noun really frequently (which is weird, cause it isn’t one) and that overall the phrase “a female” was waaaay more frequent. Which is interesting in itself, since my initial question was “are these terms used differently?” and these findings suggest that they are. But the second question is how are these terms used differently? To answer this, we’ll need to get a little more qualitative with it.

Anas platyrhynchos male female
“Male” and “female” are fine for ducks, but a little weird for humans.
Using the same set of tweets that I collected last time, I randomly selected 100 tweets each from the “a male” and “a female” dataset. Then I hand tagged each subset of tweets for two things: the topic of the tweet (who or what was being referred to as “male” or “female”) and the part of speech of “male”  or “female”.
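
The subsampling step can be sketched in a few lines of Python. This is a minimal sketch and not the code used for the post: the two tweet lists are hypothetical stand-ins for the collected datasets, and the `sample_tweets` helper and its fixed seed are assumptions added so the draw can be repeated.

```python
import random

def sample_tweets(tweets, k=100, seed=0):
    """Return k tweets drawn uniformly at random, without replacement."""
    rng = random.Random(seed)  # fixed seed so the same subsample can be redrawn
    return rng.sample(tweets, k)

# HYPOTHETICAL stand-ins for the two collected datasets (not real tweets).
a_male_tweets = [f"placeholder tweet {i} containing 'a male'" for i in range(500)]
a_female_tweets = [f"placeholder tweet {i} containing 'a female'" for i in range(1000)]

male_sample = sample_tweets(a_male_tweets)
female_sample = sample_tweets(a_female_tweets)
print(len(male_sample), len(female_sample))
```

Sampling without replacement matters here: it keeps the same tweet from being hand tagged twice.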

Who or what is being called “male” or “female”?

Rplot

Because there were so few tweets to analyze, I could do a content analysis. This is a methodology that is really useful when you don’t know for sure ahead of time what types of categories you’re going to see in your data. It’s like clustering that a human does.

Going into this analysis, I thought that there might be a difference between these datasets in terms of how often each term was used to refer to an animal, so I tagged tweets for that. But as I went through the tweets, I was floored by the really high number of tweets talking about trans people, especially Mack Beggs, a trans man from Texas who was forced to wrestle in the women’s division. Trans men were referred to as “a male” really, really often. While there wasn’t a reliable difference between how often “a female” and “a male” was used to refer to animals or humans, there was a huge difference in terms of how often they were used to refer to trans people. “A male” was significantly more likely to be used to describe a trans person than “a female” (χ² (2, N = 200) = 55.33, p < .001).
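
For anyone who hasn’t met a chi-square test before, here is a rough sketch of how the statistic is computed from a table of counts. The counts below are made up purely for illustration and are NOT the data from this analysis. The three topic columns are also an assumption, chosen only because a 2 × 3 table has the two degrees of freedom reported above.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count we would expect in this cell if row and column were unrelated.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof

# HYPOTHETICAL counts, invented for this sketch.
# Rows: "a male", "a female". Columns: animal, cis person, trans person.
counts = [
    [10, 50, 40],
    [12, 83, 5],
]
stat, dof = chi_square(counts)
total = sum(sum(row) for row in counts)
print(f"chi-square({dof}, N = {total}) = {stat:.2f}")
```

The bigger the gap between the observed counts and the counts expected if topic and term were unrelated, the bigger the statistic. Turning it into a p-value takes the chi-square distribution with the matching degrees of freedom, which a statistics package will handle.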

Part of Speech

Since the part of speech taggers I used for the first half of my analysis gave me really mixed results, I also hand tagged the part of speech of “male” or “female” in my samples. In line with my predictions during data collection, the only parts of speech I saw were nouns and adjectives.

When I looked at just the difference between nouns and adjectives, there was a little difference, but nothing dramatic. Then, I decided to break it down a little further. Rather than just looking at the differences in part of speech between “male” and “female”, I looked at the differences in part of speech and whether the tweet was about a trans person or a cis (not trans) person.

Rplot01
For tweets with “female”, it was used as a noun and an adjective at pretty much the same rates regardless of whether someone was talking about a trans person or a cis (non-trans) person. For tweets with “male”, though, when the tweet was about a trans person, it was used almost exclusively as a noun.

And there was a huge difference there. A large majority of tweets with “a male” and talking about a trans person used “male” as a noun. In fact, more than a third of my subsample of tweets using “a male” were using it as a noun to talk about someone who was trans.

So what’s going on here? This construction (using “male” or “female” as a noun to refer to a human) is used more often to talk about:

  1. Women. (Remember that in the first blog post looking at this, I found that “a female” is twice as common as “a male”.)
  2. Trans men.

These both make sense if you consider the cultural tendency to think about cis men as, in some sense, the “default”. (Deborah Tannen has a really good discussion of this in her article “Marked Women, Unmarked Men”. “Marked” is a linguistics term which gets used in a lot of ways, but generally means something like “not the default” or “the weird one”.) So people seem to be more likely to talk about a person being “a male” or “a female” when they’re talking about anyone but a cis man.

A note on African American English

giphy.gif

I should note that many of the tweets in my sample were African American English, which is not surprising given the large Black community on Twitter, and that use of “female” as a noun is a feature of this variety.  However, the parallel term used to refer to men in this variety is not “a man” or even “a male”, but rather “nigga”, with that spelling. This is similar to “dude” or “guy”: a nonspecific term for any man, regardless of race, as discussed at length by Rachel Jeantal here. You can see an example of this usage in speech above (as seen in the Netflix show “The Unbreakable Kimmy Schmidt“) or in this vine. (I will note, however, that it only has this connotation if used by a speaker of African American English. Borrowing it into another variety, especially if the speaker is white, will change the meaning.)

Now, I’m not a native user of African American English, so I don’t have strong intuitions about the connotation of this usage. Taylor Amari Little (who you may know from her TEDx talk on Revolutionary Self-Produced Justice) is, though, and tweeted this (quoted with permission):

If they call women “females” 24/7, leave em alone chile, run away

And this does square with my own intuitions: there’s something slightly sinister about someone who refers to women exclusively as “females”. As journalist Vonny Moyes pointed out in her recent coverage of ads offering women free rent in exchange for sexual favors, they almost always refer to women as “girls or females – rarely ever women”. Personally, I find that very good motivation not to use “a male” or “a female” to talk about any human.

Can what you think you know about someone affect how you hear them?

I’ll get back to the “a male/a female” question in my next blog post (promise!), but for now I want to discuss some of the findings from my dissertation research. I’ve talked about my dissertation research a couple times before, but since I’m going to be presenting some of it in Spain (you can read the full paper here), I thought it would be a good time to share some of my findings.

In my dissertation, I’m looking at how what you think you know about a speaker affects what you hear them say. In particular, I’m looking at American English speakers who have just learned to correctly identify the vowels of New Zealand English. Due to an on-going vowel shift, the New Zealand English vowels are really confusing for an American English speaker, especially the vowels in the words “heed”, “head” and “had”.

tokensVowelPlot
This plot shows individual vowel tokens by the frequency of their first and second formants (high-intensity frequency bands in the vowel). Note that the New Zealand “had” is very close to the US “head”, and the New Zealand “head” is really close to the US “hid”.
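
To make the overlap concrete, here is a toy sketch that labels a vowel token with whichever category centre is closest in formant space. Everything specific in it is an assumption: the formant values are HYPOTHETICAL numbers picked only to mimic the overlap described above (they are not measurements from my data), the `nearest_vowel` helper is invented for this illustration, and straight-line distance in Hz is a crude stand-in for how listeners actually categorize vowels.

```python
import math

# HYPOTHETICAL (F1, F2) category centres in Hz, invented for illustration only.
US_VOWELS = {
    "heed": (300, 2300),
    "hid": (430, 2000),
    "head": (580, 1800),
    "had": (700, 1700),
}
NZ_VOWELS = {
    "heed": (320, 2350),
    "head": (420, 2050),
    "had": (570, 1850),
}

def nearest_vowel(token, categories):
    """Label an (F1, F2) token with the word whose category centre is closest."""
    return min(categories, key=lambda word: math.dist(token, categories[word]))

nz_head_token = NZ_VOWELS["head"]
print(nearest_vowel(nz_head_token, US_VOWELS))  # listening with US expectations
print(nearest_vowel(nz_head_token, NZ_VOWELS))  # listening with NZ expectations
```

With these made-up numbers the same token comes out as “hid” against the US categories and as “head” against the New Zealand ones, which is the kind of mix-up the plot shows.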

These overlaps can be pretty confusing when American English speakers are talking to New Zealand English speakers, as this Flight of the Conchords clip shows!

The good news is that, as language users, we’re really good at learning new varieties of languages we already know, so it only takes a couple minutes for an American English speaker to learn to correctly identify New Zealand English vowels. My question was this: once an American English speaker has learned to understand the vowels of New Zealand English, how do they know when to use this new understanding?

In order to test this, I taught twenty one American English speakers who hadn’t had much, if any, previous exposure to New Zealand English to correctly identify the vowels in the words “head”, “heed” and “had”. While I didn’t play them any examples of a New Zealand “hid”–the vowel in “hid” is said more quickly in addition to having different formants, so there’s more than one way it varies–I did let them say that they’d heard “hid”, which meant I could tell if they were making the kind of mistakes you’d expect given the overlap between a New Zealand “head” and American “hid”.

So far, so good: everyone quickly learned the New Zealand English vowels. To make sure that it wasn’t that they were learning to understand the one talker they’d been listening to, I tested half of my listeners on both American English and New Zealand English vowels spoken by a second, different talker. These folks I told where the talker they were listening to was from. And, sure enough, they transferred what they’d learned about New Zealand English to the new New Zealand speaker, while still correctly identifying vowels in American English.

The really interesting results here, though, are the ones that came from the second half of the listeners. This group I lied to. I know, I know, it wasn’t the nicest thing to do, but it was in the name of science and I did have the approval of my institutional review board (the group of people responsible for making sure we scientists aren’t doing anything unethical).

In an earlier experiment, I’d played only New Zealand English at this point, and when I told them the person they were listening to was from America, they’d completely changed the way they listened to those vowels: they labelled New Zealand English vowels as if they were from American English, even though they’d just learned the New Zealand English vowels. And that’s what I found this time, too. Listeners learned the New Zealand English vowels, but “undid” that learning if they thought the speaker was from the same dialect as them.

But what about when I played someone vowels from their own dialect, but told them the speaker was from somewhere else? In this situation, listeners ignored my lies. They didn’t apply the learning they’d just done. Instead, they correctly treated the vowels of their own dialect as if they were, in fact, from their dialect.

At first glance, this seems like something of a contradiction: I just said that listeners rely on social information about the person who’s talking, but at the same time they ignore that same social information.

So what’s going on?

I think there are two things underlying this difference. The first is the fact that vowels move. And the second is the fact that you’ve heard a heck of a lot more of your own dialect than one you’ve been listening to for fifteen minutes in a really weird training experiment.

So what do I mean when I say vowels move? Well, remember when I talked about formants above? These are areas of high acoustic energy that occur at certain frequency ranges within a vowel and they’re super important to human speech perception. But what doesn’t show up in the plot up there is that these aren’t just static across the course of the vowel–they move. You might have heard of “diphthongs” before: those are vowels where there’s a lot of formant movement over the course of the vowel.

And the way that vowels move is different between different dialects. You can see the differences in the way New Zealand and American English vowels move in the figure below. Sure, the formants are in different places—but even if you slid them around so that they overlapped, the shape of the movement would still be different.

formantDynamics
Comparison of how the New Zealand and American English vowels move. You can see that the shape of the movement for each vowel is really different between these two dialects.  

Ok, so the vowels are moving in different ways. But why are listeners doing different things between the two dialects?

Well, remember how I said earlier that you’ve heard a lot more of your own dialect than one you’ve been trained on for maybe five minutes? My hypothesis is that, for the vowels in your own dialect, you’re highly attuned to these movements. And when a scientist (me) comes along and tells you something that goes against your huge amount of experience with these shapes, even if you do believe them, you’re so used to automatically understanding these vowels that you can’t help but correctly identify them. BUT if you’ve only heard a little bit of a new dialect you don’t have a strong idea of what these vowels should sound like, so you’re going to rely more on the other types of information available to you–like where you’re told the speaker is from–even if that information is incorrect.

So, to answer the question I posed in the title, can what you think you know about someone affect how you hear them? Yes… but only if you’re a little uncertain about what you heard in the first place, perhaps because it’s a dialect you’re unfamiliar with.

“Men” vs. “Females” and sexist writing

So, I have a confession to make. I actually set out to write a completely different blog post. In searching Wikimedia Commons for a picture, though, I came across something that struck me as odd. I was looking for pictures of people writing, and I noticed that there were two gendered sub-categories, one for men and one for women. Leaving aside the question of having only two genders, what really stuck out to me were the names. The category with pictures of men was called “Men Writing” and the category with pictures of women was called “Females Writing”.

Family 3
According to this sign, the third most common gender is “child”.
So why did that bother me? It is true that male humans are men and that women are female humans. Sure, a writing professor might nag about how the two terms lack parallelism, but does it really matter?

The thing is, it wouldn’t matter if this was just a one-off thing. But it’s not. Let’s look at the Category: Males and Category: Females*. At the top of the category page for men, it states “This category is about males in general. For human males, see Category:Male humans”. And the male humans category is, conveniently, the first subcategory. Which is fine, no problem there. BUT. There is no equivalent disclaimer at the top of Category: Females, and the first subcategory is not female humans but female animals. So even though “Females” is used to refer specifically to female humans when talking about writing, when talking about females in general it looks as if at least one editor has decided that it’s more relevant for referring to female animals. And that also gels with my own intuitions. I’m more likely to ask “How many females?” when looking at a bunch of baby chickens than I am when looking at a bunch of baby humans. Assuming the editors responsible for these distinctions are also native English speakers, their intuitions are probably very similar.

So what? Well, it makes me uncomfortable to be referred to with a term that is primarily used for non-human animals while men are referred to with a term that I associate with humans. (Or, perhaps, women are being referred to as “female men”, but that’s equally odd and exclusionary.)

It took me a while to come to that conclusion. I felt that there was something off about the terminology, but I had to turn and talk it over with my officemate for a couple minutes before finally getting at the kernel of the problem. And I don’t think it’s a conscious choice on the part of the editors–it’s probably something they don’t even realize they’re doing. But I definitely do think that it’s related to the gender imbalance of the editors of Wikimedia. According to recent statistics, over ninety percent (!) of Wikipedia editors are male. And this type of sexist language use probably perpetuates that imbalance. If I feel, even if it’s for reasons that I have a hard time articulating, that I’m not welcome in a community then I’m less likely to join it. And that’s not just me. Students who are presented with job descriptions in language that doesn’t match their gender are less likely to be interested in those jobs. Women are less likely to respond to job postings if “he” is used to refer to both men and women. I could go on citing other studies, but we could end up being here all day.

My point is this: sexist language affects the behaviour and choices of those who hear it. And in this case, it makes me less likely to participate in this on-line community because I don’t feel as if I would be welcomed and respected there. It’s not only Wikipedia/Wikimedia, either. This particular usage pattern is also something I associate with Reddit (a good discussion here). The gender breakdown of Reddit? About 70% male.

For some reason, the idea that we should avoid sexist language usage seems to really bother people. I was once a TA for a large lecture class where, in the middle of discussions of the effects of sexist language, a male student interrupted the professor to say that he didn’t think it was a problem. I’ve since thought about it quite a bit (it was pretty jarring) and I’ve come to the conclusion that the reason the student felt that way is that, for him, it really wasn’t a problem. Since sexist language is almost always exclusionary to women, and he was not a woman, he had not felt that moment of discomfort before.

Further, I think he may have felt that, because this type of language tends to benefit men, we were blaming him. I want to be clear here: I’m not blaming anyone for their unconscious biases. And I’m not saying that only men use sexist language. The Wikimedia editors who made this choice may very well have been women. What I am saying is that we need to be aware of these biases and strive to correct them. It’s hard, and it takes constant vigilance, but it’s an important and relatively simple step that we can all take in order to help eliminate sexism.

*As they were on Wednesday, April 8 2015. If they’ve been changed, I’d recommend the Wayback Machine.

Great Ideas in Linguistics: Grammaticality Judgements

Today’s Great Idea in Linguistics comes to us from syntax. One interesting difference between syntax and other fields of linguistics is what is considered compelling evidence for a theory in syntax. The aim of transformational syntax is to produce a set of rules (originally phrase structure rules) that will let you produce all the grammatical sentences in a language and none of the ungrammatical ones. So, if you’re proposing a new rule you need to show that the sentences it outputs are grammatical… but how do you do that?
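
To make “a set of rules that produces sentences” a bit more concrete, here is a toy sketch in Python. The grammar is HYPOTHETICAL and tiny, invented just for this illustration (it is not a serious analysis of English), and the names `RULES`, `expand` and `is_generated` are made up for the sketch. It expands a start symbol into every sentence the rules allow, then checks whether a given sentence is among them.

```python
# HYPOTHETICAL toy phrase structure rules: each category maps to its possible expansions.
RULES = {
    "S": [["NP", "VP"]],
    "NP": [["I"], ["Det", "N"]],
    "VP": [["V", "NP"], ["V", "NP", "Adv"]],
    "Det": [["the"]],
    "N": [["carrot"], ["dog"]],
    "V": [["ate"], ["saw"]],
    "Adv": [["yesterday"]],
}

def expand(symbols):
    """Yield every word sequence the rules can produce from a list of symbols."""
    if not symbols:
        yield []
        return
    first, rest = symbols[0], symbols[1:]
    if first not in RULES:
        # Not a category, so it is an actual word: keep it and expand the rest.
        for tail in expand(rest):
            yield [first] + tail
    else:
        # Try each way of rewriting the category.
        for option in RULES[first]:
            yield from expand(option + rest)

# The rules are not recursive, so the set of sentences is finite.
LANGUAGE = {" ".join(words) for words in expand(["S"])}

def is_generated(sentence):
    """True if the toy rules can produce this exact sentence."""
    return sentence in LANGUAGE

print(is_generated("I ate the carrot yesterday"))
print(is_generated("I did ate the carrot yesterday"))
print(is_generated("the dog ate I"))
```

The first sentence is produced by the rules and the second is not. The third is produced too, even though most English speakers would reject it, and spotting that kind of overgeneration is exactly why you need some way of checking a rule’s output against what speakers find acceptable.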

Wessel smedbager04.jpg
I sentence you to ten hours of community service for ungrammatical utterances!

One way to test whether something is grammatical is to see whether someone’s said it before. Back in the day, before you had things like large searchable corpora–or, heck, even the internet–this was difficult, to say the least. Especially since the really interesting syntactic phenomena tend to be pretty rare. Lots of sentences have a subject and an object, but a lot fewer have things like wh-islands.

Another way is to see if someone will say it. This is a methodology that is often used in sociolinguistics research. The linguist interviews someone using questions that are specifically designed to elicit certain linguistic forms, like certain words or sounds. However, this methodology is chancy at best. Often times the person won’t produce whatever it is you’re looking for. Also it can be very hard to make questions or prompts to access very rare forms.

Another way to see whether something is grammatical is to see whether someone would say it. This is the type of evidence that has, historically, been used most often in syntax research. The concept is straightforward. You present a speaker of a language with a possible sentence and they use their intuition as a native speaker to determine whether it’s good (“grammatical”) or not (“ungrammatical”). These sentences are often outputs of a proposed structure and used to argue either for or against it.

However, in practice grammaticality judgements can occasionally be a bit more difficult. Think about the following sentences:

  • I ate the carrot yesterday.
    • This sounds pretty good to me. I’d say it’s “grammatical”.
  • *I did ate the carrot yesterday.
    • I put a star (*) in front of this sentence because it sounds bad to me, and I don’t think anyone would say it. I’d say it’s “ungrammatical”.
  • ? I done ate the carrot yesterday.
    • This one is a little more borderline. It’s actually something I might say, but only in a very informal context and I realize that not everyone would say it.

So if you were a syntactician working on these sentences, you’d have to decide whether your model should account for the last sentence or not. One way to get around this is by building probability into the syntactic structure. So I’m more likely to use a structure that produces the first example but there’s a small probability I might use the structure in the third example. To know what those probabilities are, however, you need to figure out how likely people are to use each of the competing structures (and whether there are other factors at play, like dialect) and for that you need lots and lots of grammaticality judgements. It’s a new use of a traditional tool that’s helping to expand our understanding of language.
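As a rough sketch of what "figuring out how likely people are to use each structure" could look like, here is a small Python example. The responses are entirely made up for illustration (they are not real judgement data), and taking simple relative frequencies is just one assumption about how you might turn judgements into probabilities. A real study would also want to model things like dialect.

```python
from collections import Counter

# Made-up responses, for illustration only: each entry is the form that one
# hypothetical speaker said they would use for "I ___ the carrot yesterday".
responses = ["ate"] * 17 + ["done ate"] * 3


def structure_probabilities(choices):
    """Estimate each form's probability as its share of all responses."""
    counts = Counter(choices)
    total = sum(counts.values())
    return {form: count / total for form, count in counts.items()}


probabilities = structure_probabilities(responses)
```

Here `probabilities` maps each competing form to its relative frequency among the invented responses, which is the sort of number you could then attach to the corresponding structure. The more judgements you collect, the more you can trust the estimate.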

Great ideas in linguistics: Language acquisition

Courtesy of your friendly neighbourhood rng, this week’s great idea in linguistics is… language acquisition! Or, in other words, the process of learning a language. (In this case, learning your first language when you’re a little baby, also known as L1 acquisition; second-language learning, or L2 acquisition, is a whole nother bag of rocks.) Which begs the question: why don’t we just call it language learning and call it a day? Well, unlike learning to play baseball, turn out a perfect soufflé or kick out killer DPS, learning a language seems to operate under a different set of rules. Babies don’t benefit from direct language instruction and it may actually hurt them.

In other words:

Language acquisition is a process unique to humans that allows us to learn our first language without directly being taught it.

Which doesn’t sound so ground-breaking… until you realize that that means that language use is utterly unique among human behaviours. Oh sure, we learn other things without being directly taught them, even relatively complex behaviours like swallowing and balancing. But unlike speaking, these aren’t usually under conscious control and when they are it’s usually because something’s gone wrong. Plus, as I’ve discussed before, we have the ability to be infinitely creative with language. You can learn to make a soufflé without knowing what happens when you combine the ingredients in every possible combination, but knowing a language means that you know rules that allow you to produce all possible utterances in that language.

So how does it work? Obviously, we don’t have all the answers yet, and there’s a lot of research going on into how children actually learn language. But we do know what it generally tends to look like, barring things like language impairment or isolation.

  1. Vocal play. The kid’s figured out that they have a mouth capable of making noise (or hands capable of making shapes and movements) and are practising it. Back in the day, people used to say that infants would make all the sounds of all the world’s languages during this stage. Subsequent research, however, suggests that even this early children are beginning to reflect the speech patterns of people around them.
  2. Babbling. Kids will start out with very small segments of language, then repeat the same little chunk over and over again (canonical babbling), and then they’ll start to combine them in new ways (variegated babbling). In hearing babies, this tends to be syllables, hence the stereotypical “mamamama”. In Deaf babies it tends to be repeated hand motions.
  3. One word stage. By about 13 months, most children will have begun to produce isolated words. The intended content is often more than just the word itself, however. A child shouting “Dog!” at this point could mean “Give me my stuffed dog” or “I want to go see the neighbour’s terrier” or “I want a lion-shaped animal cracker” (since at this point kids are still figuring out just how many four-legged animals actually are dogs). These types of sentences-in-a-word are known as holophrases.
  4. Two word stage. By two years, most kids will have moved on to two-word phrases, combining words in a way that shows that they’re already starting to get the hang of their language’s syntax. Morphology is still pretty shaky, however: you’re not going to see a lot of tense markers or verbal agreement.
  5. Sentences. At this point, usually around age four, people outside the family can generally understand the child. They’re producing complex sentences and have gotten down most, if not all, of the sounds in their language.

These general stages of acquisition are very robust. Regardless of the language, modality or even age of acquisition we still see these general stages. (Although older learners may never completely acquire a language due to, among other things, reduced neuroplasticity.) And the fact they do seem to be universal is yet more evidence that language acquisition is a unique process that deserves its own field of study.

Great ideas in linguistics: Sociolinguistics

I’ll be the first to admit: for a long time, even after I’d begun my linguistics training, I didn’t really understand what sociolinguistics was. I had the idea that it mainly had to do with discourse analysis, which is certainly a fascinating area of study, but I wasn’t sure it was enough to serve as the basis for a major discipline of linguistics. Fortunately, I’ve learned a great deal about sociolinguistics since that time.

Sociolinguistics is the sub-field of linguistics that studies language in its social context and derives explanatory principles from it. By knowing about the language, we can learn something about a social reality and vice versa.

Now, at first glance this may seem so intuitive that it’s odd someone would go to the trouble of stating it directly. As social beings, we know that the behaviour of people around us is informed by their identities and affiliations. At the extreme end it can be things like having a cultural rule that literally forbids speaking to your mother-in-law, or requires replacing the letters “ck” with “cc” in all written communication. But there are more subtle rules in place as well, rules which are just as categorical and predictable and important. And if you don’t look at what’s happening with the social situation surrounding those linguistic rules, you’re going to miss out on a lot.

Case in point: Occasionally you’ll hear phonologists talk about sound changes being in free variation, or rules that are randomly applied. BUT if you look at the social facts of the community, you’ll often find that there is no randomness at all. Instead, there are underlying social factors that control which option a person chooses as they’re speaking. For example, if you were looking at whether people in Montreal were making r-sounds with the front or back of the tongue and you just sampled a bunch of them you might find that some people made it one way most of the time and others made it the other way most of the time. Which is interesting, sure, but doesn’t have a lot of explanatory power.

However, if you also looked at the social factors associated with it, and the characteristics of the individuals who used each r-sound, you might notice something interesting, as Clermont and Cedergren did (see the illustration). They found that younger speakers preferred the back-of-the-mouth r-sound, while older people tended to use the tip of the tongue instead. And that has a lot more explanatory power. Now we can start asking questions to get at the forces underlying that pattern: Is this the way the younger people have always talked, i.e. some sort of established youthful style, or is there a language change going on and the newer form is going to slowly take over? What causes younger speakers to use the form they do? Is there also an effect of gender, or who you hang out with?

changes
Figure one from Sankoff and Blondeau. 2007. (Click picture to look at the whole study.) As you can see, younger speakers are using [R] more than older speakers, and the younger a speaker is the more likely they are to use [R].

And that’s why sociolinguistics is all kinds of awesome. It lets us peel away and reveal some of the complexity surrounding language. By adding sociological data to our studies, we can help to reduce statistical noise and reveal new and interesting things about how language works, what it means to be a language-user, and why we do what we do.

Are television and mass media destroying regional accents?

One of the occupational hazards of linguistics is that you are often presented with spurious claims about language that are relatively easy to quantifiably disprove. I think this is probably partly due to the fact that there are multiple definitions of ‘linguist’. As a result, people tend to equate mastery of a language with explicit knowledge of its workings. Which, on the one hand, is reasonable. If you know French, the idea is that you know how to speak French, but also how it works. And, in general, that isn’t the case. Partly because most language instruction is light on discussions of grammatical structures–reasonably so; I personally find inductive grammar instruction significantly more helpful, though the research is mixed–and partly because, frankly, there’s a lot that even linguists don’t know about how grammar works. Language is incredibly complex, and we’ve only begun to explore and map out that complexity. But there are a few things we are reasonably certain we know. And one of those is that your media consumption does not “erase” your regional dialect [pdf]. The premise is flawed enough that it begins to collapse under its own weight almost immediately. Even the most dedicated American fans of Doctor Who or Downton Abbey or Sherlock don’t slowly develop British accents.

Christopher Eccleston Thor 2 cropped
Lots of planets have a North with a distinct accent that is not being destroyed by mass media.

So why is this myth so persistent? I think that the most likely answer is that it is easy to mischaracterize what we see on television and to misinterpret what it means. Standard American English (SAE), what newscasters tend to use, is a dialect. It’s not just a certain set of vowels but an entire, internally consistent grammatical system. (Failing to recognize that dialects are more than just adding a couple of really noticeable sounds or grammatical structures is why some actors fail so badly at trying to portray a dialect they don’t use regularly.) And not only is it a dialect, it’s a very prestigious dialect. Not only newscasters make use of it, but so do political figures, celebrities, and pretty much anyone who has a lot of social status. From a linguistic perspective, SAE is no better or worse than any other dialect. From a social perspective, however, SAE has more social capital than most other dialects. That means that being able to speak it, and speak it well, can give you opportunities that you might not otherwise have had access to. For example, speakers of Southern American English are often characterized as less intelligent and educated. And those speakers are very aware of that fact, as illustrated in this excerpt from the truly excellent PBS series Do You Speak American?:

ROBERT:

Do you think northern people think southerners are stupid because of the way they talk?

JEFF FOXWORTHY:

Yes I think so and I think Southerners really don’t care that Northern people think that eh. You know I mean some of the, the most intelligent people I’ve ever known talk like I do. In fact I used to do a joke about that, about you know the Southern accent, I said nobody wants to hear their brain surgeon say, ‘Al’ight now what we’re gonna do is, saw the top of your head off, root around in there with a stick and see if we can’t find that dad burn clot.’

So we have pressure from both sides: there are intrinsic social rewards for speaking SAE, and also social consequences for speaking other dialects. There are also plenty of linguistic role-models available through the media, from many different backgrounds, all using SAE. If you consider these facts alone it seems pretty easy to draw the conclusion that regional dialects in America are slowly being replaced by a prestigious, homogeneous dialect.

Except that’s not what’s happening at all. Some regional dialects of American English are actually becoming more, rather than less, prominent. On the surface, this seems completely contradictory. So what’s driving this process, since it seems to be contradicting general societal pressure? The answer is that there are two sorts of pressure. One, the pressure from media, is to adopt the formal, standard style. The other, the pressure from family, friends and peers, is to retain and use features that mark you as part of your social network. Giles, Taylor and Bourhis showed that identification with a certain social group–in their case Welsh identity–encourages and exaggerates Welsh features. And being exposed to a standard dialect that is presented as being in opposition to a local dialect will actually increase that effect. Social identity is constructed through opposition to other social groups. To draw an example from American politics, many Democrats define themselves as “not Republicans” and as in opposition to various facets of “Republican-ness”. And vice versa.

Now, the really interesting thing is this: television can have an effect on speakers’ dialectal features. But that effect tends to be away from, rather than towards, the standard. For example, some Glaswegian English speakers have begun to adopt features of Cockney English based on their personal affiliation with the show EastEnders. In light of what I discussed above, this makes sense. Those speakers who had adopted the features are of a similar social and socio-economic status as the characters in EastEnders. Furthermore, their social networks value the characters who are shown using those features, even though they are not standard. (British English places a much higher value on certain sounds and sound systems as standard. In America, even speakers with very different sound systems, e.g. Bill Clinton and George W. Bush, can still be considered standard.) Again, we see retention and re-invigoration of features that are not standard through a construction of opposition. In other words, people choose how they want to sound based on who they want to be seen as. And while, for some people, this means moving towards using more SAE, in others it means moving away from the standard.

One final note: Another factor which I think contributes to the idea that television is destroying accents is the odd idea that we all only have one dialect, and that it’s possible to “lose” it. This is patently untrue. Many people (myself included) have command of more than one dialect and can switch between them when it’s socially appropriate, or blend features from them for a particular rhetorical effect. And that includes people who generally use SAE. Oprah, for example, will often incorporate more features of African American English when speaking to an African American guest. The bottom line is that television and mass media can be a force for linguistic change, but they’re hardly the great homogenizer that it is often claimed they are.

For other things I’ve written about accents and dialects, I’d recommend:

  1. Why do people have accents?
  2. Ask vs. Aks
  3. Coke vs. Soda vs. Pop

The Science of Speaking in Tongues

So I was recently talking with one of my friends, and she asked me what linguists know about speaking in tongues (or glossolalia, which is the fancy linguistical term for it). It’s not a super well-studied phenomenon, but there has been enough research done that we’ve reached some pretty confident conclusions, which I’ll outline below.

Bozen 1 (327)
More like speaking around tongues, in this guy’s case.

  • People don’t tend to use sounds that aren’t in their native language. (citation) So if you’re an English speaker, you’re not going to bust out some Norwegian vowels. This rather lets the air out of the theory that individuals engaged in glossolalia are actually speaking another language. It is more like playing alphabet soup with the sounds you already know. (Although not always all the sounds you know. My instinct is that glossolalia is made up predominately of the sounds that are the most common in the person’s language.)
  • It lacks the structure of language. (citation) So one of the core ideas of linguistics, which has been supported again and again by hundreds of years of inquiry, is that there are systems and patterns underlying language use: sentences are usually constructed of some sort of verb-like thing and some sort of noun-like thing or things, and it’s usually something on the verb that tells you when and it’s usually something on the noun that tells you things like who possessed what. But these patterns don’t appear in glossolalia. Plus, of course, there’s not really any meaningful content being transmitted. (In fact, the “language” being unintelligible to others present is one of the markers that’s often used to identify glossolalia.) It may sort of smell like a duck, but it doesn’t have any feathers, won’t quack and when we tried to put it in water it just sort of dissolved, so we’ve come to the conclusion that it is not, in fact, a duck.
  • It’s associated with a dissociative psychological state. (citation) Basically, this means that speakers are aware of what they’re doing, but don’t really feel like they’re the ones doing it. In glossolalia, the state seems to come and then pass on, leaving speakers relatively psychologically unaffected. Dissociation can be problematic, though; if it’s particularly extreme and long-term it can be characterized as multiple personality disorder.
  • It’s a learned behaviour. (citation) Basically, you only see glossolalia in cultures where it’s culturally expected and only in situations where it’s culturally appropriate. In fact, during her fieldwork, Dr. Goodman (see the citation) actually observed new initiates into a religious group being explicitly instructed in how to enter a dissociative state and engage in glossolalia.

So glossolalia may seem language-like, but from a linguistic standpoint it doesn’t seem to actually be language. (Which is probably why there hasn’t been that much research done on it.) It’s vocalization that arises as the result of a learned psychological state and that lacks linguistic systematicity.