What does the National Endowment for the Humanities even do?

From the title, you might think this is a US-centric post. To a certain extent, it is. But I’m also going to be talking about topics that are more broadly of interest: what are some specific benefits of humanities research? And who should fund basic research? A lot has been written about these topics generally, so I’m going to be talking about linguistics and computational linguistics specifically.

This blog post came out of a really interesting conversation I had on Twitter the other day, sparked by this article on the potential complete elimination of both the National Endowment for the Humanities and the National Endowment for the Arts. During the course of the conversation, I realized that the person I was talking to (who was not a researcher, as far as I know) had some misconceptions about the role and reach of the NEH. So I thought it might be useful to talk about the role the NEH plays in my field, and has played in my own development as a researcher.


Oh this? Well, we don’t have funding to buy books anymore, so I put a picture of them in my office to remind myself they exist.

What does the NEH do?

I think the easiest way to answer this is to give you specific examples of projects that have been funded by the National Endowment for the Humanities, and talk about thier individual impacts. Keep in mind that this is just the tip of the iceberg; I’m only going to talk about projects that have benefitted my work in particular, and not even all of those.

  • Builds language teaching resources. One of my earliest research experiences was as a research assistance for Jack Martin, working with the Koasati tribe in Louisiana on a project funded by the NEH. The bulk of the work I did that summer was on a talking dictionary of the Koasati language, which the community especially wanted both as a record of the language and to support Koasati language courses. I worked with speakers to record the words for the dictionary, edit and transcribe the sound files to be put into the talking dictionaries. In addition to creating an important resource of the community, I learned important research skills that led me towards my current work on language variation. And the dictionary? It’s available on-line.
  • Helps fight linguistic discrimination. One of my main research topics is linguistic bias in automatic speech recognition (you can see some of that work here and here). But linguistic bias doesn’t only happen with computers. It’s a particularly pernicious form of discrimination that’s a big problem in education as well. As someone who’s both from the South and an educator, for example, I have purposefully cultivated my ability to speak mainstream American English becuase I know that, fair or not, I’ll be taken less seriously the more southern I sound. The NEH is at the forefront of efforts to help fight linguistic discrimination.
  • Document linguistic variation. This is a big one for my work, in particular: I draw on NEH-funded resources documenting linguistic variation in the United States in almost every research paper I write.

How does funding get allocated?

  • Which projects are funded is not decided by politicians. I didn’t realize this wasn’t common knowledge, but which projects get funded by federal funding agencies, including the NEH, NSF (which I’m currently being funded through) and NEA (National Endowment for the Arts) are not decided by politicians. This is a good thing–even the most accomplished politician can’t be expected to be an expert on everything from linguistics to history to architecture. You can see the breakdown of the process of allocating funding here.
  • Who looks at funding applications? Applications are peer reviewed, just like journal articles and other scholarly publications. The people looking at applications are top scholars in thier field. This means that they have a really good idea of which projects are going to have the biggest long-term impact, and that they can insure no one’s going to be reinventing the wheel.
  • How many projects are funded? All federal  research funding is extremely competitive, with many more applications submitted than accepted. At the NEH, this means as few as 6% of applications to a specific grant program will be accepted. This isn’t just free money–you have to make a very compelling case to a panel of fellow scholars that your project is truly exceptional.
  • What criteria are used to evaluate projects? This varies from grant to grant, but for the documenting endangered languages grant (which is what my work with the Koasati tribe was funded through), the evaluation criteria includes the following:
    • What is the potential for the proposed activity to
      1. Advance knowledge and understanding within its own field or across different fields (Intellectual Merit); and
      2. Benefit society or advance desired societal outcomes (Broader Impacts)?
    • To what extent do the proposed activities suggest and explore creative, original, or potentially transformative concepts?
    • Is the plan for carrying out the proposed activities well-reasoned, well-organized, and based on a sound rationale? Does the plan incorporate a mechanism to assess success?
    • How well qualified is the individual, team, or organization to conduct the proposed activities?
    • Are there adequate resources available to the PI (either at the home organization or through collaborations) to carry out the proposed activities?

Couldn’t this research be funded by businesses?

Sure, it could be. Nothing’s stopping companies from funding basic research in the humanities… but in my experience it’s not a priority, and they don’t. And that’s a real pity, because basic humanities research has a tendency of suddenly being vitally needed in other fields. Some examples from Natural Language Processing that have come up in just the last year:

  • Ethics: I’m currently taking what will  probably be my last class in graduate school. It’s a seminar course, filled with a mix of NLP researchers, electrical engineers and computer scientists, and we’re all reading… ethics texts. There’s been a growing awareness in the NLP and machine learning communities that algorithmic design and data selection is leading to serious negative social impacts (see this paper for some details). Ethics is suddenly taking center stage, and without the work of scholars working in the humanities, we’d be working up from first principles.
  • Pragmatics: Pragmatics, or the study of how situational factors affect meaning, is one of the more esoteric sub-disciplines in linguistics–many linguistics departments don’t even teach it as a core course. But one of the keynotes at the 2016 Empirical Methods in Natural Language Processing conference was about it (in NLP, conferences are the premier publication venue, so that’s a pretty big deal). Why? Because dialog systems, also known as chatbots, are a major research area right now. And modelling things like what you believe the person you’re talking to already knows is going to be critical to making interacting with them more natural.
  • Discourse analysis: Speaking of chatbots, discourse analysis–or the analysis of the structure of conversations–is another area of humanities research that’s been applied to a lot of computational systems. There are currently over 6000 ACL publications that draw on the discourse analysis literature. And given the strong interest in chatbots right now, I can only see that number going up.

These are all areas of research we’d traditionally consider humanities that have directly benefited the NLP community, and in turn many of the products and services we use day to day. But it’s hard to imagine companies supporting the work of someone working in the humanities whose work might one day benefit their products. These research programs that may not have an immediate impact but end up being incredibly important down-the-line is exactly the type of long-term investment in knowledge that the NEH supports, and that really wouldn’t happen otherwise.

Why does it matter?

“Now Rachael,” you may be saying, “your work definitely counts as STEM (science, technology, engineering and math). Why do you care so much about some humanities funding going away?”

I hope the reasons that I’ve outlined above help to make the point that humanities research has long-ranging impacts and is a good investment. NEH funding was pivotal in my development as a researcher. I would not be where I am today without early research experience on projects funded by the NEH.  And as a scholar working in multiple disciplines, I see how humanities research constantly enriches work in other fields, like engineering, which tend to be considered more desirable.

One final point: the National Endowment for the Humanities is, compared to other federal funding programs, very small indeed. In 2015 the federal government spent 146 million on the NEH, which was only 2% of the 7.1  billion dollar Department of Defense research budget. In other words, if everyone in the US contributed equally to the federal budget, the NEH would cost us each less than fifty cents a year. I think that’s a fair price for all of the different on-going projects the NEH funds, don’t you?


The entire National Endowment for the Humanities & National Endowment for the Arts, as well as the National Park Service research budget, all fit in that tiny “other” slice at the very top.



Great ideas in linguistics: Sociolinguistics

I’ll be the first to admit: for a long time, even after I’d begun my linguistics training, I didn’t really understand what sociolinguistics was. I had the idea that it mainly had to do with discourse analysis, which is certainly a fascinating area of study, but I wasn’t sure it was enough to serve as the basis for a major discipline of linguistics. Fortunately, I’ve learned a great deal about sociolinguistics since that time.

Sociolinguistics is the sub-field of linguistics that studies language in its social context and derives explanatory principles from it. By knowing about the language, we can learn something about a social reality and vice versa.

Now, at first glance this may seem so intuitive that it’s odd someone would to the trouble of stating it directly. As social beings, we know that the behaviour of people around us is informed by their identities and affiliations. At the extreme of things it can be things like having a cultural rule that literally forbids speaking to your mother-in-law, or requires replacing the letters “ck” with “cc” in all written communication. But there are more subtle rules in place as well, rules which are just as categorical and predictable and important. And if you don’t look at what’s happening with the social situation surrounding those linguistic rules, you’re going to miss out on a lot.

Case in point: Occasionally you’ll here phonologists talk about sound changes being in free variation, or rules that are randomly applied. BUT if you look at the social facts of the community, you’ll often find that there is no randomness at all. Instead, there are underlying social factors that control which option a person makes as they’re speaking. For example, if you were looking at whether people in Montreal were making r-sounds with the front or back of the tongue and you just sampled a bunch of them you might find that some people made it one way most of the time and others made it the other way most of the time. Which is interesting, sure, but doesn’t have a lot of explanatory power.

However, if you also looked at the social factors associated with it, and the characteristics of the individuals who used each r-sound, you might notice something interesting, as Clermont and Cedergren did (see the illustration). They found that younger speakers preferred the back-of-the-mouth r-sound, while older people tended to use the tip of the tongue instead. And that has a lot more explanatory power. Now we can start asking questions to get at the forces underlying that pattern: Is this the way the younger people have always talked, i.e. some sort of established youthful style, or is there a language change going on and they newer form is going to slowly take over? What causes younger speakers to use the the form they do? Is there also an effect of gender, or who you hang out with?


Figure one from Sankoff and Blondeau. 2007. (Click picture to look at the whole study.) As you can see, younger speakers are using [R] more than older speakers, and the younger a speaker is the more likely they are to use [R].

And that’s why sociolinguistics is all kinds of awesome. It lets us peel away and reveal some of the complexity surrounding language. By adding sociological data to our studies, we can help to reduce statistical noise and reveal new and interesting things about how language works, what it means to be a language-user, and why we do what we do.

Why do people have accents?

Since I’m teaching Language and Society this quarter, this is a question that I anticipate coming up early and often. Accents–or dialects, though the terms do differ slightly–are one of those things in linguistics that is effortlessly fascinating. We all have experience with people who speak our language differently than we do. You can probably even come up with descriptors for some of these differences. Maybe you feel that New Yorkers speak nasally, or that Southerners have a drawl, or that there’s a certain Western twang. But how did these differences come about and how are perpetuated?

Hyundai Accents

Clearly people have Accents because they’re looking for a nice little sub-compact commuter car.

First, two myths I’d like to dispel.

  1. Only some people have an accent or speak a dialect. This is completely false with a side of flat-out wrong. Every single person who speaks or signs a language does so with an accent. We sometimes think of newscasters, for example, as “accent-less”. They do have certain systematic variation in their speech, however, that they share with other speakers who share their social grouping… and that’s an accent. The difference is that it’s one that tends to be seen as “proper” or “correct”, which leads nicely into myth number two:
  2. Some accents are better than others. This one is a little more tricky. As someone who has a Southern-influenced accent, I’m well aware that linguistic prejudice exists. Some accents (such as the British “received pronunciation”) are certainly more prestigious than others (oh, say, the American South). However, this has absolutely no basis in the language variation itself. No dialect is more or less “logical” than any other, and geographical variation of factors such as speech rate has no correlation with intelligence. Bottom line: the differing perception of various accents is due to social, and not linguistic, factors.

Now that that’s done with, let’s turn to how we get accents in the first place. To begin with, we can think of an accent as a collection of linguistic features that a group of people share. By themselves, these features aren’t necessarily immediately noticeable, but when you treat them as a group of factors that co-varies it suddenly becomes clearer that you’re dealing with separate varieties. Which is great and all, but let’s pull out an example to make it a little clearer what I mean.

Imagine that you have two villages. They’re relatively close and share a lot of commerce and have a high degree of intermarriage. This means that they talk to each other a lot. As a new linguistic change begins to surface (which, as languages are constantly in flux, is inevitable) it spreads through both villages. Let’s say that they slowly lose the ‘r’ sound. If you asked a person from the first village whether a person from the second village had an accent, they’d probably say no at that point, since they have all of the same linguistic features.

But what if, just before they lost the ‘r’ sound, an unpassable chasm split the two villages? Now, the change that starts in the first village has no way to spread to the second village since they no longer speak to each other. And, since new linguistic forms pretty much come into being randomly (which is why it’s really hard to predict what a language  will sound like in three hundred years) it’s very unlikely that the same variant will come into being in the second village. Repeat that with a whole bunch of new linguistic forms and if, after a bridge is finally built across the chasm, you ask a person from the first village whether a person from the second village has an accent, they’ll probably say yes. They might even come up with a list of things they say differently: we say this and they say that. If they were very perceptive, they might even give you a list with two columns: one column the way something’s said in their village and the other the way it’s said in the second village.

But now that they’ve been reunited, why won’t the accents just disappear as they talk to each other again? Well, it depends, but probably not. Since they were separated, the villages would have started to develop their own independent identities. Maybe the first village begins to breed exceptionally good pigs while squash farming is all the rage in the second village. And language becomes tied that that identity. “Oh, I wouldn’t say it that way,” people from the first village might say, “people will think I raise squash.” And since the differences in language are tied to social identity, they’ll probably persist.

Obviously this is a pretty simplified example, but the same processes are constantly at work around us, at both a large and small scale. If you keep an eye out for them, you might even notice them in action.

Soda vs. Pop vs. Coke … Which is right?

Short answer: they’re all correct (at least in the United States) but some are more common in certain dialectal areas. Here’s a handy-dandy map, in case you were wondering:

Maps! Language! Still one of my favorite combinations. This particular map, and the data collection it’s based on is courtesy of popvssoda.com. Click picture for link and all the lovely statistics. (You do like statistics, right?)

Long answer: I’m going to sort this into reactions I tend to get after answering questions like this one.

What  do you mean they’re all correct? Coke/Soda/Pop is clearly wrong. Ok, I’ll admit, there are certain situations when you might need to choose to use one over the other. Say, if you’re writing for a newspaper with a very strict style guide. But otherwise, I’m sticking by my guns here: they’re all correct. How do I know? Because each of them in is current usage, and there is a dialectal group where it is the preferred term. Linguistics (at least the type of linguistics that studies dialectal variation) is all about describing what people actually say and people actually say all three.

But why doesn’t everyone just say the same thing? Wouldn’t that be easier? Easier to understand? Probably, yes. But people use different words for the same thing for the same reasons that they speak different languages. In a very, very simplified way, it kinda works like this:

  • You tend to speak like the people that you spend time with. That makes it easier for you to understand each other and lets other people in your social group know that you’re all members of the same group. Like team jerseys.
  • Over time, your group will introduce or adopt new linguistic makers that aren’t necessarily used by the whole population. Maybe a person you know refers to sodas as “phosphates” because his grandfather was a sodajerk and that form really catches on among your friends.
  • As your group keeps using and adopting new words (or sounds, or grammatical markers or any other facet of language)  that are different from other groups their language slowly begins to drift away from the language used by other groups.
  • Eventually, in extreme cases, you end up with separate languages. (Like what happened with Latin: different speech communities ended up speaking French, Italian, Spanish, Portuguese, and the other Romance languages rather than the Latin they’d shared under Roman rule.)

This is the process by which languages or dialectal communities tend to diverge. Divergence isn’t the only pressure on speakers, however. Particularly since we can now talk to and listen to people from basically anywhere (Yay internet! Yay TV! Yay radio!) your speech community could look like mine does: split between people from the Pacific Northwest and the South. My personal language use is slowly drifting from mostly Southern to a mix of Southern and Pacific Northwestern. This is called dialect leveling and it’s part of the reason why American dialectal regions tend include hundreds or thousands of miles instead of two or three.

Dialect leveling: Where two or more groups of people start out talking differently and end up talking alike. Schools tend to be a huge factor in this.

So, on the one hand, there is pressure to start all talking alike. On the other hand, however, I still want to sound like I belong with my Southern friends and have them understand me easily (and not be made fun of for sounding strange, let’s be honest) so when I’m talking to them I don’t retain very many markers of the Pacific Northwest. That’s pressure that’s keeping the dialect areas separate and the reason why I still say “soda”, even though I live in a “pop” region.

Huh. That’s pretty cool. Yep. Yep, it sure is.