Talkin’ ’bout my generativity

April 9, 2012 / April 10, 2012 ~ Rachael Tatman

Quick, who’s this guy:

If you answered “Einstein’s less famous brother, Einbert?” you wouldn’t actually be too far from the truth. It’s Noam Chomsky. He’s so famous his name comes pre-installed in Microsoft Word’s spell checker. (Did you mean “chomp sky?”)

If you’ve got a good history or government background, you may be thinking, “Oh yeah, the anarchy guy.” He may be, but his greatest intellectual achievement has nothing to do with anarchy and everything to do with linguistics. That achievement would be generativity.

Gen-er-a-tiv-i-ty. Write it down, it will be on the test.

Generativity was a game-changer for linguistics. Before that point, linguistics was basically philology, which I’ve mentioned before. Philology is to modern linguistics what natural history is to modern biology. Philologists collected knowledge about languages haphazardly, without a whole lot of underlying theoretical structure. I mean, there was some (I’ll talk about what the Brothers Grimm did on their weekends off later), but it was pretty confined. And a lot of it, let’s be honest, was about proving that Europe was best. The monumental Oxford English Dictionary is a good example of that mindset. They wanted to collect every single word in the English language and pin it neatly to the page with a little series of notes about it and a list of sightings in the wild. It was, and remains, a grand undertaking and a staggering achievement… but modern linguists aren’t collectors anymore.

That’s because the end goal of modern linguistics is to solve language. The field is working to put together a series of rules that will actually describe and predict all human language. Not in the mind reader, fortune teller sense of predict. I mean that, with the right rules, we should be able to generate all possible sentences. In a generative way. By using generativity.
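
To make "generate all possible sentences" a bit more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `RULES` table is a made-up toy grammar, not a real grammar of English, and `expand` is a hypothetical helper name. The idea is only that a handful of rewrite rules can spell out every sentence they allow.

```python
from itertools import product

# Toy rewrite rules (hypothetical, not a real grammar of English).
# Each symbol maps to the list of ways it can be rewritten.
RULES = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"]],
    "VP": [["V", "NP"], ["V"]],
    "Det": [["the"], ["a"]],
    "N": [["linguist"], ["sentence"]],
    "V": [["sleeps"], ["generates"]],
}

def expand(symbol):
    """Yield every word sequence that `symbol` can be rewritten into."""
    if symbol not in RULES:  # a plain word: nothing left to rewrite
        yield [symbol]
        return
    for expansion in RULES[symbol]:
        # Expand each part, then combine the alternatives in every possible way.
        parts = [list(expand(part)) for part in expansion]
        for combo in product(*parts):
            yield [word for piece in combo for word in piece]

sentences = [" ".join(words) for words in expand("S")]
print(len(sentences))
print(sentences[:3])
```

This toy grammar has no recursion, so it only produces a finite list. A rule that can contain itself (a noun phrase inside a noun phrase, say) would allow infinitely many sentences, and you would have to enumerate them lazily instead of collecting them all in a list.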

So why is this important?

Lots of reasons! Here, let me list them, because lists are fun to read.

This turned linguistics from an interesting hobby for rich people into a science. If you have rules, you can make predictions about what those rules will produce and then test those predictions. Testing predictions is also known as science. It’s also something that linguistics as a whole has been a little… hesitant to adopt, but that’s another story.
Suddenly computers! Computer programming is, at its most basic level, a series of rules. Linguistics is now dedicated to producing a series of rules. Bada-bing, bada-boom, universal translator. (It doesn’t work that way, but, in theory, it eventually can.)
Now we have a framework that we can use to figure out how to ask questions. We have a goal. Things are organized.
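
The "make predictions, then test them" idea from the first item can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the mini-grammar, the word lists, and the speaker judgments are all invented, and `predicts_grammatical` is a hypothetical name.

```python
# Hypothetical mini-grammar: a sentence is Det N V, optionally followed by Det N.
DETERMINERS = {"the", "a"}
NOUNS = {"dog", "cat"}
VERBS = {"sees", "sleeps"}

def is_noun_phrase(words):
    """True if `words` is exactly a determiner followed by a noun."""
    return len(words) == 2 and words[0] in DETERMINERS and words[1] in NOUNS

def predicts_grammatical(sentence):
    """Return the rule set's prediction: is this sentence allowed?"""
    words = sentence.split()
    if len(words) < 3 or not is_noun_phrase(words[:2]) or words[2] not in VERBS:
        return False
    rest = words[3:]
    return rest == [] or is_noun_phrase(rest)

# Judgments we pretend to have collected from speakers (made up for illustration).
judgments = {
    "the dog sees a cat": True,
    "a cat sleeps": True,
    "dog the sees": False,
    "the dog sleeps a cat": False,
}

# Sentences where the rules and the speakers disagree.
mismatches = [s for s, ok in judgments.items() if predicts_grammatical(s) != ok]
print(mismatches)
```

The rules wrongly allow "the dog sleeps a cat", because they don't distinguish verbs that take an object from verbs that don't. A failed prediction like that is exactly what tells you which rule needs revising.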

Now for the promised test.

What term is used to describe the current goal of linguistics, i.e., to generate a set of rules that can accurately describe and predict language usage? (Seriously, I’m not going to give you the answer. Just scroll up.)

Published by Rachael Tatman

Making NLP boring. Linguistics PhD. Data science, NLP, Stats, ML, R, Python, FAccT.

13 thoughts on “Talkin’ ’bout my generativity”

Jester Who-ver says:

May 26, 2012 at 10:39 pm

What do you mean by solve language?

1. Rachael Tatman says:
  
  May 27, 2012 at 8:56 am
  
  I’m using “solved” here sort of in a game theory sense… except instead of figuring out who would win, we need to learn all the rules in the first place. Once we have a set of rules that will generate all possible utterances in a given language, I would consider that language “solved”.
  
  At this point, the only solved language is Lojban, and that’s because it was created with that purpose in mind.
  
  1. Jester Who-ver says:
    
    May 27, 2012 at 6:20 pm
    
    What is the benefit of a solved language?
  2. Rachael Tatman says:
    
    May 27, 2012 at 8:46 pm
    
    Great question! The most important thing is that we can then explain the language accurately to a computer. This would allow things like actually accurate computer translation, voice recognition technology and synthetic speech. Plus, linguists would get to enjoy the warm fuzzy feeling of having successfully modeled all of human language use… which would be no mean feat.
    
    To be honest, I can’t really see it happening in my lifetime. We’ll hopefully make plenty of progress, though!
  3. Jester Who-ver says:
    
    May 27, 2012 at 8:59 pm
    
    No mean feat indeed. What about context and homonyms? How would one go about modeling those?
  4. Rachael Tatman says:
    
    May 28, 2012 at 11:11 am
    
    They’re a little more difficult to deal with, mainly due to the role that ambiguity plays. My work tends to look more at sound systems of languages, which are pretty easy to model in a way that computers can handle. Context and things like “common sense” tend to be very difficult to model, and most of the work I’ve seen on the subject has been along the lines of entering massive amounts of data. (In fact, I have a computer scientist friend who firmly believes that we won’t be able to effectively model language computationally until after we develop artificial intelligence.)
    
    Discourse analysis is one branch of study that deals with context, and might be used in the future to work out which homonym is being said. At the moment, computational linguists seem to be leaning towards weighted statistical models for figuring out which member of the homonym set is the most likely. It’s still an open question whether or not humans use statistical modelling, though, and if you’re using a completely different system to model something, you’re going to get a slightly different output some of the time. It’s like how computers will make comprehension errors with speech recognition technology that a human never would.
  5. Jester Who-ver says:
    
    May 29, 2012 at 7:13 pm
    
    It’s amazing how much data our brain is capable of processing, isn’t it? I sure hope though that no one is trying to get rid of ambiguity; it has its flaws, but a computer’s expression would be severely limited without it… and structural ambiguity is very important for messages to be understood by people on a variety of different levels of understanding, not to mention its sheer, universal entertainment value.
    
    And if using statistics is the way, is a concrete solution even possible? Then I wonder, is one even necessary?
    
    Thanks for conversing with me on this; the ability to learn is not something I take for granted.
Bruno says:

May 29, 2016 at 4:53 pm

Hi, I stumbled across your blog today and I really liked it, so I kept reading… but then I read this: “Generativity was a game-changer for linguistics. Before that point, linguistics was basically philology, which I’ve mentioned before. Philology is to modern linguistics what natural history is to modern biology. Philologists collected knowledge about languages haphazardly, without a whole lot of underlying theoretical structure.”
And I thought: you can’t be serious. Are you? I mean there was a LOT of linguistics being done before Chomsky (and even during Chomsky there was a lot of non-generative linguistics being done), and to dismiss that as somehow being atheoretical or pre-theoretical… I still like your blog, but this entry, no.

1. Rachael Tatman says:
  
  May 30, 2016 at 12:46 pm
  
  You’re absolutely right. This post was written four years ago–if I were to write it today I’d have included earlier work (the Structuralists, the Prague school, even Sanskrit grammars) and a lot more nuance in this discussion. Thanks for the feedback!
  