So if you’re like me, you sometimes take notes on the computer and end up using some shortcuts so you can keep up with the speed of whoever’s talking. One of the short cuts I use a lot is replacing the word “and” with punctuation. When I’m handwriting things I only ever use “+” (becuase I can’t reliably write an ampersand), but in typing I use both “+” and “&”. And I realized recently, after going back to change which one I used, that I had the intuition that they should be used for different things.
Like sometimes happens with linguistic intuitions, though, I didn’t really have a solid idea of how they were different, just that they were. Fortunately, I had a ready-made way to figure it out. Since I use both symbols on Twitter quite a bit, all I had to do was grab tweets of mine that used either + or & and figure out what the difference was.
I got 450 tweets from between October 7th and November 11th of this year from my own account (@rctatman). I used either & or + in 83 of them, or roughly 18%. This number is a little bit inflated because I was livetweeting a lot of conference talks in that time period, and if a talk has two authors I start every livetweet from that talk with “AuthorName1 & AuthorName2:”. 43 tweets use & in this way. If we get rid of those, only around 8% of my tweets contain either + or &. They’re still a lot more common in my tweets than in writing in other genres, though, so it’s still a good amount of data.
So what do I use + for? See for yourself! Below are all the things I conjoined with + in my Twitter dataset. (Spelling errors intact. I’m dyslexic, so if I don’t carefully edit text—and even sometimes when I do, to my eternal chagrin—I tend to have a lot of spelling errors. Also, a lot of these tweets are from EMNLP so there’s quite a bit of jargon.)
- time + space
- confusable Iberian language + English
- Data + code
- easy + nice
- entity linking + entity clustering
- group + individual
- handy-dandy worksheet + tips
- Jim + Brenda, Finn + Jake
- Language + action
- linguistic rules + statio-temporal clustering
- poster + long paper
- Ratings + text
- static + default methods
- syntax thing + cattle
- the cooperative principle + Gricean maxims
- Title + first author
- to simplify manipulation + preserve struture
If you’ve had some syntactic training, it might jump out to you that most of these things have the same syntactic structure: they’re noun phrases! There are just a couple of exception. The first is “static + default methods”, where the things that are being conjoined are actually adjectives modifying a single noun. The other is “to simplify manipulation + preserve struture”. I’m going to remain agnostic about where in the verb phrase that coordination is taking place, though, so I don’t get into any syntax arguments ;). That said, this is a fairly robust pattern! Remember that I haven’t been taught any rules about what I “should” do, so this is just an emergent pattern.
Ok, so what about &? Like I said, my number one use is for conjunction of names. This probably comes from my academic writing training. Most of the papers I read that use author names for in-line citations use an & between them. But I do also use it in the main body of tweets. My use of & is a little bit harder to characterize, so I’m going to go through and tell you about each type of thing.
First, I use it to conjoin user names with the @ tag. This makes sense, since I have a strong tendency to use & with names:
- @uwengineering & @uwnlp
- @amazon @baidu @Grammarly & @google
In some cases, I do use it in the same way as I do +, for conjoining noun phrases:
- the entities & relations
- these features & our corpus
- LSTM & attention models
- apples & concrete
- context & content
But I also use it for comparatives:
- Better suited for weak (bag-level) labels & interpretable and flexible
- easier & faster
And, perhaps more interestingly, for really high-level conjugation, like at the level of the sentence or entire verb phrase (again, I’m not going to make ANY claims about what happens in and around verbs—you’ll need to talk to a syntactician for that!).
- Classified as + or – & then compared to polls
- in 30% of games the group performance was below average & in 17% group was worse than worst individual
- math word problems are boring & kids learn better if they’re interested in the theme of the problem
- our system is the first temporal tagger designed for social media data & it doesn’t require hand tagging
- use a small labeled corpus w/ small lexicon & choose words with high prob. of 1 label
And, finally, it gets used in sort of miscellaneous places, like hashtags and between URLs.
So & gets used in a lot more places than + does. I think that this is probably because, on some subconscious level I consider & to be the default (or, in linguistics terms, “unmarked“). This might be related to how I’m processing these symbols when I read them. I’m one of those people who hears an internal voice when reading/writing, so I tend to have canonical vocalizations of most typed symbols. I read @ as “at”, for example, and emoticons as a prosodic beat with some sort of emotive sound. Like I read the snorting emoji as the sound of someone snorting. For & and +, I read & as “and” and + as “plus”. I also use “plus” as a conjunction fairly often in speech, as do many of my friends, so it’s possible that it may pattern with my use in speech (I don’t have any data for that, though!). But I don’t say “plus” nearly as often as I say “and”. “And” is definitely the default and I guess that, by extension, & is as well.
Another thing that might possibly be at play here is ease of entering these symbols. While I’m on my phone they’re pretty much equally easy to type, on a full keyboard + is slightly easier, since I don’t have to reach as far from the shift key. But if that were the only factor my default would be +, so I’m fairly comfortable claiming that the fact that I use & for more types of conjunction is based on the influence of speech.
A BIG caveat before I wrap up—this is a bespoke analysis. It may hold for me, but I don’t claim that it’s the norm of any of my language communities. I’d need a lot more data for that! That said, I think it’s really neat that I’ve unconsciously fallen into a really regular pattern of use for two punctuation symbols that are basically interchangeable. It’s a great little example of the human tendency to unconsciously tidy up language.