RT this: OUP Dictionary Team monitors Twitterer’s tweets

Purdy, Director of Publicity

A recent study out of Harvard confirms Twitter is all vanity. This is not a big surprise to the dictionary team at Oxford University Press. OUP lexicographers have been monitoring more than 1.5 million random tweets since January 2009 and have noticed any number of interesting facts about the impact of Twitter on language usage. For example, the 500 most frequently used words on Twitter are significantly different from the top 500 words in general English text. At the very top, there are many of the usual suspects: “the”, “to”, “as”, “and”, “in”… though “I” is right up at number 2, whereas for general text it is only at number 10. No doubt this reflects the intrinsically solipsistic nature of Twitter. The most common word is “the”, which is the same in general English.

Since January OUP’s dictionary team has sorted through many random tweets.  Here are the basic numbers:
Total tweets = 1,496,981
Total sentences = 2,098,630
Total words = 22,431,033
Average words per tweet = 14.98
Average sentences per tweet = 1.40
Average words per sentence in Twitter = 10.69
Average words per sentence in general usage = 22.09
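
For readers who want to check the arithmetic, the three Twitter averages above follow directly from the three totals. The short sketch below is only an illustration, not the dictionary team's own tooling; the variable names are made up for this example, and the figures are the ones reported in this post.

```python
# Totals as reported in the post (variable names are illustrative only).
total_tweets = 1_496_981
total_sentences = 2_098_630
total_words = 22_431_033

# Each average is a simple ratio of two of the totals.
words_per_tweet = total_words / total_tweets
sentences_per_tweet = total_sentences / total_tweets
words_per_sentence = total_words / total_sentences

print(f"Average words per tweet = {words_per_tweet:.2f}")          # 14.98
print(f"Average sentences per tweet = {sentences_per_tweet:.2f}")  # 1.40
print(f"Average words per sentence = {words_per_sentence:.2f}")    # 10.69
```

The general-usage figure of 22.09 words per sentence comes from a separate body of text and cannot be recomputed from these totals.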

Other interesting tidbits include:

Verbs are much more common in their gerund form in Twitter than in general text. “Going”, “getting” and “watching” all appear in the top 100 words or so.

“Watching”, “trying”, “listening”, “reading” and “eating” are all in the top 100 first words, revealing just how often people use Twitter to report on whatever they are experiencing (or consuming) at the time.

Evidence of greater informality than general English: “ok” is much more common, and so is “f***”.

And that is how we roll here at OUP, monitoring new social media and the changes in the English language up to the minute.  Tweet on.

    Comments

  1. jrome said :

    Jun 4, 2009

    Likely the first messages sent by telegraph differed from handwritten letters in much the same way. As email use begins to decline and forms of human and machine communication shift to real-time services such as Twitter, significant patterns of language usage will appear. Glad to hear the OED is listening.

    Cheers!

  2. Joe Smith said :

    Jun 4, 2009

    The most interesting fact about the use of Twitter is that it’s proof that the pace of life is speeding up–and that haste is resulting in greater self-interest and less compassion. After all, when you’re in a hurry you’re less apt to care about anyone but yourself. If you ask me, researchers should be studying how we can benefit from computer and communications technology without internalizing the “machine values” of speed, efficiency, and standardization.

  3. Gregory Korte said :

    Jun 4, 2009

    Correct me if I’m wrong — you’re the linguists, after all — but words like “watching” and “getting” can also be participles. And I’d bet they’re more often used on Twitter as present participles (“I am watching American Idol”) than as true gerunds (“Watching American Idol is one of my favorite pastimes.”)

    Of course, a Tweet that simply says, “Watching American Idol” could be ambiguous, but I still imagine that as a participle where the subject and auxiliary verb are understood.

    Having said that, I’ve noticed the same phenomenon, and I’m fascinated by the way “statusing” (as a gerund) is changing the written word. As Calvin once said to Hobbes, “Verbing weirds language.”

  4. Michael Muller said :

    Jun 4, 2009

    Good start. The prevalence of gerunds is not surprising, given the present-tense question that prompts the user.

    What have you found about the word-length and sentence length, in comparison with (a) language in general, (b) other forms of electronic communication, (c) other terse forms of electronic communication (e.g., SMS)?

    What have you found about the vocabulary choices, in comparison to the same domains as listed above? Does twitter form a sociolect?

    thanks

  5. Elise said :

    Jun 6, 2009

    I’m really confused.

    Of course there is an especially high number of “ing” verbs. Are you guys unaware that the entire point of Twitter is answering the question, “What are you doing?” That is the premise upon which it was built.

  6. templetonpress said :

    Jun 24, 2009

    I’m re-tweeting this post to my account right now, lol! (<-10.69ish words, use of I, use of gerund verb). Ok, one more half sentence for the stats. (<-1.40ish sentences, use of Ok)

  7. english school oxford said :

    Aug 7, 2009

    Interesting information. But it’s obvious that Twitter doesn’t represent the English language. Like texting, it uses its own rules and conventions. I can see it changing the English language over time, just as texting has, by implanting certain shorthand slang.

    Trackbacks

  1. From Twitted by jenhennen:

    Jun 4, 2009

    [...] This post was Twitted by jenhennen – Real-url.org [...]

  2. From ITRT News » Twittery Stats:

    Jun 4, 2009

    [...] The Dictionary Team at Oxford University Press has analyzed randomly-selected tweets and has found the sentence length is cut roughly by half than in general usage, gerunds are the norm, and most statements refer to the author of the tweet. The statistics are reported on the OUP blog. [...]

  3. From Brian Eisley » The OED is watching you:

    Jun 4, 2009

    [...] discovered (by way of Roy Tennant) that the people behind the Oxford English Dictionary are monitoring Twitter. They’re apparently interested in the way that people use the language there: OUP [...]

  4. From City Site Guide » OUP Dictionary Team Dissects Twitter:

    Jun 5, 2009

    [...] to the study, the average tweet contains 14.98 words and the most popular word is “the.” In [...]

  5. From Gregory is … contemplating the linguistics of statusing at Gregory Korte:

    Jun 5, 2009

    [...] This post is a compilation of previous observations I’ve made, mostly on Facebook or on in this comment currently awaiting moderation at the Oxford University Press [...]

  6. From ResourceShelf » Blog Archive » Oxford University Press Dictionary Team Monitors Twitterer’s Tweets:

    Jun 9, 2009

    [...] From the Blog Post: OUP lexicographers have been monitoring more than 1.5 million random tweets Since January 2009 and have noticed any number of interesting facts about the impact of Twitter on language usage. For example the 500 words most frequently used words on Twitter are significantly different from the top 500 words in general English text. At the very top, there are many of the usual suspects: “the”, “to”, “as”, “and”, “in”… though “I” is right up at number 2, whereas for general text it is only at number 10. No doubt this reflects on the intrinsically solipsistic nature of Twitter. The most common word is “the”, which is the same in general English. [...]

  7. From Language Links » How Twitter English — Twenglish? — is different:

    Jun 10, 2009

    [...] Lexicographers at Oxford University Press have analyzed over 1.5 million tweets since the beginning of the year, and have come up with a number of statistics and conclusions. [...]

  8. From ThickCulture » Whither Twitter, Where are You Taking Us?:

    Jun 18, 2009

    [...] dictionary team at the Oxford University Press is on top of the sitch.  Here’s some of their observations:: “Since January OUP’s dictionary team has sorted through many random tweets.  Here are [...]

  9. From Study: lots of ‘me, me’ in Tweets | Twittermania:

    Jun 18, 2009

    [...] a recent Harvard study already showed it: vanity is an important motive for using Twitter. Research by a number of lexicographers at Oxford University Press confirms this. Since January they have 1.5 [...]

  10. From Set Phasers on Tweet: A Star Trek Snowclone Blizzard on Twitter : OUPblog:

    Jun 25, 2009

    [...] and enjoy these recent Twitter examples of a prolific snowclone, in the spirit of recent posts on Twitteration and Trekitude. By the way, since I totally love the new Star Trek movie, and I dearly love Twitter, [...]

  11. From RT this: OUP Dictionary Team monitors Twitterer’s tweets : OUPblog | Learn English Online With Me:

    Jul 2, 2009

    [...] the article here: RT this: OUP Dictionary Team monitors Twitterer’s tweets : OUPblog Share and [...]

  12. From Dictionaries starting to recognize Twitter terms? « CyberText Newsletter:

    Jul 4, 2009

    [...] OUP Dictionary Team monitors Twitterer’s tweets” from the Oxford University Press USA blog (http://blog.oup.com/2009/06/oxford-twitter/ where they discuss some of their findings from monitoring close on 1.5 million Tweets since January [...]

  13. From Stover Style: A President’s Blog » Blog Archive » Tweet, tweet:

    Sep 15, 2009

    [...] http://blog.oup.com/2009/06/oxford-twitter/ [...]

  14. From ⇔ The Helsinki Institute for Information Technology finds that people tend to update their statuses with “mundane” messages:

    Sep 21, 2009

    [...] said, earlier this year the Oxford University Press studied 1.5 million tweets to see which words were found most [...]

  15. From Study: Microbloggers are really boring:

    Sep 21, 2009

    [...] said, earlier this year the Oxford University Press studied 1.5 million tweets to see which words were found most [...]

  16. From Study: Study: Microbloggers are Really Boring « ResourceShelf:

    Sep 21, 2009

    [...] See Also: OUP Dictionary Team monitors Twitterer’s tweets (via Oxford University Press USA Blog) [...]

  17. From Is microbloggers’ life really as boring as they write? - Comment on technology, telephony, software:

    Sep 22, 2009

    [...] because a similar study carried out by Oxford University Press on 1.5 million Twitter messages showed slightly different results. [...]
