Oxford University Press's
Academic Insights for the Thinking World

RT this: OUP Dictionary Team monitors Twitterer’s tweets

Purdy, Director of Publicity

A recent study out of Harvard confirms Twitter is all vanity. This is not a big surprise to the dictionary team at Oxford University Press. OUP lexicographers have been monitoring nearly 1.5 million random tweets since January 2009 and have noticed a number of interesting facts about the impact of Twitter on language usage. For example, the 500 most frequently used words on Twitter are significantly different from the top 500 words in general English text. The most common word is “the”, the same as in general English, and the very top of the list holds many of the usual suspects: “the”, “to”, “as”, “and”, “in”… But “I” is right up at number 2, whereas in general text it is only at number 10. No doubt this reflects the intrinsically solipsistic nature of Twitter.

Since January, OUP’s dictionary team has sorted through a large random sample of tweets. Here are the basic numbers:
Total tweets = 1,496,981
Total sentences = 2,098,630
Total words = 22,431,033
Average words per tweet = 14.98
Average sentences per tweet = 1.40
Average words per sentence in Twitter = 10.69
Average words per sentence in general usage = 22.09
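
The Twitter averages above follow directly from the three totals. As a quick check, here is a minimal Python sketch that recomputes them; the variable names are our own, and the general-usage figure of 22.09 is not derivable from these totals, so it is left out.

```python
# Totals as reported in the post.
total_tweets = 1_496_981
total_sentences = 2_098_630
total_words = 22_431_033

# Each average is a simple ratio of two of the totals.
words_per_tweet = total_words / total_tweets
sentences_per_tweet = total_sentences / total_tweets
words_per_sentence = total_words / total_sentences

print(f"Average words per tweet = {words_per_tweet:.2f}")
print(f"Average sentences per tweet = {sentences_per_tweet:.2f}")
print(f"Average words per sentence in Twitter = {words_per_sentence:.2f}")
```

Rounded to two decimal places, the three ratios match the figures listed.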

Other interesting tidbits include:

Verbs are much more common in their gerund form in Twitter than in general text. “Going”, “getting” and “watching” all appear in the top 100 words or so.

“Watching”, “trying”, “listening”, “reading” and “eating” are all in the top 100 first words, revealing just how often people use Twitter to report on whatever they are experiencing (or consuming) at the time.

Evidence of greater informality than general English: “ok” is much more common, and so is “f***”.
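
The post does not describe how the team produced its rankings, so the following is only a rough Python sketch of how "top words" and "top first words" could be tallied. The sample tweets are invented for illustration, and the `tokenize` helper is a hypothetical, deliberately naive tokenizer, not OUP's method.

```python
import re
from collections import Counter

# Invented sample tweets for illustration only; not from the OUP corpus.
sample_tweets = [
    "Watching the game tonight",
    "I am going to the store",
    "Watching a movie and eating popcorn",
    "Reading the news. I need coffee",
]


def tokenize(text):
    """Lowercase the text and split it into simple word tokens (naive)."""
    return re.findall(r"[a-z']+", text.lower())


word_counts = Counter()        # how often each word appears anywhere
first_word_counts = Counter()  # how often each word opens a tweet

for tweet in sample_tweets:
    words = tokenize(tweet)
    if not words:
        continue  # skip tweets with no word tokens
    word_counts.update(words)
    first_word_counts[words[0]] += 1

# Rank words by frequency, most common first.
print(word_counts.most_common(3))
print(first_word_counts.most_common(1))
```

On a real corpus the same two counters, ranked with `most_common(500)` or `most_common(100)`, would give lists comparable to the ones discussed here, though a production analysis would need proper tokenization and handling of URLs, hashtags and usernames.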

And that is how we roll here at OUP, monitoring new social media and the changes in the English language up to the minute. Tweet on.

Recent Comments

  1. Twitted by jenhennen

    [...] This post was Twitted by jenhennen – Real-url.org [...]

  2. jrome

    Likely the first messages sent by telegraph differed from handwritten letters in much the same way. As email use begins to decline and forms of human and machine communication shift to real-time services such as Twitter, significant patterns of language usage will appear. Glad to hear the OED is listening.

    Cheers!

  3. Joe Smith

    The most interesting fact about the use of Twitter is that it’s proof that the pace of life is speeding up, and that haste is resulting in greater self-interest and less compassion. After all, when you’re in a hurry you’re less apt to care about anyone but yourself. If you ask me, researchers should be studying how we can benefit from computer and communications technology without internalizing the “machine values” of speed, efficiency, and standardization.

  4. [...] The Dictionary Team at Oxford University Press has analyzed randomly selected tweets and has found that sentence length is roughly half that of general usage, gerunds are the norm, and most statements refer to the author of the tweet. The statistics are reported on the OUP blog. [...]

  5. [...] discovered (by way of Roy Tennant) that the people behind the Oxford English Dictionary are monitoring Twitter. They’re apparently interested in the way that people use the language there: OUP [...]

  6. Gregory Korte

    Correct me if I’m wrong — you’re the linguists, after all — but words like “watching” and “getting” can also be participles. And I’d bet they’re more often used on Twitter as present participles (“I am watching American Idol”) than as true gerunds (“Watching American Idol is one of my favorite pastimes.”)

    Of course, a Tweet that simply says, “Watching American Idol” could be ambiguous, but I still imagine that as a participle where the subject and auxiliary verb are understood.

    Having said that, I’ve noticed the same phenomenon, and I’m fascinated by the way “statusing” (as a gerund) is changing the written word. As Calvin once said to Hobbes, “Verbing weirds language.”

  7. Michael Muller

    Good start. The prevalence of gerunds is not surprising, given the present-tense question that prompts the user.

    What have you found about the word-length and sentence length, in comparison with (a) language in general, (b) other forms of electronic communication, (c) other terse forms of electronic communication (e.g., SMS)?

    What have you found about the vocabulary choices, in comparison to the same domains as listed above? Does twitter form a sociolect?

    thanks

  8. [...] to the study, the average tweet contains 14.98 words and the most popular word is “the.” In [...]

  9. [...] This post is a compilation of previous observations I’ve made, mostly on Facebook or in this comment currently awaiting moderation at the Oxford University Press [...]

  10. Elise

    I’m really confused.

    Of course there is an especially high number of “ing” verbs. Are you guys unaware that the entire point of Twitter is answering the question, “What are you doing?” That is the premise upon which it was built.

  11. [...] From the Blog Post: OUP lexicographers have been monitoring more than 1.5 million random tweets Since January 2009 and have noticed any number of interesting facts about the impact of Twitter on language usage. For example the 500 words most frequently used words on Twitter are significantly different from the top 500 words in general English text. At the very top, there are many of the usual suspects: “the”, “to”, “as”, “and”, “in”… though “I” is right up at number 2, whereas for general text it is only at number 10. No doubt this reflects on the intrinsically solipsistic nature of Twitter. The most common word is “the”, which is the same in general English. [...]

  12. [...] Lexicographers at Oxford University Press have analyzed over 1.5 million tweets since the beginning of the year, and have come up with a number of statistics and conclusions. [...]

  13. [...] dictionary team at the Oxford University Press is on top of the sitch. Here are some of their observations: “Since January OUP’s dictionary team has sorted through many random tweets. Here are [...]

  14. [...] a recent Harvard study already showed it: vanity is an important motive for tweeting. Research by a number of lexicographers at Oxford University Press confirms this. Since January they have 1.5 [...]

  15. templetonpress

    I’m re-tweeting this post to my account right now, lol! (<-10.69ish words, use of I, use of gerund verb). Ok, one more half sentence for the stats. (<-1.40ish sentences, use of Ok)

  16. [...] and enjoy these recent Twitter examples of a prolific snowclone, in the spirit of recent posts on Twitteration and Trekitude. By the way, since I totally love the new Star Trek movie, and I dearly love Twitter, [...]

  17. [...] the article here: RT this: OUP Dictionary Team monitors Twitterer’s tweets : OUPblog Share and [...]

  18. [...] OUP Dictionary Team monitors Twitterer’s tweets” from the Oxford University Press USA blog (http://blog.oup.com/2009/06/oxford-twitter/), where they discuss some of their findings from monitoring close on 1.5 million Tweets since January [...]

  19. english school oxford

    Interesting information. But it’s obvious that Twitter doesn’t represent the English language. Like texting, it uses its own rules and conventions. I can see it changing the English language over time, just as texting has, by implanting certain shorthand slang.

  20. [...] said, earlier this year the Oxford University Press studied 1.5 million tweets to see which words were found most [...]

  21. [...] said, earlier this year the Oxford University Press studied 1.5 million tweets to see which words were found most [...]

  22. [...] See Also: OUP Dictionary Team monitors Twitterer’s tweets (via Oxford University Press USA Blog) [...]

  23. [...] because a similar study carried out by Oxford University Press on 1.5 million Twitter messages showed slightly different results. [...]

  24. [...] According to the people at Oxford University Press, “no doubt this reflects on the intrinsically solipsistic nature of Twitter.” No doubt… [...]

  25. [...] variable. Going back to Twitter I looked for statistics and found this Oxford University blog post Total tweets = 1,496,981 Total sentences = 2,098,630 Total words = 22,431,033 Average words per [...]

  26. [...] Dictionary — "the definitive record of the English language," mind you — monitors the language used on Twitter. [...]

  27. [...] to the Oxford University Press, the average tweet is 14.98 words (call it [...]

  28. [...] it differs from everyday language in some very specific ways. Back in 2009, Oxford University Press looked at almost 1.5 million random tweets, and found some interesting distinctions between their content and that of text in general usage. [...]

  29. [...] it differs from everyday language in some very specific ways. Back in 2009, Oxford University Press looked at almost 1.5 million random tweets, and found some interesting distinctions between their content and that of text in general usage. [...]

  30. [...] count comes in around 200,000. That’s slightly over twice the length of Hacking Web Apps. (Or roughly 13,000 Tweets or 200 blog posts.) Those numbers are just trivia associated with the mechanics of [...]

  31. […] Source: Oxford University Press Blog […]
