Rebecca OUP-US
Today is an exciting day at the OUPblog. We are gearing up to launch our newest column, which will appear for the first time tomorrow. Casper Grathwohl, Reference Publisher for OUP-USA and the Academic Division in Oxford, has graciously agreed to be the “opening act” and explain the impetus behind our newest column. Check out what Casper has to say below. Be sure to come back tomorrow and read From A To Zimmer!
Earlier this year Oxford introduced a new look to its dictionaries, a “refresh” of our classic design. One of the new elements you’ll notice is a little logo on the cover of every dictionary with the words “Powered by the Oxford Corpus” next to it. Intriguing. Most people have probably never heard of a corpus. So why are we making such a big deal of it? Well, the story of the Oxford English Corpus sits at the heart of our ability to track language and reflect real language usage, by real speakers, in our dictionaries.
The corpus is a carefully selected electronic repository of more than 1.5 billion words pulled from newspapers, blogs, magazines, scientific papers, journals, books, websites, transcripts from television and radio, and many other print, online, and spoken English sources from around the world. Together, this content is representative of our living language, and Oxford lexicographers analyze it to build the most sophisticated and accurate dictionary content of any publisher in the world. We can tell how the various uses of the noun “terror” have shifted in the United States after 9/11. By looking at the collocations of the verb “to cause” (the words that most often come before or after it), we discovered that it is used far more often to denote negative events (such as “to cause a flood”) than positive ones. These are just two examples of the nuances of our language that Oxford lexicographers are tracking when building new dictionary data.
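For readers curious what "looking at collocations" might involve in practice, here is a minimal Python sketch. It is not Oxford's tooling, and the sentences in `sample` are invented for illustration rather than drawn from the Oxford English Corpus. The function name `collocates` and its `window` parameter are hypothetical; the idea is simply to count the words that appear immediately before or after a target word.

```python
import re
from collections import Counter


def collocates(texts, target, window=1):
    """Count words found within `window` tokens before or after `target`."""
    counts = Counter()
    for text in texts:
        # Crude tokenizer: lowercase runs of letters and apostrophes.
        tokens = re.findall(r"[a-z']+", text.lower())
        for i, tok in enumerate(tokens):
            if tok != target:
                continue
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[tokens[j]] += 1
    return counts


# Invented sample sentences (an assumption, not corpus data).
sample = [
    "Heavy rain can cause flooding in the valley.",
    "Smoking can cause cancer.",
    "The storm did cause damage to the roof.",
    "Faulty wiring can cause fires.",
]

for word, n in collocates(sample, "cause").most_common():
    print(word, n)
```

On a real corpus a lexicographer would likely go further, for example by comparing these raw counts against how common each neighboring word is overall, but even this simple tally shows the kind of pattern described above: the words that follow "cause" in the sample are all unwelcome events.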
And why is this important? Because when it comes to language, precision is power. The more exact you can be in asking for something, the better chance you have of getting just what you asked for. It’s that simple. Language is evolving at an ever-increasing clip, and if we’re not keeping up—employing hundreds of lexicographers using tools like the corpus—there’s no way you’re going to be able to. I’m not a lexicographer myself, but I’ve spent time poking around the corpus, playing with the tools and uncovering examples of how you are playing with the language. I can tell you that it is awesome. I’ve been lost in it for hours. And we’re the only dictionary publisher with a corpus this rich and expansive. Anyone can cut and paste a bunch of articles or blogs into a database, but we hand-select a representative balance of sources—online and print, British and American, spoken and written—to ensure the accuracy and currency of every definition our lexicographers write. The Oxford English Corpus is part of our ongoing commitment as your guide to the English language, and that’s why we’re highlighting it on the cover of every one of our dictionaries.
But I’m not writing just to tell you about the Oxford English Corpus. I’m also here to introduce a new online column appearing on the OUPblog by American lexicographer Ben Zimmer. Ben is an editor in Oxford’s New York office, and his column will attempt to capture something of the daily life of the English language. And he has the Oxford English Corpus as a tool to back up his riffs. How does a word make it into the dictionary? What “Bushisms” will really last beyond W’s tenure in office? I’ve read Ben’s first column and I can tell you it’s just the right blend of serious language discussion and interesting cultural commentary, perfect for all of us armchair linguists. Enjoy the column, folks.

Comments
Anthony said :
Jun 28, 2007
line 4, impetuous??? what, no copy editor on the blog’s opening day?
Stumblng Tumblr said :
Jun 28, 2007
Did you mean “impetus”, not “impetuous”?
Rebecca said :
Jul 2, 2007
Sorry! Even I make mistakes!
Peter Adams said :
Jul 29, 2007
I gather the corpus is not available to scholars other than those working on OUP projects. Correct?