By Ben Zimmer
Greetings, OUPblog readers! It’s been about six months since I had my “Last Word” around these parts, and it’s good to be back, reporting in from my new vantage point as executive producer of the Visual Thesaurus. When I was writing the column “From A to Zimmer” here, I often talked about how the OUP dictionary program uses the latest computational tools to shed new light on the inner workings of the English language. The development of the Oxford English Corpus has been particularly useful in tracking English usage, illuminating everything from spelling errors to shifting idioms to innovative combining forms like -licious. In my new job, I still get the chance to fuse lexicography with state-of-the-art technology. One fun example of this fusion is a new online spelling bee that adapts to players’ skill levels. It tells us a lot about how people grapple with the confusing rules of English spelling.
American schoolchildren have been competing in spelling bees for about two centuries now, originally sparked by the spelling textbooks of Noah Webster, whose 250th birthday was celebrated by American lexiphiles two weeks ago. Since Webster’s time, the spelling bee has become a distinctly American tradition, with its lasting appeal showcased in movies like Akeelah and the Bee and Spellbound, and the widely watched national broadcast of the Scripps National Spelling Bee on ABC and ESPN. Even Great Britain is belatedly joining in the fun, with the (UK) Times currently sponsoring the first-ever national Spelling Bee.
When we launched the Visual Thesaurus Spelling Bee this past summer, we knew there was a built-in interest, but the response was still surprising. So far 15,000 players have tried their hand at spelling a grand total of 500,000 words. It’s clearly habit-forming, with many repeat visitors. It’s so addictive because it’s been designed to be adaptive: the more words a player spells correctly, the more difficult the words become. Conversely, if you’re not a great speller, the words will get easier and easier. That way a player will always be quizzed at the appropriate skill level, from the orthographically challenged to the most expert spellers.
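For the programmers in the audience, the adaptive idea can be sketched in a few lines of Python. This is only an illustration, not the Bee's actual code: the word list, most of the difficulty numbers, the fixed 50-point step and the function names are all assumptions made up for the example.

```python
# Hypothetical word list of (word, difficulty) pairs.
# The difficulty numbers are illustrative, not real Bee data.
WORDS = [("cat", 200), ("harried", 350), ("rhythm", 500),
         ("liaison", 650), ("palilalia", 800)]

def next_word(level, words=WORDS):
    """Pick the word whose difficulty is closest to the player's level."""
    return min(words, key=lambda w: abs(w[1] - level))[0]

def adjust(level, correct, step=50):
    """Raise the level after a correct spelling, lower it after a miss."""
    return level + step if correct else level - step

level = 500
print(next_word(level))             # a mid-level word to start with
level = adjust(level, correct=True)
level = adjust(level, correct=True)
print(next_word(level))             # two right answers bring up a harder word
```

A real system would draw on a far larger word list and would avoid repeating words, but the loop is the same: ask, score, adjust, ask again.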
As more and more players have tried the Bee, the game has steadily improved, thanks to the data collected on how words are spelled. Every five minutes, words are rescored for difficulty, taking into account the latest results from the Bee spellers. That means the words fit the different skill levels better and better over time. As the player continues to spell, the quiz homes in on his or her score, on a scale from 200 to 800. A 200-level speller will get quizzed on the easiest words, but 800-level spellers are in for a fiendish challenge, matching their wits against such oddities as puerperal (relating to childbirth), faineant (disinclined to work or exertion), and palilalia (a pathological condition in which a word is rapidly and involuntarily repeated, something you might get from trying to spell too many words!).
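One simple way to picture how a quiz homes in on a score is to let each answer move the estimate by a smaller amount than the one before. The Python sketch below assumes a starting point of 500 and a step that halves after every word; neither figure comes from the Bee itself, and the real scoring method may work quite differently.

```python
def run_quiz(answers, start=500, first_step=150, lo=200, hi=800):
    """Estimate a score on the 200-800 scale from a list of right/wrong answers.

    Assumed scheme: move up after a correct answer, down after a miss,
    and halve the step each time so the estimate settles down.
    """
    level, step = start, first_step
    for correct in answers:
        level += step if correct else -step
        level = max(lo, min(hi, level))   # stay on the 200-800 scale
        step = max(10, step // 2)         # never shrink below 10 points
    return level

print(run_quiz([True, True, False]))   # 688
print(run_quiz([True] * 10))           # 800
print(run_quiz([False] * 10))          # 200
```

The early answers make big jumps and the later ones make fine adjustments, which is why a handful of words is enough to place a player roughly.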
There’s some sophisticated data analysis going on behind the scenes to score both players and words. Using intricate algorithms and curve-fitting models, the Bee is able to determine not just how difficult a word is to spell, but how good a word is at discriminating good spellers from bad spellers. That way the Bee can quickly zero in on a player’s skill level, in much the same way that computer-adaptive tests like the GRE and GMAT tailor themselves to test-takers’ abilities.
For each word, a graph is generated to plot the distribution of right and wrong answers across different skill levels. Then a curve is drawn to fit the data. If that curve rises very steeply, then the word is a good “discriminator”: it’s an accurate way to separate the good spellers from the bad spellers. Take two relatively easy words: harried and horrendous. Both are at about the same difficulty level: 350 on the scale of 200 to 800. Here are their graphs, with players’ skill levels on the x-axis and the frequency of correct answers on the y-axis:
As the graphs illustrate, the curve for horrendous rises much more steeply than the one for harried. So if you spell horrendous incorrectly, it’s a very good bet that your skill level is below 350. And if you spell it right, then you probably can handle words at a level above 350. Each time a player spells a word right or wrong in the Bee, that gets added to the growing pool of data about each word’s difficulty and ability to discriminate good spellers from bad spellers.
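To see why a steep curve tells us more, here is a small Python sketch. It assumes the curve is a logistic (S-shaped) function, a common choice in adaptive testing, though the Bee's actual model isn't specified here. The steepness values for the two words are invented for illustration, and the function names are hypothetical.

```python
import math

def p_correct(skill, difficulty, steepness):
    """Chance that a speller at `skill` gets the word right (logistic curve)."""
    return 1 / (1 + math.exp(-steepness * (skill - difficulty)))

# Both words sit at difficulty 350; the steepness values are made up.
HARRIED = (350, 0.01)      # gentle curve: weak discriminator
HORRENDOUS = (350, 0.05)   # steep curve: strong discriminator

def chance_below(difficulty, steepness, correct, levels=range(200, 801, 50)):
    """Probability that the player's level is below `difficulty` after one
    answer, assuming every level on the grid was equally likely beforehand."""
    weights = {}
    for s in levels:
        p = p_correct(s, difficulty, steepness)
        weights[s] = p if correct else 1 - p
    total = sum(weights.values())
    return sum(w for s, w in weights.items() if s < difficulty) / total

print(chance_below(*HARRIED, correct=False))
print(chance_below(*HORRENDOUS, correct=False))
```

Under these assumed numbers, a miss on the steep-curve word pushes the odds of a sub-350 level much higher than a miss on the gentle-curve word does, which is the sense in which a good discriminator lets the quiz place a player with fewer words.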
We’ve come a long way from Noah Webster’s “Blue-Backed Speller,” but the impulse to test one’s spelling prowess is still running strong — both in the American spelling-bee tradition and now, increasingly, among the international audience of Anglophones. Everybody loves a challenge, even if they accidentally learn something in the process! And as the creators of the challenge, we’re constantly learning too, finding the patterns of how people succeed and fail when confronting the odd and outrageous rules of English spelling.
To try the Bee yourself, click here.