Today, Professor Christian Kay from the Historical Thesaurus of the Oxford English Dictionary team talks about categorization in the HTOED.

When I first came to Glasgow, I was puzzled by children ringing my doorbell and asking, “Goat ony ginger boatles, Missus?” (which loosely translates as, “Do you have any ginger bottles which you might give us, Madam?”) I understood that they wanted to take the bottles back to the shop and claim a refund on them, but I could not understand why their trade was so specialized. To me, a ginger bottle must be a bottle containing a ginger-flavoured liquid. Only later did I learn that for Glaswegian children, and many adults, “ginger” was a generic term for any fizzy drink – what I, equally illogically, call lemonade.

What was happening here was a clash of categorization systems. The Glaswegians and I were surveying the world of drinks and organizing it in different ways. As an incomer, I had to learn the categories of their society if I was to operate successfully within it. A similar situation faces anyone moving to a new place, or learning a new language; children learning their first language may initially identify different categories from those employed by adults.

In the Historical Thesaurus of the OED, categories shift in time rather than in space. In category 01.02.07 People, for example, we find increasing numbers of relatively recent words referring to people in terms of their age: teenager (first recorded in OED2 in 1941), bobby-soxer (1944), pre-schooler (1954), subteenager (1959). The need to make such fine distinctions perhaps reflects the importance of age in our society, as do terms at the other end of the scale, such as senior citizen (1938), third age (1972), and, less flatteringly, wrinkly (1972) and crumbly (1976) to refer to an old person. Comparable terms for many other members of the animal kingdom exist in 01.02.06 Animals, but here our world knowledge may not immediately supply the categories. How many modern urban dwellers know that the words teg, hoggerel and thrinter refer to sheep in their first, second and third years respectively, or indeed to sheep at all? Fortunately for the classifier, this information is readily available in the OED.

A further complication lies in the fact that people can happily operate with more than one system of classification. Ask someone what a tomato is, and they are quite likely to reply that it is a type of vegetable, even if they are aware that technically a tomato is a fruit. The clash here is between a folk category, based on the use of the object in our society (we eat it with other vegetables), and a scientific or expert category (fruits have seeds). Since much of English vocabulary came into use before serious classification of the natural world got underway in the eighteenth century, the two systems often have to be juggled in HTOED. Early words for plants, for example, fall more readily into categories such as ‘medicinal’, ‘poisonous’, or ‘yielding a dye’ than they do into a scientific taxonomy. Classifiers have to take account of such duality. In the case of tomatoes, pumpkins, cucumbers, and so on, the solution is a category called Fruits as vegetables.

Categorization is a basic human cognitive skill. We begin in childhood to organize things according to whether they are alike or unalike, and continue this process in adult life. Most of us will impose some sort of order on our material possessions, sorting books by author (or title, or subject, or size …), socks by colour, sweaters by season, and so on – the categories may vary from person to person, but the principle is there. Dictionary definitions often categorize words by reference to other words, as when OED defines sofa as “a form of lounge or couch” or rapier as “a long, thin, sharp-pointed sword”. Such relationships are revealed by proximity in HTOED, which thus constitutes a map of their development in the history of the English language.

We use words and the categories they represent to impose order on our universe. If we hear the word tree, or hill, or green, very different images of these phenomena may spring to mind. Yet the multitude of trees that an individual may have seen have enough in common to form a category of Trees which is shared by other speakers of the language, and thus enables communication to proceed. (Except, of course, when the categories are fuzzy, as they mostly are, and we start arguing about whether a tree is actually a bush, or a hill qualifies as a mountain, or this particular green is closer to blue or yellow, but that’s a different problem.)

