By Scott B. Weingart and Jeana Jorgensen
Computational analysis and feminist theory generally aren’t the first things that come to mind in association with fairy tales. This unlikely pairing, however, can lead to important insights regarding how cultures understand and represent themselves. For example, by looking at how characters are described in European fairy tales, we’ve been able to show how Western culture tends to be biased toward the younger generation, especially the men. While that result probably won’t shock anyone more than passingly familiar with the Western world, the method of reaching these results allows us to look at cultural biases in a new light. Our study and many others like it are part of a growing trend in applying the power of computing and quantitative analysis toward understanding ourselves.
This is not a new idea. Isaac Asimov’s science fiction Foundation novels, dating back to 1942, explore the repercussions of being able to mathematically predict human activity based on an analysis of history. In the early 20th century, the Annales school of history began crunching historical numbers to learn more about cultures on a large scale. Various groups since then have risen with similar goals, including the cliometricians in the 1960s and the cliodynamicists more recently.
Folklorists, too, have always been interested in tracing large-scale patterns in expressive culture ranging from storytelling to pottery. In one now-classic example of structural analysis, Russian folklorist Vladimir Propp separated fairy tales into plot components based upon the action being performed regardless of the character performing it (hence it doesn’t matter whether a witch or dragon steals the princess; what matters is that the princess has been removed from the civilized sphere, creating the need for a hero and a quest). More recently, folklorists such as Kathleen Ragan and Timothy Tangherlini have been using statistical analysis and geographical information systems to study gender bias in folktale publications and storytelling diffusion over time and space.
The biggest news to hit the streets recently combined the power of Google, a few Harvard mathematicians, and five million digitized books covering the last two centuries. They dubbed their computational study of culture “culturomics”, and several more research projects have grown in its wake.
This type of research has traditionally been limited by inadequate technology, incomplete data, and the scarcity of scholars well-versed in both computation and traditional humanities research. That scene is now changing, due largely to efforts from both sides of the cultural divide, the humanities and the sciences. It is in this context that we undertook a study of European fairy tales, yielding interesting and occasionally unexpected results.
An analysis of over 10,000 references to people and body parts in six collections of Western European fairy tales can reveal quite a bit. Understanding fairy tales pays off twofold: they reveal the popular culture and beliefs of the past, while simultaneously showing what cultural messages are being transferred to modern readers. There is no doubt that the Disney renditions of classic fairy tales both reflect assumptions of the past and help shape the gender roles of the present.
One finding from this analysis dealt with the use of adjectives when describing bodies or body parts in the stories. The most frequently used adjectives cluster around the themes of maturation, gaining and maintaining beauty or wealth, and the struggle for survival, all concepts that still have a prominent place in our culture.
The use of age in these stories is of particular interest. While young people are mentioned more than twice as frequently as old people, the word old (and similar words indicating old age) appears more frequently than the word young (and related terms). That means the tellers of these stories rarely find it necessary to mention when someone is young, but often feel the need to describe the age of older people.
In fact, old people tend to attract more adjectives than their younger counterparts in general. If someone is going to be described in any way at all, whether it be about their beauty or their age or their strength, it’s far more likely that those descriptions are attached to the old rather than the young. This trend also holds true with regard to gender: men are described significantly less frequently than women. Combining these facts, it appears that although old women are brought up relatively infrequently, they are described much more frequently than would be expected.
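For readers curious how this kind of comparison can be computed, here is a minimal sketch in Python. It is not the pipeline used in the study: the annotated references below are invented toy examples, and the function name `description_rates` is a hypothetical helper. It only illustrates the distinction drawn above between how often a group is mentioned and how often those mentions carry an adjective.

```python
from collections import Counter

# Hypothetical toy annotations, NOT data from the study. Each tuple is
# (age group, gender, adjectives attached to that reference).
references = [
    ("young", "male", []),
    ("young", "male", []),
    ("young", "female", ["beautiful"]),
    ("young", "male", []),
    ("old", "female", ["old", "wicked"]),
    ("old", "male", ["old"]),
]

def description_rates(refs, key_index):
    """Return {group: share of references with at least one adjective}."""
    totals = Counter()
    described = Counter()
    for ref in refs:
        group = ref[key_index]
        totals[group] += 1
        if ref[2]:  # a non-empty adjective list means the reference is described
            described[group] += 1
    return {group: described[group] / totals[group] for group in totals}

# How often each age group is mentioned at all.
mention_counts = Counter(ref[0] for ref in references)

# How often a mention is described, split by age and by gender.
age_rates = description_rates(references, 0)
gender_rates = description_rates(references, 1)

print(mention_counts)
print(age_rates)
print(gender_rates)
```

In this toy sample the young are mentioned more often but described at a lower rate, which is the shape of the pattern reported above; real conclusions would of course need the full annotated corpus and a test of statistical significance.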
The fact that women are described more frequently than men fits with a common feminist theory suggesting Western culture treats the male perspective as universal, unmarked, public, and default. Extending that theory further, the fairy tale analysis reveals that the young perspective is also default and unmarked. Older people, and especially older women, must be described in greater detail and with greater frequency, marking them as old or as women or both, because otherwise the character is assumed to be young and masculine, the traits that are considered defaults.
These results just scratch the surface of what can be discovered using the automated and quantitative analysis of cultural data. As technology and data sources improve, there will be an increasing number of studies which combine algorithms and statistics with traditional humanistic theories and frameworks. The holy grail, to which we are drawing ever closer, is the successful bridging of the traditional close reading approaches of humanistic inquiry and the distant reading quantitative methods being developed by researchers like Franco Moretti and the Google Ngrams Team. This is another step on that path.
Scott B. Weingart is an Information Science Ph.D. student at Indiana University studying the history of science, and Dr. Jeana Jorgensen is a recent graduate of Indiana University who specializes in folklore and gender studies. This work is from a paper they co-presented at Digital Humanities 2011, for which they won the Paul Fortier Prize for best young researchers at the conference. The paper ‘Computational analysis of the body in European fairy tales’ is in the journal Literary and Linguistic Computing, and is available to read for free for a limited time.
Literary and Linguistic Computing is an international journal which publishes material on all aspects of computing and information technology applied to literature and language research and teaching. Papers include results of research projects, description and evaluation of techniques and methodologies, and reports on work in progress.