Almost the entirety of Western literature can be fit neatly into just six story arcs, according to a new data-mining study.
From the panoply of novels that Western society has produced, distinct narrative patterns emerge, and many attempts have been made to pin down the shape of a story and categorize a protagonist’s journey. French writer Georges Polti claims there are 36 different types dramatic stories, while others have counted seven narrative arcs or 20.
But new research from the University of Vermont utilizing data-mining techniques suggests that the majority of the Western canon falls into one of six basic categories.
A Story’s Path
Researchers from the Computational Story Lab looked at over 1,700 books from Project Gutenberg for their study, winnowing out books such as dictionaries or those with less than 150 downloads. They analyzed the content of each book by taking samples of text, what they called “windows”, from throughout the story. They used the aptly named “hedonometer” , also developed by the Computational Story Lab, to compile a list of over 10,000 words and rate them on a spectrum of positive to negative using Amazon’s Mechanical Turk service. They published their results last month on arXiv.org.
Adding up these windows over the course of a whole book produced graphs of characters’ fortunes — the highs and lows — throughout a given novel, and generated a broad visualization of the arc the story takes. According to the researchers, theses are the six story arcs that appear time and time again in Western literature:
- “Rags to riches” (the story gets better over time);
- “Man in a hole” (fortunes fall, but the protagonist bounces back);
- “Cinderella” (there’s an initial rise in good fortunes, followed by a setback, but a happy ending)
- “Tragedy” or “riches to rags” (things only get worse);
- “Oedipus” (bad luck, followed by promise, ending in a final fall)
- “Icarus” (opens with good fortunes, but doomed to fail)
While some stories don’t fit into these archetypes, the researchers say that the majority of Western classics fall into one of these categories. The “man in a hole” and rags-to-riches storylines seemed to be the most prevalent, depending on which statistical technique they applied to the data.
The researchers do note that their technique will only track broad changes in emotional valence over time, ignoring shifts that occur on the level of the sentence or paragraph. For example, they provide a detailed breakdown of Harry Potter and the Deathly Hallows, which doesn’t fit neatly into any one of the categories.
However, the seven-book series, when taken as a whole, produces a more definite “rags to riches” story that fits in with established arcs.
In addition, their process cannot separate the fortunes of multiple characters, which could be a problem for books with multiple story lines — their program would definitely struggle with something as complex as Game of Thrones. Instead, their algorithm lumps the characters into one and tracks the overall emotional tone of the book from beginning to end.
How We Talk About Ourselves
The Gutenberg Collection is a compendium of classic works, however, more modern stories were not sampled. To call back to Game of Thrones again, there are many modern novels that tell more complex tales and which embrace emotional ambiguity, muddying the story’s arc and making a precise definition difficult.
In addition, the researchers looked only at works from the Western canon — analyzing stories from other cultures may produce very different trends and hint at diverse preferences. They say that they hope to include novels from other countries in future studies.
While their work hints at some of the broader patterns of thought that dominate Western culture, the researchers say that it could also help to teach computers how to communicate better. Teaching artificial intelligence to construct stories that follow popular arcs could allow them to form better arguments and relate concepts with more accuracy. The AI authors out there, even those trained on Shakespeare, fall somewhat short of engaging — or even comprehensible. Reading the stories that emerge from a particular culture gives unique insights into norms, practices and overall patterns of thought. If we want to teach computers to think like us, they’ll need to understand how we see the world.