6 Story Arcs Define Western Literature, Data-Mining Study Reveals

By Nathaniel Scharping | July 6, 2016 3:18 pm

(Credit: jorisvo/Shutterstock)

Almost the entirety of Western literature can be fit neatly into just six story arcs, according to a new data-mining study.

From the panoply of novels that Western society has produced, distinct narrative patterns emerge, and many attempts have been made to pin down the shape of a story and categorize a protagonist’s journey. French writer Georges Polti claimed there are 36 different types of dramatic stories, while others have counted seven narrative arcs, or 20.

But new research from the University of Vermont utilizing data-mining techniques suggests that the majority of the Western canon falls into one of six basic categories.

A Story’s Path

Researchers from the Computational Story Lab looked at over 1,700 books from Project Gutenberg for their study, winnowing out books such as dictionaries or those with fewer than 150 downloads. They analyzed the content of each book by taking samples of text, which they called “windows,” from throughout the story. To score each window, they used the aptly named “hedonometer,” also developed by the Computational Story Lab, which draws on a list of over 10,000 words rated on a spectrum from positive to negative using Amazon’s Mechanical Turk service. They published their results last month on arXiv.org.
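To make the windowing idea concrete, here is a minimal Python sketch of the general approach, not the researchers’ actual code. The tiny word list, its scores (on an assumed scale of 1 for most negative to 9 for most positive), the function names and the window sizes are all hypothetical stand-ins chosen for illustration.

```python
import re
from typing import Dict, List, Optional

# Hypothetical toy lexicon: word -> happiness score on an assumed 1-9 scale.
# The scores are invented for illustration; the real list has over 10,000 rated words.
TOY_LEXICON: Dict[str, float] = {
    "happy": 8.0, "love": 8.5, "hope": 7.5, "win": 7.8,
    "sad": 2.5, "death": 1.5, "fear": 2.2, "lost": 2.8,
}


def window_score(words: List[str], lexicon: Dict[str, float]) -> Optional[float]:
    """Return the mean score of the rated words in one window, or None if none are rated."""
    rated = [lexicon[w] for w in words if w in lexicon]
    return sum(rated) / len(rated) if rated else None


def emotional_arc(text: str, lexicon: Dict[str, float],
                  window_size: int, step: int) -> List[float]:
    """Slide a fixed-size window of words across the text and score each window."""
    words = re.findall(r"[a-z']+", text.lower())
    arc: List[float] = []
    # A text shorter than one window is treated as a single window.
    last_start = max(len(words) - window_size, 0)
    for start in range(0, last_start + 1, step):
        score = window_score(words[start:start + window_size], lexicon)
        if score is not None:  # skip windows that contain no rated words
            arc.append(score)
    return arc


if __name__ == "__main__":
    story = "Sad and lost, full of fear and death. Then hope, then love: a win, a happy end."
    print(emotional_arc(story, TOY_LEXICON, window_size=8, step=4))
```

Plotting the returned list against window position gives a rough curve of a story’s emotional highs and lows. A real analysis would need a far larger lexicon and much wider windows than this toy example uses.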

Adding up these windows over the course of a whole book produced graphs of characters’ fortunes (the highs and lows) throughout a given novel, and generated a broad visualization of the arc the story takes. According to the researchers, these are the six story arcs that appear time and time again in Western literature:

  • “Rags to riches” (the story gets better over time);
  • “Man in a hole” (fortunes fall, but the protagonist bounces back);
  • “Cinderella” (there’s an initial rise in good fortunes, followed by a setback, but a happy ending);
  • “Tragedy” or “riches to rags” (things only get worse);
  • “Oedipus” (bad luck, followed by promise, ending in a final fall);
  • “Icarus” (opens with good fortunes, but doomed to fail).

The six main story arcs in Western literature. From top left: rags to riches, man in a hole, Cinderella, tragedy, Oedipus, Icarus. (Credit: Reagan et al./University of Vermont)

While some stories don’t fit into these archetypes, the researchers say that the majority of Western classics fall into one of these categories. The “man in a hole” and rags-to-riches storylines seemed to be the most prevalent, depending on which statistical technique they applied to the data.

The researchers do note that their technique will only track broad changes in emotional valence over time, ignoring shifts that occur on the level of the sentence or paragraph. For example, they provide a detailed breakdown of Harry Potter and the Deathly Hallows, which doesn’t fit neatly into any one of the categories.


However, the seven-book series, when taken as a whole, produces a more definite “rags to riches” story that fits in with established arcs.

In addition, their process cannot separate the fortunes of multiple characters, which could be a problem for books with multiple story lines — their program would definitely struggle with something as complex as Game of Thrones. Instead, their algorithm lumps the characters into one and tracks the overall emotional tone of the book from beginning to end.

How We Talk About Ourselves

The Gutenberg collection is a compendium of classic works, so more modern stories were not sampled. To return to Game of Thrones, many modern novels tell more complex tales and embrace emotional ambiguity, muddying the story’s arc and making a precise definition difficult.

In addition, the researchers looked only at works from the Western canon — analyzing stories from other cultures may produce very different trends and hint at diverse preferences. They say that they hope to include novels from other countries in future studies.

While their work hints at some of the broader patterns of thought that dominate Western culture, the researchers say that it could also help to teach computers how to communicate better. Teaching artificial intelligence to construct stories that follow popular arcs could allow it to form better arguments and relate concepts with more accuracy. The AI authors out there, even those trained on Shakespeare, fall somewhat short of engaging, or even comprehensible. Reading the stories that emerge from a particular culture gives unique insights into norms, practices and overall patterns of thought. If we want to teach computers to think like us, they’ll need to understand how we see the world.

  • Chris H

    Funny how we have that 6 degrees of separation rule/generalization for people as well.

  • OWilson

    Ah. Our brave new world.

    Computer books (Hollywood movie plots) and computer music.

    Grapes of Wrath = A family moves to California.

    Casablanca = A guy bumps into an old girlfriend.

    No need to plow through Sir Walter Scott, or Dickens.

    Wait for the text version, and you’ll get the idea! :)

    • http://www.mazepath.com/uncleal/qz4.htm Uncle Al

      Would The Story of O then be a Reader’s Digest condensed version of “50 Shades of Gray”? Do the US Tax Code! “The AI authors out there, even those trained on Shakespeare, fall somewhat short of engaging — or even comprehensible.” Indeed.

      “If we want to teach computers to think like us,” It should be “we,” but no matter. Revoke Asimov’s First Law of Robotics and you’ve got it.

      • OWilson

        Ah, the simpler, less complicated days of Asimov. Or, even Einstein, Penrose, and Feynman.

        “If you show us sufficient proof, we’ll believe you”.

        Velikovsky and Von Daniken couldn’t, and were summarily dismissed.

        Today, Hanson, Mann and Gore, with much less, become public Icons.

        • zlop

          Hidden hand of the Oligarchy is one
          with certain figures — even Einstein.
          “Albert Einstein: The Myth, the Plagiarist
          & the Zionist”

  • Lorie Franceschi

    Just a thought, with the millions if not billions of books in Western Literature, they can figure this all out by studying only 1700 books that are on line?

    • zlop

      Why not — most newspapers just copy main publications.
      Are you there — this is good/bad . .. …

      • Lorie Franceschi

        The article is talking about books and so am I, of which billions have been written. If you want to follow your thinking, why study 1700? All you have to do, following your line of thinking, is study the Bible, the Torah, the Qur’an, the Bhagavad Gita, Bushido, and the teachings of Confucius.

        • zlop

          Silly woman., you are,
          does not describe all women.

          • Lorie Franceschi

            I am glad that you think that I am silly. I was just pointing out, through your line of thought, that you don’t need to study 1700 books that are online or look at newspapers (which, by the way, are not considered books), and that all the holy books I mentioned have all six story arcs in them. Read one of them all the way through (I don’t care which one) and you will find the six story arcs. Do some of your own research.

          • zlop

            Similar to ancient story themes.
            Silly Woman is an approximation that people use.

            How many woman approximations do we have?
            I met someone who dressed as Betty Boop for Halloween.

          • OWilson

            Look to Madison Avenue for your stereotypes.

            Today’s woman has a career, she puts great tasting and nutritious fast food on the counter for her family, and helps her stay at home milquetoast husband, find his keys, and batteries for the TV remote for when his beer drinking pals come over for the weekend NFL game.

            She dutifully serves them all frozen pizza, with a smile, which they swear is “delivery” :)

          • http://paulyhart.com paulyhart

            on the thought of following religious books, i think you would have to break some of them up. from the bible – esther, job, the gospels might be some to fit western structure. it would be an interesting study.

        • Skip

          Because people like a variety of fantasy stories, not just religious ones.

    • Wings_42

      It’s been literally 50 years since I minored in statistics, but I can confidently state that a sample size of 1700 novels is big enough to be a valid representation if they were truly randomly selected within the criteria discussed in the article. I doubt that a sample size of thousands or even millions of novels that match the criteria would vary even 1% from the story arcs mined from the selected 1700.

      There may, however, be a lack of randomness in books available through Project Gutenberg. Another possible bias could lie within the actual sampling criteria and application of those criteria from the Project Gutenberg library. If so, the results would be biased in the direction of the sampling error.

      The purpose of this study seems to be to provide an analytic tool for understanding underlying structures in literature, not to put a box around what defines a novel. It should be fun and enlightening to match the appropriate story arc to a novel or even better to find a classic novel with a story arc other than these six.

      • Lorie Franceschi

        I do remember my stats class, and I was taught 1500 was the number. I think that total randomness might be the factor here. I do agree that the story arcs are along the six that are listed, but I believe there are a few more. As I said earlier also, just study the holy books of the major and some of the minor religions and you will come up with the same figures.

  • http://www.snowflakehell.blogspot.com Ken Brody

    My novels would likely be classified as “rags to riches” if you ignore such details as the impending destruction of all living things (Sage of Sagittarius) and assassination of the AI protagonist (Pa’an). It would give up on Curtain of Heaven.

    Next novel I’ll just hire that Mechanical Turk.

  • John Scanlon

    Did anyone notice that there’s actually just one sine curve? Rather than a finite typology of plots, the appropriate tool here is Fourier analysis.

    • zlop

      Literature compresses reality.

  • Nora Qudus

    This is not new, just stated differently. Learned this 30 years ago in college.

  • Brian R Young

    Methinks ZLOP may be an AI computer program?

  • Billy_Ruffian

    Leave it to science jerks to suck the soul out of art.

  • Geoff Kieley

    The late novelist John Gardner once said there are only two kinds of stories: A man goes on a journey, or a stranger comes to town


    I agree, many of the stories I have read fit into these story arcs

