Data Mining Proves Darwin’s Finches Weren’t Really His

By Shannon Palus | January 3, 2014 10:11 am

[Image: Darwin’s finches, by Gould]

Darwin’s finches weren’t his. That’s what a team of computer scientists found when they tested out a new data-mining technique, Reference Publication Year Spectroscopy (RPYS).

This is the first time the technique has been applied to study the origins of a popular phrase – before now it’s been used only to pinpoint the beginnings of scientific fields. In the future, RPYS could have applications ranging from exploring virality in pop culture to debunking other scientific legends.

How It Works

Researchers using the technique begin by searching a database of academic papers for a specified term – in this case, “Darwin’s finches.” For any papers that contain the term, the researchers extract the citations section, in which previously published research is listed. The dates of these previous publications are then plotted on a graph by year.
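
The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the researchers' code: the `papers` list, its field names, and the toy reference years are made up for the example, and a real analysis would pull its records from a bibliographic database.

```python
from collections import Counter

# Hypothetical records: each paper has some text and the publication years
# of the works it cites. Toy data, for illustration only.
papers = [
    {"text": "Beak size in Darwin's finches ...", "cited_years": [1947, 1947, 1961, 1983]},
    {"text": "Song learning in Darwin's finches ...", "cited_years": [1947, 1983, 1986]},
    {"text": "Soil chemistry of volcanic islands ...", "cited_years": [1990, 1995]},
]

def reference_year_counts(papers, term):
    """Count cited-reference years across papers whose text mentions `term`."""
    counts = Counter()
    for paper in papers:
        if term.lower() in paper["text"].lower():
            counts.update(paper["cited_years"])
    return counts

counts = reference_year_counts(papers, "Darwin's finches")

# Print a crude text "spectrum": one row per reference year.
for year in sorted(counts):
    print(year, "#" * counts[year])
```

Only the two papers that mention the term contribute to the tally, so the third paper's references never appear in the plot.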

The resulting graph resembles an emission spectrum in chemistry, hence the technique’s name. Just as atomic spectroscopy identifies the elements that compose a sample – they show up as distinct lines on an emission spectrum – the idea behind RPYS is to identify the key publications that contribute to a term or idea. In that way RPYS is the reverse of something that researchers already do: look up how often their work has been cited in the literature.
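
One way to make such "spectral lines" stand out is to compare each year's count with the median of the surrounding years, so that a year towers over its neighbors only if it is cited unusually often. The sketch below assumes a five-year window and toy counts; both are illustrative choices, not details given in this article.

```python
from statistics import median

def deviation_from_median(counts, window=2):
    """For each year, subtract the median count of the years within
    `window` years on either side (the year itself included)."""
    deviations = {}
    for year in range(min(counts), max(counts) + 1):
        neighborhood = [
            counts.get(y, 0) for y in range(year - window, year + window + 1)
        ]
        deviations[year] = counts.get(year, 0) - median(neighborhood)
    return deviations

# Toy counts of cited references per year (made up for illustration).
counts = {1944: 2, 1945: 3, 1946: 2, 1947: 15, 1948: 3, 1949: 2, 1950: 3}

deviations = deviation_from_median(counts)
peak_year = max(deviations, key=deviations.get)
print(peak_year, deviations[peak_year])
```

Using the median rather than the mean keeps a single heavily cited year from inflating its own baseline, which is why the peak remains sharp.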

The Origin of Darwin’s Finches

As a test case, the researchers chose the term “Darwin’s finches” because historians had already explained how the term appeared and became popular.

Anyone who has read Darwin’s work knows that he didn’t actually do all that much work with finches, though he did collect them while in the Galapagos. It was evolutionary biologist David Lack who, over 100 years later, did the significant work of connecting the geographically isolated birds to their evolutionary differences. Lack outlined this work in a 1947 book titled “Darwin’s Finches” – a nod to the father of evolutionary theory.

And indeed, when the researchers applied RPYS, a spike appeared at 1947 – indicating that Lack, not Darwin, popularized the concept of “Darwin’s finches.”

Thus, the researchers write, “Charles Darwin, the originator of evolutionary theory, was given credit for finches he did not see and for observations and insights about the finches he never made.” The results were posted on the preprint server arXiv.

Popularization of a Concept

It’s important to note that this is just a measure of influence. The highest-quality papers are not always the ones cited most frequently. For example, a recent study published in Nature showed that there are gender disparities in paper citations: articles with women as prominent authors were cited less than papers authored by men. Further, RPYS doesn’t easily show the first paper to mention something. Lack himself didn’t coin the term “Darwin’s finches” – he just popularized it.

The main point of the paper is that RPYS works, and could help debunk other misconceptions about where research came from. In the meantime: don’t believe everything you hear.

CATEGORIZED UNDER: Technology, top posts
MORE ABOUT: computers
  • Klae A. Klevenger

    Anyone who has taken a class on Darwin would know this. It is common knowledge that Darwin didn’t do anything with finches. This is not an important find.

    • alnontr

      They used this case to test a new data mining technique specifically because its history is known. Anyone who actually read the article would know this.

      • Klae A. Klevenger

        I did read the article. While I can’t say that it’s been changed, I can say I only now see what you are referring to. I would have much rather had an article that at first glance looks like an article about what it is an article about.

  • http://www.marvelitech.com/web-data-mining-services/ shapnariyan

    Data mining is the process of taking a huge amount of data and
    analyzing it from a variety of angles and format that
    makes it useful information to help a business improve business.

  • shapna

    yeah, data mining is the process of collecting huge information and filter it by needed. It helps especially in loan division on bank sector.
