What’s a scientist to do with 1.2 million photos, most of grass but some containing valuable data about endangered animals? Turn the whole thing over to the public, if you’re the creators of Snapshot Serengeti. This project caught the attention of tens of thousands of volunteers. Now their work has produced a massive dataset that’s already helping scientists in a range of fields.
Most online citizen science involves a degree of tedium—counting craters, tracing kelp mats. But Snapshot Serengeti is part safari, part detective work. That may be why volunteers tore through the photos so eagerly.
The pictures came from 225 camera traps set up in a grid across 1,125 square kilometers of Serengeti National Park in Tanzania. The cameras have infrared sensors that are triggered by a combination of heat and motion. That means when an animal walks past, the camera snaps a quick burst of pictures.
The cameras were bolted onto trees or metal poles and enclosed in steel cases. Nevertheless, about 15 percent of the cameras had to be replaced each year after being damaged by weather or animals.
Between 2010 and 2013, the camera traps captured 1.2 million scenes. To sort through the overwhelming number of pictures, the researchers turned the task into an online game for citizen scientists. Snapshot Serengeti is hosted at the Zooniverse, a citizen science portal. (All the images uploaded to Snapshot Serengeti have now been classified, but you can still play around with the site. And the cameras are still running, so aspiring classifiers should stay tuned for new pictures.)
Volunteers could classify a picture as empty if the camera had misfired on some branches or grass blades waving in the sun. That was the case for about three quarters of the photos. When an animal was present, users went through a quick guide to determine the most likely species. (What color or pattern does its fur have? What are its horns and tail shaped like? What might it be mistaken for?)
Animals could be classified as one of 48 different species (aardvark, porcupine, hippopotamus) or groups of species (rodent, miscellaneous bird). Users also reported how many animals they saw, what the animals were doing (moving? eating?), and whether any young were around.
The 28,000 registered Snapshot Serengeti users, along with about 40,000 unregistered users, classified more than 300,000 animal photos. Then scientists led by Alexandra Swanson at the University of Oxford used a “simple algorithm” to merge these classifications into a single consensus dataset. They labeled each picture with the animal or animals that the most volunteers had picked.
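For the curious, here is a minimal sketch of what a "most votes wins" rule could look like. It is not the researchers' actual code. The function name `plurality_consensus` and the toy votes are made up for illustration, and the sketch assumes each volunteer submits a single species label per image, which is simpler than the real project, where one photo can contain several species.

```python
from collections import Counter


def plurality_consensus(votes):
    """Return the label(s) picked by the most volunteers, sorted alphabetically.

    `votes` is a list of species labels, one per volunteer (an assumption
    of this sketch). Ties are kept, so the result can hold several labels.
    """
    if not votes:
        return []
    counts = Counter(votes)
    top = max(counts.values())
    return sorted(label for label, n in counts.items() if n == top)


# Hypothetical votes for a single image
votes = ["zebra", "zebra", "wildebeest", "zebra", "eland"]
print(plurality_consensus(votes))  # ['zebra']
```

Keeping ties rather than picking one winner at random is a design choice of this sketch. It leaves the ambiguous images visible instead of hiding them.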
They also gave each image a score for uncertainty and difficulty. A photo of a furry haunch pressed against the camera lens, for example, might have high uncertainty because volunteers didn’t agree on how to classify it. A clear shot of two giraffes, on the other hand, would get more consistent answers.
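One way to put a number on volunteer disagreement is to measure how evenly the votes are spread across species. The sketch below uses normalized Shannon entropy for that. This is an assumption chosen for illustration, not necessarily the score the researchers computed, and `vote_evenness` is a made-up name.

```python
import math
from collections import Counter


def vote_evenness(votes):
    """Score disagreement among volunteers on a scale from 0 to 1.

    0 means everyone chose the same label. 1 means the votes were split
    evenly across every label that was chosen. Computed as Shannon entropy
    divided by its maximum for the number of distinct labels (an assumed
    metric for this sketch).
    """
    counts = Counter(votes)
    distinct = len(counts)
    if distinct <= 1:
        return 0.0  # no votes, or unanimous agreement
    total = sum(counts.values())
    entropy = -sum((n / total) * math.log(n / total) for n in counts.values())
    return entropy / math.log(distinct)


# Hypothetical votes: a clear shot of giraffes vs. a furry haunch on the lens
clear_shot = ["giraffe"] * 10
furry_haunch = ["wildebeest", "buffalo", "eland", "wildebeest", "hartebeest"]
print(vote_evenness(clear_shot))  # 0.0
print(vote_evenness(furry_haunch) > vote_evenness(clear_shot))  # True
```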
But how accurate were the volunteers? Swanson and her coauthors created a smaller, “gold standard” set of images to find out. Experts classified 4,149 of the Snapshot Serengeti images. When they checked these classifications against the larger, volunteer dataset, the researchers saw that species IDs by citizen scientists were almost 97 percent accurate.
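The comparison itself is simple arithmetic: of the images that experts labeled, what fraction got the same label from the volunteers? Here is a rough sketch with made-up image IDs and labels. The `accuracy` helper and the dictionary layout are assumptions for illustration, not the paper's code.

```python
def accuracy(consensus, gold):
    """Fraction of expert-labeled images where the volunteer label matches.

    Both arguments map an image ID to a species label. Only images that
    appear in both dictionaries are compared.
    """
    shared = consensus.keys() & gold.keys()
    if not shared:
        raise ValueError("no images in common to compare")
    correct = sum(consensus[image_id] == gold[image_id] for image_id in shared)
    return correct / len(shared)


# Hypothetical data: four volunteer labels, three of them checked by experts
volunteer_labels = {
    "img_001": "zebra",
    "img_002": "wildebeest",
    "img_003": "impala",
    "img_004": "lion",
}
expert_labels = {
    "img_001": "zebra",
    "img_002": "wildebeest",
    "img_003": "gazelle",
}
print(f"{accuracy(volunteer_labels, expert_labels):.0%}")  # 67%
```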
The researchers are making their dataset available to other scientists, and hope that it will be as useful as the photos are entertaining. Already, they say, their collaborators are using the data to work on automated species detection and classification—in other words, teaching computers to do the same tasks that the tens of thousands of volunteers did.
If you participated in Snapshot Serengeti, you can rest assured that your time (and my time) spent staring at warthogs and elands wasn’t wasted. Like these cheetahs, you’ve earned a nap.
All images: Snapshot Serengeti.
Swanson, A., Kosmala, M., Lintott, C., Simpson, R., Smith, A., & Packer, C. (2015). Snapshot Serengeti, high-frequency annotated camera trap images of 40 mammalian species in an African savanna. Scientific Data, 2. DOI: 10.1038/sdata.2015.26