Today’s article is a guest post by (former) Zooniverse ecologist Dr Ali Swanson. It’s a little longer than our average post, but it’s well worth a read!

 

At the Zooniverse, we’re always looking for ways to make projects better, faster, and more engaging. Over the last year or so, we’ve started trying out a new approach to project building that we call “cascade filtering” – basically breaking a complex workflow up into a series of “yes or no” questions that filter down to the images of interest. This makes a project easier for volunteers, and it also standardizes the project structure, which makes it easier to do things like incorporate machine learning routines for image recognition or port the project over to mobile.
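
If you’re curious what that looks like under the hood, here’s a minimal sketch of the idea in Python – the question text and function names are made up for the example, so treat it as purely illustrative rather than actual Zooniverse code. Each image moves through the questions in order: a “no” retires it, and only images that pass every filter get passed on to the next stage.

```python
# Purely illustrative sketch of cascade filtering (not Zooniverse code).
# The question text and function names below are made up for the example.

QUESTIONS = [
    "Is there a whale in this image?",
    "Can you see the whale's fluke?",
    "Is the fluke clear enough to identify?",
]

def passes_cascade(image, answer_for):
    """Return True if the image survives every yes/no filter in order."""
    for question in QUESTIONS:
        if not answer_for(image, question):  # aggregated volunteer answer for this step
            return False                     # a "no" retires the image here
    return True                              # image of interest: pass it to the next stage

# e.g. keepers = [img for img in images if passes_cascade(img, answer_for)]
```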

Snapshots at Sea (SAS) was our first cascade filtering project. Launched in 2015, Snapshots at Sea asks folks to identify images of humpback whales from thousands of photos taken by tourists on tour boats. The images from SAS that do have humpback whale flukes get passed on to Whales as Individuals, where they’re prepared for individual image recognition algorithms.

Snapshots at Sea has been really popular since it launched, but we at Zooniverse wanted to test the cascade filtering approach more rigorously to see whether we should recommend it to other research teams. So in June, we launched an experiment that simultaneously ran two versions of Snapshots at Sea – one cascade filtering version (that we call the “Yes/No workflow”) and one more traditional version (the “Survey workflow”) that looks like your standard camera trap project.

This figure provides a flowchart for each workflow, showing the different questions and the retirement rules.

Snapshots at Sea workflow design: The flowchart details the questions, retirement rules, and thresholds for passing images on to subsequent steps in each implementation of Snapshots at Sea.

And this figure shows the hourly and cumulative classifications for each workflow. Since the Yes/No workflow ultimately requires more classifications (because each classification doesn’t contain quite as much information as one from the Survey workflow), the cumulative classifications are shown as the proportion of classifications needed for completion.

Instantaneous and cumulative classifications through time for each Snapshots at Sea workflow: Cumulative classifications are plotted as the proportion of the total required to complete the dataset (n = 66,320 for the Survey workflow and n = 95,195 for the Yes/No workflow). Data additions are indicated by gray dashed vertical lines, newsletters by blue ones. The Yes/No workflow was completed on June 27 and deactivated (removed from the website) on July 5.

You can see that the Yes/No workflow got a lot more action! In fact, all four Yes/No questions were completed in half the time it took to complete the Survey workflow. This is probably because Yes/No classifications were so fast to do. They took anywhere from 1 second (on the mobile swipe app) to 3 seconds, while Survey classifications took 8-10 seconds.
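
A quick back-of-the-envelope calculation shows why. Using the classification counts and rough median times quoted above (so treat the exact figures as illustrative), the Yes/No workflow needed roughly 40% more classifications, but each one was so much quicker that the total volunteer effort works out far lower:

```python
# Rough effort estimate from the numbers quoted above (illustrative only).
survey_hours = 66_320 * 9 / 3600   # ~9 s per Survey classification  -> ~166 volunteer-hours
yes_no_hours = 95_195 * 2 / 3600   # ~2 s per Yes/No classification  -> ~53 volunteer-hours
print(round(survey_hours), round(yes_no_hours))
```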

Length of time to complete classifications on each workflow: Classification duration in seconds (calculated as the time from when an image loaded in a volunteer’s browser to when “Done” was selected) for each workflow across different devices. Median durations are given in black text above the median line; sample sizes in gray text below it. Note that the Y-axis is on a log10 scale.

Importantly, accuracy was pretty similar across the two workflows: aggregated volunteer answers agreed with the expert (Ted’s) answers 95-99% of the time.
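
We won’t go into the exact aggregation rule here, but as a rough illustration, a plain plurality vote over volunteers’ answers for each image looks something like this (an assumed, simplified sketch, not our production code):

```python
from collections import Counter

def aggregate(answers):
    """Plurality vote: return the most common answer and its share of the votes.
    (An assumed, simplified rule for illustration -- not necessarily what we ran.)"""
    answer, votes = Counter(answers).most_common(1)[0]
    return answer, votes / len(answers)

# aggregate(["yes", "yes", "no", "yes"]) -> ("yes", 0.75)
```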

So, cascade filtering seems to work really, really well when it comes to producing results. But what about for the volunteer community? Not only do we want to produce scientific results, but we also want to provide opportunities for deeper engagement and learning. There was some evidence that folks might comment less on Talk when classifying on the Yes/No workflow – possibly because heading over to Talk would break up the “flow” of classifying, but also maybe because the interface doesn’t encourage it. (We actually realized that the mobile swipe app doesn’t seem to provide any way to get to Talk!! Ack! That’s on our list to fix ASAP.)

But we also saw more super active classifiers on the Yes/No workflow, with twice as many folks who had made >1,000 classifications (during the experiment, that is – I’m sure many of you have contributed wayyyy more than that overall). This kind of extended activity is generally good, because many other Zooniverse studies have shown that the longer folks stay around a project, the more chance there is of discovering something new or learning about science (check out these papers at https://arxiv.org/pdf/1601.05973v1.pdf and http://eprints.whiterose.ac.uk/86535/).

Interestingly, when we asked you all what you thought about the two different workflows, we found that (surprise!) different people like different styles.

Workflow Preference: The number of volunteers who reported preferring each workflow. Note that we limited responses to those volunteers who had actually tried both workflows.

While the cascade filtering approach was a bit more popular, it was really telling to hear why people liked each approach. Some preferred the Yes/No questions because they were quick and easy, and they felt like they were contributing something right away; those who liked the Survey enjoyed being able to provide a more robust and complete classification. My favorite revelation was that sometimes people preferred different workflows depending on their mood.

So! All in all, we’re pretty happy with cascade filtering. Now, it doesn’t work for every project: Snapshots at Sea really just needed to know about one species, and the number of Yes/No questions needed for something like Snapshot Serengeti would probably be overwhelming. But we’re exploring ways to add a cascade filtering step to other projects, like we’ve done to filter out empties and vehicles on Camera CATalogue.

We’ve just submitted this work as a scientific article to the journal Citizen Science: Theory and Practice, and we’ll let you know if and when it’s published. It’s been so much fun to explore what makes a project work, and we’d never be able to do any of this without your help. So thanks again for all that you do!