Happy New Year! We’re kicking 2024 off on a high note with the year’s first publication!

“Humans in the loop: Community science and machine learning synergies for overcoming herbarium digitization bottlenecks” is a new publication from the Notes from Nature and Zooniverse teams. It details the collaborative work we’ve carried out as part of an NSF-funded project called Leaping the Specimen Digitization Gap, or DigiLeap for short.

The article describes how we have combined Zooniverse volunteer efforts with automated approaches to digitize herbarium specimens faster and more efficiently.

From the Methods overview:

We present two new semi-automated services. The first detects and classifies typewritten, handwritten, or mixed labels from herbarium sheets. The second uses a workflow tuned for specimen labels to label text using optical character recognition (OCR). The label finder and classifier was built via humans-in-the-loop processes that utilize the community science Notes from Nature platform to develop training and validation data sets to feed into a machine learning pipeline.

Guralnick et al., 2024

You can read the full open-access publication via the following link: https://doi.org/10.1002/aps3.11560.