Conference

NAACL-HLT 2021



Crowdsourcing Natural Language Data at Scale: A Hands-On Tutorial

In this introductory tutorial, we share part of our six years of industry experience in efficient natural language data annotation via crowdsourcing. We introduce data labeling on public crowdsourcing marketplaces and present the key components of efficient label collection: task design and decomposition, quality control, and annotator selection. A practical session follows, in which participants address a real-world language resource production task, experiment with settings for the labeling process, and launch their own label collection project on one of the largest crowdsourcing marketplaces. The projects will be run on real crowds within the tutorial session. We will also present useful mathematical foundations and quality control techniques and tricks, and give attendees an opportunity to discuss their own annotation ideas.

Tutorial team

Alexey Drutsa
Toloka, Deputy CEO, COO
Dmitry Ustalov
Toloka, Head of Ecosystem Development
Olga Megorskaya
Toloka, CEO
Daria Baidakova
Toloka, Director of Educational Programs
Valentina Fedorova
Toloka, Analyst
