Research Benchmarks

Quality control lies at the heart of crowdsourcing. The examples below can serve as benchmarks: each one shows the level of quality achieved on a popular research dataset, together with the aggregation method used and links to reference materials.

| Task | Dataset | Aggregation | Quality | Reference |
|---|---|---|---|---|
| Image Classification | CINIC-10 | Dawid-Skene | Accuracy on Test = 88% | GitHub, Colab |
| Text Classification | Large Movie Review Dataset | Dawid-Skene | Accuracy on Test = 89% | GitHub, Colab |
| Audio Transcription | CrowdSpeech | Fine-Tuned T5 | Word Error Rate on test-clean = 5.22 | GitHub, Paper, Colab |
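
The actual pipelines behind these numbers live in the GitHub and Colab references. To give a feel for what Dawid-Skene aggregation does with crowd answers, here is a minimal pure-Python sketch of the EM procedure. The function name `dawid_skene`, the `smoothing` parameter, and the toy votes and gold labels are illustrative assumptions, not taken from the referenced notebooks.

```python
def dawid_skene(votes, n_iter=20, smoothing=0.01):
    """Aggregate crowd labels with a basic Dawid-Skene EM loop.

    votes: list of (task, worker, label) triples (hypothetical input format).
    Returns a dict mapping each task to its most probable label.
    """
    tasks = sorted({t for t, _, _ in votes})
    workers = sorted({w for _, w, _ in votes})
    labels = sorted({l for _, _, l in votes})

    # Initialise task posteriors from vote shares (a soft majority vote).
    post = {t: {l: 0.0 for l in labels} for t in tasks}
    for t, _, l in votes:
        post[t][l] += 1.0
    for t in tasks:
        total = sum(post[t].values())
        for l in labels:
            post[t][l] /= total

    for _ in range(n_iter):
        # M-step: class priors and one confusion matrix per worker.
        # conf[w][z][l] estimates P(worker w answers l | true label is z).
        prior = {l: sum(post[t][l] for t in tasks) / len(tasks) for l in labels}
        conf = {w: {z: {l: smoothing for l in labels} for z in labels}
                for w in workers}
        for t, w, l in votes:
            for z in labels:
                conf[w][z][l] += post[t][z]
        for w in workers:
            for z in labels:
                row_sum = sum(conf[w][z].values())
                for l in labels:
                    conf[w][z][l] /= row_sum

        # E-step: posterior over the true label of each task.
        new_post = {t: dict(prior) for t in tasks}
        for t, w, l in votes:
            for z in labels:
                new_post[t][z] *= conf[w][z][l]
        for t in tasks:
            total = sum(new_post[t].values())
            for z in labels:
                new_post[t][z] /= total
        post = new_post

    return {t: max(labels, key=lambda l: post[t][l]) for t in tasks}


# Toy data (made up): workers "a" and "b" are reliable, "c" is noisy.
votes = [
    ("t1", "a", "cat"), ("t1", "b", "cat"), ("t1", "c", "cat"),
    ("t2", "a", "dog"), ("t2", "b", "dog"), ("t2", "c", "cat"),
    ("t3", "a", "cat"), ("t3", "b", "cat"), ("t3", "c", "dog"),
    ("t4", "a", "dog"), ("t4", "b", "dog"), ("t4", "c", "dog"),
]
gold = {"t1": "cat", "t2": "dog", "t3": "cat", "t4": "dog"}

aggregated = dawid_skene(votes)
accuracy = sum(aggregated[t] == gold[t] for t in gold) / len(gold)
print(aggregated, accuracy)
```

Unlike plain majority voting, this approach weights each worker by an estimated confusion matrix, so consistently reliable workers count for more. The sketch multiplies raw probabilities, which is fine for a handful of votes per task; a real dataset would likely need log-space arithmetic or an established library implementation.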