At Toloka, we are committed to unlocking AI opportunities. Every day, our researchers tackle pressing AI and ML challenges,
make appearances at prominent global events, and publish their findings in scientific journals. Scroll down to learn more.
Browse through some of our latest work.
We describe the curriculum and course structure and show why and how we added elements of project-based learning and data-centric competitions to our course on crowdsourcing.
We present the results of the WMT’22 Shared Task on Large-Scale Machine Translation Evaluation for African Languages.
We present a human-in-the-loop approach to learning the most useful combination of prompt keywords using a genetic algorithm.
Our experiments on two different image datasets, dresses from Zalando's FEIDEGGER and shoes from the Toloka Shoes Dataset, confirm that crowdsourcing alone, with no machine learning algorithms, can yield meaningful clusters.
Find out how Toloka powers top-tier research across the globe.
The challenge solicited solutions that processed RAW camera images captured in night scenes. The organizers used Toloka to evaluate the visual appearance of the results. Mean opinion scores were calculated for a “people’s choice” ranking of solutions.
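As a rough illustration of how a mean opinion score ranking of this kind can be computed, here is a minimal Python sketch. The function names, the `(solution_id, score)` input format, and the sample ratings are assumptions made for the example, not the organizers' actual pipeline or data.

```python
from collections import defaultdict
from statistics import mean


def mean_opinion_scores(ratings):
    """Average the scores given to each solution.

    `ratings` is an iterable of (solution_id, score) pairs, one pair per
    individual judgement (hypothetical input format).
    """
    scores_by_solution = defaultdict(list)
    for solution_id, score in ratings:
        scores_by_solution[solution_id].append(score)
    return {sid: mean(scores) for sid, scores in scores_by_solution.items()}


def rank_by_mos(ratings):
    """Return solution ids ordered from highest to lowest mean opinion score."""
    mos = mean_opinion_scores(ratings)
    return sorted(mos, key=lambda sid: mos[sid], reverse=True)


# Made-up ratings on a 1-5 scale, for illustration only.
sample_ratings = [("a", 5), ("a", 3), ("b", 2), ("b", 2), ("c", 5)]
ranking = rank_by_mos(sample_ratings)
```

A real evaluation would typically also filter unreliable judgements and report confidence intervals; the sketch covers only the averaging and sorting step.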
Labeling a large number of images can be labor-intensive and time-consuming, and labeling images in planetary science often requires the help of crowdsourcing. In their book “Machine Learning for Planetary Science,” NASA researchers acknowledge Toloka.
Saiph Savage, an Assistant Professor at Northeastern University and Director of the Northeastern Civic AI Lab, collaborated with Toloka to lead a research initiative called “A.I. For Good Framework to Empower Digital Workers” to help rural workers get better wages and conditions.
As part of the conference, teams competed to devise the best system for machine translation of articles into different languages. Toloka provided human judgements as the ground truth for translation quality evaluation. All relevant language pairs were covered for fast evaluations using Toloka’s global crowd.
We regularly hold tutorials and lead workshops at some of the biggest AI conferences around the globe.
CANDLE: Collaboration of Humans and Learning Algorithms for Data Labeling.
Toloka social at NeurIPS 2022. From old biases to new opportunities: Annotator empowerment and data excellence.
We shared more than six years of our crowdsourcing experience and bridged the gap between the crowdsourcing and recommender systems communities.
This crowd science workshop explores how a reimagined perspective on crowdsourcing platforms could provide a more equitable, fair, and rewarding experience.
This crowd science workshop focuses on the best practices for efficient and trustworthy crowdsourcing.
The High-Quality Data Labeling at Scale with Toloka workshop aims to provide a comprehensive picture of how crowdsourcing can be applied to real-life AI production.
We shared some of the unique insights we have gained from six years of industry experience in efficient natural language data.
In this tutorial, we presented a systematic view of using Human-in-the-Loop to obtain scalable offline evaluation processes and high-quality relevance judgements.
At this crowd science workshop, we discussed key issues in preparing labeled data for machine learning, with a focus on remoteness, fairness, and mechanisms in the context of crowdso...
We presented a data processing pipeline used for training self-driving cars. Participants gained practical experience launching an annotation project in Toloka.
We explored the practical aspects of how crowdsourcing can be applied to information retrieval. Participants learnt how to create a dataset with relevant products.
We explored the practice of efficient data collection via crowdsourcing: aggregation, incremental relabeling, and pricing.
We thrive on continuous improvement and international cooperation. Contact us on LinkedIn if you’d like to collaborate.
Toloka partners with universities across the world to incorporate crowd science techniques.