Toloka documentation


To support research and integration projects, we have developed several open-source libraries. Anyone is welcome to use them and contribute to the projects.


Toloka-Kit is a Python library that lets ML engineers and data scientists scale the data labeling process and control it programmatically.

Benefits of Toloka-Kit:

  • Make your data labeling processes reproducible.
  • Integrate data labeling processes with your ML environment.

If you want to try it out, use our sample labeling pipeline for selecting road signs in images, which we presented at CVPR 2020. Even though it's very simple, this pipeline plays an essential role in image segmentation for self-driving vehicles.


Crowd-Kit is a Python library that implements most of the popular crowdsourcing algorithms.

Crowd-Kit includes:

  • A variety of methods for aggregating Toloker responses.
  • Metrics for evaluating the quality of assignments and Tolokers.
  • Quality control methods.

The library has a friendly, easy-to-use interface and works with pandas DataFrames.
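To make the ideas of response aggregation and Toloker quality metrics concrete, here is a minimal sketch of majority-vote aggregation in plain Python. It does not use Crowd-Kit and is not its API. The `(task, worker, label)` tuple layout, the function names, and the alphabetical tie-breaking rule are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def majority_vote(responses):
    """Return the most frequent label per task.

    `responses` is an iterable of (task, worker, label) tuples
    (a hypothetical layout chosen for this sketch).
    """
    votes = defaultdict(Counter)
    for task, _worker, label in responses:
        votes[task][label] += 1

    aggregated = {}
    for task, counter in votes.items():
        top = max(counter.values())
        # Assumption: break ties deterministically by taking the smallest label.
        aggregated[task] = min(l for l, c in counter.items() if c == top)
    return aggregated


def worker_agreement(responses, aggregated):
    """Share of each worker's responses that match the aggregated label."""
    hits, totals = Counter(), Counter()
    for task, worker, label in responses:
        totals[worker] += 1
        hits[worker] += int(label == aggregated[task])
    return {worker: hits[worker] / totals[worker] for worker in totals}


responses = [
    ("t1", "w1", "cat"), ("t1", "w2", "cat"), ("t1", "w3", "dog"),
    ("t2", "w1", "dog"), ("t2", "w2", "dog"), ("t2", "w3", "dog"),
    ("t3", "w1", "cat"), ("t3", "w2", "dog"), ("t3", "w3", "dog"),
]

labels = majority_vote(responses)
quality = worker_agreement(responses, labels)
print(labels)
print(quality)
```

Agreement with the majority is only a rough proxy for Toloker quality, because it rewards workers who side with the crowd even when the crowd is wrong. More robust aggregation methods weight each response by an estimate of the worker's reliability.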
