Toloka Kit: a Python Library for Your Labeling Projects

by Toloka Team on Mar 18th, 2021
Great news for ML engineers and data scientists! We are proud to introduce Toloka Kit, a Python library for your data labeling projects. You can do a lot directly in the Toloka interface, but a scalable, automated ML pipeline needs more flexibility than a web interface offers. That's where Toloka Kit comes in: it lets you control your projects programmatically, which makes it easier to integrate crowdsourcing into your processes.

Why you should try it

Here’s how you can benefit from Toloka Kit:

  • Easily reuse projects by just copying and pasting code. No need to configure parameters in the interface over and over again.
  • Train your ML models and run your data labeling projects in the same environment.
  • Take advantage of open-source code that anyone can use and contribute to.

How to start

For a smooth start with Toloka Kit, follow our example, which walks through the data labeling pipeline from CVPR 2020. It's a simple yet fundamental solution to a popular task: segmenting images for autonomous vehicles. The example describes in detail how to create and run a segmentation project with our Python library and end up with a dataset of traffic sign annotations on street-level images. All the code is provided as a Jupyter notebook. Check it out, apply it to your own project, and if you have any questions or issues, let us know!

