Improve labeling quality: test your tasks before launching them

by Toloka Team

How do you make sure your data labeling tasks are clear and well-designed? Test them prior to project launch by doing a couple of tasks yourself!

After you set up a project, you want to make sure your tasks are easy to do correctly. When tasks are simple and straightforward, Tolokers complete them faster and make fewer mistakes, which translates directly to better results. But how do you check this? Try out a few tasks from a Toloker's point of view. If you find something that might cause problems, you can go back to the project settings and fix it before you launch the pool of tasks.

To try it out, upload your data to the pool and click “Check project”.

What we recommend checking

You will see the tasks and instructions exactly as Tolokers see them. Make sure you try all the answer options and check both the mobile and desktop versions. Here's what to pay attention to:

1. The Toloker's task interface

  • Text and other content aren't overlapping or cut off.
  • The task looks good on both desktop and mobile devices (if applicable to your project).
  • Buttons, text boxes, links, and options all work the way they're supposed to.
  • Response validation works as expected (if applicable to your project). If the task involves entering a phone number or email address, it's important to validate the format.
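
As a rough illustration of the kind of format check mentioned above, here is a minimal Python sketch. It is not Toloka's validation feature; the function names and the two patterns are assumptions made for this example, and the patterns are deliberately simple rather than complete. You could use something like it to decide which test inputs your interface should accept or reject.

```python
import re

# Hypothetical patterns for illustration only: they catch obvious format
# mistakes but are not a full email or international phone number check.
EMAIL_RE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")
PHONE_RE = re.compile(r"\+?\d{7,15}")


def is_valid_email(value: str) -> bool:
    """Return True if the trimmed value looks like name@domain.tld."""
    return EMAIL_RE.fullmatch(value.strip()) is not None


def is_valid_phone(value: str) -> bool:
    """Return True if the value is 7 to 15 digits, optionally prefixed with '+'.

    Spaces, hyphens, and parentheses are removed before matching.
    """
    compact = re.sub(r"[\s\-()]", "", value)
    return PHONE_RE.fullmatch(compact) is not None
```

While trying out a task, enter a few values that should fail (for example, an email address without a domain) and confirm that the interface rejects them.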

2. Detailed instructions

When testing your tasks, you'll be working with real data that you uploaded to the pool. This way, you can ensure that your instructions account for every possible scenario that Tolokers may encounter when completing the task.

Make sure that your instructions:

  • Include input data, task explanations, and examples of complex cases.
  • Are simple and clear enough that even Tolokers who have never done this type of task before can follow along with ease.

3. Task complexity

Tolokers perform better with short, uniform tasks. While you are trying out the tasks, note how long each one takes and how many steps are involved. If the task appears time-consuming, you might need to divide it into smaller tasks or reduce the number of answers.

4. Output format

Check the results and make sure the format meets your expectations. For example, if you're collecting phone numbers, make sure they're in the format you need. If you have multiple response options, check that the labels in the results match the names of your radio buttons.
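
The label check can be sketched in a few lines of Python. This assumes you have copied the answers from your test run into a plain list; the label set, the sample answers, and the function name are all made up for illustration.

```python
from collections import Counter

# Hypothetical values: replace with the labels your radio buttons should produce.
EXPECTED_LABELS = {"cat", "dog", "other"}


def unexpected_labels(responses: list[str], expected: set[str]) -> Counter:
    """Count responses whose value is not one of the expected labels."""
    return Counter(r for r in responses if r not in expected)


# Answers collected while trying out the tasks (made-up sample data).
sample = ["cat", "dog", "Dog", "other", "ohter"]
problems = unexpected_labels(sample, EXPECTED_LABELS)
```

Any entry in `problems` points to a mismatch between the interface and the output you expect, such as a capitalization difference or a misspelled value.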

If you discover issues that you can improve in the task, go back and edit your project settings. Tweak the task interface and instructions as much as you need to. Once that's done, your project is ready to launch.

We recommend that you always double-check new projects. You should also check previously created projects whenever you make changes to the interface or instructions.

If you need help with writing instructions, designing a task interface, or any other aspect of getting your Toloka project up and running, contact support or start a chat on the platform.
