Toloka documentation

Object recognition & detection

In this tutorial, you will learn how to run object recognition in Toloka. We will use a project preset designed specifically for this type of data labeling.

Object recognition is a type of data labeling task in which Tolokers are shown an image and an editor for selecting areas of it.

Tolokers look at the image and select the objects that you need to detect. After you collect all the labeled images, you can use the dataset to train computer vision models.

Prerequisites

Before you begin:

  • Make sure you are registered in Toloka as a requester.

  • Top up your Toloka account. If you are unsure about the budget, you can do that later in this tutorial. Toloka will display the budget estimate for your project.

Choose a preset

We recommend starting with a project preset for easier configuration and better results.

  1. Follow this link, or create a project manually:

    1. In the main menu, choose the Projects tab, and click Create a project.

    2. Select the Object recognition & detection preset.

  2. Click Choose this preset in the pop-up tab.

Create a project

Set up how your tasks will look for Tolokers. Tolokers are people around the world who get paid for completing your tasks.

  1. Under General information, add the project name and description.

    • Name to show Tolokers: In 2–5 words, state the general idea of the project.

    • Description for Tolokers: In a couple of sentences, explain what you expect Tolokers to do. This is just an overview. You will write instructions later.

  2. In the Task interface section, set up what your tasks will look like. This preset has a task template with validation, keyboard shortcuts, and task layout pre-configured.

    Note

    This tutorial uses Template Builder, but you can use the HTML/JS/CSS editor for the same purpose.

    1. Using the Visual editor, set up your task in the Config section:

      • Paste link to a sample image: This image is only used to display the task interface preview on the right.

      • Choose the shape for outlining objects in photos: All tasks in a project use the same shape.

    2. Select the I want to outline multiple types of objects checkbox if you need to detect more than one category of objects in an image. Replace the samples with your types:

      • Object name for Tolokers: This is the label that Tolokers will see. Make sure it is clear and correct.

      • Name in labeling results: This is the value you will see in the file with the labeling results.

    3. Raw task data is stored in XLSX, TSV, or JSON format. The labeling results are presented in a TSV file. The Data specification section determines which parameters these files might contain.

      Click Show specifications and check the values:

      • Input data: Parameters in the file with raw task data.

      • Output data: Parameters in the file with labeling results.

      Input data and Output data match the task interface you set up in Template Builder. Check that there are fields for all data types you use for your tasks, and for the ones you want to see in the results file.

  3. Under Instructions for Tolokers, add the instructions Tolokers will see when they start doing your tasks. You can add text, tables, and images to your instructions.

    In this type of project, Tolokers will select objects in images with the shape you’ve specified in the Config section.

    Check the sample text of the instructions, and update it to fit your project.

    Tip

    When writing instructions, remember that most Tolokers don’t know anything about your tasks beforehand. Make sure your instructions are as clear as possible, but not too wordy. For successful data labeling, try to strike a balance between covering all the essentials and keeping it short. Learn more in our knowledge base.

  4. To save your data and continue, click Create a project.

Create a pool

A pool is a set of tasks sent out to Tolokers at the same time. One project can have many pools. When creating a pool, you set up pricing, audience filters for Tolokers, and quality control.

  1. Click Create new pool.

  2. Under General information, set the Pool name.

  3. Under Audience, set up filters to select Tolokers for your pool.

    1. Clear the My tasks may contain shocking or pornographic content option if your project has no such content.

    2. To select Tolokers based on their language, location, age, gender, and other parameters, click the Add filter button.

      For example, add the Languages filter.

    3. Tasks in pools are automatically available in both the web version of Toloka and the mobile app. To show the tasks in only one of them, add the Client filter and select the value you need: Toloka web version or Toloka for mobile.

    4. Use the Speed/quality balance slider to change the number of Tolokers who can see your tasks. Move the slider to the right to exclude Tolokers with lower ratings from participating in your project.

  4. Under Quality control, set quality control rules for more accurate results.

    1. Click the Non-automatic acceptance toggle, and in the Review period in days field, specify how many days you need to review completed assignments.

      What is non-automatic acceptance (assignment review)?

      The non-automatic acceptance option allows you to review completed assignments before accepting them and paying for them. If the Toloker didn't follow the instructions, you can reject the assignment.

    2. Keep the pre-configured Fast responses rule as is. This rule filters out Tolokers who complete tasks too fast. The default settings mean that Tolokers are banned from the project for 1 day if they complete tasks in 4 out of 5 task suites in less than 15 seconds.

    3. Delete the pre-configured Majority vote rule.

    4. Click Add a quality control rule → Results of assignment review, and enter values that match the following behavior.

      If 35% or more of a Toloker's responses are rejected, the Toloker is banned and can't access your tasks for 15 days. The rule takes effect after 3 of the Toloker's responses are reviewed.

  5. Click Show advanced settings. In the Additional settings, specify Time per task suite, sec. It should be long enough to read the instructions and wait for the task data to load (for example, 1,200 seconds).

  6. In Price, set how much a single task will cost you.

    1. In Price per task suite, set the amount of money to pay per task suite done by one Toloker. A task suite is a page with a number of tasks. It can contain one or several tasks.

      If the tasks are simple, you can add 8–10 tasks per suite.

    2. In the Overlap field, define how many Tolokers must do each task.

      For image area selection tasks, it is usually 1. This means that each task will have 1 response.

  7. To save the settings and continue, click Create pool.
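As a rough illustration of how the pricing settings above interact, the amount paid for a pool can be estimated from the number of tasks, the number of tasks per suite, the overlap, and the price per task suite. The sketch below is not a Toloka API: the function name and the sample numbers are hypothetical, and it ignores any platform fee, so treat the budget estimate that Toloka displays as the authoritative figure.

```python
import math


def estimate_pool_cost(num_tasks, tasks_per_suite, overlap, price_per_suite):
    """Estimate what Tolokers are paid for a pool, before any platform fee.

    Hypothetical helper for illustration only; not part of any Toloka API.
    """
    # The last suite may be only partly filled, so round up.
    suites = math.ceil(num_tasks / tasks_per_suite)
    # Each suite is completed by `overlap` different Tolokers.
    return suites * overlap * price_per_suite


# Assumed numbers: 100 images, 8 tasks per suite, overlap 1, $0.05 per suite.
cost = estimate_pool_cost(num_tasks=100, tasks_per_suite=8, overlap=1, price_per_suite=0.05)
print(f"Estimated cost: ${cost:.2f}")
```

Putting more tasks in a suite or lowering the overlap reduces the number of paid suites, which is why both settings matter as much as the price itself.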

Upload data

At this step, upload your task data to Toloka.

  1. Click Upload data.

  2. Attach a prepared dataset or media files.

    1. To download a template, click one of the buttons:

      • Template in XLSX
      • Template in TSV
      • Template in JSON

      For this type of project, the file with tasks must have one parameter named INPUT:image, whose values are links to the images.

      INPUT:image
      https://tlk.s3.yandex.net/sdc/photos/0c7b15d0f4e5db0416189afe6486f3cb.jpg
      https://tlk.s3.yandex.net/sdc/photos/3e5d32e31dfd50fe826fd46b761481cf.jpg
      https://tlk.s3.yandex.net/sdc/photos/0fb80721932aa59ecc684141433170b6.jpg
      
    2. Open the downloaded file, and replace the sample links with links to your images.

    3. Click Select prepared dataset, and upload the file you’ve just made.

    Alternatively, upload media files directly from your device so that you don't have to put links in the file with tasks.

    To the right of the Select prepared dataset link, click Select media files and choose files.

    Features:

    • This option is available only for the projects that have a single url type input field in the specification.

    • After uploading, tasks are automatically created with links to the uploaded files.

    Note

    Supported file formats:

    • Images: GIF, JPG, JPEG, PNG, WebP.
    • Audio: FLAC, MP3, WAV, M4A.
    • Video: MP4.

  3. Click Continue.

  4. Tasks are shown to Tolokers in suites. A suite is a single page with multiple tasks. Define how many tasks to include per suite:

    • General tasks: These are tasks for Tolokers to label.

    • Control tasks: These are tasks with predefined answers used to control the quality of responses. You don’t need control tasks in area selection projects.

    • Training tasks: These are tasks with predefined answers and explanations for Tolokers. Normally you use training tasks in separate training pools. You don’t have to include them.

    For example, you can add 8 general tasks per suite.

  5. Click Combine tasks into suites.
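If you have many images, the TSV file with tasks described in step 2 can be generated with a short script instead of being edited by hand. This is a minimal sketch, assuming your images are already hosted at public URLs. The function name and the example.com links are placeholders; only the INPUT:image column name comes from this tutorial.

```python
import csv
import io


def build_tasks_tsv(image_urls):
    """Return the text of a TSV task file with a single INPUT:image column."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["INPUT:image"])  # header expected by this project type
    for url in image_urls:
        writer.writerow([url])  # one task per image link
    return buffer.getvalue()


# Placeholder links; replace them with links to your own images.
urls = [
    "https://example.com/photos/1.jpg",
    "https://example.com/photos/2.jpg",
]
with open("tasks.tsv", "w", encoding="utf-8", newline="") as tasks_file:
    tasks_file.write(build_tasks_tsv(urls))
```

The resulting tasks.tsv can then be uploaded with Select prepared dataset.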

Start labeling

  1. Make sure you have topped up your account.

  2. To send the tasks to Tolokers and begin the labeling process, click Start labeling.

  3. In the pop-up panel, review the budget and click Launch.

See the results

Track the labeling progress on the pool page. You can start the review when the first results are received.

If you don't review a response within the review period you set for the pool, it is automatically accepted, regardless of its quality.

  1. Go to the pool, and click Review assignments.

  2. Choose an assignment.

  3. Check the responses, and click Accept or Decline. For rejected responses, enter a comment to specify the reason.

    Note

    To learn about other ways of review, see the Reviewing Tolokers' responses section.

  4. After checking all the assignments, click Download results.

    You will get the TSV file with the labeling results. The point coordinates in the file are presented in JSON.
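Because the point coordinates are stored as JSON inside a TSV cell, the downloaded file needs two parsing steps. The sketch below shows one way to do this. It assumes that the output field is named result (so the column is OUTPUT:result) and that each cell holds a JSON list of shapes. Both are assumptions, so check the header and a sample row of your own file and adjust the names.

```python
import csv
import io
import json


def parse_results(tsv_text, image_column="INPUT:image", output_column="OUTPUT:result"):
    """Return a list of (image_url, shapes) pairs from a results TSV.

    The column names are assumptions; pass your own if they differ.
    """
    # QUOTE_NONE keeps the double quotes inside the JSON cell untouched.
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE)
    parsed = []
    for row in reader:
        raw = row.get(output_column)
        if not raw:
            continue  # no response recorded in this row
        parsed.append((row.get(image_column), json.loads(raw)))
    return parsed


# Hypothetical one-row file; real files usually contain more columns.
sample = (
    "INPUT:image\tOUTPUT:result\n"
    'https://example.com/a.jpg\t[{"shape": "rectangle", "left": 0.1, "top": 0.2, "width": 0.3, "height": 0.4}]\n'
)
for image_url, shapes in parse_results(sample):
    print(image_url, [shape["shape"] for shape in shapes])
```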

Troubleshooting

Do I need to convert all the images in the task to the same size or can they be different?

You can use different image sizes.

How do I mark up triangles so that they close automatically when the third point is selected?

Use the X shortcut for closing areas.

You can customize the shortcuts using the properties of the plugin.field.image-annotation.hotkeys component.

How do I create a task for selecting objects in images?

We recommend that you break down your object selection project into three projects in Toloka:

  1. Sorting images containing an object.

    1. Create a task using the Image classification preset.

    2. Sort the images containing the object you are looking for.

    3. Show the image to the Toloker and ask if the object is in the image. Response options: Yes/No.

  2. Selecting objects in images.

    1. Select the object in the images you obtained after the previous project. You already have such a project. Run the task with non-automatic acceptance.

    2. Use the quality control rules: fast responses, non-automatic acceptance, and post-review re-assessment. For details, see the description of rules with examples.

  3. Reviewing object selection assignments.

    1. Create a task using the Object recognition & detection preset.

    2. Hide the editor and ask whether the object is selected correctly. Response options: Yes/No.

    3. In the input data, pass the images and coordinates of the labeled objects from the previous task.

    Now you can run the resulting pool with an overlap of 3–5 or with dynamic overlap. After that, you can aggregate the results and then upload the data for review to Project 2.

    To prevent the users who worked on the second project from doing the review, assign a skill to them. Use this skill as a filter in the pools of the third project.
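When the review project runs with an overlap of 3–5, each selection receives several Yes/No answers that have to be combined into one verdict. One simple way to aggregate them is a majority vote, sketched below. The function name and the answer values are hypothetical, and returning None on a tie is a design choice of this sketch rather than Toloka behavior.

```python
from collections import Counter


def majority_vote(answers):
    """Return the most frequent answer, or None if there is no clear winner."""
    if not answers:
        return None
    ranked = Counter(answers).most_common()
    # A tie between the two most frequent answers gives no verdict.
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]


# Hypothetical review answers for one labeled image (overlap of 3).
verdict = majority_vote(["yes", "yes", "no"])
print(verdict)
```

An odd overlap avoids ties for Yes/No questions. Selections without a clear verdict can be sent for another round of review.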

I have a task for area selection in an image. What should the Toloker do if there is no selectable object in the image?

Main options:

  • Select an arbitrary area in the image (for example, put a square in the upper-right corner). In this case, the project instructions for reviewers should also reflect this.

  • Ask the Toloker to skip the task and report it in a personal message. Messages are reviewed by the requester. If the object is truly missing, the task is deleted from the pool by resetting the overlap.

  • Add an additional No object checkbox to the task interface. Make sure that your interface checks that either the object is selected or the checkbox is enabled. In this case, add information about the checkbox value in the review task interface.
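The check described in the last option can be expressed as a small rule: a response is valid only if the Toloker either selected at least one area or enabled the checkbox. The Python sketch below only illustrates that logic. In a real project the check belongs in the task interface configuration, and the function and parameter names here are hypothetical. Rejecting a response that has both a selection and the checkbox enabled is an extra assumption of this sketch.

```python
def is_valid_response(selected_areas, no_object_checked):
    """Return True if exactly one of the two ways of answering was used."""
    has_selection = len(selected_areas) > 0
    # Valid: areas selected and checkbox clear, or no areas and checkbox set.
    return has_selection != bool(no_object_checked)


# Hypothetical responses.
print(is_valid_response([{"shape": "rectangle"}], no_object_checked=False))
print(is_valid_response([], no_object_checked=False))
```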

How much would 2000 images with a large number of different types of selectable objects cost? How do I create a task for this amount of work?

In the case of crowdsourcing, it's better to break down this task. The simpler the task, the cheaper it is and the better the quality of the final result. The cost of labeling a single class of objects in photos might be about $0.01.

Base your task on the Object recognition & detection preset. See the step-by-step guide for creating this type of project in the tutorial.

How do I implement selection of 3 different areas in an image? Select the name, image, and price in the product page screenshot.

In the Config section on the project settings page, use the Visual editor and select the I want to outline multiple types of objects checkbox. Replace the samples with your categories.

What are the input data in the case of object labeling in an image: the coordinates of the object relative to the image, or the coordinates of the object in the Toloka user window?

The coordinates are relative to the image.
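Many training pipelines expect pixel coordinates, so relative coordinates usually need to be converted. The sketch below assumes that each value is a fraction between 0 and 1 of the image width or height and that a rectangle is stored with left, top, width, and height keys. Both points are assumptions, so verify them against a row of your own results before relying on the conversion.

```python
def rectangle_to_pixels(rect, image_width, image_height):
    """Convert a rectangle with relative coordinates to integer pixel values.

    Assumes values are fractions (0 to 1) of the image size.
    """
    return {
        "left": round(rect["left"] * image_width),
        "top": round(rect["top"] * image_height),
        "width": round(rect["width"] * image_width),
        "height": round(rect["height"] * image_height),
    }


# Hypothetical rectangle on a 1000 x 500 pixel image.
box = rectangle_to_pixels(
    {"left": 0.1, "top": 0.2, "width": 0.3, "height": 0.4},
    image_width=1000,
    image_height=500,
)
print(box)
```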

How do I use control and training tasks in the standard template with an area selection editor?

In the standard template with an area selection editor, you can't use control tasks: for the system to count a response as correct, the area selected by the Toloker must exactly match the control area, which is almost impossible. Therefore, leave the GOLDEN field empty in the task file, or simply delete all the columns except INPUT.

For the same reason, you also can't use training pools or a main pool with the Training type in an area selection project.

Such tasks are usually run with non-automatic acceptance: the Toloker submits an assignment, and then the assignment is rejected or accepted after the review.

To pre-select Tolokers, you can use “examination tasks”: review the assignments and assign skills based on the percentage of accepted assignments. For this purpose, add the Results of assignment review rule to the pool. To make sure that only the good Tolokers are admitted to the main pool, add a skill-based filter to that pool.
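The skill assigned from review results is essentially the percentage of accepted assignments, and the skill-based filter is a threshold on that value. The sketch below illustrates the arithmetic only; it is not how Toloka computes skills internally. The function names and the threshold of 80 are hypothetical, and the minimum of 3 reviewed assignments mirrors the example rule from the pool setup.

```python
def acceptance_skill(review_results, min_reviewed=3):
    """Return the percentage of accepted assignments, or None if too few were reviewed.

    review_results is a list of booleans: True for accepted, False for rejected.
    """
    if len(review_results) < min_reviewed:
        return None  # not enough reviewed assignments to judge
    return 100 * sum(review_results) / len(review_results)


def is_admitted(skill, threshold=80):
    """Mimic a skill-based filter: admit only Tolokers at or above the threshold."""
    return skill is not None and skill >= threshold


# Hypothetical review history: 3 accepted, 1 rejected.
skill = acceptance_skill([True, True, False, True])
print(skill, is_admitted(skill))
```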

See also

Datasets and reference