Image classification

  1. Create a project
  2. Add a task pool
  3. Upload tasks
  4. Set up quality control
  5. Add training
  6. Start the pool and get the results

Classification projects are intended for multiple-choice tasks, such as moderating content or grouping images by category.

You may need additional projects for your task, such as dataset pre-check or checking performers' responses. Learn more about this in Decomposition of the task.

Let's say you want to group the photos of cats according to the cat's mood. Create the following task: the performer looks at the image and selects a response option.

Tip. Run the project in the Sandbox first. This helps you avoid making mistakes and spending money on a task that isn't working right.

Example of a prepared task


To run tasks and get responses:

Create a project

The project defines what the task will look like for a performer.

  1. Choose a template:

    1. Click Create project.

    2. Select the Image classification template.

  2. Provide general information:

    1. Enter a clear name and a short description for the project. Performers will see this in the task list.

    2. Optionally add a Private comment.

    3. Click Save.

  3. Edit the task interface in the editor you selected:

    • Template Builder
    • HTML/CSS/JS editor

    In Template Builder:

    1. The task interface describes how the elements should be arranged in the task.

      Use the ready-made template for this project with pre-configured validation, keyboard shortcuts, and task layout. The performer won't be able to submit a response without selecting one of the options.

      For more information, see the Template Builder Help.

    2. To specify what data you will pass to the performer and receive in response, create input and output data fields.

      What are input and output data?

      Input data is the objects passed to the performer for completing the task. For example, this could be text, an image, or geographic coordinates.

      Output data is the objects you receive after the task is completed. For example, this could be one of several response options, typed text, or an uploaded file.

      Input data fields are created from the code on the Example of input data tab.

      The output data fields depend on the components that use data.output and values supported by it.

      Click Show specifications to see the input and output data fields.

      Learn more about input and output data fields in the Template Builder Help.

      In this project:

      • Input data field: image — A link to an image.

        Change the data type to string to add links to your files.

      • Output data field: result — string for saving the performer's response.

    In the HTML/CSS/JS editor:

    1. The task interface describes how the elements should be arranged in the task.

      In the HTML interface, use the standard HTML tags and special expressions in double curly brackets for input and output data fields.

      1. Edit the HTML block in the Task interface. After the image line, add the question:
        {{img src=image width="100%" height="400px"}} <div>What's the <b>cat's mood</b>?</div>
      2. For this project, leave the JS and CSS blocks unchanged.

    2. In the Data specification section, you can configure the input and output data fields.

      In this project:

      • Input data field: image — A link to an image.

        Change the data type to string to add links to your files.

      • Output data field: result — string for saving the performer's response.

    3. Click the preview button to see the performer's view of the task.

      Note. The project preview shows one task with standard data. You can define the number of tasks to show on the page later.

      Check if the task options work correctly. In the lower-right corner, click Submit.

      To exit preview mode, click Exit in the lower-left corner. If there were errors when testing, check the code blocks that you entered.

    4. Save the changes.

  4. Write instructions for performers:

    1. Write short, clear instructions. Describe what needs to be done and give examples.

      You can prepare the instructions in HTML format, then copy and paste them into the editor. Click <> to switch to HTML mode.

    2. Click Finish.

Learn more about working with the project in Project.
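
The data fields described above (an image link as input, a result string as output) can be checked before you upload anything. The following is a minimal Python sketch, not part of Toloka: the helper check_fields, the two spec dictionaries, the example URL, and the response value "good" are all hypothetical and only mirror the field names used in this project.

```python
# Hypothetical mirror of this project's data specification:
# one input field (image) and one output field (result), both strings.
INPUT_FIELDS = {"image": str}
OUTPUT_FIELDS = {"result": str}


def check_fields(record: dict, spec: dict) -> list:
    """Return a list of problems found in a record; an empty list means it fits the spec."""
    problems = []
    for name, expected_type in spec.items():
        if name not in record:
            problems.append(f"missing field: {name}")
        elif not isinstance(record[name], expected_type):
            problems.append(f"{name} should be {expected_type.__name__}")
    for name in record:
        if name not in spec:
            problems.append(f"unexpected field: {name}")
    return problems


task_input = {"image": "https://example.com/cats/1.jpg"}  # made-up link
performer_output = {"result": "good"}                     # made-up response option

print(check_fields(task_input, INPUT_FIELDS))         # []
print(check_fields(performer_output, OUTPUT_FIELDS))  # []
print(check_fields({"photo": "x"}, INPUT_FIELDS))
```

A check like this catches misspelled field names early, which is cheaper than finding them after the pool has started.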

Add a task pool

A pool is a set of paid tasks sent out for completion at the same time.

  1. Open the project and click Add pool.
  2. Give the pool any convenient name. It is available only to you — the performer will only see the name of the project.
  3. Set the price per task suite (for instance, $0.02).
    What is a task suite?

    A task suite can contain one or several tasks that are shown on the same page. If the tasks are simple, you can add 10-20 tasks per suite. Don't make task suites too long because it slows down loading speed for performers.

    Performers get paid for completing the entire task suite.

    The number of tasks per suite is set when uploading tasks.

    What is a fair price for a task suite?

    The general pricing rule: the more time the performer needs to complete the task, the higher the price should be.

    You can register in Toloka as a performer and find out how much other requesters pay for tasks, or see examples of cost for different types of tasks.

  4. Set the Time allowed for completing a task suite (for example, 600 seconds). It should be long enough to read the instructions and wait for task data to load.
  5. Set Overlap, which is the number of performers who complete the same task. For classification tasks, 3 is enough.
  6. Add Filters to select performers. To make your task available only to English-speaking users, set filters by language and country detected by the phone number.
  7. Save the pool.
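
To get a feel for how price per task suite, suite size, and overlap combine, here is a rough Python sketch of a budget estimate. It is an illustration only: the function estimate_pool_cost is hypothetical, and the estimate ignores control tasks that take up slots on a page and any platform fees, neither of which is covered here.

```python
import math


def estimate_pool_cost(num_tasks: int, tasks_per_suite: int,
                       overlap: int, price_per_suite: float) -> float:
    """Rough payout estimate: suites needed x overlap x price per suite."""
    suites = math.ceil(num_tasks / tasks_per_suite)
    return round(suites * overlap * price_per_suite, 2)


# 1000 images, 10 tasks per page, overlap 3, $0.02 per task suite
print(estimate_pool_cost(1000, 10, 3, 0.02))  # 6.0
```

Because performers are paid per suite, a partly filled last page still costs a full suite, which is why the number of suites is rounded up.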

Upload tasks

Prepare your own task file. Check out the example in the demo TSV file. In the Toloka interface under the pool name, there are links to TSV file templates for regular, control, and training tasks.

  1. Click Upload. In the window that opens, you can also download a sample TSV file by clicking Sample file for uploading tasks.

    What is TSV?
    A TSV file presents a table as a text file in which columns are separated by tabs.
    You can work with it both in a table editor and a text editor, and then save it to the desired format. More about working with a TSV file. There is a CSV format that is similar to TSV, but you should use a TSV file for uploading.
    Note. Before uploading the file, make sure it is saved in UTF-8 encoding.
  2. Add input data to it. The header of the input data column contains the word INPUT. Leave the other columns empty.
  3. Upload the tasks using Smart mixing and enter the number of tasks per page. For example: 9 main tasks and 1 control task.
    What is smart mixing?
    Smart mixing randomly generates pages with tasks so that tasks are not repeated for each performer.
  4. Add control tasks. To do this, click the Edit button and give the correct responses for several tasks.
    Note.

    If you selected something else instead of smart mixing, click Edit. If this button is missing, delete the file and upload it again.

    What are the control tasks?

    Control tasks are tasks with the correct response known in advance. They are used to track the performer's quality of responses. The response you provided is compared to the performer's response. If they match, it means the performer answered correctly.

    Control tasks should make up at least 1% of the total number of tasks. This means that for 1000 tasks you should add at least 10 control tasks.

    More about control tasks.
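
If your image links are stored in a list, a task file can be generated rather than typed by hand. The Python sketch below uses only the standard csv module. The header INPUT:image is an assumption (the word INPUT plus this project's field name), so compare it with the TSV template linked under the pool name. The sketch writes only the input column, and the example URLs are made up.

```python
import csv
import io


def build_tasks_tsv(image_urls: list) -> str:
    """Return TSV text: one header row, then one task (image link) per line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["INPUT:image"])  # assumed header for the image input field
    for url in image_urls:
        writer.writerow([url])
    return buffer.getvalue()


tsv_text = build_tasks_tsv([
    "https://example.com/cats/1.jpg",
    "https://example.com/cats/2.jpg",
])
print(tsv_text)
```

To save the result in the encoding the upload expects, write tsv_text to a file opened with encoding="utf-8".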

Set up quality control

Quality control rules allow you to filter out inattentive performers. You can configure quality control both in the project and in the pool.

Attention.

Quality control settings are applied to all project pools, so you can't change them in just one of the pools.

When you clone a project, its quality control settings aren't transferred.

    Go to pool editing (the Edit button in the upper-right corner of the page) and click Add Quality Control Rule.

    You can copy quality control settings from another pool. To do this, click Copy settings from in the Users filter section.

  1. Add the Control tasks section and specify the following values:

    This means that a performer whose share of incorrect responses to control tasks exceeds 40% will be blocked for five days and won't be able to complete tasks in this project.

  2. Add a restriction for Fast responses.

    The Minimum time per page value depends on the number of tasks on this page. It takes 2-4 seconds to identify the cat's mood. This means that a page with 10 tasks may take 20-40 seconds to complete.

    A performer can make an accidental mistake once in a while, but after 2-3 repeated mistakes you can ban the performer for a while.

    Specify the following values:

    This means that a user who submits two assignments, each in less than 20 seconds, will be blocked for 10 days and won't be able to access your tasks.

    Attention. The assignments submitted by banned performers will be taken into account if they are not rejected manually using assignment review. They can be reassigned by setting up the Recompletion of assignments from banned users rule.
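
The logic of the two rules above can be written out as a short Python sketch. This is not how Toloka implements the rules; the names Ban, control_tasks_rule, and fast_responses_rule are hypothetical, and the thresholds are the ones used in this example (more than 40% incorrect control responses, two pages faster than 20 seconds).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Ban:
    reason: str
    days: int


def control_tasks_rule(correct: int, incorrect: int,
                       max_incorrect_pct: float = 40.0,
                       ban_days: int = 5) -> Optional[Ban]:
    """Ban when the share of incorrect control responses exceeds the limit."""
    total = correct + incorrect
    if total == 0:
        return None  # no control tasks answered yet
    if 100.0 * incorrect / total > max_incorrect_pct:
        return Ban("control tasks", ban_days)
    return None


def fast_responses_rule(page_times_sec: list, min_time_sec: float = 20,
                        max_fast_pages: int = 2,
                        ban_days: int = 10) -> Optional[Ban]:
    """Ban when enough pages were submitted faster than the minimum time."""
    fast_pages = sum(1 for t in page_times_sec if t < min_time_sec)
    if fast_pages >= max_fast_pages:
        return Ban("fast responses", ban_days)
    return None


print(control_tasks_rule(correct=5, incorrect=5))  # 50% incorrect: banned for 5 days
print(control_tasks_rule(correct=6, incorrect=4))  # exactly 40%: None
print(fast_responses_rule([25, 12, 18]))           # two fast pages: banned for 10 days
```

Note that a performer at exactly 40% incorrect responses is not banned, because the rule triggers only above the threshold.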

Add training

Create a training pool:

  1. Open the project page.

  2. Go to the Training tab.

  3. Click the Add training button.

  4. Fill in the training settings fields.

    You can use the Retry after field to set up repeated training.
  5. Click Save training.
After you create a training pool:
  1. Get the task template (TSV) or edit the one you used for uploading the main pool tasks.
    Note. TSV files for all project pools have the same structure.
  2. Add links to images for the training tasks in the TSV file.
  3. Upload the file and specify the number of tasks on the page. For example, 10. This number must not exceed the number of tasks per page in the main pool.
  4. Click Upload and enter the number of training tasks on the page.
  5. Click Add.
  6. Click Mark up and then Create training tasks. Next, add correct answers and hints for all the uploaded tasks.
  7. After the file is uploaded, open the Preview and check that the tasks are displayed correctly.
  8. Open the main pool with tasks, link Training to it, and set the Level required to 55. This means that the main pool will be available to performers who made mistakes in no more than 45% of the training tasks.

    To link the training pool, go to the main pool editing mode and select your training pool in the Training parameter drop-down list.

Learn more about creating a pool with training.
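
The Level required setting can be read as a simple percentage check. The Python sketch below is an assumption about how the level relates to training answers (share of correct answers, as a percentage); the function passes_training is hypothetical and not a Toloka API.

```python
def passes_training(correct: int, total: int, level_required: int = 55) -> bool:
    """True if the share of correct training answers reaches the required level."""
    if total == 0:
        return False  # no training tasks completed
    return 100 * correct / total >= level_required


print(passes_training(6, 10))  # True: 60% correct
print(passes_training(5, 10))  # False: 50% correct
```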

Start the pool and get the results

  1. Start the pool by clicking the start button.
  2. Track the completion of tasks in the Pool statistics section.
  3. When the pool is completed, launch aggregation of results. To do this, find the Download results button, open the menu next to it, and select Dawid-Skene aggregation model.

    Aggregation of responses is necessary to get a complete picture of all results. Learn more about aggregation.

  4. Track the aggregation progress on the Operations page. When the process is completed, click Download.
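
To see what aggregation does with overlapping responses, here is a Python sketch of simple majority voting. This is deliberately not the Dawid-Skene model: Dawid-Skene also estimates how reliable each performer is, while majority voting only counts responses. The function majority_vote, the image names, and the labels are hypothetical.

```python
from collections import Counter, defaultdict


def majority_vote(responses):
    """Map each image to (most common label, share of votes for it).

    responses: iterable of (image, label) pairs, one pair per performer response.
    If labels are tied, the one seen first wins.
    """
    labels_by_image = defaultdict(list)
    for image, label in responses:
        labels_by_image[image].append(label)

    aggregated = {}
    for image, labels in labels_by_image.items():
        label, votes = Counter(labels).most_common(1)[0]
        aggregated[image] = (label, votes / len(labels))
    return aggregated


# Overlap 3: each image was labeled by three performers.
responses = [
    ("cat1.jpg", "good"), ("cat1.jpg", "good"), ("cat1.jpg", "bad"),
    ("cat2.jpg", "bad"), ("cat2.jpg", "bad"), ("cat2.jpg", "bad"),
]
for image, (label, share) in majority_vote(responses).items():
    print(image, label, round(share, 2))
```

The share of votes gives a rough confidence value: with overlap 3, a 2-to-1 split is a signal that the image may be ambiguous.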

Troubleshooting

Why do I see a syntax error when I upload a task where a user has to view an image and write feedback?

The error might occur if the expected input type is URL, but a string is received.

There may be two reasons:
  • The input field has the "link" type.
  • The pool was created for an outdated project version. It means that the pool was created before you changed the input field type.
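
If the input field expects links, it can help to check the values in your file before uploading. The Python sketch below is a rough pre-check using the standard urllib.parse module. The rule it applies (an http or https scheme and a host name) is an assumption and may be stricter or looser than the validation Toloka performs.

```python
from urllib.parse import urlparse


def looks_like_url(value: str) -> bool:
    """Rough check: the value has an http(s) scheme and a host name."""
    parts = urlparse(value.strip())
    return parts.scheme in ("http", "https") and bool(parts.netloc)


values = ["https://example.com/cats/1.jpg", "cats/2.jpg", ""]
for value in values:
    print(repr(value), looks_like_url(value))
```
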
I have a task for photo classification. When there are more than 5 photos on the page, why does Toloka split them across 2 pages?

Toloka will split the links to images in the uploaded file into task suites depending on the method you specified when uploading the TSV file. For more information about the three upload methods, see the Guide.

How do I make an image expand to its maximum size on click?

To the component that inserts the image, add the parameters: real-size=true and screenshot=true.

Do I need to convert all the images in the task to the same size or can they be different?
You can use different image sizes.