Break your task down into steps until each separate level is clear enough for any performer to handle.

Decomposition means breaking a task down into parts: you replace one large problem with a series of smaller, separate problems that are easier to solve and can be completed by different performers. It is the first step to take when planning a new crowdsourcing project.

Benefits of decomposition

Before we delve into details, let us introduce a running example that we will use throughout the article. Suppose we have a dataset of images and our goal is to classify each image as a cat or a dog. In the crowdsourcing paradigm, the task is distributed across multiple workers, each of whom labels a subset of the images. As a result, each image receives several labels, and the goal of the aggregation method is to combine these noisy labels into a single high-quality answer. For brevity, this article considers the binary classification task and gives references to the literature where the multiclass case is described.

  • Makes it easier to review completed tasks
  • Reduces the number of errors
  • Lowers the entry threshold for performers: the easier the task, the more people will complete it correctly and quickly
  • Saves money: you spend less on the project overall

How is this possible?

It may seem counterintuitive, but splitting one task into several smaller ones helps lower the project budget. This happens because smaller tasks are easier to complete without mistakes, so we don't need to re-evaluate them as often as the complex ones.

                        Cost    Tasks for re-evaluation    Active users
One complex task        100%    15-25%                     600
Several smaller tasks   70%     10-15%                     11000

How do I know that a task needs to be decomposed?

If your task offers a choice of 3-5 answers and the instructions fit on one page without scrolling, then most likely your task doesn't need to be decomposed. In all other cases, you should probably try to break down the task. You can also discover when you need to decompose by running short experiments. If your task is taken up very slowly or all the performers are filtered out due to low skill levels, but you don't see any problems with your control tasks, you can assume that the task is too complex.

Ways to decompose a task

Decomposing a complex challenge

If your task is aimed at getting an answer to a complex question, try dividing it into a series of simple ones that are easy to answer and independent of each other. For example, instead of asking whether a tech support specialist gave a “good” or “bad” answer, ask if the response was detailed, friendly and grammatically correct.
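As a rough illustration, the tech support example can be modeled as three independent yes/no questions whose answers are combined afterwards. The class name, field names, and the rule "good only if all three checks pass" are assumptions made for this sketch, not something the article prescribes.

```python
from dataclasses import dataclass


@dataclass
class SupportAnswerChecks:
    """Answers to three simple, independent questions about one response."""
    detailed: bool
    friendly: bool
    grammatical: bool


def overall_quality(checks: SupportAnswerChecks) -> str:
    # Hypothetical combination rule: "good" only if every check passed.
    if checks.detailed and checks.friendly and checks.grammatical:
        return "good"
    return "bad"


print(overall_quality(SupportAnswerChecks(True, True, True)))   # good
print(overall_quality(SupportAnswerChecks(True, False, True)))  # bad
```

Because each question is answered separately, you can also see which aspect of a response failed, instead of only getting a single verdict.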

Decomposing a multi-task

If your task involves answering a series of questions at once, try asking them successively, one question at a time. For example, if you have a set of pictures and need traffic lights outlined on them, first ask whether the picture contains a traffic light and then, only if the answer is yes, ask the performer to outline it. The best practice here is to use two different projects for collecting the data.
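The two-project flow for the traffic light example might be wired together roughly as follows. This is a minimal sketch: the image IDs and the function name are hypothetical, and the first project's results are assumed to be already aggregated into one yes/no answer per image.

```python
from typing import Dict, List


def select_for_outlining(presence_answers: Dict[str, bool]) -> List[str]:
    """Return the IDs of images that the first project marked as containing a traffic light."""
    return [image_id for image_id, has_light in presence_answers.items() if has_light]


# Hypothetical results of project 1: "Does the picture contain a traffic light?"
stage_one = {"img_001": True, "img_002": False, "img_003": True}

# Only these images are sent to project 2: "Outline the traffic light."
stage_two_inputs = select_for_outlining(stage_one)
print(stage_two_inputs)  # ['img_001', 'img_003']
```

Images without a traffic light never reach the outlining project, so performers there are not shown tasks they cannot complete.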

Decomposing a multitude of options

Sometimes there's only one question that needs to be answered, but there are too many possible answers and it's difficult to remember all the rules about them at once. If there are more than 10 options to choose from, we recommend grouping them thematically, so that a performer first makes a general choice of theme and then chooses within a smaller variety of answers. Best practice here is to support the successive classification with a clean interface that displays only the necessary options.
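One way to picture thematic grouping is a two-level mapping from themes to options, where the interface shows only one level at a time. The themes and options below are invented for illustration.

```python
from typing import Dict, List

# Hypothetical set of 12 answer options, grouped into 3 themes.
OPTION_GROUPS: Dict[str, List[str]] = {
    "Clothing": ["Shirts", "Trousers", "Shoes", "Hats"],
    "Electronics": ["Phones", "Laptops", "Headphones", "Cameras"],
    "Home": ["Furniture", "Kitchenware", "Bedding", "Lighting"],
}


def first_step_options() -> List[str]:
    """Step 1: the performer chooses a general theme."""
    return list(OPTION_GROUPS)


def second_step_options(theme: str) -> List[str]:
    """Step 2: the performer sees only the options within the chosen theme."""
    return OPTION_GROUPS[theme]


print(first_step_options())                # ['Clothing', 'Electronics', 'Home']
print(second_step_options("Electronics"))  # ['Phones', 'Laptops', 'Headphones', 'Cameras']
```

In this sketch the performer never faces more than four choices at once, even though there are twelve options in total.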

Decomposing a crowdsourcing project itself

Collecting crowd data involves more than just setting up a task for performers. You also need to set up control mechanisms to maintain good quality. If the best control method is human evaluation, try adding a post-verification project where performers will check tasks completed by other performers.

Real-life crowdsourcing projects normally demand a combination of various decomposition techniques.
Here is an example:

Let's say your task is to regularly update information about local businesses so that your list stays current. You give performers an offline task: find a particular business, check its address and opening hours, and provide a photo. After the task is complete, you find out that some answers are only partly correct: some performers didn't provide a good-quality photo, while others got the opening hours wrong. How can you clean up this data? Do you need to pay the performers who were only partly correct? Where do you get the extra budget to relabel the objects with missing data? You can decompose this task by splitting it into three independent projects, each collecting one simple piece of information, so that performers don't get confused by multitasking:

  •  An entrance photo
  •  Address
  •  Opening hours

This allows you to use simple quality control mechanisms, choose performers who are better at each individual task, and save money on relabeling incorrect data.
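The split into three projects can be sketched as generating one small task per business per field. The field names, business IDs, and task structure below are hypothetical placeholders, not a real project format.

```python
from typing import Dict, List

# One project per piece of information, as in the list above.
FIELDS = ["entrance_photo", "address", "opening_hours"]


def split_into_projects(business_ids: List[str]) -> Dict[str, List[dict]]:
    """Create a separate task list for each field instead of one combined task."""
    return {
        field: [{"business_id": b, "requested_field": field} for b in business_ids]
        for field in FIELDS
    }


projects = split_into_projects(["biz_1", "biz_2"])
for field, tasks in projects.items():
    print(field, len(tasks))
```

With this layout, a wrong answer about opening hours means redoing one task in one project, while the accepted photo and address for the same business are kept.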


Decomposition and…

When we say that decomposition is the key, we mean it. Here's how decomposition is connected to other components of crowdsourcing:

  • Instructions
    A well-decomposed task is easily explained using simple instructions. To learn more about clear instructions, see our special page.
  • Pricing
    The simpler a task is, the quicker it can be submitted, and the cheaper it is. Here's a section on pricing principles and mechanisms.
  • Interface
    Each single step of a decomposed task should be supported by the task interface: clean and simple, with no unnecessary elements. To learn more about transparent task interfaces, see our “Interfaces” section.
  • Quality control
    A set of simple tasks is easy to check with basic quality control methods, such as majority vote or golden sets. Learn more about establishing quality control.
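To make the majority vote mentioned above concrete, here is a minimal sketch that aggregates several performers' labels per task, using cat/dog labels as sample data. Returning `None` on a tie is an assumption of this sketch; in practice you might instead collect more labels for that task.

```python
from collections import Counter
from typing import Dict, List, Optional


def majority_vote(labels: List[str]) -> Optional[str]:
    """Return the most frequent label, or None if there are no labels or the top two are tied."""
    counts = Counter(labels).most_common()
    if not counts:
        return None
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None  # tie: no clear majority
    return counts[0][0]


def aggregate(answers: Dict[str, List[str]]) -> Dict[str, Optional[str]]:
    """Apply majority vote to every task."""
    return {task_id: majority_vote(labels) for task_id, labels in answers.items()}


# Hypothetical labels collected from several performers.
answers = {
    "img_001": ["cat", "cat", "dog"],
    "img_002": ["dog", "cat"],
}
print(aggregate(answers))  # {'img_001': 'cat', 'img_002': None}
```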