Break your task down into steps until each separate level is clear enough for any performer to handle.
Decomposition means to break a task down into parts by replacing one large problem with a series of smaller, separate problems that are easier to solve and can be completed by different performers. It is the first step to take when planning a new crowdsourcing project.
Before we delve into the details, let us introduce a running example that we will use throughout the article. Suppose we have a dataset of images, and our goal is to classify each image as a cat or a dog. The crowdsourcing paradigm prescribes distributing the task across multiple workers, each of whom labels a subset of the images. As a result, each image receives several labels, and the goal of the aggregation method is to combine these noisy labels into a single high-quality answer. For brevity, in this article we will consider the binary classification task and give references to the literature where the multiclass case is described.
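As a minimal illustration of aggregation, here is a sketch of a majority vote over the workers' labels for one image. The `majority_vote` helper is hypothetical, not part of any Toloka API:

```python
from collections import Counter

def majority_vote(labels):
    """Combine several noisy binary labels ("cat"/"dog") into one answer.

    With an odd number of workers on a binary task there are no ties;
    more advanced aggregation methods also weight workers by skill.
    """
    return Counter(labels).most_common(1)[0][0]

# Three workers labeled the same image:
print(majority_vote(["cat", "dog", "cat"]))  # cat
```

In practice you would apply such an aggregation per image across the whole dataset, or use a ready-made method from an aggregation library instead of a plain vote.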
It may seem counterintuitive, but splitting one task into several smaller ones helps lower the project budget. This happens because smaller tasks are easier to complete without mistakes, so we don't need to re-evaluate them as often as the complex ones.
| | Cost | Tasks for re-evaluation | Active users |
|---|---|---|---|
| One complex task | 100% | 15-25% | 600 |
| Several smaller tasks | 70% | 10-15% | 11,000 |
If your task offers a choice of 3-5 answers and the instructions fit on one page without scrolling, then most likely your task doesn't need to be decomposed. In all other cases, you should probably try to break the task down. You can also discover whether you need decomposition by running short experiments. If performers pick up your task very slowly, or all of them are filtered out due to low skill levels even though you don't see any problems with your control tasks, you can assume that the task is too complex.
If your task is aimed at getting an answer to a complex question, try dividing it into a series of simple questions that are easy to answer and independent of each other. For example, instead of asking whether a tech support specialist gave a "good" or "bad" answer, ask if the response was detailed, friendly, and grammatically correct.
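The sub-answers can then be combined into an overall verdict on your side. A toy sketch, assuming the final verdict requires all three sub-answers to be positive (the all-three rule is an assumption for illustration, not a recommendation):

```python
def overall_quality(detailed, friendly, grammatical):
    """Combine three simple yes/no sub-answers into one verdict.

    Requiring all three to be positive is just one possible rule;
    you could also weight the criteria differently.
    """
    return "good" if detailed and friendly and grammatical else "bad"

print(overall_quality(detailed=True, friendly=True, grammatical=True))   # good
print(overall_quality(detailed=True, friendly=False, grammatical=True))  # bad
```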
If your task involves answering a series of questions at once, try asking them successively, one question at a time. If you have a set of pictures and you need to outline traffic lights on them, first ask if the picture contains a traffic light and then (if yes) ask to outline it. Best practice here is to use two different projects for collecting data.
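The two-project pipeline above can be sketched as a simple filter: only images that pass the first (classification) project are sent to the second (outlining) project. The stub functions below stand in for the two crowdsourcing projects and are purely illustrative:

```python
# Stand-in for project 1: a yes/no classification task.
# (Here we fake the answer from the file name for illustration.)
def has_traffic_light(image):
    return "light" in image

# Stand-in for project 2: an outlining task returning a bounding box.
def outline(image):
    return (10, 20, 50, 80)

def run_pipeline(images):
    """Route each image through project 1; only positives reach project 2."""
    return {img: outline(img) for img in images if has_traffic_light(img)}

results = run_pipeline(["street_light.jpg", "park.jpg"])
```

Keeping the two stages as separate projects lets you price, control, and staff each one independently.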
Sometimes there's only one question that needs to be answered, but there are too many possible answers and it's difficult to remember all the rules about them at once. If there are more than 10 options to choose from, we recommend grouping them thematically, so that a performer first makes a general choice of theme and then chooses within a smaller variety of answers. Best practice here is to support the successive classification with a clean interface that displays only the necessary options.
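A minimal sketch of such successive classification, assuming a hypothetical thematic grouping of answer options: the performer first picks a theme, and the interface then displays only that theme's options.

```python
# Hypothetical thematic grouping of a large set of answer options.
THEMES = {
    "animals": ["cat", "dog", "bird", "horse"],
    "vehicles": ["car", "bus", "bicycle", "truck"],
}

def options_for(theme):
    """Return only the options of the chosen theme,
    so the interface stays clean and the choice stays small."""
    return THEMES[theme]

# Step 1: the performer chooses "animals";
# step 2: the interface shows just four options instead of eight.
print(options_for("animals"))
```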
Collecting crowd data involves more than just setting up a task for performers. You also need to set up control mechanisms to maintain good quality. If the best control method is human evaluation, try adding a post-verification project where performers will check tasks completed by other performers.
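Conceptually, post-verification routes completed tasks into a second project where other performers approve or reject them. A sketch under that assumption (the checker function stands in for the verification project):

```python
def post_verify(completed_tasks, is_approved):
    """Split completed tasks into accepted and rejected,
    based on the verdicts of a second, verification project."""
    accepted, rejected = [], []
    for task in completed_tasks:
        (accepted if is_approved(task) else rejected).append(task)
    return accepted, rejected

# Stand-in checker: in reality this verdict comes from other performers.
accepted, rejected = post_verify(["task_1", "task_2"],
                                 lambda t: t == "task_1")
```

Rejected tasks can then be sent back for relabeling instead of silently polluting the dataset.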
Let's say your task is to regularly update information about local businesses in order to keep an up-to-date list. You give performers an offline task to find a particular business, check the address and opening hours, and provide a photo. After the task is complete, you find out that some answers are only partly correct. Some performers didn't provide a quality photo, while others got the opening hours wrong. How can you clean up this data? Do you need to pay the performers who were only partly correct? Where do you get an extra budget to re-label the objects with missing data? This task can be decomposed by splitting it into three independent projects, each collecting one simple piece of information, so performers don't get confused by multitasking:
• An entrance photo
• Opening hours
• The business address
This allows you to use simple quality control mechanisms, choose performers who are better at each individual task, and save money on relabeling incorrect data.
When we say that decomposition is the key, we mean it. Here's how decomposition is connected to other components of crowdsourcing: