# Searching for product characteristics

In this tutorial, you will learn how to set up a project for searching for product characteristics in Toloka. We will use a project preset designed specifically for this type of data labeling.

Searching for product characteristics is a type of data labeling task that combines text describing what to find with an input area for the collected information.

Tolokers read what to find and where to look, search for the required data, and enter it in the task interface.

Use this preset for the following purposes:

• Searching for an item, such as a product or service, by its description.

• Searching for information on a specific website, such as cities where products can be delivered on a store's website.

• Searching for information about an item online, such as a company's contact information.

## Prerequisites

Before you begin:

• Make sure you are registered in Toloka as a requester.

• Top up your Toloka account. If you are unsure about the budget, you can do that later in this tutorial. Toloka will display the budget estimate for your project.

## Choose a preset

We recommend starting with a project preset for easier configuration and better results.

1. In the main menu, choose the Projects tab, and click Create a project.

2. Select the Mining business contacts preset.

3. Click Choose this preset in the pop-up tab.

## Create a project

1. Under General information, add the project name and description.

• Name to show Tolokers: In 2–5 words, state the general idea of the project.

• Description for Tolokers: In a couple of sentences, explain what you expect Tolokers to do. This is just an overview. You will write instructions later.

Note

This tutorial uses Template Builder, but you can use the HTML/JS/CSS editor for the same purpose.

1. In the Config section, you can edit the code to change the appearance and layout of the task elements. For example, to configure the field label that names the object Tolokers should find information about, replace the sample text in the label property with your own value.

2. To configure the sources where Tolokers should search for information, edit the view.link-group component.

• The first url component contains the "path": "site" property. Here, site is the input parameter that holds the link where Tolokers should look for information. You will specify the list of these links in the file with tasks later in this tutorial.

• The second and third url components contain the links to the search queries. You can change the search engines Tolokers will use in your tasks.
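To make the idea of a search-query link concrete, here is a minimal Python sketch of how such a link can be assembled from the object name. It is only an illustration and is not Template Builder code: the function names and the URL patterns are assumptions made for this example.

```python
from urllib.parse import quote_plus

# Assumed URL patterns, for illustration only; check each search engine's
# actual query format before relying on them.
SEARCH_URL_PATTERNS = {
    "google": "https://www.google.com/search?q={query}",
    "bing": "https://www.bing.com/search?q={query}",
}


def build_search_link(engine: str, name: str) -> str:
    """Return a search URL for the object name, with the query URL-encoded."""
    pattern = SEARCH_URL_PATTERNS[engine]
    return pattern.format(query=quote_plus(name))


def links_for_task(name: str, site: str) -> list[str]:
    """First link is the task's own site; the rest are search-query links."""
    return [site] + [build_search_link(e, name) for e in SEARCH_URL_PATTERNS]
```

For a task with the name "De Vijf Sinnen", the second link would be a search for that name, which mirrors the order of links described above.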

Note

To learn about other properties of the Config section, their possible values and the impact on the task interface, see the Template Builder Help.

3. Raw task data is stored in the XLSX, TSV, or JSON format. The labeling results are presented in a TSV file. The Data specification section determines which parameters these files might contain.

Click Show specifications and check the values:

• Input data: Parameters in the file with raw task data.

• Output data: Parameters in the file with labeling results.

Input data and Output data match the task interface you set up in Template Builder. Check that there are fields for all data types you use for your tasks, and for the ones you want to see in the results file.

Check the sample text of the instructions, and update it to fit your project.

Tip

When writing instructions, remember that most Tolokers don’t know anything about your tasks beforehand. Make sure your instructions are as clear as possible, but not too wordy. For successful data labeling, try to strike a balance between covering all the essentials and keeping it short. Learn more in our knowledge base.

4. To save your data and continue, click Create a project.

## Create a pool

A pool is a set of tasks sent out to Tolokers at the same time. One project can have many pools. When creating a pool, you set up pricing, audience filters for Tolokers, and quality control.

1. Click Create new pool.

2. Under General information, set the Pool name.

3. Under Audience, set up filters to select Tolokers for your pool.

1. Clear the My tasks may contain shocking or pornographic content checkbox if your project has no such content.

2. To select Tolokers based on their language, location, age, gender, and other parameters, click the Add filter button.

3. Use the Speed/quality balance slider to change the number of Tolokers who can see your tasks. Move the slider to the right to exclude Tolokers with lower ratings from participating in your project.

4. Click the Add skill button. Enter the skill name and description, and keep the skill non-public.

5. Click Add filter → My skills and select your skill from the previous step. Configure the filter to recruit Tolokers without the skill or those whose skill value is 65 and higher.

Later in this tutorial, you will add a rule that assigns this skill to Tolokers according to how accurate their responses are.

4. Under Quality control, set quality control rules for more accurate results.

1. To filter out Tolokers who complete tasks too fast, edit the pre-configured Fast responses rule. Specify the following values:

These settings mean that a Toloker who completes a task suite in less than 10 seconds will be suspended and won't be able to access tasks of the project for 1 day.
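The logic of this rule can be sketched in a few lines of Python. This is a simplified illustration and not Toloka's implementation: it treats a single fast submission as enough to trigger the suspension, and the function and constant names are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Optional

MIN_SECONDS_PER_SUITE = 10        # threshold described in the tutorial
SUSPENSION = timedelta(days=1)    # suspension length described in the tutorial


def suspended_until(started: datetime, submitted: datetime) -> Optional[datetime]:
    """Return when the suspension ends, or None if the response was not too fast."""
    elapsed = (submitted - started).total_seconds()
    if elapsed < MIN_SECONDS_PER_SUITE:
        return submitted + SUSPENSION
    return None
```

A task suite completed in 7 seconds would lead to a one-day suspension in this sketch, while one completed in exactly 10 seconds would not.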

2. Delete the pre-configured Majority vote rule.

3. Click Add a quality control rule → Control tasks, and enter the following values:

This means that a Toloker who gives 3 or more responses to the control tasks will be assigned the skill you created earlier. The skill value is equal to the percentage of correct responses. Because the filter from step 3.5 selects Tolokers who either don't have the skill or have a skill value of 65 or higher, the rule filters out Tolokers who have shown poor results.
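The interaction between the skill and the audience filter can be illustrated with a short Python sketch. It assumes the simplest reading of the rule (the skill is the plain percentage of correct control responses); the function names are hypothetical and this is not Toloka code.

```python
from typing import Optional

MIN_CONTROL_RESPONSES = 3   # responses needed before the skill is assigned
MIN_SKILL = 65              # threshold used in the audience filter


def skill_value(correct: int, total: int) -> Optional[float]:
    """Percent of correct control responses, or None if there are too few responses."""
    if total < MIN_CONTROL_RESPONSES:
        return None
    return 100.0 * correct / total


def can_access_pool(skill: Optional[float]) -> bool:
    """Admit Tolokers without the skill, or with a skill value of 65 or higher."""
    return skill is None or skill >= MIN_SKILL
```

In this sketch, a Toloker with 2 correct responses out of 3 keeps access to the pool, while a Toloker with 1 correct response out of 3 loses it. New Tolokers without the skill can still start working.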

5. In Price, set how much a single task will cost you.

1. In Price per task suite, set the amount of money to pay per task suite done by one Toloker. A task suite is a page with a number of tasks. It can contain one or several tasks.

2. In the Overlap field, define how many Tolokers must do each task.

The default value (3) means that each task will have 3 responses.
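Toloka displays its own budget estimate, but the arithmetic behind price and overlap is easy to sketch. The Python example below is a rough approximation under stated assumptions: it ignores platform fees and control tasks, and the function name and sample prices are hypothetical.

```python
import math
from decimal import Decimal


def estimate_budget(num_tasks: int, tasks_per_suite: int,
                    overlap: int, price_per_suite: Decimal) -> Decimal:
    """Rough cost: number of suites x overlap x price per suite (fees excluded)."""
    suites = math.ceil(num_tasks / tasks_per_suite)
    return suites * overlap * price_per_suite
```

For example, 100 tasks at 5 tasks per suite give 20 suites. With an overlap of 3 and a hypothetical price of 0.05 per suite, the estimate is 20 × 3 × 0.05 = 3.00.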

6. Set the time allowed for completing a task suite. It should be enough to read the instructions, load the task, search for information online, and respond (for example, 600 seconds).

7. To save the settings and continue, click Create pool.

2. Create the tasks for Tolokers:

• Template in XLSX
• Template in TSV
• Template in JSON

For this type of project, the file with tasks must have two parameters:

• INPUT:name: The names of the objects you ask Tolokers to find information about.

• INPUT:site: The links where you ask Tolokers to search for data.

INPUT:name	INPUT:site
Restauracja Basniowa	https://basniowa.rybnik.pl/
Ristorante Pizzeria Zia Caterina	https://www.ristorantepizzeriaziacaterina.it/
De Vijf Sinnen	http://www.devijfsinnen.nl/
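If you generate the file with tasks from your own data, a small script can produce the same tab-separated layout. The following Python sketch is one possible way to do it; the function name and the basic link check are assumptions made for this example.

```python
import csv
import io

# Column headers required for this type of project.
REQUIRED_COLUMNS = ["INPUT:name", "INPUT:site"]


def tasks_to_tsv(tasks: list[tuple[str, str]]) -> str:
    """Build TSV text with a header row and one (name, site) row per task."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(REQUIRED_COLUMNS)
    for name, site in tasks:
        # Basic sanity check: the site column should hold a link.
        if not site.startswith(("http://", "https://")):
            raise ValueError(f"Not a link: {site!r}")
        writer.writerow([name, site])
    return buffer.getvalue()
```

Write the returned text to a file with the .tsv extension before uploading it.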


3. Click Drop file here or select, and upload the file you’ve just made.

4. Click Continue.

3. Tasks are shown to Tolokers in suites. A suite is a single page with multiple tasks. Define how many tasks to include per suite:

• Control tasks: These are tasks with predefined answers used to control the quality of responses. You will create them in the next step.

• Training tasks: These are tasks with predefined answers and explanations for Tolokers. Normally you use training tasks in separate training pools. You don’t have to include them.

4. Click Combine tasks into suites.

1. Click Edit.

3. Select the email and phone checkboxes, and enter the correct answer for a task. Then, click the Save and go to next button. Add several control tasks this way.

Tip

For large pools (over 1000 tasks), we recommend adding at least 1% of control tasks to the pool. For small pools (around 100 tasks), you need 10% control tasks.
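The recommendation above translates into a simple calculation. The Python sketch below rounds up so that the share is never undershot; the function name is hypothetical, and the percentage for pool sizes between the two stated cases is left for you to choose.

```python
def control_tasks_needed(pool_size: int, percent: int) -> int:
    """Number of control tasks for a given share, rounded up (integer math only)."""
    return (pool_size * percent + 99) // 100
```

With the figures from the tip, a pool of 100 tasks at 10% needs 10 control tasks, and a pool of 2000 tasks at 1% needs at least 20.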

## Start labeling

1. Make sure you have topped up your account.

2. To send the tasks to Tolokers and begin the labeling process, click Start labeling.

3. In the pop-up panel, review the budget and click Launch.

## See the results

1. You can see the labeling progress on the pool page. Wait until the labeling is completed.

2. Click the arrow next to the Download results button and choose Run Dawid-Skene model from the drop-down menu. Click Yes in the pop-up window.

3. Open the same drop-down menu again, and click View aggregations list.

4. Wait until the aggregation is complete, and click Download. You will get the TSV file with the labeling results:

• INPUT: The data you uploaded for labeling.

• OUTPUT: The results of labeling (information found by Tolokers).

• CONFIDENCE: The response significance according to the Dawid-Skene model.
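Once you have the file, you may want to separate confident results from those that need a manual check. The Python sketch below shows one way to do this. The column name suffix (for example, email), the 0.9 threshold, and the handling of confidence values written either as fractions or as percentages are assumptions; check them against your own results file.

```python
import csv
import io


def parse_confidence(raw: str) -> float:
    """Convert a confidence cell to a 0..1 float; accepts '0.95' or '95.00%'."""
    raw = raw.strip()
    if raw.endswith("%"):
        return float(raw[:-1]) / 100
    return float(raw)


def split_by_confidence(tsv_text: str, field: str, threshold: float = 0.9):
    """Split rows into (accepted, needs_review) by the CONFIDENCE:<field> column."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    accepted, needs_review = [], []
    for row in reader:
        confidence = parse_confidence(row[f"CONFIDENCE:{field}"])
        (accepted if confidence >= threshold else needs_review).append(row)
    return accepted, needs_review
```

Rows in the second list are good candidates for a manual check or for relabeling with a higher overlap.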