Toloka documentation

Audio transcription

In this tutorial, you will learn how to run audio transcription in Toloka. We will use a project preset designed specifically for this type of data labeling.

Audio transcription is a type of data labeling task with an audio file and a text input area. Tolokers listen to a short recording and type the text they hear. After you collect the results, you can use the dataset to train your speech recognition models.

Note

You may need additional projects for your task, such as a project for pre-checking the dataset or a project for checking Tolokers' responses. Learn more about this in Decomposition of the task.

Prerequisites

Before you begin:

  • Make sure you are registered in Toloka as a requester.

  • Top up your Toloka account. If you are unsure about the budget, you can do that later in this tutorial. Toloka will display the budget estimate for your project.

Choose a preset

We recommend starting with a project preset for easier configuration and better results.

  1. Follow this link, or create a project manually:

    1. In the main menu, choose the Projects tab, and click Create a project.

      Choose a preset. Step 1

    2. Select the Transcribing audio recordings preset.

  2. Click Choose this preset in the pop-up tab.

Create a project

Set up how your tasks will look for Tolokers. Tolokers are people around the world who get paid for completing your tasks.

  1. Under General information, add the project name and description.

    • Name to show Tolokers: In 2–5 words, state the general idea of the project.

    • Description for Tolokers: In a couple of sentences, explain what you expect Tolokers to do. This is just an overview. You will write instructions later.

    Create a project. Step 1

  2. In the Task interface section, set up what your tasks will look like. This preset has a task template with layout and validation pre-configured. The Toloker won't be able to submit the response without listening to the audio recording and adding the text.

    Note

    This tutorial uses Template Builder, but you can use the HTML/JS/CSS editor for the same purpose.

    1. In the Config section, you can change the texts Tolokers will see in your task. All tasks in a project use the same texts.

      Create a project. Config texts

      Note

      To learn about other properties of the Config section, their possible values and the impact on the task interface, see the Template Builder Help.

    2. In the Input data example section, add a link to a sample audio. This audio is only used to display the task interface preview on the right.

      Create a project. Input data example

    3. Raw task data is stored in the XLSX, TSV, or JSON format. The labeling results are presented in a TSV file. The Data specification section determines which parameters these files can contain.

      Click Show specifications and check the values:

      • Input data: Parameters in the file with raw task data.

      • Output data: Parameters in the file with labeling results.

      Input data and Output data match the task interface you set up in Template Builder. Check that there are fields for all data types you use for your tasks, and for the ones you want to see in the results file.

  3. Under Instructions for Tolokers, add the instructions Tolokers will see when they start doing your tasks. You can add text, tables, and images to your instructions.

    Check the sample text of the instructions, and update it to fit your project.

    Tip

    When writing instructions, remember that most Tolokers don’t know anything about your tasks beforehand. Make sure your instructions are as clear as possible, but not too wordy. For successful data labeling, try to strike a balance between covering all the essentials and keeping it short. Learn more in our knowledge base.

  4. To save your data and continue, click Create a project.

    Create a project. Step 4

Create a pool

A pool is a set of tasks sent out to Tolokers at the same time. One project can have many pools. When creating a pool, you set up pricing, audience filters for Tolokers, and quality control.

  1. Click Create new pool.

  2. Under General information, set the Pool name.

  3. Under Audience, set up filters to select Tolokers for your pool.

    1. Clear the My tasks may contain shocking or pornographic content option if your project has no such content.

    2. To select Tolokers based on their language, location, age, gender, and other parameters, click the Add filter button.

      • For example, add the Languages filter:

        Create a pool. Language filter

      • It is best to launch transcription tasks in the Toloka web version so that Tolokers can use the keyboard for typing. Add the Device type filter, and set its value to Personal computer.

    3. Use the Speed/quality balance slider to change the number of Tolokers who can see your tasks. Move the slider to the right to exclude Tolokers with lower ratings from participating in your project.

      Create a pool. Step 3.3

  4. In Quality control, set quality control rules for more accurate results.

    1. Click the Non-automatic acceptance toggle, and specify the number of days for checking the task in the Review period in days field.

      What is non-automatic acceptance (assignment review)?

      The non-automatic acceptance option allows you to review completed assignments before accepting them and paying for them. If the Toloker didn't follow the instructions, you can reject the assignment.

    2. Delete the pre-configured Majority vote rule.

    3. Edit the pre-configured Fast responses rule to catch bots. This rule filters out Tolokers who complete tasks too fast.

      Set the Minimum time per task suite. A task suite is a page with a number of tasks. It can contain one or several tasks. If the tasks are simple, you can add 6–10 tasks per suite.

      • The minimum time per suite value depends on two characteristics: the number of tasks on the page, and the length of audio recordings.

      • Make allowances for technical errors. For example, some recordings may fail to load or play. A Toloker will quickly submit responses for tasks like this, and that is not a mistake on the Toloker's part.

      To catch bots, set 10–15 seconds per response. Ban Tolokers after two fast responses.

      Create a pool. Fast answers filter

      This means that a Toloker who completes two or more task suites in less than 10 seconds each will be banned for 10 days and won't be able to access your tasks.

    4. For a trial pool, the settings you’ve just made are enough. To get better results, you can add the quality control rules described below.

      The additional quality control rules
      1. Click Add a quality control rule, and add a second Fast responses rule. It will help detect Tolokers who transcribe recordings carelessly, make mistakes, skip words, or don't take the task seriously enough.

        In this case, the Minimum time per task suite value depends on the length of the recordings and their number per task suite, as well as on how difficult the task is (for example, the speech is hard to hear, contains jargon, or requires special transcription rules). Ban Tolokers after three fast responses.

        Create a pool. Fast answers control rule

        This means that a user who gives a minimum of 3 responses in less than 30 seconds will be banned for 5 days and won't be able to complete your tasks.

      2. Add the Results of assignment review quality control rule and enter the following values:

        Create a pool. Assignments review control rule

        This means that if 35% or more of a Toloker's responses are rejected, the Toloker is banned and can't access your tasks for 15 days. The rule takes effect after 3 responses of the Toloker are reviewed.

      3. Add the Processing rejected and accepted assignments rule:

        Create a pool. Processing rejected and accepted assignments control rule

        This means that if you reject assignments during the review, they'll be sent for re-completion, but to another Toloker.

  5. In Price, set up how much a single task suite will cost you.

    1. In Price per task suite, set the amount of money to pay per task suite done by one Toloker.

      The price depends on the length and complexity of the audio recordings.

    2. In the Overlap field, define how many Tolokers must do each task.

      For speech transcription, the overlap is usually 1. This means that each task will get 1 response.

  6. Click Show advanced settings. In Additional settings, specify the Time per task suite, sec.

    This time should be enough to read the instructions, load the task, listen to audio recordings, and type text (for example, 1,200 seconds).

  7. To save the settings and continue, click Create pool.
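
The Minimum time per task suite values for the two Fast responses rules above can be estimated from the recordings themselves. The following is only a rough sketch, not a Toloka feature: the function name, the `fraction` parameter, and the sample durations are assumptions made for illustration. It treats the threshold as a share of the total audio length on one page.

```python
def min_time_per_suite(durations_sec, fraction=0.5, floor_sec=10):
    """Estimate a Minimum time per task suite value, in seconds.

    durations_sec: lengths of the recordings shown on one page.
    fraction: assumed share of the total audio length below which a
        response looks suspiciously fast (hypothetical tuning knob).
    floor_sec: lower bound, so that short suites still get a threshold.
    """
    if not durations_sec:
        raise ValueError("a task suite needs at least one recording")
    total = sum(durations_sec)
    return max(floor_sec, round(total * fraction))


# Five recordings of about 6 seconds each on one page (made-up lengths).
suite = [6.0, 5.5, 6.5, 6.0, 6.0]
bot_threshold = min_time_per_suite(suite, fraction=0.3)       # rule that catches bots
careless_threshold = min_time_per_suite(suite, fraction=1.0)  # rule that catches skimming
print(bot_threshold, careless_threshold)
```

A lower fraction suits the rule that catches bots. A fraction close to 1 suits the second rule, because a careful Toloker cannot finish faster than the audio plays. Adjust both values after the trial pool.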

Upload data

At this step, upload your task data to Toloka. We recommend that all the recordings in a pool are about the same length.

  1. Click Upload data.

  2. Attach a prepared dataset or media files.

    1. To download a template, click one of the buttons:

      • Template in XLSX
      • Template in TSV
      • Template in JSON

      For this type of project, the file with tasks must have one parameter. It is named INPUT:audio, and its values are links to the audio files.

      INPUT:audio
      https://tlk.s3.yandex.net/ext_dataset/noisy_speech/noisy_tested_wav/p257_068.wav
      https://tlk.s3.yandex.net/ext_dataset/noisy_speech/noisy_tested_wav/p257_095.wav
      https://tlk.s3.yandex.net/ext_dataset/noisy_speech/noisy_tested_wav/p257_293.wav
      
    2. Open the downloaded file, and replace the sample links with links to your audio files.

    3. Click Select prepared dataset, and upload the file you’ve just made.

    Alternatively, upload media files directly from your device. This way, you don't have to put links in the file with tasks yourself.

    To the right of the Select prepared dataset link, click Select media files and choose files.

    Features:

    • This option is available only for projects that have a single input field of the url type in the specification.

    • After uploading, tasks are automatically created with links to the uploaded files.

    Note

    Supported file formats:

    • Images — GIF, JPG, JPEG, PNG, WebP.
    • Audio — FLAC, MP3, WAV, M4A.
    • Video — MP4.
  3. Click Continue.

  4. Tasks are shown to Tolokers in suites. A suite is a single page with multiple tasks. Define how many tasks to include per suite:

    • General tasks: These are tasks for Tolokers to label.

    • Control tasks: These are tasks with predefined answers used to control the quality of responses. For this project, you don’t need control tasks because of the enabled Non-automatic acceptance option.

    • Training tasks: These are tasks with predefined answers and explanations for Tolokers. Normally you use training tasks in separate training pools. You don’t have to include them.

    For example, you can add 5 general tasks per suite:

    Upload data. Tasks per suite

    This means that there will be 5 audio recordings per suite, each with a text field for transcription.

  5. Click Combine tasks into suites.
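
If your audio links are already in a list, the TSV file with tasks from step 2 can be generated instead of edited by hand. This is a minimal sketch that assumes the single INPUT:audio column shown above. The helper name `build_tasks_tsv` is hypothetical, and the two links are taken from the sample in this tutorial.

```python
import csv
import io


def build_tasks_tsv(audio_urls):
    """Return the text of a tasks file with a single INPUT:audio column."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["INPUT:audio"])  # header row: the parameter name
    for url in audio_urls:
        # Each value must be a link to an audio file.
        if not url.startswith(("http://", "https://")):
            raise ValueError(f"not a link: {url!r}")
        writer.writerow([url])
    return buffer.getvalue()


urls = [
    "https://tlk.s3.yandex.net/ext_dataset/noisy_speech/noisy_tested_wav/p257_068.wav",
    "https://tlk.s3.yandex.net/ext_dataset/noisy_speech/noisy_tested_wav/p257_095.wav",
]
tsv_text = build_tasks_tsv(urls)
print(tsv_text)
```

Save the returned text to a file with the .tsv extension, and upload it with Select prepared dataset.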

Start labeling

  1. Make sure you have topped up your account.

  2. To send the tasks to Tolokers and begin the labeling process, click Start labeling.

    Start labeling. Step 2

  3. In the pop-up panel, review the budget and click Launch.
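
To sanity-check the budget that Toloka displays, you can estimate it from the pool settings: the number of tasks, the number of tasks per suite, the price per task suite, and the overlap. The sketch below is an approximation under stated assumptions. The price of 0.05 is a made-up value, and `fee_rate` is a hypothetical placeholder for any platform fee, which this tutorial doesn't specify.

```python
import math


def estimate_budget(n_tasks, tasks_per_suite, price_per_suite,
                    overlap=1, fee_rate=0.0):
    """Return (number of task suites, estimated cost) for a pool.

    fee_rate is an assumed fraction added on top of payouts; it
    defaults to 0 because the actual fee is not stated here.
    """
    suites = math.ceil(n_tasks / tasks_per_suite)  # the last suite may be partial
    assignments = suites * overlap                 # each suite is done `overlap` times
    payout = assignments * price_per_suite
    return suites, round(payout * (1 + fee_rate), 2)


# 1,000 recordings, 5 per suite, overlap 1, and a made-up price of 0.05 per suite.
suites, cost = estimate_budget(n_tasks=1000, tasks_per_suite=5, price_per_suite=0.05)
print(suites, cost)
```

Rejected assignments that are sent for re-completion are not included in this estimate.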

See the results

Track the labeling progress on the pool page. You can start the review when the first results are received.

After the review period you specified ends, all responses still awaiting review are automatically accepted, regardless of their quality.

  1. Go to the pool, and click Review assignments.

    See the results. Review assignments

  2. Choose an assignment.

  3. Check the responses, and click Accept or Decline. For rejected responses, enter a comment to specify the reason.

    Note

    To learn about other ways of review, see the Reviewing Tolokers' responses section.

  4. After checking all the assignments, click Download results.

    See the results. Download results

    You will get the TSV file with the labeling results.
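
Once you have the results file, you can pair each audio link with its transcript for further processing. The sketch below assumes that the file has an INPUT:audio column and an output column named OUTPUT:transcription. The output column name is an assumption: use the field name from the Output data part of your project's specification. The sample rows are made up.

```python
import csv
import io


def read_transcripts(tsv_text, audio_col="INPUT:audio",
                     text_col="OUTPUT:transcription"):
    """Map each audio link to its transcript, skipping incomplete rows."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    pairs = {}
    for row in reader:
        url = (row.get(audio_col) or "").strip()
        text = (row.get(text_col) or "").strip()
        if url and text:  # ignore rows without a link or a transcript
            pairs[url] = text
    return pairs


# Made-up sample in the assumed layout, including one empty row.
sample = (
    "INPUT:audio\tOUTPUT:transcription\n"
    "https://example.com/a.wav\thello world\n"
    "\t\n"
    "https://example.com/b.wav\tgood morning\n"
)
transcripts = read_transcripts(sample)
for url, text in transcripts.items():
    print(url, "->", text)
```

For a real file, read its contents with `open(path, encoding="utf-8").read()` and pass the text to `read_transcripts`.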

See also

Datasets and reference

Sample dataset files with tasks