Toloka cares about the security of your data and data labeling results. This section describes the key principles of working with data in Toloka.
The rights to the labeling results belong to the requester and no one else.
The data that the requester sends to Toloka is strictly confidential. Support staff have access to this data. The list of employees granted this access is regularly reviewed, and their activities are monitored by our team.
If necessary, the requester may grant access to their account to other users, for example, to their employees.
According to the User Agreement, Tolokers must treat the requester's data as confidential. We make sure that no fraudulent users get access to Toloka: cheaters get a warning or a ban.
By transmitting data to Toloka, you automatically agree to this data being processed by third parties, namely, Tolokers in Toloka.
To avoid transmitting personal data, you can anonymize the data yourself. For example, you can cover up people's faces in photos and change the pitch of the voice in speech recordings.
For tasks that use recognition of passports or questionnaires with personal information, you can cut out data fields and use separate tasks for them. For example, you can send the last name, first name, and middle name separately for recognition.
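As a rough illustration of this field-splitting idea, the Python sketch below turns one record into several single-field tasks. The field names, the `split_record` and `reassemble` helpers, the `task_key` and `field_image` keys, and the use of random tokens are all assumptions made for this example and are not part of Toloka. The mapping from token back to record stays on your side.

```python
import secrets

def split_record(record_id, fields):
    """Split one record into single-field tasks keyed by random tokens.

    Returns (tasks, key_map). `tasks` is what you would upload; `key_map`
    stays on your side and maps each token back to (record_id, field_name).
    """
    tasks = []
    key_map = {}
    for field_name, value in fields.items():
        token = secrets.token_hex(8)  # opaque key, reveals nothing about the record
        tasks.append({"task_key": token, "field_image": value})
        key_map[token] = (record_id, field_name)
    return tasks, key_map

def reassemble(results, key_map):
    """Merge per-field results back into records using the private key map."""
    records = {}
    for token, recognized in results.items():
        record_id, field_name = key_map[token]
        records.setdefault(record_id, {})[field_name] = recognized
    return records

if __name__ == "__main__":
    # Hypothetical cropped images of two passport fields.
    tasks, key_map = split_record(
        "passport-001",
        {"last_name": "crops/001_last.png", "first_name": "crops/001_first.png"},
    )
    # Pretend each task came back with a recognized value.
    results = {t["task_key"]: "recognized text for " + t["field_image"] for t in tasks}
    print(reassemble(results, key_map))
```

Because each uploaded task carries only a random token and one field, a Toloker who sees a single task cannot tell which record it belongs to. Only the holder of `key_map` can put the fields back together.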
We make every effort to ensure the security of your data, but sometimes tasks are completed by cheaters. If your data is confidential, take additional measures to protect it.
Toloka has a dedicated anti-fraud system for banning cheaters, but quality control is the shared responsibility of the requester and the platform. The requester is responsible for quality control in their projects and for protecting their data. Each project requires an individual approach to setting up quality control to ensure the best quality of labeled data.
To protect your data:
Decompose tasks. You can break your tasks down into several projects (steps). Each step contains only part of the data, and a malicious user won't be able to put the pieces together.
If you use input fields to transmit data that Tolokers shouldn't see, hide the fields that contain this data. For instance, do this if you moderate comments and you need the authors' personal data in the results for further data processing.
Use any name or pseudonym to identify yourself to Tolokers. You don't have to disclose your company name. This is useful for requesters who don't want to attract unnecessary interest from Tolokers.
You can use the Submitted responses quality control rule to limit the number of task suites each Toloker can complete. This lets you better protect your data from fraud and minimize your risks (scattered data has no commercial value).
Set up the process to avoid storing data on Toloka servers. To do this:
When uploading tasks to the pool, use IDs in the input data rather than the content itself.
Set up a server to return the task content based on IDs.
In the task interface, add code that sends IDs to the server and displays the content received.
For instance, you can store text comments on your server and use the file with tasks to pass only the comment IDs. As a result, only the ID+result pairs are saved in the Toloka database, but these pairs don't have any commercial value in the absence of input data.
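The server side of the three steps above can be sketched as follows. This is a minimal illustration using only the Python standard library. The `COMMENTS` store, the `comment_id` field name, the `/content` path, and the default port are assumptions made for the example. A real deployment would also need HTTPS, access control, and whatever cross-origin headers your task interface requires.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse

# Hypothetical private store: the content never leaves your server in bulk.
COMMENTS = {"c-101": "First comment text", "c-102": "Second comment text"}

def build_tasks(store):
    """Step 1: tasks carry only IDs, never the content itself."""
    return [{"comment_id": cid} for cid in sorted(store)]

def lookup(store, comment_id):
    """Step 2: resolve an ID to its content; returns (status, body dict)."""
    if comment_id in store:
        return 200, {"comment_id": comment_id, "text": store[comment_id]}
    return 404, {"error": "unknown id"}

class ContentHandler(BaseHTTPRequestHandler):
    """Answers GET /content?id=<comment_id> with a JSON body."""

    def do_GET(self):
        url = urlparse(self.path)
        if url.path != "/content":
            status, body = 404, {"error": "not found"}
        else:
            comment_id = parse_qs(url.query).get("id", [""])[0]
            status, body = lookup(COMMENTS, comment_id)
        payload = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json; charset=utf-8")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

def serve(host="127.0.0.1", port=8000):
    """Run the content server until interrupted (not called automatically)."""
    HTTPServer((host, port), ContentHandler).serve_forever()
```

The output of `build_tasks(COMMENTS)` is what would go into the file with tasks. Calling `serve()` starts the server that the task interface code from the third step would query by ID.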
We are always happy to sign an NDA (Non-Disclosure Agreement) with requesters. If that's critical for your company, please contact us at firstname.lastname@example.org.
You can send requests from Toloka tasks to your server. To get assistance in setting up the process, contact us at email@example.com.
This is useful if:
You need images or other files for your tasks, and you want to store them on your own server.
You need to load dynamic data in the tasks.
You want to log user actions.
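For the logging case, a receiving endpoint on your server might look like the sketch below. It uses only the Python standard library. The event shape (`task_id` and `action`), the in-memory `ACTION_LOG`, and the default port are assumptions made for illustration. How the task interface sends the request is not shown here.

```python
import json
import time
from http.server import BaseHTTPRequestHandler, HTTPServer

ACTION_LOG = []  # in-memory for the sketch; use durable storage in practice

def record_action(log, raw_body, now=None):
    """Validate a JSON action event and append it to the log.

    Returns True if the event was stored, False if the payload is malformed.
    """
    try:
        event = json.loads(raw_body)
    except ValueError:  # covers invalid JSON and undecodable bytes
        return False
    if not isinstance(event, dict) or "task_id" not in event or "action" not in event:
        return False
    log.append({
        "task_id": str(event["task_id"]),
        "action": str(event["action"]),
        "received_at": time.time() if now is None else now,
    })
    return True

class LogHandler(BaseHTTPRequestHandler):
    """Accepts POST requests whose body is a single JSON event."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        ok = record_action(ACTION_LOG, self.rfile.read(length))
        self.send_response(204 if ok else 400)
        self.end_headers()

def serve(host="127.0.0.1", port=8001):
    """Run the logging server until interrupted (not called automatically)."""
    HTTPServer((host, port), LogHandler).serve_forever()
```

Keeping validation in `record_action` separate from the HTTP handler makes it possible to check the logging logic without starting a server.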
See the documents governing interaction with Toloka.