Data protection at Toloka

by Toloka Team

We live and breathe data at Toloka, which is why data protection and information security are top priorities for us. This is something we will never compromise on. Read on to find out what we do to keep your data safe.

First of all, let's define the types of data we are talking about:

  • Information about our Tolokers (including PII).
  • Information about our requesters (including PII).
  • The data that our requesters provide for their projects.
  • The data that our Tolokers collect and label as they’re completing those projects.

What is PII?

PII, or personally identifiable information, is a type of sensitive data that can identify a person. It can contain personal information like full name, telephone number, email address, residential or postal address, license plate number, photos, and biometrics. This type of information may come from both our clients and annotators. PII is protected by privacy laws like GDPR (probably one of the toughest and best-known privacy regulations in the world). Violating these laws can lead to some very steep fines.

The data that Tolokers collect during tasks may contain personal information about other Tolokers or third parties (we share some examples of that later). The data that requesters upload to the platform for labeling may also contain third-party PII.

Where is data stored?

We store data on servers provided by Microsoft Azure, which may be located in the US, the EU, or Asia. The client can choose where they’d like to keep their data: they can opt for virtual storage in a private cloud or their own on-premises storage.

The client can:

  • Store data outside Toloka’s infrastructure.
  • Make use of a private cloud on the servers we use.
  • Choose the exact region of their servers.

All data is processed on application servers and transmitted through an API (or web interface) using an encrypted TLS channel.
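To make the encrypted channel concrete, here is a minimal client-side sketch in Python. It is not Toloka's implementation: the function name and the choice of TLS 1.2 as the minimum version are assumptions made for illustration. It only shows how a client can insist on a verified, encrypted connection before sending anything to an API.

```python
import ssl


def make_strict_tls_context() -> ssl.SSLContext:
    """Build a client TLS context that verifies the server and rejects old protocol versions."""
    # create_default_context() turns on certificate and hostname verification.
    context = ssl.create_default_context()
    # Assumption for this sketch: treat TLS 1.2 as the lowest acceptable version.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


context = make_strict_tls_context()
```

A client could then pass this context to a standard call such as urllib.request.urlopen(url, context=context), so that the request fails instead of falling back to an unverified or outdated connection.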

Our microservice architecture is divided into access zones, which are protected by a host-based firewall and microservice authentication. Access control is guided by our information security system.

How we process PII

Apart from strict security measures dictated by regulatory bodies, we use a combination of our own methods for PII management.

PII of Tolokers

We restrict access to the PII of Tolokers. Only a handful of employees can access Toloker PII in special circumstances as part of their job responsibilities.

We encourage requesters to explain to Tolokers how they will use the data that they collect, and we offer templates for many types of data collection tasks to help requesters do this correctly.

We give requesters tools to choose which Tolokers can access their tasks. For full transparency, requesters can always see the hash_id of Tolokers who are labeling their tasks. All other data processing is automated so that Toloka employees don’t have access to project data.
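How the hash_id is derived is not described in this article, but the general idea of a pseudonymous identifier can be sketched as follows. The keyed-hash (HMAC-SHA256) approach, the function name, and the sample values are all assumptions for illustration: the point is that a requester sees a stable identifier that cannot be traced back to the internal account ID without the secret key.

```python
import hashlib
import hmac


def pseudonymous_id(internal_id: str, secret_key: bytes) -> str:
    """Derive a stable identifier that does not reveal the internal account ID."""
    # A keyed hash means the mapping cannot be rebuilt by simply hashing guessed IDs.
    digest = hmac.new(secret_key, internal_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


# Hypothetical values, for illustration only.
SECRET_KEY = b"example-key-kept-on-the-server"
first = pseudonymous_id("toloker-1001", SECRET_KEY)
second = pseudonymous_id("toloker-1002", SECRET_KEY)
```

The same account always maps to the same identifier, so a requester can recognize a returning Toloker across tasks without learning who they are.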

Finally, we delete all PII that belongs to Tolokers when their accounts are deleted. We also delete all project data if the requester asks us to. We call it the “Wipe-out privilege”.

Cookies

We’re transparent about cookies and tracking. Every Toloka visitor can set their cookie preferences and decline our tracking request. There are three types of cookies on our website:

  • Session cookies, which are needed for the website to function properly.
  • Analytical cookies, which are used to prepare statistics on user behavior and demographics.
  • Marketing cookies, which are used to make offers and advertise the product.
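The three categories above map naturally onto a small consent check. The sketch below is an assumption about how such a check could look, not a description of Toloka's website code: the class, the function names, and the cookie names are hypothetical. Session cookies are always allowed because the site needs them to function, while the other two categories depend on the visitor's choice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CookieConsent:
    """A visitor's choices; both optional categories are declined by default."""
    analytical: bool = False
    marketing: bool = False


def allowed_categories(consent: CookieConsent) -> set[str]:
    """Return the cookie categories this visitor permits."""
    allowed = {"session"}  # required for the website to function
    if consent.analytical:
        allowed.add("analytical")
    if consent.marketing:
        allowed.add("marketing")
    return allowed


def cookies_to_set(cookie_categories: dict[str, str], consent: CookieConsent) -> list[str]:
    """Keep only the cookies whose category the visitor has accepted."""
    allowed = allowed_categories(consent)
    return sorted(name for name, category in cookie_categories.items() if category in allowed)


# Hypothetical cookie names mapped to their categories.
COOKIES = {"sessionid": "session", "stats_visitor": "analytical", "ad_campaign": "marketing"}
declined = cookies_to_set(COOKIES, CookieConsent())
analytics_only = cookies_to_set(COOKIES, CookieConsent(analytical=True))
```

A visitor who declines tracking ends up with only the session cookie, which is the behavior described above.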

Field tasks

PII protection measures are particularly important when it comes to field tasks. These are tasks that ask Tolokers to go to a location on the map and submit photo evidence, such as photos of a business at a certain address. Field tasks raise unique privacy concerns — when Tolokers take photos on the street and happen to capture random pedestrians or private vehicles, they could be obtaining PII from third parties without their consent. This is why we detect and blur all license plate numbers and faces as the data comes in. This is a native feature built into our mobile app.
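The redaction step can be illustrated with a deliberately simplified sketch. It assumes that a detector (not shown, and not described in this article) has already returned bounding boxes for faces and license plates, and it flattens each box to its average brightness instead of applying a true blur. The function name, the box format, and the tiny grayscale image are hypothetical; the production feature runs inside the mobile app and will differ.

```python
def redact_region(image: list[list[int]], box: tuple[int, int, int, int]) -> list[list[int]]:
    """Return a copy of a grayscale image with one box flattened to its mean value.

    box is (top, left, bottom, right), with bottom and right exclusive.
    """
    top, left, bottom, right = box
    redacted = [row[:] for row in image]  # copy so the input image is left untouched
    pixels = [image[r][c] for r in range(top, bottom) for c in range(left, right)]
    if not pixels:
        return redacted  # empty box: nothing to hide
    mean = sum(pixels) // len(pixels)
    for r in range(top, bottom):
        for c in range(left, right):
            redacted[r][c] = mean
    return redacted


# A 4x4 image where the central 2x2 block stands in for a detected face.
image = [
    [0, 0, 0, 0],
    [0, 10, 30, 0],
    [0, 50, 70, 0],
    [0, 0, 0, 0],
]
result = redact_region(image, (1, 1, 3, 3))
```

Flattening removes all detail inside the box, which is cruder than a blur but makes the privacy effect easy to verify: nothing inside the region can be recovered from the output.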

Protection against unauthorized access

We use several tried-and-true methods to prevent unauthorized access to your data:

  • SSO (single sign-on) everywhere on our system.
  • MFA (multi-factor authentication) for accessing internal infrastructure from the outside.
  • TSA (two-step authentication) for access to production services like Azure.
  • SDLC (secure development lifecycle), a process that governs security principles of software development.
  • SAST (static application security testing), a tool that scans the source code of applications for security-relevant errors and inconsistencies.
  • DAST (dynamic application security testing), a mechanism that tests applications as they run to identify vulnerabilities and other shortcomings.
  • Security agents and performance monitoring on laptops/desktops, overseen by our SOC (security operations center).
  • First-class laptop/desktop encryption.
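The article does not say which authentication factors are used. As one illustration of how a second factor can work, here is a sketch of time-based one-time passwords (TOTP, RFC 6238), the scheme behind most authenticator apps. Choosing TOTP, and the function names below, are assumptions made for this example.

```python
import hashlib
import hmac
import struct


def totp(secret: bytes, at_time: float, digits: int = 6, step: int = 30) -> str:
    """Compute an RFC 6238 time-based one-time password using HMAC-SHA1."""
    counter = int(at_time) // step  # number of time steps since the Unix epoch
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation, as defined in RFC 4226
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def verify_totp(secret: bytes, submitted: str, at_time: float) -> bool:
    """Check a submitted code in constant time to avoid leaking partial matches."""
    return hmac.compare_digest(totp(secret, at_time), submitted)
```

The server and the user's device share the secret once, at enrollment. After that, a stolen password alone is not enough to sign in, because the attacker would also need the current code.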

To ensure best practices on an individual level, all Toloka employees complete these required courses:

  • Development Security.
  • Advanced Anti-Phishing Tools.
  • NDA (Non-Disclosure Agreement) Principles.

Audits and security certificates

Since we work with large IT companies, we make sure to comply with all major information security regulations. This includes undergoing regular audits and attaining specialized certificates. We also spend a great deal of time studying and implementing our clients' security requirements.

Toloka is already ISO 27001 certified. ISO 27001 is the most fundamental international standard for managing information security, and the certificate means we have the right tools in place for managing our data securely 24/7. Since this is an ongoing process, we run our own ISO 27001 internal audits every quarter to make sure the standards are maintained. This involves conducting risk assessments and finding ways to address any issues.

We’re also working to comply with:

  • ISO 27701, a privacy extension to ISO 27001 that deals with managing PII. It helps demonstrate that the company operates in accordance with GDPR (General Data Protection Regulation) and the requirements enforced by the CPPA (California Privacy Protection Agency), among others.
  • HIPAA (Health Insurance Portability and Accountability Act), an American federal law that ensures medical data is protected from fraud, theft, and accidental leaks. We are adapting our processes to meet HIPAA requirements so that we can safely handle medical data.
  • SOC 2/3 (System and Organization Controls 2/3), a series of comprehensive audits, defined by the AICPA, that examine the processes and procedures of tech companies offering online services. In essence, this is a widely recognized framework for cybersecurity risk management.

Accredited auditors also run controlled penetration tests (ethical hacking into our system) on a regular basis to identify vulnerabilities that could be exploited in a targeted cyberattack. The last audit of this type was conducted by KPMG in April of this year.

There’s more

If you want to learn more about our security measures, visit our security page. Please get in touch with us on Slack if you have any questions or comments – we always appreciate your feedback.
