This dataset was collected for the KDD'2020 paper «Prediction of Hourly Earnings and Completion Time on a Crowdsourcing Platform». It contains over 18 million tasks performed by 161,377 users in Yandex.Toloka. The data was collected over a 3-month period (September to November 2018) and includes all sessions on the main page of the web platform in which a user chose a task, as well as all tasks completed by users. The session data includes the timestamp, the user ID, the ID of the project chosen by the user, the ID of the task assigned to the user, and additional information. The task data includes the start and completion timestamps, the project ID, the ID of the assigned user, the reward for completing the task, the number of microtasks in the task, the amount of input and output data, and additional information. Some static characteristics of users and projects are also provided, such as user registration dates and the length of project instructions. The data has been anonymized: it contains no real internal IDs, personal user data, proprietary requester data (including task results), or textual information (including project names and descriptions).
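
As a rough illustration of how the task fields listed above relate to the quantities named in the paper title, the sketch below derives a completion time and an hourly earnings figure from a single task record. The class name, the field names, and the example values are all hypothetical, since the actual file layout and column names are not given here. Dividing the reward by the elapsed time is only one simple way to define hourly earnings and may differ from the definition used in the paper.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class TaskRecord:
    # Hypothetical field names; the real dataset's columns may differ.
    project_id: int
    user_id: int
    start: datetime
    finish: datetime
    reward: float
    n_microtasks: int


def completion_seconds(task: TaskRecord) -> float:
    """Elapsed time between the start and completion timestamps."""
    return (task.finish - task.start).total_seconds()


def hourly_earnings(task: TaskRecord) -> float:
    """Reward divided by elapsed time, scaled to one hour."""
    seconds = completion_seconds(task)
    if seconds <= 0:
        raise ValueError("completion timestamp must be after start timestamp")
    return task.reward * 3600.0 / seconds


# Made-up record for illustration only; not taken from the dataset.
example = TaskRecord(
    project_id=1,
    user_id=42,
    start=datetime(2018, 9, 1, 12, 0, 0),
    finish=datetime(2018, 9, 1, 12, 6, 0),
    reward=0.05,
    n_microtasks=10,
)
print(completion_seconds(example))
print(hourly_earnings(example))
```

Records with a non-positive duration are rejected rather than silently skipped, so that timestamp problems surface early when the same functions are applied to real rows.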