In this tutorial, we introduce data labeling via public crowdsourcing marketplaces and present some key techniques for efficiently collecting labeled data, including aggregation, incremental relabeling, and dynamic pricing.
This is followed by a practice session, where participants choose one real label collection task, experiment with selecting settings for the labeling process, and launch their own label collection project on one of the largest crowdsourcing marketplaces. During the tutorial, all projects are run on the real Toloka crowd. While we are waiting for the crowd performers to annotate participants’ projects, we present the major theoretical results in efficient aggregation, incremental relabeling, and dynamic pricing. We also discuss the strengths and weaknesses of crowdsourcing, as well as applicability to real-world tasks, summarizing our five years of research and industrial expertise in crowdsourcing. All participants receive feedback on their projects and practical advice.
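To give a flavor of two of the techniques named above, here is a minimal sketch of label aggregation by majority vote together with a simple stopping rule for incremental relabeling. It is an illustration only, not the method used in the tutorial or on Toloka: the function names, the `min_confidence` threshold, and the `max_labels` cap are all assumptions chosen for the example.

```python
from collections import Counter


def majority_vote(labels):
    """Aggregate one item's labels: return (winning label, share of votes)."""
    counts = Counter(labels)
    label, votes = counts.most_common(1)[0]
    return label, votes / len(labels)


def needs_more_labels(labels, min_confidence=0.7, max_labels=5):
    """Hypothetical incremental-relabeling rule (an assumption, not Toloka's).

    Request another label only while the majority share is below
    min_confidence and the per-item budget max_labels is not exhausted.
    """
    if not labels:
        return True  # nothing collected yet
    if len(labels) >= max_labels:
        return False  # budget for this item is spent
    _, confidence = majority_vote(labels)
    return confidence < min_confidence


if __name__ == "__main__":
    collected = ["cat", "cat", "dog"]
    print(majority_vote(collected))
    print(needs_more_labels(collected))
```

The point of the stopping rule is that items on which performers agree early stop consuming budget, while contested items receive extra labels up to the cap.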
— The concept of crowdsourcing
— Crowdsourcing task examples
— Crowdsourcing platforms
— Yandex crowdsourcing experience
— Decomposition for an effective pipeline
— Task instruction & interface: best practices
— Quality control techniques
— How Toloka works
— Types of tasks in Toloka
— Creating a project in Toloka
— Dataset and required labels
— Discussion: how to collect labels?
— Data labeling pipeline for implementation
Participants:
— create
— configure
— run data labeling projects with real performers in real time
— Aggregation models
— Incremental relabeling
— Dynamic pricing
Break
— Completing the label collection process
— Project results
— Ideas for further work and research
— References to literature and other tutorials