Remoteness, fairness, and mechanisms as challenges of data supply by humans for automation.
Despite the clear advantages of AI, automation driven by machine learning carries pitfalls that affect the lives of millions of people. The negative repercussions include the disappearance of many well-established mass professions and increased consumption of labeled data produced by humans. This data is not always obtained in a positive environment: data suppliers are often managed in an old-fashioned way and have to work full-time on routine pre-assigned tasks, leading to job dissatisfaction. Crowdsourcing is a modern and effective alternative as it gives flexibility and freedom to task executors in terms of place, time and the task type they want to work on. However, many potential stakeholders of crowdsourcing processes hesitate to use this technology due to a series of doubts that have continued to circulate over the past decade. To address these issues, our workshop focuses on the research and industry communities and covers three important aspects of data supply: Remoteness, Fairness, and Mechanisms.
Data labeling requesters (data consumers for ML systems) doubt the effectiveness and efficiency of remote work. They need trustworthy quality control techniques and ways to guarantee reliable results on time. Crowdsourcing is one viable solution for effective remote work. However, despite its rapid growth and the body of literature on the topic, crowdsourcing is still in its infancy and remains, to a large extent, an art. It lacks clear guidelines and accepted practices for both requesters and performers (also known as workers), which makes it much harder to reach the technology's full potential. We intend to change this by turning the art into a science.
Crowd workers (data suppliers) doubt the availability and choice of tasks. They need fair and ethical task assignment, fair compensation, and growth opportunities. We believe that the working environment (e.g. a crowdsourcing platform) can help meet these needs: it should provide flexibility in choosing tasks and working hours, and access to tasks should be fair and ethical. We also aim to address bias in task design and execution, which can skew results in ways that data requesters don’t anticipate. Since quality, fairness and growth opportunities for performers are central to our workshop, we invite a diverse group of performers from a global public crowdsourcing platform to our panel-led discussion.
Matchmakers (the working environment, usually represented by a crowdsourcing platform) doubt the effectiveness of the economic mechanisms that underlie their two-sided market. They need a mechanism design that guarantees proper incentives for both sides: flexibility and fairness for workers, and quality and efficiency for data requesters. We stress that economic mechanisms are the key to successfully addressing the issues of remoteness and fairness. Our intention is to deepen the interaction between and within the communities that work on mechanisms and crowdsourcing.
Our panel discussion will gather all stakeholders: researchers, representatives of global crowd platforms like Toloka and Amazon MTurk, performers, and requesters who work with the crowd on a large scale. We hope to stimulate a fruitful conversation, shed light on what is not often discussed, develop solutions to problems and find new growth points for crowdsourcing.
Introduction & Icebreakers
Data Excellence: Better Data for Better AI
— Lora Aroyo (invited talk)
A Gamified Crowdsourcing Framework for Data-Driven Co-creation of Policy Making and Social Foresight
— Andrea Tocchetti and Marco Brambilla (contributed talk)
— Sihang Qiu, Ujwal Gadiraju, Alessandro Bozzon and Geert-Jan Houben (contributed talk)
Quality Control in Crowdsourcing
— Seid Muhie Yimam (invited talk)
What Can Crowd Computing Do for the Next Generation of AI Technology?
— Ujwal Gadiraju and Jie Yang (contributed talk)
Real-Time Crowdsourcing of Health Data in a Low-Income Country: A Case Study of Human Data Supply on Malaria First-Line Treatment Policy Tracking in Nigeria
— Olubayo Adekanmbi, Wuraola Fisayo Oyewusi and Ezekiel Ogundepo (contributed talk)
"Successes and failures in crowdsourcing: experiences from work providers, performers and platforms"
Modeling and Aggregation of Complex Annotations via Annotation Distance
— Matt Lease (invited talk)
Active Learning from Crowd in Item Screening
— Evgeny Krivosheev, Burcu Sayin, Alessandro Bozzon and Zoltán Szlávik (contributed talk)
Human Computation Requires and Enables a New Approach to Ethics
— Libuse Veprek, Patricia Seymour and Pietro Michelucci (contributed talk)
Bias in Human-in-the-Loop Artificial Intelligence
— Gianluca Demartini (invited talk)
VAIDA: An Educative Benchmark Creation Paradigm using Visual Analytics for Interactively Discouraging Artifacts
— Anjana Arunkumar, Swaroop Mishra, Bhavdeep Sachdeva, Chitta Baral and Chris Bryan (contributed talk)
Achieving Data Excellence
— Praveen Paritosh (invited talk)