Modern Web services widely employ sophisticated Machine Learning techniques to rank news, posts, products, and other items presented to users or contributed by them. These techniques are usually built on offline data pipelines and rely on a numerical approximation of the relevance of the displayed content. In our hands-on tutorial, we present a systematic view of using Human-in-the-Loop to obtain scalable offline evaluation processes and, in particular, high-quality relevance judgements. We will introduce the ranking problem to the attendees, discuss the commonly used ranking quality metrics, and then focus on a Human-in-the-Loop-based approach to obtaining relevance judgements at scale. More precisely, we will present a thorough introduction to pairwise comparisons, demonstrate how these comparisons can be obtained using Crowdsourcing, and organize a hands-on practice session in which the attendees will obtain high-quality relevance judgements for search quality evaluation. Finally, we will discuss the obtained relevance judgements, point out directions for further studies, and answer questions asked during the tutorial.


Alexey Drutsa
Head of Efficiency & Growth Division
Dmitry Ustalov
Analyst / Software Developer
Daria Baidakova
Education & Customer Success Team Lead
Nikita Popov
Senior Data Scientist


09:00 - 09:45
Introduction to Offline Evaluation 
09:45 - 10:45
Ranking and Quality Metrics
10:45 - 11:30
Human-in-the-Loop Essentials
11:30 - 13:00
Hands-On Practice Session I
13:00 - 14:00
Lunch Break
14:00 - 14:30
Hands-On Practice Session II 
14:30 - 15:30
Pairwise Comparisons
15:30 - 16:00
Final Remarks and Conclusion
Mon Nov 08 2021 12:09:09 GMT+0300 (Moscow Standard Time)