Toloka Team
Collecting a diverse dataset for an apparel sizing CV model
About the client
Our client is a European company developing a personalized size and fit recommendation solution for apparel brands. Through an innovative, intuitive and secure body scan technology, brands help their customers find the perfect size for each garment.
Challenge
Building a robust AI solution requires a large training dataset. The client’s ML team had limited time to collect real-life measurements of real people and feed the data to their computer vision model in order to improve its accuracy. For the initial model training, they collected data from employees and friends, but that dataset was too small and not diverse enough to cover different body types and parameters.
Solution
Toloka’s global crowd is well suited to collecting diverse data: it makes it fast and easy to gather data from different people around the world. Our client launched the project with the help of our partner Training Data, a data collection and labeling company that used Toloka’s platform to gather a dataset of human body measurements. To keep the project running smoothly, the client took advantage of the Microsoft Azure ecosystem: one-click shopping, security, automated billing and document flow.
The data collection project was more complex than expected. Tolokers were asked to take photos of themselves while measuring 22 body parameters, which is not easy to do at the same time. The client prepared comprehensive guidelines on how to take measurements, and these instructions were adapted into a crowd-friendly interface so that Tolokers could understand them faster and more easily. We translated the task instructions into multiple languages to engage more Tolokers from a wide range of countries and obtain more diverse results.
As part of the quality control checks, participants were also asked to submit the measurement numbers separately from the photos. Sometimes the submitted numbers and the photos did not match, and these discrepancies required verification. The client’s team checked the data and discarded incomplete or invalid measurements.
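A check of this kind could be sketched roughly as follows. This is only an illustration, not the client’s actual pipeline: the `Submission` structure, the `review` function, the placeholder parameter names and the 2 cm tolerance are all assumptions made for the example. Only the count of 22 parameters comes from the project description.

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical tolerance: how far the typed-in number may differ from the
# value read off the photo before a reviewer has to look at it.
TOLERANCE_CM = 2.0

# The project measured 22 body parameters. Their real names are not given
# in the case study, so placeholder names are used here.
PARAMETERS = [f"param_{i:02d}" for i in range(1, 23)]


@dataclass
class Submission:
    toloker_id: str
    reported_cm: dict[str, float]  # numbers the participant typed in
    photo_cm: dict[str, float]     # values a reviewer read from the photos


def review(submission: Submission,
           parameters: list[str] = PARAMETERS,
           tolerance_cm: float = TOLERANCE_CM) -> tuple[str, list[str]]:
    """Return a status and the list of parameters that caused it."""
    # A set is incomplete if any parameter is missing from either source.
    missing = [p for p in parameters
               if p not in submission.reported_cm or p not in submission.photo_cm]
    if missing:
        return "incomplete", missing

    # Flag parameters where the two sources disagree by more than the tolerance.
    mismatched = [p for p in parameters
                  if abs(submission.reported_cm[p] - submission.photo_cm[p]) > tolerance_cm]
    if mismatched:
        return "needs_verification", mismatched

    return "accepted", []
```

In this sketch, incomplete sets are separated from sets that merely disagree, so that the first group can be discarded while the second is passed on for manual verification.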
Result
At the end of the project, 500 complete sets of measurements were collected from the crowd.
The data plays an essential role in product development and gives the size and fit recommendation solution a competitive advantage. Major French fashion institutes have become interested in using the app and collaborating with the developer on innovative projects.
Fair pay for Tolokers
At Toloka, we strive to ensure fair and flexible payment for crowd participants based on task complexity. This task required more time and effort from Tolokers than typical tasks like classification or recognition: most people spent about 20 minutes taking measurements and submitting photos. Each participant received fair payment for a submitted set of measurements.
With Toloka we were able to capture a large dataset we needed in a shorter period of time. Other advantages of this crowdsourcing solution are price (compared with other providers) and data management. Toloka is a reliable platform in terms of both technology and customer support. — Client’s team
Updated:
Oct 2, 2023