Translating through Crowdsourcing: The AliExpress Story
AliExpress has the highest traffic of any e-commerce website in Russia, with almost 9 million daily visitors and more than 100 million items for sale. When AliExpress Russia opened its doors in 2019, the company quickly ran into localization problems: many catalogued products had inaccurate translations, and the fixes it attempted were lengthy and ultimately unsuccessful. Looking for a new way to solve this, AliExpress turned to Toloka for a crowdsourcing-based solution, which proved to be the right decision.
Replacing outdated methodology
Virtually all automated translation follows the same logic, executed step by step:
Separation of written content into smaller pieces
Translation and validation of individual lexical components bit by bit
Subsequent reassembly of the translated pieces into the full message
Traditional Pipeline
All attempts at improving translation quality have so far followed the same lines. Originally, Toloka was no different: it followed the same tried-and-true pattern of going back and forth between translating and validating until an adequate result was achieved.
The problem is that this method is quite inefficient. It relies on multiple stages (or "projects") and therefore takes a long time to execute. The impracticality becomes even more pronounced when many projects run in parallel. From a business perspective, this sort of setup is far from ideal.
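To make the multi-stage nature of this setup concrete, here is a minimal sketch of the traditional loop. The segment(), translate() and validate() functions are hypothetical placeholders; in practice each of the last two would be a separate crowdsourcing project with its own pool of workers.

```python
# A minimal sketch of the traditional pipeline described above.
# segment(), translate() and validate() are assumed placeholders,
# not Toloka API calls.

def segment(text: str) -> list[str]:
    # Separate the written content into smaller pieces (naively, by sentence).
    return [piece.strip() for piece in text.split(".") if piece.strip()]

def translate(piece: str) -> str:
    # Placeholder for a translation project: one pool of workers translates the piece.
    return f"<translated: {piece}>"

def validate(piece: str, translation: str) -> bool:
    # Placeholder for a validation project: another pool accepts or rejects the result.
    return len(translation) > 0

def translate_listing(text: str, max_rounds: int = 3) -> str:
    translated_pieces = []
    for piece in segment(text):
        translation = translate(piece)
        rounds = 0
        # Go back and forth between translating and validating
        # until the result is accepted or the round limit is reached.
        while not validate(piece, translation) and rounds < max_rounds:
            translation = translate(piece)
            rounds += 1
        translated_pieces.append(translation)
    # Reassemble the translated pieces into the full message.
    return ". ".join(translated_pieces)

print(translate_listing("Wireless headphones. Battery lasts 20 hours."))
```

Every pass through that while loop corresponds to another project run, which is exactly where the time and budget get scattered.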
New translation algorithm
Toloka programmers, led by Andrey Olkhovik, decided to skip the validation stage altogether, at least in the traditional form known to most in the auto-translation field.
New Experimental Approach
Now, Tolokers could choose one of the answers provided in a multiple-choice box containing up to 4 options, or tick "none of the options fits" and offer their own answer.
Task example
As a result, the new versions provided by Tolokers became fixed options in a multiple-choice box for other contributors to choose from, and the whole "choose the right option" process started over.
Other users either chose those answers, thereby verifying them, or offered newer versions of their own. The cycle continued until no better answers were offered and the best answers had been confirmed by most Tolokers.
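The sketch below illustrates this "choose or propose" loop conceptually. The ask_worker() function is a simulated stand-in for a real Toloka task, and the consensus threshold and round limit are illustrative assumptions, not values from the AliExpress project.

```python
import random
from collections import Counter

def ask_worker(options: list[str]) -> str:
    # Simulated Toloker: usually picks one of the offered options,
    # but sometimes ticks "none of the options fits" and proposes a new variant.
    if random.random() < 0.2:
        return f"new variant #{random.randint(1, 100)}"
    return random.choice(options)

def choose_or_propose(initial_options: list[str],
                      workers_per_round: int = 10,
                      consensus_share: float = 0.7,
                      max_rounds: int = 5) -> str:
    options = list(initial_options)[:4]   # the multiple-choice box holds up to 4 options
    best = options[0]
    for _ in range(max_rounds):
        answers = [ask_worker(options) for _ in range(workers_per_round)]
        votes = Counter(answers)
        best, best_count = votes.most_common(1)[0]
        new_variants = [a for a in answers if a not in options]
        # Stop when most workers confirm the same answer and nobody offers a better one.
        if not new_variants and best_count / workers_per_round >= consensus_share:
            return best
        # Otherwise, newly proposed versions become fixed options for the next round.
        options = [opt for opt, _ in votes.most_common(4)]
    return best

print(choose_or_propose(["Wireless headphones", "Cordless earphones"]))
```

The key design choice is that validation is no longer a separate project: choosing an existing option and proposing a better one happen in the same task, so agreement and correction accumulate in one place.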
Results
Although this crowdsourcing-based method requires a much more careful selection of Tolokers, it provides a significant improvement in translation accuracy and offers the following advantages:
1 task – 1 project. Both the metrics and the budget are focused, not scattered across numerous Toloka projects
Less error-prone automation. With less automation in the pipeline, the potential for mistranslations also decreases
Quick start. Less preparation is needed to begin the whole process
Smaller budget. Not 2 projects at $0.01 + 50% commission, but 1 project at $0.02 + 25% commission (see the quick check after this list)
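A quick back-of-the-envelope check of the budget figures above, assuming the quoted prices are per task and the commission is applied on top:

```python
# Per-item cost under the two setups, using the numbers quoted in the article.
old_cost = 2 * 0.01 * (1 + 0.50)   # two projects at $0.01 each + 50% commission
new_cost = 1 * 0.02 * (1 + 0.25)   # one project at $0.02 + 25% commission

print(f"old pipeline: ${old_cost:.3f} per item")   # $0.030
print(f"new pipeline: ${new_cost:.3f} per item")   # $0.025
```

Under these assumptions, the single-project setup works out to roughly 17% cheaper per item, on top of the operational simplicity of running one project instead of two.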
Article written by:
Toloka Team
Updated:
Jun 30, 2021