Success Stories
Learn how companies around the world are pushing the boundaries of AI with LLM post-training and evaluation.

Supporting the launch of JetBrains’ Developer Productivity AI Arena
Feb 12, 2026
How Toloka helped poolside define and measure AI quality for developers
Oct 15, 2025
From word docs to data analysis: Evaluating AI agent performance across everyday apps
Oct 1, 2025
Creating domain-ready datasets: how Toloka's hybrid approach generates realistic and high-quality data
Aug 4, 2025
Detecting hidden harm in long contexts: how Toloka built an advanced safety dataset
Jul 14, 2025
Standardizing AI safety with MLCommons
May 15, 2025
AI agents under attack: a case study on advanced agent red-teaming
Apr 28, 2025
Multi-domain, multi-language SFT dataset pushes LLM performance to the next level
Oct 22, 2024
Toloka helps ServiceNow increase evaluation throughput multiple times
Oct 11, 2024