Success Stories
Learn how companies around the world are pushing the boundaries of AI with LLM post-training and evaluation

How Toloka helped poolside define and measure AI quality for developers
Oct 15, 2025

From word docs to data analysis: Evaluating AI agent performance across everyday apps
Oct 1, 2025

Creating domain-ready datasets: how Toloka's hybrid approach generates realistic and high-quality data
Aug 4, 2025

Detecting hidden harm in long contexts: how Toloka built an advanced safety dataset
Jul 14, 2025

Standardizing AI safety with MLCommons
May 15, 2025

AI agents under attack: a case study on advanced agent red-teaming
Apr 28, 2025

Multi-domain, multi-language SFT dataset pushes LLM performance to the next level
Oct 22, 2024

Toloka helps ServiceNow increase evaluation throughput multiple times
Oct 11, 2024

LLM for code generation: a scalable pipeline to gather SFT data
Apr 29, 2024
