Introducing Tendem:
The first Hybrid AI + Human agent
Tendem is a hybrid AI + Human agent that combines AI automation with human expertise to deliver business-ready results. It automates the entire workflow, from task breakdown to expert matching and quality verification, and delivers higher-quality outputs than AI-only tools and faster, better results than human-only marketplaces.
Tendem performance validation
To assess Tendem's success, we conducted rigorous in-house testing across 94 diverse, hard real-world business tasks. The benchmark measures client-perceived outcomes across three key dimensions: result quality, execution time, and price.
Tendem Internal Benchmark
Benchmark Composition: 94 tasks across four key business functions
Good rate
Tendem: 74.5%
Upwork: 53.2%
ChatGPT Agent: 40.4%
Median total time
Tendem: 16.4 hours
Upwork: 35.0 hours
ChatGPT Agent: n/a
Tendem Benchmark Insights
Quality First
Tendem improves the overall Good rate by +21.3 percentage points (pp) versus Upwork. The largest quality gain is in Completeness (+22.3 pp), indicating that step-gates prevent omissions and enforce acceptance criteria.
Speed
Tendem cuts the median total time by 53% compared to the human-only baseline (16.4 hours versus 35.0 hours on Upwork). This is driven by faster connection and execution times.
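The two headline figures above can be reproduced from the benchmark numbers with simple arithmetic. The sketch below is illustrative only: the helper names are hypothetical, and it assumes the Good rate is a percentage of tasks and that the 53% figure is the relative reduction in median total time versus Upwork.

```python
# Figures quoted in the benchmark summary (Good rate in %, median total time in hours).
good_rate = {"Tendem": 74.5, "Upwork": 53.2, "ChatGPT Agent": 40.4}
median_hours = {"Tendem": 16.4, "Upwork": 35.0}


def percentage_point_gain(rate: float, baseline: float) -> float:
    """Absolute difference between two percentages, in percentage points."""
    return round(rate - baseline, 1)


def relative_reduction(value: float, baseline: float) -> int:
    """Relative reduction versus the baseline, as a whole percent."""
    return round(100 * (baseline - value) / baseline)


quality_gain_pp = percentage_point_gain(good_rate["Tendem"], good_rate["Upwork"])
time_cut_pct = relative_reduction(median_hours["Tendem"], median_hours["Upwork"])

print(f"Good rate gain vs Upwork: +{quality_gain_pp} pp")
print(f"Median time reduction vs Upwork: {time_cut_pct}%")
```

Note the difference in units: the quality gain is an absolute gap in percentage points, while the time saving is a relative change against the Upwork baseline.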
Closing the Gap
For ambiguous, multi-document, or spec-heavy tasks, the hybrid step-gates substantially cut omission/fabrication errors, solving the "last-mile gap" where AI-only systems fail.
Underlying Strength
Tendem's autonomous AI agent is competitive on web browsing/tool-use and close to leading models on hard knowledge, providing a solid backbone for the hybrid system.
External Industry Benchmarks
Tendem’s AI agent — tested in fully automated mode without any human input — performs on par with leading AI systems on standard industry benchmarks:
System                  (unlabeled)   Humanity’s Last Exam   GAIA
Tendem’s AI Agent       71.0%         39.0%                  78.2%
ChatGPT Agent           68.9%         41.6%                  n/a
ChatGPT Deep Research   51.5%         26.6%                  67.4%
Manus                   n/a           n/a                    73.4%
Contributors
Toloka AI
