University-level Math Reasoning Dataset
This dataset is designed to develop the complex reasoning and problem-solving skills of LLMs in STEM.
Size
13,500+ text-only and 600+ multimodal real-world math problems, each with step-by-step solutions and final answers.
Format
LaTeX and natural language explanations. Multimodal samples include images (graphs, diagrams, etc.).
Quality
Created and validated by domain experts (university math professors, teachers, and vetted professionals), ensuring non-synthetic, high-quality content.
Complexity
University-level problems aligned with US university curricula.
Covers 7 core subjects:
Fine-tuning experiments
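Our fine-tuning experiments show that training on this dataset improves LLM performance on complex mathematical reasoning tasks at every skill level, from high school to university and olympiad-level problems. Both fine-tuned 7b models score higher than their base models on all five benchmarks below, and our fine-tuned DeepSeek-Math also scores higher than the Numina fine-tuned DeepSeek-Math on all five.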
| Model | Model size | GSM8k | MATH | MATH lvl-5 | MathOdyssey | U-MATH text |
|---|---|---|---|---|---|---|
| GPT-4o | - | 0.950 | 0.758 | 0.550 | 0.481 | 0.462 |
| Our fine-tuned Mathstral | 7b | 0.859 | 0.583 | 0.319 | 0.382 | 0.293 |
| Mathstral | 7b | 0.832 | 0.486 | 0.224 | 0.336 | 0.189 |
| Our fine-tuned DeepSeek-Math | 7b | 0.843 | 0.543 | 0.262 | 0.372 | 0.251 |
| Numina fine-tuned DeepSeek-Math | 7b | 0.782 | 0.512 | 0.232 | 0.341 | 0.232 |
| DeepSeek-Math | 7b | 0.803 | 0.427 | 0.168 | 0.305 | 0.192 |
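The size of each fine-tuning gain can be read off the reported scores as a simple difference between a fine-tuned model and its base model. The sketch below is only an illustration of that arithmetic: the scores are copied from the results above, while the names `BENCHMARKS`, `SCORES`, and `gains` are hypothetical helpers introduced here and are not part of the dataset or of any evaluation code.

```python
# Scores copied from the reported results; list order follows BENCHMARKS.
# BENCHMARKS, SCORES and gains() are illustrative names, not part of the dataset.
BENCHMARKS = ["GSM8k", "MATH", "MATH lvl-5", "MathOdyssey", "U-MATH text"]

SCORES = {
    "Our fine-tuned Mathstral": [0.859, 0.583, 0.319, 0.382, 0.293],
    "Mathstral": [0.832, 0.486, 0.224, 0.336, 0.189],
    "Our fine-tuned DeepSeek-Math": [0.843, 0.543, 0.262, 0.372, 0.251],
    "DeepSeek-Math": [0.803, 0.427, 0.168, 0.305, 0.192],
}


def gains(tuned, base):
    """Return the absolute score difference (tuned minus base) per benchmark."""
    return {
        name: round(t - b, 3)
        for name, t, b in zip(BENCHMARKS, SCORES[tuned], SCORES[base])
    }


if __name__ == "__main__":
    pairs = [
        ("Our fine-tuned Mathstral", "Mathstral"),
        ("Our fine-tuned DeepSeek-Math", "DeepSeek-Math"),
    ]
    for tuned, base in pairs:
        print(f"{tuned} vs {base}:")
        for name, delta in gains(tuned, base).items():
            print(f"  {name}: {delta:+.3f}")
```

These are absolute differences in the reported scores. The source does not describe the evaluation protocol, so the numbers should be read only as a restatement of the results above.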





