Evaluating Mathematical Reasoning in LLMs
Where: Online
Date: Feb 26, 2025, 17:00 CET
What you'll learn about:
Mathematical reasoning for LLMs: Key use cases and areas where math reasoning fundamentally enhances language model capabilities.
Auto-formalization in math: AlphaProof, Lean, and the applicability of automated verifiers for LLM training and evaluation.
Expert-curated data collection: Best practices for sourcing high-quality datasets to assess and enhance model performance, including for university-level mathematical reasoning.
Designing effective benchmarks: Developing robust evaluation metrics, assessing model performance across different domains, and extracting actionable insights.
Boosting LLM performance with new data: Using a custom university-level training dataset to improve the mathematical proficiency of top-performing LLMs.
Practical applications: Applying the results in academic, educational, and product contexts.