Aligning LLMs to Low-Resource Languages
Feb 22, 2024
00:00 GMT+2
Agenda and key topics
Overview
This tutorial provides a detailed guide to collecting data for aligning large language models (LLMs) with low-resource languages (LRLs). It addresses the challenge of data scarcity in these languages and introduces a pipeline for generating high-quality data, using Swahili as the primary example. The tutorial covers strategies for collecting datasets and for aligning LLMs to LRLs, with guidance on producing and using high-quality data to develop language technology for under-resourced languages.
Materials
Notebooks










