arxiv:2106.03193

The FLORES-101 Evaluation Benchmark for Low-Resource and Multilingual Machine Translation

Published on Jun 6, 2021
Abstract

The FLORES-101 evaluation benchmark improves low-resource and multilingual machine translation by providing high-quality, multilingually aligned translations across 101 languages.

AI-generated summary

One of the biggest challenges hindering progress in low-resource and multilingual machine translation is the lack of good evaluation benchmarks. Current evaluation benchmarks either lack good coverage of low-resource languages, consider only restricted domains, or are of low quality because they are constructed using semi-automatic procedures. In this work, we introduce the FLORES-101 evaluation benchmark, consisting of 3001 sentences extracted from English Wikipedia and covering a variety of different topics and domains. These sentences have been translated into 101 languages by professional translators through a carefully controlled process. The resulting dataset enables better assessment of model quality on the long tail of low-resource languages, including the evaluation of many-to-many multilingual translation systems, as all translations are multilingually aligned. By publicly releasing such a high-quality and high-coverage dataset, we hope to foster progress in the machine translation community and beyond.
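Because every sentence is aligned across all 101 languages, the same data supports evaluation of any of the 101 × 100 ordered translation directions. A minimal sketch of this idea, using toy data rather than the actual FLORES-101 files, showing how source–reference pairs for an arbitrary direction fall out of a multilingually aligned corpus:

```python
from itertools import permutations

# Toy multilingually aligned corpus: each language holds the same sentences
# in the same order (FLORES-101 itself has 101 languages x 3001 sentences).
aligned = {
    "eng": ["The cat sleeps.", "Rain is falling."],
    "fra": ["Le chat dort.", "La pluie tombe."],
    "zul": ["Ikati lilele.", "Imvula iyana."],
}

def direction_pairs(corpus, src, tgt):
    """Yield (source sentence, reference translation) pairs for one direction.

    Alignment guarantees position i in every language is the same sentence,
    so pairing is a simple positional zip.
    """
    return list(zip(corpus[src], corpus[tgt]))

# Every ordered language pair is a usable evaluation direction: N * (N - 1).
directions = list(permutations(aligned, 2))
assert len(directions) == len(aligned) * (len(aligned) - 1)

# e.g. French -> Zulu, a direction with no dedicated parallel test set needed.
pairs = direction_pairs(aligned, "fra", "zul")
```

With 101 languages this yields 10,100 evaluation directions from a single set of 3001 aligned sentences, which is the property the abstract highlights for many-to-many systems.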


Models citing this paper 162


Datasets citing this paper 11


Spaces citing this paper 300
