RuBQ: A Russian Dataset for Question Answering over Wikidata
Abstract
RuBQ is a high-quality Russian knowledge base question answering dataset that pairs each question with an English machine translation, a SPARQL query, and entity links to Wikidata.
The paper presents RuBQ, the first Russian knowledge base question answering (KBQA) dataset. The high-quality dataset consists of 1,500 Russian questions of varying complexity, their English machine translations, SPARQL queries to Wikidata, reference answers, as well as a Wikidata sample of triples containing entities with Russian labels. The dataset creation started with a large collection of question-answer pairs from online quizzes. The data underwent automatic filtering, crowd-assisted entity linking, automatic generation of SPARQL queries, and their subsequent in-house verification.
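To make the "SPARQL queries to Wikidata" component concrete, here is a minimal sketch of how a linked entity and a property could be turned into a one-hop query. This is an illustration only, not the authors' generation procedure, which the abstract does not detail. The function name `build_one_hop_query` and the template are hypothetical; the identifiers `Q159` (Russia) and `P36` (capital) are ordinary Wikidata IDs chosen for the example, and `wd:`/`wdt:` are the standard prefixes of the Wikidata Query Service.

```python
import re
from string import Template

# Hypothetical template for a one-hop question such as
# "What is the capital of Russia?" (assumed shape, not taken from the paper).
ONE_HOP = Template("SELECT ?answer WHERE { wd:$entity wdt:$prop ?answer }")


def build_one_hop_query(entity_id: str, property_id: str) -> str:
    """Build a one-hop SPARQL query from a Wikidata entity ID and property ID."""
    # Reject anything that is not a plain Q-/P-identifier, so the IDs
    # cannot inject extra SPARQL into the query string.
    if not re.fullmatch(r"Q[1-9]\d*", entity_id):
        raise ValueError(f"not a Wikidata entity ID: {entity_id!r}")
    if not re.fullmatch(r"P[1-9]\d*", property_id):
        raise ValueError(f"not a Wikidata property ID: {property_id!r}")
    return ONE_HOP.substitute(entity=entity_id, prop=property_id)


if __name__ == "__main__":
    # Q159 = Russia, P36 = capital
    print(build_one_hop_query("Q159", "P36"))
```

The resulting string can be submitted to any Wikidata SPARQL endpoint. Questions of higher complexity would need more than this single triple pattern.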