ISSN 2071-8594

Russian Academy of Sciences


Gennady Osipov

D. A. Evseev

Query Generation for Complex Question Answering in Russian with Syntax Parser


This paper describes a system that translates a natural language question into a SPARQL query. The question answering system consists of a syntax parser, which builds a syntax tree of the sentence; a component that selects the SPARQL query template using the syntax tree; and models that find the entities and relations used to fill the slots of the SPARQL query template. We use BERT for entity detection and relation ranking. A characteristic of training BERT on knowledge base question answering subtasks in Russian is the small amount of available training data. We therefore investigate training multilingual BERT, pretrained on the LC-QUAD2.0 dataset, on the entity detection and relation ranking tasks using a small number of Russian samples from the RuBQ dataset. The proposed question answering system outperforms previous approaches on the RuBQ dataset.
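The slot-filling step described above can be illustrated with a minimal sketch. It is not the paper's implementation: the template names, the `build_query` helper, the Wikidata-style `wd:`/`wdt:` prefixes, and the example identifiers are all assumptions made for illustration. In the described system, the template would be chosen from the syntax tree, and the entity and relation would come from the BERT-based models.

```python
from string import Template

# Hypothetical query templates keyed by a question-type label.
# In the described system the label would be derived from the syntax tree.
QUERY_TEMPLATES = {
    "simple": Template(
        "SELECT ?answer WHERE { wd:$entity wdt:$relation ?answer . }"
    ),
    "count": Template(
        "SELECT (COUNT(?x) AS ?answer) WHERE { wd:$entity wdt:$relation ?x . }"
    ),
}


def build_query(template_name: str, entity_id: str, relation_id: str) -> str:
    """Fill the entity and relation slots of the chosen template.

    entity_id and relation_id stand in for the outputs of the entity
    detection and relation ranking models (assumed, not from the paper).
    """
    template = QUERY_TEMPLATES[template_name]
    return template.substitute(entity=entity_id, relation=relation_id)


# Illustrative identifiers only; they are placeholders for model outputs.
query = build_query("simple", "Q649", "P17")
print(query)
```

Keeping templates separate from slot values mirrors the division of labour in the abstract: one component decides the query shape, the others supply what goes into it.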

question answering system, knowledge base, query generation, multilingual BERT.

PP. 57-65.

DOI 10.14357/20718594210305