Gennady Osipov

F.N. Solovyev, A.M. Chepovskiy

An extension of the short text language identification model


In this work we address the problem of natural language identification in short texts. A Bayesian classifier is employed. We propose an extension of the language identification model that incorporates additional Cyrillic-script languages of the small nations of Russia.


statistical language model, natural language identification, languages of Russian small nations.

