ISSN 2071-8594

Russian Academy of Sciences

Editor-in-Chief

G.S. Osipov

D.A. Devyatkin, R.E. Suvorov, I.V. Sochenkov "Architecture of a Search and Analytics System and a Study of the Information Space Related to the Arctic Zone"

Abstract.

This paper addresses the development of methods and experimental software tools for information support of decision-making, based on analysis of the information space for a given topic. It presents the problem statement for building such tools, their architecture and main operating algorithm, the types and structure of information sources, and the types of queries that may be useful to an expert. The information background surrounding the Arctic zone is used as an example. The paper also presents the methodology and results of an initial analysis of this information space, including a description of current storylines and of association rules that capture significant relationships between entities.

Keywords:

information retrieval system, information space monitoring, event monitoring, information extraction, relation extraction, knowledge bases, decision support.

Pp. 37-46.

