ISSN 2071-8594

Russian Academy of Sciences

Editor-in-Chief

Gennady Osipov

V.L. Arlazarov, E.L. Pliskin, A.V. Soloviev

Measuring topic divergence in document networks and its possible applications

Abstract.

The paper proposes a new step toward the interdisciplinary adoption of topic modeling (TM), an approach popular in applied statistics and machine learning. We introduce the TM-based concept of "topical divergence", which captures the individuality of a document (or of its author) relative to its local network neighborhood. We propose several possible interdisciplinary applications of topical divergence in sociology and the management of science.
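As a rough illustration of the idea only, the sketch below scores a document by the distance between its topic mixture and the average topic mixture of its network neighbors. The abstract does not give the paper's actual definition, so everything here is an assumption: the use of the Hellinger distance, the averaging over neighbors, and the names `hellinger`, `topical_divergence`, `theta` and `neighbors` are hypothetical choices made for the example.

```python
import math

def hellinger(p, q):
    """Hellinger distance between two discrete distributions of equal length."""
    return math.sqrt(0.5 * sum((math.sqrt(a) - math.sqrt(b)) ** 2
                               for a, b in zip(p, q)))

def topical_divergence(theta, neighbors, doc):
    """Distance between a document's topic mixture and its neighborhood mean.

    theta:     dict mapping document id -> list of topic probabilities
    neighbors: dict mapping document id -> list of adjacent document ids
    Returns None for a document with no neighbors (assumed undefined there).
    """
    adjacent = neighbors.get(doc, [])
    if not adjacent:
        return None
    n_topics = len(theta[doc])
    # Average the neighbors' topic mixtures, topic by topic.
    mean = [sum(theta[n][i] for n in adjacent) / len(adjacent)
            for i in range(n_topics)]
    return hellinger(theta[doc], mean)

# Toy data (hypothetical): three documents over two topics.
theta = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 0.0]}
neighbors = {"a": ["b"], "b": [], "c": ["a"]}

for doc in theta:
    print(doc, topical_divergence(theta, neighbors, doc))
```

Under these assumptions the score lies between 0 (the document matches its neighborhood) and 1 (the document shares no topic with it). In practice the topic mixtures would come from a fitted topic model and the neighbor lists from a citation or co-authorship network.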

Keywords:

topic modeling, document networks, probabilistic methods, social networks, sociology of culture, management of science.

PP. 62-67.
