×

Properties of vector embeddings in social networks. (English) Zbl 1462.91017

Summary: Embedding social network data into a low-dimensional vector space has shown promising performance for many real-world applications, such as node classification, node clustering, link prediction and network visualization. However, the information contained in these vector embeddings remains abstract and hard to interpret. Methods for inspecting embeddings usually rely on visualization methods, which do not work on a larger scale and do not give concrete interpretations of vector embeddings in terms of preserved network properties (e.g., centrality or betweenness measures). In this paper, we study and investigate network properties preserved by recent random walk-based embedding procedures like node2vec, DeepWalk or LINE. We propose a method that applies learning to rank in order to relate embeddings to network centralities. We evaluate our approach with extensive experiments on real-world and artificial social networks. Experiments show that each embedding method learns different network properties. In addition, we show that our graph embeddings in combination with neural networks provide a computationally efficient way to approximate the Closeness Centrality measure in social networks.

MSC:

91D30 Social networks; opinion dynamics
05C90 Applications of graph theory

References:

[1] Kossinets, G.; Watts, D.J.; Empirical analysis of an evolving social network; Science: 2006; Volume 311 ,88-90. · Zbl 1226.91055
[2] Romero, D.M.; Kleinberg, J.M.; The directed closure process in hybrid social-information networks, with an analysis of link formation on Twitter; Proceedings of the Fourth International Conference on Weblogs and Social Media, ICWSM 2010: ; .
[3] Szabo, G.; Huberman, B.A.; Predicting the popularity of online content; Commun. ACM: 2010; Volume 53 ,80-88.
[4] Sakaki, T.; Okazaki, M.; Matsuo, Y.; Earthquake shakes Twitter users: Real-time event detection by social sensors; Proceedings of the 19th international conference on World wide web, ACM: ; ,851-860.
[5] Helic, D.; Strohmaier, M.; Granitzer, M.; Scherer, R.; Models of human navigation in information networks based on decentralized search; Proceedings of the 24th ACM Conference on Hypertext and Social Media: ; ,89-98.
[6] Helic, D.; Körner, C.; Granitzer, M.; Strohmaier, M.; Trattner, C.; Navigational efficiency of broad vs. narrow folksonomies; Proceedings of the 23rd ACM conference on Hypertext and social media: ; ,63-72.
[7] He, X.; Gao, M.; Kan, M.Y.; Wang, D.; Birank: Towards ranking on bipartite graphs; IEEE Trans. Knowl. Data Eng.: 2017; Volume 29 ,57-71.
[8] Wang, X.; Nie, L.; Song, X.; Zhang, D.; Chua, T.S.; Unifying virtual and physical worlds: Learning toward local and global consistency; ACM Trans. Inf. Syst.: 2017; Volume 36 ,4.
[9] Asur, S.; Huberman, B.A.; Predicting the Future with Social Media; Proceedings of the 2010 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology: ; Volume Volume 1 ,492-499.
[10] Grover, A.; Leskovec, J.; node2vec: Scalable feature learning for networks; Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining: ; ,855-864.
[11] Shaw, B.; Jebara, T.; Structure Preserving Embedding; Proceedings of the 26th Annual International Conference on Machine Learning, ICML ’09: ; ,937-944.
[12] Perozzi, B.; Al-Rfou, R.; Skiena, S.; Deepwalk: Online learning of social representations; Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining: ; ,701-710.
[13] Mikolov, T.; Chen, K.; Corrado, G.; Dean, J.; Efficient estimation of word representations in vector space; arXiv: 2013; .
[14] Tang, J.; Qu, M.; Wang, M.; Zhang, M.; Yan, J.; Mei, Q.; Line: Large-scale information network embedding; Proceedings of the 24th International Conference on World Wide Web. International World Wide Web Conferences Steering Committee: ; ,1067-1077.
[15] Fatemeh Salehi Rizi, M.G.; Ziegler, K.; Global and Local Feature Learning for Ego-Network Analysis; Proceedings of the 14th International Workshop on Technologies for Information Retrieval (TIR): ; .
[16] Le, Q.; Mikolov, T.; Distributed representations of sentences and documents; Proceedings of the 31st International Conference on Machine Learning (ICML-14): ; ,1188-1196.
[17] Mcauley, J.; Leskovec, J.; Discovering social circles in ego networks; ACM Trans. Knowl. Discov. Data: 2014; Volume 8 ,4.
[18] Ding, C.H.; He, X.; Zha, H.; Gu, M.; Simon, H.D.; A min-max cut algorithm for graph partitioning and data clustering; Proceedings of the 2001 IEEE International Conference on Data Mining: ; ,107-114.
[19] Liben-Nowell, D.; Kleinberg, J.; The link-prediction problem for social networks; J. Assoc. Inf. Sci. Technol.: 2007; Volume 58 ,1019-1031.
[20] Ziegler, K.; Caelen, O.; Garchery, M.; Granitzer, M.; He-Guelton, L.; Jurgovsky, J.; Portier, P.E.; Zwicklbauer, S.; Injecting Semantic Background Knowledge into Neural Networks using Graph Embeddings; Proceedings of the 2017 IEEE 26th International Conference on Enabling Technologies: Infrastructure for Collaborative Enterprises (WETICE): ; .
[21] van der Maaten, L.; Hinton, G.; Visualizing data using t-SNE; J. Mach. Learn. Res.: 2008; Volume 9 ,2579-2605. · Zbl 1225.68219
[22] Mahendran, A.; Vedaldi, A.; Visualizing deep convolutional neural networks using natural pre-images; Int. J. Comput. Vis.: 2016; Volume 120 ,233-255.
[23] Feder, T.; Motwani, R.; Clique partitions, graph compression and speeding-up algorithms; Proceedings of the Twenty-Third Annual ACM Symposium on Theory of Computing: ; ,123-133. · Zbl 0831.68073
[24] Newman, M.E.; A measure of betweenness centrality based on random walks; Soc. Netw.: 2005; Volume 27 ,39-54.
[25] Rojas, R.; ; Neural Networks: A Systematic Introduction: Berlin, Germany 2013; . · Zbl 0861.68072
[26] Goyal, P.; Ferrara, E.; Graph Embedding Techniques, Applications, and Performance: A Survey; arXiv: 2017; .
[27] Goldberg, Y.; Levy, O.; word2vec Explained: Deriving Mikolov et al.’s negative-sampling word-embedding method; arXiv: 2014; .
[28] Kullback, S.; Leibler, R.A.; On information and sufficiency; Ann. Math. Stat.: 1951; Volume 22 ,79-86. · Zbl 0042.38403
[29] Recht, B.; Re, C.; Wright, S.; Niu, F.; Hogwild: A lock-free approach to parallelizing stochastic gradient descent; Proceedings of the Advances in Neural Information Processing Systems: ; ,693-701.
[30] Janicke, S.; Heine, C.; Hellmuth, M.; Stadler, P.F.; Scheuermann, G.; Visualization of graph products; IEEE Trans. Vis. Comput. Graph.: 2010; Volume 16 ,1082-1089.
[31] Wang, D.; Cui, P.; Zhu, W.; Structural deep network embedding; Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining: ; ,1225-1234.
[32] Ou, M.; Cui, P.; Pei, J.; Zhang, Z.; Zhu, W.; Asymmetric Transitivity Preserving Graph Embedding; Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining: ; ,1105-1114.
[33] Li, J.; Dani, H.; Hu, X.; Tang, J.; Chang, Y.; Liu, H.; Attributed Network Embedding for Learning in a Dynamic Environment; arXiv: 2017; .
[34] Liao, L.; He, X.; Zhang, H.; Chua, T.S.; Attributed Social Network Embedding; arXiv: 2017; .
[35] Okamoto, K.; Chen, W.; Li, X.Y.; Ranking of closeness centrality for large-scale social networks; Lect. Notes Comput. Sci.: 2008; Volume 5059 ,186-195. · Zbl 1143.91362
[36] Zafarani, R.; Abbasi, M.A.; Liu, H.; ; Social Media Mining: An Introduction: Cambridge, UK 2014; .
[37] Borgatti, S.P.; Centrality and network flow; Soc. Netw.: 2005; Volume 27 ,55-71.
[38] Ferrara, E.; Fiumara, G.; Topological features of online social networks; arXiv: 2012; . · Zbl 1329.91112
[39] Sun, B.; Mitra, P.; Giles, C.L.; Learning to rank graphs for online similar graph search; Proceedings of the 18th ACM Conference on Information and Knowledge Management: ; ,1871-1874.
[40] Agarwal, S.; Learning to rank on graphs; Mach. Learn.: 2010; Volume 81 ,333-357. · Zbl 1470.68068
[41] Yazdani, M.; Collobert, R.; Popescu-Belis, A.; Learning to rank on network data; Proceedings of the Eleventh Workshop on Mining and Learning with Graphs: ; .
[42] Herbrich, R.; Graepel, T.; Obermayer, K.; Large margin rank boundaries for ordinal regression; Advances in Large Margin Classifiers: Cambridge, MA, USA 2000; .
[43] Boser, B.E.; Guyon, I.M.; Vapnik, V.N.; A training algorithm for optimal margin classifiers; Proceedings of the Fifth Annual Workshop on Computational Learning Theory: ; ,144-152.
[44] Hyndman, R.J.; Koehler, A.B.; Another look at measures of forecast accuracy; Int. J. Forecast.: 2006; Volume 22 ,679-688.
[45] Nair, V.; Hinton, G.E.; Rectified linear units improve restricted boltzmann machines; Proceedings of the 27th International Conference on Machine Learning (ICML-10): ; ,807-814.
[46] Han, J.; Moraga, C.; The influence of the sigmoid function parameters on the speed of backpropagation learning; From Natural to Artificial Neural Computation: Berlin/Heidelberg, Germany 1995; ,195-201.
[47] Li, M.; Zhang, T.; Chen, Y.; Smola, A.J.; Efficient mini-batch training for stochastic optimization; Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining: ; ,661-670.
[48] Albert, R.; Barabási, A.L.; Statistical mechanics of complex networks; Rev. Mod. Phys.: 2002; Volume 74 ,47. · Zbl 1205.82086
[49] Thomas, A.; Community Detection for NetworkX’s Documentation; ; .
[50] Stephens, M.A.; EDF statistics for goodness of fit and some comparisons; J. Am. Stat. Assoc.: 1974; Volume 69 ,730-737.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.