Multi-source information fusion based heterogeneous network embedding. (English) Zbl 1459.68171

Summary: Heterogeneous network embedding aims to learn a mapping from network data in the original topological space to vector representations in a low-dimensional latent space, while preserving valuable information such as structure and semantics. The resulting vector representations have shown promising performance in a wide range of real-world applications, such as node classification and node clustering. However, most existing methods focus only on modeling network structural information and ignore the rich multi-source information carried by different types of nodes. In this paper, we propose a novel Multi-source Information Fusion based Heterogeneous Network Embedding (MIFHNE) approach. We first capture the semantic information using a meta-graph based random walk strategy. Subsequently, we jointly model structural proximity, attribute information and label information within the framework of Nonnegative Matrix Factorization (NMF). Theoretical proofs and comprehensive experiments on two real-world heterogeneous network datasets demonstrate the feasibility and effectiveness of our approach.
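
The summary does not give the MIFHNE objective, so the following is only a minimal sketch of the general idea of fusing two information sources in an NMF framework. It assumes a hypothetical structural proximity matrix `S` and attribute matrix `A` that share one nonnegative embedding `W`, and it uses standard Lee-Seung style multiplicative updates [28]. The function name `joint_nmf`, the weight `alpha` and the toy data are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def joint_nmf(S, A, rank, alpha=1.0, n_iter=300, eps=1e-9, seed=0):
    """Factorize S ~ W @ H_s and A ~ W @ H_a with a shared nonnegative W.

    Minimizes ||S - W H_s||_F^2 + alpha * ||A - W H_a||_F^2 using
    multiplicative updates. This is a generic joint NMF, NOT the
    objective of the MIFHNE paper (which the summary does not state).
    """
    rng = np.random.default_rng(seed)
    n = S.shape[0]
    # Strictly positive random initialization keeps all factors nonnegative.
    W = rng.random((n, rank)) + eps
    H_s = rng.random((rank, S.shape[1])) + eps
    H_a = rng.random((rank, A.shape[1])) + eps
    losses = []
    for _ in range(n_iter):
        WtW = W.T @ W
        # Update the source-specific factors with W held fixed.
        H_s *= (W.T @ S) / (WtW @ H_s + eps)
        H_a *= (W.T @ A) / (WtW @ H_a + eps)
        # Update the shared embedding using both sources.
        numer = S @ H_s.T + alpha * (A @ H_a.T)
        denom = W @ (H_s @ H_s.T + alpha * (H_a @ H_a.T)) + eps
        W *= numer / denom
        loss = (np.linalg.norm(S - W @ H_s) ** 2
                + alpha * np.linalg.norm(A - W @ H_a) ** 2)
        losses.append(loss)
    return W, H_s, H_a, losses


# Toy data (hypothetical): 4 nodes, symmetric adjacency, 3 binary attributes.
S = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
A = np.array([[1, 0, 1],
              [1, 0, 0],
              [0, 1, 1],
              [0, 1, 0]], dtype=float)

W, H_s, H_a, losses = joint_nmf(S, A, rank=2)
print("node embeddings:\n", W)
print("first / last loss:", losses[0], losses[-1])
```

Each row of `W` is a node embedding that has to explain both matrices at once; `alpha` trades off how strongly the attributes influence it relative to the structure. A label matrix could be added as a third term in the same way, again as an assumption rather than a description of the published method.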

MSC:

68T05 Learning and adaptive systems in artificial intelligence
68R10 Graph theory (including graph drawing) in computer science

References:

[1] Shi, C.; Li, Y.; Zhang, J.; Sun, Y.; Yu, P. S., A survey of heterogeneous information network analysis, IEEE Trans. Knowl. Data Eng., 29, 1, 17-37 (2017)
[2] Zhang, M.; Wang, J.; Wang, W., Heterank: a general similarity measure in heterogeneous information networks by integrating multi-type relationships, Inf. Sci., 453, 389-407 (2018) · Zbl 1440.68319
[3] Cui, P.; Wang, X.; Pei, J.; Zhu, W., A survey on network embedding, IEEE Trans. Knowl. Data Eng., 31, 5, 833-852 (2019)
[4] Li, B.; Pi, D., Network representation learning: a systematic literature review, Neural Comput. Appl. (2020)
[5] Gui, H.; Liu, J.; Tao, F.; Jiang, M.; Norick, B.; Kaplan, L. M.; Han, J., Embedding learning with events in heterogeneous information networks, IEEE Trans. Knowl. Data Eng., 29, 11, 2428-2441 (2017)
[6] Dong, Y.; Chawla, N. V.; Swami, A., metapath2vec: scalable representation learning for heterogeneous networks, (Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2017)), 135-144
[7] Li, H.; Wang, H.; Yang, Z.; Odagaki, M., Variation autoencoder based network representation learning for classification, (Proceedings of ACL 2017, Student Research Workshop (2017)), 56-61
[8] Perozzi, B.; Al-Rfou, R.; Skiena, S., Deepwalk: online learning of social representations, (The 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2014)), 701-710
[9] Grover, A.; Leskovec, J., node2vec: scalable feature learning for networks, (Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2016)), 855-864
[10] Mikolov, T.; Chen, K.; Corrado, G.; Dean, J., Efficient estimation of word representations in vector space, (International Conference on Learning Representations (2013))
[11] Tang, J.; Qu, M.; Wang, M.; Zhang, M.; Yan, J.; Mei, Q., LINE: large-scale information network embedding, (International World Wide Web Conferences (2015)), 1067-1077
[12] Cao, S.; Lu, W.; Xu, Q., Grarep: learning graph representations with global structural information, (Proceedings of the 24th ACM International Conference on Information and Knowledge Management (2015)), 891-900
[13] Wang, X.; Cui, P.; Wang, J.; Pei, J.; Zhu, W.; Yang, S., Community preserving network embedding, (Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence (2017)), 203-209
[14] Cao, S.; Lu, W.; Xu, Q., Deep neural networks for learning graph representations, (Thirtieth AAAI Conference on Artificial Intelligence (2016)), 1145-1152
[15] Dai, Q.; Li, Q.; Tang, J.; Wang, D., Adversarial network embedding, (Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence (2018))
[16] Yang, C.; Liu, Z.; Zhao, D.; Sun, M.; Chang, E. Y., Network representation learning with rich text information, (Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence (2015)), 2111-2117
[17] Tu, C.; Zhang, W.; Liu, Z.; Sun, M., Max-margin deepwalk: discriminative learning of network representation, (International Joint Conference on Artificial Intelligence (2016)), 3889-3895
[18] Pan, S.; Wu, J.; Zhu, X.; Zhang, C.; Wang, Y., Tri-party deep network representation, (International Joint Conference on Artificial Intelligence (2016)), 1895-1901
[19] Huang, X.; Li, J.; Hu, X., Accelerated attributed network embedding, (SIAM International Conference on Data Mining (2017)), 633-641
[20] Liao, L.; He, X.; Zhang, H.; Chua, T.-S., Attributed social network embedding, IEEE Trans. Knowl. Data Eng., 30, 12, 2257-2270 (2018)
[21] Jacob, Y.; Denoyer, L.; Gallinari, P., Learning latent representations of nodes for classifying in heterogeneous social networks, (Proceedings of the 7th ACM International Conference on Web Search and Data Mining (2014)), 373-382
[22] Chang, S.; Han, W.; Tang, J.; Qi, G.; Aggarwal, C. C.; Huang, T. S., Heterogeneous network embedding via deep architectures, (Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2015)), 119-128
[23] Tu, K.; Cui, P.; Wang, X.; Wang, F.; Zhu, W., Structural deep embedding for hyper-networks, (Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence (2018))
[24] Zhang, D.; Yin, J.; Zhu, X.; Zhang, C., Metagraph2vec: complex semantic path augmented heterogeneous network embedding, (Pacific-Asia Conference on Knowledge Discovery and Data Mining (2018)), 196-208
[25] Shi, C.; Hu, B.; Zhao, W. X.; Yu, P. S., Heterogeneous information network embedding for recommendation, IEEE Trans. Knowl. Data Eng., 31, 2, 357-370 (2018)
[26] Pio, G.; Serafino, F.; Malerba, D.; Ceci, M., Multi-type clustering and classification from heterogeneous networks, Inf. Sci., 425, 107-126 (2018)
[27] Lei, X.; Zhang, Y., Predicting disease-genes based on network information loss and protein complexes in heterogeneous network, Inf. Sci., 479, 386-400 (2019)
[28] Lee, D. D.; Seung, H. S., Algorithms for non-negative matrix factorization, (Proceedings of the 13th International Conference on Neural Information Processing Systems (2001)), 535-541
[29] Fang, Y.; Lin, W.; Zheng, V. W.; Wu, M.; Chang, C. C.; Li, X. L., Semantic proximity search on graphs with metagraph-based learning, (IEEE International Conference on Data Engineering (2016)), 277-288, https://doi.org/10.1109/ICDE.2016.7498247
[30] Lin, Y. R.; Sun, J.; Sundaram, H.; Kelliher, A.; Castro, P.; Konuru, R., Community discovery via metagraph factorization, ACM Trans. Knowl. Discov. Data, 5, 3, 1-44 (2011)
[31] Cai, H.; Zheng, V. W.; Chang, C. C., A comprehensive survey of graph embedding: problems, techniques and applications, IEEE Trans. Knowl. Data Eng., 30, 9, 1616-1637 (2018)
[32] Li, J. H.; Wang, C. D.; Huang, L.; Huang, D.; Lai, J. H.; Chen, P., Attributed network embedding with micro-meso structure, (International Conference on Database Systems for Advanced Applications (2018)), 20-36, https://doi.org/10.1007/978-3-319-91452-7_2
[33] Katz, L., A new status index derived from sociometric analysis, Psychometrika, 18, 1, 39-43 (1953) · Zbl 0053.27606
[34] Goyal, P.; Ferrara, E., Graph embedding techniques, applications, and performance: a survey, Knowl.-Based Syst., 151, 78-94 (2018)
[35] Akata, Z.; Thurau, C.; Bauckhage, C., Non-negative matrix factorization in multimodality data for segmentation and label prediction, (16th Computer Vision Winter Workshop (2011))
[36] Boyd, S.; Vandenberghe, L., Convex Optimization (2004), Cambridge University Press · Zbl 1058.90049
[37] Tang, J.; Zhang, J.; Yao, L.; Li, J.; Zhang, L.; Su, Z., Arnetminer: extraction and mining of academic social networks, (ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2008)), 990-998
[38] Le, Q. V.; Mikolov, T., Distributed representations of sentences and documents, (Proceedings of the 31st International Conference on Machine Learning (2014)), 1188-1196
[39] Arandjelovic, O., Weighted linear fusion of multimodal data: a reasonable baseline?, (Proceedings of the 2016 ACM Conference on Multimedia Conference (2016)), 851-857
[40] Mikolov, T.; Sutskever, I.; Chen, K.; Corrado, G. S.; Dean, J., Distributed representations of words and phrases and their compositionality, (27th Annual Conference on Neural Information Processing Systems (2013)), 3111-3119
[41] Ribeiro, L. F.R.; Saverese, P. H.P.; Figueiredo, D. R., struc2vec: learning node representations from structural identity, (Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2017)), 385-394
[42] Silva, J.; Willett, R., Hypergraph-based anomaly detection of high-dimensional co-occurrences, IEEE Trans. Pattern Anal. Mach. Intell., 31, 3, 563-569 (2009)
[43] Tang, L.; Liu, H., Relational learning via latent social dimensions, (ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2009)), 817-826
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases the data have been complemented or enhanced with data from zbMATH Open. The list attempts to reflect the references in the original paper as accurately as possible, without claiming completeness or a perfect matching.