×

HNS: hierarchical negative sampling for network representation learning. (English) Zbl 1475.68235

Summary: Network representation learning (NRL) aims at modeling network graph by encoding vertices and edges into a low-dimensional space. These learned representations can be used for subsequent applications, such as vertex classification and link prediction. Negative Sampling (NS) is the most widely used method for boosting the performance of NRL. However, most of the existing work only randomly draws negative samples based on vertex frequencies, i.e., the vertices with higher frequency are more likely to be drawn, which ignores the situation that the sampled one may not be a true negative sample, thus, lead to undesirable embeddings. In this paper, we propose a new negative sampling method, called Hierarchical Negative Sampling (HNS), which is able to model the latent structures of vertices and learn the relations among them. During sampling, HNS can draw more appropriate negative samples and thereby obtain better performance on network embeddings. Firstly, we theoretically demonstrate the superiority of HNS over NS. And then we use experimental results to show that our proposed method outperforms the state-of-the-art models on vertex classification tasks at different training scales in real-world networks.

MSC:

68R10 Graph theory (including graph drawing) in computer science
62D05 Sampling theory, sample surveys
68T05 Learning and adaptive systems in artificial intelligence
Full Text: DOI

References:

[1] Armandpour, M.; Ding, P.; Huang, J.; Hu, X., Robust negative sampling for network embedding, Proceedings of the AAAI Conference on Artificial Intelligence., 33, 3191-3198 (2019)
[2] Blei, D. M.; Ng, A. Y.; Jordan, M. I., Latent dirichlet allocation, Journal of Machine Learning Research, 3, Jan, 993-1022 (2003) · Zbl 1112.68379
[3] L. Cai, W.Y. Wang, Kbgan: Adversarial learning for knowledge graph embeddings, 2017. arXiv preprint arXiv:1711.04071.
[4] Cavallari, S.; Zheng, V. W.; Cai, H.; Chang, K. C.-C.; Cambria, E., Learning community embedding with community detection and node embedding on graphs, (Proceedings of the 2017 ACM on Conference on Information and Knowledge Management (2017)), 377-386
[5] Chen, J.; Gong, Z.; Liu, W., A nonparametric model for online topic discovery with word embeddings, Information Sciences, 504, 32-47 (2019)
[6] Chen, J.; Gong, Z.; Liu, W., A dirichlet process biterm-based mixture model for short text stream clustering, Applied Intelligence, 1-11 (2020)
[7] Chen, L.; Yuan, F.; Jose, J. M.; Zhang, W., Improving negative sampling for word representation using self-embedded features, (Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining (2018)), 99-107
[8] Cui, P.; Wang, X.; Pei, J.; Zhu, W., A survey on network embedding, IEEE Transactions on Knowledge and Data Engineering, 31, 5, 833-852 (2018)
[9] L. Du, Y. Wang, G. Song, Z. Lu, J. Wang, Dynamic network embedding: An extended approach for skip-gram based network embedding, in: IJCAI, 2018. pp. 2086-2092.
[10] Dunlavy, D. M.; Kolda, T. G.; Acar, E., Temporal link prediction using matrix and tensor factorizations, ACM Transactions on Knowledge Discovery from Data (TKDD), 5, 2, 1-27 (2011)
[11] Fan, R.-E.; Chang, K.-W.; Hsieh, C.-J.; Wang, X.-R.; Lin, C.-J., Liblinear: A library for large linear classification, Journal of Machine Learning Research, 9, Aug, 1871-1874 (2008) · Zbl 1225.68175
[12] Fischer, A.; Botero, J. F.; Beck, M. T.; De Meer, H.; Hesselbach, X., Virtual network embedding: A survey, IEEE Communications Surveys & Tutorials, 15, 4, 1888-1906 (2013)
[13] Gao, H.; Huang, H., Self-paced network embedding, (Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (2018)), 1406-1415
[14] Griffiths, T. L.; Steyvers, M., Finding scientific topics, Proceedings of the National academy of Sciences, 101, suppl 1, 5228-5235 (2004)
[15] Grover, A.; Leskovec, J., node2vec: Scalable feature learning for networks, (Proceedings of the 22nd ACM SIGKDD international conference on Knowledge discovery and data mining (2016)), 855-864
[16] Guo, J.; Gong, Z., A nonparametric model for event discovery in the geospatial-temporal space, (Proceedings of the 25th ACM International on Conference on Information and Knowledge Management (2016)), 499-508
[17] Gutmann, M. U.; Hyvärinen, A., Noise-contrastive estimation of unnormalized statistical models, with applications to natural image statistics, Journal of Machine Learning Research, 13, Feb, 307-361 (2012) · Zbl 1283.62064
[18] Liu, Y.; Liu, Z.; Chua, T.-S.; Sun, M., Topical word embeddings, (Twenty-Ninth AAAI Conference on Artificial Intelligence (2015))
[19] Liu, Z.; Zhang, Y.; Chang, E. Y.; Sun, M., Plda+ parallel latent dirichlet allocation with data placement and pipeline processing, ACM Transactions on Intelligent Systems and Technology (TIST), 2, 3, 1-18 (2011)
[20] McCallum, A. K.; Nigam, K.; Rennie, J.; Seymore, K., Automating the construction of internet portals with machine learning, Information Retrieval, 3, 2, 127-163 (2000)
[21] Mikolov, T., Chen, K., Corrado, G., Dean, J., 2013. Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781.
[22] T. Mikolov, I. Sutskever, K. Chen, G.S. Corrado, J. Dean, Distributed representations of words and phrases and their compositionality, in: Advances in neural information processing systems, 2013, pp. 3111-3119.
[23] A. Mnih, Y.W. Teh, A fast and simple algorithm for training neural probabilistic language models, 2012. arXiv preprint arXiv:1206.6426.
[24] Morin, F.; Bengio, Y., Hierarchical probabilistic neural network language model, Aistats, 5, 246-252 (2005), Citeseer
[25] Perozzi, B.; Al-Rfou, R.; Skiena, S., Deepwalk: Online learning of social representations, (Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2014)), 701-710
[26] Ribeiro, L. F.; Saverese, P. H.; Figueiredo, D. R., struc2vec: Learning node representations from structural identity, (Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2017)), 385-394
[27] Shi, B.; Lam, W.; Jameel, S.; Schockaert, S.; Lai, K. P., Jointly learning word embeddings and latent topics, (Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval (2017)), 375-384
[28] Sun, A.; Lim, E.-P., Hierarchical text classification and evaluation, (Proceedings 2001 IEEE International Conference on Data Mining. IEEE (2001)), 521-528
[29] Tang, J.; Qu, M.; Wang, M.; Zhang, M.; Yan, J.; Mei, Q., Line: Large-scale information network embedding, (Proceedings of the 24th International Conference on World Wide Web (2015)), 1067-1077
[30] Tang, J.; Zhang, J.; Yao, L.; Li, J.; Zhang, L.; Su, Z., Arnetminer: extraction and mining of academic social networks, (Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2008)), 990-998
[31] Y.W. Teh, Dirichlet process, Rncyclopedia of Machine Learning, 2011.
[32] Y.W.Teh, M.I. Jordan, M.J. Beal, D.M. Blei, Sharing clusters among related groups: Hierarchical dirichlet processes, in: Advances in Neural Information Processing Systems, 2005, pp. 1385-1392.
[33] Tsitsulin, A.; Mottin, D.; Karras, P.; Müller, E., Verse: Versatile graph embeddings from similarity measures, (Proceedings of the 2018 World Wide Web Conference (2018)), 539-548
[34] C. Tu, H. Liu, Z. Liu, M. Sun, Cane: Context-aware network embedding for relation modeling, in: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (vol. 1: Long Papers), 2017. pp. 1722-1731.
[35] Tu, C.; Zeng, X.; Wang, H.; Zhang, Z.; Liu, Z.; Sun, M.; Zhang, B.; Lin, L., A unified framework for community detection and network representation learning, IEEE Transactions on Knowledge and Data Engineering, 31, 6, 1051-1065 (2018)
[36] Tu, K.; Cui, P.; Wang, X.; Yu, P. S.; Zhu, W., Deep recursive network embedding with regular equivalence, (Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (2018)), 2357-2366
[37] Wang, P.; Li, S.; Pan, R., Incorporating gan for negative sampling in knowledge representation learning, (Thirty-Second AAAI Conference on Artificial Intelligence (2018))
[38] Wang, Q.; Mao, Z.; Wang, B.; Guo, L., Knowledge graph embedding: A survey of approaches and applications, IEEE Transactions on Knowledge and Data Engineering, 29, 12, 2724-2743 (2017)
[39] Wang, W.; Chen, J.; Wang, J.; Chen, J.; Gong, Z., Geography-aware inductive matrix completion for personalized point of interest recommendation in smart cities, IEEE Internet of Things Journal. (2019)
[40] Wang, W.; Chen, J.; Wang, J.; Chen, J.; Liu, J.; Gong, Z., Trust-enhanced collaborative filtering for personalized point of interests recommendation, IEEE Transactions on Industrial Informatics (2019)
[41] Wang, X.; Cui, P.; Wang, J.; Pei, J.; Zhu, W.; Yang, S., Community preserving network embedding, (Thirty-first AAAI Conference on Artificial Intelligence (2017))
[42] Wang, Y.; Bai, H.; Stanton, M.; Chen, W.-Y.; Chang, E. Y., Plda: Parallel latent dirichlet allocation for large-scale applications, (International Conference on Algorithmic Applications in Management (2009), Springer), 301-314
[43] Williams, R. J., Simple statistical gradient-following algorithms for connectionist reinforcement learning, Machine Learning, 8, 3-4, 229-256 (1992) · Zbl 0772.68076
[44] Yang, C.; Liu, Z.; Zhao, D.; Sun, M.; Chang, E., Network representation learning with rich text information, (Twenty-Fourth International Joint Conference on Artificial Intelligence (2015))
[45] Yang, J.; Leskovec, J., Overlapping community detection at scale: a nonnegative matrix factorization approach, (Proceedings of the Sixth ACM International Conference on Web Search and Data Mining (2013)), 587-596
[46] Yang, J.; McAuley, J.; Leskovec, J., Community detection in networks with node attributes, (2013 IEEE 13th International Conference on Data Mining (2013), IEEE), 1151-1156
[47] Yin, J.; Wang, J., A model-based approach for text clustering with outlier detection, (2016 IEEE 32nd International Conference on Data Engineering (ICDE) (2016), IEEE), 625-636
[48] Zhang, C.; Swami, A.; Chawla, N. V., Shne: Representation learning for semantic-associated heterogeneous networks, (Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining (2019)), 690-698
[49] Zhang, Y.; Yao, Q.; Shao, Y.; Chen, L., Nscaching: Simple and efficient negative sampling for knowledge graph embedding, (2019 IEEE 35th International Conference on Data Engineering (ICDE) (2019), IEEE), 614-625
[50] Zhang, Z.; Cui, P.; Wang, X.; Pei, J.; Yao, X.; Zhu, W., Arbitrary-order proximity preserved network embedding, (Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (2018)), 2778-2786
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.