skip to main content
research-article

Neural PathSim for Inductive Similarity Search in Heterogeneous Information Networks

Published: 30 October 2021 Publication History

Abstract

PathSim is a widely used meta-path-based similarity in heterogeneous information networks. Numerous applications rely on the computation of PathSim, including similarity search and clustering. Computing PathSim scores on large graphs is computationally challenging due to its high time and storage complexity. In this paper, we propose to transform the problem of approximating the ground truth PathSim scores into a learning problem. We design an encoder-decoder based framework, NeuPath, where the algorithmic structure of PathSim is considered. Specifically, the encoder module identifies Top T optimized path instances, which can approximate the ground truth PathSim, and maps each path instance to an embedding vector. The decoder transforms each embedding vector into a scalar respectively, which identifies the similarity score. We perform extensive experiments on two real-world datasets in different domains, ACM and IMDB. Our results demonstrate that NeuPath performs better than state-of-the-art baselines in the PathSim approximation task and similarity search task.

References

[1]
Peter W Battaglia, Jessica B Hamrick, Victor Bapst, Alvaro Sanchez-Gonzalez, Vinicius Zambaldi, Mateusz Malinowski, Andrea Tacchetti, David Raposo, Adam Santoro, Ryan Faulkner, et al. Relational inductive biases, deep learning, and graph networks. arXiv preprint arXiv:1806.01261, 2018.
[2]
Phuc Do and Phu Pham. Dw-pathsim: a distributed computing model for topic-driven weighted meta-path-based similarity measure in a large-scale content-based heterogeneous information network. Journal of Information and Telecommunication, 3 (1): 19--38, 2019.
[3]
Yuxiao Dong, Nitesh V Chawla, and Ananthram Swami. metapath2vec: Scalable representation learning for heterogeneous networks. In SIGKDD, pages 135--144, 2017.
[4]
Changjun Fan, Li Zeng, Yuhui Ding, Muhao Chen, Yizhou Sun, and Zhong Liu. Learning to identify high betweenness centrality nodes from scratch: A novel graph neural network approach. In CIKM, pages 559--568, 2019.
[5]
Yuan Fang, Wenqing Lin, Vincent W Zheng, Min Wu, Kevin Chen-Chuan Chang, and Xiao-Li Li. Semantic proximity search on graphs with metagraph-based learning. In ICDE, pages 277--288, 2016.
[6]
Francois Fouss, Alain Pirotte, Jean-Michel Renders, and Marco Saerens. Random-walk computation of similarities between nodes of a graph with application to collaborative recommendation. IEEE Transactions on knowledge and data engineering, 19 (3): 355--369, 2007.
[7]
Justin Gilmer, Samuel S Schoenholz, Patrick F Riley, Oriol Vinyals, and George E Dahl. Neural message passing for quantum chemistry. ICML, 2017.
[8]
Aditya Grover and Jure Leskovec. node2vec: Scalable feature learning for networks. In SIGKDD, pages 855--864, 2016.
[9]
William L Hamilton, Rex Ying, and Jure Leskovec. Inductive representation learning on large graphs. In NeurIPS, pages 1025--1035, 2017.
[10]
Yu He, Yangqiu Song, Jianxin Li, Cheng Ji, Jian Peng, and Hao Peng. Hetespaceywalk: A heterogeneous spacey random walk for heterogeneous information network embedding. In CIKM, pages 639--648, 2019.
[11]
Anahita Hosseini, Ting Chen, Wenjun Wu, Yizhou Sun, and Majid Sarrafzadeh. Heteromed: Heterogeneous information network for medical diagnosis. In CIKM, pages 763--772, 2018.
[12]
Shifu Hou, Yanfang Ye, Yangqiu Song, and Melih Abdulhayoglu. Hindroid: An intelligent android malware detection system based on structured heterogeneous information network. In SIGKDD, pages 1507--1515, 2017.
[13]
Ziniu Hu, Yuxiao Dong, Kuansan Wang, and Yizhou Sun. Heterogeneous graph transformer. In WWW, pages 2704--2710, 2020.
[14]
elin and Kek"al"ainen(2002)]jarvelin2002cumulatedKalervo J"arvelin and Jaana Kek"al"ainen. Cumulated gain-based evaluation of ir techniques. ACM Transactions on Information Systems (TOIS), 20 (4): 422--446, 2002.
[15]
Glen Jeh and Jennifer Widom. Simrank: a measure of structural-context similarity. In Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining, pages 538--543, 2002.
[16]
Glen Jeh and Jennifer Widom. Scaling personalized web search. In WWW, pages 271--279, 2003.
[17]
Thomas N Kipf and Max Welling. Semi-supervised classification with graph convolutional networks. ICLR, 2017.
[18]
Yehuda Koren. Factorization meets the neighborhood: a multifaceted collaborative filtering model. In SIGKDD, pages 426--434, 2008.
[19]
]lao2010fastNi Lao and William W Cohen. Fast query execution for retrieval models based on path-constrained random walks. In SIGKDD, pages 881--888, 2010 a.
[20]
]lao2010relationalNi Lao and William W Cohen. Relational retrieval using a combination of path-constrained random walks. Machine learning, 81 (1): 53--67, 2010 b.
[21]
Pan Li, Yanbang Wang, Hongwei Wang, and Jure Leskovec. Distance encoding: Design provably more powerful neural networks for graph representation learning. In NeurIPS, 2020.
[22]
Xin Liu, Haojie Pan, Mutian He, Yangqiu Song, Xin Jiang, and Lifeng Shang. Neural subgraph isomorphism counting. In SIGKDD, pages 1959--1969, 2020.
[23]
Zhaoyu Lou, Jiaxuan You, Chengtao Wen, Arquimedes Canedo, Jure Leskovec, et al. Neural subgraph matching. arXiv preprint arXiv:2007.03092, 2020.
[24]
Changping Meng, Reynold Cheng, Silviu Maniu, Pierre Senellart, and Wangda Zhang. Discovering meta-paths in large heterogeneous information networks. In WWW, pages 754--764, 2015.
[25]
Lawrence Page, Sergey Brin, Rajeev Motwani, and Terry Winograd. The pagerank citation ranking: Bringing order to the web. Technical report, Stanford InfoLab, 1999.
[26]
Adam Santoro, David Raposo, David G Barrett, Mateusz Malinowski, Razvan Pascanu, Peter Battaglia, and Timothy Lillicrap. A simple neural network module for relational reasoning. In NeurIPS, pages 4967--4976, 2017.
[27]
Michael Schlichtkrull, Thomas N Kipf, Peter Bloem, Rianne Van Den Berg, Ivan Titov, and Max Welling. Modeling relational data with graph convolutional networks. In ESWC, pages 593--607, 2018.
[28]
Chuan Shi, Xiangnan Kong, Yue Huang, S Yu Philip, and Bin Wu. Hetesim: A general framework for relevance measure in heterogeneous networks. IEEE Transactions on Knowledge and Data Engineering, 26 (10): 2479--2492, 2014.
[29]
Chuan Shi, Yitong Li, Jiawei Zhang, Yizhou Sun, and S Yu Philip. A survey of heterogeneous information network analysis. IEEE Transactions on Knowledge and Data Engineering, 29 (1): 17--37, 2016.
[30]
Chuan Shi, Binbin Hu, Wayne Xin Zhao, and S Yu Philip. Heterogeneous information network embedding for recommendation. IEEE Transactions on Knowledge and Data Engineering (TKDE), 31 (2): 357--370, 2018.
[31]
Yizhou Sun, Jiawei Han, Xifeng Yan, Philip S. Yu, and Tianyi Wu. Pathsim: Meta path-based top-k similarity search in heterogeneous information networks. VLDB, pages 992--1003, 2011.
[32]
Yizhou Sun, Brandon Norick, Jiawei Han, Xifeng Yan, Philip S. Yu, and Xiao Yu. Integrating meta-path selection with user-guided object clustering in heterogeneous information networks. In SIGKDD, pages 1348--1356, 2012.
[33]
Jian Tang, Meng Qu, and Qiaozhu Mei. Pte: Predictive text embedding through large-scale heterogeneous text networks. In SIGKDD, pages 1165--1174, 2015.
[34]
Hanghang Tong, Christos Faloutsos, and Jia-Yu Pan. Fast random walk with restart and its applications. In ICDM, pages 613--622. IEEE, 2006.
[35]
ović et al.(2018)Velivc ković, Cucurull, Casanova, Romero, Lio, and Bengio]velivckovic2017graphPetar Velivc ković, Guillem Cucurull, Arantxa Casanova, Adriana Romero, Pietro Lio, and Yoshua Bengio. Graph attention networks. ICLR, 2018.
[36]
Petar Velikovi, Rex Ying, Matilde Padovano, Raia Hadsell, and Charles Blundell. Neural execution of graph algorithms. In ICLR, 2020.
[37]
Chenguang Wang, Yangqiu Song, Haoran Li, Ming Zhang, and Jiawei Han. Knowsim: A document similarity measure on structured heterogeneous information networks. In ICDM, pages 1015--1020, 2015.
[38]
Chenguang Wang, Yangqiu Song, Haoran Li, Ming Zhang, and Jiawei Han. Text classification with heterogeneous information network kernels. In AAAI, volume 30, 2016.
[39]
Xiao Wang, Houye Ji, Chuan Shi, Bai Wang, Yanfang Ye, Peng Cui, and Philip S. Yu. Heterogeneous graph attention network. In WWW, pages 2022--2032, 2019.
[40]
Keylu Xu, Jingling Li, Mozhi Zhang, Simon S Du, Ken-ichi Kawarabayashi, and Stefanie Jegelka. What can neural networks reason about? ICLR, 2020.
[41]
Keyulu Xu, Weihua Hu, Jure Leskovec, and Stefanie Jegelka. How powerful are graph neural networks? In ICLR, 2019.
[42]
Xiaowei Xu, Nurcan Yuruk, Zhidan Feng, and Thomas AJ Schweiger. Scan: a structural clustering algorithm for networks. In SIGKDD, pages 824--833, 2007.
[43]
Xiao Yu, Xiang Ren, Yizhou Sun, Quanquan Gu, Bradley Sturt, Urvashi Khandelwal, Brandon Norick, and Jiawei Han. Personalized entity recommendation: A heterogeneous information network approach. In WSDM, pages 283--292, 2014.
[44]
Seongjun Yun, Minbyul Jeong, Raehyun Kim, Jaewoo Kang, and Hyunwoo J Kim. Graph transformer networks. In NeurIPS, pages 11960--11970, 2019.
[45]
Huan Zhao, Quanming Yao, Jianda Li, Yangqiu Song, and Dik Lun Lee. Meta-graph based recommendation fusion over heterogeneous information networks. In SIGKDD, pages 635--644, 2017.

Cited By

View all
  • (2024)Adaptive Graph Neural Network with Incremental Learning Mechanism for Knowledge Graph ReasoningElectronics10.3390/electronics1314277813:14(2778)Online publication date: 15-Jul-2024
  • (2024)A fast malware detection model based on heterogeneous graph similarity searchComputer Networks10.1016/j.comnet.2024.110799254(110799)Online publication date: Dec-2024
  • (2022)Semantic enhanced Top-k similarity search on weighted HINNeural Computing and Applications10.1007/s00521-022-07339-634:19(16911-16927)Online publication date: 1-Oct-2022

Recommendations

Comments

Information & Contributors

Information

Published In

cover image ACM Conferences
CIKM '21: Proceedings of the 30th ACM International Conference on Information & Knowledge Management
October 2021
4966 pages
ISBN:9781450384469
DOI:10.1145/3459637
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 30 October 2021

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. graph neural networks
  2. heterogeneous information networks
  3. similarity search

Qualifiers

  • Research-article

Funding Sources

  • NSFC
  • RIF
  • GRF
  • MHKJFS

Conference

CIKM '21
Sponsor:

Acceptance Rates

Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)27
  • Downloads (Last 6 weeks)4
Reflects downloads up to 23 Oct 2024

Other Metrics

Citations

Cited By

View all
  • (2024)Adaptive Graph Neural Network with Incremental Learning Mechanism for Knowledge Graph ReasoningElectronics10.3390/electronics1314277813:14(2778)Online publication date: 15-Jul-2024
  • (2024)A fast malware detection model based on heterogeneous graph similarity searchComputer Networks10.1016/j.comnet.2024.110799254(110799)Online publication date: Dec-2024
  • (2022)Semantic enhanced Top-k similarity search on weighted HINNeural Computing and Applications10.1007/s00521-022-07339-634:19(16911-16927)Online publication date: 1-Oct-2022

View Options

Get Access

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media