
Saliency-aware regularized graph neural network. (English) Zbl 1537.68175

Summary: The crux of graph classification lies in learning an effective representation of the entire graph. Typical graph neural networks focus on modeling local dependencies when aggregating features of neighboring nodes, and obtain the representation of the entire graph by aggregating node features. Such methods have two potential limitations: 1) the global saliency of nodes w.r.t. graph classification is not explicitly modeled, which is crucial since different nodes may have different semantic relevance to the classification task; 2) a graph representation aggregated directly from node features may be of limited effectiveness in reflecting graph-level information. In this work, we propose the Saliency-Aware Regularized Graph Neural Network (SAR-GNN) for graph classification, which consists of two core modules: 1) a traditional graph neural network serving as the backbone for learning node features, and 2) the Graph Neural Memory, designed to distill a compact graph representation from the node features of the backbone. We first estimate the global node saliency by measuring the semantic similarity between the compact graph representation and the node features. The learned saliency distribution is then leveraged to regularize the neighborhood aggregation of the backbone, which facilitates the message passing of features from salient nodes and suppresses that of less relevant nodes. Thus, our model can learn more effective graph representations. We demonstrate the merits of SAR-GNN by extensive experiments on seven datasets covering various types of graph data.
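The two-module design described above can be illustrated with a short sketch. The Python/PyTorch code below is a minimal, illustrative approximation and not the authors' implementation: the class names (GraphNeuralMemory, SaliencyAwareLayer), the single learnable attention query used to distill the compact graph representation, cosine similarity as the semantic-similarity measure, and the dense adjacency matrix are all assumptions made for the example.

# Minimal sketch (not the paper's code) of saliency-aware regularized aggregation.
# Assumptions: dense adjacency, one learnable memory query, cosine similarity as
# the saliency measure; all names below are illustrative, not the authors' API.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GraphNeuralMemory(nn.Module):
    """Distills a compact graph representation from node features via attention."""
    def __init__(self, dim):
        super().__init__()
        self.query = nn.Parameter(torch.randn(1, dim))  # learnable memory query
        self.key = nn.Linear(dim, dim)
        self.value = nn.Linear(dim, dim)

    def forward(self, h):                                # h: (num_nodes, dim)
        attn = torch.softmax(self.query @ self.key(h).t() / h.size(1) ** 0.5, dim=-1)
        return attn @ self.value(h)                      # (1, dim) graph representation

class SaliencyAwareLayer(nn.Module):
    """One message-passing layer whose neighborhood aggregation is re-weighted
    by node saliency (similarity of node features to the graph representation)."""
    def __init__(self, dim):
        super().__init__()
        self.lin = nn.Linear(dim, dim)

    def forward(self, h, adj, graph_repr):               # adj: (N, N) dense adjacency
        # Saliency: cosine similarity of each node to the compact graph representation.
        sal = torch.softmax(F.cosine_similarity(h, graph_repr, dim=-1), dim=0)  # (N,)
        # Scale each sender node's messages by its saliency before aggregating.
        messages = adj * sal.unsqueeze(0)
        h_new = F.relu(self.lin(messages @ h) + h)       # aggregation + residual
        return h_new, sal

if __name__ == "__main__":
    N, dim = 6, 16
    h = torch.randn(N, dim)
    adj = (torch.rand(N, N) > 0.5).float()
    memory = GraphNeuralMemory(dim)
    layer = SaliencyAwareLayer(dim)
    graph_repr = memory(h)
    h, sal = layer(h, adj, graph_repr)
    print(h.shape, sal)

In this sketch the saliency scores scale the columns of the adjacency matrix, so messages sent by salient nodes are amplified while those of less relevant nodes are damped, mirroring the regularized neighborhood aggregation described in the summary.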

MSC:

68T07 Artificial neural networks and deep learning
62H30 Classification and discrimination; cluster analysis (statistical aspects)
