Whole-graph embedding and adversarial attacks for life sciences. (English) Zbl 07909190

Mondaini, Rubem P. (ed.), Trends in biomathematics: stability and oscillations in environmental, social, and biological models. Selected works from the BIOMAT consortium lectures, Rio de Janeiro, Brazil, November 1–5, 2021. Cham: Springer. 1-21 (2022).
Summary: Networks provide a suitable model for many scientific and technological problems that require the representation of complex entities and their relations. Life sciences applications include systems biology, where molecular components are represented in integrated systems in which the interactions among them provide richer information than single components taken separately, and neuroimaging, where brain networks represent the connectivity between different brain locations. In the examples we focus on, a set of networks is available, with each network representing an entity (e.g., a molecule, a macromolecule, or a patient) and links expressing their relation in the chemical/biological domain.
The growing size and complexity of biomedical networks and the high computational complexity of graph analysis methods have led to the investigation of so-called whole-graph embedding techniques. Here, graphs are projected into lower-dimensional vector spaces while retaining their structural properties, which reduces data complexity while preserving topological and structural information. These techniques are showing very promising results in terms of usability and potential. However, little research has focused on their reliability and robustness. This need is strongly felt in real-world applications, where corrupted data, due either to acquisition noise or to intentional attacks, could lead to misleading conclusions for the task at hand.
Our objective here is to investigate the use of adversarial attacks on whole-graph embedding methods in order to evaluate their robustness for classification in applications of interest to the life sciences.
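The kind of robustness evaluation described in the summary can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the authors' method: the embedding is a simple degree histogram, the "attack" is random edge rewiring, the classifier is nearest-centroid, and the data are synthetic random graphs. All function names (`random_graph`, `embed`, `rewire`, `predict`, `accuracy`) and parameter values are hypothetical.

```python
import random
from itertools import combinations


def random_graph(n, p, rng):
    """Erdos-Renyi graph as a set of undirected edges (i, j) with i < j."""
    return {(i, j) for i, j in combinations(range(n), 2) if rng.random() < p}


def embed(edges, n, bins=8):
    """Toy whole-graph embedding (assumption): normalized degree histogram."""
    deg = [0] * n
    for i, j in edges:
        deg[i] += 1
        deg[j] += 1
    hist = [0.0] * bins
    for d in deg:
        hist[min(bins - 1, d * bins // n)] += 1.0 / n
    return hist


def rewire(edges, n, budget, rng):
    """Random structural perturbation: delete k edges, then add k edges."""
    edges = set(edges)
    k = min(budget, len(edges))
    for e in rng.sample(sorted(edges), k):
        edges.remove(e)
    non_edges = [e for e in combinations(range(n), 2) if e not in edges]
    edges.update(rng.sample(non_edges, k))
    return edges


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def predict(x, centroids):
    """Label of the centroid closest to embedding x (squared Euclidean)."""
    def dist(label):
        return sum((a - b) ** 2 for a, b in zip(x, centroids[label]))
    return min(centroids, key=dist)


def accuracy(graphs, labels, centroids, n):
    """Fraction of graphs whose embedding is assigned the correct label."""
    hits = sum(predict(embed(g, n), centroids) == y
               for g, y in zip(graphs, labels))
    return hits / len(graphs)


rng = random.Random(0)
N = 30  # nodes per graph (arbitrary choice)

# Two synthetic classes: sparse (label 0) and dense (label 1) graphs.
train = [(random_graph(N, p, rng), y)
         for y, p in ((0, 0.1), (1, 0.3)) for _ in range(20)]
test = [(random_graph(N, p, rng), y)
        for y, p in ((0, 0.1), (1, 0.3)) for _ in range(20)]

centroids = {y: centroid([embed(g, N) for g, label in train if label == y])
             for y in (0, 1)}

test_graphs = [g for g, _ in test]
test_labels = [y for _, y in test]

# Compare accuracy on clean test graphs and on perturbed copies.
clean_acc = accuracy(test_graphs, test_labels, centroids, N)
attacked = [rewire(g, N, 20, rng) for g in test_graphs]
attacked_acc = accuracy(attacked, test_labels, centroids, N)

print(f"clean accuracy:    {clean_acc:.2f}")
print(f"attacked accuracy: {attacked_acc:.2f}")
```

Random rewiring is only a weak baseline: it keeps the edge count fixed and is not optimized against the classifier, so the accuracy gap it produces may be small. A targeted attack would instead choose the edge changes that most degrade the prediction.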
For the entire collection see [Zbl 1515.92004].

MSC:

92C42 Systems biology, networks
05C90 Applications of graph theory

Software:

EigenPooling
