Knowledge graph embedding methods for entity alignment: experimental review. (English) Zbl 07741289

Summary: In recent years, we have witnessed the proliferation of knowledge graphs (KGs) in various domains, supporting applications such as question answering and recommendation. A frequent task when integrating knowledge from different KGs is to find which subgraphs refer to the same real-world entity, a task commonly known as entity alignment. Recently, embedding methods have been applied to this task: they learn a vector-space representation of entities that preserves their similarity in the original KGs. A wide variety of supervised, unsupervised, and semi-supervised methods have been proposed that exploit both the factual (attribute-based) and structural (relation-based) information of entities in the KGs. Still, a quantitative assessment of their strengths and weaknesses on real-world KGs, according to different performance metrics and KG characteristics, is missing from the literature. In this work, we conduct the first meta-level analysis of popular embedding methods for entity alignment, based on a statistically sound methodology. Our analysis reveals statistically significant correlations of different embedding methods with various meta-features extracted from the KGs, and ranks the methods in a statistically significant way according to their effectiveness across all real-world KGs of our testbed. Finally, we study interesting trade-offs between the methods’ effectiveness and efficiency.

MSC:

68T30 Knowledge representation
68R10 Graph theory (including graph drawing) in computer science