
A topological approach for capturing high-order interactions in graph data with applications to anomaly detection in time-varying cryptocurrency transaction graphs. (English) Zbl 07927553

Summary: Time-varying graphs are increasingly common in financial, social and biological data analysis applications. Feature extraction that efficiently encodes the complex structure of sparse, multi-layered, dynamic graphs presents computational and methodological challenges. In the past decade, topological data analysis has become a popular method of studying the shape of data. This is achieved by building, on top of the data, an increasing sequence of simplicial complexes (called a filtration) indexed by a scale parameter, and keeping track of the topological changes along the filtration. The resulting multi-scale summary, called a persistence diagram (PD), is often vectorized for use in machine learning algorithms. This paper introduces a topological approach to extracting information on higher-order interactions in graph data, as encoded in persistence diagrams. Our framework has two main steps. First, we convert the graph into a higher-dimensional simplicial complex by adding structures such as triangles, tetrahedra, etc., and compute a PD using the so-called lower-star filtration, which uses quantitative node attributes. Then, we vectorize the PD by averaging the associated Betti function over successive scale values of a one-dimensional grid using integration. A notable aspect of our procedure is that it avoids embedding a graph into a metric space. We show that the proposed vectorization summary is robust against input noise with respect to the \(L_1\) 1-Wasserstein distance. In simulation studies, the proposed approach leads to improved change point detection rates and outperforms one of the state-of-the-art methods for anomaly detection in time-varying graphs. In a real data application, our approach yields up to a 20% gain in anomalous price prediction in the Ethereum cryptocurrency transaction network.
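
The vectorization step described in the summary can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the PD is given as finite or infinite (birth, death) pairs, takes the Betti function \(\beta(t)\) to be the number of pairs with birth \(\le t <\) death, and averages \(\beta\) over each cell \([t_i, t_{i+1}]\) of a grid by exact integration. The function name `betti_curve_vector` and the toy diagram are hypothetical.

```python
import numpy as np

def betti_curve_vector(diagram, grid):
    """Average the Betti function over each cell of a 1-D grid.

    diagram: iterable of (birth, death) pairs (assumed input format).
    grid:    strictly increasing scale values t_0 < t_1 < ... < t_m.
    Returns a vector of length m whose i-th entry is
    (1 / (t_{i+1} - t_i)) * integral of beta(t) over [t_i, t_{i+1}].
    """
    diagram = np.asarray(diagram, dtype=float).reshape(-1, 2)
    grid = np.asarray(grid, dtype=float)
    births = diagram[:, 0][:, None]   # shape (n_points, 1)
    deaths = diagram[:, 1][:, None]
    left = grid[:-1][None, :]         # shape (1, n_cells)
    right = grid[1:][None, :]
    # Each pair contributes the length of the overlap between its
    # lifetime interval [birth, death) and the grid cell.
    overlap = np.clip(np.minimum(deaths, right) - np.maximum(births, left),
                      0.0, None)
    return overlap.sum(axis=0) / (grid[1:] - grid[:-1])

# Toy diagram with two features (illustrative values only).
pd_points = [(0.0, 2.0), (1.0, 3.0)]
grid = [0.0, 1.0, 2.0, 3.0, 4.0]
vec = betti_curve_vector(pd_points, grid)
print(vec)
```

Because each entry is an integral of a step function rather than a point evaluation, a small shift in a birth or death value changes the vector by a correspondingly small amount, which is the kind of behaviour the stability statement in the summary concerns.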


55N31 Persistent homology and applications, topological data analysis
62R40 Topological data analysis
68T09 Computational aspects of data analysis and big data