×

Bayesian anomaly detection methods for social networks. (English) Zbl 1194.62021

Summary: Learning the network structure of a large graph is computationally demanding, and dynamically monitoring the network over time for any changes in structure threatens to be more challenging still. This paper presents a two-stage method for anomaly detection in dynamic graphs: the first stage uses simple, conjugate Bayesian models for discrete time counting processes to track the pairwise links of all nodes in the graph to assess normality of behavior; the second stage applies standard network inference tools on a greatly reduced subset of potentially anomalous nodes. The utility of the method is demonstrated on simulated and real data sets.

MSC:

62F15 Bayesian inference
91D30 Social networks; opinion dynamics
05C90 Applications of graph theory
65C60 Computational problems in statistics (MSC2010)

References:

[1] Bernardo, J. M. and Smith, A. F. M. (1994). Bayesian Theory . Wiley, Chichester. · Zbl 0796.62002
[2] Brin, S. and Page, L. (1998). The anatomy of a large-scale hypertextual web search engine. Computer Networks and ISDN Systems 30 107-117.
[3] Faloutsos, C., McCurley, K. S. and Tomkins, A. (2004). Connection subgraphs in social networks. In Proceeding of SIAM International Conference on Data Mining, SIAM Workshop on Link Analysis, Counterterrorism and Security . SIAM, Philadelphia.
[4] Ferguson, T. S. (1973). A Bayesian analysis of some nonparametric problems. Ann. Statist. 1 209-230. · Zbl 0255.62037 · doi:10.1214/aos/1176342360
[5] Heard, N. A., Weston, D. J., Platanioti, K. and Hand, D. J. (2010). Supplement to “Bayesian anomaly detection methods for social networks.” , DOI:10.1214/10-AOAS329SUPPB . · Zbl 1194.62021 · doi:10.1214/10-AOAS329
[6] Mullahy, J. (1986). Specification and testing of some modified count data models. J. Econometrics 33 341-365. · doi:10.1016/0304-4076(86)90002-3
[7] Pan, J.-Y., Yang, H.-J., Faloutsos, C. and Duygulu, P. (2004). Automatic multimedia cross-modal correlation discovery. In KDD’04: Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 653-658. ACM, New York.
[8] Priebe, C. E., Conroy, J. M., Marchette, D. J. and Park, Y. (2005). Scan statistics on Enron graphs. Computational & Mathematical Organization Theory 11 229-247. · Zbl 1086.68562
[9] Tong, H., Faloutsos, C. and Pan, J.-Y. (2006). Fast random walk with restart and its applications. In ICDM’06: Sixth IEEE International Conference on Data Mining 613-622. IEEE Computer Society, Washington, DC.
[10] von Luxburg, U. (2007). A tutorial on spectral clustering. Statist. Comput. 17 395-416. · doi:10.1007/s11222-007-9033-z
[11] Wasserman, S. and Pattison, P. (1996). Logit models and logistic regressions for social networks: I. An introduction to Markov graphs and p. Psychometrika 61 401-425. · Zbl 0866.92029 · doi:10.1007/BF02294547
[12] Ye, Q., Zhu, T., Hu, D., Wu, B., Du, N. and Wang, B. (2008). Cell phone mini challenge award: Social network accuracy-exploring temporal communication in mobile call graphs. In IEEE International Symposium on Visual Analytics Science and Technology 207-208. IEEE, Piscataway, NJ.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.