×

Model-based clustering of time-evolving networks through temporal exponential-family random graph models. (English) Zbl 1437.62233

Summary: Dynamic networks are a general language for describing time-evolving complex systems, and discrete time network models provide an emerging statistical technique for various applications. It is a fundamental research question to detect a set of nodes sharing similar connectivity patterns in time-evolving networks. Our work is primarily motivated by detecting groups based on interesting features of the time-evolving networks (e.g., stability). In this work, we propose a model-based clustering framework for time-evolving networks based on discrete time exponential-family random graph models, which simultaneously allows both modeling and detecting group structure. To choose the number of groups, we use the conditional likelihood to construct an effective model selection criterion. Furthermore, we propose an efficient variational expectation-maximization (EM) algorithm to find approximate maximum likelihood estimates of network parameters and mixing proportions. The power of our method is demonstrated in simulation studies and empirical applications to international trade networks and the collaboration networks of a large research university.

MSC:

62H30 Classification and discrimination; cluster analysis (statistical aspects)
62P20 Applications of statistics to economics
05C90 Applications of graph theory
94C15 Applications of graph theory to circuits and networks

References:

[1] Agarwal, A.; Xue, L., Model-based clustering of nonparametric weighted networks with application to water pollution analysis, Technometrics (2019)
[2] Airoldi, E. M.; Blei, D. M.; Fienberg, S. E.; Xing, E. P., Mixed membership stochastic blockmodels, J. Mach. Learn. Res., 9, 1981-2014 (2008) · Zbl 1225.68143
[3] Allman, E. S.; Matias, C.; Rhodes, J. A., Identifiability of parameters in latent structure models with many observed variables, Ann. Statist., 37, 6A, 3099-3132 (2009) · Zbl 1191.62003
[4] Allman, E. S.; Matias, C.; Rhodes, J. A., Parameter identifiability in a class of random graph mixture models, J. Statist. Plann. Inference, 141, 5, 1719-1736 (2011) · Zbl 1207.62010
[5] Amini, A. A.; Chen, A.; Bickel, P. J.; Levina, E., Pseudo-likelihood methods for community detection in large sparse networks, Ann. Statist., 41, 4, 2097-2122 (2013) · Zbl 1277.62166
[6] Bearman, P. S.; Moody, J.; Stovel, K., Chains of affection: The structure of adolescent romantic and sexual networks, Am. J. Sociol., 110, 1, 44-91 (2004)
[7] D. Bertsimas, 15.093J Optimization Methods, Massachusetts Institute of Technology: MIT OpenCourseWare. URL: https://ocw.mit.edu, (2009).
[8] Bickel, P. J.; Chen, A., A nonparametric view of network models and Newman-Girvan and other modularities, Proc. Natl. Acad. Sci., 106, 50, 21068-21073 (2009) · Zbl 1359.62411
[9] Choi, D. S.; Wolfe, P. J.; Airoldi, E. M., Stochastic blockmodels with a growing number of classes, Biometrika, 99, 2, 273-284 (2012) · Zbl 1318.62207
[10] Corneli, M.; Latouche, P.; Rossi, F., Multiple change points detection and clustering in dynamic networks, Stat. Comput., 28, 5, 989-1007 (2018) · Zbl 1405.62076
[11] Daudin, J.-J.; Picard, F.; Robin, S., A mixture model for random graphs, Stat. Comput., 18, 173-283 (2008)
[12] Fraley, C.; Raftery, A. E., Model-based clustering, discriminant analysis, and density estimation, J. Am. Stat. Assoc., 97, 458, 611-631 (2002) · Zbl 1073.62545
[13] Girvan, M.; Newman, M. E., Community structure in social and biological networks, Proc. Natl. Acad. Sci., 99, 12, 7821-7826 (2002) · Zbl 1032.91716
[14] Han, J.-D. J.; Bertin, N.; Hao, T.; Goldberg, D. S.; Berriz, G. F.; Zhang, L. V.; Dupuy, D.; Walhout, A. J.M.; Cusick, M. E.; Roth, F. P.; Vidal, M., Evidence for dynamically organized modularity in the yeast protein-protein interaction network, Nature, 430, 6995, 88-93 (2004)
[15] Handcock, M. S.; Raftery, A. E.; Tantrum, J. M., Model-based clustering for social networks, J. R. Stat. Soc. A, 170, 2, 301-354 (2007)
[16] Hanneke, S.; Fu, W.; Xing, E., Discrete temporal models of social networks, Electron. J. Stat., 4, 585-605 (2010) · Zbl 1329.91113
[17] Ho, Q.; Song, L.; Xing, E., Evolving cluster mixed-membership blockmodel for time-varying networks, J. Mach. Learn. Res.: Workshop Conf. Proc., 15, 342-350 (2011)
[18] Hoff, P. D.; Raftery, A. E.; Handcock, M. S., Latent space approaches to social network analysis, J. Amer. Statist. Assoc., 97, 460, 1090-1098 (2002) · Zbl 1041.62098
[19] Holland, P. W.; Laskey, K. B.; Leinhardt, S., Stochastic blockmodels: First steps, Social Networks, 5, 2, 109-137 (1983)
[20] Hunter, D. R.; Lange, K., A tutorial on MM algorithms, Amer. Statist., 58, 1, 30-37 (2004)
[21] Ji, P.; Jin., J., Coauthorship and citation networks for statisticians, Ann. Appl. Stat., 10, 4, 1779-1812 (2016) · Zbl 1454.62541
[22] Karrer, B.; Newman, M. E., Stochastic blockmodels and community structure in networks, Phys. Rev. E, 83, 1, 016107 (2011)
[23] Kim, B.; Lee, K.; Xue, L.; Niu, X., A review of dynamic network models with latent variables, Stat. Surv., 12, 105-135 (2018) · Zbl 1401.62103
[24] Knecht, A. B., Friendship Selection and Friends’ Influence. Dynamics of Networks and Actor Attributes in Early Adolescence (2008), Utrecht University, (Dissertation)
[25] Kossinets, G.; Watts, D. J., Empirical analysis of an evolving social network, Science, 311, 5757, 88-90 (2006) · Zbl 1226.91055
[26] Krivitsky, P. N.; Handcock, M. S., A separable model for dynamic networks, J. R. Stat. Soc. Ser. B Stat. Methodol., 76, 1, 29-46 (2014) · Zbl 1411.90079
[27] P.N. Krivitsky, M.S. Handcock, : Fit, Simulate and Diagnose Models for Network Evolution Based on Exponential-Family Random Graph Models, The Statnet Project (http://www.statnet.org), R package version 3.4.0, (2016).
[28] Krivitsky, P. N.; Handcock, M. S.; Morris, M., Adjusting for network size and composition effects in exponential-family random graph models, Stat. Methodol., 8, 4, 319-339 (2011) · Zbl 1215.91069
[29] Kruskal, J. B., Three-way arrays: rank and uniqueness of trilinear decompositions, with application to arithmetic complexity and statistics, Linear Algebra Appl., 18, 2, 95-138 (1977) · Zbl 0364.15021
[30] Lee, K. H.; Xue, L., Nonparametric finite mixture of Gaussian graphical models, Technometrics, 60, 4, 511-521 (2018)
[31] Matias, C.; Miele, V., Statistical clustering of temporal networks through a dynamic stochastic block model, J. R. Stat. Soc. Ser. B Stat. Methodol., 79, 4, 1119-1141 (2017) · Zbl 1373.62312
[32] Morris, M.; Kretzschmar, M., Concurrent partnerships and transmission dynamics in networks, Social Networks, 17, 3-4, 299-318 (1995)
[33] Newman, M. E., Coauthorship networks and patterns of scientific collaboration, Proc. Natl. Acad. Sci., 101, suppl 1, 5200-5205 (2004)
[34] Newman, M. E.; Girvan, M., Finding and evaluating community structure in networks, Phys. Rev. E, 69, 2, 026113 (2004)
[35] Nowicki, K.; Snijders, T. A., Estimation and prediction for stochastic blockstructures, J. Amer. Statist. Assoc., 96, 455, 1077-1087 (2001) · Zbl 1072.62542
[36] . R Core Team, R Core Team R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, Vienna, Austria. https://www.R-project.org/, (2016).
[37] Saldana, D. F.; Yu, Y.; Feng, Y., How many communities are there?, J. Comput. Graph. Statist., 26, 1, 171-181 (2017)
[38] Snijders, T. A.; Van de Bunt, G. G.; Steglich, C. E., Introduction to stochastic actor-based models for network dynamics, Social Networks, 32, 1, 44-60 (2010)
[39] Snijders, T. A.; Nowicki, K., Estimation and prediction for stochastic blockmodels for graphs with latent block structure, J. Classification, 14, 1, 75-100 (1997) · Zbl 0896.62063
[40] Taylor, I. W.; Linding, R.; Warde-Farley, D.; Liu, Y.; Pesquita, C.; Faria, D.; Bull, S.; Pawson, T.; Morris, Q.; Wrana, J. L., Dynamic modularity in protein interaction networks predicts breast cancer outcome, Nature Biotechnol., 27, 2, 199-204 (2009)
[41] Varin, C.; Vidoni, P., A note on composite likelihood inference and model selection, Biometrika, 92, 3, 519-528 (2005) · Zbl 1183.62037
[42] Vu, D. Q.; Hunter, D. R.; Schweinberger, M., Model-based clustering of large networks, Ann. Appl. Stat., 7, 2, 1010-1039 (2013) · Zbl 1288.62106
[43] Wainwright, M. J.; Jordan, M. I., Graphical models, exponential families, and variational inference, Found. Trends Mach. Learn., 1, 1-2, 1-305 (2008) · Zbl 1193.62107
[44] Ward, M. D.; Hoff, P. D., Persistent patterns of international commerce, J. Peace Res., 44, 2, 157-175 (2007)
[45] Westveld, A. H.; Hoff, P. D., A mixed effects model for longitudinal relational and network data, with applications to international trade and conflict, Ann. Appl. Stat., 5, 2A, 843-872 (2011) · Zbl 1454.62509
[46] Xing, E. P.; Fu, W.; Song, L., A state-space mixed membership blockmodel for dynamic network tomography, Ann. Appl. Stat., 4, 2, 535-566 (2010) · Zbl 1194.62133
[47] Xu, K. S.; Hero, A. O., Dynamic stochastic blockmodels for time-evolving social networks, IEEE J. Sel. Top. Sign. Proces., 8, 4, 552-562 (2014)
[48] Xue, L.; Zou, H.; Cai, T., Nonconcave penalized composite conditional likelihood estimation of sparse ising models, Ann. Statist., 40, 3, 1403-1429 (2012) · Zbl 1284.62451
[49] Yang, T.; Chi, Y.; Zhu, S.; Gong, Y.; Jin, R., Detecting communities and their evolutions in dynamic social networks-a Bayesian approach, Mach. Learn., 82, 2, 157-189 (2011) · Zbl 1237.91189
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.