×

Empirical Bayes estimation for the stochastic blockmodel. (English) Zbl 1333.62088

Summary: Inference for the stochastic blockmodel is currently of burgeoning interest in the statistical community, as well as in various application domains as diverse as social networks, citation networks, brain connectivity networks (connectomics), etc. Recent theoretical developments have shown that spectral embedding of graphs yields tractable distributional results; in particular, a random dot product latent position graph formulation of the stochastic blockmodel informs a mixture of normal distributions for the adjacency spectral embedding. We employ this new theory to provide an empirical Bayes methodology for estimation of block memberships of vertices in a random graph drawn from the stochastic blockmodel, and demonstrate its practical utility. The posterior inference is conducted using a Metropolis-within-Gibbs algorithm. The theory and methods are illustrated through Monte Carlo simulation studies, both within the stochastic blockmodel and beyond, and experimental results on a Wikipedia graph are presented.

MSC:

62F15 Bayesian inference
62H30 Classification and discrimination; cluster analysis (statistical aspects)

References:

[1] Airoldi, E., D. Blei, S. Fienberg, and E. Xing (2008). Mixed membership stochastic blockmodels., Journal of Machine Learning Research. 9 , 1981-2014. · Zbl 1225.68143
[2] Airoldi, E. M., T. B. Costa, and S. H. Chan (2013). Stochastic blockmodel approximation of a graphon: Theory and consistent estimation. In C. Burges, L. Bottou, M. Welling, Z. Ghahramani, and K. Weinberger (Eds.), Advances in Neural Information Processing Systems 26 , pp. 692-700.
[3] Athreya, A., C. Priebe, M. Tang, V. Lyzinski, D. Marchette, and D. Sussman (2015). A limit theorem for scaled eigenvectors of random dot product graphs., Sankhya A . · Zbl 1338.62061 · doi:10.1007/s13171-015-0071-x
[4] Bickel, P. and A. Chen (2009). A nonparametric view of network models and Newman-Girvan and other modularities., Proceedings of the National Academy of Sciences 106 (50), 21068-21073. · Zbl 1359.62411 · doi:10.1073/pnas.0907096106
[5] Bickel, P., D. Choi, X. Chang, and H. Zhang (2013). Asymptotic normality of maximum likelihood and its variational approximation for stochastic blockmodels., The Annals of Statistics 41 (4), 1922-1943. · Zbl 1292.62042 · doi:10.1214/13-AOS1124
[6] Celisse, A., J.-J. Daudin, and L. Pierre (2012). Consistency of maximum-likelihood and variational estimators in the stochastic block model., Electronic Journal of Statistics 6 , 1847-1899. · Zbl 1295.62028 · doi:10.1214/12-EJS729
[7] Choi, D. S., P. J. Wolfe, and E. M. Airoldi (2012). Stochastic blockmodels with a growing number of classes., Biometrika 99 (2), 273-284. · Zbl 1318.62207 · doi:10.1093/biomet/asr053
[8] Fienberg, S. E. (2010). Introduction to papers on the modeling and analysis of network data., Ann. Appl. Statist 4 , 1-4. · doi:10.1214/10-AOAS346
[9] Fienberg, S. E. (2012). A brief history of statistical models for network analysis and open challenges., Journal of Computational and Graphical Statistics 21 (4), 825-839. · doi:10.1080/10618600.2012.738106
[10] Fishkind, D. E., D. L. Sussman, M. Tang, J. T. Vogelstein, and C. E. Priebe (2013). Consistent adjacency-spectral partitioning for the stochastic block model when the model parameters are unknown., SIAM Journal on Matrix Analysis and Applications 34 (1), 23-39. · Zbl 1314.05186 · doi:10.1137/120875600
[11] Fraley, C. and A. E. Raftery (2002). Model-based clustering, discriminant analysis, and density estimation., Journal of the American Statistical Association 97 , 611-631. · Zbl 1073.62545 · doi:10.1198/016214502760047131
[12] Goldenberg, A., A. X. Zheng, S. E. Fienberg, and E. M. Airoldi (2010). A survey of statistical network models., Foundations and Trends in Machine Learning 2 (2), 129-233. · Zbl 1184.68030 · doi:10.1561/2200000005
[13] Handcock, M., A. Raftery, and J. Tantrum (2007). Model-based clustering for social networks., Journal of the Royal Statistical Society. Series A (Statistics in Society) 170 (2), pp. 301-354. · doi:10.1111/j.1467-985X.2007.00471.x
[14] Hoff, P., A. Raftery, and M. Handcock (2002). Latent space approaches to social network analysis., Journal of the american Statistical association 97 (460), 1090-1098. · Zbl 1041.62098 · doi:10.1198/016214502388618906
[15] Holland, P. W., K. B. Laskey, and S. Leinhardt (1983). Stochastic blockmodels: First steps., Social Networks 5 (2), 109-137. · doi:10.1016/0378-8733(83)90021-7
[16] Lyzinski, V., D. Sussman, M. Tang, A. Athreya, and C. Priebe (2014). Perfect clustering for stochastic blockmodel graphs via adjacency spectral embedding., Electronic Journal of Statistics . · Zbl 1308.62131 · doi:10.1214/14-EJS978
[17] Newman, M. E. (2006). Modularity and community structure in networks., Proceedings of the National Academy of Sciences 103 (23), 8577-8582.
[18] Nickel, C. (2006)., Random dot product graphs: A model for social networks . Ph. D. thesis, Johns Hopkins University.
[19] Nowicki, K. and T. Snijders (2001). Estimation and prediction for stochastic blockstructures., Journal of the American Statistical Association 96 (455), pp. 1077-1087. · Zbl 1072.62542 · doi:10.1198/016214501753208735
[20] Olhede, S. C. and P. J. Wolfe (2013). Network histograms and universality of blockmodel approximation., arXiv
[21] Rohe, K., S. Chatterjee, and B. Yu (2011). Spectral clustering and the high-dimensional stochastic blockmodel., The Annals of Statistics 39 (4), 1878-1915. · Zbl 1227.62042 · doi:10.1214/11-AOS887
[22] Snijders, T. and K. Nowicki (1997). Estimation and prediction for stochastic blockmodels for graphs with latent block structure., Journal of Classification 14 (1), 75-100. · Zbl 0896.62063 · doi:10.1007/s003579900004
[23] Sussman, D. L. (2014)., Foundations of Adjacency Spectral Embedding . Ph. D. thesis, Johns Hopkins University.
[24] Sussman, D. L., M. Tang, D. E. Fishkind, and C. E. Priebe (2012). A consistent adjacency spectral embedding for stochastic blockmodel graphs., Journal of the American Statistical Association 107 (499), 1119-1128. · Zbl 1443.62188 · doi:10.1080/01621459.2012.699795
[25] Sussman, D. L., M. Tang, and C. E. Priebe (2014). Consistent latent position estimation and vertex classification for random dot product graphs., Pattern Analysis and Machine Intelligence, IEEE Transactions on 36 (1), 48-57.
[26] Tang, M., D. L. Sussman, and C. E. Priebe (2013). Universally consistent vertex classification for latent positions graphs., The Annals of Statistics 41 (3), 1406-1430. · Zbl 1273.62147 · doi:10.1214/13-AOS1112
[27] Young, S. and E. Scheinerman (2007). Random dot product graph models for social networks., Algorithms and models for the web-graph , 138-149. · Zbl 1136.05322 · doi:10.1007/978-3-540-77004-6_11
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.