×

Network representation using graph root distributions. (English) Zbl 1468.62258

The author introduces a new parameterization of exchangeable random graphs (i.e., random graphs which are probabilistically invariant under a permutation of the vertices) satisfying some mild conditions on a related spectral decomposition. This parameterization is in terms of a family of probability distributions (called graph root distributions) on a separable Kreĭn space. Issues of identifiability are discussed. It is shown that closeness of two graph root distributions in a certain Wasserstein distance implies closeness of the corresponding graphons in cut distance. Statistical questions of estimation using graph root distributions are considered for both dense and sparse random graphs. The paper concludes with numerical examples to illustrate its results.

MSC:

62E10 Characterization and structure theory of statistical distributions
62G05 Nonparametric estimation
62M15 Inference from stochastic processes and spectral analysis
05C80 Random graphs (graph-theoretic aspects)

References:

[1] Adamic, L. A. and Glance, N. (2005). The political blogosphere and the 2004 US election: Divided they blog. In Proceedings of the 3rd International Workshop on Link Discovery 36-43. ACM.
[2] Airoldi, E. M., Blei, D. M., Fienberg, S. E. and Xing, E. P. (2008). Mixed membership stochastic blockmodels. J. Mach. Learn. Res. 9 1981-2014. · Zbl 1225.68143
[3] Airoldi, E. M., Costa, T. B. and Chan, S. H. (2013). Stochastic blockmodel approximation of a graphon: Theory and consistent estimation. In Advances in Neural Information Processing Systems 692-700.
[4] Aldous, D. J. (1981). Representations for partially exchangeable arrays of random variables. J. Multivariate Anal. 11 581-598. · Zbl 0474.60044 · doi:10.1016/0047-259X(81)90099-3
[5] Athreya, A., Fishkind, D. E., Tang, M., Priebe, C. E., Park, Y., Vogelstein, J. T., Levin, K., Lyzinski, V., Qin, Y. et al. (2017). Statistical inference on random dot product graphs: A survey. J. Mach. Learn. Res. 18 Paper No. 226, 92. · Zbl 1473.05279
[6] Bhattacharyya, S. and Chatterjee, S. (2018). Spectral clustering for multiple sparse networks: I. arXiv preprint arXiv:1805.10594.
[7] Bickel, P. J. and Chen, A. (2009). A nonparametric view of network models and Newman-Girvan and other modularities. Proc. Natl. Acad. Sci. USA 106 21068-21073. · Zbl 1359.62411
[8] Bickel, P. J., Chen, A. and Levina, E. (2011). The method of moments and degree distributions for network models. Ann. Statist. 39 2280-2301. · Zbl 1232.91577 · doi:10.1214/11-AOS904
[9] Bollobás, B., Janson, S. and Riordan, O. (2007). The phase transition in inhomogeneous random graphs. Random Structures Algorithms 31 3-122. · Zbl 1123.05083 · doi:10.1002/rsa.20168
[10] Caron, F. and Fox, E. B. (2017). Sparse graphs using exchangeable random measures. J. R. Stat. Soc. Ser. B. Stat. Methodol. 79 1295-1366. · Zbl 1381.62072 · doi:10.1111/rssb.12233
[11] Chan, S. and Airoldi, E. (2014). A consistent histogram estimator for exchangeable graph models. In International Conference on Machine Learning 208-216.
[12] Chatterjee, S. (2015). Matrix estimation by universal singular value thresholding. Ann. Statist. 43 177-214. · Zbl 1308.62038 · doi:10.1214/14-AOS1272
[13] Chen, K. and Lei, J. (2018). Network cross-validation for determining the number of communities in network data. J. Amer. Statist. Assoc. 113 241-251. · Zbl 1398.62159 · doi:10.1080/01621459.2016.1246365
[14] Chin, P., Rao, A. and Vu, V. (2015). Stochastic block model and community detection in sparse graphs: A spectral algorithm with optimal rate of recovery. In Conference on Learning Theory 391-423.
[15] Coja-Oghlan, A. (2010). Graph partitioning via adaptive spectral techniques. Combin. Probab. Comput. 19 227-284. · Zbl 1209.05178 · doi:10.1017/S0963548309990514
[16] Crane, H. and Dempsey, W. (2018). Edge exchangeable models for interaction networks. J. Amer. Statist. Assoc. 113 1311-1326. · Zbl 1402.90027 · doi:10.1080/01621459.2017.1341413
[17] Gao, C., Lu, Y. and Zhou, H. H. (2015). Rate-optimal graphon estimation. Ann. Statist. 43 2624-2652. · Zbl 1332.60050 · doi:10.1214/15-AOS1354
[18] Gao, C., Ma, Z., Zhang, A. Y. and Zhou, H. H. (2018). Community detection in degree-corrected block models. Ann. Statist. 46 2153-2185. · Zbl 1408.62116 · doi:10.1214/17-AOS1615
[19] Gohberg, I. C. and Kreĭn, M. G. (1988). Introduction to the Theory of Linear Nonselfadjoint Operators. 18. Amer. Math. Soc., Providence, R.I.
[20] Goldenberg, A., Zheng, A. X., Fienberg, S. E. and Airoldi, E. M. (2010). A survey of statistical network models. Found. Trends Mach. Learn. 2 129-233. · Zbl 1184.68030
[21] Hall, P. and Horowitz, J. L. (2007). Methodology and convergence rates for functional linear regression. Ann. Statist. 35 70-91. · Zbl 1114.62048 · doi:10.1214/009053606000000957
[22] Hoff, P. (2008). Modeling homophily and stochastic equivalence in symmetric relational data. In Advances in Neural Information Processing Systems 657-664.
[23] Holland, P. W., Laskey, K. B. and Leinhardt, S. (1983). Stochastic blockmodels: First steps. Soc. Netw. 5 109-137. · doi:10.1016/0378-8733(83)90021-7
[24] Hoover, D. N. (1982). Row-column exchangeability and a generalized model for probability. In Exchangeability in Probability and Statistics (Rome, 1981) 281-291. North-Holland, Amsterdam. · Zbl 0495.60040
[25] Jin, J. (2015). Fast community detection by SCORE. Ann. Statist. 43 57-89. · Zbl 1310.62076 · doi:10.1214/14-AOS1265
[26] Kallenberg, O. (1989). On the representation theorem for exchangeable arrays. J. Multivariate Anal. 30 137-154. · Zbl 0676.60046 · doi:10.1016/0047-259X(89)90092-4
[27] Karrer, B. and Newman, M. E. J. (2011). Stochastic blockmodels and community structure in networks. Phys. Rev. E (3) 83 016107, 10. · doi:10.1103/PhysRevE.83.016107
[28] Klopp, O., Tsybakov, A. B. and Verzelen, N. (2017). Oracle inequalities for network models and sparse graphon estimation. Ann. Statist. 45 316-354. · Zbl 1367.62090 · doi:10.1214/16-AOS1454
[29] Klopp, O. and Verzelen, N. (2019). Optimal graphon estimation in cut distance. Probab. Theory Related Fields 174 1033-1090. · Zbl 1420.62143 · doi:10.1007/s00440-018-0878-1
[30] Kolaczyk, E. D. (2009). Statistical Analysis of Network Data: Methods and Models. Springer Series in Statistics. Springer, New York. · Zbl 1277.62021 · doi:10.1007/978-0-387-88146-1
[31] Krebs, V. (2004). Social network analysis software & services for organizations, communities, and their consultants. http://www.orgnet.com/.
[32] Lax, P. D. (2002). Functional Analysis. Pure and Applied Mathematics (New York). Wiley Interscience, New York. · Zbl 1009.47001
[33] Lei, J. (2014). Adaptive global testing for functional linear models. J. Amer. Statist. Assoc. 109 624-634. · Zbl 1367.62214 · doi:10.1080/01621459.2013.856794
[34] Lei, J. (2016). A goodness-of-fit test for stochastic block models. Ann. Statist. 44 401-424. · Zbl 1331.62283 · doi:10.1214/15-AOS1370
[35] Lei, J. (2020). Convergence and concentration of empirical measures under Wasserstein distance in unbounded functional spaces. Bernoulli 26 767-798. · Zbl 1455.60009 · doi:10.3150/19-BEJ1151
[36] Lei, J. (2021). Supplement to “Network representation using graph root distributions.” https://doi.org/10.1214/20-AOS1976SUPP
[37] Lovász, L. (2012). Large Networks and Graph Limits. American Mathematical Society Colloquium Publications 60. Amer. Math. Soc., Providence, RI. · Zbl 1292.05001 · doi:10.1090/coll/060
[38] McSherry, F. (2001). Spectral partitioning of random graphs. In 42nd IEEE Symposium on Foundations of Computer Science (Las Vegas, NV, 2001) 529-537. IEEE Computer Soc., Los Alamitos, CA.
[39] Meister, A. (2011). Asymptotic equivalence of functional linear regression and a white noise inverse problem. Ann. Statist. 39 1471-1495. · Zbl 1221.62011 · doi:10.1214/10-AOS872
[40] Newman, M. E. J. (2010). Networks: An Introduction. Oxford Univ. Press, Oxford. · Zbl 1195.94003 · doi:10.1093/acprof:oso/9780199206650.001.0001
[41] Nickel, C. L. M. (2007). Random Dot Product Graphs: A Model for Social Networks. ProQuest LLC, Ann Arbor, MI. Thesis (Ph.D.)-The Johns Hopkins University.
[42] Ong, C. S., Mary, X., Canu, S. and Smola, A. J. (2004). Learning with non-positive kernels. In Proceedings of the Twenty-First International Conference on Machine Learning 81. ACM.
[43] Orbanz, P. and Roy, D. M. (2015). Bayesian models of graphs, arrays and other exchangeable random structures. IEEE Trans. Pattern Anal. Mach. Intell. 37 437-461.
[44] Penrose, M. (2003). Random Geometric Graphs. Oxford Studies in Probability 5. Oxford Univ. Press, Oxford. · Zbl 1029.60007 · doi:10.1093/acprof:oso/9780198506263.001.0001
[45] Rohe, K., Chatterjee, S. and Yu, B. (2011). Spectral clustering and the high-dimensional stochastic blockmodel. Ann. Statist. 39 1878-1915. · Zbl 1227.62042 · doi:10.1214/11-AOS887
[46] Rubin-Delanchy, P., Priebe, C. E. and Tang, M. (2017). The generalised random dot product graph. arXiv preprint arXiv:1709.05506.
[47] Sewell, D. K. and Chen, Y. (2015). Latent space models for dynamic networks. J. Amer. Statist. Assoc. 110 1646-1657. · Zbl 1373.62580 · doi:10.1080/01621459.2014.988214
[48] Sussman, D. L., Tang, M., Fishkind, D. E. and Priebe, C. E. (2012). A consistent adjacency spectral embedding for stochastic blockmodel graphs. J. Amer. Statist. Assoc. 107 1119-1128. · Zbl 1443.62188 · doi:10.1080/01621459.2012.699795
[49] Tang, M., Athreya, A., Sussman, D. L., Lyzinski, V. and Priebe, C. E. (2017). A nonparametric two-sample hypothesis testing problem for random graphs. Bernoulli 23 1599-1630. · Zbl 1450.62040 · doi:10.3150/15-BEJ789
[50] Wolfe, P. J. and Olhede, S. C. (2013). Nonparametric graphon estimation. arXiv preprint arXiv:1309.5936.
[51] Xu, J. (2018). Rates of convergence of spectral methods for graphon estimation. In Proceedings of the 35th International Conference on Machine Learning (J. Dy and A. Krause, eds.). Proceedings of Machine Learning Research 80 5433-5442. PMLR, Stockholmsmässan, Stockholm Sweden.
[52] Xu, J., Massoulié, L. and Lelarge, M. (2014). Edge label inference in generalized stochastic block models: From spectral theory to impossibility results. In Conference on Learning Theory 903-920.
[53] Zhang, Y., Levina, E. and Zhu, J. (2017). Estimating network edge probabilities by neighbourhood smoothing. Biometrika 104 771-783. · Zbl 07072327 · doi:10.1093/biomet/asx042
[54] Zhao, Y., Levina, E. and Zhu, J. (2012). Consistency of community detection in networks under degree-corrected stochastic block models. Ann. Statist. 40 2266-2292 · Zbl 1257.62095 · doi:10.1214/12-AOS1036
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.