×

Learning sparse graphons and the generalized Kesten-Stigum threshold. (English) Zbl 1539.62161

Summary: The problem of learning graphons has attracted considerable attention across several scientific communities, with significant progress over the recent years in sparser regimes. Yet, the current techniques still require diverging degrees in order to succeed with efficient algorithms in the challenging cases where the local structure of the graph is homogeneous. This paper provides an efficient algorithm to learn graphons in the constant expected degree regime. The algorithm is shown to succeed in estimating the rank-\(k\) projection of a graphon in the \(L_2\) metric if the top \(k\) eigenvalues of the graphon satisfy a generalized Kesten-Stigum condition.

MSC:

62H22 Probabilistic graphical models

References:

[1] Abbe, E. (2017). Community detection and stochastic block models: Recent developments. J. Mach. Learn. Res. 18 Paper No. 177, 86. · Zbl 1403.62110
[2] ABBE, E., BOIX-ADSERÀ, E., RALLI, P. and SANDON, C. (2020). Graph powering and spectral robustness. SIAM J. Math. Data Sci. 2 132-157. · Zbl 1484.62117 · doi:10.1137/19M1257135
[3] ABBE, E., LI, S. and SLY, A. (2023). Supplement to “Learning sparse graphons and the generalized Kesten-Stigum threshold.” https://doi.org/10.1214/23-AOS2262SUPP
[4] ABBE, E. and SANDON, C. (2015). Recovering communities in the general stochastic block model without knowing the parameters. In Advances in Neural Information Processing Systems (C. Cortes, N. Lawrence, D. Lee, M. Sugiyama and R. Garnett, eds.) 28. Curran Associates, Red Hook, NY.
[5] Abbe, E. and Sandon, C. (2015). Community detection in general stochastic block models: Fundamental limits and efficient algorithms for recovery. In 2015 IEEE 56th Annual Symposium on Foundations of Computer Science—FOCS 2015 670-688. IEEE Computer Soc., Los Alamitos, CA. · doi:10.1109/FOCS.2015.47
[6] ABBE, E. and SANDON, C. (2018). Proof of the achievability conjectures for the general stochastic block model. Comm. Pure Appl. Math. 71 1334-1406. · Zbl 1394.62082 · doi:10.1002/cpa.21719
[7] AIROLDI, E. M., COSTA, T. B. and CHAN, S. H. (2013). Stochastic blockmodel approximation of a graphon: Theory and consistent estimation. In Advances in Neural Information Processing Systems (C. J. Burges, L. Bottou, M. Welling, Z. Ghahramani and K. Q. Weinberger, eds.) 26. Curran Associates, Red Hook, NY.
[8] Amini, A. A., Chen, A., Bickel, P. J. and Levina, E. (2013). Pseudo-likelihood methods for community detection in large sparse networks. Ann. Statist. 41 2097-2122. · Zbl 1277.62166 · doi:10.1214/13-AOS1138
[9] Anandkumar, A., Ge, R., Hsu, D. and Kakade, S. M. (2014). A tensor approach to learning mixed membership community models. J. Mach. Learn. Res. 15 2239-2312. · Zbl 1318.68136
[10] Bickel, P. J. and Chen, A. (2009). A nonparametric view of network models and Newman-Girvan and other modularities. Proc. Natl. Acad. Sci. USA 106 21068-21073. · Zbl 1359.62411
[11] Bickel, P. J., Chen, A. and Levina, E. (2011). The method of moments and degree distributions for network models. Ann. Statist. 39 2280-2301. · Zbl 1232.91577 · doi:10.1214/11-AOS904
[12] Bollobás, B., Janson, S. and Riordan, O. (2007). The phase transition in inhomogeneous random graphs. Random Structures Algorithms 31 3-122. · Zbl 1123.05083 · doi:10.1002/rsa.20168
[13] BOPPANA, R. B. (1987). Eigenvalues and graph bisection: An average-case analysis. In 28th Annual Symposium on Foundations of Computer Science (sfcs 1987) 280-285. · doi:10.1109/SFCS.1987.22
[14] BORDENAVE, C., LELARGE, M. and MASSOULIÉ, L. (2018). Nonbacktracking spectrum of random graphs: Community detection and nonregular Ramanujan graphs. Ann. Probab. 46 1-71. · Zbl 1386.05174 · doi:10.1214/16-AOP1142
[15] BORGS, C., CHAYES, J. T., COHN, H. and GANGULY, S. (2021). Consistent nonparametric estimation for heavy-tailed sparse graphs. Ann. Statist. 49 1904-1930. · Zbl 1486.62080 · doi:10.1214/20-aos1985
[16] Borgs, C., Chayes, J. T., Lovász, L., Sós, V. T. and Vesztergombi, K. (2008). Convergent sequences of dense graphs. I. Subgraph frequencies, metric properties and testing. Adv. Math. 219 1801-1851. · Zbl 1213.05161 · doi:10.1016/j.aim.2008.07.008
[17] Borgs, C., Chayes, J. T., Lovász, L., Sós, V. T. and Vesztergombi, K. (2012). Convergent sequences of dense graphs II. Multiway cuts and statistical physics. Ann. of Math. (2) 176 151-219. · Zbl 1247.05124 · doi:10.4007/annals.2012.176.1.2
[18] BORGS, C., CHAYES, J. T. and SMITH, A. (2015). Private graphon estimation for sparse graphs. In Proceedings of the 28th International Conference on Neural Information Processing Systems, Vol. 1. NIPS’15 1369-1377. MIT Press, Cambridge, MA.
[19] BUI, T. N., CHAUDHURI, S., LEIGHTON, F. T. and SIPSER, M. (1987). Graph bisection algorithms with good average case behavior. Combinatorica 7 171-191. · doi:10.1007/BF02579448
[20] Choi, D. S., Wolfe, P. J. and Airoldi, E. M. (2012). Stochastic blockmodels with a growing number of classes. Biometrika 99 273-284. · Zbl 1318.62207 · doi:10.1093/biomet/asr053
[21] Coja-Oghlan, A. (2010). Graph partitioning via adaptive spectral techniques. Combin. Probab. Comput. 19 227-284. · Zbl 1209.05178 · doi:10.1017/S0963548309990514
[22] DECELLE, A., KRZAKALA, F., MOORE, C. and ZDEBOROVÁ, L. (2011). Asymptotic analysis of the stochastic block model for modular networks and its algorithmic applications. Phys. Rev. E 84 066106.
[23] Dyer, M. E. and Frieze, A. M. (1989). The solution of some random NP-hard problems in polynomial expected time. J. Algorithms 10 451-489. · Zbl 0689.68049 · doi:10.1016/0196-6774(89)90001-1
[24] Gao, C., Lu, Y. and Zhou, H. H. (2015). Rate-optimal graphon estimation. Ann. Statist. 43 2624-2652. · Zbl 1332.60050 · doi:10.1214/15-AOS1354
[25] Gulikers, L., Lelarge, M. and Massoulié, L. (2017). Non-backtracking spectrum of degree-corrected stochastic block models. In 8th Innovations in Theoretical Computer Science Conference. LIPIcs. Leibniz Int. Proc. Inform. 67 Art. No. 44, 27. Schloss Dagstuhl. Leibniz-Zent. Inform., Wadern. · Zbl 1402.05192
[26] Gulikers, L., Lelarge, M. and Massoulié, L. (2018). An impossibility result for reconstruction in the degree-corrected stochastic block model. Ann. Appl. Probab. 28 3002-3027. · Zbl 1417.91409 · doi:10.1214/18-AAP1381
[27] HASHIMOTO, K. (1989). Zeta functions of finite graphs and representations of \(p\)-adic groups. In Automorphic Forms and Geometry of Arithmetic Varieties. Adv. Stud. Pure Math. 15 211-280. Academic Press, Boston, MA. · Zbl 0688.00008 · doi:10.2969/aspm/01510211
[28] Holland, P. W., Laskey, K. B. and Leinhardt, S. (1983). Stochastic blockmodels: First steps. Soc. Netw. 5 109-137. · doi:10.1016/0378-8733(83)90021-7
[29] Kallenberg, O. (1999). Multivariate sampling and the estimation problem for exchangeable arrays. J. Theoret. Probab. 12 859-883. · Zbl 0983.62055 · doi:10.1023/A:1021692202530
[30] KESTEN, H. and STIGUM, B. P. (1966). A limit theorem for multidimensional Galton-Watson processes. Ann. Math. Stat. 37 1211-1223. · Zbl 0203.17401 · doi:10.1214/aoms/1177699266
[31] Klopp, O., Tsybakov, A. B. and Verzelen, N. (2017). Oracle inequalities for network models and sparse graphon estimation. Ann. Statist. 45 316-354. · Zbl 1367.62090 · doi:10.1214/16-AOS1454
[32] Krivelevich, M. and Sudakov, B. (2003). The largest eigenvalue of sparse random graphs. Combin. Probab. Comput. 12 61-72. · Zbl 1012.05109 · doi:10.1017/S0963548302005424
[33] KRZAKALA, F., MOORE, C., MOSSEL, E., NEEMAN, J., SLY, A., ZDEBOROVÁ, L. and ZHANG, P. (2013). Spectral redemption in clustering sparse networks. Proc. Natl. Acad. Sci. USA 110 20935-20940. · Zbl 1359.62252 · doi:10.1073/pnas.1312486110
[34] Massoulié, L. (2014). Community detection thresholds and the weak Ramanujan property. In STOC’14—Proceedings of the 2014 ACM Symposium on Theory of Computing 694-703. ACM, New York. · Zbl 1315.68210
[35] Mossel, E., Neeman, J. and Sly, A. (2015). Reconstruction and estimation in the planted partition model. Probab. Theory Related Fields 162 431-461. · Zbl 1320.05113 · doi:10.1007/s00440-014-0576-6
[36] MOSSEL, E., NEEMAN, J. and SLY, A. (2016). Belief propagation, robust reconstruction and optimal recovery of block models. Ann. Appl. Probab. 26 2211-2256. · Zbl 1350.05154 · doi:10.1214/15-AAP1145
[37] MOSSEL, E., NEEMAN, J. and SLY, A. (2018). A proof of the block model threshold conjecture. Combinatorica 38 665-708. · Zbl 1424.05272 · doi:10.1007/s00493-016-3238-8
[38] MOSSEL, E. and PERES, Y. (2003). Information flow on trees. Ann. Appl. Probab. 13 817-844. · Zbl 1050.60082 · doi:10.1214/aoap/1060202828
[39] Rohe, K., Chatterjee, S. and Yu, B. (2011). Spectral clustering and the high-dimensional stochastic blockmodel. Ann. Statist. 39 1878-1915. · Zbl 1227.62042 · doi:10.1214/11-AOS887
[40] STEPHAN, L. and MASSOULIÉ, L. (2022). Non-backtracking spectra of weighted inhomogeneous random graphs. Math. Stat. Learn. 5 201-271. · Zbl 1511.05154 · doi:10.4171/msl/34
[41] TALENTI, G. (1987). Recovering a function from a finite number of moments. Inverse Probl. 3 501-517. · Zbl 0649.44007
[42] WOLFE, P. J. and OLHEDE, S. C. Nonparametric graphon estimation. Available at arXiv:1309.5936. · Zbl 07582648
[43] XU, J. (2018). Rates of convergence of spectral methods for graphon estimation. In Proceedings of the 35th International Conference on Machine Learning (J. Dy and A. Krause, eds.). Proceedings of Machine Learning Research 80 5433-5442. PMLR, Stockholm, Sweden.
[44] XU, J., MASSOULIÉ, L. and LELARGE, M. (2014). Edge label inference in generalized stochastic block models: From spectral theory to impossibility results. In Proceedings of the 27th Conference on Learning Theory (M. F. Balcan, V. Feldman and C. Szepesvári, eds.). Proceedings of Machine Learning Research 35 903-920. PMLR, Barcelona, Spain.
[45] Zhang, Y., Levina, E. and Zhu, J. (2017). Estimating network edge probabilities by neighbourhood smoothing. Biometrika 104 771-783. · Zbl 07072327 · doi:10.1093/biomet/asx042
[46] Zhao, Y., Levina, E. and Zhu, J. (2012). Consistency of community detection in networks under degree-corrected stochastic block models. Ann. Statist. 40 2266-2292 · Zbl 1257.62095 · doi:10.1214/12-AOS1036
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.