×

Combinatorial statistics and the sciences. (English) Zbl 1533.60013

Beliaev, Dmitry (ed.) et al., International congress of mathematicians 2022, ICM 2022, Helsinki, Finland, virtual, July 6–14, 2022. Volume 6. Sections 12–14. Berlin: European Mathematical Society (EMS). 4170-4188 (2023).
Summary: Combinatorial statistics studies inference in discrete stochastic models. Inference of such models plays an important role in the sciences. We survey research in combinatorial statistics involving the tree broadcast process. We review the mathematical questions that arise in the analysis of this process and its inference via “belief propagation.” We discuss the mathematical connections to statistical physics, the social sciences, biological sciences, and theoretical computer science.
For the entire collection see [Zbl 1532.00040].

MSC:

60C05 Combinatorial probability
60-02 Research exposition (monographs, survey articles) pertaining to probability theory
62-01 Introductory exposition (textbooks, tutorial papers, etc.) pertaining to statistics
68-02 Research exposition (monographs, survey articles) pertaining to computer science

References:

[1] E. Abbe and E. Boix-Adserà, An information-percolation bound for spin synchro-nization on general graphs. Ann. Appl. Probab. 30 (2020), no. 3, 1066-1090. · Zbl 1461.62025
[2] E. Abbe and C. Sandon, Community detection in general stochastic block models: fundamental limits and efficient algorithms for recovery. In 2015 IEEE 56th annual symposium on foundations of computer science, pp. 670-688, IEEE, 2015.
[3] E. Abbe and C. Sandon, Detection in the stochastic block model with multiple clusters: proof of the achievability conjectures, acyclic bp, and the information-computation gap. 2015, arXiv:1512.09080.
[4] E. Abbe and C. Sandon, Crossing the KS threshold in the stochastic block model with information theory. In 2016 IEEE international symposium on information theory (ISIT), pp. 840-844, IEEE, 2016.
[5] E. Abbe and C. Sandon, Proof of the achievability conjectures for the general stochastic block model. Comm. Pure Appl. Math. 71 (2018), no. 7, 1334-1406. · Zbl 1394.62082
[6] N. Alon, I. Benjamini, E. Lubetzky, and S. Sodin, Non-backtracking random walks mix faster. Commun. Contemp. Math. 9 (2007), no. 04, 585-603. · Zbl 1140.60301
[7] K. B. Athreya and P. E. Ney, Branching processes. Springer, New York, 1972. · Zbl 0259.60002
[8] J. Banks, C. Moore, J. Neeman, and P. Netrapalli, Information-theoretic thresh-olds for community detection in sparse networks. In Conference on learning theory, pp. 383-416, PMLR, 2016.
[9] P. Bickel and A. Chen, A nonparametric view of network models and Newman-Girvan and other modularities. Proc. Natl. Acad. Sci. 106 (2009), no. 50, 21068-21073. · Zbl 1359.62411
[10] P. M. Bleher, J. Ruiz, and V. A. Zagrebnov, On the purity of the limiting Gibbs state for the Ising model on the Bethe lattice. J. Stat. Phys. 79 (1995), no. 1-2, 473-482. · Zbl 1081.82515
[11] B. Bollobás, S. Janson, and O. Riordan, The phase transition in inhomogeneous random graphs. Random Structures Algorithms 31 (2007), no. 1, 3-122. · Zbl 1123.05083
[12] C. Bordenave, M. Lelarge, and L. Massoulié, Non-backtracking spectrum of random graphs: community detection and non-regular Ramanujan graphs. In Foundations of computer science (FOCS), 2015 IEEE 56th annual symposium on, pp. 1347-1357, IEEE, 2015.
[13] C. Borgs, J. Chayes, E. Mossel, and S. Roch, The Kesten-Stigum reconstruction bound is tight for roughly symmetric binary channels. In Proceedings of IEEE FOCS 2006, pp. 518-530, IEEE, 2006.
[14] J. M. Carlson, J. T. Chayes, L. Chayes, J. P. Sethna, and D. J. Thouless, Bethe lattice spin glass: the effects of a ferromagnetic bias and external fields. I. Bifur-cation analysis. J. Stat. Phys. 61 (1990), no. 5-6, 987-1067.
[15] J. M. Carlson, J. T. Chayes, J. P. Sethna, and D. J. Thouless, Bethe lattice spin glass: the effects of a ferromagnetic bias and external fields. II. Magnetized spin-glass phase and the de Almeida-Thouless line. J. Stat. Phys. 61 (1990), no. 5-6, 1069-1084.
[16] J. A. Cavender, Taxonomy with confidence. Math. Biosci. 40 (1978), no. 3-4, 271-280. · Zbl 0391.92015
[17] J. Chang, Full reconstruction of Markov models on evolutionary trees: identifia-bility and consistency. Math. Biosci. 137 (1996), 51-73 · Zbl 1059.92504
[18] J. T. Chayes, L. Chayes, J. P. Sethna, and D. J. Thouless, A mean field spin glass with short-range interactions. Comm. Math. Phys. 106 (1986), no. 1, 41-89.
[19] A. Coja-Oghlan, Graph partitioning via adaptive spectral techniques. Combin. Probab. Comput. 19 (2010), no. 2, 227-284. · Zbl 1209.05178
[20] A. Coja-Oghlan, F. Krzakala, W. Perkins, and L. Zdeborová, Information-theoretic thresholds from the cavity method. Adv. Math. 333 (2018), 694-795. · Zbl 1397.82013
[21] A. Coja-Oghlan and K. Panagiotou, The asymptotic k-SAT threshold. Adv. Math. 288 (2016), 985-1068. · Zbl 1394.60007
[22] A. Coja-Oglan and K. Panagiotou, Catching the k-NAESAT threshold. In Pro-ceedings of the forty-fourth annual ACM symposium on theory of computing, pp. 899-908, ACM, 2012. · Zbl 1286.68185
[23] A. Condon and R. Karp, Algorithms for graph partitioning on the planted parti-tion model. Random Structures Algorithms 18 (2001), no. 2, 116-140. · Zbl 0972.68129
[24] C. Darwin, On the origin of species (1859).
[25] C. Daskalakis, E. Mossel, and S. Roch, Evolutionary trees and the Ising model on the Bethe lattice: a proof of Steel’s conjecture. PTRF 149 (2011), no. 1-2, 149-189. · Zbl 1221.92063
[26] A. Decelle, F. Krzakala, C. Moore, and L. Zdeborová, Asymptotic analysis of the stochastic block model for modular networks and its algorithmic applications. Phys. Rev. E 84 (2011), 066106.
[27] J. Ding, A. Sly, and N. Sun, Proof of the satisfiability conjecture for large k. In Proceedings of the forty-seventh annual ACM symposium on theory of computing, pp. 59-68, ACM, 2015. · Zbl 1321.68304
[28] J. Ding, A. Sly, and N. Sun, Maximum independent sets on random regular graphs. Acta Math. 217 (2016), no. 2, 263-340. · Zbl 1371.05261
[29] J. Ding, A. Sly, and N. Sun, Satisfiability threshold for random regular NAE-SAT. Comm. Math. Phys. 341 (2016), no. 2, 435-489. · Zbl 1335.68243
[30] M. Dyer and A. Frieze, The solution of some random NP-hard problems in poly-nomial expected time. J. Algorithms 10 (1989), no. 4, 451-489. · Zbl 0689.68049
[31] P. L. Erdős, M. A. Steel, L. A. Székely, and T. A. Warnow, A few logs suffice to build (almost) all trees (part 1). Random Structures Algorithms 14 (1999), no. 2, 153-184. · Zbl 0945.60004
[32] P. L. Erdős, M. A. Steel, L. A. Székely, and T. A. Warnow, A few logs suffice to build (almost) all trees (part 2). Theoret. Comput. Sci. 221 (1999), 77-118. · Zbl 0933.68100
[33] P. Erdős and A. Rényi, On the evolution of random graphs. Publ. Math. Inst. Hung. Acad. Sci. 5 (1960), no. 1, 17-60. · Zbl 0103.16301
[34] W. S. Evans, C. Kenyon, Y. Y. Peres, and L. J. Schulman, Broadcasting on trees and the Ising model. Ann. Appl. Probab. 10 (2000), no. 2, 410-433. · Zbl 1052.60076
[35] W. S. Evans and L. J. Schulman, Signal propagation and noisy circuits. IEEE Trans. Inf. Theory 45 (1999), no. 7, 2367-2373. · Zbl 0960.94002
[36] J. S. Farris, A probability model for inferring evolutionary trees. Syst. Zool. 22 (1973), no. 4, 250-256.
[37] J. Felsenstein, Maximum-likelihood estimation of evolutionary trees from contin-uous characters. Am. J. Hum. Genet. 25 (1973), no. 5, 471.
[38] J. Felsenstein, Inferring phylogenies. Sinauer, New York, NY, 2004.
[39] J. Friedman, A proof of Alon’s second eigenvalue conjecture and related problems. Amer. Math. Soc., 2008. · Zbl 1177.05070
[40] A. Gershchenfeld and A. Montanari, Reconstruction for models on random graphs. In Foundations of computer science, annual IEEE symposium on, pp. 194-204, IEEE Computer Society, 2007.
[41] G. Grimmett, The random-cluster model. In Probability on discrete structures, pp. 73-123, Encyclopaedia Math. Sci. 110, Springer, Berlin, 2004. · Zbl 1045.60105
[42] K-i. Hashimoto, Zeta functions of finite graphs and representations of p-adic groups. In Automorphic forms and geometry of arithmetic varieties, pp. 211-280, Elsevier, 1989. · Zbl 0709.22005
[43] Y. Higuchi, Remarks on the limiting Gibbs states on a .d C 1/-tree. Publ. Res. Inst. Math. Sci. 13 (1977), no. 2, 335-348. · Zbl 0376.60096
[44] J. Hilden, GEN EX-An algebraic approach to pedigree probability calculus. Clinical Genetics 1 (1970), no. 5-6, 319-348.
[45] P. Holland, K. Laskey, and S. Leinhardt, Stochastic blockmodels: first steps. Soc. Netw. 5 (1983), no. 2, 109-137.
[46] D. Ioffe, Extremality of the disordered state for the Ising model on general trees. In Trees (Versailles, 1995), pp. 3-14, Progr. Probab. 40, Birkhäuser, Basel, 1996. · Zbl 0865.60095
[47] D. Ioffe, On the extremality of the disordered state for the Ising model on the Bethe lattice. Lett. Math. Phys. 37 (1996), no. 2, 137-143. · Zbl 0848.60090
[48] V. Jain, F. Koehler, J. Liu, and E. Mossel, Accuracy-memory tradeoffs and phase transitions in belief propagation. In Proceedings of the thirty-second conference on learning theory, edited by A. Beygelzimer and D. Hsu, pp. 1756-1771, Proc. Mach. Learn. Res. 99, PMLR, Phoenix, USA, 2019.
[49] S. Janson and E. Mossel, Robust reconstruction on trees is determined by the second eigenvalue. Ann. Probab. 32 (2004), 2630-2649. · Zbl 1061.60105
[50] M. Jerrum and G. Sorkin, The Metropolis algorithm for graph bisection. Discrete Appl. Math. 82 (1998), no. 1-3, 155-175. · Zbl 0932.60079
[51] H. Kesten and B. P. Stigum, Additional limit theorems for indecomposable multi-dimensional Galton-Watson processes. Ann. Math. Stat. 37 (1966), 1463-1481. · Zbl 0203.17402
[52] H. Kesten and B. P. Stigum, Limit theorems for decomposable multi-dimensional Galton-Watson processes. J. Math. Anal. Appl. 17 (1967), 309-338. · Zbl 0203.17501
[53] F. Koehler and E. Mossel, Reconstruction on trees and low-degree polynomials, 2021. arXiv:2109.06915.
[54] F. Kr M zakała, A. Montanari, F. Ricci-Tersenghi, G. Semerjian, and L. Zdeborová, Gibbs states and the set of solutions of random constraint satisfaction problems. Proc. Natl. Acad. Sci. 104 (2007), no. 25, 10318-10323. · Zbl 1190.68031
[55] F. Krzakala, C. Moore, E. Mossel, J. Neeman, A. Sly Z. L, and P. Zhang, Spectral redemption: clustering sparse networks. Proc. Natl. Acad. Sci. 100 (2013), no. 52, 20935-20940. · Zbl 1359.62252
[56] R. Lyons, The Ising model and percolation on trees and tree-like graphs. Comm. Math. Phys. 125 (1989), no. 2, 337-353. · Zbl 0679.60101
[57] A. Makur, E. Mossel, and Y. Polyanskiy, Broadcasting on random directed acyclic graphs. IEEE Inf. Theory 66 (2020), no. 2, 780-812. · Zbl 1434.94003
[58] E. Maneva, E. Mossel, and M. J. Wainwright, A new look at survey propagation and its generalizations. J. ACM 54 (2007), 41. · Zbl 1312.68175
[59] F. McSherry, Spectral partitioning of random graphs. In Foundations of computer science, 2001. Proceedings 42nd IEEE symposium on, pp. 529-537, IEEE, 2001.
[60] M. Mézard and A. Montanari, Reconstruction on trees and the spin glass transi-tion. J. Stat. Phys. 124 (2006), 1317-1350. · Zbl 1113.82013
[61] M. Mézard and A. Montanari, Information, physics, and computation. Oxford University Press, USA, 2009. · Zbl 1163.94001
[62] M. Mézard, G. Parisi, and R. Zecchina, Analytic and algorithmic solution of random satisfiability problems. Science 297 (2002), no. 5582, 812-815.
[63] M. Mezard and R. Zecchina, Random k-satisfiability: from an analytic solution to an efficient algorithm. Phys. Rev. E 66 (2002).
[64] A. Moitra, E. Mossel, and C. Sandon, Parallels between phase transitions and cir-cuit complexity? In Conference on learning theory, pp. 2910-2946, PMLR, 2020.
[65] A. Montanari, R. Restrepo, and P. Tetali, Reconstruction and clustering in random constraint satisfaction problems. SIAM J. Discrete Math. 25 (2011), no. 2, 771-808. · Zbl 1223.68077
[66] E. Mossel, Recursive reconstruction on periodic trees. Random Structures Algo-rithms 13 (1998), no. 1, 81-97. · Zbl 0959.05112
[67] E. Mossel, Reconstruction on trees: beating the second eigenvalue. Ann. Appl. Probab. 11 (2001), no. 1, 285-300. · Zbl 1021.90008
[68] E. Mossel, Survey: Information flow on trees. In Graphs, morphisms and statis-tical physics. DIMACS series in discrete mathematics and theoretical computer science, edited by J. Nestril and P. Winkler, pp. 155-170, 2004. · Zbl 1066.94006
[69] E. Mossel, Deep learning and hierarchal generative models. 2019, arXiv:1612.09057.
[70] E. Mossel, J. Neeman, and A. Sly, Reconstruction and estimation in the planted partition model. Probab. Theory Related Fields 3-4 (2015), 431-461. · Zbl 1320.05113
[71] E. Mossel, J. Neeman, and A. Sly, Belief propagation, robust reconstruction, and optimal recovery of block models. Ann. Appl. Probab. 26 (2016), no. 4, 2211-2256. · Zbl 1350.05154
[72] E. Mossel, J. Neeman, and A. Sly, A proof of the block model threshold conjec-ture. Combinatorica 38 (2018), no. 3, 665-708. · Zbl 1424.05272
[73] E. Mossel and Y. Peres, Information flow on trees. Ann. Appl. Probab. 13 (2003), no. 3, 817-844. · Zbl 1050.60082
[74] E. Mossel and S. Roch, Learning nonsingular phylogenies and hidden Markov models. In Proceedings of the thirty-seventh annual ACM symposium on theory of computing, Baltimore (STOC’05), MD, USA, pp. 366-376, ACM, 2005. · Zbl 1192.68394
[75] E. Mossel, S. Roch, and A. Sly, On the inference of large phylogenies with long branches: How long is too long? Bull. Math. Biol. 73 (2011), no. 7, 1627-1644. · Zbl 1402.92319
[76] E. Mossel and M. Steel, A phase transition for a random cluster model on phylo-genetic trees. Math. Biosci. 187 (2004), no. 2, 189-203. · Zbl 1047.92032
[77] J. Neyman, Molecular studies of evolution: a source of novel statistical prob-lems. In Statistical decision theory and related topics, edited by S. S. Gupta and J. Yackel, pp. 1-27, Elsevier, 1971. · Zbl 0231.62010
[78] M. M. G. Parisi, A replica analysis of the travelling salesman problem. J. Phys. 47 (1986), 1285-1296.
[79] M. M. G. Parisi, On the solution of the random link matching problem. J. Phys. 48 (1987), 1451-1459.
[80] J. Pearl, Reverend Bayes on inference engines: A distributed hierarchical approach. Cognitive Systems Laboratory, School of Engineering and Applied Science, 1982.
[81] R. Pemantle and Y. Peres, The critical Ising model on trees, concave recursions and nonlinear capacity. Ann. Probab. 38 (2010), no. 1, 184-206. · Zbl 1197.60092
[82] Y. Polyanskiy and Y. Wu, Application of the information-percolation method to reconstruction problems on graphs. Math. Stat. Learn. 2 (2020), no. 1, 1-24. · Zbl 1439.05205
[83] C. J. Preston, Gibbs states on countable sets: Gibbs states and Markov random fields. Cambridge University Press, 1974. · Zbl 0297.60054
[84] S. Roch and A. Sly, Phase transition in the sample complexity of likelihood-based phylogeny inference. Probab. Theory Related Fields 169 (2017), no. 1, 3-62. · Zbl 1379.92041
[85] K. Rohe, S. Chatterjee, and B. Yu, Spectral clustering and the high-dimensional stochastic blockmodel. Ann. Statist. 39 (2011), no. 4, 1878-1915. · Zbl 1227.62042
[86] C. Semple and M. Steel, Phylogenetics. Math. Appl. Ser. 22, Oxford University Press, 2003. · Zbl 1043.92026
[87] A. Sly, Reconstruction of random colourings. Comm. Math. Phys. 288 (2009). · Zbl 1274.60163
[88] A. Sly, Reconstruction for the Potts model. Ann. Probab. 39 (2011), no. 4, 1365-1406. · Zbl 1234.60095
[89] A. Sly and Y. Zhang, Reconstruction of colourings without freezing. 2016, arXiv:1610.02770.
[90] T. Snijders and K. Nowicki, Estimation and prediction for stochastic blockmodels for graphs with latent block structure. J. Classification 14 (1997), no. 1, 75-100. · Zbl 0896.62063
[91] S. Sodin, Random matrices, nonbacktracking walks, and orthogonal polynomials. J. Math. Phys. 48 (2007), 123503. · Zbl 1153.81436
[92] F. Spitzer, Markov random fields on an infinite tree. Ann. Probab. 3 (1975), no. 3, 387-398. · Zbl 0313.60072
[93] M. Steel, Phylogeny: Discrete and random processes in evolution. SIAM, 2016. · Zbl 1361.92001
[94] M. Steel, My Favourite Conjecture, 2001, http://www.math.canterbury.ac.nz/  mathmas/conjecture.pdf.
[95] J. von Neumann, Probabilistic logics and the synthesis of reliable organisms from unreliable components. In Automata studies, pp. 43-98, Ann. of Math. Stud. 34, Princeton University Press, Princeton, NJ, 1956. Elchanan Mossel
[96] Massachusetts Avenue, Cambridge, MA 02139-4307, USA, elmos@mit.edu
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.