×

Testing for high-dimensional geometry in random graphs. (English) Zbl 1349.05315

Summary: We study the problem of detecting the presence of an underlying high-dimensional geometric structure in a random graph. Under the null hypothesis, the observed graph is a realization of an Erdős-Rényi random graph \(G(n, p)\). Under the alternative, the graph is generated from the \(G(n,p,d)\) model, where each vertex corresponds to a latent independent random vector uniformly distributed on the sphere \(\mathbb{S}^{d-1}\), and two vertices are connected if the corresponding latent vectors are close enough. In the dense regime (i.e., \(p\) is a constant), we propose a near-optimal and computationally efficient testing procedure based on a new quantity which we call signed triangles. The proof of the detection lower bound is based on a new bound on the total variation distance between a Wishart matrix and an appropriately normalized GOE matrix. In the sparse regime, we make a conjecture for the optimal detection boundary. We conclude the paper with some preliminary steps on the problem of estimating the dimension in \(G(n,p,d)\).

MSC:

05C80 Random graphs (graph-theoretic aspects)
05C10 Planar graphs; geometric and topological aspects of graph theory

References:

[1] E.Abbe, A.Bandeira, and G.Hall, Exact recovery in the stochastic block model, IEEE Trans Network Sci Eng, arXiv preprint arXiv:1405.3267 (2014), Available at http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=7298436&filter
[2] I.Abraham, S.Chechik, D.Kempe, and A.Slivkins, Low‐distortion inference of latent similarities from a multiplex social network organization, Proceedings of the Twenty‐Fourth Annual ACM‐SIAM Symposium on Discrete Algorithms, SIAM, New Orleans, Louisiana, USA, 2013, pp. 1853-1883.
[3] M.Abramowitz and I. A.Stegun, Handbook of mathematical functions, Dover, New York, NY, USA, 1964. · Zbl 0171.38503
[4] N.Alon, M.Krivelevich, and B.Sudakov, Finding a large hidden clique in a random graph, Proceedings of the Ninth Annual ACM‐SIAM Symposium on Discrete Algorithms, Society for Industrial and Applied Mathematics, San Francisco, California, USA, 1998, pp. 594-598. · Zbl 0930.68109
[5] G. W.Anderson, A.Guionnet, and O.Zeitouni, An introduction to random matrices, Number 118, Cambridge University Press, Cambridge, UK, 2010. · Zbl 1184.15023
[6] E.Arias‐Castro, Ś.Bubeck, and ǵ.Lugosi, Detecting positive correlations in a multivariate sample, Bernoulli21 (2015), 209-241. · Zbl 1359.62208
[7] E.Arias‐Castro and N.Verzelen, Community detection in dense random networks, Ann Stat42 (2014), 940-969. · Zbl 1305.62035
[8] Z.Bai and J. W.Silverstein, Spectral analysis of large dimensional random matrices, 2nd edition, Springer, New York, 2009.
[9] Q.Berthet and P.Rigollet, Complexity theoretic lower bounds for sparse principal component detection, Conference on Learning Theory, Princeton, 2013, pp. 1046-1066, Available at https://orfe.princeton.edu/conferences/colt2013/.
[10] P. J.Bickel, A.Chen, and E.Levina, The method of moments and degree distributions for network models, Ann Stat39 (2011), 2280-2301. · Zbl 1232.91577
[11] H.Breu and D. G.Kirkpatrick, Unit disk graph recognition is NP‐hard, Computat Geomet9 (1998), 3-24. · Zbl 0894.68099
[12] S.Bubeck, E.Mossel, and M. Z.Rácz, On the influence of the seed graph in the preferential attachment model, IEEE Trans Network Sci Eng2 (2015), 30-39.
[13] L.Devroye, A.György, G.Lugosi, and F.Udina, High‐dimensional random geometric graphs and their clique number, Electron J Probab16 (2011), 2481-2508. · Zbl 1244.05200
[14] N.ElKaroui, On the largest eigenvalue of Wishart matrices with identity covariance when n, p and \(p / n \to \infty \), arXiv preprint math (2003).
[15] N.ElKaroui, Tracy‐Widom limit for the largest eigenvalue of a large class of complex sample covariance matrices, Ann Probab35 (2007), 663-714. · Zbl 1117.60020
[16] R.Eldan, An efficiency upper bound for inverse covariance estimation, Israel J Math207 (2015), 1-9. · Zbl 1327.62365
[17] P.Erdős and A.Rényi, On the evolution of random graphs, Publ Math Inst Hungar Acad Sci5 (1960), 17-61. · Zbl 0103.16301
[18] A.Frieze and R.Kannan, A new approach to the planted clique problem, In R.Hariharan (ed.), M.Mukund (ed.), and V.Vinay (ed.), editors, IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science, Volume 2 of Leibniz International Proceedings in Informatics (LIPIcs), 2008, pp. 187-198. · Zbl 1248.68245
[19] R.Gupta, T.Roughgarden, and C.Seshadhri, Decompositions of triangle‐dense graphs, Proceedings, of the 5th Conference on Innovations in Theoretical Computer Science, ACM, Princeton, New Jersey, USA, 2014, pp. 471-482. · Zbl 1365.05245
[20] P. D.Hoff, A. E.Raftery, and M. S.Handcock, Latent space approaches to social network analysis, J Am Stat Assoc97 (2002), 1090-1098. · Zbl 1041.62098
[21] T.Jiang and D.Li, Approximation of rectangular beta‐laguerre ensembles and large deviations, J Theory Probab28 (2015), 804-847. · Zbl 1333.15032
[22] I. M.Johnstone, On the distribution of the largest eigenvalue in principal components analysis, Ann Stat29 (2001), 295-327. · Zbl 1016.62078
[23] E.Mossel, J.Neeman, and A.Sly, Reconstruction and estimation in the planted partition model, Probab Theory Relat Fields162 (2015), 431-461. · Zbl 1320.05113
[24] M.Penrose, Random geometric graphs, Volume 5 of Oxford studies in probability, Oxford University Press, Oxford, UK, 2003. · Zbl 1029.60007
[25] P.Sarkar, D.Chakrabarti, and A. W.Moore, Theoretical justification of popular link prediction heuristics, Conference on Learning Theory, Haifa, Israel, 2010, pp. 295-307.
[26] S.Sodin, Tail‐sensitive Gaussian asymptotics for marginals of concentrated measures in high dimension, In V. D.Milman (ed.), editor, Geometric aspects of functional analysis, Volume 1910 of lecture notes in mathematics, 2007, pp. 271-295. · Zbl 1142.60016
[27] N.Verzelen and E.Arias‐Castro, Community detection in sparse random networks, arXiv preprint arXiv:1308.2955 (2013).
[28] D. J.Watts and S. H.Strogatz, Collective dynamics of ‘small‐world’ networks, Nature393 (1998), 440-442. · Zbl 1368.05139
[29] J.Wishart, The generalised product moment distribution in samples from a normal multivariate population, Biometrika20A (1928), 32-52. · JFM 54.0565.02
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.