-
Sharp bound for the Erdős-Straus non-averaging set problem
Authors:
Huy Tuan Pham,
Dmitrii Zakharov
Abstract:
A set of integers $A$ is non-averaging if there is no element $a$ in $A$ which can be written as an average of a subset of $A$ not containing $a$. We show that the largest non-averaging subset of $\{1, \ldots, n\}$ has size $n^{1/4+o(1)}$, thus solving the Erdős-Straus problem. We also determine the largest size of a non-averaging set in a $d$-dimensional box for any fixed $d$. Our main tool inclu…
▽ More
A set of integers $A$ is non-averaging if there is no element $a$ in $A$ which can be written as an average of a subset of $A$ not containing $a$. We show that the largest non-averaging subset of $\{1, \ldots, n\}$ has size $n^{1/4+o(1)}$, thus solving the Erdős-Straus problem. We also determine the largest size of a non-averaging set in a $d$-dimensional box for any fixed $d$. Our main tool includes the structure theorem for the set of subset sums due to Conlon, Fox and the first author, together with a result about the structure of a point set in nearly convex position.
△ Less
Submitted 18 October, 2024;
originally announced October 2024.
-
On monochromatic solutions to linear equations over the integers
Authors:
Dingding Dong,
Nitya Mani,
Huy Tuan Pham,
Jonathan Tidor
Abstract:
We study the number of monochromatic solutions to linear equations in a $2$-coloring of $\{1,\ldots,n\}$. We show that any nontrivial linear equation has a constant fraction of solutions that are monochromatic in any $2$-coloring of $\{1,\ldots,n\}$. We further study commonness of four-term equations and disprove a conjecture of Costello and Elvin by showing that, unlike over $\mathbb{F}_p$, the f…
▽ More
We study the number of monochromatic solutions to linear equations in a $2$-coloring of $\{1,\ldots,n\}$. We show that any nontrivial linear equation has a constant fraction of solutions that are monochromatic in any $2$-coloring of $\{1,\ldots,n\}$. We further study commonness of four-term equations and disprove a conjecture of Costello and Elvin by showing that, unlike over $\mathbb{F}_p$, the four-term equation $x_1 + 2x_2 - x_3 - 2x_4 = 0$ is uncommon over $\{1,\ldots,n\}$.
△ Less
Submitted 17 October, 2024;
originally announced October 2024.
-
Spread blow-up lemma with an application to perturbed random graphs
Authors:
Rajko Nenadov,
Huy Tuan Pham
Abstract:
Combining ideas of Pham, Sah, Sawhney, and Simkin on spread perfect matchings in super-regular bipartite graphs with an algorithmic blow-up lemma, we prove a spread version of the blow-up lemma. Intuitively, this means that there exists a probability measure over copies of a desired spanning graph $H$ in a given system of super-regular pairs which does not heavily pin down any subset of vertices.…
▽ More
Combining ideas of Pham, Sah, Sawhney, and Simkin on spread perfect matchings in super-regular bipartite graphs with an algorithmic blow-up lemma, we prove a spread version of the blow-up lemma. Intuitively, this means that there exists a probability measure over copies of a desired spanning graph $H$ in a given system of super-regular pairs which does not heavily pin down any subset of vertices. This allows one to complement the use of the blow-up lemma with the recently resolved Kahn-Kalai conjecture. As an application, we prove an approximate version of a conjecture of Böttcher, Parczyk, Sgueglia, and Skokan on the threshold for appearance of powers of Hamilton cycles in perturbed random graphs.
△ Less
Submitted 8 October, 2024;
originally announced October 2024.
-
Short proof of the hypergraph container theorem
Authors:
Rajko Nenadov,
Huy Tuan Pham
Abstract:
We present a short and simple proof of the celebrated hypergraph container theorem of Balogh--Morris--Samotij and Saxton--Thomason. On a high level, our argument utilises the idea of iteratively taking vertices of largest degree from an independent set and constructing a hypergraph of lower uniformity which preserves independent sets and inherits edge distribution. The original algorithms for cons…
▽ More
We present a short and simple proof of the celebrated hypergraph container theorem of Balogh--Morris--Samotij and Saxton--Thomason. On a high level, our argument utilises the idea of iteratively taking vertices of largest degree from an independent set and constructing a hypergraph of lower uniformity which preserves independent sets and inherits edge distribution. The original algorithms for constructing containers also remove in each step vertices of high degree which are not in the independent set. Our modified algorithm postpones this until the end, which surprisingly results in a significantly simplified analysis.
△ Less
Submitted 10 September, 2024; v1 submitted 15 August, 2024;
originally announced August 2024.
-
Sunflowers in set systems with small VC-dimension
Authors:
József Balogh,
Anton Bernshteyn,
Michelle Delcourt,
Asaf Ferber,
Huy Tuan Pham
Abstract:
A family of $r$ distinct sets $\{A_1,\ldots, A_r\}$ is an $r$-sunflower if for all $1 \leqslant i < j \leqslant r$ and $1 \leqslant i' < j' \leqslant r$, we have $A_i \cap A_j = A_{i'} \cap A_{j'}$. Erdős and Rado conjectured in 1960 that every family $\mathcal{H}$ of $\ell$-element sets of size at least $K(r)^\ell$ contains an $r$-sunflower, where $K(r)$ is some function that depends only on $r$.…
▽ More
A family of $r$ distinct sets $\{A_1,\ldots, A_r\}$ is an $r$-sunflower if for all $1 \leqslant i < j \leqslant r$ and $1 \leqslant i' < j' \leqslant r$, we have $A_i \cap A_j = A_{i'} \cap A_{j'}$. Erdős and Rado conjectured in 1960 that every family $\mathcal{H}$ of $\ell$-element sets of size at least $K(r)^\ell$ contains an $r$-sunflower, where $K(r)$ is some function that depends only on $r$. We prove that if $\mathcal{H}$ is a family of $\ell$-element sets of VC-dimension at most $d$ and $|\mathcal{H}| > (C r (\log d+\log^\ast \ell))^\ell$ for some absolute constant $C > 0$, then $\mathcal{H}$ contains an $r$-sunflower. This improves a recent result of Fox, Pach, and Suk. When $d=1$, we obtain a sharp bound, namely that $|\mathcal{H}| > (r-1)^\ell$ is sufficient. Along the way, we establish a strengthening of the Kahn-Kalai conjecture for set families of bounded VC-dimension, which is of independent interest.
△ Less
Submitted 7 August, 2024;
originally announced August 2024.
-
An explicit economical additive basis
Authors:
Vishesh Jain,
Huy Tuan Pham,
Mehtaab Sawhney,
Dmitrii Zakharov
Abstract:
We present an explicit subset $A\subseteq \mathbb{N} = \{0,1,\ldots\}$ such that $A + A = \mathbb{N}$ and for all $\varepsilon > 0$, \[\lim_{N\to \infty}\frac{\big|\big\{(n_1,n_2): n_1 + n_2 = N, (n_1,n_2)\in A^2\big\}\big|}{N^{\varepsilon}} = 0.\] This answers a question of Erdős.
We present an explicit subset $A\subseteq \mathbb{N} = \{0,1,\ldots\}$ such that $A + A = \mathbb{N}$ and for all $\varepsilon > 0$, \[\lim_{N\to \infty}\frac{\big|\big\{(n_1,n_2): n_1 + n_2 = N, (n_1,n_2)\in A^2\big\}\big|}{N^{\varepsilon}} = 0.\] This answers a question of Erdős.
△ Less
Submitted 14 May, 2024;
originally announced May 2024.
-
The largest subgraph without a forbidden induced subgraph
Authors:
Jacob Fox,
Rajko Nenadov,
Huy Tuan Pham
Abstract:
We initiate the systematic study of the following Turán-type question. Suppose $Γ$ is a graph with $n$ vertices such that the edge density between any pair of subsets of vertices of size at least $t$ is at most $1 - c$, for some $t$ and $c > 0$. What is the largest number of edges in a subgraph $G \subseteq Γ$ which does not contain a fixed graph $H$ as an induced subgraph or, more generally, whic…
▽ More
We initiate the systematic study of the following Turán-type question. Suppose $Γ$ is a graph with $n$ vertices such that the edge density between any pair of subsets of vertices of size at least $t$ is at most $1 - c$, for some $t$ and $c > 0$. What is the largest number of edges in a subgraph $G \subseteq Γ$ which does not contain a fixed graph $H$ as an induced subgraph or, more generally, which belongs to a hereditary property $\mathcal{P}$? This provides a common generalization of two recently studied cases, namely $Γ$ being a (pseudo-)random graph and a graph without a large complete bipartite subgraph. We focus on the interesting case where $H$ is a bipartite graph.
We determine the answer up to a constant factor with respect to $n$ and $t$, for certain bipartite $H$ and for $Γ$ either a dense random graph or a Paley graph with a square number of vertices. In particular, our bounds match if $H$ is a tree, or if one part of $H$ has $d$ vertices complete to the other part, all other vertices in that part have degree at most $d$, and the other part has sufficiently many vertices. As applications of the latter result, we answer a question of Alon, Krivelevich, and Samotij on the largest subgraph with a hereditary property which misses a bipartite graph, and determine up to a constant factor the largest number of edges in a string subgraph of $Γ$. The proofs are based on a variant of the dependent random choice and a novel approach for finding induced copies by inductively defining probability distributions supported on induced copies of smaller subgraphs.
△ Less
Submitted 8 June, 2024; v1 submitted 9 May, 2024;
originally announced May 2024.
-
A question of Erdős and Graham on Egyptian fractions
Authors:
David Conlon,
Jacob Fox,
Xiaoyu He,
Dhruv Mubayi,
Huy Tuan Pham,
Andrew Suk,
Jacques Verstraëte
Abstract:
Answering a question of Erdős and Graham, we show that for each fixed positive rational number $x$ the number of ways to write $x$ as a sum of reciprocals of distinct positive integers each at most $n$ is $2^{(c_x + o(1))n}$ for an explicit constant $c_x$ increasing with $x$.
Answering a question of Erdős and Graham, we show that for each fixed positive rational number $x$ the number of ways to write $x$ as a sum of reciprocals of distinct positive integers each at most $n$ is $2^{(c_x + o(1))n}$ for an explicit constant $c_x$ increasing with $x$.
△ Less
Submitted 24 April, 2024;
originally announced April 2024.
-
Sampling from Spherical Spin Glasses in Total Variation via Algorithmic Stochastic Localization
Authors:
Brice Huang,
Andrea Montanari,
Huy Tuan Pham
Abstract:
We consider the problem of algorithmically sampling from the Gibbs measure of a mixed $p$-spin spherical spin glass. We give a polynomial-time algorithm that samples from the Gibbs measure up to vanishing total variation error, for any model whose mixture satisfies $$ξ''(s) < \frac{1}{(1-s)^2}, \qquad \forall s\in [0,1).$$ This includes the pure $p$-spin glasses above a critical temperature that i…
▽ More
We consider the problem of algorithmically sampling from the Gibbs measure of a mixed $p$-spin spherical spin glass. We give a polynomial-time algorithm that samples from the Gibbs measure up to vanishing total variation error, for any model whose mixture satisfies $$ξ''(s) < \frac{1}{(1-s)^2}, \qquad \forall s\in [0,1).$$ This includes the pure $p$-spin glasses above a critical temperature that is within an absolute ($p$-independent) constant of the so-called shattering phase transition. Our algorithm follows the algorithmic stochastic localization approach introduced in (Alaoui, Montanari, Sellke, 20022). A key step of this approach is to estimate the mean of a sequence of tilted measures. We produce an improved estimator for this task by identifying a suitable correction to the TAP fixed point selected by approximate message passing (AMP). As a consequence, we improve the algorithm's guarantee over previous work, from normalized Wasserstein to total variation error. In particular, the new algorithm and analysis opens the way to perform inference about one-dimensional projections of the measure.
△ Less
Submitted 24 April, 2024;
originally announced April 2024.
-
Equivalence between Erdős-Hajnal and polynomial Rödl and Nikiforov conjectures
Authors:
Matija Bucić,
Jacob Fox,
Huy Tuan Pham
Abstract:
It is well-known that polynomial versions of theorems of Rödl and Nikiforov, as conjectured by Fox and Sudakov and Nguyen, Scott and Seymour imply the classical Erdős-Hajnal conjecture. In this note, we prove that these three conjectures are in fact equivalent, extending several previous particular results in this direction by Fox, Nguyen, Scott and Seymour; Nguyen, Scott and Seymour and Gishbolin…
▽ More
It is well-known that polynomial versions of theorems of Rödl and Nikiforov, as conjectured by Fox and Sudakov and Nguyen, Scott and Seymour imply the classical Erdős-Hajnal conjecture. In this note, we prove that these three conjectures are in fact equivalent, extending several previous particular results in this direction by Fox, Nguyen, Scott and Seymour; Nguyen, Scott and Seymour and Gishboliner and Shapira. We deduce that the family of string graphs satisfies the polynomial Rödl conjecture. We also derive analogous results for hypergraphs, tournaments, ordered graphs, and colored graphs.
△ Less
Submitted 19 April, 2024; v1 submitted 13 March, 2024;
originally announced March 2024.
-
A multipartite analogue of Dilworth's Theorem
Authors:
Jacob Fox,
Huy Tuan Pham
Abstract:
We prove that every partially ordered set on $n$ elements contains $k$ subsets $A_{1},A_{2},\dots,A_{k}$ such that either each of these subsets has size $Ω(n/k^{5})$ and, for every $i<j$, every element in $A_{i}$ is less than or equal to every element in $A_{j}$, or each of these subsets has size $Ω(n/(k^{2}\log n))$ and, for every $i \not = j$, every element in $A_{i}$ is incomparable with every…
▽ More
We prove that every partially ordered set on $n$ elements contains $k$ subsets $A_{1},A_{2},\dots,A_{k}$ such that either each of these subsets has size $Ω(n/k^{5})$ and, for every $i<j$, every element in $A_{i}$ is less than or equal to every element in $A_{j}$, or each of these subsets has size $Ω(n/(k^{2}\log n))$ and, for every $i \not = j$, every element in $A_{i}$ is incomparable with every element in $A_{j}$ for $i\ne j$. This answers a question of the first author from 2006. As a corollary, we prove for each positive integer $h$ there is $C_h$ such that for any $h$ partial orders $<_{1},<_{2},\dots,<_{h}$ on a set of $n$ elements, there exists $k$ subsets $A_{1},A_{2},\dots,A_{k}$ each of size at least $n/(k\log n)^{C_{h}}$ such that for each partial order $<_{\ell}$, either $a_{1}<_{\ell}a_{2}<_{\ell}\dots<_{\ell}a_{k}$ for any tuple of elements $(a_1,a_2,\dots,a_k) \in A_1\times A_2\times \dots \times A_k$, or $a_{1}>_{\ell}a_{2}>_{\ell}\dots>_{\ell}a_{k}$ for any $(a_1,a_2,\dots,a_k) \in A_1\times A_2\times \dots \times A_k$, or $a_i$ is incomparable with $a_j$ for any $i\ne j$, $a_i\in A_i$ and $a_j\in A_j$. This improves on a 2009 result of Pach and the first author motivated by problems in discrete geometry.
△ Less
Submitted 1 January, 2024;
originally announced January 2024.
-
Homogeneous structures in subset sums and non-averaging sets
Authors:
David Conlon,
Jacob Fox,
Huy Tuan Pham
Abstract:
We show that for every positive integer $k$ there are positive constants $C$ and $c$ such that if $A$ is a subset of $\{1, 2, \dots, n\}$ of size at least $C n^{1/k}$, then, for some $d \leq k-1$, the set of subset sums of $A$ contains a homogeneous $d$-dimensional generalized arithmetic progression of size at least $c|A|^{d+1}$. This strengthens a result of Szemerédi and Vu, who proved a similar…
▽ More
We show that for every positive integer $k$ there are positive constants $C$ and $c$ such that if $A$ is a subset of $\{1, 2, \dots, n\}$ of size at least $C n^{1/k}$, then, for some $d \leq k-1$, the set of subset sums of $A$ contains a homogeneous $d$-dimensional generalized arithmetic progression of size at least $c|A|^{d+1}$. This strengthens a result of Szemerédi and Vu, who proved a similar statement without the homogeneity condition. As an application, we make progress on the Erdős--Straus non-averaging sets problem, showing that every subset $A$ of $\{1, 2, \dots, n\}$ of size at least $n^{\sqrt{2} - 1 + o(1)}$ contains an element which is the average of two or more other elements of $A$. This gives the first polynomial improvement on a result of Erdős and Sárközy from 1990.
△ Less
Submitted 2 November, 2023;
originally announced November 2023.
-
Universality of Spectral Independence with Applications to Fast Mixing in Spin Glasses
Authors:
Nima Anari,
Vishesh Jain,
Frederic Koehler,
Huy Tuan Pham,
Thuy-Duong Vuong
Abstract:
We study Glauber dynamics for sampling from discrete distributions $μ$ on the hypercube $\{\pm 1\}^n$. Recently, techniques based on spectral independence have successfully yielded optimal $O(n)$ relaxation times for a host of different distributions $μ$. We show that spectral independence is universal: a relaxation time of $O(n)$ implies spectral independence.
We then study a notion of tractabi…
▽ More
We study Glauber dynamics for sampling from discrete distributions $μ$ on the hypercube $\{\pm 1\}^n$. Recently, techniques based on spectral independence have successfully yielded optimal $O(n)$ relaxation times for a host of different distributions $μ$. We show that spectral independence is universal: a relaxation time of $O(n)$ implies spectral independence.
We then study a notion of tractability for $μ$, defined in terms of smoothness of the multilinear extension of its Hamiltonian -- $\log μ$ -- over $[-1,+1]^n$. We show that Glauber dynamics has relaxation time $O(n)$ for such $μ$, and using the universality of spectral independence, we conclude that these distributions are also fractionally log-concave and consequently satisfy modified log-Sobolev inequalities. We sharpen our estimates and obtain approximate tensorization of entropy and the optimal $\widetilde{O}(n)$ mixing time for random Hamiltonians, i.e. the classically studied mixed $p$-spin model at sufficiently high temperature. These results have significant downstream consequences for concentration of measure, statistical testing, and learning.
△ Less
Submitted 19 July, 2023;
originally announced July 2023.
-
Set-coloring Ramsey numbers and error-correcting codes near the zero-rate threshold
Authors:
David Conlon,
Jacob Fox,
Huy Tuan Pham,
Yufei Zhao
Abstract:
For positive integers $n,r,s$ with $r > s$, the set-coloring Ramsey number $R(n;r,s)$ is the minimum $N$ such that if every edge of the complete graph $K_N$ receives a set of $s$ colors from a palette of $r$ colors, then there is a subset of $n$ vertices where all of the edges between them receive a common color. If $n$ is fixed and $\frac{s}{r}$ is less than and bounded away from…
▽ More
For positive integers $n,r,s$ with $r > s$, the set-coloring Ramsey number $R(n;r,s)$ is the minimum $N$ such that if every edge of the complete graph $K_N$ receives a set of $s$ colors from a palette of $r$ colors, then there is a subset of $n$ vertices where all of the edges between them receive a common color. If $n$ is fixed and $\frac{s}{r}$ is less than and bounded away from $1-\frac{1}{n-1}$, then $R(n;r,s)$ is known to grow exponentially in $r$, while if $\frac{s}{r}$ is greater than and bounded away from $1-\frac{1}{n-1}$, then $R(n;r,s)$ is bounded. Here we prove bounds for $R(n;r,s)$ in the intermediate range where $\frac{s}{r}$ is close to $1 - \frac{1}{n-1}$ by establishing a connection to the maximum size of error-correcting codes near the zero-rate threshold.
△ Less
Submitted 14 August, 2023; v1 submitted 23 May, 2023;
originally announced May 2023.
-
Optimal mixing of the down-up walk on independent sets of a given size
Authors:
Vishesh Jain,
Marcus Michelen,
Huy Tuan Pham,
Thuy-Duong Vuong
Abstract:
Let $G$ be a graph on $n$ vertices of maximum degree $Δ$. We show that, for any $δ> 0$, the down-up walk on independent sets of size $k \leq (1-δ)α_c(Δ)n$ mixes in time $O_{Δ,δ}(k\log{n})$, thereby resolving a conjecture of Davies and Perkins in an optimal form. Here, $α_{c}(Δ)n$ is the NP-hardness threshold for the problem of counting independent sets of a given size in a graph on $n$ vertices of…
▽ More
Let $G$ be a graph on $n$ vertices of maximum degree $Δ$. We show that, for any $δ> 0$, the down-up walk on independent sets of size $k \leq (1-δ)α_c(Δ)n$ mixes in time $O_{Δ,δ}(k\log{n})$, thereby resolving a conjecture of Davies and Perkins in an optimal form. Here, $α_{c}(Δ)n$ is the NP-hardness threshold for the problem of counting independent sets of a given size in a graph on $n$ vertices of maximum degree $Δ$. Our mixing time has optimal dependence on $k,n$ for the entire range of $k$; previously, even polynomial mixing was not known. In fact, for $k = Ω_Δ(n)$ in this range, we establish a log-Sobolev inequality with optimal constant $Ω_{Δ,δ}(1/n)$.
At the heart of our proof are three new ingredients, which may be of independent interest. The first is a method for lifting $\ell_\infty$-independence from a suitable distribution on the discrete cube -- in this case, the hard-core model -- to the slice by proving stability of an Edgeworth expansion using a multivariate zero-free region for the base distribution. The second is a generalization of the Lee-Yau induction to prove log-Sobolev inequalities for distributions on the slice with considerably less symmetry than the uniform distribution. The third is a sharp decomposition-type result which provides a lossless comparison between the Dirichlet form of the original Markov chain and that of the so-called projected chain in the presence of a contractive coupling.
△ Less
Submitted 10 May, 2023;
originally announced May 2023.
-
Optimal thresholds for Latin squares, Steiner Triple Systems, and edge colorings
Authors:
Vishesh Jain,
Huy Tuan Pham
Abstract:
We show that the threshold for the binomial random $3$-partite, $3$-uniform hypergraph $G^{3}((n,n,n),p)$ to contain a Latin square is $Θ(\log{n}/n)$. We also prove analogous results for Steiner triple systems and proper list edge-colorings of the complete (bipartite) graph with random lists. Our results answer several related questions of Johansson, Luria-Simkin, Casselgren-Häggkvist, Simkin, and…
▽ More
We show that the threshold for the binomial random $3$-partite, $3$-uniform hypergraph $G^{3}((n,n,n),p)$ to contain a Latin square is $Θ(\log{n}/n)$. We also prove analogous results for Steiner triple systems and proper list edge-colorings of the complete (bipartite) graph with random lists. Our results answer several related questions of Johansson, Luria-Simkin, Casselgren-Häggkvist, Simkin, and Kang-Kelly-Kühn-Methuku-Osthus.
△ Less
Submitted 19 December, 2022; v1 submitted 12 December, 2022;
originally announced December 2022.
-
Small subsets with large sumset: Beyond the Cauchy--Davenport bound
Authors:
Jacob Fox,
Sammy Luo,
Huy Tuan Pham,
Yunkun Zhou
Abstract:
For a subset $A$ of an abelian group $G$, given its size $|A|$, its doubling $κ=|A+A|/|A|$, and a parameter $s$ which is small compared to $|A|$, we study the size of the largest sumset $A+A'$ that can be guaranteed for a subset $A'$ of $A$ of size at most $s$. We show that a subset $A'\subseteq A$ of size at most $s$ can be found so that $|A+A'| = Ω(\min(κ^{1/3},s)|A|)$. Thus a sumset significant…
▽ More
For a subset $A$ of an abelian group $G$, given its size $|A|$, its doubling $κ=|A+A|/|A|$, and a parameter $s$ which is small compared to $|A|$, we study the size of the largest sumset $A+A'$ that can be guaranteed for a subset $A'$ of $A$ of size at most $s$. We show that a subset $A'\subseteq A$ of size at most $s$ can be found so that $|A+A'| = Ω(\min(κ^{1/3},s)|A|)$. Thus a sumset significantly larger than the Cauchy--Davenport bound can be guaranteed by a bounded size subset assuming that the doubling $κ$ is large. Building up on the same ideas, we resolve a conjecture of Bollobás, Leader and Tiba that for subsets $A,B$ of $\mathbb{Z}_p$ of size at most $αp$ for an appropriate constant $α>0$, one only needs three elements $b_1,b_2,b_3\in B$ to guarantee $|A+\{b_1,b_2,b_3\}|\ge |A|+|B|-1$. Allowing the use of larger subsets $A'$, we show that for sets $A$ of bounded doubling, one only needs a subset $A'$ with $o(|A|)$ elements to guarantee that $A+A'=A+A$. We also address another conjecture and a question raised by Bollobás, Leader and Tiba on high-dimensional analogs and sets whose sumset cannot be saturated by a bounded size subset.
△ Less
Submitted 13 October, 2022;
originally announced October 2022.
-
A Toolkit for Robust Thresholds
Authors:
Huy Tuan Pham,
Ashwin Sah,
Mehtaab Sawhney,
Michael Simkin
Abstract:
Consider a host hypergraph $G$ which contains a spanning structure due to minimum degree considerations. We collect three results proving that if the edges of $G$ are sampled at the appropriate rate then the spanning structure still appears with high probability in the sampled hypergraph. We prove such results for perfect matchings in hypergraphs above Dirac thresholds, for $K_r$-factors in graphs…
▽ More
Consider a host hypergraph $G$ which contains a spanning structure due to minimum degree considerations. We collect three results proving that if the edges of $G$ are sampled at the appropriate rate then the spanning structure still appears with high probability in the sampled hypergraph. We prove such results for perfect matchings in hypergraphs above Dirac thresholds, for $K_r$-factors in graphs satisfying the Hajnal--Szemerédi minimum degree condition, and for bounded-degree spanning trees. In each case our proof is based on constructing a spread measure and then applying recent results on the (fractional) Kahn--Kalai conjecture connecting the existence of such measures with probabilistic thresholds.
For our second result we give a shorter and more general proof of a recent theorem of Allen, Böttcher, Corsten, Davies, Jenssen, Morris, Roberts, and Skokan which handles the $r=3$ case with different techniques. In particular, we answer a question of theirs with regards to the number of $K_r$-factors in graphs satisfying the Hajnal--Szemerédi minimum degree condition.
△ Less
Submitted 15 May, 2023; v1 submitted 6 October, 2022;
originally announced October 2022.
-
On random irregular subgraphs
Authors:
Jacob Fox,
Sammy Luo,
Huy Tuan Pham
Abstract:
Let $G$ be a $d$-regular graph on $n$ vertices. Frieze, Gould, Karoński and Pfender began the study of the following random spanning subgraph model $H=H(G)$. Assign independently to each vertex $v$ of $G$ a uniform random number $x(v) \in [0,1]$, and an edge $(u,v)$ of $G$ is an edge of $H$ if and only if $x(u)+x(v) \geq 1$. Addressing a problem of Alon and Wei, we prove that if…
▽ More
Let $G$ be a $d$-regular graph on $n$ vertices. Frieze, Gould, Karoński and Pfender began the study of the following random spanning subgraph model $H=H(G)$. Assign independently to each vertex $v$ of $G$ a uniform random number $x(v) \in [0,1]$, and an edge $(u,v)$ of $G$ is an edge of $H$ if and only if $x(u)+x(v) \geq 1$. Addressing a problem of Alon and Wei, we prove that if $d = o(n/(\log n)^{12})$, then with high probability, for each nonnegative integer $k \leq d$, there are $(1+o(1))n/(d+1)$ vertices of degree $k$ in $H$.
△ Less
Submitted 27 July, 2022;
originally announced July 2022.
-
On a conjecture of Talagrand on selector processes and a consequence on positive empirical processes
Authors:
Jinyoung Park,
Huy Tuan Pham
Abstract:
For appropriate Gaussian processes, as a corollary of the majorizing measure theorem, Michel Talagrand (1987) proved that the event that the supremum is significantly larger than its expectation can be covered by a set of half-spaces whose sum of measures is small. We prove a conjecture of Talagrand that is the analog of this result in the Bernoulli-$p$ setting, and answer a question of Talagrand…
▽ More
For appropriate Gaussian processes, as a corollary of the majorizing measure theorem, Michel Talagrand (1987) proved that the event that the supremum is significantly larger than its expectation can be covered by a set of half-spaces whose sum of measures is small. We prove a conjecture of Talagrand that is the analog of this result in the Bernoulli-$p$ setting, and answer a question of Talagrand on the analogous result for general positive empirical processes.
△ Less
Submitted 22 January, 2024; v1 submitted 21 April, 2022;
originally announced April 2022.
-
A Proof of the Kahn-Kalai Conjecture
Authors:
Jinyoung Park,
Huy Tuan Pham
Abstract:
Proving the ``expectation-threshold'' conjecture of Kahn and Kalai, we show that for any increasing property $\mathcal{F}$ on a finite set $X$, $$p_c(\mathcal{F})=O(q(\mathcal{F})\log \ell(\mathcal{F})),$$ where $p_c(\mathcal{F})$ and $q(\mathcal{F})$ are the threshold and ``expectation threshold'' of $\mathcal{F}$, and $\ell(\mathcal{F})$ is the maximum of $2$ and the maximum size of a minimal me…
▽ More
Proving the ``expectation-threshold'' conjecture of Kahn and Kalai, we show that for any increasing property $\mathcal{F}$ on a finite set $X$, $$p_c(\mathcal{F})=O(q(\mathcal{F})\log \ell(\mathcal{F})),$$ where $p_c(\mathcal{F})$ and $q(\mathcal{F})$ are the threshold and ``expectation threshold'' of $\mathcal{F}$, and $\ell(\mathcal{F})$ is the maximum of $2$ and the maximum size of a minimal member of $\mathcal{F}$.
△ Less
Submitted 13 April, 2023; v1 submitted 31 March, 2022;
originally announced March 2022.
-
Dimension reduction for maximum matchings and the Fastest Mixing Markov Chain
Authors:
Vishesh Jain,
Huy Tuan Pham,
Thuy-Duong Vuong
Abstract:
Let $G = (V,E)$ be an undirected graph with maximum degree $Δ$ and vertex conductance $Ψ^*(G)$. We show that there exists a symmetric, stochastic matrix $P$, with off-diagonal entries supported on $E$, whose spectral gap $γ^*(P)$ satisfies \[Ψ^*(G)^{2}/\logΔ\lesssim γ^*(P) \lesssim Ψ^*(G).\] Our bound is optimal under the Small Set Expansion Hypothesis, and answers a question of Olesker-Taylor and…
▽ More
Let $G = (V,E)$ be an undirected graph with maximum degree $Δ$ and vertex conductance $Ψ^*(G)$. We show that there exists a symmetric, stochastic matrix $P$, with off-diagonal entries supported on $E$, whose spectral gap $γ^*(P)$ satisfies \[Ψ^*(G)^{2}/\logΔ\lesssim γ^*(P) \lesssim Ψ^*(G).\] Our bound is optimal under the Small Set Expansion Hypothesis, and answers a question of Olesker-Taylor and Zanetti, who obtained such a result with $\logΔ$ replaced by $\log|V|$.
In order to obtain our result, we show how to embed a negative-type semi-metric $d$ defined on $V$ into a negative-type semi-metric $d'$ supported in $\mathbb{R}^{O(\logΔ)}$, such that the (fractional) matching number of the weighted graph $(V,E,d)$ is approximately equal to that of $(V,E,d')$.
△ Less
Submitted 23 March, 2022; v1 submitted 8 March, 2022;
originally announced March 2022.
-
Universality for low degree factors of random polynomials over finite fields
Authors:
Jimmy He,
Huy Tuan Pham,
Max Wenqiang Xu
Abstract:
We show that the counts of low degree irreducible factors of a random polynomial $f$ over $\mathbb{F}_q$ with independent but non-uniform coefficients behave like that of a uniform random polynomial, exhibiting a form of universality for random polynomials over finite fields. Our strongest results require various assumptions on the parameters, but we are able to obtain results requiring only…
▽ More
We show that the counts of low degree irreducible factors of a random polynomial $f$ over $\mathbb{F}_q$ with independent but non-uniform coefficients behave like that of a uniform random polynomial, exhibiting a form of universality for random polynomials over finite fields. Our strongest results require various assumptions on the parameters, but we are able to obtain results requiring only $q=p$ a prime with $p\leq \exp({n^{1/13}})$ where $n$ is the degree of the polynomial. Our proofs use Fourier analysis, and rely on tools recently applied by Breuillard and Varjú to study the $ax+b$ process, which show equidistribution for $f(α)$ at a single point. We extend this to handle multiple roots and the Hasse derivatives of $f$, which allow us to study the irreducible factors with multiplicity.
△ Less
Submitted 22 February, 2022; v1 submitted 16 January, 2022;
originally announced January 2022.
-
Entropic Independence II: Optimal Sampling and Concentration via Restricted Modified Log-Sobolev Inequalities
Authors:
Nima Anari,
Vishesh Jain,
Frederic Koehler,
Huy Tuan Pham,
Thuy-Duong Vuong
Abstract:
We introduce a framework for obtaining tight mixing times for Markov chains based on what we call restricted modified log-Sobolev inequalities. Modified log-Sobolev inequalities (MLSI) quantify the rate of relative entropy contraction for the Markov operator, and are notoriously difficult to establish. However, infinitesimally close to stationarity, entropy contraction becomes equivalent to varian…
▽ More
We introduce a framework for obtaining tight mixing times for Markov chains based on what we call restricted modified log-Sobolev inequalities. Modified log-Sobolev inequalities (MLSI) quantify the rate of relative entropy contraction for the Markov operator, and are notoriously difficult to establish. However, infinitesimally close to stationarity, entropy contraction becomes equivalent to variance contraction, a.k.a. a Poincare inequality, which is significantly easier to establish through, e.g., spectral analysis. Motivated by this observation, we study restricted modified log-Sobolev inequalities that guarantee entropy contraction not for all starting distributions, but for those in a large neighborhood of the stationary distribution. We show how to sample from the hardcore and Ising models on $n$-node graphs that have a constant $δ$ relative gap to the tree-uniqueness threshold, in nearly-linear time $\widetilde O_δ(n)$. Notably, our bound does not depend on the maximum degree $Δ$, and is therefore optimal even for high-degree graphs. This improves on prior mixing time bounds of $\widetilde O_{δ, Δ}(n)$ and $\widetilde O_δ(n^2)$, established via (non-restricted) modified log-Sobolev and Poincare inequalities respectively. We further show that optimal concentration inequalities can still be achieved from the restricted form of modified log-Sobolev inequalities. To establish restricted entropy contraction, we extend the entropic independence framework of Anari, Jain, Koehler, Pham, and Vuong to the setting of distributions that are spectrally independent under a restricted set of external fields. We also develop an orthogonal trick that might be of independent interest: utilizing Bernoulli factories we show how to implement Glauber dynamics updates on high-degree graphs in $O(1)$ time, assuming standard adjacency array representation of the graph.
△ Less
Submitted 5 November, 2021;
originally announced November 2021.
-
Entropic Independence I: Modified Log-Sobolev Inequalities for Fractionally Log-Concave Distributions and High-Temperature Ising Models
Authors:
Nima Anari,
Vishesh Jain,
Frederic Koehler,
Huy Tuan Pham,
Thuy-Duong Vuong
Abstract:
We introduce a notion called entropic independence that is an entropic analog of spectral notions of high-dimensional expansion. Informally, entropic independence of a background distribution $μ$ on $k$-sized subsets of a ground set of elements says that for any (possibly randomly chosen) set $S$, the relative entropy of a single element of $S$ drawn uniformly at random carries at most $O(1/k)$ fr…
▽ More
We introduce a notion called entropic independence that is an entropic analog of spectral notions of high-dimensional expansion. Informally, entropic independence of a background distribution $μ$ on $k$-sized subsets of a ground set of elements says that for any (possibly randomly chosen) set $S$, the relative entropy of a single element of $S$ drawn uniformly at random carries at most $O(1/k)$ fraction of the relative entropy of $S$. Entropic independence is the analog of the notion of spectral independence, if one replaces variance by entropy. We use entropic independence to derive tight mixing time bounds, overcoming the lossy nature of spectral analysis of Markov chains on exponential-sized state spaces. In our main technical result, we show a general way of deriving entropy contraction, a.k.a. modified log-Sobolev inequalities, for down-up random walks from spectral notions. We show that spectral independence of a distribution under arbitrary external fields automatically implies entropic independence.
To derive our results, we relate entropic independence to properties of polynomials: $μ$ is entropically independent exactly when a transformed version of the generating polynomial of $μ$ is upper bounded by its linear tangent; this property is implied by concavity of the said transformation, which was shown by prior work to be locally equivalent to spectral independence. We apply our results to obtain tight modified log-Sobolev inequalities and mixing times for multi-step down-up walks on fractionally log-concave distributions. As our flagship application, we establish the tight mixing time of $O(n\log n)$ for Glauber dynamics on Ising models whose interaction matrix has eigenspectrum lying within an interval of length smaller than $1$, improving upon the prior quadratic dependence on $n$.
△ Less
Submitted 4 November, 2021; v1 submitted 8 June, 2021;
originally announced June 2021.
-
The upper logarithmic density of monochromatic subset sums
Authors:
David Conlon,
Jacob Fox,
Huy Tuan Pham
Abstract:
We show that in any two-coloring of the positive integers there is a color for which the set of positive integers that can be represented as a sum of distinct elements with this color has upper logarithmic density at least $(2+\sqrt{3})/4$ and this is best possible. This answers a forty-year-old question of Erdős.
We show that in any two-coloring of the positive integers there is a color for which the set of positive integers that can be represented as a sum of distinct elements with this color has upper logarithmic density at least $(2+\sqrt{3})/4$ and this is best possible. This answers a forty-year-old question of Erdős.
△ Less
Submitted 22 September, 2022; v1 submitted 31 May, 2021;
originally announced May 2021.
-
Global Convergence of Three-layer Neural Networks in the Mean Field Regime
Authors:
Huy Tuan Pham,
Phan-Minh Nguyen
Abstract:
In the mean field regime, neural networks are appropriately scaled so that as the width tends to infinity, the learning dynamics tends to a nonlinear and nontrivial dynamical limit, known as the mean field limit. This lends a way to study large-width neural networks via analyzing the mean field limit. Recent works have successfully applied such analysis to two-layer networks and provided global co…
▽ More
In the mean field regime, neural networks are appropriately scaled so that as the width tends to infinity, the learning dynamics tends to a nonlinear and nontrivial dynamical limit, known as the mean field limit. This lends a way to study large-width neural networks via analyzing the mean field limit. Recent works have successfully applied such analysis to two-layer networks and provided global convergence guarantees. The extension to multilayer ones however has been a highly challenging puzzle, and little is known about the optimization efficiency in the mean field regime when there are more than two layers.
In this work, we prove a global convergence result for unregularized feedforward three-layer networks in the mean field regime. We first develop a rigorous framework to establish the mean field limit of three-layer networks under stochastic gradient descent training. To that end, we propose the idea of a \textit{neuronal embedding}, which comprises of a fixed probability space that encapsulates neural networks of arbitrary sizes. The identified mean field limit is then used to prove a global convergence guarantee under suitable regularity and convergence mode assumptions, which -- unlike previous works on two-layer networks -- does not rely critically on convexity. Underlying the result is a universal approximation property, natural of neural networks, which importantly is shown to hold at \textit{any} finite training time (not necessarily at convergence) via an algebraic topology argument.
△ Less
Submitted 11 May, 2021;
originally announced May 2021.
-
Spectral independence, coupling with the stationary distribution, and the spectral gap of the Glauber dynamics
Authors:
Vishesh Jain,
Huy Tuan Pham,
Thuy Duong Vuong
Abstract:
We present a new lower bound on the spectral gap of the Glauber dynamics for the Gibbs distribution of a spectrally independent $q$-spin system on a graph $G = (V,E)$ with maximum degree $Δ$. Notably, for several interesting examples, our bound covers the entire regime of $Δ$ excluded by arguments based on coupling with the stationary distribution. As concrete applications, by combining our new lo…
▽ More
We present a new lower bound on the spectral gap of the Glauber dynamics for the Gibbs distribution of a spectrally independent $q$-spin system on a graph $G = (V,E)$ with maximum degree $Δ$. Notably, for several interesting examples, our bound covers the entire regime of $Δ$ excluded by arguments based on coupling with the stationary distribution. As concrete applications, by combining our new lower bound with known spectral independence computations and known coupling arguments:
(1) We show that for a triangle-free graph $G = (V,E)$ with maximum degree $Δ\geq 3$, the Glauber dynamics for the uniform distribution on proper $k$-colorings with $k \geq (1.763\dots + δ)Δ$ colors has spectral gap $\tildeΩ_δ(|V|^{-1})$. Previously, such a result was known either if the girth of $G$ is at least $5$ [Dyer et.~al, FOCS 2004], or under restrictions on $Δ$ [Chen et.~al, STOC 2021; Hayes-Vigoda, FOCS 2003].
(2) We show that for a regular graph $G = (V,E)$ with degree $Δ\geq 3$ and girth at least $6$, and for any $\varepsilon, δ> 0$, the partition function of the hardcore model with fugacity $λ\leq (1-δ)λ_{c}(Δ)$ may be approximated within a $(1+\varepsilon)$-multiplicative factor in time $\tilde{O}_δ(n^{2}\varepsilon^{-2})$. Previously, such a result was known if the girth is at least $7$ [Efthymiou et.~al, SICOMP 2019].
(3) We show for the binomial random graph $G(n,d/n)$ with $d = O(1)$, with high probability, an approximately uniformly random matching may be sampled in time $O_{d}(n^{2+o(1)})$. This improves the corresponding running time of $\tilde{O}_{d}(n^{3})$ due to [Jerrum-Sinclair, SICOMP 1989; Jerrum, 2003].
△ Less
Submitted 3 May, 2021;
originally announced May 2021.
-
Subset sums, completeness and colorings
Authors:
David Conlon,
Jacob Fox,
Huy Tuan Pham
Abstract:
We develop novel techniques which allow us to prove a diverse range of results relating to subset sums and complete sequences of positive integers, including solutions to several longstanding open problems. These include: solutions to the three problems of Burr and Erdős on Ramsey complete sequences, for which Erdős later offered a combined total of \…
▽ More
We develop novel techniques which allow us to prove a diverse range of results relating to subset sums and complete sequences of positive integers, including solutions to several longstanding open problems. These include: solutions to the three problems of Burr and Erdős on Ramsey complete sequences, for which Erdős later offered a combined total of \$350; analogous results for the new notion of density complete sequences; the solution to a conjecture of Alon and Erdős on the minimum number of colors needed to color the positive integers less than $n$ so that $n$ cannot be written as a monochromatic sum; the exact determination of an extremal function introduced by Erdős and Graham on sets of integers avoiding a given subset sum; and, answering a question reiterated by several authors, a homogeneous strengthening of a seminal result of Szemerédi and Vu on long arithmetic progressions in subset sums.
△ Less
Submitted 30 April, 2021;
originally announced April 2021.
-
Regularity method and large deviation principles for the Erdős--Rényi hypergraph
Authors:
Nicholas A. Cook,
Amir Dembo,
Huy Tuan Pham
Abstract:
We develop a quantitative large deviations theory for random hypergraphs, which rests on tensor decomposition and counting lemmas under a novel family of cut-type norms. As our main application, we obtain sharp asymptotics for joint upper and lower tails of homomorphism counts in the $r$-uniform Erdős--Rényi hypergraph for any fixed $r\ge 2$, generalizing and improving on previous results for the…
▽ More
We develop a quantitative large deviations theory for random hypergraphs, which rests on tensor decomposition and counting lemmas under a novel family of cut-type norms. As our main application, we obtain sharp asymptotics for joint upper and lower tails of homomorphism counts in the $r$-uniform Erdős--Rényi hypergraph for any fixed $r\ge 2$, generalizing and improving on previous results for the Erdős--Rényi graph ($r=2$). The theory is sufficiently quantitative to allow the density of the hypergraph to vanish at a polynomial rate, and additionally yields tail asymptotics for other nonlinear functionals, such as induced homomorphism counts.
△ Less
Submitted 9 May, 2023; v1 submitted 17 February, 2021;
originally announced February 2021.
-
On the sampling Lovász Local Lemma for atomic constraint satisfaction problems
Authors:
Vishesh Jain,
Huy Tuan Pham,
Thuy-Duong Vuong
Abstract:
We study the problem of sampling an approximately uniformly random satisfying assignment for atomic constraint satisfaction problems i.e. where each constraint is violated by only one assignment to its variables. Let $p$ denote the maximum probability of violation of any constraint and let $Δ$ denote the maximum degree of the line graph of the constraints.
Our main result is a nearly-linear (in…
▽ More
We study the problem of sampling an approximately uniformly random satisfying assignment for atomic constraint satisfaction problems i.e. where each constraint is violated by only one assignment to its variables. Let $p$ denote the maximum probability of violation of any constraint and let $Δ$ denote the maximum degree of the line graph of the constraints.
Our main result is a nearly-linear (in the number of variables) time algorithm for this problem, which is valid in a Lovász local lemma type regime that is considerably less restrictive compared to previous works. In particular, we provide sampling algorithms for the uniform distribution on:
(1) $q$-colorings of $k$-uniform hypergraphs with $Δ\lesssim q^{(k-4)/3 + o_{q}(1)}.$
The exponent $1/3$ improves the previously best-known $1/7$ in the case $q, Δ= O(1)$ [Jain, Pham, Vuong; arXiv, 2020] and $1/9$ in the general case [Feng, He, Yin; STOC 2021].
(2) Satisfying assignments of Boolean $k$-CNF formulas with $Δ\lesssim 2^{k/5.741}.$
The constant $5.741$ in the exponent improves the previously best-known $7$ in the case $k = O(1)$ [Jain, Pham, Vuong; arXiv, 2020] and $13$ in the general case [Feng, He, Yin; STOC 2021].
(3) Satisfying assignments of general atomic constraint satisfaction problems with $p\cdot Δ^{7.043} \lesssim 1.$
The constant $7.043$ improves upon the previously best-known constant of $350$ [Feng, He, Yin; STOC 2021].
At the heart of our analysis is a novel information-percolation type argument for showing the rapid mixing of the Glauber dynamics for a carefully constructed projection of the uniform distribution on satisfying assignments. Notably, there is no natural partial order on the space, and we believe that the techniques developed for the analysis may be of independent interest.
△ Less
Submitted 16 February, 2021;
originally announced February 2021.
-
Mixing time of fractional random walk on finite fields
Authors:
Jimmy He,
Huy Tuan Pham,
Max Wenqiang Xu
Abstract:
We study a random walk on $\mathbb{F}_p$ defined by $X_{n+1}=1/X_n+\varepsilon_{n+1}$ if $X_n\neq 0$, and $X_{n+1}=\varepsilon_{n+1}$ if $X_n=0$, where $\varepsilon_{n+1}$ are independent and identically distributed. This can be seen as a non-linear analogue of the Chung--Diaconis--Graham process. We show that the mixing time is of order $\log p$, answering a question of Chatterjee and Diaconis.
We study a random walk on $\mathbb{F}_p$ defined by $X_{n+1}=1/X_n+\varepsilon_{n+1}$ if $X_n\neq 0$, and $X_{n+1}=\varepsilon_{n+1}$ if $X_n=0$, where $\varepsilon_{n+1}$ are independent and identically distributed. This can be seen as a non-linear analogue of the Chung--Diaconis--Graham process. We show that the mixing time is of order $\log p$, answering a question of Chatterjee and Diaconis.
△ Less
Submitted 12 March, 2021; v1 submitted 4 February, 2021;
originally announced February 2021.
-
Towards the sampling Lovász Local Lemma
Authors:
Vishesh Jain,
Huy Tuan Pham,
Thuy Duong Vuong
Abstract:
Let $Φ= (V, \mathcal{C})$ be a constraint satisfaction problem on variables $v_1,\dots, v_n$ such that each constraint depends on at most $k$ variables and such that each variable assumes values in an alphabet of size at most $[q]$. Suppose that each constraint shares variables with at most $Δ$ constraints and that each constraint is violated with probability at most $p$ (under the product measure…
▽ More
Let $Φ= (V, \mathcal{C})$ be a constraint satisfaction problem on variables $v_1,\dots, v_n$ such that each constraint depends on at most $k$ variables and such that each variable assumes values in an alphabet of size at most $[q]$. Suppose that each constraint shares variables with at most $Δ$ constraints and that each constraint is violated with probability at most $p$ (under the product measure on its variables). We show that for $k, q = O(1)$, there is a deterministic, polynomial time algorithm to approximately count the number of satisfying assignments and a randomized, polynomial time algorithm to sample from approximately the uniform distribution on satisfying assignments, provided that \[C\cdot q^{3}\cdot k \cdot p \cdot Δ^{7} < 1, \quad \text{where }C \text{ is an absolute constant.}\] Previously, a result of this form was known essentially only in the special case when each constraint is violated by exactly one assignment to its variables.
For the special case of $k$-CNF formulas, the term $Δ^{7}$ improves the previously best known $Δ^{60}$ for deterministic algorithms [Moitra, J.ACM, 2019] and $Δ^{13}$ for randomized algorithms [Feng et al., arXiv, 2020]. For the special case of properly $q$-coloring $k$-uniform hypergraphs, the term $Δ^{7}$ improves the previously best known $Δ^{14}$ for deterministic algorithms [Guo et al., SICOMP, 2019] and $Δ^{9}$ for randomized algorithms [Feng et al., arXiv, 2020].
△ Less
Submitted 24 November, 2020;
originally announced November 2020.
-
A Note on the Global Convergence of Multilayer Neural Networks in the Mean Field Regime
Authors:
Huy Tuan Pham,
Phan-Minh Nguyen
Abstract:
In a recent work, we introduced a rigorous framework to describe the mean field limit of the gradient-based learning dynamics of multilayer neural networks, based on the idea of a neuronal embedding. There we also proved a global convergence guarantee for three-layer (as well as two-layer) networks using this framework.
In this companion note, we point out that the insights in our previous work…
▽ More
In a recent work, we introduced a rigorous framework to describe the mean field limit of the gradient-based learning dynamics of multilayer neural networks, based on the idea of a neuronal embedding. There we also proved a global convergence guarantee for three-layer (as well as two-layer) networks using this framework.
In this companion note, we point out that the insights in our previous work can be readily extended to prove a global convergence guarantee for multilayer networks of any depths. Unlike our previous three-layer global convergence guarantee that assumes i.i.d. initializations, our present result applies to a type of correlated initialization. This initialization allows to, at any finite training time, propagate a certain universal approximation property through the depth of the neural network. To achieve this effect, we introduce a bidirectional diversity condition.
△ Less
Submitted 16 June, 2020;
originally announced June 2020.
-
Tower-type bounds for Roth's theorem with popular differences
Authors:
Jacob Fox,
Huy Tuan Pham,
Yufei Zhao
Abstract:
Green developed an arithmetic regularity lemma to prove a strengthening of Roth's theorem on arithmetic progressions in dense sets. It states that for every $ε> 0$ there is some $N_0(ε)$ such that for every $N \ge N_0(ε)$ and $A \subset [N]$ with $|A| = αN$, there is some nonzero $d$ such that $A$ contains at least $(α^3 - ε) N$ three-term arithmetic progressions with common difference $d$.
We p…
▽ More
Green developed an arithmetic regularity lemma to prove a strengthening of Roth's theorem on arithmetic progressions in dense sets. It states that for every $ε> 0$ there is some $N_0(ε)$ such that for every $N \ge N_0(ε)$ and $A \subset [N]$ with $|A| = αN$, there is some nonzero $d$ such that $A$ contains at least $(α^3 - ε) N$ three-term arithmetic progressions with common difference $d$.
We prove that the minimum $N_0(ε)$ in Green's theorem is an exponential tower of 2s of height on the order of $\log(1/ε)$. Both the lower and upper bounds are new. It shows that the tower-type bounds that arise from the use of a regularity lemma in this application are quantitatively necessary.
△ Less
Submitted 28 April, 2020;
originally announced April 2020.
-
Irreducibility of random polynomials of bounded degree
Authors:
Huy Tuan Pham,
Max Wenqiang Xu
Abstract:
It is known that random monic integral polynomials of bounded degree $d$ and integral coefficients distributed uniformly and independently in $[-H,H]$ are irreducible over $\mathbb{Z}$ with probability tending to $1$ as $H\to \infty$. In this paper, we give a general criterion for guaranteeing the same conclusion under much more general coefficient distributions, allowing them to be nonuniformly a…
▽ More
It is known that random monic integral polynomials of bounded degree $d$ and integral coefficients distributed uniformly and independently in $[-H,H]$ are irreducible over $\mathbb{Z}$ with probability tending to $1$ as $H\to \infty$. In this paper, we give a general criterion for guaranteeing the same conclusion under much more general coefficient distributions, allowing them to be nonuniformly and dependently distributed over arbitrary sets.
△ Less
Submitted 20 July, 2021; v1 submitted 24 February, 2020;
originally announced February 2020.
-
A Rigorous Framework for the Mean Field Limit of Multilayer Neural Networks
Authors:
Phan-Minh Nguyen,
Huy Tuan Pham
Abstract:
We develop a mathematically rigorous framework for multilayer neural networks in the mean field regime. As the network's widths increase, the network's learning trajectory is shown to be well captured by a meaningful and dynamically nonlinear limit (the \textit{mean field} limit), which is characterized by a system of ODEs. Our framework applies to a broad range of network architectures, learning…
▽ More
We develop a mathematically rigorous framework for multilayer neural networks in the mean field regime. As the network's widths increase, the network's learning trajectory is shown to be well captured by a meaningful and dynamically nonlinear limit (the \textit{mean field} limit), which is characterized by a system of ODEs. Our framework applies to a broad range of network architectures, learning dynamics and network initializations. Central to the framework is the new idea of a \textit{neuronal embedding}, which comprises of a non-evolving probability space that allows to embed neural networks of arbitrary widths.
Using our framework, we prove several properties of large-width multilayer neural networks. Firstly we show that independent and identically distributed initializations cause strong degeneracy effects on the network's learning trajectory when the network's depth is at least four. Secondly we obtain several global convergence guarantees for feedforward multilayer networks under a number of different setups. These include two-layer and three-layer networks with independent and identically distributed initializations, and multilayer networks of arbitrary depths with a special type of correlated initializations that is motivated by the new concept of \textit{bidirectional diversity}. Unlike previous works that rely on convexity, our results admit non-convex losses and hinge on a certain universal approximation property, which is a distinctive feature of infinite-width neural networks and is shown to hold throughout the training process. Aside from being the first known results for global convergence of multilayer networks in the mean field regime, they demonstrate flexibility of our framework and incorporate several new ideas and insights that depart from the conventional convex optimization wisdom.
△ Less
Submitted 13 February, 2023; v1 submitted 30 January, 2020;
originally announced January 2020.
-
Common and Sidorenko Linear Equations
Authors:
Jacob Fox,
Huy Tuan Pham,
Yufei Zhao
Abstract:
A linear equation with coefficients in $\mathbb{F}_q$ is common if the number of monochromatic solutions in any two-coloring of $\mathbb{F}_q^n$ is asymptotically (as $n \to \infty$) at least the number expected in a random two-coloring. The linear equation is Sidorenko if the number of solutions in any dense subset of $\mathbb{F}_q^n$ is asymptotically at least the number expected in a random set…
▽ More
A linear equation with coefficients in $\mathbb{F}_q$ is common if the number of monochromatic solutions in any two-coloring of $\mathbb{F}_q^n$ is asymptotically (as $n \to \infty$) at least the number expected in a random two-coloring. The linear equation is Sidorenko if the number of solutions in any dense subset of $\mathbb{F}_q^n$ is asymptotically at least the number expected in a random set of the same density.
In this paper, we characterize those linear equations which are common, and those which are Sidorenko. The main novelty is a construction based on choosing random Fourier coefficients that shows that certain linear equations do not have these properties. This solves problems posed in a paper of Saad and Wolf.
△ Less
Submitted 15 December, 2020; v1 submitted 14 October, 2019;
originally announced October 2019.
-
Popular progression differences in vector spaces II
Authors:
Jacob Fox,
Huy Tuan Pham
Abstract:
Green used an arithmetic analogue of Szemerédi's celebrated regularity lemma to prove the following strengthening of Roth's theorem in vector spaces. For every $α>0$, $β<α^3$, and prime number $p$, there is a least positive integer $n_p(α,β)$ such that if $n \geq n_p(α,β)$, then for every subset of $\mathbb{F}_p^n$ of density at least $α$ there is a nonzero $d$ for which the density of three-term…
▽ More
Green used an arithmetic analogue of Szemerédi's celebrated regularity lemma to prove the following strengthening of Roth's theorem in vector spaces. For every $α>0$, $β<α^3$, and prime number $p$, there is a least positive integer $n_p(α,β)$ such that if $n \geq n_p(α,β)$, then for every subset of $\mathbb{F}_p^n$ of density at least $α$ there is a nonzero $d$ for which the density of three-term arithmetic progressions with common difference $d$ is at least $β$. We determine for $p \geq 19$ the tower height of $n_p(α,β)$ up to an absolute constant factor and an additive term depending only on $p$. In particular, if we want half the random bound (so $β=α^3/2$), then the dimension $n$ required is a tower of twos of height $Θ\left((\log p) \log \log (1/α)\right)$. It turns out that the tower height in general takes on a different form in several different regions of $α$ and $β$, and different arguments are used both in the upper and lower bounds to handle these cases.
△ Less
Submitted 21 November, 2019; v1 submitted 28 August, 2017;
originally announced August 2017.
-
Popular progression differences in vector spaces
Authors:
Jacob Fox,
Huy Tuan Pham
Abstract:
Green proved an arithmetic analogue of Szemerédi's celebrated regularity lemma and used it to verify a conjecture of Bergelson, Host, and Kra which sharpens Roth's theorem on three-term arithmetic progressions in dense sets. It shows that for every subset of $\mathbb{F}_p^n$ with $n$ sufficiently large, the density of three-term arithmetic progressions with some nonzero common difference is at lea…
▽ More
Green proved an arithmetic analogue of Szemerédi's celebrated regularity lemma and used it to verify a conjecture of Bergelson, Host, and Kra which sharpens Roth's theorem on three-term arithmetic progressions in dense sets. It shows that for every subset of $\mathbb{F}_p^n$ with $n$ sufficiently large, the density of three-term arithmetic progressions with some nonzero common difference is at least the random bound (the cube of the set density) up to an additive $ε$. For a fixed odd prime $p$, we prove that the required dimension grows as an exponential tower of $p$'s of height $Θ(\log(1/ε))$. This improves both the lower and upper bound, and is the first example of a result where a tower-type bound coming from applying a regularity lemma is shown to be necessary.
△ Less
Submitted 28 August, 2017;
originally announced August 2017.
-
Kinetic and related macroscopic models for chemotaxis on networks
Authors:
Raul Borsche,
Axel Klar,
Ha T. N. Pham
Abstract:
In this paper we consider kinetic and associated macroscopic models for chemotaxis on a network. Coupling conditions at the nodes of the network for the kinetic problem are presented and used to derive coupling conditions for the macroscopic approximations. The results of the different models are compared and relations to a Keller-Segel model on networks are discussed. For a numerical approximatio…
▽ More
In this paper we consider kinetic and associated macroscopic models for chemotaxis on a network. Coupling conditions at the nodes of the network for the kinetic problem are presented and used to derive coupling conditions for the macroscopic approximations. The results of the different models are compared and relations to a Keller-Segel model on networks are discussed. For a numerical approximation of the governing equations asymptotic preserving relaxation schemes are extended to directed graphs. Kinetic and macroscopic equations are investigated numerically and their solutions are compared for tripod and more general networks.
△ Less
Submitted 27 December, 2015; v1 submitted 22 December, 2015;
originally announced December 2015.