D. M. Stolyarov, “Hardy-Littlewood-Sobolev inequality for $p=1$”, Mat. Sb., 213:6 (2022), 125–174; Sb. Math., 213:6 (2022), 844

Sbornik: Mathematics, 2022, Volume 213, Issue 6, Pages 844–889
DOI: https://doi.org/10.1070/SM9645 (Mi sm9645)

Hardy-Littlewood-Sobolev inequality for $p=1$

D. M. Stolyarov^ab

^a Saint Petersburg State University, St. Petersburg, Russia
^b St. Petersburg Department of Steklov Mathematical Institute of Russian Academy of Sciences, St. Petersburg, Russia

Abstract: Let $\mathcal{W}$ be a closed dilation and translation invariant subspace of the space of $\mathbb{R}^\ell$-valued Schwartz distributions in $d$ variables. We show that if the space $\mathcal{W}$ does not contain distributions of the type $a\otimes \delta_0$, $\delta_0$ being the Dirac delta, then the inequality $\|\operatorname{I}_\alpha [f]\|_{L_{d/(d-\alpha),1}}\lesssim \|f\|_{L_1}$ holds true for functions $f\in\mathcal{W}\cap L_1$ with a uniform constant; here $\operatorname{I}_\alpha$ is the Riesz potential of order $\alpha$ and $L_{p,1}$ is the Lorentz space. As particular cases, this result implies the inequality $\|\nabla^{m-1} f\|_{L_{d/(d-1),1}} \lesssim \|A f\|_{L_1}$, where $A$ is a cancelling elliptic differential operator of order $m$, and the inequality $\|\operatorname{I}_\alpha f\|_{L_{d/(d-\alpha),1}} \lesssim \|f\|_{L_1}$, where $f$ is a divergence free vector field.
Bibliography: 59 titles.

Keywords: Hardy-Littlewood-Sobolev inequality, Bourgain-Brezis inequalities, cancelling differential operators.

Funding: This research was carried out with the support of the Russian Foundation for Basic Research (grant no. 18-31-00037).

Received: 19.07.2021 and 04.03.2022

Russian version:
Matematicheskii Sbornik, 2022, Volume 213, Number 6, Pages 125–174
DOI: https://doi.org/10.4213/sm9645

Document Type: Article

MSC: Primary 46E35, 42B35; Secondary 35N05, 42B25

Language: English

Original paper language: Russian

§ 1. Introduction

1.1. The main theorem

Let $\mathcal{S}(\mathbb{R}^d,\mathbb{R}^\ell)$ be the Schwartz class of $\mathbb{R}^\ell$-valued functions in $d$ variables and let $\mathcal{S}'(\mathbb{R}^d,\mathbb{R}^\ell)$ be the dual space of tempered $\mathbb{R}^\ell$-valued distributions.

Our main result stated below may be thought of as a substitute for the classical Hardy-Littlewood-Sobolev inequality at the endpoint case when $p=1$ (in the present text $p$ will stand for the parameter on the left-hand side of the inequality, which is usually denoted by $q$; the summability parameter on the right-hand side is always equal to $1$ in our considerations).

Theorem 1. Let $\mathcal{W}$ be a closed linear subspace of $\mathcal{S}'(\mathbb{R}^d,\mathbb{R}^\ell)$ that is invariant under translations and dilations, and let $\alpha \in (0,d)$. Then the constant in the inequality

$$ \begin{equation} \|\operatorname{I}_\alpha [f]\|_{L_{d/(d-\alpha),1}}\lesssim \|f\|_{L_1}, \qquad f\in\mathcal{W}, \end{equation} \tag{1.1} $$

is uniform with respect to all $f\in \mathcal{W}$ for which the right-hand side is finite, if and only if $\mathcal{W}$ does not contain the distributions $a\otimes \delta_0$, $a\in \mathbb{R}^\ell \setminus \{0\}$.
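
The role of the delta measure can be seen from a heuristic computation. This is a sketch only: it assumes the normalization $\|g\|_{L_{p,1}} = p\int_0^\infty |\{|g|>t\}|^{1/p}\,dt$ of the Lorentz norm, and the symbols $c_{d,\alpha}$ (the constant in the Riesz kernel) and $\omega_d$ (the volume of the unit ball) are notation introduced here for illustration. For $\alpha\in(0,d)$ and $p = d/(d-\alpha)$ the potential of $a\otimes\delta_0$ equals $c_{d,\alpha}\,a\,|x|^{\alpha-d}$, and

$$ \begin{equation*} \bigl|\{x\in\mathbb{R}^d\mid |x|^{\alpha-d} > t\}\bigr| = \omega_d\, t^{-d/(d-\alpha)} = \omega_d\, t^{-p}, \qquad \int_0^\infty \bigl|\{|x|^{\alpha-d} > t\}\bigr|^{1/p}\,dt = \omega_d^{1/p}\int_0^\infty \frac{dt}{t} = \infty. \end{equation*} \notag $$

Thus the kernel $|x|^{\alpha-d}$ lies in $L_{p,\infty}$ but not in $L_{p,1}$. If $a\otimes\delta_0$ belonged to $\mathcal{W}$, one would expect its mollifications (which stay in $\mathcal{W}$ by translation invariance and closedness) to have uniformly bounded $L_1$-norms and unbounded $L_{p,1}$-norms of their potentials.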

Theorem 1 (and a stronger version of it, Theorem 2 below) finds many applications in the theory of inequalities involving vectorial differential operators and $L_1$ norms (so-called Bourgain-Brezis inequalities).

Our main aim is to demonstrate the strength of the martingale approach proposed in [4] in application to inequalities of this type and related problems. In particular, this approach allows one to get rid of the differential or Fourier structure (though the Fourier transform is implicitly present in Theorem 1 via translation invariance); Theorem 1 applies to various versions of Hardy spaces as well. From a pure analysis point of view, the main advantage of the martingale approach lies in its sharpness: we will push to the limit the interpolation parameters on the left-hand side of such inequalities, thus solving several open problems in the field. The presentation here is slightly more condensed than in the preprint [52]: the introduction is different, and some technical lemmas in the main body of the paper are stated without proofs. An interested reader can find these proofs in the preprint. Before passing to a more detailed exposition, we give three examples.

Example 1. Consider the case of divergence free distributions. Let $\ell = d$, and let

$$ \begin{equation*} \mathcal{W} = \bigl\{f\in \mathcal{S}'(\mathbb{R}^d,\mathbb{R}^d)\mid \operatorname{div} f = 0\bigr\}. \end{equation*} \notag $$

Clearly, the distributions of the type $a\otimes \delta_0$ do not belong to $\mathcal{W}$ in this case. Thus, Theorem 1 is applicable. If we choose $\alpha = 1$, this implies the inequality

$$ \begin{equation*} \|\operatorname{I}_1[f]\|_{L_{d/(d-1),1}} \lesssim \|f\|_{L_1}, \qquad \operatorname{div} f=0, \end{equation*} \notag $$

which was known as Open Problem $1$ in [12] (the original formulation uses the div-curl system instead of the Riesz potential). Recently, Hernandez and Spector have considered this particular inequality in [20] using different methods.
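
The exclusion of the delta measures in this example can be checked on the Fourier side. The following is a short verification, with the Fourier transform normalized as $\widehat{f}(\xi) = \int f(x)\exp(-2\pi i\langle\xi,x\rangle)\,dx$. For $a\in\mathbb{R}^d$,

$$ \begin{equation*} \mathcal{F}\bigl[\operatorname{div}(a\otimes\delta_0)\bigr](\xi) = 2\pi i\sum_{j=1}^d \xi_j a_j = 2\pi i\langle \xi,a\rangle, \qquad \xi\in\mathbb{R}^d. \end{equation*} \notag $$

The right-hand side vanishes for all $\xi$ only if $a = 0$ (take $\xi = a$), so $a\otimes\delta_0$ is divergence free only in the trivial case.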

Example 2. Now let us consider a more classical example of gradient measures. Let ${\ell = d}$, and let

$$ \begin{equation*} \mathcal{W} = \bigl\{f\in\mathcal{S}'(\mathbb{R}^d,\mathbb{R}^d)\mid f = \nabla g,\, g \in \mathcal{S}'(\mathbb{R}^d)\bigr\}. \end{equation*} \notag $$

This space can also be described as the space of curl-free distributions. Using this reformulation, one can see that the distributions $a\otimes \delta_0$ do not belong to $\mathcal{W}$ in this case either. Theorem 1 leads to the inequality

$$ \begin{equation} \|\operatorname{I}_\alpha [\nabla g]\|_{L_{d/(d-\alpha),1}}\lesssim \|\nabla g\|_{L_1}. \end{equation} \tag{1.2} $$

Using the embedding $L_{p,1}\hookrightarrow L_p$ and the fact that the $L_{d/(d-1)}$-norms of $g$ and $\operatorname{I}_1[\nabla g]$ are comparable for compactly supported functions $g$, one may derive the classical Gagliardo-Nirenberg embedding

$$ \begin{equation*} \dot{W}_1^1 \hookrightarrow L_{d/(d-1)} \end{equation*} \notag $$

from (1.2).

Example 3. One can go further and consider a vectorial, homogeneous of order $m$, elliptic differential operator $A$ that maps $V$-valued functions to $E$-valued functions; here $V$ and $E$ are finite-dimensional spaces. Let $\mathcal{L}(V,E)$ be the space of all linear operators with domain $V$ and image in $E$. One can think of $A$ in terms of its symbol $\mathbb{A}$ that is a mapping $\mathbb{A}\colon \mathbb{R}^d \to \mathcal{L}(V,E)$ such that

$$ \begin{equation*} A[f] = \mathcal{F}^{-1}\bigl[\mathbb{A} (2\pi i \xi)[\widehat{f}(\xi)]\bigr], \qquad f\in \mathcal{S}(\mathbb{R}^d,V). \end{equation*} \notag $$

Here and in what follows we use the standard Harmonic Analysis normalization of the Fourier transform

$$ \begin{equation*} \begin{gathered} \, \widehat{f}(\xi) =\mathcal{F}[f](\xi) = \int_{\mathbb{R}^d} f(x)\exp(-2\pi i\langle\xi,x\rangle)\,dx, \\ \widehat{\mu}(\xi) = \int_{\mathbb{R}^d}\exp(-2\pi i \langle\xi,x\rangle)\,d\mu(x). \end{gathered} \end{equation*} \notag $$

Since we assume that $A$ is homogeneous, the mapping $\mathbb{A}$ is a homogeneous (matrix-valued) polynomial of order $m$. In this case we set

$$ \begin{equation*} \mathcal{W} = \bigl\{f\in \mathcal{S}'(\mathbb{R}^d,E)\mid f = A[g],\, g \in \mathcal{S}'(\mathbb{R}^d,V)\bigr\}. \end{equation*} \notag $$

Theorem 1 says: the inequality

$$ \begin{equation} \|\operatorname{I}_{\alpha}[A[g]]\|_{L_{d/(d-\alpha),1}}\lesssim \|A[g]\|_{L_1} \end{equation} \tag{1.3} $$

holds true with a uniform constant if and only if the equation $A[u] = a\otimes \delta_0$, $a\in E\setminus\{0\}$, does not have distributional solutions; the latter condition has a clear reformulation in terms of $\mathbb{A}$ (see formulae (1.6) and (1.8), and Remark 4 below). Operators that satisfy this condition are called cancelling; they were introduced in [42]. The particular case of $\alpha = 1$ in (1.3) solves Open Problem 8.3 in [42]. We note that a similar inequality with the classical Lebesgue space $L_{d/(d-\alpha)}$ on the left-hand side was proved in [42].

The reader may ask: why is it so important to improve from $L_p$ to $L_{p,1}$? The difference between Lorentz and Lebesgue spaces can be emphasized via a version of Hardy’s inequality (which follows from the embedding into $L_{d/(d-\alpha),1}$ since ${|\,{\cdot}\,|^{-\alpha} \in L_{d/\alpha,\infty}}$ and does not follow directly from the embedding into $L_{d/(d-\alpha)}$ since $|\,{\cdot}\,|^{-\alpha}\notin L_{d/\alpha}$).

Corollary 1. Let $\mathcal{W}$ be a translation and dilation invariant subspace of $\mathcal{S}'(\mathbb{R}^d,\mathbb{R}^\ell)$ that does not contain the distributions $a\otimes \delta_0$, $a\ne 0$. Let $\alpha \in (0,d)$. Then

$$ \begin{equation*} \int_{\mathbb{R}^d}\frac{|\operatorname{I}_\alpha [f](x)|}{|x-x_0|^{\alpha}}\,dx \lesssim \|f\|_{L_1} \end{equation*} \notag $$

for any $x_0 \in \mathbb{R}^d$ and $f\in \mathcal{W} \cap L_1$.
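
Corollary 1 can be derived from Theorem 1 along the following lines. This is a sketch that assumes the Hölder inequality for Lorentz spaces; the symbols $p$, $p'$ and $\omega_d$ (the volume of the unit ball) are notation introduced here for illustration. With $p = d/(d-\alpha)$ and $p' = d/\alpha$,

$$ \begin{equation*} \int_{\mathbb{R}^d}\frac{|\operatorname{I}_\alpha [f](x)|}{|x-x_0|^{\alpha}}\,dx \lesssim \|\operatorname{I}_\alpha [f]\|_{L_{p,1}}\,\bigl\||\,{\cdot}\,-x_0|^{-\alpha}\bigr\|_{L_{p',\infty}} \lesssim \|f\|_{L_1}. \end{equation*} \notag $$

The weak norm of the weight is finite and does not depend on $x_0$, since

$$ \begin{equation*} \bigl|\{x\in\mathbb{R}^d\mid |x-x_0|^{-\alpha} > t\}\bigr| = \omega_d\, t^{-d/\alpha}, \qquad \sup_{t>0}\, t\,\bigl|\{|x-x_0|^{-\alpha} > t\}\bigr|^{\alpha/d} = \omega_d^{\alpha/d}. \end{equation*} \notag $$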

The author believes that the main advantage is that Theorem 1 goes beyond the differential structure described in the examples above. It shows that the phenomenon of Bourgain-Brezis inequalities belongs to the field of Harmonic Analysis rather than classical real variable analysis of differential operators. Theorem 1 says that the spaces $\mathcal{W}$ that do not contain delta measures serve as ‘Hardy spaces for fractional operators’. It is high time to introduce a Littlewood-Paley decomposition that will enable one to strengthen Theorem 1.

1.2. Littlewood-Paley decompositions and the Besov scale

We will be using Besov-Lorentz spaces since they arise naturally in our considerations. For details on Besov-Lorentz spaces, see Ch. 3 in [33]; we provide a brief outline only (in fact, we do not use any delicate properties of these spaces and they do not appear anywhere except the introduction, § 1). We will not need the full scale, but only the space $\dot{B}_{p,1}^{0,1}$. The norm in this space is defined by

$$ \begin{equation} \|g\|_{\dot{B}_{p,1}^{0,1}} = \sum_{k\in\mathbb{Z}} \|g*(\psi_k - \psi_{k-1})\|_{L_{p,1}}, \end{equation} \tag{1.4} $$

where the functions $\psi_k(x) = A^{dk}\psi(A^{k}x)$ form an approximate identity constructed from a smooth function $\psi$ whose Fourier transform equals one in a neighborhood of the origin and is compactly supported; here $A > 1$ is an auxiliary parameter (different choices of $A$ lead to equivalent norms).

Theorem 2. Let $\mathcal{W}$ be a closed linear subspace of $\mathcal{S}'(\mathbb{R}^d,\mathbb{R}^\ell)$ that is invariant under translations and dilations, and let $\alpha \in (0,d)$. Then the constant in the inequality

$$ \begin{equation*} \|\operatorname{I}_\alpha [f]\|_{\dot{B}_{d/(d-\alpha),1}^{0,1}}\lesssim \|f\|_{L_1}, \qquad f\in\mathcal{W}, \end{equation*} \notag $$

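is uniform with respect to all $f\in \mathcal{W}$ for which the right-hand side is finite, if and only if $\mathcal{W}$ does not contain the distributions $a\otimes \delta_0$, $a\in \mathbb{R}^\ell \setminus \{0\}$.

The limit relations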

$$ \begin{equation*} g*\psi_k \xrightarrow{L_{p,1}} g, \quad k \to \infty, \quad\text{and}\quad g*\psi_k \xrightarrow{L_{p,1}} 0, \quad k\to -\infty, \end{equation*} \notag $$

and the triangle inequality in $L_{p,1}$ (note that $p > 1$) imply that

$$ \begin{equation*} \|g\|_{L_{p,1}} \lesssim \|g\|_{\dot{B}_{p,1}^{0,1}}. \end{equation*} \notag $$

Thus, Theorem 1 is a consequence of Theorem 2.
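
In more detail, the embedding of the Besov-Lorentz space into $L_{p,1}$ rests on a telescoping sum. The following is a sketch in which $N$ is an auxiliary index introduced here. For any natural $N$,

$$ \begin{equation*} g*\psi_N - g*\psi_{-N} = \sum_{k=-N+1}^{N} g*(\psi_k - \psi_{k-1}), \end{equation*} \notag $$

and therefore

$$ \begin{equation*} \|g*\psi_N - g*\psi_{-N}\|_{L_{p,1}} \lesssim \sum_{k=-N+1}^{N} \|g*(\psi_k - \psi_{k-1})\|_{L_{p,1}} \leqslant \|g\|_{\dot{B}_{p,1}^{0,1}}. \end{equation*} \notag $$

Letting $N\to\infty$ and using the limit relations gives the bound. The sign $\lesssim$ appears because for $p>1$ the standard quasinorm in $L_{p,1}$ is only equivalent to a norm.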

One can use the embedding $\dot{B}_{d/(d-\alpha),1}^{0,1} \hookrightarrow \dot{B}_{d/(d-\alpha)}^{0,1}$ (which follows from $L_{p,1}\hookrightarrow L_p$) to obtain yet another corollary that solves Open Problem 8.2 in [42] (see also Open Problem 1 in [41]).

Corollary 2. Let $A$ be an elliptic cancelling differential operator of order $m$. Then

$$ \begin{equation*} \|f\|_{\dot{B}_{d/(d-\alpha)}^{m-\alpha,1}} \lesssim \|A[f]\|_{L_1}. \end{equation*} \notag $$

Remark 1. By the classical Besov embedding

$$ \begin{equation*} \operatorname{I}_{\gamma - \alpha}\colon \dot{B}_{d/(d-\alpha),1}^{0,1} \to \dot{B}_{d/(d-\gamma),1}^{0,1}, \qquad \alpha < \gamma \leqslant d, \end{equation*} \notag $$

which follows from (1.4) in our setting, it suffices to consider the case when $\alpha < {d}/{2}$ (equivalently, $p < 2$) in Theorem 2. What is more, Theorem 2 also covers the endpoint $\alpha = d$ in the sense that if there are no distributions $a\otimes \delta_0$ in $\mathcal{W}$, then

$$ \begin{equation*} \sum_{k\in\mathbb{Z}} A^{-dk} \|f*(\psi_k - \psi_{k-1})\|_{L_{\infty}} \lesssim \|f\|_{L_1}, \qquad f \in \mathcal{W}. \end{equation*} \notag $$

Theorem 2 has other applications, for example, it implies the classical Hardy inequality (going back to [19])

$$ \begin{equation*} \int_{\mathbb{R}_+} \frac{|\widehat{f}(\xi)|}{|\xi|}\,d\xi \lesssim \|f\|_{\operatorname{H}_1}; \end{equation*} \notag $$

here $\operatorname{H}_1$ is the analytic Hardy class on the line that consists of summable complex-valued functions with Fourier transforms supported on the positive semiaxis. The corresponding space $\mathcal{W}$ is given by the formula

$$ \begin{equation*} \mathcal{W} = \bigl\{(f_1,f_2) \in \mathcal{S}'(\mathbb{R},\mathbb{R}^2)\mid\operatorname{spec} f \subset [0,\infty),\, f=f_1 + if_2\bigr\}. \end{equation*} \notag $$

Theorem 2 leads to yet another corollary in the spirit of Hardy’s inequality; in the inequality below the symbol $\sigma_r$ denotes the $(d-1)$-Hausdorff measure on the sphere $\{\zeta \in \mathbb{R}^d\mid|\zeta| = r\}$.

Corollary 3. Assume that $d \geqslant 2$. Let $\mathcal{W}$ be a closed linear subspace of $\mathcal{S}'(\mathbb{R}^d,\mathbb{R}^\ell)$ that is invariant under translations and dilations and does not contain the distributions of the form $a\otimes \delta_0$, $a\in \mathbb{R}^\ell \setminus \{0\}$. Let $A > 1$. Then the inequality

$$ \begin{equation} \sum_{k\in\mathbb{Z}} A^{(1-d)k}\sup_{r\in [A^k,A^{k+1})} \int_{|\zeta| = r} |\widehat{f}(\zeta)|\,d\sigma_r(\zeta) \lesssim \|f\|_{L_1} \end{equation} \tag{1.5} $$

holds true for any $f\in\mathcal{W}\cap L_1$.

Proof. It suffices to prove the estimate

$$ \begin{equation*} A^{(1-d)k}\sup_{r\in [A^{k-1},A^{k})} \int_{|\zeta| = r} |\widehat{f}(\zeta)|\,d\sigma_r(\zeta) \lesssim A^{-\alpha k}\|f*(\psi_k - \psi_{k-1})\|_{L_{d/(d-\alpha)}}, \end{equation*} \notag $$

where $\alpha$ is a positive sufficiently small auxiliary parameter; then (1.5) will follow from Theorem 2. By dilation invariance, we may consider the case $k=0$ only. Let us assume that $\widehat{\psi}_0 - \widehat{\psi}_{-1}$ is nonzero in the region where $|\zeta|\in [A^{-1},1)$ (we can choose the function $\psi$ in the definition of the Besov space that satisfies this assumption). Thus, it suffices to show that

$$ \begin{equation*} \|\widehat{g}\|_{L_1(S_{r}(0))} \lesssim \|g\|_{L_{d/(d-\alpha)}}, \quad\text{where}\quad S_r(0) = \bigl\{\zeta \in \mathbb{R}^d\mid |\zeta| = r\bigr\}, \quad r \in [A^{-1},1). \end{equation*} \notag $$

Since $\alpha$ is close to zero, the summability exponent ${d}/{(d-\alpha)}$ is close to one, and the desired inequality is a consequence of the Tomas-Stein theorem.

The corollary is proved.
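
The admissible range of the auxiliary parameter $\alpha$ in the proof can be made explicit. This is a sketch that assumes the Tomas-Stein theorem in the form $\|\widehat{g}\|_{L_2(S_r(0))} \lesssim \|g\|_{L_q}$ for $1 \leqslant q \leqslant 2(d+1)/(d+3)$ and $d \geqslant 2$, with a constant uniform in $r \in [A^{-1},1)$; the exponent $q$ is notation introduced here. By the Cauchy-Schwarz inequality,

$$ \begin{equation*} \|\widehat{g}\|_{L_1(S_r(0))} \leqslant \sigma_r(S_r(0))^{1/2}\,\|\widehat{g}\|_{L_2(S_r(0))}, \end{equation*} \notag $$

and the requirement $q = d/(d-\alpha) \leqslant 2(d+1)/(d+3)$ is equivalent to

$$ \begin{equation*} d(d+3) \leqslant 2(d+1)(d-\alpha) \quad\Longleftrightarrow\quad \alpha \leqslant \frac{d(d-1)}{2(d+1)}. \end{equation*} \notag $$

For example, for $d = 2$ any $\alpha \in (0,1/3]$ is admissible.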

Corollary 3, in particular, leads to the inequality

$$ \begin{equation*} \int_{\mathbb{R}^d}\frac{|\widehat{f}(\xi)|}{|\xi|^{d-1}}\,d\xi \lesssim \|f\|_{\dot{W}_1^1}. \end{equation*} \notag $$

The last inequality was first proved by Bourgain in the unpublished preprint [9]; also see [34]. For the case of the first gradient, Corollary 3 was proved by Kolyada in [26] for $d \geqslant 3$ (the case $d=2$ was open even in the case of the first gradient).

The appearance of the Hardy space in these considerations is not accidental. In fact, Sobolev-type and Hardy spaces are two examples of Fourier constrained spaces we survey in the forthcoming subsection.

1.3. Fourier constrained spaces

Let $l$ and $d$ be natural numbers. We will work with functions that map $\mathbb{R}^d$ to $\mathbb{C}^l$. We equip the latter space with the standard Euclidean norm on $\mathbb{R}^{2l}$:

$$ \begin{equation*} |a|^2 = \sum_{j=1}^l a_j\overline{a}_j, \qquad a\in \mathbb{C}^l. \end{equation*} \notag $$

Let $p \in [1,\infty)$. Let $L_p(\mathbb{R}^d,\mathbb{C}^l)$ be the space of $L_p$-summable functions with values in $\mathbb{C}^l$. We can further introduce the space $\boldsymbol{\mathrm{M}}(\mathbb{R}^d,\mathbb{C}^l)$ that consists of all charges ($\mathbb{C}^l$-valued sigma-additive Borel set functions) of finite variation. Here and in what follows we distinguish measures, which are always scalar and nonnegative, from charges, which can be either $\mathbb{R}$, or $\mathbb{C}$, or $\mathbb{R}^\ell$, or $\mathbb{C}^l$-valued. We define the norm in $\boldsymbol{\mathrm{M}}$ to be the total variation of $\mu$.

Let $k \leqslant l$ be a natural number. Consider a smooth mapping $\Omega\colon S^{d-1}\to G(l,k)$. The notation $S^{d-1}$ and $G(l,k)$ is used for the unit sphere in $\mathbb{R}^d$ and the (complex) Grassmannian, that is, the set of all (complex) linear $k$-dimensional subspaces of $\mathbb{C}^l$. The map $\Omega$ gives rise to a generalization of the Sobolev space

$$ \begin{equation*} W_1^\Omega = \bigl\{f\in L_1(\mathbb{R}^d,\mathbb{C}^l)\mid\forall\, \xi \in \mathbb{R}^d \setminus \{0\} \ \ \widehat{f}(\xi) \in \Omega(\xi/|\xi|)\bigr\} \end{equation*} \notag $$

and the $\operatorname{BV}$-space

$$ \begin{equation*} \operatorname{BV}^\Omega = \bigl\{\mu\in \boldsymbol{\mathrm{M}}(\mathbb{R}^d,\mathbb{C}^l)\mid \forall\, \xi \in \mathbb{R}^d \setminus \{0\} \ \ \widehat{\mu}(\xi) \in \Omega(\xi/|\xi|)\bigr\}. \end{equation*} \notag $$

These spaces inherit their norms from the ambient spaces $L_1$ and $\boldsymbol{\mathrm{M}}$, respectively.

Remark 2. The spaces $W_1^\Omega$ and $\operatorname{BV}^\Omega$ are closed in $L_1(\mathbb{R}^d,\mathbb{C}^l)$ and $\boldsymbol{\mathrm{M}}(\mathbb{R}^d,\mathbb{C}^l)$, respectively. These spaces are also translation and dilation invariant.

Example 4. Let $l=d$ and $k=1$. Consider the mapping

$$ \begin{equation*} \Omega(\zeta) = \mathbb{C}\zeta, \qquad \zeta \in S^{d-1}, \end{equation*} \notag $$

that is, the vector $\zeta$ is mapped to the complex line spanned by $\zeta$. In this case

$$ \begin{equation*} W_1^{\Omega} = \{\nabla f\mid f\in \dot{W}_{1}^1(\mathbb{R}^d)\} \quad\text{and}\quad \operatorname{BV}^\Omega = \{\nabla f\mid f\in \operatorname{BV}(\mathbb{R}^d)\}. \end{equation*} \notag $$

In other words, the classical spaces $\dot{W}_1^1$ and $\operatorname{BV}$ can be obtained by choosing specific $\Omega$.

Example 5. Let $l=d$ and $k=d-1$. Define the mapping $\Omega$ by the formula

$$ \begin{equation*} \Omega(\zeta) = \biggl\{\eta \in \mathbb{C}^d\Bigm|\sum_{j=1}^d \zeta_j\eta_j = 0\biggr\}, \qquad \zeta \in S^{d-1}, \end{equation*} \notag $$

that is, $\Omega(\zeta)$ is the orthogonal complement of the line spanned by $\zeta$. In this case $\operatorname{BV}^\Omega$ is the space of divergence free (solenoidal) charges.

Example 6. Recall Example 3. The associated function $\Omega$ is defined by the formula

$$ \begin{equation} \Omega(\zeta) = \mathrm{Im}\,\mathbb{A}(\zeta), \qquad \zeta \in S^{d-1}. \end{equation} \tag{1.6} $$

Since $A$ is elliptic, the image of $V$ has dimension $\dim V$ for any $\zeta \in S^{d-1}$, and we get indeed a smooth mapping into $G(\dim E,\dim V)$. The corresponding spaces $W_1^{\Omega}$ and $\operatorname{BV}^\Omega$ are usually denoted by $W_1^{A}$ and $\operatorname{BV}^{A}$.

In the case when $A = \nabla$, considered in Example 4, $V=\mathbb{C}$, $E=\mathbb{C}^d$ and

$$ \begin{equation*} \mathbb{A}(\zeta)[\lambda] = \zeta \lambda, \qquad \zeta \in \mathbb{R}^d, \quad \lambda \in V = \mathbb{C}. \end{equation*} \notag $$

The case considered in Example 5 corresponds to the differential operator $A = \mathrm{curl}$; there $V = E =\mathbb{C}^d$. Note that this operator is not elliptic: it is a constant rank operator only.

Example 7. Another important case is the case of a Fourier constrained space which satisfies the antisymmetry condition

$$ \begin{equation} \Omega(\zeta) \cap \Omega(-\zeta) = \{0\} \end{equation} \tag{1.7} $$

for any $\zeta \in S^{d-1}$. By the celebrated Uchiyama theorem from [59], in this case ${W_1^\Omega \subset \mathcal{H}_1(\mathbb{R}^d,\mathbb{R}^\ell)}$, where the latter space is the real Hardy class. The necessity of the antisymmetry condition for the inclusion $W_1^\Omega \subset \mathcal{H}_1(\mathbb{R}^d,\mathbb{R}^\ell)$ had previously been observed by Janson in [21].

One can prove that the Schwartz functions are dense in $W_1^\Omega$ (see the preprint [52]). A good definition of the distributional space $\mathcal{W}$ related to $\Omega$ was introduced by Ayoush and Wojciechowski in [5]. For any $L \in G(l,k)$, we denote by $\pi_L$ the orthogonal projection of $\mathbb{C}^l$ onto $L$.

Definition 1. Define the space $\boldsymbol{\mathrm{W}}$ by

$$ \begin{equation*} \boldsymbol{\mathrm{W}} = \bigl\{f\in \mathcal{S}'(\mathbb{R}^d,\mathbb{C}^l)\mid \pi_{\Omega(\xi/|\xi|)^{\perp}}[\widehat{f}]\cdot H = 0\bigr\}. \end{equation*} \notag $$

Here $H$ is an auxiliary scalar Schwartz function that has a deep zero at the origin (that is, for any $N\in \mathbb{N}$ the relation $H(x) = O(|x|^{N})$ holds true as $x\to 0$) and is positive away from it. The function $H$ is used to make the formula correct: it kills the nonsmoothness of the projection at the origin. Note that the space $\boldsymbol{\mathrm{W}}$ contains all polynomials.

Remark 3. The space $\boldsymbol{\mathrm{W}}$ is dilation and translation invariant. It is closed as a subspace of $\mathcal{S}'(\mathbb{R}^d,\mathbb{C}^l)$.

Definition 2. We say that $\Omega$ satisfies the cancellation condition if

$$ \begin{equation} \bigcap_{\zeta \in S^{d-1}}\Omega(\zeta) = \{0\}. \end{equation} \tag{1.8} $$

This condition was introduced by Roginskaya and Wojciechowski in [38] and Van Schaftingen in [42] independently.
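
As an illustration, the cancellation condition can be checked directly for the mappings from Examples 4 and 5 when $d \geqslant 2$. Here $e_1,\dots,e_d$ denotes the standard basis of $\mathbb{R}^d$, a notation introduced for this check only. For $\Omega(\zeta) = \mathbb{C}\zeta$ the intersection in (1.8) is contained in $\Omega(e_1)\cap\Omega(e_2)$, and for $\Omega(\zeta) = \{\eta \mid \sum_j \zeta_j\eta_j = 0\}$ it is contained in $\bigcap_j \Omega(e_j)$:

$$ \begin{equation*} \mathbb{C}e_1\cap\mathbb{C}e_2 = \{0\}, \qquad \bigcap_{j=1}^d \bigl\{\eta\in\mathbb{C}^d\mid \eta_j = 0\bigr\} = \{0\}. \end{equation*} \notag $$

In contrast, for $d = 1$ and $\Omega(\zeta) = \mathbb{C}\zeta$ one has $\Omega(1) = \Omega(-1) = \mathbb{C}$, so (1.8) fails.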

Remark 4. The cancellation condition (1.8) is equivalent to the absence of the charges $a\otimes \delta_0$, $a\in \mathbb{C}^l \setminus \{0\}$, in the space $\boldsymbol{\mathrm{W}}$. Indeed, if $a\otimes \delta_0 \in\boldsymbol{\mathrm{W}}$, then it also belongs to $\operatorname{BV}^\Omega$. In such a case, $a \in \Omega(\xi)$ for any $\xi \in S^{d-1}$, and the intersection in (1.8) contains $a$. Conversely, if the intersection on the left-hand side of (1.8) contains some nonzero vector $a \in \mathbb{C}^l$, then $a\otimes \delta_0 \in \boldsymbol{\mathrm{W}}$.

Thus, Theorem 2 implies the theorem below in this case.

Theorem 3. Let $\Omega$ satisfy (1.8). Then,

$$ \begin{equation*} \operatorname{I}_{\alpha}\colon W_1^\Omega \to \dot{B}_{d/(d-\alpha),1}^{0,1} \end{equation*} \notag $$

when $\alpha \in (0,d)$.

1.4. Historical remarks

The Hardy-Littlewood-Sobolev inequality was invented by Sobolev in [45] as a tool to prove what is now called the Sobolev embedding theorem; unfortunately, his method did not work for the limit summability exponent $1$. See Ch. 5 in [50] for the classical Hardy-Littlewood-Sobolev inequality and its applications. The simplest and most natural case mentioned in Example 2 is equivalent via the Calderón-Zygmund theory to the limiting Sobolev embedding for the space $\dot{W}_1^1(\mathbb{R}^d)$. If $\alpha = 1$ and we embed into the Lebesgue space $L_{d/(d-1)}$, we get the classical Gagliardo-Nirenberg inequality obtained by Gagliardo [16] and Nirenberg [32]; see the book [31] for more information about this classical case. The embedding into the best possible Lorentz space $L_{d/(d-1),1}$ was first proved by Alvino in [2] and then rediscovered by Poornima [35] and Tartar (see [58] for more historical remarks on this question). For higher-order smoothness and Besov spaces, the complete result was obtained by Kolyada in [25]; see the papers [7] and [46] for earlier results and [48] for a different approach. We note that most of these results are formulated in the more general anisotropic setting, that is, when derivatives with respect to different coordinates can have different orders. We refer the reader to the book [8] for details on anisotropic theory.

The inequality

$$ \begin{equation} \|\operatorname{I}_1[f]\|_{L_{d/(d-1)}} \lesssim \|f\|_{L_1}, \qquad \operatorname{div} f=0, \end{equation} \tag{1.9} $$

was obtained by Bourgain and Brezis in [11]. In a sense, this inequality served as a turning point for the whole theory. The proof in [11] relied upon Smirnov’s theorem from [44], which decomposes an arbitrary solenoidal charge into an integral of currents tangent to smooth curves, and a particular case of (1.9) for such special charges that had already been proved in [13]. Another proof of (1.9) was proposed by Van Schaftingen in [39]; see the papers [10] and [40] for related results. The inequality

$$ \begin{equation*} \|\operatorname{I}_1[f]\|_{L_{d/(d-1),1}} \lesssim \|f\|_{L_1}, \qquad \operatorname{div} f=0, \end{equation*} \notag $$

had been a long-standing conjecture (the question goes back to [12]) until Hernandez and Spector resolved it in [20] (in fact, the preprint [52] appeared almost simultaneously with [20]). Their proof also relies upon Smirnov’s theorem. To the author’s knowledge, the inequality

$$ \begin{equation*} \|\operatorname{I}_1[f]\|_{\dot{B}^{0,1}_{d/(d-1),1}} \lesssim \|f\|_{L_1}, \qquad \operatorname{div} f=0, \end{equation*} \notag $$

which is a complete form of Theorem 2 in this case, is not known in the literature.

The case of general differential operators as in Example 6, $\alpha = 1$, and the Lebesgue space instead of a Besov-Lorentz one was considered by Van Schaftingen in [42]. See earlier papers [55] (symmetric gradient), [27] (Hodge differentials), [29] (sharp constants in some of these inequalities), [12] (higher order derivatives and related approximation problems) for particular cases and the surveys [43] and [47] for more historical details. The differential operators satisfying (1.8) (see (1.6)) are called cancelling; the ellipticity condition can be replaced by a weaker constant rank condition: see [37]. For results on Hardy’s inequalities as in Corollary 1, see [30] and [14]. For results on Lorentz spaces $L_{d/(d-1),1}$ in the case of first-order operators $A$, see [49]. For generalizations to metric spaces other than $\mathbb{R}^d$, see [15].

There is also a related problem about sharp estimates for the singularities of charges in $\operatorname{BV}^\Omega$. The question is: what is the best possible bound from below for the lower Hausdorff dimension of $\mu \in \operatorname{BV}^\Omega$? It was raised in [38]. We note that if $\Omega$ is purely antisymmetric ((1.7) holds true for any $\zeta \in S^{d-1}$ as in Example 7), then both $\operatorname{BV}^\Omega$ and $W_1^\Omega$ are contained in the real Hardy class $\mathcal{H}_1(\mathbb{R}^d,\mathbb{C}^l)$ by the celebrated Uchiyama theorem [59]. Therefore, any measure $\mu \in \operatorname{BV}^\Omega$ is absolutely continuous with respect to the Lebesgue measure. The question for general $\Omega$ seems to be open. Partial results were obtained in [3], [5], [38] and [54]. The author supposes that the methods in the present text can also help in this related problem; see the preprint [51].

For a related problem on trace inequalities, see [17] (the classical case of the first gradient can be found in [31]). Theorem 2 implies (via trace inequalities for Riesz potentials; see [1]) the following ‘trace’ theorem.

Corollary 4. Let $\mathcal{W}$ be a dilation and translation invariant closed linear subspace of $\mathcal{S}'(\mathbb{R}^d,\mathbb{R}^\ell)$ that does not contain the charges $a\otimes \delta_0$, $a\in \mathbb{R}^\ell\setminus\{0\}$. Let $\alpha \in (0,d)$. Then the embedding $\operatorname{I}_{\alpha}\colon \mathcal{W}\cap L_1 \to L_q(\mu)$ is continuous, provided $q > 1$ and the measure $\mu$ satisfies the Frostman-type condition

$$ \begin{equation*} \mu(B_r(x))^{1/q} \lesssim r^{d-\alpha} \end{equation*} \notag $$

for any radius $r>0$ and centre $x\in \mathbb{R}^d$ of the Euclidean ball $B_r(x)$ (with uniform constants).
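
For instance, let $\mu$ be the $(d-1)$-dimensional Hausdorff measure on a hyperplane in $\mathbb{R}^d$, $d \geqslant 2$. This choice of $\mu$ is an illustration introduced here and is not discussed in the text; $\omega_{d-1}$ denotes the volume of the unit ball in $\mathbb{R}^{d-1}$. Then $\mu(B_r(x)) \leqslant \omega_{d-1} r^{d-1}$ with equality for $x$ on the hyperplane, and the Frostman-type condition reads

$$ \begin{equation*} r^{(d-1)/q} \lesssim r^{d-\alpha}, \qquad r > 0. \end{equation*} \notag $$

This holds uniformly in $r$ if and only if $q = (d-1)/(d-\alpha)$, and the requirement $q > 1$ then amounts to $\alpha > 1$.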

We have already said that the classical embedding theorems (that is, the ones for classical Sobolev spaces) allow anisotropic generalizations (see [25] for the strongest possible result in the classical setting of pure derivatives). Some partial results in the anisotropic setting and spaces of functions in the style of $W_1^\Omega$ were obtained in [23], [24] and [53]; see [24] for applications to questions in Banach space theory (the idea that inequalities of the type we discuss can deliver interesting information about the isomorphic type of Banach spaces goes back to [22] and [28]).

The techniques we use are different from those usually used in the field; however, there is some similarity to [10] and [12]. We rely upon Harmonic Analysis tools such as the time-frequency decomposition and Harnack’s inequality. This allows us to get rid of the differential structure and work in a more general setting of Fourier restrictions. We finish the introduction with the description of our methods.

1.5. Plan of the proof

The paper [4] suggested a discrete model for the problems mentioned above: the spaces $\operatorname{BV}^\Omega$ have relatives in the world of discrete time martingales over regular filtrations. In the discrete model problems are simpler, and the paper [4] contains solutions to them. The approach is based upon four ingredients: the monotonicity formula (in the world of discrete martingales it reduces to a simple form of convexity in $L_p$), splitting into convex and flat atoms, an improvement of the monotonicity formula in the case when the corresponding cancellation condition holds true, and a combinatorial argument.

Our plan is to transfer the approach of [4] to the Euclidean setting, finding appropriate translations for the notions of a martingale, an atom, the monotonicity formula, and other objects in that paper. The translation appears to be not quite literal, so there will be several new entities (the main of these are horizontal graphs in § 6 below: there was no horizontal interaction in the discrete world; this also forces us to introduce another classification of atoms: there will be saturated and nonsaturated atoms).

Section 2 contains our interpretation of the words ‘martingale’ and ‘monotonicity formula’. The first notion is replaced with the heat extension; more specifically, we consider a discrete-time martingale

$$ \begin{equation*} \{\operatorname{H}[f](\,\cdot\,,A^{-2k})\}_k, \end{equation*} \notag $$

where $\operatorname{H}[f] = \operatorname{H}[f](x,t)$ is the heat extension of $f$ and $A$ is an extremely large number specified in § 7. As for the monotonicity formula, we will be using a very particular case of a much more general monotonicity formula obtained by Bennett, Carbery and Tao in [6] (we will have to generalize the said simple case to fit a weighted setting). The monotonicity formula is stated in Proposition 1. We will mostly avoid the probabilistic terminology in the proof; however, the probabilistic point of view seems to be quite intuitive here.
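
The martingale analogy can be made explicit as follows. This is a sketch that assumes only that the heat extension is given by $\operatorname{H}[f](\,\cdot\,,t) = f*G_t$, where the kernels $G_t$ are nonnegative, have unit integral and satisfy the semigroup law $G_t*G_s = G_{t+s}$; the symbol $G_t$ is notation introduced here. Writing $f_k = \operatorname{H}[f](\,\cdot\,,A^{-2k})$, we have

$$ \begin{equation*} f_k = f_{k+1}*G_{A^{-2k} - A^{-2(k+1)}}, \qquad k\in\mathbb{Z}. \end{equation*} \notag $$

Thus the passage from $f_{k+1}$ to $f_k$ is an averaging operation that plays the role of a conditional expectation; in particular, $\|f_k\|_{L_1} \leqslant \|f_{k+1}\|_{L_1}$.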

Section 3 provides an improvement of the Bennett-Carbery-Tao monotonicity formula for rank-one measures in $\mathcal{W}$ when the latter space does not contain delta measures. The idea is that the inequality expressing the monotonicity formula turns to equality only when $f$ is a delta measure. Theorem 4 says that the measures $\mu$ such that $a\otimes \mu \in \mathcal{W}$ are somehow separated from delta-measures, and thus one can improve the monotonicity formula in Proposition 1 for these measures $\mu$. The separation is expressed through the notion of an invariant cone of measures from [36] (we adjust this notion to fit our Schwartz-class approach). Though we do not use the notions of tangent cone or tangent measure, the material in § 3 is reminiscent of some parts of the paper [36].

Section 4 contains our interpretation of the notion ‘atom’. We use a version of a time-frequency decomposition. However, we do not decompose the function over the space, but rather consider its norms in weighted $L_1$-spaces, where the weight is localized in a neighbourhood of an atom. The Uncertainty Principle says that the function $f_k = \operatorname{H}[f](\,\cdot\,, A^{-2k})$ behaves like a function on the lattice $A^{-k}\mathbb{Z}^d$. We exploit this principle by considering the values $\|f_k\|_{L_1(w_{k,j})}$, where the weight $w_{k,j}$ is concentrated in a neighbourhood of the point $A^{-k}j$, $j \in\mathbb{Z}^d$; the quantity $\|f_k\|_{L_1(w_{k,j})}$ is then treated as the value of the martingale $f$ on the atom $(k,j)$. We define convex and flat atoms similarly to [4] and prove several useful lemmas about our weights. This allows us to estimate the sum over convex atoms in the manner similar to [4]; this estimate is proved in Proposition 3.

Section 5 proposes a compactness argument that allows us to perturb (that is, add some flexibility to) the improvement of the Bennett-Carbery-Tao monotonicity formula obtained in § 3, that is, to prove a similar monotonicity formula for functions $f \in \mathcal{W}$ that are somehow close to rank-one measures; the precise formulation is in Theorem 5. In [4] it was shown that the growth of the $L_p$-norm of a martingale on a flat atom is smaller than the growth of the same quantity for the martingale generated by a delta measure. In the Euclidean case the situation is more complicated since we do not have perfect localization in the space variable (due to the Uncertainty Principle). So we introduce an additional concentration assumption. We prove that if this assumption holds for some atom, and the atom is flat, then the function $f$ is close to being a positive rank-one measure, and its $L_p$-norm in a certain weighted space grows slower than that of a delta measure.

In our reasoning, graphs indicate the subordination of atoms: if there is an arrow from $\mathfrak{A}$ to $\mathfrak{B}$, then some quantity of $\mathfrak{B}$ can be estimated by a similar quantity of $\mathfrak{A}$ uniformly. Section 6 introduces graphs that mark the horizontal subordination of atoms. We study simple combinatorial properties of these graphs and introduce the second classification of atoms (the first being flat/convex): atoms can be saturated or nonsaturated. For saturated atoms we prove good bounds for the growth of the $L_p$-norm (this follows from Theorem 5 since saturated atoms fulfill the concentration condition in that theorem). Moreover, if a nonsaturated atom is subordinate to a saturated one, then there is a certain control of the growth of the $L_p$-norm as the weight drifts from the former atom to the latter one; the precise formulation is in Theorem 6.

The final section, § 7, concludes the proof. We introduce the graph $\Gamma$ similar to the graph in [4]. Here its structure is more complicated, and it is not uniform; in fact, there can be vertices with infinitely many children. However, $\Gamma$ is still a forest, that is, a union of trees. The improved monotonicity formula allows us to run induction over an individual tree (Proposition 4). Then a combinatorial argument similar to that in [4] reduces an estimate for the sum over flat atoms to the sum over convex atoms, which we have already estimated.

§ 2. Gaussians and monotonicity formulae

Consider the heat extension of a function $f\in L_1(\mathbb{R}^d,\mathbb{R}^\ell)$, that is,

$$ \begin{equation*} \operatorname{H}[f](x,t) = (4\pi t)^{-d/2} \int_{\mathbb{R}^d}f(y)\exp\biggl(-\frac{|x-y|^2}{4t}\biggr)\,dy, \qquad x\in \mathbb{R}^d, \quad t > 0. \end{equation*} \notag $$

One can generalize this definition to the case of $f\in \mathcal{S}'(\mathbb{R}^d,\mathbb{R}^\ell)$ in the usual way. This extension satisfies the heat equation

$$ \begin{equation*} (\operatorname{H}[f])_t = \Delta_x \operatorname{H}[f] \end{equation*} \notag $$

and the semigroup property

$$ \begin{equation} \operatorname{H}\bigl[\operatorname{H}[f](\,\cdot\,,s)\bigr](x,t) = \operatorname{H}[f](x,t+s), \qquad t,s > 0, \quad x\in\mathbb{R}^d. \end{equation} \tag{2.1} $$

The operator $f\mapsto \operatorname{H}[f](\,\cdot\,,t)$ is a Fourier multiplier with symbol $\exp(-4\pi^2 t|\xi|^2)$, that is,

$$ \begin{equation*} \mathcal{F}\bigl[\operatorname{H}[f](\,\cdot\,,t)\bigr](\xi) = \exp(-4\pi^2 t|\xi|^2)\widehat{f}(\xi). \end{equation*} \notag $$

Therefore, $\operatorname{H}[f](\,\cdot\,,t) \in \mathcal{W}$ for any $t > 0$, provided $f\in \mathcal{W}$.
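
As a quick consistency check (a sketch, not part of the original argument), the semigroup property (2.1) can be read off from the multiplier description: applying the symbol twice gives

$$ \begin{equation*} \exp(-4\pi^2 t|\xi|^2)\exp(-4\pi^2 s|\xi|^2)\widehat{f}(\xi) = \exp(-4\pi^2 (t+s)|\xi|^2)\widehat{f}(\xi), \qquad t,s > 0, \end{equation*} \notag $$

which is the Fourier transform of $\operatorname{H}[f](\,\cdot\,,t+s)$.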

Let $A$ be a large natural parameter to be chosen later. It will be convenient for further considerations to assume that $A$ is odd. We will use the notation

$$ \begin{equation} p = \frac{d}{d-\alpha} \quad\text{and}\quad \alpha = d\,\frac{p-1}{p}, \end{equation} \tag{2.2} $$

which links the parameters in Theorem 2. By $p'$ we mean the conjugate exponent $p' = p/(p-1)$.
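
For orientation, here is (2.2) evaluated for one illustrative choice of parameters (the values $d=3$, $\alpha=1$ are an assumption made only for this example):

$$ \begin{equation*} p = \frac{3}{3-1} = \frac32, \qquad \alpha = 3\cdot\frac{3/2-1}{3/2} = 1, \qquad p' = \frac{3/2}{3/2-1} = 3. \end{equation*} \notag $$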

Remark 5. Theorem 2 follows from the inequality

$$ \begin{equation} \sum_{k\in\mathbb{Z}} A^{-\alpha k}\|\operatorname{H}[f](\,\cdot\,, A^{-2k})\|_{L_{p,1}} \lesssim \|f\|_{L_1}, \qquad f\in \mathcal{W}, \end{equation} \tag{2.3} $$

where the parameter $A$ is our choice (any $A>1$ suffices) and the parameters $\alpha$ and $p$ satisfy (2.2).

Indeed, the inequality

$$ \begin{equation*} \|\operatorname{I}_{\alpha}[f]*(\psi_k - \psi_{k-1})\|_{L_{p,1}} \lesssim A^{-\alpha k} \|\operatorname{H}[f](\,\cdot\,, A^{-2k})\|_{L_{p,1}} \end{equation*} \notag $$

follows from the fact that the Fourier transform of the function

$$ \begin{equation*} \frac{\widehat{\psi}(A^{-k}\xi) - \widehat{\psi}(A^{-k+1}\xi)}{A^{-\alpha k}|\xi|^\alpha \exp(-4\pi^2 A^{-2k}|\xi|^2)} \end{equation*} \notag $$

has a uniformly bounded (with respect to $k$) $L_1$-norm; by (1.4) summation of these inequalities for all $k\in\mathbb{Z}$ yields

$$ \begin{equation*} \|\operatorname{I}_{\alpha}[f]\|_{\dot{B}_{p,1}^{0,1}} \lesssim \sum_{k\in\mathbb{Z}} A^{-\alpha k}\|\operatorname{H}[f](\,\cdot\,, A^{-2k})\|_{L_{p,1}}. \end{equation*} \notag $$

Remark 6. Using dilation invariance of the problem, one can reduce (2.3) to

$$ \begin{equation*} \sum_{k\geqslant 0} A^{-\alpha k}\|\operatorname{H}[f](\,\cdot\,, A^{-2k})\|_{L_{p,1}} \lesssim \|f\|_{L_1}, \qquad f\in\mathcal{W}, \end{equation*} \notag $$

or, with the notation $f_k(x) = \operatorname{H}[f](x, A^{-2k})$,

$$ \begin{equation} \sum_{k\geqslant 0} A^{-\alpha k}\|f_k\|_{L_{p,1}} \lesssim \|f\|_{L_1}, \qquad f\in\mathcal{W}. \end{equation} \tag{2.4} $$

Definition 3. By a weight we mean a locally summable almost everywhere nonnegative function $w$ that defines a tempered distribution (that is, there exists $M \in \mathbb{N}$ such that $\displaystyle\int_{B_R(0)} w(x)\,dx \lesssim R^M$ for all $R > 1$).

We define the $L_p(w)$-norm by

$$ \begin{equation*} \|f\|_{L_p(w)} = \biggl(\int_{\mathbb{R}^d}|f(x)|^pw(x)\,dx\biggr)^{1/p}, \end{equation*} \notag $$

where the function $f$ can be vector-valued.

The next lemma and corollary are standard; we provide the proofs since we will use formulae from them.

Lemma 1. Let $w$ be a weight, let $g\in L_{1,\mathrm{loc}}\cap \mathcal{S}'(\mathbb{R}^d,\mathbb{R}^\ell)$ be a function, and let $p \geqslant 1$. Then for any $t > 0$,

$$ \begin{equation} \|\operatorname{H}[g](\,\cdot\,,t)\|_{L_p(w)} \leqslant \|g\|_{L_p(\operatorname{H}[w](\,\cdot\,,t))}, \end{equation} \tag{2.5} $$

provided the right-hand side is finite.

Proof. We raise the inequality to the power $p$ and use Jensen’s inequality and Fubini’s theorem:

$$ \begin{equation} \int_{\mathbb{R}^d} |\operatorname{H}[g](x,t)|^p w(x)\,dx \leqslant \int_{\mathbb{R}^d}\operatorname{H}[|g|^p](x,t)w(x)\,dx = \int_{\mathbb{R}^d}|g|^p(x)\operatorname{H}[w](x,t)\,dx. \end{equation} \tag{2.6} $$

The lemma is proved.

Corollary 5. For any $m \geqslant k$,

$$ \begin{equation*} \|f_k\|_{L_p}\leqslant \|f_m\|_{L_p}, \end{equation*} \notag $$

provided the right-hand side is finite.

Proof. By the semigroup property (2.1) of the heat extension,

$$ \begin{equation} f_k(x) = \operatorname{H}[f_m](x,A^{-2k} - A^{-2m}), \end{equation} \tag{2.7} $$

so that the desired inequality follows indeed from Lemma 1 with $w = 1$, since $\operatorname{H}[1](x,t) = 1$ for any $x$ and $t$.
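
Spelled out (a sketch, assuming $A > 1$ and $m > k$, so that $s = A^{-2k} - A^{-2m} > 0$; the letter $s$ is introduced here only for illustration), the chain of relations reads

$$ \begin{equation*} \|f_k\|_{L_p} \stackrel{(2.7)}{=} \|\operatorname{H}[f_m](\,\cdot\,,s)\|_{L_p(1)} \stackrel{(2.5)}{\leqslant} \|f_m\|_{L_p(\operatorname{H}[1](\,\cdot\,,s))} = \|f_m\|_{L_p}. \end{equation*} \notag $$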

The corollary is proved.

Lemma 2. Let $p=1$. If (2.5) turns to equality for some $t > 0$, then $g = a\otimes h$, where $a\in \mathbb{R}^\ell$ and $h$ is a nonnegative scalar-valued function.

Proof. It follows from formula (2.6) that

$$ \begin{equation} \|g\|_{L_1(\operatorname{H}[w](\,\cdot\,,t))} - \|\operatorname{H}[g](\,\cdot\,,t)\|_{L_1(w)} = \int_{\mathbb{R}^d}\bigl(\operatorname{H}[|g|](x,t) - |\operatorname{H}[g](x,t)|\bigr)w(x)\,dx. \end{equation} \tag{2.8} $$

Therefore, if (2.5) turns to equality, then $\operatorname{H}[|g|](x,t) = |\operatorname{H}[g](x,t)|$ for all $x\in\mathbb{R}^d$. Plugging in $x=0$ we get

$$ \begin{equation*} \int |g|\,d\mu = \biggl|\int g\,d\mu\biggr|, \end{equation*} \notag $$

where $\mu$ is a measure that is absolutely continuous with respect to the Lebesgue measure and has Gaussian density. Let $\displaystyle a = \int g\,d\mu$. Recall that $\pi_a$ denotes the orthogonal projection onto the line spanned by the vector $a$. We write the chain of inequalities

$$ \begin{equation*} \int |g|\,d\mu \geqslant \int|\pi_a[g]|\,d\mu \geqslant \biggl|\int \pi_a[g]\,d\mu\biggr| = |\pi_a[a]| = |a|. \end{equation*} \notag $$

Both inequalities in this chain turn to equalities. Since we have $|\pi_a[g]| = |g|$ almost everywhere, $g(x)$ is proportional to $a$ for almost all $x$. This is equivalent to $g = a\otimes h$, where $h$ is a scalar function. Then, since the second inequality in the chain turns to equality, $h \geqslant 0$ almost everywhere.

The lemma is proved.

Proposition 1. Let $\mu$ be a nonnegative scalar measure of tempered growth and let $w$ be a continuous weight. Then

$$ \begin{equation} \bigl\|\operatorname{H}[\mu](\,\cdot\,,t)\bigr\|_{L_p(\operatorname{H}[w](\,\cdot\,,(1-t)/p))} \leqslant t^{-(d/2)(p-1)/p} \|\operatorname{H}[\mu](\,\cdot\,,1)\|_{L_p(w)}, \qquad t \in (0,1], \end{equation} \tag{2.9} $$

provided the right-hand side is finite.

The unweighted case of this proposition was proved in [6]: it is a particular case of Proposition 3.1 in that paper (see also Proposition 5 in the blog post [57]). Proposition 1 can be thought of as an interpolation between the two extreme cases $p=1$ and $p=\infty$, although it is not quite clear how to interpolate in this context. Throughout the rest of this section we use the notation

$$ \begin{equation} \begin{gathered} \, u(x,t) = \operatorname{H}[\mu](x,t), \qquad v(x,t) = \operatorname{H}[w]\biggl(x,\frac{1-t}{p}\biggr), \\ Q_p[\mu,w](t) = t^{d(p-1)/2}\int_{\mathbb{R}^d} u^p(x,t)v(x,t)\,dx, \qquad x\in \mathbb{R}^d, \quad t \in (0,1]. \end{gathered} \end{equation} \tag{2.10} $$

We will often suppress the first two arguments of $Q_p$ if this does not lead to ambiguity. Note that $u$ solves the classical heat equation, while $v$ solves the rescaled backward heat equation

$$ \begin{equation} v_t = -\frac1p\Delta_x v. \end{equation} \tag{2.11} $$
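
To verify (2.11) (a short check, writing $s = (1-t)/p$ for the time argument of $\operatorname{H}[w]$; the letter $s$ is used here only for illustration), apply the chain rule and the heat equation for $\operatorname{H}[w]$:

$$ \begin{equation*} v_t(x,t) = \frac{ds}{dt}\,(\operatorname{H}[w])_s(x,s) = -\frac1p\Delta_x \operatorname{H}[w](x,s) = -\frac1p\Delta_x v(x,t). \end{equation*} \notag $$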

Proposition 1 will follow from the inequality

$$ \begin{equation*} \frac{\partial Q_p(t)}{\partial t} \geqslant 0, \qquad t\in(0,1). \end{equation*} \notag $$

This inequality is a consequence of the wonderful identity

$$ \begin{equation} \frac{\partial Q_p(t)}{\partial t} = \frac{p-1}{4} (4\pi)^{-dp/2} t^{-d/2-2} \int_{\mathbb{R}^d}\mathbb{D}(x- Y_x)(\mu_{x,t}(\mathbb{R}^d))^pv(x,t)\,dx. \end{equation} \tag{2.12} $$

This formula needs clarifications. The measures $\mu_{x,t}$ are given by

$$ \begin{equation*} d\mu_{x,t}(y) = \exp\biggl(-\frac{|x-y|^2}{4t}\biggr)\,d\mu(y). \end{equation*} \notag $$

One may treat $\mathbb{R}^d$ as a probability space equipped with the probability measure $\frac{\mu_{x,t}}{\mu_{x,t}(\mathbb{R}^d)}$. Then the symbol $Y_x$ denotes the vectorial random variable $y$ on this probability space. Note that since $x$ is a constant random variable, we can replace $\mathbb{D}(x-Y_x)$ with $\mathbb{D}\, Y_x$ in (2.12). We use the probabilistic language to formulate the result in a similar manner to the original paper [6]. In fact, we will not use the probabilistic terminology anymore.

Proof of formula (2.12). Without loss of generality we can assume that $\mu$ is compactly supported, since the general case can be reduced to this one by a standard limiting argument. Then the functions $u$ and $v$ are rapidly decaying at infinity, which allows integration by parts in the space variables without boundary terms.

We start with direct differentiation of $Q_p$, using the definition of this function (we suppress the arguments of functions):

$$ \begin{equation} \frac{\partial Q_p(t)}{\partial t} = t^{d(p-1)/2}\int_{\mathbb{R}^d}\biggl(\frac{d(p-1)}{2t}u^pv + pu^{p-1} u_tv + u^pv_t\biggr)\,dx. \end{equation} \tag{2.13} $$

We leave this formula for a while and rewrite the right-hand side of (2.12) in more classical terms. We start with

$$ \begin{equation*} \mathbb{D}(x- Y_x) = \mathbb{E}|x- Y_x|^2 - |\mathbb{E}(x-Y_x)|^2 \end{equation*} \notag $$

and compute these summands separately:

$$ \begin{equation*} \mathbb{E}|x-Y_x|^2 = \frac{\displaystyle\int_{\mathbb{R}^d}|y-x|^2 \exp\biggl(-\frac{|x-y|^2}{4t}\biggr)\,d\mu(y)} {\displaystyle\int_{\mathbb{R}^d} \exp\biggl(-\frac{|x-y|^2}{4t}\biggr)\,d\mu(y)} = \frac{4t^2 \widetilde{u}_t}{\widetilde{u}} \end{equation*} \notag $$

and

$$ \begin{equation*} |\mathbb{E}(x-Y_x)|^2 = \left|\frac{\displaystyle\int_{\mathbb{R}^d}(y-x) \exp\biggl(-\frac{|x-y|^2}{4t}\biggr)\,d\mu(y)} {\displaystyle\int_{\mathbb{R}^d}\exp\biggl(-\frac{|x-y|^2}{4t}\biggr)\,d\mu(y)}\right|^2 = \biggl|\frac{-2t\nabla_x\widetilde{u}}{\widetilde{u}}\biggr|^2 = 4t^2\frac{|\nabla_x\widetilde{u}|^2}{\widetilde{u}^2}, \end{equation*} \notag $$

where $\widetilde{u}(x,t) = (4\pi t)^{d/2}u(x,t)$ or, equivalently, $\widetilde{u}(x,t) = \mu_{x,t}(\mathbb{R}^d)$. We plug these formulae into the right-hand side of (2.12) and obtain

$$ \begin{equation*} (p-1)(4\pi)^{-dp/2}t^{-d/2}\int_{\mathbb{R}^d}\bigl(\widetilde u_t \widetilde u^{p-1} - |\nabla_x\widetilde u|^2 \widetilde u^{p-2}\bigr)v\,dx. \end{equation*} \notag $$

We express this quantity in terms of $u$ with the help of the formula $\widetilde{u}_t = (4\pi t)^{d/2}(u_t + \frac{d}{2t}u)$: the right-hand side of (2.12) is equal to

$$ \begin{equation*} (p-1)t^{d(p-1)/2}\int_{\mathbb{R}^d}\biggl(u_tu^{p-1} + \frac{d}{2t}u^p - |\nabla_x u|^2u^{p-2}\biggr) v\,dx. \end{equation*} \notag $$
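
The conversion from $\widetilde u$ to $u$ can be checked term by term (a routine computation, included here only as a sanity check). By the product rule,

$$ \begin{equation*} \widetilde{u}_t = \frac{d}{2t}(4\pi t)^{d/2}u + (4\pi t)^{d/2}u_t = (4\pi t)^{d/2}\biggl(u_t + \frac{d}{2t}u\biggr), \qquad \nabla_x\widetilde{u} = (4\pi t)^{d/2}\nabla_x u, \end{equation*} \notag $$

so each of the products $\widetilde u_t\widetilde u^{p-1}$ and $|\nabla_x\widetilde u|^2\widetilde u^{p-2}$ carries the factor $(4\pi t)^{dp/2}$, and the prefactor becomes

$$ \begin{equation*} (4\pi)^{-dp/2}t^{-d/2}(4\pi t)^{dp/2} = t^{d(p-1)/2}. \end{equation*} \notag $$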

Recalling (2.13), we are left with proving the identity

$$ \begin{equation*} \begin{aligned} \, &\int_{\mathbb{R}^d} \biggl(\frac{d(p-1)}{2t} u^pv + pu^{p-1}u_tv + u^pv_t\biggr)\,dx \\ &\qquad= (p-1)\int_{\mathbb{R}^d}\biggl(\frac{d}{2t}u^pv + u^{p-1}u_tv - |\nabla_x u|^2u^{p-2}v\biggr)\,dx, \end{aligned} \end{equation*} \notag $$

which is equivalent to

$$ \begin{equation} \int_{\mathbb{R}^d}(u^{p-1}u_t v + u^p v_t)\,dx = - (p-1) \int_{\mathbb{R}^d}|\nabla_x u|^2 u^{p-2}v\,dx. \end{equation} \tag{2.14} $$

We use the fact that $u$ solves the heat equation and $v$ solves (2.11) to rewrite the left-hand side of (2.14):

$$ \begin{equation} \int_{\mathbb{R}^d}(u^{p-1}u_t v + u^p v_t)\,dx = \int_{\mathbb{R}^d}u^{p-1}v\Delta_x u-\frac{1}{p}\int_{\mathbb{R}^d}u^p\Delta_x v. \end{equation} \tag{2.15} $$

We use integration by parts several times to rewrite the right-hand side of (2.14) (the angle brackets below denote the scalar product in $\mathbb{R}^d$):

$$ \begin{equation*} \begin{aligned} \, &{-}(p-1)\int_{\mathbb{R}^d}|\nabla_x u|^2u^{p-2}v = -\int_{\mathbb{R}^d}\langle(p-1)u^{p-2}\nabla_x u,v\nabla_x u\rangle \\ &\qquad= \int_{\mathbb{R}^d}u^{p-1}\operatorname{div}[v\nabla_x u] =\int_{\mathbb{R}^d}u^{p-1}v\Delta_xu + \int_{\mathbb{R}^d}u^{p-1}\langle\nabla_xu,\nabla_x v\rangle \\ &\qquad = \int_{\mathbb{R}^d}u^{p-1}v\Delta_xu -\frac1p\int_{\mathbb{R}^d}u^{p}\Delta_x v. \end{aligned} \end{equation*} \notag $$

This coincides with the right-hand side of (2.15). So, (2.14) is established together with (2.12).

Remark 7. We have proved Proposition 1. It also follows from (2.12) that inequality (2.9) turns to equality if and only if $\mu = \delta_x$ for some $x\in \mathbb{R}^d$. Indeed, $\mathbb{D}(x-Y_x) = \mathbb{D}(Y_x)= 0$ if and only if $y$ is constant $\mu_{x,t}$-almost everywhere, which means that $\mu_{x,t}$ is a delta measure.
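
For the first assertion of Remark 7, the passage from (2.12) to (2.9) can be spelled out as follows (a sketch, assuming $p \geqslant 1$ and using $v(x,1) = w(x)$, which holds since $w$ is continuous). The right-hand side of (2.12) is nonnegative, so $Q_p$ does not decrease on $(0,1]$ and

$$ \begin{equation*} t^{d(p-1)/2}\int_{\mathbb{R}^d}u^p(x,t)v(x,t)\,dx = Q_p(t) \leqslant Q_p(1) = \int_{\mathbb{R}^d}u^p(x,1)w(x)\,dx, \qquad t\in(0,1]. \end{equation*} \notag $$

Dividing by $t^{d(p-1)/2}$ and taking the $p$th root gives (2.9).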

Remark 8. One can interpret Proposition 1 as an averaged version of Harnack’s inequality for the heat equation.

§ 3. Improving the monotonicity formula

We will measure the regularity of our weights using a version of the modulus of continuity (or smoothness at infinity).

Definition 4. Let $w$ be a continuous positive weight. Define its smoothness function $\operatorname{s}[w]\colon \mathbb{R}_+\to[1,\infty]$ by the formula

$$ \begin{equation*} \operatorname{s}[w](\zeta) = \sup\biggl\{\frac{w(x)}{w(y)}\Bigm||x-y| \leqslant \zeta,\ x,y\in\mathbb{R}^d\biggr\}. \end{equation*} \notag $$

Remark 9. The function $\operatorname{s}[w]$ does not decrease and equals one at zero.

Example 8. Let $\theta > 0$. Then $\operatorname{s}[(1+|\cdot|)^{-\theta}](\zeta) = (1+\zeta)^{\theta}$.
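
A short verification of Example 8 (a sketch, included for the reader's convenience): for $w = (1+|\cdot|)^{-\theta}$ and $|x-y| \leqslant \zeta$ the triangle inequality gives $1+|y| \leqslant 1+|x|+\zeta \leqslant (1+|x|)(1+\zeta)$, hence

$$ \begin{equation*} \frac{w(x)}{w(y)} = \biggl(\frac{1+|y|}{1+|x|}\biggr)^{\theta} \leqslant (1+\zeta)^{\theta}, \end{equation*} \notag $$

with equality for $x = 0$ and $|y| = \zeta$.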

Lemma 3. Let $\Psi \geqslant 0$. Then

$$ \begin{equation*} \operatorname{s}[w*\Psi]\leqslant \operatorname{s}[w] \end{equation*} \notag $$

pointwise.
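
One possible verification of Lemma 3 (a sketch, assuming that $w$ is a continuous positive weight and that $w*\Psi$ is finite and positive everywhere): if $|x-y| \leqslant \zeta$, then $|(x-z)-(y-z)| \leqslant \zeta$ for every $z\in\mathbb{R}^d$, so $w(x-z) \leqslant \operatorname{s}[w](\zeta)w(y-z)$. Multiplying by $\Psi(z) \geqslant 0$ and integrating in $z$, we get

$$ \begin{equation*} (w*\Psi)(x) = \int_{\mathbb{R}^d}w(x-z)\Psi(z)\,dz \leqslant \operatorname{s}[w](\zeta)\int_{\mathbb{R}^d}w(y-z)\Psi(z)\,dz = \operatorname{s}[w](\zeta)\,(w*\Psi)(y). \end{equation*} \notag $$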

Lemma 4. Let $t \in (0,1)$, let $w$ be a continuous weight. Then the dilated weight $W$, that is, $W(x) = w(tx)$, satisfies the inequality $\operatorname{s}[W] \leqslant \operatorname{s}[w]$ pointwise.
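
Lemma 4 admits a one-line check (a sketch, for a positive weight $w$, using the substitution $x' = tx$, $y' = ty$ and the monotonicity from Remark 9):

$$ \begin{equation*} \operatorname{s}[W](\zeta) = \sup\biggl\{\frac{w(tx)}{w(ty)}\Bigm||x-y| \leqslant \zeta\biggr\} = \sup\biggl\{\frac{w(x')}{w(y')}\Bigm||x'-y'| \leqslant t\zeta\biggr\} = \operatorname{s}[w](t\zeta) \leqslant \operatorname{s}[w](\zeta). \end{equation*} \notag $$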

The next definition is inspired by [36].

Definition 5. We call a subset $\mathbb{M}$ of the class $\mathcal{S}'(\mathbb{R}^d)$ an invariant cone of measures provided it satisfies the following properties:

1) any element $\mu \in\mathbb{M}$ is a measure, that is, a nonnegative distribution;
2) the set $\mathbb{M}$ is closed in the topology of $\mathcal{S}'(\mathbb{R}^d)$;
3) the set $\mathbb{M}$ is dilation invariant;
4) the set $\mathbb{M}$ is translation invariant;
5) the set $\mathbb{M}$ is a cone in the sense that $c\mu\in\mathbb{M}$, provided $c \geqslant 0$ and $\mu \in \mathbb{M}$.

Example 9. Let $q \in [1, d-1]$ be a natural number. Let $\mathbb{M}_q$ be the set of all nonnegative distributions in $\mathbb{R}^d$ that depend on at most $q$ coordinates in the sense that for any $\mu \in \mathbb{M}_q$ there exists $L\in G(d,q)$ such that shifts by elements of $L^{\perp}$ preserve $\mu$. It is easy to see that $\mathbb{M}_q$ is an invariant cone of measures.

Example 10. The set

$$ \begin{equation} \mathbb{M}^\mathcal{W} = \bigl\{\mu \in \mathcal{S}'(\mathbb{R}^d)\mid\mu \geqslant 0\text{ and } \exists\ a\in \mathbb{R}^\ell \setminus \{0\}\text{ such that } a\otimes\mu \in \mathcal{W}\bigr\} \end{equation} \tag{3.1} $$

is an invariant cone of measures. For the classical case considered in Example 4, the cone $\mathbb{M}^\mathcal{W}$ in formula (3.1) coincides with the cone $\mathbb{M}_1$ introduced in the previous example; the case of a divergence-free space presented in Example 5 leads to $\mathbb{M}_{d-1}$.

Theorem 4. Let $\mathbb{M}$ be an invariant cone of measures that does not contain $\delta_0$. Let $G$ be a continuous positive weight that satisfies the smoothness bound

$$ \begin{equation} \operatorname{s}[G](\zeta) \leqslant C_G(1+|\zeta|)^{\theta_G}, \qquad \zeta \in \mathbb{R}_+. \end{equation} \tag{3.2} $$

Let $p \in (1,\infty)$ be a fixed number. There exists a small constant $\delta$, whose choice depends on the parameters $\mathbb{M}$, $\theta_G$, $C_G$ and $p$, but not on the particular choice of $G$ and $\mu\in \mathbb{M}$, such that

$$ \begin{equation} \|\operatorname{H}[\mu](\,\cdot\,,t)\|_{L_p(\operatorname{H}[G](\,\cdot\,,(1-t)/p))} \leqslant t^{-(d/2)(p-1)/p + \delta}\|\operatorname{H}[\mu](\,\cdot\,,1)\|_{L_p(G)}, \qquad t\in (0,1], \end{equation} \tag{3.3} $$

provided the value on the right-hand side is finite.

This theorem can be thought of as a quantification of Remark 7. The proof is quite lengthy (though fairly straightforward) and occupies the rest of § 3. First, we would like to get rid of time. We recall the function $Q_p = Q_p[\mu,G]$ defined in (2.10).

Lemma 5. The estimate (3.3) follows from the inequality

$$ \begin{equation} \frac{\partial Q_p}{\partial t}(1) \geqslant \delta p Q_p(1), \end{equation} \tag{3.4} $$

once it has been established uniformly for all continuous weights $G$ satisfying (3.2) and $\mu \in \mathbb{M}$.

Before we pass to the proof, we note that if $Q_p(1)$ is finite, then $Q_p(t)$ is finite for any $t\in (0,1)$ by Proposition 1. Therefore, we will always work with finite quantities in the proofs below.

Proof of Lemma 5. Assume (3.4) holds with a constant $\delta$ which is uniform with respect to $\mu$ and $G$. The estimate (3.3) can be rewritten in terms of $Q_p$ as ${Q_p(t) \leqslant t^{p\delta}Q_p(1)}$, $t \in (0,1]$; this clearly follows from

$$ \begin{equation} \frac{Q_p'(t)}{Q_p(t)} \geqslant \frac{p\delta}{t}, \qquad t\in (0,1). \end{equation} \tag{3.5} $$
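
Indeed (a routine integration, assuming $Q_p > 0$ on $(0,1]$, that is, $\mu \ne 0$; the dummy variable $r$ is introduced here only for illustration), integrating (3.5) over $[t,1]$ gives

$$ \begin{equation*} \log\frac{Q_p(1)}{Q_p(t)} = \int_t^1\frac{Q_p'(r)}{Q_p(r)}\,dr \geqslant p\delta\int_t^1\frac{dr}{r} = \log t^{-p\delta}, \end{equation*} \notag $$

which is the same as $Q_p(t) \leqslant t^{p\delta}Q_p(1)$.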

We fix $\mu$ and $G$ and construct the functions $u$ and $v$ from them as prescribed by formula (2.10) (we set $w:= G$ in that formula). We also consider the dilated functions

$$ \begin{equation} \widetilde{u}(x,\theta) = u(tx,t^2\theta)\quad\text{and} \quad \widetilde{v}(x,\theta) = v(tx,t^2\theta), \qquad x\in \mathbb{R}^d, \quad \theta > 0. \end{equation} \tag{3.6} $$

These functions solve the same heat and backward heat equations as $u$ and $v$, respectively. Let us investigate the measure $\widetilde{\mu}$ and weight $\widetilde{G}$ that generate these functions as prescribed by formulae (2.10).

The situation with $\widetilde{\mu}$ is simpler: $\widetilde{\mu}$ is the limit of $\widetilde{u}(\,\cdot\,,\theta)$ as $\theta\to 0$, that is, $\widetilde{\mu}$ is a dilation of $\mu$. By the dilation invariance of $\mathbb{M}$ we have $\widetilde{\mu}\in \mathbb{M}$.

The weight $\widetilde{G}$ is the limit value of the function $\widetilde{v}$ at the level $\theta = 1$, and

$$ \begin{equation*} \widetilde v(x,\theta) = v(tx,t^2\theta) = \operatorname{H}[G]\biggl(tx,\frac{1-t^2\theta}{p}\biggr). \end{equation*} \notag $$

Consequently,

$$ \begin{equation*} \widetilde G(x) = \operatorname{H}[G]\biggl(tx,\frac{1-t^2}{p}\biggr). \end{equation*} \notag $$

So $\widetilde G(x) = G*\Phi(tx)$, where $\Phi$ is a certain Gaussian. Recall that $t < 1$, so Lemmas 3 and 4 lead to the inequalities

$$ \begin{equation*} \operatorname{s}[\widetilde G](\zeta) \leqslant \operatorname{s}[G](\zeta) \stackrel{(3.2)}{\leqslant} C_G(1+|\zeta|)^{\theta_G}. \end{equation*} \notag $$

Therefore, we are allowed to apply our assumption (3.4) to $\widetilde{\mu}$ and $\widetilde G$:

$$ \begin{equation} Q_p'[\widetilde \mu, \widetilde G](1) \geqslant p\delta Q_p[\widetilde \mu, \widetilde G](1). \end{equation} \tag{3.7} $$

It remains to express $Q_p[\widetilde \mu,\widetilde G]$ in terms of $Q_p[\mu,G]$:

$$ \begin{equation} \begin{aligned} \, \notag Q_p[\widetilde \mu,\widetilde G](\theta) &= \theta^{d(p-1)/2}\int_{\mathbb{R}^d} \widetilde u^p(x,\theta)\widetilde v(x,\theta)\,dx \\ &= \theta^{d(p-1)/2}\int_{\mathbb{R}^d}u^p(tx,t^2\theta)v(tx,t^2\theta)\,dx \notag \\ &=\theta^{d(p-1)/2}t^{-d}\int_{\mathbb{R}^d}u^p(y,t^2\theta)v(y,t^2\theta)\,dy = t^{-dp}Q_p[\mu,G](t^2\theta). \end{aligned} \end{equation} \tag{3.8} $$

Plugging in $\theta = 1$ we get

$$ \begin{equation} Q_p[\mu,G](t^2) = t^{dp}Q_p[\widetilde \mu,\widetilde G](1). \end{equation} \tag{3.9} $$

If we differentiate (3.8) with respect to $\theta$ and plug in $\theta = 1$, we obtain

$$ \begin{equation} Q_p'[\mu,G](t^2) = t^{dp-2}Q_p'[\widetilde \mu,\widetilde G](1). \end{equation} \tag{3.10} $$

A combination of (3.7), (3.9) and (3.10) leads to the desired estimate (3.5).
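
Explicitly (a sketch of the combination, with the same notation as above),

$$ \begin{equation*} \frac{Q_p'[\mu,G](t^2)}{Q_p[\mu,G](t^2)} \stackrel{(3.9), (3.10)}{=} \frac{t^{dp-2}Q_p'[\widetilde \mu,\widetilde G](1)}{t^{dp}Q_p[\widetilde \mu,\widetilde G](1)} \stackrel{(3.7)}{\geqslant} \frac{p\delta}{t^2}, \end{equation*} \notag $$

which is (3.5) at the point $t^2$; since $t\in(0,1)$ is arbitrary, this covers all points of $(0,1)$.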

Lemma 5 is proved.

Proposition 2. Let $\rho$ be a fixed weight:

$$ \begin{equation} \rho(x) = (1+|x|)^{-\theta_G - 2d}, \qquad x\in\mathbb R^d. \end{equation} \tag{3.11} $$

Let the positive weight $G$ satisfy the smoothness bound (3.2). Then for any $\nu > 0$ there exists $\eta > 0$ such that any tempered measure $\mu \in \mathcal{S}'(\mathbb{R}^d)$ that satisfies

$$ \begin{equation} \int_{\mathbb{R}^d}\biggl(\int_{\mathbb{R}^d} \exp\biggl(-\frac{|x-y|^2}{4}\biggr)\,d\mu(y)\biggr)^pG(x)\,dx = 1, \end{equation} \tag{3.12} $$

$$ \begin{equation} \begin{split} &\int_{\mathbb{R}^d}\biggl(\int_{\mathbb{R}^d} |\mathrm m(x)-y|^2\exp\biggl(-\frac{|x-y|^2}{4}\biggr)\,d\mu(y)\biggr) \\ &\qquad\times \biggl(\int_{\mathbb{R}^d} \exp\biggl(-\frac{|x-y|^2}{4}\biggr)\,d\mu(y)\biggr)^{p-1}G(x)\,dx < \eta, \end{split} \end{equation} \tag{3.13} $$

where $\mathrm{m}(x)$ is shorthand for

$$ \begin{equation*} \int_{\mathbb{R}^d} y\, \exp\biggl(-\frac{|x-y|^2}{4}\biggr)\,d\mu(y) \bigg/\int_{\mathbb{R}^d}\exp\biggl(-\frac{|x-y|^2}{4}\biggr)\,d\mu(y), \end{equation*} \notag $$

is concentrated around a point $x_0\in \mathbb{R}^d$ in the following sense:

$$ \begin{equation} \nu\mu(B_\nu(x_0)) \geqslant \int_{|x-x_0| \geqslant \nu} \rho(x-x_0)\,d\mu(x). \end{equation} \tag{3.14} $$

The choice of $\eta$ is independent of $\mu$ and $G$; it depends on $\nu$, $C_G$ and $\theta_G$ only.

Remark 10. The replacement of $\mathrm{m}(x)$ with any other function of $x$ in (3.13) makes this inequality stronger.
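
To see why (a standard computation; here $\mu_{x,1}$ is the measure from § 2 with $t = 1$, and $c\in\mathbb{R}^d$ is an arbitrary point introduced only for illustration), note that $\mathrm{m}(x)$ is the barycentre of $\mu_{x,1}$, so the cross term vanishes in the expansion of $|c-y|^2$ around $\mathrm{m}(x)$:

$$ \begin{equation*} \int_{\mathbb{R}^d}|c-y|^2\,d\mu_{x,1}(y) = \int_{\mathbb{R}^d}|\mathrm m(x)-y|^2\,d\mu_{x,1}(y) + |c-\mathrm m(x)|^2\mu_{x,1}(\mathbb{R}^d) \geqslant \int_{\mathbb{R}^d}|\mathrm m(x)-y|^2\,d\mu_{x,1}(y). \end{equation*} \notag $$

Thus replacing $\mathrm{m}(x)$ by any other point can only increase the left-hand side of (3.13).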

Proof of Theorem 4 assuming Proposition 2. By Lemma 5 it suffices to show (3.4) with a certain uniformity in the choice of $\delta$. Assume the contrary: let there exist a sequence $\{\mu_n\}_n$ of measures in $\mathbb{M}$ and a sequence $\{G_n\}_n$ of weights satisfying (3.2) uniformly such that

$$ \begin{equation*} Q_p[\mu_n,G_n](1) = 1 \quad\text{and}\quad Q_p'[\mu_n,G_n](1) \to 0, \quad n\to \infty. \end{equation*} \notag $$

By (2.10) the pair $(\mu_n,G_n)$ fulfills (3.12), and by (2.12) it fulfills (3.13) with some $\eta_n$ tending to zero (the assumption $p > 1$ is crucial here: see formula (2.12)). Thus, by Proposition 2 and translation invariance (we shift the measures to have $x_0 = 0$) we can also assume that

$$ \begin{equation} \nu_n\mu_n(B_{\nu_n}(0)) \geqslant \int_{|x| \geqslant \nu_n}\rho(x)\,d\mu_n(x), \end{equation} \tag{3.15} $$

where $\nu_n \to 0$ as $n\to \infty$. Consider the measures $\widetilde \mu_n = \frac{\mu_n}{\mu_n(B_{\nu_n}(0))}$. Note that these measures still lie in $\mathbb{M}$. To get a contradiction, it suffices to show the limit relation

$$ \begin{equation*} \widetilde\mu_n \xrightarrow{\mathcal{S}'(\mathbb{R}^d)} \delta_0. \end{equation*} \notag $$

To verify the limit relation, pick $f$ to be an arbitrary Schwartz function and write the value at $f$ of $\widetilde\mu_n$ as a functional:

$$ \begin{equation*} \int_{\mathbb{R}^d}f(x)\,d\widetilde \mu_n(x) \stackrel{(3.15)}{=} \frac{1}{\mu_n(B_{\nu_n}(0))}\int_{B_{\nu_n}(0)}f(x)\,d\mu_n(x) + O(\nu_n) \end{equation*} \notag $$

since $|f(x)| \lesssim \rho(x)$. It remains to notice that

$$ \begin{equation*} \frac{1}{\mu_n(B_{\nu_n}(0))}\int_{B_{\nu_n}(0)}f(x)\,d\mu_n(x) \to f(0) \end{equation*} \notag $$

since $\nu_n\to 0$ and $f$ is continuous.
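
The two estimates used above can be written out as follows (a sketch, assuming that the balls $B_{\nu_n}(0)$ are open and writing $C_f$ for a constant with $|f| \leqslant C_f\rho$; the name $C_f$ is introduced here only for illustration):

$$ \begin{equation*} \biggl|\frac{1}{\mu_n(B_{\nu_n}(0))}\int_{|x| \geqslant \nu_n}f(x)\,d\mu_n(x)\biggr| \leqslant \frac{C_f}{\mu_n(B_{\nu_n}(0))}\int_{|x| \geqslant \nu_n}\rho(x)\,d\mu_n(x) \stackrel{(3.15)}{\leqslant} C_f\nu_n \end{equation*} \notag $$

and

$$ \begin{equation*} \biggl|\frac{1}{\mu_n(B_{\nu_n}(0))}\int_{B_{\nu_n}(0)}f(x)\,d\mu_n(x) - f(0)\biggr| \leqslant \sup_{|x| < \nu_n}|f(x)-f(0)| \to 0, \qquad n\to\infty. \end{equation*} \notag $$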

Now we present the proof of Proposition 2, thus completing the proof of Theorem 4.

Proof of Proposition 2. We assume that $\nu < d^{-1/2}$, which is not a restriction. We partition $\mathbb{R}^d$ into cubes $Q_k$, $k\in \mathbb{Z}^d$,

$$ \begin{equation*} Q_k = \prod_{i=1}^d[\nu k_i,\nu(k_i+1)), \qquad k = (k_1,k_2,\ldots,k_d). \end{equation*} \notag $$

Also let $a_k = \mu(Q_k)$. We split the reasoning into four steps.

Step 1: the quantity $\sum_{k\in\mathbb{Z}^d}a_k^pG(\nu k)$ is separated away from zero and infinity. To show the boundedness of the said sum, we start with a local estimate

$$ \begin{equation} \int_{\mathbb{R}^d}\exp\biggl(-\frac{|x-y|^2}{4}\biggr)\,d\mu(y) \geqslant \frac12 a_k, \qquad x\in Q_k. \end{equation} \tag{3.16} $$
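
The estimate (3.16) can be checked directly (a sketch, using the assumption $\nu < d^{-1/2}$): if $x,y\in Q_k$, then $|x-y| \leqslant \nu\sqrt{d} < 1$, so that

$$ \begin{equation*} \int_{\mathbb{R}^d}\exp\biggl(-\frac{|x-y|^2}{4}\biggr)\,d\mu(y) \geqslant \int_{Q_k}\exp\biggl(-\frac{|x-y|^2}{4}\biggr)\,d\mu(y) \geqslant e^{-1/4}a_k \geqslant \frac12 a_k. \end{equation*} \notag $$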

Therefore,

$$ \begin{equation*} \begin{aligned} \, &2^{-p}\sum_{k\in\mathbb{Z}^d}a_k^pG(\nu k) \leqslant \sum_{k\in\mathbb{Z}^d} 2^{-p} a_k^p\nu^{-d}\operatorname{s}[G](\nu\sqrt{d})\int_{Q_k}G(x)\,dx \\ &\qquad \stackrel{(3.16)}{\leqslant} \nu^{-d}\operatorname{s}[G](\nu\sqrt{d})\sum_{k\in\mathbb{Z}^d}\int_{Q_k} \biggl(\int_{\mathbb{R}^d}\exp\biggl(-\frac{|x-y|^2}{4}\biggr)\,d\mu(y)\biggr)^pG(x)\,dx \\ &\qquad \stackrel{(3.12)}{=} \nu^{-d}\operatorname{s}[G](\nu\sqrt{d}). \end{aligned} \end{equation*} \notag $$

Thus, we have proved that

$$ \begin{equation} \sum_{k\in\mathbb{Z}^d}a_k^pG(\nu k) \lesssim 1. \end{equation} \tag{3.17} $$

The reverse inequality is a bit harder to obtain. We start with yet another local (with respect to $x$) estimate

$$ \begin{equation*} \begin{aligned} \, &\int_{\mathbb{R}^d}\exp\biggl(-\frac{|x-y|^2}{4}\biggr)\,d\mu(y) \lesssim \sum_{k\in\mathbb{Z}^d}\exp\biggl(-\frac{(|x-\nu k|- \sqrt{d})^2}{4}\biggr)a_k \\ &\quad\leqslant\biggl(\sum_{k\in\mathbb{Z}^d} \exp\biggl(-\frac{(|x-\nu k|- \sqrt{d})^2}{4}\biggr)a_k^p\biggr)^{1/p} \biggl(\sum_{k\in\mathbb{Z}^d} \exp\biggl(-\frac{(|x-\nu k|- \sqrt{d})^2}{4}\biggr)\biggr)^{1/p'} \\ &\quad \lesssim \biggl(\sum_{k\in\mathbb{Z}^d} \exp\biggl(-\frac{(|x-\nu k|-\sqrt{d})^2}{4}\biggr)a_k^p\biggr)^{1/p}. \end{aligned} \end{equation*} \notag $$

Consequently, by

$$ \begin{equation*} \begin{aligned} \, 1 &\stackrel{(3.12)}{\lesssim} \int_{\mathbb{R}^d} \sum_{k\in\mathbb{Z}^d} \exp\biggl(-\frac{(|x-\nu k|-\sqrt{d})^2}{4}\biggr)a_k^p G(x)\,dx \\ &\qquad= \sum_{k\in\mathbb{Z}^d}a_k^p\int_{\mathbb{R}^d} \exp\biggl(-\frac{(|x-\nu k| - \sqrt d)^2}{4}\biggr)G(x)\,dx \end{aligned} \end{equation*} \notag $$

the desired boundedness away from zero will follow once we have verified the inequality

$$ \begin{equation*} \int_{\mathbb{R}^d}\exp\biggl(-\frac{(|x-\nu k| - \sqrt d)^2}{4}\biggr)G(x)\,dx \lesssim G(\nu k). \end{equation*} \notag $$

The verification is as follows:

$$ \begin{equation*} \begin{aligned} \, &\int_{\mathbb{R}^d}\exp\biggl(-\frac{(|x-\nu k| - \sqrt d)^2}{4}\biggr)G(x)\,dx \\ &\qquad\leqslant G(\nu k) \int_{\mathbb{R}^d}\exp\biggl(-\frac{(|x-\nu k| - \sqrt d)^2}{4}\biggr) \operatorname{s}[G](|x-\nu k|)\,dx \\ &\qquad \stackrel{(3.2)}{\leqslant} G(\nu k) C_G\int_{\mathbb{R}^d} \exp\biggl(-\frac{(|x|-\sqrt{d})^2}{4}\biggr)(1+|x|)^{\theta_G}\,dx \lesssim G(\nu k). \end{aligned} \end{equation*} \notag $$

Now the proof of the inequality

$$ \begin{equation} 1\lesssim \sum_{k\in\mathbb{Z}^d}a_k^pG(\nu k) \end{equation} \tag{3.18} $$

is complete.

Step 2: kind points. Let $R$ be a large number to be specified below. Recall the definition of the weight $\rho$ in (3.11). A point $k\in\mathbb{Z}^d$ is called kind if

$$ \begin{equation*} \nu^pa_k^p \geqslant \sum_{\nu|k-m|\geqslant R}\rho(\nu(k-m))a_m^p. \end{equation*} \notag $$

We are going to show that most points are kind in the sense that

$$ \begin{equation} \sum_{k\text{ is kind}} a_k^pG(\nu k) \geqslant \frac{1}{2}\sum_{k\in\mathbb{Z}^d}a_k^pG(\nu k). \end{equation} \tag{3.19} $$

Note that both sides are finite by (3.17). A point that is not kind is evil. Then

$$ \begin{equation*} \sum_{k\text{ is evil}} a_k^pG(\nu k) \leqslant \nu^{-p} \sum_{\substack{k,m\in\mathbb{Z}^d \\ \nu|k-m| \geqslant R}}\rho(\nu(k-m))a_m^pG(\nu k), \end{equation*} \notag $$

and (3.19) follows, provided we have justified the estimate

$$ \begin{equation*} \nu^{-p}\sum_{k\colon \nu|k-m|\geqslant R}\rho(\nu(k-m)) G(\nu k) \leqslant \frac12 G(\nu m) \quad \text{for any } m\in\mathbb{Z}^d. \end{equation*} \notag $$

The justification is as follows:

$$ \begin{equation*} \begin{aligned} \, &\nu^{-p}\sum_{k\colon \nu|k-m|\geqslant R}\rho(\nu(k-m)) G(\nu k) \\ &\qquad\leqslant \frac{G(\nu m)}{\nu^{p}} \sum_{k\colon \nu|k-m|\geqslant R}\rho(\nu(k-m)) \operatorname{s}[G](\nu|k-m|) \\ &\ \stackrel{(3.2), (3.11)}{\leqslant} \frac{C_GG(\nu m)}{\nu^p}\sum_{k\colon \nu|k-m|\geqslant R}(1+\nu|k-m|)^{-2d} \leqslant \frac{G(\nu m)}{2}, \end{aligned} \end{equation*} \notag $$

provided $R$ is large. We fix $R$ to be sufficiently large so that the last estimate holds true together with

$$ \begin{equation} \sum_{\nu|m| > R}\rho(\nu m) \leqslant (2\operatorname{s}[\rho](1))^{-p'}, \end{equation} \tag{3.20} $$

where $p'$ is the conjugate exponent of $p$. Of course, the choice of $R$ depends on $\nu$.
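For the reader's convenience, here is how the pointwise estimate above yields (3.19). This is only a rearrangement of the formulas already displayed: we interchange the order of summation in $k$ and $m$ and apply the estimate for each fixed $m$.

$$ \begin{equation*} \begin{aligned} \, \sum_{k\text{ is evil}} a_k^pG(\nu k) &\leqslant \sum_{m\in\mathbb{Z}^d} a_m^p\biggl(\nu^{-p}\sum_{k\colon \nu|k-m|\geqslant R}\rho(\nu(k-m))G(\nu k)\biggr) \\ &\leqslant \frac12\sum_{m\in\mathbb{Z}^d}a_m^pG(\nu m). \end{aligned} \end{equation*} \notag $$

Since the full sum is finite, subtracting the evil part from it leaves at least one half of the full sum for the kind points, which is (3.19).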

Step 3: good points. Let $\tau$ be a parameter to be chosen below. We call a point $k\in \mathbb{Z}^d$ good if

$$ \begin{equation} \tau a_k \geqslant b_k, \quad \text{where } b_k = \min\biggl\{\sum_{\substack{m\colon\nu|k-m|\leqslant R \\ \sqrt{d} < |l-m|}} a_m \Bigm|l\in \mathbb{Z}^d\biggr\}. \end{equation} \tag{3.21} $$

In other words, $b_k$ is a sum of the $a_m$ where $m$ runs through the $\nu^{-1} R$-neighbourhood of $k$ excluding some small ball (we exclude points in a way that makes the sum as small as possible). We are going to prove that there exists a good kind point. More specifically, if $\tau > \Theta(\nu,R)\eta$, then there is a point $k$ that is $R$-kind and $\tau$-good. Here $\Theta$ is a specific positive function of two positive arguments. It is high time to use condition (3.13).

We start with a local bound from below which is similar to (3.16). Let $x\in Q_k$ and $\mathrm m(x) \in Q_l$ for some $l \in \mathbb{Z}^d$. Then

$$ \begin{equation*} \begin{aligned} \, &\int_{\mathbb{R}^d}|\mathrm m(x)-y|^2\exp\biggl(-\frac{|x-y|^2}{4}\biggr)\,d\mu(y) \\ &\qquad \geqslant \sum_{\substack{m\colon \nu|k-m|\leqslant R \\ \sqrt{d} < |l-m|}} \int_{Q_m}|\mathrm m(x)-y|^2\exp\biggl(-\frac{|x-y|^2}{4}\biggr)\,d\mu(y) \\ &\qquad \geqslant\nu^2 \exp\biggl(-\frac{|R+\sqrt{d}|^2}{4}\biggr) \sum_{\substack{m\colon\nu|k-m|\leqslant R \\ \sqrt{d} < |l-m|}} a_m, \qquad x\in Q_k, \quad \mathrm m(x) \in Q_l. \end{aligned} \end{equation*} \notag $$

Therefore,

$$ \begin{equation*} \int_{\mathbb{R}^d}|\mathrm m(x)-y|^2\exp\biggl(-\frac{|x-y|^2}{4}\biggr)\,d\mu(y) \geqslant \nu^2 \exp\biggl(-\frac{|R+\sqrt{d}|^2}{4}\biggr)b_k, \qquad x\in Q_k, \end{equation*} \notag $$

which implies that

$$ \begin{equation} \sum_{k\in\mathbb{Z}^d} a_k^{p-1}b_kG(\nu k) \stackrel{(3.13), (3.16)}{\leqslant} \operatorname{s}[G](\nu\sqrt{d})\frac{2^{p-1} \exp\bigl(\frac{|R+\sqrt{d}|^2}{4}\bigr)}{\nu^{2+d}}\eta. \end{equation} \tag{3.22} $$

We assume the contrary of our claim: let all kind points be $\tau$-bad (that is, not $\tau$-good). Then,

$$ \begin{equation*} \begin{aligned} \, \tau &\stackrel{(3.18)}{\lesssim} \tau \sum_{k\in\mathbb{Z}^d}a_k^pG(\nu k) \stackrel{(3.19)}{\leqslant} 2\tau \sum_{k\text{ is kind}}a_k^pG(\nu k) \\ &\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\stackrel{\text{kind points are bad}}{<} 2\sum_{k\in\mathbb{Z}^d} a_k^{p-1}b_kG(\nu k) \stackrel{(3.22)}{\leqslant} \operatorname{s}[G](\nu\sqrt{d})\frac{2^{p}\exp(\frac{|R+\sqrt{d}|^2}{4})}{\nu^{2+d}}\eta. \end{aligned} \end{equation*} \notag $$

We get a contradiction for $\tau := \Theta(\nu,R)\eta$, where $\Theta$ is a specific positive function that can easily be written down (one needs to collect the constants in (3.18) and combine them with the above formula). Therefore, there exists a kind good point $k_0$.
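The step marked 'kind points are bad' can be spelled out as follows. If $k$ is $\tau$-bad, then $\tau a_k < b_k$; multiplying by the nonnegative quantity $a_k^{p-1}G(\nu k)$ and summing over the kind points, we obtain

$$ \begin{equation*} \tau\sum_{k\text{ is kind}}a_k^pG(\nu k) \leqslant \sum_{k\text{ is kind}}a_k^{p-1}b_kG(\nu k) \leqslant \sum_{k\in\mathbb{Z}^d}a_k^{p-1}b_kG(\nu k), \end{equation*} \notag $$

where the last inequality holds because all the terms are nonnegative. As a sketch of how $\Theta$ arises, write $C_{(3.18)}$ for the constant implicit in (3.18) (this notation is introduced here for illustration only and is not used elsewhere). The chain above then reads

$$ \begin{equation*} \tau \leqslant C_{(3.18)}\operatorname{s}[G](\nu\sqrt{d})\frac{2^{p}\exp\bigl(\frac{|R+\sqrt{d}|^2}{4}\bigr)}{\nu^{2+d}}\,\eta, \end{equation*} \notag $$

so a function $\Theta(\nu,R)$ exceeding the coefficient of $\eta$ on the right-hand side would suffice.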

Step 4: end of the proof. Note that if $k$ is a good point and $\tau < 1$, then $a_k$ must be excluded from the sum that defines $b_k$ (because otherwise $b_k \geqslant a_k$). In other words, if $k$ is good, then the parameter $l$ at which the minimum in (3.21) is attained lies close to $k$: $|k-l| \leqslant \sqrt{d}$. Therefore, if $k$ is good, then

$$ \begin{equation} \tau a_k \geqslant \sum_{\substack{\nu|k-m|\leqslant R \\ 2\sqrt{d} < |k-m|}} a_m. \end{equation} \tag{3.23} $$

We set $x_0 = k_0$ and will show that this choice indeed leads to (3.14) with $\nu$ replaced by $5\sqrt{d}\,\nu$ (we increase this parameter slightly). Without loss of generality we can assume that $x_0 = 0$. We recall that $\eta$ is still ours to choose.

We wish to prove the inequality

$$ \begin{equation} \nu a_0 \geqslant \int_{|x|\geqslant 5\sqrt{d}\nu}\rho(x)\,d\mu(x). \end{equation} \tag{3.24} $$

We split the right-hand side into two integrals to be estimated individually:

$$ \begin{equation*} \int_{|x|\geqslant 5\sqrt{d}\nu} \leqslant \int_{\substack{\bigcup Q_m\colon 2\sqrt{d} < |m|\\ \nu |m|<R}}+ \int_{\bigcup Q_m\colon R \leqslant \nu|m|}. \end{equation*} \notag $$

The first part is estimated with the help of (3.23):

$$ \begin{equation*} \sum_{\substack{\nu|m|<R \\ 2\sqrt{d}<|m|}}a_m \stackrel{0\text{ is good}}{\leqslant} \tau a_0 < \frac{\nu}{2}a_0, \end{equation*} \notag $$

provided $\eta$ is sufficiently small (the specific bound can be expressed in terms of $\Theta$).

The second part is estimated by

$$ \begin{equation*} \begin{aligned} \, &\operatorname{s}[\rho](1) \sum_{R < \nu |m|}a_m\rho(\nu m) \\ &\qquad\leqslant\operatorname{s}[\rho](1) \biggl(\sum_{R < \nu |m|}a_m^p\rho(\nu m)\biggr)^{1/p} \biggl(\sum_{R < \nu |m|}\rho(\nu m)\biggr)^{1/p'} \stackrel{0\text{ is kind}}{\leqslant} \frac{\nu}{2}a_0, \end{aligned} \end{equation*} \notag $$

since we also assume (3.20). Thus, (3.24) is proved.
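In more detail, the last inequality combines the two conditions as follows. We assume here that $\rho$ is even, so that the kindness of the point $0$ reads $\nu^pa_0^p \geqslant \sum_{\nu|m|\geqslant R}\rho(\nu m)a_m^p$ (the definition (3.11) of $\rho$ is not repeated in this section, so this is an assumption of the sketch). Then

$$ \begin{equation*} \begin{aligned} \, &\operatorname{s}[\rho](1) \biggl(\sum_{R < \nu |m|}a_m^p\rho(\nu m)\biggr)^{1/p} \biggl(\sum_{R < \nu |m|}\rho(\nu m)\biggr)^{1/p'} \\ &\qquad\leqslant \operatorname{s}[\rho](1)\,\bigl(\nu^pa_0^p\bigr)^{1/p}\, \bigl((2\operatorname{s}[\rho](1))^{-p'}\bigr)^{1/p'} = \frac{\nu}{2}a_0, \end{aligned} \end{equation*} \notag $$

where the first factor is bounded by kindness and the second by (3.20).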

It remains to notice that (3.24) leads to (3.14) since $\mu(B_{5\nu\sqrt{d}}(0)) \geqslant a_0$.

Proposition 2 is proved.

We need to perturb Theorem 4 slightly.

Corollary 6. Let $\mathbb{M}$ be an invariant cone of measures that does not contain $\delta_0$. Let $p > 1$ be a fixed number. Also let the constants $C_G$ and $\theta_G$ be fixed. Then there exists $\widetilde \delta > 0$ such that for any sufficiently small $t > 0$ the following holds true. Let $H$ be a solution of the heat equation on $\mathbb{R}^d\times [t^2,1]$ such that $H(\,\cdot\,,t^2) \in \mathbb{M}$. Then the inequality

$$ \begin{equation*} \|H(\,\cdot\,,t)\|_{L_p(\operatorname{H}[G](\,\cdot\,,(1-t)/p))} \leqslant t^{-(d/2)(p-1)/p + \widetilde \delta} \|H(\,\cdot\,,1)\|_{L_p(G)} \end{equation*} \notag $$

is valid with any continuous positive weight $G$ satisfying (3.2), provided the right-hand side is finite.

Proof. Let us restate Theorem 4 in terms of the functions $u$ and $v$ generated by (2.10) with $\mu$ and $G$ in the roles of $\mu$ and $w$. It claims the inequality

$$ \begin{equation*} \|u(\,\cdot\,,t)\|_{L_p(v(\,\cdot\,, t))}\leqslant t^{-(d/2)(p-1)/p + \delta}\|u(\,\cdot\,,1)\|_{L_p(v(\,\cdot\,, 1))}, \qquad t \in (0,1), \end{equation*} \notag $$

for any functions $u$ and $v$ defined on the region $\mathbb{R}^d\times (0,1)$, $u$ solving the heat equation, $v$ solving the backward heat equation (2.11), and such that ${u(\,\cdot\,,0) \in \mathbb{M}}$ and $v(\,\cdot\,,1)$ satisfies (3.2). We can freely shift the region $\mathbb{R}^d \times (0,1)$ in any direction in $\mathbb{R}^{d+1}$ without any changes to this statement. If we dilate the region with coefficient $\lambda$ (making a change of variables in the style of (3.6)), where $\lambda$ is bounded away from zero and infinity to preserve (3.2) (with a possibly slightly worse constant $C_G$), then we see that the inequality

$$ \begin{equation*} \|u(\,\cdot\,, t)\|_{L_p(v(\,\cdot\,,t))}\leqslant \biggl(\frac{t-t_0}{t_1 - t_0}\biggr)^{-(d/2)(p-1)/p + \delta}\|u(\,\cdot\,,t_1)\|_{L_p(v(\,\cdot\,, t_1))}, \qquad t\in (t_0,t_1), \end{equation*} \notag $$

holds true for any pair of functions $u$ and $v$ defined on the region $\mathbb{R}^d\times [t_0, t_1]$, solving the same equations as usual, and such that $u(\,\cdot\,,t_0) \in \mathbb{M}$ and $v(\,\cdot\,,t_1)$ satisfies (3.2).

We plug $t_0 := t^2$ and $t_1 := 1$, $u:= H$ and $v(x,t):= \operatorname{H}[G](x,(1-t)/p)$ into the above inequality to obtain

$$ \begin{equation*} \|H(\,\cdot\,,t)\|_{L_p(\operatorname{H}[G](\,\cdot\,,(1-t)/p))} \leqslant \biggl(\frac{t-t^2}{1-t^2}\biggr)^{-(d/2)(p-1)/p + \delta} \|H(\,\cdot\,,1)\|_{L_p(G)}. \end{equation*} \notag $$

Thus, to finish the proof, it remains to show that

$$ \begin{equation} \biggl(\frac{t-t^2}{1-t^2}\biggr)^{-(d/2)(p-1)/p + \delta} \leqslant t^{-(d/2)(p-1)/p + \widetilde \delta}, \end{equation} \tag{3.25} $$

provided $t$ is sufficiently small and we can choose $\widetilde{\delta}$ arbitrarily small. We set $\widetilde \delta = \delta/2$ and rewrite (3.25) as

$$ \begin{equation} (1+t)^{(d/2)(p-1)/p - \delta} \leqslant t^{-\delta/2}. \end{equation} \tag{3.26} $$

This inequality is true, provided $t$ is sufficiently small, since the left-hand side is continuous at zero while the right-hand side blows up.
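The passage from (3.25) to (3.26) is elementary algebra, which we record for completeness. We write $\kappa = (d/2)(p-1)/p$ (a shorthand used only in this computation). First,

$$ \begin{equation*} \frac{t-t^2}{1-t^2} = \frac{t(1-t)}{(1-t)(1+t)} = \frac{t}{1+t}, \qquad t\in(0,1), \end{equation*} \notag $$

so that (3.25) with $\widetilde\delta = \delta/2$ becomes

$$ \begin{equation*} t^{-\kappa+\delta}(1+t)^{\kappa-\delta} \leqslant t^{-\kappa+\delta/2}, \end{equation*} \notag $$

and division by $t^{-\kappa+\delta} > 0$ gives (3.26). One possible explicit threshold: for $t\in(0,1)$ the left-hand side of (3.26) does not exceed $2^{\kappa}$, hence (3.26) holds whenever $t^{-\delta/2}\geqslant 2^{\kappa}$, that is, whenever $t \leqslant 2^{-2\kappa/\delta}$.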

Corollary 6 is proved.

§ 4. Time-frequency decomposition and control of convex atoms

Recall that our main target is to prove (2.4). We rewrite the right-hand side as a telescopic sum:

$$ \begin{equation} \|f\|_{L_1} = \|f_0\|_{L_1} + \sum_{k \geqslant 0}\bigl(\|f_{k+1}\|_{L_1} - \|f_k\|_{L_1}\bigr). \end{equation} \tag{4.1} $$

It is crucial that each term is nonnegative according to Corollary 5. Due to technical reasons, we will work with the sum

$$ \begin{equation} \sum_{k \geqslant 0}\bigl(\|f_{k+3}\|_{L_1} - \|f_k\|_{L_1}\bigr), \end{equation} \tag{4.2} $$

which is bounded by $3\|f\|_{L_1}$.
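The bound by $3\|f\|_{L_1}$ can be verified on partial sums. Splitting each difference into three consecutive differences and telescoping, we get, for any $N \geqslant 0$,

$$ \begin{equation*} \begin{aligned} \, \sum_{k=0}^{N}\bigl(\|f_{k+3}\|_{L_1} - \|f_k\|_{L_1}\bigr) &= \sum_{i=0}^{2}\,\sum_{k=0}^{N}\bigl(\|f_{k+i+1}\|_{L_1} - \|f_{k+i}\|_{L_1}\bigr) \\ &= \sum_{i=0}^{2}\bigl(\|f_{N+i+1}\|_{L_1} - \|f_{i}\|_{L_1}\bigr) \leqslant 3\|f\|_{L_1}. \end{aligned} \end{equation*} \notag $$

The last inequality uses $\|f_k\|_{L_1} \leqslant \|f\|_{L_1}$ for every $k$, which follows from (4.1) and the nonnegativity of its terms.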

Let $\theta_1 > d$ be a number to be specified in what follows. Consider the weight $w$ defined by

$$ \begin{equation} w(x) = \frac{(1+|x|)^{-\theta_1}}{\sum_{j\in\mathbb{Z}^d} (1+|x-j|)^{-\theta_1}}. \end{equation} \tag{4.3} $$

This weight satisfies the bounds

$$ \begin{equation} c_w(1+|x|)^{-\theta_1} \leqslant w(x) \leqslant C_{w}(1+|x|)^{-\theta_1}, \qquad x\in \mathbb{R}^d, \end{equation} \tag{4.4} $$

which, in particular, leads to (3.2) with $\theta_G:= \theta_1$ and $C_G := C_{w}/c_w$ (see Example 8). What is more, shifts of $w$ form a regular partition of unity:

$$ \begin{equation} \sum_{j\in \mathbb{Z}^d}w(x-j) = 1, \qquad x\in \mathbb{R}^d. \end{equation} \tag{4.5} $$
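Identity (4.5) can be checked directly from (4.3). Denote the denominator in (4.3) by $S(x) = \sum_{i\in\mathbb{Z}^d}(1+|x-i|)^{-\theta_1}$ (this notation is used only here). The series converges since $\theta_1 > d$, and re-indexing the sum shows that $S$ is $\mathbb{Z}^d$-periodic: $S(x-j) = S(x)$ for any $j\in\mathbb{Z}^d$. Therefore,

$$ \begin{equation*} \sum_{j\in \mathbb{Z}^d}w(x-j) = \sum_{j\in \mathbb{Z}^d}\frac{(1+|x-j|)^{-\theta_1}}{S(x-j)} = \frac{1}{S(x)}\sum_{j\in \mathbb{Z}^d}(1+|x-j|)^{-\theta_1} = 1. \end{equation*} \notag $$

Since $S$ is continuous, positive and periodic, it is bounded between two positive constants; this gives (4.4) with, for instance, $c_w = (\sup S)^{-1}$ and $C_w = (\inf S)^{-1}$.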

Now consider the partition of $\mathbb{R}^d$ into $A$-adic cubes. The cubes $\{Q_{0,j}\}_{j}$ have centres in the lattice $\mathbb{Z}^d$ and tile the whole space (up to a set of measure zero):

$$ \begin{equation*} Q_{0,j} = \biggl\{x\in \mathbb{R}^d\Bigm||x-j|_{\ell_{\infty}^d} \leqslant \frac12\biggr\}, \qquad j\in \mathbb{Z}^d; \end{equation*} \notag $$

here by the $\ell_\infty^d$-norm we mean the standard $\sup$-norm on $\mathbb{R}^d$. We form a family $\{Q_{k,j}\}_j$ by dilating this system of cubes:

$$ \begin{equation*} Q_{k,j} = \biggl\{x\in \mathbb{R}^d\Bigm||A^kx - j|_{\ell_\infty^d} \leqslant \frac12\biggr\}, \qquad j\in \mathbb{Z}^d, \quad k\geqslant 0. \end{equation*} \notag $$

Recall that we assume that $A$ is odd. By virtue of this assumption, the family $\{Q_{k,j}\}_{k,j}$ has a nice combinatorial property: any two cubes are either disjoint (up to a set of measure zero) or one of them contains the other.
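A one-dimensional example illustrates the role of the parity of $A$. For $d = 1$ we have $Q_{k,j} = [A^{-k}(j-\frac12), A^{-k}(j+\frac12)]$. If $A = 3$, then

$$ \begin{equation*} Q_{1,-1}\cup Q_{1,0}\cup Q_{1,1} = \biggl[-\frac12,-\frac16\biggr]\cup\biggl[-\frac16,\frac16\biggr]\cup\biggl[\frac16,\frac12\biggr] = Q_{0,0}. \end{equation*} \notag $$

In contrast, for the even value $A = 2$ (excluded by our assumption) the interval $Q_{1,1} = [\frac14,\frac34]$ meets both $Q_{0,0} = [-\frac12,\frac12]$ and $Q_{0,1} = [\frac12,\frac32]$ in sets of positive measure. In general, the endpoint $A^{-k}(j+\frac12)$ of a cube of generation $k$ equals $A^{-k-1}(j'+\frac12)$ with $j' = Aj + (A-1)/2$, and $j'$ is an integer precisely when $A$ is odd.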

We adjust the partition of unity (4.5) to each scale:

$$ \begin{equation*} w_{k,j}(x) = w(A^kx - j), \qquad x\in\mathbb{R}^d, \quad j\in \mathbb{Z}^d, \quad k \geqslant 0. \end{equation*} \notag $$

These weights form a partition of unity for any fixed $k$:

$$ \begin{equation*} \sum_{j\in\mathbb{Z}^d} w_{k,j} = 1, \end{equation*} \notag $$

and they are regular in the following sense:

$$ \begin{equation*} c_w(1+|A^kx -j|)^{-\theta_1} \leqslant w_{k,j}(x) \leqslant C_{w}(1+|A^kx -j|)^{-\theta_1}, \qquad x\in \mathbb{R}^d. \end{equation*} \notag $$

In particular,

$$ \begin{equation*} \operatorname{s}[w_{k,j}](\zeta) \leqslant \frac{C_{w}}{c_w}(1+A^k\zeta)^{\theta_1}, \qquad \zeta \geqslant 0. \end{equation*} \notag $$

The weights just introduced allow us to split the sum (4.2) further:

$$ \begin{equation*} \|f_{k+3}\|_{L_1} - \|f_k\|_{L_1} = \sum_{j \in \mathbb{Z}^d}\bigl(\|f_{k+3}\|_{L_1(\operatorname{H}[w_{k,j}](\,\cdot\,, A^{-2k} - A^{-2k-6}))} - \|f_k\|_{L_1(w_{k,j})}\bigr), \end{equation*} \notag $$

since the weights $\operatorname{H}[w_{k,j}](\,\cdot\,, A^{-2k} - A^{-2k-6})$ also form a partition of unity. By Lemma 1 each term in the above sum is nonnegative. (We have applied the said lemma to $t := A^{-2k} - A^{-2k-6}$ and have used (2.7).)

Definition 6. A pair $(k,j)$, where $k \in \mathbb{N}\cup\{0\}$ and $j \in \mathbb{Z}^d$, is called an atom.

Let $\varepsilon$ be a small parameter to be specified in what follows.

Definition 7. An atom $(k,j)$ is called $\varepsilon$-convex if

$$ \begin{equation*} \begin{aligned} \, &\|f_{k+3}\|_{L_1(\operatorname{H}[w_{k,j}](\,\cdot\,, A^{-2k} - A^{-2k-6}))} - \|f_k\|_{L_1(w_{k,j})} \\ &\qquad \geqslant \varepsilon \|f_{k+3}\|_{L_1(\operatorname{H}[w_{k,j}](\,\cdot\,, A^{-2k} - A^{-2k-6}))}, \end{aligned} \end{equation*} \notag $$

and it is called $\varepsilon$-flat in the case when the above inequality is violated. The set of convex atoms is denoted by $\operatorname{Co}$, and the set of flat atoms is denoted by $\operatorname{Fl}$.

We will also simply say ‘flat’ and ‘convex’, suppressing the dependence on $\varepsilon$.

Remark 11. If $(k,j)$ is $\varepsilon$-flat, then

$$ \begin{equation*} \begin{aligned} \, &\|f_{k+2}\|_{L_1(\operatorname{H}[w_{k,j}](\,\cdot\,, A^{-2k} - A^{-2k-4}))} - \|f_k\|_{L_1(w_{k,j})} \\ &\quad \stackrel{\text{Lemma 1}}{\leqslant} \|f_{k+3}\|_{L_1(\operatorname{H}[w_{k,j}](\,\cdot\,, A^{-2k} - A^{-2k-6}))} - \|f_k\|_{L_1(w_{k,j})} \\ &\qquad\leqslant \frac{\varepsilon}{1-\varepsilon} \|f_k\|_{L_1(w_{k,j})} \stackrel{\text{Lemma 1}}{\leqslant} \frac{\varepsilon}{1-\varepsilon}\|f_{k+2}\|_{L_1(\operatorname{H}[w_{k,j}](\,\cdot\,, A^{-2k} - A^{-2k-4}))}. \end{aligned} \end{equation*} \notag $$
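The middle inequality in Remark 11 is a direct consequence of Definition 7. To see this, abbreviate (for this computation only) $X = \|f_{k+3}\|_{L_1(\operatorname{H}[w_{k,j}](\,\cdot\,, A^{-2k} - A^{-2k-6}))}$ and $Y = \|f_k\|_{L_1(w_{k,j})}$. Flatness means that $X - Y < \varepsilon X$, and for $\varepsilon\in(0,1)$ this gives

$$ \begin{equation*} (1-\varepsilon)X < Y \quad\Longrightarrow\quad X - Y < \varepsilon X < \frac{\varepsilon}{1-\varepsilon}\,Y. \end{equation*} \notag $$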

Convex atoms are easier to deal with, and in Proposition 3 we establish the ‘half’ of inequality (2.4) that corresponds to convex atoms.

Proposition 3. For any $\varepsilon > 0$ the inequality

$$ \begin{equation*} \sum_{k\geqslant 0}A^{-\alpha k}\|f_k\|_{L_{p,1}(\bigcup_{(k,j)\in \operatorname{Co}} Q_{k,j})} \lesssim \|f\|_{L_1} \end{equation*} \notag $$

holds true. The constant in this inequality can only depend on $A$, $\varepsilon$ and $\theta_1$.

The proof of Proposition 3 needs some preparation and is based upon several useful lemmas. It is presented at the end of § 4.

Lemma 6. Let $p,q\in [1,\infty)$ and let $\{\Omega_j\}_j$ be a collection of measurable sets in $\mathbb{R}^d$. Assume that all of them have nonzero measure and are disjoint up to sets of measure zero. Then

$$ \begin{equation*} \|g\|_{L_{p,q}(\bigcup_j \Omega_j)} \leqslant \sum_j \|g\|_{L_{p,q}(\Omega_j)} \end{equation*} \notag $$

for any function $g$ (here $g$ can be vector valued).

We need to specify our choice of the Lorentz quasi-norm:

$$ \begin{equation} \|h\|_{L_{p,q}(\Omega)} = p^{1/q} \bigl\|t |\{x\in \Omega\mid|h(x)| \geqslant t\}|^{1/p}\bigr\|_{L_q(\mathbb{R}_+,dt/t)}, \end{equation} \tag{4.6} $$

where the absolute value of a set is its Lebesgue measure; here $\Omega$ is a Borel set of positive measure. Note that the expression above is not necessarily a norm.

Lemma 7. Let $G$ be a weight on $\mathbb{R}^d$ that satisfies the estimates

$$ \begin{equation} c_G(1+|x|)^{-\theta_G} \leqslant G(x) \leqslant C_{G}(1+|x|)^{-\theta_G}, \qquad x\in \mathbb{R}^d, \end{equation} \tag{4.7} $$

for some $\theta_G > d$. Then there are constants $\widetilde c_G$ and $\widetilde C_G$ that do not depend on $G$ itself, but only on $\theta_G, c_G$, and $C_G$, such that the estimate

$$ \begin{equation} \widetilde c_G(1+|x|)^{-\theta_G} \leqslant \operatorname{H}[G](x,t) \leqslant \widetilde C_{G}(1+|x|)^{-\theta_G}, \qquad x\in \mathbb{R}^d, \end{equation} \tag{4.8} $$

holds true for any $t\in [0,2]$.

Remark 12. The first inequality in (4.7) implies the first inequality in (4.8), whereas the second inequality in (4.7) implies the second one in (4.8).

Lemma 8. Let $\theta_u$ and $\theta_v$ be two constants larger than $d$. Let the weights $u$ and $v$ satisfy the inequalities

$$ \begin{equation} \begin{gathered} \, u(x) \geqslant c_u(1+|x|)^{-\theta_u}, \qquad x\in \mathbb{R}^d, \\ v(x) \leqslant C_v(1+|x|)^{-\theta_v}, \qquad x\in \mathbb{R}^d. \end{gathered} \end{equation} \tag{4.9} $$

Let $p \in [1,2]$. Assume that $\theta_v \geqslant p\theta_u$. Then for any $s\in [1/2,2]$ and $f\in L_1(u)$ the inequality

$$ \begin{equation*} \|\operatorname{H}[f](\,\cdot\,,s)\|_{L_p(v)} \lesssim \|f\|_{L_1(u)} \end{equation*} \notag $$

holds with a constant independent of $s$, $u$ and $v$; however, this constant can depend on $\theta_u$, $\theta_v$, $p$, $c_u$ and $C_v$.

We need to introduce the space $L_{p,1}(v)$. The norm in this space is defined by

$$ \begin{equation*} \|h\|_{L_{p,1}(v)} = p \biggl\|\biggl(\int_{\Omega_t} v(x)\,dx\biggr)^{1/p}\biggr\|_{L_1(\mathbb{R}_+)}, \qquad \Omega_t = \{x\in \mathbb{R}^d\mid |h(x)| \geqslant t\}, \quad t > 0. \end{equation*} \notag $$

Note that this agrees with (4.6) if $v = \chi_\Omega$ (we prefer to work with continuous weights, so we give two definitions).
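The agreement of the two definitions is a one-line computation. For $v = \chi_\Omega$ we have $\int_{\Omega_t}v(x)\,dx = |\{x\in\Omega\mid |h(x)|\geqslant t\}|$, while for $q = 1$ the factor $t$ in (4.6) cancels against the measure $dt/t$:

$$ \begin{equation*} \|h\|_{L_{p,1}(\Omega)} = p\int_0^\infty t\,|\{x\in \Omega\mid|h(x)| \geqslant t\}|^{1/p}\,\frac{dt}{t} = p\int_0^\infty \biggl(\int_{\Omega_t}\chi_\Omega(x)\,dx\biggr)^{1/p}dt = \|h\|_{L_{p,1}(\chi_\Omega)}. \end{equation*} \notag $$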

Corollary 7. Standard interpolation theory implies that if $\theta_v > p\theta_u$ in the notation and assumptions of the previous lemma, then

$$ \begin{equation*} \|\operatorname{H}[f](\,\cdot\,,s)\|_{L_{p,1}(v)} \lesssim \|f\|_{L_1(u)}, \qquad s\in\biggl[\frac12,2\biggr]. \end{equation*} \notag $$

Sometimes we will need to keep track of the constants in our inequalities. In fact, only the dependence on $A$ is of crucial importance (a small exception only occurs in § 7 below). We will write $\lesssim_A$ to indicate that the constant implicit in the symbol $\lesssim$ is independent of $A$. To avoid ambiguity we will usually comment on this type of independence.

Corollary 8. Let $(k,j)$ be an atom. Let the weights $u_{k,j}$ and $v_{k,j}$ satisfy the bounds

$$ \begin{equation} \begin{gathered} \, u_{k,j}(x) \geqslant c_u(1+|A^kx -j|)^{-\theta_u}, \qquad x\in \mathbb{R}^d, \\ \notag v_{k,j}(x) \leqslant C_v(1+|A^kx -j|)^{-\theta_v}, \qquad x\in \mathbb{R}^d. \end{gathered} \end{equation} \tag{4.10} $$

Also let $\theta_v > p\theta_u$ and $p \leqslant 2$. Then

$$ \begin{equation*} A^{-d(p-1)k/p}\|\operatorname{H}[f](\,\cdot\,,s)\|_{L_{p,1}(v_{k,j})} \lesssim_A \|f\|_{L_1(u_{k,j})} \end{equation*} \notag $$

whenever $s\in [A^{-2k}/2,2A^{-2k}]$. The constant in this inequality is uniform with respect to the parameters $s$, $A$, $k$, $j$, $u$, $v$ and $f$; however, it can depend on $p$, $\theta_u$, $\theta_v$, $c_u$ and $C_v$.

The following corollary will be needed only in § 6, but it is more convenient to present it here. Recall that $\alpha =d(p-1)/p$.

Corollary 9. Let $p \leqslant 2$. Then the inequality

$$ \begin{equation*} \|f_1\|_{L_p(Q_{0,i})} \lesssim_A A^{\alpha} \|f_2\|_{L_1(\operatorname{H}[w_{0,i}](\,\cdot\,,1-A^{-4}))} \end{equation*} \notag $$

holds uniformly for all $f$, $i \in \mathbb{Z}^d$ and all $A > 2$ (the constants can depend on $\theta_1$).

Proof. Without loss of generality we can assume that $i = 0$. By Lemma 6 it suffices to show that

$$ \begin{equation} \sum_{j\colon Q_{1,j}\subset Q_{0,0}}\|f_1\|_{L_p(Q_{1,j})} \lesssim_A A^{\alpha} \|f_2\|_{L_1(\operatorname{H}[w_{0,0}](\,\cdot\,,1-A^{-4}))}. \end{equation} \tag{4.11} $$

By (2.7) and Corollary 8 (with $k=1$, $s=A^{-2}-A^{-4}$, $v_{1,j}:=\chi_{Q_{1,j}}$, $u_{1,j} := w_{1,j}$; so $\theta_u=\theta_1$ and $\theta_v=p\theta_1 + 1$),

$$ \begin{equation*} \sum_{j\colon Q_{1,j}\subset Q_{0,0}}\|f_1\|_{L_p(Q_{1,j})} \lesssim_A A^{\alpha}\sum_{j\colon Q_{1,j}\subset Q_{0,0}}\|f_2\|_{L_1(w_{1,j})}. \end{equation*} \notag $$

Thus, the desired inequality (4.11) will follow with the help of Lemma 7, provided we show that

$$ \begin{equation} \sum_{j\colon Q_{1,j} \subset Q_{0,0}}w_{1,j}(x) \lesssim_A w_{0,0}(x), \qquad x\in \mathbb{R}^d. \end{equation} \tag{4.12} $$

To demonstrate (4.12), we note first that the weights $\{w_{1,j}\}_{j\in\mathbb{Z}^d}$ form a partition of unity, which, in particular, means that the left-hand side of this inequality never exceeds $1$. Consequently, it suffices to prove the inequality in the case when ${|x| \geqslant 2\sqrt{d}}$. Note that in this case all the quantities $w_{1,j}(x)$ are comparable since ${Q_{1,j} \subset Q_{0,0}}$:

$$ \begin{equation*} w_{1,j}(x) \lesssim_A (1+|Ax - j|)^{-\theta_1} \lesssim_A (1+A|x|)^{-\theta_1}, \qquad A|x| \geqslant 2|j|. \end{equation*} \notag $$

Therefore,

$$ \begin{equation*} \sum_{j\colon Q_{1,j} \subset Q_{0,0}}w_{1,j}(x) \lesssim_A A^{d} (1+A|x|)^{-\theta_1}\leqslant A^{d-\theta_1}|x|^{-\theta_1} \lesssim_A w_{0,0}(x), \end{equation*} \notag $$

provided $A > 2$, $\theta_1 > d$ and $|x| \geqslant 2\sqrt{d}$.
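For completeness, here is one way to trace the constants in the last two displays; the numerical factors are not optimized. There are $A^d$ indices $j$ with $Q_{1,j}\subset Q_{0,0}$, and each of them satisfies $|j| \leqslant \sqrt{d}A/2 \leqslant A|x|/4$ when $|x| \geqslant 2\sqrt{d}$. Hence $|Ax-j| \geqslant \frac34 A|x|$ and

$$ \begin{equation*} (1+|Ax - j|)^{-\theta_1} \leqslant \biggl(\frac43\biggr)^{\theta_1}(1+A|x|)^{-\theta_1}. \end{equation*} \notag $$

Furthermore, since $A^{d-\theta_1}\leqslant 1$ and $1+|x| \leqslant \frac32|x|$ for $|x|\geqslant 2\sqrt{d}$, the bound (4.4) yields

$$ \begin{equation*} A^{d-\theta_1}|x|^{-\theta_1} \leqslant \biggl(\frac32\biggr)^{\theta_1}(1+|x|)^{-\theta_1} \leqslant \biggl(\frac32\biggr)^{\theta_1}c_w^{-1}\,w_{0,0}(x), \end{equation*} \notag $$

and none of these constants depends on $A$.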

The corollary is proved.

Proof of Proposition 3. First we apply Lemma 6:

$$ \begin{equation} \sum_{k\geqslant 0}A^{-\alpha k}\|f_k\|_{L_{p,1}(\bigcup_{(k,j)\in \operatorname{Co}} Q_{k,j})} \leqslant \sum_{k\geqslant 0}A^{-\alpha k}\sum_{j\colon (k,j)\in \operatorname{Co}} \|f_{k}\|_{L_{p,1}(Q_{k,j})}. \end{equation} \tag{4.13} $$

Then we use the representation

$$ \begin{equation*} f_k = \operatorname{H} [f_{k+3}](\,\cdot\,, A^{-2k} - A^{-2k-6}) \end{equation*} \notag $$

(see (2.7)) and apply Corollary 8 to

$$ \begin{equation*} u_{k,j} := \operatorname{H}[w_{k,j}](\,\cdot\,, A^{-2k} - A^{-2k-6}) \end{equation*} \notag $$

(by Lemma 7 this weight satisfies (4.10) with $\theta_u:= \theta_1$), $v_{k,j}:= \chi_{Q_{k,j}}$ (so that we set $\theta_v = p\theta_u + 1$) and $s = A^{-2k} - A^{-2k-6}$, which is fine for $A \geqslant 2$:

$$ \begin{equation*} A^{-\alpha k}\|f_{k}\|_{L_{p,1}(Q_{k,j})} \lesssim \|f_{k+3}\|_{L_1(\operatorname{H} [w_{k,j}](\,\cdot\,, A^{-2k} - A^{-2k-6}))}. \end{equation*} \notag $$
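The admissibility of this choice of $s$ in Corollary 8 is easy to verify: for $A \geqslant 2$,

$$ \begin{equation*} s = A^{-2k}(1 - A^{-6}), \qquad \frac12 \leqslant \frac{63}{64} \leqslant 1 - A^{-6} < 1, \end{equation*} \notag $$

so that $s\in [A^{-2k}/2, 2A^{-2k}]$. The factor $A^{-\alpha k}$ on the left-hand side is the factor $A^{-d(p-1)k/p}$ from Corollary 8, since $\alpha = d(p-1)/p$.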

We continue the estimate (4.13) by using the definition of a convex atom:

$$ \begin{equation*} \begin{aligned} \, &\sum_{k\geqslant 0}\,\sum_{j\colon (k,j)\in \operatorname{Co}} \|f_{k+3}\|_{L_1(\operatorname{H} [w_{k,j}](\,\cdot\,, A^{-2k} - A^{-2k-6}))} \\ &\qquad\leqslant \frac{1}{\varepsilon} \sum_{k \geqslant 0}\sum_{j\in\mathbb{Z}^d} \bigl(\|f_{k+3}\|_{L_1(\operatorname{H} [w_{k,j}](\,\cdot\,, A^{-2k} - A^{-2k-6}))} - \|f_k\|_{L_1(w_{k,j})}\bigr) \\ &\qquad =\frac{1}{\varepsilon}\sum_{k \geqslant 0}\bigl(\|f_{k+3}\|_{L_1} - \|f_k\|_{L_1}\bigr) \stackrel{(4.1)}{\lesssim} \|f\|_{L_1}. \end{aligned} \end{equation*} \notag $$

The proposition is proved.

§ 5. Compactness argument

The informal meaning of the flat/convex classification of atoms is that $\varepsilon$-flat atoms mark places in $\mathbb{R}^d$ where the function $f_k$ is close to a positive rank-one function. Recall Lemma 2, which says that the presence of a $0$-flat atom ensures that ${f_{k+3} = a\otimes h}$ with $h \geqslant 0$. So we are looking for a compactness argument which will allow us to replace $0$ with $\varepsilon$, obtain that $f_k$ is somehow close to being a positive rank-one function, and then apply Corollary 6 to it. Unfortunately, this principle seems to fail in general. Imagine a function $f_k$ concentrated far away from the cube $Q_{k,j}$ and taking arbitrary values in this cube. It seems that we can deduce nothing from the flatness of $(k,j)$ since the behaviour of $f_k$ in a neighbourhood of $Q_{k,j}$ is only weakly related to the weighted norms under consideration.

This suggests that the compactness argument we are looking for should involve the assumption that $f_k$ is concentrated in a neighbourhood of $Q_{k,j}$. Fix some number $\theta_2$ to be specified below and assume that $\theta_2 < \theta_1$. Consider the weight

$$ \begin{equation} u(x) = (1+|x|)^{-\theta_2}. \end{equation} \tag{5.1} $$

Also let $C$ be some fixed constant. We express the concentration of $f_0$ on $(0,0)$ by the inequality

$$ \begin{equation} \|f_2\|_{L_1(u)} \leqslant C\|f_2\|_{L_1(\operatorname{H}[w_{0,0}] (\,\cdot\,,1 -A^{-4}))}. \end{equation} \tag{5.2} $$

The presence of $f_2$ instead of $f_0$ is dictated by technical reasons. This condition can be transferred to an arbitrary atom in the usual way. Also let $\theta_3 > d$ be a number to be specified below. Here we only need that

$$ \begin{equation*} \theta_3 > p\theta_1. \end{equation*} \notag $$

Consider the weight

$$ \begin{equation*} v(x) = (1+|x|)^{-\theta_3}. \end{equation*} \notag $$

Recall the cone $\mathbb{M}^\mathcal{W}$, which is naturally related to the space $\mathcal{W}$ by formula (3.1).

Theorem 5. Let $\delta_0 \notin \mathbb{M}^\mathcal{W}$. Let $\widetilde \delta$ be the specific number obtained in Corollary 6 for the cone $\mathbb{M}^\mathcal{W}$ and the weight $G:= v$. Fix $\theta_1$, $\theta_2$ and $\theta_3$. For any fixed $C$ and any sufficiently large $A$ there exists $\varepsilon$ depending on all the parameters (except $f\in L_1\cap \mathcal{W}$) such that if $(0,0)$ is $\varepsilon$-flat and $f$ satisfies the concentration condition (5.2), then

$$ \begin{equation} \|f_1\|_{L_p(\operatorname{H}[v](\,\cdot\,,(1-A^{-2})/p))} \leqslant A^{d(p-1)/p - \widetilde \delta/4} \|f_0\|_{L_p(v)}. \end{equation} \tag{5.3} $$

Moreover, the inequality

$$ \begin{equation} \|f_2\|_{L_1(\operatorname{H}[w_{0,0}] (\,\cdot\,, 1-A^{-4}))} \lesssim_A\|f_0\|_{L_p(Q_{0,0})} \end{equation} \tag{5.4} $$

holds true, provided $\varepsilon$ is sufficiently small (the constant in this inequality does not depend on $\varepsilon$ or $A$, provided $\varepsilon$ is sufficiently small).

The proof of Theorem 5 occupies the rest of § 5. We start with several useful lemmas. Let $\Omega \subset \mathbb{R}^d$ be an open convex set. We define the Lipschitz semi-norm by the formula

$$ \begin{equation*} \|g\|_{\operatorname{Lip}(\Omega)} = \sup_{\substack{x,y\in\Omega \\ x\ne y}}\frac{|g(x) - g(y)|}{|x-y|}, \end{equation*} \notag $$

note that $\|g\|_{\operatorname{Lip}(\Omega)} = \|\nabla g\|_{L_{\infty}(\Omega)}$.

Lemma 9. Let $R > 0$ be a fixed number and let $G$ be a weight satisfying the first (lower) bound in (4.9) with parameters $\theta_G$ and $c_G$ in the roles of $\theta_u$ and $c_u$. Then

$$ \begin{equation} \|\operatorname{H}[f](\,\cdot\,,s)\|_{\operatorname{Lip}(B_R(0))} \lesssim \|f\|_{L_1(G)} \end{equation} \tag{5.5} $$

and

$$ \begin{equation} \|\operatorname{H}[|f|](\,\cdot\,,s)\|_{\operatorname{Lip}(B_R(0))} \lesssim \|f\|_{L_1(G)}, \end{equation} \tag{5.6} $$

provided $s\in [1/2,2]$. The constants in these inequalities are independent of $s$.

Remark 13. If we pick an arbitrary $s\in (0,1]$, then inequalities (5.5) and (5.6) still hold true. However, the constants in these inequalities are not uniform with respect to $s$.

Till the end of § 5 we use the notation

$$ \begin{equation} \widetilde{w} = \operatorname{H}[w](\,\cdot\,,1-t), \end{equation} \tag{5.7} $$

where the weight $w$ is given by (4.3).

Lemma 10. Assume $f$ satisfies the flatness condition in the form

$$ \begin{equation} \|f\|_{L_1(\widetilde{w})} - \|\operatorname{H}[f](\,\cdot\,,1-t)\|_{L_1(w)} \leqslant \varepsilon \|f\|_{L_1(\widetilde{w})}, \end{equation} \tag{5.8} $$

where $t \in [0,1/2]$ is a fixed real number. Fix a large number $R$ and assume that

$$ \begin{equation} \int_{|x|\geqslant R}|f(x)|\widetilde{w}(x)\,dx \leqslant \frac12 \|f\|_{L_1(\widetilde{w})}. \end{equation} \tag{5.9} $$

Then there exists $c > 0$ such that the inequality

$$ \begin{equation} \bigl|\operatorname{H}[f](x,1-t)\bigr| \geqslant c\|f\|_{L_1(\widetilde{w})} \end{equation} \tag{5.10} $$

holds true for any $x\in B_R(0)$, provided $\varepsilon$ is sufficiently small. The constant $c$ does not depend on $t$ as long as $\varepsilon$ is sufficiently small and $R$ is fixed.

Proof. First let us prove a similar inequality, which definitely avoids unwanted cancellations:

$$ \begin{equation} \operatorname{H}[|f|](x,1-t) \geqslant c_1\|f\|_{L_1(\widetilde{w})}, \qquad |x| \leqslant R. \end{equation} \tag{5.11} $$

This is straightforward:

$$ \begin{equation*} \begin{aligned} \, \operatorname{H}[|f|](x,1-t) &= (4\pi(1-t))^{-d/2}\int_{\mathbb{R}^d} |f(y)| \exp\biggl(-\frac{|x-y|^2}{4(1-t)}\biggr)\,dy \\ &\geqslant(4\pi(1-t))^{-d/2}\int_{|y|\leqslant R} |f(y)| \exp\biggl(-\frac{|x-y|^2}{4(1-t)}\biggr)\,dy \\ &\geqslant \widetilde c\int_{|y|\leqslant R}|f(y)|\widetilde w(y), \end{aligned} \end{equation*} \notag $$

where

$$ \begin{equation*} \widetilde c = \inf_{|x|,|y| \leqslant R} \frac{(4\pi)^{-d/2}(1-t)^{-d/2}\exp(-\frac{|x-y|^2}{4(1-t)})}{\widetilde{w}(y)} \gtrsim \exp(-R^2)(1+R)^{\theta_1} \end{equation*} \notag $$

(we have used Lemma 7 here). To finish the proof of (5.11) we simply use (5.9) and set $c_1:= \widetilde{c}/2$.

Now we return to the proof of (5.10). We will show that this inequality holds true for $c := c_1/2$. We recall (2.8) to state that our flatness assumption (5.8) leads to

$$ \begin{equation} \int_{\mathbb{R}^d}\bigl(\operatorname{H}[|f|](x,1-t) - |\operatorname{H}[f](x,1-t)|\bigr)w(x)\,dx \leqslant \varepsilon \|f\|_{L_1(\widetilde{w})}. \end{equation} \tag{5.12} $$

Assume the contrary: let there exist $x_0\in B_R(0)$ such that

$$ \begin{equation*} |\operatorname{H}[f](x_0,1-t)| < \frac12 c_1\|f\|_{L_1(\widetilde{w})}. \end{equation*} \notag $$

According to (5.11),

$$ \begin{equation*} \operatorname{H}[|f|](x_0,1-t) - |\operatorname{H}[f](x_0,1-t)| > \frac{c_1}{2}\|f\|_{L_1(\widetilde{w})}. \end{equation*} \notag $$

By Lemma 9 the expression on the left-hand side of the above inequality is a Lipschitz function of $x_0$ with Lipschitz constant $L\|f\|_{L_1(\widetilde{w})}$, where $L$ depends only on $R$ and the parameters of $\widetilde{w}$ (those are $\theta_1$ and $c_{w}$; we have used Lemma 7 here again). Therefore,

$$ \begin{equation*} \operatorname{H}[|f|](x,1-t) - |\operatorname{H}[f](x,1-t)| > \frac{c_1}{5}\|f\|_{L_1(\widetilde{w})} \end{equation*} \notag $$

for any $x\in B_R(0) \cap B_{c_1/(10 L)}(x_0)$. We integrate this inequality with respect to $x$:

$$ \begin{equation*} \begin{aligned} \, &\int_{B_R(0) \cap B_{c_1/(10 L)}(x_0)} \bigl(\operatorname{H}[|f|](x,1-t) - |\operatorname{H}[f](x,1-t)|\bigr)w(x)\,dx \\ &\qquad \geqslant\frac{\pi_d}{d!}\biggl(\frac{c_1}{10L}\biggr)^d (\operatorname{s}[w](R))^{-1}\frac{c_1}{5} \|f\|_{L_1(\widetilde{w})}; \end{aligned} \end{equation*} \notag $$

here $\pi_d$ is the volume of the unit ball in $\mathbb{R}^d$ (we have assumed that $R$ is larger than $c_1/(10L)$, which is not a restriction, to estimate the volume of the intersection of two balls such that the centre of the smaller ball lies in the larger one, by $(d!)^{-1}$ times the volume of the smaller ball). The latter inequality contradicts (5.12), provided

$$ \begin{equation*} \varepsilon < \frac{\pi_d}{d!}\biggl(\frac{c_1}{10L}\biggr)^d (\operatorname{s}[w](R))^{-1}\frac{c_1}{5}. \end{equation*} \notag $$

Lemma 10 is proved.
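The constant $c_1/5$ in the proof of Lemma 10 leaves some room, as the following computation shows. Write $\Delta(x) = \operatorname{H}[|f|](x,1-t) - |\operatorname{H}[f](x,1-t)|$ (a notation used only here) and assume that $\|f\|_{L_1(\widetilde{w})} > 0$. Since $\Delta$ is Lipschitz on $B_R(0)$ with constant $L\|f\|_{L_1(\widetilde{w})}$ and $\Delta(x_0) > \frac{c_1}{2}\|f\|_{L_1(\widetilde{w})}$, for any $x\in B_R(0)$ with $|x-x_0| < c_1/(10L)$ we have

$$ \begin{equation*} \Delta(x) \geqslant \Delta(x_0) - L\|f\|_{L_1(\widetilde{w})}|x-x_0| > \biggl(\frac{c_1}{2} - \frac{c_1}{10}\biggr)\|f\|_{L_1(\widetilde{w})} = \frac{2c_1}{5}\|f\|_{L_1(\widetilde{w})} > \frac{c_1}{5}\|f\|_{L_1(\widetilde{w})}. \end{equation*} \notag $$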

Lemma 11. Assume that

$$ \begin{equation} \|g\|_{L_1(u)} \leqslant C\|g\|_{L_1(\widetilde{w})} \end{equation} \tag{5.13} $$

for some function $g \in L_1(\widetilde{w})$ and $C > 0$. Then there exists $R > 0$ which depends on $\gamma \in (0,1)$, $\theta_1$, $\theta_2$ and $C$ only, such that

$$ \begin{equation} \int_{|x|\geqslant R}|g(x)|\widetilde{w}(x)\,dx \leqslant \gamma \|g\|_{L_1(\widetilde{w})}. \end{equation} \tag{5.14} $$

The choice of $R$ is independent of $t < 1/2$ (this parameter is implicitly present in the definition (5.7) of $\widetilde{w}$) and $g$.

In particular, if $\gamma =1/2$, the concentration condition (5.2) implies (5.9) with $f:= f_2$ and $t:=A^{-4}$.

Proof of Lemma 11. We estimate the left-hand side of (5.14):

$$ \begin{equation*} \begin{aligned} \, \int_{|x| \geqslant R} |g(x)|\widetilde{w}(x)\,dx &\leqslant \int_{|x|\geqslant R}|g(x)|u(x)\,dx \sup_{|x|\geqslant R}\frac{\widetilde{w}(x)}{u(x)} \\ &\!\!\!\!\!\!\!\!\!\!\!\!\!\!\stackrel{\text{Lemma 7, (5.13)}}{\leqslant} C\|g\|_{L_1(\widetilde{w})}\sup_{|x|\geqslant R}\frac{C_{\widetilde{w}} (1+|x|)^{-\theta_1}}{(1+|x|)^{-\theta_2}} \\ &\!\!\!\!\!\stackrel{\theta_2 < \theta_1}{\leqslant} CC_{\widetilde{w}} (1+R)^{\theta_2 - \theta_1}\|g\|_{L_1(\widetilde{w})}, \end{aligned} \end{equation*} \notag $$

and see that the quantity in front of $\|g\|_{L_1(\widetilde{w})}$ becomes arbitrarily small when ${R\to \infty}$ since $\theta_2 < \theta_1$.
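An admissible value of $R$ can be written down explicitly: inequality (5.14) follows from the estimate above as soon as

$$ \begin{equation*} CC_{\widetilde{w}}(1+R)^{\theta_2-\theta_1} \leqslant \gamma, \quad\text{that is,}\quad R \geqslant \biggl(\frac{CC_{\widetilde{w}}}{\gamma}\biggr)^{1/(\theta_1-\theta_2)} - 1. \end{equation*} \notag $$

By Lemma 7 the constant $C_{\widetilde{w}}$ can be chosen independently of $t$, which explains why $R$ depends only on $\gamma$, $\theta_1$, $\theta_2$ and $C$.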

The lemma is proved.

Lemma 12. Let $\{L_R\}_{R \in \mathbb{N}}$ be a sequence of positive scalars. Then the set

$$ \begin{equation*} \bigl\{g \in L_1(\widetilde{w})\mid\|g\|_{L_1(u)} \leqslant C\|g\|_{L_1(\widetilde{w})} \leqslant C; \ \forall\, R\in \mathbb{N}\ \|g\|_{\operatorname{Lip}(B_R(0))} \leqslant L_R\bigr\} \end{equation*} \notag $$

is compact in $L_1(\widetilde{w})$.

Proof of Theorem 5. We use Lemma 11 with $g:= f_2$ and $t := A^{-4}$ to choose $R>\sqrt{d}$ such that (5.9) holds true for $f:=f_2$, that is, such that

$$ \begin{equation*} \int_{|x| \geqslant R} |f_2(x)|\operatorname{H}[w](\,\cdot\,,1-A^{-4})(x)\,dx \leqslant \frac12 \|f_2\|_{L_1(\operatorname{H}[w](\,\cdot\,,1-A^{-4}))}. \end{equation*} \notag $$

By Lemma 10 for $f:= f_2$ and $t := A^{-4}$ (the application of this lemma is legal since, by virtue of Remark 11, condition (5.8) follows in this case from the assumption that $(0,0)$ is flat)

$$ \begin{equation} |f_0(x)| = \bigl|\operatorname{H}[f_2](x,1-A^{-4})\bigr| \geqslant c\|f_{2}\|_{L_1(\operatorname{H}[w](\,\cdot\,,1-A^{-4}))}, \qquad x\in B_R(0), \end{equation} \tag{5.15} $$

provided $\varepsilon$ is sufficiently small. Note that the constant $c$ depends neither on $A$, nor on $\varepsilon$ (provided $\varepsilon$ is sufficiently small). Inequality (5.15) justifies (5.4).

Now we fix $A$ and allow $\varepsilon$ to depend on $A$. Our aim is to prove (5.3). Assume the contrary: let there exist a sequence of functions $f^n \in L_1\cap\mathcal{W}$ such that the atom $(0,0)$ is $(1/n)$-flat for $f^n$, the condition (5.2) is fulfilled for $f_2:=f_2^n$, but (5.3) is violated in the following sense:

$$ \begin{equation} \|f_1^n\|_{L_p(\operatorname{H}[v](\,\cdot\,,(1-A^{-2})/p))} > A^{d(p-1)/p -\widetilde \delta/4} \|f_0^n\|_{L_p(v)}. \end{equation} \tag{5.16} $$

Without loss of generality we assume that

$$ \begin{equation} \|f_0^n\|_{L_1(w)} = 1. \end{equation} \tag{5.17} $$

Since $(0,0)$ is $(1/n)$-flat for $f^n$,

$$ \begin{equation} \|f_3^n\|_{L_1(\operatorname{H}[w](\,\cdot\,,1-A^{-6}))} \leqslant 2, \end{equation} \tag{5.18} $$

and, by Remark 13, we also have

$$ \begin{equation*} \|f_2^n\|_{\operatorname{Lip}(B_R(0))} \leqslant L_R \end{equation*} \notag $$

for some fixed constants $L_R$ (these constants do not depend on $n$; they certainly depend on $A$ since

$$ \begin{equation*} f_2^n = \operatorname{H}[f_3^n](\,\cdot\,,A^{-4} - A^{-6}) \quad\text{and}\quad s = A^{-4} - A^{-6} \end{equation*} \notag $$

in the terminology of Remark 13). By (5.18),

$$ \begin{equation*} \|f_2^n\|_{L_1(\operatorname{H}[w](\,\cdot\,,1-A^{-4}))} \leqslant 2. \end{equation*} \notag $$

We apply Lemma 12 and extract a subsequence of the sequence $\{f_2^n\}_{n}$ that converges to a function $F$ in $L_1(\operatorname{H}[w](\,\cdot\,,1-A^{-4}))$. Without loss of generality we can assume that $\{f_2^n\}_n$ converges to $F$ itself. Since the topology of $L_1(\widetilde{w})$ is stronger than that of $\mathcal{S}'(\mathbb{R}^d,\mathbb{R}^\ell)$ (we use that $\widetilde{w}$ satisfies the bound (4.4) thanks to Lemma 7), we obtain $F\in \mathcal{W}$.

By Lemma 8 for $u := \widetilde{w}$ and $v := w$, we have

$$ \begin{equation*} f_0^{n} \to \operatorname{H}[F](\,\cdot\,,1-A^{-4}) \end{equation*} \notag $$

in $L_1(w)$. Then, in particular, our normalization (5.17) implies that $F\ne 0$. Therefore, the flatness assumption on the atom $(0,0)$ leads to

$$ \begin{equation*} \|F\|_{L_1(\operatorname{H}[w](\,\cdot\,,1-A^{-4}))} = \|\operatorname{H}[F](\,\cdot\,,1-A^{-4})\|_{L_1(w)}. \end{equation*} \notag $$

By Lemma 2, $F = a\otimes h$, where $a\in\mathbb{R}^\ell$ and $h\geqslant0$. Note that $h\in \mathbb{M}^\mathcal{W}$.

On the other hand,

$$ \begin{equation*} f_0^{n} \to \operatorname{H}[F](\,\cdot\,,1-A^{-4}) \quad \text{in } L_p(v) \end{equation*} \notag $$

and

$$ \begin{equation*} f_1^n \to \operatorname{H}[F](\,\cdot\,, A^{-2}-A^{-4}) \quad \text{in } L_p\biggl(\operatorname{H}[v]\biggl(\,\cdot\,,\frac{1-A^{-2}}p\biggr)\biggr) \end{equation*} \notag $$

by Lemma 8 since we have assumed that $\theta_3 \geqslant p\theta_1$ (as usual, we have used Lemma 7 several times here). Thus, (5.16) implies that

$$ \begin{equation*} \begin{aligned} \, &\|\operatorname{H}[h](\,\cdot\,, A^{-2}-A^{-4})\|_{L_p(\operatorname{H}[v](\,\cdot\,,(1-A^{-2})/p))} \\ &\qquad\geqslant A^{d(p-1)/p - \widetilde \delta/4} \|\operatorname{H}[h](\,\cdot\,,1-A^{-4})\|_{L_p(v)}, \end{aligned} \end{equation*} \notag $$

which contradicts Corollary 6 since $h \in \mathbb{M}^\mathcal{W}$ (we apply the corollary to the function $H(x,\theta):=\operatorname{H}[h](x,\theta-A^{-4})$, the weight $G:=v$ and $t:=A^{-2}$).

Theorem 5 is proved.

Corollary 10. By Lemmas 7 and 8,

$$ \begin{equation*} \|f_0\|_{L_p(v)} \lesssim_A \|f_2\|_{L_1(\operatorname{H}[w_{0,0}] (\,\cdot\,, 1-A^{-4}))}. \end{equation*} \notag $$

Thus, if all the assumptions of Theorem 5 are satisfied, then one can combine inequalities (5.3) and (5.4) into

$$ \begin{equation} \|f_1\|_{L_p(\operatorname{H}[v](\,\cdot\,,(1-A^{-2})/p))} \lesssim_A A^{d(p-1)/p - \widetilde \delta/4}\|f_0\|_{L_p(Q_{0,0})}. \end{equation} \tag{5.19} $$

Although the constant in this inequality does not depend on $\varepsilon$, the inequality becomes valid only if $\varepsilon$ is sufficiently small, and the required smallness of $\varepsilon$ can depend on $A$. By Lemma 7, (5.19) also implies that

$$ \begin{equation} \|f_1\|_{L_p(Q_{0,0})} \lesssim_A A^{d(p-1)/p - \widetilde \delta/4}\|f_0\|_{L_p(Q_{0,0})}. \end{equation} \tag{5.20} $$

§ 6. Horizontal interaction

Let $(k,j)$ be an atom. It is convenient to introduce the notation

$$ \begin{equation*} f_{k,j}^* = \|f_{k+2}\|_{L_1(\operatorname{H}[w_{k,j}](\,\cdot\,, A^{-2k} - A^{-2k-4}))}. \end{equation*} \notag $$

The quantity $f^*_{k,j}$ may informally be thought of as the size of the function $f_k$ on the cube $Q_{k,j}$ (or the size of the martingale $f$ on the atom $(k,j)$ at time $k$). Note that (for example, by Lemma 8 and the assumption that $f\in L_1$) the sequence $\{f_{k,j}^*\}_{j\in\mathbb{Z}^d}$ is bounded for any $k$, which ensures that the maximal functions introduced in the following definition are finite.

Definition 8. Let $\theta_4 > d$ be a number to be specified below. Consider the collection of maximal functions $\operatorname{M}_k^{\theta_4}\colon \mathbb{Z}^d\to \mathbb{R}^+$, $k \in\mathbb{N}\cup\{0\}$, defined as follows:

$$ \begin{equation*} \operatorname{M}^{\theta_4}_{k,j}[f] = \sup_{i\in\mathbb{Z}^d}(1+|i-j|)^{-\theta_4}f_{k,i}^*, \qquad j \in \mathbb{Z}^d. \end{equation*} \notag $$

Similar smoothing maximal functions were also used in [12]. This particular definition has been borrowed from [56].

We fix $k$, $\theta_4$ and $f$ provisionally. We suppress these parameters in our notation if this does not lead to ambiguity and simply write $f^*_j$ and $\operatorname{M}_j$. The maximal operator we have introduced generates an interesting oriented graph.

Definition 9. By the horizontal graph $\widetilde\Gamma_k$ we mean the following oriented graph. Fix some number $\lambda > 1$ to be specified below (we require $\lambda$ to be close to one). The set of vertices of $\widetilde\Gamma_k$ is the lattice $\mathbb{Z}^d$. For each point $j$, we find some point $\vec{j} \in \mathbb{Z}^d$ such that

$$ \begin{equation*} \operatorname{M}_{k,j}[f] \leqslant \lambda (1+|\vec{j} - j|)^{-\theta_4}f^*_{k,\vec j}\,. \end{equation*} \notag $$

If $\vec j = j$, then we do nothing. Otherwise we draw an arrow from $\vec j$ to $j$.

Note that each vertex has at most one incoming arrow. Informally, an arrow $\vec j \to j$ signifies that the point $\vec j$ dominates $j$ in the sense that we can estimate the quantity $f_j^*$ uniformly in terms of $f_{\vec j}^*$.

Lemma 13. If $\lambda$ is sufficiently close to one (depending on $\theta_4$ only and independent of $f$), then there are no oriented paths of length $2$ in $\widetilde\Gamma_k$.

Proof. Assume the contrary: let there be a path of length $2$. Without loss of generality, we can assume that the path is $j\to 0 \to i$, where $i\ne 0$ and $j\ne 0$ by construction. Then

$$ \begin{equation*} \operatorname{M}_i \leqslant \lambda (1+|i|)^{-\theta_4}f_0^* \quad\text{and}\quad \operatorname{M}_0\leqslant \lambda (1+|j|)^{-\theta_4}f_j^*, \end{equation*} \notag $$

which leads to the inequality

$$ \begin{equation*} \operatorname{M}_i \leqslant \lambda^2 \bigl((1+|i|)(1+|j|)\bigr)^{-\theta_4} f_j^*. \end{equation*} \notag $$

Combining this with the definition of $\operatorname{M}_i$ and noting that $f_j^* > 0$, we arrive at

$$ \begin{equation*} (1+|i-j|)^{-\theta_4} \leqslant \lambda^2\bigl((1+|i|)(1+|j|)\bigr)^{-\theta_4}, \end{equation*} \notag $$

which is equivalent to

$$ \begin{equation*} 1+|i-j| \geqslant \lambda^{-2/\theta_4} (1+|i| + |j| + |i|\,|j|). \end{equation*} \notag $$

We note that $|i||j| \geqslant \frac13(1+|i|+|j|)$ for any $i,j \in \mathbb{Z}^d\setminus \{0\}$, so that

$$ \begin{equation*} 1+|i-j| \geqslant \frac43\lambda^{-2/\theta_4}(1+|i| + |j|), \end{equation*} \notag $$

which is definitely false if $\lambda <(4/3)^{\theta_4/2}$.

The lemma is proved.
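
The two elementary inequalities used in the proof can be checked numerically on a finite box. The sketch below is only an illustration, not part of the argument: it assumes that $|\cdot|$ is the Euclidean norm, restricts $i$ and $j$ to a box of a chosen radius, and uses the value of $\lambda$ from (6.1); the function names are hypothetical.

```python
import itertools
import math


def lam(theta4: float) -> float:
    """The value of lambda fixed in (6.1)."""
    return min(2.0, (1.0 + (4.0 / 3.0) ** (theta4 / 2.0)) / 2.0)


def path_of_length_two_is_impossible(d: int, theta4: float, radius: int) -> bool:
    """Brute-force check over nonzero i, j in the box [-radius, radius]^d.

    Assumption of this sketch: |.| is the Euclidean norm on Z^d.
    Returns True if, for every pair (i, j),
      (a) |i||j| >= (1 + |i| + |j|) / 3, and
      (b) the inequality forced by a path j -> 0 -> i,
          1 + |i - j| >= lambda^(-2/theta4) (1 + |i|)(1 + |j|),
          fails.
    """
    scale = lam(theta4) ** (-2.0 / theta4)
    origin = (0,) * d
    box = [
        point
        for point in itertools.product(range(-radius, radius + 1), repeat=d)
        if point != origin
    ]
    for i in box:
        norm_i = math.dist(i, origin)
        for j in box:
            norm_j = math.dist(j, origin)
            # (a) the elementary inequality (small tolerance for rounding)
            if norm_i * norm_j < (1.0 + norm_i + norm_j) / 3.0 - 1e-12:
                return False
            # (b) the inequality derived from a two-step path must be violated
            if 1.0 + math.dist(i, j) >= scale * (1.0 + norm_i) * (1.0 + norm_j):
                return False
    return True


if __name__ == "__main__":
    # d = 2 with theta4 = d + 2, the choice made later in the proof of Theorem 2
    print(path_of_length_two_is_impossible(2, 4.0, 4))
```

A finite check of this kind proves nothing about the whole lattice; it only confirms the arithmetic of the proof on the sampled range.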

In particular, there are no pairs of arrows $i\to j$ and $j \to i$ in $\widetilde\Gamma_k$, so this is indeed an oriented graph. We fix

$$ \begin{equation} \lambda = \min\biggl(2, \frac{1+ (4/3)^{\theta_4/2}}{2}\biggr). \end{equation} \tag{6.1} $$

Definition 10. Let $K>1$ be a real number. We say that an atom $(k,j)$ is $K$-saturated if

$$ \begin{equation*} \operatorname{M}_{k,j}^{\theta_4}[f] \leqslant K f_{k,j}^*. \end{equation*} \notag $$

Lemma 14. If a vertex $j$ does not have an incoming arrow in $\widetilde\Gamma_k$, then the atom $(k,j)$ is $2$-saturated. If $(k,j)$ is not $2$-saturated, then it has an incoming arrow.

Proof. Let us prove the first claim. By construction, if $(k,j)$ does not have an incoming arrow, then

$$ \begin{equation*} \operatorname{M}_j \leqslant \lambda f_j^* \leqslant 2f_j^*, \end{equation*} \notag $$

which means that $(k,j)$ is $2$-saturated.

The second claim is the contrapositive of the first.

The lemma is proved.

Lemmas 13 and 14 imply the following statement.

Corollary 11. If a vertex in $\widetilde{\Gamma}_k$ has an outgoing arrow, then the corresponding atom is $2$-saturated.
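
The construction of the horizontal graph and the statements of Lemmas 13, 14 and Corollary 11 can be illustrated by a small simulation. This is a sketch under explicit assumptions: the lattice $\mathbb{Z}^d$ is replaced by a finite box, the quantities $f^*_j$ are replaced by random positive numbers, $|\cdot|$ is taken to be the Euclidean norm, and the dominating point $\vec j$ is picked at random among all admissible points. All function names are hypothetical.

```python
import itertools
import math
import random


def lam(theta4):
    """The value of lambda fixed in (6.1)."""
    return min(2.0, (1.0 + (4.0 / 3.0) ** (theta4 / 2.0)) / 2.0)


def random_sizes(radius, d, rng):
    """Random positive stand-ins for f*_j on the box [-radius, radius]^d."""
    points = itertools.product(range(-radius, radius + 1), repeat=d)
    return {point: rng.uniform(0.01, 1.0) for point in points}


def build_horizontal_graph(fstar, theta4, rng):
    """Return (maximal, dominator) for the finite model of Definitions 8 and 9.

    maximal[j] is the supremum from Definition 8 taken over the finite box;
    dominator[j] is a point vec_j with M_j <= lambda (1+|vec_j - j|)^(-theta4) f*_{vec_j}.
    An arrow dominator[j] -> j is present whenever dominator[j] != j.
    """
    lam_value = lam(theta4)
    points = list(fstar)

    def term(i, j):
        return (1.0 + math.dist(i, j)) ** (-theta4) * fstar[i]

    maximal, dominator = {}, {}
    for j in points:
        maximal[j] = max(term(i, j) for i in points)
        # the maximizer is always admissible because lambda > 1
        admissible = [i for i in points if maximal[j] <= lam_value * term(i, j)]
        dominator[j] = rng.choice(admissible)
    return maximal, dominator


def has_path_of_length_two(dominator):
    """True if some source of an arrow has an incoming arrow itself."""
    return any(
        source != j and dominator[source] != source
        for j, source in dominator.items()
    )


if __name__ == "__main__":
    rng = random.Random(0)
    sizes = random_sizes(3, 2, rng)
    maximal, dominator = build_horizontal_graph(sizes, 4.0, rng)
    print(has_path_of_length_two(dominator))
```

In this finite model the argument of Lemma 13 applies verbatim, so no path of length two should appear, and every vertex with an outgoing arrow has no incoming one and is therefore $2$-saturated in the sense of Definition 10.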

Recall the weight $u$ defined in (5.1).

Lemma 15. Let $\theta_2 > \theta_4 + d$. Then there exists a constant $C$ depending on $\theta_1$, $\theta_2$ and $\theta_4$ only such that if $(0,0)$ is $2$-saturated, then it fulfills the concentration assumption

$$ \begin{equation*} \|f_{2}\|_{L_1(u)} \leqslant Cf^*_{0,0}. \end{equation*} \notag $$

Proof. It suffices to prove that

$$ \begin{equation*} \|f_{2}\|_{L_1(u)} \leqslant \frac{C}{2}\operatorname{M}_{0,0}^{\theta_4}[f]. \end{equation*} \notag $$

It remains to write several inequalities:

$$ \begin{equation*} \begin{aligned} \, &\int_{\mathbb{R}^d}|f_2(x)|u(x)\,dx = \sum_{i\in\mathbb{Z}^d}\int_{Q_{0,i}} |f_2(x)|u(x)\,dx \\ &\!\qquad \stackrel{(5.1)}{\leqslant} \operatorname{s}[u](\sqrt{d}) \sum_{i\in\mathbb{Z}^d} (1+|i|)^{-\theta_2}\int_{Q_{0,i}}|f_2(x)|\,dx \\ &\quad\stackrel{\text{Lemma 3}}{\lesssim} \frac{\operatorname{s}[u](\sqrt{d})\operatorname{s}[w](\sqrt{d})}{\widetilde{w}(0)} \sum_{i\in\mathbb{Z}^d} (1+|i|)^{-\theta_2}f_{0,i}^* \\ &\qquad\leqslant \frac{\operatorname{s}[u](\sqrt{d})\operatorname{s}[w](\sqrt{d})}{\widetilde{w}(0)} \sum_{i\in\mathbb{Z}^d} (1+|i|)^{\theta_4-\theta_2}\operatorname{M}_{0,0}^{\theta_4}[f] \lesssim \operatorname{M}_{0,0}^{\theta_4}[f]; \end{aligned} \end{equation*} \notag $$

as usual, we are using the notation $\widetilde{w} = \operatorname{H}[w](\,\cdot\,,1 - A^{-4})$.

The lemma is proved.

Lemma 16. Let the atom $(k,i)$ be subordinate to $(k,j)$ in the sense that the arrow $j \to i$ exists in the graph $\widetilde\Gamma_k$. Then the inequality

$$ \begin{equation*} \|f_{k+1}\|_{L_p(Q_{k,i})} \lesssim_A A^{\alpha (k+1)}(1+|i-j|)^{-\theta_4}f^*_{k,j} \end{equation*} \notag $$

holds true; the constant in it depends neither on $f$, nor on $A$, nor on the particular choice of $i$ and $j$.

Proof. Without loss of generality we can assume that $k=0$. We apply Corollary 9:

$$ \begin{equation*} \begin{aligned} \, \|f_1\|_{L_p(Q_{0,i})} &\lesssim_A A^{\alpha}\|f_2\|_{L_1(\operatorname{H}[w_{0,i}](\,\cdot\,,1-A^{-4}))} \\ &= A^{\alpha}f^*_{0,i} \leqslant A^{\alpha}\operatorname{M}_{0,i}^{\theta_4}[f] \stackrel{j\to i}{\leqslant}\lambda A^{\alpha} (1+|i-j|)^{-\theta_4} f_{0,j}^*. \end{aligned} \end{equation*} \notag $$

The lemma is proved.

Theorem 6. Let $\delta_0\notin \mathbb{M}^\mathcal{W}$, and let the parameters $\theta_1$, $\theta_2$, $\theta_3$, $\theta_4$ and $p$ be fixed. Let these parameters satisfy

$$ \begin{equation*} p \leqslant 2, \qquad \theta_1 > \theta_2, \qquad \theta_3 \geqslant p\theta_1, \qquad \theta_2 > \theta_4 + d \quad\textit{and}\quad \theta_4 > d. \end{equation*} \notag $$

Also let $\theta_5$ be a fixed parameter such that $d < \theta_5 < \theta_4$. The following statement is true for any sufficiently large $A$. There exists $\varepsilon > 0$, possibly depending on $A$, and a positive constant $\delta^*$ independent of $A$ such that if the atom $(k,j)$ is $\varepsilon$-flat and $2$-saturated whereas the atom $(k,i)$ is subordinate to $(k,j)$ in the graph $\widetilde\Gamma_k$, then

$$ \begin{equation*} \|f_{k+1}\|_{L_p(Q_{k,i})} \lesssim_A A^{\alpha - \delta^*}(1+|i-j|)^{-\theta_5}\|f_{k}\|_{L_p(Q_{k,j})}. \end{equation*} \notag $$

Proof. Without loss of generality we can assume that $k=0$ and $j=0$. The desired inequality will follow from the two inequalities below (recall $\widetilde{\delta}$ from Corollary 6):

$$ \begin{equation} \|f_1\|_{L_p(Q_{0,i})} \lesssim_A A^{\alpha}(1+|i|)^{-\theta_4}\|f_0\|_{L_p(Q_{0,0})} \end{equation} \tag{6.2} $$

and

$$ \begin{equation} \|f_1\|_{L_p(Q_{0,i})} \lesssim_A A^{\alpha - \widetilde \delta/4} (1+|i|)^{\theta_3/p}\|f_0\|_{L_p(Q_{0,0})}. \end{equation} \tag{6.3} $$

According to Lemma 15, the atom $(0,0)$ fulfills the concentration condition (5.2). Thus, the application of Theorem 5 and Corollary 10 to the atom $(0,0)$ is legal.

Inequality (6.2) is a consequence of Lemma 16 and (5.4).

Let us prove (6.3):

$$ \begin{equation*} \begin{aligned} \, \|f_1\|_{L_p(Q_{0,i})}^p &\stackrel{\text{Lemma 7}}{\lesssim_A} \operatorname{s}[v](\sqrt{d})(1+|i|)^{\theta_3}\int_{Q_{0,i}}|f_1(x)|^p\operatorname{H}[v] \biggl(x,\frac{1 - A^{-2}}{p}\biggr)\,dx \\ &\!\quad \leqslant \operatorname{s}[v](\sqrt{d}) (1+|i|)^{\theta_3} \|f_1\|_{L_p(\operatorname{H}[v](\,\cdot\,,(1 - A^{-2})/p))}^p \\ &\ \stackrel{(5.19)}{\lesssim_A} A^{p(\alpha -\widetilde \delta/4)} (1+|i|)^{\theta_3}\|f_0\|_{L_p(Q_{0,0})}^p. \end{aligned} \end{equation*} \notag $$

The theorem is proved.
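
One possible way to see how (6.2) and (6.3) combine into the claim of the theorem is a geometric-mean interpolation. The parameter $\tau$ below is introduced for this sketch only and is not notation from the text. For $\tau\in(0,1)$, raising (6.2) to the power $1-\tau$ and (6.3) to the power $\tau$ and multiplying the results gives

$$ \begin{equation*} \|f_1\|_{L_p(Q_{0,i})} \lesssim_A A^{\alpha - \tau\widetilde \delta/4}(1+|i|)^{-(1-\tau)\theta_4 + \tau\theta_3/p}\|f_0\|_{L_p(Q_{0,0})}. \end{equation*} \notag $$

The exponent of $1+|i|$ does not exceed $-\theta_5$ precisely when

$$ \begin{equation*} \tau \leqslant \frac{\theta_4-\theta_5}{\theta_4+\theta_3/p}, \end{equation*} \notag $$

and the right-hand side lies in $(0,1)$ because $\theta_5<\theta_4$. Taking $\tau$ equal to this bound, one may set $\delta^* = \tau\widetilde{\delta}/4$, which is independent of $A$ provided $\widetilde{\delta}$ is.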

§ 7. Vertical interaction and control of flat atoms

Now we introduce a graph $\Gamma$ that expresses the vertical domination of atoms. In this graph arrows always go down, that is, from an atom $(k,j)$ to $(k+1,j')$.

Definition 11. The set of vertices of a vertical graph $\Gamma$ is the set of all $\varepsilon$-flat atoms. We draw an arrow from $(k,j)$ to $(k+1,j')$ if $(k,j)$ is $2$-saturated and either $Q_{k+1,j'} \subset Q_{k,j}$ or $Q_{k+1,j'}\subset Q_{k,i}$, $Q_{k,i}$ is not $2$-saturated, and $j\to i$ in $\widetilde\Gamma_k$.

It will be important for our considerations that any two cubes $Q_{k,j}$ and $Q_{k',j'}$ are either disjoint up to a set of measure zero, or one of them contains the other (this follows from the assumption that $A$ is odd).

Remark 14. Note that $\Gamma$ is a forest in the sense that it is a disjoint union of maximal by inclusion oriented trees $\mathcal{T}_1,\mathcal{T}_2,\dots$. Indeed, there are no nonoriented cycles in $\Gamma$ since every vertex has at most one incoming arrow and all arrows ‘go down’ (that is, from a vertex $(k,j)$ to $(k+1,j')$). We denote the atom corresponding to the root of $\mathcal{T}_q$, $q\in \mathbb{N}$, by $(k_q,j_q)$.

Remark 15. By Definition 11 and Corollary 11, only $2$-saturated atoms can have outgoing arrows in $\Gamma$. Thus, in a tree $\mathcal{T}_q$ only leaves can fail to be $2$-saturated.

In Lemma 17 and Corollary 12 we assume that $\varepsilon$ is sufficiently small (depending on $A$).

Lemma 17. Assume that $\delta_0\notin \mathbb{M}^\mathcal{W}$. Let $(k,j)$ be a flat atom and let $\{(k+1,i)\}_{i\in J}$ be all its kids in $\Gamma$. Then

$$ \begin{equation*} \|f_{k+1}\|_{L_p(\bigcup_{i\in J} Q_{k+1,i})} \lesssim_A A^{\alpha - \delta^*}\|f_k\|_{L_p(Q_{k,j})}. \end{equation*} \notag $$

Proof. Without loss of generality we can assume that $k=0$ and $j=0$. By construction

$$ \begin{equation*} \bigcup_{i\in J} Q_{1,i} \subset \Omega_{0,0} := Q_{0,0} \cup \biggl(\bigcup_{j\colon(0,0)\xrightarrow{\widetilde{\Gamma}_0} (0,j)}Q_{0,j}\biggr). \end{equation*} \notag $$

By Remark 15, $(0,0)$ is $2$-saturated (otherwise it has no kids in $\Gamma$ and there is nothing to prove), so it is legal to apply Theorems 5 and 6. Then

$$ \begin{equation*} \begin{aligned} \, \|f_1\|^p_{L_p(\bigcup_{i\in J} Q_{1,i})} &\leqslant \|f_1\|^p_{L_p(\Omega_{0,0})} \\ &\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\stackrel{\text{(5.20), Theorem 6}}{\lesssim_A} A^{p(\alpha - \delta^*)} \biggl(1+ \sum_{j\in \mathbb{Z}^d} (1+|j|)^{-p\theta_5}\biggr) \|f_0\|^p_{L_p(Q_{0,0})} \\ &\!\lesssim_A A^{p(\alpha - \delta^*)}\|f_0\|^p_{L_p(Q_{0,0})}. \end{aligned} \end{equation*} \notag $$

The lemma is proved.

Let $\delta^{**} = \delta^*/2$. The next two corollaries are immediate consequences of Lemma 17 and do not need proofs.

Corollary 12. Assume that $\delta_0\notin \mathbb{M}^\mathcal{W}$. Let $(k,j)$ be a flat atom and let $\{(k+1,i)\}_{i\in J}$ be all its kids in $\Gamma$. Then

$$ \begin{equation} \|f_{k+1}\|_{L_p(\bigcup_{i\in J} Q_{k+1,i})} \leqslant A^{\alpha - \delta^{**}}\|f_k\|_{L_p(Q_{k,j})}, \end{equation} \tag{7.1} $$

provided $A$ is sufficiently large.
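
The passage from Lemma 17 to (7.1) is the usual absorption of a constant. As a sketch, read $\lesssim_A$ in Lemma 17 as $\leqslant C$ with a constant $C$ (a name used only here) that does not depend on $A$, in line with the statement of Lemma 16. Then

$$ \begin{equation*} CA^{\alpha-\delta^*} = \bigl(CA^{-\delta^*/2}\bigr)A^{\alpha-\delta^{**}} \leqslant A^{\alpha-\delta^{**}} \quad\text{whenever}\quad A \geqslant C^{2/\delta^*}. \end{equation*} \notag $$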

Now we fix $A$ so large that (7.1) holds true. We fix $\varepsilon$ to be as small as prescribed by Theorems 5 and 6 for our particular choice of $A$.

Corollary 13. Assume that $\delta_0\notin \mathbb{M}^\mathcal{W}$. Let $\mathcal{T}_q$ be a maximal by inclusion tree in $\Gamma$ and let $(k_q,j_q)$ be its root. Then, for any $N\in\mathbb{N}$,

$$ \begin{equation} \|f_{k_q+N}\|_{L_p(\bigcup_{(k_q+N,i)\in\mathcal{T}_q} Q_{k_q+N,i})} \leqslant A^{(\alpha - \delta^{**})N} \|f_{k_q}\|_{L_p(Q_{k_q,j_q})}. \end{equation} \tag{7.2} $$
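
The step from (7.1) to (7.2) can be sketched as an induction on $N$. Here $V_N$ (notation used only in this sketch) denotes the set of vertices of $\mathcal{T}_q$ on the level $k_q+N$, and $J(j)$ denotes the index set of the kids of $(k_q+N,j)$. Every vertex of $V_{N+1}$ is a kid of exactly one vertex of $V_N$, and the cubes of distinct atoms of the same level are disjoint up to sets of measure zero, so

$$ \begin{equation*} \begin{aligned} \, \|f_{k_q+N+1}\|^p_{L_p(\bigcup_{(k_q+N+1,i)\in V_{N+1}} Q_{k_q+N+1,i})} &= \sum_{(k_q+N,j)\in V_N}\|f_{k_q+N+1}\|^p_{L_p(\bigcup_{i\in J(j)} Q_{k_q+N+1,i})} \\ &\stackrel{(7.1)}{\leqslant} A^{p(\alpha-\delta^{**})}\sum_{(k_q+N,j)\in V_N}\|f_{k_q+N}\|^p_{L_p(Q_{k_q+N,j})} \\ &= A^{p(\alpha-\delta^{**})}\|f_{k_q+N}\|^p_{L_p(\bigcup_{(k_q+N,j)\in V_N} Q_{k_q+N,j})}. \end{aligned} \end{equation*} \notag $$

Iterating this bound $N$ times, starting from $V_0 = \{(k_q,j_q)\}$, gives (7.2).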

Let us investigate the roots of our trees.

Lemma 18. Let $(k_q,j_q)$ be the root of $\mathcal{T}_q$ and let $k_q \geqslant 1$. Then the atom $(k_q-1,j')$ such that $Q_{k_q,j_q}\subset Q_{k_q-1,j'}$ is either $\varepsilon$-convex or subordinate to an $\varepsilon$-convex atom in $\widetilde\Gamma_{k_q-1}$.

Proof. Assume that the atom $(k_q-1,j')$ is $\varepsilon$-flat, otherwise there is nothing to prove. To prove the lemma we need to show that $(k_q-1,j')$ is subordinate to a convex atom in $\widetilde\Gamma_{k_q-1}$. Note that $(k_q-1,j')$ is not $2$-saturated, because there is no arrow $(k_{q}-1,j')\to (k_q,j_q)$ in $\Gamma$. Therefore, by Lemma 14 this atom is subordinate to another atom $(k_q-1,\vec{j})$ in $\widetilde\Gamma_{k_q-1}$. Corollary 11 says that $(k_q-1,\vec{j})$ is $2$-saturated, and thus it cannot be $\varepsilon$-flat since there is no arrow $(k_q-1,\vec{j})\to (k_q,j_q)$ in $\Gamma$. Therefore, $(k_q-1,\vec{j})$ is convex, and the lemma is proved.

Recall that, since we have already fixed $A$, all the constants in our inequalities are now allowed to depend on $A$. However, we still care about the uniformity with respect to $N$.

Proposition 4. Assume that $\delta_0\notin \mathbb{M}^\mathcal{W}$. Let $\{\mathcal{T}_q\}_q$ be all the trees that start their development at the level $K \geqslant 1$, that is, such that $k_q = K$. Then, for any $N \in \mathbb{N}$,

$$ \begin{equation} \sum_q\|f_{K+N}\|_{L_p(\bigcup_{(K+N,i) \in \mathcal{T}_q} Q_{K+N,i})} \lesssim_N A^{(\alpha - \delta^{**})N + \alpha K} \bigl(\|f_{K+2}\|_{L_1} - \|f_{K-1}\|_{L_1}\bigr), \end{equation} \tag{7.3} $$

where the constant in this inequality is independent of $N$.

Proof. We rely upon Lemma 18 and analyze the two cases arising in that lemma separately. Let $\mathcal{T}_q$ be some tree with root $(K,j_q)$. Let $Q_{K,j_q} \subset Q_{K-1,j'}$.

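Consider the first case: $(K-1,j')$ is convex. In this case

$$ \begin{equation} \begin{aligned} \, \notag &\|f_{K+N}\|_{L_p(\bigcup_{(K+N,i) \in \mathcal{T}_q}Q_{K+N,i})} \stackrel{(7.2)}{\leqslant} A^{(\alpha - \delta^{**})N}\|f_{K}\|_{L_p(Q_{K,j_q})} \\ \notag &\qquad \stackrel{\text{Corollary 8}}{\lesssim} A^{(\alpha - \delta^{**})N+\alpha K}\|f_{K+1}\|_{L_1(\operatorname{H}[w_{K -1, j'}](\,\cdot\,,A^{-2K+2} - A^{-2K - 2}))} \\ \notag &\qquad \stackrel{\text{Lemma 1}}{\lesssim} A^{(\alpha - \delta^{**})N+\alpha K}\|f_{K+2}\|_{L_1(\operatorname{H}[w_{K -1, j'}](\,\cdot\,,A^{-2K+2} - A^{-2K - 4}))} \\ \notag &\qquad \stackrel{(K-1,j')\in \operatorname{Co}}{\lesssim} A^{(\alpha - \delta^{**})N+\alpha K} \\ &\qquad\qquad\quad\times \bigl(\|f_{K+2}\|_{L_1(\operatorname{H}[w_{K -1, j'}](\,\cdot\,,A^{-2K+2} - A^{-2K - 4}))} - \|f_{K-1}\|_{L_1(w_{K-1,j'})}\bigr). \end{aligned} \end{equation} \tag{7.4} $$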

Consider the second case: now let $(K-1,j')$ be subordinate to a convex atom $(K-1,\vec{j})$ in $\widetilde\Gamma_{K-1}$. In this case

$$ \begin{equation} \begin{aligned} \, \notag &\|f_{K+N}\|_{L_p(\bigcup_{(K+N,i) \in \mathcal{T}_q}Q_{K+N,i})} \stackrel{(7.2)}{\leqslant} A^{(\alpha - \delta^{**})N}\|f_{K}\|_{L_p(Q_{K,j_q})} \\ \notag &\,\qquad \stackrel{\text{Corollary 8}}{\lesssim} A^{(\alpha - \delta^{**})N+\alpha K}\|f_{K+1}\|_{L_1(\operatorname{H}[w_{K -1, j'}](\,\cdot\,,A^{-2K+2} - A^{-2K - 2}))} \\ \notag &\!\qquad\qquad=A^{(\alpha - \delta^{**})N+\alpha K} f_{K-1,j'}^* \\ \notag &\stackrel{(K-1,\vec{j})\stackrel{\widetilde \Gamma_{K-1}}{\xrightarrow{\hspace{0.5cm}}} (K-1,j')}{\lesssim} A^{(\alpha - \delta^{**})N+\alpha K} (1+|j'-\vec{j}|)^{-\theta_4} f_{K-1,\vec{j}}^* \\ \notag &\!\!\qquad\qquad =A^{(\alpha - \delta^{**})N+\alpha K}(1+|j'-\vec{j}|)^{-\theta_4}\|f_{K+1}\|_{L_1(\operatorname{H}[w_{K -1, \vec{j}}](\,\cdot\,,A^{-2K+2} - A^{-2K - 2}))} \\ \notag &\!\!\!\quad\qquad \stackrel{\text{Lemma 1}}{\lesssim}\! A^{(\alpha - \delta^{**})N+\alpha K}(1+|j'-\vec{j}|)^{-\theta_4}\|f_{K+2}\|_{L_1(\operatorname{H}[w_{K -1, \vec{j}}](\,\cdot\,,A^{-2K+2} - A^{-2K - 4}))} \\ \notag &\!\!\qquad \stackrel{(K-1,\vec{j})\in \operatorname{Co}}{\lesssim} A^{(\alpha - \delta^{**})N+\alpha K}(1+|j'-\vec{j}|)^{-\theta_4} \\ &\qquad\qquad\quad\times \bigl(\|f_{K+2}\|_{L_1(\operatorname{H}[w_{K -1, \vec{j}}](\,\cdot\,,A^{-2K+2} - A^{-2K - 4}))} - \|f_{K-1}\|_{L_1(w_{K-1,\vec{j}})}\bigr). \end{aligned} \end{equation} \tag{7.5} $$

We sum the estimates (7.4) and (7.5) over all trees $\mathcal{T}_q$ that have roots on the level $K$. On the left-hand side we obtain the quantity we want to estimate in (7.3). The quantity on the right is bounded by

$$ \begin{equation} \begin{aligned} \, \notag &A^{d+(\alpha - \delta^{**})N+\alpha K}\sum_{\vec{j}\in\mathbb{Z}^d}\sum_{j'\in\mathbb{Z}^d} (1+|\vec{j}-j'|)^{-\theta_4} \\ &\qquad\qquad\times \bigl(\|f_{K+2}\|_{L_1(\operatorname{H}[w_{K -1, \vec{j}}](\,\cdot\,,A^{-2K+2} - A^{-2K - 4}))} - \|f_{K-1}\|_{L_1(w_{K-1,\vec{j}})}\bigr), \end{aligned} \end{equation} \tag{7.6} $$

since any cube $Q_{K-1,j'}$ contains at most $A^d$ cubes of the next generation. We recall that $\theta_4 > d$, and thus the sum with respect to $j'$ is bounded by a constant. It remains to use that the weights $w_{K-1,\vec{j}}$ form a partition of unity to bound (7.6) by the right-hand side of (7.3).

Proposition 4 is proved.

Remark 16. In the case when $K = 0$ the inequality (7.3) is replaced with

$$ \begin{equation*} \sum_q\|f_{N}\|_{L_p(\bigcup_{(N,i) \in \mathcal{T}_q} Q_{N,i})} \lesssim_N A^{(\alpha - \delta^{**})N} \|f_{2}\|_{L_1}. \end{equation*} \notag $$

Proposition 5. Assume that $\delta_0\notin \mathbb{M}^\mathcal{W}$. Let $\{\mathcal{T}_q\}_q$ be all the trees that start their development at the level $K \geqslant 1$, that is, such that $k_q = K$. Then, for any $N > 0$,

$$ \begin{equation*} \|f_{K+N}\|_{L_{p,1}(\bigcup_{(K+N,i) \in \cup_q\mathcal{T}_q} Q_{K+N,i})} \lesssim_N A^{(\alpha - \delta^{***})N + \alpha K} \bigl(\|f_{K+2}\|_{L_1} - \|f_{K-1}\|_{L_1}\bigr), \end{equation*} \notag $$

where $\delta^{***} > 0$ is a fixed real number and the constant in the inequality does not depend on $N$. In the case when $K=0$ we have

$$ \begin{equation*} \|f_{N}\|_{L_{p,1}(\bigcup_{(N,i) \in \cup_q\mathcal{T}_q} Q_{N,i})} \lesssim_N A^{(\alpha - \delta^{***})N} \|f_{2}\|_{L_1}. \end{equation*} \notag $$

Proof. First, Lemma 6 enables us to derive from Proposition 4 and Remark 16 similar estimates in which the Lorentz norm is replaced with the $L_p$-norm (with ${\delta^{***} = \delta^{**}}$). Second, we note that all our combinatorial considerations (the definitions of atoms and constructions of graphs) do not depend on $p$ if we assume $A$ to be sufficiently large. Therefore, the trivial interpolation inequality

$$ \begin{equation*} \|g\|_{L_{p,1}}\lesssim \|g\|_{L_{p_1}}^{1/2} \|g\|_{L_{p_2}}^{1/2}, \end{equation*} \notag $$

where $p_1$ and $p_2$ are small perturbations of $p$ satisfying ${1}/{p_1} + {1}/{p_2} =2/p$, enables us to deduce the desired Lorentz bounds from the already obtained bounds on the $L_{p_1}$ and $L_{p_2}$ norms (we obtain that $\delta^{***}$ is the arithmetic mean of $\delta^{**}$ for $p_1$ and $\delta^{**}$ for $p_2$).

The proposition is proved.

The next corollary follows immediately from Proposition 5: one needs to compute the sum of a geometric series.

Corollary 14. Assume that $\delta_0\notin \mathbb{M}^\mathcal{W}$. Let $\{\mathcal{T}_q\}_q$ be all the trees that start their development at the level $K$. Then

$$ \begin{equation*} \begin{aligned} \, &\sum_{N \geqslant 0} A^{-\alpha (K+N)}\|f_{K+N}\|_{L_{p,1}(\bigcup_{(K+N,i) \in \cup_q\mathcal{T}_q} Q_{K+N,i})} \\ &\qquad\qquad\lesssim \bigl(\|f_{K+2}\|_{L_1} - \|f_{K-1}\|_{L_1}\bigr), \qquad K \geqslant 1, \end{aligned} \end{equation*} \notag $$

and

$$ \begin{equation*} \sum_{N \geqslant 0} A^{-\alpha N}\|f_{N}\|_{L_{p,1}(\bigcup_{(N,i) \in \cup_q\mathcal{T}_q} Q_{N,i})} \lesssim \|f_{2}\|_{L_1}. \end{equation*} \notag $$
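
The geometric series in question can be written out explicitly. Multiplying the power of $A$ on the right-hand side of Proposition 5 by the weight $A^{-\alpha(K+N)}$ cancels the factors $A^{\alpha N}$ and $A^{\alpha K}$, and since the constant in Proposition 5 does not depend on $N$, it remains to sum

$$ \begin{equation*} \sum_{N \geqslant 0} A^{-\alpha (K+N)}A^{(\alpha - \delta^{***})N + \alpha K} = \sum_{N \geqslant 0} A^{-\delta^{***}N} = \frac{1}{1 - A^{-\delta^{***}}} < \infty. \end{equation*} \notag $$

The case $K=0$ is the same with $\|f_2\|_{L_1}$ in place of the difference $\|f_{K+2}\|_{L_1} - \|f_{K-1}\|_{L_1}$.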

Proof of Theorem 2. By Remark 1 it suffices to consider the case when $p \leqslant 2$. By Remark 6 it suffices to prove inequality (2.4). We choose the parameters

$$ \begin{equation*} \theta_1 = 2d+4, \qquad \theta_2 = 2d+3, \qquad \theta_3 = 4d+9, \qquad \theta_4 = d+2 \quad\text{and}\quad \theta_5 = d+1, \end{equation*} \notag $$

and see that they fulfill all our previous requirements. This allows us to choose $\lambda$ (see (6.1)) and $A$ (this parameter is chosen to be sufficiently large for (7.1) to hold for $p$, $p_1$ and $p_2$ from the proof of Proposition 5). We also choose $\varepsilon$ as prescribed by Theorem 5 (with $C$ coming from Lemma 15) and Theorem 6. This provides us with the sets $\operatorname{Co}$ and $\operatorname{Fl}$ and the graphs $\{\widetilde\Gamma_k\}_k$ and $\Gamma$. By Lemma 6 it suffices to prove the estimates

$$ \begin{equation*} \sum_{k\geqslant 0}A^{-\alpha k}\|f_k\|_{L_{p,1}(\bigcup_{(k,j)\in \operatorname{Co}} Q_{k,j})}\lesssim \|f\|_{L_1} \end{equation*} \notag $$

and

$$ \begin{equation*} \sum_{k\geqslant 0}A^{-\alpha k}\|f_k\|_{L_{p,1}(\bigcup_{(k,j)\in \operatorname{Fl}} Q_{k,j})}\lesssim \|f\|_{L_1}. \end{equation*} \notag $$

The first inequality is established in Proposition 3. The second follows from Corollary 14 and formula (4.1) since any flat atom is a vertex in $\Gamma$:

$$ \begin{equation*} \begin{aligned} \, &\sum_{k\geqslant 0}A^{-\alpha k}\|f_k\|_{L_{p,1}(\bigcup_{(k,j)\in \operatorname{Fl}} Q_{k,j})} \\ &\quad\stackrel{\text{Lemma 6}}{\leqslant} \sum_{k\geqslant 0} A^{-\alpha k}\sum_{K \leqslant k} \|f_{k}\|_{L_{p,1}(\bigcup_{(k,j)\in \bigcup_{k_q = K} \mathcal{T}_q} Q_{k,j})} \\ &\qquad= \sum_{K \geqslant 0} \sum_{N \geqslant 0} A^{-\alpha (K+N)}\|f_{K+N}\|_{L_{p,1}(\bigcup_{(K+N,j)\in \bigcup_{k_q = K} \mathcal{T}_q} Q_{K+N,j})} \\ &\qquad\lesssim \|f_{2}\|_{L_1} + \sum_{K \geqslant 1}\bigl(\|f_{K+2}\|_{L_1} - \|f_{K-1}\|_{L_1}\bigr) \lesssim \|f\|_{L_1}. \end{aligned} \end{equation*} \notag $$

The theorem is proved.
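
That the parameters chosen in the proof of Theorem 2 meet the requirements of Theorem 6 is a matter of simple arithmetic. The following sketch checks it mechanically for a range of dimensions; the function names are hypothetical, and only the conditions listed in Theorem 6 (including the one on $\theta_5$) are tested.

```python
def theta_parameters(d):
    """The parameters chosen in the proof of Theorem 2, as functions of d."""
    return {
        "theta1": 2 * d + 4,
        "theta2": 2 * d + 3,
        "theta3": 4 * d + 9,
        "theta4": d + 2,
        "theta5": d + 1,
    }


def satisfies_theorem6_requirements(d, p):
    """Check the conditions on p and theta_1, ..., theta_5 listed in Theorem 6."""
    t = theta_parameters(d)
    return (
        p <= 2
        and t["theta1"] > t["theta2"]
        and t["theta3"] >= p * t["theta1"]
        and t["theta2"] > t["theta4"] + d
        and t["theta4"] > d
        and d < t["theta5"] < t["theta4"]
    )


if __name__ == "__main__":
    # The tightest case is p = 2, where theta3 - p * theta1 = 1.
    print(all(satisfies_theorem6_requirements(d, 2.0) for d in range(1, 11)))
```

The condition $\theta_3 \geqslant p\theta_1$ is the only one that involves $p$; with the above choice it reads $4d+9 \geqslant p(2d+4)$ and holds for every $p \leqslant 2$.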

Acknowledgements

I would like to express my gratitude to Rami Ayoush and Michal Wojciechowski for long and fruitful collaboration and for sharing their ideas with me. I also wish to thank Daniel Spector for discussions concerning this work.



Bibliography

1.	D. R. Adams and L. I. Hedberg, Function spaces and potential theory, Grundlehren Math. Wiss., 314, Springer-Verlag, Berlin, 1996, xii+366 pp.
2.	A. Alvino, “Sulla diseguaglianza di Sobolev in spazi di Lorentz”, Boll. Un. Mat. Ital. A (5), 14:1 (1977), 148–156
3.	A. Arroyo-Rabasa, G. De Philippis, J. Hirsch and F. Rindler, “Dimensional estimates and rectifiability for measures satisfying linear PDE constraints”, Geom. Funct. Anal., 29:3 (2019), 639–658
4.	R. Ayoush, D. M. Stolyarov and M. Wojciechowski, “Sobolev martingales”, Rev. Mat. Iberoam., 37:4 (2021), 1225–1246
5.	R. Ayoush and M. Wojciechowski, On dimension and regularity of bundle measures, arXiv: 1708.01458
6.	J. Bennett, A. Carbery and T. Tao, “On the multilinear restriction and Kakeya conjectures”, Acta Math., 196:2 (2006), 261–302
7.	O. V. Besov and V. P. Il'in, “An embedding theorem for a limiting exponent”, Mat. Zametki, 6:2 (1969), 129–138; English transl. in Math. Notes, 6:2 (1969), 537–542
8.	O. V. Besov, V. P. Il'in and S. M. Nikol'skii, Integral representations of functions and imbedding theorems, 2nd ed., rev. and compl., Nauka, Moscow, 1996, 480 pp.; English transl. of 1st ed., v. I, II, Scripta Series in Mathematics, V. H. Winston & Sons, Washington, DC; Halsted Press [John Wiley & Sons], New York–Toronto, ON–London, 1978, 1979, viii+345 pp., viii+311 pp.
9.	J. Bourgain, A Hardy inequality in Sobolev spaces, Vrije Univ., Brussels, 1981
10.	J. Bourgain and H. Brezis, “On the equation $\operatorname{div} Y = f$ and application to control of phases”, J. Amer. Math. Soc., 16:2 (2003), 393–426
11.	J. Bourgain and H. Brezis, “New estimates for the Laplacian, the div–curl, and related Hodge systems”, C. R. Math. Acad. Sci. Paris, 338:7 (2004), 539–543
12.	J. Bourgain and H. Brezis, “New estimates for elliptic equations and Hodge type systems”, J. Eur. Math. Soc. (JEMS), 9:2 (2007), 277–315
13.	J. Bourgain, H. Brezis and P. Mironescu, “$H^{1/2}$ maps with values into the circle: minimal connections, lifting, and the Ginzburg-Landau equation”, Publ. Math. Inst. Hautes Études Sci., 99 (2004), 1–115
14.	P. Bousquet and J. Van Schaftingen, “Hardy-Sobolev inequalities for vector fields and canceling linear differential operators”, Indiana Univ. Math. J., 63:5 (2014), 1419–1445
15.	S. Chanillo, J. Van Schaftingen and Po-Lam Yung, “Bourgain-Brezis inequalities on symmetric spaces of non-compact type”, J. Funct. Anal., 273:4 (2017), 1504–1547
16.	E. Gagliardo, “Ulteriori proprieta di alcune classi di funzioni in piu variabili”, Ricerche Mat., 8 (1959), 24–51
17.	F. Gmeineder, B. Raita and J. Van Schaftingen, “On limiting trace inequalities for vectorial differential operators”, Indiana Univ. Math. J., 70:5 (2021), 2133–2176
18.	L. Grafakos, Modern Fourier analysis, 2nd ed., Grad. Texts in Math., 250, Springer, New York, 2009, xvi+504 pp.
19.	G. H. Hardy and J. E. Littlewood, “Some new properties of Fourier constants”, Math. Ann., 97:1 (1927), 159–209
20.	F. Hernandez and D. Spector, Fractional integration and optimal estimates for elliptic systems, arXiv: 2008.05639
21.	S. Janson, “Characterizations of $H^1$ by singular integral transforms on martingales and $R^n$”, Math. Scand., 41:1 (1977), 140–152
22.	S. V. Kislyakov, “Sobolev imbedding operators and the nonisomorphism of certain Banach spaces”, Funktsional. Anal. i Prilozhen., 9:4 (1975), 22–27; English transl. in Funct. Anal. Appl., 9:4 (1975), 290–294
23.	S. V. Kislyakov and D. V. Maksimov, “An embedding theorem with anisotropy for vector fields”, Investigations on linear operators and function theory. Part 45, Zap. Nauchn. Sem. S.-Peterburg. Otdel. Mat. Inst. Steklov. (POMI), 456, St Petersburg Department of the Steklov Mathematical Institute, St Petersburg, 2017, 114–124; English transl. in J. Math. Sci. (N.Y.), 234:3 (2018), 343–349
24.	S. V. Kislyakov, D. V. Maximov and D. M. Stolyarov, “Differential expressions with mixed homogeneity and spaces of smooth functions they generate in arbitrary dimension”, J. Funct. Anal., 269:10 (2015), 3220–3263
25.	V. I. Kolyada, “On an embedding of Sobolev spaces”, Mat. Zametki, 54:3 (1993), 48–71; English transl. in Math. Notes, 54:3 (1993), 908–922
26.	V. I. Kolyada, “Embedding theorems for Sobolev and Hardy-Sobolev spaces and estimates of Fourier transforms”, Ann. Mat. Pura Appl. (4), 198:2 (2019), 615–637
27.	L. Lanzani and E. M. Stein, “A note on div curl inequalities”, Math. Res. Lett., 12:1 (2005), 57–61
28.	J. Lindenstrauss and A. Pełczyński, “Absolutely summing operators in $\mathscr L_p$-spaces and their applications”, Studia Math., 29:3 (1968), 275–326
29.	V. Maz'ya, “Bourgain-Brezis type inequality with explicit constants”, Interpolation theory and applications, Contemp. Math., 445, Amer. Math. Soc., Providence, RI, 2007, 247–252
30.	V. Maz'ya, “Estimates for differential operators of vector analysis involving $L^1$-norm”, J. Eur. Math. Soc. (JEMS), 12:1 (2010), 221–240
31.	V. Maz'ya, Sobolev spaces, Leningrad University Publishing House, Leningrad, 1985, 416 pp.; English transl., V. Maz'ya, Sobolev spaces, With applications to elliptic partial differential equations, 2nd rev. and augm. ed., Grundlehren Math. Wiss., 342, Springer, Heidelberg, 2011, xxviii+866 pp.
32.	L. Nirenberg, “On elliptic partial differential equations”, Ann. Scuola Norm. Sup. Pisa Cl. Sci. (3), 13:2 (1959), 115–162
33.	J. Peetre, New thoughts on Besov spaces, Duke Univ. Math. Ser., 1, Math. Department, Duke Univ., Durham, NC, 1976, vi+305 pp.
34.	A. Pełczyński and M. Wojciechowski, “Molecular decompositions and embedding theorems for vector-valued Sobolev spaces with gradient norm”, Studia Math., 107:1 (1993), 61–100
35.	S. Poornima, “An embedding theorem for the Sobolev space $W^{1,1}$”, Bull. Sci. Math. (2), 107:3 (1983), 253–259
36.	D. Preiss, “Geometry of measures in $\mathbf R^n$: distribution, rectifiability, and densities”, Ann. of Math. (2), 125:3 (1987), 537–643
37.	B. Raiţă, $L^1$-estimates for constant rank operators, arXiv: 1811.10057
38.	M. Roginskaya and M. Wojciechowski, “Singularity of vector valued measures in terms of Fourier transform”, J. Fourier Anal. Appl., 12:2 (2006), 213–223
39.	J. Van Schaftingen, “Estimates for $L^1$-vector fields”, C. R. Math. Acad. Sci. Paris, 339:3 (2004), 181–186
40.	J. Van Schaftingen, “A simple proof of an inequality of Bourgain, Brezis and Mironescu”, C. R. Math. Acad. Sci. Paris, 338:1 (2004), 23–26
41.	J. Van Schaftingen, “Limiting fractional and Lorentz space estimates of differential forms”, Proc. Amer. Math. Soc., 138:1 (2010), 235–240
42.	J. Van Schaftingen, “Limiting Sobolev inequalities for vector fields and canceling linear differential operators”, J. Eur. Math. Soc. (JEMS), 15:3 (2013), 877–921
43.	J. Van Schaftingen, “Limiting Bourgain-Brezis estimates for systems of linear differential equations: theme and variations”, J. Fixed Point Theory Appl., 15:2 (2014), 273–297
44.	S. K. Smirnov, “Decomposition of solenoidal vector charges into elementary solenoids and the structure of normal one-dimensional currents”, Algebra i Analiz, 5:4 (1993), 206–238; English transl. in St. Petersburg Math. J., 5:4 (1994), 841–867
45.	S. L. Sobolev, “On a theorem of functional analysis”, Mat. Sb., 4(46):3 (1938), 471–497; English transl. in Amer. Math. Soc. Transl. Ser. 2, 34, Amer. Math. Soc., Providence, RI, 1963, 39–68
46.	V. A. Solonnikov, “Inequalities for functions of the classes $\vec W_{p}(R^n)$”, Boundary-value problems of mathematical physics and related problems of function theory. Part 6, Zap. Nauchn. Sem. LOMI, 27, Nauka, Leningrad. Otdel., Leningrad, 1972, 194–210; English transl. in J. Soviet Math., 3 (1975), 549–564
47.	D. Spector, “New directions in harmonic analysis on $L^1$”, Nonlinear Anal., 192 (2020), 111685, 20 pp.
48.	D. Spector, “An optimal Sobolev embedding for $L^1$”, J. Funct. Anal., 279:3 (2020), 108559, 26 pp.
49.	D. Spector and J. Van Schaftingen, “Optimal embeddings into Lorentz spaces for some vector differential operators via Gagliardo's lemma”, Atti Accad. Naz. Lincei Rend. Lincei Mat. Appl., 30:3 (2019), 413–436
50.	E. M. Stein, Singular integrals and differentiability properties of functions, Princeton Math. Ser., 30, Princeton Univ. Press, Princeton, NJ, 1970, xiv+290 pp.
51.	D. Stolyarov, Dimension estimates for vectorial measures with restricted spectrum, arXiv: 2010.14961
52.	D. Stolyarov, Hardy-Littlewood-Sobolev inequality for $p=1$, arXiv: 2010.05297
53.	D. M. Stolyarov, “Weakly canceling operators and singular integrals”, Tr. Mat. Inst. Steklova, 312, Function Spaces, Approximation Theory, and Related Problems of Analysis (2021), 259–271; English transl. in Proc. Steklov Inst. Math., 312 (2021), 249–260
54.	D. M. Stolyarov and M. Wojciechowski, “Dimension of gradient measures”, C. R. Math. Acad. Sci. Paris, 352:10 (2014), 791–795
55.	M. J. Strauss, “Variations of Korn's and Sobolev's equalities”, Partial differential equations (Univ. California, Berkeley, CA 1971), Proc. Sympos. Pure Math., 23, Amer. Math. Soc., Providence, RI, 1973, 207–214
56.	T. Tao, Uchiyama's constructive proof of the Fefferman-Stein decomposition, 2007 https://terrytao.wordpress.com/2007/02/23/
57.	T. Tao, Symmetric functions in a fractional number of variables, and the multilinear Kakeya conjecture, 2019 https://terrytao.wordpress.com/2019/06/
58.	L. Tartar, “Imbedding theorems of Sobolev spaces into Lorentz spaces”, Boll. Unione Mat. Ital. Sez. B Artic. Ric. Mat. (8), 1:3 (1998), 479–500
59.	A. Uchiyama, “A constructive proof of the Fefferman-Stein decomposition of $\operatorname{BMO}(\mathbf R^n)$”, Acta Math., 148 (1982), 215–241

Citation: D. M. Stolyarov, “Hardy-Littlewood-Sobolev inequality for $p=1$”, Mat. Sb., 213:6 (2022), 125–174; Sb. Math., 213:6 (2022), 844–889

Citation in format AMSBIB

\Bibitem{Sto22}
\by D.~M.~Stolyarov
\paper Hardy-Littlewood-Sobolev inequality for $p=1$
\jour Mat. Sb.
\yr 2022
\vol 213
\issue 6
\pages 125--174
\mathnet{http://mi.mathnet.ru/sm9645}
\crossref{https://doi.org/10.4213/sm9645}
\mathscinet{http://mathscinet.ams.org/mathscinet-getitem?mr=4461456}
\adsnasa{https://adsabs.harvard.edu/cgi-bin/bib_query?2022SbMat.213..844S}
\transl
\jour Sb. Math.
\yr 2022
\vol 213
\issue 6
\pages 844--889
\crossref{https://doi.org/10.1070/SM9645}
\isi{https://gateway.webofknowledge.com/gateway/Gateway.cgi?GWVersion=2&SrcApp=Publons&SrcAuth=Publons_CEL&DestLinkType=FullRecord&DestApp=WOS_CPL&KeyUT=000992264800006}
\scopus{https://www.scopus.com/record/display.url?origin=inward&eid=2-s2.0-85165899741}

Linking options:

https://www.mathnet.ru/eng/sm9645

https://doi.org/10.1070/SM9645

https://www.mathnet.ru/eng/sm/v213/i6/p125
