1 Introduction

One particularly useful way to study many important operators in Harmonic Analysis is that of decomposing them into sums of simpler dyadic operators. An example of a recent striking result using this strategy is the proof of the sharp weighted estimate for the Hilbert transform by Petermichl [23]. This was a key step towards the full \(A_2\) theorem for general Calderón-Zygmund operators, finally proven by Hytönen in [9]. Of course there are many instances of this useful technique, but we will not try to give a thorough historical overview here.

The proof in [9] was a tour de force which was the culmination of many previous partial efforts by others, see [9] and the references therein. Hytönen not only proved the \(A_2\) theorem, but also showed that general Calderón-Zygmund operators can be represented as averages of certain simpler “Haar shifts” in the spirit of [23]. The sharp weighted bound then followed from the corresponding one for these simpler operators.

Later, Lerner gave a simplification of the proof of the \(A_2\) theorem in [15] which avoided most of the complicated machinery in [9]; it mainly relied on a general pointwise estimate for functions in terms of positive dyadic operators which had already been proven in [14]. This reduced the problem to a weighted estimate for positive dyadic shifts, which had already been shown in [12], see also [3] and [4]. More precisely, the proof of Lerner (essentially) gave the following pointwise estimate for general Calderón-Zygmund operators T: for every dyadic cube Q

$$\begin{aligned} |Tf(x)| \lesssim \sum _{m=0}^\infty 2^{-\delta m} \mathcal {A}^m_{\mathcal {S}}|f|(x) \quad \text {for a.e. } x\in Q, \end{aligned}$$
(1.1)

where \(\delta > 0\) depends on the operator T, \(\mathcal {S}\) are collections of dyadic cubes (belonging to the same dyadic grid for each fixed \(\mathcal {S}\)) which depend on f, T and m, and \(\mathcal {A}^m_{\mathcal {S}}\) are positive dyadic operators defined by

$$\begin{aligned} \mathcal {A}_{\mathcal {S}}^m f(x) = \sum _{Q \in \mathcal {S}} \langle f \rangle _{Q^{(m)}} \mathbbm {1}_Q(x), \end{aligned}$$

where \(Q^{(m)}\) denotes the mth dyadic parent of Q. Moreover, the collections \(\mathcal {S}\) in (1.1) are sparse in the usual sense: given \(0 < \eta < 1\), we say that a collection of cubes \(\mathcal {S}\) belonging to the same dyadic grid is \(\eta \)-sparse if for all cubes \(Q \in \mathcal {S}\) there exist measurable subsets \(E(Q) \subset Q\) with \(|E(Q)| \ge \eta |Q|\) and \(E(Q) \cap E(Q') = \emptyset \) unless \(Q = Q'\). A collection is called simply sparse if it is \(\frac{1}{2}\)-sparse.
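As a quick illustration of the definition (an example of ours, not taken from the references), take \(d = 1\) and the nested intervals \(Q_j = [0, 2^{-j})\), \(j \ge 0\), which all belong to the standard dyadic grid. Choosing \(E(Q_j)\) to be the right half of \(Q_j\) gives

$$\begin{aligned} E(Q_j) = [2^{-j-1}, 2^{-j}), \qquad |E(Q_j)| = \tfrac{1}{2}|Q_j|, \qquad E(Q_j) \cap E(Q_{j'}) = \emptyset \quad \text {for } j \ne j', \end{aligned}$$

so \(\{Q_j\}_{j \ge 0}\) is sparse even though its elements are nested rather than disjoint. By contrast, the collection of all dyadic subintervals of \([0,1)\) is not \(\eta \)-sparse for any \(\eta > 0\): pairwise disjoint sets \(E(Q) \subset [0,1)\) have total measure at most 1, while \(\sum _Q \eta |Q|\) is infinite because each generation contributes \(\eta \).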

From this pointwise estimate Lerner continues the proof by showing that bounding the operator norm of each \(\mathcal {A}^m_\mathcal {S}\) can be reduced to just estimating the operator norm of \(\mathcal {A}^0_{\mathcal {S}'}\) in the same space for all possible sparse collections \(\mathcal {S}'\). More precisely, he shows that

$$\begin{aligned} \Vert \mathcal {A}^m_{\mathcal {S}}f\Vert _{\mathbb {X}} \lesssim (m+1) \sup _{\mathscr {D}, \mathcal {S'}} \Vert \mathcal {A}^0_{\mathcal {S}'}f\Vert _{\mathbb {X}}, \end{aligned}$$
(1.2)

where the supremum is taken over all dyadic grids \(\mathscr {D}\) and all sparse collections \(\mathcal {S}' \subset \mathscr {D}\), and where \(\mathbb {X}\) is any Banach function space, in the sense of [1, Chapter 1].

It is at this point where the duality of \(\mathbb {X}\) is needed in the argument; the operators \(\mathcal {A}^m_\mathcal {S}\) do not lend themselves to Lerner’s pointwise formula, while their adjoints do. Consequently, the question of what to do when no duality is present was left open. Our main result answers this question by proving a stronger (though localized) statement: the operators \(\mathcal {A}^m_\mathcal {S}\) are actually pointwise bounded by positive dyadic 0-shifts:

Theorem A

Let P be a cube and \(\mathcal {S}\) a sparse collection of dyadic subcubes Q of P such that \(Q^{(m)} \subseteq P\). Then for all nonnegative integrable functions f on P there exists another sparse collection \(\mathcal {S}'\) of dyadic subcubes of P such that

$$\begin{aligned} \mathcal {A}^m_{\mathcal {S}} f(x) \lesssim (m+1) \mathcal {A}^0_{\mathcal {S'}}f(x) \quad \forall x \in P \end{aligned}$$
(1.3)

In fact, we prove Theorem A in a slightly more general setting: first, the statement is proven for a certain natural multilinear generalization of the operators \(\mathcal {A}^m_\mathcal {S}\). Second, the sparse collection \(\mathcal {S}\) is replaced by a more general Carleson sequence. The relevant details are given in the next section.

The novelty in our approach is two-fold. First, we directly attack the pointwise estimate for the operators \(\mathcal {A}^m\), instead of bounding their norm in various spaces. Second, in proving the pointwise bound we develop an algorithm that constructively selects those cubes which will form the family \(\mathcal {S}'\). This algorithm has “memory” in a certain sense: each iteration takes into account the previous steps, a feature which is crucial in our method to ensure that \(\mathcal {S}'\) is sparse.

As a corollary of Theorem A, we find an analogue of (1.1) for Calderón-Zygmund operators with more general moduli of continuity (see the next section for the precise definition). In particular, we obtain the following pointwise estimate for Calderón-Zygmund operators:

Corollary A.1

If P is a dyadic cube, f is an integrable function supported on P and T is a Calderón-Zygmund operator whose kernel has modulus of continuity \(\omega \), then

$$\begin{aligned} |Tf(x)| \lesssim \sum _{m=0}^\infty \omega (2^{-m}) (m+1) \mathcal {A}^0_{\mathcal {S}_m} |f|(x) \quad \text {for a.e. }x\in P, \end{aligned}$$
(1.4)

where \(\mathcal {S}_m\) are sparse collections belonging to at most \(3^d\) different dyadic grids.

Moreover, if we know that \(\omega \) satisfies the logarithmic Dini condition:

$$\begin{aligned} \int _0^1 \omega (t) \left( 1 + \log \left( \frac{1}{t} \right) \right) \frac{dt}{t} < \infty , \end{aligned}$$
(1.5)

then we can find sparse collections \(\{\mathcal {S}_1', \ldots , \mathcal {S}_{3^d}'\}\), belonging to possibly different dyadic grids, such that

$$\begin{aligned} |Tf(x)| \lesssim \sum _{i=1}^{3^d} \mathcal {A}^0_{\mathcal {S}_i'} |f|(x) \quad \text {for a.e. } x\in P. \end{aligned}$$
(1.6)

The factor m in (1.2) precluded a naive adaptation of the proof in [16] to an \(A_2\) theorem with the usual Dini condition:

$$\begin{aligned} \int _0^1 \omega (t) \frac{dt}{t} < \infty , \end{aligned}$$
(1.7)

since the sum

$$\begin{aligned} \sum _{m=0}^\infty \omega (2^{-m})(m+1) \simeq \int _0^1 \omega (t) \left( 1+\log \frac{1}{t} \right) \frac{dt}{t} \end{aligned}$$
(1.8)

could diverge for some moduli \(\omega \) satisfying only (1.7). Moreover, it was shown in [8] that the weak-type (1, 1) norm of the adjoints of the operators \(\mathcal {A}^m_\mathcal {S}\) was at least linear in m, even in the unweighted case, so using duality prevented an extension of this type. However, although our argument does not quite give an \(A_2\) theorem for Calderón-Zygmund operators satisfying the Dini condition (we still need (1.8) to be finite), our proof avoids the use of duality and the study of the adjoint operators \((\mathcal {A}_{\mathcal {S}}^m)^*\). It thus removes at least one of the obstructions to possible proofs of the \(A_2\) theorem with the Dini condition which follow this strategy. Hence, removing the linear factor of m in Theorem A remains as an open problem.
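To make the gap between (1.5) and (1.7) concrete, here is a simple example (ours, included only as an illustration). Consider \(\omega (t) = (1+\log (1/t))^{-2}\) for \(0 < t \le 1\), which is positive, nondecreasing and doubling. The substitution \(s = \log (1/t)\) gives

$$\begin{aligned} \int _0^1 \omega (t) \frac{dt}{t}&= \int _0^\infty \frac{ds}{(1+s)^2} = 1, \\ \int _0^1 \omega (t) \left( 1+\log \frac{1}{t} \right) \frac{dt}{t}&= \int _0^\infty \frac{ds}{1+s} = \infty . \end{aligned}$$

Equivalently, \(\omega (2^{-m}) = (1 + m\log 2)^{-2}\), so \(\sum _m \omega (2^{-m})\) converges while the terms of \(\sum _m \omega (2^{-m})(m+1)\) are comparable to \(1/(m+1)\) and the sum diverges. For a Hölder modulus \(\omega (t) = t^\delta \) with \(\delta > 0\) no such issue arises, since \(\sum _m 2^{-\delta m}(m+1) < \infty \).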

Apart from being interesting in its own right, a bound for Calderón-Zygmund operators by these sums of positive 0-shifts in cases where there is no duality has interesting applications, some of which we describe later. Before doing so, let us state a second corollary to Theorem A:

Corollary A.2

Let \(\Vert \cdot \Vert _{\mathbb {X}}\) be a function quasi-norm (see Sect. 2) and T a Calderón-Zygmund operator satisfying the logarithmic Dini condition, then

$$\begin{aligned} \Vert Tf\Vert _{\mathbb {X}} \lesssim \sup _{\mathscr {D}, \mathcal {S}} \Vert \mathcal {A}^0_\mathcal {S}|f|\Vert _{\mathbb {X}}, \end{aligned}$$
(1.9)

where the supremum is taken over all dyadic grids \(\mathscr {D}\) and all sparse collections \(\mathcal {S} \subset \mathscr {D}\).

We now describe two immediate applications of our result. First we can continue the program, initiated in [5] and extended in [21], which aims to extend the sharp weighted estimates for Calderón-Zygmund operators to their multilinear analogues (as in [6]). In particular we obtain

Theorem B

Let T be a multilinear Calderón-Zygmund operator. Suppose \(1 < p_1, \ldots , p_k < \infty \), \(\frac{1}{p} = \frac{1}{p_1} + \cdots + \frac{1}{p_k}\) and \(\vec {w} \in A_{\vec {P}}\). Then

$$\begin{aligned} \Vert T\vec {f}\Vert _{L^p(v_{\vec {w}})} \lesssim [\vec {w}]_{A_{\vec {P}}}^{\max \left( 1, \frac{p_1'}{p}, \ldots , \frac{p_k'}{p}\right) }\prod _{i=1}^k \Vert f_i\Vert _{L^{p_i}(w_i)}. \end{aligned}$$
(1.10)

The same theorem was proven in [21] but with the additional hypothesis that p had to be at least 1. The proof of this theorem is an application of the result in [21] which proved the same estimate (without the condition \(p \ge 1\)) but for a multilinear analogue of the operators \(\mathcal {A}^m_{\mathcal {S}}\), together with Theorem A. In fact, we will need a multilinear version of Theorem A which we state and prove in the next section.

Our second application is a sharp aperture weighted estimate for square functions which extends a result in [17]. In particular:

Theorem C

Let \(\alpha > 0\). Then the square function \(S_{\alpha ,\psi }\) for the cone in \(\mathbbm {R}^{d+1}_+\) of aperture \(\alpha \) and the standard kernel \(\psi \) satisfies

$$\begin{aligned} \Vert S_{\alpha ,\psi }f\Vert _{L^{p,\infty }(\mathbbm {R}^d,w)} \lesssim \alpha ^d [w]_{A_p}^{1/p} \Vert f\Vert _{L^p(\mathbbm {R}^d,w)} \quad \text {for }1 < p < 2 \end{aligned}$$

and

$$\begin{aligned} \Vert S_{\alpha ,\psi }f\Vert _{L^{2,\infty }(\mathbbm {R}^d,w)} \lesssim \alpha ^d [w]_{A_2}^{1/2} (1+\log [w]_{A_2}) \Vert f\Vert _{L^2(\mathbbm {R}^d,w)}. \end{aligned}$$
(1.11)

An analogous result was shown in [17] for \(2 < p< 3\):

$$\begin{aligned} \Vert S_{\alpha ,\psi }f\Vert _{L^{p,\infty }(\mathbbm {R}^d,w)} \lesssim \alpha ^d [w]_{A_p}^{1/2} (1+\log [w]_{A_p}) \Vert f\Vert _{L^p(\mathbbm {R}^d,w)}. \end{aligned}$$

The proof relies on the use of Lerner’s pointwise formula and previous results by Lacey and Scurry [13]. However, in [17] the requirement of \(p>2\) was necessary for the same reason why the proof of the multilinear weighted estimates required \(p \ge 1\) (a certain space had no satisfactory duality properties). Theorem A can be used in almost the same way as with the weighted multilinear estimates to prove Theorem C. Indeed, the proofs in [13, 17] reduce the problem to estimating certain discrete positive operators which can be seen to be particular instances of the positive multilinear m-shifts used in the proof of Theorem B.

As was noted in [13], estimate (1.11) can be seen as an analogue of the result in [19] establishing the endpoint weighted weak-type estimate for Calderón-Zygmund operators

$$\begin{aligned} \Vert Tf\Vert _{L^{1,\infty }(w)} \lesssim [w]_{A_1} (1+\log [w]_{A_1}) \Vert f\Vert _{L^1(w)}. \end{aligned}$$

See also [22] for a similar estimate from below and more information on the sharpness of this estimate, known as the weak \(A_1\) conjecture. In this direction, it seems reasonable that Lacey and Scurry’s proof in [13] could be adapted to the multilinear setting, however we will not pursue this problem here.

Finally, as a third application of our results, it is possible to give a more direct proof of the result in [10] for the q-variation of Calderón-Zygmund operators satisfying the logarithmic Dini condition by using the pointwise estimate analogous to (1.1) in [10] and then applying Theorem A. However, we will not pursue this line of argument either.

Shortly before uploading this preprint, Andrei Lerner kindly communicated to the authors that he, jointly with Fedor Nazarov, had independently proven a theorem very similar to Corollary A.1 [18]. Though the hypotheses are the same, their result differs from the one in this note in that we give a localized pointwise estimate while their pointwise estimate is valid for all of \(\mathbbm {R}^d\). However, our result seems to be as powerful in the applications.

2 Pointwise domination

The goal of this section is the proof of Theorem A and its consequences as stated in the introduction. We will prove the result at the level of generality of multilinear operators. Given a cube \(P_0\) in \(\mathbb {R}^d\), we will denote by \(\mathscr {D}(P_0)\) the dyadic lattice obtained by successive dyadic subdivisions of \(P_0\). By a dyadic grid we will denote any dyadic lattice composed of cubes with sides parallel to the axes. A k-linear positive dyadic shift of complexity m is an operator of the form

$$\begin{aligned} \mathcal {A}_{P_0, \alpha }^m \vec {f}(x) = \mathcal {A}_{P_0, \alpha }^m (f_1,f_2, \ldots , f_k)(x) := \sum _{ \begin{array}{c}\scriptstyle {Q \in \mathscr {D}(P_0)}\\ \scriptstyle {Q^{(m)} \subseteq P_0}\end{array} } \alpha _Q \left( \prod _{i=1}^k \langle f_i \rangle _{Q^{(m)}} \right) \mathbbm {1}_Q(x). \end{aligned}$$

As a first step towards the proof of Theorem A, it is convenient to separate the scales of (or slice) \(\mathcal {A}^m_{P_0,\alpha }\) as follows:

$$\begin{aligned} \mathcal {A}^m_{P_0,\alpha }\vec {f}(x)&= \sum _{n=0}^{m-1} \sum _{j=1}^\infty \sum _{Q \in \mathscr {D}_{jm+n}(P_0)} \alpha _Q \left( \prod _{i=1}^k \langle f_i \rangle _{Q^{(m)}} \right) \mathbbm {1}_Q(x) \\&=: \sum _{n=0}^{m-1} \mathcal {A}_{P_0,\alpha }^{m;n} \vec {f}(x). \end{aligned}$$

Note that \(\mathscr {D}_k(P_0)\) denotes the kth generation of the lattice \(\mathscr {D}(P_0)\). Now we rewrite \(\mathcal {A}_{P_0,\alpha }^{m;n}\) as a sum of disjointly supported operators of the form \(\mathcal {A}_{P,\alpha }^{m;0}\). Indeed,

$$\begin{aligned} \mathcal {A}_{P_0,\alpha }^{m;n} \vec {f}(x)&= \sum _{j=1}^\infty \sum _{Q \in \mathscr {D}_{jm+n}(P_0)} \alpha _Q \left( \prod _{i=1}^k \langle f_i \rangle _{Q^{(m)}} \right) \mathbbm {1}_Q(x) \\&= \sum _{P \in \mathscr {D}_n(P_0)} \sum _{j=1}^\infty \sum _{Q \in \mathscr {D}_{jm}(P)} \alpha _Q \left( \prod _{i=1}^k \langle f_i \rangle _{Q^{(m)}} \right) \mathbbm {1}_Q(x) \\&= \sum _{P \in \mathscr {D}_n(P_0)} \mathcal {A}_{P,\alpha }^{m;0} \vec {f}(x), \end{aligned}$$

which leads to the expression

$$\begin{aligned} \mathcal {A}^m_{P_0,\alpha }\vec {f}(x)&= \sum _{n=0}^{m-1} \sum _{P \in \mathscr {D}_n(P_0)} \mathcal {A}^{m;0}_{P,\alpha } \vec {f}(x). \end{aligned}$$
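The slicing can be checked by counting generations; the following is only a sanity check, with the letter g (our notation) standing for the generation of a cube. A cube \(Q \in \mathscr {D}_g(P_0)\) satisfies \(Q^{(m)} \subseteq P_0\) exactly when \(g \ge m\), and every such g can be written uniquely as \(g = jm + n\) with \(j \ge 1\) and \(0 \le n \le m-1\). For instance, when \(m = 3\) the three slices collect the generations

$$\begin{aligned} n = 0:&\quad g \in \{3, 6, 9, \ldots \}, \\ n = 1:&\quad g \in \{4, 7, 10, \ldots \}, \\ n = 2:&\quad g \in \{5, 8, 11, \ldots \}. \end{aligned}$$

Moreover, each \(Q \in \mathscr {D}_{jm+n}(P_0)\) is contained in exactly one \(P \in \mathscr {D}_n(P_0)\), and relative to that P it has generation jm, which is the reindexing used to pass to the operators \(\mathcal {A}^{m;0}_{P,\alpha }\).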

We say that a sequence \(\{\alpha _Q\}_{Q \in \mathscr {D}(P_0)}\) is Carleson if its Carleson constant \(\Vert \alpha \Vert _{\text {Car}(P_0)} < \infty \), where

$$\begin{aligned} \Vert \alpha \Vert _{\text {Car}(P_0)} = \sup _{P \in \mathscr {D}(P_0)} \frac{1}{|P|} \sum _{Q \in \mathscr {D}(P)} \alpha _Q|Q|. \end{aligned}$$

The following intermediate step is the key to our approach:

Proposition 2.1

Let \(m\ge 1\) and \(\alpha \) be a Carleson sequence. For integrable functions \(f_1, \ldots , f_k \ge 0\) on \(P_0\) there exists a sparse collection \(\mathcal {S}\) of cubes in \(\mathscr {D}(P_0)\) such that

$$\begin{aligned} \mathcal {A}^{m;0}_{P_0,\alpha } \vec {f}(x) \le C_1\Vert \alpha \Vert _{{\text {Car}}(P_0)} \sum _{Q \in \mathcal {S}} \left( \prod _{i=1}^k\langle f_i \rangle _Q \right) \mathbbm {1}_Q(x), \end{aligned}$$

where \(C_1\) only depends on k and d, and in particular is independent of m.

To prove Proposition 2.1 we will proceed in three steps: we will first construct the collection \(\mathcal {S}\), then show that we have the required pointwise bound, and finally that \(\mathcal {S}\) is sparse. By homogeneity, we will assume that \(\Vert \alpha \Vert _{{\text {Car}}(P_0)} = 1\). Also, we will assume that the sequence \(\alpha \) is finite, but our constants will be independent of the number of elements in the sequence.

Let \(\Delta _{P_0} = 0\) and, for each \(Q \in \mathscr {D}_{mj}(P_0)\) with \(j \ge 0\), define the sequence \(\{\gamma _Q\}_Q\) by

$$\begin{aligned} \gamma _Q = \max _{R \in \mathscr {D}_m(Q)} \alpha _R. \end{aligned}$$

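For each \(Q \in \mathscr {D}_{mj}(P_0)\) with \(j \ge 0\), we will inductively define the quantities \(\Delta _Q\) and \(\beta _Q\) as follows:

$$\begin{aligned} \beta _Q = {\left\{ \begin{array}{ll} 2^{2(k+1)}C_W &{}\quad \text {if } \left( \prod _{i=1}^k \langle f_i \rangle _Q \right) \gamma _Q - \Delta _Q > 0, \\ 0 &{}\quad \text {otherwise,} \end{array}\right. } \end{aligned}$$

where \(C_W\) is the boundedness constant of the unweighted endpoint weak-type of the operators \(\mathcal {A}^m\) proved in Theorem 4.1 in the Appendix. Also, for every \(R \in \mathscr {D}_m(Q)\) we define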

$$\begin{aligned} \Delta _R = \Delta _Q + (\beta _Q - \alpha _R)\left( \prod _{i=1}^k \langle f_i \rangle _Q \right) . \end{aligned}$$

Note that the definition only applies to cubes in \(\mathscr {D}_{mj}(P_0)\) for some j. For all other cubes in \(\mathscr {D}(P_0)\), we set \(\beta _Q= \Delta _Q = 0\). The collection \(\mathcal {S}\) consists of those cubes \(Q \in \mathscr {D}(P_0)\) for which \(\beta _Q \ne 0\). Note that, since \(2^{2(k+1)}C_W > 1 = \Vert \alpha \Vert _{{\text {Car}}(P_0)} \ge \alpha _R\) for all R and by the definition of \(\gamma _Q\), we must have \(\Delta _Q \ge 0\) for all Q. This can be easily seen by induction.

Remark 2.2

We are trying to construct a sparse operator of complexity 0 which dominates \(\mathcal {A}^{m;0}_{P_0,\alpha }\). One way to achieve this is to let \(\mathcal {S}\) be the collection of all dyadic subcubes of \(P_0\), but of course this does not yield a sparse collection. A better way would be to let \(\mathcal {S}\) consist of all dyadic cubes in \(P_0\) for which at least one of its mth generation children R satisfies \(\alpha _R > 0\); unfortunately this yields a collection \(\mathcal {S}\) which is not sparse, and in fact it can be seen that the Carleson sequence \(\beta \) associated with this collection can have a Carleson norm \(\Vert \beta \Vert _{{\text {Car}}(P_0)}\) which grows exponentially in m.

The main problem with this approach is that, when the time comes to decide whether a cube should be in \(\mathcal {S}\) or not, we do not take into account which cubes have been selected in the previous steps. Note that whenever we add a cube Q to \(\mathcal {S}\) we are not only “helping” to dominate the portion of \(\mathcal {A}^{m;0}_{P_0,\alpha }\) coming from Q, but also what may come from any of its descendants.

One can account for this by having the algorithm use a sort of “memory” to, essentially, keep track of how many cubes in \(\mathcal {S}\) (appropriately weighted with the averages of \(\vec {f}\)) lie above any particular cube. This is the purpose of \(\Delta _Q\). The construction can also be seen as a stopping time algorithm which selects a cube whenever the previously selected cubes do not provide enough height to dominate the operator up to that point.
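The role of \(\Delta _Q\) as memory becomes more transparent if one unrolls the recursion that defines it. The following computation is a sketch in notation of our own: \(P_0 = Q_0 \supset Q_1 \supset \cdots \supset Q_j\) denotes a chain of cubes with \(Q_{i+1} \in \mathscr {D}_m(Q_i)\). Since \(\Delta _{P_0} = 0\), iterating the definition of \(\Delta _R\) gives

$$\begin{aligned} \Delta _{Q_j} = \sum _{i=0}^{j-1} \beta _{Q_i} \left( \prod _{l=1}^k \langle f_l \rangle _{Q_i} \right) - \sum _{i=0}^{j-1} \alpha _{Q_{i+1}} \left( \prod _{l=1}^k \langle f_l \rangle _{Q_i} \right) . \end{aligned}$$

The first sum is the height contributed by the ancestors of \(Q_j\) that have been selected so far. The second sum is the part of \(\mathcal {A}^{m;0}_{P_0,\alpha }\vec {f}(x)\), for \(x \in Q_j\), coming from the cubes \(Q_1, \ldots , Q_j\) that contain \(Q_j\). In this form, \(\Delta _{Q_j} \ge 0\) says precisely that the cubes selected so far dominate the operator down to the scale of \(Q_j\).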

Lemma 2.3

We have the pointwise bound

$$\begin{aligned} \mathcal {A}^{m;0}_{P_0,\alpha } \vec {f}(x) \le \sum _{Q \in \mathscr {D}(P_0)} \beta _Q\left( \prod _{i=1}^k \langle f_i \rangle _Q \right) \mathbbm {1}_Q(x). \end{aligned}$$
(2.1)

Proof

We will prove by induction the following claim: if \(P \in \mathscr {D}_{jm}(P_0)\) for some \(j \ge 0\), then

$$\begin{aligned} \mathcal {A}^{m;0}_{P,\alpha } \vec {f}(x) \le \Delta _P + \sum _{Q \in \mathscr {D}(P)} \beta _Q \left( \prod _{i=1}^k \langle f_i \rangle _Q \right) \mathbbm {1}_Q(x). \end{aligned}$$
(2.2)

Note that, when \(P = P_0\), this is exactly (2.1), since \(\Delta _{P_0} = 0\). Since \(\alpha \) is finite, there is a smallest \(j_0 \in \mathbb {N}\) such that \(\alpha _Q = 0\) for all cubes \(Q \in \mathscr {D}_{\ge j_0m}(P_0)\). If Q is any cube in \(\mathscr {D}_{j_0m}(P_0)\), we obviously have

$$\begin{aligned} \mathcal {A}^{m;0}_{Q,\alpha } \vec {f} \equiv 0 \quad \text {in } Q. \end{aligned}$$

Since \(\Delta _Q \ge 0\), the claim (2.2) is trivial for \(P \in \mathscr {D}_{j_0m}(P_0)\). Now, assume by induction that we have proved (2.2) for all cubes \(P \in \mathscr {D}_{jm}(P_0)\) with \(1\le j_1 \le j\) and let P be any cube in \(\mathscr {D}_{(j_1-1)m}(P_0)\). By definition,

$$\begin{aligned} \mathcal {A}^{m;0}_{P,\alpha } \vec {f}(x) = \sum _{Q \in \mathscr {D}_m(P)} \left( \alpha _Q \left( \prod _{i=1}^k \langle f_i \rangle _P\right) \mathbbm {1}_Q(x) + \mathcal {A}_{Q,\alpha }^{m;0}\vec {f}(x) \right) . \end{aligned}$$

Let \(x \in Q \in \mathscr {D}_m(P)\), then by the induction hypothesis and the definition of \(\Delta _Q\):

$$\begin{aligned} \mathcal {A}^{m;0}_{P, \alpha }\vec {f}(x)&\le \alpha _Q \left( \prod _{i=1}^k \langle f_i \rangle _P \right) + \Delta _Q + \sum _{R \in \mathscr {D}(Q)} \beta _R \left( \prod _{i=1}^k \langle f_i \rangle _R \right) \mathbbm {1}_R(x) \\&= \alpha _Q \left( \prod _{i=1}^k \langle f_i \rangle _P \right) + \Delta _P + (\beta _P - \alpha _Q)\left( \prod _{i=1}^k \langle f_i \rangle _P \right) \\&\quad + \sum _{R \in \mathscr {D}(Q)} \beta _R \left( \prod _{i=1}^k \langle f_i \rangle _R \right) \mathbbm {1}_R(x) \\&= \Delta _P + \beta _P \left( \prod _{i=1}^k \langle f_i \rangle _P \right) + \sum _{R \in \mathscr {D}(Q)} \beta _R \left( \prod _{i=1}^k \langle f_i \rangle _R \right) \mathbbm {1}_R(x) \\&= \Delta _P + \sum _{R \in \mathscr {D}(P)} \beta _R \left( \prod _{i=1}^k \langle f_i \rangle _R \right) \mathbbm {1}_R(x), \end{aligned}$$

which is what we wanted to show. \(\square \)

Lemma 2.4

The collection \(\mathcal {S}\) is sparse.

Proof

Let \(P \in \mathcal {S}\), we have to show that the set

$$\begin{aligned} F := \bigcup _{Q \subsetneq P, Q \in \mathcal {S}} Q \end{aligned}$$

satisfies \(|F| \le \frac{1}{2}|P|\). To this end, let \(\mathcal {R}\) be the collection of maximal (strict) subcubes of P which are in \(\mathcal {S}\). Note that for all \(R \in \mathcal {R}\) we have \(R \in \mathscr {D}_{N_Rm}(P)\) for some \(N_R\ge 1\). We thus have

$$\begin{aligned} F = \bigsqcup _{R \in \mathcal {R}} R. \end{aligned}$$

By maximality, for all \(R \in \mathcal {R}\) and dyadic cubes Q with \(R \subsetneq Q \subsetneq P\) we have \(\beta _Q = 0\). For all \(R\in \mathcal {R}\) and \(1 \le j \le N_R\) we now claim that

$$\begin{aligned} \Delta _{R^{((N_R-j)m)}} \ge \beta _P\left( \prod _{i=1}^k \langle f_i \rangle _P \right) - \sum _{\nu =1}^j \alpha _{R^{((N_R-\nu )m)}} \left( \prod _{i=1}^k \langle f_i \rangle _{R^{((N_R-\nu +1)m)}} \right) . \end{aligned}$$
(2.3)

Indeed, one can prove this by induction on j. If \(j=1\) then by definition we have

$$\begin{aligned} \Delta _{R^{((N_R-1)m)}}&= \Delta _P + (\beta _P - \alpha _{R^{((N_R-1)m)}})\left( \prod _{i=1}^k \langle f_i \rangle _P \right) \\&\ge \beta _P \left( \prod _{i=1}^k \langle f_i \rangle _P \right) - \alpha _{R^{((N_R-1)m)}}\left( \prod _{i=1}^k \langle f_i \rangle _P \right) , \end{aligned}$$

since \(\Delta _P \ge 0\).

To prove the induction step, observe that (by the induction hypothesis) for \(j>1\)

$$\begin{aligned} \Delta _{R^{((N_R-j)m)}}&= \Delta _{R^{((N_R-j+1)m)}} + (\beta _{R^{((N_R-j+1)m)}} - \alpha _{R^{((N_R-j)m)}}) \left( \prod _{i=1}^k \langle f_i \rangle _{R^{((N_R-j+1)m)}} \right) \\&= \Delta _{R^{((N_R-j+1)m)}} - \alpha _{R^{((N_R-j)m)}} \left( \prod _{i=1}^k \langle f_i \rangle _{R^{((N_R-j+1)m)}} \right) \\&\ge \beta _P\left( \prod _{i=1}^k \langle f_i \rangle _P \right) - \sum _{\nu =1}^{j} \alpha _{R^{((N_R-\nu )m)}} \left( \prod _{i=1}^k \langle f_i \rangle _{R^{((N_R-\nu +1)m)}} \right) . \end{aligned}$$

From (2.3) with \(j = N_R\), we have (since the terms are nonnegative)

$$\begin{aligned} \Delta _{R} \ge \beta _{P} \left( \prod _{i=1}^k \langle f_i \rangle _P \right) - \mathcal {A}^{m;0}_{P,\alpha }\vec {f}(x) \end{aligned}$$

for all \(x \in R\). Since \(\beta _{R} \ne 0\), we must have

$$\begin{aligned} \left( \prod _{i=1}^k \langle f_i \rangle _R \right) \gamma _{R} - \Delta _{R} > 0, \end{aligned}$$

i.e.:

$$\begin{aligned} \left( \prod _{i=1}^k \langle f_i \rangle _R \right) \gamma _{R} + \mathcal {A}^{m;0}_{P,\alpha }\vec {f}(x) > 2^{2(k+1)}C_W \left( \prod _{i=1}^k \langle f_i \rangle _P \right) \end{aligned}$$

for all \(x \in R\). Let \(\mathcal {G}_P\vec {f} = \sum _{R \in \mathcal {R}} \gamma _{R} \left( \prod _{i=1}^k \langle f_i \rangle _R \right) \mathbbm {1}_{R}\), then for all \(x \in R\) we have

$$\begin{aligned} \mathcal {G}_P\vec {f}(x) + \mathcal {A}^{m;0}_{P,\alpha }\vec {f}(x) > 2^{2(k+1)}C_W \left( \prod _{i=1}^k \langle f_i \rangle _P \right) , \end{aligned}$$

hence

$$\begin{aligned} |F|&\le \left| \left\{ x \in P: \, \mathcal {G}_P \vec {f}(x) + \mathcal {A}^{m;0}_{P,\alpha } \vec {f}(x) > 2^{2(k+1)}C_W \left( \prod _{i=1}^k \langle f_i \rangle _P \right) \right\} \right| \\&\le \frac{\left\| \mathcal {G}_P + \mathcal {A}_{P,\alpha }^{m;0}\right\| _{L^{1}(P) \times \cdots \times L^1(P) \rightarrow L^{1/k,\infty }(P)}^{1/k}}{\left( 2^{2(k+1)}C_W \left( \prod _{i=1}^k \langle f_i \rangle _P \right) \right) ^{1/k}} \left( \prod _{i=1}^k \Vert f_i\Vert _{L^1(P)} \right) ^{1/k} \\&= \frac{\left\| \mathcal {G}_P + \mathcal {A}_{P,\alpha }^{m;0}\right\| _{L^{1}(P) \times \cdots \times L^1(P) \rightarrow L^{1/k,\infty }(P)}^{1/k}}{(2^{2(k+1)}C_W)^{1/k}} |P| \end{aligned}$$

Let us estimate the operator norm \(\Vert \mathcal {G}_P\Vert _{L^1(P) \times \cdots \times L^1(P) \rightarrow L^{1/k,\infty }(P)}\). Observe that, since \(\gamma _Q \le 1\) for all Q, the operator \(\mathcal {G}_P\) is pointwise bounded by the multilinear projection

$$\begin{aligned} \mathcal {P}_P\vec {f}(x) = \sum _{R \in \mathcal {R}} \left( \prod _{i=1}^k \langle f_i \rangle _R \right) \mathbbm {1}_{R}(x) = \prod _{i=1}^k \left( \sum _{R \in \mathcal {R}} \langle f_i \rangle _R \mathbbm {1}_{R}(x)\right) . \end{aligned}$$

For each \(1\le i \le k\), we have \(\Vert \sum _{R \in \mathcal {R}} \langle f_i \rangle _R \mathbbm {1}_R\Vert _{L^1(P)} \le \Vert f_i \Vert _{L^1(P)}\). Therefore, by Hölder’s inequality we get

$$\begin{aligned} \Vert \mathcal {P}_P\vec {f} \Vert _{L^{1/k,\infty }(P)} \le \prod _{i=1}^k \left\| \sum _{R \in \mathcal {R}} \langle f_i \rangle _R \mathbbm {1}_R\right\| _{L^1(P)} \le \prod _{i=1}^k \Vert f_i\Vert _{L^1(P)}. \end{aligned}$$
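For completeness, here is one way to justify the first inequality (this is our reading of the appeal to Hölder’s inequality, with the shorthand \(g_i = \sum _{R \in \mathcal {R}} \langle f_i \rangle _R \mathbbm {1}_R \ge 0\) introduced only for this computation). Chebyshev’s inequality bounds the weak quasi-norm by the strong one, and Hölder’s inequality with all k exponents equal to k handles the product:

$$\begin{aligned} \left\| \prod _{i=1}^k g_i \right\| _{L^{1/k,\infty }(P)} = \sup _{\lambda > 0} \lambda \left| \left\{ x \in P: \, \prod _{i=1}^k g_i(x) > \lambda \right\} \right| ^k \le \left( \int _P \prod _{i=1}^k g_i^{1/k} \right) ^k \le \prod _{i=1}^k \int _P g_i. \end{aligned}$$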

On the other hand we have

$$\begin{aligned} \Vert \mathcal {A}_{P,\alpha }^{m;0} \vec {f}\Vert _{L^{1/k,\infty }(P)} \le C_{W} \prod _{i=1}^k \Vert f_i\Vert _{L^1(P)} \end{aligned}$$

by Theorem 4.1. Combining these estimates we get

$$\begin{aligned} \Vert \mathcal {G}_P + \mathcal {A}_{P,\alpha }^{m;0}\Vert _{L^{1}(P) \times \cdots \times L^1(P) \rightarrow L^{1/k,\infty }(P)} \le 2^{k+1}(1+C_W) \le 2^{k+2}C_W \end{aligned}$$

and the result follows. \(\square \)
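Spelled out, the final step of the proof is the following arithmetic: inserting the bound for the operator norm into the estimate for \(|F|\) gives

$$\begin{aligned} |F| \le \left( \frac{2^{k+2}C_W}{2^{2(k+1)}C_W} \right) ^{1/k} |P| = \left( 2^{-k} \right) ^{1/k} |P| = \frac{1}{2}|P|. \end{aligned}$$

In particular, the sets \(E(P) = P \setminus F\), \(P \in \mathcal {S}\), satisfy \(|E(P)| \ge \frac{1}{2}|P|\) and are pairwise disjoint, which is the sparseness condition from the introduction.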

Proposition 2.1 follows at once from Lemmas 2.3 and 2.4. The proof shows that one can actually take \(C_1=2^{2+k(7+d(2k-1))}\). We are now ready to finish the proof of Theorem A, which we state here in full generality:

Theorem 2.5

Let \(\alpha \) be a Carleson sequence and let \(P_0\) be a dyadic cube. For every k-tuple of nonnegative integrable functions \(f_1, \ldots , f_k\) on \(P_0\) there exists a sparse collection \(\mathcal {S}\) of cubes in \(\mathscr {D}(P_0)\) such that

$$\begin{aligned} \mathcal {A}^m_{P_0,\alpha } \vec {f}(x) \le C_2 (m+1) \Vert \alpha \Vert _{{\text {Car}}(P_0)} \sum _{Q \in \mathcal {S}} \left( \prod _{i=1}^k \langle f_i \rangle _Q \right) \mathbbm {1}_Q(x). \end{aligned}$$

Proof

If \(m=0\) we can just apply Proposition 2.1 after noting that \(\mathcal {A}^0_{P_0,\alpha }\) can be written as \(\mathcal {A}^{1;0}_{P_0,\beta }\), where

$$\begin{aligned} \beta _Q = \alpha _{Q^{(1)}}. \end{aligned}$$

One easily sees that \(\Vert \alpha \Vert _{{\text {Car}}(P_0)} = \Vert \beta \Vert _{{\text {Car}}(P_0)}\). Hence, we may assume that \(m \ge 1\). Recall the expression

$$\begin{aligned} \mathcal {A}^m_{P_0,\alpha }\vec {f}(x)&= \sum _{n=0}^{m-1} \sum _{P \in \mathscr {D}_n(P_0)} \mathcal {A}^{m;0}_{P,\alpha } \vec {f}(x) \end{aligned}$$

from the beginning of the section. By Proposition 2.1, for each \(0 \le n \le m-1\) and each \(P \in \mathscr {D}_n(P_0)\) we can find a sparse collection of cubes \(\mathcal {S}_{P}^n \subset \mathscr {D}(P)\) such that

$$\begin{aligned} \mathcal {A}^{m;0}_{P,\alpha } \vec {f}(x) \le C_1 \Vert \alpha \Vert _{{\text {Car}}(P_0)} \sum _{Q \in \mathcal {S}_P^n} \left( \prod _{i=1}^k \langle f_i \rangle _Q \right) \mathbbm {1}_Q(x). \end{aligned}$$

Observe that the collection \(\mathcal {S}^n = \cup _{P \in \mathscr {D}_n(P_0)} \mathcal {S}_P^n\) is also sparse, so

$$\begin{aligned} \mathcal {A}^m_{P_0,\alpha }\vec {f}(x) \le C_1 \Vert \alpha \Vert _{{\text {Car}}(P_0)} \sum _{n=0}^{m-1} \sum _{Q \in \mathcal {S}^n} \left( \prod _{i=1}^k \langle f_i \rangle _Q \right) \mathbbm {1}_Q(x). \end{aligned}$$
(2.4)

For \(0 \le n \le m-1\) define

$$\begin{aligned} \mu _Q^n = {\left\{ \begin{array}{ll} 1 &{}\quad \text {if } Q \in \mathcal {S}^n \\ 0 &{}\quad \text {otherwise.} \end{array}\right. } \end{aligned}$$

Since the collections \(\mathcal {S}^n\) are sparse, the sequences \(\mu ^n\) are Carleson sequences with \(\Vert \mu ^n\Vert _{{\text {Car}}(P_0)} \le 2\), therefore the sequence

$$\begin{aligned} \mu _Q := \sum _{n=0}^{m-1} \mu _Q^n \end{aligned}$$

is also Carleson with \(\Vert \mu \Vert _{{\text {Car}}(P_0)} \le 2m\).
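The bound \(\Vert \mu ^n\Vert _{{\text {Car}}(P_0)} \le 2\) is an instance of the general fact that an \(\eta \)-sparse collection has Carleson constant at most \(1/\eta \). A short verification, given here as a sketch: for every \(P \in \mathscr {D}(P_0)\), the sets \(E(Q)\) with \(Q \in \mathcal {S}^n\), \(Q \subseteq P\), are pairwise disjoint subsets of P, so

$$\begin{aligned} \sum _{Q \in \mathscr {D}(P)} \mu ^n_Q |Q| = \sum _{\begin{array}{c}\scriptstyle {Q \in \mathcal {S}^n}\\ \scriptstyle {Q \subseteq P}\end{array}} |Q| \le 2 \sum _{\begin{array}{c}\scriptstyle {Q \in \mathcal {S}^n}\\ \scriptstyle {Q \subseteq P}\end{array}} |E(Q)| \le 2|P|. \end{aligned}$$

Summing over \(0 \le n \le m-1\) and using that the Carleson constant is subadditive gives \(\Vert \mu \Vert _{{\text {Car}}(P_0)} \le 2m\).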

With this we can continue the argument using estimate (2.4) and the case \(m=0\):

$$\begin{aligned} \mathcal {A}^m_{P_0,\alpha }\vec {f}(x)&\le C_1 \Vert \alpha \Vert _{{\text {Car}}(P_0)} \sum _{n=0}^{m-1} \sum _{Q \in \mathcal {S}^n} \left( \prod _{i=1}^k \langle f_i \rangle _Q \right) \mathbbm {1}_Q(x) \\&= C_1 \Vert \alpha \Vert _{{\text {Car}}(P_0)} \sum _{n=0}^{m-1} \sum _{Q \in \mathscr {D}(P_0)} \mu _Q^n \left( \prod _{i=1}^k \langle f_i \rangle _Q \right) \mathbbm {1}_Q(x) \\&= C_1 \Vert \alpha \Vert _{{\text {Car}}(P_0)} \sum _{Q \in \mathscr {D}(P_0)} \mu _Q \left( \prod _{i=1}^k \langle f_i \rangle _Q \right) \mathbbm {1}_Q(x) \\&= C_1 \Vert \alpha \Vert _{{\text {Car}}(P_0)} \mathcal {A}^0_{P_0,\mu } \vec {f}(x) \\&\le C_1 \Vert \alpha \Vert _{{\text {Car}}(P_0)} C_1 2m \sum _{Q \in \mathcal {S}} \left( \prod _{i=1}^k \langle f_i \rangle _Q \right) \mathbbm {1}_Q(x), \end{aligned}$$

which yields the result with \(C_2 = 2C_1^2\). \(\square \)

Remark 2.6

The above procedure does not rely on any specific property of the Lebesgue measure. In fact, Theorem A also holds when we replace all averages—both in complexity 0 and complexity m operators—by averages with respect to any other locally finite Borel measure, because the proof is unaffected.

We now detail how to use Theorem A to derive the multilinear versions of Corollaries A.1 and A.2. For us, a multilinear Calderón-Zygmund operator will be an operator T satisfying

$$\begin{aligned} T(f_1, \ldots , f_k)(x) = \int _{\mathbb {R}^{dk}} K(x, y_1, \ldots , y_k) f_1(y_1) \ldots f_k(y_k) dy_1 \, \ldots \, dy_k \end{aligned}$$

for all \(x \notin \cap _{i=1}^k {\text {supp}}f_i\) for appropriate \(f_i\). Also we will require that T extends to a bounded operator from \(L^{q_1} \times \cdots \times L^{q_k}\) to \(L^q\) where

$$\begin{aligned} \frac{1}{q} = \frac{1}{q_1} + \cdots + \frac{1}{q_k}, \end{aligned}$$

and that it satisfies the size estimate

$$\begin{aligned} |K(y_0, \ldots , y_k)| \le \frac{A}{\left( \sum _{i,j=0}^k |y_i-y_j|\right) ^{kd}}. \end{aligned}$$

We denote by \(\omega \) the modulus of continuity of the kernel of the operator, i.e. a positive, nondecreasing, continuous and doubling function that satisfies

$$\begin{aligned}&|K(y_0, \ldots , y_j, \ldots , y_k)-K(y_0, \ldots , y_j', \ldots , y_k)|\\&\quad \le C\omega \left( \frac{|y_j-y_j'|}{\sum _{i,j=0}^k |y_i-y_j|} \right) \frac{1}{\left( \sum _{i,j=0}^k |y_i-y_j|\right) ^{kd}} \end{aligned}$$

for all \(0 \le j \le k\), whenever \(|y_j-y_j'|\le \frac{1}{2}\max _{0 \le i \le k}|y_j-y_i|\). We can now prove Corollary A.1:

Proof of Corollary A.1

Fix a tuple of measurable functions \(\vec {f} = (f_1, \ldots , f_k)\) and a cube \(Q_0 \subset \mathbb {R}^d\). Our starting point is the formula

$$\begin{aligned} |T\vec {f}(x)-m_{T\vec {f}}(Q_0)| \lesssim \sum _{Q \in \mathcal {S}} \sum _{m=0}^\infty \omega (2^{- m}) \prod _{i=1}^k \langle |f_i| \rangle _{2^m Q} \mathbbm {1}_Q(x), \end{aligned}$$

which holds for a sparse subcollection \(\mathcal {S} \subset \mathscr {D}(Q_0)\) (see [5, 10], we are implicitly using a slight improvement of Lerner’s formula which can be found in [8, Theorem 2.3]). Here \(m_f(Q)\) denotes the median of a measurable function f over a cube Q (see [16] for the precise definition), which satisfies

$$\begin{aligned} |m_f(Q)| \lesssim \frac{\Vert f\Vert _{L^{1,\infty }(Q)}}{|Q|}. \end{aligned}$$

Hence we can just write

$$\begin{aligned} |T\vec {f}(x)| \lesssim \sum _{Q \in \mathcal {S}} \sum _{m=0}^\infty \omega (2^{- m}) \prod _{i=1}^k \langle |f_i| \rangle _{2^m Q} \mathbbm {1}_Q(x). \end{aligned}$$
(2.5)

By an elaboration of the well-known one-third trick, it was proven in [10] that there exist dyadic systems \(\{\mathscr {D}^\rho \}_{\rho \in \{0,1/3,2/3\}^d}\) such that for every cube Q in \(\mathbb {R}^d\) and every \(m\ge 1\), there exist \(\rho \in \{0,1/3,2/3\}^d\) and \(R_{Q,m} \in \mathscr {D}^\rho \) such that

$$\begin{aligned} Q \subset R_{Q,m}, \; 2^mQ \subset R_{Q,m}^{(m)}, \; 3\ell (Q) < \ell (R_{Q,m}) \le 6\ell (Q). \end{aligned}$$

Also, we may assume that for each \(\rho \in \{0,1/3,2/3\}^d\) there exists a cube \(P(\rho ) \in \mathscr {D}^\rho \) such that \(Q_0 \subset P(\rho ) \subset c_d Q_0\) for some dimensional constant \(c_d\). Using this, we can further write (2.5) as

$$\begin{aligned} |T\vec {f}(x)|&\lesssim \sum _{\rho \in \{0,\frac{1}{3},\frac{2}{3}\}^d} \sum _{m=0}^\infty \omega (2^{- m}) \sum _{\begin{array}{c}\scriptstyle {Q \in \mathcal {S}}\\ \scriptstyle {R_{Q,m} \in \mathscr {D}^\rho }\end{array}} \left( \prod _{i=1}^k \langle |f_i| \rangle _{R_{Q,m}^{(m)}} \right) \mathbbm {1}_{R_{Q,m}}(x). \end{aligned}$$

Let \(\mathcal {F}^\rho _m = \{R_{Q,m}: \, Q \in \mathcal {S}, \, R_{Q,m} \in \mathscr {D}^\rho \} \subset \mathscr {D}(P(\rho ))\). Then, we can estimate

$$\begin{aligned} |T\vec {f}(x)| \lesssim 6^d \sum _{\rho \in \{0,\frac{1}{3},\frac{2}{3}\}^d} \sum _{m=0}^\infty \omega (2^{- m}) \sum _{R \in \mathcal {F}^\rho _m} \left( \prod _{i=1}^k \langle |f_i| \rangle _{R^{(m)}} \right) \mathbbm {1}_R, \end{aligned}$$

since at most \(6^d\) cubes Q in \(\mathscr {D}\) are mapped to the same cube \(R_{Q,m}\). Define the sequence

$$\begin{aligned} \alpha ^\rho _Q = {\left\{ \begin{array}{ll} 1 &{}\quad \text {if } Q \in \mathcal {F}^\rho _m \\ 0 &{}\quad \text {otherwise}. \end{array}\right. } \end{aligned}$$

The collections \(\mathcal {F}^\rho _m\) are \(2^{-1}\cdot 6^{-d}\)-sparse, and hence Carleson with constant \(2 \cdot 6^d\). In order to apply Theorem A, for each fixed \(\rho \in \left\{ 0,\frac{1}{3},\frac{2}{3}\right\} ^d\), \(m\ge 0\), we now split the sum as follows:

$$\begin{aligned} \sum _{Q \in \mathscr {D}^\rho } \alpha _{Q}^\rho \left( \prod _{i=1}^k \langle |f_i| \rangle _{Q^{(m)}} \right) \mathbbm {1}_Q(x)= & {} \sum _{Q \in \mathscr {D}_{\ge m}(P(\rho ))} \alpha _{Q}^\rho \left( \prod _{i=1}^k \langle |f_i| \rangle _{Q^{(m)}} \right) \mathbbm {1}_Q(x) \\&+ \sum _{\ell =1}^\infty \sum _{Q \in \mathscr {D}_{m-\ell }(P(\rho ))} \alpha _{Q}^\rho \left( \prod _{i=1}^k \langle |f_i| \rangle _{Q^{(m)}} \right) \mathbbm {1}_Q(x) \\= & {} \mathrm {I} + \mathrm {II}.\\ \end{aligned}$$

Now, since \(f_i\) is supported on \(Q_0 \subset P(\rho )\) for \(1\le i \le k\) and all \(\rho \in \left\{ 0,\frac{1}{3},\frac{2}{3}\right\} ^d\), we claim that \(\mathrm {II} \le \mathrm {I}\). Indeed, compute

$$\begin{aligned} \sum _{\ell =1}^\infty \sum _{Q \in \mathscr {D}_{m-\ell }(P(\rho ))} \alpha _{Q}^\rho \left( \prod _{i=1}^k \langle |f_i| \rangle _{Q^{(m)}} \right) \mathbbm {1}_Q(x)\le & {} \sum _{\ell =1}^\infty \sum _{Q \in \mathscr {D}_{m-\ell }(P(\rho ))} \left( \prod _{i=1}^k \langle |f_i| \rangle _{Q^{(m)}} \right) \mathbbm {1}_Q(x)\\= & {} \sum _{\ell =1}^\infty \left( \prod _{i=1}^k \langle |f_i| \rangle _{P(\rho )^{(\ell )}} \right) .\\ \end{aligned}$$

Now observe that, by the support condition on the tuple \(\vec {f}\),

$$\begin{aligned} \prod _{i=1}^k \langle |f_i| \rangle _{P(\rho )^{(\ell )}} = 2^{-dk\ell } \prod _{i=1}^k \langle |f_i| \rangle _{P(\rho )}, \end{aligned}$$

which is enough to prove the claim. Therefore, we only need to work in the localized cubes \(P(\rho )\), \(\rho \in \left\{ 0,\frac{1}{3},\frac{2}{3}\right\} ^d\). We can thus obtain the first assertion of Corollary A.1 by applying Theorem A:

$$\begin{aligned} |T\vec {f}(x)|&\lesssim \sum _{\rho \in \{0,\frac{1}{3},\frac{2}{3}\}^d} \sum _{m=0}^\infty \omega (2^{- m}) \sum _{Q \in \mathscr {D}^\rho , \; Q \subset P(\rho )^{(m)} } \alpha _Q^\rho \left( \prod _{i=1}^k \langle |f_i| \rangle _{Q^{(m)}} \right) \mathbbm {1}_Q(x) \\&\lesssim \sum _{\rho \in \{0,\frac{1}{3},\frac{2}{3}\}^d} \sum _{m=0}^\infty \omega (2^{- m}) (m+1) \sum _{Q \in \mathcal {S}_{m,\vec {f}}} \left( \prod _{i=1}^k \langle |f_i| \rangle _Q \right) \mathbbm {1}_Q \\&= \sum _{\rho \in \{0,\frac{1}{3},\frac{2}{3}\}^d} \sum _{m=0}^\infty \omega (2^{- m}) (m+1) \mathcal {A}_{\mathcal {S}_{m,\vec {f}}} \vec {f} (x), \end{aligned}$$

for sparse collections \(\mathcal {S}_{m,\vec {f}}\) that may depend both on m and \(\vec {f}\) (and which are subfamilies of \(\mathscr {D}(P(\rho ))\) for each value of \(\rho \)). Now, reorganizing the sum above we obtain

$$\begin{aligned} |T\vec {f}(x)|&\lesssim \sum _{\rho \in \{0,\frac{1}{3},\frac{2}{3}\}^d} \sum _{\mathcal {S}_{m,\vec {f}} \subset \mathscr {D}^\rho } \omega (2^{- m}) (m+1) \mathcal {A}_{\mathcal {S}_{m,\vec {f}}} \vec {f} (x) \\&=: \sum _{\rho \in \{0,\frac{1}{3},\frac{2}{3}\}^d} \mathcal {A}_{\rho }\vec {f}(x) . \end{aligned}$$

Now, by the logarithmic Dini condition, each of the operators \(\mathcal {A}_{\rho }\) is bounded above by some absolute constant times a 0-shift whose associated sequence is 1-Carleson (and localized in \(P(\rho )\)) to which we can apply again Theorem A. Therefore, we obtain

$$\begin{aligned} |T\vec {f}(x)| \lesssim \sum _{\rho \in \{0,\frac{1}{3},\frac{2}{3}\}^d} \mathcal {A}_{\mathcal {S}_\rho } \vec {f}(x), \end{aligned}$$

for some sparse families \(\mathcal {S}_\rho \subset \mathscr {D}^\rho \) which depend on \(\vec {f}\). \(\square \)
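To illustrate the role of the logarithmic Dini condition in the last step, consider the model case \(\omega (t) = t^\delta \) with \(0 < \delta \le 1\) (this specific choice of \(\omega \) is an assumption made only for illustration). Differentiating the geometric series gives

$$\begin{aligned} \sum _{m=0}^\infty \omega (2^{-m})(m+1) = \sum _{m=0}^\infty (m+1) 2^{-\delta m} = \frac{1}{(1-2^{-\delta })^2} < \infty , \end{aligned}$$

so in this case the extra factor \(m+1\) coming from Theorem A is harmless. For a general modulus of continuity, the summability of this series is what the logarithmic Dini condition is used for.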

We now introduce the notion of function quasi-norm. We say that \(\Vert \cdot \Vert _{\mathbb {X}}\), defined on the set of measurable functions, is a function quasi-norm if:

  1. (P1):

    There exists a constant \(C > 0\) such that

    $$\begin{aligned} \Vert f + g\Vert _{\mathbb {X}} \le C \left( \Vert f\Vert _{\mathbb {X}} + \Vert g\Vert _{\mathbb {X}} \right) . \end{aligned}$$
  2. (P2):

    \(\Vert \lambda f \Vert _{\mathbb {X}} = |\lambda |\Vert f\Vert _{\mathbb {X}}\) for all \(\lambda \in \mathbb {C}\).

  3. (P3):

    If \(|f(x)| \le |g(x)|\) almost-everywhere then \(\Vert f\Vert _{\mathbb {X}} \le \Vert g\Vert _{\mathbb {X}}\).

  4. (P4):

    \(\Vert \liminf _{n \rightarrow \infty }f_n\Vert _{\mathbb {X}} \le \liminf _{n \rightarrow \infty } \Vert f_n\Vert _{\mathbb {X}}\).
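A standard example to keep in mind is the weak Lebesgue quasi-norm \(L^{p,\infty }\), which appears in Section 3.2. The following is a minimal sketch, assuming a finite set with counting measure and \(p \ge 1\) (both assumptions are made only for illustration, and the name `weak_lp` is hypothetical), that checks (P1) with \(C = 2\), (P2) and (P3) on random data.

```python
import random

def weak_lp(f, p=1.0):
    """Weak-L^p quasi-norm sup_t t * #{|f| > t}^(1/p) for counting measure.

    For a finite sequence the supremum equals max_j a_j * j^(1/p), where
    a_1 >= a_2 >= ... is the decreasing rearrangement of |f|.
    """
    a = sorted((abs(v) for v in f), reverse=True)
    return max((a_j * j ** (1.0 / p) for j, a_j in enumerate(a, start=1)),
               default=0.0)

random.seed(0)
p, tol = 1.5, 1e-9
for _ in range(1000):
    n = random.randint(1, 20)
    f = [random.uniform(-5, 5) for _ in range(n)]
    g = [random.uniform(-5, 5) for _ in range(n)]
    lam = random.uniform(-3, 3)
    nf, ng = weak_lp(f, p), weak_lp(g, p)
    # (P1): quasi-triangle inequality, here with constant C = 2.
    assert weak_lp([x + y for x, y in zip(f, g)], p) <= 2 * (nf + ng) + tol
    # (P2): homogeneity, up to floating-point rounding.
    assert abs(weak_lp([lam * x for x in f], p) - abs(lam) * nf) <= tol * (1 + nf)
    # (P3): monotonicity under pointwise domination |h| <= |f|.
    h = [x * random.uniform(0, 1) for x in f]
    assert weak_lp(h, p) <= nf + tol
```

The constant 2 in (P1) comes from the inclusion \(\{|f+g|>t\} \subset \{|f|>t/2\} \cup \{|g|>t/2\}\); the sketch does not address (P4).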

Fix some dyadic system \(\mathscr {D}\) such that there exists an increasing sequence of dyadic cubes \(\{P_{\ell }\}_\ell \subset \mathscr {D}\) whose union is the whole space \(\mathbb {R}^d\), and denote \(\mathbbm {1}_{P_\ell } \vec {f} = (\mathbbm {1}_{P_\ell } f_1, \ldots , \mathbbm {1}_{P_\ell }f_k)\). Now, taking into account properties (P1) and (P3), if we take quasi-norms in the second assertion of Corollary A.1, we have

$$\begin{aligned} \Vert \mathbbm {1}_{P_\ell }T(\mathbbm {1}_{P_\ell }\vec {f})\Vert _{\mathbb {X}} \lesssim \sup _{\mathscr {D}, \mathcal {S}} \Vert \mathcal {A}_\mathcal {S}(\mathbbm {1}_{P_\ell }\vec {f})\Vert _{\mathbb {X}} \; \forall \ell . \end{aligned}$$

Since \(\vec {f}\) is integrable, \(T(\mathbbm {1}_{P_\ell }\vec {f})\) converges pointwise to \(T(\vec {f})\). Therefore, we have

$$\begin{aligned} \mathbbm {1}_{P_\ell }T(\mathbbm {1}_{P_\ell }\vec {f}) \rightarrow T(\vec {f}) \end{aligned}$$

pointwise. Finally, we apply property (P4) and we get

$$\begin{aligned} \Vert T\vec {f}\Vert _{\mathbb {X}} = \left\| \liminf _{\ell } \mathbbm {1}_{P_\ell }T(\mathbbm {1}_{P_\ell }\vec {f}) \right\| _{\mathbb {X}} \le \liminf _{\ell } \left\| \mathbbm {1}_{P_\ell }T(\mathbbm {1}_{P_\ell }\vec {f})\right\| _{\mathbb {X}}\lesssim \sup _{\mathscr {D},\mathcal {S}}\left\| \mathcal {A}_{\mathcal {S}}\vec {f} \right\| _{\mathbb {X}}. \end{aligned}$$

This is exactly Corollary A.2.

Remark 2.7

We note that the dependence on m in the pointwise estimate of shifts of complexity m must be at least linear in m. To see this, let us work in dimension one and fix a large integer m. For any interval \(I = [a,b)\) let \(I_j\) be the j-th interval of \(\mathscr {D}_m(I)\):

$$\begin{aligned} I_j = a+|I|[j2^{-m},(j+1)2^{-m}). \end{aligned}$$

Define a tower over an interval I to be the collection of intervals

$$\begin{aligned} \mathcal {T}_I = \left\{ [a,a+2^{-k}|I|):\, k \in \mathbb {N}\right\} . \end{aligned}$$

The collection of intervals \(\mathcal {S} = \bigcup _{J \in \mathscr {D}_m(I)} \mathcal {T}_{J}\) is a sparse collection. Now consider a function f on I which is defined by

$$\begin{aligned} f(x) = {\left\{ \begin{array}{ll} 0 &{}\quad \text {if } x \in I_j \text { with } j \text { even}, \\ 2 &{}\quad \text {otherwise}. \end{array}\right. } \end{aligned}$$

Denote \({\text {gen}}(J) = \log _2(\ell (I) \ell (J)^{-1})\) for cubes \(J \in \mathscr {D}(I)\). Observe that for any dyadic interval \(J \subseteq I\) with \({\text {gen}}(J) \le m-1\) we have

$$\begin{aligned} \langle f \rangle _J = 1. \end{aligned}$$

Consider now the action of \(\mathcal {A}^m_{\mathcal {S}}\) on f. If \(x \in (I_j)_0\) with j even then

$$\begin{aligned} \mathcal {A}^m_{\mathcal {S}} f(x) = m. \end{aligned}$$

In order to construct a collection \(\mathcal {S}'\) of intervals in I for which we have

$$\begin{aligned} \mathcal {A}^m_{\mathcal {S}}f(x) \le C \mathcal {A}^0_{\mathcal {S}'}f(x), \end{aligned}$$

we would need to select a large proportion of the dyadic intervals \(J \subset I\) with \({\text {gen}}(J) \le m-1\). Indeed, let \(I^k(x)\) be the interval in \(\mathscr {D}_k(I)\) which contains x and let \(\alpha _J\) be 1 if \(J \in \mathcal {S}'\) and 0 otherwise. Then

$$\begin{aligned} C \mathcal {A}^0_{\mathcal {S}'}f(x) = C\sum _{k=0}^{m-1} \alpha _{I^k(x)} \ge m \end{aligned}$$

for all \(x \in (I_j)_0\) with j even. Hence, for each such x, at least m/C of the intervals \(I^k(x)\), \(0 \le k \le m-1\), must belong to \(\mathcal {S}'\). Consequently, the height satisfies

$$\begin{aligned} \sum _{J \in \mathcal {S}'} \alpha _J \mathbbm {1}_J(x) \ge m/C \end{aligned}$$

on half of the interval I, which contradicts the hypothesis of \(\mathcal {S}'\) being sparse if m is large enough.
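The two computations in this remark can be checked with exact arithmetic. The sketch below takes \(I = [0,1)\) and assumes that the complexity-m operator has the form \(\mathcal {A}^m_{\mathcal {S}}f = \sum _{Q \in \mathcal {S}} \langle f \rangle _{Q^{(m)}} \mathbbm {1}_Q\) (an assumption about the definition); the function names are hypothetical and the tower is truncated at a finite depth.

```python
from fractions import Fraction

def f_value(j):
    """f on the j-th interval I_j of D_m(I): 0 for even j, 2 for odd j."""
    return 0 if j % 2 == 0 else 2

def average(gen, index, m):
    """Exact average of f over [index * 2^-gen, (index + 1) * 2^-gen) in I = [0, 1)."""
    if gen >= m:
        # The interval lies inside a single I_j, where f is constant.
        return Fraction(f_value(index >> (gen - m)))
    count = 1 << (m - gen)  # number of intervals I_j it contains
    first = index * count
    return Fraction(sum(f_value(j) for j in range(first, first + count)), count)

def shifted_sum_at_left_endpoint(j, m, depth):
    """Sum of <f>_{Q^(m)} over the first `depth` intervals Q of the tower over I_j.

    The k-th tower interval has generation m + k, so its m-th ancestor is the
    generation-k dyadic interval containing the left endpoint j * 2^-m of I_j.
    """
    total = Fraction(0)
    for k in range(depth):
        index = (j >> (m - k)) if k <= m else (j << (k - m))
        total += average(k, index, m)
    return total

m = 6
# Every dyadic interval of generation at most m - 1 has average exactly 1.
assert all(average(g, i, m) == 1 for g in range(m) for i in range(1 << g))
# At the left endpoint of I_j with j even, the assumed complexity-m sum equals m.
assert all(shifted_sum_at_left_endpoint(j, m, depth=3 * m) == m
           for j in range(0, 1 << m, 2))
```

Tower intervals of generation larger than \(2m-1\) contribute nothing for even j, since their m-th ancestors lie inside \(I_j\), where f vanishes; this is why truncating the tower does not change the value.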

3 Applications

We are now ready to fully state and prove the applications of the pointwise bound as stated in the introduction. We begin with the multilinear sharp weighted estimates:

3.1 Multilinear \(A_2\) theorem

We need some more definitions first. These were introduced in [20].

Definition 3.1

(\(A_{\vec {P}}\) weights) Let \(\vec {P} = (p_1, \ldots , p_k)\) with \(1 \le p_1, \ldots , p_k < \infty \) and \(\frac{1}{p} = \frac{1}{p_1} + \cdots + \frac{1}{p_k}\). Given \(\vec {w} = (w_1, \ldots , w_k)\), set

$$\begin{aligned} v_{\vec {w}} = \prod _{i=1}^k w_{i}^{p/p_i}. \end{aligned}$$

We say that \(\vec {w}\) satisfies the k-linear \(A_{\vec {P}}\) condition if

$$\begin{aligned}{}[\vec {w}]_{A_{\vec {P}}} := \sup _{Q} \left( \frac{1}{|Q|}\int _Q v_{\vec {w}} \right) \prod _{i=1}^k \left( \frac{1}{|Q|}\int _Q w_i^{1-p_i'} \right) ^{p/p_i'} < \infty . \end{aligned}$$

We call \([\vec {w}]_{A_{\vec {P}}}\) the \(A_{\vec {P}}\) constant of \(\vec {w}\). As usual, if \(p_i = 1\) then we interpret \(\frac{1}{|Q|}\int _Q w_i^{1-p_i'}\) to be \(({\text {ess inf}}_{Q} w_i)^{-1}\).

The following theorem was proved in [21]:

Theorem 3.2

Suppose \(1 < p_1, \ldots , p_k < \infty \), \(\frac{1}{p} = \frac{1}{p_1} + \cdots + \frac{1}{p_k}\) and \(\vec {w} \in A_{\vec {P}}\). Then

$$\begin{aligned} \Vert \mathcal {A}_{\mathcal {S}} \vec {f}\Vert _{L^{p}({v_{\vec {w}}})} \lesssim [\vec {w}]_{A_{\vec {P}}}^{\max \left( 1, \frac{p_1'}{p}, \ldots , \frac{p_k'}{p}\right) } \prod _{i=1}^k \Vert f_i\Vert _{L^{p_i}(w_i)}, \end{aligned}$$

whenever \(\mathcal {S}\) is sparse.

We can now use Corollary A.2 to extend the above result to general k-linear Calderón-Zygmund operators:

Theorem 3.3

Under the conditions of Theorem 3.2, for any k-linear Calderón-Zygmund operator T, we have

$$\begin{aligned} \Vert T\vec {f}\Vert _{L^p(v_{\vec {w}})} \lesssim [\vec {w}]_{A_{\vec {P}}}^{\max \left( 1, \frac{p_1'}{p}, \ldots , \frac{p_k'}{p}\right) }\prod _{i=1}^k \Vert f_i\Vert _{L^{p_i}(w_i)}. \end{aligned}$$

Proof

We just need to apply Corollary A.2 with \(\Vert \cdot \Vert _{\mathbb {X}} := \Vert \cdot \Vert _{L^p(v_{\vec {w}})}\), which clearly is a function quasi-norm. The assumption of \(\vec {f}\) being integrable is a qualitative one and can be trivially removed by the usual density arguments. \(\square \)
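To get a feeling for the exponent \(\max (1, p_1'/p, \ldots , p_k'/p)\), the following small sketch evaluates it exactly for a given tuple of exponents. The helper name `sharp_exponent` is hypothetical and is introduced only for illustration.

```python
from fractions import Fraction

def sharp_exponent(exponents):
    """Return max(1, p_1'/p, ..., p_k'/p), where 1/p = 1/p_1 + ... + 1/p_k."""
    ps = [Fraction(q) for q in exponents]
    if any(q <= 1 for q in ps):
        raise ValueError("each p_i must satisfy 1 < p_i < infinity")
    p = 1 / sum(1 / q for q in ps)
    conjugates = [q / (q - 1) for q in ps]  # p_i' = p_i / (p_i - 1)
    return max([Fraction(1)] + [c / p for c in conjugates])

print(sharp_exponent([2]))               # linear case, p = 2: exponent 1
print(sharp_exponent([Fraction(3, 2)]))  # linear case, p = 3/2: exponent 2
print(sharp_exponent([2, 2]))            # bilinear case, p = 1: exponent 2
```

In the linear case \(k = 1\) the exponent reduces to \(\max (1, 1/(p-1))\), since \(p'/p = 1/(p-1)\); for \(p = 2\) this is the linear dependence on the weight constant in the \(A_2\) theorem.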

3.2 Sharp aperture weighted Littlewood-Paley theorem

Here we follow Lerner [17], where the reader can find a nice introduction and some references. We begin with some definitions:

Let \(\psi \in L^1(\mathbb {R}^d)\) with \(\int _{\mathbb {R}^d} \psi (x) \, dx = 0\) satisfy

$$\begin{aligned} |\psi (x)|&\lesssim \frac{1}{(1+|x|)^{d+\epsilon }} \end{aligned}$$
(3.1)
$$\begin{aligned} \int _{\mathbb {R}^d} |\psi (x+h)-\psi (x)| \, dx&\lesssim |h|^\epsilon . \end{aligned}$$
(3.2)

We will denote the upper half-space \(\mathbb {R}^d \times (0,\infty )\) by \(\mathbb {R}^{d+1}_+\) and the \(\alpha \)-cone at x by

$$\begin{aligned} \Gamma _\alpha (x) = \left\{ (y,t) \in \mathbb {R}^{d+1}_+ :\, |y-x| \le \alpha t\right\} . \end{aligned}$$

Let \(\psi _t\) be the dilation of \(\psi \) which preserves the \(L^1\) norm, i.e., \(\psi _t(x) = t^{-d} \psi (x/t)\). We can then define the square function \(S_{\alpha ,\psi }f\) by

$$\begin{aligned} S_{\alpha ,\psi }f(x) = \left( \int _{\Gamma _\alpha (x)} |(f *\psi _t)(y)|^2 \, \frac{dy \, dt}{t^{d+1}} \right) ^{1/2}. \end{aligned}$$

We will also need a regularized version. Let \(\Phi \) be a Schwartz function such that

$$\begin{aligned} \mathbbm {1}_{B(0,1)}(x) \le \Phi (x) \le \mathbbm {1}_{B(0,2)}(x). \end{aligned}$$

We define the regularized square function \(\widetilde{S}_{\alpha ,\psi }\) by

$$\begin{aligned} \widetilde{S}_{\alpha ,\psi }f(x) = \left( \int _{\mathbb {R}^{d+1}_+} \Phi \left( \frac{x-y}{t\alpha } \right) |(f*\psi _t)(y)|^2 \, \frac{dy\, dt}{t^{d+1}} \right) ^{1/2}. \end{aligned}$$

The regularized version can be used instead of \(S_{\alpha ,\psi }\) in most cases since we have

$$\begin{aligned} S_{\alpha ,\psi }f(x) \le \widetilde{S}_{\alpha ,\psi }f(x) \le S_{2\alpha ,\psi }f(x). \end{aligned}$$
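This comparison follows from a pointwise inequality between the integrands. As a sketch, reading \(B(0,1)\) and \(B(0,2)\) as closed balls (an assumption on notation which only affects sets of measure zero), the condition on \(\Phi \) gives, for every \((y,t) \in \mathbb {R}^{d+1}_+\),

$$\begin{aligned} \mathbbm {1}_{\Gamma _\alpha (x)}(y,t) = \mathbbm {1}_{B(0,1)}\left( \frac{x-y}{t\alpha } \right) \le \Phi \left( \frac{x-y}{t\alpha } \right) \le \mathbbm {1}_{B(0,2)}\left( \frac{x-y}{t\alpha } \right) = \mathbbm {1}_{\Gamma _{2\alpha }(x)}(y,t). \end{aligned}$$

Integrating against \(|(f*\psi _t)(y)|^2 \, \frac{dy\, dt}{t^{d+1}}\) and taking square roots yields the two inequalities.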

It was proved in [17] that

$$\begin{aligned} |(\widetilde{S}_{\alpha ,\psi }f(x))^2 - (m_{Q_0}(\widetilde{S}_{\alpha ,\psi }f)^2)| \lesssim \alpha ^{2d} \sum _{m=0}^\infty 2^{-\delta m} \sum _{Q \in \mathcal {S}} \langle |f| \rangle _{2^m Q}^2 \mathbbm {1}_Q(x) \end{aligned}$$

By Theorem A in its bilinear formulation (with \(f_1=f_2=f\)), the right-hand side above can be bounded, up to a constant, by an expression of the form

$$\begin{aligned} \alpha ^{2d} \sum _{\rho \in \{0,\frac{1}{3},\frac{2}{3}\}^d} \sum _{m=0}^\infty 2^{-\delta m} (m+1) \sum _{Q \in \mathcal {S}^{\rho ,m}} \langle |f| \rangle _Q^2 \mathbbm {1}_Q(x). \end{aligned}$$

As in [17], we know (a priori) that \(m_{Q_0}(\widetilde{S}_{\alpha ,\psi }f) \rightarrow 0\) as \(|Q_0| \rightarrow \infty \), so by the triangle inequality and Fatou’s Lemma we can ignore that term (or by arguing as we did in the previous section). Finally, arguing as in the proof of Corollaries A.1 and A.2, we arrive at

$$\begin{aligned} \Vert \widetilde{S}_{\alpha ,\psi }f\Vert _{L^{p,\infty }(w)} \lesssim \alpha ^d \sup _{\mathscr {D}, \mathcal {S}} \Vert \mathcal {A}^0_{\mathcal {S}}(f,f)^{1/2}\Vert _{L^{p,\infty }(w)}, \end{aligned}$$

where the supremum is taken over all dyadic grids \(\mathscr {D}\) and all sparse collections \(\mathcal {S} \subset \mathscr {D}\). To finish the argument we recall the following result, which was shown in [13]:

$$\begin{aligned} \Vert \mathcal {A}^0_{\mathcal {S}}(f,f)^{1/2}\Vert _{L^{p,\infty }(w)} \lesssim [w]_{A_p}^{\max (\frac{1}{2}, \frac{1}{p})} \Phi _p([w]_{A_p}) \Vert f\Vert _{L^p(w)} \end{aligned}$$
(3.3)

for \(1 < p < 3\), where

$$\begin{aligned} \Phi _p(t) = {\left\{ \begin{array}{ll} 1 &{}\quad \text {if } 1<p<2 \\ 1+\log t &{}\quad \text {if } 2 \le p < 3. \end{array}\right. } \end{aligned}$$

We are thus able to extend Lerner’s estimate to \(1 < p \le 2\), obtaining

$$\begin{aligned} \Vert S_{\alpha ,\psi }f\Vert _{L^{p,\infty }(w)} \lesssim \alpha ^d [w]_{A_p}^{1/p} \Vert f\Vert _{L^p(w)} \quad \text {for }1 < p < 2 \end{aligned}$$

and

$$\begin{aligned} \Vert S_{\alpha ,\psi }f\Vert _{L^{2,\infty }(w)} \lesssim \alpha ^d [w]_{A_2}^{1/2} (1+\log [w]_{A_2}) \Vert f\Vert _{L^2(w)}. \end{aligned}$$