1 Introduction and statement of results

Given \(b>1\), \(u>0\), consider the curve

$$\begin{aligned} \Gamma _{u,b}(t)=(t, u \gamma _b(t)), \quad t\in {\mathbb {R}}, \end{aligned}$$

where \(\gamma _b\) is homogeneous of degree b, with \(\gamma _b(\pm 1)\ne 0\).

That is, there are \(c_+\ne 0\), \(c_-\ne 0\) such that

$$\begin{aligned} \gamma _b(t)= {\left\{ \begin{array}{ll} c_{{}_{\scriptstyle {+}}} t^b,\,\, &{}t>0,\\ c_{{}_{\scriptstyle {-}}} (-t)^b, \,\,&{}t<0. \end{array}\right. } \end{aligned}$$
(1.1)

For \(f\in {\mathcal {S}}({\mathbb {R}}^2)\) the Hilbert transform along \(\Gamma _{u,b}\) is defined by

$$\begin{aligned} {H}^{(u)}\, f(x)= p.v. \int _{\mathbb {R}} f(x_1-t, x_2-u \gamma _b(t) )\frac{dt}{t}. \end{aligned}$$

For an arbitrary nonempty \(U\subset {\mathbb {R}}\) consider the maximal function

$$\begin{aligned} \mathcal {H}^U \,f(x)=\sup _{u\in U} |{H}^{(u)}\, f(x)|. \end{aligned}$$
(1.2)

The individual operators \(H^{(u)}\) extend to bounded operators on \(L^p({\mathbb {R}}^2)\) for \(1<p<\infty \) (see [10, 27]). The purpose of this paper is to prove, for \(p>2\), optimal \(L^p\) bounds for the maximal operator \({\mathcal {H}}^U\) in terms of suitable properties of U.

Our maximal function is motivated by a similar one involving directional Hilbert transforms which correspond to the limiting case \(b=1\), \(c_+=-c_-\) not covered here. This maximal function for Hilbert transforms along lines was considered by Karagulyan [18] who proved that in this case the \(L^2\rightarrow L^{2,\infty } \) operator norm is bounded below by \(c \sqrt{\log (\# U)}\); the lower bound was extended to all \(L^p\) by Łaba, Marinelli and Pramanik [19]. Demeter and Di Plinio [7] showed the upper bound \(O(\log (\#U))\) for \(p>2\) (see also [6] for the sharp \(L^2\) result with bound \(O(\log (\#U))\)). Moreover there is a sharp bound \(\approx \sqrt{\log (\#U)} \) for lacunary sets of directions (see also Di Plinio and Parissis [9]) and there are other improvements for direction sets of Vargas type. Another motivation for our work comes from the recent papers [8, 16] which take up the curved cases and analyze the linear operator \(f\mapsto H^{(u(\cdot ))}f\) for special classes of measurable functions \(x\mapsto u(x)\). [16] covers the case when u(x) depends only on \(x_1\) and [8] covers the case where u is Lipschitz. The analogous questions for variable lines are still not completely resolved (cf. [1, 2] for partial \(L^p\) ranges in the one-variable case, and [15] and the references therein for partial results related to the Lipschitz case).

For our curved variant we seek to get sharp results about the dependence of the operator norm

$$\begin{aligned} \Vert {\mathcal {H}}^U\Vert _{L^p\rightarrow L^p} =\sup \{\Vert {\mathcal {H}}^U\, f\Vert _p: \Vert f\Vert _p\le 1\} \end{aligned}$$

on U. Unlike in the case for lines we obtain for \(b>1\) an optimal bound when \(p>2\) and also observe a different type of dependence on U; namely it is not the cardinality of U that determines the size of the operator norm for the maximal operator but rather the minimal number of intervals of the form (R, 2R) that is needed to cover U. This number is comparable to

$$\begin{aligned} {\mathfrak {N}}(U):=1 +\#\{n\in {\mathbb {Z}}: [2^n, 2^{n+1}] \cap U\ne \emptyset \}. \end{aligned}$$
(1.3)
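To illustrate how \({\mathfrak {N}}(U)\) differs from the cardinality of U, here is a minimal Python sketch that evaluates (1.3) for a finite set of positive parameters. The function name `dyadic_count` is hypothetical and the restriction to finite \(U\subset (0,\infty )\) is an assumption of the sketch, not of the paper.

```python
import math

def dyadic_count(U):
    """Evaluate (1.3): 1 + #{n in Z : [2^n, 2^(n+1)] meets U}, for finite U in (0, inf)."""
    indices = set()
    for u in U:
        if u <= 0:
            raise ValueError("this sketch assumes U is a subset of (0, infinity)")
        mantissa, exponent = math.frexp(u)  # u = mantissa * 2**exponent, 0.5 <= mantissa < 1
        n = exponent - 1                    # 2**n <= u < 2**(n+1)
        indices.add(n)
        if mantissa == 0.5:
            # u == 2**n is also the right endpoint of the closed interval [2**(n-1), 2**n]
            indices.add(n - 1)
    return 1 + len(indices)

# 999 points inside (1, 2) all lie in one dyadic interval ...
dense = [1 + k / 1000 for k in range(1, 1000)]
# ... while 10 lacunary points 3 * 2**k occupy 10 different dyadic intervals.
lacunary = [3.0 * 2 ** k for k in range(10)]
print(dyadic_count(dense), dyadic_count(lacunary))
```

In this example the large set has the smaller value of \({\mathfrak {N}}\), which is the quantity entering Theorem 1.1 below.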

Theorem 1.1

For every \(p\in (2,\infty )\), the operator \({\mathcal {H}}^U\) is bounded on \(L^p\) if and only if \({\mathfrak {N}}(U)<\infty \). Moreover,

$$\begin{aligned} \Vert {\mathcal {H}}^U\Vert _{L^p\rightarrow L^p}\approx \sqrt{\log ({\mathfrak {N}}(U))}\,. \end{aligned}$$

The constants implicit in this equivalence depend only on p, b and \(|c_+/c_-|\).

Remarks

  1. (i)

    The lower bound \(c\sqrt{\log ({\mathfrak {N}}(U))}\) can be extended to all \(p>1\). Indeed, if we had a smaller operator norm for some \(p_0<2\) we could, by interpolation, also deduce a better upper bound for \(p>2\) which is not possible. The lower bound for \(p<2\) is generally not efficient, see however some results for lacunary sets in Sect. 7.

  2. (ii)

    Concerning upper bounds there is no endpoint result for general U with \({\mathfrak {N}}(U)<\infty \) when \(p=2\). In fact one can show using the Besicovitch set that for \(U=[1,2]\) the operator \({\mathcal {H}}^U\) even fails to be of restricted weak type (2, 2). Cf. [24, §8.3] for the details of a similar argument in the context of maximal functions for circular means.

  3. (iii)

    In our theorem we avoid the cases \(c_\pm =0\), for the following reasons. For the case \(c_+=0=c_-\) in (1.1) the operators \(H^{(u)}\) are equal to the Hilbert transform along a fixed line and the problems on \({\mathcal {H}}^U\) become trivial. For the choices \(c_+\ne 0\), \(c_-=0\) and \(c_+=0\), \(c_-\ne 0\) the curves are unbalanced and by [5, §6] the individual operators \(H^{(u)}\) are not bounded on \(L^p\).

  4. (iv)

    The operators \({\mathcal {H}}^{U}\) are invariant under conjugation with dilation operators with respect to the second variable; i.e. if \(\delta ^{(2)}_v\,f(x)=f(x_1, vx_2)\) then we have \({\mathcal {H}}^{vU}=\delta _{v^{-1}}^{(2)}{\mathcal {H}}^{U}\delta _v^{(2)}\) and thus the \(L^p\) operator norms of \({\mathcal {H}}^{U}\) and \({\mathcal {H}}^{vU}\) are the same. This shows that any dependence on \(c_+, c_-\) in the operator norms can always be reduced to a dependence on just \(|c_+/c_-|\) as one can assume that \(c_+=1\). The implicit constants in the above theorem depend on \(c_\pm , b,p\) but are uniform as long as \(|c_+/c_-| \) is taken in a compact subset of \((0,\infty )\), and b and p are taken in compact subsets of \((1,\infty )\). Thus implicit constants in all inequalities in this paper will be allowed to depend on \(c_\pm , b\), with the above understanding of boundedness on compact sets.
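The dilation identity in Remark (iv) can be checked directly from the definition of \(H^{(u)}\). The following is a short verification, a sketch for \(v>0\) and Schwartz functions f:

$$\begin{aligned} H^{(u)}\,\delta ^{(2)}_v f(x)&= p.v. \int _{\mathbb {R}} f\big (x_1-t, v x_2-vu \gamma _b(t) \big )\frac{dt}{t} = \big (H^{(vu)} f\big )(x_1,vx_2) = \delta ^{(2)}_v H^{(vu)} f(x). \end{aligned}$$

Applying \(\delta ^{(2)}_{v^{-1}}\) and taking the supremum over \(u\in U\) gives \({\mathcal {H}}^{vU}=\delta _{v^{-1}}^{(2)}{\mathcal {H}}^{U}\delta _v^{(2)}\). Since \(\Vert \delta ^{(2)}_v f\Vert _p=v^{-1/p}\Vert f\Vert _p\), the factors \(v^{1/p}\) and \(v^{-1/p}\) produced by the two dilations cancel, which is why the operator norms agree.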

1.1 This paper

In Sect. 2 we describe the basic decomposition (2.8) of the Hilbert transform \(H^{(u)}\) into a standard nonisotropic singular integral operator \(S^u\) and two operators \(T^u_\pm \) which can be viewed as singular Fourier integral operators with favorable frequency localizations. The growth condition in terms of \(\sqrt{\log {\mathfrak {N}}(U)}\) is only relevant for the maximal function \(\sup _{u\in U} |S^uf|\) for which we prove \(L^p\) bounds for all \(1<p<\infty \). Here we use the Chang–Wilson–Wolff inequality, together with a variant of an approximation argument in [16]. It turns out that the full maximal operators associated to the \(T^u_\pm \) are bounded in \(L^p({\mathbb {R}}^2)\) for \(2<p<\infty \). This is related to space-time \(L^p\) inequalities (so-called local smoothing estimates) for Fourier integral operators in [21]. This connection has already been used by Marletta and Ricci in their work [20] on families of maximal functions along homogeneous curves. The results for \(S^u\), \(T^u_\pm \) are formulated in Sect. 2 as Theorems 2.2 and 2.3.

Section 3 contains several auxiliary results. A version of our maximal function for Mikhlin multipliers (dilated in the second variable) is given in Sect. 4; this is used to prove Theorem 2.2 in Sect. 5. Theorem 2.3 is proved in Sect. 6. In Sect. 7 we prove some results about upper bounds for the maximal functions \(\sup _{u\in U} |T^u_\pm f|\) when U is a lacunary set; one of these results will be helpful in the proof of lower bounds for the operator norm.

The proof of lower bounds is given in Sect. 8. The arguments for the lower bounds in \(L^2\) are based on ideas of Karagulyan [18]. “Appendix A” contains a Cotlar type inequality which is used in the proof of Theorem 2.2.

2 Decomposition of the Hilbert transforms

Let \(\chi _+\in C^\infty _c({\mathbb {R}})\) be supported in (1/2, 2) such that \(\sum _{j\in {\mathbb {Z}}} \chi _+(2^j t) =1\) for \(t>0\). Let \(\chi _{{}_{\scriptstyle {-}}}(t)=\chi _+(-t)\) and \(\chi =\chi _{{}_{\scriptstyle {+}}} + \chi _{{}_{\scriptstyle {-}}}\). We define measures \(\sigma _{{}_{\scriptstyle {+}}}\) and \(\sigma _{{}_{\scriptstyle {-}}}\) by

$$\begin{aligned} \langle \sigma _{{}_{\scriptstyle {\pm }}},f\rangle = \int f(t,\gamma _b(t)) \chi _{{}_{\scriptstyle {\pm }}}(t) \frac{dt}{t}. \end{aligned}$$
(2.1)

Let, for \(j\in {\mathbb {Z}}\), the measure \(\sigma _j\) be defined by

$$\begin{aligned} \langle \sigma _j ,f\rangle = \int f(t,\gamma _b(t)) \chi (2^j t) \frac{dt}{t}. \end{aligned}$$

By homogeneity of \(\gamma _b\) we see that (in the sense of distributions) \(\sigma _j = 2^{j(1+b)} \sigma _0 (\delta _{2^j}^b \cdot )\) with \(\delta _{t}^bx= (tx_1, t^bx_2)\). Observe that \(\sigma _0=\sigma _{{}_{\scriptstyle {+}}} +\sigma _{{}_{\scriptstyle {-}}}\) satisfies the cancellation condition \(\widehat{\sigma }_0(0)=0\) (where \(\widehat{\sigma } (\xi )\equiv {\mathcal {F}}[\sigma ] (\xi ) = \int e^{-i\langle x,\xi \rangle } d\sigma (x)\) denotes the Fourier transform). For Schwartz functions f the Hilbert transform \(H=H^{(1)}\) along \(\Gamma _{1,b}\) is then given by

$$\begin{aligned} H f= \sum _{j\in {\mathbb {Z}}}{\sigma _j}* f. \end{aligned}$$

2.1 Asymptotics for the Fourier transform of \(\sigma _0\)

We analyze \(\widehat{\sigma _\pm }(\xi )\) for large \(\xi \). We have

$$\begin{aligned} \widehat{\sigma _{{}_{\scriptstyle {\pm }}}} (\xi ) = \int e^{-i\psi _\pm (t, \xi )} \chi _{{}_{\scriptstyle {\pm }}}( t) \frac{dt}{t} \end{aligned}$$

with

$$\begin{aligned} \psi _{{}_{\scriptstyle {+}}}(t,\xi )&= t \xi _1 + {c_{{}_{\scriptstyle {+}}}}t^b\xi _2 ,\\ \psi _{{}_{\scriptstyle {-}}}(t,\xi )&= t \xi _1 + {c_{{}_{\scriptstyle {-}}}}(-t)^b\xi _2 . \end{aligned}$$

Observe that

$$\begin{aligned} \begin{aligned} \partial _t\psi _{{}_{\scriptstyle {+}}}(t,\xi )&= \xi _1+ c_{{}_{\scriptstyle {+}}} b t^{b-1}\xi _2, \\ \partial _t\psi _{{}_{\scriptstyle {-}}}(t,\xi )&= \xi _1- {c_{{}_{\scriptstyle {-}}}}b (-t)^{b-1}\xi _2. \end{aligned} \end{aligned}$$
(2.2)

Thus \(\psi _{{}_{\scriptstyle {+}}}\) has a critical point \(t_+(\xi )>0 \) when \(\xi _1/(c_{{}_{\scriptstyle {+}}}\xi _2)<0\), and \(\psi _{{}_{\scriptstyle {-}}}\) has a critical point \(t_-(\xi ) <0\) when \(\xi _1/(c_{{}_{\scriptstyle {-}}}\xi _2)>0\) , and \(t_\pm (\xi )\) are given by

$$\begin{aligned} t_+(\xi )= \left( \frac{-\xi _1}{bc_+\xi _2}\right) ^{\frac{1}{b-1}}, \quad t_-(\xi )= - \left( \frac{\xi _1}{bc_-\xi _2}\right) ^{\frac{1}{b-1}}. \end{aligned}$$

These critical points are nondegenerate as we have

$$\begin{aligned} \partial _{tt}\psi _{{}_{\scriptstyle {\pm }}}(t,\xi )=c_{{}_{\scriptstyle {\pm }}}b(b-1) (\pm t)^{b-2}\xi _2. \end{aligned}$$

Setting \(\Psi _\pm (\xi )=-\psi _{{}_{\scriptstyle {\pm }}}(t_\pm (\xi ),\xi )\) we get

$$\begin{aligned} \Psi _{{}_{\scriptstyle {+}}}(\xi )&= (b-1) c_{{}_{\scriptstyle {+}}}\xi _2\left( -\frac{\xi _1}{bc_{{}_{\scriptstyle {+}}}\xi _2}\right) ^{\frac{b}{b-1}},\\ \Psi _{{}_{\scriptstyle {-}}}(\xi )&= (b-1) c_{{}_{\scriptstyle {-}}}\xi _2\left( \frac{\xi _1}{bc_{{}_{\scriptstyle {-}}}\xi _2}\right) ^{\frac{b}{b-1}}. \end{aligned}$$
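As a check on the first formula one can substitute the critical point into the phase; the computation below is a sketch for \(\Psi _+\), and the case of \(\Psi _-\) is analogous. By (2.2) we have \(\xi _1=-bc_+t_+^{b-1}\xi _2\) at \(t_+=t_+(\xi )\), hence

$$\begin{aligned} \psi _+(t_+,\xi ) = t_+\xi _1 + c_+t_+^b\xi _2 = -bc_+t_+^b\xi _2 + c_+t_+^b\xi _2 = -(b-1)c_+\xi _2\, t_+^b, \end{aligned}$$

and inserting \(t_+^b=\big (-\xi _1/(bc_+\xi _2)\big )^{b/(b-1)}\) gives the stated expression for \(\Psi _+(\xi )=-\psi _+(t_+(\xi ),\xi )\).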

The functions \(\Psi _\pm \) are homogeneous of degree one and putting \(\xi _2=\pm 1\) we have the crucial lower bounds for the second derivatives of \(\xi _1\mapsto \Psi _\pm (\xi _1,\pm 1)\) needed for the application of the space time estimate in Sect. 3.4.
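To make the curvature statement explicit, here is a short computation for \(\Psi _+\) with \(\xi _2=1\), assuming \(-\xi _1/c_+>0\) so that the critical point exists; the abbreviation s is introduced only for this sketch. With \(s=-\xi _1/(bc_+)\) we have \(\Psi _+(\xi _1,1)=(b-1)c_+s^{b/(b-1)}\) and therefore

$$\begin{aligned} \partial _{\xi _1}\Psi _+(\xi _1,1) = -s^{\frac{1}{b-1}} = -t_+(\xi _1,1), \qquad \partial ^2_{\xi _1}\Psi _+(\xi _1,1) = \frac{1}{b(b-1)c_+}\, s^{\frac{2-b}{b-1}}. \end{aligned}$$

The second derivative does not vanish for \(s>0\), and its absolute value is bounded below when s stays in a compact subset of \((0,\infty )\). The remaining sign choices and \(\Psi _-\) can be treated in the same way.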

Assume \(|\xi |>1\). We observe that then

$$\begin{aligned} \inf _{1/3 \le t\le 3} \big |\partial _t \psi _{{}_{\scriptstyle {+}}} (t,\xi )\big | \gtrsim |\xi | \end{aligned}$$
(2.3a)

if \(\xi _1/(c_+\xi _2)\) does not belong to the interval \([-b (7/2)^{b-1} , -b (2/7)^{b-1} ]\).
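A sketch of why (2.3a) holds, with implicit constants depending on b and \(c_+\): write \(\rho =\xi _1/(c_+\xi _2)\) (a symbol used only here) and let \(1/3\le t\le 3\). If \(\rho <-b(7/2)^{b-1}\), then \(|bc_+t^{b-1}\xi _2|\le b3^{b-1}|c_+\xi _2|\le (6/7)^{b-1}|\xi _1|\), and if \(-b(2/7)^{b-1}<\rho <0\), then \(|\xi _1|\le b(2/7)^{b-1}|c_+\xi _2|\) while \(|bc_+t^{b-1}\xi _2|\ge b3^{-(b-1)}|c_+\xi _2|\). By (2.2) this gives, respectively,

$$\begin{aligned} \big |\partial _t \psi _+(t,\xi )\big |&\ge \big (1-(6/7)^{b-1}\big )|\xi _1|, \\ \big |\partial _t \psi _+(t,\xi )\big |&\ge b\big (3^{-(b-1)}-(2/7)^{b-1}\big )|c_+\xi _2|. \end{aligned}$$

If \(\rho \ge 0\) or \(\xi _2=0\), the two terms in (2.2) do not have opposite signs, so \(|\partial _t\psi _+(t,\xi )|\ge |\xi _1|+b3^{-(b-1)}|c_+\xi _2|\). In the first case \(|\xi _2|\lesssim |\xi _1|\) and in the second \(|\xi _1|\lesssim |\xi _2|\), so in all cases the lower bound is \(\gtrsim |\xi |\); both constants are positive because \(b>1\) and \(2/7<1/3\).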

Likewise, again for \(|\xi |>1\) we observe that

$$\begin{aligned} \inf _{-3\le t\le -1/3} \big |\partial _t \psi _{{}_{\scriptstyle {-}}}(t,\xi ) \big | \gtrsim |\xi | \end{aligned}$$
(2.3b)

if \(\xi _1/(c_-\xi _2)\) does not belong to the interval \([b (2/7)^{b-1} , b (7/2)^{b-1} ]\). These observations suggest the following decomposition of \( \sigma _0\).

Let \(\eta _0\) be supported in \(\{|\xi |\le 100\}\) and equal to 1 for \(|\xi |\le 50\). Let \(\varsigma _+\) be a \(C^\infty _c({\mathbb {R}})\) function supported on \((b (1/4)^{b-1} , b 4^{b-1} )\) which is equal to 1 on \([b (2/7)^{b-1} , b (7/2)^{b-1} ]\). Let \(\varsigma _-\) be a \(C^\infty _c({\mathbb {R}})\) function supported on \((-b 4^{b-1} , -b (1/4)^{b-1} )\) which is equal to 1 on \([-b (7/2)^{b-1} , -b (2/7)^{b-1} ]\). Then we decompose

$$\begin{aligned} \sigma _0= \phi _0+\mu _{0,+} +\mu _{0,-} \end{aligned}$$
(2.4a)

where \(\phi _0\) is given by

$$\begin{aligned} \widehat{ \phi _0}(\xi )&= \eta _0(\xi ) \widehat{\sigma }_0(\xi ) + (1-\eta _0(\xi )) \left( 1- \varsigma _- \left( \tfrac{\xi _1}{c_+\xi _2}\right) \right) \widehat{\sigma }_+(\xi )\nonumber \\&\quad + (1-\eta _0(\xi )) \left( 1- \varsigma _+\left( \tfrac{\xi _1}{c_-\xi _2}\right) \right) \widehat{\sigma }_-(\xi ) \end{aligned}$$
(2.4b)

and \(\mu _{{}_{\scriptstyle {0,\pm }}}\) are given by

$$\begin{aligned} \widehat{\mu }_{0,+} (\xi )&= (1-\eta _0(\xi )) \varsigma _- \left( \tfrac{\xi _1}{c_+\xi _2}\right) \widehat{\sigma }_+(\xi ) , \end{aligned}$$
(2.4c)
$$\begin{aligned} \widehat{\mu }_{0,-} (\xi )&= (1-\eta _0(\xi )) \varsigma _+\left( \tfrac{\xi _1}{c_-\xi _2}\right) \widehat{\sigma }_-(\xi ) . \end{aligned}$$
(2.4d)

Lemma 2.1

  1. (i)

    \(\phi _0\) is a Schwartz function with \(\widehat{\phi }_0(0)=0\).

  2. (ii)

    The function \(\widehat{\mu }_{0,+}\) is supported on

    $$\begin{aligned} {\mathrm{Sect}}_{{}_{\scriptstyle {+}}}= \left\{ \xi : |\xi |>50, \,\, -b 4^{b-1}< \frac{\xi _1}{c_+\xi _2} < -\frac{b}{4^{b-1} }\right\} \end{aligned}$$
    (2.5a)

    and satisfies

    $$\begin{aligned} \widehat{\mu }_{0,+} (\xi ) = \omega _+(\xi ) e^{i\Psi _+(\xi ) } + E_+(\xi ) \end{aligned}$$

    where \(\omega _+\) is a standard symbol of order \(-1/2\), and \(E_+(\xi )\) is a Schwartz function, both supported on \({\mathrm{Sect}}_{{}_{\scriptstyle {+}}}\).

  3. (iii)

    The function \(\widehat{\mu }_{0,-}\) is supported on

    $$\begin{aligned} {\mathrm{Sect}}_{{}_{\scriptstyle {-}}}= \left\{ \xi : |\xi |>50, \,\, \frac{b}{4^{b-1}}< \frac{\xi _1}{c_-\xi _2} < b 4^{b-1} \right\} \end{aligned}$$
    (2.5b)

    and satisfies

    $$\begin{aligned} \widehat{\mu }_{0,-} (\xi ) = \omega _-(\xi ) e^{i\Psi _-(\xi ) } + E_-(\xi ) \end{aligned}$$

    where \(\omega _-\) is a standard symbol of order \(-1/2\), and \(E_-(\xi )\) is a Schwartz function, both supported on \({\mathrm{Sect}}_{{}_{\scriptstyle {-}}}\).

Proof

In view of the lower bounds for \(\partial _t\psi _\pm \) stated in (2.3a), (2.3b) under their respective assumptions we see that \(\phi _0\) is a Schwartz function. We have that \(\widehat{\sigma }_+(0)=-\widehat{\sigma }_-(0)\) and it follows that \(\widehat{\phi }_0(0)=0\). The formulas for \(\widehat{\mu }_{0,\pm } (\xi ) \) follow by the method of stationary phase. \(\square \)

We now define \(\Phi _0\) by \(\widehat{\Phi }_0=\widehat{\phi }_0+ E_{{}_{\scriptstyle {+}}} +E_{{}_{\scriptstyle {-}}}\) so that \(\Phi _0\) is a Schwartz function with \(\widehat{\Phi }_0(0)=0\). Define \(\Phi _j\) , \(\kappa _{j,\pm } \) by

$$\begin{aligned} \widehat{\Phi }_j(\xi )=\widehat{\Phi }_0 (2^{-j}\xi _1, 2^{-jb}\xi _2) \end{aligned}$$

and

$$\begin{aligned} \widehat{\kappa _{j,\pm }}(\xi ) = \omega _\pm (2^{-j}\xi _1, 2^{-jb}\xi _2) e^{i \Psi _\pm (2^{-j}\xi _1, 2^{-jb}\xi _2)} . \end{aligned}$$

Define operators \(S^u\) and \(T^u_\pm \) by

$$\begin{aligned} \widehat{S^u f}(\xi )&= \sum _{j\in {\mathbb {Z}}} \widehat{\Phi }_j (\xi _1, u\xi _2)\widehat{f}(\xi ) \end{aligned}$$
(2.6)
$$\begin{aligned} \widehat{T_\pm ^u f}(\xi )&= \sum _{j\in {\mathbb {Z}}} \widehat{\kappa _{j,\pm }} (\xi _1, u\xi _2)\widehat{f}(\xi ) \end{aligned}$$
(2.7)

These expressions are at least well defined if f is a Schwartz function whose Fourier transform is compactly supported in \({\mathbb {R}}^2{\setminus } \{0\}\). For these functions we have then decomposed our Hilbert transform as

$$\begin{aligned} H^{(u)}\,f = S^u f + T_+^u f + T^u_-f . \end{aligned}$$
(2.8)
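For orientation, the decomposition (2.8) can be read off on the Fourier side; the following sketch only combines the scaling relation for \(\sigma _j\), the splitting (2.4a) and Lemma 2.1. The change of variable \(t=2^{-j}s\) gives \(\widehat{\sigma }_j(\xi )=\widehat{\sigma }_0(2^{-j}\xi _1,2^{-jb}\xi _2)\), and replacing \(\gamma _b\) by \(u\gamma _b\) replaces \(\xi _2\) by \(u\xi _2\). Since \(\widehat{\sigma }_0=\widehat{\Phi }_0+\omega _+e^{i\Psi _+}+\omega _-e^{i\Psi _-}\), we get for f as above

$$\begin{aligned} \widehat{H^{(u)} f}(\xi )&= \sum _{j\in {\mathbb {Z}}} \widehat{\sigma }_0(2^{-j}\xi _1, 2^{-jb}u\xi _2)\,\widehat{f}(\xi ) \\&= \sum _{j\in {\mathbb {Z}}} \Big [\widehat{\Phi }_j +\widehat{\kappa _{j,+}}+\widehat{\kappa _{j,-}}\Big ](\xi _1,u\xi _2)\,\widehat{f}(\xi ), \end{aligned}$$

which is the sum of the three multipliers in (2.6) and (2.7).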

For the upper bound in Theorem 1.1 we shall prove

Theorem 2.2

For \(1<p<\infty \),

$$\begin{aligned} \left\| \sup _{u\in U} |S^u f| \right\| _p \lesssim \sqrt{\log ({\mathfrak {N}}(U))} \Vert f\Vert _p. \end{aligned}$$
(2.9)

Theorem 2.3

For \(2<p<\infty \),

$$\begin{aligned} \left\| \sup _{u>0} |T_\pm ^u f| \right\| _p \lesssim \Vert f\Vert _p. \end{aligned}$$
(2.10)

3 Auxiliary results

3.1 The Chang–Wilson–Wolff inequality

We consider the conditional expectation operators \({\mathbb {E}}_j\) generated by dyadic cubes of side length \(2^{-j}\), i.e. cubes of the form \(\prod _{i=1}^d[n_i2^{-j}, (n_i+1)2^{-j})\) with \(n\in {\mathbb {Z}}^d\). Let \(f\in L^1_{\mathrm{loc}}({\mathbb {R}}^d)\). For each \(j \in \mathbb {Z}\), \({\mathbb {E}}_j\) is given by

$$\begin{aligned} {\mathbb {E}}_j f(x) = \frac{1}{2^{-jd}} \int _{I_j(x)} f(y) dy \end{aligned}$$

where \(I_j(x)\) is the unique dyadic cube of side length \(2^{-j}\) that contains x. Let

$$\begin{aligned} {\mathbb {D}}_j = {\mathbb {E}}_{j+1} - {\mathbb {E}}_j \end{aligned}$$

be the martingale difference operator. Let \({\mathfrak {S}}f\) be the dyadic square function, defined by

$$\begin{aligned} {\mathfrak {S}}f(x) = \left( \sum _{j\in \mathbb {Z}} |{\mathbb {D}}_j f(x)|^2 \right) ^{1/2}. \end{aligned}$$

Also let \(\mathcal {M}\) be the dyadic maximal function, given by

$$\begin{aligned} \mathcal {M} f(x) = \sup _{j\in \mathbb {Z}} |{\mathbb {E}}_j f(x)|. \end{aligned}$$
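The objects just defined are easy to experiment with in a discrete model. The Python sketch below treats a list of \(2^J\) samples as a function on [0, 1) that is constant on dyadic intervals of length \(2^{-J}\), and computes \({\mathbb {E}}_j\), \({\mathbb {D}}_j\), the square function and the maximal function for \(0\le j\le J\) only. The restriction to the unit interval and to finitely many scales, as well as all function names, are simplifying assumptions of this illustration.

```python
import math

def cond_exp(f, j):
    """E_j f: average f over the dyadic intervals of length 2**-j in [0, 1)."""
    n = len(f)
    block = n >> j  # number of samples per dyadic interval of length 2**-j
    out = []
    for start in range(0, n, block):
        avg = sum(f[start:start + block]) / block
        out.extend([avg] * block)
    return out

def martingale_differences(f):
    """Return [D_0 f, ..., D_{J-1} f] with D_j = E_{j+1} - E_j, where len(f) == 2**J."""
    J = len(f).bit_length() - 1
    if len(f) != 1 << J:
        raise ValueError("len(f) must be a power of two")
    levels = [cond_exp(f, j) for j in range(J + 1)]
    return [[a - b for a, b in zip(levels[j + 1], levels[j])] for j in range(J)]

def square_function(f):
    """Pointwise l^2 norm of the martingale differences (scales 0 <= j < J only)."""
    diffs = martingale_differences(f)
    return [math.sqrt(sum(d[i] ** 2 for d in diffs)) for i in range(len(f))]

def maximal_function(f):
    """Pointwise supremum of |E_j f| over the scales 0 <= j <= J."""
    J = len(f).bit_length() - 1
    levels = [cond_exp(f, j) for j in range(J + 1)]
    return [max(abs(level[i]) for level in levels) for i in range(len(f))]

samples = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0]
print(square_function(samples))
print(maximal_function(samples))
```

In this model the differences telescope to \(f-{\mathbb {E}}_0f\) and are mutually orthogonal, which are the two structural facts behind square function estimates of this type.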

The following is a slight variant of an inequality due to Chang et al. [4]:

Proposition 3.1

Suppose that \(f\in L^p({\mathbb {R}}^d)\cap L^\infty ({\mathbb {R}}^d)\) for some \(p<\infty \). Then there exist two universal constants \(c_1\) and \(c_2\) such that

$$\begin{aligned} {\mathrm{meas}}\Big (&\Big \{x \in \mathbb {R}^d :|f(x)|> 4 \lambda \text { and } {\mathfrak {S}}f(x) \le \varepsilon \lambda \Big \} \Big )\nonumber \\&\quad \le c_2 \exp (- c_1\varepsilon ^{-2}) {\mathrm{meas}}\Big (\Big \{x \in \mathbb {R}^d :\mathcal {M} f(x) > \lambda \Big \} \Big ) \end{aligned}$$
(3.1)

for all \(\lambda > 0\) and \(0< \varepsilon < 1/2\).

This is a scaling invariant version of the Chang–Wilson–Wolff inequality. For a detailed proof we refer to the arXiv version of our paper (arXiv:1902.00096, “Appendix B”).

We shall apply the one-dimensional version of this proposition to the vertical slices in \({\mathbb {R}}^2\). Let f be a measurable function in \(L^p({\mathbb {R}}^2)\cap L^\infty ({\mathbb {R}}^2)\), and for \(j \in \mathbb {Z}\), let \({\mathbb {E}}_j^{(2)}\) be the conditional expectation operator acting on the second variable, i.e.

$$\begin{aligned} {\mathbb {E}}_j^{(2)} \,f(x) = \frac{1}{2^{-j}} \int _{I_j(x_2)} f(x_1,y) dy \end{aligned}$$

where \(I_j(x_2)\) is the unique dyadic interval of length \(2^{-j}\) that contains \(x_2\). Let \({\mathbb {D}}_j^{(2)} = {\mathbb {E}}_{j+1}^{(2)} - {\mathbb {E}}_j^{(2)}\), and

$$\begin{aligned} {\mathfrak {S}}^{(2)}\, f(x) = \left( \sum _{j\in \mathbb {Z}} |{\mathbb {D}}_j^{(2)} f(x)|^2 \right) ^{1/2}. \end{aligned}$$

Then from the above proposition, we clearly have

$$\begin{aligned}&{\mathrm{meas}}\Big ( \Big \{x \in \mathbb {R}^2 :|f(x)|> 4 \lambda \text { and } {\mathfrak {S}}^{(2)} \,f(x) \le \varepsilon \lambda \Big \} \Big )\nonumber \\&\quad \le \, c_2 e^{-c_1\varepsilon ^{-2}} {\mathrm{meas}}\Big (\Big \{x \in \mathbb {R}^2 :\mathcal {M}^{(2)} \,f(x) > \lambda \Big \} \Big ) \end{aligned}$$
(3.2)

for all \(\lambda > 0\) and \(0< \varepsilon < \tfrac{1}{2}\), where \(\mathcal {M}^{(2)}\) is the dyadic maximal function in the second variable, i.e. \(\mathcal {M}^{(2)}\, f(x) = \sup _{j\in \mathbb {Z}} |{\mathbb {E}}^{(2)}_j \,f (x)|.\)

3.2 Martingale difference operators and Littlewood-Paley projections

We need some computations from [14] which are summarized in the following lemma. Let M denote the Hardy-Littlewood maximal operator acting on functions in \(L^p({\mathbb {R}})\). Let \(\phi \) be supported in \(( c^{-1},c)\cup (-c,-c^{-1})\) for some \(c>1\).

Lemma 3.2

Assume that \(f\in L^1+L^\infty ({\mathbb {R}})\). Then

  1. (i)

    For \(q\ge 1\), \(n\ge 0\),

    $$\begin{aligned} {\mathbb {E}}_k ({\mathcal {F}}^{-1}[\phi (2^{-k-n}\cdot ) \widehat{f}])(x) \lesssim 2^{-n(1-\frac{1}{q}) } \big (M(|f|^q)(x)\big )^{1/q} \end{aligned}$$

  2. (ii)

    For \(n\ge 0\)

    $$\begin{aligned} {\mathbb {D}}_k ({\mathcal {F}}^{-1} [\phi (2^{-k+n}\cdot ) \widehat{f}])(x) \lesssim 2^{-n } Mf(x) \end{aligned}$$

    almost everywhere.

Proof of Lemma 3.2

Cf. Sublemma 4.2 in [14]. \(\square \)

Given a function on \({\mathbb {R}}^2\) we shall apply this lemma to \(y_2\mapsto f(y_1,y_2)\) and relate the square function \({\mathfrak {S}}^{(2)}\) to Littlewood-Paley square functions in the second variable.

Let \(\chi _b\) be an even \(C^\infty \) function supported in \((2^{-b}, 2^b)\cup (-2^b, -2^{-b})\) such that \(\sum _{k\in {\mathbb {Z}}} \chi _b(2^{-kb} t)=1\) for all \(t\ne 0\). Define the Littlewood-Paley projection type operators \(P^{(1)}_k\), \(P^{(2)}_{k,b}\) acting on Schwartz functions on \({\mathbb {R}}^2\) by

$$\begin{aligned} \widehat{P^{(1)}_k\, f} (\xi )&= \chi _1(2^{-k}\xi _1) \widehat{f}(\xi ) \end{aligned}$$
(3.3)
$$\begin{aligned} \widehat{P^{(2)}_{k,b} f} (\xi )&= \chi _b(2^{-kb}\xi _2) \widehat{f}(\xi ) \end{aligned}$$
(3.4)

Lemma 3.3

Let \(q>1\), \(b>0\), and let \(g\in L^1+L^\infty \). Then the pointwise inequality

$$\begin{aligned} {\mathfrak {S}}^{(2)}\, g \le C_{b,q} \left( \sum _{k\in {\mathbb {Z}}} \left[ M^{(2)}\left( \left| P^{(2)}_{k,b} g\right| ^q\right) \right] ^{2/q} \right) ^{1/2} \end{aligned}$$

holds almost everywhere. Here \(M^{(2)}\) denotes the Hardy-Littlewood maximal operator in the second variable.

Proof of Lemma 3.3

Let \(\phi _b\) be a \(C^\infty \) function with

$$\begin{aligned} {\text {supp}}(\phi _b) \subset (2^{-b}, 2^b)\cup (-2^b, -2^{-b}) \end{aligned}$$

which equals 1 on the support of \(\chi _b\). Define \(\widehat{\tilde{P}^{(2)}_{k,b} f} (\xi )= \phi _b(2^{-kb}\xi _2) \widehat{f}(\xi ).\) We write

$$\begin{aligned} {\mathbb {D}}_k^{(2)}= \sum _{n\in {\mathbb {Z}}} \sum _{\begin{array}{c} l\in {\mathbb {Z}}:\\ n\le k-lb<n+1 \end{array}} {\mathbb {D}}_k^{(2)} \tilde{P}^{(2)}_{l,b} P^{(2)}_{l,b} \end{aligned}$$

and use Minkowski’s inequality and Lemma 3.2 to estimate, with \(\varepsilon <1-1/q\),

$$\begin{aligned} {\mathfrak {S}}^{(2)} g&\lesssim \sum _{n\in {\mathbb {Z}}} 2^{-|n|\varepsilon } \left( \sum _{k\in {\mathbb {Z}}} \left[ \sum _{\begin{array}{c} l\in {\mathbb {Z}}:\\ n\le k-lb<n+1 \end{array}} M^{(2)}(|P^{(2)}_{l,b} g|^q)\right] ^{2/q}\right) ^{1/2}\\&\lesssim \left( \sum _{l\in {\mathbb {Z}}}\big [ M^{(2)}(|P^{(2)}_{l,b} g|^q)\big ]^{2/q}\right) ^{1/2}. \end{aligned}$$

This finishes the proof of Lemma 3.3. \(\square \)

3.3 A variant of Cotlar’s inequality

Recall that \(\chi _+\in C^\infty _c({\mathbb {R}})\) is supported in (1/2, 2) with \(\sum _{j=-\infty }^\infty \chi _+(2^j t)=1\) for \(t>0\), and let \(\eta = \chi _+(|\cdot |)\).

Consider a Mikhlin–Hörmander multiplier m on \({\mathbb {R}}^d\) satisfying the assumption

$$\begin{aligned} \sup _{t>0} \Vert \eta \,m(t\cdot )\Vert _{\mathscr {L}^1_{\alpha }} =:B(m)<\infty , \quad \alpha >d; \end{aligned}$$
(3.5)

here \(\mathscr {L}^1_{\alpha }\) is the potential space of functions g with \((I-\Delta )^{\frac{\alpha }{2}}g\in L^1\). Let \(Sf={\mathcal {F}}^{-1}[m\widehat{f}]\), and for \(n\in {\mathbb {Z}}\) let \(S_n\) be defined by

$$\begin{aligned} \widehat{S_n f}(\xi ) = \sum _{j\le n} \eta (2^{-j}\xi ) m(\xi )\widehat{f}(\xi ). \end{aligned}$$

Then both S and the \(S_n\) are of weak type (1, 1) and bounded on \(L^p\) for \(p \in (1,\infty )\) with uniform operator norms \(\lesssim _p B(m)\). We are interested in bounds for the maximal function

$$\begin{aligned} S_* f(x)= \sup _{n\in {\mathbb {Z}}} |S_nf(x)| \end{aligned}$$
(3.6)

Proposition 3.4

Let \(\alpha >d\), \(r>0\) and B(m) as in (3.5). For \(f\in L^p({\mathbb {R}}^d)\), we have, for almost every x, and for \(0<\delta \le 1/2\)

$$\begin{aligned} S_* f(x) \le \frac{1}{(1-\delta )^{1/r}}\big (M (|Sf|^r)(x)\big )^{1/r} + C_{d,\alpha } \delta ^{-1} B(m) M f(x). \end{aligned}$$
(3.7)

Proposition 3.4 is a variant of the standard Cotlar inequality regarding truncations of singular integrals. A proof is included in “Appendix A”.

3.4 An \(L^p\) space time estimate for Fourier integral operators of convolution type and vector valued extensions

Let \(S(a_0, a_1)\) be the sectorial region in \({\mathbb {R}}^2\)

$$\begin{aligned} S(a_0,a_1)=\{(\xi _1,\xi _2): a_0<|\xi _1|/|\xi _2|< a_1 , \,\xi _2>0\} \end{aligned}$$

and let \(\eta _{\mathrm{sect}}\) be \(C^\infty \) and compactly supported in \(S_{\mathrm{ann}}:= S (a_0,a_1)\cap \{\xi : 1<|\xi |<2\}\). Let \(q\in C^\infty \) be defined in \(S(a_0,a_1)\) and homogeneous of degree one, satisfying

$$\begin{aligned} q_{\xi \xi }\ne 0 \quad \text { on } S(a_0,a_1) \end{aligned}$$

i.e. the Hessian \(q_{\xi \xi }\) has rank one on the sector \(S(a_0, a_1)\). Model cases for \(q(\xi )\) are given by \(|\xi |\), or \(\xi _1^2/\xi _2\) in the sector \(\{|\xi _1|\le c|\xi _2|\}\). Define

$$\begin{aligned} F_R f(x,t) = \int e^{i (\langle x,\xi \rangle +t q(\xi ))} \eta _{\mathrm{sect}}(\xi /R) \widehat{f}(\xi ) d\xi . \end{aligned}$$

We need a so-called local smoothing estimate from [21] (the terminology is supposed to indicate that the integration over a compact time interval improves on the fixed time estimate \(\Vert F_R f(\cdot ,t)\Vert _p \lesssim R^{\frac{1}{2}-\frac{1}{p}} \Vert f\Vert _p\), \(2\le p<\infty \)).

Theorem

[21] If I is a compact interval then

$$\begin{aligned} \left( \int _I \int _{{\mathbb {R}}^2} |F_R f(x,t) |^p dx\, dt\right) ^{1/p} \lesssim C_I R^{\frac{1}{2}-\frac{1}{p}-\varepsilon (p) } \Vert f\Vert _p, \end{aligned}$$
(3.8)

with \(\varepsilon (p)>0\) if \(2<p<\infty \). The estimates are uniform as \(\eta _{\mathrm{sect}}\) ranges over a bounded subset of \(C^\infty \) functions supported in \(S_{\mathrm{ann}}\).

In this paper we shall need a square-function extension of (3.8) which involves nonisotropic dilations of the associated multipliers of the form \(\xi \mapsto (2^{-j}\xi _1, 2^{-bj} \xi _2)\) with \(b\ge 1\), \(j\in {\mathbb {Z}}\) (the strict inequality \(b>1\) assumed in the introduction is not used here); see (6.8) below. We rely on a variant of a theorem in [23], for families of smooth multipliers \(\xi \mapsto m(\xi ,t)\) on \({\mathbb {R}}^d\) depending continuously on the parameter \(t\in I\), where I is a compact interval. Let \({\mathcal {P}}\) be a real matrix whose eigenvalues have positive real parts and consider the dilations \(\delta _s=\exp (s\log {\mathcal {P}})\).

Proposition 3.5

Let \(2<p<\infty \) and \(I \subset {\mathbb {R}}\) be a compact interval. Recall that \(\eta \) is a radial non-trivial \(C^\infty \) function with support in \(\{\xi :1/2<|\xi |<2\}\). Suppose

$$\begin{aligned} \sup _{t\in I} \sup _\xi |m(\xi ,t)|\le A, \end{aligned}$$

and assume that for all \(f\in {\mathcal {S}}({\mathbb {R}}^d)\),

$$\begin{aligned} \sup _{s>0}\left( \frac{1}{|I|}\int _I\big \Vert {\mathcal {F}}^{-1}[\eta m(\delta _s \cdot , t) \widehat{f} ] \big \Vert _p^pdt \right) ^{1/p} \le A\Vert f\Vert _p. \end{aligned}$$

Moreover, suppose that for all multiindices \(\alpha \) with \(|\alpha |\le d+1\),

$$\begin{aligned} \big |\partial _\xi ^{\alpha } [\eta (\xi ) m(\delta _s \xi ,t) ]\big | \le B, \quad t\in I, s>0. \end{aligned}$$

Then there is a constant \(C_p>0\) such that

$$\begin{aligned} \left( \frac{1}{|I|}\int _I \big \Vert {\mathcal {F}}^{-1}[m(\cdot , t) \widehat{f} ] \big \Vert _p^pdt \right) ^{1/p} \le C_p A \log (2+ B/A)^{1/2-1/p} \Vert f\Vert _p. \end{aligned}$$
(3.9)

The proof is exactly the same as the proof for standard multipliers in [23]. We shall use the following consequence for a square function inequality to derive (6.8).

Corollary 3.6

Let \(2<p<\infty \) and \(I\subset {\mathbb {R}}\) be a compact interval. Suppose that there is a compact subset \(K\subset {\mathbb {R}}^2{\setminus }\{0\}\) such that \(m_0(\xi ,t)=0\) if \(\xi \in K^\complement \) or \(t\in I^\complement \). Suppose that for all multiindices \(\alpha \) with \(|\alpha _1|+|\alpha _2|\le 10\),

$$\begin{aligned} |\partial _\xi ^{\alpha } m_0(\xi ,t) | \le B, \quad t\in I, \end{aligned}$$

and that

$$\begin{aligned} \sup _{t\in I} \sup _{\xi }|m_0(\xi ,t)|\le A. \end{aligned}$$

Moreover, suppose that for all \(f\in {\mathcal {S}}({\mathbb {R}}^2) \) the inequality

$$\begin{aligned} \left( \frac{1}{|I|} \int _I\left\| {\mathcal {F}}^{-1}[m_0(\cdot ,t) \widehat{f} ] \right\| _p^pdt \right) ^{1/p} \le A\Vert f\Vert _p \end{aligned}$$

holds. Define \(T_j f(x,t) \) by \(\widehat{T_jf}(\xi ,t) = m_0(\delta _{2^{-j}} \xi ,t) \widehat{f}(\xi )\). Then there is a constant \(C(K,p)\) such that for all \(\{f_j\} \in L^p(\ell ^2)\) we also have

$$\begin{aligned}&\left( \frac{1}{|I|} \int _I \left\| \left( \sum _{j\in {\mathbb {Z}}} |T_jf_j(\cdot ,t)|^2\right) ^{1/2} \right\| _p^p dt \right) ^{1/p} \nonumber \\&\qquad \le C(K,p) A \log (2+ B/A)^{1/2-1/p} \left\| \left( \sum _j|f_j|^2\right) ^{1/2}\right\| _p. \end{aligned}$$
(3.10)

Proof of Corollary 3.6

This is a straightforward consequence of Proposition 3.5 (alternatively one can adapt the proof of Proposition 3.5 to a vector-valued setting). Let \(\widetilde{\phi } \in C^\infty _c({\mathbb {R}}^2{\setminus }\{0\})\) be such that \(\widetilde{\phi }(\xi )=1\) for \(\xi \in K\). Let \({\mathcal {J}}\) be a subset of integers with the property that the supports of \(\widetilde{\phi }(\delta _{2^{-j}}\cdot ) \), \(j\in {\mathcal {J}}\), are disjoint. We may write \({\mathbb {Z}}\) as a union of \(C_K\) such families. It is sufficient to show the analogue of (3.10) with the j-summation extended over \({\mathcal {J}}\). It will be convenient to work with an enumeration \(\{j_1,j_2,\dots \}\) of \({\mathcal {J}}\).

Let \(L_j\) be defined by \(\widehat{L_j f}(\xi ) = \widetilde{\phi }(\delta _{2^{-j}}\xi )\widehat{f}(\xi ).\) Let \(g= \sum _i L_{j_i} f_{j_i} \); then by the adjoint version of the Littlewood-Paley inequality we have

$$\begin{aligned} \Vert g\Vert _p\lesssim \left\| \left( \sum _i |f_{j_i}|^2\right) ^{1/2} \right\| _p. \end{aligned}$$
(3.11)

Notice that

$$\begin{aligned} T_j g= T_j f_j \end{aligned}$$
(3.12)

for \(j\in {\mathcal {J}}\), by the disjointness condition on the supports of \(\widetilde{\phi }(\delta _{2^{-j_i}}\cdot )\). Let \(\{r_i\}_{i=1}^\infty \) denote the sequence of Rademacher functions. Applying Proposition 3.5 to the multipliers

$$\begin{aligned} m_\alpha (\xi ,t)=\sum _{i=1}^\infty r_i(\alpha ) m_0(\delta _{2^{-j_i}}\xi ,t) \end{aligned}$$

and the function \(g= \sum _{i=1}^\infty {\mathcal {F}}^{-1}[ \widetilde{\phi }(\delta _{2^{-j_i}}\cdot ) \widehat{f}_{j_i}]\) we get

$$\begin{aligned} \left( \int _0^1 \frac{1}{|I|} \int _I\left\| {\mathcal {F}}^{-1}[m_\alpha (\cdot , t) \widehat{g} ] \right\| _p^pdt\,d\alpha \right) ^{1/p} \lesssim A \log (2+ B/A)^{1/2-1/p} \Vert g\Vert _p\,. \end{aligned}$$
(3.13)

By interchanging the \(\alpha \)-integral and the \((x,t)\)-integral and applying Khintchine’s inequality we obtain

$$\begin{aligned} \left( \frac{1}{|I|} \int _I \left\| \left( \sum _{j\in {\mathcal {J}}} |T_jg(\cdot ,t)|^2\right) ^{1/2} \right\| _p^p dt \right) ^{1/p} \lesssim A \log (2+ B/A)^{1/2-1/p} \Vert g\Vert _p \end{aligned}$$

and the proof is completed by applying (3.11) and (3.12). \(\square \)

3.5 A version of the Marcinkiewicz multiplier theorem

In the proof of Proposition 7.1 we shall use a well known version of the Marcinkiewicz multiplier theorem with minimal assumptions on the number of derivatives. Let \(\eta _{{\mathrm{pr}}}\) be a nontrivial \(C^\infty _c\) function which is even in all variables and supported in \(\{\xi : 1/2<|\xi _i|\le 2, i=1,2\}\). Let \(\mathscr {L}^{ 2}_{\alpha ,\alpha }\) be the Sobolev space with mixed dominating smoothness consisting of \(g\in L^2\) such that

$$\begin{aligned} \Vert g\Vert _{\mathscr {L}^2_{\alpha ,\alpha }}=\left( \int (1+|\xi _1|^2)^{\alpha } (1+|\xi _2|^2)^{\alpha } |\widehat{g}(\xi )|^2 d\xi \right) ^{1/2} \end{aligned}$$

is finite. Let \(\alpha >1/2\) and m be a bounded function such that

$$\begin{aligned} \sup _{t_1>0, t_2>0} \Vert \eta _{{\mathrm{pr}}} \,m(t_1\cdot , t_2\cdot )\Vert _{\mathscr {L}^2_{\alpha ,\alpha }} \le B. \end{aligned}$$
(3.14)

Then we have, for \(1<p<\infty \),

$$\begin{aligned} \Vert {\mathcal {F}}^{-1}[m\widehat{f}]\Vert _p\le c_p B \Vert f\Vert _p. \end{aligned}$$
(3.15)

One can prove this using a straightforward product-type modification of Stein’s proof of the Mikhlin–Hörmander multiplier theorem in [25, §3]. One can also deduce it from Fefferman’s theorem [12], cf. [3, 13].

4 Some maximal function estimates for families of Mikhlin type multipliers on \({\mathbb {R}}^2\)

In this section we consider Mikhlin–Hörmander multipliers with respect to the dilation group \(\delta ^b_t\), \(b>0\), with \(\delta _t^b(\xi )= (t\xi _1, t^b\xi _2)\).

Theorem 4.1

Suppose that

$$\begin{aligned} \sup _{t>0} \sum _{|\alpha |\le 4} \big \Vert \partial ^\alpha \big ( \eta (\cdot ) a(\delta ^b_t\cdot )\big )\big \Vert _{L^1({\mathbb {R}}^2)} \le 1 \end{aligned}$$
(4.1)

Define, for \(n\in {\mathbb {Z}}\), the operator \(T_n\) by

$$\begin{aligned} \widehat{T_n f} (\xi )= a(\xi _1, 2^{bn}\xi _2)\widehat{f}(\xi ) . \end{aligned}$$
(4.2)

Let \({\mathcal {N}}\) be a subset of \({\mathbb {Z}}\) with \(\# {\mathcal {N}}=N\). Then for \(1<p<\infty \),

$$\begin{aligned} \left\| \sup _{n\in {\mathcal {N}}} |T_n f|\right\| _p \le C_p \sqrt{\log (1+N)} \Vert f\Vert _p. \end{aligned}$$
(4.3)

By the Marcinkiewicz interpolation theorem it suffices to show that there is \(A=A(p)\) such that the inequality

$$\begin{aligned} {\mathrm{meas}}\left( \{x: \sup _{n\in {\mathcal {N}}} |T_n f|>4\lambda \}\right) \le \big ( A \sqrt{\log (1+N)} \lambda ^{-1} \Vert f\Vert _p\big )^p \end{aligned}$$
(4.4)

holds for all Schwartz functions f whose Fourier transform is compactly supported in \({\mathbb {R}}^2{\setminus }\{0\}\), all \(\lambda >0\) and all \({\mathcal {N}}\) with \( \#{\mathcal {N}}\le N\).

One can decompose

$$\begin{aligned} a(\xi _1,\xi _2)=\sum _{j\in {\mathbb {Z}}} a_j (2^{-j}\xi _1,2^{-bj}\xi _2) \end{aligned}$$
(4.5)

where each \(a_j\) is supported in \(\{(\xi _1,\xi _2): 1/2< |\xi _1|+ |\xi _2|^{1/b}< 2\}\) and

$$\begin{aligned} \sup _j\int \big |\partial _\xi ^\alpha a_j(\xi ) \big |\, d\xi \le C_\alpha , \quad |\alpha |\le 4. \end{aligned}$$

We shall repeatedly use that the operators \(T_n\) are bounded on \(L^p({\mathbb {R}}^2)\) with norm independent of n. This follows by the Mikhlin-Hörmander multiplier theorem and rescaling in the second variable.

Let \({\mathcal {T}}_{\mathcal {N}}f:=\sup _{n\in {\mathcal {N}}} |T_n f|\) and set

$$\begin{aligned} \varepsilon _N := (\log (C_1N))^{-1/2} \end{aligned}$$
(4.6)

where \(C_1>c_1^{-1} \) with \(c_1\) as in (3.2), and where also \(\varepsilon _N<1/2\). Since f is a Schwartz function with \(\widehat{f}\) compactly supported in \({\mathbb {R}}^2{\setminus } \{0\}\), the function \({\mathcal {T}}_{\mathcal {N}}f\) is in \(L^\infty \cap L^2\), which allows us to apply the Chang-Wilson-Wolff inequality.

We have that

$$\begin{aligned}&{\mathrm{meas}}\big (\{x\in \mathbb {R}^2:{\mathcal {T}}_{\mathcal {N}}f(x)>4\lambda \} \big ) \nonumber \\&\quad \le \sum _{n\in {\mathcal {N}}} {\mathrm{meas}}\big (\{x\in \mathbb {R}^2:|T_n f(x)|>4\lambda ,\,{\mathfrak {S}}^{(2)} T_n f(x)\le \varepsilon _N \lambda \} \big )\nonumber \\&\qquad + {\mathrm{meas}}\left( \left\{ x\in \mathbb {R}^2 : \sup _{ n\in {\mathcal {N}}} |{\mathfrak {S}}^{(2)} [T_n f](x)| > \varepsilon _N \lambda \right\} \right) . \end{aligned}$$
(4.7)

By the Chang–Wilson–Wolff inequality (3.2), the first term on the right hand side of (4.7) is bounded by

$$\begin{aligned}&c_2 Ne^{-c_1\varepsilon _N^{-2}} \max _{n\in {\mathcal {N}}} {\mathrm{meas}}\big (\big \{x\in {\mathbb {R}}^2: \mathcal {M}^{(2)}[T_nf]>\lambda \big \}\big )\\&\quad \le c_2 Ne^{-c_1\varepsilon _N^{-2}} \max _{n\in {\mathcal {N}}}\lambda ^{-p} \Vert {\mathcal {M}}^{(2)}[T_n f]\Vert _p^p \lesssim Ne^{-c_1\varepsilon _N^{-2}} \lambda ^{-p} \Vert f\Vert _p^p \lesssim \lambda ^{-p} \Vert f\Vert _p^p \end{aligned}$$

where we used that \(Ne^{-c_1\varepsilon _N^{-2}}\le 1\) (by (4.6)) and that the operators \(T_n\) are uniformly bounded.

By Chebyshev’s inequality the second term on the right hand side of (4.7) is bounded by

$$\begin{aligned}&\varepsilon _N^{-p} \lambda ^{-p} \left\| \sup _{n\in {\mathcal {N}}} {\mathfrak {S}}^{(2)}[T_n f] \right\| _{L^p}^p\\&\quad \lesssim \varepsilon _N^{-p} \lambda ^{-p} \left\| \sup _{n\in {\mathcal {N}}} \left( \sum _{k\in {\mathbb {Z}}} \left[ M^{(2)}( |T_n P^{(2)}_{k,b} f|^q) \right] ^{2/q} \right) ^{1/2} \right\| _p^p\,. \end{aligned}$$

Here we have used Lemma 3.3 with \(g=T_n f\) and the fact that the operators \(T_n\) and \(P^{(2)}_{k,b} \) commute; q will be chosen so that \(1< q < p\).

We shall now use an idea in [16] and approximate the operators \(T_n \) by a convolution operator acting in the first variable. Define \(T^{(1)} \) by

$$\begin{aligned} \widehat{T^{(1)} f} (\xi _1,\xi _2)= \sum _{j\in {\mathbb {Z}}} a_j(2^{-j}\xi _1,0) \widehat{f}(\xi _1,\xi _2). \end{aligned}$$

Recall the definition of \(\chi _b\) in Lemma 3.3. Notice also that

$$\begin{aligned} a_j(2^{-j}\xi _1, 2^{(n-j)b}\xi _2)\chi _{b}(2^{-kb}\xi _2)\equiv 0 \end{aligned}$$

if \(j<n+k-1\) and therefore we have

$$\begin{aligned} T_n P^{(2)}_{k,b}f&= \sum _{j\ge n+k-1} {\mathcal {F}}^{-1} [ a_j(2^{-j} \cdot , 2^{(n-j)b}\cdot )] * P^{(2)}_{k,b}f\nonumber \\&= \sum _{j\ge n+k-1} {\mathcal {F}}^{-1} [ a_j(2^{-j} \cdot , 0)] * P^{(2)}_{k,b}f \end{aligned}$$
(4.8a)
$$\begin{aligned}&\quad + \sum _{j\ge n+k-1} {\mathcal {F}}^{-1} [ a_j(2^{-j} \cdot , 2^{(n-j)b}\cdot )- a_j(2^{-j} \cdot , 0)] * P^{(2)}_{k,b}f. \end{aligned}$$
(4.8b)

For the first term (4.8a) we use the one-dimensional version of Proposition 3.4 to get

$$\begin{aligned} \left| \sum _{j\ge n+k-1} {\mathcal {F}}^{-1} [ a_j(2^{-j} \cdot , 0)] * P^{(2)}_{k,b}f \right| \lesssim M ^{(1)} \left( P^{(2)}_{k,b}f\right) + M ^{(1)}\left( T^{(1)} P^{(2)}_{k,b} f\right) .\nonumber \\ \end{aligned}$$
(4.9)

Here \(M^{(1)}\) denotes the Hardy–Littlewood maximal operator acting on the first variable.

Now consider the second term (4.8b). Let \(\tilde{\phi }\) be an appropriately chosen non-negative bump function supported in \((1/4,3)\cup (-3, -1/4)\) and let \(K_{j,k,n}\) be the convolution kernel with multiplier

$$\begin{aligned} \widehat{K_{j,k,n}} (\xi ) =\tilde{\phi }(2^{-kb}\xi _2)\big ( a_j(2^{-j} \xi _1, 2^{(n-j)b}\xi _2) - a_j(2^{-j} \xi _1, 0)\big ). \end{aligned}$$

Then

$$\begin{aligned} \widehat{K_{j,k,n}} (2^j\xi _1, 2^{kb}\xi _2) = 2^{(k+n-j)b} \tilde{\phi }(\xi _2) \xi _2 \int _0^1 \partial _{2} a_j (\xi _1, 2^{(k+n-j)b}s\xi _2) \, ds \end{aligned}$$

and we have \(\big \Vert \partial ^\alpha \big ( \widehat{K_{j,k,n}} (2^j\cdot , 2^{kb}\cdot ) \big )\big \Vert _1 \lesssim 2^{(k+n-j)b}\) for multiindices \(|\alpha |\le 3\). This implies

$$\begin{aligned} |K_{j,k,n}(x)|\lesssim 2^{(k+n-j)b} \frac{2^{j+kb}}{ (1+2^j|x_1|+ 2^{kb}|x_2|)^{3} } \end{aligned}$$

and hence

$$\begin{aligned} \sum _{j\ge n+k-1} \left| K_{j,k,n} * P^{(2)}_{k,b} f(x) \right| \lesssim M_{\mathrm{str}}\left( P^{(2)}_{k,b} f\right) (x) \end{aligned}$$

where \(M_{\mathrm{str}}\) is the strong maximal operator which is controlled by \(M^{(2)}\circ M^{(1)}\).
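The two computations behind the treatment of (4.8b) can be spelled out as follows; this is only a sketch using the definitions above, and the abbreviation \(\tau =2^{(k+n-j)b}\) is introduced here for convenience. By the fundamental theorem of calculus in the second variable,

$$\begin{aligned} a_j(\xi _1, \tau \xi _2)- a_j(\xi _1,0) = \int _0^1 \frac{d}{ds}\big [ a_j(\xi _1, \tau s\xi _2)\big ]\, ds = \tau \, \xi _2 \int _0^1 \partial _2 a_j(\xi _1,\tau s \xi _2)\, ds, \end{aligned}$$

which, after multiplication by \(\tilde{\phi }(\xi _2)\), is the stated formula for \(\widehat{K_{j,k,n}}(2^j\xi _1,2^{kb}\xi _2)\). The pointwise bound for \(K_{j,k,n}\) shows that convolution with \(|K_{j,k,n}|\) is dominated by \(2^{(k+n-j)b} M_{\mathrm{str}}\), and the sum over j is then a geometric series:

$$\begin{aligned} \sum _{j\ge n+k-1} 2^{(k+n-j)b} = \sum _{i\ge -1} 2^{-ib} = \frac{2^{b}}{1-2^{-b}}. \end{aligned}$$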

Combining the estimates we thus see that the second term on the right hand side of (4.7) is bounded by

$$\begin{aligned}&\varepsilon _N^{-p}\lambda ^{-p} \left( \left\| \left( \sum _{k\in {\mathbb {Z}}} \left[ M^{(2)}( |M^{(2)}M^{(1)} P^{(2)}_{k,b} f|^q) \right] ^{2/q} \right) ^{1/2} \right\| _p\right. \\&\qquad \left. + \left\| \left( \sum _{k\in {\mathbb {Z}}} \left[ M^{(2)}( |M^{(2)}M^{(1)} T^{(1)} P^{(2)}_{k,b} f|^q) \right] ^{2/q} \right) ^{1/2} \right\| _p\right) ^p . \end{aligned}$$

We use this with \(1<q<p\) and apply Fefferman-Stein estimates for the vector-valued versions of \(M^{(1)}\) and \(M^{(2)}\) and the Marcinkiewicz-Zygmund theorem on \(L^p(\ell ^2) \) boundedness applied to the operator \(T^{(1)}\). Consequently the last expression can be bounded by

$$\begin{aligned} C_p^p \varepsilon _N^{-p}\lambda ^{-p}\Vert f\Vert _p^p \lesssim C_p^p (\log (1+N))^{p/2}\lambda ^{-p}\Vert f\Vert _p^p\, , \end{aligned}$$

by the definition of \(\varepsilon _N\). This finishes the proof of (4.4) and thus the proof of Theorem 4.1. \(\square \)
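The last inequality uses only the definition (4.6). A short verification, assuming \(N\ge 1\) and writing \(\log ^+ C_1=\max \{\log C_1,0\}\):

$$\begin{aligned} \varepsilon _N^{-p}= (\log (C_1N))^{p/2}, \qquad \log (C_1 N) \le \log ^+ C_1+\log N \le \Big (1+\frac{\log ^+ C_1}{\log 2}\Big ) \log (1+N), \end{aligned}$$

where we used \(\log N\le \log (1+N)\) and \(\log (1+N)\ge \log 2\).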

5 Proof of Theorem 2.2

We decompose \( \Phi _0=\sum _{l\in {\mathbb {Z}}} \Phi _{0,l}\) where \(\widehat{\Phi }_{0,l}(\xi )= \chi _+(2^{-l}|\xi | )\widehat{\Phi }_{0}(\xi )\). Define

$$\begin{aligned} a_{0,l} (\xi )&= \widehat{\Phi }_{0,l} (2^l \xi ),\\ \widetilde{a}_{0,l,s } (\xi )&= s^{b}\xi _2 \frac{\partial \widehat{\Phi }_{0,l}}{\partial \xi _2} (2^l \xi _1, 2^l s^b\xi _2 ). \end{aligned}$$

Then the functions \(a_{0,l}\) and \(\widetilde{a} _{0,l,s} \), for every \(s\in (1/2,2)\), are supported in \(\{\xi : 10^{-b}<|\xi |<10^b\}\) and satisfy the estimates

$$\begin{aligned} \int \big |\partial _\xi ^\alpha a_{0,l}(\xi )\big |d\xi + \int \big |\partial _\xi ^\alpha \widetilde{a}_{0,l,s}(\xi )\big |d\xi \le C 2^{-|l|} \end{aligned}$$

for all multiindices \(\alpha \) with \(|\alpha _1|+|\alpha _2| \le 10\). This means that there is a \(c>0\) such that the multipliers

$$\begin{aligned} \begin{aligned} a_l(\xi )&= c2^{|l|} \sum _{j\in {\mathbb {Z}}} a_{0,l} (2^{-j}\xi _1,2^{-jb} \xi _2),\\ \widetilde{a}_{l,s} (\xi )&= c2^{|l|} \sum _{j\in {\mathbb {Z}}} \widetilde{a}_{0,l,s} (2^{-j}\xi _1,2^{-jb} \xi _2) \end{aligned} \end{aligned}$$
(5.1)

satisfy the condition (4.1) in Theorem 4.1. Now define operators \(S^u_l\) and \(R^u_l\) by

$$\begin{aligned} \widehat{S^{u}_l f}(\xi )&= \sum _{j\in {\mathbb {Z}}} \widehat{\Phi _{0,l}}(2^{-j}\xi _1, 2^{-jb}u\xi _2) \widehat{f}(\xi ),\\ \widehat{R^{u}_l f}(\xi )&= \sum _{j\in {\mathbb {Z}}} \widehat{\Phi _{0,l}}(2^{l-j}\xi _1, 2^{l-jb}u\xi _2) \widehat{f}(\xi ). \end{aligned}$$

The assertion of the theorem follows if we can prove

$$\begin{aligned} \left\| \sup _{u\in U} |S^{u}_{l} f| \right\| _p \lesssim 2^{-|l|} \sqrt{\log {\mathfrak {N}}(U)} \Vert f\Vert _p \end{aligned}$$

which follows by isotropic rescaling from

$$\begin{aligned} \left\| \sup _{u\in U} |R^{u}_{l} f| \right\| _p \lesssim 2^{-|l|} \sqrt{\log {\mathfrak {N}}(U)} \Vert f\Vert _p. \end{aligned}$$
(5.2)

Now let

$$\begin{aligned} {\mathcal {N}}=\{n\in {\mathbb {Z}}: \exists s\in (1/2,2) \text { such that } (2^{n}s)^{b} \in U\}. \end{aligned}$$

Observe that \(\#{\mathcal {N}}\le C(b) {\mathfrak {N}}(U)\). The inequality (5.2) follows from

$$\begin{aligned} \big \Vert \sup _{n\in {\mathcal {N}}} \sup _{1/2<s<2} |R^{(2^{n}\,s)^b}_{l} f| \big \Vert _p \lesssim 2^{-|l|} \sqrt{\log (1+\#{\mathcal {N}})} \Vert f\Vert _p \end{aligned}$$

which is a consequence of

$$\begin{aligned} \left\| \sup _{n\in {\mathcal {N}}} |R^{2^{nb}}_{l} \,f | \right\| _p \lesssim 2^{-|l|} \sqrt{\log (1+\#{\mathcal {N}})} \Vert f\Vert _p \end{aligned}$$
(5.3)

and

$$\begin{aligned} \int _{1/2}^{2} \left\| \sup _{n\in {\mathcal {N}}} \left| \frac{\partial }{\partial s} R^{(2^{n}\,s)^b}_{l} \,f \right| \right\| _p\,ds\, \lesssim 2^{-|l|} \sqrt{\log (1+\#{\mathcal {N}})} \Vert f\Vert _p \,. \end{aligned}$$
(5.4)

Since

$$\begin{aligned} {\mathcal {F}}[R^{2^{nb}}_{l}\, f](\xi )&= \sum _j a_{0,l}(2^{-j}\xi _1, 2^{nb-jb}\xi _2) \widehat{f}(\xi ),\\ {\mathcal {F}}[\partial _s R^{(2^{n}\,s)^b}_{l} \,f ](\xi )&= \frac{b}{s} \sum _j \widetilde{a}_{0,l,s} (2^{-j}\xi _1, 2^{nb-jb}\xi _2) \widehat{f}(\xi ), \end{aligned}$$

the inequalities (5.3) and (5.4) follow by applying Theorem 4.1 to the multipliers in (5.1). \(\square \)
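The reduction of the supremum over \(s\in (1/2,2)\) to (5.3) and (5.4) rests on an elementary bound, which we sketch here. For fixed x and n write \(G_n(s)=R^{(2^n s)^b}_l f(x)\), a notation used only in this remark. Then, for \(1/2<s<2\),

$$\begin{aligned} |G_n(s)|\le |G_n(1)| + \int _{1/2}^2 |G_n'(\sigma )|\, d\sigma , \qquad \sup _{n\in {\mathcal {N}}}\sup _{1/2<s<2} |G_n(s)| \le \sup _{n\in {\mathcal {N}}} |G_n(1)| + \int _{1/2}^2 \sup _{n\in {\mathcal {N}}} |G_n'(\sigma )|\, d\sigma . \end{aligned}$$

Since \(G_n(1)=R^{2^{nb}}_l f(x)\), taking \(L^p\) norms in x and applying Minkowski's inequality to the \(\sigma \)-integral bounds the left hand side by the sum of the left hand sides of (5.3) and (5.4).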

6 Proof of Theorem 2.3

We only consider the maximal function for the operator \(T^u_+\), since the analogous problem for \(T^u_-\) can be reduced to the former one by a change of variable (with a different curve). We omit the subscript and set \(T^u=T^u_+\).

Decompose \(\kappa _{0,+}=\sum _{\ell =0}^{\infty } \kappa _{0,\ell }\) where

$$\begin{aligned} \widehat{\kappa _{0,\ell }}(\xi ) = \chi _+(2^{-\ell }|\xi |) \omega _+(\xi ) e^{i\Psi _+(\xi )}. \end{aligned}$$

Notice that, by Lemma 2.1, \(|\xi _1|\approx |\xi _2|\approx 2^{\ell }\) for \( \xi \in {\text {supp}}(\widehat{\kappa _{0,\ell }})\). Define \(\kappa _{j,\ell }\) by \(\widehat{\kappa _{j,\ell }} (\xi )= \widehat{\kappa _{0,\ell }} (2^{-j}\xi _1,2^{-jb}\xi _2) \) and define \(T^u_{j,\ell }\) by

$$\begin{aligned} \widehat{T^u_{j,\ell } f}(\xi ) = \widehat{\kappa _{j,\ell } }(\xi _1,u \xi _2) \widehat{f}(\xi ). \end{aligned}$$
(6.1)

Then we have \(T^u=\sum _{\ell \ge 0} \sum _{j\in {\mathbb {Z}}} T^u_{j,\ell }\).

The assertion of the theorem follows if we can show, for \(2< p < \infty \), that there exists some \(\varepsilon =\varepsilon (p)>0\) with

$$\begin{aligned} \left\| \sup _{n\in {\mathbb {Z}}} \sup _{1/2<s<2} \left| \sum _{j\in {\mathbb {Z}}} T_{j,\ell }^{(2^{n}s)^{b}} \,f\right| \,\right\| _p\lesssim 2^{-\ell \varepsilon } \Vert f\Vert _p. \end{aligned}$$
(6.2)

Define \({\mathcal {R}}^u_{j,\ell } \) by

$$\begin{aligned} \widehat{{\mathcal {R}}^u_{j,\ell } f}(\xi ) = \widehat{\kappa _{0,\ell } }(2^{\ell -j}\xi _1,2^{\ell -jb}u \xi _2) \widehat{f}(\xi ). \end{aligned}$$

By isotropic rescaling, inequality (6.2) is equivalent to

$$\begin{aligned} \left\| \sup _{n\in {\mathbb {Z}}} \sup _{1/2<s<2} \left| \sum _{j\in {\mathbb {Z}}} {\mathcal {R}}_{j,\ell } ^{(2^{n}s)^{b}} \,f\right| \right\| _p\lesssim 2^{-\ell \varepsilon } \Vert f\Vert _p. \end{aligned}$$
(6.3)

This inequality follows, by the embedding \(\ell ^p\subset \ell ^\infty \) and Fubini’s theorem, from

$$\begin{aligned} \left( \sum _{n\in {\mathbb {Z}}} \left\| \sup _{ 1/2< s<2} \left| \sum _{j\in {\mathbb {Z}}} {\mathcal {R}}_{j,\ell }^{(2^n s)^{b}}\,f \right| \right\| _p^p\right) ^{1/p} \lesssim 2^{-\ell \varepsilon } \Vert f\Vert _p \end{aligned}$$
(6.4)

Fix n and x, and set \(G(s)= \sum _j {\mathcal {R}}_{j,\ell } ^{(2^n s)^{b}} \,f(x)\). We use the standard argument of applying the fundamental theorem of calculus to \(|G(s)|^p\) and then Hölder’s inequality which gives

$$\begin{aligned} |G(s)|^p \le |G(1)|^p + p \left( \int _{1/2}^2 |G(s) |^pds\right) ^{1/p'} \left( \int _{1/2}^2 |G'(s) |^pds\right) ^{1/p} . \end{aligned}$$

This inequality and another application of Hölder’s inequality in \({\mathbb {R}}^2\) show that (6.4) follows from

$$\begin{aligned} \left( \sum _{n\in {\mathbb {Z}}} \int _{1/2}^2 \left\| \sum _j {\mathcal {R}}_{j,\ell }^{(2^n s)^{b}}\,f \right\| _p^pds\right) ^{1/p}\lesssim & {} 2^{-\ell (\varepsilon +1/p)} \Vert f\Vert _p, \end{aligned}$$
(6.5a)
$$\begin{aligned} \left( \sum _{n\in {\mathbb {Z}}} \int _{1/2}^2 \left\| \frac{\partial }{\partial s} \Big (\sum _j {\mathcal {R}}_{j,\ell } ^{(2^n s)^{b}}\,f\Big ) \right\| _p^pds\right) ^{1/p}\lesssim & {} 2^{\ell -\ell (\varepsilon +1/p)} \Vert f\Vert _p \end{aligned}$$
(6.5b)

and

$$\begin{aligned} \left( \sum _{n\in {\mathbb {Z}}} \left\| \sum _j {\mathcal {R}}_{j,\ell }^{2^{nb}}\,f \right\| _p^p \right) ^{1/p} \lesssim 2^{-\ell /p} \Vert f\Vert _p \end{aligned}$$
(6.5c)

for \(2< p < \infty \).
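For the reader's convenience, here is a sketch of how (6.5a), (6.5b) and (6.5c) combine to give (6.4); we assume, as we may, that \(\varepsilon \le 1/p\). First, for \(s\in [1/2,2]\) and \(p'=p/(p-1)\),

$$\begin{aligned} |G(s)|^p-|G(1)|^p = \int _1^s \frac{d}{d\sigma } |G(\sigma )|^p\, d\sigma \le p \int _{1/2}^2 |G(\sigma )|^{p-1} |G'(\sigma )|\, d\sigma \le p \left( \int _{1/2}^2 |G |^p d\sigma \right) ^{1/p'} \left( \int _{1/2}^2 |G' |^pd\sigma \right) ^{1/p}. \end{aligned}$$

Next, denote by \(\mathrm {I}_a\), \(\mathrm {I}_b\), \(\mathrm {I}_c\) the left hand sides of (6.5a), (6.5b), (6.5c) (labels introduced only for this remark). Integrating in x, summing in n and applying Hölder's inequality with exponents \(p'\) and p gives

$$\begin{aligned} \sum _{n\in {\mathbb {Z}}} \Big \Vert \sup _{1/2<s<2} |G(s)| \Big \Vert _p^p \le \mathrm {I}_c^p + p\, \mathrm {I}_a^{p-1}\, \mathrm {I}_b \lesssim \Big ( 2^{-\ell } + 2^{-\ell (\varepsilon +1/p)(p-1)}\, 2^{\ell -\ell (\varepsilon +1/p)} \Big ) \Vert f\Vert _p^p = \big ( 2^{-\ell } + 2^{-\ell \varepsilon p}\big ) \Vert f\Vert _p^p, \end{aligned}$$

since \(-(\varepsilon +1/p)(p-1)+1-(\varepsilon +1/p)=-\varepsilon p\). Taking p-th roots yields the bound \(2^{-\ell \varepsilon }\Vert f\Vert _p\) in (6.4).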

We focus on the derivation of the inequality (6.5a). Note that for \(s\in [1/2,2]\)

$$\begin{aligned} \widehat{\kappa _{0,\ell }}(\xi _1, s^b \xi _2)&= \omega _+(\xi _1, s^b\xi _2) \chi _+(2^{-\ell } |(\xi _1, s^b \xi _2)| ) e^{i \Psi _+(\xi _1, s^b\xi _2)}\\&= 2^{-\ell /2} \eta _{\ell ,s}(2^{-\ell }\xi ) e^{is^{-\frac{b}{b-1}}\Psi _+(\xi _1, \xi _2)} \end{aligned}$$

where

$$\begin{aligned} \eta _{\ell ,s}(\xi _1,\xi _2) = 2^{\ell /2} \omega _+(2^{\ell } \xi _1, 2^{\ell } s^b\xi _2) \chi _+(|(\xi _1, s^b \xi _2)| ) \end{aligned}$$

and taking into account that \(\omega _+\) is a symbol of order \(-1/2\) we see that the \(\eta _{\ell ,s}\) belong to a bounded set of \(C^\infty \) functions supported in an annulus \(\{\xi : a_0\le |\xi |\le a_0^{-1}\}\), for fixed \(a_0=a_0(b)<1\).

After changing variables \(t=s^{-\frac{b}{b-1}}\), with \(t\in (2^{-\frac{b}{b-1}} ,2^{\frac{b}{b-1}}) \) this puts us in the position to apply (3.8) with \(R=2^{\ell }\) and we obtain, with suitable \(\varepsilon '=\varepsilon '(p)>0\)

$$\begin{aligned} \left( \int _{1/2}^2 \Big \Vert \mathcal {F}^{-1}[ \widehat{\kappa _{0,\ell }} (\xi _1, s^b \xi _2) \widehat{f}]\Big \Vert _p ^p ds\right) ^{1/p} \lesssim 2^{-\ell (\varepsilon '+1/p)} \Vert f\Vert _p. \end{aligned}$$

By isotropic scaling, replacing \( \widehat{\kappa _{0,\ell }} (\xi _1, s^b \xi _2) \) with \( \widehat{\kappa _{0,\ell }} (2^\ell \xi _1, s^b 2^\ell \xi _2) \), we also have

$$\begin{aligned} \left( \int _{1/2}^2 \big \Vert {\mathcal {R}}_{0,\ell } ^{s^b}f \big \Vert _p ^p ds\right) ^{1/p} \le C_\varepsilon 2^{-\ell (\varepsilon '+1/p)} \Vert f\Vert _p. \end{aligned}$$
(6.6)

Let

$$\begin{aligned} m_{j, \ell } (\xi , s) = \widehat{\kappa _{0,\ell } }\left( 2^{\ell -j} \xi _1, s^b 2^{\ell -jb} \xi _2\right) \end{aligned}$$

and observe \(\widehat{{\mathcal {R}}_{j,\ell } ^{s^{b}}f} (\xi ) = m_{j, \ell } (\xi , s) \widehat{f}(\xi )\). The functions \(\xi \mapsto m_{0,\ell }(\xi ,s) \) are supported in a fixed annulus and satisfy

$$\begin{aligned} \left| \partial _{\xi _1}^{\alpha _1 } \partial _{\xi _2}^{\alpha _2} m_{0,\ell }(\xi ,s) \right| \lesssim 2^{\ell (\alpha _1+\alpha _2)} . \end{aligned}$$
(6.7)

By Corollary 3.6 we get the inequality

$$\begin{aligned}&\left( \int _{1/2}^2 \left\| \left( \sum _{j\in {\mathbb {Z}}} | {\mathcal {R}}^{s^{b}}_{j,\ell } f_j |^2\right) ^{1/2} \right\| _p^p ds\right) ^{1/p} \nonumber \\&\quad \lesssim 2^{-\ell (\varepsilon '+1/p)} (1+\ell )^{1/2-1/p} \left\| \left( \sum _j|f_j|^2\right) ^{1/2}\right\| _p. \end{aligned}$$
(6.8)

We can replace the multipliers \(m_{j, \ell }(\xi _1,\xi _2,s) \) by \(m_{j, \ell }( \xi _1,2^{nb} \xi _2, s) \), after scaling in the second variable. This means that for every fixed n we have proved, for \(\varepsilon <\varepsilon '\),

$$\begin{aligned} \left( \int _{1/2}^2 \left\| \left( \sum _j| {\mathcal {R}}_{j,\ell }^{(2^n s)^{b}} \,f_j|^2 \right) ^{1/2} \right\| _p^pds\right) ^{1/p} \lesssim 2^{-\ell (\varepsilon +1/p)} \left\| \left( \sum _j |f_j|^2\right) ^{1/2}\right\| _p, \end{aligned}$$
(6.9)

with the implicit constant independent of n.

We now combine this with Littlewood-Paley inequalities to prove (6.5a). Let \(\widetilde{\chi }^{(1)} \) be an even \(C^\infty \) function supported on \(\{\xi _1: |c_+|b 2^{-3b-1} \le |\xi _1|\le |c_+|b 2^{3b+1}\}\) and equal to 1 for \( |c_+|b 2^{-3b} \le |\xi _1|\le |c_+|b 2^{3b}\). Let \(\widetilde{\chi }_b^{(2)} \) be an even \(C^\infty \) function supported on \(\{\xi _2: 2^{-2b-1} \le |\xi _2|\le 2^{2b+1}\}\) and equal to 1 for \( 2^{-2b} \le |\xi _2|\le 2^{2b}\). Define \(\tilde{P}^{(1)}_{j} \), \(\tilde{P}^{(2)}_{j,b} \) by

$$\begin{aligned} \widehat{\tilde{P}^{(1)}_{j} f}(\xi )&= \widetilde{\chi }^{(1)} (2^{-j}\xi _1) \widehat{f}(\xi )\\ \widehat{\tilde{P}^{(2)}_{j,b} f}(\xi )&= \widetilde{\chi }^{(2)} (2^{-jb}\xi _2) \widehat{f}(\xi ) \end{aligned}$$

Then by the support properties of \(\widehat{\kappa _{0,\ell }}(2^\ell \cdot )\) we get for \(1/2\le s\le 2\)

$$\begin{aligned} {\mathcal {R}}_{j,\ell }^{(2^n s)^{b}}= \tilde{P}^{(1)}_{j} \tilde{P}^{(2)}_{j-n,b} {\mathcal {R}}_{j,\ell }^{(2^n s)^{b}} \tilde{P}^{(2)}_{j-n,b} \tilde{P}^{(1)}_{j} . \end{aligned}$$
(6.10)

Hence, by Littlewood-Paley theory

$$\begin{aligned}&\left( \sum _{n\in {\mathbb {Z}}} \int _{1/2}^2 \left\| \sum _j {\mathcal {R}}_{j,\ell }^{(2^n s)^{b}}\,f \right\| _p^pds\right) ^{1/p}\\&\quad \lesssim \left( \sum _{n\in {\mathbb {Z}}} \int _{1/2}^2 \left\| \left( \sum _j| {\mathcal {R}}_{j,\ell }^{(2^n s)^{b}} \tilde{P}^{(2)}_{j-n,b} \tilde{P}^{(1)}_{j} f|^2\right) ^{1/2} \right\| _p^pds\right) ^{1/p} \end{aligned}$$

and by (6.9) this is controlled by

$$\begin{aligned} 2^{-\ell (\varepsilon (p) +1/p)} \left( \sum _{n\in {\mathbb {Z}}} \left\| \left( \sum _{j\in {\mathbb {Z}}} | \tilde{P}^{(2)}_{j-n,b} \tilde{P}^{(1)}_{j} f|^2\right) ^{1/2} \right\| _p^p\right) ^{1/p} \end{aligned}$$

for some \(\varepsilon (p)>0\) when \(2<p<\infty .\) We finish the proof of (6.5a) by observing that

$$\begin{aligned} \left( \sum _{n\in {\mathbb {Z}}} \left\| \left( \sum _{j\in {\mathbb {Z}}}| \tilde{P}^{(2)}_{j-n,b} \tilde{P}^{(1)}_{j} f|^2\right) ^{1/2} \right\| _p^p\right) ^{1/p}&\le \left\| \left( \sum _{j\in {\mathbb {Z}}}\sum _{n\in {\mathbb {Z}}} | \tilde{P}^{(2)}_{j-n,b} \tilde{P}^{(1)}_{j} f|^2\right) ^{1/2} \right\| _p\\&=\left\| \left( \sum _{k_1\in {\mathbb {Z}}} \sum _{k_2\in {\mathbb {Z}}} | \tilde{P}^{(2)}_{k_2,b} \tilde{P}^{(1)}_{k_1} f|^2\right) ^{1/2} \right\| _p {\lesssim } \Vert f\Vert _p \end{aligned}$$

where we have used the embedding \(\ell ^2\hookrightarrow \ell ^p\) for \(p>2\), and applied a two-parameter Littlewood-Paley inequality.
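The first inequality in the last display is the embedding \(\ell ^2\hookrightarrow \ell ^p\) applied pointwise. As a sketch, set \(h_n(x)=\big (\sum _{j}| \tilde{P}^{(2)}_{j-n,b} \tilde{P}^{(1)}_{j} f(x)|^2\big )^{1/2}\), a notation used only here. For \(p\ge 2\),

$$\begin{aligned} \sum _{n\in {\mathbb {Z}}} \Vert h_n\Vert _p^p = \int \sum _{n\in {\mathbb {Z}}} h_n(x)^p\, dx \le \int \Big ( \sum _{n\in {\mathbb {Z}}} h_n(x)^2\Big )^{p/2} dx = \Big \Vert \Big ( \sum _{n\in {\mathbb {Z}}} h_n^2\Big )^{1/2} \Big \Vert _p^p. \end{aligned}$$

The subsequent equality is the change of summation indices \((k_1,k_2)=(j,j-n)\), which is a bijection of \({\mathbb {Z}}^2\).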

We now turn to the estimate (6.5b). A computation shows

$$\begin{aligned}&2^{-\ell } \frac{\partial }{\partial s} \left( \sum _j {\mathcal {F}}[{\mathcal {R}}_{j,\ell }^{(2^n s)^{b} }f](\xi )\right) \nonumber \\&\quad = \widehat{f}(\xi ) \frac{b}{s} \sum _j \upsilon _\ell \left( 2^{-j}\xi _1 , s^b 2^{(n-j)b} \xi _2\right) e^{i2^\ell \Psi _+(2^{-j}\xi _1, s^b 2^{(n-j) b} \xi _2)} \end{aligned}$$
(6.11a)

where

$$\begin{aligned} \upsilon _\ell (\xi )= & {} 2^{-\ell }\chi _+'(|\xi |) \frac{\xi _2^2}{|\xi |} \omega _+(2^\ell \xi _1, 2^{\ell }\xi _2) +\chi _+(|\xi |) \xi _2\frac{\partial \omega _+}{\partial {\xi _2}}(2^\ell \xi _1, 2^{\ell }\xi _2)\nonumber \\&+ \chi _+(|\xi |)\omega _+(2^{\ell }\xi _1, 2^{\ell }\xi _2) i\xi _2\frac{\partial \Psi _+}{\partial {\xi _2}}(\xi _1, \xi _2). \end{aligned}$$
(6.11b)

Here the main contribution in (6.11b) comes from the third term (the others are similar but better by a factor of about \(2^{-\ell }\)).

It is now straightforward to check that in the proof of (6.5a) the term \({\mathcal {R}}_{j,\ell }^{(2^n s)^{b} }\,f\) can be replaced with \(2^{-\ell } \partial _s({\mathcal {R}}_{j,\ell }^{(2^n s)^{b} }\,f)\) and one obtains (6.5b).

Finally, a simple modification of the proof of (6.5a) would also prove (6.5c): in place of (3.8), one would use a fixed time estimate, as stated immediately before (3.8). This finishes the proof of Theorem 2.3.

7 Maximal functions for lacunary sets

We shall prove some upper bounds for the operator norm of \({\mathcal {H}}^U\) for lacunary sets.

Definition

Let \(\kappa >1\). A finite set U is called \(\kappa \)-lacunary if it can be arranged in a sequence \(U=\{u_1<u_2<\dots <u_M\}\) where \(u_{j}\le u_{j+1}/\kappa \) for \(j=1,\dots , M-1\). U is lacunary if U is \(\kappa \)-lacunary for some \(\kappa >1\).

Note that for lacunary sets we have \(\# U \approx {\mathfrak {N}}(U)\) (with the implicit constant depending on \(\kappa \)).

Proposition 7.1

Let U be a lacunary set. Then, for \(4/3<p<\infty \)

$$\begin{aligned} \Vert {\mathcal {H}}^U\Vert _{L^p\rightarrow L^p}\lesssim \sqrt{\log (1+(\#U))}\,. \end{aligned}$$
(7.1)

Proposition 7.1 will be used in the proof of lower bounds in Sect. 8. For this application it is important that (7.1) holds for some \(p<2\). We do not know at this time whether the result extends to all \(p>1\) (see Footnote 1). For special lacunary sequences it does:

Proposition 7.2

Let U be a subset of \(\{2^{nb}: n\in {\mathbb {Z}}\}\). Then, for \(1<p<\infty \)

$$\begin{aligned} \Vert {\mathcal {H}}^U\Vert _{L^p\rightarrow L^p}\lesssim \sqrt{\log (1+(\#U))}\,. \end{aligned}$$

Here b is as in the definition of the curve \(\gamma _b\) in (1.1).

7.1 Proof of Proposition 7.1

We may assume that for every interval \(I_n:=[2^{nb}, 2^{(n+1)b})\), \(n\in {\mathbb {Z}}\), there is at most one \(u\in U\cap I_n\). Indeed, by the lacunarity assumption we can split U into O(1) many sets with this property.

We order \(U=\{u_\nu \}\) such that \(u_{\nu }<u_{\nu +1}\) and let \(n(\nu )\) be the unique integer for which \(u_\nu \in I_{n(\nu )}\).
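The splitting into O(1) sets can be made explicit; the following is a sketch, reading the \(\kappa \)-lacunarity condition as \(u_{\nu +1}\ge \kappa u_\nu \) for the increasing ordering, and the integer K is introduced only for this remark. If \(u_\nu \) and \(u_{\nu +k}\) lie in the same interval \(I_n\), then

$$\begin{aligned} \kappa ^k \le \frac{u_{\nu +k}}{u_\nu } < 2^{b}, \qquad \text {that is,} \qquad k< \frac{b\log 2}{\log \kappa }. \end{aligned}$$

Hence, with \(K=\lfloor b\log 2/\log \kappa \rfloor +1\), each of the K sets \(\{u_\nu : \nu \equiv r \ (\mathrm {mod}\ K)\}\), \(r=0,\dots ,K-1\), has at most one point in every \(I_n\), and K depends only on b and \(\kappa \).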

We split \(H^{(u)}= S^{u}+T^u\) as in (2.8). In view of Theorems 2.2, 2.3 it suffices to prove the inequality

$$\begin{aligned} \left\| \sup _{u\in U} |T^u_\pm f| \right\| _p\lesssim \Vert f\Vert _p \end{aligned}$$
(7.2)

for \(4/3<p\le 2\). By the reduction in Sect. 6 this can be accomplished if

$$\begin{aligned} \left\| \sup _{\nu } \left| \sum _j {\mathcal {R}}_{j,\ell }^{u_\nu }f\right| \right\| _p \lesssim 2^{-\ell \epsilon (p)} \Vert f\Vert _p \end{aligned}$$
(7.3)

can be proved for \(\epsilon (p)>0\), in our case in the range \(4/3<p\le 2\).

Replacing the \(\sup \) by an \(\ell ^2\) norm we see that (7.3) follows from

$$\begin{aligned} \left\| \left( \sum _{\nu } \left| \sum _j {\mathcal {R}}_{j,\ell }^{u_\nu }f\right| ^2\right) ^{1/2} \right\| _p \lesssim 2^{-\ell \epsilon (p)} \Vert f\Vert _p \end{aligned}$$
(7.4)

Analogously to (6.10) we have

$$\begin{aligned} {\mathcal {R}}_{j,\ell }^{u_\nu }= \tilde{P}^{(1)}_{j} \tilde{P}^{(2)}_{j-n(\nu ),b} {\mathcal {R}}_{j,\ell }^{u_\nu } \tilde{P}^{(2)}_{j-n(\nu ),b} \tilde{P}^{(1)}_{j} \end{aligned}$$

and thus, by Littlewood–Paley theory, (7.4) is a consequence of

$$\begin{aligned} \left\| \left( \sum _\nu \sum _{j\in {\mathbb {Z}}} \left| {\mathcal {R}}_{j,\ell }^{u_\nu } \tilde{P}^{(2)}_{j-n(\nu ),b} \tilde{P}^{(1)}_{j} \, f\right| ^2\right) ^{1/2} \right\| _p \lesssim 2^{-\ell \epsilon (p)} \Vert f\Vert _p. \end{aligned}$$
(7.5)

By a standard application of Khintchine’s inequality this estimate follows if we can prove

$$\begin{aligned} \left\| \sum _\nu \sum _{j\in {\mathbb {Z}}} c(\nu ,j) {\mathcal {R}}_{j,\ell }^{u_{\nu } } \tilde{P}^{(2)}_{j-n(\nu ),b} \tilde{P}^{(1)}_{j} \, f\right\| _p \lesssim 2^{-\ell \epsilon (p)} \Vert f\Vert _p \end{aligned}$$
(7.6)

for an arbitrary choice of \(\{c(\nu , j)\}\) with \(\sup _{j,\nu } |c(\nu , j)|\le 1\). Let

$$\begin{aligned} \omega _\ell (\xi )= \omega _+(2^\ell \xi ) \chi _+(|\xi |) \end{aligned}$$

then \(\omega _\ell \) and its derivatives are \(O(2^{-\ell /2})\), by the symbol property of \(\omega _+\), and are supported on a common annulus. We see that the \(L^2\) operator norms of the individual operators \({\mathcal {R}}_{j,\ell }^{u_\nu }\) are \(O(2^{-\ell /2})\), and that the function

$$\begin{aligned} m_\ell (\xi )= & {} \sum _{\nu }\sum _j \tilde{\chi }^{(1)}(2^{-j}\xi _1) \tilde{\chi }^{(2)}(2^{-jb+n(\nu )b}\xi _2) \\&\times \, \omega _\ell (2^{-j}\xi _1, 2^{(n(\nu )-j) b}\xi _2) e^{i2^\ell \Psi _+(2^{-j}\xi _1,2^{(n(\nu )-j)b}\xi _2)} \end{aligned}$$

has \(L^\infty \) norm \(\lesssim 2^{-\ell /2}\). This implies

$$\begin{aligned} \left\| \sum _\nu \sum _{j\in {\mathbb {Z}}} c(\nu ,j) {\mathcal {R}}_{j,\ell }^{u_{\nu } } \tilde{P}^{(2)}_{j-n(\nu ),b} \tilde{P}^{(1)}_{j} f\right\| _2 \lesssim 2^{-\ell /2} \Vert f\Vert _2. \end{aligned}$$
(7.7)

For p near 1 we apply the Marcinkiewicz multiplier theorem in the form described in Sect. 3.5. It is not hard to check that the multiplier \(m_\ell \) satisfies the condition (3.14) with constant \(B\le C_\alpha 2^{\ell (2\alpha -1/2)}\). Hence we get

$$\begin{aligned} \left\| \sum _\nu \sum _{j\in {\mathbb {Z}}} c(\nu ,j) {\mathcal {R}}_{j,\ell }^{u_{\nu } } \tilde{P}^{(2)}_{j-n(\nu ),b} \tilde{P}^{(1)}_{j} \, f\right\| _p \lesssim 2^{\ell (2\alpha -\frac{1}{2})} \Vert f\Vert _p, \quad \alpha >1/2. \end{aligned}$$
(7.8)

We interpolate between (7.7) and (7.8). By choosing \(\alpha \) very close to \(1/2\), we obtain (7.6) for any \(p\in (4/3,2]\). \(\square \)
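The numerology of this interpolation can be checked as follows; the auxiliary exponent \(p_0\) and the parameter \(\theta \) are introduced only for this remark. For \(p\in (1,2)\) and \(p_0\in (1,p)\) write \(1/p=(1-\theta )/2+\theta /p_0\). Interpolating (7.7) with (7.8) at the exponent \(p_0\) gives an \(L^p\) bound \(\lesssim 2^{\ell e}\) with

$$\begin{aligned} e= -\frac{1-\theta }{2} + \theta \Big ( 2\alpha -\frac{1}{2}\Big ) \longrightarrow \theta -\frac{1}{2} = \frac{2}{p}-\frac{3}{2} \quad \text {as } \alpha \rightarrow \frac{1}{2},\ p_0\rightarrow 1, \end{aligned}$$

since \(\theta \rightarrow 2/p-1\) in this limit. The limiting exponent is negative precisely when \(p>4/3\), which explains the range in Proposition 7.1.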

7.2 Proof of Proposition 7.2

We argue as in the proof of Proposition 7.1. The desired conclusion follows if under our present conditions (7.8) can be upgraded to

$$\begin{aligned} \left\| \sum _\nu \sum _{j\in {\mathbb {Z}}} c(\nu ,j) {\mathcal {R}}_{j,\ell }^{u_{\nu } } \tilde{P}^{(2)}_{j-n(\nu ),b} \tilde{P}^{(1)}_{j} f\right\| _p \le c_p (1+\ell ^4) \Vert f\Vert _p, \quad 1<p\le 2. \end{aligned}$$
(7.9)

As now \(u_\nu = 2^{n(\nu )b} \) for a strictly increasing sequence \(\{n(\nu )\}\) we see by another application of Littlewood-Paley theory that (7.9) is a consequence of the inequality

$$\begin{aligned} \left\| \left( \sum _{n\in {\mathbb {Z}}}\sum _{j\in {\mathbb {Z}}} \left| {\mathcal {R}}_{j,\ell }^{2^{nb}} f_{j,n}\right| ^2\right) ^{1/2} \right\| _p \lesssim (1+\ell ^4) \left\| \left( \sum _{j,n} |f_{j,n} |^2 \right) ^{1/2}\right\| _p. \end{aligned}$$
(7.10)

This is proved as in [16] by using a superposition of shifted maximal operators, in a vector-valued setting. To analyze the situation we recall how \({\mathcal {R}}_{j,\ell }^{u}\) was formed (namely by rescaling \(T_{j,\ell }^{u}\); see Sect. 2).

Let \(\sigma _+\) be as in (2.1). Then there is a Schwartz function \(\varsigma \) such that

$$\begin{aligned} \widehat{{\mathcal {R}}^{2^{nb}}_{j,\ell } f}(\xi )&= \chi _+(|(2^{-j}\xi _1, 2^{nb-jb}\xi _2)|) \widehat{\sigma _+} (2^{\ell -j}\xi _1, 2^{\ell +nb-jb} \xi _2) \widehat{f}(\xi ) \\&\quad + \chi _+(|(2^{-j}\xi _1, 2^{nb-jb}\xi _2)|) \widehat{\varsigma } (2^{\ell -j}\xi _1, 2^{\ell +nb-jb} \xi _2) \widehat{f}(\xi ). \end{aligned}$$

Consider the second (error) term. It is easy to see that

$$\begin{aligned} \big |{\mathcal {F}}^{-1}[ \chi _+(|(2^{-j}\xi _1, 2^{nb-jb}\xi _2)|) \widehat{\varsigma } (2^{\ell -j}\xi _1, 2^{\ell +nb-jb} \xi _2) \widehat{f}(\xi )](x)\big | \lesssim 2^{-\ell } M_{\mathrm{str} } f(x) \end{aligned}$$

so that these terms are taken care of by an application of the Fefferman-Stein inequality for the vector-valued strong maximal function.

We concentrate on the main term. We write \(\sigma _+=\sum _{m=2^{\ell -1}}^{2^{\ell +1} } \mu _{m}\) where the measure \(\mu _m\) is given by

$$\begin{aligned} \langle \mu _m,f\rangle = \int _{m2^{-\ell }}^{(m+1)2^{-\ell }} f(t,\gamma _b(t))\chi _+(t) \frac{dt}{t}. \end{aligned}$$

Define \({\mathcal {R}}^{u}_{j,\ell ,m} f\) by

$$\begin{aligned} \widehat{{\mathcal {R}}^{u}_{j,\ell ,m} f}(\xi ) = \chi (|(2^{-j}\xi _1, 2^{-jb}\xi _2)|)\widehat{\mu _m} (2^{\ell -j}\xi _1, 2^{\ell -jb}u \xi _2) \widehat{f}(\xi ). \end{aligned}$$

Then by the above discussion we have

$$\begin{aligned} \left| {\mathcal {R}}^{2^{nb}}_{j, \ell } f(x) - \sum _{m=2^{\ell -1}}^{2^{\ell +1} } {\mathcal {R}}^{2^{nb}}_{j, \ell ,m} f(x)\right| \lesssim 2^{-\ell } M_{\mathrm {str}} f(x) \end{aligned}$$

and hence, by Minkowski’s inequality, it suffices to show that

$$\begin{aligned} \left\| \left( \sum _{n,j\in {\mathbb {Z}}} \left| {\mathcal {R}}_{j,\ell ,m}^{2^{nb}} f_{j,n}\right| ^2\right) ^{1/2} \right\| _p \lesssim 2^{-\ell } (1+\ell )^4 \left\| \left( \sum _{j,n\in \mathbb {Z}} |f_{j,n} |^2 \right) ^{1/2}\right\| _p \end{aligned}$$
(7.11)

for \(2^{\ell -1}\le m\le 2^{\ell +1}\). Notice that

$$\begin{aligned}&\left| \mu _m* {\mathcal {F}}^{-1} [\chi _+(|\cdot |2^{-\ell })](y)\right| \\&\quad \lesssim 2^{-\ell } \frac{2^\ell }{ (1+2^\ell |y_1-m2^{-\ell }|)^{10}} \frac{ 2^{\ell } }{(1+2^\ell |y_2-m^b2^{-\ell b}|)^{10}} \end{aligned}$$

Now define

$$\begin{aligned} \rho _{m,k_1}^{(1)}(y_1)&= 2^{k_1}\left( 1+ |2^{k_1}y_1-m|\right) ^{-10},\\ \rho _{m,k_2}^{(2)}(y_2)&= 2^{bk_2 } \left( 1+ |2^{bk_2} y_2-m^b 2^{-\ell (b-1)} |\right) ^{-10}. \end{aligned}$$

We then have the pointwise estimate

$$\begin{aligned} \left| {\mathcal {R}}^{2^{nb}}_{j, \ell ,m} f(x)\right| \lesssim 2^{-\ell } \left( \rho _{m,j}^{(1)}\otimes \rho _{m, j-n}^{(2)}\right) *|f|. \end{aligned}$$
(7.12)

By an application of inequalities for the shifted maximal operators (see [16, Theorem 3.1]) we see that the expressions

$$\begin{aligned}&\left( \int \left( \sum _{k_1, k_2} \left[ \int \rho _{m,k_1}^{(1)} (x_1-y_1) |g_{k_1,k_2} (y_1,x_2)|dy_1\right] ^2\right) ^{p/2}dx\right) ^{1/p} ,\\&\left( \int \left( \sum _{k_1, k_2} \left[ \int \rho _{m,k_2}^{(2)} (x_2-y_2) |g_{k_1,k_2} (x_1,y_2)|dy_2\right] ^2\right) ^{p/2}dx\right) ^{1/p} \end{aligned}$$

are both bounded by a constant times

$$\begin{aligned} (\log m)^2 \left\| \left( \sum _{k_1,k_2}|g_{k_1, k_2} |^2\right) ^{1/2} \right\| _p. \end{aligned}$$

Applying both estimates iteratively we get

$$\begin{aligned} \left\| \left( \sum _{k_1,k_2} \left[ \left( \rho _{m,k_1}^{(1)}\otimes \rho _{m, k_2}^{(2)}\right) *|g_{k_1,k_2}|\right] ^2\right) ^{1/2} \right\| _p \lesssim (\log m)^4 \left\| \left( \sum _{k_1,k_2}|g_{k_1, k_2} |^2\right) ^{1/2} \right\| _p. \end{aligned}$$

We apply this with \(g_{k_1,k_2}= f_{k_1, k_1-k_2}\) and use (7.12) to obtain (7.11). \(\square \)

8 Lower bounds

8.1 The main lower bound and some consequences

The purpose of this section is to prove the lower bound

Theorem 8.1

Let \(U\subset (0,\infty )\) and \(1<p<\infty \). Then there is a constant \(c_p\) such that

$$\begin{aligned} \Vert {\mathcal {H}}^U\Vert _{L^p\rightarrow L^p} \ge c_p \sqrt{\log ({\mathfrak {N}}(U))}. \end{aligned}$$

8.1.1 Some consequences

  1. (i)

    First, Theorem 8.1 in combination with the already proven upper bounds in Theorems 2.2 and 2.3 yields the equivalence (with constants depending on p)

    $$\begin{aligned} \Vert {\mathcal {H}}^U\Vert _{L^p\rightarrow L^p} \approx \sqrt{\log ({\mathfrak {N}}(U))} \end{aligned}$$
    (8.1)

    for \(2< p < \infty \), stated as Theorem 1.1.

  2. (ii)

    We also immediately get an equivalence in Propositions 7.1 and 7.2 which we formulate as

Corollary 8.2

Let U be a lacunary set. Then (8.1) holds for \(4/3<p<\infty \). If U is contained in \(\{2^{nb}: n\in {\mathbb {Z}}\} \) then (8.1) holds for \(1<p<\infty \).

8.1.2 Reduction to the case \(p=2\)

Let \(U_*\) be a maximal subset of U with the property that each interval \([2^n, 2^{n+1}]\) contains at most one point in \(U_*\). Then \(\#(U_*)\approx {\mathfrak {N}}(U)\). Let \(\widetilde{U}\) be any finite subset of \(U_*\) with the understanding that \(\widetilde{U}=U_*\) if \(U_*\) is already finite. Clearly

$$\begin{aligned} \Vert {\mathcal {H}}^U\Vert _{L^p\rightarrow L^p} \ge \Vert {\mathcal {H}}^{U_*} \Vert _{L^p\rightarrow L^p} \ge \Vert {\mathcal {H}}^{\tilde{U}}\Vert _{L^p\rightarrow L^p} \end{aligned}$$

and thus it suffices to prove the inequality

$$\begin{aligned} \Vert {\mathcal {H}}^{\tilde{U}}\Vert _{L^p\rightarrow L^p} \ge A_p \sqrt{\log (\#\widetilde{U}) }. \end{aligned}$$
(8.2)

We show that it suffices to prove (8.2) for \(p=2\): Since \(\widetilde{U}\) is a disjoint union of two lacunary sets we have the inequality

$$\begin{aligned} \Vert {\mathcal {H}}^{\tilde{U}}\Vert _{L^q\rightarrow L^q} \le C_q \sqrt{\log (\#\widetilde{U}) }, \quad \text { for } 4/3<q<\infty , \end{aligned}$$

by Proposition 7.1.

If \(1<p<2\) we pick q such that \(2<q<\infty \), and if \(2<p<\infty \) we pick q such that \(4/3<q<2\). Let \(\theta \in (0,1)\) such that \((1-\theta )/p+\theta /q=1/2\). We have

$$\begin{aligned} A_2 \big (\log (\#\widetilde{U}) \big )^{1/2}&\le \Vert {\mathcal {H}}^{\tilde{U}}\Vert _{L^2\rightarrow L^2} \le \Vert {\mathcal {H}}^{\tilde{U}}\Vert _{L^p\rightarrow L^p}^{1-\theta } \Vert {\mathcal {H}}^{\tilde{U}}\Vert _{L^q\rightarrow L^q}^\theta \\&\le \big (C_q (\log (\#\widetilde{U}))^{1/2} \big )^\theta \Vert {\mathcal {H}}^{\tilde{U}}\Vert _{L^p\rightarrow L^p}^{1-\theta } \end{aligned}$$

which implies

$$\begin{aligned} \Vert {\mathcal {H}}^{\tilde{U}}\Vert _{L^p\rightarrow L^p} \ge A_2^{\frac{1}{1-\theta }} C_q^{-\frac{\theta }{1-\theta }} \sqrt{\log (\#\widetilde{U}) }. \end{aligned}$$
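The interpolation exponent used above can be made explicit. The following is a minimal sketch, not part of the argument, which solves \((1-\theta )/p+\theta /q=1/2\) in exact arithmetic; the function name `interpolation_theta` and the sample values of p and q are illustrative assumptions.

```python
from fractions import Fraction


def interpolation_theta(p, q):
    """Solve (1 - theta)/p + theta/q = 1/2 for theta (requires p != q)."""
    p, q = Fraction(p), Fraction(q)
    if p == q:
        raise ValueError("p and q must differ")
    return (Fraction(1, 2) - 1 / p) / (1 / q - 1 / p)


# Sample choice with p > 2 and q in (4/3, 2); the values are illustrative only.
theta = interpolation_theta(4, Fraction(3, 2))
print(theta)  # 3/5
```

Since p and q lie on opposite sides of 2, the value of \(\theta \) always falls in (0, 1), as the argument requires.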

For the remainder of this section we shall verify the lower bound in (8.2) for \(p=2\). We shall need to skim the set \(\widetilde{U}\) a bit more. To prepare for this we first study in more detail the multipliers of the Hilbert transforms.

8.2 Observations on the multipliers for the Hilbert transforms

We may assume \(c_+>0\). We write \(\widehat{H^{(u)} f} (\xi )= m(\xi _1,u\xi _2) \widehat{f}(\xi )\) where

$$\begin{aligned} m(\xi _1,\xi _2) = \lim _{\begin{array}{c} \varepsilon \rightarrow 0+\\ R\rightarrow \infty \end{array} }\left( \int _{\varepsilon<t\le R} e^{-i (t\xi _1+ c_+ t^b\xi _2)} \frac{dt}{t} + \int _{-R<t<-\varepsilon } e^{-i (t\xi _1+ c_- (-t)^b\xi _2)} \frac{dt}{t} \right) . \end{aligned}$$

By the homogeneity of the curve \(\Gamma _b\) with respect to the dilations \((\xi _1,\xi _2) \mapsto (\lambda \xi _1, \lambda ^b \xi _2)\), we see that \(m(\lambda \xi _1, \lambda ^b\xi _2)=m(\xi _1,\xi _2) \) for \(\lambda >0\). Moreover one can check that m is continuous on \(\mathbb {R}^2 {\setminus } \{0\}\),

$$\begin{aligned} m(\xi _1,0) = -\pi i \,{\mathrm{sign }}\xi _1, \quad \xi _1 \ne 0, \end{aligned}$$
(8.3a)

and if \(\xi _2 > 0\), then

$$\begin{aligned} m(0,\xi _2) = {\left\{ \begin{array}{ll} -\frac{1}{b}\log (c_+/c_-)&{}\text { if } c_- >0\\ -\frac{1}{b} \log (-c_+/c_-) -\frac{1}{b}\pi i &{}\text { if } c_- <0. \end{array}\right. } \end{aligned}$$
(8.3b)

We shall need the following Hölder continuity condition at the axes.

Lemma 8.3

There is \(C_\circ =C_\circ (b, c_\pm )\ge 1\) such that we have the estimates

$$\begin{aligned} |m(\xi _1,\xi _2)-m (\xi _1,0)|&\le C_\circ \left( \frac{|\xi _2|}{|\xi _1|^b} \right) ^{\frac{1}{2b}}, \end{aligned}$$
(8.4a)
$$\begin{aligned} |m(\xi _1,\xi _2)-m (0, \xi _2)|&\le C_\circ \left( \frac{|\xi _1|^b}{|\xi _2|} \right) ^{\frac{1}{2b}}. \end{aligned}$$
(8.4b)

Proof of Lemma 8.3

We have \(|m(\xi _1,\xi _2)|\le C_{\circ }(b, c_\pm )\) and therefore it suffices to show that (8.4a) holds for \(|\xi _2| \ll |\xi _1|^b\) and (8.4b) holds for \(|\xi _1|^b\ll |\xi _2|\).

For the proof of (8.4a) it suffices to check, by homogeneity and boundedness of m,

$$\begin{aligned} |m(\pm 1, \xi _2)- m(\pm 1,0)|\lesssim |\xi _2|^{\beta }, \quad |\xi _2|\le 1, \end{aligned}$$
(8.5)

for some \(\beta \ge (2b)^{-1}\). Let

$$\begin{aligned} A= A(\eta )= \frac{1}{2} |\eta |^{- \frac{1}{b+1}}. \end{aligned}$$
(8.6)

We have

$$\begin{aligned} m(1, \xi _2)- m(1,0)= \sum _{j=1}^3(I_{j,+}(c_+b\xi _2) - I_{j,-}(c_-b\xi _2)) \end{aligned}$$

where

$$\begin{aligned} I_{1,\pm }(\eta )&= \int _0^{A(\eta )} e^{\mp it}(e^{-i t^b\eta /b}-1) \frac{dt}{t}\,,\\ I_{2,\pm } (\eta )&=\int _{A(\eta )}^\infty e^{\mp it-i t^b\eta /b} \frac{dt}{t}\,,\\ I_{3,\pm }(\eta )&= -\int _{A(\eta )}^\infty e^{\mp it} \frac{dt}{t}\,. \end{aligned}$$

Clearly

$$\begin{aligned} |I_{1,\pm }(\eta ) |\le \int _0^A t^{b-1} |\eta | b^{-1} dt = A^b b^{-2}|\eta |. \end{aligned}$$

By integration by parts,

$$\begin{aligned} |I_{3,\pm }| \le 2 A^{-1}. \end{aligned}$$

By our choice (8.6)

$$\begin{aligned} |I_{1,\pm }(\eta ) |+|I_{3,\pm }(\eta ) | \lesssim |\eta |^{\frac{1}{b+1}} \end{aligned}$$

We may assume \(|\eta |<1\). Let \(B_1=B_1(\eta )= |\eta ^{-1/(b-1)}|/2\) and \(B_2=B_2(\eta )= 2|\eta ^{-1/(b-1)}|\). Then \(B_1(\eta )\ge A(\eta )\) and we split

$$\begin{aligned} I_{2,\pm } (\eta ) =\left( \int _{A}^{B_1} + \int _{B_1}^{B_2} + \int _{B_2}^\infty \right) e^{i\psi (t) } t^{-1} dt \end{aligned}$$

with \(\psi (t) =\mp t- t^b\eta /b\).

Note that for \(|t|\le B_1\) we have \(1/2<|\psi '(t)|\le 2\) and thus, by van der Corput’s lemma with first derivative we have \(|\int _A^{B_1} (...) dt| \lesssim A^{-1}\).

Note that \(|\psi ''(t)|=|\eta |(b-1) t^{b-2}\). For the second integral we apply van der Corput’s lemma with second derivatives and get \(|\int _{B_1}^{B_2} (...) dt| \lesssim |B_1|^{-1} |\eta |^{-1/2} (b-1)^{-1/2} |B_1|^{-(b-2)/2}\lesssim (b-1)^{-1/2} |\eta |^{1/(2b-2)}\).

Finally for the third integral we use that \(|\psi '(t)|\approx |\eta |t^{b-1} \) and \(|\psi ''(t) |\approx |\eta | (b-1) t^{b-2}\) and a straightforward integration by parts argument yields the bound \(O(|\eta |^{-1} B_2^{-b}) = O(|\eta |^{\frac{1}{b-1}})\).

The estimate for \(m(-1,\xi _2)-m(-1,0)\) is analogous. Altogether we obtain (8.5) with \(\beta = \min \{(b+1)^{-1}, (2b-2)^{-1}\}\), and we have \(\beta \ge (2b)^{-1}\).
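As a sanity check on the exponent bookkeeping in the proof of (8.5), the sketch below lists the powers of \(|\eta |\) obtained for the individual pieces and confirms numerically that their minimum is at least \((2b)^{-1}\) for a few sample values of b. The function names and the sample values are illustrative assumptions.

```python
def holder_exponents(b):
    """Powers of |eta| arising in the proof of (8.5), for b > 1."""
    if b <= 1:
        raise ValueError("b must exceed 1")
    return {
        "I1_I3_and_first_part_of_I2": 1 / (b + 1),  # from A = |eta|^{-1/(b+1)} / 2
        "middle_part_of_I2": 1 / (2 * b - 2),       # van der Corput, second derivative
        "tail_of_I2": 1 / (b - 1),                  # integration by parts beyond B_2
    }


def beta(b):
    """Smallest exponent, i.e. min{1/(b+1), 1/(2b-2)}."""
    return min(holder_exponents(b).values())


for b in (1.01, 1.5, 2.0, 3.0, 10.0):
    print(b, beta(b), beta(b) >= 1 / (2 * b))
```

The two competing exponents coincide at \(b=3\); below that value the choice of A is the bottleneck, above it the stationary phase region is.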

We now turn to the proof of (8.4b). It suffices to check, by homogeneity and boundedness of m,

$$\begin{aligned} |m(\xi _1, \pm 1)- m(0, \pm 1)|\lesssim |\xi _1|^{1/2} , \quad |\xi _1|\le 1. \end{aligned}$$
(8.7)

Let

$$\begin{aligned} B= B(\xi _1)= (a|\xi _1|)^{-1/2}\quad \text { where } a= \min _\pm (b|c_\pm |/2)^{\frac{2}{b-1}}. \end{aligned}$$
(8.8)

We have

$$\begin{aligned} m( \xi _1, 1)- m(0,1) = \sum _{j=1}^3\big (II_{j,+}(\xi _1) - II_{j,-}(\xi _1)\big ) \end{aligned}$$

where

$$\begin{aligned} II_{1,\pm }(\xi _1)&= \int _0^{B(\xi _1) } (e^{\mp it\xi _1}-1)e^{-i c_\pm t^b} \frac{dt}{t} \,,\\ II_{2,\pm } (\xi _1)&=\int _{B(\xi _1) }^\infty e^{\mp it\xi _1-i c_\pm t^b} \frac{dt}{t} \,,\\ II_{3,\pm }(\xi _1)&= -\int _{B(\xi _1) }^\infty e^{-i c_\pm t^b} \frac{dt}{t}\,. \end{aligned}$$

The estimation of these terms is straightforward; we get

$$\begin{aligned} |II_{1,\pm }(\xi _1)|\lesssim |\xi _1| B(\xi _1) \end{aligned}$$

and

$$\begin{aligned} |II_{3,\pm }(\xi _1)| \lesssim B(\xi _1)^{-1} \end{aligned}$$

and both terms are \(O(|\xi _1|^{1/2})\), by our choice (8.8). By this choice we also have \(2 \le |c_\pm |b t^{b-1}\) for \(t\ge B(\xi _1)\) which implies that for \(|\xi _1|\le 1\)

$$\begin{aligned} \frac{1}{2}|c_{\pm }| bt^{b-1} \le |\partial _t (\mp t\xi _1- c_\pm t^{b})| \le 2|c_{\pm }| bt^{b-1} \text { for } t\ge B(\xi _1). \end{aligned}$$

Integration by parts now shows that

$$\begin{aligned} |II_{2,\pm }(\xi _1)| \lesssim B(\xi _1)^{-b} \end{aligned}$$

which is \(O(|\xi _1|^{b/2})\), hence also \(O(|\xi _1|^{1/2})\). The term \(m( \xi _1, -1)- m(0,-1)\) is similarly estimated. This completes the proof of (8.7). \(\square \)

8.3 Reduction to a lower bound for a lacunary maximal operator

Recall that \(\tilde{U}\subset U\) with \({\mathfrak {N}}(\tilde{U})<\infty \). Let \({\mathfrak {I}}\) be the collection of all integers n such that \([2^n, 2^{n+1}]\) has nonempty intersection with \(\tilde{U}\), thus \({\mathfrak {N}}(\tilde{U})=1+\#{\mathfrak {I}}\). Let

$$\begin{aligned} K=K(\tilde{U})= (C_\circ {\mathfrak {N}}(\tilde{U}))^{2b} \end{aligned}$$
(8.9)

where \(C_\circ \) is as in (8.4a), (8.4b). Let \({\mathfrak {I}}'\) be a maximal subfamily of \({\mathfrak {I}}\) with the condition

$$\begin{aligned} n_1\in {\mathfrak {I}}', \,\,n_2\in {\mathfrak {I}}', \,\, n_1<n_2 \,\, \implies n_2-n_1-1\ge \log _2 (16 K^2). \end{aligned}$$
(8.10)

Pick an integer M such that \(M+1\) is of the form \(2^\mu \) with \(\mu \in {\mathbb {N}}\) and such that

$$\begin{aligned} \frac{{\mathfrak {N}}(\tilde{U})}{\log _2 (16 K^2) } = \frac{{\mathfrak {N}}(\tilde{U})}{4+4b \log _2 (C_\circ {\mathfrak {N}}(\tilde{U}))} \in [M,2M). \end{aligned}$$

We may assume that the displayed quantity is \(\ge e^{100}\), so that the logarithm of this quantity is comparable to \(\log M\) (otherwise the desired lower bound for \(\Vert {\mathcal {H}}^U\Vert _{L^2\rightarrow L^2} \) just follows from the trivial lower bound for the Hilbert transform along a fixed curve).

We may now pick an increasing sequence \(\{u_j\}_{j=1}^M\) such that each \(u_j\) belongs to \(\tilde{U}\) and to exactly one interval determined by the collection \({\mathfrak {I}}'\). Hence we have

$$\begin{aligned} \frac{u_{j+1}}{u_j}\ge 16 K^2 \,. \end{aligned}$$
(8.11)

Given the reduction in Sect.  8.1.2 the lower bound \(\sqrt{\log ({\mathfrak {N}}(U))} \) in Theorem 8.1 follows from

Proposition 8.4

Let \(\tilde{U}\) and \(\{u_j\}_{j=1}^M\) be as above. Then there is \(c>0\) such that

$$\begin{aligned} \sup _{\Vert f\Vert _2=1} \left\| \sup _{1 \le j \le M} |\mathcal {H}^{(u_j)}f| \right\| _{2} \ge c \sqrt{\log M} \,. \end{aligned}$$

The proof of this proposition is based on a construction by Karagulyan [18].

8.4 A theorem of Karagulyan

We will invoke the following proposition, which is a small generalization of the main theorem of Karagulyan [18] (see also [19]). For \({\mu } \in \mathbb {N}\), let

$$\begin{aligned} W_{\mu } = \{\emptyset \} \cup \bigcup _{\ell =1}^{{\mu }-1} \{0,1\}^{\ell } \end{aligned}$$

be the set of binary words of length at most \({\mu }-1\), and let

$$\begin{aligned} \tau :W_{\mu } \rightarrow \{1, \dots , 2^{\mu }-1\} \end{aligned}$$

be the bijection given by \(\tau (\emptyset )=2^{\mu -1}\) and

$$\begin{aligned} \tau (w) = w_1 2^{\mu -1} + w_2 2^{\mu -2} + \dots + w_{\ell } 2^{\mu -\ell } + 2^{\mu -\ell -1} \end{aligned}$$

if \(w = w_1 w_2 \dots w_{\ell }\) for some \(\ell \in \{1,\dots ,\mu -1\}\), and each \(w_1, \dots , w_{\ell } \in \{0,1\}\). Observe that for a word w of length \(\ell \), \(\tau (w)\) is divisible by \(2^{\mu -\ell -1}\) but not by \(2^{\mu -\ell }\).
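The claims about \(\tau \) are easy to test by brute force for small \(\mu \). The following sketch enumerates \(W_\mu \) as tuples of bits and checks both bijectivity onto \(\{1,\dots ,2^\mu -1\}\) and the divisibility property; the helper names `words` and `tau` and the tuple encoding are illustrative assumptions.

```python
from itertools import product


def words(mu):
    """All binary words of length 0, ..., mu - 1 as tuples of bits (the set W_mu)."""
    return [w for length in range(mu) for w in product((0, 1), repeat=length)]


def tau(w, mu):
    """tau(w) = w_1 2^(mu-1) + ... + w_l 2^(mu-l) + 2^(mu-l-1), with l = len(w)."""
    l = len(w)
    return sum(bit << (mu - 1 - i) for i, bit in enumerate(w)) + (1 << (mu - l - 1))


mu = 3
print(sorted(tau(w, mu) for w in words(mu)))  # [1, 2, 3, 4, 5, 6, 7]
```

In this encoding the empty tuple stands for the empty word, so that `tau((), mu)` equals \(2^{\mu -1}\).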

Proposition 8.5

Let \({\mu }\) be any positive integer, \(M = 2^{\mu } - 1\), and let \(S_1, \dots , S_M\) be pairwise disjoint subsets of the (frequency) plane \(\mathbb {R}^2\), so that every \(S_j\) contains balls of arbitrarily large radii (in other words, for every \(1 \le j \le M\) and every \(R > 0\), \(S_j\) contains some ball of radius R). Then there exists an \(L^2\) function f on \(\mathbb {R}^2\), that admits an orthogonal decomposition

$$\begin{aligned} f = \sum _{w \in W_{\mu }} f_w, \end{aligned}$$

where

$$\begin{aligned}&{\text {supp}}\widehat{f_w} \subset S_{\tau (w)} \quad \text {for all } w \in W_{\mu }, \text { and} \end{aligned}$$
(8.12)
$$\begin{aligned}&\Vert f\Vert _{L^2}^2 = \sum _{w \in W_{\mu }} \Vert f_w\Vert _{L^2}^2 \le 2; \end{aligned}$$
(8.13)

in addition,

$$\begin{aligned} \left\| \sup _{1 \le j \le M} \left| \sum _{w \in W_{\mu } :\tau (w) \ge j} f_w \right| \right\| _{L^2} \ge \frac{\sqrt{\mu } }{100} \Vert f\Vert _{L^2}. \end{aligned}$$
(8.14)

Accepting this for the moment, we prove Proposition 8.4.

8.5 Proof of Proposition 8.4

As before, suppose \(c_+ > 0\). Let

$$\begin{aligned} \rho ={\left\{ \begin{array}{ll} - \frac{1}{b}\log (c_+/c_-) &{}\text { if } c_->0,\\ -\frac{1}{b}\log (-c_+/c_-) -\frac{1}{b}\pi i &{}\text { if } c_-<0. \end{array}\right. } \end{aligned}$$

Then \(m(0,\xi _2)=\rho \) for \(\xi _2>0\) and \(m(\xi _1,0)= -\pi i\) for \(\xi _1>0\) (cf. (8.3b), (8.3a)). Let K be as in (8.9); then

$$\begin{aligned} C_\circ K^{-\frac{1}{2b}} \le ({\mathfrak {N}}(\tilde{U}))^{-1}\le M^{-1}. \end{aligned}$$

From (8.4a) and (8.4b) we see, for \(\xi _1>0\), \(\xi _2>0\)

$$\begin{aligned} \xi _2/\xi _1^b\le K^{-1} \,\,&\implies \, |m(\xi _1,\xi _2)+\pi i| \le C_\circ K^{-\frac{1}{2b}}\le M^{-1}. \end{aligned}$$
(8.15a)
$$\begin{aligned} \xi _2/\xi _1^b\ge K \,\,&\implies \, |m(\xi _1,\xi _2)-\rho | \le C_\circ K^{-\frac{1}{2b}}\le M^{-1} \end{aligned}$$
(8.15b)

For \(1 \le j \le M\), define

$$\begin{aligned} S_j=\left\{ (\xi _1,\xi _2): \xi _1>0,\,\xi _2>0,\, \, \frac{1}{2Ku_j}< \frac{ \xi _2}{\xi _1^b} < \frac{1}{ Ku_j} \right\} , \end{aligned}$$
(8.16)

so that the \(S_j\) are pairwise disjoint, and contain balls of arbitrarily large radii. By Proposition 8.5, there exists an \(L^2\) function \(f = \sum _{w \in W_{\mu }} f_w\) on \(\mathbb {R}^2\), such that (8.12), (8.13) and (8.14) hold. Now for \(1 \le j \le M\),

$$\begin{aligned}&|\mathcal {H}^{(u_j)} f(x)-\rho f(x) | \ge \left| \sum _{{\begin{array}{c} w \in W_{\mu } :\\ \tau (w) \ge j \end{array}}} (\pi i+\rho ) \, f_w(x) \right| \\&\quad - \left| \sum _{\begin{array}{c} w \in W_{\mu } :\\ \tau (w) \ge j \end{array}} \big ( \mathcal {H}^{(u_j)}f_w(x) + \pi i f_w(x) \big ) \right| - \left| \sum _{{\begin{array}{c} w \in W_{\mu } :\\ \tau (w) < j \end{array}}} \big ( {\mathcal {H}}^{(u_j)}f_w(x) -\rho f_w(x)\big ) \right| , \end{aligned}$$

and thus, with \(c_0 = \pi (1-\frac{1}{b})\),

$$\begin{aligned}&\sup _{1\le j\le M} |\mathcal {H}^{(u_j)} f(x)-\rho f(x) | \ge c_0 \sup _{1\le j\le M}\left| \sum _{{\begin{array}{c} w \in W_{\mu } :\\ \tau (w) \ge j \end{array}}} f_w(x) \right| \\&\quad - \sup _{1\le j\le M} \left| \sum _{\begin{array}{c} w \in W_{\mu } :\\ \tau (w) \ge j \end{array}} ( \mathcal {H}^{(u_j)}+\pi i)f_w(x) \right| - \sup _{1\le j\le M} \left| \sum _{{\begin{array}{c} w \in W_{\mu } :\\ \tau (w) < j \end{array}}} ( {\mathcal {H}}^{(u_j)}-\rho )f_w(x) \right| . \end{aligned}$$
(8.17)

Now \({\text {supp}}\widehat{f_w} \subset S_{\tau (w)}\). If \(\tau (w)\ge j\), then for \(\xi \in {\text {supp}}\widehat{f_w }\), we have \(u_j\xi _2/\xi _1^b\le u_{\tau (w)} \xi _2/\xi _1^b<K^{-1}\) and therefore, by (8.15a), we have \(|m(\xi _1, u_j\xi _2)+\pi i| \le M^{-1}\) for \(\xi \in {\text {supp}}\widehat{f_w }\). Hence

$$\begin{aligned} \big \Vert (\mathcal {H}^{(u_j)} + \pi i) f_w \big \Vert _{2} \le M^{-1} \Vert f_w\Vert _{2} \quad \text {if } \tau (w) \ge j. \end{aligned}$$
(8.18)

Moreover if \(\tau (w)<j\) we have, for \(\xi \in {\text {supp}}\widehat{f_w }\),

$$\begin{aligned} u_j \frac{\xi _2}{\xi _1^b} = \frac{u_j}{u_{\tau (w)}}u_{\tau (w) }\frac{\xi _2}{\xi _1^b} \ge 16 K^2 \frac{1}{2K}= 8K \end{aligned}$$

and hence, by (8.15b), \(|m(\xi _1, u_j \xi _2)-\rho |\le M^{-1}\) for \(\xi \in {\text {supp}}\widehat{f_w}\). Thus

$$\begin{aligned} \big \Vert (\mathcal {H}^{(u_j)} -\rho ) f_w \big \Vert _{2} \le M^{-1} \Vert f_w\Vert _{2} \quad \text {if } \tau (w) < j. \end{aligned}$$
(8.19)
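The case distinction behind (8.18) and (8.19) can be checked mechanically. The sketch below is an illustration only: it uses exact rational arithmetic, a hypothetical value of K, and a hypothetical sequence \(u_j\) whose consecutive ratios equal \(16K^2\). The index i plays the role of \(\tau (w)\), and both function names are assumptions.

```python
from fractions import Fraction


def scaled_ratio_range(u, K, i, j):
    """Open interval containing u[j] * xi_2 / xi_1**b for xi in S_i, cf. (8.16)."""
    return u[j] / (2 * K * u[i]), u[j] / (K * u[i])


def check_separation(u, K):
    """True if (8.15a) applies whenever i >= j and (8.15b) applies whenever i < j."""
    for i in range(len(u)):
        for j in range(len(u)):
            lo, hi = scaled_ratio_range(u, K, i, j)
            if i >= j and not hi <= 1 / K:
                return False
            if i < j and not lo >= K:
                return False
    return True


K = Fraction(3)                           # hypothetical value of K
u = [(16 * K**2) ** n for n in range(5)]  # consecutive ratios exactly 16 K^2, cf. (8.11)
print(check_separation(u, K))
```

With a sequence that is not sufficiently separated, for instance consecutive ratios equal to 2, the check fails, which is why the thinning leading to (8.11) is needed.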

Statements (8.18) and (8.19) imply

$$\begin{aligned}&\left\| \sup _{1 \le j \le M} \left| \sum _{\begin{array}{c} w \in W_{\mu } :\\ \tau (w) < j \end{array}} (\mathcal {H}^{(u_j)}-\rho ) f_w \right| \right\| _{2} \lesssim \Vert f\Vert _2 \end{aligned}$$
(8.20)
$$\begin{aligned}&\left\| \sup _{1 \le j \le M} \left| \sum _{\begin{array}{c} w \in W_{\mu } :\\ \tau (w) \ge j \end{array}} ( \mathcal {H}^{(u_j)} + \pi i ) f_w \right| \right\| _{2} \lesssim \Vert f\Vert _{2}. \end{aligned}$$
(8.21)

Indeed, to obtain (8.20) we use the Cauchy-Schwarz inequality in the w sum and replace the sup in j by an \(\ell ^2\) norm, then interchange integrals and sums and apply (8.19) to get

$$\begin{aligned}&\left\| \sup _{1 \le j \le M} \left| \sum _{\begin{array}{c} w \in W_{\mu } :\\ \tau (w)< j \end{array}} (\mathcal {H}^{(u_j)}-\rho ) f_w \right| \right\| _{2}\\&\quad \le M^{1/2} \left\| \left( \sum _{j=1}^M \sum _{\tau (w)< j} \ | (\mathcal {H}^{(u_j)}-\rho )f_w |^2 \right) ^{1/2} \right\| _{2} \\&\quad = M^{1/2} \left( \sum _{j=1}^M \sum _{\tau (w) < j} \big \Vert (\mathcal {H}^{(u_j)}-\rho ) f_w \big \Vert _{2}^2 \right) ^{1/2}\\&\quad \le M^{1/2} \left( \sum _{j=1}^M M^{-2} \sum _{w } \Vert f_w \Vert _{2}^2 \right) ^{1/2} \lesssim \Vert f\Vert _{2} \end{aligned}$$

(the last line following from (8.13)). Inequality (8.21) is proved in exactly the same way (relying on (8.18)).

Now we go back to (8.17), use (8.14) for the main part and (8.20), (8.21) for the two error terms. Then we get

$$\begin{aligned} \left\| \sup _{1 \le j \le M} |(\mathcal {H}^{(u_j)}-\rho ) f| \right\| _{2} \ge c \sqrt{\mu } \Vert f\Vert _{2} \end{aligned}$$

for some constant \(c= c(b, c_\pm )> 0\). If \(\sqrt{\mu } \ge 2|\rho | / c\) this also implies

$$\begin{aligned} \left\| \sup _{1 \le j \le M} |\mathcal {H}^{(u_j)} f| \right\| _{2} \ge (c/2) \sqrt{\mu } \Vert f\Vert _{2}. \end{aligned}$$

This completes the proof of Proposition 8.4, except for Proposition 8.5. \(\square \)

8.6 Proof of Proposition 8.5

Fix a non-negative Schwartz function \(\phi \) on \(\mathbb {R}^2\) with \(\int _{\mathbb {R}^2} \phi (x) dx = 1\), such that \(\widehat{\phi }\) is supported in the unit ball B(0, 1) centered at the origin. Define the frequency cutoff \(\phi _\rho \) by

$$\begin{aligned} \phi _{\rho }(x) := \rho ^2 \phi (\rho x). \end{aligned}$$

Then \(\widehat{\phi _\rho }\) is supported on \(B(0,\rho )\).

The following lemma explains what we actually construct, in order to prove Proposition 8.5:

Lemma 8.6

Let \({\mu } \in \mathbb {N}\), \(M = 2^{\mu } - 1\), and let \(S_1, \dots , S_M\) be as given in Proposition 8.5. Then there exist a family of sets \(\{E_w\}_{w \in W_{\mu }}\), modulation frequencies \(\{\xi _w\}_{w \in W_{\mu }} \subset \mathbb {R}^2\), and radii \(\{\rho _w\}_{w\in W_\mu }\) such that the following hold:

  1. (a)

    For every \(w \in W_{\mu }\), \(E_w \subset [0,1]^2\), and for every \(w \in W_{\mu -1}\), \(E_w\) is the disjoint union of \(E_{w0}\) and \(E_{w1}\). Also, \(E_{\emptyset } = [0,1]^2\). For \(\ell =0,\dots , \mu -1\), \([0,1]^2\) is a disjoint union of the \(E_w\) with \(\text {length}(w)=\ell \), and

    $$\begin{aligned} \sum _{w \in W_{\mu }} {\mathbb {1}}_{E_w}(x) = \mu \end{aligned}$$
    (8.22)

    for every \(x \in [0,1]^2\).

  2. (b)

    For every \(w \in W_{\mu }\),

    $$\begin{aligned}&\displaystyle \Vert {\mathbb {1}}_{E_w}*\phi _{\rho _w} - {\mathbb {1}}_{E_w} \Vert _{L^2} \le 2^{-\mu -10}, \end{aligned}$$
    (8.23)
    $$\begin{aligned}&\displaystyle \int _{E_w} |\cos (\langle \xi _w, x\rangle )| dx \ge \frac{|E_w|}{3}, \end{aligned}$$
    (8.24)
    $$\begin{aligned}&\displaystyle B(\xi _w,\rho _w) \subset S_{\tau (w)}. \end{aligned}$$
    (8.25)
  3. (c)

    For every \(w \in W_{\mu -1}\), we have

    $$\begin{aligned} \begin{aligned} \cos (\langle \xi _w,x\rangle )&\ge 0 \quad \text {if } x \in E_{w0}, \\ \cos (\langle \xi _w,x\rangle )&< 0 \quad \text {if } x \in E_{w1}. \end{aligned} \end{aligned}$$
    (8.26)

With this lemma we can prove Proposition 8.5 as follows.

Proof of Proposition 8.5

For every \(w \in W_{\mu }\), let \(E_w\), \(\rho _w\) and \(\xi _w\) be as in Lemma 8.6. We set

$$\begin{aligned} f_w(x) := \mu ^{-1/2} e^{i \langle \xi _w,x\rangle } {\mathbb {1}}_{E_w}*\phi _{\rho _w}(x), \end{aligned}$$
(8.27a)

and let

$$\begin{aligned} f := \sum _{w \in W_{\mu }} f_w. \end{aligned}$$
(8.27b)

Then the support of \(\widehat{f_w}\) is contained inside \(B(\xi _w,\rho _w)\), so (8.12) follows from (8.25). Also, the \(\widehat{f_w}\)’s are supported in the sets \(S_{\tau (w)}\) which are disjoint and thus by orthogonality we have

$$\begin{aligned} \Vert f\Vert _{2} = \left\| \left( \sum _{w \in W_{\mu }} |f_w|^2 \right) ^{1/2}\right\| _{2}. \end{aligned}$$

But, from (8.23), we have

$$\begin{aligned} \Big \Vert f_w - \mu ^{-1/2} e^{i \langle \xi _w,x\rangle } {\mathbb {1}}_{E_w} \Big \Vert _{2} \le 2^{-\mu -10}. \end{aligned}$$
(8.28)

Observe

$$\begin{aligned}&\left( \sum _{w \in W_{\mu }} |f_w|^2 \right) ^{1/2}\\&\quad \le \left( \sum _{w \in W_{\mu }} \Big |f_w - \mu ^{-1/2} e^{i \langle \xi _w,x\rangle } {\mathbb {1}}_{E_w} \Big |^2 \right) ^{1/2} + \left( \sum _{w \in W_{\mu }} \Big |\mu ^{-1/2} e^{i \langle \xi _w,x\rangle } {\mathbb {1}}_{E_w} \Big |^2 \right) ^{1/2}, \end{aligned}$$

and using (8.22) to simplify the second term we get

$$\begin{aligned} \left( \sum _{w \in W_{\mu }} |f_w|^2 \right) ^{1/2} \le \left( \sum _{w \in W_{\mu }} \Big |f_w - \mu ^{-1/2} e^{i \langle \xi _w,x\rangle } {\mathbb {1}}_{E_w} \Big |^2\right) ^{1/2} + {\mathbb {1}}_{[0,1]^2} \end{aligned}$$

for almost every \(x \in \mathbb {R}^2\). Taking \(L^2\) norms of both sides, and using (8.28), we have

$$\begin{aligned} \left\| \left( \sum _{w \in W_{\mu }} |f_w|^2 \right) ^{1/2} \right\| _{2} \le 2^{-5-\mu /2}+1 <2. \end{aligned}$$

Thus (8.13) follows.

Lastly we have to verify (8.14). To do so, we first introduce an auxiliary family of functions \(\{F_w\}_{w \in W_{\mu }}\), where

$$\begin{aligned} F_w := {{\,\mathrm{Re}\,}}f_w \,{\mathbb {1}}_{E_w}. \end{aligned}$$
(8.29)

These \(F_w\)’s satisfy three key properties, namely

$$\begin{aligned}&\displaystyle \sum _{w \in W_{\mu }} \Vert F_w - {{\,\mathrm{Re}\,}}f_w\Vert _{L^2} \le 2^{-10}, \end{aligned}$$
(8.30)
$$\begin{aligned}&\displaystyle \frac{1}{3} \,\le \, \frac{ \sup _{1 \le j \le M} \Big | \sum _{w \in W_{\mu } :\tau (w) \ge j} F_w(x) \Big | }{ \sum _{w \in W_{\mu }} |F_w(x)| } \,\le 1 \quad \text {for a.e. } x \in [0,1]^2, \end{aligned}$$
(8.31)

and

$$\begin{aligned} \frac{\sqrt{\mu }}{4}\le \left\| \sum _{w \in W_{\mu }} |F_w| \right\| _{1} \le \left\| \sum _{w \in W_{\mu }} |F_w| \right\| _{2} \le \left\| \sum _{w \in W_{\mu }} |F_w| \right\| _{{\infty }} \le \sqrt{\mu }. \end{aligned}$$
(8.32)

Indeed, (8.30) will be a consequence of

$$\begin{aligned} \Vert F_w - {{\,\mathrm{Re}\,}}f_w\Vert _{L^2} \le 2^{-\mu -10} \quad \text {for all } w \in W_{\mu }. \end{aligned}$$
(8.33)

Since \(F_w - {{\,\mathrm{Re}\,}}f_w = -{{\,\mathrm{Re}\,}}f_w {\mathbb {1}}_{\mathbb {R}^2 {\setminus } E_w}\), heuristically, (8.33) says that the real part of each \(f_w\) is essentially supported on \(E_w\): the \(L^2\) norm of \({{\,\mathrm{Re}\,}}f_w\) outside \(E_w\) is small. Furthermore, (8.31) says that there is not much cancellation if we first order the \(F_w\)’s according to the value of \(\tau (w)\) and then sum successively; this will be achieved by showing that \(\{F_w\}_{w \in W_{\mu }}\) form a tree system in the sense of Karagulyan [18] (who credits the idea to Nikišin and Ul’janov [22]).

Let us now establish the three key properties of the \(F_w\)’s, namely (8.30), (8.31) and (8.32). Since \(F_w - {{\,\mathrm{Re}\,}}f_w = -{{\,\mathrm{Re}\,}}f_w {\mathbb {1}}_{(E_w)^\complement }\), and since

$$\begin{aligned} {{\,\mathrm{Re}\,}}f_w(x) = \frac{1}{\sqrt{\mu }} \cos (\langle \xi _w,x\rangle ) {\mathbb {1}}_{E_w}*\phi _{\rho _w}(x), \end{aligned}$$
(8.34)

we have

$$\begin{aligned} \Vert F_w - {{\,\mathrm{Re}\,}}f_w\Vert _{L^2(\mathbb {R}^2)}&= \big \Vert \mu ^{-1/2} \cos (\langle \xi _w,x\rangle ) {\mathbb {1}}_{E_w}*\phi _{\rho _w} \big \Vert _{L^2(\mathbb {R}^2 {\setminus } E_w)}\\&\le \big \Vert {\mathbb {1}}_{E_w} - {\mathbb {1}}_{E_w}*\phi _{\rho _w} \big \Vert _{L^2(\mathbb {R}^2{\setminus } E_w)} \le 2^{-\mu -10} \end{aligned}$$

by (8.23). This establishes (8.33), and (8.30) follows by summing over \(w \in W_{\mu }\).

Next we verify (8.31). The second inequality in (8.31) is immediate by the triangle inequality. For the first, we observe from (8.34) that if \(x \in E_w\), then \(F_w(x)\) has the same sign as \(\cos (\langle \xi _w,x\rangle )\) since \({\mathbb {1}}_{E_w}*\phi _{\rho _w}\) is everywhere positive. We claim that for almost every \(x \in [0,1]^2\), there exists \(j=j(x)\) such that \(F_w(x) \ge 0\) for every \(w \in W_{\mu }\) with \(\tau (w) \ge j\), and \(F_w(x) \le 0\) for every \(w \in W_{\mu }\) with \(\tau (w) < j\). This is because for almost every \(x \in [0,1]^2\), there exists a unique word \(w(x) = w_1 \dots w_{\mu -1}\) of length \(\mu -1\) such that \(x \in E_{w(x)}\). By (8.26), it follows that, for every \(\ell = 0, 1, \dots , \mu -2\),

$$\begin{aligned} F_{w_1 \dots w_{\ell }}(x)&> 0 \quad \text {if } w_{\ell +1} = 0, \\ F_{w_1 \dots w_{\ell }}(x)&< 0 \quad \text {if } w_{\ell +1} = 1, \end{aligned}$$

and that \(F_{w'}(x) = 0\) if \(w' \in W_{\mu } {\setminus } \{\emptyset , w_1, w_1 w_2, \dots , w_1 \cdots w_{\mu -1}\}\). But

$$\begin{aligned} \tau (w_1 \dots w_{\ell }) = w_1 2^{\mu -1} + \dots + w_{\ell } 2^{\mu -\ell } + 2^{\mu -\ell -1}, \end{aligned}$$

while

$$\begin{aligned} \tau (w(x)) = w_1 2^{\mu -1} + \dots + w_{\ell } 2^{\mu -\ell } + w_{\ell +1} 2^{\mu -\ell -1} + \dots + w_{\mu -1} 2^1 + 2^0. \end{aligned}$$

This shows that for every \(\ell = 0, 1, \dots , \mu -2\),

$$\begin{aligned} \tau (w_1 \dots w_{\ell })&> \tau (w(x)) \quad \text {if } w_{\ell +1} = 0,\\ \tau (w_1 \dots w_{\ell })&< \tau (w(x)) \quad \text {if } w_{\ell +1} = 1. \end{aligned}$$

Thus for any \(w' \in W_{\mu }\), one has

$$\begin{aligned} F_{w'}(x)&\ge 0 \quad \text {if } \tau (w') > \tau (w(x)), \\ F_{w'}(x)&\le 0 \quad \text {if } \tau (w') < \tau (w(x)). \end{aligned}$$

If \(F_{w(x)}(x) \ge 0\), we set \(j(x) = \tau (w(x))\); if \(F_{w(x)}(x) < 0\), we set \(j(x) = \tau (w(x)) + 1\). It follows that \(F_w(x) \ge 0\) whenever \(\tau (w) \ge j(x)\), and \(F_w(x) \le 0\) whenever \(\tau (w) < j(x)\). We distinguish two cases now. In the first case we have

$$\begin{aligned} \left| \sum _{w \in W_{\mu } :\tau (w) \ge j(x)} F_w(x) \right| \ge \frac{1}{3} \sum _{w \in W_{\mu }} |F_w(x)|. \end{aligned}$$

In the opposite case, we have \(| \sum _{w \in W_{\mu } :\tau (w) \ge j(x)} F_w(x) | < \frac{1}{3} \sum _{w \in W_{\mu }} |F_w(x)|\), so \(| \sum _{w \in W_{\mu } :\tau (w) < j(x)} F_w(x) | \ge \frac{2}{3} \sum _{w \in W_{\mu }} |F_w(x)|\). Then

$$\begin{aligned}&\left| \sum _{w \in W_{\mu }} F_w(x) \right| \ge \left| \sum _{\begin{array}{c} w \in W_{\mu } :\\ \tau (w)<j(x) \end{array}} F_w(x) \right| - \left| \sum _{\begin{array}{c} w \in W_{\mu } :\\ \tau (w) \ge j(x) \end{array}} F_w(x)\right| \\&\quad \ge \sum _{\begin{array}{c} w \in W_{\mu } :\\ \tau (w)<j(x) \end{array}} |F_w(x)| - \frac{1}{3} \sum _{w \in W_{\mu }} |F_w(x)| \ge \frac{1}{3} \sum _{w \in W_{\mu }} |F_w(x)|. \end{aligned}$$

Hence in both cases

$$\begin{aligned} \sup _{1 \le j \le M} \left| \sum _{w \in W_{\mu } :\tau (w) \ge j} F_w(x) \right| \ge \frac{1}{3} \sum _{w \in W_{\mu }} |F_w(x)| \end{aligned}$$

for almost every \(x \in [0,1]^2\). This completes the proof of (8.31).
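The ordering property of \(\tau \) that drives the sign pattern above can be verified exhaustively for small \(\mu \). In the sketch below, which is an illustration only, a leaf is a word of length \(\mu -1\), and for each proper prefix we compare its \(\tau \) value with that of the leaf according to the next letter; the helper names and the tuple encoding of words are assumptions.

```python
from itertools import product


def tau(w, mu):
    """tau(w) = w_1 2^(mu-1) + ... + w_l 2^(mu-l) + 2^(mu-l-1), with l = len(w)."""
    l = len(w)
    return sum(bit << (mu - 1 - i) for i, bit in enumerate(w)) + (1 << (mu - l - 1))


def ordering_holds(mu):
    """Check tau(prefix) > tau(leaf) if the next letter is 0, and < if it is 1."""
    for leaf in product((0, 1), repeat=mu - 1):
        for l in range(mu - 1):
            prefix, next_letter = leaf[:l], leaf[l]
            if next_letter == 0 and not tau(prefix, mu) > tau(leaf, mu):
                return False
            if next_letter == 1 and not tau(prefix, mu) < tau(leaf, mu):
                return False
    return True


print(all(ordering_holds(mu) for mu in range(1, 10)))
```

Consequently the prefixes on which \(F\) is nonnegative are exactly those placed after the leaf in the \(\tau \) ordering, which is what allows a single threshold \(j(x)\) to separate the signs.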

Finally, we have to verify (8.32). Note that \(F_w\) is supported on \([0,1]^2\) for every \(w \in W_{\mu }\), and for almost every \(x \in [0,1]^2\), there exist at most \(\mu \) words \(w \in W_{\mu }\) for which \(F_w(x) \ne 0\). Furthermore, \(|F_w(x)| \le \mu ^{-1/2}\) for every \(x \in [0,1]^2\) and every \(w \in W_{\mu }\). Thus, we have

$$\begin{aligned} \left\| \sum _{w \in W_{\mu }} |F_w| \right\| _{1} \le \left\| \sum _{w \in W_{\mu }} |F_w| \right\| _{2} \le \left\| \sum _{w \in W_{\mu }} |F_w| \right\| _{{\infty }} \le \sqrt{\mu }. \end{aligned}$$

Next, for the lower bound,

$$\begin{aligned} \left\| \sum _{w \in W_{\mu }} |F_w| \right\| _{1} = \sum _{w \in W_{\mu }} \int _{E_w} \mu ^{-1/2}|\cos (\langle \xi _w ,x\rangle ) {\mathbb {1}}_{E_w}*\phi _{\rho _w}(x)| dx \end{aligned}$$

which is

$$\begin{aligned}&\ge \frac{1}{\sqrt{\mu }}\sum _{w \in W_{\mu }} \int _{E_w} \Big ( |\cos (\langle \xi _w,x\rangle )| - \big |\cos (\langle \xi _w,x\rangle )[{\mathbb {1}}_{E_w} - {\mathbb {1}}_{E_w}*\phi _{\rho _w}]\big |\Big )\,dx \\&\ge \frac{1}{\sqrt{\mu }}\sum _{w \in W_{\mu }} \left( \int _{E_w} |\cos (\langle \xi _w,x\rangle )| dx - \Vert {\mathbb {1}}_{E_w} - {\mathbb {1}}_{E_w}*\phi _{\rho _w}\Vert _{L^2} |E_w|^{1/2}\right) \\&\ge \frac{1}{\sqrt{\mu }} \sum _{w \in W_{\mu }}\left( \frac{ |E_w|}{3} - 2^{-\mu -10} \right) \ge \frac{\sqrt{\mu }}{3} - \frac{2^{-10}}{\sqrt{\mu }}\ge \frac{\sqrt{\mu }}{4}, \end{aligned}$$

where for the last line we have used (8.24), (8.23) and (8.22). This completes the proof of (8.32).

We will now return to the proof of (8.14). First,

$$\begin{aligned}&\sup _{1 \le j \le M} \left| \sum _{w \in W_{\mu } :\tau (w) \ge j} f_w(x) \right| \ge \sup _{1 \le j \le M} \left| \sum _{w \in W_{\mu } :\tau (w) \ge j} {{\,\mathrm{Re}\,}}f_w(x) \right| \\&\quad \ge \sup _{1 \le j \le M} \left| \sum _{w \in W_{\mu } :\tau (w) \ge j} F_w(x) \right| - \sum _{w \in W_{\mu }} |F_w(x) - {{\,\mathrm{Re}\,}}f_w(x)|, \end{aligned}$$

which by (8.31) is

$$\begin{aligned} \ge \frac{1}{3} \sum _{w \in W_{\mu }} |F_w(x)| - \sum _{w \in W_{\mu }} |F_w(x) - {{\,\mathrm{Re}\,}}f_w(x)|. \end{aligned}$$

From (8.30) and (8.32), we then have

$$\begin{aligned} \left\| \sup _{1 \le j \le M} \left| \sum _{w \in W_{\mu } :\tau (w) \ge j} f_w \right| \right\| _{L^2} \ge \frac{ \sqrt{\mu }}{12}- 2^{-10} \ge \frac{\sqrt{\mu }}{50}. \end{aligned}$$

Hence (8.14) follows from (8.13). This finishes the proof of Proposition 8.5, except for the proof of Lemma 8.6. \(\square \)

The proof of Lemma 8.6 is done by induction over the length of words. The basic step is contained in

Lemma 8.7

Given \(\varepsilon >0\), a set E of finite measure and a set S in frequency space that contains balls of arbitrarily large radii, there exist \(\rho _0>0\), a frequency \(\xi _0\) and a ball \(B= B(\xi _0,\rho _0) \subset S\) such that \(\Vert \phi _{\rho _0}*{\mathbb {1}}_E-{\mathbb {1}}_E\Vert _2<\varepsilon \) and \(\int _E|\cos (\langle \xi _0,x\rangle )|\, dx\ge |E|/3\).

Proof

Since \(\{\phi _\rho \}_{\rho >0}\) form an approximation of the identity there is \(R_1=R_1(S,E,\varepsilon )\) such that

$$\begin{aligned} \Vert \phi _\rho *{\mathbb {1}}_E-{\mathbb {1}}_E\Vert _2<\varepsilon \end{aligned}$$
(8.35)

for \(\rho >R_1\). Also observe that

$$\begin{aligned} \liminf _{|\xi | \rightarrow +\infty } \int _{E} |\cos (\langle \xi ,x\rangle )| dx&\ge \liminf _{|\xi | \rightarrow +\infty } \int _{E} \cos ^2(\langle \xi ,x\rangle ) dx \\&= \lim _{|\xi | \rightarrow +\infty } \int _{E} \frac{1+ \cos (2 \langle \xi ,x\rangle )}{2} dx = \frac{|E|}{2}, \end{aligned}$$

by the Riemann–Lebesgue lemma. Hence we find \(R_2=R_2(S,E,\varepsilon )\) such that

$$\begin{aligned} \int _{E}|\cos (\langle \xi ,x\rangle )|dx\ge |E|/3, \end{aligned}$$
(8.36)

for \(|\xi |\ge R_2\).
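As a numerical sanity check of (8.36) in the simplest case \(E=[0,1]^2\), the following sketch compares a midpoint-rule approximation of \(\int _E|\cos (\langle \xi ,x\rangle )|\,dx\) with the exact value of \(\int _E\cos ^2(\langle \xi ,x\rangle )\,dx\). The function names, the grid size and the sample frequency are illustrative assumptions and are not part of the proof.

```python
import cmath


def abs_cos_integral(xi, n=200):
    """Midpoint-rule approximation of the integral of |cos(<xi, x>)| over [0,1]^2."""
    a, b = xi
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x1 = (i + 0.5) * h
        for j in range(n):
            x2 = (j + 0.5) * h
            total += abs(cmath.cos(a * x1 + b * x2))
    return total * h * h


def cos_sq_integral(xi):
    """Exact integral of cos^2(<xi, x>) over [0,1]^2, assuming both components are nonzero."""
    a, b = xi
    # cos^2(t) = (1 + cos(2t)) / 2, and the integral of exp(2i(a x1 + b x2)) factors
    # into two one-dimensional integrals (e^{2ia} - 1)/(2ia) and (e^{2ib} - 1)/(2ib).
    osc = ((cmath.exp(2j * a) - 1) / (2j * a)) * ((cmath.exp(2j * b) - 1) / (2j * b))
    return 0.5 + 0.5 * osc.real


# Hypothetical sample frequency; any xi with large components behaves similarly.
xi = (40.0, 25.0)
lower = cos_sq_integral(xi)    # close to |E|/2 = 1/2
value = abs_cos_integral(xi)   # at least `lower`, since |cos| >= cos^2 pointwise
```

The oscillatory term in `cos_sq_integral` is bounded by \(1/(|\xi _1||\xi _2|)\) in absolute value, which is the decay that the Riemann–Lebesgue lemma provides for a general set E without a rate.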

By assumption on S we can find a ball \(B_0\) of radius \(R_0> 10 \max \{R_1, R_2\}\), centered at some \(\Xi _0\), such that \(B_0\subset S\). There is a point \(\xi _0\in B(\Xi _0, R_0/2) \) that satisfies \(|\xi _0|\ge R_0/4\), since a ball of radius \(R_0/2\) cannot be contained in \(\{|\xi |<R_0/4\}\). Set \(\rho _0=R_0/4\). The ball \(B(\xi _0,\rho _0)\) is contained in \(B(\Xi _0, 3R_0/4)\subset B_0\) and thus in S. Also, since \(\rho _0> R_1\) we have (8.35) for \(\rho =\rho _0\), and since \(|\xi _0|> R_2\) we have (8.36) for \(\xi =\xi _0\). \(\square \)
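The choice of \(\xi _0\) and \(\rho _0\) in the last step can be made explicit. The sketch below uses one possible choice, namely moving from the center \(\Xi _0\) a distance \(R_0/4\) away from the origin; this particular choice and the name `choose_ball` are assumptions made for illustration, since the proof only needs the existence of such a point.

```python
import math


def choose_ball(center, R0):
    """Return (xi0, rho0) with |xi0| >= R0/4 and B(xi0, rho0) inside B(center, R0)."""
    cx, cy = center
    norm = math.hypot(cx, cy)
    # Unit vector pointing from the origin towards the center; if the center
    # is the origin, any unit direction works, and we pick (1, 0).
    ux, uy = (cx / norm, cy / norm) if norm > 0 else (1.0, 0.0)
    # Shift the center by R0/4 away from the origin, so |xi0| = |center| + R0/4.
    xi0 = (cx + 0.25 * R0 * ux, cy + 0.25 * R0 * uy)
    rho0 = 0.25 * R0
    return xi0, rho0


def inside(center, R0, xi0, rho0):
    """Check that B(xi0, rho0) is contained in B(center, R0) via the triangle inequality."""
    dist = math.hypot(xi0[0] - center[0], xi0[1] - center[1])
    return dist + rho0 <= R0


xi0, rho0 = choose_ball((3.0, -4.0), 100.0)
```

With this choice \(|\xi _0-\Xi _0|=R_0/4<R_0/2\), so \(\xi _0\in B(\Xi _0,R_0/2)\), and every point of \(B(\xi _0,\rho _0)\) lies within \(R_0/2\) of \(\Xi _0\).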

Proof of Lemma 8.6

We will construct a sequence of sets \(\{E_w\}\), radii \(\rho _w\) and modulation frequencies \(\xi _w\) using induction on the length of words. We use \(\varepsilon =2^{-\mu -10}\) in Lemma 8.7.

First let \(E_{\emptyset } = [0,1]^2\). We apply Lemma 8.7 with \(E=E_\emptyset \) and \(S=S_{\tau (\emptyset )}\). We thus find \(\xi _\emptyset \), \(\rho _\emptyset \) such that (8.23), (8.24), (8.25) hold for \(w=\emptyset \). We consider the two words of length one, i.e. 0 and 1, and let

$$\begin{aligned} E_{0}&:= \{x \in E_\emptyset :\cos (\langle \xi _\emptyset ,x\rangle ) \ge 0\}\\ E_{1}&:= \{x \in E_\emptyset :\cos (\langle \xi _\emptyset ,x\rangle ) < 0\} \end{aligned}$$

so that \(E_\emptyset \) is a disjoint union of \(E_{0}\) and \(E_{1}\), and (8.26) holds for \(w=\emptyset \). Clearly \([0,1]^2\) is a disjoint union of the \(E_w\) with words w of length 1.

Suppose \(E_w, \rho _w, \xi _w\) are defined for all words of length \(\ell <\mu -1\). Take any word of length \(\ell +1\), of the form w0 or w1 where w is of length \(\ell \), and where \(E_w,\rho _w, \xi _w\) satisfy (8.23), (8.24), (8.25), and where \([0,1]^2 \) is a disjoint union of the \(E_w\) with \(\text {length}(w)=\ell \). We let

$$\begin{aligned} E_{w0}&:= \{x \in E_w :\cos (\langle \xi _w,x\rangle ) \ge 0\}\\ E_{w1}&:= \{x \in E_w :\cos (\langle \xi _w,x\rangle ) < 0\} \end{aligned}$$

so that (8.26) holds, \(E_w\) is a disjoint union of \(E_{w0}\) and \(E_{w1}\), and thus \([0,1]^2\) is a disjoint union of all \(E_{\overline{w}}\) where \(\overline{w} \) runs over all words of length \(\ell +1\).

We now use Lemma 8.7 to find \(\rho _{w0}, \xi _{w0}\) so that (8.23), (8.24) and (8.25) hold for w0 in place of w. Then we use Lemma 8.7 again to find \(\rho _{w1}, \xi _{w1}\) so that (8.23), (8.24) and (8.25) hold for w1 in place of w.

At step \(\ell =\mu -1\) this completes our construction of \(E_w\), \(\rho _w\) and \(\xi _w\) for all \(w \in W_{\mu }\), and all the properties stated in Lemma 8.6 are satisfied at every stage of the construction. Note that the balls \(B(\xi _w,\rho _w)\), \(B(\xi _{\tilde{w}},\rho _{\tilde{w}})\) are disjoint for different w, \(\tilde{w}\) because these balls belong to the disjoint sets \(S_{\tau (w)}\), \(S_{\tau (\tilde{w})}\), respectively.

Finally we have by our construction, for \(\ell =0,\dots , \mu -1\),

$$\begin{aligned} \sum _{w: \text {length}(w)=\ell } {\mathbb {1}}_{E_w}={\mathbb {1}}_{[0,1]^2}, \end{aligned}$$

and we obtain (8.22) by summing in \(\ell \). \(\square \)
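The splitting step of the construction can be illustrated on a finite grid of sample points in \([0,1]^2\). In the sketch below the frequencies come from a hypothetical placeholder `demo_freq`; in the proof they are supplied by Lemma 8.7 applied with \(S=S_{\tau (w)}\), which is not modeled here. The code only checks the combinatorial fact that the sets \(E_w\) with words of a fixed length partition the sample points.

```python
import math
from itertools import product


def demo_freq(w):
    """Hypothetical frequency xi_w for the word w (a stand-in for Lemma 8.7)."""
    scale = 3.0 ** len(w)
    return (10.0 * scale + int("1" + w, 2), 7.0 * scale)


def build_partition(mu, freq, n=20):
    """Split grid samples of [0,1]^2 by the sign of cos(<xi_w, x>), word by word."""
    pts = [((i + 0.5) / n, (j + 0.5) / n) for i in range(n) for j in range(n)]
    sets = {"": list(range(len(pts)))}  # E_emptyset is the whole unit square
    # Words of length 0, ..., mu-2 are split, producing words up to length mu-1.
    for length in range(mu - 1):
        for bits in product("01", repeat=length):
            w = "".join(bits)
            a, b = freq(w)
            e0, e1 = [], []
            for k in sets[w]:
                x1, x2 = pts[k]
                # E_{w0}: cos >= 0, E_{w1}: cos < 0, as in the construction.
                (e0 if math.cos(a * x1 + b * x2) >= 0 else e1).append(k)
            sets[w + "0"], sets[w + "1"] = e0, e1
    return pts, sets


pts, sets = build_partition(4, demo_freq)
```

For each length \(\ell =0,\dots ,\mu -1\), concatenating the index lists `sets[w]` over all words w of length \(\ell \) recovers every sample point exactly once, which is the discrete counterpart of the identity \(\sum _{w: \text {length}(w)=\ell } {\mathbb {1}}_{E_w}={\mathbb {1}}_{[0,1]^2}\).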