1 Introduction

In this article, we establish Fourier extension estimates for compact subsets of the hyperbolic, or one-sheeted, hyperboloid in three dimensions. This surface may be defined as the set of points \((\tau ,\xi ) \in \mathbb {R}\times \mathbb {R}^2\) satisfying the relation \(\tau ^2 = 1 + \xi _1^2-\xi _2^2\). Setting \(\phi (\xi ) := \sqrt{1+\xi _1^2-\xi _2^2}\) and \(\Omega := \{\xi \in \mathbb {R}^2 : 1+\xi _1^2-\xi _2^2 \ge 0\}\), we may restrict our attention to the graph

$$\begin{aligned} \Sigma := \{(\phi (\xi ),\xi ) : \xi \in \Omega \}. \end{aligned}$$

We aim to adapt the polynomial partitioning method of Guth [5] to obtain extension estimates for a bounded subset of \(\Sigma \) near (1, 0), which we denote by \(\Sigma _1\). Use of the parabolic scalings \(P_r(\tau ,\xi ) := (r^{-2}\tau ,r^{-1}\xi )\) in Guth’s argument presents an immediate obstacle here, as hyperboloids are not invariant under such transformations. To overcome it, we will simultaneously prove extension estimates for all parabolic rescalings of \(\Sigma _1\) with constants uniform in the scaling parameter. Toward that end, let \(U := \{\xi : |\xi | \le \delta _0/10\}\), where \(\delta _0 > 0\) is a small constant to be chosen later, and for each \(r \in (0,1]\), let \(\phi _r(\xi ) := r^{-2}(\phi (r\xi )-1)\) and

$$\begin{aligned} \Sigma _r := \{(\phi _r(\xi ),\xi ) : \xi \in U\}. \end{aligned}$$

Each \(\Sigma _r\) is the image of \(\Sigma _1 \cap \{(\tau ,\xi ) : \xi \in rU\}\) under the parabolic scaling \(P_r\), and the ‘\(-1\)’ in \(\phi _r\) just makes \(\Sigma _r\) converge to the hyperbolic paraboloid \(\Sigma _0 := \{(\frac{1}{2}(\xi _1^2-\xi _2^2),\xi ) : \xi \in U\}\) as \(r\rightarrow 0\). We associate to \(\Sigma _r\) the extension operator

$$\begin{aligned} \mathcal {E}_r f(t,x) := \int _{U} e^{2\pi i(t,x)\cdot (\phi _r(\xi ),\xi )}f(\xi )d\xi . \end{aligned}$$
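As a numerical illustration of the convergence \(\Sigma _r \rightarrow \Sigma _0\) noted above (it plays no role in the proofs), the following Python sketch compares \(\phi _r\) with \(\frac{1}{2}(\xi _1^2-\xi _2^2)\) at one sample point. The function names and the sample point are our own illustrative choices; in particular, since \(\delta _0\) has not been fixed, we merely assume the point lies in U.

```python
import math

def phi(xi1, xi2):
    """Graphing function of the hyperboloid: sqrt(1 + xi1^2 - xi2^2)."""
    return math.sqrt(1.0 + xi1**2 - xi2**2)

def phi_r(r, xi1, xi2):
    """Rescaled function phi_r(xi) = r^{-2} (phi(r xi) - 1), for 0 < r <= 1."""
    return (phi(r * xi1, r * xi2) - 1.0) / r**2

def phi_0(xi1, xi2):
    """Limiting hyperbolic paraboloid (1/2)(xi1^2 - xi2^2)."""
    return 0.5 * (xi1**2 - xi2**2)

xi = (0.03, -0.02)  # hypothetical sample point, assumed to lie in U
for r in (1.0, 0.5, 0.1):
    err = abs(phi_r(r, *xi) - phi_0(*xi))
    print(f"r = {r:4.2f}   |phi_r - phi_0| = {err:.3e}")
```

A Taylor expansion of the square root suggests that the printed discrepancy should shrink like \(r^2\) as \(r \rightarrow 0\), until floating-point rounding takes over for very small r.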

Theorem 1.1

If \(q > 13/4\) and \(p > (q/2)'\), then \(\mathcal {E}_r : L^p(U) \rightarrow L^q(\mathbb {R}^3)\) with operator norm bounded uniformly in r.

Remark 1.2

The bilinear and bilinear-to-linear theories for \(\mathcal {E}_1\) appear in a separate article [1] of Stovall, Oliveira e Silva, and the author. Using the bilinear machinery and Theorem 1.1, boundedness of \(\mathcal {E}_r\) on the parabolic scaling line \(p = (q/2)'\) (for \(q > 13/4\)) can also be proved. See [1, Remark 5.2], as well as [8, 9], and [6] for arguments of this type.

Theorem 1.1 can be compared to several recent developments in the restriction/extension theory for hyperbolic surfaces in three dimensions. Cho and Lee [3] generalized Guth’s argument in [5] to the hyperbolic paraboloid, proving strong type \((p,q)\) extension estimates in the range \(q > 13/4\), \(p \ge q\). Later work of Kim [6] and Stovall [8] brought those estimates to the scaling line \(p =(q/2)'\). (Letting \(r \rightarrow 0\) and applying Fatou’s lemma, Theorem 1.1 reproves the off-scaling extension estimates for the hyperbolic paraboloid.) Recently, Buschenhenke–Müller–Vargas [2] and Guo–Oh [4] independently obtained extension estimates for all smooth compact surfaces in \(\mathbb {R}^3\) with negative Gaussian curvature using polynomial partitioning. In particular, Theorem 1.1 is now (essentially) a special case of their results, which were announced after the completion of the arXiv preprint version of the present article.

The rest of the article is organized as follows: In Sect. 2, we adapt the notion of ‘broad points’ in [5] to the hyperbolic hyperboloid, motivating our definition through the geometry of the surface. In Sect. 3, we use Kim’s argument in [6] to reduce Theorem 1.1 to Theorem 2.1, an estimate on the contribution to \(\mathcal {E}_r\) from broad points. Finally, in Sect. 4, the heart of the article, we prove Theorem 2.1 using polynomial partitioning as in [5].

Notation and Terminology As is standard, we write \(A \lesssim B\) or \(A = O(B)\) if there exists a constant \(C > 0\) such that \(A \le CB\). Generally, an implicit constant is not allowed to depend on any parameters present in the article. In particular, constants never depend on the parabolic scaling parameter r. There are exceptions: In Sect. 4, constants may depend on the exponent \(\varepsilon \) from Theorems 2.1 and 4.1. To highlight dependence on a parameter s, we will sometimes write \(\lesssim _s\) in place of \(\lesssim \). Likewise, we write \( c \ll 1\) to mean that c is sufficiently small, and we use subscripts to indicate dependence on parameters. A number \(\delta \) is ‘dyadic’ if \(\delta = 2^j\) for some \(j \in \mathbb {Z}\), and an interval I is ‘dyadic’ if \(I = [k2^j, (k+1)2^j)\) for some \(j,k \in \mathbb {Z}\). If \(u,v\) are geometric objects that form an angle, such as two lines or a vector and a plane, then \(\angle (u,v)\) denotes the measure of their (smallest) angle. Finally, ‘hyperboloid’ always means the hyperbolic (one-sheeted) hyperboloid.

2 Broad Points and the Geometry of the Hyperboloid

In this section, we adapt the notion of ‘broad points’ to the hyperboloid. Informally, given a function \(f \in L^1(U)\), a point \((t,x) \in \mathbb {R}\times \mathbb {R}^2\) is ‘broad’ for \(\mathcal {E}_rf\) if there exist small, well-separated squares \(\tau _1,\tau _2 \subseteq U\) such that \(f\chi _{\tau _1}\) and \(f\chi _{\tau _2}\) contribute significantly to \(\mathcal {E}_rf(t,x)\); otherwise \((t,x)\) is ‘narrow’. To estimate \(\mathcal {E}_rf\), it suffices to bound the contributions from broad and narrow points separately. The narrow contribution will be handled by a parabolic rescaling argument, since (morally) its Fourier transform is supported in a small rectangular cap in \(\Sigma _r\). The broad contribution will be handled by polynomial partitioning, using, in particular, some techniques from bilinear restriction theory. In the latter argument, the precise separation condition imposed on the squares \(\tau _1,\tau _2\) will be crucial for ensuring that their lifts to \(\Sigma _r\) are appropriately transverse. Our choice of this condition will be motivated by the geometry of the hyperboloid, which we now describe.

First, the basic symmetries of the hyperboloid are the Lorentz transformations, linear maps on \(\mathbb {R}\times \mathbb {R}^2\) that preserve the quadratic form \((\tau ,\xi ) \mapsto \tau ^2 - \xi _1^2 +\xi _2^2\). Concretely, the spatial rotations

$$\begin{aligned} R_\omega (\tau ,\xi ) := (-\omega _2\xi _2+\omega _1\tau , \xi _1, \omega _1\xi _2+\omega _2\tau ), \quad \quad \omega \in \mathbb {S}^1, \end{aligned}$$
(2.1)

boosts

$$\begin{aligned} B_\nu (\tau ,\xi ) := \big (-\nu \xi _1+\sqrt{1+\nu ^2}\tau , \sqrt{1+\nu ^2}\xi _1-\nu \tau ,\xi _2\big ), \quad \quad \nu \in \mathbb {R}, \end{aligned}$$
(2.2)

and dilations

$$\begin{aligned} D_\lambda (\tau ,\xi ) := \bigg (\tau , \frac{\lambda +\lambda ^{-1}}{2}\xi _1 + \frac{\lambda -\lambda ^{-1}}{2}\xi _2, \frac{\lambda -\lambda ^{-1}}{2}\xi _1+\frac{\lambda +\lambda ^{-1}}{2}\xi _2\bigg ), \quad \quad \lambda \in \mathbb {R}\setminus \{0\}, \end{aligned}$$
(2.3)

will be of particular use to us. We define a measure \(d\mu \) on \(\Sigma \) by setting

$$\begin{aligned} \int _\Sigma g d\mu := \int _\Omega g(\phi (\xi ),\xi )\frac{d\xi }{\phi (\xi )} \end{aligned}$$
(2.4)

for g continuous and compactly supported. This measure is Lorentz invariant in the following sense: If L is a Lorentz transformation and \({\text {supp}}g \subseteq \Sigma \) and \(L^{-1}({\text {supp}}g)\subseteq \Sigma \), then

$$\begin{aligned} \int _\Sigma (g \circ L)d\mu = \int _\Sigma g d\mu . \end{aligned}$$

We also record the following notation for later use. Given a Lorentz transformation L and \(\xi \in \Omega \), let

$$\begin{aligned} \overline{L}(\xi ) := \pi (L(\phi (\xi ),\xi )), \end{aligned}$$
(2.5)

where \(\pi (\tau ,\xi ) := \xi \) is the projection to the spatial coordinates. If \({L}(\phi (\xi ),\xi ) \in \Sigma \) (equivalently, if \(e_1\cdot L(\phi (\xi ),\xi ) \ge 0\)), then \(\overline{ML}(\xi ) = \overline{M}(\overline{L}(\xi ))\) for any other Lorentz transformation M. In particular, if \(V \subseteq \Omega \) and \(L(\phi (\xi ),\xi ) \in \Sigma \) for \(\xi \in V\), then \(\overline{L}\) is invertible on V with \(\overline{L}^{-1}(\zeta ) = \overline{L^{-1}}(\zeta )\) for \(\zeta \in \overline{L}(V)\).
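The invariance of the quadratic form \(\tau ^2 - \xi _1^2 + \xi _2^2\) under the maps (2.1)–(2.3) can be checked numerically. The following Python sketch is only a sanity check under our own conventions: the function names are illustrative, \(\omega \) is parametrized by an angle, and the test vector and parameter values are arbitrary.

```python
import math

def Q(v):
    """Quadratic form tau^2 - xi1^2 + xi2^2; the hyperboloid is the level set Q = 1."""
    tau, x1, x2 = v
    return tau**2 - x1**2 + x2**2

def rotation(angle, v):
    """R_omega from (2.1), with omega = (cos(angle), sin(angle))."""
    w1, w2 = math.cos(angle), math.sin(angle)
    tau, x1, x2 = v
    return (-w2 * x2 + w1 * tau, x1, w1 * x2 + w2 * tau)

def boost(nu, v):
    """B_nu from (2.2)."""
    c = math.sqrt(1.0 + nu**2)
    tau, x1, x2 = v
    return (-nu * x1 + c * tau, c * x1 - nu * tau, x2)

def dilation(lam, v):
    """D_lambda from (2.3); lam must be nonzero."""
    a = (lam + 1.0 / lam) / 2.0
    b = (lam - 1.0 / lam) / 2.0
    tau, x1, x2 = v
    return (tau, a * x1 + b * x2, b * x1 + a * x2)

v = (0.7, -1.3, 2.1)  # arbitrary test vector
for name, w in [("rotation", rotation(0.4, v)),
                ("boost", boost(-0.8, v)),
                ("dilation", dilation(3.0, v))]:
    print(name, abs(Q(w) - Q(v)))  # each difference should be at rounding level
```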

Second, the (hyperbolic) hyperboloid is doubly ruled. The aforementioned separation condition will be adapted to this structure: Informally, two small squares \(\tau _1,\tau _2 \subseteq U\) will be ‘separated’ if their lifts to the hyperboloid do not intersect a common line contained in the surface. While the precise version of this condition will be stated in Sect. 4, we record a few preparatory details here. The Lorentz norm of \((\tau ,\xi ) \in \mathbb {R}\times \mathbb {R}^2\) is defined as

$$\begin{aligned} \llbracket (\tau ,\xi )\rrbracket := \sqrt{|\tau ^2 - \xi _1^2 + \xi _2^2|}. \end{aligned}$$

It is clearly Lorentz invariant, and if \((\tau ,\xi ),(\tau ',\xi ') \in \Sigma \), then \(\llbracket (\tau ,\xi )-(\tau ',\xi ')\rrbracket = 0\) if and only if \((\tau ,\xi )\) and \((\tau ',\xi ')\) belong to a common line contained in \(\Sigma \). The latter property can be checked by using the formulae

$$\begin{aligned} \ell _{(\tau ,\xi )}^\pm (t) := (\tau ,\xi ) + t(\xi _1\tau \mp \xi _2, 1+\xi _1^2, \xi _1\xi _2 \pm \tau ), \end{aligned}$$
(2.6)

which parametrize the lines \(\ell _{(\tau ,\xi )}^\pm \subset \Sigma \) that intersect at \((\tau ,\xi ) \in \Sigma \). We also define the Lorentz separation of \(\xi ,\zeta \in \Omega \) as the quantity

$$\begin{aligned} {\mathrm{dist}_{\mathrm{L}}}(\xi ,\zeta ) := \llbracket (\phi (\xi ),\xi ) - (\phi (\zeta ),\zeta )\rrbracket , \end{aligned}$$

which can be viewed as the ‘distance’ between \((\phi (\xi ),\xi )\) and \((\phi (\zeta ),\zeta )\) modulo the rulings of \(\Sigma \). Given this definition, a more accurate rendering of our separation requirement would be that \(\mathrm{dist}_{\mathrm{L}}(\xi ,\zeta ) \gtrsim 1\) for all \(\xi \in \tau _1\) and \(\zeta \in \tau _2\). Near the end of this section, we will prove Lemma 2.2, which relates \(\mathrm{dist}_{\mathrm{L}}\) to some other geometric quantities.
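To make the relationship between the rulings (2.6) and the Lorentz separation concrete, here is a small Python sketch. It checks, at sample points of our own choosing, that a point of \(\ell ^+_{(\tau ,\xi )}\) stays on the hyperboloid and has zero Lorentz separation from the base point, whereas a generic pair has positive separation. All names are illustrative and not notation from the article.

```python
import math

def phi(xi):
    """phi(xi) = sqrt(1 + xi1^2 - xi2^2)."""
    return math.sqrt(1.0 + xi[0]**2 - xi[1]**2)

def lift(xi):
    """Point (phi(xi), xi) of Sigma lying above xi."""
    return (phi(xi), xi[0], xi[1])

def lorentz_norm(v):
    """[[v]] = sqrt(|tau^2 - xi1^2 + xi2^2|)."""
    return math.sqrt(abs(v[0]**2 - v[1]**2 + v[2]**2))

def ruling(p, sign, t):
    """Point of the line l^{+}_p (sign = +1) or l^{-}_p (sign = -1) from (2.6)."""
    tau, x1, x2 = p
    d = (x1 * tau - sign * x2, 1.0 + x1**2, x1 * x2 + sign * tau)
    return tuple(pc + t * dc for pc, dc in zip(p, d))

def dist_L(xi, zeta):
    """Lorentz separation of xi and zeta."""
    p, q = lift(xi), lift(zeta)
    return lorentz_norm(tuple(a - b for a, b in zip(p, q)))

xi = (0.02, 0.01)            # hypothetical sample point
q = ruling(lift(xi), +1, 0.05)
print(q[0]**2 - q[1]**2 + q[2]**2)    # should be 1 up to rounding: q lies on the hyperboloid
print(dist_L(xi, (q[1], q[2])))       # should be 0 up to rounding: same ruling
print(dist_L(xi, (-0.03, 0.0)))       # a generic pair: strictly positive
```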

Having described the geometry of the hyperboloid, we turn to defining broad points. Our first step is to divide each surface \(\Sigma _r\) into caps that lie above special sets which we call tiles. Consider the map \(\Phi : \mathbb {R}^2 \rightarrow \mathbb {R}^2\) given by

$$\begin{aligned} \Phi (\xi ) := \frac{(\xi _1\sqrt{1+\xi _2^2}+\xi _2\sqrt{1+\xi _1^2}, \xi _2-\xi _1)}{\sqrt{1+\xi _1^2}+\sqrt{1+\xi _2^2}} \end{aligned}$$

and, for each \(r \in (0,1]\), let \(\Phi _r(\xi ) := r^{-1}\Phi (r\xi )\). Recall the constant \(\delta _0\) used to define U, and assume henceforth that \(\delta _0\) is dyadic. Given two dyadic numbers \( \delta ,\delta ' \in (0,\delta _0]\), a \((\delta ,\delta ',r)\)-tile is any nonempty set of the form

$$\begin{aligned} \rho := \Phi _r(I_\delta \times I_{\delta '}) \cap U, \end{aligned}$$

where \(I_\delta \) and \(I_{\delta '}\) are dyadic intervals contained in \([-\delta _0,\delta _0)\) of length \(\delta \) and \(\delta '\), respectively. We denote the set of \((\delta , \delta ',r)\)-tiles by \(\mathcal {T}_{\delta ,\delta ',r}\). Observe that \(\Phi \) is a diffeomorphism near the origin. (Indeed, \(\Phi \) can be viewed as a perturbation of the map \(\xi \mapsto \frac{1}{2}(\xi _1+\xi _2, \xi _2-\xi _1)\) for \(\xi \) small.) Taking \(\delta _0\) sufficiently small, it is straightforward to check that \(\Vert \Phi _r^{-1}\Vert _{C^1(U)} \lesssim 1\) uniformly in r, and consequently that \(U \subseteq \Phi _r([-\delta _0,\delta _0)^2)\) for every r. We also note that for fixed \(\delta ,\delta ',r\), the \((\delta ,\delta ',r)\)-tiles are pairwise disjoint and satisfy

$$\begin{aligned} U = \bigcup _{\rho \in \mathcal {T}_{\delta ,\delta ',r}}\rho . \end{aligned}$$

Let us briefly mention the geometry underlying these definitions. The map \(\Phi \) was created with the following property in mind: If \(\ell \subset \mathbb {R}^2\) is a vertical or horizontal line that intersects \(\Phi _r^{-1}(U)\), then \(\Phi _r(\ell )\) is a line that lifts to a line contained in \(\Sigma _r\). Thus, each tile lifts to a quadrilateral (in fact, nearly rectangular) cap bounded by four lines. We can think of the collection \(\{\mathcal {T}_{\delta ,\delta ',r}\}_{\delta ,\delta '}\) as a dyadic grid adapted to \(\Sigma _r\). A more precise geometric description of \(\Phi \) will appear in Lemma 2.3 at the end of this section.
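The line-straightening property of \(\Phi \) described above can be tested numerically in the case \(r = 1\). The sketch below (with illustrative names and sample values of our own) lifts three points of \(\Phi (\{a\} \times \mathbb {R})\), and three points of \(\Phi (\mathbb {R}\times \{b\})\), to \(\Sigma \) and measures how far they are from being collinear; for contrast it does the same along the diagonal, where no such property is expected.

```python
import math

def Phi(z1, z2):
    """The map Phi defined above."""
    s1, s2 = math.sqrt(1.0 + z1**2), math.sqrt(1.0 + z2**2)
    return ((z1 * s2 + z2 * s1) / (s1 + s2), (z2 - z1) / (s1 + s2))

def lifted(z1, z2):
    """Lift of Phi(z1, z2) to Sigma, i.e. (phi(x), x) with x = Phi(z1, z2)."""
    x1, x2 = Phi(z1, z2)
    return (math.sqrt(1.0 + x1**2 - x2**2), x1, x2)

def collinearity_defect(p, q, r):
    """Norm of the cross product (q - p) x (r - p); it vanishes iff p, q, r are collinear."""
    u = [b - a for a, b in zip(p, q)]
    w = [b - a for a, b in zip(p, r)]
    c = (u[1] * w[2] - u[2] * w[1],
         u[2] * w[0] - u[0] * w[2],
         u[0] * w[1] - u[1] * w[0])
    return math.sqrt(sum(x * x for x in c))

ts = (-0.08, 0.0, 0.07)  # arbitrary small parameters
vertical = [lifted(0.05, t) for t in ts]     # image of a vertical line {a} x R
horizontal = [lifted(t, -0.04) for t in ts]  # image of a horizontal line R x {b}
diagonal = [lifted(t, t) for t in ts]        # image of the diagonal, for contrast

print(collinearity_defect(*vertical))    # rounding level
print(collinearity_defect(*horizontal))  # rounding level
print(collinearity_defect(*diagonal))    # visibly nonzero
```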

Now, let \(K \ge \delta _0^{-1}\) be a large dyadic constant. As suggested above, we will analyze contributions to \(\mathcal {E}_r\) from square-like sets \(\tau \). The \((K^{-1},K^{-1},r)\)-tiles will function as these basic pieces. However, controlling contributions from longer rectangle-like sets will also be essential. (As we will see, a collection of non-separated squares \(\tau \) must cluster around a line.) For each dyadic number \(\delta \in [K^{-1},\delta _0]\), let

$$\begin{aligned} \mathcal {R}_{\delta ,r} := \mathcal {T}_{K^{-1},\delta ,r} \cup \mathcal {T}_{\delta ,K^{-1},r} \end{aligned}$$

and also set

$$\begin{aligned} \mathcal {R}_r := \bigcup _{\delta \in [K^{-1},\delta _0]} \mathcal {R}_{\delta ,r}. \end{aligned}$$

Elements of \(\mathcal {R}_{\delta ,r}\) resemble rectangles of dimensions \(K^{-1} \times \delta \) and slope approximately 1 or \(-1\). We are now ready to define broad points. Given \(f \in L^1(U)\) and \(\alpha \in (0,1]\), we say that \((t,x) \in \mathbb {R}\times \mathbb {R}^2\) is \(\alpha \)-broad for \(\mathcal {E}_rf\) if

$$\begin{aligned} \max _{\rho \in \mathcal {R}_r}|\mathcal {E}_rf_\rho (t,x)| \le \alpha |\mathcal {E}_rf(t,x)|, \end{aligned}$$

where \(f_\rho := f\chi _\rho \). The \(\alpha \)-broad part of \(\mathcal {E}_rf\) is defined as

$$\begin{aligned} {\text {Br}}_\alpha \mathcal {E}_rf(t,x) := {\left\{ \begin{array}{ll} \mathcal {E}_rf(t,x) &{}\text {if } (t,x) \hbox { is } \alpha \hbox {-broad for } \mathcal {E}_rf,\\ 0 &{}\text {otherwise}.\end{array}\right. } \end{aligned}$$

In the next section, we will reduce Theorem 1.1 to the following estimate on the broad part:

Theorem 2.1

For every \(0 < \varepsilon \ll 1\), there exists a constant \(C_\varepsilon \), depending only on \(\varepsilon \), such that if \(K = 2^{\lceil \varepsilon ^{-10}\rceil }\), then

$$\begin{aligned} \Vert {\text {Br}}_{K^{-\varepsilon }}\mathcal {E}_rf\Vert _{L^{13/4}(B_R)} \le C_\varepsilon R^\varepsilon \Vert f\Vert _2^{12/13}\Vert f\Vert _\infty ^{1/13} \end{aligned}$$

for all \( r \in (0,1]\), \(R \ge 1\), and balls \(B_R\) of radius R.

To conclude this section, we present two geometric lemmas. We will need the following notation: For \(\xi \in \Omega \), let \(\ell _\xi ^\pm \) denote the lines in \(\mathbb {R}^2\) parametrized by

$$\begin{aligned} \ell _\xi ^\pm (t) := \xi + t(1 + \xi _1^2, \xi _1\xi _2 \pm \phi (\xi )). \end{aligned}$$
(2.7)

Geometrically, \(\ell _\xi ^\pm \) are the projections to the spatial coordinates of the lines \(\ell _{(\phi (\xi ),\xi )}^\pm \) defined in (2.6).

Lemma 2.2

For all \(\xi ,\zeta \in U\), we have

(a) \({\text {dist}}(\xi , \ell _\zeta ^+ \cup \ell _\zeta ^-) \lesssim \mathrm{dist}_{\mathrm{L}}(\xi ,\zeta ) \lesssim |\xi -\zeta |\);

(b) \(\mathrm{dist}_{\mathrm{L}}(\xi ,\zeta )^2 \sim |\langle (\nabla ^2\phi (\xi ))^{-1}(\nabla \phi (\xi )-\nabla \phi (\zeta )),\nabla \phi (\xi )-\nabla \phi (\zeta )\rangle |\).

Proof

(a) Let \(\xi '\) be the intersection of \(\ell _\xi ^-\) and \(\ell _\zeta ^+\). An easy calculation shows that \(\angle (\ell _\eta ^+,\ell _{\eta '}^-) \gtrsim 1\) for all \(\eta ,\eta ' \in U\). (In fact, the lines are nearly orthogonal.) In particular, the law of sines implies that \(\xi ' \in CU\) for some constant C. Let \(L := B_\nu R_\omega \), as defined in (2.1) and (2.2), with

$$\begin{aligned} \nu&:= \zeta _1,\\ \omega&:= \bigg (\frac{\phi (\zeta )}{\sqrt{1+\zeta _1^2}}, -\frac{\zeta _2}{\sqrt{1+\zeta _1^2}}\bigg ). \end{aligned}$$

Then \(L(\phi (\zeta ),\zeta ) = (1,0)\). Let \(\eta := \overline{L}(\xi )\) and \(\eta ' := \overline{L}(\xi ')\), using the notation from (2.5). Since \(\nu \) and \(\omega _2\) are very small, \(\overline{L}\) is essentially a perturbation of the identity. It is easy to check that \(L(\phi (\xi ),\xi ) \in \Sigma \) for all \(\xi \in CU\), provided \(\delta _0\) is sufficiently small, and thus \(\overline{L}\) is invertible on CU. Additionally, we have the bound \(\Vert \overline{L}^{-1}\Vert _{C^1(\overline{L}(CU))} \lesssim 1\). Combining these facts, we see that

$$\begin{aligned} {\text {dist}}(\xi ,\ell _\zeta ^+) \le |\xi - \xi '| \lesssim |\eta -\eta '|. \end{aligned}$$
(2.8)

Since L preserves the hyperboloid and is linear, it must permute the lines contained in the surface. Therefore, since L is close to the identity,

$$\begin{aligned} \overline{L}(\ell _\zeta ^+)&= \ell _0^+ = \mathbb {R}(1,1),\\ \overline{L}(\ell _\xi ^-)&= \ell _\eta ^-, \end{aligned}$$

which implies that \(\{\eta '\} = \mathbb {R}(1,1) \cap \ell _\eta ^-\). Then, since \(\eta \in \ell _\eta ^-\) and \(\angle (\mathbb {R}(1,1),\ell _\eta ^-) \gtrsim 1\), it follows that \(|\eta -\eta '| \lesssim {\text {dist}}(\eta ,\mathbb {R}(1,1))\). Thus, by (2.8), we have \({\text {dist}}(\xi ,\ell _\zeta ^+) \lesssim {\text {dist}}(\eta ,\mathbb {R}(1,1)) \le |\eta _1-\eta _2|\). A similar argument shows that \({\text {dist}}(\xi ,\ell _\zeta ^-) \lesssim |\eta _1+\eta _2|\). Hence,

$$\begin{aligned} {\text {dist}}(\xi ,\ell _\zeta ^+ \cup \ell _\zeta ^-)&= \min \{{\text {dist}}(\xi ,\ell _\zeta ^+),{\text {dist}}(\xi ,\ell _\zeta ^-)\}\\&\lesssim \sqrt{|\eta _1^2 - \eta _2^2|}\\&\lesssim \sqrt{|1-\phi (\eta )|}\\&\sim \llbracket (\phi (\eta ), \eta ) - (1,0)\rrbracket \\&= \mathrm{dist}_{\mathrm{L}}(\xi ,\zeta ), \end{aligned}$$

where the last step used the Lorentz invariance of the Lorentz norm. The second inequality in (a) can be proved in a similar (but easier) fashion. (It also follows from part (b), using the Cauchy–Schwarz inequality and bounds on the derivatives of \(\phi \).)

(b) A straightforward computation shows that the right-hand side of (b) is equal to

$$\begin{aligned} \frac{1}{\phi (\xi )\phi (\zeta )^2}|(1+\xi _1^2)(\xi _1\phi (\zeta )&- \zeta _1\phi (\xi ))^2 + 2\xi _1\xi _2(-\xi _2\phi (\zeta ) + \zeta _2\phi (\xi ))(\xi _1\phi (\zeta ) \\&- \zeta _1\phi (\xi ))+ (-1+\xi _2^2)(-\xi _2\phi (\zeta ) + \zeta _2\phi (\xi ))^2|. \end{aligned}$$

The expression inside absolute value signs is equal to

$$\begin{aligned}&\phi (\xi )^2[(1+\xi _1^2)\zeta _1^2 - 2\xi _1\xi _2\zeta _1\zeta _2+(-1+\xi _2^2)\zeta _2^2]+ \phi (\xi )\phi (\zeta )[-2(1+\xi _1^2)\xi _1\zeta _1 \\&\quad +2\xi _1\xi _2(\xi _2\zeta _1+\xi _1\zeta _2)-2(-1+\xi _2^2)\xi _2\zeta _2]+ \phi (\zeta )^2[(1+\xi _1^2)\xi _1^2 - 2\xi _1^2\xi _2^2 \\&\quad + (-1+\xi _2^2)\xi _2^2], \end{aligned}$$

which, by the relations \(\phi (\xi )^2 = 1 + \xi _1^2 - \xi _2^2\) and \(\phi (\zeta )^2 = 1 +\zeta _1^2 - \zeta _2^2\), simplifies to

$$\begin{aligned}&\phi (\xi )^2[(\xi _1\zeta _1-\xi _2\zeta _2)^2+\phi (\zeta )^2-1] + 2\phi (\xi )^3\phi (\zeta )[-\xi _1\zeta _1+\xi _2\zeta _2] \\&\quad +\phi (\zeta )^2\phi (\xi )^2[\phi (\xi )^2-1]. \end{aligned}$$

Thus, by a bit more algebra, the right-hand side of (b) factors as

$$\begin{aligned} \frac{\phi (\xi )}{\phi (\zeta )^2}|1+\xi _1\zeta _1-\xi _2\zeta _2-\phi (\xi )\phi (\zeta )||\xi _1\zeta _1-\xi _2\zeta _2-\phi (\xi )\phi (\zeta )-1|. \end{aligned}$$

We also compute that

$$\begin{aligned} \mathrm{dist}_{\mathrm{L}}(\xi ,\zeta )^2 = 2|1+\xi _1\zeta _1-\xi _2\zeta _2-\phi (\xi )\phi (\zeta )|, \end{aligned}$$

and (b) follows. \(\square \)
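The algebra in part (b) is easy to get wrong, so a numerical cross-check may be reassuring. The Python sketch below evaluates the right-hand side of (b) directly, compares it with the factored expression obtained in the proof, and compares both with \(\mathrm{dist}_{\mathrm{L}}(\xi ,\zeta )^2\). The closed form used for \((\nabla ^2\phi )^{-1}\) is our own hand computation (it is consistent with the first display of the proof of (b)), and the function names and sample points are illustrative.

```python
import math

def phi(x):
    return math.sqrt(1.0 + x[0]**2 - x[1]**2)

def grad_phi(x):
    """Gradient of phi: (x1, -x2) / phi(x)."""
    p = phi(x)
    return (x[0] / p, -x[1] / p)

def hess_inv_form(x, v):
    """<(Hess phi(x))^{-1} v, v>, assuming the closed form
    (Hess phi)^{-1} = phi * [[1 + x1^2, x1 x2], [x1 x2, -1 + x2^2]]."""
    a, b, c = 1.0 + x[0]**2, x[0] * x[1], -1.0 + x[1]**2
    return phi(x) * (a * v[0]**2 + 2.0 * b * v[0] * v[1] + c * v[1]**2)

def rhs_b(xi, zeta):
    """Right-hand side of Lemma 2.2(b), evaluated directly."""
    g, h = grad_phi(xi), grad_phi(zeta)
    return abs(hess_inv_form(xi, (g[0] - h[0], g[1] - h[1])))

def factored(xi, zeta):
    """Factored form of the right-hand side of (b) from the proof."""
    P = xi[0] * zeta[0] - xi[1] * zeta[1]
    pp = phi(xi) * phi(zeta)
    return phi(xi) / phi(zeta)**2 * abs(1.0 + P - pp) * abs(P - pp - 1.0)

def dist_L_sq(xi, zeta):
    """Square of the Lorentz separation of xi and zeta."""
    d = (phi(xi) - phi(zeta), xi[0] - zeta[0], xi[1] - zeta[1])
    return abs(d[0]**2 - d[1]**2 + d[2]**2)

xi, zeta = (0.02, 0.01), (-0.03, 0.0)  # hypothetical sample points
print(rhs_b(xi, zeta), factored(xi, zeta))   # should agree up to rounding
print(rhs_b(xi, zeta) / dist_L_sq(xi, zeta)) # should be comparable to 1
```

For points near the origin, the factor \(|\xi _1\zeta _1-\xi _2\zeta _2-\phi (\xi )\phi (\zeta )-1|\) is close to 2, which is why the last ratio is expected to be close to 1.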

Let us briefly interpret Lemma 2.2. Part (a) says that points with small Lorentz separation lie near a common line, while points with large Lorentz separation are genuinely separated. Part (b) relates Lorentz distance to a measure of ‘transversality’ that naturally arises in bilinear restriction theory (see [7, Theorem 1.1]). Crucially, whenever \(\xi \) and \(\zeta \) belong to separated squares (as discussed above), the right-hand side of (b) will be bounded below.

Lemma 2.3

If \(\xi \in U\) and \(\zeta = \Phi ^{-1}(\xi )\), then \(\ell _\xi ^+ \cap 2U = \Phi (\{\zeta _1\} \times \mathbb {R}) \cap 2U\) and \(\ell _\xi ^- \cap 2U = \Phi (\mathbb {R}\times \{\zeta _2\}) \cap 2U\).

Proof

We will only prove the first equality; the second follows in a similar manner. The proof rests on two claims.

Claim 1. If \(|\xi |,|\xi '| \le 1/2\) and \(\xi ' \in \ell _\xi ^+\), then \(\ell _{\xi '}^+ = \ell _\xi ^+\). Consider the lines \(\ell _\xi ^+\), \(\ell _{\xi '}^+\), and \(\ell _{\xi '}^-\). Each one contains \(\xi '\) and lifts to a line contained in \(\Sigma \). By elementary geometry, no three lines in the hyperboloid intersect at a common point. Thus, two of \(\ell _\xi ^+\), \(\ell _{\xi '}^+\), and \(\ell _{\xi '}^-\) must be identical. Since \(\ell _{\xi '}^+ \ne \ell _{\xi '}^-\) and \(\ell _{\xi }^+ \ne \ell _{\xi '}^-\), as is easy to check, we conclude that \(\ell _{\xi '}^+ = \ell _\xi ^+\).

Claim 2. For every \(\xi \in \mathbb {R}^2\), we have \(\{\Phi (\xi )\} = \ell _{(\xi _1,0)}^+ \cap \ell _{(\xi _2,0)}^-\). This relation can be checked directly, using (2.7). It is helpful to reparametrize (2.7) so that the second coordinates of \(\ell _{(\xi _1,0)}^+(t)\) and \(\ell _{(\xi _2,0)}^-(t)\) are identically t and \(-t\), respectively.

Now, fix \(\xi \in U\) and let \(\zeta := \Phi ^{-1}(\xi )\). Let \(\xi ' \in \ell _\xi ^+ \cap 2U\) and \(\zeta ' := \Phi ^{-1}(\xi ')\). Claim 2 implies that \(\xi \in \ell _{(\zeta _1,0)}^+\) and \(\xi ' \in \ell _{(\zeta _1',0)}^+\). Hence, by claim 1, we have \(\ell _{(\zeta _1,0)}^+ = \ell _\xi ^+ = \ell _{\xi '}^+ = \ell _{(\zeta _1',0)}^+\), and it follows that \(\zeta _1 = \zeta _1'\). Since \(\xi '\) was arbitrary, we conclude that \(\ell _\xi ^+ \cap 2U \subseteq \Phi (\{\zeta _1\} \times \mathbb {R}) \cap 2U\). The other direction is similar: Let \(\xi ' \in \Phi (\{\zeta _1\} \times \mathbb {R}) \cap 2U\), so that \(\xi ' = \Phi (\zeta _1,t)\) with \((\zeta _1,t) \in \Phi ^{-1}(2U) \subseteq \Omega \). Claim 2 implies that \(\xi ,\xi ' \in \ell _{(\zeta _1,0)}^+\). Hence, \(\xi ' \in \ell _\xi ^+\) by claim 1, and it follows that \(\Phi (\{\zeta _1\} \times \mathbb {R}) \cap 2U \subseteq \ell _\xi ^+ \cap 2U\). \(\square \)

3 Reduction to Theorem 2.1

In this section, we adapt the argument of Kim in [6] to show that Theorem 2.1 implies Theorem 1.1. The following parabolic rescaling lemma is the main tool required for this reduction.

Lemma 3.1

Let \(r \in (0,1]\) be dyadic and let \(\theta \in [0,1]\). If \(\Vert \mathcal {E}_s g\Vert _{L^q(B_{R/2})} \le M\Vert g\Vert _2^{1-\theta }\Vert g\Vert _\infty ^\theta \) for all \(s \in (0,1]\), balls \(B_{R/2}\) of radius R/2, and \(g \in L^\infty (U)\), then there exists an absolute constant C such that \(\Vert \mathcal {E}_r h\Vert _{L^q(B_R)} \le CM(\delta \delta ')^{\frac{1+\theta }{2} -\frac{2}{q}}\Vert h\Vert _2^{1-\theta }\Vert h\Vert _\infty ^\theta \) for all bounded functions h supported in \(\rho \in \mathcal {T}_{\delta ,\delta ',r}\), provided \(\delta ,\delta '\) are sufficiently small.

Proof

Fix \(h \in L^\infty (U)\) supported in \(\rho \in \mathcal {T}_{\delta ,\delta ',r}\). There exists \(\rho _1 \in \mathcal {T}_{r\delta ,r\delta ',1}\) such that \(r\rho \subseteq \rho _1\). By parabolic rescaling, we have

$$\begin{aligned} \Vert \mathcal {E}_r h\Vert _{L^q(B_R)} = r^{\frac{4}{q}-2}\Vert \mathcal {E}_1h_{\rho _1}\Vert _{L^q(P_r(B_R))}, \end{aligned}$$

where \(h_{\rho _1} := h(r^{-1}\cdot )\) is supported in \(\rho _1\). We assume without loss of generality that \(\delta \le \delta '\) and fix \(\eta \in \rho _1\). We claim that \(\rho _1\) lies in the intersection of an \(O(r\delta )\)-neighborhood of \(\ell _\eta ^+\) and an \(O(r\delta ')\)-neighborhood of \(\ell _\eta ^-\). Indeed, let \(\eta ' \in \rho _1\) and set \(\zeta = \Phi ^{-1}(\eta )\) and \(\zeta ' = \Phi ^{-1}(\eta ')\). By the definition of \((r\delta ,r\delta ',1)\)-tile, we have

$$\begin{aligned} {\text {dist}}(\zeta ', (\{\zeta _1\}\times \mathbb {R}) \cap \Phi ^{-1}(U))&\le r\delta ,\\ {\text {dist}}(\zeta ', (\mathbb {R}\times \{\zeta _2\}) \cap \Phi ^{-1}(U))&\le r\delta '. \end{aligned}$$

Thus, by Lemma 2.3 and the boundedness of \(\Vert \nabla \Phi \Vert \) near the origin, it follows that

$$\begin{aligned} {\text {dist}}(\eta ', \ell _\eta ^+)&\lesssim r\delta ,\\ {\text {dist}}(\eta ', \ell _\eta ^-)&\lesssim r\delta ', \end{aligned}$$

proving the claim.

Now, let \(L := (D_\lambda B_\nu R_\omega )^{-1}\) with

$$\begin{aligned} \lambda&:= \sqrt{\frac{\delta }{\delta '}},\\ \nu&:= \eta _1,\\ \omega&:= \bigg (\frac{\phi (\eta )}{\sqrt{1+\eta _1^2}}, -\frac{\eta _2}{\sqrt{1+\eta _1^2}}\bigg ), \end{aligned}$$

using the notation from (2.1)–(2.3). As in the proof of Lemma 2.2, the map \(\overline{B_\nu R_\omega }\) sends \(\eta \) to the origin and \(\ell _\eta ^\pm \) to \(\ell _0^\pm = \mathbb {R}(1,\pm 1)\) and satisfies \(\Vert \overline{B_\nu R_\omega }\Vert _{C^1(U)} \lesssim 1\). Thus, by the claim, \(\overline{B_\nu R_\omega }(\rho _1)\) lies in an \(O(r\delta ) \times O(r\delta ')\) rectangle with slope 1 centered at the origin, and consequently \(\overline{D_\lambda }(\overline{B_\nu R_\omega }(\rho _1))\) is contained in sU for some \(s \lesssim r\sqrt{\delta \delta '}\). It is easy to check that \(B_\nu R_\omega (\phi (\xi ),\xi ) \in \Sigma \) for all \(\xi \in U\), and thus by the discussion following (2.5),

$$\begin{aligned} \overline{L^{-1}}(\rho _1) = \overline{D_\lambda }(\overline{B_\nu R_\omega }(\rho _1)) \subseteq sU. \end{aligned}$$
(3.1)

We claim that

$$\begin{aligned} \overline{L}^{-1}(\rho _1) := \{\xi \in \Omega : \overline{L}(\xi ) \in \rho _1\} = \overline{L^{-1}}(\rho _1). \end{aligned}$$
(3.2)

Indeed, given a set \(V \subseteq \Omega \), let \(V^\pm := \{(\pm \phi (\xi ),\xi ) : \xi \in V\}\). Then

$$\begin{aligned} \overline{L}^{-1}(\rho _1)&= \{\xi \in \Omega : L(\phi (\xi ),\xi ) \in \rho _1^+ \cup \rho _1^-\}\\&= \{\xi \in \Omega : (\phi (\xi ),\xi ) \in L^{-1}(\rho _1^+) \cup -L^{-1}((-\rho _1)^+)\}. \end{aligned}$$

It is easy to check that \(e_1 \cdot L^{-1}(\phi (\zeta ),\zeta ) > 0\) for every \(\zeta \in U\). Thus, since \(-\rho _1 \subseteq U\) and \(\phi \ge 0\), we have \((\phi (\xi ),\xi ) \notin -L^{-1}((-\rho _1)^+)\) for every \(\xi \). Hence,

$$\begin{aligned} \overline{L}^{-1}(\rho _1) = \{\xi \in \Omega : (\phi (\xi ),\xi ) \in L^{-1}(\rho _1^+)\} = \overline{L^{-1}}(\rho _1), \end{aligned}$$

proving the claim.

Now, define \(F : \Sigma \rightarrow {\mathbb {C}}\) by \(F(\tau ,\xi ) := h_{\rho _1}(\xi )\phi (\xi )\) and assume that \(\delta ,\delta '\) are small enough that \(s \le 1\). Then, using (3.2) and (3.1), it is straightforward to check that \(L^{-1}({\text {supp}}F) \subseteq \Sigma \). Thus,

$$\begin{aligned} \mathcal {E}_1h_{\rho _1}(t,x)&= e^{-2\pi it}\int _\Sigma e^{2\pi i(t,x)\cdot (\tau ,\xi )}F(\tau ,\xi )d\mu (\tau ,\xi )\\&=e^{-2\pi it}\int _\Sigma e^{2\pi i(t,x)\cdot L(\tau ,\xi )}F(L(\tau ,\xi ))d\mu (\tau ,\xi ), \end{aligned}$$

where \(d\mu \) is the Lorentz-invariant measure given by (2.4). Hence, for \(H(\xi ) := h_{\rho _1}(\overline{L}(\xi ))\frac{\phi (\overline{L}(\xi ))}{\phi (\xi )}\), we have

$$\begin{aligned} |\mathcal {E}_1h_{\rho _1}(t,x)| = |\mathcal {E}_1H(L^*(t,x))|. \end{aligned}$$

Noting that \(|\det L| = 1\), we obtain the relation

$$\begin{aligned} \Vert \mathcal {E}_r h\Vert _{L^q(B_R)} = r^{\frac{4}{q}-2}\Vert \mathcal {E}_1 H\Vert _{L^q(L^* P_r(B_R))}, \end{aligned}$$

and parabolic rescaling then gives

$$\begin{aligned} \Vert \mathcal {E}_rh\Vert _{L^q(B_R)} \sim (\delta \delta ')^{1-\frac{2}{q}}\Vert \mathcal {E}_s[H(s\cdot )]\Vert _{L^q(P_{s^{-1}}L^* P_r(B_R))}, \end{aligned}$$

where \(H(s\cdot )\) is supported in U by (3.2) and (3.1).

We claim that \(P_{s^{-1}}L^* P_r(B_R)\) is covered by a bounded number of balls of radius R/2. Assuming the claim is true, the hypothesis of the lemma implies that

$$\begin{aligned} \Vert \mathcal {E}_r h\Vert _{L^q(B_R)} \lesssim M(\delta \delta ')^{1-\frac{2}{q}}\Vert H(s\cdot )\Vert _2^{1-\theta }\Vert H(s\cdot )\Vert _\infty ^\theta . \end{aligned}$$
(3.3)

To prove the claim, we may assume by translation invariance that \(B_R\) is centered at the origin. Let \(Q(a,b,c)\) denote any rectangular box centered at zero with sides of length O(a), O(b), O(c) parallel to (1, 0, 0), (0, 1, 1), \((0,1,-1)\), respectively. Thus, for example, \(B_R \subseteq Q(R,R,R)\) and

$$\begin{aligned} P_r(B_R) \subseteq Q\bigg (\frac{R}{r^2},\frac{R}{r},\frac{R}{r}\bigg ). \end{aligned}$$

We have \(L^* = D_\lambda ^{-*}B_\nu ^{-*}R_\omega ^{-*}\), where \(S^{-*} := (S^{-1})^*\). Since \(R_\omega ^{-*}\) and \(B_\nu ^{-*}\) have bounded norm, we can ignore their contribution. Thus, from the definition of \(D_\lambda \), we have

$$\begin{aligned} L^* P_r(B_R) \subseteq Q\bigg (\frac{R}{r^2},\frac{\sqrt{\delta '}R}{\sqrt{\delta }r}, \frac{\sqrt{\delta }R}{\sqrt{\delta '}r}\bigg ). \end{aligned}$$

The definition of s then implies that

$$\begin{aligned} P_{s^{-1}}L^* P_r(B_R) \subseteq Q(\delta \delta ' R, \delta 'R, \delta R), \end{aligned}$$

which proves the claim.

Finally, to finish the proof, we need to undo the changes of variable we have used. Using (3.2) and (3.1), we have \(L(\phi (\xi ),\xi ) \in \Sigma \) for all \(\xi \in \overline{L}^{-1}(\rho _1)\). Thus, \(\overline{L}\) is invertible on \({\text {supp}}H\) with \(\overline{L}^{-1}(\zeta ) = \overline{L^{-1}}(\zeta )\) for \(\zeta \in \overline{L}({\text {supp}}H) \subseteq U\). Moreover, \(\overline{L^{-1}} = \overline{D_\lambda }\circ \overline{B_\nu } \circ \overline{R_\omega }\) on U, so a straightforward calculation shows that \(|\det \nabla \overline{L}^{-1}| \lesssim 1\) on U. Using these observations, we find that

$$\begin{aligned} \Vert H(s\cdot )\Vert _2&\lesssim \frac{1}{\sqrt{\delta \delta '}}\Vert h\Vert _2,\\ \Vert H(s\cdot )\Vert _\infty&\lesssim \Vert h\Vert _\infty . \end{aligned}$$

Plugging these bounds into (3.3) completes the proof. \(\square \)
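The role of the dilation \(D_\lambda \) with \(\lambda = \sqrt{\delta /\delta '}\) in the proof above is to turn a long thin rectangle into a square. The following Python sketch illustrates this on a model rectangle of half-length \(r\delta '\) along (1, 1) and half-width \(r\delta \) along \((1,-1)\). It is only an illustration: the function name and the values of \(r,\delta ,\delta '\) are our own choices, and the model rectangle stands in for \(\overline{B_\nu R_\omega }(\rho _1)\).

```python
import math

def dilation_xy(lam, x):
    """Spatial part of D_lambda from (2.3), acting on x = (xi1, xi2)."""
    a = (lam + 1.0 / lam) / 2.0
    b = (lam - 1.0 / lam) / 2.0
    return (a * x[0] + b * x[1], b * x[0] + a * x[1])

r, delta, delta_p = 0.5, 2.0**-8, 2.0**-4   # sample values with delta <= delta'
lam = math.sqrt(delta / delta_p)
s = r * math.sqrt(delta * delta_p)          # expected side length after dilation

# Half-axes of the model rectangle: long along (1, 1), short along (1, -1).
inv_sqrt2 = 1.0 / math.sqrt(2.0)
long_axis = (r * delta_p * inv_sqrt2, r * delta_p * inv_sqrt2)
short_axis = (r * delta * inv_sqrt2, -r * delta * inv_sqrt2)

# D_lambda scales (1, 1) by lambda and (1, -1) by 1/lambda, so both ratios should be 1.
print(math.hypot(*dilation_xy(lam, long_axis)) / s)
print(math.hypot(*dilation_xy(lam, short_axis)) / s)
```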

Proposition 3.2

Assume that Theorem 2.1 holds. Then for every \(\theta \in (3/13, 1]\) and \(0 < \varepsilon \ll _\theta 1\), there exists a constant \(C_{\varepsilon ,\theta }\), depending only on \(\varepsilon \) and \(\theta \), such that

$$\begin{aligned} \Vert \mathcal {E}_r f\Vert _{L^{13/4}(B_R)} \le C_{\varepsilon ,\theta }R^\varepsilon \Vert f\Vert _2^{1-\theta }\Vert f\Vert _\infty ^\theta \end{aligned}$$

for all \(r \in (0,1]\), \(R \ge 1\), and balls \(B_R\) of radius R.

Proof

We will induct on R. The base case, that \(R \sim 1\), holds trivially. We assume as our induction hypothesis that the proposition holds with R/2 in place of R. Additionally, we may assume that \(2C_\varepsilon ^{13/4} \le C_{\varepsilon ,\theta }^{13/4}\), where \(C_\varepsilon \) is the constant from Theorem 2.1. The definition of \(K^{-\varepsilon }\)-broad implies that

$$\begin{aligned} |\mathcal {E}_r f(t,x)| \le \max \{|{\text {Br}}_{K^{-\varepsilon }}\mathcal {E}_rf(t,x)|,K^\varepsilon \max _{\rho \in \mathcal {R}_r}|\mathcal {E}_rf_\rho (t,x)|\} \end{aligned}$$

for every \((t,x) \in \mathbb {R}\times \mathbb {R}^2\). It follows that

$$\begin{aligned} \int _{B_R}|\mathcal {E}_rf|^{13/4} \le \int _{B_R}|{\text {Br}}_{K^{-\varepsilon }}\mathcal {E}_rf|^{13/4} + K^{\frac{13}{4}\varepsilon }\sum _{\rho \in \mathcal {R}_r}\int _{B_R}|\mathcal {E}_rf_\rho |^{13/4} =: \text {I} + \text {II}. \end{aligned}$$

To bound the first term, we use Theorem 2.1 and Hölder’s inequality to get

$$\begin{aligned} \text {I} \le (C_\varepsilon R^\varepsilon \Vert f\Vert _2^{12/13}\Vert f\Vert _\infty ^{1/13})^{13/4} \le (C_\varepsilon R^\varepsilon \Vert f\Vert _2^{1-\theta }\Vert f\Vert _\infty ^\theta )^{13/4} \le \frac{1}{2}(C_{\varepsilon ,\theta }R^\varepsilon \Vert f\Vert _2^{1-\theta }\Vert f\Vert _\infty ^{\theta })^{13/4}. \end{aligned}$$

To bound the second term, we will use Lemma 3.1. We may assume that r is dyadic by parabolic rescaling, and the other hypothesis of the lemma holds by our inductive assumption. Additionally, by Hölder’s inequality, we may assume that \(\theta \) is close to 3/13; in particular, that \(\theta \le 5/13\). Then

$$\begin{aligned} \text {II}&\le K^{\frac{13}{4}\varepsilon }\sum _{\delta \in [K^{-1},\delta _0]}\sum _{\rho \in \mathcal {R}_{\delta ,r}}(CC_{\varepsilon ,\theta }R^{\varepsilon }(\delta K^{-1})^{\frac{1}{2}(\theta -\frac{3}{13})}\Vert f_\rho \Vert _{2}^{1-\theta }\Vert f_\rho \Vert _{\infty }^\theta )^{13/4}\\&\le K^{\frac{13}{4}\varepsilon + \frac{13}{8}(\frac{3}{13}-\theta )}C^{13/4}\sum _{\delta \in [K^{-1},\delta _0]}\delta ^{\frac{13}{8}(\theta -\frac{3}{13})}(C_{\varepsilon ,\theta }R^\varepsilon )^{13/4}\sum _{\rho \in \mathcal {R}_{\delta ,r}}\Vert f_\rho \Vert _{2}^{\frac{13}{4}(1-\theta )}\Vert f\Vert _\infty ^{\frac{13}{4}\theta }\\&\le \bigg [K^{\frac{13}{4}\varepsilon + \frac{13}{8}(\frac{3}{13}-\theta )}C^{13/4}\bigg (\sum _{\delta \in [K^{-1},\delta _0]}\delta ^{\frac{13}{8}(\theta -\frac{3}{13})}\bigg )2^{\frac{13}{8}(1-\theta )}\bigg ](C_{\varepsilon ,\theta }R^\varepsilon \Vert f\Vert _2^{1-\theta }\Vert f\Vert _\infty ^\theta )^{13/4}, \end{aligned}$$

where the last step used the inclusion \(\ell ^2 \hookrightarrow \ell ^{\frac{13}{4}(1-\theta )}\) and that \(\mathcal {R}_{\delta ,r}\) covers U with overlap of multiplicity 2. Since \(\theta > 3/13\), the sum over \(\delta \) is bounded and the power of K is negative for \(\varepsilon \) sufficiently small. Thus, since \(K \rightarrow \infty \) as \(\varepsilon \rightarrow 0\) by the hypothesis of Theorem 2.1, the expression in square brackets is at most 1/2 for \(\varepsilon \) sufficiently small, and the induction closes. \(\square \)
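
The parameter constraints used in the preceding proof can be made explicit. The following is only a sketch of the bookkeeping, under the assumption (suggested by the notation but not stated in this section) that \(\delta \) ranges over dyadic values; the abbreviation \(a := \frac{13}{8}(\theta -\frac{3}{13}) > 0\) is introduced here for convenience.

$$\begin{aligned} \frac{13}{4}(1-\theta ) \ge 2&\iff \theta \le \frac{5}{13},\\ \frac{13}{4}\varepsilon - a < 0&\iff \varepsilon < \frac{1}{2}\Big (\theta -\frac{3}{13}\Big ),\\ \sum _{\delta \in [K^{-1},\delta _0]}\delta ^{a}&\le \delta _0^{a}\sum _{j \ge 0}2^{-ja} = \frac{\delta _0^{a}}{1-2^{-a}}. \end{aligned}$$

The first line is what permits the embedding \(\ell ^2 \hookrightarrow \ell ^{\frac{13}{4}(1-\theta )}\), the second identifies when the power of K is negative, and the third shows that the sum over \(\delta \) is bounded independently of K.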

Assuming Theorem 2.1 holds, Proposition 3.2 implies the restricted strong type bounds

$$\begin{aligned} \Vert \mathcal {E}_rf_E\Vert _{L^{13/4}(B_R)} \lesssim _{\varepsilon ,p} R^\varepsilon |E|^{1/p} \end{aligned}$$

for all \(p > 13/5\), measurable sets \(E \subseteq U\), and \(|f_E| \lesssim \chi _E\). Then, by real interpolation with the trivial \(L^1 \rightarrow L^\infty \) estimate, we obtain the strong type bounds

$$\begin{aligned} \Vert \mathcal {E}_r f\Vert _{L^{q}(B_R)} \lesssim _{\varepsilon ,p,q}R^\varepsilon \Vert f\Vert _p \end{aligned}$$

for all \(q > 13/4\) and \(p > (q/2)'\). Tao’s epsilon removal lemma, in the form of Theorem 5.3 in [6], consequently gives the global strong type bounds

$$\begin{aligned} \Vert \mathcal {E}_r f\Vert _q \lesssim _{p,q}\Vert f\Vert _p \end{aligned}$$
(3.4)

for the same range of \(p\) and \(q\), completing the proof of Theorem 1.1.
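
The exponents in the preceding argument can be checked directly. What follows is a sketch of that bookkeeping only; the interpolation parameter \(s\) and the endpoint labels \(p_0, q_0\) are introduced here for illustration. If \(|f_E| \lesssim \chi _E\), then \(\Vert f_E\Vert _2 \lesssim |E|^{1/2}\) and \(\Vert f_E\Vert _\infty \lesssim 1\), so

$$\begin{aligned} \Vert f_E\Vert _2^{1-\theta }\Vert f_E\Vert _\infty ^\theta \lesssim |E|^{\frac{1-\theta }{2}}, \qquad \frac{1-\theta }{2} \in \Big [0,\frac{5}{13}\Big ) \text { for } \theta \in \Big (\frac{3}{13},1\Big ], \end{aligned}$$

which accounts for the range \(p > 13/5\). Interpolating between the excluded endpoint \((1/p_0,1/q_0) = (5/13,4/13)\) and the trivial point \((1,0)\) with parameter \(s \in (0,1)\) gives

$$\begin{aligned} \frac{1}{p} = (1-s)\frac{5}{13} + s = \frac{5+8s}{13}, \qquad \frac{1}{q} = (1-s)\frac{4}{13}, \qquad 1 - \frac{2}{q} = \frac{5+8s}{13} = \frac{1}{p}. \end{aligned}$$

Thus the interpolation segment lies on the line \(p = (q/2)'\), and since only exponents with \(1/p_0 < 5/13\) are available at \(q_0 = 13/4\), the resulting strong type bounds fall in the open range \(q > 13/4\), \(p > (q/2)'\).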

4 Proof of Theorem 2.1

We are left to prove Theorem 2.1; this will occupy the rest of the article. To enable an inductive argument, we will actually need to prove a slightly stronger theorem, as in [5]. In Sect. 2, we defined broad points by considering the contribution to \(\mathcal {E}_rf\) from each \(f_\rho \), where \(f_\rho := f\chi _\rho \) and \(\rho \in \mathcal {R}_r\). Soon we will work with wave packets of the form \(\mathcal {E}_rf_{\rho ,T}\), where \(f_{\rho ,T}\) is supported not in \(\rho \) but in a slight enlargement of it. Thus, we need a more general definition of broad points in which the functions \(f_\rho \) may have larger, overlapping supports. Given \(\rho = \Phi _r(I_\delta \times I_{\delta '}) \cap U \in \mathcal {T}_{\delta ,\delta ',r}\) and \(m \ge 1\), we define

$$\begin{aligned} m\rho := \Phi _r(m(I_\delta \times I_{\delta '})) \cap U, \end{aligned}$$

where \(m(I_\delta \times I_{\delta '})\) is the m-fold dilate of the rectangle \(I_\delta \times I_{\delta '}\) with respect to its center. Let

$$\begin{aligned} \mathcal {S}_r := \mathcal {R}_{K^{-1},r}; \end{aligned}$$

elements of \(\mathcal {S}_r\) are essentially \(K^{-1} \times K^{-1}\) squares. Now, given \(f \in L^1(U)\), suppose that \(f = \sum _{\tau \in \mathcal {S}_r} f_\tau \), with each \(f_\tau \) supported in \(m \tau \), for some \(m \ge 1\). In our modified definition, \((t,x) \in \mathbb {R}\times \mathbb {R}^2\) is \(\alpha \)-broad for \(\mathcal {E}_r f\) if

$$\begin{aligned} \max _{\rho \in \mathcal {R}_r}\bigg |\sum _{\tau \in \mathcal {S}_r : \tau \subseteq \rho }\mathcal {E}_rf_\tau (t,x)\bigg | \le \alpha |\mathcal {E}_rf(t,x)|. \end{aligned}$$

We define the \(\alpha \)-broad part of \(\mathcal {E}_rf\), still denoted by \({\text {Br}}_{\alpha }\mathcal {E}_rf\), as in Sect. 2. These definitions depend on the particular decomposition \(f = \sum _\tau f_\tau \).

Theorem 4.1

For every \(0 < \varepsilon \ll 1\), there exists a constant \(C_\varepsilon '\), depending only on \(\varepsilon \), such that if \(K = 2^{\lceil \varepsilon ^{-10}\rceil }\), then the following holds: If \(f = \sum _{\tau \in \mathcal {S}_r} f_\tau \) with each \(f_\tau \) supported in \(m \tau \), for some \(m \ge 1\), and if additionally f satisfies

(4.1)

for all \(\xi \in U\) and \(\tau \in \mathcal {S}_r\), then

$$\begin{aligned} \int _{B_R}|{\text {Br}}_{\alpha }\mathcal {E}_rf|^{13/4} \le C_\varepsilon 'R^{\varepsilon +\varepsilon ^6\log (K^\varepsilon \alpha m^2)}\bigg (\sum _{\tau \in \mathcal {S}_r}\Vert f_\tau \Vert _2^2\bigg )^{3/2+\varepsilon } \end{aligned}$$

for all \(r \in (0,1]\), \(R \gg _\varepsilon 1\), balls \(B_R\) of radius R, and \(\alpha \in [K^{-\varepsilon }, 1]\).

A couple of remarks may be helpful. Firstly, the dyadic structure of our tiles, as defined in Sect. 2, implies that if \(\tau \in \mathcal {S}_r\) and \(\rho \in \mathcal {R}_r\), then either \(\tau \cap \rho = \emptyset \) or \(\tau \subseteq \rho \). More generally, if \(\rho _1 \in \mathcal {T}_{\delta _1,\delta _1',r}\) and \(\rho _2 \in \mathcal {T}_{\delta _2,\delta _2',r}\), then either \(\rho _1 \cap \rho _2 = \emptyset \) or \(\rho _1 \cap \rho _2 \in \mathcal {T}_{\min _i \delta _i, \min _i\delta _i',r}\). Secondly, Theorem 4.1 is indeed stronger than Theorem 2.1. We can derive the latter from the former as follows: If \(R \sim _\varepsilon 1\), then the estimate in Theorem 2.1 is trivial, so we may assume that \(R \gg _\varepsilon 1\). By scaling, we also may assume that \(\Vert f\Vert _\infty = 1\). Thus, the condition (4.1) holds automatically. We now apply Theorem 4.1 with \(\alpha = K^{-\varepsilon }\) and \(m=1\) to get

$$\begin{aligned} \int _{B_R}|{\text {Br}}_{K^{-\varepsilon }}\mathcal {E}_rf|^{13/4} \le C_\varepsilon ' R^{\varepsilon }\Vert f\Vert _2^{3+2\varepsilon } \le |U|^{\varepsilon }C_\varepsilon ' R^{\varepsilon }\Vert f\Vert _2^3, \end{aligned}$$

and then raising both sides to the power 4/13 finishes the proof.
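
For the record, the exponents in this reduction work out as follows; this is a routine check under the normalization \(\Vert f\Vert _\infty = 1\) made above, and no new notation is needed. With \(\alpha = K^{-\varepsilon }\) and \(m = 1\),

$$\begin{aligned} \varepsilon ^6\log (K^\varepsilon \alpha m^2)&= \varepsilon ^6\log (K^{\varepsilon }K^{-\varepsilon }) = 0,\\ \Vert f\Vert _2^{2\varepsilon }&\le (|U|\Vert f\Vert _\infty ^2)^{\varepsilon } = |U|^\varepsilon ,\\ \big (|U|^\varepsilon C_\varepsilon 'R^\varepsilon \Vert f\Vert _2^3\big )^{4/13}&= (|U|^\varepsilon C_\varepsilon ')^{4/13}R^{\frac{4}{13}\varepsilon }\Vert f\Vert _2^{12/13}\Vert f\Vert _\infty ^{1/13}. \end{aligned}$$

Since \(R \ge 1\), the factor \(R^{\frac{4}{13}\varepsilon }\) is at most \(R^\varepsilon \), and the right-hand side is homogeneous of degree one in f, as the normalization by scaling requires.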

4.1 Preliminaries

Before beginning the proof of Theorem 4.1, we lay some groundwork. For the remainder of the article, \(\varepsilon \), m, r, R, \(B_R\), and \(\alpha \) are fixed. Implicit constants will be allowed to depend on \(\varepsilon \). The propositions and lemma we record in this subsection are by now quite standard.

We begin with the wave packet decomposition. Let \(\Theta \) be a collection of discs \(\theta \) of radius \(R^{-1/2}\) which cover U with bounded overlap. We denote by \(c_\theta \) the center of \(\theta \), and we let \(v_\theta \) be the unit normal vector to \(\Sigma _r\) at \((\phi _r(c_\theta ),c_\theta )\). We may assume that \(c_\theta \in U\) for every \(\theta \). Let \(\delta := \varepsilon ^2\), and for each \(\theta \), let \(\mathbb {T}(\theta )\) be a collection of tubes parallel to \(v_\theta \) with radius \(R^{\delta +1/2}\) and length R and which cover \(B_R\) with bounded overlap. If \(T \in \mathbb {T}(\theta )\), then \(v(T) := v_\theta \) denotes the direction of T. Finally, we set \(\mathbb {T}:= \bigcup _{\theta \in \Theta }\mathbb {T}(\theta )\). The following wave packet decomposition resembles Proposition 2.6 in [5]:

Proposition 4.2

For each \(T \in \mathbb {T}\), there exists a function \(f_T \in L^2(\mathbb {R}^2)\) such that:

  1. (i)

    If \(T \in \mathbb {T}(\theta )\), then \(f_T\) is supported in \(3\theta \);

  2. (ii)

    If \((t,x) \in B_R \setminus T\), then \(|\mathcal {E}_rf_T(t,x)| \le R^{-1000}\Vert f\Vert _2\);

  3. (iii)

    \(|\mathcal {E}_rf(t,x) - \sum _{T \in \mathbb {T}}\mathcal {E}_rf_T(t,x)| \le R^{-1000}\Vert f\Vert _2\) for every \((t,x) \in B_R\);

  4. (iv)

    If \(T_1,T_2 \in \mathbb {T}(\theta )\) and \(T_1 \cap T_2 = \emptyset \), then \(|\int f_{T_1}\overline{f_{T_2}}| \le R^{-1000}\Vert f\Vert _{L^2(\theta )}^2\);

  5. (v)

    \(\sum _{T \in \mathbb {T}(\theta )}\Vert f_T\Vert _2^2 \lesssim \Vert f\Vert _{L^2(\theta )}^2\).

Proof

Adapting Guth’s argument in [5] is straightforward. The fact that the derivatives of \(\phi _r\) are bounded uniformly in r (i.e. \(\sup _{\xi \in U}|\nabla ^k\phi _r(\xi )| \lesssim _k 1\) with implicit constant independent of r) ensures that all constants arising in the argument can be made uniform in r. We note, in particular, that the crucial derivative estimates appearing in line (17) of [5] hold uniformly in r when adapted to our setting. \(\square \)

Next, we record an orthogonality lemma from [5]. The special case \(N =1\) will be of particular use.

Lemma 4.3

Let \(\mathbb {T}_1,\ldots ,\mathbb {T}_N\) be subsets of \(\mathbb {T}\). Suppose that each tube in \(\mathbb {T}\) belongs to at most M of the \(\mathbb {T}_i\), and for each \(\tau \in \mathcal {S}_r\), let

$$\begin{aligned} f_{\tau ,i} := \sum _{T \in \mathbb {T}_i}f_{\tau ,T}, \end{aligned}$$

where the functions \(f_{\tau ,T}\) come from applying Proposition 4.2 to \(f_\tau \). Then

$$\begin{aligned} \sum _{i=1}^N\int _{3\theta }|f_{\tau ,i}|^2 \lesssim M\int _{10\theta }|f_\tau |^2 \end{aligned}$$

for every \(\theta \in \Theta \), and

$$\begin{aligned} \sum _{i =1}^N\int _U|f_{\tau ,i}|^2 \lesssim M\int _U|f_\tau |^2. \end{aligned}$$

Finally, we turn to polynomial partitioning. Let P be a polynomial on \(\mathbb {R}^d\). We denote the zero set of P by Z(P) and say that \(z \in Z(P)\) is nonsingular if \(\nabla P(z) \ne 0\). If z is nonsingular, then Z(P) is a smooth hypersurface near z. If every point of Z(P) is nonsingular, then we say that P is nonsingular.

Proposition 4.4

(Guth [5]) Given \(g \in L^1(\mathbb {R}^d)\) and \(D \ge 1\), there exists a polynomial P of degree at most D such that P is a product of nonsingular polynomials and each connected component O of \(\mathbb {R}^d \setminus Z(P)\) satisfies

$$\begin{aligned} \int _{O} |g| \sim \frac{1}{D^d}\int _{\mathbb {R}^d} |g|. \end{aligned}$$

We note that a product of nonsingular polynomials may have singular points. However, by a perturbation argument using Sard’s theorem, one can ensure that nonsingular points are dense in the zero set of the partitioning polynomial.

4.2 Main Proof

We are now ready to prove Theorem 4.1 in earnest. We will induct on R and \(\sum _{\tau \in \mathcal {S}_r}\Vert f_\tau \Vert _2^2\). The base cases, that \(R \sim 1\) or \(\sum _{\tau }\Vert f_\tau \Vert _2^2 \le R^{-1000}\), are easy to check, and our induction hypotheses are that Theorem 4.1 holds with: (i) R/2 in place of R, or (ii) g in place of f whenever \(\sum _{\tau }\Vert g_\tau \Vert _2^2 \le \frac{1}{2}\sum _{\tau }\Vert f_\tau \Vert _2^2\). Throughout the proof, we will assume that \(\varepsilon \) is sufficiently small and that R is sufficiently large in relation to \(\varepsilon \).

We begin by setting \(D := R^{\varepsilon ^4}\) and applying Proposition 4.4 to the function \(|{\text {Br}}_\alpha \mathcal {E}_r f|^{13/4}\chi _{B_R}\) to produce a polynomial P of degree at most D such that

$$\begin{aligned} \mathbb {R}^3 \setminus Z(P) = \bigcup _{i \in I}O_i, \end{aligned}$$

where the ‘cells’ \(O_i\) are connected, pairwise disjoint, and satisfy

$$\begin{aligned} \int _{B_R \cap O_i}|{\text {Br}}_\alpha \mathcal {E}_rf|^{13/4} \sim \frac{1}{D^3}\int _{B_R}|{\text {Br}}_\alpha \mathcal {E}_r f|^{13/4}. \end{aligned}$$
(4.2)

In particular, the number of cells is \(\#I \sim D^3\). We define the ‘wall’ W as the \(R^{1/2+\delta }\)-neighborhood of Z(P), and we set \(O_i' := O_i \setminus W\). Thus,

$$\begin{aligned} \int _{B_R}|{\text {Br}}_\alpha \mathcal {E}_rf|^{13/4} = \sum _{i \in I}\int _{B_R \cap O_i'}|{\text {Br}}_\alpha \mathcal {E}_rf|^{13/4} + \int _{B_R \cap W}|{\text {Br}}_\alpha \mathcal {E}_r f|^{13/4}. \end{aligned}$$
(4.3)

We now argue by cases, according to which term on the right-hand side of (4.3) dominates.

4.3 Cellular Case

Suppose that the total contribution from the shrunken cells \(O_i'\) dominates. In this case, we have

$$\begin{aligned} \int _{B_R}|{\text {Br}}_\alpha \mathcal {E}_rf|^{13/4} \lesssim \sum _{i \in I}\int _{B_R \cap O_i'}|{\text {Br}}_\alpha \mathcal {E}_rf|^{13/4}. \end{aligned}$$

Using (4.2), we then see that the contribution from any single \(O_i'\) is controlled by the average of all such contributions. Thus, ‘most’ cells should contribute close to the average, and it is straightforward to show that there exists \(J \subseteq I\) such that \(\#J \sim D^3\) and

$$\begin{aligned} \int _{B_R \cap O_i'}|{\text {Br}}_\alpha \mathcal {E}_r f|^{13/4} \sim \frac{1}{D^3}\int _{B_R}|{\text {Br}}_\alpha \mathcal {E}_r f|^{13/4} \end{aligned}$$
(4.4)

for all \(i \in J\). The lower bound on \(\#J\) will be the basis for a pigeonholing argument shortly.
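
One way to make the selection of J precise is the following pigeonhole sketch. The constants \(c, C_1, C_2\) and the abbreviations \(A, a_i\) are labels introduced here for illustration and do not appear elsewhere in the article. Write \(A := \int _{B_R}|{\text {Br}}_\alpha \mathcal {E}_rf|^{13/4}\) and \(a_i := \int _{B_R \cap O_i'}|{\text {Br}}_\alpha \mathcal {E}_rf|^{13/4}\), and assume \(A > 0\). The cellular hypothesis gives \(\sum _{i \in I}a_i \ge cA\), while (4.2) and \(O_i' \subseteq O_i\) give \(a_i \le C_1AD^{-3}\) and \(\#I \le C_2D^3\). Setting \(J := \{i \in I : a_i \ge cA/(2C_2D^3)\}\), we have

$$\begin{aligned} \sum _{i \notin J}a_i&\le \#I \cdot \frac{cA}{2C_2D^3} \le \frac{cA}{2},\\ \sum _{i \in J}a_i&\ge cA - \frac{cA}{2} = \frac{cA}{2},\\ \#J&\ge \frac{cA/2}{C_1AD^{-3}} = \frac{c}{2C_1}D^3. \end{aligned}$$

Together with the upper bound \(a_i \le C_1AD^{-3}\), this yields both \(\#J \sim D^3\) and (4.4).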

First, some definitions are needed. For each \(i \in I\) and \(\tau \in \mathcal {S}_r\), we set

$$\begin{aligned} \mathbb {T}_i := \{T \in \mathbb {T}: T \cap O_i' \ne \emptyset \} \end{aligned}$$

and

$$\begin{aligned} f_{\tau ,i} := \sum _{T \in \mathbb {T}_i} f_{\tau ,T}, \end{aligned}$$

where the functions \(f_{\tau ,T}\) come from applying Proposition 4.2 to \(f_\tau \). We also set

$$\begin{aligned} f_{i} := \sum _{\tau \in \mathcal {S}_r} f_{\tau ,i}. \end{aligned}$$

Since \(f_\tau \) is supported in \(m \tau \), property (i) in Proposition 4.2 implies that \(f_{\tau ,i}\) is supported in an \(O(R^{-1/2})\)-neighborhood of \(m\tau \). Let \(\overline{f}_i := \chi _U f_i\) and \(\overline{f}_{\tau ,i} := \chi _U f_{\tau ,i}\). If R is sufficiently large, then \({\text {supp}}\overline{f}_{\tau ,i} \subseteq 2m\tau \). Consequently, \(\overline{f}_i\) has a well-defined broad part with respect to these larger squares. Soon we will apply our induction hypothesis to \(\overline{f}_i\) (for some special i) with m replaced by 2m.

Lemma 4.5

If \((t,x) \in O_i'\) and \(\alpha \le 1/2\), then

$$\begin{aligned} |{\text {Br}}_\alpha \mathcal {E}_r f(t,x)| \le |{\text {Br}}_{2\alpha }\mathcal {E}_r \overline{f}_i(t,x)| + R^{-900}\sum _{\tau \in \mathcal {S}_r}\Vert f_\tau \Vert _2. \end{aligned}$$

Proof

First, we may assume that

$$\begin{aligned} |\mathcal {E}_rf(t,x)| \ge R^{-900}\sum _\tau \Vert f_\tau \Vert _2; \end{aligned}$$
(4.5)

otherwise, the required inequality is trivial. Since \((t,x) \in O_i'\), properties (iii) and (ii) in Proposition 4.2 imply that

$$\begin{aligned} \mathcal {E}_r f_\tau (t,x) = \sum _{T \in \mathbb {T}_i}\mathcal {E}_rf_{\tau ,T}(t,x) + O(R^{-990}\Vert f_\tau \Vert _2) \end{aligned}$$

for each \(\tau \). Summing over \(\tau \), we get

$$\begin{aligned} \mathcal {E}_rf(t,x) = \mathcal {E}_rf_i(t,x) + O\Big (R^{-990}\sum _\tau \Vert f_\tau \Vert _2\Big ). \end{aligned}$$
(4.6)

Now it suffices to show that if \((t,x)\) is \(\alpha \)-broad for \(\mathcal {E}_rf\), then \((t,x)\) is \(2\alpha \)-broad for \(\mathcal {E}_r\overline{f}_i\). Assume the former and fix \(\rho \in \mathcal {R}_r\). Using Proposition 4.2 again, we have

$$\begin{aligned} \bigg \vert \sum _{\tau : \tau \subseteq \rho }\mathcal {E}_r\overline{f}_{\tau ,i}(t,x)\bigg \vert&= \bigg \vert \sum _{\tau : \tau \subseteq \rho }\mathcal {E}_rf_{\tau ,i}(t,x)\bigg \vert \\&= \bigg \vert \sum _{\tau : \tau \subseteq \rho }\mathcal {E}_rf_\tau (t,x)\bigg \vert + O\Big (R^{-990}\sum _\tau \Vert f_\tau \Vert _2\Big )\\&\le \alpha |\mathcal {E}_r f(t,x)| + O\Big (R^{-990}\sum _\tau \Vert f_\tau \Vert _2\Big ). \end{aligned}$$

Using (4.5), (4.6), and the fact that \(\alpha \ge K^{-\varepsilon }\), the right-hand side is at most \(2\alpha |\mathcal {E}_rf_i(t,x)| = 2\alpha |\mathcal {E}_r\overline{f}_i(t,x)|\) for R sufficiently large. \(\square \)
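
The final step of the preceding proof can be spelled out as follows. This is a sketch in which C denotes the implicit constant in the error terms and \(\eta := CR^{-90}\) is an abbreviation introduced here. By (4.5), each error term \(O(R^{-990}\sum _\tau \Vert f_\tau \Vert _2)\) is at most \(\eta |\mathcal {E}_rf(t,x)|\), so (4.6) gives \(|\mathcal {E}_rf(t,x)| \le (1-\eta )^{-1}|\mathcal {E}_rf_i(t,x)|\), and therefore

$$\begin{aligned} \alpha |\mathcal {E}_rf(t,x)| + \eta |\mathcal {E}_rf(t,x)| \le \frac{\alpha +\eta }{1-\eta }|\mathcal {E}_rf_i(t,x)| \le \frac{5\alpha /4}{7/8}|\mathcal {E}_rf_i(t,x)| = \frac{10}{7}\alpha |\mathcal {E}_rf_i(t,x)| \le 2\alpha |\mathcal {E}_rf_i(t,x)| \end{aligned}$$

whenever \(\eta \le \alpha /4 \le 1/8\). Since \(\alpha \ge K^{-\varepsilon }\) and K depends only on \(\varepsilon \), the condition \(\eta \le \alpha /4\) holds once R is sufficiently large.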

If \(\alpha > 1/2\), then the estimate in Theorem 4.1 holds trivially, since the power of R can then be made at least 1000 by taking \(\varepsilon \) sufficiently small. Thus, we may assume that \(\alpha \le 1/2\). Applying Lemma 4.5 to (4.4) and recalling that \(D = R^{\varepsilon ^4}\), we get

$$\begin{aligned} \int _{B_R}|{\text {Br}}_\alpha \mathcal {E}_rf|^{13/4} \lesssim D^3\int _{B_R \cap O_i'}|{\text {Br}}_{2\alpha }\mathcal {E}_r\overline{f}_i|^{13/4} + O\Big (R^{-1000}\Big (\sum _{\tau \in \mathcal {S}_r}\Vert f_\tau \Vert _2\Big )^{13/4}\Big ) \end{aligned}$$
(4.7)

for every \(i \in J\). We will now pick \(i_0 \in J\) so that \(\sum _{\tau \in \mathcal {S}_r}\Vert \overline{f}_{\tau ,i_0}\Vert _2^2\) is small, which will allow us to apply our induction hypothesis to \(\overline{f}_{i_0}\). Because Z(P) is the zero set of a polynomial of degree at most D, any line is either contained in Z(P) or intersects Z(P) at most D times. Thus, each tube in \(\mathbb {T}\) belongs to at most \(D+1\) of the sets \(\mathbb {T}_i\). Now, applying Lemma 4.3 and the bound \(\#J \gtrsim D^3\), we must have

$$\begin{aligned} \frac{1}{\#J}\sum _{i \in J}\sum _\tau \Vert f_{\tau ,i}\Vert _2^2 \le \frac{C}{D^2}\sum _\tau \Vert f_\tau \Vert _2^2 \end{aligned}$$

for some constant C. Consequently, there exists \(i_0 \in J\) such that

$$\begin{aligned} \sum _\tau \Vert \overline{f}_{\tau ,i_0}\Vert _2^2 \le \sum _\tau \Vert f_{\tau ,i_0}\Vert _2^2 \le \frac{C}{D^2}\sum _\tau \Vert f_\tau \Vert _2^2 \le \frac{1}{2}\sum _\tau \Vert f_\tau \Vert _2^2 \end{aligned}$$
(4.8)

for R sufficiently large. We can apply Theorem 4.1 to \(\overline{f}_{i_0}\) with 2m in place of m, provided (4.1) holds. Since (4.1) holds for f, Lemma 4.3 gives

Thus, after multiplying \(\overline{f}_{i_0}\) by a constant, we can apply Theorem 4.1 to (4.7) with \(i = i_0\) to get

$$\begin{aligned} \int _{B_R}|{\text {Br}}_\alpha \mathcal {E}_r f|^{13/4}&\le CD^3C_\varepsilon 'R^{\varepsilon + \varepsilon ^6\log (8K^\varepsilon \alpha m^2)}\Big (\sum _{\tau \in \mathcal {S}_r}\Vert \overline{f}_{\tau ,i_0}\Vert _2^2\Big )^{3/2+\varepsilon } \\&\quad + O\Big (R^{-1000}\Big (\sum _{\tau \in \mathcal {S}_r}\Vert f_\tau \Vert _2\Big )^{13/4}\Big ) \end{aligned}$$

for some C. If the big O term dominates, then the desired estimate follows easily. Assuming it does not, then by (4.8) and the definition of D, we have altogether

$$\begin{aligned} \int _{B_R}|{\text {Br}}_\alpha \mathcal {E}_rf|^{13/4} \le 2CR^{-2\varepsilon ^5+\varepsilon ^6\log (8)} C_\varepsilon 'R^{\varepsilon +\varepsilon ^6\log (K^\varepsilon \alpha m^2)}\Big (\sum _{\tau \in \mathcal {S}_r}\Vert f_\tau \Vert _2^2\Big )^{3/2+\varepsilon }, \end{aligned}$$

and the induction closes if \(\varepsilon \) is sufficiently small and R sufficiently large.
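
The exponent of R in the last display arises as follows; this is a sketch in which constants are relabeled freely. By (4.8) and the definition \(D = R^{\varepsilon ^4}\),

$$\begin{aligned} D^3\Big (\sum _{\tau }\Vert \overline{f}_{\tau ,i_0}\Vert _2^2\Big )^{3/2+\varepsilon }&\le D^3\Big (\frac{C}{D^2}\Big )^{3/2+\varepsilon }\Big (\sum _{\tau }\Vert f_\tau \Vert _2^2\Big )^{3/2+\varepsilon } = C^{3/2+\varepsilon }R^{-2\varepsilon ^5}\Big (\sum _{\tau }\Vert f_\tau \Vert _2^2\Big )^{3/2+\varepsilon },\\ R^{\varepsilon ^6\log (8K^\varepsilon \alpha m^2)}&= R^{\varepsilon ^6\log (8)}R^{\varepsilon ^6\log (K^\varepsilon \alpha m^2)}. \end{aligned}$$

Since \(-2\varepsilon ^5 + \varepsilon ^6\log (8) < 0\) whenever \(\varepsilon < 2/\log (8)\), the prefactor \(2CR^{-2\varepsilon ^5+\varepsilon ^6\log (8)}\) is at most 1 once R is sufficiently large, which is exactly what is needed to recover the constant \(C_\varepsilon '\).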

4.4 Algebraic Case

Next, suppose that the contribution from W dominates in (4.3), so that

$$\begin{aligned} \int _{B_R}|{\text {Br}}_\alpha \mathcal {E}_rf|^{13/4} \lesssim \int _{B_R \cap W}|{\text {Br}}_\alpha \mathcal {E}_r f|^{13/4}. \end{aligned}$$
(4.9)

Following Guth [5], we distinguish between tubes that intersect W transversely and those essentially tangent to W. Let \(\mathcal {B}\) be a collection of balls B of radius \(R^{1-\delta }\) that cover \(B_R\) with bounded overlap.

Definition 4.6

Fix \(B \in \mathcal {B}\). Let \(\mathbb {T}_B^\flat \) be the set of tubes T satisfying \(T \cap W \cap B \ne \emptyset \) and \(\angle (v(T),T_z Z(P)) \le R^{2\delta - 1/2}\) for every nonsingular point \(z \in Z(P) \cap 2B \cap 10T\). Let \(\mathbb {T}_B^\sharp \) be the set of tubes T satisfying \(T \cap W \cap B \ne \emptyset \) and \(T \notin \mathbb {T}_B^\flat \).

Observe that if T intersects \(W \cap B\), then T belongs to exactly one of \(\mathbb {T}_B^\flat \) and \(\mathbb {T}_B^\sharp \). (The definition of \(\mathbb {T}_B^\flat \) would be vacuous if \(Z(P) \cap 2B \cap 10T\) contained only singular points; however, as noted above, we can arrange for nonsingular points to be dense in Z(P).) Thus, on \(W \cap B\), each \(\mathcal {E}_rf_\tau \) is well approximated by the sum of the ‘tangent’ and ‘transverse’ wave packets, \(\{\mathcal {E}_rf_{\tau ,T}\}_{T \in \mathbb {T}_B^\flat }\) and \(\{\mathcal {E}_rf_{\tau ,T}\}_{T \in \mathbb {T}_B^\sharp }\), respectively. Roughly speaking, our desired bound for \(\Vert {\text {Br}}_\alpha \mathcal {E}_rf\Vert _{L^{13/4}(B \cap W)}\) will soon be reduced to a broad part estimate on the transverse contribution and a bilinear estimate on the tangent contribution. The following geometric lemma, due to Guth [5], will be critical for establishing those bounds:

Lemma 4.7

(a) Each \(T \in \mathbb {T}\) belongs to \(\mathbb {T}_B^\sharp \) for at most \(D^{O(1)}\) balls \(B \in \mathcal {B}\). (b) For each \(B \in \mathcal {B}\), the number of discs \(\theta \in \Theta \) such that \(\mathbb {T}_B^\flat \cap \mathbb {T}(\theta ) \ne \emptyset \) is at most \(R^{O(\delta ) + 1/2}\).

To carry out the bilinear argument, we need to define the separation condition mentioned in Sect. 2. Recall how we defined the Lorentz separation \(\mathrm{dist}_{\mathrm{L}}(\xi ,\zeta )\) of \(\xi ,\zeta \in \Omega \). We say that two squares \(\tau _1,\tau _2 \in \mathcal {S}_r\) are separated if

$$\begin{aligned} \mathrm{dist}_{\mathrm{L}}(r\xi , r\zeta ) \ge {C_0rm}{K^{-1}} \end{aligned}$$

for all \(\xi \in 2m\tau _1\) and \(\zeta \in 2m\tau _2\), where \(C_0 \ge 1\) is a constant to be chosen later. Part (a) of Lemma 2.2 implies that points having small Lorentz separation must lie near a common line. The next lemma extends this property to collections of non-separated squares.

Lemma 4.8

Let \(\mathcal {I}\subseteq \mathcal {S}_r\) be a collection of pairwise non-separated squares. Then there exist \(\sigma _1,\sigma _2,\sigma _3,\sigma _4 \in \mathcal {T}_{\delta ,\delta _0,r} \cup \mathcal {T}_{\delta _0,\delta ,r}\), with \(K^{-1} \le \delta \lesssim mK^{-1}\), such that \(\tau \subseteq \bigcup _{i=1}^4\sigma _i\) for every \(\tau \in \mathcal {I}\).

Proof

For \(\xi \in U\), let \(\overline{\xi } := \Phi ^{-1}(r\xi )\) and also set

$$\begin{aligned} {I}&:= \bigcup _{\tau \in \mathcal {I}}\tau ,\\ \overline{I}&:= \{\overline{\xi } : \xi \in I\}. \end{aligned}$$

Fix \(\tau _1,\tau _2 \in \mathcal {I}\). By part (a) of Lemma 2.2 and the definition of (non-)separated squares, there exist \(\xi ^*\in 2m\tau _1\) and \(\zeta ^* \in 2m\tau _2\) such that

$$\begin{aligned} {\text {dist}}(r\xi ^*,\ell _{r\zeta ^*}^+ \cup \ell _{r\zeta ^*}^-) \lesssim {rm}{K^{-1}}. \end{aligned}$$

Let \(\eta \) be a point in \(\ell _{r\zeta ^*}^+ \cup \ell _{r\zeta ^*}^-\) closest to \(r\xi ^*\). By elementary geometry, \(\eta \) lies in 2U. Thus, from the bound \(\Vert \Phi ^{-1}\Vert _{C^1(2U)} \lesssim 1\) and Lemma 2.3, we have

$$\begin{aligned} {\text {dist}}(\overline{\xi ^*}, (\{\overline{\zeta _1^*}\} \times \mathbb {R}) \cup (\mathbb {R}\times \{\overline{\zeta _2^*}\})) \lesssim |r\xi ^* - \eta | \lesssim {rm}{K^{-1}}. \end{aligned}$$

Since \({\text {diam}}\Phi ^{-1}(r\cdot 2m\tau ) \lesssim rmK^{-1}\) for each \(\tau \), it follows that

$$\begin{aligned} {\text {dist}}(\overline{\xi }, (\{\overline{\zeta }_1\} \times \mathbb {R}) \cup (\mathbb {R}\times \{\overline{\zeta }_2\})) \le A \end{aligned}$$
(4.10)

for all \(\xi ,\zeta \in {I}\) and some \(A \lesssim rmK^{-1}\). Fix \(\zeta \in {I}\) and set

$$\begin{aligned} S := [\overline{\zeta }_1-A,\overline{\zeta }_1+A] \times \mathbb {R},\\ T := \mathbb {R}\times [\overline{\zeta }_2-A,\overline{\zeta }_2+A], \end{aligned}$$

so that \(\overline{I} \subseteq S \cup T\). Additionally, define

$$\begin{aligned} {3S} := [\overline{\zeta }_1-3A,\overline{\zeta }_1+3A] \times \mathbb {R},\\ {3T} := \mathbb {R}\times [\overline{\zeta }_2-3A,\overline{\zeta }_2+3A]. \end{aligned}$$

We consider three exhaustive cases:

  1. (i)

    If \(\overline{I} \cap (S \setminus 3T) \ne \emptyset \), then (4.10) implies that \(\overline{I} \cap (T \setminus 3S) = \emptyset \), and consequently \(\overline{I} \subseteq 3S\).

  2. (ii)

    If \(\overline{I} \cap (T \setminus 3S) \ne \emptyset \), then (4.10) implies that \(\overline{I} \cap (S \setminus 3T) = \emptyset \), and consequently \(\overline{I} \subseteq 3T\).

  3. (iii)

    Otherwise, \(\overline{I} \subseteq 3S \cap 3T\).

Thus, by symmetry, we may assume that \(\Phi _r^{-1}({I}) = r^{-1}\overline{I} \subseteq (r^{-1}\cdot 3S) \cap [-\delta _0,\delta _0)^2\). The interval

$$\begin{aligned}{}[{r^{-1}}(\overline{\zeta _1} - 3A), {r^{-1}}(\overline{\zeta _1}+3A)] \cap [-\delta _0,\delta _0) \end{aligned}$$

is covered by two dyadic intervals \(I_1,I_2 \subseteq [-\delta _0,\delta _0)\) of length \(\delta \lesssim r^{-1}A \lesssim mK^{-1}\). Thus, if we set

$$\begin{aligned} \sigma _1&:= \Phi _r(I_1 \times [-\delta _0,0)) \cap U,\\ \sigma _2&:= \Phi _r(I_2 \times [-\delta _0,0)) \cap U,\\ \sigma _3&:= \Phi _r(I_1 \times [0,\delta _0)) \cap U,\\ \sigma _4&:= \Phi _r(I_2 \times [0,\delta _0)) \cap U, \end{aligned}$$

then \(I \subseteq \bigcup _{i=1}^4 \sigma _i\) and the proof is complete. \(\square \)

As mentioned above, estimating \(\Vert {\text {Br}}_\alpha \mathcal {E}_rf\Vert _{L^{13/4}(B \cap W)}\) can be reduced to estimating certain contributions from transverse and tangent wave packets. The next lemma carries out this reduction. First, some notation is needed. For \(\tau \in \mathcal {S}_r\) and \(B \in \mathcal {B}\), we set

$$\begin{aligned} f_{\tau ,B}^\flat := \sum _{T \in \mathbb {T}_B^\flat } f_{\tau ,T} \quad \quad \text {and}\quad \quad f_{\tau ,B}^\sharp := \sum _{T \in \mathbb {T}_B^\sharp } f_{\tau ,T}. \end{aligned}$$

We also let

$$\begin{aligned} f_B^\flat := \sum _{\tau \in \mathcal {S}_r} f_{\tau ,B}^\flat \quad \quad \text {and} \quad \quad f_B^\sharp := \sum _{\tau \in \mathcal {S}_r} f_{\tau ,B}^\sharp . \end{aligned}$$

Given \(\mathcal {I}\subseteq \mathcal {S}_r\), we set

$$\begin{aligned} f_{\mathcal {I},B}^\flat := \sum _{\tau \in \mathcal {I}}f_{\tau ,B}^\flat \quad \quad \text {and}\quad \quad f_{\mathcal {I},B}^\sharp := \sum _{\tau \in \mathcal {I}}f_{\tau ,B}^\sharp . \end{aligned}$$

We note that \(f_{\mathcal {I},B}^\sharp \) (analogously \(f_{\mathcal {I},B}^\flat \)) has the natural decomposition \(f_{\mathcal {I},B}^\sharp = \sum _{\tau \in \mathcal {S}_r} f_{\tau ,\mathcal {I},B}^\sharp \), where

$$\begin{aligned} f_{\tau ,\mathcal {I},B}^\sharp := {\left\{ \begin{array}{ll} f_{\tau ,B}^\sharp &{}\text {if }\tau \in \mathcal {I},\\ 0 &{}\text {if } \tau \notin \mathcal {I}. \end{array}\right. } \end{aligned}$$

Let \(\overline{f}_{\mathcal {I},B}^\sharp := \chi _U f_{\mathcal {I},B}^\sharp \) and \(\overline{f}_{\tau ,B}^\sharp := \chi _Uf_{\tau ,B}^\sharp \). Then \({\text {supp}}\overline{f}_{\tau ,B}^\sharp \subseteq 2m\tau \), and thus \(\overline{f}_{\mathcal {I},B}^\sharp \) has a well defined broad part. Finally, we define

$$\begin{aligned} {\text {Bil}}(\mathcal {E}_r f_B^\flat ) := \sum _{\tau _1,\tau _2 \text { separated}}|\mathcal {E}_r f_{\tau _1,B}^\flat |^{1/2}|\mathcal {E}_r f_{\tau _2,B}^\flat |^{1/2}. \end{aligned}$$

Lemma 4.9

If \((t,x) \in B \cap W\) and \(\alpha m\) is sufficiently small, then

$$\begin{aligned} |{\text {Br}}_\alpha \mathcal {E}_rf(t,x)|&\le \sum _{\mathcal {I}\subseteq \mathcal {S}_r} |{\text {Br}}_{10\alpha } \mathcal {E}_r \overline{f}_{\mathcal {I},B}^\sharp (t,x)| + K^{100}{\text {Bil}}(\mathcal {E}_r f_B^\flat )(t,x) \\&\quad + R^{-900}\sum _{\tau \in \mathcal {S}_r}\Vert f_\tau \Vert _2. \end{aligned}$$

Proof

We may assume that \((t,x)\) is \(\alpha \)-broad for \(\mathcal {E}_rf\) and that

$$\begin{aligned} |\mathcal {E}_r f(t,x)| \ge R^{-900}\sum _\tau \Vert f_\tau \Vert _2. \end{aligned}$$
(4.11)

Let

$$\begin{aligned} \mathcal {I}:= \{\tau \in \mathcal {S}_r : |\mathcal {E}_r f(t,x)| \le K^{100}|\mathcal {E}_r f_{\tau ,B}^\flat (t,x)|\}. \end{aligned}$$

If \(\mathcal {I}\) contains a pair of separated squares, then the bound \(|{\text {Br}}_\alpha \mathcal {E}_r f(t,x)| \le K^{100}{\text {Bil}}(\mathcal {E}_r f_B^\flat )(t,x)\) follows immediately. Thus, we may assume that \(\mathcal {I}\) contains no pair of separated squares. By Lemma 4.8, there exist \(\sigma _1,\sigma _2,\sigma _3,\sigma _4 \in \mathcal {T}_{\delta ,\delta _0,r} \cup \mathcal {T}_{\delta _0,\delta ,r}\), with \(K^{-1} \le \delta \lesssim mK^{-1}\), such that \(\tau \subseteq \bigcup _{i=1}^4\sigma _i\) for every \(\tau \in \mathcal {I}\). Let

$$\begin{aligned} \mathcal {J}:= \Big \{\tau \in \mathcal {S}_r : \tau \subseteq \bigcup _{i=1}^4 \sigma _i\Big \}. \end{aligned}$$

Then

$$\begin{aligned} |\mathcal {E}_r f(t,x)| \le \sum _{i=1}^4\bigg \vert \sum _{\tau : \tau \subseteq \sigma _i} \mathcal {E}_rf_\tau (t,x)\bigg \vert + \bigg \vert \sum _{\tau \in \mathcal {J}^c} \mathcal {E}_rf_\tau (t,x)\bigg \vert . \end{aligned}$$

Since \(\delta \lesssim mK^{-1}\), each \(\sigma _i\) is a union of at most Cm elements of \(\mathcal {R}_{\delta _0,r}\), where C is a constant. Thus, since \((t,x)\) is \(\alpha \)-broad for \(\mathcal {E}_rf\) and \(\alpha m\) is sufficiently small, we have

$$\begin{aligned} \sum _{i=1}^4\bigg \vert \sum _{\tau : \tau \subseteq \sigma _i} \mathcal {E}_rf_\tau (t,x)\bigg \vert \le 4Cm\alpha |\mathcal {E}_r f(t,x)| \le \frac{1}{10}|\mathcal {E}_r f(t,x)|, \end{aligned}$$

and consequently,

$$\begin{aligned} \frac{9}{10}|\mathcal {E}_r f(t,x)| \le \bigg \vert \sum _{\tau \in \mathcal {J}^c} \mathcal {E}_r f_\tau (t,x)\bigg \vert . \end{aligned}$$

Since \((t,x) \in B \cap W\), properties (iii) and (ii) in Proposition 4.2 imply that

$$\begin{aligned} \mathcal {E}_r f_\tau (t,x) = \mathcal {E}_r f_{\tau ,B}^\sharp (t,x) + \mathcal {E}_r f_{\tau ,B}^\flat (t,x) + O(R^{-990}\Vert f_\tau \Vert _2) \end{aligned}$$

for every \(\tau \in \mathcal {S}_r\). Summing over \(\tau \in \mathcal {J}^c\), we get

$$\begin{aligned} \bigg \vert \sum _{\tau \in \mathcal {J}^c}\mathcal {E}_r f_\tau (t,x)\bigg \vert \le |\mathcal {E}_r f_{\mathcal {J}^c,B}^\sharp (t,x)| + |\mathcal {E}_r f_{\mathcal {J}^c,B}^\flat (t,x)| + O\Big (R^{-990}\sum _\tau \Vert f_\tau \Vert _2\Big ). \end{aligned}$$

Since \(\mathcal {J}^c \subseteq \mathcal {I}^c\), we have

$$\begin{aligned} |\mathcal {E}_r f_{\mathcal {J}^c,B}^\flat (t,x)| \le \#\mathcal {J}^c K^{-100}|\mathcal {E}_r f(t,x)| \le K^{-97}|\mathcal {E}_rf(t,x)|. \end{aligned}$$

Hence,

$$\begin{aligned} \frac{9}{10}|\mathcal {E}_r f(t,x)| \le |\mathcal {E}_r f_{\mathcal {J}^c,B}^\sharp (t,x)| + K^{-97}|\mathcal {E}_r f(t,x)| + O\Big (R^{-990}\sum _\tau \Vert f_\tau \Vert _2\Big ). \end{aligned}$$

Using (4.11), we see that

$$\begin{aligned} |\mathcal {E}_r f(t,x)| \le \frac{5}{4}|\mathcal {E}_r f_{\mathcal {J}^c,B}^\sharp (t,x)| = \frac{5}{4}|\mathcal {E}_r \overline{f}_{\mathcal {J}^c,B}^\sharp (t,x)|, \end{aligned}$$

provided \(\varepsilon \) is sufficiently small and R sufficiently large. To finish the proof, we will show that \((t,x)\) is \(10\alpha \)-broad for \(\mathcal {E}_r \overline{f}_{\mathcal {J}^c,B}^\sharp \). It suffices to show that

$$\begin{aligned} \bigg \vert \sum _{\tau \in \mathcal {J}^c : \tau \subseteq \rho }\mathcal {E}_r \overline{f}_{\tau ,B}^\sharp (t,x)\bigg \vert \le 8\alpha |\mathcal {E}_rf(t,x)| \end{aligned}$$
(4.12)

for every \(\rho \in \mathcal {R}_r\). Fixing \(\rho \in \mathcal {R}_r\), we have

$$\begin{aligned} \bigg \vert \sum _{\tau \in \mathcal {J}^c : \tau \subseteq \rho } \mathcal {E}_r \overline{f}_{\tau ,B}^\sharp (t,x)\bigg \vert&= \bigg \vert \sum _{\tau \in \mathcal {J}^c : \tau \subseteq \rho } \mathcal {E}_r f_{\tau ,B}^\sharp (t,x)\bigg \vert \\&\le \bigg \vert \sum _{\tau \in \mathcal {J}^c : \tau \subseteq \rho } \mathcal {E}_r f_\tau (t,x)\bigg \vert + \bigg \vert \sum _{\tau \in \mathcal {J}^c : \tau \subseteq \rho } \mathcal {E}_r f_{\tau ,B}^\flat (t,x)\bigg \vert \\&\quad +O\Big (R^{-990}\sum _\tau \Vert f_\tau \Vert _2\Big ). \end{aligned}$$

As above, \(\mathcal {J}^c \subseteq \mathcal {I}^c\) implies that

$$\begin{aligned} \bigg \vert \sum _{\tau \in \mathcal {J}^c : \tau \subseteq \rho } \mathcal {E}_r f_{\tau ,B}^\flat (t,x)\bigg \vert \le K^{-97}|\mathcal {E}_rf(t,x)| \le \alpha |\mathcal {E}_r f(t,x)|. \end{aligned}$$

It is straightforward to check that \(\sigma _i \cap \rho \in \mathcal {R}_r\) for each \(i = 1,\ldots ,4\). Thus, since \((t,x)\) is \(\alpha \)-broad for \(\mathcal {E}_rf\), we have

$$\begin{aligned} \bigg \vert \sum _{\tau \in \mathcal {J}^c : \tau \subseteq \rho } \mathcal {E}_r f_\tau (t,x)\bigg \vert&\le \bigg \vert \sum _{\tau :\tau \subseteq \rho } \mathcal {E}_r f_\tau (t,x)\bigg \vert + \sum _{i=1}^4\bigg \vert \sum _{\tau : \tau \subseteq \sigma _i \cap \rho } \mathcal {E}_r f_\tau (t,x)\bigg \vert \\&\le 5\alpha |\mathcal {E}_rf(t,x)|. \end{aligned}$$

Using the preceding three estimates and (4.11), we arrive at (4.12). \(\square \)

If \(\alpha m \gtrsim 1\) so that Lemma 4.9 does not apply, then the estimate in Theorem 4.1 holds trivially, since the power of R can then be made at least 1000 by taking \(\varepsilon \) sufficiently small. Thus, we may assume that \(\alpha m \ll 1\). We now apply Lemma 4.9 to (4.9) to get

$$\begin{aligned} \nonumber \int _{B_R} |{\text {Br}}_\alpha \mathcal {E}_r f|^{13/4}&\lesssim \sum _{B \in \mathcal {B}}\sum _{\mathcal {I}\subseteq \mathcal {S}_r}\int _{B \cap W} |{\text {Br}}_{10\alpha }\mathcal {E}_r \overline{f}_{\mathcal {I},B}^\sharp |^{13/4} \\&\quad + \sum _{B \in \mathcal {B}}\int _{B \cap W} {\text {Bil}}(\mathcal {E}_rf_B^\flat )^{13/4}+ R^{-1000}\Big (\sum _{\tau \in \mathcal {S}_r}\Vert f_\tau \Vert _2\Big )^{13/4}; \end{aligned}$$
(4.13)

note that the implicit constant is allowed to depend on K, a function of \(\varepsilon \). If the last term dominates in (4.13), then the estimate in Theorem 4.1 holds trivially.

4.4.1 Transverse Subcase

Suppose that the first term dominates in (4.13), so that

$$\begin{aligned} \int _{B_R} |{\text {Br}}_\alpha \mathcal {E}_r f|^{13/4} \lesssim \sum _{B \in \mathcal {B}}\sum _{\mathcal {I}\subseteq \mathcal {S}_r}\int _B |{\text {Br}}_{10\alpha }\mathcal {E}_r \overline{f}_{\mathcal {I},B}^\sharp |^{13/4}. \end{aligned}$$
(4.14)

Each ball \(B \in \mathcal {B}\) has radius \(R^{1-\delta }\), so by induction on R, we can apply Theorem 4.1 to each summand in (4.14), whenever (4.1) holds. Since (4.1) holds for f, Lemma 4.3 gives

Thus, after multiplying by a constant, Theorem 4.1 implies that

$$\begin{aligned} \int _{B_R}|{\text {Br}}_\alpha \mathcal {E}_r f|^{13/4} \lesssim \sum _{B \in \mathcal {B}}\sum _{\mathcal {I}\subseteq \mathcal {S}_r}C_\varepsilon ' R^{(1-\delta )\varepsilon }R^{\varepsilon ^6\log (40 K^\varepsilon \alpha m^2)}\Big (\sum _{\tau \in \mathcal {S}_r}\Vert \overline{f}_{\tau ,B}^\sharp \Vert _2^2\Big )^{3/2+\varepsilon }. \end{aligned}$$

By Lemma 4.7, each \(T \in \mathbb {T}\) belongs to at most \(D^{O(1)}\) sets \(\mathbb {T}_B^\sharp \). Therefore, by Lemma 4.3, we have

$$\begin{aligned} \sum _{B \in \mathcal {B}}\Big (\sum _{\tau \in \mathcal {S}_r}\Vert \overline{f}_{\tau ,B}^\sharp \Vert _2^2\Big )^{3/2+\varepsilon } \le \Big (\sum _{\tau \in \mathcal {S}_r}\sum _{B \in \mathcal {B}}\Vert f_{\tau ,B}^\sharp \Vert _2^2\Big )^{3/2+\varepsilon } \lesssim D^{O(1)}\Big (\sum _{\tau \in \mathcal {S}_r}\Vert f_\tau \Vert _2^2\Big )^{3/2+\varepsilon }. \end{aligned}$$

Since \(\delta = \varepsilon ^2\), \(D = R^{\varepsilon ^4}\), and the number of subsets \(\mathcal {I}\subseteq \mathcal {S}_r\) depends only on K, we have altogether

$$\begin{aligned} \int _{B_R}|{\text {Br}}_\alpha \mathcal {E}_r f|^{13/4} \le CR^{-\varepsilon ^3+\varepsilon ^6\log (40)+O(\varepsilon ^4)}C_\varepsilon ' R^{\varepsilon +\varepsilon ^6\log (K^\varepsilon \alpha m^2)}\Big (\sum _{\tau \in \mathcal {S}_r}\Vert f_\tau \Vert _2^2\Big )^{3/2+\varepsilon } \end{aligned}$$

for some C (depending on \(\varepsilon \)). The power of the first R is negative for \(\varepsilon \) sufficiently small, and then the induction closes for R sufficiently large.
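For the reader's convenience, the exponent bookkeeping behind the last display can be spelled out. This is only a restatement of the relations \(\delta = \varepsilon ^2\) and \(D = R^{\varepsilon ^4}\) quoted above, with \(O(\cdot )\) constants left unspecified. We have

$$\begin{aligned} R^{(1-\delta )\varepsilon } = R^{\varepsilon }R^{-\varepsilon ^3}, \qquad D^{O(1)} = R^{O(\varepsilon ^4)}, \qquad R^{\varepsilon ^6\log (40K^\varepsilon \alpha m^2)} = R^{\varepsilon ^6\log (40)}R^{\varepsilon ^6\log (K^\varepsilon \alpha m^2)}, \end{aligned}$$

so the product of these three factors equals

$$\begin{aligned} R^{-\varepsilon ^3+\varepsilon ^6\log (40)+O(\varepsilon ^4)}\cdot R^{\varepsilon +\varepsilon ^6\log (K^\varepsilon \alpha m^2)}. \end{aligned}$$

For \(\varepsilon \) small, the term \(-\varepsilon ^3\) dominates \(\varepsilon ^6\log (40)+O(\varepsilon ^4)\), which is why the first factor is a negative power of R and can absorb the constant C once R is large.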

4.4.2 Tangent Subcase

In the remaining case, the second term in (4.13) dominates, whence

$$\begin{aligned} \int _{B_R} |{\text {Br}}_\alpha \mathcal {E}_r f|^{13/4} \lesssim \sum _{B \in \mathcal {B}}\int _{B \cap W} {\text {Bil}}(\mathcal {E}_r f_B^\flat )^{13/4}. \end{aligned}$$

We will bound the right-hand side directly (i.e. without induction) using standard bilinear restriction techniques and Lemma 4.7. Since \(\#\mathcal {B}= R^{O(\delta )} \le R^\varepsilon \), it will suffice to prove the following:

Proposition 4.10

For every \(B \in \mathcal {B}\), we have

$$\begin{aligned} \int _{B \cap W} {\text {Bil}}(\mathcal {E}_r f_B^\flat )^{13/4} \lesssim R^{O(\delta )}\Big (\sum _{\tau \in \mathcal {S}_{r}}\Vert f_\tau \Vert _2^2\Big )^{3/2}. \end{aligned}$$

We will need a preliminary lemma. Fix \(B \in \mathcal {B}\) and let \(\mathcal {Q}\) be a collection of cubes Q of side length \(R^{1/2}\) that cover \(B \cap W\) with bounded overlap. For each \(Q \in \mathcal {Q}\), let

$$\begin{aligned} \mathbb {T}_{B,Q}^\flat := \{T \in \mathbb {T}_B^\flat : T \cap Q \ne \emptyset \}. \end{aligned}$$

Henceforth, we will write ‘negligible’ in place of any quantity of size \(O(R^{-990}\sum _{\tau \in \mathcal {S}_r}\Vert f_\tau \Vert _2)\). In particular, if \((t,x) \in Q\), then

$$\begin{aligned} \mathcal {E}_r f_{\tau ,B}^\flat (t,x) = \sum _{T \in \mathbb {T}_{B,Q}^\flat }\mathcal {E}_r f_{\tau ,T}(t,x) + {\text {negligible}}. \end{aligned}$$
(4.15)

It will suffice to bound \(\Vert {\text {Bil}}(\mathcal {E}_rf_B^\flat )\Vert _{L^{13/4}(Q)}\) for each \(Q \in \mathcal {Q}\). Informally, the tubes in \(\mathbb {T}_{B,Q}^\flat \) are tangent to W at Q and are thus coplanar. Dually, the wave packets \(\{\mathcal {E}_rf_{\tau ,T}\}_{T \in \mathbb {T}_{B,Q}^\flat }\) have Fourier support near a curve formed by the intersection of \(\Sigma _r\) and a plane. Thus, estimating \(\Vert {\text {Bil}}(\mathcal {E}_rf_B^\flat )\Vert _{L^{13/4}(Q)}\) is essentially a two-dimensional bilinear restriction problem, making the \(L^4\) argument a natural approach (as in [5]).

Lemma 4.11

For all \(Q \in \mathcal {Q}\) and separated \(\tau _1, \tau _2 \in \mathcal {S}_r\), we have

$$\begin{aligned} \int _Q |\mathcal {E}_r f_{\tau _1,B}^\flat |^2|\mathcal {E}_r f_{\tau _2,B}^\flat |^2&\lesssim R^{O(\delta )-1/2}\bigg (\sum _{T_1 \in \mathbb {T}_{B,Q}^\flat }\Vert f_{\tau _1,T_1}\Vert _2^2\bigg )\bigg (\sum _{T_2 \in \mathbb {T}_{B,Q}^\flat }\Vert f_{\tau _2,T_2}\Vert _2^2\bigg )\\&\quad + {\text {negligible}}. \end{aligned}$$

Proof

Let \(\psi _Q\) be a smooth function satisfying \(\chi _Q \le \psi _Q \le \chi _{2Q}\) and

$$\begin{aligned} |\hat{\psi }_Q(\tau ,\xi )| \lesssim R^{3/2}(1+|(\tau ,\xi )|R^{1/2})^{-10^6/\delta }. \end{aligned}$$

By (4.15) and Plancherel’s theorem, we have

$$\begin{aligned}&\nonumber \int _Q|\mathcal {E}_r f_{\tau _1,B}^\flat |^2|\mathcal {E}_rf_{\tau _2,B}^\flat |^2\nonumber \\&\quad = \sum _{T_1,\overline{T}_1,T_2,\overline{T}_2 \in \mathbb {T}_{B,Q}^\flat } \int _Q \mathcal {E}_rf_{\tau _1,T_1}\overline{\mathcal {E}_rf_{\tau _1,\overline{T}_1}}\mathcal {E}_r f_{\tau _2,T_2} \overline{\mathcal {E}_rf_{\tau _2,\overline{T}_2}} + {\text {negligible}}\nonumber \\&\quad \le \sum _{T_1,\overline{T}_1,T_2,\overline{T}_2 \in \mathbb {T}_{B,Q}^\flat } \int _{\mathbb {R}^3}(\hat{\psi }_Q *d\sigma _{\tau _1,T_1} *d\sigma _{\tau _2,T_2})(\overline{d\sigma _{\tau _1,\overline{T}_1}*d\sigma _{\tau _2,\overline{T}_2}}) + {\text {negligible}}, \end{aligned}$$
(4.16)

where \(d\sigma _{\tau ,T}\) is the measure on \(\Sigma _r\) defined by

$$\begin{aligned} \int _{\Sigma _r} g d\sigma _{\tau ,T} := \int _U g(\phi _r(\xi ),\xi )f_{\tau ,T}(\xi )d\xi . \end{aligned}$$
(4.17)

Fix \(T_1,\overline{T}_1,T_2,\overline{T}_2 \in \mathbb {T}_{B,Q}^\flat \) and let \(\xi ,\overline{\xi },\zeta ,\overline{\zeta }\) denote the centers of \(\theta (T_1),\theta (\overline{T}_1),\theta (T_2),\theta (\overline{T}_2)\), respectively. The rapid decay of \(\hat{\psi }_Q\) and the fact that \({\text {supp}}\chi _Uf_{\tau ,T} \subseteq \frac{3}{2}m\tau \) for every \(\tau \in \mathcal {S}_r\) and \(T \in \mathbb {T}\) imply that the contribution of \(T_1,\overline{T}_1,T_2,\overline{T}_2\) to (4.16) is negligible unless

$$\begin{aligned} \xi +\zeta&= \overline{\xi } + \overline{\zeta } + O(R^{\delta -1/2}),\\ \phi _r(\xi ) + \phi _r(\zeta )&= \phi _r(\overline{\xi }) + \phi _r(\overline{\zeta }) + O(R^{\delta -1/2}), \end{aligned}$$

and \(\xi ,\overline{\xi } \in 2m\tau _1\) and \(\zeta ,\overline{\zeta } \in 2m\tau _2\). We need to estimate the number of non-negligible terms in (4.16) involving given tubes \(T_1,T_2\).

Toward that end, we adapt some techniques of Cho–Lee [3] and Lee [7]. If \(T_1,\overline{T}_1,T_2,\overline{T}_2\) contribute non-negligibly, then

$$\begin{aligned} \phi _r(\xi ) + \phi _r(\zeta ) = \phi _r(\overline{\xi }) + \phi _r(\xi +\zeta -\overline{\xi }) + O(R^{\delta -1/2}). \end{aligned}$$
(4.18)

We define a function \(\Psi : U \rightarrow \mathbb {R}\) by

$$\begin{aligned} \Psi (\eta ) := \phi _r(\eta ) + \phi _r(\xi +\zeta -\eta ) - \phi _r(\xi )-\phi _r(\zeta ) \end{aligned}$$

and denote by \(Z := \Psi ^{-1}(0)\) its zero set. We claim that \(|\nabla \Psi | \gtrsim 1\) on \(2m\tau _1\). Indeed, if \(\eta \in 2m\tau _1\), then by the Cauchy–Schwarz inequality, boundedness of \(\Vert (\nabla ^2 \phi )^{-1}\Vert \) on U, part (b) of Lemma 2.2, and finally the separation of \(\tau _1\) and \(\tau _2\), we have

$$\begin{aligned} |\nabla \phi _r(\eta ) - \nabla \phi _r(\zeta )|&= r^{-1}|\nabla \phi (r\eta )-\nabla \phi (r\zeta )|\gtrsim r^{-1}|\langle (\nabla ^2\phi (r\eta ))^{-1}(\nabla \phi (r\eta )\\&\quad -\nabla \phi (r\zeta )),\nabla \phi (r\eta )-\nabla \phi (r\zeta )\rangle |^{1/2}\\&\sim r^{-1}\mathrm{dist}_{\mathrm{L}}(r\eta ,r\zeta )\\&\ge C_0mK^{-1}, \end{aligned}$$

whence

$$\begin{aligned} |\nabla \Psi (\eta )|&= |\nabla \phi _r(\eta ) - \nabla \phi _r(\xi +\zeta -\eta )| \ge |\nabla \phi _r(\eta )-\nabla \phi _r(\zeta )|\\&\quad - \Vert \phi \Vert _{C^2(U)}{\text {diam}}(2m\tau _1) \gtrsim 1 \end{aligned}$$

if \(C_0\) is sufficiently large. By the claim, Z is a smooth curve near \(\xi \), and (4.18) and a Taylor approximation argument imply that

$$\begin{aligned} {\text {dist}}(\overline{\xi },Z) \lesssim R^{\delta -1/2} \end{aligned}$$
(4.19)

for R sufficiently large. As mentioned above, tubes in \(\mathbb {T}_{B,Q}^\flat \) are nearly coplanar. Inspecting the definition, it is straightforward to check that \(\angle (v(T),T_z Z(P)) \le R^{2\delta -1/2}\) for all \(T \in \mathbb {T}_{B,Q}^\flat \) and some (nonsingular) \(z \in 2R^\delta Q \cap Z(P)\). Thus, dually, there exists a plane \(\Pi \) through the origin such that \({\text {dist}}((-1,\nabla \phi _r(\eta )),\Pi ) \lesssim R^{2\delta -1/2}\) for each \(\eta \in \{\xi ,\overline{\xi },\zeta \}\). Consequently, there exists a line whose \(O(R^{2\delta -1/2})\)-neighborhood contains \(\nabla \phi _r(\xi )\), \(\nabla \phi _r(\overline{\xi })\), and \(\nabla \phi _r(\zeta )\). Since \(|\nabla \phi _r(\xi )-\nabla \phi _r(\zeta )| \gtrsim 1\) due to the separation of \(\tau _1\) and \(\tau _2\), it follows that \(\nabla \phi _r(\overline{\xi })\) lies in an \(O(R^{2\delta -1/2})\)-neighborhood of the line \(\ell \) containing \(\nabla \phi _r(\xi )\) and \(\nabla \phi _r(\zeta )\). We consider now the smooth curve \(\tilde{\ell } := (\nabla \phi _r)^{-1}(\ell \cap 3U)\), noting that \(\nabla \phi \) (and thus \(\nabla \phi _r\)) is invertible near the origin since \(\det \nabla ^2\phi (0) \ne 0\). This curve contains \(\xi \) by construction, and the boundedness of \(\Vert (\nabla ^2\phi )^{-1}\Vert \) implies that

$$\begin{aligned} {\text {dist}}(\overline{\xi },\tilde{\ell }) \lesssim R^{2\delta -1/2}. \end{aligned}$$
(4.20)

Crucially, \(\tilde{\ell }\) and Z intersect transversely at \(\xi \). Indeed, parametrizing \(\tilde{\ell }\) by

$$\begin{aligned} \tilde{\ell }(t) := (\nabla \phi _r)^{-1}((1-t)\nabla \phi _r(\xi ) + t\nabla \phi _r(\zeta )), \end{aligned}$$

the tangent line to \(\tilde{\ell }\) at \(\xi \) is parallel to

$$\begin{aligned} \frac{d}{dt}\tilde{\ell }(t)\bigg \vert _{t=0} = (\nabla ^2\phi _r(\xi ))^{-1}(\nabla \phi _r(\zeta )-\nabla \phi _r(\xi )), \end{aligned}$$

and the normal line to Z at \(\xi \) is parallel to \(\nabla \Psi (\xi ) = \nabla \phi _r(\xi ) - \nabla \phi _r(\zeta )\). Thus, the bound

$$\begin{aligned} |\langle (\nabla ^2\phi _r(\xi ))^{-1}(\nabla \phi _r(\xi )-\nabla \phi _r(\zeta )), \nabla \phi _r(\xi )-\nabla \phi _r(\zeta )\rangle | \gtrsim 1, \end{aligned}$$

which follows from part (b) of Lemma 2.2 and the separation of \(\tau _1\) and \(\tau _2\), implies the claimed transverse intersection. Consequently, by (4.19) and (4.20), we have \(|\xi - \overline{\xi }| \lesssim R^{2\delta -1/2}\). A similar argument shows that \(|\zeta - \overline{\zeta }| \lesssim R^{2\delta -1/2}\). Since \(\#(\mathbb {T}_{B,Q}^\flat \cap \mathbb {T}(\theta )) \lesssim 1\) for every \(\theta \in \Theta \), it follows that for each \(T_1,T_2 \in \mathbb {T}_{B,Q}^\flat \), there are \(O(R^{8\delta })\) pairs \(\overline{T}_1,\overline{T}_2 \in \mathbb {T}_{B,Q}^\flat \) such that \(T_1,\overline{T}_1,T_2,\overline{T}_2\) contribute non-negligibly to (4.16).
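The count \(O(R^{8\delta })\) can be made explicit as follows. This is a sketch under the assumption, not restated in this section, that the discs \(\theta \in \Theta \) have radius comparable to \(R^{-1/2}\) and bounded overlap. Under that assumption, the number of \(\theta \in \Theta \) whose centers lie within \(O(R^{2\delta -1/2})\) of \(\xi \) is at most

$$\begin{aligned} O\bigg (\Big (\frac{R^{2\delta -1/2}}{R^{-1/2}}\Big )^2\bigg ) = O(R^{4\delta }), \end{aligned}$$

and the same bound holds for discs with centers near \(\zeta \). Each such disc carries \(O(1)\) tubes of \(\mathbb {T}_{B,Q}^\flat \), so the number of admissible pairs \(\overline{T}_1,\overline{T}_2\) is \(O(R^{4\delta })\cdot O(R^{4\delta }) = O(R^{8\delta })\).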

Hence, by the Cauchy–Schwarz inequality (a few times) and Young’s inequality, (4.16) is at most

$$\begin{aligned} R^{O(\delta )}\sum _{T_1,T_2 \in \mathbb {T}_{B,Q}^\flat } \int _{\mathbb {R}^3}|d\sigma _{\tau _1,T_1}*d\sigma _{\tau _2,T_2}|^2 + {\text {negligible}}. \end{aligned}$$
(4.21)

To estimate the convolution, we use Plancherel’s theorem and the familiar wave packet approximation

$$\begin{aligned} |\mathcal {E}_r g_T| \approx R^{-1/2}\Vert g_T\Vert _2\chi _T; \end{aligned}$$
(4.22)

we will give a rigorous argument in Lemma 4.12, appearing at the end of the article. If \(T_1,T_2 \in \mathbb {T}\) are such that \(3\theta (T_i) \cap \tau _i \ne \emptyset \), then the separation of \(\tau _1\) and \(\tau _2\) implies that the directions \(v(T_1)\) and \(v(T_2)\) are transverse and consequently that \(|T_1 \cap T_2| \lesssim R^{3\delta + 3/2}\). Hence, by Plancherel’s theorem and (4.22), we (essentially) have

$$\begin{aligned} \int _{\mathbb {R}^3}|d\sigma _{\tau _1,T_1} *d\sigma _{\tau _2,T_2}|^2 = \int _{\mathbb {R}^3}|\mathcal {E}_r f_{\tau _1,T_1}\mathcal {E}_rf_{\tau _2,T_2}|^2 \lesssim R^{3\delta -1/2}\Vert f_{\tau _1,T_1}\Vert _2^2\Vert f_{\tau _2,T_2}\Vert _2^2. \end{aligned}$$

Plugging this estimate into (4.21), we obtain the lemma. \(\square \)
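As a heuristic check of the exponent \(3\delta -1/2\) used above (a sketch only, not a substitute for Lemma 4.12), one may treat (4.22) as an identity and assume that the tubes in \(\mathbb {T}\) have radius \(R^{1/2+\delta }\); the latter is an assumption about the wave packet decomposition that is not restated in this section, though it is consistent with the volume bound \(|T_1 \cap T_2| \lesssim R^{3\delta +3/2}\). Then

$$\begin{aligned} \int _{\mathbb {R}^3}|\mathcal {E}_r f_{\tau _1,T_1}\mathcal {E}_rf_{\tau _2,T_2}|^2&\approx R^{-2}\Vert f_{\tau _1,T_1}\Vert _2^2\Vert f_{\tau _2,T_2}\Vert _2^2\,|T_1 \cap T_2| \\&\lesssim R^{-2}\cdot R^{3\delta +3/2}\Vert f_{\tau _1,T_1}\Vert _2^2\Vert f_{\tau _2,T_2}\Vert _2^2 = R^{3\delta -1/2}\Vert f_{\tau _1,T_1}\Vert _2^2\Vert f_{\tau _2,T_2}\Vert _2^2, \end{aligned}$$

where the volume bound reflects the fact that two transverse tubes of radius \(R^{1/2+\delta }\) meet in a set of diameter \(O(R^{1/2+\delta })\).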

Given Lemma 4.11, the rest of the proof of Proposition 4.10 is identical to the corresponding part of [5]. For the convenience of the reader, we repeat the details here. We set

$$\begin{aligned} S_{\tau ,B}^\flat := \bigg (\sum _{T \in \mathbb {T}_B^\flat } (R^{-1/2}\Vert f_{\tau ,T}\Vert _2\chi _{2T})^2\bigg )^{1/2} \end{aligned}$$

(cf. (4.22)). Let \(\tau _1, \tau _2 \in \mathcal {S}_r\) be separated squares. Lemma 4.11 implies that

$$\begin{aligned} \int _Q |\mathcal {E}_r f_{\tau _1,B}^\flat |^2|\mathcal {E}_r f_{\tau _2,B}^\flat |^2 \lesssim R^{O(\delta )}\int _Q (S_{\tau _1,B}^\flat )^2(S_{\tau _2,B}^\flat )^2 + {\text {negligible}}. \end{aligned}$$

Summing over \(Q \in \mathcal {Q}\) and exploiting the separation of \(\tau _1\) and \(\tau _2\) (as above) leads to the bound

$$\begin{aligned} \int _{B \cap W} |\mathcal {E}_r f_{\tau _1,B}^\flat |^2|\mathcal {E}_r f_{\tau _2,B}^\flat |^2&\lesssim R^{O(\delta )-1/2}\bigg (\sum _{T_1 \in \mathbb {T}_B^\flat }\Vert f_{\tau _1,T_1}\Vert _2^2\bigg )\bigg (\sum _{T_2 \in \mathbb {T}_B^\flat }\Vert f_{\tau _2,T_2}\Vert _2^2\bigg ) \\&\quad + {\text {negligible}}. \end{aligned}$$

By properties (i) and (iv) of Proposition 4.2, the functions \(f_{\tau ,T}\) are nearly orthogonal and we have

$$\begin{aligned} \sum _{T \in \mathbb {T}_B^\flat }\Vert f_{\tau ,T}\Vert _2^2 \lesssim \Vert f_{\tau ,B}^\flat \Vert _2^2+{\text {negligible}}\end{aligned}$$

for every \(\tau \). Thus, altogether,

$$\begin{aligned} \int _{B \cap W} |\mathcal {E}_r f_{\tau _1,B}^\flat |^2|\mathcal {E}_r f_{\tau _2,B}^\flat |^2 \lesssim R^{O(\delta )-1/2}\Vert f_{\tau _1,B}^\flat \Vert _2^2\Vert f_{\tau _2,B}^\flat \Vert _2^2 + {\text {negligible}}, \end{aligned}$$

and consequently by Hölder’s inequality,

$$\begin{aligned} \Vert {\text {Bil}}(\mathcal {E}_r f_B^\flat )\Vert _{L^4(B \cap W)} \lesssim R^{O(\delta )-1/8}\bigg (\sum _{\tau \in \mathcal {S}_r}\Vert f_{\tau ,B}^\flat \Vert _2^2\bigg )^{1/2} + {\text {negligible}}. \end{aligned}$$

The well-known estimate

$$\begin{aligned} \Vert \mathcal {E}_rg\Vert _{L^2(B_R)} \lesssim R^{1/2}\Vert g\Vert _2 \end{aligned}$$

(which is a consequence of Plancherel’s theorem for the spatial Fourier transform), together with Hölder’s inequality, implies that

$$\begin{aligned} \Vert {\text {Bil}}(\mathcal {E}_rf_B^\flat )\Vert _{L^2(B \cap W)} \lesssim R^{1/2}\bigg (\sum _{\tau \in \mathcal {S}_r}\Vert f_{\tau ,B}^\flat \Vert _2^2\bigg )^{1/2}. \end{aligned}$$

Hence, by interpolation,

$$\begin{aligned} \int _{B \cap W}{\text {Bil}}(\mathcal {E}_r f_B^\flat )^p \lesssim R^{O(\delta ) + \frac{5}{2}-\frac{3p}{4}}\bigg (\sum _{\tau \in \mathcal {S}_r}\Vert f_{\tau ,B}^\flat \Vert _2^2\bigg )^{p/2} + {\text {negligible}}\end{aligned}$$
(4.23)

for \(p \in [2,4]\). Now, on one hand, \(\Vert f_{\tau ,B}^\flat \Vert _2 \lesssim \Vert f_\tau \Vert _2\) by Lemma 4.3. On the other hand, Lemma 4.7 gives a different bound: There are at most \(R^{O(\delta )+1/2}\) discs \(\theta \in \Theta \) such that \(\mathbb {T}_B^\flat \cap \mathbb {T}(\theta ) \ne \emptyset \). By property (i) of Proposition 4.2, each \(f_{\tau ,B}^\flat \) is therefore supported in \(R^{O(\delta )+1/2}\) discs \(\theta \), on each of which we have the bound

$$\begin{aligned} \int _\theta |f_{\tau ,B}^\flat |^2 \lesssim \int _{10\theta }|f_\tau |^2 \lesssim R^{-1}, \end{aligned}$$

by Lemma 4.3 and (4.1). Thus, \(\Vert f_{\tau ,B}^\flat \Vert _2 \lesssim R^{O(\delta )-1/4}\). Combining these two estimates gives \(\Vert f_{\tau ,B}^\flat \Vert _2 \lesssim \Vert f_\tau \Vert _2^{{3/p}}R^{O(\delta )-\frac{1}{4}(1-\frac{3}{p})}\) for \(p \ge 3\). Plugging this bound into (4.23) yields

$$\begin{aligned} \int _{B \cap W}{\text {Bil}}(\mathcal {E}_rf_B^\flat )^p \lesssim R^{O(\delta ) + \frac{13}{4}-p}\bigg (\sum _{\tau \in \mathcal {S}_r}\Vert f_\tau \Vert _2^2\bigg )^{3/2}, \end{aligned}$$

and then taking \(p = 13/4\) completes the proof of Proposition 4.10.
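The exponent arithmetic in the last few steps can be verified directly; the following is a sketch in which negligible terms are omitted and implicit constants are allowed to depend on \(\#\mathcal {S}_r\). For \(p \in [2,4]\), set \(\vartheta := 2-4/p \in [0,1]\) (a symbol introduced only for this check), so that \(\frac{1}{p} = \frac{1-\vartheta }{2}+\frac{\vartheta }{4}\). Interpolating the \(L^2\) and \(L^4\) bounds gives the power

$$\begin{aligned} \frac{1-\vartheta }{2}-\frac{\vartheta }{8} = \frac{1}{2}-\frac{5\vartheta }{8} = \frac{5}{2p}-\frac{3}{4}, \qquad p\Big (\frac{5}{2p}-\frac{3}{4}\Big ) = \frac{5}{2}-\frac{3p}{4}, \end{aligned}$$

which is the exponent in (4.23). Next, for \(p \ge 3\), writing \(\Vert f_{\tau ,B}^\flat \Vert _2 = \Vert f_{\tau ,B}^\flat \Vert _2^{3/p}\Vert f_{\tau ,B}^\flat \Vert _2^{1-3/p}\) and applying the first bound to the first factor and the second bound to the second factor gives \(\Vert f_{\tau ,B}^\flat \Vert _2 \lesssim \Vert f_\tau \Vert _2^{3/p}R^{O(\delta )-\frac{1}{4}(1-\frac{3}{p})}\). Hence

$$\begin{aligned} \bigg (\sum _{\tau \in \mathcal {S}_r}\Vert f_{\tau ,B}^\flat \Vert _2^2\bigg )^{p/2} \lesssim R^{O(\delta )-\frac{p-3}{4}}\bigg (\sum _{\tau \in \mathcal {S}_r}\Vert f_\tau \Vert _2^{6/p}\bigg )^{p/2} \lesssim R^{O(\delta )-\frac{p-3}{4}}\bigg (\sum _{\tau \in \mathcal {S}_r}\Vert f_\tau \Vert _2^2\bigg )^{3/2}, \end{aligned}$$

where the second inequality is Hölder's inequality over the finite set \(\mathcal {S}_r\) (using \(6/p \le 2\)). Finally,

$$\begin{aligned} \frac{5}{2}-\frac{3p}{4}-\frac{p-3}{4} = \frac{13}{4}-p, \end{aligned}$$

which vanishes exactly at \(p = 13/4\).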

To conclude the article, we rigorously prove the convolution estimate used in the proof of Lemma 4.11. This standard argument is sketched in [5]; we fill in the details here.

Lemma 4.12

If \(\tau _1,\tau _2 \in \mathcal {S}_r\) are separated squares and \(T_1,T_2 \in \mathbb {T}\) are such that \(3\theta (T_i) \cap \tau _i \ne \emptyset \), then

$$\begin{aligned} \int _{\mathbb {R}^3}|d\sigma _{\tau _1,T_1} *d\sigma _{\tau _2,T_2}|^2 \lesssim R^{-1/2}\Vert f_{\tau _1,T_1}\Vert _2^2\Vert f_{\tau _2,T_2}\Vert _2^2, \end{aligned}$$

where \(d\sigma _{\tau _i,T_i}\) is given by (4.17).

Proof

Let \(\theta _i := \theta (T_i)\) and \(c_i := c_{\theta _i}\). Since \(3\theta _i \cap \tau _i \ne \emptyset \), we have \(c_i \in 2m\tau _i\), and consequently, \(|\nabla \phi _r(c_1) - \nabla \phi _r(c_2)| \gtrsim 1\) by the separation of \(\tau _1\) and \(\tau _2\). Indeed, by the Cauchy–Schwarz inequality, boundedness of \(\Vert (\nabla ^2\phi )^{-1}\Vert \) on U, and part (b) of Lemma 2.2,

$$\begin{aligned} |\nabla \phi _r(c_1) - \nabla \phi _r(c_2)|&= r^{-1}|\nabla \phi (rc_1)-\nabla \phi (rc_2)|\\&\gtrsim r^{-1}|\langle (\nabla ^2\phi (rc_1))^{-1}(\nabla \phi (rc_1)-\nabla \phi (rc_2)), \nabla \phi (rc_1)\\&\quad -\nabla \phi (rc_2)\rangle |^{1/2}\sim r^{-1}\mathrm{dist}_{\mathrm{L}}(rc_1,rc_2)\\&\gtrsim 1. \end{aligned}$$

It follows (from the law of sines, say) that the unit normal vectors \(n_1 := v_{\theta _1}\) and \(n_2 := v_{\theta _2}\) satisfy \(\angle (n_1,n_2) \gtrsim 1\). Using this angle bound, we will foliate \(3\theta _1\) by lines whose lifts to \(\Sigma _r\) are transverse to the tangent plane \(T_{(\phi _r(c_2),c_2)}\Sigma _r\) above \(c_2\). Define the direction set

$$\begin{aligned} V := \{\omega \in \mathbb {S}^2 : \omega \cdot n_1 = 0~\text {and}~|\omega \cdot n_2| \ge c\}, \end{aligned}$$

where \(c > 0\). If c is sufficiently small relative to \(\angle (n_1,n_2)\), then V is nonempty. Choose \(\omega \in V\), let \(\overline{\omega } := (\omega _2,\omega _3)\), and let S be the rotation of \(\mathbb {R}^2\) satisfying \(S(0,1) = \overline{\omega }/|\overline{\omega }|\) (note that \(\overline{\omega } \ne 0\)). Define the lines \(\overline{\gamma }_s\) by

$$\begin{aligned} \overline{\gamma }_s(t) := S(s,t) + c_1, \end{aligned}$$

and note that \({\text {supp}}d\sigma _{\tau _1,T_1} \subseteq 3\theta _1 \subseteq \{\overline{\gamma }_s(t) : (s,t) \in I^2\}\), where \(I := [-3R^{-1/2},3R^{-1/2}]\). The lift of \(\overline{\gamma }_s\) to \(\Sigma _r\) is given by

$$\begin{aligned} \gamma _s(t) := (\phi _r(\overline{\gamma }_s(t)), \overline{\gamma }_s(t)) \end{aligned}$$

for \(s,t\) small. For almost every s, the function \(t \mapsto f_{\tau _1,T_1}(\overline{\gamma }_s(t))\) is measurable and

$$\begin{aligned} \int _{\gamma _s}g d\nu _s := \int _I g(\gamma _s(t))f_{\tau _1,T_1}(\overline{\gamma }_s(t))dt \end{aligned}$$

defines a measure \(d\nu _s\) on \(\gamma _s\). Using (4.17), an easy calculation shows that \(d\sigma _{\tau _1,T_1} = d\nu _s \chi _Ids\).
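The calculation may be sketched as follows, assuming only the support property \({\text {supp}}d\sigma _{\tau _1,T_1} \subseteq \{\overline{\gamma }_s(t) : (s,t) \in I^2\}\) noted above. The map \((s,t) \mapsto \overline{\gamma }_s(t) = S(s,t)+c_1\) is a rotation followed by a translation, so its Jacobian determinant is 1. Changing variables \(\xi = \overline{\gamma }_s(t)\) in (4.17), we get

$$\begin{aligned} \int _{\Sigma _r} g d\sigma _{\tau _1,T_1}&= \int _U g(\phi _r(\xi ),\xi )f_{\tau _1,T_1}(\xi )d\xi \\&= \int _I\int _I g(\gamma _s(t))f_{\tau _1,T_1}(\overline{\gamma }_s(t))dt ds = \int _I\bigg (\int _{\gamma _s}g d\nu _s\bigg )ds \end{aligned}$$

for every continuous g, which is the claimed disintegration.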

Now, to prove the required convolution estimate, it suffices to show that

$$\begin{aligned} |\langle d\sigma _{\tau _1,T_1} *d\sigma _{\tau _2,T_2}, \psi \rangle | \lesssim R^{-1/4}\Vert f_{\tau _1,T_1}\Vert _2\Vert f_{\tau _2,T_2}\Vert _2\Vert \psi \Vert _2 \end{aligned}$$

for all \(\psi \in C_c^\infty (\mathbb {R}^3)\); the brackets denote the pairing between distributions and test functions. We compute that

$$\begin{aligned}&|\langle d\sigma _{\tau _1,T_1} *d\sigma _{\tau _2,T_2}, \psi \rangle | \\&\quad =\bigg \vert \int _{\Sigma _r}\int _{\Sigma _r}\psi (\sigma +\tau , \zeta +\xi ) d\sigma _{\tau _2,T_2}(\sigma ,\zeta ) d\sigma _{\tau _1,T_1}(\tau ,\xi )\bigg \vert \\&\quad = \bigg \vert \int _I\int _{\gamma _s}\int _{\Sigma _r}\psi (\sigma +\tau ,\zeta +\xi )d \sigma _{\tau _2,T_2}(\sigma ,\zeta )d\nu _s(\tau ,\xi ) ds\bigg \vert \\&\quad \lesssim R^{-1/4}\bigg (\int _I\bigg \vert \int _{\gamma _s}\int _{\Sigma _r}\psi (\sigma +\tau ,\zeta +\xi )d\sigma _{\tau _2,T_2}(\sigma ,\zeta )d\nu _s(\tau ,\xi )\bigg \vert ^2 ds\bigg )^{1/2}. \end{aligned}$$

Using the definitions of \(d\sigma _{\tau _2,T_2}\) and \(d\nu _s\) and the Cauchy–Schwarz inequality, the quantity between absolute value signs is at most

$$\begin{aligned} \Vert f_{\tau _2,T_2}\Vert _2\bigg (\int _I\int _{3\theta _2}|\psi ((\phi _r(\zeta ),\zeta )+\gamma _s(t))|^2d\zeta dt\bigg )^{1/2}\bigg (\int _I |f_{\tau _1,T_1}(\overline{\gamma }_s(t))|^2dt\bigg )^{1/2}. \end{aligned}$$

Thus, if we can show that

$$\begin{aligned} \int _I\int _{3\theta _2}|\psi ((\phi _r(\zeta ),\zeta )+\gamma _s(t))|^2d\zeta dt \lesssim \Vert \psi \Vert _2^2, \end{aligned}$$

then a simple change of variable, using the definition of \(\overline{\gamma }_s\), gives the required estimate.
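In more detail, granting the bound just displayed, the chain of estimates reads as follows (a sketch; the change of variable is \(\xi = \overline{\gamma }_s(t) = S(s,t)+c_1\), whose Jacobian determinant is 1):

$$\begin{aligned} |\langle d\sigma _{\tau _1,T_1} *d\sigma _{\tau _2,T_2}, \psi \rangle |&\lesssim R^{-1/4}\Vert f_{\tau _2,T_2}\Vert _2\Vert \psi \Vert _2\bigg (\int _I\int _I |f_{\tau _1,T_1}(\overline{\gamma }_s(t))|^2dt ds\bigg )^{1/2} \\&\le R^{-1/4}\Vert f_{\tau _1,T_1}\Vert _2\Vert f_{\tau _2,T_2}\Vert _2\Vert \psi \Vert _2. \end{aligned}$$

Thus everything reduces to the bound on \(\psi \) displayed above.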

Toward that end, let \(G(\zeta ,t) := (\phi _r(\zeta ),\zeta ) + \gamma _s(t)\). We claim that G is invertible on \(3\theta _2 \times I\), provided R is sufficiently large. The definition of S implies that \(\overline{\gamma }_s'(t) = \overline{\omega }/|\overline{\omega }|\) for every \(s,t\). Thus, the Jacobian of G at \((c_2,0)\) is given by

$$\begin{aligned} \nabla G(c_2,0) = \left( \begin{array}{ccc} \partial _1\phi _r(c_2) &{} \partial _2\phi _r(c_2) &{} \nabla \phi _r(\overline{\gamma }_s(0))\cdot \overline{\omega }/|\overline{\omega }|\\ 1 &{} 0 &{}\omega _2/|\overline{\omega }| \\ 0 &{} 1 &{} \omega _3/|\overline{\omega }|\end{array}\right) . \end{aligned}$$

The first two columns of this matrix are orthogonal to \(n_2\). If we replace \(\overline{\gamma }_s(0)\) by \(c_1\), then the third column becomes \(\omega /|\overline{\omega }|\), since \(\omega \cdot n_1 = 0\). The angle between \(\omega \) and the orthogonal complement of \(n_2\) is bounded below, since \(|\omega \cdot n_2| \ge c\). Combining these observations, we see that

$$\begin{aligned} |\det \nabla G(c_2,0)| = \frac{1}{|\overline{\omega }|}\left| \det \left( \begin{array}{ccc} \partial _1\phi _r(c_2) &{} \partial _2\phi _r(c_2) &{} \omega _1\\ 1 &{} 0 &{}\omega _2 \\ 0 &{} 1 &{} \omega _3\end{array}\right) \right| + O(R^{-1/2}) \gtrsim 1. \end{aligned}$$

Thus, the inverse function theorem implies that G is invertible on \(3\theta _2 \times I\), if R is sufficiently large. (The meaning of ‘sufficiently large’ does not depend on r or s, since the bounds \(\Vert \nabla G(c_2,0)\Vert \sim 1\) and \(\Vert (\nabla G(c_2,0))^{-1}\Vert \sim 1\) hold uniformly in these parameters.) Additionally, the bound \(|\det \nabla G(\zeta ,t)| \gtrsim 1\) holds on \(3\theta _2 \times I\), so we obtain

$$\begin{aligned}&\int _I\int _{3\theta _2}|\psi ((\phi _r(\zeta ),\zeta )+\gamma _s(t))|^2d\zeta dt \\&\quad = \iint _{G(3\theta _2\times I)}|\psi (\eta )|^2|\det \nabla G^{-1}(\eta )|d\eta \lesssim \Vert \psi \Vert _2^2, \end{aligned}$$

completing the proof. \(\square \)