1 Introduction

Let us recall Hörmander’s problem in [15]. Let \(n\ge 2\). Let \(B^{n}\) be the unit ball in \(\mathbb{R}^{n}\) and \(a: \mathbb{R}^{n}\times \mathbb{R}^{n-1}\to \mathbb{R}\) be a smooth function supported on \(B^{n}\times B^{n-1}\). Let \(\phi : B^{n}\times B^{n-1}\to \mathbb{R}\) be a smooth function satisfying the following conditions:

  1. (H1)

    \(\mathrm{rank}\ \nabla _{{\mathbf {x}}}\nabla _{\xi}\phi ({\mathbf {x}}; \xi )=n-1\) for all \(({\mathbf {x}}; \xi )\in B^{n}\times B^{n-1}\);

  2. (H2)

    with the map \(G_{0}: B^{n}\times B^{n-1}\to \mathbb{R}^{n}\) defined by

    $$ G_{0}({\mathbf {x}}; \xi ):=\bigwedge _{j=1}^{n-1} \partial _{\xi _{j}} \nabla _{{\mathbf {x}}} \phi ({\mathbf {x}}; \xi ), $$
    (1.1)

    the curvature condition

    $$ \det \nabla _{\xi}^{2} \langle \nabla _{{\mathbf {x}}}\phi ({\mathbf {x}}; \xi ),G_{0}({\mathbf {x}}; \xi _{0})\rangle \big|_{ \xi =\xi _{0}}\neq 0 $$
    (1.2)

    holds for all \(({\mathbf {x}}; \xi _{0})\in \mathrm{supp}(a)\).

Define the oscillatory integral operator

$$ T_{N} f({\mathbf {x}}):=\int e^{iN\phi ({\mathbf {x}}; \xi )} f(\xi )a({\mathbf {x}}; \xi )d \xi . $$
(1.3)

If one takes \(\phi ({\mathbf {x}}; \xi )=\langle x,\xi\rangle +t|\xi |^{2}\) where \({\mathbf {x}}=(x, t)\), then one can easily check that Hypotheses (H1) and (H2) are satisfied, and \(T_{N}\) becomes the standard Fourier extension operator for the paraboloid. Hörmander [15] asked whether \(T_{N}\) satisfies \(L^{p}\)-boundedness properties similar to those of the Fourier extension operator. More precisely, he asked whether it holds that

$$ \|T_{N} f\|_{q} \lesssim _{n, q} N^{-n/q} \|f\|_{\infty}, $$
(1.4)

for all \(q>\frac{2n}{n-1}\).
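The claim that the extension phase satisfies (H1) and (H2) can be verified symbolically. The following sympy sketch (ours, not part of the paper; we take \(n=3\)) computes the rank in (H1), the map \(G_{0}\) of (1.1), and the Hessian in (1.2):

```python
import sympy as sp

x1, x2, t, s1, s2 = sp.symbols('x1 x2 t xi1 xi2', real=True)
X, Xi = [x1, x2, t], [s1, s2]
phi = x1*s1 + x2*s2 + t*(s1**2 + s2**2)    # extension phase for the paraboloid

# (H1): the n x (n-1) matrix (partial_{xi_j} grad_x phi) has rank n-1 = 2
M = sp.Matrix([[sp.diff(phi, xi, xv) for xi in Xi] for xv in X])
assert M.rank() == 2

# (1.1): G0 is the wedge of the columns of M, i.e. a cross product when n = 3
G0 = M.col(0).cross(M.col(1))              # = (-2*xi1, -2*xi2, 1)

# (H2): Hessian in xi of <grad_x phi, G0(x; xi0)>, evaluated at xi = xi0
s01, s02 = sp.symbols('xi01 xi02', real=True)
G0_frozen = G0.subs([(s1, s01), (s2, s02)])
inner = sum(sp.diff(phi, xv)*g for xv, g in zip(X, G0_frozen))
H = sp.hessian(inner, Xi).subs([(s1, s01), (s2, s02)])
print(H)                                   # twice the identity matrix
```

The Hessian comes out as \(2I\), so the determinant in (1.2) is nonzero and, in fact, the stronger condition \((\mathrm{H2}^{+})\) below also holds.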

In dimension \(n=2\), the answer to Hörmander’s question is affirmative, see for instance Carleson and Sjölin [5], Hörmander [15] and Fefferman [6]. However, in dimension \(n=3\), Bourgain [3] showed that if one takes

$$ \phi ({\mathbf {x}}; \xi )=x_{1}\xi _{1}+x_{2}\xi _{2}+t\xi _{1}\xi _{2}+ \frac{1}{2}t^{2}\xi _{1}^{2},\ \ {\mathbf {x}}=(x_{1}, x_{2}, t), $$
(1.5)

then (1.4) may fail for every \(q<4\). Indeed, he showed that even if one replaces (H2) by the following stronger assumption

$$ (\mathrm{H2}^{+})\ \ \nabla _{\xi}^{2} \langle \nabla _{{\mathbf {x}}}\phi ({\mathbf {x}}; \xi ),G_{0}({\mathbf {x}}; \xi _{0})\rangle \big|_{ \xi =\xi _{0}} \text{ is positive definite,} $$
(1.6)

the estimate (1.4) may still fail for some \(q>3\), and it may even fail generically. Let us be more precise about this generic failure. By an elementary change of variables, phase functions \(\phi ({\mathbf {x}}; \xi )\) satisfying (H1) and (H2) can be taken to be

$$ \phi ({\mathbf {x}}; \xi )=\langle x,\xi\rangle +t\langle A\xi,\xi\rangle +O(|t||\xi |^{3}+| {\mathbf {x}}|^{2} |\xi |^{2}), $$
(1.7)

where \(A\) is a symmetric non-degenerate matrix. Condition \((\mathrm{H2}^{+})\) amounts to requiring that \(A\) be positive definite. The form (1.7) is called a normal form of the phase function \(\phi ({\mathbf {x}}; \xi )\) at the origin (see [3, page 323]). Bourgain [3] proved in dimension \(n=3\) that, if

$$ \nabla _{\xi}^{2}\partial _{t}^{2}\phi \big|_{{\mathbf {x}}=0, \xi =0} \text{ is not a multiple of } \nabla _{\xi}^{2}\partial _{t}\phi \big|_{ {\mathbf {x}}=0, \xi =0} $$
(1.8)

where \(\phi ({\mathbf {x}}; \xi )\) is as in (1.7), then (1.4) fails for every \(q< \frac{118}{39}\), which is \(>3\).
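One can check by direct computation that the phase (1.5) satisfies the non-degeneracy condition (1.8); the following sympy sketch (ours) evaluates the two matrices at the origin and confirms that they are not multiples of each other:

```python
import sympy as sp

x1, x2, t, s1, s2 = sp.symbols('x1 x2 t xi1 xi2', real=True)
phi = x1*s1 + x2*s2 + t*s1*s2 + t**2*s1**2/2      # Bourgain's phase (1.5)

at0 = {x1: 0, x2: 0, t: 0, s1: 0, s2: 0}
Ht  = sp.hessian(sp.diff(phi, t),    (s1, s2)).subs(at0)   # grad_xi^2 d_t   phi at 0
Htt = sp.hessian(sp.diff(phi, t, 2), (s1, s2)).subs(at0)   # grad_xi^2 d_t^2 phi at 0

# (1.8): Htt is a scalar multiple of Ht iff the two flattened matrices are dependent
stacked = sp.Matrix([list(Ht), list(Htt)])
print(Ht, Htt, stacked.rank())
```

Here \(\nabla _{\xi}^{2}\partial _{t}\phi \) at the origin is the off-diagonal matrix \(\bigl(\begin{smallmatrix}0&1\\1&0\end{smallmatrix}\bigr)\) while \(\nabla _{\xi}^{2}\partial _{t}^{2}\phi \) is \(\bigl(\begin{smallmatrix}2&0\\0&0\end{smallmatrix}\bigr)\), and the stacked rank is 2, so neither is a multiple of the other.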

On the other hand, for phase functions \(\phi ({\mathbf {x}}; \xi )\) satisfying (H1) and \((\mathrm{H2}^{+})\), Guth, Hickman and Iliopoulou [10] (building on earlier work of Guth [9]) obtained the optimal range of \(q\) for which (1.4) holds. Denote

$$ q_{n, \mathrm{GHI}}:= \textstyle\begin{cases} \frac{2(3n+1)}{3n-3}, & \text{if $n$ is odd,} \\ \frac{2(3n+2)}{3n-2}, & \text{if $n$ is even.} \end{cases} $$
(1.9)

Then (1.4) holds for all \(q>q_{n, \mathrm{GHI}}\).

In the first result of the paper, we show that generic failure in the spirit of Bourgain [3] occurs in every dimension \(n\ge 3\). Let us first introduce some terminology. At a given point \(({\mathbf {x}}_{0}; \xi _{0})\), consider a new phase function \(\phi '({\mathbf {x}}; \xi ):=\phi ({\mathbf {x}}_{0}+{\mathbf {x}}; \xi _{0}+\xi )\). Let us use \(\phi ''({\mathbf {x}}; \xi )\) to denote a normal form of \(\phi '({\mathbf {x}}; \xi )\) at the origin \({\mathbf {x}}=0, \xi =0\). We say that Bourgain’s condition holds at \(({\mathbf {x}}_{0}; \xi _{0})\) if

$$ \nabla _{\xi}^{2}\partial _{t}^{2}\phi ''\big|_{{\mathbf {x}}=0, \xi =0} \text{ is a multiple of } \nabla _{\xi}^{2}\partial _{t}\phi ''\big|_{ {\mathbf {x}}=0, \xi =0}, $$
(1.10)

where the implicit constant is allowed to depend on \({\mathbf {x}}_{0}\) and \(\xi _{0}\). Otherwise, we say that Bourgain’s condition fails at this point. As normal forms are not unique, we need to show that Bourgain’s condition is well-defined, and this is done in Corollary 2.2.
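For the extension phase \(\langle x,\xi\rangle +t|\xi |^{2}\), which is already in normal form at the origin, Bourgain's condition (1.10) can be checked directly; a short sympy computation (ours, again with \(n=3\)):

```python
import sympy as sp

x1, x2, t, s1, s2 = sp.symbols('x1 x2 t xi1 xi2', real=True)
phi = x1*s1 + x2*s2 + t*(s1**2 + s2**2)   # already in normal form at the origin

Ht  = sp.hessian(sp.diff(phi, t),    (s1, s2))   # = 2*I
Htt = sp.hessian(sp.diff(phi, t, 2), (s1, s2))   # = 0, the zero multiple of 2*I
print(Ht, Htt)
```

Here \(\nabla _{\xi}^{2}\partial _{t}^{2}\phi \) vanishes identically, and the zero matrix is a (zero) multiple of \(2I\), so Bourgain's condition holds at every point.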

Theorem 1.1

Generic failure

Let \(n\ge 3\) and

$$ q_{n, 1}:=\frac{2(2n^{2}+n-1)}{2n^{2}-n-2}. $$
(1.11)

Let \(\phi : B^{n}\times B^{n-1}\to \mathbb{R}\) be a smooth function satisfying (H1) and (H2). If Bourgain’s condition fails at some \(({\mathbf {x}}_{0}; \xi _{0})\in \mathrm{supp}(a)\), then (1.4) may fail for every \(q< q_{n, 1}\).

Note that when \(n=3\), \(q_{n, 1}=40/13\), which is slightly better than Bourgain’s exponent \(118/39\).

Based on the above theorem, we think it is very natural to conjecture that (1.4) holds for every \(q>\frac{2n}{n-1}\) if Bourgain’s condition holds at every point. The following positive results provide some further evidence for such a conjecture.

For \(\delta >0\), we define \(\delta \)-tubes. Fix a dyadic cube \(\theta \subset B^{n-1}\) of side length \(\delta \) and let \(\xi _{\theta}\) be the center of \(\theta \). For \(v\in B^{n-1}\) with \(v\in \delta \mathbb{Z}^{n-1}\), let \(X_{t}(\xi _{\theta}, v)\in B^{n-1}\) denote the unique solution in the \(x\) variable to

$$ \nabla _{\xi}\phi (x, t; \xi _{\theta})=v. $$
(1.12)

By a \(\delta \)-tube, we mean

$$ T_{\theta , v}:=\{(x, t): |x-X_{t}(\xi _{\theta}, v)|\le \delta , |t| \le 1\}. $$
(1.13)

For a collection \(\mathbb{T}\) of tubes \(\{T_{\theta , v}\}\), we say that the tubes in \(\mathbb{T}\) point in different directions if for any two distinct tubes \(T_{\theta _{1}, v_{1}}\) and \(T_{\theta _{2}, v_{2}}\) in \(\mathbb{T}\), we always have \(\theta _{1}\neq \theta _{2}\).
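To make the definitions (1.12)-(1.13) concrete, here is a small numerical sketch (ours) of the tube membership test for the model phase \(\langle x,\xi\rangle +t|\xi |^{2}\), for which (1.12) can be solved in closed form; the sample values of \(\xi _{\theta}\) and \(v\) are hypothetical:

```python
import numpy as np

# For phi = <x, xi> + t|xi|^2, equation (1.12) reads x + 2*t*xi_theta = v,
# so the core curve of the tube is X_t(xi_theta, v) = v - 2*t*xi_theta.
def in_tube(x, t, xi_theta, v, delta):
    """Membership test for the delta-tube T_{theta, v} of (1.13)."""
    X_t = v - 2.0*t*xi_theta
    return bool(np.linalg.norm(x - X_t) <= delta and abs(t) <= 1.0)

xi_theta = np.array([0.25, 0.25])   # center of a dyadic cube theta (sample value)
v = np.array([0.0, 0.0])
print(in_tube(v - 2*0.5*xi_theta, 0.5, xi_theta, v, delta=0.1))   # point on the core curve
```

For curved phases the core curve \(X_{t}\) is no longer affine in \(t\), which is why (1.12) is stated implicitly.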

Theorem 1.2

Polynomial Wolff Axiom for \(\phi \)

Let \(n\ge 3\). If Bourgain’s condition holds for the phase function \(\phi \) at every \(({\mathbf {x}}_{0}; \xi _{0})\in \mathrm{supp}(a)\), then the following polynomial Wolff axiom for \(\phi \) holds: Let \(E\ge 2\) be an integer. For every \(\epsilon >0\), there exists \(C(n, E, \epsilon )>0\) such that for every collection \(\mathbb{T}\) of \(\delta \)-tubes pointing in different directions,

$$ \#\{T\in \mathbb{T}: T \subset S\}\le C(n, E, \epsilon )|S| \delta ^{1-n- \epsilon} $$
(1.14)

whenever \(S \subset B^{n}\) is a semialgebraic set of complexity \(\le E\).

The above polynomial Wolff axiom for \(\phi \) satisfying Bourgain’s condition is a generalization of that for \(\phi ({\mathbf {x}}; \xi )=\langle x,\xi\rangle +t|\xi |^{2}\), proven by Katz and Rogers [11]. Hickman and Rogers [12] used the polynomial Wolff axiom of Katz and Rogers and proved that (1.4) holds with \(\phi ({\mathbf {x}}; \xi )=\langle x,\xi\rangle +t|\xi |^{2}\) for

$$ q>q_{n, \mathrm{HR}}:=2+\frac{\lambda _{\mathrm{HR}}}{n}+O(n^{-2}), $$
(1.15)

where

$$ \lambda _{\mathrm{HR}}=4/(5-2\sqrt{3})=2.60434\dots $$
(1.16)

After verifying the polynomial Wolff axiom for general \(\phi \) satisfying Bourgain’s condition, one can expect to combine the argument of [10] and [12], and prove (1.4) for all \(q\) satisfying (1.15). We will indeed prove something stronger. Before stating the next theorem, we first recall the result of Hickman and Zahl [13]. Let \(\nu ^{1/2}\) be the unique real root of the equation

$$ 2 x^{3}+3 x^{2}-2=0. $$
(1.17)

Denote

$$ \lambda _{\mathrm{HZ}}:=\frac{4}{2-\nu}=2.59607\dots $$
(1.18)

Hickman and Zahl [13] used the strong polynomial Wolff axiom of Hickman-Rogers-Zhang [14] and, independently, Zahl [24] to further improve the result in [12] and obtained that (1.4) holds for

$$ q> q_{n, \mathrm{HZ}}:=2+\frac{\lambda _{\mathrm{HZ}}}{n}+O(n^{-2}), $$
(1.19)

with \(\phi ({\mathbf {x}}; \xi )=\langle x,\xi\rangle +t|\xi |^{2}\). This result gives the best asymptotic formula (as \(n\to \infty \)) in the literature for the Fourier restriction conjecture.
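The numerical values in (1.16) and (1.18) can be reproduced in a few lines (our sketch); we locate the unique real root of (1.17) by bisection:

```python
# nu^(1/2) is the unique real root of 2x^3 + 3x^2 - 2 = 0; it lies in (0, 1)
f = lambda x: 2*x**3 + 3*x**2 - 2
lo, hi = 0.0, 1.0                    # f(0) = -2 < 0 < 3 = f(1)
for _ in range(80):                  # bisection
    mid = 0.5*(lo + hi)
    lo, hi = (mid, hi) if f(mid) < 0 else (lo, mid)
nu = (0.5*(lo + hi))**2              # root ~ 0.67766, so nu ~ 0.45922
lam_HZ = 4/(2 - nu)                  # ~ 2.59607, matching (1.18)
lam_HR = 4/(5 - 2*3**0.5)            # ~ 2.60434, matching (1.16)
print(lam_HZ, lam_HR)
```

In particular \(\lambda _{\mathrm{HZ}}<\lambda _{\mathrm{HR}}\), so (1.19) improves on (1.15) for large \(n\).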

Theorem 1.3

Let \(\phi : B^{n}\times B^{n-1}\to \mathbb{R}\) be a smooth function satisfying (H1) and \((\mathrm{H2}^{+})\). If Bourgain’s condition holds for the phase function \(\phi ({\mathbf {x}}; \xi )\) at every point \(({\mathbf {x}}; \xi )\in \mathrm{supp}(a)\), then (1.4) holds for

$$ q>q_{n, 2}:=2+\frac{2.5921}{n}+O(n^{-2}). $$
(1.20)

Note that

$$ q_{n, 1}= 2+ \frac{4n+2}{ 2 n^{2}-n-2 }= 2+ \frac{2}{n}+ O(n^{-2}), $$
(1.21)

and therefore for large \(n\), we have the order

$$ q_{n, 1}< q_{n, 2}< q_{n ,\mathrm{HZ}}< q_{n, \mathrm{HR}}< q_{n, \mathrm{GHI}}. $$
(1.22)

When \(n=3\), the exponents in (1.22) are given by

$$ 3+\frac{1}{13}< 3+ \frac{3}{13}< 3+ \frac{1}{4}\le 3+ \frac{1}{4} < 3+ \frac{1}{3}, $$
(1.23)

respectively. Here \(q_{n, 2}\) with \(n=3\) matches Wang’s exponent [21].
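The entries of (1.23) that come from the stated formulas, together with the algebraic identity (1.21), can be double-checked with exact rational arithmetic (our sketch; the \(n=3\) values \(3+\frac{1}{4}\) of \(q_{n, \mathrm{HZ}}\) and \(q_{n, \mathrm{HR}}\) come from [13] and [12] rather than from the asymptotic formulas, so they are not reproduced here):

```python
from fractions import Fraction as F

def q_1(n):    # q_{n,1} as in (1.11)
    return F(2*(2*n*n + n - 1), 2*n*n - n - 2)

def q_GHI(n):  # q_{n,GHI} as in (1.9)
    return F(2*(3*n + 1), 3*n - 3) if n % 2 else F(2*(3*n + 2), 3*n - 2)

print(q_1(3), q_GHI(3))            # 40/13 = 3 + 1/13 and 10/3 = 3 + 1/3
print(q_1(3) > F(118, 39))         # q_{3,1} exceeds Bourgain's exponent 118/39
print(all(q_1(n) - 2 == F(4*n + 2, 2*n*n - n - 2) for n in range(3, 30)))  # (1.21)
```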

As \(\phi ({\mathbf {x}}; \xi )=\langle x,\xi\rangle +t|\xi |^{2}\) also satisfies Bourgain’s condition, we obtain the following immediate corollary of Theorem 1.3.

Corollary 1.4

Improved Fourier restriction estimate

For \(q>q_{n, 2}\), it holds that

$$ \Big\| \int _{[0, 1]^{n-1}} e^{iN(\langle x,\xi\rangle +t|\xi |^{2})}f(\xi )d\xi \Big\| _{q} \lesssim _{n, q} N^{-n/q} \big\| f \big\| _{\infty}. $$
(1.24)

Recalling Bourgain’s observation [3, Remark 3.43] that the phase function for the Bochner-Riesz problem also satisfies Bourgain’s condition, we see that Theorem 1.3 also gives the currently best known bounds for the Bochner-Riesz problem. More precisely, for \(\alpha \geqslant 0\), the Bochner-Riesz multiplier of order \(\alpha \) is defined by

$$ m^{\alpha}(\xi ):=\left (1-|\xi |^{2}\right )_{+}^{\alpha}, \ \ \xi \in \mathbb{R}^{n}. $$

As a corollary of Theorem 1.3 (also Theorem 4.1 below), we obtain

Corollary 1.5

Improved Bochner-Riesz estimate on \(\mathbb{R}^{n}\)

For \(q>q_{n, 2}\), it holds that

$$ \big\| m^{\alpha}(D) f \big\| _{L^{q}(\mathbb{R}^{n})} \lesssim _{n, \alpha , q} \big\| f \big\| _{L^{q}(\mathbb{R}^{n})}, $$
(1.25)

whenever \(\alpha > n(\frac{1}{2}-\frac{1}{q})-\frac{1}{2}\).

Before stating the next corollary, let us discuss a prior attempt at breaking the critical range of exponents in (1.9) for general Hörmander operators. In [7], Gao, Li and Wang, under the assumptions (H1), \((\mathrm{H2}^{+})\), and the additional assumption

$$ G_{0}({\mathbf {x}}; \xi )/|G_{0}({\mathbf {x}}; \xi )| \text{ is constant in } {\mathbf {x}}, $$
(1.26)

proved that (1.4) holds for \(q\) satisfying the range of Hickman and Zahl (1.19).

By Theorem 2.1 and Lemma 2.3, one can check directly that phase functions satisfying (1.26) also satisfy Bourgain’s condition. Therefore the bounds Gao, Li and Wang obtained in [7] can also be improved to the one stated in Theorem 1.3.

The assumption (1.26) appears naturally in several interesting applications, including the generalized Bochner-Riesz problem for non-degenerate hyper-surfaces, the local smoothing estimates for fractional Schrödinger equations and sharp resolvent estimates outside of the uniform boundedness range. These applications were worked out carefully in [7]. We include the latter two here.

Let \(u: \mathbb{R}^{n-1}\times \mathbb{R}\to \mathbb{C}\) be the solution to the equation

$$ \textstyle\begin{cases} i\partial _{t} u+(-\Delta )^{\frac{\alpha}{2}}u=0, & (x, t)\in \mathbb{R}^{n-1} \times \mathbb{R}, \\ u(x, 0)=f(x), & x\in \mathbb{R}^{n-1}, \end{cases} $$
(1.27)

where \(\alpha >1\) and \(f\) is a Schwartz function.

Corollary 1.6

Local smoothing estimates for fractional Schrödinger equations

Let \(\alpha >1\) and let \(u\) be a solution to (1.27). Then

$$ \big\| u \big\| _{L^{q}(\mathbb{R}^{n-1}\times [1, 2])} \lesssim _{\alpha , \beta , n, q} \big\| f \big\| _{L^{q}_{\beta}(\mathbb{R}^{n-1})}, $$
(1.28)

whenever

$$ \beta > (n-1)\alpha \Big( \frac{1}{2}-\frac{1}{q} \Big) -\frac{\alpha}{q}, $$
(1.29)

and \(q\) satisfies (1.20). Moreover, for each fixed \(q\), the range of \(\beta \) is sharp.

The conjectured range for (1.28) to hold is \(q>\frac{2n}{n-1}\), the same as the range in the Fourier restriction conjecture.

The resolvent estimate for the Laplacian on \(\mathbb{R}^{n}\) is of the form

$$ \big\| (-\Delta -z)^{-1} f \big\| _{L^{q}(\mathbb{R}^{n})} \le C_{p, q, n}(z) \big\| f \big\| _{L^{p}(\mathbb{R}^{n})}, \ \ z \in \mathbb{C}\setminus [0, \infty ). $$
(1.30)

Here \(C_{p, q, n}(z)\) is a constant that is allowed to depend on \(p, q, n\) and \(z\). We are particularly interested in tracking the dependence on \(z\), for fixed \(n, p\) and \(q\).

Corollary 1.7

Resolvent estimates

For \(q\) satisfying (1.20), we have

$$ \big\| (-\Delta -z)^{-1} f \big\| _{L^{q}(\mathbb{R}^{n})} \lesssim _{q, n} |z|^{-1+\gamma _{q}} \mathrm{dist}(z, [0, \infty ))^{-\gamma _{q}} \big\| f \big\| _{L^{q}(\mathbb{R}^{n})}, $$
(1.31)

where

$$ \gamma _{q}:=\frac{n+1}{2}-\frac{n}{q}. $$
(1.32)

Moreover, for fixed \(q\), the bound is optimal in \(z\).

The study of resolvent estimates in (1.30) has a long history and dates back to the work of Kenig, Ruiz and Sogge [17]. The authors there proved (1.30) for optimal ranges of \((p, q)\) for which the constant \(C_{p, q, n}(z)\) is independent of \(z\). These are called uniform Sobolev inequalities, and have found numerous applications including unique continuation properties, limiting absorption principles, etc.

The conjectured range for (1.31) to hold is \(q>\frac{2n}{n-1}\), the same as the Fourier restriction exponent. Moreover, if this conjecture turns out to be true, then by interpolation with known results, it would imply (1.30) for all combinations of \((p, q)\), with an optimal dependence of \(C_{p, q, n}(z)\) on \(z\).

In the last corollary, we discuss the connection of Theorem 1.3 to Sogge’s work [18] and [20]. In [20], Sogge studied the Nikodym problem on general Riemannian manifolds and proved that the Nikodym maximal operator satisfies better bounds on manifolds of constant scalar curvature than on general Riemannian manifolds. This is a strong indication that distance functions on manifolds of constant curvature satisfy Bourgain’s condition. In the next corollary, we show that this is indeed the case for \(S^{n}\), the \(n\)-dimensional Euclidean sphere.

Corollary 1.8

Let \(a: S^{n}\times S^{n}\to \mathbb{R}\) be a smooth function supported away from the diagonal. Let \(\mathrm{dist}\) be the distance function on \(S^{n}\). Then

$$ \Big\| \int _{S^{n}} e^{ iN \mathrm{dist}(x, y) } a(x, y)f(y) dy \Big\| _{L^{q}(S^{n})} \lesssim N^{-n/q} \big\| f \big\| _{L^{q}(S^{n})}, $$
(1.33)

for every \(q\) satisfying (1.20).

Proof of Corollary 1.8

By cutting the support of \(a\) into finitely many pieces, we may assume without loss of generality that the support of \(a(x, y)\) is such that \(x\) is near the north pole and \(y\) is slightly away from the north pole. Write

$$ x=(x_{1}, \dots , x_{n}, \sqrt{1-|x'|^{2}}), \ \ y=(y_{1}, \dots , y_{n}, \sqrt{1-|y'|^{2}}), $$
(1.34)

where \(x'=(x_{1}, \dots , x_{n})\) and \(y'=(y_{1}, \dots , y_{n})\). Note that \(\mathrm{dist}(x, y)=\arccos (x\cdot y)\). When integrating in \(y\) on the left hand side of (1.33), we apply Fubini’s theorem: we first integrate over the slice where \(\sqrt{1-|y'|^{2}}=r\) and then integrate in \(r\). For a fixed \(r\), our distance function can be written as

$$ \arccos ( x_{1}y_{1}+\cdots +x_{n-1} y_{n-1} + x_{n} \sqrt{1-r^{2}-|y''|^{2}}+r \sqrt{1-|x'|^{2}} ), $$
(1.35)

where \(y'':=(y_{1}, \dots , y_{n-1})\). Therefore, to prove Corollary 1.8, it suffices to show that (1.35) satisfies Bourgain’s condition in \(x'\) and \(y''\) variables. We will prove this by checking the definition of Bourgain’s condition as in (1.10). By rotation symmetry, it suffices to consider the phase function (1.35) near the north pole \(x'=0\) and \(y''=0\). Next, we apply a Taylor expansion for (1.35) about \(x'=0\) and \(y''=0\). After the Taylor expansion, note that all linear terms in \(y''\) can be written as

$$ (x_{1}+O(|x'|^{2}))y_{1}+\cdots +(x_{n-1}+O(|x'|^{2}))y_{n-1}. $$
(1.36)

We therefore apply the change of variables

$$ x_{1}+O(|x'|^{2})\mapsto x_{1}, \dots , x_{n-1}+O(|x'|^{2})\mapsto x_{n-1}, $$
(1.37)

to turn our phase function into a normal form. Let \(\phi ''\) denote this normal form. In the end, we just need to note that both matrices

$$ \nabla _{y''}^{2} \partial ^{2}_{t} \phi ''\big|_{x'=0, y''=0}, \ \ \nabla _{y''}^{2} \partial _{t} \phi ''\big|_{x'=0, y''=0} $$
(1.38)

are multiples of the identity matrix. This finishes the proof. □

It is conjectured (see Sogge [18]) that (1.33) holds for all \(q>2n/(n-1)\). The operator studied in (1.33) appears in the study of the Bochner-Riesz problem on Euclidean spheres \(S^{n}\). Let \(\Delta _{g}\) denote the Laplace-Beltrami operator, and

$$ 0< \lambda _{1}\le \lambda _{2}\le \dots $$
(1.39)

the eigenvalues of \(-\Delta _{g}\). Let \(E_{j}\) be the one-dimensional eigenspace for \(-\Delta _{g}\) with eigenvalue \(\lambda _{j}\), and \(e_{j}: L^{2}(S^{n})\to L^{2}(S^{n})\) be the projection operator onto the eigenspace \(E_{j}\). We define the Riesz means of index \(\alpha \ge 0\) as

$$ S^{\alpha}_{L}(f):=\sum _{j=1}^{\infty} \Big( 1-\frac{\lambda _{j}}{L} \Big) ^{\alpha}_{+} e_{j}(f). $$
(1.40)

One can follow the work of Sogge [18] and Huang and Sogge [16], and deduce the following corollary from Corollary 1.8.

Corollary 1.9

Bochner-Riesz for spheres

Assume that \(q\) satisfies (1.20). We have that

$$ \big\| S^{\alpha}_{L} (f) \big\| _{L^{q}(S^{n})} \lesssim \big\| f \big\| _{L^{q}(S^{n})} \ \textit{ uniformly in } L, $$
(1.41)

whenever \(\alpha > n(\frac{1}{2}- \frac{1}{q} )-\frac{1}{2}\). Moreover, the range of \(\alpha \) is sharp for fixed \(q\).

At the end of the introduction, we discuss the proof of Theorem 1.3. To prove Theorem 1.3, we first prove a strong Polynomial Wolff Axiom (SPWA) for phase functions \(\phi \) satisfying Bourgain’s condition, see Sect. 6. It is a nested version of the Polynomial Wolff Axiom and generalizes the SPWA by Hickman and Zahl [13], which was built on the work of Hickman-Rogers-Zhang [14] and Zahl [24]. One can combine this strong polynomial Wolff axiom with the argument in [13], and prove (1.4) for \(q\) satisfying (1.19).

To improve the range in [13], we further develop the idea of “brooms”, introduced in dimension \(n=3\) by Wang [21] for the Fourier restriction problem. Brooms enable one to exploit the feature that if the sum of a collection of wave packets is highly concentrated locally in space, then this collection must spread out on the far end, leading to new improvements on the range of exponents for the Fourier restriction conjecture in \(\mathbb{R}^{3}\).

However, a key geometric argument in [21] regarding brooms relies heavily on the space being three-dimensional. Even in the Fourier restriction setting, it was not clear how one can most efficiently generalize the notion and construction of brooms in [21] to higher dimensions. In the current paper, we come up with a slightly different notion of brooms, which works in all dimensions and also in the setting of oscillatory integral operators satisfying Hörmander’s conditions. This is done in Sect. 7.1. When proving the relevant broom estimates (see Theorem 7.8 in Sect. 7.3), we use an argument that can be viewed as a generalized pseudo-conformal transformation (Lemma 7.12). By this transformation and a counting lemma (Lemma 7.10 below) the key broom estimate is then reduced to what we call a Variety Uncertainty Principle. We state it in the Fourier transform case below as we will use it to prove the general case.

Lemma 1.10

Variety Uncertainty Principle

Let \(Y_{1}, Y_{2}\) be two \((m-1)\)-dimensional algebraic varieties in \(\mathbb{R}^{n-1}\) that are transverse complete intersections (see Definition 5.1 below). Let \(Z_{i}\subset Y_{i}\) be the part of \(Y_{i}\) where every point is non-singular and the angle formed by \(T_{{\mathbf {z}}_{i}}(Z_{i})\) and the space spanned by \(\{\vec {e}_{1}, \dots , \vec {e}_{m-1}\}\) is \(\le 1/(100n)\), for every \(i=1, 2\) and every \({\mathbf {z}}_{i}\in Z_{i}\). Here \(T_{{\mathbf {z}}_{i}}(Z_{i})\) refers to the tangent space and \(\vec{e}_{j}\) refers to a coordinate vector. Let \(1\le R_{1}\le R_{2}\). Denote

$$ \Omega _{1}=\mathcal {N}_{\sqrt{R_{1}}}(Z_{1}), \ \ \Omega _{2}=\mathcal {N}_{1/ \sqrt{R_{2}}}(Z_{2}). $$
(1.42)

Assume that \(F: \mathbb{R}^{n-1}\to \mathbb{C}\) satisfies \(\mathrm{supp}(F)\subset \Omega _{2}\). Then

$$ \big\| \widehat{F} \big\| _{L^{2}(\Omega _{1})}^{2} \lesssim \Big( \frac{R_{1}}{R_{2}} \Big) ^{\frac{n-m}{2}-\delta} \big\| F \big\| _{L^{2}}^{2}, $$
(1.43)

for every \(\delta >0\), where the implicit constant depends on \(n, m\), \(\deg (Z_{1})\), \(\deg (Z_{2})\) and \(\delta \).

Lemma 1.10 may also be of independent interest, as it can be viewed as a variant of the generalized Mizohata-Takeuchi Conjecture of Jonathan Bennett and Tony Carbery, as stated in (9) in [2]. For the original Mizohata-Takeuchi Conjecture and its influence, see e.g. the references in [2]. As one sees from the proof, Lemma 1.10 is proved by using geometric information about neighborhoods of both the spatial set and the hypersurface in Fourier space at many scales, and can be viewed as a result in the vein of Mizohata-Takeuchi but with much stronger assumptions about the neighborhoods of the underlying sets at many scales.

Structure of the paper. In Sect. 2 we first give an equivalent characterization of Bourgain’s condition, which is more straightforward to check, and then prove Theorem 1.1. In dimension \(n=3\), the improvement of our result over Bourgain’s [3] comes from a slightly more efficient way of constructing (curved) tubes that overlap heavily.

In Sect. 3, we show that for phase functions \(\phi ({\mathbf {x}}; \xi )\) satisfying Bourgain’s condition, the corresponding tubes satisfy the polynomial Wolff axiom. We follow largely the argument of Katz and Rogers [11].

In Sect. 4, we introduce the standard wave packet decomposition and standard reduction of Theorem 1.3 to a broad norm estimate (Theorem 4.2).

In Sect. 5 we apply a polynomial partitioning algorithm to decompose the broad norm in Theorem 4.2. The algorithm is a slight variant of that in Hickman and Rogers [12], with the difference that we need better control of how fast cells shrink.

In Sect. 6, we prove the strong polynomial Wolff axiom (mentioned below Theorem 1.3) for phase functions satisfying Bourgain’s condition.

In Sect. 7, we define brooms and prove the broom estimate (Theorem 7.8), which is key to the proof of Theorem 4.2. It is worth noting that the broom estimate holds for all phase functions \(\phi ({\mathbf {x}}; \xi )\) satisfying (H1) and \((\mathrm{H2}^{+})\), and does not rely on Bourgain’s condition.

In Sect. 8, we define bushes and prove bush estimates. They are used to handle “small” grains resulting from the polynomial partitioning algorithm in Sect. 5.

In Sect. 9, we put all the ingredients together and finish the proof of Theorem 4.2, the broad norm estimate.

Notation. We use \({\mathbf {x}}=(x, t)\) to refer to a spatial point in \(\mathbb{R}^{n}\), and \(\xi \) or \(\omega \) for a frequency point in \(\mathbb{R}^{n-1}\). Denote \(\partial _{i}=\partial _{x_{i}}\) if \(1\le i\le n-1\) and \(\partial _{n}=\partial _{t}\).

We use \(d\) for the degree of polynomials that we will use in the polynomial partitioning lemmas. It is a large constant and is not allowed to depend on parameters like \(R\) or \(\lambda \).

We will use a few admissible parameters

$$ \epsilon ^{C} \leqslant \delta \ll _{\epsilon} \delta _{n} \ll _{ \epsilon} \delta _{n-1} \ll _{\epsilon} \cdots \ll _{\epsilon} \delta _{1}\ll _{\epsilon} \delta _{0} \ll _{\epsilon} \epsilon _{ \circ} \ll _{\epsilon} \epsilon . $$
(1.44)

Here \(C\) is some dimensional constant and the notation \(A \ll _{\epsilon} B\) indicates that \(A \leqslant C_{n, \epsilon}^{-1} B\) for some large admissible constant \(C_{n, \epsilon} \geqslant 1\). These parameters have exactly the same meaning as their counterparts \(\delta , \delta _{n}, \dots , \delta _{0}, \epsilon \) in Hickman-Rogers’ work [12]. In the current paper, we need more of these parameters. For each \(1\le n'\le n\), let \(\delta _{n'-1/2}\) be such that

$$ \delta _{n'}\ll _{\epsilon} \delta _{n'-1/2}\ll _{\epsilon} \delta _{n'-1}. $$
(1.45)

These new parameters will be used when we modify the polynomial partitioning algorithm of [12].

For two positive constants \(A, B\), by \(A\lessapprox B\) we mean \(A\le R^{O(\delta )} B\).

For a function \(F: \mathbb{R}^{n}\to \mathbb{C}\) and a region \(\Omega \subset \mathbb{R}^{n}\), we say that \(F\) is essentially supported on \(\Omega \) if it decays rapidly outside \(\Omega \). In other words, for every \({\mathbf {x}}\in \mathbb{R}^{n}\setminus \Omega \) and every \(N\in \mathbb{N}\), it holds that

$$ |F({\mathbf {x}})|\le C_{n, N} R^{-N}, $$
(1.46)

for some constant \(C_{n, N}\). Here \(R>1\) is as in Theorem 4.2.

2 Bourgain’s condition and proof of Theorem 1.1

This section consists of two subsections. In Sect. 2.1, we will give an equivalent form of Bourgain’s condition; in Sect. 2.2, we will prove Theorem 1.1.

Bourgain’s formulation of Bourgain’s condition in [3] requires that one first writes a phase function in its normal form, which differs from point to point. The equivalent form below (see Theorem 2.1) can be checked directly for a phase function, and can be more conveniently applied later when checking polynomial Wolff axioms (see Sect. 3).

The proof of Theorem 1.1 in Sect. 2.2 generalizes Bourgain’s argument in [3] for \(n=3\). Bourgain discovered that if a phase function fails Bourgain’s condition, then a “Kakeya compression” phenomenon always occurs. When reflected in the proof, this phenomenon means that the Jacobian determinant of the map \(X_{t}(\xi )\) in Lemma 2.4 can be “small”, which further means that the image of the map (the relevant Kakeya set) can also be “small”. This is how we generalize Bourgain’s result to higher dimensions.

2.1 An equivalent formulation of Bourgain’s condition

In this subsection, we will provide an equivalent formulation of Bourgain’s condition. This equivalent formulation will be used in a few places below; for instance, it will play a crucial role in Sect. 3 and Sect. 6 when proving polynomial Wolff axioms for Hörmander’s operators satisfying Bourgain’s condition.

Define

$$ \vec {T_{j}}({\mathbf {x}}; \xi )=\partial _{\xi _{j}} \nabla _{{\mathbf {x}}} \phi ( {\mathbf {x}}; \xi ), $$
(2.1)

and

$$ \vec {V}({\mathbf {x}}; \xi )=\vec {T}_{1}({\mathbf {x}}; \xi )\wedge \cdots \wedge \vec {T}_{n-1} ({\mathbf {x}}; \xi ). $$
(2.2)

The main result in this subsection is

Theorem 2.1

Let \(\phi ({\mathbf {x}}; \xi )\) be a phase function that satisfies conditions (H1) and (H2). It satisfies Bourgain’s condition at \(({\mathbf {x}}_{0}; \xi _{0})\) if and only if

$$ ((\vec {V}\cdot \nabla _{{\mathbf {x}}})^{2} \nabla ^{2}_{\xi} \phi )({\mathbf {x}}_{0}; \xi _{0}) \textit{ is a multiple of } (( \vec {V}\cdot \nabla _{{\mathbf {x}}}) \nabla ^{2}_{\xi} \phi )({\mathbf {x}}_{0}; \xi _{0}). $$
(2.3)

The constant is allowed to depend on \({\mathbf {x}}_{0}\) and \(\xi _{0}\).

Corollary 2.2

Let \(\phi ({\mathbf {x}}; \xi )\) be a phase function satisfying conditions (H1) and (H2). That Bourgain’s condition holds at \(({\mathbf {x}}_{0}; \xi _{0})\) is independent of the choice of normal forms.

Here by \(\vec {V}\cdot \nabla _{{\mathbf {x}}}\), we mean

$$ V_{1}({\mathbf {x}}; \xi )\partial _{1}+\cdots +V_{n-1}({\mathbf {x}}; \xi )\partial _{n-1}+ V_{n}({\mathbf {x}}; \xi )\partial _{n}, $$
(2.4)

where \(\vec {V}=(V_{1}, \dots , V_{n})^{T}\) and, for the sake of simplicity, we introduced the notation \(\partial _{i}=\partial _{x_{i}}\) if \(i\le n-1\) and \(\partial _{n}=\partial _{t}\). It is perhaps worth emphasizing that because of the dependence of \(\vec {V}\) on \({\mathbf {x}}\), the differential operator \((\vec {V}\cdot \nabla _{{\mathbf {x}}})^{2}\) is equal to

$$ \sum _{1\le i, j\le n}V_{i} V_{j}\cdot \partial _{i} \partial _{j}+ \sum _{1\le i\le n}V_{i}\sum _{1\le j\le n}\partial _{i} V_{j}\cdot \partial _{j}. $$
(2.5)

Proof of Theorem 2.1.

We first observe that if (2.3) is satisfied everywhere, then it is also satisfied everywhere if one replaces \(V\) by \(\lambda ({\mathbf {x}}; \xi ) V\) where \(\lambda ({\mathbf {x}}; \xi )\) is any smooth scalar function. This can be seen by the straightforward computation

$$ (\lambda \vec {V}\cdot \nabla _{{\mathbf {x}}})^{2} f= \lambda ^{2} ( \vec {V}\cdot \nabla _{{\mathbf {x}}})^{2} f+ \lambda \cdot (\vec {V} \cdot \nabla _{{\mathbf {x}}} \lambda )\vec {V}\cdot \nabla _{{\mathbf {x}}} f $$
(2.6)

for every smooth function \(f = f({\mathbf {x}}; \xi )\).
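Identity (2.6) can also be confirmed by computer algebra; the following sympy sketch (ours) checks it for generic smooth \(\lambda \), \(\vec {V}\) and \(f\) in two variables:

```python
import sympy as sp

x, y = sp.symbols('x y', real=True)
lam, V1, V2, f = (sp.Function(s)(x, y) for s in ('lam', 'V1', 'V2', 'f'))

# (W . grad) g for a vector field W = (W1, W2) in two variables
D = lambda g, W1, W2: W1*sp.diff(g, x) + W2*sp.diff(g, y)

lhs = D(D(f, lam*V1, lam*V2), lam*V1, lam*V2)     # (lam V . grad)^2 f
rhs = lam**2*D(D(f, V1, V2), V1, V2) + lam*D(lam, V1, V2)*D(f, V1, V2)
print(sp.expand(lhs - rhs))                       # 0, confirming (2.6)
```

The same computation goes through verbatim in \(n\) variables; two variables already exercise every term produced by the product rule.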

Let us return to the proof of the theorem. We first prove that if \(\phi \) satisfies (2.3) everywhere, then it also satisfies (2.3) everywhere after any diffeomorphism in \({\mathbf {x}}\) only or in \(\xi \) only. Without loss of generality, we only need to verify this at \(({\mathbf{0}}, 0)\) and can assume the diffeomorphism always preserves the origin.

In order to show that (2.3) continues to hold after any diffeomorphism in \({\mathbf {x}}\), in light of the above property, we just need to show that the “direction” of \(\vec {V}\cdot \nabla _{{\mathbf {x}}}\) is invariant. More specifically, if \(h\) is a diffeomorphism that preserves the origin and we write \(h({\mathbf {x}}) = {\mathbf {y}}\), then we use \(h_{*}\) to denote the tangent map of \(h\) at the origin and only need to prove

$$ h_{*} (\vec {T}_{1} \wedge \cdots \wedge \vec {T}_{n-1}) \parallel \vec {S}_{1} \wedge \cdots \wedge \vec {S}_{n-1} $$
(2.7)

where \(\vec {T}_{j}\) is the original \(\vec {T}_{j} ({\mathbf{0}}; 0) =\nabla _{{\mathbf {x}}} \partial _{\xi _{j}} \phi ({\mathbf{0}}; 0)\) as before and \(\vec {S}_{j}=\nabla _{{\mathbf {y}}} \partial _{\xi _{j}} \phi ({\mathbf{0}}; 0)\) is defined similarly in the tangent space of \(({\mathbf{0}}; 0)\) in \({\mathbf {y}}\) coordinates.

Now everything can be computed in the tangent space of \(({\mathbf{0}}; 0)\) in terms of \(n\)-variate functions \(\partial _{\xi _{j}} \phi (\cdot ; 0)\) and we now check (2.7) using linear algebra. View all vectors \(\vec {T}, \vec {S}\) as column vectors as before. Let \(J\) denote the Jacobian \(\frac{\partial {\mathbf {x}}}{\partial {\mathbf {y}}}|_{\mathbf{0}}\) with \(J_{ij} = \frac{\partial x_{i}}{\partial y_{j}}|_{\mathbf{0}}\). Then

$$ \vec {S}_{j} = J^{T} \vec {T}_{j}, \quad 1 \leq j \leq n-1. $$
(2.8)

One then computes, by considering the induced action of \(J^{T}\) on \((n-1)\)-fold wedge products, that

$$ \wedge _{j=1}^{n-1} \vec {S}_{j} = \det (J)\cdot (J^{-1}) \cdot ( \wedge _{j=1}^{n-1} \vec {T}_{j}). $$
(2.9)

Finally, note that the matrix of \(h_{*}\) is \(\frac{\partial {\mathbf {y}}}{\partial {\mathbf {x}}}|_{\mathbf{0}} = J^{-1}\). By (2.9) we see both sides of (2.7) are parallel to \((J^{-1}) \cdot (\wedge _{j=1}^{n-1} \vec {T}_{j})\) and thus (2.7) holds.
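The linear-algebra identity (2.9), and hence (2.7), can be sanity-checked numerically for \(n=3\), where the \((n-1)\)-fold wedge product of two vectors is the cross product (our sketch, with sample numerical data):

```python
import numpy as np

J = np.array([[2., 1., 0.], [0., 1., 3.], [1., 0., 1.]])   # sample Jacobian, det J = 5
T1, T2 = np.array([1., 2., 3.]), np.array([4., 0., 1.])    # sample columns T_j

S1, S2 = J.T @ T1, J.T @ T2          # (2.8): S_j = J^T T_j
lhs = np.cross(S1, S2)               # wedge of n-1 = 2 vectors in R^3
rhs = np.linalg.det(J) * (np.linalg.inv(J) @ np.cross(T1, T2))   # right side of (2.9)
print(np.allclose(lhs, rhs))         # True
```

This is the classical identity \((Ma)\times (Mb)=\det (M)\,M^{-T}(a\times b)\) with \(M=J^{T}\).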

Next we show that (2.3) continues to hold after any diffeomorphism in \(\xi \). First note that this property is preserved under any linear change of variables in \(\xi \). Indeed, such a change multiplies the whole vector field \(\vec {V}\) by a nonzero scalar (equal to the determinant of the linear change of variables in \(\xi \)), and results in a constant congruent transformation of \(\nabla _{\xi}^{2} \phi \) everywhere.

Hence, it suffices to show that (2.3) gets preserved if one does a change of variables of the following shape:

$$ \xi _{j} \mapsto \xi _{j} + \xi ^{T} A_{j} \xi + \text{ higher order terms} $$
(2.10)

where each \(A_{j}\) (\(1 \leq j\leq n-1\)) is a symmetric \((n-1)\times (n-1)\) matrix. Assume the Taylor expansion of \(\phi \) near the origin is

$$\begin{aligned} \phi ({\mathbf {x}}; \xi ) = & a ({\mathbf {x}}) + b (\xi ) + \sum _{i, j} c_{i,j} x_{i} \xi _{j} + \sum _{i} x_{i}\cdot \xi ^{T} D_{i} \xi +\sum _{j} \xi _{j} \cdot {\mathbf {x}}^{T} E_{j} {\mathbf {x}} \\ + & \sum _{i_{1}, i_{2} , j_{1}, j_{2}} f_{i_{1}, i_{2}, j_{1}, j_{2}} x_{i_{1}} x_{i_{2}} \xi _{j_{1}} \xi _{j_{2}} + g({\mathbf {x}}; \xi ) \end{aligned}$$
(2.11)

where, to simplify notation, we write \(t=x_{n}\), the functions \(a, b\) are polynomials of degree \(\leq 3\), and \(g\) is a sum of terms of order \(\geq 3\) in \({\mathbf {x}}\) and terms of order \(\geq 3\) in \(\xi \); these terms will not play a role in verifying (2.3) at the origin. Here \(D_{i}\) and \(E_{j}\) are square matrices.

We use \(\tilde{\phi}\) to denote the new expression of \(\phi \) under the change of variable (2.10), and use \(\widetilde{\vec {T}}_{j}\) and \(\widetilde{\vec {V}}\) to denote the counterpart of \(\vec {T}_{j}\) and \(\vec {V}\) after the change of variable (2.10).

Note that \(\tilde{\phi}\) has the Taylor expansion

$$\begin{aligned} \tilde{\phi} ({\mathbf {x}}; \xi ) = & \sum _{i, j} c_{i,j} x_{i} \xi _{j} + \sum _{i} x_{i} \cdot \xi ^{T} (D_{i}+ \sum _{j} c_{i, j} A_{j}) \xi + \sum _{j} \xi _{j} \cdot {\mathbf {x}}^{T} E_{j} {\mathbf {x}} \\ &+ \sum _{i_{1}, i_{2} , j_{1}, j_{2}} f_{i_{1}, i_{2}, j_{1}, j_{2}} x_{i_{1}} x_{i_{2}} \xi _{j_{1}} \xi _{j_{2}} +\sum _{j} \xi ^{T} A_{j} \xi \cdot {\mathbf {x}}^{T} E_{j} {\mathbf {x}}\\ & + \text{ terms playing no role } \end{aligned}$$
(2.12)

Comparing (2.11) and (2.12), we see that \(\vec {V}\) and \(\widetilde{\vec {V}}\) differ at the origin by terms of order at least 1 in \({\mathbf {x}}\) or \(\xi \). Let \(\vec {V}_{0} = (V_{0, 1}, \ldots , V_{0, n})\) denote the common value of \(\vec {V}\) and \(\widetilde{\vec {V}}\) at \(({\mathbf{0}}; 0)\). Now

$$\begin{aligned} & (\widetilde{\vec {V}}\cdot \nabla _{{\mathbf {x}}}) \nabla ^{2}_{\xi} \tilde{\phi} ({\mathbf{0}}; 0)-(\vec {V}\cdot \nabla _{{\mathbf {x}}}) \nabla ^{2}_{ \xi} \phi ({\mathbf{0}}; 0) \\ = & \sum _{i} \sum _{j} V_{0, i} c_{ij} (2A_{j}) = 0 \end{aligned}$$

where the last equality holds because, by definition, \(\vec {V}_{0}\) is orthogonal to each \((c_{1j}, \ldots , c_{nj})^{T}\).

Next we compare \((\widetilde{\vec {V}}\cdot \nabla _{{\mathbf {x}}})^{2} \nabla ^{2}_{\xi} \tilde{\phi}\) and \((\vec {V}\cdot \nabla _{{\mathbf {x}}})^{2} \nabla ^{2}_{\xi} \phi \). We will show that they are also equal by using (2.11) and (2.12) to compute their difference. We begin by showing that near the origin,

$$ (\widetilde{\vec {V}}- \vec {V})({\mathbf {x}}; \xi ) = O(|\xi | + |{\mathbf {x}}|^{2}), $$
(2.13)

which will greatly simplify our computation. Indeed, using the definition (2.2) of \(\widetilde{\vec {V}}\) and \(\vec {V}\), we see that both vectors have the same constant term. Moreover, observe that in the wedge definition (2.2) of \(\widetilde{\vec {V}}\) and \(\vec {V}\), all the \(\vec {T}_{j}\) have the same linear terms in \({\mathbf {x}}\), since the coefficients of all \(x_{i_{1}} x_{i_{2}} \xi _{j}\) terms are the same for \(\phi \) and \(\tilde{\phi}\). Hence \(\widetilde{\vec {V}}\) and \(\vec {V}\) also have the same linear term in \({\mathbf {x}}\), and (2.13) is seen to hold.

By (2.13), if we write

$$ \vec {V} ({\mathbf {x}}; \xi ) = \vec{V}_{0} + \sum _{i=1}^{n} x_{i} \vec{U}_{i} + O(|\xi | + |{\mathbf {x}}|^{2}), $$
(2.14)

we can reduce the effect of both \((\widetilde{\vec {V}}\cdot \nabla _{{\mathbf {x}}})^{2}\) and \((\vec {V}\cdot \nabla _{{\mathbf {x}}})^{2}\) at the origin to the action of a constant coefficient differential operator \(\mathcal{D}_{0} = (\vec{V}_{0}\cdot \nabla _{{\mathbf {x}}})^{2} + \sum _{j} V_{0, j} (\vec{U}_{j}\cdot \nabla _{{\mathbf {x}}})\). Hence

$$\begin{aligned} & (\widetilde{\vec {V}}\cdot \nabla _{{\mathbf {x}}})^{2} \nabla ^{2}_{\xi} \tilde{\phi}({\mathbf{0}}; 0)-(\vec {V}\cdot \nabla _{{\mathbf {x}}})^{2} \nabla ^{2}_{ \xi} \phi ({\mathbf{0}}; 0) \\ = & \mathcal{D}_{0}(\tilde{\phi} - \phi ) ({\mathbf{0}}; 0)= \sum _{j} 4( \vec {V}_{0}^{T} E_{j} \vec {V}_{0}) A_{j} + 2\sum _{j} \vec {C}_{j}^{T} U \vec {V}_{0} A_{j} \end{aligned}$$
(2.15)

where the column vector \(\vec{C}_{j} = (c_{1, j}, \ldots , c_{n, j})^{T}\) and the \(n\times n\) matrix \(U\) has its \(k\)-th column equal to \(\vec {U}_{k}\).

In order to show (2.15) gives 0, it suffices to show the stronger statement

$$ 2E_{j} \vec {V}_{0} + U^{T} \vec {C}_{j} = 0, \forall 1 \leq j \leq n-1. $$
(2.16)

We prove (2.16) entrywise. Fix an arbitrary \(1 \leq k \leq n\); we need to prove

$$ 2\vec {E}_{j; k}\cdot \vec {V}_{0} + \vec {U}_{k} \cdot \vec {C}_{j} = 0 $$
(2.17)

where \(\vec {E}_{j; k}\) is the \(k\)-th row (or column) of \(E_{j}\). To show this we recall how one obtains \(\vec {V}_{0}\) and \(\vec {U}_{k}\). Recall from (2.2) and (2.11),

$$ \vec {V}_{0} = \vec {C}_{1} \wedge \vec {C}_{2} \wedge \cdots \wedge \vec {C}_{n-1}. $$
(2.18)

To compute \(\vec {U}_{k}\), note that it is the coefficient of \(x_{k}\) in (2.14). Its computation boils down to expanding the \(\vec {T}_{j}\) in (2.1) and (2.2) into the constant term, the \(x_{k}\) term and higher terms. We see

$$\begin{aligned} \vec {U}_{k} ={}& 2\vec {E}_{1; k} \wedge \vec {C}_{2} \wedge \cdots \wedge \vec {C}_{n-1} + 2\vec {C}_{1} \wedge \vec {E}_{2; k} \wedge \cdots \wedge \vec {C}_{n-1} + \cdots \\ &{}+ 2\vec {C}_{1} \wedge \cdots \wedge \vec {C}_{n-2} \wedge \vec {E}_{n-1; k}. \end{aligned}$$
(2.19)

Now only one term in the above expression, namely \(2\vec {C}_{1} \wedge \cdots \wedge \vec {E}_{j; k} \wedge \cdots \wedge \vec {C}_{n-1}\), contributes to \(\vec {U}_{k}\cdot \vec {C}_{j}\): every other term contains \(\vec {C}_{j}\) itself in the wedge, so its dot product with \(\vec {C}_{j}\) vanishes. Hence

$$ \vec {U}_{k} \cdot \vec {C}_{j} = 2\det (\vec {C}_{1}, \ldots , \vec {C}_{j-1}, \vec {E}_{j; k}, \vec {C}_{j+1}, \ldots , \vec {C}_{n-1}, \vec {C}_{j}). $$
(2.20)

But we also have

$$ 2\vec {E}_{j; k} \cdot \vec {V}_{0} = 2\det (\vec {C}_{1}, \ldots , \vec {C}_{n-1}, \vec {E}_{j; k}). $$
(2.21)

Since the two determinants differ by swapping a single pair of columns (the columns \(\vec {E}_{j; k}\) and \(\vec {C}_{j}\)), the two expressions add up to 0, which proves (2.17), and thus (2.16) holds. This concludes the proof that both terms in (2.3) at \(({\mathbf{0}}; 0)\) remain the same under the change of variables (2.10), finishing the proof that the property (2.3) is preserved under every diffeomorphism in \({\mathbf {x}}\) only or in \(\xi \) only.
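The chain (2.18)–(2.21) can be verified numerically. Below is a minimal sanity check for \(n=3\) in exact rational arithmetic, confirming (2.17) directly from the definitions of \(\vec{V}_{0}\) and \(\vec{U}_{k}\); the columns \(C_{j}\) and symmetric matrices \(E_{j}\) are arbitrary test data:

```python
from fractions import Fraction as F

def cross(u, v):
    # the (n-1)-fold wedge for n = 3
    return [u[1]*v[2] - u[2]*v[1],
            u[2]*v[0] - u[0]*v[2],
            u[0]*v[1] - u[1]*v[0]]

def dot(u, v):
    return sum(a*b for a, b in zip(u, v))

def col(M, k):
    return [M[i][k] for i in range(3)]

# arbitrary test data: columns C_1, C_2 and symmetric matrices E_1, E_2
C = [[F(1), F(0), F(2)],
     [F(0), F(1), F(1)]]                       # C[j] plays the role of C_{j+1}
E = [[[F(1), F(2), F(0)], [F(2), F(3), F(1)], [F(0), F(1), F(1)]],
     [[F(0), F(1), F(1)], [F(1), F(2), F(0)], [F(1), F(0), F(4)]]]

V0 = cross(C[0], C[1])                         # (2.18)
for k in range(3):
    # (2.19) specialized to n = 3
    Uk = [2*a + 2*b for a, b in zip(cross(col(E[0], k), C[1]),
                                    cross(C[0], col(E[1], k)))]
    for j in range(2):
        # (2.17): 2 E_{j;k} . V_0 + U_k . C_j = 0
        assert 2*dot(col(E[j], k), V0) + dot(Uk, C[j]) == 0
```

The inner assertion is exactly the statement that (2.20) and (2.21) cancel.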

Now we just need to prove that if the phase function is in the normal form (1.7), Bourgain’s condition (1.10) at the origin coincides with (2.3). Indeed, in the normal form, the expression of every \(\vec {T}_{j}\) has no linear term in \({\mathbf {x}}\), and thus the action of \(\vec {V}\cdot \nabla _{{\mathbf {x}}}\) at the origin is the same as that of \(\partial _{t}\), and the action of \((\vec {V}\cdot \nabla _{{\mathbf {x}}})^{2}\) at the origin is the same as that of \(\partial _{t}^{2}\), verifying the above claim. □

The first part of the above proof immediately implies the following useful corollary.

Lemma 2.3

Let \(\lambda ({\mathbf {x}}; \xi )\) be a smooth, nowhere vanishing scalar function. Then a phase function \(\phi ({\mathbf {x}}; \xi )\) satisfying conditions (H1) and (H2) satisfies Bourgain’s condition at \(({\mathbf {x}}_{0}; \xi _{0})\) if and only if (2.3) holds with \(\vec {V}\) replaced by \(\lambda \cdot \vec {V}\).

2.2 Proof of Theorem 1.1

Given a phase function \(\phi ({\mathbf {x}}; \xi )\), we would like to show that (1.4) may fail for some \(q>\frac{2n}{n-1}\) and some \(f\in L^{\infty}\). Let us turn to the dual form of it:

$$ \int \Big| \int g({\mathbf {x}}) e^{iN\phi ({\mathbf {x}}; \xi )}a({\mathbf {x}}; \xi )d{\mathbf {x}}\Big| d \xi \lesssim N^{-n/q}\|g\|_{q'}, $$
(2.22)

for \(q>\frac{2n}{n-1}\). Let \(\delta \simeq N^{-1/2}\). Consider a \(\delta \)-net \(\{\xi _{\alpha}\}\) in the \(\xi \) variable. For each \(\alpha \), we will introduce a curved tube \(T_{\alpha}\), whose bottom is a disc of radius \(\delta \) and whose length is about \(\delta ^{\lambda}\), with \(\lambda \) to be determined. Let \({\mathbf {T}}\) denote the collection of tubes \(\{T_{\alpha}\}\) and let \(\#{\mathbf {T}}\) denote the number of tubes. Moreover, we will find a function \(\Omega =(\Omega _{1}(\xi ), \ldots , \Omega _{n-1}(\xi ))\) such that for every \(\alpha \) and every \({\mathbf {x}}\in T_{\alpha}\), we have

$$ |\nabla _{\xi}\phi ({\mathbf {x}}; \xi _{\alpha})-\Omega (\xi _{\alpha})| \lesssim \delta . $$
(2.23)

Afterwards, let us set

$$ g({\mathbf {x}})=g_{\epsilon}({\mathbf {x}})=\sum _{\alpha}\epsilon _{\alpha} e^{-iN \phi ({\mathbf {x}}; \xi _{\alpha})}\chi _{T_{\alpha}}({\mathbf {x}}) $$
(2.24)

where \(\chi _{T_{\alpha}}\) is the indicator function of \(T_{\alpha}\) and \(\epsilon _{\alpha}\) takes \(\pm 1\) randomly. If (2.22) holds, then

$$ \begin{aligned} & \int \max _{\alpha} \Big| \int _{T_{\alpha}} e^{iN[\phi ({\mathbf {x}}; \xi )-\phi ({\mathbf {x}}; \xi _{\alpha})]} a({\mathbf {x}}; \xi ) d{\mathbf {x}}\Big| d\xi \\ & \le \int \Big( \sum _{\alpha} \Big| \int _{T_{\alpha}} e^{iN[\phi ({\mathbf {x}}; \xi )-\phi ({\mathbf {x}}; \xi _{\alpha})]} a({\mathbf {x}}; \xi ) d{\mathbf {x}}\Big| ^{2} \Big) ^{1/2} d\xi \end{aligned} $$
(2.25)

which, by Khintchine’s inequality and (2.22), is bounded by

$$ N^{-n/q} \Big( \int (\sum _{\alpha} \chi _{T_{\alpha}})^{q'/2}d{\mathbf {x}}\Big) ^{1/q'}. $$
(2.26)

We apply Hölder’s inequality, and obtain

$$ \text{(2.26)} \lesssim N^{-n/q} \big|\bigcup _{\alpha}T_{\alpha} \big|^{\frac{1}{2}-\frac{1}{q}} (\sum |T_{\alpha}|)^{\frac{1}{2}}. $$
(2.27)

Next, we give a lower bound for the left hand side of (2.25). Recall the function \(\Omega \). Write

$$\begin{aligned} \phi ({\mathbf {x}}; \xi ) &= \phi ({\mathbf {x}}; \xi _{\alpha}) + \langle \nabla _{\xi}\phi ({\mathbf {x}}; \xi _{\alpha}), \xi -\xi _{\alpha}\rangle + O(\delta ^{2}) \\ &= \phi ({\mathbf {x}}; \xi _{\alpha}) + \langle \Omega (\xi _{\alpha}), \xi -\xi _{\alpha}\rangle + O(\delta ^{2}), \quad {\mathbf {x}}\in T_{\alpha}. \end{aligned}$$
(2.28)

Therefore the left hand side of (2.25) is at least

$$ \delta ^{n-1} \sum |T_{\alpha}|. $$
(2.29)

We combine (2.27) and (2.29), and obtain

$$ \delta ^{\lambda /2}\lesssim N^{-n/q}\delta ^{-(n-1)}\big|\bigcup _{ \alpha}T_{\alpha}\big|^{1/2-1/q}. $$
(2.30)

In the remaining part, we will construct \(\{T_{\alpha}\}\) so that the union of these tubes is small. So far we have been following Bourgain’s framework in [3]. The improvement over Bourgain’s result in dimension \(n=3\) comes from the construction of the tubes \(\{T_{\alpha}\}\).

Let us write our phase function in its normal form at the origin, that is,

$$ \phi ({\mathbf {x}}; \xi )=\langle x,\xi\rangle +t\langle A\xi,\xi\rangle +O(|t||\xi |^{3}+| {\mathbf {x}}|^{2} |\xi |^{2}). $$
(2.31)

The following lemma is the key for the construction of \(\{T_{\alpha}\}\).

Lemma 2.4

If Bourgain’s condition fails at the origin, then we can find \(\Omega \) with \(\Omega (0)=0\) such that the following holds: let \(X_{t}(\xi ): \mathbb{R}^{n-1}\to \mathbb{R}^{n-1}\) denote the unique solution to \(\nabla _{\xi}\phi (x, t; \xi )=\Omega (\xi )\) in the \(x\) variable; then

$$ \big|\det \nabla _{\xi} X_{t}\big|=O(|(t, \xi )|^{n}), $$
(2.32)

for \(t, \xi \) small.

Let us assume Lemma 2.4 and finish the proof of Theorem 1.1. As a consequence of this lemma, if we define

$$ T_{\alpha}=\{(x, t): |x-X_{t}(\xi _{\alpha})|\le \delta , 0\le t\le \delta ^{\lambda}\}, $$
(2.33)

then (2.23) holds by the mean value theorem. Here the value that \(\lambda >0\) takes is not relevant; \(\lambda \) can even be very close to zero. Next, pick \(\lambda =1/(n+1)\). Lemma 2.4 then says that the union of the tubes is small:

Claim 2.5

$$ \Big| \bigcup _{\alpha : |\xi _{\alpha}|\le \delta ^{\lambda}} T_{\alpha} \Big| \lesssim _{\lambda} \delta ^{2n\lambda}. $$
(2.34)

Let us first accept the above claim. For \(|\xi _{\alpha}|\ge \delta ^{\lambda}\), we will construct tubes in the same way. More precisely, we will cut the frequency space of \(\xi \) into balls of radius \(\delta ^{\lambda}\), and on each ball, we repeat the above construction. Therefore

$$ \Big| \bigcup _{\alpha} T_{\alpha} \Big| \lesssim \delta ^{2n\lambda} \delta ^{-(n-1)\lambda}. $$
(2.35)

Substituting this into (2.30) will give us

$$ q\ge \frac{2(2n^{2}+n-1)}{2n^{2}-n-2}>\frac{2n}{n-1}. $$
(2.36)

This finishes the proof of the theorem, modulo the proof of Lemma 2.4 and the proof of Claim 2.5.
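The exponent bookkeeping leading to (2.36) can be checked with exact rational arithmetic: substituting (2.35) into (2.30), writing \(N^{-n/q}=\delta ^{2n/q}\) and \(\lambda = 1/(n+1)\), and comparing powers of \(\delta \) shows that the exponents balance exactly at \(q = \frac{2(2n^{2}+n-1)}{2n^{2}-n-2}\). A minimal sketch (this verifies only the arithmetic, not the analytic estimates):

```python
from fractions import Fraction as Fr

# Substitute (2.35) into (2.30): N^{-n/q} = delta^{2n/q} (delta ~ N^{-1/2}),
# lambda = 1/(n+1), and |union of T_alpha| <~ delta^{(n+1)*lambda}.
for n in range(2, 20):
    lam = Fr(1, n + 1)
    # claimed threshold in (2.36)
    qstar = Fr(2 * (2 * n**2 + n - 1), 2 * n**2 - n - 2)
    lhs = lam / 2                                  # exponent of delta on the left of (2.30)
    rhs = Fr(2 * n) / qstar - (n - 1) + (n + 1) * lam * (Fr(1, 2) - 1 / qstar)
    assert lhs == rhs                              # exact balance at q = qstar
    assert qstar > Fr(2 * n, n - 1)                # and qstar beats 2n/(n-1)
```

For instance, \(n=3\) gives the threshold \(q = 40/13 > 3 = \frac{2n}{n-1}\).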

Proof of Lemma 2.4.

Let us start with (2.31). Write

$$ \phi ({\mathbf {x}}; \xi )=\langle x,\xi\rangle +t\langle A\xi,\xi\rangle +t^{2}Q_{2}(\xi )+ \phi _{4}({\mathbf {x}}; \xi ), $$
(2.37)

where \(Q_{2}(\xi )\) is a quadratic form in \(\xi \). The assumption that Bourgain’s condition fails at the origin is then equivalent to saying that

$$ \mathrm{Hessian}(Q_{2}) \text{ is not a multiple of } A. $$
(2.38)

Let \(\Omega =(\Omega _{1}(\xi ), \ldots , \Omega _{n-1}(\xi ))\) be smooth with \(\Omega (0)=0\). We need to solve

$$ x+tA\xi +t^{2}\nabla _{\xi}Q_{2}(\xi )+\nabla _{\xi}\phi _{4}({\mathbf {x}}; \xi )=\Omega (\xi ). $$
(2.39)

It is not difficult to see that when \({\mathbf {x}}\) and \(\xi \) are small, the solution is unique. We solve (2.39) and write the solution as

$$ X_{t}(\xi )=-tA\xi -t^{2}B\xi +tP_{1}(\xi )+t^{2}P_{2}(\xi )+\sum _{j=3}^{n} t^{j}P_{j}(\xi )+\widetilde{\Omega}(\xi )+O(|(t, \xi )|^{n+1}), $$
(2.40)

where \(B\) is an \((n-1)\times (n-1)\) matrix that is not a constant multiple of \(A\), \(B\xi =\nabla _{\xi} Q_{2}(\xi )\), \(\widetilde{\Omega}\) depends on \(\Omega \), \(P_{i}(\xi )\) is a polynomial of degree \(n-i\) with lowest order term of degree 2 for \(i=1, 2\), and \(P_{j}(\xi )\) is a polynomial of degree \(n-j\) with lowest order term of degree 1 for \(j\ge 3\). Here each \(P_{i}\) \((1\le i\le n)\) depends on \(\Omega \), but it is important that the matrix \(B\) does not depend on \(\Omega \). Before computing \(\nabla _{\xi} X_{t}\), let us do the change of variables

$$ -A\xi +P_{1}(\xi )\mapsto A\eta . $$
(2.41)

Write the right hand side of (2.40) in the \(\eta \) variable:

$$ tA\eta +t^{2}B\eta +t^{2}P'_{2}(\eta )+\sum _{j=3}^{n} t^{j}P'_{j}(\eta )+\widetilde{\Omega}(\eta )+O(|(t, \eta )|^{n+1})=: X'_{t}(\eta ), $$
(2.42)

where \(P'_{2}(\eta )\) is a polynomial of degree \(n-2\) with lowest order term of degree 2, and \(P'_{j}(\eta )\) is a polynomial of degree \(n-j\) with lowest order term of degree 1 for \(j\ge 3\). As the change of variables in (2.41) is non-degenerate, in order to guarantee (2.32), we just need to show that

$$ \big|\det \nabla _{\eta} X'_{t}\big|=O(|(t, \eta )|^{n}). $$
(2.43)

When computing (2.43), we will see more clearly why it is convenient to do the change of variables in (2.41). Write

$$ X'_{t}(\eta )=tA\eta +t^{2}B\eta +\widetilde{\Omega}(\eta )+\phi '_{4}(t, \eta ). $$
(2.44)

Note that the lowest order term in \(\phi '_{4}\), jointly in \(t\) and \(\eta \) variables, is four. Compute

$$ \nabla _{\eta} X'_{t}=tA+t^{2}B+\nabla _{\eta}\widetilde{\Omega}(\eta )+\nabla _{\eta}\phi '_{4}(t, \eta ). $$
(2.45)

Next we will compute the determinant. As \(A\) is non-degenerate, when computing the determinant, we can without loss of generality assume that \(A\) is the identity matrix. Moreover, by using Jordan normal forms for \(B\), and using the fact that \(B\) is not a multiple of \(A\), we can without loss of generality assume that \(B\) is of the form \([b_{ij}]_{1\le i, j\le n-1}\) with \(b_{i1}=b_{i2}=0\) for every \(i\ge 3\), and that the leading principal \(2\times 2\) block is of one of the following forms

$$ \begin{bmatrix} \gamma & 1 \\ 0 & \gamma \end{bmatrix} \quad \text{or} \quad \begin{bmatrix} \gamma ' & 0 \\ 0 & 0 \end{bmatrix} \quad \text{or} \quad \begin{bmatrix} \gamma _{1} & -\gamma _{2} \\ \gamma _{2} & \gamma _{1} \end{bmatrix} $$
(2.46)

where \(\gamma , \gamma ', \gamma _{1}, \gamma _{2}\in \mathbb{R}\) and \(\gamma '\neq 0, \gamma _{2}\neq 0\). Write

$$ \nabla _{\eta} X'_{t}(\eta )=\nabla _{\eta}\widetilde{\Omega}(\eta )+ \begin{bmatrix} * & * & 1_{t, \eta} & 1_{t, \eta} & \cdots \\ * & * & 1_{t, \eta} & 1_{t, \eta} & \cdots \\ 3_{t, \eta} & 3_{t, \eta} & 1_{t, \eta} & 1_{t, \eta} & \cdots \\ 3_{t, \eta} & 3_{t, \eta} & 1_{t, \eta} & 1_{t, \eta} & \cdots \\ \vdots & \vdots & \vdots & \vdots & \ddots \end{bmatrix} $$
(2.47)

where \(i_{t, \eta}\) means that the lowest order in \(t, \eta \) is \(i\), for \(i=1, 3\). If we are in the first case in (2.46), then we pick \(\widetilde{\Omega}\) such that

$$ \nabla _{\eta}\widetilde{\Omega}= \begin{bmatrix} 0 & 0 & 0 & \cdots \\ 1 & 0 & 0 & \cdots \\ 0 & 0 & 0 & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{bmatrix} $$
(2.48)

If we are in the second case, then we pick \(\widetilde{\Omega}\) such that

$$ \nabla _{\eta}\widetilde{\Omega}=\frac{1}{\gamma '} \begin{bmatrix} 1 & 1 & 0 & \cdots \\ -1 & -1 & 0 & \cdots \\ 0 & 0 & 0 & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{bmatrix} $$
(2.49)

The last case in (2.46) can be handled in the same way. These choices of \(\widetilde{\Omega}\) will guarantee that (2.43) holds. In the end, to find \(\Omega \) from \(\widetilde{\Omega}\), we just need to invert the change of variables in (2.41). □
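One can check by exact rational arithmetic that, with these corrections, the determinant of the leading \(2\times 2\) block of (2.45) vanishes to order 3 in \(t\); the remaining diagonal entries, each of order 1, then supply the extra orders needed for (2.43). A minimal sketch with arbitrary sample values of \(\gamma \), \(\gamma '\) and \(t\); the matrices \([[0,0],[1,0]]\) and \((1/\gamma ')[[1,1],[-1,-1]]\) below are the leading blocks of one valid choice of \(\nabla _{\eta}\widetilde{\Omega}\) for the first two cases of (2.46):

```python
from fractions import Fraction as F

def det2(a, b, c, d):
    return a*d - b*c

ts = [F(1, 7), F(-2, 5), F(1, 100)]

# Case 1: leading block of t*A + t^2*B is [[t + g*t^2, t^2], [0, t + g*t^2]];
# adding the correction [[0, 0], [1, 0]] leaves a determinant of order 3 in t
g = F(3, 2)
for t in ts:
    d1 = det2(t + g*t**2, t**2, F(1), t + g*t**2)
    assert d1 == 2*g*t**3 + g**2*t**4

# Case 2: leading block [[t + g2*t^2, 0], [0, t]];
# adding (1/g2)*[[1, 1], [-1, -1]] gives determinant exactly g2*t^3
g2 = F(5, 3)
a = 1 / g2
for t in ts:
    d2 = det2(t + g2*t**2 + a, a, -a, t - a)
    assert d2 == g2*t**3
```

In both cases the constant, linear and quadratic terms in \(t\) cancel exactly, which is the mechanism behind (2.43).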

Proof of Claim 2.5

Let us start by sketching the ideas in the proof. To begin with, we replace the set \(\bigcup _{\alpha : |\xi _{\alpha}|\le \delta ^{\lambda}} T_{\alpha}\) by a larger set \(\mathcal {N}_{2\delta} (X (B))\), where \(X\) is the map \((\xi , t) \mapsto (X_{t} (\xi ), t)\), \(B\) is a ball of radius \(O(\delta ^{\lambda})\) around the origin and \(\mathcal {N}_{2\delta}\) refers to the \(2\delta \) neighbourhood. Now intuitively the volume of \(\mathcal {N}_{2\delta} (X (B))\) depends on the volume of \(X (B)\) and the “surface area” of \(X (B)\). We will control the volume of \(X (B)\) by Lemma 2.4. For the “surface area”, we first observe that if \(\phi \) and \(\Omega \) are polynomials, \(\partial X (B)\) is contained in a nice semialgebraic set of dimension \(< n\) and hence has a controlled “surface area” by tools from real algebraic geometry. Finally, the general situation can be reduced to the above polynomial situation by a Taylor series approximation. To establish the semialgebraicity above, we will use quantifier elimination based on the Tarski-Seidenberg theorem. For a recent application of quantifier elimination in Kakeya and restriction problems that also helped motivate the present proof, see [11]. The tools we need from real algebraic geometry can be found in the references [1, 23].

We now present the details of the proof. First we claim that, without loss of generality, one may assume \(\phi \) and all components of \(\Omega \) are polynomials of degree \(O_{\lambda}(1)\). To see this, let \(M\geq 5\) be a large positive integer to be determined later and replace \(\phi \) and \(\Omega \) by their degree \(M\) Taylor approximations. Since \(M\geq 5\), \(\phi \) stays a legitimate phase function and Bourgain’s condition continues to fail at the origin. Moreover, whenever \(|t|, |\xi _{\alpha}| \leq \delta ^{\lambda}\), the value of \(\Omega (\xi _{\alpha})\) changes by \(O(\delta ^{(M+1)\lambda})\). Thus for these \(t\) and \(\xi _{\alpha}\), the distance between the new \(X_{t}(\xi _{\alpha})\) and the old one is \(O(\delta ^{(M+1)\lambda})\) by the nondegeneracy of \(\phi \). Since we only care about the volume of the union of \(\delta \)-neighborhoods, it suffices to choose \(M > \frac{1}{\lambda}\) so that each old \(T_{\alpha}\) is contained in the twice-thickening of the corresponding new \(T_{\alpha}\). Now the old situation is reduced to the new situation where \(\phi \) and all components of \(\Omega \) are polynomials of degree \(O_{\lambda}(1)\).

Take \(B\) to be a ball of radius \(O(\delta ^{\lambda})\) centered at the origin in the \((\xi , t)\) space containing all \((\xi , t)\) with \(|\xi |, |t| \leq \delta ^{\lambda}\). Let \(X\) denote the map \((\xi , t) \mapsto (X_{t} (\xi ), t)\). \(X\) is smooth near the origin by the implicit function theorem. By definition, \(\bigcup _{\alpha : |\xi _{\alpha}|\le \delta ^{\lambda}} T_{\alpha}\) is contained in \(\mathcal {N}_{2\delta} (X (B))\). It suffices to prove

$$ \Big| \mathcal {N}_{2\delta} (X (B)) \Big| \lesssim _{\lambda} \delta ^{2n\lambda}. $$
(2.50)

Let us understand the geometry of \(X(B)\). For a point in \(\partial X(B)\): either it is in \(X(B)\), and then it lies in \(X(\mathrm{Sing}(X; B))\cup X(\partial B)\), where \(\mathrm{Sing}(X; B)\) is the singular set of \(X\) inside \(B\) (since \(X\) is an open map near any regular point of the interior of \(B\)); or it is outside of \(X(B)\), and then by a compactness argument it is in \(X(\partial B)\). Since

$$ \mathcal {N}_{2\delta} (X (B)) \subseteq X (B) \cup \mathcal {N}_{2\delta} ( \partial X (B)), $$
(2.51)

we have

$$ \mathcal {N}_{2\delta} (X (B)) \subseteq X (B) \bigcup \mathcal {N}_{2\delta} (X( \mathrm{Sing}(X; B))) \bigcup \mathcal {N}_{2\delta} (X(\partial B)) $$
(2.52)

and will next bound the measures of all three sets on the right-hand side from above.

First we bound \(\big|{X (B)}\big|\). By the definition of \(X\) and Lemma 2.4, \(\big|\det \nabla X\big| = \big|\det \nabla _{\xi} X_{t}\big|=O(|(t, \xi )|^{n})\). Integrating on \(B\) we get

$$ \Big| {X (B)} \Big| \lesssim |B|\sup _{(\xi , t) \in B}|(\xi , t)|^{n} \lesssim \delta ^{2n\lambda}. $$
(2.53)

Next we bound \(\big|\mathcal {N}_{2\delta}( X(\mathrm{Sing}(X; B)))\big|\). By the chain rule and the non-degeneracy of \(\nabla _{x}\nabla _{\xi}\phi \) near 0, we can rewrite

$$ X(\mathrm{Sing}(X; B)) = \{(x, t): \exists \xi \text{ with } (\xi , t) \in B \text{ s.t. } \nabla _{\xi}\phi (x, t; \xi ) = \Omega (\xi ) \text{ and } \det \big(\nabla _{\xi}^{2}\phi (x, t; \xi ) - \nabla _{\xi}\Omega (\xi )\big) = 0\}. $$
(2.54)

We will analyze this set using tools from real algebraic geometry, and we first do some setup. We recall that a subset of some \(\mathbb{R}^{N}\) is semialgebraic if it can be obtained by finitely many steps of taking unions, intersections, or complements from algebraic sets. The complexity of a semialgebraic set is the smallest possible sum of the degrees of all polynomials appearing in a complete description of it. Section 2 of [11] has a good introduction, from the analyst’s viewpoint, to basic properties of the above notions, as well as to a quantitative quantifier elimination (or a quantitative Tarski-Seidenberg theorem) that we will use below. A semialgebraic set in \(\mathbb{R}^{N}\) has a dimension that is a non-negative integer \(\leq N\). See Chap. 5 of [1] for its basic properties.

Note that the ball \(B\) is a semialgebraic set of complexity \(O(1)\). Moreover, \(\phi \) and the components of \(\Omega \) are already polynomials of degree \(O_{\lambda}(1)\). Hence by quantitative quantifier elimination (or the quantitative Tarski-Seidenberg theorem, see Theorem 14.16 of [1]), \(X(\mathrm{Sing}(X; B))\) is a semialgebraic set of complexity \(O_{\lambda}(1)\). By Sard’s theorem, \(X(\mathrm{Sing}(X; B))\) has measure zero and thus has dimension \(< n\) (by Proposition 5.53 of [1]). Now by Corollary 5.7 of [23], \(X(\mathrm{Sing}(X; B))\) can be covered by \(O_{\lambda}(1) \cdot (\delta ^{\lambda -1})^{n-1}\) many \(\delta \)-balls. Hence

$$ \Big| \mathcal {N}_{2\delta} (X(\mathrm{Sing}(X; B))) \Big| \lesssim _{\lambda} \delta ^{1+ (n-1)\lambda}. $$
(2.55)

We remark that Corollary 5.7 of [23] can be viewed qualitatively as a generalization of Wongkew’s theorem [22] to the semialgebraic setting.

Finally we bound \(\big|\mathcal {N}_{2\delta} (X(\partial B))\big|\) via a similar application of real algebraic geometrical tools. First write

$$ X(\partial B)=\{(x, t): \exists \xi \text{ with } (\xi , t)\in \partial B \text{ s.t. } \nabla _{\xi}\phi (x, t; \xi )=\Omega (\xi )\} $$

and we see \(X(\partial B)\) is a semialgebraic set of complexity \(O_{\lambda} (1)\). Since \(X\) is smooth on the dilation \(2B\), \(X(\partial B)\) has zero measure and thus has dimension \(< n\). Applying Corollary 5.7 of [23] as before we get

$$ \Big| \mathcal {N}_{2\delta} (X(\partial B)) \Big| \lesssim _{\lambda} \delta ^{1+ (n-1)\lambda}. $$
(2.56)

Combining (2.52), (2.53), (2.55) and (2.56) and noticing that our \(\lambda = \frac{1}{n+1}\), we finish the proof of the claim. □

3 Polynomial Wolff axiom: proof of Theorem 1.2

In this section, we will prove Theorem 1.2. In particular, we will see that Bourgain’s condition also appears very naturally in the proof, see (3.32)-(3.34). This may not be surprising, as polynomial Wolff axioms are “opposite” to Bourgain’s Kakeya compression phenomenon (see the beginning of Sect. 2 for a brief discussion): polynomial Wolff axioms say that if a family of tubes is supported near algebraic varieties in \(\mathbb{R}^{n}\), then this family of tubes must point in a “small” number of directions; on the other hand, Bourgain’s Kakeya compression phenomenon says that (curved) Kakeya sets may concentrate near hyper-surfaces in \(\mathbb{R}^{n}\).

In order to prove this theorem, we first state and prove a generalized version of the polynomial Wolff axiom of Katz-Rogers [11]. We begin with some more notation. Let \(n\ge 2\). Suppose the map

$$ \Phi : \mathbb{R}^{n-1} \times \mathbb{R}\times \mathbb{R}^{n-1} \to \mathbb{R}^{n-1} $$
(3.1)

is smooth on a neighborhood of \([-1, 1]^{2n-1}\) with \(\|\Phi \|_{C^{k}} \lesssim _{k} 1\) for all \(k \geq 1\).

By a \(\delta \)-tube for cap \(\theta \) with respect to \(\Phi \), we mean some

$$ T_{\xi _{\theta}, v, \Phi} (\delta , 1):=\{(x, t) \in \mathbb{R}^{n}: |x- \Phi (v, t, \xi _{\theta})|\le \delta , |t|\le 1\} $$
(3.2)

where the \(\delta \) in the name indicates the “thickness” and the 1 in the name indicates the time span of the tube. For a collection \(\mathbb{T}\) of tubes \(\{T_{\xi _{\theta}, v, \Phi} (\delta , 1)\}\), we say that the tubes in \(\mathbb{T}\) point in different directions if all the underlying \(\theta \) for them are distinct.

Theorem 3.1

Generalized Polynomial Wolff Axiom

Suppose that for every choice of \(v \in [-1, 1]^{n-1}\) and \(\xi \in [-1, 1]^{n-1}\),

$$ \int _{-\delta ^{\epsilon}}^{\delta ^{\epsilon}} |\det (\nabla _{v} \Phi (v, t, \xi )\cdot M+\nabla _{\xi} \Phi (v, t, \xi ))|\mathrm{d} t \gtrsim _{\epsilon} \delta ^{C \epsilon}, \forall M \in \mathrm{Mat}_{(n-1) \times (n-1)} (\mathbb{R}) $$
(3.3)

for some constant \(C\) that depends only on the dimension \(n\), where \(\mathrm{Mat}_{(n-1)\times (n-1)} (\mathbb{R})\) stands for the set of \((n-1)\times (n-1)\) real matrices and the bound is uniform and independent of the choices of \(v, \xi \) and \(M\). Then for every collection \(\mathbb{T}\) of \(\delta \)-tubes pointing in different directions,

$$ \#\{T\in \mathbb{T}: T \subset S\}\le C(n, E, \epsilon )|S| \delta ^{1-n- \epsilon} $$
(3.4)

whenever \(S \subset B^{n}\) is a semialgebraic set of complexity \(\le E\).

Moreover the implied constant only depends on bounds of finitely many (depending on \(n, E, \epsilon \)) derivatives of \(\Phi \).

Proof of Theorem 3.1

When first reading the proof, we recommend fixing \(\Phi \); it will then be easy to see from the proof that the constant depends only on bounds of finitely many derivatives of \(\Phi \) when \(\Phi \) is allowed to vary.

In the proof we always think of \(\epsilon \) as fixed and always assume that \(\delta \) is sufficiently small (in a way that may depend on \(\epsilon \)), since otherwise the conclusion is easily seen to hold. Without loss of generality, we always assume \(|S| \gtrsim \delta ^{n-1}\) with a suitable absolute constant, since otherwise no \(T\) can lie in \(S\). Our proof will largely follow that of Theorem 3.1 in [11].

Our \(\Phi \) is not necessarily a semialgebraic map, and we would like to first make it semialgebraic by a Taylor approximation. Fix a \(0< \epsilon _{1} <\min \{0.1, \epsilon \}\) and fix a large \(K>\frac{2n^{3}}{\epsilon _{1}}\) (further constraints on both parameters will be determined below). Let \(N\) be the quantity on the left hand side of (3.4). We may assume \(N \geq 1\). Without loss of generality we can assume

$$ \#\{T_{\xi _{\theta}, v, \Phi} (\delta , 1)\in \mathbb{T}: T_{\xi _{\theta}, v, \Phi} (\delta , 1) \subset S, |\xi _{\theta}| \leq \delta ^{\epsilon _{1}}, |v|\leq \delta ^{\epsilon _{1}}\} \gtrsim \delta ^{2(n-1)\epsilon _{1}} N $$
(3.5)

and will focus on giving an upper bound of the number of these \(T_{\xi _{\theta}, v, \Phi} (\delta , 1)\). Without loss of generality we assume \(\Phi (0) = 0\).

Now replace \(\Phi \) by its \(K\)-th order Taylor approximation at the origin, called \(\Phi _{1}\). The new map \(\Phi _{1}\) is still in \(C^{\infty}\). For each \(T_{\xi _{\theta}, v, \Phi} (\delta , 1)\) in \(\mathbb{T}\) such that \(|\xi _{\theta}| \leq \delta ^{\epsilon _{1}}\) and \(|v|\leq \delta ^{\epsilon _{1}}\), we form a corresponding shrunken tube \(T_{\xi _{\theta}, v, \Phi _{1}} (\delta /2, 2\delta ^{\epsilon _{1}})\), defined similarly to (3.2) but with thickness \(\delta /2\) and time span \(|t| \leq 2\delta ^{\epsilon _{1}}\). By Taylor approximation it is easy to see that the entire

$$ T_{\xi _{\theta}, v, \Phi _{1}} (\delta /2, 2\delta ^{\epsilon _{1}}) \subset T_{\xi _{\theta}, v, \Phi} (\delta , 1) \subset S. $$
(3.6)

Moreover, by Taylor approximation, we see that the following analogue of (3.3) continues to hold for each \(|\xi | \leq \delta ^{\epsilon _{1}}\) and \(|v|\leq \delta ^{\epsilon _{1}}\), as long as \(K \gtrsim 1\) (when \(\delta \) is sufficiently small) and \(M \in \mathrm{Mat}_{(n-1)\times (n-1)} (\mathbb{R})\) has all entries of absolute value \(\leq \delta ^{-n}\):

$$ \int _{-\delta ^{\epsilon _{1}}}^{\delta ^{\epsilon _{1}}} |\det ( \nabla _{v} \Phi _{1}(v, t, \xi )\cdot M+\nabla _{\xi} \Phi _{1}(v, t, \xi ))|\mathrm{d} t \gtrsim _{\epsilon _{1}} \delta ^{C \epsilon _{1}} $$
(3.7)

As Katz-Rogers did in [11], we define the set

$$ L = \{(\xi , v): T_{\xi , v, \Phi _{1}} (\delta /3, 2\delta ^{ \epsilon _{1}}) \subset S\}. $$
(3.8)

Now that the map \(\Phi _{1}\) is a polynomial map, by definition \(L\) is semialgebraic with complexity \(O_{n, E, \epsilon _{1}, K} (1)\) (for the definition of semialgebraic sets and their complexity, and how to arrive at the present claim, see Sect. 2 of [11]). Note that if a tube \(T_{\xi _{\theta}, v, \Phi _{1}} (\delta /2, 2\delta ^{\epsilon _{1}}) \subset S\), then keeping \(v\) and perturbing \(\xi _{\theta}\) by any sufficiently small distance proportional to \(\delta \), the resulting \((\xi , v)\) ends up in \(L\) by definition. Hence we know that the measure

$$ |\{\xi : \exists v \text{ s.t. } (\xi , v) \in L\}| \gtrsim \delta ^{2(n-1) \epsilon _{1}} N \cdot \delta ^{n-1} \simeq \delta ^{(n-1)+2(n-1) \epsilon _{1}} N. $$
(3.9)

As in the proof of Theorem 3.1 in [11], next we apply the Tarski-Seidenberg theorem to obtain a semialgebraic section \(L' \subset L\) of complexity \(O_{n, E, \epsilon _{1}, K} (1)\) consisting of a single \((\xi , v)\) for each \(\xi \) appearing in the set in (3.9). Arguing as in [11], we see \(L'\) is an \((n-1)\)-dimensional subset of \(\mathbb{R}^{2n-2}\). Using Gromov’s lemma (cited as Lemma 2.3 in [11]) in the same way as on pages 1711-1712 of [11], we find two polynomial maps \(F, G: [0, \delta ^{\epsilon _{1}}]^{n-1} \to \mathbb{R}^{n-1}\) with \(\deg F, \deg G = O_{n, E, \epsilon _{1}, K} (1)\) and \(\|F\|_{C^{1}}, \|G\|_{C^{1}} \leq 1\) such that

$$ |G([0, \delta ^{\epsilon _{1}}]^{n-1})| \gtrsim _{n, E, \epsilon _{1}, K} \delta ^{(n-1)+C\epsilon _{1}} N $$
(3.10)

and that

$$ (\Phi _{1} (F(x), t, G(x)), t) \in S, \forall x \in [0, \delta ^{ \epsilon _{1}}]^{n-1} \text{ and } \forall |t| \leq \delta ^{\epsilon _{1}}. $$
(3.11)

Due to the technical point that we replaced \(\Phi \) by \(\Phi _{1}\), which need not satisfy (3.3) but only the weaker (3.7), we now pass to a subset \(B\) of \([0, \delta ^{\epsilon _{1}}]^{n-1}\) on which \(G\) has a reasonably large Jacobian. Define

$$ B = \{x \in [0, \delta ^{\epsilon _{1}}]^{n-1}: |\det \nabla _{x} G| \gtrsim _{n, E, \epsilon _{1}, K} \delta ^{(n-1)+C\epsilon _{1}} N\}. $$
(3.12)

We see that if the implied constant above is chosen carefully, then by Chebyshev’s inequality we continue to have the following inequality similar to (3.10):

$$ |G(B)| \gtrsim _{n, E, \epsilon _{1}, K} \delta ^{(n-1)+C\epsilon _{1}} N. $$
(3.13)

Now like in [11], we look at the volume of

$$ \Delta = \{(\Phi _{1} (F(x), t, G(x)), t): x \in B, |t| \leq \delta ^{ \epsilon _{1}}\}. $$
(3.14)

On one hand, \(\Delta \) is contained in \(S\) and thus

$$ |\Delta | \leq |S|. $$
(3.15)

On the other hand, we can bound \(|\Delta |\) below by calculus. Note that \(\Phi _{1}\) is a polynomial of degree \(K\) and that \(F\) and \(G\) are polynomials of degree \(O_{n, E, \epsilon _{1}, K} (1)\). Hence by Bézout’s theorem for every fixed \(t\) the map

$$ x \mapsto \Phi _{1} (F(x), t, G(x)) $$
(3.16)

is \(O_{n, E, \epsilon _{1}, K} (1)\) to 1. Thus

$$\begin{aligned} |\Delta | & \simeq _{n, E, \epsilon _{1}, K} \int _{-\delta ^{\epsilon _{1}}}^{\delta ^{\epsilon _{1}}} \int _{B} |\det \nabla _{x} ( \Phi _{1} (F(x), t, G(x)))| \mathrm{d}x\mathrm{d}t \\ & = \int _{-\delta ^{\epsilon _{1}}}^{\delta ^{\epsilon _{1}}} \int _{B} |\det (\nabla _{v} \Phi _{1} (F(x), t, G(x))\cdot \nabla _{x} F + \nabla _{\xi} \Phi _{1} (F(x), t, G(x))\cdot \nabla _{x} G)| \mathrm{d}x\mathrm{d}t \\ & = \int _{B} |\det (\nabla _{x} G)| \int _{-\delta ^{\epsilon _{1}}}^{\delta ^{\epsilon _{1}}} | \det (\nabla _{v} \Phi _{1} (F(x), t, G(x))\cdot (\nabla _{x} F\cdot ( \nabla _{x} G)^{-1}) + \nabla _{\xi} \Phi _{1} (F(x), t, G(x)))| \mathrm{d}t\mathrm{d}x. \end{aligned}$$
(3.17)

Since \(\|F\|_{C^{1}}, \|G\|_{C^{1}} \leq 1\) and, for all \(x \in B\),

$$ |\det \nabla _{x} G|\gtrsim _{n, E, \epsilon _{1}, K} \delta ^{(n-1)+C \epsilon _{1}} N \geq \delta ^{(n-1)+C\epsilon _{1}}, $$
(3.18)

we see that each entry of \((\nabla _{x} F\cdot (\nabla _{x} G)^{-1})\) is

$$ \lesssim _{n, E, \epsilon _{1}, K} \delta ^{-(n-1)-C\epsilon _{1}} \le \delta ^{-n} $$
(3.19)

if \(\delta \) is sufficiently small. This allows us to invoke (3.7) to obtain

$$ |\Delta |\gtrsim _{n, E, \epsilon _{1}, K} \delta ^{C\epsilon _{1}} \int _{B} |\det \nabla _{x} G(x)|\mathrm{d}x. $$
(3.20)

Using Bézout's theorem again and invoking (3.13), the right-hand side is

$$ \simeq _{n, E, \epsilon _{1}, K} |G(B)| \gtrsim _{n, E, \epsilon _{1}, K} \delta ^{(n-1)+C\epsilon _{1}} N. $$
(3.21)

Hence

$$ |\Delta |\gtrsim _{n, E, \epsilon _{1}, K} \delta ^{(n-1)+C\epsilon _{1}} N. $$
(3.22)

Combining (3.15) and (3.22), we obtain

$$ N \lesssim _{n, E, \epsilon _{1}, K} |S|\delta ^{1-n-C\epsilon _{1}}. $$
(3.23)

It suffices to take \(\epsilon _{1}\) to be a suitable multiple of \(\epsilon \) depending on the above constant \(C\) (and to fix \(K\) accordingly, as at the beginning of the proof). □

Before proving Theorem 1.2, we also state and prove an elementary lemma on the averaged size of determinants.

Lemma 3.2

For \(A, B \in \textit{Mat}_{k \times k} (\mathbb{R})\) and a measurable \(E \subset \mathbb{R}\), we have

$$ \int _{E} |\det (tA+B)|\mathrm{d}t \gtrsim _{k} |E|^{k+1}|\det (A)|. $$
(3.24)

Proof of Lemma 3.2

Without loss of generality we may assume \(\det A \neq 0\). We may further assume \(A = I\): otherwise replace \(B\) by \(BA^{-1}\) and note that \(tA+B = (tI + BA^{-1})\cdot A\), so that \(|\det (tA+B)|=|\det (tI+BA^{-1})|\cdot |\det A|\).

Now \(\det (tI+B)\) is a monic polynomial \(g_{B}(t)\) in the variable \(t\) of degree \(k\). Factorize \(g_{B}\) over ℂ and note that the set of \(t \in E\) at distance \(\geq \frac{|E|}{4k}\) from every root of \(g_{B}\) has measure at least \(\frac{|E|}{2}\): indeed, the \(\frac{|E|}{4k}\)-neighborhood of each of the \(k\) roots has measure at most \(\frac{|E|}{2k}\). For each such \(t\),

$$ |\det (tI+B)| \geq \frac{|E|^{k}}{(4k)^{k}}. $$
(3.25)

This finishes the proof. □
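As a sanity check, consider the scalar case \(k=1\), \(A=1\), \(B=b\in \mathbb{R}\): at least half of \(E\) lies at distance \(\geq |E|/4\) from the single root \(-b\), since the \(|E|/4\)-neighborhood of a point has measure at most \(|E|/2\). Hence

```latex
\int_{E} |t+b|\,\mathrm{d}t
  \;\geq\; \frac{|E|}{2}\cdot\frac{|E|}{4}
  \;=\; \frac{|E|^{2}}{8} ,
```

uniformly in \(b\), matching (3.24) for \(k=1\).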

Note that the key point of Lemma 3.2 is that the estimate (3.24) is independent of \(B\). With the above preparation we now prove Theorem 1.2.

Proof of Theorem 1.2

By the non-degeneracy assumption on \(\phi \), for sufficiently small \((v, t, \xi )\) one can find a unique \(\Phi = \Phi (v, t, \xi )\) near 0 such that

$$ (\nabla _{\xi} \phi )(\Phi , t; \xi )=v. $$
(3.26)

Without loss of generality, let us assume that this can be done for all \((v, t, \xi )\in [-1.5, 1.5]^{2n-1}\), since otherwise we can perform a constant rescaling. It suffices to show that our \(\Phi \) satisfies condition (3.3), since we can then apply Theorem 3.1 to conclude the proof.
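For orientation, in the model case \(\phi ({\mathbf {x}}; \xi )=\langle x,\xi \rangle +t|\xi |^{2}\) from the introduction, \(\Phi \) can be computed explicitly:

```latex
\nabla_{\xi}\phi(x, t; \xi) = x + 2t\xi ,
\qquad\text{so (3.26) forces}\qquad
\Phi(v, t, \xi) = v - 2t\xi .
% Consistency check with (3.28)-(3.29) below:
% \nabla_x \nabla_\xi \phi = I,\quad \nabla_\xi \Phi = -2tI,\quad
% \nabla^2_\xi \phi = 2tI,\quad \nabla_v \Phi = I,
% so  I\cdot(-2tI) + 2tI = 0  and  I\cdot I = I .
```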

By (3.26), we get

$$ (\nabla _{\xi} \phi )(\Phi (v, t, \xi ), t; \xi )=v. $$
(3.27)

Differentiating with respect to \(\xi \) and \(v\) respectively, we deduce

$$ \nabla _{x} \nabla _{\xi} \phi \cdot \nabla _{\xi} \Phi + \nabla _{ \xi}^{2} \phi = 0 $$
(3.28)

and

$$ \nabla _{x} \nabla _{\xi} \phi \cdot \nabla _{v} \Phi = I. $$
(3.29)

Note that we have adopted the abbreviation that \(\phi \) is evaluated at \((\Phi (v, t, \xi ), t; \xi )\). By the non-degeneracy of \(\phi \), we know \(|\nabla _{x} \nabla _{\xi} \phi | \simeq 1\). Hence

$$\begin{aligned} & \int _{-\delta ^{\epsilon}}^{\delta ^{\epsilon}} |\det (\nabla _{v} \Phi (v, t, \xi )\cdot M+\nabla _{\xi} \Phi (v, t, \xi ))|\mathrm{d} t \\ \simeq & \int _{-\delta ^{\epsilon}}^{\delta ^{\epsilon}} |\det ( \nabla _{x} \nabla _{\xi} \phi \cdot \nabla _{v} \Phi (v, t, \xi ) \cdot M+\nabla _{x} \nabla _{\xi} \phi \cdot \nabla _{\xi} \Phi (v, t, \xi ))|\mathrm{d} t \\ \simeq & \int _{-\delta ^{\epsilon}}^{\delta ^{\epsilon}} |\det (M- \nabla _{\xi}^{2} \phi (\Phi (v, t, \xi ), t; \xi ))|\mathrm{d} t. \end{aligned}$$
(3.30)

Recall we only need to verify (3.3) for \(\Phi \). In light of (3.30), it now suffices to show that the right hand side of (3.30) is \(\gtrsim _{\epsilon} \delta ^{n\epsilon}\) (independent of the choice of \(M\), \(v \in [-1, 1]^{n-1}\) and \(\xi \in [-1, 1]^{n-1}\)).

From now on we fix \(v\) and frequently suppress it from the notation; all estimates will be uniform in \(v\). For simplicity we write \(X_{t} (\xi )\) for \(\Phi (v, t, \xi )\) below. Denote

$$ A(t; \xi )= \nabla ^{2}_{\xi} \phi (X_{t}(\xi ), t; \xi ). $$
(3.31)

We claim that for all \(t \in [0, 1]\) and \(\xi \in [0, 1]^{n-1}\), the matrix \(A(t; \xi ) - A(0; \xi )\) is proportional to a matrix \(B(\xi )\) independent of \(t\). This will be referred to as Claim (∗). We will prove this claim by finding \(B(\xi )\) explicitly. Compute

$$ \begin{aligned} \partial _{t} A(t; \xi )& = \Big( \nabla _{x}\cdot \partial _{t} X_{t} \Big) \nabla ^{2}_{\xi}\phi (X_{t}( \xi ), t; \xi )+\partial _{t} \nabla ^{2}_{\xi}\phi (X_{t}(\xi ), t; \xi ) \\ &= \Big[ \Big( \vec {V}\cdot \nabla _{{\mathbf {x}}} \Big) \nabla ^{2}_{\xi}\phi \Big] (X_{t}( \xi ), t; \xi ) \end{aligned} $$
(3.32)

where \(\vec {V}:=(\partial _{t} X_{t}, 1)\). Note that if we differentiate both sides of (3.26) in \(t\), then we obtain

$$ \partial _{t} X_{t} \cdot \nabla _{x}\nabla _{\xi}\phi (X_{t}(\xi ), t; \xi )+\partial _{t} \nabla _{\xi}\phi (X_{t}(\xi ), t; \xi )=0. $$
(3.33)

This means at the point \((X_{t}(\xi ), t; \xi )\), the vector field \(\vec {V}\) is parallel to the one defined in (2.2). Moreover by a similar computation, we obtain that

$$ \partial ^{j}_{t} A(t; \xi )= \Big[ \Big( \vec {V}\cdot \nabla _{{\mathbf {x}}} \Big) ^{j}\nabla ^{2}_{\xi}\phi \Big] (X_{t}( \xi ), t; \xi ) $$
(3.34)

for every \(j\ge 1\). By Lemma 2.3, \(\partial ^{2}_{t} A(t; \xi )\) is always parallel to \(\partial _{t} A(t; \xi )\). Hence, by considering the time derivative of the quotient of two entries, we see that \(\partial _{t} A(t; \xi )\) is always parallel to \(\partial _{t} A(0; \xi )\). By our non-degeneracy assumption (1.2) on \(\phi \), \(\partial _{t} A(0; \xi )\) has norm and determinant \(\simeq 1\); call it \(B(\xi )\). Since \(\partial _{t} A(t; \xi )\) is always parallel to \(B(\xi )\), we see that Claim (∗) holds.
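In the model case \(\phi =\langle x,\xi \rangle +t|\xi |^{2}\), Claim (∗) can be verified directly:

```latex
A(t;\xi) = \nabla^2_\xi \phi(X_t(\xi), t; \xi) = 2tI ,
\qquad
\partial_t A(t;\xi) = 2I = B(\xi) ,
\qquad
A(t;\xi) - A(0;\xi) = t\cdot B(\xi) ,
% i.e. the scalar function is f(t; \xi) = t in the
% representation (3.35) below.
```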

Now by Claim (∗), we assume

$$ A(t; \xi ) = f(t; \xi )B(\xi ) + A(0; \xi ) $$
(3.35)

for some scalar function \(f\). Since

$$ \partial _{t} A(t; \xi )|_{t=0} = B(\xi ), $$
(3.36)

we see that the time derivative of \(f\) is 1 at \(t=0\). By compactness, the range of \(f(t; \xi )\) for \(|t| \leq \delta ^{\epsilon}\) has measure \(\geq C \delta ^{\epsilon}\) for some universal \(C>0\) independent of \(\xi \in [-1, 1]^{n-1}\) and \(\epsilon \).

We are in a position to apply Lemma 3.2 to the right hand side of (3.30). Note that

$$ M-\nabla _{\xi}^{2} \phi (\Phi (v, t, \xi ), t; \xi ) = M - A(t; \xi ) = (M-A(0; \xi )) - f(t; \xi ) B(\xi ). $$
(3.37)

We see that the right-hand side of (3.30) is bounded below by \(C\delta ^{n\epsilon}|\det B(\xi )| \gtrsim \delta ^{n\epsilon}\), thus concluding the proof. □

4 Preliminaries for the proof of Theorem 1.3

This section is a preparation for the proof of Theorem 1.3. The material in this section is standard: we follow the frameworks of Bourgain and Guth [4], Guth [9], Guth, Hickman and Iliopoulou [10], and Hickman and Rogers [12], and reduce Theorem 1.3 to a broad-norm estimate (see Theorem 4.2 below).

For \(\lambda \ge 1\), denote

$$ \phi ^{\lambda}({\mathbf {x}}; \xi )=\lambda \phi ({\mathbf {x}}/\lambda ; \xi ), \ \ a^{\lambda}({\mathbf {x}}; \xi )=a({\mathbf {x}}/\lambda ; \xi ). $$
(4.1)

Define an operator

$$ T^{\lambda} f({\mathbf {x}}):=\int e^{i\phi ^{\lambda}({\mathbf {x}}; \xi )} a^{ \lambda}({\mathbf {x}}; \xi )f(\xi )d\xi . $$
(4.2)

Note that \(T^{\lambda} f\) is just a rescaled version of the operator in Theorem 1.3; we use this rescaled version because we will use the wave packet decomposition and uncertainty principles to bound \(T^{\lambda}\). In the rest of the paper, we will prove the following theorem.

Theorem 4.1

If \(\phi \) is assumed to satisfy (H1), \((\mathrm{H2}^{+})\) and Bourgain’s condition at every point, then

$$ \big\| T^{\lambda}f \big\| _{L^{p}(B_{\lambda})}\lesssim _{\epsilon , p, \phi , a} \lambda ^{\epsilon} \|f\|_{L^{p}} $$
(4.3)

for every \(p>q_{n, 2}\) (defined in (1.20)), every ball \(B_{\lambda}\subset \mathbb{R}^{n}\) of radius \(\lambda \ge 1\) and every \(\epsilon >0 \).

For the sake of simplicity, we will assume that our phase function \(\phi \) is in normal form. To prove (4.3), it suffices to prove

$$ \big\| T^{\lambda}f \big\| _{L^{p}(B_{R})}\lesssim _{\epsilon , p, \phi , a} R^{ \epsilon} \|f\|_{L^{p}}, $$
(4.4)

for every \(1\le R\le \lambda ^{1-\epsilon}\) and every cube \(B_{R}\subset B_{\lambda}\). We will run an induction on both parameters \(\lambda \) and \(R\). The base case of the induction \(\lambda =R=1\) is trivial. Let us assume that we have proven

$$ \big\| T^{\lambda '}f \big\| _{L^{p}(B_{R'})}\lesssim _{\epsilon , p, \phi , a} (R')^{ \epsilon} \|f\|_{L^{p}}, $$
(4.5)

for every \(\lambda '\le \lambda /2\), \(R'\le (\lambda ')^{1-\epsilon}\) and every cube \(B_{R'}\subset B_{\lambda '}\). Our goal is to prove that the same holds with \(\lambda \) and \(R\).

4.1 Wave packet decomposition

Let \(1\le r\le R\) and take a collection \(\Theta _{r}\) of dyadic cubes of side length \(\frac{9}{11} r^{-1 / 2}\) covering the ball \(B^{n-1}(0,2)\). We take a smooth partition of unity \(\left (\psi _{\theta}\right )_{\theta \in \Theta _{r}}\) with \(\operatorname{supp} \psi _{\theta} \subset \frac{11}{10} \theta \) for the ball \(B^{n-1}(0,2)\) such that

$$ \left \|\partial _{w}^{\alpha} \psi _{\theta}\right \|_{L^{\infty}} \lesssim _{\alpha} r^{|\alpha | / 2} $$

for any \(\alpha \in \mathbb{N}_{0}^{n-1}\). We denote by \(\omega _{\theta}\) the center of \(\theta \). Given a function \(g\), we apply a Fourier series decomposition to the function \(g \psi _{\theta}\) on the region \(\frac{11}{9} \theta \) and obtain

$$ g(w) \psi _{\theta}(w) \mathbf{1}_{\frac{11}{10} \theta}(\omega )=\left (\frac{r^{1 / 2}}{2 \pi}\right )^{n-1} \sum _{v \in r^{1 / 2} \mathbb{Z}^{n-1}}\left (g \psi _{\theta}\right )^{\wedge}(v) e^{2 \pi i v \cdot w} \mathbf{1}_{\frac{11}{10} \theta}(\omega ). $$

Let \(\widetilde{\psi}_{\theta}\) be a non-negative smooth cutoff function supported on \(\frac{11}{9} \theta \) and equal to 1 on \(\frac{11}{10} \theta \). We can therefore write

$$ g(w) \psi _{\theta}(w) \cdot \widetilde{\psi}_{\theta}(\omega )= \left (\frac{r^{1 / 2}}{2 \pi}\right )^{n-1} \sum _{v \in r^{1 / 2} \mathbb{Z}^{n-1}}\left (g \psi _{\theta}\right )^{\wedge}(v) e^{2 \pi i v \cdot w} \widetilde{\psi}_{\theta}(\omega ) $$

If we also define

$$ g_{\theta , v}(w):=\left (\frac{r^{1 / 2}}{2 \pi}\right )^{n-1}\left (g \psi _{\theta}\right )^{\wedge}(v) e^{2 \pi i v \cdot \omega} \widetilde{\psi}_{\theta}(\omega ) $$

then we have

$$ g=\sum _{(\theta , v) \in \Theta _{r} \times r^{1 / 2} \mathbb{Z}^{n-1}} g_{\theta , v} $$
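The identity above can be sanity-checked numerically in a toy 1-D discrete model (so \(n-1=1\)): split \(g\) into pieces using \(\cos ^{2}\) windows forming a partition of unity, expand each windowed piece in a (discrete) Fourier basis, and sum everything back. All names and parameters below (`M`, `n_win`, the choice of `g`) are illustrative and not from the paper; the smooth cutoffs \(\widetilde{\psi}_{\theta}\) and the scale \(r^{1/2}\) are not modeled.

```python
import numpy as np

M = 256                       # grid size on the periodic interval [0, 1)
w = np.arange(M) / M
g = np.exp(2j * np.pi * 3 * w) + 0.5 * np.cos(2 * np.pi * 7 * w)

n_win = 8                     # number of windows theta
pieces = []
psi_sum = np.zeros(M)
for k in range(n_win):
    # cos^2 bump centered at k / n_win; adjacent bumps overlap and
    # sum to 1 (cos^2 + sin^2), giving a partition of unity
    dist = (w - k / n_win + 0.5) % 1.0 - 0.5
    psi = np.where(np.abs(dist) <= 1.0 / n_win,
                   np.cos(np.pi * n_win * dist / 2) ** 2, 0.0)
    psi_sum += psi
    # Fourier-series step: expand g * psi in frequency and resum;
    # the pieces g_{theta, v} are the individual Fourier modes
    pieces.append(np.fft.ifft(np.fft.fft(g * psi)))

assert np.allclose(psi_sum, 1.0)       # partition of unity
assert np.allclose(sum(pieces), g)     # g = sum of its windowed pieces
```

The exactness of the reconstruction here only reflects the partition of unity and the invertibility of the discrete Fourier transform; the analytic content of the decomposition (localization of each \(g_{\theta , v}\) in both \(\theta \) and \(v\)) is not captured by this sketch.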

For \(\omega \in B^{n-1}\), \(z'\in B^{n-1}\) and \(t\in [0, 1]\), let us define a function \(\Phi =\Phi (z', t; \omega )\) by

$$ \partial _{\omega}\phi (\Phi (z', t; \omega ), t; \omega )=z'. $$
(4.6)

We refer to equation (4.6) in [10, page 275] for a discussion on the definition of \(\Phi \). For \(\theta \in B^{n-1}\) and \(v\in B^{n-1}\), define the curve \(\gamma ^{1}_{\theta , v}: [0, 1]\to \mathbb{R}^{n-1}\) by

$$ \gamma ^{1}_{\theta , v}(t):=\Phi (v, t; \omega _{\theta}), $$
(4.7)

where \(\omega _{\theta}\) is the center of \(\theta \). Moreover, for given \((\theta , v)\) let us define the rescaled curve

$$ \gamma ^{\lambda}_{\theta , v}(t):=\lambda \gamma ^{1}_{\theta , v/ \lambda} \Big( \frac{t}{\lambda} \Big) . $$
(4.8)

Let \(\Gamma ^{\lambda}_{\theta , v}\) be the map

$$ \Gamma ^{\lambda}_{\theta , v}:=(\gamma ^{\lambda}_{\theta , v}(t), t), $$
(4.9)

where \(t\in [0, \lambda ]\). Define the curved \(r^{\frac{1}{2}+\delta}\)-tube as

$$ T_{\theta , v}:= \Big\{ (x, t): |x-\gamma ^{\lambda}_{\theta , v}(t)|\le r^{1/2+\delta}, \ t\in [0, r] \Big\} . $$
(4.10)

The curve \(\Gamma ^{\lambda}_{\theta , v}\) is referred to as the core of \(T_{\theta , v}\). This finishes our wave packet decomposition for a ball of radius \(r\) centered at the origin.
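In the model case \(\phi =\langle x,\xi \rangle +t|\xi |^{2}\), equation (4.6) reads \(\Phi (z', t; \omega )+2t\omega =z'\), so the cores are straight lines and one recovers the usual straight tubes of the extension operator:

```latex
\gamma^{1}_{\theta, v}(t) = v - 2t\,\omega_\theta ,
\qquad
\gamma^{\lambda}_{\theta, v}(t)
  = \lambda\Big(\frac{v}{\lambda} - 2\,\frac{t}{\lambda}\,\omega_\theta\Big)
  = v - 2t\,\omega_\theta .
```

For a general Hörmander-type phase the cores are genuinely curved, which is the source of many of the difficulties below.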

Next, let us define the wave packet decomposition for a ball not centered at the origin. Fix \({\mathbf {x}}_{0}\in B(0, \lambda )\) and consider the ball \(B({\mathbf {x}}_{0}, r)\). For \(g: B^{n-1}\to \mathbb{C}\) integrable, define

$$ \tilde{g}(\omega ):=e^{2 \pi i \phi ^{\lambda}({\mathbf {x}}_{0}; \omega )} g( \omega ) $$

so that

$$ T^{\lambda} g({\mathbf {x}})=\tilde{T}^{\lambda} \tilde{g}(\tilde{{\mathbf {x}}}) \quad \text{ for } \tilde{{\mathbf {x}}}={\mathbf {x}}-{\mathbf {x}}_{0}, $$

where \(\widetilde{T}^{\lambda}\) is the Hörmander-type operator with phase \(\tilde{\phi}^{\lambda}\) and amplitude \(\tilde{a}^{\lambda}\) given by

$$ \tilde{\phi}({\mathbf {x}}; \omega ):=\phi \left ({\mathbf {x}}+ \frac{{\mathbf {x}}_{0}}{\lambda} ; \omega \right )-\phi \left ( \frac{{\mathbf {x}}_{0}}{\lambda} ; \omega \right ) \text{ and } \tilde{a}( {\mathbf {x}}; \omega ):=a\left ({\mathbf {x}}+\frac{{\mathbf {x}}_{0}}{\lambda} ; \omega \right ) . $$

If \({\mathbf {x}}\in B({\mathbf {x}}_{0}, r)\), then \(\tilde{{\mathbf {x}}}\in B(0, r)\), and we can therefore apply the wave packet decomposition above to \(\tilde{T}^{\lambda} \tilde{g}\). Moreover, notice that the core curve of \(T_{\theta , v}\) is given by the collection of \(\tilde{{\mathbf {x}}}\in B(0, r)\) satisfying

$$ \partial _{\omega}\phi \left (\frac{\tilde{{\mathbf {x}}}}{\lambda}+ \frac{{\mathbf {x}}_{0}}{\lambda}; \omega _{\theta}\right )=\frac{v}{\lambda}+ \partial _{\omega}\phi \left (\frac{{\mathbf {x}}_{0}}{\lambda}; \omega _{ \theta}\right ). $$
(4.11)

Set

$$ v^{\lambda}({\mathbf {x}}_{0}; \omega ):=\partial _{\omega}\phi ^{\lambda} \left ({\mathbf {x}}_{0}; \omega \right ). $$
(4.12)

Under this notation, the core curve of \(T_{\theta , v}\) can be written as the image of the map

$$ \Gamma ^{\lambda}_{\theta , v+v^{\lambda}({\mathbf {x}}_{0}; \omega _{\theta})}= \left (\gamma ^{\lambda}_{\theta , v+v^{\lambda}({\mathbf {x}}_{0}; \omega _{ \theta})}(t), t \right ) $$
(4.13)

with \(t\in [0, \lambda ]\). Define curved tubes

$$ T_{\theta , v}({\mathbf {x}}_{0}):={\mathbf {x}}_{0}+ \Big\{ (x, t): |x-\gamma ^{\lambda}_{\theta , v+v^{\lambda}({\mathbf {x}}_{0}; \omega _{\theta})}(t)|\le r^{1/2+\delta}, \ t\in [0, r] \Big\} . $$
(4.14)

Thus the function

$$ \tilde{T}^{\lambda}(\tilde{g})_{\theta , v}({\mathbf {x}}-{\mathbf {x}}_{0}) $$
(4.15)

is essentially supported on \(T_{\theta , v}({\mathbf {x}}_{0})\) if we restrict \(t\in [t_{0}, t_{0}+r]\) where \({\mathbf {x}}_{0}=(x_{0}, t_{0})\). We will use \(\mathbb{T}[B({\mathbf {x}}_{0}, r)]\) to denote the collection \(\{T_{\theta , v}({\mathbf {x}}_{0})\}_{(\theta , v)}\). Moreover, we write \(\theta (T)=\theta \) for a tube \(T=T_{\theta , v}({\mathbf {x}}_{0})\). To simplify notation, we also define

$$ g_{T_{\theta , v}({\mathbf {x}}_{0})}(\omega ):=e^{-2\pi i\phi ^{\lambda}( {\mathbf {x}}_{0}; \omega )} (\tilde{g})_{\theta , v}(\omega ). $$
(4.16)

Under this notation, we can write

$$ T^{\lambda} g({\mathbf {x}})=\sum _{T\in \mathbb{T}[B({\mathbf {x}}_{0}, r)]} T^{\lambda} g_{T}( {\mathbf {x}}). $$
(4.17)

This finishes our wave packet decomposition associated to the ball \(B({\mathbf {x}}_{0}, r)\).

4.2 Reducing to broad term estimates

We define Gauss maps and rescaled Gauss maps. Define

$$ G_{0}({\mathbf {x}}; \xi ):=\partial _{\xi _{1}}\nabla _{{\mathbf {x}}}\phi \wedge \cdots \wedge \partial _{\xi _{n-1}}\nabla _{{\mathbf {x}}}\phi . $$
(4.18)

Moreover, define

$$ G({\mathbf {x}}; \xi ):=\frac{G_{0}({\mathbf {x}}; \xi )}{|G_{0}({\mathbf {x}}; \xi )|}. $$
(4.19)

Define the rescaled Gauss map

$$ G^{\lambda}({\mathbf {x}}; \xi ):=G({\mathbf {x}}/\lambda ; \xi ). $$
(4.20)

Let \(K\ge 1\). We divide \(B^{n-1}\) into caps \(\tau \) of side length \(K^{-1}\). Let \(g_{\tau}\) denote \(g\mathbf{1}_{\tau}\). For \({\mathbf {x}}\in B_{R}\), denote

$$ G^{\lambda}({\mathbf {x}}; \tau ):=\{G^{\lambda}({\mathbf {x}}; \xi ): \xi \in \tau \}. $$
(4.21)

Let \(V\subset \mathbb{R}^{n}\) be a linear subspace. Let \(\measuredangle (G^{\lambda}({\mathbf {x}}; \tau ), V)\) denote the smallest angle between any non-zero vector \(v\in V\) and \(v'\in G^{\lambda}({\mathbf {x}}; \tau )\). Moreover, we say that \(\tau \notin _{{\mathbf {x}}, K}V\) if \(\measuredangle (G^{\lambda}({\mathbf {x}}; \tau ), V)\ge K^{-1}\); otherwise, we say \(\tau \in _{{\mathbf {x}}, K} V\). If \({\mathbf {x}}\) and \(K\) are clear from the context, we often abbreviate \(\tau \notin _{{\mathbf {x}}, K} V\) to \(\tau \notin V\). Next, let us introduce the notion of broad norms. Fix \(B_{K^{2}}\subset B_{R}\) centered at \({\mathbf {x}}_{0}\). Define

$$ \mu _{T^{\lambda} g}(B_{K^{2}}):=\min _{V_{1}, \dots , V_{A}\in \text{Gr}(k-1, n)} \Big(\max _{ \substack{\tau \notin V_{a}\\ \text{ for any } 1\le a\le A}} \|T^{ \lambda} g_{\tau}\|_{L^{p}(B_{K^{2}})}^{p}\Big). $$
(4.22)

Here \(\mathrm{Gr}(k-1,n)\) is the Grassmannian of all \((k-1)\)-dimensional subspaces in \(\mathbb{R}^{n}\), \(k\) is to be determined, and \(A\) is a less important parameter whose choice will become clear later. For \(U\subset \mathbb{R}^{n}\), define

$$ \|T^{\lambda} g\|_{\mathrm {BL}_{k, A}^{p}(U)}:=\Big(\sum _{B_{K^{2}}} \frac{|B_{K^{2}}\cap U|}{|B_{K^{2}}|} \mu _{T^{\lambda} g}(B_{K^{2}}) \Big)^{1/p}. $$
(4.23)

This is called the broad part of \(T^{\lambda }g\).

For \(2\le k\le n-1\), denote

$$ \Gamma _{\mathrm{HZ}}(n, k):=\frac{n-1}{3}+\frac{k-1}{6}\prod _{i=k}^{n-1} \frac{2i}{2i+1}. $$
(4.24)

We will prove

Theorem 4.2

Broad norm estimate

Let \(2n/5\le k\le n/2\), and

$$ p>p_{n}(k):=2+ \frac{1}{ \Gamma _{\mathrm{HZ}}(n, k)+\frac{n}{10^{3}} }. $$
(4.25)

Then for every \(\epsilon >0\), there exists \(A\) such that

$$ \|T^{\lambda} g\|_{\mathrm {BL}_{k, A}^{p}(B_{R})}\lesssim _{K, \epsilon} R^{ \epsilon} \|g\|_{L^{2}}^{2/p}\|g\|_{L^{\infty}}^{1-2/p}, $$
(4.26)

for every \(K\ge 1\), \(1\le R\le \lambda \), where \(B_{R}\subset B_{\lambda}\) is a ball of radius \(R\). Moreover, the implicit constant depends polynomially on \(K\).

Recall that when \(\phi ({\mathbf {x}}; \xi )=\langle x,\xi\rangle +t|\xi |^{2}\), Hickman and Zahl [13] proved (4.26) for all

$$ p>2+\frac{1}{\Gamma _{\mathrm{HZ}} (n, k)}. $$
(4.27)

Theorem 4.2 provides a slight improvement in this case. When \(k\) is outside the range \([2n/5, n/2]\), we also obtain improved broad norm estimates. As they are irrelevant for the asymptotic formula in (1.20), we do not state them here.

It is standard in the literature to reduce Theorem 4.1 to Theorem 4.2. For instance, by Proposition 11.1 in Guth-Hickman-Iliopoulou’s work [10], if

$$ 2+\frac{4}{2n-k}\le p\le 2+\frac{2}{k-2}, $$
(4.28)

then Theorem 4.2 (for some fixed \(k\)) implies Theorem 4.1 for the same \(p\). To see the asymptotic formula in (1.20), we set \(k=\nu n\) and

$$ p_{n}(k)=2+\frac{4}{2n-k}, $$
(4.29)

solve for \(\nu \), and then obtain

$$ q_{n, 2}=2+\frac{4}{2-\nu}\frac{1}{n}+O(n^{-2}). $$
(4.30)

We refer to the appendix of [13] on how to control \(\Gamma _{\mathrm{HZ}}(n, k)\).
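To get a feel for the numerology, here is a quick numerical sketch evaluating (4.24) and (4.25) as stated; the sample values \(n=10\), \(k=4\) are illustrative only.

```python
from fractions import Fraction

def gamma_hz(n, k):
    """Gamma_HZ(n, k) from (4.24), in exact rational arithmetic."""
    prod = Fraction(1)
    for i in range(k, n):               # i = k, ..., n - 1
        prod *= Fraction(2 * i, 2 * i + 1)
    return Fraction(n - 1, 3) + Fraction(k - 1, 6) * prod

def p_n(n, k):
    """The exponent p_n(k) from (4.25)."""
    return 2 + 1 / (gamma_hz(n, k) + Fraction(n, 10**3))

n, k = 10, 4                            # k lies in [2n/5, n/2] = [4, 5]
p = p_n(n, k)
# p_n(k) falls inside the admissible window (4.28) ...
assert 2 + Fraction(4, 2 * n - k) < p < 2 + Fraction(2, k - 2)
# ... and is slightly smaller than the Hickman-Zahl exponent (4.27)
assert p < 2 + 1 / gamma_hz(n, k)
```

For these sample values one finds \(p_{10}(4)\approx 2.301\), compared with the endpoint \(2+4/(2n-k)=2.25\) of (4.28).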

When proving Theorem 4.2, we will apply a wave packet decomposition

$$ g=\sum _{T\in \mathbb{T}[B_{R}]} g_{T}. $$
(4.31)

By pigeonholing, we can assume that \(\big\| g_{T} \big\| _{2} \simeq \big\| g_{T'} \big\| _{2}\) for every \(T\) and \(T'\).
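The pigeonholing here is the standard dyadic one: group the wave packets into classes with \(\|g_{T}\|_{2}\sim 2^{j}\) and keep a class carrying the largest share of the total mass. A minimal sketch with illustrative masses (not from the paper):

```python
import math
from collections import defaultdict

# Dyadic pigeonholing: bucket the packet masses m_T by their dyadic
# size 2^j; the dominant bucket carries at least a 1/(number of
# buckets) fraction of the total, and within it all masses agree up
# to a factor of 2.
masses = [0.9, 1.1, 2.3, 2.9, 0.01, 0.013, 7.5, 6.9, 8.0]

classes = defaultdict(list)
for m in masses:
    classes[math.floor(math.log2(m))].append(m)

total = sum(masses)
best = max(classes.values(), key=sum)       # dominant dyadic class
assert sum(best) >= total / len(classes)    # max beats the average
assert max(best) < 2 * min(best)            # comparable within a class
```

In the actual argument the number of nonnegligible classes is \(O(\log R)\), which is where logarithmic losses such as the one in (5.4) come from.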

5 Polynomial partitioning

5.1 Preparatory work

In this subsection, we state a few definitions that will be useful in the forthcoming polynomial partitioning algorithms.

Definition 5.1

Transverse complete intersection

Let \(P_{1}, \ldots , P_{n-m}: \mathbb{R}^{n} \rightarrow \mathbb{R}\) be polynomials. We consider the common zero set

$$ Z\left (P_{1}, \ldots , P_{n-m}\right ):=\left \{x \in \mathbb{R}^{n}: P_{1}(x)=\cdots =P_{n-m}(x)=0\right \} . $$
(5.1)

Suppose that for all \(z \in Z\left (P_{1}, \ldots , P_{n-m}\right )\), one has

$$ \bigwedge _{j=1}^{n-m} \nabla P_{j}(z) \neq 0 . $$

Then a connected branch of this set, or a union of connected branches of this set, is called an m-dimensional transverse complete intersection. Given a set \(Z\) of the form (5.1), the degree of \(Z\) is defined by

$$ \min \left (\prod _{i=1}^{n-m} \operatorname{deg}\left (P_{i}\right ) \right ) $$

where the minimum is taken over all possible representations of \(Z=Z (P_{1}, \ldots , P_{n-m} )\).

Definition 5.2

Tangent tubes

Recall the parameters in (1.44). Let \(r \geqslant 1\) and \(Z\) be an \(m\)-dimensional transverse complete intersection. A tube \(T_{\theta , v}\left (\mathbf{x}_{0}\right ) \in \mathbb{T}\left [B \left (\mathbf{x}_{0}, r\right )\right ]\) is said to be \(r^{-1 / 2+\delta _{m}}\)-tangent to \(Z\) in \(B\left (\mathbf{x}_{0}, r\right )\) if it satisfies

  1. 1.

    \(T_{\theta , v}\left (\mathbf{x}_{0}\right ) \subset \mathcal {N}_{r^{1 / 2+ \delta _{m}}}(Z) \cap B\left (\mathbf{x}_{0}, r\right )\);

  2. 2.

    For every \({\mathbf {z}}\in Z \cap B\left (\mathbf{x}_{0}, r\right )\), if \({\mathbf {y}}\in T_{\theta , v}\left (\mathbf{x}_{0}\right )\) with \(|{\mathbf {z}}-{\mathbf {y}}| \lesssim r^{1 / 2+\delta _{m}}\), then one has

    $$ \measuredangle \left (G^{\lambda}({\mathbf {y}}; w_{\theta}), T_{{\mathbf {z}}} Z \right ) \lesssim r^{-1 / 2+\delta _{m}} . $$

    Here, \(T_{{\mathbf {z}}} Z\) is the tangent space of \(Z\) at \({\mathbf {z}}\).

Definition 5.3

Given a function \(f: B^{n-1}\to \mathbb{C}\), we say that it is concentrated on wave packets from \(\mathbb{W}\subset \mathbb{T}[B_{R}]\) if

$$ \big\| \sum _{T\notin \mathbb{W}} f_{T} \big\| _{\infty} \lesssim R^{-100n} \big\| f \big\| _{2}. $$
(5.2)

Here \(R\) is from Theorem 4.2.

5.2 Partitioning algorithms: part I

In this subsection, we run the first part of the polynomial partitioning algorithm. It is a variant of the algorithm in Hickman-Rogers’ work [12] with two main differences.

The first difference is that, after reaching an algebraic dominant case, we will not compare contributions from the tangential case and the transverse case, but instead keep both terms and continue to do polynomial partitioning for both terms. This will be needed in Sect. 7 when we construct brooms.

Let us explain the second difference. In the first algorithm in [12, page 247], the authors there did not need to control how fast cells shrink. In other words, each time they encounter the cellular case, they simply decrease the radius parameter \(\rho _{j}\) by a factor of 2 (see for instance equation (31) in [12, page 253]). If in the current paper we simply repeated their algorithm, then we would not have good control of the non-admissible parameters \(D_{n}, D_{n-1}, \dots \) in terms of \(R\) (see Lemma 5.10 below), a control which was not needed in [12] but is crucial in our inductive argument in Sect. 9.

In order to control how fast cells shrink, in Lemma 5.4 we require cells to have diameter at most \(r/d\), instead of \(r/2\). This change also affects how the algorithm runs. For instance, in the last equation in [12, page 255], the authors simply let \(d^{-\delta}\) absorb the constant 2 coming from the above \(r/2\); this step needs to be done more carefully in the current paper, as \(d^{-\delta}\) certainly cannot absorb the factor \(d\).

By pigeonholing, we can find a collection \(\mathbb{B}_{K^{2}}\) of balls of radius \(K^{2}\) such that

$$ \frac{1}{2}\big\| T^{\lambda} g \big\| _{\mathrm {BL}_{k, A}^{p}(B'_{K^{2}})}\le \big\| T^{\lambda} g \big\| _{\mathrm {BL}_{k, A}^{p}(B_{K^{2}})}\le 2 \big\| T^{\lambda} g \big\| _{\mathrm {BL}_{k, A}^{p}(B'_{K^{2}})} $$
(5.3)

for two arbitrary \(B_{K^{2}}, B'_{K^{2}}\in \mathbb{B}_{K^{2}}\) and

$$ \big\| T^{\lambda} g \big\| _{\mathrm {BL}_{k, A}^{p}(B_{R})}^{p}\lesssim (\log R)^{10} \sum _{B_{K^{2}}\in \mathbb{B}_{K^{2}}} \big\| T^{\lambda} g \big\| _{\mathrm {BL}_{k, A}^{p}(B_{K^{2}})}^{p}. $$
(5.4)

Denote

$$ \mathfrak {Y}=\bigcup _{B_{K^{2}}\in \mathbb{B}_{K^{2}}} B_{K^{2}}. $$
(5.5)

Next, we apply polynomial partitioning to \(T^{\lambda} g\) restricted to \(\mathfrak {Y}\). For a polynomial \(P: \mathbb{R}^{n}\to \mathbb{R}\), we let \(Z(P):=\{z\in \mathbb{R}^{n}: P(z)=0\}\) and let \(\operatorname{cell}(P)\) denote the set of connected components of \(\mathbb{R}^{n}\setminus Z(P)\).

Lemma 5.4

Polynomial partitioning, Guth [9], Hickman and Rogers [12]

Fix \(r \gg 1, d \in \mathbb{N}\) and suppose \(F \in L^{1}(\mathbb{R}^{n})\) is non-negative and supported on \(B_{r}\cap \mathcal {N}_{r^{1/2+\delta _{\circ}}}(Z)\) for some \(0<\delta _{\circ}\ll 1\), where \(Z\) is an \(m\)-dimensional transverse complete intersection of degree at most \(d\). At least one of the following cases holds:

Cellular case. There exists a polynomial \(P: \mathbb{R}^{n} \rightarrow \mathbb{R}\) of degree \(O(d)\) with the following properties:

  1. (1)

    \(\#\mathrm {cell}(P) \simeq d^{m}\) and each \(O' \in \operatorname{cell}(P)\) has diameter at most \(r/d\).

  2. (2)

    If we define

    $$ \mathcal {O}:=\{O'\setminus \mathcal {N}_{r^{1/2+\delta _{\circ}}}(Z(P)): O'\in \mathrm {cell}(P)\}, $$
    (5.6)

    then

    $$ \int _{O} F \simeq d^{-m} \int _{\mathbb{R}^{n}} F \quad \textit{ for all } O \in \mathcal{O} . $$
    (5.7)

Algebraic case. There exists an \((m-1)\)-dimensional transverse complete intersection Y of degree at most \(O(d)\) such that

$$ \int _{B_{r} \cap \mathcal {N}_{r^{1/2+\delta _{\circ}}}(Z)} F \lesssim \int _{B_{r} \cap \mathcal {N}_{r^{1/2+\delta _{\circ}}}(Y)} F. $$

Here the diameter of a cell in Lemma 5.4 is \(O(r/d)\) instead of \(O(r/2)\) as in [9] and [12]. See the proof sketch of Theorem 2.12 in [21] for a discussion.

We now start our polynomial partitioning algorithm. This algorithm will produce a tree consisting of many nodes. Each node will have no child (algorithm for that node stops), one child (cellular case) or two children (algebraic case).

Let \(\mathfrak {n}_{j}\) be a node at level \(j\). We apply Lemma 5.4 to \(\mathfrak {n}_{j}\) and see whether we are in the cellular case or the algebraic case. If we are in the cellular case, then \(\mathfrak {n}_{j}\) has only one child, which will be denoted by \(\mathfrak {n}_{j+1, L}=\mathfrak {n}_{j+1, L}(\mathfrak {n}_{j})\); here \(``L"\) refers to “left” and \(\mathfrak {n}_{j+1, L}\) is called the \(L\)-child of \(\mathfrak {n}_{j}\). If we are in the algebraic case, then \(\mathfrak {n}_{j}\) has two children, which will be denoted by \(\mathfrak {n}_{j+1, M}=\mathfrak {n}_{j+1, M}(\mathfrak {n}_{j})\) and \(\mathfrak {n}_{j+1, R}=\mathfrak {n}_{j+1, R}(\mathfrak {n}_{j})\); here \(``M"\) refers to “middle” and \(\mathfrak {n}_{j+1, M}\) is called the \(M\)-child of \(\mathfrak {n}_{j}\), and \(``R"\) refers to “right” and \(\mathfrak {n}_{j+1, R}\) is called the \(R\)-child of \(\mathfrak {n}_{j}\).

For two nodes \(\mathfrak {n}\) and \(\mathfrak {n}'\), if \(\mathfrak {n}\) is a descendant of \(\mathfrak {n}'\), then we write \(\mathfrak {n}\preccurlyeq \mathfrak {n}'\); similarly we define ≽. Here we make the convention that \(\mathfrak {n}\preccurlyeq \mathfrak {n}\) and \(\mathfrak {n}\succcurlyeq \mathfrak {n}\). Recall the parameters in (1.44). Moreover, define \(\tilde{\delta}_{m-1}\) by

$$ (1-\tilde{\delta}_{m-1})(1/2+\delta _{m-1})=1/2+\delta _{m}. $$
(5.8)

Note that \(\delta _{m-1}/2\le \tilde{\delta}_{m-1}\le 2\delta _{m-1}\).
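Solving (5.8) for \(\tilde{\delta}_{m-1}\) makes the stated bounds transparent (assuming, as in the usual choice of parameters in (1.44), that \(\delta _{m}\le \delta _{m-1}/2\) and \(\delta _{m-1}\le 1/2\)):

```latex
\tilde{\delta}_{m-1}
  = \frac{\delta_{m-1}-\delta_{m}}{\tfrac12+\delta_{m-1}}
  \in \Big[\,\frac{\delta_{m-1}/2}{1},\ \frac{\delta_{m-1}}{1/2}\,\Big]
  = \Big[\,\frac{\delta_{m-1}}{2},\ 2\delta_{m-1}\,\Big] .
```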

Step 0. In this step, we create the root of the tree. Denote

$$ \mathfrak{n}_{0}=\{O_{i_{0}}\}, $$
(5.9)

with \(O_{i_{0}}=B_{R}\cap \mathfrak {Y}\). Moreover, define \(\text{dim}(\mathfrak{n}_{0})=n\), \(\rho (\mathfrak{n}_{0})=R\) and \(\mathfrak {j}(\mathfrak {n}_{0})=0\). Later we will define \(\mathfrak {j}\) for every node. It will play the role of the parameter \(j\) in the recursive step of the first algorithm in [12, page 247]. At the end of this step, define

$$ \#_{a}(\mathfrak {j}(\mathfrak {n}_{0}))=0, \ \ \#_{c}(\mathfrak {j}(\mathfrak {n}_{0}))=0. $$
(5.10)

Here “a” is short for “algebraic” and “c” is short for “cellular”; we use \(\#_{a}\) to record the number of algebraic cases encountered so far when applying Lemma 5.4, and similarly \(\#_{c}\) to record the number of cellular cases. Initialize

$$ \mathfrak {M}'_{\ell '}=\emptyset , \ \ \forall \ell '\in \mathbb{N}. $$
(5.11)

This collection will appear near the end of the algorithm; we will keep adding elements to it as we run the algorithm.

Step 1. Creating nodes at the first level. The root node \(\mathfrak{n}_{0}\) has either one or two children, given as follows. Applying Lemma 5.4 to the function \(T^{\lambda} g\mathbf{1}_{O_{i_{0}}}\), we obtain a collection of cells \(\{O_{i_{1}}\}_{i_{1}}\) (as in (5.6)) and a wall \(W=\mathcal {N}_{(\rho (\mathfrak{n}_{0}))^{1/2+\delta _{m}}}(Z)\) with \(m=\text{dim}(\mathfrak{n}_{0})\) for some variety \(Z\) of dimension \(m-1\). Without loss of generality, we assume that all these regions are unions of balls \(B_{K^{2}}\). Compare

$$ \#\{B_{K^{2}}\in \mathbb{B}_{K^{2}}: B_{K^{2}}\subset \bigcup _{i_{1}} O_{i_{1}} \} \text{ and } \#\{B_{K^{2}}\in \mathbb{B}_{K^{2}}: B_{K^{2}}\subset W\}. $$
(5.12)

If the former term in (5.12) is larger, then we say that we are in the cellular case of this step, and otherwise we say that we are in the algebraic case. In the cellular case, the node \(\mathfrak {n}_{0}\) has only one child (defined in (5.13)-(5.14)), which will be denoted by \(\mathfrak {n}_{1, L}=\mathfrak {n}_{1, L}(\mathfrak {n}_{0})\) and called the L-child; here \(L\) refers to “left”. Define

$$ \rho (\mathfrak {n}_{1, L})=\rho (\mathfrak {n}_{0})/d, \ \ \mathfrak {n}_{1, L}=\{O_{i_{1}} \}_{i_{1}}, \ \ \text{dim}(\mathfrak {n}_{1, L})=m, $$
(5.13)

and

$$\begin{aligned} & \mathfrak {j}(\mathfrak {n}_{1, L})=\mathfrak {j}(\mathfrak {n}_{0})+1, \end{aligned}$$
(5.14)
$$\begin{aligned} & \#_{c}(\mathfrak {j}(\mathfrak {n}_{1, L}))=\#_{c}(\mathfrak {j}(\mathfrak {n}_{0}))+1, \ \ \#_{a}( \mathfrak {j}(\mathfrak {n}_{1, L}))=\#_{a}(\mathfrak {j}(\mathfrak {n}_{0})). \end{aligned}$$
(5.15)

In other words, we have had one cellular case so far, and zero algebraic cases. Define \(\mathfrak {L}_{1}=\{\mathfrak {n}_{1, L}\}\); that is, we use \(\mathfrak {L}_{1}\) to collect all the L-children in this step. Moreover, set \(\mathfrak {M}_{1}=\mathfrak {R}_{1}=\emptyset \). This finishes defining the node \(\mathfrak {n}_{1, L}\) and its information.

If we are in the algebraic case, then the node \(\mathfrak {n}_{0}\) has two children, which will be denoted by \(\mathfrak {n}_{1, M}=\mathfrak {n}_{1, M}(\mathfrak {n}_{0})\) and \(\mathfrak {n}_{1, R}=\mathfrak {n}_{1, R}(\mathfrak {n}_{0})\) and be called the M-child and the R-child; here \(M\) refers to “middle”, and \(R\) to “right”. Define

$$ \rho (\mathfrak{n}_{1, M})=\rho (\mathfrak{n}_{1, R})=\rho ( \mathfrak{n}_{0})^{1-\tilde{\delta}_{m-1}}, $$
(5.16)

and

$$ \text{dim}(\mathfrak{n}_{1, M})=n, \ \text{dim}(\mathfrak{n}_{1, R})=n-1, \ \ \mathfrak{n}_{1, M}=\mathfrak{n}_{1, R}=\{O_{i'_{1}}\}_{i'_{1}}, $$
(5.17)

where each \(O_{i'_{1}}\) is given by \(W\cap B_{\rho (\mathfrak{n}_{1, M})}\) and we let \(B_{\rho (\mathfrak{n}_{1, M})}\) run through a collection of finitely overlapping balls of radius \(\rho (\mathfrak{n}_{1, M})\) inside \(B_{\rho (\mathfrak{n}_{0})}\). Moreover, define

$$\begin{aligned} & \mathfrak {j}(\mathfrak {n}_{1, M})=\mathfrak {j}(\mathfrak {n}_{0})+1, \end{aligned}$$
(5.18)
$$\begin{aligned} & \#_{c}(\mathfrak {j}(\mathfrak {n}_{1, M}))=\#_{c}(\mathfrak {j}(\mathfrak {n}_{0})), \ \ \#_{a}( \mathfrak {j}(\mathfrak {n}_{1, M}))=\#_{a}(\mathfrak {j}(\mathfrak {n}_{0}))+1, \end{aligned}$$
(5.19)

and

$$ \mathfrak {j}(\mathfrak {n}_{1, R})=0, \ \ \#_{c}(\mathfrak {j}(\mathfrak {n}_{1, R}))=\#_{a}( \mathfrak {j}(\mathfrak {n}_{1, R}))=0. $$
(5.20)

Here let us explain the rule for defining \(\mathfrak {j}\): it is reset to 0 whenever we create an \(R\)-child, and otherwise its value is increased by 1.

Let \(\mathfrak{M}_{1}\) and \(\mathfrak{R}_{1}\) collect all the M-children and R-children at Step 1, respectively. As \(\mathfrak {n}_{0}\) has no L-child and we always use \(\mathfrak {L}_{1}\) to collect \(L\)-children, we set \(\mathfrak {L}_{1}=\emptyset \). This finishes the first step.
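The bookkeeping rules of Steps 0-1 (and of the analogous later steps) can be summarized schematically as follows. Only the parameters \(\rho \), \(\text{dim}\), \(\mathfrak {j}\) and the counters \(\#_{c}\), \(\#_{a}\) are tracked; the geometric content (cells, walls, varieties) is not modeled, and all concrete values below are illustrative.

```python
from dataclasses import dataclass

@dataclass
class Node:
    dim: int        # dimension parameter of the node
    rho: float      # radius parameter
    j: int          # the counter j (reset to 0 at every R-child)
    num_cell: int   # #_c: cellular cases seen along this branch
    num_alg: int    # #_a: algebraic cases seen along this branch

def l_child(n, d):
    # cellular case: radius shrinks by a factor d, dimension unchanged
    return Node(n.dim, n.rho / d, n.j + 1, n.num_cell + 1, n.num_alg)

def m_child(n, delta_tilde):
    # algebraic case, M-child: same dimension, radius rho^(1 - delta~)
    return Node(n.dim, n.rho ** (1 - delta_tilde), n.j + 1,
                n.num_cell, n.num_alg + 1)

def r_child(n, delta_tilde):
    # algebraic case, R-child: dimension drops by 1, j and the
    # counters are reset to 0
    return Node(n.dim - 1, n.rho ** (1 - delta_tilde), 0, 0, 0)

root = Node(dim=4, rho=1e6, j=0, num_cell=0, num_alg=0)   # Step 0
a = l_child(root, d=10)            # a cellular step
b = r_child(a, delta_tilde=0.1)    # then an algebraic step, R-branch
```

Because the cellular case now shrinks the radius by \(d\) rather than by 2, the depth of any purely cellular chain is \(O(\log R/\log d)\), which is what makes the control of the non-admissible parameters in Lemma 5.10 possible.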

Step 2. Creating nodes at the second level. Take a node \(\mathfrak{n}_{1}\) from the previous step. It has one or two children.

If \(\mathfrak{n}_{1}\in \mathfrak{L}_{1}\) or \(\mathfrak{M}_{1}\), then its children, which will be named either as \(\mathfrak{n}_{2, L}\) or as \(\mathfrak{n}_{2, M}, \mathfrak{n}_{2, R}\), will be given as follows. For each \(O_{i_{1}}\in \mathfrak{n}_{1}\), we apply Lemma 5.4 with dimension parameter \(\text{dim}(\mathfrak{n}_{1})\), and obtain a collection of cells \(\{O_{i_{2}}\}_{i_{2}}\) and a wall \(W_{m-1}=\mathcal {N}_{(\rho (\mathfrak{n}_{1}))^{1/2+\delta _{m}}}(Z_{m-1})\) with \(m=\text{dim}(\mathfrak{n}_{1})\) for some variety \(Z_{m-1}\) of dimension \(m-1\). We make a comparison similar to (5.12). If we are in the cellular case, then define

$$ \rho (\mathfrak{n}_{2, L})=\rho (\mathfrak{n}_{1})/d, \ \mathfrak{n}_{2, L}=\bigcup _{O_{i_{1}}\in \mathfrak {n}_{1}}\{O_{i_{2}}\}_{i_{2}}, \ \text{dim}( \mathfrak{n}_{2, L})=\text{dim}(\mathfrak {n}_{1}), $$
(5.21)

and

$$\begin{aligned} & \mathfrak {j}( \mathfrak {n}_{2, L} )=\mathfrak {j}( \mathfrak {n}_{1} )+1, \end{aligned}$$
(5.22)
$$\begin{aligned} & \#_{c}(\mathfrak {j}(\mathfrak {n}_{2, L}))=\#_{c}(\mathfrak {j}(\mathfrak {n}_{1}))+1, \ \ \#_{a}( \mathfrak {j}(\mathfrak {n}_{2, L}))=\#_{a}(\mathfrak {j}(\mathfrak {n}_{1})). \end{aligned}$$
(5.23)

If we are in the algebraic case, then define

$$ \rho (\mathfrak{n}_{2, M})=\rho (\mathfrak{n}_{2, R})=\rho ( \mathfrak{n}_{1})^{1-\tilde{\delta}_{m-1}}, $$
(5.24)

and

$$\begin{aligned} & \text{dim}(\mathfrak{n}_{2, M})=\text{dim}(\mathfrak {n}_{1}), \ \text{dim}(\mathfrak{n}_{2, R})=\text{dim}(\mathfrak {n}_{1})-1, \end{aligned}$$
(5.25)
$$\begin{aligned} & \mathfrak{n}_{2, M}=\mathfrak{n}_{2, R}=\bigcup _{O_{i_{1}}\in \mathfrak {n}_{1}}\{O_{i'_{2}}\}_{i'_{2}}, \end{aligned}$$
(5.26)

where each \(O_{i'_{2}}\) is given by \(W_{m-1}\cap B_{\rho (\mathfrak{n}_{2, M})}\) and we let \(B_{\rho (\mathfrak{n}_{2, M})}\) run through all balls of radius \(\rho (\mathfrak{n}_{2, M})\) inside \(B_{\rho (\mathfrak{n}_{1})}\). Moreover, define

$$\begin{aligned} & \mathfrak {j}(\mathfrak {n}_{2, M})=\mathfrak {j}(\mathfrak {n}_{1})+1, \end{aligned}$$
(5.27)
$$\begin{aligned} & \#_{c}(\mathfrak {j}(\mathfrak {n}_{2, M}))=\#_{c}(\mathfrak {j}(\mathfrak {n}_{1})), \ \ \#_{a}( \mathfrak {j}(\mathfrak {n}_{2, M}))=\#_{a}(\mathfrak {j}(\mathfrak {n}_{1}))+1, \end{aligned}$$
(5.28)

and

$$ \mathfrak {j}(\mathfrak {n}_{2, R})=0, \ \ \#_{c}(\mathfrak {j}(\mathfrak {n}_{2, R}))=\#_{a}( \mathfrak {j}(\mathfrak {n}_{2, R}))=0. $$
(5.29)

Let \(\mathfrak {L}_{2}\) collect all the \(L\)-children at Step 2, and similarly, we define \(\mathfrak {M}_{2}\) and \(\mathfrak {R}_{2}\).

Next, consider the remaining case \(\mathfrak {n}_{1}\in \mathfrak {R}_{1}\). This step is quite similar to the case above where \(\mathfrak {n}_{1}\in \mathfrak {L}_{1}\) or \(\mathfrak {M}_{1}\), with the main difference being how \(\mathfrak {n}_{2, M}\) is defined. The children of \(\mathfrak {n}_{1}\) are given as follows. For each \(O_{i_{1}}\in \mathfrak {n}_{1}\), we apply Lemma 5.4 with dimension parameter \(m=\text{dim}(\mathfrak {n}_{1})\) and \(\delta _{\circ}=\delta _{m}\), and obtain a collection of cells \(\{O_{i_{2}}\}_{i_{2}}\) and a wall \(W_{m-1}=\mathcal {N}_{(\rho (\mathfrak{n}_{1}))^{1/2+\delta _{m}}}(Z_{m-1})\) for some variety \(Z_{m-1}\) of dimension \(m-1\). If we are in the cellular case, then define \(\rho (\mathfrak{n}_{2, L}), \mathfrak{n}_{2, L}, \text{dim}(\mathfrak{n}_{2, L})\) in the same way as in (5.21) and \(\mathfrak {j}( \mathfrak {n}_{2, L} ), \#_{c}(\mathfrak {j}(\mathfrak {n}_{2, L})), \#_{a}( \mathfrak {j}(\mathfrak {n}_{2, L}))\) in the same way as in (5.22). If we are in the algebraic case, then define

$$ \rho (\mathfrak{n}_{2, M})=\rho (\mathfrak{n}_{2, R})=\rho ( \mathfrak{n}_{1})^{1-\tilde{\delta}_{m-1}}, $$
(5.30)

and

$$\begin{aligned} &\text{dim}(\mathfrak{n}_{2, M})=\text{dim}(\mathfrak {n}_{1}), \ \text{dim}(\mathfrak{n}_{2, R})=\text{dim}(\mathfrak {n}_{1})-1, \end{aligned}$$
(5.31)
$$\begin{aligned} & \mathfrak{n}_{2, R}=\bigcup _{O_{i_{1}}\in \mathfrak {n}_{1}}\{O_{i'_{2}} \}_{i'_{2}}, \end{aligned}$$
(5.32)

where each \(O_{i'_{2}}\) is given by \(W_{m-1}\cap B_{\rho (\mathfrak{n}_{2, M})}\) and we let \(B_{\rho (\mathfrak{n}_{2, M})}\) run through a collection of finitely overlapping balls of radius \(\rho (\mathfrak{n}_{2, M})\) that intersect \(O_{i_{1}}\). Our choice of parameters guarantees that

$$ \rho (\mathfrak {n}_{2, R})^{1/2+\delta _{m-1}}=\rho (\mathfrak {n}_{1})^{1/2+ \delta _{m}}. $$
(5.33)

We still need to define \(\mathfrak {n}_{2, M}\). Roughly speaking, for each \(O_{i'_{2}}\) given by \(W_{m-1}\cap B_{\rho (\mathfrak{n}_{2, M})}\), which is of thickness \(\rho (\mathfrak {n}_{1})^{1/2+\delta _{m}}\), we will cut it into thinner layers \(W_{m-1, b}\) of thickness \(\rho (\mathfrak {n}_{2, M})^{1/2+\delta _{m}}\). Then we set

$$ \mathfrak {n}_{2, M}=\bigcup _{O_{i_{1}}\in \mathfrak {n}_{1}} \bigcup _{O_{i'_{2}}} \{O_{i'_{2}}\cap W_{m-1, b}\}_{b}. $$
(5.34)

To make this precise, we follow the treatment of Hickman–Rogers [12]. For each \(B_{\rho (\mathfrak {n}_{2, M})}\), as on page 258 of [12], we find a finite set of translates \(\mathfrak {B}\subset B(0, \rho (\mathfrak {n}_{1})^{1/2+\delta _{m}})\) and then set (following the last equation on [12, page 258])

$$ \mathfrak {n}_{2, M}= \bigcup _{O_{i_{1}}\in \mathfrak {n}_{1}} \bigcup _{O_{i'_{2}}} \{O_{i'_{2}}\cap \mathcal {N}_{\rho (\mathfrak {n}_{2, M})^{1/2+\delta _{m}}}(Z_{m-1}+b): b\in \mathfrak {B}\}. $$
(5.35)

In the end, define \(\mathfrak {j}(\mathfrak {n}_{2, M}), \mathfrak {j}(\mathfrak {n}_{2, R}), \#_{a}, \#_{c}\) in the same way as in (5.27)–(5.29).
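In coordinates transverse to the wall, the cutting in (5.35) amounts to covering a slab of the original thickness by translates of a thinner slab. A one-dimensional sketch (illustrative numbers only; the helper name `layer_offsets` is ours):

```python
# Sketch: cover the interval [0, thick) -- standing in for the transverse
# thickness of the wall -- by translates of a thinner layer of width `thin`,
# as in the choice of the set of translates B. Returns the offsets b.
def layer_offsets(thick, thin):
    offsets = []
    b = 0.0
    while b < thick:
        offsets.append(b)
        b += thin
    return offsets
```

For example, a layer of thickness 1 splits into 4 translated layers of thickness 1/4; in the setting of (5.35) the two thicknesses are \(\rho (\mathfrak {n}_{1})^{1/2+\delta _{m}}\) and \(\rho (\mathfrak {n}_{2, M})^{1/2+\delta _{m}}\).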

Let \(\mathfrak {L}_{2}\) be the collection of all the \(L\)-children at Step 2, and similarly, we define \(\mathfrak {M}_{2}\) and \(\mathfrak {R}_{2}\). This finishes Step 2.

Step \(\ell \). Creating nodes at the \(\ell \)-th level. How we proceed in a general step is similar to what we did in Step 2, with one difference mentioned at the beginning of this section: we need to control how fast cells shrink. We will sketch the part of this step that is similar to Step 2, and explain the difference in more detail.

Take a node \(\mathfrak {n}_{\ell -1}\) from the previous step. There are a few parameters associated with it: a dimension parameter \(\text{dim}(\mathfrak {n}_{\ell -1})=:m\), a radius parameter \(\rho (\mathfrak {n}_{\ell -1})\), and the parameters \(\mathfrak {j}(\mathfrak {n}_{\ell -1})\), \(\#_{c}(\mathfrak {j}(\mathfrak {n}_{\ell -1}))\) and \(\#_{a}(\mathfrak {j}(\mathfrak {n}_{\ell -1}))\) satisfying

$$ \mathfrak {j}(\mathfrak {n}_{\ell -1})=\#_{c}(\mathfrak {j}(\mathfrak {n}_{\ell -1}))+ \#_{a}( \mathfrak {j}(\mathfrak {n}_{\ell -1})). $$
(5.36)

Before we proceed, we need to introduce new notation. Let \(\mathfrak {n}_{\ell -1}^{\uparrow}\) denote the closest ancestor of \(\mathfrak {n}_{\ell -1}\) (itself included) that belongs to \(\mathfrak {M}_{\ell '}\), \(\mathfrak {M}'_{\ell '}\) or \(\mathfrak {R}_{\ell '}\) for some \(\ell '\). (Recall the initialization of \(\mathfrak {M}'_{\ell '}\) in (5.11).) Note that

$$ \text{dim}( \mathfrak {n}_{\ell -1} ) =\text{dim}( \mathfrak {n}_{\ell -1}^{\uparrow} ). $$
(5.37)

For each \(O_{i_{\ell -1}}\in \mathfrak {n}_{\ell -1}\), we apply Lemma 5.4 with dimension parameter \(m\) and \(\delta _{\circ}=\delta _{\circ}(\mathfrak {n}_{\ell -1})\) satisfying

$$ \rho (\mathfrak {n}_{\ell -1})^{\frac{1}{2}+\delta _{\circ}}= \rho (\mathfrak {n}_{ \ell -1}^{\uparrow}) ^{\frac{1}{2}+\delta _{m}} $$
(5.38)

and obtain a collection of cells \(\{O_{i_{\ell}}\}_{i_{\ell}}\) and a wall

$$ W_{m-1}=\mathcal {N }_{ (\rho (\mathfrak {n}^{\uparrow}_{\ell -1}))^{1/2+\delta _{m}} }(Z_{m-1}) $$
(5.39)

for some variety \(Z_{m-1}\). If we are in the algebraic case, then \(\mathfrak {n}_{\ell -1}\) has two children, called \(\mathfrak {n}_{\ell , M}\) and \(\mathfrak {n}_{\ell , R}\), and similarly to (5.30)–(5.35), we define

$$ \rho (\mathfrak {n}_{\ell , M})=\rho (\mathfrak {n}_{\ell , R})=\rho (\mathfrak {n}_{\ell -1})^{1- \tilde{\delta}_{m-1}}, \ \mathfrak {n}_{\ell , R}=\bigcup _{O_{i_{\ell -1}} \in \mathfrak {n}_{\ell -1}} \{O_{i'_{\ell}}\}_{i'_{\ell}}; $$
(5.40)

moreover, define

$$ \mathfrak {n}_{\ell , M}=\bigcup _{ O_{i_{\ell -1}}\in \mathfrak {n}_{\ell -1} } \bigcup _{ O_{i'_{\ell}} } \{ O_{i'_{\ell}}\cap \mathcal {N}_{ \rho (\mathfrak {n}_{ \ell , M})^{1/2+\delta _{m}} } (Z_{m-1}+b) : b\in \mathfrak {B} \}, $$
(5.41)

where \(\mathfrak {B}\) is a finite set of points in \(B(0, \rho ( \mathfrak {n}_{\ell -1} )^{1/2+\delta _{\circ}} )\), and

$$ \text{dim}(\mathfrak {n}_{\ell , M})=\text{dim}(\mathfrak {n}_{\ell -1}), \ \text{dim}(\mathfrak {n}_{ \ell , R})=\text{dim}(\mathfrak {n}_{\ell -1})-1. $$
(5.42)

If we are in the cellular case, then we proceed differently. There are two further cases. If we are in the case

$$ \frac{\rho (\mathfrak {n}_{\ell -1})}{d}\ge \rho ( \mathfrak {n}^{\uparrow}_{\ell -1} ) ^{1- \delta _{m-1/2} }, $$
(5.43)

where \(\delta _{m-1/2}\) is as in (1.45), then we define \(\mathfrak {n}_{\ell , L}, \rho (\mathfrak {n}_{\ell , L} )\) and \(\text{dim}(\mathfrak {n}_{\ell , L} )\) in the same way as in (5.21), and \(\mathfrak {j}(\mathfrak {n}_{\ell , L} ), \#_{c}(\mathfrak {j}(\mathfrak {n}_{\ell , L} ))\) and \(\#_{a}(\mathfrak {j}(\mathfrak {n}_{\ell , L} ))\) in the same way as in (5.22). If (5.43) is violated, then we first update

$$ \mathfrak {M}'_{\ell}=\mathfrak {M}'_{\ell}\bigcup \{\mathfrak {n}_{\ell , L}(\mathfrak {n}_{ \ell -1})\}. $$
(5.44)

The next step is to cut each \(O_{i_{\ell}}\) into thinner layers, in a way that is essentially the same as in (5.35). Let us be more precise. By the way we run the partitioning algorithm, in particular, due to the choice of the parameter \(\delta _{\circ}\) in (5.38), we know that

$$ \rho ( \mathfrak {n} )^{ \frac{1}{2}+ \delta _{\circ} (\mathfrak {n}) }= \rho ( \mathfrak {n}_{\ell -1}^{\uparrow} )^{\frac{1}{2}+\delta _{m}} $$
(5.45)

for every node \(\mathfrak {n}\) with \(\mathfrak {n}_{\ell -1}\preccurlyeq \mathfrak {n}\preccurlyeq \mathfrak {n}_{\ell -1}^{ \uparrow}\). Therefore we have

$$ O_{i_{\ell}}\subset B_{ \rho ( \mathfrak {n}_{\ell , L} ) }\cap \mathcal {N}_{ \rho ( \mathfrak {n}_{\ell -1}^{\uparrow} )^{1/2+\delta _{m}} }(Z_{m}), $$
(5.46)

where

$$ \rho (\mathfrak {n}_{\ell , L}):=\rho (\mathfrak {n}_{\ell -1})/d, $$
(5.47)

and, as before, \(\rho (\mathfrak {n}_{\ell , L})\) is defined before \(\mathfrak {n}_{\ell , L}\) itself; here \(Z_{m}\) is an \(m\)-dimensional variety. Note that as (5.43) is violated, we have

$$ \rho ( \mathfrak {n}^{\uparrow}_{\ell -1} ) ^{1- \delta _{m-1/2} }/d \le \rho (\mathfrak {n}_{\ell , L})\le \rho ( \mathfrak {n}^{\uparrow}_{\ell -1} ) ^{1- \delta _{m-1/2} }. $$
(5.48)

We will cut the right hand side of (5.46) into thinner layers of thickness \(\rho (\mathfrak {n}_{\ell , L})^{1/2+\delta _{m}}\) and apply transverse equidistribution properties (for instance Lemma 8.4 in [10]). This can be done in exactly the same way as in (5.35) and (5.41), with the only difference being that the parameter \(\tilde{\delta}_{m-1}\) appearing in the radius \(\rho (\mathfrak {n}_{2, M})\) is replaced by \(\delta _{m-1/2}\) (which appears in the radius \(\rho (\mathfrak {n}_{\ell , L})\) because of the relation (5.48)). Therefore, we follow [12, page 258] and find a finite set of translates \(\mathfrak {B}\subset B(0, \rho ( \mathfrak {n}_{\ell -1}^{\uparrow} )^{1/2+\delta _{m}} )\), and then set

$$ \mathfrak {n}_{\ell , L}=\bigcup _{ O_{i_{\ell -1}}\in \mathfrak {n}_{\ell -1} } \bigcup _{ O_{i'_{\ell}} } \{ O_{i'_{\ell}}\cap \mathcal {N}_{ \rho (\mathfrak {n}_{ \ell , L})^{1/2+\delta _{m}} } (Z_{m-1}+b) : b\in \mathfrak {B} \}. $$
(5.49)

Moreover, define

$$\begin{aligned} & \text{dim}(\mathfrak {n}_{\ell , L})=\text{dim}(\mathfrak {n}_{\ell -1}), \ \ \mathfrak {j} ( \mathfrak {n}_{\ell , L} )=\mathfrak {j} ( \mathfrak {n}_{\ell -1} )+1, \end{aligned}$$
(5.50)
$$\begin{aligned} & \#_{c}(\mathfrak {j} ( \mathfrak {n}_{\ell , L} ))=\#_{c}(\mathfrak {j}(\mathfrak {n}_{\ell -1}))+1, \ \ \#_{a}(\mathfrak {j} ( \mathfrak {n}_{\ell , L} ))=\#_{a}(\mathfrak {j}(\mathfrak {n}_{\ell -1})). \end{aligned}$$
(5.51)

In the end, we let \(\mathfrak {L}_{\ell}\) collect all the \(L\)-children at the \(\ell \)-th level, and similarly we let \(\mathfrak {M}_{\ell}\) collect all the \(M\)-children and \(\mathfrak {R}_{\ell}\) collect all the \(R\)-children. This finishes the \(\ell \)-th step of the algorithm.

Before we proceed to the next step, we make a remark on the size of the parameter \(\delta _{\circ}\) in (5.38). By (5.38) and (5.43), we have

$$ \rho ( \mathfrak {n}_{\ell -1}^{\uparrow} )^{ \frac{1+2\delta _{m}}{1+2\delta _{\circ}} } =\rho (\mathfrak {n}_{\ell -1}) \ge d \rho ( \mathfrak {n}^{\uparrow}_{\ell -1} ) ^{1- \delta _{m-1/2} }, $$
(5.52)

which further implies

$$ \delta _{m}\le \delta _{\circ}\le \delta _{m-1/2}. $$
(5.53)

In other words, \(\delta _{\circ}\) is still quite close to \(\delta _{m}\), and very far from \(\delta _{m-1}\).
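For concreteness, one can solve (5.38) for \(\delta _{\circ}\) and check the bounds (5.53) numerically. The sketch below uses illustrative parameter values (not the paper's actual choices), with \(\delta _{m}\) taken well below \(\delta _{m-1/2}\) as the parameter hierarchy requires:

```python
import math

# Solve rho^{1/2 + delta_circ} = rho_up^{1/2 + delta_m} for delta_circ,
# i.e. delta_circ = (1/2 + delta_m) * log(rho_up)/log(rho) - 1/2,
# which is equation (5.38) in logarithmic form.
def delta_circ(rho, rho_up, delta_m):
    return (0.5 + delta_m) * math.log(rho_up) / math.log(rho) - 0.5

# Illustrative parameters only: delta_m well below delta_{m-1/2}.
rho_up = 1e12
delta_m, delta_m_half = 0.005, 0.02
# Smallest rho allowed by (5.43), ignoring the harmless factor d:
rho = rho_up ** (1 - delta_m_half)
d0 = delta_circ(rho, rho_up, delta_m)
```

With these numbers \(d_{0}\approx 0.0153\), which indeed lies in the window \([\delta _{m}, \delta _{m-1/2}]\) asserted by (5.53).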

Remark 5.5

In (5.40), each element \(O_{i'_{\ell}}\) in \(\mathfrak {n}_{\ell , R}\) is given by \(B_{ \rho (\mathfrak {n}_{\ell , R}) } \cap W_{m-1} \), where \(B_{ \rho (\mathfrak {n}_{\ell , R}) }\) is a ball of radius \(\rho (\mathfrak {n}_{\ell , R})\) and \(W_{m-1}\) is given in (5.39). Their counterpart in [12] is given by \(B\cap \mathcal {N}_{ \rho _{j}^{1/2+\delta _{m}} }({\mathbf {Y}})\) at the bottom of [12, page 256], where \(B\) is a ball of radius \(\rho _{j+1}\); in our notation, \(\rho _{j+1}=\rho ( \mathfrak {n}_{\ell , R} )\), \(\rho _{j}=\rho ( \mathfrak {n}_{\ell -1} )\) and \({\mathbf {Y}}=W_{m-1}\). The slight difference is that the neighborhood scale \((\rho ( \mathfrak {n}_{\ell -1}^{\uparrow} ))^{ 1/2+\delta _{m} }\) in (5.39) is bigger than \(\rho _{j}^{1/2+\delta _{m}}\). However, by (5.43),

$$ \frac{ \rho (\mathfrak {n}_{\ell -1}^{\uparrow}) }{\rho ( \mathfrak {n}_{\ell -1} )}\le \rho (\mathfrak {n}_{\ell -1}^{\uparrow})^{\delta _{m-1/2}}. $$
(5.54)

We will lose about \(\delta _{m-1}^{-1}\) of these multiplicative factors. As \(\delta _{m-1/2}\ll _{\epsilon} \delta _{m-1}\), we see that they are harmless.

Stopping condition. Suppose we have arrived at the \(\ell _{0}\)-th level. Take a node \(\mathfrak {n}_{\ell _{0}}\). The algorithm will not continue at this node (but may still continue at other nodes at the same level) if either \(\rho (\mathfrak {n}_{\ell _{0}})\le R^{\delta _{0}}\) or \(\text{dim}(\mathfrak {n}_{\ell _{0}})\le k-1\). Here \(\delta _{0}\) is given in (1.44) and \(k\) is given in Theorem 4.2. In other words, we will not continue our algorithm if the radius of the node is too small, or the dimension is too small.

We state one lemma that will be used later.

Lemma 5.6

We have

$$ \# \Big( \bigcup _{\iota} (\mathfrak {M}_{\iota}\cup \mathfrak {R}_{\iota} ) \Big) \lesssim _{n, \delta} 1. $$
(5.55)

Proof of Lemma 5.6

Note that the left hand side of (5.55) would not change if we assumed that there were no cellular cases in the algorithm. In that case, each node has either zero or two children. The total number of levels satisfies \(\ell _{0}\le \delta ^{-1}\). Moreover, note that the algorithm stops at a node \(\mathfrak {n}\) if \(\text{dim}(\mathfrak {n})\le k-1\). As a consequence, we see that the left hand side of (5.55) is \(\le n\delta ^{-1}\). □

5.3 The related case

The proof of Theorem 4.2 relies on a two-ends argument. This requires a relation, denoted by ∼, which is defined between tubes \(T\in \mathbb{T}[B_{R}]\) and balls \(B_{\iota}\subset B_{R}\) of radius \(R^{1-\delta}\). The definition of ∼ is a bit complicated and relies on the definition of brooms; it will be given in Sect. 7. At this point, we only need the fact that

$$ \#\{B_{\iota}\subset B_{R}: B_{\iota}\sim T\}\lesssim 1, $$
(5.56)

for every tube \(T\in \mathbb{T}[B_{R}]\).

Definition 5.7

For each ball \(B_{\iota}\subset B_{R}\) of radius \(R^{1-\delta}\) and \({\mathbf {x}}\in B_{\iota}\), define

$$ T^{\lambda} g^{\sim}({\mathbf {x}}):=\sum _{T\in \mathbb{T}[B_{R}], T\sim B_{\iota}} T^{ \lambda} g_{T}({\mathbf {x}}), $$
(5.57)

and define \(T^{\lambda} g^{\nsim}({\mathbf {x}})\) to be the difference of \(T^{\lambda} g({\mathbf {x}})\) and \(T^{\lambda} g^{\sim}({\mathbf {x}})\). Moreover, for a given \(\iota \), define

$$ g^{\nsim}_{\iota}:=\sum _{T\in \mathbb{T}[B_{R}], T\nsim B_{\iota}} g_{T}. $$
(5.58)

Under the above notation, it holds that

$$ T^{\lambda} g^{\nsim}({\mathbf {x}})=T^{\lambda} g^{\nsim}_{\iota}( {\mathbf {x}}), $$
(5.59)

whenever \({\mathbf {x}}\in B_{\iota}\). Recall the definition of \(\mathbb{B}_{K^{2}}\) at the beginning of Sect. 5.2. Denote

$$ \mathbb{B}'_{K^{2}}:= \Big\{ B_{K^{2}}\in \mathbb{B}_{K^{2}}: \big\| T^{\lambda} g^{\sim} \big\| _{\mathrm {BL}_{k, A}^{p}(B_{K^{2}})} \le \big\| T^{\lambda} g^{\nsim} \big\| _{\mathrm {BL}_{k, A}^{p}(B_{K^{2}})} \Big\} . $$
(5.60)

If \(|\mathbb{B}'_{K^{2}}|\le |\mathbb{B}_{K^{2}}|/2\), then we say that we are in the related case (of Theorem 4.2), and otherwise we say that we are in the non-related case. Because of the pigeonholing step in (5.3), if we are in the related case, then the contribution to the broad norm \(\mathrm {BL}_{k, A}^{p}(B_{R})\) from \(\mathbb{B}_{K^{2}}\setminus \mathbb{B}'_{K^{2}}\) is bigger. In this case, we can use the induction hypothesis (4.5) and the fact that only a bounded number of balls are related to each tube, as in (5.56), to finish the proofs of Theorem 4.2 and Theorem 4.1.

Lemma 5.8

If we are in the related case, then (4.26) holds.

The proof of this lemma is a standard induction-on-scales argument, and is the same as that of Lemma 2.20 in [21]; we omit it.

5.4 Partitioning algorithm: part II

The rest of the paper is devoted to the case that

$$ |\mathbb{B}'_{K^{2}}|\ge |\mathbb{B}_{K^{2}}|/2 $$
(5.61)

that is, the case where the unrelated component \(T^{\lambda} g^{\nsim}\) dominates. Recall that we need to bound

$$ \begin{aligned} & \sum _{B_{K^{2}}\in \mathbb{B}_{K^{2}}} \big\| T^{\lambda} g^{\nsim} \big\| _{\mathrm {BL}_{k, A}^{p}(B_{K^{2}})}^{p}\simeq \sum _{B_{\iota}} \sum _{B_{K^{2}}\subset B_{\iota}} \Big\| T^{\lambda} g^{\nsim}_{\iota} \Big\| _{\mathrm {BL}_{k, A}^{p}(B_{K^{2}})}^{p}. \end{aligned} $$
(5.62)

We run the previous algorithm again with \(T^{\lambda} g\) replaced by \(T^{\lambda} g^{\nsim}\). Note that in the algorithm in Sect. 5.2, we did not compare contributions between the transverse case and the tangential case, precisely because we will run the further algorithm below. In what follows, we often abbreviate \(g^{\nsim}_{\iota}\) to \(g_{\iota}\).

Step 0. Define \(\mathfrak {n}^{*}_{0}=\mathfrak {n}_{0}\).

Step 1. We will define three quantities; they correspond to the contributions from the cellular case \(C(\mathfrak{L}_{1})\), the transverse case \(C(\mathfrak{M}_{1})\) and the tangential case \(C(\mathfrak{R}_{1})\). Take a ball \(B_{\iota}\subset B_{R}\) of radius \(R^{1-\delta}\) and a child \(\mathfrak {n}_{1}\) of \(\mathfrak {n}^{*}_{0}\) with \(\mathfrak {n}_{1}\in \mathfrak {L}_{1}\). For \(O_{i_{1}}\in \mathfrak {n}_{1}\) with \(O_{i_{1}}\subset B_{\rho _{1}}\subset B_{\iota}\), where \(\rho _{1}=\rho (\mathfrak {n}_{1})\) and \(B_{\rho _{1}}\) is some ball of radius \(\rho (\mathfrak {n}_{1})\), denote

$$ g_{\iota , O_{i_{1}}}=\sum _{T\in \mathbb{T}[B_{R}], T\cap O_{i_{1}}\neq \emptyset}(g_{\iota})_{T}. $$
(5.63)

Define

$$ C(\mathfrak{L}_{1}):= \sum _{O_{i_{1}}\in \mathfrak{n}_{1}} \sum _{ \iota} \Big\| T^{\lambda} g_{\iota , O_{i_{1}}} \Big\| _{\mathrm {BL}_{k, A}^{p}(O_{i_{1}} \cap B_{\iota})}^{p}. $$
(5.64)

If \(\mathfrak {n}_{0}^{*}\) does not have any children in \(\mathfrak {L}_{1}\), then we simply set \(C(\mathfrak {L}_{1})=0\).

To define the other two quantities, we need more notation. Let \(\mathfrak{n}_{1}\) be a child of \(\mathfrak {n}^{*}_{0}\) with \(\mathfrak {n}_{1}\in \mathfrak {M}_{1}\) or \(\mathfrak {R}_{1}\), and take \(O_{i'_{1}}\in \mathfrak{n}_{1}\) given by \(O_{i'_{1}}=B_{\rho _{1}}\cap W\) for some \(B_{\rho _{1}}\subset B_{\rho _{0}}\), with \(\rho _{1}=\rho (\mathfrak {n}_{1})\) and \(\rho _{0}=\rho (\mathfrak {n}^{*}_{0})\). Similarly to (5.63), we define \(g_{\iota , O_{i'_{1}}}\). Let \(\mathbb{T}_{O_{i'_{1}}}\) denote the collection of all \(T\in \mathbb{T}[B_{\rho _{0}}]\) for which

$$ T\cap B_{\rho _{1}}\cap W\neq \emptyset . $$
(5.65)

Moreover, we will partition \(\mathbb{T}_{O_{i'_{1}}}\) into two parts

$$ \mathbb{T}_{O_{i'_{1}}}=\mathbb{T}_{O_{i'_{1}}, \mathrm {tang}}\bigcup \mathbb{T}_{O_{i'_{1}}, \mathrm {trans}}, $$
(5.66)

where

$$ \mathbb{T}_{O_{i'_{1}}, \mathrm {tang}}:= \Big\{ T\in \mathbb{T}_{O_{i'_{1}}}: T \text{ is } \rho _{1}^{-\frac{1}{2}+\delta _{m-1}}\text{-tangent to } Z \text{ on } B_{\rho _{1}} \Big\} , $$
(5.67)

where \(m=\text{dim}(\mathfrak {n}^{*}_{0})\). We refer the definition of \(\mathbb{T}_{O_{i'_{1}}, \mathrm {tang}}\) to Definition 9.3 in [12, page 257]; it needs some clarification as \(T\) is a wave packet at the scale \(\rho _{0}\) and we are talking about tangency at the smaller scale \(\rho _{1}\).

After defining \(\mathbb{T}_{O_{i'_{1}}, \mathrm {tang}}\), we will just set

$$ \mathbb{T}_{O_{i'_{1}}, \mathrm {trans}}:=\mathbb{T}_{O_{i'_{1}}}\setminus \mathbb{T}_{O_{i'_{1}}, \mathrm {tang}}. $$
(5.68)

Moreover, define

$$ g_{\iota , O_{i'_{1}}, \mathrm {tang}}:=\sum _{T\in \mathbb{T}_{O_{i'_{1}}, \mathrm {tang}}}(g_{ \iota , O_{i'_{1}}})_{T} $$
(5.69)

and

$$ g_{\iota , O_{i'_{1}}, \mathrm {trans}}:=\sum _{T\in \mathbb{T}_{O_{i'_{1}}, \mathrm {trans}}}(g_{ \iota , O_{i'_{1}}})_{T}. $$
(5.70)

We continue to define the other two quantities. Define

$$ C(\mathfrak {M}_{1}):=\sum _{O_{i'_{1}}\in \mathfrak {n}_{1, M}(\mathfrak {n}^{*}_{0})} \sum _{\iota} \Big\| T^{\lambda} g_{\iota , O_{i'_{1}}, \mathrm {trans}} \Big\| ^{p}_{\mathrm {BL}_{k, A}^{p}(O_{i'_{1}}\cap B_{\iota})} $$
(5.71)

and

$$ C(\mathfrak {R}_{1}):=\sum _{O_{i'_{1}}\in \mathfrak {n}_{1, R}(\mathfrak {n}^{*}_{0})} \sum _{\iota} \Big\| T^{\lambda} g_{\iota , O_{i'_{1}}, \mathrm {tang}} \Big\| ^{p}_{\mathrm {BL}_{k, A}^{p}(O_{i'_{1}}\cap B_{\iota})}. $$
(5.72)

In the end, we compare \(C(\mathfrak {L}_{1}), C(\mathfrak {M}_{1}), C(\mathfrak {R}_{1})\) and see which one is the largest. For the one that is the largest, its node \(\mathfrak {n}_{1}\) will be called \(\mathfrak {n}^{*}_{1}\). This finishes the first step.

Before we proceed to the next step, we introduce more notation, which will be used later and also in the forthcoming broom estimates. If \(\mathfrak {n}^{*}_{1}\in \mathfrak {L}_{1}\), then

$$ g^{*}_{\iota , O_{i_{1}}}:=g_{\iota , O_{i_{1}}}, $$
(5.73)

for which we refer to (5.63). If \(\mathfrak {n}^{*}_{1}\in \mathfrak {M}_{1}\), then

$$ g^{*}_{\iota , O_{i_{1}}}:=g_{\iota , O_{i_{1}}, \mathrm {trans}} $$
(5.74)

for which we refer to (5.70). If \(\mathfrak {n}^{*}_{1}\in \mathfrak {R}_{1}\), then

$$ g^{*}_{\iota , O_{i_{1}}}:=g_{\iota , O_{i_{1}}, \mathrm {tang}} $$
(5.75)

for which we refer to (5.69).

Step \(2\le \ell \le \ell _{0}\). Here \(\ell _{0}\in \mathbb{N}\) is the last step in the algorithm in Sect. 5.2. Step \(\ell \) will be similar to Step 1. Our goal is to define \(C(\mathfrak {L}_{\ell}), C(\mathfrak {M}_{\ell}), C(\mathfrak {R}_{\ell})\). We consider the case \(\mathfrak {n}_{\ell , L}(\mathfrak {n}^{*}_{\ell -1})\) and the case \(\mathfrak {n}_{\ell , M}(\mathfrak {n}^{*}_{\ell -1}), \mathfrak {n}_{\ell , R}(\mathfrak {n}^{*}_{ \ell -1})\) separately.

Take \(\mathfrak {n}_{\ell}=\mathfrak {n}_{\ell , L}(\mathfrak {n}^{*}_{\ell -1})\) and \(O_{i_{\ell}}\in \mathfrak {n}_{\ell}\). Suppose that \(O_{i_{\ell}}\subset B_{\rho _{\ell}}\cap O_{i_{\ell -1}}\) with \(\rho _{\ell}=\rho (\mathfrak {n}_{\ell})\), \(O_{i_{\ell -1}}\in \mathfrak {n}^{*}_{\ell -1}\) and \(O_{i_{\ell -1}}\subset B_{\rho _{\ell -1}}\), \(\rho _{\ell -1}=\rho (\mathfrak {n}^{*}_{\ell -1})\). For a given \(\iota \), denote

$$ g_{\iota , O_{i_{\ell}}}=\sum _{ \substack{T\in \mathbb{T}[B_{\rho _{\ell -1}}]\\ T\cap O_{i_{\ell}}\neq \emptyset} } (g^{*}_{\iota , O_{i_{\ell -1}}})_{T}. $$
(5.76)

Define

$$ C(\mathfrak{L}_{\ell}):= \sum _{O_{i_{\ell}}\in \mathfrak{n}_{\ell}} \sum _{\iota} \Big\| T^{\lambda} g_{\iota , O_{i_{\ell}}} \Big\| _{\mathrm {BL}_{k, A}^{p}(O_{i_{ \ell}}\cap B_{\iota})}^{p}. $$
(5.77)

Next, take \(\mathfrak {n}_{\ell}=\mathfrak {n}_{\ell , M}(\mathfrak {n}^{*}_{\ell -1})\) or \(\mathfrak {n}_{\ell , R}(\mathfrak {n}^{*}_{\ell -1})\) and \(O_{i'_{\ell}}\in \mathfrak {n}_{\ell}\). Suppose that \(O_{i'_{\ell}}\subset B_{\rho _{\ell}}\cap O_{i'_{\ell -1}}\) with \(\rho _{\ell}=\rho (\mathfrak {n}_{\ell})\), \(O_{i'_{\ell -1}}\in \mathfrak {n}^{*}_{\ell -1}\) and \(O_{i'_{\ell -1}}\subset B_{\rho _{\ell -1}}\), \(\rho _{\ell -1}=\rho (\mathfrak {n}^{*}_{\ell -1})\). Let \(\mathbb{T}_{O_{i'_{\ell}}}\) denote the collection of all \(T\in \mathbb{T}[B_{\rho _{\ell}}]\) for which

$$ T\cap B_{\rho _{\ell}}\cap W_{m-1}\neq \emptyset , $$
(5.78)

with \(m=\text{dim}(\mathfrak {n}^{*}_{\ell -1})\). Moreover, we will partition \(\mathbb{T}_{O_{i'_{\ell}}}\) into two parts

$$ \mathbb{T}_{O_{i'_{\ell}}}=\mathbb{T}_{O_{i'_{\ell}}, \mathrm {tang}}\bigcup \mathbb{T}_{O_{i'_{ \ell}}, \mathrm {trans}}, $$
(5.79)

where

$$ \mathbb{T}_{O_{i'_{\ell}}, \mathrm {tang}}:= \Big\{ T\in \mathbb{T}_{O_{i'_{\ell}}}: T\text{ is } \rho _{\ell}^{-\frac{1}{2}+\delta _{m-1}}\text{-tangent to } W_{m-1} \text{ on } B_{\rho _{\ell}} \Big\} , $$
(5.80)

and

$$ \mathbb{T}_{O_{i'_{\ell}}, \mathrm {trans}}:=\mathbb{T}_{O_{i'_{\ell}}}\setminus \mathbb{T}_{O_{i'_{ \ell}}, \mathrm {tang}}. $$
(5.81)

We continue to define the other two quantities. Define

$$ C(\mathfrak {M}_{\ell}):=\sum _{O_{i'_{\ell}}\in \mathfrak {n}_{{\ell}, M}(\mathfrak {n}^{*}_{ \ell -1})} \sum _{\iota} \Big\| T^{\lambda} g_{\iota , O_{i'_{\ell}}, \mathrm {trans}} \Big\| ^{p}_{\mathrm {BL}_{k, A}^{p}(O_{i'_{\ell}}\cap B_{\iota})} $$
(5.82)

and

$$ C(\mathfrak {R}_{\ell}):=\sum _{O_{i'_{\ell}}\in \mathfrak {n}_{\ell , R}(\mathfrak {n}^{*}_{ \ell -1})} \sum _{\iota} \Big\| T^{\lambda} g_{\iota , O_{i'_{\ell}}, \mathrm {tang}} \Big\| ^{p}_{\mathrm {BL}_{k, A}^{p}(O_{i'_{\ell}}\cap B_{\iota})}. $$
(5.83)

In the end, we compare \(C(\mathfrak {L}_{\ell}), C(\mathfrak {M}_{\ell}), C(\mathfrak {R}_{\ell})\) and see which one is the largest. For the one that is the largest, its node \(\mathfrak {n}_{\ell}\) will be called \(\mathfrak {n}^{*}_{\ell}\).

Finally, we define \(g^{*}_{\iota , O_{i_{\ell}}}\) in the same way as in (5.73)–(5.75).

The above algorithm outputs a sequence of nodes

$$ \mathfrak {n}^{*}_{0}, \mathfrak {n}^{*}_{1}, \dots , \mathfrak {n}^{*}_{\ell _{0}}. $$
(5.84)

The parameter \(\text{dim}(\mathfrak {n}^{*}_{\ell})\) is non-increasing in \(\ell \). If \(\mathfrak {n}^{*}_{\ell}\) is an \(R\)-child, then

$$ \text{dim}(\mathfrak {n}^{*}_{\ell})=\text{dim}(\mathfrak {n}^{*}_{\ell -1})-1; $$
(5.85)

otherwise the dimension does not decrease. Denote \(m:=\text{dim}(\mathfrak {n}^{*}_{\ell _{0}})\). We know that \(m\ge k\), where \(k\) is as in Theorem 4.2, as otherwise the desired estimate (4.26) there would be trivial. Let

$$ \mathfrak {S}_{n}, \mathfrak {S}_{n-1}, \dots , \mathfrak {S}_{m} $$
(5.86)

denote the nodes from (5.84) that are \(R\)-children, where \(\mathfrak {S}_{n}:=\mathfrak {n}^{*}_{0}\) is also included. Here \(\mathfrak {S}\) is short for “surface”, as elements in \(\mathfrak {S}_{n'}\) are neighborhoods of algebraic varieties for each \(n'\). We therefore have

$$ \text{dim}(\mathfrak {S}_{n'})=n', \ \ \forall m\le n'\le n. $$
(5.87)

Moreover, denote

$$ r_{n'}:=\rho (\mathfrak {S}_{n'}), \ \ \forall m\le n'\le n, \ \ r_{m-1}:=1. $$
(5.88)

Elements in \(\mathfrak {S}_{n'}\) are of the form \(B_{r_{n'}}\cap \mathcal {N}_{r_{n'}^{1/2+\delta _{n'}}}(S_{n'})\) where \(S_{n'}\) is some algebraic variety of dimension \(n'\). To simplify notation, we will often identify \(B_{r_{n'}}\cap \mathcal {N}_{r_{n'}^{1/2+\delta _{n'}}}(S_{n'})\) with \(S_{n'}\) if it is clear from the context that we are talking about the node \(\mathfrak {S}_{n'}\). We follow [12] and introduce a few new notions.

The pair \((S_{n'}, B_{r_{n'}})\) is called a grain, with its dimension given by \(n'\) and its degree given by the degree of \(S_{n'}\).

A multigrain \(\vec{S}_{n'}\) is a tuple of grains

$$ \vec{S}_{n'}=\left (\mathcal{G}_{n}, \ldots , \mathcal{G}_{n'}\right ), \quad \mathcal{G}_{i}=\left (S_{i}, B_{r_{i}}\right ) \quad \text{ for } n' \leqslant i \leqslant n $$

satisfying

  1. 1.

    \(\operatorname{dim}\left (S_{i}\right )=i\) for \(n' \leqslant i \leqslant n\);

  2. 2.

    \(S_{n} \supset S_{n-1} \supset \cdots \supset S_{n'}\);

  3. 3.

    \(B_{r_{n}} \supset B_{r_{n-1}} \supset \cdots \supset B_{r_{n'}}\).

Sometimes we also write \(\vec{S}_{n'}=(S_{n}, \dots , S_{n'})\). The parameter \(n-n'\) is referred to as the level of the multigrain \(\vec{S}_{n'}\). The complexity of the multigrain is defined to be the maximum of the degrees \(\operatorname{deg} S_{i}\) over all \(n' \leqslant i \leqslant n\).
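The defining conditions of a multigrain can be encoded directly. In the sketch below (our own toy representation, not from the paper), a grain is abbreviated to a pair of its dimension and its ball radius; condition 1 becomes "dimensions drop by exactly 1", condition 3 becomes "radii are non-increasing", and condition 2, the nesting of the varieties \(S_{i}\) themselves, is abstracted away:

```python
# Sketch: a multigrain as a tuple of (dimension, ball radius) pairs,
# listed from level n down to n'.
def is_multigrain(grains):
    dims = [g[0] for g in grains]
    radii = [g[1] for g in grains]
    # Condition 1: dim(S_i) = i for consecutive i, so dims drop by 1.
    dims_ok = all(dims[t] - 1 == dims[t + 1] for t in range(len(dims) - 1))
    # Condition 3: nested balls force non-increasing radii.
    radii_ok = all(radii[t] >= radii[t + 1] for t in range(len(radii) - 1))
    return dims_ok and radii_ok

def level(grains):
    # The level of the multigrain is n - n'.
    return grains[0][0] - grains[-1][0]
```

For example, `[(3, 100.0), (2, 10.0), (1, 2.0)]` is a valid multigrain of level 2, while reversing the radii violates condition 3.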

Definition 5.9

(Nested tubes, [13])

Let \(\vec{S}_{n'}=\left (\mathcal{G}_{n}, \ldots , \mathcal{G}_{n'} \right )\) be a multigrain and

$$ \mathcal{G}_{i}=\left (S_{i}, B_{r_{i}}\right ) \quad \text{ for } n' \leqslant i \leqslant n . $$

Define \(\mathbb{T}_{r_{i}}[\vec{S}_{n'}]\) to be the set of length-\(r_{i}\) tubes \(T_{i}\in \mathbb{T}[B_{r_{i}}]\) that are \(r_{i}^{-\frac{1}{2}+\delta _{i}}\)-tangent to \(S_{i}\) on \(B_{r_{i}}\) (see Definition 5.2 above or Definition 9.3 in [12], page 257) and for which there exist \(T_{j} \in \mathbb{T}[B_{r_{j}}]\), \(n'\le j< i\), such that

$$ T_{j} \subset \mathcal {N}_{r_{j}^{1 / 2+\delta _{j}}} S_{j}, \ \ \operatorname{dist}\left (\theta (T_{i}), \theta (T_{j})\right ) \lesssim r_{j}^{-1 / 2}, $$
(5.89)

and

$$ \operatorname{dist}\left (T_{j}, T_{i} \cap B_{r_{j}}\right ) \lesssim r_{i}^{(1+\delta ) / 2} $$
(5.90)

hold true for all \(i, j\) with \(n' \leqslant j < i\). The direction set of \(\mathbb{T}_{r_{i}}[\vec{S}_{n'}]\) is defined to be

$$ \Theta _{r_{i}}[\vec{S}_{n'}]:=\{ \theta (T): T\in \mathbb{T}_{r_{i}}[ \vec{S}_{n'}] \}. $$
(5.91)

For each \(m\le n'< n\), define

$$ D_{n'}=d^{\#_{c}( \mathfrak {j}(\mathfrak {n}) )}, $$
(5.92)

where \(\mathfrak {n}\) is the parent node of \(\mathfrak {S}_{n'}\). Moreover, define

$$ D_{m-1}=d^{ \#_{c} ( \mathfrak {j}(\mathfrak {n}_{\ell _{0}}^{*}) ) }, \ \ D_{n}=1. $$
(5.93)

This defines the same quantity as \(D_{\ell -1}\) in [12, page 265].

Lemma 5.10

For each \(n'\ge m-1\), it holds that

$$ r_{n'} \prod _{i=n'}^{n} D_{i}\le R. $$
(5.94)

Proof of Lemma 5.10

It suffices to show that

$$ D_{i}\le r_{i+1}/r_{i}, \ \forall i< n. $$
(5.95)

Recall that \(r_{i+1}=\rho (\mathfrak {S}_{i+1})\) and \(r_{i}=\rho (\mathfrak {S}_{i})\). As \(\mathfrak {S}_{i+1}\) is an \(R\)-child, by definition (see equation (5.20) and the line below), we have

$$ \mathfrak {j}(\mathfrak {S}_{i+1})=\#_{a}(\mathfrak {j}(\mathfrak {S}_{i+1}))=\#_{c}(\mathfrak {j}( \mathfrak {S}_{i+1}))=0. $$
(5.96)

Let \(\mathfrak {n}\) be the parent node of \(\mathfrak {S}_{i}\), and therefore \(D_{i}=d^{\#_{c}(\mathfrak {j}(\mathfrak {n}))}\). When the algorithm runs from \(\mathfrak {S}_{i+1}\) to \(\mathfrak {n}\), the radius parameter \(\rho \) decreases to \(\rho /d\) each time \(\#_{c}\) increases by 1, and (5.95) follows immediately. □
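The telescoping in this proof can be checked on toy data. The sketch below (illustrative radii and parameters; the helper name `check_lemma_5_10` is ours) encodes that each increment of \(\#_{c}\) divides the radius by \(d\), with possible further shrinking at algebraic steps, so that \(D_{i}=d^{\#_{c}}\le r_{i+1}/r_{i}\) and the product in (5.94) is at most \(R\):

```python
# Sketch: radii r_n, r_{n-1}, ... listed top-down. Between consecutive
# R-children, the radius is divided by d^{#_c} (cellular steps) and by a
# further factor s >= 1 (algebraic steps), so D_i = d^{#_c} <= r_{i+1}/r_i.
def check_lemma_5_10(R, d, cells_per_gap, extra_shrink):
    radii = [R]                                   # r_n = R at the top
    for c, s in zip(cells_per_gap, extra_shrink):
        radii.append(radii[-1] / (d ** c) / s)
    D = [d ** c for c in cells_per_gap]
    prod = radii[-1]                              # lowest radius r_{n'}
    for Di in D:
        prod *= Di                                # r_{n'} * prod_i D_i
    return prod <= R
```

Since \(r_{n'}\prod _{i} d^{\#_{c}} = R/\prod _{i} s_{i}\le R\), the check passes for any shrink factors \(s_{i}\ge 1\).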

At the end of this subsection, we describe a few output functions of the above algorithm. Take \(n'\) with \(m\le n'\le n\) and consider the node \(\mathfrak {S}_{n'}\). As \(\mathfrak {S}_{n'}\) is an \(R\)-child, this means that the tangential case \(C(\mathfrak {R}_{\ell})\), which was defined in (5.83), dominates. Here \(\ell \) is the level that \(\mathfrak {S}_{n'}\) belongs to. As elements in \(\mathfrak {S}_{n'}\) are neighborhoods of algebraic varieties, from now on we will always use \(S_{n'}\) to refer to an element of \(\mathfrak {S}_{n'}\). Consequently, \(g^{*}_{\iota , O_{i'_{\ell}}}\) will be called \(g^{*}_{\iota , S_{n'}}\), and its wave packets in the ball \(B_{r_{n'}}\) with \(S_{n'}\subset B_{r_{n'}}\) are all tangent to \(S_{n'}\).

Regarding these functions, we have the following properties. For \(n'\ge m\ge k\), let \(p_{n'}\) be a Lebesgue exponent that will be fixed later. These exponents satisfy

$$ p_{m}\ge p_{m+1}\ge \cdots \ge p_{n}=p\ge 2, $$
(5.97)

where \(p\) is the exponent in Theorem 4.2. Define \(\alpha _{n'}, \beta _{n'}\in [0, 1]\) by

$$ \frac{1}{p_{n'}}=\frac{1-\alpha _{n'-1}}{2}+ \frac{\alpha _{n'-1}}{p_{n'-1}}, \ \ \beta _{n'}=\prod _{i=n'}^{n-1} \alpha _{i}, $$
(5.98)

for \(m+1\le n'\le n-1\), and \(\alpha _{n}=\beta _{n}=1\). We have

Property 1. The inequality

$$\begin{aligned} \|T^{\lambda} g\|_{\mathrm{BL}_{k, A}^{p}\left (B_{R}\right )} & \lessapprox M(\vec{r}_{n'}, \vec{D}_{n'})\|g\|_{L^{2}}^{1-\beta _{n'}} \end{aligned}$$
(5.99)
$$\begin{aligned} & \left ( \sum _{S_{n'} \in \mathfrak {S}_{n'}} \sum _{\iota} \left \| T^{ \lambda} g^{*}_{\iota , S_{n'}} \right \| _{ \mathrm{BL}_{k, A_{n'}} ^{p_{n'}}(B_{r_{n'}}) } ^{p_{n'}} \right )^{\frac{\beta _{n'}}{p_{n'}}}, \end{aligned}$$
(5.100)

where \(B_{r_{n'}}\) is the ball of radius \(r_{n'}\) that contains \(S_{n'}\) and

$$ \vec{r}_{n'}:=(r_{n}, r_{n-1}, \dots , r_{n'}), \ \ \vec{D}_{n'}:=(D_{n}, D_{n-1}, \dots , D_{n'}), $$
(5.101)

holds with

$$\begin{aligned} M(\vec{r}_{n'}, \vec{D}_{n'}):=\left (\prod _{i=n'}^{n-1} D_{i} \right )^{(n-n') \delta} \left (\prod _{i=n'}^{n-1} r_{i}^{\left ( \beta _{i+1}-\beta _{i}\right ) / 2} D_{i}^{\left (\beta _{i+1}- \beta _{n'}\right ) / 2}\right ). \end{aligned}$$
(5.102)

Property 2. For \(n'\le n-1\), we have

$$\begin{aligned} \sum _{S_{n'} \in \mathfrak {S}_{n'}} \left \|g^{*}_{\iota , S_{n'}}\right \|_{2}^{2} \lessapprox D_{n'}^{1+\delta} \sum _{ S_{n'+1}\in \mathfrak {S}_{n'+1} } \big\| g^{*}_{ \iota , S_{n'+1} } \big\| _{2}^{2}, \end{aligned}$$
(5.103)

for every \(B_{\iota}\subset B_{R}\) of radius \(R^{1-\delta}\). Here when \(n'=n-1\), the function \(g^{*}_{\iota , S_{n'+1}}\) has not been defined before, and we simply set \(g^{*}_{\iota , S_{n'+1}}=g\).

Property 3. For \(n'\le n-1\), we have

$$\begin{aligned} \max _{S_{n'} \in \mathfrak {S}_{n'}} \left \| g^{*}_{\iota , S_{n'}} \right \|_{2}^{2} \lessapprox \Big( \frac{r_{n'+1}}{r_{n'}} \Big) ^{-\frac{n-n'-1}{2}} D_{n'}^{-n'+\delta} \max _{ S_{n'+1}\in \mathfrak {S}_{n'+1}} \big\| g^{*}_{\iota , S_{n'+1}} \big\| _{2}^{2} \end{aligned}$$
(5.104)

and

$$\begin{aligned} & \max _{S_{n'} \in \mathfrak {S}_{n'}} \max _{\theta} \left \| g^{*}_{ \iota , S_{n'}} \right \|_{L^{2}_{\mathrm {avg}}(\theta )}^{2} \end{aligned}$$
(5.105)
$$\begin{aligned} & \lessapprox \Big( \frac{r_{n'+1}}{r_{n'}} \Big) ^{-\frac{n-n'-1}{2}} D_{n'}^{\delta} \max _{ S_{n'+1}\in \mathfrak {S}_{n'+1}} \max _{\theta} \big\| g^{*}_{\iota , S_{n'+1}} \big\| _{L^{2}_{\mathrm {avg}}(\theta )}^{2}, \end{aligned}$$
(5.106)

where \(\theta \) is a frequency cap of side length \(\rho ^{-1/2}\), as defined in Sect. 4.1, hold for all \(1\le \rho \le r_{n'}\).

Property 4. For \(n'\le n''\le n\), it holds that

$$ \big\| g^{*}_{\iota , S_{n'}} \big\| _{2}^{2} \lesssim _{\epsilon} r_{n'}^{\frac{n-n'}{2}} \Big( \prod _{i=n'}^{n-1} r_{i}^{-\frac{1}{2}} \Big) r_{n''}^{-\frac{n-n''}{2}} \Big( \prod _{i=n''}^{n-1} r_{i}^{\frac{1}{2}} \Big) R^{O(\epsilon _{\circ})} \big\| g_{\iota , S_{n'}}^{*(n'')} \big\| _{2}^{2}, $$
(5.107)

where

$$ g_{\iota , S_{n'}}^{*(n'')}:= \sum _{ T\in \mathbb{T}_{ r_{n''} }[\vec{S}_{n'}] } (g_{\iota})_{T}, $$
(5.108)

and \(\mathbb{T}_{r_{n''}}[\vec{S}_{n'}]\) is from Definition 5.9.

If one takes \(n''=n\), then \(g_{\iota , S_{n'}}^{*(n)}\) becomes \(g_{S_{n'}}^{\#}\) in the last equation in [13, page 9]. If \(n''=n'\), then \(g^{*}_{\iota , S_{n'}}=g^{*(n'')}_{\iota , S_{n'}}\).

These four properties are taken essentially from [13, page 9]. The main difference is that Hickman and Zahl [13] only introduced and used \(n''=n\) in (5.108). The proofs of the first three properties are given in [12]. The proof of the fourth property is the same as that of Property iv) in [13, page 10] and relies on transverse equidistribution properties (for our setting, the needed property is in [10]); therefore we will not repeat it.

The only explanation needed is the following. In the last equation in [12, page 255], the authors used the fact that \(\rho _{j+1}\simeq \rho _{j}\), where the implicit constant is universal. In our case, these two radius parameters differ by \(d\), which is a large constant. Consequently, Property \((\mathrm{III})_{j}\) in [12, page 250] may not hold as written there. However, we can still obtain a good substitute for it. For a node \(\mathfrak {n}\), let \(\mathfrak {n}^{\Uparrow}\) denote the closest ancestor (itself included) that is an \(R\)-child. Let \(\mathfrak {n}_{\ell}\) be a node in \(\mathfrak {M}'_{\ell}\cup \mathfrak {M}_{\ell}\) with \(\mathfrak {n}_{\ell}^{\Uparrow}=\mathfrak {S}_{n'}\). We will prove

$$ \big\| g_{\iota , O_{\ell}} \big\| _{2}^{2} \le C^{\mathrm{III}}_{\ell , \delta} \cdot \Big( \frac{r_{n'}}{\rho (\mathfrak {n}_{\ell})} \Big) ^{-\frac{n-n'}{2}} d^{-\#_{c}( \mathfrak {j}(\mathfrak {n}_{\ell}) ) (n'-1) } \big\| g \big\| _{2}^{2}, $$
(5.109)

for every \(O_{\ell}\in \mathfrak {n}_{\ell}\) and ball \(B_{\iota}\) of radius \(R^{1-\delta}\), where

$$ C^{\mathrm{III}}_{\ell , \delta}:=d^{ \#_{c}( \mathfrak {j}(\mathfrak {n}_{\ell}) ) \delta + \#_{a}( \mathfrak {j}(\mathfrak {n}_{\ell}) ) \delta } (r_{n'})^{ \#_{a}( \mathfrak {j}(\mathfrak {n}_{\ell}) ) O(\delta _{n'-1/2}) + \#_{a'}( \mathfrak {j}(\mathfrak {n}_{ \ell}) ) O(\delta _{n'}) } $$
(5.110)

and

$$ \#_{a'}(\mathfrak {j}( \mathfrak {n}_{\ell} )):=\#\{ \mathfrak {n}: \mathfrak {n}_{\ell} \preccurlyeq \mathfrak {n}\preccurlyeq \mathfrak {S}_{n'}, \mathfrak {n}\in \mathfrak {M}'_{\ell '} \text{ for some } \ell ' \}. $$
(5.111)

Note that

$$ \#_{a}(\mathfrak {j} (\mathfrak {n}_{\ell}) ) \lesssim \frac{|\log \delta _{n'-1}|}{\delta _{n'-1}}, \ \ \#_{a'}(\mathfrak {j} ( \mathfrak {n}_{\ell}) ) \lesssim \frac{|\log \delta _{n'-1/2}|}{\delta _{n'-1/2}}, $$
(5.112)

and therefore by applying Lemma 5.10, we always have \(\text{(5.110)} \lessapprox 1\).

Let us prove (5.109). Let \(\ell _{1}\) be the largest integer smaller than \(\ell \) such that there exists \(\mathfrak {n}_{\ell _{1}}\in \mathfrak {M}'_{\ell _{1}}\cup \mathfrak {M}_{\ell _{1}}\) with \(\mathfrak {n}_{\ell}\preccurlyeq \mathfrak {n}_{\ell _{1}}\) and \(\mathfrak {n}_{\ell _{1}}^{\Uparrow}=\mathfrak {S}_{n'}\); if no such node exists, then we simply take \(\mathfrak {n}_{\ell _{1}}=\mathfrak {S}_{n'}\). Assume that (5.109) has been proved for \(\mathfrak {n}_{\ell _{1}}\); we will prove it for \(\mathfrak {n}_{\ell}\). There are two cases: \(\mathfrak {n}_{\ell}\in \mathfrak {M}_{\ell}\) or \(\mathfrak {n}_{\ell}\in \mathfrak {M}'_{\ell}\). We only prove the latter case; the former can be done in a similar way. List all the nodes between \(\mathfrak {n}_{\ell _{1}}\) and \(\mathfrak {n}_{\ell}\) in descending order:

$$ \mathfrak {n}_{\ell _{1}}, \mathfrak {n}_{\ell _{1}+1}, \dots , \mathfrak {n}_{\ell -1}, \mathfrak {n}_{\ell}. $$
(5.113)

Note that \(\mathfrak {n}_{\ell '}\) is an \(L\)-child for every \(\ell _{1}<\ell '< \ell \). By orthogonality between wave packets and the fundamental theorem of algebra, we have

$$ \big\| g_{\iota , O_{\ell '}} \big\| _{2}^{2} \lesssim d^{-(n'-1)} \big\| g_{\iota , O_{\ell '-1}} \big\| _{2}^{2}, $$
(5.114)

for every \(\ell '<\ell \), \(O_{\ell '}\in \mathfrak {n}_{\ell '}, O_{\ell '-1}\in \mathfrak {n}_{\ell '-1}\) and \(O_{\ell '}\subset O_{\ell '-1}\). This further implies

$$ \big\| g_{\iota , O_{\ell -1}} \big\| _{2}^{2} \le C^{\mathrm{III}}_{\ell -1, \delta}\cdot \Big( \frac{r_{n'}}{\rho (\mathfrak {n}_{\ell _{1}})} \Big) ^{-\frac{n-n'}{2}} d^{-\#_{c}( \mathfrak {j}(\mathfrak {n}_{\ell -1}) ) (n'-1) } \big\| g \big\| _{2}^{2}. $$
(5.115)

Here note that the denominator is \(\rho ( \mathfrak {n}_{\ell _{1}} )\) instead of \(\rho ( \mathfrak {n}_{\ell -1} )\), as remarked at the beginning of Sect. 5.2. When passing from \(\mathfrak {n}_{\ell -1}\) to \(\mathfrak {n}_{\ell}\), recall that in (5.46) and (5.49), we cut the neighborhood \(\mathcal {N}_{ \rho ( \mathfrak {n}_{\ell -1}^{\uparrow} )^{1/2+\delta _{m}} } = \mathcal {N}_{ \rho ( \mathfrak {n}_{\ell _{1}} )^{1/2+\delta _{m}} }\) into thinner layers \(\mathcal {N}_{ \rho ( \mathfrak {n}_{\ell} )^{1/2+\delta _{m}} }\). Therefore by transverse equidistribution properties (for instance Lemma 8.4 in [10]), we have

$$ \big\| g_{\iota , O_{\ell}} \big\| _{2}^{2} \lesssim (r_{n'})^{O(\delta _{n'})} d^{-(n'-1)} \Big( \frac{ \rho (\mathfrak {n}_{\ell _{1}}) }{ \rho (\mathfrak {n}_{\ell}) } \Big) ^{-\frac{n-n'}{2}} \big\| g_{\iota , O_{\ell -1}} \big\| _{2}^{2}, $$
(5.116)

for \(O_{\ell}\in \mathfrak {n}_{\ell}\) with \(O_{\ell}\subset O_{\ell -1}\). This, combined with (5.115) and the choice of the constant in (5.110), gives us the desired bound.
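Schematically, this last step combines (5.115) and (5.116) as follows, assuming the bookkeeping \(\#_{c}(\mathfrak {j}(\mathfrak {n}_{\ell}))=\#_{c}(\mathfrak {j}(\mathfrak {n}_{\ell -1}))+1\) for this step and absorbing the factor \((r_{n'})^{O(\delta _{n'})}\) into \(C^{\mathrm{III}}_{\ell ,\delta}\):

```latex
\|g_{\iota, O_{\ell}}\|_{2}^{2}
  \lesssim C^{\mathrm{III}}_{\ell-1,\delta}\,(r_{n'})^{O(\delta_{n'})}
    \Big( \frac{r_{n'}}{\rho(\mathfrak{n}_{\ell_1})} \Big)^{-\frac{n-n'}{2}}
    \Big( \frac{\rho(\mathfrak{n}_{\ell_1})}{\rho(\mathfrak{n}_{\ell})} \Big)^{-\frac{n-n'}{2}}
    d^{-(\#_{c}(\mathfrak{j}(\mathfrak{n}_{\ell-1}))+1)(n'-1)}\, \|g\|_{2}^{2},
```

and the two radius ratios telescope to \(\big(r_{n'}/\rho (\mathfrak {n}_{\ell})\big)^{-\frac{n-n'}{2}}\), which is exactly (5.109).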

6 Strong polynomial Wolff axioms

The goal of this section is to prove a strong polynomial Wolff axiom (see Theorem 6.2 below), a stronger version of the polynomial Wolff axiom in Sect. 3. As a direct consequence, we will obtain the key Lemma 6.1, which controls the number of tubes concentrated near a multi-grain. Strong polynomial Wolff axioms already appeared in the earlier works [14, 24] and [13]. The proof in this section combines the arguments in the papers mentioned above with Bourgain’s condition as formulated in Theorem 2.1.

Lemma 6.1

Let \(\vec{S}_{n'}\) be the multi-grain given in Definition 5.9 with complexity at most \(d\). We have

$$ \# \Theta _{r_{i}}[\vec{S}_{n'}] \lesssim _{\epsilon _{\circ}, d} \left (\prod _{j=n'}^{i-1} r_{j}^{-1 / 2}\right ) r_{i}^{ \frac{i-1}{2}+\epsilon _{\circ}}, $$
(6.1)

for all \(n'\le i\le n\), where \(\epsilon _{\circ}\) is given in (1.44).

As with Theorem 1.2, we will deduce Lemma 6.1 from a geometric theorem.

Theorem 6.2

[Strong Polynomial Wolff Axiom for our \(\phi \)] Let \(n\ge 3\). If Bourgain’s condition holds for the phase function \(\phi \) at every \(({\mathbf {x}}_{0}; \xi _{0})\in \mathrm{supp}(a)\), then the following strong polynomial Wolff axiom for \(\phi \) holds: Let \(E\ge 2\) be an integer and fix an integer \(k\). For every \(\epsilon >0\), there exists \(C(C_{0}, n, E, k, \epsilon )>0\) such that if we have a sequence of balls

$$ B({\mathbf {x}}_{1}, s_{1}) \subset \frac{1}{2} B({\mathbf {x}}_{2}, s_{2}) \subset \cdots \subset \frac{1}{2^{k-1}} B({\mathbf {x}}_{k}, s_{k}) \subset \frac{1}{2^{k}} B^{n}, $$
(6.2)

numbers \(\kappa ^{C_{0}} \leq \kappa _{1} \leq \cdots \leq \kappa _{k} \leq \kappa \) and subsets \(S_{j} \subset B({\mathbf {x}}_{j}, s_{j})\), \(j = 1, 2, \ldots , k\) satisfying:

  • \(\kappa _{j} \leq s_{j}, \forall 1 \leq j \leq k\) and \(\varphi _{j} := \frac{\kappa _{j}}{s_{j}}\) satisfy \(\varphi _{1} \geq \varphi _{2} \geq \cdots \geq \varphi _{k}\),

  • \(S_{j}\) is a semialgebraic set of complexity \(\le E\) whose \(\kappa _{j}\)-neighborhood has volume \(\simeq |S_{j}|\),

  • The intersection between \(S_{j}\) and any ball of radius \(r\in [\kappa _{j}, s_{j}]\) has volume \(\leq C_{j} r^{d_{j}} \kappa _{j}^{n-d_{j}}\),

then the following holds uniformly:

For every collection \(\mathbb{T}\) of \(\kappa \)-tubes pointing in different directions (defined before Theorem 1.2), if we use \(c(T)\) to denote the core curve of \(T\), then

$$\begin{aligned} & \#\{T\in \mathbb{T}: c(T) \bigcap \mathcal {B}({\mathbf {x}}_{j}, \frac{1}{2} s_{j}) \subset S_{j}, \forall 1 \leq j \leq k\} \\ \le & C(C_{0}, n, E, k, \epsilon )\prod _{j=1}^{k} ( \frac{\varphi _{j-1}}{\varphi _{j}})^{d_{j}-1}( \frac{\varphi _{k}}{\kappa})^{n-1} \kappa ^{-\epsilon} \end{aligned}$$
(6.3)

where \(\varphi _{0} = 1\) and the horizontal slab \(\mathcal {B}({\mathbf {x}}_{j}, \frac{1}{2} s_{j})\) is defined to be \(\pi _{t}^{-1} (\pi _{t} (B({\mathbf {x}}_{j}, \frac{1}{2} s_{j})))\). Here \(\pi _{t}: \mathbb{R}^{n} \to \mathbb{R}\) is the orthogonal projection to the \(t\)-variable.

Moreover the implied constant only depends on bounds of finitely many (depending on \(C_{0}, n, E, k, \epsilon \)) derivatives of \(\phi \).

Like in Sect. 3, we are going to deduce Theorem 6.2 when the phase function satisfies a concrete derivative condition. Then we simply check that the condition is satisfied by our phase function.

Theorem 6.3

[Generalized Strong Polynomial Wolff Axiom] Let \(n\ge 3\). Suppose that a \(\Phi \) as in the beginning of Sect. 3 satisfies:

  1. (a)

    For every choice of \(v \in [-1, 1]^{n-1}\), \(\xi \in [-1, 1]^{n-1}\), subinterval \(I \subset [-1, 1]\), \(t \in [-1, 1]\) and matrix \(M\), we have both

    $$\begin{aligned} & |\det (\nabla _{v} \Phi (v, t, \xi )\cdot M+\nabla _{\xi} \Phi (v, t, \xi ))| \\ \lesssim & \left (1+ \frac{\mathrm{dist}(t, I)}{|I|}\right )^{n-1} \frac{1}{|I|}\int _{I} |\det (\nabla _{v} \Phi (v, s, \xi )\cdot M+ \nabla _{\xi} \Phi (v, s, \xi ))|\mathrm{d} s \end{aligned}$$
    (6.4)

    and (3.3) for some implied constants independent of the choices of \(v, \xi , I\) and \(M\).

  2. (b)

    If \(t_{1} \neq t_{2}\), \(\Phi (v, t_{1}, \xi ) = x_{1}\), and \(\Phi (v', t_{1}, \xi ') = x_{1}\), then

    $$ |\Phi (v', t_{2}, \xi ') - \Phi (v, t_{2}, \xi )| \lesssim |t_{1} - t_{2}| \cdot |\xi - \xi '|. $$
    (6.5)
  3. (c)

    If \(t_{1} \neq t_{2}\), \(\Phi (v, t_{1}, \xi ) = x_{1}\) and \(\Phi (v, t_{2}, \xi ) = x_{2}\), then for \(x_{2}'\) with distance \(\mu |t_{1}-t_{2}|\) from \(x_{2}\) \((\mu \leq 10)\), there are \(v'\) and \(\xi '\) with \(\Phi (v', t_{1}, \xi ') = x_{1}\), \(\Phi (v', t_{2}, \xi ') = x_{2}'\) and \(|\xi ' - \xi | \lesssim \mu \).

Then the conclusion of Theorem 6.2 (with the notion “pointing in different directions” now defined as in the beginning of Sect. 3) holds with the implied constant only depending on bounds of finitely many (depending on \(C_{0}, n, E, k, \epsilon \)) derivatives of \(\Phi \) and the implied constants in (a)-(c).

Remark 6.4

We explain the intuition behind (b) and (c) a bit. For convenience we introduce the following notation. For fixed \(v \in \mathbb{R}^{n-1}\) and \(\xi \in \mathbb{R}^{n-1}\), we call the curve

$$ c_{v, \xi} = \{(x, t) \in \mathbb{R}^{n-1} \times [-1, 1]: x = \Phi (v, t, \xi )\} $$
(6.6)

a \(\Phi \)-curve. Intuitively, if we know a \(\Phi \)-curve passes through \((x_{1}, t_{1})\) and want to perturb the “direction variable” \(\xi \) and the “initial position variable” \(v\) so that \((x_{1}, t_{1})\) is still on the curve, then (b) says that whenever the perturbation of the direction \(\xi \) is \(O(\mu )\), the perturbation of the curve at time \(t= t_{2}\) is \(O(|t_{1} - t_{2}| \mu )\), and (c) says that if we want the \(x\) coordinate at time \(t=t_{2}\) to be shifted by a distance \(\simeq \mu |t_{1} - t_{2}|\), we can always succeed with the perturbation needed on \(\xi \) being \(O(\mu )\).
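As a concrete illustration (a model case, not the phase treated in this paper), take \(\Phi (v, t, \xi ) = v + t\xi \), whose \(\Phi \)-curves are straight lines:

```latex
% (b): fix (x_1, t_1) on the curve, i.e. x_1 = v + t_1\xi.
% Perturbing \xi to \xi' while keeping (x_1, t_1) on the curve forces v' = x_1 - t_1\xi',
% so at time t_2:
\Phi(v', t_2, \xi') - \Phi(v, t_2, \xi) = (t_2 - t_1)(\xi' - \xi),
\qquad
|\Phi(v', t_2, \xi') - \Phi(v, t_2, \xi)| = |t_1 - t_2|\,|\xi - \xi'|.
% (c): to move x_2 = x_1 + (t_2 - t_1)\xi to x_2' with |x_2' - x_2| = \mu |t_1 - t_2|,
% take \xi' = \xi + (x_2' - x_2)/(t_2 - t_1), so that |\xi' - \xi| = \mu.
```

In this model both (b) and (c) hold with equality.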

Proof of Theorem 6.3

Similarly to the proof of Theorem 3.1, we only carry out the proof for a fixed \(\Phi \); after seeing the proof it will be clear that the estimate only depends on finitely many derivatives of \(\Phi \) (in particular, since the smallest scale \(\kappa ^{C_{0}}\) we consider is polynomial in \(\kappa \), in the approximation argument described below one only needs a Taylor approximation of order \(O_{C_{0}, n, E, k, \epsilon} (1)\)).

Like the proof of Theorem 3.1, we can reduce the situation to the case where \(\Phi \) is a polynomial of degree \(O_{C_{0}, n, E, k, \epsilon} (1)\) by a Taylor approximation argument. For general \(\Phi \), this argument reduces matters to a problem with two modified conditions: (a’), a slightly weaker condition than (6.4) in (a) (similar to (3.7) versus (3.3)) with a very small error term (of the form \(\kappa ^{1000n(1+C_{0})}\)) that makes no difference (see the proof below and how this issue is dealt with in the similar situation in the proof of Theorem 3.1); and (b’), (c’), two slightly weaker conditions than (b) and (c) with an error term \(\kappa ^{1+C_{0}}\) for \(|t| \leq \kappa ^{\epsilon}\), neither affecting our framework (for (b) and (c), note that the only place they are needed is the verification of a claim in the beginning of the induction step). From now on we always assume \(\Phi \) is a polynomial of degree \(O_{C_{0}, n, E, k, \epsilon} (1)\).

Let us assume \(\kappa \) and \(\kappa _{j}\) are all sufficiently small (allowed to depend on derivatives of \(\Phi \)). Otherwise we simply ignore some constraints. This assumption will enable us to use the implicit function theorem at scales \(\kappa \) or \(\kappa _{j}\) freely.

We make one more comment before starting: the present theorem is a generalization of Lemma 3.7 in [13], which was in turn developed based on Theorem 1.4 in [14] and Theorem 1.9 in [24]. Our proof will have a lot in common with these, and will be a natural generalization of Theorem 3.1.

We also refine the slabs \(\mathcal {B}({\mathbf {x}}_{j}, \frac{1}{2}s_{j})\) a bit before starting. Since each \(B ({\mathbf {x}}_{j}, s_{j})\) is contained in \(\frac{1}{2}B ({\mathbf {x}}_{j+1}, s_{j+1})\) (\(\forall j < k\)), we can take horizontal slabs \(\mathcal {B}_{1}, \ldots , \mathcal {B}_{k}\) such that:

  1. (i)

    \(\mathcal {B}_{j} \subset \mathcal {B}({\mathbf {x}}_{j}, \frac{1}{2}s_{j})\).

  2. (ii)

    The thickness of \(\mathcal {B}_{j}\) is \(\simeq s_{j}\).

  3. (iii)

    The distance between \(\mathcal {B}_{j_{1}}\) and \(\mathcal {B}_{j_{2}}\) is \(\gtrsim s_{j_{2}}\) for all pairs \(j_{1} < j_{2}\).

For \(1 \leq l \leq k\) and \(t \in [-1, 1]\), we define

$$\begin{aligned} S_{l, t} = {}&\{y \in \mathbb{R}^{n-1}: \exists \text{ a } \Phi -\text{curve } c_{v, \xi} \text{ s.t.} \\ & c_{v, \xi} \bigcap \mathcal {B}_{j} \subset S_{j}, \forall 1 \leq j \leq l \text{ and } (y, t) \in c_{v, \xi}\} \end{aligned}$$
(6.7)

and we are going to prove inductively that

$$\begin{aligned} m_{n-1}^{*}(S_{l, t}) \le{}& C(C_{0}, n, E, k, \epsilon ) (d_{l} (t))^{n-1} \\ &{}\times \prod _{j=1}^{l} (\frac{\varphi _{j-1}}{\varphi _{j}})^{d_{j}-1} \varphi _{l}^{n-1} \kappa ^{-2^{l-k-1}\epsilon}, \forall t \in [-1, 1] \end{aligned}$$
(6.8)

where \(d_{l} (t)\) is defined to be \(s_{l}\) plus the distance between \(B({\mathbf {x}}_{l}, s_{l})\) and the hyperplane \(\{x_{n} = t\}\), and \(m_{n-1}^{*}(\cdot )\) is the \((n-1)\)-dimensional Lebesgue outer measure.

For convenience, we choose a large constant \(K\) depending on the implied constant in (b) and (c) and also consider a companion set

$$\begin{aligned} \tilde{S}_{l, t} ={}& \{y \in \mathbb{R}^{n-1}: \exists \text{ a } \Phi - \text{curve } c_{v, \xi} \text{ s.t.} \\ & c_{v, \xi} \bigcap \mathcal {B}_{j} \subset \mathcal {N}_{K\kappa _{j}} (S_{j}), \forall 1 \leq j \leq l \text{ and } (y, t) \in c_{v, \xi}\} \end{aligned}$$
(6.9)

Note that once we prove (6.8), by the same reasoning and the assumption about \(\mathcal {N}_{\kappa _{j}} (S_{j})\), we also prove the same upper bound for \(m_{n-1}^{*} (\tilde{S}_{l, t})\).

Base case. Our base case is \(l=0\). Since \(\Phi \) has a \(C^{1}\)-derivative bound, we observe that all \(S_{l, t}\) lie in a uniformly bounded set \(\Omega _{0}\). We make the convention that \(S_{0, t}\) is the part of the set defined in (6.7) with \(l=0\) (hence with a vacuous condition) lying in \(\Omega _{0}\). Then (6.8) trivially holds for \(l=0\) with the convention \(d_{0} (t) = 1\). With this setup, the first induction step is similar to the subsequent ones.

The inductive step. Suppose we have (6.8) for some \(l\) in \([0, k)\). Next we prove it for \(l+1\).

Integrating the induction hypothesis over \(t\) and temporarily ignoring measurability issues, by Fubini we formally deduce

$$ m_{n}\Big( \big( \bigcup _{t} S_{l, t} \big) \bigcap \mathcal {B}_{l+1} \Big) \le C(C_{0}, n, E, k, \epsilon ) \prod _{j=1}^{l} \Big( \frac{\varphi _{j-1}}{\varphi _{j}} \Big)^{d_{j}-1} \varphi _{l}^{n-1} s_{l+1}^{n} \kappa ^{-2^{l-k-1}\epsilon}. $$
(6.10)

We assert a stronger conclusion: In fact, \((\bigcup _{t} S_{l, t})\bigcap \mathcal {B}_{l+1}\) is contained in a set \(U_{l}\) such that \(U_{l}\) is a union of (\(n\)-dim) balls of radii \(\varphi _{l} s_{l+1}\) and that

$$\begin{aligned} m_{n}^{*}(U_{l}) \le C(C_{0}, n, E, k, \epsilon ) \prod _{j=1}^{l} ( \frac{\varphi _{j-1}}{\varphi _{j}})^{d_{j}-1}\varphi _{l}^{n-1} s_{l+1}^{n} \kappa ^{-2^{l-k-1}\epsilon}. \end{aligned}$$
(6.11)

To construct such a \(U_{l}\), we take the \(\varphi _{l} s_{l+1}\)-neighborhood of \((\bigcup _{t} S_{l, t})\bigcap \mathcal {B}_{l+1}\) and cover it by a finitely overlapping collection of balls of radii \(\varphi _{l} s_{l+1}\). Define the union of these balls to be \(U_{l}\).

It remains to derive the volume bound (6.11). We claim that we can take \(K\) in (6.9) large (and the constraint here will be the only one affecting the choice of \(K\)) such that this \(U_{l}\) is contained in \(\bigcup _{t} \tilde{S}_{l, t}\).

To verify this claim, note that by definition \(U_{l}\) is contained in the \(3\varphi _{l} s_{l+1}\)-neighborhood of \((\bigcup _{t} S_{l, t})\bigcap \mathcal {B}_{l+1}\). This means that for every point \((\tilde{y}, t)\) in \(U_{l}\), we can find a \(\Phi \)-curve that is \(3\varphi _{l} s_{l+1}\)-close to this point and whose part in \(\mathcal {B}_{j}\) completely lies in \(S_{j}\), \(\forall 1 \leq j \leq l\). Now keep a point of that \(\Phi \)-curve in \(\mathcal {B}_{1}\) fixed; by assumption (c) (see Remark 6.4 for more intuition), one can change the “direction” \(\xi \) by up to \(O(\varphi _{l})\) so that the new \(\Phi \)-curve passes through \((\tilde{y}, t)\). By assumption (b), for each \(t \in \pi _{t} \mathcal {B}_{j}\) \((1 \leq j \leq l)\), the perturbation amount of the \(x\) variable is \(\lesssim \varphi _{l} s_{j} \lesssim \varphi _{j} s_{j} = \kappa _{j}\). Hence the intersection between the new \(\Phi \)-curve and \(\mathcal {B}_{j}\) lies in \(\mathcal {N}_{K\kappa _{j}} (S_{j})\) if \(K\) is sufficiently large depending on the implied constants in (b) and (c). For this choice of \(K\), the claim holds. Applying the induction hypothesis to each \(t\)-slice of \(\tilde{S}_{l, t}\) and integrating, we deduce (6.11).

Our \(U_{l}\) is a union of \(\varphi _{l} s_{l+1}\)-balls that contains \((\bigcup _{t} S_{l, t})\bigcap \mathcal {B}_{l+1}\) and obeys the volume bound (6.11). By a covering lemma we may assume without loss of generality that the \(\varphi _{l} s_{l+1}\)-balls are finitely overlapping. Now we use the volume upper bound on the intersection between \(S_{l+1}\) and \(r\)-balls in the assumption of Theorem 6.2. We deduce

$$\begin{aligned} & m_{n}^{*}((\bigcup _{t} S_{l+1, t})\bigcap \mathcal {B}_{l+1}) \\ \le & C(C_{0}, n, E, k, \epsilon ) \prod _{j=1}^{l} ( \frac{\varphi _{j-1}}{\varphi _{j}})^{d_{j}-1}\varphi _{l}^{n-1} s_{l+1}^{n} (\frac{\kappa _{l+1}}{\varphi _{l} s_{l+1}})^{n-d_{l+1}} \kappa ^{-2^{l-k-1} \epsilon} \\ = & C(C_{0}, n, E, k, \epsilon ) \prod _{j=1}^{l} ( \frac{\varphi _{j-1}}{\varphi _{j}})^{d_{j}-1}\varphi _{l}^{n-1} s_{l+1}^{n} (\frac{\varphi _{l+1}}{\varphi _{l}})^{n-d_{l+1}} \kappa ^{-2^{l-k-1} \epsilon} \\ = & C(C_{0}, n, E, k, \epsilon ) \prod _{j=1}^{l+1} ( \frac{\varphi _{j-1}}{\varphi _{j}})^{d_{j}-1}\varphi _{l+1}^{n-1} s_{l+1}^{n} \kappa ^{-2^{l-k-1}\epsilon}. \end{aligned}$$
(6.12)

We will use (6.12) to close the induction step by an argument developed in [14] and [24]. Below we fix an arbitrary \(t=t_{0}\) to do the proof. This step is very similar to the proof of Theorem 3.1, so we will only sketch some steps.

Define the set

$$ L_{l+1, t} = \{(v, \xi ): c_{v, \xi} \bigcap \mathcal {B}_{j} \subset S_{j}, \forall 1 \leq j \leq l+1\}. $$
(6.13)

We already assumed \(\Phi \) is a polynomial of degree \(O_{C_{0}, n, E, k, \epsilon} (1)\). Thus in the expression of a \(\Phi \)-curve, \(x\) is polynomial in \(v, t, \xi \) with the same degree bound. For simplicity we use \(b_{t_{0}}\) to denote the hyperplane \(\{x_{n} = t_{0}\}\).

By effective quantifier elimination (i.e. the Tarski-Seidenberg theorem) we find a semialgebraic subset \(L_{l+1, t} ' \subset L_{l+1, t}\) of complexity \(O_{n, E, \epsilon _{1}, K} (1)\) such that all \(c_{v, \xi} \bigcap b_{t_{0}}\) are distinct for \((v, \xi ) \in L_{l+1, t} '\), and that \(\{c_{v, \xi} \bigcap b_{t_{0}}: (v, \xi ) \in L_{l+1, t} '\} = \{c_{v, \xi} \bigcap b_{t_{0}}: (v, \xi ) \in L_{l+1, t}\}\). (To see this, one can first add \((n-1)\) more coordinates to each \((\xi , v) \in L_{l+1, t}\) denoting the “position”, i.e. the first \((n-1)\) coordinates, of the intersection \(c_{v, \xi} \bigcap b_{t_{0}}\). This is still a semialgebraic set of bounded complexity. Then one applies the quantifier elimination to find a subset such that the last \((n-1)\) coordinates are distinct among different points in the subset and the set of the last \((n-1)\) coordinates does not change.) From the construction we see easily that \(L_{l+1, t} '\) has dimension \(\leq n-1\).

Using Gromov’s lemma to approximate \(L_{l+1, t} '\) by images of smooth maps in the same way as we did to prove Theorem 3.1, we see (for arbitrary \(\epsilon _{1}>0\)) there exist two polynomial maps \(F\) and \(G: [0, \kappa ^{\epsilon _{1}}]^{n-1} \to \mathbb{R}^{n-1}\) (whose images are the \(v\) and the \(\xi \) variables, respectively) with \(\deg F, \deg G = O_{n, E, \epsilon _{1}} (1)\) and \(\|F\|_{C^{1}}, \|G\|_{C^{1}} \leq 1\) such that

$$ c_{{F(x)}, G(x)} \bigcap \mathcal {B}_{j} \subset \mathcal {N}_{\kappa _{j}} (S_{j}), \forall x, \forall 1 \leq j \leq l+1 $$
(6.14)

and that

$$ \mathcal {H}^{n-1} (\{c_{{F(x)}, G(x)} \bigcap b_{t_{0}}: x \in [0, \kappa ^{ \epsilon _{1}}]^{n-1}\}) \gtrsim _{n, E, \epsilon _{1}, K} \kappa ^{C \epsilon _{1}} m_{n-1}^{*}(S_{l+1, t_{0}}) $$
(6.15)

where \(\mathcal {H}^{n-1} (\cdot )\) stands for the \((n-1)\)-dimensional Hausdorff measure on the hyperplane \(b_{t_{0}}\).

Now look at the \(n\)-dimensional volume of

$$\begin{aligned} M &= \{(\Phi (F(x), t, G(x)), t): x \in [0, \kappa ^{\epsilon _{1}}]^{n-1} \}\bigcap \mathcal {B}_{l+1} \\ &= (\bigcup c_{{F(x)}, G(x)}) \bigcap \mathcal {B}_{l+1} \end{aligned}$$
(6.16)

and the \((n-1)\)-dimensional volume of

$$ H = \{(\Phi (F(x), t_{0}, G(x)), t_{0}): x \in [0, \kappa ^{\epsilon _{1}}]^{n-1} \} = (\bigcup c_{{F(x)}, G(x)}) \bigcap b_{t_{0}} $$
(6.17)

and compare them.

Suppose the time interval (i.e. the range of the last coordinate) of \(\mathcal {B}_{l+1}\) is \(I_{l+1}\). Then using Bézout like in the proof of Theorem 3.1, we have

$$\begin{aligned} |M| \sim _{n, E, \epsilon _{1}} {}&\int _{I_{l+1}} \int _{[0, \kappa ^{ \epsilon _{1}}]^{n-1}} |\det \nabla _{x} (\Phi (F(x), t, G(x)))| \mathrm{d}x\mathrm{d}t \\ = {}& \int _{I_{l+1}} \int _{[0, \kappa ^{\epsilon _{1}}]^{n-1}} |\det ( \nabla _{v} \Phi (F(x), t, G(x))\cdot \nabla _{x} F \\ &{} + \nabla _{\xi} \Phi (F(x), t, G(x))\cdot \nabla _{x} G)| \mathrm{d}x\mathrm{d}t \\ = {}& \int _{[0, \kappa ^{\epsilon _{1}}]^{n-1}} |\det (\nabla _{x} G)| \\ &{}\cdot \int _{I_{l+1}} |\det (\nabla _{v} \Phi (F(x), t, G(x))\cdot ( \nabla _{x} F\cdot (\nabla _{x} G)^{-1}) \\ &{}+ \nabla _{\xi} \Phi (F(x), t, G(x)))| \mathrm{d}t\mathrm{d}x. \end{aligned}$$
(6.18)

On the other hand,

$$\begin{aligned} &\mathcal {H}^{n-1} (H) \\ &\quad \sim _{n, E, \epsilon _{1}} \int _{[0, \kappa ^{ \epsilon _{1}}]^{n-1}} |\det \nabla _{x} (\Phi (F(x), t_{0}, G(x)))| \mathrm{d}x \\ &\quad= \int _{[0, \kappa ^{\epsilon _{1}}]^{n-1}} |\det (\nabla _{v} \Phi (F(x), t_{0}, G(x))\cdot \nabla _{x} F + \nabla _{\xi} \Phi (F(x), t_{0}, G(x))\cdot \nabla _{x} G)| \mathrm{d}x \\ &\quad= \int _{[0, \kappa ^{\epsilon _{1}}]^{n-1}} |\det (\nabla _{x} G)| \cdot |\det (\nabla _{v} \Phi (F(x), t_{0}, G(x))\cdot (\nabla _{x} F \cdot (\nabla _{x} G)^{-1}) \\ &\qquad{}+ \nabla _{\xi} \Phi (F(x), t_{0}, G(x)))| \mathrm{d}x. \end{aligned}$$
(6.19)

Now by assumption (a), the left hand side of (6.18) is \(\gtrsim |I_{l+1}|^{n} \cdot d_{l+1} (t_{0})^{-(n-1)} \simeq s_{l+1}^{n} \cdot d_{l+1} (t_{0})^{-(n-1)}\) times the left hand side of (6.19). Note that (the \(\tilde{S}_{l+1, t}\) version of) (6.12) gives an upper bound of the left hand side of (6.18). Moreover (6.15) gives a lower bound of the left hand side of (6.19) in terms of \(m_{n-1}^{*}(S_{l+1, t_{0}})\). Combining everything, we can take \(\epsilon _{1}\) to be a sufficiently small multiple of \(\epsilon \) to finish the induction step and (6.8) is proved.
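Schematically, the three bounds just used combine as follows (with \(C\epsilon _{1}\) denoting the loss from (6.15), and using the \(\tilde{S}_{l+1, t}\) version of (6.12)):

```latex
m_{n-1}^{*}(S_{l+1, t_{0}})
  \overset{(6.15)}{\lesssim} \kappa^{-C\epsilon_{1}}\, \mathcal{H}^{n-1}(H)
  \overset{(a)}{\lesssim} \kappa^{-C\epsilon_{1}}\, d_{l+1}(t_{0})^{n-1}\, s_{l+1}^{-n}\, |M|
  \overset{(6.12)}{\lesssim} \kappa^{-C\epsilon_{1}}\, d_{l+1}(t_{0})^{n-1}
    \prod_{j=1}^{l+1}\Big(\frac{\varphi_{j-1}}{\varphi_{j}}\Big)^{d_{j}-1}
    \varphi_{l+1}^{n-1}\, \kappa^{-2^{l-k-1}\epsilon},
```

so choosing \(\epsilon _{1}\) with \(C\epsilon _{1}\le 2^{l-k-1}\epsilon \) gives the exponent \(\kappa ^{-2^{(l+1)-k-1}\epsilon}\) required by (6.8) for \(l+1\).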

From (6.8) the conclusion will follow easily. Take the union of all \(S_{k, t}\) for \(t \in [-1, 1]\). We notice by the definition and effective quantifier elimination that this is a semialgebraic set of complexity \(O_{n, E, \epsilon} (1)\). Using (6.8), we see that its measure is

$$ \leq C(C_{0}, n, E, k, \epsilon ) \prod _{j=1}^{k} ( \frac{\varphi _{j-1}}{\varphi _{j}})^{d_{j}-1}\varphi _{k}^{n-1} \kappa ^{-2^{-1}\epsilon}. $$
(6.20)

Note that we have already verified (3.3). In exactly the same way as we proved Theorem 3.1 (the only difference is that Theorem 3.1 was stated for tubes and we need a version for \(\Phi \)-curves; however, a \(\Phi \)-curve version was in fact proven there), we can bound the left hand side of (6.3) by

$$ C(C_{0}, n, E, k, \epsilon ) \prod _{j=1}^{k} ( \frac{\varphi _{j-1}}{\varphi _{j}})^{d_{j}-1}\varphi _{k}^{n-1} \kappa ^{(1-n)-\epsilon}. $$
(6.21)

This concludes the proof. □

Proof of Theorem 6.2

The proof will be similar to the proof of Theorem 1.2 in §3. As with that theorem, take the unique smooth \(\Phi = \Phi (v, t, x)\) near 0 such that (3.26) holds. We can assume the above can be done for all \((v, t, \xi )\in [-1.5, 1.5]^{2n-1}\) without loss of generality, as in the other proof. It suffices to show that our \(\Phi \) satisfies conditions (a)-(c) in Theorem 6.3; that theorem then immediately leads to the desired conclusion. When reading the proof one naturally sees that the implicit constants in (b) and (c) only depend on finitely many derivatives of \(\phi \). We also note that (b) and (c) come from the non-degeneracy property of \(\phi \), and (a) comes from Bourgain’s condition.

For (a), (3.3) is already verified in the proof of Theorem 1.2. Recall we defined \(A(t; \xi )= \nabla ^{2}_{\xi} \phi (X_{t}(\xi ), t; \xi )\) in (3.31) and deduced \(A(t; \xi ) = f(t; \xi )B(\xi ) + A(0; \xi )\) and the time derivative of \(f\) is 1 at \(t=0\) around (3.35). We make one more harmless assumption that the time derivative of \(f\) is always in \((\frac{1}{2}, 2)\) since otherwise we can do a constant rescaling that only causes loss of a constant. Now we can do the reduction to both sides of (6.4) like in the proof of Theorem 1.2, which reduces (6.4) to proving that for every polynomial \(P(t)\) of degree \(n-1\),

$$ |P(t)| \lesssim \left (1+ \frac{\mathrm{dist}(t, I)}{|I|}\right )^{n-1} \frac{1}{|I|}\int _{I} |P(s)|\mathrm{d}s, $$
(6.22)

which is an elementary fact proved, e.g., in Lemma 3.8 in [14]. We have completed the verification of (a).

As before, (b) and (c) are more general properties that do not depend on Bourgain’s condition. Next we verify them.

Differentiating (3.26) with respect to \(t\), we see

$$ \nabla _{x} \nabla _{\xi} \phi \cdot \partial _{t} \Phi + \partial _{t} \nabla _{\xi} \phi = 0. $$
(6.23)

Hence \(\Phi \)-curves can be viewed as integral curves of the vector field \(V_{\xi} ({\mathbf {x}}, t) = (-(\nabla _{x} \nabla _{\xi} \phi ({\mathbf {x}}, t, \xi ))^{-1} \cdot \partial _{t}\nabla _{\xi} \phi ({\mathbf {x}}, t, \xi ), 1)\) parameterized by \(\xi \). Note that the non-degeneracy of \(\phi \) implies \(|\nabla _{x} \nabla _{\xi} \phi | \simeq 1\), and a \(\mu \)-perturbation of \(\xi \) only causes an \(O(\mu )\)-perturbation of the above vector field. Hence (b) follows from the stability of ODE solutions.
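The stability estimate can be sketched via Grönwall’s inequality: writing \(x(t)\) and \(x'(t)\) for the spatial components of the two integral curves (with parameters \(\xi \) and \(\xi '\), both passing through \(x_{1}\) at time \(t_{1}\)), and \(W\) for the spatial part of the vector field, which is Lipschitz in \({\mathbf {x}}\) with some constant \(L\) and satisfies \(\|W(\cdot ; \xi ) - W(\cdot ; \xi ')\|_{\infty} \lesssim |\xi - \xi '|\):

```latex
e(t) := |x(t) - x'(t)|, \qquad e(t_1) = 0,
\qquad e'(t) \le L\, e(t) + C\,|\xi - \xi'|,
```

so Grönwall gives \(e(t_{2}) \le \frac{C|\xi -\xi '|}{L}\big(e^{L|t_{2}-t_{1}|}-1\big) \lesssim |t_{1}-t_{2}|\,|\xi -\xi '|\) for \(|t_{2}-t_{1}|\lesssim 1\), which is exactly (6.5).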

Next we check (c), which is the “opposite direction” to (b). When checking it we can assume \(t_{1}\) and \(t_{2}\) are sufficiently close and that \(x_{1}\) is sufficiently close to 0 (and in application the honest (c) will always be satisfied after a harmless constant-rescaling of \((x, t)\)). (c) basically asks: if we start from

$$ x_{0} = \Phi (v_{0}, t_{1}, \xi _{0}) $$
(6.24)

and start to change \(\xi \) and solve \(v\) from the equation

$$ x_{0} = \Phi (v, t_{1}, \xi ), $$
(6.25)

how would \(y = \Phi (v, t_{2}, \xi )\) change? For convenience denote \(y_{0} = \Phi (v_{0}, t_{2}, \xi _{0})\). We use differentiation to compute this change. Differentiating (6.25), we see

$$ \nabla _{v} \Phi |_{(v_{0}, t_{1}, \xi _{0})} \cdot \nabla _{\xi} v|_{\xi _{0}} + \nabla _{\xi} \Phi |_{(v_{0}, t_{1}, \xi _{0})} = 0. $$
(6.26)

Hence by the chain rule,

$$ \nabla _{\xi} y = \nabla _{v} \Phi |_{(v_{0}, t_{2}, \xi _{0})} \cdot (-\nabla _{v} \Phi |_{(v_{0}, t_{1}, \xi _{0})})^{-1} \cdot \nabla _{\xi} \Phi |_{(v_{0}, t_{1}, \xi _{0})} + \nabla _{\xi} \Phi |_{(v_{0}, t_{2}, \xi _{0})}. $$
(6.27)

By (3.28) and (3.29), this simplifies to

$$ \nabla _{\xi} y = (\nabla _{x} \nabla _{\xi} \phi |_{y_{0}, t_{2}, \xi _{0}})^{-1} \cdot (\nabla _{\xi}^{2} \phi |_{x_{0}, t_{1}, \xi _{0}} - \nabla _{ \xi}^{2} \phi |_{y_{0}, t_{2}, \xi _{0}}). $$
(6.28)

The first factor \((\nabla _{x} \nabla _{\xi} \phi |_{y_{0}, t_{2}, \xi _{0}})^{-1}\) has entries \(\lesssim 1\) and determinant \(\simeq 1\), and is harmless. We focus on the second factor. It is equal to \((t_{2}-t_{1})\) times some \((\sum _{j=1}^{n-1} c_{j}\partial _{x_{j}}\nabla _{\xi}^{2} \phi + \partial _{t}\nabla _{\xi}^{2} \phi )|_{x_{0}, t_{1}, \xi _{0}}\), where each \(|c_{j}|\lesssim 1\), plus a higher order term in \((t_{2} - t_{1})\). By the familiar technique of parabolic rescaling (see for example the reduction Lemmas 4.1-4.3 in [10]), one can assume that all \(\|\partial _{x_{j}}\nabla _{\xi}^{2} \phi \|\) are uniformly very small; since we have the nondegeneracy condition on \(\partial _{t}\nabla _{\xi}^{2} \phi \) from (1.7), we see that \(\nabla _{\xi} y\) equals \((t_{2} - t_{1})\) times a nondegenerate matrix with bounded entries. From here, (c) follows from an application of the implicit function theorem.

Now that (a)-(c) are all verified, we apply Theorem 6.3 and conclude the proof. □

Proof of Lemma 6.1

At this point, the Lemma is a straightforward consequence of Theorem 6.2. We rescale the whole \(B_{r_{n'}}\) to the unit ball and rescale all \(\mathcal {N}_{r_{j}^{1 / 2+\delta _{j}}} S_{j}\) in Definition 5.9 accordingly (to serve as our \(S_{j}\) in Theorem 6.2). For each possible \(\theta (T)\) in \(\Theta _{r_{i}}[\vec{S}_{n'}]\), we pick the core curve of the corresponding \(T_{n'}\), rescale it into the unit ball and extend it into a \(\Phi \)-curve (with \(\Phi \) defined from \(\phi \) as in the beginning of §3). Then we choose \(\kappa = R^{-\frac{1}{2}}\) and see by Definition 5.9 that the set of \(\kappa \)-tubes around all the above \(\Phi \)-curves satisfies the assumption of Theorem 6.2 with the balls having radii \(s_{j} = \frac{r_{i-1+j}}{r_{n'}}\) and corresponding \(\kappa _{j} = \frac{r_{i-1+j}^{\frac{1}{2}+\delta _{i-1+j}}}{r_{n'}}\). Hence \(\varphi _{j} = r_{n'+1-j}^{-\frac{1}{2}+\delta _{i-1+j}}\). By Wongkew’s theorem [22] on intersections of neighborhoods of algebraic varieties, we can take \(d_{j} = i-1+j\). Lastly, we can take \(E = O_{d, n}(1)\) to be a constant by the definition of \(S_{j}\) in Definition 5.9.

By Theorem 6.2, we see the left hand side of (6.1) is

$$ \lesssim r_{i}^{\delta _{i}} r_{n'}^{\frac{n'-1}{2}}\prod _{j=n'+1}^{i} \left (\frac{r_{j}}{r_{j-1}}\right )^{\frac{j-1}{2}} $$
(6.29)

and is thus bounded by the right hand side. □

As a corollary of Lemma 6.1, we obtain

Corollary 6.5

For \(m\le n'\le n''\), we have

$$ \Big\| g_{\iota , S_{n'}}^{*(n'')} \Big\| _{2}^{2} \lessapprox \Big( \prod _{j=n'}^{n''} r_{j}^{-\frac{1}{2}} \Big) r_{n''}^{-\frac{n-n''-1}{2}} \max _{\tau : \ell (\tau )=r_{n''}^{-1/2}} \Big\| g_{\iota , S_{n'}}^{*(n'')} \Big\| _{ L^{2}_{\mathrm{avg}}(\tau ) }^{2} $$
(6.30)

7 Brooms

7.1 Definition of brooms

Let \((S, B({\mathbf {x}}_{0}, r))\) be a grain of dimension \(n'\) with \(S\in \mathfrak {S}_{n'}\) and assume that it is the last entry of a multi-grain \(\vec {S}\). Throughout this section, we always assume that

$$ r\ge \sqrt{R}. $$
(7.1)

Recall Definition 5.2 and Definition 5.9. Define \(\mathbb{T}[S]\subset \mathbb{T}[B({\mathbf {x}}_{0}, r)]\) to be the collection of tubes that are tangent to \(S\) in the ball \(B({\mathbf {x}}_{0}, r)\). Define

$$ \Theta [S]:=\{\theta (T): T\in \mathbb{T}[S]\}. $$
(7.2)

Moreover, define \(\mathbb{T}_{R}[S]:=\mathbb{T}_{R}[\vec{S}]\).

Before we define brooms, we cut each \(S\) into \(O_{d, n}(1)\) many pieces so that the tangent spaces of each piece form a small angle with each other. Let us be more precise. For each \({\mathbf {z}}\in S\), let \(T_{{\mathbf {z}}} S\) denote the tangent space of \(S\) at \({\mathbf {z}}\). We cut \(S\) into \(O_{d, n}(1)\) many pieces \(\{S', S'', \dots \}\) so that for each such piece, say \(S'\), it holds that

$$ \measuredangle (T_{{\mathbf {z}}_{1}} S', T_{{\mathbf {z}}_{2}} S')\le \frac{1}{100n}, $$
(7.3)

for \({\mathbf {z}}_{1}, {\mathbf {z}}_{2}\in S'\). Similarly, we define \(\mathbb{T}[S'], \Theta [S']\) and \(\mathbb{T}_{R}[S']\). Such a decomposition only appears in this section. To simplify notation, in the rest of this section we will still use \(S\) to refer to each such piece, and still call it a grain.

Fix \(\tau \in \Theta [S]\) and a grain \(S\) satisfying (7.3). Define

$$ \mathbb{T}_{\tau , R}[S]:=\{T\in \mathbb{T}_{R}[S]: \theta (T)\subset \tau \}. $$
(7.4)

Consider the intersections of \(R\)-tubes \(T\in \mathbb{T}_{\tau , R}[S]\) with \(S\). Morally speaking, \(T\cap S\) can be thought of as a “curved” rectangular box of dimensions

$$ r\times \underbrace{R^{1/2+\delta}\times \cdots \times R^{1/2+\delta}}_{(n'-1) \mathrm{ copies}}\times \underbrace{r^{1/2+\delta _{m}}\times \cdots \times r^{1/2+\delta _{m}}}_{(n-n') \mathrm{ copies}}. $$
(7.5)

For two tubes \(T_{1}, T_{2}\in \mathbb{T}_{\tau , R}[S]\), we say that

$$ T_{1}\cap S \cap B(\mathbf{x}_{0}, r)\approx T_{2}\cap S\cap B( \mathbf{x}_{0}, r) $$
(7.6)

if

$$ T_{1}\cap S\cap B(\mathbf{x}_{0}, r) \subset (10 n T_{2})\cap S \cap B( \mathbf{x}_{0}, r), $$
(7.7)

or the other way around. Before we study the geometry of \(S_{\Box} := T\cap S\cap B(\mathbf{x}_{0}, r)\), let us assume without loss of generality that, for every \({\mathbf {z}}\in S\), the tangent space \(T_{{\mathbf {z}}}(S)\) forms an angle \(\le 1/(100n)\) with the subspace spanned by \(\{\vec {e}_{1}, \dots , \vec {e}_{n'-1}, \vec {e}_{n}\}\), that is, the first \((n'-1)\) vectors of the orthonormal basis together with the vertical \(t\)-coordinate direction \(\vec {e}_{n}\).

Lemma 7.1

  1. (1)

    We can write

    $$ S\supseteq \bigcup _{\Box}S_{\Box} $$
    (7.8)

    where \(S_{\Box}=T\cap S\cap B(\mathbf{x}_{0}, r)\) for some \(T\in \mathbb{T}_{\tau , R}[S]\) and \(\{S_{\Box}\}_{\Box}\) is a disjoint collection. Moreover, for every \(T\in \mathbb{T}_{\tau , R}[S]\), we can find \(S_{\Box}\) such that \(T\cap S \cap B(\mathbf{x}_{0}, r)\approx S_{\Box}\).

  2. (2)

    Take \((x_{1}, t_{1})\in S\). For each \(S_{\Box}\), we can find an algebraic variety \(Z\subset \{t=t_{1}\}\) of dimension \(n'-1\) and complexity \(O(\deg (S))\) satisfying that the angle between \(T_{{\mathbf {z}}}(Z)\) and the subspace \(\{\vec{e}_{1}, \dots , \vec{e}_{n'-1}\}\) is \(\le 1/(100n)\) for every \({\mathbf {z}}\in Z\cap S_{\Box}\), such that

    $$ (B(x_{1}, r)\times \{t_{1}\})\cap S_{\Box}\subset \mathcal {N}_{r^{1/2}}(Z). $$
    (7.9)

    Here \(B(x_{1}, r)\) is the ball in \(\mathbb{R}^{n}\) of radius \(r\) centered at \(x_{1}\).

Proof of Lemma 7.1

If \(\mathbb{T}_{\tau , R}[S]\) is empty, then define the right-hand side of (7.8) to be the empty set. Now assume that \(\mathbb{T}_{\tau , R}[S]\) is nonempty. Pick \(T\in \mathbb{T}_{\tau , R}[S]\) and define \(S_{\Box}= T\cap S\cap B(\mathbf{x}_{0}, r)\). If there exists \(T'\in \mathbb{T}_{\tau , R}[S]\) with \(T\cap T'\cap S\cap B(\mathbf{x}_{0}, r)\neq \emptyset \), then \(T\cap S\cap B(\mathbf{x}_{0}, r) \approx T'\cap S \cap B(\mathbf{x}_{0}, r)\). To see this, note that if \(\mathbf{x}\in T\cap T'\) and \(\theta (T), \theta (T')\subset \tau \), then

$$ T\cap B(\mathbf{x}, 10 r) \approx T'\cap B(\mathbf{x}, 10r) $$

and \(B(\mathbf{x}_{0}, r)\subset B(\mathbf{x}, 10r)\).

If there exists \(T'\in \mathbb{T}_{\tau , R}[S]\) such that \(T'\cap S \cap B(\mathbf{x}_{0}, r)\) is disjoint from every \(S_{\Box}\) chosen so far, then we add \(S_{\Box}':=T'\cap S \cap B(\mathbf{x}_{0}, r)\) to the right-hand side of (7.8). Continue until for every \(T\in \mathbb{T}_{\tau , R}[S]\), there exists \(S_{\Box}\) from the right-hand side of (7.8) such that \(T\cap S \cap B(\mathbf{x}_{0}, r)\approx S_{\Box}\).

For each \(S_{\Box}\), define \(Z=S\cap \{t=t_{1}\}\). □

To define brooms, we fix \((S, B({\mathbf {x}}_{0}, r)), \vec {S}\), \(\tau \) and \(S_{\Box}\). Write \({\mathbf {x}}_{0}=(x_{0}, t_{0})\). Define

$$ \mathbb{T}_{\tau , R}[S_{\Box}]:=\{T\in \mathbb{T}_{\tau , R}[S]: T\cap S_{\Box} \neq \emptyset \}. $$
(7.10)

Let us record the following lemma that will be useful later.

Lemma 7.2

Under the above notation, we have that

$$ \bigcup _{T\in \mathbb{T}_{\tau , R}[S_{\Box}]} \Big(T\cap \{(x, t_{1})\in \mathbb{R}^{n}: x\in \mathbb{R}^{n-1}\}\Big) $$
(7.11)

is contained in an \((n-1)\) dimensional ball of radius \(R^{1+\delta} r^{-1/2}\), for every \(|t_{1}-t_{0}|\le R\).

Proof of Lemma 7.2

By translation, we assume that \(S_{\Box}\) contains the origin. Let \(\omega _{0}\) be the center of \(\tau \). For \(\omega \in \tau \), let \(x=X_{\omega}(t)\) denote the solution to

$$ \nabla _{\omega} \phi (x, t; \omega )=0, $$
(7.12)

for \(|t|\le 1\). Then we need to show that

$$ |X_{\omega _{0}}(t)-X_{\omega}(t)|\lesssim r^{-1/2}, $$
(7.13)

for every \(t\). Note that

$$ \nabla _{\omega} \phi (X_{\omega}(t), t; \omega )-\nabla _{\omega} \phi (X_{\omega _{0}}(t), t; \omega _{0})=0. $$
(7.14)

By Taylor’s expansion, this further implies

$$ \nabla _{x} \nabla _{\omega} \phi (x', t; \omega )(X_{\omega}(t)-X_{ \omega _{0}}(t))+\nabla ^{2}_{\omega} \phi (X_{\omega _{0}}(t), t; \omega ')(\omega -\omega _{0})=0 $$
(7.15)

for some \(x', \omega '\). The desired bound follows from the fact that \(\nabla _{x} \nabla _{\omega} \phi \) is non-degenerate. □

We apply the following algorithm. Initialize

$$ \mathbb{T}_{0}:=\mathbb{T}_{\tau , R}[S_{\Box}], \ \mathbb{D}_{0}=\{T\cap \{t=t_{0}+R\}: T\in \mathbb{T}_{\tau , R}[S_{\Box}]\}. $$
(7.16)

Suppose we are at the \(\ell '\)-th step of the algorithm, where \(\ell '\ge 1\). Let \(Z\subset \{t=t_{0}+R\}\) be an algebraic variety of dimension \(n'-1\) and complexity \(O(\deg (S))\) which satisfies that the angle between \(T_{{\mathbf {z}}}(Z)\) and the subspace spanned by \(\{\vec {e}_{1}, \dots , \vec {e}_{n'-1}\}\) is \(\le 1/(100n)\), for every \({\mathbf {z}}\in Z\). Find such a \(Z\) that maximizes

$$ \#\{D_{0}\in \mathbb{D}_{0}: D_{0}\subset \mathcal {N}_{10n R^{1/2+\delta}}(Z)\}; $$
(7.17)

use \(b_{\ell '}\) to denote the number in (7.17). Remove the discs counted in (7.17) from \(\mathbb{D}_{0}\), use \(\mathbb{T}_{\ell '}\) to collect the tubes \(T\) from \(\mathbb{T}_{0}\) for which \(T\cap \{t=t_{0}+R\}\) is removed from \(\mathbb{D}_{0}\) in this step, and repeat this process until no discs are left.

Suppose this algorithm terminates after \(L\) steps. We obtain a collection of positive integers

$$ b_{1}\ge b_{2}\ge \cdots \ge b_{L} $$
(7.18)

and a collection of tubes

$$ \mathbb{T}_{1}, \mathbb{T}_{2}, \dots , \mathbb{T}_{L}. $$
(7.19)

We group \(\{b_{\ell '}\}_{\ell '}\) by checking which interval from

$$ [1, R^{\delta}), [R^{\delta}, R^{2\delta}), [R^{2\delta}, R^{3 \delta}), \dots $$
(7.20)

they belong to:

$$ \{b_{\ell _{0}+1}, \dots , b_{\ell _{1}}\}, \{b_{\ell _{1}+1}, \dots , b_{\ell _{2}}\}, \dots $$
(7.21)

with \(\ell _{0}=0\); therefore two \(b_{\ell '}, b_{\ell ''}\) in the same group are comparable up to a factor of \(R^{\delta}\). Now we are ready to define brooms.

Definition 7.3

Brooms

Fix \((S, B({\mathbf {x}}_{0}, r)), \vec {S}, \tau \) and \(S_{\Box}\). Each

$$ \mathcal {B}_{\ell , b}:=\bigcup _{\ell _{m}+1 \le \ell '\le \ell _{m+1}} \mathbb{T}_{\ell '}, $$
(7.22)

(see (7.19) for definition of \(\mathbb{T}_{\ell '}\)), with level

$$ \ell :=\lfloor R^{w\delta}\rfloor , \text{ with } w\in \mathbb{N}, R^{w\delta} \le \ell _{m+1}-\ell _{m}< R^{(w+1)\delta}, $$
(7.23)

is called a broom. Here

$$ b:=R^{w' \delta} \text{ with } w'\in \mathbb{N}, R^{w'\delta}\le b_{\ell _{m}+1}< R^{(w'+1)\delta}, $$
(7.24)

will be called the length of the broom. For the broom \(\mathcal {B}_{\ell , b}\) in (7.22), we say that it is rooted at \(S_{\Box}\).

In the previous definition, we used tubes from \(\mathbb{T}_{R}[S]\). For technical reasons, we also need to introduce brooms generated by a sub-collection of tubes from \(\mathbb{T}_{R}[S]\). Fix \((S, B({\mathbf {x}}_{0}, r)), \vec {S}, \tau \) and \(S_{\Box}\), and a sub-collection \(\mathbb{T}'_{R}[S]\subset \mathbb{T}_{R}[S]\). We repeat the above definition of brooms with \(\mathbb{T}_{R}[S]\) replaced by \(\mathbb{T}'_{R}[S]\), and obtain a unique decomposition

$$ \mathbb{T}'_{R}[S]=\bigcup _{\ell , b} \mathcal {B}_{\ell , b}(\mathbb{T}'_{R}[S]), $$
(7.25)

where each \(\mathcal {B}_{\ell , b}(\mathbb{T}'_{R}[S])\) is called a broom of level \(\ell \) and length \(b\), and generated by tubes from \(\mathbb{T}'_{R}[S]\).

7.2 Definition of the two-ends relation

Recall the algorithm in Sect. 5.2. For each node \(\mathfrak {n}\in \cup _{\iota} \mathfrak {R}_{\iota}\), we will define a relation \(\sim _{\mathfrak {n}}\); Lemma 5.6 guarantees that we have a small number of these relations.

We define a few auxiliary functions \(\chi _{\mathfrak {n}, \kappa}=\chi _{\kappa}\), taking values 0 or 1, where \(\kappa =((\ell _{1}, b_{1}), \mu _{1}, \dots , (\ell _{\iota}, b_{ \iota}), \mu _{\iota})\), \(\iota \in \mathbb{N}\) and \(\ell _{\iota '}, b_{\iota '}, \mu _{\iota '}\in \{R^{w\delta}: w\in \mathbb{N}\}\) for every \(1\le \iota '\le \iota \). Denote

$$ r=\rho (\mathfrak {n}). $$
(7.26)

Step 1. For \(S\in \mathfrak {n}\) and a tube \(T\in \mathbb{T}[B_{R}]\), we say that

$$ \chi _{(\ell _{1}, b_{1})}(S, T)=1 $$
(7.27)

if \(T\) belongs to a broom rooted at some \(S_{\Box}\subset S\) with level \(\ell _{1}\) and length \(b_{1}\). Moreover, we say that

$$ \chi _{(\ell _{1}, b_{1}), \mu _{1}}(S, T)=1 $$
(7.28)

if

$$ \chi _{(\ell _{1}, b_{1})}(S, T)=1 $$
(7.29)

and

$$ \mu _{1}\le \sum _{S'\in \mathfrak {n}}\chi _{(\ell _{1}, b_{1})}(S', T)< \mu _{1} R^{\delta}. $$
(7.30)

A general step. Suppose we have defined \(\chi _{\kappa}\) for \(\kappa =((\ell _{1}, b_{1}), \mu _{1}, \dots , (\ell _{\iota}, b_{ \iota}), \mu _{\iota})\), and \(\iota \ge 1\). Let us define

$$ \chi _{\kappa , (\ell _{\iota +1}, b_{\iota +1})}, \ \ \chi _{ \kappa , (\ell _{\iota +1}, b_{\iota +1}), \mu _{\iota +1}}. $$
(7.31)

For fixed \(S\in \mathfrak {n}\), define

$$ \mathbb{T}_{S, \kappa}:=\{T'\in \mathbb{T}[B_{R}]: \chi _{\kappa}(S, T')=1\}. $$
(7.32)

Recall (7.25). Write

$$ \mathbb{T}_{S, \kappa}=\bigcup _{\ell _{\iota +1}, b_{\iota +1}} \bigcup _{ \tau , S_{\Box}} \mathcal {B}_{\ell _{\iota +1}, b_{\iota +1}, \tau , S_{ \Box}}( \mathbb{T}_{S, \kappa}), $$
(7.33)

where \(\tau \) runs through all frequency caps of side length \(\rho (\mathfrak {n})^{-1/2}\), \(S_{\Box}\) is as given in Lemma 7.1, \(\mathcal {B}_{\ell _{\iota +1}, b_{\iota +1}, \tau , S_{\Box}}( \mathbb{T}_{S, \kappa})\) is a broom of level \(\ell _{\iota +1}\), length \(b_{\iota +1}\), rooted at \(S_{\Box}\) and generated by tubes from \(\mathbb{T}_{S, \kappa}\). We then say that

$$ \chi _{\kappa , (\ell _{\iota +1}, b_{\iota +1})}(S, T)=1 \text{ if } T \in \mathcal {B}_{\ell _{\iota +1}, b_{\iota +1}, \tau , S_{\Box}}( \mathbb{T}_{S, \kappa}), $$
(7.34)

for some \(\tau \) and \(S_{\Box}\). Next, set

$$ \chi _{\kappa , (\ell _{\iota +1}, b_{\iota +1}), \mu _{\iota +1}}(S, T)=1 $$
(7.35)

if

$$ \chi _{\kappa , (\ell _{\iota +1}, b_{\iota +1})}(S, T)=1 $$
(7.36)

and

$$ \mu _{\iota +1}\le \sum _{S'\in \mathfrak {n}} \chi _{\kappa , (\ell _{ \iota +1}, b_{\iota +1})}(S', T)< \mu _{\iota +1} R^{ \delta}. $$
(7.37)

This finishes the definition of the auxiliary functions we need.

For \(\kappa =((\ell _{1}, b_{1}), \mu _{1}, \dots , (\ell _{\iota}, b_{ \iota}), \mu _{\iota})\), we say that \(\kappa \) is admissible if there exists exactly one pair \((\iota _{1}, \iota _{2})\) with \(\iota _{1}\neq \iota _{2}\) such that

$$ ((\ell _{\iota _{1}}, b_{\iota _{1}}), \mu _{\iota _{1}})=((\ell _{ \iota _{2}}, b_{\iota _{2}}), \mu _{\iota _{2}}). $$
(7.38)

Lemma 7.4

The number of admissible \(\kappa \) is \(O_{\delta}(1)\).

Proof of Lemma 7.4

Note that by (7.23), the number of values that \(\ell _{\iota '}\) can take is \(O(\delta ^{-1})\); the same is true for \(b_{\iota '}\) and \(\mu _{\iota '}\), for each \(\iota '\). Moreover, since an admissible \(\kappa \) contains exactly one repeated triple \(((\ell , b), \mu )\), its length \(\iota \) is at most \(O(\delta ^{-3})+1\). The lemma follows. □

Definition 7.5

For a ball \(B\subset B_{R}\) of radius \(R^{1-\delta}\), a tube \(T\in \mathbb{T}[B_{R}]\), a node \(\mathfrak {n}\in \cup _{\iota} \mathfrak {R}_{\iota}\) and an admissible multi-index \(\kappa \), we say that \(B\sim _{\mathfrak {n}, \kappa} T\) if \(B\) maximizes

$$ \#\{S'\in \mathfrak {n}: S'\subset B', \chi _{\mathfrak {n}, \kappa}(S', T)=1\}, $$
(7.39)

among all \(B'\) of radius \(R^{1-\delta}\).

Definition 7.6

Relation

For a ball \(B\) of radius \(R^{1-\delta}\) and a tube \(T\in \mathbb{T}[B_{R}]\), we say that \(B\sim T\) if

$$ B\sim _{\mathfrak {n}, \kappa} T, $$
(7.40)

for some node \(\mathfrak {n}\in \cup _{\iota} \mathfrak {R}_{\iota}\) and admissible \(\kappa \).

By a simple inductive argument on \(\kappa \), we have:

Lemma 7.7

Let \(\vec {S}\) be a multi-grain with the last component given by \(S\). For every \(T\in \mathbb{T}_{R}[S]\), there exists exactly one admissible \(\kappa \) such that \(T\in \mathbb{T}_{S, \kappa}\).

7.3 Broom estimates

Let \(\vec {S}_{n'}\) be a multigrain from Definition 5.9 with the last component given by \(S_{n'}\). Let \(B_{\iota}\subset B_{R}\) be the ball of radius \(R^{1-\delta}\) that contains \(S_{n'}\). Recall the definition of \(f^{*}_{\iota , S_{n'}}\) from Sect. 5.4 and the definition of \(f_{\iota , S_{n'}}^{*(n'')}\) with \(n'\le n''\le n\) from (5.108). The notation \(*(n'')\) means that we start with the function \(f^{*}_{\iota , S_{n'}}\), which is defined via wave packets from \(\mathbb{T}[B_{r_{n'}}]\), and “trace” back by Definition 5.9 along the nodes in (5.86) to wave packets in \(\mathbb{T}[B_{r_{n''}}]\). Note that

$$ f_{\iota , S_{n'}}^{*(n'')}=f^{*}_{\iota , S_{n'}}, \text{ when } n''=n'. $$
(7.41)

Define

$$ f^{\nsim}_{S_{n'}, \tau}:=(f_{\tau})^{*}_{\iota , S_{n'}}. $$
(7.42)

Here we have suppressed the dependence on \(\iota \) and replaced it by ≁, since \(B_{\iota}\) is uniquely determined by \(S_{n'}\); we would also like to emphasize that we are in the non-related case. Moreover, define

$$ f^{\nsim (n'')}_{S_{n'}, \tau}:=(f_{\tau})^{*(n'')}_{\iota , S_{n'}}. $$
(7.43)

The main goal of this subsection is to prove the following broom estimate.

Theorem 7.8

Let \(\vec {S}_{n'}=(S_{n}, \dots , S_{n'})\) be a multi-grain and \(S_{n'}=\mathcal {N}_{r_{n'}^{1/2+\delta _{n'}}}(Z_{n'})\cap B_{r_{n'}}\) with \(r_{n'}\ge \sqrt{R}\), where \(Z_{n'}\) is an \(n'\)-dimensional algebraic variety of degree \(\lesssim _{n'} 1\) in \(\mathbb{R}^{n}\). Then, for \(\tau \) of scale \(r_{n'}^{-1/2}\) and \(n'\le n''< n\), it holds that

$$ \|f^{\nsim (n'')}_{S_{n'}, \tau}\|_{L^{2}}^{2} \lessapprox \Big( \frac{r_{n''}}{R}\Big)^{\frac{n-n'}{2}} \|f_{\tau}\|_{L^{2}}^{2}. $$
(7.44)

Here \(r_{i}=\rho (S_{i})\) for each \(n'\le i\le n\).

Proof of Theorem 7.8

We will first write down the details for the case \(n''=n'\), as it requires less notation and contains all the ideas of the proof; the general case \(n''\ge n'\) will be remarked on at the end. Our goal is to prove

$$ \|f^{\nsim}_{S_{n'}, \tau}\|_{L^{2}}^{2} \lessapprox \Big( \frac{r_{n'}}{R}\Big)^{\frac{n-n'}{2}} \|f_{\tau}\|_{L^{2}}^{2}. $$
(7.45)

To simplify notation, we will abbreviate \(S_{n'}\) to \(S\), \(\vec {S}_{n'}\) to \(\vec {S}\) and \(r_{n'}\) to \(r\). For the ball \(B_{\iota}\) of radius \(R^{1-\delta}\) containing \(S\), the node \(\mathfrak {n}\) containing \(S\) and each admissible multi-index \(\kappa \), recall the notation in (7.32) and define

$$ \mathbb{T}^{\nsim}_{S, \kappa ,\tau}:=\{ T \in \mathbb{T}_{S, \kappa}: \theta (T)\subset \tau , B_{\iota}\nsim _{\mathfrak {n}, \kappa} T\}, $$
(7.46)

and

$$ f^{\nsim}_{\kappa , S, \tau}:= \sum _{T\in \mathbb{T}^{\nsim}_{S, \kappa ,\tau}}f_{T} \quad \text{and} \quad f^{\nsim , *}_{\kappa , S, \tau} := (f^{\nsim}_{\kappa , S, \tau} )^{*}_{\iota , S}. $$

Then we claim that

$$ f^{\nsim}_{S,\tau} = \sum _{\kappa} f^{\nsim , *}_{\kappa , S, \tau}. $$
(7.47)

To see this, note that for every \(T\in \mathbb{T}[B_{R}]\) that is tangent to \(S\), by Lemma 7.7, there exists exactly one admissible \(\kappa \) such that \(T\in \mathbb{T}_{S, \kappa}\). Moreover, if \(T\nsim B_{\iota}\), then for the above given \(\mathfrak {n}\) and \(\kappa \), we also have \(T\nsim _{\mathfrak {n}, \kappa} B_{\iota}\).

By the Cauchy-Schwarz inequality,

$$ \big\| f^{\nsim}_{S,\tau} \big\| ^{2}_{2} \lesssim _{\delta} \sum _{\kappa} \big\| f^{\nsim , *}_{\kappa , S, \tau} \big\| ^{2}_{2}. $$
(7.48)

Here we used Lemma 7.4. Next, we localize \(f^{\nsim}_{\kappa , S, \tau}\) further in space. Recall from Lemma 7.1 that \(S=\cup _{\Box} S_{\Box}\). Denote

$$ \mathbb{T}^{\nsim}_{S_{\Box}, \kappa ,\tau}:=\{ T \in \mathbb{T}_{S, \kappa}: T\cap S_{\Box}\neq \emptyset , \theta (T)\subset \tau , B_{ \iota}\nsim _{\mathfrak {n}, \kappa} T\}, $$
(7.49)

and then

$$ f^{\nsim}_{\kappa , S_{\Box}, \tau}:= \sum _{T\in \mathbb{T}^{\nsim}_{S_{ \Box}, \kappa ,\tau}}f_{T} \quad \text{and} \quad f^{\nsim , *}_{ \kappa , S_{\Box}, \tau} := (f^{\nsim}_{\kappa , S_{\Box}, \tau} )^{*}_{ \iota , S}. $$
(7.50)

By spatial disjointness of \(\{S_{\Box}\}_{\Box}\), we have

$$ \text{RHS of (7.48)} \lesssim \sum _{\kappa} \sum _{\Box} \big\| f^{\nsim , *}_{\kappa , S_{\Box}, \tau} \big\| _{2}^{2}. $$
(7.51)

Let \((x_{1}, t_{1})\) be a point in \(S\), then

$$ \|f^{\nsim , *}_{\kappa , S_{\Box}, \tau}\|_{L^{2}}^{2} \lesssim \|T^{ \lambda} f^{\nsim , *}_{\kappa , S_{\Box},\tau}\|_{L^{2}( (B(x_{1}, 2r_{n'}) \times \{t_{1}\})\cap S_{\Box})}^{2}. $$
(7.52)

Here \(B(x_{1}, 2r_{n'})\) is an \((n-1)\)-dimensional ball. Indeed, inequalities of the form (7.52) hold for more general data. Let \(h_{\tau}\) be supported on \(\tau \). Assume that \(T^{\lambda} h_{\tau}(\cdot , t_{1})\) is essentially supported on \(B_{\sqrt{R}}\subset \mathbb{R}^{n-1}\), a ball of radius \(\sqrt{R}\). Then

$$ \big\| h_{\tau} \big\| _{2}^{2} \lesssim \big\| T^{\lambda} h_{\tau} \big\| _{L^{2} ( B_{ \sqrt{R}}\times \{t_{1}\} ) }^{2}. $$
(7.53)

To see this, we first apply the change of variables

$$ x\mapsto x+x_{0}, t\mapsto t+t_{1}, \xi \mapsto \xi +\xi _{0}, $$
(7.54)

where \(x_{0}\) is the center of \(B_{\sqrt{R}}\) and \(\xi _{0}\) is the center of \(\tau \). We obtain

$$ T^{\lambda} h_{\tau}(x+x_{0}, t+t_{1})=\int h_{\tau}(\xi +\xi _{0}) e^{i \phi ^{\lambda}(x+x_{0}, t+t_{1}; \xi +\xi _{0})}a^{\lambda}(x+x_{0}, t+t_{1}; \xi +\xi _{0}) d\xi . $$
(7.55)

Its absolute value can be written as the absolute value of

$$ \begin{aligned} & \int \widetilde{h}_{\tau}(\xi ) e^{i\phi ^{\lambda}_{0}(x, t; \xi )} a^{\lambda}(x+x_{0}, t+t_{1}; \xi +\xi _{0}) d\xi \end{aligned} $$
(7.56)

where

$$ \begin{aligned} \phi ^{\lambda}_{0}(x, t; \xi ):= & \phi ^{\lambda}(x+x_{0}, t+t_{1}; \xi +\xi _{0})-\phi ^{\lambda}(x_{0}, t_{1}; \xi +\xi _{0}) \\ & - \phi ^{\lambda}(x+x_{0}, t+t_{1}; \xi _{0})+ \phi ^{\lambda}(x_{0}, t_{1}; \xi _{0}) \end{aligned} $$
(7.57)

and

$$ \widetilde{h}_{\tau}(\xi ):=h_{\tau}(\xi +\xi _{0}) e^{i\phi ^{ \lambda}(x_{0}, t_{1}; \xi +\xi _{0})}. $$
(7.58)

This change of variables tells us that, if we denote

$$ \widetilde{T}^{\lambda} h_{\tau}(x, t):=\int h_{\tau}(\xi ) e^{i \phi _{0}^{\lambda}(x, t; \xi )}a^{\lambda}({\mathbf {x}}; \xi ) d\xi , $$
(7.59)

then to prove (7.53), it suffices to prove it for the operator as defined in (7.59) with \(t_{1}=0, x_{0}=0, \xi _{0}=0\).

We apply Taylor expansion to \(\phi _{0}^{\lambda}\) in the \(x\) variable about the origin, and write

$$ \phi _{0}^{\lambda}(x, 0; \xi )=\nabla _{x}\phi _{0}^{\lambda}(0; \xi )\cdot x+ w_{1}(x; \xi ). $$
(7.60)

Note that

$$ |\nabla _{x}^{2} \phi _{0}^{\lambda}| \lesssim 1/\lambda , \ \ |x| \lesssim \sqrt{R}, $$
(7.61)

and therefore \(|w_{1}(x; \xi )|\lesssim 1\). We apply Taylor expansion in the \(\xi \) variable about the origin, and obtain

$$ \phi _{0}^{\lambda}(x, 0; \xi )=\langle x,\xi\rangle + \underbrace{w_{2}(x; \xi )+ w_{1}(x; \xi )}_{=: w(x; \xi )}. $$
(7.62)

Note that

$$ |\nabla _{x}\nabla ^{\beta}_{\xi} \phi _{0}^{\lambda}|\lesssim _{\beta} 1, $$
(7.63)

for all multi-indices \(\beta \), and therefore \(|w_{2}(x; \xi )|\lesssim r^{-1} \sqrt{R} \lesssim 1\). Now the claimed estimate (7.53) follows from Plancherel’s theorem and the above Taylor expansions.
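To illustrate the mechanism (a model computation; the general case adds the bounded perturbation \(w\) and the amplitude): when \(w\equiv 0\) and \(a^{\lambda}\equiv 1\), the operator in (7.59) at \(t=0\) is the (inverse) Fourier transform, and Plancherel's theorem gives

$$ \big\| \widetilde{T}^{\lambda} h_{\tau}(\cdot , 0) \big\| _{L^{2}( \mathbb{R}^{n-1})}^{2}=\int \Big| \int h_{\tau}(\xi ) e^{i\langle x, \xi \rangle} d\xi \Big| ^{2} dx=(2\pi )^{n-1} \big\| h_{\tau} \big\| _{2}^{2}; $$

combined with the essential-support assumption, which allows one to restrict the \(x\)-integral to \(B_{\sqrt{R}}\) at the cost of a negligible error, this yields (7.53) in the model case.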

Suppose that

$$ \kappa =( (\ell _{1}, b_{1}), \mu _{1}, \dots , (\ell _{j}, b_{j}), \mu _{j}). $$
(7.64)

For each plank \(S_{\Box}\),

$$ \mathbb{T}_{\tau , R}[S_{\Box}] \cap \mathbb{T}_{S, \kappa}, $$

where \(\mathbb{T}_{\tau , R}[S_{\Box}]\) was defined in (7.10), is contained in a broom \(\mathcal{B}_{\ell _{j}, b_{j}}\). Recall the definition of brooms and the notation in (7.19). After relabeling the \(\mathbb{T}_{\ell '}\), write

$$ \mathcal{B}_{\ell _{j}, b_{j}}=\bigcup _{1\le \ell '\le \ell _{j}} \mathbb{T}_{\ell '}, $$
(7.65)

and

$$ \bigcup _{T\in \mathbb{T}_{\ell '}} T \cap \{t=t_{1}+R\}\subset \mathcal{N}_{10nR^{1/2+ \delta}}(Z_{\ell '}) $$
(7.66)

for an algebraic variety \(Z_{\ell '}\) of dimension \(n'-1\) and complexity \(O(\deg (S))\) satisfying that the angle between \(T_{\mathbf{z}}(Z_{\ell '})\) and the space spanned by \(\{\vec{e}_{1}, \dots , \vec{e}_{n'-1}\}\) is \(\leq 1/(100n)\), for every \(\mathbf{z}\in Z_{\ell '}\). Write

$$ f^{\nsim}_{\kappa , S_{\Box}, \tau , \ell '}=\sum _{T\in \mathbb{T}^{ \nsim}_{S_{\Box}, \kappa , \tau}\cap \mathbb{T}_{\ell '}} f_{T} $$

and

$$ f^{\nsim , *}_{\kappa , S_{\Box}, \tau , \ell '} := (f^{\nsim}_{ \kappa , S_{\Box}, \tau , \ell '})^{*}_{\iota , S}. $$

Then by the triangle inequality and Cauchy-Schwarz inequality,

$$ \begin{aligned} & \| T^{\lambda} f^{\nsim , *}_{\kappa , S_{\Box}, \tau} \|_{L^{2}( (B(x_{1}, 2r) \times \{t_{1}\}) \cap S_{\Box})}^{2} \\ & \lesssim \ell _{j} \sum _{1\le \ell '\le \ell _{j}} \|T^{\lambda} f^{ \nsim , *}_{\kappa , S_{\Box}, \tau , \ell '} \|_{L^{2}( (B(x_{1}, 2r) \times \{t_{1}\}) \cap S_{\Box})}^{2}. \end{aligned} $$
(7.67)

Claim 7.9

For each \(\ell '\), it holds that

$$ \begin{aligned} & \|T^{\lambda} f^{\nsim , *}_{\kappa , S_{\Box}, \tau , \ell '}\|_{L^{2}( (B(x_{1}, 2r) \times \{t_{1}\}) \cap S_{\Box})}^{2} \\ & \lessapprox (\frac{r}{R})^{\frac{n-n'}{2}} \|T^{\lambda}f^{\nsim}_{ \kappa , S_{\Box}, \tau , \ell '}\|_{L^{2}(\{ t=t_{2}\})}^{2} \end{aligned} $$
(7.68)

where \(t_{2}:=t_{1}+R\).

We first accept Claim 7.9, and continue with the \(L^{2}\) estimate:

$$ \text{RHS of (7.68)} \lessapprox \Big(\frac{r}{R}\Big)^{\frac{n-n'}{2}} \|f^{ \nsim}_{\kappa , S_{\Box}, \tau , \ell '}\|^{2}_{L^{2}}. $$
(7.69)

Summing over all \(\ell '\) and \(S_{\Box}\), we obtain

$$ \|T^{\lambda} f^{\nsim , *}_{\kappa , S, \tau}\|_{L^{2} (B(x_{1}, 2r)\times \{t_{1} \}) }^{2} \lessapprox \ell _{j} \Big(\frac{r}{R}\Big)^{\frac{n-n'}{2}} \|f^{ \nsim}_{\kappa , S, \tau}\|_{L^{2}}^{2}. $$

By Lemma 7.10 below and the assumption that all wave packets have comparable coefficients, we conclude that

$$ \|f^{\nsim}_{\kappa , S, \tau}\|_{L^{2}}^{2} \lesssim \ell _{j}^{-1} R^{O(n \delta )}\|f_{\tau}\|_{L^{2}}^{2}. $$

This finishes the proof of the theorem, modulo the proofs of Lemma 7.10 and Claim 7.9. □

Lemma 7.10

Let \(\kappa = ( (\ell _{1}, b_{1}), \mu _{1}, \dots , (\ell _{j}, b_{j}), \mu _{j}) \) be an admissible multi-index and \(\mathbb{T}^{\nsim}_{S, \kappa , \tau}\) be defined as in (7.46). We have

$$ | \mathbb{T}^{\nsim}_{S, \kappa , \tau} |\lesssim \ell _{j}^{-1} R^{O(n \delta )} |\mathbb{T}_{\tau}|, $$

where \(\mathbb{T}_{\tau}\) collects tubes \(T\in \mathbb{T}[B_{R}]\) with \(\theta (T)\subset \tau \).

Proof of Lemma 7.10

Since \(\kappa \) is admissible, there exists

$$ \kappa '= ( (\ell _{1}, b_{1}), \mu _{1}, \dots , (\ell _{j'}, b_{j'}), \mu _{j'}) $$
(7.70)

such that

$$ \kappa = (\kappa ', (\ell _{j'+1}, b_{j'+1}), \mu _{j'+1}, \dots , ( \ell _{j}, b_{j}), \mu _{j} ) $$
(7.71)

and

$$ ((\ell _{j'}, b_{j'}),\mu _{j'})=((\ell _{j}, b_{j}), \mu _{j}). $$
(7.72)

Let \(B\) be a ball of radius \(R^{1-\delta}\) containing \(S\) and let \(\mathfrak {n}\) be the node containing \(S\). Then for each \(S'\in \mathfrak {n}\) with \(S'\not\subset 2B\), we have

$$ \sum _{T\nsim B, T\in \mathbb{T}_{\tau}} \chi _{\mathfrak {n}, \kappa '} (S', T)\chi _{\mathfrak {n}, \kappa}(S, T) \lesssim R^{O(n \delta )} \ell _{j'}^{-1} \sum _{ T\in \mathbb{T}_{\tau}} \chi _{\mathfrak {n}, \kappa '} (S', T). $$
(7.73)

Summing over all \(S'\in \mathfrak {n}, S'\not\subset 2B\),

$$ \begin{aligned} & \sum _{S'\in \mathfrak {n}, S'\not\subset 2B} \sum _{ T\nsim B, T \in \mathbb{T}_{\tau}} \chi _{\mathfrak {n}, \kappa}(S, T)\chi _{\mathfrak {n}, \kappa '}(S', T) \\ & \leq R^{O(n\delta )}\ell _{j'}^{-1} \sum _{S'\in \mathfrak {n}, S' \not\subset 2B} \sum _{ T\in \mathbb{T}_{\tau} }\chi _{\mathfrak {n}, \kappa '} (S', T). \end{aligned} $$
(7.74)

On the other hand, for each \(T\nsim B\) with \(\chi _{\mathfrak {n}, \kappa}(S, T)=1\),

$$ \sum _{S'\in \mathfrak {n}, S'\not\subset 2B} \chi _{\mathfrak {n}, \kappa '} (S', T) \geq \mu _{j} R^{-\delta}. $$

As a consequence,

$$ |\mathbb{T}^{\nsim}_{S, \kappa , \tau}| \leq R^{\delta} \mu _{j}^{-1} \sum _{S'\in \mathfrak {n}, S'\not\subset 2B} \sum _{ T\nsim B, T\in \mathbb{T}_{\tau}} \chi _{\mathfrak {n}, \kappa}(S, T)\chi _{\mathfrak {n}, \kappa '}(S', T). $$

Moreover, note that

$$ \sum _{S'\in \mathfrak {n}, S'\not\subset 2B} \sum _{ T\in \mathbb{T}_{\tau}} \chi _{\mathfrak {n}, \kappa '} (S', T) \lesssim R^{\delta} \mu _{j'} | \mathbb{T}_{\tau}|. $$

Combining the last three displays and using (7.72) (so that \(\ell _{j'}=\ell _{j}\) and \(\mu _{j'}=\mu _{j}\)), we obtain \(|\mathbb{T}^{\nsim}_{S, \kappa , \tau}| \lesssim \ell _{j}^{-1} R^{O(n\delta )} |\mathbb{T}_{\tau}|\), as desired. □

In the rest of this section, we will prove Claim 7.9. We start with the proof of Lemma 1.10, which will be an important ingredient in the proof of Claim 7.9.

Proof of Lemma 1.10

Denote \(\sigma :=\sqrt{R_{1}}/\sqrt{R_{2}}\), \(\Omega '_{1}=\mathcal {N}_{1}(Z_{1})\) and \(\Omega '_{2}:=\mathcal {N}_{\sigma}(Z_{2})\). By scaling, let us assume that \(\mathrm{supp}(F)\subset \Omega '_{2}\), and we need to prove

$$ \big\| \widehat{F} \big\| _{L^{2}(\Omega '_{1})}^{2} \lesssim \sigma ^{n-m- \delta} \big\| F \big\| _{L^{2}}^{2}. $$
(7.75)

Let \(K\) be a large number depending on \(n, \deg (Z_{2}), \delta \), which is to be determined. We cut \(Z_{2}\) into \(O_{n, \deg (Z_{2}), \delta}(1)\) many pieces so that for each piece \(Z'_{2}\subset Z_{2}\), there exists a linear subspace of dimension \(m-1\) satisfying that the angle between \(T_{{\mathbf {z}}}(Z'_{2})\) and the linear subspace is \(\le 1/K\) for every \({\mathbf {z}}\in Z'_{2}\). As our constant is allowed to depend on \(K\), we only need to prove Lemma 1.10 for each \(Z'_{2}\).

Claim 7.11

Fix \(Z'_{2}\) as above and \(K'\ge K\). Let \(V\subset \mathbb{R}^{n-1}\) be an \((n-m)\)-dimensional affine subspace such that the angle between \(T_{{\mathbf {z}}}(Z'_{2})\) and \(V^{\perp}\) is \(\le 1/K\) for every \({\mathbf {z}}\in Z'_{2}\). Let \(S_{V}\) denote the \(1/K'\)-neighbourhood of \(V\). Then \(Z'_{2}\cap S_{V}\) is contained in a union of \(O(\deg (Z_{2})^{n-1})\) many rectangular boxes of dimensions

$$ \underbrace{\frac{1}{K'}\times \cdots \times \frac{1}{K'}}_{(m-1) \textit{ copies}} \times \underbrace{\frac{1}{K K'}\times \cdots \times \frac{1}{K K'}}_{(n-m) \textit{ copies}} $$
(7.76)

whose long sides are parallel to \(V^{\perp}\).

Proof of Claim 7.11

Without loss of generality, we may assume \(K = e^{n}\) (all we need here is that the angle bound \(1/K\) is a small constant) and \(K' = 1\), since otherwise we can perform an (anisotropic) rescaling. Since the angle between every \(T_{{\mathbf {z}}}(Z'_{2})\) and \(V^{\perp}\) is \(\le e^{-n}\), we see that the angle between every \(T_{{\mathbf {z}}}(Z'_{2})\) and \(V\) is \(\gtrsim 1\). Hence if we take the union of all points \({\mathbf {z}}\in Z_{2} \cap S_{V}\) such that the angle between \(T_{{\mathbf {z}}}(Z_{2})\) and \(V\) is \(\gtrsim 1\), it suffices to prove that this whole set can be contained in a union of \(O(\deg (Z_{2})^{n-1})\) many unit balls. When \(n-m = 1\), this was proved by Guth as a special case of Lemma 5.7 in [9] (taking \(r \simeq \alpha \simeq 1\) there). For general \(n\) and \(m\) this can be proved by induction on the dimension, exactly as in the proof of Lemma 5.7 in [9], and we thus omit the details. □

We now continue the proof of Lemma 1.10. We start with a trivial estimate:

$$ \big\| \widehat{F} \big\| _{L^{2}(\mathcal {N}_{1}(Z_{1}))}^{2} \le \big\| \widehat{F} \big\| _{L^{2}(\mathcal {N}_{K}(Z_{1}))}^{2}. $$
(7.77)

For a given \(K'\ge 1\), let \(\mathcal {P}_{K'}\) be a partition of \(\Omega '_{2}\) into disjoint pieces \(\{\Omega '_{2, K'}\}\) so that the orthogonal projection of each \(\Omega '_{2, K'}\) into \(\mathrm{span}\{\vec {e}_{1}, \dots , \vec {e}_{m-1}\}\) is roughly a dyadic cube of side length \(1/K'\). Denote

$$ F_{\Omega '_{2, K'}}:=F\cdot \mathbf{1}_{\Omega '_{2, K'}}. $$
(7.78)

By local \(L^{2}\) orthogonality (see for instance [8, Appendix B]) and Claim 7.11,

$$ \text{(7.77)} \lesssim \sum _{\Omega '_{2, K}} \big\| \widehat{F_{\Omega '_{2, K}}} \big\| _{L^{2}(\mathcal {N}_{K}(Z_{1}))}^{2}. $$
(7.79)

When applying the local orthogonality lemma in [8, Appendix B], the \(L^{2}\) norm on the right hand side of (7.79) carries a weight that is associated to \(\mathcal {N}_{K}(Z_{1})\) and decays rapidly outside \(\mathcal {N}_{K}(Z_{1})\); this is standard and we leave out the technical description. By the assumption of Lemma 1.10, we see that \(\Omega '_{2, K}\) is contained in a rectangular box of dimensions

$$ \underbrace{\frac{1}{K}\times \dots \times \frac{1}{K}}_{(m-1) \text{ copies}} \times \underbrace{\frac{1}{K^{2}}\times \dots \times \frac{1}{K^{2}}}_{(n-m) \text{ copies}}, $$
(7.80)

such that its long axes span a subspace that makes an angle \(\lesssim 1/K\) with \(\mathrm{span}\{\vec {e}_{1}, \dots , \vec {e}_{m-1}\}\).

We now use the uncertainty principle to analyze \(\widehat{F_{\Omega '_{2, K}}}\), again with the help of (the \(K\simeq K'\simeq 1\) version of) Claim 7.11. Each \(|\widehat{F_{\Omega '_{2, K}}}|\) is essentially constant on dual boxes of dimensions \(\underbrace{K\times \dots \times K}_{(m-1) \text{ copies}} \times \underbrace{K^{2}\times \dots \times K^{2}}_{(n-m) \text{ copies}}\). Applying Claim 7.11 with all parameters taken to be \(\simeq 1\) and rescaling the conclusion by a factor of \(K\), we see that if we tile any such box by \(K\)-balls, then \(Z_{1}\) intersects only \(\simeq 1\) of them.
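Behind the factor \(1/K^{n-m}\) appearing in the next display is a simple volume count, which we sketch here: \(|\widehat{F_{\Omega '_{2, K}}}|\) is essentially constant on each dual box, and \(Z_{1}\) meets only \(\simeq 1\) of the \(K\)-balls tiling the box, so the \(L^{2}\) mass on \(\mathcal {N}_{K}(Z_{1})\) inside a box is roughly the following fraction of the mass on the whole box (working in \(\mathbb{R}^{n-1}\)):

$$ \frac{|B_{K}|}{|\mathrm{box}|} \simeq \frac{K^{n-1}}{K^{m-1}\cdot K^{2(n-m)}} = K^{-(n-m)}. $$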

Therefore, the uncertainty principle (for instance the version in [8, Appendix B]) and the above geometric observation imply that

$$ \begin{aligned} \text{(7.79)}& \lesssim \frac{1}{K^{n-m}} \sum _{ \Omega '_{2, K}} \big\| \widehat{F_{\Omega '_{2, K}}} \big\| _{L^{2}(\mathcal {N}_{K^{2}}(Z_{1}))}^{2} \\ & \lesssim \frac{1}{K^{n-m}} \sum _{\Omega '_{2, K^{2}}} \big\| \widehat{F_{\Omega '_{2, K^{2}}}} \big\| _{L^{2}(\mathcal {N}_{K^{2}}(Z_{1}))}^{2}, \end{aligned} $$
(7.81)

where in the second inequality we use \(L^{2}\) orthogonality. Similarly to (7.80), \(\Omega '_{2, K^{2}}\) is contained in a rectangular box of dimensions

$$ \underbrace{\frac{1}{K^{2}}\times \dots \times \frac{1}{K^{2}}}_{(m-1) \text{ copies}} \times \underbrace{\frac{1}{K^{3}}\times \dots \times \frac{1}{K^{3}}}_{(n-m) \text{ copies}} $$
(7.82)

with nearby directions.

We continue this process repeatedly, and in the end arrive at

$$ \big\| \widehat{F} \big\| _{L^{2}(\mathcal {N}_{1}(Z_{1}))}^{2}\le C^{W} \Big( \frac{1}{K^{W}} \Big) ^{n-m} \big\| F \big\| _{L^{2}}^{2}, $$
(7.83)

where \(K^{W}=1/\sigma \). Finally, we pick \(K\) large enough to absorb the constant \(C^{W}\). □
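To make the last step explicit, here is a hedged computation showing why picking \(K\) large suffices; we write \(\epsilon \) for the admissible loss in the exponent (a parameter of this sketch, not a symbol fixed by the lemma). Since \(K^{W}=1/\sigma \), that is \(W=\log _{K}(1/\sigma )\),

$$ C^{W} \Big( \frac{1}{K^{W}} \Big) ^{n-m} = (1/\sigma )^{\log _{K} C}\, \sigma ^{n-m} \le \sigma ^{n-m-\epsilon}, $$

provided \(K\ge C^{1/\epsilon}\).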

Next, we will prove a simpler version of Claim 7.9, see Lemma 7.12 below. The proof of this lemma contains the main idea of that of Claim 7.9, and indeed we will apply Lemma 7.12 iteratively to prove Claim 7.9. To simplify notation, let us denote

$$ F(x, t):=\int f_{\tau}(\xi ) e^{i \phi ^{\lambda}(x, t; \xi )}a^{ \lambda}({\mathbf {x}}; \xi ) d\xi , $$
(7.84)

where \(\tau \) is a frequency cap.

Lemma 7.12

Let \(\sqrt{R}\le r\le R'_{1}\le R'_{2}\le R\). Let \(\tau \) be a frequency cap of side length \(r^{-1/2}\). Let \(t_{1}, t_{2}\in [0, \lambda ]\) be given with \(|t_{1}-t_{2}|=R\). Assume that we are given two \((m-1)\)-dimensional algebraic varieties \(Z_{1}\subset \{t=t_{1}\}\) and \(Z_{2}\subset \{t=t_{2}\}\) satisfying that the angle formed by \(T_{{\mathbf {z}}_{i}}(Z_{i})\) and the space spanned by \(\{\vec {e}_{1}, \dots , \vec {e}_{m-1}\}\) is \(\le 1/(100n)\) for every \(i=1, 2\) and every \({\mathbf {z}}_{i}\in Z_{i}\). Here \(T_{{\mathbf {z}}_{i}}(Z_{i})\) refers to the tangent space. Denote

$$ \Omega _{1}=\mathcal {N}_{\sqrt{R'_{1}}}(Z_{1})\cap B_{\sqrt{R}}, $$
(7.85)

for a given ball \(B_{\sqrt{R}}\) of radius \(\sqrt{R}\); denote

$$ \Omega _{2}=\mathcal {N}_{R/\sqrt{R'_{2}}}(Z_{2})\cap B_{R/\sqrt{r}}. $$
(7.86)

Assume that \(F(\cdot , t_{1})\) is essentially supported on \(B_{\sqrt{R}}\) and that \(F(\cdot , t_{2})\) is essentially supported on \(\Omega _{2}\). Then

$$ \big\| F(x, t_{1}) \big\| _{L^{2}(\Omega _{1})}^{2} \lesssim \Big( \frac{R'_{1}}{R'_{2}} \Big) ^{\frac{n-m}{2}-O(\delta )} \big\| F(x, t_{2}) \big\| _{L^{2}(\Omega _{2})}^{2}. $$
(7.87)

Proof of Lemma 7.12

Let \(x_{0}\) be the center of \(B_{\sqrt{R}}\) and \(\xi _{0}\) the center of \(\tau \). We apply the same change of variables as in (7.54). Recall the new phase function in (7.57). If we denote

$$ F(x, t):=\int f_{\tau}(\xi ) e^{i \phi _{0}^{\lambda}(x, t; \xi )}a^{ \lambda}({\mathbf {x}}; \xi ) d\xi , $$
(7.88)

then to prove the lemma, it suffices to prove it for the function \(F(x, t)\) in (7.88) with \(t_{1}=0\), \(x_{0}=0\) and \(\xi _{0}=0\). Before finishing this reduction step, let us perform another linear change of variables so that \(\nabla _{x}\nabla _{\xi} \phi ^{\lambda}_{0}(0; 0)\) is the identity matrix.

We claim that \(\check {f}_{\tau}\) is essentially supported on \(2B_{\sqrt{R}}\). By the same Taylor expansion as in (7.60)–(7.62), we can write

$$ \begin{aligned} \check {f}_{\tau}(x)& =\int f_{\tau}(\xi ) e^{ix \cdot \xi}d\xi =\int f_{\tau}(\xi ) e^{i\phi _{0}^{\lambda}(x, 0; \xi )} e^{-iw(x; \xi )}d\xi \\ &=\sum _{k\in \mathbb{N}} \frac{(-i)^{k}}{k!} \int f_{\tau}(\xi ) e^{i\phi _{0}^{ \lambda}(x, 0; \xi )} w^{k}(x; \xi ) d\xi . \end{aligned} $$
(7.89)

Next, we do Fourier expansion for \(w^{k}(x; \xi ) a^{\lambda}(x, 0; \xi )\) in the \(\xi \) variable at the unit scale, and write it as

$$ \sum _{\beta \in \mathbb{N}^{n-1}} c_{k, \beta}(x) e^{i\beta \cdot \xi}, $$
(7.90)

where the coefficients satisfy

$$ |c_{k, \beta}(x)| \lesssim _{N} 2^{k} |\beta |^{-N}, $$
(7.91)

for every large \(N\), uniformly in \(x\). Now the claim that \(\check {f}_{\tau}\) is essentially supported on \(2B_{\sqrt{R}}\) follows from the assumption on \(F\) and the rapid decay in (7.91). Moreover, by a similar Taylor expansion, we obtain

$$ \big\| F(x, t_{1}) \big\| _{L^{2}(\Omega _{1})}^{2} \lesssim \big\| \check {f}_{\tau} \big\| _{L^{2}(\Omega _{1})}^{2}. $$
(7.92)
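The decay (7.91) used above is the standard integration-by-parts bound for Fourier coefficients; here is a hedged sketch, assuming (as in the Taylor expansion (7.60)–(7.62)) that \(w\) and its \(\xi \)-derivatives are \(O(1)\) on the support of \(a^{\lambda}\):

$$ c_{k, \beta}(x)=\int w^{k}(x; \xi )\, a^{\lambda}(x, 0; \xi )\, e^{-i\beta \cdot \xi}\,d\xi , \qquad |c_{k, \beta}(x)|\lesssim _{N} |\beta |^{-N} \sup _{\xi} \big|\nabla _{\xi}^{N} \big( w^{k} a^{\lambda}\big)\big| \lesssim _{N} k^{N} |\beta |^{-N}\lesssim _{N} 2^{k}|\beta |^{-N}, $$

since each \(\xi \)-derivative falling on \(w^{k}\) produces at most a factor of \(k\), and \(k^{N}\lesssim _{N} 2^{k}\).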

It therefore remains to control \(\big\| \check {f}_{\tau} \big\| _{L^{2}(\Omega _{1})}^{2}\) by \(\big\| F(x, t_{2}) \big\| _{L^{2}(\Omega _{2})}^{2}\).

Consider \(t=t_{2}\), and write

$$ \begin{aligned} F(x, t_{2}) &= \int f_{\tau}(\xi ) e^{i\phi _{0}^{\lambda}(x, t_{2}; \xi )} a^{\lambda}(x, t_{2}; \xi ) d\xi \\ & =\int \check {f}_{\tau}(y) \big( \int e^{-iy\cdot \xi} e^{i\phi _{0}^{ \lambda}(x, t_{2}; \xi )}a^{\lambda}(x, t_{2}; \xi ) d\xi \big) dy. \end{aligned} $$
(7.93)

Consider the critical point of the phase function:

$$ \nabla _{\xi }\phi _{0}^{\lambda} (x, t_{2}; \xi )=y. $$
(7.94)

Let \(\xi _{c}=\xi _{c}(x, t_{2}; y)\) denote the critical point. Note that by (7.57) and the assumption that \(\phi ^{\lambda}\) is in its normal form, we see that

$$ \nabla _{\xi}^{2} \phi _{0}^{\lambda}(x, t_{2}; \xi )=t_{2}\cdot I_{(n-1) \times (n-1)}+ \text{small perturbation} $$
(7.95)

where \(I_{(n-1)\times (n-1)}\) is the identity matrix of order \((n-1)\). Denote

$$ \psi _{t_{2}}^{\lambda}(x, y) = \phi _{0}^{\lambda}(x, t_{2}; \xi _{c}) -y\cdot \xi _{c}. $$
(7.96)

Then by the stationary phase principle (see for instance Sogge [19, Theorem 1.2.1]), we obtain

$$ F(x, t_{2}) = t_{2}^{-\frac{n-1}{2}}\int \check {f}_{\tau}(y) e^{i \psi _{t_{2}}^{\lambda}(x, y)} a_{t_{2}}^{\lambda}(x, y) dy $$
(7.97)

where

$$ a_{t_{2}}^{\lambda}(x, y):=a_{t_{2}}(\frac{x}{\lambda}, \frac{y}{\lambda}), $$
(7.98)

and \(a_{t_{2}}\) is a compactly supported smooth function in both variables. To continue, we apply Taylor expansion of \(\psi _{t_{2}}^{\lambda}(x, y) \) in \(y\). Take \(\nabla _{y}\) on both sides of (7.96):

$$ \nabla _{y} \psi _{t_{2}}^{\lambda}(x, y)= \nabla _{\xi} \phi _{0}^{ \lambda}(x, t_{2}; \xi _{c}) \cdot \nabla _{y} \xi _{c} -\xi _{c} -y \cdot \nabla _{y} \xi _{c}=-\xi _{c}. $$
(7.99)

Hence

$$ \nabla _{y}^{2} \psi _{t_{2}}^{\lambda} =-\nabla _{y} \xi _{c}. $$

We claim that

$$ |\nabla _{y} \xi _{c}|\lesssim \frac{1}{t_{2}}. $$
(7.100)

Here by \(|\cdot |\) of the matrix \(\nabla _{y}\xi _{c}\), we mean the maximum of the absolute values of its entries. To see this, we go back to the definition of \(\xi _{c}\) in (7.94) and differentiate both sides in \(y\):

$$ \nabla _{\xi}^{2} \phi _{0}^{\lambda}(x, t_{2}; \xi _{c}) \nabla _{y} \xi _{c} = I_{(n-1)\times (n-1)}. $$
(7.101)
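A hedged way to make the deduction of (7.100) from this identity explicit: by (7.95) we may write \(\nabla _{\xi}^{2}\phi _{0}^{\lambda}(x, t_{2}; \xi _{c})=t_{2}(I+E)\) with \(\|E\|\le 1/2\), say, and inverting via a Neumann series gives

$$ \nabla _{y}\xi _{c}=\big(\nabla _{\xi}^{2}\phi _{0}^{\lambda}(x, t_{2}; \xi _{c})\big)^{-1} =\frac{1}{t_{2}}\sum _{j\ge 0}(-E)^{j}, \qquad \text{so } |\nabla _{y}\xi _{c}|\lesssim \frac{1}{t_{2}}. $$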

The claim now follows from (7.95). Moreover, taking \(\nabla _{x}\) on both sides of \(\nabla _{\xi} \phi _{0}^{\lambda}(x, t_{2}; \xi _{c}) =y\), we obtain

$$ \nabla _{x} \nabla _{\xi} \phi _{0}^{\lambda} + \nabla _{\xi}^{2} \phi _{0}^{\lambda} \cdot \nabla _{x} \xi _{c} =0. $$
(7.102)

If we keep differentiating both sides of (7.101) and (7.102) in \(x, y\), then we will be able to obtain

$$ |\nabla ^{\alpha}_{x}\nabla ^{\alpha '}_{y} \xi _{c}| \lesssim _{\alpha} t_{2}^{-\alpha} t_{2}^{-\alpha '}, $$
(7.103)

for every \(\alpha , \alpha '\in \mathbb{N}\). We omit the details.

From (7.100) and the fact that \(\check {f}_{\tau}\) is essentially supported on \(B_{\sqrt{R}}\), we see that we can “ignore” the quadratic term in the Taylor expansion of \(\psi ^{\lambda}_{t_{2}}(x, y)\) in the \(y\) variable. Write

$$ F(x, t_{2})= e^{i\psi _{t_{2}}^{\lambda} (x, 0)} \int \check {f}_{ \tau}(y) e^{i y \cdot \nabla _{y} \psi _{t_{2}}^{\lambda}(x, 0)+iw_{3, t_{2}}(x, y)} a^{\lambda}_{t_{2}}(x, y) dy, $$
(7.104)

for some error function \(w_{3, t_{2}}(x, y)\). Next, we Taylor expand \(\nabla _{y} \psi ^{\lambda}_{t_{2}}(x, 0)\) in the \(x\) variable. Recall our assumption that \(F(x, t_{2})\) is essentially supported on a ball of radius \(R/\sqrt{r}\); abusing notation slightly, let \(x_{0}\) denote its center. Write

$$ \begin{aligned} y\cdot \nabla _{y} \psi ^{\lambda}_{t_{2}}(x, 0) & = y \cdot \nabla _{y} \psi ^{\lambda}_{t_{2}}(x_{0}, 0)+ y\cdot \nabla _{x} \nabla _{y} \psi ^{\lambda}_{t_{2}}(x_{0}, 0)x+w_{4, t_{2}}(x, y), \end{aligned} $$
(7.105)

for some error function \(w_{4, t_{2}}(x, y)\). In particular, since \(|y|\lesssim \sqrt{R}\) on the essential support of \(\check {f}_{\tau}\) and \(r\ge \sqrt{R}\),

$$ |w_{4, t_{2}}(x, y)|\lesssim \frac{1}{R^{2}} \Big( \frac{R}{\sqrt{r}} \Big) ^{2} |y|\lesssim 1. $$
(7.106)

Next, by (7.99), (7.102) and (7.95), we see that \(\nabla _{x}\nabla _{y} \psi ^{\lambda}_{t_{2}}(x_{0}, 0)\) is a small perturbation of \(\frac{1}{t_{2}}I_{(n-1)\times (n-1)}\). The desired bound

$$ \big\| \check {f}_{\tau} \big\| _{L^{2}(\Omega _{1})}^{2} \lesssim \Big( \frac{R'_{1}}{R'_{2}} \Big) ^{\frac{n-m}{2}-O(\delta )} \big\| F(x, t_{2}) \big\| _{L^{2}(\Omega _{2})}^{2} $$
(7.107)

now follows from Taylor’s expansion and Lemma 1.10. □

At the end of this section, we prove Claim 7.9. As mentioned above, the main idea has already appeared in Lemma 1.10 and the proof of Lemma 7.12. The extra work is to take care of the refinement process of wave packets in the polynomial partitioning algorithm. In other words, each time the algorithm encounters an algebraic dominating case, we need to remove certain wave packets, and therefore the input function \(f^{\nsim}_{\kappa , S_{\Box}, \tau , \ell '}\) also changes as the algorithm proceeds.

Proof of Claim 7.9

To simplify notation, let us write

$$ f_{R}:=f^{\nsim}_{\kappa , S_{\Box}, \tau , \ell '}. $$
(7.108)

Here we use \(R\) to emphasize that the function \(f_{R}\) is built on wave packets from \(\mathbb{T}[B_{R}]\). Our goal is to prove

$$ \|T^{\lambda} (f_{R})^{*}_{\iota , S}\|_{L^{2}( (B(x_{1}, 2r) \times \{t_{1}\}) \cap S_{\Box})}^{2} \lessapprox \Big(\frac{r}{R}\Big)^{ \frac{n-n'}{2}} \|T^{\lambda}f_{R} \|_{L^{2}(\{ t=t_{2}\})}^{2}. $$
(7.109)

We apply Lemma 7.1 and find an algebraic variety \(Z_{1}\subset \{t=t_{1}\}\) satisfying that the angle between \(T_{{\mathbf {z}}_{1}}(Z_{1})\) and \(\mathrm{span}\{\vec{e}_{1}, \dots , \vec{e}_{n'-1}\}\) is \(\le 1/(100n)\) for every \({\mathbf {z}}_{1}\in Z_{1}\), such that

$$ (B(x_{1}, 2r)\times \{t_{1}\})\cap S_{\Box}\subset \mathcal {N}_{\sqrt{r}}(Z_{1}) \cap B_{\sqrt{R}}\cap \{t=t_{1}\}=:\Omega _{1}, $$
(7.110)

for some ball \(B_{\sqrt{R}}\) of radius \(\sqrt{R}\). Moreover, by Lemma 7.2 and the definition of brooms, we can find an algebraic variety \(Z_{2}\subset \{t=t_{1}+R\}\) satisfying that the angle between \(T_{{\mathbf {z}}_{2}}(Z_{2})\) and \(\mathrm{span}\{\vec{e}_{1}, \dots , \vec{e}_{n'-1}\}\) is \(\le 1/(100n)\) for every \({\mathbf {z}}_{2}\in Z_{2}\), such that \(T^{\lambda} f_{R}(\cdot , t_{2})\) is essentially supported on

$$ \mathcal {N}_{\sqrt{R}}(Z_{2})\cap B_{R^{1+\delta}/\sqrt{r}}\cap \{t=t_{2} \}=:\Omega _{2}. $$
(7.111)

Under the above notation, (7.109) can be written as

$$ \|T^{\lambda} (f_{R})^{*}_{\iota , S}\|_{L^{2}(\Omega _{1})}^{2} \lessapprox (\frac{r}{R})^{\frac{n-n'}{2}} \|T^{\lambda}f_{R} \|_{L^{2}( \Omega _{2})}^{2}. $$
(7.112)

To proceed, we need to recall notation and definitions from Sect. 5.4. Let \(\mathfrak {n}_{r}^{*}\in \{\mathfrak {n}^{*}_{0}, \mathfrak {n}^{*}_{1}, \dots \}\) be such that \(S\in \mathfrak {n}_{r}^{*}\). Collect all the ancestors of \(\mathfrak {n}^{*}_{r}\) that are in \(\cup _{j} (\mathfrak {M}_{j}\cup \mathfrak {R}_{j})\), and list them in descending order

$$ \mathfrak {n}^{*}_{R_{1}}, \mathfrak {n}_{R_{2}}^{*}, \dots , \mathfrak {n}^{*}_{R_{W-1}}, $$
(7.113)

where \(R>R_{1}> R_{2}>\cdots > R_{W-1}> r\). Moreover, denote \(R_{0}:=R\), \(\mathfrak {n}^{*}_{R_{0}}:=\mathfrak {n}^{*}_{0}\), \(R_{W}:=r\) and \(\mathfrak {n}^{*}_{R_{W}}:=\mathfrak {n}^{*}_{r}\). Note that we have the trivial bound \(W\le \delta _{n}^{-10}\). Next, find \(S_{R_{w}}\in \mathfrak {n}^{*}_{R_{w}}\) for each \(1\le w\le W\) such that

$$ S=S_{R_{W}}\subset S_{R_{W-1}}\subset \cdots \subset S_{R_{1}}\subset S_{R_{0}}=B_{R}, $$
(7.114)

and each \(S_{R_{w}}\) is contained in a ball \(B_{R_{w}}\) of radius \(R_{w}\).

We will prove Claim 7.9 by applying Lemma 7.12 iteratively. Denote

$$ \mathcal {N}_{\sqrt{R_{w}}}(\Omega _{1})\cap \{t=t_{1}\}=:\Omega ^{(1)}_{R_{w}}, \ \ \mathcal {N}_{R/\sqrt{R_{w}}}(\Omega _{2})\cap \{t=t_{1}+R\}=:\Omega ^{(2)}_{R_{w}}, $$
(7.115)

for every \(w\). Recall that \(T^{\lambda} f_{R_{0}}(\cdot , t_{2})\) is essentially supported on \(\Omega ^{(2)}_{R_{0}}\). By Lemma 7.12 with \(m=n'\), \(R'_{1}=R_{1}\) and \(R'_{2}=R_{0}\), we obtain

$$ \big\| T^{\lambda} f_{R_{0}} \big\| _{ L^{2}( \Omega ^{(1)}_{R_{1}} ) }^{2} \lesssim \Big( \frac{R_{1}}{R_{0}} \Big) ^{\frac{n-n'}{2}-O(\delta )} \big\| T^{\lambda} f_{R_{0}} \big\| _{ L^{2}( \Omega ^{(2)}_{R_{0}} ) }^{2}. $$
(7.116)

On the ball \(B_{R_{1}}\), we have the new wave packet decomposition

$$ f_{R_{0}}=\sum _{T\in \mathbb{T}[B_{R_{1}}]} (f_{R_{0}})_{T}. $$
(7.117)

Let \(\mathbb{T}'[B_{R_{1}}]\) be a subset of \(\mathbb{T}[B_{R_{1}}]\) such that

$$ (f_{R_{0}})^{*}_{\iota , S_{R_{1}}}=\sum _{T\in \mathbb{T}'[B_{R_{1}}]} (f_{R_{0}})_{T}, $$
(7.118)

that is, we throw away certain wave packets, depending on whether we are in the tangential case or the transverse case. Denote

$$ f_{R_{1}}:= \sum _{ \substack{ T\in \mathbb{T}'[B_{R_{1}}]\\ T\cap \Omega ^{(1)}_{R_{1}}\neq \emptyset } } (f_{R_{0}})_{T}. $$
(7.119)

Therefore by \(L^{2}\) orthogonality, we have

$$ \big\| T^{\lambda} f_{R_{1}} \big\| _{ L^{2}(\Omega ^{(1)}_{R_{1}}) } \lesssim \big\| T^{\lambda} f_{R_{0}} \big\| _{ L^{2}(\Omega ^{(1)}_{R_{1}}) }. $$
(7.120)

Note that by the construction of \(f_{R_{1}}\), the function \(T^{\lambda} f_{R_{1}}(\cdot , t_{1})\) is essentially supported on \(\Omega ^{(1)}_{R_{1}}\). Therefore, by (7.53) and \(L^{2}\) bounds for Hörmander’s operator at fixed time (see for instance Hörmander [15]), we have

$$ \big\| T^{\lambda} f_{R_{1}}(x, t_{2}) \big\| _{L^{2}_{x}(\mathbb{R}^{n-1})} \lesssim \big\| f_{R_{1}} \big\| _{2} \lesssim \big\| T^{\lambda} f_{R_{1}} \big\| _{ L^{2}(\Omega ^{(1)}_{R_{1}}) }. $$
(7.121)

The main observation is that \(T^{\lambda} f_{R_{1}}(\cdot , t_{2})\) is essentially supported on \(\Omega ^{(2)}_{R_{1}}\). Once this is proven, we see that we can repeat the above process: By Lemma 7.12 with \(R'_{1}=R_{2}\) and \(R'_{2}=R_{1}\), we obtain

$$ \begin{aligned} \big\| T^{\lambda} f_{R_{1}} \big\| _{ L^{2}( \Omega ^{(1)}_{R_{2}} ) }^{2} & \lesssim \Big( \frac{R_{2}}{R_{1}} \Big) ^{\frac{n-n'}{2}-O(\delta )} \big\| T^{\lambda} f_{R_{1}} \big\| _{ L^{2}( \Omega ^{(2)}_{R_{1}} ) }^{2} \\ & \lesssim \Big( \frac{R_{2}}{R_{0}} \Big) ^{\frac{n-n'}{2}-O(\delta )} \big\| T^{\lambda} f_{R_{0}} \big\| _{ L^{2}( \Omega ^{(2)}_{R_{0}} ) }^{2}. \end{aligned} $$
(7.122)

We define \(f_{R_{2}}\) similarly as above, and then observe that \(T^{\lambda} f_{R_{2}}(\cdot , t_{2})\) is essentially supported on \(\Omega ^{(2)}_{R_{2}}\). This allows us to repeat the above iteration. In the end, we obtain the desired bound (7.112).
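A hedged accounting of why the iteration closes: over the \(W\) steps the ratios telescope,

$$ \prod _{w=0}^{W-1} \Big( \frac{R_{w+1}}{R_{w}} \Big) ^{\frac{n-n'}{2}-O(\delta )} = \Big( \frac{R_{W}}{R_{0}} \Big) ^{\frac{n-n'}{2}-O(\delta )} = \Big( \frac{r}{R} \Big) ^{\frac{n-n'}{2}-O(\delta )}, $$

and the implicit constants, multiplied at most \(W\le \delta _{n}^{-10}\) times, are absorbed into the \(\lessapprox \) in (7.112).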

It remains to prove that \(T^{\lambda} f_{R_{w}}(\cdot , t_{2})\) is essentially supported on \(\Omega ^{(2)}_{R_{w}}\) for every \(w\). We will only prove the case \(w=1\), and the other cases are the same. Recall the definition of \(f_{R_{1}}\) from (7.119). Write

$$ f_{R_{0}}=\sum _{T_{0}\in \mathbb{T}'[B_{R_{0}}]} f_{T_{0}}, $$
(7.123)

for some \(\mathbb{T}'[B_{R_{0}}]\subset \mathbb{T}[B_{R_{0}}]\). Denote

$$ f_{T_{0}, R_{1}}:= \sum _{ \substack{ T\in \mathbb{T}'[B_{R_{1}}]\\ T\cap \Omega ^{(1)}_{R_{1}}\neq \emptyset } } (f_{T_{0}})_{T}. $$
(7.124)

It suffices to prove that for each \(T_{0}\), we have

$$ \mathrm{supp}\, T^{\lambda} f_{T_{0}, R_{1}}(\cdot , t_{2}) \subset \mathcal {N}_{R/ \sqrt{R_{1}}}\big( \mathrm{supp}\, T^{\lambda} f_{T_{0}}(\cdot , t_{2}) \big), $$
(7.125)

where by \(\mathcal {N}\) we mean neighborhood in \((n-1)\) dimensions. Note that \(f_{T_{0}, R_{1}}\) consists of wave packets of frequency scale \(R_{1}^{-1/2}\). In order to see where \(T^{\lambda} f_{T_{0}, R_{1}}(\cdot , t_{2})\) is supported, we do a wave packet decomposition for \(f_{T_{0}, R_{1}}\) by using wave packets of frequency scale \(R_{0}^{-1/2}\):

$$ f_{T_{0}, R_{1}}=\sum _{T'_{0}\in \mathbb{T}[B_{R_{0}}]} (f_{T_{0}, R_{1}})_{T'_{0}}. $$
(7.126)

In order for \(T'_{0}\in \mathbb{T}[B_{R_{0}}]\) to have non-trivial contribution, we need that \(T_{0}\cap T'_{0}\neq \emptyset \) and \(\mathrm{dist}(\theta (T_{0}), \theta (T'_{0}))\lesssim R_{1}^{-1/2}\). Under these two conditions, by the same Taylor expansion argument as in Lemma 7.2, (7.125) follows immediately. □

8 Bushes: small grains

Let \(\vec {S}\) be a grain with its last component given by \((S, B({\mathbf {x}}_{0}, r))\). In the previous section, we considered the case \(r\ge \sqrt{R}\). In this section, we will consider the case \(r\le \sqrt{R}\). Since the grain \(S\) is small (in particular, smaller than the scale \(\sqrt{R}\) of a wave packet \(T\in \mathbb{T}[B_{R}]\)), we will see that this case is much easier to handle.

The goal of this section is to prove the following result.

Theorem 8.1

Let \(\vec {S}_{n'}=(S_{n}, \dots , S_{n'})\) be a multi-grain with \(S_{n'}=\mathcal {N}_{r_{n'}^{1/2+\delta _{n'}}}(Z_{n'})\cap B_{r_{n'}}\), \(r_{n'}\le \sqrt{R}\) and \(Z_{n'}\) an \(n'\)-dimensional algebraic variety of degree \(\lesssim _{n'} 1\) in \(\mathbb{R}^{n}\). Then for frequency caps \(\tau \) of side length \(r_{n'}^{-1/2}\) and \(n'\le n''\le n\), we have

$$ \|f^{\nsim (n'')}_{S_{n'}, \tau}\|_{L^{2}}^{2} \lessapprox \Big( \frac{r_{n''}}{R}\Big)^{\frac{n-n'}{2}} \|f_{\tau}\|_{L^{2}}^{2}, $$
(8.1)

where \(f^{\nsim (n'')}_{S_{n'}, \tau}\) is defined in (7.42) and (7.43), with the definition of \(\sim \) given in Definition 8.3 below.

In Theorem 8.1, the scale \(r_{n'}\) is so small that we do not see the broom structure; we will replace brooms by bushes.

Definition 8.2

Bushes

Given a grain \((S, B({\mathbf {x}}_{0}, r))\) with \(r\le \sqrt{R}\), a collection of tubes \(\mathbb{T}'[B_{R}]\subset \mathbb{T}[B_{R}]\) and a frequency cap \(\tau \) of side length \(r^{-1/2}\), the collection of tubes

$$ \mathcal {B}(\mathbb{T}'[B_{R}]):=\{T\in \mathbb{T}'[B_{R}]: \theta (T)\subset \tau , T \cap S\neq \emptyset \} $$
(8.2)

is called a bush generated by \(\mathbb{T}'[B_{R}]\). The size \(b\) of the bush \(\mathcal {B}(\mathbb{T}'[B_{R}])\) is defined to be \(R^{w\delta}\), where \(w\in \mathbb{N}\) is such that \(R^{w\delta}\le \#\mathcal {B}(\mathbb{T}'[B_{R}])< R^{(w+1)\delta}\). Often \(\mathcal {B}(\mathbb{T}'[B_{R}])\) will be written as \(\mathcal {B}_{b}(\mathbb{T}'[B_{R}])\). In particular, if \(\mathbb{T}'[B_{R}]=\mathbb{T}[B_{R}]\), then we will simply write \(\mathcal {B}_{b}\) for a bush.

Next, we define a two-ends relation \(\sim _{\mathfrak {n}}\) for nodes \(\mathfrak {n}\in \cup _{\iota} \mathfrak {R}_{\iota}\) with \(\rho (\mathfrak {n})\le \sqrt{R}\). Similarly as above, we start by introducing a few auxiliary functions \(\chi _{\mathfrak {n}, \kappa}=\chi _{\kappa}\), taking values 0 or 1, where \(\kappa =(b_{1}, \mu _{1}, \dots , b_{\ell}, \mu _{\ell})\) and \(b_{\ell '}, \mu _{\ell '}\in \{R^{w\delta}: w\in \mathbb{N}\}\) for every \(1\le \ell '\le \ell \). Denote \(r=\rho (\mathfrak {n})\). For \(S\in \mathfrak {n}\) and a tube \(T\in \mathbb{T}[B_{R}]\), we say that

$$ \chi _{b_{1}}(S, T)=1, $$
(8.3)

if \(T\) belongs to a bush of size \(b_{1}\) rooted at \(S\). Moreover, we say that

$$ \chi _{b_{1}, \mu _{1}}(S, T)=1 $$
(8.4)

if

$$ \chi _{b_{1}}(S, T)=1, \ \ \mu _{1}\le \sum _{S'\in \mathfrak {n}} \chi _{b_{1}}(S', T)< \mu _{1} R^{\delta}. $$
(8.5)

Now let us assume that we have defined \(\chi _{\kappa}\) with \(\kappa =(b_{1}, \mu _{1}, \dots , b_{\ell}, \mu _{\ell})\) already, and we would like to define \(\chi _{\kappa , b_{\ell +1}}\) and \(\chi _{\kappa , b_{\ell +1}, \mu _{\ell +1}}\). For a fixed \(S\in \mathfrak {n}\), define

$$ \mathbb{T}_{S, \kappa}:=\{T'\in \mathbb{T}[B_{R}]: \chi _{\kappa}(S, T')=1\}. $$
(8.6)

We write \(\mathbb{T}_{S, \kappa}\) as a disjoint union of bushes

$$ \bigcup _{b_{\ell +1}} \bigcup _{\tau} \mathcal {B}_{b_{\ell +1}, \tau}(\mathbb{T}_{S, \kappa}) $$
(8.7)

where \(\tau \) runs through all caps of side length \(\rho (\mathfrak {n})^{-1/2}\), and \(\mathcal {B}_{b_{\ell +1}, \tau}(\mathbb{T}_{S, \kappa})\) is a bush of size \(b_{\ell +1}\) with tubes coming from \(\tau \). We say that

$$ \chi _{\kappa , b_{\ell +1}}(S, T)=1, \text{ if } T\in \mathcal {B}_{b_{ \ell +1}, \tau}(\mathbb{T}_{S, \kappa}), $$
(8.8)

for some \(\tau \). Moreover,

$$ \chi _{\kappa , b_{\ell +1}, \mu _{\ell +1}}(S, T)=1, $$
(8.9)

if

$$ \chi _{\kappa , b_{\ell +1}}(S, T)=1, \ \ \mu _{\ell +1}\le \sum _{S' \in \mathfrak {n}} \chi _{\kappa , b_{\ell +1}}(S', T)< \mu _{\ell +1}R^{ \delta}. $$
(8.10)

This finishes the definition of the auxiliary functions.

For \(\kappa =(b_{1}, \mu _{1}, \dots , b_{\ell}, \mu _{\ell})\), we say that \(\kappa \) is admissible if there exists exactly one pair \((\ell _{1}, \ell _{2})\) with \(\ell _{1}\neq \ell _{2}\) such that

$$ (b_{\ell _{1}}, \mu _{\ell _{1}})=(b_{\ell _{2}}, \mu _{\ell _{2}}). $$
(8.11)

Similarly to Lemma 7.4, it is elementary to see that the number of admissible \(\kappa \) is \(O_{\delta}(1)\).

Definition 8.3

For a ball \(B\subset B_{R}\) of radius \(R^{1-\delta}\), a tube \(T\in \mathbb{T}[B_{R}]\), a node \(\mathfrak {n}\in \cup _{\iota} \mathfrak {R}_{\iota}\) with \(\rho (\mathfrak {n})\le \sqrt{R}\) and an admissible \(\kappa \), we say that \(T\sim _{\mathfrak {n}, \kappa} B\) if \(B\) maximizes

$$ \#\{S'\in \mathfrak {n}: S'\subset B', \chi _{\mathfrak {n}, \kappa}(S', T)=1\}, $$
(8.12)

among all \(B'\) of radius \(R^{1-\delta}\). Moreover, we say that \(T\sim _{\mathfrak {n}} B\) if \(T\sim _{\mathfrak {n}, \kappa} B\) for some admissible \(\kappa \); we say that \(T\sim B\) if \(T\sim _{\mathfrak {n}} B\) for some node \(\mathfrak {n}\in \cup _{\iota} \mathfrak {R}_{\iota}\).

Now we are ready to prove Theorem 8.1.

Proof of Theorem 8.1

Similarly to what we did in the proof of Theorem 7.8, we will write down more details for the case \(n''=n'\), and the case \(n''>n'\) is essentially the same.

We abbreviate \(S_{n'}\) to \(S\), \(\vec{S}_{n'}\) to \(\vec{S}\) and \(r_{n'}\) to \(r\). Let \(B_{\iota}\) be the ball of radius \(R^{1-\delta}\) containing \(S\); let \(\mathfrak {n}\) be the node such that \(S\in \mathfrak {n}\). For an admissible multi-index \(\kappa \) of the form \((b_{1}, \mu _{1}, \dots , b_{\ell}, \mu _{\ell})\), denote

$$ \mathbb{T}_{S, \kappa , \tau}:=\{T\in \mathbb{T}_{S, \kappa}: \theta (T)\subset \tau \}, $$
(8.13)
$$ \mathbb{T}^{\nsim}_{S, \kappa , \tau}:=\{T\in \mathbb{T}_{S, \kappa}: \theta (T) \subset \tau , B_{\iota}\nsim _{\mathfrak {n}, \kappa} T\}, $$
(8.14)

and

$$ f^{\nsim}_{\kappa , S, \tau}:= \sum _{T\in \mathbb{T}^{\nsim}_{S, \kappa ,\tau}}f_{T} \quad \text{and} \quad f^{\nsim , *}_{\kappa , S, \tau} := (f^{\nsim}_{\kappa , S, \tau} )^{*}_{\iota , S}. $$
(8.15)

Then similarly to (7.47), we have

$$ f^{\nsim}_{S, \tau}=\sum _{\kappa} f^{\nsim , *}_{\kappa , S, \tau}. $$
(8.16)

Next, similarly to (7.48) and (7.52), we have

$$ \big\| f^{\nsim}_{S, \tau} \big\| _{2}^{2} \lesssim _{\delta} \sum _{\kappa} \big\| T^{\lambda} f^{\nsim , *}_{\kappa , S, \tau} \big\| _{L^{2} ( \{t=t_{1} \} \cap S ) }^{2}, $$
(8.17)

where \((x_{1}, t_{1})\) is a point in \(S\). By the definition of \(\kappa \), \(\mathbb{T}_{S, \kappa , \tau}\) is contained in a bush \(\mathcal {B}_{b_{\ell}}\). Write

$$ f^{\nsim , *}_{\kappa , S, \tau}= \sum _{ T\in \mathbb{T}^{\nsim}_{S, \kappa , \tau}\cap \mathcal {B}_{b_{\ell}} } (f_{T})^{*}_{\iota , S}. $$
(8.18)

By the Cauchy-Schwarz inequality, we have

$$ \big\| T^{\lambda} f^{\nsim , *}_{\kappa , S, \tau} \big\| _{L^{2} ( \{t=t_{1} \} \cap S ) }^{2} \lesssim b_{\ell} \sum _{ T\in \mathbb{T}^{\nsim}_{S, \kappa , \tau}\cap \mathcal {B}_{b_{\ell}} } \big\| T^{\lambda} (f_{T})^{*}_{\iota , S} \big\| _{L^{2}(\{t=t_{1}\}\cap S)}^{2}. $$
(8.19)

It is elementary to see that

$$ \big\| T^{\lambda} (f_{T})^{*}_{\iota , S} \big\| _{L^{2}(\{t=t_{1}\}\cap S)}^{2} \lesssim \Big( \frac{r}{R} \Big) ^{\frac{n-n'}{2}} \big\| f_{T} \big\| _{L^{2}}^{2}. $$
(8.20)

By summing over \(T\), we obtain

$$ \big\| T^{\lambda} f^{\nsim , *}_{\kappa , S, \tau} \big\| _{L^{2} ( \{t=t_{1} \} \cap S ) }^{2} \lesssim b_{\ell} \Big( \frac{r}{R} \Big) ^{\frac{n-n'}{2}} \big\| f^{\nsim}_{\kappa , S, \tau} \big\| _{L^{2}}^{2}. $$
(8.21)

In the end, we just need to show that

$$ | \mathbb{T}^{\nsim}_{S, \kappa , \tau} |\lesssim b_{\ell}^{-1} R^{O(n \delta )} |\mathbb{T}_{\tau}|, $$
(8.22)

which is an analogue of Lemma 7.10. As the proof of (8.22) is also more or less the same as that of Lemma 7.10 (indeed simpler), we leave it out. □

9 Finishing the proof of Theorem 4.2

In the last section, we combine the polynomial Wolff axiom in Lemma 6.1 and Corollary 6.5, Properties 1-4 in Sect. 5.4, the broom estimate in Theorem 7.8 and the bush estimate in Theorem 8.1 to finish the proof of Theorem 4.2.

First of all, by Property 1 and repeated applications of Property 2 in Sect. 5.4, in the same way as equations (56) and (57) are obtained in [12, page 269], we obtain

$$\begin{aligned} \|T^{\lambda} g\|_{\mathrm{BL}_{k, A}^{p}\left (B_{R}\right )} & \lessapprox \prod _{i=m-1}^{n-1} r_{i}^{ \frac{\beta _{i+1}-\beta _{i}}{2}} D_{i}^{\frac{\beta _{i+1}}{2}- \left (\frac{1}{2}-\frac{1}{p_{n}}\right )} \end{aligned}$$
(9.1)
$$\begin{aligned} & \|g\|_{2}^{\frac{2}{p_{n}}} \max _{O \in \mathfrak {n}_{\ell _{0}}^{*}} \left \|g_{\iota , O}\right \|_{2}^{1-\frac{2}{p_{n}}}, \end{aligned}$$
(9.2)

where \(\iota \) refers to \(B_{\iota}\subset B_{R}\), the ball of radius \(R^{1-\delta}\) containing \(O\), the parameter \(m\) comes from (5.86) and \(\mathfrak {n}_{\ell _{0}}^{*}\) is as in (5.84). Next, by repeated applications of Property 3, we obtain

$$\begin{aligned} \max _{O \in \mathfrak {n}_{\ell _{0}}^{*} } \left \|g_{\iota , O}\right \|_{2}^{2} \lessapprox r_{n'}^{-\frac{n-n'}{2}} \prod _{i=m-1}^{n'-1} r_{i}^{-1 / 2} D_{i}^{-i+\delta} \max _{ S_{n'}\in \mathfrak {S}_{n'} } \left \|g^{*}_{ \iota , S_{n'}}\right \|_{2}^{2}. \end{aligned}$$
(9.3)

By Property 4,

$$ \big\| g^{*}_{\iota , S_{n'}} \big\| _{2}^{2} \lessapprox r_{n'}^{\frac{n-n'}{2}} \Big( \prod _{i=n'}^{n-1} r_{i}^{-\frac{1}{2}} \Big) r_{n''}^{-\frac{n-n''}{2}} \Big( \prod _{i=n''}^{n-1} r_{i}^{\frac{1}{2}} \Big) R^{O(\epsilon _{\circ})} \big\| g_{\iota , S_{n'}}^{*(n'')} \big\| _{2}^{2}. $$
(9.4)

We then apply Corollary 6.5 and obtain

$$ \Big\| g_{\iota , S_{n'}}^{*(n'')} \Big\| _{2}^{2} \lessapprox \Big( \prod _{j=n'}^{n''} r_{j}^{-\frac{1}{2}} \Big) r_{n''}^{-\frac{n-n''-1}{2}} \max _{\tau : \ell (\tau )=r_{n''}^{-1/2}} \Big\| g_{\iota , S_{n'}}^{*(n'')} \Big\| _{ L^{2}_{\mathrm{avg}}(\tau ) }^{2}. $$
(9.5)

Recall the notation (7.42) and (7.43). By the broom estimate in Theorem 7.8 and the bush estimate in Theorem 8.1, we have

$$ \Big\| g_{\iota , S_{n'}}^{*(n'')} \Big\| _{ L^{2}_{\mathrm{avg}}(\tau ) }^{2} \lessapprox \Big( \frac{R}{r_{n''}} \Big) ^{-\frac{n-n'}{2}} \big\| g \big\| _{\infty}^{2}. $$
(9.6)

Putting everything together, we obtain

$$\begin{aligned} \max _{O} \big\| g_{\iota , O} \big\| _{2}^{2} & \lessapprox R^{O(\epsilon _{\circ})} \Big( \prod _{i=m-1}^{n'-1} D_{i}^{-i} \Big) \Big( \prod _{j=m-1}^{n'-1} r_{j}^{\frac{1}{2}} \Big) \Big( \prod _{i=m-1}^{n-1} r_{i}^{-1} \Big) \end{aligned}$$
(9.7)
$$\begin{aligned} & r_{n''}^{-(n-n''-1)} \Big( \prod _{i=n''+1}^{n-1} r_{i} \Big) \Big( \frac{R}{r_{n''}} \Big) ^{-\frac{n-n'}{2}} \big\| g \big\| _{\infty}^{2}. \end{aligned}$$
(9.8)

We pick \(n''\) such that

$$ n-n''=\Big\lfloor \frac{n-n'}{3}\Big\rfloor +1, $$
(9.9)

bound \(r_{i}\) by \(R\) and obtain

$$\begin{aligned} \max _{O} \big\| g_{\iota , O} \big\| _{2}^{2} \lessapprox & \Big( \prod _{i=m-1}^{n-1} r_{i}^{-1} \Big) \Big( \prod _{i=m-1}^{n'-1} r_{i}^{1/2} \Big) \Big( \prod _{i=m-1}^{n'-1} D_{i}^{-i} \Big) \Big( \frac{R}{r_{n''}} \Big) ^{-\frac{n-n'}{6}} \big\| g \big\| _{\infty}^{2}. \end{aligned}$$
(9.10)

Note that when

$$ n'\le n/100+99m/100=:W_{m}^{n}, $$
(9.11)

it holds that

$$ n''\le \frac{2n}{3}+\frac{n'}{3}\le \frac{67n}{100}+\frac{33m}{100} =:M, $$
(9.12)
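For the reader's convenience, here is the short arithmetic behind the last display, under the choice (9.9) and the constraint (9.11):

$$ n''=n-\Big\lfloor \frac{n-n'}{3}\Big\rfloor -1 \le n-\frac{n-n'}{3} = \frac{2n}{3}+\frac{n'}{3} \le \frac{2n}{3}+\frac{1}{3}\Big(\frac{n}{100}+\frac{99m}{100}\Big) = \frac{67n}{100}+\frac{33m}{100}. $$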

and therefore we have

$$\begin{aligned} \max _{O} \big\| g_{\iota , O} \big\| _{2}^{2} \lessapprox & \Big( \prod _{i=m-1}^{n-1} r_{i}^{-1} \Big) \Big( \prod _{i=m-1}^{n'-1} r_{i}^{1/2} \Big) \Big( \prod _{i=m-1}^{n'-1} D_{i}^{-i} \Big) \end{aligned}$$
(9.13)
$$\begin{aligned} & \times \textstyle\begin{cases} \Big( \frac{R}{r_{M}} \Big) ^{-\frac{33(n-m)}{200}} \big\| g \big\| _{\infty}^{2} & \text{ if } n'\le W_{m}^{n}, \\ \big\| g \big\| _{\infty}^{2}, & \text{otherwise}. \end{cases}\displaystyle \end{aligned}$$
(9.14)

By taking a weighted geometric average in \(n'\in \{m, m+1, \dots , n-1\}\), and substituting into (9.1), we obtain

$$ \big\| T^{\lambda}g \big\| _{ \mathrm {BL}_{k, A}^{p}(B_{R}) }\lessapprox R^{-\Lambda} \Big( \prod _{i=m-1}^{n-1} r_{i}^{X_{i}} D_{i}^{Y_{i}} \Big) \big\| g \big\| _{2}^{\frac{2}{p}} \big\| g \big\| _{\infty}^{1-\frac{2}{p}}, $$
(9.15)

where

$$\begin{aligned} & \Lambda = \Big( \sum _{j=m}^{W_{m}^{n}} \gamma _{j} \Big) \Big( \frac{1}{2}-\frac{1}{p} \Big) \frac{33(n-m)}{200}, \end{aligned}$$
(9.16)
$$\begin{aligned} & X'_{i}=\frac{\beta _{i+1}-\beta _{i}}{2}- \Big( \frac{1}{2}-\frac{1}{p} \Big) +\frac{1}{2} \Big( 1-\sum _{j=m}^{i} \gamma _{j} \Big) \Big( \frac{1}{2}-\frac{1}{p} \Big) , \end{aligned}$$
(9.17)
$$\begin{aligned} & X_{i}=X'_{i} + \textstyle\begin{cases} 0 \text{ if } i\neq M, \\ \Lambda \text{ if } i=M, \end{cases}\displaystyle \end{aligned}$$
(9.18)
$$\begin{aligned} & Y_{i}=\frac{\beta _{i+1}}{2}- \Big( 1+i(1-\sum _{j=m}^{i} \gamma _{j}) \Big( \frac{1}{2}-\frac{1}{p} \Big) \Big) \end{aligned}$$
(9.19)

and

$$\begin{aligned} & \gamma _{m-1}:=0, \ 0\le \gamma _{m}, \dots , \gamma _{n}\le 1, \ \gamma _{m}+\cdots +\gamma _{n}=1, \end{aligned}$$
(9.20)
$$\begin{aligned} & r_{m-1}=1, \ D_{n}=1, \ \beta _{n}=1, \ \beta _{n}\ge \beta _{n-1} \ge \cdots \ge \beta _{m}. \end{aligned}$$
(9.21)

Lemma 5.10 suggests that we write the coefficients on the right hand side of (9.15) as

$$\begin{aligned} R^{-\Lambda} (D_{m-1})^{ \frac{\beta _{m}}{2}-m(\frac{1}{2}- \frac{1}{p}) } \Big( \prod _{i=m}^{n-1} D_{i}^{ Y_{i}- (\sum _{j=m}^{i} X_{j}) } \Big) \prod _{i=m}^{n-1} \Big(r_{i} \prod _{j=i}^{n-1} D_{j}\Big)^{X_{i}}. \end{aligned}$$
(9.22)

We pick \(\gamma _{i}\) and \(\beta _{i}\) so that

$$\begin{aligned} & Y_{n-1}-X_{n-1}-\cdots -X_{m}=\cdots =Y_{m}-X_{m}= \frac{\beta _{m}}{2}-m(\frac{1}{2}-\frac{1}{p})=0, \end{aligned}$$
(9.23)
$$\begin{aligned} & X_{M}=\Lambda , \ \ X_{i}=0, \text{ when } i\neq M. \end{aligned}$$
(9.24)

One can check directly that if we set \(p_{m}=2m/(m-1)\),

$$ \gamma '_{i}:=\frac{1}{2(i-1)} \prod _{j=m}^{i} \frac{2(j-1)}{2j+1}, \forall m\le i\le n-1, $$
(9.25)

and let \(\gamma _{i}\) be given by the solution to the system

$$ \begin{aligned} & \gamma _{i}=\gamma '_{i}, \ \forall m\le i\le M-1, \\ & (M+\frac{1}{2})(\gamma _{M}-\gamma '_{M})=\Gamma _{m}^{n} \frac{33(n-m)}{200}, \text{ with } \Gamma _{m}^{n}:=\sum _{j=m}^{W_{m}^{n}} \gamma _{j}, \\ & (M+\frac{2i+1}{2})(\gamma _{M+i}-\gamma '_{M+i})+\frac{3}{2}\sum _{j=M}^{M+i-1} (\gamma _{j}-\gamma '_{j})=0, \ \ \forall 1\le i\le n-M-1, \end{aligned} $$
(9.26)

then (9.23) is satisfied.

Claim 9.1

Let \(\gamma _{i}\) be given as above. Then \(\gamma _{i}\ge 0\) for every \(m\le i\le n-1\) and

$$ \sum _{i=m}^{n-1} \gamma _{i}\le 1. $$
(9.27)

Proof of Claim 9.1

By taking logarithms and applying Taylor's expansion, we see that

$$ \sum _{i=m}^{n-1} \gamma '_{i}\le 1/3. $$
(9.28)
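To indicate where (9.28) comes from, here is a heuristic version of the logarithm/Taylor computation (the constants below are approximations, not sharp values):

```latex
% Consecutive terms of (9.25) satisfy
\begin{align*}
\frac{\gamma'_{i+1}}{\gamma'_{i}}
 =\frac{i-1}{i}\cdot\frac{2i}{2i+3}
 =\frac{2(i-1)}{2i+3}
 =1-\frac{5}{2i+3},
\end{align*}
% so taking logarithms and using \log(1-x)=-x+O(x^{2}),
\begin{align*}
\gamma'_{i}\approx \gamma'_{m}\exp\Big(-\sum_{j=m}^{i-1}\frac{5}{2j}\Big)
 \approx \frac{1}{2m+1}\Big(\frac{m}{i}\Big)^{5/2},
\qquad
\sum_{i\ge m}\gamma'_{i}\approx \frac{1}{2m+1}\cdot\frac{2m}{3}\le \frac{1}{3},
\end{align*}
% where we used \gamma'_{m}=\frac{1}{2m+1} and
% \sum_{i\ge m}(m/i)^{5/2}\approx \int_{m}^{\infty}(m/x)^{5/2}\,dx=\frac{2m}{3}.
```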

Moreover, under the assumption that \(k\ge 2n/5\) in Theorem 4.2, it holds that

$$ \begin{aligned} \gamma '_{i}& =\frac{1}{2(i-1)} \frac{2(m-1) 2m \dots 2(i-1)}{(2m+1)(2m+3)\dots (2i+1)} \\ & \ge \frac{1}{2(i-1)} \frac{2(m-1) 2m}{(2i-1)(2i+1)}\ge \frac{2}{25n}, \end{aligned} $$
(9.29)

and

$$ \gamma _{M}-\gamma '_{M}=(M+1/2)^{-1} \Gamma _{m}^{n} \frac{33(n-m)}{200}\le \frac{1}{24}. $$
(9.30)
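The numerical bound in (9.30) can be checked as follows, assuming \(m\ge 2n/5\) (which we take from the hypothesis \(k\ge 2n/5\)), so that \(n-m\le 3n/5\) and \(M\ge \frac{67n}{100}+\frac{33m}{100}\ge \frac{401n}{500}\), and using \(\Gamma _{m}^{n}\le 1/3\) from (9.28) (valid since \(W_{m}^{n}<M\), so \(\gamma _{j}=\gamma '_{j}\) for all \(j\le W_{m}^{n}\)):

```latex
\begin{align*}
\gamma_{M}-\gamma'_{M}
 &=\Big(M+\frac{1}{2}\Big)^{-1}\,\Gamma_{m}^{n}\,\frac{33(n-m)}{200}\\
 &\le \frac{500}{401\,n}\cdot\frac{1}{3}\cdot\frac{33}{200}\cdot\frac{3n}{5}
  =\frac{500\cdot 99}{401\cdot 3000}
  \approx 0.0412<\frac{1}{24}.
\end{align*}
```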

The claim now follows from checking the system (9.26). □

So far we have picked the values of \(\gamma _{i}\) for \(m\le i\le n-1\), as well as \(\beta _{m}\). By Claim 9.1, we can choose \(\gamma _{n}=1-\gamma _{m}-\cdots -\gamma _{n-1}\). Now we pick \(\beta _{i}\) satisfying (9.21) so that (9.24) is satisfied. An elementary computation shows that

$$\begin{aligned} \frac{1}{2}= \Big( \frac{1}{2}-\frac{1}{p} \Big) \Big( \frac{n+m}{2}+\frac{1}{2} \sum _{j=m}^{n-1} (n-j)\gamma _{j} \Big) . \end{aligned}$$
(9.31)
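For the reader's convenience, (9.31) determines \(p\) explicitly: writing \(S:=\frac{n+m}{2}+\frac{1}{2}\sum_{j=m}^{n-1}(n-j)\gamma _{j}\) (our shorthand), the relation \(\frac{1}{2}=(\frac{1}{2}-\frac{1}{p})S\) rearranges to

```latex
\begin{align*}
\frac{1}{p}=\frac{1}{2}-\frac{1}{2S}=\frac{S-1}{2S},
\qquad
p=\frac{2S}{S-1}=2+\frac{2}{S-1}.
\end{align*}
```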

Therefore, to prove Theorem 4.2, it remains to show that \(p\le p_{n}(k)\), where \(p\) is determined by (9.31). This can be done by using the trick in the appendix of [13], together with elementary but tedious computations, which we omit.