1 Introduction

In this paper, we study square function estimates. We begin with the most general setting. Let \(\Omega \subset \mathbb {R}^n\) be a set in frequency space, and suppose we are given a partition of \(\Omega \) into subsets \(\Sigma =\{\sigma \}\):

$$\begin{aligned} \Omega =\bigsqcup _{\sigma \in \Sigma }\sigma . \end{aligned}$$

We will only consider the case when the \(\sigma \) are morally rectangles. For any function f, we define \(f_\sigma =(\psi _\sigma \widehat{f})^\vee \), where \(\psi _\sigma \) is a smooth bump function adapted to \(\sigma \). We will also assume \(supp \widehat{f}\subset \Omega \) in what follows. The inequality we are interested in is of the following form:

Square Function Estimate:

$$\begin{aligned} \Vert f\Vert _p\le C_{p,\Sigma }\Big \Vert \Big (\sum _{\sigma \in \Sigma }|f_\sigma |^2\Big )^{1/2}\Big \Vert _p. \end{aligned}$$

The goal is to find the best constant \(C_{p,\Sigma }\) that works for all test functions f.

This type of estimate is of great interest in harmonic analysis. We briefly review some well-known results.

When \(\Omega \) is the \(R^{-1}\)-neighborhood of the unit parabola \(\mathcal {P}=\{(\xi ,\xi ^2)\in \mathbb {R}^2: |\xi |\le 1\}\) and \(\Sigma =\{\sigma \}\) is the set of \(\sim R^{-1/2}\times R^{-1}\)-caps that form a partition of \(\Omega \), then an argument of Córdoba–Fefferman (see also [1, Proposition 3.3]) gives

$$\begin{aligned} \Vert f\Vert _4\lesssim \Big \Vert \Big (\sum _{\sigma \in \Sigma }|f_\sigma |^2\Big )^{1/2}\Big \Vert _4. \end{aligned}$$

(Throughout this article, we suppress the \(\sim \) symbol for simplicity when the precise scale is unimportant.)

When \(\Omega \) is the \(R^{-1}\)-neighborhood of the unit cone \(\mathcal {C}=\{(\xi _1,\xi _2,\xi _3)\in \mathbb {R}^3: \xi _3=\sqrt{\xi _1^2+\xi _2^2}, 1/2\le \xi _3\le 1\}\) and \(\Sigma =\{\sigma \}\) are \(1\times R^{-1/2}\times R^{-1}\)-caps that form a partition of \(\Omega \), then the sharp \(L^4\) square function estimate was proved by Guth–Wang–Zhang [6]:

$$\begin{aligned} \Vert f\Vert _4\lessapprox \Vert (\sum _{\sigma \in \Sigma }|f_\sigma |^2)^{1/2}\Vert _4. \end{aligned}$$

Here, \(A\lessapprox B\) means \(A\lesssim _\epsilon R^\epsilon B\) for any \(\epsilon >0\).

When \(\Omega \) is a certain neighborhood of the moment curve, square function estimates were studied by Gressman, Guo, Pierce, Roos and Yung [3]. The sharp \(L^7\) estimate was obtained by Maldague [7]. There are some other related results (see [4, 8]).

In the discussion above, we saw that the caps in the partition of the parabola have size \(R^{-1/2}\times R^{-1}\), and the caps in the partition of the cone have size \(1\times R^{-1/2}\times R^{-1}\). These are usually called the canonical partitions. Besides the canonical partitions of the parabola and cone, Demeter, Guth and Wang [2] introduced “small cap decoupling”, which is the decoupling inequality for a finer partition than the canonical one. Similarly, we can ask about small cap square function estimates.

The goal of this paper is to prove sharp square function estimates for the small caps of the parabola and the cone. We will first define the small caps. Then we will introduce and study examples which give sharp lower bounds for the constants. Finally, we will prove the matching upper bounds for the constants.

1.1 Small Caps

1.1.1 Small Caps for Parabola

Let \(\mathcal {P}:=\{(\xi ,\xi ^2):\xi \in \mathbb {R},|\xi |\le 1\}\) be the unit parabola, and \(N_{R^{-1}}(\mathcal {P})\) be its \(R^{-1}\)-neighborhood. For \(1/2\le \alpha \le 1\), let \(\Gamma _\alpha (R^{-1})\) be the partition of \(N_{R^{-1}}(\mathcal {P})\) into rectangular boxes of dimensions \(R^{-\alpha }\times R^{-1}\). More precisely, each \(\gamma \in \Gamma _\alpha (R^{-1})\) is of the form

$$\begin{aligned} \gamma =(I\times \mathbb {R})\cap N_{R^{-1}}(\mathcal {P}), \end{aligned}$$

where \(I\subset [-1,1]\) is an interval of length \(R^{-\alpha }\). Note that we have \(\#\Gamma _\alpha (R^{-1})\sim R^\alpha \). Our square function estimate is

Theorem 1

For \(supp \widehat{f}\subset N_{R^{-1}}(\mathcal {P})\), we have

$$\begin{aligned} \Vert f\Vert _{L^p(\mathbb {R}^2)}\lessapprox C_{\alpha ,p}(R)\Big \Vert \Big (\sum \limits _{\gamma \in \Gamma _\alpha (R^{-1})}|f_\gamma |^2\Big )^{1/2}\Big \Vert _{L^p(\mathbb {R}^2)}, \end{aligned}$$
(1)

where

$$\begin{aligned} C_{\alpha ,p}(R)= {\left\{ \begin{array}{ll} R^{\alpha (\frac{1}{2}-\frac{2}{p})} &{} p\ge 4\alpha +2,\\ R^{(\alpha -\frac{1}{2})(\frac{1}{2}-\frac{1}{p})}&{} 2\le p\le 4\alpha +2. \end{array}\right. } \end{aligned}$$
(2)

Remark

We remark that \(p\ge 4\alpha +2\) is equivalent to \(\alpha (\frac{1}{2}-\frac{2}{p})\ge (\alpha -\frac{1}{2})(\frac{1}{2}-\frac{1}{p})\). Therefore, (2) is equivalent to (up to constant) \(C_{\alpha ,p}(R)\sim R^{\alpha (\frac{1}{2}-\frac{2}{p})}+ R^{(\alpha -\frac{1}{2})(\frac{1}{2}-\frac{1}{p})}.\)
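As a quick sanity check of this remark (not part of the proof; the function names below are ours), the claimed equivalence can be verified with exact rational arithmetic:

```python
from fractions import Fraction as F

def exp_concentrated(alpha, p):
    # exponent of R in the first branch of (2)
    return alpha * (F(1, 2) - F(2) / p)

def exp_flat(alpha, p):
    # exponent of R in the second branch of (2)
    return (alpha - F(1, 2)) * (F(1, 2) - F(1) / p)

# p >= 4*alpha + 2 should hold exactly when the first exponent dominates
equivalence_holds = all(
    (p >= 4 * alpha + 2) == (exp_concentrated(alpha, p) >= exp_flat(alpha, p))
    for alpha in [F(1, 2), F(2, 3), F(3, 4), F(1)]
    for p in [F(2), F(3), F(4), F(9, 2), F(6), F(10)]
)
```

Since both exponents are rational functions of \(\alpha\) and 1/p, the check with exact fractions leaves no rounding ambiguity.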

1.1.2 Small Caps for Cone

Denote the truncated cone in \(\mathbb {R}^3\) by

$$\begin{aligned}\mathcal {C}:=\{(\xi _1,\xi _2,\xi _3)\in \mathbb {R}^3: \xi _3=\sqrt{\xi _1^2+\xi _2^2}, 1/2\le \xi _3\le 1\}.\end{aligned}$$

For \(1/2\le \beta \le 1\), let \(\Gamma _{\beta }(R^{-1})\) be the partition of \(N_{R^{-1}}(\mathcal {C})\) into caps of dimensions \(1\times R^{-\beta }\times R^{-1}\). More precisely, we first choose a partition of \(\mathbb {S}^1\) into \(R^{-\beta }\)-arcs: \(\mathbb {S}^1=\sqcup \sigma \). For each arc \(\sigma \), consider the \(R^{-1}\)-neighborhood of

$$\begin{aligned} \left\{ (\xi _1,\xi _2,\xi _3)\in \mathcal {C}: \frac{(\xi _1,\xi _2)}{\sqrt{\xi _1^2+\xi _2^2}}\in \sigma \right\} , \end{aligned}$$

which is a cap of dimensions \(1\times R^{-\beta }\times R^{-1}\). \(\Gamma _{\beta }(R^{-1})\) is the set of caps constructed in this way (see Fig. 1). Note that \(\#\Gamma _{\beta }(R^{-1})\sim R^{\beta }\). Our square function estimate is

Fig. 1. Small caps of the cone

Theorem 2

For \(supp \widehat{f}\subset N_{R^{-1}}(\mathcal {C})\), we have

$$\begin{aligned} \Vert f\Vert _{L^p(\mathbb {R}^3)}\lessapprox C_{\beta ,p}(R)\Big \Vert \Big (\sum \limits _{\gamma \in \Gamma _{\beta }(R^{-1})}|f_\gamma |^2\Big )^{1/2}\Big \Vert _{L^p(\mathbb {R}^3)}, \end{aligned}$$
(3)

where

$$\begin{aligned} C_{\beta ,p}(R)= {\left\{ \begin{array}{ll} R^{\frac{\beta }{2}} &{} p\ge 8,\\ R^{\frac{\beta }{2}+\frac{1}{4}-\frac{2}{p}} &{} 4\le p\le 8\\ R^{(\beta -\frac{1}{2})(1-\frac{2}{p})} &{} 2\le p\le 4. \end{array}\right. } \end{aligned}$$
(4)

Remark

We remark that the proof of the square function estimate involves no interpolation argument. This is because we cannot rewrite the square function estimate in the form

$$\begin{aligned} \Vert Tg\Vert _X\lesssim C \Vert g\Vert _Y, \end{aligned}$$

where \(X, Y\) are normed vector spaces and T is a linear operator. Another way to see that interpolation is prohibited is to look at the numerology in (4). We draw the graph of \((\frac{1}{p},\log _R C_{\beta ,p}(R))\), ignoring the \(C_\epsilon R^\epsilon \) factor in \(C_{\beta ,p}(R)\) (see Fig. 2). The critical exponent \(p=8\) corresponds to a concave point \((\frac{1}{8},\frac{\beta }{2})\) in the graph. But if an interpolation argument worked, the graph would be convex, a contradiction. The unavailability of interpolation is the main difficulty in the proof: we need to prove the estimate for all p, not only the critical p. Consider the case \(\beta =1/2\). One critical exponent, \(p=4\), was handled by Guth–Wang–Zhang [6]. The result for the other critical exponent \(p=8\), and hence for \(p\in (4,8)\), is not included in [6]. We also remark that

$$\begin{aligned}C_{\beta ,p}(R)\sim \min \Big \{ R^{\frac{\beta }{2}}, R^{\frac{\beta }{2}+\frac{1}{4}-\frac{2}{p}}+ R^{(\beta -\frac{1}{2})(1-\frac{2}{p})} \Big \}.\end{aligned}$$
Fig. 2. Sharp exponents
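As a numerical sanity check of (4) (an illustrative sketch, not part of the proof; the helper names are ours), one can verify with exact rational arithmetic that the piecewise formula is continuous at the critical exponents \(p=4\) and \(p=8\), and agrees with a single min/max expression for the same exponent:

```python
from fractions import Fraction as F

def logR_C(beta, p):
    # exponent of R in C_{beta,p}(R) from (4), ignoring R^eps losses
    if p >= 8:
        return beta / 2
    if p >= 4:
        return beta / 2 + F(1, 4) - F(2) / p
    return (beta - F(1, 2)) * (1 - F(2) / p)

def logR_C_minmax(beta, p):
    # a single min/max expression for the same exponent
    return min(beta / 2,
               max(beta / 2 + F(1, 4) - F(2) / p,
                   (beta - F(1, 2)) * (1 - F(2) / p)))

betas = [F(1, 2), F(2, 3), F(3, 4), F(1)]
ps = [F(2), F(5, 2), F(3), F(4), F(5), F(6), F(8), F(10), F(16)]
forms_agree = all(logR_C(b, p) == logR_C_minmax(b, p)
                  for b in betas for p in ps)
# continuity at the critical exponents p = 4 and p = 8
continuous = all(
    logR_C(b, F(4)) == (b - F(1, 2)) * (1 - F(1, 2)) and
    logR_C(b, F(8)) == b / 2 + F(1, 4) - F(1, 4)
    for b in betas
)
```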

1.2 Elementary Tools

We briefly introduce the notion of dual rectangle and local orthogonality.

Definition 1

Let R be a rectangle of dimensions \(a\times b\times c\). Then the dual rectangle of R, denoted by \(R^*\), is the rectangle centered at the origin of dimensions \(a^{-1}\times b^{-1}\times c^{-1}\). That is, \(R^*\) is obtained from R by replacing the length of each edge with its reciprocal and translating the center to the origin.

From our definition, we see that if \(R_2\) is a translated copy of \(R_1\), then \(R_1^*=R_2^*\). The motivation for defining the dual rectangle is the following result.

Lemma 1

For any rectangle R, there exists a smooth function \(\omega _R\) which satisfies \(\frac{1}{10}\cdot \textbf{1}_R(x)\le \omega _R(x)\le 10\cdot \textbf{1}_R(x)\) for \(x\in R\), and \(\omega _R\) decays rapidly outside R. Also, \(supp \widehat{\omega }_R \subset R^*\).

This lemma is very standard, so we omit the proof. The next result is the local orthogonality property.

Lemma 2

Let R be a rectangle and let \(\{f_i\}\) be a set of functions. If the sets \(\{ supp \widehat{f}_i+R^* \}\) are finitely overlapping, then

$$\begin{aligned} \int _R \Big |\sum f_i \Big |^2\lesssim \int \sum |f_i|^2 |\omega _R|^2. \end{aligned}$$
(5)

Proof

$$\begin{aligned} \int _R \Big |\sum f_i\Big |^2\lesssim \int \Big |\sum f_i \omega _R\Big |^2=\int \Big |\sum \widehat{f_i\omega _R}\Big |^2. \end{aligned}$$

Note that \(\widehat{f_i\omega _R}=\widehat{f_i}*\widehat{\omega _R}\) is supported in \(supp \widehat{f}_i+R^*\). By the finitely overlapping property, we see the above is bounded by

$$\begin{aligned} \lesssim \int \sum |\widehat{f_i\omega _R}|^2=\int \sum |f_i \omega _R|^2. \end{aligned}$$

\(\square \)

Remark

Note that \(\omega _R\) is essentially \(\textbf{1}_R\) by ignoring the rapidly decaying tail. It turns out that the tail is always harmless. Therefore, to get rid of some irrelevant technicalities, we will just ignore the rapidly decaying tail, and write (5) as

$$\begin{aligned} \int _R \Big |\sum f_i\Big |^2\lesssim \int _R \sum |f_i|^2. \end{aligned}$$

There is another useful notion: comparability. Given two rectangles \(R_1, R_2\), we say \(R_1\) is essentially contained in \(R_2\) if there exists a universal constant C (say \(C=100\)) such that

$$\begin{aligned} R_1\subset C R_2. \end{aligned}$$

We say \(R_1\) and \(R_2\) are comparable if \(R_1\) is essentially contained in \(R_2\) and vice versa, i.e.,

$$\begin{aligned} \frac{1}{C} R_1\subset R_2\subset C R_1. \end{aligned}$$

Throughout this paper, we will just ignore the unimportant constant C, and just write \(R_1\subset R_2\) to denote that \(R_1\) is essentially contained in \(R_2\).

2 Small Cap Square Function Estimate for Parabola

We prove Theorem 1 in this section. We begin with the sharp examples.

2.1 Sharp Examples

There are two types of examples: the concentrated example and the flat example.

\(\boxed {\hbox {Case 1: }p\ge 4\alpha +2}\)

We introduce the concentrated example. Choose f such that \(\widehat{f}(\xi )=\psi _{N_{R^{-1}}(\mathcal {P})}(\xi )\), where \(\psi _{N_{R^{-1}}(\mathcal {P})}\) is a smooth bump function supported in \(N_{R^{-1}}(\mathcal {P})\). We see that \(f(0)=\int \widehat{f}(\xi )d\xi \sim R^{-1}\). Since \(\widehat{f}\) is supported in the unit ball centered at the origin, f is locally constant in B(0, 1). Therefore,

$$\begin{aligned} \Vert f\Vert _p\ge \Vert f\Vert _{L^p(B(0,1))}\gtrsim R^{-1}. \end{aligned}$$

We consider the right hand side of (1). By definition, for each \(\gamma \in \Gamma _\alpha (R^{-1})\), \(\widehat{f}_\gamma \) is roughly a bump function supported in \(2\gamma \). Let \(\gamma ^*\) be the dual rectangle of \(\gamma \) which has dimensions \(R^\alpha \times R\) and is centered at the origin. By an application of integration by parts and by ignoring the tails, we assume

$$\begin{aligned} |f_\gamma |\approx \frac{1}{|\gamma ^*|}\textbf{1}_{\gamma ^*}. \end{aligned}$$

Here, “\(\approx \)” means up to a \(C_\epsilon R^\epsilon \) factor for any \(\epsilon >0\). We will use the same notation throughout the paper.

We see that

$$\begin{aligned} \Big \Vert \Big (\sum \limits _{\gamma \in \Gamma _\alpha (R^{-1})}|f_\gamma |^2\Big )^{1/2}\Big \Vert _{L^p(\mathbb {R}^2)}^p\sim R^{-(1+\alpha )p}\int _{B(0,R)} \Big (\sum _\gamma \textbf{1}_{\gamma ^*}\Big )^{p/2}. \end{aligned}$$

We evaluate the integral above. There are two extreme regions: \(B(0,R^\alpha )\) where all the \(\{\gamma ^*\}\) overlap; \(B(0,R)\setminus B(0,R/2)\) where \(\{\gamma ^*\}\) is \( O(R^{2\alpha -1})\)-overlapping. For the intermediate region \(B(0,r)\setminus B(0,r/2)\) (\(R^\alpha \le r\le R\)), we see that \(\{\gamma ^*\}\) is \( O(r^{-1}R^{2\alpha })\)-overlapping. We may find a dyadic radius r such that

$$\begin{aligned} \int \Big (\sum _\gamma \textbf{1}_{\gamma ^*}\Big )^{p/2}&\approx \int _{B(0,r)\setminus B(0,r/2)} \Big (\sum _\gamma \textbf{1}_{\gamma ^*}\Big )^{p/2}\lesssim (r^{-1}R^{2\alpha })^{p/2}|B(0,r)|\sim r^{2-\frac{p}{2}}R^{\alpha p}. \end{aligned}$$

Since \(p\ge 4\alpha +2\ge 4\), the expression above is maximized when \(r=R^\alpha \). Plugging in, we obtain

$$\begin{aligned} \int \Big (\sum _\gamma \textbf{1}_{\gamma ^*}\Big )^{p/2}\lessapprox R^{\alpha (2+\frac{p}{2})}.\end{aligned}$$

Plugging into (1), we have

$$\begin{aligned} R^{-1}\lessapprox C_{\alpha ,p}(R)R^{-(1+\alpha )} R^{\alpha (\frac{2}{p}+\frac{1}{2})}, \end{aligned}$$

which gives

$$\begin{aligned} C_{\alpha ,p}(R)\gtrapprox R^{\alpha (\frac{1}{2}-\frac{2}{p})}.\end{aligned}$$
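The two computations above, the choice of the worst dyadic radius and the resulting lower bound on \(C_{\alpha ,p}(R)\), amount to exact exponent arithmetic, which can be double-checked numerically (an illustrative sketch; the function names are ours):

```python
from fractions import Fraction as F

def bound_exponent(alpha, p, t):
    # log_R of r^(2 - p/2) * R^(alpha*p), writing r = R^t with t in [alpha, 1]
    return t * (2 - p / 2) + alpha * p

def maximizing_t(alpha, p, steps=8):
    # brute-force maximizer over a grid of t in [alpha, 1]
    ts = [alpha + (1 - alpha) * F(k, steps) for k in range(steps + 1)]
    return max(ts, key=lambda t: bound_exponent(alpha, p, t))

def case1_gain(alpha, p):
    # log_R of ||f||_p minus log_R of the right hand side of (1)
    # without the factor C_{alpha,p}(R)
    return -1 - (-(1 + alpha) + alpha * (F(2) / p + F(1, 2)))

alphas = [F(1, 2), F(2, 3), F(1)]
ps = [F(4), F(6), F(10)]
# for p >= 4 the maximum is attained at t = alpha, i.e. at r = R^alpha
max_at_R_alpha = all(maximizing_t(a, p) == a for a in alphas for p in ps)
# the gain is exactly alpha(1/2 - 2/p), the claimed lower bound
gain_matches = all(case1_gain(a, p) == a * (F(1, 2) - F(2) / p)
                   for a in alphas for p in ps)
```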

\(\boxed {\hbox {Case 2:} 2\le p\le 4\alpha +2}\)

We introduce the flat example. Let \(\theta \subset N_{R^{-1}}(\mathcal {P})\) be a \(R^{-1/2}\times R^{-1}\)-cap. Choose f such that \(\widehat{f}(\xi )=\psi _\theta (\xi )\), where \(\psi _\theta \) is a smooth bump function supported in \(\theta \). Let \(\theta ^*\) be the dual rectangle of \(\theta \), which has dimensions \(R^{1/2}\times R\) and is centered at the origin. By the locally constant property, f is an \(L^1\)-normalized function essentially supported in \(\theta ^*\). By ignoring the tails, we assume

$$\begin{aligned} |f|\approx \frac{1}{|\theta ^*|}\textbf{1}_{\theta ^*}. \end{aligned}$$

We see that

$$\begin{aligned} \Vert f\Vert _p\sim R^{-\frac{3}{2}}R^{\frac{3}{2p}}. \end{aligned}$$

We consider the right hand side of (1). By the same reasoning as in Case 1, for each \(\gamma \in \Gamma _\alpha (R^{-1})\) with \(\gamma \subset \theta \), we know that \(\widehat{f}_\gamma \) is roughly a bump function supported in \(2\gamma \). Therefore, we can assume

$$\begin{aligned} |f_\gamma |\approx \frac{1}{|\gamma ^*|}\textbf{1}_{\gamma ^*}. \end{aligned}$$

We also note that \(\gamma _1^*\) and \(\gamma _2^*\) are comparable when \(\gamma _1,\gamma _2\subset \theta \). We have

$$\begin{aligned} \Big \Vert \Big (\sum \limits _{\gamma \in \Gamma _\alpha (R^{-1})}|f_\gamma |^2\Big )^{1/2}\Big \Vert _{L^p(\mathbb {R}^2)}&\sim R^{-(1+\alpha )}\bigg (\int \Big (\sum _{\gamma \subset \theta } \textbf{1}_{\gamma ^*}\Big )^{p/2}\bigg )^{1/p}\\&\sim R^{-(1+\alpha )}\#\{\gamma \subset \theta \}^{1/2}|\gamma ^*|^{1/p} \sim R^{-(1+\alpha )} R^{\frac{1}{2}(\alpha -\frac{1}{2})}R^{\frac{1+\alpha }{p}}. \end{aligned}$$

Plugging into (1), we have

$$\begin{aligned} R^{-\frac{3}{2}}R^{\frac{3}{2p}}\lessapprox C_{\alpha ,p}(R)R^{-(1+\alpha )} R^{\frac{1}{2}(\alpha -\frac{1}{2})}R^{\frac{1+\alpha }{p}}, \end{aligned}$$

which gives

$$\begin{aligned} C_{\alpha ,p}(R)\gtrapprox R^{(\alpha -\frac{1}{2})(\frac{1}{2}-\frac{1}{p})}.\end{aligned}$$
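The exponent bookkeeping in the flat example can likewise be checked with exact rational arithmetic (an illustrative sketch; the helper name is ours):

```python
from fractions import Fraction as F

def case2_gain(alpha, p):
    # log_R of the left hand side minus log_R of the right hand side
    # (without C_{alpha,p}(R)) in the display above
    lhs = F(-3, 2) + F(3, 2) / p
    rhs = -(1 + alpha) + (alpha - F(1, 2)) / 2 + (1 + alpha) / p
    return lhs - rhs

# the gain is exactly (alpha - 1/2)(1/2 - 1/p), the claimed lower bound
flat_ok = all(case2_gain(a, p) == (a - F(1, 2)) * (F(1, 2) - F(1) / p)
              for a in [F(1, 2), F(2, 3), F(3, 4), F(1)]
              for p in [F(2), F(3), F(4), F(6)])
```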

2.2 Proof of Theorem 1

By the standard localization argument, it suffices to prove

$$\begin{aligned} \Vert f\Vert _{L^p(B_R)}\lessapprox _\epsilon (R^{\alpha (\frac{1}{2}-\frac{2}{p})}+R^{(\alpha -\frac{1}{2})(\frac{1}{2}-\frac{1}{p})})\Big \Vert \Big (\sum \limits _{\gamma \in \Gamma _\alpha (R^{-1})}|f_\gamma |^2\Big )^{1/2}\Big \Vert _p. \end{aligned}$$

We introduce some notation. Throughout the proof, we use \(\gamma \) to denote caps of dimensions \(R^{-\alpha }\times R^{-1}\). For \(R^{-1/2}\le \Delta \le 1\), we will consider caps \(\tau \) of length \(\Delta \) and thickness \(R^{-1}\). We write \(|\tau |=\Delta \) to indicate the length of \(\tau \). We will also partition the region \(B_R\) into rectangles of dimensions \(R^\alpha \times R\). For simplicity, we denote these rectangles by \(B_{R^\alpha \times R}.\) The longest direction of \(B_{R^\alpha \times R}\) will be specified in the proof.

Let \(K\sim \log R\) and let \(m\in \mathbb {N}\) be such that \(K^m=R^{1/2}\). By doing the broad-narrow reduction as in [2, Section 5.1], we have

$$\begin{aligned} \Vert f\Vert _{L^p(B_R)}^p&\lesssim C^m\sum _{|\theta |=R^{-1/2}}\Vert f_\theta \Vert _{L^p(B_R)}^p \end{aligned}$$
(6)
$$\begin{aligned}&+C^m K^C \sum _{{\begin{array}{l} R^{-1/2}\le \Delta \le 1\\ \Delta \text {~dyadic} \end{array}}} \sum _{|\tau |\sim \Delta } \max _{{\begin{array}{l} \tau _1,\tau _2\subset \tau \\ |\tau _1|=|\tau _2|=K^{-1}\Delta \\ dist (\tau _1,\tau _2)\ge (10K)^{-1}\Delta \end{array}} }\Vert (f_{\tau _1}f_{\tau _2})^{1/2}\Vert _{L^p(B_R)}^p. \end{aligned}$$
(7)

Note that \(C^mK^C\lesssim _\epsilon R^\epsilon \), for each \(\epsilon >0\).
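To see why \(C^mK^C\) is subpolynomial in R, note that with \(K\sim \log R\) and \(K^m=R^{1/2}\) the quantity \(\log _R(C^mK^C)\) tends to 0 as R grows. The following numerical sketch (the choice \(C=10\) is ours and purely illustrative) demonstrates this decay:

```python
from math import log

def error_exponent(R, C=10.0):
    # log_R of C^m * K^C, with K = log R and m chosen so that K^m = R^(1/2);
    # the constant C = 10 is an arbitrary illustrative choice
    K = log(R)
    m = 0.5 * log(R) / log(K)
    return (m * log(C) + C * log(K)) / log(R)

# the exponent decays (slowly) to 0 as R grows, so C^m K^C = O_eps(R^eps)
vals = [error_exponent(10.0 ** k) for k in (3, 6, 12, 24, 48)]
```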

We first estimate the right hand side of (6).

Lemma 3

Let \(\theta \) be a cap of length \(R^{-1/2}\). Then,

$$\begin{aligned} \Vert f_\theta \Vert _{L^p(B_R)}\lesssim R^{(\alpha -\frac{1}{2})(\frac{1}{2}-\frac{1}{p})}\Big \Vert \Big (\sum _{\gamma \subset \theta }|f_\gamma |^2\Big )^{1/2}\Big \Vert _p. \end{aligned}$$

Proof

We partition \(B_R\) into \(B_{R^\alpha \times R}\), where each \(B_{R^\alpha \times R}\) is a translation of \(\gamma ^*\) for \(\gamma \subset \theta \) (note that for all \(\gamma \subset \theta \), \(\gamma ^*\)’s are comparable). It suffices to prove for any \(B_{R^\alpha \times R}\),

$$\begin{aligned} \Vert f_\theta \Vert _{L^p(B_{R^\alpha \times R})}\lesssim R^{(\alpha -\frac{1}{2})(\frac{1}{2}-\frac{1}{p})}\Big \Vert \Big (\sum _{\gamma \subset \theta }|f_\gamma |^2\Big )^{1/2}\Big \Vert _{L^p(\omega _{B_{R^\alpha \times R}})}. \end{aligned}$$
(8)

Here, \(\omega _{B_{R^\alpha \times R}}\) is a weight which equals 1 on \(B_{R^\alpha \times R}\) and decays rapidly outside \(B_{R^\alpha \times R}\), and \(\Vert g\Vert _{L^p(\omega )}\) is defined to be \((\int |g|^p\omega )^{1/p}\). We remark that we use \(\omega _{B_{R^\alpha \times R}}\) instead of \(\textbf{1}_{B_{R^\alpha \times R}}\) to make the local orthogonality and locally constant properties rigorous. As such technicalities are well-known (see for example [1]), we will just pretend \(\omega _{B_{R^\alpha \times R}}=\textbf{1}_{B_{R^\alpha \times R}}\) for convenience.

We further do the partition

$$\begin{aligned} B_{R^\alpha \times R}=\bigsqcup B_{R^{1/2}\times R}, \end{aligned}$$

where each \(B_{R^{1/2}\times R}\) is a translation of \(\theta ^*\). Since \(f_\theta \) is locally constant on each \(B_{R^{1/2}\times R}\), we have

$$\begin{aligned} \Vert f_\theta \Vert _{L^p(B_{R^\alpha \times R})}&=\bigg ( \sum _{B_{R^{1/2}\times R}}\Vert f_\theta \Vert _{L^p(B_{R^{1/2}\times R})}^p \bigg )^{1/p}\\&\lesssim R^{\frac{3}{2}(\frac{1}{p}-\frac{1}{2})}\bigg ( \sum _{B_{R^{1/2}\times R}}\Vert f_\theta \Vert _{L^2(B_{R^{1/2}\times R})}^p \bigg )^{1/p}\\&\le R^{\frac{3}{2}(\frac{1}{p}-\frac{1}{2})}\Vert f_\theta \Vert _{L^2(B_{R^\alpha \times R})}. \end{aligned}$$

By local orthogonality, Hölder’s inequality and noting \(p\ge 2\), we have

$$\begin{aligned}{} & {} \Vert f_\theta \Vert _{L^2(B_{R^\alpha \times R})}\\{} & {} \lesssim \Big \Vert \Big (\sum _{\gamma \subset \theta }|f_\gamma |^2\Big )^{1/2}\Big \Vert _{L^2(B_{R^\alpha \times R})}\le R^{(1+\alpha )(\frac{1}{2}-\frac{1}{p})} \Big \Vert \Big (\sum _{\gamma \subset \theta }|f_\gamma |^2\Big )^{1/2}\Big \Vert _{L^p(B_{R^\alpha \times R})}.\end{aligned}$$

Combining the inequalities, we finish the proof of (8). \(\square \)
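The two exponents appearing in the proof of Lemma 3 combine exactly to the constant in its statement; a quick rational-arithmetic check (helper name ours, not part of the proof):

```python
from fractions import Fraction as F

def lemma3_exponent(alpha, p):
    # locally constant step + Holder step in the proof of Lemma 3
    locally_constant = F(3, 2) * (F(1) / p - F(1, 2))
    holder = (1 + alpha) * (F(1, 2) - F(1) / p)
    return locally_constant + holder

# the sum equals (alpha - 1/2)(1/2 - 1/p), as claimed in (8)
lemma3_ok = all(lemma3_exponent(a, p) == (a - F(1, 2)) * (F(1, 2) - F(1) / p)
                for a in [F(1, 2), F(3, 4), F(1)]
                for p in [F(2), F(4), F(6), F(10)])
```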

By Lemma 3, the right hand side of (6) is bounded by

$$\begin{aligned} R^\epsilon R^{(\alpha -\frac{1}{2})(\frac{1}{2}-\frac{1}{p})}\bigg (\sum _\theta \Big \Vert \Big (\sum _{\gamma \subset \theta }|f_\gamma |^2\Big )^{1/2}\Big \Vert _p^p\bigg )^{1/p}\le C_{\alpha ,p}(R)\Big \Vert \Big (\sum _{\gamma }|f_\gamma |^2\Big )^{1/2}\Big \Vert _p.\end{aligned}$$

Next, we estimate (7). For any summand in (7), we will show that

$$\begin{aligned} \Vert (f_{\tau _1}f_{\tau _2})^{1/2}\Vert _{L^p(B_R)}\lesssim C_{\alpha ,p}(R) \Big \Vert \Big (\sum _{\gamma \subset \tau }|f_\gamma |^2\Big )^{1/2}\Big \Vert _p. \end{aligned}$$
(9)

This will imply (7)\(^{\frac{1}{p}}\) \(\lessapprox C_{\alpha ,p}(R)\Big \Vert \Big (\sum \limits _{\gamma }|f_\gamma |^2\Big )^{1/2}\Big \Vert _p\), which finishes the proof of Theorem 1. It remains to prove (9).

Fix a \(\Delta \in [R^{-1/2},1]\) and a \(\tau \) with \(|\tau |=\Delta \). We first consider \(\bigcap _{\gamma \subset \tau } \gamma ^*\). It is easy to see that \(\bigcap _{\gamma \subset \tau } \gamma ^*\) is an \(R^\alpha \times R^\alpha \Delta ^{-1}\)-rectangle when \(\Delta \ge R^{\alpha -1}\), and an \(R^\alpha \times R\)-rectangle when \(\Delta \le R^{\alpha -1}\). We consider these two cases separately.

\(\boxed {\hbox {Case 1:} \Delta \ge R^{\alpha -1}}\)

We choose a partition \(B_R=\bigsqcup B_{R^{\alpha }\times R^\alpha \Delta ^{-1}}\), where each \(B_{R^{\alpha }\times R^\alpha \Delta ^{-1}}\) is a translation of \(\bigcap _{\gamma \subset \tau }\gamma ^*\). We just need to show

$$\begin{aligned} \Vert (f_{\tau _1}f_{\tau _2})^{1/2}\Vert _{L^p(B_{R^{\alpha }\times R^\alpha \Delta ^{-1}})}\lesssim C_{\alpha ,p}(R) \Big \Vert \Big (\sum _{\gamma \subset \tau }|f_\gamma |^2\Big )^{1/2}\Big \Vert _{L^p(B_{R^{\alpha }\times R^\alpha \Delta ^{-1}})}. \end{aligned}$$
(10)

Since each \(|f_\gamma |\) is locally constant on \(B_{R^{\alpha }\times R^\alpha \Delta ^{-1}}\) when \(\gamma \subset \tau \), we have

$$\begin{aligned} \Big \Vert \Big (\sum _{\gamma \subset \tau }|f_\gamma |^2\Big )^{1/2}\Big \Vert _{L^p(B_{R^{\alpha }\times R^\alpha \Delta ^{-1}})}\sim (R^{2\alpha }\Delta ^{-1})^{-\frac{1}{2}+\frac{1}{p}}\Big \Vert \Big (\sum _{\gamma \subset \tau }|f_\gamma |^2\Big )^{1/2}\Big \Vert _{L^2(B_{R^{\alpha }\times R^\alpha \Delta ^{-1}})}. \end{aligned}$$

Since \(\{f_\gamma \}_{\gamma \subset \tau }\) are locally orthogonal on \(B_{R^{\alpha }\times R^\alpha \Delta ^{-1}}\), we have

$$\begin{aligned} \Big \Vert \Big (\sum _{\gamma \subset \tau }|f_\gamma |^2\Big )^{1/2}\Big \Vert _{L^2(B_{R^{\alpha }\times R^\alpha \Delta ^{-1}})}\sim \Vert f_\tau \Vert _{L^2(B_{R^{\alpha }\times R^\alpha \Delta ^{-1}})}. \end{aligned}$$

Therefore, (10) is reduced to

$$\begin{aligned} \Vert (f_{\tau _1}f_{\tau _2})^{1/2}\Vert _{L^p(B_{R^{\alpha }\times R^\alpha \Delta ^{-1}})}\lesssim C_{\alpha ,p}(R) (R^{2\alpha }\Delta ^{-1})^{-\frac{1}{2}+\frac{1}{p}} \Vert f_\tau \Vert _{L^2(B_{R^{\alpha }\times R^\alpha \Delta ^{-1}})}. \end{aligned}$$
(11)

Next, we apply the parabolic rescaling. Recall that \(\tau \) is a cap of length \(\Delta \). We dilate by factor \(\Delta ^{-1}\) in the tangent direction of \(\tau \) and dilate by factor \(\Delta ^{-2}\) in the normal direction of \(\tau \). Under the rescaling, we see that: \(\tau \) becomes the \(R^{-1}\Delta ^{-2}\)-neighborhood of \(\mathcal {P}\); \(\tau _1\) and \(\tau _2\) become \(K^{-1}\)-separated caps with length \(K^{-1}\) and thickness \(R^{-1}\Delta ^{-2}\); the rectangle \(B_{R^{\alpha }\times R^\alpha \Delta ^{-1}}\) in the physical space becomes \(B_{R^\alpha \Delta }\). Let \(g,g_1,g_2\) be the rescaled version of \(f_\tau ,f_{\tau _1},f_{\tau _2}\) respectively. The inequality (11) becomes

$$\begin{aligned} \Vert (g_1 g_2)^{1/2}\Vert _{L^p(B_{R^{\alpha }\Delta })}\lesssim C_{\alpha ,p}(R) (R^{2\alpha }\Delta ^{-1})^{-\frac{1}{2}+\frac{1}{p}} \Delta ^{3(-\frac{1}{2}+\frac{1}{p})}\Vert g\Vert _{L^2(B_{R^{\alpha }\Delta })}. \end{aligned}$$
(12)

We recall the following bilinear restriction estimate (see for example in [9]).

Lemma 4

Let \(r>1\), \(K>1\). Suppose \(g_1, g_2\) satisfy \(supp \widehat{g}_1, supp \widehat{g}_2\subset N_{r^{-2}}(\mathcal {P})\) and \(dist (supp \widehat{g}_1, supp \widehat{g}_2)>K^{-1}\). Then for \(p\ge 2\) and \(r'\ge r\) we have

$$\begin{aligned} \Vert (g_1g_2)^{1/2}\Vert _{L^p(B_{r'})}\lesssim K^{O(1)}r^{\frac{2}{p}-1}\big (\Vert g_1\Vert _{L^2(B_{r'})}\Vert g_2\Vert _{L^2(B_{r'})}\big )^{1/2}. \end{aligned}$$
(13)

Proof

We just need to prove the case \(r'=r\). When \(p=2\), this is trivial. When \(p=4\), this is the bilinear restriction estimate. When \(p=\infty \), we note that

$$\begin{aligned} \Vert (g_1g_2)^{1/2}\Vert _{L^\infty (B_r)}^2&\le \Vert g_1\Vert _{L^\infty (B_r)}\Vert g_2\Vert _{L^\infty (B_r)}\le \Vert \widehat{g}_1\Vert _{L^1}\Vert \widehat{g}_2\Vert _{L^1}\\&\lesssim r^{-2}\Vert \widehat{g}_1\Vert _{L^2}\Vert \widehat{g}_2\Vert _{L^2}=r^{-2}\Vert g_1\Vert _{L^2}\Vert g_2\Vert _{L^2}. \end{aligned}$$

The second-to-last inequality is by Hölder and the support condition on \(\widehat{g}_1,\widehat{g}_2\); the last step is by Plancherel. For other p, the estimate follows by using Hölder's inequality to interpolate between \(p=2,4,\infty \). \(\square \)
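The interpolation step only uses that the exponent \(\frac{2}{p}-1\) of r in (13) is affine in \(\frac{1}{p}\), so the endpoint exponents \(0, -\frac{1}{2}, -1\) at \(p=2,4,\infty \) determine it. This bookkeeping can be checked exactly (an illustrative sketch; function names ours):

```python
from fractions import Fraction as F

def target_exponent(p):
    # exponent of r in (13)
    return F(2) / p - 1

def interpolated_exponent(p):
    # write 1/p = (1 - t)/2 + t * 0, i.e. interpolate between p = 2 and
    # p = infinity, whose exponents of r are 0 and -1 respectively
    t = 1 - F(2) / p
    return (1 - t) * 0 + t * (-1)

interp_ok = all(interpolated_exponent(p) == target_exponent(p)
                for p in [F(2), F(3), F(4), F(6), F(8), F(100)])
# the bilinear endpoint p = 4 sits on the same line, with exponent -1/2
bilinear_ok = target_exponent(F(4)) == F(-1, 2)
```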

We return to (12). Noting that \(R^\alpha \Delta \ge (R\Delta ^2)^{1/2}\), we apply the lemma above to bound the left hand side of (12) by \((R^\alpha \Delta )^{\frac{2}{p}-1}\Vert g\Vert _{L^2(B_{R^\alpha \Delta })}\). It suffices to prove

$$\begin{aligned} (R^\alpha \Delta )^{\frac{2}{p}-1}\lesssim C_{\alpha ,p}(R) (R^{2\alpha }\Delta ^{-1})^{-\frac{1}{2}+\frac{1}{p}} \Delta ^{3(-\frac{1}{2}+\frac{1}{p})}. \end{aligned}$$
(14)

When \(p\ge 4\), we use \(C_{\alpha ,p}(R)\gtrsim R^{\alpha (\frac{1}{2}-\frac{2}{p})}\). Then (14) boils down to

$$\begin{aligned} (R^\alpha \Delta )^{\frac{2}{p}-1}\lesssim R^{\alpha (\frac{1}{2}-\frac{2}{p})} (R^{2\alpha }\Delta ^{-1})^{-\frac{1}{2}+\frac{1}{p}} \Delta ^{3(-\frac{1}{2}+\frac{1}{p})}, \end{aligned}$$
(15)

which is equivalent to

$$\begin{aligned} R^{\alpha (\frac{1}{2}-\frac{2}{p})}\gtrsim 1, \end{aligned}$$

which is true since \(R\ge 1\).

When \(p\le 4\), we use \(C_{\alpha ,p}(R)\gtrsim R^{(\alpha -\frac{1}{2})(\frac{1}{2}-\frac{1}{p})}\). Then (14) boils down to

$$\begin{aligned} (R^\alpha \Delta )^{\frac{2}{p}-1}\lesssim R^{(\alpha -\frac{1}{2})(\frac{1}{2}-\frac{1}{p})} (R^{2\alpha }\Delta ^{-1})^{-\frac{1}{2}+\frac{1}{p}} \Delta ^{3(-\frac{1}{2}+\frac{1}{p})}, \end{aligned}$$
(16)

which is equivalent to

$$\begin{aligned} R^{(\alpha -\frac{1}{2})(\frac{1}{2}-\frac{1}{p})}\gtrsim 1, \end{aligned}$$

which is true since \(\alpha \ge 1/2.\)
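The reductions of (15) and (16) stated above are pure exponent arithmetic: the \(\Delta \)-exponents on both sides agree, and the gap in the R-exponents is exactly the power of R in \(C_{\alpha ,p}(R)\). A rational-arithmetic check (illustrative sketch; helper names ours):

```python
from fractions import Fraction as F

def delta_exponents(p):
    # Delta-exponents on the two sides of (15) (the same holds for (16))
    lhs = F(2) / p - 1
    rhs = -(F(1) / p - F(1, 2)) + 3 * (F(1) / p - F(1, 2))
    return lhs, rhs

def R_exponent_gap(C_exponent, alpha, p):
    # R-exponent of the right side minus that of the left side
    lhs_R = alpha * (F(2) / p - 1)
    rhs_R = C_exponent + 2 * alpha * (F(1) / p - F(1, 2))
    return rhs_R - lhs_R

alphas = [F(1, 2), F(3, 4), F(1)]
delta_ok = all(delta_exponents(p)[0] == delta_exponents(p)[1]
               for p in [F(2), F(4), F(6)])
# (15): the gap equals alpha(1/2 - 2/p), which is >= 0 for p >= 4
gap15_ok = all(R_exponent_gap(a * (F(1, 2) - F(2) / p), a, p) >= 0
               for a in alphas for p in [F(4), F(6), F(10)])
# (16): the gap equals (alpha - 1/2)(1/2 - 1/p), which is >= 0 for p >= 2
gap16_ok = all(R_exponent_gap((a - F(1, 2)) * (F(1, 2) - F(1) / p), a, p) >= 0
               for a in alphas for p in [F(2), F(3), F(4)])
```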

\(\boxed {\hbox {Case 2:} \Delta \le R^{\alpha -1}}\)

We choose a partition \(B_R=\bigsqcup B_{R^{\alpha }\times R}\), where each \(B_{R^{\alpha }\times R}\) is a translation of \(\bigcap _{\gamma \subset \tau }\gamma ^*\). We just need to show

$$\begin{aligned} \Vert (f_{\tau _1}f_{\tau _2})^{1/2}\Vert _{L^p(B_{R^{\alpha }\times R})}\lesssim C_{\alpha ,p}(R) \Big \Vert \Big (\sum _{\gamma \subset \tau }|f_\gamma |^2\Big )^{1/2}\Big \Vert _{L^p(B_{R^{\alpha }\times R})}. \end{aligned}$$
(17)

Since each \(|f_\gamma |\) is locally constant on \(B_{R^{\alpha }\times R}\) when \(\gamma \subset \tau \), we have

$$\begin{aligned} \Big \Vert \Big (\sum _{\gamma \subset \tau }|f_\gamma |^2\Big )^{1/2}\Big \Vert _{L^p(B_{R^{\alpha }\times R})}\sim (R^{\alpha +1})^{-\frac{1}{2}+\frac{1}{p}}\Big \Vert \Big (\sum _{\gamma \subset \tau }|f_\gamma |^2\Big )^{1/2}\Big \Vert _{L^2(B_{R^{\alpha }\times R})}. \end{aligned}$$

Since \(\{f_\gamma \}_{\gamma \subset \tau }\) are locally orthogonal on \(B_{R^{\alpha }\times R}\), we have

$$\begin{aligned} \Big \Vert \Big (\sum _{\gamma \subset \tau }|f_\gamma |^2\Big )^{1/2}\Big \Vert _{L^2(B_{R^{\alpha }\times R})}\sim \Vert f_\tau \Vert _{L^2(B_{R^{\alpha }\times R})}. \end{aligned}$$

Therefore, (17) is reduced to

$$\begin{aligned} \Vert (f_{\tau _1}f_{\tau _2})^{1/2}\Vert _{L^p(B_{R^{\alpha }\times R})}\lesssim C_{\alpha ,p}(R) (R^{\alpha +1})^{-\frac{1}{2}+\frac{1}{p}} \Vert f_\tau \Vert _{L^2(B_{R^{\alpha }\times R})}. \end{aligned}$$
(18)

Next, we do the same parabolic rescaling as above. The rectangle \(B_{R^{\alpha }\times R}\) in the physical space becomes \(B_{R^\alpha \Delta \times R\Delta ^2}\). Let \(g,g_1,g_2\) be the rescaled version of \(f_\tau ,f_{\tau _1},f_{\tau _2}\) respectively. The inequality (18) becomes

$$\begin{aligned} \Vert (g_1 g_2)^{1/2}\Vert _{L^p(B_{R^{\alpha }\Delta \times R\Delta ^2})}\lesssim C_{\alpha ,p}(R) (R^{\alpha +1})^{-\frac{1}{2}+\frac{1}{p}} \Delta ^{3(-\frac{1}{2}+\frac{1}{p})}\Vert g\Vert _{L^2(B_{R^{\alpha }\Delta \times R\Delta ^2})}. \end{aligned}$$
(19)

To apply Lemma 4, we do the partition \(B_{R^\alpha \Delta \times R\Delta ^2}=\bigsqcup B_{R\Delta ^2}\). So, (19) is reduced to

$$\begin{aligned} \Vert (g_1 g_2)^{1/2}\Vert _{L^p(B_{R\Delta ^2})}\lesssim C_{\alpha ,p}(R) (R^{\alpha +1})^{-\frac{1}{2}+\frac{1}{p}} \Delta ^{3(-\frac{1}{2}+\frac{1}{p})}\Vert g\Vert _{L^2(B_{R\Delta ^2})}. \end{aligned}$$
(20)

By Lemma 4,

$$\begin{aligned}\Vert (g_1 g_2)^{1/2}\Vert _{L^p(B_{R\Delta ^2})}\lesssim (R \Delta ^2)^{\frac{2}{p}-1}\Vert g\Vert _{L^2(B_{R\Delta ^2})}.\end{aligned}$$

It suffices to prove

$$\begin{aligned} (R \Delta ^2)^{\frac{2}{p}-1}\lesssim C_{\alpha ,p}(R) R^{(\alpha +1)(-\frac{1}{2}+\frac{1}{p})} \Delta ^{3(-\frac{1}{2}+\frac{1}{p})}. \end{aligned}$$
(21)

When \(p\ge 4\alpha +2\), we use \(C_{\alpha ,p}(R)\gtrsim R^{\alpha (\frac{1}{2}-\frac{2}{p})}\). Then (21) boils down to

$$\begin{aligned} (R\Delta ^2)^{\frac{2}{p}-1}\lesssim R^{\alpha (\frac{1}{2}-\frac{2}{p})} R^{(\alpha +1)(-\frac{1}{2}+\frac{1}{p})} \Delta ^{3(-\frac{1}{2}+\frac{1}{p})}, \end{aligned}$$
(22)

which is equivalent to

$$\begin{aligned} \Delta ^{\frac{1}{p}-\frac{1}{2}}\lesssim R^{-\frac{\alpha }{p}+\frac{1}{2}-\frac{1}{p}}. \end{aligned}$$

Using \(\Delta \ge R^{-\frac{1}{2}}\), we just need to prove

$$\begin{aligned} R^{-\frac{1}{2p}+\frac{1}{4}}\lesssim R^{-\frac{\alpha }{p}+\frac{1}{2}-\frac{1}{p}}. \end{aligned}$$

The last inequality is equivalent to \(\frac{1}{4}-\frac{1}{2p}-\frac{\alpha }{p}\ge 0\), which is further equivalent to \(p\ge 4\alpha +2\). We also remark that this is the place where the critical exponent \(p=4\alpha +2\) appears.

When \(2\le p\le 4\alpha +2\), we use \(C_{\alpha ,p}(R)\gtrsim R^{(\alpha -\frac{1}{2})(\frac{1}{2}-\frac{1}{p})}\). Then (21) boils down to

$$\begin{aligned} (R\Delta ^2)^{\frac{2}{p}-1}\lesssim R^{(\alpha -\frac{1}{2})(\frac{1}{2}-\frac{1}{p})} R^{(\alpha +1)(-\frac{1}{2}+\frac{1}{p})} \Delta ^{3(-\frac{1}{2}+\frac{1}{p})}, \end{aligned}$$
(23)

which is equivalent to

$$\begin{aligned} \Delta ^{\frac{1}{p}-\frac{1}{2}}\lesssim R^{\frac{1}{4}-\frac{1}{2p}}, \end{aligned}$$

which is true since \(\Delta ^{-1}\le R^{1/2}.\)
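The two final reductions in Case 2 can also be confirmed with exact arithmetic at the extreme value \(\Delta =R^{-1/2}\), where the critical exponent \(p=4\alpha +2\) appears (an illustrative sketch; function names ours):

```python
from fractions import Fraction as F

def case22_holds(alpha, p):
    # reduction of (22) at the extreme Delta = R^{-1/2}:
    # need -1/(2p) + 1/4 <= -alpha/p + 1/2 - 1/p
    return F(-1, 2) / p + F(1, 4) <= -alpha / p + F(1, 2) - F(1) / p

def case23_extreme_equality(p):
    # reduction of (23) at Delta = R^{-1/2}: the two exponents agree exactly
    return F(-1, 2) * (F(1) / p - F(1, 2)) == F(1, 4) - F(1, 2) / p

# (22) at the extreme Delta holds exactly when p >= 4*alpha + 2
ok22 = all(case22_holds(a, p) == (p >= 4 * a + 2)
           for a in [F(1, 2), F(2, 3), F(3, 4), F(1)]
           for p in [F(2), F(3), F(4), F(9, 2), F(6), F(8), F(12)])
ok23 = all(case23_extreme_equality(p) for p in [F(2), F(3), F(4), F(6)])
```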

The proof of Theorem 1 is finished.

3 Small Cap Square Function Estimate for Cone

We prove Theorem 2 in this section. We begin with the sharp examples.

3.1 Sharp Examples

Choose f such that \(\widehat{f}(\xi )=\psi _{N_{R^{-1}}(\mathcal {C})}(\xi )\), where \(\psi _{N_{R^{-1}}(\mathcal {C})}(\xi )\) is a smooth bump function supported in \(N_{R^{-1}}(\mathcal {C})\). We are going to derive a lower bound for \(\Vert f\Vert _{p}\), which is the left hand side of (3). We see that \(f(0)=\int \widehat{f}(\xi ) d\xi \sim R^{-1}\). Since \(\widehat{f}\) is supported in the unit ball centered at the origin, f is locally constant in B(0, 1). Therefore,

$$\begin{aligned} \Vert f\Vert _p\gtrsim \Vert f\Vert _{L^p(B(0,1))}\gtrsim R^{-1}. \end{aligned}$$
(24)

We also estimate f in the region \(\{|x|\sim R\}\). We first do a canonical partition of \(N_{R^{-1}}(\mathcal {C})\) into \(1\times R^{-1/2}\times R^{-1}\)-planks, denoted by

$$\begin{aligned} N_{R^{-1}}(\mathcal {C})=\bigsqcup \theta . \end{aligned}$$

Then we can write \(f=\sum _\theta f_\theta \), such that each \(\widehat{f}_\theta \) is a smooth bump function on \(\theta \). Let \(\theta ^*\) be the dual rectangle of \(\theta \), so \(\theta ^*\) has size \(1\times R^{1/2}\times R\) and is centered at the origin. By an application of integration by parts, we can assume

$$\begin{aligned} |f_\theta |= \frac{1}{|\theta ^*|}\textbf{1}_{\theta ^*}=R^{-3/2}\textbf{1}_{\theta ^*}. \end{aligned}$$

Now the key observation is that \(\{\theta ^*\}\) are disjoint in \(B(0,R)\setminus B(0,\frac{9}{10}R)\), so we see that

$$\begin{aligned} \Vert f\Vert _p=\Vert \sum _\theta f_\theta \Vert _p&\ge \Big \Vert \sum _\theta f_\theta \Big \Vert _{L^p(B(0,R)\setminus B(0,\frac{9}{10}R))}\\&\sim R^{-3/2}\Big \Vert \sum _\theta \textbf{1}_{\theta ^*} \Big \Vert _{L^p(B(0,R)\setminus B(0,\frac{9}{10}R))}\\&\sim R^{-3/2}\Big (\sum _\theta |\theta ^*| \Big )^{1/p}=R^{-\frac{3}{2}+\frac{2}{p}}. \end{aligned}$$

Combining with (24), we see

$$\begin{aligned} \Vert f\Vert _p\gtrsim \max \Big \{ R^{-1}, R^{-\frac{3}{2}+\frac{2}{p}} \Big \}. \end{aligned}$$
(25)

We see that the threshold at which these two lower bounds coincide is \(p=4\).
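As a quick numerical confirmation of this threshold (our own check, comparing the exponents of R in the two lower bounds of (25) as exact rationals):

```python
from fractions import Fraction as F

# exponents of R in the two lower bounds of (25): R^{-1} and R^{-3/2+2/p}
e1 = lambda p: F(-1)
e2 = lambda p: F(-3, 2) + F(2, 1) / p

assert e1(F(4)) == e2(F(4))   # the two bounds coincide at p = 4
assert e2(F(3)) > e1(F(3))    # for p < 4 the second bound dominates
assert e1(F(6)) > e2(F(6))    # for p > 4 the first bound dominates
```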

For this same f, we now bound the right hand side of (3) from above. Recall that \(\gamma \) is a \(1\times R^{-\beta }\times R^{-1}\)-cap contained in \(N_{R^{-1}}(\mathcal {C})\), and by definition \(\widehat{f}_\gamma =\psi _\gamma \widehat{f}\). Therefore, \(\widehat{f}_\gamma \) is a smooth bump function adapted to \(\gamma \). By an application of integration by parts, we can assume

$$\begin{aligned} |f_\gamma |= \frac{1}{|\gamma ^*|}\textbf{1}_{\gamma ^*}. \end{aligned}$$

Here, the dual rectangle \(\gamma ^*\) is centered at the origin with size \(1\times R^\beta \times R\). See Fig. 3: the rectangle on the left hand side is \(\gamma \); the rectangle on the right hand side is \(\gamma ^*\).

Therefore, we can write

$$\begin{aligned} \Big \Vert \Big (\sum _{\gamma \in \Gamma _\beta (R^{-1})}|f_\gamma |^2\Big )^{1/2}\Big \Vert _p\sim R^{-1-\beta } \bigg (\int \Big (\sum _\gamma \textbf{1}_{\gamma ^*}\Big )^{p/2}\bigg )^{1/p}. \end{aligned}$$
(26)
Fig. 3. Dual rectangle

Note that each \(\gamma ^*\) is contained in B(0, R), so we rewrite

$$\begin{aligned}\int \Big (\sum _\gamma \textbf{1}_{\gamma ^*}\Big )^{p/2}=\int _{|r|\le R}dr\int _{\{x_3=r\}}\Big (\sum _\gamma \textbf{1}_{\gamma ^*}\Big )^{p/2}dx_1dx_2.\end{aligned}$$

We are going to calculate \(\int _{\{x_3=r\}}\Big (\sum _\gamma \textbf{1}_{\gamma ^*}\Big )^{p/2}\). Here is the result:

Proposition 1

For \(p\ge 2\), we have

$$\begin{aligned} \int _{\{x_3=r\}}\Big (\sum _\gamma \textbf{1}_{\gamma ^*}\Big )^{p/2}\approx {\left\{ \begin{array}{ll} R^{2\beta }+R^{\frac{p\beta }{2}} &{} 0\le r\le 10,\\ r^{1-\frac{p}{4}}R^{\beta \frac{p}{2}}+r^{2-\frac{p}{2}}R^{\beta \frac{p}{2}}+R^{2\beta } &{} 10\le r\le R^\beta ,\\ r^{1-\frac{p}{4}}R^{\beta \frac{p}{2}}+R^{2\beta } &{} R^\beta \le r\le R. \end{array}\right. } \end{aligned}$$
(27)
Fig. 4. Horizontal slice

Proof

Fix the plane \(\{x_3=r\}\). For each \(\gamma ^*\), we set

$$\begin{aligned} \gamma ^*_r:=\gamma ^*\cap \{x_3=r\}. \end{aligned}$$

\(\gamma ^*_r\) is a rectangle of size \(1\times R^\beta \) in the plane \(\{x_3=r\}\). Denote the center of \(\gamma ^*_r\) by \(C(\gamma ^*_r)\). We see that \(C(\gamma ^*_r)\) lies on the circle

$$\begin{aligned}S_r:=\{x_3=r, \sqrt{x_1^2+x_2^2}=r\},\end{aligned}$$

and the long direction of \(\gamma ^*_r\) is tangent to \(S_r\) (see Fig. 4). We can rewrite the left hand side of (27) as

$$\begin{aligned} \int _{\mathbb {R}^2} \Big (\sum _\gamma \textbf{1}_{\gamma ^*_r}\Big )^{p/2}. \end{aligned}$$

We also notice two useful facts: (1) \(\#\{\gamma ^*_r\}\sim R^\beta \); (2) \(\{C(\gamma ^*_r)\}\) are roughly \(rR^{-\beta }\)-separated on the circle \(S_r\).

\(\boxed {\hbox {Case 1:} 0\le r\le 10}\)

In this case, we see that \(\{\gamma ^*_r\}\) essentially form a bush centered at the origin. Evaluating the concentrated part and spread-out part, we have

$$\begin{aligned} \int _{\mathbb {R}^2} \Big (\sum _\gamma \textbf{1}_{\gamma ^*_r}\Big )^{p/2}&\approx \int _{B(0,1)}\Big (\sum _\gamma \textbf{1}_{\gamma ^*_r}\Big )^{p/2} + \int _{B(0,R^\beta )\setminus B(0,\frac{1}{2}R^\beta )}\Big (\sum _\gamma \textbf{1}_{\gamma ^*_r}\Big )^{p/2}\\&\sim R^{\frac{p\beta }{2}}+R^{2\beta }. \end{aligned}$$

\(\boxed {\hbox {Case 2:} 10\le r\le R^\beta }\)

For any point \(P\in \bigcup \gamma ^*_r\), we are going to estimate \(\sum _\gamma \textbf{1}_{\gamma ^*_r}(P)\). Define

$$\begin{aligned} d(P):=dist (P,S_r). \end{aligned}$$

We see that any \(P\in \bigcup \gamma ^*_r\) satisfies \(d(P)\lesssim R^\beta \), and if \(P\in \bigcup \gamma ^*_r\) lies inside \(S_r\) then \(d(P)=0\). For simplicity, we write \(d=d(P)\). We consider several cases:

Fig. 5. Horizontal slice

  (1) \(d\le 10\). In this case, P lies in the 10-neighborhood of \(S_r\). Therefore,

    $$\begin{aligned} \sum _\gamma \textbf{1}_{\gamma ^*_r}(P)=\sum _\gamma \textbf{1}_{\gamma ^*_r\cap N_{10}(S_r)}(P) \end{aligned}$$

    Noting that \(\gamma ^*_r\cap N_{10}(S_r)\) is essentially a \(1\times r^{1/2}\)-rectangle centered at \(C(\gamma ^*_r)(\in S_r)\) and noting that \(\{C(\gamma ^*_r)\}\) are \(rR^{-\beta }\) separated, we have

    $$\begin{aligned} \sum _\gamma \textbf{1}_{\gamma ^*_r\cap N_{10}(S_r)}(P)\sim \frac{r^{1/2}}{rR^{-\beta }}=r^{-1/2}R^\beta . \end{aligned}$$
  (2) \(10\le d\le r\). We claim that in this case

    $$\begin{aligned} \sum _\gamma \textbf{1}_{\gamma ^*_r}(P)\sim R^\beta (rd)^{-1/2}. \end{aligned}$$

    See Fig. 5. By translation and rotation, we may assume \(S_r\) is centered at \((-r,0)\) and P lies on the \(x_2\)-axis. By the Pythagorean theorem, the coordinates of P are \((0, \sqrt{d(d+2r)})\). Since \(d\le r\), we may ignore a constant factor and write the coordinates of P as

    $$\begin{aligned} P=(0,(dr)^{1/2}). \end{aligned}$$
    (28)

    The next step is to find the number of \(\gamma ^*_r\) that pass through P. Suppose \(P\in \gamma ^*_r\). Since the center of \(\gamma ^*_r\) lies in \( S_r\), we may denote its coordinate by \(C(\gamma ^*_r)=(-r+r\cos \theta ,r\sin \theta )\). Let \(\ell \) be the line passing through \(C(\gamma ^*_r)\) and tangent to \(S_r\) (which is also the core line of \(\gamma ^*_r\)):

    $$\begin{aligned} \ell : y-r\sin \theta =-\frac{\cos \theta }{\sin \theta }(x+r-r\cos \theta ). \end{aligned}$$

    Since \((dr)^{1/2}\le R^\beta \), we see that \(P\in \gamma ^*_r\) is equivalent to \(dist (\ell ,P)\le \frac{1}{2}\). By some calculation,

    $$\begin{aligned} dist (\ell ,P)&=\frac{|(dr)^{1/2}-r\sin \theta +\frac{\cos \theta }{\sin \theta }r(1-\cos \theta )|}{\sqrt{1+\frac{\cos ^2\theta }{\sin ^2\theta }}}\\&=|\sin \theta (dr)^{1/2}-r(1-\cos \theta )|\\&=2|(dr)^{1/2}\sin \frac{\theta }{2}\cos \frac{\theta }{2}-r\sin ^2\frac{\theta }{2}|. \end{aligned}$$

    We just need to find the number of \(\theta \) such that \(dist (\ell ,P)\le 1/2\). By symmetry, we just compute the positive solutions \(\theta \) that are close to 0. In this case, the inequality becomes

    $$\begin{aligned} (dr)^{1/2}\sin \frac{\theta }{2}\cos \frac{\theta }{2}-r\sin ^2\frac{\theta }{2}\le 1/4. \end{aligned}$$

    The meaningful solutions will be

    $$\begin{aligned} \sin \frac{\theta }{2}&\le \frac{(dr)^{1/2}\cos \frac{\theta }{2}-\sqrt{dr\cos ^2\frac{\theta }{2}-r}}{2r}\\&=\frac{1}{2}\frac{1}{(dr)^{1/2}\cos \frac{\theta }{2}+\sqrt{dr\cos ^2\frac{\theta }{2}-r}}\\&\sim (dr)^{-1/2}. \end{aligned}$$

    In the last step, we use \(\cos \frac{\theta }{2}\sim 1\). Therefore, \(0\le \theta \lesssim (dr)^{-1/2}\). Since \(\{C(\gamma ^*_r)\}\) have angular separation \(\sim R^{-\beta }\), we see that the number of \(\gamma ^*_r\) that contain P is \(\sim R^\beta (dr)^{-1/2}\).

  (3) \(r\le d\le R^{\beta }\). We claim that in this case

    $$\begin{aligned} \sum _\gamma \textbf{1}_{\gamma ^*_r}(P)\sim R^\beta d^{-1}. \end{aligned}$$

    The calculation is exactly the same as above, with the only modification that we replace (28) by \(P=(0,d)\).

Combining the three scenarios (1), (2), (3), we can estimate

$$\begin{aligned} \int _{\mathbb {R}^2} \Big (\sum _\gamma \textbf{1}_{\gamma ^*_r}\Big )^{p/2}&=\bigg (\int _{d(P)\le 10}+ \int _{10\le d(P)\le r}+\int _{r\le d(P)\le R^\beta } \bigg )\Big (\sum _\gamma \textbf{1}_{\gamma ^*_r}(P)\Big )^{p/2}dP\\&\sim r (r^{-1/2}R^\beta )^{p/2}+ \sum _{d\in [10,r]~dyadic }dr (R^\beta (rd)^{-1/2})^{p/2}\\&\quad +\sum _{d\in [r,R^\beta ]~dyadic } d^2 (R^\beta d^{-1})^{p/2}\\&\approx r^{1-\frac{p}{4}}R^{\beta \frac{p}{2}}+r^{2-\frac{p}{2}}R^{\beta \frac{p}{2}}+R^{2\beta } . \end{aligned}$$

We use \(\approx \) in the last line because when \(p=4\), the summation consists of \(\sim \log R\) equal terms instead of a geometric series.

\(\boxed {\hbox {Case 3:} R^\beta \le r\le R}\)

This is almost the same as \(\boxed {\hbox {Case 2}}\). Actually, it is even simpler, since only scenarios (1) and (2) occur (with the range in (2) replaced by \(10\le d\le r^{-1}R^{2\beta }\), noting \(r^{-1}R^{2\beta }\le r\)). The same argument gives

$$\begin{aligned} \int _{\mathbb {R}^2} \Big (\sum _\gamma \textbf{1}_{\gamma ^*_r}\Big )^{p/2}&=\bigg (\int _{d(P)\le 10}+ \int _{10\le d(P)\le r^{-1}R^{2\beta }}\bigg )\Big (\sum _\gamma \textbf{1}_{\gamma ^*_r}(P)\Big )^{p/2}dP\\&\sim r (r^{-1/2}R^\beta )^{p/2}+ \sum _{d\in [10,r^{-1}R^{2\beta }]~dyadic }dr (R^\beta (rd)^{-1/2})^{p/2}\\&\approx r^{1-\frac{p}{4}}R^{\beta \frac{p}{2}}+R^{2\beta }. \end{aligned}$$

\(\square \)
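The counting in scenario (2) of the proof can be illustrated numerically. The sketch below is an informal check (not part of the proof), with a hypothetical parameter N playing the role of \(R^\beta \) and ignoring the finite length of the slices \(\gamma ^*_r\): it places N equally spaced tangent lines to a circle of radius r and counts how many pass within 1/2 of a point at distance d outside the circle. Scenario (2) predicts the count to scale like \(N(dr)^{-1/2}\), so count times \(\sqrt{d}\) should be roughly constant in d.

```python
import math

def tangent_count(N, r, d):
    """Number of the N equally spaced tangent lines to the circle of
    radius r (centered at the origin) passing within 1/2 of the point
    P = (r + d, 0)."""
    count = 0
    for k in range(N):
        th = 2 * math.pi * k / N
        # tangent line at tangency angle th: x cos(th) + y sin(th) = r,
        # so the distance from P to the line is |(r + d) cos(th) - r|
        if abs((r + d) * math.cos(th) - r) <= 0.5:
            count += 1
    return count

N, r = 200_000, 10_000
scaled = [tangent_count(N, r, d) * math.sqrt(d) for d in (100, 400, 1600)]
# the scaled counts should agree up to a small constant factor
assert max(scaled) < 1.35 * min(scaled)
```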

With (27), we can finally estimate

$$\begin{aligned} \int \Big (\sum _\gamma \textbf{1}_{\gamma ^*}\Big )^{p/2}&=\int _{|r|\le R}dr\int _{\{x_3=r\}}\Big (\sum _\gamma \textbf{1}_{\gamma ^*}\Big )^{p/2}dx_1dx_2\\&=\bigg (\int _{0\le |r|\le 10}+\int _{10\le |r|\le R^\beta }+\int _{R^{\beta }\le |r|\le R}\bigg )\Big (\sum _\gamma \textbf{1}_{\gamma ^*}\Big )^{p/2}dx_1dx_2dr\\&\lesssim R^{2\beta }+R^{\frac{p\beta }{2}}+ \sum _{r\in [10,R^\beta ] ~dyadic } r(r^{1-\frac{p}{4}}R^{\beta \frac{p}{2}}+r^{2-\frac{p}{2}}R^{\beta \frac{p}{2}}+R^{2\beta })\\&\ \ \ \ +\sum _{r\in [R^\beta ,R] ~dyadic } r(r^{1-\frac{p}{4}}R^{\beta \frac{p}{2}}+R^{2\beta })\\&\lesssim R^{\frac{p\beta }{2}}+ R^{\beta (2+\frac{p}{4})}+ R^{(2-\frac{p}{4})+\frac{p\beta }{2}}+R^{1+2\beta }\\&\sim R^{\frac{p\beta }{2}}+R^{(2-\frac{p}{4})+\frac{p\beta }{2}}+R^{1+2\beta }. \end{aligned}$$

The last step holds because \(R^{\beta (2+\frac{p}{4})}\le R^{\frac{p\beta }{2}}+R^{(2-\frac{p}{4})+\frac{p\beta }{2}}\).
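This absorption can be verified on a grid of exponents (our own sanity check, comparing exponents of R as exact rationals, with \(\beta \in [1/2,1]\)):

```python
from fractions import Fraction as F

def dominated(p, beta):
    """Check beta(2 + p/4) <= max(p beta/2, 2 - p/4 + p beta/2)."""
    lhs = beta * (2 + p / 4)
    return lhs <= max(p * beta / 2, 2 - p / 4 + p * beta / 2)

assert all(dominated(F(q, 2), F(b, 8))
           for q in range(4, 41)    # p = 2, 2.5, ..., 20
           for b in range(4, 9))    # beta = 1/2, 5/8, ..., 1
```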

Combining (25), (26) and plugging into (3), we obtain

$$\begin{aligned} \max \Big \{R^{-1},R^{-\frac{3}{2}+\frac{2}{p}}\Big \}\lessapprox C_{\beta ,p}(R) R^{-1-\beta } \Big (R^{\frac{\beta }{2}}+R^{\frac{2}{p}-\frac{1}{4}+\frac{\beta }{2}}+R^{\frac{1+2\beta }{p}}\Big ). \end{aligned}$$

Considering the three cases \(2\le p\le 4\), \(4\le p\le 8\) and \(p\ge 8\) shows that the right hand side of (4) is indeed a lower bound for \(C_{\beta ,p}(R)\) (up to an \(R^\epsilon \) factor).
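To see where the three ranges come from, one can track which of the three exponents on the right hand side of the last display dominates. The sketch below (our own check, in our notation) compares the exponents of R in \(R^{\frac{\beta }{2}}\), \(R^{\frac{2}{p}-\frac{1}{4}+\frac{\beta }{2}}\), and \(R^{\frac{1+2\beta }{p}}\) for a sample \(\beta >1/2\):

```python
from fractions import Fraction as F

def dominant_term(p, beta):
    """Index (0, 1, 2) of the largest exponent among beta/2,
    2/p - 1/4 + beta/2, and (1 + 2 beta)/p."""
    terms = (beta / 2, 2 / p - F(1, 4) + beta / 2, (1 + 2 * beta) / p)
    return max(range(3), key=lambda i: terms[i])

beta = F(3, 4)
assert dominant_term(F(3), beta) == 2    # 2 <= p < 4: third term wins
assert dominant_term(F(6), beta) == 1    # 4 < p < 8: second term wins
assert dominant_term(F(10), beta) == 0   # p > 8: first term wins
```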

3.2 Proof of Theorem 2

The difficult part of the proof is the range \(4\le p\le 8\). Recall from Remark 1.1.2 that we need to prove the estimate for all p in this range, not only at the endpoint, since no interpolation argument is available. The main tool we use is the amplitude dependent wave envelope estimate of Guth–Maldague [5]. Before giving the proof, we introduce some notation from [5, 6].

Recall \(\mathcal {C}\) is the truncated cone in \(\mathbb {R}^3\):

$$\begin{aligned}\mathcal {C}:=\{\xi \in \mathbb {R}^3:\xi _3=\sqrt{\xi _1^2+\xi _2^2},1/2\le \xi _3\le 1\}.\end{aligned}$$

We have the canonical partition of \(N_{R^{-1}}(\mathcal {C})\) into \(1\times R^{-1/2}\times R^{-1}\)-planks \(\Theta =\{\theta \}\):

$$\begin{aligned}N_{R^{-1}}(\mathcal {C})=\bigsqcup \theta .\end{aligned}$$

More generally, for any dyadic \(s\in [R^{-1/2},1]\), we can partition the \(s^2\)-neighborhood of \(\mathcal {C}\) into \(1\times s\times s^2\)-planks \({\textbf {S}}_s=\{\tau _s\}\):

$$\begin{aligned}N_{s^2}(\mathcal {C})=\bigsqcup \tau _s.\end{aligned}$$

Note in particular \({\textbf {S}}_{R^{-1/2}}=\Theta \). For each s and a frequency plank \(\tau _s\in {\textbf {S}}_s\), we define the box \(U_{\tau _s}\) in the physical space to be a rectangle centered at the origin of dimensions \(Rs^2\times Rs\times R\) whose edge of length \(Rs^2\) (respectively Rs, R) is parallel to the edge of \(\tau _s\) with length 1 (respectively s, \(s^2\)). Note that for any \(\theta \in \Theta \), \(U_\theta \) is just \(\theta ^*\) (the dual rectangle of \(\theta \)). Also, \(U_{\tau _s}\) is the convex hull of \(\cup _{\theta \subset \tau _s}U_{\theta }\).

We make a useful observation, which will be used later. For any \(\theta \subset \tau _s\), we see that \(\theta ^*\) is a \(1\times R^{1/2}\times R\)-plank. Define \(U_{\theta ,s}\) to be the \(Rs^2\times Rs\times R\)-plank which is made by dilating the corresponding edges of \(\theta ^*\). Our observation is that \(U_{\tau _s}\) and \(U_{\theta ,s}\) are comparable:

$$\begin{aligned} \frac{1}{C} U_{\theta ,s}\subset U_{\tau _s}\subset C U_{\theta ,s}. \end{aligned}$$
(29)

This is not hard to see by noting that the second longest edge of \(\theta ^*\) forms an angle \(\lesssim s\) with the \(Rs\times R\)-face of \(U_{\tau _s}\). We omit the proof.

We cover \(\mathbb {R}^3\) by translated copies of \(U_{\tau _s}\). We will use \(U\parallel U_{\tau _s}\) to indicate U is one of the translated copies. If \(U\parallel U_{\tau _s}\), then we define \(S_U f\) by

$$\begin{aligned} S_U f=\big (\sum _{\theta \subset \tau _s}|f_\theta |^2\big )^{1/2}\textbf{1}_U. \end{aligned}$$
(30)

We can think of \(S_U f\) as the wave envelope of f localized in U in the physical space and localized in \(\tau _s\) in the frequency space. We have the following inequality of Guth, Wang and Zhang (see [6, Theorem 1.5]):

Theorem 3

[Wave envelope estimate] Suppose \(supp \widehat{f}\subset N_{R^{-1}}(\mathcal {C})\). Then

$$\begin{aligned} \Vert f\Vert _{4}^4\le C_\epsilon R^{\epsilon } \sum _{R^{-1/2}\le s\le 1}\sum _{\tau _s\in {\textbf {S}}_s}\sum _{U\parallel U_{\tau _s}} |U|^{-1}\Vert S_Uf\Vert _2^4, \end{aligned}$$
(31)

for any \(\epsilon >0\).

There is a refined version of the wave envelope estimate proved by Guth and Maldague (See [5, Theorem 2]):

Theorem 4

[Amplitude dependent wave envelope estimate] Suppose \(supp \widehat{f}\subset N_{R^{-1}}(\mathcal {C})\). Then for any \(\alpha >0\),

$$\begin{aligned} \alpha ^4 |\{ x\in \mathbb {R}^3:|f(x)|>\alpha \}|\le C_\epsilon R^{\epsilon } \sum _{R^{-1/2}\le s\le 1}\sum _{\tau _s\in {\textbf {S}}_s}\sum _{U\in \mathcal {G}_{\tau _s}(\alpha )} |U|^{-1}\Vert S_Uf\Vert _2^4, \end{aligned}$$
(32)

for any \(\epsilon >0\). Here, \(\mathcal {G}_{\tau _s}(\alpha )=\Big \{ U\parallel U_{\tau _s}:|U|^{-1}\Vert S_U f\Vert _2^2\gtrsim |\log R|^{-1} \frac{\alpha ^2}{(\#{\textbf {S}}_s)^2} \Big \}\).

Remark

In the original paper [5], their definition for \(\mathcal {G}_{\tau _s}(\alpha )\) is

$$\begin{aligned}\mathcal {G}_{\tau _s}(\alpha )=\Big \{ U\parallel U_{\tau _s}:|U|^{-1}\Vert S_U f\Vert _2^2\gtrapprox \frac{\alpha ^2}{(\#\tau _s)^2} \Big \},\end{aligned}$$

where \(\#\tau _s=\#\{\tau _s\in {\textbf {S}}_s:f_{\tau _s}\not \equiv 0\}\). Noting that \(\#\tau _s\le \#{\textbf {S}}_s\), we see that our \(\mathcal {G}_{\tau _s}(\alpha )\) is a bigger set, and hence our (32) is weaker than the original version [5, Theorem 2].

Proof of Theorem 2

\(\boxed {\hbox {Case 1:} p\ge 8}\) This follows from the Cauchy–Schwarz inequality, since \(\#\Gamma _\beta (R^{-1})\sim R^{\beta }.\)

\(\boxed {\hbox {Case 2:} 2\le p\le 4}\)

We have (31). By dyadic pigeonholing on s, we can find s such that

$$\begin{aligned} \Vert f\Vert _4^4\lessapprox \sum _{\tau \in {\textbf {S}}_s}\sum _{U\parallel U_\tau } |U|^{-1}\Vert S_U f\Vert _2^4. \end{aligned}$$
(33)

We fix this s. Denote \({\textbf {U}}:=\{U: U\parallel U_\tau \text { for some } \tau \in {\textbf {S}}_s\}\). Then the inequality above can be written as

$$\begin{aligned} \Vert f\Vert _4^4\lessapprox \sum _{U\in {\textbf {U}}} |U|^{-1}\Vert S_U f\Vert _2^4. \end{aligned}$$
(34)

We remind readers that each \(U\in {\textbf {U}}\) has size \(Rs^2\times Rs\times R\). We also have the following \(L^2\) estimate:

$$\begin{aligned} \Vert f\Vert _2^2\sim \sum _{U\in {\textbf {U}}} \Vert S_U f\Vert ^2_2. \end{aligned}$$
(35)

We provide a quick proof for (35). We have

$$\begin{aligned} \Vert f\Vert _2^2=\sum _{\tau \in {\textbf {S}}_s}\Vert f_\tau \Vert _2^2\sim \sum _{\tau \in {\textbf {S}}_s}\sum _{U\parallel U_\tau }\Vert f_\tau \Vert _{L^2(U)}^2. \end{aligned}$$

Noting that \(\{f_\theta : \theta \subset \tau \}\) are locally orthogonal on any translation of \(U_\tau \) and recalling (30), we have

$$\begin{aligned} \Vert f\Vert _2^2\sim \sum _{\tau \in {\textbf {S}}_s}\sum _{U\parallel U_\tau }\int _U \sum _{\theta \subset \tau }|f_\theta |^2=\sum _{U\in {\textbf {U}}}\Vert S_U f\Vert _2^2. \end{aligned}$$

Next, we do dyadic pigeonholing on \(\Vert S_U f\Vert _2^2\). (Actually, we only need to prove a local version of the inequality, so we only care about those U that intersect \(B_R\); there are in total \(R^{O(1)}\) of them.) We can find a number \(W>0\) and a set \({\textbf {U}}'=\{U\in {\textbf {U}}: \Vert S_Uf\Vert _2^2\sim W\}\), so that

$$\begin{aligned} \Vert f\Vert _4^4&\lessapprox |U|^{-1}\#{\textbf {U}}' W^2,\end{aligned}$$
(36)
$$\begin{aligned} \Vert f\Vert _2^2&\approx \#{\textbf {U}}' W. \end{aligned}$$
(37)

Since every \(U\in {\textbf {U}}\) has the same measure \(R^3s^2\), there is no ambiguity in writing \(|U|^{-1}\) in (36).

Let \(\alpha \) be such that \(\frac{1}{p}=\frac{\alpha }{4}+\frac{1-\alpha }{2}\). Then \(\alpha =4(\frac{1}{2}-\frac{1}{p})\). Applying Hölder’s inequality gives

$$\begin{aligned} \Vert f\Vert _p^p\le \Vert f\Vert _4^{\alpha p} \Vert f\Vert _2^{(1-\alpha )p}\lessapprox |U|^{-p(\frac{1}{2}-\frac{1}{p})}\#{\textbf {U}}' W^{\frac{p}{2}}\le |U|^{-p(\frac{1}{2}-\frac{1}{p})}\sum _{U\in {\textbf {U}}} \Vert S_U f\Vert _2^p. \end{aligned}$$
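The bookkeeping in the last display can be checked with exact arithmetic. The following sanity check (in our notation, not part of the proof) verifies that with \(\alpha =4(\frac{1}{2}-\frac{1}{p})\), substituting (36) and (37) produces \(\#{\textbf {U}}'\) to the power 1 and W to the power \(\frac{p}{2}\):

```python
from fractions import Fraction as F

def interpolation_exponents(p):
    """Exponents of #U' and W after substituting (36), (37) into
    ||f||_4^{alpha p} ||f||_2^{(1-alpha) p}."""
    alpha = 4 * (F(1, 2) - 1 / p)
    exp_U = alpha * p / 4 + (1 - alpha) * p / 2   # from (36)^{alpha p/4} and (37)^{(1-alpha)p/2}
    exp_W = alpha * p / 2 + (1 - alpha) * p / 2
    return exp_U, exp_W

for p in (F(2), F(5, 2), F(3), F(7, 2), F(4)):
    exp_U, exp_W = interpolation_exponents(p)
    assert exp_U == 1 and exp_W == p / 2
```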

Next we are going to exploit more orthogonality for \(S_U f\). Suppose \(U\parallel U_\tau \). By definition

$$\begin{aligned} \Vert S_U f\Vert _2^2=\int _U \sum _{\theta \subset \tau }|f_\theta |^2=\int _U \sum _{\theta \subset \tau }\Big |\sum _{\gamma \subset \theta }f_\gamma \Big |^2. \end{aligned}$$

We remind readers that \(\{\tau \}\) are \(1\times s\times s^2\)-caps; \(\{\theta \}\) are \(1\times R^{-1/2}\times R^{-1}\)-caps; \(\{\gamma \}\) are \(1\times R^{-\beta }\times R^{-1}\)-caps. Since U is too small for \(\{f_\gamma :\gamma \subset \theta \}\) to be orthogonal on U, we need to find a larger rectangle. First, let us look at the rectangles \(\{\gamma : \gamma \subset \theta \}\). We want to find a rectangle \(\nu _\theta \), as big as possible, such that \(\{\gamma +\nu _\theta : \gamma \subset \theta \}\) are finitely overlapping. Actually, we can choose \(\nu _\theta \) to be of size \(R^{1/2-\beta }\times R^{-\beta }\times R^{-1}\), where the edge of \(\nu _\theta \) with length \(R^{1/2-\beta }\) (respectively \(R^{-\beta }\), \(R^{-1}\)) is parallel to the edge of \(\theta \) with length 1 (respectively \(R^{-1/2}\), \(R^{-1}\)). See Fig. 6: on the left hand side are \(\theta \) and \(\{\gamma :\gamma \subset \theta \}\); on the right hand side is our \(\nu _\theta \). It is not hard to see that \(\{\gamma +\nu _\theta : \gamma \subset \theta \}\) are finitely overlapping. Let \(\nu _\theta ^*\) be the dual of \(\nu _\theta \) in the physical space; then \(\nu _\theta ^*\) has size \(R^{\beta -\frac{1}{2}}\times R^\beta \times R\), and we have the local orthogonality (we ignore the rapidly decaying tails for simplicity):

$$\begin{aligned} \int _{\nu _\theta ^*} \Big |\sum _{\gamma \subset \theta } f_\gamma \Big |^2 \sim \int _{\nu _\theta ^*} \sum _{\gamma \subset \theta } |f_\gamma |^2 \end{aligned}$$
Fig. 6. Small caps

Define

$$\begin{aligned} V_\theta =U_\tau +\nu _\theta ^*, \end{aligned}$$
(38)

which is a rectangle of size

$$\begin{aligned}\max \{Rs^2,R^{\beta -\frac{1}{2}}\}\times \max \{Rs,R^\beta \}\times R.\end{aligned}$$

We tile \(\mathbb {R}^3\) with translated copies of \(V_\theta \), and we write \(V\parallel V_\theta \) if V is one of the tiles. Noting that \(\frac{R^{\beta -\frac{1}{2}}}{Rs^2}\le \frac{R^\beta }{Rs}\), we will discuss three scenarios: 1. \(\frac{R^{\beta -\frac{1}{2}}}{Rs^2}\le \frac{R^\beta }{Rs}\le 1\); 2. \(\frac{R^{\beta -\frac{1}{2}}}{Rs^2}\le 1\le \frac{R^\beta }{Rs}\); 3. \(1\le \frac{R^{\beta -\frac{1}{2}}}{Rs^2}\le \frac{R^\beta }{Rs}\).

  • If \(\frac{R^{\beta -\frac{1}{2}}}{Rs^2}\le \frac{R^\beta }{Rs}\le 1\), then \(V_\theta \) is essentially \(U_\tau \). In this case, we already have the orthogonality of \(\{f_\gamma : \gamma \subset \theta \}\) on \(U(\parallel U_\tau )\). Therefore,

    $$\begin{aligned} \Vert f\Vert _p^p&\lessapprox |U|^{-p(\frac{1}{2}-\frac{1}{p})}\sum _{\tau \in {\textbf {S}}_s}\sum _{U\parallel U_\tau }\bigg (\int _U \sum _{\theta \subset \tau }|\sum _{\gamma \subset \theta }f_\gamma |^2\bigg )^{\frac{p}{2}}\\&\sim |U|^{-p(\frac{1}{2}-\frac{1}{p})}\sum _{\tau \in {\textbf {S}}_s}\sum _{U\parallel U_\tau }\bigg (\int _U \sum _{\gamma \subset \tau }|f_\gamma |^2\bigg )^{\frac{p}{2}}\\&\le \sum _{\tau \in {\textbf {S}}_s}\sum _{U\parallel U_\tau }\int _U\Big (\sum _{\gamma \subset \tau }|f_\gamma |^2\Big )^{\frac{p}{2}}\\&=\sum _{\tau \in {\textbf {S}}_s}\int _{\mathbb {R}^3}\Big (\sum _{\gamma \subset \tau }|f_\gamma |^2\Big )^{\frac{p}{2}}\\&\le \int _{\mathbb {R}^3}\Big (\sum _{\gamma \in \Gamma _{\beta }(R^{-1})}|f_\gamma |^2\Big )^{p/2}. \end{aligned}$$
  • In the other two scenarios, we proceed as follows.

    $$\begin{aligned} \Vert f\Vert _p^p&\lessapprox |U|^{-p(\frac{1}{2}-\frac{1}{p})}\sum _{\tau \in {\textbf {S}}_s} \sum _{U\parallel U_\tau } \bigg (\int _U \sum _{\theta \subset \tau }|f_\theta |^2\bigg )^{p/2}\\&\le |U|^{-p(\frac{1}{2}-\frac{1}{p})}\sum _{\tau \in {\textbf {S}}_s} \sum _{U\parallel U_\tau } \#\{\theta \subset \tau \}^{\frac{p}{2}-1}\sum _{\theta \subset \tau }\bigg (\int _U |f_\theta |^2\bigg )^{p/2}\\&\le |U|^{-p(\frac{1}{2}-\frac{1}{p})}\#\{\theta \subset \tau \}^{\frac{p}{2}-1}\sum _{\tau \in {\textbf {S}}_s} \sum _{\theta \subset \tau } \sum _{V\parallel V_\theta } \bigg (\int _V |f_\theta |^2\bigg )^{p/2}\\ (\text {By~orthogonality})&\sim |U|^{-p(\frac{1}{2}-\frac{1}{p})}\#\{\theta \subset \tau \}^{\frac{p}{2}-1}\sum _{\tau \in {\textbf {S}}_s} \sum _{\theta \subset \tau } \sum _{V\parallel V_\theta } \bigg (\int _V \sum _{\gamma \subset \theta }|f_\gamma |^2\bigg )^{p/2}\\ (\text {H}\ddot{\textrm{o}}\text {lder})&\le |U|^{-p(\frac{1}{2}-\frac{1}{p})}\#\{\theta \subset \tau \}^{\frac{p}{2}-1}\sum _{\tau \in {\textbf {S}}_s}\\&\qquad \sum _{\theta \subset \tau } \sum _{V\parallel V_\theta } |V|^{p(\frac{1}{2}-\frac{1}{p})}\int _V \Big (\sum _{\gamma \subset \theta }|f_\gamma |^2\Big )^{p/2}\\&\le \bigg (\frac{|V|}{|U|}\bigg )^{p(\frac{1}{2}-\frac{1}{p})}\#\{\theta \subset \tau \}^{\frac{p}{2}-1} \Big \Vert \Big (\sum _{\gamma \in \Gamma _\beta (R^{-1})}|f_\gamma |^2\Big )^{\frac{1}{2}}\Big \Vert _p^p\\&= \bigg (\max \Big \{\frac{R^\beta }{Rs},1\Big \}\max \Big \{\frac{R^{\beta -\frac{1}{2}}}{Rs^2},1\Big \}\bigg )^{p(\frac{1}{2}-\frac{1}{p})} (s R^{\frac{1}{2}})^{\frac{p}{2}-1}\\&\qquad \Big \Vert \Big (\sum _{\gamma \in \Gamma _\beta (R^{-1})}|f_\gamma |^2\Big )^{\frac{1}{2}}\Big \Vert _p^p. \end{aligned}$$

We just need to check

$$\begin{aligned} \bigg (\max \Big \{\frac{R^\beta }{Rs},1\Big \}\max \Big \{\frac{R^{\beta -\frac{1}{2}}}{Rs^2},1\Big \}\bigg )^{p(\frac{1}{2}-\frac{1}{p})} (s R^{\frac{1}{2}})^{\frac{p}{2}-1}\lesssim R^{(\beta -\frac{1}{2})(p-2)}. \end{aligned}$$
(39)

\(*\) If \(\frac{R^{\beta -\frac{1}{2}}}{Rs^2}\le 1\le \frac{R^\beta }{Rs}\), then the left hand side of (39) equals \(R^{(\beta -\frac{1}{2})(\frac{p}{2}-1)}\), which is \(\le \) the right hand side of (39).

\(*\) If \(1\le \frac{R^{\beta -\frac{1}{2}}}{Rs^2}\le \frac{R^\beta }{Rs}\), then the left hand side of (39) equals

$$\begin{aligned} (R^{2\beta -2}s^{-2})^{\frac{p}{2}-1}, \end{aligned}$$

which is less than the right hand side of (39) since \(s^{-1}\le R^{1/2}\).
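Both branches of (39) can be confirmed on a grid of exponents. The sketch below (our own check, not part of the proof) writes \(s=R^{\sigma }\) with \(-1/2\le \sigma \le 0\) and verifies that, whenever \(R^{\beta }/(Rs)\ge 1\) (the two scenarios in question), the exponent of R on the left of (39) is at most \((\beta -\frac{1}{2})(p-2)\):

```python
from fractions import Fraction as F

def check_39(p, beta, sigma):
    """Exponent comparison for (39), with s = R^sigma."""
    t1 = max(beta - 1 - sigma, F(0))            # log_R max{R^beta/(Rs), 1}
    t2 = max(beta - F(3, 2) - 2 * sigma, F(0))  # log_R max{R^{beta-1/2}/(Rs^2), 1}
    lhs = p * (F(1, 2) - 1 / p) * (t1 + t2) + (p / 2 - 1) * (sigma + F(1, 2))
    rhs = (beta - F(1, 2)) * (p - 2)
    return lhs <= rhs

assert all(check_39(F(q, 4), F(b, 8), F(-m, 8))
           for q in range(8, 17)              # p = 2, 2.25, ..., 4
           for b in range(4, 9)               # beta = 1/2, ..., 1
           for m in range(0, 5)               # sigma = 0, -1/8, ..., -1/2
           if F(b, 8) - 1 - F(-m, 8) >= 0)    # restrict to R^beta/(Rs) >= 1
```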

\(\boxed {\hbox {Case 3:} 4\le p\le 8}\)

Note that

$$\begin{aligned} \Vert f\Vert _p^p\sim \sum _{\alpha \text {~dyadic}}\alpha ^p |\{x\in \mathbb {R}^3:|f(x)|\sim \alpha \}|. \end{aligned}$$

We may assume \(R^{-100}\Vert f\Vert _\infty \le \alpha \le \Vert f\Vert _\infty \); the contribution from the other \(\alpha \) is negligible.

By dyadic pigeonholing, we can find \(\alpha >0\) such that

$$\begin{aligned} \Vert f\Vert _p^p\lesssim (\log R)\cdot \alpha ^p |\{x\in \mathbb {R}^3:|f(x)|\sim \alpha \}|+\text {negligible term}. \end{aligned}$$

We just need to fix this \(\alpha \), and prove an upper bound for \(\alpha ^p |\{x\in \mathbb {R}^3:|f(x)|> \alpha \}|\). By (32), we have

$$\begin{aligned} \alpha ^4 |\{ x\in \mathbb {R}^3:|f(x)|>\alpha \}|\le C_\epsilon R^{\epsilon } \sum _{R^{-1/2}\le s\le 1}\sum _{\tau _s\in {\textbf {S}}_s}\sum _{U\in \mathcal {G}_{\tau _s}(\alpha )} |U|^{-1}\Vert S_Uf\Vert _2^4.\end{aligned}$$

By pigeonholing again, we can find s such that

$$\begin{aligned} \alpha ^4 |\{ x\in \mathbb {R}^3:|f(x)|>\alpha \}|\lessapprox \sum _{\tau \in {\textbf {S}}_s}\sum _{U\in \mathcal {G}_{\tau }(\alpha )} |U|^{-1}\Vert S_Uf\Vert _2^4. \end{aligned}$$
(40)

We fix this s. We also remind readers the definition of \(\mathcal {G}_\tau (\alpha )\):

$$\begin{aligned} \mathcal {G}_\tau (\alpha ):=\{ U\parallel U_\tau : |U|^{-1}\int _U\sum _{\theta \subset \tau }|f_\theta |^2\gtrapprox (\alpha s)^2 \}, \end{aligned}$$

since \(\#{\textbf {S}}_s\sim s^{-1}\). Continuing the estimate in (40), we have

$$\begin{aligned} \alpha ^4 |\{ x\in \mathbb {R}^3:|f(x)|>\alpha \}|&\lessapprox \sum _{\tau \in {\textbf {S}}_s}\sum _{U\in \mathcal {G}_{\tau }(\alpha )} |U|^{-1}\bigg (\int _{U} \sum _{\theta \subset \tau }|f_\theta |^2\bigg )^2\\&\lessapprox \sum _{\tau \in {\textbf {S}}_s}\sum _{U\in \mathcal {G}_{\tau }(\alpha )} |U|^{-1}\bigg (\int _{U} \sum _{\theta \subset \tau }|f_\theta |^2\bigg )^{\frac{p}{2}}\bigg ( |U|(\alpha s)^2 \bigg )^{2-\frac{p}{2}}. \end{aligned}$$

Moving the power of \(\alpha \) to the left hand side, we obtain

$$\begin{aligned} \alpha ^p |\{ x\in \mathbb {R}^3:|f(x)|>\alpha \}|\lessapprox \sum _{\tau \in {\textbf {S}}_s}\sum _{U\in \mathcal {G}_{\tau }(\alpha )} |U|^{1-\frac{p}{2}}\bigg (\int _{U} \sum _{\theta \subset \tau }|f_\theta |^2\bigg )^{\frac{p}{2}}s^{4-p}. \end{aligned}$$
(41)

Our final goal is to prove that the right hand side above is

$$\begin{aligned} \lessapprox R^{\frac{\beta p}{2}+\frac{p}{4}-2}\Big \Vert \Big (\sum _\gamma |f_\gamma |^2\Big )^{1/2}\Big \Vert _p^p. \end{aligned}$$
(42)

To do that, we again need to exploit the orthogonality of \(\{f_\gamma :\gamma \subset \theta \}\). The argument is different from that in \(\boxed {\hbox {Case 2:} 2\le p\le 4}\): there, we expanded the integration domain U to a bigger rectangle V to get orthogonality, whereas here we use the Cauchy–Schwarz inequality.

We discuss the geometry of these caps. Fix a \(\tau \in {\textbf {S}}_s\). By definition, \(U_\tau \) is an \(Rs^2\times Rs\times R\)-rectangle in the physical space. Then \(U_\tau ^*\) is an \(R^{-1}s^{-2}\times R^{-1}s^{-1}\times R^{-1}\)-rectangle. We make the following observation: for each \(\theta \subset \tau \), \(U^*_\tau \) is comparable to another rectangle of the same size whose edges are parallel to the corresponding edges of \(\theta \). Let us explain in more detail. Let \(U_{\theta ,s}\) be the \(Rs^2\times Rs\times R\)-rectangle obtained from the \(1\times R^{1/2}\times R\)-rectangle \(\theta ^*\) by dilating the corresponding edges. Then \(U_{\theta ,s}^*\) is an \(R^{-1}s^{-2}\times R^{-1}s^{-1}\times R^{-1}\)-rectangle whose edges are parallel to the corresponding edges of the \(1\times R^{-1/2}\times R^{-1}\)-rectangle \(\theta \). We want to show that \(U_\tau ^*\) and \(U_{\theta ,s}^*\) are comparable. This is equivalent to showing that \(U_\tau \) and \(U_{\theta ,s}\) are comparable, which is the observation we made in (29). Therefore, for any \(\theta \subset \tau \), we may assume the edges of \(U_\tau ^*\) are parallel to the corresponding edges of \(\theta \).

Fig. 7. Small caps

Fix a \(U\parallel U_\tau \); then \(U^*=U_\tau ^*\). See Fig. 7: on the left are \(\theta \) and \(\{\gamma : \gamma \subset \theta \}\); in the middle is our \(U^*\). We will discuss two scenarios depending on whether \(R^{-\beta }\) (the width of \(\gamma \)) is bigger than \(R^{-1}s^{-1}\) (the width of \(U^*\)).

  • If \(R^{-\beta }\ge R^{-1}s^{-1}\), then we see that \(\{ \gamma +U^*: \gamma \subset \theta \}\) are finitely overlapping. This means that \(\{f_\gamma : \gamma \subset \theta \}\) are locally orthogonal on U:

    $$\begin{aligned} \int _U \Big |\sum _{\gamma \subset \theta }f_\gamma \Big |^2\lesssim \int _U \sum _{\gamma \subset \theta }|f_\gamma |^2. \end{aligned}$$

    Therefore,

    $$\begin{aligned} \Vert f\Vert _p^p&\lessapprox |U|^{1-\frac{p}{2}}\sum _{\tau \in {\textbf {S}}_s}\sum _{U\parallel U_\tau }\bigg (\int _U \sum _{\theta \subset \tau }|\sum _{\gamma \subset \theta }f_\gamma |^2\bigg )^{\frac{p}{2}}s^{4-p}\\&\lesssim |U|^{1-\frac{p}{2}}\sum _{\tau \in {\textbf {S}}_s}\sum _{U\parallel U_\tau }\bigg (\int _U \sum _{\gamma \subset \tau }|f_\gamma |^2\bigg )^{\frac{p}{2}}s^{4-p}\\ (\text {H}\ddot{\textrm{o}}\text {lder})&\le s^{4-p}\sum _{\tau \in {\textbf {S}}_s}\sum _{U\parallel U_\tau }\int _U\Big (\sum _{\gamma \subset \tau }|f_\gamma |^2\Big )^{\frac{p}{2}}\\&=s^{4-p}\sum _{\tau \in {\textbf {S}}_s}\int _{\mathbb {R}^3}\Big (\sum _{\gamma \subset \tau }|f_\gamma |^2\Big )^{\frac{p}{2}}\\&\le s^{4-p}\int _{\mathbb {R}^3}\Big (\sum _{\gamma \in \Gamma _{\beta }(R^{-1})}|f_\gamma |^2\Big )^{p/2}. \end{aligned}$$

    We just need to check

    $$\begin{aligned} s^{4-p}\le R^{\frac{\beta p}{2}+\frac{p}{4}-2}.\end{aligned}$$

    Plugging in \(s^{-1}\le R^{1/2}\), the inequality above reduces to

    $$\begin{aligned} R^{p/4}\le R^{\frac{\beta p}{2}}, \end{aligned}$$

    which is true since \(\beta \ge 1/2\).

  • If \(R^{-\beta }\le R^{-1}s^{-1}\), we define a set of new planks, which we call \(\pi \). See the right hand side of Fig. 7. We partition \(\theta \) into a set of \(1\times R^{-1}s^{-1}\times R^{-1}\)-planks, denoted by \(\{\pi : \pi \subset \theta \}\). If the partition is well chosen (the sizes of the caps may vary within a constant multiple), we can assume each \(\gamma \) fits into one \(\pi \), so we define

    $$\begin{aligned} f_\pi :=\sum _{\gamma \subset \pi }f_\gamma . \end{aligned}$$

    Now, our key observation is that \(\{\pi +U^*: \pi \subset \theta \}\) are finitely overlapping. This is true because the widths of \(U^*\) and \(\pi \) are both \(R^{-1}s^{-1}\), the angle between the longest edges of \(\pi \) and \(U^*\) is less than \(R^{-1/2}\), and \(R^{-1}s^{-2}\cdot R^{-1/2}\le R^{-1}s^{-1}\). Therefore, \(\{f_\pi : \pi \subset \theta \}\) are locally orthogonal on U, i.e.,

    $$\begin{aligned} \int _U \Big |\sum _{\pi \subset \theta }f_\pi \Big |^2\lesssim \int _U \sum _{\pi \subset \theta }|f_\pi |^2. \end{aligned}$$
    (43)

    Another step of Cauchy–Schwarz will give

    $$\begin{aligned} \int _U \sum _{\pi \subset \theta }|f_\pi |^2&=\int _U \sum _{\pi \subset \theta }\Big |\sum _{\gamma \subset \pi }f_\gamma \Big |^2\le \#\{\gamma \subset \pi \} \int _U \sum _{\gamma \subset \theta }|f_\gamma |^2\nonumber \\&=R^\beta R^{-1}s^{-1} \int _U \sum _{\gamma \subset \theta }|f_\gamma |^2. \end{aligned}$$
    (44)

    As a result, we obtain

    $$\begin{aligned} \int _U |f_\theta |^2\lesssim R^\beta R^{-1}s^{-1}\int _U \sum _{\gamma \subset \theta }|f_\gamma |^2. \end{aligned}$$

    Summing over \(\theta \subset \tau \), we obtain

    $$\begin{aligned} \int _U \sum _{\theta \subset \tau }|f_\theta |^2\lesssim R^\beta R^{-1}s^{-1}\int _U \sum _{\gamma \subset \tau }|f_\gamma |^2. \end{aligned}$$

    Therefore,

    $$\begin{aligned} \Vert f\Vert _p^p&\lessapprox |U|^{1-\frac{p}{2}}\sum _{\tau \in {\textbf {S}}_s} \sum _{U\parallel U_\tau } \bigg (\int _U \sum _{\theta \subset \tau }|f_\theta |^2\bigg )^{p/2}s^{4-p}\\&\lesssim |U|^{1-\frac{p}{2}}(R^\beta R^{-1}s^{-1})^{\frac{p}{2}} \sum _{\tau \in {\textbf {S}}_s} \sum _{U\parallel U_\tau } \bigg (\int _U \sum _{\gamma \subset \tau }|f_\gamma |^2\bigg )^{p/2}s^{4-p}\\ (\text {H}\ddot{\textrm{o}}\text {lder})&\le s^{4-p}(R^\beta R^{-1}s^{-1})^{\frac{p}{2}}\sum _{\tau \in {\textbf {S}}_s} \sum _{U\parallel U_\tau } \int _U \Big (\sum _{\gamma \subset \tau }|f_\gamma |^2\Big )^{p/2}\\&\le s^{4-p}(R^\beta R^{-1}s^{-1})^{\frac{p}{2}}\Big \Vert \Big (\sum _{\gamma \in \Gamma _\beta (R^{-1})}|f_\gamma |^2\Big )^{\frac{1}{2}}\Big \Vert _p^p. \end{aligned}$$

We just need to check

$$\begin{aligned}s^{4-p}(R^\beta R^{-1}s^{-1})^{\frac{p}{2}}\le R^{\frac{\beta p}{2}+\frac{p}{4}-2},\end{aligned}$$

which is equivalent to

$$\begin{aligned} s^{4-\frac{3p}{2}}\le R^{\frac{3p}{4}-2}. \end{aligned}$$

Plugging \(s^{-1}\le R^{1/2}\) and noting that \(4-\frac{3p}{2}<0\), we prove the result. \(\square \)
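The two final exponent checks in this section, \(s^{4-p}\le R^{\frac{\beta p}{2}+\frac{p}{4}-2}\) and \(s^{4-\frac{3p}{2}}\le R^{\frac{3p}{4}-2}\), can be confirmed the same way (our own sanity check, writing \(s=R^{\sigma }\) with \(-1/2\le \sigma \le 0\), \(\beta \ge 1/2\), and \(4\le p\le 8\)):

```python
from fractions import Fraction as F

def final_checks(p, beta, sigma):
    """Exponent comparisons for s^{4-p} <= R^{beta p/2 + p/4 - 2}
    and s^{4-3p/2} <= R^{3p/4 - 2}, with s = R^sigma."""
    first = sigma * (4 - p) <= beta * p / 2 + p / 4 - 2
    second = sigma * (4 - F(3, 2) * p) <= F(3, 4) * p - 2
    return first and second

assert all(final_checks(F(q, 2), F(b, 8), F(-m, 8))
           for q in range(8, 17)    # p = 4, 4.5, ..., 8
           for b in range(4, 9)     # beta = 1/2, ..., 1
           for m in range(0, 5))    # sigma = 0, -1/8, ..., -1/2
```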