1 Introduction and Statement of Results

Let X and Y be two Banach spaces and let T be a linear operator from X into Y. Beyond the immediate questions of whether T is bounded and what its operator norm is, it is natural to investigate the operator in more detail, both qualitatively and quantitatively: the existence of extremizers, the properties of extremizers, and the behavior of extremizing sequences and near extremizers.

This kind of detailed study of a bounded operator has a long and rich history. One of the most celebrated examples is the work of Beckner [1], where he studied the existence of extremizers for the Hausdorff-Young inequality on \(\mathbb {R}^d\). Recently there has been a series of works on the above questions for different operators and inequalities, such as the Stein-Tomas inequality [8, 9], an improved Hausdorff-Young inequality [5], convolution with the surface measure on the paraboloid [3, 10], k-plane transforms [13, 15], and convolution with the surface measure on the sphere [24], to name a few. One motivation for investigating such questions is to produce improved inequalities, and thus inverse results concerning the stability of the inequality near an extremizer (see [5], for example); another is to make qualitative studies of PDEs [25].

In this paper we investigate the above questions for a generalized Radon transform along the moment curve. Let T be the linear operator which acts on the continuous functions on \(\mathbb {R}^d\) by convolution with affine arc length measure on the moment curve \((t, t^2,\ldots , t^d)\). That is, for a continuous function f on \(\mathbb {R}^d\), Tf is defined by

$$\begin{aligned} Tf(x)= \int _{\mathbb {R}} f(x+(t, t^{2},\ldots , t^{d})) dt. \end{aligned}$$
(1.1)

Theorem 1.1

(Christ, Littman, Oberlin, Stovall) T maps \(L^{p}(\mathbb {R}^d)\) into \(L^{q}(\mathbb {R}^d)\) as a bounded operator with \(1\le p, q\le \infty \) if and only if

$$\begin{aligned} \frac{1}{p}=\frac{1}{p_\theta }=\frac{1-\theta }{p_0}+\frac{\theta }{p_1}\qquad \text {and}\qquad \frac{1}{q}=\frac{1}{q_\theta }=\frac{1-\theta }{q_0}+\frac{\theta }{q_1} \end{aligned}$$
(1.2)

for some \(\theta \in [0,1]\) where \(p_{0}=\frac{d+1}{2}\), \(q_{0}=\frac{d(d+1)}{2(d-1)}\) and \(p_{1}=q_{0}', q_{1}=p_{0}'\).
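As a quick sanity check on the exponents in Theorem 1.1, the following Python sketch evaluates \(p_0, q_0, p_1, q_1\) and the interpolated pair of (1.2) in exact rational arithmetic. The function names are ours and purely illustrative; the only convention assumed is the usual Hölder conjugate \(s'=s/(s-1)\) for \(1<s<\infty \).

```python
from fractions import Fraction


def endpoint_exponents(d):
    """Return (p0, q0, p1, q1) from Theorem 1.1 as exact fractions."""
    p0 = Fraction(d + 1, 2)
    q0 = Fraction(d * (d + 1), 2 * (d - 1))

    def conj(s):
        # Hoelder conjugate s' = s / (s - 1), valid for 1 < s < infinity.
        return s / (s - 1)

    return p0, q0, conj(q0), conj(p0)


def interpolated_exponents(d, theta):
    """Return (p_theta, q_theta) defined by (1.2); theta should be a Fraction."""
    p0, q0, p1, q1 = endpoint_exponents(d)
    inv_p = (1 - theta) / p0 + theta / p1
    inv_q = (1 - theta) / q0 + theta / q1
    return 1 / inv_p, 1 / inv_q


for d in (2, 3, 4):
    print(d, endpoint_exponents(d), interpolated_exponents(d, Fraction(1, 2)))
```

For \(d=2\) the two endpoints coincide at \((p,q)=(3/2,3)\), so the segment in (1.2) reduces to a single point, while for \(d=3\) the endpoints are \((2,3)\) and \((3/2,2)\).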

The above theorem was proved for \(d=2\) by Littman in [21]. Oberlin [22] showed that the above condition is sufficient when \(d=3\). The theorem was proved up to the endpoints for any dimension by Christ in [2]. Extending ideas from [2], Stovall proved the strong type endpoint bound [23].

Our methods are based on several observations and results of Christ (see [2,3,4, 23]), such as the method of refinement, which yields a quantitative restricted weak type inequality, and an almost orthogonality argument using a distance-like function between two near extremizers, together with some refinements by Dendrinos and Stovall [12] (using the increasing structure in the method of refinement). With these methods we improve the associated inequality for this operator for \(\theta \in (0,1)\): we establish the existence of functions that optimize the inequality, and we obtain qualitative information about functions that nearly optimize it. At the endpoints, i.e. for \(\theta =0,1\), we apply a scaling that is not adapted to the moment curve to establish the existence of functions that optimize the inequality corresponding to the bounds of an X-ray transform. To state our results more precisely, we first need to introduce some definitions.

Let (p, q) be as above. Let \(A_p\) be the operator norm of T. That is,

$$\begin{aligned} A_p= \sup _{\Vert f\Vert _{L^p}=1} \Vert Tf\Vert _{L^q}. \end{aligned}$$

Note that although \(A_p\) depends on p, we shall write it simply as A when it is clear what value of p is under consideration.

Definition 1.2

Extremizer. Let \(f\in L^p\). We say that f is an extremizer if

$$\begin{aligned} \Vert Tf\Vert _{L^q}= A \Vert f\Vert _{L^p}\ne 0. \end{aligned}$$
(1.3)

Definition 1.3

\(\delta \)-Quasiextremizer. For any \(\delta >0\), \(f\in L^p\) is a \(\delta \)-quasiextremizer if

$$\begin{aligned} \Vert Tf\Vert _{L^q}\ge \delta \Vert f\Vert _{L^p}\ne 0. \end{aligned}$$
(1.4)

Definition 1.4

\(\delta \)-Quasiextremizer pair. Let \(\delta >0\). We say that an ordered pair (f, g) of measurable functions on \(\mathbb {R}^d\) is a \(\delta \)-quasiextremizer pair if

$$\begin{aligned} \langle T(f), g \rangle \ge \delta \Vert g\Vert _{L^{q'}} \Vert f\Vert _{L^{p}}\ne 0. \end{aligned}$$

Definition 1.5

Extremizing sequence. An extremizing sequence is any sequence \(f_n\in L^{p}\) such that

$$\begin{aligned} \Vert f_n\Vert _{L^p}=1, \end{aligned}$$
$$\begin{aligned} \Vert Tf_n\Vert _{L^q}\rightarrow A. \end{aligned}$$

Let us denote the moment curve by \(h(t)=(t,t^2,\ldots ,t^d)\). Our main theorems are as follows.

Theorem 1.6

For every \(\theta \in (0,1)\) and \(d\ge 2\) there exists an extremizer, in the sense of (1.3), for \(T:L^{p_{\theta }}\rightarrow L^{q_{\theta }}\). Furthermore, for any nonnegative extremizing sequence \(\{f_{n}\}\) there exists a sequence of symmetries \(\{\phi ^{*}_{n}\}\) (induced by diffeomorphisms of \(\mathbb {R}^d\) and preserving the \(L^p\) norm) such that some subsequence of \(\{\phi ^{*}_n(f_n)\}\) converges in \(L^{p_\theta }\) to an extremizer for \(T:L^{p_{\theta }}\rightarrow L^{q_{\theta }}\).

To state the second theorem we need a few more definitions.

Let X be the X-ray transform restricted to directions along the moment curve defined on continuous functions on \(\mathbb {R}^d=\mathbb {R}\times \mathbb {R}^{d-1}\) by

$$\begin{aligned} Xf(t,y)=\int _{\mathbb {R}} f(s,y+s(2t,3t^2,\ldots ,dt^{d-1})) \,ds. \end{aligned}$$
(1.5)

Theorem 1.7

(Christ and Erdogan [6], Dendrinos and Stovall [11], Erdogan [14], Laghi [19]) X maps \(L^p\) into \(L^{q}(L^{r}, dt)\) if for some \(\theta \in [0,1)\)

$$\begin{aligned} \Big (\frac{1}{p}, \frac{1}{q}, \frac{1}{r}\Big )=\Big (\frac{1}{p_\theta }, \frac{1}{q_\theta }, \frac{1}{r_\theta }\Big )=\Big (1-\theta +\frac{\theta d}{d+2},\,\,\frac{\theta \,d}{d+2},\,\,1-\theta +\frac{\theta (d^2-d-2)}{d^2+d-2}\Big ) \end{aligned}$$

and the restricted weak type bound holds for X at the end point i.e., for \(\theta =1\).

The X-ray transform has been studied by many authors for its connection to many other parts of mathematics. It was first studied by Gelfand in [16]. There has been a lot of work done to investigate the boundedness properties of the X-ray like transforms, such as [17] and [18], to name a few.

It has been proved by Michael Christ that extremizers of T exist in the case \(d=2\); they have been identified and shown to be unique up to symmetries. Although it is still not known whether, for \(d>2\), there exists an extremizer for \(T:L^{\frac{d+1}{2}}\rightarrow L^{\frac{d(d+1)}{2(d-1)}}\) (corresponding to \(\theta =0\) in Theorem 1.1), we have been able to prove the following.

Theorem 1.8

Let T be defined as in (1.1) and \(d>2\).

  • Every extremizing sequence for \(T:L^{p_0}\rightarrow L^{q_0}\) has a subsequence that either converges modulo symmetries of T to an extremizer for \(T:L^{p_0}\rightarrow L^{q_0}\), or that converges modulo the nonsymmetry, \(f_n\rightarrow r^{\frac{2(d-1)}{d+1}}_nf_n((0,r_{n}x')+h(x_1))\), to an extremizer for \(X^{*}:L^{p_0}\rightarrow L^{q_0}\) corresponding to \(\theta =\frac{(d+2)(d-1)}{d^2+d}\) in Theorem 1.7.

  • Likewise, every extremizing sequence for \(T:L^{p_1}\rightarrow L^{q_1}\) has a subsequence that either converges modulo symmetries of T to an extremizer for \(T:L^{p_1}\rightarrow L^{q_1}\), or that converges modulo nonsymmetry, \(f_n\rightarrow r^{\frac{d^2-d+2}{d+1}}_nf_n(r_{n}x)\), to an extremizer for \(X:L^{p_1}\rightarrow L^{q_1}\).

  • \(\Vert T\Vert _{L^{p_0}\rightarrow L^{q_0}}\ge \Vert X^{*}\Vert _{L^{p_0}\rightarrow L^{q_0}}\) and if there exists an extremizing sequence for \(T:L^{p_0}\rightarrow L^{q_0}\) that does not have a subsequence converging to an extremizer modulo symmetries of T, then \(\Vert T\Vert _{L^{p_0}\rightarrow L^{q_0}}=\Vert X^{*}\Vert _{L^{p_0}\rightarrow L^{q_0}}\).
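The exponent bookkeeping behind Theorem 1.8 can be checked mechanically. The Python sketch below (function names are illustrative, not from the paper) verifies that the value \(\theta =\frac{(d+2)(d-1)}{d^2+d}\) in Theorem 1.7 gives \(q_\theta =r_\theta =q_1\) and \(p_\theta =p_1\), so that the mixed norm collapses and X maps \(L^{p_1}\) into \(L^{q_1}\), hence by duality \(X^{*}\) maps \(L^{p_0}\) into \(L^{q_0}\). It also checks the powers of \(r_n\) in the two rescalings, under our assumption that the corresponding changes of variables have Jacobians \(r_n^{d-1}\) and \(r_n^{d}\) respectively.

```python
from fractions import Fraction


def xray_inverse_exponents(d, theta):
    """(1/p, 1/q, 1/r) from Theorem 1.7, as exact fractions."""
    inv_p = 1 - theta + theta * Fraction(d, d + 2)
    inv_q = theta * Fraction(d, d + 2)
    inv_r = 1 - theta + theta * Fraction(d * d - d - 2, d * d + d - 2)
    return inv_p, inv_q, inv_r


def check(d):
    """True if the exponents of Theorems 1.1, 1.7 and 1.8 are consistent."""
    p0 = Fraction(d + 1, 2)
    q0 = Fraction(d * (d + 1), 2 * (d - 1))
    p1, q1 = q0 / (q0 - 1), p0 / (p0 - 1)      # p1 = q0', q1 = p0'
    theta = Fraction((d + 2) * (d - 1), d * d + d)
    inv_p, inv_q, inv_r = xray_inverse_exponents(d, theta)
    # X : L^{p1} -> L^{q1}(L^{q1}) = L^{q1}; dually X* : L^{p0} -> L^{q0}.
    mixed_norm_collapses = (inv_q == inv_r == 1 / q1) and inv_p == 1 / p1
    # A power r^a preserves the L^p norm when a = (Jacobian exponent) / p.
    first_power_ok = Fraction(d - 1) / p0 == Fraction(2 * (d - 1), d + 1)
    second_power_ok = Fraction(d) / p1 == Fraction(d * d - d + 2, d + 1)
    return mixed_norm_collapses and first_power_ok and second_power_ok


print([check(d) for d in range(3, 9)])
```

For instance, \(d=3\) gives \(\theta =5/6\) and \((1/p,1/q,1/r)=(2/3,1/2,1/2)\), which is \((1/p_1,1/q_1,1/q_1)\).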

Corollary 1.9

At least one of the following must hold:

  • (A) There exists an extremizer for \(T:L^{p_0}\rightarrow L^{q_0}\); or

  • (B) There exists an extremizer for \(X:L^{p_1}\rightarrow L^{q_1}\).

2 Outline of the Proof

A simplified outline of the argument is as follows. Given a function f which is very close to being an extremizer, we consider a dyadic decomposition of the range: \(f=\sum _j 2^{j}f_j\) where \(\frac{1}{2}\chi _{E_j}\le f_j<\chi _{E_{j}}\) for pairwise disjoint measurable sets \(E_j\) in \(\mathbb {R}^d\). Following [3] we prove that for f to be a near extremizer (say \(\Vert Tf\Vert _{q}\ge A(1-\delta )\Vert f\Vert _p\) for small \(\delta \)), the set of indices \(\{j\}\) must essentially lie in an interval of \(\mathbb {Z}\) around some \(J\in \mathbb {Z}\), whose length depends only on \(\delta \) and not on f. This is done by applying a certain “trilinear” bound for T (see Lemma 9.2 and Theorem 10.2) using Christ’s method of refinement [2] and the increasing structure of the refinement due to Dendrinos and Stovall [12]. The trilinear bound has already been established in Lemma 5.2 in [23]; our proof is much simpler in comparison, owing to the increasing structure in the method of refinement. We show that the sets \(E_j\) are very close to being curved “parallelepipeds”, and that these are “almost” pairwise disjoint. More precisely, these parallelepipeds are projections of balls in the incidence manifold. This is accomplished by introducing a mock distance on the set of all natural extremizers and proving that any two distant natural extremizers are almost orthogonal in the \(L^q\) space, see Lemma 9.1 (this is similar to the argument due to Christ in [3]). One significant difference between the structure of these parallelepipeds when \(d=2\) (in this case the paraboloid in [3] is the same as the moment curve) and when \(d>2\) is that when \(d=2\) the symmetries of the operator act transitively on this set of parallelepipeds. For \(d>2\) the situation is quite different. For \(\theta \in (0,1)\) the symmetries of the operator still act transitively on this set of parallelepipeds (near extremizers for the \(L^{p_\theta }\rightarrow L^{q_\theta }\) bound). 
So after applying symmetries we can assume that all these near extremizers are well adapted to the unit ball in \(\mathbb {R}^d\); in other words \(E_J\sim B(0,1)\). We can then proceed as in the case of the paraboloid [3], using Christ’s argument to prove the existence of extremizers. On the other hand, when \(d>2\) the symmetry group does not act transitively on the set of near extremizers for the \(L^{p_0}\rightarrow L^{q_0}\) bound. As a consequence, one has to allow the thickness of these parallelepipeds to become arbitrarily small as f becomes closer to being an extremizer. We overcome this obstruction by applying a non-symmetric “scaling” to the function f, which preserves the \(L^p\) norm of f and prevents an extremizing sequence from converging to 0 pointwise. This is essentially the only new part of our analysis. It enables us to make the rescaled near extremizers for \(T:L^{p_0}\rightarrow L^{q_0}\) well adapted to the unit ball in \(\mathbb {R}^d\), and thus to prove the existence of extremizers for the X-ray transform, \(X^{*}:L^{p_0}\rightarrow L^{q_0}\).
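The dyadic decomposition of the range used in this outline can be illustrated on finitely many sample values. The Python sketch below is only a toy model: a function is replaced by a dictionary of samples, and all names are hypothetical. It relies on the fact that the standard library function math.frexp writes a positive float as \(m\,2^{j}\) with \(\frac{1}{2}\le m<1\), which matches the normalization \(\frac{1}{2}\chi _{E_j}\le f_j<\chi _{E_j}\).

```python
import math
from collections import defaultdict


def dyadic_decomposition(values):
    """Split positive samples f(x) as f(x) = 2**j * f_j(x) with 1/2 <= f_j(x) < 1.

    `values` maps sample points x to f(x) >= 0. The result maps each index j
    to a dictionary {x: f_j(x)}; its set of keys plays the role of E_j.
    """
    pieces = defaultdict(dict)
    for x, fx in values.items():
        if fx <= 0:
            continue  # f_j vanishes off the sets E_j
        mantissa, j = math.frexp(fx)  # fx == mantissa * 2**j, 0.5 <= mantissa < 1
        pieces[j][x] = mantissa
    return dict(pieces)


samples = {0: 0.75, 1: 3.0, 2: 0.5, 3: 5.0, 4: 0.0}
pieces = dyadic_decomposition(samples)
print(pieces)
```

In this toy example the sets \(E_j\) are \(\{0,2\}\), \(\{1\}\) and \(\{3\}\) for \(j=0,2,3\); they are pairwise disjoint by construction, since each sample point receives exactly one index j.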

3 Notation

Most of the notation we will use is fairly standard. In this paper c, C denote implicit small and large positive constants respectively, which are allowed to change from one line to another. If \(1\le p\le \infty \), we denote by \(p^{'}\) the exponent dual to p. We use |E| to indicate the Lebesgue measure of E. When A and B are non-negative real numbers, we write \(A\lesssim B\) to mean \(A\le CB\) for an implicit constant C, and \(A\sim B\) when \(A\lesssim B\) and \(B\lesssim A\). We will also employ the somewhat less standard notation \(\mathcal {T}(E,F):=\langle T({\chi _{E}}), \chi _{F}\rangle \) when E and F are Borel sets and T is a linear operator. We will also use (E, F) to denote the pair of functions \((\chi _{E}, \chi _{F})\). We say that the sequence \(\{f_{n}\}\subset L^{p}\) converges weakly to f in \(L^{p}\) if for any function \(\psi \in L^{p'}\), \(\int f_{n}\psi \) converges to \(\int f\psi \), and that \(\{f_{n}\}\) converges strongly to f in \(L^{p}\) if \(\int |f_{n}-f|^{p}\) converges to 0. Since \(|T(f)|\le T(|f|)\) for all \(f\in L^{p}\) and we are interested only in those f for which |T(f)| is large, in this paper all the functions f will be assumed to be nonnegative.

4 Symmetries

In this section we study the symmetries of the operator T.

Definition 4.1

A symmetry of \(T:L^{p}\rightarrow L^{q}\) is an \(L^p\) isometry \(\phi ^{*}\) for which \(T\circ \phi ^{*}=\psi ^{*}\circ T\) for some \(L^q\) isometry \(\psi ^{*}\).

The operator T has many symmetries. Let \(\Theta : \mathbb {R}^{d+d}\rightarrow \mathbb {R}^{d-1}\) be the function defined by

$$\begin{aligned} \Theta (x,y)=(y_2-x_2-(y_1-x_1)^2, y_3-x_3-(y_1-x_1)^3,\ldots , y_d-x_d-(y_1-x_1)^d) \end{aligned}$$

and let \(\Sigma \) be the incidence manifold \(\Sigma =\{(x,y): \Theta (x,y)=0\}\). Let us denote the set of all diffeomorphisms of \(\mathbb {R}^d\) by Diff\((\mathbb {R}^d)\).

Definition 4.2

Let \(G_{d,d}\) denote the set of all \((\phi ,\psi )\in \text {Diff}(\mathbb {R}^d)\times \text {Diff}(\mathbb {R}^d)\) such that

$$\begin{aligned} \Theta (\phi (x),\psi (y))= 0\quad \text {if and only if} \quad \Theta (x,y)=0 \quad \text {for all} \quad (x,y)\in \mathbb {R}^{d+d}. \end{aligned}$$

In other words, \(G_{d,d}\) denotes the set of all ordered pairs of diffeomorphisms of \(\mathbb {R}^d\) which preserve the incidence manifold \(\Sigma \). We also let \(G_d\) denote the set of all \(\phi \in \text {Diff}(\mathbb {R}^d)\) such that there exists \(\psi \in \text {Diff}(\mathbb {R}^d)\) such that \((\phi ,\psi )\in G_{d,d}\).

The following are examples of elements of \(G_{d,d}\).

  • Translation: \((\phi (x), \psi (y))=(x+v, y+v)\) for some \(v\in \mathbb {R}^d\).

  • Scaling: \((\phi (x), \psi (y))=\big (S_r(x), S_r(y)\big )=\big ((rx_1,r^2x_2,\ldots ,r^dx_d),(ry_1,r^2y_2,\ldots ,r^dy_d)\big )\) for some \(r\in \mathbb {R} -\{0\}\).

  • Gliding along h: \((\phi (x), \psi (y))= (G_{t_0}(x),G_{t_0}(y)+h(t_0))\) for some \(t_0\in \mathbb {R}\), where \(G_{t_0}\) is the linear operator defined on \(\mathbb {R}^d\) associated to the \((d\times d)\) matrix

    $$\begin{aligned} G_{t_0}= \begin{bmatrix} 1&0&0&\cdots &0\\ 2t_0&1&0&\cdots &0\\ 3t_0^2&3t_0&1&\cdots &0\\ \vdots &\vdots &\vdots &&\vdots \\ \binom{m}{1}t_0^{m-1}&\binom{m}{2}t_0^{m-2}&\binom{m}{3}t_0^{m-3}&\cdots &0\\ \vdots &\vdots &\vdots &&\vdots \\ \binom{d}{1}t_0^{d-1}&\binom{d}{2}t_0^{d-2}&\binom{d}{3}t_0^{d-3}&\cdots &1 \end{bmatrix}. \end{aligned}$$

Note that \(h(t+t_0)=G_{t_0}(h(t))+h(t_0)\) for all \(t,t_0\in \mathbb {R}\).
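The identity \(h(t+t_0)=G_{t_0}(h(t))+h(t_0)\), and the fact that gliding along h preserves the incidence manifold, can be confirmed in exact arithmetic. The following Python sketch is an illustration only; the helper names are ours, and the matrix entries \(\binom{m}{i}t_0^{m-i}\) for \(i\le m\) are read off from the displayed matrix.

```python
from fractions import Fraction
from math import comb


def h(t, d):
    """Moment curve h(t) = (t, t^2, ..., t^d)."""
    return [t ** m for m in range(1, d + 1)]


def glide(t0, x):
    """Apply G_{t0}: row m has entries C(m, i) * t0^(m - i) for 1 <= i <= m."""
    d = len(x)
    return [sum(comb(m, i) * t0 ** (m - i) * x[i - 1] for i in range(1, m + 1))
            for m in range(1, d + 1)]


def theta_map(x, y):
    """Theta(x, y); it vanishes exactly when y - x lies on the moment curve."""
    s = y[0] - x[0]
    return [y[m - 1] - x[m - 1] - s ** m for m in range(2, len(x) + 1)]


d, t, t0 = 4, Fraction(3, 2), Fraction(-2, 5)
lhs = h(t + t0, d)
rhs = [g + c for g, c in zip(glide(t0, h(t, d)), h(t0, d))]

# An incident pair (x, y = x + h(t)) and its image under the gliding symmetry.
x = [Fraction(1), Fraction(-2), Fraction(1, 3), Fraction(5)]
y = [a + b for a, b in zip(x, h(t, d))]
phi_x = glide(t0, x)
psi_y = [g + c for g, c in zip(glide(t0, y), h(t0, d))]

print(lhs == rhs, theta_map(x, y), theta_map(phi_x, psi_y))
```

The second check works because \(\psi (y)-\phi (x)=G_{t_0}(y-x)+h(t_0)=h(t+t_0)\) whenever \(y-x=h(t)\), which is the displayed identity again.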

The elements of \(G_d\) play a central role in our analysis. There might be more elements in \(G_d\) than the ones in the above examples but as we shall see these are enough for our analysis. For each of the three types of symmetries described above the associated diffeomorphism has constant Jacobian. For each \(\phi \) we define the associated operators \(\phi ^*:L^p\rightarrow L^p\) by \(\phi ^*f(x)=J_{\phi }^{\frac{1}{p}} f(\phi (x))\). Then

$$\begin{aligned} \Vert \phi ^*(f)\Vert _{L^{p}}=\Vert f\Vert _{L^{p}}, \quad \langle T(\phi ^{*}f),\psi ^{*}g \rangle = \langle T(f),g \rangle . \end{aligned}$$

5 Paraballs

In this section we shall study an essentially exhaustive list of quasiextremal pairs. They are natural in the sense that every quasiextremal pair is close, in a sense that degrades as the constant of quasiextremality decreases, to one of these pairs; see Theorem 6.1. It is elementary to show that the characteristic function of the set \(\{x\in \mathbb {R}^d:\Vert x\Vert <\delta \}\) is a \(\delta ^C\)-quasiextremal for \(T:L^{p_\theta }\rightarrow L^{q_\theta }\) for each \(\theta \in [0,1]\). In addition, for \(\theta =0\) the characteristic function of the set \(\{x\in \mathbb {R}^d:\Vert x\Vert <\delta \}\), and for \(\theta =1\) the characteristic function of the \(\delta \)-tubular neighborhood of the set \(\{(t,t^2,\ldots ,t^d):t\in [-1,1]\}\), i.e. of \(\{x\in \mathbb {R}^d: \Vert x-(t,t^2,\ldots ,t^d)\Vert < \delta \,\, \text{ for } \text{ some }\,\, t\in [-1,1]\}\), are c-quasiextremal, where c is a small positive number that depends only on d and is independent of \(\delta \); see Proposition 5.7. The set of all paraballs is the collection of sets that we produce by applying the elements of \(G_d\) to these sets. Below is a more detailed description of the “paraballs”.

Definition 5.1

For \(0<\alpha , \beta \le 1\), we define

  • \(B(0,0,\alpha ,1)=\{y\in \mathbb {R}^d:|y_1|<1\,\text {and}\,\,\Vert y-h(y_1)\Vert <\alpha \}\)  and  \(B^{*}(0,0,\alpha ,1)=\{x\in \mathbb {R}^d:|x_i|<\alpha \,\,\text {for all}\,\,1\le i\le d\}\).

  • \(B(0,0,1,\beta )=\{y\in \mathbb {R}^d:|y_i|<\beta \,\,\text {for all}\,\,1\le i\le d\}\)  and  \(B^{*}(0,0,1,\beta )=\{x\in \mathbb {R}^d:|x_1|<1 \,\text {and}\, \,\Vert x+h(-x_1)\Vert <\beta \}\).

  • $$\begin{aligned} \begin{aligned}&B({\bar{x}},t_0,\lambda \alpha ,\lambda )=G_{t_0}S_{\lambda }B(0,0,\alpha ,1)+{\bar{x}}+h(t_0)\\&B^{*}({\bar{x}},t_0,\lambda \alpha ,\lambda )=G_{t_0}S_{\lambda }B^{*}(0,0,\alpha ,1)+{\bar{x}}. \end{aligned} \end{aligned}$$
    (5.1)

We also define a scaling of a paraball by

$$\begin{aligned} \lambda B({{\bar{x}}}, t_0,\alpha ,\beta )= B({{\bar{x}}}, t_0, \lambda \alpha ,\lambda \beta ). \end{aligned}$$
(5.2)

Note that this does not correspond to a symmetry of the operator.

As an example, in the special case \(0<\alpha \le \beta \) the paraball \(B=B({{\bar{x}}}, t_0,\alpha ,\beta )\) is the set of all \(y\in \mathbb {R}^d\) satisfying all of

  • \(|y_1-{{\bar{y}}}_1| \le \beta \);

  • \(|\sum _{i=1}^{m}\left( {\begin{array}{c}m\\ i\end{array}}\right) {(-t_0)}^{m-i}(y_i-{{\bar{y}}}_i)-(y_1-{{\bar{y}}}_1)^m|\le \beta ^{m-1}\alpha \)    for all    \(1< m\le d\)

where \({{\bar{y}}}={{\bar{x}}}+h(t_0)\).
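The two conditions above translate directly into a membership test. The Python sketch below is illustrative only (the function names and the sample parameters are ours). It also checks an observation that we derive from the identity \(h(t+t_0)=G_{t_0}(h(t))+h(t_0)\) rather than quote from the text: the arc \(\{{{\bar{x}}}+h(t_0+s): |s|\le \beta \}\) lies in the paraball for every \(\alpha \), so that the paraball is a curved neighborhood of this arc.

```python
from fractions import Fraction
from math import comb


def h(t, d):
    """Moment curve h(t) = (t, t^2, ..., t^d)."""
    return [t ** m for m in range(1, d + 1)]


def in_paraball(y, x_bar, t0, alpha, beta):
    """Membership test for B(x_bar, t0, alpha, beta) in the case alpha <= beta."""
    d = len(y)
    y_bar = [a + b for a, b in zip(x_bar, h(t0, d))]   # y_bar = x_bar + h(t0)
    u = [a - b for a, b in zip(y, y_bar)]
    if abs(u[0]) > beta:
        return False
    for m in range(2, d + 1):
        glided = sum(comb(m, i) * (-t0) ** (m - i) * u[i - 1]
                     for i in range(1, m + 1))
        if abs(glided - u[0] ** m) > beta ** (m - 1) * alpha:
            return False
    return True


x_bar = [Fraction(1), Fraction(0), Fraction(-1)]
t0, alpha, beta = Fraction(1, 2), Fraction(1, 10), Fraction(1)


def on_arc(s):
    """The point x_bar + h(t0 + s) of the translated moment curve."""
    return [a + b for a, b in zip(x_bar, h(t0 + s, 3))]


inside = in_paraball(on_arc(Fraction(3, 4)), x_bar, t0, alpha, beta)
too_far = in_paraball(on_arc(Fraction(3, 2)), x_bar, t0, alpha, beta)
pushed = on_arc(Fraction(3, 4))
pushed[1] += Fraction(1, 5)   # leave the arc by more than alpha * beta
off_arc = in_paraball(pushed, x_bar, t0, alpha, beta)
print(inside, too_far, off_arc)
```

Here the first point lies on the arc with \(|s|\le \beta \), the second violates \(|y_1-{{\bar{y}}}_1|\le \beta \), and the third violates the condition for \(m=2\).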

For \(0<\alpha \le \beta \), the dual paraball, denoted by \(B^{*}=B^{*}({{\bar{x}}}, t_0,\alpha ,\beta )\), is the set of all \(x\in \mathbb {R}^d\) such that

  • \(|x_1-{{\bar{x}}}_1| \le \alpha \);

  • \(|\sum _{i=1}^{m}\left( {\begin{array}{c}m\\ i\end{array}}\right) {(-t_0)}^{m-i}(x_i-{{\bar{x}}}_i)|\le \beta ^{m-1}\alpha \)    for all   \(1< m\le d\).

Similarly when \(0<\beta <\alpha \) the paraball \(B=B({{\bar{x}}}, t_0,\alpha ,\beta )\) is the set of all \(y\in \mathbb {R}^d\) satisfying all of

  • \(|y_1-{{\bar{y}}}_1| \le \beta \);

  • \(|\sum _{i=1}^{m}\left( {\begin{array}{c}m\\ i\end{array}}\right) {(-t_0)}^{m-i}(y_i-{{\bar{y}}}_i)|\le \alpha ^{m-1}\beta \)    for all   \(1< m\le d\)

where \({{\bar{y}}}={{\bar{x}}}+h(t_0)\).

The dual paraball, denoted by \(B^{*}=B^{*}({{\bar{x}}}, t_0,\alpha ,\beta )\) is the set of all \(x\in \mathbb {R}^d\) such that

  • \(|x_1-{{\bar{x}}}_1| \le \alpha \);

  • \(|\sum _{i=1}^{m}\left( {\begin{array}{c}m\\ i\end{array}}\right) {(-t_0)}^{m-i}(x_i-{{\bar{x}}}_i)+({{\bar{x}}}_1-x_1)^m|\le \alpha ^{m-1}\beta \)    for all    \(1< m\le d\).

For our analysis of an extremizing sequence it is important to measure how two paraballs interact with each other. We define a mock-distance on the set of all paraballs to measure the interaction between any two distant paraballs similar to the distance defined in [3].

Definition 5.2

Let \(B^{a}=B({{\bar{x}}}^{a}, t_{a},\alpha _{a},\beta _{a})\) and \(B^{b}=B({{\bar{x}}}^{b}, t_{b},\alpha _{b},\beta _{b})\) be two paraballs. Let \({{\bar{y}}}^a={{\bar{x}}}^a+h(t_a)\) and \({{\bar{y}}}^b={{\bar{x}}}^b+h(t_b)\) be the centers of \(B^a\) and \(B^b\) respectively. We define:

  • If \(\alpha _a\le \beta _a\) and \(\alpha _b\le \beta _b\),

    $$\begin{aligned} \begin{aligned} d(B^a, B^b)=&\frac{\max \big (\alpha _{a}\beta _{a}\alpha _{a}{\beta _{a}}^{2} \ldots \alpha _a\beta _a^{d-1}, \quad \alpha _b\beta _b\alpha _{b}{\beta _{b}}^{2} \ldots \alpha _b\beta _b^{d-1}\big )}{\min \big (\alpha _{a}\beta _{a}\alpha _{a}{\beta _{a}}^{2} \ldots \alpha _a\beta _a^{d-1}, \quad \alpha _b\beta _b\alpha _{b}{\beta _{b}}^{2} \ldots \alpha _b\beta _b^{d-1}\big )}+\Big (\frac{\beta _a}{\beta _b} + \frac{\beta _b}{\beta _a}\Big )\\&+\Big (\frac{\alpha _a}{\alpha _b} + \frac{\alpha _b}{\alpha _a}\Big ) +|{{\bar{y}}}_1^a-{{\bar{y}}}_1^b|\bigg (\frac{1}{\beta _a}+\frac{1}{\beta _b}\bigg ) +|{{\bar{x}}}_1^a-{{\bar{x}}}_1^b|\bigg (\frac{1}{\alpha _a}+\frac{1}{\alpha _b}\bigg )\\&+\sum _{m=2}^{d} \frac{|\sum _{i=1}^{m} \left( {\begin{array}{c}m\\ i\end{array}}\right) {(-t_{a})}^{m-i}({{\bar{y}}}_{i}^b- {{\bar{y}}}_{i}^{a})-({{\bar{y}}}_{1}^b-{{\bar{y}}}_{1}^a)^m|}{\alpha _a\beta _{a}^{m-1}}\\&+\sum _{m=2}^{d} \frac{|\sum _{i=1}^{m} \left( {\begin{array}{c}m\\ i\end{array}}\right) {(-t_{b})}^{m-i}({{\bar{y}}}_{i}^a- {{\bar{y}}}_{i}^{b})-({{\bar{y}}}_{1}^a-{{\bar{y}}}_{1}^b)^m|}{\alpha _b\beta _{b}^{m-1}}\\&+\sum _{m=2}^{d} \frac{|\sum _{i=1}^{m} \left( {\begin{array}{c}m\\ i\end{array}}\right) {(-t_{a})}^{m-i}({{\bar{x}}}_{i}^b-{{\bar{x}}}_{i}^a)|}{\alpha _a\beta _{a}^{m-1}}\\&+\sum _{m=2}^{d} \frac{|\sum _{i=1}^{m} \left( {\begin{array}{c}m\\ i\end{array}}\right) {(-t_{b})}^{m-i}({{\bar{x}}}_{i}^a-{{\bar{x}}}_{i}^b)|}{\alpha _b\beta _{b}^{m-1}}; \end{aligned} \end{aligned}$$
    (5.3)
  • If \(\beta _a<\alpha _a\) and \(\beta _b<\alpha _b\),

    $$\begin{aligned} \begin{aligned} d(B^a, B^b)=&\frac{\max \big (\alpha _{a}\beta _{a}\alpha _{a}^2{\beta _{a}} \ldots {\alpha _a}^{d-1}\beta _a, \quad \alpha _b\beta _b\beta _{b}{\alpha _{b}}^{2} \ldots \beta _b\alpha _b^{d-1}\big )}{\min \big (\alpha _{a}\beta _{a}\beta _{a}{\alpha _{a}}^{2} \ldots \beta _a\alpha _a^{d-1}, \quad \alpha _b\beta _b\beta _{b}{\alpha _{b}}^{2} \ldots \beta _b\alpha _b^{d-1}\big )}+\Big (\frac{\beta _a}{\beta _b} + \frac{\beta _b}{\beta _a}\Big )\\&+\Big (\frac{\alpha _a}{\alpha _b} + \frac{\alpha _b}{\alpha _a}\Big ) +|{{\bar{y}}}_1^a-{{\bar{y}}}_1^b|\bigg (\frac{1}{\beta _a}+\frac{1}{\beta _b}\bigg ) +|{{\bar{x}}}_1^a-{{\bar{x}}}_1^b|\bigg (\frac{1}{\alpha _a}+\frac{1}{\alpha _b}\bigg )\\&+\sum _{m=2}^{d} \frac{|\sum _{i=1}^{m} \left( {\begin{array}{c}m\\ i\end{array}}\right) {(-t_{a})}^{m-i}({{\bar{y}}}_{i}^b-{{\bar{y}}}_{i}^a)|}{\beta _a\alpha _{a}^{m-1}}\\&+\sum _{m=2}^{d} \frac{|\sum _{i=1}^{m} \left( {\begin{array}{c}m\\ i\end{array}}\right) {(-t_{b})}^{m-i}({{\bar{y}}}_{i}^a-{{\bar{y}}}_{i}^b)|}{\beta _b\alpha _{b}^{m-1}}\\&+\sum _{m=2}^{d} \frac{|\sum _{i=1}^{m} \left( {\begin{array}{c}m\\ i\end{array}}\right) {(-t_{a})}^{m-i}({{\bar{x}}}_{i}^b- {{\bar{x}}}_{i}^{a})+({{\bar{x}}}_{1}^a-{{\bar{x}}}_{1}^b)^m|}{\beta _a\alpha _{a}^{m-1}}\\&+\sum _{m=2}^{d} \frac{|\sum _{i=1}^{m} \left( {\begin{array}{c}m\\ i\end{array}}\right) {(-t_{b})}^{m-i}({{\bar{x}}}_{i}^a- {{\bar{x}}}_{i}^{b})+({{\bar{x}}}_{1}^b-{{\bar{x}}}_{1}^a)^m|}{\beta _b\alpha _{b}^{m-1}}. \end{aligned} \end{aligned}$$
    (5.4)

A few comments are in order. Note that we have defined this distance d for only two of the four possible cases, namely (\(\alpha _a\le \beta _a, \alpha _b\le \beta _b\)) and (\(\beta _a<\alpha _a, \beta _b<\alpha _b\)). This is because in our analysis we need the distance between only these two types of paraballs. So from now on, when we talk about the distance between paraballs, it will be in one of these two cases. Note that d is not a distance on the set of all paraballs, simply because for any paraball B, \(d(B,B)=5\). As we shall see, this is of no significance to our analysis, for we shall use the properties of d only when the distance between two paraballs is large. Our “distance” function d is not a pseudo-distance either, as it does not satisfy the defining properties of a pseudo-distance (for instance, \(d(B,B)\ne 0\)). So, for lack of a better term, we shall call it a mock-distance.
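To make formula (5.3) concrete, here is a Python sketch that evaluates the mock-distance term by term in the case \(\alpha _a\le \beta _a\), \(\alpha _b\le \beta _b\). The representation of a paraball as a tuple (x_bar, t, alpha, beta) and all function names are our own conventions; the first term is read as the ratio of the products \(\prod _{m=2}^{d}\alpha \beta ^{m-1}\).

```python
from fractions import Fraction
from math import comb


def h(t, d):
    """Moment curve h(t) = (t, t^2, ..., t^d)."""
    return [t ** m for m in range(1, d + 1)]


def glided(t, u, m):
    """m-th coordinate of G_{-t} u: sum over i of C(m, i) (-t)^(m-i) u_i."""
    return sum(comb(m, i) * (-t) ** (m - i) * u[i - 1] for i in range(1, m + 1))


def cross_section(alpha, beta, d):
    """Product over m = 2..d of alpha * beta^(m-1), as in the first term."""
    out = Fraction(1)
    for m in range(2, d + 1):
        out *= alpha * beta ** (m - 1)
    return out


def mock_distance(Ba, Bb):
    """Formula (5.3). A paraball is a tuple (x_bar, t, alpha, beta), alpha <= beta."""
    (xa, ta, aa, ba), (xb, tb, ab, bb) = Ba, Bb
    aa, ba, ab, bb = (Fraction(v) for v in (aa, ba, ab, bb))
    d = len(xa)
    ya = [p + q for p, q in zip(xa, h(ta, d))]   # center of B^a
    yb = [p + q for p, q in zip(xb, h(tb, d))]   # center of B^b
    va, vb = cross_section(aa, ba, d), cross_section(ab, bb, d)
    total = max(va, vb) / min(va, vb)                  # term 1
    total += ba / bb + bb / ba                         # term 2
    total += aa / ab + ab / aa                         # term 3
    total += abs(ya[0] - yb[0]) * (1 / ba + 1 / bb)    # term 4
    total += abs(xa[0] - xb[0]) * (1 / aa + 1 / ab)    # term 5
    dy = [q - p for p, q in zip(ya, yb)]               # y_bar^b - y_bar^a
    dx = [q - p for p, q in zip(xa, xb)]               # x_bar^b - x_bar^a
    my = [-c for c in dy]
    mx = [-c for c in dx]
    for m in range(2, d + 1):
        wa, wb = aa * ba ** (m - 1), ab * bb ** (m - 1)
        total += abs(glided(ta, dy, m) - dy[0] ** m) / wa   # term 6
        total += abs(glided(tb, my, m) - my[0] ** m) / wb   # term 7
        total += abs(glided(ta, dx, m)) / wa                # term 8
        total += abs(glided(tb, mx, m)) / wb                # term 9
    return total


B = ([Fraction(1), Fraction(0), Fraction(-2)], Fraction(1, 3), Fraction(1, 4), Fraction(2))
shifted = ([Fraction(3), Fraction(0), Fraction(-2)], Fraction(1, 3), Fraction(1, 4), Fraction(2))
print(mock_distance(B, B), mock_distance(B, shifted))
```

For identical paraballs the first term equals 1, the second and third terms equal 2 each, and all remaining terms vanish, which reproduces the value \(d(B,B)=5\).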

In the first term of the expression we compare the \((d-1)\)-dimensional volumes of the cross sections of the paraballs. The second and third terms measure the ratios between the lengths of the bases of the paraballs and of the dual paraballs, respectively. The fourth term measures the distance between the first coordinates of the centers of the paraballs, and the fifth term does the same for the centers of the dual paraballs. The sixth and seventh terms measure how far the center of each paraball is from the other paraball. Likewise, the eighth and ninth terms measure how far the center of each dual paraball is from the other dual paraball.

We shall see in the proof of Proposition 5.4 that the third, eighth and ninth terms are redundant, in the sense that they are essentially dominated by the first and second, the sixth, and the seventh terms respectively. But we include these terms to make the mock-distance symmetric, i.e. \(d(B^a, B^b)=d({B^a}^{*}, {B^b}^{*})\). In addition we have the following property of this mock-distance.

Lemma 5.3

For every pair of paraballs \(B^a, B^b\) and \(\phi \in G_d\) we have

$$\begin{aligned} d(B^a, B^b)=d(\phi ^*(B^a), \phi ^*(B^b)). \end{aligned}$$

Proof

It is enough to prove the stated equality when \(\phi \) is either a translation or scaling or gliding along h. Let \(B^{a}=B(\bar{x}^{a}, t_{a},\alpha _{a},\beta _{a})\) and \(B^{b}=B({{\bar{x}}}^{b}, t_{b},\alpha _{b},\beta _{b})\) be two paraballs. When \(\phi \) is a translation i.e. \(\phi (x)=x+v\) for some \(v\in \mathbb {R}^d\), then \(\phi ^*(B^{a})=B({{\bar{x}}}^{a}+v, t_{a},\alpha _{a},\beta _{a})\) and \(\phi ^*(B^{b})=B({{\bar{x}}}^{b}+v, t_{b},\alpha _{b},\beta _{b})\). When \(\phi \) is a scaling i.e. \(\phi =S_r\) for some \(r>0\), then \(\phi ^*(B^{a})=B(S_r({{\bar{x}}}^{a}), rt_{a},r\alpha _{a},r\beta _{a})\) and \(\phi ^*(B^{b})=B(S_r({{\bar{x}}}^{b}), rt_{b},r\alpha _{b},r\beta _{b})\). In both these cases each term in the Definition 5.2 remains unchanged. Therefore \(d(B^a, B^b)=d(\phi ^*(B^a), \phi ^*(B^b))\).

Let us now assume that \(\phi \) is a gliding along h i.e. \(\phi (x)=G_{t_0}(x)\) for some \(t_0\in \mathbb {R}\). Then \(\phi ^*(B^{a})=B(G_{t_0}({{\bar{x}}}^{a}), t_{a}+t_0,\alpha _{a},\beta _{a})\) and \(\phi ^*(B^{b})=B(G_{t_0}(\bar{x}^{b}), t_{b}+t_0,\alpha _{b},\beta _{b})\). So the first five terms in \(d(\phi ^*(B^a), \phi ^*(B^b))\) remain the same as the first five terms in \(d(B^a,B^b)\). If \(\alpha _a\le \beta _a\) then the sixth term in \(d(\phi ^*(B^a), \phi ^*(B^b))\) is

$$\begin{aligned} \begin{aligned}&\sum _{m=2}^{d}\frac{|\sum _{i=1}^{m} \left( {\begin{array}{c}m\\ i\end{array}}\right) {(-t_{a}-t_0)}^{m-i}(G_{t_0}({{\bar{y}}}^b-{{\bar{y}}}^{a}))_i-({{\bar{y}}}_{1}^b-{{\bar{y}}}_{1}^a)^m|}{\alpha _a\beta _{a}^{m-1}}\\&\qquad =\sum _{m=2}^{d}\frac{|(G_{-t_0-t_a}G_{t_0}({{\bar{y}}}^b-{{\bar{y}}}^{a}))_m-({{\bar{y}}}_{1}^b-{{\bar{y}}}_{1}^a)^m|}{\alpha _a\beta _{a}^{m-1}}\\&\qquad =\sum _{m=2}^{d}\frac{|(G_{-t_a}({{\bar{y}}}^b-{{\bar{y}}}^{a}))_m-({{\bar{y}}}_{1}^b-{{\bar{y}}}_{1}^a)^m|}{\alpha _a\beta _{a}^{m-1}}\\&\qquad =\sum _{m=2}^{d} \frac{|\sum _{i=1}^{m} \left( {\begin{array}{c}m\\ i\end{array}}\right) {(-t_{a})}^{m-i}({{\bar{y}}}_{i}^b-{{\bar{y}}}_{i}^{a})-(\bar{y}_{1}^b-{{\bar{y}}}_{1}^a)^m|}{\alpha _a\beta _{a}^{m-1}}, \end{aligned} \end{aligned}$$
(5.5)

which is also the sixth term in \(d(B^a,B^b)\). Similarly, when \(\beta _a<\alpha _a\) the sixth term remains unchanged. By similar arguments one can prove that all other terms remain unchanged in \(d(\phi ^*(B^a), \phi ^*(B^b))\). This finishes the proof. \(\square \)
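The passage from the second to the third line of (5.5) uses the group law \(G_{-t_0-t_a}G_{t_0}=G_{-t_a}\). A minimal Python check of this law in exact arithmetic is sketched below; the helper names and the sample values of \(d, t_a, t_0\) are arbitrary choices of ours.

```python
from fractions import Fraction
from math import comb


def glide_matrix(t, d):
    """Matrix of G_t: entry (m, i) is C(m, i) * t^(m - i) for i <= m, else 0."""
    return [[comb(m, i) * t ** (m - i) if i <= m else 0
             for i in range(1, d + 1)] for m in range(1, d + 1)]


def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[r][k] * B[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]


d, t_a, t_0 = 5, Fraction(7, 3), Fraction(-1, 4)
product = matmul(glide_matrix(-t_0 - t_a, d), glide_matrix(t_0, d))
print(product == glide_matrix(-t_a, d))
```

The general statement \(G_sG_t=G_{s+t}\) follows from \(\binom{m}{k}\binom{k}{i}=\binom{m}{i}\binom{m-i}{k-i}\) and the binomial theorem; the code only confirms it for one choice of parameters.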

Proposition 5.4

There exists a constant \(C<\infty \) which depends only on the dimension d, such that for any two paraballs \(B^a,B^b\)

$$\begin{aligned} d(B^{a},B^{b})\le C\Bigg (\frac{\max \big (|B^{a}|, |B^{b}|\big )}{|B^{a}\cap B^{b}|}\Bigg )^C. \end{aligned}$$

Proof

The proof of this proposition is an adaptation of the proof of Lemma 3.7 in [3]. We shall give the proof for the case \(\alpha _a\le \beta _a\) and \(\alpha _b\le \beta _b\) for the paraballs \(B^{a}=B({\bar{x}}^a,t_a,\alpha _a,\beta _a)\) and \(B^{b}=B({\bar{x}}^b,t_b,\alpha _b,\beta _b)\) respectively, the other case being identical. Without loss of generality we may assume that \(d(B^a,B^b)\) is large, since otherwise \(d(B^a,B^b)\) is bounded by a large constant C and there is nothing to prove. For any paraball \(B=B({\bar{x}}, t_{0},\alpha ,\beta )\), \(y\in B\) implies \(|y_{1}-{\bar{y}}_{1}|\le \beta \) where \({\bar{y}}={\bar{x}}+h(t_0)\). This implies that

$$\begin{aligned} |B^{a}\cap B^{b}|\le \min (\alpha _{a}\beta _{a}\alpha _{a}{\beta _{a}}^{2} \ldots \alpha _a\beta _a^{d-1},\, \alpha _b\beta _b\alpha _{b}{\beta _{b}}^{2} \ldots \alpha _b\beta _b^{d-1}) \text {min}(\beta _a, \beta _b)\\ \quad \le \frac{\text {min} (\alpha _{a}\beta _{a}\alpha _{a}{\beta _{a}}^{2} \ldots \alpha _a\beta _a^{d-1},\, \alpha _b\beta _b\alpha _{b}{\beta _{b}}^{2} \ldots \alpha _b\beta _b^{d-1})}{ \text {max} (\alpha _{a}\beta _{a}\alpha _{a}{\beta _{a}}^{2} \ldots \alpha _a\beta _a^{d-1},\, \alpha _b\beta _b\alpha _{b}{\beta _{b}}^{2} \ldots \alpha _b\beta _b^{d-1})}\text {max}(|B^{a}|, |B^{b}|). \end{aligned}$$

If \(\frac{\max \big (\alpha _{a}\beta _{a}\alpha _{a}{\beta _{a}}^{2} \ldots \alpha _a\beta _a^{d-1},\, \alpha _b\beta _b\alpha _{b}{\beta _{b}}^{2} \ldots \alpha _b\beta _b^{d-1}\big )}{\min \big (\alpha _{a} \beta _{a}\alpha _{a}{\beta _{a}}^{2} \ldots \alpha _a\beta _a^{d-1},\,\alpha _b\beta _b\alpha _{b} {\beta _{b}}^{2} \ldots \alpha _b\beta _b^{d-1}\big )}\ge c\,d(B^{a}, B^{b})\), this concludes the proof.

For the paraball \(B^a=B({\bar{x}}^a,t_a,\alpha _a,\beta _a)\), we define

$$\begin{aligned} S^a=(-\beta _a, \beta _a). \end{aligned}$$

Similarly, we define \(S^b\) for \(B^b\). Now

$$\begin{aligned}&|B^{a}\cap B^{b}|\le \frac{|\big ({\bar{y}}_{1}^{a}+S^{a}\big )\cap \big ({\bar{y}}_{1}^{b}+S^{b}\big )|}{\text {max}(|S^{a}|, |S^{b}|)} \text {max}(|B^{a}|, |B^{b}|)\\&\quad \le \frac{\text {min} (|S^{a}|, |S^{b}|)}{ \text {max} (|S^{a}|, |S^{b}|)} \text {max}(|B^{a}|, |B^{b}|)\\&\quad =\frac{\text {min} (\beta _{a}, \beta _{b})}{ \text {max} (\beta _{a}, \beta _{b})} \text {max}(|B^{a}|, |B^{b}|)\\&\quad \sim \bigg (\frac{\beta _a}{\beta _b} + \frac{\beta _b}{\beta _a}\bigg )^{-1}\text {max}(|B^{a}|, |B^{b}|). \end{aligned}$$

Therefore the desired inequality follows if \(\frac{\beta _a}{\beta _b}+\frac{\beta _b}{\beta _a}\ge c\,d(B^{a}, B^{b})\). In addition, for some absolute constant C,

$$\begin{aligned} |\big ({\bar{y}}_{1}^{a}+S^{a}\big )\cap \big ({\bar{y}}_{1}^{b}+S^{b}\big )|\le C \bigg [|{\bar{y}}_{1}^{a}-{\bar{y}}_{1}^{b}| \bigg (\frac{1}{\beta _{a}}+\frac{1}{\beta _b}\bigg )\bigg ]^{-1} \text {max}(|S^{a}|, |S^{b}|). \end{aligned}$$

Hence the desired inequality follows if the fourth term is \(\ge c\,d(B^{a}, B^{b})\).

Let us now consider the third term. We can assume that the first two terms are small i.e.

  • \(\frac{\beta _a}{\beta _b} + \frac{\beta _b}{\beta _a}\le c' d(B^a,B^b)^{c'}\);

  • \(\frac{\max \big (\alpha _{a}\beta _{a}\alpha _{a}{\beta _{a}}^{2} \ldots \alpha _a\beta _a^{d-1}, \quad \alpha _b\beta _b\alpha _{b}{\beta _{b}}^{2} \ldots \alpha _b\beta _b^{d-1}\big )}{\min \big (\alpha _{a}\beta _{a}\alpha _{a}{\beta _{a}}^{2} \ldots \alpha _a\beta _a^{d-1}, \quad \alpha _b\beta _b\alpha _{b}{\beta _{b}}^{2} \ldots \alpha _b\beta _b^{d-1}\big )}\le c' d(B^a,B^b)^{c'}\).

We shall prove that this implies the third term is also small i.e.

$$\begin{aligned} \frac{\alpha _a}{\alpha _b} + \frac{\alpha _b}{\alpha _a}\le cd(B^a,B^b)^c. \end{aligned}$$

WLOG let us assume that \(\beta _a\le \beta _b\). Since \(\frac{\beta _a}{\beta _b} + \frac{\beta _b}{\beta _a}\le c' d(B^a,B^b)^{c'}\), this implies

$$\begin{aligned} \beta _a\le \beta _b\le c'd(B^a,B^b)^{c'}\beta _a. \end{aligned}$$

Therefore

$$\begin{aligned} \begin{aligned} \Big (\frac{\alpha _a}{\alpha _b} + \frac{\alpha _b}{\alpha _a}\Big )^{d-1}&\sim \frac{\text {max}({\alpha _a}^{d-1},{\alpha _b}^{d-1})}{\text {min}({\alpha _a}^{d-1},{\alpha _b}^{d-1})}\\&\le \frac{\max \big (\alpha _{a}\beta _{b}\alpha _{a}{\beta _{b}}^{2} \ldots \alpha _a\beta _b^{d-1}, \quad \alpha _b\beta _b\alpha _{b}{\beta _{b}}^{2} \ldots \alpha _b\beta _b^{d-1}\big )}{\min \big (\alpha _{a}\beta _{a}\alpha _{a}{\beta _{a}}^{2} \ldots \alpha _a\beta _a^{d-1}, \quad \alpha _b\beta _a\alpha _{b}{\beta _{a}}^{2} \ldots \alpha _b{\beta _a}^{d-1}\big )}\\&\sim \bigg (c'd(B^a,B^b)^{c'}\bigg )^{C}\frac{\max \big (\alpha _{a}\beta _{a}\alpha _{a}{\beta _{a}}^{2} \ldots \alpha _a\beta _a^{d-1}, \quad \alpha _b\beta _b\alpha _{b}{\beta _{b}}^{2} \ldots \alpha _b\beta _b^{d-1}\big )}{\min \big (\alpha _{a}\beta _{a}\alpha _{a}{\beta _{a}}^{2} \ldots \alpha _a\beta _a^{d-1}, \quad \alpha _b\beta _b\alpha _{b}{\beta _{b}}^{2} \ldots \alpha _b\beta _b^{d-1}\big )}. \end{aligned} \end{aligned}$$
(5.6)

We choose \(c'\) such that \(\bigg (c'd(B^a,B^b)^{c'}\bigg )^{C}<cd(B^a,B^b)^c\).

Now we consider the fifth term. Suppose that \(|{\bar{x}}_{1}^{a}-{\bar{x}}_{1}^{b}|\ge c d(B^{a}, B^{b}) \alpha _{a}\). If \((w, s_{1}, \ldots , s_{d-1})\in \mathbb {R}\times \mathbb {R}^{d-1}\) belongs to \(B^{a}\cap B^{b}\), then one has \(|s_{1}-{\bar{x}}_{2}^{a}-(w-{\bar{x}}_{1}^{a})^{2}|\le \alpha _{a}\beta _{a}\) and \(|s_{1}-{\bar{x}}_{2}^{b}-(w-{\bar{x}}_{1}^{b})^{2}|\le \alpha _{b}\beta _{b}\). Subtracting gives us

$$\begin{aligned} |2w\, ({\bar{x}}_{1}^{a}-{\bar{x}}_{1}^{b})-d|\le 2 \max (\alpha _{a}\beta _{a}, \alpha _{b}\beta _{b}), \end{aligned}$$

where \(d=2({\bar{x}}_{1}^{a})^{2}-2{\bar{x}}_{1}^{a}{\bar{y}}_{1}^{b}\). Since \(|{{\bar{x}}}^a_1-{{\bar{x}}}^b_1|\ge c d(B^a,B^b)\alpha _a\), this implies

$$\begin{aligned} \big |\{w\in S^{a}: |2w({\bar{x}}_{1}^{a}-{\bar{x}}_{1}^{b})-d|\le 2 \max (\alpha _{a}\beta _{a}, \alpha _{b}\beta _{b}) \}\big |\le C d(B^{a}, B^{b})^{-1}|S^{a}| \end{aligned}$$

uniformly for all \(d\in \mathbb {R}\). This implies the required upper bound on \(|B^{a}\cap B^{b}|\).
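The one-dimensional estimate behind this step can be sketched as follows. The shorthand \(\Delta ={\bar{x}}_{1}^{a}-{\bar{x}}_{1}^{b}\) and \(M=2\max (\alpha _{a}\beta _{a}, \alpha _{b}\beta _{b})\) is introduced only for this remark. For \(\Delta \ne 0\) the condition \(|2w\Delta -d|\le M\) describes an interval centred at \(d/(2\Delta )\) of half-length \(M/(2|\Delta |)\), so

$$\begin{aligned} \big |\{w\in \mathbb {R}: |2w\Delta -d|\le M\}\big |=\frac{M}{|\Delta |}\le \frac{2\max (\alpha _{a}\beta _{a}, \alpha _{b}\beta _{b})}{c\, d(B^{a}, B^{b})\,\alpha _{a}}, \end{aligned}$$

whatever the value of the real number d. Comparing the right-hand side with \(|S^{a}|\) relies on the definition of \(S^{a}\) and on the smallness of the earlier terms, which are not reproduced in this remark.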

Next let us assume that for some m with \(2\le m\le d\), we have

$$\begin{aligned} \frac{\big |\sum _{i=1}^{m} \left( {\begin{array}{c}m\\ i\end{array}}\right) {(-t_{a})}^{m-i}({{\bar{y}}}_{i}^b-\bar{y}_{i}^{a})-({{\bar{y}}}_{1}^b- \bar{y}_{1}^a)^m\big |}{\alpha _a\beta _{a}^{m-1}}\ge c d(B^{a}, B^{b}). \end{aligned}$$

We define polynomials \(Q^{a}_{j}\) and \(Q^{b}_{j}\) on \(\mathbb {R}^{d}\) by

$$\begin{aligned}&Q^{a}_{j}(y)=\sum _{i=1}^{j} \left( {\begin{array}{c}j\\ i\end{array}}\right) {(-t_{a})}^{j-i}(y_{i}-{{\bar{y}}}_{i}^{a})-(y_{1}-{{\bar{y}}}_{1}^a)^j\\&Q^{b}_{j}(y)=\sum _{i=1}^{j} \left( {\begin{array}{c}j\\ i\end{array}}\right) {(-t_{b})}^{j-i}(y_{i}-{{\bar{y}}}_{i}^{b})-(y_{1}-{{\bar{y}}}_{1}^b)^j \end{aligned}$$

for each \(2\le j\le d\). For every \(z\in \mathbb {R}\) we define \(t(z)\in \mathbb {R}^{d-1}\) so that \(Q^{b}_{j}(z, t(z))=0\) for each j. Now we define a one variable polynomial \(P(z)=Q^{a}_{m}(z, t(z))\). Observe that

$$\begin{aligned} |P(z)|>\,\alpha _{a}\beta _{a}^{m-1}+\alpha _{b}\beta _{b}^{m-1}\quad \text {implies}\quad B^{a}\cap B^{b}\cap (\{z\}\times \mathbb {R}^{d-1})=\emptyset . \end{aligned}$$

Note that \(P({{\bar{x}}}_1^{b})\ge cd(B^{a}, B^{b}) \alpha _{a}\beta _{a}^{m-1}\). Let

$$\begin{aligned} \epsilon =\frac{3\max (\alpha _a\beta _{a}^{m-1}, \alpha _{b}\beta _{b}^{m-1})}{d(B^{a}, B^{b})\, \alpha _{a}\beta _{a}^{m-1}}\le d(B^{a}, B^{b})^{-\frac{1}{2}}. \end{aligned}$$

Then for all \(z\in {{\bar{x}}}_1^b+S^{b}\) we have

$$\begin{aligned} P(z)\ge \epsilon d(B^{a}, B^{b}) \alpha _{a}\beta _{a}^{m-1}= 3\max (\alpha _a\beta _{a}^{m-1}, \alpha _{b}\beta _{b}^{m-1}) \ge \alpha _{a}\beta _{a}^{m-1}+\alpha _{b}\beta _{b}^{m-1} \end{aligned}$$

except on a set of measure smaller than \(C\epsilon ^{c}|S^{b}|\). Therefore one has

$$\begin{aligned} |B^{a}\cap B^{b}|\le C\epsilon ^{c}|B^{b}|\le C d(B^{a}, B^{b})^{-c} |B^{b}|. \end{aligned}$$

Similar arguments give the required inequality if \(\frac{\big |\sum _{i=1}^m\left( {\begin{array}{c}m\\ i\end{array}}\right) {(-t_{b})}^{m-i}(\bar{y}_{i}^a-{{\bar{y}}}_{i}^{b})-({{\bar{y}}}_{1}^a-\bar{y}_{1}^b)^m\big |}{\alpha _b\beta _{b}^{m-1}}\ge cd(B^a,B^b)\).

Now let us assume that for some \(2\le m\le d\) we have

$$\begin{aligned} \frac{|\sum _{i=1}^{m} \left( {\begin{array}{c}m\\ i\end{array}}\right) {(-t_{a})}^{m-i}({{\bar{x}}}_{i}^b-{{\bar{x}}}_{i}^a)|}{\alpha _a\beta _{a}^{m-1}}\ge cd(B^a,B^b) \end{aligned}$$

and that all the previous terms are less than \(c'd(B^a,B^b)\) where \(c'\) is a small positive number to be chosen precisely in a moment. Since both sides of the equation are invariant if we replace \((B^a,B^b)\) by \((\phi ^{*}(B^a), \phi ^{*}(B^b))\), we can assume that \(B^a=B(0,0,\alpha _a,1)\) and \(B^b=B({{\bar{x}}}^b,t,\alpha _b,\beta _b)\). This implies

$$\begin{aligned} |{{\bar{x}}}_m^b|=|{{\bar{y}}}_m^b-t^m|> cd(B^a,B^b)\alpha _a \end{aligned}$$
(5.7)

and

  • \(\frac{\alpha _a}{\alpha _b}+\frac{\alpha _b}{\alpha _a}< c'd(B^a,B^b)\);

  • \(|{{\bar{y}}}_1^b-t|<c'd(B^a,B^b)\text {max}(\alpha _a,\alpha _b)\);

  • \(|{{\bar{y}}}_m^b-({{\bar{y}}}_1^b)^m|<c'd(B^a,B^b)\alpha _a\).

This implies \(t={{\bar{y}}}_1^b+\mathcal {O}(c'd(B^a,B^b))\alpha _a\) which in turn implies

$$\begin{aligned} |{{\bar{x}}}_m^b|=|{{\bar{y}}}_m^b-t^m|=|{{\bar{y}}}_m^b-(\bar{y}_1^b)^m+\mathcal {O}(c'd(B^a,B^b))\alpha _a|<\mathcal {O}(c'd(B^a,B^b))\alpha _a. \end{aligned}$$

We choose \(c'\) small enough such that this contradicts (5.7). \(\square \)

We also have an almost triangle inequality.

Lemma 5.5

There exists a constant \(C<\infty \) depending only on the dimension d such that for any three paraballs \(B^{a},B^{b},B^{d}\) we have

$$\begin{aligned} d(B^{a},B^{b})\le C \left( d(B^{a}, B^{d})^{C} + d(B^{d},B^{b})^{C}\right) . \end{aligned}$$

Proof

Without loss of generality we may assume that \(B^{d}=B(0,0,\alpha ,1)\) or \(B(0,0,1,\beta )\). The parameters specifying \(B^{a}\) and \(B^{b}\) are controlled by \(\eta _{1}=d(B^{a}, B^d)\) and \(\eta _{2}=d(B^{b}, B^d)\) respectively. Also for any three positive numbers \(\beta _{1}, \beta _{2}, \beta \), one has \(\frac{\beta _{1}}{\beta _{2}}+\frac{\beta _{2}}{\beta _{1}}\le C \bigg (\big (\frac{\beta _{1}}{\beta }+\frac{\beta }{\beta _{1}}\big )^{C}+ \big (\frac{\beta _{2}}{\beta }+\frac{\beta }{\beta _{2}}\big )^{C}\bigg )\). Therefore \(d(B^{a}, B^{b})\) is bounded from above by \(C\eta ^{C}\) where \(\eta =\max \big (d(B^{a}, B^{d}),\,d(B^{d},B^{b})\big )\). \(\square \)
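The elementary inequality for three positive numbers used in this proof can be verified directly; the following sketch suggests that the exponent 2 already suffices there. The shorthand \(u=\frac{\beta _{1}}{\beta }+\frac{\beta }{\beta _{1}}\) and \(v=\frac{\beta _{2}}{\beta }+\frac{\beta }{\beta _{2}}\) is introduced only for this remark. Since each factor below is bounded by u or by v,

$$\begin{aligned} \frac{\beta _{1}}{\beta _{2}}=\frac{\beta _{1}}{\beta }\cdot \frac{\beta }{\beta _{2}}\le uv,\qquad \frac{\beta _{2}}{\beta _{1}}=\frac{\beta _{2}}{\beta }\cdot \frac{\beta }{\beta _{1}}\le uv,\qquad \text {hence}\qquad \frac{\beta _{1}}{\beta _{2}}+\frac{\beta _{2}}{\beta _{1}}\le 2uv\le u^{2}+v^{2}. \end{aligned}$$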

We also have the following covering property.

Lemma 5.6

There exists a constant \(C<\infty \) depending only on the dimension d such that for any two paraballs \(B^{a},B^{b}\) we have

$$\begin{aligned} B^{a}\subset C(d(B^{a},B^{b}))^C B^{b} \end{aligned}$$

where \(C(d(B^{a},B^{b}))^C B^{b}\) is defined as in (5.2).

Proof

Without loss of generality we may assume that \(B^{b}=B(0,0,\alpha ,1)\). Let \(B^{a}=B({\bar{x}}, t_{0}, \alpha _1, \beta _1)\) and let \(\eta =d(B^{a}, B^{b})\). Then there is a constant C such that the parameters corresponding to \(B^{a}\) are controlled by \(C\eta ^C\). After some elementary algebra it follows that \(B^{a}\subset C\eta ^C B^{b}\). \(\square \)

In the next Proposition we shall prove that \((B(0,0,1,1), B^{*}(0,0,1,1))\) [and hence \((B(0,0,\alpha ,\alpha ), B^{*}(0,0,\alpha ,\alpha ))\)] are quasiextremal pairs for \(T:L^{p_\theta }\rightarrow L^{q_\theta }\) for every \(\theta \in [0,1]\). In Lemma 7.3 we shall prove that for \(0<\theta <1\) these are essentially the only quasi-extremal pairs. For \(\theta =0\), in addition to the above we also have \(\big (B(0,0,\alpha ,1),B^{*}(0,0,\alpha ,1)\big )\) for every \(0<\alpha <1\) and for \(\theta =1\), we have \(\big (B(0,0,1,\beta ),B^{*}(0,0,1,\beta )\big )\) for every \(0<\beta <1\) which are quasiextremal pairs for T.

Proposition 5.7

There exists \(c>0\) which depends only on the dimension d with the following property.

  • \(\big (B(0,0,\alpha ,1), B^{*}(0,0,\alpha ,1)\big )\) is a c-quasi-extremal pair for \(T:L^{p_0}\rightarrow L^{q_0}\) for all \(0<\alpha <1\);

  • \(\big (B(0,0,1,\beta ), B^{*}(0,0,1,\beta )\big )\) is a c-quasi-extremal pair for \(T:L^{p_1}\rightarrow L^{q_1}\) for all \(0<\beta <1\);

  • \(\big (B(0,0,\alpha ,\alpha ), B^{*}(0,0,\alpha ,\alpha )\big )\) is a c-quasi-extremal pair for \(T:L^{p_\theta }\rightarrow L^{q_\theta }\) for all \(0\le \theta \le 1\) and for all \(\alpha >0\).

Proof

We shall write the proof of the first claim, the others being identical. Let \(B=B(0, 0,\alpha ,1)\) with \(0<\alpha <1\). We claim that

$$\begin{aligned} T(B, B^{*})\ge c\alpha ^{d},\qquad |B|\le \alpha ^{d-1},\qquad |B^{*}|\le \alpha ^{d} \end{aligned}$$
(5.8)

which after some elementary calculations implies that

$$\begin{aligned} \frac{T(B, B^{*})}{|B|^{\frac{2}{d+1}}|B^{*}|^{1-\frac{2(d-1)}{d(d+1)}}}\ge c. \end{aligned}$$
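The elementary calculation referred to here is a count of powers of \(\alpha \). Both exponents in the denominator are positive, so the upper bounds in (5.8) may be inserted directly, and

$$\begin{aligned} |B|^{\frac{2}{d+1}}|B^{*}|^{1-\frac{2(d-1)}{d(d+1)}}\le \alpha ^{\frac{2(d-1)}{d+1}}\,\alpha ^{d-\frac{2(d-1)}{d+1}}=\alpha ^{d}, \end{aligned}$$

which combined with \(T(B, B^{*})\ge c\alpha ^{d}\) gives the displayed lower bound.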

The upper bounds on the sizes of B and \(B^{*}\) follow directly from the definition. Let us fix a small number \(r> 0\) (to be chosen precisely later) which depends only on d. Define \(B_{r}\) to be the set of all \(y\in \mathbb {R}^d\) such that

  • \(|y_{1}| \le r\);

  • \(|y_{m}-y_{1}^{m}|\le r\alpha \) for all \(1< m\le d\).

Then \(|B_{r}|\ge r^d |B|\).

We want to show that if r is sufficiently small then for every \(y\in B_{r}\), the set of all \(x\in B^{*}\) with \((x, y)\in \Sigma \) has measure at least \(r\alpha \) (with respect to the parameter \(x_1\)). This gives \(T(B, B^{*})\ge r^{d+1}\alpha |B|\ge r^{d+1} |B|^{\frac{2}{d+1}}|B^{*}|^{1-\frac{2(d-1)}{d(d+1)}}\).

Let us fix \(y\in B_r\). For each \(x_{1}\in \mathbb {R}\) with \(|x_{1}|< r\alpha \), we define \(x'=(x_2,x_3, \ldots ,x_d)\in \mathbb {R}^{d-1}\) by \(x_{m}=y_{m}-(y_{1}-x_{1})^{m}\) for \(2\le m\le d\), so that \((x, y)\in \Sigma \). Now \(x\in B^{*}\) if and only if

$$\begin{aligned} |x_{m}|= |y_{m}-(y_{1}-x_{1})^{m}|<\alpha . \end{aligned}$$
(5.9)

Now,

$$\begin{aligned} \begin{aligned} x_{m}&=y_{m}-(y_{1}-x_{1})^{m}\\&=y_{1}^{m}-(y_{1}-x_{1})^{m}+\mathcal {O}{(r)}\alpha \\&\le \mathcal {O}{(r)}\alpha <\alpha , \end{aligned} \end{aligned}$$
(5.10)

if we choose r to be sufficiently small. \(\square \)

6 Quasiextremal Pairs and Paraballs

Let E and F be subsets of \(\mathbb {R}^d\) with finite positive Lebesgue measure. Write \(\mathcal {T}(E,F)=\langle T(\chi _{E}), \chi _{F} \rangle \) and \(\mathcal {T}(f,g)=\langle T(f), g\rangle \). Define \(\alpha \) and \(\beta \) by

$$\begin{aligned} \alpha |E|= \beta |F|= \langle T(\chi _{E}), \chi _{F} \rangle . \end{aligned}$$

Then T being restricted weak type \((p_0,q_0)=\big (\frac{d+1}{2},\frac{d(d+1)}{2(d-1)}\big )\) is equivalent to

$$\begin{aligned} |E|\ge c \beta ^{\frac{d(d+1)}{2}} {\bigg (\frac{\alpha }{\beta }\bigg )}^{d-1}. \end{aligned}$$
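This equivalence can be sketched as follows, writing A for the restricted weak type constant (a symbol chosen here only for illustration) and using \(\frac{1}{p_0}=\frac{2}{d+1}\), \(\frac{1}{q_0}=\frac{2(d-1)}{d(d+1)}\). Substituting \(\mathcal {T}(E,F)=\alpha |E|\) and \(|F|=\alpha |E|/\beta \) into \(\mathcal {T}(E,F)\le A|E|^{\frac{1}{p_0}}|F|^{1-\frac{1}{q_0}}\) and rearranging gives

$$\begin{aligned} \alpha ^{\frac{1}{q_0}}\beta ^{1-\frac{1}{q_0}}\le A\,|E|^{\frac{1}{p_0}-\frac{1}{q_0}}=A\,|E|^{\frac{2}{d(d+1)}}. \end{aligned}$$

Raising both sides to the power \(\frac{d(d+1)}{2}\) yields

$$\begin{aligned} |E|\ge A^{-\frac{d(d+1)}{2}}\,\alpha ^{d-1}\beta ^{\frac{d(d+1)}{2}-(d-1)}=A^{-\frac{d(d+1)}{2}}\,\beta ^{\frac{d(d+1)}{2}}\bigg (\frac{\alpha }{\beta }\bigg )^{d-1}, \end{aligned}$$

and each step is reversible.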

In addition, if (E, F) is an \(\epsilon \)-quasiextremal pair then by Definition 1.4 we also have

$$\begin{aligned} |E|\le c {\epsilon }^{-C} \beta ^{\frac{d(d+1)}{2}} {\bigg (\frac{\alpha }{\beta }\bigg )}^{d-1} \end{aligned}$$

for some \(C> 0\). We aim to exploit these two inequalities simultaneously to obtain information about \(\epsilon \)-quasiextremal pairs and prove the following theorem.

Theorem 6.1

Let \(d>2\). There exists a constant C, depending only on d, such that for any \(\epsilon \)-quasiextremal pair (E, F) there exists a paraball B such that

$$\begin{aligned} T(E\cap B, F\cap B^*)\ge {C}^{-1} {\epsilon }^{C} \mathcal {T}(E,F) \end{aligned}$$

and

$$\begin{aligned} |B|\le |E|\quad \text {and}\quad |B^*|\le |F|. \end{aligned}$$

7 Parametrization of Subsets of E and F

The following lemma is Lemma 3.7 in [12].

Lemma 7.1

If d is even there exists a point \({{\bar{y}}}\) in E, a measurable subset \(\Omega \subset \mathbb {R}^{d+1}\) such that

  • \(|\Omega |=c \alpha ^{\frac{d+2}{2}} \beta ^{\frac{d}{2}}\);

  • \({{\bar{y}}} -h(t_1)+h(t_2)-h(t_3)- \cdots +h(t_j)\in E\) for every \(t=(t_1, \ldots ,t_{d+1})\in \Omega \) and for every even j;

  • \({{\bar{y}}}-h(t_1)+h(t_2)-h(t_3)- \cdots -h(t_j)\in F\) for every \(t=(t_1, \ldots ,t_{d+1})\in \Omega \) and for every odd j;

  • \(t_1< t_2< \cdots < t_d\) for every \(t=(t_1,t_2,\ldots ,t_{d+1})\in \Omega \);

  • \(t_i-t_{i-1}\ge c\beta \) for every even i;

  • \(t_i-t_{i-1}\ge c\alpha \) for every odd i;

and if d is odd, there exists a point \({\bar{x}}\) in F, a measurable subset \(\Omega \subset \mathbb {R}^{d+1}\) such that

  • \(|\Omega |=c \alpha ^{\frac{d+1}{2}} \beta ^{\frac{d+1}{2}}\);

  • \({\bar{x}}+h(t_{1})-h(t_{2})+h(t_{3})- \cdots +h(t_{j})\in E\) for every \(t=(t_{1},\ldots ,t_{d+1})\in \Omega \) and for every odd j;

  • \({\bar{x}}+h(t_{1})-h(t_{2})+h(t_{3})- \cdots -h(t_{j})\in F\) for every \(t=(t_{1}, \ldots ,t_{d+1})\in \Omega \) and for every even j;

  • \(t_{1}< t_{2}< \cdots < t_{d}\) for every \(t=(t_{1}, t_{2},\ldots , t_{d+1})\in \Omega \).

  • \(t_i-t_{i-1}\ge c\beta \) for every odd i;

  • \(t_i-t_{i-1}\ge c\alpha \) for every even i.

Here c is a small positive constant independent of \(E,F,\alpha ,\beta \) and depends only on d.

Now, as in [2], considering the map \((t_1,\ldots ,t_d)\mapsto {{\bar{y}}} -h(t_1)+h(t_2)-h(t_3)- \cdots +(-1)^dh(t_d)\in E\), we have

$$\begin{aligned} |E|\ge c \int _{\Omega } \prod _{1\le i < j\le d} (t_j-t_i). \end{aligned}$$
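The product in the integrand is the Jacobian of this map up to a constant. As a sketch, assuming \(h(t)=(t,t^{2},\ldots ,t^{d})\) as in (1.1), the columns of the Jacobian matrix are \(\pm h'(t_j)\) with \(h'(t)=(1,2t,\ldots ,dt^{d-1})\), so factoring i out of the i-th row leaves a Vandermonde determinant:

$$\begin{aligned} \Big |\det \big (-h'(t_1),\,h'(t_2),\,\ldots ,\,(-1)^{d}h'(t_d)\big )\Big |=d!\,\Big |\det \big (t_j^{\,i-1}\big )_{1\le i,j\le d}\Big |=d!\prod _{1\le i<j\le d}(t_j-t_i), \end{aligned}$$

where the last product is positive because \(t_1<t_2<\cdots <t_d\) on \(\Omega \). The lower bound on \(|E|\) then follows from the change of variables formula, provided the map is boundedly finite-to-one on the ordered region, as in the argument of [2].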

Since (E, F) is an \(\epsilon \)-quasiextremal pair, in addition to the conclusions of Lemma 7.1 we also have, for each \(t\in \Omega \),

  • \(t_i \le t_{i-1}+\epsilon ^{-C}\alpha \) for every odd \(1< i \le d+1\);

  • \(t_i \le t_{i-1}+\epsilon ^{-C}\beta \) for every even \(1< i \le d+1\);

where C is an absolute constant depending only on d.

Lemma 7.2

There exists \(C<\infty \), depending only on d, with the following properties. If (E, F) is an \(\epsilon \)-quasiextremal pair with \(\alpha |E|=\beta |F|=\mathcal {T}(E, F)\), then there exist \(t_0\in \mathbb {R}\), a point \({{\bar{y}}}\) in E and a measurable subset \(\Omega \subset \mathbb {R}^{d+1}\) such that, if d is even,

  • \(|\Omega |=c \alpha ^{\frac{d+2}{2}} \beta ^{\frac{d}{2}}\);

  • \({{\bar{y}}}-h(t_{1})+h(t_{2})-h(t_{3})+ \cdots +h(t_{j})\in E\) for every \(t=(t_{1},\ldots ,t_{d+1})\in \Omega \) and for every even j;

  • \({{\bar{y}}}-h(t_{1})+h(t_{2})-h(t_{3})- \cdots -h(t_{j})\in F\) for every \(t=(t_{1}, \ldots ,t_{d+1})\in \Omega \) and for every odd j;

  • \(t_i \le t_{i-1}+\epsilon ^{-C}\alpha \) for every odd \(1< i < d\);

  • \(t_i \le t_{i-1}+\epsilon ^{-C}\beta \) for every even \(1< i \le d\);

  • \(t_{1}< t_2< \cdots < t_d\) for every \(t\in \Omega \);

  • \(|t_1-t_0|\le \epsilon ^{-C}\alpha \) for all \(t\in \Omega \);

  • \(t_i-t_{i-1}\ge c\beta \) for every even i;

  • \(t_i-t_{i-1}\ge c\alpha \) for every odd i.

and if d is odd, there exists a point \({\bar{x}}\) in F, a measurable subset \(\Omega \subset \mathbb {R}^{d+1}\) such that

  • \(|\Omega |=c \alpha ^{\frac{d+1}{2}} \beta ^{\frac{d+1}{2}}\);

  • \({\bar{x}}+h(t_{1})-h(t_{2})+h(t_{3})- \cdots +h(t_{j})\in E\) for every \(t=(t_{1},\ldots ,t_{d+1})\in \Omega \) and for every odd j;

  • \({\bar{x}}+h(t_{1})-h(t_{2})+h(t_{3})- \cdots -h(t_{j})\in F\) for every \(t=(t_{1}, \ldots ,t_{d+1})\in \Omega \) and for every even j;

  • \(t_{1}< t_{2}< \cdots < t_{d}\) for every \(t=(t_{1}, t_{2},\ldots , t_{d+1})\in \Omega \).

  • \(|t_{1}-t_{0}|\le \epsilon ^{-C}\beta \) for all \(t\in \Omega \);

  • \(t_i-t_{i-1}\ge c\beta \) for every odd i;

  • \(t_i-t_{i-1}\ge c\alpha \) for every even i.

Proof

The proof is quite straightforward. If the \(\Omega \) from Lemma 7.1 does not satisfy the property that for some \(t_0\), \(|t_1-t_0|\le \epsilon ^{-C}\alpha \) for all \(t\in \Omega \), then in the proof of Lemma 7.1 we iterate the construction of the sets \(\Omega _k\) up to \(d+3\) times. We then fix a point \(s\in \Omega _{d+3}\) and apply Lemma 7.1 with \({{\bar{y}}}\) replaced by \(\bar{y}-h(s_1)+h(s_2)\). \(\square \)

Lemma 7.3

There exist \(c,C<\infty \) with the following properties. Let (E, F) be an \(\epsilon \)-quasiextremal pair for \(T:L^{p_\theta }\rightarrow L^{q_\theta }\) and \(\alpha |E|=\beta |F|=\mathcal {T}(E,F)\). Then

  • If \(\theta =0\) then \(\alpha \le C\epsilon ^{-C}\beta \);

  • If \(\theta =1\) then \(\beta \le C\epsilon ^{-C}\alpha \);

  • If \(0<\theta <1\) then \(c\epsilon ^{\frac{C}{1-\theta }}\alpha \le \beta \le C\epsilon ^{\frac{-C}{\theta }}\alpha \).

Proof

We shall first consider the case when \(\theta =0\) and d is even. The proof when d is odd is identical. If (E, F) is an \(\epsilon \)-quasiextremal pair for \(T:L^{p_0}\rightarrow L^{q_0}\), then by Lemma 7.2 one has, for all \(t=(t_1, t_2, \ldots ,t_d)\in \Omega \),

$$\begin{aligned} t_2+c\alpha<t_3<t_1+C\epsilon ^{-C}\beta <t_2+C\epsilon ^{-C}\beta . \end{aligned}$$

This implies \(\alpha <C\epsilon ^{-C}\beta \).

Next let us consider the case when \(\theta =1\). If (E, F) is an \(\epsilon \)-quasiextremal pair for \(T:L^{p_1}\rightarrow L^{q_1}\), then one has

$$\begin{aligned} \langle T^{*}(\chi _F), \chi _E\rangle =\langle T(\chi _E), \chi _F\rangle \ge \epsilon |E|^{\frac{1}{p_1}}|F|^{\frac{1}{q'_1}} \end{aligned}$$

and

$$\begin{aligned} \beta |F|=\alpha |E|=\langle T^{*}(\chi _F), \chi _E\rangle . \end{aligned}$$

This implies that (F, E) is an \(\epsilon \)-quasiextremal pair for \(T^{*}:L^{q'_1}=L^{p_0}\rightarrow L^{p'_1}=L^{q_0}\). Since \(T^{*}\) is the convolution with the affine arclength measure on the curve \(-h(t)\), one has

$$\begin{aligned} \beta \le C\epsilon ^{-C}\alpha . \end{aligned}$$

Let us now fix a \(\theta \in (0,1)\). Let (E, F) be such that

$$\begin{aligned} \langle T(\chi _E), \chi _F\rangle < \epsilon ^\frac{C}{1-\theta } |E|^{\frac{1}{p_0}}|F|^{\frac{1}{q'_0}}. \end{aligned}$$

This implies

$$\begin{aligned} \begin{aligned} \langle T(\chi _E), \chi _F\rangle&=(\langle T(\chi _E), \chi _F\rangle )^{1-\theta } (\langle T(\chi _E), \chi _F\rangle )^\theta \\&<\epsilon ^{C}\big (|E|^{\frac{1}{p_0}}|F|^{\frac{1}{q'_0}}\big )^{1-\theta } \big (A_{p_1}|E|^{\frac{1}{p_1}}|F|^{\frac{1}{q'_1}}\big )^\theta \\&=\epsilon ^{C} A_{p_1}^\theta |E|^{\frac{1}{p_\theta }} |F|^{\frac{1}{q'_\theta }}. \end{aligned} \end{aligned}$$
(7.1)

Similarly \(\langle T(\chi _E), \chi _F\rangle <\epsilon ^{C} A_{p_0}^{1-\theta } |E|^{\frac{1}{p_\theta }} |F|^{\frac{1}{q'_\theta }}\) if \(\langle T(\chi _E), \chi _F\rangle < \epsilon ^\frac{C}{\theta } |E|^{\frac{1}{p_1}}|F|^{\frac{1}{q'_1}}\). This implies that if (E, F) is an \(\epsilon \)-quasiextremal pair for \(T:L^{p_\theta }\rightarrow L^{q_\theta }\) then (E, F) is an \(\epsilon ^\frac{C}{1-\theta }\)-quasiextremal pair for \(T:L^{p_0}\rightarrow L^{q_0}\) and an \(\epsilon ^{\frac{C}{\theta }}\)-quasiextremal pair for \(T:L^{p_1}\rightarrow L^{q_1}\). Thus by the results for \(\theta =0\) and \(\theta =1\), we have

$$\begin{aligned} c\epsilon ^{\frac{C}{1-\theta }}\alpha \le \beta \le C\epsilon ^{\frac{-C}{\theta }}\alpha . \end{aligned}$$

\(\square \)
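The bookkeeping of exponents in the last step can be sketched as follows, writing \(C_0\) for the constant obtained in the endpoint cases \(\theta =0\) and \(\theta =1\) (a label introduced only for this remark). Applying the endpoint bounds with \(\epsilon \) replaced by \(\epsilon ^{\frac{C}{1-\theta }}\) and \(\epsilon ^{\frac{C}{\theta }}\) respectively gives

$$\begin{aligned} \alpha \le C_0\big (\epsilon ^{\frac{C}{1-\theta }}\big )^{-C_0}\beta =C_0\,\epsilon ^{-\frac{CC_0}{1-\theta }}\beta \qquad \text {and}\qquad \beta \le C_0\big (\epsilon ^{\frac{C}{\theta }}\big )^{-C_0}\alpha =C_0\,\epsilon ^{-\frac{CC_0}{\theta }}\alpha , \end{aligned}$$

which is the stated two-sided bound after renaming \(CC_0\) as C and setting \(c=C_0^{-1}\).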

Let us consider the paraball \(B=B({\bar{x}}={\bar{y}}-h(t_0), t_{0}, C \epsilon ^{-C}\alpha , C\epsilon ^{-C}\beta )\).

Lemma 7.4

If B is as above then if d is even,

$$\begin{aligned}&E\cap B\ni {\bar{y}}-h(t_{1})+h(t_{2})-h(t_{3})- \cdots +h(t_{d}), \\&F\cap B^{*}\ni {\bar{y}}-h(t_{1})+h(t_{2})-h(t_{3})- \cdots -h(t_{d+1}) \end{aligned}$$

for every \(t\in \Omega \), and when d is odd

$$\begin{aligned}&E\cap B\ni {\bar{y}}- h(t_{1})+h(t_{2})-h(t_{3})- \cdots +h(t_{d+1}), \\&F\cap B^{*}\ni {\bar{y}}-h(t_{1})+h(t_{2})-h(t_{3})- \cdots -h(t_{d}) \end{aligned}$$

for every \(t\in \Omega \).

Proof

We will give the details when d is even, \(\epsilon =1\) and \(\theta =0\); the proofs in the other cases are essentially the same. By Lemma 7.2 it is enough to prove that \({\bar{y}}-h(t_{1})+h(t_{2})-h(t_{3})- \cdots +h(t_{d})\in B\) and \({\bar{y}}-h(t_{1})+h(t_{2})-h(t_{3})- \cdots +h(t_d)-h(t_{d+1})\in B^{*}\) for every \(t\in \Omega \).

Let us first prove that \({\bar{y}}-h(t_{1})+h(t_{2})-h(t_{3})- \cdots +h(t_{d})\in B\). We can assume, by applying a suitable symmetry if necessary, that \({\bar{x}}={\bar{y}}=0\) with \(t_{0}=0\). If \(y=(y_1,y_2, \ldots ,y_d)=-h(t_1)+h(t_2)-h(t_3)- \cdots +h(t_d)\) then Lemma 7.2 implies

$$\begin{aligned} |y_{1}|=|t_{1}-t_{2}+ \cdots -t_{d}|\le C\beta , \end{aligned}$$

and

$$\begin{aligned} \begin{aligned} y_{m}-y_{1}^{m}=&\big (-t_{1}^{m}+t_{2}^{m}- \cdots +t_{d}^{m}\big )-(-t_{1}+t_{2}- \cdots +t_{d})^{m}\\ =&-t_{1}^m+(t_{2}^m-t_{3}^m)+ \cdots +(t_{d-2}^m-t_{d-1}^m)+t_{d}^m\\&-\big [-t_{1}+(t_{2}-t_3)+ \cdots +(t_{d-2}-t_{d-1})+t_{d}\big ]^{m}\\ =&-t_{1}^{m}-(-t_1)^m+[t_{2}^m-t_{3}^m-(t_2-t_3)^m]\\&+ \cdots +[t_{d-2}^{m}-t_{d-1}^{m}-(t_{d-2}-t_{d-1})^{m}]\\&-\sum _{r_{1}+ \cdots +r_{\frac{d}{2}}=m, r_{i}\ne m} (-t_1)^{r_1}(t_{2}-t_{3})^{r_{2}} \ldots (t_{d-2}-t_{d-1})^{r_{\frac{d}{2}-1}}t_{d}^{\frac{d}{2}}\\ \le&C\alpha ^m{+}C\beta ^{m-1}|t_2{-}t_3|+C\beta ^{m-1}|t_4-t_5|+ \cdots +C\beta ^{m-1}|t_{d-2}-t_{d-1}|. \end{aligned} \end{aligned}$$
(7.2)

By Lemma 7.2 we have \(|t_i-t_{i+1}|<C\alpha \) for all even \(2\le i\le d\). By Lemma 7.3 we have \(\alpha \le C\beta \). Therefore we have \(|y_m-y_{1}^m|<C\alpha \beta ^{m-1}\). This proves our claim.

Now we shall prove that \({\bar{y}}-h(t_{1})+h(t_{2})-h(t_{3})- \cdots +h(t_d)-h(t_{d+1})\in B^{*}\) for every \(t\in \Omega \). WLOG \({\bar{x}}={\bar{y}}=0\) with \(t_{0}=0\). Let \(x=(x_1,x_2, \ldots ,x_d)=-h(t_{1})+h(t_{2})-h(t_{3})- \cdots +h(t_d)-h(t_{d+1})\). Then \(x\in B^{*}\) if and only if \(|x_1|<C\alpha \) and \(|x_m|<C\alpha \beta ^{m-1}\) for all \(1<m\le d\). By Lemma 7.2

$$\begin{aligned} |x_{1}|=|-t_{1}+(t_{2}-t_3)+ \cdots +(t_{d}-t_{d+1})|\le C\alpha \end{aligned}$$

and

$$\begin{aligned} \begin{aligned} |x_{m}|&=|-t_{1}^{m}+t_{2}^{m}- \cdots +t_{d}^{m}-t_{d+1}^m|\\&=|-t_{1}^m+(t_{2}^m-t_{3}^m)+ \cdots +(t_{d}^m-t_{d+1}^m)|\\&\le |-t_{1}^{m}|+C\beta ^{m-1}|t_{2}-t_{3}|+ \cdots +C\beta ^{m-1}|t_{d}-t_{d+1}|\\&\le C\beta ^{m-1}\alpha . \end{aligned} \end{aligned}$$
(7.3)

\(\square \)

Therefore we have \(|E\cap B({{\bar{x}}}, t_0,C\epsilon ^{-C}\alpha ,C\epsilon ^{-C}\beta )|\ge |\{\bar{y}-h(t_1)+h(t_2)-h(t_3)- \cdots +h(t_d):t\in \Omega \}|\ge c\epsilon ^C|E|\). Similarly \(|F\cap B^{*}({{\bar{x}}}, t_0,C\epsilon ^{-C}\alpha ,C\epsilon ^{-C}\beta )|\ge |\{\bar{y}-h(t_1)+h(t_2)-h(t_3)- \cdots -h(t_{d+1}):t\in \Omega \}|\ge c\epsilon ^C|F|\). This implies

  • \(\mathcal {T}(E\cap B, F\cap B^*)\ge c\epsilon ^C\mathcal {T}(E,F)\)

  • \( |B|\le C {\epsilon }^{-C} |E|\)

  • \( |B^*|\le C {\epsilon }^{-C} |F| \)

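This is weaker than the conclusion of Theorem 6.1, since the bounds on \(|B|\) and \(|B^{*}|\) lose a factor \(C\epsilon ^{-C}\); the proof of Theorem 6.1 will be completed using the following lemma.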

Lemma 7.5

There exist absolute constants \(N, C<\infty \) with the following property. For each paraball B and given any \(0<\delta \le 1\), there exists a family of paraballs \(\{B_l:l\in L\}\) with the following properties,

  • \(B\subset \cup _{l\in L}B_l\);

  • \(B^{*}\subset \cup _{l\in L}B_l^{*}\);

  • \(|L|\le N \delta ^{-C}\);

  • \(|B_l|\sim \delta |B|\) for all l;

  • \(|B^{*}_l|\sim \delta |B^{*}|\) for all l.

Proof

The proof of this lemma is similar to the proof of Lemma 7.2 in [4]. WLOG we can assume that \(B=B(0,0,\alpha ,1)\) or \(B(0,0,1,\beta )\). Let \(B=B(0,0,\alpha ,1)\); the proof for \(B(0,0,1,\beta )\) follows from a similar argument. Then \(|B|\sim \alpha ^{d-1}\) and \(|B^*|\sim \alpha ^{d}\). Let \(\eta =\delta ^{\frac{2}{d(d+1)}}\). Let us select a maximal \(\eta ^d\alpha \)-separated subset of \(B^{*}(0,0,\alpha ,1)\) with respect to the usual Euclidean distance. Let us denote this set by \(\{z^l:l\in L\}\). Then \(|L|\le C\eta ^{-C}\). Now we choose a maximal \(\eta \)-separated subset of \([-1,1]\). Let us denote this set by \(\{t_k:k\in K\}\). Then \(|K|\le C\eta ^{-C}\).

Now we define \(B_{l,k}=B(z^l,t_k,C\eta \alpha ,C\eta )\). Then \(|B_{l,k}|\sim \eta ^{\frac{d(d+1)}{2}}\alpha ^{d-1}\sim \delta |B|\) and \(|B^{*}_{l,k}|\sim \eta ^{\frac{d(d+1)}{2}}\alpha ^d\sim \delta |B^*|\). By the definition of \(B_{l,k}\) it directly follows that

$$\begin{aligned} B^{*}\subset \cup _{l,k}B_{l,k}^{*}. \end{aligned}$$
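The two volume computations above can be sketched as a count of powers of \(\eta \). This assumes the scalings \(|B({\bar{x}},t,\alpha ,\beta )|\sim \alpha ^{d-1}\beta ^{1+\frac{d(d-1)}{2}}\) and \(|B^{*}({\bar{x}},t,\alpha ,\beta )|\sim \alpha ^{d}\beta ^{\frac{d(d-1)}{2}}\), which are suggested by the membership conditions used in the proof of Lemma 7.4 and are consistent with \(|B(0,0,\alpha ,1)|\sim \alpha ^{d-1}\), \(|B^{*}(0,0,\alpha ,1)|\sim \alpha ^{d}\). Under that assumption,

$$\begin{aligned} |B_{l,k}|\sim (\eta \alpha )^{d-1}\eta ^{1+\frac{d(d-1)}{2}}=\eta ^{\frac{d(d+1)}{2}}\alpha ^{d-1},\qquad |B^{*}_{l,k}|\sim (\eta \alpha )^{d}\eta ^{\frac{d(d-1)}{2}}=\eta ^{\frac{d(d+1)}{2}}\alpha ^{d}, \end{aligned}$$

and the choice \(\eta =\delta ^{\frac{2}{d(d+1)}}\) makes \(\eta ^{\frac{d(d+1)}{2}}=\delta \).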

We shall now prove that \(B\subset \cup _{l,k}B_{l,k}\). It is enough to prove that

$$\begin{aligned} B(0,0,\eta ^d \alpha ,1)\subset \cup _{k} B(0,t_k,C\eta \alpha , C\eta ) \end{aligned}$$
(7.4)

since this implies

$$\begin{aligned} B(0,0,\alpha ,1)\subset \cup _{l}B(z^l,0,\eta ^d\alpha ,1)\subset \cup _{l}\cup _{k} B(z^l,t_k,C\eta \alpha , C\eta ). \end{aligned}$$

To prove (7.4), let \(y\in B(0,0,\eta ^d\alpha ,1)\). Now choose \(t_k\) such that \(t_k\le y_1<t_{k+1}\). We claim that \(y\in B(0,t_k,C\eta \alpha , C\eta )\). Our claim is true if and only if for each \(2\le m\le d\)

$$\begin{aligned} \bigg |\sum _{i=1}^m \left( {\begin{array}{c}m\\ i\end{array}}\right) (-t_k)^{m-i}(y_i-t_k^i)-(y_1-t_k)^m\bigg |\le C^m\eta ^m\alpha . \end{aligned}$$

Now

$$\begin{aligned}&\bigg |\sum _{i=1}^m \left( {\begin{array}{c}m\\ i\end{array}}\right) (-t_k)^{m-i}(y_i-t_k^i)-(y_1-t_k)^m\bigg |\\&\quad =\bigg |\sum _{i=1}^m \left( {\begin{array}{c}m\\ i\end{array}}\right) (-t_k)^{m-i}(y_i-y_1^i)\bigg |\le C\eta ^d\alpha \le C\eta ^m\alpha \end{aligned}$$

as \(|t_k|\le 1\) and \(y\in B(0,0,\eta ^d\alpha ,1)\).

\(\square \)

To complete the proof of Theorem 6.1, we choose C sufficiently large and apply Lemma 7.5 with \(\delta =\epsilon ^C\) to obtain paraballs \(\{B_l\}_{l\in L}\) such that \(|B_l|\le |E|\) and \(|B^{*}_l|\le |F|\) for each \(l\in L\). We now have

$$\begin{aligned} T(E\cap B, F\cap B^{*}) \le \sum _{l\in L}T(E\cap B_l, F\cap B^{*}_l). \end{aligned}$$
(7.5)

In addition we have \(|L|\le C\epsilon ^{-C}\). Therefore there exists a paraball \(B_l\) such that

$$\begin{aligned} T(E\cap B_l, F\cap B^{*}_l)\ge C^{-1}\epsilon ^{C} T(E\cap B, F\cap B^{*}) \ge c \epsilon ^{C} \mathcal {T}(E, F) \end{aligned}$$

and

$$\begin{aligned} |B_l|\le |E|\quad \text {and}\quad |B^{*}_l|\le |F|. \end{aligned}$$

8 Lorentz Spaces and \(\epsilon \)-Quasiextremal Function

Definition 8.1

Let f be a nonnegative function which is finite almost everywhere. By a rough level set decomposition of f we mean a representation of f as \(f =\sum _{j=-\infty }^{\infty }2^{j} f_j\) where \(\chi _{E_j}\le f_j\le 2\chi _{E_j}\) with the sets \(E_j\) pairwise disjoint and measurable.

We may approximate the Lorentz norms of f by

$$\begin{aligned} \Vert f\Vert _{p,r}\sim \left\{ \begin{array}{ll} \bigg (\sum _{j}(2^{j} |E_{j}|^{\frac{1}{p}})^{r}\bigg )^{\frac{1}{r}}, &{} \text{ if }\quad r<\infty \\ \sup _{j}2^{j} |E_{j}|^{\frac{1}{p}}, &{} \text{ if } \quad r=\infty \end{array} \right. \end{aligned}$$

where \(f =\sum _{j=-\infty }^{\infty }2^{j} f_j, f_j \sim \chi _{E_j}\) is a rough level set decomposition of f. In particular, \(L^{p, r}\big (\mathbb {R}^{d}\big )=\{f: \Vert f\Vert _{p,r}\,<\,\infty \}\).
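As a simple illustration of this approximation (the sets \(E_1, E_2\) and the integers \(j_1\ne j_2\) below are hypothetical, chosen only for the example), a characteristic function \(\chi _E\) has the one-term decomposition \(f_0=\chi _E\), \(E_0=E\), so that \(\Vert \chi _E\Vert _{p,r}\sim |E|^{\frac{1}{p}}\) for every r. For a two-level function \(f=2^{j_1}\chi _{E_1}+2^{j_2}\chi _{E_2}\) with \(E_1\), \(E_2\) disjoint, and \(r<\infty \),

$$\begin{aligned} \Vert f\Vert _{p,r}\sim \Big (\big (2^{j_1}|E_1|^{\frac{1}{p}}\big )^{r}+\big (2^{j_2}|E_2|^{\frac{1}{p}}\big )^{r}\Big )^{\frac{1}{r}}, \end{aligned}$$

while for \(r=\infty \) the sum is replaced by the larger of the two terms.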

The following lemma is Theorem 4.1 in [23].

Lemma 8.2

T maps \(L^{p, r}\) boundedly to \(L^{q}\) for every \(r\in (p, q)\) and every (p, q) as in (1.2).

The following lemma is also proved in the proof of Theorem 4.1 in [23].

Lemma 8.3

There exist \(C, c> 0\) with the following property. Let \(\epsilon > 0\). Let \(f=\sum _{j} 2^{j}f_j, f_j\sim \chi _{E_{j}}\) and \(g=\sum _{k} 2^{k}g_k, g_k\sim \chi _{F_{k}}\) be such that either \(\mathcal {T}(E_{j}, F_{k})\le \epsilon |E_{j}|^{\frac{1}{p}} |F_{k}|^{1-\frac{1}{q}}\) or \(2^{j}|E_{j}|^{\frac{1}{p}}\le \epsilon \Vert f\Vert _{p}\) for each j and k. Then \(\mathcal {T}(f, g)\le C\, \epsilon ^{c}\Vert f\Vert _{p}\,\Vert g\Vert _{q^{'}}\).

Lemma 8.4

There exist \(c, C<\infty \) with the following property. For each \(\epsilon > 0\), if f is a nonnegative function with rough level set decomposition \(f =\sum _{j=-\infty }^{\infty }2^{j}f_j, f_j\sim \chi _{E_{j}}\) and if f is an \(\epsilon \)-quasiextremal, then there exist \(j\in \mathbb {Z}\) and a paraball B such that

$$\begin{aligned} \Vert 2^{j}\chi _{E_{j}\cap B}\Vert _{p}\ge c\epsilon ^{C}\Vert f\Vert _{p} \end{aligned}$$

and

$$\begin{aligned} |B|\le |E_{j}|. \end{aligned}$$

Proof

The proof of this lemma follows directly from Theorem 6.1 and the previous lemma. If \(f =\sum _{j}2^{j}f_j, f_j\sim \chi _{E_{j}}\) by the previous lemma there exists \(j\in \mathbb {Z}\) such that \(\Vert 2^{j}\chi _{E_{j}\cap B}\Vert _{p}\ge c\epsilon ^{C}\Vert f\Vert _{p}\) and \(E_{j}\) is an \(\epsilon ^{c}\)-quasiextremal. Now we apply Theorem 6.1 to \(E_{j}\) to get the desired conclusion. \(\square \)

9 Two Key Lemmas

In this section we shall prove two lemmas that will be used in the later sections. The first lemma describes how paraballs interact when they are far from each other. It shows that, in some sense, when we have a collection of paraballs which are at a large distance from each other, their images under T act on nearly disjoint portions of any given set. The precise statement is given below.

Lemma 9.1

Let \(d> 2\) and let \((\frac{1}{p}, \frac{1}{q})\) be on the line segment joining the points \((\frac{2}{d+1}, \frac{2(d-1)}{d(d+1)})\) and \((1-\frac{2(d-1)}{d(d+1)}, 1-\frac{2}{d+1})\). Then there exists a positive finite constant C depending only on d with the following property. Let \(\{B_{i}\}_{i\in S}\) be a collection of paraballs such that for any \(i\ne j\) with \(i,j\in S\) we have \(d(B_i,B_j)\ge C \eta ^{-C}\) for some \(\eta > 0\). Then for any F subset of \(\mathbb {R}^d\) with positive finite Lebesgue measure, we can write \(F=\sqcup F_{i}\) so that

$$\begin{aligned} T(B_i, F_j)\le \eta |B_i|^{\frac{1}{p}} |F_j|^{\frac{1}{q'}}, \text{ for } \text{ all }\quad i\ne j. \end{aligned}$$

Proof

The proof of this lemma will be a straightforward adaptation of the proof of Lemma 4.1 in [3]. For the sake of completeness we give a sketch of the proof here. Define

$$\begin{aligned} \gamma _i=\frac{1}{3}\eta |F|^{\frac{1}{q'}-1} |B_i|^\frac{1}{p} \end{aligned}$$

and

$$\begin{aligned} {\tilde{F}}_i=\{x\in F: T(B_i)(x)> \gamma _i \}. \end{aligned}$$
(9.1)

We note that

$$\begin{aligned} T(B_i, F\setminus {\tilde{F}}_i)\le \gamma _i |F|\le \frac{1}{3}\eta |B_i|^{\frac{1}{p}} |F|^\frac{1}{q'}. \end{aligned}$$
(9.2)

Now choose \(F_i\subset {\tilde{F}}_i\) such that \(\cup _i \tilde{F}_i=\sqcup F_{i}\). Note that there are many possible choices of the \(F_i\); we just choose one such collection. Also, there might be elements of F which do not belong to \(\tilde{F}_i\) for any \(i\in S\). We pick one \(F_i\) and add these points to that particular set. Since by (9.2) we have \(T(B_i, F\setminus \sqcup F_i)\le \frac{1}{3}\eta |B_i|^{\frac{1}{p}} |F|^\frac{1}{q'}\), it is enough to prove that for \(i\ne j\)

$$\begin{aligned} T(B_i, F_j)\le \frac{2}{3}\eta |B_i|^{\frac{1}{p}} |F_j|^\frac{1}{q'}. \end{aligned}$$
(9.3)

Suppose (9.3) does not hold. Then there exists \(i\ne j\) such that \(T(B_i, F_j)> \frac{2}{3}\eta |B_i|^{\frac{1}{p}} |F_j|^\frac{1}{q'}\). For the rest of this proof we fix these two indices i, j. Define \(\mathcal {F}=F_j\cap {\tilde{F}}_i\). By (9.2) \(T(B_i, F_j\setminus {\tilde{F}}_i) \le \frac{1}{3}\eta |B_i|^{\frac{1}{p}} |F|^\frac{1}{q'}\), so we have

$$\begin{aligned} \frac{1}{3}\eta |B_i|^{\frac{1}{p}} |F|^\frac{1}{q'}\le \mathcal {T}(B_i,\mathcal {F})\le A |B_i|^{\frac{1}{p}}|\mathcal {F}|^{\frac{1}{q'}}. \end{aligned}$$
(9.4)

This implies

$$\begin{aligned} |\mathcal {F}|\ge \bigg (\frac{1}{3}\bigg )^{q'} \eta ^{q'}A^{-q'} |F|. \end{aligned}$$

Now we apply Theorem 6.1 to the pair \((B_j,\mathcal {F})\) to obtain a paraball \({\tilde{B}}_j\) such that

$$\begin{aligned} |{\tilde{B}}_j|\le |B_j|, \quad |{\tilde{B}}_j^*|\le |\mathcal {F}|\le |F|,\quad \quad |{\tilde{B}}_j\cap B_j|\ge c \eta ^\gamma |B_j|,\quad |{\tilde{B}}_j^*\cap \mathcal {F}|\ge c \eta ^\gamma |F|. \end{aligned}$$
(9.5)

Now we replace \(\mathcal {F}\) by \(\tilde{\mathcal {F}}=\mathcal {F}\cap \tilde{B_j}^*\). Since \(|\tilde{\mathcal {F}}|\ge c \eta ^\gamma |F|\) and \(T(B_i)(x)> \gamma _i\) for all \(x\in {\tilde{F}}_i\supset \mathcal {F}\supset \tilde{\mathcal {F}}\), we have

$$\begin{aligned} T(B_i,\tilde{\mathcal {F}})\ge \gamma _i |\tilde{\mathcal {F}}|\ge c\eta ^{\gamma } |\tilde{\mathcal {F}}|^{\frac{1}{q'}} |B_i|^{\frac{1}{p}}. \end{aligned}$$

This means the pair \((B_i, \tilde{\mathcal {F}})\) is a \(c\eta ^{\gamma }\)-quasiextremal pair. Therefore by applying Theorem 6.1 once more we get another paraball \(\tilde{B_i}\) such that

$$\begin{aligned} |{\tilde{B}}_i|\le |B_i|, \quad |{\tilde{B}}_i^*|\le |\tilde{\mathcal {F}}|\le |F|,\quad \quad |{\tilde{B}}_i\cap B_i|\ge c \eta ^\gamma |B_i|,\quad |{\tilde{B}}_i^*\cap \tilde{\mathcal {F}}|\ge c \eta ^\gamma |F|. \end{aligned}$$
(9.6)

Since \(\tilde{B_i}^*\cap \tilde{B_j}^*\supset \tilde{B_i}^*\cap \tilde{B_j}^*\cap \mathcal {F}\supset \tilde{B_i}^*\cap \tilde{\mathcal {F}}\), we have

$$\begin{aligned} |\tilde{B_i}^*\cap \tilde{B_j}^*|\ge | \tilde{B_i}^*\cap \tilde{\mathcal {F}}|\ge c\eta ^{\gamma } |F|\ge c \eta ^{\gamma } \text {max}(|\tilde{B_i}^*|, |\tilde{B_j}^*|). \end{aligned}$$

Now by applying Proposition 5.4 to the pair of dual paraballs \((\tilde{B_i}^*, \tilde{B_j}^*)\) we get \(d(\tilde{B_i}^*, \tilde{B_j}^*)\le C \eta ^{-C}\). This implies

$$\begin{aligned} d(\tilde{B_i}, \tilde{B_j})\le C \eta ^{-C}. \end{aligned}$$

Since \(|\tilde{B_i}|\le |B_i|\) and \(|\tilde{B_i}\cap B_i|\ge c \eta ^{\gamma } |B_i|\), we have

$$\begin{aligned} d(\tilde{B_i}, B_i)\le C \eta ^{-C}. \end{aligned}$$

Similarly

$$\begin{aligned} d(\tilde{B_j}, B_j)\le C \eta ^{-C}. \end{aligned}$$

By applying Lemma 5.5 we get \(d(B_i, B_j)\le C \eta ^{-C}\), which contradicts our hypothesis. \(\square \)

Lemma 9.2

Let \(d>2\) and let \((\frac{1}{p}, \frac{1}{q})\) be a point on the line segment joining the points \((\frac{2}{d+1}, \frac{2(d-1)}{d(d+1)})\) and \((1-\frac{2(d-1)}{d(d+1)}, 1-\frac{2}{d+1})\). There exist positive finite constants \(C,C^{'}\) depending only on d with the following property. Let \(E_1,E_2, F\) be subsets of \(\mathbb {R}^d\) with positive finite Lebesgue measure such that \(T(\chi _{E_1})\ge \eta |E_1|^{\frac{1}{p}} |F|^{{\frac{1}{q^{'}}}-1}\) and \(T(\chi _{E_2})\ge \eta |E_2|^{\frac{1}{p}} |F|^{{{\frac{1}{q^{'}}}-1}}\) on F. If \(|E_2|\ge |E_1|\), then \(|E_2|\le C^{'} \eta ^{-C} |E_1|\).

Proof

This lemma is essentially proved in the proof of Theorem 4.1 in [23] by applying the extrapolation method of Christ. Here we give a simplified proof using the increasing structure \((t_1< t_2< \cdots <t_d)\) of \(\Omega \) in Lemma 7.2. We give the proof when d is even, the other case being similar.

Let \(p_0=\frac{d+1}{2}\) and \(q_0=\frac{d(d+1)}{2(d-1)}\). Let us first consider the case when \(p=p_0\) and \(q=q_0\). Define

$$\begin{aligned} \alpha =\eta |E_1|^{\frac{1}{p_0}-1} |F|^{\frac{1}{{q_0}^{'}}},\quad \beta =\eta |E_1|^{\frac{1}{p_0}} |F|^{\frac{1}{{q_0}^{'}}-1} \text {and} \quad \gamma =\eta |E_2|^{\frac{1}{p_0}} |F|^{\frac{1}{{q_0}^{'}}-1}. \end{aligned}$$

Since \(T(\chi _{E_1})\ge \eta |E_1|^{\frac{1}{p_0}} |F|^{\frac{1}{{q_0}^{'}}-1}\) on F, we have

$$\begin{aligned} \langle \chi _{E_1}, T^{*}(\chi _{F})\rangle =T(E_1,F)\ge \eta |E_1|^{\frac{1}{p_0}} |F|^{\frac{1}{{q_0}^{'}}}. \end{aligned}$$

Therefore on a large subset of \(E_1\), \(T^{*}(\chi _{F})\ge \alpha \). Similarly on a large subset of F, \(T(\chi _{E_1})\ge \beta \) and \(T(\chi _{E_2})\ge \gamma \).

Similar to the proof of Theorem 6.1 there exists a point \({\bar{y}}\in E_1\) such that we can travel along the curve shifted to \({\bar{y}}\) inside F for a length of \(\alpha \). Then for each of these points on this travelled path we can travel back inside \(E_1\) for a length of \(\beta \). We continue this process \(d-1\) times. At the dth step we move into \(E_2\) along the curve for a length \(\gamma \).

As a result we get a set \(\Omega \subset \mathbb {R}^{d}\) such that

  • \(|\Omega |=c \alpha ^{\frac{d}{2}} \beta ^{\frac{d}{2}-1} \gamma \);

  • \({{\bar{y}}} - h(t_1)+h(t_2)-h(t_3)- \cdots +h(t_j)\in E_1\) for every \(t=(t_1,\ldots ,t_d)\in \Omega \) and for every even \(j\le d-2\);

  • \({{\bar{y}}} - h(t_1)+h(t_2)-h(t_3)- \cdots -h(t_j)\in F\) for every \(t=(t_1,\ldots ,t_d)\in \Omega \) and for every odd \(j\le d-1\);

  • \({{\bar{y}}} - h(t_1)+h(t_2)-h(t_3)- \cdots +h(t_d)\in E_2\);

  • \(t_1< t_2< \cdots < t_d\) for every \(t=(t_1,t_2,\ldots ,t_d)\in \Omega \).

Now we consider the Jacobian, J(t), of the map \((t_1,t_2,\ldots ,t_d)\mapsto {{\bar{y}}} - h(t_1)+h(t_2)-h(t_3)- \cdots +h(t_d)\). We have

$$\begin{aligned} J\ge \prod _{1\le i< j\le d}|t_i-t_j|. \end{aligned}$$
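The lower bound on J can be traced to a Vandermonde determinant. The following is a sketch under the assumption, taken from (1.1), that \(h(t)=(t,t^2,\ldots ,t^d)\), so that \(h'(t)=(1,2t,\ldots ,dt^{d-1})\). The j-th column of the Jacobian matrix is \(\pm h'(t_j)\); pulling the signs out of the columns and the factors \(1,2,\ldots ,d\) out of the rows leaves

$$\begin{aligned} |J(t)|=\Big |\det \big (k\,t_j^{k-1}\big )_{1\le k,j\le d}\Big |=d!\,\Big |\det \big (t_j^{k-1}\big )_{1\le k,j\le d}\Big |=d!\prod _{1\le i<j\le d}|t_i-t_j|. \end{aligned}$$

Since \(d!\ge 1\), this is consistent with the stated inequality.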

Since \(t_1< t_2< \cdots < t_d\) for every \(t\in \Omega \), we have

  • \(\prod _{1\le i< d}(t_i-t_d)\ge \gamma ^{d-1}\);

  • \(\prod _{1\le i< j}(t_i-t_j) \ge \beta ^{j-1}\) for every even \(j< d\);

  • \(\prod _{1\le i< j}(t_i-t_j)\ge \alpha \beta ^{j-2}\) for every odd \(j \le d-1\).

Therefore as in the proof of Theorem 6.1 we get

$$\begin{aligned} |E_2|\ge |\Omega |\, \min _{t} J(t)\ge c \alpha ^{\frac{d}{2}} \beta ^{\frac{d}{2}-1} \gamma \cdot \alpha ^{\frac{d}{2}-1} \beta ^{\frac{(d-2)^2}{2}} \gamma ^{d-1}. \end{aligned}$$
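The exponents in the Jacobian factor can be checked by multiplying the three families of estimates listed above; this is only a bookkeeping sketch for even d. The odd indices that contribute are \(j=3,5,\ldots ,d-1\) (the product for \(j=1\) is empty), so there are \(\frac{d}{2}-1\) of them, and the sum of the first m odd integers is \(m^2\):

$$\begin{aligned} \prod _{1\le i<j\le d}|t_i-t_j|&\ge \gamma ^{d-1}\prod _{\begin{array}{c} j\ \text {even}\\ 2\le j\le d-2 \end{array}}\beta ^{j-1}\prod _{\begin{array}{c} j\ \text {odd}\\ 3\le j\le d-1 \end{array}}\alpha \,\beta ^{j-2},\\ \sum _{\begin{array}{c} j\ \text {even}\\ 2\le j\le d-2 \end{array}}(j-1)+\sum _{\begin{array}{c} j\ \text {odd}\\ 3\le j\le d-1 \end{array}}(j-2)&=2\big (1+3+\cdots +(d-3)\big )=2\Big (\frac{d}{2}-1\Big )^{2}=\frac{(d-2)^2}{2}. \end{aligned}$$

Hence the Jacobian is bounded below by \(\alpha ^{\frac{d}{2}-1}\beta ^{\frac{(d-2)^2}{2}}\gamma ^{d-1}\) on \(\Omega \).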

After substituting the values of \(\alpha \), \(\beta \) and \(\gamma \) in terms of \(|E_1|\), \(|E_2|\) and |F| we get

$$\begin{aligned}&|E_2|\ge c \eta ^{\frac{d(d-1)}{2}} |E_2|^{\frac{d}{p_0}} |E_1|^{\big (d-1\big )\big (\frac{1}{p_0}-1\big )+\frac{\big (\frac{d}{2}-1\big )\big (d-1\big )}{p_0}}\\&\quad |F|^{d\big (\frac{1}{q^{'}_0}-1\big )+\frac{d-1}{q^{'}_0}+\big (\frac{d}{2}-1\big )\big (d-1\big )\big (\frac{1}{q^{'}_0}-1\big )} \end{aligned}$$

which implies

$$\begin{aligned} |E_2|^{-\frac{d-1}{d+1}}\ge c \eta ^{\frac{d(d-1)}{2}} |E_1|^{-\frac{d-1}{d+1}}. \end{aligned}$$

This is equivalent to

$$\begin{aligned} |E_2|\le C \eta ^{-\frac{d(d+1)}{2}} |E_1|. \end{aligned}$$
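The cancellation of |F| and the exponents of \(|E_1|\) and \(|E_2|\) in the displays above can be verified directly. A sketch, using only \(\frac{1}{p_0}=\frac{2}{d+1}\) and \(\frac{1}{q_0^{'}}-1=-\frac{1}{q_0}=-\frac{2(d-1)}{d(d+1)}\):

$$\begin{aligned} \text {exponent of } |E_1|:&\quad (d-1)\Big (\frac{1}{p_0}-1\Big )+\frac{\big (\frac{d}{2}-1\big )(d-1)}{p_0}=(d-1)\Big (\frac{d}{2p_0}-1\Big )=-\frac{d-1}{d+1},\\ \text {exponent of } |F|:&\quad (d-1)-\frac{1}{q_0}\Big (d+(d-1)+\Big (\frac{d}{2}-1\Big )(d-1)\Big )=(d-1)-\frac{1}{q_0}\cdot \frac{d(d+1)}{2}=0,\\ \text {exponent of } |E_2|:&\quad 1-\frac{d}{p_0}=1-\frac{2d}{d+1}=-\frac{d-1}{d+1}. \end{aligned}$$

The last line is obtained after moving \(|E_2|^{\frac{d}{p_0}}\) to the left-hand side.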

Now let us consider the case when \(\frac{1}{p}=\frac{1}{p_1}=1-\frac{2(d-1)}{d(d+1)}\) and \(\frac{1}{q}=\frac{1}{q_1} =1-\frac{2}{d+1}\). The argument in this case is similar to the above case. Let \(\alpha , \beta , \gamma \), \(\Omega \), \({\bar{y}}\) and J be as before.

Since \(t_1< t_2< \cdots < t_d\) for every \(t\in \Omega \), we have

  • \(\prod _{1\le i< d}(t_i-t_d)\ge \gamma \alpha ^{d-2} \);

  • \(\prod _{1\le i< j}(t_i-t_j) \ge \beta \alpha ^{j-2}\) for every even \(j< d\);

  • \(\prod _{1\le i< j}(t_i-t_j)\ge \alpha ^{j-1}\) for every odd \(j \le d-1\).

Therefore we have

$$\begin{aligned}&|E_2|\ge |\Omega |\, \min _{t} J(t)\ge c \alpha ^{\frac{d}{2}} \beta ^{\frac{d}{2}-1}\gamma \cdot \alpha ^{d(\frac{d}{2}-1)} \beta ^{\frac{d}{2}-1} \gamma \\&\quad =c \eta ^{\frac{d(d-1)}{2}} |E_2|^{\frac{2}{p_1}}\,\, |E_1|^{\frac{d-2}{p_1}+\big (\frac{1}{p_1}-1\big )\frac{d}{2}(d-1)}\,\, |F|^{d\big (\frac{1}{q_1^{'}}-1\big )+\frac{1}{q_1^{'}}\frac{d}{2}\big (d-1\big )}. \end{aligned}$$

This implies

$$\begin{aligned} |E_2|^{-\frac{d^2-3d+4}{d(d+1)}}\ge c \eta ^{\frac{d(d-1)}{2}} |E_1|^{-\frac{d^2-3d+4}{d(d+1)}}. \end{aligned}$$

This is equivalent to

$$\begin{aligned} |E_2|\le C \eta ^{-\frac{d^2(d^2-1)}{d^2-3d+4}} |E_1|. \end{aligned}$$
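As in the first case, the exponents can be checked by hand. A sketch, using \(\frac{1}{p_1}=1-\frac{2(d-1)}{d(d+1)}=\frac{d^2-d+2}{d(d+1)}\) and \(\frac{1}{q_1^{'}}=1-\frac{1}{q_1}=\frac{2}{d+1}\) for the endpoint considered in this case:

$$\begin{aligned} \text {exponent of } |F|:&\quad d\Big (\frac{2}{d+1}-1\Big )+\frac{2}{d+1}\cdot \frac{d(d-1)}{2}=-\frac{d(d-1)}{d+1}+\frac{d(d-1)}{d+1}=0,\\ \text {exponent of } |E_1|:&\quad \frac{(d-2)(d^2-d+2)}{d(d+1)}-\frac{(d-1)^2}{d+1}=\frac{(d^3-3d^2+4d-4)-(d^3-2d^2+d)}{d(d+1)}=-\frac{d^2-3d+4}{d(d+1)},\\ \text {exponent of } |E_2|:&\quad 1-\frac{2}{p_1}=\frac{d(d+1)-2(d^2-d+2)}{d(d+1)}=-\frac{d^2-3d+4}{d(d+1)}. \end{aligned}$$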

We shall now consider the case when

$$\begin{aligned} \frac{1}{p}=\frac{\theta }{p_1}+\frac{1-\theta }{p_0}\quad \text {and} \quad \frac{1}{q}=\frac{\theta }{q_1}+\frac{1-\theta }{q_0} \end{aligned}$$

for some \(\theta \in (0,1)\) and \(p_0,p_1,q_0,q_1\) as above. By the hypothesis of the lemma we have

$$\begin{aligned}&T(\chi _{E_1})\ge \eta |E_1|^{\frac{\theta }{p_1}+\frac{1-\theta }{p_0}} |F|^{\frac{\theta }{{q_1}^{'}}+\frac{1-\theta }{{q_0}^{'}}-1}\\&\quad = (\eta |E_1|^{\frac{1}{p_0}} |F|^{{\frac{1}{{q_0}^{'}}}-1})^{1-\theta } (\eta |E_1|^{\frac{1}{p_1}} |F|^{{\frac{1}{{q_1}^{'}}}-1})^{\theta }. \end{aligned}$$

and

$$\begin{aligned}&T(\chi _{E_2})\ge \eta |E_2|^{\frac{\theta }{p_1}+\frac{1-\theta }{p_0}} |F|^{\frac{\theta }{{q_1}^{'}}+\frac{1-\theta }{{q_0}^{'}}-1}\\&\quad = (\eta |E_2|^{\frac{1}{p_0}} |F|^{{\frac{1}{{q_0}^{'}}}-1})^{1-\theta } (\eta |E_2|^{\frac{1}{p_1}} |F|^{{\frac{1}{{q_1}^{'}}}-1})^{\theta }. \end{aligned}$$

Now let us consider the case when

$$\begin{aligned} |E_2|^{\frac{1}{p_0}} |F|^{{\frac{1}{{q_0}^{'}}}-1}\ge |E_2|^{\frac{1}{p_1}} |F|^{{\frac{1}{{q_1}^{'}}}-1}. \end{aligned}$$

This is equivalent to \(|E_2|^{\frac{1}{p_0}-\frac{1}{p_1}}\ge |F|^{\frac{1}{{q_1}^{'}}-\frac{1}{{q_0}^{'}}}\). Since \(|E_2|\ge |E_1|\) and \(\frac{1}{p_0}-\frac{1}{p_1}=-\frac{(d-1)(d-2)}{d(d+1)}<0\) for \(d>2\), we also have \(|E_1|^{\frac{1}{p_0}-\frac{1}{p_1}}\ge |E_2|^{\frac{1}{p_0}-\frac{1}{p_1}} \ge |F|^{\frac{1}{{q_1}^{'}}-\frac{1}{{q_0}^{'}}}\). This implies

$$\begin{aligned} |E_1|^{\frac{1}{p_0}} |F|^{{\frac{1}{{q_0}^{'}}}-1}\ge |E_1|^{\frac{1}{p_1}} |F|^{{\frac{1}{{q_1}^{'}}}-1}. \end{aligned}$$

Therefore we have for all \(x\in F\)

  • \(T(\chi _{E_1})(x) \ge \eta |E_1|^{\frac{1}{p_0}} |F|^{{\frac{1}{{q_0}^{'}}}-1}\);

  • \(T(\chi _{E_2})(x) \ge \eta |E_2|^{\frac{1}{p_0}} |F|^{{\frac{1}{{q_0}^{'}}}-1}\).

Now we apply the proof for the case \((p,q)=(p_0,q_0)\) to get the desired inequality. For the other case we have

$$\begin{aligned} |E_2|^{\frac{1}{p_0}} |F|^{{\frac{1}{{q_0}^{'}}}-1}\le |E_2|^{\frac{1}{p_1}} |F|^{{\frac{1}{{q_1}^{'}}}-1}. \end{aligned}$$

In this case we apply the proof for \((p,q)=(p_1,q_1)\) to get the desired inequality. \(\square \)

10 Entropy Refinement

The following lemma is proved for the paraboloid in Lemma 5.3 in [3]. The proof for the moment curve is almost identical, so we omit it.

Lemma 10.1

Let \(d\ge 2\). There exist \(c, C< \infty \) with the following property. Let \(\delta > 0\). Let f be any function in \(L^{p}(\mathbb {R}^d)\) satisfying \(\Vert Tf\Vert _{q}\ge (1-\delta ) A \Vert f\Vert _{p}\) that has rough level set decomposition \(f=\sum _{j\in \mathbb {Z}} 2^j f_j, \chi _{E_j}\le f_j\le 2\chi _{E_j}\). Then for any \(\eta \in (0, 1]\),

$$\begin{aligned} \big \Vert \sum _{j:2^{j} |E_{j}|^{\frac{1}{p}}< \eta \Vert f\Vert _{p}} 2^{j} f_j\big \Vert _{p}\le C(\delta ^{\frac{1}{p}}+\eta ^c) \Vert f\Vert _{p}. \end{aligned}$$

Lemma 10.2

Let \(d\ge 2\). There exist \(c, C, \tilde{C} < \infty \) with the following property. Let \(\rho \in (0,1)\). Let f be a \((1-\delta )\)-quasiextremal for T. If \(\delta \le C\rho ^{C}\), then there exists a function \(\tilde{f}\) satisfying \(\Vert f-\tilde{f}\Vert \le C\rho ^{C}\) with a rough level set decomposition \(\tilde{f}=\sum _{j\in \mathbb {Z}}2^j f_{j}, f_j\sim \chi _{E_{j}}\) such that if both \(\Vert 2^{i}\chi _{E_i}\Vert \ge \rho \) and \(\Vert 2^{j}\chi _{E_j}\Vert \ge \rho \), then

$$\begin{aligned} |i-j|\le \tilde{C}\rho ^{-\tilde{C}}. \end{aligned}$$

Proof

This lemma is an improvement over the previous lemma, in the sense that the indices j in the sum for which \(\Vert 2^{j}\chi _{E_{j}}\Vert _{p}\ge \rho \) cannot be too far from each other: all of them lie inside an interval of \(\mathbb {Z}\) of length at most \(\tilde{C}\rho ^{-\tilde{C}}\). The corresponding result for the paraboloid is Lemma 6.1 in [3]. The proof is an application of Lemma 9.2 together with the previous lemma; it is almost identical to the proof in [3], so we omit it. \(\square \)

Following Corollary 6.3 in [3] the above lemma immediately implies the following.

Corollary 10.3

There exist a finite constant C and a function \(\Psi :(0,\infty )\rightarrow (0,\infty )\) satisfying \(\frac{\Psi (t)}{t^p}\rightarrow 0\) as \(t\rightarrow 0,\infty \) with the following property. For any \(\epsilon > 0\) there exists a \(\delta > 0\) such that for any nonnegative function f with \(\Vert f\Vert _{p}=1\) and \(\Vert T(f)\Vert _{q}\ge (1-\delta )A\), there exists \(\phi \in G_{d}\) and a decomposition \(\phi ^{*}(f)=g+h\) with \(g,h \ge 0\) satisfying \(\Vert h\Vert _{p}<\epsilon \) and

$$\begin{aligned} \int \Psi (g)\le C. \end{aligned}$$
(10.1)

11 Uniform Decay and Extremizers at Non-end Points

In this section we show that any extremizing sequence behaves in a uniform manner. Using this we prove that at non-endpoints any extremizing sequence, after applying symmetries if necessary, converges to a nonzero function in \(L^{p}\). By continuity, the limit must be an extremizer for the corresponding \(L^{p}\) bound.

11.1 Spatial Localization

Lemma 11.1

There exists \(C< \infty \) such that for any \(\epsilon > 0\) there exists \(\delta > 0\) with the following property. Let f be a nonnegative function with \(\Vert f\Vert _{p}=1\) and \(\Vert T(f)\Vert _{q}\ge (1-\delta ) A\). Then there exists F with rough level set decomposition \(F=\sum _{j\in S}2^{j}f_j, f_j\sim \chi _{E_j}\) satisfying

$$\begin{aligned}&0\le F\le f, \\&\Vert T(F)\Vert _{q}\ge (1-\epsilon ) A, \\&|i-j|\le C \epsilon ^{-C} \quad \text {for all}\quad i, j\in S \end{aligned}$$

and for each \(j\in S\) there exist \(N(\sim C\epsilon ^{-C})\) paraballs \(B_{j,i}\) such that

$$\begin{aligned}&E_{j}\subset \bigcup _{i=1}^N B_{j,i};\\&\sum _{i=1}^N |B_{j,i}|\le C \epsilon ^{-C} |E_{j}|. \end{aligned}$$

Proof

Let \(\epsilon >0\). Fix \(\delta >0\) sufficiently small for later purposes, and suppose \(\Vert f\Vert _p=1\), \(\Vert Tf\Vert _{q}\ge (1-\delta )A\). By Lemma 10.2, at the cost of an \(\epsilon \) amount of \(L^{p}\) norm, we can assume that f has a finite level set decomposition. In other words, \(f=\sum _{j\in S} 2^{j}f_j\) with \(S\subset (J-C\epsilon ^{-C}, J+C\epsilon ^{-C})\cap \mathbb {Z}\) for some \(J\in \mathbb {Z}\). Let \(0< \eta \,(\le c\epsilon ^C)\) be a small quantity to be chosen later. Then \(\Vert T(f)\Vert _{q}\ge (1-2\delta )\,A\ge \eta \). Now we apply Lemma 8.4 to f to get a paraball \(B_{1}\) and \(i_{1}\in S\) such that \(\Vert 2^{i_{1}}\chi _{E_{i_{1}}\cap B_{1}}\Vert _{p}\ge c\eta ^{C}\) and \(|B_{1}|\le |E_{i_{1}}|\).

At the next step we set \(g_{1}=f\chi _{E_{i_{1}}\cap B_{1}}\) and write \(f=g_{1}+h_{1}\). So \(h_{1}=f\big (\sum _{j\ne i_{1}}\chi _{E_{j}}+\chi _{E_{i_{1}}\setminus B_{1}}\big )\). Now we look at \(\Vert T(h_{1})\Vert _{q}\). If \(\Vert T(h_{1})\Vert _{q}\ge \eta \) then by applying Lemma 8.4 to \(h_{1}\) we get another paraball \(B_{2}\) and \(i_{2}\in \mathbb {Z}\) such that

$$\begin{aligned} \Vert 2^{i_{2}}\chi _{E_{i_{2}}\cap B_{2}\setminus \big (E_{i_{1}}\cap B_{1}\big )}\Vert \ge c\eta ^{C}\quad \text {and}\quad |B_{2}|\le |E_{i_{2}}|. \end{aligned}$$

Now define \(g_{2}=f\chi _{E_{i_{2}}\cap B_{2}\setminus \big (E_{i_{1}}\cap B_{1}\big )}\) and \(f=g_{1}+g_{2}+h_{2}\).

We continue this process. Now suppose we are at the \((n-1)\)-th step. So we have a collection of paraballs \(\{B_{j}\}_{1\le j\le n-1}\) and indices \(\{i_{j}\}_{1\le j\le n-1}\) such that

  • \(|B_{j}|\le |E_{i_{j}}|\);

  • \(g_{m}=f\chi _{E_{i_{m}}\cap B_{m}\setminus \cup _{1\le j\le m-1}(E_{i_{j}}\cap B_{j})}\);

  • \(\Vert g_{m}\Vert _{p}\ge c\,\eta ^{C}\);

  • \(f=\sum _{j=1}^{n-1} g_{j}+h_{n-1}\).

Now if \(\Vert T(h_{n-1})\Vert _{q}<\, \eta \) we stop. Otherwise after applying Lemma 8.4 one more time we get another paraball \(B_{n}\) and \(i_{n}\in \mathbb {Z}\) such that \(\Vert 2^{i_{n}}\chi _{E_{i_{n}}\cap B_{n}}\Vert _{p}\ge c\eta ^{C}\) and \(|B_{n}|\le |E_{i_{n}}|\).

Since the \(g_{j}\) have disjoint support, \(\sum _{1\le j\le n} \Vert g_{j}\Vert _{p}^{p}\le \Vert f\Vert _{p}^{p}\le 1\). So this process must stop after at most \(C\eta ^{-C}\) steps. Suppose the process stops at the n-th step. Then we define \(F=\sum _{1\le j\le n} g_{j}\), so that \(\Vert T(f-F)\Vert _{q}< \,\eta \). This means \(\Vert T(F)\Vert _{q}\ge (1-2\delta -\eta ) A\ge (1-\epsilon ) A\) provided \(\eta \) is sufficiently small compared to \(\epsilon \). At the same time, since \(\Vert T(F)\Vert _{q}\le A\Vert F\Vert _{p}\), we have \(\Vert F\Vert _{p}\ge 1-\epsilon \). This implies \(\Vert f-F\Vert _{p}\le \epsilon \). Finally, for each \(j\in S\) we set \(\{B_{j,i}\}_{i}=\{B_{m}: i_{m}=j\}\). \(\square \)
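The bound on the number of steps in the proof above can be made explicit. A sketch, using only the disjointness of the supports of the \(g_j\) and the lower bound \(\Vert g_{j}\Vert _{p}\ge c\,\eta ^{C}\) obtained at each step:

$$\begin{aligned} n\,\big (c\,\eta ^{C}\big )^{p}\le \sum _{1\le j\le n}\Vert g_{j}\Vert _{p}^{p}\le \Vert f\Vert _{p}^{p}\le 1 \quad \Longrightarrow \quad n\le c^{-p}\,\eta ^{-Cp}. \end{aligned}$$

This is of the form \(C\eta ^{-C}\) after renaming the constants.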

Lemma 11.2

There exists \(C< \infty \) such that for any \(\epsilon > 0\) there exists \(\delta > 0\) with the following property. Let f be a nonnegative function with \(\Vert f\Vert _{p}=1\) and \(\Vert T(f)\Vert _{q}\ge (1-\delta )\,A\). Then there exists \(\tilde{f}\) with rough level set decomposition \(\tilde{f}=\sum _{j\in S}2^{j}f_j\), \(f_j \sim \chi _{E_j}\) satisfying

$$\begin{aligned}&0\le \tilde{f}\le f, \\&\Vert \tilde{f}\Vert _{p}\ge (1-\epsilon ), \\&\Vert T(\tilde{f})\Vert _{q}\ge (1-\epsilon )\,A \end{aligned}$$

and there exists a distinguished \(J\in S\) and a paraball \(B_J\) such that

$$\begin{aligned}&|J-j|\le C \epsilon ^{-C}\quad \text {for all}\quad j\in S, \\&E_{j}\subset C\epsilon ^{-C^{C\epsilon ^{-C}}} B_J\quad \text {for all}\quad j\in S, \\&\Vert 2^{J}\chi _{{B}_J}\Vert _{p}\le C. \end{aligned}$$

Proof

This lemma is an improvement over the previous lemma, in the sense that the paraballs \(\{B_{j,i}\}\) have been replaced by a single paraball B, after scaling it with a factor of \((C\epsilon ^{-C})^{C\epsilon ^{-C}}\). The proof of this lemma will be an application of Lemma 9.1 together with the previous lemma.

By the previous lemma we can assume that \(f=\sum _{j\in S}2^{j}f_j, f_j\sim \chi _{E_{j}}\) where \(E_{j}\subset \cup _{i=1}^{C\epsilon ^{-C}} B_{j,i}\) and \(|S|\le C\epsilon ^{-C}\). Let \(0<\,\eta <\, \epsilon ^{C}\) be a small quantity to be chosen later. Let us write the collection of paraballs \(B_{j,i}\) as \(\{B_{l}:1\le l\le N\}\). Then \(N\le C\epsilon ^{-C}\). Suppose, if possible, that \(\{1,2,\ldots ,N\}=S^{a}\cup S^{b}\) is a partition of \(\{1,2,\ldots ,N\}\) such that for each \((i,j)\in S^a\times S^b\), \(d(B_i, B_j)> C \eta ^{-C}\). We continue as in the proof of Lemma 10.2. Let \(g=\sum _{k\in \tilde{S}} 2^{k}g_k, g_k\sim \chi _{ F_k}\), be an arbitrary \(L^{q^{'}}\) function with \(\Vert g\Vert _{q^{'}}=1\). Now partition each of the sets \(F_{k}\) measurably as \(F_{k}=F_{k}^{a}\cup F_{k}^{b}\cup F_{k}^{c}\) with the following property:

  • For each \(x\in F_{k}^{a}\) there exists \(j\in S^{a}\) so that \(T\chi _{B_{j}}(x)\ge \eta |B_{j}|^{\frac{1}{p}}\, |F_{k}|^{-\frac{1}{q}}\);

  • For each \(x\in F_{k}^{b}\) there exists \(j\in S^{b}\) so that \(T\chi _{B_{j}}(x)\ge \eta |B_{j}|^{\frac{1}{p}}\, |F_{k}|^{-\frac{1}{q}}\).

Write

$$\begin{aligned}&h^{a}=g\sum _{k\in \tilde{S}}2^{k}\chi _{F_{k}^{a}},\\&h^{b}=g\sum _{k\in \tilde{S}}2^{k}\chi _{F_{k}^{b}}, \\&f^{a}=f\sum _{j\in S^{a}}2^{j}\chi _{E_{j}},\\&f^{b}=f\sum _{j\in S^{b}}2^{j}\chi _{E_{j}}. \end{aligned}$$

As in the proof of Lemma 10.2, if \(\eta \) is sufficiently small, there is \(i\in S^{a}\) and \(k\in \tilde{S}\) with

$$\begin{aligned} T(\chi _{B_{i}}, \chi _{F_{k}^{b}})\ge \eta |B_{i}|^{\frac{1}{p}} |F_{k}|^{1-\frac{1}{q}}. \end{aligned}$$

But by the proof of Lemma 9.1 this implies that there is \(j\in S^b\) such that \(d(B_i, B_j)\le C\epsilon ^{-C}\), which contradicts our hypothesis. Therefore it is not possible to decompose the collection of paraballs corresponding to f into a disjoint union of two sets such that any two elements belonging to different sets are at mock-distance at least \(C\epsilon ^{-C}\) from each other. Now let us fix a paraball \(B_{j_0}\) corresponding to \(f_0\). We construct inductively a sequence of collections of paraballs \(\mathbb {B}_j\) such that

  • \(\{B_1,\ldots ,B_N\}=\cup _{j=0}^N \mathbb {B}_j\);

  • \(\mathbb {B}_0=\{B_{j_0}\}\);

  • \(B\in \mathbb {B}_j\) and \(B'\in \mathbb {B}_{j+1}\) implies \(d(B, B')\le C\epsilon ^{-C}\).

By the quasi-triangle inequality this implies \(d(B_{j_0}, B_j)\le C\epsilon ^{-C^{C\epsilon ^{-C}}}\) for all \(j=1,2,\ldots ,N\). \(\square \)

12 Weak Convergence and Extremizers for \(\theta \in (0,1)\)

Lemma 12.1

There exists a constant C (depending only on d) and positive functions \(\Psi _{1}, \Psi _{2}:(0,\infty )\rightarrow (0,\infty )\) and \(\rho _{1}, \rho _{2}:(1,\infty )\rightarrow (0,\infty )\) satisfying \(\frac{\Psi _{1}(t)}{t^p}\rightarrow 0\) and \(\frac{\Psi _{2}(t)}{t^{q'}}\rightarrow 0\) as \(t\rightarrow 0,\infty \) and \(\rho _{i}(R)\rightarrow \infty \) as \(R\rightarrow \infty \) with the following property. For any \(\epsilon \,>\,0\) there exists a \(\delta > 0\) such that for any nonnegative function f with \(\Vert f\Vert _{p}=1\) and \(\Vert T(f)\Vert _{q}\ge (1-\delta )\,A\), there exists \(\phi \in G_{d}\) and a decomposition

$$\begin{aligned} \phi ^{*}(f)=g+h \end{aligned}$$

with \(g, h\ge 0\) satisfying for large R,

$$\begin{aligned} \Vert h\Vert _{p}< \epsilon ,\quad \int _{\mathbb {R}^{d}} \Psi _{1}(g)\le C, \quad \text {support}(g)\subset B(0,\rho _{1}(R)). \end{aligned}$$

In addition, there exists \(F\ge 0\) satisfying \(\Vert F\Vert _{q^{'}}=1\) and

$$\begin{aligned} \langle T(g), F\rangle \ge (1-\epsilon )\,A,\quad \int _{\mathbb {R}^{d}} \Psi _{2}(F)\le C, \quad \text {support}(F)\subset B(0,\rho _{2}(R)). \end{aligned}$$

Proof

Let \(\epsilon >0\). Fix \(\delta >0\) sufficiently small. Let \(\Vert f\Vert _{p}=1\) with \(\Vert Tf\Vert _{q}\ge A(1-\delta )\). By applying Lemma 11.2 to f we can assume that there is a \(\phi \in G_d\) such that

$$\begin{aligned} \phi ^{*}(f)=g+h \end{aligned}$$

with \(\Vert h\Vert _{p}<\epsilon \) and support\((g)\subset B_J=B(\bar{x}_J,t_J,\alpha _J,\beta _J)\). In addition, by Lemma 7.3, we can assume that in Lemma 11.2 the distinguished index \(J=0\) and the paraball \(B_J=\{y\in \mathbb {R}^{d}; \Vert y\Vert <1\}\). Therefore \(\Vert h\Vert _{p}<\epsilon \) and support(g)\(\subset B(0,C\epsilon ^{-C^{C\epsilon ^{-C}}})\).

We set \(\rho (R)=CR^{C^{CR^{C}}}\). The second part of the conclusion follows similarly by applying the same proof to the operator \(T^{*}\) since \(T^{*}\) is the convolution with the affine arc length measure on the curve \(-h(t)\). \(\square \)

12.1 Proof of Existence of Extremizers for \(\theta \in (0,1)\)

The proof of the existence of extremizers is along the lines of the proof for the corresponding result in [3]. For the sake of completeness we give the details of the proof. Let \(\{f_n\}\) be any extremizing sequence. By the previous Lemma there exist \(\{\phi _{n}\in G_d\}\), such that \(\phi ^{*}_n(f_{n})=g_{n}+h_{n}\), while the functions \(g_{n}\) and \(h_{n}\) satisfy all the conclusions of the previous Lemma corresponding to \(\epsilon _n=\frac{1}{n}\). Also there exists a sequence \(\{F_{n}\}\) with \(\Vert F_{n}\Vert _{q^{'}}=1\) and \(T(g_{n}, F_{n})\rightarrow A\).

By applying the Banach-Alaoglu Theorem, after passing to a subsequence we can assume that \(g_{n}\rightharpoonup g\) and \(F_{n}\rightharpoonup F\) weakly. Therefore \(\Vert F\Vert _{{q}^{'}}\le 1\) and \(\Vert g\Vert _{p}\le 1\). Let us fix a large R. Now we set

$$\begin{aligned} g_{n,R}(x)&=g_{n}(x)\,\chi _{\Vert x\Vert \le R}(x)\,\chi _{g_{n}\le R}(x),\qquad g_{R}(x)=g(x)\, \chi _{\Vert x\Vert \le R}(x)\,\chi _{g\le R}(x); \end{aligned}$$
(12.1)
$$\begin{aligned} F_{n,R}(x)&=F_{n}(x)\,\chi _{\Vert x\Vert \le R}(x)\,\chi _{F_{n}\le R}(x),\qquad F_{R}(x)=F(x)\, \chi _{\Vert x\Vert \le R}(x)\,\chi _{F\le R}(x). \end{aligned}$$
(12.2)

By a stationary phase argument, for any fixed \(\psi \in C_{0}^{1}(\mathbb {R}^{d})\), the operator \(f\mapsto \psi T(\psi f)\) maps \(L^{2}(\mathbb {R}^{d})\) boundedly to the Sobolev space \(H^{\frac{1}{d}}\), which in turn embeds into \(L^{s}\) where \(\frac{1}{s} = \frac{1}{2}-\frac{1}{d^{2}}\). Thus the weak convergence of \(g_{n, R}\) to \(g_{R}\) as \(n\rightarrow \infty \) implies the \(L^{s}\) norm convergence of \(T(g_{n, R})\) to \(T(g_{R})\) as \(n\rightarrow \infty \), for every fixed R. Therefore \(T(g_{n, R}, F_{n, R})\rightarrow T(g_{R}, F_{R})\) as \(n\rightarrow \infty \) for a fixed R.
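The exponent s quoted above is the one given by the standard Sobolev embedding \(H^{\sigma }(\mathbb {R}^{d})\subset L^{s}(\mathbb {R}^{d})\), valid for \(0<\sigma <\frac{d}{2}\) with \(\frac{1}{s}=\frac{1}{2}-\frac{\sigma }{d}\). As a quick check with \(\sigma =\frac{1}{d}\) (which satisfies \(\sigma <\frac{d}{2}\) for \(d\ge 2\)):

$$\begin{aligned} \frac{1}{s}=\frac{1}{2}-\frac{1}{d}\cdot \frac{1}{d}=\frac{1}{2}-\frac{1}{d^{2}}, \qquad \text {that is,}\qquad s=\frac{2d^{2}}{d^{2}-2}>2. \end{aligned}$$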

By Lemma 12.1 the integral of \(g^{p_{\theta }}_{n}\) outside the ball of radius R goes to zero as R goes to infinity uniformly in n. This implies \(g_{n,R}\) converges to \(g_{n}\) in \(L^{p_{\theta }}\) as R goes to infinity uniformly in n. Similarly \(F_{n,R}\) converges to \(F_{n}\) in \(L^{q^{'}_{\theta }}\) uniformly in n. This together with the conclusion from previous paragraph implies

$$\begin{aligned} A=\lim _{n\rightarrow \infty }T(g_{n}, F_{n})=T(g, F). \end{aligned}$$
(12.3)

So g is an extremizer.

13 \(L^{p}\) Convergence of Extremizing Subsequence

The main result of this section is the \(L^{p}\) convergence of a subsequence of any extremizing sequence after applying suitable symmetries. The proof is similar to the corresponding result in [3]. For the convenience of the reader we give the details of the proof.

Euler-Lagrange Identity

Let f be a nonnegative extremizer with \(\Vert f\Vert _{p}=1\). Then by Hölder’s inequality

$$\begin{aligned}&A^{q}=\Vert T(f)\Vert _{q}^{q}=\langle T(f), T(f)^{q-1}\rangle =\langle f, T^{*}(T(f)^{q-1})\rangle \\&\qquad \le \Vert f\Vert _{p} \Vert T^{*}(T(f)^{q-1})\Vert _{p^{'}}\le A \Vert (Tf)^{q-1}\Vert _{q{'}}= A \Vert T(f)\Vert _{q}^{\frac{q}{q^{'}}}=A A^{\frac{q}{q^{'}}}=A^{q}. \end{aligned}$$

Since equality holds throughout the above chain of inequalities, \(T^{*}(T(f)^{q-1})\) agrees with a constant multiple of \(f^{\frac{p}{p^{'}}}\) almost everywhere on \(\mathbb {R}^{d}\). The above equality implies this constant is \(A^{q}\). So finally, we have for any nonnegative extremizer f with \(\Vert f\Vert _{p}=1\),

$$\begin{aligned} T^{*}\big ((T(f))^{q-1}\big )=A^{q} f^{p-1}\quad \text {almost everywhere on}\quad \mathbb {R}^{d}. \end{aligned}$$
(13.1)
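For the reader's convenience, here is a sketch of how the constant in (13.1) is identified, assuming \(1<p<\infty \) and writing \(G=T^{*}\big ((T(f))^{q-1}\big )\ge 0\) (the symbols G, \(\lambda \) and \(\mu \) are introduced only for this remark). Equality in Hölder's inequality \(\langle f,G\rangle \le \Vert f\Vert _{p}\Vert G\Vert _{p^{'}}\) forces \(G^{p^{'}}=\lambda f^{p}\) almost everywhere for some \(\lambda \ge 0\), and then pairing with f determines the constant:

$$\begin{aligned} G=\mu \,f^{\frac{p}{p^{'}}}=\mu \,f^{p-1}\quad \text {with}\quad \mu =\lambda ^{\frac{1}{p^{'}}},\qquad A^{q}=\langle f,G\rangle =\mu \int _{\mathbb {R}^{d}}f^{p}=\mu \,\Vert f\Vert _{p}^{p}=\mu . \end{aligned}$$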

Lemma 13.1

Let \(f_{n}\) be an extremizing sequence. Then there exist a sequence of symmetries \(\{\phi _{n}\}\subset G_{d}\) and an extremal F such that \(\{\phi _{n_{k}}^{*} (f_{n_{k}})\}\) converges to F in \(L^{p}\) for some subsequence \(\{f_{n_{k}}\}\).

Proof

After passing to a subsequence if necessary and applying a suitable symmetry \(\phi _{n}^{*}\), we have that \(\phi _{n}^{*}(f_{n})=g_{n}+h_{n}\) with \(\Vert h_{n}\Vert _{p}\rightarrow 0\) and \(\Vert T(g_{n})\Vert _{q}\rightarrow A\). Therefore it is enough to prove that \(\{\phi _{n}^{*}(f_{n})\}\) converges to g in \(L^{p}\) where g is as in (12.3).

By (13.1) there exists \(H\ge 0\) such that \(T^{*}(H)=A^{q} g^{p-1}\) almost everywhere on \(\mathbb {R}^{d}\). By (12.3) we have

$$\begin{aligned}&A^{q}=A^{q}\langle g, g^{p-1}\rangle =\langle g, T^{*}(H)\rangle =\lim _{n\rightarrow \infty }\langle g_{n}, T^{*}(H)\rangle \\&\qquad =A^{q}\lim _{n\rightarrow \infty }\langle g_{n}, g^{p-1}\rangle =A^{q}\lim _{n\rightarrow \infty }\langle \phi _{n}^{*}(f_{n}),g^{p-1}\rangle . \end{aligned}$$

At this point we apply Theorem 2.11 in [20] to the pair \((\{\phi _{n}^{*}(f_{n})\}, g)\) to get the desired conclusion. \(\square \)

14 Extremizers at End Points

In this section we shall prove Theorem 1.8, which describes a relation between extremizers for \(T:L^{p_{0}}\rightarrow L^{q_{0}}\) and extremizers for \(X^{*}:L^{p_{0}}\rightarrow L^{q_{0}}\), the adjoint of X as defined in 1.5. Simultaneously we prove a relation between extremizers for \(T:L^{p_{1}}\rightarrow L^{q_{1}}\) and extremizers for \(X:L^{p_{1}}\rightarrow L^{q_{1}}\). We would like to thank Michael Christ for his suggestion to look at the restricted X-ray transform, X, for the endpoint cases.

Lemma 14.1

Let T and X be defined as in 1.1 and 1.5 respectively and \(d>2\). Then \(\Vert T\Vert _{L^{p_0}\rightarrow L^{q_0}}\ge \Vert X^{*}\Vert _{L^{p_0}\rightarrow L^{q_0}}\) and \(\Vert T\Vert _{L^{p_1}\rightarrow L^{q_1}}\ge \Vert X\Vert _{L^{p_1}\rightarrow L^{q_1}}\).

Proof

It suffices to show that \(\Vert T\Vert _{L^{p_0}\rightarrow L^{q_0}}\ge \Vert X^{*}\Vert _{L^{p_0}\rightarrow L^{q_0}}\) since \((p_1,q_1)=(q'_0,p'_0)\) and \(T^{*}\) is the same operator as T with the curve h(t) replaced by \(-h(t)\). Let \(\epsilon >0\). Let \(f\in L^{p_0}\) and \(g\in L^{q'_0}\). Let \(\gamma (t)=(t^2,t^3,\ldots ,t^d)\in \mathbb {R}^{d-1}\). Define

$$\begin{aligned} \tilde{f}_\epsilon (x):={\epsilon }^{\frac{d-1}{p_0}} f\big ((0,\epsilon (x_2,\ldots ,x_d))+h(x_1)\big ) \end{aligned}$$

and

$$\begin{aligned} \tilde{g}_\epsilon (x)={\epsilon }^{\frac{d}{q^{'}_0}}g(\epsilon x). \end{aligned}$$

Let f and g be compactly supported smooth functions. Then

$$\begin{aligned} \begin{aligned} \langle X^{*}f,g\rangle&=\lim _{\epsilon \rightarrow 0}\int _{(t,y)\in \mathbb {R}\times \mathbb {R}^{d}}f\big (\epsilon y_1+t, y-\frac{\gamma (\epsilon y_1+t)-\gamma (t)}{\epsilon }\big )g(y)\,dy\,dt\\&=\lim _{\epsilon \rightarrow 0}\langle Tf_{\epsilon }, g_{\epsilon }\rangle \le \Vert T\Vert \end{aligned} \end{aligned}$$
(14.1)

where \(f_{\epsilon }, g_\epsilon \) are the functions such that \(\tilde{f_{\epsilon }}=f\) and \(\tilde{g_{\epsilon }}=g\). \(\square \)

Lemma 14.2

Let T and X be as above and \(d>2\).

  • If there exists an extremizing sequence for \(T:L^{p_0}\rightarrow L^{q_0}\) that does not have a subsequence converging to an extremizer modulo symmetries of T, then \(\Vert T\Vert _{L^{p_0}\rightarrow L^{q_0}}=\Vert X^{*}\Vert _{L^{p_0}\rightarrow L^{q_0}}\).

  • If there exists an extremizing sequence for \(T:L^{p_1}\rightarrow L^{q_1}\) that does not have a subsequence converging to an extremizer modulo symmetries of T, then \(\Vert T\Vert _{L^{p_1}\rightarrow L^{q_1}}=\Vert X\Vert _{L^{p_1}\rightarrow L^{q_1}}\).

Proof

We shall prove the lemma only for \(T:L^{p_0}\rightarrow L^{q_0}\), the other case being identical. By the previous Lemma it suffices to show that \(\Vert T\Vert _{L^{p_0}\rightarrow L^{q_0}}\le \Vert X^{*}\Vert _{L^{p_0}\rightarrow L^{q_0}}\). By hypothesis, there exists an extremizing sequence \(\{f_n\}\) for \(T:L^{p_0}\rightarrow L^{q_0}\) such that for any sequence of symmetries \(\{\phi _{n}^*\}\), the sequence \(\{\phi _{n}^*(f_n)\}\) has no subsequence which converges to a non-zero limit in \(L^{p_0}\). Let us start with such an extremizing sequence \(\{f_n\}\) such that \(\Vert Tf_n\Vert _{L^{q_0}}\ge (1-\frac{1}{n}) A\) for each n. Then there exists another sequence \(\{g_n\}\) such that \(\langle Tf_n, g_n\rangle \) converges to \(A=\Vert T\Vert _{L^{p_0}\rightarrow L^{q_0}}\) and \(\Vert g_n\Vert _{L^{q'_0}}=1\). By Lemma 11.2 there is a sequence of symmetries \(\{\phi _{n}^*\}\) such that after changing \(f_n\) to \(\phi _{n}^*(f_n)\) and \(g_n\) to \(\psi _{n}^*(g_n)\), if necessary, we have

  • \(f_n=\sum _{j\in S_n}2^{j}f_{n,j}\), \(f_{n,j} \sim \chi _{E_{n,j}}\);

  • \(|j-J_n|\le C n^{-C}\quad \text {for all}\quad j\in S_n\);

  • \(E_{n,j}\subset Cn^{{-C}^{Cn^{-C}}}B (0,0,\alpha _n,1)\quad \text {for all}\quad j\in S_n\);

  • \(\Vert 2^{J_n}\chi _{B (0,0,\alpha _n,1)}\Vert _{p_0}\le C\).

and

  • \(g_n=\sum _{k\in \tilde{S}_n}2^{k}g_{n,k}\), \(g_{n,k}\sim \chi _{F_{n,k}}\);

  • \(|k-K_n|\le C n^{-C}\quad \text {for all}\quad k\in \tilde{S}_n\);

  • \(F_{n,k}\subset Cn^{{-C}^{Cn^{-C}}}B^{*} (0,0,\alpha _n,1)\quad \text {for all}\quad k\in \tilde{S}_n\);

  • \(\Vert 2^{K_n}\chi _{B^{*} (0,0,\alpha _n,1)}\Vert _{{q'_0}}\le C\).

Since \(f_n\) has no subsequence converging to a nonzero limit in \(L^{p_0}\), by the proof of Lemma 12.1, \(\alpha _n\rightarrow 0\).

We define

$$\begin{aligned} \tilde{f}_{n}(x):={\alpha _n}^{\frac{d-1}{p_0}} f_{n}\big ((0,\alpha _{n}(x_2,\ldots ,x_d))+h(x_1)\big ) \end{aligned}$$

and

$$\begin{aligned} \tilde{g}_n(x)={\alpha _n}^{\frac{d}{q^{'}_0}}g_{n}(\alpha _{n}x). \end{aligned}$$
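The powers of \(\alpha _n\) in these two definitions are exactly the ones that preserve the \(L^{p_0}\) and \(L^{q'_0}\) norms. A sketch of the change of variables, assuming \(h(t)=(t,t^2,\ldots ,t^d)\) as in (1.1) and writing \(x'=(x_2,\ldots ,x_d)\): the map \(x\mapsto y=(0,\alpha _n x')+h(x_1)=(x_1,\alpha _n x_2+x_1^{2},\ldots ,\alpha _n x_d+x_1^{d})\) has a lower triangular Jacobian matrix with diagonal \((1,\alpha _n,\ldots ,\alpha _n)\), so \(dy=\alpha _n^{d-1}\,dx\), while \(y=\alpha _n x\) gives \(dy=\alpha _n^{d}\,dx\). Therefore

$$\begin{aligned} \Vert \tilde{f}_n\Vert _{p_0}^{p_0}&=\alpha _n^{d-1}\int _{\mathbb {R}^{d}}\big |f_n\big ((0,\alpha _n x')+h(x_1)\big )\big |^{p_0}\,dx=\int _{\mathbb {R}^{d}}|f_n(y)|^{p_0}\,dy=\Vert f_n\Vert _{p_0}^{p_0},\\ \Vert \tilde{g}_n\Vert _{q'_0}^{q'_0}&=\alpha _n^{d}\int _{\mathbb {R}^{d}}|g_n(\alpha _n x)|^{q'_0}\,dx=\int _{\mathbb {R}^{d}}|g_n(y)|^{q'_0}\,dy=\Vert g_n\Vert _{q'_0}^{q'_0}. \end{aligned}$$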

Let \(c_n=Cn^{{-C}^{Cn^{-C}}}\). This implies for each \(\tilde{f}_n\), we now have the following,

$$\begin{aligned} \tilde{f}_n=\sum _{|j|<c_n}2^{j}f_{n,j}\big ((0,2^{\frac{-p_{0}J_{n}}{d-1}}(x_2,\ldots ,x_d))+h(x_1)\big ). \end{aligned}$$

Let \(f^{*}_{n,j}(x)= f_{n,j}\big ((0,2^{\frac{-p_{0}J_{n}}{d-1}}(x_2,\ldots ,x_d))+h(x_1)\big )\). Then \(f^{*}_{n,j}\sim \chi _{E^{*}_{n,j}}\) with the following properties:

  • \(\chi _{E^{*}_{n,0}}\sim \chi _{B(0,1)}\);

  • \(E^{*}_{n,j}\subset B(0,c_n)\) for all \(|j|<c_n\).

Similarly we have

$$\begin{aligned} {\tilde{g}}_n(x)=\sum _{|k|<c_n}2^kg_{n,k}(2^{\frac{-q'_{0}K_{n}}{d}}x). \end{aligned}$$

If \(g^{*}_{n,k}(x)=g_{n,k}(2^{\frac{-q'_{0}K_{n}}{d}}x)\) then \(g^{*}_{n,k}\sim \chi _{F^{*}_{n,k}}\) with

  • \(\chi _{F^{*}_{n,0}}\sim \chi _{B(0,1)}\);

  • \(F^{*}_{n,k}\subset B(0,c_n)\) for all \(|k|<c_n\).

By a simple change of variable we see that for each n, we have \(\Vert \tilde{g}_{n}\Vert _{L^{q'_0}}=1\) and \(\Vert \tilde{f}_{n}\Vert _{L^{p_0}}=1\). Furthermore, we now apply the proof of Lemma 12.1 to show that there is an \(f\in L^{p_0}\) such that \(\{\tilde{f}_n\}\) has a subsequence that converges weakly to f as n goes to infinity, and there is a \(g\in L^{q'_0}\) such that \(\{\tilde{g}_n\}\) has a subsequence that converges weakly to g as n goes to infinity. Without loss of generality we assume that the sequence \(\{\tilde{f}_n\}\) itself converges weakly to f, and likewise for \(\{\tilde{g}_n\}\). We shall now prove that

$$\begin{aligned} \begin{aligned} \Vert T\Vert _{L^{p_0}\rightarrow L^{q_0}}&=\lim _{n\rightarrow \infty }\langle Tf_n, g_n\rangle \\&=\lim _{n\rightarrow \infty }\int _{(t,y)\in \mathbb {R}\times \mathbb {R}^{d}}\tilde{f_n}(\epsilon y_1+t, y-\frac{\gamma (\epsilon y_1+t)-\gamma (t)}{\epsilon })\tilde{g_n}(y) \,dy\,dt\\&=\lim _{n\rightarrow \infty }\langle X^{*}({\tilde{f}}_n), {\tilde{g}}_n\rangle \\&= \langle X^{*}f,g\rangle \\&\le \Vert X^{*}\Vert _{L^{p_0}\rightarrow L^{q_0}}. \end{aligned} \end{aligned}$$
(14.2)

where \(\epsilon =\alpha _n\).

The proof of this claim is similar to the proof of Lemma 12.1. For large R, we define

$$\begin{aligned} {\tilde{f}}_{n,R}(x)&={\tilde{f}}_{n}(x)\,\chi _{\Vert x\Vert \le R}(x)\,\chi _{{\tilde{f}}_{n}\le R}(x),\qquad f_{R}(x)=f(x)\, \chi _{\Vert x\Vert \le R}(x)\,\chi _{f\le R}(x); \end{aligned}$$
(14.3)
$$\begin{aligned} {\tilde{g}}_{n,R}(x)&={\tilde{g}}_{n}(x)\,\chi _{\Vert x\Vert \le R}(x)\,\chi _{{\tilde{g}}_{n}\le R}(x),\qquad g_{R}(x)=g(x)\, \chi _{\Vert x\Vert \le R}(x)\,\chi _{g\le R}(x). \end{aligned}$$
(14.4)

For any compactly supported smooth function \(\psi \), the operator \(f\mapsto \psi X^{*}(\psi f)\) maps \(L^{2}(\mathbb {R}^{d})\) boundedly to the Sobolev space \(H^{s}\), which in turn embeds into \(L^{q}\) for some \(q>2\); see [7]. Thus the weak convergence of \(\tilde{f}_{n, R}\) to \(f_{R}\) as \(n\rightarrow \infty \) implies the \(L^{q}\) norm convergence of \(X^{*}({\tilde{f}}_{n, R})\) to \(X^{*}(f_{R})\) as \(n\rightarrow \infty \), for every fixed R. Therefore \(\langle X^{*}{\tilde{f}}_{n, R}, {\tilde{g}}_{n, R}\rangle \rightarrow \langle X^{*}f_{R},g_{R}\rangle \) as \(n\rightarrow \infty \) for each fixed R.

By Lemma 12.1, the \(L^{q^{'}_0}\) norms of \(\tilde{g}_n\) and the \(L^{p_0}\) norms of \(\tilde{f}_n\) outside the ball of radius R centered at 0 decay uniformly in n as R goes to infinity. Together with the previous paragraph, this implies that \(\Vert T\Vert \le \Vert X^{*}\Vert \). \(\square \)
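
The passage from the truncated pairings to the full ones can be made explicit as follows. This is only a sketch: it assumes that \(X^{*}\) is bounded from \(L^{p_0}\) to \(L^{q_0}\), and it uses \(\Vert \tilde{g}_n\Vert _{L^{q'_0}}=1\) together with \(\Vert {\tilde{f}}_{n,R}\Vert _{L^{p_0}}\le \Vert \tilde{f}_n\Vert _{L^{p_0}}=1\). By bilinearity and Hölder's inequality,

$$\begin{aligned} \begin{aligned} \big |\langle X^{*}{\tilde{f}}_n, {\tilde{g}}_n\rangle -\langle X^{*}{\tilde{f}}_{n,R}, {\tilde{g}}_{n,R}\rangle \big |&=\big |\langle X^{*}({\tilde{f}}_n-{\tilde{f}}_{n,R}), {\tilde{g}}_n\rangle +\langle X^{*}{\tilde{f}}_{n,R}, {\tilde{g}}_n-{\tilde{g}}_{n,R}\rangle \big |\\&\le \Vert X^{*}\Vert _{L^{p_0}\rightarrow L^{q_0}}\Big (\Vert {\tilde{f}}_n-{\tilde{f}}_{n,R}\Vert _{L^{p_0}}+\Vert {\tilde{g}}_n-{\tilde{g}}_{n,R}\Vert _{L^{q'_0}}\Big ). \end{aligned} \end{aligned}$$

If both truncation errors on the right tend to zero as \(R\rightarrow \infty \) uniformly in n, which is the role played by Lemma 12.1 in the argument, then the convergence established for each fixed R passes to the untruncated pairings.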

We now have the following immediate corollary.

Corollary 14.3

Let T and X be as above and \(d>2\).

  • If there exists an extremizing sequence \(\{f_n\}\) for \(T:L^{p_0}\rightarrow L^{q_0}\) that does not have a subsequence converging to an extremizer modulo symmetries of T, then after applying the nonsymmetry \(f_n\mapsto r^{\frac{d-1}{p_0}}_nf_n((0,r_{n}x')+h(x_1))\), it has a subsequence that converges weakly to an extremizer for \(X^*:L^{p_0}\rightarrow L^{q_0}\).

  • If there exists an extremizing sequence \(\{f_n\}\) for \(T:L^{p_1}\rightarrow L^{q_1}\) that does not have a subsequence converging to an extremizer modulo symmetries of T, then after applying the nonsymmetry \(f_n\mapsto r^{\frac{d}{q'_0}}_nf_n(r_{n}x)\), it has a subsequence that converges weakly to an extremizer for \(X:L^{p_1}\rightarrow L^{q_1}\).

14.1 Proof of Theorem 1.8

To complete the proof of Theorem 1.8, it suffices to prove that the weak convergence in Corollary 14.3 is in fact norm convergence in \(L^{p}\). The proof follows the same lines as that of Lemma 13.1. Suppose \(\{\tilde{f}_n\}\) converges weakly to f in \(L^{p_0}\) and \(\{\tilde{g}_n\}\) converges weakly to g in \(L^{q^{'}_0}\). Since g is an extremizer for \(X:L^{q^{'}_{0}}\rightarrow L^{p^{'}_{0}}\), by the Euler-Lagrange equation for X we have

$$\begin{aligned} X^{*}\big ((X(g))^{p^{'}_{0}-1}\big )= A^{p^{'}_{0}} g^{q^{'}_{0}-1}. \end{aligned}$$
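
As a consistency check on the constant, one can pair both sides with g. This is a sketch under assumptions not restated in this section: that \(g\ge 0\) and \(X(g)\ge 0\), that \(\Vert g\Vert _{L^{q'_0}}=1\), and that A denotes the operator norm of \(X:L^{q^{'}_{0}}\rightarrow L^{p^{'}_{0}}\), attained at g. Then

$$\begin{aligned} \int X^{*}\big ((X(g))^{p^{'}_{0}-1}\big )\,g\,dx=\int (X(g))^{p^{'}_{0}}\,dx=\Vert X(g)\Vert _{L^{p'_0}}^{p'_0}=A^{p^{'}_{0}}=A^{p^{'}_{0}}\int g^{q^{'}_{0}}\,dx, \end{aligned}$$

so the two sides of the Euler-Lagrange equation agree after integration against g.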

Now we apply Theorem 2.11 in [20] to the tuple \((p,g_n,g)=(q^{'}_{0}, \tilde{g}_{n},g)\) to prove that \(\tilde{g}_{n}\) converges to g in \(L^{q^{'}_{0}}\). Similarly \(\tilde{f}_{n}\) converges to f in \(L^{p_{0}}\).