1 Introduction

In this paper the nonlinear Klein–Gordon (NLKG) equation in the nonrelativistic limit, namely as the speed of light c tends to infinity, is studied. Formal computations going back to the first half of the last century suggest that, up to corrections of order \(\mathscr {O}(c^{-2})\), the system should be described by the nonlinear Schrödinger (NLS) equation. Subsequent mathematical results have shown that the NLS describes the dynamics over timescales of order \(\mathscr {O}(1)\).

The nonrelativistic limit for the Klein–Gordon equation on \(\mathbb {R}^d\) has been extensively studied for more than 30 years, and essentially all the known results only show convergence of the solutions of NLKG to the solutions of the approximate equation for times of order \(\mathscr {O}(1)\). The typical statement ensures convergence locally uniformly in time. In a first series of results (see [35, 42, 57]) it was shown that, if the initial data are in a certain smoothness class, then the solutions converge in a weaker topology to the solutions of the approximating equation. These are informally called “results with loss of smoothness.” Although in this paper a longer time convergence is proved, our results also fall into this group.

Some other results, essentially due to Machihara, Masmoudi, Nakanishi and Ozawa, ensure convergence without loss of regularity in the energy space, again over timescales of order \(\mathscr {O}(1)\) (see [36, 38, 44]).

Concerning radiation solutions there is a remarkable result (see [43]) by Nakanishi, who considered the complex NLKG in the defocusing case, in which it is known that all solutions scatter (and thus the scattering operator exists), and proved that the scattering operator of the NLKG equation converges to the scattering operator of the NLS. It is important to remark that this result is not contained in the one proved here and does not contain it.

Recently Lu and Zhang in [34] proved a result which concerns the NLKG with a quadratic nonlinearity. Here the problem is that the typical timescale over which the standard approach allows one to control the dynamics is \(\mathscr {O}(c^{-1})\), while the dynamics of the approximating equation takes place over timescales of order \(\mathscr {O}(1)\). In that work the authors are able to use a normal form transformation (in a spirit quite different from ours) in order to extend the time of validity of the approximation over the \(\mathscr {O}(1)\) timescale. We did not try to reproduce or extend that result.

In this paper we prove two kinds of results for the dynamics of NLKG: a global existence result (see Theorem 1) which is uniform for sufficiently large values of \(c>0\), and approximation results (see Theorems 2, 3 and 4) that allow one to approximate solutions of NLKG by solutions of suitable higher-order NLS equations. Approximation results are different in the case where the equation lives on \(\mathbb {R}^d\) or in a compact manifold: When M is a smooth compact manifold or \(\mathbb {R}^d\) the solution of NLS approximates the solution of the original equation locally uniformly in time; when \(M=\mathbb {R}^d\), \(d \ge 2\), it is possible to prove that for \(r>1\) solutions of the order-r normalized equation approximate solutions of the NLKG equation up to times of order \(\mathscr {O}(c^{2(r-1)})\).

The present paper can be thought of as an example in which techniques from canonical perturbation theory are used together with results from the theory of dispersive equations in order to understand the singular limit of Hamiltonian PDEs. In this context, the nonrelativistic limit of the NLKG is a relevant example.

The issue of nonrelativistic limit has been studied also in the more general Maxwell–Klein–Gordon system [10, 39], in the Klein–Gordon–Zakharov system [40, 41], in the Hartree equation [17] and in the pseudo-relativistic NLS [18]. However, all these results proved the convergence of the solutions of the limiting system in the energy space ([17] studied also the convergence in \(H^k\)), locally uniformly in time; no information could be obtained about the convergence of solutions for longer (in the case of NLKG, which means c-dependent) timescales. On the other hand, in the recent [27], which studies the nonrelativistic limit of the Vlasov–Maxwell system, the authors were able to prove a stability result for solutions which lie in a neighborhood of stable equilibria of the system; this result is valid for times which are polynomial in terms of the inverse of the speed of light, and does not exhibit loss of smoothness.

Other examples of singular perturbation problems that have been studied either with canonical perturbation theory or with multiscale analysis are the problem of the continuous approximation of lattice dynamics (see, e.g., [6, 51]) and the semiclassical analysis of Schrödinger operators (see, e.g., [1, 46]). In the framework of lattice dynamics, the timescale covered by all known results is that typical of averaging theorems, which corresponds to our \(\mathscr {O}(1)\) timescale. The methods developed in the present paper should make it possible to extend the time of validity of those results.

The paper is organized as follows. In Sect. 2 we state the results of the paper, together with examples and comments. In Sect. 3 we show Strichartz estimates for the linear KG equation and for the KG equation with potential, as well as a global existence result uniform with respect to c for the cubic NLKG equation on \(\mathbb {R}^3\). In Sect. 4 we state the main abstract result of the paper. In Sect. 5 we present the proof of the abstract result, which is based on a Galerkin cutoff technique, along with remarks and variant of the result. Next, in Sect. 6 we apply the abstract theorem to the NLKG equation, making explicit computations of the normal form at the first and at the second step. In Sect. 7 we deduce a result about the approximation of solutions locally uniformly in time. In Sect. 8 we study the properties of the normalized equation, namely its dispersive properties in the linear case and its well-posedness for solutions with small initial data in the nonlinear case. In Sect. 9 we discuss the approximation for longer timescales: In particular, to deduce the latter we exploit some dispersive properties of the KG equation reported in Sect. 3. Finally, in “Appendix A” we report all technical lemmata used in Birkhoff normal form estimates (the approach is essentially the same as in [2]), and in “Appendix B” we prove some interpolation theory results for relativistic Sobolev spaces, and we exploit them to deduce Strichartz estimates for the KG equation with potential.

2 Statement of the main results

The NLKG equation describes the motion of a spinless particle with mass \(m>0\). Consider first the real NLKG

$$\begin{aligned} \frac{\hbar ^2}{2mc^2} u_{tt} - \frac{\hbar ^2}{2m} \varDelta u +\frac{mc^2}{2} u + \lambda |u|^{2(l-1)}u&= 0, \end{aligned}$$
(1)

where \(c>0\) is the speed of light, \(\hbar >0\) is the Planck constant, \(\lambda \in \mathbb {R}\) and \(l \ge 2\).

In the following \(m=1\), \(\hbar =1\). As anticipated above, one is interested in the behavior of solutions as \(c\rightarrow \infty \).

First it is convenient to reduce Eq. (1) to a first-order system, by making the following symplectic change of variables

$$\begin{aligned} \psi&:= \frac{1}{\sqrt{2}} \left[ \left( \frac{\langle \nabla \rangle _c}{c} \right) ^{1/2} u - i \left( \frac{c}{\langle \nabla \rangle _c}\right) ^{1/2}v \right] , \quad v = u_t/c^2, \end{aligned}$$

where

$$\begin{aligned} \langle \nabla \rangle _c:=(c^2-\varDelta )^{1/2}, \end{aligned}$$
(2)

which reduces (1) to the form

$$\begin{aligned} -i \psi _t&= c \langle \nabla \rangle _c\psi + \frac{\lambda }{2^l} \left( \frac{c}{\langle \nabla \rangle _c} \right) ^{1/2} \left[ \left( \frac{c}{\langle \nabla \rangle _c} \right) ^{1/2} (\psi +\bar{\psi }) \right] ^{2l-1}, \end{aligned}$$
(3)

which is Hamiltonian with Hamiltonian function given by

$$\begin{aligned} H(\bar{\psi },\psi )&= \left\langle \bar{\psi }, c\langle \nabla \rangle _c\psi \right\rangle + \frac{\lambda }{2l} \int \left[ \left( \frac{c}{\langle \nabla \rangle _c} \right) ^{1/2} \frac{\psi +\bar{\psi }}{\sqrt{2}} \right] ^{2l} \mathrm{d}x. \end{aligned}$$
(4)

To state our first result, introduce for any \(k \in \mathbb {R}\) and for any \(1< p < \infty \) the following relativistic Sobolev spaces

$$\begin{aligned} \mathscr {W}_c^{k,p}(\mathbb {R}^3)&:= \left\{ u\in L^p: \Vert u\Vert _{\mathscr {W}_c^{k,p}}:=\Vert c^{-k} \, \langle \nabla \rangle _c^k u \Vert _{L^p}<+\infty \right\} , \end{aligned}$$
(5)
$$\begin{aligned} \mathscr {H}_c^{k}(\mathbb {R}^3)&:= \left\{ u\in L^2: \Vert u\Vert _{\mathscr {H}_c^k}:=\Vert c^{-k} \, \langle \nabla \rangle _c^k u \Vert _{L^2}<+\infty \right\} , \end{aligned}$$
(6)

and remark that the energy space is \(\mathscr {H}_c^{1/2}\). Remark that for finite \(c>0\) such spaces coincide with the standard Sobolev spaces, while for \(c=\infty \) they are equivalent to the Lebesgue spaces \(L^p\).
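As a quick check of the last remark in the Hilbert case \(p=2\), here is a sketch under the additional assumptions \(c \ge 1\) and \(k \ge 0\) (which are not required in the definitions above). Writing \(\langle \xi \rangle :=(1+|\xi |^2)^{1/2}\) and \(\langle \xi \rangle _c:=(c^2+|\xi |^2)^{1/2}\), the Fourier symbol of \(c^{-k}\langle \nabla \rangle _c^k\) satisfies

$$\begin{aligned} c^{-k} \langle \xi \rangle ^{k} \;&\le \; c^{-k} \langle \xi \rangle _c^{k} \; = \; \left( 1+\frac{|\xi |^2}{c^2} \right) ^{k/2} \; \le \; \langle \xi \rangle ^{k}, \quad \xi \in \mathbb {R}^3, \end{aligned}$$

so that, by Plancherel, \(c^{-k} \Vert u\Vert _{H^k} \le \Vert u\Vert _{\mathscr {H}_c^k} \le \Vert u\Vert _{H^k}\). The two norms are therefore equivalent for each fixed c, but the lower constant degenerates as \(c \rightarrow \infty \), where the symbol tends pointwise to 1 and the norm formally reduces to the \(L^2\) norm.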

In the following the notation \(a \lesssim b\) is used to mean: there exists a positive constant K that does not depend on c such that \(a \le Kb\).

We begin with a global existence result for the NLKG (3) in the cubic case, \(l=2\), for small initial data.

Theorem 1

Consider Eq. (3) with \(l=2\) on \(\mathbb {R}^3\).

There exist \(\varepsilon _*>0\) and \(c_*>0\) such that for any \(c>c_*\), if the norm of the initial datum \(\psi _0\) fulfills

$$\begin{aligned} \Vert \psi _0 \Vert _{\mathscr {H}_c^{1/2}}&\le \varepsilon _*, \end{aligned}$$
(7)

then the corresponding solution \(\psi (t)\) of (3) exists globally in time and satisfies

$$\begin{aligned} \Vert \psi (t) \Vert _{L^\infty _t \mathscr {H}_c^{1/2} } \;&\lesssim \; \Vert \psi _0 \Vert _{\mathscr {H}_c^{1/2}}. \end{aligned}$$
(8)

We remark that the constant involved in the estimate (8) does not depend on c.

Remark 1

For finite c this is the standard result for small amplitude solution, while for \(c=\infty \) it becomes the standard result for the NLS: Thus Theorem 1 interpolates between these apparently completely different situations. Remark that the lack of a priori estimates for the solutions of NLKG in the limit \(c\rightarrow \infty \) was the main obstruction in order to obtain global existence results uniform in c in standard Sobolev spaces.

One is now interested in discussing the approximation of the solutions of NLKG with NLS-type equations. Before giving the results we describe the general strategy used to obtain them.

Remark that Eq. (3) is Hamiltonian with Hamiltonian function (4). If one divides the Hamiltonian by a factor \(c^2\) (which corresponds to a rescaling of time) and expands in powers of \(c^{-2}\) it takes the form

$$\begin{aligned} \langle \psi ,\bar{\psi }\rangle + \frac{1}{c^2} P_c(\psi ,\bar{\psi }) \end{aligned}$$
(9)

with a suitable function \(P_c\). One can notice that this Hamiltonian is a perturbation of \(h_0:=\langle \psi ,\bar{\psi }\rangle \), which is the generator of the standard gauge transform and which in particular admits a flow that is periodic in time. Thus the idea is to exploit canonical perturbation theory in order to conjugate such a Hamiltonian system to a system in normal form, up to remainders of order \(\mathscr {O}(c^{-2r})\), for any given \(r \ge 1\).

The problem is that the perturbation \(P_c\) has a vector field which is small only as an operator extracting derivatives. One can Taylor expand \(P_c\) and its vector field, but the number of derivatives extracted at each order increases. This situation is typical in singular perturbation problems. Problems of this kind have already been studied with canonical perturbation theory, but the price to pay to get a normal form is that the remainder of the perturbation turns out to be an operator that extracts a large number of derivatives.

In Sect. 6 the normal form equation is explicitly computed in the case \(r=2\):

$$\begin{aligned} -i \psi _t \;&= \; c^2 \psi - \frac{1}{2} \varDelta \psi + \frac{3}{4} \lambda |\psi |^2\psi \nonumber \\&\quad + \frac{1}{c^2} \left[ \frac{51}{8} \lambda ^2 |\psi |^4\psi + \frac{3}{16} \lambda \left( 2|\psi |^2 \, \varDelta \psi + \psi ^2 \varDelta \bar{\psi } + \varDelta (|\psi |^2\psi ) \right) - \frac{1}{8} \varDelta ^2\psi \right] , \end{aligned}$$
(10)

namely a singular perturbation of a gauge-transformed NLS equation. If one, after a gauge transformation, only considers the first-order terms, one has the NLS, for which radiation solutions exist (for example, in the defocusing case all solutions are of radiation type). For higher-order NLS there are very few results (see, for example, [37]).
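The linear part of (10) can be recovered by a formal Taylor expansion of the symbol of \(c\langle \nabla \rangle _c\); the following is only a sketch, meaningful for frequencies \(|\xi | \ll c\):

$$\begin{aligned} c \langle \xi \rangle _c \;&= \; c^2 \sqrt{1+\frac{|\xi |^2}{c^2}} \; = \; c^2 + \frac{|\xi |^2}{2} - \frac{|\xi |^4}{8c^2} + \mathscr {O}\left( \frac{|\xi |^6}{c^4} \right) . \end{aligned}$$

Since \(-\varDelta \) has symbol \(|\xi |^2\), this corresponds to \(c^2 - \frac{1}{2}\varDelta - \frac{1}{8c^2}\varDelta ^2\), in agreement with the linear terms of (10). Moreover, the gauge transformation \(\psi = e^{ic^2t}\phi \) gives \(-i\psi _t = c^2\psi + e^{ic^2t}(-i\phi _t)\), so that at first order \(\phi \) solves \(-i\phi _t = -\frac{1}{2}\varDelta \phi + \frac{3}{4}\lambda |\phi |^2\phi \), which is the NLS referred to above.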

The standard way to exploit such a “singular” normal form is to use it just to construct some approximate solution of the original system, and then to apply Gronwall lemma in order to estimate the difference with a true solution with the same initial datum (see, for example, [4]).

This strategy works also here, but it only leads to a control of the solutions over times of order \(\mathscr {O}(c^2)\). When scaled back to the physical time, this justifies the approximation of the solutions of NLKG by solutions of the NLS over timescales of order \(\mathscr {O}(1)\), on any manifold admitting a Littlewood–Paley decomposition (such as Riemannian smooth compact manifolds, or \(\mathbb {R}^d\); see the introduction of [12] for the construction of Littlewood–Paley decomposition on manifolds).

Theorem 2

Let M be a manifold which admits a Littlewood–Paley decomposition, and consider Eq. (3) on M.

Fix \(r \ge 1\), \(R>0\), \(k_1 \gg 1\), \(1<p< +\infty \). Then \(\exists \)\(k_0=k_0(r)>0\) with the following properties: For any \(k \ge k_1\) there exists \(c_{l,r,k,p,R} \gg 1\) such that for any \(c>c_{l,r,k,p,R}\), if

$$\begin{aligned} \Vert \psi _0\Vert _{k+k_0,p}&\le R \end{aligned}$$

and there exists \(T=T_{r,k,p}>0\) such that the solution \(\psi _{r}\) of the equation in normal form up to order r (98) with the initial datum \(\psi _0\) satisfies

$$\begin{aligned} \Vert \psi _r(t)\Vert _{k+k_0,p}&\le 2R, \quad \text {for} \quad 0 \le t \le T, \end{aligned}$$

then

$$\begin{aligned} \Vert \psi (t)-\psi _r(t)\Vert _{k,p}&\lesssim \frac{1}{c^2}, \quad \text {for} \quad 0\le t \le T. \end{aligned}$$
(11)

where \(\psi (t)\) is the solution of (3) with the initial datum \(\psi _{0}\).

A similar result has been obtained for the case \(M=\mathbb {T}^d\) by Faou and Schratz, who aimed to construct numerical schemes which are robust in the nonrelativistic limit (see [23]; see also [7, 8] and [9] for the numerical analysis of the nonrelativistic limit of the NLKG).

The idea one uses here in order to improve the timescale of the result is that of substituting Gronwall lemma with a more sophisticated tool, namely dispersive estimates and the retarded Strichartz estimate. This can be done, provided one can prove a dispersive or a Strichartz estimate for the linearization of Eq. (3) on the approximate solution, uniformly in c.

In order to state our approximation result for the linear case, we consider the approximate equation given by the Hamilton equations of the normal form truncated at order \(\mathscr {O}(c^{-2r})\), and let \(\psi _r\) be a solution of such a linearized normal form equation.

Theorem 3

Fix \(r \ge 1\) and \(k_1 \gg 1\). Then \(\exists \)\(k_0=k_0(r)>0\) such that for any \(k \ge k_1\), if we denote by \(\psi _r\) the solution of the linearized normal equation (105) with the initial datum \(\psi _0 \in H^{k+k_0}\) and by \(\psi \) the solution of the linear KG equation (12) with the same initial datum, then there exists \(c^*:=c^*(r,k) > 0\) such that for any \(c > c^*\)

$$\begin{aligned} \sup _{t\in [0,T]} \Vert \psi (t)-\psi _r(t)\Vert _{H^k_x}&\lesssim \frac{1}{c^2}, \quad T \lesssim c^{2(r-1)}. \end{aligned}$$

This result has been proved in the case \(r=1\) in Appendix A of [14].

Next we consider the approximation of small radiation solutions of the NLKG equation.

Theorem 4

Consider (3) on \(\mathbb {R}^d\), \(d \ge 2\). Let \(r>1\), and fix \(k_1 \gg 1\). Assume that \(l \ge 2\) and \(r < \frac{d}{2}(l-1)\). Then \(\exists \)\(k_0=k_0(r)>0\) such that for any \(k \ge k_1\) and for any \(\sigma >0\) the following holds: Consider the solution \(\psi _{r}\) of the normalized equation (98), with the initial datum \(\psi _{r,0} \in H^{k+k_0+\sigma +d/2}\). Then there exist \(\alpha ^*:=\alpha ^*(d,l,r)>0\) and there exists \(c^*:=c^*(r,k) > 1\), such that for any \(\alpha > \alpha ^*\) and for any \(c > c^*\), if \(\psi _{r,0}\) satisfies

$$\begin{aligned} \Vert \psi _{r,0}\Vert _{H^{k+k_0+\sigma +d/2}}&\lesssim c^{-\alpha }, \end{aligned}$$

then

$$\begin{aligned} \sup _{t\in [0,T]} \Vert \psi (t)-\psi _r(t)\Vert _{H^k_x}&\lesssim \frac{1}{c^2}, \quad T \lesssim c^{2(r-1)}, \end{aligned}$$

where \(\psi (t)\) is the solution of (3) with the initial datum \(\psi _{r,0}\).

Remark 2

The assumption of existence of \(\psi _r\) up to times of order \(\mathscr {O}(c^{2(r-1)})\) is actually a delicate matter. Equation (10), for example, is a quasilinear perturbation of a fourth-order Schrödinger equation (4NLS). Even if we restrict to the case \(r=2\), the issues of global well-posedness and scattering for solutions with large initial data for Eq. (10) have not been solved. For solutions with small initial data, on the other hand, there are some papers dealing with the local well-posedness of 4NLS (see, for example, [28]) and with global well-posedness and scattering of 4NLS (see [50]). In Sect. 8.2 we prove the local well-posedness for times of order \(\mathscr {O}(c^{2(r-1)})\) for solutions of the order-r normalized equation with small initial data under the assumptions that \(l \ge 2\) and \(r < \frac{d}{2}(l-1)\).

Remark 3

Just to be explicit, we make some examples of Theorem 4. For \(M=\mathbb {R}^2\) and a nonlinearity of order 2l, we can justify the approximation of small radiation solutions up to times of order \(\mathscr {O}(c^{2(r-1)})\), for \(r < l-1\). For \(M=\mathbb {R}^3\) and a nonlinearity of order 2l, we can justify the approximation of small radiation solutions up to times of order \(\mathscr {O}(c^{2(r-1)})\), for \(r < \frac{3}{2}(l-1)\).

There are some equations, namely the ones in which \(\frac{d}{2}(l-1) \le 2\), in which we cannot justify the approximation over long timescales (we mention, for example, the cubic NLKG in 2, 3 and 4 dimensions, or the quintic NLKG in 2 dimensions).
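To make the threshold explicit, here is a short computation, assuming (as the wording “order-r normalized equation” suggests) that r is an integer, so that \(r>1\) means \(r \ge 2\). The smallest admissible case \(r=2\) requires

$$\begin{aligned} 2&< \frac{d}{2}(l-1) \quad \Longleftrightarrow \quad d \, (l-1) > 4. \end{aligned}$$

For the cubic NLKG (\(l=2\)) this means \(d \ge 5\), while for the quintic NLKG (\(l=3\)) it means \(d \ge 3\); in these cases Theorem 4 with \(r=2\) covers times of order \(\mathscr {O}(c^{2})\).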

There are other well-known solutions of NLS which would be interesting to study; indeed, it is well known that in the case of mixed-type nonlinearity

$$\begin{aligned} i \psi _t&=-\varDelta \psi - \left( |\psi |^2-|\psi |^4\right) \psi , \end{aligned}$$

such an equation admits linearly stable solitary wave solutions; it can also be proved that the standing waves of NLS can be modified in order to obtain standing wave solutions of the normal form of order r, for any r. It would be of clear interest to prove that true solutions starting close to such standing waves remain close to them for long times (remark that the NLKG does not admit stable standing wave solutions, see [45]); in order to get such a result, one should prove a Strichartz estimate for NLKG close to the approximate solution and uniformly in c.

Before closing the subsection, a few technical comments are in order. First, here we develop the normal form in the framework of the spaces \(W^{k,p}\), while known results in Galerkin averaging theory only deal with the spaces \(H^k\). This is because Fourier analysis is used there in order to approximate derivative operators with bounded operators. Thus the first technical step needed in order to be able to exploit dispersion is to reformulate Galerkin averaging theory in terms of dyadic decompositions. This is done in Theorem 7.

Second, the condition on r in Theorem 4 depends on the assumption in which we were able to prove a well-posedness result for the normalized equation, which in turn depends on the approach presented recently in [50]; we do not exclude that this technical condition could be improved.

3 Dispersive properties of the Klein–Gordon equation

We briefly recall some classical notions of Fourier analysis on \(\mathbb {R}^d\). Recall the definition of the space of Schwartz (or rapidly decreasing) functions,

$$\begin{aligned} \mathscr {S}&:= \left\{ f \in C^\infty (\mathbb {R}^d,\mathbb {R}) | \sup _{x \in \mathbb {R}^d} (1+|x|^2)^{\alpha /2} |\partial ^\beta f(x)| < + \infty , \quad \forall \alpha \in \mathbb {N}, \forall \beta \in \mathbb {N}^d \right\} . \end{aligned}$$

In the following \(\left\langle x \right\rangle :=(1+|x|^2)^{1/2}\).

Now, for any \(f \in \mathscr {S}\) the Fourier transform of f, \(\hat{f}:\mathbb {R}^d \rightarrow \mathbb {C}\), is defined by the following formula

$$\begin{aligned} \hat{f}(\xi )&:= (2\pi )^{-d/2} \int _{\mathbb {R}^d} f(x) e^{-i \left\langle x,\xi \right\rangle }\mathrm{d}x, \quad \forall \xi \in \mathbb {R}^d, \end{aligned}$$

where \(\left\langle \cdot ,\cdot \right\rangle \) denotes the scalar product in \(\mathbb {R}^d\).

At the beginning we obtain Strichartz estimates for the linear equation

$$\begin{aligned} -i \, \psi _t \,&= \, c\langle \nabla \rangle _c \, \psi , \quad x \in \mathbb {R}^d. \end{aligned}$$
(12)

Proposition 1

Let \(d \ge 2\). For any Schrödinger-admissible couples \((p,q)\) and \((r,s)\), namely such that

$$\begin{aligned} 2&\le p,r \le \infty , \\ 2&\le q,s \le \frac{2d}{d-2}, \\ \frac{2}{p}+\frac{d}{q}&=\frac{d}{2}, \; \frac{2}{r}+\frac{d}{s} =\frac{d}{2}, \\ (p,q,d),(r,s,d)&\ne (2,+\infty ,2), \end{aligned}$$

one has

$$\begin{aligned} \left\| \langle \nabla \rangle _c^{\frac{1}{q}-\frac{1}{p}} \; e^{it \; c\langle \nabla \rangle _c} \; \psi _0 \right\| _{L^p_t L^q_x} \;&\lesssim \; c^{\frac{1}{q}-\frac{1}{p}-\frac{1}{2}} \; \Vert \langle \nabla \rangle _c^{1/2} \psi _0\Vert _{L^2}, \end{aligned}$$
(13)
$$\begin{aligned} \left\| \langle \nabla \rangle _c^{\frac{1}{q}-\frac{1}{p}} \; \int _0^t e^{i(t-s) \; c\langle \nabla \rangle _c} \; F(s) \; \mathrm{d}s \right\| _{L^p_t L^q_x} \;&\lesssim \; c^{\frac{1}{q}-\frac{1}{p}+\frac{1}{s}-\frac{1}{r}-1} \; \left\| \langle \nabla \rangle _c^{\frac{1}{r}-\frac{1}{s}+1} F \right\| _{L^{r{^\prime }}_t L^{s{^\prime }}_x}. \end{aligned}$$
(14)

Remark 4

By choosing \(p=+\infty \) and \(q=2\), we get the following a priori estimate for finite energy solutions of (12),

$$\begin{aligned} \left\| c^{1/2} \langle \nabla \rangle _c^{1/2} \; e^{it \; c\langle \nabla \rangle _c} \; \psi _0 \right\| _{L^\infty _t L^2_x} \;&\lesssim \; \left\| c^{1/2} \langle \nabla \rangle _c^{1/2} \psi _0\right\| _{L^2}. \end{aligned}$$

We also point out that, since the operators \(\langle \nabla \rangle \) and \(\langle \nabla \rangle _c\) commute, the above estimates in the spaces \(L^p_tL^q_x\) extend to estimates in \(L^p_tW^{k,q}_x\) for any \(k \ge 0\).
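As a further illustration (in dimension \(d=3\), the case relevant for Theorem 1), the endpoint couple \((p,q)=(2,6)\) is Schrödinger-admissible, since \(\frac{2}{2}+\frac{3}{6}=\frac{3}{2}\), and for this choice (13) reads

$$\begin{aligned} \left\| \langle \nabla \rangle _c^{-1/3} \; e^{it \; c\langle \nabla \rangle _c} \; \psi _0 \right\| _{L^2_t L^6_x} \;&\lesssim \; c^{-5/6} \; \Vert \langle \nabla \rangle _c^{1/2} \psi _0\Vert _{L^2}. \end{aligned}$$

Multiplying both sides by \(c^{1/3}\) and recalling the definitions (5) and (6), this is the same as \(\Vert e^{it \, c\langle \nabla \rangle _c}\psi _0\Vert _{L^2_t \mathscr {W}_c^{-1/3,6}} \lesssim \Vert \psi _0\Vert _{\mathscr {H}_c^{1/2}}\), with a constant that does not depend on c.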

Proof

We recall a result reported by D’Ancona–Fanelli in [21] for the operator \(\langle \nabla \rangle :=\langle \nabla \rangle _1\).

Lemma 1

For all Schrödinger-admissible exponents \((p,q)\)

$$\begin{aligned} \Vert e^{i\tau \; \langle \nabla \rangle } \; \phi _0 \Vert _{L^p_\tau \; W^{\frac{1}{q}-\frac{1}{p}-\frac{1}{2},q }_y} = \; \left\| \langle \nabla \rangle ^{\frac{1}{q}-\frac{1}{p}-\frac{1}{2} } \; e^{i\tau \; \langle \nabla \rangle } \; \phi _0 \right\| _{L^p_\tau \; L^q_y} \; \lesssim \; \Vert \phi _0\Vert _{L^2_y}. \end{aligned}$$

Now, the solution of Eq. (12) satisfies \(\hat{\psi }(t,\xi ) = e^{ i c \langle \xi \rangle _c t}\hat{\psi }_0(\xi )\). We then define \(\eta := \xi /c\), in order to have that

$$\begin{aligned} \hat{\phi }(c^2 t, \eta )&:= \hat{\psi }(t, c\eta ) = \hat{\psi }(t, \xi ), \end{aligned}$$

and in particular that \(\hat{\phi }_0(\eta ) = \hat{\psi }_0(\xi )\).

Since

$$\begin{aligned} \left\langle \xi \right\rangle _c = \sqrt{c^2+|\xi |^2} = c \sqrt{1+|\xi |^2/c^2}, \end{aligned}$$
(15)

we get

$$\begin{aligned} \hat{\phi }(c^2t, \eta )&= e^{it \, c^2 \langle \xi /c \rangle } \hat{\phi }_0(\xi /c) \\&= e^{ i \, tc^2 \, \langle \eta \rangle } \hat{\phi }_0(\eta ) \\&= e^{ i \, \tau \, \langle \eta \rangle } \hat{\phi }_0(\eta ) \end{aligned}$$

if we set \(\tau :=c^2t\). Now, by setting \(y:=cx\) a simple scaling argument leads to

$$\begin{aligned} \Vert e^{i\tau \; \langle \nabla \rangle } \; \phi _0 \Vert _{L^p_\tau \; L^q_y} \; \lesssim \; \left\| \langle \nabla \rangle ^{\frac{1}{p}-\frac{1}{q}+\frac{1}{2} } \; \phi _0 \right\| _{L^2} \; = \; \left\| \left\langle \eta \right\rangle ^{\frac{1}{p}-\frac{1}{q}+\frac{1}{2} } \hat{\phi }_0\right\| _{L^2} \end{aligned}$$

and since

$$\begin{aligned} \Vert \left\langle \eta \right\rangle ^{k} \hat{\phi }_0\Vert ^2_{L^2} \;&= \; \int _{\mathbb {R}^d} \left\langle \eta \right\rangle ^{2k} \; |\hat{\phi }_0(\eta )|^2 \; \mathrm{d}\eta \\&= \; \int _{\mathbb {R}^d} \left\langle \frac{\xi }{c} \right\rangle ^{2k} \; |\hat{\phi }_0(\xi /c)|^2 \; \frac{\mathrm{d}\xi }{c^d} \; = \; \frac{1}{c^{2k+d}} \; \int _{\mathbb {R}^d} \left\langle \xi \right\rangle _c^{2k} \; |\hat{\psi }_0(\xi )|^2 \; \mathrm{d}\xi , \end{aligned}$$

we get

$$\begin{aligned} \Vert \left\langle \eta \right\rangle ^{\frac{1}{p}-\frac{1}{q}+\frac{1}{2} } \hat{\phi }_0\Vert _{L^2} \;&= \; \frac{1}{c^{\frac{d}{2}-\frac{1}{q}+\frac{1}{p}+\frac{1}{2} } } \; \left\| \langle \nabla \rangle _c^{ \frac{1}{p}-\frac{1}{q}+\frac{1}{2} } \; \psi _0\right\| _{L^2}, \end{aligned}$$
(16)

while on the other hand

$$\begin{aligned} \psi (t, x) \;&= (2\pi )^{-d/2} \int _{\mathbb {R}^d} e^{i \left\langle \xi ,x\right\rangle } \; \hat{\psi }(t, \xi ) \; \mathrm{d}\xi \; = (2\pi )^{-d/2} \int _{\mathbb {R}^d} e^{i \left\langle \eta ,cx\right\rangle } \; \hat{\psi }(t, c\eta ) \; c^d \mathrm{d}\eta \\&= (2\pi )^{-d/2} \; c^d \; \int _{\mathbb {R}^d} e^{i \left\langle \eta ,cx\right\rangle } \; \hat{\phi }(c^2t, \eta ) \; \mathrm{d}\eta \; = c^d \; \phi (c^2t, cx), \end{aligned}$$

yields

$$\begin{aligned} \Vert \psi \Vert _{L^p_t L^q_x} \; = \; c^{d - \; d/q - \; 2/p} \; \Vert \phi \Vert _{L^p_\tau L^q_y}. \end{aligned}$$
(17)

Hence we can deduce (13); via a scaling argument, we can also deduce (14). \(\square \)
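In more detail, the exponent count behind the deduction of (13) can be sketched as follows. Combining (17), the scaled form of Lemma 1 and (16) gives

$$\begin{aligned} \Vert \psi \Vert _{L^p_t L^q_x} \;&= \; c^{d - \frac{d}{q} - \frac{2}{p}} \; \Vert \phi \Vert _{L^p_\tau L^q_y} \; \lesssim \; c^{d - \frac{d}{q} - \frac{2}{p} - \frac{d}{2} + \frac{1}{q} - \frac{1}{p} - \frac{1}{2}} \; \left\| \langle \nabla \rangle _c^{ \frac{1}{p}-\frac{1}{q}+\frac{1}{2} } \; \psi _0\right\| _{L^2}, \end{aligned}$$

and the admissibility condition \(\frac{2}{p}+\frac{d}{q}=\frac{d}{2}\) reduces the power of c to \(\frac{1}{q}-\frac{1}{p}-\frac{1}{2}\). Since \(\langle \nabla \rangle _c^{\frac{1}{q}-\frac{1}{p}}\) commutes with the flow \(e^{it \, c\langle \nabla \rangle _c}\), applying this bound to the initial datum \(\langle \nabla \rangle _c^{\frac{1}{q}-\frac{1}{p}}\psi _0\) yields exactly (13).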

One important application of the Strichartz estimates for the free Klein–Gordon equation is Theorem 1, namely a global existence result uniform with respect to c for the NLKG equation (3) on \(\mathbb {R}^3\) with cubic nonlinearity (\(l=2\)), for small initial data.

Proof (Theorem 1)

It just suffices to apply Duhamel formula,

$$\begin{aligned} \psi (t)&= e^{it c \langle \nabla \rangle _c}\psi _0 + i \frac{\lambda }{2^2} \int _0^t e^{i(t-s) c \langle \nabla \rangle _c} \left( \frac{c}{\langle \nabla \rangle _c} \right) ^{1/2} \left[ \left( \frac{c}{\langle \nabla \rangle _c} \right) ^{1/2} (\psi +\bar{\psi }) \right] ^{3} \mathrm{d}s, \end{aligned}$$

and Proposition 1 with \(p=+\infty \) and \(q=2\), in order to get that

$$\begin{aligned} \Vert \psi (t) \Vert _{L^\infty _t \mathscr {H}_c^{1/2} }&\lesssim \Vert \psi _0 \Vert _{\mathscr {H}_c^{1/2}} + c^{1/s - 1/r} \left\| \langle \nabla \rangle _c^{1/r - 1/s} \left[ \left( \frac{c}{\langle \nabla \rangle _c} \right) ^{1/2} (\psi +\bar{\psi }) \right] ^{3} \right\| _{L^{r{^\prime }}_t L^{s{^\prime }}_x}, \end{aligned}$$

but by choosing \(r=+\infty \) and by exploiting Hölder inequality and Sobolev embedding we get

$$\begin{aligned} \Vert \psi (t) \Vert _{L^\infty _t \mathscr {H}_c^{1/2} }&\lesssim \Vert \psi _0 \Vert _{\mathscr {H}_c^{1/2}} + \left\| \left[ \left( \frac{c}{\langle \nabla \rangle _c} \right) ^{1/2} (\psi +\bar{\psi }) \right] ^{3}\right\| _{L^1_t L^2_x} \\&\lesssim \Vert \psi _0 \Vert _{\mathscr {H}_c^{1/2}} + \left\| \left[ \left( \frac{c}{\langle \nabla \rangle _c} \right) ^{1/2} (\psi +\bar{\psi }) \right] ^2\right\| _{L^1_t L^3_x} \left\| \left( \frac{c}{\langle \nabla \rangle _c} \right) ^{1/2} (\psi +\bar{\psi }) \right\| _{L^\infty _t L^6_x} \\&\lesssim \Vert \psi _0 \Vert _{\mathscr {H}_c^{1/2}} + \left\| \left( \frac{c}{\langle \nabla \rangle _c} \right) ^{1/2} (\psi +\bar{\psi })\right\| ^2_{L^2_t L^6_x} \left\| \left( \frac{c}{\langle \nabla \rangle _c} \right) ^{1/2} (\psi +\bar{\psi }) \right\| _{L^\infty _t L^6_x} \\&\lesssim \Vert \psi _0 \Vert _{\mathscr {H}_c^{1/2}} + \Vert \psi \Vert ^2_{L^2_t \mathscr {W}_c^{-1/2,6} } \Vert \psi \Vert _{L^\infty _t \mathscr {W}_c^{-1/2,6} } \\&\lesssim \Vert \psi _0 \Vert _{\mathscr {H}_c^{1/2}} + \Vert \psi \Vert ^2_{L^2_t \mathscr {W}_c^{-1/3,6} } \Vert \psi \Vert _{L^\infty _t \mathscr {H}_c^{1/2} } , \end{aligned}$$

and one can conclude by a standard continuation argument. \(\square \)
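The continuation argument can be sketched as follows, under the assumption (not spelled out in the proof above) that the Strichartz norm \(\Vert \psi \Vert _{L^2_t \mathscr {W}_c^{-1/3,6}}\) on \([0,T]\) obeys the same bound as the energy norm. Setting \(X(T):=\Vert \psi \Vert _{L^\infty ([0,T]) \mathscr {H}_c^{1/2}} + \Vert \psi \Vert _{L^2([0,T]) \mathscr {W}_c^{-1/3,6}}\), the chain of inequalities gives, for some \(K \ge 1\) independent of c and T,

$$\begin{aligned} X(T)&\le K \, \Vert \psi _0 \Vert _{\mathscr {H}_c^{1/2}} + K \, X(T)^3. \end{aligned}$$

Write \(\varepsilon :=\Vert \psi _0 \Vert _{\mathscr {H}_c^{1/2}}\) and suppose \(8K^3\varepsilon ^2<1\). Since X is continuous with \(X(0) \le K\varepsilon \), if \(X(T)=2K\varepsilon \) held for some T one would get \(2K\varepsilon \le K\varepsilon (1+8K^3\varepsilon ^2) < 2K\varepsilon \), a contradiction. Hence \(X(T) < 2K\varepsilon \) for all T, which is the bound (8).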

We also give a formulation of the Kato–Ponce inequality for the relativistic Sobolev spaces.

Proposition 2

Let \(f,g \in \mathscr {S}(\mathbb {R}^d)\), and let \(c>0\), \(1< r < \infty \) and \(k \ge 0\). Then

$$\begin{aligned} \Vert f \; g \Vert _{\mathscr {W}_c^{k,r}}&\lesssim \Vert f\Vert _{\mathscr {W}_c^{k,r_1}} \Vert g\Vert _{L^{r_2}} + \Vert f\Vert _{L^{r_3}} \Vert g\Vert _{\mathscr {W}_c^{k,r_4}}, \end{aligned}$$
(18)

with

$$\begin{aligned} \frac{1}{r} = \frac{1}{r_1} + \frac{1}{r_2} = \frac{1}{r_3} + \frac{1}{r_4}, \quad \; 1< r_1, r_4 <+\infty . \end{aligned}$$

Remark 5

For \(c=1\) Eq. (18) reduces to the classical Kato–Ponce inequality.

Proof

We follow an argument by Cordero and Zucco (see Theorem 2.3 in [19]).

We introduce the dilation operator \(S_c(f)(x):=f(x/c)\), for any \(c>0\).

Then we apply the classical Kato–Ponce inequality to the rescaled product \(S_c(fg) = S_c(f) \; S_c(g)\),

$$\begin{aligned} \Vert S_c(fg) \Vert _{W^{k,r}}&\lesssim \Vert S_c(f)\Vert _{W^{k,r_1}} \Vert S_c(g)\Vert _{L^{r_2}} + \Vert S_c(f)\Vert _{L^{r_3}} \Vert S_c(g)\Vert _{W^{k,r_4}}, \end{aligned}$$
(19)

where

$$\begin{aligned} \frac{1}{r} = \frac{1}{r_1} + \frac{1}{r_2} = \frac{1}{r_3} + \frac{1}{r_4}, \quad \; 1< r_1, r_4 <+\infty . \end{aligned}$$

Now, combining the commutativity property

$$\begin{aligned} \langle \nabla \rangle ^k S_c(f)(x)&= c^{-k} S_c( \langle \nabla \rangle _c^k \; f)(x), \end{aligned}$$

with the equality \(\Vert S_c(f)\Vert _{L^r} = c^{d/r} \Vert f\Vert _{L^r}\), we can rewrite (19) as

$$\begin{aligned} \Vert \langle \nabla \rangle _c^k( f \; g) \Vert _{L^{r}}&\lesssim \Vert \langle \nabla \rangle _c^k f\Vert _{L^{r_1}} \Vert g\Vert _{L^{r_2}} + \Vert f\Vert _{L^{r_3}} \Vert \langle \nabla \rangle _c^k g\Vert _{L^{r_4}}, \end{aligned}$$

and this leads to the thesis. \(\square \)
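The bookkeeping of the powers of c in this argument can be sketched as follows. The change of variables \(x=cy\) gives \(\Vert S_c(h)\Vert _{L^q} = c^{d/q} \Vert h\Vert _{L^q}\) for any exponent q, so that, by the commutativity property,

$$\begin{aligned} \Vert S_c(fg) \Vert _{W^{k,r}} \;&= \; c^{-k} \Vert S_c(\langle \nabla \rangle _c^k (fg)) \Vert _{L^r} \; = \; c^{d/r} \, \Vert fg \Vert _{\mathscr {W}_c^{k,r}}, \\ \Vert S_c(f)\Vert _{W^{k,r_1}} \Vert S_c(g)\Vert _{L^{r_2}} \;&= \; c^{d/r_1 + d/r_2} \, \Vert f\Vert _{\mathscr {W}_c^{k,r_1}} \Vert g\Vert _{L^{r_2}} \; = \; c^{d/r} \, \Vert f\Vert _{\mathscr {W}_c^{k,r_1}} \Vert g\Vert _{L^{r_2}}, \end{aligned}$$

and similarly for the second term on the right-hand side of (19). The common factor \(c^{d/r}\) cancels, and (19) becomes (18) with a constant independent of c.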

We conclude with another dispersive result, which could be interesting in itself: by exploiting the boundedness of the wave operators for the Schrödinger equation, we can deduce Strichartz estimates for the KG equation with potential.

Theorem 5

Let \(c \ge 1\), and consider the operator

$$\begin{aligned} \mathscr {H}(x) := c \left( c^2-\varDelta +V(x)\right) ^{1/2}&= \mathscr {H}_0 \left( 1+ \langle \nabla \rangle _c^{-2}V\right) ^{1/2}, \end{aligned}$$
(20)

where \(V \in C(\mathbb {R}^3,\mathbb {R})\) is a potential such that

$$\begin{aligned} |V(x)|+|\nabla V(x)|&\lesssim \left\langle x \right\rangle ^{-\beta }, \quad x \in \mathbb {R}^3, \end{aligned}$$

for some \(\beta >5\), and that 0 is neither an eigenvalue nor a resonance for the operator \(-\varDelta +V(x)\). Let \((p,q)\) be a Schrödinger-admissible couple, and assume that \(\psi _0 \in \langle \nabla \rangle ^{-1/2}_c L^2\) is orthogonal to the bound states of \(-\varDelta +V(x)\). Then

$$\begin{aligned} \Vert \langle \nabla \rangle _c^{ \frac{1}{q}-\frac{1}{p} } \, e^{it\mathscr {H}(x)}\psi _0\Vert _{L^p_tL^q_x}&\lesssim c^{ \frac{1}{q}-\frac{1}{p}-\frac{1}{2} } \Vert \langle \nabla \rangle _c^{1/2} \; \psi _0\Vert _{L^2}. \end{aligned}$$
(21)

In order to prove Theorem 5 we recall Yajima’s result on wave operators [60] (where we denote by \(P_c(-\varDelta +V)\) the projection onto the continuous spectrum of the operator \(-\varDelta +V\)).

Theorem 6

Assume that

  • 0 is neither an eigenvalue nor a resonance for \(-\varDelta +V\);

  • \(|\partial ^\alpha V(x)| \lesssim \left\langle x \right\rangle ^{-\beta }\) for \(|\alpha |\le k\), for some \(\beta >5\).

Consider the strong limits

$$\begin{aligned} \mathscr {W}_\pm := \lim _{t \rightarrow \pm \infty } e^{it(-\varDelta +V)} e^{it\varDelta }, \;&\; \mathscr {Z}_\pm := \lim _{t \rightarrow \pm \infty } e^{-it\varDelta } e^{it(\varDelta -V)}P_c(-\varDelta +V). \end{aligned}$$

Then \(\mathscr {W}_\pm : L^2 \rightarrow P_c(-\varDelta +V)L^2\) are isomorphic isometries which extend into isomorphisms \(\mathscr {W}_\pm : W^{k,p} \rightarrow P_c(-\varDelta +V)W^{k,p}\) for all \(p \in [1,+\infty ]\), with inverses \(\mathscr {Z}_\pm \). Furthermore, for any Borel function \(f(\cdot )\) we have

$$\begin{aligned} f(-\varDelta +V)P_c(-\varDelta +V) = \mathscr {W}_\pm f(-\varDelta ) \mathscr {Z}_\pm , \;&\; f(-\varDelta ) = \mathscr {Z}_\pm f(-\varDelta +V)P_c(-\varDelta +V)\mathscr {W}_\pm . \end{aligned}$$
(22)

Now, in the case \(c=1\) one can derive Strichartz estimates for \(\mathscr {H}(x)\) from the Strichartz estimates for the free KG equation, simply by applying the aforementioned theorem by Yajima in the case \(k=1\) (since \(1/p-1/q+1/2 \in [0,5/6]\) for all Schrödinger-admissible couples \((p,q)\)). This was already proved in [5] (see Lemma 6.3). In the general case, the estimate follows from an interpolation argument, which we defer to Appendix B.
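For the reader's convenience, the range of the exponent quoted above can be checked directly. The following short computation is only a sketch: it assumes the standard three-dimensional admissibility condition \(2/p+3/q=3/2\) with \(2 \le p \le +\infty \), \(2 \le q \le 6\), which is not restated in this section.

$$\begin{aligned} \frac{1}{q} = \frac{1}{2}-\frac{2}{3p} \quad \Longrightarrow \quad \frac{1}{p}-\frac{1}{q}+\frac{1}{2} = \frac{1}{p}+\frac{2}{3p} = \frac{5}{3p} \in \left[ 0,\frac{5}{6}\right] \quad \text {for } p \in [2,+\infty ]. \end{aligned}$$

Under this assumption the endpoints \(0\) and \(5/6\) correspond to \((p,q)=(+\infty ,2)\) and \((p,q)=(2,6)\), respectively.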

4 Galerkin averaging method

Consider the scale of Banach spaces \(W^{k,p}(M,\mathbb {C}^n \times \mathbb {C}^n) \ni (\psi ,\bar{\psi })\) (\(k \ge 1\), \(1<p<+\infty \), \(n \in \mathbb {N}_0\)) endowed with the standard symplectic form. Having fixed k and p, and \(U_{k,p} \subset W^{k,p}\) open, we define the gradient of \(H \in C^\infty (U_{k,p},\mathbb {R})\) with respect to \(\bar{\psi }\) as the unique function such that

$$\begin{aligned} \left\langle \nabla _{\bar{\psi }} H ,\bar{h} \right\rangle&= \mathrm{d}_{\bar{\psi }}H \bar{h}, \quad \forall h \in W^{k,p}, \end{aligned}$$

so that the Hamiltonian vector field of a Hamiltonian function H is given by

$$\begin{aligned} X_H(\psi ,\bar{\psi })=\left( i\nabla _{\bar{\psi }}H, \; -i\nabla _{\psi }H\right) . \end{aligned}$$

The open ball of radius R and center 0 in \(W^{k,p}\) will be denoted by \(B_{k,p}(R)\).

Now, we call an admissible family of cutoff (pseudo-differential) operators a sequence \((\pi _j(D))_{j \ge 0}\), where \(\pi _j(D): W^{k,p} \rightarrow W^{k,p}\) for any \(j\ge 0\), such that

  • for any \(f \in W^{k,p}\)

    $$\begin{aligned} f = \sum _{j \ge 0} \pi _j(D) f; \end{aligned}$$
  • for any \(j\ge 0\), \(\pi _j(D)\) can be extended to a self-adjoint operator on \(L^2\), and there exist constants \(K_1\), \(K_2>0\) such that

    $$\begin{aligned} K_1 \left( \sum _{j\ge 0} \Vert \pi _j(D)f\Vert _{L^2}^2 \right) ^{1/2}&\le \Vert f\Vert _{L^2} \le K_2 \left( \sum _{j\ge 0} \Vert \pi _j(D)f\Vert _{L^2}^2 \right) ^{1/2}; \end{aligned}$$
  • denoting \(\varPi _j(D) := \sum _{l=0}^j \pi _l(D)\) for \(j \ge 0\), there exists a positive constant \(K{^\prime }\) (possibly depending on k and p) such that, for any \(j \ge 0\),

    $$\begin{aligned} \Vert \varPi _jf\Vert _{k,p}&\le K{^\prime } \, \Vert f\Vert _{k,p} \quad \forall f \in W^{k,p}; \end{aligned}$$
  • there exist positive constants \(K^{\prime \prime }_1\), \(K^{\prime \prime }_2\) (possibly depending on k and p) and an increasing and unbounded sequence \((K_j)_{j \in \mathbb {N}} \subset \mathbb {R}_+\) such that

    $$\begin{aligned} K^{\prime \prime }_1 \Vert f \Vert _{W^{k,p}} \le \left\| \left[ \sum _{j \in \mathbb {N}} K_j^{2k} |\pi _j(D)f|^2 \right] ^{1/2} \right\| _{L^p} \le K^{\prime \prime }_2 \Vert f\Vert _{W^{k,p}}. \end{aligned}$$
    (23)

Remark 6

Let \(k \ge 0\), M be either \(\mathbb {R}^d\) or the d-dimensional torus \(\mathbb {T}^d\), and consider the Sobolev space \(H^k=H^k(M)\). One can readily check that Fourier projection operators on \(H^k\)

$$\begin{aligned} \pi _j \psi (x) := (2\pi )^{-d/2} \int _{j-1 \le |k| \le j} \hat{\psi }(k) e^{i k \cdot x} \mathrm{d}k, \quad j \ge 1 \end{aligned}$$

form an admissible family of cutoff operators. In this case we have

$$\begin{aligned} \varPi _N \psi (x) := (2\pi )^{-d/2} \int _{|k| \le N} \hat{\psi }(k) e^{i k \cdot x} \mathrm{d}k, \quad N \ge 0, \end{aligned}$$

and the constants \((K_j)_{j\in \mathbb {N}}\) in (23) are given by \(K_j:=j\).

Remark 7

Let \(k \ge 0\), \(1< p < +\infty \); we now introduce the Littlewood–Paley decomposition on the Sobolev space \(W^{k,p}=W^{k,p}(\mathbb {R}^d)\) (see [56], Ch. 13.5).

In order to do this, define the cutoff operators in \(W^{k,p}\) in the following way: Start with a smooth, radial nonnegative function \(\phi _0: \mathbb {R}^d \rightarrow \mathbb {R}\) such that \(\phi _0(\xi ) = 1\) for \(|\xi | \le 1/2\), and \(\phi _0(\xi ) = 0\) for \(|\xi | \ge 1\); then, define \(\phi _1(\xi ):=\phi _0(\xi /2)-\phi _0(\xi )\), and set

$$\begin{aligned} \phi _j(\xi )&:= \phi _1(2^{1-j}\xi ), \quad j \ge 2. \end{aligned}$$
(24)

Then \((\phi _j)_{j \ge 0}\) is a partition of unity,

$$\begin{aligned} \sum _{j \ge 0} \phi _j(\xi )&= 1. \end{aligned}$$
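The telescoping structure behind this identity can be checked numerically. The sketch below is only a one-dimensional illustration: the particular profile `phi0` (built from the standard \(e^{-1/s}\) bump) and all helper names are our own assumptions, since the text only requires \(\phi _0\) to be smooth, radial, nonnegative, equal to 1 for \(|\xi | \le 1/2\) and equal to 0 for \(|\xi | \ge 1\).

```python
import math


def smooth_step(t):
    """C-infinity transition: 0 for t <= 0, 1 for t >= 1 (illustrative choice)."""
    def g(s):
        return math.exp(-1.0 / s) if s > 0 else 0.0
    return g(t) / (g(t) + g(1.0 - t))


def phi0(xi):
    """Radial cutoff: 1 for |xi| <= 1/2 and 0 for |xi| >= 1."""
    return 1.0 - smooth_step(2.0 * abs(xi) - 1.0)


def phi(j, xi):
    """Dyadic pieces: phi_1(eta) = phi0(eta/2) - phi0(eta), phi_j(xi) = phi_1(2^(1-j) xi)."""
    if j == 0:
        return phi0(xi)
    eta = 2.0 ** (1 - j) * xi
    return phi0(eta / 2.0) - phi0(eta)


def partial_sum(J, xi):
    """Sum of phi_j(xi) over 0 <= j <= J; telescopes to phi0(2^(-J) xi)."""
    return sum(phi(j, xi) for j in range(J + 1))
```

Since the partial sums telescope to \(\phi _0(2^{-J}\xi )\), they are identically 1 on \(|\xi | \le 2^{J-1}\), which is the finite-\(J\) version of the partition of unity.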

Now, for each \(j \in \mathbb {N}\) and each \(f \in W^{k,2}\), we can define \(\phi _j(D)f\) by

$$\begin{aligned} \mathscr {F}( \phi _j(D)f )(\xi ) := \phi _j(\xi )\hat{f}(\xi ). \end{aligned}$$

It is well known that for \(p \in (1,+\infty )\) the map \(\varPhi :L^p(\mathbb {R}^d) \rightarrow L^p(\mathbb {R}^d,l^2)\),

$$\begin{aligned} \varPhi (f)&:= (\phi _j(D)f)_{j \in \mathbb {N}}, \end{aligned}$$

maps \(L^p(\mathbb {R}^d)\) isomorphically onto a closed subspace of \(L^p(\mathbb {R}^d,l^2)\), and we have compatibility of norms ([56], Ch. 13.5, (5.45)–(5.46)),

$$\begin{aligned} K{^\prime }_p \Vert f \Vert _{L^p} \le \Vert \varPhi (f)\Vert _{L^p(\mathbb {R}^d,l^2)}&:= \left\| \left[ \sum _{j \in \mathbb {N}} |\phi _j(D)f|^2 \right] ^{1/2} \right\| _{L^p} \le K_p \Vert f\Vert _{L^p}, \end{aligned}$$

and similarly for the \(W^{k,p}\)-norm, i.e., for any \(k>0\) and \(p \in (1,+\infty )\)

$$\begin{aligned} K{^\prime }_{k,p} \Vert f \Vert _{W^{k,p}} \le \left\| \left[ \sum _{j \in \mathbb {N}} 2^{2jk} |\phi _j(D)f|^2 \right] ^{1/2} \right\| _{L^p} \le K_{k,p} \Vert f\Vert _{W^{k,p}}. \end{aligned}$$
(25)

We then define the cutoff operator \(\varPi _N\) by

$$\begin{aligned} \varPi _N\psi := \sum _{j \le N}\phi _j(D)\psi . \end{aligned}$$
(26)

Hence, according to the above definition, the sequence \((\phi _j(D))_{j\ge 0}\) is an admissible family of cutoff operators.

We point out that the Littlewood–Paley decomposition, along with the norm equivalence (25), can be extended to compact manifolds (see [13]), as well as to some particular noncompact manifolds (see [12]).

Now we consider a Hamiltonian system of the form

$$\begin{aligned} H=h_0+ \varepsilon \, h + \varepsilon \, F, \end{aligned}$$
(27)

where \(\varepsilon >0\) is a parameter. We fix an admissible family of cutoff operators \((\pi _j(D))_{j \ge 0}\) on \(W^{k,p}(\mathbb {R}^d)\). We assume that

  1. PER

    \(h_0\) generates a linear periodic flow \(\varPhi ^t\) with period \(2\pi \),

    $$\begin{aligned} \varPhi ^{t+2\pi } = \varPhi ^t \quad \forall t. \end{aligned}$$

    We also assume that \(\varPhi ^t\) is analytic from \(W^{k,p}\) to itself for any \(k \ge 1\), and for any \(p \in (1,+\infty )\);

  2. INV

    for any \(k\ge 1\), for any \(p \in (1,+\infty )\), \(\varPhi ^t\) leaves invariant the space \(\varPi _jW^{k,p}\) for any \(j\ge 0\). Furthermore, for any \(j \ge 0\)

    $$\begin{aligned} \pi _j(D) \circ \varPhi ^t = \varPhi ^t \circ \pi _j(D); \end{aligned}$$
  3. NF

    h is in normal form, namely

    $$\begin{aligned} h \circ \varPhi ^t = h. \end{aligned}$$

Next we assume that both h and F, together with their Hamiltonian vector fields, admit asymptotic expansions in \(\varepsilon \) of the form

$$\begin{aligned}&h \sim \sum _{j \ge 1} \varepsilon ^{j-1} h_j, \quad F \sim \sum _{j \ge 1} \varepsilon ^{j-1} F_j, \end{aligned}$$
(28)
$$\begin{aligned}&X_h \sim \sum _{j \ge 1} \varepsilon ^{j-1} X_{h_j}, \quad X_F \sim \sum _{j \ge 1} \varepsilon ^{j-1} X_{F_j}, \end{aligned}$$
(29)

and that the following properties are satisfied

  1. HVF

    There exists \(R^*>0\) such that for any \(j \ge 1\)

    • \(X_{h_j}\) is analytic from \(B_{k+2j,p}(R^*)\) to \(W^{k,p}\);

    • \(X_{F_j}\) is analytic from \(B_{k+2(j-1),p}(R^*)\) to \(W^{k,p}\).

    Moreover, for any \(r \ge 1\) we have that

    • \(X_{h-\sum _{j=1}^r \varepsilon ^{j-1} h_j}\) is analytic from \(B_{k+2(r+1),p}(R^*)\) to \(W^{k,p}\);

    • \(X_{F - \sum _{j=1}^r \varepsilon ^{j-1} F_j}\) is analytic from \(B_{k+2r,p}(R^*)\) to \(W^{k,p}\).

The main result of this section is the following theorem.

Theorem 7

Fix \(r\ge 1\), \(R>0\), \(k_1\gg 1\), \(1<p<+\infty \). Consider (27), and assume PER, INV (with respect to the Littlewood–Paley decomposition), NF and HVF. Then there exists \(k_0=k_0(r)>0\) with the following property: for any \(k \ge k_1\) there exists \(\varepsilon _{r,k,p} \ll 1\) such that for any \(\varepsilon <\varepsilon _{r,k,p}\) there exists an analytic canonical transformation \(\mathscr {T}^{(r)}_\varepsilon :B_{k,p}(R) \rightarrow B_{k,p}(2R)\) such that

$$\begin{aligned} H_r := H \circ \mathscr {T}^{(r)}_\varepsilon = h_0 + \sum _{j=1}^r\varepsilon ^j \mathscr {Z}_j + \varepsilon ^{r+1} \; \mathscr {R}^{(r)}, \end{aligned}$$

where \(\mathscr {Z}_j\) are in normal form, namely

$$\begin{aligned} \{\mathscr {Z}_j,h_0\}&= 0, \end{aligned}$$
(30)

and

$$\begin{aligned} \sup _{B_{k+k_0,p}(R)} \Vert X_{\mathscr {Z}_{j}}\Vert _{W^{k,p}}&\le C_{k,p},\nonumber \\ \sup _{B_{k+k_0,p}(R)} \Vert X_{\mathscr {R}^{(r)}}\Vert _{W^{k,p}}&\le C_{k,p}, \end{aligned}$$
(31)
$$\begin{aligned} \sup _{B_{k,p}(R)} \Vert \mathscr {T}^{(r)}_\varepsilon -id\Vert _{W^{k,p}}&\le C_{k,p} \, \varepsilon . \end{aligned}$$
(32)

In particular, we have that

$$\begin{aligned} \mathscr {Z}_1(\psi ,\bar{\psi }) = h_1(\psi ,\bar{\psi }) + \left\langle F_1 \right\rangle (\psi ,\bar{\psi }), \end{aligned}$$

where \(\left\langle F_1 \right\rangle (\psi ,\bar{\psi }) := \int _0^{2\pi } F_1\circ \varPhi ^t(\psi ,\bar{\psi }) \frac{\mathrm{d}t}{2\pi }\).

5 Proof of Theorem 7

We first make a Galerkin cutoff through the Littlewood–Paley decomposition (see [56], Ch. 13.5).

In order to do this, fix \(N \in \mathbb {N}\), \(N \gg 1\), and introduce the cutoff operators \(\varPi _N\) in \(W^{k,p}\) by

$$\begin{aligned} \varPi _N\psi&:= \sum _{j \le N}\phi _j(D)\psi , \end{aligned}$$

where \(\phi _j(D)\) are the operators we introduced in Remark 7.

We notice that by assumption INV the Hamiltonian vector field of \(h_0\) generates a continuous flow \(\varPhi ^t\) which leaves \(\varPi _NW^{k,p}\) invariant.

Now we set \(H = H_{N,r} + \mathscr {R}_{N,r} + \mathscr {R}_r\), where

$$\begin{aligned} H_{N,r}&:= h_{0} + \varepsilon \, h_{N,r} + \varepsilon \, F_{N,r}, \end{aligned}$$
(33)
$$\begin{aligned} h_{N,r}&:= \sum _{j=1}^r \varepsilon ^{j-1} h_{j,N}, \quad h_{j,N} := h_j \circ \varPi _N, \end{aligned}$$
(34)
$$\begin{aligned} F_{N,r}&:= \sum _{j=1}^r \varepsilon ^{j-1} F_{j,N}, \quad F_{j,N} := F_j \circ \varPi _N, \end{aligned}$$
(35)

and

$$\begin{aligned} \mathscr {R}_{N,r}&:= h_0 +\sum _{j=1}^r \varepsilon ^j h_j +\sum _{j=1}^r \varepsilon ^j F_j -H_{N,r}, \end{aligned}$$
(36)
$$\begin{aligned} \mathscr {R}_r&:= \varepsilon \left( h - \sum _{j=1}^r \varepsilon ^{j-1} h_j \right) + \varepsilon \left( F - \sum _{j=1}^r \varepsilon ^{j-1} F_j \right) . \end{aligned}$$
(37)

The system described by the Hamiltonian (33) is the one that we will put in normal form.
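As a consistency check, one can verify that the three pieces indeed sum to the Hamiltonian (27). Substituting (36) and (37), and using nothing beyond the definitions above, we get

$$\begin{aligned} H_{N,r} + \mathscr {R}_{N,r} + \mathscr {R}_r&= h_0 + \sum _{j=1}^r \varepsilon ^j (h_j + F_j) + \varepsilon \left( h - \sum _{j=1}^r \varepsilon ^{j-1} h_j \right) + \varepsilon \left( F - \sum _{j=1}^r \varepsilon ^{j-1} F_j \right) \\&= h_0 + \varepsilon \, h + \varepsilon \, F = H. \end{aligned}$$

Thus \(\mathscr {R}_{N,r}\) collects only the error due to the Galerkin cutoff \(\varPi _N\), while \(\mathscr {R}_r\) collects the error due to truncating the expansions (28) at order r.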

In the following we will use the notation \(a \lesssim b\) to mean: there exists a positive constant K independent of N and R (but dependent on r, k and p), such that \(a \le Kb\).

We exploit the following intermediate results:

Lemma 2

For any \(k \ge k_1\) and \(p \in (1,+\infty )\) there exists \(B_{k,p}(R) \subset W^{k,p}\) such that for all \(\sigma >0\) and \(N>0\)

$$\begin{aligned} \sup _{ B_{k+\sigma +2(r+1),p}(R) } \Vert X_{\mathscr {R}_{N,r}}(\psi ,\bar{\psi })\Vert _{W^{k,p}}&\lesssim \; \frac{\varepsilon }{2^{\sigma (N+1)}}, \end{aligned}$$
(38)
$$\begin{aligned} \sup _{ B_{k+2(r+1),p}(R) } \Vert X_{\mathscr {R}_r}(\psi ,\bar{\psi })\Vert _{W^{k,p}}&\lesssim \varepsilon ^{r+1}. \end{aligned}$$
(39)

Proof

We recall that \(\mathscr {R}_{N,r} = h_0 +\sum _{j=1}^r \varepsilon ^j h_j +\sum _{j=1}^r \varepsilon ^j F_j -H_{N,r}\).

Now, \(\Vert id-\varPi _N\Vert _{ W^{k+\sigma ,p} \rightarrow W^{k,p} } \lesssim 2^{-\sigma (N+1)}\), since

$$\begin{aligned} \left\| \sum _{j \ge N+1} \phi _j(D)f \right\| _{W^{k,p}}&\lesssim \left\| \left[ \sum _{j \ge N+1} |2^{jk} \phi _j(D)f|^2 \right] ^{1/2} \right\| _{L^p} \\&\lesssim 2^{-\sigma (N+1)} \left\| \left[ \sum _{j \ge N+1} |2^{j(k+\sigma )} \phi _j(D)f|^2 \right] ^{1/2} \right\| _{L^p} \\&\lesssim 2^{-\sigma (N+1)} \Vert f\Vert _{W^{k+\sigma ,p}}, \end{aligned}$$

hence

$$\begin{aligned}&\sup _{\psi \in B_{k+2(r+1)+\sigma ,p}(R)} \; \Vert X_{\mathscr {R}_{N,r}}(\psi ,\bar{\psi })\Vert _{W^{k,p}} \\&\quad \lesssim \; \Vert dX_{ \sum _{j=1}^r \varepsilon ^j(h_j+F_j) }\Vert _{ L^\infty (B_{k+2(r+1),p}(R),W^{k,p}) } \Vert id-\varPi _N\Vert _{ L^\infty (B_{k+2(r+1)+\sigma ,p}(R),B_{k+2(r+1),p}) } \\&\quad \lesssim \varepsilon \, 2^{-\sigma (N+1)}. \end{aligned}$$

The estimate of \(X_{\mathscr {R}_r}\) follows from the hypothesis HVF. \(\square \)
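The gain of the factor \(2^{-\sigma (N+1)}\) in the estimate of \(id-\varPi _N\) rests on an elementary pointwise bound on the dyadic weights, which we spell out for completeness: for every \(j \ge N+1\) and \(\sigma >0\),

$$\begin{aligned} 2^{jk} = 2^{-j\sigma } \, 2^{j(k+\sigma )} \le 2^{-\sigma (N+1)} \, 2^{j(k+\sigma )}. \end{aligned}$$

Inserting this bound term by term in the square function gives the second inequality of that chain, and the last inequality is the upper bound in (25) with k replaced by \(k+\sigma \).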

Lemma 3

Let \(j \ge 1\). Then for any \(k \ge k_1+2(j-1)\) and \(p \in (1,+\infty )\) there exists \(B_{k,p}(R) \subset W^{k,p}\) such that

$$\begin{aligned} \sup _{ B_{k,p}(R) } \Vert X_{h_{j,N}}(\psi ,\bar{\psi })\Vert _{k,p}&\le K^{(h)}_{j,k,p} 2^{2jN} , \\ \sup _{ B_{k,p}(R) } \Vert X_{F_{j,N}}(\psi ,\bar{\psi })\Vert _{k,p}&\le K^{(F)}_{j,k,p} 2^{2(j-1)N} , \end{aligned}$$

where

$$\begin{aligned} K^{(h)}_{j,k,p}&:= \sup _{B_{k,p}(R) } \Vert X_{h_j}(\psi ,\bar{\psi })\Vert _{k-2j,p}, \\ K^{(F)}_{j,k,p}&:= \sup _{B_{k,p}(R) } \Vert X_{F_j}(\psi ,\bar{\psi })\Vert _{k-2(j-1),p}. \end{aligned}$$

Proof

It follows from

$$\begin{aligned}&\sup _{\psi \in B_{k,p}(R)} \left\| \sum _{h \le N} \phi _h(D)X_{F_{j,N}} (\psi ,\bar{\psi }) \right\| _{W^{k,p}} \lesssim \sup _{\psi \in B_{k,p}(R)} \left\| \left[ \sum _{h \le N} |2^{hk} \phi _h(D)X_{F_{j,N}}(\psi ,\bar{\psi })|^2 \right] ^{1/2} \right\| _{L^p} \end{aligned}$$
(40)
$$\begin{aligned}&\le 2^{2(j-1)N} \sup _{\psi \in B_{k,p}(R)} \left\| \left[ \sum _{h \le N} |2^{h[k-2(j-1)]} \phi _h(D)X_{F_{j,N}}(\psi ,\bar{\psi })|^2 \right] ^{1/2} \right\| _{L^p} \end{aligned}$$
(41)
$$\begin{aligned}&\lesssim 2^{2(j-1)N} \sup _{\psi \in B_{k,p}(R)} \Vert X_{F_{j,N}}(\psi ,\bar{\psi })\Vert _{k-2(j-1),p} \end{aligned}$$
(42)
$$\begin{aligned}&= K^{(F)}_{j,k,p} \, 2^{2(j-1)N}, \end{aligned}$$
(43)

and similarly for \(X_{h_{j,N}}\). \(\square \)

Next we have to normalize the system (33). In order to do this we need a slight reformulation of Theorem 4.4 in [2]. Here we report a statement of the result adapted to our context.

Lemma 4

Let \(k \ge k_1+2r\), \(p \in (1,+\infty )\), \(R>0\), and consider the system (33). Assume that \(\varepsilon < 2^{-4Nr}\), and that

$$\begin{aligned} \left( K^{(F,r)}_{k,p} + K^{(h,r)}_{k,p}\right) r 2^{2Nr} \varepsilon&< 2^{-9} e^{-1} \pi ^{-1} R , \end{aligned}$$
(44)

where

$$\begin{aligned} K^{(F,r)}_{k,p}&:= \sup _{1\le j\le r} \sup _{\psi \in B_{k,p}(R)} \Vert X_{F_j}(\psi ,\bar{\psi })\Vert _{k-2(j-1),p}, \\ K^{(h,r)}_{k,p}&:= \sup _{1\le j\le r} \sup _{\psi \in B_{k,p}(R)} \Vert X_{h_j}(\psi ,\bar{\psi })\Vert _{k-2j,p}. \end{aligned}$$

Then there exists an analytic canonical transformation \(\mathscr {T}^{(r)}_{\varepsilon ,N}:B_{k,p}(R) \rightarrow B_{k,p}(2R)\) such that

$$\begin{aligned} \sup _{B_{k,p}(R/2)} \Vert \mathscr {T}^{(r)}_{\varepsilon ,N}(\psi ,\bar{\psi })-(\psi ,\bar{\psi })\Vert _{W^{k,p}}&\le 4\pi r K^{(F,r)}_{k,p} 2^{2Nr} \varepsilon , \end{aligned}$$

and that puts (33) in normal form up to a small remainder,

$$\begin{aligned} H_{N,r} \circ \mathscr {T}^{(r)}_{\varepsilon ,N}&= h_{0} + \varepsilon h_{N,r} + \varepsilon Z^{(r)}_N + \varepsilon ^{r+1} \mathscr {R}^{(r)}_N, \end{aligned}$$
(45)

where \(Z^{(r)}_N\) is in normal form, namely \(\{h_{0,N},Z^{(r)}_N\}=0\), and

$$\begin{aligned}&\sup _{B_{k,p}(R/2)} \Vert X_{ Z^{(r)}_N }(\psi ,\bar{\psi })\Vert _{k,p} \le 4 \, 2^{2Nr} \, \varepsilon \, \left( r K^{(F,r)}_{k,p} + r K^{(h,r)}_{k,p} \right) \, r 2^{2Nr} K^{(F,r)}_{k,p} \nonumber \\&\quad = 4r^2 K^{(F,r)}_{k,p} ( K^{(F,r)}_{k,p} + K^{(h,r)}_{k,p} ) 2^{4Nr} \varepsilon , \end{aligned}$$
(46)
$$\begin{aligned}&\sup _{B_{k,p}(R/2)} \Vert X_{ \mathscr {R}^{(r)}_N }(\psi ,\bar{\psi })\Vert _{k,p} \end{aligned}$$
(47)
$$\begin{aligned}&\quad \le 2^8 e \frac{T}{R} (K^{(F,r)}_{k,p} + K^{(h,r)}_{k,p}) r 2^{2Nr} \end{aligned}$$
(48)
$$\begin{aligned}&\qquad \times \left[ \frac{4T}{R} \left( 2^9 3^2 e \frac{T}{R} (K^{(F,r)}_{k,p} + K^{(h,r)}_{k,p}) K^{(F,r)}_{k,p} r^2 2^{4Nr} \varepsilon + 5 K^{(h,r)}_{k,p} \, r 2^{2Nr} + 5 K^{(F,r)}_{k,p} \, r 2^{2Nr} \right) r \right] ^r. \end{aligned}$$
(49)

The proof of Lemma 4 is postponed to “Appendix A.”

Remark 8

In the original notation of Theorem 4.4 in [2] we set

$$\begin{aligned} \mathscr {P}&= W^{k,p}, \\ h_\omega&= h_{0}, \\ \hat{h}&= \varepsilon h_{N,r}, \\ f&= \varepsilon F_{N,r}, \\ f_1&= r = g \equiv 0, \\ F&= K^{(F,r)}_{k,p} \, r 2^{2Nr} \, \varepsilon , \\ F_0&= K^{(h,r)}_{k,p} \, r 2^{2Nr} \, \varepsilon . \end{aligned}$$

Remark 9

Actually, Lemma 4 would also hold under a weaker smallness assumption on \(\varepsilon \): It would be enough that \(\varepsilon < 2^{-2N}\), and that

$$\begin{aligned} \varepsilon&\, \left[ K^{(F,r)}_{k,p} \frac{ 1-2^{2Nr}\varepsilon ^r }{1-2^{2N}\varepsilon } + K^{(h,r)}_{k,p} \frac{ 2^{2N}(1-2^{2Nr}\varepsilon ^r) }{1-2^{2N}\varepsilon } \right] < 2^{-9} e^{-1} \pi ^{-1} R \end{aligned}$$
(50)

is satisfied. However, condition (50) is less explicit than (44), which allows us to apply directly the scheme of [2]. The disadvantage of the stronger smallness assumption (44) is that it holds for a smaller range of \(\varepsilon \), and that at the end of the proof it will force us to choose a larger parameter \(\sigma = 4r^2\). By using (50) and by making a more careful analysis, it may be possible to prove Theorem 7 also by choosing \(\sigma = 2r\).

Now we conclude with the proof of Theorem 7.

Proof

Consider the transformation \(\mathscr {T}^{(r)}_{\varepsilon ,N}\) given by Lemma 4; then

$$\begin{aligned} (\mathscr {T}^{(r)}_{\varepsilon ,N})^*H&= h_{0} \; + \sum _{j=1}^{r} \varepsilon ^j h_{j,N} \; + \varepsilon Z^{(r)}_N + \varepsilon ^{r+1} \mathscr {R}^{(r)}_N + \varepsilon ^r \mathscr {R}_{Gal} \end{aligned}$$

where we set

$$\begin{aligned} \varepsilon ^{r} \mathscr {R}_{Gal}&:= \left( \mathscr {T}^{(r)}_{\varepsilon ,N}\right) ^*( \mathscr {R}_{N,r} + \mathscr {R}_r). \end{aligned}$$

By exploiting Lemma 4 we can estimate the vector field of \(\mathscr {R}^{(r)}_N\), while by using Lemma 2 and (275) we get

$$\begin{aligned} \sup _{B_{k+\sigma +2(r+1),p }(R/2)} \; \Vert X_{\mathscr {R}_{Gal}}(\psi ,\bar{\psi })\Vert _{W^{k,p}}&\lesssim \left( \frac{\varepsilon }{2^{\sigma (N+1)}} + \frac{\varepsilon ^{r+1}}{\sigma +2(r+1)} \right) . \end{aligned}$$
(51)

To get the result choose

$$\begin{aligned} k_0&= \sigma +2(r+1), \\ N&= r \sigma ^{-1} \log _2(1/\varepsilon )-1, \\ \sigma&= 4r^2. \end{aligned}$$

\(\square \)
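The role of these choices can be made explicit by a short computation. This is only a sketch: it treats N as a real number, ignoring the integer part that one takes in practice. Since \(\sigma (N+1) = r\log _2(1/\varepsilon )\), the first term in (51) satisfies

$$\begin{aligned} \frac{\varepsilon }{2^{\sigma (N+1)}} = \varepsilon \cdot 2^{-r\log _2(1/\varepsilon )} = \varepsilon ^{r+1}, \end{aligned}$$

so the Galerkin remainder is of the same order as the truncation remainder. Moreover, with \(\sigma =4r^2\) one has \(N+1 = \log _2(1/\varepsilon )/(4r)\), hence

$$\begin{aligned} 2^{4Nr} = 2^{\log _2(1/\varepsilon )-4r} = \varepsilon ^{-1} \, 2^{-4r}, \qquad 2^{2Nr} \, \varepsilon = 2^{-2r} \, \varepsilon ^{1/2}. \end{aligned}$$

In particular \(\varepsilon \, 2^{4Nr} = 2^{-4r}<1\), so the hypothesis \(\varepsilon <2^{-4Nr}\) of Lemma 4 is automatically satisfied, and the left-hand side of (44) equals \((K^{(F,r)}_{k,p} + K^{(h,r)}_{k,p}) \, r \, 2^{-2r} \varepsilon ^{1/2}\), which is small for small \(\varepsilon \).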

Remark 10

The compatibility condition \(N \ge 1\) and (44) lead to

$$\begin{aligned} \varepsilon \le \left[ 2^{-9} e^{-1} \pi ^{-1} R ( K^{(F,r)}_{k,p} + K^{(h,r)}_{k,p} )^{-1} r^{-1} 2^{-2r} \right] ^{\frac{\sigma }{2r}}&=: \varepsilon _{r,k,p} \le 2^{-2\sigma /r} \le 2^{-8r}. \end{aligned}$$

Remark 11

We point out the fact that Theorem 7 holds for the scale of Banach spaces \(W^{k,p}(M,\mathbb {C}^n \times \mathbb {C}^n)\), where \(k \ge 1\), \(1< p < +\infty \), \(n \in \mathbb {N}_0\), and where M is a smooth manifold on which the Littlewood–Paley decomposition can be constructed, for example, a compact manifold (see sect. 2.1 in [13]), \(\mathbb {R}^d\), or a noncompact manifold satisfying some technical assumptions (see [12]).

If we restrict to the case \(p=2\), and we consider M as either \(\mathbb {R}^d\) or the d-dimensional torus \(\mathbb {T}^d\), we can prove an analogous result for Hamiltonians \(H(\psi ,\bar{\psi })\) with \((\psi ,\bar{\psi }) \in H^k:=W^{k,2}(M,\mathbb {C}\times \mathbb {C})\). In the following we denote by \(B_k(R)\) the open ball of radius R and center 0 in \(H^k\). We recall that the Fourier projection operator on \(H^k\) is given by

$$\begin{aligned} \pi _j \psi (x) := (2\pi )^{-d/2} \int _{j-1 \le |k| \le j} \hat{\psi }(k) e^{i k \cdot x} \mathrm{d}k, \quad j \ge 1. \end{aligned}$$

Theorem 8

Fix \(r \ge 1\), \(R>0\), \(k_1 \gg 1\). Consider (27), and assume PER, INV (with respect to Fourier projection operators), NF and HVF. Then there exists \(k_0=k_0(r)>0\) with the following property: for any \(k \ge k_1\) there exists \(\varepsilon _{r,k} \ll 1\) such that for any \(\varepsilon <\varepsilon _{r,k}\) there exists a transformation \(\mathscr {T}^{(r)}_\varepsilon :B_k(R) \rightarrow B_k(2R)\) such that

$$\begin{aligned} H_r := H \circ \mathscr {T}^{(r)}_\varepsilon = h_0 + \sum _{j=1}^r\varepsilon ^j \mathscr {Z}_j + \varepsilon ^{r+1} \; \mathscr {R}^{(r)}, \end{aligned}$$

where \(\mathscr {Z}_j\) are in normal form, namely

$$\begin{aligned} \{\mathscr {Z}_j,h_0\}&= 0, \end{aligned}$$
(52)

and

$$\begin{aligned} \sup _{B_{k+k_0}(R)} \Vert X_{\mathscr {R}^{(r)}}\Vert _{H^k}&\le C_k, \end{aligned}$$
(53)
$$\begin{aligned} \sup _{B_k(R)} \Vert \mathscr {T}^{(r)}_\varepsilon -id\Vert _{H^k}&\le C_k \, \varepsilon . \end{aligned}$$
(54)

In particular, we have that

$$\begin{aligned} \mathscr {Z}_1(\psi ,\bar{\psi }) = h_1(\psi ,\bar{\psi }) + \left\langle F_1 \right\rangle (\psi ,\bar{\psi }), \end{aligned}$$

where \(\left\langle F_1 \right\rangle (\psi ,\bar{\psi }) := \int _0^{2\pi } F_1\circ \varPhi ^t(\psi ,\bar{\psi }) \frac{\mathrm{d}t}{2\pi }\).

The only technical difference between the proofs of Theorem 7 and Theorem 8 is that in the latter we use the Fourier cutoff operator

$$\begin{aligned} \varPi _N \psi (x) := (2\pi )^{-d/2} \int _{|k| \le N} \hat{\psi }(k) e^{i k \cdot x} \mathrm{d}k, \end{aligned}$$

as in [3]. This in turn affects (38), which in this case reads

$$\begin{aligned} \sup _{ B_{k+\sigma +2(r+1)}(R) } \Vert X_{\mathscr {R}_{N,r}}(\psi ,\bar{\psi })\Vert _{H^k}&\lesssim \; \frac{\varepsilon }{N^\sigma }, \end{aligned}$$
(55)

and (51), for which we have to choose a bigger cutoff, \(N=\varepsilon ^{- r \sigma }\).

6 Application to the nonlinear Klein–Gordon equation

6.1 The real nonlinear Klein–Gordon equation

We first consider the Hamiltonian of the real nonlinear Klein–Gordon equation with power-type nonlinearity on a smooth manifold M (M is such that the Littlewood–Paley decomposition is well defined; take, for example, a smooth compact manifold, or \(\mathbb {R}^d\)). The Hamiltonian is of the form

$$\begin{aligned} H(u,v)&= \frac{c^2}{2} \left\langle v,v\right\rangle + \frac{1}{2} \left\langle u,\langle \nabla \rangle _c^2u \right\rangle \; + \; \lambda \int \frac{u^{2l}}{2l}, \end{aligned}$$
(56)

where \(\langle \nabla \rangle _c:=(c^2-\varDelta )^{1/2}\), \(\lambda \in \mathbb {R}\), \(l \ge 2\).

If we introduce the complex-valued variable

$$\begin{aligned} \psi&:= \frac{1}{\sqrt{2}} \left[ \left( \frac{\langle \nabla \rangle _c}{c} \right) ^{1/2} u - i \left( \frac{c}{\langle \nabla \rangle _c}\right) ^{1/2}v \right] , \end{aligned}$$
(57)

(the corresponding symplectic 2-form becomes \(i \mathrm{d}\psi \wedge \mathrm{d}\bar{\psi }\)), the Hamiltonian (56) in the coordinates \((\psi ,\bar{\psi })\) is

$$\begin{aligned} H(\bar{\psi },\psi )&= \left\langle \bar{\psi }, c\langle \nabla \rangle _c\psi \right\rangle + \frac{\lambda }{2l} \int \left[ \left( \frac{c}{\langle \nabla \rangle _c} \right) ^{1/2} \frac{\psi +\bar{\psi }}{\sqrt{2}} \right] ^{2l} \mathrm{d}x. \end{aligned}$$
(58)

If we rescale the time by a factor \(c^{2}\), the Hamiltonian takes the form (27), with \(\varepsilon = \frac{1}{c^2}\), and

$$\begin{aligned} H(\psi ,\bar{\psi })&= h_0(\psi ,\bar{\psi }) + \varepsilon \, h(\psi ,\bar{\psi }) + \varepsilon \, F(\psi ,\bar{\psi }), \end{aligned}$$
(59)

where

$$\begin{aligned} h_0(\psi ,\bar{\psi })&= \left\langle \bar{\psi },\psi \right\rangle , \end{aligned}$$
(60)
$$\begin{aligned} h(\psi ,\bar{\psi })&= \left\langle \bar{\psi }, \left( c \langle \nabla \rangle _c - c^2 \right) \psi \right\rangle \sim \sum _{j\ge 1}\varepsilon ^{j-1} \; \left\langle \bar{\psi },a_j\varDelta ^j\psi \right\rangle =: \sum _{j\ge 1}\varepsilon ^{j-1} h_j(\psi ,\bar{\psi }), \end{aligned}$$
(61)
$$\begin{aligned} F(\psi ,\bar{\psi })&= \frac{\lambda }{2^{l+1}l} \int \left[ \left( \frac{c}{\langle \nabla \rangle _c}\right) ^{1/2} (\psi +\bar{\psi }) \right] ^{2l} \mathrm{d}x \end{aligned}$$
(62)
$$\begin{aligned}&\sim \frac{\lambda }{2^{l+1}l} \int (\psi +\bar{\psi })^{2l} \mathrm{d}x \nonumber \\&\quad + \varepsilon b_2 \int \left[ (\psi +\bar{\psi })^{2l-1}\varDelta (\psi +\bar{\psi }) + \ldots + (\psi +\bar{\psi })\varDelta \left( \left( \psi +\bar{\psi }\right) ^{2l-1}\right) \right] \mathrm{d}x \nonumber \\&\quad + \mathscr {O}(\varepsilon ^2) \nonumber \\&=: \sum _{j \ge 1} \varepsilon ^{j-1} \, F_j(\psi ,\bar{\psi }), \end{aligned}$$
(63)

where \((a_j)_{j \ge 1}\) and \((b_j)_{j \ge 1}\) are real coefficients, and \(F_j(\psi ,\bar{\psi })\) is a polynomial function of the variables \(\psi \) and \(\bar{\psi }\) (along with their derivatives) which admits a bounded vector field from a neighborhood of the origin in \(W^{k+2(j-1),p}\) to \(W^{k,p}\) for any \(1<p<+\infty \).
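The coefficients \(a_j\) in (61) can be computed explicitly from the binomial series of \(c\langle \nabla \rangle _c - c^2 = \varepsilon ^{-1}\left( (1-\varepsilon \varDelta )^{1/2}-1\right) \), which gives \(a_j = (-1)^j \left( {\begin{array}{c}1/2\\ j\end{array}}\right) \). The following sketch computes them exactly and compares the truncated expansion with the exact Fourier symbol (where \(\varDelta \) acts as \(-|\xi |^2\)). The function names are hypothetical and chosen only for this illustration.

```python
from fractions import Fraction
import math


def gen_binom(alpha, j):
    """Generalized binomial coefficient C(alpha, j) as an exact fraction."""
    out = Fraction(1)
    for i in range(j):
        out *= (alpha - i) / Fraction(i + 1)
    return out


def a(j):
    """Coefficient a_j of Delta^j in the expansion (61): a_j = (-1)^j C(1/2, j)."""
    return (-1) ** j * gen_binom(Fraction(1, 2), j)


def symbol_exact(c, xi):
    """Fourier symbol of c<nabla>_c - c^2 at frequency xi (Delta -> -xi^2)."""
    return c * math.sqrt(c * c + xi * xi) - c * c


def symbol_series(c, xi, order):
    """Expansion (61) truncated at the given order, with eps = 1/c^2."""
    eps = 1.0 / (c * c)
    return sum(eps ** (j - 1) * float(a(j)) * (-xi * xi) ** j
               for j in range(1, order + 1))
```

For instance, `a(1)` and `a(2)` reproduce the coefficients \(-1/2\) and \(-1/8\) of \(h_1\) and \(h_2\). The truncation error at fixed frequency decays as a power of \(\varepsilon \), whereas at fixed \(\varepsilon \) it grows with the frequency, which is consistent with the loss of derivatives encoded in HVF.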

This description clearly fits the scheme treated in the previous section, and one can easily check that assumptions PER, NF and HVF are satisfied. Therefore, we can apply Theorem 7 to the Hamiltonian (59).

Remark 12

About the normal forms obtained by applying Theorem 7, we remark that in the first step (case \(r=1\) in the statement of the theorem) the homological equation we get is of the form

$$\begin{aligned} \{\chi _1, h_0 \} + F_1 = \left\langle F_1 \right\rangle , \end{aligned}$$
(64)

where \(F_1(\psi ,\bar{\psi }) = \frac{\lambda }{2^{l+1}l} \int (\psi +\bar{\psi })^{2l} \mathrm{d}x\). Hence the transformed Hamiltonian is of the form

$$\begin{aligned} H_1(\psi ,\bar{\psi }) = h_0(\psi ,\bar{\psi }) + \frac{1}{c^2} \left[ -\frac{1}{2} \left\langle \bar{\psi },\varDelta \psi \right\rangle + \left\langle F_1 \right\rangle (\psi ,\bar{\psi }) \right] + \frac{1}{c^4} \mathscr {R}^{(1)}(\psi ,\bar{\psi }), \end{aligned}$$
(65)

where

$$\begin{aligned} \left\langle F_1 \right\rangle (\psi ,\bar{\psi })&= \frac{\lambda }{2^{l+1}l} \left( {\begin{array}{c}2l\\ l\end{array}}\right) \int |\psi |^{2l} \; \mathrm{d}x. \end{aligned}$$
(66)
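Formula (66) reflects the fact that, under the flow \(\psi \mapsto e^{it}\psi \) of \(h_0\), the monomial \(\psi ^{2l-j}\bar{\psi }^j\) acquires the phase \(e^{2i(l-j)t}\), so that only the term \(j=l\) survives the time average. A pointwise numerical check of this mechanism is sketched below. The function name and the choice of an equispaced quadrature rule, which is exact for trigonometric polynomials of this degree, are our own illustrative assumptions.

```python
import cmath
import math


def averaged_power(z, l, samples=None):
    """Average over one period of (z e^{it} + conj(z) e^{-it})^{2l}.

    An equispaced rule with more than 2l nodes integrates exactly all the
    frequencies 2(l - j), j = 0, ..., 2l, that occur in the expansion.
    """
    n = samples if samples is not None else 4 * l + 1
    total = 0.0
    for m in range(n):
        t = 2.0 * math.pi * m / n
        w = z * cmath.exp(1j * t) + z.conjugate() * cmath.exp(-1j * t)
        total += w.real ** (2 * l)  # w is real up to rounding errors
    return total / n
```

For a complex number `z`, the value returned by `averaged_power(z, l)` agrees, up to rounding, with `math.comb(2 * l, l) * abs(z) ** (2 * l)`, which is the integrand of (66) without the prefactor \(\lambda /(2^{l+1}l)\).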

If we neglect the remainder and we derive the corresponding equation of motion for the system, we get

$$\begin{aligned} -i \psi _t \; = \psi + \frac{1}{c^2} \left[ -\frac{1}{2} \varDelta \psi + \frac{\lambda }{2^{l+1}} \left( {\begin{array}{c}2l\\ l\end{array}}\right) |\psi |^{2(l-1)}\psi \right] , \end{aligned}$$
(67)

which is the NLS, and the Hamiltonian which generates the canonical transformation is given by

$$\begin{aligned} \chi _1(\psi ,\bar{\psi }) = \frac{\lambda }{2^{l+1}l} \sum _{\begin{array}{c} j=0,\ldots ,2l \\ j \ne l \end{array}} \frac{1}{i \, 2(l-j)} \left( {\begin{array}{c}2l\\ j\end{array}}\right) \int \psi ^{2l-j} \bar{\psi }^{j} \mathrm{d}x. \end{aligned}$$
(68)

Remark 13

Now we iterate the construction by passing to the case \(r=2\).

If we neglect the remainder of order \(c^{-6}\), we have that

$$\begin{aligned} H \circ \mathscr {T}^{(1)}&= h_0 + \frac{1}{c^2} h_1 + \frac{1}{c^4} \{\chi _1, h_1\} + \frac{1}{c^4} h_2 \nonumber \\&\quad + \frac{1}{c^2} \left\langle F_1 \right\rangle + \frac{1}{c^4} \{\chi _1, F_1\} + \frac{1}{2c^4} \{\chi _1,\{\chi _1,h_0\}\} + \frac{1}{c^4} F_2 \end{aligned}$$
(69)
$$\begin{aligned}&= h_0 + \frac{1}{c^2} \left[ h_1 + \left\langle F_1\right\rangle \right] + \frac{1}{c^4} \left[ \{\chi _1,h_1\} + h_2 + \{\chi _1,F_1\} + \frac{1}{2} \{ \chi _1, \left\langle F_1 \right\rangle - F_1 \} + F_2 \right] , \end{aligned}$$
(70)

where \(h_1(\psi ,\bar{\psi }) = -\frac{1}{2} \left\langle \bar{\psi },\varDelta \psi \right\rangle \), and \(\chi _1\) is of the form (68).

Now we compute the terms of order \(\frac{1}{c^4}\).

$$\begin{aligned} \{\chi _1, h_1 \}&= \mathrm{d}\chi _1 X_{h_1} = \frac{\partial \chi _1}{\partial \psi } \cdot i \frac{\partial h_1}{\partial \bar{\psi }} - i \frac{\partial \chi _1}{\partial \bar{\psi }} \frac{\partial h_1}{\partial \psi } \nonumber \\&= -\frac{\lambda }{2^{l+3}l} \int \left[ \sum _{\begin{array}{c} j=0,\ldots ,2l-1 \\ j \ne l \end{array}} \frac{1}{l-j} \left( {\begin{array}{c}2l\\ j\end{array}}\right) (2l-j) \psi ^{2l-j-1} \bar{\psi }^j \right] \, \varDelta \psi \; \mathrm{d}x \nonumber \\&\quad +\frac{\lambda }{2^{l+3}l} \int \left[ \sum _{\begin{array}{c} j=1,\ldots ,2l \\ j \ne l \end{array}} \frac{1}{l-j} \left( {\begin{array}{c}2l\\ j\end{array}}\right) j \psi ^{2l-j} \bar{\psi }^{j-1} \right] \, \varDelta \bar{\psi } \; \mathrm{d}x \nonumber \\&=-\frac{\lambda }{2^{l+3}l} \, 2 \int \left( \varDelta \psi \, \psi ^{2l-1} + \varDelta \bar{\psi } \, \bar{\psi }^{2l-1} \right) \mathrm{d}x \nonumber \\&\quad -\frac{\lambda }{2^{l+3}l} \sum _{\begin{array}{c} j=1,\ldots ,2l-1 \\ j \ne l \end{array}} \frac{1}{l-j} \left( {\begin{array}{c}2l\\ j\end{array}}\right) \int \left[ (2l-j) \psi ^{2l-j-1}\bar{\psi }^j \, \varDelta \psi \right. \nonumber \\&\qquad \left. - j \psi ^{2l-j}\bar{\psi }^{j-1} \, \varDelta \bar{\psi } \right] \mathrm{d}x, \end{aligned}$$
(71)

and since \(j \ne l\) in the sum we have that

$$\begin{aligned} \left\langle \{\chi _1, h_1 \} \right\rangle&= 0. \end{aligned}$$
(72)

Next,

$$\begin{aligned}&h_2 = -\frac{1}{8} \left\langle \bar{\psi },\varDelta ^2\psi \right\rangle , \end{aligned}$$
(73)
$$\begin{aligned}&\{\chi _1, F_1\} \nonumber \\&\quad = \frac{\lambda ^2}{2^{2l+3}l^2} \int \left[ \sum _{\begin{array}{c} j=0,\ldots ,2l-1 \\ j \ne l \end{array}} \frac{1}{l-j} \left( {\begin{array}{c}2l\\ j\end{array}}\right) (2l-j) \psi ^{2l-j-1}\bar{\psi }^j \right] \left[ \sum _{h=1}^{2l} \left( {\begin{array}{c}2l\\ h\end{array}}\right) h \psi ^{2l-h}\bar{\psi }^{h-1} \right] \; \mathrm{d}x \nonumber \\&\qquad - \frac{\lambda ^2}{2^{2l+3}l^2} \int \left[ \sum _{\begin{array}{c} j=1,\ldots ,2l \\ j \ne l \end{array}} \frac{1}{l-j} \left( {\begin{array}{c}2l\\ j\end{array}}\right) j \psi ^{2l-j}\bar{\psi }^{j-1} \right] \left[ \sum _{h=0}^{2l-1} \left( {\begin{array}{c}2l\\ h\end{array}}\right) (2l-h) \psi ^{2l-h-1}\bar{\psi }^{h} \right] \; \mathrm{d}x \nonumber \\&\quad = \frac{\lambda ^2}{2^{2l+3}l^2} \sum _{\begin{array}{c} j,h=1,\ldots ,2l-1 \\ j \ne l \end{array}} \frac{1}{l-j} \left( {\begin{array}{c}2l\\ j\end{array}}\right) \left( {\begin{array}{c}2l\\ h\end{array}}\right) [(2l-j)h-j(2l-h)] \int \psi ^{4l-j-h-1} \bar{\psi }^{j+h-1} \; \mathrm{d}x \nonumber \\&\qquad + \frac{\lambda ^2}{2^{2l+3}l^2} \, 2 \int \psi ^{2l-1} \left[ \sum _{h=1}^{2l} \left( {\begin{array}{c}2l\\ h\end{array}}\right) h \psi ^{2l-h} \bar{\psi }^{h-1} \right] \; \mathrm{d}x \nonumber \\&\qquad + \frac{\lambda ^2}{2^{2l+3}l^2} \, 2l \int \left[ \sum _{\begin{array}{c} j=0,\ldots ,2l-1 \\ j \ne l \end{array}} \frac{1}{l-j} \left( {\begin{array}{c}2l\\ j\end{array}}\right) (2l-j) \psi ^{2l-j-1} \bar{\psi }^j \right] \bar{\psi }^{2l-1} \; \mathrm{d}x \nonumber \\&\qquad + \frac{\lambda ^2}{2^{2l+3}l^2} \, 2 \int \bar{\psi }^{2l-1} \left[ \sum _{h=0}^{2l-1} \left( {\begin{array}{c}2l\\ h\end{array}}\right) (2l-h) \psi ^{2l-h-1} \bar{\psi }^h \right] \; \mathrm{d}x \nonumber \\&\qquad - \frac{\lambda ^2}{2^{2l+3}l^2} \, 2l \int \left[ \sum _{\begin{array}{c} j=1,\ldots ,2l \\ j \ne l \end{array}} \frac{1}{l-j} \left( {\begin{array}{c}2l\\ j\end{array}}\right) j \psi ^{2l-j}\bar{\psi }^{j-1} \right] \psi ^{2l-1} \; \mathrm{d}x, \end{aligned}$$
(74)
$$\begin{aligned}&\left\langle \{\chi _1, F_1\} \right\rangle = \lambda ^2 K(l) \int |\psi |^{2(2l-1)} \; \mathrm{d}x, \end{aligned}$$
(75)
$$\begin{aligned}&K(l) := \frac{1}{2^{2l+3}l^2} \left\{ \left( \sum _{\begin{array}{c} j,h=1,\ldots ,2l-1 \\ j \ne l \\ j+h=2l \end{array}} \frac{1}{l-j} \left( {\begin{array}{c}2l\\ j\end{array}}\right) \left( {\begin{array}{c}2l\\ h\end{array}}\right) [(2l-j)h-j(2l-h)] \right) + 16 l \right\} , \end{aligned}$$
(76)

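where \(K(l)>0\): on the constraint \(j+h=2l\) one has \((2l-j)h-j(2l-h)=2l(h-j)=4l(l-j)\), so each term of the sum equals \(4l \left( {\begin{array}{c}2l\\ j\end{array}}\right) ^2>0\).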

Then,

$$\begin{aligned}&\{ \chi _1, \left\langle F_1 \right\rangle \} \nonumber \\&\quad = \frac{\lambda ^2}{2^{2l+3}l^2} \left( {\begin{array}{c}2l\\ l\end{array}}\right) \int \sum _{\begin{array}{c} j=0,\ldots ,2l-1 \\ j \ne l \end{array}} \frac{1}{l-j} \left( {\begin{array}{c}2l\\ j\end{array}}\right) (2l-j)l \, \psi ^{2l-j-1} \bar{\psi }^j \psi ^l \bar{\psi }^{l-1} \; \mathrm{d}x \nonumber \\&\quad - \frac{\lambda ^2}{2^{2l+3}l^2} \left( {\begin{array}{c}2l\\ l\end{array}}\right) \int \sum _{\begin{array}{c} j=1,\ldots ,2l \\ j \ne l \end{array}} \frac{1}{l-j} \left( {\begin{array}{c}2l\\ j\end{array}}\right) jl \, \psi ^{2l-j}\bar{\psi }^{j-1} \psi ^{l-1} \bar{\psi }^l \; \mathrm{d}x \nonumber \\&\quad = \frac{\lambda ^2}{2^{2l+3}l^2} \left( {\begin{array}{c}2l\\ l\end{array}}\right) \left[ \left( {\begin{array}{c}2l\\ l\end{array}}\right) \, 2 \int \psi ^{3l-1} \bar{\psi }^{l-1} +\psi ^{l-1} \bar{\psi }^{3l-1} \; \mathrm{d}x\right. \nonumber \\&\qquad \left. +\, \sum _{\begin{array}{c} j=1,\ldots ,2l-1 \\ j \ne l \end{array}}2l \left( {\begin{array}{c}2l\\ j\end{array}}\right) \int \psi ^{3l-j-1} \bar{\psi }^{j+l-1} \; \mathrm{d}x \right] , \end{aligned}$$
(77)

and since \(j \ne l\) in the sum we have that

$$\begin{aligned} \left\langle \{ \chi _1, \left\langle F_1 \right\rangle \} \right\rangle&= 0. \end{aligned}$$
(78)

Furthermore,

$$\begin{aligned} F_2&= \frac{\lambda }{2^{l+3}l} \, 2l \int (\psi +\bar{\psi })^{2l-1} \, \varDelta (\psi +\bar{\psi }) \; \mathrm{d}x \nonumber \\&= \frac{\lambda }{2^{l+2}} \sum _{j=0}^{2l-1} \left( {\begin{array}{c}2l-1\\ j\end{array}}\right) \int \psi ^{2l-j-1}\bar{\psi }^j (\varDelta \psi + \varDelta \bar{\psi }) \; \mathrm{d}x, \end{aligned}$$
(79)
$$\begin{aligned} \left\langle F_2 \right\rangle&= \frac{\lambda }{2^{l+2}} \int \left( {\begin{array}{c}2l-1\\ l\end{array}}\right) \psi ^{l-1}\bar{\psi }^l \varDelta \psi + \left( {\begin{array}{c}2l-1\\ l-1\end{array}}\right) \psi ^l \bar{\psi }^{l-1} \varDelta \bar{\psi } \; \mathrm{d}x \nonumber \\&= \frac{\lambda }{2^{l+2}} \left( {\begin{array}{c}2l-1\\ l\end{array}}\right) \int |\psi |^{2(l-1)}( \bar{\psi } \varDelta \psi + \psi \varDelta \bar{\psi } ) \; \mathrm{d}x \end{aligned}$$
(80)

Hence, up to a remainder of order \(\mathscr {O}\left( \frac{1}{c^6}\right) \), we have that

$$\begin{aligned} H_2&= h_0 + \frac{1}{c^2} \int \left[ -\frac{1}{2} \left\langle \bar{\psi },\varDelta \psi \right\rangle + \frac{\lambda }{2^{l+1}l} \left( {\begin{array}{c}2l\\ l\end{array}}\right) |\psi |^{2l} \right] \; \mathrm{d}x \nonumber \\&\quad + \frac{1}{c^4} \int \left[ \lambda ^2 K(l) |\psi |^{2(2l-1)} + \frac{\lambda }{2^{l+2}} \left( {\begin{array}{c}2l-1\\ l\end{array}}\right) |\psi |^{2(l-1)}( \bar{\psi } \varDelta \psi + \psi \varDelta \bar{\psi } ) - \frac{1}{8} \left\langle \bar{\psi }, \varDelta ^2\psi \right\rangle \right] \; \mathrm{d}x, \end{aligned}$$
(81)

which, by neglecting \(h_0\) (which yields only a gauge factor) and by rescaling the time, leads to the following equation of motion

$$\begin{aligned} -i \psi _t&= - \frac{1}{2} \varDelta \psi + \frac{\lambda }{2^{l+1}} \left( {\begin{array}{c}2l\\ l\end{array}}\right) |\psi |^{2(l-1)}\psi + \frac{1}{c^2} \left[ - \frac{1}{8} \varDelta ^2\psi + \lambda ^2 K(l) \, (2l-1) |\psi |^{4(l-1)}\psi \right] \nonumber \\&\quad + \frac{1}{c^2} \left[ \frac{\lambda }{2^{l+2}} \left( {\begin{array}{c}2l-1\\ l\end{array}}\right) \left( l |\psi |^{2(l-1)} \, \varDelta \psi + (l-1) |\psi |^{2(l-2)} \psi ^2 \varDelta \bar{\psi } + \varDelta (|\psi |^{2(l-1)}\psi ) \right) \right] , \end{aligned}$$
(82)

which, for example, in the case of a cubic nonlinearity (\(l=2\)) reads

$$\begin{aligned} -i \psi _t \;&= \; - \frac{1}{2} \varDelta \psi + \frac{3}{4} \lambda |\psi |^2\psi \nonumber \\&\quad + \frac{1}{c^2} \left[ \frac{51}{8} \lambda ^2 |\psi |^4\psi + \frac{3}{16} \lambda \left( 2|\psi |^2 \, \varDelta \psi + \psi ^2 \varDelta \bar{\psi } + \varDelta (|\psi |^2\psi ) \right) - \frac{1}{8} \varDelta ^2\psi \right] . \end{aligned}$$
(83)
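The \(\lambda \)-linear coefficients in Eq. (83) can be recovered from the general formula (82) by a short exact computation. The following is only a sanity-check sketch; the function names are illustrative and not part of the paper, and the coefficient \(K(l)\) is deliberately left out.

```python
from fractions import Fraction
from math import comb


def nls_coefficient(l):
    """Coefficient of lambda*|psi|^(2(l-1))*psi in Eq. (82): binom(2l, l) / 2^(l+1)."""
    return Fraction(comb(2 * l, l), 2 ** (l + 1))


def derivative_coefficient(l):
    """Prefactor of the lambda-linear derivative terms in Eq. (82): binom(2l-1, l) / 2^(l+2)."""
    return Fraction(comb(2 * l - 1, l), 2 ** (l + 2))


if __name__ == "__main__":
    # Cubic case l = 2, to be compared with the coefficients displayed in Eq. (83).
    print(nls_coefficient(2), derivative_coefficient(2))
```

For \(l=2\) the two values are \(3/4\) and \(3/16\), in agreement with Eq. (83).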

To the author’s knowledge, Eq. (83) has never been studied before. It is the nonlinear analogue of a linear higher-order Schrödinger equation that appears in [14, 15] in the context of semirelativistic equations. Indeed, the linearization of Eq. (83) is studied within the framework of relativistic quantum field theory, as an approximation of nonlocal kinetic terms; Carles, Lucha and Moulay studied the well-posedness of these approximations, as well as the convergence of the equations as the order of truncation goes to infinity, in the linear case, also when one takes into account the effects of some time-independent potentials (e.g., bounded potentials, the harmonic oscillator potential and the Coulomb potential).

Remark 14

We point out that the case of the one-dimensional cubic defocusing NLKG is also interesting, since for \(\lambda =1\) the normalized equation at first step is the cubic defocusing NLS, which is known to be integrable by the inverse scattering method. It would be interesting to reach a better understanding of the one-dimensional normalized equation, even in the case \(r=2\).

Even though there is a one-dimensional integrable 4NLS equation related to the dynamics of a vortex filament (see [52] and references therein),

$$\begin{aligned}&i \psi _t + \psi _{xx} + \frac{1}{2} |\psi |^2\psi - \nu \left[ \psi _{xxxx} + \frac{3}{2} |\psi |^2\psi _{xx} + \frac{3}{2} \psi _x^2 \bar{\psi } + \frac{3}{8} |\psi |^4\psi + \frac{1}{2} (|\psi |^2)_{xx}\psi \right] \nonumber \\&\quad =0, \; \nu \in \mathbb {R}\end{aligned}$$
(84)

apparently there is no obvious relation between the above equation and Eq. (83).

6.2 The complex nonlinear Klein–Gordon equation

Now we consider the Hamiltonian of the complex nonlinear Klein–Gordon equation with power-type nonlinearity on a smooth manifold M (take, for example, a smooth compact manifold, or \(\mathbb {R}^d\))

$$\begin{aligned} H(w,p_w)&= \frac{c^2}{2} \left\langle p_w,p_w \right\rangle + \frac{1}{2} \left\langle w,\langle \nabla \rangle _c^2w \right\rangle \; + \; \lambda \int \frac{|w|^{2l}}{2l}, \end{aligned}$$
(85)

where \(w:\mathbb {R}\times M \rightarrow \mathbb {C}\), \(\langle \nabla \rangle _c:=(c^2-\varDelta )^{1/2}\), \(\lambda \in \mathbb {R}\), \(l \ge 2\).

If we rewrite the Hamiltonian in terms of \(u:= \mathrm{Re}(w)\) and \(v:= \mathrm{Im}(w)\), we have

$$\begin{aligned} H(u,v,p_u,p_v)&= \frac{c^2}{2} \left( \left\langle p_u,p_u \right\rangle + \left\langle p_v,p_v \right\rangle \right) + \frac{1}{2} ( |\nabla u|^2 + |\nabla v|^2 )\nonumber \\&\quad + \frac{c^2}{2} ( u^2 + v^2 ) + \lambda \int \frac{(u^2+v^2)^{l}}{2l}. \end{aligned}$$
(86)

For simplicity we consider only the cubic case (\(l=2\)), but the argument may be readily generalized to the other power-type nonlinearities.

If we introduce the variables

$$\begin{aligned} \psi&:= \frac{1}{\sqrt{2}} \left[ \left( \frac{\langle \nabla \rangle _c}{c} \right) ^{1/2} u - i \left( \frac{c}{\langle \nabla \rangle _c}\right) ^{1/2}p_u \right] , \end{aligned}$$
(87)
$$\begin{aligned} \phi&:= \frac{1}{\sqrt{2}} \left[ \left( \frac{\langle \nabla \rangle _c}{c} \right) ^{1/2} v + i \left( \frac{c}{\langle \nabla \rangle _c}\right) ^{1/2}p_v \right] , \end{aligned}$$
(88)

(the corresponding symplectic 2-form becomes \(i \mathrm{d}\psi \wedge \mathrm{d}\bar{\psi } -i \mathrm{d}\phi \wedge \mathrm{d}\bar{\phi }\)), the Hamiltonian (85) in the coordinates \((\psi ,\phi ,\bar{\psi },\bar{\phi })\) reads

$$\begin{aligned}&H(\psi ,\phi ,\bar{\psi },\bar{\phi }) = \left\langle \bar{\psi }, c\langle \nabla \rangle _c\psi \right\rangle + \left\langle \bar{\phi }, c\langle \nabla \rangle _c\phi \right\rangle \end{aligned}$$
(89)
$$\begin{aligned}&\quad +\, \frac{\lambda }{16} \int _M \left[ \left\langle \psi +\bar{\psi }, \frac{c}{\langle \nabla \rangle _c} (\psi +\bar{\psi }) \right\rangle + \left\langle \phi +\bar{\phi }, \frac{c}{\langle \nabla \rangle _c} (\phi +\bar{\phi }) \right\rangle \right] ^{2} \mathrm{d}x, \end{aligned}$$
(90)

with corresponding equations of motion

$$\begin{aligned} {\left\{ \begin{array}{ll} -i \psi _t &{}= c\langle \nabla \rangle _c\psi + \frac{1}{4} \left[ \left\langle \psi +\bar{\psi }, \frac{c}{\langle \nabla \rangle _c} (\psi +\bar{\psi }) \right\rangle + \left\langle \phi +\bar{\phi }, \frac{c}{\langle \nabla \rangle _c} (\phi +\bar{\phi }) \right\rangle \right] \frac{c}{\langle \nabla \rangle _c} (\psi +\bar{\psi }), \\ \\ i \phi _t &{}= c\langle \nabla \rangle _c\phi + \frac{1}{4} \left[ \left\langle \psi +\bar{\psi }, \frac{c}{\langle \nabla \rangle _c} (\psi +\bar{\psi }) \right\rangle + \left\langle \phi +\bar{\phi }, \frac{c}{\langle \nabla \rangle _c} (\phi +\bar{\phi }) \right\rangle \right] \frac{c}{\langle \nabla \rangle _c} (\phi +\bar{\phi }). \\ \end{array}\right. } \end{aligned}$$
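As a sanity check on the change of variables (87), one can verify mode by mode that the quadratic part of (86) becomes \(\left\langle \bar{\psi }, c\langle \nabla \rangle _c\psi \right\rangle \). The sketch below assumes a single real Fourier mode of frequency \(k\), so that \(\langle \nabla \rangle _c\) acts as multiplication by \(\sqrt{c^2+k^2}\); the function names are illustrative only.

```python
import math


def quadratic_energy_up(u, p, k, c):
    """Quadratic part of (86) for one real mode: (c^2/2) p^2 + (1/2)(c^2 + k^2) u^2."""
    return 0.5 * c**2 * p**2 + 0.5 * (c**2 + k**2) * u**2


def psi_from_up(u, p, k, c):
    """Mode-wise version of (87); the symbol of <nabla>_c is sqrt(c^2 + k^2)."""
    omega = math.sqrt(c**2 + k**2)
    return complex(math.sqrt(omega / c) * u, -math.sqrt(c / omega) * p) / math.sqrt(2)


def quadratic_energy_psi(psi, k, c):
    """Mode-wise version of <conj(psi), c <nabla>_c psi>."""
    return c * math.sqrt(c**2 + k**2) * abs(psi) ** 2


if __name__ == "__main__":
    u, p, k, c = 0.7, -1.3, 2.0, 5.0
    print(quadratic_energy_up(u, p, k, c),
          quadratic_energy_psi(psi_from_up(u, p, k, c), k, c))
```

The same computation applies to \(\phi \) in (88), since the sign of the imaginary part does not affect the modulus.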

If we rescale the time by a factor \(c^{2}\), the Hamiltonian takes the form (27), with \(\varepsilon = \frac{1}{c^2}\), and

$$\begin{aligned} H(\psi ,\phi ,\bar{\psi },\bar{\phi })&= H_0(\psi ,\phi ,\bar{\psi },\bar{\phi }) + \varepsilon \, h(\psi ,\phi ,\bar{\psi },\bar{\phi }) + \varepsilon \, F(\psi ,\phi ,\bar{\psi },\bar{\phi }), \end{aligned}$$
(91)

where

$$\begin{aligned} H_0(\psi ,\phi ,\bar{\psi },\bar{\phi })&= \left\langle \bar{\psi },\psi \right\rangle + \left\langle \bar{\phi },\phi \right\rangle , \end{aligned}$$
(92)
$$\begin{aligned} h(\psi ,\phi ,\bar{\psi },\bar{\phi })&= \left\langle \bar{\psi }, \left( c \langle \nabla \rangle _c - c^2 \right) \psi \right\rangle + \left\langle \bar{\phi }, \left( c \langle \nabla \rangle _c - c^2 \right) \phi \right\rangle \nonumber \\&\quad \sim \sum _{j\ge 1}\varepsilon ^{j-1} \; \left( \left\langle \bar{\psi },a_j\varDelta ^j\psi \right\rangle + \left\langle \bar{\phi },a_j\varDelta ^j\phi \right\rangle \right) \nonumber \\&=: \sum _{j\ge 1}\varepsilon ^{j-1} ( h_j(\psi ,\phi ,\bar{\psi },\bar{\phi }) ), \end{aligned}$$
(93)
$$\begin{aligned} F(\psi ,\phi ,\bar{\psi },\bar{\phi })&= \frac{\lambda }{16} \int _M \left[ \left\langle \psi +\bar{\psi }, \frac{c}{\langle \nabla \rangle _c} (\psi +\bar{\psi }) \right\rangle + \left\langle \phi +\bar{\phi }, \frac{c}{\langle \nabla \rangle _c} (\phi +\bar{\phi }) \right\rangle \right] ^{2} \mathrm{d}x \nonumber \\&\quad \sim \frac{\lambda }{16} \int \left[ |\psi +\bar{\psi }|^2 + |\phi +\bar{\phi }|^2 \right] ^2 \mathrm{d}x \nonumber \\&\quad + \mathscr {O}(\varepsilon ) \nonumber \\&=: \sum _{j \ge 1} \varepsilon ^{j-1} \,F_j(\psi ,\phi ,\bar{\psi },\bar{\phi }), \end{aligned}$$
(94)

where \((a_j)_{j \ge 1}\) are real coefficients, and \(F_j(\psi ,\phi ,\bar{\psi },\bar{\phi })\) is a polynomial function of the variables \(\psi \), \(\phi \), \(\bar{\psi }\), \(\bar{\phi }\) (along with their derivatives) which admits a bounded vector field from a neighborhood of the origin in \(W^{k+2(j-1),p}(\mathbb {R}^d,\mathbb {C}^2 \times \mathbb {C}^2)\) to \(W^{k,p}(\mathbb {R}^d,\mathbb {C}^2 \times \mathbb {C}^2)\) for any \(1<p<+\infty \).
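The coefficients \(a_j\) can be made explicit by Taylor-expanding \(c\langle \nabla \rangle _c - c^2 = c^2\left[ (1-\varepsilon \varDelta )^{1/2}-1\right] \) in \(\varepsilon =c^{-2}\), which gives \(a_j=(-1)^j\left( {\begin{array}{c}1/2\\ j\end{array}}\right) \). The sketch below computes them exactly and compares the truncated series with the exact symbol on a single Fourier mode; it assumes that \(\varDelta \) acts as multiplication by \(-k^2\), and the function names are illustrative.

```python
import math
from fractions import Fraction


def a(j):
    """a_j = (-1)^j * binom(1/2, j), from the Taylor series of sqrt(1 - x) - 1."""
    coeff = Fraction(1)
    for i in range(j):
        coeff *= (Fraction(1, 2) - i) / (i + 1)  # builds the generalized binomial
    return (-1) ** j * coeff


def exact_symbol(k, c):
    """Symbol of c<nabla>_c - c^2 on a mode of frequency k."""
    return c * math.sqrt(c**2 + k**2) - c**2


def truncated_symbol(k, c, order):
    """Sum over j = 1..order of eps^(j-1) * a_j * (-k^2)^j, with eps = c^(-2)."""
    eps = c ** -2
    return sum(eps ** (j - 1) * float(a(j)) * (-(k**2)) ** j for j in range(1, order + 1))


if __name__ == "__main__":
    print([a(j) for j in range(1, 4)])
    print(exact_symbol(1.0, 10.0), truncated_symbol(1.0, 10.0, 4))
```

The first two values, \(a_1=-1/2\) and \(a_2=-1/8\), match the coefficients of \(\left\langle \bar{\psi },\varDelta \psi \right\rangle \) and \(\left\langle \bar{\psi },\varDelta ^2\psi \right\rangle \) appearing in (81).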

This description clearly fits the scheme treated in Sect. 4 with \(n=2\), and one can easily check that assumptions PER, NF and HVF are satisfied. Therefore, we can apply Theorem 7 to the Hamiltonian (91).

Remark 15

About the normal forms obtained by applying Theorem 7, we remark that in the first step (case \(r=1\) in the statement of the theorem) the homological equation we get is of the form

$$\begin{aligned} \{\chi _1, h_0 \} + F_1&= \left\langle F_1 \right\rangle , \end{aligned}$$
(95)

where \(F_1(\psi ,\phi ,\bar{\psi },\bar{\phi }) = \frac{\lambda }{16} \int \left[ |\psi +\bar{\psi }|^2 + |\phi +\bar{\phi }|^2 \right] ^2 \mathrm{d}x\). Hence the transformed Hamiltonian is of the form

$$\begin{aligned} H_1(\psi ,\phi ,\bar{\psi },\bar{\phi })&= h_0(\psi ,\phi ,\bar{\psi },\bar{\phi }) + \frac{1}{c^2} \left[ -\frac{1}{2} \left( \left\langle \bar{\psi },\varDelta \psi \right\rangle + \left\langle \bar{\phi },\varDelta \phi \right\rangle \right) + \left\langle F_1 \right\rangle (\psi ,\phi ,\bar{\psi },\bar{\phi }) \right] \nonumber \\&\quad + \frac{1}{c^4} \mathscr {R}^{(1)}(\psi ,\phi ,\bar{\psi },\bar{\phi }), \end{aligned}$$
(96)

where

$$\begin{aligned} \left\langle F_1 \right\rangle&= \frac{\lambda }{16} \int \left[ 6\psi ^2 \bar{\psi }^2 + 6\phi ^2 \bar{\phi }^2 + 8 \psi \bar{\psi } \phi \bar{\phi } + 2 \psi ^2 \phi ^2 + 2 \bar{\psi }^2 \bar{\phi }^2 \right] \mathrm{d}x \\&= \frac{\lambda }{8} \int \left[ 3 (|\psi |^2+|\phi |^2)^2 + (\psi \phi - \bar{\psi } \bar{\phi })^2 \right] \mathrm{d}x. \end{aligned}$$
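The first line of the expression for \(\left\langle F_1 \right\rangle \) can be checked numerically by averaging the integrand of \(F_1\) along the flow of \(h_0\). The sketch below assumes that this flow acts pointwise as \(\psi \mapsto e^{i\theta }\psi \), \(\phi \mapsto e^{-i\theta }\phi \) (our reading of the symplectic form stated after (88)), and it works at a single point \(x\), omitting the factor \(\lambda /16\). Since the integrand is a trigonometric polynomial of degree 4 in \(\theta \), an equispaced average over 16 angles is exact up to rounding. All names are illustrative.

```python
import cmath
import math


def f1_density(psi, phi):
    """Integrand of F_1 without lambda/16: [|psi + conj(psi)|^2 + |phi + conj(phi)|^2]^2."""
    return ((2 * psi.real) ** 2 + (2 * phi.real) ** 2) ** 2


def averaged_density(psi, phi, n=16):
    """Average of f1_density over psi -> e^{i theta} psi, phi -> e^{-i theta} phi."""
    total = 0.0
    for k in range(n):
        theta = 2 * math.pi * k / n
        total += f1_density(cmath.exp(1j * theta) * psi, cmath.exp(-1j * theta) * phi)
    return total / n


def first_line_density(psi, phi):
    """Polynomial in the first line of <F_1>, without lambda/16."""
    pb, fb = psi.conjugate(), phi.conjugate()
    value = (6 * psi**2 * pb**2 + 6 * phi**2 * fb**2 + 8 * psi * pb * phi * fb
             + 2 * psi**2 * phi**2 + 2 * pb**2 * fb**2)
    return value.real  # the imaginary part vanishes identically


if __name__ == "__main__":
    psi, phi = 0.3 + 0.7j, -1.1 + 0.4j
    print(averaged_density(psi, phi), first_line_density(psi, phi))
```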

If we neglect the remainder and we derive the corresponding equations of motion for the system, we get

$$\begin{aligned} {\left\{ \begin{array}{ll} -i \psi _t &{}= \psi + \frac{1}{c^2} \left\{ -\frac{1}{2} \varDelta \psi + \frac{\lambda }{4} \left[ 3(|\psi |^2+|\phi |^2)\psi - (\psi \phi -\bar{\psi } \bar{\phi })\bar{\phi } \right] \right\} , \\ \\ i \phi _t &{}= \phi + \frac{1}{c^2} \left\{ -\frac{1}{2} \varDelta \phi + \frac{\lambda }{4} \left[ 3(|\psi |^2+|\phi |^2)\phi - (\psi \phi -\bar{\psi } \bar{\phi })\bar{\psi } \right] \right\} , \\ \end{array}\right. } \end{aligned}$$
(97)

which is a system of two coupled NLS equations.

7 Dynamics

Now we want to exploit the result of the previous section in order to deduce some consequences about the dynamics of the NLKG equation (3) in the nonrelativistic limit. Consider the simplified system, that is, the Hamiltonian \(H_r\) in the notation of Theorem 7, where we neglect the remainder:

$$\begin{aligned} H_{simp}&:= h_0+\varepsilon (h_1+ \left\langle F_1 \right\rangle )+ \sum _{j=2}^{r} \varepsilon ^j(h_j+Z_j). \end{aligned}$$

We recall that in the case of the NLKG the simplified system is actually the NLS (given by \(h_0+\varepsilon (h_1+ \left\langle F_1 \right\rangle )\)), plus higher-order normalized corrections. Now let \(\psi _r\) be a solution of

$$\begin{aligned} -i \,\dot{\psi }_r \,&= \, X_{H_{simp}}(\psi _r), \end{aligned}$$
(98)

then \(\psi _a(t,x):=\mathscr {T}^{(r)}(\psi _r(c^2t,x))\) solves

$$\begin{aligned} \dot{\psi }_a&= i c\langle \nabla \rangle _c \psi _a + \frac{\lambda }{2l} \left( \frac{c}{\langle \nabla \rangle _c} \right) ^{1/2} \, \left[ \left( \frac{c}{\langle \nabla \rangle _c} \right) ^{1/2} \frac{\psi _a+\bar{\psi }_a}{\sqrt{2}} \right] ^{2l-1} - \frac{1}{c^{2r}} X_{ \mathscr {T}^{(r)*}\mathscr {R}^{(r)} }(\psi _a,\bar{\psi }_a), \end{aligned}$$
(99)

that is, the NLKG plus a remainder of order \(c^{-2r}\) (in the following we will refer to Eq. (99) as the approximate equation, and to \(\psi _a\) as the approximate solution of the original NLKG). We point out that the original NLKG and the approximate equation differ only by a remainder of order \(c^{-2r}\), which is evaluated on the approximate solution. This fact is extremely important: indeed, if one can prove the smoothness of the approximate solution (which is often easier to check than the smoothness of the solution of the original equation), then the contribution of the remainder may be considered small in the nonrelativistic limit. This property is rather general and has already been applied in the framework of normal form theory (see, for example, [4]).

Now let \(\psi \) be a solution of the NLKG equation (3) with the initial datum \(\psi _0\), and let \(\delta :=\psi -\psi _a\) be the error between the solution of the approximate equation and the original one. One can check that \(\delta \) fulfills

$$\begin{aligned} \dot{\delta }&= i c \langle \nabla \rangle _c \delta + [ P(\psi _a+\delta ,\bar{\psi }_a+\bar{\delta })-P(\psi _a,\bar{\psi }_a) ]+ \frac{1}{c^{2r}} X_{ \mathscr {T}^{(r)*}\mathscr {R}^{(r)} }(\psi _a(t),\bar{\psi }_a(t) ), \end{aligned}$$

where

$$\begin{aligned} P(\psi ,\bar{\psi })&= \frac{\lambda }{2l} \left( \frac{c}{\langle \nabla \rangle _c} \right) ^{1/2} \left[ \left( \frac{c}{\langle \nabla \rangle _c} \right) ^{1/2} \frac{\psi +\bar{\psi }}{\sqrt{2}} \right] ^{2l-1}. \end{aligned}$$
(100)

Thus we get

$$\begin{aligned} \dot{\delta }&= i \,c \langle \nabla \rangle _c\delta + dP(\psi _a(t))\delta + \mathscr {O}(\delta ^2) + \mathscr {O}\left( \frac{1}{ c^{2r} }\right) ; \nonumber \\ \delta (t)&= e^{itc\langle \nabla \rangle _c}\delta _0 + \int _0^{t}e^{i(t-s)c\langle \nabla \rangle _c}dP(\psi _a(s))\delta (s)\mathrm{d}s + \mathscr {O}(\delta ^2)+\mathscr {O}\left( \frac{1}{c^{2r}}\right) . \end{aligned}$$
(101)

By applying the Gronwall inequality to (101) we obtain

Proposition 3

Fix \(r \ge 1\), \(R>0\), \(k_1 \gg 1\), \(1<p< +\infty \). Then there exists \(k_0=k_0(r)>0\) with the following properties: for any \(k \ge k_1\) there exists \(c_{l,r,k,p,R} \gg 1\) such that for any \(c>c_{l,r,k,p,R}\), if we assume that

$$\begin{aligned} \Vert \psi _0\Vert _{k+k_0,p}&\le R \end{aligned}$$

and that there exists \(T=T_{r,k,p}>0\) such that the solution of (98) satisfies

$$\begin{aligned} \Vert \psi _r(t)\Vert _{k+k_0,p}&\le 2R, \quad \text {for} \quad 0 \le t \le T, \end{aligned}$$

then

$$\begin{aligned} \Vert \delta (t)\Vert _{k,p}&\le C_{k,p} \, c^{-2r},\quad \text {for} \quad 0\le t \le T. \end{aligned}$$
(102)
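The Gronwall step leading from (101) to (102) can be sketched as follows. This is only an outline under simplifying assumptions that are not stated in this form in the text: we take \(p=2\), so that \(e^{itc\langle \nabla \rangle _c}\) is unitary on \(H^k\), and we assume that on \([0,T]\) the terms \(dP(\psi _a)\delta + \mathscr {O}(\delta ^2)\) are bounded in \(H^k\) by \(L\Vert \delta \Vert _{k,2}\) and the remainder by \(C_R\,c^{-2r}\), where \(L\) and \(C_R\) are hypothetical constants depending on \(R\), \(k\) and \(r\) but not on \(c\).

$$\begin{aligned} \Vert \delta (t)\Vert _{k,2}&\le \Vert \delta _0\Vert _{k,2} + C_R \, t \, c^{-2r} + L \int _0^t \Vert \delta (s)\Vert _{k,2} \, \mathrm{d}s \\ \Longrightarrow \quad \Vert \delta (t)\Vert _{k,2}&\le \left( \Vert \delta _0\Vert _{k,2} + C_R \, T \, c^{-2r} \right) e^{Lt}, \quad 0 \le t \le T. \end{aligned}$$

Under these assumptions, if \(\Vert \delta _0\Vert _{k,2} \lesssim c^{-2r}\), the right-hand side is bounded by a constant multiple of \(c^{-2r}\), with a constant that depends on \(T\) and \(L\) but not on \(c\).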

Remark 16

If we restrict to \(p=2\), and to \(M=\mathbb {T}^d\), the above result is actually a reformulation of Theorem 3.2 in [23]. We also remark that the time interval [0, T] in which estimate (102) is valid is independent of c.

Remark 17

By exploiting estimate (32) about the canonical transformation, Proposition 3 leads immediately to a proof of Theorem 2.

In order to study the evolution of the error between the approximate solution and the solution of the NLKG over longer (namely, c-dependent) timescales, we observe that the error is described by

$$\begin{aligned} \dot{\delta }(t)&= i \, c\langle \nabla \rangle _c\delta (t) + dP(\psi _a(t))\delta (t); \end{aligned}$$
(103)
$$\begin{aligned} \delta (t)&= e^{itc\langle \nabla \rangle _c}\delta _0+\int _0^{t}e^{i(t-s)c\langle \nabla \rangle _c} dP(\psi _a(s))\delta (s)\mathrm{d}s, \end{aligned}$$
(104)

up to a remainder which is small, if we assume the smoothness of \(\psi _a\).

Equation (103) in the context of dispersive PDEs is known as the semirelativistic spinless Salpeter equation with a time-dependent potential. This system was introduced as a first order in time analogue of the KG equation for the Lorentz covariant description of bound states within the framework of relativistic quantum field theory, and, despite the nonlocality of its Hamiltonian, some of its properties have already been studied. (See [55] for a study from a physical point of view; for a more mathematical approach, see [33] and the more recent works [14, 15], which are closer to the spirit of our approximation.)

It seems reasonable to estimate the solution of Eq. (103) by studying and by exploiting its dispersive properties, and this will be the aim of the following sections. From now on we will consider only the case \(M=\mathbb {R}^d\) for \(d \ge 2\).

8 Properties of the normal form equation

8.1 Linear case

Now let \(r \ge 1\), \(d \ge 2\). In [14, 15] the authors proved that the linearized normal form system, namely the one that corresponds (up to a rescaling of time by a factor \(c^2\)) to

$$\begin{aligned} -i \dot{\psi _r}&= X_{h_0 + \sum _{j=1}^r \varepsilon ^j h_j}(\psi _r), \nonumber \\ \psi _r(0)&= \psi _0, \end{aligned}$$
(105)

admits a unique solution in \(L^\infty (\mathbb {R})H^{k+k_0}(\mathbb {R}^d)\) (this is a simple application of the properties of the Fourier transform), and by a perturbative argument they also proved global existence for the higher-order Schrödinger equation with a bounded time-independent potential.

Moreover, by following the arguments of Theorem 4.1 in [31] and Lemma 4.3 in [14] one obtains the following dispersive estimates and local-in-time Strichartz estimates for solutions of the linearized normal form equation (105).

Fig. 1 Set of admissible exponents \(\varDelta _r\) for different values of r: (a) \(r=1\) (this is the Schrödinger case); (b) \(r=2\); (c) \(r=11\)

Proposition 4

(Fig. 1) Let \(r \ge 1\) and \(d \ge 2\), and denote by \(\mathscr {U}_r(t)\) the evolution operator of (105) at the time \(c^2t\) (\(c \ge 1\), \(t>0\)). Then one has the following local-in-time dispersive estimate

$$\begin{aligned} \Vert \mathscr {U}_r(t) \Vert _{L^1(\mathbb {R}^d) \rightarrow L^\infty (\mathbb {R}^d)}&\lesssim c^{d \left( 1-\frac{1}{r} \right) } |t|^{-d/(2r)}, \quad 0<|t| \lesssim c^{2(r-1)}. \end{aligned}$$
(106)

On the other hand, \(\mathscr {U}_r(t)\) is unitary on \(L^2(\mathbb {R}^d)\).

Now introduce the following set of admissible exponent pairs:

$$\begin{aligned} \varDelta _r&:= \left\{ (p,q): (1/p,1/q) \; \text {lies in the closed quadrilateral ABCD}\right\} , \end{aligned}$$
(107)

where

$$\begin{aligned} A= & {} \left( \frac{1}{2},\frac{1}{2}\right) , \quad B=\left( 1,\frac{1}{\tau _r}\right) , \quad C=(1,0), \quad D=\left( \frac{1}{\tau _r{^\prime }},0\right) , \quad \\ \tau _r= & {} \frac{2r-1}{r-1}, \quad \frac{1}{\tau _r} + \frac{1}{\tau _r{^\prime }} = 1. \end{aligned}$$

Then for any \((p,q)\in \varDelta _r \setminus \{(2,2),(1,\tau _r),(\tau _r{^\prime },\infty )\}\)

$$\begin{aligned} \Vert \mathscr {U}_r(t) \Vert _{L^p(\mathbb {R}^d) \rightarrow L^q(\mathbb {R}^d)}&\lesssim c^{d \left( 1-\frac{1}{r} \right) \left( \frac{1}{p}-\frac{1}{q} \right) } |t|^{-\frac{d}{2r} \left( \frac{1}{p}-\frac{1}{q} \right) }, \quad 0<|t| \lesssim c^{2(r-1)}. \end{aligned}$$
(108)

Let \(r \ge 1\) and \(d \ge 2\). In the following proposition, \((p,q)\) is called an order-r admissible pair when \(2 \le p,q \le +\infty \) for \(r \ge 2\) (\(2 \le q \le 2d/(d-2)\) for \(r=1\)), and

$$\begin{aligned} \frac{2}{p} + \frac{d}{rq}&= \frac{d}{2r}. \end{aligned}$$
(109)

Proposition 5

Let \(r \ge 1\) and \(d \ge 2\), and denote by \(\mathscr {U}_r(t)\) the evolution operator of (105) at the time \(c^2t\) (\(c \ge 1\), \(t>0\)). Let \((p,q)\) and \((a,b)\) be order-r admissible pairs; then for any \(T \lesssim c^{2(r-1)}\)

$$\begin{aligned}&\Vert \mathscr {U}_r(t)\phi _0 \Vert _{L^p([0,T])L^q(\mathbb {R}^d)} \lesssim c^{ d \left( 1-\frac{1}{r} \right) \left( \frac{1}{2} -\frac{1}{q} \right) } \Vert \phi _0\Vert _{L^2(\mathbb {R}^d)} = c^{ \left( 1-\frac{1}{r}\right) \frac{2r}{p} }\Vert \phi _0\Vert _{L^2(\mathbb {R}^d)}, \end{aligned}$$
(110)
$$\begin{aligned}&\quad \left\| \int _0^t \mathscr {U}_r(t-\tau )\phi (\tau ) \mathrm{d}\tau \right\| _{L^p([0,T])L^q(\mathbb {R}^d)} \lesssim c^{\left( 1-\frac{1}{r}\right) 2r\left( \frac{1}{p}+\frac{1}{a}\right) } \Vert \phi \Vert _{L^{a{^\prime }}([0,T])L^{b{^\prime }}(\mathbb {R}^d)}. \end{aligned}$$
(111)
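The equality between the two powers of \(c\) in (110) follows directly from the admissibility condition (109). The sketch below solves (109) for \(1/p\) in exact arithmetic and compares the two exponents; the function names are illustrative, and passing `q=None` is our own convention for \(q=+\infty \).

```python
from fractions import Fraction


def inverse_p(d, r, q):
    """Solve (109), 2/p + d/(r q) = d/(2 r), for 1/p; q=None stands for q = +infinity."""
    inv_q = Fraction(0) if q is None else 1 / Fraction(q)
    return Fraction(d, 2 * r) * (Fraction(1, 2) - inv_q)


def c_exponent_from_q(d, r, q):
    """First exponent of c in (110): d (1 - 1/r) (1/2 - 1/q)."""
    inv_q = Fraction(0) if q is None else 1 / Fraction(q)
    return d * (1 - Fraction(1, r)) * (Fraction(1, 2) - inv_q)


def c_exponent_from_p(d, r, q):
    """Second exponent of c in (110): (1 - 1/r) * 2r / p, with p given by (109)."""
    return (1 - Fraction(1, r)) * 2 * r * inverse_p(d, r, q)


if __name__ == "__main__":
    d, r, q = 3, 2, 6
    print(inverse_p(d, r, q), c_exponent_from_q(d, r, q), c_exponent_from_p(d, r, q))
```

For \(d=3\), \(r=2\), \(q=6\) one finds \(p=4\) and a power \(c^{1/2}\) in (110). For \(r=1\) both exponents vanish, as expected in the Schrödinger case. The helper only solves (109); whether a given pair is admissible still requires checking \(2 \le p \le +\infty \), that is, \(0 \le 1/p \le 1/2\).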

8.2 Well-posedness of higher-order nonlinear Schrödinger equations with small data

Here we discuss the local well-posedness of

$$\begin{aligned} -i \psi _t&= A_{c,r} \psi + P( (\partial ^\alpha _x\psi )_{|\alpha | \le 2(r-1)},(\partial ^\alpha _x\bar{\psi })_{|\alpha | \le 2(r-1)} ), \quad t \in I, \; x \in \mathbb {R}^d, \end{aligned}$$
(112)
$$\begin{aligned} \psi (0,x)&= \psi _0(x), \end{aligned}$$
(113)

where \(r \ge 2\), \(I:=[0,T]\), \(T>0\),

$$\begin{aligned} A_{c,r}&= c^2-\sum _{j=1}^r \frac{\varDelta ^j}{c^{2(j-1)}}, \quad c \ge 1, \end{aligned}$$

and P is an analytic function at the origin of the form

$$\begin{aligned} P(z)&= \sum _{m+1 \le |\beta | < M} a_\beta z^\beta , \quad |a_\beta | \le K^{|\beta |}, \; |z| \ll 1, \end{aligned}$$
(114)

where \(M > m \ge 2\), \(m,M \in \mathbb {N}\).

We will exploit this result during the proof of Theorem 4. We will adapt an argument of [50] in order to show the local well-posedness of Eq. (112) for data with small norm in the so-called modulation spaces.

Modulation spaces \(M^s_{p,q}\) (\(s \in \mathbb {R}\), \(0< p,q < +\infty \)) were introduced by Feichtinger, and they can be seen as a variant of Besov spaces, in the sense that they allow one to perform a frequency decomposition of operators, and to study their properties with respect to lower and higher frequencies. These spaces were recently used in order to prove global well-posedness and scattering for small data for nonlinear dispersive PDEs, especially in the case of derivative nonlinearities (see, for example, [50, 58, 59]). We refer to [49] for a survey about modulation spaces and nonlinear evolution equations.

We define the norm on modulation spaces via the following decomposition: Let \(\sigma :\mathbb {R}^d \rightarrow \mathbb {R}\) be a function such that

$$\begin{aligned} \mathrm{supp}(\sigma )&\subset [-3/4,3/4]^d, \end{aligned}$$

and consider a function sequence \((\sigma _k)_{k \in \mathbb {Z}^d}\) satisfying

$$\begin{aligned} \sigma _k(\cdot )&= \sigma (\cdot -k), \end{aligned}$$
(115)
$$\begin{aligned} \sum _{k \in \mathbb {Z}^d} \sigma _k(\xi )&= 1, \quad \forall \xi \in \mathbb {R}^d. \end{aligned}$$
(116)

Denote by

$$\begin{aligned} \mathscr {Y}_d := \{ (\sigma _k)_{k \in \mathbb {Z}^d}: (\sigma _k)_{k \in \mathbb {Z}^d} \text {satisfies}\,\,\, (115){-}(116)\}. \end{aligned}$$

Let \((\sigma _k)_{k \in \mathbb {Z}^d} \in \mathscr {Y}_d\), and define the frequency-uniform decomposition operators

$$\begin{aligned} \square _k := \mathscr {F}^{-1}\sigma _k\mathscr {F}, \end{aligned}$$
(117)

where \(\mathscr {F}\) denotes the Fourier transform on \(\mathbb {R}^d\). We then define the modulation spaces \(M^s_{p,q}(\mathbb {R}^d)\) via the following norm,

$$\begin{aligned} \Vert f\Vert _{M^s_{p,q}(\mathbb {R}^d)} := \left( \sum _{k \in \mathbb {Z}^d} \left\langle k \right\rangle ^{sq} \Vert \square _k f\Vert _p^q \right) ^{1/q}, \quad s \in \mathbb {R}, 0< p,q < +\infty . \end{aligned}$$
(118)

Actually, in our application we will always be interested in the spaces \(M^s_{p,1}(\mathbb {R}^d)\) with \(s \in \mathbb {R}\) and \(p>1\). We just mention some properties of modulation spaces.

Proposition 6

Let \(s,s_1,s_2 \in \mathbb {R}\) and \(1< p,p_1,p_2 < +\infty \).

  1. \(M^s_{p,1}(\mathbb {R}^d)\) is a Banach space;

  2. \(\mathscr {S}(\mathbb {R}^d) \subset M^s_{p,1}(\mathbb {R}^d) \subset \mathscr {S}{^\prime }(\mathbb {R}^d)\);

  3. \(\mathscr {S}(\mathbb {R}^d)\) is dense in \(M^s_{p,1}(\mathbb {R}^d)\);

  4. if \(s_2 \le s_1\) and \(p_1 \le p_2\), then \(M^{s_1}_{p_1,1} \subseteq M^{s_2}_{p_2,1}\);

  5. \(M^0_{p,1}(\mathbb {R}^d) \subseteq L^\infty (\mathbb {R}^d) \cap L^p(\mathbb {R}^d)\);

  6. let \(\tau (p) = \max \left( 0, d(1-1/p), d/p \right) \) and \(s_1 > s_2 + \tau (p)\), then \(W^{s_1,p}(\mathbb {R}^d) \subset M^{s_2}_{p,1}(\mathbb {R}^d)\);

  7. let \(s_1 \ge s_2\), then \(M^{s_1}_{p,1}(\mathbb {R}^d) \subset W^{s_2,p}(\mathbb {R}^d)\).

The last two properties are not trivial and have been proved in [32].

We also introduce other spaces which are often used in this context: the anisotropic Lebesgue space \(L^{p_1,p_2}_{x_i;(x_j)_{j\ne i},t}\),

$$\begin{aligned} \Vert f\Vert _{ L^{p_1,p_2}_{x_i;(x_j)_{j\ne i},t} }&:= \left\| \Vert f\Vert _{L^{p_2}_{x_1,\ldots ,x_{i-1},x_{i+1},\ldots ,x_d,t}(\mathbb {R}^{d-1} \times I)} \right\| _{L^{p_1}_{x_i}(\mathbb {R})}, \end{aligned}$$

and, for any Banach space X, the spaces \(l^{1,s}_\square (X)\) and \(l^{1,s}_{\square ,i,c}(X)\),

$$\begin{aligned} \Vert f\Vert _{l^{1,s}_\square (X)}&:= \sum _{k \in \mathbb {Z}^d} \left\langle k \right\rangle ^s \Vert \square _k f\Vert _X, \end{aligned}$$
(119)
$$\begin{aligned} \Vert f\Vert _{l^{1,s}_{\square ,i,c}(X)}&:= \sum _{k \in \mathbb {Z}^d_i} \left\langle k \right\rangle ^s \Vert \square _k f\Vert _X, \quad \mathbb {Z}^d_i:= \left\{ k \in \mathbb {Z}^d: |k_i|=\max _{1 \le j \le d}|k_j|, |k_i|>c\right\} . \end{aligned}$$
(120)

For simplicity, we write \(l^1_\square (X)=l^{1,0}_\square (X)\) and \(M^s_{p,1}=M^s_{p,1}(\mathbb {R}^d)\).

Proposition 7

Let \(d \ge 2\), \(m\ge 2\), \(m > 4r/d\) and \(s > 2(r-1)+1/m\).

  1. (i)

    There exist \(c_0>1\) and \(\delta _0=\delta _0(d,m,r)>0\) such that for any \(c \ge c_0\), for any \(\delta >\delta _0\) and for any \(\psi _0 \in M^s_{2,1}\) with \(\Vert \psi _0\Vert _{M^s_{2,1}} \le c^{-\delta }\) Eq. (112) admits a unique solution \(\psi \in C(I,M^s_{2,1}) \cap D\), where \(T=T( \Vert \psi _0\Vert _{M^s_{2,1}} ) = \mathscr {O}( c^{2(r-1)} )\), and

    $$\begin{aligned} \Vert \psi \Vert _D&= \sum _{\alpha =0}^{2(r-1)} \sum _{i,l=1}^d \Vert \partial _{x_l}^\alpha \psi \Vert _{ l^{1,s-r+1/2}_{\square ,i,c}(L^{\infty ,2}_{x_i;(x_j)_{j\ne i},t}) \cap l^{1,s}_\square (L^{m,\infty }_{x_i;(x_j)_{j\ne i},t}) \cap l^{1,s+1/m}_\square (L^\infty _t L^2_x \cap L^{2+m}_{t,x}) } \; \lesssim \; c^{-\delta }. \end{aligned}$$
    (121)
  2. (ii)

    Moreover, if \(s \ge s_0(d):=d+2+\frac{1}{2}\), then there exists \(\delta _1=\delta _1(d,m,r)>0\) such that for any \(c \ge c_0\), for any \(\delta >\delta _1\) and for any \(\psi _0 \in M^s_{2,1}\) with \(\Vert \psi _0\Vert _{M^s_{2,1}} \le c^{-\delta }\) Eq. (112) admits a unique solution \(\psi \in C(I,H^s)\), where \(T=T( \Vert \psi _0\Vert _{M^s_{2,1}} ) = \mathscr {O}( c^{2(r-1)} )\), and

    $$\begin{aligned} \Vert \psi (t)\Vert _{H^{s}}&\lesssim c^{-\delta }, \quad |t| \lesssim c^{2(r-1)}. \end{aligned}$$
    (122)

From the above proposition and from the embedding \(H^{s+\sigma +d/2} \subset M^s_{2,1}\) for any \(\sigma >0\) we can deduce

Corollary 1

Let \(d \ge 2\), \(l\ge 2\), \(r<\frac{d}{2}(l-1)\) and \(s>2(r-1)+\frac{1}{2(l-1)}\). Then there exist \(c_0>1\), \(\delta _0=\delta _0(d,l,r)>0\) and \(\delta _1=\delta _1(d,l,r)>0\) such that for any \(c \ge c_0\), for any \(\delta >\max (\delta _0,\delta _1)\), for any \(\sigma >0\) and for any \(\psi _0 \in H^{s+\sigma +d/2}\) with \(\Vert \psi _0\Vert _{H^{s+\sigma +d/2}} \le c^{-\delta }\) the normal form equation for (56) admits a unique solution \(\psi \in C([0,T],H^{s+\sigma +d/2}) \cap D\), where \(T=T( \Vert \psi _0\Vert _{H^{s+\sigma +d/2}} ) = \mathscr {O}( c^{2(r-1)} )\), and (121) holds. Furthermore, we have that \(\psi \in L^\infty (I)H^{s+\sigma +d/2}(\mathbb {R}^d)\), and

$$\begin{aligned} \Vert \psi (t)\Vert _{H^{s+\sigma +d/2}}&\lesssim c^{-\delta }, \quad |t| \lesssim c^{2(r-1)}. \end{aligned}$$
(123)

Since the nonlinearity in Eq. (112) involves derivatives, this could cause a loss of derivatives as long as we rely only on energy estimates, on dispersive estimates or on Strichartz estimates. In order to overcome such a problem, we study the time decay of the operator \(\mathscr {U}_r(t):=e^{itA_{c,r}}\), its local smoothing property, Strichartz estimates with \(\square _k\)-decomposition and maximal function estimates in the framework of frequency-uniform localization.

The rest of this subsection is devoted to the proof of Proposition 7. For convenience, we will always use the following function sequence \((\sigma _k)_{k \in \mathbb {Z}^d}\) to define modulation spaces.

Lemma 5

Let \((\eta _k)_{k \in \mathbb {Z}} \in \mathscr {Y}_1\), and assume that \(\text {supp}(\eta _k) \subset [k-2/3,k+2/3]\). Consider

$$\begin{aligned} \sigma _k(\xi ) := \eta _{k_1}(\xi _1) \ldots \eta _{k_d}(\xi _d), \quad k = (k_1,\ldots ,k_d) \in \mathbb {Z}^d, \end{aligned}$$
(124)

then \((\sigma _k)_{k \in \mathbb {Z}^d} \in \mathscr {Y}_d\).
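Conditions (115)-(116), the tensor-product construction (124) and the reproducing property (126) can be illustrated in exact arithmetic. The sketch below uses a piecewise-linear window \(\eta \) supported in \([-2/3,2/3]\), chosen only because it makes the sums exactly computable; it is not smooth, so it should be read as an illustration of the algebraic identities rather than as the window used in the paper. All function names are illustrative.

```python
from fractions import Fraction
from itertools import product


def eta(x):
    """Trapezoidal window: 1 on [-1/3, 1/3], linear down to 0 at |x| = 2/3."""
    ax = abs(x)
    if ax <= Fraction(1, 3):
        return Fraction(1)
    if ax >= Fraction(2, 3):
        return Fraction(0)
    return 2 - 3 * ax


def sigma(k, xi):
    """Tensor-product window (124): sigma_k(xi) = prod_i eta(xi_i - k_i)."""
    value = Fraction(1)
    for k_i, xi_i in zip(k, xi):
        value *= eta(xi_i - k_i)
    return value


def partition_sum(xi, radius=2):
    """Sum of sigma_k(xi) over the integer points k near xi (the other windows vanish)."""
    centre = [round(x) for x in xi]
    total = Fraction(0)
    for offset in product(range(-radius, radius + 1), repeat=len(xi)):
        k = [c + o for c, o in zip(centre, offset)]
        total += sigma(k, xi)
    return total


def enlarged_sigma(k, xi):
    """Sum of sigma_{k+l}(xi) over |l|_infinity <= 1, as in (125)."""
    return sum(sigma([a + b for a, b in zip(k, l)], xi)
               for l in product((-1, 0, 1), repeat=len(k)))


if __name__ == "__main__":
    xi = (Fraction(1, 2), Fraction(-7, 5))
    print(partition_sum(xi), sigma((0, -1), xi), enlarged_sigma((0, -1), xi))
```

At the sample point \(\xi =(1/2,-7/5)\) the windows sum to 1, and the enlarged window equals 1 on the support of \(\sigma _k\) for \(k=(0,-1)\), which is the content of (126).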

For convenience, we also write

$$\begin{aligned} \tilde{\sigma }\sigma _k = \sum _{\Vert l\Vert _\infty \le 1} \sigma _{k+l}, \;&\; \tilde{\sigma }\square _k = \sum _{\Vert l\Vert _\infty \le 1} \square _{k+l}, \quad k \in \mathbb {Z}^d, \end{aligned}$$
(125)

and one can check that

$$\begin{aligned} \tilde{\sigma }\sigma _k \sigma _k = \sigma _k, \;&\; \tilde{\sigma }\square _k \circ \square _k = \square _k, \; k \in \mathbb {Z}^d. \end{aligned}$$
(126)
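To see why (126) holds, it may help to look at one concrete realisation in dimension one. The sketch below builds a hypothetical sequence \(\eta_k\) with \(\text{supp}(\eta_k) \subset [k-2/3,k+2/3]\) and \(\sum_k \eta_k \equiv 1\), and checks numerically that \(\sum_{|l| \le 1} \eta_{k+l}\) equals 1 on the support of \(\eta_k\). The bump profile and the helper names `bump`, `eta`, `eta_tilde` are illustrative assumptions, not the paper's choice, and the sketch assumes that the partition-of-unity property is part of the definition of \(\mathscr{Y}_1\).

```python
import math

def bump(x, radius=2.0 / 3.0):
    """Smooth bump supported in (-radius, radius); a hypothetical profile."""
    u = x / radius
    if abs(u) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - u * u))

def eta(k, x):
    """eta_k(x): bump centred at the integer k, normalised so that sum_k eta_k(x) = 1."""
    m0 = round(x)
    # Only integers within distance 2/3 of x contribute; m0-2..m0+2 covers them all.
    total = sum(bump(x - m) for m in range(m0 - 2, m0 + 3))
    return bump(x - k) / total

def eta_tilde(k, x):
    """One-dimensional analogue of the enlarged cutoff in (125): sum over |l| <= 1."""
    return sum(eta(k + l, x) for l in (-1, 0, 1))

if __name__ == "__main__":
    k = 3
    worst = max(abs(eta_tilde(k, i / 100.0) * eta(k, i / 100.0) - eta(k, i / 100.0))
                for i in range(0, 601))
    print("largest deviation from (126) on the grid:", worst)
```

In dimension \(d\) the same argument applies coordinate by coordinate, since \(\sum_{\Vert l\Vert_\infty \le 1} \sigma_{k+l}\) factorises as the product of the one-dimensional sums.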

We also write \(\mathscr {A}_rf(t,x) := \int _0^t \mathscr {U}_r(t-\tau )f(\tau ,x)\mathrm{d}\tau \).

8.2.1 Time decay

The time decay of the operator \(\mathscr {U}_r(t)\) is known (see (106)); here we are interested in its frequency-localized version, and we want to treat low, medium and high frequencies separately. For simplicity we discuss the case \(r=2\), and we defer to the end of this section a remark about the case \(r>2\). So, consider

$$\begin{aligned} \mathscr {U}_2(t)&= e^{itA_{c,2}} = e^{ic^2t} \; \mathscr {F}^{-1} e^{it\left( |\xi |^2-\frac{|\xi |^4}{c^2}\right) } \mathscr {F}, \end{aligned}$$

and write \(\varepsilon =c^{-2}\). It is known that the time decay of \(\mathscr {U}_2(t)\) is determined by the critical points of \(P_2(|\xi |)= |\xi |^2-\varepsilon |\xi |^4\). Since \(P{^\prime }_2(R)=-4R(\varepsilon ^{1/2}R+\frac{1}{\sqrt{2}})(\varepsilon ^{1/2}R-\frac{1}{\sqrt{2}})\), the singular points of \(P_2\) are \(\xi =0\) and the points of the sphere \(|\xi |=(2\varepsilon )^{-1/2}\). To handle these points, we exploit the Littlewood–Paley decomposition, the van der Corput lemma and some properties of the Fourier transform of radial functions.
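As a quick check of the critical radii, a direct computation with \(R=|\xi |\) gives

$$\begin{aligned} P{^\prime }_2(R)&= 2R-4\varepsilon R^3 = 2R\left( 1-2\varepsilon R^2\right) , \\ P_2^{\prime \prime }(R)&= 2-12\varepsilon R^2 . \end{aligned}$$

Hence, for \(R \ge 0\), \(P{^\prime }_2\) vanishes only at \(R=0\) and \(R=(2\varepsilon )^{-1/2}=c/\sqrt{2}\), while \(P_2^{\prime \prime }\) vanishes only at \(R=(6\varepsilon )^{-1/2}=c/\sqrt{6}\). Moreover \(P_2^{\prime \prime }\left( (2\varepsilon )^{-1/2}\right) =-4 \ne 0\), so the critical point on the sphere is nondegenerate in the radial variable.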

Indeed, it is known that the Fourier transform of a radial function f is radial,

$$\begin{aligned} \mathscr {F}f(\xi )&= 2\pi \int _0^\infty f(R) R^{d-1}(R|\xi |)^{-(d-2)/2} J_{\frac{d-2}{2}}(R|\xi |) dR, \end{aligned}$$

where \(J_m\) is the order m Bessel function,

$$\begin{aligned} J_m(R)&= \frac{(R/2)^m}{\varGamma (m+1/2)\pi ^{1/2}} \int _{-1}^1 e^{iRt} (1-t^2)^{m-1/2} dt, \quad m>-1/2. \end{aligned}$$

By following the computations in [50] we obtain that

$$\begin{aligned} \mathscr {F}f(s)&= K_d \pi \int _0^\infty f(R) R^{d-1} e^{-iRs} \bar{h}(Rs) \mathrm{d}R\nonumber \\&\quad + K_d \pi \int _0^\infty f(R) R^{d-1} e^{iRs} h(Rs) \mathrm{d}R, \quad K_d>0, \end{aligned}$$
(127)
$$\begin{aligned} |h^{(k)}(R)|&\le K_d (1+R)^{- \frac{d-1}{2}-k}, \quad \forall k \ge 0. \end{aligned}$$
(128)

Now we make a Littlewood–Paley decomposition of the frequencies: Choose \(\rho \) a smooth cutoff function equal to 1 in the unit ball and equal to 0 outside the ball of radius 2, write \(\phi _0=\rho (\cdot )-\rho (2\cdot )\), \(\phi _j(\cdot )= \mathscr {F}^{-1}\phi _0(2^{-j}\cdot )\mathscr {F}\), \(j \in \mathbb {Z}\), and consider

$$\begin{aligned} \mathscr {U}_2(t)\psi _0&= \sum _{|j| \le K} \phi _j(D)\mathscr {U}_2(t)\psi _0 + \sum _{j<- K} \phi _j(D)\mathscr {U}_2(t)\psi _0 + \sum _{j>K} \phi _j(D)\mathscr {U}_2(t)\psi _0 \nonumber \\&=: P_= \, \mathscr {U}_2(t)\psi _0 + P_<\,\mathscr {U}_2(t)\psi _0 + P_>\,\mathscr {U}_2(t)\psi _0, \end{aligned}$$
(129)

where

$$\begin{aligned} K&:= K(\varepsilon ) \; = \; 10-\frac{1}{2} \lceil \log _2 \varepsilon \rceil . \end{aligned}$$
(130)

Notice that the singular point \(R=0\) is in the support set of \(\mathscr {F}(P_< \, \mathscr {U}_2 (t)\psi _0)\). Roughly speaking, if \(j <-K\), the dominant term in \(P_2(R)\) is \(R^2\), while if \(j>K\) the dominant term in \(P_2(R)\) is \(\varepsilon R^4\); hence, by (106)

$$\begin{aligned} \Vert P_< \, \mathscr {U}_2(t)\psi _0 \Vert _{L^\infty }&\lesssim |t|^{-d/2} \Vert \psi _0\Vert _{L^1}, \end{aligned}$$
(131)
$$\begin{aligned} \Vert P_> \, \mathscr {U}_2(t)\psi _0 \Vert _{L^\infty }&\lesssim c^{d/2}|t|^{-d/4} \Vert \psi _0\Vert _{L^1}, \quad 0 < |t| \lesssim c^2. \end{aligned}$$
(132)

The time decay estimate for \(P_= \, \mathscr {U}_2(t)\psi _0\) is more difficult, since \(P_2(R)\) has a singular point at \(R=R_1:=(2\varepsilon )^{-1/2}\), which corresponds to the sphere \(|\xi |=R_1\) in the support set of \(\mathscr {F}(P_= \, \mathscr {U}_2 (t)\psi _0)\). We notice that also the point that satisfies \(P_2^{\prime \prime }(R)=0\), namely \(R=R_2:=(6\varepsilon )^{-1/2}\), corresponds to a sphere \(|\xi |=R_2\) contained in the support set of \(\mathscr {F}(P_= \, \mathscr {U}_2 (t)\psi _0)\); we shall use this fact later.
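The claim that both spheres lie in the medium-frequency region can be checked numerically. The sketch below assumes that the symbol of \(P_=\) equals 1 on the band \(2^{-K} \le |\xi | \le 2^{K}\) (which follows from telescoping the sum in (129), given that \(\rho \) equals 1 on the unit ball); the function names are hypothetical.

```python
import math

def band_exponent(eps):
    """K(eps) = 10 - (1/2) * ceil(log2(eps)), as in (130)."""
    return 10 - 0.5 * math.ceil(math.log2(eps))

def radii_in_medium_band(c):
    """Check 2^{-K} <= R_2 <= R_1 <= 2^{K} for eps = c^{-2}."""
    eps = float(c) ** -2
    K = band_exponent(eps)
    r1 = (2.0 * eps) ** -0.5   # zero of P_2'
    r2 = (6.0 * eps) ** -0.5   # zero of P_2''
    return 2.0 ** -K <= r2 <= r1 <= 2.0 ** K

if __name__ == "__main__":
    for c in (1, 2, 10, 100, 1e4, 1e6):
        print(c, radii_in_medium_band(c))
```

The margin is comfortable: \(2^{K} \ge 2^{9.5}\varepsilon ^{-1/2}\), whereas \(R_1 = 2^{-1/2}\varepsilon ^{-1/2}\).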

In order to handle the singular point \(R_1\), we perform another decomposition around the sphere \(|\xi |=R_1\). Denote \(\tilde{\sigma }\rho (\cdot )=\rho (2^{-K}\cdot )-\rho (2^{(K+1)}\cdot )\), so that \(P_= = \mathscr {F}^{-1} \tilde{\sigma }\rho \mathscr {F}\). Writing \(P_k = \mathscr {F}^{-1} \phi _k(|\xi |-R_1)\mathscr {F}\), we get

$$\begin{aligned} \sum _{|j| \le K} \phi _j(D)\mathscr {U}_2(t)\psi _0&= \sum _{k \in \mathbb {Z}} P_= P_k \, \mathscr {U}_2(t)\psi _0 \end{aligned}$$
(133)

By Young’s inequality

$$\begin{aligned} \Vert P_= P_k \, \mathscr {U}_2(t)\psi _0\Vert _{L^\infty }&\lesssim \Vert \mathscr {F}^{-1}\left( \tilde{\sigma }\rho \phi _k(|\xi |-R_1) e^{-itP_2(|\xi |)} \right) \Vert _{L^\infty } \Vert \psi _0\Vert _{L^1}. \end{aligned}$$
(134)

Moreover,

$$\begin{aligned}&\mathscr {F}^{-1}\left( \tilde{\sigma }\rho \phi _k(|\xi |-R_1) e^{-itP_2(|\xi |)} \right) \\&\quad {\mathop {=}\limits ^{(127)}} K_d \pi \int _0^\infty R^{d-1} \tilde{\sigma }\rho (R)\phi _k(R-R_1) e^{-itP_2(R)-iR|x|} \bar{h}(R|x|)\mathrm{d}R \\&\qquad +\, K_d \pi \int _0^\infty R^{d-1} \tilde{\sigma }\rho (R)\phi _k(R-R_1) e^{-itP_2(R)+iR|x|} h(R|x|) \mathrm{d}R \\&\quad =:A_k(|x|)+B_k(|x|). \end{aligned}$$

In order to estimate \(A_k(s)\) we rewrite it as

$$\begin{aligned} A_k(s)&= K_d\pi \left( \int _{R_1}^\infty +\int _0^{R_1} \right) R^{d-1} \tilde{\sigma }\rho (R)\phi _k(R-R_1) e^{-itP_2(R)-iRs} \bar{h}(Rs) \mathrm{d}R \end{aligned}$$
(135)
$$\begin{aligned}&=: A_k^{(1)}(s)+A_k^{(2)}(s). \end{aligned}$$
(136)

We begin by estimating \(A_k^{(1)}\): Notice that \(A_k^{(1)}(s)=0\) for \(k>K+2\), since the supports of \(\tilde{\sigma }\rho \) and \(\phi _k(\cdot -R_1)\) are then disjoint; hence, we can assume that \(k \le K+2\). By a change of variables we obtain

$$\begin{aligned}&A_k^{(1)}(s) {\mathop {=}\limits ^{R=R_1+2^k\sigma }} 2^k K_d\pi e^{-iR_1s} \int _{1/2}^2 F(\sigma ) e^{it 2^{2k} \tilde{\sigma }{P_2}(\sigma ) } \mathrm{d}\sigma , \\&F(\sigma ) := (R_1+2^k \sigma )^{d-1} \tilde{\sigma }\rho (R_1+2^k\sigma )\phi _0(\sigma ) \bar{h}((R_1+2^k\sigma )s), \\&\tilde{\sigma }{P_2}(\sigma ) := (2^{2k}t)^{-1} (t P_2(R_1+2^k\sigma )-2^k\sigma s). \end{aligned}$$

One can check that

$$\begin{aligned} |\tilde{\sigma }{P_2}{^\prime }(\sigma )|&= \left| 4(R_1+2^k\sigma )(2R_1+2^k\sigma )\sigma \varepsilon - \frac{s}{t2^k} \right| . \end{aligned}$$

Let \(s \gg 1\); if \(s \ll 2^k t/ \varepsilon \), then

$$\begin{aligned} |F^{(m)}(\sigma )|&\lesssim 1, \quad \forall m \ge 1, \quad |\tilde{\sigma }{P_2}{^\prime }(\sigma )| \;\lesssim \varepsilon , \; |\tilde{\sigma }{P_2}^{\prime \prime }(\sigma )|\\&\lesssim \varepsilon ^{1/2}, \; |\tilde{\sigma }{P_2}^{\prime \prime \prime }(\sigma )| \;\lesssim \varepsilon , \; |\tilde{\sigma }{P_2}^{(m)}(\sigma )| \;{\mathop {\lesssim }\limits ^{\varepsilon \le 1}} 1, \quad \forall m \ge 4 \end{aligned}$$

while for \(s \gg 2^kt/ \varepsilon \)

$$\begin{aligned} |F^{(m)}(\sigma )| \lesssim 1, \quad \forall m \ge 1, \quad |\tilde{\sigma }{P_2}^{(m)}(\sigma )|&{\mathop {\lesssim }\limits ^{\varepsilon \le 1}} 1, \quad \forall m \ge 1. \end{aligned}$$

Integrating by parts we get

$$\begin{aligned}&A^{(1)}_k(s)\\&\quad = 2^k(2^{2k}t)^{-N} K_d\pi e^{-iR_1s} \int _{1/2}^2 e^{ it 2^{2k}\tilde{\sigma }{P_2}(\sigma ) } \frac{\mathrm{d}}{\mathrm{d}\sigma } \left( \frac{1}{ \tilde{\sigma }{P_2}{^\prime }(\sigma ) } \cdots \frac{\mathrm{d}}{\mathrm{d}\sigma } \left( \frac{1}{ \tilde{\sigma }{P_2}{^\prime }(\sigma ) } \frac{\mathrm{d}}{\mathrm{d}\sigma } \left( \frac{ F(\sigma ) }{ \tilde{\sigma }{P_2}{^\prime }(\sigma ) } \right) \right) \right) \mathrm{d}\sigma . \end{aligned}$$

Therefore,

$$\begin{aligned} |A_k^{(1)}(s)|&\lesssim 2^k(2^{2k}t)^{-N}. \end{aligned}$$
(137)

If \(s \sim 2^kt/\varepsilon \), we apply the van der Corput lemma,

$$\begin{aligned} |A_k^{(1)}(s)| \,\,&\lesssim 2^k(2^{2k}t)^{-1/2} \int _{1/2}^2 |\partial _\sigma F(\sigma )|\mathrm{d}\sigma \\&{\mathop {\lesssim }\limits ^{(128)}} 2^k(2^{2k}t)^{-1/2} s^{-(d-1)/2} \lesssim 2^k(2^{2k}t)^{-d/2} \varepsilon ^{(d-1)/2}. \end{aligned}$$

Moreover, we can check that \(|A_k^{(1)}(s)| \lesssim 2^k\); hence, for \(s \gg 1\)

$$\begin{aligned} |A_k^{(1)}(s)|&{\mathop {\lesssim }\limits ^{\varepsilon \le 1}} 2^k \min ( 1, (2^{2k}t)^{-d/2} ). \end{aligned}$$
(138)

If \(s \lesssim 1\), we rewrite \(A_k^{(1)}\) in the following form

$$\begin{aligned} A_k^{(1)}(s)&= 2^k K_d \pi e^{-iR_1 s} \int _{1/2}^2 F_1(\sigma ) e^{itP_2(R_1+2^k\sigma )} \mathrm{d}\sigma , \\ F_1(\sigma )&:= (R_1+2^k\sigma )^{d-1} \tilde{\sigma }\rho (R_1+2^k\sigma )\phi _0(\sigma ) \bar{h}((R_1+2^k\sigma )s) e^{-i2^k\sigma s}. \end{aligned}$$

Again integrating by parts, we obtain

$$\begin{aligned} |A_k^{(1)}(s)|&\lesssim 2^k \min ( 1, (2^{2k}t)^{-d/2} ). \end{aligned}$$
(139)

Now we estimate \(A_k^{(2)}\). We notice that \(R_2 \in \text {supp}(\phi _k(R_1-\cdot ))\) if and only if \(k \in \{-2,-1\}\); when \(k \notin \{-2,-1\}\), one can repeat the above argument and show that

$$\begin{aligned} |A_k^{(2)}(s)|&\lesssim 2^k \min ( 1, (2^{2k}t)^{-d/2} ). \end{aligned}$$
(140)

Let \(k \in \{-2,-1\}\). If \(s \ll t\) or \(s \gg t\) we have by integration by parts that

$$\begin{aligned} |A_k^{(2)}(s)|&\lesssim \min ( 1, t^{-N} ), \quad \forall N \in \mathbb {N}. \end{aligned}$$

On the other hand, if \(s \sim t\) we can use the van der Corput lemma and obtain

$$\begin{aligned} |A_k^{(2)}(s)|&\lesssim t^{-1/3} s^{-(d-1)/2} \lesssim t^{-\frac{d}{2}+\frac{1}{6}}. \end{aligned}$$

Therefore, for \(k \in \{-2,-1\}\) we have

$$\begin{aligned} |A_k^{(2)}(s)|&\lesssim \min \left( 1,t^{-\frac{d}{2}+\frac{1}{6}} \right) . \end{aligned}$$
(141)

Combining (140) and (141) we can deduce that

$$\begin{aligned} |A_k^{(2)}(s)|&\lesssim 2^k \min \left( 1,(2^{2k}t)^{-\frac{d}{2}+\frac{1}{6}} \right) . \end{aligned}$$
(142)

If we sum up all the \(A_k\) for \(k \le K+2\) we finally conclude that for any \(d \ge 2\)

$$\begin{aligned} \Vert P_= \, \mathscr {U}_2(t)\psi _0 \Vert _{L^\infty }&\lesssim c \min (|t|^{-d/2},|t|^{-d/2+1/6}) \Vert \psi _0\Vert _{L^1}. \end{aligned}$$
(143)

Remark 18

In the general case \(r>2\), we have to determine critical points for the polynomial

$$\begin{aligned} P_r(R)&= \sum _{j=1}^r (-1)^{j+1} \varepsilon ^{j-1}R^{2j}, \end{aligned}$$
(144)

namely the roots of the polynomial

$$\begin{aligned} P{^\prime }_r(R)&= \sum _{j=1}^r (-1)^{j+1} \varepsilon ^{j-1} 2j R^{2j-1} = R \left( \sum _{j=1}^r (-1)^{j+1} \varepsilon ^{j-1} 2j R^{2(j-1)}\right) . \end{aligned}$$
(145)

Besides the trivial value \(R=0\), which we deal with as in the case \(r=2\), one should rely on lower and upper bounds to locate the other (if any) real roots. For a lower bound, we rely on a well-known corollary of Rouché's theorem from complex analysis, and we obtain that the other roots satisfy

$$\begin{aligned} R&\ge \frac{2}{\max \left( 2,\sum _{j=2}^r 2j\varepsilon ^{j-1} \right) } \\&\ge \frac{2}{\max \left( 2,2r\varepsilon \sum _{j=0}^{r-2} \varepsilon ^j \right) } \\&{\mathop {\ge }\limits ^{\varepsilon \le 1/2}} \frac{2}{\max (2,4r\varepsilon )} {\mathop {\ge }\limits ^{\varepsilon \le 1/(2r)}} 1. \end{aligned}$$

As for an upper bound, we exploit a classical result by Fujiwara [24], and we get that the roots satisfy

$$\begin{aligned} R&\le \max _{1 \le j \le r-1} \left( 2(r-1) \frac{2j\varepsilon ^{j-1}}{2r\varepsilon ^{r-1}} \right) ^{ \frac{1}{2(j-1)} } \\&\le 2(r-1) \max _{1 \le j \le r-1} \left( \frac{j}{r}\right) ^{ \frac{1}{2(j-1)} } \varepsilon ^{ \frac{j-r}{2(j-1)} } \\&{\mathop {\le }\limits ^{\varepsilon \le 1}} K_r \varepsilon ^{-1/2} \end{aligned}$$

for some \(K_r>0\).
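The \(\varepsilon ^{-1/2}\) scale of the nonzero critical points can also be seen by rescaling: from (145), \(P{^\prime }_r(R) = 2R\, q_r(\varepsilon R^2)\) with \(q_r(v) := \sum _{j=1}^r (-1)^{j+1} j v^{j-1}\), so every positive critical point has the form \(R=\varepsilon ^{-1/2}\sqrt{v}\) with \(v\) a positive root of \(q_r\), which does not depend on \(\varepsilon \). The following sketch locates such roots by a grid search plus bisection; the names `q`, `positive_roots`, `critical_radii` are hypothetical, and roots of even multiplicity would be missed by the sign-change test.

```python
import math

def q(r, v):
    """q_r(v) = sum_{j=1}^r (-1)^(j+1) * j * v^(j-1), so that P_r'(R) = 2R * q_r(eps * R^2)."""
    return sum((-1) ** (j + 1) * j * v ** (j - 1) for j in range(1, r + 1))

def positive_roots(r, v_max=4.0, steps=4000, iters=80):
    """Sign changes of q_r on (0, v_max], refined by bisection."""
    roots = []
    h = v_max / steps
    for i in range(steps):
        lo, hi = i * h, (i + 1) * h
        f_lo, f_hi = q(r, lo), q(r, hi)
        if f_lo != 0.0 and f_lo * f_hi <= 0.0:
            for _ in range(iters):
                mid = 0.5 * (lo + hi)
                if f_lo * q(r, mid) > 0.0:
                    lo = mid      # root lies in the right half
                else:
                    hi = mid      # root lies in the left half (or at mid)
            roots.append(0.5 * (lo + hi))
    return roots

def critical_radii(r, c):
    """Positive critical points of P_r for eps = c^{-2}: R = c * sqrt(v)."""
    return [c * math.sqrt(v) for v in positive_roots(r)]

if __name__ == "__main__":
    for r in (2, 3, 4, 5):
        print(r, positive_roots(r))
```

For \(r=2\) this recovers \(v=1/2\), i.e. \(R=(2\varepsilon )^{-1/2}\), while for \(r=3\) the quadratic \(q_3(v)=1-2v+3v^2\) has negative discriminant, so there is no nonzero real critical point, consistent with the caveat "(if any)" above.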

Hence, in the case \(r>2\), if \(\varepsilon \) is sufficiently small (depending on r), then the nonzero roots of \(P{^\prime }_r\), namely the nonzero critical points of \(P_r\), have modulus between 1 and \(\mathscr {O}(\varepsilon ^{-1/2})\) (a similar argument also works for the polynomial \(P^{\prime \prime }_r\)), and this affects the medium-frequency decay of \(\mathscr {U}_r(t)\). In any case, we can deal with this problem as in the case \(r=2\), and we get

$$\begin{aligned} \Vert P_< \, \mathscr {U}_r(t)\psi _0 \Vert _{L^\infty }&\lesssim |t|^{-d/2} \Vert \psi _0\Vert _{L^1}, \end{aligned}$$
(146)
$$\begin{aligned} \Vert P_= \, \mathscr {U}_r(t)\psi _0 \Vert _{L^\infty }&\lesssim c \min (|t|^{-d/2},|t|^{-d/2+1/6}) \Vert \psi _0\Vert _{L^1}, \end{aligned}$$
(147)
$$\begin{aligned} \Vert P_> \, \mathscr {U}_r(t)\psi _0 \Vert _{L^\infty }&\lesssim c^{d/2}|t|^{-\frac{d}{2r}} \Vert \psi _0\Vert _{L^1}, \; 0 < |t| \lesssim c^{2(r-1)}. \end{aligned}$$
(148)

8.2.2 Smoothing estimates

As already pointed out, one needs smoothing estimates to ensure the well-posedness of Eq. (112) because of the presence of derivatives in the nonlinearity. Again, we first consider the case \(r=2\), and then we mention the results for \(r>2\).

Proposition 8

For any \(k=(k_1,\ldots ,k_d) \in \mathbb {Z}^d\) with \(|k_i|=|k|_\infty \) and \(|k_i|\gtrsim c\)

$$\begin{aligned} \left\| \square _k D_{x_i}^{3/2}\mathscr {U}_2(t)\psi _0 \right\| _{ L^{\infty ,2}_{x_i;(x_j)_{j\ne i},t} }&\lesssim c \Vert \square _k \psi _0\Vert _{L^2}. \end{aligned}$$
(149)

Proof

It suffices to consider the case \(i=1\). For convenience, we write \(\bar{z}=(z_2,\ldots ,z_d)\). Then,

$$\begin{aligned} \left\| \square _k D_{x_i}^{3/2}\mathscr {U}_2(t)\psi _0 \right\| _{ L^{\infty ,2}_{x_i;(x_j)_{j\ne i},t} }&= \left\| \int \sigma _k(\xi )|\xi _1|^{3/2} e^{itP_2(|\xi |)} \mathscr {F}(\psi _0)(\xi ) e^{ix_1\xi _1} \mathrm{d}\xi _1 \right\| _{ L^\infty _{x_1} L^2_{\bar{\xi },t} } \\&\lesssim \left\| \int \eta _{k_1}(\xi _1)|\xi _1|^{3/2} e^{itP_2(|\xi |)} \mathscr {F}(\psi _0)(\xi ) e^{ix_1\xi _1} \mathrm{d}\xi _1 \right\| _{ L^\infty _{x_1} L^2_{\bar{\xi },t} } =: L. \end{aligned}$$

Now, we estimate L: If \(k_1 \gtrsim c\), then \(\xi _1>0\) for \(\xi _1 \in \text {supp}(\eta _{k_1})\). Hence, by changing variable, \(\theta = P_2(|\xi |)\), we get

$$\begin{aligned} L&\lesssim \left\| \int \eta _{k_1}( \xi _1(\theta ) )\xi _1(\theta )^{3/2} e^{it\theta } \mathscr {F}(\psi _0)(\xi (\theta )) e^{ix_1\xi _1(\theta )} \; \frac{1}{2} \xi _1^{-1}(\theta ) \left( 2 \frac{|\xi |^2}{c^2}-1 \right) ^{-1} \mathrm{d}\theta \right\| _{ L^\infty _{x_1} L^2_{\bar{\xi },t} } \\&\lesssim \left\| \eta _{k_1}( \xi _1(\theta ) )\xi _1(\theta )^{1/2} \mathscr {F}(\psi _0)(\xi (\theta )) \left( 2 \frac{|\xi |^2}{c^2}-1 \right) ^{-1} \right\| _{ L^2_{\theta } L^2_{\bar{\xi }} } \\&\lesssim \left\| \eta _{k_1}( \xi _1 )\xi _1^{1/2} \mathscr {F}(\psi _0)(\xi ) \left( 2 \frac{|\xi |^2}{c^2}-1 \right) ^{-1} \left( 2 \frac{|\xi |^2}{c^2}-1 \right) ^{1/2} \xi _1^{1/2} \right\| _{ L^2_{\xi } } \\&= \left\| \eta _{k_1}( \xi _1 )\xi _1 \mathscr {F}(\psi _0)(\xi ) \left( 2 \frac{|\xi |^2}{c^2}-1 \right) ^{-1/2} \right\| _{ L^2_{\xi } } \; \lesssim \; c \Vert \psi _0\Vert _{L^2}. \end{aligned}$$

The proof for the case \(k_1 \lesssim -c\) is similar. \(\square \)

By duality we have the following

Proposition 9

For any \(k=(k_1,\ldots ,k_d) \in \mathbb {Z}^d\) with \(|k_i|=|k|_\infty \) and \(|k_i|\gtrsim c\)

$$\begin{aligned} \left\| \square _k \partial ^2_{x_i} \mathscr {A}_2f \right\| _{ L^\infty _t L^2_x }&\lesssim c \Vert \square _k D_i^{1/2} f\Vert _{ L^{1,2}_{x_i;(x_j)_{j \ne i},t} }. \end{aligned}$$
(150)

Now consider the inhomogeneous Cauchy problem

$$\begin{aligned} -i\psi _t&= A_{c,2}\psi + f(t,x), \quad \psi (0,x)=0. \end{aligned}$$
(151)

Proposition 10

For any \(k=(k_1,\ldots ,k_d) \in \mathbb {Z}^d\) with \(|k_i|=|k|_\infty \) and \(|k_i|\gtrsim c\)

$$\begin{aligned} \left\| \square _k \partial ^2_{x_i} \psi \right\| _{ L^{\infty ,2}_{x_i;(x_j)_{j \ne i},t} }&\lesssim \Vert \square _k f\Vert _{ L^{1,2}_{x_i;(x_j)_{j \ne i},t} }. \end{aligned}$$
(152)

Proof

It suffices to consider \(i=1\). We write

$$\begin{aligned} \psi&= \mathscr {F}^{-1}_{\tau ,\xi } \frac{1}{\tau -c^2-P_2(|\xi |)}(\mathscr {F}_{t,x}f)(\tau ,\xi ). \end{aligned}$$

We have

$$\begin{aligned} \partial ^2_{x_i}\psi&= \mathscr {F}^{-1}_{\tau ,\xi } \frac{\xi _1^2}{P_2(|\xi |)+c^2-\tau }\mathscr {F}_{t,x}f. \end{aligned}$$
(153)

We want to show that

$$\begin{aligned} \left\| \mathscr {F}^{-1}_{\tau ,\xi } \frac{\eta _{k_1}(\xi _1)\xi _1^2}{P_2(|\xi |)+c^2-\tau }\mathscr {F}_{t,x}f \right\| _{L^\infty _{x_1} L^2_{\bar{\xi },t}}&\lesssim \left\| \mathscr {F}^{-1}_{\xi _1} \eta _{k_1}(\xi _1) \mathscr {F}_{x_1}f \right\| _{L^1_{x_1} L^2_{\bar{\xi },t}}, \end{aligned}$$

which, by Young’s inequality, follows once we show that

$$\begin{aligned} \sup _{x_1,\tau ,\xi _j \; (j \ne 1)} \left| \mathscr {F}^{-1}_{\xi _1} \frac{\sigma _k(\xi )\xi _1^2}{P_2(|\xi |)+c^2-\tau } \right|&\lesssim 1. \end{aligned}$$
(154)

We prove (154): First, notice that when \(|k_1|=|k|_\infty \), then \(|\xi _1| \sim |\xi |_\infty \) for \(\xi \in \text {supp}(\sigma _k)\). We split the argument according to the cases \(\tau -c^2>0\) and \(\tau -c^2\le 0\). In the case \(\tau -c^2 > 0\)

$$\begin{aligned} \sup _{x_1,\tau ,\xi _j \; (j \ne 1)} \left| \mathscr {F}^{-1}_{\xi _1} \frac{\sigma _k(\xi )\xi _1^2}{P_2(|\xi |)+c^2-\tau } \right|&\lesssim \left| \int _{k_1-3/4}^{k_1+3/4} \frac{c^2}{\xi _1^2} \mathrm{d}\xi _1 \right| \; {\mathop {\lesssim }\limits ^{|k_1|\gtrsim c}} 1. \end{aligned}$$

When \(\tau -c^2 \le 0\) we set \(\tau _2:=\tau _2(c)= c \left( \sqrt{\frac{5}{4}-\frac{\tau }{c^2}} - \frac{1}{2} \right) > 0\), in order to write

$$\begin{aligned} P_2(|\xi |)+c^2-\tau&= \left( \frac{|\xi |^2}{c}+\tau _2 \right) \left( -\frac{|\xi |^2}{c}+\tau _2+c \right) . \end{aligned}$$
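This factorization can be verified directly. Writing \(\mu := \sqrt{\frac{5}{4}-\frac{\tau }{c^2}}\) (an auxiliary symbol used only here), so that \(\tau _2=c\left( \mu -\frac{1}{2}\right) \) and \(\tau _2+c=c\left( \mu +\frac{1}{2}\right) \), and recalling that \(\varepsilon =c^{-2}\), one finds

$$\begin{aligned} \tau _2(\tau _2+c)&= c^2\left( \mu ^2-\frac{1}{4}\right) = c^2-\tau , \\ \left( \frac{|\xi |^2}{c}+\tau _2 \right) \left( -\frac{|\xi |^2}{c}+\tau _2+c \right)&= -\frac{|\xi |^4}{c^2} + |\xi |^2 + \tau _2(\tau _2+c) = P_2(|\xi |)+c^2-\tau . \end{aligned}$$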

Hence

$$\begin{aligned} \mathscr {F}^{-1}_{\xi _1} \frac{\sigma _k(\xi )\xi _1^2}{P_2(|\xi |)+c^2-\tau }&= \mathscr {F}^{-1}_{\xi _1} \frac{\sigma _k(\xi )\xi _1^2}{ \left( \frac{|\xi |^2}{c}+\tau _2 \right) \left( -\frac{|\xi |^2}{c}+\tau _2+c \right) } \nonumber \\&= \mathscr {F}^{-1}_{\xi _1} \frac{\sigma _k(\xi )\xi _1^2}{ \left( \frac{\xi _1^2}{c}+\frac{|\bar{\xi }|^2}{c}+\tau _2 \right) \left( -\frac{\xi _1^2}{c}-\frac{|\bar{\xi }|^2}{c}+\tau _2+c \right) }. \end{aligned}$$
(155)

When \(|\bar{\xi }|^2 \ge c(\tau _2+c)\), we can treat the problem as before.

Next, we consider the case \(|\bar{\xi }|^2 < c(\tau _2+c)\). Let

$$\begin{aligned} A^2&:= A(\bar{\xi },\tau ,c)^2 = \frac{|\bar{\xi }|^2}{c}+\tau _2, \\ B^2&:= B(\bar{\xi },\tau ,c)^2 = -\left( \frac{|\bar{\xi }|^2}{c}-\tau _2-c \right) , \end{aligned}$$

then

$$\begin{aligned}&\mathscr {F}^{-1}_{\xi _1} \frac{\eta _{k_1}(\xi _1)\xi _1^2}{ \left( \frac{|\xi |^2}{c}+\tau _2 \right) \left( -\frac{|\xi |^2}{c}+\tau _2+c \right) } \\&\quad = \mathscr {F}^{-1}_{\xi _1} \frac{\xi _1}{ \frac{\xi _1^2}{c}+A^2 } \frac{\xi _1}{ B^2-\frac{\xi _1^2}{c} } \eta _{k_1}(\xi _1) \\&\quad = \frac{c^{1/2}}{2} \mathscr {F}^{-1}_{\xi _1} \frac{\xi _1}{ \frac{\xi _1^2}{c}+A^2 } \left( \frac{1}{ B -\frac{\xi _1}{c^{1/2}} } - \frac{1}{ B +\frac{\xi _1}{c^{1/2}} } \right) \eta _{k_1}(\xi _1)\\&\quad =: I + II. \end{aligned}$$

We estimate only I, as the argument for II is similar. First we write

$$\begin{aligned} I&= -\frac{c}{2} \mathscr {F}^{-1}_{\xi _1} \frac{\eta _{k_1}(\xi _1)}{ \frac{\xi _1^2}{c}+A^2 } + \frac{c}{2} \mathscr {F}^{-1}_{\xi _1} \frac{\eta _{k_1}(\xi _1)B}{ \left( B -\frac{\xi _1}{c^{1/2}} \right) \left( \frac{\xi _1^2}{c}+A^2 \right) } := I_1+I_2. \end{aligned}$$
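Both the splitting into \(I+II\) and the splitting \(I=I_1+I_2\) are elementary partial fraction identities. With the shorthand \(u:=\xi _1/c^{1/2}\) (introduced only for this check),

$$\begin{aligned} \frac{\xi _1}{B^2-\frac{\xi _1^2}{c}}&= \frac{c^{1/2}u}{(B-u)(B+u)} = \frac{c^{1/2}}{2}\left( \frac{1}{B-u}-\frac{1}{B+u}\right) , \\ \frac{\xi _1}{B-u}&= c^{1/2}\,\frac{B-(B-u)}{B-u} = c^{1/2}\left( \frac{B}{B-u}-1\right) . \end{aligned}$$

Multiplying the second identity by \(\frac{c^{1/2}}{2}\left( \frac{\xi _1^2}{c}+A^2\right) ^{-1}\eta _{k_1}(\xi _1)\) gives exactly the two terms \(I_1\) and \(I_2\), where \(I\) denotes the contribution of \(\frac{1}{B-u}\) and \(II\) that of \(\frac{1}{B+u}\).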

Since \(\mathscr {F}^{-1}_{\xi _1}(1/\xi _1)\) is, up to a constant factor, the function \(sgn(x_1)\), we have that \(I_1\) is bounded uniformly with respect to c. For \(I_2\), it suffices to show

$$\begin{aligned} cB \sup _{x_1} \left| \mathscr {F}^{-1}_{\xi _1} \frac{1}{ \left( B -\frac{\xi _1}{c^{1/2}} \right) \left( \frac{\xi _1^2}{c}+A^2 \right) } \right|&\lesssim 1. \end{aligned}$$

Since \(|\mathscr {F}(e^{-|\cdot |})(\xi )| \lesssim \frac{1}{1+|\xi |^2}\),

$$\begin{aligned} cB \left\| \mathscr {F}^{-1}_{\xi _1} \frac{1}{ \left( B -\frac{\xi _1}{c^{1/2}} \right) \left( \frac{\xi _1^2}{c}+A^2 \right) } \right\| _{L^\infty _{x_1}}&\lesssim cB \left\| \mathscr {F}^{-1}_{\xi _1} \frac{1}{ B -\frac{\xi _1}{c^{1/2}} } \right\| _{L^\infty _{x_1}} \left\| \mathscr {F}^{-1}_{\xi _1} \frac{1}{ \frac{\xi _1^2}{c}+A^2 } \right\| _{L^1_{x_1}} \\&\lesssim \frac{c^2B}{A^2} \left\| \mathscr {F}^{-1}_{\xi _1} \frac{1}{ B -\frac{\xi _1}{c^{1/2}} } \right\| _{L^\infty _{x_1}} \left\| \mathscr {F}^{-1}_{\xi _1} \frac{1}{ \xi _1^2A^{-2} + 1} \right\| _{L^1_{x_1}}\\&\lesssim B \left\| \mathscr {F}^{-1}_{\xi _1} \frac{1}{ B -\frac{\xi _1}{c^{1/2}} } \right\| _{L^\infty _{x_1}} \cdot \frac{c^2}{A^2} \left\| A e^{-A|x_1|} \right\| _{L^1_{x_1}} \; \lesssim \; 1. \end{aligned}$$
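Two elementary facts enter the last chain: the Fourier transform of \(e^{-|x|}\) is a multiple of \((1+\xi ^2)^{-1}\), and the \(L^1\) norm of \(Ae^{-A|x_1|}\) does not depend on \(A\). The sketch below checks both numerically with a plain trapezoidal rule, under the assumption of the unnormalised convention \(\mathscr {F}f(\xi )=\int f(x)e^{-ix\xi }\mathrm{d}x\) (with a different normalisation only the constant 2 changes); the function names are hypothetical.

```python
import math

def trapezoid(f, a, b, n):
    """Composite trapezoidal rule on [a, b] with n subintervals."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h

def fourier_of_exp_abs(xi, half_width=40.0, n=40_000):
    """Integral of exp(-|x|) * exp(-i x xi); the imaginary part vanishes by symmetry."""
    return trapezoid(lambda x: math.exp(-abs(x)) * math.cos(x * xi),
                     -half_width, half_width, n)

def l1_norm_scaled_kernel(A, n=40_000):
    """L^1 norm of A * exp(-A |x|), truncated to |x| <= 40 / A."""
    half_width = 40.0 / A
    return trapezoid(lambda x: A * math.exp(-A * abs(x)),
                     -half_width, half_width, n)

if __name__ == "__main__":
    for xi in (0.0, 0.5, 1.0, 3.0):
        print(xi, fourier_of_exp_abs(xi), 2.0 / (1.0 + xi * xi))
    for A in (0.1, 1.0, 25.0):
        print(A, l1_norm_scaled_kernel(A))
```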

Finally, we observe that in general the solution \(\psi \) of (151) may not vanish at \(t=0\). However, by Parseval’s identity

$$\begin{aligned} \psi (0,x)&= \psi (t,x)_{|t=0} = K \int _I \mathscr {U}_2(s) \mathscr {F}(f)(s,x) \mathrm{d}s, \end{aligned}$$

for some \(K>0\), and if we combine it with (150), we have that \(\square _k\mathscr {U}_2(t) \partial ^2_{x_1}\psi (0,x) \in L^2\). Hence, by (149)

$$\begin{aligned} \tilde{\sigma }\psi (t)&:= \psi (t)-\mathscr {U}_2(t)\psi (0,\cdot ) = i \int _I \mathscr {U}_2(t-\tau )f(\tau )\mathrm{d}\tau \end{aligned}$$
(156)

is the solution of (151), and it satisfies (152). \(\square \)

Lemma 6

For any \(\sigma \in \mathbb {R}\) and \(k \in \mathbb {Z}^d\) with \(|k_i| \ge 4\),

$$\begin{aligned} \Vert \square _k D^\sigma _{x_i}\psi \Vert _{ L^{p_1,p_2}_{x_1;(x_j)_{j \ne 1},t} }&\lesssim \left\langle k_i \right\rangle ^\sigma \Vert \square _k \psi \Vert _{ L^{p_1,p_2}_{x_1;(x_j)_{j \ne 1},t} }. \end{aligned}$$
(157)

If we replace \(D^\sigma _{x_i}\) by \(\partial ^\sigma _{x_i}\), the above inequality holds for all \(k \in \mathbb {Z}^d\).

Proof

See the proof of Lemma 3.4 in [58]. One can check that both sides of (157) are equivalent for \(|k_i|\ge 4\). \(\square \)

By combining (152), (150) and (157) we obtain

Proposition 11

For any \(k=(k_1,\ldots ,k_d) \in \mathbb {Z}^d\) with \(|k_i|=|k|_\infty \gtrsim c\) we have

$$\begin{aligned} \left\| \square _k \partial _{x_i}^2\mathscr {A}_2f \right\| _{ L^{\infty ,2}_{x_i;(x_j)_{j\ne i},t} }&\lesssim \Vert \square _k f\Vert _ {L^{1,2}_{x_i;(x_j)_{j\ne i},t} }, \end{aligned}$$
(158)
$$\begin{aligned} \left\| \square _k \partial ^2_{x_i} \mathscr {A}_2f \right\| _{ L^\infty _t L^2_x }&\lesssim c \left\langle |k_i|\right\rangle ^{1/2} \Vert \square _k f\Vert _{ L^{1,2}_{x_i;(x_j)_{j \ne i},t} }. \end{aligned}$$
(159)

Remark 19

For the case \(r>2\) we replace (149), (150), (152), (158) and (159) with

$$\begin{aligned} \left\| \square _k D_{x_i}^{r-1/2}\mathscr {U}_r(t)\psi _0 \right\| _{ L^{\infty ,2}_{x_i;(x_j)_{j\ne i},t} }&\lesssim c^{r-1} \Vert \square _k \psi _0\Vert _{L^2}, \end{aligned}$$
(160)
$$\begin{aligned} \left\| \square _k \partial ^r_{x_i} \mathscr {A}_rf \right\| _{ L^\infty _t L^2_x }&\lesssim c^{r-1} \Vert \square _k D_i^{1/2} f\Vert _{ L^{1,2}_{x_i;(x_j)_{j \ne i},t} }, \end{aligned}$$
(161)
$$\begin{aligned} \left\| \square _k \partial ^{2(r-1)}_{x_i} \psi \right\| _{ L^{\infty ,2}_{x_i;(x_j)_{j \ne i},t} }&\lesssim \Vert \square _k f\Vert _{ L^{1,2}_{x_i;(x_j)_{j \ne i},t} }, \end{aligned}$$
(162)
$$\begin{aligned} \left\| \square _k \partial _{x_i}^{2(r-1)}\mathscr {A}_rf \right\| _{ L^{\infty ,2}_{x_i;(x_j)_{j\ne i},t} }&\lesssim \Vert \square _k f\Vert _ {L^{1,2}_{x_i;(x_j)_{j\ne i},t} }, \end{aligned}$$
(163)
$$\begin{aligned} \left\| \square _k \partial ^{2(r-1)}_{x_i} \mathscr {A}_rf \right\| _{ L^\infty _t L^2_x }&\lesssim c^{r-1} \left\langle |k_i|\right\rangle ^{r-3/2} \Vert \square _k f\Vert _{ L^{1,2}_{x_i;(x_j)_{j \ne i},t} }. \end{aligned}$$
(164)

Remark 20

We point out the fact that we have worked out smoothing estimates only in the higher frequencies. As in [50], only these smoothing estimates are needed in order to discuss the well-posedness of (112).

8.2.3 Strichartz estimates

By exploiting (110) we can deduce Strichartz estimates for solutions of (112) combined with \(\square _k\)-decomposition operators.

Proposition 12

Let \(r \ge 1\), \(d \ge 2\), \(c \ge 1\), \(t>0\). Let \((p,q)\) and \((a,b)\) be order-r admissible pairs. Then for any \(0 < T \lesssim c^{2(r-1)}\) and for any \(k \in \mathbb {Z}^d\) with \(|k| \gtrsim K\) (\(K=K(c)\) is defined in (130))

$$\begin{aligned} \Vert \square _k \mathscr {U}_r(t)\phi _0 \Vert _{L^p([0,T])L^q(\mathbb {R}^d)}&\lesssim c^{ d \left( 1-\frac{1}{r} \right) \left( \frac{1}{2} -\frac{1}{q} \right) } \Vert \square _k\phi _0\Vert _{L^2(\mathbb {R}^d)} \nonumber \\&= c^{ \left( 1-\frac{1}{r}\right) \frac{2r}{p} } \Vert \square _k\phi _0\Vert _{L^2(\mathbb {R}^d)}, \end{aligned}$$
(165)
$$\begin{aligned} \left\| \square _k \int _0^t \mathscr {U}_r(t-\tau )\phi (\tau ) \mathrm{d}\tau \right\| _{L^p([0,T])L^q(\mathbb {R}^d)}&\lesssim c^{\left( 1-\frac{1}{r}\right) 2r\left( \frac{1}{p}+\frac{1}{a}\right) } \Vert \square _k\phi \Vert _{L^{a{^\prime }}([0,T])L^{b{^\prime }}(\mathbb {R}^d)}. \end{aligned}$$
(166)
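The equality between the two exponents of \(c\) in (165) is just the scaling relation linking \(p\) and \(q\). Assuming that order-r admissibility is normalised as \(\frac{2r}{p}+\frac{d}{q}=\frac{d}{2}\) (this is an assumption made here for illustration, the precise definition being the one given earlier in the paper), one has

$$\begin{aligned} d\left( 1-\frac{1}{r}\right) \left( \frac{1}{2}-\frac{1}{q}\right)&= \left( 1-\frac{1}{r}\right) \left( \frac{d}{2}-\frac{d}{q}\right) = \left( 1-\frac{1}{r}\right) \frac{2r}{p} = \frac{2(r-1)}{p}. \end{aligned}$$

In particular the loss in \(c\) disappears for \(r=1\) and for \(p=\infty \), and in (166) the exponents of the two pairs simply add up to \(2(r-1)\left( \frac{1}{p}+\frac{1}{a}\right) \).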

Furthermore, by (106) we have that

$$\begin{aligned} \Vert \square _k \mathscr {U}_r(t) \Vert _{L^1 \rightarrow L^\infty }&\lesssim c^{d \left( 1-\frac{1}{r}\right) } \left\langle t \right\rangle ^{-d/(2r)}, \; 0 < |t| \lesssim c^{2(r-1)}, \end{aligned}$$

and by following closely the argument in Section 5 of [59] we can deduce

Proposition 13

Let \(r \ge 1\), \(d \ge 2\), \(c \ge 1\). Let \((p,q)\) be a Schrödinger-admissible pair, then

$$\begin{aligned} \Vert \mathscr {U}_r(t)\psi _0 \Vert _{l^{1,s}_\square (L^p_t([0,T]) L^q_x)}&\lesssim c^{\left( 1-\frac{1}{r}\right) \frac{2r}{p}} \Vert \psi _0\Vert _{M^s_{2,1}}, \; 0 < T \lesssim c^{2(r-1)}, \end{aligned}$$
(167)
$$\begin{aligned} \Vert \mathscr {A}_r f \Vert _{l^{1,s}_\square (L^p_t([0,T]) L^q_x) \cap l^{1,s}_\square (L^\infty _t([0,T]) L^2_x)}&\lesssim c^{\left( 1-\frac{1}{r}\right) \frac{4r}{p}} \Vert f\Vert _{l^{1,s}_\square (L^{p{^\prime }}([0,T])L^{q{^\prime }}(\mathbb {R}^d))}. \end{aligned}$$
(168)

8.2.4 Maximal function estimates

In this subsection we study the maximal function estimates for the semigroup \(\mathscr {U}_r(t)\) and the integral operator \(\int _0^t \mathscr {U}_r(t-\tau ) \cdot \mathrm{d}\tau \) in anisotropic Lebesgue spaces. To do this, we will need the time decay properties proved in Sec. 8.2.1. As always, we first prove results for the case \(r=2\), and then we state the modifications for the general case.

Lemma 7

  1.

    Let \(q \ge 2\), \(\frac{8}{d} < q \le + \infty \) and \(k \in \mathbb {Z}^d\) with \(|k| \gtrsim K(c)\), then

    $$\begin{aligned} \Vert \square _k \, \mathscr {U}_2(t)\psi _0 \Vert _{L^{q,\infty }_{x_i;(x_j)_{j \ne i},t}}&\lesssim c^{d/2} \left\langle k \right\rangle ^{1/q} \Vert \square _k\psi _0\Vert _{L^2}, \; 0 < |t| \lesssim c^2, \; \forall i=1,\ldots ,d. \end{aligned}$$
    (169)
  2.

    Let \(q \ge 2\), \(\frac{4}{d} < q \le + \infty \) and \(k \in \mathbb {Z}^d\) with \(|k| \lesssim K(c)\), then

    $$\begin{aligned} \Vert \square _k \, \mathscr {U}_2(t)\psi _0 \Vert _{L^{q,\infty }_{x_i;(x_j)_{j \ne i},t}}&\lesssim c \left\langle k \right\rangle ^{1/q} \Vert \square _k\psi _0\Vert _{L^2}, \quad \forall i=1,\ldots ,d. \end{aligned}$$
    (170)

Proof

Clearly it suffices to prove the claim for \(i=1\); recall that for any \(x=(x_1,\ldots ,x_d) \in \mathbb {R}^d\) we denote \(\bar{x} = (x_2,\ldots ,x_d)\). By a standard \(TT^\star \) argument, (169) is equivalent to

$$\begin{aligned} \left\| \int _{\mathbb {R}^d} e^{i \left\langle x,\xi \right\rangle } e^{it(c^2+P_2(|\xi |)) }\sigma _k(\xi )\mathrm{d}\xi \right\| _{L^{q/2,\infty }_{x_1;\bar{x},t } }&\lesssim \left\langle k \right\rangle ^{2/q}. \end{aligned}$$
(171)

If \(|k| \gtrsim K(c)\), then

$$\begin{aligned} \Vert \mathscr {F}^{-1}e^{it(c^2+P_2(|\xi |))}\sigma _k(\xi ) \Vert _{L^\infty _x}&{\mathop {\lesssim }\limits ^{(132)}} c^{d/2} \left\langle k \right\rangle ^{-d} |t|^{-d/4}, \; 0 < |t| \lesssim c^2; \end{aligned}$$
(172)

on the other hand

$$\begin{aligned} \Vert \square _k \mathscr {U}_2(t)\mathscr {F}^{-1}\sigma _k \Vert _{L^\infty _{t,x}}&\lesssim \Vert \square _k \mathscr {U}_2(t)\mathscr {F}^{-1}\sigma _k \Vert _{L^\infty _t L^2_x} \lesssim 1. \end{aligned}$$
(173)

If we combine (172) and (173), we obtain

$$\begin{aligned} |\square _k\mathscr {U}_2(t)\mathscr {F}^{-1}\sigma _k|&\lesssim c^{d/2} (1+\left\langle k \right\rangle ^4 |t|)^{-d/4}, \; 0 < |t| \lesssim c^2. \end{aligned}$$
(174)
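The combination of (172) and (173) is the usual interpolation between a trivial bound and a decaying one. Writing \(X:=\left\langle k \right\rangle ^4|t|\) (a shorthand used only here) and using \(c \ge 1\),

$$\begin{aligned} |\square _k\mathscr {U}_2(t)\mathscr {F}^{-1}\sigma _k|&\lesssim \min \left( 1, c^{d/2} X^{-d/4} \right) \le c^{d/2} \min \left( 1, X^{-d/4} \right) \le 2^{d/4}\, c^{d/2} (1+X)^{-d/4}, \end{aligned}$$

where the last step uses \(1+X \le 2\) for \(X \le 1\) and \(1+X \le 2X\) for \(X \ge 1\).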

Now, if \(|x_1| \gtrsim 1+|t|\left\langle k \right\rangle ^5\), by integrating by parts we get

$$\begin{aligned} | \square _k \mathscr {U}_2(t)\mathscr {F}^{-1}\sigma _k |&\lesssim c^{d/2} \left\langle x_1 \right\rangle ^{-2} . \end{aligned}$$
(175)

If \(|x_1| \lesssim 1+|t|\left\langle k \right\rangle ^5\), by (174) we can deduce

$$\begin{aligned} | \square _k \mathscr {U}_2(t)\mathscr {F}^{-1}\sigma _k |&\lesssim c^{d/2} \left( 1+|x_1|\left\langle k \right\rangle ^{-1}\right) ^{-d/4} \end{aligned}$$
(176)

Combining (175) and (176) we have

$$\begin{aligned} \sup _{\bar{x},t}|\square _k \mathscr {U}_2(t)\mathscr {F}^{-1}\sigma _k|&\lesssim c^{d/2} \left\langle x_1 \right\rangle ^{-2} + c^{d/2} \left( 1+|x_1|\left\langle k \right\rangle ^{-1}\right) ^{-d/4}, \end{aligned}$$
(177)

from which, by taking the \(L^{q/2}_{x_1}\) norm on both sides, we obtain (171). The proof for the case \(|k| \lesssim K(c)\) is similar. \(\square \)
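For the reader's convenience, here is the computation behind the last step, which also explains the restriction \(q>\frac{8}{d}\) and the power \(\left\langle k \right\rangle ^{2/q}\). Set \(\alpha :=\frac{dq}{8}\) (an auxiliary symbol); for \(q<\infty \) the second term in (177) satisfies

$$\begin{aligned} \left\| \left( 1+|x_1|\left\langle k \right\rangle ^{-1}\right) ^{-d/4} \right\| _{L^{q/2}_{x_1}}^{q/2}&= \int _{\mathbb {R}} \left( 1+\frac{|x_1|}{\left\langle k \right\rangle }\right) ^{-\alpha } \mathrm{d}x_1 = 2\left\langle k \right\rangle \int _0^\infty (1+y)^{-\alpha } \mathrm{d}y = \frac{2\left\langle k \right\rangle }{\alpha -1}, \end{aligned}$$

which is finite precisely when \(\alpha >1\), that is \(q>\frac{8}{d}\), and gives an \(L^{q/2}_{x_1}\) norm of size \(\left\langle k \right\rangle ^{2/q}\). The first term \(\left\langle x_1 \right\rangle ^{-2}\) belongs to \(L^{q/2}_{x_1}\) for every \(q \ge 2\) with a norm independent of \(k\), and for \(q=\infty \) both terms are bounded by 1. The \(c\)-dependent prefactor in (177) is carried along unchanged.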

Lemma 8

Let \(q \ge 2\), \(\frac{8}{d} < q \le + \infty \) and \(k \in \mathbb {Z}^d\) with \(|k_i| \gtrsim K(c)^2\), then

$$\begin{aligned} \Vert \square _k \, \mathscr {A}_2 f \Vert _{L^{q,\infty }_{x_i;(x_j)_{j \ne i},t}}&\lesssim c^{d/2} \left\langle k_i \right\rangle ^{-3/2+1/q} \Vert \square _kf\Vert _{L^{1,2}_{x_i;(x_j)_{j \ne i},t}}, \; 0 < |t| \lesssim c^2, \quad \forall i=1,\ldots ,d. \end{aligned}$$
(178)

Proof

It suffices to prove the case \(i=1\). Recall that the solution of (151) is of the form

$$\begin{aligned} \psi&= \mathscr {F}^{-1}_{\tau ,\xi } \frac{1}{c^2+P_2(|\xi |)-\tau }\mathscr {F}_{t,x}f; \end{aligned}$$

hence, its frequency localization can be written as

$$\begin{aligned} \square _k \, \psi&= \mathscr {F}^{-1}_{\tau ,\xi } \frac{1}{c^2+P_2(|\xi |)-\tau }(\mathscr {F}_{t,x}\square _k \, f)(\tau ,\xi ). \end{aligned}$$

For convenience, we introduce the following regions

$$\begin{aligned} \mathbb {E}_1&= \left\{ \tau -c^2 \le -c^2/4 \right\} , \\ \mathbb {E}_2&= \left\{ -c^2/4 \le \tau -c^2 \le |\bar{\xi }|^2 \left( -\frac{|\bar{\xi }|^2}{c^2}+1 \right) \right\} , \\ \mathbb {E}_3&= \left\{ \tau -c^2 \ge |\bar{\xi }|^2 \left( -\frac{|\bar{\xi }|^2}{c^2}+1 \right) \right\} , \end{aligned}$$

and we make the following decomposition

$$\begin{aligned} c^2+P_2(|\xi |)-\tau&= {\left\{ \begin{array}{ll} \left( \frac{|\xi |^2}{c} + \tau _2(c,\tau ) \right) \left( -\frac{\xi _1}{c^{1/2}} + a \right) \left( \frac{\xi _1}{c^{1/2}} + a \right) &{} (\bar{\xi },\tau ) \in \mathbb {E}_1, \\ \left( \frac{|\xi |^2}{c} + \tau _2(c,\tau ) \right) \left( -\frac{|\xi |^2}{c} + \tau _2(c,\tau )+c \right) , &{} (\bar{\xi },\tau ) \in \mathbb {E}_2, \\ - \left( \frac{|\xi |^2}{c} - \frac{c}{2} \right) ^2 + \left( \frac{5}{4} c^2- \tau \right) , &{} (\bar{\xi },\tau ) \in \mathbb {E}_3, \\ \end{array}\right. } \end{aligned}$$
(179)

where \(a=a(c,\bar{\xi },\tau ):= ( \tau _2(c,\tau )- |\bar{\xi }|^2/c + c)^{1/2}\). We denote
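The three expressions in (179) are algebraically equal to \(c^2+P_2(|\xi |)-\tau \) wherever they are defined; the regions \(\mathbb {E}_i\) only select which form is convenient. A minimal numerical sketch of this identity follows (function and argument names are hypothetical; it checks the algebra only, not the choice of regions, and requires \(\tau \le \frac{5}{4}c^2\) and \(a^2 \ge 0\) so that \(\tau _2\) and \(a\) are real).

```python
import math

def tau2(c, tau):
    """tau_2 = c * (sqrt(5/4 - tau / c^2) - 1/2); needs tau <= 5 c^2 / 4."""
    return c * (math.sqrt(1.25 - tau / c ** 2) - 0.5)

def factorisations(c, xi1, xibar_sq, tau):
    """Return (lhs, form_E1, form_E2, form_E3) for the symbol c^2 + P_2(|xi|) - tau."""
    xi_sq = xi1 ** 2 + xibar_sq                       # |xi|^2
    lhs = c ** 2 + xi_sq - xi_sq ** 2 / c ** 2 - tau  # c^2 + P_2 - tau with eps = c^{-2}
    t2 = tau2(c, tau)
    a_sq = t2 - xibar_sq / c + c
    if a_sq < 0.0:
        raise ValueError("a is not real for these parameters")
    a = math.sqrt(a_sq)
    first = xi_sq / c + t2
    form_e1 = first * (-xi1 / math.sqrt(c) + a) * (xi1 / math.sqrt(c) + a)
    form_e2 = first * (-xi_sq / c + t2 + c)
    form_e3 = -(xi_sq / c - c / 2.0) ** 2 + (1.25 * c ** 2 - tau)
    return lhs, form_e1, form_e2, form_e3

if __name__ == "__main__":
    print(factorisations(c=10.0, xi1=3.0, xibar_sq=5.0, tau=50.0))
```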

$$\begin{aligned} \square _k \, \psi _i&= \mathscr {F}^{-1}_{\tau ,\xi } \frac{ \chi _{\mathbb {E}_i}(\bar{\xi },\tau ) }{c^2+P_2(|\xi |)-\tau }(\mathscr {F}_{t,x}\square _k \, f)(\tau ,\xi ), \; i=1,2,3. \end{aligned}$$

First, we estimate \(\square _k \, \psi _1\). Set \(\tilde{\sigma }\eta _{k_1}(\xi _1) = \sum _{|l| \le 10} \eta _{k_1+l}(\xi _1)\). First we notice that

$$\begin{aligned} \frac{ \chi _{\mathbb {E}_1}(\bar{\xi },\tau ) }{c^2+P_2(|\xi |)-\tau }&= \frac{ \chi _{\mathbb {E}_1}(\bar{\xi },\tau ) }{ (2\tau _2(c,\tau )+c) \left( \frac{|\xi |^2}{c} + \tau _2(c,\tau ) \right) } \\&\quad + \frac{ \chi _{\mathbb {E}_1}(\bar{\xi },\tau ) }{ 2a(2\tau _2(c,\tau )+c) } \left( \frac{1}{ -\frac{\xi _1}{c^{1/2}} + a } + \frac{1}{ \frac{\xi _1}{c^{1/2}} + a } \right) \\&=: \sum _{j=1}^3 A_j(c,\xi ,\tau ). \end{aligned}$$
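This decomposition is a two-step partial fraction expansion. With the shorthand \(X:=\frac{|\xi |^2}{c}+\tau _2\) and \(Y:=-\frac{|\xi |^2}{c}+\tau _2+c = a^2-\frac{\xi _1^2}{c}\) (introduced only for this check), one has \(X+Y=2\tau _2+c\) and, wherever \(a\) is real and nonzero,

$$\begin{aligned} \frac{1}{XY}&= \frac{1}{X+Y}\left( \frac{1}{X}+\frac{1}{Y}\right) , \\ \frac{1}{Y}&= \frac{1}{\left( a-\frac{\xi _1}{c^{1/2}}\right) \left( a+\frac{\xi _1}{c^{1/2}}\right) } = \frac{1}{2a}\left( \frac{1}{ -\frac{\xi _1}{c^{1/2}} + a } + \frac{1}{ \frac{\xi _1}{c^{1/2}} + a } \right) . \end{aligned}$$

The term \(\frac{1}{(X+Y)X}\) is \(A_1\), and the two simple fractions in \(\xi _1\) give \(A_2\) and \(A_3\).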

According to the above decomposition, we can rewrite \(\square _k \, \psi _1\) as

$$\begin{aligned}&\square _k \, \psi _1 \\&\quad = \mathscr {F}^{-1}_{\tau ,\xi } \frac{ \chi _{\mathbb {E}_1}(\bar{\xi },\tau ) \tilde{\sigma }\eta _{k_1}(ac^{1/2}) }{c^2+P_2(|\xi |)-\tau }(\mathscr {F}_{t,x}\square _k \, f)(\tau ,\xi ) \\&\qquad +\,\mathscr {F}^{-1}_{\tau ,\xi } \frac{ \chi _{\mathbb {E}_1}(\bar{\xi },\tau ) (1-\tilde{\sigma }\eta _{k_1}(ac^{1/2})) }{c^2+P_2(|\xi |)-\tau }(\mathscr {F}_{t,x}\square _k \, f)(\tau ,\xi ) \\&\quad = \sum _{j=1}^3 \mathscr {F}^{-1}_{\tau ,\xi } \chi _{\mathbb {E}_1}(\bar{\xi },\tau ) A_j(\xi ,\tau ) \tilde{\sigma }\eta _{k_1}(ac^{1/2}) (\mathscr {F}_{t,x}\square _k \, f)(\tau ,\xi ) \\&\qquad + \,\mathscr {F}^{-1}_{\tau ,\xi } \frac{ \chi _{\mathbb {E}_1}(\bar{\xi },\tau ) (1-\tilde{\sigma }\eta _{k_1}(ac^{1/2})) }{c^2+P_2(|\xi |)-\tau }(\mathscr {F}_{t,x}\square _k \, f)(\tau ,\xi ) \\&\quad =: I + \textit{II} + \textit{III} + \textit{IV}. \end{aligned}$$
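The splitting into \(A_1\), \(A_2\), \(A_3\) is a partial-fraction identity. As a sketch (with the shorthand \(X:=|\xi |^2/c\), which is not part of the original notation, and using \(a^2-\xi _1^2/c=\tau _2+c-X\), which follows from the definition of a), one has

$$\begin{aligned} \frac{1}{(X+\tau _2)(\tau _2+c-X)}&= \frac{1}{2\tau _2+c} \left( \frac{1}{X+\tau _2} + \frac{1}{\tau _2+c-X} \right) , \\ \frac{1}{\tau _2+c-X}&= \frac{1}{a^2-\xi _1^2/c} = \frac{1}{2a} \left( \frac{1}{a-\frac{\xi _1}{c^{1/2}}} + \frac{1}{a+\frac{\xi _1}{c^{1/2}}} \right) . \end{aligned}$$

The first line holds because \((X+\tau _2)+(\tau _2+c-X)=2\tau _2+c\). Substituting the second line into the first gives the three terms: \(A_1\) carries the factor \((|\xi |^2/c+\tau _2)^{-1}\), while \(A_2\) and \(A_3\) carry the two simple poles at \(\xi _1=\pm ac^{1/2}\).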

Case \(k_1 \gtrsim K(c)^2\): First, we estimate II. Let \(\tilde{\sigma }\sigma _k\) be as in (125); then

$$\begin{aligned} \textit{II}&= \int _{I \times \mathbb {R}^d} \frac{ e^{it\tau +i \left\langle \bar{x},\bar{\xi } \right\rangle } \chi _{\mathbb {E}_1}(\bar{\xi },\tau ) }{2a(2\tau _2(c,\tau )+c)} \tilde{\sigma }\sigma _{\bar{k}}(\bar{\xi }) \tilde{\sigma }\eta _{k_1}(ac^{1/2}) \widehat{\square _k f(y_1,\cdot )}(\bar{\xi },\tau ) c^{1/2} e^{i(x_1-y_1)ac^{1/2}} \\&\quad \times \, sgn(x_1-y_1) \mathrm{d}\bar{\xi } \mathrm{d}y_1 \mathrm{d}\tau . \end{aligned}$$

By changing variable, \(\xi _1=c^{1/2} a(c,\bar{\xi },\tau )\), and by setting \(\tilde{\sigma }\rho _k(\xi ) = \tilde{\sigma }\sigma _{\bar{k}}(\bar{\xi }) \tilde{\sigma }\eta _{k_1}(\xi _1)\), we obtain

$$\begin{aligned} |\textit{II}|&\lesssim \left| \int \mathrm{d}y_1 \; sgn(x_1-y_1) \int e^{it(c^2+P_2(|\xi |))} e^{i(x_1-y_1)\xi _1 + i \left\langle \bar{x},\bar{\xi } \right\rangle } \tilde{\sigma }\rho _k(\xi ) \widehat{\square _k f(y_1,\cdot )}(c^2+P_2(|\xi |),\tau ) \mathrm{d}\xi \right| , \end{aligned}$$

and by applying (169) we get

$$\begin{aligned} \Vert \textit{II} \Vert _{L^{q,\infty }_{x_1;\bar{x},t}}&\lesssim \int \mathrm{d}y_1 \left\| \int e^{it(c^2+P_2(|\xi |))} e^{i(x_1-y_1)\xi _1 + i \left\langle \bar{x},\bar{\xi } \right\rangle } \tilde{\sigma }\rho _k(\xi ) \widehat{\square _k f(y_1,\cdot )}(c^2+P_2(|\xi |),\tau ) \mathrm{d}\xi \right\| _{L^{q,\infty }_{x_1;\bar{x},t}} \nonumber \\&\lesssim c^{d/2} \left\langle k_1 \right\rangle ^{1/q} \int \Vert \tilde{\sigma }\rho _k(\xi ) \widehat{\square _k f(y_1,\cdot )}(c^2+P_2(|\xi |),\tau ) \Vert _{L^2_\xi } \mathrm{d}y_1 \nonumber \\&{\mathop {\lesssim }\limits ^{(150),(157)}} c \, c^{d/2} \left\langle k_1 \right\rangle ^{1/q-3/2} \Vert \square _k \, f\Vert _{L^{1,2}_{x_1;\bar{x},t}}. \end{aligned}$$
(180)

Since \(k_1>0\), \(\textit{III}\) has the same upper bound as in (180).

Now we estimate \(\textit{IV}\): First notice that

$$\begin{aligned} \textit{IV}&= \int \mathrm{d}y_1 \int e^{it\tau + i \left\langle \bar{x}, \bar{\xi } \right\rangle } \tilde{\sigma }\sigma _{\bar{k}}(\bar{\xi }) \widehat{\square _k f(y_1,\cdot )}(\bar{\xi },\tau ) \; K(x_1-y_1,a,\bar{\xi }) \mathrm{d}\bar{\xi }, \\ K(x_1,a,\bar{\xi })&= \chi _{\mathbb {E}_1}(\bar{\xi },\tau ) (1-\tilde{\sigma }\eta _{k_1}(ac^{1/2})) \int \frac{\sum _{|l| \le 1} \eta _{k_1+l}(\xi _1) e^{ix_1 \xi _1} }{c^2+P_2(|\xi |)-\tau } \mathrm{d}\xi _1. \end{aligned}$$

By Young’s inequality for convolutions, Hölder’s inequality and Minkowski’s inequality we have

$$\begin{aligned} \Vert \textit{IV} \Vert _{L^{q,\infty }_{x_1;\bar{x},t}}&\le \left\| \int \Vert \tilde{\sigma }\sigma _{\bar{k}}(\bar{\xi }) \widehat{\square _k f(y_1,\cdot )}(\bar{\xi },\tau ) \; K(x_1-y_1,a,\bar{\xi }) \Vert _{L^1_{\bar{\xi },\tau }} \mathrm{d}y_1 \right\| _{L^q_{x_1}} \\&\le \Vert \square _k \, f \Vert _{L^{1,2}_{x_1;\bar{x},t}} \Vert \tilde{\sigma }\sigma _{\bar{k}}(\bar{\xi })K(x_1,a,\bar{\xi }) \Vert _{L^{q,2}_{x_1;\bar{\xi },\tau }} \\&\lesssim \Vert \square _k \, f \Vert _{L^{1,2}_{x_1;\bar{x},t}} \Vert \tilde{\sigma }\sigma _{\bar{k}}(\bar{\xi })K(x_1,a,\bar{\xi }) \Vert _{L^\infty _{\bar{\xi }} L^2_\tau L^q_{x_1}}. \end{aligned}$$

Integrating by parts it follows that

$$\begin{aligned}&\Vert \tilde{\sigma }\sigma _{\bar{k}}(\bar{\xi }) K(x_1,a,\xi ) \Vert _{L^\infty _{\bar{\xi }} L^2_\tau L^q_{x_1}} \nonumber \\&\quad \lesssim \sup _{|\xi -k|_\infty \le 3} \sum _{j=0}^1 \Vert \chi _{\mathbb {E}_1}(\bar{\xi },\tau ) (1-\tilde{\sigma }\eta _{k_1}(ac^{1/2})) \partial ^j_{\xi _1}(c^2+P_2(|\xi |)-\tau )^{-1} \Vert _{L^2_\tau }. \end{aligned}$$
(181)

Noticing that \(|\xi _1 - ac^{1/2}| \ge c^{1/2} \ge 1\) on the support of \((1-\tilde{\sigma }\eta _{k_1}(ac^{1/2})) \chi _{|\xi _1-k_1| \le 3}\partial ^j_{\xi _1}(c^2+P_2(|\xi |)-\tau )^{-1}\), we can deduce from (179) that no singularity arises when we integrate (181), and this gives

$$\begin{aligned} \Vert \tilde{\sigma }\sigma _{\bar{k}}(\bar{\xi }) K(x_1,a,\bar{\xi }) \Vert _{L^\infty _{\bar{\xi }} L^2_\tau L^q_{x_1}}&\lesssim c^{1/2} |k_1|^{-3/2}. \end{aligned}$$

Now we estimate I: We begin by setting

$$\begin{aligned} J(x_1,a,\bar{\xi })&= \chi _{\mathbb {E}_1}(\bar{\xi },\tau ) \tilde{\sigma }\eta _{k_1}(ac^{1/2}) \int \frac{ \sum _{|l| \le 1} \eta _{k_1+l}(\xi _1)e^{ix_1\xi _1} }{ (2\tau _2(c,\tau )+c)\left( \frac{|\xi |^2}{c}+\tau _2(c,\tau ) \right) } \mathrm{d}\xi _1. \end{aligned}$$
(182)

One can check that

$$\begin{aligned} I&= \int \mathrm{d}y_1 \int e^{it\tau + i \left\langle \bar{x}, \bar{\xi } \right\rangle } \tilde{\sigma }\sigma _{\bar{k}}(\bar{\xi }) \widehat{\square _k \, f(y_1,\cdot )}(\bar{\xi },\tau ) J(x_1-y_1,a,\bar{\xi }) \mathrm{d}\bar{\xi } \mathrm{d}\tau . \end{aligned}$$

Similar to the estimate of IV, by Young’s, Hölder’s and Minkowski’s inequalities we obtain

$$\begin{aligned} \Vert I\Vert _{L^{q,\infty }_{x_1;\bar{x},t}}&\lesssim c^{1/2} \Vert \square _k f\Vert _{L^{1,2}_{x_1;\bar{x},t}} \Vert \tilde{\sigma }\sigma _k(\bar{\xi })J(x_1,a,\bar{\xi })\Vert _{L^\infty _{\bar{\xi }} L^2_\tau L^q_{x_1}}. \end{aligned}$$

By integration by parts we get

$$\begin{aligned} |J(x_1,a,\bar{\xi })| \lesssim \frac{ \chi _{\mathbb {E}_1}(\bar{\xi },\tau ) \tilde{\sigma }\eta _{k_1}(ac^{1/2}) }{(2\tau _2(c,\tau )+c)(1+|x_1|) } \sum _{j=0}^1 \int _{|\xi _1-k_1| \le 3} \left| \partial ^j_{\xi _1} \left( \frac{|\xi |^2}{c}+\tau _2(c,\tau ) \right) ^{-1} \right| \mathrm{d}\xi _1. \end{aligned}$$

Therefore,

$$\begin{aligned}&\Vert \tilde{\sigma }\sigma _k(\bar{\xi })J(x_1,a,\bar{\xi })\Vert _{L^\infty _{\bar{\xi }} L^2_\tau L^q_{x_1}} \nonumber \\&\quad \lesssim \sup _{|\xi -k|_\infty \le 3} \sum _{j=0}^1 \left\| \frac{ \chi _{\mathbb {E}_1}(\bar{\xi },\tau ) \tilde{\sigma }\eta _{k_1}(ac^{1/2}) }{(2\tau _2(c,\tau )+c)(1+|x_1|) } \left| \partial ^j_{\xi _1} \left( \frac{|\xi |^2}{c}+\tau _2(c,\tau ) \right) ^{-1} \right| \right\| _{L^2_\tau }, \end{aligned}$$
(183)

and noticing that \(|ac^{1/2}-k_1| \le 20\) in the support set of \(\tilde{\sigma }\eta _{k_1}(ac^{1/2})\), we can deduce that \(2\tau _2(c,\tau )+c \gtrsim k_1^2\), and finally we obtain

$$\begin{aligned} \Vert \tilde{\sigma }\sigma _k(\bar{\xi })J(x_1,a,\bar{\xi })\Vert _{L^\infty _{\bar{\xi }} L^2_\tau L^q_{x_1}}&\lesssim |k_1|^{-2}. \end{aligned}$$
(184)

The proof for the case \(k_1 \lesssim - K(c)^2\) is similar. Furthermore, in the estimates of \(\square _k \, \psi _2\) and \(\square _k \, \psi _3\) we can check that there is no singularity in \((c^2+P_2(|\xi |)-\tau )^{-1}\) for \(|\xi _1| \ge c^{1/2}\) and \((\bar{\xi },\tau ) \in \mathbb {E}_2 \cup \mathbb {E}_3\). Hence one can argue as in (182)–(184) and conclude. \(\square \)

In the last lemma we proved that \(\square _k \mathscr {A}_2: L^{1,2}_{x_1;(x_j)_{j \ne 1},t} \rightarrow L^{q,\infty }_{x_1;(x_j)_{j \ne 1},t}\). In the next lemma we show that \(\square _k \mathscr {A}_2: L^{1,2}_{x_2;(x_j)_{j \ne 2},t} \rightarrow L^{q,\infty }_{x_1;(x_j)_{j \ne 1},t}\).

Lemma 9

Let \(q \ge 2\), \(\frac{8}{d} < q \le + \infty \), \(k \in \mathbb {Z}^d\) with \(|k_i| \gtrsim c\) and \(h,i \in \{1,\ldots ,d\}\) with \(h \ne i\), then

$$\begin{aligned} \Vert \square _k \, \mathscr {A}_2 f \Vert _{L^{q,\infty }_{x_h;(x_j)_{j \ne h},t}}&\lesssim c^{1+d/2} \left\langle k_i \right\rangle ^{-3/2+1/q} \Vert \square _kf\Vert _{L^{1,2}_{x_i;(x_j)_{j \ne i},t}}, \quad 0 < |t| \lesssim c^2. \end{aligned}$$
(185)

Proof

It clearly suffices to consider the case \(h=1\), \(i=2\) and \(k_2 \gtrsim c\). The proof follows the same lines as that of (178), and we will prove in detail only the parts that are different. For convenience, we denote \(\tilde{\sigma }\xi =(\xi _1,\xi _3,\ldots ,\xi _d)\). We introduce the following regions

$$\begin{aligned} \mathbb {F}_1&= \left\{ \tau -c^2 \le -c^2/4 \right\} , \\ \mathbb {F}_2&= \left\{ -c^2/4 \le \tau -c^2 \le |\tilde{\sigma }\xi |^2 \left( -\frac{|\tilde{\sigma }\xi |^2}{c^2}+1 \right) \right\} , \\ \mathbb {F}_3&= \left\{ \tau -c^2 \ge |\tilde{\sigma }\xi |^2 \left( -\frac{|\tilde{\sigma }\xi |^2}{c^2}+1 \right) \right\} , \end{aligned}$$

and we make the following decomposition

$$\begin{aligned} c^2+P_2(|\xi |)-\tau&= {\left\{ \begin{array}{ll} \left( \frac{|\xi |^2}{c} + \tau _2(c,\tau ) \right) \left( -\frac{\xi _2}{c^{1/2}} + b \right) \left( \frac{\xi _2}{c^{1/2}} + b \right) , &{} (\tilde{\sigma }\xi ,\tau ) \in \mathbb {F}_1, \\ \left( \frac{|\xi |^2}{c} + \tau _2(c,\tau ) \right) \left( -\frac{|\xi |^2}{c} + \tau _2(c,\tau )+c \right) , &{} (\tilde{\sigma }\xi ,\tau ) \in \mathbb {F}_2, \\ - \left( \frac{|\xi |^2}{c} - \frac{c}{2} \right) ^2 + \left( \frac{5}{4} c^2- \tau \right) , &{} (\tilde{\sigma }\xi ,\tau ) \in \mathbb {F}_3, \\ \end{array}\right. } \end{aligned}$$
(186)

where \(b=b(c,\tilde{\sigma }\xi ,\tau ):= ( \tau _2(c,\tau )- |\tilde{\sigma }\xi |^2/c + c)^{1/2}\), \(\tau _2(c,\tau )= c \left( \sqrt{\frac{5}{4}-\frac{\tau }{c^2}} - \frac{1}{2} \right) \). We denote

$$\begin{aligned} \square _k \, \tilde{\sigma }\psi _i&= \mathscr {F}^{-1}_{\tau ,\xi } \frac{ \chi _{\mathbb {F}_i}(\tilde{\sigma }\xi ,\tau ) }{c^2+P_2(|\xi |)-\tau }(\mathscr {F}_{t,x}\square _k \, f)(\tau ,\xi ), \; i=1,2,3. \end{aligned}$$

We estimate only \(\square _k\,\tilde{\sigma }\psi _1\), since by the definition of the regions \(\mathbb {F}_i\) the estimates of the other terms follow more easily, as in the previous lemma.

Set \(\tilde{\sigma }\eta _{k_2}(\xi _2) = \sum _{|l| \le 10} \eta _{k_2+l}(\xi _2)\). First we notice that

$$\begin{aligned} \frac{ \chi _{\mathbb {F}_1}(\tilde{\sigma }\xi ,\tau ) }{c^2+P_2(|\xi |)-\tau }&= \frac{ \chi _{\mathbb {F}_1}(\tilde{\sigma }\xi ,\tau ) }{ (2\tau _2(c,\tau )+c) \left( \frac{|\xi |^2}{c} + \tau _2(c,\tau ) \right) } \\&\quad + \frac{ \chi _{\mathbb {F}_1}(\tilde{\sigma }\xi ,\tau ) }{ 2b(2\tau _2(c,\tau )+c) } \left( \frac{1}{ -\frac{\xi _2}{c^{1/2}} + b } + \frac{1}{ \frac{\xi _2}{c^{1/2}} + b } \right) \\&=: \sum _{j=1}^3 B_j(c,\xi ,\tau ). \end{aligned}$$

According to the above decomposition, we can rewrite \(\square _k \, \tilde{\sigma }\psi _1\) as

$$\begin{aligned}&\square _k \, \tilde{\sigma }\psi _1 \\&\quad = \sum _{j=1}^3 \mathscr {F}^{-1}_{\tau ,\xi } \chi _{\mathbb {F}_1}(\tilde{\sigma }\xi ,\tau ) B_j(\xi ,\tau ) \tilde{\sigma }\eta _{k_2}(bc^{1/2}) (\mathscr {F}_{t,x}\square _k \, f)(\tau ,\xi ) \\&\qquad +\, \mathscr {F}^{-1}_{\tau ,\xi } \frac{ \chi _{\mathbb {F}_1}(\tilde{\sigma }\xi ,\tau ) (1-\tilde{\sigma }\eta _{k_2}(bc^{1/2})) }{c^2+P_2(|\xi |)-\tau }(\mathscr {F}_{t,x}\square _k \, f)(\tau ,\xi ) \\&\quad =: I + \textit{II} + \textit{III} + \textit{IV}. \end{aligned}$$

The estimates of \(\textit{II}\) and \(\textit{III}\) follow in the same way as for (178) by exchanging the roles of \(\xi _1\) and \(\xi _2\). Now we estimate I: Set

$$\begin{aligned} m(\xi ,\tau )&= \frac{ \chi _{\mathbb {F}_1}(\tilde{\sigma }\xi ,\tau )\tilde{\sigma }\sigma _{\bar{k}}(\tilde{\sigma }\xi ) \tilde{\sigma }\eta _{k_2}(bc^{1/2}) }{2b(2\tau _2(c,\tau )+c)}, \end{aligned}$$
(187)

and notice that \(2\tau _2(c,\tau )+c \gtrsim k_2^2\) in the support set of m; hence, for sufficiently large c, we have

$$\begin{aligned} m(\xi ,\tau )&\lesssim \frac{ \chi _{\mathbb {F}_1} \tilde{\sigma }\sigma _k(\xi ) }{k_2^4}, \end{aligned}$$

and therefore,

$$\begin{aligned} \Vert m\Vert _{L^1_{\xi _2}L^2_{\xi _3,\ldots ,\xi _d,\tau } L^{2q/(q-2)}_{\xi _1}} \lesssim |k_2|^{-2}. \end{aligned}$$
(188)

Now, since by Young’s, Hölder’s and Minkowski’s inequalities we have

$$\begin{aligned} \Vert \mathscr {F}^{-1}_{\xi ,\tau }m(\xi ,\tau )(\mathscr {F}_{t,x}\square _k \, f)\Vert _{L^{q,\infty }_{x_1;\bar{x},t}}&\lesssim \Vert \mathscr {F}^{-1}_{\xi ,\tau }m(\xi ,\tau )(\mathscr {F}_{t,x}\square _k \, f)\Vert _{L^q_{x_1} L^1_{\tilde{\sigma }\xi ,\tau }} \nonumber \\&\lesssim \Vert m(\xi ,\tau )(\mathscr {F}_{t,x}\square _k \, f)\Vert _{L^1_{\tilde{\sigma }\xi ,\tau } L^{q{^\prime }}_{\xi _1}} \nonumber \\&\lesssim \Vert m\Vert _{L^1_{\xi _2}L^2_{\xi _3,\ldots ,\xi _d,\tau } L^{2q/(q-2)}_{\xi _1}} \Vert \mathscr {F}_{t,x}\square _k \, f\Vert _{L^\infty _{\xi _2} L^2_{(\xi _j)_{j\ne 2},\tau }} \nonumber \\&\lesssim \Vert m\Vert _{L^1_{\xi _2}L^2_{\xi _3,\ldots ,\xi _d,\tau } L^{2q/(q-2)}_{\xi _1}} \Vert \square _k \, f\Vert _{L^1_{x_2} L^2_{(x_j)_{j\ne 2},t}}, \end{aligned}$$
(189)

we can deduce that

$$\begin{aligned} \Vert I\Vert _{L^{q,\infty }_{x_1;\bar{x},t}}&\lesssim |k_2|^{-2} \Vert \square _k \, f\Vert _{ L^1_{x_2} L^2_{(x_j)_{j \ne 2},t} }. \end{aligned}$$
(190)

Now we estimate \(\textit{IV}\): Set

$$\begin{aligned} m_k(\xi ,\tau ): = \frac{ \chi _{\mathbb {F}_3}(\tilde{\sigma }\xi ,\tau ) \tilde{\sigma }\sigma _k(\xi ) (1-\tilde{\sigma }\eta _{k_2}(b)) }{ c^2+P_2(|\xi |)-\tau }, \;&\; \mathscr {M}_k(f) := \mathscr {F}^{-1}_{\tau ,\xi }m_k(\xi ,\tau )(\mathscr {F}_{t,x}f), \end{aligned}$$
(191)

and notice that \(\mathscr {M}_k(f)\) is the solution of the inhomogeneous equation

$$\begin{aligned} -i\psi _t&= A_{c,2}\psi - \mathscr {F}^{-1}_{\tau ,\xi }m_k(\xi ,\tau )(c^2+P_2(|\xi |)-\tau )(\mathscr {F}_{t,x}f). \end{aligned}$$

Applying (159) (recall that \(k_2 \gtrsim c\)), we have

$$\begin{aligned} \Vert \mathscr {M}_k(f)\Vert _{L^\infty _{x,t}}&\lesssim c^{3/2} \Vert \mathscr {M}_k(f)\Vert _{L^\infty _t L^2_x} \nonumber \\&\lesssim c^{d/2} c |k_2|^{-3/2} \Vert f\Vert _{ L^1_{x_2} L^2_{(x_j)_{j \ne 2},t} } \nonumber \\&= c^{\frac{d}{2} +1} |k_2|^{-3/2} \Vert f\Vert _{ L^1_{x_2} L^2_{(x_j)_{j \ne 2},t} }. \end{aligned}$$
(192)

Next, for \((\xi ,\tau ) \in \text {supp}(m_k)\),

$$\begin{aligned} |c^2+P_2(|\xi |)-\tau |&{\mathop {\gtrsim }\limits ^{(\tilde{\sigma }\xi ,\tau ) \in \mathbb {F}_3}} c^{-1} \left\langle k \right\rangle ^2 |k_2|. \end{aligned}$$
(193)

By the definition of b we have that for \(\frac{|\tilde{\sigma }\xi |}{c} \lesssim \frac{1}{2} (c+\tau _2(c,\tau ))\)

$$\begin{aligned} |c^2+P_2(|\xi |)-\tau |&\gtrsim ( c+\tau _2(c,\tau ) )^{3/2}, \; (\xi ,\tau ) \in \text {supp}(m_k), \end{aligned}$$
(194)

while for \(\frac{|\tilde{\sigma }\xi |}{c} \gtrsim \frac{1}{2} (c+\tau _2(c,\tau ))\) we can exploit the fact that \(|k|_\infty =|k_2| \gtrsim c\) to obtain again that

$$\begin{aligned} |c^2+P_2(|\xi |)-\tau |&\gtrsim ( c+\tau _2(c,\tau ) )^{3/2}, \; (\xi ,\tau ) \in \text {supp}(m_k), \end{aligned}$$
(195)

and by combining (191) with (194)–(195) we obtain

$$\begin{aligned} m_k(\xi ,\tau ) \lesssim c \frac{\chi _{\tau \ge \frac{3}{4}c^2}\tilde{\sigma }\sigma _k(\xi ) }{(|k_2|^{2}+c+\tau _2(c,\tau ))^{3/2} }, \end{aligned}$$
(196)

which gives

$$\begin{aligned} \Vert m_k\Vert _{L^1_{\xi _2}L^2_{\xi _3,\ldots ,\xi _d,\tau }L^\infty _{\xi _1}}&\lesssim c |k_2|^{-1}. \end{aligned}$$
(197)

Therefore, from (191) and (189) we can deduce

$$\begin{aligned} \Vert \mathscr {M}_k(f)\Vert _{L^{2,\infty }_{x_1;\bar{x},t}}&\lesssim c |k_2|^{-1} \Vert f\Vert _{L^1_{x_2}L^2_{(x_j)_{j \ne 2},t}}. \end{aligned}$$
(198)

For any \(q \ge 2\) we obtain by interpolation between (192) and (198)

$$\begin{aligned} \Vert \mathscr {M}_k(f)\Vert _{L^{q,\infty }_{x_1;\bar{x},t}}&\lesssim c^{1+d\left( \frac{1}{2}-\frac{1}{q} \right) } |k_2|^{-3/2+1/q} \Vert f\Vert _{L^1_{x_2}L^2_{(x_j)_{j \ne 2},t}}, \end{aligned}$$
(199)

and replacing f by \(\square _k \, f\) in (199), we finally obtain

$$\begin{aligned} \Vert \textit{IV}\Vert _{L^{q,\infty }_{x_1;\bar{x},t}}&\lesssim c^{1+d\left( \frac{1}{2}-\frac{1}{q} \right) } |k_2|^{-3/2+1/q} \Vert \square _k \, f\Vert _{L^1_{x_2}L^2_{(x_j)_{j \ne 2},t}}. \end{aligned}$$
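For the reader's convenience, the exponents in (199) can be checked by a short computation. This is a sketch which assumes that the interpolation is taken with weight \(\theta =2/q\) on the \(L^{2,\infty }\) bound (198) and weight \(1-\theta \) on the \(L^\infty \) bound (192); the symbol \(\theta \) is introduced here for illustration only.

$$\begin{aligned} \text {power of } c:&\quad \theta \cdot 1 + (1-\theta ) \left( \frac{d}{2}+1 \right) = 1 + \frac{d}{2} \left( 1-\frac{2}{q} \right) = 1 + d \left( \frac{1}{2}-\frac{1}{q} \right) , \\ \text {power of } |k_2|:&\quad \theta \cdot (-1) + (1-\theta ) \left( -\frac{3}{2} \right) = -\frac{3}{2} + \frac{\theta }{2} = -\frac{3}{2}+\frac{1}{q}. \end{aligned}$$

Both agree with the exponents appearing in (199).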

\(\square \)

If we collect (178) and (185), we can deduce

Lemma 10

Let \(q \ge 2\), \(\frac{8}{d} < q \le + \infty \), \(k \in \mathbb {Z}^d\) with \(|k_i| \gtrsim c\) and \(h,i \in \{1,\ldots ,d\}\), then

$$\begin{aligned} \Vert \square _k \, \partial ^2_{x_i}\mathscr {A}_2 f \Vert _{L^{q,\infty }_{x_h;(x_j)_{j \ne h},t}}&\lesssim c^{1+d/2} \left\langle k_i \right\rangle ^{1/2+1/q} \Vert \square _kf\Vert _{L^{1,2}_{x_i;(x_j)_{j \ne i},t}}, \; 0 < |t| \lesssim c^2. \end{aligned}$$
(200)
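A heuristic way to read (200), offered as a sketch rather than as the argument of the original: on the frequency support of \(\square _k\) one has \(|\xi _i| \lesssim \left\langle k_i \right\rangle \), so \(\partial ^2_{x_i}\square _k\) is expected to cost a factor \(\left\langle k_i \right\rangle ^2\) up to a bounded Fourier multiplier (this multiplier bound is assumed here). The power of c is unchanged, and the power of \(\left\langle k_i \right\rangle \) in (185) shifts as

$$\begin{aligned} \left\langle k_i \right\rangle ^{2} \cdot \left\langle k_i \right\rangle ^{-3/2+1/q} = \left\langle k_i \right\rangle ^{1/2+1/q}, \end{aligned}$$

which is the exponent in (200).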

Remark 21

In the general case \(r>2\) we have

  1.

    Let \(q \ge 2\), \(\frac{4r}{d} < q \le + \infty \) and \(k \in \mathbb {Z}^d\) with \(|k| \gtrsim K(c)\), then

    $$\begin{aligned}&\Vert \square _k \, \mathscr {U}_r(t)\psi _0 \Vert _{L^{q,\infty }_{x_i;(x_j)_{j \ne i},t}} \lesssim c^{d\left( 1-\frac{1}{r}\right) } \left\langle k \right\rangle ^{1/q} \Vert \square _k\psi _0\Vert _{L^2}, \nonumber \\&\quad 0 < |t| \lesssim c^{2(r-1)}, \; \forall i=1,\ldots ,d. \end{aligned}$$
    (201)
  2.

    Let \(q \ge 2\), \(\frac{4}{d} < q \le + \infty \) and \(k \in \mathbb {Z}^d\) with \(|k| \lesssim K(c)\), then

    $$\begin{aligned} \Vert \square _k \, \mathscr {U}_r(t)\psi _0 \Vert _{L^{q,\infty }_{x_i;(x_j)_{j \ne i},t}}&\lesssim c \left\langle k \right\rangle ^{1/q} \Vert \square _k\psi _0\Vert _{L^2}, \quad \forall i=1,\ldots ,d. \end{aligned}$$
    (202)
  3.

    Let \(q \ge 2\), \(\frac{4r}{d} < q \le + \infty \) and \(k \in \mathbb {Z}^d\) with \(|k_i| \gtrsim K(c)^2\) and \(i \in \{1,\ldots ,d\}\), then

    $$\begin{aligned} \Vert \square _k \, \mathscr {A}_r f \Vert _{L^{q,\infty }_{x_i;(x_j)_{j \ne i},t}}&\lesssim c^{r-1+d\left( 1-\frac{1}{r}\right) } \left\langle k_i \right\rangle ^{-r+1/2+1/q} \Vert \square _kf\Vert _{L^{1,2}_{x_i;(x_j)_{j \ne i},t}}, \quad 0 < |t| \lesssim c^{2(r-1)}. \end{aligned}$$
    (203)
  4.

    Let \(q \ge 2\), \(\frac{4r}{d} < q \le + \infty \), \(k \in \mathbb {Z}^d\) with \(|k_i| \gtrsim c\) and \(h,i \in \{1,\ldots ,d\}\) with \(h \ne i\), then

    $$\begin{aligned}&\Vert \square _k \, \mathscr {A}_r f \Vert _{L^{q,\infty }_{x_h;(x_j)_{j \ne h},t}}\nonumber \\&\quad \lesssim c^{r-1+d\left( 1-\frac{1}{r}\right) } \left\langle k_i \right\rangle ^{-r+1/2+1/q} \Vert \square _kf\Vert _{L^{1,2}_{x_i;(x_j)_{j \ne i},t}}, \quad 0 < |t| \lesssim c^{2(r-1)}. \end{aligned}$$
    (204)
  5.

    Let \(q \ge 2\), \(\frac{4r}{d} < q \le + \infty \), \(k \in \mathbb {Z}^d\) with \(|k_i| \gtrsim c\) and \(h,i \in \{1,\ldots ,d\}\), then

    $$\begin{aligned}&\Vert \square _k \, \partial ^{2(r-1)}_{x_i}\mathscr {A}_r f \Vert _{L^{q,\infty }_{x_h;(x_j)_{j \ne h},t}} \nonumber \\&\quad \lesssim c^{r-1+d\left( 1-\frac{1}{r}\right) } \left\langle k_i \right\rangle ^{r-3/2+1/q} \Vert \square _kf\Vert _{L^{1,2}_{x_i;(x_j)_{j \ne i},t}}, \quad 0 < |t| \lesssim c^{2(r-1)}. \end{aligned}$$
    (205)
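As a quick sanity check (not part of the original remark), setting \(r=2\) in (204) and (205) recovers the exponents of (185) and (200):

$$\begin{aligned} c^{r-1+d\left( 1-\frac{1}{r}\right) } \Big |_{r=2}&= c^{1+d/2},&\left\langle k_i \right\rangle ^{-r+1/2+1/q} \Big |_{r=2}&= \left\langle k_i \right\rangle ^{-3/2+1/q}, \\ \left\langle k_i \right\rangle ^{r-3/2+1/q} \Big |_{r=2}&= \left\langle k_i \right\rangle ^{1/2+1/q},&c^{2(r-1)} \Big |_{r=2}&= c^2, \end{aligned}$$

and the threshold \(q>4r/d\) becomes \(q>8/d\).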

8.2.5 Proof of the local well-posedness

In this subsection we use smoothing estimates, Strichartz estimates and maximal function estimates in order to prove Proposition 7. In order to do so, it seems necessary to estimate norms in which partial derivatives and anisotropic Lebesgue spaces have different directions, for example \(\Vert \partial ^2_{x_1} \square _k \, \mathscr {A}f\Vert _{L^{2,\infty }_{x_2;(x_j)_{j \ne 2},t}}\) with \(|k|_\infty = |k_3|\). As usual, we show results for the case \(r=2\), and then we point out the modifications for the case \(r>2\).

Lemma 11

Let \(i,l,m \in \{1,\ldots ,d\}\), \(1 \le p,q \le +\infty \). Assume that \(k=(k_1,\ldots ,k_d)\) with \(|k|_\infty = |k_m| \gtrsim c\), then

$$\begin{aligned} \Vert \square _k \, \partial ^2_{x_l}f\Vert _{L^{p,q}_{x_i;(x_j)_{j \ne i},t}}&\lesssim \Vert \square _k \, \partial ^2_{x_m}f\Vert _{L^{p,q}_{x_i;(x_j)_{j \ne i},t}}. \end{aligned}$$
(206)

Proof

$$\begin{aligned} \Vert \square _k \, \partial ^2_{x_l}f\Vert _{L^{p,q}_{x_i;(x_j)_{j \ne i},t}}&\lesssim \sum _{|h_l|_\infty ,|h_m|_\infty \le 1} \left\| \mathscr {F}^{-1}_{\xi _l,\xi _m} \left( \left( \frac{\xi _l}{\xi _m} \right) ^2 \eta _{k_l+h_l}(\xi _l)\eta _{k_m+h_m}(\xi _m) \right) \right\| _{L^1(\mathbb {R}^2)} \\&\quad \times \Vert \square _k \, \partial ^2_{x_m}f\Vert _{L^{p,q}_{x_i;(x_j)_{j \ne i},t}} \\&\lesssim \Vert \square _k \, \partial ^2_{x_m}f\Vert _{L^{p,q}_{x_i;(x_j)_{j \ne i},t}}. \end{aligned}$$

\(\square \)
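The mechanism behind this proof can be sketched as follows, under the assumption (made here for illustration only) that each \(\eta _k\) is a smooth bump supported in \([k-1,k+1]\). On the Fourier side the two derivatives are related by a multiplier,

$$\begin{aligned} \xi _l^2 \, \widehat{f}(\xi )&= \left( \frac{\xi _l}{\xi _m} \right) ^2 \xi _m^2 \, \widehat{f}(\xi ), \end{aligned}$$

and on the support of \(\eta _{k_l+h_l}(\xi _l)\eta _{k_m+h_m}(\xi _m)\) with \(|h_l|,|h_m| \le 1\) one has

$$\begin{aligned} |\xi _l| \le |k_l|+2 \le |k_m|+2, \quad |\xi _m| \ge |k_m|-2, \quad \text {hence} \quad \left| \frac{\xi _l}{\xi _m} \right| \le \frac{|k_m|+2}{|k_m|-2} \lesssim 1 \end{aligned}$$

as soon as \(|k_m| \gtrsim c\) is large. If the derivatives of the localized symbol are bounded uniformly in k as well, its inverse Fourier transform has \(L^1(\mathbb {R}^2)\) norm bounded uniformly in k, and Young's inequality for convolutions gives (206).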

Lemma 12

  1.

    Let \((a,b)\) be order-2 admissible, \(i \in \{1,\ldots ,d\}\), \(q \ge 2\), \(\frac{8}{d}< q <+\infty \) and \(k \in \mathbb {Z}^d\) with \(|k|_\infty \gtrsim K(c)\), then

    $$\begin{aligned} \Vert \square _k \; \partial ^\alpha _{x_i} \mathscr {A}_2 f \Vert _{L^{q,\infty }_{x_i;(x_j)_{j \ne i},t}} \lesssim c^{\frac{d}{2} + \frac{2}{a}} \left\langle |k|_\infty \right\rangle ^{\alpha +1/q} \Vert \square _k \, f\Vert _{L^{a{^\prime }}_t L^{b{^\prime }}_x}, \quad 0 < |t| \lesssim c^2. \end{aligned}$$
    (207)
  2.

    Let \((a,b)\) be Schrödinger-admissible, \(i \in \{1,\ldots ,d\}\), then

    $$\begin{aligned}&\Vert \square _k \partial ^2_{x_i}\mathscr {A}_2f \Vert _{L^a_t L^b_x} \lesssim c^{1+4/p} \left\langle |k|_\infty \right\rangle ^{1/2} \Vert \square _k f \Vert _{L^{1,2}_{x_i;(x_j)_{j \ne i},t}}, \quad 0 < |t| \lesssim c^2 \end{aligned}$$
    (208)
    $$\begin{aligned}&\Vert \square _k \partial ^2_{x_i}\mathscr {A}_2f \Vert _{L^{\infty ,2}_{x_i;(x_j)_{j \ne i},t}} \lesssim c^{1+4/p} \left\langle |k|_\infty \right\rangle ^{1/2} \Vert \square _k f \Vert _{L^{a{^\prime }}_t L^{b{^\prime }}_x}, \quad 0 < |t| \lesssim c^2 . \end{aligned}$$
    (209)

Proof

Denote

$$\begin{aligned} \mathscr {L}_k(f,\psi )&= \int \left( \square _k \, \int \mathscr {U}_2(t-\tau )f(\tau )\mathrm{d}\tau , \psi (t) \right) \mathrm{d}t. \end{aligned}$$

By duality and the maximal function estimate (169)

$$\begin{aligned} |\mathscr {L}_k(f,\psi )|&\le \Vert \square _k \, f\Vert _{L^{q{^\prime }}_{x_1} L^1_{\bar{x},t}} \sum _{|l|_\infty \le 1} \left\| \square _{k+l} \int \mathscr {U}_2(t-\tau )\psi (t) \mathrm{d}t \right\| _{L^q_{x_1} L^\infty _{\bar{x},t}} \\&\le \Vert \square _k \, f\Vert _{L^{q{^\prime }}_{x_1} L^1_{\bar{x},t}} \sum _{|l|_\infty \le 1} \int \Vert \square _{k+l} \mathscr {U}_2(t-\tau )\psi (t) \mathrm{d}t \Vert _{L^q_{x_1} L^\infty _{\bar{x},t}} \\&{\mathop {\le }\limits ^{(169)}} c^{d/2} \left\langle k \right\rangle ^{1/q} \Vert \square _k \, f\Vert _{L^{q{^\prime }}_{x_1} L^1_{\bar{x},t}} \Vert \psi \Vert _{L^1_t L^2_x}, \end{aligned}$$

so by duality we obtain

$$\begin{aligned} \left\| \square _k \, \int \mathscr {U}_2(t-\tau )f(\tau )\mathrm{d}\tau \right\| _{L^\infty _tL^2_x}&\lesssim c^{d/2} \left\langle k \right\rangle ^{1/q} \Vert \square _k \, f\Vert _{L^{q{^\prime }}_{x_1} L^1_{\bar{x},t}}. \end{aligned}$$
(210)

Therefore, by duality, Strichartz estimates (166) and (210)

$$\begin{aligned} |\mathscr {L}_k(f,\psi )|&\le \left\| \square _k \, \int \mathscr {U}_2(-\tau )f(\tau )\mathrm{d}\tau \right\| _{L^2_x} \left\| \square _k \, \int \mathscr {U}_2(-t)\psi (t)\mathrm{d}t \right\| _{L^2_x} \nonumber \\&\lesssim c^{d/2} \left\langle k \right\rangle ^{1/q} \Vert f\Vert _{L^{q{^\prime }}_{x_1} L^1_{\bar{x},t}} \; c^{(1-1/r)2r/a} \Vert \square _k \, \psi \Vert _{L^{a{^\prime }}_t L^{b{^\prime }}_x}, \end{aligned}$$
(211)

which implies (207) for \(q>2\) or \(a>2\). In the case \(a=q=2\), (207) can be directly deduced from (169). Furthermore, by (168), (157) and (159) we get

$$\begin{aligned} \mathscr {L}_k(\partial ^2_{x_i}f,\psi )&\lesssim c^{1+4/p} \left\langle |k|_\infty \right\rangle ^{1/2} \Vert \square _k f\Vert _{ L^{1,2}_{x_i;(x_j)_{j \ne i},t} } \; c^{4/p} \Vert \psi \Vert _{L^{a{^\prime }}_t L^{b{^\prime }}_x}, \end{aligned}$$
(212)

and we can deduce (208); by exchanging f and \(\psi \), we get (209). \(\square \)
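The duality steps in the proof above rest on a standard factorization. The following is a sketch under assumptions not restated in this section: \(\mathscr {U}_2(t)\) is a unitary group on \(L^2_x\), \(\square _k\) commutes with it, and the \(\tau \)-integral is taken over the whole time interval, as written in the definition of \(\mathscr {L}_k\).

$$\begin{aligned} \mathscr {L}_k(f,\psi )&= \int \int \left( \square _k \, \mathscr {U}_2(t)\mathscr {U}_2(-\tau )f(\tau ), \psi (t) \right) \mathrm{d}\tau \, \mathrm{d}t \\&= \left( \square _k \int \mathscr {U}_2(-\tau )f(\tau )\mathrm{d}\tau , \int \mathscr {U}_2(-t)\psi (t)\mathrm{d}t \right) _{L^2_x}. \end{aligned}$$

Here the group property \(\mathscr {U}_2(t-\tau )=\mathscr {U}_2(t)\mathscr {U}_2(-\tau )\) and the adjoint relation \(\mathscr {U}_2(t)^*=\mathscr {U}_2(-t)\) were used. The Cauchy–Schwarz inequality then bounds the pairing by the product of the \(L^2_x\) norms of the two factors, each of which can be estimated separately by the dual form of a homogeneous estimate.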

We now summarize the results we will use to prove the local well-posedness of (112). We omit the proof, since it follows from the results of the previous subsections together with (206).

Proposition 14

Let \(d \ge 2\), \(8/d \le p < + \infty \), \(2 \le q <+\infty \), \(q > 8/d\), \(k \in \mathbb {Z}^d\) with \(|k|_\infty = |k_i| \gtrsim c\), \(h,i,l \in \{1,\ldots ,d\}\). Then

$$\begin{aligned} \left\| \square _k D_{x_i}^{3/2}\mathscr {U}_2(t)\psi _0 \right\| _{ L^{\infty ,2}_{x_i;(x_j)_{j\ne i},t} }&\lesssim c \Vert \square _k \psi _0\Vert _{L^2}, \end{aligned}$$
(213)
$$\begin{aligned} \Vert \square _k \, \mathscr {U}_2(t)\psi _0 \Vert _{L^{q,\infty }_{x_i;(x_j)_{j \ne i},t}}&\lesssim c^{d/2} \left\langle k \right\rangle ^{1/q} \Vert \square _k\psi _0\Vert _{L^2}, \quad 0 < |t| \lesssim c^2, \end{aligned}$$
(214)
$$\begin{aligned} \Vert \square _k \mathscr {U}_2(t)\phi _0 \Vert _{L^\infty _t L^2_x \cap L^{2+p}_{t,x}}&\lesssim c^{\frac{4}{p(p+2)}} \Vert \square _k\phi _0\Vert _{L^2}, \quad 0 < |t| \lesssim c^2, \end{aligned}$$
(215)
$$\begin{aligned} \Vert \square _k \partial ^2_{x_l} \mathscr {A}_2 f \Vert _{ L^{\infty ,2}_{x_i;(x_j)_{j\ne i},t} }&\lesssim \Vert \square _k \, f\Vert _{ L^{1,2}_{x_i;(x_j)_{j\ne i},t} }, \end{aligned}$$
(216)
$$\begin{aligned} \Vert \square _k \, \partial ^2_{x_l}\mathscr {A}_2 f \Vert _{L^{q,\infty }_{x_h;(x_j)_{j \ne h},t}}&\lesssim c^{1+d/2} \left\langle k_i \right\rangle ^{1/2+1/q} \Vert \square _kf\Vert _{L^{1,2}_{x_i;(x_j)_{j \ne i},t}}, \quad 0 < |t| \lesssim c^2, \end{aligned}$$
(217)
$$\begin{aligned} \Vert \square _k \partial ^2_{x_l}\mathscr {A}_2f \Vert _{L^\infty _t L^2_x \cap L^{2+p}_{t,x}}&\lesssim c^{1+\frac{4}{p(p+2)}} \left\langle k_i \right\rangle ^{1/2} \Vert \square _k f \Vert _{L^{1,2}_{x_i;(x_j)_{j \ne i},t}}, \quad 0 < |t| \lesssim c^2 \end{aligned}$$
(218)
$$\begin{aligned} \Vert \square _k \partial ^2_{x_l}\mathscr {A}_2f \Vert _{L^{\infty ,2}_{x_i;(x_j)_{j \ne i},t}}&\lesssim c^{1+\frac{4}{p(p+2)}} \left\langle k_i \right\rangle ^{1/2} \Vert \square _k f \Vert _{ L^{(2+p)/(1+p)}_{t,x} }, \quad 0 < |t| \lesssim c^2 ,\end{aligned}$$
(219)
$$\begin{aligned} \Vert \square _k \; \partial ^2_{x_l} \mathscr {A}_2 f \Vert _{L^{q,\infty }_{x_i;(x_j)_{j \ne i},t}}&\lesssim c^{\frac{d}{2} + \frac{2}{p+2} + \frac{4}{p(p+2)} } \left\langle k_i \right\rangle ^{2+1/q} \Vert \square _k \, f\Vert _{ L^{(2+p)/(1+p)}_{t,x} }, \quad 0 < |t| \lesssim c^2, \end{aligned}$$
(220)
$$\begin{aligned} \Vert \square _k \; \mathscr {A}_2 f \Vert _{L^\infty _tL^2_x \cap L^{2+p}_{t,x}}&\lesssim c^{\frac{8}{p(p+2)}} \Vert \square _k \, f\Vert _{L^{(2+p)/(1+p)}_{t,x}}. \end{aligned}$$
(221)

For the case \(r>2\) we have the following results.

Remark 22

  1.

    Let \((a,b)\) be order-r admissible, \(i \in \{1,\ldots ,d\}\), \(q \ge 2\), \(\frac{4r}{d}< q <+\infty \) and \(k \in \mathbb {Z}^d\) with \(|k|_\infty \gtrsim K(c)\), then

    $$\begin{aligned} \Vert \square _k \; \partial ^\alpha _{x_i} \mathscr {A}_r f \Vert _{L^{q,\infty }_{x_i;(x_j)_{j \ne i},t}} \lesssim c^{d\left( 1-\frac{1}{r}\right) + \left( 1-\frac{1}{r}\right) \frac{2r}{a} } \left\langle |k|_\infty \right\rangle ^{\alpha +1/q} \Vert \square _k \, f\Vert _{L^{a{^\prime }}_t L^{b{^\prime }}_x}, \quad 0 < |t| \lesssim c^{2(r-1)}. \end{aligned}$$
    (222)
  2.

    Let \((a,b)\) be Schrödinger-admissible, \(i \in \{1,\ldots ,d\}\), then

    $$\begin{aligned} \Vert \square _k \partial ^{2(r-1)}_{x_i}\mathscr {A}_rf \Vert _{L^a_t L^b_x}&\lesssim c^{r-1+2r/a} \left\langle |k|_\infty \right\rangle ^{r-3/2} \Vert \square _k f \Vert _{L^{1,2}_{x_i;(x_j)_{j \ne i},t}}, \quad 0 < |t| \lesssim c^{2(r-1)} \end{aligned}$$
    (223)
    $$\begin{aligned} \Vert \square _k \partial ^{2(r-1)}_{x_i}\mathscr {A}_rf \Vert _{L^{\infty ,2}_{x_i;(x_j)_{j \ne i},t}}&\lesssim c^{r-1+2r/a} \left\langle |k|_\infty \right\rangle ^{r-3/2} \Vert \square _k f \Vert _{L^{a{^\prime }}_t L^{b{^\prime }}_x}, \quad 0 < |t| \lesssim c^{2(r-1)} . \end{aligned}$$
    (224)

Proposition 15

Let \(d \ge 2\), \(4r/d \le p < + \infty \), \(2 \le q <+\infty \), \(q > 4r/d\), \(k \in \mathbb {Z}^d\) with \(|k|_\infty = |k_i| \gtrsim c\), \(h,i,l \in \{1,\ldots ,d\}\). Then

$$\begin{aligned} \left\| \square _k D_{x_i}^{r-1/2}\mathscr {U}_r(t)\psi _0 \right\| _{ L^{\infty ,2}_{x_i;(x_j)_{j\ne i},t} }&\lesssim c^{r-1} \Vert \square _k \psi _0\Vert _{L^2}, \end{aligned}$$
(225)
$$\begin{aligned} \Vert \square _k \, \mathscr {U}_r(t)\psi _0 \Vert _{L^{q,\infty }_{x_i;(x_j)_{j \ne i},t}}&\lesssim c^{d\left( 1-\frac{1}{r}\right) } \left\langle k \right\rangle ^{1/q} \Vert \square _k\psi _0\Vert _{L^2}, \; 0 < |t| \lesssim c^{2(r-1)}, \end{aligned}$$
(226)
$$\begin{aligned} \Vert \square _k \mathscr {U}_r(t)\phi _0 \Vert _{L^\infty _t L^2_x \cap L^{2(r-1)+p}_{t,x}}&\lesssim c^{\frac{4(r-1)^2}{p(p+2(r-1))}} \Vert \square _k\phi _0\Vert _{L^2(\mathbb {R}^d)}, \; 0 < |t| \lesssim c^{2(r-1)}, \end{aligned}$$
(227)
$$\begin{aligned} \Vert \square _k \partial ^{2(r-1)}_{x_l} \mathscr {A}_r f \Vert _{ L^{\infty ,2}_{x_i;(x_j)_{j\ne i},t} }&\lesssim \Vert \square _k \, f\Vert _{ L^{1,2}_{x_i;(x_j)_{j\ne i},t} }, \end{aligned}$$
(228)
$$\begin{aligned} \Vert \square _k \, \partial ^{2(r-1)}_{x_l}\mathscr {A}_r f \Vert _{L^{q,\infty }_{x_h;(x_j)_{j \ne h},t}}&\lesssim c^{r-1+d\left( 1-\frac{1}{r}\right) } \left\langle k_i \right\rangle ^{r-3/2+1/q} \Vert \square _kf\Vert _{L^{1,2}_{x_i;(x_j)_{j \ne i},t}}, \nonumber \\&\qquad 0 < |t| \lesssim c^{2(r-1)}, \end{aligned}$$
(229)
$$\begin{aligned} \Vert \square _k \partial ^{2(r-1)}_{x_l}\mathscr {A}_rf \Vert _{L^\infty _t L^2_x \cap L^{2(r-1)+p}_{t,x}}&\lesssim c^{ r-1+\frac{4(r-1)^2}{p(p+2(r-1))} } \left\langle k_i \right\rangle ^{r-3/2} \Vert \square _k f \Vert _{L^{1,2}_{x_i;(x_j)_{j \ne i},t}}, \nonumber \\&\qquad 0 < |t| \lesssim c^{2(r-1)} \end{aligned}$$
(230)
$$\begin{aligned} \Vert \square _k \partial ^{2(r-1)}_{x_l}\mathscr {A}_rf \Vert _{L^{\infty ,2}_{x_i;(x_j)_{j \ne i},t}}&\lesssim c^{ r-1+\frac{4(r-1)^2}{p(p+2(r-1))} } \left\langle k_i \right\rangle ^{r-3/2} \Vert \square _k f \Vert _{ L^{ \frac{2(r-1)+p}{2r-3+p} }_{t,x} }, \nonumber \\&\qquad 0 < |t| \lesssim c^{2(r-1)} , \end{aligned}$$
(231)
$$\begin{aligned} \Vert \square _k \; \partial ^{2(r-1)}_{x_l} \mathscr {A}_r f \Vert _{L^{q,\infty }_{x_i;(x_j)_{j \ne i},t}}&\lesssim c^{ \frac{d}{2} + \frac{2r}{p+2(r-1)} + \frac{4(r-1)^2}{p(p+2(r-1))} } \left\langle k_i \right\rangle ^{2(r-1)+1/q} \Vert \square _k \, f\Vert _{ L^{ \frac{2(r-1)+p}{2r-3+p} }_{t,x} }, \nonumber \\&\qquad 0 < |t| \lesssim c^{2(r-1)}, \end{aligned}$$
(232)
$$\begin{aligned} \Vert \square _k \; \mathscr {A}_r f \Vert _{L^\infty _tL^2_x \cap L^{2(r-1)+p}_{t,x}}&\lesssim c^{\frac{8(r-1)^2}{p(p+2(r-1))}} \Vert \square _k \, f\Vert _{L^{ \frac{2(r-1)+p}{2r-3+p} }_{t,x}}. \end{aligned}$$
(233)

For convenience, we state some technical results related to nonlinear mapping estimates. For \(i=1,\ldots ,d\) and \(N \in \mathbb {N}\) we set

$$\begin{aligned} \mathbb {B}_{i,1}^{(N)}&:= \left\{ \left( k^{(1)},\ldots ,k^{(N)}\right) \in (\mathbb {Z}^d)^N : \max \left( |k^{(1)}_i|,\ldots ,|k^{(N)}_i|\right) \gtrsim c \right\} , \\ \mathbb {B}_{i,2}^{(N)}&:= \left\{ \left( k^{(1)},\ldots ,k^{(N)}\right) \in (\mathbb {Z}^d)^N : \max \left( |k^{(1)}_i|,\ldots ,|k^{(N)}_i|\right) \lesssim c \right\} . \end{aligned}$$

Lemma 13

Let \(s \ge 0\), \(N \ge 3\), \(i \in \{1,\ldots ,d\}\), then

$$\begin{aligned}&\left\| \sum _{\mathbb {B}_{i,1}^{(N)}} \square _{k^{(1)}} \, \psi _1 \cdots \square _{k^{(N)}} \, \psi _N \right\| _{ l^{1,s}_{\square ,i,c}(L^{1,2}_{x_i,(x_j)_{j \ne i},t}) } \nonumber \\&\quad \lesssim \sum _{\alpha =1}^N \Vert \psi _\alpha \Vert _{ \cap _{h=1}^d l^{1,s}_{\square ,h,c}(L^{\infty ,2}_{x_h,(x_j)_{j \ne h},t}) } \prod _{ \begin{array}{c} \beta =1,\ldots ,N \\ \beta \ne \alpha \end{array} } \Vert \psi _\beta \Vert _{ \cap _{h=1}^d l^{1}_{\square }(L^{N-1,\infty }_{x_h,(x_j)_{j \ne h},t}) } . \end{aligned}$$
(234)

Proof

See proof of Lemma 3.1 in [50]. \(\square \)

Lemma 14

Let \(N \ge 1\) and \(i \in \{1,\ldots ,d\}\), and assume that \(1 \le p,q,p_1,q_1,\ldots ,p_N,q_N \le +\infty \) satisfy

$$\begin{aligned} \frac{1}{p} = \frac{1}{p_1} + \cdots + \frac{1}{p_N}, \quad \frac{1}{q} = \frac{1}{q_1} + \cdots + \frac{1}{q_N}, \end{aligned}$$

then

$$\begin{aligned} \left\| \sum _{\mathbb {B}_{i,2}^{(N)}} \square _{k^{(1)}} \, \psi _1 \cdots \square _{k^{(N)}} \, \psi _N \right\| _{ l^1_{\square ,i,c}(L^q_t L^p_x) }&\lesssim c^d \, N^d \, \sum _{\mathbb {B}_{i,2}^{(N)}} \prod _{\alpha =1}^N \Vert \square _{k^{(\alpha )}} \, \psi _\alpha \Vert _{L^{q_\alpha }_t L^{p_\alpha }_x}. \end{aligned}$$
(235)

Proof

See proof of Lemma 3.3 in [50]. \(\square \)

Lemma 15

Let \(s \ge 0\), \(N \ge 1\) and \(i \in \{1,\ldots ,d\}\), and assume that \(1 \le p,q,p_1,q_1,\ldots ,p_N,q_N \le +\infty \) satisfy

$$\begin{aligned} \frac{1}{p} = \frac{1}{p_1} + \cdots + \frac{1}{p_N}, \quad \frac{1}{q} = \frac{1}{q_1} + \cdots + \frac{1}{q_N}, \end{aligned}$$

then

$$\begin{aligned} \Vert \psi _1 \cdots \psi _N \Vert _{l^{1,s}_{\square }(L^p_t L^q_x)}&\lesssim N^d \prod _{i=1}^N \Vert \psi _i \Vert _{l^{1,s}_{\square }(L^{p_i}_t L^{q_i}_x)}. \end{aligned}$$
(236)

Proof

See proof of Lemma 8.2 in [59]. \(\square \)
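The exponent conditions of Lemmas 14 and 15 are of Hölder type, and they can be checked mechanically. The following Python sketch (the helper name `hoelder_target` is ours, and `None` is used as a stand-in for \(+\infty \)) computes the exponent \(p\) determined by \(1/p = 1/p_1 + \cdots + 1/p_N\). As an example, and under the assumption that one places \(m+1\) factors in \(L^{2+m}\) and any remaining factors in \(L^\infty \), it returns the exponent \(\frac{2+m}{1+m}\) that appears in the estimates of this section.

```python
from fractions import Fraction


def hoelder_target(exponents):
    """Return p with 1/p = sum_i 1/p_i, using exact rational arithmetic.

    `None` in the input stands for the exponent +infinity (contributing 0);
    `None` is returned if every factor is in L^infinity.
    """
    inverse = sum(
        (Fraction(1) / Fraction(p) for p in exponents if p is not None),
        Fraction(0),
    )
    return None if inverse == 0 else 1 / inverse


m = 3
# (m + 1) factors in L^{m+2}: the product lies in L^{(m+2)/(m+1)}.
p = hoelder_target([m + 2] * (m + 1))
# Two additional factors in L^infinity do not change the target exponent.
p_with_sup = hoelder_target([m + 2] * (m + 1) + [None, None])
print(p, p_with_sup)
```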

Proof

(Proposition 7, part (i), case \(r=2\)) Since the nonlinearity contains terms of the form \((\partial _x^\alpha \psi )^\beta \) with \(|\alpha | \le 2\), \(|\beta | \ge m+1\), we introduce the space

$$\begin{aligned} D&:= \left\{ \psi \in \mathscr {S}{^\prime }: \Vert \psi \Vert _D := \sum _{|\alpha | \le 2} \sum _{l=1}^3 \sum _{i,j=1}^d \rho _l^{(i)}(\partial _{x_j}^\alpha \psi ) \lesssim c^{-\delta _0} \right\} , \end{aligned}$$

where

$$\begin{aligned} \rho _1^{(i)}(\psi )&:= \Vert \psi \Vert _{l^{1,s-r+1/2+1/m}_{\square ,i,c}(L^{\infty ,2}_{x_i;(x_j)_{j \ne i},t})}, \\ \rho _2^{(i)}(\psi )&:= \Vert \psi \Vert _{l^{1,s}_{\square }(L^{m,\infty }_{x_i;(x_j)_{j \ne i},t})},\\ \rho _3^{(i)}(\psi )&:= \Vert \psi \Vert _{l^{1,s+1/m}_{\square }(L^\infty _t L^2_x \cap L^{2+m}_{t,x})}, \end{aligned}$$

and \(\delta _0>0\) will be chosen later.

Since \(\Vert \psi \Vert _D=\Vert \bar{\psi }\Vert _D\), without loss of generality we can assume that the nonlinearity contains only terms of the form

$$\begin{aligned} \psi ^{\beta _0}(\partial _x^{\alpha _1}\psi )^{\beta _1}(\partial _x^{\alpha _2}\psi )^{\beta _2}&=: \varPsi _1\ldots \varPsi _R, \end{aligned}$$

where \(R:=|\beta |=\beta _0+|\beta _1|+|\beta _2|\), \(|\alpha _i|=i\) (\(i=1,2\)).

To prove the first part of Proposition 7 we will show that the map

$$\begin{aligned}&\mathscr {F}: D \rightarrow D, \\&\quad \psi (t) \mapsto \mathscr {U}_2(t)\psi _0+i \mathscr {A}_2P\left( \left( \partial ^\alpha _x\psi \right) _{|\alpha | \le 2}, \left( \partial ^\alpha _x\bar{\psi }\right) _{|\alpha | \le 2} \right) \end{aligned}$$

is a contraction mapping.

First, by Proposition 14 we have

$$\begin{aligned} \Vert \mathscr {U}_2(t)\psi _0\Vert _D&\lesssim c^{ \frac{d}{2} + \frac{4}{m(m+2)} } \Vert \psi _0\Vert _{M^{s+3+1/m}_{2,1}}. \end{aligned}$$

Now, for the estimate of \(\rho _1^{(i)}(\mathscr {A}_2 \partial ^\alpha _{x_j}F)\) (\(i,j=1,\ldots ,d\)) it suffices to estimate \(\rho _1^{(1)}(\mathscr {A}_2 \partial ^\alpha _{x_1}F)\): Indeed, by (206)

$$\begin{aligned} \rho _1^{(1)}(\mathscr {A}_2 \partial ^\alpha _{x_2}F)&\lesssim \rho _1^{(1)}(\mathscr {A}_2 \partial ^\alpha _{x_1}F). \end{aligned}$$

Using frequency-uniform decomposition, we write

$$\begin{aligned} \square _k(\varPsi _1 \cdots \varPsi _R)&= \sum _{\mathbb {B}^{(R)}_{1,1}} \square _k( \square _{k^{(1)}} \, \varPsi _1 \cdots \square _{k^{(R)}} \, \varPsi _R) + \sum _{\mathbb {B}^{(R)}_{1,2}} \square _k( \square _{k^{(1)}} \, \varPsi _1 \cdots \square _{k^{(R)}} \, \varPsi _R). \end{aligned}$$

By exploiting (216) and (234) for the first sum and (219) and (236) for the second sum we obtain

$$\begin{aligned} \rho _1^{(1)}(\mathscr {A}_2 \, \partial _{x_1}^{\alpha }(\varPsi _1\cdots \varPsi _R))&\lesssim \left\| \sum _{ \mathbb {B}^{(R)}_{1,1} } \square _{k^{(1)}} \, \varPsi _1 \cdots \square _{k^{(R)}} \, \varPsi _R \right\| _{l^{1,s-r+1/2+1/m}_{\square ,1,c}\left( L^{1,2}_{x_1,(x_j)_{j \ne 1},t}\right) } \\&\quad + c^{1+\frac{4}{R^2-1}} \, \left\| \sum _{ \mathbb {B}^{(R)}_{1,2} } \square _{k^{(1)}} \, \varPsi _1 \cdots \square _{k^{(R)}} \, \varPsi _R \right\| _{ l^1_{\square ,1,c}( L^{\frac{R+1}{R}}_{t,x} ) } \\&\lesssim c^{1+\frac{4}{R^2-1}+d} \, R^d \Vert \psi \Vert _D^R. \end{aligned}$$

Next, we estimate \(\rho _2^{(1)}(\mathscr {A}_2(\varPsi _1 \cdots \varPsi _R))\) and \(\rho _3^{(1)}(\mathscr {A}_2(\varPsi _1 \cdots \varPsi _R))\). By (221) and (220) we have

$$\begin{aligned} \sum _{j=2}^3 \rho _j^{(1)}(\mathscr {A}_2(\varPsi _1 \cdots \varPsi _R))&\lesssim c^{ \frac{d}{2} + \frac{2}{m+2} + \frac{8}{m(m+2)} } \Vert \varPsi _1 \cdots \varPsi _R\Vert _{l^{1,s+1/m}\left( L^{\frac{2+m}{1+m}}_{t,x}\right) } \\&{\mathop {\lesssim }\limits ^{(236)}} c^{ \frac{d}{2} + \frac{2}{m+2} + \frac{8}{m(m+2)} } R^d \Vert \psi \Vert _D^R. \end{aligned}$$

Then we consider \(\rho _2^{(1)}(\mathscr {A}_2 \, \partial ^2_{x_1} (\varPsi _1 \cdots \varPsi _R))\): We have

$$\begin{aligned} \rho _2^{(1)}(\mathscr {A}_2 \, \partial ^2_{x_1} (\varPsi _1 \cdots \varPsi _R))&\lesssim \left( \sum _{\begin{array}{c} k \in \mathbb {Z}^d \\ |k|_\infty \gtrsim c \end{array}} + \sum _{\begin{array}{c} k \in \mathbb {Z}^d \\ |k|_\infty \lesssim c \end{array}} \right) \Vert \square _k \, \mathscr {A}_2 \, \partial ^2_{x_1} (\varPsi _1 \dots \varPsi _R) \Vert _{L^{m,\infty }_{x_1;(x_j)_{j \ne 1},t}} \\&=: \textit{III}+ \textit{IV}. \end{aligned}$$

Again by (220) and (236) we obtain

$$\begin{aligned} \textit{IV}&\lesssim c^{ \frac{d}{2} + \frac{2}{m+2} + \frac{4}{m(m+2)} } \Vert \varPsi _1 \dots \varPsi _R\Vert _{l^{1,s+1/m}\left( L^{\frac{2+m}{1+m}}_{t,x}\right) } \\&\lesssim c^{ \frac{d}{2} + \frac{2}{m+2} + \frac{4}{m(m+2)} } R^d \Vert \psi \Vert _D^R. \end{aligned}$$

Furthermore, we have that

$$\begin{aligned} \textit{III}&\lesssim \left( \sum _{k \in \mathbb {Z}^d_1} + \cdots + \sum _{k \in \mathbb {Z}^d_d} \right) \Vert \square _k \, \mathscr {A}_2 \, \partial ^2_{x_1} (\varPsi _1 \dots \varPsi _R) \Vert _{L^{m,\infty }_{x_1; (x_j)_{j \ne 1},t}} \\&=: G_1(\psi ) + \cdots + G_d(\psi ). \end{aligned}$$

Using the frequency-uniform decomposition, (217), (234) and (235) we have that

$$\begin{aligned} G_i(\psi )&\lesssim c^{ 1+ 3 \frac{d}{2} } R^d \Vert \psi \Vert _D^R, \quad i=1,\ldots ,d, \end{aligned}$$

therefore

$$\begin{aligned} \textit{III}&\lesssim c^{ 1+ 3 \frac{d}{2} } R^d \Vert \psi \Vert _D^R. \end{aligned}$$

Finally, we estimate \(\rho _3^{(1)}( \mathscr {A}_2 \, \partial ^2_{x_i}(\varPsi _1 \dots \varPsi _R) )\). It suffices to consider the case \(i=1\): By (168) and (157) we have

$$\begin{aligned} \Vert \square _k \, \mathscr {A}_2 \partial ^2_{x_1}f\Vert _{L^\infty _t L^2_x \cap L^{2+m}_{t,x}}&\lesssim c^{ \frac{4}{m(m+2)} } \left\langle k_1 \right\rangle ^2 \Vert \square _k \, f \Vert _{L^{\frac{2+m}{1+m}}_{t,x}}, \end{aligned}$$

and by (218) and (207) we obtain

$$\begin{aligned} \rho _3^{(1)}( \mathscr {A}_2 \, \partial ^2_{x_i}(\varPsi _1 \cdots \varPsi _R))&\lesssim c^{ 1 + \frac{8}{m(m+2)} + \frac{d}{2} } R^d \Vert \psi \Vert _D^R. \end{aligned}$$

Collecting all estimates, we have

$$\begin{aligned} \Vert \mathscr {F}(\psi ) \Vert _D&\lesssim c^{ \frac{d}{2} + \frac{4}{m(m+2)} } \Vert \psi _0\Vert _{M^{s+3+1/m}_{2,1}} + c^{ 1+ \frac{3d}{2} +\frac{2}{m+2} + \frac{8}{m(m+2)} } \sum _{m+1 \le R < M} c^{\frac{4}{R^2-1}} R^d \Vert \psi \Vert _D^R. \end{aligned}$$
(237)

For \(c \ge 1\) sufficiently large we can then conclude by a standard contraction mapping argument (see, for example, the proof of Theorem 1.1 in [16]), provided we choose

$$\begin{aligned} \delta&> \delta _0(d,m,2) := \max \left( \frac{d}{2}+\frac{4}{m(m+2)} , \frac{1}{m} + \frac{3d}{2m} + \frac{2}{m(m+2)} + \frac{8}{m^2(m+2)} + \frac{4}{m^3} \right) . \end{aligned}$$
(238)

\(\square \)
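The threshold (238) can be evaluated in exact arithmetic. The following Python sketch transcribes (238) and compares it with one heuristic reading of (237), which is our assumption and not a statement taken from the proof: if \(\Vert \psi \Vert _D \lesssim c^{-\delta }\), the term of degree \(R\) in (237) is bounded by \(c^{-\delta }\) as soon as \((R-1)\delta \ge E + \frac{4}{R^2-1}\), where \(E := 1+\frac{3d}{2}+\frac{2}{m+2}+\frac{8}{m(m+2)}\). The function names `delta0_r2` and `needed_delta` are ours.

```python
from fractions import Fraction as F


def delta0_r2(d, m):
    """Right-hand side of (238), transcribed as printed (d, m positive integers)."""
    first = F(d, 2) + F(4, m * (m + 2))
    second = (F(1, m) + F(3 * d, 2 * m) + F(2, m * (m + 2))
              + F(8, m**2 * (m + 2)) + F(4, m**3))
    return max(first, second)


def needed_delta(d, m, R):
    """Smallest delta with E + 4/(R^2 - 1) - R*delta <= -delta.

    This is a heuristic reading of (237) (an assumption of this sketch):
    the degree-R term is then dominated by c^{-delta}.
    """
    E = 1 + F(3 * d, 2) + F(2, m + 2) + F(8, m * (m + 2))
    return (E + F(4, R**2 - 1)) / (R - 1)


d, m = 3, 2
print(delta0_r2(d, m))                          # threshold from (238)
print([needed_delta(d, m, R) for R in range(m + 1, m + 5)])
```

In this reading the most restrictive term is the one of lowest degree \(R=m+1\), and the second entry of the maximum in (238) is a slightly stronger (hence sufficient) requirement, since \(\frac{4}{m^2(m+2)} \le \frac{4}{m^3}\).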

Remark 23

By arguing in the same way for the general case \(r>2\) we end up with the condition

$$\begin{aligned}&\delta > \delta _0(d,m,r) \end{aligned}$$
(239)
$$\begin{aligned}&\quad := \max \left( d\left( 1-\frac{1}{r}\right) +\frac{4r}{m^2( m+2(r-1) )} , \frac{r-1}{m} + \frac{3d}{2m} + \frac{2rm + 8(r-1)^2}{m^2( m+2(r-1) )} + \frac{4(r-1)^2}{m^3} \right) . \end{aligned}$$
(240)

Remark 24

The quantity \(\delta _0(d,l,r)\) defined in Corollary 1 is actually the right-hand side of (240) with m replaced by \(2(l-1)\).
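For concrete values of \(d\), \(l\) and \(r\), the threshold of Remarks 23 and 24 can be computed as follows. This Python sketch simply transcribes the right-hand side of (240) as printed and then substitutes \(m=2(l-1)\); the function names `delta0` and `delta0_from_l` are ours, and no attempt is made to reconcile the value at \(r=2\) with (238).

```python
from fractions import Fraction as F


def delta0(d, m, r):
    """Right-hand side of (240), transcribed as printed (exact arithmetic)."""
    w = m + 2 * (r - 1)
    first = d * (1 - F(1, r)) + F(4 * r, m**2 * w)
    second = (F(r - 1, m) + F(3 * d, 2 * m)
              + F(2 * r * m + 8 * (r - 1)**2, m**2 * w)
              + F(4 * (r - 1)**2, m**3))
    return max(first, second)


def delta0_from_l(d, l, r):
    """Remark 24: replace m by 2(l - 1) in (240)."""
    return delta0(d, 2 * (l - 1), r)


# Example: d = 3, cubic-type nonlinearity l = 2 (so m = 2), order r = 2.
print(delta0_from_l(3, 2, 2))
```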

In order to prove the second part of Proposition 7 we will exploit another contraction mapping argument, as in the proof of Theorem 1 in [28] (which in turn is based on the proof of Theorem 4.1 of [30]). In the following, we denote by \((Q_\alpha )_{\alpha \in \mathbb {Z}^d}\) a fixed family of nonoverlapping cubes of size R such that \(\mathbb {R}^d = \bigcup _\alpha Q_\alpha \).

Lemma 16

Let \(d \ge 2\) and \(r \ge 2\), then the following estimates hold.

  • (Local smoothing, homogeneous case)

    $$\begin{aligned}&\sup _{\alpha \in \mathbb {Z}^d} \left( \int _{Q_\alpha } \int _\mathbb {R}|D_x^{r-1/2} \mathscr {U}_r(t)\psi _0(x)|^2 \mathrm{d}t \mathrm{d}x \right) ^{1/2} \lesssim c^{r-1} R^{1/2} \Vert \psi _0\Vert _{L^2}, \end{aligned}$$
    (241)
    $$\begin{aligned}&\left\| D_x^r \int _I \mathscr {U}_r(t-\tau )\psi (\tau ,\cdot )\mathrm{d}\tau \right\| _2 \lesssim c^{r-1} R^{1/2} \sum _{\alpha \in \mathbb {Z}^d} \left( \int _{Q_\alpha } \int _I |\psi (t,x)|^2 \mathrm{d}t \mathrm{d}x\right) ^{1/2}; \end{aligned}$$
    (242)
  • (Local smoothing, inhomogeneous case) the solution of the inhomogeneous Cauchy problem

    $$\begin{aligned} -\,i\psi _t&= A_{c,r}\psi + f(t,x), \quad t \in I, x \in \mathbb {R}^d, \end{aligned}$$

    such that \(\psi _0 \equiv 0\) satisfies

    $$\begin{aligned} \sup _{\alpha \in \mathbb {Z}^d} \Vert D_x^{2(r-1)} \psi \Vert _{L^2_x(Q_\alpha ); L^2_t(I)}&\lesssim c^{2(r-1)} \, R T^{1/(4d)} \sum _{\alpha \in \mathbb {Z}^d} \Vert f\Vert _{L^2_x(Q_\alpha ); L^2_t(I)} \end{aligned}$$
    (243)
  • (Maximal function estimate) For any \(s>d+\frac{1}{2}\) we have

    $$\begin{aligned} \left( \int _{\mathbb {R}^d} \sup _{|t| \lesssim c^{2(r-1)}} |\mathscr {U}_r(t)\psi _0(x)|^2 \mathrm{d}x \right) ^{1/2}&\lesssim c^{d\left( 1-\frac{1}{r}\right) } \Vert \psi _0\Vert _{H^s}. \end{aligned}$$
    (244)

Proof (sketch)

The proof in the case \(r=2\) can be obtained simply by rescaling Lemmas 3, 4, 5 and 6 of [28]. The proof in the case \(r>2\) can be obtained by considering the operators \(\mathscr {U}_r(t)\) and \(\mathscr {A}_r(t)\) instead of \(\mathscr {U}_2(t)\) and \(\mathscr {A}_2(t)\). \(\square \)

Proof

(Proposition 7, part (ii), case \(r=2\)) We will prove the result only for \(s=s_0\), since the general case follows from commutator estimates. For simplicity, we only deal with the case

$$\begin{aligned} P( (\partial ^\alpha _x\psi )_{|\alpha | \le 2}, (\partial ^\alpha _x\bar{\psi })_{|\alpha | \le 2} )&= \partial _{x_j}^2\psi \, \partial _{x_k}^2\psi \, \partial _{x_m}^2\psi . \end{aligned}$$

More precisely, we fix a positive constant \(\nu <1/3\), and we define the space \(Z_I^\delta \) of all functions \(\phi :I \times \mathbb {R}^d \rightarrow \mathbb {C}\) such that the following three conditions hold:

$$\begin{aligned}&\Vert \phi \Vert _{L^\infty (I) H^{s_0}} \le c^{-\delta }, \end{aligned}$$
(245)
$$\begin{aligned}&\sum _{|\beta |=s_0+1/2} \sup _{\alpha \in \mathbb {Z}^d} \left( \int _I \int _{Q_\alpha } |\partial ^\beta _x\phi (t,x)|^2 \mathrm{d}x \mathrm{d}t \right) ^{1/2} \le T^\nu , \end{aligned}$$
(246)
$$\begin{aligned}&\left( \sum _{\alpha \in \mathbb {Z}^d} \sup _{t \in I} \sup _{x \in Q_\alpha } |D^2_x\phi (t,x)|^2 \right) ^{1/2} \le c^{-\delta }. \end{aligned}$$
(247)

We want to show that the map

$$\begin{aligned}&\mathscr {F}: Z_I^\delta \rightarrow Z_I^\delta , \\&\quad \psi (t) \mapsto \mathscr {U}_2(t)\psi _0+i \mathscr {A}_2P( (\partial ^\alpha _x\psi )_{|\alpha | \le 2}, (\partial ^\alpha _x\bar{\psi })_{|\alpha | \le 2} ) \end{aligned}$$

is a contraction mapping.

We can observe that for any \(\beta \in \mathbb {Z}^d\) with \(|\beta |=s_0-\frac{3}{2}\)

$$\begin{aligned} \partial _x^\beta (\partial _{x_j}^2\psi \, \partial _{x_k}^2\psi \, \partial _{x_m}^2\psi )&= \partial _x^\beta \partial _{x_j}^2\psi \, \partial _{x_k}^2\psi \, \partial _{x_m}^2\psi + \partial _{x_j}^2\psi \, \partial _x^\beta \partial _{x_k}^2\psi \, \partial _{x_m}^2\psi + \partial _{x_j}^2\psi \, \partial _{x_k}^2\psi \, \partial _x^\beta \partial _{x_m}^2\psi \\&\quad + R\left( \left( \partial _x^\gamma \psi \right) _{2\le |\gamma |\le s_0-1/2}\right) . \end{aligned}$$

Now, for any \(\psi \in Z_I^\delta \) we have

$$\begin{aligned}&\sum _{|\beta |=s_0+1/2} \sup _{\alpha \in \mathbb {Z}^d} \left( \int _I \int _{Q_\alpha } |\partial ^\beta _x\psi (t,x)|^2 \mathrm{d}x \mathrm{d}t \right) ^{1/2} \nonumber \\&\quad \lesssim \sum _{|\beta |=s_0+1/2} \sup _{\alpha \in \mathbb {Z}^d} \left( \int _I \int _{Q_\alpha } |\mathscr {U}_2(t)\partial ^\beta _x\psi _0(x)|^2 \mathrm{d}x \mathrm{d}t \right) ^{1/2} \nonumber \\&\qquad + \,\sum _{|\beta |=s_0+1/2} \sup _{\alpha \in \mathbb {Z}^d} \left( \int _I \int _{Q_\alpha } \left| \int _0^t \mathscr {U}_2(t-\tau ) \partial _x^\beta (\partial _{x_j}^2\psi \, \partial _{x_k}^2\psi \, \partial _{x_m}^2\psi ) \mathrm{d}\tau \right| ^2 \mathrm{d}x \mathrm{d}t \right) ^{1/2} \nonumber \\&\quad {\mathop {\lesssim }\limits ^{(241),(242)}} c T^{1/3} \Vert \psi _0\Vert _{H^{s_0}} + c^2 T^{1/(4d)} \sum _{|\beta |=s_0-3/2} \sum _{j,k,m=1}^d \sum _{\alpha \in \mathbb {Z}^d} \Vert \partial _x^\beta \partial _{x_j}^2\psi \, \partial _{x_k}^2\psi \, \partial _{x_m}^2\psi \Vert _{L^2_x(Q_\alpha ); L^2_t(I)} \nonumber \\&\qquad + \,c^2 \int _0^T \Vert D_x^{1/2} R((\partial _x^\gamma \psi )_{2\le |\gamma |\le s_0-1/2}) \Vert _{L^2} \mathrm{d}t \nonumber \\&\quad \lesssim c T^{1/3} \Vert \psi _0\Vert _{H^{s_0}} + c^2 T^{1/(4d)} \nonumber \\&\qquad \times \sum _{|\beta |=s_0+1/2} \sup _{\alpha \in \mathbb {Z}^d} \left( \int _I \int _{Q_\alpha } |\partial _x^\beta \psi |^2 \mathrm{d}x \mathrm{d}t \right) ^{1/2} \left( \sum _{\alpha \in \mathbb {Z}^d} \sup _{t \in I} \sup _{x \in Q_\alpha } |D_x^2\psi |^2 \right) \nonumber \\&\qquad + c^2 T \sup _{t \in I} \Vert \psi \Vert _{H^{s_0}}^3 \nonumber \\&\quad \lesssim c^{1-\delta } T^{1/3} + c^2 T^{1/(4d)} T^\nu c^{-2\delta } + c^2 T c^{-3\delta } \le T^\nu , \end{aligned}$$
(248)

where in the last inequality we have chosen \(\delta \gg 1\) such that

$$\begin{aligned} c^{1-\delta } T^{-\nu +1/3} + c^{2(1-\delta )} T^{1/(4d)} + c^{2-3\delta } T^{1-\nu }&\lesssim 1, \quad T =\mathscr {O}(c^{2(r-1)}). \end{aligned}$$
(249)

Next, we have that for any \(\psi \in Z_I^\delta \)

$$\begin{aligned} \Vert \psi \Vert _{L^\infty (I)H^{s_0}}&\le \Vert \psi _0\Vert _{H^{s_0}} + \sup _{t \in I} \int _0^t \Vert \mathscr {U}_2(t-\tau ) \partial _{x_j}^2\psi (\tau ) \, \partial _{x_k}^2\psi (\tau ) \, \partial _{x_m}^2\psi (\tau ) \Vert _{L^2} \mathrm{d}\tau \nonumber \\&\quad + \sup _{t \in I} \left\| D_x^{3/2} \int _0^t \mathscr {U}_2(t-\tau ) D_x^{s_0-3/2}( \partial _{x_j}^2\psi (\tau ) \, \partial _{x_k}^2\psi (\tau ) \, \partial _{x_m}^2\psi (\tau ) ) \mathrm{d}\tau \right\| _{L^2} \nonumber \\&{\mathop {\lesssim }\limits ^{(242)}} \Vert \psi _0\Vert _{H^{s_0}} + T \sup _{t \in I} \Vert \partial _{x_j}^2\psi (t) \, \partial _{x_k}^2\psi (t) \, \partial _{x_m}^2\psi (t) \Vert _{L^2} \nonumber \\&\quad + c^{r-1} \sum _{\alpha \in \mathbb {Z}^d} \left( \int _{Q_\alpha } \int _I |D_x^{s_0-3/2}( \partial _{x_j}^2\psi (t) \, \partial _{x_k}^2\psi (t) \, \partial _{x_m}^2\psi (t) )|^2\mathrm{d}t \mathrm{d}x \right) ^{1/2} \nonumber \\&\lesssim \Vert \psi _0\Vert _{H^{s_0}} + T \sup _{t \in I} \Vert \psi \Vert ^3_{H^{\frac{d}{3}+2}} \nonumber \\&\quad + c \sum _{j,k,m=1}^d \sum _{\alpha \in \mathbb {Z}^d} \left( \int _{Q_\alpha } \int _I |D_x^{s_0-3/2}\partial _{x_j}^2\psi (t) \, \partial _{x_k}^2\psi (t) \, \partial _{x_m}^2\psi (t) |^2\mathrm{d}t \mathrm{d}x \right) ^{1/2} \nonumber \\&\quad + c \sum _{\alpha \in \mathbb {Z}^d} \left( \int _{Q_\alpha } \int _I |R(D_x^\gamma \psi )_{2\le |\gamma |\le s_0-1/2}|^2 \mathrm{d}t \mathrm{d}x \right) ^{1/2} \nonumber \\&\lesssim \Vert \psi _0\Vert _{H^{s_0}} + T \Vert \psi \Vert ^3_{L^\infty (I)H^{\frac{d}{3}+2}} \nonumber \\&\quad + c \sum _{|\beta |=s_0+1/2} \sup _{\alpha \in \mathbb {Z}^d} \left( \int _I \int _{Q_\alpha } |\partial _x^\beta \psi |^2 \mathrm{d}x \mathrm{d}t \right) ^{1/2} \; \sum _{\alpha \in \mathbb {Z}^d} \sup _{t \in I} \sup _{x \in Q_{\alpha }} |D_x^2\psi |^2 \nonumber \\&\quad + c T^{1/2} \Vert \psi \Vert ^3_{L^\infty (I) H^{s_0}} \nonumber \\&\lesssim \Vert \psi _0\Vert _{H^{s_0}} + (T+cT^{1/2}) c^{-3\delta } + cT^\nu c^{-2\delta } \nonumber \\&\lesssim c^{-\delta }, \end{aligned}$$
(250)

where in the last inequality we have chosen \(\delta \gg 1\) such that

$$\begin{aligned} (T+cT^{1/2}) c^{-3\delta } + T^\nu c^{1-2\delta }&\lesssim \frac{1}{2}, \quad T =\mathscr {O}(c^{2(r-1)}). \end{aligned}$$
(251)

Then, we have that for any \(\psi \in Z_I^\delta \)

$$\begin{aligned}&\left( \sum _{\alpha \in \mathbb {Z}^d} \sup _{t \in I} \sup _{x \in Q_\alpha } |D^2_x\psi (t,x)|^2 \right) ^{1/2} \nonumber \\&\quad {\mathop {\lesssim }\limits ^{T \lesssim c^{2(r-1)},(244) }} \Vert \psi _0\Vert _{H^{s_0}} + c^{d\left( 1-\frac{1}{r}\right) } T \Vert \psi \Vert ^3_{L^\infty (I)H^{s_0}} \nonumber \\&\quad \lesssim \Vert \psi _0\Vert _{H^{s_0}} + c^{d\left( 1-\frac{1}{r}\right) } T c^{-3\delta } \nonumber \\&\quad \lesssim c^{-\delta }, \end{aligned}$$
(252)

where in the last inequality we have chosen \(\delta \gg 1\) such that

$$\begin{aligned} c^{d\left( 1-\frac{1}{r}\right) -2\delta } T&\lesssim 1, \quad T =\mathscr {O}(c^{2(r-1)}). \end{aligned}$$
(253)

Finally, for any \(\phi \in Z_I^\delta \) we define \(\varLambda _T(\phi )\) as the maximum of the following three quantities,

$$\begin{aligned}&\sum _{\alpha \in \mathbb {Z}^d} \sup _{t \in I} \sup _{x \in Q_\alpha } |D^2_x\phi (t,x)|^2,\\&\quad \Vert \phi \Vert _{L^\infty (I)H^{s_0}},\\&\quad c^{-\delta } T^{-\nu } \sum _{|\beta |=s_0+1/2} \sup _{\alpha \in \mathbb {Z}^d} \left( \int _I \int _{Q_\alpha } |\partial ^\beta _x\phi (t,x)|^2 \mathrm{d}x \mathrm{d}t \right) ^{1/2}, \end{aligned}$$

we can observe that for any \(\phi _1,\phi _2 \in Z_I^\delta \)

$$\begin{aligned} \varLambda _T( \mathscr {F}(\phi _1)-\mathscr {F}(\phi _2) )&\le K T^\nu c^{-2\delta } \varLambda _T(\phi _1-\phi _2), \end{aligned}$$

where K is a positive constant which does not depend on c. Hence if we choose \(\delta \gg 1\) such that (251), (249), (253) and

$$\begin{aligned} K T^\nu c^{-2\delta }&\le \frac{1}{2} \end{aligned}$$
(254)

hold true, we can conclude. \(\square \)
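The smallness conditions (249), (251), (253) and (254) are all constraints on powers of \(c\), so the admissible range of \(\delta \) can be tabulated. The following Python sketch does this under simplifying assumptions that are ours: we take \(T=c^{2(r-1)}\) exactly, ignore all implicit constants (including \(K\)), and read each condition as the requirement that the exponent of \(c\) in every term be nonpositive. The function name `delta_thresholds` is hypothetical, and \(\nu \) must be passed as a `Fraction`.

```python
from fractions import Fraction as F


def delta_thresholds(d, r, nu):
    """Lower bounds on delta making each power of c nonpositive.

    Assumes T = c**tau with tau = 2(r - 1) and drops implicit constants.
    For (254) the bound should be understood as strict, so that the
    constant K can be absorbed for c large.
    """
    tau = 2 * (r - 1)
    return {
        "(249) term 1": 1 + tau * (F(1, 3) - nu),      # c^{1-delta} T^{1/3-nu}
        "(249) term 2": 1 + F(tau, 8 * d),             # c^{2(1-delta)} T^{1/(4d)}
        "(249) term 3": (2 + tau * (1 - nu)) / 3,      # c^{2-3 delta} T^{1-nu}
        "(251) term 1": max(F(tau), 1 + F(tau, 2)) / 3,  # (T + c T^{1/2}) c^{-3 delta}
        "(251) term 2": (1 + tau * nu) / 2,            # T^nu c^{1-2 delta}
        "(253)": (d * (1 - F(1, r)) + tau) / 2,        # c^{d(1-1/r)-2 delta} T
        "(254)": tau * nu / 2,                         # T^nu c^{-2 delta}
    }


thresholds = delta_thresholds(d=2, r=2, nu=F(1, 4))
for name, value in thresholds.items():
    print(name, value)
print("binding constraint:", max(thresholds.values()))
```

In this example the binding constraint comes from (253); any \(\delta \) strictly above the printed maximum satisfies all four conditions in the simplified reading adopted here.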

9 Long-time approximation

Now we study the evolution of the error between the approximate solution \(\psi _a\), namely the solution of (99), and the original solution \(\psi \) of (3) over long (that is, c-dependent) time intervals. First we prove a result for the linear case, then we consider the approximation of small radiation solutions in the nonlinear case, and finally we make some remarks about the approximation of standing waves and soliton solutions.

9.1 Linear case

Fix \(r \ge 1\), and take \(\psi _0 \in H^{k+k_0}\), where \(k_0>0\) and \(k \gg 1\) are the ones in Theorem 7.

Now, we want to estimate the space–time norm of the error \(\delta =\psi -\psi _a\). In the linear case we can observe that \(\delta \) satisfies

$$\begin{aligned} \dot{\delta }&= i c \langle \nabla \rangle _c \delta + \frac{1}{c^{2r}} X_{ \mathscr {T}^{(r)*}\mathscr {R}^{(r)} }(\psi _a(t),\bar{\psi }_a(t) ). \end{aligned}$$
(255)

Proof (Theorem 3)

By applying the Strichartz estimate (14) (choose \(p=+\infty \), \(q=2\), \(r=+\infty \), \(s=2\)), together with estimate (31) for the vector field of the remainder \(\mathscr {R}^{(r)}\), estimate (32) for the canonical transformation \(\mathscr {T}^{(r)}\), and estimate (108) (choose \(p=q=2\)), we can deduce Theorem 3. \(\square \)

9.2 The nonlinear case: radiation solutions

We now adapt the approach of Sect. 9.1 in order to approximate radiation solutions of the NLKG equation over long (c-dependent) timescales.

We pursue such a program by a perturbative argument, considering a small radiation solution \(\psi _r=\eta _{rad,r}\) of the normalized system (98) that exists up to times of order \(\mathscr {O}(c^{2(r-1)})\), \(r>1\).

As an application of Proposition 1, we consider the following case. Fix \(r>1\), let \(\sigma >0\) and let \(\psi _r = \eta _{rad}\) be a radiation solution of (98), namely such that

$$\begin{aligned} \eta _{rad,0}:=\eta _{rad}(0)&\in H^{k+k_0+\sigma +d/2}(\mathbb {R}^d), \end{aligned}$$
(256)

where \(k_0>0\) and \(k \gg 1\) are the ones in Theorem 7.

Let \(\delta (t)\) be a solution of (103); then, by Duhamel's formula,

$$\begin{aligned} \delta (t)&:= \mathscr {U}(t,0) \delta _0 = e^{it c\langle \nabla \rangle _c}\delta _0 +\int _0^t e^{i(t-s) c\langle \nabla \rangle _c} \mathrm{d}P(\psi _a(s)) \mathscr {U}(s,0) \delta _0 \mathrm{d}s. \end{aligned}$$
(257)

Now fix \(T\lesssim c^{2(r-1)}\); we want to estimate the local-in-time norm in the space \(L^\infty ([0,T])H^{k}(\mathbb {R}^d)\) of the error \(\delta (t)\).

By (13) we can estimate the first term, and by (14) the second one: for any pair \((p,q)\) of Schrödinger-admissible exponents,

$$\begin{aligned}&\left\| \int _0^{t}e^{i(t-s) c\langle \nabla \rangle _c}\mathrm{d}P(\psi _a(s)) \delta (s) \mathrm{d}s \right\| _{L^\infty _t([0,T]) H^{k}_x} \\&\quad \lesssim c^{\frac{1}{q}-\frac{1}{p}-\frac{1}{2}} \Vert \langle \nabla \rangle _c^{\frac{1}{p}-\frac{1}{q}+\frac{1}{2}} \mathrm{d}P(\psi _a(t)) \delta (t) \Vert _{L^{p{^\prime }}_t([0,T]) W^{k,q{^\prime }}_x} \\&\quad \lesssim c^{\frac{1}{q}-\frac{1}{p}-\frac{1}{2}} \Vert \langle \nabla \rangle _c^{\frac{1}{p}-\frac{1}{q}+\frac{1}{2}} \mathrm{d}P(\eta _{rad}(c^2t)) \delta (t) \Vert _{L^{p{^\prime }}_t([0,T]) W^{k,q{^\prime }}_x} \\&\qquad +\, c^{\frac{1}{q}-\frac{1}{p}-\frac{1}{2}} \Vert \langle \nabla \rangle _c^{\frac{1}{p}-\frac{1}{q}+\frac{1}{2}} [\mathrm{d}P(\psi _a(t))-\mathrm{d}P(\eta _{rad}(c^2t)) ] \delta (t) \Vert _{L^{p{^\prime }}_t([0,T]) W^{k,q{^\prime }}_x} \\&\quad =: I_p + II_p, \end{aligned}$$

but recalling (100) one has that

$$\begin{aligned} I_p&\lesssim \frac{|\lambda |}{2^{l-1/2}(2l)(2l-1)} c^{\frac{1}{q}-\frac{1}{p}+\frac{1}{2}} \left\| \langle \nabla \rangle _c^{\frac{1}{p}-\frac{1}{q}-\frac{1}{2}} \left[ \left( \frac{c}{\langle \nabla \rangle _c} \right) ^{1/2} (\eta _{rad}+\bar{\eta }_{rad}) \right] ^{2(l-1)} \, \delta (t) \right\| _{L^{p{^\prime }}_t([0,T]) W^{k,q{^\prime }}_x}. \end{aligned}$$

Now fix a real number \(\rho \in ]0,1[\), and choose

$$\begin{aligned} p&= 2+\rho , \end{aligned}$$
(258)
$$\begin{aligned} q=\frac{2dp}{dp-4}&= \frac{4d+2d\rho }{2d+d\rho -4} = 2+\frac{8}{2d-4+d\rho }. \end{aligned}$$
(259)

Then we get (since \(\Vert (c/\langle \nabla \rangle _c)^{\frac{1}{q}-\frac{1}{p}-\frac{1}{2}}\Vert _{L^{q{^\prime }} \rightarrow L^{q{^\prime }}} \le 1\))

$$\begin{aligned} I_{2}&\le \frac{|\lambda |}{2^{l-1/2}(2l)(2l-1)} \left\| \left[ \left( \frac{c}{\langle \nabla \rangle _c} \right) ^{1/2} (\eta _{rad}(c^2 t)+\bar{\eta }_{rad}(c^2 t)) \right] ^{2(l-1)} \, \delta (t) \right\| _{L^{ 2-\frac{\rho }{1+\rho } }_t([0,T]) W^{k,q{^\prime }}_x}. \end{aligned}$$

Now, since by Hölder inequality

$$\begin{aligned}&\left\| \left[ \left( \frac{c}{\langle \nabla \rangle _c} \right) ^{1/2} (\eta _{rad}(c^2 t)+\bar{\eta }_{rad}(c^2 t)) \right] ^{2(l-1)} \, \delta (t) \right\| _{L^{ 2-\frac{\rho }{1+\rho } }_t([0,T]) W^{k,q{^\prime }}_x} \\&\quad \le \left\| \left[ \left( \frac{c}{\langle \nabla \rangle _c} \right) ^{1/2} (\eta _{rad}(c^2 t)+\bar{\eta }_{rad}(c^2 t)) \right] ^{2(l-1)} \right\| _{L_t^{ 2-\frac{\rho }{1+\rho } }([0,T]) W^{k,d(1+\rho /2)}_x} \, \Vert \delta (t)\Vert _{L^\infty _t([0,T]) H^{k}_x}, \end{aligned}$$

and by Sobolev product theorem (recall that \(l \ge 2\), and that \(k \gg 1\)) we can deduce that

$$\begin{aligned}&\left\| \left[ \left( \frac{c}{\langle \nabla \rangle _c} \right) ^{1/2} (\eta _{rad}(c^2 t)+\bar{\eta }_{rad}(c^2 t)) \right] ^{2(l-1)} \right\| _{L_t^{ 2-\frac{\rho }{1+\rho } }([0,T]) W^{k,d(1+\rho /2)}_x}\\&\quad \le \left[ \int _0^T \left\| \left[ \left( \frac{c}{\langle \nabla \rangle _c} \right) ^{1/2} (\eta _{rad}(c^2 t)+\bar{\eta }_{rad}(c^2 t)) \right] \right\| _{W^{k,d(1+\rho /2)}_x }^{2(l-1)\left( 2-\frac{\rho }{1+\rho } \right) } \mathrm{d}t \right] ^{\frac{1}{\left( 2-\frac{\rho }{1+\rho } \right) }} \\&\quad \le \left\| \eta _{rad}(c^2 t)+\bar{\eta }_{rad}(c^2 t) \right\| ^{2(l-1)}_{L^{2(l-1)\left( 2-\frac{\rho }{1+\rho } \right) }_t([0,T]) W^{k,d(1+\rho /2)}_x} , \end{aligned}$$

but since by Proposition 6 we have that for any \(\sigma >0\)

$$\begin{aligned} L^{2(l-1)\left( 2-\frac{\rho }{1+\rho } \right) }_t([0,T]) W^{k,d(1+\rho /2)}_x&\supseteq L^{2(l-1)\left( 2-\frac{\rho }{1+\rho } \right) }_t([0,T]) M^k_{d(1+\rho /2),1,x} \\&\supseteq L^{2(l-1)\left( 2-\frac{\rho }{1+\rho } \right) }_t([0,T]) M^k_{2,1,x} \\&\supseteq L^{2(l-1)\left( 2-\frac{\rho }{1+\rho } \right) }_t([0,T]) H^{k+\sigma +d/2}_x \\&\supseteq L^{\infty }_t([0,T]) H^{k+\sigma +d/2}_x, \end{aligned}$$

we have that

$$\begin{aligned} \Vert \eta _{rad} \Vert ^{2(l-1)}_{L^{2(l-1)\left( 2-\frac{\rho }{1+\rho } \right) }_t([0,T]) W^{k,d(1+\rho /2)}_x}&\lesssim T^{\frac{1+\rho }{2+\rho }} \Vert \eta _{rad} \Vert ^{2(l-1)}_{ L^\infty _t([0,T])H^{k+\sigma +d/2}_x }, \end{aligned}$$
(260)

but by Corollary 1 the right-hand side of (260) is finite and does not depend on \(c \ge 1\) for

$$\begin{aligned}&\Vert \eta _{rad,0} \Vert _{H^{k+k_0+\sigma +d/2}_x} \lesssim c^{-\alpha }, \end{aligned}$$
(261)
$$\begin{aligned}&\alpha > \max \left( \delta _0(d,l,r), \delta _1(d,l,r) , \frac{r-1}{l-1} \right) := \alpha ^*(d,l,r) . \end{aligned}$$
(262)

where \(c \ge c_0\) is sufficiently large, and where \(\delta _0(d,l,r)\) and \(\delta _1(d,l,r)\) are defined in Corollary 1.
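The exponent bookkeeping behind the choice (258)-(259) can be verified in exact arithmetic. The following Python sketch (the function name `exponents` is ours, and the sample values \(d=3\), \(\rho =1/2\) are arbitrary) checks four identities used above: the Schrödinger admissibility relation \(\frac{2}{p}+\frac{d}{q}=\frac{d}{2}\), the formula \(p^\prime = 2-\frac{\rho }{1+\rho }\) for the dual time exponent, the Hölder relation \(\frac{1}{q^\prime } = \frac{1}{2}+\frac{1}{d(1+\rho /2)}\), and the fact that the power of \(T\) in (260) equals \(1/p^\prime \).

```python
from fractions import Fraction as F


def exponents(d, rho):
    """Return (p, q, p', q') for the choice (258)-(259); needs d*p > 4."""
    p = 2 + rho
    q = 2 * d * p / (d * p - 4)
    p_dual = p / (p - 1)
    q_dual = q / (q - 1)
    return p, q, p_dual, q_dual


d, rho = 3, F(1, 2)
p, q, p_dual, q_dual = exponents(d, rho)

admissible = (2 / p + d / q == F(d, 2))
closed_form_q = (q == 2 + 8 / (2 * d - 4 + d * rho))
dual_time = (p_dual == 2 - rho / (1 + rho))
hoelder_space = (1 / q_dual == F(1, 2) + 1 / (d * (1 + rho / 2)))
time_power = (1 / p_dual == (1 + rho) / (2 + rho))
print(p, q, p_dual, q_dual)
print(admissible, closed_form_q, dual_time, hoelder_space, time_power)
```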

Furthermore, via (32) one can show that there exists \(c_{r,k}>0\) sufficiently large such that for \(c \ge c_{r,k}\) the term \(II_{2}\) can be bounded by \(\frac{1}{c^2} \; I_2\).

This means that we can estimate the \(L^\infty ([0,T])H^{k}\) norm of the error only for a small (with respect to c) radiation solution, which is the statement of Proposition 16.

To summarize, we get the following result.

Proposition 16

Consider (58) on \(\mathbb {R}^d\), \(d \ge 2\). Let \(r>1\), and fix \(k_1 \gg 1\). Assume that \(l \ge 2\) and \(r < \frac{d}{2}(l-1)\). Then there exists \(k_0=k_0(r)>0\) such that for any \(k \ge k_1\) and for any \(\sigma >0\) the following holds. Consider the solution \(\eta _{rad}\) of (98) with initial datum \(\eta _{rad,0} \in H^{k+k_0+\sigma +d/2}(\mathbb {R}^d)\), and denote by \(\delta \) the difference between the solution of the approximate equation (99) and the original solution of the Hamilton equation for (58). Assume that \(\delta _0:=\delta (0)\) satisfies

$$\begin{aligned} \Vert \delta _0\Vert _{H^k_x} \lesssim \frac{1}{c^2}. \end{aligned}$$

Then there exist \(\alpha ^*:=\alpha ^*(d,l,r)>0\) and there exists \(c^*:=c^*(r,k) > 1\), such that for any \(\alpha > \alpha ^*\) and for any \(c > c^*\), if \(\eta _{rad,0}\) satisfies

$$\begin{aligned} \Vert \eta _{rad,0}\Vert _{H^{k+k_0+\sigma +d/2}}&\lesssim c^{-\alpha }, \end{aligned}$$

then

$$\begin{aligned} \sup _{t\in [0,T]} \Vert \delta (t)\Vert _{H^k_x}&\lesssim \frac{1}{c^2}, \quad T \lesssim c^{2(r-1)}. \end{aligned}$$

By exploiting (32) and Proposition 16, we obtain Theorem 4.

9.3 The nonlinear case: standing wave solutions

Now we consider the approximation of another important type of solution, the so-called standing wave solutions.

The issue of (in)stability of standing waves and solitons has a long history. For the NLS equation and the NLKG equation, the orbital stability of standing waves was first discussed in [53]; for the NLS, the orbital stability of one-soliton solutions has been treated in [26], while the asymptotic stability has been discussed in [20] for one-soliton solutions, and in [48] and [47] for N-solitons. For the higher-order Schrödinger equation we mention [37], which deals with orbital stability of standing waves for fourth-order NLS-type equations. For the NLKG equation, the instability of solitons and standing waves has been studied in [29, 45, 54].

As in the case of radiation solutions, we fix \(r \ge 1\) and consider a standing wave solution \(\psi _r\) of (98), namely a solution of the form

$$\begin{aligned} \psi _r(t,x)&= e^{it\omega } \eta _\omega (x), \end{aligned}$$
(263)

where \(\omega \in \mathbb {R}\), and \(\eta _\omega \in \mathscr {S}(\mathbb {R}^d)\) solves

$$\begin{aligned} -\omega \eta _\omega&= X_{H_{simp}}(\eta _\omega ). \end{aligned}$$

Remark 25

Of course, the existence of a standing wave for the simplified equation (98) is a far from trivial question. For \(r=1\), [26] deals with the NLS equation; for \(r\ge 2\), [37] deals with the one-dimensional fourth-order NLS-type equation.

We also point out that in the case of a standing wave solution, if \(\delta (t)\) satisfies (103), then by Duhamel formula

$$\begin{aligned} \dot{\delta }&= i c\langle \nabla \rangle _c\delta (t) + \mathrm{d}P(\psi _a(t),\bar{\psi }_a(t)) \delta (t). \end{aligned}$$

Since

$$\begin{aligned} P(e^{it\omega }\eta _\omega ,e^{-it\omega }\bar{\eta }_\omega )&= 2^{l-1/2} \, \left( \frac{c}{\left\langle \nabla \right\rangle _c}\right) ^{1/2} \left[ \left( \frac{c}{\left\langle \nabla \right\rangle _c}\right) ^{1/2} Re(e^{it\omega }\eta _\omega ) \right] ^{2l-1}, \end{aligned}$$

we have that

$$\begin{aligned} \mathrm{d}P(\eta _\omega ,\bar{\eta }_\omega )e^{it\omega }h&= 2^{l-1/2} \, \left( \frac{c}{\left\langle \nabla \right\rangle _c}\right) \left[ \left( \frac{c}{\left\langle \nabla \right\rangle _c}\right) ^{1/2} \cos (\omega t) \eta _\omega \right] ^{2(l-1)} (e^{it\omega }h + e^{-it\omega }\bar{h}), \end{aligned}$$

and by setting \(\delta = e^{-it\omega }h\), one gets

$$\begin{aligned}&-i \dot{h} = (c \langle \nabla \rangle _c + \omega )h + 2^{l-1/2} \cos ^{2(l-1)}(\omega t) \left( \frac{c}{\left\langle \nabla \right\rangle _c}\right) \left[ \left( \frac{c}{\left\langle \nabla \right\rangle _c}\right) ^{1/2} \eta _\omega \right] ^{2(l-1)} (h + e^{-2it\omega }\bar{h}) \end{aligned}$$
(264)
$$\begin{aligned}&+\left[ \mathrm{d}P(\psi _a(s),\bar{\psi }_a(s))-\mathrm{d}P(\eta _\omega ,\bar{\eta }_\omega )\right] h. \end{aligned}$$
(265)

Equation (264) is a spinless Salpeter equation with a periodic time-dependent potential; therefore, in order to get some information about the error, one would need the corresponding Strichartz estimates for Eq. (264). Unfortunately, in the literature on dispersive estimates there are only a few results for PDEs with time-dependent potentials, and most of them are perturbative in nature; for the Schrödinger equation, we mention [22, 25], in which Strichartz estimates are proved in a nonperturbative framework.

Remark 26

By using Proposition 3 one can show that the NLKG can be approximated by the simplified equation (3) locally uniformly in time, up to an error of order \(\mathscr {O}(c^{-2r})\).

Remark 27

One could ask whether a similar result holds for more general (in particular, moving) soliton solutions of (98). Apart from the issue of existence and stability for such solutions, one can check that, provided that a moving soliton solution for (98) exists, the error \(\delta (t)\) must solve a (264)-type equation, namely a spinless Salpeter equation with a time-dependent moving potential. Unfortunately, since Eq. (264), unlike KG, is not manifestly covariant, one apparently cannot reduce to an analogous equation, and once again one cannot justify the approximation over the \(\mathscr {O}(1)\) timescale.