1 Introduction and Main Results

We consider families of NLS equations on the circle with external parameters of the form:

$$\begin{aligned} \mathrm{i}u_t + u_{xx} - V*u + f(x,|u|^2)u=0, \end{aligned}$$
(1.1)

where \(\mathrm{i}=\sqrt{-1}\) and \(V*\) is a Fourier multiplier

$$\begin{aligned} V*u = \sum _{j\in \mathbb {Z}} V_j u_j e^{\mathrm{i}j x},\quad {\left( V_j\right) }_{j\in \mathbb {Z}}\in {\mathtt {w}}^\infty _{q}, \end{aligned}$$

living in the weighted \(\ell ^\infty \) space

$$\begin{aligned} {\mathtt {w}}^\infty _{q}:=\{ V = {\left( V_j\right) }_{j\in \mathbb {Z}}\in \ell ^\infty \ \ | \quad |V|_{q}:= \sup _{j\in \mathbb {Z}}|V_j|\langle j \rangle ^{q}<\infty \}, \qquad {q}\ge 0, \end{aligned}$$

where \(\langle j \rangle :=\max \{ |j|,1\},\) while \(f(x,y)\) is \(2\pi \)-periodic and real analytic in x and is real analytic in y in a neighborhood of \(y=0\). We shall assume that \(f(x,y)\) has a zero at \(y=0\). By analyticity, for some \(\mathtt {a},R>0\) we have

$$\begin{aligned} f(x,y)= \sum _{d=1}^\infty f^{(d)}(x) y^d,\quad |f|_{{\mathtt {a}},R}:=\sum _{d=1}^\infty |f^{(d)}|_{\mathbb {T}_{\mathtt {a}}}R^d <\infty , \end{aligned}$$
(1.2)

where, given a real analytic function \(g(x)=\displaystyle \sum \nolimits _{j\in \mathbb {Z}}g_j e^{\mathrm{i}j x},\) we set (see Footnote 1) \( |g|^2_{\mathbb {T}_{{\mathtt {a}}}}:=\displaystyle \sum \nolimits _{j\in \mathbb {Z}}|g_j|^2e^{2\mathtt {a}|j|}.\) Note that if f is independent of x, then (1.2) reduces to

$$\begin{aligned} |f|_{R}:= \sum _{d=1}^\infty |f^{(d)}|R^d <\infty . \end{aligned}$$
(1.3)

Equation (1.1) is at least locally well-posed (say in a neighborhood of \(u=0\) in \(H^1\), see e.g. Lemma 5.4) and has an elliptic fixed point at \(u=0\), so that an extremely natural question is to understand stability times for small initial data. One can informally state the problem as follows: let \(E\subset H^1\) be some Banach space and consider (1.1) with initial datum \(u_0\) such that \(|u_0|_E\le \delta \ll 1\). By local well-posedness, the solution \(u(t,x)\) of (1.1) with such initial datum exists and is in \(H^1\).

We call stability time \(T=T(\delta )\) the supremum of the times t such that for all \(|u_0|_E\le \delta \) one has \(u(t,\cdot )\in E\) with \(|u(t,\cdot )|_E\le 2\delta \).

Computing the stability time \(T(\delta )\) is out of reach, so the goal is to give lower (and possibly upper) bounds.

A good comparison is with the case of a finite dimensional Hamiltonian system with a non-degenerate elliptic fixed point, which in the standard complex symplectic coordinates \(u_j= \frac{1}{\sqrt{2}}(q_j+ \mathrm{i}p_j)\) is described by the Hamiltonian

$$\begin{aligned} \sum _{j=1}^n \omega _j |u_j|^2 + O(u^3),\quad \text{ where } \omega _j\in \mathbb {R} \text{ are } \text{ the } {} { linear\,frequencies}. \end{aligned}$$
(1.4)

Here, if the frequencies \(\omega \) are sufficiently non-degenerate, say diophantine (see Footnote 2), then one can prove exponential lower bounds on \(T(\delta )\) and, if the nonlinearity satisfies some suitable hypothesis (e.g. convexity or steepness), even super-exponential ones. This was proved in [MG95] (see also the recent paper [BFN15] and references therein).

The strategy for obtaining exponential bounds is made of two main steps. The first one consists in the so-called Birkhoff normal form procedure: after \(\mathtt {N}\ge 1\) steps the Hamiltonian (1.4) is transformed into

$$\begin{aligned} \sum _{j=1}^n \omega _j |u_j|^2 +Z +R\, , \end{aligned}$$
(1.5)

where Z depends only on the actions \((|u_i|^2)_{i=1}^n\) while \( R= O(|u|^{2\mathtt {N}+3})\) contains terms of order at least \(2\mathtt {N}+ 3\) in \({\left| u\right| }\). It is well known that this procedure generically diverges in \(\mathtt {N}\), so the second step consists in finding \(\mathtt {N}=\mathtt {N}(\delta )\) which minimizes the size of the remainder R.

The problem of long-time stability for equations (1.1) has been studied by many authors. In the context of infinite chains with a finite range coupling, we mention [BFG88]. Regarding applications to PDEs (and particularly the NLS) the first results were given in [Bou96a] by Bourgain, who proved polynomial bounds for the stability times in the following terms: for any \(\mathtt {N}\) there exists \(p=p(\mathtt {N})\) such that initial data which are \(\delta \)-small in the \(H^{p'+p}\) norm stay small in the \(H^{p'}\) norm, for times of order \(\delta ^{-\mathtt {N}}\). Afterwards, Bambusi in [Bam99b] proved that superanalytic initial data stay small in analytic norm, for times of order \(e^{\ln (\frac{1}{\delta })^{1+b}}\), where \(b>0\).

Following the strategy proposed in [Bam03] for the Klein–Gordon equation, Bambusi and Grébert in [BG03] first considered Eq. (1.1) on \(\mathbb {T}^d\) and then, in [BG06], proved polynomial bounds for a class of tame-modulus PDEs, which includes (1.1). Their main result is that for any \(\mathtt {N}\gg 1\) there exists \(p(\mathtt {N}) \) (tending to infinity as \(\mathtt {N}\rightarrow \infty \)) such that for all \(p\ge p(\mathtt {N})\) and all \(\delta \)-small initial data in \(H^p\) one has \(T\ge C(\mathtt {N},p)\delta ^{-\mathtt {N}}\), provided \(\delta <\delta _0(\mathtt {N},p)\). Similar results were also proved for the Klein–Gordon equation on tori and Zoll manifolds in [DS04, DS06, BDGS07]. Subsequently, Faou and Grébert in [FG13] considered the case of analytic initial data and proved subexponential bounds of the form \(T\ge e^{\ln (\frac{1}{\delta })^{1+b}},\) \(b>0,\) for classes of NLS equations in \(\mathbb {T}^d\) (which include (1.1) by taking \(d=1\)). Regarding derivative NLS equations, the first results were in [YZ14] for the semilinear case. Recently, Feola and Iandoli in [FI] proved polynomial lower bounds for the stability times of reversible NLS equations with two derivatives in the nonlinearity.

A closely related topic is the study of orbital stability times close to periodic or quasi-periodic solutions of (1.1). In the case \(E=H^1\), Bambusi in [Bam99a] proved a lower bound of the form \(T\ge e^{\delta ^{-b}},\) \(b>0,\) for perturbations of the integrable cubic NLS close to a quasi-periodic solution. Regarding higher Sobolev norms, most results are in the periodic case; see [FGL13] (polynomial bounds for Sobolev initial data) and the preprint [MSW18] (subexponential bounds for Gevrey initial data).

A dual point of view is to construct special orbits for which the Sobolev norms grow as fast as possible (thus giving an upper bound on the stability times). As far as we are aware such results are mostly on \(\mathbb {T}^2\) and in parameterless cases (for instance [CKS+10, GK15, GHP16]) and the time scales involved are much longer than our stability times (see [Gua14] for the instability of (1.1) on \(\mathbb {T}^2\) and [Han14] for the instability of the plane wave in \(H^p\) with \(p<1\)).

In this paper we propose an abstract Birkhoff normal form result (see Theorem 1.3) on weighted sequence spaces (based on \(\ell ^2\)) and deduce from it stability estimates for initial data in analytic, Gevrey and Sobolev class. An important difference of our approach with respect to the aforementioned papers and one of the main motivations of our work is that we use a different diophantine non-resonance condition on the linear frequencies, originally introduced in [Bou05] in the context of almost-periodic solutions. More precisely set

$$\begin{aligned} \Omega _{q}:={\left\{ \omega ={\left( \omega _j\right) }_{j\in \mathbb {Z}}\in \mathbb {R}^\mathbb {Z},\quad \sup _j|\omega _j-j^2|\langle j \rangle ^{q}< 1/2 \right\} } \end{aligned}$$
(1.6)

and, for \(\gamma >0\), define the set of “good frequencies” as

$$\begin{aligned} {\mathtt {D}_{\gamma ,{q}}}:={\left\{ \omega \in \Omega _{q}\,:\, |\omega \cdot \ell |> \gamma \prod _{n\in \mathbb {Z}}\frac{1}{(1+|\ell _n|^2 \langle n \rangle ^{2+{q}})},\quad \forall \ell \in \mathbb {Z}^\mathbb {Z}: |\ell |<\infty \right\} }. \end{aligned}$$
(1.7)

It is known that \({\mathtt {D}_{\gamma ,{q}}}\) is large with respect to a natural probability product measure on \(\Omega _{q}\) (for a proof see [Bou05] or Lemma 4.1 in the present paper). It turns out that such diophantine conditions are very natural and easy to use in the context of PDEs on the circle with a superlinear dispersion law. From now on we shall fix \(\gamma >0\), \(q\ge 0\) and assume that \(\omega \in {\mathtt {D}_{\gamma ,{q}}}.\)
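To make the shape of condition (1.7) concrete, the following minimal Python sketch evaluates its right-hand side for a finitely supported integer vector \(\ell \) and tests the inequality for that single \(\ell \). The function names, the dictionary encoding of \(\ell \) and the sample frequencies are illustrative assumptions, not part of the paper; the sketch checks one vector at a time and says nothing about the measure of \({\mathtt {D}_{\gamma ,{q}}}\).

```python
from typing import Callable, Mapping


def bracket(j: int) -> int:
    """<j> := max(|j|, 1), as defined in the text."""
    return max(abs(j), 1)


def small_divisor_bound(ell: Mapping[int, int], gamma: float, q: float) -> float:
    """Right-hand side of (1.7) for a finitely supported integer vector ell.

    Factors with ell_n = 0 equal 1, so only the support of ell contributes.
    """
    bound = gamma
    for n, ell_n in ell.items():
        bound /= 1.0 + (ell_n ** 2) * bracket(n) ** (2.0 + q)
    return bound


def satisfies_bound(omega: Callable[[int], float], ell: Mapping[int, int],
                    gamma: float, q: float) -> bool:
    """Check |omega . ell| > bound for one given ell (not for all ell)."""
    divisor = abs(sum(omega(n) * ell_n for n, ell_n in ell.items()))
    return divisor > small_divisor_bound(ell, gamma, q)


# Hypothetical data: ell = e_1 - e_{-1} probes the divisor omega_1 - omega_{-1}.
ell = {1: 1, -1: -1}
unperturbed = lambda j: float(j * j)                          # omega_j = j^2
shifted = lambda j: j * j + {1: 0.1, -1: -0.2}.get(j, 0.0)    # sample perturbation

print(small_divisor_bound(ell, gamma=0.5, q=0.0))    # gamma / (2 * 2)
print(satisfies_bound(unperturbed, ell, 0.5, 0.0))   # exact resonance, fails
print(satisfies_bound(shifted, ell, 0.5, 0.0))
```

With the unperturbed frequencies \(\omega _j=j^2\) the divisor \(\omega _1-\omega _{-1}\) vanishes, which illustrates why the external parameters are needed to remove exact resonances.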

Remark 1.1

We note that some non-resonance condition on the frequencies is inevitable if one wants to prove long-time stability. Indeed, if one takes \(V=0\) and \(f(x,|u|^2)=|u|^4\), then one can exhibit orbits in which the Sobolev norm is unstable in times of order \(\delta ^{-4}\), see [GT12, HP17].

At the formal level our BNF scheme is identical to the one used in finite dimensional systems, see formula (1.5). The fact that such a scheme may be applied in an infinite dimensional context follows from introducing a suitable norm (see Definition 1.2 and the comments thereafter); it turns out that our norm has explicit (and for us quite surprising) immersion properties (see Proposition 3.1) and allows good bounds on the solution of the homological equation (see Lemma 4.2). The gist of these properties is that they ensure that any vector field mapping (a neighborhood of) a given Hilbert space into itself also maps (smaller neighborhoods of) more regular Hilbert spaces into themselves. Analogously, the vector field solving the homological equation also maps sufficiently more regular Hilbert spaces into themselves.

To show that our procedure works in significant cases, we have computed stability times for various regularity classes. More precisely we improve the results in [FG13] on analytic and Gevrey initial data, see Theorem 1.1. Moreover we recover [BG06] on Sobolev initial data, giving an explicit control on the dependence of the stability time and of the smallness condition on the regularity, see Proposition 1.1 and the improved estimates of Theorem 1.2.

Comments on possible generalizations. In this paper we have considered the simplest possible example of dispersive PDE on the circle. One can easily see that the same strategy can be followed word by word in more general cases provided that the non-linearity does not contain derivatives and that the dispersion law is superlinear. A much more challenging question is to consider NLS models with derivatives in the non-linearity. As we have mentioned, a semilinear case was discussed by [CMW]. A very promising approach to Birkhoff normal form for quasilinear PDEs is the one of [BD18, Del12], which was applied to fully-nonlinear reversible NLS equations in [FI]. It seems very plausible (at least in the reversible case) that one can adapt their methods (based on paralinearizations and paradifferential calculus) to our setting.

A natural generalization would be the extension to higher dimensions. While the immersion properties would work essentially in the same way, the diophantine condition should be adapted, for instance one could use the condition in [FG13].

Equation (1.1) contains infinitely many external parameters. Of course one would like to consider parameterless equations as in the very interesting recent preprint [BFG18]. In this direction a natural question would be to understand if one could impose similar diophantine conditions by tuning only one parameter such as the mass in the beam or wave equations (see, e.g., [Bam03, BD18]).

Before explaining the abstract BNF procedure in detail let us describe our stability results.

1.1 Stability results

Analytic and Gevrey initial data. Our result is similar to [FG13] in the sense that we also prove subexponential bounds on the time. We mention however that in [FG13] the control of the Sobolev norm in time is in a lower regularity space with respect to the initial datum. Recently we have been made aware of a preprint by Cong, Mi and Wang [CMW] in which the authors give subexponential bounds for Gevrey initial data of a model like (1.1), very similar to ours. A difference is that in their case the nonlinearity contains a derivative (see the comments after Theorem 1.1) but satisfies momentum conservation. The two results were obtained independently and at the same time; in any case, the overall strategies of proof are quite different. In particular our result is a consequence of the general Birkhoff Normal Form Theorem 1.3 and the non-resonance conditions are different (recall (1.7)).

To state our result, let us fix \( 0<\theta <1\), and define the function space (see Footnote 3)

$$\begin{aligned} {\mathtt {H}}_{p,s,a}:= {\left\{ u(x)=\sum _{j\in \mathbb {Z}} u_j e^{\mathrm{i}j x}\in L^2\,:\, |u|_{p,s,a}^2:= \sum _{j\in \mathbb {Z}}{\left| u_j\right| }^2 \langle j \rangle ^{2 p}e^{2 a {\left| j\right| }+ 2s\langle j \rangle ^{\theta }}< \infty \right\} }.\nonumber \\ \end{aligned}$$
(1.8)

with the assumption \(a\ge 0, s>0, p>1/2\). We remark that if \(a>0\) this is a space of analytic functions, while if \(a=0\) the functions have Gevrey regularity. Note that for technical reasons connected to the way in which we control the small divisors, we cannot deal with the purely analytic case \(\theta =1\), see Lemmas 6.1 and 7.1. For this reason we denote this result as \(\mathtt {G}\) (Gevrey case).

Our result, stated below, depends on some constants \({{\varvec{\delta }}_\mathtt {G}},\mathtt {T}_\mathtt {G}\), explicitly defined in Subsection A, and depending only on \(\gamma , {q},{\mathtt {a}},R,|f|_{{\mathtt {a}},R}, p,s,a,\theta \).

Theorem 1.1

(Gevrey Stability). Fix any \(a\ge 0\), \(s>0\) such that \(a+s< \mathtt {a}\) and any \(p>1/2\). For any \(0<\delta \le {{\varvec{\delta }}_\mathtt {G}}\) and any \(u_0\) such that

$$\begin{aligned} |u_0|_{p,s,a} \le \delta , \end{aligned}$$

the solution u(t) of (1.1) with initial datum \(u(0)=u_0\) exists and satisfies

$$\begin{aligned} |u(t)|_{p,s,a} \le 2\delta \quad { for\ all\ times}\quad |t|\le \frac{\mathtt {T}_\mathtt {G}}{\delta ^2} e^{{\left( \ln \frac{{{\varvec{\delta }}_\mathtt {G}}}{\delta }\right) }^{1+\theta /4}} . \end{aligned}$$
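To get a feeling for how the time bound of Theorem 1.1 compares with a polynomial bound \(\delta ^{-\mathtt {N}}\), the sketch below evaluates the logarithm of the bound and the "effective degree" \(\mathtt {N}\) for which \(e^{(\ln ({{\varvec{\delta }}_\mathtt {G}}/\delta ))^{1+\theta /4}}=({{\varvec{\delta }}_\mathtt {G}}/\delta )^{\mathtt {N}}\). The numerical values used for \({{\varvec{\delta }}_\mathtt {G}}\) and \(\mathtt {T}_\mathtt {G}\) are placeholders chosen for illustration only; the actual constants are those of the paper and are not reproduced here.

```python
import math


def log_gevrey_time(delta: float, delta_G: float, T_G: float, theta: float) -> float:
    """Natural log of the time bound in Theorem 1.1 (log avoids overflow)."""
    if not (0.0 < delta <= delta_G and 0.0 < theta < 1.0):
        raise ValueError("need 0 < delta <= delta_G and 0 < theta < 1")
    big_log = math.log(delta_G / delta)
    return math.log(T_G) - 2.0 * math.log(delta) + big_log ** (1.0 + theta / 4.0)


def effective_degree(delta: float, delta_G: float, theta: float) -> float:
    """N such that exp(L^(1 + theta/4)) = (delta_G / delta)^N, with L = ln(delta_G / delta)."""
    big_log = math.log(delta_G / delta)
    return big_log ** (theta / 4.0)


# Placeholder constants (assumptions, not the paper's values).
delta_G, T_G, theta = 1.0, 1.0, 0.5
for delta in (1e-5, 1e-10, 1e-20):
    print(delta, log_gevrey_time(delta, delta_G, T_G, theta),
          effective_degree(delta, delta_G, theta))
```

The effective degree grows without bound as \(\delta \rightarrow 0\), although slowly, which is the sense in which the estimate is stronger than any fixed polynomial bound.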

Remark 1.2

Some comments on Theorem 1.1 are in order.

  1.

    The main point in the proof is to verify that the abstract Birkhoff Normal Form Theorem 1.3 is applicable. Then we put the Hamiltonian of the NLS in Birkhoff normal form:

    $$\begin{aligned} \sum _{j\in \mathbb {Z}} \omega _j |u_j|^2 +Z +R\, , \end{aligned}$$
    (1.9)

    where Z depends only on the actions \((|u_i|^2)_{i\in \mathbb {Z}}\) while \( R= O(|u|^{2\mathtt {N}+3})\) is analytic in a ball centered at zero of \({\mathtt {h}}_{p,s,a}\) and has a zero of order at least \(2\mathtt {N}+ 3\) in \({u}=0\). Then we find \(\mathtt {N}=\mathtt {N}(\delta )\) which minimizes the size of the remainder R.

  2.

    We did not make an effort to maximize the exponent \(1+\theta /4\) in the stability time. In fact, by trivially modifying the proof, one could get \(1+\theta /(2^+)\). We remark that in [CMW], in which \(\theta =1/2\), the exponent is better, i.e. it is \(1+1/(2^+)\).

Sobolev initial data. Here our first goal was to recover by our methods the result of [BG06], computing explicitly all the constants in the estimates. In particular it is fundamental to have a good control on the dependence of the stability time T on the regularity p. Indeed there are two natural ways of taking a small ball around zero: reducing the size \(\delta \) or increasing the regularity p. A crucial point is that, in the case of Sobolev regularity, the number of BNF steps that one may perform is (apparently unavoidably) tied to the regularity p. This is clearly seen in [BG06], where the number of steps is \(\sim \sqrt{p}\). It seemed an interesting point to verify how our approach worked in such a case, and whether we would see the same phenomenon.

As before, our estimates depend on some constants, denoted by \(\tau _\mathtt {S},{{\varvec{\delta }}_\mathtt {S}},\mathtt {k}_\mathtt {S}, \mathtt {T}_{\mathtt {S}}\), explicitly defined in “Appendix A”. These constants depend only on \(\gamma ,{q},{\mathtt {a}},R,|f|_{{\mathtt {a}},R} \).

Proposition 1.1

(A quantitative version of [BG06]). Consider Eq. (1.1) with f satisfying (1.2) for \({\mathtt {a}},R>0\). For any \(p\ge 3\tau _\mathtt {S}+ 1\) and any initial datum \(u(0)=u_0\) satisfying

$$\begin{aligned} |u_0|_{H^p}:= |u_0|_{L^2}+ |\partial _x^p u_0|_{L^2} \le \delta \le {{\varvec{\delta }}_\mathtt {S}}({\mathtt {k}_\mathtt {S}p})^{ -3 p } \end{aligned}$$
(1.10)

the solution u(t) of (1.1) with initial datum \(u(0)=u_0\) exists and satisfies

$$\begin{aligned} |u(t)|_{H^p} \le 4\delta \quad { for\ all\ times}\quad |t|\le {\mathtt {T}_\mathtt {S}} p^{ -5 p } \left( \frac{{{\varvec{\delta }}_\mathtt {S}}}{\delta }\right) ^{\frac{2(p-1)}{{\tau _\mathtt {S}}}} . \end{aligned}$$
(1.11)

Remark 1.3

Also in this result we just have to verify the hypotheses of Theorem 1.3. However, as happens in [BG06], the maximum number \(\mathtt {N}\) of steps of BNF we can perform depends on p, in particular \(\mathtt {N}= [\frac{p-1}{{\tau _\mathtt {S}}}]\). This is in fact slightly better than the previously cited paper (\(\mathtt {N}\sim p\) instead of \(\sqrt{p}\)). On the other hand it is not difficult to show that the bound \(\delta \le {{\varvec{\delta }}_\mathtt {S}}({\mathtt {k}_\mathtt {S}p})^{ -3 p }\) is essentially optimal (see Remark 10.1).

Looking at the proof of the Theorem or even constructing other finite-dimensional models, one can see that in the translation invariant case, the very restrictive smallness condition in (1.10) is only due to interactions between the modes \(0,1,-1\) and all the others. It then seems natural to consider initial data for which the energy on such modes is smaller, namely \(|u_0|_{L^2} \le 2^{-p}\delta \). We refer to this case as \(\mathtt {M}\); the relevant constants can be found in “Appendix A”.

Theorem 1.2

Consider Eq. (1.1) with f independent of x and satisfying (1.3) for \(R>0\). For any \(p>3\tau _\mathtt {M}+1\) and for any initial datum \(u(0)=u_0\) satisfying

$$\begin{aligned} | u_0|_{H^p} \le \delta \le \frac{{{\varvec{\delta }}_\mathtt {M}}}{\sqrt{p}},\quad |u_0|_{L^2} \le 2^{-p}\delta \end{aligned}$$
(1.12)

the solution u(t) of (1.1) exists and satisfies

$$\begin{aligned} | u(t)|_{H^p} \le 8\delta \quad { for\ all\ times}\quad |t|\le {\mathtt {T}_\mathtt {M}} \left( \frac{8 {{\varvec{\delta }}_\mathtt {M}}^2}{(p-1) \delta ^2} \right) ^{\frac{p-1}{{\tau _\mathtt {M}}}} . \end{aligned}$$
(1.13)

Remark 1.4

Note that, since the \(L^2\) norm is a constant of motion, one trivially has \(| u(t)|_{L^2}\le 2^{-p}\delta \). Comparing with (1.11), we see that the time estimate is more or less the same but now it holds in a much bigger neighborhood of zero (\(\delta \le p^{-1/2}\) instead of \(\le p^{-3p}\)).

If one requires a stronger condition on the \(L^2\) norm, i.e., \(|u_0|_{L^2} \le 3^{-p}\delta ,\) it turns out that the size of the perturbation is exponentially decreasing in p and, therefore, keeping \(\delta \) fixed and sending p to infinity one immediately obtains stability.

The main difference between the Gevrey and Sobolev cases is that in the latter the number of BNF steps \(\mathtt {N}\) depends on the regularity, while in the former it is independent. Thus in the Sobolev case we cannot fix both \(\delta \) and p and optimize in \(\mathtt {N}\). What we can do is to fix \(\delta \) and find an optimal regularity \(p(\delta )\), which maximizes the stability time. It turns out that the two cases \(\mathtt {S}\) and \(\mathtt {M}\) behave differently. Indeed the weaker smallness condition (1.12) allows us to take much bigger \(p(\delta ),\) obtaining much longer stability times. As before our statements depend on some constants, denoted by \(\bar{\delta }_\mathtt {S}, \bar{\delta }_\mathtt {M}\), explicitly defined in Subsection A.

Corollary 1.1

(Sobolev stability: optimization).

\((\mathtt {S})\) For any \(0<\delta \le \bar{\delta }_\mathtt {S}\) and any \(u_0\) such that

$$\begin{aligned} |u_0|_{H^p} \le \delta , \quad p=p(\delta ):=1+\frac{\ln ({{\varvec{\delta }}_\mathtt {S}}/\delta )}{6 \ln \ln ({{\varvec{\delta }}_\mathtt {S}}/\delta )} , \end{aligned}$$
(1.14)

the solution u(t) of (1.1) with initial datum \(u(0)=u_0\) exists and satisfies

$$\begin{aligned} |u(t)|_{H^p} \le 4\delta \quad { for\ all\ times}\quad |t|\le {\mathtt {T}_\mathtt {S}} e^{ \ \frac{\ln ^2 ({{\varvec{\delta }}_\mathtt {S}}/\delta )}{4{\tau _\mathtt {S}}\ln \ln ({{\varvec{\delta }}_\mathtt {S}}/\delta )}} . \end{aligned}$$
(1.15)

\((\mathtt {M})\) Assume that f in (1.1) is independent of x. For any \(0<\delta \le {\bar{\delta }}_\mathtt {M}\) and

$$\begin{aligned} \forall \, p\ge p(\delta ):= \frac{{{\varvec{\delta }}_\mathtt {M}}^2}{ \delta ^2}, \quad \forall u_0 \quad \text {s.t.}\quad |u_0|_{H^p} \le \delta ,\quad |u_0|_{L^2} \le 2^{-p}\delta ,\qquad \qquad \, \end{aligned}$$
(1.16)

the solution u(t) of (1.1) with initial datum \(u(0)=u_0\) exists and satisfies

$$\begin{aligned} |u(t)|_{H^p} \le 8\delta \quad { for\ all\ times}\quad |t|\le {\mathtt {T}_\mathtt {M}} e^{ \frac{{{\varvec{\delta }}_\mathtt {M}}^2}{ {\tau _\mathtt {M}}\delta ^2 }} . \end{aligned}$$
(1.17)
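The passage from (1.13) to (1.17) can be checked by hand: with \(p=p(\delta )={{\varvec{\delta }}_\mathtt {M}}^2/\delta ^2\) the base in (1.13) equals \(8p/(p-1)\ge 8\), so the exponent is at least \((p-1)\ln 8/\tau _\mathtt {M}\), which exceeds \(p/\tau _\mathtt {M}\) as soon as \(p\ge \ln 8/(\ln 8-1)\approx 1.93\). The following sketch verifies this inequality numerically; the values assigned to \({{\varvec{\delta }}_\mathtt {M}},\mathtt {T}_\mathtt {M},\tau _\mathtt {M}\) are hypothetical placeholders, not the constants of “Appendix A”.

```python
import math


def log_time_1_13(p: float, delta: float, delta_M: float,
                  T_M: float, tau_M: float) -> float:
    """Natural log of the time bound in (1.13)."""
    ratio = 8.0 * delta_M ** 2 / ((p - 1.0) * delta ** 2)
    return math.log(T_M) + (p - 1.0) / tau_M * math.log(ratio)


def log_time_1_17(delta: float, delta_M: float, T_M: float, tau_M: float) -> float:
    """Natural log of the time bound in (1.17)."""
    return math.log(T_M) + delta_M ** 2 / (tau_M * delta ** 2)


# Placeholder constants (assumptions for illustration only).
delta_M, T_M, tau_M = 1.0, 1.0, 3.0
for delta in (0.2, 0.1, 0.05):
    p = delta_M ** 2 / delta ** 2          # the choice p(delta) in (1.16)
    lhs = log_time_1_13(p, delta, delta_M, T_M, tau_M)
    rhs = log_time_1_17(delta, delta_M, T_M, tau_M)
    print(f"delta={delta}: p={p:.1f}, log of (1.13)={lhs:.2f}, log of (1.17)={rhs:.2f}")
```

In each sampled case the bound (1.13) evaluated at \(p(\delta )\) is larger than the one stated in (1.17), consistent with the computation above.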

Remark 1.5

Some remarks on Corollary 1.1 are in order.

Note that (1.15) is the stability time computed in [BFG88] for short range couplings.

  1.

    We will prove the case \(\mathtt {M}\) only for \(p= p(\delta )\); the general case \(p\ge p(\delta )\) is analogous (see Footnote 4), with the same constants.

  2.

    One can easily restate Corollary 1.1 in terms of the Sobolev exponent p, instead of \(\delta \), since the map \(\delta \rightarrow p(\delta )\) is injective.

Remark 1.6

(finite dimensional examples). It is interesting to compute the stability times predicted by our theorems for initial data supported on a finite number of modes. To this purpose consider an initial datum \(u^{(0)}\) uniformly distributed over the modes \(1,\dots ,j\):

$$\begin{aligned} |u^{(0)}_i|= {\varepsilon },\quad \forall i=1,\dots ,j. \end{aligned}$$

Theorem 1.1 with \(a=0,p=1\) states that if \({\varepsilon }\le {\varepsilon }_\mathtt {G}:= \delta _\mathtt {G}e^{-2{j}^\theta }\) then u(t) stays stable, in Gevrey norm, for times of order \( e^{{\left( \ln \frac{{\varepsilon }_\mathtt {G}}{{\varepsilon }}\right) }^{1+\theta /4}}\).

Now if \({\varepsilon }\le {\varepsilon }_\mathtt {M}(p):= \delta _{\mathtt {M}}{j}^{-p-1}/\sqrt{p}\) we have \(|u_0|_{H^p}<\delta \) and \(|u_0|_{L^2}<2^{-p}\delta \); then by Theorem 1.2 the solution u(t) stays stable, in \(H^p\) norm, for times of order \(T\sim {\left( \frac{{\varepsilon }_\mathtt {M}(p)}{{\varepsilon }}\right) }^{2(p-1)/\tau _\mathtt {M}}\). Maximizing the time in p with fixed \({\varepsilon }\) we get

$$\begin{aligned} p \sim \frac{\ln (\frac{\delta _{\mathtt {M}}^2}{{\varepsilon }}) }{2 \ln j},\quad T\sim e^{\frac{(\ln {\varepsilon }^{-1})^2}{\ln j}} \end{aligned}$$
(1.18)

provided that \({\varepsilon }\lesssim j^{-7\tau _\mathtt {M}}\). Explicitly we get a weaker constraint on \({\varepsilon }\) and a better time estimate. Of course one could play the same game directly with the estimate of Proposition 1.1. As expected, the time estimate is more or less the same as (1.18) but the smallness condition is much stronger, i.e. of the type \({\varepsilon }\lesssim e^{-2{j}^\theta }\).

1.2 The abstract Birkhoff Normal Form

We start by setting our functional framework. The main point is to introduce a weighted majorant norm which penalizes the terms in the Hamiltonian which do not preserve momentum, see Definition 1.1.

Let us pass to the Fourier side via the identification

$$\begin{aligned} u(x)=\sum _{j\in \mathbb {Z}}u_j e^{\mathrm{i}jx}\ \mapsto \ u=(u_j)_{j\in \mathbb {Z}}, \end{aligned}$$
(1.19)

where u belongs to some complete subspace of \(\ell ^2\). Fix the symplectic structure to be

$$\begin{aligned} \mathrm{i}\sum _j d u_j\wedge d {\bar{u}}_j. \end{aligned}$$
(1.20)

In this framework the Hamiltonian of (1.1) is

$$\begin{aligned} H_{\mathrm{NLS}}(u):= & {} D_\omega +P, \quad \mathrm{where}\nonumber \\ D_\omega:= & {} \sum _{j\in \mathbb {Z}} \omega _j |u_j|^2, \quad \nonumber \\ P:= & {} \int _\mathbb {T}F(x,|u(x)|^2) dx ,\quad F(x,y):=\int _0^y f(x,s) ds. \end{aligned}$$
(1.21)

We shall always work with quite regular solutions; given a real sequence \(\mathtt {w}=(\mathtt {w}_j)_{j\in \mathbb {Z}},\) with \(\mathtt {w}_j\ge 1\) let us set the Hilbert space (see Footnote 5)

$$\begin{aligned} {\mathtt {h}}_{\mathtt {w}} := {\left\{ u:= {\left( u_j\right) }_{j\in \mathbb {Z}}\in \ell ^2(\mathbb {C})\,: \quad {\left| u\right| }_{{\mathtt {w}}}^2 := \sum _{j\in \mathbb {Z}} \mathtt {w}_j^2 {\left| u_j\right| }^2 < \infty \right\} }. \end{aligned}$$
(1.22)

As examples of \({\mathtt {h}}_{\mathtt {w}}\) we consider:

\((\mathtt {G})\)

(Gevrey case) \(\mathtt {w}(p,s,a):={\left( \langle j \rangle ^{ p}e^{ a {\left| j\right| }+ s\langle j \rangle ^{\theta }}\right) }_{j\in \mathbb {Z}},\) which is isometrically isomorphic, by Fourier transform, to \(\mathtt {H}_{p,s,a}\) defined in (1.8).

\((\mathtt {S})\)

(Sobolev case) \({\mathtt {w}}(p):= {\mathtt {w}}(p,0,0)={\left( \langle j \rangle ^{p}\right) }_{j\in \mathbb {Z}}\), which is isometrically isomorphic, by Fourier transform, to \(\mathtt {H}_{p,0,0}\) defined in (1.8) and is equivalent to \(H^p\) equipped with the norm \(|\cdot |_{L^2} +|\partial _x^p \cdot |_{L^2}\) with equivalence constants independent of p (see (5.28)).

\((\mathtt {M})\)

(Modified-Sobolev case) \({\mathtt {w}}_j= \lfloor j \rfloor ^{p}, \) where \(\lfloor j \rfloor := \max \{|j|,2\}\); this space is equivalent to \(H^p\) equipped with the norm \(2^p|\cdot |_{L^2} +|\partial _x^p \cdot |_{L^2}\) with equivalence constants independent of p (see (5.30)).

Here and in the following, given \(r>0\), by \(B_r({\mathtt {h}}_{\mathtt {w}})\) we mean the closed ball of radius r centered at the origin of \({\mathtt {h}}_{\mathtt {w}}.\)
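The three families of weights can be written down directly from the definitions above. The sketch below is only an illustration on finitely supported sequences: the function names and the dictionary encoding of \(u=(u_j)_{j\in \mathbb {Z}}\) are assumptions made for this example.

```python
import math
from typing import Callable, Mapping


def weight_G(j: int, p: float, s: float, a: float, theta: float) -> float:
    """Gevrey weight <j>^p * exp(a|j| + s<j>^theta), with <j> = max(|j|, 1)."""
    bj = max(abs(j), 1)
    return bj ** p * math.exp(a * abs(j) + s * bj ** theta)


def weight_S(j: int, p: float) -> float:
    """Sobolev weight <j>^p."""
    return float(max(abs(j), 1)) ** p


def weight_M(j: int, p: float) -> float:
    """Modified-Sobolev weight max(|j|, 2)^p."""
    return float(max(abs(j), 2)) ** p


def norm_hw(u: Mapping[int, complex], weight: Callable[[int], float]) -> float:
    """Norm (1.22) of a finitely supported sequence u = {j: u_j}."""
    return math.sqrt(sum(weight(j) ** 2 * abs(u_j) ** 2 for j, u_j in u.items()))


# A sample sequence supported on the modes 0 and 3 (hypothetical data).
u = {0: 1.0, 3: 0.5j}
p = 3
print(norm_hw(u, lambda j: weight_S(j, p)))
print(norm_hw(u, lambda j: weight_M(j, p)))   # mode 0 now carries the weight 2^p
print(norm_hw(u, lambda j: weight_G(j, p, s=0.1, a=0.0, theta=0.5)))
```

The only difference between the cases \(\mathtt {S}\) and \(\mathtt {M}\) is on the modes \(0,\pm 1\), where the weight \(2^p\) replaces \(1\); this matches the factor \(2^p\) in front of the \(L^2\) norm in the equivalent norm of case \(\mathtt {M}\).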

In the following we always consider Hamiltonians \( H : B_r({\mathtt {h}}_{\mathtt {w}}) \rightarrow \mathbb {R}\) such that there exists a pointwise absolutely convergent power series expansion (see Footnote 6)

$$\begin{aligned} H(u) = \sum _{\begin{array}{c} {\varvec{{\alpha }}},{\varvec{{\beta }}}\in {\mathbb {N}}^\mathbb {Z}, \\ |{\varvec{{\alpha }}}|+|{\varvec{{\beta }}}|<\infty \end{array} }H_{{\varvec{{\alpha }}},{\varvec{{\beta }}}}u^{\varvec{{\alpha }}}{\bar{u}}^{\varvec{{\beta }}}, \qquad u^{\varvec{{\alpha }}}:=\prod _{j\in \mathbb {Z}}u_j^{{\varvec{{\alpha }}}_j} \end{aligned}$$

with the following properties:

  1. (i)

    Reality condition:

    $$\begin{aligned} H_{{\varvec{{\alpha }}},{\varvec{{\beta }}}}= \overline{ H}_{{\varvec{{\beta }}},{\varvec{{\alpha }}}}, \end{aligned}$$
    (1.23)

    this means that H is real analytic in the real and imaginary part of u (see section 2);

  2. (ii)

    Mass conservation:

    $$\begin{aligned} H_{{\varvec{{\alpha }}},{\varvec{{\beta }}}}= 0 \quad \text{ if }\,\, |{\varvec{{\alpha }}}|\ne |{\varvec{{\beta }}}| , \end{aligned}$$
    (1.24)

    namely the Hamiltonian Poisson commutes with the mass \(\sum _{j\in \mathbb {Z}}|u_j|^2\).

The Hamiltonian functions being defined modulo a constant term, we shall assume without loss of generality that \(H(0)=0\).

We say that a Hamiltonian H as above preserves momentum when

$$\begin{aligned} H_{{\varvec{{\alpha }}},{\varvec{{\beta }}}}=0\quad \text{ if }\quad \pi ({\varvec{{\alpha }}}- {\varvec{{\beta }}}) := \sum _{j\in \mathbb {Z}}j{\left( {\varvec{{\alpha }}}_j - {\varvec{{\beta }}}_j\right) } \ne 0, \end{aligned}$$
(1.25)

namely the Hamiltonian H Poisson commutes with \(\sum _{j\in \mathbb {Z}} j{\left| u_j\right| }^2\). Note that if the nonlinearity f in Eq. (1.1) does not depend on the variable x, then the Hamiltonian P in (1.21) preserves momentum.
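Conditions (1.24) and (1.25) are simple arithmetic checks on the multi-indices of each monomial \(u^{\varvec{{\alpha }}}{\bar{u}}^{\varvec{{\beta }}}\). A minimal sketch, assuming multi-indices are encoded as dictionaries from modes to exponents (an encoding chosen here for illustration):

```python
from typing import Mapping

MultiIndex = Mapping[int, int]   # finitely supported alpha or beta in N^Z


def degree(alpha: MultiIndex) -> int:
    """|alpha| = sum of the exponents."""
    return sum(alpha.values())


def momentum(alpha: MultiIndex, beta: MultiIndex) -> int:
    """pi(alpha - beta) = sum_j j * (alpha_j - beta_j), as in (1.25)."""
    modes = set(alpha) | set(beta)
    return sum(j * (alpha.get(j, 0) - beta.get(j, 0)) for j in modes)


def conserves_mass(alpha: MultiIndex, beta: MultiIndex) -> bool:
    """Monomial allowed by (1.24): |alpha| = |beta|."""
    return degree(alpha) == degree(beta)


def preserves_momentum(alpha: MultiIndex, beta: MultiIndex) -> bool:
    """Monomial allowed by (1.25): pi(alpha - beta) = 0."""
    return momentum(alpha, beta) == 0


# u_1 u_{-1} conj(u_0)^2: conserves mass and momentum.
print(conserves_mass({1: 1, -1: 1}, {0: 2}), preserves_momentum({1: 1, -1: 1}, {0: 2}))
# u_2 conj(u_0): conserves mass, but has momentum 2.
print(conserves_mass({2: 1}, {0: 1}), momentum({2: 1}, {0: 1}))
# u_1 u_2 conj(u_3): zero momentum, but violates mass conservation.
print(conserves_mass({1: 1, 2: 1}, {3: 1}), momentum({1: 1, 2: 1}, {3: 1}))
```

The three sample monomials show that the two conditions are independent of each other.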

Definition 1.1

(\(\eta \)-majorant analytic Hamiltonians). For \(\eta \ge 0, r>0\) let \(\mathcal {A}_{r,\eta }({\mathtt {h}}_{\mathtt {w}})\) be the space of Hamiltonians as above such that the \(\eta \)-majorant

$$\begin{aligned} \underline{ H}_\eta (u):= \sum _{{\varvec{{\alpha }}},{\varvec{{\beta }}}\in {\mathbb {N}}^\mathbb {Z}} {\left| {H}_{{\varvec{{\alpha }}},{\varvec{{\beta }}}}\right| }e^{\eta |\pi ({\varvec{{\alpha }}}-{\varvec{{\beta }}})|}{u^{\varvec{{\alpha }}}}{\bar{u}^{\varvec{{\beta }}}}\end{aligned}$$
(1.26)

is point-wise absolutely convergent on \(B_r({\mathtt {h}}_{\mathtt {w}})\). If \(\eta =0\) we write \(\underline{ H} (u):=\underline{ H}_0 (u)\) and call it the majorant of H.

The exponential weight \(e^{\eta |\pi ({\varvec{{\alpha }}}-{\varvec{{\beta }}})|}\) is added in order to ensure that the monomials which do not preserve momentum have an exponentially small coefficient.

We will say that a Hamiltonian \(H(u)\in \mathcal {A}_{r,\eta }({\mathtt {h}}_{\mathtt {w}})\) is \(\eta \)-regular if \(X_{\underline{ H}_\eta } : B_r({\mathtt {h}}_{\mathtt {w}}) \rightarrow {\mathtt {h}}_{\mathtt {w}}\) and is uniformly bounded, where \({X}_{{{\underline{H}}}_\eta }\) is the vector field associated to the \(\eta \)-majorant Hamiltonian in (1.26). More precisely we give the following

Definition 1.2

(\(\eta \)-regular Hamiltonians). For \(\eta \ge 0, r>0\) let \({{\mathcal {H}}}_{r,\eta }({\mathtt {h}}_{\mathtt {w}})\) be the subspace of \({{\mathcal {A}}}_{r,\eta }({\mathtt {h}}_{\mathtt {w}})\) of those Hamiltonians H such that

$$\begin{aligned} |H|_{{{\mathcal {H}}}_{r,\eta }({\mathtt {h}}_{\mathtt {w}})} = |H|_{r,\eta ,{\mathtt {w}}} := r^{-1} {\left( \sup _{{\left| u\right| }_{{\mathtt {h}}_{\mathtt {w}}}\le r} {\left| {X}_{{{\underline{H}}}_\eta }\right| }_{{\mathtt {h}}_{\mathtt {w}}} \right) } < \infty . \end{aligned}$$

We shall show in Sect. 2 that this guarantees that the Hamiltonian flow of H exists at least locally and generates a symplectic transformation on \({\mathtt {h}}_{\mathtt {w}}\).

Remark 1.7

Definition 1.2 with \(\eta =0\), i.e. the idea of controlling an analytic function through the sup of its Cauchy majorant, dates back to Cauchy-Kovalevskaya. In the context of analytic functions on Hilbert spaces, this class of functions is defined and studied, with a slightly different approach, in [Nik86] and [KP10], where it is referred to as “normally analytic” functions.

Regarding the idea of introducing a weight which penalizes monomials which do not preserve momentum, this was used already in [Bam03].

In our work the crucial point is that all the dependence on the parameters \(r,\eta ,{\mathtt {w}}\) of the norm in Definition 1.2 can be encoded in the coefficients

$$\begin{aligned} c^{(j)}_{r,\eta ,{\mathtt {w}}}({\varvec{{\alpha }}},{\varvec{{\beta }}}) := r^{|{\varvec{{\alpha }}}|+|{\varvec{{\beta }}}|-2}e^{\eta |\pi ({\varvec{{\alpha }}}-{\varvec{{\beta }}})|} \frac{{\mathtt {w}}_j^2}{{\mathtt {w}}^{{\varvec{{\alpha }}}+{\varvec{{\beta }}}}},\qquad {\mathtt {w}}^{{\varvec{{\alpha }}}+{\varvec{{\beta }}}} = \prod _{j\in \mathbb {Z}}{\mathtt {w}}_j^{{\varvec{{\alpha }}}_j+{\varvec{{\beta }}}_j} \end{aligned}$$
(1.27)

defined for any \({\varvec{{\alpha }}},{\varvec{{\beta }}}\in {\mathbb {N}}^\mathbb {Z}\) and \(j\in \mathbb {Z}\) (see formula (3.1) and Lemma 3.1). This allows us to give a simple and explicit condition which guarantees the immersion \( {{\mathcal {H}}}_{r,\eta }({\mathtt {h}}_{\mathtt {w}}) \subseteq {{\mathcal {H}}}_{r',\eta '}({\mathtt {h}}_{{\mathtt {w}}'})\) in terms of the ratio of the coefficients \(c^{(j)}_{r,\eta ,{\mathtt {w}}}({\varvec{{\alpha }}},{\varvec{{\beta }}})\), \(c^{(j)}_{r',\eta ',{\mathtt {w}}'}({\varvec{{\alpha }}},{\varvec{{\beta }}})\), see Proposition 3.1.
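The coefficients (1.27) are explicit, so they can be evaluated for any given monomial. The sketch below computes \(c^{(j)}_{r,\eta ,{\mathtt {w}}}({\varvec{{\alpha }}},{\varvec{{\beta }}})\) and the ratio between two choices of parameters. It only evaluates the formula; the precise condition on the ratio is the one stated in Proposition 3.1 and is not reproduced here. The dictionary encoding of multi-indices and the helper names are assumptions made for this example.

```python
import math
from typing import Callable, Mapping

MultiIndex = Mapping[int, int]


def momentum(alpha: MultiIndex, beta: MultiIndex) -> int:
    """pi(alpha - beta) = sum_j j * (alpha_j - beta_j)."""
    return sum(k * (alpha.get(k, 0) - beta.get(k, 0)) for k in set(alpha) | set(beta))


def coefficient(j: int, alpha: MultiIndex, beta: MultiIndex, r: float, eta: float,
                weight: Callable[[int], float]) -> float:
    """c^{(j)}_{r,eta,w}(alpha, beta) as in (1.27)."""
    total_degree = sum(alpha.values()) + sum(beta.values())
    w_product = 1.0                      # w^{alpha + beta}
    for k in set(alpha) | set(beta):
        w_product *= weight(k) ** (alpha.get(k, 0) + beta.get(k, 0))
    return (r ** (total_degree - 2) * math.exp(eta * abs(momentum(alpha, beta)))
            * weight(j) ** 2 / w_product)


def sobolev(p: float) -> Callable[[int], float]:
    """Sobolev weight w_k = <k>^p, with <k> = max(|k|, 1)."""
    return lambda k: float(max(abs(k), 1)) ** p


# Sample monomial u_3 u_0 conj(u_2) conj(u_1) (zero momentum) and index j = 3.
alpha, beta, j = {3: 1, 0: 1}, {2: 1, 1: 1}, 3
c_old = coefficient(j, alpha, beta, r=2.0, eta=1.0, weight=sobolev(1.0))
c_new = coefficient(j, alpha, beta, r=1.0, eta=0.5, weight=sobolev(2.0))
print(c_old, c_new, c_new / c_old)
```

Since all the dependence on \(r,\eta ,{\mathtt {w}}\) sits in these scalar factors, comparing two norms reduces to comparing such ratios monomial by monomial.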

As it is well known a Birkhoff Normal Form is achieved by an iterative procedure. Let us describe the general step. Given a Hamiltonian

$$\begin{aligned} H= \sum _{j\in \mathbb {Z}} \omega _j |u_j|^2 +Z +R, \end{aligned}$$
(1.28)

where Z is a normal form and R has a zero of degree say \(2{\mathtt {d}}+2\) (with \({\mathtt {d}}\ge 1\)) at \(u=0\), we look for a change of variables which conjugates H to a Hamiltonian \(\sum _{j\in \mathbb {Z}} \omega _j |u_j|^2 +Z' +R'\), where now \(R'\) has a zero of degree at least \(2{\mathtt {d}}+4\). The desired change of variables is generated by the time-one flow of a Hamiltonian S which solves the homological equationFootnote 7

$$\begin{aligned} \{\sum _{j\in \mathbb {Z}} \omega _j |u_j|^2 ,S\}= R. \end{aligned}$$

As for the immersion properties, givenFootnote 8 \(r'\le r,\eta '\le \eta \) and \({\mathtt {w}}' \ge {\mathtt {w}}\) such that \( {{\mathcal {H}}}_{r,\eta }({\mathtt {h}}_{\mathtt {w}}) \subseteq {{\mathcal {H}}}_{r',\eta '}({\mathtt {h}}_{{\mathtt {w}}'})\), in Proposition 4.2 and Lemma 5.2 we give a simple and explicit condition (in terms of the ratio of the coefficients \(c^{(j)}_{r,\eta ,{\mathtt {w}}}({\varvec{{\alpha }}},{\varvec{{\beta }}})\), \(c^{(j)}_{r',\eta ',{\mathtt {w}}'}({\varvec{{\alpha }}},{\varvec{{\beta }}})\)) which ensures that if \(R\in {{\mathcal {H}}}_{r,\eta }({\mathtt {h}}_{\mathtt {w}})\) is appropriately small, then S is well defined and generates a close to identity change of variables \(B_{r'}({\mathtt {h}}_{\mathtt {w}'}) \rightarrow {\mathtt {h}}_{\mathtt {w}'}\). With this procedure we start in some phase space \({\mathtt {h}}_{\mathtt {w}}\) and then show the existence of the Birkhoff change of variables on a ball which not only has a smaller radius but is taken in the stronger topology of \({\mathtt {h}}_{\mathtt {w}'}\). Note that this is not a smoothing change of variables: it is defined from the smaller space to itself.

Starting with a Hamiltonian as in (1.28) with a zero of order 4, in order to reach the form (1.9) we need to perform \({\mathtt {N}}\) steps of BNF. To this purpose we make the following

Assumption 1

We say that \(\eta \ge 0\) and two weights \({\mathtt {w}}_0\le {\mathtt {w}}\) satisfy the Birkhoff assumption at step \(\mathtt {N}\ge 1\) if the following holds. There exists a sequence of weights \(\mathtt {w}_0\le \mathtt {w}_1\le \cdots \le \mathtt {w}_\mathtt {N}={\mathtt {w}}\) such that

$$\begin{aligned}&{{\mathfrak {C}}}:= \max \Bigg \{1,\ \sup _{0\le n<\mathtt {N}} \sup _{\begin{array}{c} j,{\varvec{{\alpha }}},{\varvec{{\beta }}}\\ {\varvec{{\alpha }}}_j+{\varvec{{\beta }}}_j\ne 0 \end{array}} \frac{c^{(j)}_{\varrho ^*_{n},\eta _{n+1},{\mathtt {w}}_{n+1}}({\varvec{{\alpha }}},{\varvec{{\beta }}}) }{c^{(j)}_{\varrho _{n},\eta _{n},{\mathtt {w}}_n}({\varvec{{\alpha }}},{\varvec{{\beta }}}) } \Bigg \}<\infty , \nonumber \\&{{\mathfrak {K}}}:= \max \Bigg \{1,\ \sup _{0\le n<\mathtt {N}} \sup _{\begin{array}{c} j,{\varvec{{\alpha }}},{\varvec{{\beta }}}\\ {\varvec{{\alpha }}}_j+{\varvec{{\beta }}}_j\ne 0 \end{array}} \frac{ c^{(j)}_{\varrho ^*_{n},\eta _{n+1},{\mathtt {w}}_{n+1}}({\varvec{{\alpha }}},{\varvec{{\beta }}}) }{c^{(j)}_{\varrho _{n},\eta _{n},{\mathtt {w}}_n}({\varvec{{\alpha }}},{\varvec{{\beta }}}) |\omega \cdot ({\varvec{{\alpha }}}-{\varvec{{\beta }}})|}\Bigg \}<\infty , \nonumber \\&{{\mathfrak {K}}^\sharp }:= \max \Bigg \{1,\ \sup _{0\le n<\mathtt {N}} \sup _{\begin{array}{c} j,{\varvec{{\alpha }}},{\varvec{{\beta }}}\\ {\varvec{{\alpha }}}_j+{\varvec{{\beta }}}_j\ne 0 \end{array}} \frac{ c^{(j)}_{\varrho ^*_{n},\eta _{\mathtt {N}},{\mathtt {w}}_\mathtt {N}}({\varvec{{\alpha }}},{\varvec{{\beta }}}) }{c^{(j)}_{\varrho _{n},\eta _{n},{\mathtt {w}}_n}({\varvec{{\alpha }}},{\varvec{{\beta }}}) |\omega \cdot ({\varvec{{\alpha }}}-{\varvec{{\beta }}})|} \Bigg \} <\infty , \end{aligned}$$
(1.29)

where

$$\begin{aligned} \varrho _n =(2 - \frac{n}{\mathtt {N}}) ,\quad \eta _n = (1 - \frac{n}{\mathtt {N}})\eta ,\, \quad 0\le n\le \mathtt {N}, \quad \quad \varrho _n^* = \frac{\varrho _{n+1}+ \varrho _n}{2},\quad 0\le n<\mathtt {N}. \end{aligned}$$

Informally speaking \({{\mathfrak {C}}}<\infty \) guarantees the immersion properties at each step, while \({{\mathfrak {K}}}<\infty \) guarantees that one can solve the homological equation at each step. Finally \({{\mathfrak {K}}^\sharp }<\infty \) guarantees that the composition of the changes of variables of all steps is well defined and close to identity on some ball \(B_{r}({\mathtt {h}}_{\mathtt {w}_{\mathtt {N}}})\).
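As a quick illustration of the parameters entering (1.29), the following worked instance takes \(\mathtt {N}=2\) (the value of \(\mathtt {N}\) is chosen by us purely for the example):

$$\begin{aligned} \varrho _0&=2,\quad \varrho _1=\tfrac{3}{2},\quad \varrho _2=1,\qquad \eta _0=\eta ,\quad \eta _1=\tfrac{\eta }{2},\quad \eta _2=0,\qquad \varrho ^*_0=\tfrac{7}{4},\quad \varrho ^*_1=\tfrac{5}{4}. \end{aligned}$$

For general \(\mathtt {N}\) the same formulas give \(\varrho _n-\varrho ^*_n=\varrho ^*_n-\varrho _{n+1}=\frac{1}{2\mathtt {N}}\), so each intermediate radius \(\varrho ^*_n\) sits halfway between two consecutive radii, while \(\eta _n\) decreases linearly from \(\eta \) to 0.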

Let

$$\begin{aligned} {{\mathcal {K}}}^{}_{r}({\mathtt {h}}_{\mathtt {w}}) := \{ H\in {{\mathcal {H}}}^{}_{r,0}({\mathtt {h}}_{\mathtt {w}})\ \ \vert \quad H = \sum _{{\varvec{{\alpha }}}\in {\mathbb {N}}^\mathbb {Z}} H_{{\varvec{{\alpha }}},{\varvec{{\alpha }}}}|u|^{2{\varvec{{\alpha }}}} \} \end{aligned}$$
(1.30)

be the subspace of normal form Hamiltonians.

Theorem 1.3

(Abstract Birkhoff Normal Form). Consider a Hamiltonian of the form

$$\begin{aligned} H= D_\omega + G,\quad D_\omega = \sum _j \omega _j |u|_j^2 \end{aligned}$$
(1.31)

with \(\omega \in {\mathtt {D}_{\gamma ,{q}}}\) and \(G\in {{\mathcal {H}}}_{{\bar{r}},\eta }({\mathtt {h}}_{{\mathtt {w}}_0}),\) for some \({\bar{r}}>0,\)\(\eta \ge 0\). Assume moreover that G has a zero of order at least 4 at \(u=0\). Consider \(\mathtt {N}\ge 1\) and \({\mathtt {w}}\ge {\mathtt {w}}_0\) such that \(\eta ,{\mathtt {w}}_0,{\mathtt {w}}\) satisfy the Birkhoff assumption at step \(\mathtt {N}.\) Set

$$\begin{aligned} \widehat{{\mathtt {r}}} := \min \left\{ \frac{r_\star }{\sqrt{\mathtt {N}\max \{{{\mathfrak {C}}}{{\mathfrak {K}}},{{\mathfrak {K}}^\sharp }\}} }, \ \frac{{\bar{r}}}{2} \right\} , \qquad \text{ where }\qquad r_\star :={\bar{r}} \sqrt{\frac{\gamma }{2^{11}e |G|_{{\bar{r}},\eta ,{\mathtt {w}}_0} }} .\qquad \end{aligned}$$
(1.32)

Then for all \(0<r\le \widehat{{\mathtt {r}}}\) there exists an invertibleFootnote 9 symplectic change of variables

$$\begin{aligned}&\Psi :\quad B_{r}({\mathtt {h}}_{{\mathtt {w}}})\mapsto B_{2r}({\mathtt {h}}_{{\mathtt {w}}}), \qquad \nonumber \\&\quad \sup _{u\in B_r({\mathtt {h}}_{{\mathtt {w}}})} {\left| \Psi (u)-u\right| }_{{\mathtt {w}}} \le \widehat{\mathtt {C}}_1 r^3\le \frac{r}{8},\qquad \widehat{\mathtt {C}}_1:=\frac{{{\mathfrak {K}}^\sharp }}{2^7 e r_\star ^2}, \end{aligned}$$
(1.33)

such that in the new coordinates

$$\begin{aligned} H\circ \Psi = D_\omega + Z+ R,\qquad Z\in {{\mathcal {K}}}_{r}({\mathtt {h}}_{\mathtt {w}}), \end{aligned}$$

where

$$\begin{aligned} \ |Z|_{r,0,{\mathtt {w}}}\le & {} \widehat{\mathtt {C}}_2 r^2, \qquad |R|_{r,0,{\mathtt {w}}} \le \widehat{\mathtt {C}}_3 r^{2(\mathtt {N}+1)},\qquad \text{ with }\qquad \nonumber \\ \widehat{\mathtt {C}}_2:= & {} \frac{8|G|_{{\bar{r}},\eta ,{\mathtt {w}}_0}}{{\bar{r}}^2}, \qquad \widehat{\mathtt {C}}_3:= \frac{\gamma }{2^9 e r_\star ^2} {\left( \frac{ {{\mathfrak {C}}}\,{{\mathfrak {K}}}\, \mathtt {N}}{4 r_\star ^2}\right) }^{\mathtt {N}}. \end{aligned}$$
(1.34)

The theorem follows by a straightforward iteration, see Sect. 5.
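As a consistency check on the statement (not a substitute for the proof in Sect. 5), the second inequality in (1.33) can be read off directly from the definition of \(\widehat{{\mathtt {r}}}\) in (1.32). Since \(\max \{{{\mathfrak {C}}}{{\mathfrak {K}}},{{\mathfrak {K}}^\sharp }\}\ge {{\mathfrak {K}}^\sharp }\) and \(\mathtt {N}\ge 1\), for every \(0<r\le \widehat{{\mathtt {r}}}\) we have

$$\begin{aligned} \widehat{\mathtt {C}}_1 r^2 \le \frac{{{\mathfrak {K}}^\sharp }}{2^7 e\, r_\star ^2}\cdot \frac{r_\star ^2}{\mathtt {N}\max \{{{\mathfrak {C}}}{{\mathfrak {K}}},{{\mathfrak {K}}^\sharp }\}} \le \frac{1}{2^7 e\, \mathtt {N}} \le \frac{1}{8}, \qquad \text{ hence }\qquad \widehat{\mathtt {C}}_1 r^3\le \frac{r}{8}. \end{aligned}$$

In particular \(|\Psi (u)|_{{\mathtt {w}}}\le |u|_{{\mathtt {w}}}+\frac{r}{8}\le 2r\) for \(u\in B_r({\mathtt {h}}_{{\mathtt {w}}})\), in agreement with the target ball in (1.33).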

As is well known, the bounds (1.34) imply a lower bound on the stability time; we discuss this in Corollary 5.1, where we show that the solution u(t) of the Hamiltonian flow of (1.31) with initial datum \(u(0)=u_0\) such that \(|u_0|_{\mathtt {w}}\le \frac{3r}{8}\) exists and satisfies

$$\begin{aligned} |u(t)|_{{\mathtt {w}}} \le r \quad \text{ for } \text{ all } \text{ times }\quad |t|\le \frac{1}{8 \widehat{\mathtt {C}}_3 r^{2(\mathtt {N}+1)}} . \end{aligned}$$

By Theorem 1.3 and Corollary 5.1, in order to prove the stability results we only need to define suitable sequence spaces verifying Assumption 1. In particular we consider the three applications \(\mathtt {G},\mathtt {S},\mathtt {M}\) introduced at page 7. Another interesting example (suggested to us by Z. Hani) could be the space

$$\begin{aligned} \Big \{ {\left( u_j\right) }\in L^2:\quad |u|^2:=\sum _j |u_j|^2 e^{\ln (\lfloor j \rfloor )^2} <\infty \Big \}, \end{aligned}$$

where \(\lfloor j \rfloor = \max \{|j|,2\}\). In this case one may get \(T\approx \delta ^{ \ln (\ln (1/\delta ))}\).

A preliminary version of these results was announced in [BMP19].

Part 1. An Abstract Framework for Birkhoff Normal Form on Sequence Spaces

2 Symplectic Structure and Hamiltonian Flows

Spaces of Hamiltonians. As explained in the Introduction, our weighted spaces \({\mathtt {h}}_{\mathtt {w}}\) are contained in \(\ell ^2(\mathbb {C})\), so we endow them with the standard symplectic structure coming from the Hermitian product on \(\ell ^2(\mathbb {C})\).

We identify \(\ell ^2(\mathbb {C})\) with \(\ell ^2(\mathbb {R})\times \ell ^2(\mathbb {R})\) through \(u_j= {\left( x_j+ i y_j\right) }/\sqrt{2}\) and induce on \(\ell ^2(\mathbb {C})\) the structure of a real symplectic Hilbert spaceFootnote 10 by setting, for any \((u^{(1)}, u^{(2)}) \in \ell ^2(\mathbb {C})\times \ell ^2(\mathbb {C})\),

$$\begin{aligned} \langle u^{(1)},u^{(2)}\rangle = \sum _j {\left( x_j^{(1)}x_j^{(2)}+ y_j^{(1)}y_j^{(2)}\right) } ,\quad \omega (u^{(1)},u^{(2)})= \sum _j {\left( y_j^{(1)}x_j^{(2)}- x_j^{(1)}y_j^{(2)}\right) }, \end{aligned}$$

which are the standard scalar product and symplectic form \(\Omega = \sum _j dy_j\wedge d x_j\).

For convenience and to keep track of the complex structure, one often writes the vector fields and the differential forms in complex notation, that is

$$\begin{aligned} \Omega = \mathrm{i}\sum _j d u_j\wedge d {\bar{u}}_j ,\quad X_H^{(j)} = \mathrm{i}\frac{\partial }{\partial {\bar{u}}_j} H\, \end{aligned}$$

where the one form and vector field are defined through the identification between \(\mathbb {C}\) and \(\mathbb {R}^2\), given by

$$\begin{aligned} d u_j&= \frac{1}{\sqrt{2}}{\left( d x_j+ \mathrm{i}d y_j\right) },\quad d {\bar{u}}_j = \frac{1}{\sqrt{2}}{\left( d x_j- \mathrm{i}d y_j\right) },\\ \frac{\partial }{\partial u_j}&= \frac{1}{\sqrt{2}}{\left( \frac{\partial }{\partial x_j} - \mathrm{i}\frac{\partial }{\partial y_j}\right) },\quad \frac{\partial }{\partial {\bar{u}}_j} = \frac{1}{\sqrt{2}}{\left( \frac{\partial }{\partial x_j} + \mathrm{i}\frac{\partial }{\partial y_j}\right) }. \end{aligned}$$
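For the reader's convenience, the complex expression of the vector field can be unpacked in real coordinates. The computation below is a sketch under two assumptions of ours: H is real valued, and the flow is \({\dot{u}}_j=X_H^{(j)}\). Using the formula for \(\partial /\partial {\bar{u}}_j\) just given,

$$\begin{aligned} \mathrm{i}\frac{\partial H}{\partial {\bar{u}}_j} = \frac{\mathrm{i}}{\sqrt{2}}{\left( \frac{\partial H}{\partial x_j} + \mathrm{i}\frac{\partial H}{\partial y_j}\right) } = \frac{1}{\sqrt{2}}{\left( -\frac{\partial H}{\partial y_j} + \mathrm{i}\frac{\partial H}{\partial x_j}\right) }, \end{aligned}$$

so that, comparing with \({\dot{u}}_j=({\dot{x}}_j+\mathrm{i}{\dot{y}}_j)/\sqrt{2}\),

$$\begin{aligned} {\dot{x}}_j = -\frac{\partial H}{\partial y_j},\qquad {\dot{y}}_j = \frac{\partial H}{\partial x_j}. \end{aligned}$$

These are Hamilton's equations for \(\sum _j dy_j\wedge d x_j\) under the convention \(\Omega (X_H,\cdot )=dH\), since \({\dot{y}}_j\, dx_j-{\dot{x}}_j\, dy_j=\partial _{x_j}H\, dx_j+\partial _{y_j}H\, dy_j\).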

Remark 2.1

By mass conservation and since \(H(0)=0,\) it is straightforward to prove that the norm \(|\cdot |_{r,\eta ,{\mathtt {w}}}\) is increasing in the radius parameter r (see also Proposition 3.1).

Note that if \(|H|_{r,\eta ,{\mathtt {w}}}<\infty \) then H admits an analytic extension \({\widehat{H}},\) that is

$$\begin{aligned} (u_+, u_{-})\in B_r(\ell ^2(\mathbb {C}))\times B_r(\ell ^2(\mathbb {C})) \rightarrow {\widehat{H}}(u_+,u_-)\, :\quad \quad H(u)={\widehat{H}} (u,{\bar{u}}), \end{aligned}$$

whose Taylor series expansion is

$$\begin{aligned} {\widehat{H}}(u_+,u_-) = \sum ^*_{{\varvec{{\alpha }}},{\varvec{{\beta }}}\in {\mathbb {N}}^\mathbb {Z}} H_{{\varvec{{\alpha }}},{\varvec{{\beta }}}}u_+^{\varvec{{\alpha }}}u_-^{\varvec{{\beta }}}\, \end{aligned}$$

where we denote by \(\sum ^*\) the sum restricted to those \({\varvec{{\alpha }}},{\varvec{{\beta }}}: |{\varvec{{\alpha }}}|=|{\varvec{{\beta }}}|<\infty \).

One can see that

$$\begin{aligned} \frac{\partial }{\partial {\bar{u}}_j} H(u) = \frac{\partial {\widehat{H}}(u_+,u_-)}{\partial u_{-,j}} \Big \vert _{u_+={\bar{u}}_-=u}. \end{aligned}$$

Poisson structure and Hamiltonian flows. The scale \(\{{{\mathcal {H}}}_{r,\eta }({\mathtt {h}}_{\mathtt {w}})\}_{r>0} \) is a Banach-Poisson algebra in the following sense.

Proposition 2.1

For \(0 <\rho \le r\) and \(\eta >0\) we have

$$\begin{aligned} |\{F,G\}|_{r,\eta ,{\mathtt {w}}} \le 4{\left( 1+\frac{r}{\rho }\right) } |F|_{r+\rho ,\eta ,{\mathtt {w}}} |G|_{r+\rho ,\eta ,{\mathtt {w}}}. \end{aligned}$$
(2.1)

Proof

It is essentially contained in [BBP13]; see in particular Lemma 2.16 of [BBP13] with \(n=0\) and without the parameters s and \(s'\) (there are no action variables here). Note that the constant in Lemma 2.16 is 8, instead of 4 as in the present paper, because of the presence there of action variables, which scale differently from the Cartesian ones (namely as \((2r)^2\) instead of 2r). Recall also the required properties of the space E (named \({\mathtt {h}}_{\mathtt {w}}\) in the present paper) mentioned after Definition 2.5. \(\square \)

The following Lemma is a simple corollary and its proof is postponed to the appendix.

Lemma 2.1

(Hamiltonian flow). Let \(0<\rho < r \), and \(S\in {{\mathcal {H}}}_{r+\rho ,\eta }({\mathtt {h}}_{\mathtt {w}})\) with

$$\begin{aligned} {\left| S\right| }_{r+\rho ,\eta ,{\mathtt {w}}} \le \delta := \frac{\rho }{8 e{\left( r+\rho \right) }}. \end{aligned}$$
(2.2)

Then the time-1 Hamiltonian flow \(\Phi ^1_S: B_r({\mathtt {h}}_{\mathtt {w}})\rightarrow B_{r + \rho }({\mathtt {h}}_{\mathtt {w}})\) is well defined, analytic and symplectic, with

$$\begin{aligned} \sup _{u\in B_r({\mathtt {h}}_{\mathtt {w}})} {\left| \Phi ^1_S(u)-u\right| }_{{\mathtt {h}}_{\mathtt {w}}} \le (r+\rho ) {\left| S\right| }_{r+\rho ,\eta ,{\mathtt {w}}} \le \frac{\rho }{8 e}. \end{aligned}$$
(2.3)

For any \(H\in {{\mathcal {H}}}_{r+\rho ,\eta }({\mathtt {h}}_{\mathtt {w}})\) we have that \(H\circ \Phi ^1_S= e^{{\left\{ S,\cdot \right\} }} H\in {{\mathcal {H}}}_{r,\eta }({\mathtt {h}}_{\mathtt {w}})\) and

$$\begin{aligned} {\left| e^{{\left\{ S,\cdot \right\} }}H\right| }_{r,\eta ,{\mathtt {w}}}&\le 2 {\left| H\right| }_{r+\rho ,\eta ,{\mathtt {w}}}, \end{aligned}$$
(2.4)
$$\begin{aligned} {\left| {\left( e^{{\left\{ S,\cdot \right\} }}- {\text {id}}\right) }H\right| }_{r,\eta ,{\mathtt {w}}}&\le \delta ^{-1} {\left| S\right| }_{r+\rho ,\eta ,{\mathtt {w}}} {\left| H\right| }_{r+\rho ,\eta ,{\mathtt {w}}}, \end{aligned}$$
(2.5)
$$\begin{aligned} {\left| {\left( e^{{\left\{ S,\cdot \right\} }}- {\text {id}}- {\left\{ S,\cdot \right\} }\right) }H\right| }_{r,\eta ,{\mathtt {w}}}&\le \frac{1}{2} \delta ^{-2} {\left| S\right| }_{r+\rho ,\eta ,{\mathtt {w}}}^2 {\left| H\right| }_{r+\rho ,\eta ,{\mathtt {w}}} \end{aligned}$$
(2.6)

More generally for any \(h\in {\mathbb {N}}\) and any sequence \((c_k)_{k\in {\mathbb {N}}}\) with \(| c_k|\le 1/k!\), we have

$$\begin{aligned} {\left| \sum _{k\ge h} c_k {\text {ad}}^k_S{\left( H\right) }\right| }_{r,\eta ,{\mathtt {w}}} \le 2 |H|_{r+\rho ,\eta ,{\mathtt {w}}} {\left( \frac{|S|_{r+\rho ,\eta ,{\mathtt {w}}}}{2\delta }\right) }^h , \end{aligned}$$
(2.7)

where \({\text {ad}}_S{\left( \cdot \right) }:= {\left\{ S,\cdot \right\} }\).
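It may help to note that (2.4), (2.5) and (2.6) are the instances \(h=0,1,2\) of (2.7) with \(c_k=1/k!\), where the right hand side of (2.7) is read as \(2|H|_{r+\rho ,\eta ,{\mathtt {w}}}\,(|S|_{r+\rho ,\eta ,{\mathtt {w}}}/(2\delta ))^h\). The check below assumes that \({\mathbb {N}}\) contains 0 and that \({\text {ad}}^0_S={\text {id}}\), so that \(e^{{\left\{ S,\cdot \right\} }}=\sum _{k\ge 0}{\text {ad}}^k_S/k!\):

$$\begin{aligned} h=0:&\quad 2|H|_{r+\rho ,\eta ,{\mathtt {w}}},\\ h=1:&\quad 2|H|_{r+\rho ,\eta ,{\mathtt {w}}}\, \frac{|S|_{r+\rho ,\eta ,{\mathtt {w}}}}{2\delta } = \delta ^{-1}|S|_{r+\rho ,\eta ,{\mathtt {w}}}|H|_{r+\rho ,\eta ,{\mathtt {w}}},\\ h=2:&\quad 2|H|_{r+\rho ,\eta ,{\mathtt {w}}}\, \frac{|S|^2_{r+\rho ,\eta ,{\mathtt {w}}}}{4\delta ^2} = \frac{1}{2}\delta ^{-2}|S|^2_{r+\rho ,\eta ,{\mathtt {w}}}|H|_{r+\rho ,\eta ,{\mathtt {w}}}. \end{aligned}$$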

3 Immersions for Spaces of Hamiltonians

Given two positive sequences \({\mathtt {w}}= {\left( {\mathtt {w}}_j\right) }_{j\in \mathbb {Z}},{\mathtt {w}}' = {\left( {\mathtt {w}}'_j\right) }_{j\in \mathbb {Z}}\) we write \({\mathtt {w}}\le {\mathtt {w}}'\) if the inequality holds pointwise, namely

$$\begin{aligned} {\mathtt {w}}\le {\mathtt {w}}' \quad :\iff \quad {\mathtt {w}}_j\le {\mathtt {w}}'_j,\ \ \ \forall \, j\in \mathbb {Z}. \end{aligned}$$

In this way if \(r'\le r\) and \({\mathtt {w}}\le {\mathtt {w}}'\) then \(B_{r'}({\mathtt {h}}_{\mathtt {w}'}) \subseteq B_r({\mathtt {h}}_{\mathtt {w}})\). Consequently if \(r'\le r , \eta '\le \eta \) and \({\mathtt {w}}\le {\mathtt {w}}'\) then \( {{\mathcal {A}}}_{r,\eta }({\mathtt {h}}_{\mathtt {w}}) \subseteq {{\mathcal {A}}}_{r',\eta '}({\mathtt {h}}_{{\mathtt {w}}'})\).

We thus wish to study conditions on \((r,\eta ,{\mathtt {w}}),({r^*},\eta ',{\mathtt {w}}')\) (with \({r^*}\le r\)) which ensure that \( {{\mathcal {H}}}_{r,\eta }({\mathtt {h}}_{\mathtt {w}}) \subseteq {{\mathcal {H}}}_{{r^*},\eta '}({\mathtt {h}}_{{\mathtt {w}}'})\). Note that this is not obvious at all, since we are asking that the Hamiltonian vector field \(X_H\) of \(H\in {{\mathcal {H}}}_{r,\eta }({\mathtt {h}}_{\mathtt {w}})\), when restricted to the smaller domain \(B_{{r^*}}({\mathtt {h}}_{\mathtt {w}'})\), takes values in the smaller space \({\mathtt {h}}_{{\mathtt {w}}'}\).

The coefficients \(c^{(j)}_{r,\eta ,{\mathtt {w}}}({\varvec{{\alpha }}},{\varvec{{\beta }}})\). Let us start by rewriting the norm \(|\cdot |_{r,\eta ,{\mathtt {w}}}\) in a dimensionless form, so that all the dependence on the parameters \(r,\eta ,{\mathtt {w}}\) is encoded in the coefficients (1.27).

Definition 3.1

For any \(H\in {{\mathcal {H}}}_{r,\eta }({\mathtt {h}}_{\mathtt {w}})\) we define a map

$$\begin{aligned} B_1(\ell ^2)\rightarrow \ell ^2 ,\quad y={\left( y_j\right) }_{j\in \mathbb {Z}}\mapsto {\left( Y^{(j)}_{H}(y;r,\eta ,{\mathtt {w}})\right) }_{j\in \mathbb {Z}} \end{aligned}$$

by setting

$$\begin{aligned} Y^{(j)}_{H}(y;r,\eta ,{\mathtt {w}}) := \sum _*|H_{{\varvec{{\alpha }}},{\varvec{{\beta }}}}| \frac{({\varvec{{\alpha }}}_j+{\varvec{{\beta }}}_j)}{2}c^{(j)}_{r,\eta ,{\mathtt {w}}}({\varvec{{\alpha }}},{\varvec{{\beta }}}) y^{{\varvec{{\alpha }}}+{\varvec{{\beta }}}-e_j} \end{aligned}$$
(3.1)

where \(e_j\) is the j-th basis vector in \({\mathbb {N}}^\mathbb {Z}\), while the coefficient \(c^{(j)}_{r,\eta ,{\mathtt {w}}}({\varvec{{\alpha }}},{\varvec{{\beta }}})\) was defined in (1.27). For brevity, we set

$$\begin{aligned} \sum _*:=\sum _{{\varvec{{\alpha }}},{\varvec{{\beta }}}: |{\varvec{{\alpha }}}|=|{\varvec{{\beta }}}|}. \end{aligned}$$

The momentum \(\pi (\cdot )\) was defined in (1.25).

The vector field \(Y_H\) is a majorant analytic function on \(\ell ^2\) which has the same norm as H. Since the majorant analytic functions on a given space have a natural ordering, this gives us a natural criterion for immersions, as formalized in the following Lemma.

Lemma 3.1

Let \(r,{r^*}>0,\,\eta ,{\eta '}\ge 0,\)\({\mathtt {w}},{{\mathtt {w}}'}\in \mathbb {R}_+^\mathbb {Z}\). The following properties hold.

  (i) The norm of H can be expressed as

    $$\begin{aligned} {\left| H\right| }_{r,\eta ,{\mathtt {w}}}= \sup _{|y|_{\ell ^2}\le 1}{\left| Y_H(y;r,\eta ,{\mathtt {w}})\right| }_{\ell ^2} \end{aligned}$$
    (3.2)

  (ii) Given \( H^{(1)}\in {{\mathcal {H}}}_{{r^*},{\eta '},{{\mathtt {w}}'}}\) and \(H^{(2)}\in {{\mathcal {H}}}_{r,\eta ,{\mathtt {w}}} \) such that for all \({\varvec{{\alpha }}},{\varvec{{\beta }}}\in {\mathbb {N}}^\mathbb {Z}\) and \(j\in \mathbb {Z}\) with \({\varvec{{\alpha }}}_j+{\varvec{{\beta }}}_j\ne 0\) one has

    $$\begin{aligned} |H^{(1)}_{{\varvec{{\alpha }}},{\varvec{{\beta }}}}| c^{(j)}_{{r^*},{\eta '},{{\mathtt {w}}'}}({\varvec{{\alpha }}},{\varvec{{\beta }}}) \le c |H^{(2)}_{{\varvec{{\alpha }}},{\varvec{{\beta }}}}| c^{(j)}_{r,\eta ,{\mathtt {w}}}({\varvec{{\alpha }}},{\varvec{{\beta }}}), \end{aligned}$$

    for some \(c>0,\) then

    $$\begin{aligned} |H^{(1)}|_{{r^*},{\eta '},{{\mathtt {w}}'}} \le c |H^{(2)}|_{r,\eta ,{\mathtt {w}}}. \end{aligned}$$

Proof

See “Appendix B”. \(\square \)

As a corollary we get the following “immersion theorem” for spaces of Hamiltonians

Proposition 3.1

(Immersion). Let \(r,{r^*}>0,\,\eta ,{\eta '}\ge 0,\)\({\mathtt {w}},{{\mathtt {w}}'}\in \mathbb {R}_+^\mathbb {Z}.\) If

$$\begin{aligned} C:=\sup _{\begin{array}{c} j\in \mathbb {Z},\, {\varvec{{\alpha }}},{\varvec{{\beta }}}\in {\mathbb {N}}^\mathbb {Z}\\ {\varvec{{\alpha }}}_j+{\varvec{{\beta }}}_j\ne 0 \end{array}} \frac{c^{(j)}_{{r^*},{\eta '},{{\mathtt {w}}'}}({\varvec{{\alpha }}},{\varvec{{\beta }}}) }{c^{(j)}_{r,\eta ,{\mathtt {w}}}({\varvec{{\alpha }}},{\varvec{{\beta }}}) } < \infty , \end{aligned}$$
(3.3)

then \( {{\mathcal {H}}}_{r,\eta }({\mathtt {h}}_{\mathtt {w}}) \subseteq {{\mathcal {H}}}_{{r^*},\eta '}({\mathtt {h}}_{{\mathtt {w}}'})\), with

$$\begin{aligned} |H|_{{r^*},{\eta '},{{\mathtt {w}}'}} \le C |H|_{r,\eta ,{\mathtt {w}}}. \end{aligned}$$
(3.4)

In particular \({\left| \cdot \right| }_{r,\eta ,{\mathtt {w}}}\) is increasing in r and \(\eta \), namely if \({r^*}\le r\) and \({\eta '}\le \eta \) then

$$\begin{aligned} |H|_{{r^*},{\eta '},{\mathtt {w}}} \le |H|_{r,\eta ,{\mathtt {w}}}. \end{aligned}$$

Moreover, if \({r^*}\le r,\)\({\mathtt {w}}\le {{\mathtt {w}}'}\) and \(H\in {{\mathcal {K}}}_{r,\eta }({\mathtt {h}}_{\mathtt {w}})\) then

$$\begin{aligned} |H|_{{r^*},{\eta '},{{\mathtt {w}}'}} \le |H|_{r,\eta ,{\mathtt {w}}}. \end{aligned}$$
(3.5)

Furthermore, if H preserves momentum then

$$\begin{aligned} |H|_{{r^*},{\eta '},{{\mathtt {w}}'}} \le {C_0}|H|_{r,\eta ,{\mathtt {w}}}, \end{aligned}$$
(3.6)

where

$$\begin{aligned} {C_0}:= \sup _{\begin{array}{c} j\in \mathbb {Z},\, {\varvec{{\alpha }}},{\varvec{{\beta }}}\in {\mathbb {N}}^\mathbb {Z},\\ {\varvec{{\alpha }}}_j+{\varvec{{\beta }}}_j\ne 0, \\ \sum _{i}i({\varvec{{\alpha }}}_i-{\varvec{{\beta }}}_i)=0 \end{array}} \frac{c^{(j)}_{{r^*},{\eta '},{{\mathtt {w}}'}}({\varvec{{\alpha }}},{\varvec{{\beta }}}) }{c^{(j)}_{r,\eta ,{\mathtt {w}}}({\varvec{{\alpha }}},{\varvec{{\beta }}}) } < \infty . \end{aligned}$$
(3.7)

Proof

Inequality (3.4) follows directly from Lemma 3.1 (ii), while (3.5) follows directly from (1.27), since in the kernel \({\varvec{{\alpha }}}_j+{\varvec{{\beta }}}_j\ne 0\) implies \({\varvec{{\alpha }}}_j+{\varvec{{\beta }}}_j\ge 2.\) The momentum preserving case follows analogously. \(\square \)
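The two monotonicity statements of Proposition 3.1 can be spelled out from (1.27) as follows; this is only an expanded version of the argument in the proof, written under the assumption that \(\pi (0)=0\). For equal weights, \({r^*}\le r\) and \({\eta '}\le \eta \), and indices with \(|{\varvec{{\alpha }}}|=|{\varvec{{\beta }}}|\ge 1\),

$$\begin{aligned} \frac{c^{(j)}_{{r^*},{\eta '},{\mathtt {w}}}({\varvec{{\alpha }}},{\varvec{{\beta }}})}{c^{(j)}_{r,\eta ,{\mathtt {w}}}({\varvec{{\alpha }}},{\varvec{{\beta }}})} = {\left( \frac{{r^*}}{r}\right) }^{|{\varvec{{\alpha }}}|+|{\varvec{{\beta }}}|-2} e^{-(\eta -{\eta '})|\pi ({\varvec{{\alpha }}}-{\varvec{{\beta }}})|}\le 1 . \end{aligned}$$

In the kernel one has \({\varvec{{\alpha }}}={\varvec{{\beta }}}\), so the exponential factor disappears and, for \({\mathtt {w}}\le {{\mathtt {w}}'}\) and \({\varvec{{\alpha }}}_j\ge 1\),

$$\begin{aligned} \frac{c^{(j)}_{{r^*},{\eta '},{{\mathtt {w}}'}}({\varvec{{\alpha }}},{\varvec{{\alpha }}})}{c^{(j)}_{r,\eta ,{\mathtt {w}}}({\varvec{{\alpha }}},{\varvec{{\alpha }}})} = {\left( \frac{{r^*}}{r}\right) }^{2|{\varvec{{\alpha }}}|-2} {\left( \frac{{\mathtt {w}}_j}{{\mathtt {w}}'_j}\right) }^{2{\varvec{{\alpha }}}_j-2} \prod _{i\ne j}{\left( \frac{{\mathtt {w}}_i}{{\mathtt {w}}'_i}\right) }^{2{\varvec{{\alpha }}}_i}\le 1 . \end{aligned}$$

The exponent \(2{\varvec{{\alpha }}}_j-2\ge 0\) is exactly where the condition \({\varvec{{\alpha }}}_j+{\varvec{{\beta }}}_j\ge 2\) is used, and no relation between \({\eta '}\) and \(\eta \) is needed in this case.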

Remark 3.1

The above immersion properties, with a different norm and in a different context, were implicitly used by Bourgain in [Bou05].

4 Small Divisors and Homological Equation

Let us consider the set of frequencies

$$\begin{aligned} \Omega _{q}:={\left\{ \omega ={\left( \omega _j\right) }_{j\in \mathbb {Z}}\in \mathbb {R}^\mathbb {Z},\quad \sup _j|\omega _j-j^2|\langle j \rangle ^{q}< 1/2 \right\} }; \end{aligned}$$
(4.1)

this set is isomorphic to \([-1/2,1/2]^\mathbb {Z}\) via the identification

$$\begin{aligned} \xi \mapsto \omega (\xi ),\quad \text{ where }\quad \omega _j(\xi ) = j^2 +\frac{\xi _j}{\langle j \rangle ^{q}}. \end{aligned}$$
(4.2)

We endow \(\Omega _{q}\) with the probability measure \(\mu \) inducedFootnote 11 by the product measure on \([-1/2,1/2]^\mathbb {Z}\).

We now define the set of Diophantine frequencies; the following definition is a slight generalization of the one given by Bourgain in [Bou05].

Definition 4.1

Given \(\gamma >0\) and \({q}\ge 0\), we denote by \({\mathtt {D}_{\gamma ,{q}}}\equiv {\mathtt {D}_{\gamma ,{q}}^{\mu _1,\mu _2}}\) the set of \(\mu _1,\mu _2,\gamma \)-Diophantine frequencies

$$\begin{aligned}&{\mathtt {D}_{\gamma ,{q}}^{\mu _1,\mu _2}}:=\nonumber \\&\quad {\left\{ \omega \in \Omega _{q}\,:\, |\omega \cdot \ell |> \gamma \prod _{n\in \mathbb {Z}}\frac{1}{(1+|\ell _n|^{\mu _1} \langle n \rangle ^{\mu _2+{q}})},\quad \forall \ell \in \mathbb {Z}^\mathbb {Z}: 0<|\ell |<\infty \right\} }.\nonumber \\ \end{aligned}$$
(4.3)

Now we have that

Lemma 4.1

For \(\mu _1,\mu _2>1\) there exists a positive constant \({C_{\mathtt {meas}}}(\mu _1,\mu _2)\) such that

$$\begin{aligned}\mu \big (\Omega _{q}\setminus {\mathtt {D}_{\gamma ,{q}}^{\mu _1,\mu _2}}\big ) \le {C_{\mathtt {meas}}}(\mu _1,\mu _2)\gamma . \end{aligned}$$

Proof

See “Appendix C”. \(\square \)

This means that, for all \(\mu _1,\mu _2>1\), Diophantine frequencies are typical in \(\Omega _{q}\), in the sense that the union \(\bigcup _{\gamma >0}{\mathtt {D}_{\gamma ,{q}}^{\mu _1,\mu _2}}\) has full measure. Here and in the following we shall always assume that

$$\begin{aligned} 0<\gamma \le 1,\qquad \omega \in \mathtt {D}_{\gamma ,{q}}^{2,2}= {\mathtt {D}_{\gamma ,{q}}}. \end{aligned}$$
(4.4)
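The full measure property mentioned above is obtained from Lemma 4.1 by letting \(\gamma \rightarrow 0\). A one line sketch, taking the measurability of the sets involved for granted:

$$\begin{aligned} \mu \Big (\Omega _{q}\setminus \bigcup _{\gamma >0}{\mathtt {D}_{\gamma ,{q}}^{\mu _1,\mu _2}}\Big ) \le \inf _{\gamma >0}\, \mu \big (\Omega _{q}\setminus {\mathtt {D}_{\gamma ,{q}}^{\mu _1,\mu _2}}\big ) \le \inf _{\gamma >0}\, {C_{\mathtt {meas}}}(\mu _1,\mu _2)\gamma = 0 . \end{aligned}$$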

In the remaining part of this section, on appropriate source and target spaces, we will study the invertibility of the “Lie derivative” operator

$$\begin{aligned} L_\omega : \quad H \mapsto L_\omega H: = \sum _{*} \mathrm{i}{\left( \omega \cdot ({\varvec{{\alpha }}}-{\varvec{{\beta }}})\right) }H_{{\varvec{{\alpha }}},{\varvec{{\beta }}}}{u^{\varvec{{\alpha }}}}{\bar{u}^{\varvec{{\beta }}}}, \end{aligned}$$
(4.5)

which is nothing but the action of the Poisson bracket \({\left\{ \sum _j\omega _j{\left| u_j\right| }^2, \cdot \right\} }\) on H.

Recalling the definition of \( {{\mathcal {K}}}^{}_{r}({\mathtt {h}}_{\mathtt {w}}) \) in (1.30) we give the following

Definition 4.2

Let

$$\begin{aligned} {{\mathcal {R}}}^{}_{r,\eta }({\mathtt {h}}_{\mathtt {w}})&:= \{ H\in {{\mathcal {H}}}^{}_{r,\eta }({\mathtt {h}}_{\mathtt {w}})\ \ \vert \quad H = \sum _{{\varvec{{\alpha }}}\ne {\varvec{{\beta }}}} H_{{\varvec{{\alpha }}},{\varvec{{\beta }}}}{u^{\varvec{{\alpha }}}}{\bar{u}^{\varvec{{\beta }}}}\}. \end{aligned}$$
(4.6)

Then we have the decomposition \({{\mathcal {H}}}^{}_{r,\eta }({\mathtt {h}}_{\mathtt {w}})= {{\mathcal {R}}}^{}_{r,\eta }({\mathtt {h}}_{\mathtt {w}})\oplus {{\mathcal {K}}}^{}_{r}({\mathtt {h}}_{\mathtt {w}})\) and the continuous projectionsFootnote 12

$$\begin{aligned} |\Pi _{{{\mathcal {K}}}}H|^{}_{r,\eta ,{\mathtt {w}}}, |\Pi _{{{\mathcal {R}}}}H|^{}_{r,\eta ,{\mathtt {w}}} \le |H|^{}_{r,\eta ,{\mathtt {w}}}. \end{aligned}$$
(4.7)

Obviously, for a Diophantine frequency \(\omega \), \({{\mathcal {R}}}^{}_{r,\eta }({\mathtt {h}}_{\mathtt {w}})\) and \({{\mathcal {K}}}^{}_{r}({\mathtt {h}}_{\mathtt {w}})\) represent the range and kernel of \(L_\omega .\)

For any \(r,\eta ,{\mathtt {w}}\) and \({\varvec{{\alpha }}},{\varvec{{\beta }}}\in {\mathbb {N}}^\mathbb {Z}\) recall the coefficient defined in (1.27)

$$\begin{aligned} c^{(j)}_{r,\eta ,{\mathtt {w}}}({\varvec{{\alpha }}},{\varvec{{\beta }}}) := r^{|{\varvec{{\alpha }}}|+|{\varvec{{\beta }}}|-2}e^{\eta |\pi ({\varvec{{\alpha }}}-{\varvec{{\beta }}})|} \frac{{\mathtt {w}}_j^2}{{\mathtt {w}}^{{\varvec{{\alpha }}}+{\varvec{{\beta }}}}}. \end{aligned}$$

In the following Lemma we consider \(R\in {{\mathcal {R}}}_{r,\eta }({\mathtt {h}}_{{\mathtt {w}}})\) and state sufficient conditions which ensure that \(L_\omega ^{-1}R\in {{\mathcal {R}}}_{{r^*},{\eta '}}({\mathtt {h}}_{{{\mathtt {w}}'}})\).

Lemma 4.2

(Homological equation). Fix \(\omega \in {\mathtt {D}_{\gamma ,{q}}}.\) Consider two ordered weights \(0<{r^*}\le r,\)\( 0\le {\eta '}\le \eta ,{{\mathtt {w}}'}\ge {\mathtt {w}},\) such that

$$\begin{aligned} K:= \gamma \sup _{\begin{array}{c} j\in \mathbb {Z},\, {\varvec{{\alpha }}}\ne {\varvec{{\beta }}}\in {\mathbb {N}}^\mathbb {Z}\\ {\varvec{{\alpha }}}_j+{\varvec{{\beta }}}_j\ne 0 \end{array}} \frac{c^{(j)}_{{r^*},{\eta '},{{\mathtt {w}}'}}({\varvec{{\alpha }}},{\varvec{{\beta }}}) }{c^{(j)}_{r,\eta ,{\mathtt {w}}}({\varvec{{\alpha }}},{\varvec{{\beta }}}) |\omega \cdot ({\varvec{{\alpha }}}-{\varvec{{\beta }}})|} < \infty , \end{aligned}$$
(4.8)

then for any \(R\in {{\mathcal {R}}}_{r,\eta }({\mathtt {h}}_{{\mathtt {w}}})\) the homological equation

$$\begin{aligned} L_\omega S = R \end{aligned}$$

has a unique solution \(S= L_\omega ^{-1} R\) in \({{\mathcal {R}}}_{{r^*},{\eta '}}({\mathtt {h}}_{{{\mathtt {w}}'}})\), which satisfies

$$\begin{aligned} {\left| L_\omega ^{-1} R\right| }_{{r^*},{\eta '},{{\mathtt {w}}'}}\le \gamma ^{-1} K{\left| R\right| }_{r,\eta ,{\mathtt {w}}}. \end{aligned}$$
(4.9)

Similarly, if R preserves momentum, assuming only

$$\begin{aligned} K_0:= \gamma \sup _{\begin{array}{c} j\in \mathbb {Z},\, {\varvec{{\alpha }}}\ne {\varvec{{\beta }}}\in {\mathbb {N}}^\mathbb {Z}\\ {\varvec{{\alpha }}}_j+{\varvec{{\beta }}}_j\ne 0\\ \sum _i i ({\varvec{{\alpha }}}_i-{\varvec{{\beta }}}_i)=0 \end{array}} \frac{c^{(j)}_{{r^*},{\eta '},{{\mathtt {w}}'}}({\varvec{{\alpha }}},{\varvec{{\beta }}}) }{c^{(j)}_{r,\eta ,{\mathtt {w}}}({\varvec{{\alpha }}},{\varvec{{\beta }}}) |\omega \cdot ({\varvec{{\alpha }}}-{\varvec{{\beta }}})|} < \infty , \end{aligned}$$
(4.10)

we have that S also preserves momentum and

$$\begin{aligned} {\left| L_\omega ^{-1} R\right| }_{{r^*},{\eta '},{{\mathtt {w}}'}}\le \gamma ^{-1} K_0 {\left| R\right| }_{r,\eta ,{\mathtt {w}}}. \end{aligned}$$
(4.11)

Proof

Given any Hamiltonian \(R\in {{\mathcal {R}}}\), the formal solution of \(L_\omega S = R\) is given by

$$\begin{aligned} L_\omega ^{-1} R = \sum _{ |{\varvec{{\alpha }}}|=|{\varvec{{\beta }}}|,\, {\varvec{{\alpha }}}\ne {\varvec{{\beta }}}} \frac{1}{\mathrm{i}{\left( \omega \cdot ({\varvec{{\alpha }}}-{\varvec{{\beta }}})\right) }} R_{{\varvec{{\alpha }}},{\varvec{{\beta }}}}{u^{\varvec{{\alpha }}}}{\bar{u}^{\varvec{{\beta }}}}, \end{aligned}$$
(4.12)

where \(u\in B_{{r^*}}({\mathtt {h}}_{{{\mathtt {w}}'}}).\) By Lemma 3.1 (ii) (applied to \(H^{(1)}=L_\omega ^{-1} R\) and \(H^{(2)}=R\)) and (4.8), we get (4.9). The momentum preserving case is analogous. \(\square \)
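The application of Lemma 3.1 (ii) in the proof above amounts to the following coefficient-wise comparison, written out here for convenience. With \(S=L_\omega ^{-1}R\), so that \(|S_{{\varvec{{\alpha }}},{\varvec{{\beta }}}}|=|R_{{\varvec{{\alpha }}},{\varvec{{\beta }}}}|/|\omega \cdot ({\varvec{{\alpha }}}-{\varvec{{\beta }}})|\) by (4.12), one has for all \({\varvec{{\alpha }}}\ne {\varvec{{\beta }}}\) and \(j\) with \({\varvec{{\alpha }}}_j+{\varvec{{\beta }}}_j\ne 0\)

$$\begin{aligned} |S_{{\varvec{{\alpha }}},{\varvec{{\beta }}}}|\, c^{(j)}_{{r^*},{\eta '},{{\mathtt {w}}'}}({\varvec{{\alpha }}},{\varvec{{\beta }}}) = |R_{{\varvec{{\alpha }}},{\varvec{{\beta }}}}|\, c^{(j)}_{r,\eta ,{\mathtt {w}}}({\varvec{{\alpha }}},{\varvec{{\beta }}})\, \frac{c^{(j)}_{{r^*},{\eta '},{{\mathtt {w}}'}}({\varvec{{\alpha }}},{\varvec{{\beta }}})}{c^{(j)}_{r,\eta ,{\mathtt {w}}}({\varvec{{\alpha }}},{\varvec{{\beta }}})\, |\omega \cdot ({\varvec{{\alpha }}}-{\varvec{{\beta }}})|} \le \gamma ^{-1}K\, |R_{{\varvec{{\alpha }}},{\varvec{{\beta }}}}|\, c^{(j)}_{r,\eta ,{\mathtt {w}}}({\varvec{{\alpha }}},{\varvec{{\beta }}}), \end{aligned}$$

which is the hypothesis of Lemma 3.1 (ii) with constant \(c=\gamma ^{-1}K\) and therefore gives (4.9).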

5 Abstract Birkhoff Normal Form

In this section we prove the abstract Birkhoff normal form Theorem 1.3. We start by defining a degree decomposition which endows \({{\mathcal {H}}}_{r,\eta }({\mathtt {h}}_{{\mathtt {w}}})\) with a graded Poisson algebra structure.

Definition 5.1

(minimal scaling degree). We say that H has minimal scaling degree \({\mathtt {d}}={\mathtt {d}}(H)\) (at zero) if

$$\begin{aligned}&H_{{\varvec{{\alpha }}},{\varvec{{\beta }}}}=0 ,\quad \forall \, {\varvec{{\alpha }}},{\varvec{{\beta }}}: \quad |{\varvec{{\alpha }}}|=|{\varvec{{\beta }}}|\le {\mathtt {d}},\\&H_{{\varvec{{\alpha }}},{\varvec{{\beta }}}}\ne 0 ,\quad \text {for some}\ \ {\varvec{{\alpha }}},{\varvec{{\beta }}}: \quad |{\varvec{{\alpha }}}|=|{\varvec{{\beta }}}|= {\mathtt {d}}+1. \end{aligned}$$

We say that \({\mathtt {d}}(0)=+\infty .\)

Essentially, H has scaling degree \({\mathtt {d}}\) if and only if it has a zero of order \(2{\mathtt {d}}+2\) at zero. We prefer this notation because we find it more intrinsic and because it produces a graded Poisson algebra structure; moreover, one has the following

Lemma 5.1

If \(H\in {{\mathcal {H}}}_{r,\eta }({\mathtt {h}}_{\mathtt {w}})\) with \({\mathtt {d}}(H)\ge {\mathtt {d}}\), then for all \({r^*}\le r\) one has

$$\begin{aligned} {\left| H\right| }^{{}}_{{r^*},\eta ,{\mathtt {w}}} \le {\left( \frac{{r^*}}{r}\right) }^{2{\mathtt {d}}} {\left| H\right| }^{{}}_{r,\eta ,{\mathtt {w}}}. \end{aligned}$$

Proof

Recalling (1.27), we have

$$\begin{aligned} \frac{c^{(j)}_{{r^*},\eta ,{\mathtt {w}}}({\varvec{{\alpha }}},{\varvec{{\beta }}})}{c^{(j)}_{r,\eta ,{\mathtt {w}}}({\varvec{{\alpha }}},{\varvec{{\beta }}})}= {\left( \frac{{r^*}}{r}\right) }^{|{\varvec{{\alpha }}}|+|{\varvec{{\beta }}}|-2}. \end{aligned}$$

Since \(|{\varvec{{\alpha }}}|+|{\varvec{{\beta }}}|-2\ge 2{\mathtt {d}}\), the inequality follows by Proposition 3.1. \(\square \)

The normal form will be proved iteratively by means of the following Lemma, which constitutes the main step of the procedure.

Basically we start with a Hamiltonian \( H = D_\omega + Z + R\) with \(Z\in {{\mathcal {K}}}^{}_{r}({\mathtt {h}}_{{\mathtt {w}}})\) in normal form and \(R\in {{\mathcal {R}}}^{}_{r,\eta }({\mathtt {h}}_{{\mathtt {w}}})\) of minimal degree \({\mathtt {d}}\), and we consider \({r'}\le r,{\eta '}\le \eta ,{{\mathtt {w}}'}\ge {\mathtt {w}}\) so that \({{\mathcal {H}}}_{r,\eta }({\mathtt {h}}_{\mathtt {w}})\subseteq {{\mathcal {H}}}_{{r'},{\eta '}}({\mathtt {h}}_{{\mathtt {w}}'})\). Then we give a sufficient condition which ensures the existence of a change of variables \( \Phi \ :\ B_{r'}({\mathtt {h}}_{{{\mathtt {w}}'}})\ \rightarrow \ B_{r}({\mathtt {h}}_{{{\mathtt {w}}'}})\) such that

$$\begin{aligned} H\circ \Phi = D_\omega + Z' + R' , \end{aligned}$$

with \( Z' , R' \in {{\mathcal {H}}}_{{r'},{\eta '}}({\mathtt {h}}_{{{\mathtt {w}}'}})\) and \(R'\) of minimal degree \({\mathtt {d}}+1\).

Lemma 5.2

Fix \(\omega \in {\mathtt {D}_{\gamma ,{q}}}.\) Let \(r>{r'}>0,\eta \ge {\eta '}\ge 0,\)\({\mathtt {w}}\le {{\mathtt {w}}'}.\) Consider

$$\begin{aligned} H = D_\omega + Z + R , \quad Z \in {{\mathcal {K}}}^{}_{r}({\mathtt {h}}_{{\mathtt {w}}}) , \quad R \in {{\mathcal {R}}}^{}_{r,\eta }({\mathtt {h}}_{{\mathtt {w}}}), \quad {\mathtt {d}}(Z)\ge 1, \ \ {\mathtt {d}}(R)\ge {\mathtt {d}}\ge 1. \end{aligned}$$

Assume that (3.3) and (4.8) hold and thatFootnote 13

$$\begin{aligned} |R|_{r,\eta ,{\mathtt {w}}} \le \frac{\gamma \delta }{K}, \qquad \text {with} \quad \delta :=\frac{r-{r'}}{16er} . \end{aligned}$$
(5.1)

Then there exists a change of variables

$$\begin{aligned}&\Phi \ :\ B_{r'}({\mathtt {h}}_{{{\mathtt {w}}'}})\ \rightarrow \ B_{r}({\mathtt {h}}_{{{\mathtt {w}}'}}), \end{aligned}$$
(5.2)

such that

$$\begin{aligned}&H\circ \Phi = D_\omega + Z' + R' , \quad Z' \in {{\mathcal {K}}}_{{r'},{\eta '}}({\mathtt {h}}_{{{\mathtt {w}}'}}) , \\&\quad R' \in {{\mathcal {R}}}_{{r'},{\eta '}}({\mathtt {h}}_{{{\mathtt {w}}'}}), \quad {\mathtt {d}}(Z')\ge 1, \ \ {\mathtt {d}}(R')\ge {\mathtt {d}}+1. \end{aligned}$$

Moreover\(^{14}\)

$$\begin{aligned} |Z'|_{{r'},{\eta '},{{\mathtt {w}}'}}\le & {} |Z|_{r,\eta ,{\mathtt {w}}} + (\gamma \delta )^{-1} K |R|_{r,\eta ,{\mathtt {w}}} (C|R|_{r,\eta ,{\mathtt {w}}}+ |Z|_{r,\eta ,{\mathtt {w}}}), \nonumber \\ |R'|_{{r'},{\eta '},{{\mathtt {w}}'}}\le & {} (\gamma \delta )^{-1} K |R|_{r,\eta ,{\mathtt {w}}} (C|R|_{r,\eta ,{\mathtt {w}}}+ |Z|_{r,\eta ,{\mathtt {w}}}). \end{aligned}$$
(5.3)

Finally, for \({\mathtt {w}}^\sharp \ge {{\mathtt {w}}'},\) assume the further conditions

$$\begin{aligned} \gamma \sup _{\begin{array}{c} j\in \mathbb {Z},\, {\varvec{{\alpha }}}\ne {\varvec{{\beta }}}\in {\mathbb {N}}^\mathbb {Z}\\ {\varvec{{\alpha }}}_j+{\varvec{{\beta }}}_j\ne 0 \end{array}} \frac{c^{(j)}_{{r^*},{\eta '},{\mathtt {w}}^\sharp }({\varvec{{\alpha }}},{\varvec{{\beta }}}) }{c^{(j)}_{r,\eta ,{\mathtt {w}}}({\varvec{{\alpha }}},{\varvec{{\beta }}}) |\omega \cdot ({\varvec{{\alpha }}}-{\varvec{{\beta }}})|}=: K^\sharp < \infty , \qquad {r^*}:=\frac{{r'}+r}{2} \end{aligned}$$
(5.4)

and

$$\begin{aligned} |R|_{r,\eta ,{\mathtt {w}}} \le \frac{\gamma \delta }{K^\sharp }. \end{aligned}$$
(5.5)

Then

$$\begin{aligned}&\Phi _{\big |B_{r'}({\mathtt {h}}_{{\mathtt {w}}^\sharp })}\ :\ B_{r'}({\mathtt {h}}_{{\mathtt {w}}^\sharp })\ \rightarrow \ B_{r}({\mathtt {h}}_{{\mathtt {w}}^\sharp }), \nonumber \\&\quad \sup _{u\in B_{r'}({\mathtt {h}}_{{\mathtt {w}}^\sharp })} {\left| \Phi (u)-u\right| }_{{\mathtt {h}}_{{\mathtt {w}}^\sharp }} \le r\gamma ^{-1} K^\sharp {\left| R\right| }_{r,\eta ,{\mathtt {w}}}. \end{aligned}$$
(5.6)

Moreover, if R preserves momentum, assuming only that

$$\begin{aligned} K^\sharp _0:= \gamma \sup _{\begin{array}{c} j\in \mathbb {Z},\, {\varvec{{\alpha }}}\ne {\varvec{{\beta }}}\in {\mathbb {N}}^\mathbb {Z}\\ {\varvec{{\alpha }}}_j+{\varvec{{\beta }}}_j\ne 0,\\ \sum _i i({\varvec{{\alpha }}}_i -{\varvec{{\beta }}}_i)=0 \end{array}} \frac{c^{(j)}_{{r^*},{\eta '},{\mathtt {w}}^\sharp }({\varvec{{\alpha }}},{\varvec{{\beta }}}) }{c^{(j)}_{r,\eta ,{\mathtt {w}}}({\varvec{{\alpha }}},{\varvec{{\beta }}}) |\omega \cdot ({\varvec{{\alpha }}}-{\varvec{{\beta }}})|} <\infty \end{aligned}$$
(5.7)

and that (5.1), (5.5) hold with \(K_0,K^\sharp _0\) instead of \(K,K^\sharp \), we have that \(R'\) preserves momentum and that (5.6) holds with \(K^\sharp _0\) instead of \(K^\sharp .\)

Proof

By Lemma 4.2 let \(S= L_\omega ^{-1} R\) in \({{\mathcal {R}}}_{{r^*},{\eta '}}({\mathtt {h}}_{{{\mathtt {w}}'}})\) be the unique solution of the homological equation \(L_\omega S = R\) on \( B_{{r^*}}({\mathtt {h}}_{{{\mathtt {w}}'}})\). Note that \({\mathtt {d}}(S)\ge {\mathtt {d}}\). We have

$$\begin{aligned} {\left| S\right| }_{{r^*},{\eta '},{{\mathtt {w}}'}}\le \gamma ^{-1} K{\left| R\right| }_{r,\eta ,{\mathtt {w}}}. \end{aligned}$$
(5.8)

We now apply Lemma 2.1 with \((r,\eta ,{\mathtt {w}})\rightsquigarrow ({r'},{\eta '},{{\mathtt {w}}'})\) and \(\rho :={r^*}-{r'}.\) Note that (5.1) and (5.8) imply (2.2). We define \(\Phi :=\Phi _S^1\) and compute

$$\begin{aligned} H'&:=H\circ \Phi = D_\omega + Z + (e^{{\left\{ S,\cdot \right\} }}-{\text {id}}-\{S,\cdot \}) D_\omega + (e^{{\left\{ S,\cdot \right\} }}-{\text {id}})(Z+R)\\&= D_\omega + Z - \sum _{j=2}^\infty \frac{{\left( \mathrm{ad} S\right) }^{j-1}}{j!} R + (e^{{\left\{ S,\cdot \right\} }}-{\text {id}})(Z+R). \end{aligned}$$
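The cancellation behind this formula can be spelled out as follows. This is only a sketch, under the assumption (suggested by the two displayed lines above, but not restated in this section) that \(\mathrm{ad} S=\{S,\cdot \}\), that \(H\circ \Phi _S^1=e^{\{S,\cdot \}}H\), and that the homological equation \(L_\omega S=R\) is equivalent to \(\{S,D_\omega \}=-R\):

$$\begin{aligned} e^{\{S,\cdot \}}(D_\omega +Z+R)&= D_\omega +\{S,D_\omega \}+(e^{\{S,\cdot \}}-{\text {id}}-\{S,\cdot \})D_\omega +(Z+R)+(e^{\{S,\cdot \}}-{\text {id}})(Z+R),\\ \{S,D_\omega \}+R&=0,\\ (e^{\{S,\cdot \}}-{\text {id}}-\{S,\cdot \})D_\omega&=\sum _{j=2}^\infty \frac{(\mathrm{ad} S)^{j-1}}{j!}\{S,D_\omega \}=-\sum _{j=2}^\infty \frac{(\mathrm{ad} S)^{j-1}}{j!}R. \end{aligned}$$

Under this convention the terms \(\{S,D_\omega \}\) and \(R\) cancel in the first line, and every remaining term other than \(D_\omega +Z\) contains at least one Poisson bracket with S.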

We now set

$$\begin{aligned} Z'= \Pi _{{{\mathcal {K}}}} H' -D_\omega ,\quad R' = \Pi _{{{\mathcal {R}}}} H'. \end{aligned}$$

Since the scaling degree is additive w.r.t. Poisson brackets, we have that \({\mathtt {d}}(Z')\ge 1\) and \({\mathtt {d}}(R')\ge {\mathtt {d}}+1\). By (2.7)

$$\begin{aligned} |Z'|_{{r'},{\eta '},{{\mathtt {w}}'}}\le & {} |Z|_{{r'},{\eta '},{{\mathtt {w}}'}} + (\gamma \delta )^{-1} K |R|_{r,\eta ,{\mathtt {w}}} (|R|_{{r^*},{\eta '},{{\mathtt {w}}'}}+ |Z|_{{r^*},{\eta '},{{\mathtt {w}}'}}), \\ |R'|_{{r'},{\eta '},{{\mathtt {w}}'}}\le & {} (\gamma \delta )^{-1} K |R|_{r,\eta ,{\mathtt {w}}} (|R|_{{r^*},{\eta '},{{\mathtt {w}}'}}+ |Z|_{{r^*},{\eta '},{{\mathtt {w}}'}}). \end{aligned}$$

Since (4.8) holds we can apply Proposition 3.1: by (3.4) and (3.5) we get

$$\begin{aligned} |R|_{{r^*},{\eta '},{{\mathtt {w}}'}} \le C |R|_{r,\eta ,{\mathtt {w}}}, \qquad |Z|_{{r^*},{\eta '},{{\mathtt {w}}'}} \le |Z|_{r,\eta ,{\mathtt {w}}}. \end{aligned}$$

(5.3) follows.

Finally assume (5.5) and (5.4). By Lemma 4.2 let \(S^\sharp = L_\omega ^{-1} R\) in \({{\mathcal {R}}}_{{r^*},{\eta '}}({\mathtt {h}}_{{\mathtt {w}}^\sharp })\) be the solution of the homological equation \(L_\omega S^\sharp = R\) on \(B_{{r^*}}({\mathtt {h}}_{{\mathtt {w}}^\sharp })\subseteq B_{{r^*}}({\mathtt {h}}_{{{\mathtt {w}}'}})\). Since S and \(S^\sharp \) solve the same linear equation on \(B_{{r^*}}({\mathtt {h}}_{{\mathtt {w}}^\sharp })\), we have that

$$\begin{aligned} S^\sharp =S_{\big | B_{{r^*}}({\mathtt {h}}_{{\mathtt {w}}^\sharp })}. \end{aligned}$$

By (4.9) we get

$$\begin{aligned} {\left| S\right| }_{{r^*},{\eta '},{\mathtt {w}}^\sharp } \le \gamma ^{-1} K^\sharp {\left| R\right| }_{r,\eta ,{\mathtt {w}}}. \end{aligned}$$
(5.9)

We now apply Lemma 2.1 with \((r,\eta ,{\mathtt {w}})\rightsquigarrow ({r'},{\eta '},{\mathtt {w}}^\sharp )\) and \(\rho :={r^*}-{r'}.\) Note that (5.5) and (5.9) imply (2.2). Then (5.6) follows by (2.3) and (5.9).

The momentum preserving case is analogous. \(\square \)

Theorem 1.3 follows by applying Lemma 5.2 iteratively. Let \(\eta \ge 0\) and a sequence of weights \(\mathtt {w}_0\le \mathtt {w}_1\le \cdots \le \mathtt {w}_\mathtt {N}={\mathtt {w}}\) be given. For any given \( r>0\) we set

$$\begin{aligned} r_n =(2 - \frac{n}{\mathtt {N}}) r ,\quad \eta _n = (1 - \frac{n}{\mathtt {N}})\eta ,\, \quad 0\le n\le \mathtt {N}, \quad \quad r_n^* = \frac{r_{n+1}+ r_n}{2},\quad 0\le n<\mathtt {N}.\nonumber \\ \end{aligned}$$
(5.10)
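It may help to record a few elementary consequences of (5.10). The symbol \(\delta _n\) below is introduced only for this illustration: it denotes the quantity \(\delta \) of (5.1) computed with \((r,{r'})=(r_n,r_{n+1})\).

$$\begin{aligned} r_0&=2r,\qquad r_{\mathtt {N}}=r,\qquad r_n-r_{n+1}=\frac{r}{\mathtt {N}},\qquad r_n^*=\Big (2-\frac{2n+1}{2\mathtt {N}}\Big )r,\\ \delta _n&:=\frac{r_n-r_{n+1}}{16e\, r_n}=\frac{1}{16e(2\mathtt {N}-n)}\ \ge \ \frac{1}{32e\mathtt {N}},\qquad 0\le n<\mathtt {N}. \end{aligned}$$

The lower bound \(1/(32e\mathtt {N})\) is uniform in n; it is the quantity denoted \({\hat{\delta }}\) in the proof of Theorem 1.3.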

From Assumption 1 and (1.27) we have\(^{15}\)

$$\begin{aligned}&\max \Bigg \{1,\ \sup _{0\le n<\mathtt {N}} \sup _{\begin{array}{c} j,{\varvec{{\alpha }}},{\varvec{{\beta }}}\\ {\varvec{{\alpha }}}_j+{\varvec{{\beta }}}_j\ne 0 \end{array}} \frac{c^{(j)}_{r^*_{n},\eta _{n+1},{\mathtt {w}}_{n+1}}({\varvec{{\alpha }}},{\varvec{{\beta }}}) }{c^{(j)}_{r_{n},\eta _{n},{\mathtt {w}}_n}({\varvec{{\alpha }}},{\varvec{{\beta }}}) }\Bigg \} ={{\mathfrak {C}}}<\infty , \end{aligned}$$
(5.11)
$$\begin{aligned}&\max \Bigg \{1,\ \sup _{0\le n<\mathtt {N}} \sup _{\begin{array}{c} j,{\varvec{{\alpha }}},{\varvec{{\beta }}}\\ {\varvec{{\alpha }}}_j+{\varvec{{\beta }}}_j\ne 0 \end{array}} \frac{ c^{(j)}_{r^*_{n},\eta _{n+1},{\mathtt {w}}_{n+1}}({\varvec{{\alpha }}},{\varvec{{\beta }}}) }{c^{(j)}_{r_{n},\eta _{n},{\mathtt {w}}_n}({\varvec{{\alpha }}},{\varvec{{\beta }}}) |\omega \cdot ({\varvec{{\alpha }}}-{\varvec{{\beta }}})|}\Bigg \} ={{\mathfrak {K}}}<\infty , \end{aligned}$$
(5.12)
$$\begin{aligned}&\max \Bigg \{1,\ \sup _{0\le n<\mathtt {N}} \sup _{\begin{array}{c} j,{\varvec{{\alpha }}},{\varvec{{\beta }}}\\ {\varvec{{\alpha }}}_j+{\varvec{{\beta }}}_j\ne 0 \end{array}} \frac{ c^{(j)}_{r^*_{n},\eta _{\mathtt {N}},{\mathtt {w}}_\mathtt {N}}({\varvec{{\alpha }}},{\varvec{{\beta }}}) }{c^{(j)}_{r_{n},\eta _{n},{\mathtt {w}}_n}({\varvec{{\alpha }}},{\varvec{{\beta }}}) |\omega \cdot ({\varvec{{\alpha }}}-{\varvec{{\beta }}})|} \Bigg \} ={{\mathfrak {K}}^\sharp }<\infty . \end{aligned}$$
(5.13)

For brevity we set

$$\begin{aligned}&{\mathtt {h}}_n:={\mathtt {h}}_{\mathtt {w}_n},\quad {{\mathcal {H}}}_n:= {{\mathcal {H}}}_{r_n,\eta _n}({\mathtt {h}}_n),\, \quad 0\le n\le \mathtt {N},\quad \nonumber \\&\quad {{\mathcal {H}}}_{n,*}:= {{\mathcal {H}}}_{r^*_n,\eta _{n+1}}({\mathtt {h}}_{n+1}),\quad 0\le n<\mathtt {N}, \end{aligned}$$
(5.14)

and, correspondingly, \({{\mathcal {R}}}_n,{{\mathcal {K}}}_n, {{{\mathcal {R}}}}_{n,*}, {{{\mathcal {K}}}}_{n,*}\) and

$$\begin{aligned} |\cdot |_n:=|\cdot |_{r_n,\eta _n,{\mathtt {w}}_n},\qquad |\cdot |_{n,*}:=|\cdot |_{r^*_n,\eta _{n+1},{\mathtt {w}}_{n+1}}. \end{aligned}$$
(5.15)

Lemma 5.3

By Assumption (5.11) we have the immersion properties

$$\begin{aligned} {{\mathcal {H}}}_0\subseteq {{\mathcal {H}}}_{0,*} \subseteq \cdots \subseteq {{\mathcal {H}}}_n\subseteq {{\mathcal {H}}}_{n,*} \subseteq {{\mathcal {H}}}_{n+1} \subseteq \cdots \subseteq {{\mathcal {H}}}_\mathtt {N}, \end{aligned}$$
(5.16)

with estimates

$$\begin{aligned}&H\in {{\mathcal {H}}}_n\qquad \Longrightarrow \qquad |H|_{n,*}\le {{\mathfrak {C}}} |H|_n, \qquad \quad 0\le n\le \mathtt {N}-1 \nonumber \\&H\in {{\mathcal {K}}}_n\qquad \Longrightarrow \qquad |H|_{n,*} \le |H|_n,\qquad \quad 0\le n\le \mathtt {N}-1. \end{aligned}$$
(5.17)

Proof

We apply Proposition 3.1 with

$$\begin{aligned} r,\eta ,{\mathtt {w}}\rightsquigarrow r_n,\eta _{n},{\mathtt {w}}_{n} ,\quad r^*,\eta ',{\mathtt {w}}' \rightsquigarrow r^*_n,\eta _{n+1},{\mathtt {w}}_{n+1} , \end{aligned}$$

by noting that the bound (3.3) follows from (5.11). The bounds in (5.17) follow from (3.4) and (3.5). The chain of inclusions (5.16) follows. \(\square \)

Proof of Theorem 1.3

We will prove the thesis inductively. Let us start by noticing that

$$\begin{aligned} \widehat{{\mathtt {r}}}= \min \Bigg \{ \frac{{\bar{r}}}{8\sqrt{|G|_{{\bar{r}},\eta ,{\mathtt {w}}_0}}} \sqrt{\frac{\gamma {\hat{\delta }} }{\max \{{{\mathfrak {C}}}{{\mathfrak {K}}},{{\mathfrak {K}}}^\sharp \} }} ,\ \frac{{\bar{r}}}{2}\Bigg \} ,\qquad {\hat{\delta }}:=\frac{1}{32 e\mathtt {N}}\, \end{aligned}$$

and, for all \(0<r\le \widehat{{\mathtt {r}}}\), let us set

$$\begin{aligned} {\varepsilon }:=\gamma ^{-1}{\left( \frac{2 r}{{\bar{r}}}\right) }^2 |G|_{{\bar{r}},\eta ,{\mathtt {w}}_0} = \frac{1}{2^9 e}{\left( \frac{ r}{r_\star }\right) }^2. \end{aligned}$$

From definition (1.32) we thus deduce that

$$\begin{aligned} 8\, {\varepsilon }\max \{{{\mathfrak {C}}}{{\mathfrak {K}}},{{\mathfrak {K}}}^\sharp \}{\hat{\delta }}^{-1} \le 1. \end{aligned}$$
(5.18)

Recalling the notations introduced in (5.10)–(5.15), by Lemma 5.1 we have

$$\begin{aligned} \gamma ^{-1}|G|_0 \le {\varepsilon }, \end{aligned}$$

hence, setting \( Z^{(0)}:=\Pi _{{{\mathcal {K}}}} G\) and \(R^{(0)}:=\Pi _{{{\mathcal {R}}}} G,\) from (4.7) it follows that

$$\begin{aligned} \gamma ^{-1}|Z^{(0)}|_0,\ \gamma ^{-1}|R^{(0)}|_0\le {\varepsilon }. \end{aligned}$$

We perform an iterative procedure producing a sequence of Hamiltonians, for \(n= 0,\dots ,{\mathtt {N}}\)

$$\begin{aligned}&H^{(n)}= D_\omega + Z^{(n)}+ R^{(n)} , \nonumber \\&Z^{(n)}\in {{\mathcal {K}}}_{n}, \ \ R^{(n)} \in {{\mathcal {R}}}_{n} , \quad {\mathtt {d}}(Z^{(n)})\ge 1,\ \ {\mathtt {d}}(R^{(n)})\ge n+1 , \nonumber \\&\gamma ^{-1}|Z^{(n)}|_{n} \le {\varepsilon }\sum _{h=0}^n 2^{-h} ,\quad \gamma ^{-1}|R^{(n)}|_{n} \le {\varepsilon }^{n+1} {\left( 4 {{\mathfrak {C}}}{{\mathfrak {K}}}{\hat{\delta }}^{-1}\right) }^n {\mathop {\le }\limits ^{(5.18)}}2^{-n}{\varepsilon }. \end{aligned}$$
(5.19)

Fix any \(k < \mathtt {N}\). Let us assume that we have constructed \(H^{(0)},\ldots ,H^{(k)}\) satisfying (5.19) for all \(0\le n\le k.\) We want to apply Lemma 5.2 with

$$\begin{aligned} H, r,\eta ,{\mathtt {w}}\ \rightsquigarrow \ H^{(k)}, r_k, \eta _k, {\mathtt {w}}_k \quad \text {and}\quad {r'},{\eta '},{{\mathtt {w}}'},{\mathtt {w}}^\sharp ,{\mathtt {d}}\ \rightsquigarrow \ r_{k+1},\eta _{k+1},{\mathtt {w}}_{k+1},{\mathtt {w}}_\mathtt {N},k+1. \end{aligned}$$

By construction the bounds (3.3), (4.8) and (5.4) hold since \(C\le {{\mathfrak {C}}}\), \(K\le {{\mathfrak {K}}}\), \(K^\sharp \le {{\mathfrak {K}}}^\sharp \), where \({{\mathfrak {C}}}, {{\mathfrak {K}}}, {{\mathfrak {K}}}^\sharp \) were defined in (5.11), (5.12), (5.13). We just have to verify that (5.1) holds, namely

$$\begin{aligned} |R^{(k)}|_k\le \frac{\gamma }{{{\mathfrak {K}}}}\frac{r_k-r_{k+1}}{16 e r_k}. \end{aligned}$$

In fact, by applying the inductive hypothesis (5.19) and the smallness condition (5.18), we get

$$\begin{aligned} |R^{(k)}|_k\le \gamma {\left( 4 {{\mathfrak {C}}}{{\mathfrak {K}}}{\hat{\delta }}^{-1}\right) }^k {\varepsilon }^{k+1} \le \frac{\gamma {\varepsilon }}{2^k}\le \frac{\gamma }{16 e {{\mathfrak {K}}}(2\mathtt {N}- k)} = \frac{\gamma }{{{\mathfrak {K}}}}\frac{r_k-r_{k+1}}{16 e r_k}. \end{aligned}$$

The verification of (5.5) is completely analogous.
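The middle inequalities in the chain above use (5.18) twice. A short sketch, where q is a shorthand introduced only here:

$$\begin{aligned} q&:=4{{\mathfrak {C}}}{{\mathfrak {K}}}{\hat{\delta }}^{-1}{\varepsilon }\le \frac{1}{2}\quad \text {by (5.18)},\qquad \text {so}\qquad \gamma q^k{\varepsilon }\le \frac{\gamma {\varepsilon }}{2^k}\le \gamma {\varepsilon },\\ {\varepsilon }&\le \frac{{\hat{\delta }}}{8{{\mathfrak {C}}}{{\mathfrak {K}}}}=\frac{1}{256\, e\, \mathtt {N}\, {{\mathfrak {C}}}{{\mathfrak {K}}}}\le \frac{1}{32\, e\, \mathtt {N}\, {{\mathfrak {K}}}}\le \frac{1}{16\, e\, {{\mathfrak {K}}}(2\mathtt {N}-k)}, \end{aligned}$$

where the last two steps use \({{\mathfrak {C}}}\ge 1\) and \(2\mathtt {N}-k\le 2\mathtt {N}\). Since (5.18) involves \(\max \{{{\mathfrak {C}}}{{\mathfrak {K}}},{{\mathfrak {K}}}^\sharp \}\), the same computation with \({{\mathfrak {K}}}^\sharp \) in place of \({{\mathfrak {C}}}{{\mathfrak {K}}}\) gives \({\varepsilon }\le 1/(256\, e\, \mathtt {N}\, {{\mathfrak {K}}}^\sharp )\), which is the bound needed for (5.5).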

So, by applying Lemma 5.2 we construct a change of variables \(\Phi _k\) as in (5.2) with

$$\begin{aligned} \Phi _k\ :\ B_{r_{k+1}}({\mathtt {h}}_{{\mathtt {w}}_{k+1}})\ \rightarrow \ B_{r_k}({\mathtt {h}}_{{\mathtt {w}}_{k+1}}). \end{aligned}$$

Let us now set

$$\begin{aligned} H^{(k+1)}=D_\omega +Z^{(k+1)}+R^{(k+1)}:= H^{(k)}\circ \Phi _k \end{aligned}$$

with \(Z^{(k+1)}\in {{\mathcal {K}}}_{k+1}, R^{(k+1)}\in {{\mathcal {R}}}_{k+1}\) and \({\mathtt {d}}(Z^{(k+1)})\ge 1,\) \({\mathtt {d}}(R^{(k+1)})\ge k+2.\) It remains to prove the bounds in the last line of (5.19) (with \(n=k+1\)). By (5.3) we have

$$\begin{aligned} |Z^{(k+1)}|_{k+1}\le & {} |Z^{(k)}|_{k} + (\gamma {\hat{\delta }})^{-1} {{\mathfrak {K}}}|R^{(k)}|_{k} ({{\mathfrak {C}}}|R^{(k)}|_{k}+ |Z^{(k)}|_{k}), \nonumber \\ |R^{(k+1)}|_{k+1}\le & {} (\gamma {\hat{\delta }})^{-1} {{\mathfrak {K}}}|R^{(k)}|_{k} ({{\mathfrak {C}}}|R^{(k)}|_{k}+ |Z^{(k)}|_{k}). \end{aligned}$$
(5.20)

By substituting the inductive hypothesis (5.19), we have the following chain of inequalities

$$\begin{aligned} \gamma ^{-1}|R^{(k+1)}|_{k+1}\le & {} {\hat{\delta }}^{-1} {\varepsilon }^2 {{\mathfrak {K}}}(4{{\mathfrak {C}}}{{\mathfrak {K}}}{\hat{\delta }}^{-1}{\varepsilon })^k ({{\mathfrak {C}}}(4 {{\mathfrak {C}}}{{\mathfrak {K}}}{\hat{\delta }}^{-1}{\varepsilon })^k + 2)\\&{\mathop {\le }\limits ^{(5.18)}}&{\hat{\delta }}^{-1} {\varepsilon }^2 {{\mathfrak {K}}}(4{{\mathfrak {C}}}{{\mathfrak {K}}}{\hat{\delta }}^{-1}{\varepsilon })^k ({{\mathfrak {C}}}+ 2)\\\le & {} (4{{\mathfrak {C}}}{{\mathfrak {K}}}{\hat{\delta }}^{-1} )^{k+1} {\varepsilon }^{k+2} = (4{{\mathfrak {C}}}{{\mathfrak {K}}}{\hat{\delta }}^{-1}\varepsilon )^{k+1}\varepsilon , \end{aligned}$$

which proves the bound on \(R^{(n)}\) in (5.19) for any n.
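The chain above can be unpacked as follows, writing \(q:=4{{\mathfrak {C}}}{{\mathfrak {K}}}{\hat{\delta }}^{-1}{\varepsilon }\) (a shorthand used only in this sketch) and recalling that \(q\le 1/2\) by (5.18) and that \({{\mathfrak {C}}}\ge 1\) by (5.11):

$$\begin{aligned} \gamma ^{-1}|R^{(k)}|_k&\le {\varepsilon }q^k,\qquad \gamma ^{-1}|Z^{(k)}|_k\le {\varepsilon }\sum _{h=0}^k2^{-h}\le 2{\varepsilon },\\ \gamma ^{-1}|R^{(k+1)}|_{k+1}&\le {\hat{\delta }}^{-1}{{\mathfrak {K}}}\, {\varepsilon }q^k\big ({{\mathfrak {C}}}{\varepsilon }q^k+2{\varepsilon }\big )\le {\hat{\delta }}^{-1}{{\mathfrak {K}}}({{\mathfrak {C}}}+2)\, {\varepsilon }^2q^k\le 4{{\mathfrak {C}}}{{\mathfrak {K}}}{\hat{\delta }}^{-1}{\varepsilon }\cdot {\varepsilon }q^{k}={\varepsilon }q^{k+1}. \end{aligned}$$

The second line is the second estimate in (5.20) divided by \(\gamma \); the last inequality uses \({{\mathfrak {C}}}+2\le 3{{\mathfrak {C}}}\le 4{{\mathfrak {C}}}\).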

En passant, we note that

$$\begin{aligned} \gamma {\varepsilon }{\left( 4 {{\mathfrak {C}}}{{\mathfrak {K}}}{\hat{\delta }}^{-1}{\varepsilon }\right) }^{\mathtt {N}} = \frac{\gamma }{2^9 e r_\star ^2} {\left( \frac{ {{\mathfrak {C}}}{{\mathfrak {K}}}\mathtt {N}}{4 r_\star ^2}\right) }^{\mathtt {N}} r^{2(\mathtt {N}+1)} . \end{aligned}$$
(5.21)
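Identity (5.21) is a direct substitution of \({\hat{\delta }}^{-1}=32e\mathtt {N}\) and of the expression of \({\varepsilon }\) in terms of \(r_\star \) given above. As a sketch:

$$\begin{aligned}&4{{\mathfrak {C}}}{{\mathfrak {K}}}{\hat{\delta }}^{-1}{\varepsilon }=4{{\mathfrak {C}}}{{\mathfrak {K}}}\cdot 32e\mathtt {N}\cdot \frac{r^2}{2^9e\, r_\star ^2}=\frac{{{\mathfrak {C}}}{{\mathfrak {K}}}\mathtt {N}}{4r_\star ^2}\, r^2,\\&\gamma {\varepsilon }\big (4{{\mathfrak {C}}}{{\mathfrak {K}}}{\hat{\delta }}^{-1}{\varepsilon }\big )^{\mathtt {N}}=\frac{\gamma \, r^2}{2^9e\, r_\star ^2}\Big (\frac{{{\mathfrak {C}}}{{\mathfrak {K}}}\mathtt {N}}{4r_\star ^2}\Big )^{\mathtt {N}}r^{2\mathtt {N}}. \end{aligned}$$

Collecting the powers of r gives the factor \(r^{2(\mathtt {N}+1)}\) in (5.21).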

Finally, using the same strategy as above, we also get

$$\begin{aligned} \gamma ^{-1}|Z^{(k+1)}|_{k+1 } \le {\varepsilon }{\left( \sum _{h=0}^k 2^{-h} + (4{{\mathfrak {C}}}{{\mathfrak {K}}}{\hat{\delta }}^{-1} )^{k+1} {\varepsilon }^{k+1}\right) } {\mathop {\le }\limits ^{(5.18)}}{\varepsilon }\sum _{h=0}^{k+1} 2^{-h}, \end{aligned}$$

which completes the proof of the inductive step for (5.19). We also remark that

$$\begin{aligned} \varepsilon \sum _{h=0}^{\mathtt {N}} 2^{-h} = \frac{r^2}{ 2^8 e r_\star ^2} {\left( 1 - 2^{-\mathtt {N}- 1}\right) }. \end{aligned}$$
(5.22)
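For the reader's convenience, (5.22) is just the finite geometric sum combined with \({\varepsilon }=r^2/(2^9e\, r_\star ^2)\):

$$\begin{aligned} \sum _{h=0}^{\mathtt {N}}2^{-h}=2-2^{-\mathtt {N}}=2\big (1-2^{-\mathtt {N}-1}\big ),\qquad {\varepsilon }\sum _{h=0}^{\mathtt {N}}2^{-h}=\frac{r^2}{2^9e\, r_\star ^2}\cdot 2\big (1-2^{-\mathtt {N}-1}\big )=\frac{r^2}{2^8e\, r_\star ^2}\big (1-2^{-\mathtt {N}-1}\big ). \end{aligned}$$

In particular the normal form term stays bounded by \(2{\varepsilon }\) uniformly in the number of steps.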

By (5.6) we have

$$\begin{aligned}&\Phi _k\ :\ B_{r_{k+1}}({\mathtt {h}}_{{\mathtt {w}}_\mathtt {N}})\ \rightarrow \ B_{r_k}({\mathtt {h}}_{{\mathtt {w}}_\mathtt {N}}), \nonumber \\&\quad \sup _{u\in B_{r_{k+1}}({\mathtt {h}}_{{\mathtt {w}}_\mathtt {N}})} {\left| \Phi _k(u)-u\right| }_{{\mathtt {w}}_\mathtt {N}} \le r_k \gamma ^{-1} {{\mathfrak {K}}}^\sharp |R^{(k)}|_k. \end{aligned}$$
(5.23)

In conclusion we define

$$\begin{aligned} \Psi := \Phi _0\circ \Phi _1\circ \dots \circ \Phi _{\mathtt {N}-1}\ :\ B_{ r}( {\mathtt {h}}_{\mathtt {N}}) \rightarrow B_{2r}({\mathtt {h}}_{\mathtt {N}}). \end{aligned}$$

Since

$$\begin{aligned}&\Phi _0\circ \Phi _1\circ \dots \circ \Phi _{\mathtt {N}-1}-{\text {id}}\\&\quad = (\Phi _0-{\text {id}})\circ \Phi _1\circ \dots \circ \Phi _{\mathtt {N}-1}+ (\Phi _1-{\text {id}})\circ \Phi _2\circ \dots \circ \Phi _{\mathtt {N}-1}+ \cdots + (\Phi _{\mathtt {N}-1}-{\text {id}}), \end{aligned}$$

by (5.23) we get

$$\begin{aligned} \sup _{u\in B_r({\mathtt {h}}_{{\mathtt {w}}_\mathtt {N}})} {\left| \Psi (u)-u\right| }_{{\mathtt {w}}_\mathtt {N}} \le \sum _{k=0}^{\mathtt {N}-1} r_k \gamma ^{-1} {{\mathfrak {K}}}^\sharp |R^{(k)}|_k {\mathop {\le }\limits ^{(5.19)}}2r {\varepsilon }{{\mathfrak {K}}}^\sharp \sum _{k=0}^{\mathtt {N}-1} 2^{-k} \le 4r {{\mathfrak {K}}}^\sharp {\varepsilon }, \end{aligned}$$

proving the first bound in (1.33). The second bound in (1.33) can be written as \(8 {\widehat{C}}_1 r^2 \le 1\), which follows from \(r\le \widehat{\mathtt {r}}\). We finally set \(Z= Z^{(\mathtt {N})}, R= R^{(\mathtt {N})}\) and the estimates (1.34) follow by (5.21)–(5.22). Of course the same reasoning can be applied in order to construct the inverse, i.e. a symplectic change of variables \(\Phi : B_{r}({\mathtt {h}}_{{\mathtt {w}}})\rightarrow B_{2r}({\mathtt {h}}_{{\mathtt {w}}})\) such that

$$\begin{aligned} \Psi \circ \Phi u=\Phi \circ \Psi u= u,\quad \forall u\in B_{\frac{7}{8} r}({\mathtt {h}}_{\mathtt {w}}). \end{aligned}$$
(5.24)

\(\square \)

When the nonlinearity G preserves momentum Theorem 1.3 can be reformulated under slightly weaker assumptions. More precisely, setting \(\eta =0\)

$$\begin{aligned}&{{\mathfrak {C}}}_0 := \max \Bigg \{1,\ \sup _{0\le n<\mathtt {N}} \sup _{\begin{array}{c} j,{\varvec{{\alpha }}},{\varvec{{\beta }}},\\ {\varvec{{\alpha }}}_j+{\varvec{{\beta }}}_j\ne 0, \\ \pi ({\varvec{{\alpha }}}-{\varvec{{\beta }}})=0 \end{array}} \frac{c^{(j)}_{\varrho ^*_{n},0,{\mathtt {w}}_{n+1}}({\varvec{{\alpha }}},{\varvec{{\beta }}}) }{c^{(j)}_{\varrho _{n},0,{\mathtt {w}}_n}({\varvec{{\alpha }}},{\varvec{{\beta }}}) } \Bigg \}<\infty , \nonumber \\&{{\mathfrak {K}}}_0:= \max \Bigg \{1,\ \sup _{0\le n<\mathtt {N}} \sup _{\begin{array}{c} j,{\varvec{{\alpha }}},{\varvec{{\beta }}},\\ {\varvec{{\alpha }}}_j+{\varvec{{\beta }}}_j\ne 0, \\ \pi ({\varvec{{\alpha }}}-{\varvec{{\beta }}})=0 \end{array}} \frac{ c^{(j)}_{\varrho ^*_{n},0,{\mathtt {w}}_{n+1}}({\varvec{{\alpha }}},{\varvec{{\beta }}}) }{c^{(j)}_{\varrho _{n},0,{\mathtt {w}}_n}({\varvec{{\alpha }}},{\varvec{{\beta }}}) |\omega \cdot ({\varvec{{\alpha }}}-{\varvec{{\beta }}})|}\Bigg \}<\infty , \nonumber \\&{{\mathfrak {K}}}^\sharp _0 := \max \Bigg \{1,\ \sup _{0\le n<\mathtt {N}} \sup _{\begin{array}{c} j,{\varvec{{\alpha }}},{\varvec{{\beta }}},\\ {\varvec{{\alpha }}}_j+{\varvec{{\beta }}}_j\ne 0, \\ \pi ({\varvec{{\alpha }}}-{\varvec{{\beta }}})=0 \end{array}} \frac{ c^{(j)}_{\varrho ^*_{n},0,{\mathtt {w}}_\mathtt {N}}({\varvec{{\alpha }}},{\varvec{{\beta }}}) }{c^{(j)}_{\varrho _{n},0,{\mathtt {w}}_n}({\varvec{{\alpha }}},{\varvec{{\beta }}}) |\omega \cdot ({\varvec{{\alpha }}}-{\varvec{{\beta }}})|} \Bigg \} <\infty , \end{aligned}$$
(5.25)

the following holds.

Proposition 5.1

If G preserves momentum, Theorem 1.3 holds verbatim with \({{\mathfrak {C}}}_0 ,{{\mathfrak {K}}}_0,{{\mathfrak {K}}}^\sharp _0 \) instead of \({{\mathfrak {C}}},{{\mathfrak {K}}},{{\mathfrak {K}}}^\sharp \). Moreover, the new perturbation R also preserves momentum.

We note that in the case that G preserves momentum, the same result holds with \({{\mathfrak {C}}}_0 ,{{\mathfrak {K}}}_0,{{\mathfrak {K}}}^\sharp _0 \) instead of \({{\mathfrak {C}}},{{\mathfrak {K}}},{{\mathfrak {K}}}^\sharp \); moreover also R preserves momentum.

We finally give the following abstract stability result, whose proof is postponed to “Appendix B”.

Lemma 5.4

On the Hilbert space \({\mathtt {h}}_{\mathtt {w}}\) consider the dynamical system

$$\begin{aligned} \dot{v} = X_{{\mathcal {N}}}+X_R, \qquad v(0)=v_0, \qquad |v_0|_{\mathtt {w}}\le \frac{3}{4} r, \end{aligned}$$

where \({\mathcal {N}}\in \mathcal {A}_{r,0}({\mathtt {h}}_{\mathtt {w}}) \) and \(R\in {{\mathcal {H}}}_{r,\eta }({\mathtt {h}}_{\mathtt {w}})\) for some \(r>0,\eta \ge 0.\) Assume that

$$\begin{aligned} {\text {Re}}(X_{{\mathcal {N}}},v)_{{\mathtt {h}}_{\mathtt {w}}}=0. \end{aligned}$$

Then

$$\begin{aligned} \Big | |v(t)|_{\mathtt {w}}-|v_0|_{\mathtt {w}}\Big |< \frac{r}{8}, \qquad \forall \, |t|\le \frac{1}{8|R|_{r,\eta ,{\mathtt {w}}}}. \end{aligned}$$
(5.26)

Corollary 5.1

Under the same assumptions as in Theorem 1.3, the solution u(t) of the Hamiltonian flow of (1.31) with initial datum \(u(0)=u_0\) such that \(|u_0|_{\mathtt {w}}\le \frac{3r}{8}\) exists and satisfies

$$\begin{aligned} |u(t)|_{{\mathtt {w}}} \le r \quad \text {for all times}\quad |t|\le \frac{1}{8 \widehat{\mathtt {C}}_3 r^{2(\mathtt {N}+1)}} . \end{aligned}$$
(5.27)

Proof

Let us consider the Hamiltonian (1.31), take an initial datum \(|u_0|_{\mathtt {w}}:=r< \frac{3}{8} \widehat{\mathtt {r}}\) and apply the change of variables of Theorem 1.3. Setting \(v(0)= \Psi (u_0)\), we are under the hypotheses of Lemma 5.4 with \(\eta =0\) and we conclude

$$\begin{aligned} |v(t)|_{\mathtt {w}}\le \frac{7}{8} r , \qquad \forall \, |t|\le \frac{1}{8|R|_{r,0,{\mathtt {w}}}}. \end{aligned}$$

Now we can apply (5.24) in order to return to the original variables and deduce that \(u(t)= \Phi v(t)\) satisfies

$$\begin{aligned} |u(t)|_{\mathtt {w}}\le r , \qquad \forall \, |t|\le \frac{r^{-2(\mathtt {N}+1)}}{8 \widehat{\mathtt {C}}_3 } \le \frac{1}{8|R|_{r,0,{\mathtt {w}}}} . \end{aligned}$$

\(\square \)

Part 2. Applications to Gevrey and Sobolev Cases

In Part 2 we show how to apply the abstract BNF to Gevrey and Sobolev cases. Following the notations given in the introduction we work in the three sequence spaces defined for the applications \(\mathtt {G},\mathtt {S},\mathtt {M}\), see page 7. As explained in the introduction, in order to prove the estimates on the stability times we just need to verify that Assumption 1 holds. This is the content of the next sections.

Let us start by setting some notations.

Case \(\mathtt {G})\) In the case \({\mathtt {w}}(p,s,a)= {\left( \langle j \rangle ^p e^{s\langle j \rangle ^\theta + a |j|}\right) }_{j\in \mathbb {Z}}\) we denote \({{\mathtt {h}}_{\mathtt {w}(p,s,a)}}={\mathtt {h}}_{p,s,a}\), and we use the same notation \(|\cdot |_{p,s,a}\) for the norm of vectors. For the norm of Hamiltonians we write \( |\cdot |_{r,\eta , {\mathtt {w}}(p,s,a)}\), consistently with Definition 1.2. Of course, for any \( 0\le p \le p', 0 \le s \le s', 0 \le a \le a' \) we have

$$\begin{aligned} {\mathtt {h}}_{p', s', a'} \subseteq {\mathtt {h}}_{p, s, a},\qquad {\left| v\right| }_{p, s, a} \le {\left| v\right| }_{p', s', a'},\quad \forall v\in {\mathtt {h}}_{p', s', a'}. \end{aligned}$$

Case \(\mathtt {S})\) If \(a=s=0\) we denote \({\mathtt {h}}_{p,0,0}= {\mathtt {h}}_p\), with the same notation for the norm of vectors \(|\cdot |_p\) and of Hamiltonians \(|\cdot |_{r,\eta ,{\mathtt {w}}(p)}\).

Remark 5.1

Note that, via the usual Fourier identification one has:

$$\begin{aligned} |u|_p\le |u(x)|_{L^2}+|\partial _x^p u(x)|_{L^2}\le 2 |u|_p. \end{aligned}$$
(5.28)

Case \(\mathtt {M})\) In the case \({\mathtt {w}}_j=\lfloor j \rfloor ^p\) where

$$\begin{aligned} \lfloor j \rfloor := \max \{|j|,2\} \end{aligned}$$

we denote the norm of vectors as

$$\begin{aligned} \Vert u\Vert _p^2 = \Vert u\Vert _{\mathtt {w}}^2:=\sum _{j\in \mathbb Z} \lfloor j \rfloor ^{2p} |u_j|^2 . \end{aligned}$$
(5.29)

Remark 5.2

Note that \({\mathtt {h}}_{\mathtt {w}}\) in \(\mathtt {M})\) and \({\mathtt {h}}_p\) are the same vector space endowed with two equivalent norms. Moreover one has

$$\begin{aligned} \Vert u\Vert _p\le 2^p |u(x)|_{L^2}+|\partial _x^p u(x)|_{L^2}\le 2\Vert u\Vert _p. \end{aligned}$$
(5.30)

Definition 5.2

(momentum preserving regular Hamiltonians). Given \(r>0,p\ge 0\) let \({{\mathcal {H}}}^{r,p}\) be the space of point-wise absolutely convergent Hamiltonians on \(\Vert u\Vert _p\le r\) which preserve momentum and such that

$$\begin{aligned} \Vert H\Vert _{r,p} := r^{-1} {\left( \sup _{\Vert u\Vert _{p}\le r} \Vert {X}_{{{\underline{H}}}}\Vert _{p} \right) }< \infty , \end{aligned}$$
(5.31)

namely\(^{16}\)

$$\begin{aligned} \Vert \cdot \Vert _{r,p}=|\cdot |_{{{\mathcal {H}}}_{r,0}({\mathtt {h}}_{\mathtt {w}})},\qquad {\mathtt {w}}_j= \lfloor j \rfloor ^p \end{aligned}$$

We now verify that the nonlinearities in (1.1) are bounded in the norm \(|\cdot |_{r,\eta ,{\mathtt {w}}}\) in the cases \(\mathtt {S},\mathtt {M},\mathtt {G}\).

Proposition 5.2

Consider the correction term \(P= \int _{\mathbb {T}}F(x,|u|^2)dx\) in the NLS Hamiltonian (1.21), where the argument f in F satisfies (1.2). Let \(p> 1/2\).

  1. (i)

    For any \(a,s,\eta \ge 0\) such that \(a+\eta <{\mathtt {a}}\) and any \(r>0\) such that\(^{17}\) \(({C_{\mathtt {alg}}(p)}r)^2\le R\), we have

    $$\begin{aligned} | P|_{r,\eta ,{\mathtt {w}}(p,s,a)} \le {C_{\mathtt {Nem}}}(p,s,{\mathtt {a}}- a -\eta )\frac{({C_{\mathtt {alg}}(p)}r)^2}{R}|f|_{{\mathtt {a}},R}< \infty \end{aligned}$$
    (5.32)

    where f and \(|f|_{{\mathtt {a}},R}\) are defined in (1.2).

  2. (ii)

    If F is independent of\(^{18}\) \(x\), for \(({C_{\mathtt {alg},\mathtt {M}}(p)}r)^2\le R\) we have

    $$\begin{aligned} \Vert P\Vert _{r,p} \le 2^p \frac{({C_{\mathtt {alg},\mathtt {M}}(p)}r)^2}{R}|f|_{R}< \infty . \end{aligned}$$
    (5.33)

This Proposition follows directly from the fact that the corresponding sequence spaces \({\mathtt {h}}_{\mathtt {w}}\) are closed with respect to convolution.

Let \(\star :{\mathtt {h}}_{p, s, a} \times {\mathtt {h}}_{p, s, a} \rightarrow {\mathtt {h}}_{p, s, a}\) be the convolution operation defined as

$$\begin{aligned} {\left( f,g\right) } \mapsto f\star g :={\left( \sum _{{j_1,j_2\in \mathbb {Z},\, j_1+j_2=j}} f_{j_1}g_{j_2}\right) }_{j\in \mathbb {Z}}. \end{aligned}$$

The map \(\star : {\left( f,g\right) } \mapsto f\star g\) is continuous in the following sense:

Lemma 5.5

For \(p>1/2\) we have

$$\begin{aligned} {\left| f\star g\right| }_{p,s,a} \le {C_{\mathtt {alg}}(p)}{\left| f\right| }_{p,s,a}{\left| g\right| }_{p,s,a},\qquad \Vert f\star g\Vert _p\le {C_{\mathtt {alg},\mathtt {M}}(p)}\Vert f\Vert _p \Vert g\Vert _p.\nonumber \\ \end{aligned}$$
(5.34)

The proof is given in “Appendix B”.

Proof of Proposition 5.2

By definition (recall (1.2) and (1.21))

$$\begin{aligned} F(x,y)=\int _0^y f(x,s) ds = \sum _{d=2}^\infty \frac{f^{(d-1)}(x)}{d} y^{d} =: \sum _{d=2}^\infty F^{(d)}(x) y^{d} \end{aligned}$$
(5.35)

therefore we have

$$\begin{aligned} P= \int _{\mathbb {T}}F(x,|u|^2)dx= \sum _{d\ge 2}{\left( F^{(d)}\star \underbrace{ u \star \cdots \star u}_{d \,\text{ times }} \star \underbrace{{\bar{u}} \star \cdots \star {\bar{u}}}_{d \,\text{ times }}\right) }_0. \end{aligned}$$

To each analytic function \(F^{(d)}(x)\) we associate its Fourier coefficients; we have \({\left( F^{(d)}_j\right) }_{j\in \mathbb {Z}}\in {\mathtt {h}}_{p,s,a_0}\) for \( a_0:=a+\eta <{\mathtt {a}}\) and \(s,p\ge 0\). Indeed

$$\begin{aligned} |F^{(d)}|_{p,s,a_0}^2&:= \sum _{j} e^{2a_0 |j|+ 2s \langle j \rangle ^\theta }\langle j \rangle ^{2p} |F^{(d)}_j|^2 {\mathop {=}\limits ^{(5.35)}} \sum _{j} e^{2a_0 |j|+ 2s \langle j \rangle ^\theta }\langle j \rangle ^{2p} \frac{|f^{(d-1)}_j|^2}{d^2}\\&\le \frac{c^2(p,s,{\mathtt {a}}- a_0)}{d^2} \sum _{j} e^{2{\mathtt {a}}|j|} |f^{(d-1)}_j|^2 = \frac{c^2(p,s,{\mathtt {a}}- a_0)}{d^2}|f^{(d-1)}|^2_{\mathbb {T}_{\mathtt {a}}}\, \end{aligned}$$

with

$$\begin{aligned} c(p,s,t):=e^s+ \sup _{x\ge 1} x^p e^{-t x+s x^\theta }. \end{aligned}$$
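The role of the constant \(c(p,s,t)\) is to absorb the Sobolev and Gevrey weights into a loss of analyticity. A sketch of the pointwise estimate behind the inequality above, with \(t={\mathtt {a}}-a_0>0\) and assuming \(0<\theta <1\) (an assumption made here so that the supremum defining c is finite):

$$\begin{aligned} j=0:&\quad \langle 0 \rangle ^{p}e^{s\langle 0 \rangle ^\theta }=e^{s}\le c(p,s,t),\\ |j|=x\ge 1:&\quad x^{p}e^{a_0x+sx^\theta }=\big (x^{p}e^{-tx+sx^\theta }\big )e^{{\mathtt {a}}x}\le c(p,s,t)\, e^{{\mathtt {a}}x}. \end{aligned}$$

Squaring these bounds and summing over j against \(|f^{(d-1)}_j|^2\) gives the factor \(c^2(p,s,{\mathtt {a}}-a_0)\) in front of \(|f^{(d-1)}|^2_{\mathbb {T}_{\mathtt {a}}}\).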

Now condition (1.2) ensures that (B.12) holds and our claim follows, by Lemma B.2, setting \(a_0= a+\eta \).

(ii) Follows from (B.14). \(\square \)

6 Immersions

The following proposition gathers the immersion properties of the norm \(|\cdot |_{r,\eta ,{\mathtt {w}}(p,s,a)}\) with respect to the parameters p, s, a.

Proposition 6.1

The following inequalities hold:

  1. (1)

    Variation w.r.t. the parameter p. For any \(0<\rho <r\), \(0<\sigma < \eta \) and \(p_1>0\) we have

    $$\begin{aligned} {\left| H\right| }_{r-\rho ,\eta -\sigma ,{\mathtt {w}}(p+p_1,s,a)}\le {C_{\mathtt {mon}}}(r/\rho , {\sigma },p_1) {\left| H\right| }_{r,\eta ,{\mathtt {w}}(p,s,a)}. \end{aligned}$$
  2. (2)

    Variation w.r.t. the parameter s. For any \(0<\sigma < \eta \) we have

    $$\begin{aligned} |H|_{r,\eta -{\sigma },{\mathtt {w}}(p, s+{\sigma },a)} \le |H|_{r,\eta ,{\mathtt {w}}(p,s,a)}. \end{aligned}$$
    (6.1)
  3. (3)

    Variation w.r.t. the parameter a. For any \(0<\sigma < \eta \)

    $$\begin{aligned} |H|_{e^{-{\sigma }}r, \eta -{\sigma },{\mathtt {w}}(p,s,a+{\sigma })} \le e^{2{\sigma }}|H|_{r,\eta ,{\mathtt {w}}(p,s,a)}. \end{aligned}$$
    (6.2)

Remark 6.1

All the items in the previous Proposition describe immersion properties of \({{\mathcal {H}}}_{r,\eta }({\mathtt {h}}_{p,s,a})\) w.r.t variations of the parameters.

In item (1) we say that if \(H\in {{\mathcal {H}}}_{r,\eta }({\mathtt {h}}_{p,s,a})\) (i.e. if its vector field maps \(B_r({\mathtt {h}}_{p,s,a})\rightarrow {\mathtt {h}}_{p,s,a}\)) then it is also in \({{\mathcal {H}}}_{r-\rho ,\eta -{\sigma }}({\mathtt {h}}_{p+p_1,s,a})\) for any \(\rho ,{\sigma },p_1>0\). Note however that the norm of H in the latter space is in general much larger; its growth is controlled by the constant \({C_{\mathtt {mon}}}\).

In item (3) we have essentially the same phenomenon, except that in order to increase the analyticity parameter \(a \rightsquigarrow a+{\sigma }\) we need to decrease the radius to \(e^{-{\sigma }} r\).

Item (2) gives the best bound: not only \({{\mathcal {H}}}_{r,\eta }({\mathtt {h}}_{p,s,a})\subseteq {{\mathcal {H}}}_{r,\eta -{\sigma }}({\mathtt {h}}_{p,s+{\sigma },a})\), but the norm of H in the latter space does not increase.

To prove this Proposition we show that the hypotheses of Proposition 3.1 hold. In order to prove this, in turn we strongly rely on some notation and results introduced by Bourgain in [Bou05] and extended later on by Cong–Li–Shi–Yuan in [CLSY] (Definition 6.1 and Lemma 6.1 below). The definitions and lemmata given below are the key technical arguments. Many of the ideas come from Bourgain in [Bou05] in the case of Gevrey regularity and for momentum preserving Hamiltonians, here we give a detailed presentation adapted to our more general setting and covering also the case of Sobolev regularity.

Definition 6.1

Given a vector \(v={\left( v_i\right) }_{i\in \mathbb {Z}}\), \(v_i\in {\mathbb {N}}\), with \(|v|<\infty \), we denote by \(\widehat{n}=\widehat{n}(v)\) the vector \({\left( \widehat{n}_l\right) }_{l\in I}\) (where \(I\subset {\mathbb {N}}\) is finite) which is the decreasing rearrangement of

$$\begin{aligned} \{{\mathbb {N}}\ni h> 1\,\, \text{ repeated }\, v_h + v_{-h}\, \text{ times } \} \cup {\left\{ 1\,\, \text{ repeated }\, v_1 + v_{-1} + v_0\, \text{ times } \right\} }. \end{aligned}$$

Remark 6.2

A good way of envisioning this list is as follows. Given \(v={\left( v_i\right) }_{i\in \mathbb {Z}}\) consider the monomial \(x^v:= \prod _i x_i^{v_i}\). We can write uniquely

$$\begin{aligned} x^v= \prod _i x_i^{v_i} = x_{j_1} x_{j_2}\cdots x_{j_{|v|}} \end{aligned}$$

then \(\widehat{n}(v)\) is the decreasing rearrangement of the list \({\left( \langle j_1 \rangle ,\dots ,\langle j_{|v|} \rangle \right) }\).

As an example, consider the case \(v\ne 0\). Then, by construction there exists a unique \(J\ge 0\) such that \(v_j=0\) for all \(|j|>J\) and \(v_{J}+ v_{-J}\ne 0\) hence

$$\begin{aligned} v=(\dots ,0, v_{-J},\dots ,v_0,\dots ,v_J,0\dots ). \end{aligned}$$

If \(J=0\) then

$$\begin{aligned} \widehat{n}= (\underbrace{1,\dots ,1}_{v_0 \, \mathrm {times}}) \end{aligned}$$

otherwise we have

$$\begin{aligned} \widehat{n}= (\underbrace{J,\dots ,J}_{v_J+ v_{-J} \, \mathrm {times}}, \underbrace{J-1,\dots ,J-1}_{v_{J-1}+ v_{-J+1} \, \mathrm {times}},\dots ,\underbrace{1,\dots ,1}_{v_1+ v_{-1}+ v_0 \, \mathrm {times}}). \end{aligned}$$
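As a concrete instance of the construction (the specific vector is chosen here only for illustration), take v with \(v_{-2}=1\), \(v_0=2\), \(v_3=1\) and \(v_i=0\) otherwise, so that \(|v|=4\) and \(J=3\):

$$\begin{aligned} x^v=x_{-2}\, x_0^2\, x_3=x_{-2}\, x_0\, x_0\, x_3,\qquad {\left( \langle -2 \rangle ,\langle 0 \rangle ,\langle 0 \rangle ,\langle 3 \rangle \right) }=(2,1,1,3),\qquad \widehat{n}(v)=(3,2,1,1). \end{aligned}$$

Here 3 appears \(v_3+v_{-3}=1\) time, 2 appears \(v_2+v_{-2}=1\) time, and 1 appears \(v_1+v_{-1}+v_0=2\) times, in agreement with Definition 6.1.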

Given \({\varvec{{\alpha }}},{\varvec{{\beta }}}\in {\mathbb {N}}^\mathbb {Z}\) with \(1\le |{\varvec{{\alpha }}}|=|{\varvec{{\beta }}}|<\infty ,\) from now on we define

$$\begin{aligned} \widehat{n}=\widehat{n}({\varvec{{\alpha }}}+{\varvec{{\beta }}}). \end{aligned}$$

We set the even number

$$\begin{aligned} N:=|{\varvec{{\alpha }}}|+|{\varvec{{\beta }}}|, \end{aligned}$$

which is the cardinality of \(\widehat{n}.\) We observe that, given

$$\begin{aligned} \pi = \sum _{i\in \mathbb {Z}} i{\left( {\varvec{{\alpha }}}_i - {\varvec{{\beta }}}_i\right) }= \sum _{h> 0} h {\left( {\varvec{{\alpha }}}_h - {\varvec{{\beta }}}_h - {\varvec{{\alpha }}}_{-h} + {\varvec{{\beta }}}_{-h}\right) } , \end{aligned}$$

there exists a choice of \({\sigma }_l \in \{\pm 1, 0\}\) such that

$$\begin{aligned} \pi = \sum _l \sigma _l\widehat{n}_l \end{aligned}$$
(6.3)

with \(\sigma _l \ne 0\) if \(\widehat{n}_l \ne 1\). Hence,

$$\begin{aligned} \widehat{n}_1\le {\left| \pi \right| } + \sum _{l\ge 2}\widehat{n}_l. \end{aligned}$$
(6.4)

Indeed, if \(\sigma _1 = \pm 1\), the inequality follows directly from (6.3); if \(\sigma _1 = 0\), then \(\widehat{n}_1=1\) and consequently \(\widehat{n}_l = 1\, \forall l\). Since the mass is conserved, the list \(\widehat{n}\) has at least two elements, and the inequality is achieved.
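Inequality (6.4) can also be sanity-checked by brute force on small configurations. The sketch below is only an illustration under stated assumptions: \({\varvec{{\alpha }}}\) and \({\varvec{{\beta }}}\) are encoded as tuples of sites listed with multiplicity, and the helper names `n_hat_from_sites` and `check_6_4` are hypothetical.

```python
from itertools import combinations_with_replacement


def n_hat_from_sites(sites):
    """Decreasing rearrangement of <j> = max(|j|, 1) over a list of sites."""
    return sorted((max(abs(j), 1) for j in sites), reverse=True)


def check_6_4(max_site: int, max_degree: int) -> bool:
    """Check n_1 <= |pi| + sum_{l>=2} n_l for all alpha, beta with
    |alpha| = |beta| = k, 1 <= k <= max_degree, supported in [-max_site, max_site]."""
    sites = range(-max_site, max_site + 1)
    for k in range(1, max_degree + 1):
        for alpha in combinations_with_replacement(sites, k):
            for beta in combinations_with_replacement(sites, k):
                pi = sum(alpha) - sum(beta)          # pi = sum_i i (alpha_i - beta_i)
                n = n_hat_from_sites(alpha + beta)   # n_hat(alpha + beta)
                if n[0] > abs(pi) + sum(n[1:]):
                    return False
    return True


result = check_6_4(max_site=4, max_degree=2)
print(result)
```

Such a finite check does not replace the argument given above; it only confirms the statement on the enumerated cases.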

Lemma 6.1

Given \({\varvec{{\alpha }}},{\varvec{{\beta }}}\) such that \(\sum _i i ({\varvec{{\alpha }}}_i-{\varvec{{\beta }}}_i)=\pi \in \mathbb {Z}\), and setting \(\widehat{n}=\widehat{n}({\varvec{{\alpha }}}+{\varvec{{\beta }}})\), we have

$$\begin{aligned} \sum _i \langle i \rangle ^\theta ({\varvec{{\alpha }}}_i+{\varvec{{\beta }}}_i) =\sum _{l\ge 1} \widehat{n}_l^\theta \ge 2 \widehat{n}^\theta _1+ (2-2^\theta ) {\sum _{l\ge 3} \widehat{n}_l^\theta } -\theta {{\left| \pi \right| }}. \end{aligned}$$
(6.5)

Proof

In “Appendix C”. \(\square \)

The lemma above was proved in the simpler case of momentum preserving Hamiltonians in [Bou05] for \(\theta =\frac{1}{2}\) and for general \(\theta \) in [CLSY]. It is fundamental in discussing the properties of \({{\mathcal {H}}}_{r,\eta }({\mathtt {h}}_{p,s,a})\) with \(s>0\), indeed it implies

$$\begin{aligned} \sum _i \langle i \rangle ^\theta ({\varvec{{\alpha }}}_i+{\varvec{{\beta }}}_i) -2\langle j \rangle ^\theta + |\pi ({\varvec{{\alpha }}}-{\varvec{{\beta }}})|\ge (1-\theta ) {\left( \sum _{l\ge 3} \widehat{n}_l^\theta +|\pi |\right) } \ge 0 \end{aligned}$$
(6.6)

for all \({\varvec{{\alpha }}},{\varvec{{\beta }}}\) such that \({\varvec{{\alpha }}}_j+{\varvec{{\beta }}}_j\ne 0\).
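For completeness, here is a sketch of how (6.6) can be deduced from (6.5). It assumes \(0<\theta <1\) (an assumption made explicit here) and uses that \(\langle j \rangle \le \widehat{n}_1\) whenever \({\varvec{{\alpha }}}_j+{\varvec{{\beta }}}_j\ne 0\), together with the elementary convexity bound \(2^\theta \le 1+\theta \) on \([0,1]\), that is \(2-2^\theta \ge 1-\theta \):

$$\begin{aligned} \sum _i \langle i \rangle ^\theta ({\varvec{{\alpha }}}_i+{\varvec{{\beta }}}_i) -2\langle j \rangle ^\theta + |\pi |\ge & {} 2 \widehat{n}^\theta _1 - 2\langle j \rangle ^\theta + (2-2^\theta ) \sum _{l\ge 3} \widehat{n}_l^\theta + (1-\theta ) |\pi | \\\ge & {} (2-2^\theta ) \sum _{l\ge 3} \widehat{n}_l^\theta + (1-\theta ) |\pi | \\\ge & {} (1-\theta ) {\left( \sum _{l\ge 3} \widehat{n}_l^\theta +|\pi |\right) }. \end{aligned}$$

The first line is (6.5) with the term \(-\theta |\pi |\) combined with \(+|\pi |\); the second uses \(\langle j \rangle \le \widehat{n}_1\); the third uses \(2-2^\theta \ge 1-\theta \).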

Proof of Proposition 6.1

In all that follows we shall use systematically the fact that our Hamiltonians preserve the mass and are zero at the origin. These facts imply that \({\left| {\varvec{{\alpha }}}\right| } = {\left| {\varvec{{\beta }}}\right| } \ge 1\).

Let us start by proving Item (2), which is the simplest case. We need to show that

$$\begin{aligned} \frac{ c^{(j)}_{r,\eta -{\sigma },{\mathtt {w}}(p,s+{\sigma },a)}({\varvec{{\alpha }}},{\varvec{{\beta }}})}{c^{(j)}_{r,\eta ,{\mathtt {w}}(p,s,a) }({\varvec{{\alpha }}},{\varvec{{\beta }}})} = \exp \Big (-{\sigma }\Big (\sum _i \langle i \rangle ^\theta ({\varvec{{\alpha }}}_i+{\varvec{{\beta }}}_i) -2\langle j \rangle ^\theta + |\pi ({\varvec{{\alpha }}}-{\varvec{{\beta }}})|\Big )\Big ) \le 1. \end{aligned}$$
(6.7)

The last inequality follows from (6.6), which is a consequence of Lemma 6.1.

Item (1) First we assume that \(\rho \le r/2.\) By Proposition 3.1 for any \(0<\rho \le r/2\) , \(0<\sigma < \eta \) and \(p_1>0\) we need to compute

$$\begin{aligned} {C_{\mathtt {mon}}}:= & {} \sup _{\begin{array}{c} j,{\varvec{{\alpha }}},{\varvec{{\beta }}}\\ {\varvec{{\alpha }}}_j+{\varvec{{\beta }}}_j\ne 0 \end{array}} \frac{c^{(j)}_{r-\rho ,\eta -{\sigma },{\mathtt {w}}(p+p_1,s,a) }({\varvec{{\alpha }}},{\varvec{{\beta }}}) }{ c^{(j)}_{r,\eta ,{\mathtt {w}}(p,s,a) }({\varvec{{\alpha }}},{\varvec{{\beta }}})} \nonumber \\= & {} \sup _{\begin{array}{c} j,{\varvec{{\alpha }}},{\varvec{{\beta }}}\\ {\varvec{{\alpha }}}_j+{\varvec{{\beta }}}_j\ne 0 \end{array}} {\left( \frac{\langle j \rangle ^2}{\prod _i\langle i \rangle ^{{\varvec{{\alpha }}}_i+{\varvec{{\beta }}}_i}}\right) }^{p_1} e^{-{\sigma }|\pi |} {\left( \frac{r-\rho }{r}\right) }^{|{\varvec{{\alpha }}}|+|{\varvec{{\beta }}}|-2}. \end{aligned}$$
(6.8)

We use the notations of Definition 6.1, with \(\widehat{n}({\varvec{{\alpha }}}+{\varvec{{\beta }}})\equiv \widehat{n}\). Since \({\varvec{{\alpha }}}_j+{\varvec{{\beta }}}_j\ne 0\) we have that \(\langle j \rangle \le \widehat{n}_1\). Note that

$$\begin{aligned} \prod _i\langle i \rangle ^{{\varvec{{\alpha }}}_i+{\varvec{{\beta }}}_i}= \prod _{l\ge 1}\widehat{n}_l. \end{aligned}$$
(6.9)

Hence

$$\begin{aligned} {\frac{\langle j \rangle ^2}{\prod _i\langle i \rangle ^{{\varvec{{\alpha }}}_i+{\varvec{{\beta }}}_i}}}\le \frac{\widehat{n}_1}{\prod _{l\ge 2}\widehat{n}_l}. \end{aligned}$$

Let us call \(N=|{\varvec{{\alpha }}}|+|{\varvec{{\beta }}}|\ge 2\). By (6.4) we have that

$$\begin{aligned} \sup _{\begin{array}{c} j,{\varvec{{\alpha }}},{\varvec{{\beta }}}\\ {\varvec{{\alpha }}}_j+{\varvec{{\beta }}}_j\ne 0 \end{array}} {\frac{\langle j \rangle ^2}{\prod _i\langle i \rangle ^{{\varvec{{\alpha }}}_i+{\varvec{{\beta }}}_i}}}\le & {} \frac{\widehat{n}_1}{\prod _{l\ge 2}\widehat{n}_l} \le \frac{\sum _{l=2}^{N}\widehat{n}_l +|\pi |}{\prod _{l= 2}^{N}\widehat{n}_l}\nonumber \\\le & {} \frac{(N-1)\widehat{n}_2 +|\pi |}{\prod _{l= 2}^{N}\widehat{n}_l} \le \frac{N+|\pi |}{\prod _{l=3}^{N} \widehat{n}_l}. \end{aligned}$$
(6.10)

We have shown that

$$\begin{aligned} \sup _{\begin{array}{c} j,{\varvec{{\alpha }}},{\varvec{{\beta }}}\\ {\varvec{{\alpha }}}_j+{\varvec{{\beta }}}_j\ne 0 \end{array}} {\frac{\langle j \rangle ^2}{\prod _i\langle i \rangle ^{{\varvec{{\alpha }}}_i+{\varvec{{\beta }}}_i}}} \le N+|\pi | . \end{aligned}$$

Since \({\left( N +|\pi |\right) }^{p_1}\le 2^{p_1}(N^{p_1}+|\pi |^{p_1})\), denoting \(L:=\ln {\left( r/(r-\rho )\right) }\), we repeatedly use Lemma C.1 in order to control

$$\begin{aligned}&\sup _{N\ge 2,\pi \in \mathbb {Z}} {\left( N +|\pi |\right) }^{p_1} e^{-{\sigma }|\pi |} {\left( \frac{r-\rho }{r}\right) }^{N-2}\nonumber \\&\quad \le 2^{p_1} {\left( \sup _{N\ge 2\, ,\pi \in \mathbb {Z}} N^{p_1} e^{-{\sigma }|\pi | - L(N-2) } + \sup _{N\ge 2\, ,\pi \in \mathbb {Z}} |\pi |^{p_1} e^{-{\sigma }|\pi | - L(N-2) } \right) }\nonumber \\&\quad \le 2^{p_1} {\left( \max \left\{ {\left( \frac{p_1}{L}\right) }^{p_1} ,1 \right\} +{\left( \frac{p_1}{{\sigma }}\right) }^{p_1}\right) } \le 2^{p_1+1} \max \left\{ {\left( \frac{p_1}{L}\right) }^{p_1} , {\left( \frac{p_1}{{\sigma }}\right) }^{p_1}, 1\right\} \nonumber \\&\quad \le 2^{p_1+1} p_1^{p_1} \max \left\{ {\left( \frac{2 r}{\rho }\right) }^{p_1} , {\left( \frac{1}{{\sigma }}\right) }^{p_1}, 1\right\} = {C_{\mathtt {mon}}}, \end{aligned}$$
(6.11)

using that

$$\begin{aligned} L\ge \ln (1+\rho /r)\ge 2\ln (3/2)\,\rho /r \ge \rho /(2r), \end{aligned}$$

which holds since we are in the case \(\rho \le r/2.\) This completes the proof in the case \(\rho \le r/2.\)
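The chain of lower bounds for \(L\) can be checked numerically on a grid of values of \(x=\rho /r\in (0,1/2]\). This is a minimal sketch with hypothetical helper names (`chain`, `chain_holds`); the small tolerance only guards against floating-point rounding at the endpoint \(x=1/2\), where the middle inequality is an equality.

```python
import math


def chain(x: float):
    """Return (L, ln(1+x), 2 ln(3/2) x, x/2) for x = rho / r."""
    L = math.log(1.0 / (1.0 - x))  # L = ln(r / (r - rho))
    return L, math.log1p(x), 2.0 * math.log(1.5) * x, x / 2.0


def chain_holds(x: float, tol: float = 1e-12) -> bool:
    """True if L >= ln(1+x) >= 2 ln(3/2) x >= x/2 up to rounding."""
    a, b, c, d = chain(x)
    return a >= b - tol and b >= c - tol and c >= d - tol


grid = [k / 1000 for k in range(1, 501)]  # x in (0, 1/2]
all_ok = all(chain_holds(x) for x in grid)
print(all_ok)
```

The middle inequality relies on the concavity of \(\ln (1+x)\) on \([0,1/2]\) and is no longer true for \(x\) close to 1, which is consistent with the restriction \(\rho \le r/2\).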

Consider now the case \(r/2<\rho <r.\) Using the monotonicity of the norm w.r.t. r and the already proved case with \(\rho =r/2\), we have

$$\begin{aligned}&{\left| H\right| }_{r-\rho ,\eta -\sigma ,{\mathtt {w}}(p+p_1,s,a)}\le {\left| H\right| }_{r/2,\eta -\sigma ,{\mathtt {w}}(p+p_1,s,a)}\\&\quad \le 2^{p_1+1} \max \left\{ {\left( 4p_1\right) }^{p_1} , {\left( \frac{p_1}{e{\sigma }}\right) }^{p_1}, 1\right\} {\left| H\right| }_{r,\eta ,{\mathtt {w}}(p,s,a)}\\&\quad \le 2^{p_1+1} p_1^{p_1} \max \left\{ {\left( \frac{2 r}{\rho }\right) }^{p_1} , {\left( \frac{1}{{\sigma }}\right) }^{p_1}, 1\right\} {\left| H\right| }_{r,\eta ,{\mathtt {w}}(p,s,a)} , \end{aligned}$$

proving (1) also in the case \(r/2<\rho <r.\)

Item (3) We proceed as in Items (1) and (2):

$$\begin{aligned} \frac{ c^{(j)}_{e^{-{\sigma }}r,\eta -{\sigma },{\mathtt {w}}(p,s,a+{\sigma })}({\varvec{{\alpha }}},{\varvec{{\beta }}})}{c^{(j)}_{r,\eta ,{\mathtt {w}}(p,s,a) }({\varvec{{\alpha }}},{\varvec{{\beta }}})}= & {} \exp \Big (-{\sigma }\Big (\sum _i \langle i \rangle ^\theta ({\varvec{{\alpha }}}_i+{\varvec{{\beta }}}_i) -2\langle j \rangle ^\theta \nonumber \\&\quad + |\pi ({\varvec{{\alpha }}}-{\varvec{{\beta }}})| - (|{\varvec{{\alpha }}}|+|{\varvec{{\beta }}}|-2)\Big )\Big ) \le e^{2{\sigma }}. \end{aligned}$$
(6.12)

Our claim follows since, by formula (6.4), one has

$$\begin{aligned} \sum _i {\left( {\varvec{{\alpha }}}_i + {\varvec{{\beta }}}_i\right) }{\left| i\right| } - 2{\left| j\right| } + {\left| \pi \right| } \ge \sum _{l\ge 2} \widehat{n}_l - \widehat{n}_1 + {\left| \pi \right| } - {\left| {\varvec{{\alpha }}}_0 + {\varvec{{\beta }}}_0\right| } \ge - {\left( {\left| {\varvec{{\alpha }}}\right| } + {\left| {\varvec{{\beta }}}\right| }\right) }. \end{aligned}$$
(6.13)

\(\square \)

Remark 6.3

Note that the key points in Items (1) and (2) are the estimates (6.10) and (6.6), where we control the ratio of the coefficients (1.27) in terms of \({\left\{ \widehat{n}_l\right\} }_{l\ge 3}\) (namely uniformly with respect to \(\widehat{n}_1\) and \(\widehat{n}_2\)). This means that if \(\widehat{n}_3\) is "big", then the norm of the Hamiltonian is correspondingly small: polynomially in the Sobolev case and subexponentially in the Gevrey one. This is a seminal property which appears in different flavors throughout the literature; in Proposition 6.1 we do not really need to exploit it. Instead, it will be heavily used for a sharp control on the small divisors appearing in the Homological equation (see the proof of Proposition 7.1).

Incidentally, we note that the norm \(|\cdot |_{r,\eta ,{\mathtt {w}}(p,s,a)}\) possesses the tameness property.

Proposition 6.2

$$\begin{aligned} \sup _{|u|_{p_0,s,a}\le r-\rho }\frac{|X_{\underline{H}}|_{p,s,a}}{|u|_{p,s,a}} \le {C_{\mathtt {tame}}}(\rho ,\eta ,p)|H|_{r,\eta ,{\mathtt {w}}(p_0,s,a)}. \end{aligned}$$

Proof

In “Appendix B”. \(\square \)

Proposition 6.3

The norm \(\Vert \cdot \Vert _{r,p}\) is monotone decreasing in p, namely \(\Vert \cdot \Vert _{r,p+p_1}\le \Vert \cdot \Vert _{r,p}\) for any \(p_1>0\).

Proof

For the norm \(\Vert \cdot \Vert _{r,p}\) the quantity in (1.27) becomes (recall that in the norm of a momentum preserving Hamiltonian there is no need to introduce the parameter \(\eta \))

$$\begin{aligned} \mathtt {c}^{(j)}_{r,p}({\varvec{{\alpha }}},{\varvec{{\beta }}}):= r^{|{\varvec{{\alpha }}}|+|{\varvec{{\beta }}}|-2} {\left( \frac{\lfloor j \rfloor ^{2}}{\prod _{i\in \mathbb {Z}}{\lfloor i \rfloor }^{{\left( {\varvec{{\alpha }}}_i+{\varvec{{\beta }}}_i\right) }}}\right) }^p . \end{aligned}$$
(6.14)

By Lemma 3.1 item (ii) we only need to show that

$$\begin{aligned} c^{(j)}_{r,p+p_1}({\varvec{{\alpha }}},{\varvec{{\beta }}}) \le c^{(j)}_{r,p}({\varvec{{\alpha }}},{\varvec{{\beta }}}) \end{aligned}$$
(6.15)

for all j, \({\varvec{{\alpha }}},{\varvec{{\beta }}}\) with \({\left| {\varvec{{\alpha }}}\right| } = {\left| {\varvec{{\beta }}}\right| } \ge 1\) and \({\varvec{{\alpha }}}_j+{\varvec{{\beta }}}_j\ge 1\) (recall the momentum conservation), namely we have to prove that

$$\begin{aligned} \sup _{\begin{array}{c} j,{\varvec{{\alpha }}},{\varvec{{\beta }}}\\ {\varvec{{\alpha }}}_j+{\varvec{{\beta }}}_j\ge 1 \end{array}} \frac{\lfloor j \rfloor ^2}{\prod _i\lfloor i \rfloor ^{{\varvec{{\alpha }}}_i+{\varvec{{\beta }}}_i}} \le 1. \end{aligned}$$
(6.16)

We first show that the inequality holds in the case \(j=0,\pm 1.\) Indeed we have

$$\begin{aligned} \prod _i\lfloor i \rfloor ^{{\varvec{{\alpha }}}_i+{\varvec{{\beta }}}_i} \ge \prod _i 2^{{\varvec{{\alpha }}}_i+{\varvec{{\beta }}}_i} =2^{\sum _i {\varvec{{\alpha }}}_i+{\varvec{{\beta }}}_i} \ge 4 \end{aligned}$$

since \(\sum _i {\varvec{{\alpha }}}_i+{\varvec{{\beta }}}_i\ge 2\) (by the fact that \({\left| {\varvec{{\alpha }}}\right| } = {\left| {\varvec{{\beta }}}\right| } \ge 1\)).

Consider now the case \(|j|=\lfloor j \rfloor \ge 2.\) Since \({\varvec{{\alpha }}}_j+{\varvec{{\beta }}}_j\ge 1\), inequality (6.16) follows by

$$\begin{aligned} \sup _{j,{\varvec{{\alpha }}},{\varvec{{\beta }}}} \frac{|j|}{\prod _{i\ne j}\lfloor i \rfloor ^{{\varvec{{\alpha }}}_i+{\varvec{{\beta }}}_i}} \le 1. \end{aligned}$$
(6.17)

By momentum conservation we have

$$\begin{aligned} |j|\le \sum _{i\ne j} |i|({\varvec{{\alpha }}}_i+{\varvec{{\beta }}}_i) \le \sum _{i\ne j} \lfloor i \rfloor ({\varvec{{\alpha }}}_i+{\varvec{{\beta }}}_i) \end{aligned}$$
(6.18)

and (6.17) follows if we show that

$$\begin{aligned} \sup _{j,{\varvec{{\alpha }}},{\varvec{{\beta }}}} \frac{\sum _{i\ne j} \lfloor i \rfloor ({\varvec{{\alpha }}}_i+{\varvec{{\beta }}}_i)}{\prod _{i\ne j}\lfloor i \rfloor ^{{\varvec{{\alpha }}}_i+{\varvec{{\beta }}}_i}} \le 1, \end{aligned}$$
(6.19)

where we can restrict the sum and the product to the indexes i such that \({\varvec{{\alpha }}}_i+{\varvec{{\beta }}}_i\ge 1.\) This last estimate follows from the fact that, given \(x_k\ge 1\),

$$\begin{aligned} \frac{\sum _{2\le k\le n} k x_k}{\prod _{2\le k\le n}k^{x_k}} \le 1, \end{aligned}$$

as can easily be proved by induction over n (noting that \(n^x\ge nx\) for \(n\ge 2\) and any \(x\ge 1\)). \(\square \)
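The elementary fact used at the end of the proof, namely that a finite family of integers all at least 2 has sum bounded by its product, can be checked by enumeration. The sketch below is illustrative only; the function names `sum_le_prod` and `check` are hypothetical, and a multiset in which \(k\) appears \(x_k\) times corresponds to the sum \(\sum k x_k\) and the product \(\prod k^{x_k}\).

```python
from itertools import combinations_with_replacement
from math import prod


def sum_le_prod(values) -> bool:
    """True if the sum of the values does not exceed their product."""
    return sum(values) <= prod(values)


def check(max_value: int, max_len: int) -> bool:
    """Check sum <= product for every multiset of integers in [2, max_value]
    with between 1 and max_len elements."""
    for n in range(1, max_len + 1):
        for values in combinations_with_replacement(range(2, max_value + 1), n):
            if not sum_le_prod(values):
                return False
    return True


ok = check(max_value=7, max_len=4)
print(ok)
```

The lower bound 2 on the entries matters: for the pair \((1,3)\) the sum is 4 while the product is 3, which is why the modified bracket \(\lfloor \cdot \rfloor \ge 2\) is used here rather than \(\langle \cdot \rangle \).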

7 Homological Equation

Now we give estimates on the solution of the homological equation

$$\begin{aligned} L_\omega S:= \{D_\omega ,S\} = R \end{aligned}$$

The constants \({{\mathcal {C}}_1},{{\mathcal {C}}_2}(r,{\sigma },t)\) are defined in “Appendix A”. Note that \({{\mathcal {C}}_1}\) depends only on \(\theta \).

Proposition 7.1

Let \(\omega \in {\mathtt {D}_{\gamma ,{q}}}\) and let \(0< \sigma <\eta \), \(0<\rho <r/2.\) For any \(R\in {{\mathcal {R}}}_{r,\eta }({\mathtt {h}}_{p,s,a})\), the Homological equation \(L_\omega S = R\) has a unique solution \(S=L_\omega ^{-1}R\), which satisfies the following two bounds:

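[Display (G): estimate rendered as an image in the source; formula not recovered.]

[Display (S): estimate rendered as an image in the source; formula not recovered.]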

hence \(L_\omega ^{-1}R\in {{\mathcal {R}}}_{r,\eta - \sigma }({\mathtt {h}}_{p,s+{\sigma },a}) \cap {{\mathcal {R}}}_{r-\rho ,\eta - \sigma }({\mathtt {h}}_{p+\tau ,s,a})\).

If R preserves momentum, \(R\in {{\mathcal {R}}}_{r,0}({\mathtt {h}}_{\mathtt {w}})\), with \({\mathtt {w}}_j=\lfloor j \rfloor ^{p} \), then the unique solution of the Homological equation preserves momentum and satisfies

[Display (M): estimate rendered as an image in the source; formula not recovered.]

so \(S=L^{-1}_\omega R\in {{\mathcal {R}}}_{r,0}({\mathtt {h}}_{{\mathtt {w}}'})\), with \({\mathtt {w}}'_j=\lfloor j \rfloor ^{p + \tau _1} \).

Remark 7.1

As in the abstract case we assume that \(X_R\) maps \(B_r({\mathtt {h}}_{p,s,a})\rightarrow {\mathtt {h}}_{p,s,a}\) and then show that S maps some smaller ball (because it has smaller radius or is in a stronger topology) to itself. This can be done in two ways: if we increase the Gevrey regularity index \(s \rightsquigarrow s+{\sigma }\) (case \(\mathtt {G}\)) then the increase can be arbitrarily small, at the price of an exponential increase in the bound.

If we want to keep s fixed (say that we start with \(s=a=0\) and want to stay in the Sobolev class) then we have to increase the regularity \(p\) by a fixed amount. The main difference between the cases \(\mathtt {S}\) and \(\mathtt {M}\) is that in the first case one has to decrease \(r,\eta \) and the bound on S diverges as \(\rho ,{\sigma }\rightarrow 0\). In the second case, instead, we have to increase the regularity \(p\) by a slightly larger amount, but then we get a uniform bound for S.

Note that, differently from Proposition 6.1, we cannot consider the purely analytic case (\(s,p\) fixed, say to 0, 1). This is due to the fact that in (6.13) we have a much weaker bound for the ratio of the coefficients in (6.12), compared with the one afforded by (6.6) and (6.10) for the Gevrey and Sobolev cases.

The following Lemma is the key point in the control of the small divisors appearing in the solution of the Homological equation. Here we strongly use the fact that we are working with a dispersive PDE on the circle with superlinear dispersion law.

Lemma 7.1

Consider \({\varvec{{\alpha }}},{\varvec{{\beta }}}\in {\mathbb {N}}^\mathbb {Z}\) with \(1\le |{\varvec{{\alpha }}}|=|{\varvec{{\beta }}}|<\infty \). If

$$\begin{aligned} {\left| \sum _i{{\left( {\varvec{{\alpha }}}_i-{\varvec{{\beta }}}_i\right) }i^2}\right| }\le 10 \sum _i{\left| {\varvec{{\alpha }}}_i-{\varvec{{\beta }}}_i\right| }, \end{aligned}$$
(7.1)

then for all j such that \({\varvec{{\alpha }}}_j+{\varvec{{\beta }}}_j\ne 0\) one has

$$\begin{aligned}&\sum _i{\left| {\varvec{{\alpha }}}_i-{\varvec{{\beta }}}_i\right| }\langle i \rangle ^{\theta /2} \le C_* {\left( \sum _i {\left( {\varvec{{\alpha }}}_i+{\varvec{{\beta }}}_i\right) }\langle i \rangle ^\theta - 2\langle j \rangle ^\theta + {\left| \pi \right| }\right) }, \qquad C_*= \frac{13}{1-\theta }\nonumber \\ \end{aligned}$$
(7.2)
$$\begin{aligned}&\prod _i(1+{\left| {\varvec{{\alpha }}}_i-{\varvec{{\beta }}}_i\right| }{\langle i \rangle }) \le e^{27}(1+{\left| \pi \right| })^3 N^6\prod _{l=3}^N\widehat{n}_l^{\tau _0}\, \end{aligned}$$
(7.3)

where \(N=|{\varvec{{\alpha }}}|+|{\varvec{{\beta }}}|\) and \(\pi = \sum _i i{\left( {\varvec{{\alpha }}}_i - {\varvec{{\beta }}}_i\right) }\) (recall (1.25)).

Proof

In “Appendix C”. \(\square \)

Note that

$$\begin{aligned} {\left| \sum _i{{\left( {\varvec{{\alpha }}}_i-{\varvec{{\beta }}}_i\right) }i^2}\right| }\ge 10 \sum _i{\left| {\varvec{{\alpha }}}_i-{\varvec{{\beta }}}_i\right| } \qquad \Longrightarrow \qquad {\left| \omega \cdot {\left( {\varvec{{\alpha }}}-{\varvec{{\beta }}}\right) }\right| }\ge 1. \end{aligned}$$
(7.4)

Indeed denoting \(\omega _j = j^2 + \xi _j \langle j \rangle ^{-{q}}\) with \({\left| \xi _j\right| }\le \frac{1}{2}\),

$$\begin{aligned} {\left| \omega \cdot {\left( {\varvec{{\alpha }}}-{\varvec{{\beta }}}\right) }\right| } \ge 10\sum _j{\left| {\varvec{{\alpha }}}_j - {\varvec{{\beta }}}_j\right| } - \frac{1}{2}\sum _j{\left| {\varvec{{\alpha }}}_j - {\varvec{{\beta }}}_j\right| }\ge 1. \end{aligned}$$
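The implication (7.4) can be illustrated on a concrete example. The sketch below is hedged: the names `small_divisor`, `hypothesis_7_4` and `bound_holds` are hypothetical, \(\ell ={\varvec{{\alpha }}}-{\varvec{{\beta }}}\) is stored as a dictionary, and the perturbations \(\xi _j\) are drawn at random in \([-1/2,1/2]\) purely for illustration. It checks the intermediate bound \(|\omega \cdot \ell |\ge (10-\tfrac{1}{2})\sum _j|\ell _j|\) used above.

```python
import random


def bracket(j: int) -> int:
    """Return <j> := max(|j|, 1)."""
    return max(abs(j), 1)


def small_divisor(ell, xi, q):
    """omega . ell with omega_j = j^2 + xi_j <j>^{-q}."""
    return sum(l * (j * j + xi[j] * bracket(j) ** (-q)) for j, l in ell.items())


def hypothesis_7_4(ell) -> bool:
    """|sum_j ell_j j^2| >= 10 sum_j |ell_j|."""
    lhs = abs(sum(l * j * j for j, l in ell.items()))
    return lhs >= 10 * sum(abs(l) for l in ell.values())


def bound_holds(ell, q, trials=200, seed=0) -> bool:
    """Check |omega . ell| >= 9.5 sum_j |ell_j| for random xi in [-1/2, 1/2]."""
    rng = random.Random(seed)
    total = sum(abs(l) for l in ell.values())
    for _ in range(trials):
        xi = {j: rng.uniform(-0.5, 0.5) for j in ell}
        if abs(small_divisor(ell, xi, q)) < 9.5 * total - 1e-9:
            return False
    return True


# alpha = 2 e_5, beta = e_0 + e_1, so ell = alpha - beta
ell = {5: 2, 0: -1, 1: -1}
print(hypothesis_7_4(ell), bound_holds(ell, q=0))
```

In this example \(\sum _j \ell _j j^2 = 49\) and \(\sum _j|\ell _j| = 4\), so the hypothesis of (7.4) holds and the small divisor stays far from zero for every admissible \(\xi \).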

Proof of Proposition 7.1

In the following, we will compute for each item the corresponding \(K, K_0\) defined in (4.8) and (4.10), and show their finiteness in order to apply Lemma 4.2 and give the explicit upper bounds entailed in Proposition  7.1 (G)–(S)–(M).

Item \(\mathtt {G}\)) In this case by (6.7)

$$\begin{aligned} K=\gamma \sup _{j: {\varvec{{\alpha }}}_j+{\varvec{{\beta }}}_j\ne 0} \frac{e^{-{\sigma }{\left( \sum _i\langle i \rangle ^\theta ({\varvec{{\alpha }}}_i+{\varvec{{\beta }}}_i) -2\langle j \rangle ^\theta +|\pi |\right) }}}{{\left| \omega \cdot {{\left( {\varvec{{\alpha }}}- {\varvec{{\beta }}}\right) }}\right| }} . \end{aligned}$$

There are two cases.

If (7.1) does not hold, then by (7.4) \({\left| \omega \cdot {\left( {\varvec{{\alpha }}}-{\varvec{{\beta }}}\right) }\right| }\ge 1\) and by (6.5) and (4.4) we get

$$\begin{aligned} \gamma \frac{e^{-{\sigma }{\left( \sum _i\langle i \rangle ^\theta ({\varvec{{\alpha }}}_i+{\varvec{{\beta }}}_i) -2\langle j \rangle ^\theta +|\pi |\right) }}}{{\left| \omega \cdot {{\left( {\varvec{{\alpha }}}- {\varvec{{\beta }}}\right) }}\right| }} \le 1\, \end{aligned}$$

and the bound is trivially achieved.

Otherwise, let us consider the case in which (7.1) holds. By applying Lemma 7.1, since \(\omega \in {\mathtt {D}_{\gamma ,{q}}}\) we get:

$$\begin{aligned}&\gamma \frac{e^{-{\sigma }{\left( \sum _i\langle i \rangle ^\theta ({\varvec{{\alpha }}}_i+{\varvec{{\beta }}}_i) -2\langle j \rangle ^\theta +|\pi |\right) }}}{{\left| \omega \cdot {{\left( {\varvec{{\alpha }}}- {\varvec{{\beta }}}\right) }}\right| }} \nonumber \\&\quad \le e^{-\frac{{\sigma }}{C_* }\sum _i{\left| {\varvec{{\alpha }}}_i-{\varvec{{\beta }}}_i\right| }\langle i \rangle ^{\frac{\theta }{2}}}\prod _i{\left( 1+({\varvec{{\alpha }}}_i-{\varvec{{\beta }}}_i)^{2}\langle i \rangle ^{{2}+{q}}\right) } \nonumber \\&\quad \le \exp {\sum _i{\left[ -\frac{{\sigma }}{C_*} {\left| {\varvec{{\alpha }}}_i - {\varvec{{\beta }}}_i\right| }\langle i \rangle ^{\frac{\theta }{2}} + \ln {{\left( 1 + {\left( {\varvec{{\alpha }}}_i - {\varvec{{\beta }}}_i\right) }^{2}\langle i \rangle ^{{2}+{q}}\right) }}\right] }} \nonumber \\&\quad = \exp {\sum _i f_i({\left| {\varvec{{\alpha }}}_i-{\varvec{{\beta }}}_i\right| })} \end{aligned}$$
(7.5)

where, for \(0<{\sigma }\le 1\), \(i\in \mathbb {Z}\) and \(x\ge 0\), we defined

$$\begin{aligned} f_i(x) := -\frac{{\sigma }}{C_*} x\langle i \rangle ^{\frac{\theta }{2}} + \ln {{\left( 1 + x^{2}\langle i \rangle ^{{2}+{q}}\right) }}. \end{aligned}$$

In order to bound (7.5), we need the following lemma, whose proof is postponed to “Appendix C”.

Lemma 7.2

Setting

$$\begin{aligned} i_\sharp := \left( \frac{8C_*({q}+3)}{{\sigma }\theta } \ln \frac{4C_*({q}+3)}{{\sigma }\theta } \right) ^{\frac{2}{\theta }}, \end{aligned}$$

we get

$$\begin{aligned} \sum _i f_i(|\ell _i|)\le 7({q}+3) i_\sharp \ln i_\sharp - \frac{{\sigma }}{2C_*} \big (\widehat{n}_1(\ell )\big )^{\frac{\theta }{2}} \end{aligned}$$
(7.6)

for every \(\ell \in \mathbb {Z}^\mathbb {Z}\) with \(|\ell |<\infty .\)

The inequality (G) follows from plugging (7.6) into (7.5) and evaluating the constant.

Item \(\mathtt {S})\) In this case K in (4.8) is (recall (6.8))

$$\begin{aligned} K=\gamma \sup _{j: {\varvec{{\alpha }}}_j+{\varvec{{\beta }}}_j\ne 0}{\left( 1-\frac{\rho }{r}\right) }^{N-2}{\left( \frac{\langle j \rangle ^2}{\prod _i\langle i \rangle ^{{\varvec{{\alpha }}}_i+{\varvec{{\beta }}}_i}}\right) }^\tau \frac{e^{-{\sigma }|\pi |}}{{\left| \omega \cdot {{\left( {\varvec{{\alpha }}}- {\varvec{{\beta }}}\right) }}\right| }}, \end{aligned}$$
(7.7)

where \(N=|{\varvec{{\alpha }}}|+|{\varvec{{\beta }}}|\).

As before we consider two cases.

If (7.1) is not satisfied, then (7.4) applies and the right-hand side of (7.7) is bounded by the quantity in (6.8), which is estimated analogously.

If (7.1) holds instead, by applying formula (6.10), Lemma 7.1 and the fact that \(\omega \in {\mathtt {D}_{\gamma ,{q}}}\) we get:

$$\begin{aligned}&{\left( \frac{\langle j \rangle ^2}{\prod _i\langle i \rangle ^{{\varvec{{\alpha }}}_i+{\varvec{{\beta }}}_i}}\right) }^\tau \frac{1}{{\left| \omega \cdot {{\left( {\varvec{{\alpha }}}- {\varvec{{\beta }}}\right) }}\right| }} \le {\left( \frac{\langle j \rangle ^2}{\prod _i\langle i \rangle ^{{\varvec{{\alpha }}}_i+{\varvec{{\beta }}}_i}}\right) }^\tau \prod _i{\left( 1+|{\varvec{{\alpha }}}_i-{\varvec{{\beta }}}_i|^{2}\langle i \rangle ^{{2}+{q}}\right) }\\&\quad \le {\left( \frac{N+|\pi |}{\prod _{l=3}^{N} \widehat{n}_l}\right) }^\tau {\left( e^{27}(1+{\left| \pi \right| })^3 N^6\prod _{l\ge 3}\widehat{n}_l^{\tau _0}\right) }^{2+{q}} \\&\quad \le e^{27(2+{q})}(N+|\pi |)^{\tau +9(2+{q})} \le e^{27(2+{q})}(N+|\pi |)^{3\tau }. \end{aligned}$$

By using Lemma C.1 (as explained in detail for formula (6.11), here with \(p_1=3\tau \)), K in (7.7) is bounded by

$$\begin{aligned}&e^{27(2+{q})} (N+|\pi |)^{3\tau } {\left( 1-\frac{\rho }{r}\right) }^{N-2}e^{-{\sigma }|\pi |}\\&\quad \le e^{27(2+{q})} 2^{3\tau +1} (3\tau )^{3\tau } \max \left\{ {\left( \frac{2 r}{\rho }\right) }^{3\tau } , {\left( \frac{1}{{\sigma }}\right) }^{3\tau }, 1\right\} \end{aligned}$$

Item \(\mathtt {M})\) Note that in this case the constant in (4.10) amounts to

$$\begin{aligned} K_0= \gamma \sup _{\begin{array}{c} j\in \mathbb {Z},\, {\varvec{{\alpha }}}\ne {\varvec{{\beta }}}\in {\mathbb {N}}^\mathbb {Z}\\ {\varvec{{\alpha }}}_j+{\varvec{{\beta }}}_j\ne 0, \, \sum _i i({\varvec{{\alpha }}}_i-{\varvec{{\beta }}}_i)=0 \end{array}} \left( \frac{\lfloor j \rfloor ^2}{\prod _i \lfloor i \rfloor ^{{\varvec{{\alpha }}}_i+{\varvec{{\beta }}}_i}}\right) ^{\tau _1} \frac{1}{ |\omega \cdot ({\varvec{{\alpha }}}-{\varvec{{\beta }}})|} . \end{aligned}$$

We have two cases. If the hypothesis in (7.4) holds, then \(K_0\le \gamma \) by (6.16).

Otherwise (7.1) holds and, therefore, (7.3) (note that here \(\pi =0\)) applies, giving

$$\begin{aligned} K_0\le & {} \sup \left( \frac{\lfloor j \rfloor ^2}{\prod _i \lfloor i \rfloor ^{{\varvec{{\alpha }}}_i+{\varvec{{\beta }}}_i}}\right) ^{\tau _1} \prod _i{\left( 1+|{\varvec{{\alpha }}}_i-{\varvec{{\beta }}}_i|^{2}\langle i \rangle ^{{2}+{q}}\right) }\\\le & {} \sup \left( \frac{\lfloor j \rfloor ^2}{\prod _i \lfloor i \rfloor ^{{\varvec{{\alpha }}}_i+{\varvec{{\beta }}}_i}}\right) ^{\tau _1} e^{27(2+{q})} N^{6(2+{q})}\prod _{l= 3}^N\widehat{n}_l^{\tau _0(2+{q})} \end{aligned}$$

since \(\omega \in {\mathtt {D}_{\gamma ,{q}}}\). We claim that

$$\begin{aligned} N\le 4 \prod _{l= 3}^N\lfloor \widehat{n}_l \rfloor ^{\frac{1}{4\ln 2}}. \end{aligned}$$
(7.8)

Indeed if \(N=2\), the inequality is trivial. Since N is even we have to consider only the case \(N\ge 4\), which follows by Lemma C.1. Recalling (6.9) we have

$$\begin{aligned} \prod _i\lfloor i \rfloor ^{{\varvec{{\alpha }}}_i+{\varvec{{\beta }}}_i}= \prod _{l\ge 1}\lfloor \widehat{n}_l \rfloor . \end{aligned}$$
(7.9)

Then

$$\begin{aligned} \sup _{\begin{array}{c} j,{\varvec{{\alpha }}},{\varvec{{\beta }}}\\ {\varvec{{\alpha }}}_j+{\varvec{{\beta }}}_j\ge 1 \end{array}} \frac{\lfloor j \rfloor ^2}{\prod _i\lfloor i \rfloor ^{{\varvec{{\alpha }}}_i+{\varvec{{\beta }}}_i}} \le \frac{\lfloor {\hat{n}}_1 \rfloor ^2}{\prod _{l\ge 1}\lfloor \widehat{n}_l \rfloor } = \frac{\lfloor {\hat{n}}_1 \rfloor }{\prod _{l\ge 2}\lfloor \widehat{n}_l \rfloor } \le \frac{\sum _{l\ge 2}\lfloor \widehat{n}_l \rfloor }{\prod _{l\ge 2}\lfloor \widehat{n}_l \rfloor } = \frac{1}{\prod _{l\ge 3}\lfloor \widehat{n}_l \rfloor } + \frac{\sum _{l\ge 3}\lfloor \widehat{n}_l \rfloor }{\prod _{l\ge 2}\lfloor \widehat{n}_l \rfloor } , \end{aligned}$$

where the last inequality holds by momentum conservation. Then (see Footnote 19)

$$\begin{aligned} K_0\le & {} 2^{{\tau _1}-1} \left( \frac{1}{\prod _{l\ge 3}\lfloor \widehat{n}_l \rfloor ^{\tau _1}} + \frac{(\sum _{l\ge 3}\lfloor \widehat{n}_l \rfloor )^{\tau _1}}{\prod _{l\ge 2}\lfloor \widehat{n}_l \rfloor ^{\tau _1}} \right) (4^6 e^{27})^{2+{q}}\prod _{l\ge 3} \lfloor \widehat{n}_l \rfloor ^{{\tau _1}/2}\\\le & {} 2^{{\tau _1}-1}(4^6 e^{27})^{2+{q}} \left( 1 + \frac{(\sum _{l\ge 3}\lfloor \widehat{n}_l \rfloor )^{\tau _1}}{ \lfloor \widehat{n}_2 \rfloor ^{\tau _1}\prod _{l\ge 3}\lfloor \widehat{n}_l \rfloor ^{{\tau _1}/2}} \right) \\\le & {} 2^{{\tau _1}-1}(4^6 e^{27})^{2+{q}} \left( 1 + \frac{(\lfloor \widehat{n}_3 \rfloor ^{1/2}+4)^{\tau _1}}{ \lfloor \widehat{n}_2 \rfloor ^{\tau _1}} \right) \end{aligned}$$

by Lemma C.2 with \(a=1/2.\) The estimate on \(K_0\), and hence inequality (M), follows. \(\square \)
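The claim (7.8) used in the proof above can be sanity-checked in its worst case. Since \(\lfloor \widehat{n}_l \rfloor \ge 2\), the right-hand side of (7.8) is smallest when all entries with \(l\ge 3\) equal 2, where it reduces to \(4e^{(N-2)/4}\). The sketch below, with hypothetical helper names `rhs_7_8` and `worst_case_holds`, verifies this case for even \(N\) up to 200; it is an illustration and not a substitute for Lemma C.1.

```python
import math


def rhs_7_8(tail) -> float:
    """4 * prod over l >= 3 of tail_l^(1 / (4 ln 2)).

    `tail` lists the values of the modified bracket of n_hat_l for l = 3..N
    (each assumed to be at least 2).
    """
    exponent = 1.0 / (4.0 * math.log(2.0))
    return 4.0 * math.prod(t ** exponent for t in tail)


def worst_case_holds(N: int) -> bool:
    """Check N <= RHS of (7.8) when all N - 2 tail entries equal 2."""
    return N <= rhs_7_8([2] * (N - 2))


all_even_ok = all(worst_case_holds(N) for N in range(2, 201, 2))
print(all_even_ok)
```

With all tail entries equal to 2 the bound reads \(N\le 4e^{(N-2)/4}\), which also follows from \(e^x\ge 1+x\), since \(4(1+(N-2)/4)=N+2\).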

8 Birkhoff Normal Form

We are now ready to apply Theorem 1.3 to the three applications \(\mathtt {G},\mathtt {S},\mathtt {M}\), defined in page 7. We start by verifying the assumptions.

Lemma 8.1

The following holds:

\(\mathtt {G}\)) Let \(s>0\), \(p> 1/2\) and \(a\ge 0\). Then for all \({\mathtt {N}}\ge 1\), \(0<\eta \le s\), \({\mathtt {w}}:={\mathtt {w}}(p,s,a)\) and \({\mathtt {w}}_0 := {\mathtt {w}}({p,s-\eta ,a})\) satisfy the Birkhoff assumption at step \({\mathtt {N}}\) and in (1.29) we can take

$$\begin{aligned} {{\mathfrak {C}}}=1,\quad {{\mathfrak {K}}},\, {{\mathfrak {K}}}^\sharp \le e^{{{\mathcal {C}}_1}{\left( \frac{\mathtt {N}}{\eta }\right) }^{\frac{3}{\theta }}}. \end{aligned}$$

\(\mathtt {S}\)) Let \(\tau _\mathtt {S}=\tau \), \(s,a\ge 0\), \(p \ge 3\tau _\mathtt {S}+1 \) and set (see Footnote 20) \({\mathtt {N}}:=[\frac{p-1}{\tau _\mathtt {S}}]\). Then \(\eta > 0\), \({\mathtt {w}}:={\mathtt {w}}(p,s,a)\) and (see Footnote 21) \({\mathtt {w}}_0 := {\mathtt {w}}({p-{\mathtt {N}}\tau _\mathtt {S},s,a})\) satisfy the Birkhoff assumption at step \({\mathtt {N}}\) and in (1.29) we can take

$$\begin{aligned}&{{\mathfrak {C}}}\le {C_{\mathtt {mon}}}(4\mathtt {N},\eta /\mathtt {N},\tau _\mathtt {S}) ,\quad {{\mathfrak {K}}}\le {{\mathcal {C}}_2}(4\mathtt {N},\eta /\mathtt {N},\tau _\mathtt {S}),\quad {{\mathfrak {K}}}^\sharp \le {{\mathcal {C}}_2}(4\mathtt {N},\eta /\mathtt {N},\mathtt {N}\tau _\mathtt {S}). \end{aligned}$$

\(\mathtt {M}\)) Let \(\tau _\mathtt {M}=\tau _1\), \(p \ge 3\tau _\mathtt {M}+1 \) and (see Footnote 22) set \({\mathtt {N}}:=[\frac{p-1}{\tau _\mathtt {M}}]\). Then \(\eta = 0\), \({\mathtt {w}}:={\left( \lfloor j \rfloor ^p\right) }_{j\in \mathbb {Z}}\) and \({\mathtt {w}}_0:={\left( \lfloor j \rfloor ^{p-{\mathtt {N}}\tau _\mathtt {M}}\right) }_{j\in \mathbb {Z}}\) satisfy the “momentum preserving” Birkhoff assumption at step \({\mathtt {N}}\) and in (5.25) we can take

$$\begin{aligned} {{\mathfrak {C}}}_0 = 1,\quad {{\mathfrak {K}}}_0,\, {{\mathfrak {K}}}^\sharp _0 \le 6^{\tau _\mathtt {M}} (4^6 e^{27})^{2+{q}} . \end{aligned}$$

Proof

\(\mathtt {G}\)) Set

$$\begin{aligned} {\mathtt {w}}_{n,j} := {\mathtt {w}}_{0,j} e^{\frac{n\eta }{{\mathtt {N}}} \langle j \rangle ^{\theta }},\qquad \forall n=1,\dots ,{\mathtt {N}}. \end{aligned}$$

The computation of \({{\mathfrak {C}}}\) follows from (6.1); the ones of \( {{\mathfrak {K}}},{{\mathfrak {K}}}^\sharp \) from Proposition  7.1.

\(\mathtt {S}\)) Set

$$\begin{aligned} {\mathtt {w}}_{n,j} := {\mathtt {w}}_{0,j} \langle j \rangle ^{ n \tau _\mathtt {S}},\qquad \forall n=1,\dots ,{\mathtt {N}}. \end{aligned}$$

The computation of \({{\mathfrak {C}}}\) follows from (6.8); the ones of \( {{\mathfrak {K}}},{{\mathfrak {K}}}^\sharp \) from Proposition  7.1.

\(\mathtt {M}\)) Set

$$\begin{aligned} {\mathtt {w}}_{n,j} := {\mathtt {w}}_{0,j} \lfloor j \rfloor ^{ n \tau _\mathtt {M}}, \qquad \forall n=1,\dots ,{\mathtt {N}}. \end{aligned}$$

The computation of \({{\mathfrak {C}}}_0 \) follows from (6.15); the ones of \( {{\mathfrak {K}}}_0,{{\mathfrak {K}}}^\sharp _0\) again from Proposition  7.1. \(\square \)

We now state the Birkhoff Normal Form Theorem (1.3) for the Hamiltonian in (1.21) in the usual three cases. First we define

$$\begin{aligned} \mathtt {r}(\mathtt {G}):= & {} \min \Bigg \{\frac{ \delta _\mathtt {G}}{\sqrt{\mathtt {N}} e^{\frac{1}{2}{{\mathcal {C}}_1}{\left( \frac{\mathtt {N}}{\eta _\mathtt {G}}\right) }^{\frac{3}{\theta }}}} , \frac{\sqrt{R}}{2{C_{\mathtt {alg}}(p)}}\Bigg \} , \quad \nonumber \\ \delta _\mathtt {G}:= & {} \frac{\sqrt{\gamma R}}{{C_{\mathtt {alg}}(p)}\sqrt{2^{11}e {C_{\mathtt {Nem}}}(p,s-\eta _\mathtt {G},{\mathtt {a}}- a-\eta _\mathtt {G})|f|_{{\mathtt {a}},R}}}\nonumber \\ \mathtt {C}_1(\mathtt {G}):= & {} \frac{e^{{{\mathcal {C}}_1}{\left( \frac{\mathtt {N}}{\eta _\mathtt {G}}\right) }^{\frac{3}{\theta }}}}{2^7 e \delta _\mathtt {G}^2} , \qquad \nonumber \\ \mathtt {C}_2(\mathtt {G}):= & {} \frac{\gamma }{2^8 e \delta _\mathtt {G}^2} , \qquad \nonumber \\ \mathtt {C}_3(\mathtt {G}):= & {} \frac{\gamma }{2^9 e \delta _\mathtt {G}^2} \Bigg (\frac{ \mathtt {N}e^{{{\mathcal {C}}_1}{\left( \frac{\mathtt {N}}{\eta _\mathtt {G}}\right) }^{\frac{3}{\theta }}}}{4 \delta _\mathtt {G}^2}\Bigg )^{\mathtt {N}} . \end{aligned}$$
(8.1)
$$\begin{aligned} \mathtt {r}(\mathtt {S}):= & {} \min \left\{ \frac{ {\mathtt {d}_\mathtt {S}}}{\sqrt{\mathtt {N}{{\mathcal {C}}_2}(4\mathtt {N},{\mathtt {a}}/2\mathtt {N},\mathtt {N}\tau _\mathtt {S}) }},\ \frac{\sqrt{R}}{5\cdot 2^{\tau _\mathtt {S}+2}} \right\} , \, \mathrm{where}\quad \nonumber \\ {\mathtt {d}_\mathtt {S}}:= & {} \frac{\sqrt{\gamma R}}{\sqrt{2^{17} {C_{\mathtt {Nem}}}(p-{\mathtt {N}}\tau _\mathtt {S},0,{\mathtt {a}}/2)|f|_{{\mathtt {a}},R}}},\nonumber \\ \mathtt {C}_1(\mathtt {S}):= & {} \frac{{{\mathcal {C}}_2}(4\mathtt {N},{\mathtt {a}}/2\mathtt {N},\mathtt {N}\tau _\mathtt {S})}{2^{7} e {\mathtt {d}_\mathtt {S}}^2} , \qquad \mathtt {C}_2(\mathtt {S}) := \frac{\gamma }{2^8 e {\mathtt {d}_\mathtt {S}}^2}\nonumber ,\\ \mathtt {C}_3(\mathtt {S}):= & {} \frac{2^8 {C_{\mathtt {Nem}}}(p-{\mathtt {N}}\tau _\mathtt {S},0,{\mathtt {a}}/2)|f|_{{\mathtt {a}},R}}{e R} {\left( \frac{ \mathtt {N}{C_{\mathtt {mon}}}(4\mathtt {N},{\mathtt {a}}/2\mathtt {N},\tau _\mathtt {S}) {{\mathcal {C}}_2}(4\mathtt {N},{\mathtt {a}}/2\mathtt {N},\tau _\mathtt {S})}{4{\mathtt {d}_\mathtt {S}}^2}\right) }^{\mathtt {N}} ,\nonumber \\ \mathtt {r}(\mathtt {M}):= & {} \min \bigg \{ \frac{ \delta _\mathtt {M}}{\sqrt{\mathtt {N}}}, \ \sqrt{\frac{R}{2^{\tau _\mathtt {M}+6}}}\bigg \} ,\qquad \nonumber \\ \delta _\mathtt {M}:= & {} \frac{\sqrt{ \gamma R }}{\sqrt{2^{17} e 12^{\tau _\mathtt {M}} (4^6 e^{27})^{2+{q}}|f|_{R}}},\nonumber \\ \mathtt {C}_1(\mathtt {M}):= & {} \frac{1}{2^9 \delta _\mathtt {M}^2} , \qquad \mathtt {C}_2(\mathtt {M}) := \frac{2^{{\tau _\mathtt {M}}+8}|f|_R}{R} , \qquad \nonumber \\ \mathtt {C}_3(\mathtt {M}):= & {} 2^{{\tau _\mathtt {M}}+8}\frac{|f|_{R}}{R} \Big ( \frac{\mathtt {N}}{8 \delta _\mathtt {M}^2}\Big )^{\mathtt {N}} . \end{aligned}$$
(8.2)

Theorem 8.1

(Birkhoff Normal Form). Under the same assumptions as Lemma 8.1 the following holds. Consider the Hamiltonian (1.21), assuming, only in the case \(\mathtt {M},\) that f does not depend on x (momentum conservation). Then for any \(0<r\le \mathtt {r}\) there exist two close-to-identity invertible symplectic changes of variables

$$\begin{aligned}&\Psi ,\Psi ^{-1}:\quad B_{r}({\mathtt {h}}_{\mathtt {w}})\mapsto {\mathtt {h}}_{\mathtt {w}},\quad \sup _{|u|_{{\mathtt {w}}}\le r}|\Psi ^{\pm 1}(u)-u|_{{\mathtt {w}}} \le \mathtt {C}_1 r^3 \le \frac{1}{8} r ,\nonumber \\&\Psi \circ \Psi ^{-1}u= \Psi ^{-1}\circ \Psi u= u ,\quad \forall u\in B_{\frac{7}{8} r}({\mathtt {h}}_{\mathtt {w}}) \end{aligned}$$
(8.3)

such that in the new coordinates

$$\begin{aligned} H\circ \Psi = D_\omega + Z+ R, \end{aligned}$$

for suitable majorant analytic Hamiltonians \(Z,R \in {{\mathcal {A}}}_r({\mathtt {h}}_{\mathtt {w}}),\)\(Z\in {{\mathcal {K}}},\) satisfying the estimate

$$\begin{aligned} \sup _{|u|_{{\mathtt {w}}}\le r }|X_{\underline{Z}}|_{{\mathtt {w}}} \le \mathtt {C}_2 r^{3}, \quad \sup _{|u|_{{\mathtt {w}}}\le r }|X_{\underline{R}}|_{{\mathtt {w}}} \le \mathtt {C}_3 r^{2\mathtt {N}+3}, \end{aligned}$$
(8.4)

\(X_{\underline{Z}}\) (resp. \(X_{\underline{R}}\)) being the Hamiltonian vector field generated by the majorant of Z (resp. R). Moreover, in the case \(\mathtt {M}\), R preserves momentum.
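
As a consistency check on the explicit constants (an aside, not part of the original statement), the last inequality in (8.3) can be verified directly from (8.1) and (8.2). Assume \(0<r\le \mathtt {r}\) and \(\mathtt {N}\ge 1\), use only the first entry of each minimum defining \(\mathtt {r}\), and write \(X:={{\mathcal {C}}_1}(\mathtt {N}/\eta _\mathtt {G})^{3/\theta }\) and \({{\mathcal {C}}_2}:={{\mathcal {C}}_2}(4\mathtt {N},{\mathtt {a}}/2\mathtt {N},\mathtt {N}\tau _\mathtt {S})\), two shorthands introduced here. Then

$$\begin{aligned} \mathtt {G}:&\quad \mathtt {C}_1(\mathtt {G})\, r^2 \le \frac{e^{X}}{2^7 e\, \delta _\mathtt {G}^2}\cdot \frac{\delta _\mathtt {G}^2}{\mathtt {N} e^{X}} = \frac{1}{2^7 e\, \mathtt {N}} \le \frac{1}{8},\\ \mathtt {S}:&\quad \mathtt {C}_1(\mathtt {S})\, r^2 \le \frac{{{\mathcal {C}}_2}}{2^7 e\, {\mathtt {d}_\mathtt {S}}^2}\cdot \frac{{\mathtt {d}_\mathtt {S}}^2}{\mathtt {N}\, {{\mathcal {C}}_2}} = \frac{1}{2^7 e\, \mathtt {N}} \le \frac{1}{8},\\ \mathtt {M}:&\quad \mathtt {C}_1(\mathtt {M})\, r^2 \le \frac{1}{2^9 \delta _\mathtt {M}^2}\cdot \frac{\delta _\mathtt {M}^2}{\mathtt {N}} = \frac{1}{2^9 \mathtt {N}} \le \frac{1}{8}, \end{aligned}$$

so that \(\mathtt {C}_1 r^3\le \frac{1}{8} r\) in each of the three cases.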

Proof

We use Theorem 1.3 with \(G \rightsquigarrow P\).

\(\mathtt {G}\)) Setting

$$\begin{aligned} \eta =\eta _\mathtt {G}:=\min \left\{ \frac{{\mathtt {a}}-a}{2},s\right\} , \ \ {\bar{r}}:=\frac{\sqrt{R}}{{C_{\mathtt {alg}}(p)}}, \end{aligned}$$
(8.5)

we have that

$$\begin{aligned} |P|_{{\bar{r}},\eta ,{\mathtt {w}}_0}=| P|_{{\bar{r}},\eta ,{\mathtt {w}}(p,s-\eta ,a)} {\mathop {\le }\limits ^{(5.32)}}{C_{\mathtt {Nem}}}(p,s-\eta ,{\mathtt {a}}- a -\eta )|f|_{{\mathtt {a}},R}. \end{aligned}$$
(8.6)

By (1.32) \( r_\star \ge \delta _\mathtt {G}\ge {{{\varvec{\delta }}_\mathtt {G}}}\) (see “Appendix A”). Then, recalling (1.32) and Lemma 8.1, one can verify that

$$\begin{aligned} \widehat{{\mathtt {r}}} \ge \mathtt {r}(\mathtt {G}), \qquad \widehat{\mathtt {C}}_1\le \mathtt {C}_1(\mathtt {G}), \qquad \widehat{\mathtt {C}}_2\le \mathtt {C}_2(\mathtt {G}), \qquad \widehat{\mathtt {C}}_3\le \mathtt {C}_3(\mathtt {G})\, \, \end{aligned}$$

\(\mathtt {S}\)) Set

$$\begin{aligned} \eta :={\mathtt {a}}/2, \quad {\bar{r}}:=\frac{\sqrt{R}}{C_\mathtt{alg}(p-{\mathtt {N}}\tau _\mathtt {S})}. \end{aligned}$$
(8.7)

Then Assumption 1 is satisfied by Lemma 8.1 with the same choice of \(\mathtt {N},{\mathtt {w}}_0,{\mathtt {w}}\). We have that

$$\begin{aligned} |P|_{{\bar{r}},\eta ,{\mathtt {w}}_0}=| P|_{{\bar{r}},{\mathtt {a}}/2,{\mathtt {w}}(p-\mathtt {N}\tau _\mathtt {S})} {\mathop {\le }\limits ^{(5.32)}}{C_{\mathtt {Nem}}}(p-{\mathtt {N}}\tau _\mathtt {S},0,{\mathtt {a}}/2)|f|_{{\mathtt {a}},R}. \end{aligned}$$
(8.8)

For the various constants we refer to “Appendix A”. Recalling \(\tau _\mathtt {S}=\tau \), we note that

$$\begin{aligned} {C_{\mathtt {mon}}}(4\mathtt {N},\eta /\mathtt {N},\tau )= & {} 2^{2\tau +1} \tau ^{\tau } \mathtt {N}^\tau \max \left\{ 4 , (1/2\eta )\right\} ^\tau = 2 (4\tau \max \left\{ 4 , (1/2\eta )\right\} \mathtt {N})^\tau ,\quad \nonumber \\ {{\mathcal {C}}_2}(4\mathtt {N},\eta /\mathtt {N},\tau )= & {} 2 e^{27(2+{q})} (12\tau \max \left\{ 4, (1/2\eta )\right\} \mathtt {N})^{3\tau } \nonumber \\ {{\mathcal {C}}_2}(4\mathtt {N},\eta /\mathtt {N},\mathtt {N}\tau )= & {} 2 e^{27(2+{q})} (6\mathtt {N}\tau )^{3\mathtt {N}\tau } \max \left\{ {\left( 8\mathtt {N}\right) }^{3\mathtt {N}\tau } , (\mathtt {N}/\eta )^{3\mathtt {N}\tau }\right\} \nonumber \\= & {} 2 e^{27(2+{q})}(12\tau \max \left\{ 4 , (2\eta )^{-1}\right\} \mathtt {N}^2)^{3\mathtt {N}\tau }, \end{aligned}$$
(8.9)

we have that for \(\mathtt {N}\ge 3\)

$$\begin{aligned} {C_{\mathtt {mon}}}(4\mathtt {N},\eta /\mathtt {N},\tau ) {{\mathcal {C}}_2}(4\mathtt {N},\eta /\mathtt {N},\tau ) \le \sqrt{{{\mathcal {C}}_2}(4\mathtt {N},\eta /\mathtt {N},\mathtt {N}\tau )}. \end{aligned}$$
(8.10)

By (1.32)

$$\begin{aligned} r_\star\ge & {} \frac{\sqrt{\gamma R}}{C_\mathtt{alg}(p-{\mathtt {N}}\tau _\mathtt {S})\sqrt{2^{11}e {C_{\mathtt {Nem}}}(p-{\mathtt {N}}\tau _\mathtt {S},0,{\mathtt {a}}/2)|f|_{{\mathtt {a}},R}}} \ge {\mathtt {d}_\mathtt {S}}. \end{aligned}$$

Then, recalling (1.32) and (8.10) one has \(\widehat{{\mathtt {r}}} \ge \mathtt {r}(\mathtt {S})\). Moreover (recall (1.34))

$$\begin{aligned} \widehat{\mathtt {C}}_1 \le \mathtt {C}_1(\mathtt {S}), \qquad \widehat{\mathtt {C}}_2 \le \mathtt {C}_2(\mathtt {S}),\qquad \widehat{\mathtt {C}}_3 \le \mathtt {C}_3(\mathtt {S}). \end{aligned}$$
(8.11)

Finally the last inequality in (8.3) follows from the second bound in (1.33).

\(\mathtt {M}\)) Set

$$\begin{aligned} \eta :=0, \quad {\bar{r}}:=\frac{\sqrt{R}}{2^{\tau _\mathtt {M}/2} C_{\mathtt { alg,M}}(p-\tau _\mathtt {M}\mathtt {N})}\ge {\sqrt{\frac{R}{ 10 \cdot 2^{\tau _\mathtt {M}}}}}. \end{aligned}$$
(8.12)

Then Assumption 1 is satisfied by Lemma 8.1, case \(\mathtt {M}\), with the same choice of \(\mathtt {N},{\mathtt {w}}_0,{\mathtt {w}}\). We have that

$$\begin{aligned} |P|_{{\bar{r}},0,{\mathtt {w}}_0}=\Vert P\Vert _{{\bar{r}},p-\tau _\mathtt {M}{\mathtt {N}}} {\mathop {\le }\limits ^{(5.33)}}2 |f|_{R}. \end{aligned}$$
(8.13)

By (1.32)

$$\begin{aligned} r_\star \ge \frac{\sqrt{\gamma R}}{\sqrt{ 2^{\tau _\mathtt {M}+17} |f|_{R}}}. \end{aligned}$$

Then

$$\begin{aligned} \widehat{{\mathtt {r}}} \ge \mathtt {r}(\mathtt {M}) ,\qquad \widehat{\mathtt {C}}_1 \le \mathtt {C}_1(\mathtt {M}), \qquad \widehat{\mathtt {C}}_2 \le \mathtt {C}_2(\mathtt {M}),\qquad \widehat{\mathtt {C}}_3 \le \mathtt {C}_3(\mathtt {M}) . \end{aligned}$$

\(\square \)

9 Gevrey Stability. Proof of Theorem 1.1

Actually we prove Theorem 1.1 for the slightly longer stability time \(|t|\le \frac{2^4 e \delta _\mathtt {G}^2}{\gamma \delta ^2} e^{{\left( \ln \frac{\delta _\mathtt {G}}{\delta }\right) }^{1+\theta /4}},\) where \(\delta _\mathtt {G}>{{\varvec{\delta }}_\mathtt {G}}\) (recall “Appendix A”). We set

$$\begin{aligned} r:=2\delta \end{aligned}$$

and choose

$$\begin{aligned} \mathtt {N}(r):=\left[ {\left( 2\ln \frac{2\delta _\mathtt {G}}{r}\right) }^{\theta /4}\right] =\left[ {\left( 2\ln \frac{\delta _\mathtt {G}}{\delta }\right) }^{\theta /4}\right] . \end{aligned}$$
(9.1)

Recalling (8.5), by Corollary 5.2 solutions of the PDE (1.1) in the space \({\mathtt {h}}_{p,s,a}\) correspond, by the Fourier identification (1.19), to orbits of the Hamiltonian system (1.21) in the space

$$\begin{aligned} {\mathtt {h}}_{\mathtt {w}}\qquad \mathrm{with} \qquad {\mathtt {w}}_j=e^{a|j|+ s \langle j \rangle ^\theta } \langle j \rangle ^p. \end{aligned}$$

An initial datum \(u_0\) satisfying \(|u_0|_{p,s,a} \le \delta \) corresponds to \(u_0\in {\mathtt {h}}_{\mathtt {w}}\) with \(|u_0|_{\mathtt {w}}\le \delta \) (see Footnote 23).

We claim that \(r\le 2\delta _\mathtt {G}\) implies

$$\begin{aligned} \frac{r \mathtt {N}e^{{{\mathcal {C}}_1}{\left( \frac{\mathtt {N}}{\eta _\mathtt {G}}\right) }^{\frac{3}{\theta }}}}{2 \delta _\mathtt {G}} \le 1. \end{aligned}$$
(9.2)

Indeed we have

$$\begin{aligned} \mathtt {N}(r)\ge \mathtt {N}_\mathtt {G}:= \max \left\{ \frac{16(4{{\mathcal {C}}_1})^\theta }{\eta _\mathtt {G}^{3}}, 2^{\frac{2\theta +4}{4-\theta }} \right\} \end{aligned}$$

and, by (9.1), \(r\le 2 \delta _\mathtt {G}e^{-\frac{1}{2} (\mathtt {N}(r)/2)^{4/\theta }}\); hence (9.2) follows if we show that the function

$$\begin{aligned} \mathtt {N}\ \rightarrow \ e^{-\frac{1}{2} (\mathtt {N}/2)^{4/\theta }} \mathtt {N}e^{{{\mathcal {C}}_1}{\left( \frac{\mathtt {N}}{\eta _\mathtt {G}}\right) }^{\frac{3}{\theta }}} \end{aligned}$$

is \(\le 1\) for \(\mathtt {N}\ge \mathtt {N}_\mathtt {G}.\) This is true since the function is decreasing for \(\mathtt {N}\ge \mathtt {N}_\mathtt {G}\) and is \(\le 1\) for \(\mathtt {N}= \mathtt {N}_\mathtt {G}.\) This proves the claim (9.2).
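
The bound on r obtained from (9.1) can be unpacked as follows. This is a sketch under two assumptions made here: that \([\cdot ]\) in (9.1) denotes rounding to a neighbouring integer, so that \([x]<x+1\), and that \(\mathtt {N}(r)\ge 2\). Under these assumptions \(\mathtt {N}(r)/2\le \mathtt {N}(r)-1<{\left( 2\ln \frac{2\delta _\mathtt {G}}{r}\right) }^{\theta /4}\). Raising to the power \(4/\theta >0\) and writing \(X:={{\mathcal {C}}_1}(\mathtt {N}/\eta _\mathtt {G})^{3/\theta }\) (a shorthand introduced here), we get

$$\begin{aligned} {\left( \frac{\mathtt {N}(r)}{2}\right) }^{4/\theta } \le 2\ln \frac{2\delta _\mathtt {G}}{r} \quad&\Longrightarrow \quad r\le 2\delta _\mathtt {G}\, e^{-\frac{1}{2}(\mathtt {N}(r)/2)^{4/\theta }} \\&\Longrightarrow \quad \frac{r\, \mathtt {N}\, e^{X}}{2\delta _\mathtt {G}} \le e^{-\frac{1}{2}(\mathtt {N}/2)^{4/\theta }}\, \mathtt {N}\, e^{X}, \end{aligned}$$

and the right-hand side is exactly the function of \(\mathtt {N}\) considered above.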

Then we apply Theorem 8.1 in the case \(\mathtt {G}\). Recalling (8.1), by (8.4) and (9.2)

$$\begin{aligned} \mathtt {C}_3(\mathtt {G}) r^{2(\mathtt {N}+1)} \le \frac{\gamma r^2}{2^9 e \delta _\mathtt {G}^2} {\left( \frac{r }{2 \delta _\mathtt {G}}\right) }^{\mathtt {N}(r)} = \frac{\gamma \delta ^2}{2^7 e \delta _\mathtt {G}^2} {\left( \frac{\delta }{ \delta _\mathtt {G}}\right) }^{\mathtt {N}(r)} \le \frac{\gamma \delta ^2}{2^7 e \delta _\mathtt {G}^2} e^{-{\left( \ln \frac{\delta _\mathtt {G}}{\delta }\right) }^{1+\theta /4}}, \end{aligned}$$

since \(\mathtt {N}(r)\ge {\left( \ln \frac{2\delta _\mathtt {G}}{r}\right) }^{\theta /4} ={\left( \ln \frac{\delta _\mathtt {G}}{\delta }\right) }^{\theta /4}\). We deduce the stability time by applying Lemma 5.1.
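
For the reader's convenience, the chain of inequalities above can be expanded as follows. This is a sketch using only the definition of \(\mathtt {C}_3(\mathtt {G})\) in (8.1), the claim (9.2), the choice \(r=2\delta \) and \(\delta \le \delta _\mathtt {G}\); the shorthand \(X:={{\mathcal {C}}_1}(\mathtt {N}/\eta _\mathtt {G})^{3/\theta }\) is introduced here.

$$\begin{aligned} \mathtt {C}_3(\mathtt {G})\, r^{2(\mathtt {N}+1)}&= \frac{\gamma r^2}{2^9 e\, \delta _\mathtt {G}^2} {\left( \frac{\mathtt {N} e^{X} r^2}{4\delta _\mathtt {G}^2}\right) }^{\mathtt {N}} = \frac{\gamma r^2}{2^9 e\, \delta _\mathtt {G}^2} {\left( \frac{r\, \mathtt {N} e^{X}}{2\delta _\mathtt {G}}\cdot \frac{r}{2\delta _\mathtt {G}}\right) }^{\mathtt {N}} \le \frac{\gamma r^2}{2^9 e\, \delta _\mathtt {G}^2} {\left( \frac{r}{2\delta _\mathtt {G}}\right) }^{\mathtt {N}},\\ {\left( \frac{\delta }{\delta _\mathtt {G}}\right) }^{\mathtt {N}}&= e^{-\mathtt {N}\ln \frac{\delta _\mathtt {G}}{\delta }} \le e^{-{\left( \ln \frac{\delta _\mathtt {G}}{\delta }\right) }^{\theta /4}\ln \frac{\delta _\mathtt {G}}{\delta }} = e^{-{\left( \ln \frac{\delta _\mathtt {G}}{\delta }\right) }^{1+\theta /4}}. \end{aligned}$$

The first line uses (9.2) to bound the first factor inside the bracket by 1; the second uses the lower bound on \(\mathtt {N}(r)\) together with \(\ln (\delta _\mathtt {G}/\delta )\ge 0\).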

10 Sobolev Stability

Before proving Proposition 1.1 we add a comment on the optimality of condition (1.10).

Remark 10.1

We construct a finite dimensional Hamiltonian, which is a reduction of (1.1) to a finite number of Fourier indices and which exhibits fast drift in a time of order 1. For instance, consider

$$\begin{aligned} H(u_1,u_j):=(1+V_1)|u_1|^2+(j^2+V_j) |u_j|^2+e^{-{\mathtt {a}}j}{\text {Re}}(|u_1|^2u_1{\bar{u}}_j), \end{aligned}$$

which is a finite dimensional model for (1.1) with \(f(x,|u|^2)= e^{-{\mathtt {a}}j} \cos ((j-1)x) |u|^2\). Consider now the initial datum \(u(0)=(u_1(0),u_j(0))=(\delta /4,|j|^{-p}\delta /4)\), which clearly has \(H_p\) norm \(<\delta \). A direct computation shows that in a time T of order 1, the Sobolev norm of u(T) is of order \(\delta ^3 e^{-{\mathtt {a}}j}j^p\), hence greater than \(4\delta \) if \(\delta ^2 e^{-{\mathtt {a}}j} j^p\) is large. Maximizing over j we get a constraint of the form \(\delta ^2 e^{-p}({\mathtt {a}}^{-1}p)^p< 1\).
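
The maximization over j can be made explicit by treating j as a continuous positive variable, an approximation assumed here purely for illustration:

$$\begin{aligned} \frac{d}{dj}\big ( -{\mathtt {a}}j + p\ln j\big ) = -{\mathtt {a}}+\frac{p}{j}=0 \quad \Longleftrightarrow \quad j=\frac{p}{{\mathtt {a}}}, \qquad \sup _{j>0} e^{-{\mathtt {a}}j}j^p = e^{-p}{\left( \frac{p}{{\mathtt {a}}}\right) }^{p}. \end{aligned}$$

The critical point is a maximum because the second derivative \(-p/j^2\) is negative, which accounts for the factor \(e^{-p}({\mathtt {a}}^{-1}p)^p\) in the constraint.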

Of course this pathological “fast diffusion” phenomenon comes from the fact that f is not translation invariant (and hence H does not preserve momentum). Actually, restricting to translation invariant Hamiltonians would not result in significantly weaker constraints on the smallness of \(\delta \) w.r.t. p. This can be seen in the following example. Consider the family of Hamiltonians (in three degrees of freedom)

$$\begin{aligned} K^{(j)}:= V_0|u_0|^2+ (1+V_1)|u_1|^2 + (j^2+ V_j)|u_j|^2 + {\text {Re}}({\bar{u}}_0^{j-1} u_1^j{\bar{u}}_j) \end{aligned}$$

with the constants of motion

$$\begin{aligned} L = |u_0|^2+ |u_1|^2 + |u_j|^2,\quad M= |u_1|^2 + j |u_j|^2. \end{aligned}$$

Following the same approach as in the previous example one shows that \(|u_j|^2 \) can have a drift of order \(j^{-p} \delta ^{2j }\) in a time T of order 1. This means that the Sobolev norm of u(T) is of order \(\delta ^{2j} j^p\). Maximizing over j we get a constraint of the form \(\delta e^{p^{1^-}}< 1\).

Proof of Proposition 1.1

As before we set \( r:=2\delta . \) An initial datum \(u_0\) satisfying \( |u_0|_{L^2}+ |\partial _x^p u_0|_{L^2} \le \delta \) corresponds to \(u_0\in {\mathtt {h}}_{{\mathtt {w}}(p)}\) with \(|u_0|_p\le \delta \) by (5.28) (see Footnote 24). We apply the Birkhoff Normal Form Theorem 8.1 in the case \(\mathtt {S}\) (recall that \(\mathtt {N}=\left[ \frac{p-1}{{\tau _\mathtt {S}}}\right] \)). Recalling the definition of \(\mathtt {r}(\mathtt {S})\) in (8.2), we verify that, for any \(\mathtt {N}\ge 1\)

$$\begin{aligned} {\delta }_\mathtt {S}({\mathtt {k}}_\mathtt {S}p)^{ -3 p } \le \frac{ {\mathtt {d}_\mathtt {S}}}{2\sqrt{\mathtt {N}{{\mathcal {C}}_2}(4\mathtt {N},{\mathtt {a}}/2\mathtt {N},\mathtt {N}\tau ) }} . \end{aligned}$$
(10.1)

Indeed

$$\begin{aligned}&\frac{ {\mathtt {d}_\mathtt {S}}}{2\sqrt{\mathtt {N}{{\mathcal {C}}_2}(4\mathtt {N},{\mathtt {a}}/2\mathtt {N},\mathtt {N}\tau ) }} {\mathop {=}\limits ^{(8.9)}} \frac{{\mathtt {d}_\mathtt {S}}}{2\sqrt{2} e^{27(2+{q})/2}} \frac{1}{\sqrt{\mathtt {N}}} \frac{1}{(\sqrt{12\tau }\mathtt {N}\max \left\{ 2 , {\mathtt {a}}^{-1/2}\right\} )^{3\mathtt {N}\tau }}\\&\quad \ge \frac{{\mathtt {d}_\mathtt {S}}\sqrt{\tau } }{2\sqrt{2} e^{27(2+{q})/2} }\left( \sqrt{\frac{12}{\tau }}\max \left\{ 2 , {\mathtt {a}}^{-1/2}\right\} \right) ^{-3(p-1)} (p-1)^{-3(p-1)-1/2}\\&\quad = {\delta }_\mathtt {S}({\mathtt {k}}_\mathtt {S})^{ -3 p } (p-1)^{-3(p-1)-1/2} \end{aligned}$$

setting

$$\begin{aligned} {\delta }_\mathtt {S}= \frac{{\mathtt {d}_\mathtt {S}}\sqrt{\tau } }{2\sqrt{2} e^{27(2+{q})/2} }{\mathtt {k}}_\mathtt {S}^{ 3}, \end{aligned}$$

(10.1) follows by verifying that \({{\varvec{\delta }}_\mathtt {S}}\le {\delta }_\mathtt {S}\) and noting that \( p^{-3p}< (p-1)^{-3(p-1)-1/2} \) for \(p>1.\)
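
The elementary inequality \( p^{-3p}< (p-1)^{-3(p-1)-1/2} \) can be checked in two steps. For \(p>1\) set \(\kappa :=3(p-1)+\frac{1}{2}=3p-\frac{5}{2}\) (a symbol introduced here), so that \(0<\kappa <3p\). Since \(0<p-1<p\) and \(t\mapsto t^{\kappa }\) is increasing on \((0,\infty )\), while \(p>1\), we have

$$\begin{aligned} (p-1)^{3(p-1)+\frac{1}{2}} = (p-1)^{\kappa }< p^{\kappa } < p^{3p}, \end{aligned}$$

and taking reciprocals gives the stated bound.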

By (8.11) and (8.9)

$$\begin{aligned}&\mathtt {C}_3(\mathtt {S})(2\delta )^{2(\mathtt {N}+1)}\\&\quad = \delta ^2 \frac{2^{10} {C_{\mathtt {Nem}}}(p-\mathtt {N}\tau ,0,{\mathtt {a}}/2)|f|_{{\mathtt {a}},R}}{e R} \\&\qquad {\left( \frac{ \mathtt {N}{C_{\mathtt {mon}}}(4\mathtt {N},{\mathtt {a}}/2\mathtt {N},\tau ) {{\mathcal {C}}_2}(4\mathtt {N},{\mathtt {a}}/2\mathtt {N},\tau ) \delta ^2}{{\mathtt {d}_\mathtt {S}}^2}\right) }^{\mathtt {N}}\\&\quad = \delta ^2 \frac{2^{10} {C_{\mathtt {Nem}}}(p-\mathtt {N}\tau ,0,{\mathtt {a}}/2)|f|_{{\mathtt {a}},R}}{e R}\\&\qquad {\left( \frac{ 4 e^{27(2+{q})} 3^{3\tau } (4 \max \left\{ 4 , (1/{\mathtt {a}})\right\} )^{4\tau } \delta ^2 }{\tau {\mathtt {d}_\mathtt {S}}^2}\right) }^{\mathtt {N}} ( \tau \mathtt {N})^{\frac{4\tau +1}{\tau }(\tau \mathtt {N}) }\\&\quad \le \delta ^2 \frac{2^{10} {C_{\mathtt {Nem}}}(p-\mathtt {N}\tau ,0,{\mathtt {a}}/2)|f|_{{\mathtt {a}},R}}{e R} \\&\qquad {\left( \frac{ 4 e^{27(2+{q})} 3^{3\tau } (4 \max \left\{ 4 , (1/{\mathtt {a}})\right\} )^{4\tau } \delta ^2 }{\tau {\mathtt {d}_\mathtt {S}}^2}\right) }^{\frac{p-1}{\tau }-1} ( p-1)^{\frac{4\tau +1}{\tau }(p-1) } \end{aligned}$$

(remember that \(\mathtt {N}=[ (p-1)/\tau ]\)). Then, noting that \( (p-1)^{\frac{4\tau +1}{\tau }(p-1)} < p^{5p} \) for \(p>1\) (recall that \(\tau \ge 15\)), we get

$$\begin{aligned} \mathtt {C}_3(\mathtt {S})(2\delta )^{2(\mathtt {N}+1)} < \frac{1}{8 {\mathbf {T}}_\mathtt {S}} p^{ 5 p } \left( \frac{\delta }{{{\varvec{\delta }}_\mathtt {S}}}\right) ^{2\frac{p-1}{\tau }}. \end{aligned}$$
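
The second equality in the chain for \(\mathtt {C}_3(\mathtt {S})(2\delta )^{2(\mathtt {N}+1)}\) rests on a simplification of the constants in (8.9), which can be spelled out as follows. With \(\eta ={\mathtt {a}}/2\) one has \(\max \{4,(1/2\eta )\}=\max \{4,1/{\mathtt {a}}\}=:m\), where m is a shorthand introduced here, and

$$\begin{aligned} \mathtt {N}\, {C_{\mathtt {mon}}}(4\mathtt {N},{\mathtt {a}}/2\mathtt {N},\tau )\, {{\mathcal {C}}_2}(4\mathtt {N},{\mathtt {a}}/2\mathtt {N},\tau )&= \mathtt {N}\cdot 2(4\tau m \mathtt {N})^{\tau }\cdot 2 e^{27(2+{q})}(12\tau m\mathtt {N})^{3\tau }\\&= 4 e^{27(2+{q})}\, 3^{3\tau }(4m)^{4\tau }\, \mathtt {N}\, (\tau \mathtt {N})^{4\tau }\\&= \frac{4 e^{27(2+{q})}\, 3^{3\tau }(4m)^{4\tau }}{\tau }\, (\tau \mathtt {N})^{4\tau +1}. \end{aligned}$$

Raising to the power \(\mathtt {N}\) produces the factor \((\tau \mathtt {N})^{(4\tau +1)\mathtt {N}}=(\tau \mathtt {N})^{\frac{4\tau +1}{\tau }(\tau \mathtt {N})}\). For the bound by \(p^{5p}\), note that \(\frac{4\tau +1}{\tau }=4+\frac{1}{\tau }<5\) when \(\tau \ge 15\), so the exponent \(\frac{4\tau +1}{\tau }(p-1)\) lies in \((0,5p)\) and \((p-1)^{\frac{4\tau +1}{\tau }(p-1)}<p^{\frac{4\tau +1}{\tau }(p-1)}<p^{5p}\) for \(p>1\).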

We conclude by applying Lemma 5.1 and (5.28)

$$\begin{aligned} |u(t)|_{L^2}+ |\partial _x^p u(t)|_{L^2}\le 2|u(t)|_p \le 4\delta ,\qquad \forall \, |t|\le \mathtt {T}_\mathtt {S}p^{ -5 p } \left( \frac{{{\varvec{\delta }}_\mathtt {S}}}{\delta }\right) ^{\frac{2(p-1)}{{\tau _\mathtt {S}}}}, \end{aligned}$$

proving (1.11). \(\square \)

Proof of Theorem 1.2

It is similar to the previous case but now we consider

$$\begin{aligned} {\mathtt {h}}_{\mathtt {w}}\qquad \mathrm{with} \qquad {\mathtt {w}}_j =\lfloor j \rfloor ^{p} \quad (\mathrm{and} \ |\cdot |_{\mathtt {w}}=\Vert \cdot \Vert _p). \end{aligned}$$

We set \(r= 4\delta \). An initial datum \(u_0\) satisfying \( 2^p |u_0|_{L^2}, |u_0|_{L^2}+ |\partial _x^p u_0|_{L^2} \le \delta \) corresponds to \(u_0\in {\mathtt {h}}_{{\mathtt {w}}}\) with \(\Vert u_0\Vert _p\le 2\delta \) by (5.30). Now we can apply the Birkhoff Normal Form Theorem 8.1 with \(\mathtt {N}=[\frac{p-1}{\tau _1}]\), since

$$\begin{aligned} 4\delta \le 4\frac{{{\varvec{\delta }}_\mathtt {M}}}{\sqrt{p}}\le \mathtt {r}(\mathtt {M}) =\min \left\{ \frac{ \delta _\mathtt {M}}{\sqrt{\mathtt {N}}}, \ \frac{\sqrt{R}}{ \sqrt{2^{{\tau _\mathtt {M}}+4}}} \right\} . \end{aligned}$$
(10.2)

Proceeding as in the case \(\mathtt {S}\) and noting that now

$$\begin{aligned} 8\mathtt {C}_3(\mathtt {M}) (4\delta )^{2(\mathtt {N}+1)}= & {} 2^{{\tau _\mathtt {M}}+12}\frac{|f|_{R}}{R} \left( \frac{\mathtt {N}\delta ^2}{2 \delta _\mathtt {M}^2}\right) ^{\mathtt {N}} \delta ^{2} \\\le & {} 2^{{\tau _\mathtt {M}}+13}\delta _\mathtt {M}^2\frac{|f|_{R}}{R} (\frac{p-1}{\tau _1})^{\frac{p-1}{\tau _1} } (\frac{ \delta ^2}{2 \delta _\mathtt {M}^2})^{\frac{p-1}{\tau _1}} \\\le & {} \frac{1}{\mathtt {T}_\mathtt {M}} \left( \frac{(p-1) \delta ^2}{8 {{\varvec{\delta }}_\mathtt {M}}^2}\right) ^{\frac{p-1}{{\tau _\mathtt {M}}}} . \end{aligned}$$

Finally by Corollary 5.1 and (5.30) we get

$$\begin{aligned} |u(t)|_{L^2}+ |\partial _x^p u(t)|_{L^2}\le 2\Vert u(t)\Vert _p \le 8\delta ,\qquad \forall \, |t| \le {\mathtt {T}_\mathtt {M}} \left( \frac{8 {{\varvec{\delta }}_\mathtt {M}}^2}{ (p-1) \delta ^2} \right) ^{\frac{p-1}{\tau _1}} , \end{aligned}$$

proving (1.13). \(\square \)

Proof of Corollary 1.1

In case \(\mathtt {S}\) we start by noticing that for \(3p \ln (\mathtt {k}_\mathtt {S}p) \le \ln ({{\varvec{\delta }}_\mathtt {S}}/\delta )\) the function \(\frac{\mathtt {T}_\mathtt {S}}{\delta ^2}( p)^{ -5 p } \left( \frac{{{\varvec{\delta }}_\mathtt {S}}}{\delta }\right) ^{\frac{2(p-1)}{{\tau _\mathtt {S}}}}\) is increasing in p.

Let us check that \(p(\delta )\) defined in (1.14) satisfies (1.10) and is \(\ge 3{\tau _\mathtt {S}}+1\); namely, passing to logarithms and setting \(y := \ln ({{\varvec{\delta }}_\mathtt {S}}/\delta )\), we have to check that \(\frac{y}{\ln (y)}>6{\tau _\mathtt {S}}\) and \( 3p \ln (\mathtt {k}_\mathtt {S}p) \le y.\) The first bound follows from the definition of \(\bar{\delta }_\mathtt {S}\). For the second, we have

$$\begin{aligned} 3p \ln (\mathtt {k}_\mathtt {S}p) \le 3 \left( 1+ \frac{1}{6} \frac{y}{\ln (y)}\right) \left( \ln (\mathtt {k}_\mathtt {S})+\ln (1+ \frac{1}{6} \frac{y}{\ln (y)} )\right) \le y \end{aligned}$$

provided that (see Footnote 25)

$$\begin{aligned} y \ge \max \{\mathtt {k}_\mathtt {S},40\}. \end{aligned}$$

Now we have to show that

$$\begin{aligned} {\mathtt {T}_\mathtt {S}} e^{ \ \frac{\ln ^2 ({{\varvec{\delta }}_\mathtt {S}}/\delta )}{4 \tau \ln \ln ({{\varvec{\delta }}_\mathtt {S}}/\delta )}} \le {\mathtt {T}_\mathtt {S}}(p)^{ -5 p } \left( \frac{{{\varvec{\delta }}_\mathtt {S}}}{\delta }\right) ^{\frac{2(p-1)}{\tau }} \end{aligned}$$

which amounts to

$$\begin{aligned} e^{ \ \frac{y^2}{4 \tau \ln y }}( p)^{ 5 p } e^{-\frac{2(p-1)}{\tau } y }\le 1 \end{aligned}$$

or equivalently

$$\begin{aligned} \frac{y^2}{4 \tau \ \ln y } + 5 p \ln ( p) - \frac{2(p-1)}{\tau } y \le 0. \end{aligned}$$

Assuming \( \frac{y}{ \ln y}> 6\), we have \(1+\frac{y}{6 \ln y}-\frac{\tau }{6}< p< \frac{y}{3 \ln y}\), and we get

$$\begin{aligned}&\frac{y^2}{4 \tau \ln y } + 5 p \ln ({\mathtt {K}}_\mathtt {S}p) - \frac{2(p-1)}{\tau } y \le \frac{y^2}{4 \tau \ln y } \nonumber \\&\qquad + \frac{5}{3} \frac{y}{\ln (y)} \ln ( \frac{y}{3 \ln (y)})- 2 y(\frac{1}{6\tau } \frac{y}{\ln (y)}-\frac{1}{6})\\&\quad \le - \frac{y^2}{12 \tau \ln y } + 2 y <0 \end{aligned}$$

if \(\frac{y}{\ln (y)}> 24\tau >6\). Note that the last inequality holds if \(y\ge 24 \tau ^2\) (recall that \(\tau \ge 15\)). Summarizing, the condition that y has to satisfy is

$$\begin{aligned} y \ge \max \{\mathtt {k}_\mathtt {S},\, 24\tau ^2\}, \end{aligned}$$

namely \(\delta \le \bar{\delta }_\mathtt {S}\).
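
The final estimate leading to the condition \(\frac{y}{\ln (y)}>24\tau \) can be verified term by term. The following sketch uses only the bounds \(1+\frac{y}{6\ln y}-\frac{\tau }{6}<p<\frac{y}{3\ln y}\), the monotonicity of \(t\mapsto t\ln t\) for \(t\ge 1\), and \(3\ln y\ge 1\), which holds for \(y\ge 40\):

$$\begin{aligned} 5p\ln p&\le \frac{5}{3}\frac{y}{\ln y}\ln \Big (\frac{y}{3\ln y}\Big ) \le \frac{5}{3}\, y,\\ -\frac{2(p-1)}{\tau }\, y&\le -2y\Big (\frac{1}{6\tau }\frac{y}{\ln y}-\frac{1}{6}\Big ) = -\frac{y^2}{3\tau \ln y}+\frac{y}{3},\\ \frac{y^2}{4\tau \ln y}-\frac{y^2}{3\tau \ln y}+\frac{y}{3}+\frac{5}{3}\, y&= -\frac{y^2}{12\tau \ln y}+2y. \end{aligned}$$

Since \(\ln y>0\), the last expression is negative exactly when \(\frac{y}{\ln y}>24\tau \).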

\(\mathtt {M}\)) Since we are assuming \(\delta \le {\bar{\delta }}_\mathtt {M}\) we have that p defined in (1.16) satisfies \(p>1 +3 {\tau _\mathtt {M}}\). Moreover by (1.16), the bound (1.12) holds. Then Theorem 1.2 applies and (1.17) follows directly by (1.13). \(\square \)