1 Introduction

In this article we consider the beam equation on an irrational torus

$$\begin{aligned} \left\{ \begin{aligned}&\partial _{tt}\psi +\Delta ^{2} \psi +\psi +f(\psi )=0,\\&\psi (0,y)=\psi _0,\\&\partial _{t}\psi (0,y)=\psi _1, \end{aligned}\right. \end{aligned}$$
(1.1)

where \(f\in C^{\infty }({\mathbb {R}},{\mathbb {R}})\), \(\psi =\psi (t,y)\), \(y\in {\mathbb {T}}^{d}_{{\nu }}\), with \({\nu }=(\nu _1,\ldots ,\nu _{d})\in [1,2]^{d}\) and

$$\begin{aligned} {\mathbb {T}}^{d}_{{\nu }}:=({\mathbb {R}}/ 2\pi \nu _1{\mathbb {Z}})\times \cdots \times ({\mathbb {R}}/ 2\pi \nu _d {\mathbb {Z}}). \end{aligned}$$
(1.2)

The initial data \((\psi _0, \psi _1)\) have small size \(\varepsilon \) in the standard Sobolev space \(H^{s+1}({\mathbb {T}}_\nu ^{d})\times H^{s-1}({\mathbb {T}}_\nu ^{d})\) for some \(s\gg 1\). The nonlinearity \(f(\psi )\) has the form

$$\begin{aligned} f(\psi ):=(\partial _{\psi }F)(\psi ) \end{aligned}$$
(1.3)

for some smooth function \(F\in C^{\infty }({\mathbb {R}},{\mathbb {R}})\) having a zero of order at least \(n\ge 3\) at the origin. Local existence theory implies that (1.1) admits, for small \(\varepsilon >0\), a unique smooth solution defined on an interval of length \(O(\varepsilon ^{-n+2})\). Our goal is to prove that, generically with respect to the irrationality of the torus (i.e. generically with respect to the parameter \(\nu \)), the solution actually extends to a larger interval.

Our main theorem is the following.

Theorem 1

Let \(d\ge 2\). There exists \(s_0\equiv s_0(n,d)\in {\mathbb {R}}\) such that for almost all \(\nu \in [1,{2}]^{d}\), for any \(\delta >0\) and for any \(s\ge s_0\) there exists \(\varepsilon _0>0\) such that for any \(0<\varepsilon \le \varepsilon _0\) we have the following. For any initial data \((\psi _0,\psi _1)\in H^{s+1}({\mathbb {T}}_{\nu }^{d})\times H^{s-1}({\mathbb {T}}_{\nu }^{d})\) such that

$$\begin{aligned} \Vert \psi _0\Vert _{H^{s+1}}+\Vert \psi _1\Vert _{H^{s-1}}\le \varepsilon , \end{aligned}$$
(1.4)

there exists a unique solution of the Cauchy problem (1.1) such that

$$\begin{aligned} \begin{aligned}&\psi (t,x)\in C^0\big ([0,T_\varepsilon );H^{s+1}({\mathbb {T}}_\nu ^d)\big )\bigcap C^1\big ([0,T_\varepsilon );H^{s-1}({\mathbb {T}}_\nu ^d)\big ), \\&\sup _{t\in [0,T_\varepsilon )}\Big (\Vert \psi (t,\cdot )\Vert _{H^{s+1}} + \Vert \partial _t \psi (t,\cdot )\Vert _{H^{s-1}} \Big )\le 2\varepsilon , \quad T_\varepsilon \ge \varepsilon ^{-\mathtt {a}+\delta }, \end{aligned} \end{aligned}$$
(1.5)

where \(\mathtt {a}=\mathtt {a}(d,n)\) has the form

$$\begin{aligned} \mathtt {a}(d,n):= \left\{ \begin{array}{ll} (n-2)\left( 1+\tfrac{3}{d-1}\right) ,&{}\quad n\;\; \mathrm{even}\\ (n-2)\left( 1+\tfrac{3}{d-1}\right) +\tfrac{\max \{4-d,0\}}{d-1},&{}\quad n\;\; \mathrm{odd}. \end{array}\right. \end{aligned}$$
(1.6)
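For the reader's convenience, the exponent (1.6) can be evaluated by a short script (an illustrative sketch; the function name is ours):

```python
# Sketch: the lifespan exponent a(d, n) of (1.6); illustrative only.
def lifespan_exponent(d: int, n: int) -> float:
    base = (n - 2) * (1 + 3 / (d - 1))
    if n % 2 == 0:
        return base
    return base + max(4 - d, 0) / (d - 1)

# For d = 2 and n = 3 the exponent is 6, consistent with the lifespan
# T_eps = O(eps^{-6}) obtained for d = 2, n = 3 in Sect. 1.2.
print(lifespan_exponent(2, 3))  # 6.0
```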

Originally, the beam equation was introduced in physics to model the oscillations of a uniform beam, hence in a one dimensional context. In dimension 2, similar equations can be used to model the motion of a clamped plate (see for instance the introduction of [28]). In higher dimension (\(d\ge 3\)) we do not claim that the beam Eq. (1.1) has a physical interpretation, but it nevertheless remains an interesting mathematical model of dispersive PDE. We note that when the equation is posed on a torus, there is no physical reason to assume the torus to be rational.

This problem of extending solutions of semi-linear PDEs beyond the time given by local existence theory has been considered many times in the past, starting with Bourgain [11], Bambusi [1] and Bambusi–Grébert [2] in which the authors prove the almost global existence for the Klein Gordon equation:

$$\begin{aligned} \left\{ \begin{aligned}&\partial _{tt}\psi -\Delta \psi +m\psi +f(\psi )=0 ,\\&\psi (0,x)=\psi _0,\\&\partial _{t}\psi (0,x)=\psi _1, \end{aligned}\right. \end{aligned}$$
(1.7)

on a one dimensional torus. Precisely, they proved that, given \(N\ge 1\), if the initial datum has size \(\varepsilon \) small enough in \(H^{s}({\mathbb {T}})\times H^{s-1}({\mathbb {T}})\), and if the mass stays outside an exceptional subset of zero measure, the solution of (1.7) exists at least on an interval of length \(O(\varepsilon ^{-N})\). This result has been extended to Eq. (1.7) on Zoll manifolds (in particular spheres) by Bambusi–Delort–Grébert–Szeftel [3], but also to the nonlinear Schrödinger equation posed on \({\mathbb {T}}^{d}\) (the square torus of dimension d) [2, 19] or on \({\mathbb {R}}^d\) with a harmonic potential [24]. What all these examples have in common is that the spectrum of the linear part of the equation can be divided into clusters that are well separated from each other. Actually, if one considers (1.1) with a generic mass m on the square torus \({\mathbb {T}}^d\), then the spectrum of \(\sqrt{\Delta ^2+m}\) (the square root comes from the fact that the equation is of order two in time) is given by \(\{\sqrt{|j|^4+m}\mid j\in {\mathbb {Z}}^d\}\), which can be divided into clusters around each integer n whose diameter decreases with |n|. Thus for n large enough these clusters are separated by 1/2, so in this case also we could easily prove, following [2], the almost global existence of the solution.
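The clustering described above can be checked directly: since \(k=|j|^2\) is an integer, \(0<\sqrt{k^2+m}-k\le m/(2k)\), so each frequency lies within \(m/(2k)\) of the integer k. A quick numerical sketch (illustrative only):

```python
import math

# Illustrative check (square torus): omega_j = sqrt(|j|^4 + m) with
# k = |j|^2 an integer, and
#   0 < sqrt(k^2 + m) - k = m / (sqrt(k^2 + m) + k) <= m / (2k),
# so the frequencies cluster around the integers, with shrinking width.
m = 1.0
for k in range(1, 200):           # k ranges over the integers |j|^2
    dist = math.sqrt(k**2 + m) - k
    assert 0 < dist <= m / (2 * k)
print("each frequency lies within m/(2|j|^2) of an integer")
```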

By contrast, when the equation is posed on an irrational torus, the nature of the spectrum drastically changes: the differences between pairs of eigenvalues accumulate to zero. Even for the Klein Gordon Eq. (1.7) posed on \(\mathbb T^d\) for \(d\ge 2\), the linear spectrum is not well separated. In both cases we may expect exchanges of energy between high Fourier modes, and thus the almost global existence in the sense described above is not reachable (at least up to now!). Nevertheless it is possible to go beyond the time given by the local existence theory. In the case of (1.7) on \({\mathbb {T}}^d\) for \(d\ge 2\), this local time has been extended by Delort [13] and then improved in different ways by Fang and Zhang [18], Zhang [29] and Feola et al. [20] (in this last case a quasi-linear Klein Gordon equation is considered). We also quote the remarkable work on multidimensional periodic water waves by Ionescu and Pusateri [26].
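This accumulation is easy to observe numerically. The following sketch (with our illustrative choice \(a=(1,\sqrt{2})\), i.e. \(\nu _2=2^{1/4}\)) finds very close pairs of frequencies \(\omega _j=\sqrt{|j|_a^4+1}\), e.g. via the good rational approximation \(|41-29\sqrt{2}|\approx 1.2\cdot 10^{-2}\):

```python
import math

# Illustrative sketch: on an irrational torus, here a = (1, sqrt(2))
# (i.e. nu = (1, 2**0.25)), differences of frequencies
# omega_j = sqrt(|j|_a^4 + 1) accumulate to zero; e.g. the pair
# j = (21, 14), k = (20, 15) gives |j|_a^2 - |k|_a^2 = 41 - 29*sqrt(2).
a2 = math.sqrt(2)
freqs = sorted(math.sqrt((j1**2 + a2 * j2**2) ** 2 + 1)
               for j1 in range(41) for j2 in range(41))
min_gap = min(g for lo, hi in zip(freqs, freqs[1:]) if (g := hi - lo) > 1e-12)
print(min_gap)   # a tiny positive gap (of order 1e-2 or smaller)
assert min_gap < 0.05
```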

The beam equation has already been considered on an irrational torus in dimension 2 by Imekraz [25]. In the case he considered, the irrationality parameter \(\nu \) was Diophantine and fixed, but a mass m entered the game as a parameter (for us m is fixed and for convenience we chose \(m=1\)). For almost all masses, Imekraz obtained a lifespan \(T_\varepsilon =O(\varepsilon ^{-\frac{5}{4}(n-2)^+})\), while we obtain, for almost all \(\nu \), \(T_\varepsilon =O(\varepsilon ^{-4(n-2)^+})\) when n is even and \(T_\varepsilon =O(\varepsilon ^{-4(n-2)-2^+})\) when n is odd.

We notice that applying Theorem 3 of [6] (and its Corollary 1) we obtain the almost global existence for (1.1) on irrational tori, up to a large but finite loss of derivatives.

Let us also mention some recent results about the longtime existence for periodic water waves [7,8,9,10]. In the same spirit we quote the long time existence for a general class of quasi-linear Hamiltonian equations [21] and quasi-linear reversible Schrödinger equations [22] on the circle. The main theorem in [21] applies also to quasi-linear perturbations of the beam equation. We also mention [16], where the authors study the lifespan of small solutions of the semi-linear Klein–Gordon equation posed on a general compact boundaryless Riemannian manifold.

All the previous results [13, 18, 20, 25, 29] have been obtained by a modified energy procedure. Such a procedure partially destroys the algebraic structure of the equation and thus makes it more involved to iterate.Footnote 1 On the contrary, in this paper, we begin with a Birkhoff normal form procedure (when \(d=2,3\)) before applying a modified energy step. Furthermore, in dimension 2 we can iterate two steps of Birkhoff normal form and therefore obtain a much better time. The other key tool that allows us to go further in time is an estimate of small divisors that we have tried to optimize as much as possible: essentially, the small divisors make us lose \((d-1)\) derivatives (see Proposition 2.2), which explains the strong dependence of our result on the dimension d of the torus and also why we obtain a better result than [25]. In Sect. 1.2 we detail the scheme of the proof of Theorem 1.

1.1 Hamiltonian Formalism

We denote by \(H^{s}({\mathbb {T}}^{d};{\mathbb {C}})\) the usual Sobolev space of functions \({\mathbb {T}}^{d}\ni x \mapsto u(x)\in {\mathbb {C}}\). We expand a function u(x) , \(x\in {\mathbb {T}}^{d}\), in Fourier series as

$$\begin{aligned} u(x) = \frac{1}{({2\pi })^{d/2}} \sum _{n \in {\mathbb {Z}}^{d} } {u}_ne^{\mathrm{i} n\cdot x }, \quad {u}_n := \frac{1}{(2\pi )^{d/2}} \int _{{\mathbb {T}}^{d}} u(x) e^{-\mathrm{i} n\cdot x } \, dx. \end{aligned}$$
(1.8)

We also use the notation

$$\begin{aligned} u_n^{+1} := u_n \quad \mathrm{and} \quad u_n^{-1} := \overline{u_n} . \end{aligned}$$
(1.9)

We set \(\langle j \rangle :=\sqrt{1+|j|^{2}}\) for \(j\in {\mathbb {Z}}^{d}\). We endow \(H^{s}({\mathbb {T}}^{d};{\mathbb {C}})\) with the norm

$$\begin{aligned} \Vert u(\cdot )\Vert _{H^{s}}^{2}:=\sum _{j\in {\mathbb {Z}}^{d}}\langle j\rangle ^{2s}| u_{j}|^{2}. \end{aligned}$$
(1.10)

Moreover, for \(r\in {\mathbb {R}}^{+}\), we denote by \(B_{r}(H^{s}({\mathbb {T}}^{d};{\mathbb {C}}))\) the ball of \(H^{s}({\mathbb {T}}^{d};{\mathbb {C}})\) with radius r centered at the origin. We shall also write the norm in (1.10) as \(\Vert u\Vert ^{2}_{H^{s}}= (\langle D\rangle ^{s}u,\langle D\rangle ^{s} u)_{L^{2}}\), where \(\langle D\rangle e^{\mathrm{i} j\cdot x}=\langle j\rangle e^{\mathrm{i} j\cdot x}\), for any \(j\in {\mathbb {Z}}^{d}\).
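For concreteness, here is a small sketch (ours) of the norm (1.10) for a function stored as a finite dictionary of Fourier coefficients:

```python
import math

# Sketch (ours) of the norm (1.10): coeffs is a finite dict {j: u_j}
# mapping multi-indices j in Z^d to Fourier coefficients; <j>^2 = 1 + |j|^2.
def sobolev_norm(coeffs, s):
    return math.sqrt(sum((1 + sum(ji**2 for ji in j)) ** s * abs(c) ** 2
                         for j, c in coeffs.items()))

# A function with a single Fourier coefficient u_{(3,4)} = 1 has norm
# <j>^s = 26^{s/2}; for s = 2 this is 26.
print(sobolev_norm({(3, 4): 1.0}, 2))  # 26.0
```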

In the following it will be more convenient to rescale Eq. (1.1) and work on the square torus \({\mathbb {T}}^{d}\). For any \(y\in {\mathbb {T}}_\nu ^{d}\) we write \(\psi (y)=\phi (x)\) with \(y=(x_1\nu _1,\ldots , x_d\nu _d)\) and \(x=(x_1,\ldots ,x_d)\in {\mathbb {T}}^{d}\). The beam equation in (1.1) becomes

$$\begin{aligned} \partial _{tt}\phi +\Omega ^{2}\phi +f(\phi )=0 \end{aligned}$$
(1.11)

where \(\Omega \) is the Fourier multiplier defined by linearity as

$$\begin{aligned} \Omega e^{\mathrm{i} j\cdot x}=\omega _{j} e^{\mathrm{i} j\cdot x}, \quad \omega _{j}:=\sqrt{|j|_a^{4}+1}, \quad |j|_{a}^{2}:=\sum _{i=1}^{d}a_{i}|j_i|^{2}, \quad a_{i}:=\nu _i^{2}, \quad \forall \,j\in {\mathbb {Z}}^{d}. \end{aligned}$$
(1.12)

Introducing the variable \(v=-{\dot{\phi }}=-\partial _{t}\phi \) we can rewrite Eq. (1.11) as

$$\begin{aligned} {\dot{\phi }}=-v,\quad {\dot{v}}=\Omega ^{2}\phi +f(\phi ). \end{aligned}$$
(1.13)

By (1.3) we note that (1.13) can be written in the Hamiltonian form

$$\begin{aligned} \partial _{t}{\bigl [{\begin{matrix}\phi \\ v\end{matrix}}\bigr ]}=X_{H_{{\mathbb {R}}}}(\phi ,v)=J\left( \begin{array}{l} \partial _{\phi }H_{{\mathbb {R}}}(\phi ,v)\\ \partial _{v}H_{{\mathbb {R}}}(\phi ,v) \end{array}\right) , \quad J={\bigl [{\begin{matrix}0&{}-1\\ 1&{}0\end{matrix}}\bigr ]} \end{aligned}$$

where \(\partial \) denotes the \(L^{2}\)-gradient of the Hamiltonian function

$$\begin{aligned} H_{{\mathbb {R}}}(\phi ,v)= \int _{{\mathbb {T}}^{d}}\left( \frac{1}{2}v^{2}+\frac{1}{2}(\Omega ^{2}\phi ) \phi +F(\phi )\right) dx, \end{aligned}$$
(1.14)

on the phase space \(H^{2}({\mathbb {T}}^{d};{\mathbb {R}})\times L^{2}({\mathbb {T}}^{d};{\mathbb {R}})\). Indeed we have

$$\begin{aligned} \mathrm {d}H_{{\mathbb {R}}}(\phi ,v){\bigl [{\begin{matrix}{\hat{\phi }}\\ {\hat{v}}\end{matrix}}\bigr ]}=- \lambda _{{\mathbb {R}}}(X_{H_{{\mathbb {R}}}}(\phi ,v), {\bigl [{\begin{matrix}{\hat{\phi }}\\ {\hat{v}}\end{matrix}}\bigr ]}) \end{aligned}$$
(1.15)

for any \((\phi ,v), ({\hat{\phi }},{\hat{v}})\) in \(H^{2}({\mathbb {T}}^{d};{\mathbb {R}})\times L^{2}({\mathbb {T}}^{d};{\mathbb {R}})\), where \(\lambda _{{\mathbb {R}}}\) is the non-degenerate symplectic form

$$\begin{aligned} \lambda _{{\mathbb {R}}}(W_1,W_2):=\int _{{\mathbb {T}}^{d}}(\phi _1v_2-v_1\phi _2)dx, \quad W_1:={\bigl [{\begin{matrix}\phi _1\\ v_1\end{matrix}}\bigr ]}, W_2:={\bigl [{\begin{matrix}\phi _2\\ v_2\end{matrix}}\bigr ]}. \end{aligned}$$

The Poisson bracket between two Hamiltonians \(H_{{\mathbb {R}}}, G_{{\mathbb {R}}}: H^{2}({\mathbb {T}}^{d};{\mathbb {R}})\times L^{2}({\mathbb {T}}^{d};{\mathbb {R}})\rightarrow {\mathbb {R}}\) is defined as

$$\begin{aligned} \{H_{{\mathbb {R}}},G_{{\mathbb {R}}}\} =\lambda _{{\mathbb {R}}}(X_{H_{{\mathbb {R}}}},X_{G_{{\mathbb {R}}}}). \end{aligned}$$
(1.16)

We define the complex variables

$$\begin{aligned} {\bigl [{\begin{matrix}u\\ {\bar{u}}\end{matrix}}\bigr ]}:={\mathcal {C}}{\bigl [{\begin{matrix}\phi \\ v\end{matrix}}\bigr ]},\quad {\mathcal {C}}:= \frac{1}{\sqrt{2}}\left( \begin{array}{ll} \Omega ^{\frac{1}{2}} &{} \mathrm{i} \Omega ^{-\frac{1}{2}}\\ \Omega ^{\frac{1}{2}} &{} -\mathrm{i} \Omega ^{-\frac{1}{2}} \end{array} \right) , \end{aligned}$$
(1.17)

where \(\Omega \) is the Fourier multiplier defined in (1.12). Then the system (1.13) reads

$$\begin{aligned} {\dot{u}} =\mathrm{i} \Omega u+ \frac{\mathrm{i} }{\sqrt{2}} \Omega ^{-1/2}f\left( \Omega ^{-1/2}\left( \frac{u+ {{\bar{u}}}}{\sqrt{2}}\right) \right) . \end{aligned}$$
(1.18)
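As a sanity check, when \(f=0\) the change of variables (1.17) diagonalizes the flow mode by mode: with the sign convention of (1.13), i.e. \(v=-\partial _t\phi \), each mode satisfies \(u_j(t)=e^{\mathrm{i} \omega _j t}u_j(0)\). A numerical sketch on a single mode (frequency and data chosen arbitrarily):

```python
import math, cmath

# Single-mode check (f = 0): with the sign convention of (1.13), i.e.
# v = -d(phi)/dt, a mode solves phi' = -v, v' = w^2 phi, and the
# complex variable of (1.17), u = (w^{1/2} phi + i w^{-1/2} v)/sqrt(2),
# evolves as u(t) = e^{i w t} u(0), in agreement with (1.18).
w, phi0, v0, t = 2.7, 0.3, -1.1, 0.8
phi = phi0 * math.cos(w * t) - (v0 / w) * math.sin(w * t)
v = phi0 * w * math.sin(w * t) + v0 * math.cos(w * t)

def to_u(p, q):
    return (math.sqrt(w) * p + 1j * q / math.sqrt(w)) / math.sqrt(2)

assert abs(to_u(phi, v) - cmath.exp(1j * w * t) * to_u(phi0, v0)) < 1e-12
print("u(t) = e^{i w t} u(0) on each Fourier mode")
```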

Notice that (1.18) can be written in the Hamiltonian form

$$\begin{aligned} \partial _{t}{\bigl [{\begin{matrix}u\\ {\bar{u}}\end{matrix}}\bigr ]}=X_{H}(u)=\mathrm{i} J \left( \begin{array}{l} \partial _{{u}}H(u)\\ \partial _{{\bar{u}}}H(u) \end{array}\right) = \left( \begin{array}{l} \mathrm{i} \partial _{{\bar{u}}}H(u)\\ -\mathrm{i} \partial _{{u}}H(u) \end{array}\right) ,\quad J={\bigl [{\begin{matrix}0&{}1\\ -1&{}0\end{matrix}}\bigr ]} \end{aligned}$$
(1.19)

with Hamiltonian function (see (1.14))

$$\begin{aligned} H(u)=H_{{\mathbb {R}}}({\mathcal {C}}^{-1}{\bigl [{\begin{matrix}u\\ {\bar{u}}\end{matrix}}\bigr ]}) =\int _{{\mathbb {T}}^{d}}{\bar{u}}\Omega u\ \mathrm {d}x +\int _{{\mathbb {T}}^{d}} F\left( \frac{\Omega ^{-1/2}(u+{\bar{u}})}{\sqrt{2}}\right) \ \mathrm {d}x \end{aligned}$$
(1.20)

and where \(\partial _{{\bar{u}}}=(\partial _{\mathfrak {R}u}+\mathrm{i} \partial _{\mathfrak {I}u})/2\), \(\partial _{u}=(\partial _{\mathfrak {R}u}-\mathrm{i} \partial _{\mathfrak {I}u})/2\). Notice that

$$\begin{aligned} X_{H}={\mathcal {C}}\circ X_{H_{{\mathbb {R}}}}\circ {\mathcal {C}}^{-1} \end{aligned}$$
(1.21)

and that (using (1.17))

$$\begin{aligned} \mathrm {d}H(u){\bigl [{\begin{matrix}h\\ {\bar{h}}\end{matrix}}\bigr ]}=(\mathrm {d}H_{{\mathbb {R}}})(\phi ,v)\left[ {\mathcal {C}}^{-1}{\bigl [{\begin{matrix}h\\ {\bar{h}}\end{matrix}}\bigr ]} \right] {\mathop {=}\limits ^{(1.15),(1.21)}} -\lambda \left( X_{H}(u),{\bigl [{\begin{matrix}h\\ {\bar{h}}\end{matrix}}\bigr ]}\right) \end{aligned}$$
(1.22)

for any \(h\in H^{2}({\mathbb {T}}^{d};{\mathbb {C}})\) and where the two form \(\lambda \) is given by the push-forward \(\lambda =\lambda _{{\mathbb {R}}}\circ {\mathcal {C}}^{-1}\). In complex variables the Poisson bracket in (1.16) reads

$$\begin{aligned} \{H,G\} :=\lambda (X_{H},X_{G}) =\mathrm{i} \int _{{\mathbb {T}}^{d}}\partial _{u}G\partial _{{\bar{u}}}H- \partial _{{\bar{u}}}G\partial _{u}H \mathrm {d}x, \end{aligned}$$
(1.23)

where we set \(H=H_{{\mathbb {R}}}\circ {\mathcal {C}}^{-1}\), \(G=G_{{\mathbb {R}}}\circ {\mathcal {C}}^{-1}\). Let us introduce an additional notation:

Definition 1.1

If \(j \in ({\mathbb {Z}}^d)^r\) for some \(r\ge k\) then \(\mu _k(j)\) denotes the \(k^{th}\) largest number among \(|j_1|, \dots , |j_r|\) (multiplicities being taken into account). If there is no ambiguity we denote it simply by \(\mu _k\).
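A direct transcription of this definition (Euclidean norm on \({\mathbb {Z}}^d\); the function name is ours):

```python
# Transcription (ours) of Definition 1.1: mu(k, j) is the k-th largest
# of |j_1|, ..., |j_r| (Euclidean norm, multiplicities counted).
def mu(k, j):
    return sorted((sum(c**2 for c in jl) ** 0.5 for jl in j), reverse=True)[k - 1]

j = ((3, 4), (1, 0), (0, 5), (3, 4))
print(mu(1, j), mu(2, j), mu(4, j))  # 5.0 5.0 1.0
```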

Let \(r\in {\mathbb {N}}\), \(r\ge n\). A Taylor expansion of the Hamiltonian H in (1.20) leads to

$$\begin{aligned} H=Z_2+\sum _{k=n}^{r-1} H_k +R_r \end{aligned}$$
(1.24)

where

$$\begin{aligned} Z_{2}:=\int _{{\mathbb {T}}^{d}} {\bar{u}}\Omega u\ \mathrm {d}x {\mathop {=}\limits ^{(1.12)}} \sum _{j\in {\mathbb {Z}}^{d}}\omega _{j}|u_{j}|^{2} \end{aligned}$$
(1.25)

and \(H_k\), \(k=n,\ldots ,r-1\), is a homogeneous polynomial of order k of the form

$$\begin{aligned} H_k = \sum _{\begin{array}{c} \sigma \in \{-1,1\}^k,\ j\in ({\mathbb {Z}}^d)^k\\ \sum _{i=1}^k\sigma _i j_i=0 \end{array}}(H_{k})_{\sigma ,j} u_{j_1}^{\sigma _1}\cdots u_{j_k}^{\sigma _k} \end{aligned}$$
(1.26)

with (noticing that the zero momentum condition \(\sum _{i=1}^k\sigma _i j_i=0\) implies \(\mu _1(j)\lesssim \mu _2(j)\))

$$\begin{aligned} |(H_{k})_{\sigma ,j}|\lesssim _k \frac{1}{\mu _1(j)^{2}}, \quad \forall \sigma \in \{-1,1\}^k,\ j\in ({\mathbb {Z}}^d)^k \end{aligned}$$
(1.27)

and

$$\begin{aligned} \Vert X_{R_r}(u)\Vert _{H^{s+2}}\lesssim _s \Vert u\Vert _{H^s}^{r-1}, \quad \forall \, u\in B_{1}( H^{s}({\mathbb {T}}^{d};{\mathbb {C}})). \end{aligned}$$
(1.28)

The estimate above follows from Moser's composition theorem (see [27], Section 2). Estimates (1.27) and (1.28) express the regularizing effect of the semi-linear nonlinearity in the Hamiltonian formulation of (1.11).

1.2 Scheme of the Proof of Theorem 1

As usual, Theorem 1 will be proved by a bootstrap argument: we want to control \(N_s(u(t)):=\Vert u(t)\Vert ^2_{H^s}\), for \(t\mapsto u(t,\cdot )\) a small solution (whose local existence is given by the standard theory for semi-linear PDEs) of the Hamiltonian system generated by H given by (1.24), for the longest time possible (and at least longer than the existence time given by the local theory). So we want to control its derivative with respect to t. We have

$$\begin{aligned} \frac{d}{dt}N_s(u)=\{N_s,H\}=\sum _{k=n}^{r-1}\{N_s, H_k\} +\{N_s,R_r\}. \end{aligned}$$
(1.29)

By (1.28) we have \(\{N_s,R_r\}\lesssim \Vert u\Vert ^{r-1}_{H^s}\), and thus we can neglect this term by choosing r large enough. Then we define \(H^{\le N}_k\), the truncation of \(H_k\) at order N:

$$\begin{aligned} H_k^{\le N} = \sum _{\begin{array}{c} \sigma \in \{-1,1\}^k,\ j\in ({\mathbb {Z}}^d)^k\\ \sum _{i=1}^k\sigma _i j_i=0,\ \mu _2(j)\le N \end{array}} (H_{k})_{\sigma ,j}u_{j_1}^{\sigma _1}\cdots u_{j_k}^{\sigma _k} \end{aligned}$$

and we set \(H^{> N}_k=H_k-H^{\le N}_k\). As a consequence of (1.27) we have \(\{N_s,H_k^{>N}\}\lesssim N^{-2}\Vert u\Vert ^{k-1}_{H^s}\), and thus we can neglect these terms by choosing N large enough. So it remains to take care of \(\sum _{k=n}^{r-1}\{N_s, H_k^{\le N}\}\).

The natural idea to eliminate \(H_k^{\le N}\) consists in using a Birkhoff normal form procedure (see [2, 23]). In order to do that, we have first to solve the homological equation

$$\begin{aligned} \{\chi _k,Z_2\}+H_k^{\le N}=Z_k. \end{aligned}$$

This is achieved in Lemma 3.6 and, thanks to the control of the small divisors given by Proposition 2.2, we get that there exists \(\alpha \equiv \alpha (d,k)>0\) such that for any \(\delta >0\)

$$\begin{aligned} |(\chi _{k})_{\sigma ,j}| \lesssim _\delta {\mu _1(j)^{d-3+\delta }}\mu _3(j)^\alpha , \quad \forall \sigma \in \{-1,1\}^k,\ j\in ({\mathbb {Z}}^d)^k. \end{aligned}$$
(1.30)

From [2] we learn that the positive power of \(\mu _3(j)\) appearing in the right hand side of (1.30) is not dangerousFootnote 2 (taking s large enough), but the positive power of \(\mu _1(j)\) implies a loss of derivatives. So this step can be achieved only assuming \(d\le 3\), and in that case the corresponding flow is well defined in \(H^s\) (with s large enough) and is controlled by \(N^{\delta }\) (see Lemma 3.7). In other words, this step is performed only when \(d=2,3\); when \(d\ge 4\) we go directly to the modified energy step.
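Concretely, the homological equation above is solved coefficient by coefficient: on a non-resonant monomial, the bracket with \(Z_2\) acts diagonally, so that

$$\begin{aligned} |(\chi _{k})_{\sigma ,j}| =\frac{|(H_{k}^{\le N})_{\sigma ,j}|}{\big \vert \sum _{i=1}^k\sigma _i\omega _{j_i}\big \vert }, \quad \forall \,(\sigma ,j) \ \mathrm{non\,resonant}. \end{aligned}$$

Combining the decay \(|(H_{k})_{\sigma ,j}|\lesssim \mu _1(j)^{-2}\) of (1.27) with the lower bound of Proposition 2.2, which costs at most \(\langle j_1\rangle ^{d-1+\delta }\) times a power of \(\mu _3(j)\), gives precisely the exponents of (1.30): \(\mu _1(j)^{-2}\,\mu _1(j)^{d-1+\delta }=\mu _1(j)^{d-3+\delta }\).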

For \(d=2,3\), let us focus on \(n=3\). After this Birkhoff normal form step, we are left with

$$\begin{aligned} H\circ \Phi _{\chi _3}=Z_2+Z_3+Q_4+\mathrm{negligible\, terms} \end{aligned}$$

where \(Q_4\) is a Hamiltonian of order 4 whose coefficients are bounded by \(\mu _1(j)^{d-3+\delta }\) (see Lemma 3.5, estimate (3.15)) and \(Z_3\) is a Hamiltonian of order 3 which is resonant: \(\{Z_2,Z_3\}=0\). Actually, as a consequence of Proposition 2.2, \(Z_3=0\), and thus we have eliminated all the terms of order 3 in (1.29).

In the case \(d=2\), \(Q_4^{\le N}\) is still \((1-\delta )\)-regularizing and we can perform a second Birkhoff normal form. Actually, since in eliminating \(Q_4^{\le N}\) we create terms of order at least 6, we can eliminate both \(Q_4^{\le N}\) and \(Q_5^{\le N}\). So, for \(d=2\), we are left with

$$\begin{aligned} {\tilde{H}}=H\circ \Phi _{\chi _3}\circ \Phi _{\chi _4+\chi _5} =Z_2+ Z_4+Q_6+\mathrm{negligible\, terms} \end{aligned}$$

where \(Z_4\) is a Hamiltonian of order 4 which is resonant,Footnote 3 \(\{Z_2,Z_4\}=0\), and \(Q_6\) is a Hamiltonian of order 6 whose coefficients are bounded by \(N^{2\delta }\). Since resonant Hamiltonians commute with \(N_s\), the first contribution in (1.29) is \(\{N_s,Q_6\}\). This is essentially the statement of Theorem 2 (which will be stated in Sect. 3) in the case \(d=2\) and \(n=3\), and this achieves the Birkhoff normal form step.

Let us describe the modified energy step only in the case \(d=2\) and \(n=3\) and let us focus on the worst term in \(\{N_s, {{\tilde{H}}}\}\), i.e. \(\{N_s,Q_6\}\). Let us write

$$\begin{aligned} Q_6= \sum _{\begin{array}{c} \sigma \in \{-1,1\}^6,\ j\in ({\mathbb {Z}}^d)^6\\ |j_1|\ge \cdots \ge |j_6|\\ \sum _{i=1}^6\sigma _i j_i=0 \end{array}} (Q_6)_{\sigma ,j}u_{j_1}^{\sigma _1}\cdots u_{j_6}^{\sigma _6}. \end{aligned}$$

From Proposition 2.2 we learn that if \(\sigma _1\sigma _2=1\) then the small divisor associated with \((j,\sigma )\) is controlled by \(\mu _3(j)\) and thus we can eliminate the corresponding monomial by one more Birkhoff normal forms step.Footnote 4 Now if we assume \(\sigma _1\sigma _2=-1\) we have

$$\begin{aligned} |\{N_s,u_{j_1}^{\sigma _1}\cdots u_{j_6}^{\sigma _6}\}|&= \left| \sum _{i=1}^6 \sigma _{i}\langle j_i\rangle ^{2s}\right| |u_{j_1}^{\sigma _1}\cdots u_{j_6}^{\sigma _6}| \\&\le (\langle j_1\rangle ^{2s}-\langle j_2\rangle ^{2s}+ 4 \langle j_3\rangle ^{2s}) |u_{j_1}^{\sigma _1}\cdots u_{j_6}^{\sigma _6}| \\&\le \big (s(\langle j_1\rangle ^{2}-\langle j_2\rangle ^{2})\langle j_1\rangle ^{2(s-1)} + 4 \langle j_3\rangle ^{2s}\big ) |u_{j_1}^{\sigma _1}\cdots u_{j_6}^{\sigma _6}| \\&\lesssim _s (\langle j_1\rangle ^{2s-1}\langle j_3\rangle + 4 \langle j_3\rangle ^{2s})|u_{j_1}^{\sigma _1}\cdots u_{j_6}^{\sigma _6}| \\&\lesssim _s \mu _1^{-1}\Vert u\Vert _{H^s}^6 \end{aligned}$$

where we used the zero momentum condition, \(\sum _{i=1}^6\sigma _i j_i=0\), to obtain \(|j_1-j_2|\le 4|j_3|\). This gain of one derivative, also known as the commutator trick, is central to many results about modified energies [6, 13] or the growth of Sobolev norms [4, 5, 12, 14].
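The passage from \(\langle j_1\rangle ^{2s}-\langle j_2\rangle ^{2s}\) to \(s(\langle j_1\rangle ^{2}-\langle j_2\rangle ^{2})\langle j_1\rangle ^{2(s-1)}\) above is the mean value inequality \(x^{s}-y^{s}\le s(x-y)x^{s-1}\) for \(x\ge y\ge 1\) and \(s\ge 1\), applied with \(x=\langle j_1\rangle ^{2}\), \(y=\langle j_2\rangle ^{2}\). A quick numerical sanity check on random samples:

```python
import random

# Sanity check of the mean value inequality used above:
#   x^s - y^s <= s (x - y) x^{s-1}   for x >= y >= 1, s >= 1,
# applied in the text with x = <j_1>^2 and y = <j_2>^2.
random.seed(0)
for _ in range(1000):
    y = random.uniform(1.0, 100.0)
    x = y + random.uniform(0.0, 100.0)
    s = random.uniform(1.0, 30.0)
    assert x**s - y**s <= s * (x - y) * x**(s - 1) + 1e-9 * x**s
print("mean value inequality holds on all samples")
```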

So if \(Q_6^-\) denotes the restriction of \(Q_6\) to monomials satisfying \(\sigma _1\sigma _2=-1\) we have essentially proved that

$$\begin{aligned} |\{N_s,Q_6^{-,>N_1}\}|\lesssim N_1^{-1}\Vert u\Vert _{H^s}^6. \end{aligned}$$

Then we can consider the modified energy \(N_s+E_6\) with \(E_6\) solving

$$\begin{aligned} \{E_6,Z_2\}=-\{N_s, Q_6^{-,\le N_1}\} \end{aligned}$$

in such a way that

$$\begin{aligned} \{N_s+E_6,{{\tilde{H}}}\}=\{N_s, Q_6^{-,> N_1}\}+\{N_s,{{\tilde{H}}}_7\} +\{E_6,Z_4\}+\mathrm{negligible\, terms}. \end{aligned}$$

Since this modified energy will not produce new terms of order 7, we can at the same time eliminate \(Q_7^{-,\le N_1}\). Thus we obtain a new energy, \(N_s+E_6+E_7\), which is equivalent to \(N_s\) in a neighborhood of the origin, and such that, neglecting all the powers of \(N^\delta \) and \(N_1^\delta \) which appear when we work carefully (see (4.6) for a precise estimate),

$$\begin{aligned} |\{N_s+E_6+E_7,{{\tilde{H}}}\}|\lesssim _s N_1^{-1}\Vert u\Vert _{H^s}^6+ \Vert u\Vert _{H^s}^8+N^{-1}\Vert u\Vert _{H^s}^3. \end{aligned}$$

Then a suitable choice of N and \(N_1\) and a standard bootstrap argument lead to \(T_\varepsilon =O(\varepsilon ^{-6})\) by using this rough estimate, and \(T_\varepsilon =O(\varepsilon ^{-6^-})\) by using the precise estimate (see Sect. 5).

Remark 1.2

In principle a Birkhoff normal form procedure gives more than just the control of the \(H^s\) norm of the solutions: it gives an equivalent Hamiltonian system, and therefore potentially more information about the dynamics of the solutions. However, if one wants to control only the \(H^s\) norm of the solution, the modified energy method is sufficient and simpler. One could therefore imagine applying this latter method from the beginning. However, when iterated, the modified energy method brings up terms that, when we apply a Birkhoff procedure, turn out to be zero. Unfortunately we have not been able to prove the cancellation of these terms directly by the modified energy method; that is why we use successively a Birkhoff normal form procedure and a modified energy procedure.

Notation

We shall use the notation \(A\lesssim B\) to denote \(A\le C B\) where C is a positive constant depending on parameters fixed once and for all, for instance d and n. We will emphasize by writing \(\lesssim _{q}\) when the constant C depends on some other parameter q.

2 Small Divisors

As already remarked in the introduction, the proof of Theorem 1 is based on a normal form approach. In particular we have to deal with a small divisors problem involving linear combinations of the linear frequencies \(\omega _{j}\) in (1.12).

This section is devoted to establishing suitable lower bounds for generic (in a probabilistic sense) choices of the parameters \(\nu \), except for exceptional indices for which the small divisor is identically zero. According to the following definition, such indices are called resonant.

Definition 2.1

(Resonant indices) Given \(r\ge 3\), \(j_1,\dots ,j_r \in {\mathbb {Z}}^d\) and \(\sigma _1,\dots ,\sigma _r\in \{-1,1\}\), the couple \((\sigma ,j)\) is resonant if r is even and there exists a permutation \(\rho \in {\mathfrak {S}}_r\) such that

$$\begin{aligned} \forall k\in \llbracket 1,r/2 \rrbracket , \ \begin{pmatrix} |j_{\rho _{2k-1},1}| \\ \vdots \\ |j_{\rho _{2k-1},d}| \end{pmatrix} =\begin{pmatrix} |j_{\rho _{2k},1}| \\ \vdots \\ |j_{\rho _{2k},d}| \end{pmatrix} \quad \mathrm {and} \quad \sigma _{\rho _{2k-1}} = -\sigma _{\rho _{2k}}. \end{aligned}$$
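In other words, \((\sigma ,j)\) is resonant exactly when the multiset of vectors \((|j_{k,1}|,\dots ,|j_{k,d}|)\) carrying \(\sigma _k=+1\) coincides with the one carrying \(\sigma _k=-1\). A small sketch of this test (ours):

```python
from collections import Counter

# Sketch of Definition 2.1: (sigma, j) is resonant iff r is even and the
# indices can be paired so that paired j's agree componentwise up to sign
# and carry opposite sigma's; equivalently, the multiset of
# (|j_{k,1}|,...,|j_{k,d}|) with sigma_k = +1 equals the one with -1.
def is_resonant(sigma, j):
    if len(sigma) % 2 == 1:
        return False
    plus = Counter(tuple(abs(c) for c in jk) for jk, s in zip(j, sigma) if s == +1)
    minus = Counter(tuple(abs(c) for c in jk) for jk, s in zip(j, sigma) if s == -1)
    return plus == minus

print(is_resonant((1, -1), ((2, -3), (-2, 3))))                       # True
print(is_resonant((1, 1, -1, -1), ((1, 0), (0, 1), (1, 0), (0, 1))))  # True
print(is_resonant((1, 1), ((1, 0), (-1, 0))))                         # False
```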

In this section we aim at proving the following proposition whose proof is postponed to the end of this section (see Sect. 2.3). We recall that a is defined with respect to the length, \(\nu \), of the torus by the relation \(a_i= \nu _i^2\) (see (1.12)).

Proposition 2.2

For almost all \(a\in (1,4)^d\), there exists \(\gamma >0\) such that for all \(\delta >0\), \(r\ge 3\), \(\sigma _1,\dots ,\sigma _r\in \{-1,1\}\), \(j_1,\dots ,j_r \in {\mathbb {Z}}^d\) satisfying \(\sigma _1 j_1+\dots +\sigma _r j_r = 0\) and \(|j_1|\ge \dots \ge |j_r|\), at least one of the following assertions holds

  1. (i)

    \((\sigma ,j)\) is resonant (see Definition 2.1)

  2. (ii)

    \(\sigma _1 \sigma _2 = 1\) and

    $$\begin{aligned} \left| \sum _{k=1}^r \sigma _k \sqrt{1+|j_k|_a^4} \right| \gtrsim _r \gamma \, (\langle j_3 \rangle \dots \langle j_r \rangle )^{-9dr^2}, \end{aligned}$$
  3. (iii)

    \(\sigma _1 \sigma _2 = -1\) and

    $$\begin{aligned} \left| \sum _{k=1}^r \sigma _k \sqrt{1+|j_k|_a^4} \right| \gtrsim _{r,\delta } \gamma \, \langle j_1 \rangle ^{-(d-1+\delta )} (\langle j_3 \rangle \dots \langle j_r \rangle )^{-44dr^4}. \end{aligned}$$

We refer the reader to Lemma 2.9 and its corollary to understand how we get this degeneracy with respect to \(j_1\).

2.1 A Weak Non-resonance Estimate

In this subsection we aim at proving the following technical lemma.

Lemma 2.3

If \(r\ge 1\), \((j_1,\dots ,j_r) \in ({\mathbb {N}}^d)^r\) is injective,Footnote 5 \(n \in ({\mathbb {Z}}^*)^r\) and \(\kappa \in {\mathbb {R}}^d\) satisfies \(\kappa _{i_{\star }}=0\) for some \(i_{\star }\in \llbracket 1,d\rrbracket \), then we have

$$\begin{aligned} \forall \gamma >0, \ \left| \left\{ a \in (1,4)^d \ : \ \big \vert \kappa \cdot a + \sum _{k=1}^r n_k \sqrt{1+|j_k|^4_a}\big \vert <\gamma \right\} \right| \lesssim _{r,d} \gamma ^{\frac{1}{r(r+1)}} (\langle j_1 \rangle \dots \langle j_r \rangle )^{\frac{12}{r+1}}. \end{aligned}$$

Its proof (postponed to the end of this subsection) relies essentially on the following lemma.

Lemma 2.4

If I, J are two bounded intervals of \({\mathbb {R}}_+^*\), \(r\ge 1\), \((j_1,\dots ,j_r) \in ({\mathbb {N}}^d)^r\) is injective, \(n \in ({\mathbb {Z}}^*)^r\) and \(h: J^{d-1}\rightarrow {\mathbb {R}}\) is measurable, then for all \(\gamma >0\) we have

$$\begin{aligned}&\left| \left\{ (m,b) \in I\times J^{d-1} \ : \ \big \vert h(b)+ \sum _{k=1}^r n_k \sqrt{m+ |j_k|^4_{(1,b)}}\big \vert <\gamma \right\} \right| \\&\quad \lesssim _{r,d,I,J} \gamma ^{\frac{1}{r(r+1)}} (\langle j_1 \rangle \dots \langle j_r \rangle )^{\frac{12}{r+1}} \end{aligned}$$

where \((1,b):=(1,b_1,\dots ,b_{d-1})\in {\mathbb {R}}^d\).

Proof of Lemma 2.4

The proof of this lemma is classical and follows the lines of [1].

Without loss of generality, we assume that \(\gamma \in (0,1)\). Let \(\eta \in (0,1)\) be a positive number which will be optimized later with respect to \(\gamma \). If \(1\le i<k \le r\) then we have

$$\begin{aligned} | j_i|_{1,b}^2 - | j_k|_{1,b}^2 = (j_{i,1}^2-j_{k,1}^2)+b_1 (j_{i,2}^2-j_{k,2}^2) + \dots + b_{d-1}(j_{i,d}^2-j_{k,d}^2). \end{aligned}$$

Since, by assumption, \((j_1,\dots ,j_r)\) is injective, either there exists \(\ell \in \llbracket 2,d \rrbracket \) such that \(j_{i,\ell }\ne j_{k,\ell }\) or \(j_{i,1}\ne j_{k,1}\) and \(j_{i,\ell }=j_{k,\ell }\) for \(\ell = 2,\dots , d\). Note that in this second case, we have \(|| j_i|_{1,b}^2 - | j_k|_{1,b}^2 |\ge 1\). In any case, since the dependency with respect to b is affine the set

$$\begin{aligned} {\mathcal {P}}_{\eta }^{(i,k)} = \left\{ b \in J^{d-1} \ : \ \big | | j_i|_{1,b}^2 - | j_k|_{1,b}^2 \big |< \eta \right\} \hbox { satisfies }|{\mathcal {P}}_{\eta }^{(i,k)}|<\, \eta (1+|J|^{d-1}). \end{aligned}$$

Therefore, we have

$$\begin{aligned}&\left| \left\{ (m,b) \in I\times J^{d-1} : \left| h(b)+ \sum _{k=1}^r n_k \sqrt{m+ |j_k|^4_{(1,b)}}\right|<\gamma \right\} \right| \le \frac{r(r-1)}{2} |I|\, \eta \, (1+|J|^{d-1}) \nonumber \\&\quad + |J|^{d-1} \sup _{\forall i<k, \ b \notin {\mathcal {P}}_{\eta }^{(i,k)} } \left| \left\{ m \in I: \ \left| h(b)+ \sum _{k=1}^r n_k \sqrt{m+ |j_k|^4_{(1,b)}} \right| <\gamma \right\} \right| . \end{aligned}$$
(2.1)

In order to estimate this last measure we fix \(b\in J^{d-1}{\setminus } \bigcup _{i<k} {\mathcal {P}}_{\eta }^{(i,k)}\) and we define \(g:I\rightarrow {\mathbb {R}}\) by

$$\begin{aligned} g(m) = h(b)+ \sum _{k=1}^r n_k \sqrt{m+ |j_k|^4_{(1,b)}}. \end{aligned}$$

By a straightforward calculation, for \(\ell \ge 1\), we have

$$\begin{aligned} \partial _m^\ell g(m) = c_{\ell } \sum _{k=1}^r n_k (m+ |j_k|^4_{(1,b)})^{\frac{1}{2}-\ell } \quad \mathrm {where} \quad c_{\ell } = \prod _{i=0}^{\ell -1} \left( \frac{1}{2} - i\right) . \end{aligned}$$
(2.2)

Therefore, we have

$$\begin{aligned} \begin{pmatrix} c_{1}^{-1} \partial _m^{1} g \\ \vdots \\ c_{r}^{-1}\partial _m^{r} g \end{pmatrix} = \begin{pmatrix} (m+ |j_1|^4_{(1,b)})^{0} &{} \dots &{} (m+ |j_r|^4_{(1,b)})^{0} \\ \vdots &{} &{} \vdots \\ (m+ |j_1|^4_{(1,b)})^{-(r-1)} &{} \dots &{} (m+ |j_r|^4_{(1,b)})^{-(r-1)} \end{pmatrix} \begin{pmatrix} n_1 \sqrt{m+ |j_1|^4_{(1,b)}}^{-1} \\ \vdots \\ n_r \sqrt{m+ |j_r|^4_{(1,b)}}^{-1} \end{pmatrix}. \end{aligned}$$

Denoting by V this Vandermonde matrix, by \(|x|_{\infty } := \max |x_i|\) for \(x\in {\mathbb {R}}^d\) and also by \(|\cdot |_\infty \) the associated matrix norm, we deduce that

$$\begin{aligned} \max _{i=1}^{r} c_{i}^{-1} |\partial _m^{i} g(m)| \ge |V^{-1}|_{\infty }^{-1} \max _{i=1}^{r} |n_i| \sqrt{m+ |j_i|^4_{(1,b)}}^{-1}. \end{aligned}$$
(2.3)

We recall that the inverse of V is given by

$$\begin{aligned} (V^{-1})_{i,\ell } = (-1)^{r-\ell } \frac{S_{r-\ell }\left( \left( \frac{1}{m+|j_k|^4_{(1,b)}}\right) _{k\ne i} \right) }{\displaystyle \prod _{k\ne i}\left( \frac{1}{m+|j_i|^4_{(1,b)}}- \frac{1}{m+|j_k|^4_{(1,b)}}\right) } \end{aligned}$$
(2.4)

(this formula can be derived easily using the Lagrange interpolation polynomials) where \(S_{\ell } : {\mathbb {R}}^{r-1}\rightarrow {\mathbb {R}}\) is the \(\ell \)-th elementary symmetric function

$$\begin{aligned} S_\ell (x) = \sum _{1\le k_1<\dots <k_{\ell } \le r-1} x_{k_1}\dots x_{k_{\ell }} \quad \mathrm {and} \quad S_0(x):=1. \end{aligned}$$
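The explicit inverse formula (2.4) can be verified numerically at arbitrary sample nodes (the nodes below play the role of the quantities \(\frac{1}{m+|j_k|^4_{(1,b)}}\); the values are arbitrary); this is a hedged sketch, not part of the proof.

```python
import numpy as np
from itertools import combinations

def elem_sym(vals, l):
    # l-th elementary symmetric function S_l, with S_0 = 1
    return sum(np.prod(c) for c in combinations(vals, l)) if l else 1.0

def vandermonde_inverse(x):
    # Explicit inverse (2.4) of the Vandermonde matrix V[l,k] = x_k^l, l = 0..r-1
    r = len(x)
    Vinv = np.empty((r, r))
    for i in range(r):
        others = [x[k] for k in range(r) if k != i]
        denom = np.prod([x[i] - xk for xk in others])
        for l in range(1, r + 1):
            Vinv[i, l - 1] = (-1) ** (r - l) * elem_sym(others, r - l) / denom
    return Vinv

x = np.array([0.9, 0.5, 0.2, 0.07])   # arbitrary sample nodes
V = np.vander(x, increasing=True).T   # V[l,k] = x_k^l
assert np.allclose(vandermonde_inverse(x) @ V, np.eye(4))
```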

Furthermore, we have

$$\begin{aligned} |V^{-1}|_{\infty } = \max _{i=1}^r \sum _{\ell =1}^r |(V^{-1})_{i,\ell }|. \end{aligned}$$
(2.5)

To estimate \(|V^{-1}|_{\infty } \) in (2.3), we use the estimates

$$\begin{aligned} S_{r-\ell }\left( \left( \frac{1}{m+|j_k|^4_{(1,b)}}\right) _{k\ne i} \right) \lesssim _{r,J,I} 1 \quad \mathrm {and} \quad \left| \frac{1}{m+|j_i|^4_{(1,b)}}- \frac{1}{m+|j_k|^4_{(1,b)}}\right| \gtrsim _{J,I} \frac{\eta }{\langle j_k \rangle ^6}. \end{aligned}$$

Indeed, if \(||j_i|^4_{(1,b)} - |j_k|^4_{(1,b)}|\ge \frac{1}{2}|j_i|^4_{(1,b)} \) we have

$$\begin{aligned} \left| \frac{1}{m+|j_i|^4_{(1,b)}}- \frac{1}{m+|j_k|^4_{(1,b)}}\right| = \left| \frac{|j_i|^4_{(1,b)} - |j_k|^4_{(1,b)}}{(m+|j_i|^4_{(1,b)})( m+|j_k|^4_{(1,b)})}\right| \gtrsim _{I,J} \frac{1}{\langle j_k \rangle ^4} \end{aligned}$$

while, if \(||j_i|^4_{(1,b)} - |j_k|^4_{(1,b)}|\le \frac{1}{2}|j_i|^4_{(1,b)}\), then \(|j_i|^4_{(1,b)} \le 2 |j_k|^4_{(1,b)}\) and so, since \(b\in J^{d-1}{\setminus } \bigcup _{i<k} {\mathcal {P}}_{\eta }^{(i,k)}\), we have

$$\begin{aligned} \left| \frac{1}{m+|j_i|^4_{(1,b)}}- \frac{1}{m+|j_k|^4_{(1,b)}}\right| \gtrsim _{I,J} \frac{(|j_i|^2_{(1,b)}+|j_k|^2_{(1,b)})||j_i|^2_{(1,b)}-|j_k|^2_{(1,b)}| }{\langle j_k \rangle ^8} \gtrsim _{I,J} \frac{\eta }{\langle j_k \rangle ^6}. \end{aligned}$$

Therefore by (2.5) and (2.4), we have

$$\begin{aligned} |V^{-1}|_{\infty } \lesssim _{r,I,J} \eta ^{-(r-1)} (\langle j_1 \rangle \dots \langle j_r \rangle )^6. \end{aligned}$$

Consequently, we deduce from (2.3) that

$$\begin{aligned} \max _{i=1}^{r} |\partial _m^{i} g(m)| \gtrsim _{r,I,J} \eta ^{r-1} (\langle j_1 \rangle \dots \langle j_r \rangle )^{-6} |n|_{\infty }. \end{aligned}$$
(2.6)

Furthermore, considering (2.2), it is clear that

$$\begin{aligned} |\partial _m^\ell g(m)| \lesssim _{\ell ,I,J} |n|_{\infty }. \end{aligned}$$

As a consequence, given \(\rho >0\) (to be optimized later) and applying Lemma B.1 of [17], we get N sub-intervals of I, denoted \(\Delta _{1},\dots ,\Delta _{N}\), such that

$$\begin{aligned}&N \lesssim _{I,r} (\langle j_1 \rangle \dots \langle j_r \rangle )^{6} \eta ^{-(r-1)}, \quad \max _{i=1}^N |\Delta _{i}| \lesssim _{I,r} \left( \frac{\rho (\langle j_1 \rangle \dots \langle j_r \rangle )^{6}}{\eta ^{r-1} |n|_{\infty }} \right) ^{\frac{1}{r-1}}, \\&\quad |\partial _m g(m)| \ge \rho \quad \forall m \in I {\setminus } (\Delta _1 \cup \dots \cup \Delta _N). \end{aligned}$$

Observing that \(I {\setminus } (\Delta _1 \cup \dots \cup \Delta _N)\) can be written as the union of M intervals with \(M\lesssim 1 + N\), we deduce that

$$\begin{aligned}&\left| \left\{ m \in I : \left| h(b)+ \sum _{k=1}^r n_k \sqrt{m+ |j_k|^4_{(1,b)}} \right|<\gamma \right\} \right| < M \rho ^{-1} \gamma + N \max _{i=1}^N |\Delta _{i}|\\&\quad \lesssim _{I,r} (\langle j_1 \rangle \dots \langle j_r \rangle )^{6} \eta ^{-(r-1)} \left[ \rho ^{-1} \gamma + \left( \frac{\rho (\langle j_1 \rangle \dots \langle j_r \rangle )^{6}}{\eta ^{r-1} |n|_{\infty }} \right) ^{\frac{1}{r-1}} \right] . \end{aligned}$$

We optimize \(\rho \) to equalize the two terms in this last sum:

$$\begin{aligned} \rho ^{\frac{r}{r-1}}= \gamma \left( \frac{\eta ^{r-1} |n|_{\infty }}{ (\langle j_1 \rangle \dots \langle j_r \rangle )^{6}}\right) ^{\frac{1}{r-1}}. \end{aligned}$$
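As a hedged numerical sanity check (all sample values below are arbitrary, with \(A\) playing the role of \((\langle j_1 \rangle \dots \langle j_r \rangle )^{6}\)), one can verify that this choice of \(\rho \) indeed makes the two terms coincide:

```python
import numpy as np

# Check that rho equalizes rho^{-1} gamma and (rho A / (eta^{r-1} |n|_inf))^{1/(r-1)}.
# A plays the role of (<j_1> ... <j_r>)^6; all numerical values are arbitrary.
gamma, eta, n_inf, A, r = 1e-6, 0.1, 5.0, 7.0, 4
rho = (gamma * (eta**(r - 1) * n_inf / A)**(1 / (r - 1)))**((r - 1) / r)
term1 = gamma / rho
term2 = (rho * A / (eta**(r - 1) * n_inf))**(1 / (r - 1))
assert np.isclose(term1, term2)
```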

This provides the estimate

$$\begin{aligned}&\left| \left\{ m \in I : \left| h(b)+ \sum _{k=1}^r n_k \sqrt{m+ |j_k|^4_{(1,b)}} \right| <\gamma \right\} \right| \\&\quad \lesssim _{I,r} \gamma ^{\frac{1}{r}} (\langle j_1 \rangle \dots \langle j_r \rangle )^{6} \eta ^{-(r-1)} \left( \frac{ (\langle j_1 \rangle \dots \langle j_r \rangle )^{6}}{\eta ^{r-1} |n|_{\infty }}\right) ^{\frac{1}{r}} \\&\quad \lesssim _{I,r} \left( \frac{\gamma }{|n|_{\infty }}\right) ^{\frac{1}{r}} \eta ^{-(r-1+\frac{r-1}{r})} (\langle j_1 \rangle \dots \langle j_r \rangle )^{12}. \end{aligned}$$

Finally, we optimize (2.1) by choosing

$$\begin{aligned} \eta =\gamma ^{\frac{1}{r}} \eta ^{-(r-1+\frac{r-1}{r})} (\langle j_1 \rangle \dots \langle j_r \rangle )^{12}, \quad \text {i.e.}\quad \eta = \left( \gamma ^{\frac{1}{r}} (\langle j_1 \rangle \dots \langle j_r \rangle )^{12}\right) ^{\frac{1}{r+\frac{r-1}{r}}}, \end{aligned}$$

and, recalling that \(|n|_{\infty }\ge 1\), we get

$$\begin{aligned}&\left| \left\{ (m,b) \in I\times J^{d-1} : \left| h(b) +\sum _{k=1}^r n_k \sqrt{m+ |j_k|^4_{(1,b)}}\right| <\gamma \right\} \right| \\&\quad \lesssim _{r,d,I,J} \left( \gamma ^{\frac{1}{r}} (\langle j_1 \rangle \dots \langle j_r \rangle )^{12}\right) ^{\frac{1}{r+\frac{r-1}{r}}}. \end{aligned}$$

Since this measure is obviously bounded by \(|I||J|^{d-1}\), the exponent \(r+\frac{r-1}{r}\) can be replaced by \(r+1\) in the above expression, which concludes the proof. \(\square \)

Now using Lemma 2.4, we prove Lemma 2.3.

Proof of Lemma 2.3

Without loss of generality we assume that \(i_{\star } = 1\). First, since \(\kappa _1=0\), we note that

$$\begin{aligned} G(a):=\kappa \cdot a + \sum _{k=1}^r n_k \sqrt{1+|j_k|^4_a}= \frac{1}{\sqrt{m}}( h(b) + \sum _{k=1}^r n_k \sqrt{m+|j_k|^4_{(1,b)}} )=:\frac{1}{\sqrt{m}}F(m,b) \end{aligned}$$

where

$$\begin{aligned} m=\frac{1}{a_1^2}, \quad b=\left( \frac{a_2}{a_1},\dots ,\frac{a_d}{a_1}\right) \quad \mathrm {and} \quad h(b) = \sum _{k=2}^d \kappa _k b_{k-1}. \end{aligned}$$
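This change of variables can be verified numerically as a sanity check; the sketch below assumes the weighted norm convention \(|j|_w^2=\sum _i w_i j_i^2\) used throughout, and all sample values (with \(\kappa _1=0\)) are arbitrary.

```python
import numpy as np

# Check G(a) = F(m,b)/sqrt(m) with |j|_w^2 = sum_i w_i j_i^2 (assumed convention).
# a, kappa (with kappa_1 = 0), the modes j_k and the n_k are arbitrary sample data.
def norm2(w, j):
    return np.dot(w, j**2)

a = np.array([1.7, 2.3, 3.1])
kappa = np.array([0, 2, -1])
js = [np.array([1, 2, 0]), np.array([0, 1, 3])]
nk = [1, -2]

m = 1 / a[0]**2
b = a[1:] / a[0]
h = np.dot(kappa[1:], b)   # h(b) = sum_{k>=2} kappa_k b_{k-1}

G = np.dot(kappa, a) + sum(n * np.sqrt(1 + norm2(a, j)**2) for n, j in zip(nk, js))
F = h + sum(n * np.sqrt(m + norm2(np.concatenate(([1], b)), j)**2) for n, j in zip(nk, js))
assert np.isclose(G, F / np.sqrt(m))
```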

Let us denote by \(\Psi \) the map \(a\mapsto (m,b)\). It is clearly smooth and injective. Furthermore, we have

$$\begin{aligned} \det \mathrm {d}\Psi (a) = \begin{vmatrix} - 2a_1^{-3}&-a_2 a_1^{-2}&\dots&-a_d a_1^{-2} \\&a_1^{-1} \\&\ddots \\&&a_1^{-1} \end{vmatrix} = -2\, a_1^{-d-2}. \end{aligned}$$
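This determinant computation can be cross-checked with a finite-difference Jacobian (a sketch at an arbitrary sample point):

```python
import numpy as np

# Finite-difference check of det dPsi(a) = -2 a_1^{-d-2} at an arbitrary point.
def Psi(a):
    return np.concatenate(([1 / a[0]**2], a[1:] / a[0]))

a = np.array([1.7, 2.3, 3.1, 1.2])
d, eps = len(a), 1e-6
J = np.empty((d, d))
for i in range(d):
    e = np.zeros(d)
    e[i] = eps
    J[:, i] = (Psi(a + e) - Psi(a - e)) / (2 * eps)
assert np.isclose(np.linalg.det(J), -2 * a[0]**(-d - 2))
```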

Consequently, \(\Psi \) is a smooth diffeomorphism onto its image \(\Psi ((1,4)^d)\), which is included in the rectangle \(\left( \frac{1}{16},1\right) \times \left( \frac{1}{4},4\right) ^{d-1}\). Therefore, by a change of variables, we have

$$\begin{aligned} |\{ a \in (1,4)^d : \big \vert G(a) \big \vert<\gamma \}|&= \int _{a\in ({1},{4})^d} {\mathbb {1}}_{|G(a)|<\gamma } \mathrm{d}a \\&= \int _{(m,b)\in \Psi ((1,4)^d)} {\mathbb {1}}_{| F(m,b) |<\sqrt{m}\gamma } \left( \tfrac{1}{2}\, \sqrt{m}^{-(d+2)}\right) \mathrm {d}(m,b) \\&\le 2^{2d+5} \left| \left\{ (m,b) \in \left( \frac{1}{16},1\right) \times \left( \frac{1}{4},4\right) ^{d-1} : |F(m,b)|<\gamma \right\} \right| . \end{aligned}$$

Finally, by applying Lemma 2.4, we get the expected estimate. \(\square \)

2.2 Non-resonance Estimates for Two Large Modes

In this subsection we fix \(r\ge 3\), \((j_k)_{k\ge 3} \in ({\mathbb {Z}}^d)^{r-2}\) and \(\sigma \in \{-1,1\}^r\) such that \(\sigma _1=-\sigma _2\). We define \(j_{\ge 3} \in {\mathbb {Z}}^d\) by

$$\begin{aligned} j_{\ge 3} := \sigma _3 j_3+\dots +\sigma _r j_r. \end{aligned}$$
(2.7)

Being given \(j_1\in {\mathbb {Z}}^d\), we define implicitly \(j_2:= j_1+\sigma _1 j_{\ge 3}\) in order to satisfy the zero momentum condition

$$\begin{aligned} \sum _{k=1}^r \sigma _k j_k=0, \end{aligned}$$
(2.8)

and we define the function \(g_{j_1}:(1,4)^d \rightarrow {\mathbb {R}}\) by

$$\begin{aligned} g_{j_1}(a) = \sum _{k=1}^r \sigma _k \sqrt{1+|j_k|_a^4}. \end{aligned}$$

Finally, for \(\gamma >0\), we introduce the following sets

$$\begin{aligned} {\mathcal {I}}&:= \{ i : j_{\ge 3,i}\ne 0\}, \quad C_i := \left\{ j_1\in {\mathbb {Z}}^d : |j_{1,i}| \ge 2\left( 1+ \sum _{k\ge 3} |j_{k,i}|^2\right) \right\} ,\\ S&:= \left\{ j_1 \in {\mathbb {Z}}^d {\setminus } \bigcup _{i\in {\mathcal {I}}} C_i : (\sigma ,j) \text { is non-resonant}\right\} ,\\ \mathrm {and} \quad R_{\gamma }&:= \{ j_1 \in S : |j_1|\ge \gamma ^{-1/2}(\langle j_3 \rangle \dots \langle j_r \rangle )^{2dr^2}\}. \end{aligned}$$

First, we prove the following technical lemma; its Corollary 2.6 allows us to deal with the non-degenerate cases.

Lemma 2.5

If there exists \(i\in \llbracket 1,d\rrbracket \) such that

$$\begin{aligned} |(j_{1,i}+j_{2,i}) j_{\ge 3,i}|\ge 2 \left( 1+\sum _{k= 3}^r j_{k,i}^2\right) \end{aligned}$$
(2.9)

then for all \(\gamma >0\)

$$\begin{aligned} \left| \left\{ a\in (1,4)^d : \left| \sum _{k=1}^r \sigma _k \sqrt{1+|j_k|^4_a}\right| <\gamma \right\} \right| < \frac{2\,\gamma }{|j_{1,i}+j_{2,i}|}. \end{aligned}$$
(2.10)

Proof

Without loss of generality we assume that \(\sigma _1=1\) and \(\sigma _2 = -1\). We compute the derivative with respect to \(a_i\):

$$\begin{aligned} \partial _{a_i} \sum _{k=1}^r \sigma _k \sqrt{1+|j_k|^4_a} = \sum _{k=1}^r \sigma _k j_{k,i}^2 \frac{|j_k|^2_a }{\sqrt{1+|j_k|^4_a}}. \end{aligned}$$

Consequently, we have

$$\begin{aligned} \left| \partial _{a_i} \left( \sum _{k=1}^r \sigma _k \sqrt{1+|j_k|^4_a}\right) \right| \ge \left| j_{1,i}^2 \frac{|j_1|^2_a }{\sqrt{1+|j_1|^4_a}} - j_{2,i}^2 \frac{|j_2|^2_a }{\sqrt{1+|j_2|^4_a}} \right| - \sum _{k= 3}^r j_{k,i}^2. \end{aligned}$$

Furthermore, we have

$$\begin{aligned} \left| \frac{|j_1|^2_a }{\sqrt{1+|j_1|^4_a}} - 1 \right| \le \frac{1}{2|j_1|^2_a}. \end{aligned}$$

Consequently, we get

$$\begin{aligned} \left| \partial _{a_i} \left( \sum _{k=1}^r \sigma _k \sqrt{1+|j_k|^4_a}\right) \right| \ge |j_{1,i}^2 - j_{2,i}^2| - 1-\sum _{k= 3}^r j_{k,i}^2. \end{aligned}$$

Observing that by definition we have \(j_{1,i}^2 - j_{2,i}^2 = j_{\ge 3,i}(j_{1,i} + j_{2,i})\), we deduce from the assumption (2.9) that

$$\begin{aligned} \left| \partial _{a_i} \sum _{k=1}^r \sigma _k \sqrt{1+|j_k|^4_a}\right| \ge \frac{1}{2} |j_{\ge 3,i}(j_{1,i} + j_{2,i})|. \end{aligned}$$

Since by (2.9) we know that \(j_{\ge 3,i} \in {\mathbb {Z}}{\setminus } \{0\}\), we deduce that

$$\begin{aligned} \left| \partial _{a_i} \sum _{k=1}^r \sigma _k \sqrt{1+|j_k|^4_a}\right| \ge \frac{1}{2} |j_{1,i} + j_{2,i}|. \end{aligned}$$

Therefore \(a_i\mapsto \sum _{k=1}^r \sigma _k \sqrt{1+|j_k|^4_a}\) is a smooth monotonic function of \(a_i\) and thus a diffeomorphism onto its image. Consequently, applying this change of coordinates, we directly get (2.10), which concludes the proof. \(\square \)

Corollary 2.6

For all \(\gamma >0\) we have

$$\begin{aligned} \forall i \in {\mathcal {I}}, \ |\{ a\in (1,4)^d : \exists j_1\in C_i, \ |g_{j_1}(a)| < \gamma |j_1|^{-(d-1)} \log ^{-2d}(| j_1|) \}| \lesssim _d \gamma \end{aligned}$$
(2.11)

Proof of Corollary 2.6

Let \(j_1\in C_i\). By definition of \(j_2\), we have

$$\begin{aligned} |j_{1,i}+j_{2,i}|\ge 2|j_{1,i}| - \sum _{k=3}^r |j_{k,i}|. \end{aligned}$$

Consequently, since \(|j_{k,i}|\le |j_{k,i}|^{2}\) and \(j_1\in C_i\), we have

$$\begin{aligned} |j_{1,i}+j_{2,i}|\ge 2 |j_{1,i}| - \sum _{k=3}^r |j_{k,i}|^2 \ge \frac{3}{2} |j_{1,i}|. \end{aligned}$$

Therefore, since \(j_{\ge 3,i}\ne 0\), we have

$$\begin{aligned} |j_{\ge 3,i}(j_{1,i}+j_{2,i})| \ge \frac{3}{2} |j_{1,i}| \ge 3\left( 1+ \sum _{k\ge 3} |j_{k,i}|^2\right) . \end{aligned}$$

Applying Lemma 2.5, we deduce that for all \(\gamma >0\)

$$\begin{aligned} \left| \left\{ a\in (1,4)^d : \left| \sum _{k=1}^r \sigma _k \sqrt{1+|j_k|^4_a}\right| <\gamma \right\} \right| < \frac{4\gamma }{3|j_{1,i}|}. \end{aligned}$$

Consequently, we have

$$\begin{aligned}&|\{ a\in (1,4)^d : \exists j_1\in C_i, \ |g_{j_1}(a)|< \gamma |j_1|^{-(d-1)} \log ^{-2d}(| j_1|) \}| \\&\quad = \left| \bigcup _{j_1\in C_i} \{ a\in (1,4)^d : |g_{j_1}(a)|< \gamma |j_1|^{-(d-1)} \log ^{-2d}(| j_1|) \} \right| \\&\quad \le \sum _{j_1\in C_i} \left| \{ a\in (1,4)^d : |g_{j_1}(a)| < \gamma |j_1|^{-(d-1)} \log ^{-2d}(| j_1|) \} \right| \\&\quad \lesssim \gamma \sum _{j_1\in C_i} \frac{1}{|j_1|^{(d-1)} |j_{1,i}| \log ^{2d}(|j_1|)} \lesssim _d \gamma . \end{aligned}$$

\(\square \)

In the following lemma, we deal with most of the degenerate cases.

Lemma 2.7

For all \(\gamma >0\), we have

$$\begin{aligned} |\{ a\in (1,4)^d : \exists j_1\in R_{\gamma }, \ |g_{j_1}(a)| < \gamma \}| \lesssim _{r,d} \gamma ^{\frac{1}{r^2}} (\langle j_3 \rangle \dots \langle j_r \rangle )^{2d }. \end{aligned}$$
(2.12)

Proof

Without loss of generality, we assume that \(\gamma < \min ((2r)^{-2},(36d)^{-1})\). If \(j_1\in R_{\gamma }\), recalling that for \(x\ge 0\) we have \(|\sqrt{1+x}-1|\le x/2\), we deduce that

$$\begin{aligned}&|g_{j_1}(a)| \ge |h_{j_1}(a) | - \frac{1}{2|j_1|_a^2} - \frac{1}{2|j_2|_a^2} \quad \\&\quad \mathrm {where} \quad h_{j_1}(a) := |j_1|_a^2-|j_2|_a^2 + \sigma _1 \sum _{k=3}^r \sigma _k \sqrt{1+|j_k|^4_a}. \end{aligned}$$

However, by definition of \(j_2\) and \(R_{\gamma }\), we have

$$\begin{aligned}&|j_2|\ge |j_1|-\sum _{k=3}^r |j_k| \ge \gamma ^{-1/2} (\langle j_3 \rangle \dots \langle j_r \rangle )^{2d r^2} \\&\quad - (r-2) (\langle j_3 \rangle \dots \langle j_r \rangle ) \ge \frac{\gamma ^{-1/2}}{2} (\langle j_3 \rangle \dots \langle j_r \rangle )^{2d r^2}. \end{aligned}$$

Noting that, for \(a\in (1,4)^d\), we have \(|\cdot | \le |\cdot |_a\), we deduce that

$$\begin{aligned} |g_{j_1}(a)| \ge |h_{j_1}(a) | - 3\gamma (\langle j_3 \rangle \dots \langle j_r \rangle )^{-4d r^2}. \end{aligned}$$

Consequently, it is enough to prove that

$$\begin{aligned} |\{ a\in (1,4)^d :\exists j_1\in R_{1}, \ |h_{j_1}(a)| < \gamma (\langle j_3 \rangle \dots \langle j_r \rangle )^{-4dr^2} \}| \lesssim _{r,d} \gamma ^{\frac{1}{(r-1)(r-2)}}. \end{aligned}$$
(2.13)

To prove this estimate, we need the following result, whose proof is postponed to the end of the present proof.

Lemma 2.8

If \(j_1\in R_{\gamma }\) then there exists \(\kappa _{j_1} \in {\mathbb {Z}}^d\) such that

$$\begin{aligned} |j_1|_a^2 - |j_2|_a^2=\kappa _{j_1}\cdot a, \quad |\kappa _{j_1}|_{\infty } \le 10(\langle j_3 \rangle \dots \langle j_r \rangle )^3 \quad \mathrm {and} \quad \exists i_{\star }\in \llbracket 1,d\rrbracket , \ \kappa _{j_1,i_{\star }}=0. \end{aligned}$$

Now we have to distinguish two cases.

  • Case 1: \((\sigma _k,j_k)_{k\ge 3}\) is resonant. If \(j_1\in R_{\gamma }\), let \(\kappa _{j_1}\in {\mathbb {Z}}^d\) be given by Lemma 2.8. Note that \(\kappa _{j_1}\ne 0\), since otherwise we would have \(j_{1,i}^2=j_{2,i}^2\) for all \(i\in \llbracket 1,d \rrbracket \) and so \((\sigma ,j)\) would be resonant (which is excluded by the definition of \(R_{\gamma }\)). Furthermore, here \(h_{j_1}(a)= \kappa _{j_1} \cdot a\) is a linear form. Consequently, for all \(\gamma >0\), we have the following estimate, which is much stronger than (2.13):

    $$\begin{aligned}&|\{ a\in (1,4)^d : \exists j_1\in R_{1}, \ |h_{j_1}(a)|< \gamma \}| \\&\quad \le \left| \bigcup _{\begin{array}{c} \kappa \in {\mathbb {Z}}^d {\setminus } \{0\}\\ |\kappa |_{\infty } \le 10(\langle j_3 \rangle \dots \langle j_r \rangle )^3 \end{array}} \{ a\in (1,4)^d : \kappa \cdot a< \gamma \} \right| \\&\quad \le \sum _{\begin{array}{c} \kappa \in {\mathbb {Z}}^d {\setminus } \{0\}\\ |\kappa |_{\infty } \le 10(\langle j_3 \rangle \dots \langle j_r \rangle )^3 \end{array}} |\{ a\in (1,4)^d : \kappa \cdot a < \gamma \}| \le \gamma (20(\langle j_3 \rangle \dots \langle j_r \rangle )^3)^d \end{aligned}$$
  • Case 2: \((\sigma _k,j_k)_{k\ge 3}\) is non-resonant. If \(j_1\in R_{\gamma }\), \(h_{j_1}\) can be written as

    $$\begin{aligned} h_{j_1}(a) = \kappa _{j_1} \cdot a + \sum _{k=1}^{{\widetilde{r}}} n_k \sqrt{1+|{\widetilde{j}}_k|^4_a} \end{aligned}$$

    where \( \kappa _{j_1}\) is given by Lemma 2.8, \({\widetilde{r}}\le r-2\), \(({\widetilde{j}}_1,\dots ,{\widetilde{j}}_{{\widetilde{r}}}) \in ({\mathbb {N}}^d)^{{\widetilde{r}}}\) is injective and \(n_k\in {\mathbb {Z}} {\setminus } \{0\}\) is defined by

    $$\begin{aligned} n_k = \sum _{\begin{array}{c} i\in \llbracket 3,r \rrbracket \\ \forall \ell ,\ |j_{i,\ell }| = {\widetilde{j}}_{k,\ell } \end{array}} \sigma _1 \sigma _i. \end{aligned}$$

    Consequently, by Lemma 2.8, we have

    $$\begin{aligned}&|\{ a\in (1,4)^d : \exists j_1\in R_{1}, \ |h_{j_1}(a)|< \gamma \}|\\&\quad \le \left| \bigcup _{\begin{array}{c} \kappa \in {\mathbb {Z}}^d \\ |\kappa |_{\infty } \le 10(\langle j_3 \rangle \dots \langle j_r \rangle )^3 \\ \exists i_{\star },\ \kappa _{i_{\star }}=0 \end{array}} \left\{ a\in (1,4)^d : \left| \kappa \cdot a + \sum _{k=1}^{{\widetilde{r}}} n_k \sqrt{1+|{\widetilde{j}}_k|^4_a} \right|< \gamma \right\} \right| \\&\quad \le \sum _{\begin{array}{c} \kappa \in {\mathbb {Z}}^d \\ |\kappa |_{\infty } \le 10(\langle j_3 \rangle \dots \langle j_r \rangle )^3 \\ \exists i_{\star },\ \kappa _{i_{\star }}=0 \end{array}} \left| \left\{ a\in (1,4)^d : \left| \kappa \cdot a + \sum _{k=1}^{{\widetilde{r}}} n_k \sqrt{1+|{\widetilde{j}}_k|^4_a} \right| < \gamma \right\} \right| . \end{aligned}$$

    Finally, by applying Lemma 2.3 we get

    $$\begin{aligned} |\{ a\in (1,4)^d : \exists j_1\in R_{1}, \ |h_{j_1}(a)| < \gamma \}| \lesssim _{r,d} \gamma ^{\frac{1}{(r-2)(r-1)}} (\langle j_3 \rangle \dots \langle j_r \rangle )^{\frac{12}{r-1}+3d}, \end{aligned}$$

    which is also stronger than (2.13). \(\square \)
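The grouping of the modes \(j_3,\dots ,j_r\) into the multiplicities \(n_k\) used in Case 2 can be illustrated by a small toy computation (scalar sample data in \({\mathbb {Z}}^2\), with \(\sigma _1=1\); purely illustrative, not part of the proof):

```python
from collections import Counter

# Toy illustration of the definition of n_k: the modes j_3..j_r are grouped by
# the absolute values of their coordinates, each contributing sigma_1 * sigma_i.
# Here sigma_1 = 1 and the data are arbitrary.
sigmas = [1, 1, -1, 1]                      # sigma_3, ..., sigma_6
js = [(1, -2), (-1, 2), (3, 0), (1, 2)]     # j_3, ..., j_6 in Z^2
weights = Counter()
for s, j in zip(sigmas, js):
    weights[tuple(abs(c) for c in j)] += s
n = {jt: w for jt, w in weights.items() if w != 0}   # discard cancelled groups
assert n == {(1, 2): 3, (3, 0): -1}
```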

Proof of Lemma 2.8

First let us note that

$$\begin{aligned} |j_1|_a^2 - |j_2|_a^2 = \kappa _{j_1} \cdot a \quad \mathrm {where} \quad \kappa _{j_1,i} = j_{1,i}^2-j_{2,i}^2 = \sigma _2 j_{\ge 3,i} (j_{1,i}+j_{2,i}). \end{aligned}$$

We first aim at controlling \(|\kappa _{j_1}|_{\infty }\). If \(i\notin {\mathcal {I}}\) then \(j_{\ge 3,i} =0\) and so \(\kappa _{j_1,i}=0\). Otherwise, since \(j_1\in {\mathbb {Z}}^d {\setminus } \bigcup _{i\in {\mathcal {I}}} C_i\), we have \(|j_{1,i}|\le 2(1+ \sum _{k\ge 3} |j_{k,i}|^2)\). Consequently, we deduce that

$$\begin{aligned} |\kappa _{j_1,i} | \le \left( \sum _{k\ge 3} |j_{k,i}| \right) \left( 4 + 4\sum _{k\ge 3} |j_{k,i}|^2 + \sum _{k\ge 3} |j_{k,i}| \right) \le 10(\langle j_3 \rangle \dots \langle j_r \rangle )^3. \end{aligned}$$

Now we assume by contradiction that \(\kappa _{j_1,i}\ne 0\) for all \(i\in \llbracket 1,d\rrbracket \). Consequently, we have \({\mathcal {I}}=\llbracket 1,d\rrbracket \) and so

$$\begin{aligned} |j_1|_{\infty } \le 2\left( 1+ \sum _{k\ge 3} |j_{k}|^2\right) \le 6 \langle j_3 \rangle ^{2} \dots \langle j_r \rangle ^{2}. \end{aligned}$$
(2.14)

However, since \(j_1\in R_{\gamma }\), we have \(|j_1| \ge \gamma ^{-1/2}(\langle j_3 \rangle \dots \langle j_r \rangle )^{2dr^2}\) which is in contradiction with (2.14) because we have assumed that \(\gamma < (36d)^{-1}\). \(\square \)

Finally, in the following lemma, we deal with the remaining degenerate cases.

Lemma 2.9

For all \(\gamma >0\), we have

$$\begin{aligned} |\{ a\in (1,4)^d : \exists j_1\in S, \ |g_{j_1}(a)| < \gamma \}| \lesssim _{r,d} \gamma ^{\frac{1}{8r^4}} (\langle j_3 \rangle \dots \langle j_r \rangle )^{5d}. \end{aligned}$$
(2.15)

Proof

Without loss of generality we assume that \(\gamma \in (0,1)\). Let \(\eta \in (0,1)\) be a small number that will be optimized with respect to \(\gamma \) later. From the decomposition \(S= R_{\eta }\cup (S{\setminus } R_{\eta })\) we get

$$\begin{aligned}&|\{ a\in (1,4)^d : \exists j_1\in S, \ |g_{j_1}(a)|< \gamma \}| \le \sum _{j_1\in S{\setminus } R_{\eta }} |\{ a\in (1,4)^d : |g_{j_1}(a)|< \gamma \}| \nonumber \\&\quad + |\{ a\in (1,4)^d : \exists j_1\in R_{\eta }, \ |g_{j_1}(a)| < \eta \}| . \end{aligned}$$
(2.16)

To estimate the sum, we apply Lemma 2.3 (with \(\kappa =0\)) and we get

$$\begin{aligned} \sum _{j_1\in S{\setminus } R_{\eta }} |\{ a\in (1,4)^d : |g_{j_1}(a)|< \gamma \}|&\le \sum _{\begin{array}{c} |j_1|< \eta ^{-1/2}(\langle j_3 \rangle \dots \langle j_r \rangle )^{2dr^2} \\ (\sigma ,j) \text { is non-resonant} \end{array}} |\{ a\in (1,4)^d : |g_{j_1}(a)|< \gamma \}|\\&\le \sum _{ |j_1| < \eta ^{-1/2}(\langle j_3 \rangle \dots \langle j_r \rangle )^{2dr^2} } \gamma ^{\frac{1}{r(r+1)}} (\langle j_1 \rangle \dots \langle j_r \rangle )^{\frac{12}{r+1}}. \end{aligned}$$

Furthermore, by the zero momentum condition (2.8), since \(\eta \in (0,1)\), we also have

$$\begin{aligned} |j_2|\lesssim _r \eta ^{-1/2}(\langle j_3 \rangle \dots \langle j_r \rangle )^{2dr^2}. \end{aligned}$$

Consequently, we have

$$\begin{aligned}&\sum _{j_1\in S{\setminus } R_{\eta }} |\{ a\in (1,4)^d : |g_{j_1}(a)| < \gamma \}| \lesssim _r \gamma ^{\frac{1}{r(r+1)}} \eta ^{-\frac{1}{2}- \frac{12}{r+1}} (\langle j_3 \rangle \dots \langle j_r \rangle )^{2dr^2 + \frac{12}{r+1} + \frac{24}{r+1}2dr^2 } \\&\quad \lesssim _r \gamma ^{\frac{1}{2r^2}} \eta ^{-\frac{7}{2}} (\langle j_3 \rangle \dots \langle j_r \rangle )^{15dr^2 }. \end{aligned}$$

Therefore, applying Lemma 2.7, we deduce from (2.16) that

$$\begin{aligned}&|\{ a\in (1,4)^d : \exists j_1\in S, \ |g_{j_1}(a)| < \gamma \}| \lesssim _{r,d} \eta ^{\frac{1}{r^2}} (\langle j_3 \rangle \dots \langle j_r \rangle )^{2d }\\&\quad +\gamma ^{\frac{1}{2r^2}} \eta ^{-\frac{7}{2}} (\langle j_3 \rangle \dots \langle j_r \rangle )^{15dr^2 }. \end{aligned}$$

Finally, we get (2.15) by optimizing this last estimate, choosing

$$\begin{aligned} \eta = \gamma ^{\frac{1}{7r^2+2}} (\langle j_3 \rangle \dots \langle j_r \rangle )^{\frac{15dr^2-2d}{7/2+1/r^2} } . \end{aligned}$$

\(\square \)
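As a hedged numerical sanity check of the equalization defining \(\eta \) in the proof above (with \(A\) playing the role of \(\langle j_3 \rangle \dots \langle j_r \rangle \); all sample values are arbitrary):

```python
import numpy as np

# Check that eta equalizes eta^{1/r^2} A^{2d} and gamma^{1/(2r^2)} eta^{-7/2} A^{15 d r^2}.
# A plays the role of <j_3> ... <j_r>; the numerical values are arbitrary.
gamma, A, r, d = 1e-8, 3.0, 3, 2
eta = gamma**(1 / (7 * r**2 + 2)) * A**((15 * d * r**2 - 2 * d) / (3.5 + 1 / r**2))
term1 = eta**(1 / r**2) * A**(2 * d)
term2 = gamma**(1 / (2 * r**2)) * eta**(-3.5) * A**(15 * d * r**2)
assert np.isclose(term1, term2)
```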

2.3 Proof of Proposition 2.2

For \(r\ge 3\) let \({\mathcal {M}}_r\) and \({\mathcal {R}}_r\) be the sets defined by

$$\begin{aligned} {\mathcal {M}}_r&=\{ (\sigma ,j)\in (\{-1,1\})^r \times ({\mathbb {Z}}^d)^{r} : \sum _{k=1}^r \sigma _k j_k =0\} \quad \mathrm {and}\\ {\mathcal {R}}_r&= \{ (\sigma ,j)\in (\{-1,1\})^r \times ({\mathbb {Z}}^d)^{r} : (\sigma ,j) \text { is resonant}\}. \end{aligned}$$

On the one hand, as a direct corollary of Lemma 2.9 and Corollary 2.6, for all \(\gamma >0\) we have

$$\begin{aligned}&\left| \left\{ a \in (1,4)^d : \exists r\ge 3, \exists (\sigma ,j) \in {\mathcal {M}}_r {\setminus } {\mathcal {R}}_r, \ \sigma _1 \sigma _2 = -1\quad \mathrm {and}\right. \right. \\&\quad \left. \left. \left| \sum _{k=1}^r \sigma _k \sqrt{1+|j_k|_a^4} \right|< c_{r,d}\gamma ^{8r^4} \langle j_1 \rangle ^{-(d-1)} \log ^{-2d}(\langle j_1\rangle ) (\langle j_3 \rangle \dots \langle j_r \rangle )^{-44dr^4}\right\} \right| < \gamma \end{aligned}$$

where \(c_{r,d}>0\) is a constant depending only on r and d. Consequently, it is enough to prove that for all \(\gamma \in (0,1)\), we have

$$\begin{aligned}&I_{\gamma }:=\left| \left\{ a \in (1,4)^d : \exists r\ge 3, \exists (\sigma ,j) \in {\mathcal {M}}_r {\setminus } {\mathcal {R}}_r, \ \sigma _1 \sigma _2 = 1 \quad \mathrm {and}\right. \right. \nonumber \\&\quad \left. \left. \left| \sum _{k=1}^r \sigma _k \sqrt{1+|j_k|_a^4} \right|< \kappa _{r,d} \gamma ^{r(r+1)} (\langle j_3 \rangle \dots \langle j_r \rangle )^{-9 d r^2}\right\} \right| < \gamma \end{aligned}$$
(2.17)

where \(\kappa _{r,d}\in (0,1)\) is another constant depending only on r and d (and that will be determined later). Indeed, by subadditivity of the measure, we have

$$\begin{aligned} I_{\gamma }&\le \sum _{r\ge 3} \sum _{\begin{array}{c} (\sigma ,j) \in {\mathcal {M}}_r {\setminus } {\mathcal {R}}_r\\ \sigma _1 \sigma _2=1 \end{array}} \left| \left\{ a \in (1,4)^d :\left| \sum _{k=1}^r \sigma _k \sqrt{1+|j_k|_a^4} \right| \right. \right. \\&< \left. \left. \kappa _{r,d} \gamma ^{r(r+1)} (\langle j_3 \rangle \dots \langle j_r \rangle )^{-9 d r^2}\right\} \right| . \end{aligned}$$

Note that if \(|j_1| \ge 2\sqrt{r} \langle j_3 \rangle \dots \langle j_r \rangle \) and \(\sigma _1 \sigma _2 = 1\) then

$$\begin{aligned}&\left| \sum _{k=1}^r \sigma _k \sqrt{1+|j_k|_a^4} \right| \ge \sqrt{1+|j_1|_a^4} - \sum _{k=3}^r \sqrt{1+|j_k|_a^4} \ge \sqrt{1+|j_1|^4} - \sum _{k=3}^r \sqrt{1+16|j_k|^4} \\&\quad \ge \sqrt{1+|j_1|^4} - 4 \sum _{k=3}^r \sqrt{1+|j_k|^4} \ge |j_1|^2 - 4 \sum _{k=3}^r (1+|j_k|^2) \ge 4 (\langle j_3 \rangle \dots \langle j_r \rangle )^2 >1 \end{aligned}$$

and so \(\Big |\{ a \in (1,4)^d : \big | \sum _{k=1}^r \sigma _k \sqrt{1+|j_k|_a^4} \big | < \kappa _{r,d} \gamma ^{r(r+1)} (\langle j_3 \rangle \dots \langle j_r \rangle )^{-9d r^2}\} \Big | \) vanishes. Since the same holds with \(j_1\) replaced by \(j_2\), \(I_{\gamma }\) is bounded from above by

$$\begin{aligned} \sum _{r\ge 3} \sum _{\begin{array}{c} (\sigma ,j) \in {\mathcal {M}}_r {\setminus } {\mathcal {R}}_r\\ |j_1| \le 2\sqrt{r} \langle j_3 \rangle \dots \langle j_r \rangle \\ |j_2| \le 2\sqrt{r} \langle j_3 \rangle \dots \langle j_r \rangle \end{array}} \left| \left\{ a \in (1,4)^d : \left| \sum _{k=1}^r \sigma _k \sqrt{1+|j_k|_a^4} \right| < \kappa _{r,d} \gamma ^{r(r+1)} (\langle j_3 \rangle \dots \langle j_r \rangle )^{-9 d r^2}\right\} \right| . \end{aligned}$$

Now denoting by \(c_{r,d}>0\) the constant given by Lemma 2.3, we get

$$\begin{aligned} I_{\gamma } \le \sum _{r\ge 3} c_{r,d} \sum _{\begin{array}{c} (\sigma ,j) \in {\mathcal {M}}_r {\setminus } {\mathcal {R}}_r\\ |j_1| \le 2\sqrt{r} \langle j_3 \rangle \dots \langle j_r \rangle \\ |j_2| \le 2\sqrt{r} \langle j_3 \rangle \dots \langle j_r \rangle \end{array}} \left( \kappa _{r,d} \gamma ^{r(r+1)} (\langle j_3 \rangle \dots \langle j_r \rangle )^{-9 d r^2} \right) ^{\frac{1}{r(r+1)}} (\langle j_1 \rangle \dots \langle j_r \rangle )^{\frac{12}{r+1}}. \end{aligned}$$

Consequently, we get another constant \({\widetilde{c}}_{r,d}>0\) such that

$$\begin{aligned} I_{\gamma } \le \gamma \sum _{r\ge 3} {\widetilde{c}}_{r,d} \kappa _{r,d}^{\frac{1}{r(r+1)}} \sum _{j_3,\dots ,j_r \in {\mathbb {Z}}^d} (\langle j_3 \rangle \dots \langle j_r \rangle )^{-9 d \frac{r^2}{r(r+1)} +\frac{36}{r+1}}. \end{aligned}$$

Noting that \(9 d \frac{r^2}{r(r+1)} -\frac{36}{r+1}\ge 2d\), we deduce that

$$\begin{aligned} I_{\gamma } \le \gamma \sum _{r\ge 3} {\widetilde{c}}_{r,d}\, \kappa _{r,d}^{\frac{1}{r(r+1)}} \left( \sum _{j\in {\mathbb {Z}}^d} \langle j\rangle ^{-2d} \right) ^{r-2}. \end{aligned}$$

Consequently, choosing \(\kappa _{r,d}\) small enough that \({\widetilde{c}}_{r,d}\, \kappa _{r,d}^{\frac{1}{r(r+1)}} \big ( \sum _{j\in {\mathbb {Z}}^d} \langle j\rangle ^{-2d} \big )^{r-2}\le 2^{-r}\) for every \(r\ge 3\), we get \(I_{\gamma }<\gamma \), which concludes the proof. \(\square \)

3 The Birkhoff Normal Form Step

In the rest of the paper we fix the parameter \(\nu \) (see (1.2) and (1.12)), which defines the irrationality of the torus, in the full Lebesgue measure set given by Proposition 2.2. For \(d\ge 2\) and \(n\in {\mathbb {N}}\) we define

$$\begin{aligned} M_{d,n}:=\left\{ \begin{array}{ll} n+2(n-2)+1 &{}\quad \mathrm{if}\quad d=2\quad \mathrm{and}\quad n\;\; \mathrm{odd}\\ n+2(n-2) &{}\quad \mathrm{if}\quad d=2\quad \mathrm{and}\quad n\;\; \mathrm{even}\\ n+(n-2) &{}\quad \mathrm{if}\quad d=3\\ n &{} \quad \mathrm{if}\quad d\ge 4. \end{array}\right. \end{aligned}$$
(3.1)
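The piecewise definition (3.1) can be transcribed directly for checking concrete values (an illustrative sketch):

```python
# A direct transcription of the definition (3.1) of M_{d,n}.
def M(d, n):
    if d == 2:
        return n + 2 * (n - 2) + (1 if n % 2 == 1 else 0)
    if d == 3:
        return n + (n - 2)
    return n  # d >= 4

assert M(2, 5) == 12 and M(2, 4) == 8 and M(3, 5) == 8 and M(7, 5) == 5
```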

The main result of this section is the following.

Theorem 2

Let \(d=2,3\) and let \(r\in {\mathbb {N}}\) be such that \(M_{d,n}\le r\le 4n\). There exists \(\beta =\beta (d,r)>0\) such that for any \(N\ge 1\), any \(\delta >0\) and any \(s\ge s_0=s_0(\beta )\), there exist \(\varepsilon _0\lesssim _{s,\delta } N^{-\delta }\) and two canonical transformations \(\tau ^{(0)}\) and \(\tau ^{(1)}\) making the following diagram commute

(3.2)

and close to the identity

$$\begin{aligned} \forall \sigma \in \{0,1\}, \ \Vert u\Vert _{{H}^s} <2^{\sigma }\varepsilon _0 \;\; \Rightarrow \;\; \Vert \tau ^{(\sigma )}(u)-u\Vert _{{H}^s} \lesssim _{s,\delta } N^{\delta } \Vert u\Vert _{{H}^s}^2 \end{aligned}$$
(3.3)

such that, on \(B_s(0,2\varepsilon _0)\), \(H \circ \tau ^{(1)}\) can be written as

$$\begin{aligned} H \circ \tau ^{(1)} = Z_2+ \sum _{k=n}^{M_{d,n}-1}Z_{k}^{\le N}+\sum _{k= M_{d,n}}^{r-1}K_{k}+K^{>N}+{\tilde{R}}_{r} \end{aligned}$$
(3.4)

where \(M_{d,n}\) is given in (3.1) and where

  1. (i)

    \(Z_{k}^{\le N}\), for \(k=n,\ldots , M_{d,n}-1\), are resonant Hamiltonians of order k given by the formula

    $$\begin{aligned} Z_k^{\le N} = \sum _{\begin{array}{c} \sigma \in \{-1,1\}^k,\ j\in ({\mathbb {Z}}^{d})^{k},\ \mu _2(j)\le N\\ \sum _{i=1}^k\sigma _i j_i=0\\ \sum _{i=1}^k\sigma _i \omega _{j_i}=0 \end{array}} (Z_{k}^{\le N})_{\sigma ,j}u_{j_1}^{\sigma _1}\cdots u_{j_k}^{\sigma _k}, \quad |(Z_{k}^{\le N})_{\sigma ,j}|\lesssim _{\delta }N^{\delta } \frac{\mu _3(j)^\beta }{\mu _1(j)}; \end{aligned}$$
    (3.5)
  2. (ii)

    \(K_k\), \(k=M_{d,n},\ldots , r-1\), are homogeneous polynomials of order k

    $$\begin{aligned} K_k= \sum _{\begin{array}{c} \sigma \in \{-1,1\}^k,\ j\in ({\mathbb {Z}}^{d})^{k}\\ \sum _{i=1}^k\sigma _i j_i=0 \end{array}} (K_{k})_{\sigma ,j}u_{j_1}^{\sigma _1}\cdots u_{j_k}^{\sigma _k}, \quad |(K_k)_{\sigma ,j}|\lesssim _{\delta } N^{\delta } \mu _3(j)^\beta ; \end{aligned}$$
    (3.6)
  3. (iii)

    \(K^{>N}\) and \({\tilde{R}}_{r}\) are remainders satisfying

    $$\begin{aligned} \Vert X_{K^{>N}}(u)\Vert _{H^{s}}&\lesssim _{s,\delta }N^{-1+\delta }\Vert u\Vert _{H^{s}}^{n-1}, \end{aligned}$$
    (3.7)
    $$\begin{aligned} \Vert X_{{{\tilde{R}}}_r}(u)\Vert _{H^{s}}&\lesssim _{s,\delta } N^{\delta } \Vert u\Vert _{H^s}^{r-1}. \end{aligned}$$
    (3.8)

It is convenient to introduce the following class.

Definition 3.1

(Formal Hamiltonians) Let \(N\in {\mathbb {R}}\) with \(N\ge 1\) and let \(k\in {\mathbb {N}}\) with \(k\ge 3\).

  1. (i)

    We denote by \( {\mathcal {L}}_k\) the set of Hamiltonians of homogeneity \(k\) that may be written in the form

    $$\begin{aligned} G_{k}(u)&= \sum _{\begin{array}{c} \sigma _i\in \{-1,1\},\ j_i\in {\mathbb {Z}}^d\\ \sum _{i=1}^k\sigma _i j_i=0 \end{array}} (G_{k})_{\sigma ,j}u_{j_1}^{\sigma _1}\cdots u_{j_k}^{\sigma _k}, \quad (G_{k})_{\sigma ,j}\in {\mathbb {C}}, \quad \begin{array}{cl}&{}\sigma :=(\sigma _1,\ldots ,\sigma _k)\\ &{}j:=(j_1,\ldots ,j_k)\end{array} \end{aligned}$$
    (3.9)

    with symmetric coefficients \((G_k)_{\sigma ,j}\), i.e. for any \(\rho \in {\mathfrak {S}}_{k}\) one has \((G_{k})_{\sigma ,j}=(G_{k})_{\sigma \circ \rho ,j\circ \rho }\).

  2. (ii)

    If \(G_{k}\in {\mathcal {L}}_{k}\) then \(G_{k}^{>N}\) denotes the element of \({\mathcal {L}}_{k}\) defined by

    $$\begin{aligned} (G_{k}^{>N})_{\sigma ,j}:=\left\{ \begin{array}{lll} &{}(G_{k})_{\sigma ,j},&{}\mathrm{if} \;\;\mu _{2}(j)>N,\\ &{}0, &{} \mathrm{else}. \end{array} \right. \end{aligned}$$
    (3.10)

    We set \(G^{\le N}_{k}:=G_{k}-G^{>N}_{k}\).
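The truncation in (3.10) can be sketched on a coefficient dictionary (a toy illustration with scalar modes; \(\mu _{2}(j)\) is assumed to denote the second largest of the \(|j_i|\), the convention used in the paper):

```python
# Toy sketch of the truncation (3.10). Modes are scalars here for simplicity;
# mu2(j) is assumed to be the second largest of the |j_i|.
def mu2(j):
    return sorted((abs(x) for x in j), reverse=True)[1]

def truncate_above(G, N):
    # G^{>N}: keep only the coefficients with mu2(j) > N
    return {j: c for j, c in G.items() if mu2(j) > N}

G = {(5, 1, -1): 2.0, (7, 6, -2): 1.5, (3, 3, 0): 0.5}
assert truncate_above(G, 4) == {(7, 6, -2): 1.5}
# G^{<= N} is then the complement G - G^{>N}.
```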

Remark 3.2

Consider the Hamiltonian H in (1.20) and its Taylor expansion in (1.24). One can note that the Hamiltonians \(H_{k}\) in (1.26) belong to the class \({\mathcal {L}}_{k}\). This follows from the fact that, without loss of generality, one can substitute the Hamiltonian \(H_{k}\) with its symmetrization.

We also need the following definition.

Definition 3.3

Consider the Hamiltonian \(Z_2\) in (1.25) and \(G_{k}\in {\mathcal {L}}_{k}\).

  • (Adjoint action). We define the adjoint action \(\mathrm{ad}_{Z_2}G_k\) in \({\mathcal {L}}_{k}\) by

    $$\begin{aligned} (\mathrm{ad}_{Z_2}G_k)_{\sigma ,j}:= \Big (\mathrm{i} \sum _{i=1}^{k}\sigma _i\omega _{j_i}\Big ) (G_{k})_{\sigma ,j}. \end{aligned}$$
    (3.11)
  • (Resonant Hamiltonian). We define \(G_{k}^{\mathrm{res}}\in {\mathcal {L}}_{k}\) by

    $$\begin{aligned} (G_{k}^{\mathrm{res}})_{\sigma ,j}:=(G_{k})_{\sigma ,j},\quad \mathrm{when}\quad \sum _{i=1}^{k}\sigma _i\omega _{j_i}=0 \end{aligned}$$

    and \((G_{k}^{\mathrm{res}})_{\sigma ,j}=0\) otherwise.

  • We define \(G_{k}^{(+1)}\in {\mathcal {L}}_{k}\) by

    $$\begin{aligned}&(G_{k}^{(+1)})_{\sigma ,j}:=(G_{k})_{\sigma ,j},\quad \mathrm{when}\quad \exists i,p=1,\ldots ,k\;\mathrm{s.t.}\\&\mu _{1}(j)=|j_{i}|,\quad \mu _{2}(j)=|j_{p}| \quad \mathrm{and}\quad \sigma _{i}\sigma _{p}=+1. \end{aligned}$$

    We define \(G_{k}^{(-1)}:=G_k-G_{k}^{(+1)}\).

Remark 3.4

Notice that, in view of Proposition 2.2, the resonant Hamiltonians given in Definition 3.3 must be supported on indices \(\sigma \in \{-1,1\}^{k}\), \(j\in {\mathbb {Z}}^{kd}\) which are resonant according to Definition 2.1. We remark that \((G_{k})^{\mathrm{res}}\equiv 0\) if k is odd.

In the following lemma we collect some properties of the Hamiltonians in Definition 3.1.

Lemma 3.5

Let \(N\ge 1\), \(0\le \delta _i<1\), \(q_i\in {\mathbb {R}}\), \(k_i\ge 3\), consider \(G^i_{k_i}(u)\) in \({\mathcal {L}}_{k_i}\) for \(i=1,2\). Assume that the coefficients \((G^i_{k_i})_{\sigma ,j}\) satisfy

$$\begin{aligned} |(G^i_{k_i})_{\sigma ,j}|\le C_{i} N^{\delta _i} \mu _{3}(j)^{\beta _i}\mu _{1}(j)^{-q_{i}}, \quad \forall \sigma \in \{-1,+1\}^{k_i},\; j\in {\mathbb {Z}}^{k_i d}, \end{aligned}$$
(3.12)

for some \(\beta _i>0\) and \(C_i>0\), \(i=1,2\).

  1. (i)

    (Estimates on Sobolev spaces) Set \(k=k_i\), \(\delta =\delta _i\), \(q=q_i\), \(\beta =\beta _i\), \(C=C_i\) and \(G^i_{k_i}=G_k\) for \(i=1,2\). There is \(s_0=s_0(\beta ,d)\) such that for \(s\ge s_0\), \(G_{k}\) naturally defines a smooth function from \(H^{s}({\mathbb {T}}^{d})\) to \({\mathbb {R}}\). In particular one has the following estimates:

    $$\begin{aligned} |G_{k}(u)|&\lesssim _{s}CN^{\delta }\Vert u\Vert _{H^{s}}^{k}, \end{aligned}$$
    (3.13)
    $$\begin{aligned} \Vert X_{G_k}(u)\Vert _{H^{s+q}}&\lesssim _s CN^{\delta }\Vert u\Vert _{H^{s}}^{k-1}, \end{aligned}$$
    (3.14)
    $$\begin{aligned} \Vert X_{G_{k}^{>N}}(u)\Vert _{H^{s}}&\lesssim _{s} CN^{-q+\delta }\Vert u\Vert ^{k-1}_{H^{s}}, \end{aligned}$$
    (3.15)

    for any \(u\in H^{s}({\mathbb {T}}^{d}).\)

  2. (ii)

    (Poisson bracket) The Poisson bracket between \(G^1_{k_1}\) and \(G^{2}_{k_2}\) is an element of \({\mathcal {L}}_{k_1+k_2-2}\) and it verifies the estimate

    $$\begin{aligned} |(\{G^1_{k_1},G^2_{k_2}\})_{\sigma ,j}|\lesssim _{s} C_1 C_2 N^{\delta _1+\delta _2}\mu _{3}(j)^{\beta _1+\beta _2}\mu _1(j)^{-\min \{q_1,q_2\}}, \end{aligned}$$
    (3.16)

    for any \(\sigma \in \{+1,-1\}^{k_1+k_2-2}\) and \( j\in {\mathbb {Z}}^{d(k_1+k_2-2)}.\)

Proof

We prove item (i). Concerning the proof of (3.13), it is sufficient to treat the case \(q=0\). For convenience, and without loss of generality, we assume \(C_{i}=1\), \(i=1,2\). We have

$$\begin{aligned} |G_k(u)|&\le k!\sum _{\begin{array}{c} j_1,\ldots ,j_k\in {\mathbb {Z}}^d \\ |j_1|\ge |j_{2}|\ge |j_3|\ge \ldots \ge |j_{k}| \end{array}} |(G_k)_{\sigma ,j}||u_{j_1}^{\sigma _1}|\cdots |u_{j_k}^{\sigma _k}| \\&\lesssim _{k} N^{\delta }\sum _{j_3\in {\mathbb {Z}}^d}|j_3|^{\beta }|u_{j_3}^{\sigma _3}| \prod _{\begin{array}{c} i=1 \\ i\ne 3 \end{array}}^{k}\sum _{j_i\in {\mathbb {Z}}^d}|u_{j_i}^{\sigma _i}| \lesssim _{k,\epsilon } N^{\delta }\Vert u\Vert _{H^{d/2+\beta +\epsilon }}\Vert u\Vert _{H^{d/2+\epsilon }}^{k-1}, \end{aligned}$$

for any \(\epsilon >0\). This proves (3.13) with \(s_0=d/2+\epsilon +\beta \).

We now prove (3.14). Since the coefficients of \(G_k\) are symmetric, we have

$$\begin{aligned} \partial _{{\bar{u}}_n}G_k(u) = k \sum _{\sigma _1 j_1+\dots +\sigma _{k-1} j_{k-1}=n} (G_k)_{(\sigma ,-1),(j,n)} u_{j_1}^{\sigma _1} \dots u_{j_{k-1}}^{\sigma _{k-1}}. \end{aligned}$$

Therefore, we have

$$\begin{aligned} \begin{aligned} \langle n\rangle ^{s+q}|\partial _{{\bar{u}}_n}G_k(u)|&\le k! \sum _{\begin{array}{c} \sigma _1 j_1+\dots +\sigma _{k-1} j_{k-1}=n \\ |j_1|\ge \dots \ge |j_{k-1}| \end{array}} |(G_k)_{(\sigma ,-1),(j,n)}| |u_{j_1}^{\sigma _1}| \dots |u_{j_{k-1}}^{\sigma _{k-1}}| \langle n\rangle ^{s+q} \\&\mathop {\lesssim }^{(3.12)} N^{\delta } \sum _{\begin{array}{c} \sigma _1 j_1+\dots +\sigma _{k-1} j_{k-1}=n \\ |j_1|\ge \dots \ge |j_{k-1}| \end{array}} \mu _{3}(j,n)^{\beta }\mu _{1}(j,n)^{-q} |u_{j_1}^{\sigma _1}| \dots |u_{j_{k-1}}^{\sigma _{k-1}}| \langle n\rangle ^{s+q}. \end{aligned} \end{aligned}$$

We note that in the last sum above, we have \(\langle n\rangle \lesssim \langle j_1 \rangle \), \(\mu _{1}(j,n)\ge \langle j_1 \rangle \) and \(\mu _{3}(j,n)\le \langle j_2 \rangle \). As a consequence, we deduce that

$$\begin{aligned} \begin{aligned} \langle n\rangle ^{s+q}|\partial _{{\bar{u}}_n}G_k(u)|&\lesssim _{s} N^{\delta } \sum _{\begin{array}{c} \sigma _1 j_1+\dots +\sigma _{k-1} j_{k-1}=n \\ |j_1|\ge \dots \ge |j_{k-1}| \end{array}} \langle j_1 \rangle ^{s} \langle j_2 \rangle ^{\beta } |u_{j_1}^{\sigma _1}| \dots |u_{j_{k-1}}^{\sigma _{k-1}}| \\&\lesssim _{s} N^{\delta } \sum _{j_1+\dots +j_{k-1}=n } \langle j_1 \rangle ^{s} \langle j_2 \rangle ^{\beta } |u_{j_1}| \dots |u_{j_{k-1}}|. \end{aligned} \end{aligned}$$

Consequently, applying the Young convolution inequality, we get

$$\begin{aligned} \Vert X_{G_k}(u) \Vert _{H^{s+q}}&= \Vert (\langle n\rangle ^{s+q}|\partial _{{\bar{u}}_n}G_k(u)|)_{n\in {\mathbb {Z}}^d} \Vert _{\ell ^2} \\&\lesssim _s N^{\delta }\Vert u \Vert _{H^s} \left( \sum _{j\in {\mathbb {Z}}^d} \langle j \rangle ^\beta |u_j| \right) \left( \sum _{j\in {\mathbb {Z}}^d} |u_j| \right) ^{k-3} \\&\lesssim _s N^{\delta } \Vert u\Vert _{H^s}^{k-1}. \end{aligned}$$

The proof of (3.15) follows the same lines. Item (ii) of the lemma is a direct consequence of the previous computations, the definition (1.23) and the momentum condition. \(\square \)
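The convolution step in the proof rests on Young's inequality \(\Vert a*b\Vert _{\ell ^2}\le \Vert a\Vert _{\ell ^2}\Vert b\Vert _{\ell ^1}\). A quick numerical sanity check of this inequality (random data, purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.standard_normal(64)   # plays the role of the ell^2 factor (<n>^s |u_n|)_n
b = rng.standard_normal(64)   # plays the role of the ell^1 factor (<n>^beta |u_n|)_n

conv = np.convolve(a, b)      # full discrete convolution, indexed over Z

lhs = np.linalg.norm(conv)                           # ||a * b||_{l2}
rhs = np.linalg.norm(a) * np.linalg.norm(b, ord=1)   # ||a||_{l2} * ||b||_{l1}
assert lhs <= rhs
```

In the proof above the \(\ell ^2\) factor carries the weight \(\langle j_1\rangle ^{s}\) and the remaining \(k-2\) factors are controlled in \(\ell ^1\) by \(\Vert u\Vert _{H^{s}}\) for \(s>d/2\).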

We are in position to prove the main Birkhoff result.

Proof of Theorem 2

In the case \(d=2\) we perform two steps of the Birkhoff normal form procedure, see Lemmata 3.8, 3.12. The case \(d=3\) is slightly different. Indeed, due to the estimates on the small divisors given in Proposition 2.2, the Hamiltonian in (3.24) already has the form (3.4), since the coefficients of the Hamiltonians \({\tilde{K}}_{k}\) (see (3.25)) no longer decay in the largest index \(\mu _1(j)\). The proof of Theorem 2 is then concluded after just one step of Birkhoff normal form.

Step 1 if \(d=2\) or \(d=3\). We have the following lemma.

Lemma 3.6

(Homological equation 1) Let \(q_{d}=3-d\) for \(d=2,3\). For any \(N\ge 1\) and \(\delta >0\) there exist multilinear Hamiltonians \(\chi ^{(1)}_{k}\), \(k=n,\ldots , 2n-3\) in the class \({\mathcal {L}}_{k}\) with coefficients \((\chi _{k}^{(1)})_{\sigma ,j}\) satisfying

$$\begin{aligned} |(\chi _{k}^{(1)})_{\sigma ,j}|\lesssim _{\delta } N^{\delta } \mu _3(j)^{\beta }\mu _1(j)^{-q_{d}}, \end{aligned}$$
(3.17)

such that (recall Definition 3.3)

$$\begin{aligned} \{\chi _{k}^{(1)},Z_{2}\}+H_{k}=Z_{k}+H_{k}^{>N},\quad k=n,\ldots ,2n-3, \end{aligned}$$
(3.18)

where \(Z_2\), \(H_k\) are given in (1.25), (1.26) and \(Z_{k}\) is the resonant Hamiltonian defined as

$$\begin{aligned} Z_{k}:=(H_{k}^{\le N})^{\mathrm{res}}, \quad k=n,\ldots ,2n-3. \end{aligned}$$
(3.19)

Moreover \(Z_{k}\) belongs to \({\mathcal {L}}_{k}\) and has coefficients satisfying (3.5).

Proof

Consider the Hamiltonians \(H_{k}\) in (1.26) with coefficients satisfying (1.27). Recalling Definition 3.1 we write

$$\begin{aligned} H_{k}=Z_{k}+(H_{k}^{\le N}-Z_{k})+H^{>N}_{k},\quad k=n,\ldots ,2n-3, \end{aligned}$$

with \(Z_k\) as in (3.19). We define

$$\begin{aligned} \chi _{k}^{(1)}:=(\mathrm{ad }_{Z_2})^{-1}\Big [H_{k}^{\le N}-Z_{k}\Big ],\quad k=n,\ldots ,2n-3, \end{aligned}$$
(3.20)

where \(\mathrm{ad}_{Z_2}\) is given by Definition 3.3. In particular (recall formula (3.11)) their coefficients have the form

$$\begin{aligned} (\chi _{k}^{(1)})_{\sigma ,j}:=(H_{k})_{\sigma ,j}\left( \mathrm{i} \sum _{i=1}^{k}\sigma _i\omega _{j_i}\right) ^{-1} \end{aligned}$$
(3.21)

for indices \(\sigma \in \{-1,+1\}^{k}\), \(j\in ({\mathbb {Z}}^{d})^{k}\) such that

$$\begin{aligned} \sum _{i=1}^{k}\sigma _i j_i=0, \quad \mu _{2}(j)\le N\quad \mathrm{and}\quad \sum _{i=1}^{k}\sigma _i\omega _{j_i}\ne 0. \end{aligned}$$

By (1.27) and Proposition 2.2 (with \(d=2,3\)) we deduce the bound (3.17) for some \(\beta >0\). The resonant Hamiltonians \(Z_{k}\) in (3.19) have the form (3.5). One can check by an explicit computation that Eq. (3.18) is verified. \(\square \)
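For the reader's convenience, the explicit computation behind (3.18) can be sketched as follows (we use the sign convention \(\mathrm{ad}_{Z_2}G=\{Z_2,G\}\), consistent with (3.20) and (3.21)). Coefficientwise,

```latex
\begin{aligned}
(\{\chi_k^{(1)},Z_2\})_{\sigma,j}
  = -\,(\mathrm{ad}_{Z_2}\chi_k^{(1)})_{\sigma,j}
  \overset{(3.11),(3.21)}{=} -\,(H_k^{\le N}-Z_k)_{\sigma,j},
\end{aligned}
```

so that \(\{\chi _k^{(1)},Z_2\}+H_k=H_k-(H_k^{\le N}-Z_k)=Z_k+H_k^{>N}\), which is exactly (3.18).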

We shall use the Hamiltonians \(\chi ^{(1)}_{k}\) given by Lemma 3.6 to generate a symplectic change of coordinates.

Lemma 3.7

Let us define

$$\begin{aligned} \chi ^{(1)}:=\sum _{k=n}^{2n-3}\chi ^{(1)}_{k}. \end{aligned}$$
(3.22)

There is \(s_0=s_0(d,r)\) such that for any \(\delta >0\), for any \(N\ge 1\) and any \(s\ge s_0\), if \(\varepsilon _0\lesssim _{s,\delta } N^{-\delta }\), then the problem

$$\begin{aligned} \left\{ \begin{aligned}&\partial _{\tau }Z(\tau )=X_{{\chi ^{(1)}}}(Z(\tau ))\\&Z(0)=U={\bigl [{\begin{matrix}u\\ {\bar{u}}\end{matrix}}\bigr ]},\quad u\in B_{s}(0,\varepsilon _0) \end{aligned} \right. \end{aligned}$$
(3.23)

has a unique solution \(Z(\tau )=\Phi ^{\tau }_{\chi ^{(1)}}(u)\) belonging to \(C^{k}([-1,1];H^{s}({\mathbb {T}}^{d}))\) for any \(k\in {\mathbb {N}}\). Moreover the map \(\Phi _{\chi ^{(1)}}^{\tau } : B_{s}(0,\varepsilon _0)\rightarrow H^{s}({\mathbb {T}}^{d})\) is symplectic. The flow map \(\Phi ^{\tau }_{\chi ^{(1)}}\) and its inverse \(\Phi ^{-\tau }_{\chi ^{(1)}}\) satisfy

$$\begin{aligned} \begin{aligned}&\sup _{\tau \in [0,1]} \Vert \Phi ^{\pm \tau }_{\chi ^{(1)}}(u)-u\Vert _{H^{s}} \lesssim _{s,\delta } N^{\delta }\Vert u\Vert _{H^{s}}^{n-1},\\&\sup _{\tau \in [0,1]} \Vert \mathrm{d}\Phi ^{\pm \tau }_{\chi ^{(1)}}(u)[\cdot ]\Vert _{{\mathcal {L}}(H^{s};H^{s})} \le 2. \end{aligned} \end{aligned}$$

Proof

By the estimate (3.17) and Lemma 3.5, the vector field \(X_{\chi ^{(1)}}\) is a bounded operator on \(H^{s}({\mathbb {T}}^{d})\). Hence the flow \(\Phi ^{\tau }_{\chi ^{(1)}}\) is well defined by the standard theory of ODEs on Banach spaces. The estimates on the map and its differential follow from the equation in (3.23), the multilinearity of \(\chi ^{(1)}\) and the smallness condition on \(\varepsilon _0\). Finally, the map is symplectic since it is generated by a Hamiltonian vector field. \(\square \)
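The Banach-space ODE theory invoked here is, at bottom, a Picard fixed-point iteration on the integral equation \(Z(\tau )=U+\int _0^{\tau }X(Z(s))\,\mathrm {d}s\). The following toy sketch illustrates the iteration numerically; the cubic field \(X(z)=\mathrm{i}|z|^2z\), the dimension and all parameters are illustrative assumptions, not the actual \(X_{\chi ^{(1)}}\):

```python
import numpy as np

# Picard iteration Z_{m+1}(tau) = U + int_0^tau X(Z_m(s)) ds on a time grid,
# for a toy cubic vector field on C^4 with small initial data.
def picard_flow(U, X, tau=1.0, steps=200, iters=25):
    ts = np.linspace(0.0, tau, steps + 1)
    Z = np.tile(U, (steps + 1, 1)).astype(complex)   # Z_0(tau) = U
    for _ in range(iters):
        F = X(Z)                                     # X(Z_m(s)) at the grid points
        # trapezoidal quadrature of int_0^tau X(Z_m(s)) ds
        integral = np.cumsum((F[:-1] + F[1:]) / 2, axis=0) * (ts[1] - ts[0])
        Znew = Z.copy()
        Znew[1:] = U + integral
        Z = Znew
    return Z[-1]

X = lambda Z: 1j * (np.abs(Z)**2) * Z
U = 0.1 * np.ones(4, dtype=complex)
ZT = picard_flow(U, X)

# For X(z) = i|z|^2 z each |z_j| is conserved, so the flow is known exactly:
exact = np.exp(1j * np.abs(U)**2 * 1.0) * U
assert np.allclose(ZT, exact, atol=1e-6)
```

Smallness of the data makes the iteration a contraction on \(C^0([0,1])\), which is the same mechanism behind the condition \(\varepsilon _0\lesssim _{s,\delta }N^{-\delta }\) in the lemma.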

We now study how the Hamiltonian H in (1.24) changes under the map \(\Phi ^{\tau }_{\chi ^{(1)}}\).

Lemma 3.8

(The new Hamiltonian 1) There is \(s_0=s_0(d,r)\) such that for any \(N\ge 1\), \(\delta >0\) and any \(s\ge s_0\), if \(\varepsilon _0\lesssim _{s,\delta } N^{-\delta }\) then we have that

$$\begin{aligned} H\circ \Phi _{\chi ^{(1)}}=Z_{2}+\sum _{k=n}^{2n-3}Z_{k}+{\widetilde{K}}^{>N} +\sum _{k=2n-2}^{r-1}{\widetilde{K}}_{k}+{\mathcal {R}}_{r} \end{aligned}$$
(3.24)

where

  • \(\Phi _{\chi ^{(1)}}:=(\Phi ^{\tau }_{\chi ^{(1)}})_{|\tau =1}\) is the flow map given by Lemma 3.7;

  • the resonant Hamiltonians \(Z_k\) are defined in (3.19);

  • \({\widetilde{K}}_k\) are in \({\mathcal {L}}_{k}\) with coefficients \(({\widetilde{K}}_k)_{\sigma ,j}\) satisfying

    $$\begin{aligned} |({\widetilde{K}}_k)_{\sigma ,j}|\lesssim _{\delta } N^{\delta } \mu _3(j)^{\beta }\mu _{1}(j)^{-q_{d}}, \quad k=2n-2,\ldots , r-1, \end{aligned}$$
    (3.25)

    with \(q_{d}=3-d\) for \(d=2,3\);

  • the Hamiltonian \({\widetilde{K}}^{>N}\) and the remainder \({\mathcal {R}}_r\) satisfy

    $$\begin{aligned} \Vert X_{{\widetilde{K}}^{>N}}(u)\Vert _{H^{s}}&\lesssim _{s,\delta } N^{-1}\Vert u\Vert _{H^{s}}^{n-1}, \end{aligned}$$
    (3.26)
    $$\begin{aligned} \Vert X_{{\mathcal {R}}_r}(u)\Vert _{H^{s}}&\lesssim _{s,\delta } N^{\delta }\Vert u\Vert ^{r-1}_{H^{s}}, \quad \forall u\in B_{s}(0,2\varepsilon _0). \end{aligned}$$
    (3.27)

Proof

Fix \(\delta >0\) and \(\varepsilon _0N^{\delta }\) small enough. We apply Lemma 3.7 with \(\delta \rightsquigarrow \delta '\), where \(\delta '\) is to be chosen small enough with respect to the \(\delta \) fixed here (this ensures that the smallness condition \(\varepsilon _0N^{\delta '}\lesssim _{s,\delta '}1\) of Lemma 3.7 is fulfilled). Let \(\Phi ^{\tau }_{\chi ^{(1)}}\) be the flow at time \(\tau \) of the Hamiltonian \(\chi ^{(1)}\). We note that

$$\begin{aligned} \partial _{\tau }H\circ \Phi ^{\tau }_{\chi ^{(1)}}=dH(z)[X_{\chi ^{(1)}}(z)]_{|z=\Phi ^{\tau }_{\chi ^{(1)}}} {\mathop {=}\limits ^{(1.22), (1.23)}}\{\chi ^{(1)}, H\}\circ \Phi ^{\tau }_{\chi ^{(1)}}. \end{aligned}$$

Then, for \(L\ge 2\), we get the Lie series expansion

$$\begin{aligned} H\circ \Phi _{\chi ^{(1)}}=H+\{\chi ^{(1)}, H\} +\sum _{p=2}^{L}\frac{1}{p!}\mathrm{ad}_{\chi ^{(1)}}^{p}\Big [H\Big ] +\frac{1}{L!}\int _{0}^{1}(1-\tau )^{L} \mathrm{ad}_{\chi ^{(1)}}^{L+1}\Big [H\Big ]\circ \Phi ^{\tau }_{\chi ^{(1)}}\mathrm {d}\tau \end{aligned}$$

where \(\mathrm{ad}_{\chi ^{(1)}}^{p}\) is defined recursively as

$$\begin{aligned} \mathrm{ad}_{\chi ^{(1)}}[H]:=\{\chi ^{(1)},H\},\quad \mathrm{ad}_{\chi ^{(1)}}^{p}[H]:= \big \{\chi ^{(1)}, \mathrm{ad}_{\chi ^{(1)}}^{p-1}[H] \big \},\quad p\ge 2. \end{aligned}$$
(3.28)
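On a finite-dimensional toy model the Lie series expansion above can be checked symbolically. The following sketch is purely illustrative (a one-degree-of-freedom Hamiltonian and a polynomial generator, for which the series terminates; none of these objects are the ones of the paper):

```python
import sympy as sp

q, p = sp.symbols('q p')

# Canonical Poisson bracket {f, g} = f_q g_p - f_p g_q on R^2.
def bracket(f, g):
    return sp.diff(f, q)*sp.diff(g, p) - sp.diff(f, p)*sp.diff(g, q)

H = (p**2 + q**2)/2     # toy quadratic Hamiltonian (stand-in for Z_2)
chi = q**3              # toy generator (stand-in for chi^(1))

# Time-1 flow of chi: q' = chi_p = 0, p' = -chi_q = -3q^2, hence (q,p) -> (q, p - 3q^2).
H_transported = H.subs(p, p - 3*q**2)

# Lie series H o Phi = sum_m ad_chi^m[H] / m! with ad_chi[f] := {f, chi}, cf. (3.28).
lie, term = sp.S(0), H
for m in range(5):       # here the series terminates: ad_chi^3[H] = 0
    lie += term/sp.factorial(m)
    term = bracket(term, chi)

assert sp.simplify(lie - H_transported) == 0
```

In this toy case the remainder integral of the expansion vanishes identically because \(\mathrm{ad}^{3}_{\chi }[H]=0\); in the paper it is instead controlled through Lemma 3.5.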

Recalling the Taylor expansion of the Hamiltonian H in (1.24) we obtain

$$\begin{aligned} H\circ \Phi _{\chi ^{(1)}}&= Z_{2}+\sum _{k=n}^{2n-3}\Big (H_{k} +\{\chi _{k}^{(1)},Z_2\}\Big )+\sum _{k=2n-2}^{r-1}H_{k}\nonumber \\&\quad +\sum _{p=2}^{L}\frac{1}{p!}\mathrm{ad}^{p}_{\chi ^{(1)}}[Z_2] +\sum _{j=n}^{r-1}\sum _{p=1}^{L} \frac{1}{p!}\mathrm{ad}_{\chi ^{(1)}}^{p}[H_{j}] \end{aligned}$$
(3.29)
$$\begin{aligned}&\quad +\frac{1}{L!}\int _{0}^{1}(1-\tau )^{L} \mathrm{ad}_{\chi ^{(1)}}^{L+1}[Z_{2} +\sum _{j=n}^{r-1}H_{j}]\circ \Phi ^{\tau }_{\chi ^{(1)}}\mathrm {d}\tau \end{aligned}$$
(3.30)
$$\begin{aligned}&\quad +R_{r}\circ \Phi _{\chi ^{(1)}}. \end{aligned}$$
(3.31)

We study each summand separately. First of all, by definition of \(\chi _{k}^{(1)}\) (see (3.18) in Lemma 3.6), we deduce that

$$\begin{aligned} \sum _{k=n}^{2n-3}\big (H_{k}+\{\chi _{k}^{(1)},Z_2\}\big ) =\sum _{k=n}^{2n-3}Z_{k}+{\widetilde{K}}^{>N}, \quad {\widetilde{K}}^{>N}:=\sum _{k=n}^{2n-3}H_{k}^{>N}. \end{aligned}$$
(3.32)

One can check, using Lemma 3.5 (see (3.15)), that \({\widetilde{K}}^{>N}\) satisfies (3.26). Consider now the term in (3.29). By definition of \(\chi ^{(1)}\) (see (3.18) and (3.22)), we get, for \(p=2,\ldots , L\),

$$\begin{aligned} \mathrm{ad}^{p}_{\chi ^{(1)}}[Z_2]=\mathrm{ad}^{p-1}_{\chi ^{(1)}}\Big [ \{\chi ^{(1)},Z_{2}\} \Big ]{\mathop {=}\limits ^{(3.32)}} \mathrm{ad}^{p-1}_{\chi ^{(1)}}\left[ \sum _{k=n}^{2n-3}(Z_{k}-H_{k}^{\le N})\right] . \end{aligned}$$

Therefore, by Lemma 3.5-(ii) and recalling (3.28), we get

$$\begin{aligned} {}(3.29)=\sum _{k=2n-2}^{L(2n-3)+r-1-2L}{\widetilde{K}}_{k} \end{aligned}$$

where \({\widetilde{K}}_k\) are k-homogeneous Hamiltonians in \({\mathcal {L}}_{k}\). In particular, by (3.16), (3.17) and (1.27) (with \(\delta \rightsquigarrow \delta '\)), we have

$$\begin{aligned} |({\widetilde{K}}_k)_{\sigma ,j}|\lesssim _{\delta '} N^{L\delta '} \mu _3(j)^{\beta }\mu _{1}(j)^{-q_{d}} \end{aligned}$$

for some \(\beta >0\) depending only on d and n. This implies the estimates (3.25), taking \(L\delta '\le \delta \), where L will be fixed later. Then formula (3.24) follows by setting

$$\begin{aligned} {\mathcal {R}}_{r}:=\sum _{k=r}^{L(2n-3)+r-1-2L}{\widetilde{K}}_{k}+ (3.30)+(3.31). \end{aligned}$$
(3.33)

The estimate (3.27) holds true for \(X_{{\widetilde{K}}_k}\) with \(k=r,\ldots ,L(2n-3)+r-1-2L\), thanks to (3.25) and Lemma 3.5. It remains to study the terms appearing in (3.30), (3.31). We start with the remainder in (3.31). We note that

$$\begin{aligned} X_{R_{r}\circ \Phi _{\chi ^{(1)}}}(u)= (\mathrm{d}\Phi _{\chi ^{(1)}})^{-1}(u)\Big [X_{R_r}(\Phi _{\chi ^{(1)}}(u))\Big ]. \end{aligned}$$

We obtain the estimate (3.27) on this vector field by using (1.28) and Lemma 3.7. In order to estimate the term in (3.30) we reason as follows. First notice that

$$\begin{aligned} \mathrm{ad}_{\chi ^{(1)}}^{L+1}[Z_{2}+H_{j}] {\mathop {=}\limits ^{(3.32)}} \mathrm{ad}_{\chi ^{(1)}}^{L} \left[ \sum _{k=n}^{2n-3}(Z_{k}-H_{k}^{\le N})\right] + \mathrm{ad}_{\chi ^{(1)}}^{L+1}[H_{j}]:={\mathcal {Q}}_{j} \end{aligned}$$

with \(j=n,\ldots ,r-1\). Using Lemma 3.5 we deduce that

$$\begin{aligned} \Vert X_{{\mathcal {Q}}_{j}}(u)\Vert _{H^{s}} \lesssim _{\delta '} N^{(L+1)\delta '} \Vert u\Vert _{H^{s}}^{(Ln+n-2L)-1}. \end{aligned}$$

We choose \(L=9\), which implies \(Ln+n-2L\ge r\) since \(r\le 4n\) and \(n\ge 3\). Notice also that all the summands in (3.30) are of the form

$$\begin{aligned} \int _{0}^{1}(1-\tau )^{L}{\mathcal {Q}}_{j}\circ \Phi ^{\tau }_{\chi ^{(1)}}\mathrm {d}\tau . \end{aligned}$$

Then we can estimate their vector fields by reasoning as done for the Hamiltonian \(R_{r}\circ \Phi _{\chi ^{(1)}}\). This concludes the proof. \(\square \)

Remark 3.9

(Case \(d=3\)) We remark that Theorem 2 for \(d=3\) follows by Lemmata 3.6, 3.7, 3.8, by setting \(\tau ^{(1)}:=\Phi _{\chi ^{(1)}}\) and recalling that (see (3.1)) \(M_{d,n}=2n-2\) for \(d=3\).

Step 2 if \(d=2\). This step is performed only in the case \(d=2\). Consider the Hamiltonian in (3.24). Our aim is to reduce to Birkhoff normal form all the Hamiltonians \({\widetilde{K}}_{k}\) of homogeneity \(k=2n-2,\ldots , M_{2,n}-1\), where \(M_{2,n}\) is given in (3.1). We follow the same strategy adopted in the previous step.

Lemma 3.10

(Homological equation 2) Let \(N\ge 1\), \(\delta >0\) and consider the Hamiltonian in (3.24). There exist multilinear Hamiltonians \(\chi ^{(2)}_{k}\), \(k=2n-2,\ldots , M_{2,n}-1\) in the class \({\mathcal {L}}_{k}\), with coefficients satisfying

$$\begin{aligned} |(\chi _{k}^{(2)})_{\sigma ,j}|\lesssim _{\delta } N^{\delta } \mu _3(j)^{\beta }, \end{aligned}$$
(3.34)

for some \(\beta >0\), such that

$$\begin{aligned} \{\chi _{k}^{(2)},Z_{2}\}+{\widetilde{K}}_{k}=Z_{k}+{\widetilde{K}}_{k}^{>N}, \quad k=2n-2,\ldots ,M_{2,n}-1, \end{aligned}$$
(3.35)

where \({\widetilde{K}}_k\) are given in Lemma 3.8 and \(Z_{k}\) is the resonant Hamiltonian defined as

$$\begin{aligned} Z_{k}:=({\widetilde{K}}_{k}^{\le N})^{\mathrm{res}},\quad k=2n-2,\ldots ,M_{2,n}-1. \end{aligned}$$
(3.36)

Moreover \(Z_{k}\) belongs to \({\mathcal {L}}_{k}\) and has coefficients satisfying (3.5).

Proof

Recalling Definitions 3.1, 3.3, we write

$$\begin{aligned} {\widetilde{K}}_{k}=Z_{k}+\big ({\widetilde{K}}_{k}^{\le N} -Z_{k}\big ) +{\widetilde{K}}_{k}^{> N}, \end{aligned}$$

with \(Z_{k}\) as in (3.36), and we define

$$\begin{aligned} \chi _{k}^{(2)}:=(\mathrm{ad}_{Z_2})^{-1}\Big [{\widetilde{K}}_{k}^{\le N}-Z_{k}\Big ], \quad k=2n-2,\ldots ,M_{2,n}-1. \end{aligned}$$
(3.37)

The Hamiltonians \(\chi _{k}^{(2)}\) have the form (3.9) with coefficients

$$\begin{aligned} (\chi _{k}^{(2)})_{\sigma ,j}:=({\widetilde{K}}_{k})_{\sigma ,j} \left( \mathrm{i} \sum _{i=1}^{k}\sigma _i\omega _{j_i}\right) ^{-1} \end{aligned}$$
(3.38)

for indices \(\sigma \in \{-1,+1\}^{k}\), \(j\in ({\mathbb {Z}}^{d})^{k}\) such that

$$\begin{aligned} \sum _{i=1}^{k}\sigma _i j_i=0, \quad \mu _{2}(j)\le N\quad \mathrm{and}\quad \sum _{i=1}^{k}\sigma _i\omega _{j_i}\ne 0. \end{aligned}$$

Recalling that we are in the case \(d=2\), by (3.25) and Proposition 2.2 we deduce (3.34). The resonant Hamiltonians \(Z_{k}\) in (3.36) have the form (3.5). Identity (3.35) follows by an explicit computation. \(\square \)

Lemma 3.11

Let us define

$$\begin{aligned} \chi ^{(2)}:=\sum _{k=2n-2}^{M_{2,n}-1}\chi ^{(2)}_{k}. \end{aligned}$$
(3.39)

There is \(s_0=s_0(d,r)\) such that for any \(\delta >0\), for any \(N\ge 1\) and any \(s\ge s_0\), if \(\varepsilon _0\lesssim _{s,\delta } N^{-\delta }\), then the problem

$$\begin{aligned} \left\{ \begin{aligned}&\partial _{\tau }Z(\tau )=X_{{\chi ^{(2)}}}(Z(\tau ))\\&Z(0)=U={\bigl [{\begin{matrix}u\\ {\bar{u}}\end{matrix}}\bigr ]},\quad u\in B_{s}(0,\varepsilon _0) \end{aligned}\right. \end{aligned}$$

has a unique solution \(Z(\tau )=\Phi ^{\tau }_{\chi ^{(2)}}(u)\) belonging to \(C^{k}([-1,1];H^{s}({\mathbb {T}}^{d}))\) for any \(k\in {\mathbb {N}}\). Moreover the map \(\Phi _{\chi ^{(2)}}^{\tau } : B_{s}(0,\varepsilon _0)\rightarrow H^{s}({\mathbb {T}}^{d})\) is symplectic. The flow map \(\Phi ^{\tau }_{\chi ^{(2)}}\), and its inverse \(\Phi ^{-\tau }_{\chi ^{(2)}}\), satisfy

$$\begin{aligned} \begin{aligned}&\sup _{\tau \in [0,1]}\Vert \Phi ^{\pm \tau }_{\chi ^{(2)}}(u)-u\Vert _{H^{s}} \lesssim _{s,\delta } N^{\delta }\Vert u\Vert _{H^{s}}^{n-1},\\&\sup _{\tau \in [0,1]} \Vert d\Phi ^{\pm \tau }_{\chi ^{(2)}}(u)[\cdot ]\Vert _{{\mathcal {L}}(H^{s};H^{s})} \le 2. \end{aligned} \end{aligned}$$

Proof

The proof follows by reasoning as in the proof of Lemma 3.7. \(\square \)

We have the following.

Lemma 3.12

(The new Hamiltonian 2) There is \(s_0=s_0(d,r)\) such that for any \(N\ge 1\), \(\delta >0\) and any \(s\ge s_0\), if \(\varepsilon _0\lesssim _{s,\delta } N^{-\delta }\) then we have that \(H\circ \Phi _{\chi ^{(1)}}\circ \Phi _{\chi ^{(2)}}\) has the form (3.4) and satisfies items (i), (ii), (iii) of Theorem 2.

Proof

We fix \(\delta >0\) and we apply Lemmata 3.8, 3.10 with \(\delta \rightsquigarrow \delta '\), where \(\delta '\) is to be chosen small enough with respect to the \(\delta \) fixed here.

Reasoning as in the previous step we have (recall (3.1), (3.28) and (3.24))

$$\begin{aligned}&H\circ \,\Phi _{\chi ^{(1)}}\circ \Phi _{\chi ^{(2)}}= Z_{2}+\sum _{k=n}^{2n-3}Z_{k} +\sum _{k=2n-2}^{M_{2,n}-1} \Big ({\widetilde{K}}_{k}+ \{\chi _{k}^{(2)},Z_2\}\Big )+\sum _{k=M_{2,n}}^{r-1}{\widetilde{K}}_{k} \end{aligned}$$
(3.40)
$$\begin{aligned}&\quad +{\widetilde{K}}^{>N}\circ \Phi _{\chi ^{(2)}} \end{aligned}$$
(3.41)
$$\begin{aligned}&\quad +\sum _{p=2}^{L}\frac{1}{p!}\mathrm{ad}_{\chi ^{(2)}}^{p}[Z_2]+ \sum _{p=1}^{L}\frac{1}{p!}\mathrm{ad}_{\chi ^{(2)}}^{p}\left[ \sum _{k=n}^{2n-3}Z_{k}+\sum _{k=2n-2}^{r-1}{\widetilde{K}}_{k} \right] \end{aligned}$$
(3.42)
$$\begin{aligned}&\quad + {\mathcal {R}}_{r}\circ \Phi _{\chi ^{(2)}} +\frac{1}{L!}\int _{0}^{1}(1-\tau )^{L} \mathrm{ad}^{L+1}_{\chi ^{(2)}}[Z_2]\circ \Phi _{\chi ^{(2)}}^{\tau }d\tau \end{aligned}$$
(3.43)
$$\begin{aligned}&\quad +\frac{1}{L!}\int _{0}^{1}(1-\tau )^{L}\mathrm{ad}^{L+1}_{\chi ^{(2)}} \left[ \sum _{k=n}^{2n-3}Z_{k} +\sum _{k=2n-2}^{r-1}{\widetilde{K}}_{k}\right] \circ \Phi _{\chi ^{(2)}}^{\tau }\mathrm {d}\tau , \end{aligned}$$
(3.44)

where \(\Phi ^{\tau }_{\chi ^{(2)}}\), \(\tau \in [0,1]\), is the flow at time \(\tau \) of the Hamiltonian \(\chi ^{(2)}\). We study each summand separately. First of all, thanks to (3.35), we deduce that

$$\begin{aligned} \sum _{k=2n-2}^{M_{2,n}-1}\big ({\widetilde{K}}_{k}+\{\chi _{k}^{(2)},Z_2\}\big ) =\sum _{k=2n-2}^{M_{2,n}-1}Z_{k}+{\widetilde{K}}_{+}^{>N}, \quad {\widetilde{K}}_{+}^{>N}:=\sum _{k=2n-2}^{M_{2,n}-1}{\widetilde{K}}_{k}^{>N}. \end{aligned}$$
(3.45)

One can check, using Lemma 3.5, that \({\widetilde{K}}_{+}^{>N}\) satisfies

$$\begin{aligned} \Vert X_{{\widetilde{K}}_{+}^{>N}}(u)\Vert _{H^{s}}\lesssim _{s,\delta } N^{-1+\delta '}\Vert u\Vert _{H^{s}}^{2n-3}. \end{aligned}$$
(3.46)

Consider now the terms in (3.42). First of all notice that we have

$$\begin{aligned} \mathrm{ad}_{\chi ^{(2)}}^{p}[Z_2]{\mathop {=}\limits ^{(3.45)}} \sum _{k=2n-2}^{M_{2,n}-1}\mathrm{ad}^{p-1}_{\chi ^{(2)}}\Big [ Z_{k}-{\widetilde{K}}_{k}^{\le N} \Big ], \quad p=2,\ldots ,L. \end{aligned}$$

The Hamiltonian above has homogeneity of degree at least \(4n-6\), which is greater than or equal to \(M_{2,n}\) (see (3.1)). The terms with lowest homogeneity in the sum (3.42) have degree exactly \(M_{2,n}\) and come from the term \(\mathrm{ad}_{\chi ^{(2)}}\Big [\sum _{k=n}^{2n-3}Z_{k}\Big ]\), recalling that (see Remark 3.4) if n is odd then \(Z_{n}\equiv 0\). Then, by (3.34), (3.25) and Lemma 3.5-(ii), we get

$$\begin{aligned} {}(3.42)=\sum _{k=M_{2,n}}^{L(M_{2,n}-1)+r-1-2L}{\widetilde{K}}^{+}_{k} \end{aligned}$$

where \({\widetilde{K}}^{+}_k\) are k-homogeneous Hamiltonians of the form (3.6) with coefficients satisfying

$$\begin{aligned} |({\widetilde{K}}^{+}_k)_{\sigma ,j}|\lesssim _{\delta '} N^{(L+1)\delta '} \mu _3(j)^{\beta }, \end{aligned}$$
(3.47)

for some \(\beta >0\). By the discussion above, using formulæ  (3.40)–(3.44), we obtain that the Hamiltonian \(H\circ \Phi _{\chi ^{(1)}}\circ \Phi _{\chi ^{(2)}}\) has the form (3.4) with (recall (3.19), (3.36), (3.41), (3.45))

$$\begin{aligned} Z_{k}^{\le N}&:=Z_{k},\quad k=n,\ldots ,M_{2,n}-1,\quad K_{k}:={\widetilde{K}}_{k}+{\widetilde{K}}_{k}^{+},\quad k=M_{2,n},\ldots ,r-1, \end{aligned}$$
(3.48)
$$\begin{aligned} K^{>N}&:={\widetilde{K}}^{>N}\circ \Phi _{\chi ^{(2)}}+ {\widetilde{K}}_{+}^{>N} \end{aligned}$$
(3.49)

and with remainder \({\tilde{R}}_{r}\) defined as

$$\begin{aligned} {\tilde{R}}_{r}:=\sum _{k=r}^{L(M_{2,n}-1)+r-1-2L}{\widetilde{K}}_{k}^{+}+ (3.43)+(3.44). \end{aligned}$$
(3.50)

Recalling (3.19), (3.36) and the estimates (1.27), (3.25) we have that \(Z_{k}^{\le N}\) in (3.48) satisfies the condition of item (i) of Theorem 2. Similarly \(K_{k}\) in (3.48) satisfies (3.6) thanks to (3.25) and (3.47) as long as \(\delta '\) is sufficiently small. The remainder \(K^{>N}\) in (3.49) satisfies the bound (3.7) using (3.46), (3.26) and Lemma 3.5-(i). It remains to show that the remainder defined in (3.50) satisfies the estimate (3.8). The claim follows for the terms \({\widetilde{K}}_{k}^{+}\) for \(k=r,\ldots , L(M_{2,n}-1)+r-1-2L\) by using (3.47) and Lemma 3.5. For the remainder in (3.43), (3.44) one can reason following almost word by word the proof of the estimate of the vector field of \({\mathcal {R}}_r\) in (3.33) in the previous step. In this case we choose \(L+1=8\) which implies \(L+1\ge (r+n)/(2n-4)\). \(\square \)

Theorem 2 follows by Lemmata 3.8, 3.12 setting \(\tau ^{(1)}:=\Phi _{\chi ^{(1)}}\circ \Phi _{\chi ^{(2)}}\). The bound (3.3) follows by Lemmata 3.7 and 3.11.

4 The Modified Energy Step

In this section we construct a modified energy which is an approximate constant of motion for the Hamiltonian system of \(H\circ \tau ^{(1)}\) in (3.4), when \(d=2,3\), and for the Hamiltonian H in (1.24) when \(d\ge 4\). For compactness we shall write, for \(s\in {\mathbb {R}}\),

$$\begin{aligned} N_{s}(u):=\Vert u\Vert _{H^{s}}^{2}=\sum _{j\in {\mathbb {Z}}^{d}}\langle j\rangle ^{2s}|u_{j}|^{2}, \end{aligned}$$
(4.1)

for \(u\in H^{s}({\mathbb {T}}^{d};{\mathbb {C}})\). For \(d\ge 2\) and \(n\in {\mathbb {N}}\) we define (recall (3.1))

$$\begin{aligned} {\widetilde{M}}_{d,n}:= {\left\{ \begin{array}{ll} M_{d,n}+n-1 &{}n\;\; \mathrm{odd}\\ M_{d,n}+n-2 &{}n\;\; \mathrm{even}. \end{array}\right. } \end{aligned}$$
(4.2)

Proposition 4.1

There exists \(\beta =\beta (d,n)>0\) such that for any \(\delta >0\), any \(N\ge N_1>1\) (\(N=N_1\) if \(d\ge 4\)) and any \(s\ge {\tilde{s}}_0\), for some \({\tilde{s}}_0={\tilde{s}}_0(\beta )>0\), if \(\varepsilon _0 \lesssim _{s,\delta } N^{-\delta }\), there are multilinear maps \(E_{k}\), \(k=M_{d,n},\ldots , {\widetilde{M}}_{d,n}-1\), in the class \({\mathcal {L}}_{k}\) such that the following holds:

  • the coefficients \((E_{k})_{\sigma ,j}\) satisfy

    $$\begin{aligned} |(E_{k})_{\sigma ,j}|\lesssim _{s,\delta }N^{\delta } N_1^{\kappa _{d}} \mu _{3}(j)^{\beta } \mu _1(j)^{2s}, \end{aligned}$$
    (4.3)

    for \(\sigma \in \{-1,1\}^{k}\), \(j\in ({\mathbb {Z}}^{d})^{k}\), \(k=M_{d,n},\ldots ,{\widetilde{M}}_{d,n}-1\), where

    $$\begin{aligned} \kappa _d:=0\; \mathrm{if}\; d=2,\;\;\;\;\;\; \kappa _d:=1\; \mathrm{if}\; d=3,\;\;\;\;\;\; \kappa _d:=d-4\; \mathrm{if}\; d\ge 4. \end{aligned}$$
    (4.4)
  • for any \(u\in B_{s}(0,2\varepsilon _0)\) setting

    $$\begin{aligned} E(u):=\sum _{k=M_{d,n} }^{{\widetilde{M}}_{d,n}-1}E_{k}(u). \end{aligned}$$
    (4.5)

    one has

    $$\begin{aligned} \begin{aligned} |\{N_{s}+E,H\circ \tau ^{(1)}\}|&\lesssim _{s,\delta } N_1^{\kappa _d} N^{\delta } \big (\Vert u\Vert ^{{\widetilde{M}}_{d,n}}_{H^{s}} +N^{-1}\Vert u\Vert ^{M_{d,n}+n-2}_{H^{s}}\big )\\&\quad +N_1^{-{\mathfrak {s}}_d+\delta }\Vert u\Vert _{H^{s}}^{M_{d,n}} +N^{-{\mathfrak {s}}_d+\delta }\Vert u\Vert _{H^{s}}^{n}, \end{aligned} \end{aligned}$$
    (4.6)

    where

    $$\begin{aligned} {\mathfrak {s}}_{d}:=1,\;\;\;\mathrm{for }\;\; d=2,3,\;\;\; \mathrm{and}\;\;\; {\mathfrak {s}}_{d}:=3,\;\;\;\mathrm{for }\;\; d\ge 4. \end{aligned}$$
    (4.7)

Remark 4.2

We remark that in the proposition above we introduced a second truncation parameter \(N_1\). This is needed in order to optimize the time of existence that we shall deduce from estimate (4.6). In Sect. 5 we shall choose \(N, N_1\) (depending on \(\varepsilon \)) in such a way that the last two summands in the r.h.s. of (4.6) are negligible w.r.t. the first two. Since for \(d=3\) the term \(\Vert u\Vert ^{n}_{H^{s}}\) is larger than \(\Vert u\Vert ^{M_{d,n}}_{H^{s}}\), it is convenient to choose \(N\gg N_1\) to make the last summand small enough. This is possible since the factor \(N_1^{\kappa _{d}}N^{\delta }\) grows very slowly in N, \(\delta \) being arbitrarily small. Note that in the case \(d\ge 4\) we need just one truncation parameter, since no preliminary Birkhoff normal form is performed. In the case \(d=2\) one could use the same truncation N, since \(\kappa _d=0\).

We need the following technical lemma.

Lemma 4.3

(Energy estimate) Let \(N\ge 1\), \(0\le \delta <1\), \(p\in {\mathbb {N}}\), \(p\ge 3\). Consider the Hamiltonians \(N_{s}\) in (4.1), \(G_p\in {\mathcal {L}}_{p}\) and write \(G_{p}=G_{p}^{(+1)}+G_{p}^{(-1)}\) (recall Definition 3.3). Assume also that the coefficients of \(G_p\) satisfy

$$\begin{aligned} |(G^{(\eta )}_{p})_{\sigma ,j}|\le C N^{\delta }\mu _{3}(j)^{\beta }\mu _{1}(j)^{-q}, \quad \forall \sigma \in \{-1,+1\}^{p},\; j\in ({\mathbb {Z}}^{d})^{p},\;\eta \in \{-1,+1\}, \end{aligned}$$
(4.8)

for some \(\beta >0\), \(C>0\) and \(q\ge 0\). We have that the Hamiltonian \(Q_{p}^{(\eta )}:=\{N_s,G_{p}^{(\eta )}\}\), \(\eta \in \{-1,1\}\), belongs to the class \({\mathcal {L}}_{p}\) and has coefficients satisfying

$$\begin{aligned} |(Q_{p}^{(\eta )})_{\sigma ,j}|\lesssim _{s} C N^{\delta }\mu _{3}(j)^{\beta +1}\mu _1(j)^{2s} \mu _1(j)^{-q-\alpha },\quad \alpha :=\left\{ \begin{array}{ll} 1&{}\quad \mathrm{if }\;\; \eta =-1\\ 0&{}\quad \mathrm{if}\;\; \eta =+1. \end{array} \right. \end{aligned}$$
(4.9)

Proof

Using formulæ  (4.1), (1.23), (3.9) and recalling Definition 3.3 we have that the Hamiltonian \(\{N_s,G_{p}^{(\eta )}\}\) has coefficients

$$\begin{aligned} (Q_{p}^{(\eta )})_{\sigma ,j}=(G_{p}^{(\eta )})_{\sigma ,j} \mathrm{i} \left( \sum _{i=1}^{p}\sigma _i\langle j_i\rangle ^{2s}\right) \end{aligned}$$

for any \(\sigma \in \{-1,+1\}^p\), \(j\in ({\mathbb {Z}}^{d})^{p}\) satisfying

$$\begin{aligned} \sum _{i=1}^p\sigma _i j_i=0,\quad \sigma _i \sigma _k=\eta ,\quad \mu _1(j)=|j_i|, \;\mu _2(j)=|j_k|, \end{aligned}$$

for some \(i,k=1,\ldots ,p\). Then the bound (4.9) follows from the fact that

$$\begin{aligned} |\langle j_i\rangle ^{2s}+\eta \langle j_k\rangle ^{2s}|\lesssim _{s} \left\{ \begin{array}{ll} \mu _1(j)^{2s-1}\mu _{3}(j)&{}\mathrm{if} \; \eta =-1\\ \mu _1(j)^{2s}&{}\mathrm{if} \; \eta =+1. \end{array}\right. \end{aligned}$$

and using the assumption (4.8). \(\square \)
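A sketch of the first bound (case \(\eta =-1\)), assuming without loss of generality \(\langle j_i\rangle =\mu _1(j)\ge \langle j_k\rangle =\mu _2(j)\): by the mean value theorem applied to \(x\mapsto x^{2s}\) and by the zero-momentum condition \(j_i=-\sigma _i\sum _{l\ne i}\sigma _l j_l\),

```latex
\begin{aligned}
\langle j_i\rangle^{2s}-\langle j_k\rangle^{2s}
&\le 2s\,\langle j_i\rangle^{2s-1}\big(\langle j_i\rangle-\langle j_k\rangle\big)
 \le 2s\,\mu_1(j)^{2s-1}\big(|j_i|-|j_k|\big),\\
|j_i|-|j_k|
&\le \sum_{l\ne i,k}|j_l|\le (p-2)\,\mu_3(j).
\end{aligned}
```

Combining the two lines gives the \(\eta =-1\) bound, while for \(\eta =+1\) one simply estimates \(\langle j_i\rangle ^{2s}+\langle j_k\rangle ^{2s}\le 2\mu _1(j)^{2s}\).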

Proof of Proposition 4.1

Case \(d=2,3\). Consider the Hamiltonians \(K_{k}\) in (3.6) for \(k=M_{d,n},\ldots , {\widetilde{M}}_{d,n}-1\), where \( {\widetilde{M}}_{d,n}\) is defined in (4.2). Recalling Definition 3.3 we set \(E_{k}:=E_{k}^{(+1)}+E_{k}^{(-1)}\), where

$$\begin{aligned} E_{k}^{(+1)}:=(\mathrm{ad}_{Z_2})^{-1}\{N_{s}, K_{k}^{(+1)}\}, \quad E_{k}^{(-1)}:=(\mathrm{ad}_{Z_2})^{-1} \{N_{s}, K_{k}^{(-1,\le N_1)}\}, \end{aligned}$$
(4.10)

for \(k=M_{d,n},\ldots , {\widetilde{M}}_{d,n}-1\). Notice that the formulæ in (4.10) are well-defined, since \(\{N_{s}, K_{k}^{(+1)}\}\) and \(\{N_{s}, K_{k}^{(-1,\le N_1)}\}\) are in the range of the adjoint action \(\mathrm{ad}_{Z_2}\) thanks to Proposition 2.2. It is easy to see that \(E_{k}\in {\mathcal {L}}_{k}\). Moreover, using the bounds on the coefficients \((K_{k})_{\sigma ,j}\) in (3.6) and item (ii) of Proposition 2.2 (with the \(\delta \) therein possibly smaller than the one fixed here), one can check that the coefficients \((E_{k}^{(+1)})_{\sigma ,j}\) satisfy (4.3). By (3.6), Lemma 4.3 (in particular formula (4.9) with \(\eta =-1\)) and item (iii) of Proposition 2.2, one gets that the coefficients \((E_{k}^{(-1)})_{\sigma ,j}\) satisfy (4.3) as well. Using (4.10) we notice that

$$\begin{aligned} \{N_{s}, K_{k}\}+\{E_{k},Z_2\}=\{N_{s}, K_{k}^{(-1,>N_1)}\},\quad k=M_{d,n},\ldots , {\widetilde{M}}_{d,n}-1. \end{aligned}$$
(4.11)

Combining Lemmata 3.5 and 4.3 we deduce

$$\begin{aligned} |{\{N_{s}, K_{k}^{(-1,>N_1)}\}}(u)|\lesssim _{s,\delta } N_1^{-1+\delta }\Vert u\Vert ^{k}_{H^{s}}, \end{aligned}$$
(4.12)

for s large enough with respect to \(\beta \). We define the energy E as in (4.5). We are now in position to prove the estimate (4.6).

Using the expansions (3.4) and (4.5) we get

$$\begin{aligned} \{N_{s}+E,H\circ \tau ^{(1)}\}&= \left\{ N_{s},Z_2+ \sum _{k=n}^{M_{d,n}-1}Z_{k}^{\le N}\right\} \end{aligned}$$
(4.13)
$$\begin{aligned}&\quad +\{N_{s},K^{>N}\}+\{N_{s},{\tilde{R}}_{r}\}\end{aligned}$$
(4.14)
$$\begin{aligned}&\quad +\sum _{k=M_{d,n}}^{{\widetilde{M}}_{d,n}-1}\big ( \{N_{s}, K_{k}\}+\{E_{k},Z_2\}\big ) \end{aligned}$$
(4.15)
$$\begin{aligned}&\quad +\left\{ E,\sum _{k=n}^{M_{d,n}-1}Z_{k}^{\le N}\right\} +\left\{ E,\sum _{k=M_{d,n}}^{r-1}K_{k}+{\tilde{R}}_r\right\} \end{aligned}$$
(4.16)
$$\begin{aligned}&\quad +\{E, K^{>N}\}. \end{aligned}$$
(4.17)

We study each summand separately. First of all note that, by item (i) in Theorem 2 and Proposition 2.2, the right hand side of (4.13) vanishes. Consider now the term in (4.14). Using the bounds (3.7), (3.8) and recalling (1.23) one can check that, for \(\varepsilon _0N^{\delta }\lesssim _{s,\delta }1\),

$$\begin{aligned} |(4.14)|\lesssim _{s,\delta } N^{-1+\delta }\Vert u\Vert _{H^{s}}^{n}+ N^{\delta }\Vert u\Vert _{H^{s}}^{r}. \end{aligned}$$
(4.18)

By (4.11) and (4.12) we deduce that

$$\begin{aligned} |(4.15)|\lesssim _{s,\delta }N_1^{-1+\delta }\Vert u\Vert ^{M_{d,n}}_{H^{s}}. \end{aligned}$$
(4.19)

By (4.3), (3.4)–(3.8) and Lemma 3.5 (recall also (4.2)), we get

$$\begin{aligned} \begin{aligned} |(4.16)|&\lesssim _{s,\delta } N_1^{\kappa _{d}}N^{\delta }(\Vert u\Vert _{H^{s}}^{{\widetilde{M}}_{d,n}}+\Vert u\Vert _{H^{s}}^{r}),\\ |(4.17)|&\lesssim _{s,\delta } N_1^{\kappa _{d}}N^{-1+\delta }\Vert u\Vert _{H^{s}}^{{M}_{d,n}+n-2}. \end{aligned} \end{aligned}$$

The discussion above implies the bound (4.6) using that \(r\ge {\widetilde{M}}_{d,n}\). This concludes the proof in the case \(d=2,3\).

Case \(d\ge 4\). In this case we consider the Hamiltonian H in (1.24). Recalling Definition 3.3 we set

$$\begin{aligned} E_{k}:=E_{k}^{(+1)}+E_{k}^{(-1)} \end{aligned}$$

where

$$\begin{aligned} E_{k}^{(+1)}:=(\mathrm{ad}_{Z_2})^{-1}\{N_{s}, H_{k}\}^{(+1)}, \quad E_{k}^{(-1)}:=(\mathrm{ad}_{Z_2})^{-1} \{N_{s}, H_{k}^{(-1,\le N_1)}\}, \end{aligned}$$
(4.20)

for \(k=M_{d,n},\ldots , {\widetilde{M}}_{d,n}-1\). Notice that the energies \(E_{k}^{(+1)}\), \(E_{k}^{(-1)}\) are in \({\mathcal {L}}_{k}\) with coefficients

$$\begin{aligned} (E_{k}^{(+1)})_{\sigma ,j}=\left( \sum _{i=1}^{k}\sigma _i\langle j_i\rangle ^{2s}\right) \left( \sum _{i=1}^{k}\sigma _i\omega _{j_i}\right) ^{-1}(H_{k}^{(+1)})_{\sigma ,j}, \quad \sigma \in \{-1,+1\}^{k},\quad j\in ({\mathbb {Z}}^{d})^{k}, \end{aligned}$$

and

$$\begin{aligned} (E_{k}^{(-1)})_{\sigma ,j}=\left( \sum _{i=1}^{k}\sigma _i\langle j_i\rangle ^{2s}\right) \left( \sum _{i=1}^{k}\sigma _i\omega _{j_i}\right) ^{-1}(H_{k}^{(-1)})_{\sigma ,j}, \quad \mu _{2}(j)\le N_1, \end{aligned}$$

with \(\sigma \in \{-1,+1\}^{k}\), \(j\in ({\mathbb {Z}}^{d})^{k}\). Recall that in this case \(M_{d,n}=n\) (see (3.1)). Using Proposition 2.2 and reasoning as in the proof of Lemma 4.3 one can check that estimate (4.3) on the coefficients of \(E_{k}^{(+1)}\) and \(E_{k}^{(-1)}\) holds true with \(\kappa _{d}\) as in (4.4). Equation (4.20) implies

$$\begin{aligned} \{N_{s}, H_{k}\}+\{E_{k},Z_2\}=\{N_{s}, H_{k}^{(-1,>N_1)}\},\quad k=n,\ldots , {\widetilde{M}}_{d,n}-1, \end{aligned}$$
(4.21)

where \({\widetilde{M}}_{d,n}-1=2n-1\) if n odd and \({\widetilde{M}}_{d,n}-1=2n-2\) if n even (see (4.2)). Recall that the coefficients of the Hamiltonian \(H_{k}\) satisfy the bound (1.27). Therefore, combining Lemmata 4.3 and 3.5, we deduce

$$\begin{aligned} |\{N_{s}, H_{k}^{(-1,>N_1)}\}(u)| \lesssim _{s,\delta }N_1^{-3}\Vert u\Vert ^{k}_{H^{s}}, \end{aligned}$$
(4.22)

for s large enough with respect to \(\beta \). Recalling (1.24) we have

$$\begin{aligned} \{N_{s}+E,H\}&= \{N_{s},Z_2\}+\{N_{s},{R}_{r}\} +\left\{ E,\sum _{k=n}^{r-1}H_{k}+{R}_{r}\right\} \\&\quad +\sum _{k=n}^{{\widetilde{M}}_{d,n}-1}\big ( \{N_{s}, H_{k}\}+\{E_{k},Z_2\}\big ). \end{aligned}$$

One can obtain the bound (4.6) by reasoning as in the case \(d=2,3\), using (4.22), (1.28). This concludes the proof. \(\square \)

5 Proof of Theorem 1

In this section we show how to combine the results of Theorem 2 and Proposition 4.1 in order to prove Theorem 1.

Consider \(\psi _0\) and \(\psi _1\) satisfying (1.4) and let \(\psi (t,y)\), \(y\in {\mathbb {T}}_{\nu }^{d}\), be the unique solution of (1.1) with initial conditions \((\psi _0,\psi _1)\) defined for \(t\in [0,T]\) for some \(T>0\). By rescaling the space variable y and passing to the complex variable in (1.17) we consider the function \(u(t,x)\), \(x\in {\mathbb {T}}^{d}\), solving Eq. (1.18). We recall that (1.18) can be written in the Hamiltonian form

$$\begin{aligned} \partial _{t}u=\mathrm{i} \partial _{{\bar{u}}} H(u), \end{aligned}$$
(5.1)

where H is the Hamiltonian function in (1.20) (see also (1.24)). Theorem 1 is then a consequence of the following lemma.

Lemma 5.1

(Main bootstrap) There exists \(s_0=s_0(n,d)\) such that for any \(\delta >0\), \(s\ge s_0\), there exists \(\varepsilon _0=\varepsilon _0(\delta ,s)\) such that the following holds. Let \(u(t,x)\) be a solution of (5.1) with \(t\in [0,T)\), \(T>0\), and initial condition \(u(0,x)=u_0(x)\in H^{s}({\mathbb {T}}^{d})\). For any \(\varepsilon \in (0, \varepsilon _0)\), if

$$\begin{aligned} \Vert u_0\Vert _{H^{s}}\le \varepsilon ,\quad \sup _{t\in [0,T)}\Vert u(t)\Vert _{H^{s}}\le 2\varepsilon , \quad T\le \varepsilon ^{-\mathtt {a}+\delta }, \end{aligned}$$
(5.2)

with \(\mathtt {a}=\mathtt {a}(d,n)\) in (1.6), then we have the improved bound \(\sup _{t\in [0,T)}\Vert u(t)\Vert _{H^{s}}\le \frac{3}{2} \varepsilon \).

In order to prove Lemma 5.1 we first need a preliminary result.

Lemma 5.2

(Equivalence of the energy norm) Let \(\delta >0\), \(N\ge N_1\ge 1\). Let \(u(t,x)\) be as in (5.2) with \(s\gg 1\) large enough. Then, for any \(0<c_0<1\), there exists \(C=C(\delta ,s,d,n,c_0)>0\) such that, if we have the smallness condition

$$\begin{aligned} \varepsilon C N^{\delta }N_1^{\kappa _d}\le 1, \end{aligned}$$
(5.3)

the following holds true. Define

$$\begin{aligned} z:=\tau ^{(0)}(u),\quad u=\tau ^{(1)}(z), \quad {\mathcal {E}}_{s}(z):=(N_{s}+E)(z) \end{aligned}$$
(5.4)

where \(\tau ^{(\sigma )}\), \(\sigma =0,1\), are the maps given by Theorem 2 and \(N_{s}\) is in (4.1), E is given by Proposition 4.1. We have

$$\begin{aligned} \tfrac{1}{1+c_0}\Vert z\Vert _{H^{s}}\le \Vert u\Vert _{H^{s}} \le (1+c_0)\Vert z\Vert _{H^{s}},\quad \forall t\in [0,T]; \end{aligned}$$
(5.5)
$$\begin{aligned} \tfrac{1}{1+12c_0}{\mathcal {E}}_{s}(z) \le \Vert u\Vert ^{2}_{H^{s}} \le (1+12c_0){\mathcal {E}}_{s}(z),\quad \forall t\in [0,T]. \end{aligned}$$
(5.6)

Proof

Thanks to (5.3) we have that Theorem 2 and Proposition 4.1 apply. Consider the function \(z=\tau ^{(0)}(u)\). By estimate (3.3) we have

$$\begin{aligned} \Vert z\Vert _{H^{s}}\le \Vert u\Vert _{H^{s}}+{\tilde{C}}N^{\delta } \Vert u\Vert _{{H}^s}^2 {\mathop {\le }\limits ^{(5.3)}}\Vert u\Vert _{H^{s}}(1+c_0), \end{aligned}$$

where \({\tilde{C}}\) is some constant depending on s and \(\delta \). The latter inequality follows by taking C in (5.3) large enough. Reasoning similarly and using the bound (3.3) on \(\tau ^{(1)}\) one gets (5.5). Let us now check (5.6). First notice that, by (4.3), (4.5) and Lemma 3.5,

$$\begin{aligned} |E(z)|\le {\tilde{C}} \Vert z\Vert _{H^{s}}^{M_{d,n}}N^{\delta }N_1^{\kappa _{d}}, \end{aligned}$$
(5.7)

for some \({\tilde{C}}>0\) depending on s and \(\delta \). Then, recalling (5.4), we get

$$\begin{aligned} |{\mathcal {E}}_{s}(z)|\le \Vert z\Vert ^{2}_{H^{s}} (1+ {\tilde{C}}\Vert z\Vert _{H^{s}}^{M_{d,n}-2}N^{\delta }N_1^{\kappa _{d}}) {\mathop {\le }\limits ^{(5.5), (5.3)}}\Vert u\Vert _{H^{s}}^{2}(1+c_0)^{3}, \end{aligned}$$

where we used that \(M_{d,n}-2\ge 1\). This implies the first inequality in (5.6). On the other hand, using (5.5), (5.7) and (5.2), we have

$$\begin{aligned} \Vert u\Vert _{H^{s}}^{2}\le (1+c_0)^{2}{\mathcal {E}}_{s}(z) +(1+c_0)^{M_{d,n}+{2}}{\tilde{C}} N^{\delta }N_1^{\kappa _{d}}\varepsilon ^{M_{d,n}-2}\Vert u\Vert _{H^{s}}^{2}. \end{aligned}$$

Then, since \(M_{d,n}>2\) (see (3.1)), taking C in (5.3) large enough we obtain the second inequality in (5.6). \(\square \)

Proof of Lemma 5.1

Assume (5.2). We study how the Sobolev norm \(\Vert u(t)\Vert _{H^{s}}\) evolves for \(t\in [0,T]\) by inspecting the equivalent energy norm \({\mathcal {E}}_{s}(z)\) defined in (5.4). Notice that

$$\begin{aligned} \partial _{t}{\mathcal {E}}_{s}(z)=-\{{\mathcal {E}}_{s}, H\circ \tau ^{(1)}\}(z). \end{aligned}$$

Therefore, integrating over \([0,T]\), we have that

$$\begin{aligned} \left| \int _{0}^{T}\partial _{t}{\mathcal {E}}_{s}(z)\ \mathrm {d}t\right|&{\mathop {\lesssim _{s,\delta }}\limits ^{(4.6),(5.2)}} TN_1^{\kappa _d} N^{\delta } \big (\varepsilon ^{{\widetilde{M}}_{d,n}} +N^{-1}\varepsilon ^{M_{d,n}+n-2}\big ) \\&\quad +TN_1^{-{\mathfrak {s}}_d+\delta }\varepsilon ^{M_{d,n}} +TN^{-{\mathfrak {s}}_d+\delta }\varepsilon ^{n}. \end{aligned}$$

We now fix

$$\begin{aligned} N_1:=\varepsilon ^{-\alpha },\quad N:=\varepsilon ^{-\gamma }, \end{aligned}$$

with \(0<\alpha \le \gamma \) to be chosen properly. Hence we have

$$\begin{aligned} \left| \int _{0}^{T}\partial _{t}{\mathcal {E}}_{s}(z)\ \mathrm {d}t\right|&\lesssim _{s,\delta } \varepsilon ^{2}T\Big ( \varepsilon ^{M_{d,n}-2+\alpha {\mathfrak {s}}_{d}-\delta \alpha } +\varepsilon ^{{\widetilde{M}}_{d,n}-2-\alpha \kappa _{d}-\delta \gamma } \Big ) \end{aligned}$$
(5.8)
$$\begin{aligned}&\quad + \varepsilon ^{2}T\Big ( \varepsilon ^{n-2+\gamma {\mathfrak {s}}_{d}-\delta \gamma } +\varepsilon ^{M_{d,n}+n-4+\gamma -\alpha \kappa _{d}-\delta \gamma } \Big ). \end{aligned}$$
(5.9)

We choose \(\alpha >0\) such that

$$\begin{aligned} M_{d,n}-2+\alpha {\mathfrak {s}}_{d}={\widetilde{M}}_{d,n}-2-\alpha \kappa _{d}, \end{aligned}$$
(5.10)

i.e.

$$\begin{aligned} \alpha := \frac{{\widetilde{M}}_{d,n}-M_{d,n}}{{\mathfrak {s}}_d+\kappa _d} {\mathop {=}\limits ^{(4.2), (4.7), (4.4)}} \left\{ \begin{array}{ll} \tfrac{n-1}{d-1}&{}\;\mathrm{if} \; n \;\;\mathrm{odd}\\ \tfrac{n-2}{d-1}&{}\;\mathrm{if} \; n \;\;\mathrm{even}. \end{array}\right. \end{aligned}$$
(5.11)

We shall choose \(\gamma >0\) in such a way that the terms in (5.9) are negligible with respect to the terms in (5.8). In particular we set (recall (5.11))

$$\begin{aligned} \gamma \ge \max \big \{\frac{1}{{\mathfrak {s}}_d}( M_{d,n}-n +\frac{{\widetilde{M}}_{d,n}-M_{d,n}}{{\mathfrak {s}}_d+\kappa _d}{\mathfrak {s}}_d), 2-n+{\widetilde{M}}_{d,n}-M_{d,n} \big \}. \end{aligned}$$
(5.12)

Therefore estimates (5.8)–(5.9) become

$$\begin{aligned} \left| \int _{0}^{T}\partial _{t}{\mathcal {E}}_{s}(z)\ \mathrm {d}t\right| \lesssim _{s,\delta } \varepsilon ^{2}T\varepsilon ^{\mathtt {a}}( \varepsilon ^{-\delta \alpha } +\varepsilon ^{-\delta \gamma }) \end{aligned}$$

where \(\mathtt {a}\) is defined in (1.6) and appears thanks to definitions (3.1), (4.2), (4.4), (4.7) and (5.11). Moreover we define

$$\begin{aligned} \delta ':=2\delta \max \{\alpha ,\gamma \}, \end{aligned}$$

with \(\alpha ,\gamma \) given in (5.11) and (5.12). Notice that, since \(\delta >0\) is arbitrarily small, \(\delta '\) can be chosen arbitrarily small as well. Since \(\varepsilon \) can be chosen arbitrarily small with respect to s and \(\delta \), with these choices we get

$$\begin{aligned} \left| \int _{0}^{T}\partial _{t}{\mathcal {E}}_{s}(z)\ \mathrm {d}t\right| \le \varepsilon ^{2}/4 \end{aligned}$$

as long as \(T\le \varepsilon ^{-\mathtt {a}+\delta '}\). Indeed, in this case \(\varepsilon ^{2}T\varepsilon ^{\mathtt {a}-\delta \alpha }\le \varepsilon ^{2+\delta '-\delta \alpha }\le \varepsilon ^{2+\delta \max \{\alpha ,\gamma \}}\), and the term with \(\varepsilon ^{-\delta \gamma }\) is bounded in the same way. Then, using the equivalence of norms (5.6) and choosing \(c_0>0\) small enough, we have

$$\begin{aligned} \Vert u(t)\Vert _{H^{s}}^{2}&\le (1+12c_0){\mathcal {E}}_s(z(t))\\&\le (1+12c_0)\left[ {\mathcal {E}}_s(z(0)) +\left| \int _{0}^{T}\partial _{t}{\mathcal {E}}_{s}(z)\ \mathrm {d}t\right| \right] \\&\le (1+12c_0)^{2}\varepsilon ^{2}+(1+12c_0)\varepsilon ^{2}/4\le \tfrac{3}{2}\varepsilon ^{2}, \end{aligned}$$

for times \(T\le \varepsilon ^{-\mathtt {a}+\delta '}\). This implies the thesis. \(\square \)
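For completeness, we sketch the standard continuity argument that deduces Theorem 1 from Lemma 5.1 (the identifications between \(\psi \) and u are those of (1.17)–(1.18)). Define

$$\begin{aligned} {\mathcal {T}}:=\sup \Big \{ T>0 \; : \; \sup _{t\in [0,T)}\Vert u(t)\Vert _{H^{s}}\le 2\varepsilon \Big \}. \end{aligned}$$

If \({\mathcal {T}}<\varepsilon ^{-\mathtt {a}+\delta }\), then Lemma 5.1 applies on \([0,{\mathcal {T}})\) and gives the improved bound \(\sup _{t\in [0,{\mathcal {T}})}\Vert u(t)\Vert _{H^{s}}\le \tfrac{3}{2}\varepsilon \); by the local existence theory the solution then extends beyond \({\mathcal {T}}\) with norm still below \(2\varepsilon \), contradicting the maximality of \({\mathcal {T}}\). Hence \({\mathcal {T}}\ge \varepsilon ^{-\mathtt {a}+\delta }\), and passing back to the real variables \((\psi ,\partial _{t}\psi )\) yields the lifespan claimed in Theorem 1.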