1 Introduction and Main Results

We study the d-dimensional discrete nonlinear random Schrödinger equation (DRNLS)

$$\begin{aligned} {\text{ i }}\frac{{\text{ d }}{q}_j}{{\text{ d }}t} =v_jq_{j}+\epsilon _1\sum _{|i-j|_{1}=1}q_{i} + \epsilon _2\left| q_{j}\right| ^{2} q_{j}, \ i, j \in \mathbb {Z}^d, \end{aligned}$$
(1.1)

where \({\varvec{v}}=\left\{ v_j\right\} _{j\in \mathbb {Z}^d}\) is a sequence of independent identically distributed (i.i.d.) random variables in \(\mathcal {W}=[0,1]^{\mathbb {Z}^d}\) (denote by \({\text {mes}}~(\cdot )\) the standard probability measure on \(\mathcal {W}\)) with the uniform distribution, \(\epsilon _1,\epsilon _2\ge 0\) and \(j=(j_1,\cdots ,j_d)\) with \(|j|_{1}=\sum ^{d}_{i=1}|j_{i}|\). In the case \(\epsilon _2=0\), Eq. (1.1) becomes the celebrated Anderson model, which was first introduced by Anderson [2] to describe the motion of non-interacting electrons in a crystal with impurities. The nonlinearity in Eq. (1.1) arises from interactions of quantum particles, which play an essential role in both classical optics [20] and Bose–Einstein condensates [21] in the presence of disorder. In fact, one of the central themes in the theory of Anderson localization (namely, absence of diffusion) transitions is how the interplay between disorder and nonlinearity affects these transitions. While the study of Anderson localization for the linear case has attracted great attention over the years since the fundamental works of Fröhlich–Spencer [16] and Aizenman–Molchanov [1], much less is known about Anderson localization for DRNLS, particularly in the multidimensional case (i.e., \(d\ge 1\)). In this paper, we prove the first long-time localization result for DRNLS on \({\mathbb {Z}}^d\) for arbitrary \(d\ge 1\) by using the Birkhoff normal form technique.
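To make the structure of (1.1) concrete, the following sketch evaluates its right-hand side on a finite one-dimensional chain and checks numerically that the total mass \(\sum _j|q_j|^2\) has zero time derivative. The function names, the truncation to a finite chain (with \(q_j=0\) outside it) and the sample parameters are illustrative assumptions and are not taken from the paper.

```python
import random

def drnls_rhs(q, v, eps1, eps2):
    """Return dq/dt for (1.1) on the finite chain j = 0, ..., len(q)-1 (d = 1).

    Assumption for illustration: q_j = 0 outside the chain.
    From (1.1): dq_j/dt = -i * (v_j q_j + eps1 * sum_{|i-j|=1} q_i + eps2 |q_j|^2 q_j).
    """
    n = len(q)
    dq = []
    for j in range(n):
        left = q[j - 1] if j > 0 else 0j
        right = q[j + 1] if j < n - 1 else 0j
        rhs = v[j] * q[j] + eps1 * (left + right) + eps2 * abs(q[j]) ** 2 * q[j]
        dq.append(-1j * rhs)
    return dq

def mass_derivative(q, dq):
    """d/dt sum_j |q_j|^2 = 2 * Re sum_j conj(q_j) * dq_j/dt."""
    return 2.0 * sum((qj.conjugate() * dqj).real for qj, dqj in zip(q, dq))

random.seed(0)
n = 16
v = [random.random() for _ in range(n)]          # i.i.d. uniform potential
q = [0.1 * complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(n)]
dq = drnls_rhs(q, v, eps1=0.05, eps2=0.05)
mass_rate = mass_derivative(q, dq)               # zero up to rounding error
```

The potential and the cubic term contribute purely imaginary multiples of \(|q_j|^2\) and \(|q_j|^4\), and the hopping sum is symmetric in \(i\) and \(j\), which is why the mass rate vanishes.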

We now state our result more precisely. To this end, we need some notation. Consider the phase space

$$\begin{aligned} \ell _{s}^{2}(\mathbb {Z}^d,\mathbb {C})=\left\{ q=(q_{j})_{j\in \mathbb {Z}^d}\in \mathbb {C}^{\mathbb {Z}^d}:\Vert q\Vert ^{2}_{s}{:=} \sum _{j\in \mathbb {Z}^d}\langle j\rangle ^{2s}|q_{j}|^{2}<\infty \right\} , \end{aligned}$$
(1.2)

where \(\langle j\rangle ^{2}{:=}1+|j|_2^{2}\) and \(|j|_{2}=\sqrt{\sum ^{d}_{i=1}|j_{i}|^2}\). Moreover, for a vector \(w=(w_j)_{j\in \mathbb {Z}^d}\in \mathbb {C}^{\mathbb {Z}^d}\) and a given large \(\mathcal N>0,\) we define the projection

$$\begin{aligned} \widetilde{\Pi }_{\mathcal N}w{:=}\widetilde{w}=(\widetilde{w}_{j})_{j\in \mathbb {Z}^d} \end{aligned}$$
(1.3)

by \(\widetilde{w}_{j}=w_{j}\) when \(|j|_2\le \mathcal N\) and otherwise \(\widetilde{w}_{j}=0.\) Let

$$\begin{aligned} \widehat{\Pi }_\mathcal {N}w{:=}\widehat{w}=w-\widetilde{w}. \end{aligned}$$
(1.4)
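The weighted norm (1.2) and the splitting \(w=\widetilde{w}+\widehat{w}\) of (1.3) and (1.4) can be sketched for finitely supported sequences as follows. The dictionary representation and the function names are assumptions made only for this illustration.

```python
import math

def bracket(j):
    """<j> = sqrt(1 + |j|_2^2) for a lattice site j given as a tuple of ints."""
    return math.sqrt(1 + sum(c * c for c in j))

def weighted_norm(w, s):
    """The norm ||w||_s of (1.2) for a finitely supported w: dict site -> complex."""
    return math.sqrt(sum(bracket(j) ** (2 * s) * abs(x) ** 2 for j, x in w.items()))

def split_modes(w, N):
    """Return (w_tilde, w_hat): the parts of w on |j|_2 <= N and on |j|_2 > N."""
    low = {j: x for j, x in w.items() if sum(c * c for c in j) <= N * N}
    high = {j: x for j, x in w.items() if j not in low}
    return low, high

w = {(0, 0): 1.0, (3, 4): 2.0, (6, 0): 0.5j}
w_tilde, w_hat = split_modes(w, N=5)   # (3, 4) has |j|_2 = 5, so it is a low mode
```

Since the two parts have disjoint supports, \(\Vert w\Vert _s^2=\Vert \widetilde{w}\Vert _s^2+\Vert \widehat{w}\Vert _s^2\) for every \(s\).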

Now we give the following \((r,\mathcal N)\)-nonresonant conditions: for any positive integer \(r\ge 2\) and a large \(\mathcal N>0\) depending on r and d,

$$\begin{aligned}&\left| \langle {\varvec{k}}, {\varvec{v}}\rangle \right| \ge \frac{1}{\mathcal N^{20dr}},\qquad 0<\left| {\varvec{k}}\right| _1\le r \end{aligned}$$
(1.5)

where \({\varvec{k}}\) satisfies

$$\begin{aligned} \left| \widehat{{\varvec{k}}}\right| _1\le 1\quad {\text{ or }}\quad \left| \widehat{{\varvec{k}}}\right| _1=2\ {\text{ with }}\ \sum _{|j|_2> \mathcal N}k_j=\pm 2. \end{aligned}$$
(1.6)

Furthermore, define the resonant set \(\mathcal {R}_{{\varvec{k}}}\) by

$$\begin{aligned} \mathcal {R}_{{\varvec{k}}}&{:=}\left\{ {\varvec{v}}\in \mathcal {W} \bigg | \left| \langle {\varvec{k}}, {\varvec{v}}\rangle \right| < \frac{1}{\mathcal N^{20dr}} \right\} . \end{aligned}$$
(1.7)

Let

$$\begin{aligned} \mathcal {R}=\bigcup _{\begin{array}{c} 0\ne {\varvec{k}}\in \mathbb {Z}^{\mathbb {Z}^d},|{\varvec{k}}|_1\le r\\ {\varvec{k}}\ {\text {satisfy}}\ (1.6) \end{array}} \mathcal {R}_{{\varvec{k}}} \end{aligned}$$
(1.8)

and

$$\begin{aligned} \mathcal {R}^{c}=\mathcal {W}\setminus \mathcal {R}. \end{aligned}$$

Then it is easy to see that any frequency \( {\varvec{v}} \in \mathcal {R}^{c}\) is \((r,\mathcal N)\)-nonresonant.
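The conditions (1.5) to (1.7) can be tested directly for a finitely supported integer vector \({\varvec{k}}\). The sketch below is only an illustration with hypothetical function names and toy values of \(\mathcal N\), d and r; for realistic parameters the threshold \(\mathcal N^{-20dr}\) is far below floating-point resolution.

```python
def l2_sq(j):
    """Squared Euclidean length |j|_2^2 of a lattice site j (tuple of ints)."""
    return sum(c * c for c in j)

def satisfies_1_6(k, N):
    """Check condition (1.6) for k given as dict site -> nonzero int."""
    high = [kj for j, kj in k.items() if l2_sq(j) > N * N]
    high_l1 = sum(abs(kj) for kj in high)
    return high_l1 <= 1 or (high_l1 == 2 and abs(sum(high)) == 2)

def in_resonant_set(k, v, N, d, r):
    """True if v lies in the set R_k of (1.7), i.e. |<k, v>| < N^(-20 d r)."""
    inner = sum(kj * v[j] for j, kj in k.items())
    return abs(inner) < N ** (-20 * d * r)

# Toy example with d = 1, N = 2, r = 2.
k_one_high = {(0,): 1, (5,): -1}    # one high mode: allowed by (1.6)
k_opposite = {(3,): 1, (4,): -1}    # two high modes, opposite signs: excluded
k_same = {(3,): 1, (4,): 1}         # two high modes, same sign: allowed
```

The excluded case `k_opposite` corresponds to the terms with two high modes of opposite signs, for which no small-divisor bound is imposed.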

Let \(\epsilon =\epsilon _1+\epsilon _2\). We have

Theorem 1.1

Let \(r>1\), \(s-320 dr^2\ge 2s_{0}>d+1\). Then there is some \(\epsilon _*=\epsilon _*(r,s,s_0,d)>0\) such that, for \(0<\epsilon \le \epsilon _*\), there exists the nonresonant set \(\mathcal {R}^{c}\subset \mathcal {W}\) satisfying

$$\begin{aligned} {\text {mes}} ~(\mathcal {W}\setminus \mathcal {R}^{c}) \le \epsilon ^{\frac{2r}{s-2s_{0}}} \end{aligned}$$

with the following properties: For \({\varvec{v}}\in \mathcal {R}^{c}\), if the initial state q(0) satisfies \(\left\| q(0)\right\| _{s}\le \epsilon \), then

$$\begin{aligned} \left\| \widetilde{q}(t)\right\| _{\frac{s}{2}} \le 4\epsilon ,\ \left\| \widehat{q}(t)\right\| _{s_{0}}\le 4\epsilon ^{r+1} \end{aligned}$$

for any \(t\le \epsilon ^{-\frac{r}{2s_{0}+1}}\) with \(\mathcal N=\epsilon ^{-\frac{2r}{s-2s_{0}}}\) in the truncation \(\widetilde{q}(t)\).
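The exponents of \(\epsilon \) in Theorem 1.1 are easy to tabulate exactly. The following sketch, with a hypothetical function name and integer inputs assumed, returns them as fractions and also records the loss \(\mathcal N^{20dr}\) coming from the small-divisor bound (1.5).

```python
from fractions import Fraction

def theorem_exponents(r, s, s0, d):
    """Exponents of epsilon in Theorem 1.1 (integer r, s, s0, d assumed)."""
    if not (r > 1 and s - 320 * d * r**2 >= 2 * s0 > d + 1):
        raise ValueError("hypotheses of Theorem 1.1 are not satisfied")
    n_exp = Fraction(2 * r, s - 2 * s0)
    return {
        "N": -n_exp,                           # N = eps^(-2r/(s-2s0))
        "measure": n_exp,                      # mes(W \ R^c) <= eps^(2r/(s-2s0))
        "time": -Fraction(r, 2 * s0 + 1),      # t <= eps^(-r/(2s0+1))
        "small_divisor": -20 * d * r * n_exp,  # N^(20dr) = eps^(this value)
    }

example = theorem_exponents(r=2, s=1284, s0=2, d=1)
```

In the borderline case \(s-2s_0=320dr^2\) used above, the small-divisor exponent equals \(-\frac{1}{8}\); for larger \(s\) it is closer to zero.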

Remark 1.1

In particular, the above result holds for q(0) with compact support, so for such initial data long-time Anderson localization occurs. Moreover, Faou–Grébert [13] proved a subexponentially long stability time for the d-dimensional NLS in the analytic space by constructing a partial normal form, in which the terms with three or more high modes cannot be eliminated and are not small. In order to make these terms small, different weights have to be chosen for q(0) and q(t). Similarly, in this paper one cannot eliminate the terms with two high modes of opposite signs; to make these terms small, we have to choose different weights s, \(\frac{s}{2}\) and \(s_{0}\) for q(0), \(\widetilde{q}(t)\) and \(\widehat{q}(t)\). By contrast, the terms with three or more high modes, or with two high modes of opposite signs, are all eliminated in [7], which means that the desired long-time stability results can be achieved there by choosing the same weight for q(0) and q(t).

Remark 1.2

We note that the nonresonant set \(\mathcal {R}^c\) of random variables does not depend on the choice of the initial data q(0).

Remark 1.3

In contrast, we would like to mention the remarkable work of Bourgain–Wang [10], in which quasi-periodic solutions were constructed for the multi-dimensional DRNLS in the case of both large disorder and weak nonlinearity. Their proof is based on the deep Craig–Wayne–Bourgain (CWB) method together with delicate spectral properties of the Anderson model. Very recently, Shi–Wang [22, 23] generalized the result of [10] to the multi-dimensional nonlinear quasi-periodic Schrödinger and wave equations by introducing new ideas of Diophantine approximation on manifolds [18] and Bourgain’s geometric lemma [8]. A Birkhoff normal form type method for proving long-time Anderson localization was developed in the more recent work of Cong–Shi–Wang [11], which deals with the multi-dimensional nonlinear quasi-periodic NLS.

1.1 Previous Related Works

In the investigation of localization for nonlinear discrete models, Fröhlich–Spencer–Wayne [17] first showed that, with high probability and for weak nonlinearity, any sup-exponentially localized initial state always stays on a full-dimensional KAM torus. Their proof is based on an extension of the KAM techniques. Later, Benettin–Fröhlich–Giorgilli [3] used the Birkhoff normal form method to prove that, if the initial state is small and polynomially localized, the propagation remains localized over a very long time scale for some d-dimensional discrete nonlinear oscillation equation with i.i.d. Gaussian random potential. We would like to mention that Yuan [25] first applied the traditional KAM method based on normal form transformations to establish the existence of quasi-periodic solutions for some discrete nonlinear model in the absence of randomness. The models of [3, 17, 25] do not contain the Laplacian term (namely, the hopping term \(\epsilon _1\sum _{|i-j|_{1}=1}q_{i}\) in (1.1)), which is the main obstacle to achieving localization for (1.1) via KAM or normal form techniques. One of the main reasons may be that the discrete Laplacian has purely absolutely continuous spectrum and thus exhibits delocalization. In this sense, Bourgain–Wang [10] made the first breakthrough and proved the existence of quasi-periodic solutions for some multi-dimensional DRNLS by using the CWB method based on the Lyapunov–Schmidt decomposition, the Nash–Moser iteration, the multi-scale analysis of [16] and the semi-algebraic geometry theory of [6]. For long-time localization, the most important result applying to very rough initial states is due to Wang–Zhang [24], who proved the first “truly” long-time localization for the 1-dimensional DRNLS by using the Birkhoff normal form type method of Bourgain–Wang [9]. It is well known that Anderson localization holds for the 1-dimensional Anderson model with arbitrary non-zero disorder. Along this line, Fishman–Krivolapov–Soffer [14] obtained \( O(\epsilon _2^{-2})\) long-time localization (concerning the second moment of the norm) under only a weak nonlinearity assumption. Liu–Wang [19] proved the existence of quasi-periodic solutions for the 1-dimensional DRNLS without requiring the large disorder condition, and the proof also works for multi-dimensional DRNLS in the regime of localization. Some results of [14] have been improved to a time scale of \(O(\epsilon _2^{-A})\) for any \(A\ge 2\) [15], but the proof is only partly rigorous: in some parts it relies on conjectures tested numerically. Very recently, the result of [24] has been generalized to an exponential time scale by Cong–Shi–Zhang [12].

1.2 The Strategy of Proof

While we employ Birkhoff normal form type methods in this paper, the technical details are different from those of [24], where a finite barrier is constructed to prevent the transfer of energy between Fourier modes. Instead, our method here is inspired by [5] in the PDE case, so all Fourier modes are involved. In fact, we also obtain a partial normal form around the origin. Compared with [5], one cannot eliminate the terms with two high modes of opposite signs, which are generated by the discrete Laplacian [see \({\textbf {W}}\) in (2.11)]. Fortunately, we can use some ideas of Bernier–Faou–Grébert [4] and the mass conservation to deal with these remainder terms. Note that the frequency of the linear wave operator in [4] is unbounded. In contrast, the frequency considered in this paper is bounded, which leads to the need for a better estimate of the coefficients. As a result, the estimate (3.17) is of vital importance; together with the so-called short-range property of the DRNLS, it guarantees the nonresonant conditions.

1.3 Structure of the Paper

In Sect. 2, we introduce some notation and state the normal form theorem. In Sect. 3, we prove the normal form theorem. The proof of Theorem 1.1 is given in Sect. 4. The measure estimate concerning the nonresonant frequencies is established in Sect. 5. Two technical lemmas are given in Sect. 6.

2 Birkhoff Normal Form Theorem

2.1 Some Notations

In this subsection, we introduce some notation.

Define the scale of phase spaces

$$\mathcal {P}_{s}(\mathbb {C}){:=}\ell _{s}^{2}(\mathbb {Z}^d,\mathbb {C})\oplus \ell _{s}^{2}(\mathbb {Z}^d,\mathbb {C})\ni (q,\bar{q})= \left( \left( q_{j}\right) _{j\in \mathbb {Z}^d},\left( \bar{q}_{j}\right) _{j\in \mathbb {Z}^d}\right) ,$$

endowed with the standard symplectic structure \(-2{\text{ i }}\sum _{j\in \mathbb {Z}^d}{\text{ d }}q_{j}\wedge {\text{ d }}\bar{q}_{j}\), where the space \(\ell ^2_{s}(\mathbb {Z}^d,\mathbb {C})\) is defined in (1.2). Denote by \(B_{\mathbb {C},s}(\delta )\) the open ball centered at the origin and of radius \(\delta \) in \(\mathcal {P}_{s}(\mathbb {C}).\)

Let \(H: \mathcal {P}_{s}\rightarrow \mathbb {C} \) be a Hamiltonian function given by

$$\begin{aligned} H(q,\bar{q})=\sum _{{\varvec{m}}\in \mathbb {N}^{\mathbb {Z}^d}\times \mathbb {N}^{\mathbb {Z}^d}}H({\varvec{m}})q^{{\varvec{\beta }}}\bar{q}^{{\varvec{\gamma }}}, \end{aligned}$$
(2.1)

where \({\varvec{\beta }}=(\beta _{j})_{j \in \mathbb {Z}^d}\in \mathbb {N}^{\mathbb {Z}^d}\), \({\varvec{\gamma }}=(\gamma _{j})_{j \in \mathbb {Z}^d}\in \mathbb {N}^{\mathbb {Z}^d}\), \({\varvec{m}}=({\varvec{\beta }},{\varvec{\gamma }})\in \mathbb {N}^{\mathbb {Z}^d}\times \mathbb {N}^{\mathbb {Z}^d}\) and \(H({\varvec{m}})\) is the coefficient of the monomial

$$\begin{aligned} q^{{\varvec{\beta }}}\bar{q}^{{\varvec{\gamma }}}{:=}\prod _{j\in {\mathbb {Z}^d}}q_{j}^{\beta _{j}}\bar{q}_{j}^{\gamma _{j}}. \end{aligned}$$
(2.2)

Define the modulus and vector field of the Hamiltonian function H by

$$\begin{aligned} \Vert H\Vert = \sup _{{\varvec{m}}\in \mathbb {N}^{\mathbb {Z}^d}\times \mathbb {N}^{\mathbb {Z}^d}}|H({\varvec{m}})| \end{aligned}$$

and

$$\begin{aligned} X_{H}(q,\bar{q})=-2{\text{ i }}\Bigg (\frac{\partial H}{\partial \bar{q}}, -\frac{\partial H}{\partial q}\Bigg ) \end{aligned}$$

respectively. Next, we introduce the notions of support, diameter and degree:

$$\begin{aligned}&{\text {supp}} ({\varvec{\beta }})=\left\{ j \mid \beta _{j} \ne 0 \right\} ,\ {\text {supp}} ({\varvec{m}})= \left\{ j \mid \beta _{j} \ne 0\ {\text{ or }}\ \gamma _{j} \ne 0\right\} , \\&\Delta ({\varvec{\beta }})= {\text {diam}}\{{\text {supp }} ({\varvec{\beta }})\},\ \Delta ({\varvec{m}})=\Delta ({\varvec{\beta }}+{\varvec{\gamma }}),\\&|{\varvec{\beta }}|_1=\sum _{j \in {\text {supp}} ({\varvec{\beta }})}\beta _j,\ |{\varvec{m}}|_1= |{\varvec{\beta }}+{\varvec{\gamma }}|_1. \end{aligned}$$

If \(\beta _{j}=\gamma _{j}\) for all \(j \in {\text {supp}}({\varvec{m}})\), then the monomial (2.2) is called resonant; otherwise, it is called nonresonant. Moreover, in this paper we only consider Hamiltonian functions satisfying mass conservation, i.e., every monomial appearing in them satisfies \(|{\varvec{\beta }}|_1=|{\varvec{\gamma }}|_1\).
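These notions are straightforward to compute for a single monomial. In the sketch below a multi-index \({\varvec{m}}=({\varvec{\beta }},{\varvec{\gamma }})\) is stored as a pair of dictionaries mapping sites to positive powers; this representation, the function names and the use of the \(\ell ^1\) distance for the diameter are assumptions made for illustration.

```python
def supp(m):
    """supp(m): sites j with beta_j != 0 or gamma_j != 0."""
    beta, gamma = m
    return set(beta) | set(gamma)

def diameter(m):
    """Delta(m): diameter of supp(m), measured here in the l^1 distance (assumed)."""
    sites = list(supp(m))
    return max((sum(abs(a - b) for a, b in zip(i, j))
                for i in sites for j in sites), default=0)

def degree(m):
    """|m|_1 = |beta + gamma|_1."""
    beta, gamma = m
    return sum(beta.values()) + sum(gamma.values())

def is_resonant_monomial(m):
    """True if beta_j = gamma_j on supp(m)."""
    beta, gamma = m
    return all(beta.get(j, 0) == gamma.get(j, 0) for j in supp(m))

def conserves_mass(m):
    """True if |beta|_1 = |gamma|_1."""
    beta, gamma = m
    return sum(beta.values()) == sum(gamma.values())

hopping = ({(0, 0): 1}, {(1, 0): 1})   # the monomial q_{(0,0)} * conj(q_{(1,0)})
quartic = ({(2, 3): 2}, {(2, 3): 2})   # the monomial |q_{(2,3)}|^4
```

The first example is a nonresonant monomial of degree 2 and diameter 1; the second is a resonant monomial of degree 4 and diameter 0. Both conserve mass.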

Finally, for two Hamiltonian functions \(H(q,\bar{q})\) and \(F(q,\bar{q})\), we define their Poisson bracket by

$$\begin{aligned} \{H,F\}{:=}&-2{\text{ i }}\sum _{k\in \mathbb {Z}^d}\Bigg (\frac{\partial H}{\partial q_{k}}\frac{\partial F}{\partial \bar{q}_k}-\frac{\partial H}{\partial \bar{q}_k}\frac{\partial F}{\partial q_{k}}\Bigg ). \end{aligned}$$

Precisely, let

$$\begin{aligned} F(q,\bar{q})=\sum _{{\varvec{n}}\in \mathbb {N}^{\mathbb {Z}^d}\times \mathbb {N}^{\mathbb {Z}^d}}F({\varvec{n}})q^{{\varvec{\eta }}}\bar{q}^{{\varvec{\xi }}}, \end{aligned}$$

where \({\varvec{n}}=({\varvec{\eta }},{\varvec{\xi }})\). Thus, \(\{H,F\}\) can be rewritten as

$$\begin{aligned} \{H,F\}=&\sum _{{\varvec{\mu }}\in \mathbb {N}^{\mathbb {Z}^d}\times \mathbb {N}^{\mathbb {Z}^d}}\left\{ H,F\right\} ({\varvec{\mu }})q^{{\varvec{\alpha }}}\bar{q}^{{\varvec{\zeta }}}, \end{aligned}$$

where H is defined in (2.1), \({\varvec{\mu }}=({\varvec{\alpha }},{\varvec{\zeta }})\) and

$$\begin{aligned} \left\{ H,F\right\} ({\varvec{\mu }})=-2{\text{ i }}\sum _{k\in \mathbb {Z}^d}\left( \sum ^{*}_{{\varvec{m}},{\varvec{n}}\in \mathbb {N}^{\mathbb {Z}^d}\times \mathbb {N}^{\mathbb {Z}^d}}H({\varvec{m}})F({\varvec{n}})\left( \gamma _k \eta _k -\beta _k \xi _k\right) \right) \end{aligned}$$
(2.3)

where the sum \(\sum ^{*}\) is taken over those \({\varvec{m}},{\varvec{n}}\) satisfying

$$\begin{aligned}&\alpha _{j}=\beta _{j}+\eta _{j}-1,\quad \zeta _{j}=\gamma _{j}+\xi _{j}-1\quad {\text{ for }}\quad j=k,\\&\alpha _{j}=\beta _{j}+\eta _{j},\quad \zeta _{j}=\gamma _{j}+\xi _{j}\quad {\text{ for }}\quad j\ne k. \end{aligned}$$
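The coefficient formula (2.3) can be turned into a small routine for polynomial Hamiltonians with finitely many monomials. The sketch below follows (2.3) term by term as printed, including its sign convention; the data layout (dictionaries keyed by sorted tuples of site and power) and all function names are assumptions made for illustration, with sites taken in \(\mathbb {Z}\) for simplicity.

```python
from collections import defaultdict

def key(index):
    """Canonical hashable form of a multi-index given as dict site -> power."""
    return tuple(sorted((j, p) for j, p in index.items() if p))

def bracket_coefficients(H, F):
    """Coefficients {H,F}(mu) computed from (2.3).

    H and F map (key(beta), key(gamma)) to complex coefficients.
    """
    out = defaultdict(complex)
    for (bkey, gkey), h in H.items():
        beta, gamma = dict(bkey), dict(gkey)
        for (ekey, xkey), f in F.items():
            eta, xi = dict(ekey), dict(xkey)
            for k in set(beta) | set(gamma):
                factor = gamma.get(k, 0) * eta.get(k, 0) - beta.get(k, 0) * xi.get(k, 0)
                if factor == 0:
                    continue
                alpha = defaultdict(int, beta)
                zeta = defaultdict(int, gamma)
                for j, p in eta.items():
                    alpha[j] += p
                for j, p in xi.items():
                    zeta[j] += p
                alpha[k] -= 1   # alpha_k = beta_k + eta_k - 1
                zeta[k] -= 1    # zeta_k = gamma_k + xi_k - 1
                out[(key(alpha), key(zeta))] += -2j * h * f * factor
    return {mu: c for mu, c in out.items() if c != 0}

v = {0: 0.25, 1: 0.75}
N_quad = {(key({j: 1}), key({j: 1})): v[j] / 2 for j in v}   # N from (2.5) on two sites
F_hop = {(key({0: 1}), key({1: 1})): 1.0}                    # q_0 * conj(q_1)
result = bracket_coefficients(N_quad, F_hop)
```

In this example the bracket of the quadratic part with a single monomial is a multiple of the same monomial, and each pairing lowers the total degree by two, in agreement with the index rule stated for \(\sum ^{*}\).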

For simplicity, we often write

$$\begin{aligned} \mathcal {P}_{s}\equiv \mathcal {P}_{s}(\mathbb {C}),\quad B_{s}(\delta )\equiv B_{\mathbb {C},s}(\delta ) \quad {\text{ and }}\quad H\equiv H(q,\bar{q}). \end{aligned}$$

2.2 The Birkhoff Normal Form Theorem

In order to prove Theorem 1.1, we will construct a partial normal form in the case that the frequency \({\varvec{v}}\) is nonresonant.

Next, we give the following Birkhoff normal form theorem.

Theorem 2.1

Consider the Hamiltonian function

$$\begin{aligned} H(q, \bar{q})=N(q, \bar{q})+P(q, \bar{q})+Z(q, \bar{q}) \end{aligned}$$
(2.4)

where

$$\begin{aligned} N(q, \bar{q})= \frac{1}{2}\sum _{j \in \mathbb {Z}^d} v_{j}\left| q_{j}\right| ^{2}, \end{aligned}$$
(2.5)
$$\begin{aligned} P(q, \bar{q})=\sum _{\begin{array}{c} {\varvec{m}}\in \mathbb {N}^{\mathbb {Z}^d}\times \mathbb {N}^{\mathbb {Z}^d}\\ |{\varvec{m}}|_1=2, \Delta ({\varvec{m}})=1\\ |{\varvec{\beta }}|_1=|{\varvec{\gamma }}|_1 \end{array}}P({\varvec{m}})q^{{\varvec{\beta }}}\bar{q}^{{\varvec{\gamma }}} \end{aligned}$$
(2.6)

and

$$\begin{aligned} Z(q, \bar{q})=\sum _{\begin{array}{c} {\varvec{m}}\in \mathbb {N}^{\mathbb {Z}^d}\times \mathbb {N}^{\mathbb {Z}^d}\\ |{\varvec{m}}|_1=4, \Delta ({\varvec{m}})=0\\ {\varvec{\beta }}={\varvec{\gamma }} \end{array}}Z({\varvec{m}})q^{{\varvec{\beta }}}\bar{q}^{{\varvec{\gamma }}}. \end{aligned}$$

Assume that P and Z satisfy

$$\begin{aligned} |P({\varvec{m}})|,|Z({\varvec{m}})|\le \frac{1}{2} \epsilon ^{\frac{1}{4}(\Delta ({\varvec{m}})+|{\varvec{m}}|_1-1)},\quad \forall {\varvec{m}}\in \mathbb {N}^{\mathbb {Z}^d}\times \mathbb {N}^{\mathbb {Z}^d} \end{aligned}$$
(2.7)

and

$$\begin{aligned} \Vert P\Vert \le \frac{\epsilon }{2},\quad \Vert Z\Vert \le \frac{\epsilon }{4}. \end{aligned}$$
(2.8)

Given any \(r>1\) and \(s-320 dr^2\ge 2s_{0}>d+1\), there exists \(\epsilon _*=\epsilon _*(r,s,s_0,d)>0\) such that, for each \(0<\epsilon <\epsilon _*\) and each \({\varvec{v}}\in \mathcal {R}^{c}\) with

$$\begin{aligned} \mathcal N=\epsilon ^{-\frac{2r}{s-2s_0}}, \end{aligned}$$
(2.9)

there exists a symplectic map

$$\begin{aligned} \Phi : B_{s}(4\epsilon )\rightarrow B_{s}(5\epsilon ), \end{aligned}$$

such that

$$\begin{aligned} H\circ \Phi = N+{\textbf {W}}+{\textbf {V}}+{\textbf {Z}}+{\textbf {R}}, \end{aligned}$$
(2.10)

where

$$\begin{aligned} {\textbf {W}}= \sum _{\begin{array}{c} {\varvec{m}}\in \mathbb {N}^{\mathbb {Z}^d}\times \mathbb {N}^{\mathbb {Z}^d}, |{\varvec{m}}|_1\le 4r+1\\ \left| \widehat{{\varvec{m}}}\right| _1=2, \left| \widehat{{\varvec{\beta }}}\right| _1=\left| \widehat{{\varvec{\gamma }}}\right| _1=1 \end{array}} {\textbf {W}}({\varvec{m}})q^{{\varvec{\beta }}}\bar{q}^{{\varvec{\gamma }}}, \end{aligned}$$
(2.11)
$$\begin{aligned} {\textbf {V}} =\sum _{\begin{array}{c} {\varvec{m}}\in \mathbb {N}^{\mathbb {Z}^d}\times \mathbb {N}^{\mathbb {Z}^d}, |{\varvec{m}}|_1\le 4r+1\\ \left| \widehat{{\varvec{m}}}\right| _1>2 \end{array}} {\textbf {V}}({\varvec{m}})q^{{\varvec{\beta }}}\bar{q}^{{\varvec{\gamma }}}, \end{aligned}$$
(2.12)
$$\begin{aligned} {\textbf {Z}} =\sum _{\begin{array}{c} {\varvec{m}}\in \mathbb {N}^{\mathbb {Z}^d}\times \mathbb {N}^{\mathbb {Z}^d}\\ {\varvec{\beta }}={\varvec{\gamma }} \end{array}} {\textbf {Z}}({\varvec{m}})q^{{\varvec{\beta }}}\bar{q}^{{\varvec{\gamma }}} \end{aligned}$$

and

$$\begin{aligned} {\textbf {R}}=\mathcal {O}\left( \epsilon ^{r+1}\right) . \end{aligned}$$

Moreover, the following estimates hold:

  1. (i)

    the symplectic map \(\Phi \) fulfills

    $$\begin{aligned} \sup _{q\in B_s(4\epsilon )}\left\| q-\Phi (q)\right\| _{s}\le \epsilon ^{\frac{3}{2}}; \end{aligned}$$
    (2.13)
  2. (ii)

    the Hamiltonian functions \({\textbf {W}}\) and \({\textbf {V}}\) satisfy

    $$\begin{aligned} |{\textbf {W}}({\varvec{m}})|, |{\textbf {V}}({\varvec{m}})|\le \epsilon ^{\frac{1}{4}(\Delta ({\varvec{m}})+|{\varvec{m}}|_1-1)},\quad \forall {\varvec{m}}\in \mathbb {N}^{\mathbb {Z}^d}\times \mathbb {N}^{\mathbb {Z}^d} \end{aligned}$$
    (2.14)

    and

    $$\begin{aligned} \Vert {\textbf {W}}\Vert , \Vert {\textbf {V}}\Vert \le C(r)\epsilon ; \end{aligned}$$
    (2.15)
  3. (iii)

    the Hamiltonian function \({\textbf {R}}\) satisfies

    $$\begin{aligned} \left\| {\textbf {R}}\right\| \le C(r)\epsilon ^{r+1}\mathcal N^{20dr^2}, \end{aligned}$$
    (2.16)

    where C(r) is a constant depending on r.

Proof

The proof of Theorem 2.1 is given in Sect. 3. \(\square \)
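As a consistency check on the hypotheses of Theorem 2.1, one can write (1.1) in the form (2.4). The normalization below is a reconstruction from the vector field convention \(X_H\) of Sect. 2.1 (so that \(\dot{q}_j=-2{\text{ i }}\,\partial H/\partial \bar{q}_j\)) and is not quoted from the paper:

$$\begin{aligned} H(q,\bar{q})=\frac{1}{2}\sum _{j\in \mathbb {Z}^d}v_j|q_j|^2+\frac{\epsilon _1}{2}\sum _{j\in \mathbb {Z}^d}\sum _{|i-j|_1=1}q_i\bar{q}_j+\frac{\epsilon _2}{4}\sum _{j\in \mathbb {Z}^d}|q_j|^4. \end{aligned}$$

Indeed, \(-2{\text{ i }}\,\partial H/\partial \bar{q}_j=-{\text{ i }}\big (v_jq_j+\epsilon _1\sum _{|i-j|_1=1}q_i+\epsilon _2|q_j|^2q_j\big )\), which is (1.1). With this choice the nonzero coefficients are \(P({\varvec{m}})=\frac{\epsilon _1}{2}\) (with \(|{\varvec{m}}|_1=2\), \(\Delta ({\varvec{m}})=1\)) and \(Z({\varvec{m}})=\frac{\epsilon _2}{4}\) (with \(|{\varvec{m}}|_1=4\), \(\Delta ({\varvec{m}})=0\)), so that (2.8) holds with \(\epsilon =\epsilon _1+\epsilon _2\). For \(0<\epsilon \le 1\), (2.7) follows as well, since

$$\begin{aligned} \frac{\epsilon _1}{2}\le \frac{\epsilon }{2}\le \frac{1}{2}\epsilon ^{\frac{1}{4}(1+2-1)}=\frac{1}{2}\epsilon ^{\frac{1}{2}},\qquad \frac{\epsilon _2}{4}\le \frac{\epsilon }{4}\le \frac{1}{2}\epsilon ^{\frac{1}{4}(0+4-1)}=\frac{1}{2}\epsilon ^{\frac{3}{4}}. \end{aligned}$$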

3 The Proof of Birkhoff Normal Form Theorem

In this section, we will prove Theorem 2.1 by a finite step induction.

3.1 The First Step of Birkhoff Normal Form

At the first step \(h=1\), in view of (2.4), we rewrite H as

$$\begin{aligned} H_{1}=N+P_1+Z_1, \end{aligned}$$

where

$$\begin{aligned} P_1=P\quad {\text{ and }}\quad Z_1=Z. \end{aligned}$$
(3.1)

Let

$$\begin{aligned} R_1=&\sum _{\begin{array}{c} {\varvec{m}}\in \mathbb {N}^{\mathbb {Z}^d}\times \mathbb {N}^{\mathbb {Z}^d}\\ |{\varvec{m}}|_1=2, \Delta ({\varvec{m}})=1\\ {\varvec{m}}\in \mathcal {M} \end{array}}P_1({\varvec{m}})q^{{\varvec{\beta }}}\bar{q}^{{\varvec{\gamma }}}, \end{aligned}$$
(3.2)

where

$$\begin{aligned} \mathcal {M}=\left\{ {\varvec{m}}\in \mathbb {N}^{\mathbb {Z}^d}\times \mathbb {N}^{\mathbb {Z}^d}\bigg |\left| \widehat{{\varvec{m}}}\right| _1\le 1\ {\text{ or }}\ \left| \widehat{{\varvec{m}}}\right| _1=\left| \widehat{{\varvec{\beta }}}\right| _1=2\ {\text{ or }}\ \left| \widehat{{\varvec{m}}}\right| _1=\left| \widehat{{\varvec{\gamma }}}\right| _1=2\right\} . \end{aligned}$$

Next, we aim to eliminate the terms in \(R_1\). Rewrite \(R_1\) as

$$\begin{aligned} R_1 =\sum _{\begin{array}{c} {\varvec{m}}\in \mathbb {N}^{\mathbb {Z}^d}\times \mathbb {N}^{\mathbb {Z}^d} \end{array}}R_1({\varvec{m}})q^{{\varvec{\beta }}}\bar{q}^{{\varvec{\gamma }}}. \end{aligned}$$

Following the standard Birkhoff normal form approach, we define

$$\begin{aligned} H_{2}=H_{1} \circ \Phi ^1_{\chi _1}, \end{aligned}$$

where

$$\Phi ^1_{\chi _1}:B_{s}(\delta _1)\rightarrow B_{s}(\delta _2) $$

is the time-1 map of \(\chi _{1}\) with

$$\begin{aligned} \delta _1=4\epsilon , \end{aligned}$$
(3.3)
$$\begin{aligned} \chi _1=&\sum _{{\varvec{m}}\in \mathcal {M}}\frac{R_1({\varvec{m}})}{-2{\text{ i }}\langle {\varvec{k}}, {\varvec{v}}\rangle }q^{{\varvec{\beta }}}\bar{q}^{{\varvec{\gamma }}} \end{aligned}$$
(3.4)

and \({\varvec{k}}= {\varvec{\gamma }}-{\varvec{\beta }}\). Then in view of (2.7), it suffices to eliminate the terms satisfying

$$\begin{aligned} \Delta ({\varvec{m}})+|{\varvec{m}}|_1\le 4r+1. \end{aligned}$$
(3.5)

Since the frequency \({\varvec{v}}\in \mathcal {R}^{c}\) satisfies the \((r,\mathcal N)\)-nonresonant conditions (1.5), in view of (2.7), (2.8), (3.1), (3.2) and (3.4) we obtain

$$\begin{aligned} \left| \frac{R_1({\varvec{m}})}{-2{\text{ i }}\langle {\varvec{k}}, {\varvec{v}}\rangle }\right| \le \frac{1}{2}\epsilon ^{\frac{1}{4}(\Delta ({\varvec{m}})+|{\varvec{m}}|_1-1)}\mathcal N^{20dr},\quad \forall {\varvec{m}}\in \mathbb {N}^{\mathbb {Z}^d}\times \mathbb {N}^{\mathbb {Z}^d} \end{aligned}$$

and

$$\begin{aligned} \Vert \chi _1\Vert \le \frac{1}{2}\epsilon \mathcal N^{20dr}. \end{aligned}$$
(3.6)
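For orientation, the loss \(\mathcal N^{20dr}\) in (3.6) can be quantified using only the choice (2.9) of \(\mathcal N\) and the hypothesis \(s-320dr^2\ge 2s_0\) of Theorem 2.1. For \(0<\epsilon <1\),

$$\begin{aligned} \mathcal N^{20dr}=\epsilon ^{-\frac{40dr^2}{s-2s_0}}\le \epsilon ^{-\frac{40dr^2}{320dr^2}}=\epsilon ^{-\frac{1}{8}},\qquad {\text{ hence }}\qquad \Vert \chi _1\Vert \le \frac{1}{2}\epsilon \mathcal N^{20dr}\le \frac{1}{2}\epsilon ^{\frac{7}{8}}. \end{aligned}$$

This short computation is a sketch added for the reader; it shows that the small divisors cost at most a factor \(\epsilon ^{-\frac{1}{8}}\) per step.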

Moreover, in view of (2.9), (3.6) and Lemma 6.1, we have

$$\begin{aligned} \sup _{q\in B_s(4\epsilon )}\left\| q-\Phi ^1_{\chi _1}(q)\right\| _{s}\le \epsilon ^{\frac{7}{4}}. \end{aligned}$$
(3.7)

By (3.3) and (3.7), we have the radius

$$\begin{aligned} \delta _2=\delta _1+\epsilon ^{\frac{7}{4}}=4\epsilon +\epsilon ^{\frac{7}{4}}. \end{aligned}$$

Employing the Taylor series expansion, we get

$$\begin{aligned} H_{2}= H_{1} \circ \Phi ^1_{\chi _1} =&\ N+\left\{ N, \chi _1\right\} +P_1+Z_1 \nonumber \\&+\sum _{n=2}^{r-1} \frac{1}{n!} N^{(n)} +\sum _{n=1}^{r-1} \frac{1}{n!} P_1^{(n)} +\sum _{n=1}^{r-1} \frac{1}{n!} Z_1^{(n)}+{\textbf {R}}_{2,r}, \end{aligned}$$

where

$$\begin{aligned} {\textbf {R}}_{2,r}= \sum _{n=r}^{\infty } \frac{1}{n!} N^{(n)} +\sum _{n=r}^{\infty } \frac{1}{n!} P_1^{(n)} +\sum _{n=r}^{\infty } \frac{1}{n!} Z_1^{(n)} \end{aligned}$$
(3.8)

and

$$\begin{aligned} \mathcal {Y}^{(0)}=\mathcal {Y}, \quad \mathcal {Y}^{(n)}=\left\{ \mathcal {Y}^{(n-1)},\chi _1\right\} \end{aligned}$$
(3.9)

with \(n\ge 1\), \(\mathcal {Y}=N,P_1,Z_1\).

Note that \(\chi _1\) solves the so-called homological equation

$$\begin{aligned} R_1+\left\{ N, \chi _{1}\right\} =0. \end{aligned}$$

Now we turn to estimate the Poisson brackets \(\mathcal {Y}^{(n)}\). Rewrite \(\mathcal {Y}^{(n)}\) as

$$\begin{aligned}&\mathcal {Y}^{(n)}=\sum _{{\varvec{\mu }}\in \mathbb {N}^{\mathbb {Z}^d}\times \mathbb {N}^{\mathbb {Z}^d}} \mathcal {Y}^{(n)}({\varvec{\mu }}) q^{{\varvec{\alpha }}}\bar{q}^{{\varvec{\zeta }}}. \end{aligned}$$

In fact, we only need to estimate the Poisson brackets \(P_1^{(n)}\). We first estimate the Poisson bracket \(P_1^{(1)}\). In view of (2.3), for any \({\varvec{\mu }}\in \mathbb {N}^{\mathbb {Z}^d}\times \mathbb {N}^{\mathbb {Z}^d}\), we know that

$$\begin{aligned} P_1^{(1)}({\varvec{\mu }}) =-2{\text{ i }}\sum _{k\in \mathbb {Z}^d}\left( \sum ^{*}_{{\varvec{m}},{\varvec{n}}\in \mathbb {N}^{\mathbb {Z}^d}\times \mathbb {N}^{\mathbb {Z}^d}}\frac{P_1({\varvec{m}})R_1({\varvec{n}})}{-2{\text{ i }}\langle {\varvec{k}}, {\varvec{v}}\rangle }\left( \gamma _k \eta _k -\beta _k \xi _k\right) \right) . \end{aligned}$$
(3.10)

Moreover, we note that the monomials corresponding to multi-index \({\varvec{\mu }}\) satisfy

$$\begin{aligned} \Delta ({\varvec{\mu }}) \le \Delta ({\varvec{m}})+\Delta ({\varvec{n}}), \ |{\varvec{\mu }}|_1 =|{\varvec{m}}|_1+|{\varvec{n}}|_1-2 \end{aligned}$$
(3.11)

and the number of realizations of a fixed monomial \(q^{{\varvec{\alpha }}}\bar{q}^{{\varvec{\zeta }}}\) is bounded by

$$\begin{aligned} 2^{|{\varvec{\mu }}|_1}(\Delta ({\varvec{m}}) \wedge \Delta ({\varvec{n}}))<2^{|{\varvec{\mu }}|_1}(\Delta ({\varvec{m}})+\Delta ({\varvec{n}})). \end{aligned}$$
(3.12)

Then we have

$$\begin{aligned} \ \left| P_1^{(1)}({\varvec{\mu }})\right| \le&\sum ^{*}_{{\varvec{m}},{\varvec{n}}\in \mathbb {N}^{\mathbb {Z}^d}\times \mathbb {N}^{\mathbb {Z}^d}}\left( |P_1({\varvec{m}})|\left| R_1({\varvec{n}})\right| \mathcal N^{20dr}\left( \sum _{k\in \mathbb {Z}^d}\left| \gamma _k \eta _k -\beta _k \xi _k\right| \right) \right) \\&\left( {\text{ in } \text{ view } \text{ of } \text{(1.5) } \text{ and } \text{(3.10) }}\right) \\ \le&\sum ^{*}_{{\varvec{m}},{\varvec{n}}\in \mathbb {N}^{\mathbb {Z}^d}\times \mathbb {N}^{\mathbb {Z}^d}}\epsilon ^{\frac{1}{4}(\Delta ({\varvec{m}})+|{\varvec{m}}|_1+\Delta ({\varvec{n}})+|{\varvec{n}}|_1-2)} \mathcal N^{20dr}(|{\varvec{m}}|_1+|{\varvec{n}}|_1)^2\\&\left( {\text{ in } \text{ view } \text{ of } \text{(2.7) } \text{ and } \text{(3.1) }} \right) \\ \le&\sum ^{*}_{{\varvec{m}},{\varvec{n}}\in \mathbb {N}^{\mathbb {Z}^d}\times \mathbb {N}^{\mathbb {Z}^d}}C(r)\epsilon ^{\frac{1}{8}}\epsilon ^{\frac{1}{4}\left( \Delta ({\varvec{\mu }})+|{\varvec{\mu }}|_1-1\right) }\\&\left( {\text{ in } \text{ view } \text{ of } \text{(3.5), } \text{(3.11) } \text{ and } } \epsilon ^{\frac{1}{8}}\mathcal N^{20dr}\le 1 \right) \\ \le&C(r)2^{|{\varvec{\mu }}|_1}(\Delta ({\varvec{m}})+\Delta ({\varvec{n}}))\epsilon ^{\frac{1}{8}}\epsilon ^{\frac{1}{4}\left( \Delta ({\varvec{\mu }})+|{\varvec{\mu }}|_1-1\right) }\quad \left( {\text{ in } \text{ view } \text{ of } \text{(3.12) }} \right) \\ \le&\epsilon ^{\frac{1}{4}\left( \Delta ({\varvec{\mu }})+|{\varvec{\mu }}|_1-1\right) }\\&\left( {\text{ in } \text{ view } \text{ of } } 2^{|{\varvec{\mu }}|_1}(\Delta ({\varvec{m}})+\Delta ({\varvec{n}}))\le C(r) { \text{ and } } C(r)\epsilon ^{\frac{1}{8}}\le 1 \right) , \end{aligned}$$

where C(r) is a constant depending on r. Similarly for \(2\le n\le r-1\), by induction we obtain

$$\begin{aligned} \left| \mathcal {Y}^{(n)}({\varvec{\mu }})\right| \le \epsilon ^{\frac{1}{4}\left( \Delta ({\varvec{\mu }})+|{\varvec{\mu }}|_1-1\right) },\quad \forall {\varvec{\mu }}\in \mathbb {N}^{\mathbb {Z}^d}\times \mathbb {N}^{\mathbb {Z}^d}. \end{aligned}$$
(3.13)
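The passage from the second to the third line in the estimate of \(P_1^{(1)}({\varvec{\mu }})\) rests on a short exponent count, spelled out here as a sketch. By (3.11),

$$\begin{aligned} \Delta ({\varvec{m}})+|{\varvec{m}}|_1+\Delta ({\varvec{n}})+|{\varvec{n}}|_1-2\ge \Delta ({\varvec{\mu }})+|{\varvec{\mu }}|_1=\left( \Delta ({\varvec{\mu }})+|{\varvec{\mu }}|_1-1\right) +1, \end{aligned}$$

so that, for \(0<\epsilon <1\) and using \(\epsilon ^{\frac{1}{8}}\mathcal N^{20dr}\le 1\),

$$\begin{aligned} \epsilon ^{\frac{1}{4}(\Delta ({\varvec{m}})+|{\varvec{m}}|_1+\Delta ({\varvec{n}})+|{\varvec{n}}|_1-2)}\mathcal N^{20dr}\le \epsilon ^{\frac{1}{4}}\mathcal N^{20dr}\,\epsilon ^{\frac{1}{4}\left( \Delta ({\varvec{\mu }})+|{\varvec{\mu }}|_1-1\right) }\le \epsilon ^{\frac{1}{8}}\epsilon ^{\frac{1}{4}\left( \Delta ({\varvec{\mu }})+|{\varvec{\mu }}|_1-1\right) }. \end{aligned}$$

The polynomial factor \((|{\varvec{m}}|_1+|{\varvec{n}}|_1)^2\) is presumably absorbed into C(r) by means of (3.5). The spare factor \(\epsilon ^{\frac{1}{8}}\) is what absorbs C(r) in the last line of that estimate.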

Furthermore, for any \(1\le n\le r-1\) one also has

$$\begin{aligned} \left\| \mathcal {Y}^{(n)}\right\| \le&C(r)\epsilon ^{n+1}\mathcal N^{20dnr}. \end{aligned}$$
(3.14)

Finally we can rewrite \(H_2\) as

$$\begin{aligned} H_2 =\ N+P_{2}+{\textbf {Z}}_2+{\textbf {R}}_{2,r}, \end{aligned}$$

where

$$\begin{aligned} P_2=&P_1-R_1+\sum _{n=2}^{r-1} \frac{1}{n!} N^{(n)} +\sum _{n=1}^{r-1} \frac{1}{n!} P_1^{(n)} +\sum _{n=1}^{r-1} \frac{1}{n!} Z_1^{(n)} \end{aligned}$$
(3.15)

and

$$\begin{aligned} {\textbf {Z}}_2=Z_1. \end{aligned}$$
(3.16)

In view of (2.7), (2.8), (3.1), (3.2), (3.8), (3.13)–(3.16), the Hamiltonian functions \(P_{2}\), \({\textbf {Z}}_2\) and \({\textbf {R}}_{2,r}\) satisfy

$$\begin{aligned} \left| P_{2}({\varvec{\mu }})\right| ,\left| {\textbf {Z}}_{2}({\varvec{\mu }})\right| \le \epsilon ^{\frac{1}{4}\left( \Delta ({\varvec{\mu }})+|{\varvec{\mu }}|_1-1\right) }, \quad \forall {\varvec{\mu }}\in \mathbb {N}^{\mathbb {Z}^d}\times \mathbb {N}^{\mathbb {Z}^d}, \end{aligned}$$
$$\begin{aligned} \left\| P_{2}\right\| , \left\| {\textbf {Z}}_{2}\right\| \le C(r)\epsilon \end{aligned}$$

and

$$\begin{aligned} \left\| {\textbf {R}}_{2,r}\right\| \le&C(r)\epsilon ^{r+1}\mathcal N^{20dr^2}. \end{aligned}$$

3.2 Iterative Lemma

Now, we turn to the general iteration steps.

Lemma 3.1

(Iterative lemma) For \(2\le h\le r\), consider the Hamiltonian function

$$\begin{aligned} H_h =\ N+P_{h}+{\textbf {Z}}_h+{\textbf {R}}_{h,r}, \end{aligned}$$

where

$$\begin{aligned} P_h=\sum _{\begin{array}{c} {\varvec{m}}\in \mathbb {N}^{\mathbb {Z}^d}\times \mathbb {N}^{\mathbb {Z}^d} \end{array}}P_{h}({\varvec{m}})q^{{\varvec{\beta }}}\bar{q}^{{\varvec{\gamma }}}, \end{aligned}$$
$$\begin{aligned} {\textbf {Z}}_{h}= \sum _{\begin{array}{c} {\varvec{m}}\in \mathbb {N}^{\mathbb {Z}^d}\times \mathbb {N}^{\mathbb {Z}^d}\\ {\varvec{\beta }}={\varvec{\gamma }} \end{array}} {\textbf {Z}}_{h}({\varvec{m}})q^{{\varvec{\beta }}}\bar{q}^{{\varvec{\gamma }}} \end{aligned}$$

and

$$\begin{aligned} {\textbf {R}}_{h,r}=\mathcal {O}\left( \epsilon ^{r+1}\right) . \end{aligned}$$

Suppose that the Hamiltonian functions \(P_h\), \({\textbf {Z}}_{h}\) and \({\textbf {R}}_{h,r}\) satisfy the following estimates,

$$\begin{aligned} \left| P_{h}({\varvec{m}})\right| , \left| {\textbf {Z}}_h({\varvec{m}})\right| \le \epsilon ^{\frac{1}{4}\left( \Delta ({\varvec{m}})+|{\varvec{m}}|_1-1\right) },\quad \forall {\varvec{m}}\in \mathbb {N}^{\mathbb {Z}^d}\times \mathbb {N}^{\mathbb {Z}^d}, \end{aligned}$$
(3.17)
$$\begin{aligned} \left\| P_{h}\right\| , \left\| {\textbf {Z}}_h\right\| \le C(r)\epsilon \end{aligned}$$
(3.18)

and

$$\begin{aligned} \Vert {\textbf {R}}_{h,r}\Vert \le C(r)\epsilon ^{r+1}\mathcal N^{20dr^2}, \end{aligned}$$

where C(r) is a constant depending on r, and assume that

$$\begin{aligned} R_h=\sum _{\begin{array}{c} {\varvec{m}}\in \mathbb {N}^{\mathbb {Z}^d}\times \mathbb {N}^{\mathbb {Z}^d}\\ {\varvec{m}}\in \mathcal {M} \end{array}}P_{h}({\varvec{m}})q^{{\varvec{\beta }}}\bar{q}^{{\varvec{\gamma }}} \end{aligned}$$
(3.19)

satisfies

$$\begin{aligned} \Vert R_{h}\Vert \le C(r)\epsilon ^{h}\mathcal N^{20d(h-1)r}. \end{aligned}$$
(3.20)

Define the radius \(\delta _h\) by

$$\begin{aligned} \delta _h=4\epsilon +\sum ^{h-1}_{i=1}\epsilon ^{1+\frac{3i}{4}}. \end{aligned}$$

Then for each \({\varvec{v}}\in \mathcal {R}^{c}\), there exists a symplectic transformation

$$\Phi ^1_{\chi _h}:B_{s}(\delta _h)\rightarrow B_{s}(\delta _{h+1})$$

as the time-1 map of some Hamiltonian function \(\chi _h\) such that

$$\begin{aligned} H_{h+1}=H_{h} \circ \Phi ^1_{\chi _h} =\ N+P_{h+1}+{\textbf {Z}}_{h+1}+{\textbf {R}}_{h+1,r}, \end{aligned}$$

where

$$\begin{aligned} P_{h+1}=\sum _{\begin{array}{c} {\varvec{m}}\in \mathbb {N}^{\mathbb {Z}^d}\times \mathbb {N}^{\mathbb {Z}^d} \end{array}}P_{h+1}({\varvec{m}})q^{{\varvec{\beta }}}\bar{q}^{{\varvec{\gamma }}}, \end{aligned}$$
$$\begin{aligned} {\textbf {Z}}_{h+1}= \sum _{\begin{array}{c} {\varvec{m}}\in \mathbb {N}^{\mathbb {Z}^d}\times \mathbb {N}^{\mathbb {Z}^d}\\ {\varvec{\beta }}={\varvec{\gamma }} \end{array}} {\textbf {Z}}_{h+1}({\varvec{m}})q^{{\varvec{\beta }}}\bar{q}^{{\varvec{\gamma }}} \end{aligned}$$

and

$$\begin{aligned} {\textbf {R}}_{h+1,r}=\mathcal {O}\left( \epsilon ^{r+1}\right) . \end{aligned}$$

Moreover, the following estimates hold:

  1. (i)

    the symplectic transformation \(\Phi ^1_{\chi _h}\) fulfills

    $$\begin{aligned} \sup _{q\in B_s(\delta _h)}\left\| q-\Phi ^1_{\chi _h}(q)\right\| _{s}\le \epsilon ^{1+\frac{3}{4}h} \end{aligned}$$
    (3.21)

    and the radius \(\delta _{h+1}\) satisfies

    $$\begin{aligned} \delta _{h+1}\le 4\epsilon +\sum ^{h}_{i=1}\epsilon ^{1+\frac{3i}{4}}; \end{aligned}$$
  2. (ii)

    the Hamiltonian functions \(P_{h+1}\) and \({\textbf {Z}}_{h+1}\) satisfy

    $$\begin{aligned} |P_{h+1}({\varvec{m}})|, |{\textbf {Z}}_{h+1}({\varvec{m}})|\le \epsilon ^{\frac{1}{4}(\Delta ({\varvec{m}})+|{\varvec{m}}|_1-1)},\quad \forall {\varvec{m}}\in \mathbb {N}^{\mathbb {Z}^d}\times \mathbb {N}^{\mathbb {Z}^d} \end{aligned}$$
    (3.22)

    and

    $$\begin{aligned} \Vert P_{h+1}\Vert , \Vert {\textbf {Z}}_{h+1}\Vert \le C(r)\epsilon ; \end{aligned}$$
    (3.23)
  3. (iii)

    the Hamiltonian function \({\textbf {R}}_{h+1,r}\) satisfies

    $$\begin{aligned} \left\| {\textbf {R}}_{h+1,r}\right\| \le C(r)\epsilon ^{r+1}\mathcal N^{20dr^2}. \end{aligned}$$
    (3.24)

Proof

In view of (3.19), we rewrite \(R_{h}\) as

$$\begin{aligned} R_h=\sum _{\begin{array}{c} {\varvec{m}}\in \mathbb {N}^{\mathbb {Z}^d}\times \mathbb {N}^{\mathbb {Z}^d} \end{array}}R_{h}({\varvec{m}})q^{{\varvec{\beta }}}\bar{q}^{{\varvec{\gamma }}}. \end{aligned}$$
(3.25)

Following the standard Birkhoff normal form approach, we define

$$\begin{aligned} H_{h+1}=H_{h} \circ \Phi ^1_{\chi _h}, \end{aligned}$$

where \(\Phi ^1_{\chi _h}\) is the time-1 map of \(\chi _{h}\) with

$$\begin{aligned} \chi _h=&\sum _{{\varvec{m}}\in \mathcal {M}}\frac{R_h({\varvec{m}})}{-2{\text{ i }}\langle {\varvec{k}}, {\varvec{v}}\rangle }q^{{\varvec{\beta }}}\bar{q}^{{\varvec{\gamma }}}. \end{aligned}$$
(3.26)

Note that \({\varvec{k}}\) satisfies (3.5) and the frequency \({\varvec{v}}\) satisfies the \((r,\mathcal N)\)-nonresonant conditions (1.5). Thus, in view of (1.5), (3.17)–(3.20) and (3.26), and arguing as in the proofs of (3.6) and (3.7) respectively, we obtain

$$\begin{aligned} \Vert \chi _h\Vert \le C(r)\epsilon ^{h}\mathcal N^{20dhr} \end{aligned}$$

and the estimate (3.21). Furthermore one has

$$\begin{aligned} \delta _{h+1}=4\epsilon +\sum ^{h}_{i=1}\epsilon ^{1+\frac{3i}{4}}. \end{aligned}$$
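The bound on \(\chi _h\) and the radius bookkeeping can be made explicit. The following is only a sketch under two assumptions that are not restated in this section: that (1.5) bounds the divisors \(|\langle {\varvec{k}},{\varvec{v}}\rangle |\) from below by a quantity of order \(\mathcal N^{-20dr}\) (up to a constant depending on r), and that the norm \(\Vert \cdot \Vert \) is monotone in the moduli of the coefficients. Under these assumptions, (3.20) and (3.26) give

$$\begin{aligned} \Vert \chi _h\Vert \le \frac{\Vert R_h\Vert }{2\inf _{{\varvec{m}}\in \mathcal {M}}|\langle {\varvec{k}},{\varvec{v}}\rangle |} \le C(r)\epsilon ^{h}\mathcal N^{20d(h-1)r}\cdot \mathcal N^{20dr} =C(r)\epsilon ^{h}\mathcal N^{20dhr}, \end{aligned}$$

while the definition of \(\delta _h\) and (3.21) give, for every \(q\in B_s(\delta _h)\),

$$\begin{aligned} \left\| \Phi ^1_{\chi _h}(q)\right\| _{s}\le \Vert q\Vert _{s}+\left\| q-\Phi ^1_{\chi _h}(q)\right\| _{s} \le \delta _h+\epsilon ^{1+\frac{3h}{4}}=\delta _{h+1}. \end{aligned}$$

Hence the increment \(\delta _{h+1}-\delta _h=\epsilon ^{1+\frac{3h}{4}}\) is exactly the displacement allowed by (3.21).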

By Taylor series, we obtain

$$\begin{aligned} H_{h+1}= H_{h} \circ \Phi ^1_{\chi _h} =&N+\left\{ N, \chi _h\right\} +P_h+{\textbf {Z}}_h \nonumber \\&+\sum _{n=2}^{r-1} \frac{1}{n!} N^{(n)} +\sum _{n=1}^{r-1} \frac{1}{n!} P_h^{(n)} +\sum _{n=1}^{r-1} \frac{1}{n!} {\textbf {Z}}_h^{(n)}+{\textbf {R}}_{h+1,r}, \end{aligned}$$

where

$$\begin{aligned} {\textbf {R}}_{h+1,r}= \sum _{n=r}^{\infty } \frac{1}{n!} N^{(n)} +\sum _{n=r}^{\infty } \frac{1}{n!} P_h^{(n)} +\sum _{n=r}^{\infty } \frac{1}{n!} {\textbf {Z}}_h^{(n)} \end{aligned}$$
(3.27)

and here

$$\begin{aligned} \mathcal {Y}^{(0)}=\mathcal {Y}, \quad \mathcal {Y}^{(n)}=\left\{ \mathcal {Y}^{(n-1)},\chi _h\right\} \end{aligned}$$

with \(n\ge 1\), \(\mathcal {Y}=N,P_h,{\textbf {Z}}_h\).

For any \(1\le n\le r-1\), we can get

$$\begin{aligned} \left| \mathcal {Y}^{(n)}({\varvec{\mu }})\right| \le \epsilon ^{\frac{1}{4}\left( \Delta ({\varvec{\mu }})+|{\varvec{\mu }}|_1-1\right) },\quad \forall {\varvec{\mu }}\in \mathbb {N}^{\mathbb {Z}^d}\times \mathbb {N}^{\mathbb {Z}^d} \end{aligned}$$
(3.28)

and

$$\begin{aligned} \left\| \mathcal {Y}^{(n)}\right\| \le C(r)\epsilon ^{1+nh}\mathcal N^{20dnhr}, \end{aligned}$$
(3.29)

which follow as in the proofs of (3.13) and (3.14), respectively.

Finally, write

$$\begin{aligned} H_{h+1} =\ N+P_{h+1}+{\textbf {Z}}_{h+1}+{\textbf {R}}_{h+1,r}, \end{aligned}$$

where

$$\begin{aligned} P_{h+1}=&P_h-R_h-Z_h +\sum _{n=2}^{r-1} \frac{1}{n!} N^{(n)} +\sum _{n=1}^{r-1} \frac{1}{n!} P_h^{(n)} +\sum _{n=1}^{r-1} \frac{1}{n!} {\textbf {Z}}_h^{(n)} \end{aligned}$$
(3.30)

with

$$\begin{aligned} Z_{h}= \sum _{\begin{array}{c} {\varvec{m}}\in \mathbb {N}^{\mathbb {Z}^d}\times \mathbb {N}^{\mathbb {Z}^d}\\ {\varvec{\beta }}={\varvec{\gamma }} \end{array}}P_{h}({\varvec{m}})q^{{\varvec{\beta }}}\bar{q}^{{\varvec{\gamma }}} \end{aligned}$$
(3.31)

and

$$\begin{aligned} {\textbf {Z}}_{h+1}={\textbf {Z}}_{h}+Z_h. \end{aligned}$$
(3.32)

Rewrite \(P_{h+1}\) and \({\textbf {Z}}_{h+1}\) as

$$\begin{aligned} P_{h+1}=\sum _{\begin{array}{c} {\varvec{m}}\in \mathbb {N}^{\mathbb {Z}^d}\times \mathbb {N}^{\mathbb {Z}^d} \end{array}}P_{h+1}({\varvec{m}})q^{{\varvec{\beta }}}\bar{q}^{{\varvec{\gamma }}} \end{aligned}$$

and

$$\begin{aligned} {\textbf {Z}}_{h+1}= \sum _{\begin{array}{c} {\varvec{m}}\in \mathbb {N}^{\mathbb {Z}^d}\times \mathbb {N}^{\mathbb {Z}^d}\\ {\varvec{\beta }}={\varvec{\gamma }} \end{array}} {\textbf {Z}}_{h+1}({\varvec{m}})q^{{\varvec{\beta }}}\bar{q}^{{\varvec{\gamma }}}. \end{aligned}$$

In view of (3.17), (3.18), (3.25), (3.27)–(3.32), the Hamiltonian functions \(P_{h+1}\), \({\textbf {Z}}_{h+1}\) and \({\textbf {R}}_{h+1,r}\) satisfy (3.22)–(3.24). This completes the proof of Lemma 3.1. \(\square \)

3.3 The Proof of Theorem 2.1

Proof

We prove Theorem 2.1 by iterating Lemma 3.1. Let

$$\begin{aligned} \Phi =\Phi ^1_{\chi _1}\circ \Phi ^1_{\chi _2}\circ \cdots \circ \Phi ^1_{\chi _{r}}. \end{aligned}$$
(3.33)

Then, by Lemma 3.1, \(H_{r+1}: =H\circ \Phi \) has the form

$$\begin{aligned} H_{r+1} =\ N+P_{r+1}+{\textbf {Z}}_{r+1}+{\textbf {R}}_{r+1,r}, \end{aligned}$$

where

$$\begin{aligned} P_{r+1} =\sum _{\begin{array}{c} {\varvec{m}}\in \mathbb {N}^{\mathbb {Z}^d}\times \mathbb {N}^{\mathbb {Z}^d} \end{array}}P_{r+1}({\varvec{m}})q^{{\varvec{\beta }}}\bar{q}^{{\varvec{\gamma }}}, \end{aligned}$$
$$\begin{aligned} {\textbf {Z}}_{r+1} =\sum _{\begin{array}{c} {\varvec{m}}\in \mathbb {N}^{\mathbb {Z}^d}\times \mathbb {N}^{\mathbb {Z}^d}\\ {\varvec{\beta }}={\varvec{\gamma }} \end{array}} {\textbf {Z}}_{r+1}({\varvec{m}})q^{{\varvec{\beta }}}\bar{q}^{{\varvec{\gamma }}} \end{aligned}$$

and

$$\begin{aligned} {\textbf {R}}_{r+1,r} =\mathcal {O}\left( \epsilon ^{r+1}\right) . \end{aligned}$$

In fact, \(P_{r+1}\) can be rewritten as

$$\begin{aligned} P_{r+1} ={\textbf {W}}+{\textbf {V}}+Z_{r}+{\textbf {R}}_{r}, \end{aligned}$$

where

$$\begin{aligned} {\textbf {W}}= \sum _{\begin{array}{c} {\varvec{m}}\in \mathbb {N}^{\mathbb {Z}^d}\times \mathbb {N}^{\mathbb {Z}^d}, |{\varvec{m}}|_1\le 4r+1\\ \left| \widehat{{\varvec{m}}}\right| _1=2, \left| \widehat{{\varvec{\beta }}}\right| _1=\left| \widehat{{\varvec{\gamma }}}\right| _1=1 \end{array}}P_{r+1}({\varvec{m}})q^{{\varvec{\beta }}}\bar{q}^{{\varvec{\gamma }}}, \end{aligned}$$
$$\begin{aligned} {\textbf {V}} =\sum _{\begin{array}{c} {\varvec{m}}\in \mathbb {N}^{\mathbb {Z}^d}\times \mathbb {N}^{\mathbb {Z}^d}, |{\varvec{m}}|_1\le 4r+1\\ \left| \widehat{{\varvec{m}}}\right| _1>2 \end{array}}P_{r+1}({\varvec{m}})q^{{\varvec{\beta }}}\bar{q}^{{\varvec{\gamma }}}, \end{aligned}$$
$$\begin{aligned} Z_{r} =\sum _{\begin{array}{c} {\varvec{m}}\in \mathbb {N}^{\mathbb {Z}^d}\times \mathbb {N}^{\mathbb {Z}^d}\\ {\varvec{\beta }}={\varvec{\gamma }} \end{array}}P_{r+1}({\varvec{m}})q^{{\varvec{\beta }}}\bar{q}^{{\varvec{\gamma }}} \end{aligned}$$

and

$$\begin{aligned} {\textbf {R}}_{r}=\mathcal {O}\left( \epsilon ^{r+1}\right) \end{aligned}$$

with

$$\left\| {\textbf {R}}_{r}\right\| \le C(r)\epsilon ^{r+1}\mathcal N^{20dr^2}.$$

Finally, let

$${\textbf {Z}}={\textbf {Z}}_{r+1}+Z_{r}\quad {\text{ and }}\quad {\textbf {R}}={\textbf {R}}_{r+1,r}+{\textbf {R}}_{r}.$$

Then we obtain that

$$\begin{aligned} H_{r+1} =\ N+{\textbf {W}}+{\textbf {V}}+{\textbf {Z}}+{\textbf {R}}, \end{aligned}$$

where \({\textbf {W}}, {\textbf {V}}\) and \({\textbf {R}}\) satisfy (2.14)–(2.16) by using Lemma 3.1. Moreover, recalling the definition of \(\Phi \) [see (3.33)] and by the inequality (3.21) for \(h\ge 1\), we have

$$\begin{aligned} \sup _{q\in B_s(4\epsilon )}\left\| q-\Phi (q)\right\| _{s}\le \epsilon ^{\frac{3}{2}}. \end{aligned}$$
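The exponent \(\frac{3}{2}\) can be checked directly. As a sketch, assume that each factor \(\Phi ^1_{\chi _h}\) in (3.33) is evaluated on a ball where (3.21) applies, so that the displacements simply add up, and assume \(0<\epsilon \le \frac{1}{16}\) (an illustrative threshold, not one stated in the text). Then

$$\begin{aligned} \sup _{q\in B_s(4\epsilon )}\left\| q-\Phi (q)\right\| _{s} \le \sum _{h=1}^{r}\epsilon ^{1+\frac{3h}{4}} =\epsilon ^{\frac{7}{4}}\sum _{j=0}^{r-1}\epsilon ^{\frac{3j}{4}} \le \frac{\epsilon ^{\frac{7}{4}}}{1-\epsilon ^{\frac{3}{4}}} \le 2\epsilon ^{\frac{7}{4}} \le \epsilon ^{\frac{3}{2}}, \end{aligned}$$

where the last two steps use \(\epsilon ^{\frac{3}{4}}\le \frac{1}{2}\) and \(2\epsilon ^{\frac{1}{4}}\le 1\). The same sum shows \(\delta _{r+1}\le 4\epsilon +\epsilon ^{\frac{3}{2}}\le 5\epsilon \).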

\(\square \)

4 Proof of the Main Theorem

Firstly, we write (1.1) as a Hamiltonian equation

$$ {\text{ i }} \dot{q}_{j}=2 \frac{\partial H}{\partial \bar{q}_{j}}, $$

where

$$\begin{aligned} H(q, \bar{q})=\frac{1}{2}\sum _{j \in \mathbb {Z}^d} v_{j}\left| q_{j}\right| ^{2}+\frac{\epsilon }{2} \sum _{j \in \mathbb {Z}^d}\sum _{|i-j|_{1}=1}q_{i}\bar{q}_{j}+\frac{\epsilon }{4}\sum _{j \in \mathbb {Z}^d}\left| q_{j}\right| ^{4}. \end{aligned}$$
(4.1)

We rewrite H as

$$\begin{aligned} H=N+P+Z, \end{aligned}$$

where

$$\begin{aligned} N= \frac{1}{2}\sum _{j \in \mathbb {Z}^d} v_{j}\left| q_{j}\right| ^{2}, \end{aligned}$$
$$\begin{aligned} P=\sum _{\begin{array}{c} {\varvec{m}}\in \mathbb {N}^{\mathbb {Z}^d}\times \mathbb {N}^{\mathbb {Z}^d}\\ |{\varvec{m}}|_1=2, \Delta ({\varvec{m}})=1\\ |{\varvec{\beta }}|_1=|{\varvec{\gamma }}|_1 \end{array}}P({\varvec{m}})q^{{\varvec{\beta }}}\bar{q}^{{\varvec{\gamma }}} \end{aligned}$$
(4.2)

and

$$\begin{aligned} Z=\sum _{\begin{array}{c} {\varvec{m}}\in \mathbb {N}^{\mathbb {Z}^d}\times \mathbb {N}^{\mathbb {Z}^d}\\ |{\varvec{m}}|_1=4, \Delta ({\varvec{m}})=0\\ {\varvec{\beta }}={\varvec{\gamma }} \end{array}}Z({\varvec{m}})q^{{\varvec{\beta }}}\bar{q}^{{\varvec{\gamma }}} \end{aligned}$$
(4.3)

with

$$\begin{aligned} |P({\varvec{m}})|\le \frac{\epsilon }{2}\quad {\text{ and }}\quad |Z({\varvec{m}})|\le \frac{\epsilon }{4}. \end{aligned}$$
(4.4)

In view of (4.2)–(4.4), the Hamiltonian functions P and Z satisfy

$$\begin{aligned} |P({\varvec{m}})|,|Z({\varvec{m}})|\le \frac{1}{2} \epsilon ^{\frac{1}{4}(\Delta ({\varvec{m}})+|{\varvec{m}}|_1-1)},\quad \forall {\varvec{m}}\in \mathbb {N}^{\mathbb {Z}^d}\times \mathbb {N}^{\mathbb {Z}^d} \end{aligned}$$
(4.5)

and

$$\begin{aligned} \Vert P\Vert \le \frac{\epsilon }{2},\quad \Vert Z\Vert \le \frac{\epsilon }{4}. \end{aligned}$$
(4.6)
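For the reader's convenience, the passage from (4.4) to (4.5) is the following arithmetic, valid for \(0<\epsilon \le 1\). For the monomials of P one has \(|{\varvec{m}}|_1=2\) and \(\Delta ({\varvec{m}})=1\), and for those of Z one has \(|{\varvec{m}}|_1=4\) and \(\Delta ({\varvec{m}})=0\), so that

$$\begin{aligned} |P({\varvec{m}})|\le \frac{\epsilon }{2}\le \frac{1}{2}\epsilon ^{\frac{1}{4}(1+2-1)}=\frac{1}{2}\epsilon ^{\frac{1}{2}}, \qquad |Z({\varvec{m}})|\le \frac{\epsilon }{4}\le \frac{1}{2}\epsilon ^{\frac{1}{4}(0+4-1)}=\frac{1}{2}\epsilon ^{\frac{3}{4}}. \end{aligned}$$

For all other \({\varvec{m}}\) the coefficients vanish, so (4.5) holds trivially.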

Based on (4.5) and (4.6), the Hamiltonian H in (4.1) satisfies assumptions (2.7) and (2.8) in Theorem 2.1. Moreover, in view of Sect. 5, the \((r,\mathcal N)\)-nonresonant conditions (1.5) hold for all \({\varvec{v}}\) outside the resonant set \(\mathcal {R}\), whose measure is estimated there. Thus, all assumptions in Theorem 2.1 hold.

Next, we prove the main Theorem 1.1 by applying Theorem 2.1.

By Theorem 2.1, we know that for any \({\varvec{v}}\in \mathcal {R}^{c}\), there exists a normalizing transformation

$$\begin{aligned} \Phi : B_{s}(4\epsilon )\rightarrow B_{s}(5\epsilon ) \end{aligned}$$

such that \(\Phi (q(t))=q'(t)\). By the initial state \(\left\| q(0)\right\| _{s}\le \epsilon \) in Theorem 1.1 and (2.13), for any small enough \(\epsilon >0\), one has

$$\begin{aligned} \left\| q'(0)\right\| _{s}\le \left\| q(0)\right\| _{s} +\left\| q(0)-\Phi \left( q(0)\right) \right\| _{s}\le 2\epsilon . \end{aligned}$$

Thus we have

$$\begin{aligned} \left\| \widetilde{q'}(0)\right\| _{\frac{s}{2}} \le \left\| q'(0) \right\| _{s}\le 2\epsilon \end{aligned}$$
(4.7)

and

$$\begin{aligned} \left\| \widehat{q'}(0)\right\| _{s_{0}}\le \mathcal N^{s_{0}-s} \left\| \widehat{q'}(0)\right\| _{s} <2\epsilon ^{2r+1} \end{aligned}$$
(4.8)

by using (2.9) and (6.2) in Lemma 6.2.
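The exponent in (4.8) can be traced as follows. This is a sketch assuming that (2.9) fixes \(\mathcal N\) with \(\mathcal N\ge \epsilon ^{-\frac{2r}{s-2s_{0}}}\) (which is consistent with the last step of Sect. 5) and that \(s>2s_{0}\ge 0\) and \(0<\epsilon \le 1\); none of these is restated in this section. Since \(\frac{s-s_{0}}{s-2s_{0}}\ge 1\), we get

$$\begin{aligned} \mathcal N^{s_{0}-s}\le \epsilon ^{\frac{2r(s-s_{0})}{s-2s_{0}}}\le \epsilon ^{2r}, \qquad \left\| \widehat{q'}(0)\right\| _{s_{0}}\le \mathcal N^{s_{0}-s}\left\| q'(0)\right\| _{s}\le 2\epsilon ^{2r+1}. \end{aligned}$$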

Define

$$\begin{aligned} {t^{*}=\inf \left\{ t\ge 0:\left\| \widetilde{q'}(t)\right\| _{\frac{s}{2}} =3\epsilon \;{\text {or}}\;\left\| \widehat{q'}(t)\right\| _{s_{0}} =3\epsilon ^{r+1}\right\} ,} \end{aligned}$$
(4.9)

then for all \(0\le t\le t^{*}\) one has

$$\begin{aligned} {\left\| \widetilde{q'}(t)\right\| _{\frac{s}{2}} \le 3\epsilon \quad {\text{ and }}\quad \left\| \widehat{q'}(t)\right\| _{s_{0}} \le 3\epsilon ^{r+1}.} \end{aligned}$$
(4.10)

Next, we will prove that

$$\begin{aligned} t^{*}\ge \epsilon ^{-\frac{r}{2s_{0}+1}}. \end{aligned}$$
(4.11)

Suppose, to the contrary, that \(t^{*}<\epsilon ^{-\frac{r}{2s_{0}+1}}\). We derive a contradiction in two parts.

(1) Estimate \(\left\| \widetilde{q'}(t^*)\right\| _{\frac{s}{2}}<3\epsilon \) when \(t^{*}<\epsilon ^{-\frac{r}{2s_{0}+1}}\).

For any \(0\le t\le t^{*}\), define

$$\widetilde{F}(q'(t))=\left\| \widetilde{q'}(t)\right\| _{\frac{s}{2}}^{2}.$$

By the Newton–Leibniz formula, we have

$$\begin{aligned} \widetilde{F}\left( q'(t)\right) -\widetilde{F}\left( q'(0)\right) =\int _{0}^{t}\dot{\widetilde{F}}\left( q'(z)\right) {\text{ d }}z. \end{aligned}$$
(4.12)

Note that

$$\begin{aligned} \dot{\widetilde{F}}\left( q'(t)\right) = \left\{ H\circ \Phi ,\widetilde{F}\right\} \left( q'(t)\right) =\left\{ {\textbf {W}},\widetilde{F}\right\} \left( q'(t)\right) +\left\{ {\textbf {V}},\widetilde{F}\right\} \left( q'(t)\right) +\left\{ {\textbf {R}},\widetilde{F}\right\} \left( q'(t)\right) , \end{aligned}$$

where the last equality uses (2.10) in Theorem 2.1 and \(\left\{ N+{\textbf {Z}},\widetilde{F}\right\} \left( q'(t)\right) =0\). Then we have

$$\begin{aligned}&\nonumber {\left| \int _{0}^{t^{*}}\dot{\widetilde{F}}\left( q'(t)\right) {\text{ d }}t\right| }\\ \le&{\int _{0}^{t^{*}} \left( \left| \left\{ {\textbf {W}},\widetilde{F}\right\} \left( q'(t)\right) \right| +\left| \left\{ {\textbf {V}},\widetilde{F}\right\} \left( q'(t)\right) \right| +\left| \left\{ {\textbf {R}},\widetilde{F}\right\} \left( q'(t)\right) \right| \right) {\text{ d }}t}. \end{aligned}$$
(4.13)

In view of (4.7), (4.12), (4.13) and the triangle inequality, we obtain

$$\begin{aligned} {\left| \widetilde{F}\left( q'(t^{*})\right) \right| \le 4\epsilon ^{2}+3\epsilon \int _{0}^{t^{*}} \left( \left\| \widetilde{X}_{{\textbf {W}}}(q'(t))\right\| _{\frac{s}{2}}+\left\| \widetilde{X}_{{\textbf {V}}}(q'(t))\right\| _{\frac{s}{2}}+\left\| \widetilde{X}_{{\textbf {R}}}(q'(t))\right\| _{\frac{s}{2}}\right) {\text{ d }}t} \end{aligned}$$
(4.14)

by using \(\left\| \widetilde{q'}(t)\right\| _{\frac{s}{2}} \le 3\epsilon \) in (4.10). Hence we only need to estimate \( \left\| \widetilde{X}_{{\textbf {W}}}(q'(t))\right\| _{\frac{s}{2}}, \left\| \widetilde{X}_{{\textbf {V}}}(q'(t))\right\| _{\frac{s}{2}}\) and \(\left\| \widetilde{X}_{{\textbf {R}}}(q'(t))\right\| _{\frac{s}{2}}\).

In fact, we have

$$\begin{aligned} \left\| \widetilde{X}_{{\textbf {W}}}\left( q'(t)\right) \right\| _{\frac{s}{2}}\le&\mathcal N^{\frac{s-2s_{0}}{2}}\left\| \widetilde{X}_{{\textbf {W}}}\left( q'(t)\right) \right\| _{s_{0}}\quad ({\text{ in } \text{ view } \text{ of } \text{(6.1) } \text{ in } \text{ Lemma } \text{6.2 }})\nonumber \\ \le&C(r,s_{0})\mathcal N^{\frac{s-2s_{0}}{2}}\epsilon \left\| \widetilde{q'}(t)\right\| _{s_{0}}\left\| \widehat{q'}(t)\right\| _{s_{0}}^{2}\nonumber \\&({\text{ in } \text{ view } \text{ of } \text{(2.11), } \text{(2.15) } \text{ and } \text{ Lemma } \text{6.1 }})\nonumber \\ \le&C(r,s_{0})\mathcal N^{\frac{s-2s_{0}}{2}}\epsilon \left\| \widetilde{q'}(t)\right\| _{\frac{s}{2}}\left\| \widehat{q'}(t)\right\| _{s_{0}}^{2}\nonumber \\ \le&\epsilon ^{r+\frac{3}{2}}, \end{aligned}$$
(4.15)

where the last inequality uses (2.9), (4.10) and \(C(r, s_{0})\epsilon ^{\frac{1}{2}}\le 1\). We also have

$$\begin{aligned} \nonumber \left\| \widetilde{X}_{{\textbf {V}}}\left( q'(t)\right) \right\| _{\frac{s}{2}} \le&\mathcal N^{\frac{s-2s_{0}}{2}}\left\| \widetilde{X}_{{\textbf {V}}}\left( q'(t)\right) \right\| _{s_{0}}\quad ({\text{ in } \text{ view } \text{ of } \text{(6.1) } \text{ in } \text{ Lemma } \text{6.2 }})\\ \nonumber \le&C(r,s_{0})\mathcal N^{\frac{s-2s_{0}}{2}}\epsilon \left\| \widehat{q'}(t)\right\| _{s_{0}}^{3}\\&\nonumber ({\text{ in } \text{ view } \text{ of } \text{(2.11), } \text{(2.15) } \text{ and } \text{ Lemma } \text{6.1 }})\\ \le&\epsilon ^{r+\frac{3}{2}}, \end{aligned}$$
(4.16)

where the last inequality uses (2.9), \(\left\| \widehat{q'}(t)\right\| _{s_{0}} \le 3\epsilon ^{r+1}\) in (4.10) and \(C(r,s_{0})\epsilon ^{\frac{1}{2}}\le 1\). Moreover, we have

$$\begin{aligned} \left\| \widetilde{X}_{{\textbf {R}}}(q'(t))\right\| _{\frac{s}{2}} \le \epsilon ^{r+\frac{3}{2}}, \end{aligned}$$
(4.17)

where one applies \({\textbf {R}}=\mathcal {O}\left( \epsilon ^{r+1}\right) \) in Theorem 2.1 with \(r^{*}{:=}r + 1\) in place of r. Then in view of (4.15)–(4.17), we get

$$\begin{aligned} \left\| \widetilde{X}_{{\textbf {W}}}(q'(t))\right\| _{\frac{s}{2}}+\left\| \widetilde{X}_{{\textbf {V}}}(q'(t))\right\| _{\frac{s}{2}}+\left\| \widetilde{X}_{{\textbf {R}}}(q'(t))\right\| _{\frac{s}{2}}\le 3\epsilon ^{r+\frac{3}{2}}. \end{aligned}$$
(4.18)

Thus, in view of (4.14) and (4.18), when \(t^*<\epsilon ^{-\frac{r}{2s_{0}+1}}\), one has

$$\begin{aligned} {\left| \widetilde{F}\left( q'(t^*)\right) \right| }<4\epsilon ^{2}+9\epsilon ^{-\frac{r}{2s_{0}+1}} \epsilon ^{r+\frac{5}{2}} \le 9\epsilon ^{2}, \end{aligned}$$

so we get

$$\begin{aligned} {\left\| \widetilde{q'}(t^*)\right\| _{\frac{s}{2}} <3\epsilon .} \end{aligned}$$
(4.19)
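The arithmetic behind the last two displays is as follows, assuming \(r\ge 0\), \(s_{0}\ge 0\) and \(0<\epsilon \le \frac{25}{81}\) (an illustrative threshold). By (4.14), (4.18) and \(t^{*}<\epsilon ^{-\frac{r}{2s_{0}+1}}\),

$$\begin{aligned} 3\epsilon \cdot t^{*}\cdot 3\epsilon ^{r+\frac{3}{2}} <9\epsilon ^{r+\frac{5}{2}-\frac{r}{2s_{0}+1}} =9\epsilon ^{\frac{5}{2}+\frac{2s_{0}r}{2s_{0}+1}} \le 9\epsilon ^{\frac{5}{2}}, \end{aligned}$$

and therefore

$$\begin{aligned} \left\| \widetilde{q'}(t^{*})\right\| _{\frac{s}{2}}^{2}<4\epsilon ^{2}+9\epsilon ^{\frac{5}{2}}\le 9\epsilon ^{2} \quad \text{whenever}\quad 9\epsilon ^{\frac{1}{2}}\le 5. \end{aligned}$$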

(2) Estimate \(\left\| \widehat{q'}(t^*)\right\| _{s_{0}}<3\epsilon ^{r+1}\) when \(t^{*}<\epsilon ^{-\frac{r}{2s_{0}+1}}\).

For any \(0\le t\le t^{*}\), define

$$\begin{aligned} \widehat{F}(q'(t))=\left\| \widehat{q'}(t)\right\| _{s_{0}}^{2}. \end{aligned}$$
(4.20)

As in the first part, we only need to estimate \(\left| \left\{ {\textbf {W}},\widehat{F}\right\} \left( q'(t)\right) \right| \), \( \left| \left\{ {\textbf {V}},\widehat{F}\right\} \left( q'(t)\right) \right| \) and \(\left| \left\{ {\textbf {R}},\widehat{F}\right\} \left( q'(t)\right) \right| \).

Firstly, for any \(0\le t\le t^{*}\), we estimate \(\left| \left\{ {\textbf {V}},\widehat{F}\right\} \left( q'(t)\right) \right| \). Note that

$$\begin{aligned} \left| \left\{ {\textbf {V}},\widehat{F}\right\} \left( q'(t)\right) \right| \le \nonumber \left\| \widehat{X}_{{\textbf {V}}}\left( q'(t)\right) \right\| _{s_{0}} \left\| \widehat{q'}(t)\right\| _{s_{0}} \le 3\epsilon ^{r+1}\left\| \widehat{X}_{{\textbf {V}}}\left( q'(t)\right) \right\| _{s_{0}}, \end{aligned}$$

where the last inequality uses \(\left\| \widehat{q'}(t)\right\| _{s_{0}} \le 3\epsilon ^{r+1}\) in (4.10). Moreover, we have

$$\begin{aligned} \nonumber \left\| \widehat{X}_{{\textbf {V}}}\left( q'(t)\right) \right\| _{s_{0}}\le&C(r,s_{0})\epsilon \left\| \widetilde{q'}(t)\right\| _{s_{0}}\left\| \widehat{q'}(t)\right\| _{s_{0}}^{2}\\&\nonumber ({\text{ in } \text{ view } \text{ of } \text{(2.12), } \text{(2.15) } \text{ and } \text{ Lemma } \text{6.1 }})\\ \le&\nonumber C(r,s_{0})\epsilon \left\| \widetilde{q'}(t)\right\| _{\frac{s}{2}}\left\| \widehat{q'}(t)\right\| _{s_{0}}^{2}\\ \le&\epsilon ^{2r+\frac{3}{2}}\quad ({\text{ by } \text{ using } \text{(4.10) } \text{ and } } C(r,s_{0})\epsilon ^{\frac{1}{2}}\le 1). \end{aligned}$$
(4.21)

Hence, we have

$$\begin{aligned} \left| \left\{ {\textbf {V}},\widehat{F}\right\} \left( q'(t)\right) \right| \le 3\epsilon ^{3r+\frac{5}{2}}. \end{aligned}$$
(4.22)
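Explicitly, the powers of \(\epsilon \) in (4.21) and (4.22) combine as follows. This is a sketch in which the numerical factor 27 is absorbed into the smallness of \(\epsilon \), and \(s_{0}\le \frac{s}{2}\) is assumed so that \(\left\| \widetilde{q'}(t)\right\| _{s_{0}}\le \left\| \widetilde{q'}(t)\right\| _{\frac{s}{2}}\). By (4.10),

$$\begin{aligned} C(r,s_{0})\epsilon \left\| \widetilde{q'}(t)\right\| _{\frac{s}{2}}\left\| \widehat{q'}(t)\right\| _{s_{0}}^{2} \le C(r,s_{0})\epsilon \cdot 3\epsilon \cdot 9\epsilon ^{2r+2} =27C(r,s_{0})\epsilon ^{2r+4} \le \epsilon ^{2r+\frac{3}{2}}, \end{aligned}$$

provided \(27C(r,s_{0})\epsilon ^{\frac{5}{2}}\le 1\). Multiplying by \(\left\| \widehat{q'}(t)\right\| _{s_{0}}\le 3\epsilon ^{r+1}\) gives \(3\epsilon ^{r+1}\cdot \epsilon ^{2r+\frac{3}{2}}=3\epsilon ^{3r+\frac{5}{2}}\), which is (4.22).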

Similarly, we have

$$\begin{aligned} \left| \left\{ {\textbf {R}},\widehat{F}\right\} \left( q'(t)\right) \right| \le \left\| \widehat{X}_{{\textbf {R}}}\left( q'(t)\right) \right\| _{s_{0}} \left\| \widehat{q'}(t)\right\| _{s_{0}} \le 3\epsilon ^{3r+\frac{5}{2}}, \end{aligned}$$
(4.23)

where one applies \({\textbf {R}}=\mathcal {O}\left( \epsilon ^{r+1}\right) \) in Theorem 2.1 with \(r^{*}{:=}2r + 2\) in place of r.

Next, for any \(0\le t\le t^{*}\), we estimate \(\left| \left\{ {\textbf {W}},\widehat{F}\right\} \left( q'(t)\right) \right| \). In view of (2.11) in Theorem 2.1, we have

$$\begin{aligned} \left| \left\{ {\textbf {W}},\widehat{F}\right\} (q'(t))\right| \le \sum ^{4r+1}_{i=2}\left| \left\{ {\textbf {W}}_i,\widehat{F}\right\} (q'(t))\right| , \end{aligned}$$
(4.24)

where

$$\begin{aligned} {\textbf {W}}_i=\sum _{\begin{array}{c} {\varvec{m}}\in \mathbb {N}^{\mathbb {Z}^d}\times \mathbb {N}^{\mathbb {Z}^d}, |{\varvec{m}}|_1=i\\ \left| \widehat{{\varvec{m}}}\right| _1=2, \left| \widehat{{\varvec{\beta }}}\right| _1=\left| \widehat{{\varvec{\gamma }}}\right| _1=1 \end{array}} {\textbf {W}}_i({\varvec{m}})q^{{\varvec{\beta }}}\bar{q}^{{\varvec{\gamma }}}. \end{aligned}$$
(4.25)

Hence, we only need to estimate \(\left| \left\{ {\textbf {W}}_i,\widehat{F}\right\} (q'(t))\right| \). By (4.25), we know

$$\begin{aligned} \left| \widehat{{\varvec{\beta }}}\right| _1=\left| \widehat{{\varvec{\gamma }}}\right| _1=1. \end{aligned}$$

Then there exist \(a_1, a_2\in \mathbb {Z}^d\) with \(|a_{1}|_2,|a_{2}|_2>\mathcal N \) such that

$$\left| \beta _{a_1}\right| =\left| \gamma _{a_2}\right| =1.$$

Thus the Hamiltonian \({\textbf {W}}_i\) can be rewritten as

$$\begin{aligned} {\textbf {W}}_i=\sum _{\begin{array}{c} a_{1},a_{2}\in \mathbb {Z}^d\\ |a_{1}|_2,|a_{2}|_2>\mathcal N \end{array}}B_{a_{1}a_{2}}\left( \widetilde{q'}\right) q'_{a_{1}}\bar{q}'_{a_{2}}, \end{aligned}$$
(4.26)

where

$$\begin{aligned} B_{a_{1}a_{2}}\left( \widetilde{q'}\right) =\sum _{\begin{array}{c} \widetilde{{\varvec{m}}}\in \mathbb {N}^{\mathbb {Z}^d}\times \mathbb {N}^{\mathbb {Z}^d}\\ |\widetilde{{\varvec{m}}}|_1= i-2 \end{array}}B_{a_{1}a_{2}}(\widetilde{{\varvec{m}}})\widetilde{q'}^{\widetilde{{\varvec{\beta }}}}\widetilde{\bar{q}'}^{\widetilde{{\varvec{\gamma }}}} \end{aligned}$$

and

$$\begin{aligned} \left| B_{a_{1}a_{2}}(\widetilde{{\varvec{m}}})\right| \le C(r)\epsilon \end{aligned}$$

by using (2.15). Hence, for any \(|a_{1}|_2,|a_{2}|_2>\mathcal N\), we obtain

$$\begin{aligned} \left| B_{a_{1}a_{2}}\left( \widetilde{q'}\right) \right| \le C(r, s)\epsilon \left\| \widetilde{q'}(t)\right\| ^{i-2}_{\frac{s}{2}}\le C(r, s)\epsilon , \end{aligned}$$
(4.27)

where the last inequality uses \(2\le i\le 4r+1\) and \(\left\| \widetilde{q'}(t)\right\| _{\frac{s}{2}} \le 3\epsilon \) in (4.10), and the constant \(C(r,s)\) depends on r and s. Moreover, in view of (3.5), we have

$$\begin{aligned} \left| |a_{1}|_2-|a_{2}|_2\right| \le \sqrt{d(4r+1)^2}. \end{aligned}$$
(4.28)

Then one has

$$\begin{aligned}&\nonumber \left| \left\{ {\textbf {W}}_i,\widehat{F}\right\} (q'(t))\right| \\ \nonumber =&\left| \sum _{\begin{array}{c} a_{1},a_{2}\in \mathbb {Z}^d\\ |a_{1}|_2,|a_{2}|_2>\mathcal N \end{array}}B_{a_{1}a_{2}}\left( \widetilde{q'}\right) \left( \langle a_{1}\rangle ^{2s_{0}}-\langle a_{2}\rangle ^{2s_{0}}\right) q_{a_{1}}'\bar{q}_{a_{2}}'\right| \quad \left( {\text{ in } \text{ view } \text{ of } \text{(4.26) }} \right) \\ \nonumber \le&C(r,s)\epsilon \sum _{\begin{array}{c} a_{1},a_{2}\in \mathbb {Z}^d\\ |a_{1}|_2,|a_{2}|_2>\mathcal N \end{array}}2s_{0} \big |\langle a_{1}\rangle -\langle a_{2}\rangle \big | \left( \langle a_{1}\rangle ^{2s_{0}-1}+\langle a_{2}\rangle ^{2s_{0}-1}\right) \left| q_{a_{1}}'\bar{q}_{a_{2}}'\right| \\ &\nonumber ({\text{ in } \text{ view } \text{ of } \text{(4.27) }}) \\ \nonumber \le&C(r,s,s_0,d) \epsilon \sum _{\begin{array}{c} a_{1},a_{2}\in \mathbb {Z}^d\\ |a_{1}|_2,|a_{2}|_2>\mathcal N \end{array}} \langle a_{1}\rangle ^{s_{0}-1} \langle a_{2}\rangle ^{s_{0}} \left| q_{a_{1}}'\bar{q}_{a_{2}}'\right| \quad ({\text{ in } \text{ view } \text{ of } \text{(4.28) }}) \\ \nonumber \le&\left\| \widehat{q'}(t)\right\| _{s_{0}-1}\left\| \widehat{q'}(t)\right\| _{s_{0}}\\&\nonumber ({\text{ by } \text{ using } \text{ the } \text{ Cauchy--Schwarz } \text{ inequality } \text{ and } } C(r,s,s_0,d)\epsilon \le 1) \\ \le&\left\| \widehat{q'}(t)\right\| _{0}^{\frac{1}{s_{0}}} \left\| \widehat{q'}(t)\right\| _{s_{0}}^{2-\frac{1}{s_{0}}} \quad \left( {\text{ by } \text{ using } \text{ the } \text{ H\"older } \text{ inequality }}\right) , \end{aligned}$$
(4.29)

where \(C(r,s,s_0,d)\) is a constant depending on \(r,s,s_0\) and d. Now we estimate \(\left\| \widehat{q'}(t)\right\| _{0}\) and \(\left\| \widehat{q'}(t)\right\| _{s_{0}}\). On one hand, we have

$$\begin{aligned} \left\| \widehat{q'}(t)\right\| _{s_{0}}^{2-\frac{1}{s_{0}}}\le \left( 3\epsilon ^{r+1}\right) ^{\left( 2-\frac{1}{s_{0}}\right) } \end{aligned}$$
(4.30)

by using \(\left\| \widehat{q'}(t)\right\| _{s_{0}} \le 3\epsilon ^{r+1}\) in (4.10). On the other hand, since every monomial of the Hamiltonian \({\textbf {W}}_{i}\) satisfies \(\left| \widehat{{\varvec{\beta }}}\right| _1=\left| \widehat{{\varvec{\gamma }}}\right| _1=1\), we have

$$\begin{aligned} \left\{ {\textbf {W}}_{i},\left\| \widehat{q'}(t)\right\| _{0}^{2}\right\} =0. \end{aligned}$$
(4.31)
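The identity (4.31) and the weight difference in the first line of (4.29) both come from one monomial computation. As a sketch, assume that the Poisson bracket is a constant multiple c of \(\sum _{j}\left( \partial _{q_j}F\,\partial _{\bar{q}_j}G-\partial _{\bar{q}_j}F\,\partial _{q_j}G\right) \) (the normalization c is not restated here) and that the hat denotes restriction to sites with \(|a|_2>\mathcal N\). Then, for a monomial \(M=q^{{\varvec{\beta }}}\bar{q}^{{\varvec{\gamma }}}\) and any weights \(w_a\),

$$\begin{aligned} \left\{ M,\sum _{|a|_2>\mathcal N}w_a|q_a|^{2}\right\} =c\sum _{|a|_2>\mathcal N}w_a\left( \beta _a-\gamma _a\right) M. \end{aligned}$$

With \(w_a\equiv 1\) the sum equals \(\left| \widehat{{\varvec{\beta }}}\right| _1-\left| \widehat{{\varvec{\gamma }}}\right| _1=0\), which gives (4.31). With \(w_a=\langle a\rangle ^{2s_{0}}\), \(\beta _{a_1}=1\) and \(\gamma _{a_2}=1\), it equals \(\langle a_{1}\rangle ^{2s_{0}}-\langle a_{2}\rangle ^{2s_{0}}\), the factor appearing in (4.29).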

Hence, in view of (4.20) with \(s_0=0\), (4.31) and the triangle inequality, we obtain

$$\begin{aligned} \left\| \widehat{q'}(t)\right\| _{0}^{2} \le&\nonumber \left\| \widehat{q'}(0)\right\| _{0}^{2} +{\int _{0}^{t} \left( \left| \left\{ {\textbf {V}},\left\| \widehat{q'}(z)\right\| _{0}^{2}\right\} \left( q'(z)\right) \right| +\left| \left\{ {\textbf {R}},\left\| \widehat{q'}(z)\right\| _{0}^{2}\right\} \left( q'(z)\right) \right| \right) {\text{ d }}z}\\ \le&\nonumber \left\| \widehat{q'}(0)\right\| _{s_{0}}^{2}+{\int _{0}^{t} \left( \left\| \widehat{X}_{{\textbf {V}}}(q'(z))\right\| _{s_{0}}+\left\| \widehat{X}_{{\textbf {R}}}(q'(z))\right\| _{s_{0}} \right) \left\| \widehat{q'}(z)\right\| _{s_{0}}{\text{ d }}z}\\ \le&6(1+t)\epsilon ^{3r+\frac{5}{2}}\quad ({\text{ in } \text{ view } \text{ of } \text{(4.8), } \text{(4.10), } \text{(4.21) } \text{ and } \text{(4.23) }}). \end{aligned}$$
(4.32)

Thus, in view of (4.29), (4.30) and (4.32), we obtain

$$\begin{aligned} \left| \left\{ {\textbf {W}}_i,\widehat{F}\right\} (q'(t))\right| \le \left( 6(1+t)\epsilon ^{3r+\frac{5}{2}}\right) ^{\frac{1}{2s_{0}}} \left( 3\epsilon ^{r+1}\right) ^{\left( 2-\frac{1}{s_{0}}\right) }, \end{aligned}$$
(4.33)

then in view of (4.24) and (4.33), we obtain

$$\begin{aligned} \left| \left\{ {\textbf {W}},\widehat{F}\right\} (q'(t))\right| \le C(r)\left( 6(1+t)\epsilon ^{3r+\frac{5}{2}}\right) ^{\frac{1}{2s_{0}}} \left( 3\epsilon ^{r+1}\right) ^{\left( 2-\frac{1}{s_{0}}\right) }. \end{aligned}$$
(4.34)
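The interpolation step in the last line of (4.29), which produces the exponents in (4.33) and (4.34), is the standard Hölder argument. As a sketch, assume \(s_{0}>1\) and use the weighted norm \(\Vert u\Vert _{\sigma }^{2}=\sum _{a}\langle a\rangle ^{2\sigma }|u_a|^{2}\) from Sect. 1. Applying the Hölder inequality with conjugate exponents \(s_{0}\) and \(\frac{s_{0}}{s_{0}-1}\), we get

$$\begin{aligned} \Vert u\Vert _{s_{0}-1}^{2} =\sum _{a}\left( |u_a|^{2}\right) ^{\frac{1}{s_{0}}}\left( \langle a\rangle ^{2s_{0}}|u_a|^{2}\right) ^{\frac{s_{0}-1}{s_{0}}} \le \Vert u\Vert _{0}^{\frac{2}{s_{0}}}\,\Vert u\Vert _{s_{0}}^{2-\frac{2}{s_{0}}}, \end{aligned}$$

that is, \(\Vert u\Vert _{s_{0}-1}\le \Vert u\Vert _{0}^{\frac{1}{s_{0}}}\Vert u\Vert _{s_{0}}^{1-\frac{1}{s_{0}}}\). Writing \(\Vert u\Vert _{0}^{\frac{1}{s_{0}}}=\left( \Vert u\Vert _{0}^{2}\right) ^{\frac{1}{2s_{0}}}\) explains the power \(\frac{1}{2s_{0}}\) applied to the bound (4.32).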

Consequently, it is easy to get

$$\begin{aligned} {\left| \int _{0}^{t^*}\dot{\widehat{F}}\left( q'(t)\right) {\text{ d }}t\right| }\le&\nonumber {\int _{0}^{t^*} \left( \left| \left\{ {\textbf {W}},\widehat{F}\right\} \left( q'(t)\right) \right| + \left| \left\{ {\textbf {V}},\widehat{F}\right\} \left( q'(t)\right) \right| +\left| \left\{ {\textbf {R}},\widehat{F}\right\} \left( q'(t)\right) \right| \right) {\text{ d }}t} \\ \le&\nonumber 6t^*\epsilon ^{3r+\frac{5}{2}}+ C(r)t^*\left( 6(1+t^*)\epsilon ^{3r+\frac{5}{2}}\right) ^{\frac{1}{2s_{0}}} \left( 3\epsilon ^{r+1}\right) ^{\left( 2-\frac{1}{s_{0}}\right) }\\&\nonumber \left( {\text{ by } \text{ using } \text{(4.22), } \text{(4.23) } \text{ and } \text{(4.34) }} \right) \\ <&\frac{1}{2}\epsilon ^{2r+2}, \end{aligned}$$
(4.35)

where the last inequality uses \(t^*<\epsilon ^{-\frac{r}{2s_{0}+1}}\).
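The choice of the time scale \(\epsilon ^{-\frac{r}{2s_{0}+1}}\) is visible in the exponent count for the dominant term of (4.35). As a sketch (constants depending on r and \(s_{0}\) are not tracked, and \(0<\epsilon \le 1\) is assumed), note that \(1+t^{*}<2\epsilon ^{-\frac{r}{2s_{0}+1}}\), so

$$\begin{aligned} t^{*}\left( 1+t^{*}\right) ^{\frac{1}{2s_{0}}} <2^{\frac{1}{2s_{0}}}\epsilon ^{-\frac{r}{2s_{0}+1}\left( 1+\frac{1}{2s_{0}}\right) } =2^{\frac{1}{2s_{0}}}\epsilon ^{-\frac{r}{2s_{0}}}. \end{aligned}$$

Collecting the powers of \(\epsilon \) in the second term gives

$$\begin{aligned} -\frac{r}{2s_{0}}+\frac{3r+\frac{5}{2}}{2s_{0}}+(r+1)\left( 2-\frac{1}{s_{0}}\right) =2r+2+\frac{1}{4s_{0}}, \end{aligned}$$

so this term is of order \(\epsilon ^{2r+2+\frac{1}{4s_{0}}}\). The first term satisfies \(6t^{*}\epsilon ^{3r+\frac{5}{2}}<6\epsilon ^{2r+\frac{5}{2}}\). Both are smaller than \(\frac{1}{4}\epsilon ^{2r+2}\) once \(\epsilon \) is small enough.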

Thus, in view of (4.8), (4.35) and the triangle inequality, when \(t^*<\epsilon ^{-\frac{r}{2s_{0}+1}}\) we obtain

$$\begin{aligned} \left| \widehat{F}(q'(t^*))\right| <9\epsilon ^{2r+2}, \end{aligned}$$

i.e.

$$\begin{aligned} \left\| \widehat{q'}(t^*)\right\| _{s_{0}} <3\epsilon ^{r+1}. \end{aligned}$$
(4.36)

In summary, (4.19) and (4.36) contradict the definition (4.9) of \(t^{*}\). Hence (4.11) holds.

Finally, going back to the original variables q(t), for any \(0\le t\le t^{*}\), we have

$$\begin{aligned} \left\| \widetilde{q}(t)\right\| _{\frac{s}{2}} \le&\left\| \widetilde{q'}(t)\right\| _{\frac{s}{2}}+\left\| \widetilde{q}(t)-\Phi (\widetilde{q}(t))\right\| _{\frac{s}{2}}\\ \le&3\epsilon +\left\| q(t)-\Phi (q(t))\right\| _{s}\quad \left( {\text{ in } \text{ view } \text{ of } } \left\| \widetilde{q'}(t)\right\| _{\frac{s}{2}} \le 3\epsilon { \text{ in } \text{(4.10) }}\right) \\ \le&4\epsilon \quad \left( {\text{ in } \text{ view } \text{ of } \text{(2.13) }}\right) \end{aligned}$$

and

$$\begin{aligned} \left\| \widehat{q}(t)\right\| _{s_{0}} \le&\left\| \widehat{q'}(t)\right\| _{s_{0}}+\left\| \widehat{q}(t)-\Phi (\widehat{q}(t))\right\| _{s_{0}}\\ \le&3\epsilon ^{r+1}+\mathcal N^{\frac{2s_{0}-s}{2}}\left\| \widehat{q}(t)-\Phi (\widehat{q}(t))\right\| _{\frac{s}{2}}\\&\left( {\text{ in } \text{ view } \text{ of } } \left\| \widehat{q'}(t)\right\| _{s_{0}} \le 3\epsilon ^{r+1} { \text{ in } \text{(4.10) } \text{ and } \text{(6.2) } \text{ in } \text{ Lemma } \text{6.2 }}\right) \\ \le&3\epsilon ^{r+1}+\mathcal N^{\frac{2s_{0}-s}{2}}\left\| q(t)-\Phi (q(t))\right\| _{s}\\ \le&4\epsilon ^{r+1}\quad \left( {\text{ in } \text{ view } \text{ of } \text{(2.9) } \text{ and } \text{(2.13) } }\right) . \end{aligned}$$

5 Measure Estimate

In this section, we will estimate the measure of the resonant set \(\mathcal {R}\) defined in (1.8), i.e., we will show

$$\begin{aligned} {\text {mes}} ~\mathcal {(R)} \le \epsilon ^{\frac{2r}{s-2s_{0}}}. \end{aligned}$$
(5.1)

Proof

For convenience, we rewrite the resonant set \(\mathcal {R}\) defined in (1.8) as

$$ \mathcal {R}= \bigcup _{i=0}^2 \mathcal {R}_i,$$

where

$$ \mathcal {R}_i= \bigcup _{\begin{array}{c} 0\ne {\varvec{k}}\in \mathbb {Z}^{\mathbb {Z}^d}\\ \left| \widehat{{\varvec{k}}}\right| _1=i \end{array}} \mathcal {R}_{{\varvec{k}}},\quad i=0,1$$

and

$$ \mathcal {R}_2= \bigcup _{\begin{array}{c} 0\ne {\varvec{k}}\in \mathbb {Z}^{\mathbb {Z}^d}\\ \left| \widehat{{\varvec{k}}}\right| _1=2\ {\text {and}}\ \sum _{|j|_2> \mathcal N}k_j=\pm 2 \end{array}} \mathcal {R}_{{\varvec{k}}}.$$

In view of (3.5), we obtain

$$\begin{aligned} \Delta ({\varvec{k}})+|{\varvec{k}}|_1\le 4r+1. \end{aligned}$$
(5.2)

We prove the estimate (5.1) by considering the following three cases.

Case 1: \(\left| \widehat{{\varvec{k}}}\right| _1=0.\)

In view of (1.7), one has

$$\begin{aligned} {\text {mes}}~(\mathcal {R}_{{\varvec{k}}}) \le \frac{1}{\mathcal N^{20dr}}. \end{aligned}$$

Hence, we have

$$\begin{aligned} {\text {mes}}~(\mathcal {R}_0)&\nonumber \le \sum _{\begin{array}{c} 0\ne {\varvec{k}}\in \mathbb {Z}^{\mathbb {Z}^d}\\ \left| \widehat{{\varvec{k}}}\right| _1=0 \end{array}} {\text {mes}}~(\mathcal {R}_{{\varvec{k}}})\\&\nonumber \le \frac{1}{\mathcal N^{20dr}}\cdot (2\mathcal N+1)^{d(4r+1)}\quad \left( {\text{ in } \text{ view } \text{ of } \text{(5.2) }} \right) \\&\le \frac{1}{3\mathcal N}. \end{aligned}$$
(5.3)
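The last inequality in (5.3) is a matter of comparing powers of \(\mathcal N\). As a sketch, assume \(r\ge 1\) and \(\mathcal N\ge 1\), and take the counting factor \((2\mathcal N+1)^{d(4r+1)}\) as given. Since \(2\mathcal N+1\le 3\mathcal N\),

$$\begin{aligned} \frac{(2\mathcal N+1)^{d(4r+1)}}{\mathcal N^{20dr}} \le 3^{d(4r+1)}\mathcal N^{d-16dr} \le \frac{1}{3\mathcal N} \quad \Longleftrightarrow \quad \mathcal N^{16dr-d-1}\ge 3^{d(4r+1)+1}, \end{aligned}$$

where the equivalence refers to the second inequality. Because \(16dr-d-1\ge 15d-1>0\), the condition on the right holds once \(\mathcal N\) is sufficiently large, which is the regime of small \(\epsilon \) considered here.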

Case 2: \(\left| \widehat{{\varvec{k}}}\right| _1=1.\)

Then one has

$$\begin{aligned} \widetilde{{\varvec{k}}}\ne {\varvec{0}} \end{aligned}$$

by using the mass conservation, and

$$\begin{aligned} \langle {\varvec{k}}, {\varvec{v}}\rangle =\langle \widetilde{{\varvec{k}}},{\varvec{v}}\rangle \pm v_{a_1}, \end{aligned}$$

where \(a_1\in \mathbb {Z}^d\) and \(|a_1|_2>\mathcal N\).

In view of (5.2), we know that

$$\begin{aligned} |a_1|_2\le \mathcal N+\sqrt{d(4r+1)^2}. \end{aligned}$$

Then following the proof of (5.3), we get

$$\begin{aligned} {\text {mes}}~(\mathcal {R}_1) \le \frac{1}{3\mathcal N}. \end{aligned}$$
(5.4)

Case 3: \(\left| \widehat{{\varvec{k}}}\right| _1=2\quad {\text{ and }}\quad \sum _{|j|_2> \mathcal N}k_j=\pm 2.\)

Then we have

$$\begin{aligned} \widetilde{{\varvec{k}}}\ne {\varvec{0}} \end{aligned}$$

by using the mass conservation, and \(\langle {\varvec{k}}, {\varvec{v}}\rangle \) is either equal to

$$\begin{aligned} \langle \widetilde{{\varvec{k}}},{\varvec{v}}\rangle + v_{a_1}+v_{a_2} \end{aligned}$$

or

$$\begin{aligned} \langle \widetilde{{\varvec{k}}},{\varvec{v}}\rangle - v_{a_1}-v_{a_2}, \end{aligned}$$

where \(a_1, a_2\in \mathbb {Z}^d\) and \(|a_1|_2, |a_2|_2> \mathcal N\).

In view of (5.2), we know that

$$\begin{aligned} |a_1|_2,|a_2|_2\le \mathcal N+\sqrt{d(4r+1)^2}. \end{aligned}$$

Then following the proof of (5.3), we get

$$\begin{aligned} {\text {mes}}~(\mathcal {R}_2) \le \frac{1}{3\mathcal N}. \end{aligned}$$
(5.5)

Finally, in view of (5.3)–(5.5), we obtain

$$\begin{aligned} {\text {mes}}~\mathcal {(R)} \le \frac{1}{\mathcal {N}}. \end{aligned}$$

In view of (2.9), we have

$$\begin{aligned} {\text {mes}} ~(\mathcal {R}) \le \epsilon ^{\frac{2r}{s-2s_{0}}}. \end{aligned}$$
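In detail, the two concluding displays combine subadditivity of the measure with the choice of \(\mathcal N\). The second step below is a sketch assuming that (2.9) yields \(\mathcal N\ge \epsilon ^{-\frac{2r}{s-2s_{0}}}\), which is not restated in this section:

$$\begin{aligned} {\text {mes}}~(\mathcal {R}) \le \sum _{i=0}^{2}{\text {mes}}~(\mathcal {R}_i) \le 3\cdot \frac{1}{3\mathcal N} =\frac{1}{\mathcal N} \le \epsilon ^{\frac{2r}{s-2s_{0}}}. \end{aligned}$$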

This completes the proof of (5.1). \(\square \)