1 Introduction

A challenging and open question in the theory of quasi-periodic motions for PDEs concerns its possible extension to quasi-linear and fully nonlinear equations, namely PDEs whose nonlinearities contain derivatives of the same order as the linear operator. Besides its intrinsic mathematical interest, this question is also relevant for applications to real-world nonlinear physical models, for example in fluid dynamics and elasticity.

The goal of this paper is to take the first step in this direction, developing a KAM theory for quasi-periodically forced perturbations of the linear Airy equation

$$\begin{aligned} u_{t} + u_{xxx} + \varepsilon f(\omega t , x , u, u_{x}, u_{xx}, u_{xxx} ) = 0 , \quad x \in {\mathbb {T}}:= {\mathbb {R}}/ 2\pi {\mathbb {Z}}. \end{aligned}$$
(1.1)

In (1.1) the modulus of the frequency vector \(\omega \) is used as a parameter in the problem, see (1.2).

First, in Theorem 1.1 we prove an existence result of quasi-periodic solutions for a large class of quasi-linear nonlinearities \( f \). Then, for Hamiltonian or reversible nonlinearities, we also prove the linear stability of the solutions, see Theorems 1.2 and 1.3. Theorem 1.3 also holds for fully nonlinear perturbations. The precise meaning of stability is stated in Theorem 1.5. The key analysis is the reduction to constant coefficients of the linearized Airy equation, see Theorem 1.4. These results are presented in [3]. To the best of our knowledge, these are the first KAM results for quasi-linear or fully nonlinear PDEs. We reserve for future work the study of autonomous, parameter-independent perturbations of KdV, which also requires the analysis of the frequency-to-amplitude map arising from the nonlinearity; we think it is worthwhile to separate these difficulties.

Let us outline a short history of the subject. KAM and Nash–Moser theory for PDEs, which nowadays counts on a wide literature, started with the pioneering works of Kuksin [29] and Wayne [43], and was developed in the 1990s by Craig–Wayne [17], Bourgain [12, 13], Pöschel [37] (see also [16, 31, 32] for more references). These papers concern wave and Schrödinger equations with bounded Hamiltonian nonlinearities.

The first KAM results for unbounded perturbations have been obtained by Kuksin [30, 31], and, then, Kappeler–Pöschel [27], for Hamiltonian, analytic perturbations of KdV. Here the highest constant coefficients linear operator is \(\partial _{xxx}\) and the nonlinearity contains one space derivative \(\partial _x\). This means that the Hamiltonian density is a function of \(x\) and \(u\) (it could also depend on \( |\partial _x|^{1/2}u\)). The key idea is to work with a variable coefficients normal form. The corresponding homological equations are solved thanks to the so-called “Kuksin lemma”, see Chapter 5 in [27]. Their approach has been recently improved by Liu–Yuan [34], who proved a stronger version of the Kuksin lemma. Then in [35] (see also Zhang et al. [44]) they applied it to \(1\)-dimensional derivative NLS (DNLS) and Benjamin–Ono equations, where the highest order constant coefficients linear operator is \( \partial _{xx}\) and the nonlinearity contains one derivative \(\partial _x\). These methods apply to dispersive PDEs with derivatives like KdV, DNLS, the Duffing oscillator (see Bambusi–Graffi [4]), but not to derivative wave equations (DNLW), which contain first order derivatives \(\partial _x , \partial _t \) in the nonlinearity.

For DNLW, KAM theorems have been recently proved by Berti–Biasco–Procesi for both Hamiltonian [6] and reversible [7] equations. The key ingredient is an asymptotic expansion of the perturbed eigenvalues that is sufficiently accurate to impose the second order Melnikov non-resonance conditions. This is achieved by introducing the notion of “quasi-Töplitz” vector field, which is inspired by the concepts of “quasi-Töplitz” and “Töplitz–Lipschitz” Hamiltonians, developed, respectively, in Procesi–Xu [39] and Eliasson–Kuksin [19, 20] (see also Geng et al. [21], Grébert–Thomann [23], Procesi–Procesi [38]).

Existence of quasi-periodic solutions of PDEs can also be proved by imposing only the first order Melnikov conditions. This approach has been developed by Bourgain [12–15], extending the work of Craig–Wayne [17] for periodic solutions. It is especially convenient for PDEs in higher space dimension, because of the high multiplicity of the eigenvalues: see also the recent results by Wang [42], Berti–Bolle [9, 10] (and [11, 22] for periodic solutions). This method does not provide information about the stability of the quasi-periodic solutions, because the linearized equations have variable coefficients.

All the aforementioned results concern “semilinear” PDEs, namely equations in which the nonlinearity contains strictly fewer derivatives than the linear differential operator. For quasi-linear or fully nonlinear PDEs the perturbative effect is much stronger, and the possibility of extending KAM theory in this context is doubtful, see [16, 27, 35], because of the possible phenomenon of formation of singularities outlined in Lax [33], Klainerman and Majda [28]. For example, Kappeler–Pöschel [27] (remark 3, page 19) wrote: “It would be interesting to obtain perturbation results which also include terms of higher order, at least in the region where the KdV approximation is valid. However, results of this type are still out of reach, if true at all”. The study of this important issue is in its early stages.

For quasi-linear and fully nonlinear PDEs, the literature concerns, so far, only existence of periodic solutions. We quote the classical bifurcation results of Rabinowitz [40, 41] for fully nonlinear forced wave equations with a small dissipation term. More recently, Baldi [1] proved existence of periodic forced vibrations for quasi-linear Kirchhoff equations. Here the quasi-linear perturbation term depends explicitly only on time. Both these results are proved via Nash–Moser methods.

For the water waves equations, which are fully nonlinear PDEs, we mention the pioneering work of Iooss et al. [24] about existence of time periodic standing waves, and of Iooss–Plotnikov [25, 26] for 3-dimensional traveling water waves. The key idea is to use diffeomorphisms of the torus \({\mathbb {T}}^2\) and pseudo-differential operators, in order to conjugate the linearized operator to one with constant coefficients plus a sufficiently smoothing remainder. This is enough to invert the whole linearized operator by Neumann series. Very recently Baldi [2] has further developed the techniques of [24], proving the existence of periodic solutions for fully nonlinear autonomous, reversible Benjamin–Ono equations.

These approaches do not imply the linear stability of the solutions and, unfortunately, they do not work for quasi-periodic solutions, because stronger small divisors difficulties arise, see the comments in Sect. 1.2 below.

We finally mention that, for quasi-linear Klein–Gordon equations on spheres, Delort [18] has proved long time existence results via Birkhoff normal form methods.

The key analysis of the present paper concerns the linearized operator (1.16) obtained at any step of the Nash–Moser iteration. Its reduction to constant coefficients cannot be obtained by the KAM schemes [27, 30, 35]. The reason is that the perturbation in (1.1) is unbounded of order three (i.e. \( O(\partial _{xxx}) \)) and the homological equation (solved by the Kuksin lemma) gains only two space derivatives (thanks to the cubic dispersion relation of KdV). Therefore the scheme does not converge. Our idea is to perform, before starting with the KAM iteration, some preliminary transformations which decrease the \( \partial _x \)-order of the perturbation, but not its size. We use changes of variables, like quasi-periodic time-dependent diffeomorphisms of the space variable \( x \), a quasi-periodic reparametrization of time, multiplication operators and Fourier multipliers, which reduce the linearized operator to constant coefficients up to a bounded remainder, see (1.24). These transformations, which are inspired by [2, 24], are very different from the usual KAM transformations. At this point, we start a KAM reducibility scheme à la Eliasson–Kuksin which reduces the size of the perturbation quadratically, and completely diagonalizes the linearized operator (actually, since we work with finite differentiability, we implement a Nash–Moser scheme). For reversible or Hamiltonian perturbations we get that the eigenvalues of this diagonal operator are purely imaginary, i.e. we prove the linear stability. In Sect. 1.2 we present the main ideas of the proof.

1.1 Main results

We consider problem (1.1) where \( \varepsilon > 0 \) is a small parameter, the nonlinearity is quasi-periodic in time with diophantine frequency vector

$$\begin{aligned} \omega = \lambda \bar{\omega }\in {\mathbb {R}}^{\nu } , \quad \lambda \in \Lambda := \left[ \frac{1}{2}, \frac{3}{2} \right] , \quad |\bar{\omega }\cdot l | \ge \frac{3 \gamma _0}{|l|^{\tau _0}} \quad \forall l \in {\mathbb {Z}}^{\nu } {\setminus } \{ 0 \}, \end{aligned}$$
(1.2)

and \( f(\varphi , x, z )\), \(\varphi \in {\mathbb {T}}^{\nu }\), \( z := (z_0, z_1, z_2, z_3) \in {\mathbb {R}}^4 \), is a finitely many times differentiable function, namely

$$\begin{aligned} f \in C^q ( {\mathbb {T}}^{\nu } \times {\mathbb {T}}\times {\mathbb {R}}^4; {\mathbb {R}}) \end{aligned}$$
(1.3)

for some \( q \in {\mathbb {N}}\) large enough. For simplicity we fix in (1.2) the diophantine exponent \( \tau _0 := \nu \). The only “external” parameter in (1.1) is \( \lambda \), which is the length of the frequency vector (this corresponds to a time scaling). We consider the following questions:

  • For \( \varepsilon \) small enough, do there exist quasi-periodic solutions of (1.1) for positive measure sets of \( \lambda \in \Lambda \)?

  • Are these solutions linearly stable?
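As an illustration of the diophantine condition (1.2), one can estimate the best constant \( \gamma _0 \) numerically for a sample frequency vector; the following sketch (the vector \( \bar{\omega }= (1, \sqrt{2}) \), \( \nu = 2 \), \( \tau _0 = 2 \) and the truncation \( L \) are illustrative choices, not taken from the paper) minimizes \( |\bar{\omega }\cdot l|\, |l|^{\tau _0} \) over the nonzero integer vectors in a box.

```python
import itertools, math

# Illustrative sketch (sample frequency vector, not from the paper):
# estimate the diophantine constant for ombar = (1, sqrt(2)), nu = 2,
# tau0 = nu = 2, by minimizing |ombar . l| * |l|^tau0 over the nonzero
# integer vectors with |l| <= L, where |l| denotes the sup-norm.
def diophantine_constant(ombar, tau0, L):
    best = float("inf")
    for l in itertools.product(range(-L, L + 1), repeat=len(ombar)):
        if all(li == 0 for li in l):
            continue  # skip l = 0
        norm = max(abs(li) for li in l)
        val = abs(sum(w * li for w, li in zip(ombar, l))) * norm ** tau0
        best = min(best, val)
    return best

gamma = diophantine_constant((1.0, math.sqrt(2.0)), tau0=2, L=30)
print(gamma)  # strictly positive, attained at l = (-1, 1)
```

Since \( \sqrt{2} \) is badly approximable, the minimum stays bounded away from zero as \( L \) grows, consistently with (1.2).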

Clearly, if \( f(\varphi ,x, 0)\) is not identically zero, then \( u = 0 \) is not a solution of (1.1) for \( \varepsilon \ne 0 \). Thus we look for non-trivial \( (2 \pi )^{\nu +1}\)-periodic solutions \(u(\varphi ,x) \) of the Airy equation

$$\begin{aligned} \omega \cdot \partial _{\varphi } u + u_{xxx} + \varepsilon f(\varphi , x , u, u_{x}, u_{xx}, u_{xxx} ) = 0 \end{aligned}$$
(1.4)

in the Sobolev space

$$\begin{aligned} H^s&:= H^s ( {\mathbb {T}}^\nu \times {\mathbb {T}}; {\mathbb {R}}) \nonumber \\&:= \left\{ u(\varphi ,x) = \sum _{(l,j) \in {\mathbb {Z}}^{\nu } \times {\mathbb {Z}}} u_{l,j} \, e^{\mathrm{i} (l \cdot \varphi + jx)} \, : \ \ {\bar{u}}_{l,j} = u_{-l,-j} ,\right. \nonumber \\&\quad \left. \ \ \Vert u \Vert _s^2 := \sum _{(l,j) \in {\mathbb {Z}}^{\nu } \times {\mathbb {Z}}} \langle l, j \rangle ^{2s} | u_{l,j} |^ 2 < \infty \right\} \end{aligned}$$
(1.5)

where

$$\begin{aligned} \langle l,j \rangle := \max \{ 1, |l|, |j| \}. \end{aligned}$$

From now on, we fix \( {{\mathfrak {s}}}_0 := (\nu + 2) / 2 > (\nu +1 ) / 2 \), so that for all \(s \ge \mathfrak {s}_0\) the Sobolev space \(H^s\) is a Banach algebra, and it is continuously embedded \( H^s ({\mathbb {T}}^{\nu +1} ) \hookrightarrow C({\mathbb {T}}^{\nu +1} ) \).
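For completeness, here is the standard one-line computation behind this choice of \( {\mathfrak {s}}_0 \) (a sketch): by Cauchy–Schwarz,

```latex
\| u \|_{L^\infty({\mathbb T}^{\nu+1})}
\;\le\; \sum_{(l,j) \in {\mathbb Z}^{\nu} \times {\mathbb Z}} |u_{l,j}|
\;\le\; \Big( \sum_{(l,j) \in {\mathbb Z}^{\nu} \times {\mathbb Z}}
\langle l,j \rangle^{-2s} \Big)^{1/2} \, \| u \|_s ,
```

and the series \( \sum \langle l,j \rangle ^{-2s} \) converges precisely when \( 2s > \nu + 1 \), which holds for every \( s \ge {\mathfrak {s}}_0 = (\nu +2)/2 \). The algebra property follows from the same summability applied to the convolution \( (uv)_{l,j} = \sum _{(l',j')} u_{l-l',j-j'} \, v_{l',j'} \).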

We need some assumptions on the perturbation \( f(\varphi , x,u, u_x, u_{xx}, u_{xxx}) \). We suppose that

  • Type (F). The fully nonlinear perturbation has the form

    $$\begin{aligned} f (\varphi , x, u, u_x, u_{xxx}) , \end{aligned}$$
    (1.6)

namely it is independent of \( u_{xx} \) (note that the dependence on \( u_{xxx} \) may be nonlinear). Otherwise, we require that

  • Type (Q). The perturbation is quasi-linear, namely

    $$\begin{aligned} f = f_0 (\varphi , x, u, u_x, u_{xx}) + f_1 (\varphi , x,u,u_x, u_{xx}) u_{xxx} \end{aligned}$$

    is affine in \( u_{xxx} \), and it satisfies (naming the variables \( z_0 = u \), \( z_1 = u_x \), \( z_2 = u_{xx} \), \( z_3 = u_{xxx} \))

    $$\begin{aligned} \partial _{z_2} f = \alpha (\varphi ) \left( \partial ^2_{z_3 x} f + z_1 \partial ^2_{z_3 z_0} f + z_2 \partial ^2_{z_3 z_1} f + z_3 \partial ^2_{z_3 z_2} f \right) \end{aligned}$$
    (1.7)

    for some function \( \alpha (\varphi ) \) (independent of \( x \)).

The Hamiltonian nonlinearities in (1.11) satisfy the above assumption (Q), see remark 3.2. In comment 3 after Theorem 1.5 we explain the reason for assuming either condition (F) or (Q).

The following theorem is an existence result of quasi-periodic solutions.

Theorem 1.1

(Existence) There exist \( s := s( \nu ) > 0\), \( q := q( \nu ) \in {\mathbb {N}}\), such that:

For every quasi-linear nonlinearity \( f \in C^q \) of the form

$$\begin{aligned} f = \partial _x \left( g(\omega t, x, u, u_x, u_{xx})\right) \end{aligned}$$
(1.8)

satisfying the (Q)-condition (1.7), for all \(\varepsilon \in (0, \varepsilon _0)\), where \(\varepsilon _0 := \varepsilon _0 (f, \nu ) \) is small enough, there exists a Cantor set \( \mathcal{C}_\varepsilon \subset \Lambda \) of asymptotically full Lebesgue measure, i.e.

$$\begin{aligned} | \mathcal{C}_\varepsilon | \rightarrow 1 \quad \text {as} \quad \varepsilon \rightarrow 0, \end{aligned}$$
(1.9)

such that \( \forall \lambda \in \mathcal{C}_\varepsilon \) the perturbed equation (1.4) has a solution \( u( \varepsilon , \lambda ) \in H^s \) with \( \Vert u(\varepsilon , \lambda ) \Vert _s \rightarrow 0 \) as \( \varepsilon \rightarrow 0 \).

We may ensure the linear stability of the solutions by requiring further conditions on the nonlinearity, see Theorem 1.5 for the precise statement. The first case is that of Hamiltonian equations

$$\begin{aligned} u_t = \partial _x \nabla _{L^2} H(t,x,u, u_x), \quad H(t,x,u,u_x) := \int _{{\mathbb {T}}} \frac{u_x^2}{2}+ \varepsilon F(\omega t,x,u,u_x) \, dx \end{aligned}$$
(1.10)

which have the form (1.1), (1.8) with

$$\begin{aligned} f(\varphi ,x,u,u_x, u_{xx}, u_{xxx}) = - \partial _x \big \{ (\partial _{z_0} F)(\varphi , x, u , u_x) \big \} + \partial _{xx} \big \{ (\partial _{z_1} F)(\varphi , x, u, u_x ) \big \}.\nonumber \\ \end{aligned}$$
(1.11)
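For the reader's convenience, here is the short computation behind (1.11) (a sketch, with the variables \( z_0 = u \), \( z_1 = u_x \)): integrating by parts in the differential \( dH[h] = \int _{\mathbb {T}} u_x h_x + \varepsilon ( \partial _{z_0} F \, h + \partial _{z_1} F \, h_x ) \, dx \) gives

```latex
\nabla_{L^2} H
= - u_{xx}
+ \varepsilon \big( (\partial_{z_0} F)(\omega t, x, u, u_x)
- \partial_x \{ (\partial_{z_1} F)(\omega t, x, u, u_x) \} \big) ,
\qquad
u_t = \partial_x \nabla_{L^2} H
= - u_{xxx}
- \varepsilon \big( - \partial_x \{ \partial_{z_0} F \}
+ \partial_{xx} \{ \partial_{z_1} F \} \big) ,
```

which is exactly (1.1) with \( f \) as in (1.11); note that \( f = \partial _x g \) with \( g = - \partial _{z_0} F + \partial _x \{ \partial _{z_1} F \} \), i.e. of the form (1.8).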

The phase space of (1.10) is

$$\begin{aligned} H^1_0 ({\mathbb {T}}) := \left\{ u(x) \in H^1({\mathbb {T}}, {\mathbb {R}}) \, : \, \int _{{\mathbb {T}}} u(x) \, dx = 0 \right\} \end{aligned}$$

endowed with the non-degenerate symplectic form

$$\begin{aligned} \Omega (u,v) := \int _{{\mathbb {T}}} (\partial _x^{-1} u) v \, dx , \quad \forall u, v \in H_0^1 ({\mathbb {T}}) , \end{aligned}$$
(1.12)

where \( \partial _x^{-1} u \) is the periodic primitive of \( u \) with zero average, see (3.19). As proved in Remark 3.2, the Hamiltonian nonlinearity \( f \) in (1.11) also satisfies the (Q)-condition (1.7). As a consequence, Theorem 1.1 implies the existence of quasi-periodic solutions of (1.10). In addition, we also prove their linear stability.

Theorem 1.2

(Hamiltonian case) For all Hamiltonian quasi-linear equations (1.10) the quasi-periodic solution \(u(\varepsilon ,\lambda )\) found in Theorem 1.1 is linearly stable (see Theorem 1.5).

The stability of the quasi-periodic solutions also follows from the reversibility condition

$$\begin{aligned} f (-\varphi , -x, z_0, -z_1, z_2, -z_3) = - f(\varphi , x, z_0, z_1, z_2, z_3). \end{aligned}$$
(1.13)

Actually (1.13) implies that the infinite-dimensional non-autonomous dynamical system

$$\begin{aligned} u_t = V(t, u ), \quad V(t,u ) := - u_{xxx} - \varepsilon f(\omega t , x , u, u_{x}, u_{xx}, u_{xxx}) \end{aligned}$$

is reversible with respect to the involution

$$\begin{aligned} S : u(x) \rightarrow u(-x), \quad S^2 = I, \end{aligned}$$

namely

$$\begin{aligned} - S V(-t,u) = V(t,Su). \end{aligned}$$
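Indeed, this identity can be checked directly (a one-line sketch): writing \( w_k := (\partial _x^k u)(-x) \) and using \( \partial _x^k (Su)(x) = (-1)^k (\partial _x^k u)(-x) \),

```latex
V(t, Su)(x)
= u_{xxx}(-x) - \varepsilon f(\omega t, x, w_0, -w_1, w_2, -w_3)
\overset{(1.13)}{=} u_{xxx}(-x) + \varepsilon f(-\omega t, -x, w_0, w_1, w_2, w_3)
= - \big( S V(-t, u) \big)(x) .
```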

In this case it is natural to look for “reversible” solutions of (1.4), that is

$$\begin{aligned} u ( \varphi ,x ) = u ( -\varphi , -x ). \end{aligned}$$
(1.14)

Theorem 1.3

(Reversible case) There exist \( s := s( \nu ) > 0\), \( q := q( \nu ) \in {\mathbb {N}}\), such that:

For every nonlinearity \( f \in C^q \) that satisfies

  1. (i)

    the reversibility condition (1.13), and

  2. (ii)

    either the (F)-condition (1.6) or the (Q)-condition (1.7), for all \(\varepsilon \in (0, \varepsilon _0)\), where \(\varepsilon _0 := \varepsilon _0 (f, \nu ) \) is small enough, there exists a Cantor set \( \mathcal{C}_\varepsilon \subset \Lambda \) with Lebesgue measure satisfying (1.9), such that for all \( \lambda \in \mathcal{C}_\varepsilon \) the perturbed Airy equation (1.4) has a solution \( u (\varepsilon , \lambda ) \in H^s \) that satisfies (1.14), with \( \Vert u (\varepsilon , \lambda ) \Vert _s \rightarrow 0 \) as \( \varepsilon \rightarrow 0 \). In addition, \(u(\varepsilon ,\lambda )\) is linearly stable.

Let us make some comments on the results.

  1. 1.

    The quasi-periodic solutions of Theorem 1.1 could be unstable because the nonlinearity \( f \) has no special structure and some eigenvalues of the linearized operator at the solutions could have nonzero real part (partially hyperbolic tori). In any case, we reduce the linearized operator to constant coefficients (Theorem 1.4) and we may compute its eigenvalues (i.e. Lyapunov exponents) with any order of accuracy. With further conditions on the nonlinearity—like reversibility or in the Hamiltonian case—the eigenvalues are purely imaginary, and the torus is linearly stable. The present situation is very different from that of [9, 10, 12–15, 17] and also [2, 24–26], where the lack of stability information is due to the fact that the linearized equation has variable coefficients.

  2. 2.

    One cannot expect the existence of quasi-periodic solutions of (1.4) for any perturbation \( f \). Actually, if \( f = m \ne 0 \) is a constant, then, integrating (1.4) in \( (\varphi ,x) \) we find the contradiction \( \varepsilon m = 0 \). This is a consequence of the fact that

    $$\begin{aligned} \mathrm{Ker}(\omega \cdot \partial _\varphi + \partial _{xxx}) = {\mathbb {R}}\end{aligned}$$
    (1.15)

    is nontrivial. Both the condition (1.8) (which is satisfied by the Hamiltonian nonlinearities) and the reversibility condition (1.13) allow us to overcome this obstruction, working in a space of functions with zero average. The degeneracy (1.15) is also reflected in the fact that the solutions of (1.4) appear as a \(1\)-dimensional family \( c + u_c( \varepsilon , \lambda ) \) parametrized by the “average” \( c \in {\mathbb {R}}\). We could also avoid this degeneracy by adding a “mass” term \( + m u \) in (1.1), but this does not seem to have a physical meaning.

  3. 3.

    In Theorem 1.1 we have not considered the case in which \( f \) is fully nonlinear and satisfies condition (F) in (1.6), because any nonlinearity of the form (1.8) is automatically quasi-linear (and so the first requirement of condition (Q) holds), and (1.6) trivially implies the identity (1.7) with \( \alpha (\varphi ) = 0 \).

  4. 4.

    The solutions \( u \in H^s \) have the same regularity in both variables \( (\varphi ,x) \). This functional setting is convenient when using changes of variables that mix the time and space variables, like the composition operators \({\mathcal {A}}\), \(\mathcal {T}\) in Sects. 3.1, 3.4.

  5. 5.

    In the Hamiltonian case (1.10), the nonlinearity \(f\) in (1.11) satisfies the reversibility condition (1.13) if and only if \( F( -\varphi , -x, z_0, -z_1) = F( \varphi , x, z_0, z_1) \).

Theorems 1.1–1.3 are based on a Nash–Moser iterative scheme. An essential ingredient in the proof—which also implies the linear stability of the quasi-periodic solutions—is the reducibility of the linear operator

$$\begin{aligned} {\mathcal {L}}:= {\mathcal {L}}(u) = \omega \cdot \partial _\varphi + (1 + a_3(\varphi ,x)) \partial _{xxx} + a_2(\varphi ,x) \partial _{xx} + a_1(\varphi ,x) \partial _x + a_0 (\varphi ,x)\nonumber \\ \end{aligned}$$
(1.16)

obtained by linearizing (1.4) at any approximate (or exact) solution \( u \), where the coefficients \( a_i (\varphi , x) \) are defined in (3.2). Let \( H^s_x := H^s ({\mathbb {T}}) \) denote the usual Sobolev space of functions of \( x \in {\mathbb {T}}\) only.
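Schematically (the precise definition of the coefficients is in (3.2)), linearizing the left hand side of (1.4) at \( u \) in the direction \( h \) gives

```latex
{\mathcal L}(u) h
= \omega \cdot \partial_\varphi h
+ \big( 1 + \varepsilon \, \partial_{z_3} f \big) \, \partial_{xxx} h
+ \varepsilon \, \partial_{z_2} f \, \partial_{xx} h
+ \varepsilon \, \partial_{z_1} f \, \partial_x h
+ \varepsilon \, \partial_{z_0} f \, h ,
```

so that \( a_3 = \varepsilon \, \partial _{z_3} f \) and \( a_i = \varepsilon \, \partial _{z_i} f \), \( i = 0,1,2 \), with the derivatives of \( f \) evaluated at \( (\varphi , x, u, u_x, u_{xx}, u_{xxx}) \); in particular all the coefficients \( a_i = O(\varepsilon ) \).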

Theorem 1.4

(Reducibility) There exist \( \bar{\sigma }> 0 \), \( q \in {\mathbb {N}}\), depending on \( \nu \), such that:

For every nonlinearity \( f \in C^q \) that satisfies the hypotheses of Theorems 1.1 or 1.3, for all \(\varepsilon \in (0, \varepsilon _0)\), where \(\varepsilon _0 := \varepsilon _0 (f, \nu ) \) is small enough, for all \(u\) in the ball \(\Vert u \Vert _{ { {\mathfrak {s}}}_0 + \bar{\sigma }} \le 1\), there exists a Cantor-like set \( \Lambda _\infty (u) \subset \Lambda \) such that, for all \( \lambda \in \Lambda _\infty (u) \):

  1. i)

    for all \( s \in ({{\mathfrak {s}}}_0, q - \bar{\sigma }) \), if \(\Vert u \Vert _{ s + \bar{\sigma }} < + \infty \) then there exist linear invertible bounded operators \( W_1 \), \( W_2 : H^s ({\mathbb {T}}^{\nu +1})\rightarrow H^s ( {\mathbb {T}}^{\nu +1} ) \) (see (4.72)) with bounded inverse, that semi-conjugate the linear operator \( \mathcal{L}(u) \) in (1.16) to the diagonal operator \( \mathcal{L}_\infty \), namely

    $$\begin{aligned} \mathcal{L}(u) = W_1 \mathcal{L}_\infty W_2^{-1} , \quad \mathcal{L}_\infty := {\omega \cdot \partial _{\varphi }}+ \mathcal{D}_\infty \end{aligned}$$
    (1.17)

    where

    $$\begin{aligned}&\mathcal{D}_\infty := \mathrm{diag}_{j \in {\mathbb {Z}}} \{ \mu _j \}, \quad \mu _j := \mathrm{i} (-m_3 j^3 + m_1 j) + r_j , \quad m_3, m_1 \in {\mathbb {R}},\nonumber \\&\sup _j |r_j | \le C \varepsilon . \end{aligned}$$
    (1.18)
  2. ii)

    For each \( \varphi \in {\mathbb {T}}^\nu \) the operators \( W_i \) are also bounded linear bijections of \( H^s_x \) (see notation (2.18))

    $$\begin{aligned} W_i ( \varphi ) , W_i^{-1} ( \varphi ) : H^s_x \rightarrow H^s_x , \quad i = 1,2. \end{aligned}$$

    A curve \( h(t) = h(t, \cdot ) \in H^{s}_x \) is a solution of the quasi-periodically forced linear equation

    $$\begin{aligned} \partial _t h + (1 + a_3(\omega t,x)) \partial _{xxx}h + a_2(\omega t,x) \partial _{xx}h + a_1(\omega t,x) \partial _xh + a_0 (\omega t,x)h = 0\nonumber \\ \end{aligned}$$
    (1.19)

    if and only if the transformed curve

    $$\begin{aligned} v(t) := v(t, \cdot ) := W_2^{-1} ( \omega t ) [h(t)] \in H^{s}_x \end{aligned}$$

    is a solution of the constant coefficients dynamical system

    $$\begin{aligned} \partial _t v + \mathcal{D}_\infty v = 0 \, , \quad {\dot{v}}_j = - \mu _j v_j , \ \ \forall j \in {\mathbb {Z}}. \end{aligned}$$
    (1.20)

    In the reversible or Hamiltonian case all the \( \mu _j \in \mathrm{i} {\mathbb {R}}\) are purely imaginary.

The operator \( W_1 \) differs from \( W_2 \) (see (4.72)) only by the multiplication by the function \( \rho \) in (3.26), which comes from the re-parametrization of time of Sect. 3.2. As explained in Sect. 2.2, this does not affect the dynamical consequences of Theorem 1.4-\(ii\)).

The exponents \( \mu _j \) can be effectively computed. All the solutions of (1.20) are

$$\begin{aligned} v(t) = \sum _{j \in {\mathbb {Z}}} v_j(t) e^{\mathrm{i} j x} , \quad v_j(t) = e^{- \mu _j t } v_j(0). \end{aligned}$$
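When the \( \mu _j \) are purely imaginary, each \( |v_j(t)| = |v_j(0)| \), so every Sobolev norm of \( v \) is preserved by this flow. A minimal numerical sketch (the values of \( m_3, m_1, s, t \) and the random Fourier data are illustrative, not from the paper):

```python
import cmath, math, random

# Illustrative sketch: with purely imaginary exponents
# mu_j = i(-m3 j^3 + m1 j), the flow v_j(t) = exp(-mu_j t) v_j(0)
# of (1.20) preserves the Sobolev norm of the solution.
def sobolev_norm(v, s):
    # ||v||_{H^s_x}^2 = sum_j <j>^{2s} |v_j|^2, with <j> = max(1, |j|)
    return math.sqrt(sum(abs(vj) ** 2 * max(1, abs(j)) ** (2 * s)
                         for j, vj in v.items()))

random.seed(0)
m3, m1, s, t = 1.01, 0.02, 2, 7.3          # sample values
v0 = {j: complex(random.gauss(0, 1), random.gauss(0, 1))
      for j in range(-10, 11)}              # random truncated Fourier data
mu = {j: 1j * (-m3 * j ** 3 + m1 * j) for j in v0}
vt = {j: cmath.exp(-mu[j] * t) * v0[j] for j in v0}
print(sobolev_norm(vt, s) - sobolev_norm(v0, s))  # ~ 0 up to round-off
```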

If the \( \mu _j \) are purely imaginary—as in the reversible or the Hamiltonian cases—all the solutions of (1.20) are almost periodic in time (in general) and the Sobolev norm

$$\begin{aligned} \Vert v(t) \Vert _{H^s_x} = \left( \,\sum _{j \in {\mathbb {Z}}} |v_j(t)|^2 \langle j \rangle ^{2s}\right) ^{1/2} = \left( \,\sum _{j \in {\mathbb {Z}}} |v_j(0)|^2 \langle j \rangle ^{2s}\right) ^{1/2} = \Vert v(0) \Vert _{H^s_x} \end{aligned}$$
(1.21)

is constant in time. As a consequence we have:

Theorem 1.5

(Linear stability) Assume the hypotheses of Theorem 1.4 and, in addition, that \( f \) is Hamiltonian (see (1.11)) or satisfies the reversibility condition (1.13). Then, for all \( s \in ( \mathfrak {s}_0, q - \bar{\sigma }- {\mathfrak {s}}_0) \), if \( \Vert u \Vert _{s+ {\mathfrak {s}}_0 + \bar{\sigma }} < + \infty \), then there exists \( K_0 > 0 \) such that, for all \(\lambda \in \Lambda _\infty (u) \) and \(\varepsilon \in (0,\varepsilon _0)\), all the solutions of (1.19) satisfy

$$\begin{aligned} \Vert h(t)\Vert _{H^s_x} \le K_0 \Vert h(0)\Vert _{H^s_x} \, \end{aligned}$$
(1.22)

and, for some \( \mathtt a \in (0,1) \),

$$\begin{aligned} \Vert h(0)\Vert _{H^s_x} - \varepsilon ^{\mathtt a} K_0 \Vert h(0)\Vert _{H^{s+1}_x} \le \Vert h(t)\Vert _{H^s_x} \le \Vert h(0)\Vert _{H^s_x} + \varepsilon ^{\mathtt a} K_0 \Vert h(0)\Vert _{H^{s+1}_x}. \end{aligned}$$
(1.23)

Theorems 1.1–1.5 are proved in Sect. 5.1, collecting all the information of Sects. 2–5.

1.2 Ideas of the proof

The proof of Theorems 1.1–1.3 is based on a Nash–Moser iterative scheme in the scale of Sobolev spaces \( H^s \). The main issue concerns the invertibility of the linearized operator \( \mathcal{L} \) in (1.16), at each step of the iteration, and the proof of the tame estimates (5.7) for its right inverse. This information is obtained in Theorem 4.3 by conjugating \( \mathcal{L} \) to constant coefficients. This is also the key step which implies the stability results for the Hamiltonian and reversible nonlinearities, see Theorems 1.4–1.5.

We now explain the main ideas of the reducibility scheme. The term of \( \mathcal{L} \) that produces the strongest perturbative effects on the spectrum (and eigenfunctions) is \( a_3 (\varphi ,x) \partial _{xxx} \), and, then, \( a_2 (\varphi ,x) \partial _{xx} \). The usual KAM transformations are not able to deal with these terms because they are “too close” to the identity. Our strategy is the following. First, we conjugate the operator \( {\mathcal {L}}\) in (1.16) to a constant coefficients third order differential operator plus a zero order remainder

$$\begin{aligned} \begin{aligned} {\mathcal {L}}_5&= \omega \cdot \partial _\varphi + m_3 \partial _{xxx} + m_1 \partial _x + \mathcal{R}_0, \\ m_3&= 1 + O(\varepsilon ),\,\, m_1 = O(\varepsilon ), \,\, m_1, m_3 \in {\mathbb {R}}, \end{aligned} \end{aligned}$$
(1.24)

(see (3.55)), via changes of variables induced by diffeomorphisms of the torus, a reparametrization of time, and pseudo-differential operators. This is the goal of Sect. 3. All these transformations could be composed into one map, but we find it more convenient to split the regularization procedure into separate steps (Sects. 3.1–3.5), both to highlight the basic ideas, and, especially, in order to derive estimates on the coefficients in Sect. 3.6. Let us make some comments on this procedure.

  1. 1.

    In order to eliminate the space variable dependence of the highest order perturbation \( a_3 (\varphi ,x) \partial _{xxx} \) (see (3.20)) we use, in Sect. 3.1, \(\varphi \)-dependent changes of variables of the form

    $$\begin{aligned} (\mathcal{A} h)(\varphi , x) := h(\varphi , x + \beta (\varphi , x)). \end{aligned}$$

    These transformations converge pointwise to the identity as \( \beta \rightarrow 0 \), but not in operator norm. If \( \beta \) is odd, \({\mathcal {A}}\) preserves the reversible structure, see Remark 3.4. On the other hand, for the Hamiltonian equation (1.10) we use the modified transformation

    $$\begin{aligned} (\mathcal{A}h)(\varphi ,x)&:= (1+ \beta _x(\varphi , x)) \, h(\varphi , x + \beta (\varphi , x))\nonumber \\&= \frac{d}{dx} \big \{ ({\partial _x}^{-1} h )(\varphi , x+ \beta (\varphi ,x)) \big \} \end{aligned}$$
    (1.25)

    for all \( h(\varphi , \cdot ) \in H^1_0 ({\mathbb {T}}) \). This map is canonical, for each \( \varphi \in {\mathbb {T}}^\nu \), with respect to the KdV-symplectic form (1.12), see Remark 3.3. Thus (1.25) preserves the Hamiltonian structure and also eliminates the term of order \( \partial _{xx} \), see Remark 3.5.

  2. 2.

    In the second step of Sect. 3.2 we eliminate the time dependence of the coefficients of the highest order spatial derivative operator \( \partial _{xxx} \) by a quasi-periodic time re-parametrization. This procedure preserves the reversible and the Hamiltonian structure, see Remark 3.6 and 3.7.

  3. 3.

    Assumptions (Q) (see (1.7)) or (F) (see (1.6)) allow us to eliminate terms like \( a (\varphi , x) \partial _{xx} \) along this reduction procedure, see (3.41). This is possible, by a conjugation with multiplication operators (see (3.34)), if (see (3.40))

    $$\begin{aligned} \int _{\mathbb {T}}\frac{a_2(\varphi ,x)}{1 + a_3(\varphi ,x)} \, dx = 0. \end{aligned}$$
    (1.26)

    If (F) holds, then the coefficient \( a_2(\varphi ,x) = 0 \) and (1.26) is satisfied. If (Q) holds, then an easy computation shows that \( a_2(\varphi ,x) = \alpha (\varphi ) \, \partial _x a_3(\varphi ,x) \) (using the explicit expression of the coefficients in (3.2)), and so

    $$\begin{aligned} \int _{\mathbb {T}}\frac{a_2(\varphi ,x)}{1 + a_3(\varphi ,x)} \, dx = \int _{\mathbb {T}}\alpha (\varphi ) \, \partial _x \left( \log [ 1+a_3(\varphi ,x)] \right) \, dx = 0 . \end{aligned}$$

    In both cases (Q) and (F), condition (1.26) is satisfied. In the Hamiltonian case there is no need for this step because the symplectic transformation (1.25) also eliminates the term of order \( \partial _{xx} \), see Remark 3.7. We note that without assumptions (Q) or (F) we may always reduce \({\mathcal {L}}\) to a time-dependent operator with \( a (\varphi ) \partial _{xx} \). If \( a(\varphi ) \) were a constant, then this term would even simplify the analysis, killing the small divisors. The pathological situation that we avoid by assuming (Q) or (F) is when \( a(\varphi ) \) changes sign. In such a case, this term acts as a friction when \( a(\varphi ) < 0 \) and as an amplifier when \( a(\varphi ) > 0 \).

  4. 4.

    In Sects. 3.4–3.5, we are finally able to conjugate the linear operator to another one with a coefficient in front of \( \partial _x \) which is constant, i.e. obtaining (1.24). In this step we use a transformation of the form \( I + w(\varphi ,x) \partial _x^{-1} \), see (3.49). In the Hamiltonian case we use the symplectic map \( e^{\pi _0 w(\varphi , x) \partial _x^{-1}} \), see Remark 3.13.

  5. 5.

    We can iterate the regularization procedure at any finite order \( k = 0, 1, \ldots \), conjugating \( \mathcal{L} \) to an operator of the form \({\mathfrak D} + \mathcal{R }\), where

    $$\begin{aligned} {\mathfrak D} = \omega \cdot \partial _\varphi + {\mathcal {D}}, \quad {\mathcal {D}}= m_3 \partial _{x}^3 + m_1 \partial _x + \cdots + m_{-k} \partial _x^{-k} , \quad m_{i} \in {\mathbb {R}}, \end{aligned}$$

    has constant coefficients, and the rest \( \mathcal{R } \) is arbitrarily regularizing in space, namely

    $$\begin{aligned} \partial _x^{k} \circ {\mathcal {R}}= \text {bounded}. \end{aligned}$$
    (1.27)

    However, one cannot iterate this regularization infinitely many times, because it is not a quadratic scheme, and therefore, because of the small divisors, it does not converge. This regularization procedure is sufficient to prove the invertibility of \( \mathcal{L} \), giving tame estimates for the inverse, in the periodic case, but it does not work for quasi-periodic solutions. The reason is the following. In order to use Neumann series, one needs that \({\mathfrak D} ^{-1}{\mathcal {R}}= ({\mathfrak D}^{-1}\partial _x^{-k}) (\partial _x^{k} {\mathcal {R}})\) is bounded, namely, in view of (1.27), that \( {\mathfrak D} ^{-1}\partial _x^{-k} \) is bounded. In the region where the eigenvalues \((\mathrm{i} \omega \cdot l + {\mathcal {D}}_j)\) of \({\mathfrak D} \) are small, space and time derivatives are related, \(|\omega \cdot l| \sim |j|^3\), where \(l\) is the Fourier index of time, \(j\) is that of space, and \({\mathcal {D}}_j = - \mathrm{i} m_3 j^3 + \mathrm{i} m_1 j + \cdots \) are the eigenvalues of \({\mathcal {D}}\). Imposing the first order Melnikov conditions \(|\mathrm{i} \omega \cdot l + {\mathcal {D}}_j| > \gamma |l|^{-\tau }\), in that region, \(( {\mathfrak D} ^{-1}\partial _x^{-k})\) has eigenvalues

    $$\begin{aligned} \left| \frac{1}{(\mathrm{i} \omega \cdot l + {\mathcal {D}}_j) j^{k}}\, \right| < \frac{|l|^\tau }{\gamma |j|^{k}} \, < \frac{C |l|^\tau }{|\omega \cdot l|^{k/3}}. \end{aligned}$$

    In the periodic case, \(\omega \in {\mathbb {R}}\), \(l \in {\mathbb {Z}}\), \(|\omega \cdot l| = |\omega | |l|\), and this determines the order of regularization that is required by the procedure: \( k \ge 3 \tau \). In the quasi-periodic case, instead, \(|l|\) is not controlled by \(|\omega \cdot l|\), and the argument fails.
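    Indeed, inserting \( |\omega \cdot l| = |\omega | |l| \) into the last bound gives

    $$\begin{aligned} \frac{C |l|^\tau }{|\omega \cdot l|^{k/3}} = \frac{C}{|\omega |^{k/3}} \, |l|^{\tau - \frac{k}{3}} \le \frac{C}{|\omega |^{k/3}} \quad \mathrm{for} \ k \ge 3 \tau , \end{aligned}$$

    so that the eigenvalues of \( {\mathfrak D}^{-1} \partial _x^{-k} \) remain bounded.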

Once (1.24) has been obtained, we implement a quadratic reducibility KAM scheme to diagonalize \( \mathcal{L}_5 \), namely to conjugate \( \mathcal{L}_5 \) to the diagonal operator \( \mathcal{L}_\infty \) in (1.17). Since we work with finite regularity, we perform a Nash–Moser smoothing regularization (time-Fourier truncation). We use standard KAM transformations, in order to decrease, quadratically at each step, the size of the perturbation \({\mathcal {R}}\), see Sect. 4.1.1. This iterative scheme converges (Theorem 4.2) because the initial remainder \( \mathcal{R}_0 \) is a bounded operator (in the space variable \(x\)), and this property is preserved along the iteration. This is the reason for performing the regularization procedure of Sects. 3.1–3.5. The second order Melnikov non-resonance conditions required by the reducibility scheme (see (4.17)) are verified thanks to the good control of the eigenvalues

$$\begin{aligned} \mu _j = - \mathrm{i} m_3(\varepsilon ,\lambda ) j^3 + \mathrm{i} m_1(\varepsilon ,\lambda ) j + r_j (\varepsilon ,\lambda ) , \quad \sup _j |r_j (\varepsilon ,\lambda )| = O(\varepsilon ). \end{aligned}$$

We underline that the goal of the Töplitz–Lipschitz [19, 21, 23] and quasi-Töplitz [6, 7, 38, 39] properties is precisely to provide an asymptotic expansion of the perturbed eigenvalues sharp enough to verify the second order Melnikov conditions.

Note that the above eigenvalues \( \mu _j \) may fail to be purely imaginary, i.e. \( r_j \) may have a non-zero real part which depends on the nonlinearity (unlike the reversible or Hamiltonian case, where \( r_j \in \mathrm{i} {\mathbb {R}}\)). In such a case, the invariant torus could be (partially) hyperbolic. Since we do not control the real part of \( r_j \) (i.e. the hyperbolicity may vanish), we perform the measure estimates proving diophantine lower bounds for the imaginary part of the small divisors.

The final comment concerns the dynamical consequences of Theorem 1.4-\(ii\)). All the above transformations (both the changes of variables of Sects. 3.1–3.5 as well as the KAM matrices of the reducibility scheme) are time-dependent quasi-periodic maps of the phase space (of functions of \( x\) only), see Sect. 2.2. It is thanks to this “Töplitz-in-time” structure that the linear equation (1.19) is transformed into the dynamical system (1.20), as explained in Sect. 2.2. Note that in [24] (and also [9, 10, 15]) the analogous transformations do not have this Töplitz-in-time structure, and stability information is not obtained.

2 Functional setting

For a function \(f : \Lambda _o \rightarrow E\), \(\lambda \mapsto f(\lambda )\), where \((E, \Vert \ \Vert _E)\) is a Banach space and \( \Lambda _o \) is a subset of \({\mathbb {R}}\), we define the sup-norm and the Lipschitz semi-norm

$$\begin{aligned} \begin{aligned} {\Vert } f \Vert ^{\sup }_E&:= \Vert f \Vert ^{\sup }_{E,\Lambda _o} := \sup _{ \lambda \in \Lambda _o } \Vert f(\lambda ) \Vert _E,\\ \Vert f \Vert ^{\mathrm{lip}}_E&:= \Vert f \Vert ^{\mathrm{lip}}_{E,\Lambda _o} := {\mathop {\mathop {\sup }\limits _{\lambda _1, \lambda _2 \in \Lambda _o}}\limits _{\lambda _1 \ne \lambda _2}} \frac{ \Vert f(\lambda _1) - f(\lambda _2) \Vert _E }{ | \lambda _1 - \lambda _2 | }, \end{aligned} \end{aligned}$$
(2.1)

and, for \( \gamma > 0 \), the Lipschitz norm

$$\begin{aligned} \Vert f \Vert ^{\mathrm{{Lip}(\gamma )}}_E := \Vert f \Vert ^\mathrm{{Lip}(\gamma )}_{E,\Lambda _o} := \Vert f \Vert ^{\sup }_E + \gamma \Vert f \Vert ^{\mathrm{lip}}_E . \end{aligned}$$
(2.2)

If \( E = H^s \) we simply denote \( \Vert f \Vert ^{\mathrm{{Lip}(\gamma )}}_{H^s} := \Vert f \Vert ^{\mathrm{{Lip}(\gamma )}}_s \).
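As a purely illustrative numerical sketch (not part of the paper's setting): for a scalar family, i.e. \( E = {\mathbb {R}} \) with the absolute value, the norms (2.1)–(2.2) can be approximated on a finite grid of parameters. The family \( f(\lambda ) = 2\lambda \) and the grid below are made-up examples, and the discrete maxima only approximate the suprema over \( \Lambda _o \).

```python
import numpy as np

def lip_gamma_norm(f, lambdas, gamma):
    """Approximate ||f||^{Lip(gamma)} = ||f||^sup + gamma ||f||^lip, see (2.1)-(2.2),
    for a scalar family f : Lambda_o -> R sampled on the grid `lambdas`."""
    vals = np.array([f(lam) for lam in lambdas])
    sup_norm = np.max(np.abs(vals))
    # Lipschitz semi-norm: sup over all pairs of distinct grid points
    gaps = np.abs(lambdas[:, None] - lambdas[None, :])
    diffs = np.abs(vals[:, None] - vals[None, :])
    mask = gaps > 0
    lip_seminorm = np.max(diffs[mask] / gaps[mask])
    return sup_norm + gamma * lip_seminorm

# f(lambda) = 2 lambda on [0,1]: sup-norm 2, Lipschitz semi-norm 2,
# so the Lip(gamma) norm with gamma = 1/2 is 2 + (1/2)*2 = 3
value = lip_gamma_norm(lambda lam: 2.0 * lam, np.linspace(0.0, 1.0, 101), 0.5)
print(value)
```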

As a notation, we write

$$\begin{aligned} a \le _s b \Longleftrightarrow a \le C(s) b \end{aligned}$$

for some constant \( C(s) \). For \( s = {\mathfrak {s}}_0 := (\nu +2) \slash 2 \) we only write \( a \,\lessdot \,b \). More in general the notation \( a \lessdot b \) means \( a \le C b \) where the constant \( C \) may depend on the data of the problem, namely the nonlinearity \( f \), the number \( \nu \) of frequencies, the diophantine vector \( \bar{\omega }\), the diophantine exponent \( \tau > 0 \) in the non-resonance conditions in (4.6). Also the small constants \( \delta \) in the sequel depend on the data of the problem.

2.1 Matrices with off-diagonal decay

Let \( b \in {\mathbb {N}}\) and consider the exponential basis \(\{ e_i : i \in {\mathbb {Z}}^b \} \) of \(L^2({\mathbb {T}}^b) \), so that \(L^2({\mathbb {T}}^b)\) is the vector space \(\{ u = \sum u_i e_i\), \(\sum |u_i |^2 < \infty \}\). Any linear operator \(A : L^2 ({\mathbb {T}}^b) \rightarrow L^2 ({\mathbb {T}}^b) \) can be represented by the infinite dimensional matrix

$$\begin{aligned} ( A_{i}^{i'} )_{i, i' \in {\mathbb {Z}}^b}, \quad A_{i}^{i'} := ( A e_{i'}, e_{i})_{L^2({\mathbb {T}}^b)}, \quad A u = \sum _{i, i'} A_{i}^{i'} u_{i'} e_{i}. \end{aligned}$$

We now define the \( s \)-norm (introduced in [9]) of an infinite dimensional matrix.

Definition 2.1

The \(s\)-decay norm of an infinite dimensional matrix \( A := (A_{i_1}^{i_2} )_{i_1, i_2 \in {\mathbb {Z}}^b } \) is

$$\begin{aligned} \left| A \right| _{s}^2 := \sum _{i \in {\mathbb {Z}}^b} \left\langle i \right\rangle ^{2s} \left( {\mathop {\sup }\limits _{i_{1} - i_{2} = i}} | A^{i_2}_{i_1}| \right) ^{2}. \end{aligned}$$
(2.3)

For parameter dependent matrices \( A := A(\lambda ) \), \(\lambda \in \Lambda _o \subseteq {\mathbb {R}}\), the definitions (2.1) and (2.2) become

$$\begin{aligned} \begin{aligned} | A |^{\sup }_s&:= \sup _{ \lambda \in \Lambda _o } | A(\lambda ) |_s , \quad | A |^{\mathrm{lip}}_s := \sup _{\lambda _1 \ne \lambda _2} \frac{ | A(\lambda _1) - A(\lambda _2) |_s }{ | \lambda _1 - \lambda _2 | }, \\ | A |^{\mathrm{{Lip}(\gamma )}}_s&:= | A |^{\sup }_s + \gamma | A |^{\mathrm{lip}}_s . \end{aligned} \end{aligned}$$

Clearly, the matrix decay norm (2.3) is increasing with respect to the index \( s \), namely

$$\begin{aligned} | A |_s \le | A |_{s'} , \quad \forall s < s'. \end{aligned}$$

The \( s \)-norm is designed to estimate the polynomial off-diagonal decay of matrices; indeed it implies

$$\begin{aligned} |A_{i_1}^{i_2}| \le \frac{|A|_s}{\langle i_1 - i_2 \rangle ^s} , \quad \forall i_1, i_2 \in {\mathbb {Z}}^b , \end{aligned}$$

and, on the diagonal elements,

$$\begin{aligned} |A_i^i | \le |A|_0 , \quad |A_i^i |^\mathrm{lip} \le |A|_0^\mathrm{lip}. \end{aligned}$$
(2.4)
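These two bounds can be checked numerically on a small made-up example (an illustrative sketch, with \( b = 1 \), a random matrix, and the convention \( \langle i \rangle := \max (1, |i|) \), which is an assumption here):

```python
import numpy as np

def decay_norm(A, idx, s):
    """s-decay norm (2.3) for b = 1, with the convention <i> := max(1, |i|)."""
    total = 0.0
    for i in range(-(len(idx) - 1), len(idx)):
        # sup over the entries on the off-diagonal i1 - i2 = i
        sup = max(abs(A[r, c]) for r, i1 in enumerate(idx)
                  for c, i2 in enumerate(idx) if i1 - i2 == i)
        total += max(1, abs(i)) ** (2 * s) * sup ** 2
    return total ** 0.5

idx = list(range(-5, 6))
rng = np.random.default_rng(0)
A = rng.standard_normal((11, 11)) / 10.0
s = 2
norm_s = decay_norm(A, idx, s)
# off-diagonal decay |A_{i1}^{i2}| <= |A|_s / <i1 - i2>^s ...
ok_decay = all(abs(A[r, c]) <= norm_s / max(1, abs(i1 - i2)) ** s + 1e-12
               for r, i1 in enumerate(idx) for c, i2 in enumerate(idx))
# ... and the diagonal bound (2.4): |A_i^i| <= |A|_0
ok_diag = all(abs(A[r, r]) <= decay_norm(A, idx, 0) + 1e-12
              for r in range(len(idx)))
print(ok_decay, ok_diag)
```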

We now list some properties of the matrix decay norm proved in [9].

Lemma 2.1

(Multiplication operator) Let \( p = \sum _i p_i e_i \in H^s({\mathbb {T}}^b)\). The multiplication operator \( h \mapsto p h\) is represented by the Töplitz matrix \( T_i^{i'} = p_{i - i'} \) and

$$\begin{aligned} |T|_s = \Vert p \Vert _s. \end{aligned}$$
(2.5)

Moreover, if \(p = p(\lambda )\) is a Lipschitz family of functions,

$$\begin{aligned} |T|_s^\mathrm{{Lip}(\gamma )}= \Vert p \Vert _s^\mathrm{{Lip}(\gamma )}. \end{aligned}$$
(2.6)
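Identity (2.5) can be verified directly on a truncated example (an illustrative sketch for \( b = 1 \): the Fourier coefficients of \( p \) below are made up, and \( \langle i \rangle := \max (1, |i|) \) is our assumed convention):

```python
import numpy as np

# Verify (2.5): the Toeplitz matrix T_i^{i'} = p_{i-i'} of the multiplication
# operator h -> p h has s-decay norm |T|_s equal to the Sobolev norm
# ||p||_s^2 = sum_i <i>^{2s} |p_i|^2.
s = 2
p = {-2: 0.5, 0: 1.0, 1: -0.25, 3: 0.1}          # Fourier coefficients p_i
idx = list(range(-6, 7))                          # truncated index set
T = np.array([[p.get(i1 - i2, 0.0) for i2 in idx] for i1 in idx])

def bracket(i):
    return max(1, abs(i))

sob = np.sqrt(sum(bracket(i) ** (2 * s) * c ** 2 for i, c in p.items()))
dec = np.sqrt(sum(bracket(i) ** (2 * s) *
                  max(abs(T[r, c]) for r, i1 in enumerate(idx)
                      for c, i2 in enumerate(idx) if i1 - i2 == i) ** 2
                  for i in range(-(len(idx) - 1), len(idx))))
print(abs(dec - sob) < 1e-12)
```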

The \(s\)-norm satisfies classical algebra and interpolation inequalities.

Lemma 2.2

(Interpolation) For all \(s \ge s_0 > b/2 \) there are \( C(s) \ge C(s_0) \ge 1 \) such that

$$\begin{aligned} | A B|_{s} \le C(s) |A|_{s_0} |B|_s + C(s_0) |A|_s |B|_{s_0}. \end{aligned}$$
(2.7)

In particular, the algebra property holds

$$\begin{aligned} |A B |_s \le C(s) |A|_s |B|_s. \end{aligned}$$
(2.8)

If \(A = A(\lambda )\) and \(B = B(\lambda )\) depend in a Lipschitz way on the parameter \(\lambda \in \Lambda _o \subset {\mathbb {R}}\), then

$$\begin{aligned} |A B |_s^{\mathrm{{Lip}(\gamma )}}&\le C(s) |A|_s^{\mathrm{{Lip}(\gamma )}} |B|_s^{\mathrm{{Lip}(\gamma )}} ,\end{aligned}$$
(2.9)
$$\begin{aligned} |A B|_{s}^{\mathrm{{Lip}(\gamma )}}&\le C(s) |A|_{s}^{\mathrm{{Lip}(\gamma )}} |B|_{s_0}^{\mathrm{{Lip}(\gamma )}} + C(s_0) |A|_{s_0}^{\mathrm{{Lip}(\gamma )}} |B|_{s}^{\mathrm{{Lip}(\gamma )}}. \end{aligned}$$
(2.10)

For all \(n \ge 1\), using (2.8) with \( s = s_0 \), we get

$$\begin{aligned} | A^n |_{s_0} \le [C(s_0)]^{n-1} | A |_{s_0}^n \quad \text {and} \quad | A^n |_{s} \le n [ C(s_0) |A|_{s_0} ]^{n-1} C(s) | A |_{s} , \ \forall s \ge s_0.\nonumber \\ \end{aligned}$$
(2.11)

Moreover (2.10) implies that (2.11) also holds for Lipschitz norms \(| \ |_s^\mathrm{{Lip}(\gamma )}\).
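For completeness, the second estimate in (2.11) follows by induction on \( n \): setting \( K := C(s_0) |A|_{s_0} \) and applying (2.7) to \( A^n = A^{n-1} A \),

$$\begin{aligned} | A^n |_{s} \le C(s) | A^{n-1} |_{s_0} | A |_{s} + C(s_0) | A^{n-1} |_{s} | A |_{s_0} \le C(s) K^{n-2} |A|_{s_0} | A |_{s} + (n-1) K^{n-1} C(s) | A |_{s} \le n K^{n-1} C(s) | A |_{s} , \end{aligned}$$

using the first bound in (2.11), the inductive hypothesis and \( C(s_0) \ge 1 \).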

The \( s \)-decay norm controls the Sobolev norm, also for Lipschitz families:

$$\begin{aligned} \begin{aligned} \Vert A h \Vert _s&\le C(s) \left( |A|_{s_0} \Vert h \Vert _s + |A|_{s} \Vert h \Vert _{s_0} \right) \!, \\ \Vert A h \Vert _s^\mathrm{{Lip}(\gamma )}&\le C(s) \left( |A|_{s_0}^\mathrm{{Lip}(\gamma )}\Vert h \Vert _s^\mathrm{{Lip}(\gamma )}|A|_{s}^\mathrm{{Lip}(\gamma )}\Vert h \Vert _{s_0}^\mathrm{{Lip}(\gamma )}\right) . \end{aligned} \end{aligned}$$
(2.12)

Lemma 2.3

Let \( \Phi = I + \Psi \) with \(\Psi := \Psi (\lambda )\), depending in a Lipschitz way on the parameter \(\lambda \in \Lambda _o \subset {\mathbb {R}}\), such that \( C(s_0) | \Psi |_{s_0}^{\mathrm{{Lip}(\gamma )}} \le 1/ 2 \). Then \( \Phi \) is invertible and, for all \( s \ge s_0 > b / 2 \),

$$\begin{aligned} | \Phi ^{-1} - I |_s \le C(s) | \Psi |_s , \quad | \Phi ^{-1} |_{s_0}^{\mathrm{{Lip}(\gamma )}} \le 2 , \quad | \Phi ^{-1} - I |_{s}^{\mathrm{{Lip}(\gamma )}} \le C(s) | \Psi |_{s}^{\mathrm{{Lip}(\gamma )}}.\nonumber \\ \end{aligned}$$
(2.13)

If \( \Phi _i = I + \Psi _i \), \( i = 1,2 \), satisfy \( C(s_0) | \Psi _i |_{s_0}^{\mathrm{{Lip}(\gamma )}} \le 1/ 2 \), then

$$\begin{aligned} \vert \Phi _2^{-1} - \Phi _1^{-1} \vert _{s} \le C(s) \left( \vert \Psi _2 - \Psi _1 \vert _{s} + \left( \vert \Psi _1 \vert _s + \vert \Psi _2 \vert _s \right) \vert \Psi _2 - \Psi _1 \vert _{s_0} \right) .\quad \quad \end{aligned}$$
(2.14)

Proof

Estimates (2.13) follow by Neumann series and (2.11). To prove (2.14), observe that

$$\begin{aligned} \Phi _2^{-1} - \Phi _1^{-1} = \Phi _1^{-1} (\Phi _1 - \Phi _2) \Phi _2^{-1} = \Phi _1^{-1} (\Psi _1 - \Psi _2) \Phi _2^{-1} \end{aligned}$$

and use (2.7), (2.13). \(\square \)
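The two ingredients of the proof — the Neumann series behind (2.13) and the resolvent identity behind (2.14) — can be checked numerically on small matrices (an illustrative sketch: the matrices are random and finite dimensional, so the usual matrix operations replace the \( s \)-decay norm calculus):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6
# small perturbations, the finite-dimensional analogue of C(s_0)|Psi|_{s_0} <= 1/2
Psi1 = 0.05 * rng.standard_normal((n, n))
Psi2 = 0.05 * rng.standard_normal((n, n))
I = np.eye(n)
Phi1, Phi2 = I + Psi1, I + Psi2

# Neumann series: (I + Psi)^{-1} = sum_{k >= 0} (-Psi)^k
neumann = sum(np.linalg.matrix_power(-Psi1, k) for k in range(40))
ok_neumann = np.allclose(neumann, np.linalg.inv(Phi1))

# resolvent identity used to prove (2.14):
# Phi2^{-1} - Phi1^{-1} = Phi1^{-1} (Psi1 - Psi2) Phi2^{-1}
lhs = np.linalg.inv(Phi2) - np.linalg.inv(Phi1)
rhs = np.linalg.inv(Phi1) @ (Psi1 - Psi2) @ np.linalg.inv(Phi2)
ok_resolvent = np.allclose(lhs, rhs)
print(ok_neumann, ok_resolvent)
```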

2.1.1 Töplitz-in-time matrices

Let now \( b := \nu + 1 \) and

$$\begin{aligned} e_i (\varphi , x) := e^{\mathrm{i} (l \cdot \varphi + j x)}, \quad i := (l, j) \in {\mathbb {Z}}^b , \quad l \in {\mathbb {Z}}^\nu , \quad j \in {\mathbb {Z}}. \end{aligned}$$

An important sub-algebra is formed by the matrices which are Töplitz in time, defined by

$$\begin{aligned} A^{(l_2, j_2)}_{(l_1, j_1)} := A^{j_2}_{j_1}(l_1 - l_2 ), \end{aligned}$$
(2.15)

whose decay norm (2.3) is

$$\begin{aligned} |A|_s^2 = \sum _{j \in {\mathbb {Z}}, l \in {\mathbb {Z}}^\nu } \sup _{j_1 - j_2 = j} |A_{j_1}^{j_2}(l)|^2 \langle l,j \rangle ^{2 s}. \end{aligned}$$
(2.16)

These matrices are identified with the \( \varphi \)-dependent family of operators

$$\begin{aligned} A(\varphi ) := \left( A_{j_1}^{j_2} (\varphi )\right) _{j_1, j_2 \in {\mathbb {Z}}} , \quad A_{j_1}^{j_2} (\varphi ) := \sum _{l \in {\mathbb {Z}}^\nu } A_{j_1}^{j_2}(l) e^{\mathrm{i} l \cdot \varphi } \end{aligned}$$
(2.17)

which act on functions of the \(x\)-variable as

$$\begin{aligned} A(\varphi ) : h(x) = \sum _{j \in {\mathbb {Z}}} h_j e^{\mathrm{i} jx} \mapsto A(\varphi ) h(x) = \sum _{j_1, j_2 \in {\mathbb {Z}}} A_{j_1}^{j_2} (\varphi ) h_{j_2} e^{\mathrm{i} j_1 x}. \end{aligned}$$
(2.18)

We still denote by \( | A(\varphi ) |_s \) the \( s \)-decay norm of the matrix in (2.17).

Lemma 2.4

Let \( A \) be a Töplitz matrix as in (2.15), and \({\mathfrak {s}}_0 := (\nu + 2)/2\) (as defined above). Then

$$\begin{aligned} |A(\varphi )|_{s} \le C({\mathfrak {s}}_0) |A|_{s+ \mathfrak s_0} , \quad \forall \varphi \in {\mathbb {T}}^\nu . \end{aligned}$$

Proof

For all \( \varphi \in {\mathbb {T}}^\nu \) we have

$$\begin{aligned} |A(\varphi )|_{s}^2 \!&:= \! \sum _{j \in {\mathbb {Z}}} \langle j \rangle ^{2 s} \sup _{j_1 - j_2 = j} |A_{j_1}^{j_2}(\varphi )|^2 \lessdot \sum _{j \in {\mathbb {Z}}} \langle j \rangle ^{2 s} \sup _{j_1 - j_2 = j} \sum _{l \in {\mathbb {Z}}^\nu } |A_{j_1}^{j_2}(l)|^2 \langle l \rangle ^{2 {{\mathfrak {s}}}_0} \\ \!&\lessdot \! \sum _{j \in {\mathbb {Z}}} \sup _{j_1 - j_2 = j} \sum _{l \in {\mathbb {Z}}^\nu } |A_{j_1}^{j_2}(l)|^2 \langle l,j \rangle ^{2 (s + {\mathfrak s}_0)} \lessdot \sum _{j \in {\mathbb {Z}}, l \in {\mathbb {Z}}^\nu } \sup _{j_1 - j_2 = j} |A_{j_1}^{j_2}(l)|^2 \langle l,j \rangle ^{2 (s + {\mathfrak s}_0)}\\ \!&\mathop {=}\limits ^{(2.16)} \! |A|_{s + {{\mathfrak {s}}}_0}^2 , \end{aligned}$$

whence the lemma follows. \(\square \)

Given \( N \in {\mathbb {N}}\), we define the smoothing operator \(\Pi _N\) as

$$\begin{aligned} \left( \Pi _N A \right) ^{(l_2, j_2)}_{(l_1, j_1)} := {\left\{ \begin{array}{ll} A^{(l_2, j_2)}_{(l_1, j_1)} &{}\quad \mathrm{if} \ | l_1 - l_2| \le N \\ 0 &{}\quad \mathrm{otherwise.} \end{array}\right. } \end{aligned}$$
(2.19)

Lemma 2.5

The operator \( \Pi _N^\bot := I - \Pi _N \) satisfies

$$\begin{aligned} | \Pi _N^\bot A |_{s} \le N^{- \beta } | A |_{s+\beta } , \quad | \Pi _N^\bot A |_{s}^{\mathrm{{Lip}(\gamma )}} \le N^{- \beta } | A |_{s+\beta }^{\mathrm{{Lip}(\gamma )}} , \quad \beta \ge 0, \end{aligned}$$
(2.20)

where in the second inequality \(A := A(\lambda )\) is a Lipschitz family \(\lambda \in \Lambda \).
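The mechanism behind (2.20) — each discarded off-diagonal \( |i| > N \) picks up a factor \( \langle i \rangle ^{-\beta } < N^{-\beta } \) — can be checked on a made-up example (an illustrative sketch with \( b = 1 \), where the truncation (2.19) reduces to truncation in the single index, and \( \langle i \rangle := \max (1, |i|) \) is our assumed convention):

```python
import numpy as np

def decay_norm(A, idx, s):
    """s-decay norm (2.3) for b = 1, with <i> := max(1, |i|)."""
    total = 0.0
    for i in range(-(len(idx) - 1), len(idx)):
        sup = max(abs(A[r, c]) for r, i1 in enumerate(idx)
                  for c, i2 in enumerate(idx) if i1 - i2 == i)
        total += max(1, abs(i)) ** (2 * s) * sup ** 2
    return total ** 0.5

idx = list(range(-8, 9))
rng = np.random.default_rng(2)
A = rng.standard_normal((17, 17))
N, s, beta = 3, 1, 2
# Pi_N^perp = I - Pi_N keeps only the off-diagonals with |i1 - i2| > N
Aperp = np.array([[A[r, c] if abs(i1 - i2) > N else 0.0
                   for c, i2 in enumerate(idx)] for r, i1 in enumerate(idx)])
# smoothing estimate (2.20): |Pi_N^perp A|_s <= N^{-beta} |A|_{s+beta}
ok = decay_norm(Aperp, idx, s) <= N ** (-beta) * decay_norm(A, idx, s + beta)
print(ok)
```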

2.2 Dynamical reducibility

All the transformations that we construct in Sects. 3 and 4 act on functions \( u(\varphi , x ) \) (of time and space). They can also be seen as:

(a):

transformations of the phase space \(H^s_x\) that depend quasi-periodically on time (Sects. 3.1, 3.33.5 and 4);

(b):

quasi-periodic reparametrizations of time (Sect. 3.2).

This observation allows us to interpret the conjugacy procedure from a dynamical point of view.

Consider a quasi-periodic linear dynamical system

$$\begin{aligned} \partial _t u = L(\omega t) u. \end{aligned}$$
(2.21)

We want to describe how (2.21) changes under the action of a transformation of type \((a)\) or \((b)\).

Let \(A(\omega t)\) be of type \((a)\), and let \(u = A(\omega t)v\). Then (2.21) is transformed into the linear system

$$\begin{aligned} \partial _{t} v = L_+(\omega t)v \quad \mathrm{where} \quad L_{+}(\omega t) = A(\omega t)^{-1} L(\omega t) A(\omega t) - A(\omega t)^{-1} \partial _t [A(\omega t)].\nonumber \\ \end{aligned}$$
(2.22)
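The formula for \( L_+ \) in (2.22) is simply the chain rule: differentiating \( u = A(\omega t) v \) in time and using (2.21),

$$\begin{aligned} L(\omega t) A(\omega t) v = \partial _t u = \partial _t [A(\omega t)] \, v + A(\omega t) \, \partial _t v , \end{aligned}$$

and solving for \( \partial _t v \) gives (2.22).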

The transformation \(A(\omega t)\) may be regarded as acting on functions \( u(\varphi , x) \) as

$$\begin{aligned} ({\tilde{A}} u)(\varphi ,x) := \left( A(\varphi )u(\varphi , \cdot )\right) (x) := A(\varphi )u(\varphi , x) \end{aligned}$$
(2.23)

and one can check that \( ({\tilde{A}}^{-1} u)(\varphi ,x) = A^{-1}(\varphi ) u(\varphi , x) \). The operator associated to (2.21) (on quasi-periodic functions)

$$\begin{aligned} \mathcal{L} := \omega \cdot \partial _\varphi - L(\varphi ) \end{aligned}$$
(2.24)

transforms under the action of \( {\tilde{A}} \) into

$$\begin{aligned} {\tilde{A}}^{-1} \mathcal{L} {\tilde{A}} = \omega \cdot \partial _\varphi - L_+(\varphi ), \end{aligned}$$

which is exactly the linear system in (2.22), acting on quasi-periodic functions.

Now consider a transformation of type \((b)\), namely a change of the time variable

$$\begin{aligned}&\tau := t + \alpha (\omega t) \ \Leftrightarrow \ t = \tau + \tilde{\alpha }(\omega \tau ); \nonumber \\&\quad (Bv)(t) := v(t + \alpha (\omega t)),\quad (B^{-1} u)(\tau ) = u(\tau + \tilde{\alpha }(\omega \tau )), \end{aligned}$$
(2.25)

where \(\alpha = \alpha (\varphi )\), \(\varphi \in {\mathbb {T}}^\nu \), is a \(2\pi \)-periodic function of \(\nu \) variables (in other words, \( t \mapsto t + \alpha (\omega t) \) is the diffeomorphism of \({\mathbb {R}}\) induced by the transformation \(B\)). If \( u(t) \) is a solution of (2.21), then \( v(\tau ) \), defined by \( u = Bv\), solves

$$\begin{aligned} \partial _\tau v(\tau ) = L_+ (\omega \tau ) v (\tau ) , \quad L_+ (\omega \tau ) := \left( \frac{L(\omega t)}{1+ ({\omega \cdot \partial _{\varphi }}\alpha ) (\omega t)} \right) _{|t= \tau + \tilde{\alpha }(\omega \tau )}.\quad \quad \end{aligned}$$
(2.26)
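Formula (2.26) is again the chain rule: writing \( u(t) = v(\tau (t)) \) with \( \tau (t) = t + \alpha (\omega t) \), equation (2.21) gives

$$\begin{aligned} L(\omega t) \, v(\tau (t)) = \partial _t u (t) = \big ( 1 + ({\omega \cdot \partial _{\varphi }}\alpha )(\omega t) \big ) \, (\partial _\tau v)(\tau (t)) , \end{aligned}$$

and dividing by \( 1 + ({\omega \cdot \partial _{\varphi }}\alpha )(\omega t) \) and evaluating at \( t = \tau + \tilde{\alpha }(\omega \tau ) \) gives (2.26).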

We may regard the associated transformation on quasi-periodic functions defined by

$$\begin{aligned} (\tilde{B} h)(\varphi ,x) := h( \varphi + \omega \alpha (\varphi ), x) , \quad (\tilde{B}^{-1} h)(\varphi ,x) := h( \varphi + \omega \tilde{\alpha }(\varphi ), x) , \end{aligned}$$

as in Sect. 3.2, where we calculate

$$\begin{aligned} \tilde{B}^{-1} \mathcal{L} \tilde{B} = \rho (\varphi ) \mathcal{L}_+ , \quad \rho (\varphi ) := \tilde{B}^{-1} (1+ \omega \cdot \partial _\varphi \alpha ) , \end{aligned}$$
$$\begin{aligned} \mathcal{L}_+ = \omega \cdot \partial _\varphi - L_+(\varphi ) , \ \ L_+(\varphi ) := \frac{1}{\rho (\varphi )} L(\varphi + \omega {\tilde{\alpha }}(\varphi )). \end{aligned}$$
(2.27)

Equation (2.27) is nothing but the linear system (2.26), acting on quasi-periodic functions.

2.3 Real, reversible and Hamiltonian operators

We consider the space of real functions

$$\begin{aligned} Z := \{ u(\varphi ,x) = \overline{u(\varphi ,x)} \}, \end{aligned}$$
(2.28)

and of even (in space-time), respectively odd, functions

$$\begin{aligned} X := \{ u(\varphi ,x) = u(-\varphi ,-x) \} , \quad Y := \{ u(\varphi ,x) = -u(-\varphi ,-x) \}. \end{aligned}$$
(2.29)

Definition 2.2

An operator \( R \) is

  1.

    real if \( R : Z \rightarrow Z \)

  2.

    reversible if \( R : X \rightarrow Y \)

  3.

    reversibility-preserving if \( R : X \rightarrow X \), \( R : Y \rightarrow Y \).

The composition of a reversible and a reversibility-preserving operator is reversible.

The above properties may be characterized in terms of matrix elements.

Lemma 2.6

We have

$$\begin{aligned}&R : X \rightarrow Y \Longleftrightarrow R^{-j}_{-k}(-l) = - R^j_{k}(l) , \quad R : X \rightarrow X \Longleftrightarrow R^{-j}_{-k}(-l) = R^j_k (l) ,\\&R : Z \rightarrow Z \Longleftrightarrow \overline{R^j_{k}(l)} = R^{-j}_{-k}(-l). \end{aligned}$$

For the Hamiltonian equation (1.10) the phase space is \( H^1_0 := \{ u \in H^1 ({\mathbb {T}}) \, : \, \int _{{\mathbb {T}}} u(x) dx = 0 \} \).

Definition 2.3

A time dependent linear vector field \( X(t) : H_0^1 \rightarrow H_0^1\) is Hamiltonian if \( X(t) = \partial _x G(t) \) for some real linear operator \( G(t) \) which is self-adjoint with respect to the \( L^2 \) scalar product.

If \( G(t) = G(\omega t)\) is quasi-periodic in time, we say that the associated operator \( \omega \cdot \partial _{\varphi } - \partial _x G( \varphi ) \) (see (2.24)) is Hamiltonian.

Definition 2.4

A map \( A : H_0^1 \rightarrow H_0^1\) is symplectic if

$$\begin{aligned} \Omega (A u, A v) = \Omega (u, v) , \quad \forall u,v \in H_0^1 , \end{aligned}$$
(2.30)

where the symplectic 2-form \( \Omega \) is defined in (1.12). Equivalently \( A^T \partial _x^{-1} A = \partial _x^{-1} \).

If \( A (\varphi ) \), \( \forall \varphi \in {\mathbb {T}}^\nu \), is a family of symplectic maps we say that the corresponding operator in (2.23) is symplectic.

Under a time dependent family of symplectic transformations \( u = \Phi (t) v \) the linear Hamiltonian equation

$$\begin{aligned} u_t = \partial _x G(t) u \quad \mathrm{with \ Hamiltonian} \quad H(t, u) := \tfrac{1}{2} \, \left( G(t)u ,u \right) _{L^2} \end{aligned}$$

transforms into the equation

$$\begin{aligned} v_t = \partial _x E(t) v, \quad E(t) := \Phi (t)^T G(t) \Phi (t) - \Phi (t)^T \partial _x^{-1} \Phi _t(t) \end{aligned}$$

with Hamiltonian

$$\begin{aligned} K(t,v) = \tfrac{1}{2}\, \left( G(t) \Phi (t) v , \Phi (t) v \right) _{L^2} - \tfrac{1}{2}\, \left( \partial _x^{-1} \Phi _t(t)v, \Phi (t) v \right) _{L^2} . \end{aligned}$$
(2.31)

Note that \(E(t)\) is self-adjoint with respect to the \(L^2\) scalar product because \(\Phi ^T \partial _x^{-1} \Phi _t + \Phi _t^T \partial _x^{-1} \Phi = 0\).
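Indeed, the displayed identity is obtained by differentiating the symplectic condition \( \Phi (t)^T \partial _x^{-1} \Phi (t) = \partial _x^{-1} \) (Definition 2.4) with respect to \( t \); then, since \( (\partial _x^{-1})^T = - \partial _x^{-1} \) and \( G(t) \) is self-adjoint,

$$\begin{aligned} E(t)^T = \Phi ^T G \Phi - \big ( \Phi ^T \partial _x^{-1} \Phi _t \big )^T = \Phi ^T G \Phi + \Phi _t^T \partial _x^{-1} \Phi = \Phi ^T G \Phi - \Phi ^T \partial _x^{-1} \Phi _t = E(t). \end{aligned}$$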

3 Regularization of the linearized operator

Our existence proof is based on a Nash–Moser iterative scheme. The main step concerns the invertibility of the linearized operator (see (1.16))

$$\begin{aligned} {\mathcal {L}}h = {\mathcal {L}}(\lambda ,u,\varepsilon ) h := {\omega \cdot \partial _{\varphi }}h + (1 + a_3) \partial _{xxx} h + a_2 \partial _{xx} h + a_1 \partial _{x} h + a_0 h\quad \quad \end{aligned}$$
(3.1)

obtained linearizing (1.4) at any approximate (or exact) solution \( u \). The coefficients \(a_i = a_i(\varphi ,x) = a_i(u,\varepsilon )(\varphi ,x)\) are periodic functions of \((\varphi ,x)\), depending on \(u,\varepsilon \). They are explicitly obtained from the partial derivatives of \(\varepsilon f(\varphi ,x,z)\) as

$$\begin{aligned} a_i(\varphi ,x) \!=\! \varepsilon (\partial _{z_i} f)\left( \varphi , x, u(\varphi ,x), u_x(\varphi ,x), u_{xx}(\varphi ,x), u_{xxx}(\varphi ,x) \right) , \quad \! i\!=\!0,1,2,3.\nonumber \\ \end{aligned}$$
(3.2)

The operator \({\mathcal {L}}\) depends on \(\lambda \) because \(\omega = \lambda \bar{\omega }\). Since \(\varepsilon \) is a (small) fixed parameter, we simply write \({\mathcal {L}}(\lambda ,u)\) instead of \({\mathcal {L}}(\lambda ,u,\varepsilon )\), and \(a_i(u)\) instead of \(a_i(u,\varepsilon )\). We emphasize that the coefficients \(a_i\) do not depend explicitly on the parameter \(\lambda \) (they depend on \( \lambda \) only through \( u(\lambda ) \)).

In the Hamiltonian case (1.11) the linearized operator (3.1) has the form

$$\begin{aligned} \mathcal{L}h = {\omega \cdot \partial _{\varphi }}h + \partial _{x} \left( \partial _x \big \{ A_1 (\varphi ,x) \partial _x h \big \} - A_0 (\varphi ,x) h \right) \end{aligned}$$

where

$$\begin{aligned} A_1 (\varphi ,x)&:= 1 + \varepsilon (\partial _{z_1 z_1} F) (\varphi ,x,u,u_x) ,\\ A_0 (\varphi ,x)&:= - \varepsilon \partial _x \{ (\partial _{z_0 z_1} F)(\varphi ,x,u,u_x) \}+ \varepsilon (\partial _{z_0 z_0} F) (\varphi ,x,u,u_x) \end{aligned}$$

and it is generated by the quadratic Hamiltonian

$$\begin{aligned} H_L(\varphi , h) := \frac{1}{2} \int _{{\mathbb {T}}} \left( A_0 (\varphi , x) h^2 + A_1 (\varphi , x) h_x^2 \right) \, dx , \quad h \in H^1_0. \end{aligned}$$

Remark 3.1

In the reversible case, i.e. when the nonlinearity \( f\) satisfies (1.13) and \( u \in X \) (see (2.29), (1.14)), the coefficients \( a_i \) satisfy the parity conditions

$$\begin{aligned} a_3, a_1 \in X, \quad a_2, a_0 \in Y, \end{aligned}$$
(3.3)

and \({\mathcal {L}}\) maps \(X\) into \(Y\), namely \({\mathcal {L}}\) is reversible, see Definition 2.2.

Remark 3.2

In the Hamiltonian case (1.11), assumption (Q)-(1.7) is automatically satisfied (with \( \alpha (\varphi ) = 2 \)) because

$$\begin{aligned} f(\varphi ,x,u,u_x, u_{xx}, u_{xxx})&= a(\varphi , x, u, u_x) + b(\varphi , x, u, u_x) u_{xx} + c(\varphi , x, u, u_x) u_{xx}^2\\&\,+ \,d(\varphi , x, u, u_x) u_{xxx} \end{aligned}$$

where

$$\begin{aligned} b = 2 (\partial _{z_1 z_1 x}^3 F) + 2 z_1 (\partial _{z_1 z_1 z_0}^3 F), \qquad c = \partial _{z_1}^3 F, \qquad d = \partial _{z_1}^2 F, \end{aligned}$$

and so

$$\begin{aligned} \partial _{z_2} f&= b + 2 z_2 c = 2(d_x + z_1 d_{z_0} + z_2 d_{z_1})\\&= 2 \left( \partial ^2_{z_3 x} f + z_1 \partial ^2_{z_3 z_0} f + z_2 \partial ^2_{z_3 z_1} f + z_3 \partial ^2_{z_3 z_2} f \right) . \end{aligned}$$

The coefficients \(a_i\), together with their derivative \(\partial _u a_i(u)[h]\) with respect to \(u\) in the direction \(h\), satisfy tame estimates:

Lemma 3.1

Let \( f \in C^q \), see (1.3). For all \( {\mathfrak {s}}_{0} \le s \le q - 2 \), \( \Vert u \Vert _{{\mathfrak {s}}_0 + 3} \le 1 \), we have, for all \(i = 0,1,2,3\),

$$\begin{aligned} \Vert a_i(u) \Vert _s&\le \varepsilon \, C(s) \left( 1 + \Vert u \Vert _{s+3} \right) \!,\end{aligned}$$
(3.4)
$$\begin{aligned} \Vert \partial _{u} a_i(u)[h] \Vert _{s}&\le \varepsilon \, C(s) \left( \Vert h \Vert _{s+3} + \Vert u \Vert _{s+3} \Vert h \Vert _{{\mathfrak {s}}_0+3} \right) \!. \end{aligned}$$
(3.5)

If, moreover, \( \lambda \mapsto u(\lambda ) \in H^s \) is a Lipschitz family satisfying \( \Vert u \Vert _{{\mathfrak {s}}_0 + 3}^{\mathrm{{Lip}(\gamma )}} \le 1 \) (see (2.2)), then

$$\begin{aligned} \Vert a_i \Vert _{s}^{\mathrm{{Lip}(\gamma )}} \le \varepsilon \, C(s) \left( 1 + \Vert u \Vert _{s+3}^{\mathrm{{Lip}(\gamma )}} \right) . \end{aligned}$$
(3.6)

Proof

The tame estimate (3.4) follows by Lemma 6.2\((i)\) applied to the function \(\partial _{z_i}f\), \(i=0,\ldots ,3 \), which is valid for \(s+1 \le q\). The tame bound (3.5) for

$$\begin{aligned} \partial _u a_i(u) [h] \mathop {=}\limits ^{(3.2)} \varepsilon \sum _{k=0}^3 (\partial ^2_{z_k z_i} f)\left( \varphi , x, u, u_x, u_{xx}, u_{xxx} \right) \, \partial _x^k h, \quad i = 0, \ldots , 3, \end{aligned}$$

follows by (6.5) and applying Lemma 6.2\((i)\) to the functions \(\partial ^2_{z_k z_i}f\), which gives

$$\begin{aligned} \Vert (\partial ^2_{z_k z_i} f)\left( \varphi , x, u, u_x, u_{xx}, u_{xxx} \right) \Vert _s \le C(s) \Vert f \Vert _{C^{s+2}} (1 + \Vert u \Vert _{s+3}), \end{aligned}$$

for \(s+2 \le q\). The Lipschitz bound (3.6) follows similarly. \(\square \)

3.1 Step 1. Change of the space variable

We consider a \( \varphi \)-dependent family of diffeomorphisms of the \( 1 \)-dimensional torus \( {\mathbb {T}}\) of the form

$$\begin{aligned} y = x + \beta (\varphi ,x), \end{aligned}$$
(3.7)

where \( \beta \) is a (small) real-valued function, \(2\pi \)-periodic in all its arguments. The change of variables (3.7) induces on the space of functions the linear operator

$$\begin{aligned} (\mathcal{A}h)(\varphi ,x):= h(\varphi , x + \beta (\varphi , x)). \end{aligned}$$
(3.8)

The operator \( \mathcal{A} \) is invertible, with inverse

$$\begin{aligned} (\mathcal{A}^{-1} v)(\varphi ,y) = v(\varphi , y + {\tilde{\beta }}(\varphi ,y) ), \end{aligned}$$
(3.9)

where \( y \mapsto y + {\tilde{\beta }}(\varphi ,y) \) is the inverse diffeomorphism of (3.7), namely

$$\begin{aligned} x = y + {\tilde{\beta }}(\varphi ,y) \Longleftrightarrow y = x + \beta (\varphi ,x). \end{aligned}$$
(3.10)

Remark 3.3

In the Hamiltonian case (1.11) we use, instead of (3.8), the modified change of variable (1.25) which is symplectic, for each \( \varphi \in {\mathbb {T}}^\nu \). Indeed, setting \( U := \partial _x^{-1} u \) (and neglecting to write the \( \varphi \)-dependence)

$$\begin{aligned} \Omega (\mathcal{A}u, \mathcal{A}v)&= \int _{{\mathbb {T}}} \partial _{x}^{-1} \left( \partial _x \big \{ U(x+ \beta (x) ) \big \} \right) \, (1+ \beta _x (x) ) v(x+ \beta (x) ) \, dx\\&= \int _{{\mathbb {T}}} U(x+ \beta (x)) (1+ \beta _x (x) ) v(x+ \beta (x) ) dx\\&\quad - c \int _{{\mathbb {T}}} (1+ \beta _x (x) ) v(x+ \beta (x) ) dx\\&= \int _{{\mathbb {T}}} U(y) v(y ) dy = \Omega (u,v) , \quad v \in H^1_0, \end{aligned}$$

where \( c \) is the average of \( U(x+ \beta (x) ) \) in \( {\mathbb {T}}\). The inverse operator of (1.25) is \( (\mathcal{A}^{-1} v) (\varphi , y) = (1+ {\tilde{\beta }}_y (\varphi , y)) v( y + \tilde{\beta }(\varphi , y)) \) which is also symplectic.

Now we calculate the conjugate \( \mathcal{A}^{-1} \mathcal{L} \mathcal{A} \) of the linearized operator \({\mathcal {L}}\) in (3.1) with \( \mathcal{A} \) in (3.8).

The conjugate \( \mathcal{A} ^{-1}a \mathcal{A} \) of any multiplication operator \(a : h(\varphi ,x) \mapsto a(\varphi ,x) h(\varphi ,x)\) is the multiplication operator \(( \mathcal{A} ^{-1}a)\) that maps \(v(\varphi ,y) \mapsto ( \mathcal{A} ^{-1}a)(\varphi ,y) \, v(\varphi ,y)\). By conjugation, the differential operators become

$$\begin{aligned} \mathcal{A} ^{-1}{\omega \cdot \partial _{\varphi }}\mathcal{A}&= {\omega \cdot \partial _{\varphi }}+ \{ \mathcal{A} ^{-1}({\omega \cdot \partial _{\varphi }}\beta ) \} \, \partial _y,\\ \mathcal{A} ^{-1}\partial _x \mathcal{A}&= \{ \mathcal{A} ^{-1}(1 + \beta _x) \} \, \partial _y,\\ \mathcal{A} ^{-1}\partial _{xx} \mathcal{A}&= \{ \mathcal{A} ^{-1}(1+\beta _x)^2 \} \, \partial _{yy} + \{ \mathcal{A} ^{-1}(\beta _{xx}) \} \, \partial _y,\\ \mathcal{A} ^{-1}\partial _{xxx} \mathcal{A}&= \{ \mathcal{A} ^{-1}(1+\beta _x)^3 \} \, \partial _{yyy} + \{ 3 \mathcal{A} ^{-1}[ (1+\beta _x) \beta _{xx}] \} \, \partial _{yy} \!+\! \{ \mathcal{A} ^{-1}(\beta _{xxx}) \} \, \partial _y, \end{aligned}$$

where all the coefficients \(\{ \mathcal{A} ^{-1}(\ldots ) \}\) are periodic functions of \((\varphi ,y)\). Thus (recall (3.1))

$$\begin{aligned} {\mathcal {L}}_1 := \mathcal{A}^{-1} {\mathcal {L}}\mathcal{A} = {\omega \cdot \partial _{\varphi }}+ b_3(\varphi ,y) \partial _{yyy} + b_2(\varphi ,y) \partial _{yy} + b_1(\varphi ,y) \partial _{y} + b_0(\varphi ,y)\nonumber \\ \end{aligned}$$
(3.11)

where

$$\begin{aligned} \begin{aligned} b_3&= \mathcal{A} ^{-1}[(1+a_3) (1+\beta _x)^3], \\ b_1&= \mathcal{A} ^{-1}[{\omega \cdot \partial _{\varphi }}\beta + (1+a_3) \beta _{xxx}+ a_2 \beta _{xx} + a_1 (1+\beta _x)], \end{aligned} \end{aligned}$$
(3.12)
$$\begin{aligned} b_0 = \mathcal{A} ^{-1}(a_0), \qquad b_2 = \mathcal{A} ^{-1}[(1+a_3) 3 (1+\beta _x) \beta _{xx} + a_2 (1+\beta _x)^2].\quad \quad \end{aligned}$$
(3.13)

We look for \(\beta (\varphi ,x)\) such that the coefficient \(b_3(\varphi ,y)\) of the highest order derivative \(\partial _{yyy}\) in (3.11) does not depend on \(y\), namely

$$\begin{aligned} b_3(\varphi ,y) \mathop {=}\limits ^{(3.12)} \mathcal{A}^{-1} [(1+a_3) (1+\beta _x)^3] (\varphi ,y) = b(\varphi ) \end{aligned}$$
(3.14)

for some function \(b(\varphi )\) of \(\varphi \) only. Since \(\mathcal{A}\) changes only the space variable, \(\mathcal{A}b = b\) for every function \(b(\varphi )\) that is independent of \(y\). Hence (3.14) is equivalent to

$$\begin{aligned} \left( 1 + a_3(\varphi ,x) \right) \left( 1 + \beta _x(\varphi ,x) \right) ^3 = b(\varphi ), \end{aligned}$$
(3.15)

namely

$$\begin{aligned} \beta _{x} = \rho _0, \qquad \rho _0(\varphi ,x) := b(\varphi )^{1/3} \left( 1 + a_3(\varphi ,x) \right) ^{-1/3} - 1. \end{aligned}$$
(3.16)

Equation (3.16) has a solution \(\beta \), periodic in \( x \), if and only if \( \int _{{\mathbb {T}}}{\rho _0(\varphi ,x) \, dx} = 0 \). This condition uniquely determines

$$\begin{aligned} b(\varphi ) = \left( \frac{1}{2\pi }\int _{{\mathbb {T}}} \left( 1 + a_3(\varphi ,x) \right) ^{-\frac{1}{3}} \, dx \right) ^{-3}. \end{aligned}$$
(3.17)

Then we fix the solution (with zero average) of (3.16),

$$\begin{aligned} \beta (\varphi ,x) := \, (\partial _x^{-1}\rho _0)(\varphi ,x) , \end{aligned}$$
(3.18)

where \( \partial _x^{-1} \) is defined by linearity as

$$\begin{aligned} \partial _x^{-1} e^{\mathrm{i} j x} := \frac{ e^{\mathrm{i} j x} }{\mathrm{i} j}\, \quad \forall j \in {\mathbb {Z}}{\setminus } \{ 0 \}, \qquad \partial _x^{-1} 1 = 0. \end{aligned}$$
(3.19)

In other words, \(\partial _x^{-1}h\) is the primitive of \(h\) with zero average in \(x \).
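The whole construction (3.15)–(3.19) can be checked numerically at a frozen \( \varphi \) (an illustrative sketch: the coefficient \( a_3 \) below is a made-up small example, and the FFT implementation of \( \partial _x^{-1} \) is our own, following (3.19)):

```python
import numpy as np

# With b chosen as in (3.17), rho_0 in (3.16) has zero x-average, and
# beta = dx^{-1} rho_0 makes (1 + a_3)(1 + beta_x)^3 constant in x, i.e. (3.15).
n = 256
x = 2 * np.pi * np.arange(n) / n
a3 = 0.1 * np.cos(x) + 0.05 * np.sin(2 * x)       # made-up coefficient

b = (np.mean((1 + a3) ** (-1 / 3))) ** (-3)       # (3.17), mean = integral/(2 pi)
rho0 = b ** (1 / 3) * (1 + a3) ** (-1 / 3) - 1    # (3.16)
ok_avg = abs(np.mean(rho0)) < 1e-12               # solvability condition

k = np.fft.fftfreq(n, d=1 / n)                    # integer Fourier frequencies
rho0_hat = np.fft.fft(rho0)
beta_hat = np.zeros_like(rho0_hat)
beta_hat[k != 0] = rho0_hat[k != 0] / (1j * k[k != 0])   # dx^{-1}, see (3.19)
beta = np.real(np.fft.ifft(beta_hat))
beta_x = np.real(np.fft.ifft(1j * k * np.fft.fft(beta)))
b3 = (1 + a3) * (1 + beta_x) ** 3                 # should equal b identically
ok_const = np.max(np.abs(b3 - b)) < 1e-10
print(ok_avg, ok_const)
```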

With this choice of \( \beta \), we get (see (3.11), (3.14))

$$\begin{aligned} {\mathcal {L}}_1 = \mathcal{A}^{-1}{\mathcal {L}}\mathcal{A} = {\omega \cdot \partial _{\varphi }}+ b_3(\varphi ) \partial _{yyy} + b_2(\varphi ,y) \partial _{yy} + b_1(\varphi ,y) \partial _y + b_0(\varphi ,y),\nonumber \\ \end{aligned}$$
(3.20)

where \( b_3(\varphi ) := b(\varphi ) \) is defined in (3.17).

Remark 3.4

In the reversible case, \( \beta \in Y \) because \(a_3 \in X\), see (3.3). Therefore the operator \(\mathcal{A} \) in (3.8), as well as \( \mathcal{A}^{-1} \) in (3.9), maps \( X \rightarrow X \) and \( Y \rightarrow Y \), namely it is reversibility-preserving, see Definition 2.2. By (3.3) the coefficients of \({\mathcal {L}}_1\) (see (3.12), (3.13)) have parity

$$\begin{aligned} b_3, b_1 \in X, \qquad b_2, b_0 \in Y, \end{aligned}$$
(3.21)

and \({\mathcal {L}}_1\) maps \(X \rightarrow Y\), namely it is reversible.

Remark 3.5

In the Hamiltonian case (1.11) the resulting operator \( \mathcal{L}_1 \) in (3.20) is Hamiltonian and \( b_2 (\varphi , y) = 2 \partial _y b_3 (\varphi ) \equiv 0 \). Actually, by (2.31), the corresponding Hamiltonian has the form

$$\begin{aligned} K(\varphi , v) = \frac{1}{2} \int _{{\mathbb {T}}} b_3(\varphi ) v_y^2 + B_0 (\varphi ,y) v^2\, dy , \end{aligned}$$
(3.22)

for some function \( B_0 (\varphi ,y) \).

3.2 Step 2. Time reparametrization

The goal of this section is to make the coefficient of the highest order spatial derivative \(\partial _{yyy}\) of \( \mathcal{L}_1 \) in (3.20) constant, by means of a quasi-periodic reparametrization of time. We consider a diffeomorphism of the torus \( {\mathbb {T}}^{\nu } \) of the form

$$\begin{aligned} \varphi \mapsto \varphi + \omega \alpha (\varphi ), \quad \varphi \in {\mathbb {T}}^{\nu }, \quad \alpha (\varphi ) \in {\mathbb {R}}, \end{aligned}$$
(3.23)

where \( \alpha \) is a (small) real valued function, \( 2\pi \)-periodic in all its arguments. The induced linear operator on the space of functions is

$$\begin{aligned} (Bh)(\varphi ,y):= h \left( \varphi + \omega \alpha (\varphi ), \,y \right) \end{aligned}$$
(3.24)

whose inverse is

$$\begin{aligned} (B^{-1} v)(\vartheta ,y):= v \left( \vartheta + \omega {\tilde{\alpha }}(\vartheta ), \,y \right) \end{aligned}$$
(3.25)

where \( \varphi = \vartheta + \omega {\tilde{\alpha }}(\vartheta ) \) is the inverse diffeomorphism of \( \vartheta = \varphi + \omega \alpha (\varphi ) \). By conjugation, the differential operators become

$$\begin{aligned} B^{-1}\omega \cdot \partial _{\varphi } B = \rho (\vartheta )\, {\omega \cdot \partial _{\vartheta }}, \quad B^{-1}\partial _y B = \partial _y, \quad \rho := B^{-1}(1 + {\omega \cdot \partial _{\varphi }}\alpha ).\quad \quad \quad \end{aligned}$$
(3.26)

Thus, see (3.20),

$$\begin{aligned} B^{-1}{\mathcal {L}}_1 B = \rho \, {\omega \cdot \partial _{\vartheta }}+ \{ B^{-1}b_3 \} \, \partial _{yyy} + \{ B^{-1}b_2 \} \, \partial _{yy} + \{ B^{-1}b_1 \} \, \partial _{y} + \{ B^{-1}b_0 \}.\nonumber \\ \end{aligned}$$
(3.27)

We look for \(\alpha (\varphi )\) such that the (variable) coefficients of the highest order derivatives (\({\omega \cdot \partial _{\vartheta }}\) and \(\partial _{yyy}\)) are proportional, namely

$$\begin{aligned} \{ B^{-1}b_3\}(\vartheta ) = m_3 \rho (\vartheta ) = m_3 \{ B^{-1}(1 + {\omega \cdot \partial _{\varphi }}\alpha )\}(\vartheta ) \end{aligned}$$
(3.28)

for some constant \(m_3 \in {\mathbb {R}}\). Since \( B \) is invertible, this is equivalent to requiring that

$$\begin{aligned} b_3(\varphi ) = m_3 \left( 1 + {\omega \cdot \partial _{\varphi }}\alpha (\varphi ) \right) . \end{aligned}$$
(3.29)

Integrating on \({\mathbb {T}}^\nu \) determines the value of the constant \(m_3\),

$$\begin{aligned} m_3 := \frac{1}{(2\pi )^\nu } \, \int _{{\mathbb {T}}^\nu } b_3(\varphi ) \, d\varphi . \end{aligned}$$
(3.30)

Thus we choose the unique solution of (3.29) with zero average

$$\begin{aligned} \alpha (\varphi ) := \frac{1}{m_3} \, ({\omega \cdot \partial _{\varphi }})^{-1}(b_3 - m_3)(\varphi ) \end{aligned}$$
(3.31)

where \( ({\omega \cdot \partial _{\varphi }})^{-1}\) is defined by linearity

$$\begin{aligned} ({\omega \cdot \partial _{\varphi }})^{-1}e^{\mathrm{i} l \cdot \varphi } := \frac{e^{\mathrm{i} l \cdot \varphi }}{\mathrm{i} \omega \cdot l} , \ l \ne 0 , \quad ({\omega \cdot \partial _{\varphi }})^{-1}1 = 0. \end{aligned}$$

With this choice of \( \alpha \) we get (see (3.27), (3.28))

$$\begin{aligned} B^{-1}{\mathcal {L}}_1 B = \rho \, {\mathcal {L}}_2, \quad \! {\mathcal {L}}_2 := {\omega \cdot \partial _{\vartheta }}+ m_3 \, \partial _{yyy} + c_2(\vartheta ,y) \, \partial _{yy} \!+\! c_1(\vartheta ,y) \, \partial _{y} + c_0(\vartheta ,y),\nonumber \\ \end{aligned}$$
(3.32)

where

$$\begin{aligned} c_i := \frac{B^{-1}b_i}{\rho }, \quad i = 0,1,2. \end{aligned}$$
(3.33)

Remark 3.6

In the reversible case, \(\alpha \) is odd because \(b_3\) is even (see (3.21)), and \( B \) is reversibility preserving. Since \(\rho \) (defined in (3.26)) is even, the coefficients satisfy \( c_1 \in X\), \( c_2, c_0 \in Y \), and \({\mathcal {L}}_2 : X \rightarrow Y \) is reversible.

Remark 3.7

In the Hamiltonian case, the operator \( {\mathcal {L}}_2 \) is still Hamiltonian (the new Hamiltonian is the old one at the new time, divided by the factor \( \rho \)). The coefficient \( c_2 (\vartheta , y) \equiv 0 \) because \( b_2 \equiv 0 \), see Remark 3.5.

3.3 Step 3. Descent method: step zero

The aim of this section is to eliminate the term of order \(\partial _{yy}\) from \( \mathcal{L}_2 \) in (3.32).

Consider the multiplication operator

$$\begin{aligned} \mathcal{M} h := v(\vartheta , y) h \end{aligned}$$
(3.34)

where the function \(v\) is periodic in all its arguments. Calculate the difference

$$\begin{aligned} \mathcal{L}_2 \, \mathcal{M} - \mathcal{M} \, (\omega \cdot \partial _{\vartheta } + m_3 \partial _{yyy}) = T_2 \partial _{yy} + T_{1} \partial _{y} + T_{0}, \end{aligned}$$
(3.35)

where

$$\begin{aligned} \begin{aligned} T_{2}&:= 3 m_3 v_{y} + c_{2} v, \quad T_{1} := 3 m_3 v_{yy} + 2 c_{2} v_{y} + c_{1} v,\\ T_{0}&:= \omega \cdot \partial _{\vartheta } v + m_3 v_{yyy} + c_{2} v_{yy} + c_{1} v_{y} + c_0 v. \end{aligned} \end{aligned}$$
(3.36)

To eliminate the factor \(T_2\), we need

$$\begin{aligned} 3 m_3 v_{y} + c_{2} v = 0. \end{aligned}$$
(3.37)

Equation (3.37) has the periodic solution

$$\begin{aligned} v(\vartheta ,y) = \exp \left\{ - \frac{1}{3m_3} \, (\partial _y^{-1}c_2)(\vartheta ,y) \right\} \end{aligned}$$
(3.38)

provided that

$$\begin{aligned} \int _{\mathbb {T}}c_2(\vartheta ,y) \, dy = 0. \end{aligned}$$
(3.39)

Let us prove (3.39). By (3.33), (3.26), for each \(\vartheta = \varphi + \omega \alpha (\varphi )\) we get

$$\begin{aligned} \int _{\mathbb {T}}c_2(\vartheta ,y) \, dy&= \frac{1}{ \{ B^{-1}(1 + {\omega \cdot \partial _{\varphi }}\alpha )\}(\vartheta )} \, \int _{\mathbb {T}}(B^{-1}b_2)(\vartheta ,y) \, dy \\&= \frac{1}{ 1 + {\omega \cdot \partial _{\varphi }}\alpha (\varphi ) } \, \int _{\mathbb {T}}b_2(\varphi ,y) \, dy. \end{aligned}$$

By the definition (3.13) of \(b_2\) and changing variable \( y = x + \beta (\varphi ,x) \) in the integral (recall (3.8))

$$\begin{aligned} \int _{\mathbb {T}}b_2(\varphi ,y) \, dy&\mathop {=}\limits ^{(3.13)} \int _{\mathbb {T}}\left( (1+a_3) 3 (1+\beta _x) \beta _{xx} + a_2 (1+\beta _x)^2 \right) \, (1 + \beta _x) \, dx \nonumber \\&\mathop {=}\limits ^{ (3.15)} b(\varphi ) \left\{ 3 \int _{\mathbb {T}}\frac{ \beta _{xx}(\varphi ,x)}{1 \!+\! \beta _x(\varphi ,x)} \, dx \!+\! \int _{\mathbb {T}}\frac{ a_2(\varphi ,x) }{ 1 \!+ \!a_3(\varphi ,x) } \, dx \right\} .\quad \end{aligned}$$
(3.40)

The first integral in (3.40) is zero because \(\beta _{xx} / (1 + \beta _x) = \partial _x \log (1 + \beta _x)\). The second one is zero because of assumptions (Q)-(1.7) or (F)-(1.6), see (1.26). As a consequence (3.39) is proved, and (3.37) has the periodic solution \(v\) defined in (3.38). Note that \( v \) is close to \( 1 \) for \( \varepsilon \) small. Hence the multiplication operator \( \mathcal{M} \) defined in (3.34) is invertible and \( \mathcal{M} ^{-1} \) is the multiplication operator by \( 1 / v \). By (3.35) and since \(T_2 = 0\), we deduce

$$\begin{aligned} {\mathcal {L}}_3&:= \mathcal{M} ^{-1} {\mathcal {L}}_2 \mathcal{M} = {\omega \cdot \partial _{\vartheta }}+ m_3 \partial _{yyy} + d_{1}(\vartheta , y)\partial _y + d_{0}(\vartheta , y) , \nonumber \\ d_i&:= \frac{T_i}{v},\quad i=0,1. \end{aligned}$$
(3.41)
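The solvability condition (3.39) and the explicit solution (3.38) can also be checked on a Fourier grid. In the Python sketch below, the coefficient \(c_2\) (with zero \(y\)-average) and the value \(m_3 = 1\) are illustrative choices:

```python
import numpy as np

def dy(h):
    # spectral derivative in the 2*pi-periodic variable y
    j = np.fft.fftfreq(h.size, d=1.0 / h.size)
    return np.fft.ifft(1j * j * np.fft.fft(h)).real

def dy_inv(h):
    # primitive with zero average, as in (3.19)
    j = np.fft.fftfreq(h.size, d=1.0 / h.size)
    H = np.fft.fft(h)
    out = np.zeros_like(H)
    out[j != 0] = H[j != 0] / (1j * j[j != 0])
    return np.fft.ifft(out).real

def descent_v(c2, m3=1.0):
    # periodic solution (3.38) of 3 m3 v_y + c2 v = 0; requires (3.39)
    return np.exp(-dy_inv(c2) / (3.0 * m3))
```

One checks directly that \( 3 m_3 v_y + c_2 v \) vanishes to machine precision and that \( v > 0 \), so the multiplication operator \( \mathcal{M} \) is indeed invertible.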

Remark 3.8

In the reversible case, since \(c_2\) is odd (see Remark 3.6), the function \(v\) is even; hence \( \mathcal{M} \), \( \mathcal{M} ^{-1}\) are reversibility preserving and, by (3.36) and (3.41), \(d_1 \in X\) and \(d_0 \in Y\), which implies that \({\mathcal {L}}_3 : X \rightarrow Y\) is reversible.

Remark 3.9

In the Hamiltonian case, there is no need to perform this step because \( c_2 \equiv 0 \), see Remark 3.7.

3.4 Step 4. Change of space variable (translation)

Consider the change of the space variable

$$\begin{aligned} z = y + p(\vartheta ) \end{aligned}$$

which induces the operators

$$\begin{aligned} \mathcal{T} h(\vartheta ,y) := h(\vartheta , y + p(\vartheta )), \quad \mathcal{T}^{-1}v(\vartheta ,z) := v(\vartheta , z - p(\vartheta )). \end{aligned}$$
(3.42)

The differential operators become

$$\begin{aligned} \mathcal{T}^{-1}{\omega \cdot \partial _{\vartheta }}\mathcal{T} = {\omega \cdot \partial _{\vartheta }}+ \{ {\omega \cdot \partial _{\vartheta }}p(\vartheta ) \} \, \partial _z, \qquad \mathcal{T}^{-1}\partial _y \mathcal{T} = \partial _z. \end{aligned}$$

Thus, by (3.41),

$$\begin{aligned} {\mathcal {L}}_4 := \mathcal{T}^{-1}{\mathcal {L}}_3 \mathcal{T} = {\omega \cdot \partial _{\vartheta }}+ m_3 \partial _{zzz} + e_1(\vartheta ,z) \, \partial _z + e_0(\vartheta ,z) \end{aligned}$$

where

$$\begin{aligned} e_1(\vartheta ,z) := {\omega \cdot \partial _{\vartheta }}p(\vartheta ) + (\mathcal{T}^{-1}d_1) (\vartheta ,z), \quad e_0(\vartheta ,z) := (\mathcal{T} ^{-1}d_0)(\vartheta ,z).\quad \quad \end{aligned}$$
(3.43)

Now we look for \(p(\vartheta )\) such that the average

$$\begin{aligned} \frac{1}{2\pi } \, \int _{\mathbb {T}}e_1(\vartheta ,z) \, dz = m_1 , \quad \forall \vartheta \in {\mathbb {T}}^\nu , \end{aligned}$$
(3.44)

for some constant \(m_1 \in {\mathbb {R}}\) (independent of \( \vartheta \)). Equation (3.44) is equivalent to

$$\begin{aligned} \omega \cdot \partial _{\vartheta } p = m_1 - \frac{1}{2\pi } \int _{{\mathbb {T}}} d_{1}(\vartheta , y) \, dy =: V(\vartheta ). \end{aligned}$$
(3.45)

The equation (3.45) has a periodic solution \(p(\vartheta )\) if and only if \(\int _{{\mathbb {T}}^{\nu }}V(\vartheta ) \, d \vartheta = 0\). Hence we have to define

$$\begin{aligned} m_1 := \frac{1}{(2\pi )^{\nu +1}} \, \int _{{\mathbb {T}}^{\nu + 1}} d_{1}(\vartheta , y) \, d \vartheta d y \end{aligned}$$
(3.46)

and

$$\begin{aligned} p(\vartheta ) := (\omega \cdot \partial _{\vartheta })^{-1}V(\vartheta ). \end{aligned}$$
(3.47)

With this choice of \(p\), after renaming the space-time variables \(z = x\) and \(\vartheta = \varphi \), we have

$$\begin{aligned}&{\mathcal {L}}_4 = {\omega \cdot \partial _{\varphi }}+ m_3 \partial _{xxx} + e_1(\varphi ,x) \, \partial _x + e_0(\varphi ,x),\nonumber \\&\frac{1}{2\pi } \, \int _{{\mathbb {T}}} e_1(\varphi ,x) \, dx = m_1 , \ \forall \varphi \in {\mathbb {T}}^\nu . \end{aligned}$$
(3.48)

Remark 3.10

By (3.45), (3.47) and since \( d_1 \in X \) (see Remark 3.8), the function \(p\) is odd. Then \( \mathcal{T} \) and \( \mathcal{T}^{-1}\) defined in (3.42) are reversibility preserving and the coefficients \( e_1, e_0 \) defined in (3.43) satisfy \(e_1 \in X\), \(e_0 \in Y\). Hence \({\mathcal {L}}_4 : X \rightarrow Y\) is reversible.

Remark 3.11

In the Hamiltonian case the operator \( {\mathcal {L}}_4 \) is Hamiltonian, because the operator \( \mathcal{T} \) in (3.42) is symplectic (it is a particular case of the change of variables (1.25) with \( \beta (\varphi ,x) = p( \varphi ) \)).

3.5 Step 5. Descent method: conjugation by pseudo-differential operators

The goal of this section is to conjugate \( {\mathcal {L}}_4 \) in (3.48) to an operator of the form \( \omega \cdot \partial _{\varphi } + m_3 \partial _{xxx} + m_1 \partial _{x} + {\mathcal {R}}\) where the constants \(m_3\), \(m_1\) are defined in (3.30), (3.46), and \({\mathcal {R}}\) is a pseudo-differential operator of order \(0\).

Consider an operator of the form

$$\begin{aligned} {\mathcal {S}}:= I + w(\varphi ,x) \partial _{x}^{-1} \end{aligned}$$
(3.49)

where \(w : {\mathbb {T}}^{\nu + 1}\rightarrow {\mathbb {R}}\) and the operator \(\partial _{x}^{-1}\) is defined in (3.19). Note that \(\partial _x^{-1} \partial _x = \partial _x \partial _x^{-1} = \pi _0\), where \( \pi _0 \) is the \( L^2 \)-projector on the subspace \( H_0 := \{ u(\varphi ,x) \in L^2 ({\mathbb {T}}^{\nu +1})\, : \, \int _{{\mathbb {T}}} u(\varphi , x) \, dx = 0 \} \).

A direct computation shows that the difference

$$\begin{aligned} {\mathcal {L}}_4 {\mathcal {S}}- {\mathcal {S}}(\omega \cdot \partial _{\varphi } + m_3 \partial _{xxx} + m_1 \partial _{x}) = r_1 \partial _{x} + r_0 + r_{-1} \partial _{x}^{-1} \end{aligned}$$
(3.50)

where (using \( \partial _x \pi _0 = \pi _0 \partial _x = \partial _x \), \( \partial _x^{-1} \partial _{xxx} = \partial _{xx} \))

$$\begin{aligned} r_{1}&:= 3m_3 w_{x} + e_{1}(\varphi ,x) - m_1 \end{aligned}$$
(3.51)
$$\begin{aligned} r_{0}&:= e_{0} + \left( 3m_3 w_{xx} + e_{1}w - m_1 w \right) \pi _{0}\end{aligned}$$
(3.52)
$$\begin{aligned} r_{-1}&:= \omega \cdot \partial _{\varphi }w + m_3 w_{xxx} + e_{1} w_{x}. \end{aligned}$$
(3.53)

We look for a periodic function \( w (\varphi , x )\) such that \( r_1 = 0\). By (3.51) and (3.44) we take

$$\begin{aligned} w = \frac{1}{3m_3}\partial _{x}^{-1}[m_1 - e_{1}]. \end{aligned}$$
(3.54)

For \( \varepsilon \) small enough the operator \( \mathcal{S} \) is invertible and we obtain, by (3.50),

$$\begin{aligned} {\mathcal {L}}_5 := {\mathcal {S}}^{-1} {\mathcal {L}}_4 {\mathcal {S}}= \omega \cdot \partial _{\varphi } + m_3 \partial _{xxx} + m_1 \partial _{x} + \mathcal{R}, \quad \mathcal{R} := {\mathcal {S}}^{-1} ( r_{0} + r_{-1} \partial _{x}^{-1} ).\nonumber \\ \end{aligned}$$
(3.55)
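On a Fourier-grid discretization one can verify that the choice (3.54) cancels the coefficient \(r_1\) in (3.51): since \(e_1 - m_1\) has zero \(x\)-average by (3.48), differentiating \(w\) gives \(3 m_3 w_x = m_1 - e_1\) exactly. A Python sketch (the data \(e_1\), \(m_1\), \(m_3\) are illustrative):

```python
import numpy as np

def dx(h):
    # spectral derivative on the 2*pi-periodic grid
    j = np.fft.fftfreq(h.size, d=1.0 / h.size)
    return np.fft.ifft(1j * j * np.fft.fft(h)).real

def dx_inv(h):
    # primitive with zero average, as in (3.19)
    j = np.fft.fftfreq(h.size, d=1.0 / h.size)
    H = np.fft.fft(h)
    out = np.zeros_like(H)
    out[j != 0] = H[j != 0] / (1j * j[j != 0])
    return np.fft.ifft(out).real

# toy coefficient e1 whose x-average is m1, as guaranteed by (3.48)
n = 64
x = 2.0 * np.pi * np.arange(n) / n
m3, m1 = 1.0, 0.3
e1 = m1 + 0.1 * np.sin(2.0 * x)

w = dx_inv(m1 - e1) / (3.0 * m3)     # (3.54)
r1 = 3.0 * m3 * dx(w) + e1 - m1      # (3.51)
```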

Remark 3.12

In the reversible case, the function \(w \in Y\), because \(e_1 \in X\), see Remark 3.10. Then \({\mathcal {S}}\), \({\mathcal {S}}^{-1}\) are reversibility preserving. By (3.52) and (3.53), \(r_0 \in Y\) and \(r_{-1} \in X\). Then the operators \( {\mathcal {R}}, {\mathcal {L}}_5 \) defined in (3.55) are reversible, namely \({\mathcal {R}}, {\mathcal {L}}_5 : X \rightarrow Y\).

Remark 3.13

In the Hamiltonian case, we consider, instead of (3.49), the modified operator

$$\begin{aligned} {\mathcal {S}}:= e^{\pi _0 w(\varphi ,x) \partial _{x}^{-1}} := I + \pi _0 w(\varphi ,x) \partial _{x}^{-1} + \cdots \end{aligned}$$
(3.56)

which, for each \( \varphi \in {\mathbb {T}}^\nu \), is symplectic. Actually \( {\mathcal {S}}\) is the time one flow map of the Hamiltonian vector field \( \pi _0 w(\varphi , x) \partial _{x}^{-1} \) which is generated by the Hamiltonian

$$\begin{aligned} H_{\mathcal {S}}(\varphi , u) := - \frac{1}{2}\, \int _{{\mathbb {T}}} w(\varphi , x) \left( \partial _x^{-1} u \right) ^2 dx \, , \quad u \in H^1_0. \end{aligned}$$

The corresponding \( {\mathcal {L}}_5 \) in (3.55) is Hamiltonian. Note that the operators (3.56) and (3.49) differ only by pseudo-differential smoothing operators of order \( O( \partial _{x}^{-2} ) \) and of smaller size \( O( w^2 ) = O(\varepsilon ^2) \).

3.6 Estimates on \({\mathcal {L}}_5\)

Summarizing the steps performed in the previous Sects. 3.1–3.5, we have (semi)-conjugated the operator \( {\mathcal {L}}\) defined in (3.1) to the operator \({\mathcal {L}}_5 \) defined in (3.55), namely

$$\begin{aligned} {\mathcal {L}}= \Phi _1 {\mathcal {L}}_5 \Phi _2^{-1}, \quad \Phi _1 := \mathcal{A} B \rho \mathcal{M} \mathcal{T} {\mathcal {S}}, \quad \Phi _2 := \mathcal{A} B \mathcal{M} \mathcal{T} {\mathcal {S}}\end{aligned}$$
(3.57)

(where \( \rho \) denotes the operator of multiplication by the function \( \rho \) defined in (3.26)).

In the next lemma we give tame estimates for \({\mathcal {L}}_5\) and \(\Phi _1, \Phi _2\). We define the constants

$$\begin{aligned} \sigma := 2\tau _0 + 2 \nu + 17, \quad \sigma ' := 2\tau _0 + \nu + 14 \end{aligned}$$
(3.58)

where \( \tau _0 \) is defined in (1.2) and \( \nu \) is the number of frequencies.

Lemma 3.2

Let \( f \in C ^q \), see (1.3), and \( {\mathfrak {s}}_0 \le s \le q - \sigma \). There exists \(\delta > 0 \) such that, if \( \varepsilon \gamma _0 ^{-1}< \delta \) (the constant \( \gamma _0 \) is defined in (1.2)), then, for all

$$\begin{aligned} \Vert u \Vert _{{\mathfrak {s}}_0 + \sigma } \le 1 , \end{aligned}$$
(3.59)

\((i)\) the transformations \({\Phi }_1, {\Phi }_2\) defined in (3.57) are invertible operators of \(H^s({\mathbb {T}}^{\nu +1})\), and satisfy

$$\begin{aligned} \Vert \Phi _i h \Vert _s + \Vert \Phi _i^{-1}h \Vert _s \le C(s) \left( \Vert h \Vert _s + \Vert u \Vert _{s+\sigma } \Vert h \Vert _{{\mathfrak {s}}_0} \right) , \end{aligned}$$
(3.60)

for \(i = 1, 2\). Moreover, if \(u(\lambda )\), \(h(\lambda )\) are Lipschitz families with

$$\begin{aligned} \Vert u \Vert _{{\mathfrak {s}}_0 + \sigma }^{\mathrm{{Lip}(\gamma )}} \le 1, \end{aligned}$$
(3.61)

then

$$\begin{aligned} \Vert \Phi _i h \Vert _s^{\mathrm{{Lip}(\gamma )}} + \Vert \Phi _i^{-1}h \Vert _s^{\mathrm{{Lip}(\gamma )}} \le C(s) \left( \Vert h \Vert _{s+3}^{\mathrm{{Lip}(\gamma )}} + \Vert u \Vert _{s+\sigma }^{\mathrm{{Lip}(\gamma )}} \Vert h \Vert _{{\mathfrak {s}}_0+3}^{\mathrm{{Lip}(\gamma )}} \right) , \quad i = 1,2.\nonumber \\ \end{aligned}$$
(3.62)

\((ii)\) The constant coefficients \(m_3, m_1\) of \({\mathcal {L}}_5\) defined in (3.55) satisfy

$$\begin{aligned} | m_3 - 1| + |m_1|&\le \varepsilon C ,\end{aligned}$$
(3.63)
$$\begin{aligned} | \partial _u m_3(u)[h]| + | \partial _u m_1(u)[h]|&\le \varepsilon C \Vert h \Vert _{\sigma }. \end{aligned}$$
(3.64)

Moreover, if \(u(\lambda )\) is a Lipschitz family satisfying (3.61), then

$$\begin{aligned} | m_3 - 1 |^{\mathrm{{Lip}(\gamma )}} + | m_1 |^{\mathrm{{Lip}(\gamma )}} \le \varepsilon C. \end{aligned}$$
(3.65)

\((iii)\) The operator \({\mathcal {R}}\) defined in (3.55) satisfies:

$$\begin{aligned} | {\mathcal {R}}|_s&\le \varepsilon C(s) (1 + \Vert u \Vert _{s + \sigma }),\end{aligned}$$
(3.66)
$$\begin{aligned} | \partial _u {\mathcal {R}}(u)[h] \, |_{s}&\le \varepsilon C(s) \left( \Vert h \Vert _{s + \sigma '} + \Vert u \Vert _{s + \sigma } \Vert h \Vert _{\mathfrak s_0 + \sigma '} \right) , \end{aligned}$$
(3.67)

where \( \sigma > \sigma ' \) are defined in (3.58). Moreover, if \(u(\lambda )\) is a Lipschitz family satisfying (3.61), then

$$\begin{aligned} | {\mathcal {R}}|_s^{\mathrm{{Lip}(\gamma )}} \le \varepsilon C(s) (1 + \Vert u \Vert _{s + \sigma }^{\mathrm{{Lip}(\gamma )}}). \end{aligned}$$
(3.68)

Finally, in the reversible case, the maps \(\Phi _i, \Phi _i^{-1}\), \(i=1,2\) are reversibility preserving and \({\mathcal {R}}, {\mathcal {L}}_5 : X \rightarrow Y\) are reversible. In the Hamiltonian case the operator \( {\mathcal {L}}_5 \) is Hamiltonian.

Proof

The proof is elementary. It is based only on a repeated use of the tame estimates of the Lemmata in the Appendix. \(\square \)

In the same way we get the following lemma.

Lemma 3.3

Under the same hypotheses as in Lemma 3.2, for all \(\varphi \in {\mathbb {T}}^\nu \), the operators \(\mathcal{A}(\varphi )\), \( \mathcal{M} (\varphi )\), \( \mathcal{T} (\varphi )\), \({\mathcal {S}}(\varphi )\) are invertible operators of the phase space \(H^s_x := H^s({\mathbb {T}})\), with

$$\begin{aligned} \Vert \mathcal{A}^{\pm 1}(\varphi ) h \Vert _{H^s_x}&\le C(s) \left( \Vert h \Vert _{H^s_x} + \Vert u \Vert _{s + {\mathfrak {s}}_0 + 3} \Vert h \Vert _{H^1_x} \right) ,\end{aligned}$$
(3.69)
$$\begin{aligned} \Vert (\mathcal{A}^{\pm 1}(\varphi ) \!-\! I) h \Vert _{H^s_x}&\le \! \varepsilon C(s) \left( \Vert h \Vert _{H^{s+1}_x} \!+\! \Vert u \Vert _{s\!+\! \mathfrak s_0 \!+\! 3} \Vert h \Vert _{H^2_x} \right) ,\qquad \end{aligned}$$
(3.70)
$$\begin{aligned} \Vert (\mathcal{M} (\varphi ) \mathcal{T}(\varphi ) {\mathcal {S}}(\varphi ))^{\pm 1} h \Vert _{H^s_x}&\le C(s) \left( \Vert h \Vert _{H^s_x} + \Vert u \Vert _{s + \sigma } \Vert h \Vert _{H^1_x} \right) ,\end{aligned}$$
(3.71)
$$\begin{aligned} \Vert (( \mathcal{M} (\varphi ) \mathcal{T}(\varphi ) {\mathcal {S}}(\varphi ) )^{\pm 1} - I) h \Vert _{H^s_x}&\le \varepsilon \gamma _0^{-1} C(s) \left( \Vert h \Vert _{H^{s+1}_x} + \Vert u \Vert _{s + \sigma } \Vert h \Vert _{H^1_x} \right) . \end{aligned}$$
(3.72)

4 Reduction of the linearized operator to constant coefficients

The goal of this section is to diagonalize the linear operator \( {\mathcal {L}}_5 \) obtained in (3.55), and therefore to complete the reduction of \( \mathcal{L} \) in (3.1) to constant coefficients. For \( \tau > \tau _0 \) (see (1.2)) we define the constant

$$\begin{aligned} \beta := 7 \tau + 6. \end{aligned}$$
(4.1)

Theorem 4.1

Let \( f \in C^q \), see (1.3). Let \( \gamma \in (0,1) \) and \( {\mathfrak {s}}_0 \le s \le q - \sigma - \beta \) where \( \sigma \) is defined in (3.58), and \( \beta \) in (4.1). Let \(u(\lambda ) \) be a family of functions depending on the parameter \(\lambda \in \Lambda _o \subset \Lambda := [1/2, 3/2]\) in a Lipschitz way, with

$$\begin{aligned} \Vert u \Vert _{{\mathfrak {s}}_0 + \sigma + \beta , \Lambda _o}^{\mathrm{{Lip}(\gamma )}} \le 1. \end{aligned}$$
(4.2)

Then there exist \( \delta _{0} \), \( C \) (depending on the data of the problem) such that, if

$$\begin{aligned} \varepsilon \gamma ^{-1} \le \delta _{0} , \end{aligned}$$
(4.3)

then:

(i) :

(Eigenvalues) \(\forall \lambda \in \Lambda \) there exists a sequence

$$\begin{aligned} \begin{aligned} \mu _j^\infty (\lambda )&:= \mu _j^\infty (\lambda , u) = {\tilde{\mu }}^{0}_j(\lambda ) + r_j^\infty (\lambda ) ,\\ {\tilde{\mu }}^0_j(\lambda )&:= \mathrm{i} \left( - {\tilde{m}}_3 ( \lambda ) j^3 + {\tilde{m}}_1(\lambda ) j \right) , \ j \in {\mathbb {Z}}, \end{aligned} \end{aligned}$$
(4.4)

where \( {\tilde{m}}_3, {\tilde{m}}_1\) coincide with the coefficients of \( \mathcal{L}_5 \) in (3.55) for all \( \lambda \in \Lambda _o \), and the corrections \(r_j^\infty \) satisfy

$$\begin{aligned} | {\tilde{m}}_3 - 1 |^{\mathrm{{Lip}(\gamma )}} + | {\tilde{m}}_1 |^{\mathrm{{Lip}(\gamma )}} + | r^{\infty }_j |^{\mathrm{{Lip}(\gamma )}}_{\Lambda }&\le \varepsilon C , \ \ \forall j \in {\mathbb {Z}}. \end{aligned}$$
(4.5)

Moreover, in the reversible case (i.e. (1.13) holds) or Hamiltonian case (i.e. (1.11) holds), all the eigenvalues \(\mu _j^{\infty }\) are purely imaginary.

(ii) :

(Conjugacy). For all \(\lambda \) in

$$\begin{aligned} \Lambda _\infty ^{2\gamma } := \Lambda _\infty ^{2\gamma } (u)&:= \left\{ \lambda \in \Lambda _o \, : \, | \mathrm{i} \lambda \bar{\omega }\cdot l + \mu ^{\infty }_j (\lambda ) - \mu ^{\infty }_{k} (\lambda ) |\right. \nonumber \\&\ge \left. 2 \gamma | j^{3} - k^{3} | \langle l \rangle ^{-\tau }, \ \forall l \in {\mathbb {Z}}^{\nu }, \, j ,k \in {\mathbb {Z}}\right\} \end{aligned}$$
(4.6)

there is a bounded, invertible linear operator \(\Phi _\infty (\lambda ) : H^s \rightarrow H^s\), with bounded inverse \(\Phi _\infty ^{-1}(\lambda )\), that conjugates \({\mathcal {L}}_5\) in (3.55) to constant coefficients, namely

$$\begin{aligned} \begin{aligned} \mathcal{L}_{\infty }(\lambda )&:= \Phi _{\infty }^{-1}(\lambda ) \circ {\mathcal {L}}_5(\lambda ) \circ \Phi _{\infty }(\lambda ) = \lambda \bar{\omega }\cdot \partial _{\varphi } + \mathcal{D}_{\infty }(\lambda ), \\ \mathcal{D}_{\infty }(\lambda )&:= \mathrm{diag}_{j \in {\mathbb {Z}}} \mu ^{\infty }_{j}(\lambda ). \end{aligned} \end{aligned}$$
(4.7)

The transformations \(\Phi _\infty , \Phi _\infty ^{-1}\) are close to the identity in matrix decay norm, with estimates

$$\begin{aligned} | \Phi _{\infty } (\lambda ) - I |_{s,\Lambda _\infty ^{2\gamma }}^\mathrm{{Lip}(\gamma )}+ | \Phi _{\infty }^{- 1} (\lambda ) - I |_{s,\Lambda _\infty ^{2\gamma }}^\mathrm{{Lip}(\gamma )}\le \varepsilon \gamma ^{-1} C(s) \left( 1 + \Vert u \Vert _{s + \sigma + \beta ,\Lambda _o }^\mathrm{{Lip}(\gamma )}\right) .\nonumber \\ \end{aligned}$$
(4.8)

For all \(\varphi \in {\mathbb {T}}^\nu \), the operator \(\Phi _\infty (\varphi ) : H^s_x \rightarrow H^s_x \) is invertible (where \(H^s_x := H^s({\mathbb {T}})\)) with inverse \( (\Phi _\infty (\varphi ))^{-1} = \Phi _\infty ^{-1}(\varphi )\), and

$$\begin{aligned} \Vert (\Phi _\infty ^{\pm 1}(\varphi ) - I) h \Vert _{H^s_x}&\le \varepsilon \gamma ^{-1} C(s) \left( \Vert h \Vert _{H^s_x} + \Vert u \Vert _{s + \sigma + \beta + {\mathfrak {s}}_0} \Vert h \Vert _{H^1_x} \right) . \end{aligned}$$
(4.9)

In the reversible case \(\Phi _{\infty }, \Phi _{\infty }^{-1} : X \rightarrow X \), \(Y \rightarrow Y\) are reversibility preserving, and \({\mathcal {L}}_\infty : X \rightarrow Y\) is reversible. In the Hamiltonian case the final \({\mathcal {L}}_\infty \) is Hamiltonian.

An important feature of Theorem 4.1 is that it requires only the bound (4.2) on the low norm of \( u \), yet it provides the estimate (4.8) for \( \Phi _\infty ^{\pm 1} - I \) also in the higher norms \( | \cdot |_s \), depending on the corresponding high norms of \( u \). From Theorem 4.1 we shall deduce tame estimates for the inverse linearized operators in Theorem 4.3.

Note also that the set \( \Lambda _{\infty }^{2 \gamma } \) in (4.6) depends only on the final eigenvalues, and is not defined inductively as in usual KAM theorems. This characterization of the set of parameters which fulfill all the required Melnikov non-resonance conditions (at any step of the iteration) was first observed in [5, 8] in an analytic setting. Theorem 4.1 extends this property to a differentiable setting. A main advantage of this formulation is that it allows one to discuss the measure estimates only once, and not inductively: the Cantor set \( \Lambda _{\infty }^{2 \gamma } \) in (4.6) could be empty (actually its measure is \( |\Lambda _{\infty }^{2 \gamma } | = 1 - O(\gamma ) \) as \( \gamma \rightarrow 0 \)), but the functions \( \mu ^\infty _j (\lambda ) \) are well defined for all \( \lambda \in \Lambda \) anyway, see (4.4). In particular, we shall perform the measure estimates only once, along the nonlinear iteration, see Sect. 5.
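On a finite truncation, the non-resonance conditions (4.6) are straightforward to test numerically. The Python sketch below uses the unperturbed eigenvalues \( \mu _j = -\mathrm{i} j^3 \) (that is, \( {\tilde{m}}_3 = 1 \), \( {\tilde{m}}_1 = 0 \), \( r_j^\infty = 0 \)) and illustrative values of \( \bar{\omega } \), \( \gamma \), \( \tau \) and the cut-off \( N \); it only illustrates the Cantor-like structure of the admissible \( \lambda \), not the actual sets of the theorem:

```python
import numpy as np

BAR_OMEGA = (1.0, np.sqrt(2.0))   # illustrative frequency vector, nu = 2
GAMMA, TAU = 1e-2, 1.5            # illustrative constants

def melnikov_ok(lam, N):
    # check (4.6) for |l1|, |l2|, |j|, |k| <= N with mu_j = -i j^3, so that
    # |i lam omega.l + mu_j - mu_k| = |lam omega.l - (j^3 - k^3)|
    for l1 in range(-N, N + 1):
        for l2 in range(-N, N + 1):
            wl = lam * (BAR_OMEGA[0] * l1 + BAR_OMEGA[1] * l2)
            weight = (1.0 + l1 * l1 + l2 * l2) ** (TAU / 2.0)
            for j in range(-N, N + 1):
                for k in range(-N, N + 1):
                    if j == k:
                        continue  # right-hand side of (4.6) vanishes
                    d = j ** 3 - k ** 3
                    if abs(wl - d) < 2.0 * GAMMA * abs(d) / weight:
                        return False
    return True
```

For example, \( \lambda = 1 \) is excluded because \( \lambda \bar{\omega }\cdot l - (j^3 - k^3) \) vanishes for \( l = (1,0) \), \( j = 1 \), \( k = 0 \); scanning \( \lambda \) over a grid of \( \Lambda = [1/2, 3/2] \) exhibits the excluded resonant gaps.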

Theorem 4.1 is deduced from the following iterative Nash–Moser reducibility theorem for a linear operator of the form

$$\begin{aligned} \mathcal{L}_{0} = \omega \cdot \partial _{\varphi } + \mathcal{D}_{0} + \mathcal{R}_{0} , \end{aligned}$$
(4.10)

where \(\omega = \lambda \bar{\omega }\),

$$\begin{aligned} \mathcal{D}_{0} := m_3 (\lambda ,u(\lambda )) \partial _{xxx} + m_1(\lambda ,u(\lambda )) \partial _{x} , \quad {\mathcal {R}}_0(\lambda ,u(\lambda )) := {\mathcal {R}}(\lambda ,u(\lambda )) ,\quad \quad \end{aligned}$$
(4.11)

where \( m_3(\lambda ,u(\lambda )), m_1 (\lambda ,u(\lambda )) \in {\mathbb {R}}\) and \( u(\lambda ) \) is defined for \( \lambda \in \Lambda _o \subset \Lambda \). Clearly \({\mathcal {L}}_5\) in (3.55) has the form (4.10). Define

$$\begin{aligned} N_{-1} := 1 , \quad N_{\nu } := N_{0}^{\chi ^{\nu }} \ \forall \nu \ge 0 , \quad \chi := 3 /2 \end{aligned}$$
(4.12)

(then \( N_{\nu +1} = N_{\nu }^\chi \), \( \forall \nu \ge 0 \)) and

$$\begin{aligned} \alpha := 7 \tau + 4, \quad \sigma _2 := \sigma + \beta \end{aligned}$$
(4.13)

where \(\sigma \) is defined in (3.58) and \(\beta \) is defined in (4.1).
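For intuition on the scales (4.12) and the first bound in (4.19): \( N_\nu = N_0^{\chi ^\nu } \) grows super-exponentially, so the remainders \( |\mathcal{R}_\nu |_s \) decay like \( N_{\nu -1}^{-\alpha } \), faster than any exponential in \( \nu \). A short numerical illustration (the values \( N_0 = 10 \) and \( \tau = 3/2 \) are our own choices):

```python
# scales (4.12) and the decay rate N_nu^{-alpha} appearing in (4.19)
N0, chi = 10.0, 1.5
tau = 1.5                      # illustrative; the paper only requires tau > tau_0
alpha = 7 * tau + 4            # as in (4.13)

N = [N0 ** (chi ** nu) for nu in range(6)]
decay = [N[nu] ** (-alpha) for nu in range(6)]
```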

Theorem 4.2

(KAM reducibility) Let \( q > \sigma + {\mathfrak {s}}_0 + \beta \). There exist \(C_0 > 0 \), \( N_{0} \in {\mathbb {N}}\) large, such that, if

$$\begin{aligned} N_{0}^{C_0} |\mathcal{R}_{0} |_{{\mathfrak {s}}_{0} + \beta }^{\mathrm{{Lip}(\gamma )}} \gamma ^{-1} \le 1, \end{aligned}$$
(4.14)

then, for all \( \nu \ge 0 \):

\(\mathbf{(S1)_{\nu }}\) :

There exists an operator

$$\begin{aligned} \mathcal{L}_\nu&:= \omega \cdot \partial _{\varphi } + \mathcal{D}_\nu + \mathcal{R}_\nu \quad \text{where} \quad \mathcal{D}_\nu = \mathrm{diag}_{j \in {\mathbb {Z}}} \{ \mu ^{\nu }_{j}(\lambda ) \}\end{aligned}$$
(4.15)
$$\begin{aligned} \mu _j^{\nu }(\lambda )&= \mu _j^0(\lambda ) + r_j^{\nu }(\lambda ), \mu _j^0(\lambda ) := - \mathrm{i} \left( m_3(\lambda ,u(\lambda )) j^3 - m_1(\lambda ,u(\lambda )) j \right) , \nonumber \\&j \in {\mathbb {Z}}, \end{aligned}$$
(4.16)

defined for all \( \lambda \in \Lambda _{\nu }^{\gamma }(u)\), where \(\Lambda _{0}^{\gamma }(u) := \Lambda _o \) (the domain of \( u \)) and, for \(\nu \ge 1\),

$$\begin{aligned} \Lambda _{\nu }^{\gamma } := \Lambda _{\nu }^{\gamma }(u)&:= \left\{ \lambda \in \Lambda _{\nu - 1}^{\gamma } : \left| \mathrm{i} \omega \cdot l + \mu ^{\nu -1}_{j}(\lambda ) - \mu ^{\nu - 1}_{k}(\lambda ) \right| \right. \nonumber \\&\ge \left. \gamma \frac{ |j^{3} - k^{3}|}{\left\langle l\right\rangle ^{\tau }} \ \forall \left| l\right| \le N_{ \nu -1}, \ j, k \in {\mathbb {Z}}\right\} . \end{aligned}$$
(4.17)

For \(\nu \ge 0\), \(r_j^{\nu } = \overline{r_{-j}^{\nu }}\), equivalently \( \mu _j^{\nu } = \overline{\mu _{-j}^{\nu }}\), and

$$\begin{aligned} |r_j^{\nu }|^{\mathrm{{Lip}(\gamma )}} := |r_j^{\nu }|^{\mathrm{{Lip}(\gamma )}}_{\Lambda _{\nu }^\gamma } \le \varepsilon C . \end{aligned}$$
(4.18)

The remainder \( \mathcal{R}_\nu \) is real (Definition 2.2) and, \( \forall s \in [ {\mathfrak {s}}_0, q - \sigma - \beta ] \),

$$\begin{aligned} \left| \mathcal{R}_{\nu }\right| _{s}^{\mathrm{{Lip}(\gamma )}} \le \left| \mathcal{R}_{0}\right| _{s+\beta }^{\mathrm{{Lip}(\gamma )}}N_{\nu - 1}^{-\alpha } , \quad \left| \mathcal{R}_{\nu }\right| _{s + \beta }^{\mathrm{{Lip}(\gamma )}} \le \left| \mathcal{R}_{0}\right| _{s+\beta }^{\mathrm{{Lip}(\gamma )}}\,N_{\nu - 1}. \end{aligned}$$
(4.19)

Moreover, for \( \nu \ge 1 \),

$$\begin{aligned} \mathcal{L}_{\nu } = \Phi _{\nu -1}^{-1} \mathcal{L}_{\nu -1} \Phi _{\nu -1} , \quad \Phi _{\nu -1} := I + \Psi _{\nu -1} , \end{aligned}$$
(4.20)

where the map \( \Psi _{\nu -1} \) is real, Töplitz in time \( \Psi _{\nu -1} := \Psi _{\nu -1}(\varphi ) \) (see (2.17)), and satisfies

$$\begin{aligned} \left| \Psi _{\nu -1} \right| _{s}^{\mathrm{{Lip}(\gamma )}} \le |\mathcal{R}_{0} |_{s+\beta }^{\mathrm{{Lip}(\gamma )}} \gamma ^{-1} N_{\nu -1}^{2 \tau +1} N_{\nu - 2}^{- \alpha } . \end{aligned}$$
(4.21)

In the reversible case, \({\mathcal {R}}_{\nu } : X \rightarrow Y\), \(\Psi _{\nu - 1}, \Phi _{\nu - 1}, \Phi _{\nu - 1}^{-1} \) are reversibility preserving. Moreover, all the \( \mu ^\nu _{j}(\lambda ) \) are purely imaginary and \( \mu ^{\nu }_{j} = - \mu ^{\nu }_{-j} \), \( \forall j \in {\mathbb {Z}}\).

\(\mathbf{(S2)_{\nu }}\) :

For all \( j \in {\mathbb {Z}}\), there exist Lipschitz extensions \( \widetilde{\mu }_{j}^{\nu }(\cdot ): \Lambda \rightarrow {\mathbb {R}}\) of \( \mu _{j}^{\nu }(\cdot ) : \Lambda _\nu ^\gamma \rightarrow {\mathbb {R}}\) satisfying, for \(\nu \ge 1\),

$$\begin{aligned} |\widetilde{\mu }_{j}^{\nu } - \widetilde{\mu }_{j}^{\nu -1} |^{\mathrm{{Lip}(\gamma )}} \le | \mathcal{R}_{\nu -1} |^{\mathrm{{Lip}(\gamma )}}_{{\mathfrak {s}}_0}. \end{aligned}$$
(4.22)
\(\mathbf{(S3)_{\nu }}\) :

Let \( u_1(\lambda )\), \( u_2(\lambda )\) be Lipschitz families of Sobolev functions, defined for \(\lambda \in \Lambda _o\) and such that conditions (4.2), (4.14) hold with \( \mathcal{R}_0 := \mathcal{R}_0 ( u_i) \), \( i = 1,2 \), see (4.11). Then, for \( \nu \ge 0 \), \(\forall \lambda \in \Lambda _{\nu }^{\gamma _1}(u_1) \cap \Lambda _{\nu }^{\gamma _2}(u_2)\), with \( \gamma _1, \gamma _2 \in [\gamma /2, 2\gamma ]\),

$$\begin{aligned} \begin{aligned} |\mathcal{R}_{\nu }(u_2) - {\mathcal {R}}_{\nu }(u_1)|_{{\mathfrak {s}}_{0}}&\le \varepsilon N_{\nu - 1}^{-\alpha } \Vert u_1 - u_2 \Vert _{{\mathfrak {s}}_0 + \sigma _2},\\ |\mathcal{R}_{\nu }(u_2) - {\mathcal {R}}_{\nu }(u_1)|_{\mathfrak s_{0}+\beta }&\le \varepsilon N_{\nu - 1} \Vert u_1 - u_2 \Vert _{\mathfrak s_0 + \sigma _2}. \end{aligned} \end{aligned}$$
(4.23)

Moreover, for \(\nu \ge 1\), \( \forall s \in [\mathfrak s_{0},{\mathfrak {s}}_{0}+\beta ] \), \(\forall j \in {\mathbb {Z}}\),

$$\begin{aligned} \big |\!\left( r_{j}^{\nu }(u_2) \!-\! r_{j}^{\nu }(u_1)\right) - \left( r_{j}^{\nu -1}(u_2) \!-\! r_{j}^{\nu -1}(u_1)\right) \big |\!&\le \! \vert \mathcal{R}_{\nu -1}(u_2) \!-\! {\mathcal {R}}_{\nu -1}(u_1) \vert _{{\mathfrak {s}}_0} ,\nonumber \\\end{aligned}$$
(4.24)
$$\begin{aligned} | r_j^{\nu }(u_2) \!-\! r_j^{\nu }(u_1) | \!&\le \! \varepsilon C \Vert u_1 \!-\! u_2 \Vert _{{\mathfrak {s}}_0 + \sigma _2}. \end{aligned}$$
(4.25)
\(\mathbf{(S4)_{\nu }}\) :

Let \(u_1, u_2\) be as in \((\mathbf{S3})_\nu \) and let \( 0 < \rho < \gamma / 2 \). Then, for all \( \nu \ge 0 \),

$$\begin{aligned} \varepsilon C N_{\nu - 1}^{\tau } \Vert u_1 - u_2 \Vert _{{\mathfrak {s}}_0 + \sigma _2}^\mathrm{sup} \le \rho \Longrightarrow \Lambda _{\nu }^{\gamma }(u_1) \subseteq \Lambda _{\nu }^{\gamma - \rho }(u_2). \end{aligned}$$
(4.26)

Remark 4.1

In the Hamiltonian case \( \Psi _{\nu -1}\) is Hamiltonian and, instead of (4.20) we consider the symplectic map

$$\begin{aligned} \Phi _{\nu -1} := \exp (\Psi _{\nu -1}). \end{aligned}$$
(4.27)

The corresponding operators \( {\mathcal {L}}_\nu \), \(\mathcal{R}_\nu \) are Hamiltonian. Note that the operators (4.27) and (4.20) differ by an operator of order \(\Psi _{\nu - 1}^2\).
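Indeed, expanding the exponential, the two transformations agree to first order:

$$\begin{aligned} \exp (\Psi _{\nu -1}) - \left( I + \Psi _{\nu -1}\right) = \sum _{k \ge 2} \frac{\Psi _{\nu -1}^{k}}{k!} , \end{aligned}$$

which is quadratic in \( \Psi _{\nu -1} \).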

The proof of Theorem 4.2 is postponed to Subsection 4.1. We first derive some of its consequences.

Corollary 4.1

(KAM transformation) \( \forall \lambda \in \cap _{\nu \ge 0} \Lambda _{\nu }^{\gamma } \) the sequence

$$\begin{aligned} \widetilde{\Phi }_{\nu } := \Phi _{0} \circ \Phi _1 \circ \cdots \circ \Phi _{\nu } \end{aligned}$$
(4.28)

converges in \( |\cdot |_{s}^{\mathrm{{Lip}(\gamma )}}\) to an operator \(\Phi _{\infty }\) and

$$\begin{aligned} \left| \Phi _{\infty } - I \right| _{s}^{\mathrm{{Lip}(\gamma )}} + \left| \Phi _{\infty }^{-1} - I \right| _{s}^{\mathrm{{Lip}(\gamma )}} \le C(s) \left| \mathcal{R}_{0}\right| _{s + \beta }^{\mathrm{{Lip}(\gamma )}} \gamma ^{-1}. \end{aligned}$$
(4.29)

In the reversible case \(\Phi _\infty \) and \(\Phi _{\infty }^{-1}\) are reversibility preserving.

Proof

To simplify notation we write \(|\cdot |_s \) for \(|\cdot |_s^{\mathrm{{Lip}(\gamma )}}\). For all \( \nu \ge 0 \) we have \( \widetilde{\Phi }_{\nu + 1} = \widetilde{\Phi }_{\nu }\circ \Phi _{\nu + 1} = \widetilde{\Phi }_{\nu } + \widetilde{\Phi }_{\nu }\Psi _{\nu + 1} \) (see (4.20)) and so

$$\begin{aligned} |\widetilde{\Phi }_{\nu + 1}|_{{\mathfrak {s}}_{0}} \mathop {\le }\limits ^{(2.9)}|\widetilde{\Phi }_{\nu } |_{{\mathfrak {s}}_{0}} + C |\widetilde{\Phi }_{\nu } |_{\mathfrak s_{0}}\left| \Psi _{\nu + 1}\right| _{{\mathfrak {s}}_{0}} \mathop {\le }\limits ^{(4.21)}|\widetilde{\Phi }_{\nu } |_{\mathfrak s_{0}} ( 1+ \varepsilon _\nu ) \end{aligned}$$
(4.30)

where \( \varepsilon _\nu := C' |\mathcal{R}_{0} |_{{\mathfrak {s}}_0 +\beta }^{\mathrm{{Lip}(\gamma )}} \gamma ^{-1} N_{\nu +1}^{2 \tau +1} N_{\nu }^{- \alpha } \). Iterating (4.30) we get, for all \( \nu \),

$$\begin{aligned} |\widetilde{\Phi }_{\nu + 1}|_{{\mathfrak {s}}_{0}} \le | \widetilde{\Phi }_0 |_{{\mathfrak {s}}_{0}} \Pi _{\nu \ge 0} (1+ \varepsilon _\nu ) \le | \Phi _0 |_{{\mathfrak {s}}_0} e^{C |\mathcal{R}_{0} |_{\mathfrak s_0+\beta }^{\mathrm{{Lip}(\gamma )}} \gamma ^{-1}} \le 2 \end{aligned}$$
(4.31)

using (4.21) (with \( \nu =1 \), \( s = {\mathfrak {s}}_0 \)) to estimate \( | \Phi _0 |_{{\mathfrak {s}}_0} \) and (4.14). The high norm of \( \widetilde{\Phi }_{\nu + 1} = \widetilde{\Phi }_{\nu } + \widetilde{\Phi }_{\nu }\Psi _{\nu + 1} \) is estimated by (2.10), (4.31) (for \( {\widetilde{\Phi }}_\nu \)), as

$$\begin{aligned} |\widetilde{\Phi }_{\nu + 1}|_s&\le | \widetilde{\Phi }_{\nu } |_s ( 1 + C(s)\left| \Psi _{\nu + 1}\right| _{{\mathfrak {s}}_{0}} ) + C(s) \left| \Psi _{\nu + 1}\right| _s \\&\mathop {\le }\limits ^{(4.21), (4.13)} | \widetilde{\Phi }_{\nu } |_s ( 1 + \varepsilon _{\nu }^{(0)}) + \varepsilon _{\nu }^{(s)} , \ \varepsilon _{\nu }^{(0)} := |\mathcal{R}_0|_{{\mathfrak {s}}_0+\beta } \gamma ^{-1} N_\nu ^{-1}, \end{aligned}$$
$$\begin{aligned} \varepsilon _{\nu }^{(s)} := |\mathcal{R}_0|_{s+\beta } \gamma ^{-1} N_\nu ^{-1}. \end{aligned}$$

Iterating the above inequality and using \( \Pi _{j \ge 0} (1+ \varepsilon _j^{(0)}) \le 2 \), we get

$$\begin{aligned} |\widetilde{\Phi }_{\nu + 1}|_s \le _s \sum _{j=0}^\infty \varepsilon _j^{(s)} + |\widetilde{\Phi }_0 |_s \le C(s) \left( 1+ | \mathcal{R}_0|_{s+ \beta } \gamma ^{-1} \right) \end{aligned}$$
(4.32)

using \( |\Phi _0 |_s \le 1+ C(s) | \mathcal{R}_0|_{s+ \beta } \gamma ^{-1} \). Finally, \((\widetilde{\Phi }_{\nu })_{\nu \ge 0}\) is a Cauchy sequence in the norm \( | \cdot |_s \) because

$$\begin{aligned} | \widetilde{\Phi }_{\nu +m} - \widetilde{\Phi }_{\nu } |_{s}&\le \sum _{j =\nu }^{\nu +m-1} |\widetilde{\Phi }_{j + 1} - \widetilde{\Phi }_{j} |_{s} \nonumber \\&\mathop {\le _s}\limits ^{ (2.10)~} \sum _{j = \nu }^{\nu +m-1}\left( |\widetilde{\Phi }_j |_s |\Psi _{j + 1} |_{{\mathfrak {s}}_{0}} + |\widetilde{\Phi }_j |_{{\mathfrak {s}}_{0}} |\Psi _{j + 1} |_{s}\right) \nonumber \\&\mathop {\le _s}\limits ^{(4.32), (4.21), (4.31), (4.14)} \sum _{j \ge \nu } \left| \mathcal{R}_{0}\right| _{s + \beta } \gamma ^{-1} N_j^{-1} \le _s \left| \mathcal{R}_{0}\right| _{s + \beta } \gamma ^{-1} N_\nu ^{-1}. \nonumber \\ \end{aligned}$$
(4.33)

Hence \(\widetilde{\Phi }_{\nu } \mathop {\rightarrow }\limits ^{\left| \cdot \right| _ s} \Phi _{\infty } \). The bound for \( \Phi _\infty - I \) in (4.29) follows by (4.33) with \( m = \infty \), \( \nu = 0 \) and \( |\widetilde{\Phi }_0 - I |_s = \) \( |\Psi _0|_s \lessdot \gamma ^{-1} | \mathcal{R}_0|_{s+\beta } \). Then the estimate for \( \Phi _\infty ^{-1} - I \) follows by (2.13).

In the reversible case all the \(\Phi _\nu \) are reversibility preserving and so \(\widetilde{\Phi }_\nu \), \(\Phi _{\infty }\) are reversibility preserving. \(\square \)

Remark 4.2

In the Hamiltonian case, the transformation \(\widetilde{\Phi }_\nu \) in (4.28) is symplectic, because \(\Phi _\nu \) is symplectic for all \(\nu \) (see Remark 4.1). Therefore \(\Phi _\infty \) is also symplectic.

Let us define for all \(j \in {\mathbb {Z}}\)

$$\begin{aligned} \mu ^{\infty }_{j}(\lambda ) = \lim _{\nu \rightarrow + \infty } \widetilde{\mu }_{j}^{\nu }(\lambda ) = \tilde{\mu }_j^0 + r_j^{\infty }(\lambda ), \quad r_j^{\infty }(\lambda ) := \lim _{\nu \rightarrow + \infty } \tilde{r}_j^{\nu }(\lambda ) \quad \forall \lambda \in \Lambda . \end{aligned}$$

It could happen that \( \Lambda _{\nu _0}^\gamma = \emptyset \) (see (4.17)) for some \( \nu _0 \). In such a case the iterative process of Theorem 4.2 stops after finitely many steps. However, we can always set \( \widetilde{\mu }_{j}^{\nu } := \widetilde{\mu }_{j}^{\nu _0} \), \( \forall \nu \ge \nu _0 \), and the functions \( \mu ^{\infty }_{j} : \Lambda \rightarrow {\mathbb {R}}\) are always well defined.

Corollary 4.2

(Final eigenvalues) For all \( \nu \in {\mathbb {N}}\), \( j \in {\mathbb {Z}}\)

$$\begin{aligned} \begin{aligned} | { \mu }_{j}^{\infty } - {\widetilde{\mu }}^{\nu }_{j} |_\Lambda ^{\mathrm{{Lip}(\gamma )}}&= | r_{j}^{\infty } - {\widetilde{r} }^{\nu }_{j} |^{\mathrm{{Lip}(\gamma )}}_\Lambda \le C \left| \mathcal{R}_{0}\right| _{{\mathfrak {s}}_{0}+\beta }^{\mathrm{{Lip}(\gamma )}} N_{\nu -1}^{-\alpha } , \\ | { \mu }_{j}^{\infty } - {\widetilde{\mu }}^{0}_{j}|_\Lambda ^{\mathrm{{Lip}(\gamma )}}&= | r_j^{\infty } |_\Lambda ^{\mathrm{{Lip}(\gamma )}} \le C \left| \mathcal{R}_{0}\right| _{{\mathfrak {s}}_{0}+\beta }^{\mathrm{{Lip}(\gamma )}}. \end{aligned} \end{aligned}$$
(4.34)

Proof

The bounds in (4.34) follow from (4.22) and (4.19) by summing the telescopic series. \(\square \)
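For the reader's convenience, the telescoping estimate behind the first bound in (4.34) reads

$$\begin{aligned} | r_{j}^{\infty } - {\widetilde{r} }^{\nu }_{j} |^{\mathrm{{Lip}(\gamma )}}_\Lambda \le \sum _{m \ge \nu } | {\widetilde{r} }^{m+1}_{j} - {\widetilde{r} }^{m}_{j} |^{\mathrm{{Lip}(\gamma )}}_\Lambda \mathop {\le }\limits ^{(4.22)} \sum _{m \ge \nu } | \mathcal{R}_{m} |^{\mathrm{{Lip}(\gamma )}}_{{\mathfrak {s}}_0} \mathop {\le }\limits ^{(4.19)} \left| \mathcal{R}_{0}\right| _{{\mathfrak {s}}_{0}+\beta }^{\mathrm{{Lip}(\gamma )}} \sum _{m \ge \nu } N_{m - 1}^{-\alpha } \le C \left| \mathcal{R}_{0}\right| _{{\mathfrak {s}}_{0}+\beta }^{\mathrm{{Lip}(\gamma )}} N_{\nu -1}^{-\alpha } , \end{aligned}$$

since the \( N_m \) grow super-exponentially, so that \( \sum _{m \ge \nu } N_{m-1}^{-\alpha } \le C N_{\nu -1}^{-\alpha } \). The second bound in (4.34) is the case \( \nu = 0 \) (recall \( N_{-1} := 1 \)).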

Lemma 4.1

(Cantor set)

$$\begin{aligned} \Lambda _{\infty }^{2 \gamma } \subset \cap _{\nu \ge 0} \Lambda _{\nu }^\gamma . \end{aligned}$$
(4.35)

Proof

Let \( \lambda \in \Lambda _{\infty }^{2\gamma } \). By definition \( \Lambda _{\infty }^{2\gamma } \subset \Lambda _0^\gamma := \Lambda _o \). Then for all \( \nu > 0 \), \( | l | \le N_{\nu } \), \( j \ne k \)

$$\begin{aligned} \left| \mathrm{i} \omega \cdot l + { \mu }_{j}^{\nu } - { \mu }_k^{\nu }\right|&\ge \left| \mathrm{i} \omega \cdot l + \mu _j^{\infty } - \mu _k^{\infty }\right| - \left| { \mu }_j^{\nu } - \mu _j^{\infty }\right| - \left| {\mu }_k^{\nu } - \mu _k^{\infty }\right| \\&\mathop {\ge }\limits ^{(4.6),(4.34)} 2\gamma \left| j^{3} - k^{3}\right| \left\langle l\right\rangle ^{-\tau } - 2 C | \mathcal{R}_0|_{{\mathfrak {s}}_0+ \beta } N_{\nu -1}^{-\alpha } \\&\ge \gamma \left| j^{3} - k^{3}\right| \left\langle l\right\rangle ^{-\tau } \end{aligned}$$

because \( \gamma |j^{3} - k^{3} | \langle l \rangle ^{-\tau } \ge \gamma N_\nu ^{-\tau } \mathop {\ge }\limits ^{(4.14)}2 C | \mathcal{R}_0|_{{\mathfrak {s}}_0+ \beta } N_{\nu -1}^{-\alpha } \). \(\square \)

Lemma 4.2

For all \(\lambda \in \Lambda _{\infty }^{2\gamma } (u) \),

$$\begin{aligned} \mu _j^{\infty }(\lambda ) = \overline{\mu _{-j}^{\infty }(\lambda )}, \quad r_j^{\infty }(\lambda ) = \overline{r_{-j}^{\infty }(\lambda )}, \end{aligned}$$
(4.36)

and in the reversible case

$$\begin{aligned} \mu _j^{\infty }(\lambda ) = - \mu _{-j}^{\infty }(\lambda ), \quad r_j^{\infty }(\lambda ) = - r_{-j}^{\infty }(\lambda ). \end{aligned}$$
(4.37)

Actually in the reversible case \( \mu _j^{\infty }(\lambda ) \) are purely imaginary for all \( \lambda \in \Lambda \).

Proof

Formulas (4.36) and (4.37) follow because, for all \( \lambda \in \Lambda _\infty ^{2\gamma } \subseteq \cap _{\nu \ge 0} \Lambda _\nu ^\gamma \) (see (4.35)), we have \( \mu _j^{\nu } = \overline{\mu _{-j}^{\nu }} \), \( r_j^{\nu } = \overline{r_{-j}^{\nu }} \), and, in the reversible case, the \( \mu _j^{\nu } \) are purely imaginary and \( \mu _j^{\nu } = - \mu _{-j}^{\nu } \), \( r_j^{\nu } = - r_{-j}^{\nu } \). The final statement follows because, in the reversible case, the \( \mu _j^\nu (\lambda ) \in \mathrm{i} {\mathbb {R}}\), as do their extensions \( {\widetilde{\mu }}_j^\nu (\lambda ) \). \(\square \)

Remark 4.3

In the reversible case, (4.37) implies that \(\mu _0^\infty = r_0^\infty = 0\): indeed, taking \( j = 0 \) in (4.37) gives \( \mu _0^\infty = - \mu _0^\infty \) (recall that \( \mu _0^0 = 0 \), see (4.16)).

Proof of Theorem 4.1

We apply Theorem 4.2 to the linear operator \( \mathcal{L}_0 := \mathcal{L}_5\) in (3.55), where \( {\mathcal {R}}_0 = \mathcal{R }\) defined in (4.11) satisfies

$$\begin{aligned} \left| \mathcal{R}_{0}\right| _{{\mathfrak {s}}_{0} + \beta }^{\mathrm{{Lip}(\gamma )}} \mathop {\le }\limits ^{(3.68)}\varepsilon C({\mathfrak {s}}_0 + \beta ) \left( 1 + \Vert u \Vert _{{\mathfrak {s}}_{0} + \sigma + \beta }^{\mathrm{{Lip}(\gamma )}}\right) \mathop {\le }\limits ^{(4.2)}2 \varepsilon C({\mathfrak {s}}_0 + \beta ). \end{aligned}$$
(4.38)

Then the smallness condition (4.14) is implied by (4.3) taking \(\delta _0:= \delta _0(\nu )\) small enough.

For all \( \lambda \in \Lambda _\infty ^{2\gamma } \subset \cap _{\nu \ge 0} \Lambda _\nu ^\gamma \) (see (4.35)), the operators

$$\begin{aligned} \mathcal{L}_{\nu } \mathop {=}\limits ^{(4.15)} \omega \cdot \partial _{\varphi } + \mathcal{D}_{\nu } + \mathcal{R}_{\nu } \mathop {\longrightarrow }\limits ^{\left| \cdot \right| _{s}^{\mathrm{{Lip}(\gamma )}}}\omega \cdot \partial _{\varphi } + \mathcal{D}_{\infty } =: \mathcal{L}_{\infty } , \quad \mathcal{D}_{\infty } := \mathrm{diag}_{j \in {\mathbb {Z}}} \mu _j^{\infty }\nonumber \\ \end{aligned}$$
(4.39)

because

$$\begin{aligned} \left| \mathcal{D}_{\nu } - \mathcal{D}_{\infty }\right| _{s}^{\mathrm{{Lip}(\gamma )}}&= \sup _{j \in {\mathbb {Z}}} \left| {\mu }_{j}^{\nu } - \mu _{j}^{\infty }\right| ^{\mathrm{{Lip}(\gamma )}} \mathop {\le }\limits ^{(4.34)}C \left| \mathcal{R}_{0}\right| _{{\mathfrak {s}}_{0}+\beta }^{\mathrm{{Lip}(\gamma )}} N_{\nu - 1}^{- \alpha }, \\ \left| \mathcal{R}_{\nu }\right| _{s}^{\mathrm{{Lip}(\gamma )}}&\mathop {\le }\limits ^{(4.19)}\left| \mathcal{R}_{0}\right| _{s + \beta }^{\mathrm{{Lip}(\gamma )}} N_{\nu - 1}^{-\alpha } \, . \end{aligned}$$

Applying (4.20) iteratively we get \( \mathcal{L}_{\nu } = {{\widetilde{\Phi }}_{\nu -1}}^{-1} \mathcal{L}_0 {\widetilde{\Phi }}_{\nu -1} \), where \( {\widetilde{\Phi }}_{\nu -1} \) is defined by (4.28) and \( {\widetilde{\Phi }}_{\nu -1} \rightarrow {\Phi }_\infty \) in \( | \cdot |_s \) (Corollary 4.1). Passing to the limit we deduce (4.7). Moreover (4.34) and (4.38) imply (4.5). Then (4.29), (3.68) (applied to \( \mathcal{R}_0 = \mathcal{R} \)) imply (4.8).

Estimate (4.9) follows from (2.12) (in \( H^s_x ({\mathbb {T}}) \)), Lemma 2.4, and the bound (4.8).

In the reversible case, since \(\Phi _\infty \), \(\Phi _{\infty }^{-1}\) are reversibility preserving (see Corollary 4.1), and \({\mathcal {L}}_0\) is reversible (see Remark 3.12 and Lemma 3.2), we get that \({\mathcal {L}}_\infty \) is reversible too. The eigenvalues \( \mu _j^{\infty } \) are purely imaginary by Lemma 4.2.

In the Hamiltonian case, \( {\mathcal {L}}_0 \equiv {\mathcal {L}}_5 \) is Hamiltonian, \(\Phi _{\infty }\) is symplectic, and therefore \(\mathcal{L}_{\infty } = \Phi _{\infty }^{-1} {\mathcal {L}}_5 \Phi _{\infty }\) (see (4.7)) is Hamiltonian, namely \({\mathcal {D}}_\infty \) has the structure \( {\mathcal {D}}_\infty = \partial _x \mathcal {B} \), where \(\mathcal {B} = \mathrm {diag}_{j \ne 0} \{ b_j \}\) is self-adjoint. This means that \(b_j \in {\mathbb {R}}\), and therefore \(\mu _j^\infty = \mathrm{i} j b_j\) are all purely imaginary.

4.1 Proof of Theorem 4.2

Proof of \(\mathbf{({S}i)}_{0}\), \(i=1,\ldots ,4\). Properties (4.15)–(4.19) in \(\mathbf{({S}1)}_0\) hold by (4.10)–(4.11) with \( \mu _j^0 \) defined in (4.16) and \( r_j^0(\lambda ) = 0 \) (for (4.19) recall that \( N_{-1} := 1 \), see (4.12)). Moreover, since \(m_1\), \(m_3\) are real functions, the \(\mu _j^0\) are purely imaginary and satisfy \(\mu _j^0 = \overline{{\mu }_{-j}^0}\) and \(\mu _j^0 = - \mu _{-j}^0\). In the reversible case, Remark 3.12 implies that \({\mathcal {R}}_0 := {\mathcal {R}}\), \({\mathcal {L}}_0 := {\mathcal {L}}_5\) are reversible operators. Then there is nothing else to verify.

\(\mathbf{({S}2)}_0 \) holds extending from \( \Lambda ^\gamma _0 := \Lambda _o \) to \( \Lambda \) the eigenvalues \(\mu _{j}^0 (\lambda ) \), namely extending the functions \( m_1 (\lambda ) \), \( m_3 (\lambda ) \) to \( {\tilde{m}}_1 (\lambda ) \), \( {\tilde{m}}_3 (\lambda ) \), preserving the sup norm and the Lipschitz semi-norm, by the Kirszbraun theorem, see e.g. [37, Lemma A.2] or [32].

\(\mathbf{({S}3)}_0 \) follows by (3.67), for \(s = \mathfrak s_0 , {\mathfrak {s}}_0 + \beta \), and (4.2), (4.13).

\(\mathbf{({S}4)}_0 \) is trivial because, by definition, \(\Lambda _0^\gamma (u_1) = \Lambda _o = \Lambda _0^{\gamma -\rho }(u_2)\).

4.1.1 The reducibility step

We now describe the generic inductive step, showing how to define \( \mathcal{L}_{\nu +1 } \) (and \( \Phi _\nu \), \( \Psi _\nu \), etc.). To simplify notation, in this section we drop the index \( \nu \) and write \( + \) for \( \nu + 1\). We have

$$\begin{aligned} \mathcal{L} \Phi h&= \omega \cdot \partial _{\varphi } (\Phi (h)) + \mathcal{D} \Phi h + \mathcal{R} \Phi h \nonumber \\&= \omega \cdot \partial _{\varphi } h + \Psi \omega \cdot \partial _{\varphi } h + (\omega \cdot \partial _{\varphi } \Psi ) h + \mathcal{D} h + \mathcal{D} \Psi h + \mathcal{R} h + \mathcal{R} \Psi h \nonumber \\&= \Phi \left( \omega \cdot \partial _{\varphi } h + \mathcal{D} h\right) + \left( \omega \cdot \partial _{\varphi } \Psi + \left[ \mathcal{D}, \Psi \right] + \Pi _{N} \mathcal{R}\right) h + \left( \Pi _{N}^{\bot } \mathcal{R} + \mathcal{R} \Psi \right) h \nonumber \\ \end{aligned}$$
(4.40)

where \([\mathcal{D}, \Psi ] := \mathcal{D} \Psi - \Psi \mathcal{D} \) and \(\Pi _{N}\mathcal{R}\) is defined in (2.19).

Remark 4.4

The application of the smoothing operator \( \Pi _N \) is necessary since we are performing a differentiable Nash–Moser scheme. Note also that \( \Pi _N \) regularizes only in time (see (2.19)) because the loss of derivatives of the inverse operator is only in \( \varphi \) (see (4.44) and the bound on the small divisors (4.17)).

We look for a solution of the homological equation

$$\begin{aligned} \omega \cdot \partial _{\varphi } \Psi + \left[ \mathcal{D}, \Psi \right] + \Pi _{N} \mathcal{R} = [\mathcal{R}] \quad \mathrm{where} \quad [\mathcal{R}] := \mathrm{diag}_{j \in {\mathbb {Z}}} \mathcal{R}^{j}_{j}(0).\qquad \end{aligned}$$
(4.41)

Lemma 4.3

(Homological equation) For all \( \lambda \in {\Lambda }_{\nu +1}^{\gamma } \) (see (4.17)), there exists a unique solution \( \Psi := \Psi (\varphi ) \) of the homological equation (4.41). The map \( \Psi \) satisfies

$$\begin{aligned} \left| \Psi \right| _{s}^{\mathrm{{Lip}(\gamma )}} \le C N^{2\tau + 1} \gamma ^{-1} \left| \mathcal{R}\right| _{s}^{\mathrm{{Lip}(\gamma )}}. \end{aligned}$$
(4.42)

Moreover if \(\gamma / 2 \le \gamma _1, \gamma _2 \le 2\gamma \) and if \( u_1(\lambda )\), \( u_2(\lambda ) \) are Lipschitz functions, then \(\forall s \in [\mathfrak s_0, {\mathfrak {s}}_0 + \beta ] \), \(\lambda \in \Lambda _{\nu + 1}^{\gamma _1}(u_1) \cap \Lambda _{\nu + 1}^{\gamma _2}(u_2)\)

$$\begin{aligned} | \Delta _{12} \Psi |_s \le C N^{2\tau + 1}\gamma ^{-1}\left( |{\mathcal {R}}(u_2)|_s \Vert u_1 - u_2 \Vert _{{\mathfrak {s}}_0 + \sigma _2} + |\Delta _{12}{\mathcal {R}}|_s \right) \end{aligned}$$
(4.43)

where we define \( \Delta _{12} \Psi := \Psi ( u_1) -\Psi (u_2) \).

In the reversible case, \( \Psi \) is reversibility-preserving.

Proof

Since \(\mathcal{D} := \mathrm{diag}_{j \in {\mathbb {Z}}} (\mu _{j})\) we have \( [\mathcal{D}, \Psi ]_{j}^{k} = (\mu _j - \mu _k) \Psi _{j}^{k}(\varphi ) \) and (4.41) amounts to

$$\begin{aligned} \omega \cdot \partial _{\varphi } \Psi _{j}^{k}(\varphi ) + (\mu _{j} - \mu _{k}) \Psi _{j}^{k}(\varphi ) + \mathcal{R}_{j}^{k}(\varphi ) = [\mathcal{R}]_{j}^{k} , \quad \forall j, k \in {\mathbb {Z}}, \end{aligned}$$

whose solutions are \( \Psi _{j}^{k}(\varphi ) = \sum _{l \in {\mathbb {Z}}^\nu } \Psi _{j}^{k}(l) e^{\mathrm{i} l \cdot \varphi } \) with coefficients

$$\begin{aligned} {\Psi }_{j}^{k}(l) := {\left\{ \begin{array}{ll} \dfrac{\mathcal{R}_{j}^{k}(l)}{ \delta _{ljk}(\lambda ) } \quad \, &{}\text {if} \ (j-k, l) \ne (0,0) \ \ \text {and} \ \ |l | \le N , \\ &{}\text {where} \ \ \delta _{ljk}(\lambda ) := \mathrm{i} \omega \cdot l + \mu _j - \mu _k, \\ 0 &{} \text {otherwise.} \end{array}\right. } \end{aligned}$$
(4.44)

Note that, for all \( \lambda \in \Lambda _{\nu + 1}^{\gamma } \), by (4.17) and (1.2), the divisors satisfy \( \delta _{ljk}(\lambda ) \ne 0 \) if \( j \ne k \) or \( l \ne 0 \). Recalling the definition of the \( s \)-norm in (2.3), we deduce by (4.44), (4.17), (1.2), that

$$\begin{aligned} | \Psi |_s \le \gamma ^{-1} N^\tau | \mathcal{R} |_s , \quad \forall \lambda \in \Lambda _{\nu +1}^\gamma . \end{aligned}$$
(4.45)
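In more detail (a coefficient-wise sketch of (4.45)): for \( |l| \le N \) and \( j \ne k \), the small divisor bound defining \( \Lambda _{\nu +1}^{\gamma } \) in (4.17) gives

$$\begin{aligned} | \Psi _{j}^{k}(l) | = \frac{ | \mathcal{R}_{j}^{k}(l) | }{ | \delta _{ljk}(\lambda ) | } \le \frac{ \left\langle l\right\rangle ^{\tau } }{ \gamma \left| j^{3} - k^{3}\right| } \, | \mathcal{R}_{j}^{k}(l) | \le \gamma ^{-1} N^{\tau } | \mathcal{R}_{j}^{k}(l) | , \end{aligned}$$

using \( |j^{3} - k^{3}| \ge 1 \) and \( \left\langle l\right\rangle \le N \); the case \( j = k \), \( l \ne 0 \) is analogous, using the Diophantine condition (1.2) on \( \omega \). Then (4.45) follows from the definition (2.3) of the \( s \)-norm.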

For \( \lambda _{1}, \lambda _{2} \in \Lambda _{\nu + 1}^{\gamma } \),

$$\begin{aligned} |\Psi _{j}^{k}(l) (\lambda _{1}) - \Psi _{j}^{k}(l) (\lambda _{2})|&\le \frac{|\mathcal{R}_{j}^{k}(l) (\lambda _{1}) - \mathcal{R}_{j}^{k}(l) (\lambda _{2})|}{|\delta _{ljk}(\lambda _{1})|} \nonumber \\&+ |\mathcal{R}_{j}^{k}(l) (\lambda _{2})| \, \frac{|\delta _{ljk}(\lambda _{1}) - \delta _{ljk}(\lambda _{2})|}{|\delta _{ljk}(\lambda _{1})| |\delta _{ljk}(\lambda _{2})|} \end{aligned}$$
(4.46)

and, since \(\omega = \lambda \bar{\omega }\),

$$\begin{aligned} |\delta _{ljk}(\lambda _{1}) - \delta _{ljk}(\lambda _{2})|&\mathop {=}\limits ^{(4.44)} |(\lambda _{1}-\lambda _{2})\bar{\omega }\cdot l + (\mu _{j} - \mu _{k})(\lambda _{1}) - (\mu _{j}-\mu _{k})(\lambda _{2})| \nonumber \\\end{aligned}$$
(4.47)
$$\begin{aligned}&\mathop {\le }\limits ^{(4.16)} |\lambda _{1}-\lambda _{2}||\bar{\omega }\cdot l | + |m_3(\lambda _{1}) - m_3(\lambda _{2})| |j^{3}-k^{3} | \nonumber \\&+ |m_1(\lambda _{1}) - m_1(\lambda _{2})| | j-k | \nonumber \\&+ \, |r_{j}(\lambda _{1}) - r_{j}(\lambda _{2})| + |r_{k}(\lambda _{1}) - r_{k}(\lambda _{2})| \nonumber \\&\lessdot |\lambda _{1}-\lambda _{2}| \left( | l | + \varepsilon \gamma ^{-1} |j^{3}-k^{3} | + \varepsilon \gamma ^{-1} | j-k | + \varepsilon \gamma ^{-1} \right) \nonumber \\ \end{aligned}$$
(4.48)

because

$$\begin{aligned} \gamma |m_3|^\mathrm{lip}&= \gamma |m_3 - 1|^\mathrm{lip} \le |m_3 - 1|^{\mathrm{{Lip}(\gamma )}} \le \varepsilon C, \\ |m_1|^{\mathrm{{Lip}(\gamma )}}&\le \varepsilon C, \quad |r_{j} |^{\mathrm{{Lip}(\gamma )}} \le \varepsilon C \quad \forall j \in {\mathbb {Z}}. \end{aligned}$$

Hence, for \( j \ne k \) and \( \varepsilon \gamma ^{-1} \le 1 \),

$$\begin{aligned} \frac{|\delta _{ljk}(\lambda _{1}) \!-\! \delta _{ljk}(\lambda _{2})|}{|\delta _{ljk}(\lambda _{1})||\delta _{ljk}(\lambda _{2})|} \!\!\!&\mathop {\lessdot }\limits ^{~(4.48),(4.17)} \! |\lambda _{1}-\lambda _{2}| \left( | l | + |j^{3}-k^{3} | \right) \frac{\left\langle l\right\rangle ^{2\tau }}{\gamma ^{2}\left| j^{3}-k^{3}\right| ^{2}}\nonumber \\&\lessdot |\lambda _{1} - \lambda _{2}| N^{2\tau + 1} \gamma ^{-2} \end{aligned}$$
(4.49)

for \( |l| \!\le \! N \). Finally, recalling (2.3), the bounds (4.46), (4.49) and (4.45) imply (4.42). Now we prove (4.43). By (4.44), for any \( \lambda \!\in \! \Lambda _{\nu + 1}^{\gamma _1} (u_1) \cap \Lambda _{\nu + 1}^{\gamma _2} (u_2) \), \( l \!\in \! {\mathbb {Z}}^{\nu } \), \( j \!\ne \! k \), we get

$$\begin{aligned} \Delta _{12}\Psi _j^k(l) = \frac{\Delta _{12}{\mathcal {R}}_j^k(l)}{\delta _{ljk}(u_1)} - {\mathcal {R}}_j^k(l)(u_2) \frac{\Delta _{12}\delta _{ljk}}{\delta _{ljk}(u_1) \delta _{ljk}(u_2)} \end{aligned}$$
(4.50)

where

$$\begin{aligned} |\Delta _{12}\delta _{ljk}|&= |\Delta _{12}(\mu _j - \mu _k)| \le |\Delta _{12}m_3 | \, | j^{3} - k^{3}|\nonumber \\&+ |\Delta _{12}m_1|\, |j - k| + |\Delta _{12}r_j | + | \Delta _{12}r_k| \nonumber \\&\mathop {\lessdot }\limits ^{(3.64),(4.25) } \varepsilon |j^3 - k^3| \Vert u_1 - u_2 \Vert _{{\mathfrak {s}}_0 + \sigma _2}. \end{aligned}$$
(4.51)

Then (4.50), (4.51), \(\varepsilon \gamma ^{-1} \le 1\), \(\gamma _{1}^{-1}, \gamma _{2}^{-1} \le \gamma ^{-1}\) imply

$$\begin{aligned} |\Delta _{12}\Psi _j^k(l)| \lessdot N^{2\tau } \gamma ^{-1} \left( |\Delta _{12}{\mathcal {R}}_j^k (l)| + |{\mathcal {R}}_j^k (l)(u_2)| \Vert u_1 - u_2 \Vert _{{\mathfrak {s}}_0 + \sigma _2} \right) \end{aligned}$$

and so (4.43) follows (in fact, (4.43) holds with \(2\tau \) instead of \(2\tau +1\)).

In the reversible case \( \mathrm{i} \omega \cdot l + \mu _j - \mu _k \in \mathrm{i} {\mathbb {R}}\), \(\overline{{\mu }_{-j}} = \mu _j\) and \( \mu _{-j} = - \mu _j \). Hence Lemma 2.6 and (4.44) imply

$$\begin{aligned} \overline{ {\Psi }_{-j}^{-k}(-l)} = \frac{ \overline{\mathcal{R}_{-j}^{-k}(-l)}}{ {-\mathrm{i} \omega \cdot (-l) + \overline{{\mu }_{-j}} - \overline{{\mu }_{-k}}} } = \frac{\mathcal{R}_{j}^{k}(l)}{\mathrm{i} \omega \cdot l + \mu _{j} - \mu _k } = {\Psi }_j^k(l) \end{aligned}$$

and so \( \Psi \) is real, again by Lemma 2.6. Moreover, since \( \mathcal{R} : X \rightarrow Y \),

$$\begin{aligned} \Psi ^{-k}_{-j}(-l) = \frac{ \mathcal{R}^{-k}_{-j}(-l)}{\mathrm{i} \omega \cdot (-l) + \mu _{-j} - \mu _{-k} } = \frac{- \mathcal{R}^{k}_{j}(l)}{ \mathrm{i} \omega \cdot (-l) - \mu _{j} + \mu _{k} } = {\Psi }^{k}_{j}(l) \end{aligned}$$

which implies \( \Psi : X \rightarrow X \) by Lemma 2.6. Similarly we get \( \Psi : Y \rightarrow Y \). \(\square \)

Remark 4.5

In the Hamiltonian case \( \mathcal{R} \) is Hamiltonian and the solution \( \Psi \) in (4.44) of the homological equation is Hamiltonian, because \( \overline{ \delta _{l,j,k} } = \delta _{-l,k,j} \) and, in terms of matrix elements, an operator \(G(\varphi )\) is self-adjoint if and only if \( \overline{ G_j^k(l) } = G_k^j(-l) \).
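The identity \( \overline{ \delta _{l,j,k} } = \delta _{-l,k,j} \) used in the remark is a direct consequence of \( \omega \) being real and the eigenvalues being purely imaginary (see Remark 4.6): using \( \overline{\mu _j} = - \mu _j \),

$$\begin{aligned} \overline{ \delta _{l,j,k} } = \overline{ \mathrm{i} \omega \cdot l + \mu _j - \mu _k } = - \mathrm{i} \omega \cdot l - \mu _j + \mu _k = \mathrm{i} \omega \cdot (-l) + \mu _k - \mu _j = \delta _{-l,k,j} . \end{aligned}$$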

Let \( \Psi \) be the solution of the homological equation (4.41) which has been constructed in Lemma 4.3. By Lemma 2.3, if \( C({\mathfrak {s}}_0) | \Psi |_{{\mathfrak {s}}_0} < 1 /2 \) then \( \Phi := I + \Psi \) is invertible and by (4.40) (and (4.41)) we deduce that

$$\begin{aligned} \mathcal{L}_{+} := \Phi ^{-1} \mathcal{L} \Phi = \omega \cdot \partial _{\varphi } + \mathcal{D}_{+} + \mathcal{R}_{+} , \end{aligned}$$
(4.52)

where

$$\begin{aligned} \mathcal{D}_{+} := \mathcal{D} + [\mathcal{R}] , \quad \mathcal{R}_{+} := \Phi ^{-1} \left( \Pi _N^{\bot } \mathcal{R} + \mathcal{R} \Psi - \Psi [\mathcal{R}] \right) . \end{aligned}$$
(4.53)

Note that \({\mathcal {L}}_+\) has the same form as \( \mathcal{L} \), but the remainder \( {\mathcal {R}}_+ \) is the sum of a quadratic function of \( \Psi , \mathcal{R} \) and a remainder supported on the high modes.

Lemma 4.4

(New diagonal part) The eigenvalues of

$$\begin{aligned} \mathcal{D}_{+} = \mathrm{diag}_{j \in {\mathbb {Z}}} \{ \mu ^{+}_{j}(\lambda ) \}, \quad \text {where} \ \mu ^{+}_{j}&:= \mu _{j} + \mathcal{R}^{j}_{j}(0) = \mu _j^0 + r_j + {\mathcal {R}}_j^j (0) \\&= \mu _j^{0} + r_j^+, \quad r_j^+ := r_j + {\mathcal {R}}_j^j (0), \end{aligned}$$

satisfy \(\mu _{j}^{+} = \overline{{\mu }^{+}_{-j}}\) and

$$\begin{aligned} |\mu ^{+}_{j} - \mu _{j} |^\mathrm{lip} = |r^{+}_{j} - r_{j} |^\mathrm{lip} = |{\mathcal {R}}_j^j(0)|^\mathrm{lip}\le \left| \mathcal{R}\right| _{{\mathfrak {s}}_0}^\mathrm{lip},\quad \forall j \in {\mathbb {Z}}. \end{aligned}$$
(4.54)

Moreover if \( u_1 (\lambda )\), \(u_2 (\lambda )\) are Lipschitz functions, then for all \(\lambda \in \Lambda _{\nu }^{\gamma _1}(u_1) \cap \Lambda _{\nu }^{\gamma _2}(u_2)\)

$$\begin{aligned} |\Delta _{12} r_j^+ - \Delta _{12} r_j| \le |\Delta _{12}{\mathcal {R}}|_{{\mathfrak {s}}_0}. \end{aligned}$$
(4.55)

In the reversible case, all the \(\mu _j^{+}\) are purely imaginary and satisfy \(\mu ^{+}_{j} = - \mu ^{+}_{-j}\) for all \(j \in {\mathbb {Z}}\).

Proof

The estimates (4.54)–(4.55) follow using (2.4) because \( | \mathcal{R}^{j}_{j}(0) |^\mathrm{lip} = \) \( |\mathcal{R}^{(l,j)}_{(l,j)} |^\mathrm{lip}\le |\mathcal{R} |_0^\mathrm{lip} \le |\mathcal{R} |_{{\mathfrak {s}}_0}^\mathrm{lip} \) and

$$\begin{aligned} |\Delta _{12}r^{+}_{j} - \Delta _{12} r_{j} | = | \Delta _{12} \mathcal{R}^{j}_{j}(0) | = |\Delta _{12}\mathcal{R}^{(l,j)}_{(l,j)} | \le |\Delta _{12}\mathcal{R} |_0 \le |\Delta _{12}\mathcal{R} |_{{\mathfrak {s}}_0} . \end{aligned}$$

Since \( \mathcal{R} \) is real, by Lemma 2.6,

$$\begin{aligned} \mathcal{R}^{k}_{j}(l)\,=\, \overline{\mathcal{R}^{-k}_{-j}(-l)} \Longrightarrow {\mathcal {R}}_{j}^{j}(0) = \overline{{{\mathcal {R}}}_{-j}^{-j}(0)} \end{aligned}$$

and so \( \mu _{j}^+ = \overline{{\mu }_{-j}^{+}} \). If \({\mathcal {R}}\) is also reversible, by Lemma 2.6,

$$\begin{aligned} \mathcal{R}^{k}_{j}(l) = - \mathcal{R}^{-k}_{-j}(-l) , \quad \mathcal{R}^{k}_{j}(l) = \overline{\mathcal{R}^{-k}_{-j}(-l)} = - \overline{{ \mathcal{R}^{k}_{j}(l)}}. \end{aligned}$$

We deduce that \( \mathcal{R}^{j}_{j}(0) = - \mathcal{R}^{-j}_{-j}(0) \), \( \mathcal{R}^{j}_{j}(0) \in \mathrm{i} {\mathbb {R}}\) and therefore, \( \mu ^{+}_{j} = - \mu ^{+}_{-j} \) and \(\mu _{j}^{+} \in \mathrm{i} {\mathbb {R}}\). \(\square \)

Remark 4.6

In the Hamiltonian case, \({\mathcal {D}}_\nu \) is Hamiltonian, namely \( {\mathcal {D}}_\nu = \partial _x \mathcal {B} \) where \(\mathcal {B} = \mathrm {diag}_{j \ne 0} \{ b_j \}\) is self-adjoint. This means that \(b_j \in {\mathbb {R}}\), and therefore all \(\mu _j^\nu = \mathrm{i} j b_j \) are purely imaginary.

4.1.2 The iteration

Let \(\nu \ge 0\), and suppose that the statements \(\mathbf{({S}i)_{\nu }}\) hold. We prove \((\mathbf{Si})_{\nu +1}\), \(i=1,\ldots ,4\). To simplify notation we write \(|\cdot |_s\) instead of \(|\cdot |_s^{\mathrm{{Lip}(\gamma )}}\).

Proof of \((\mathbf{S1})_{\nu + 1}\). By \(\mathbf{(S1)_\nu } \), the eigenvalues \(\mu _j^\nu \) are defined on \(\Lambda _\nu ^\gamma \). Therefore the set \(\Lambda _{\nu +1}^\gamma \) is well-defined. By Lemma 4.3, for all \( \lambda \in \Lambda _{\nu +1}^{\gamma } \) there exists a real solution \( \Psi _{\nu } \) of the homological equation (4.41) which satisfies, \( \forall s \in [{\mathfrak {s}}_0, q- \sigma - \beta ] \),

$$\begin{aligned} \left| \Psi _{\nu }\right| _{s} \mathop {\lessdot }\limits ^{(4.42)} N_{\nu }^{2\tau + 1}\left| \mathcal{R}_{\nu }\right| _{s} \gamma ^{-1} \mathop {\lessdot }\limits ^{(4.19)} \left| \mathcal{R}_{0}\right| _{s + \beta } \gamma ^{-1} N_{\nu }^{2\tau + 1}\ N_{\nu -1}^{- \alpha } \end{aligned}$$
(4.56)

which is (4.21) at the step \( \nu +1 \). In particular, for \( s = {\mathfrak {s}}_0 \),

$$\begin{aligned} C({\mathfrak {s}}_0) \left| \Psi _{\nu }\right| _{{\mathfrak {s}}_0} \mathop {\le }\limits ^{(4.56)}C({\mathfrak {s}}_0) \left| \mathcal{R}_{0}\right| _{{\mathfrak {s}}_0 + \beta } \gamma ^{-1} N_{\nu }^{2\tau + 1}\ N_{\nu -1}^{- \alpha } \mathop {\le }\limits ^{(4.14)}1/2 \end{aligned}$$
(4.57)

for \( N_0 \) large enough. Then the map \( \Phi _{\nu } := I + \Psi _\nu \) is invertible and, by (2.13),

$$\begin{aligned} \left| \Phi _{\nu }^{-1}\right| _{{\mathfrak {s}}_{0}} \le 2 \, , \quad \left| \Phi _{\nu }^{-1}\right| _{s} \le 1 + C(s) | \Psi _\nu |_s. \end{aligned}$$
(4.58)

Hence (4.52)–(4.53) imply \( \mathcal{L}_{\nu + 1} := \) \( \Phi _{\nu }^{-1} \mathcal{L}_{\nu } \Phi _{\nu } = \) \( \omega \cdot \partial _{\varphi } + \mathcal{D}_{\nu + 1} + \mathcal{R}_{\nu + 1} \) where (see Lemma 4.4)

$$\begin{aligned} \mathcal{D}_{\nu + 1} := \mathcal{D}_{\nu } + [\mathcal{R}_{\nu }] = \mathrm{diag}_{j \in {\mathbb {Z}}} (\mu _j^{\nu + 1}) , \quad \mu _j^{\nu + 1} := \mu _j^{\nu } + (\mathcal{R}_{\nu })_{j}^{j}(0) , \end{aligned}$$
(4.59)

with \(\mu _j^{\nu + 1} = \overline{\mu _{-j}^{\nu + 1}} \) and

$$\begin{aligned} \mathcal{R}_{\nu +1} := \Phi _\nu ^{-1} H_{\nu },\quad H_{\nu }:= \Pi _{N_\nu }^{\bot } \mathcal{R}_\nu + \mathcal{R}_\nu \Psi _\nu - \Psi _\nu [\mathcal{R}_\nu ] . \end{aligned}$$
(4.60)

In the reversible case, \({\mathcal {R}}_\nu : X \rightarrow Y\), therefore, by Lemma 4.3, \(\Psi _\nu \), \(\Phi _\nu \), \(\Phi _{\nu }^{-1}\) are reversibility preserving, and then, by formula (4.60), also \({\mathcal {R}}_{\nu + 1} : X \rightarrow Y\).

Let us prove the estimates (4.19) for \( \mathcal{R}_{\nu + 1} \). For all \( s \in [{\mathfrak {s}}_0, q - \sigma - \beta ] \) we have

$$\begin{aligned} |\mathcal{R}_{\nu + 1} |_{s}&\mathop {\le _s}\limits ^{(4.60),(2.10)} | \Phi _\nu ^{-1} |_{{\mathfrak {s}}_{0}} \left( |\Pi _{N_\nu }^\bot \mathcal{R}_\nu |_s + |\mathcal{R}_\nu |_s |\Psi _\nu |_{{\mathfrak {s}}_{0}} + |\mathcal{R}_\nu |_{{\mathfrak {s}}_{0}} |\Psi _\nu |_s \right) \nonumber \\&+ | \Phi _\nu ^{-1} |_{s} \left( |\Pi _{N_\nu }^\bot \mathcal{R}_\nu |_{{\mathfrak {s}}_{0}} + |\mathcal{R}_\nu |_{{\mathfrak {s}}_{0}} |\Psi _\nu |_{{\mathfrak {s}}_{0}} \right) \nonumber \\&\mathop {\le _s}\limits ^{(4.58)} 2 \left( |\Pi _{N_\nu }^\bot \mathcal{R}_\nu |_s + |\mathcal{R}_\nu |_s |\Psi _\nu |_{{\mathfrak {s}}_{0}} + |\mathcal{R}_\nu |_{{\mathfrak {s}}_{0}} |\Psi _\nu |_s \right) \nonumber \\&+ (1 + | \Psi _\nu |_s) \left( |\Pi _{N_\nu }^\bot \mathcal{R}_\nu |_{{\mathfrak {s}}_{0}} + |\mathcal{R}_\nu |_{{\mathfrak {s}}_{0}} |\Psi _\nu |_{{\mathfrak {s}}_{0}} \right) \nonumber \\&\mathop {\le _s}\limits ^{(4.57)} |\Pi _{N_\nu }^\bot \mathcal{R}_\nu |_s + |\mathcal{R}_\nu |_s |\Psi _\nu |_{{\mathfrak {s}}_{0}} + |\mathcal{R}_\nu |_{{\mathfrak {s}}_{0}} |\Psi _\nu |_s \nonumber \\&\mathop {\le _s}\limits ^{(4.42)} |\Pi _{N_\nu }^\bot \mathcal{R}_\nu |_s + N_\nu ^{2\tau +1} \gamma ^{-1} |\mathcal{R}_\nu |_s | \mathcal{R}_\nu |_{{\mathfrak {s}}_{0}}. \end{aligned}$$
(4.61)

Hence (4.61) and (2.20) imply

$$\begin{aligned} |\mathcal{R}_{\nu + 1} |_{s} \le _{s} N_\nu ^{-\beta } | \mathcal{R}_\nu |_{s+\beta } + N_\nu ^{2\tau +1} \gamma ^{-1} |\mathcal{R}_\nu |_s | \mathcal{R}_\nu |_{{\mathfrak {s}}_{0}} \end{aligned}$$
(4.62)

which shows that the iterative scheme is quadratic plus a super-exponentially small term. In particular

$$\begin{aligned} |\mathcal{R}_{\nu + 1} |_{s} \!\!&\mathop {\le _s}\limits ^{(4.62),(4.19)} N_\nu ^{-\beta } |\mathcal{R}_0|_{s+\beta } N_{\nu -1} + N_\nu ^{2\tau +1} \gamma ^{-1} |\mathcal{R}_0|_{s+\beta } |\mathcal{R}_0|_{{\mathfrak {s}}_{0}+\beta } N_{\nu -1}^{- 2\alpha }\\&\mathop {\le }\limits ^{(4.1),(4.13),(4.14)} |\mathcal{R}_0|_{s+\beta } N_{\nu }^{-\alpha } \end{aligned}$$

(\( \chi = 3 / 2 \)) which is the first inequality of (4.19) at the step \( \nu +1 \). The next key step is to control the divergence of the high norm \( | \mathcal{R}_{\nu +1} |_{s+\beta } \). By (4.61) (with \( s + \beta \) instead of \( s \)) we get

$$\begin{aligned} |\mathcal{R}_{\nu + 1} |_{s+\beta } \, {\le _{s+\beta }} \, | \mathcal{R}_\nu |_{s+ \beta } + N_\nu ^{2\tau +1} \gamma ^{-1} |\mathcal{R}_\nu |_{s+\beta }| \mathcal{R}_\nu |_{{\mathfrak {s}}_{0}} \end{aligned}$$
(4.63)

(the difference with respect to (4.62) is that here we do not apply any smoothing to \( | \Pi _{N_\nu }^\bot \mathcal{R}_{\nu } |_{s+\beta } \)). Then (4.63), (4.19), (4.14), (4.13) imply the inequality

$$\begin{aligned} |\mathcal{R}_{\nu + 1} |_{s+\beta } \le C(s+\beta ) | \mathcal{R}_\nu |_{s+ \beta }, \end{aligned}$$

whence, iterating,

$$\begin{aligned} |\mathcal{R}_{\nu + 1} |_{s+\beta } \le N_{\nu } |\mathcal{R}_0 |_{s+ \beta } \end{aligned}$$

for \( N_0 := N_0 (s,\beta ) \) large enough, which is the second inequality of (4.19) with index \( \nu +1 \).

By Lemma 4.4 the eigenvalues \( \mu _j^{\nu + 1} := \mu _j^0 + r_j^{\nu + 1} \), defined on \( \Lambda _{\nu +1}^{\gamma } \), satisfy \(\mu _j^{\nu + 1} = \overline{{\mu }_{-j}^{\nu + 1}} \), and, in the reversible case, the \(\mu _{j}^{\nu + 1}\) are purely imaginary and \(\mu _j^{\nu + 1} = - \mu _{-j}^{\nu + 1}\).

It remains only to prove (4.18) at the step \(\nu +1\).

Proof of \(\mathbf{({S}2)}_{\nu + 1} \). By (4.54),

$$\begin{aligned} |\mu _j^{\nu + 1} - \mu _j^{\nu } |^{\mathrm{{Lip}(\gamma )}} = |r_j^{\nu + 1} - r_j^{\nu } |^{\mathrm{{Lip}(\gamma )}} \le |\mathcal{R}_{\nu } |_{{\mathfrak {s}}_{0}}^{\mathrm{{Lip}(\gamma )}} \mathop {\le }\limits ^{(4.19)}\left| \mathcal{R}_{0}\right| ^{\mathrm{{Lip}(\gamma )}}_{{\mathfrak {s}}_{0} + \beta }N_{\nu - 1}^{- \alpha }.\quad \quad \end{aligned}$$
(4.64)

By the Kirszbraun theorem, we extend the function \( \mu _j^{\nu + 1} - \mu _j^{\nu } = r_j^{\nu + 1} - r_j^{\nu } \) to the whole \( \Lambda \), still satisfying (4.64). In this way we define \( \tilde{\mu }_j^{\nu + 1}\). Finally (4.18) follows by summing the bounds (4.64) over all the steps and using (3.68).

Proof of \(\mathbf{({S}3)}_{\nu + 1} \). Set, for brevity,

$$\begin{aligned} {\mathcal {R}}_{\nu }^{i}&:= {\mathcal {R}}_{\nu }(u_i),\quad \Psi _{\nu - 1}^i := \Psi _{\nu - 1}(u_i),\quad \Phi _{\nu - 1}^{i} := \Phi _{\nu - 1}(u_i), \\ H_{\nu - 1}^i&:= H_{\nu - 1} (u_i) , \quad i = 1, 2 , \end{aligned}$$

which are all operators defined for \(\lambda \in \Lambda _{\nu }^{\gamma _1}(u_1) \cap \Lambda _{\nu }^{\gamma _2}(u_2) \). By Lemma 4.3 one can construct \(\Psi _{\nu }^{i}:= \Psi _{\nu }(u_i)\), \(\Phi _{\nu }^i := \Phi _{\nu }(u_i)\), \(i = 1, 2\), for all \(\lambda \in \Lambda _{\nu + 1}^{\gamma _1}(u_1) \cap \Lambda _{\nu + 1}^{\gamma _2}(u_2)\). One has

$$\begin{aligned} \vert \Delta _{12} \Psi _{\nu } \vert _{{\mathfrak {s}}_0}&\mathop {\lessdot }\limits ^{(4.43)} N_\nu ^{2\tau +1} \gamma ^{-1} \left( |{\mathcal {R}}_\nu (u_2)|_{{\mathfrak {s}}_0} \Vert u_2 - u_1 \Vert _{{\mathfrak {s}}_0 + \sigma _2} + |\Delta _{12} {\mathcal {R}}_\nu |_{{\mathfrak {s}}_0} \right) \nonumber \\&\mathop {\lessdot }\limits ^{(4.19),(4.23)} N_\nu ^{2\tau +1} N_{\nu -1}^{-\alpha } \gamma ^{-1} \left( |{\mathcal {R}}_0|_{\mathfrak s_0+\beta } + \varepsilon \right) \Vert u_2 - u_1 \Vert _{{\mathfrak {s}}_0 + \sigma _2} \nonumber \\&\mathop {\lessdot }\limits ^{(3.68),(4.2)} N_\nu ^{2\tau +1} N_{\nu -1}^{-\alpha } \varepsilon \gamma ^{-1} \Vert u_2 - u_1 \Vert _{{\mathfrak {s}}_0 + \sigma _2} \le \Vert u_2 - u_1 \Vert _{{\mathfrak {s}}_0 + \sigma _2}. \quad \quad \end{aligned}$$
(4.65)

for \( \varepsilon \gamma ^{-1} \) small enough, using also (4.13). By (2.14), applied to \(\Phi := \Phi _{\nu }\), and (4.65), we get

$$\begin{aligned} \vert \Delta _{12} \Phi _{\nu }^{-1} \vert _{s} \le _s \left( \vert \Psi _{\nu }^{1} \vert _s + \vert \Psi _{\nu }^{2} \vert _{s} \right) \Vert u_1 - u_2 \Vert _{{\mathfrak {s}}_0 + \sigma _2} + \vert \Delta _{12} \Psi _{\nu } \vert _{s} \end{aligned}$$
(4.66)

which, for \(s = {\mathfrak {s}}_0\), using (4.21), (4.14), (4.65), implies

$$\begin{aligned} \vert \Delta _{12} \Phi _{\nu }^{-1} \vert _{{\mathfrak {s}}_0} \lessdot \Vert u_1 - u_2 \Vert _{{\mathfrak {s}}_0 + \sigma _2}. \end{aligned}$$
(4.67)

Let us prove the estimates (4.23) for \(\Delta _{12}{\mathcal {R}}_{\nu + 1}\), which is defined on \(\lambda \in \Lambda _{\nu + 1}^{\gamma _1}(u_1) \cap \Lambda _{\nu + 1}^{\gamma _2}(u_2)\). For all \(s \in [ {{\mathfrak {s}}}_{0}, {\mathfrak s}_{0}+\beta ]\), using the interpolation (2.7) and (4.60),

$$\begin{aligned} |\Delta _{12}\mathcal{R}_{\nu + 1} |_{s} \!&\mathop {\le _{s}}\limits ^{} \! |\Delta _{12}\Phi _{\nu }^{-1} |_{s} |H_{\nu }^{1} |_{{\mathfrak {s}}_{0}} + \vert \Delta _{12}\Phi _{\nu }^{-1}\vert _{{\mathfrak {s}}_{0}} \vert H_{\nu }^{1} \vert _{s} \nonumber \\&+ |(\Phi _{\nu }^2 )^{-1}|_s |\Delta _{12}H_{\nu } |_{{\mathfrak {s}}_0} + |(\Phi _{\nu }^2 )^{-1} |_{{\mathfrak {s}}_{0}} |\Delta _{12}H_\nu |_s. \end{aligned}$$
(4.68)

We estimate the above terms separately. Set for brevity \( A^\nu _{s} := | {\mathcal {R}}_\nu (u_1) |_s + | {\mathcal {R}}_\nu (u_2) |_s \). By (4.60) and (2.7),

$$\begin{aligned} | \Delta _{12}H_{\nu }|_s&\le _s \left| \Pi _{N_{\nu }}^{\bot }\Delta _{12}\mathcal{R}_{\nu }\right| _{s} + |\Delta _{12}\Psi _{\nu }|_{s} |\mathcal{R}_{\nu }^1 |_{{\mathfrak {s}}_{0}} + |\Delta _{12}\Psi _{\nu } |_{{\mathfrak {s}}_{0}} |\mathcal{R}_{\nu }^1|_{s}\nonumber \\&+ |\Psi _{\nu }^2 |_{s} |\Delta _{12}\mathcal{R}_{\nu } |_{{\mathfrak {s}}_{0}} + |\Psi _{\nu }^2 |_{{\mathfrak {s}}_{0}} |\Delta _{12}\mathcal{R}_{\nu } |_{s} \nonumber \\&\mathop {\le _{s}}\limits ^{(4.42),(4.43)} \left| \Pi _{N_{\nu }}^{\bot }\Delta _{12}\mathcal{R}_{\nu }\right| _{s} + N_{\nu }^{2\tau +1}\gamma ^{-1} A^\nu _{{\mathfrak {s}}_0} A^\nu _s \Vert u_1 - u_2\Vert _{{\mathfrak {s}}_{0}+\sigma _2} \nonumber \\&\, + \, N_{\nu }^{2\tau +1}\gamma ^{-1} A^\nu _{s} \vert \Delta _{12}\mathcal{R}_{\nu } \vert _{{\mathfrak {s}}_{0}} + N_{\nu }^{2\tau +1}\gamma ^{-1} A^\nu _{{\mathfrak {s}}_0} \vert \Delta _{12}\mathcal{R}_{\nu } \vert _s.\quad \quad \quad \quad \end{aligned}$$
(4.69)

Estimating the four terms on the right-hand side of (4.68) in the same way, using (4.66), (4.60), (4.42), (4.43), (4.21), (4.67), (4.58), (4.69), (4.19), we deduce

$$\begin{aligned} \vert \Delta _{12}\mathcal{R}_{\nu +1} \vert _{s}&{\le _{s}} |\Pi _{N_{\nu }}^{\bot } \Delta _{12}{\mathcal {R}}_{\nu }|_s + N_{\nu }^{2\tau + 1} \gamma ^{-1} A_{s}^{\nu } A_{{\mathfrak {s}}_0}^{\nu } \Vert u_1 - u_2 \Vert _{{\mathfrak {s}}_0 + \sigma _2} \nonumber \\&+ N_{\nu }^{2\tau + 1}\gamma ^{-1} A_{s}^{\nu } |\Delta _{12}{\mathcal {R}}_{\nu }|_{{\mathfrak {s}}_0} + N_{\nu }^{2\tau + 1} \gamma ^{-1} A_{{\mathfrak {s}}_0}^{\nu } |\Delta _{12}{\mathcal {R}}_{\nu }|_s .\quad \quad \quad \end{aligned}$$
(4.70)

Specializing (4.70) for \( s = {\mathfrak {s}}_0 \) and using (3.68), (2.20), (4.19), (4.23), we deduce

$$\begin{aligned} \vert \Delta _{12}\mathcal{R}_{\nu + 1}\vert _{{\mathfrak {s}}_{0}}&\le C ( \varepsilon N_{\nu - 1}N_{\nu }^{-\beta } + N_{\nu }^{2\tau + 1}N_{\nu - 1}^{-2\alpha }\varepsilon ^{2}\gamma ^{-1} ) \Vert u_1 - u_2 \Vert _{\mathfrak s_{0}+\sigma _2}\\&\le \varepsilon N_{\nu }^{-\alpha } \Vert u_1 - u_2 \Vert _{{\mathfrak {s}}_{0}+\sigma _2} \end{aligned}$$

for \(N_{0}\) large and \(\varepsilon \gamma ^{-1}\) small. Next by (4.70) with \( s = {\mathfrak {s}}_0 + \beta \)

$$\begin{aligned} \vert \Delta _{12}\mathcal{R}_{\nu + 1} \vert _{{\mathfrak {s}}_{0}+\beta }&\mathop {\le _{{\mathfrak {s}}_{0}+\beta }}\limits ^{(4.19),(4.23),(4.14)} A_{{\mathfrak {s}}_0 + \beta }^{\nu } \Vert u_1 - u_2 \Vert _{{\mathfrak {s}}_{0}+\sigma _2} + \vert \Delta _{12}\mathcal{R}_{\nu }\vert _{{\mathfrak {s}}_{0}+\beta } \\&\mathop {\le }\limits ^{(4.19),(4.23)} C({\mathfrak {s}}_{0}+\beta ) \varepsilon N_{\nu - 1} \Vert u_1 - u_2 \Vert _{{\mathfrak {s}}_{0}+\sigma _2} \\&\le \varepsilon N_{\nu } \Vert u_1 - u_2 \Vert _{{\mathfrak {s}}_{0}+\sigma _2} \end{aligned}$$

for \(N_{0}\) large enough. Finally note that (4.24) is nothing but (4.55).

Proof of \(\mathbf{({S}4)}_{\nu + 1} \). We have to prove that, if \(C \varepsilon N_{\nu }^{\tau } \Vert u_1 - u_2 \Vert _{{\mathfrak {s}}_0 + \sigma _2} \le \rho \), then

$$\begin{aligned} \lambda \in \Lambda _{\nu +1}^{\gamma }(u_1) \Longrightarrow \lambda \in \Lambda _{\nu +1}^{\gamma - \rho }(u_2). \end{aligned}$$

Let \( \lambda \in \Lambda _{\nu +1}^{\gamma }(u_1) \). Definition (4.17) and \(\mathbf{({S}4)_{\nu }}\) (see (4.26)) imply that \( \Lambda _{\nu +1}^{\gamma }(u_1) \subseteq \Lambda _{\nu }^{\gamma }(u_1) \subseteq \Lambda _{\nu }^{\gamma - \rho }(u_2) \). Hence \( \lambda \in \Lambda _{\nu }^{\gamma - \rho }(u_2) \subset \Lambda _{\nu }^{\gamma /2}(u_2) \). Then, by \(\mathbf{({S}1)_{\nu }}\), the eigenvalues \(\mu _{j}^{\nu }(\lambda , u_2(\lambda ))\) are well defined. Now (4.16) and the estimates (3.64), (4.25) (which holds because \( \lambda \in \Lambda _{\nu }^{\gamma }(u_1) \cap \Lambda _{\nu }^{\gamma /2}(u_2) \)) imply that

$$\begin{aligned}&|(\mu _j^{\nu } - \mu _k^{\nu })(\lambda , u_2(\lambda )) - (\mu _j^{\nu } - \mu _k^{\nu })(\lambda , u_1(\lambda ))| \nonumber \\&\quad \le |(\mu _j^{0} - \mu _k^{0})(\lambda , u_2(\lambda )) - (\mu _j^{0} - \mu _k^{0})(\lambda , u_1(\lambda ))|\nonumber \\&\quad \quad + \, 2 \sup _{j \in {\mathbb {Z}}} |r_{j}^{\nu }(\lambda , u_2(\lambda )) - r_{j}^{\nu }(\lambda , u_1(\lambda ))| \nonumber \\&\quad \le \varepsilon C|j^3 - k^3| \Vert u_2 - u_1 \Vert _{{\mathfrak {s}}_0 + \sigma _2}^\mathrm{sup}. \end{aligned}$$
(4.71)

Then, using the definition of \(\Lambda _{\nu +1}^\gamma (u_1)\) (which is (4.17) with \(\nu +1\) instead of \(\nu \)) and (4.71), we conclude that for all \(|l| \le N_{\nu }\), \(j \ne k\),

$$\begin{aligned} |\mathrm{i} \omega \cdot l + \mu _j^{\nu } (u_2) - \mu _k^{\nu } (u_2) |&\ge |\mathrm{i} \omega \cdot l + \mu _j^{\nu } (u_1) - \mu _k^{\nu } (u_1) | - |(\mu _j^{\nu } - \mu _k^{\nu })(u_2)\\&- (\mu _j^{\nu } - \mu _k^{\nu })(u_1) |\\&\ge \gamma |j^3 - k^3| \langle l \rangle ^{-\tau } - C \varepsilon |j^3 - k^3| \Vert u_1 - u_2 \Vert _{{\mathfrak {s}}_0 + \sigma _2} \\&\ge (\gamma - \rho )|j^3 - k^3| \langle l \rangle ^{-\tau } \end{aligned}$$

provided \(C \varepsilon N_{\nu }^{\tau } \Vert u_1 - u_2 \Vert _{{\mathfrak {s}}_0 + \sigma _2} \le \rho \). Hence \( \lambda \in \Lambda ^{\gamma - \rho }_{\nu +1} ( u_2 ) \). This proves (4.26) at the step \( \nu + 1 \).

4.2 Inversion of \(\mathcal{L}(u)\)

In (3.57) we have conjugated the linearized operator \( {\mathcal {L}}\) to \({\mathcal {L}}_5\) defined in (3.55), namely \({\mathcal {L}}= \Phi _1 {\mathcal {L}}_5 \Phi _2^{-1}\). In Theorem 4.1 we have conjugated the operator \({\mathcal {L}}_5\) to the diagonal operator \({\mathcal {L}}_{\infty }\) in (4.7), namely \( {\mathcal {L}}_5 = \Phi _{\infty } {\mathcal {L}}_{\infty } \Phi _{\infty }^{-1}\). As a consequence

$$\begin{aligned} {\mathcal {L}}= W_1 {\mathcal {L}}_\infty W_2^{-1}, \quad W_i := \Phi _{i} \Phi _{\infty }, \quad \Phi _1 := \mathcal{A} B \rho \mathcal{M} \mathcal{T} {\mathcal {S}}, \quad \Phi _2 := \mathcal{A} B \mathcal{M} \mathcal{T} {\mathcal {S}}.\nonumber \\ \end{aligned}$$
(4.72)

We first prove that \(W_1, W_2 \) and their inverses are linear bijections of \(H^{s}\). We take

$$\begin{aligned} \gamma \le \gamma _0 / 2 , \quad \tau \ge \tau _0. \end{aligned}$$
(4.73)

Lemma 4.5

Let \( {\mathfrak {s}}_{0} \le s \le q - \sigma - \beta -3 \), where \( \beta \) is defined in (4.1) and \( \sigma \) in (3.58). Let \(u:= u(\lambda )\) satisfy \( \Vert u \Vert _{{\mathfrak {s}}_0 + \sigma + \beta + 3}^{\mathrm{{Lip}(\gamma )}} \le 1 \), and let \( \varepsilon \gamma ^{-1} \le \delta \) with \( \delta \) small enough. Then \( W_i \), \( i = 1, 2 \), satisfy, \( \forall \lambda \in \Lambda _{\infty }^{2\gamma }(u) \),

$$\begin{aligned} \left\| W_i h\right\| _{s} + \left\| W_i^{-1}h\right\| _{s}&\le C(s) \left( \left\| h\right\| _{s } + \left\| u\right\| _{s + \sigma + \beta }\left\| h\right\| _{{\mathfrak {s}}_{0}} \right) ,\end{aligned}$$
(4.74)
$$\begin{aligned} \left\| W_i h\right\| _{s}^{\mathrm{{Lip}(\gamma )}} \!+\! \left\| W_i^{-1}h\right\| _{s}^{\mathrm{{Lip}(\gamma )}} \!&\le \! C(s) \left( \left\| h\right\| _{s + 3}^{\mathrm{{Lip}(\gamma )}} \!+\! \left\| u\right\| ^{\mathrm{{Lip}(\gamma )}}_{s + \sigma + \beta + 3}\left\| h\right\| _{{\mathfrak {s}}_{0}+3}^{\mathrm{{Lip}(\gamma )}} \right) .\quad \quad \quad \end{aligned}$$
(4.75)

In the reversible case (i.e. (1.13) holds), \( W_i \), \( W_i^{-1} \), \( i = 1, 2 \) are reversibility-preserving.

Proof

The bound (4.74), resp. (4.75), follows by (4.8), (3.60), resp. (3.62), (2.12) and Lemma 6.5. In the reversible case \( W_i^{\pm 1} \) are reversibility preserving because \( \Phi _i^{\pm 1} \), \( \Phi _\infty ^{\pm 1} \) are reversibility preserving. \(\square \)

By (4.72) we are reduced to show that, \( \forall \lambda \in \Lambda ^{2\gamma }_{\infty }(u) \), the operator

$$\begin{aligned} \mathcal{L}_\infty := \mathrm{diag}_{(l,j) \in {\mathbb {Z}}^{\nu } \times {\mathbb {Z}}} \{\mathrm{i} \lambda \bar{\omega }\cdot l + \mu _j^\infty (\lambda )\} , \quad \mu _j^\infty (\lambda ) = -\mathrm{i} \left( m_3 (\lambda ) j^3 - m_1(\lambda ) j \right) + r_j^\infty (\lambda ) \end{aligned}$$

is invertible, assuming (1.8) or the reversibility condition (1.13).

We introduce the following notation:

$$\begin{aligned} \begin{aligned} \Pi _C u&:= \frac{1}{(2\pi )^{\nu +1}}\, \int _{{\mathbb {T}}^{\nu +1}} u(\varphi ,x) \, d\varphi dx,\quad {\mathbb {P}}u := u - \Pi _C u, \\ H^s_{00}&:= \{ u \in H^s({\mathbb {T}}^{\nu +1}) : \Pi _C u = 0 \}. \end{aligned} \end{aligned}$$
(4.76)

If (1.8) holds, then the linearized operator \( \mathcal{L} \) in (3.1) satisfies

$$\begin{aligned} {\mathcal {L}}: H^{s+3} \rightarrow H^s_{00} \end{aligned}$$
(4.77)

(for \( {\mathfrak {s}}_0 \le s \le q-1 \)). In the reversible case (1.13)

$$\begin{aligned} {\mathcal {L}}: X \cap H^{s+3} \rightarrow Y \cap H^s \subset H^s_{00} . \end{aligned}$$
(4.78)

Lemma 4.6

Assume either (1.8) or the reversibility condition (1.13). Then the eigenvalue \( \mu _0^\infty \) vanishes:

$$\begin{aligned} \mu _0^\infty (\lambda ) = r^\infty _0 (\lambda ) = 0 , \quad \forall \lambda \in \Lambda _\infty ^{2\gamma } (u) \, . \end{aligned}$$
(4.79)

Proof

Assume (1.8). If \( r_0^\infty \ne 0 \), then \( w = 1 / r_0^\infty \) solves \( \mathcal{L}_\infty w = 1 \). Therefore, by (4.72),

$$\begin{aligned} \mathcal{L} W_2 [1 / r^\infty _0] = \mathcal{L} W_2 w = W_1 \mathcal{L}_\infty w = W_1 [1] \end{aligned}$$

which is a contradiction: \( \Pi _C W_1 [1] \ne 0 \) for \( \varepsilon \gamma ^{-1} \) small enough, while the average \( \Pi _C \mathcal{L} W_2 [1 / r^\infty _0] = 0 \) by (4.77). In the reversible case \( r^\infty _0 = 0 \) was proved in Remark 4.3. \(\square \)

As a consequence of (4.79), the definition of \( \Lambda _\infty ^{2 \gamma } \) in (4.6) (just specializing (4.6) with \( k = 0 \)), and (1.2) (with \(\gamma \) and \(\tau \) as in (4.73)), we deduce also the first order Melnikov non-resonance conditions

$$\begin{aligned} \forall \lambda \in \Lambda _{\infty }^{2 \gamma } , \qquad \big |\mathrm{i} \lambda \bar{\omega }\cdot l + \mu _j^\infty (\lambda ) \big | \ge 2 \gamma \frac{ \langle j \rangle ^3}{ \langle l \rangle ^\tau }, \quad \forall (l, j) \ne (0, 0) . \end{aligned}$$
(4.80)
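Let us spell out why both regimes of (4.80) hold. For \( j \ne 0 \), the bound is (4.6) specialized to \( k = 0 \), using \( \mu _0^\infty = 0 \) from (4.79). For \( j = 0 \), \( l \ne 0 \), assuming the diophantine condition (1.2) provides \( |\lambda \bar{\omega }\cdot l| \ge \gamma _0 \langle l \rangle ^{-\tau _0} \), the choice (4.73) yields

$$\begin{aligned} |\mathrm{i} \lambda \bar{\omega }\cdot l + \mu _0^\infty (\lambda )| = |\lambda \bar{\omega }\cdot l| \ge \frac{\gamma _0}{\langle l \rangle ^{\tau _0}} \ge \frac{2 \gamma }{\langle l \rangle ^{\tau }} = 2 \gamma \frac{\langle 0 \rangle ^3}{\langle l \rangle ^{\tau }} \end{aligned}$$

since \( \gamma \le \gamma _0 / 2 \) and \( \tau \ge \tau _0 \).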

Lemma 4.7

(Invertibility of \(\mathcal{L}_\infty \) ) For all \( \lambda \in \Lambda _\infty ^{2 \gamma } (u) \), for all \( g \in H^s_{00} \) the equation \( \mathcal{L}_\infty w = g \) has the unique solution with zero average

$$\begin{aligned} {\mathcal {L}}_{\infty }^{-1}\, g (\varphi ,x) := \sum _{(l,j) \ne (0,0)} \frac{g_{lj}}{\mathrm{i} \lambda \bar{\omega }\cdot l + \mu _j^\infty (\lambda ) }\, e^{\mathrm{i} (l \cdot \varphi + j x)}. \end{aligned}$$
(4.81)

For every Lipschitz family \( g := g(\lambda ) \in H^s_{00} \) we have

$$\begin{aligned} \left\| \mathcal{L}_{\infty }^{-1} g \right\| _{s}^{\mathrm{{Lip}(\gamma )}} \le C \gamma ^{-1} \left\| g \right\| _{s + 2\tau + 1}^{\mathrm{{Lip}(\gamma )}} . \end{aligned}$$
(4.82)

In the reversible case, if \( g \in Y \) then \( \mathcal{L}_\infty ^{-1} g \in X \).

Proof

For all \(\lambda \in \Lambda _\infty ^{2\gamma } (u) \), by (4.80), formula (4.81) is well defined and

$$\begin{aligned} \left\| \mathcal{L}_{\infty }^{-1}(\lambda )g(\lambda )\right\| _{s} \lessdot \gamma ^{-1}\left\| g(\lambda )\right\| _{s + \tau }. \end{aligned}$$
(4.83)
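The estimate (4.83) is the standard small-divisor computation: with the weighted \( \ell ^2 \) Sobolev norm, (4.80) gives

$$\begin{aligned} \Vert \mathcal{L}_{\infty }^{-1} g \Vert _{s}^{2} = \sum _{(l,j) \ne (0,0)} \langle l,j \rangle ^{2s} \frac{|g_{lj}|^{2}}{|\mathrm{i} \lambda \bar{\omega }\cdot l + \mu _j^\infty (\lambda )|^{2}} \le \frac{1}{4 \gamma ^{2}} \sum _{(l,j) \ne (0,0)} \langle l,j \rangle ^{2s} \frac{\langle l \rangle ^{2\tau }}{\langle j \rangle ^{6}} |g_{lj}|^{2} \lessdot \gamma ^{-2} \Vert g \Vert _{s + \tau }^{2} \end{aligned}$$

using \( \langle l \rangle ^{\tau } \langle j \rangle ^{-3} \le \langle l,j \rangle ^{\tau } \).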

Now we prove the Lipschitz estimate. For \( \lambda _1 , \lambda _2 \in \Lambda _\infty ^{2\gamma } (u) \)

$$\begin{aligned} \mathcal{L}_{\infty }^{-1}(\lambda _{1})g(\lambda _{1}) - \mathcal{L}_{\infty }^{-1}(\lambda _{2})g(\lambda _{2})&= \mathcal{L}_{\infty }^{-1}(\lambda _{1}) [g(\lambda _1) - g(\lambda _2)] \nonumber \\&+ \left( \mathcal{L}_{\infty }^{-1}(\lambda _1) - \mathcal{L}_{\infty }^{-1}(\lambda _2) \right) g(\lambda _{2}).\quad \quad \end{aligned}$$
(4.84)

By (4.83)

$$\begin{aligned} \gamma \Vert \mathcal{L}_{\infty }^{-1}(\lambda _{1}) [g(\lambda _{1}) - g(\lambda _{2})] \Vert _s \lessdot \Vert g(\lambda _{1})- g(\lambda _{2}) \Vert _{s + \tau } \le \gamma ^{-1} \Vert g \Vert _{s + \tau }^{\mathrm{{Lip}(\gamma )}} |\lambda _1 - \lambda _2 |.\nonumber \\ \end{aligned}$$
(4.85)

Now we estimate the second term of (4.84). To simplify notation, write \( g := g(\lambda _{2}) \) and \( \delta _{lj}(\lambda ) := \mathrm{i} \lambda \bar{\omega }\cdot l + \mu _j^\infty (\lambda ) \). Then

$$\begin{aligned} \left( \mathcal{L}_{\infty }^{-1}(\lambda _{1}) - \mathcal{L}_{\infty }^{-1}(\lambda _{2})\right) g = \sum _{(l , j)\ne (0,0)} \frac{\delta _{lj}(\lambda _{2}) - \delta _{lj}(\lambda _{1})}{\delta _{lj}(\lambda _{1})\delta _{lj}(\lambda _{2})} \, g_{lj} e^{\mathrm{i} (l \cdot \varphi + j x)}.\quad \quad \end{aligned}$$
(4.86)

The bound (4.5) implies \( \vert \mu _{j}^{\infty } \vert ^\mathrm{lip} \lessdot \varepsilon \gamma ^{-1} | j |^{3} \lessdot | j |^{3} \) and, using also (4.80),

$$\begin{aligned} \gamma \frac{|\delta _{lj}(\lambda _{2}) - \delta _{lj}(\lambda _{1}) |}{|\delta _{lj}(\lambda _{1})| |\delta _{lj}(\lambda _{2}) |} \!&\lessdot \! \frac{( | l | + | j |^{3}) \langle l \rangle ^{2\tau }}{\gamma \langle j \rangle ^{6}} |\lambda _{2} - \lambda _{1} | \lessdot \langle l \rangle ^{2\tau + 1} \gamma ^{-1} | \lambda _2 - \lambda _1 | . \quad \quad \quad \end{aligned}$$
(4.87)

Then (4.86) and (4.87) imply \( \gamma \Vert (\mathcal{L}_{\infty }^{-1}(\lambda _2) - \mathcal{L}_{\infty }^{-1}(\lambda _1) )g \Vert _s \lessdot \gamma ^{-1} \Vert g \Vert _{s + 2\tau + 1}^{\mathrm{{Lip}(\gamma )}} |\lambda _2 - \lambda _1 | \) that, finally, with (4.83), (4.85), prove (4.82). The last statement follows by the property (4.37). \(\square \)

In order to solve the equation \( \mathcal{L} h = f \) we first prove the following lemma.

Lemma 4.8

Let \({\mathfrak {s}}_0 + \tau + 3 \le s \le q - \sigma - \beta - 3 \). Under the assumption (1.8) we have

$$\begin{aligned} W_1 (H^s_{00}) = H^s_{00} , \quad \ W_1^{-1} (H^s_{00}) = H^s_{00}. \end{aligned}$$
(4.88)

Proof

It is sufficient to prove that \( W_1 (H^s_{00}) = H^s_{00} \), because the second equality of (4.88) then follows by applying the isomorphism \( W_1^{-1} \). Let us give the proof of the inclusion

$$\begin{aligned} W_1 (H^s_{00}) \subseteq H^s_{00} \end{aligned}$$
(4.89)

(which is essentially algebraic). For any \( g \in H^s_{00}\), let \( w(\varphi ,x) := \mathcal{L}_\infty ^{-1} g \in H^{s - \tau }_{00} \) be defined as in (4.81). Then \( h := W_2 w \in H^{s-\tau } \) satisfies

$$\begin{aligned} {\mathcal {L}}h \mathop {=}\limits ^{(4.72)} W_1 {\mathcal {L}}_\infty W_2^{-1}h = W_1 {\mathcal {L}}_\infty w = W_1 g. \end{aligned}$$

By (4.77) we deduce that \(W_1 g = {\mathcal {L}}h \in H^{s - \tau - 3}_{00} \). Since \( W_1 g \in H^s \) by Lemma 4.5, we conclude \( W_1 g \in H^s \cap H^{s - \tau - 3}_{00} = H^s_{00}\). The proof of (4.89) is complete.

It remains to prove that \(H^s_{00} {\setminus } W_1(H^s_{00}) = \emptyset \). By contradiction, let \( f \in H^s_{00} {\setminus } W_1(H^s_{00}) \). Let \( g := W_1^{-1}f \in H^s \) by Lemma 4.5. Since \(W_1 g = f \notin W_1(H^s_{00})\), it follows that \( g \notin H^s_{00} \) (otherwise it contradicts (4.89)), namely \(c := \Pi _C g \ne 0\). Decomposing \( g = c + {\mathbb {P}}g \) (recall (4.76)) and applying \(W_1\), we get \( W_1 g = c W_1[1] + W_1 {\mathbb {P}}g \). Hence

$$\begin{aligned} W_1[1] = c^{-1}(W_1 g - W_1 {\mathbb {P}}g) \in H^s_{00} \end{aligned}$$

because \(W_1 g = f \in H^s_{00}\) and \(W_1 {\mathbb {P}}g \in W_1(H^s_{00}) \subseteq H^s_{00}\) by (4.89). However, \( \Pi _C W_1[1] \ne 0 \), a contradiction. \(\square \)

Remark 4.7

In the Hamiltonian case (which always satisfies (1.8)), the \( W_i (\varphi ) \) are maps of (a subspace of) \( H^1_0 \), so that Lemma 4.8 is automatic and there is no need for Lemma 4.6.

We may now prove the main result of Sects. 3 and 4.

Theorem 4.3

(Right inverse of \( \mathcal{L}\) ) Let

$$\begin{aligned} \tau _1 := 2\tau + 7, \quad \mu := 4\tau + \sigma + \beta + 14, \end{aligned}$$
(4.90)

where \( \sigma \), \(\beta \) are defined in (3.58), (4.1) respectively. Let \( u ( \lambda ) \), \( \lambda \in \Lambda _o \subseteq \Lambda \), be a Lipschitz family with

$$\begin{aligned} \Vert u \Vert _{{\mathfrak {s}}_0 + \mu }^{\mathrm{{Lip}(\gamma )}} \le 1. \end{aligned}$$
(4.91)

Then there exists \( \delta \) (depending on the data of the problem) such that if

$$\begin{aligned} \varepsilon \gamma ^{-1} \le \delta , \end{aligned}$$

and condition (1.8), resp. the reversibility condition (1.13), holds, then for all \( \lambda \in \Lambda _\infty ^{2 \gamma }(u)\) defined in (4.6), the linearized operator \({\mathcal {L}}:= {\mathcal {L}}(\lambda , u(\lambda ))\) (see (3.1)) admits a right inverse on \( H^s_{00} \), resp. \( Y \cap H^s \). More precisely, for \(\mathfrak s_0 \le s \le q - \mu \), for every Lipschitz family \( f(\lambda ) \in H^s_{00} \), resp. \( Y \cap H^s \), the function

$$\begin{aligned} h := \mathcal{L}^{-1} f := W_2 {\mathcal {L}}_{\infty }^{-1}\, W_1^{-1}f \end{aligned}$$
(4.92)

is a solution of \( \mathcal{L} h = f \). In the reversible case, \( \mathcal{L}^{-1} f \in X \). Moreover

$$\begin{aligned} \Vert {\mathcal {L}}^{-1} f \Vert _{s}^{\mathrm{{Lip}(\gamma )}} \le C(s)\gamma ^{-1} \left( \Vert f \Vert _{s + \tau _1}^{\mathrm{{Lip}(\gamma )}} + \Vert u \Vert _{s + \mu }^{\mathrm{{Lip}(\gamma )}} \Vert f \Vert _{{\mathfrak {s}}_0}^{\mathrm{{Lip}(\gamma )}} \right) . \end{aligned}$$
(4.93)

Proof

Given \(f \in H^s_{00}\), resp. \( f \in Y \cap H^s \), with \( s \) as in Lemma 4.8, the equation \( {\mathcal {L}}h = f \) can be solved for \( h \) because \(\Pi _C f = 0 \). Indeed, by (4.72), the equation \( \mathcal{L} h = f \) is equivalent to \( {\mathcal {L}}_\infty W_2^{-1}h = W_1^{-1}f \), where \(W_1^{-1}f \in H^s_{00} \) by Lemma 4.8, resp. \( W_1^{-1}f \in Y \cap H^s \) since \( W_1^{-1} \) is reversibility-preserving (Lemma 4.5). As a consequence, by Lemma 4.7, all the solutions of \( \mathcal{L} h = f \) are

$$\begin{aligned} h = c W_2[1] + W_2 {\mathcal {L}}_{\infty }^{-1}W_1^{-1}f, \quad c \in {\mathbb {R}}. \end{aligned}$$
(4.94)

The solution (4.92) is the one with \( c = 0 \). In the reversible case, the fact that \( \mathcal{L}^{-1} f \in X \) follows by (4.92) and the fact that \( W_i \), \( W_i^{-1}\) are reversibility-preserving and \( \mathcal{L}_\infty ^{-1} : Y \rightarrow X \), see Lemma 4.7.

Finally (4.75), (4.82), (4.91) imply

$$\begin{aligned} \Vert {\mathcal {L}}^{-1} f \Vert _s^{\mathrm{{Lip}(\gamma )}} \le C(s)\gamma ^{-1} \left( \Vert f \Vert _{s + 2\tau + 7}^{\mathrm{{Lip}(\gamma )}} + \Vert u \Vert _{s + 2\tau + \sigma + \beta + 7}^{\mathrm{{Lip}(\gamma )}} \Vert f \Vert _{{\mathfrak {s}}_0 + 2\tau + 7}^{\mathrm{{Lip}(\gamma )}} \right) \end{aligned}$$

and (4.93) follows using (6.2) with \( b_0 = {\mathfrak {s}}_0 \), \( a_0 := {\mathfrak {s}}_0 + 2 \tau + \sigma + \beta + 7 \), \( q = 2 \tau + 7 \), \( p = s - {\mathfrak {s}}_0 \). \(\square \)

In the next section we apply Theorem 4.3 to deduce tame estimates for the inverse linearized operators at any step of the Nash–Moser scheme. The approximate solutions along the iteration will satisfy (4.91).

5 The Nash–Moser iteration

We define the finite-dimensional subspaces of trigonometric polynomials

$$\begin{aligned} H_{n} := \left\{ u \in L^{2}({\mathbb {T}}^{\nu + 1}) : u(\varphi ,x)=\sum _{\left| (l , j)\right| \le N_{n}}u_{lj} e^{\mathrm{i} (l\cdot \varphi + j x)} \right\} \end{aligned}$$

where \( N_n := N_0^{\chi ^n}\) (see (4.12)) and the corresponding orthogonal projectors

$$\begin{aligned} \Pi _{n}:=\Pi _{N_{n}} : L^{2}({\mathbb {T}}^{\nu + 1}) \rightarrow H_{n} , \quad \Pi _{n}^\bot := I - \Pi _{n}. \end{aligned}$$

The following smoothing properties hold: for all \(\alpha , s \ge 0\),

$$\begin{aligned} \begin{aligned} \Vert \Pi _{n}u \Vert _{s + \alpha }^\mathrm{{Lip}(\gamma )}&\le N_{n}^{\alpha } \Vert u \Vert _{s}^\mathrm{{Lip}(\gamma )}, \ \ \forall u(\lambda ) \in H^{s} \,; \\ \Vert \Pi _{n}^\bot u \Vert _{s}^\mathrm{{Lip}(\gamma )}&\le N_{n}^{-\alpha } \Vert u \Vert _{s + \alpha }^\mathrm{{Lip}(\gamma )}, \ \ \forall u(\lambda ) \in H^{s + \alpha }, \end{aligned} \end{aligned}$$
(5.1)

where the function \(u(\lambda )\) depends on the parameter \(\lambda \) in a Lipschitz way. The bounds (5.1) are the classical smoothing estimates for truncated Fourier series, which also hold with the norm \(\Vert \cdot \Vert ^\mathrm{{Lip}(\gamma )}_s \) defined in (2.2).
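For instance, the second bound of (5.1) is the usual computation with the weighted \( \ell ^2 \) Sobolev norm (for the \( \mathrm{Lip}(\gamma ) \) part of the norm one argues in the same way on the difference quotients):

$$\begin{aligned} \Vert \Pi _{n}^\bot u \Vert _{s}^{2} = \sum _{|(l,j)| > N_{n}} \langle l,j \rangle ^{2s} |u_{lj}|^{2} \le N_{n}^{-2\alpha } \sum _{|(l,j)| > N_{n}} \langle l,j \rangle ^{2(s + \alpha )} |u_{lj}|^{2} \le N_{n}^{-2\alpha } \Vert u \Vert _{s + \alpha }^{2} . \end{aligned}$$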

Let

$$\begin{aligned} F(u) := F(\lambda , u) := \lambda \bar{\omega }\cdot \partial _{\varphi } u + u_{xxx} + \varepsilon f(\varphi , x , u, u_{x}, u_{xx}, u_{xxx} ). \end{aligned}$$
(5.2)

We define the constants

$$\begin{aligned} \kappa := 28 + 6 \mu , \qquad \beta _1 := 50 + 11 \mu , \end{aligned}$$
(5.3)

where \(\mu \) is the loss of regularity in (4.90).

Theorem 5.1

(Nash–Moser) Assume that \( f \in C^q \), \( q \ge {\mathfrak {s}}_0 + \mu + \beta _1 \), satisfies the assumptions of Theorem 1.1 or Theorem 1.3. Let \( 0 < \gamma \le \mathrm{min}\{\gamma _0, 1/48 \} \), \( \tau > \nu + 1 \). Then there exist \( \delta > 0 \), \( C_* > 0 \), \( N_0 \in {\mathbb {N}}\) (that may depend also on \( \tau \)) such that, if \( \varepsilon \gamma ^{-1} < \delta \), then, for all \( n \ge 0 \):

\((\mathcal{P}1)_{n}\) :

there exists a function \(u_n : {\mathcal {G}}_n \subseteq \Lambda \rightarrow H_n\), \(\lambda \mapsto u_n(\lambda )\), with \( \Vert u_{n} \Vert _{{\mathfrak {s}}_0 + \mu }^{\mathrm{{Lip}(\gamma )}} \le 1 \), \( u_0 := 0\), where \(\mathcal{G}_{n} \) are Cantor-like subsets of \( \Lambda := [1/2, 3/2] \) defined inductively by: \( \mathcal{G}_{0} := \Lambda \),

$$\begin{aligned} \mathcal{G}_{n+1}&:= \left\{ \lambda \in \mathcal{G}_{n} \, : \, |\mathrm{i} \omega \cdot l + \mu _j^\infty (u_{n}) - \mu _k^\infty (u_{n})| \right. \nonumber \\&\qquad \qquad \qquad \quad \ge \left. \frac{2\gamma _{n} |j^{3}-k^{3}|}{\left\langle l\right\rangle ^{\tau }}, \ \forall j , k \in {\mathbb {Z}}, \ l \in {\mathbb {Z}}^{\nu } \right\} \end{aligned}$$
(5.4)

where \( \gamma _{n}:=\gamma (1 + 2^{-n}) \). In the reversible case, namely when (1.13) holds, \( u_n (\lambda ) \in X \). The difference \(h_n := u_{n} - u_{n-1}\), where, for convenience, \(h_0 := 0\), satisfies

$$\begin{aligned} \Vert h_{n} \Vert _{{\mathfrak {s}}_0 + \mu }^\mathrm{{Lip}(\gamma )}\le C_* \varepsilon \gamma ^{-1} N_{n}^{-\sigma _1} , \quad \sigma _1 := 18 + 2 \mu . \end{aligned}$$
(5.5)
\((\mathcal{P}2)_{n}\) :

\( \Vert F(u_n) \Vert _{{\mathfrak {s}}_{0}}^{\mathrm{{Lip}(\gamma )}} \le C_* \varepsilon N_{n}^{- \kappa }\).

\((\mathcal{P}3)_{n}\) :

(High norms). \( \Vert u_n \Vert _{\mathfrak s_{0}+ \beta _1}^{\mathrm{{Lip}(\gamma )}} \le C_* \varepsilon \gamma ^{-1} N_{n}^{\kappa }\) and \( \Vert F(u_n ) \Vert _{{\mathfrak {s}}_{0}+\beta _1}^{\mathrm{{Lip}(\gamma )}} \le C_* \varepsilon N_{n}^{\kappa }\).

\((\mathcal{P}4)_{n}\) :

(Measure). The measure of the Cantor-like sets satisfies

(5.6)

All the Lip norms are defined on \( \mathcal{G}_{n} \).

Proof

The proof of Theorem 5.1 is split into several steps. For simplicity, we denote the norm \( \Vert \cdot \Vert ^{\mathrm{{Lip}(\gamma )}} \) by \( \Vert \cdot \Vert \).

Step 1: prove \(({\mathcal {P}}1,2,3)_0\). \(({\mathcal {P}}1)_0\) and the first inequality of \(({\mathcal {P}}3)_0\) are trivial because \(u_0 = h_0 = 0\). \(({\mathcal {P}}2)_0\) and the second inequality of \(({\mathcal {P}}3)_0\) follow with \( C_* \ge \max \{ \Vert f(0)\Vert _{{\mathfrak {s}}_0} N_0^\kappa , \Vert f(0)\Vert _{{\mathfrak {s}}_0+ \beta _1} N_0^{-\kappa } \} \).

Step 2: assume that \(({\mathcal {P}}1,2,3)_n\) hold for some \(n \ge 0\), and prove \(({\mathcal {P}}1,2,3)_{n+1}\). By \(({\mathcal {P}}1)_n\) we know that \( \Vert u_n \Vert _{{\mathfrak {s}}_{0} + \mu } \le 1 \), namely condition (4.91) is satisfied. Hence, for \( \varepsilon \gamma ^{-1}\) small enough, Theorem 4.3 applies. Then, for all \(\lambda \in \mathcal{G}_{n+1} \) defined in (5.4), the linearized operator

$$\begin{aligned} {\mathcal {L}}_n(\lambda ) := \mathcal{L}(\lambda , u_{n}(\lambda )) = F'(\lambda , u_n(\lambda )) \end{aligned}$$

(see (3.1)) admits a right inverse on \( H^s_{00} \) if condition (1.8) holds, respectively on \( Y \cap H^s \) if the reversibility condition (1.13) holds. Moreover (4.93) gives the estimates

$$\begin{aligned} \Vert \mathcal{L}_n^{-1} h \Vert _s&\le _s \gamma ^{-1} \left( \Vert h \Vert _{s+\tau _1} + \Vert u_n \Vert _{s+ \mu } \Vert h \Vert _{{\mathfrak {s}}_0} \right) , \quad \forall h(\lambda ), \end{aligned}$$
(5.7)
$$\begin{aligned} \Vert \mathcal{L}_n^{-1} h \Vert _{{\mathfrak {s}}_0}&\le \gamma ^{-1} N_{n+1}^{\tau _1} \Vert h \Vert _{{\mathfrak {s}}_0} , \quad \forall h(\lambda ) \in H_{n+1} , \end{aligned}$$
(5.8)

(use (5.1) and \( \Vert u_n \Vert _{{\mathfrak {s}}_{0} + \mu } \le 1 \)), for every Lipschitz map \(h(\lambda )\). Then, for all \(\lambda \in \mathcal{G}_{n+1} \), we define

$$\begin{aligned} u_{n+1} := u_{n} + h_{n + 1} \in H_{n+1} , \quad h_{n + 1}:= - \Pi _{n + 1} \mathcal{L}_n^{-1} \Pi _{n + 1} F(u_{n}) , \end{aligned}$$
(5.9)

which is well defined because, if condition (1.8) holds, then \( \Pi _{n + 1} F(u_n) \in H^s_{00} \), and, respectively, if (1.13) holds, then \( \Pi _{n + 1} F(u_{n}) \in Y \cap H^s \) (hence in both cases \( \mathcal{L}_n^{-1} \Pi _{n + 1} F(u_n) \) exists). Note also that in the reversible case \( h_{n + 1} \in X \) and so \( u_{n + 1} \in X \).

Recalling (5.2) and that \({\mathcal {L}}_n := F'(u_n) \), we write

$$\begin{aligned} F(u_{n + 1}) = F(u_{n}) + \mathcal{L}_n h_{n + 1} + \varepsilon Q(u_{n}, h_{n + 1}) \end{aligned}$$
(5.10)

where

$$\begin{aligned} Q(u_{n},h_{n + 1})&:= \mathcal{N}(u_{n} + h_{n + 1}) - \mathcal{N}(u_{n}) - \mathcal{N}'(u_{n}) h_{n + 1}, \\ {\mathcal {N}}(u)&:= f(\varphi ,x,u,u_x, u_{xx}, u_{xxx}). \end{aligned}$$

With this definition,

$$\begin{aligned} F(u) = L_\omega u + \varepsilon \mathcal{N}(u), \quad F'(u) h = L_\omega h + \varepsilon {\mathcal {N}}'(u)h, \quad L_\omega := {\omega \cdot \partial _{\varphi }}+ \partial _{xxx}. \end{aligned}$$

By (5.10) and (5.9) we have

$$\begin{aligned} F(u_{n + 1})&= F(u_{n}) - {\mathcal {L}}_n \Pi _{n + 1} {\mathcal {L}}_n^{-1} \Pi _{n + 1} F(u_{n}) + \varepsilon Q(u_{n},h_{n + 1}) \nonumber \\&= \Pi _{n + 1}^{\bot } F(u_{n}) + {\mathcal {L}}_n \Pi _{n + 1}^{\bot } {\mathcal {L}}_n^{-1} \Pi _{n + 1} F(u_{n}) + \varepsilon Q(u_{n},h_{n + 1}) \nonumber \\&= \Pi _{n + 1}^{\bot } F(u_{n}) + \Pi _{n + 1}^{\bot } {\mathcal {L}}_n {\mathcal {L}}_n^{-1} \Pi _{n + 1} F(u_{n})\nonumber \\&+ [ {\mathcal {L}}_n , \Pi _{n + 1}^{\bot } ] {\mathcal {L}}_n^{-1} \Pi _{n + 1} F(u_{n}) + \varepsilon Q(u_{n},h_{n + 1})\nonumber \\&= \Pi _{n + 1}^{\bot } F(u_{n}) + \varepsilon [\mathcal{N}'(u_{n}) , \Pi _{n + 1}^{\bot } ] {\mathcal {L}}_n^{-1} \Pi _{n + 1} F(u_{n}) + \varepsilon Q(u_{n},h_{n + 1}) \nonumber \\ \end{aligned}$$
(5.11)

where we have gained an extra \(\varepsilon \) from the commutator

$$\begin{aligned}{}[\mathcal{L}_n , \Pi _{n + 1}^{\bot } ] = [ L_{\omega } + \varepsilon \mathcal{N}'(u_{n}) , \Pi _{n + 1}^{\bot } ] = \varepsilon [\mathcal{N}'(u_{n}) , \Pi _{n + 1}^{\bot } ] . \end{aligned}$$

Lemma 5.1

Set

$$\begin{aligned} U_{n}:=\Vert u_{n} \Vert _{{\mathfrak {s}}_{0}+\beta _1} + \gamma ^{-1} \Vert F(u_{n}) \Vert _{{\mathfrak {s}}_{0}+\beta _1} , \quad w_n := \gamma ^{-1} \Vert F(u_{n}) \Vert _{{\mathfrak {s}}_{0}}. \end{aligned}$$
(5.12)

There exists \(C_0 := C ( \tau _1, \mu , \nu , \beta _1) > 0 \) such that

$$\begin{aligned} \begin{aligned} w_{n+1}&\le C_0 N_{n + 1}^{- \beta _1 + \mu '} U_n ( 1 + w_n ) + C_0 N_{n + 1}^{6 + 2\mu } w_n^2 , \\ U_{n+1}&\le C_0 N_{n + 1}^{9 + 2\mu } ( 1 + w_n )^2 \, U_n. \end{aligned} \end{aligned}$$
(5.13)

Proof

The operators \({\mathcal {N}}'(u_n)\) and \(Q(u_n,\cdot )\) satisfy the following tame estimates:

$$\begin{aligned} \Vert Q(u_n , h) \Vert _s&\le _s \Vert h \Vert _{{\mathfrak {s}}_0 + 3} \left( \Vert h \Vert _{s + 3} + \Vert u_n \Vert _{s + 3} \Vert h \Vert _{{\mathfrak {s}}_0+3} \right) \quad \ \forall h(\lambda ),\end{aligned}$$
(5.14)
$$\begin{aligned} \Vert Q(u_n , h) \Vert _{{\mathfrak {s}}_0}&\le N_{n+1}^6 \Vert h \Vert _{\mathfrak {s}_0}^2 \ \quad \forall h(\lambda ) \in H_{n+1} ,\end{aligned}$$
(5.15)
$$\begin{aligned} \Vert {\mathcal {N}}'(u_n) h \Vert _{s}&\le _{s} \Vert h \Vert _{s + 3} + \Vert u_n \Vert _{s + 3} \Vert h \Vert _{{\mathfrak {s}}_0+3} \quad \forall h(\lambda ), \end{aligned}$$
(5.16)

where \(h(\lambda )\) depends on the parameter \(\lambda \) in a Lipschitz way. The bounds (5.14) and (5.16) follow by Lemma 6.2\((i)\) and Lemma 6.3. Estimate (5.15) is simply (5.14) at \(s = {\mathfrak {s}}_0\), using that \(\Vert u_n \Vert _{{\mathfrak {s}}_0 + 3} \le 1\), that \(u_n, h_{n+1} \in H_{n+1}\), and the smoothing property (5.1).

By (5.7) and (5.16), the term (in (5.11)) \( R_n := [ \mathcal{N}' (u_n), \Pi _{n+1}^\bot ] \mathcal{L}_n^{-1} \Pi _{n+1} F(u_n) \) satisfies, using also that \( u_n \in H_n \) and (5.1),

$$\begin{aligned} \Vert R_n \Vert _s&\le _s \gamma ^{-1} N_{n+1}^{\mu '} \left( \Vert F(u_n) \Vert _s + \Vert u_n \Vert _{s} \Vert F(u_n) \Vert _{{\mathfrak {s}}_0} \right) , \quad \mu ' := 3 + \mu ,\quad \quad \quad \quad \end{aligned}$$
(5.17)
$$\begin{aligned} \Vert R_n \Vert _{{\mathfrak {s}}_0}&\le _{\mathfrak {s}_0 + \beta _1} \gamma ^{-1} N_{n+1}^{-\beta _1 + \mu '} \left( \Vert F(u_n) \Vert _{{\mathfrak {s}}_0+ \beta _1} + \Vert u_n \Vert _{{\mathfrak {s}}_0 + \beta _1} \Vert F(u_n) \Vert _{{\mathfrak {s}}_0} \right) , \end{aligned}$$
(5.18)

because \(\mu \ge \tau _1 + 3\). In proving (5.17) and (5.18), we have simply estimated \({\mathcal {N}}'(u_n) \Pi _{n+1}^\perp \) and \(\Pi _{n+1}^\perp {\mathcal {N}}'(u_n)\) separately, without using the commutator structure.

From the definition (5.9) of \(h_{n+1}\), using (5.7), (5.8) and (5.1), we get

$$\begin{aligned} \Vert h_{n + 1} \Vert _{{\mathfrak {s}}_{0}+ \beta _1}&\le _{{\mathfrak {s}}_0 + \beta _1} \gamma ^{-1} N_{n + 1}^{\mu } \left( \Vert F(u_{n}) \Vert _{{\mathfrak {s}}_0+\beta _1} + \Vert u_n \Vert _{{\mathfrak {s}}_0 + \beta _1} \Vert F(u_{n}) \Vert _{{\mathfrak {s}}_0} \right) ,\end{aligned}$$
(5.19)
$$\begin{aligned} \Vert h_{n + 1} \Vert _{{\mathfrak {s}}_{0}}&\le _{{\mathfrak {s}}_0} \gamma ^{-1} N_{n + 1}^{\mu } \Vert F(u_n) \Vert _{\mathfrak s_0} \end{aligned}$$
(5.20)

because \(\mu \ge \tau _1\). Then

$$\begin{aligned} \Vert u_{n + 1} \Vert _{{\mathfrak {s}}_0 + \beta _1}&\mathop {\le }\limits ^{(5.9)} \Vert u_{n} \Vert _{{\mathfrak {s}}_0 + \beta _1} + \Vert h_{n + 1} \Vert _{{\mathfrak {s}}_0 + \beta _1} \nonumber \\&\mathop {\le }\limits ^{(5.19)}_{{\mathfrak {s}}_0 + \beta _1} \Vert u_n \Vert _{{\mathfrak {s}}_0 + \beta _1} \left( 1 + \gamma ^{-1} N_{n+1}^{\mu } \Vert F (u_n) \Vert _{{\mathfrak {s}}_0} \right) \nonumber \\&+ \gamma ^{-1} N_{n + 1}^{\mu } \Vert F(u_n) \Vert _{{\mathfrak {s}}_0 + \beta _1}. \end{aligned}$$
(5.21)

Formula (5.11) for \(F(u_{n+1})\), together with (5.18), (5.15), (5.20), \(\varepsilon \gamma ^{-1}\le 1\) and (5.1), implies

$$\begin{aligned} \Vert F(u_{n + 1}) \Vert _{{\mathfrak {s}}_0}&\le _{{\mathfrak {s}}_0 + \beta _1} N_{n + 1}^{- \beta _1 + \mu '} \left( \Vert F(u_n) \Vert _{{\mathfrak {s}}_0 + \beta _1} + \Vert u_n \Vert _{{\mathfrak {s}}_0 + \beta _1} \Vert F(u_n) \Vert _{{\mathfrak {s}}_0} \right) \nonumber \\&+ \varepsilon \gamma ^{-2} N_{n + 1}^{6 + 2 \mu } \Vert F(u_{n}) \Vert _{\mathfrak s_{0}}^{2}. \end{aligned}$$
(5.22)

Similarly, using the “high norm” estimates (5.17), (5.14), (5.19), (5.20), \(\varepsilon \gamma ^{-1}\le 1\) and (5.1),

$$\begin{aligned} \Vert F(u_{n + 1}) \Vert _{{\mathfrak {s}}_0 + \beta _1}&\le _{{\mathfrak {s}}_0 + \beta _1} \left( \Vert F(u_n) \Vert _{{\mathfrak {s}}_0 + \beta _1} + \Vert u_n \Vert _{{\mathfrak {s}}_0 + \beta _1} \Vert F(u_n) \Vert _{{\mathfrak {s}}_0} \right) \nonumber \\&\times \left( 1 + N_{n+1}^{\mu '} + N_{n+1}^{9 + 2 \mu } \gamma ^{-1}\Vert F(u_n) \Vert _{{\mathfrak {s}}_0} \right) . \end{aligned}$$
(5.23)

By (5.21), (5.22) and (5.23) we deduce (5.13). \(\square \)

By \( (\mathcal{P}2)_n \) we deduce, for \( \varepsilon \gamma ^{-1} \) small, that (recall the definition of \( w_n \) in (5.12))

$$\begin{aligned} w_n \le \varepsilon \gamma ^{-1} C_* N_{n}^{-\kappa } \le 1. \end{aligned}$$
(5.24)

Then, by the second inequality in (5.13), (5.24), \( (\mathcal{P}3)_n \) (recall the definition of \( U_n \) in (5.12)) and the choice of \( \kappa \) in (5.3), we deduce \( U_{n+1} \le C_* \varepsilon \gamma ^{-1} N_{n+1}^\kappa \), for \( N_0 \) large enough. This proves \( (\mathcal{P}3)_{n+1} \).

Next, by the first inequality in (5.13), (5.24), \( (\mathcal{P}2)_n \) (recall the definition of \( w_n \) in (5.12)) and (5.3), we deduce \( w_{n+1} \le C_* \varepsilon \gamma ^{-1} N_{n+1}^\kappa \), for \( N_0 \) large, \( \varepsilon \gamma ^{-1}\) small. This proves \( (\mathcal{P}2)_{n+1} \).

The bound (5.5) at step \( n +1\) follows by (5.20), \((\mathcal{P}2)_n \) and (5.3). Then

$$\begin{aligned} \Vert u_{n+1} \Vert _{{\mathfrak {s}}_0 + \mu } \le \Vert u_0 \Vert _{{\mathfrak {s}}_0+ \mu } + \sum _{k=1}^{n+1} \Vert h_k \Vert _{{\mathfrak {s}}_0 + \mu } \le \sum _{k=1}^\infty C_* \varepsilon \gamma ^{-1} N_k^{-\sigma _1} \le 1 \end{aligned}$$

for \( \varepsilon \gamma ^{-1} \) small enough. As a consequence, \(({\mathcal {P}}1,2,3)_{n+1}\) hold.
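The convergence mechanism behind \((\mathcal{P}2)_n\), \((\mathcal{P}3)_n\) is the standard Nash–Moser one: the quadratic term in (5.13) beats the loss of derivatives because \(N_n = N_0^{\chi^n}\) grows super-exponentially. The following toy iteration of (5.13) (with \(\chi = 3/2\) and illustrative exponents \(b, c, d, \kappa\) chosen only to satisfy the analogous constraints \(b \ge 2d + \kappa\) and \(\kappa(2-\chi) \ge \chi c\); these are not the paper's values) shows the weighted error \(w_n N_n^{\kappa}\) decreasing along the scheme.

```python
# Toy model of the quadratic scheme (5.13) with N_n = N_0^(chi^n).
chi, N0 = 1.5, 10.0
b, c, d, kappa = 50.0, 8.0, 11.0, 24.0   # illustrative exponents only
w, U = 1e-30, 1e-30                      # plays the role of eps * gamma^{-1}

ws = [w]
for n in range(1, 6):
    N = N0 ** (chi ** n)                 # super-exponential scale N_n
    # mirror of (5.13): smoothing term + quadratic term, and high-norm growth
    w, U = N ** (-b) * U * (1 + w) + N ** c * w ** 2, N ** d * (1 + w) ** 2 * U
    ws.append(w)

# super-exponential decay: w_n goes to zero even against the weight N_n^kappa
weighted = [wn * (N0 ** (chi ** n)) ** kappa for n, wn in enumerate(ws)]
assert all(x > y for x, y in zip(weighted, weighted[1:]))
```

The same computation with \(b < 2d + \kappa\) diverges, which is why the regularization exponent \(\beta_1\) in (5.3) must dominate the loss \(\mu\).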

Step 3: prove \(({\mathcal {P}}4)_n\), \(n \ge 0\). For all \(n \ge 0\),

$$\begin{aligned} \mathcal{G}_n {\setminus }\mathcal{G}_{n+1} = \bigcup _{l \in {\mathbb {Z}}^{\nu }, j,k \in {\mathbb {Z}}} R_{ljk} (u_{n}) \end{aligned}$$
(5.25)

where

$$\begin{aligned} R_{ljk} (u_{n})&:= \left\{ \lambda \in \mathcal{G}_n : \left| \mathrm{i} \lambda \bar{\omega }\cdot l + \mu _{j}^{\infty }(\lambda ,u_{n}(\lambda )) - \mu _{k}^{\infty }(\lambda ,u_{n}(\lambda ))\right| \right. \nonumber \\&\qquad \qquad \qquad < \left. 2\gamma _{n} | j^{3}-k^{3} | \left\langle l\right\rangle ^{-\tau }\right\} . \end{aligned}$$
(5.26)

Notice that, by the definition (5.26), \(R_{ljk} (u_{n}) = \emptyset \) for \(j = k\). Hence in the sequel we may assume \(j \ne k\). We split the estimate into several lemmata.

Lemma 5.2

For \( \varepsilon \gamma ^{-1}\) small enough, for all \(n \ge 0\), \(|l|\le N_n\),

$$\begin{aligned} R_{ljk}(u_{n}) \subseteq R_{ljk}(u_{n - 1}). \end{aligned}$$
(5.27)

Proof

We claim that, for all \( j , k \in {\mathbb {Z}}\),

$$\begin{aligned} |(\mu _{j}^{\infty } - \mu _{k}^{\infty })(u_{n}) - (\mu _{j}^{\infty } - \mu _{k}^{\infty })(u_{n-1})| \le C \varepsilon |j^{3} - k^{3}| N_n^{-\alpha } , \quad \forall \lambda \in \mathcal{G}_n ,\quad \quad \quad \end{aligned}$$
(5.28)

where \(\mu _{j}^{\infty }(u_{n}) := \mu _{j}^{\infty }(\lambda , u_{n}(\lambda ))\) and \( \alpha \) is defined in (4.13). Before proving (5.28) we show how it implies (5.27). For all \( j \ne k\), \(|l| \le N_{n}\), \(\lambda \in {\mathcal {G}}_n\), by (5.28)

$$\begin{aligned} |\mathrm{i} \lambda \bar{\omega }\cdot l + \mu _{j}^{\infty }(u_{n}) - \mu _{k}^{\infty }(u_{n}) |&\ge |\mathrm{i} \lambda \bar{\omega }\cdot l + \mu _{j}^{\infty }(u_{n-1}) - \mu _{k}^{\infty }(u_{n-1}) | \\&\quad - |(\mu _{j}^{\infty }- \mu _{k}^{\infty })(u_{n}) - (\mu _{j}^{\infty }- \mu _{k}^{\infty })(u_{n - 1}) |\\&\ge 2\gamma _{n - 1} |j^3 - k^3| \langle l \rangle ^{-\tau } - C \varepsilon |j^3 - k^3| N_{n}^{-\alpha }\\&\ge 2\gamma _{n} |j^3 - k^3| \langle l \rangle ^{-\tau } \end{aligned}$$

for \(C \varepsilon \gamma ^{-1}N_{n}^{\tau -\alpha }\, 2^{n+1} \le 1\) (recall that \(\gamma _n := \gamma (1 + 2^{-n})\)), which implies (5.27).

Proof of (5.28)

By (4.4),

$$\begin{aligned}&(\mu _{j}^{\infty }- \mu _{k}^{\infty })(u_{n}) - (\mu _{j}^{\infty }- \mu _{k}^{\infty })(u_{n - 1})\nonumber \\&\quad = -\mathrm{i} \big [ m_3(u_{n}) - m_{3}(u_{n - 1}) \big ] (j^{3} - k^{3}) +\mathrm{i} \big [ m_1(u_{n}) - m_1(u_{n-1}) \big ] (j - k) \nonumber \\&\qquad + r_{j}^{\infty }(u_{n}) - r_{j}^{\infty }(u_{n - 1}) - \left( r_{k}^{\infty }(u_{n}) - r_{k}^{\infty }(u_{n-1}) \right) \end{aligned}$$
(5.29)

where \( m_3 (u_{n}) := m_3(\lambda , u_{n}(\lambda ))\) and similarly for \( m_1, r_{j}^{\infty }\). We first apply Theorem 4.2-\(\mathbf{(S4)_{\nu }}\) with \( \nu = n + 1 \), \( \gamma = \gamma _{n-1} \), \( \gamma - \rho = \gamma _n \), and \( u_1 \), \( u_2 \), replaced, respectively, by \( u_{n-1} \), \( u_n \), in order to conclude that

$$\begin{aligned} \Lambda _{n+1}^{\gamma _{n-1}} ( u_{n-1}) \subseteq \Lambda _{n+1}^{\gamma _n} ( u_n ). \end{aligned}$$
(5.30)

The smallness condition in (4.26) is satisfied because \(\sigma _2 < \mu \) (see definitions (4.13), (4.90)) and so

$$\begin{aligned} \varepsilon C N_n^{\tau } \Vert u_n - u_{n - 1} \Vert _{{\mathfrak {s}}_0 + \sigma _2}&\le \varepsilon C N_n^{\tau } \Vert u_n - u_{n - 1} \Vert _{{\mathfrak {s}}_0 + \mu } \mathop {\le }\limits ^{(5.5)} \varepsilon ^2 \gamma ^{-1} C C_* N_{n}^{\tau - \sigma _1}\\&\le \gamma _{n-1} - \gamma _{n} =: \rho = \gamma 2^{-n } \end{aligned}$$

for \(\varepsilon \gamma ^{-1}\) small enough, because \( \sigma _1 > \tau \) (see (5.5), (4.90)). Then, by the definitions (5.4) and (4.6), we have

$$\begin{aligned} \mathcal{G}_{n} := \mathcal{G}_{n-1} \cap \Lambda _{\infty }^{2 \gamma _{n-1}} (u_{n-1}) \mathop {\subseteq }\limits ^{(4.35)}\bigcap _{\nu \ge 0} \Lambda _{\nu }^{\gamma _{n - 1}}(u_{n - 1}) \subset \Lambda _{n+1}^{\gamma _{n-1}}(u_{n-1}) \mathop {\subseteq }\limits ^{(5.30)}\Lambda _{n+1}^{\gamma _n}(u_n). \end{aligned}$$

Next, for all \( \lambda \in \mathcal{G}_n \subset \Lambda _{n+1}^{\gamma _{n-1}}(u_{n-1}) \cap \Lambda _{n+1}^{\gamma _n}(u_n) \) both \( r_j^{n+1} (u_{n-1}) \) and \( r_j^{n+1} (u_{n}) \) are well defined, and we deduce by Theorem 4.2-\(\mathbf{(S3)}_\nu \) with \( \nu = n+1 \), that

$$\begin{aligned} | r^{n+1}_j (u_n) - r_j^{n+1} (u_{n-1})| \mathop {\lessdot }\limits ^{(4.25)}\varepsilon \Vert u_{n-1} - u_n \Vert _{{\mathfrak {s}}_0 + \sigma _2}. \end{aligned}$$
(5.31)

Moreover (4.34) (with \( \nu = n+1 \)) and (3.66) imply that

$$\begin{aligned}&| r_j^{\infty }(u_{n -1}) - r_j^{n + 1}(u_{n - 1})| + |r_j^{\infty }(u_{n}) - r_j^{n + 1}(u_{n})| \nonumber \\&\quad \lessdot \, \varepsilon (1 + \Vert u_{n-1} \Vert _{{\mathfrak {s}}_0 + \beta + \sigma }+ \Vert u_n \Vert _{{\mathfrak {s}}_0 + \beta + \sigma }) N_{n}^{-\alpha } \lessdot \varepsilon N_{n}^{-\alpha } \end{aligned}$$
(5.32)

because \( \sigma + \beta < \mu \) and \( \Vert u_{n-1} \Vert _{{\mathfrak {s}}_0 + \mu } + \Vert u_n \Vert _{{\mathfrak {s}}_0 + \mu } \le 2 \) by \( \mathbf{(S1)}_{n-1} \) and \( \mathbf{(S1)}_n \). Therefore, for all \(\lambda \in \mathcal{G}_{n}\), \( \forall j \in {\mathbb {Z}}\),

$$\begin{aligned} \big | r_j^{\infty }(u_{n}) - r_j^{\infty }(u_{n - 1}) \big |&\le \big | r_j^{n + 1}(u_{n}) - r_j^{n + 1}(u_{n-1}) \big |+ | r_j^{\infty }(u_{n}) - r_j^{n + 1}(u_{n})|\nonumber \\&+ | r_j^{\infty }(u_{n -1}) - r_j^{n + 1}(u_{n - 1})| \nonumber \\&\mathop {\lessdot }\limits ^{(5.31),(5.32)} \varepsilon \Vert u_n - u_{n - 1} \Vert _{{\mathfrak {s}}_0+ \sigma _2} \!+\! \varepsilon N_{n}^{-\alpha } \mathop {\lessdot }\limits ^{(5.5)}\varepsilon N_{n}^{-\alpha } \quad \end{aligned}$$
(5.33)

because \( \sigma _1 > \alpha \) (see (4.13), (5.5)). Finally (5.29), (5.33), (3.64), \(\Vert u_n \Vert _{{\mathfrak {s}}_0 + \mu }\le 1\), imply (5.28). \(\square \)

By definition, \( R_{ljk}(u_n) \subset \mathcal{G}_n \) (see (5.26)) and, by (5.27), for all \( |l| \le N_n \), we have \( R_{ljk} (u_n) \subseteq R_{ljk} (u_{n-1}) \). On the other hand \( R_{ljk}(u_{n-1}) \cap \mathcal{G}_n = \emptyset \), see (5.4). As a consequence, \( \forall |l| \le N_n \), \( R_{ljk} (u_n) = \emptyset \), and

$$\begin{aligned} \mathcal{G}_{n} {\setminus }\mathcal{G}_{n+1} \mathop {\subseteq }\limits ^{(5.25)}\bigcup _{|l|> N_{n}, j,k \in {\mathbb {Z}}} R_{ljk}(u_{n}) , \quad \forall n \ge 1. \end{aligned}$$
(5.34)

Lemma 5.3

Let \(n \ge 0\). If \(R_{ljk}(u_{n}) \ne \emptyset \), then \(|j^{3}-k^{3}| \le 8 |\bar{\omega }\cdot l|\).

Proof

If \(R_{ljk}(u_{n}) \ne \emptyset \) then there exists \(\lambda \in \mathcal{G}_n \subseteq \Lambda \) such that \( |\mathrm{i} \lambda \bar{\omega }\cdot l + \mu _{j}^{\infty }(\lambda ,u_{n}(\lambda ))- \mu _{k}^{\infty }(\lambda ,u_{n}(\lambda )) | < 2 \gamma _{n} |j^{3}-k^{3} | \langle l \rangle ^{-\tau } \) and, therefore,

$$\begin{aligned} |\mu _{j}^{\infty }(\lambda ,u_{n}(\lambda )) - \mu _{k}^{\infty }(\lambda ,u_{n}(\lambda )) | < 2\gamma _{n} |j^{3}-k^{3}| \langle l \rangle ^{-\tau }\, + 2 |\bar{\omega }\cdot l|. \end{aligned}$$
(5.35)

Moreover, by (4.4), (3.63), (4.5), for \(\varepsilon \) small enough,

$$\begin{aligned} |\mu _{j}^{\infty } - \mu _{k}^{\infty }|&\ge |m_3| |j^{3}-k^{3}| - |m_1| |j-k| - |r_j^{\infty }| - |r_k^{\infty }| \nonumber \\&\ge \frac{1}{2} |j^{3}-k^{3}| - C \varepsilon |j - k| - C \varepsilon \ge \frac{1}{3} |j^{3}-k^{3}| \end{aligned}$$
(5.36)

if \(j \ne k\). Since \(\gamma _n \le 2\gamma \) for all \(n \ge 0\) and \(\gamma \le 1/ 48\), by (5.35) and (5.36) we get

$$\begin{aligned} 2 |\bar{\omega }\cdot l| \ge \left( \frac{1}{3} -\frac{4\gamma }{\langle l \rangle ^{\tau }} \right) |j^{3}-k^{3}|\ \ge \frac{1}{4}|j^{3}- k^3| \end{aligned}$$

proving the Lemma. \(\square \)

Lemma 5.4

For all \(n \ge 0\),

$$\begin{aligned} |R_{ljk}(u_{n})| \le C \gamma \left\langle l\right\rangle ^{-\tau }. \end{aligned}$$
(5.37)

Proof

Consider the function \( \phi : \Lambda \rightarrow {\mathbb {C}}\) defined by

$$\begin{aligned} \phi (\lambda )&:= \mathrm{i} \lambda \bar{\omega }\cdot l +\mu _{j}^{\infty }(\lambda )- \mu _{k}^{\infty }(\lambda ) \\&\mathop {=}\limits ^{(4.4)} \mathrm{i} \lambda \bar{\omega }\cdot l - \mathrm{i} {\tilde{m}}_3(\lambda )(j^{3}-k^{3}) + \mathrm{i} {\tilde{m}}_1(\lambda )(j-k) + r_{j}^{\infty }(\lambda )- r_{k}^{\infty }(\lambda ) \end{aligned}$$

where \( {\tilde{m}}_3 (\lambda ) \), \( {\tilde{m}}_1 (\lambda ) \), \( r^{\infty }_j (\lambda ) \), \( \mu _{j}^{\infty }(\lambda )\), are defined for all \( \lambda \in \Lambda \) and satisfy (4.5) by \( \Vert u_n \Vert ^{\mathrm{{Lip}(\gamma )}}_{\mathfrak s_0 + \mu , {\mathcal {G}}_n} \le 1\) (see \((\mathcal{P}1)_n\)). Recalling \( | \cdot |^\mathrm{lip}\le \gamma ^{-1}| \cdot |^\mathrm{{Lip}(\gamma )}\) and using (4.5)

$$\begin{aligned} | \mu _{j}^{\infty } - \mu _{k}^{\infty } |^\mathrm{lip}&\le |{\tilde{m}}_3|^\mathrm{lip} |j^{3} - k^{3}| + | {\tilde{m}}_1|^\mathrm{lip} |j - k| + |r_{j}^{\infty }|^\mathrm{lip} + |r_{k}^{\infty }|^\mathrm{lip}\nonumber \\&\le C \varepsilon \gamma ^{-1} |j^{3} - k^{3}|. \end{aligned}$$
(5.38)

Moreover Lemma 5.3 implies that, \(\forall \lambda _1, \lambda _2 \in \Lambda \),

$$\begin{aligned} |\phi (\lambda _1) - \phi (\lambda _2)| \!\!&\ge \!\! \left( |\bar{\omega } \cdot l| - |\mu _j^{\infty } - \mu _k^{\infty }|^\mathrm{lip} \right) |\lambda _1 - \lambda _2|\\&\mathop {\ge }\limits ^{(5.38)} \left( \frac{1}{8} - C \varepsilon \gamma ^{-1} \right) |j^3 - k^3| |\lambda _1 - \lambda _2| \ge \frac{|j^3 - k^3|}{9} |\lambda _1 - \lambda _2| \end{aligned}$$

for \(\varepsilon \gamma ^{-1}\) small enough. Hence

$$\begin{aligned} |R_{ljk}(u_n)| \le \frac{4 \gamma _n|j^{3}-k^3|}{\langle l \rangle ^{\tau }} \frac{9}{|j^3 - k^3|} \le \frac{72 \gamma }{\langle l \rangle ^{\tau }} , \end{aligned}$$

which is (5.37). \(\square \)
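The mechanism of Lemma 5.4 is the classical transversality argument: a Lipschitz lower bound \(|\phi(\lambda_1) - \phi(\lambda_2)| \ge L |\lambda_1 - \lambda_2|\) forces the sublevel set \(\{|\phi| < \delta\}\) to be contained in an interval of length \(2\delta/L\). A minimal one-dimensional sketch (with a hypothetical affine \(\phi\) and sample values of \(L\), \(\delta\), not the eigenvalue differences of the paper):

```python
import numpy as np

# If |phi(x) - phi(y)| >= L |x - y|, then {lambda : |phi(lambda)| < delta}
# lies in an interval of length 2 delta / L, hence has measure <= 2 delta / L.
L_lip, delta = 0.25, 1e-2
lam = np.linspace(0.0, 1.0, 200001)
phi = 0.3 * lam - 0.17                    # |phi'| = 0.3 >= L_lip
measure = np.mean(np.abs(phi) < delta)    # grid approximation of the measure
assert measure <= 2 * delta / L_lip + 1e-4
```

In the lemma, \(L \sim |j^3 - k^3|\) and \(\delta \sim \gamma |j^3 - k^3| \langle l\rangle^{-\tau}\), so the factor \(|j^3 - k^3|\) cancels, leaving the bound (5.37).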

Now we prove \(({\mathcal {P}}4)_0\). We observe that, for each fixed \(l\), all the indices \(j,k\) such that \(R_{ljk}(0) \ne \emptyset \) are confined to the ball \(j^2 + k^2 \le 16 |\bar{\omega }| |l|\), because

$$\begin{aligned} |j^3 - k^3|&= |j-k| |j^2+jk+k^2| \\&\ge j^2 + k^2 - |jk| \ge \frac{1}{2}\, (j^2 + k^2) , \quad \forall j,k \in {\mathbb {Z}}, \ j \ne k, \end{aligned}$$

and \(|j^{3}-k^{3}| \le 8 |\bar{\omega }| |l|\) by Lemma 5.3. As a consequence

$$\begin{aligned} | \mathcal{G}_0 {\setminus } \mathcal{G}_1 | \mathop {=}\limits ^{(5.25)} \left| \bigcup _{l,j,k} R_{ljk}(0) \right| \le \sum _{l \in {\mathbb {Z}}^\nu } \sum _{j^2 + k^2 \le 16 |\bar{\omega }| |l|} |R_{ljk}(0)| \mathop {\lessdot }\limits ^{(5.37)}\sum _{l \in {\mathbb {Z}}^\nu } \gamma \langle l \rangle ^{-\tau +1} = C \gamma \end{aligned}$$

if \(\tau > \nu + 1\). Thus the first estimate in (5.6) is proved, taking a larger \(C_*\) if necessary.
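Both ingredients of this computation are elementary and easy to check numerically: the lower bound \(|j^2+jk+k^2| \ge \frac{1}{2}(j^2+k^2)\) (which follows from the identity \(j^2+jk+k^2 = \frac{1}{2}(j+k)^2 + \frac{1}{2}(j^2+k^2)\)), and the summability of \(\sum_{l} \langle l\rangle^{-\tau+1}\) over \(\mathbb{Z}^\nu\) for \(\tau > \nu+1\). A small check (taking \(\nu = 2\), \(\tau = \nu + 2\), and using \(1 + |l|_\infty\) as a proxy for the weight \(\langle l\rangle\)):

```python
import itertools

# (i) the elementary inequality |j^2 + j k + k^2| >= (j^2 + k^2)/2 for j != k
for j in range(-20, 21):
    for k in range(-20, 21):
        if j != k:
            assert abs(j * j + j * k + k * k) >= (j * j + k * k) / 2

# (ii) summability of sum_{l in Z^nu} <l>^{-tau+1} for tau > nu + 1:
# the O(|l|) count of pairs (j,k) in the ball leaves exponent -tau+1 = -(nu+1).
nu, tau = 2, 4
def partial_sum(R):
    s = 0.0
    for l in itertools.product(range(-R, R + 1), repeat=nu):
        s += (1 + max(abs(x) for x in l)) ** (1 - tau)
    return s

# partial sums over growing boxes stabilize: the tail beyond R decays like 1/R
assert partial_sum(200) - partial_sum(100) < 0.05
```

This is the source of the condition \(\tau = \nu + 2\) fixed in the proof of Theorems 1.1–1.3 below.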

Finally, \(({\mathcal {P}}4)_n\) for \(n \ge 1\), follows by

$$\begin{aligned} |\mathcal{G}_{n } {\setminus } \mathcal{G}_{n+1}| \!\!&\mathop {\le }\limits ^{~(5.34)}\!\! \sum _{|l|> N_{n}, \, |j|,|k|\le C |l|^{1/2}} |R_{ljk}(u_{n})| \mathop {\lessdot }\limits ^{(5.37)}\sum _{|l|> N_{n}, \, |j|,|k|\le C|l|^{1/2}} \gamma \langle l \rangle ^{-\tau }\\ \!\!&\lessdot \!\! \sum _{|l| > N_n} \gamma \langle l \rangle ^{-\tau + 1} \lessdot \gamma N_{n}^{-\tau + \nu } \le C \gamma N_{n}^{-1} \end{aligned}$$

and (5.6) is proved. The proof of Theorem 5.1 is complete. \(\square \)

5.1 Proof of Theorems 1.1, 1.2, 1.3, 1.4 and 1.5

Proof of Theorems 1.1, 1.2, 1.3

Assume that \( f \in C^q \) satisfies the assumptions in Theorem 1.1 or in Theorem 1.3 with a smoothness exponent \( q := q(\nu ) \ge {\mathfrak {s}}_0 + \mu + \beta _1 \) which depends only on \( \nu \) once we have fixed \( \tau := \nu + 2 \) (recall that \( {\mathfrak {s}}_0 := (\nu + 2 ) \slash 2 \), \( \beta _1 \) is defined in (5.3) and \( \mu \) in (4.90)).

For \( \gamma = \varepsilon ^a \), \( a \in (0,1) \), the smallness condition \( \varepsilon \gamma ^{- 1} = \varepsilon ^{1- a} < \delta \) of Theorem 5.1 is satisfied for \( \varepsilon \) small enough. Hence, on the Cantor set \(\mathcal{G}_{\infty } := \cap _{n \ge 0} \mathcal{G}_{n} \), the sequence \( u_{n}(\lambda ) \) is well defined and converges in the norm \( \Vert \cdot \Vert _{{\mathfrak {s}}_{0}+\mu , {\mathcal {G}}_\infty }^{\mathrm{{Lip}(\gamma )}}\) (see (5.5)) to a solution \(u_{\infty }(\lambda )\) of

$$\begin{aligned} F(\lambda , u_\infty (\lambda )) = 0 \quad \mathrm{with} \quad \sup _{\lambda \in \mathcal{G}_\infty } \Vert u_\infty (\lambda ) \Vert _{{\mathfrak {s}}_0 + \mu } \le C \varepsilon \gamma ^{-1} = C \varepsilon ^{1-a} , \end{aligned}$$

namely \(u_{\infty }(\lambda )\) is a solution of the perturbed equation (1.4) with \( \omega = \lambda \bar{\omega }\). Moreover, by (5.6), the measure of the complementary set satisfies

$$\begin{aligned} |\Lambda {\setminus } \mathcal{G}_{\infty }| \le \sum _{n \ge 0}|\mathcal{G}_{n } {\setminus } \mathcal{G}_{n+1} | \le C \gamma + \sum _{n \ge 1} \gamma C N_{n}^{-1} \le C \gamma = C \varepsilon ^{a} , \end{aligned}$$

proving (1.9). The proof of Theorem 1.1 is complete. In order to finish the proof of Theorems 1.2 and 1.3, it remains to prove the linear stability of the solution, namely Theorem 1.5.

Proof of Theorem 1.4

Part \((i) \) follows by (4.72), Lemma 4.5, Theorem 4.1 (applied to the solution \( u_\infty (\lambda ) \)) with the exponents \( \bar{\sigma }:= \sigma + \beta + 3 \), \( \Lambda _\infty (u) := \Lambda _\infty ^{2\gamma } (u) \), see (4.6). Part (\(ii\)) follows by the dynamical interpretation of the conjugation procedure, as explained in Sect. 2.2. Explicitly, in Sects. 3 and 4, we have proved that

$$\begin{aligned} {\mathcal {L}}= \mathcal{A} B \rho W {\mathcal {L}}_\infty W^{-1} B^{-1} \mathcal{A}^{-1}, \quad W := \mathcal{M} \mathcal{T} {\mathcal {S}}\Phi _\infty . \end{aligned}$$

By the arguments in Sect. 2.2 we deduce that a curve \(h(t)\) in the phase space \(H^s_x\) is a solution of the dynamical system (1.19) if and only if the transformed curve

$$\begin{aligned} v(t) := W^{-1}(\omega t) B^{-1} \mathcal{A}^{-1}(\omega t) h(t) \end{aligned}$$
(5.39)

(see notation (2.18), Lemma 3.3, (4.9)) is a solution of the constant coefficients dynamical system (1.20).

Proof of Theorem 1.5

If all the \( \mu _j \) are purely imaginary, the Sobolev norm of the solution \( v(t) \) of (1.20) is constant in time, see (1.21). We now show that the Sobolev norm of the solution \( h(t) \) in (5.39) also does not grow in time. For each \(t \in {\mathbb {R}}\), \( \mathcal{A}(\omega t) \) and \(W(\omega t)\) are transformations of the phase space \(H^s_x \) that depend quasi-periodically on time and satisfy, by (3.69), (3.71), (4.9),

$$\begin{aligned} \Vert \mathcal{A}^{\pm 1}(\omega t) g \Vert _{H^s_x} + \Vert W^{\pm 1}(\omega t) g \Vert _{H^s_x} \le C(s) \Vert g \Vert _{H^s_x} , \quad \forall t \in {\mathbb {R}}, \ \forall g = g(x) \in H^s_x,\nonumber \\ \end{aligned}$$
(5.40)

where the constant \(C(s)\) depends on \(\Vert u \Vert _{s + \sigma + \beta + {\mathfrak {s}}_0} < + \infty \). Moreover, the transformation \(B\) is a quasi-periodic reparametrization of the time variable (see (2.25)), namely

$$\begin{aligned} Bf(t) = f(\psi (t)) = f(\tau ), \quad B^{-1}f(\tau ) = f(\psi ^{-1}(\tau )) = f(t) \quad \forall f : {\mathbb {R}}\rightarrow H^s_x,\nonumber \\ \end{aligned}$$
(5.41)

where \(\tau = \psi (t) := t + \alpha (\omega t)\), \(t = \psi ^{-1}(\tau ) = \tau + \tilde{\alpha }(\omega \tau )\) and \(\alpha \), \(\tilde{\alpha }\) are defined in Sect. 3.2. Thus

$$\begin{aligned} \Vert h(t) \Vert _{H^s_x}&\mathop {=}\limits ^{(5.39)} \Vert \mathcal{A}(\omega t) B W(\omega t) v (t) \Vert _{H^s_x} \mathop {\le }\limits ^{(5.40)}C(s) \Vert B W(\omega t) v (t) \Vert _{H^s_x} \\&\mathop {=}\limits ^{(5.41)} C(s) \Vert W(\omega \tau ) v (\tau ) \Vert _{H^s_x} \mathop {\le }\limits ^{(5.40)}C(s) \Vert v (\tau ) \Vert _{H^s_x} \mathop {=}\limits ^{(1.21)} C(s) \Vert v (\tau _0) \Vert _{H^s_x} \\&\mathop {=}\limits ^{(5.39)} C(s) \Vert W^{-1}(\omega \tau _0) B^{-1} \mathcal{A}^{-1}(\omega \tau _0) h(\tau _0) \Vert _{H^s_x} \\&\mathop {\le }\limits ^{(5.40)}C(s) \Vert B^{-1} \mathcal{A}^{-1}(\omega \tau _0) h(\tau _0) \Vert _{H^s_x} \mathop {=}\limits ^{(5.41)} C(s) \Vert \mathcal{A}^{-1}(0) h(0) \Vert _{H^s_x}\\&\mathop {\le }\limits ^{(5.40)}C(s) \Vert h(0) \Vert _{H^s_x} \end{aligned}$$

having chosen \(\tau _0 := \psi (0) = \alpha (0)\) (in the reversible case, \(\alpha \) is an odd function, and so \(\alpha (0) = 0\)). Hence (1.22) is proved. To prove (1.23), we collect the estimates (3.70), (3.72), (4.9) into

$$\begin{aligned}&\Vert (\mathcal{A}^{\pm 1}(\omega t) - I) g \Vert _{H^s_x} + \Vert (W^{\pm 1}(\omega t) - I) g \Vert _{H^s_x} \nonumber \\&\quad \le \varepsilon \gamma ^{-1} C(s) \Vert g \Vert _{H^{s+1}_x} , \quad \forall t \in {\mathbb {R}}, \ \forall g \in H^s_x, \end{aligned}$$
(5.42)

where the constant \(C(s)\) depends on \(\Vert u \Vert _{s + \sigma + \beta + {\mathfrak {s}}_0}\). Thus

$$\begin{aligned} \Vert h(t) \Vert _{H^s_x}&\mathop {=}\limits ^{(5.39)} \Vert \mathcal{A}(\omega t) B W(\omega t) v (t) \Vert _{H^s_x} \\&\le \Vert B W(\omega t) v (t) \Vert _{H^s_x} + \Vert (\mathcal{A}(\omega t)-I) B W(\omega t) v (t) \Vert _{H^s_x} \\&\mathop {\le }\limits ^{(5.41)(5.42)} \Vert W(\omega \tau ) v(\tau ) \Vert _{H^s_x} + \varepsilon \gamma ^{-1} C(s) \Vert B W(\omega t) v (t) \Vert _{H^{s+1}_x} \\&\mathop {=}\limits ^{(5.41)} \Vert W(\omega \tau ) v(\tau ) \Vert _{H^s_x} + \varepsilon \gamma ^{-1} C(s) \Vert W(\omega \tau ) v (\tau ) \Vert _{H^{s+1}_x} \\&\mathop {\le }\limits ^{(5.40)} \Vert v(\tau ) \Vert _{H^s_x} + \Vert (W(\omega \tau ) - I) v(\tau ) \Vert _{H^s_x} + \varepsilon \gamma ^{-1} C(s) \Vert v (\tau ) \Vert _{H^{s+1}_x} \\&\mathop {\le }\limits ^{(5.42)} \Vert v(\tau ) \Vert _{H^s_x} + \varepsilon \gamma ^{-1} C(s) \Vert v(\tau ) \Vert _{H^{s+1}_x}\\&\mathop {=}\limits ^{(1.21)} \Vert v(\tau _0) \Vert _{H^s_x} + \varepsilon \gamma ^{-1} C(s) \Vert v(\tau _0) \Vert _{H^{s+1}_x} \\&\mathop {=}\limits ^{(5.39)} \Vert W^{-1}(\omega \tau _0) B^{-1} \mathcal{A}^{-1}(\omega \tau _0) h(\tau _0) \Vert _{H^s_x}\\&+ \varepsilon \gamma ^{-1} C(s) \Vert W^{-1}(\omega \tau _0) B^{-1} \mathcal{A}^{-1}(\omega \tau _0) h(\tau _0) \Vert _{H^{s+1}_x}. \end{aligned}$$

Applying the same chain of inequalities at \( \tau = \tau _0 \), \( t = 0 \), we get that the last term is

$$\begin{aligned} \le \Vert h(0) \Vert _{H^s_x} + \varepsilon \gamma ^{-1} C(s) \Vert h(0) \Vert _{H^{s+1}_x} , \end{aligned}$$

proving the second inequality in (1.23) with \( \mathtt a := 1 - a \). The first one follows similarly.