Abstract
We prove the existence of Cantor families of small amplitude, linearly stable, quasi-periodic solutions of quasi-linear (also called strongly nonlinear) autonomous Hamiltonian differentiable perturbations of the mKdV equation. The proof is based on a weak version of the Birkhoff normal form algorithm and a nonlinear Nash–Moser iteration. The analysis of the linearized operators at each step of the iteration is achieved by pseudo-differential operator techniques and a linear KAM reducibility scheme.
1 Introduction and main result
In the paper [5] we proved the first existence result of quasi-periodic solutions for autonomous quasi-linear PDEs (also called “strongly nonlinear” in [23]), in particular of small amplitude quasi-periodic solutions of the KdV equation subject to a Hamiltonian quasi-linear perturbation. The approach developed in [5] (see also [4]) is of wide applicability for quasi-linear PDEs in one space dimension. In this paper we take the opportunity to explain the general strategy of [5] applied to a model which is slightly simpler than KdV.
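We consider the cubic, focusing or defocusing, mKdV equation
$$\begin{aligned} u_t + u_{xxx} + \varsigma \partial _x (u^3) + \mathcal{N}_4 (x, u, u_x, u_{xx}, u_{xxx}) = 0 , \quad \varsigma = \pm 1 , \end{aligned}$$ (1.1)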
under periodic boundary conditions \( x \in \mathbb T:= \mathbb R/ 2 \pi \mathbb Z\), where
is the most general quasi-linear Hamiltonian (local) nonlinearity. Note that \( \mathcal{N}_4 \) contains as many derivatives as the linear vector field \( \partial _{xxx} \). It is a quasi-linear perturbation because \( \mathcal{N}_4 \) depends linearly on the highest derivative \( u_{xxx} \) multiplied by a coefficient which is a nonlinear function of the lower order derivatives \( u, u_x, u_{xx} \). Equation (1.1) is the Hamiltonian PDE
$$\begin{aligned} u_t = \partial _x \nabla H(u) , \end{aligned}$$ (1.3)
where \( \nabla H \) denotes the \(L^2(\mathbb T_x)\) gradient of the Hamiltonian
on the real phase space
endowed with the non-degenerate symplectic form
where \( \partial _x^{-1} u \) is the periodic primitive of u with zero average. The phase space \( H^1_0 (\mathbb T_x) \) is invariant for the evolution of (1.1) because the integral \( \int _{\mathbb T} u(x) \, dx \) is a prime integral (the mass). For simplicity we fix its value to \( \int _{\mathbb T} u(x) \, dx = 0 \). We recall that the Poisson bracket between two functions \( F, G :H^1_0(\mathbb T_x) \rightarrow \mathbb R\) is defined as
We assume that the “Hamiltonian density” f is of class \(C^q (\mathbb T\times \mathbb R\times \mathbb R; \mathbb R) \) for some q large enough (otherwise, as it is well known, we cannot expect the existence of smooth invariant KAM tori). We also assume that f vanishes at \( u = u_x = 0\) and
As a consequence the nonlinearity \( {\mathcal N}_4 \) vanishes at order 4 at \( u = 0 \) and (1.1) may be seen, close to the origin, as a “small” perturbation of the cubic mKdV equation
$$\begin{aligned} u_t + u_{xxx} + \varsigma \partial _x (u^3) = 0 . \end{aligned}$$ (1.9)
This equation is known to be completely integrable. Actually it is mapped into KdV by a Miura transform, and it may be described by global analytic action-angle variables, as was proved by Kappeler and Topalov [19]. We also remark that, among the generalized KdV equations \( u_t + u_{xxx} \pm \partial _x (u^p) = 0\), \( p \in \mathbb N\), the only known completely integrable ones are the KdV \( p=2\) and the cubic mKdV \( p = 3 \).
It is natural to ask whether the periodic, quasi-periodic or almost periodic solutions of (1.9) persist under small perturbations. This is the content of KAM theory. It is a difficult problem because of small divisor resonance phenomena, which are especially strong in the presence of quasi-linear perturbations like \(\mathcal{N}_4\).
In this paper (as well as in [5]) we restrict the analysis to the search of small amplitude quasi-periodic solutions. It is also a very interesting question to investigate possible extensions of this result to perturbations of finite gap solutions. A difficulty which arises in the search of small amplitude solutions is that the mKdV Eq. (1.1) is a completely resonant PDE at \( u = 0 \), namely the linearized equation at the origin is the linear Airy equation
$$\begin{aligned} u_t + u_{xxx} = 0 \end{aligned}$$
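which possesses only the real solutions, \( 2 \pi \)-periodic in time,
$$\begin{aligned} u(t,x) = \sum _{j \in \mathbb Z{\setminus } \{0\}} u_j \, e^{{\mathrm i}j^3 t} e^{{\mathrm i}j x} , \quad u_{-j} = \overline{u_j} . \end{aligned}$$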
Thus the existence of small amplitude quasi-periodic solutions of (1.1) is entirely due to the nonlinearity. Indeed, the nonlinear term \(\varsigma \partial _x (u^3)\) is the one that produces the main modulation of the frequency vector of the solution with respect to its amplitude (the well-known frequency-to-action map, or frequency-amplitude relation, or “twist”, see (4.10)) and that allows us to “tune” the action parameters \(\xi \) so that the frequencies become rationally independent and diophantine. Note that the mKdV Eq. (1.1) does not depend on other external parameters which may influence the frequencies. This is a further difficulty in the study of autonomous PDEs with respect to the forced cases studied in [3]. Actually, in [3] we considered non-autonomous quasi-linear (and fully nonlinear) perturbations of the Airy equation and we used the forcing frequencies as independent parameters.
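The complete resonance at \( u = 0 \) can be checked on single Fourier modes. The following is a minimal sketch, assuming the plane waves \( e^{{\mathrm i}(j^3 t + j x)} \) as the building blocks of the Airy solutions; the function names and the sample amplitudes are hypothetical and serve only as an illustration.

```python
import cmath
import math


def airy_mode(j, t, x):
    """Plane wave exp(i (j^3 t + j x)) for an integer Fourier index j."""
    return cmath.exp(1j * (j**3 * t + j * x))


def airy_residual(j, t, x):
    """Value of u_t + u_xxx on the plane wave, using exact derivatives."""
    u = airy_mode(j, t, x)
    u_t = 1j * j**3 * u        # d/dt brings down the factor i j^3
    u_xxx = (1j * j) ** 3 * u  # three x-derivatives bring down (i j)^3 = -i j^3
    return u_t + u_xxx


def real_solution(coeffs, t, x):
    """Real superposition: for each j > 0, c_j e_j + conj(c_j) e_{-j} = 2 Re(c_j e_j)."""
    return sum(2.0 * (c * airy_mode(j, t, x)).real for j, c in coeffs.items())


# Hypothetical amplitudes on the positive sites 1, 2, 5 (illustration only).
coeffs = {1: 0.3 + 0.1j, 2: -0.2j, 5: 0.05}
t0, x0 = 0.37, 1.9

# The residual vanishes up to rounding: each mode solves the Airy equation.
print(abs(airy_residual(5, t0, x0)))
# Shifting t by 2*pi changes nothing up to rounding: all frequencies j^3 are integers.
print(abs(real_solution(coeffs, t0 + 2 * math.pi, x0) - real_solution(coeffs, t0, x0)))
```

Since every linear frequency \( j^3 \) is an integer, no superposition of such modes is genuinely quasi-periodic, which is why the frequency modulation must come from the nonlinearity.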
The core of the matter is to understand the perturbative effect of the quasi-linear term \( \mathcal{N}_4 \) over infinite times. By (1.8), close to the origin, the quartic term \( \mathcal{N}_4 \) is smaller than the pure cubic mKdV (1.9). Therefore, when we restrict the equation to finitely many space-Fourier indices \( |j| \le C \), we essentially enter in the range of applicability of finite dimensional KAM theory close to an elliptic equilibrium. The new problem is to understand what happens to the dynamics on the high frequencies \( |j| \rightarrow + \infty \), since \( \mathcal{N}_4 \) is a nonlinear differential operator of the same order (i.e. 3) as the constant coefficient linear (and integrable) vector field \(\partial _{xxx}\).
Does such a strongly nonlinear perturbation give rise to the formation of singularities for a solution in finite time, as it happens for the quasi-linear wave equations considered by Lax [25] and Klainerman and Majda [20]? Or, on the contrary, does the KAM phenomenon persist nevertheless for the mKdV Eq. (1.1)? The answer to these questions has been controversial for several years. For example, Kappeler and Pöschel [18, Remark 3, p. 19] wrote: “It would be interesting to obtain perturbation results which also include terms of higher order, at least in the region where the KdV approximation is valid. However, results of this type are still out of reach, if true at all”.
We think that these are very important dynamical questions to be investigated, especially because many of the equations arising in physics are quasi-linear or even fully nonlinear.
The main result of this paper proves that the KAM phenomenon actually persists, at least close to the origin, for quasi-linear Hamiltonian perturbations of mKdV (the same result is proved in [5] for KdV). More precisely, Theorem 1 proves the existence of Cantor families of small amplitude, linearly stable, quasi-periodic solutions of the mKdV Eq. (1.1) subject to quasi-linear Hamiltonian perturbations. It is not surprising that the same result applies for both the focusing and the defocusing mKdV because we are looking for small amplitude solutions. Thus the different sign \( \varsigma = \pm 1 \) only affects the branch of the bifurcation.
From a dynamical point of view, note that the parameters \(\xi \) selected by the KAM Theorem 1 give rise to solutions of (1.1) and (1.2) which are global in time. This is interesting information because, as far as we know, there are no results of global or even local solutions of the Cauchy problem for (1.1) and (1.2), and such PDEs are in general believed to be ill-posed in Sobolev spaces (for a rough result of local well-posedness for (1.1) and (1.2) see [6]).
The iterative procedure we are going to present is able to select many parameters \(\xi \) which give rise to quasi-periodic solutions (hence defined for all times). This procedure works for parameters belonging to a finite dimensional Cantor-like set which becomes asymptotically dense at the origin.
How can this kind of result be achieved? The proof of Theorem 1, which we shall discuss in more detail later, is based on an iterative Nash–Moser scheme. As is well known, the main step of this procedure is to invert the linearized operators obtained at each step of the iteration and to prove that the inverse operators, although they lose derivatives (because of small divisors), satisfy tame estimates in high Sobolev norms. The linearized equations are non-autonomous linear PDEs which depend quasi-periodically on time. The key point of this paper (and [5]) is that, using the symplectic decoupling of [10], some techniques of pseudo-differential operators adapted to the symplectic structure, and a linear Birkhoff normal form analysis, we are able to construct, for most diophantine frequencies, a time dependent (quasi-periodic) change of variables which conjugates each linearized equation into another one that is diagonal and has constant coefficients, that is, in “normal form”. This means that, in the new coordinates, we have integrated the equations. Then we easily invert the linearized operator (recall that the inverse loses derivatives because of small divisors) and we conjugate it back to solve the linear equation in the original set of variables. We remark that these quasi-periodic Floquet changes of variable map Sobolev spaces of arbitrarily high norms into themselves and satisfy tame estimates. Hence the inverse operator also loses derivatives, but it satisfies tame estimates as well.
In the dynamical systems literature, this strategy is called “reducibility” of the equation and it is a quasi-periodic KAM perturbative extension of Floquet theory (Floquet theory deals with periodic solutions of finite dimensional systems). The difficulty in making it work in the present setting is due to the quasi-linear character of the nonlinearity in (1.1).
Before stating our main result precisely, we briefly present some related literature. In recent years much attention has been devoted to understanding the effect of derivatives in the nonlinearity in KAM theory. For unbounded perturbations the first KAM results have been proved by Kuksin [22] and Kappeler and Pöschel [18] for KdV (see also Bourgain [13]), and more recently by Liu and Yuan [26], Zhang et al. [30] for derivative NLS, and by Berti et al. [7, 8] for derivative NLW. For a recent survey of known results for KdV, we refer to [15]. Actually all these results still concern semi-linear perturbations.
The KAM theorems in [18, 22] prove the persistence of the finite-gap solutions of the integrable KdV under semilinear Hamiltonian perturbations \( \varepsilon \partial _{x} (\partial _u f) (x, u) \), namely when the density f is independent of \( u_x \), so that (1.2) is a differential operator of order 1. The key idea in [22] is to exploit the fact that the frequencies of KdV grow as \( \sim j^3 \) and the difference \( |j^3 - i^3| \ge \frac{1}{2} (j^2 + i^2) \), \(i \ne j \), so that KdV gains (outside the diagonal) two derivatives. This approach also works for Hamiltonian pseudo-differential perturbations of order 2 (in space), using the improved version of Kuksin’s lemma proved by Liu and Yuan [26]. However it does not work for the general quasi-linear perturbation in (1.2), which is a nonlinear differential operator of the same order as the constant coefficient linear operator \( \partial _{xxx}\).
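The gap inequality \( |j^3 - i^3| \ge \frac{1}{2} (j^2 + i^2) \) quoted above follows from the factorization \( j^3 - i^3 = (j - i)(j^2 + ij + i^2) \) together with \( j^2 + ij + i^2 = \frac{1}{2}(j^2 + i^2) + \frac{1}{2}(j + i)^2 \) and \( |j - i| \ge 1 \). As a sanity check, the sketch below tests it exhaustively on a finite box of integers; the function names and the box size are illustrative choices.

```python
def gap_holds(i, j):
    """Check 2 |j^3 - i^3| >= j^2 + i^2 in exact integer arithmetic."""
    return 2 * abs(j**3 - i**3) >= j**2 + i**2


def check_gap(bound):
    """True if the inequality holds for all integers i != j with |i|, |j| <= bound."""
    rng = range(-bound, bound + 1)
    return all(gap_holds(i, j) for i in rng for j in rng if i != j)


print(check_gap(40))
```

The finite check is of course no substitute for the two-line proof; it only guards against a sign or factor slip.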
Now we state precisely the main result of the paper. The solutions we find are, at the first order of amplitude, localized in Fourier space on finitely many “tangential sites”
The set S is required to be even because the solutions u of (1.1) have to be real valued. Moreover, we also assume the following explicit hypothesis on S:
Assumption (1.12) is a “non-degeneracy” condition. We assume it to prove that the Cantor-like set of amplitudes \( \xi \in \mathbb R^\nu _+ \) for which the quasi-periodic solution (1.13) exists has positive measure, see Lemmata 28, 29 and Remark 13.
Theorem 1
(KAM for quasi-linear perturbations of mKdV) Given \( \nu \in \mathbb N\), let \( f \in C^q \) [with \( q := q(\nu ) \) large enough] satisfy (1.8). Then, for all the tangential sites S as in (1.11) satisfying (1.12), the mKdV Eq. (1.1) possesses small amplitude quasi-periodic solutions with diophantine frequency vector \(\omega := \omega (\xi ) = (\omega _j)_{j \in S^+} \in \mathbb R^\nu \) of the form
where
for a “Cantor-like” set of small amplitudes \( \xi \in \mathbb R^\nu _+ \) with density 1 at \( \xi = 0 \). The term \(o(\sqrt{|\xi |})\) in (1.13) is a function \(u_1(t,x) = \tilde{u}_1(\omega t, x)\), with \(\tilde{u}_1\) in the Sobolev space \(H^s(\mathbb T^{\nu +1},\mathbb R)\) of periodic functions, and Sobolev norm \(\Vert \tilde{u}_1 \Vert _s = o(\sqrt{|\xi |})\) as \(\xi \rightarrow 0\), for some \(s < q\). These quasi-periodic solutions are linearly stable.
If the density \( f(u, u_x) \) is independent of x, a similar result holds for all the choices of the tangential sites, without assuming (1.12).
This result is deduced from Theorem 2. It was announced also in [4, 5] under the stronger condition on the tangential sites
Let us make some comments.
1. In the case \(\nu = 1\) (time-periodic solutions), the condition (1.12) is always satisfied. Indeed, suppose, by contradiction, that there exist integers \(\bar{\jmath }_1 \ge 1\), \(j,k \in \mathbb Z\) such that
$$\begin{aligned} 2 \bar{\jmath }_1^{\,2} = j^2 + jk + k^2. \end{aligned}$$ (1.16)
Then \(j^2 + jk + k^2\) is even, and therefore both j and k are even, say \(j = 2n\), \(k = 2m\) with \(n,m \in \mathbb Z\). Hence \(2 \bar{\jmath }_1^{\,2} = 4(n^2 + nm + m^2)\), and this implies that \(\bar{\jmath }_1\) is even, say \(\bar{\jmath }_1 = 2p\) for some positive integer p. It follows that \(2 p^2 = n^2 + nm + m^2\), namely p, n, m satisfy (1.16). Then, iterating the argument, we deduce that \(\bar{\jmath }_1\) can be divided by 2 infinitely many times in \(\mathbb N\), which is impossible.
2. When the density \( f(u, u_x )\) is independent of x, the \(L^2\)-norm
$$\begin{aligned} M(u) := \int _\mathbb Tu^2 \, dx = \Vert u \Vert _{L^2(\mathbb T)}^2 \end{aligned}$$ (1.17)
is a prime integral of the Hamiltonian Eq. (1.1). Hence the solutions of (1.1) are in one-to-one correspondence with those of the Hamiltonian equation
$$\begin{aligned} v_t = \partial _x \nabla K(v) \ \quad \text {with } K := H + \lambda M^2 , \quad \lambda \in \mathbb R. \end{aligned}$$ (1.18)
More precisely, if u(t, x) is a solution of (1.1), then \(v(t,x) := u(t, x-ct)\), with \(c := -4\lambda M(u)\), is a solution of (1.18). Vice versa, if v(t, x) solves (1.18), then the function \(u(t,x) := v(t, x+ct)\), with \(c := -4\lambda M(v)\), is a solution of (1.1) (M(v) is also a prime integral of the Eq. (1.18)). The advantage of looking for quasi-periodic solutions of (1.18) is that, for \( \lambda = 3\varsigma /4 \), the fourth order Birkhoff normal form of K is diagonal (Remark 1) and therefore no conditions on the tangential sites S are required (Remark 13).
3. The diophantine frequency vector \( \omega (\xi ) = (\omega _j)_{j \in S^+} \in \mathbb R^\nu \) of the quasi-periodic solutions of Theorem 1 is \( O(|\xi |) \)-close as \( \xi \rightarrow 0 \) (see (1.14)) to the integer vector of the unperturbed linear frequencies
$$\begin{aligned} \bar{\omega }:= (\bar{\jmath }_1^3, \ldots , \bar{\jmath }_\nu ^3) \in \mathbb N^\nu . \end{aligned}$$ (1.19)
This makes perturbation theory more difficult: it is the difficulty caused by the fact that the mKdV Eq. (1.1) is completely resonant at \( u = 0 \).
4. As shown by (1.13) the expected quasi-periodic solutions are mainly supported in Fourier space on the tangential sites S. The dynamics of the Hamiltonian PDE (1.1) restricted (and projected) to the symplectic subspaces
$$\begin{aligned} H_S := \left\{ v = \sum _{j \in S} u_j e^{{\mathrm i}jx} \right\} , \quad H_S^\bot := \left\{ z = \sum _{j \in S^c} u_j e^{{\mathrm i}jx} \in H^1_0(\mathbb T_x) \right\} , \end{aligned}$$ (1.20)
where \(S^c := \{ j \in \mathbb Z{\setminus } \{ 0 \} : j \notin S \}\), is quite different. We call v the tangential variable and z the normal one. On \( H_S \) the dynamics is mainly governed by a finite dimensional integrable system (see Proposition 1), and we find it convenient to describe the dynamics in this subspace by introducing action-angle variables, see Sect. 4. On the infinite dimensional subspace \( H_S^\bot \) the solution will stay forever close to the elliptic equilibrium \( z = 0 \).
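The descent argument of comment 1 can be cross-checked by brute force. The sketch below searches for integer solutions of \( 2 p^2 = j^2 + jk + k^2 \) with \( p \ge 1 \); the function names are hypothetical, and the search window \( |j|, |k| \le 2p \) relies on the elementary bound \( j^2 + jk + k^2 \ge \frac{1}{2}(j^2 + k^2) \), which is an observation added here and not part of the argument above.

```python
def has_solution(p, bound):
    """True if 2 p^2 = j^2 + j k + k^2 for some integers |j|, |k| <= bound."""
    target = 2 * p * p
    for j in range(-bound, bound + 1):
        for k in range(-bound, bound + 1):
            if j * j + j * k + k * k == target:
                return True
    return False


def descent_counterexamples(max_p):
    """List the p in 1..max_p for which a solution exists (expected: none).

    Since j^2 + jk + k^2 >= (j^2 + k^2) / 2, any solution has |j|, |k| <= 2p,
    so the window below covers all candidates for a given p.
    """
    return [p for p in range(1, max_p + 1) if has_solution(p, 2 * p)]


print(descent_counterexamples(30))
```

An empty list is consistent with the claim that, for \( \nu = 1 \), condition (1.12) holds for every choice of \( \bar{\jmath }_1 \).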
In Theorem 1 it is stated that the quasi-periodic solutions are linearly stable. This information is not only an important complement of the result, but also an essential ingredient for the existence proof. Let us explain better what we mean. By the general procedure in [10] we prove that, around each invariant torus, there exist symplectic coordinates (see (6.13))
in which the mKdV Hamiltonian (1.4) assumes the normal form
where \( K_{\ge 3} \) collects the terms at least cubic in the variables \( (\eta , w )\), see Remark 4. In these coordinates the quasi-periodic solution reads \( t \mapsto (\omega t , 0, 0 ) \) and the corresponding linearized equations are
Thus the actions \( \eta (t) = \eta (0) \) do not evolve in time and the third equation reduces to the forced PDE
Ignoring the forcing term \(\partial _x K_{11}(\omega t)[ \eta _0]\) for a moment, we note that the equation \(\dot{w} = \partial _x K_{02}(\omega t)[w]\) is, up to a finite dimensional remainder (Proposition 3), the restriction to \( H_{S}^\bot \) of the “variational equation”
where \( X_K \) is the KdV Hamiltonian vector field with quadratic Hamiltonian \( K = \frac{1}{2} ((\partial _u \nabla H)(u)[h], h)_{L^2(\mathbb T_x)} \) \(= \frac{1}{2} (\partial _{uu} H)(u)[h, h] \). This is a linear PDE with quasi-periodically time-dependent coefficients of the form
In Sect. 8 we prove the reducibility of the linear operator \( {\dot{w}} - \partial _x K_{0 2}(\omega t ) w \), which conjugates (1.23) to the diagonal system (see (8.64))
where \(\mathcal{D}_\infty := \mathrm{Op} \{ \mu _j^\infty \}_{j \in S^c}\) is a Fourier multiplier operator acting in \( H^s_\bot \),
with \( m_3 = 1 + O(\varepsilon ^3) \), \( m_1 = O(\varepsilon ^2) \), \( \sup _{j \in S^c} r_j^\infty = o(\varepsilon ^2) \), see (8.61) and (8.62). The eigenvalues \( \mu _j^\infty \) are the Floquet exponents of the quasi-periodic solution. The solutions of the scalar non-homogeneous equations
are
(recall that the first Melnikov conditions (8.66) hold at a solution). As a consequence, the Sobolev norm of the solution of (1.25) satisfies
i.e. it does not increase in time.
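To illustrate the mechanism behind this bound, here is a sketch under an assumed normalization: suppose each scalar equation has the form \( \dot{w}_j + {\mathrm i}\mu _j w_j = f_j(\omega t) \) with \( \mu _j \in \mathbb R\) and \( f_j(\varphi ) = \sum _{l \in \mathbb Z^\nu } f_{jl} e^{{\mathrm i}l \cdot \varphi } \). The sign convention and the symbols \( f_{jl} \), \( c_j \) are assumptions of this sketch. Then, whenever \( \omega \cdot l + \mu _j \ne 0 \) for all l,
$$\begin{aligned} w_j(t) = c_j e^{- {\mathrm i}\mu _j t} + \sum _{l \in \mathbb Z^\nu } \frac{f_{jl}}{{\mathrm i}(\omega \cdot l + \mu _j)} \, e^{{\mathrm i}\omega \cdot l \, t} , \qquad \frac{d}{dt} \Big ( \frac{e^{{\mathrm i}\omega \cdot l \, t}}{{\mathrm i}(\omega \cdot l + \mu _j)} \Big ) + {\mathrm i}\mu _j \frac{e^{{\mathrm i}\omega \cdot l \, t}}{{\mathrm i}(\omega \cdot l + \mu _j)} = e^{{\mathrm i}\omega \cdot l \, t} . \end{aligned}$$
The homogeneous part has constant modulus \( |c_j| \) because \( \mu _j \) is real, and the particular part is quasi-periodic in time, hence bounded as long as the divisors \( \omega \cdot l + \mu _j \) are bounded from below in a suitable quantitative way, which is the role of the first Melnikov conditions.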
We now describe in detail the strategy of proof of Theorem 1. Many of the arguments that we use are quite general and of wide applicability to other PDEs. Nevertheless, we think that a unique abstract theorem of existence and stability of quasi-periodic solutions applicable to all quasi-linear PDEs cannot be expected. Indeed the suitable pseudo-differential operators that are required to conjugate the highest order of the linearized operator to constant coefficients highly depend on the PDE at hand, see the discussion after (1.29).
There are two main issues in the proof:
1. Bifurcation analysis. Find approximate quasi-periodic solutions of (1.1) up to a sufficiently small remainder [which, in our case, should be \( O( u^4 ) \)]. In this step we also find the approximate “frequency-to-amplitude” modulation, namely the dependence of the frequency on the amplitude, see (4.10). This is the goal of Sects. 3 and 4.
2. Nash–Moser implicit function theorem. Prove that, close to the above approximate solutions, there exist exact quasi-periodic solutions of (1.1). By means of a Nash–Moser iteration, we construct a sequence of approximate solutions that converges to a quasi-periodic solution of (1.1) (Sects. 5–9). The key step consists in proving the invertibility of the linearized operator and tame estimates for its inverse. This is achieved in two main steps.
(a) Symplectic decoupling procedure. The method in Berti and Bolle [10] allows one to approximately decouple the “tangential” and the “normal” dynamics around an approximate invariant torus (Sect. 6). It reduces the problem to that of inverting a quasi-periodically forced PDE restricted to the normal subspace \( H_S^\bot \). Its precise form is found in Sect. 7.2.
(b) Analysis of the linearized operator in the normal directions. In Sects. 7 and 8 we reduce the linearized equations to constant coefficients. This involves three steps:
All the changes of variables used in the steps (i)–(iii) are \( \varphi \)-dependent families of symplectic maps \( \Phi (\varphi ) \) which act on the phase space \( H^1_0 (\mathbb T_x) \). Therefore they preserve the Hamiltonian dynamical systems structure of the conjugated linear operators.
Let us discuss these issues in detail.
Weak Birkhoff normal form. According to the orthogonal splitting
into the symplectic subspaces defined in (1.20), we decompose
where \(\Pi _S \), \(\Pi _S^\bot \) denote the orthogonal projectors on \( H_S \), \( H_S^\bot \).
We perform a “weak” Birkhoff normal form (weak BNF), whose goal is to find an invariant manifold of solutions of the third order approximate mKdV Eq. (1.1), on which the dynamics is completely integrable, see Sect. 3. We construct in Proposition 1 a symplectic map \( \Phi _B \) such that the transformed Hamiltonian \(\mathcal {H}:= H \circ \Phi _B\) possesses the invariant subspace \( H_S \) (see (1.20)). To this purpose we have to eliminate the term \( \int v^3 z \, dx \) (which is linear in z). Then we check that its dynamics on \( H_S \) is integrable and non-isochronous. For that we perform the classical finite dimensional Birkhoff normalization of the Hamiltonian term \( \int v^4 \, dx \) which turns out to be integrable and non-isochronous.
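As background for the normalization of the quartic term, recall the standard algebraic identity \( j_1^3 + j_2^3 + j_3^3 + j_4^3 = -3 (j_1 + j_2)(j_1 + j_3)(j_2 + j_3) \), valid whenever \( j_1 + j_2 + j_3 + j_4 = 0 \). It implies that a zero-sum quadruple of nonzero integers with vanishing cubic sum splits into two pairs of opposite indices, which is consistent with the quartic normal form depending on the actions only. The sketch below checks both facts on a finite box; the function names are hypothetical and the identity is quoted as a general fact, not as a statement of this section.

```python
from itertools import product


def cubic_sum(j1, j2, j3):
    """Sum of cubes of (j1, j2, j3, j4) with j4 fixed by j1 + j2 + j3 + j4 = 0."""
    j4 = -(j1 + j2 + j3)
    return j1**3 + j2**3 + j3**3 + j4**3


def factored(j1, j2, j3):
    """Right-hand side of the identity: -3 (j1 + j2)(j1 + j3)(j2 + j3)."""
    return -3 * (j1 + j2) * (j1 + j3) * (j2 + j3)


def identity_holds(bound):
    """Check the identity for all |j1|, |j2|, |j3| <= bound."""
    rng = range(-bound, bound + 1)
    return all(cubic_sum(*js) == factored(*js) for js in product(rng, repeat=3))


def resonances_are_paired(bound):
    """Check that resonant quadruples of nonzero integers are of the form {a, -a, b, -b}."""
    nonzero = [j for j in range(-bound, bound + 1) if j != 0]
    for j1, j2, j3 in product(nonzero, repeat=3):
        j4 = -(j1 + j2 + j3)
        if j4 == 0 or cubic_sum(j1, j2, j3) != 0:
            continue
        quad = sorted([j1, j2, j3, j4])
        # A quadruple made of two opposite pairs is invariant under j -> -j.
        if quad != sorted(-j for j in quad):
            return False
    return True


print(identity_holds(8), resonances_are_paired(8))
```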
Since the present weak Birkhoff map has to remove only finitely many monomials, it is the time-1 flow map of a Hamiltonian system whose Hamiltonian is supported on only finitely many Fourier indices. Therefore it is close to the identity up to finite dimensional operators, see Proposition 1. The key advantage is that it modifies \( {\mathcal N}_4 \) very mildly, only up to finite dimensional operators (see for example Lemma 12), and thus the spectral analysis of the linearized equations (that we shall perform in Sect. 8) is essentially the same as if we were in the original coordinates.
The weak normal form (3.7) does not remove (nor normalize) the monomials \( O(z^2) \). We point out that a stronger normal form that removes/normalizes the monomials \(O(z^2)\) is also well-defined (it is called “partial Birkhoff normal form” in Kuksin and Pöschel [24] and Pöschel [27]). However, we do not use it because, for such a stronger normal form, the corresponding Birkhoff map is close to the identity only up to an operator of order \( O(\partial _x^{-1}) \), and so it would produce terms of order \( \partial _{xx} \) and \( \partial _x \). For the same reason, we do not use the global nonlinear Fourier transform in [19] (Birkhoff coordinates), which is close to the Fourier transform up to smoothing operators of order \( O(\partial _x^{-1}) \) (this is explicitly proved for KdV).
We remark that mKdV is simpler than KdV because the nonlinearity in (1.1) is cubic and not only quadratic, and, as a consequence, fewer steps of Birkhoff normal form are required to reach the sufficient smallness for the Nash–Moser scheme to converge (see Remark 11).
Action-angle and rescaling. At this point we introduce action-angle variables on the tangential sites (Sect. 4) and, after the rescaling (4.5), we look for quasi-periodic solutions of the Hamiltonian (4.9). Note that the coefficients of the normal form \( \mathcal{N } \) in (4.13) depend on the angles \( \theta \), unlike the usual KAM theorems [21, 27], where the whole normal form is reduced to constant coefficients. This is because the weak BNF of Sect. 3 did not normalize the quadratic terms \( O(z^2) \). These terms are dealt with by the “linear Birkhoff normal form” (linear BNF) in Sect. 8.4. In some sense the “partial” Birkhoff normal form of [27] is split into the weak BNF of Sect. 3 and the linear BNF of Sect. 8.4.
The present functional formulation with the introduction of the action-angle variables allows us to prove the stability of the solutions (unlike the Lyapunov–Schmidt reduction approach).
Nonlinear functional setting and approximate inverse. We look for a zero of the nonlinear operator (5.6), where the unknown is the torus embedding \( \varphi \mapsto i(\varphi ) \), and where the frequency \( \omega \) is seen as an “external” parameter. This formulation is convenient in order to verify the Melnikov non-resonance conditions required to invert the linearized operators at each step. The solution is obtained by a Nash–Moser iterative scheme in Sobolev scales. The key step is to construct (for \( \omega \) restricted to a suitable Cantor-like set) an approximate inverse (à la Zehnder [31]) of the linearized operator at any approximate solution. Roughly, this means to find a linear operator which is an inverse at an exact solution. A major difficulty is that the tangential and the normal dynamics near an invariant torus are strongly coupled.
Symplectic approximate decoupling. The above difficulty is overcome by implementing the abstract procedure in Berti and Bolle [10], which was developed in order to prove the existence of quasi-periodic solutions for autonomous NLW (and NLS) with a multiplicative potential. This approach reduces the search of an approximate inverse for (5.6) to the invertibility of a quasi-periodically forced PDE restricted to the normal directions. This method approximately decouples the tangential and the normal dynamics around an approximate invariant torus, introducing a suitable set of symplectic variables
near the torus, see (6.13). Note that, in the first line of (6.13), \( \psi \) is the “natural” angle variable which coordinates the torus, and, in the third line, the normal variable z is only translated by the component \( z_0 (\psi )\) of the torus. The second line completes this transformation to a symplectic one. The canonicity of this map is proved in [10] using the isotropy of the approximate invariant torus \( i_\delta \), see Lemma 8. In these new variables the torus \( \psi \mapsto i_\delta (\psi ) \) reads \( \psi \mapsto (\psi , 0, 0 )\). The main advantage of these coordinates is that the second equation in (6.22) (which corresponds to the action variables of the torus) can be immediately solved, see (6.24). Then it remains to solve the third Eq. (6.25), i.e. to invert the linear operator \( \mathcal{L}_\omega \). This is a quasi-periodic Hamiltonian perturbed linear Airy equation of the form
where \( \mathcal {R}\) is a finite dimensional remainder. The exact form of \( \mathcal{L}_\omega \) is obtained in Proposition 3, see (7.23).
Reduction to constant coefficients of the linearized operator in the normal directions. In Sect. 8 we conjugate the variable coefficients operator \( \mathcal{L}_\omega \) to a diagonal operator with constant coefficients which describes infinitely many harmonic oscillators
where the constants \( m_3 -1 \), \( m_1 \in \mathbb R\) and \( \sup _j |r_j^\infty | \) are small, see Theorem 4. The main perturbative effect to the spectrum (and the eigenfunctions) of \( \mathcal{L}_\omega \) is due to the term \( a_1 (\omega t, x ) \partial _{xxx} \) (see (1.27)), and it is too strong for the usual reducibility KAM techniques to work directly. The conjugacy of \( \mathcal{L}_\omega \) with (1.28) is obtained in several steps. The first task (obtained in Sects. 8.1–8.5) is to conjugate \( \mathcal{L}_\omega \) to another Hamiltonian operator of \( H_S^\bot \) with constant coefficients
up to a small bounded remainder \( R_5 = O(\partial _x^0 ) \), see (8.56). This expansion of \( \mathcal{L}_\omega \) in “decreasing symbols” with constant coefficients follows [3], and it is somehow in the spirit of the works of Iooss et al. [16, 17] in water waves theory, and Baldi [2] for Benjamin–Ono. It is obtained by transformations which are very different from the usual KAM changes of variables. We underline that the specific form of these transformations depends on the structure of mKdV. For other quasi-linear PDEs the analogous reduction requires different transformations, see for example Alazard and Baldi [1], Berti and Montalto [12] for recent developments of these techniques for gravity-capillary water waves, and Feola and Procesi [14] for quasi-linear forced perturbations of Schrödinger equations.
The transformation of (1.27) into (1.29) is made in several steps.
1. Reduction of the highest order. The first step (Sect. 8.1) is to eliminate the x-dependence from the coefficient \( a_1 (\omega t, x ) \partial _{xxx} \) of the Hamiltonian operator \( \mathcal{L}_\omega \). For this purpose, we have to construct a symplectic diffeomorphism of \( H_S^\bot \) near \( \mathcal{A}_\bot := \Pi _S^\bot \mathcal {A}\Pi _S^\bot \), where \(\mathcal {A}\) is a diffeomorphism of the form
$$\begin{aligned} u \mapsto (\mathcal{A} u)(\varphi ,x) := (1 + \beta _x(\varphi ,x)) u(\varphi ,x + \beta (\varphi ,x)) , \end{aligned}$$
see (8.1). The starting point is to observe that \(\mathcal {A}\) is, for each \( \varphi \in \mathbb T^\nu \), the time-one flow map of the time dependent Hamiltonian transport linear PDE
$$\begin{aligned} \partial _\tau u = \partial _x (b(\varphi , \tau , x) u) , \quad b (\varphi , \tau , x) := \frac{\beta (\varphi , x)}{1 + \tau \beta _x(\varphi , x)}. \end{aligned}$$ (1.30)
Actually the flow of (1.30) is the path of symplectic diffeomorphisms
$$\begin{aligned} u (\varphi , x) \mapsto (1+ \tau \beta _x (\varphi , x) ) u (\varphi , x+ \tau \beta (\varphi , x) ), \quad \tau \in [0,1] . \end{aligned}$$
Thus, like in [5], we conjugate \( \mathcal{L}_\omega \) with the symplectic time-one flow map of the projected Hamiltonian equation
$$\begin{aligned} \partial _\tau u = \Pi _S^\bot \partial _x (b(\tau , x) u) = \partial _x (b(\tau , x) u) - \Pi _S \partial _x (b(\tau , x) u) , \quad u \in H_S^\bot \end{aligned}$$ (1.31)
generated by the quadratic Hamiltonian \( \frac{1}{2} \int _{\mathbb T} b(\tau , x) u^2 dx \) restricted to \( H_S^\bot \). By Lemma 15 (which was proved in [5]) such a symplectic map differs from \( \mathcal{A}_\bot \) only for finite dimensional operators. This step may be seen as a quantitative application of the Egorov theorem, see [29], which describes how the principal symbol of a pseudo-differential operator [here \( a_1 (\omega t, x) \partial _{xxx} \)] transforms under the flow of a linear hyperbolic PDE (here (1.31)). Because of the Hamiltonian structure, the previous step also eliminates the term \( O( \partial _{xx} )\), see (8.13). In Sect. 8.2 we eliminate the time-dependence of the coefficient at the order \( \partial _{xxx} \).
2. Linear Birkhoff normal form. In Sect. 8.4 we eliminate the variable coefficient terms at the order \( O(\varepsilon ^2 )\), which are present in the operator \( \mathcal{L}_\omega \), see (7.23) and (7.24). This is a consequence of the fact that the weak BNF procedure of Sect. 3 did not touch the quadratic terms \( O(z^2 ) \). These terms cannot be reduced to constants by the perturbative scheme in Sect. 8.6 (developed in [3]) which applies to terms R such that \( R \gamma ^{ -1} \ll 1 \) where \( \gamma \) is the diophantine constant of the frequency vector \( \omega \) (the case in [3] is simpler because the diophantine constant is \( \gamma = O(1) \)). Here, as well as in [5], since mKdV is completely resonant, we have \( \gamma = o(\varepsilon ^2 ) \), see (5.3). The terms of size \(\varepsilon ^2\) are reduced to constant coefficients in Sect. 8.4 by means of purely algebraic arguments (linear BNF), which, ultimately, stem from the complete integrability of the fourth order BNF of the mKdV Eq. (1.9). More general nonlinearities should be dealt with by the normal form arguments of Procesi and Procesi [28] for generic choices of the tangential sites.
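The statement in step 1 that the path \( u \mapsto (1 + \tau \beta _x) u(x + \tau \beta ) \) is the flow of (1.30) can be verified by a direct computation. The following is a sketch in which \( u_0 \) denotes a hypothetical smooth initial datum, \( y := x + \tau \beta (\varphi , x) \), and the dependence on \( \varphi \) is omitted; these notations are introduced only for this check.
$$\begin{aligned} w(\tau , x)&:= (1 + \tau \beta _x(x)) \, u_0(y) , \qquad b \, w = \beta (x) \, u_0(y) , \\ \partial _\tau w&= \beta _x \, u_0(y) + (1 + \tau \beta _x) \, \beta \, u_0'(y) , \\ \partial _x (b \, w)&= \beta _x \, u_0(y) + \beta \, (1 + \tau \beta _x) \, u_0'(y) . \end{aligned}$$
Hence \( \partial _\tau w = \partial _x (b w) \) with \( w(0, \cdot ) = u_0 \), and at \( \tau = 1 \) one recovers \( \mathcal{A} u_0 \). The cancellation of the factor \( 1 + \tau \beta _x \) in the product \( b \, w \) is what makes the choice of b in (1.30) natural.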
Complete diagonalization of (1.29) In Sect. 8.6 we apply the abstract KAM reducibility Theorem 4.2 of [3], which completely diagonalizes the linearized operator, obtaining (1.28). The required smallness condition (8.58) for \( R_5 \) holds once the linear BNF of Sect. 8.4 has reduced to constant coefficients the unbounded terms of nonperturbative size \(\varepsilon ^2\), and the conjugation procedure of Sects. 8.1-8.3 and 8.5 has produced a bounded and small remainder \(R_5\).
The Nash–Moser iteration to an invariant torus embedding In Sect. 9 we perform the nonlinear Nash–Moser iteration which finally proves Theorem 2 and, therefore, Theorem 1. The smallness condition that is required for the convergence of the scheme is \( \varepsilon ^2 \Vert \mathcal{F}(\varphi , 0, 0 ) \Vert _{s_0+ \mu } \gamma ^{-2}\) sufficiently small, see (9.5). It is verified because \( \Vert X_P(\varphi , 0 , 0 ) \Vert _s \le _s \varepsilon ^{5 - 2b} \) (Lemma 5) and \( \gamma = \varepsilon ^{2+a}\) with \( a > 0 \) small. See also Remark 11 for a comparison between the smallness condition required here and the one in [5].
Notation We shall use the notation
We denote by \( \pi _0 \) the operator
2 Functional setting
For a function \(u :\Omega _o \rightarrow E\), \(\omega \mapsto u(\omega )\), where \((E, \Vert \ \Vert _E)\) is a Banach space and \( \Omega _o \) is a subset of \(\mathbb R^\nu \), we define the sup-norm and the Lipschitz semi-norm
and, for \( \gamma > 0 \), the Lipschitz norm
If \( E = H^s \) we simply denote \( \Vert u \Vert ^{{\mathrm {Lip}(\gamma )}}_{H^s} := \Vert u \Vert ^{{\mathrm {Lip}(\gamma )}}_s \).
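For orientation, the three norms just introduced are usually written as follows. This is a sketch of the standard definitions, assumed here because it matches the matrix analogue \(| A |^{{\mathrm {Lip}(\gamma )}}_s := | A |^{\sup }_s + \gamma | A |^{\mathrm {lip}}_s\) stated after Definition 1; the labels (2.1) and (2.2) are inferred from later references in the text.

$$\begin{aligned} \Vert u \Vert ^{\sup }_E&:= \sup _{\omega \in \Omega _o} \Vert u(\omega ) \Vert _E , \qquad \Vert u \Vert ^{\mathrm {lip}}_E := \sup _{\omega _1 \ne \omega _2} \frac{ \Vert u(\omega _1) - u(\omega _2) \Vert _E }{ |\omega _1 - \omega _2| } , \\ \Vert u \Vert ^{{\mathrm {Lip}(\gamma )}}_E&:= \Vert u \Vert ^{\sup }_E + \gamma \Vert u \Vert ^{\mathrm {lip}}_E . \end{aligned}$$

The weight \( \gamma \) in front of the Lipschitz semi-norm makes a loss of one power of \( \gamma \) per derivative in \( \omega \) cost the same as a loss of \( \gamma ^{-1} \) in the sup-norm.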
Sobolev norms We denote by
the Sobolev norm of functions \( u = u(\varphi ,x) \) in the Sobolev space \( H^{s} (\mathbb T^{\nu + 1} ) \). We denote by \( \Vert \ \Vert _{H^s_x} \) the Sobolev norm in the phase space of functions \( u := u(x) \in H^{s} (\mathbb T) \). Moreover \( \Vert \ \Vert _{H^s_\varphi } \) denotes the Sobolev norm of scalar functions, like the Fourier components \( u_j (\varphi ) \).
We fix \( s_0 := (\nu +2) \slash 2 \) so that \( H^{s_0} (\mathbb T^{\nu + 1} ) \hookrightarrow L^{\infty } (\mathbb T^{\nu + 1} ) \) and any space \( H^s (\mathbb T^{\nu + 1} ) \), \( s \ge s_0 \), is an algebra and satisfies the interpolation inequalities: for \(s \ge s_0\),
The above inequalities also hold for the norms \(\Vert \ \Vert _s^{\mathrm{Lip}(\gamma )}\).
We also denote
Matrices with off-diagonal decay A linear operator can be identified, as usual, with its matrix representation. We recall the definition of the s-decay norm (introduced in [9]) of an infinite dimensional matrix.
Definition 1
Let \( A := (A_{i_1}^{i_2} )_{i_1, i_2 \in \mathbb Z^b } \), \(b \ge 1\), be an infinite dimensional matrix. Its s-decay norm \(|A|_s\) is defined by
For parameter dependent matrices \( A := A(\omega ) \), \(\omega \in \Omega _o \subseteq \mathbb R^\nu \), the definitions (2.1) and (2.2) become
and \(| A |^{{\mathrm {Lip}(\gamma )}}_s := | A |^{\sup }_s + \gamma | A |^{\mathrm {lip}}_s\).
Such a norm is modeled on the behavior of matrices representing the multiplication operator by a function. Actually, given a function \( p \in H^s(\mathbb T^b) \), the multiplication operator \( h \mapsto p h \) is represented by the Töplitz matrix \( T_i^{i'} = p_{i - i'} \) and \( |T|_s = \Vert p \Vert _s \). If \(p = p(\omega )\) is a Lipschitz family of functions, then
The s-norm satisfies classical algebra and interpolation inequalities proved in [3].
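The identity \( |T|_s = \Vert p \Vert _s \) for multiplication operators can be checked numerically on a finite block. The sketch below assumes the usual form of the s-decay norm (the sup of the entries along each diagonal \( i_1 - i_2 = d \), followed by a weighted \( \ell ^2 \) sum with weight \( \langle d \rangle := \max (1, |d|) \)) and takes \( b = 1 \); the function names and the sample coefficients are illustrative only.

```python
import math

def bracket(k):
    # Weight <k> := max(1, |k|) (assumed convention)
    return max(1, abs(k))

def sobolev_norm(p, s):
    # p maps a Fourier index k to the coefficient p_k
    return math.sqrt(sum(abs(c) ** 2 * bracket(k) ** (2 * s) for k, c in p.items()))

def toeplitz_block(p, n):
    # Finite block T_i^{i'} = p_{i - i'} for i, i' in [-n, n]
    idx = range(-n, n + 1)
    return {(i, j): p.get(i - j, 0.0) for i in idx for j in idx}

def decay_norm(a, s):
    # Sup of |entries| along each diagonal i - i' = d, then weighted l^2 sum over d
    diag_sup = {}
    for (i, j), v in a.items():
        d = i - j
        diag_sup[d] = max(diag_sup.get(d, 0.0), abs(v))
    return math.sqrt(sum(m ** 2 * bracket(d) ** (2 * s) for d, m in diag_sup.items()))

# Hypothetical trigonometric polynomial with p_{-k} = conj(p_k) (real-valued p)
p = {-2: 0.5 - 0.25j, -1: 1j, 0: 2.0, 1: -1j, 2: 0.5 + 0.25j}
s = 3
T = toeplitz_block(p, n=4)
print(decay_norm(T, s), sobolev_norm(p, s))
```

Because a Töplitz matrix is constant along each diagonal, the sup over a diagonal is just \( |p_d| \), which is why the two printed numbers agree as soon as the block is large enough to contain every nonzero diagonal.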
Lemma 1
Let \(A = A(\omega ), B = B(\omega )\) be matrices depending in a Lipschitz way on the parameter \(\omega \in \Omega _o \subset \mathbb R^\nu \). Then for all \(s \ge s_0 > b/2 \) there are \( C(s) \ge C(s_0) \ge 1 \) such that
The s-decay norm controls the Sobolev norm, namely
Let now \( b := \nu + 1 \). An important sub-algebra is formed by the Töplitz in time matrices defined by
whose decay norm (2.4) is
These matrices are identified with the \( \varphi \)-dependent family of operators
which act on functions of the x-variable as
Transformations of this kind were also used in [3, 9, 11]. All the transformations that we construct in this paper are of this type [with \( j, j_1, j_2 \ne 0 \) because they act on the phase space \( H^1_0 (\mathbb T_x) \)].
Definition 2
We say that
1. an operator \((A h)(\varphi , x) := A(\varphi ) h(\varphi , x)\) is symplectic if each \( A (\varphi ) \), \( \varphi \in \mathbb T^\nu \), is a symplectic map of the phase space (or of a symplectic subspace like \( H_S^\bot \));
2. an operator is real if it maps real-valued functions into real-valued functions;
3. the real operator \(\omega \cdot \partial _{\varphi } - \partial _x G( \varphi )\) is Hamiltonian if each \( G (\varphi ) \), \( \varphi \in \mathbb T^\nu \), is self-adjoint with respect to the \(L^2(\mathbb T)\) complex scalar product.
A Hamiltonian operator is transformed, under a symplectic map, into another Hamiltonian operator, see [3, section 2.3].
We conclude this preliminary section recalling the following well known lemmata about composition of functions (see, e.g., Appendix of [3]).
Lemma 2
(Composition) Assume \( f \in C^s (\mathbb T^d \times B_1)\), \(B_1 := \{ y \in \mathbb R^m :|y| \le 1 \}\). Then \( \forall u \in H^{s}(\mathbb T^d, \mathbb R^m) \) such that \( \Vert u \Vert _{L^\infty } < 1 \), the composition operator \(\tilde{f}(u)(x) := f(x, u(x))\) satisfies \( \Vert \tilde{f}(u) \Vert _s \le C \Vert f \Vert _{C^s} (\Vert u\Vert _{s} + 1) \) where the constant C depends on s, d. If \( f \in C^{s+2} \) and \( \Vert u + h \Vert _{L^\infty } < 1\), then for \(k=0,1\)
The statement also holds replacing \(\Vert \ \Vert _s\) with the norms \(| \ |_{s, \infty }\) of \(W^{s,\infty }(\mathbb T^d)\).
Lemma 3
(Change of variable) Let \(p \in W^{s,\infty } (\mathbb T^d,\mathbb R^d) \), \( s \ge 1\), with \( \Vert p \Vert _{W^{1, \infty }}\) \( \le 1/2 \). Then the function \(f(x) = x + p(x)\) is invertible, with inverse \( f^{-1}(y) = y + q(y)\) where \(q \in W^{s,\infty }(\mathbb T^d,\mathbb R^d)\), and \( \Vert q \Vert _{W^{s, \infty }} \le C \Vert p \Vert _{ W^{s, \infty }} \).
If, moreover, p depends in a Lipschitz way on a parameter \(\omega \in \Omega \subset \mathbb R^\nu \), and \(\Vert D_x p \Vert _ {L^\infty } \le 1/2 \) for all \(\omega \), then \( \Vert q \Vert _{W^{s, \infty }}^{\mathrm{Lip}(\gamma )} \le C \Vert p \Vert _{W^{s+1, \infty }}^{\mathrm{Lip}(\gamma )} \). The constant \(C := C (d, s) \) is independent of \(\gamma \).
If \(u \in H^s (\mathbb T^d,\mathbb C)\), then \( (u\circ f)(x) := u(x+p(x))\) satisfies
The function \(u \circ f^{-1} \) satisfies the same bounds.
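The inverse in Lemma 3 can be made concrete in one dimension: writing \( f^{-1}(y) = y + q(y) \), the function q solves the fixed-point equation \( q(y) = - p(y + q(y)) \), which is a contraction when \( \Vert p \Vert _{W^{1,\infty }} \le 1/2 \). The following is a minimal sketch under that assumption; the particular choice \( p(x) = 0.3 \sin x \) is hypothetical and only serves as a test case.

```python
import math

def p(x):
    # Hypothetical small periodic perturbation with sup|p| = sup|p'| = 0.3 <= 1/2
    return 0.3 * math.sin(x)

def q(y, iterations=60):
    # Solve q = -p(y + q) by fixed-point iteration (contraction factor <= 0.3)
    val = 0.0
    for _ in range(iterations):
        val = -p(y + val)
    return val

def f(x):
    return x + p(x)

def f_inv(y):
    return y + q(y)

ys = [2 * math.pi * k / 50 for k in range(50)]
max_err = max(abs(f(f_inv(y)) - y) for y in ys)   # residual of f(f^{-1}(y)) = y
sup_q = max(abs(q(y)) for y in ys)                # sup-norm of q on the grid
print(max_err, sup_q)
```

On this example the sup-norm of q does not exceed that of p, in line with the bound \( \Vert q \Vert _{W^{s, \infty }} \le C \Vert p \Vert _{ W^{s, \infty }} \) of the lemma for the lowest norm.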
3 Weak Birkhoff normal form
In this section it is convenient to analyze the mKdV equation in the Fourier representation
where the Fourier indices are nonzero integers j, by the definition (1.5) of the phase space, and \(u_{-j} = \overline{u}_j\) because u(x) is real-valued. The symplectic structure (1.6) writes
the Hamiltonian vector field \(X_H\) in (1.3) and the Poisson bracket \(\{ F, G \}\) in (1.7) are respectively
We shall sometimes identify \( v \equiv (v_j)_{j \in S } \) and \( z \equiv (z_j)_{j \in S^c } \).
The Hamiltonian of the perturbed cubic mKdV Eq. (1.1) is \( H = H_2 + H_4 + H_{\ge 5} \) (see (1.4)) where
\(\varsigma = \pm 1\) and f satisfies (1.8). According to the splitting (1.26) \( u = v + z \), where \( v \in H_S \) and \( z \in H_S^\bot \), we have \(H_2(u) = H_2(v) + H_2(z)\) and
For a finite-dimensional space
let \(\Pi _E \) denote the corresponding \( L^2 \)-projector on E.
In the next proposition we construct a symplectic map \( \Phi _B \) such that the transformed Hamiltonian \(\mathcal {H}:= H \circ \Phi _B\) possesses the invariant subspace \( H_S \) defined in (1.20), and its dynamics on \( H_S \) is integrable and non-isochronous. To this purpose we have to eliminate the term \( \int v^3 z \, dx \) (which is linear in z) and to normalize the term \( \int v^4 \, dx \) (which is independent of z) in the quartic component of the Hamiltonian.
Proposition 1
(Weak Birkhoff normal form) There exists an analytic invertible symplectic transformation of the phase space \( \Phi _B : H^1_0 (\mathbb T_x) \rightarrow H^1_0 (\mathbb T_x) \) of the form
where E is a finite-dimensional space as in (3.5), such that the transformed Hamiltonian is
where \(H_2\) is defined in (3.4),
and \(\mathcal{H}_{\ge 5}\) collects all the terms of order at least five in (v, z).
Proof
In Fourier coordinates (3.1) we have (see (3.4))
We look for a symplectic transformation \(\Phi \) of the phase space which eliminates or normalizes the monomials \( u_{j_1} u_{j_2} u_{j_3} u_{j_4} \) of \( H_4 \) with at most one index outside S. By the relation \( j_1 + j_2 + j_3 + j_4 = 0 \), they are finitely many. Thus, we look for a map \(\Phi := (\Phi _{F}^t)_{|t=1}\) which is the time 1-flow map of an auxiliary quartic Hamiltonian
The transformed Hamiltonian is
where \( \mathcal {H}_{\ge 5} \) collects all the terms in \(\mathcal {H}\) of order at least five. By (3.3) and (3.9) we calculate
In order to eliminate or normalize only the monomials with at most one index outside S, we choose
where
We recall the following elementary identity (Lemma 13.4 in [18]). \(\square \)
Lemma 4
Let \(j_1, j_2, j_3, j_4 \in \mathbb Z\) such that \( j_1 + j_2 + j_3 + j_4 = 0 \). Then
By definition (3.11), \(\mathcal {H}_4\) does not contain any monomial \(u_{j_1} u_{j_2} u_{j_3} u_{j_4}\) with three indices in S and one outside, because there exist no integers \( j_1, j_2 , j_3 \in S\), \( j_4 \in S^c \) satisfying \( j_1 + j_2 + j_3 + j_4 = 0 \) and \( j_1^3 + j_2^3 + j_3^3 + j_4^3 = 0 \), by Lemma 4 and the fact that S is symmetric.
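The arithmetic behind this step can be checked by brute force. The sketch below assumes that the identity of Lemma 4 is the classical factorization \( j_1^3 + j_2^3 + j_3^3 + j_4^3 = -3 (j_1 + j_2)(j_1 + j_3)(j_2 + j_3) \) for \( j_1 + j_2 + j_3 + j_4 = 0 \), and uses a hypothetical symmetric set of tangential sites; it then searches for resonant quadruples with exactly one index outside S.

```python
from itertools import product

def cubic_sum(j1, j2, j3, j4):
    return j1 ** 3 + j2 ** 3 + j3 ** 3 + j4 ** 3

def factored(j1, j2, j3):
    # Assumed right-hand side of Lemma 4, valid when j4 = -(j1 + j2 + j3)
    return -3 * (j1 + j2) * (j1 + j3) * (j2 + j3)

# 1) Check the factorization on a box of integers
box = range(-6, 7)
identity_holds = all(
    cubic_sum(a, b, c, -(a + b + c)) == factored(a, b, c)
    for a, b, c in product(box, repeat=3)
)

# 2) Hypothetical symmetric set of tangential sites S = S^+ U (-S^+)
S_plus = [1, 3, 4]
S = set(S_plus) | {-j for j in S_plus}

# Quadruples with j1, j2, j3 in S, j4 nonzero and outside S, zero sum, zero cubic sum
resonant = []
for a, b, c in product(sorted(S), repeat=3):
    d = -(a + b + c)
    if d != 0 and d not in S and cubic_sum(a, b, c, d) == 0:
        resonant.append((a, b, c, d))

print(identity_holds, resonant)
```

The search returns no quadruple, as expected: if the cubic sum vanishes, one of the pair sums is zero, say \( j_1 + j_2 = 0 \), hence \( j_4 = - j_3 \), which belongs to S by symmetry.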
By construction, the quartic monomials with at least two indices outside S are not changed by \(\Phi \). Also, by construction, the monomials \(u_{j_1} u_{j_2} u_{j_3} u_{j_4}\) in \(\mathcal {H}_4\) with all integers in S are those for which \(j_1 + j_2 + j_3 + j_4 = 0\) and \(j_1^3 + j_2^3 + j_3^3 + j_4^3 = 0\). By Lemma 4, we split
where \(A_1\) is given by the sum over \(j_1, j_2, j_3, j_4 \in S\), \(j_1 + j_2 + j_3 + j_4 = 0\) with the restriction \(j_1 + j_2 = 0\), \(A_2\) with the restriction \(j_1 + j_2 \ne 0\) and \(j_1 + j_3 = 0\), and \(A_3\) with the restriction \(j_1 + j_2 \ne 0\), \(j_1 + j_3 \ne 0\) and \(j_2 + j_3 = 0\). We get
whence (3.8) follows.
Remark 1
In the Birkhoff normal form for the Hamiltonian \( K = H + \lambda M^2 \) defined in (1.18), three additional terms appear in (3.8), which are
Then in (3.8) the sum \((\lambda - \frac{3\varsigma }{4}) \sum _{j, j' \in S} |u_j|^2 |u_{j'}|^2\) vanishes if we choose \(\lambda := 3 \varsigma /4\).
4 Action-angle variables
We introduce action-angle variables on the tangential directions by the change of coordinates
where (recall that \( u_{-j} = {\overline{u}}_j \))
To simplify notation, for the tangential sites \( S^+ := \{ {\bar{\jmath }_1}, \ldots , {\bar{\jmath }_\nu } \} \) we also denote \(\tilde{\theta }_{\bar{\jmath }_i} := \tilde{\theta }_i \), \( \tilde{y}_{\bar{\jmath }_i} := \tilde{y}_i \), \( \tilde{\xi }_{\bar{\jmath }_i} := \tilde{\xi }_i \), \( i = 1, \ldots , \nu \).
The symplectic 2-form \( \Omega \) in (3.2) (i.e. (1.6)) becomes
where \( \Omega _{S^\bot } \) denotes the restriction of \( \Omega \) to \( H_S^\bot \) (see (1.20)) and \( \Lambda \) is the Liouville 1-form on \( \mathbb T^\nu \times \mathbb R^\nu \times H_S^\bot \) defined by \( \Lambda _{(\tilde{\theta }, \tilde{y}, \tilde{z})} : \mathbb R^\nu \times \mathbb R^\nu \times H_S^\bot \rightarrow \mathbb R\),
We rescale the “unperturbed actions” \( \tilde{\xi }\) and the variables \(\tilde{\theta }, \tilde{y}, \tilde{z}\) as
where \(b>1\) will be fixed below (see (5.9) and Remark 3). The symplectic 2-form in (4.3) transforms into \( \varepsilon ^{2b} \mathcal{W } \). Hence the Hamiltonian system generated by \( \mathcal{H} \) in (3.7) transforms into the new Hamiltonian system
where
We still denote by
the Hamiltonian vector field in the variables \( (\theta , y, z ) \in \mathbb T^\nu \times \mathbb R^\nu \times H_S^\bot \).
We now write explicitly the Hamiltonian \( H_{\varepsilon } (\theta , y, z) \) defined in (4.6). Recall the expression of \( \mathcal{H } \) given in (3.7). The quadratic Hamiltonian \( H_2 \) in (3.4) transforms into
and, by (3.7) and (3.8) we get [writing, in short, \( v_\varepsilon := v_\varepsilon (\theta , y) \)]
where \(e(\xi )\) is a constant, and \(\alpha (\xi ) \in \mathbb R^\nu \) is the vector of components
This is the “frequency-to-amplitude” map which describes, at the main order, how the tangential frequencies are shifted by the amplitudes \( \xi := ( \xi _1, \ldots , \xi _\nu ) \). It can be written in compact form as
where \( \bar{\omega }:= (\bar{\jmath }_1^3, \ldots , \bar{\jmath }_\nu ^3) \in \mathbb N^\nu \) (see (1.19)) is the vector of the unperturbed linear frequencies of oscillations on the tangential sites, \( D_S \) is the diagonal matrix
I is the \(\nu \times \nu \) identity matrix, and U is the \(\nu \times \nu \) matrix with all entries equal to 1. The matrix \(\mathbb A\) is often called the “twist” matrix. It turns out to be invertible. Indeed, since \(U^2 = \nu U\), one has \((I - 2 U)( I - \frac{2}{2\nu -1}\, U ) = I\), and therefore
With this notation, one can also write
Remark 2
By Remark 1, for the Hamiltonian \( K = H + \lambda M^2 \), \(\lambda := 3 \varsigma /4\), defined in (1.18) the twist matrix in the frequency-amplitude relation (4.10) becomes \(\mathbb A = 3 \varsigma D_S\), which is diagonal.
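The algebraic fact used to invert the twist matrix, namely \(U^2 = \nu U\) and \((I - 2 U)( I - \frac{2}{2\nu -1}\, U ) = I\), can be verified exactly for small \( \nu \). The sketch below uses rational arithmetic to avoid rounding; the helper names are illustrative, and the diagonal factor \( D_S \) is left out since it does not affect this identity.

```python
from fractions import Fraction

def identity(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]

def ones(n):
    # The matrix U with all entries equal to 1
    return [[Fraction(1) for _ in range(n)] for _ in range(n)]

def lin_comb(alpha, a, beta, b):
    # Entrywise alpha * a + beta * b
    return [[alpha * x + beta * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def check_twist_identity(nu):
    I, U = identity(nu), ones(nu)
    u_squared_ok = matmul(U, U) == lin_comb(0, I, nu, U)         # U^2 = nu U
    left = lin_comb(1, I, -2, U)                                 # I - 2U
    right = lin_comb(1, I, Fraction(-2, 2 * nu - 1), U)          # I - 2/(2 nu - 1) U
    return u_squared_ok and matmul(left, right) == I

results = {nu: check_twist_identity(nu) for nu in range(1, 8)}
print(results)
```

The denominator \( 2 \nu - 1 \) never vanishes for integer \( \nu \ge 1 \), which is why the inverse exists for every number of tangential sites.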
We write the Hamiltonian in (4.9) [eliminating the constant \(e(\xi )\) which is irrelevant for the dynamics] as \(H_{\varepsilon } = \mathcal{N} + P\), where
describes the linear dynamics, and \( P := H_{\varepsilon } - \mathcal{N} \), namely
collects the nonlinear perturbative effects.
5 The nonlinear functional setting
We look for an embedded invariant torus
of the Hamiltonian vector field \( X_{H_\varepsilon } \) filled by quasi-periodic solutions with diophantine frequency \( \omega \in \mathbb R^\nu \), that we regard as independent parameters. We require that \( \omega \) belongs to the set
where \( \alpha \) is the affine diffeomorphism (4.10). Since any \( \omega \in \Omega _\varepsilon \) is \( \varepsilon ^2 \)-close to the integer vector \( \bar{\omega }\in \mathbb N^\nu \) (see (1.19), (4.10)), we require that the constant \(\gamma \) in the diophantine inequality
In (5.9) we will fix \(a \in (0,1/6)\) (see also the discussion in Remark 3). Note that the definition of \(\gamma \) in (5.3) is slightly stronger than the minimal condition, which is \( \gamma \le c \varepsilon ^2 \) with c small enough. We assume \( a > 0 \) just for simplicity. In addition to (5.3) we shall also require that \( \omega \) satisfies the first and second order Melnikov-non-resonance conditions (8.63).
We fix the amplitude \(\xi \) as a function of \(\omega \) and \( \varepsilon \), as
so that \(\alpha (\xi ) = \omega \) (see (4.10)).
Now we look for an embedded invariant torus of the modified Hamiltonian vector field \( X_{H_{\varepsilon , \zeta }} = X_{H_\varepsilon } + (0, \zeta , 0) \), \( \zeta \in \mathbb R^\nu \), which is generated by the Hamiltonian
Note that the vector field \( X_{H_{\varepsilon , \zeta }} \) is periodic in \(\theta \) (unlike the Hamiltonian \( H_{\varepsilon , \zeta } \)). We introduce \(\zeta \) in order to adjust the average in the second equation of the linearized system (6.22), see (6.23). The vector \( \zeta \) has however no dynamical consequences. Indeed it turns out that an invariant torus for the Hamiltonian vector field \( X_{H_{\varepsilon , \zeta }} \) is actually invariant for \( X_{H_\varepsilon } \) itself, see Lemma 6. Hence we look for zeros of the nonlinear operator
where \( \Theta (\varphi ) := \theta (\varphi ) - \varphi \) is \( (2 \pi )^\nu \)-periodic and we use (here and everywhere in the paper) the short notation
The Sobolev norm of the periodic component of the embedded torus
is \(\Vert {\mathfrak I} \Vert _s := \Vert \Theta \Vert _{H^s_\varphi } + \Vert y \Vert _{H^s_\varphi } + \Vert z \Vert _s \) where \( \Vert z \Vert _s := \Vert z \Vert _{H^s_{\varphi ,x}} \) is defined in (2.3). We link the rescaling (4.5) with the diophantine constant \( \gamma = \varepsilon ^{2+a} \) by choosing
Other choices are possible, see Remark 3.
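A short exponent count may help to keep track of the powers of \( \varepsilon \). It assumes that the choice referred to here is the intermediate case \( \varepsilon ^{2b} = \gamma \) described in Remark 3, together with \( \gamma = \varepsilon ^{2+a} \); the last line only combines the smallness quantity \( \varepsilon ^2 \Vert \mathcal{F}(\varphi , 0, 0 ) \Vert _{s_0+ \mu } \gamma ^{-2}\) with the bound \( \Vert X_P(\varphi , 0 , 0 ) \Vert _s \le _s \varepsilon ^{5 - 2b} \) quoted in the Introduction, and is not a substitute for the estimates of Sect. 9.

$$\begin{aligned} \varepsilon ^{2b} = \gamma = \varepsilon ^{2+a} \quad&\Longleftrightarrow \quad b = 1 + \frac{a}{2} , \\ \varepsilon ^{5-2b} \gamma ^{-1}&= \varepsilon ^{5-4b} , \\ \varepsilon ^2 \cdot \varepsilon ^{5-2b} \cdot \gamma ^{-2}&= \varepsilon ^{7-6b} = \varepsilon ^{1-3a} . \end{aligned}$$

In particular the last quantity tends to zero as \( \varepsilon \rightarrow 0 \) whenever \( a < 1/3 \), which is compatible with the range \( a \in (0, 1/6) \) fixed in the text.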
Theorem 2
Let the tangential sites S in (1.11) satisfy (1.12). For all \( \varepsilon \in (0, \varepsilon _0 ) \), where \( \varepsilon _0 \) is small enough, there exist a constant \(C>0\) and a Cantor-like set \( \mathcal{C}_\varepsilon \subset \Omega _\varepsilon \), with asymptotically full measure as \( \varepsilon \rightarrow 0 \), namely
such that, for all \( \omega \in \mathcal{C}_\varepsilon \), there exists a solution \( i_\infty (\varphi ) := i_\infty (\omega , \varepsilon )(\varphi ) \) of the equation \(\mathcal {F}(i_\infty , 0, \omega , \varepsilon ) = 0\) (the nonlinear operator \(\mathcal {F}(i,\zeta ,\omega ,\varepsilon )\) is defined in (5.6)). Hence the embedded torus \( \varphi \mapsto i_\infty (\varphi ) \) is invariant for the Hamiltonian vector field \( X_{H_\varepsilon } \), and it is filled by quasi-periodic solutions with frequency \( \omega \). The torus \(i_\infty \) satisfies
for some \( \mu := \mu (\nu ) > 0 \). Moreover, the torus \( i_\infty \) is linearly stable.
Theorem 2 is proved in Sects. 6–9. It implies Theorem 1 where the \( \xi _j \) in (1.13) are the components of the vector \(\mathbb {A}^{-1}[\omega - \bar{\omega }]\). By (5.11), going back to the variables before the rescaling (4.5), we get \( \tilde{\Theta }_\infty = O( \varepsilon ^{5-4b}) \), \( \tilde{y}_\infty = O( \varepsilon ^{5-2b} ) \), \( \tilde{z}_\infty = O( \varepsilon ^{5-3b} ) \).
Remark 3
The way to link the amplitude-rescaling (4.5) with the diophantine constant \( \gamma = \varepsilon ^{2+a} \) in (5.3) is not unique.
The choice \( \varepsilon ^{2b} < \gamma \) (i.e. “\( b > 1 \) large”) reduces the problem to studying the Hamiltonian \( H_\varepsilon \) in (4.9) as a perturbation of an isochronous system (as in [21, 23, 27]). We can take \( b = 4 / 3 \) in order to minimize the size of the perturbation \( P = O( \varepsilon ^{7/3}) \), estimating uniformly all the terms in the last two lines of (4.9). As a counterpart, in (4.9) we have to regard the constants \( \alpha := \alpha (\xi ) \in \mathbb R^\nu \) (or \( \xi \) in (4.7)) as independent variables. This is the perspective described for example in [10]. Then the Nash–Moser scheme produces iteratively a sequence of \( \xi _n = \xi _n (\omega ) \) and embeddings \( \varphi \mapsto i_n (\varphi ) := (\theta _n (\varphi ), y_n (\varphi ), z_n (\varphi ) )\) at the same time.
The case \( \varepsilon ^{2b} > \gamma \) (i.e. “\( b \ge 1 \) small”), in particular if \( b = 1 \), reduces the problem to studying the Hamiltonian \( H_\varepsilon \) in (4.9) as a perturbation of a non-isochronous system à la Arnold–Kolmogorov (note that the quadratic Hamiltonian in (4.12) satisfies the usual Kolmogorov non-degeneracy condition). In this case, the constant \( \xi _j \) in (4.7) and the average of \( |j| y_j (\varphi ) \) have the same size and therefore the same role. Then we may consider \( \xi _j \) as fixed, and tune the average of the action component \( y_j (\varphi ) \) in order to solve the linear Eq. (6.28), which corresponds to the angle component. We use the invertible (averaged) “twist”-matrix (6.30) to impose that the right hand side in (6.28) has zero average.
The intermediate case \(\varepsilon ^{2b} = \gamma \), adopted in this paper (as well as in [5]), has the advantage of avoiding the introduction of \( \xi (\omega ) \) as an independent variable, and it also allows us to estimate uniformly the sizes of the components of \( (\Theta (\varphi ) , y (\varphi ) , z (\varphi ) ) \) with no distinctions.
Now we prove tame estimates for the composition operator induced by the Hamiltonian vector fields \( X_\mathcal{N} \) and \( X_P \) in (5.6), which are used in the next sections. Since the functions \( y \mapsto \sqrt{\xi + \varepsilon ^{2(b - 1)}|j| y} \), \(\theta \mapsto e^{{\mathrm i}\theta }\) are analytic for \(\varepsilon \) small enough, \(j \in S\) and \(|y| \le C\), the composition Lemma 2 implies that, for all \( \Theta , y \in H^s(\mathbb T^\nu , \mathbb R^\nu )\) with \(\Vert \Theta \Vert _{s_0}\), \(\Vert y \Vert _{s_0} \le 1\), setting \(\theta (\varphi ) := \varphi + \Theta (\varphi )\), one has the tame estimate
Hence the map \( A_\varepsilon \) in (4.7) satisfies, for all \( \Vert {\mathfrak I} \Vert _{s_0}^{\mathrm {Lip}(\gamma )}\le 1 \) (see (5.8))
In the following lemma we collect tame estimates for the Hamiltonian vector fields \( X_\mathcal{N} \), \( X_P \), \( X_{H_\varepsilon } \) (see (4.13), (4.14)) whose proof is a direct application of classical tame product and composition estimates.
Lemma 5
Let \( {\mathfrak {I}}(\varphi ) \) in (5.8) satisfy \( \Vert {\mathfrak I} \Vert _{s_0 + 3}^{\mathrm {Lip}(\gamma )}\le C \varepsilon ^{5-2b} \gamma ^{-1} = C \varepsilon ^{5-4b}\). Then, writing in short \(\Vert \ \Vert _s\) to indicate \(\Vert \ \Vert _s^{\mathrm {Lip}(\gamma )}\), one has
(\(\mathbb {A}, D_S\) are defined in (4.10)) and, for all \( {\widehat{\imath }} := ({\widehat{\Theta }}, {\widehat{y}}, {\widehat{z}}) \),
In the sequel we also use that, by the diophantine condition (5.3), the operator \( \mathcal{D}_\omega ^{-1} \) (see (5.7)) is defined for all functions u with zero \( \varphi \)-average, and satisfies
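In Fourier series the operator \( \mathcal{D}_\omega ^{-1} \) is a division by the small divisors \( {\mathrm i}\, \omega \cdot l \), which is where the factor \( \gamma ^{-1} \) and the loss of derivatives come from. The following is a minimal sketch, assuming \( \mathcal{D}_\omega = \omega \cdot \partial _\varphi \) and representing a function of \( \varphi \) by a dictionary of finitely many Fourier coefficients; the frequency vector and the data are hypothetical.

```python
import math

def dot(omega, l):
    return sum(w * k for w, k in zip(omega, l))

def d_omega(u, omega):
    # (omega . d_phi) u in Fourier: multiply the coefficient u_l by i * (omega . l)
    return {l: 1j * dot(omega, l) * c for l, c in u.items()}

def d_omega_inverse(g, omega):
    # Zero-average solution u of (omega . d_phi) u = g; requires g to have zero average
    u = {}
    for l, c in g.items():
        if all(k == 0 for k in l):
            if c != 0:
                raise ValueError("g must have zero average in phi")
            continue
        u[l] = c / (1j * dot(omega, l))
    return u

# Hypothetical non-resonant frequency vector and right-hand side
omega = (1.0, math.sqrt(2.0))
g = {(1, 0): 2.0, (-1, 0): 2.0, (1, -1): 1j, (-1, 1): -1j, (0, 0): 0.0}

u = d_omega_inverse(g, omega)
back = d_omega(u, omega)
round_trip_error = max(abs(back[l] - g[l]) for l in u)
print(round_trip_error)
```

A lower bound on \( |\omega \cdot l| \) of diophantine type controls each division uniformly, at the price of a power of \( \langle l \rangle \), that is, of finitely many derivatives in \( \varphi \).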
6 Approximate inverse
In order to implement a convergent Nash–Moser scheme that leads to a solution of \( \mathcal {F}(i, \zeta ) = 0 \), we now construct an approximate right inverse (which satisfies tame estimates) of the linearized operator
see Theorem 3. Note that \( d_{i, \zeta } \mathcal{F}(i_0, \zeta _0 ) \) is independent of \( \zeta _0 \) (see (5.6)).
The notion of approximate right inverse is introduced in [31]. It denotes a linear operator which is an exact right inverse at a solution \( (i_0, \zeta _0) \) of \( \mathcal{F}(i_0, \zeta _0) = 0 \). We implement the general strategy in [10] which reduces the search of an approximate right inverse of (6.1) to the search of an approximate inverse on the normal directions only.
It is well known that an invariant torus \( i_0 \) with diophantine flow is isotropic (see e.g. [10]), namely the pull-back 1-form \( i_0^* \Lambda \) is closed, where \( \Lambda \) is the Liouville 1-form in (4.4). This is tantamount to saying that the 2-form \( \mathcal W \) (see (4.3)) vanishes on the torus \( i_0 (\mathbb T^\nu )\), because \( i_0^* \mathcal{W} = i_0^* d \Lambda = d i_0^* \Lambda \). For an “approximately invariant” torus \( i_0 \) the 1-form \( i_0^* \Lambda \) is only “approximately closed”. In order to make this statement quantitative we consider
and we quantify how small is
Throughout this section we will always assume the following hypothesis (which will be verified at each step of the Nash–Moser iteration):
-
Assumption The map \(\omega \mapsto i_0(\omega )\) is a Lipschitz function defined on some subset \(\Omega _o \subset \Omega _\varepsilon \), where \(\Omega _\varepsilon \) is defined in (5.2), and, for some \({\mu } := {\mu } ({{\tau }}, {\nu }) > 0 \),
where \({\mathfrak {I}}_0(\varphi ) := i_0(\varphi ) - (\varphi ,0,0)\), and
is the “error” function.
Lemma 6
(Lemma 6.1 in [5]) \( |\zeta _0|^{{\mathrm {Lip}(\gamma )}} \le C \Vert Z \Vert _{s_0}^{{\mathrm {Lip}(\gamma )}}\). If \( \mathcal{F}(i_0, \zeta _0) = 0 \), then \( \zeta _0 = 0 \), and the torus \(i_0(\varphi )\) is invariant for \(X_{H_\varepsilon }\).
Now we estimate the size of \( i_0^* \mathcal{W} \) in terms of Z. From (6.2) and (6.3) one has \(\Vert A_{kj} \Vert _s^{\mathrm {Lip}(\gamma )}\le _s \Vert {\mathfrak {I}}_0 \Vert _{s+2}^{\mathrm {Lip}(\gamma )}\). Moreover, \(A_{kj}\) also satisfies the following bound.
Lemma 7
(Lemma 6.2 in [5]) The coefficients \(A_{kj} (\varphi ) \) in (6.3) satisfy
As in [10], we first modify the approximate torus \( i_0 \) to obtain an isotropic torus \( i_\delta \) which is still approximately invariant. We denote the Laplacian \( \Delta _\varphi := \sum _{k=1}^\nu \partial _{\varphi _k}^2 \).
Lemma 8
(Isotropic torus) The torus \( i_\delta (\varphi ) := (\theta _0(\varphi ), y_\delta (\varphi ), z_0(\varphi ) ) \) defined by
is isotropic. If (6.4) holds, then, for some \( \sigma := \sigma (\nu ,{\tau }) \),
In the paper we denote equivalently the differential by \( \partial _i \) or \( d_i \). Moreover we denote by \( \sigma := \sigma (\nu , {\tau }) \) possibly different (larger) “loss of derivatives” constants.
Proof
It is sufficient to closely follow the proof of Lemma 6.3 of [5]. We mention the only difference: equation (6.11) of [5] is \(\Vert \mathcal{F}(i_\delta , \zeta _0) \Vert _s^{{\mathrm {Lip}(\gamma )}} \le _s \Vert Z \Vert _{s + \sigma }^{{\mathrm {Lip}(\gamma )}} + \varepsilon ^{2b-1} \gamma ^{-1} \Vert {\mathfrak {I}}_0 \Vert _{s+\sigma }^{\mathrm {Lip}(\gamma )}\Vert Z \Vert _{s_0 + \sigma }^{{\mathrm {Lip}(\gamma )}}\), which contains an additional large factor \(\varepsilon ^{2b-1} \gamma ^{-1} = \varepsilon ^{-1}\) compared with the present bound (6.10). In (6.10) there is no such factor because, by the estimates for \(\partial _\theta \partial _y P, \partial _{yy}P, \partial _y \nabla _z P\) in Lemma 5, here we have \(\Vert \partial _y X_P (i) \Vert _s \le _s \varepsilon ^{2b} (1 + \Vert {\mathfrak I}\Vert _{s+3})\). Hence (6.4), (6.8) and (6.9) imply that
Then the proof goes on as in [5], without the large factor \(\varepsilon ^{2b-1} \gamma ^{-1}\). \(\square \)
In order to find an approximate inverse of the linearized operator \(d_{i, \zeta } \mathcal{F}(i_\delta )\) we introduce a suitable set of symplectic coordinates nearby the isotropic torus \( i_\delta \). We consider the map \( G_\delta : (\psi , \eta , w) \rightarrow (\theta , y, z)\) of the phase space \(\mathbb T^\nu \times \mathbb R^\nu \times H_S^\bot \) defined by
where \(\tilde{z}_0 (\theta ) := z_0 (\theta _0^{-1} (\theta ))\). It is proved in [10] that \( G_\delta \) is symplectic, using that the torus \( i_\delta \) is isotropic (Lemma 8). In the new coordinates, \( i_\delta \) is the trivial embedded torus \( (\psi , \eta , w ) = (\psi , 0, 0 ) \). The transformed Hamiltonian \( K := K(\psi , \eta , w, \zeta _0) \) is (recall (5.5))
where \( K_{\ge 3} \) collects the terms at least cubic in the variables \( (\eta , w )\). At any fixed \(\psi \), the Taylor coefficients satisfy \(K_{00}(\psi ) \in \mathbb R\), \(K_{10}(\psi ) \in \mathbb R^\nu \), \(K_{01}(\psi ) \in H_S^\bot \) (it is a function of \( x \in \mathbb T\)), while \(K_{20}(\psi ) \) is a \(\nu \times \nu \) real matrix, \(K_{02}(\psi )\) is a linear self-adjoint operator of \( H_S^\bot \) and \(K_{11}(\psi ) :\mathbb R^\nu \rightarrow H_S^\bot \). Note that the above Taylor coefficients do not depend on the parameter \( \zeta _0 \).
The Hamilton equations associated to (6.14) are
where \( [\partial _{\psi }K_{10}(\psi )]^T \) is the \( \nu \times \nu \) transposed matrix and the operators \( [\partial _{\psi }K_{01}(\psi )]^T \) and \( K_{11}^T(\psi ) :{H_S^\bot \rightarrow \mathbb R^\nu } \) are defined by the duality relation \(( \partial _{\psi } K_{01}(\psi ) [\hat{\psi } ], w)_{L^2}\) \(= \hat{\psi } \cdot [\partial _{\psi }K_{01}(\psi )]^T w \), for all \(\hat{\psi } \in \mathbb R^\nu \), \(w \in H_S^\bot \), and similarly for \( K_{11} \). Explicitly, for all \( w \in H_S^\bot \), and denoting by \(\underline{e}_k\) the k-th unit vector of \(\mathbb R^\nu \),
In the next lemma we estimate the coefficients \( K_{00}, K_{10}, K_{01} \) of the Taylor expansion (6.14). Note that on an exact solution we have \( Z = 0 \) and therefore \( K_{00} (\psi ) = \mathrm{const} \), \( K_{10} = \omega \) and \( K_{01} = 0 \).
Lemma 9
Assume (6.4). Then there is \( \sigma := \sigma (\tau , \nu )\) such that
Proof
Follow the proof of Lemma 6.4 in [5]. The fact that here there is no factor \(\varepsilon ^{2b-1} \gamma ^{-1}\) is a consequence of the better estimate (6.10) for \(\mathcal {F}(i_\delta ,\zeta _0)\) compared to the analogous estimate in [5]. \(\square \)
Remark 4
If \( \mathcal{F} (i_0, \zeta _0) = 0 \), then \(\zeta _0 = 0\) by Lemma 6, and Lemma 9 implies that (6.14) simplifies to the normal form
We now estimate \( K_{20}, K_{11}\) in (6.14). The norm of \(K_{20}\) is the sum of the norms of its matrix entries.
Lemma 10
Assume (6.4). Then
In particular \( \Vert K_{20} - \varepsilon ^{2b} \mathbb {A} D_S \Vert _{s_0 }^{{\mathrm {Lip}(\gamma )}} \le C \varepsilon ^{5-2b}\), and
Proof
See the proof of Lemma 6.6 in [5]. \(\square \)
Consider the linear change of variables \(({\widehat{\theta }}, {\widehat{y}}, {\widehat{z}}) = D G_\delta (\varphi , 0, 0) [{\widehat{\psi }}, {\widehat{\eta }}, {\widehat{w}}]\), where \(D G_\delta (\varphi ,0,0)\) is obtained by linearizing \(G_\delta \) in (6.13) at \((\varphi ,0,0)\), and it is represented by the matrix
The linearized operator \(d_{i, \zeta }\mathcal{F}(i_\delta , \zeta _0)\) transforms (approximately, see (6.40)) into the operator obtained linearizing (6.15) at \((\psi , \eta , w, \zeta ) = (\varphi , 0, 0, \zeta _0 )\) (with \( \partial _t \rightsquigarrow \mathcal{D}_\omega \)), which is the linear operator
where
Lemma 11
(Lemma 6.7 in [5]) Assume (6.4) and let \( {\widehat{\imath }} := ({\widehat{\psi }}, {\widehat{\eta }}, {\widehat{w}})\). Then
for some \(\sigma := \sigma (\nu ,{\tau })\). The same estimates hold for the \(\Vert \ \Vert _s^{{\mathrm {Lip}(\gamma )}}\) norm.
In order to construct an approximate inverse of (6.20) it is sufficient to solve the equation
which is obtained by neglecting in \(B_1, B_2, B_3\) in (6.20) the terms \( \partial _\psi K_{10} \), \( \partial _{\psi \psi } K_{00} \), \( \partial _\psi K_{00} \), \( \partial _\psi K_{01} \) and \( \partial _\psi [\partial _\psi \theta _0(\varphi )]^T [ \cdot , \zeta _0] \) (these terms vanish at a solution by Lemmata 6 and 9).
First we solve the second equation in (6.22), namely \( \mathcal{D}_\omega {\widehat{\eta }} = g_2 - [\partial _\psi \theta _0(\varphi )]^T {\widehat{\zeta }} \). We choose \( {\widehat{\zeta }} \) so that the \(\varphi \)-average of the right hand side is zero, namely
(we denote \( \langle g \rangle := (2 \pi )^{- \nu } \int _{\mathbb T^\nu } g (\varphi ) d \varphi \)). Note that the \(\varphi \)-averaged matrix \( \langle [\partial _\psi \theta _0 ]^T \rangle \) \( = \langle I + [\partial _\psi \Theta _0]^T \rangle = I \) because \(\theta _0(\varphi ) = \varphi + \Theta _0(\varphi )\) and \(\Theta _0(\varphi )\) is a periodic function. Therefore
where the average \(\langle {\widehat{\eta }} \rangle \) will be fixed below. Then we consider the third equation
-
Inversion assumption
There exists a set \( \Omega _\infty \subset \Omega _o\) such that for all \( \omega \in \Omega _\infty \), for every function \( g \in H^{s+\mu }_{S^\bot } (\mathbb T^{\nu +1}) \) there exists a solution \( h := \mathcal{L}_\omega ^{- 1} g \in H^{s}_{S^\bot } (\mathbb T^{\nu +1}) \) of the linear equation \( \mathcal{L}_\omega h = g \), which satisfies
for some \( \mu := \mu ({\tau }, \nu ) > 0 \).
By the above assumption there exists a solution
of (6.25). Finally, we solve the first equation in (6.22), which, substituting (6.24) and (6.27), becomes
where
To solve Eq. (6.28) we have to choose \(\langle {\widehat{\eta }} \rangle \) such that the right hand side in (6.28) has zero average. By Lemma 10 and (6.4), the \(\varphi \)-averaged matrix
Therefore, for \( \varepsilon \) small, \(\langle M_1 \rangle \) is invertible and \(\langle M_1 \rangle ^{-1} = O(\varepsilon ^{-2 b}) = O(\gamma ^{- 1})\) (recall (5.9)). Thus we define
With this choice of \(\langle {\widehat{\eta }} \rangle \), Eq. (6.28) has the solution
In conclusion, we have constructed a solution \(({\widehat{\psi }}, {\widehat{\eta }}, {\widehat{w}}, {\widehat{\zeta }})\) of the linear system (6.22).
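The averaging step for the second equation can be spelled out as follows. This is a sketch that uses only facts stated above: \( {\widehat{\zeta }} \in \mathbb R^\nu \) is a constant vector, \( \langle [\partial _\psi \theta _0 ]^T \rangle = I \), and \( \mathcal{D}_\omega ^{-1} \) is defined on functions with zero \( \varphi \)-average.

$$\begin{aligned} \big \langle g_2 - [\partial _\psi \theta _0]^T {\widehat{\zeta }} \big \rangle&= \langle g_2 \rangle - \big \langle [\partial _\psi \theta _0]^T \big \rangle {\widehat{\zeta }} = \langle g_2 \rangle - {\widehat{\zeta }} = 0 \quad \Longleftrightarrow \quad {\widehat{\zeta }} = \langle g_2 \rangle , \\ {\widehat{\eta }}&= \langle {\widehat{\eta }} \rangle + \mathcal{D}_\omega ^{-1} \big ( g_2 - [\partial _\psi \theta _0(\varphi )]^T \langle g_2 \rangle \big ) . \end{aligned}$$

The free constant \( \langle {\widehat{\eta }} \rangle \) is the quantity that is later fixed through the averaged matrix \( \langle M_1 \rangle \), so that the first equation also has a right hand side with zero average.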
Proposition 2
Assume (6.4) and (6.26). Then, \(\forall \omega \in \Omega _\infty \), \( \forall g := (g_1, g_2, g_3) \), the system (6.22) has a solution \( {\mathbb D}^{-1} g := ({\widehat{\psi }}, {\widehat{\eta }}, {\widehat{w}}, {\widehat{\zeta }} ) \) where \(({\widehat{\psi }}, {\widehat{\eta }}, {\widehat{w}}, {\widehat{\zeta }})\) are defined in (6.23), (6.24), (6.27), (6.31), (6.32), and satisfy
Proof
Recalling (6.29), by Lemma 10, (6.26) and (6.4) we get \( \Vert M_2 h \Vert _{s_0} + \Vert M_3 h \Vert _{s_0} \) \(\le C \Vert h \Vert _{s_0 + \sigma } \). Then, by (6.31) and \(\langle M_1 \rangle ^{-1} = O(\varepsilon ^{-2 b}) = O(\gamma ^{-1}) \), we deduce \( |\langle {\widehat{\eta }}\rangle |^{{\mathrm {Lip}(\gamma )}} \le C\gamma ^{-1} \Vert g \Vert _{s_0+ \sigma }^{{\mathrm {Lip}(\gamma )}} \) and (6.24), (5.16) imply \( \Vert {\widehat{\eta }} \Vert _s^{{\mathrm {Lip}(\gamma )}} \le _s \gamma ^{-1} \big ( \Vert g \Vert _{s + \sigma }^{\mathrm {Lip}(\gamma )}\) \( + \Vert {\mathfrak {I}}_0 \Vert _{s + \sigma } \Vert g \Vert _{s_0}^{\mathrm {Lip}(\gamma )}\big )\). The bound (6.33) is sharp for \( {\widehat{w}} \) because \( \mathcal{L}_\omega ^{-1} g_3 \) in (6.27) is estimated using (6.26). Finally \( {\widehat{\psi }} \) satisfies (6.33) using (6.26), (6.29), (6.32), (5.16), and Lemma 10. \(\square \)
Let \(\widetilde{G}_\delta (\psi , \eta , w, \zeta ) := ( G_\delta (\psi , \eta , w), \zeta )\). Let \(\Vert (\psi , \eta , w, \zeta ) \Vert _s^{\mathrm {Lip}(\gamma )}\) denote the maximum between \(\Vert (\psi , \eta , w) \Vert _s^{\mathrm {Lip}(\gamma )}\) and \(| \zeta |^{\mathrm {Lip}(\gamma )}\). We prove that the operator
is an approximate right inverse for \(d_{i,\zeta } \mathcal{F}(i_0 )\).
Theorem 3
(Approximate inverse) Assume (6.4) and the inversion assumption (6.26). Then there exists \( \mu := \mu (\tau , \nu ) > 0 \) such that, for all \( \omega \in \Omega _\infty \), for all \( g := (g_1, g_2, g_3) \), the operator \( \mathbf{T}_0 \) defined in (6.34) satisfies
The operator \(\mathbf {T}_0\) is an approximate inverse of \(d_{i, \zeta } \mathcal{F}(i_0 )\), namely
Proof
In this proof we write \(\Vert \ \Vert _s\) instead of \(\Vert \ \Vert _s^{{\mathrm {Lip}(\gamma )}}\). The bound (6.35) follows from (6.21), (6.33) and (6.34). By (5.6), since \( X_\mathcal {N}\) does not depend on y, and \( i_\delta \) differs from \( i_0 \) only in the y component, we have
By (5.13), (6.4), (6.8) and (6.9), we estimate
where \(Z := \mathcal {F}(i_0, \zeta _0)\) (recall (6.5)). Note that \(\mathcal {E}_0[{\widehat{\imath }}, {\widehat{\zeta }}]\) is, in fact, independent of \({\widehat{\zeta }}\). Denote the set of variables \( (\psi , \eta , w) =: {\mathtt u} \). Under the transformation \(G_\delta \), the nonlinear operator \(\mathcal{F}\) in (5.6) transforms into
where \(K = H_{\varepsilon , \zeta } \circ G_\delta \), see (6.14) and (6.15). Differentiating (6.39) at the trivial torus \( {\mathtt u}_\delta (\varphi ) = G_\delta ^{-1}(i_\delta ) (\varphi ) = (\varphi , 0 , 0 ) \), at \( \zeta = \zeta _0 \), in the direction \(({\widehat{\mathtt u}}, {\widehat{\zeta }}\,)\) \(= (D G_\delta ({\mathtt u}_\delta )^{-1} [\, {\widehat{\imath }} \, ], {\widehat{\zeta }}) = D {\widetilde{G}}_\delta ({\mathtt u}_\delta )^{-1} [\, {\widehat{\imath }} , {\widehat{\zeta }} \, ] \), we get
where \( d_{\mathtt u, \zeta } X_K( {\mathtt u}_\delta , \zeta _0) \) is expanded in (6.20). In fact, \(\mathcal{E}_1\) is independent of \({\widehat{\zeta }}\). We split
where \( {\mathbb D} [{\widehat{\mathtt u}}, {\widehat{\zeta }}] \) is defined in (6.22) and \(R_Z [ {\widehat{\psi }}, {\widehat{\eta }}, {\widehat{w}}, {\widehat{\zeta }}]\) is defined by difference, so that its first component is \( - \partial _\psi K_{10}(\varphi ) [{\widehat{\psi }} ]\), its second component is
and its third component is \(- \partial _x \{ \partial _{\psi } K_{01}(\varphi )[{\widehat{\psi }}] \} \) (in fact, \(R_Z\) is independent of \({\widehat{\zeta }}\)). By (6.37) and (6.40),
By Lemmata 6, 9, 11, and (6.4) and (6.10), the terms \(\mathcal {E}_1, \mathcal {E}_2\) satisfy the same bound (6.38) as \(\mathcal {E}_0\). Thus the sum \(\mathcal {E}:= \mathcal {E}_0 + \mathcal {E}_1 + \mathcal {E}_2\) satisfies (6.38). Applying \( \mathbf{T}_0 \) defined in (6.34) to the right in (6.42), since \( {\mathbb D} \circ {\mathbb D}^{-1} = I \) (see Proposition 2), we get \(d_{i, \zeta } \mathcal{F}(i_0 ) \circ \mathbf{T}_0 - I = \mathcal {E}\circ \mathbf{T}_0\). Then (6.36) follows from (6.35) and the bound (6.38) for \(\mathcal {E}\). \(\square \)
7 The linearized operator in the normal directions
The goal of this section is to write an explicit expression of the linearized operator \(\mathcal {L}_\omega \) defined in (6.25), see Proposition 3. To this aim, we compute \( \frac{1}{2} ( K_{02}(\psi ) w, w )_{L^2(\mathbb T)} \), \( w \in H_S^\bot \), which collects all the terms of \((H_\varepsilon \circ G_\delta )(\psi , 0, w)\) that are quadratic in w, see (6.14). We first recall some preliminary lemmata.
Lemma 12
[5, Lemma 7.1] Let H be a Hamiltonian function of class \( C^2 ( H^1_0(\mathbb T_x), \mathbb R)\) and consider a map \( \Phi (u) := u + \Psi (u) \) satisfying \(\Psi (u) = \Pi _E \Psi (\Pi _E u)\), for all u, where E is a finite dimensional subspace as in (3.5). Then
where \( \mathcal{R}(u) \) has the “finite dimensional” form
with \( \chi _j (u) = e^{{\mathrm i}j x} \) or \( g_j(u) = e^{{\mathrm i}j x} \). The remainder in (7.2) is \( \mathcal{R} (u) = \mathcal{R}_0 (u) + \mathcal{R}_1 (u) + \mathcal{R}_2 (u) \) with
Lemma 13
(Lemma 7.3 in [5]) Let \( \mathcal{R} \) be an operator of the form
where the functions \(g_j({\tau }),\,\chi _j({\tau }) \in H^s\), \({\tau }\in [0, 1]\) depend in a Lipschitz way on the parameter \(\omega \). Then its matrix s-decay norm (see (2.4), (2.5)) satisfies
7.1 Composition with the map \(G_\delta \)
In the sequel we use the fact that \({\mathfrak {I}}_\delta := {\mathfrak {I}}_\delta (\varphi ; \omega ) := i_\delta (\varphi ; \omega ) - (\varphi ,0,0) \) satisfies, by (6.4) and (6.8),
In this section we study the Hamiltonian \( K := H_\varepsilon \circ G_\delta = \varepsilon ^{-2b} \mathcal {H}\circ A_\varepsilon \circ G_\delta \) defined in (4.6) and (6.14). Recalling (4.7) and (6.13), \(A_\varepsilon \circ G_\delta \) has the form
where \(v_\varepsilon \) is defined in (4.7), and
By Taylor’s formula, we expand (7.6) in w at \((\eta ,w)=(0,0)\), and we get
where
is the approximate isotropic torus in the phase space \( H^1_0 (\mathbb T) \) (it corresponds to \( i_\delta \) in Lemma 8),
and \(T_{\ge 3}(\psi , w)\) collects all the terms of order at least cubic in w. The terms \(U_1, U_2\) are \(O(1)\) in \(\varepsilon \). Moreover, using that \( L_2 (\psi ) \) in (7.7) vanishes at \( z_0 = 0 \), they satisfy
and also in the \( \Vert \ \Vert _s^{\mathrm {Lip}(\gamma )}\)-norm. We expand \( \mathcal{H} \) by Taylor’s formula
Specifying at \(u = T_\delta (\psi )\) and \( h = T_1(\psi ) w + T_2(\psi )[w,w] + T_{\ge 3}(\psi ,w)\), we obtain that the sum of all the components of \( K = \varepsilon ^{-2b} (\mathcal {H}\circ A_\varepsilon \circ G_\delta )(\psi , 0, w) \) that are quadratic in w is
Inserting the expressions (7.9) and (7.10) in the last equality we get
Lemma 14
The operator \(K_{02}\) reads
where \(R(\psi )w \) has the “finite dimensional” form
The functions \(g_j, \chi _j\) satisfy, for some \(\sigma := \sigma (\nu , {\tau }) > 0\),
where \(i = (\theta , y, z)\) (see (5.1)) and \({\widehat{\imath }} = ({\widehat{\theta }}, {\widehat{y}}, {\widehat{z}})\).
Proof
Since \( U_1 = \Pi _S U_1 \) and \( U_2 = \Pi _S U_2 \), the last three terms in (7.12) have all the form (7.14). We have to prove that they are also small in size.
By (4.8), (6.13) and (7.7), the only term in \(\varepsilon ^{-2b} H_2(A_\varepsilon (G_\delta (\psi , \eta , w)))\) that is quadratic in w is \(\frac{1}{2} \int _\mathbb Tw_x^2 \, dx\), so this is the only contribution to (7.12) coming from \(H_2\).
It remains to consider all the terms coming from \(\mathcal {H}_{\ge 4} := \mathcal {H}_4 + \mathcal{H}_{\ge 5} = O(u^4)\). The term \(\varepsilon ^{b - 1} \partial _u \nabla \mathcal{H}_{\ge 4}(T_\delta ) U_1\), the term \(\varepsilon ^{2(b - 1)} U_1^T (\partial _u \nabla \mathcal{H}_{\ge 4})(T_\delta ) U_1\) and the term \(\varepsilon ^{2 b - 3} U_2^T \nabla \mathcal{H}_{\ge 4}(T_\delta ) \) have all the form (7.14) and, using the inequality \( \Vert T_\delta \Vert _s^{\mathrm {Lip}(\gamma )}\le \varepsilon (1 + \Vert {\mathfrak {I}}_\delta \Vert _s^{\mathrm {Lip}(\gamma )}) \), (6.4) and (7.11), the bound (7.15) holds. By (6.11) and using explicit formulae (7.7)–(7.10) we get (7.16). \(\square \)
The conclusion of this section is that, after the composition with the action-angle variables, the rescaling (4.5), and the transformation \( G_\delta \), the linearized operator to analyze is \(w \mapsto (\partial _u \nabla \mathcal {H})(T_\delta ) [w] \), \(w \in H_S^\bot \), up to finite dimensional operators which have the form (7.14) and size (7.15).
7.2 The linearized operator in the normal directions
In view of (7.13) we now compute \( ( (\partial _u \nabla \mathcal {H})(T_\delta ) [w], w )_{L^2(\mathbb T)} \), \( w \in H_S^\bot \), where \( \mathcal {H}= H \circ \Phi _B \) and \(\Phi _B \) is the Birkhoff map of Proposition 1. We recall that \(\Phi _B(u) = u + \Psi (u)\) where \(\Psi \) satisfies (3.6) and \(\Psi (u) = O(u^3)\). It is convenient to estimate separately the terms in
where \( H_2, H_4, H_{\ge 5}\) are defined in (3.4).
We first consider \( H_{\ge 5} \circ \Phi _B \). By (3.4) we get \( \nabla H_{\ge 5}(u) = \pi _0[ (\partial _u f)(x, u, u_x) ]\) \(- \partial _x \{ (\partial _{u_x} f)(x, u,u_x) \} \) where \( \pi _0 \) is the operator defined in (1.32). Since \( \Phi _B \) has the form (3.6), Lemma 12 (at \( u = T_\delta \), see (7.8)) implies that
where the multiplicative functions \(r_0(T_\delta )\), \(r_1(T_\delta )\) are
the remainder \( \mathcal{R}_{H_{\ge 5}}(u) \) has the form (7.2) with \(\chi _j = e^{{\mathrm i}jx}\) or \(g_j = e^{{\mathrm i}jx}\) and, using (7.3), it satisfies, for some \( \sigma := \sigma (\nu , {\tau }) > 0\),
Now we consider the contributions from \( H_2 \circ \Phi _B\) and \(H_4 \circ \Phi _B \). By Lemma 12 and the expressions of \( H_2, H_4 \) in (3.4) we deduce that
where \( \mathcal{R}_{H_2}(u) \), \( \mathcal{R}_{H_4}(u) \) have the form (7.2). By (7.3), they have size \(\mathcal{R}_{H_2}(T_\delta ) = O(\varepsilon ^2)\), \(\mathcal{R}_{H_4}(T_\delta ) = O(\varepsilon ^4)\). More precisely, the functions \(g_j, \chi _j\) in \(\mathcal {R}_{H_4}(T_\delta )\) satisfy the bounds in (7.20) with \(\varepsilon ^5\) replaced by \(\varepsilon ^4\). Regarding \(\mathcal {R}_{H_2}(T_\delta )\), we need to find an exact formula for the terms of order \(\varepsilon ^2\).
The sum of (7.18), (7.21) and (7.22) gives a formula for \(\partial _u \nabla \mathcal {H}(T_\delta )[h]\), where the terms of form (7.2) and order \(\varepsilon ^2\) are confined in \(\mathcal {R}_{H_2}(T_\delta )\). On the other hand, recalling (3.7), \(\mathcal {H}= H_2 + \mathcal {H}_4 + \mathcal {H}_{\ge 5}\), and \(\partial _u \nabla H_2(T_\delta ) = -\partial _{xx}\), while \(\partial _u \nabla \mathcal {H}_{\ge 5}(T_\delta ) = O(\varepsilon ^3)\). Therefore all the terms of order \(\varepsilon ^2\) in \(\partial _u \nabla \mathcal {H}(T_\delta )\) can only come from \(\partial _u \nabla \mathcal {H}_4(T_\delta )\). Using formula (3.8) for \(\mathcal {H}_4\), we calculate
Hence all the terms of order \(\varepsilon ^2\) in \(\Pi _S^\bot (\partial _u \nabla \mathcal {H}(T_\delta )[h] )\) are contained in the term \(- 3\varsigma \Pi _S^\bot (T_\delta ^2 h)\) (and the term \(- 3\varsigma \Pi _S^\bot (T_\delta ^2 h)\) is included in \(-3\varsigma \Pi _S^\bot [ (\Phi _B (T_\delta ))^2 h]\) because \(\Phi _B(T_\delta ) = T_\delta + \Psi (T_\delta )\)). As a consequence, \(\Pi _S^\bot \mathcal {R}_{H_2}(T_\delta )\) is of size \(O(\varepsilon ^3)\), and its functions \(g_j, \chi _j\) (see (7.2)) satisfy (7.20) with \(\varepsilon ^5\) replaced by \(\varepsilon ^3\).
By Lemma 14 and the results of this section we deduce:
Proposition 3
Assume (7.5). Then the Hamiltonian operator \( \mathcal{L}_\omega \) has the form, \( \forall h \in H_{S^\bot }^s ( \mathbb T^{\nu +1}) \),
where \( {\mathcal {R}}_* := \mathcal {R}_{H_2}(T_\delta ) + \mathcal {R}_{H_4}(T_\delta ) + \mathcal {R}_{H_{\ge 5}}(T_\delta ) + R(\psi ) \) (with \(R(\psi )\) defined in Lemma 14, and \(\mathcal {R}_{H_2}(T_\delta )\), \(\mathcal {R}_{H_4}(T_\delta )\), \(\mathcal {R}_{H_{\ge 5}}(T_\delta )\) defined in (7.18), (7.21) and (7.22)), the functions
\( r_0, r_1 \) are defined in (7.19), and \( T_\delta \) in (7.8). They satisfy
where \( {\mathfrak {I}}_\delta (\varphi ) := (\theta _0(\varphi ) - \varphi , y_\delta (\varphi ), z_0(\varphi )) \) corresponds to \(T_\delta \). The remainder \( \mathcal{R}_* \) has the form (7.2), and its coefficients \(g_j, \chi _j\) satisfy bounds (7.15) and (7.16).
Remark 5
For \( K = H + \lambda M^2\), \( \lambda = 3 \varsigma / 4 \), the coefficient \(a_0\) in (7.24) becomes
where \( \pi _0\) is defined in (1.32). Thus the space average of \( a_0\) has size \(O(\varepsilon ^3)\).
The bounds (7.15) imply, by Lemma 13, estimates for the s-decay norms of \(\mathcal{R}_*\). The linearized operator \( \mathcal{L}_\omega := \mathcal{L}_\omega (\omega , i_\delta (\omega ))\) depends on the parameter \( \omega \) both directly and through the torus \(i_\delta (\omega )\). We have also estimated the partial derivative \( \partial _i \) with respect to the variables i (see (5.1)) in order to control, along the nonlinear Nash–Moser iteration, the Lipschitz variation of the eigenvalues of \( \mathcal{L}_\omega \) with respect to \( \omega \) and the approximate solution \( i_\delta \).
8 Reduction of the linearized operator in the normal directions
The goal of this section is to conjugate the Hamiltonian linear operator \( \mathcal{L}_\omega \) in (7.23) to the constant coefficients linear operator \( \mathcal{L}_\infty \) defined in (8.64). The conjugation is obtained by applying different kinds of symplectic transformations. We shall always assume (7.5).
8.1 Space reduction at the order \( \partial _{xxx} \)
As a first step, we symplectically conjugate the operator \( \mathcal{L}_\omega \) in (7.23) to \( \mathcal{L}_1 \) in (8.13), in which the coefficient of \(\partial _{xxx}\) is independent of the space variable. Because of the Hamiltonian structure, this step also eliminates the terms \( O( \partial _{xx} )\).
We look for a \( \varphi \)-dependent family of symplectic diffeomorphisms \(\Phi (\varphi ) \) of \( H_S^\bot \) which differ from
up to a small “finite dimensional” remainder, see (8.3). For each \( \varphi \in \mathbb T^\nu \), the map \( \mathcal{A}(\varphi ) \) is a symplectic map of the phase space, see Remark 3.3 in [3]. If \( \Vert \beta \Vert _{W^{1,\infty }} < 1/2\), then \( \mathcal{A} \) is invertible (see Lemma 3), and its inverse and adjoint maps are
where \(x = y + \tilde{\beta } (\varphi , y) \) is the inverse diffeomorphism (of \(\mathbb T\)) of \( y = x + \beta (\varphi , x) \).
The restricted map \( \mathcal{A}_\bot (\varphi ): H_S^\bot \rightarrow H_S^\bot \) is not symplectic. We have already observed in the introduction that \( \mathcal{A }(\varphi ) \) is the time-1 flow map of the linear Hamiltonian PDE (1.30). Equation (1.30) is a linear transport equation, whose characteristic curves are the solutions of the ODE
To obtain a symplectic transformation close to \(\mathcal {A}_\bot \), we define a symplectic map \(\Phi \) of \( H_S^\bot \) as the time 1 flow of the Hamiltonian PDE (1.31). The linear operator \( \Pi _S^\bot \partial _x (b({\tau }, x) u) \) is the Hamiltonian vector field generated by the quadratic Hamiltonian \( \frac{1}{2} \int _{\mathbb T} b({\tau }, x) u^2 dx \) restricted to \( H_S^\bot \). The flow of (1.31) is well defined in the Sobolev spaces \( H^s_{S^\bot } (\mathbb T_x) \) for \( b(\varphi , {\tau }, x) \) smooth enough, by standard theory of linear hyperbolic PDEs (see e.g. section 0.8 in [29]). The difference between the time 1 flow map \( \Phi \) and \( \mathcal{A}_\bot \) is a “finite-dimensional” remainder of size \(O(\beta )\).
Lemma 15
(Lemma 8.1 of [5]) For \( \Vert \beta \Vert _{W^{s_0 + 1,\infty }} \) small, there exists an invertible symplectic transformation \(\Phi = \mathcal{A}_\bot + \mathcal{R}_\Phi \) of \(H_{S^\bot }^s\), where \( \mathcal{A}_\bot \) is defined in (8.1) and \( \mathcal{R}_\Phi \) is a “finite-dimensional” remainder
for some functions \( \chi _j ({\tau }), g_j ({\tau }) , \psi _j \in H^s \) satisfying for all \({\tau }\in [0,1]\)
Moreover
We conjugate \( \mathcal{L}_\omega \) in (7.23) via the symplectic map \( \Phi = \mathcal{A}_\bot + \mathcal{R}_\Phi \) of Lemma 15. Using the splitting \( \Pi _S^{\bot } = I - \Pi _S \), we compute
where the coefficients \(b_i(\varphi ,y)\), \(i=0,1,2,3\), are
and the remainder
The commutator \([\mathcal{D}_\omega , \mathcal{R}_\Phi ] \) has the form (8.3) with \(\mathcal{D}_\omega g_j\) or \(\mathcal{D}_\omega \chi _j\), \(\mathcal{D}_\omega \psi _j\) instead of \(\chi _j\), \(g_j\), \(\psi _j\) respectively. Also the last term \((\mathcal{L}_\omega - \mathcal{D}_\omega ) \mathcal{R}_\Phi \) in (8.8) has the form (8.3) (note that \(\mathcal{L}_\omega - \mathcal{D}_\omega \) does not contain derivatives with respect to \(\varphi \)). By (8.6), and decomposing \( I = \Pi _S + \Pi _S^\bot \), we get
Now we choose the function \( \beta = \beta (\varphi , x) \) such that
so that the coefficient \( b_3 \) in (8.7) depends only on \( \varphi \) (note that \( \mathcal{A}^T [b_3 (\varphi )]\) \(= b_3 (\varphi )\)). The only solution of (8.11) with zero space average is (see e.g. [3, section 3.1]) \(\beta := \partial _x^{-1} \rho _0\), where \(\rho _0 := b_3 (\varphi )^{1/3} (a_1 (\varphi , x))^{-1/3} - 1\), and
Applying the symplectic map \( \Phi ^{-1} \) in (8.9) we obtain the Hamiltonian operator (see Definition 2)
where \( {\mathfrak R}_1 := \Phi ^{-1} \mathcal{R}_{II} \). Note that the term \(b_2 \partial _{yy}\) has disappeared from (8.13) because, by the Hamiltonian nature of \( \mathcal{L}_1 \), the coefficient \( b_2 = 2 (b_3)_y \) (see [3, Remark 3.5]) and therefore, by (8.12), \( b_2 = 2 (b_3)_y = 0 \).
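The choice of \( \beta \) above can be checked numerically at a fixed \( \varphi \). The sketch below assumes that (8.11) amounts to \( a_1 (1 + \beta _x)^3 = b_3 \), which is what the stated formula \( \rho _0 = b_3^{1/3} a_1^{-1/3} - 1 \) with \( \beta _x = \rho _0 \) gives, and that \( b_3 \) is fixed by requiring \( \rho _0 \) to have zero space average, namely \( b_3 = \big ( (2\pi )^{-1} \int _{\mathbb T} a_1^{-1/3} dx \big )^{-3} \). The sample coefficient `a1` and the helper name `constant_b3` are illustrative assumptions, not data from the paper.

```python
import math


def constant_b3(a1_samples):
    """b_3 = (space average of a_1^{-1/3})^{-3}, computed on a uniform grid."""
    mean = sum(a ** (-1.0 / 3.0) for a in a1_samples) / len(a1_samples)
    return mean ** (-3.0)


# Illustrative coefficient close to 1 (assumption), sampled on a uniform grid of T.
N = 256
xs = [2 * math.pi * k / N for k in range(N)]
a1 = [1.0 + 0.1 * math.cos(x) + 0.05 * math.sin(2 * x) for x in xs]

b3 = constant_b3(a1)

# rho_0 = b_3^{1/3} a_1^{-1/3} - 1 is the candidate for beta_x.
rho0 = [b3 ** (1.0 / 3.0) * a ** (-1.0 / 3.0) - 1.0 for a in a1]

# Zero space average is what allows beta = d_x^{-1} rho_0 to be periodic.
mean_rho0 = sum(rho0) / N

# The transformed leading coefficient a_1 (1 + beta_x)^3 should not depend on x.
transformed = [a * (1.0 + r) ** 3 for a, r in zip(a1, rho0)]
spread = max(transformed) - min(transformed)
```

Here `mean_rho0` and `spread` both vanish up to rounding, which reflects the two requirements on \( \beta \): periodicity in x and a space-independent coefficient of \( \partial _{yyy} \).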
Lemma 16
(Lemma 8.2 of [5]) The operator \( {\mathfrak R}_1 \) in (8.13) has the form (7.4).
Since \(a_1 = 1 + O(\varepsilon ^3)\) and \(a_0 = 3\varsigma T_\delta ^2 + O(\varepsilon ^3)\) (see (7.25), (7.26) for the precise estimates), by the usual composition estimates we deduce the following lemma.
Lemma 17
There is \(\sigma = \sigma ({\tau },\nu ) > 0\) such that
where \(T_\delta \) is defined in (7.8). The transformations \(\Phi \), \(\Phi ^{-1}\) satisfy
Moreover the remainder \({\mathfrak R}_1\) has the form (7.4), where the functions \(\chi _j({\tau })\), \(g_j({\tau })\) satisfy the estimates (7.15) and (7.16) uniformly in \({\tau }\in [0, 1]\).
8.2 Time reduction at the order \( \partial _{xxx}\)
The goal of this section is to get a constant coefficient in front of \( \partial _{yyy} \), using a quasi-periodic reparametrization of time. We consider the change of variable
where \(\mathbb T^\nu \rightarrow \mathbb T^\nu \), \(\vartheta \mapsto \varphi = \vartheta + \omega \tilde{\alpha }(\vartheta )\) is the inverse diffeomorphism of \( \vartheta = \varphi + \omega \alpha (\varphi ) \) in \(\mathbb T^\nu \). By conjugation, the differential operators become
By (8.13), using also that B and \( B^{-1} \) commute with \( \Pi _S^\bot \), the conjugate operator \(B^{-1} \mathcal{L}_1 B\) is equal to
We choose \( \alpha \) such that \((B^{-1}b_3 )(\vartheta ) = m_3 \rho (\vartheta )\) for some constant \(m_3 \in \mathbb R\), namely
(recall (8.19)). The unique solution with zero average of (8.21) is
Hence, by (8.20),
The transformed operator \(\mathcal{L}_2\) in (8.23) is still Hamiltonian, because the reparametrization of time preserves the Hamiltonian structure (see Section 2.2 and Remark 3.7 in [3]).
Lemma 18
There is \( \sigma = \sigma (\nu ,{\tau }) > 0 \) (possibly larger than \( \sigma \) in Lemma 17) such that
The transformations B, \(B^{-1}\) satisfy the estimates (8.16) and (8.17). The remainder \( \mathfrak {R}_2 \) has the form (7.4), and the functions \(g_j({\tau })\), \(\chi _j({\tau })\) satisfy the estimates (7.15) and (7.16) for all \({\tau }\in [0,1]\).
Proof
To estimate \(\Vert \alpha \Vert _s^{\mathrm {Lip}(\gamma )}\) we also differentiate (8.22) with respect to the parameter \( \omega \). Note that \(c_1 - 3 \varsigma B^{-1}(T_\delta ^2) = O(\varepsilon ^3)\), and similarly \(c_0 - 3 \varsigma B^{-1}((T_\delta ^2)_x)\) \(= O(\varepsilon ^3)\). The factor \(\varepsilon ^5 \gamma ^{-1}\) in the last two inequalities comes from the estimate of the difference \(B^{-1}(T_\delta ^2) - T_\delta ^2 \simeq (T_\delta ^2)_\varphi \alpha = O(\varepsilon ^2 \varepsilon ^3 \gamma ^{-1})\). \(\square \)
8.3 Translation of the space variable
In this section we remove the space average from the coefficient in front of \( \partial _y \). Consider the change of the space variable \( z = y + p(\vartheta ) \) which induces on \( H^s_{S^\bot } (\mathbb T^{\nu +1}) \) the operators
(which are a particular case of those used in Sect. 8.1). The differential operators become \( \mathcal{T}^{-1} \omega \cdot \partial _{\vartheta } \mathcal{T} \) \( = \omega \cdot \partial _{\vartheta } + \{ \omega \cdot \partial _{\vartheta }p (\vartheta ) \} \partial _z \), \( \mathcal{T}^{-1} \partial _{y} \mathcal{T} = \partial _{z} \). Since \(\mathcal {T}, \mathcal {T}^{-1}\) commute with \( \Pi _S^\bot \), we get
We choose
so that
Recalling (8.26), we analyze the space average of \(c_1\) in more detail. To avoid ambiguity between the space variable \(y \in \mathbb T\) and the action \(y_\delta : \mathbb T^\nu \rightarrow \mathbb R^\nu \) of (7.8), we rename \(x \in \mathbb T\) the space variable, and \(\varphi \in \mathbb T^\nu \) the variable on the torus (time variable). Let
where \(\ell :S \rightarrow \mathbb Z^\nu \) is the odd injective map (see (1.11))
and \(e_i = (0,\ldots ,1, \ldots ,0)\) denotes the i-th vector of the canonical basis of \(\mathbb R^\nu \). In view of the next linear Birkhoff normal form step (whose goal is to normalize the term of size \(\varepsilon ^2\)), we observe that the component of order \(\varepsilon ^2\) in \(T_\delta ^2\) (see (7.8)) is \(\varepsilon ^2 \bar{v}^2\), with
Moreover, from (7.8), since \((v_\delta , z_0)_{L^2(\mathbb T)} = 0\), and \((\theta _0)_{-j} = - (\theta _0)_j\) for all \(j \in S\), we have
We define
and note that, by (8.31) and (8.32),
Using the explicit formulae above, and Lemma 13 for the estimate of \(\mathfrak R_3\), we get the following bounds.
Lemma 19
There is \( \sigma := \sigma (\nu ,{\tau }) > 0 \) (possibly larger than in Lemma 18) such that
The matrix s-decay norm (see (2.4)) of the operator \({\mathfrak R}_3\) satisfies
The transformations \(\mathcal{T}\), \(\mathcal{T}^{-1}\) satisfy (8.16) and (8.17).
Remark 6
When \( K = H + \lambda M^2 \), \( \lambda = 3 \varsigma / 4 \), the constant coefficient \(m_1\) in (8.30) has size
The inequality (8.40) is the key difference between the cases \(H + (3\varsigma /4) M^2\) and H (compare (8.40) with (8.37), where \(m_1\) contains the non-perturbative term \(\varepsilon ^2 c(\xi )\)).
It is sufficient to estimate \( \mathfrak R_3 \) (which has the form (7.4)) only in the s-decay norm (see (8.39)) because the next transformations will preserve it. Such norms will be used in the reducibility scheme of Sect. 8.6.
8.4 Linear Birkhoff normal form
Now we normalize the terms of order \( \varepsilon ^2 \) of \( \mathcal{L}_3 \). This step is different from the reducibility steps that we shall perform in Sect. 8.6: the Diophantine constant \(\gamma \) in (5.3) is \( \gamma = o(\varepsilon ^2 ) \), and therefore the terms of order \( \varepsilon ^2 \) are not perturbative, because \(\varepsilon ^2 \gamma ^{-1}\) is not small (in fact, it is large). The reduction of this section is possible thanks to the special form of the term \( \varepsilon ^2 \mathcal{B} \) defined in (8.41): the harmonics of \( \varepsilon ^2 \mathcal {B}\) corresponding to a possible small divisor vanish, except \(\mathcal {B}_j^j(0)\), see Lemma 20. Note that, since the previous linear transformations \( \Phi \), B, \( \mathcal{T} \) are \( O(\varepsilon ^5 \gamma ^{-2} ) \)-close to the identity, the terms of order \( \varepsilon ^2 \) in \( \mathcal{L}_3 \) are the same as in the original linearized operator.
First, we collect all the terms of order \( \varepsilon ^2 \) in the operator \( \mathcal{L}_3 \) in (8.28). We have
where \( \widetilde{d}_1, \widetilde{d}_0, {\mathfrak R}_3 \) are defined in (8.29), (8.35) and (recall (8.32))
Note that \(\mathcal{B}\) is the linear Hamiltonian vector field of \( H_{S}^\bot \) generated by the Hamiltonian \( z \mapsto \frac{3\varsigma }{2} \int _\mathbb T\bar{v}^2 z^2 \, dx \).
We transform \( \mathcal{L}_3 \) by a symplectic operator \( \Phi _2 : H_{S^\bot }^s(\mathbb T^{\nu + 1}) \rightarrow H_{S^\bot }^s(\mathbb T^{\nu + 1}) \) of the form
where \( A(\varphi ) h = {\mathop \sum }_{j,j' \in S^c} A_j^{j'}(\varphi ) h_{j'} e^{{\mathrm i}j x} \) is a Hamiltonian vector field. The map \( \Phi _2 \) is symplectic, because it is the time 1 flow of a Hamiltonian vector field. We calculate
where
Remark 7
\( R_3 \) no longer has the form (7.4). However \( R_3 = O( \partial _x^0 ) \) because \(A = O(\partial _x^{-1})\) (see Lemma 22), and therefore \(\Phi _2 - I_{H_S^\bot } = O(\partial _x^{-1})\). Moreover the matrix decay norm of \( R_3 \) is \( o(\varepsilon ^2) \).
In order to normalize the term of order \(\varepsilon ^2\) of (8.43), we expand \(A_j^{j'}(\varphi ) = \sum _{l \in \mathbb Z^\nu } A_j^{j'}(l) e^{{\mathrm i}l \cdot \varphi }\), and for each \(j, j' \in S^c\), \(l \in \mathbb Z^\nu \), we choose
This definition is well posed. Indeed, by (8.32) and (8.41),
In particular \( \mathcal{B}_{j}^{j'}(l) = 0 \) unless \( |l| \le 2 \). For \(|l| \le 2\) and \( \bar{\omega } \cdot l + j'^3 - j^3 \ne 0 \), the denominators in (8.45) satisfy
for \( \varepsilon \) small, because \( |\bar{\omega } \cdot l + j'^3 - j^3| \ge 1 \) (\(\bar{\omega } \cdot l + j'^3 - j^3\) is a nonzero integer), \( \omega = \bar{\omega }+ O(\varepsilon ^2) \) and by (8.25).
Remark 8
The operator A defined in (8.45) is Hamiltonian, because \(\mathcal {B}\) is Hamiltonian. The reason is a general fact: the denominators \( \delta _{l,j,k} := {\mathrm i}(\omega \cdot l + m_3( k^3 - j^3)) \) satisfy \( \overline{ \delta _{l,j,k} } = \delta _{-l,k,j} \) and an operator \(G(\varphi )\) is self-adjoint with respect to the \(L^2(\mathbb T)\) scalar product if and only if its matrix elements satisfy \( \overline{ G_j^k(l) } = G_k^j(-l) \), see [3, Remark 4.5]. Alternatively, we could solve the homological equation of this Birkhoff step directly for the Hamiltonian function whose flow generates \( \Phi _2 \).
By the definition (8.45), the term of order \(\varepsilon ^2\) in (8.43) is zero on the Fourier indices \((l,j,j')\) such that \(\bar{\omega }\cdot l + j'^3 - j^3 \ne 0\), while it is equal to \(\varepsilon ^2 \mathcal {B}_j^{j'}(l)\) for \((l,j,j')\) such that \(\bar{\omega }\cdot l + j'^3 - j^3 = 0\). Now we prove that the only nonzero components of \(\mathcal {B}\) that remain in (8.43) are \(\mathcal {B}_j^j(0)\).
Lemma 20
If \(\bar{\omega }\cdot l + j'^3 - j^3 = 0\) and \(\mathcal {B}_j^{j'}(l) \ne 0\), then \(l=0\) and \(j=j'\).
Proof
If \(\mathcal {B}_j^{j'}(l) \ne 0\), then, by (8.46), there exist \(j_1, j_2 \in S\) such that \(j_1 + j_2 = j - j'\) and \(\ell (j_1) + \ell (j_2) = l\). Hence, recalling (1.19) and (8.33),
This equality, together with \(j_1 + j_2 + j' - j = 0\), implies that \((j_1 + j_2) (j_1 + j') (j_2 + j') = 0\) by Lemma 4. Since \(j_1, j_2 \in S\), \(j' \in S^c\), the set S is symmetric, and \(0 \notin S\), we deduce that the factors \(j_1 + j'\) and \(j_2+j'\) are nonzero. Hence \(j_1 + j_2 = 0\), and therefore \(l=\ell (j_1) + \ell (-j_1) = 0\). \(\square \)
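The argument of Lemma 20 can be tested by brute force on a sample tangential set. The sketch below assumes that \( \bar{\omega } \cdot \ell (j) = j^3 \) for \( j \in S \) (as (1.19) and (8.33) appear to give, so that the resonance condition reads \( j_1^3 + j_2^3 + j'^3 - j^3 = 0 \)), that \( S^c = \mathbb Z\setminus (S \cup \{ 0 \}) \), and that the factorization used in the proof is the elementary identity \( a^3 + b^3 + c^3 + d^3 = -3 (a+b)(a+c)(b+c) \) for \( a + b + c + d = 0 \). The set \( S^+ = \{ 1, 3, 4 \} \) and the function names are illustrative choices, not taken from the paper.

```python
from itertools import product


def cubic_sum_identity_holds(bound=6):
    """Check a^3+b^3+c^3+d^3 = -3(a+b)(a+c)(b+c) whenever a+b+c+d = 0."""
    r = range(-bound, bound + 1)
    return all(
        a ** 3 + b ** 3 + c ** 3 + (-(a + b + c)) ** 3
        == -3 * (a + b) * (a + c) * (b + c)
        for a, b, c in product(r, repeat=3)
    )


def resonant_counterexamples(S_plus, bound=30):
    """List (j1, j2, j', j) with j1, j2 in S, j, j' in S^c, j = j1 + j2 + j',
    satisfying the resonance j1^3 + j2^3 + j'^3 - j^3 = 0 but with j1 + j2 != 0."""
    S = set(S_plus) | {-j for j in S_plus}
    found = []
    for j1, j2 in product(sorted(S), repeat=2):
        for jp in range(-bound, bound + 1):
            j = j1 + j2 + jp
            if jp == 0 or jp in S or j == 0 or j in S:
                continue  # both j and j' must lie in S^c
            if j1 ** 3 + j2 ** 3 + jp ** 3 - j ** 3 == 0 and j1 + j2 != 0:
                found.append((j1, j2, jp, j))
    return found


identity_ok = cubic_sum_identity_holds()
counterexamples = resonant_counterexamples([1, 3, 4])
```

An empty `counterexamples` list is consistent with the lemma: on resonant indices one must have \( j_1 + j_2 = 0 \), hence \( j = j' \) and \( l = 0 \). The search is finite and only illustrates the statement for one choice of S.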
Thus, the only nonzero term of order \(\varepsilon ^2\) in (8.43) is \(\mathcal {B}_j^j(0)\). By (8.46), we calculate \(\mathcal {B}_j^j(0) = {\mathrm i}j c(\xi )\), where \(c(\xi )\) is defined in (8.36). Hence, by (8.36), (8.45) and Lemma 20, the term of order \(\varepsilon ^2\) in (8.43) is
Remark 9
When \( K = H + \lambda M^2 \), \( \lambda = 3 \varsigma / 4 \), the operator in (8.41) becomes \(\mathcal {B}h = \partial _x (3 \varsigma \pi _0(\bar{v}^2) h)\). Hence \(\mathcal {B}_j^j(0) = 0\), and the right-hand side term in (8.48) is zero, namely the first step of linear Birkhoff normal form completely eliminates all the terms of order \(\varepsilon ^2\).
We now estimate the transformation A.
Lemma 21
(i) For all \(l \in \mathbb Z^\nu \), \(j,j' \in S^c\),
$$\begin{aligned} | A_j^{j'}(l)| \le C (| j | + | j' |)^{-1}, \quad | A_j^{j'}(l)|^\mathrm{lip} \le \varepsilon ^{-2} (|j| + |j'|)^{-1} . \end{aligned}$$ (8.49)
(ii) \( A_j^{j'}(l) = 0\) for all \(l \in \mathbb Z^\nu \), \(j,j' \in S^c\) such that \(|j - j'| > 2 C_S \), where \(C_S := \max \{ |j| : j \in S\}\).
Proof
(i) As already observed, for all \(|l| > 2\) one has \( \mathcal {B}_j^{j'}(l) = 0\), and therefore \( A_j^{j'}(l) = 0\). For \(|l| \le 2\), \( j \ne j' \), one has (since \( | \omega | \le |\bar{\omega }| + 1 \))
for \((j'^2 + j^2) \ge C\), for some constant C. Since also (8.47) holds, we deduce that, for all \( j \ne j' \),
On the other hand, if \( j = j' \in S^c\), and \(l \ne 0\), then \(\mathcal {B}_j^{j'}(l) = 0\), and therefore \(A_j^{j'}(l) = 0\). For \(j=j'\) and \(l=0\) we also have \(A_j^{j'}(l) = 0\) because \(\bar{\omega }\cdot l + j'^3 - j^3 = 0\). Hence (8.50) holds for all \( j, j' \). By (8.45), (8.46) and (8.50) we deduce the first bound in (8.49). The Lipschitz bound follows similarly (use also \( |j - j'| \le 2 C_S \)). (ii) follows by (8.45) and (8.46). \(\square \)
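For the reader's convenience, the elementary inequality behind the growth of the denominators can be spelled out as follows. This is only a sketch, for integers \( j \ne j' \) and \( |l| \le 2 \), with \( |\omega \cdot l| \le 2 |\omega | \) in compatible norms; the constants are not optimized and need not coincide with those of (8.50).

$$\begin{aligned} |j'^3 - j^3|&= |j' - j| \, (j'^2 + j j' + j^2) \ge j'^2 + j j' + j^2 \ge \tfrac{1}{2} (j^2 + j'^2), \\ |\omega \cdot l + m_3 (j'^3 - j^3)|&\ge \tfrac{1}{2} |m_3| (j^2 + j'^2) - 2 |\omega | \ge \tfrac{1}{4} |m_3| (j^2 + j'^2) \quad \text {if } \ j^2 + j'^2 \ge 8 |\omega | / |m_3| . \end{aligned}$$

Since \( j^2 + j'^2 \ge \frac{1}{2} (|j| + |j'|)^2 \), the denominators grow at least quadratically in \( |j| + |j'| \) for large indices.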
The previous lemma means that \( A = O(| \partial _x|^{-1})\). More precisely, we deduce the following bound.
Lemma 22
(Lemma 8.19 of [5]) \( | A \partial _x |_s^{\mathrm {Lip}(\gamma )}+ | \partial _x A |_s^{\mathrm {Lip}(\gamma )}\le C(s) \).
It follows that the symplectic map \( \Phi _2 \) in (8.42) is invertible for \( \varepsilon \) small, with inverse
By (8.43) and (8.48) we get the Hamiltonian operator
Lemma 23
There is \( \sigma = \sigma (\nu ,{\tau }) > 0 \) (possibly larger than in Lemma 19) such that
Proof
Use (8.25), (8.38), (8.39), (8.42), (8.44), and Lemma 22. \(\square \)
8.5 Space reduction at the order \( \partial _x \)
The goal of this section is to transform \( \mathcal{L}_4 \) in (8.52) so that the coefficient of \( \partial _x \) becomes constant. We conjugate \( \mathcal{L}_4 \) via a symplectic map of the form
where \(\widehat{\mathcal{S}} := \sum _{k \ge 2} \frac{1}{k!} [\Pi _S^\bot (w \partial _x^{-1})]^k \Pi _S^\bot \) and \( w :\mathbb T^{\nu +1} \rightarrow \mathbb R\) is a function. Note that the linear operator \(\Pi _S^\bot (w \partial _x^{-1}) \Pi _S^\bot \) is the Hamiltonian vector field generated by the Hamiltonian \( - \frac{1}{2} \int _\mathbb Tw (\partial _x^{-1} h)^2\,dx\), \(h \in H_S^\bot \). We calculate
where \(\tilde{R}_5\) collects all the terms of order at most \(\partial _x^0\). By (8.36), we solve \( 3 m_3 w_x\) \(+ \varepsilon ^2 c(\xi ) + \tilde{d}_1 - m_1 = 0 \) by choosing \(w := - (3 m_3)^{-1} \partial _x^{-1} ( \varepsilon ^2 c(\xi ) + \tilde{d}_1 - m_1 )\). For \( \varepsilon \) small the operator \( \mathcal{S} \) is invertible, and we get
Since \( \mathcal{S} \) is symplectic, \(\mathcal{L}_5\) is Hamiltonian (recall Definition 2). By (8.25), (8.37) and (8.38), one has \(\Vert w \Vert _s^{\mathrm {Lip}(\gamma )}\le _s \varepsilon ^7 \gamma ^{-2} + \varepsilon ^2 \Vert {\mathfrak I}_\delta \Vert _{s + \sigma }^{\mathrm {Lip}(\gamma )}\).
Lemma 24
There is \( \sigma = \sigma (\nu ,{\tau }) > 0 \) (possibly larger than in Lemma 23) such that
The remainder \(R_5\) satisfies the same estimates (8.54) as \(R_4\).
8.6 KAM reducibility and inversion of \( \mathcal{L}_{\omega } \)
The coefficients \( m_3, m_1 \) of the operator \( \mathcal{L}_5 \) in (8.56) are constants, and the remainder \( R_5 \) is a bounded operator of order \( \partial _x^0 \) with small matrix decay norm, see (8.59). Then we can diagonalize \( \mathcal{L}_5 \) by applying the iterative KAM reducibility Theorem 4.2 in [3] along the sequence of scales
$$\begin{aligned} N_n := N_0^{\chi ^n}, \quad n = 0, 1, 2, \ldots \end{aligned}$$(8.57)
In Sect. 9, the initial \( N_0 \) will (slightly) increase to infinity as \( \varepsilon \rightarrow 0 \), see (9.5). The required smallness condition (see (4.14) in [3]) is (written in the present notations)
where \( \beta := 7 {\tau }+ 6 \) (see (4.1) in [3]), \( {\tau }\) is the diophantine exponent in (5.3) and (8.63), and the constant \( C_0 := C_0 ({\tau }, \nu ) > 0 \) is fixed in Theorem 4.2 in [3]. By Lemma 24, the remainder \( R_5 \) satisfies the bound (8.54), and using (7.5) we get (recall (5.9))
We use that \( \mu \) in (7.5) is assumed to satisfy \( \mu \ge \sigma + \beta \) where \( \sigma := \sigma ({\tau }, \nu ) \) is given in Lemma 24.
Theorem 4
(Reducibility) Assume that \(\omega \mapsto i_\delta (\omega ) \) is a Lipschitz function defined on some subset \(\Omega _o \subset \Omega _\varepsilon \) (recall (5.2)), satisfying (7.5) with \( \mu \ge \sigma + \beta \), where \( \sigma := \sigma ({\tau }, \nu ) \) is given in Lemma 24 and \( \beta := 7 {\tau }+ 6 \). Then there exists \( \delta _{0} \in (0,1) \) such that, if
then:
-
(i)
(Eigenvalues) For all \( \omega \in \Omega _\varepsilon \) there exists a sequence
$$\begin{aligned} \mu _j^\infty (\omega ) := \mu _j^\infty (\omega , i_\delta (\omega )) := {\mathrm i}( - {\tilde{m}}_3 (\omega ) j^3 + {\tilde{m}}_1(\omega ) j ) + r_j^\infty (\omega ), \quad j \in S^c , \end{aligned}$$(8.61)where \( {\tilde{m}}_3, {\tilde{m}}_1\) coincide with the coefficients \(m_3, m_1\) of \( \mathcal{L}_5 \) in (8.56) for all \( \omega \in \Omega _o \), and
$$\begin{aligned} | {\tilde{m}}_3 - 1 |^{\mathrm {Lip}(\gamma )}&\le C \varepsilon ^3, \quad | {\tilde{m}}_1 - \varepsilon ^2 c(\xi ) |^{\mathrm {Lip}(\gamma )}\le C \varepsilon ^5 \gamma ^{-1}, \nonumber \\ | r^{\infty }_j |^{\mathrm {Lip}(\gamma )}&\le C \varepsilon ^{3 - 2 a} \quad \forall j \in S^c \end{aligned}$$(8.62)for some \( C > 0 \) (and \(c(\xi )\) is defined in (8.36)). All the eigenvalues \(\mu _j^{\infty }\) are purely imaginary. We define, for convenience, \(\mu _0^\infty (\omega ) := 0\).
-
(ii)
(Conjugacy) For all \(\omega \) in the set
$$\begin{aligned} \Omega _\infty ^{2\gamma } := \Omega _\infty ^{2\gamma } (i_\delta )&:= \Bigg \{ \omega \in \Omega _o : \, | {\mathrm i}\omega \cdot l + \mu ^{\infty }_j (\omega ) - \mu ^{\infty }_{k} (\omega ) | \ge \frac{2 \gamma | j^{3} - k^{3} |}{ \langle l \rangle ^{{\tau }}} \nonumber \\&\qquad \quad \forall l \in \mathbb Z^{\nu }, \ \forall j ,k \in S^c \cup \{0\} \Bigg \} \end{aligned}$$(8.63)there is a real, bounded, invertible linear operator \(\Phi _\infty (\omega ) : H^s_{S^\bot } (\mathbb T^{\nu +1}) \rightarrow H^s_{S^\bot } (\mathbb T^{\nu +1}) \), with bounded inverse \(\Phi _\infty ^{-1}(\omega )\), that conjugates \(\mathcal {L}_5\) in (8.56) to constant coefficients, namely
$$\begin{aligned} \begin{array}{ll} \mathcal{L}_{\infty }(\omega ) &{} := \Phi _{\infty }^{-1}(\omega ) \circ \mathcal {L}_5(\omega ) \circ \Phi _{\infty }(\omega ) = \omega \cdot \partial _{\varphi } + \mathcal{D}_{\infty }(\omega ), \\ \mathcal{D}_{\infty }(\omega ) &{} := \mathrm{diag}_{j \in S^c} \{ \mu ^{\infty }_{j}(\omega ) \} . \end{array} \end{aligned}$$(8.64)The transformations \(\Phi _\infty , \Phi _\infty ^{-1}\) are close to the identity in matrix decay norm, with
$$\begin{aligned} | \Phi _{\infty } - I |_{s,\Omega _\infty ^{2\gamma }}^{\mathrm{Lip}(\gamma )} + | \Phi _{\infty }^{- 1} - I |_{s,\Omega _\infty ^{2\gamma }}^{\mathrm {Lip}(\gamma )}\le _s \varepsilon ^7 \gamma ^{-3} + \varepsilon ^2 \gamma ^{-1} \Vert {\mathfrak I}_\delta \Vert _{s + \sigma }^{\mathrm {Lip}(\gamma )}. \end{aligned}$$(8.65)Moreover \(\Phi _{\infty }, \Phi _{\infty }^{-1}\) are symplectic, and \(\mathcal {L}_\infty \) is a Hamiltonian operator.
Proof
The proof closely follows the one of Theorem 4.1 in [3], which is based on Theorem 4.2, Corollaries 4.1, 4.2 and Lemmata 4.1, 4.2 of [3]. Here \(\omega \in \mathbb R^\nu \), while in [3] the parameter \(\lambda \in \mathbb R\), but Kirszbraun’s theorem on Lipschitz extension also holds in \(\mathbb R^\nu \). The bound (8.65) follows by Corollary 4.1 of [3] and the estimate of \( R_5 \) in Lemma 24 above.
To adapt the proof of [3] to the present case, the only changes in the statement of Theorem 4.2 of [3] are: \(\varepsilon ^{3-2a}\) instead of \(\varepsilon \) in (4.18) of [3], and \(\varepsilon ^{1+b}\) instead of \(\varepsilon \) in (4.23), (4.25) and (4.26) of [3]. The factor \(\varepsilon ^{1+b}\) comes from the bound for \(\partial _i R_5\), see Lemma 24 and (8.54). \(\square \)
Remark 10
Theorem 4.2 in [3] also provides the Lipschitz dependence of the (approximate) eigenvalues \( \mu _j^n \) with respect to the unknown \( i_0 (\varphi ) \), which is used for the measure estimate (Lemma 25).
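All the parameters \( \omega \in \Omega _\infty ^{2 \gamma } \) satisfy (specialize (8.63) for \( k = 0 \), recalling that \( \mu _0^\infty = 0 \))
$$\begin{aligned} | {\mathrm i}\omega \cdot l + \mu ^{\infty }_j (\omega ) | \ge \frac{2 \gamma | j |^{3}}{ \langle l \rangle ^{{\tau }}} \quad \forall l \in \mathbb Z^{\nu }, \ \forall j \in S^c , \end{aligned}$$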
and the diagonal operator \( \mathcal{L}_\infty \) is invertible.
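The invertibility can be made explicit mode by mode. The following is a sketch for the sup-norm part only (the Lipschitz part of the \( \mathrm{Lip}(\gamma ) \) norm costs additional derivatives and is not reproduced here). We write \( h_{lj}, g_{lj} \) for the Fourier coefficients of \( h, g \) with respect to \( e^{{\mathrm i}(l \cdot \varphi + jx)} \), \( l \in \mathbb Z^\nu \), \( j \in S^c \), and we assume the usual Sobolev weight \( \langle l, j \rangle ^s \) in the norm \( \Vert \ \Vert _s \). Since \( \mathcal{L}_\infty = \omega \cdot \partial _\varphi + \mathcal{D}_\infty \) is diagonal in this basis, the equation \( \mathcal{L}_\infty h = g \) reads
$$\begin{aligned} h_{lj} = \frac{g_{lj}}{{\mathrm i}\omega \cdot l + \mu _j^\infty (\omega )} , \qquad | h_{lj} | \le \frac{\langle l \rangle ^{{\tau }}}{2 \gamma |j|^3} \, | g_{lj} | \le \frac{\langle l \rangle ^{{\tau }}}{2 \gamma } \, | g_{lj} | , \end{aligned}$$
where the lower bound \( | {\mathrm i}\omega \cdot l + \mu _j^\infty (\omega ) | \ge 2 \gamma |j|^3 \langle l \rangle ^{-{\tau }} \) is (8.63) with \( k = 0 \), and \( |j| \ge 1 \) because \( 0 \notin S^c \). Hence \( \Vert h \Vert _s \le (2\gamma )^{-1} \Vert g \Vert _{s + {\tau }} \): the inverse exists, but it loses \( {\tau } \) derivatives and a factor \( \gamma ^{-1} \).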
In the following theorem we verify the inversion assumption (6.26) for \(\mathcal{L}_\omega \).
Theorem 5
(Inversion of \( \mathcal{L}_\omega \)) Assume the hypotheses of Theorem 4 and (8.60). Then there exists \( \sigma _1 := \sigma _1 ( {\tau }, \nu ) > 0 \) such that, \( \forall \omega \in \Omega ^{2 \gamma }_\infty (i_\delta )\) (see (8.63)), for any function \( g \in H^{s+\sigma _1}_{S^\bot } (\mathbb T^{\nu +1}) \) the equation \(\mathcal{L}_\omega h = g\) has a solution \(h = \mathcal{L}_\omega ^{-1} g \in H^s_{S^\bot } (\mathbb T^{\nu +1})\), satisfying
Proof
See the proof of Theorem 8.16 in [5]. \(\square \)
9 The Nash–Moser nonlinear iteration
In this section we prove Theorem 2. It will be a consequence of the Nash–Moser Theorem 6 below.
Consider the finite-dimensional subspaces
where \( N_n := N_0^{\chi ^n} \) are introduced in (8.57), and \( \Pi _n \) are the projectors (which, with a small abuse of notation, we denote with the same symbol)
where \(\Theta (\varphi ) = \sum _{l \in \mathbb Z^\nu } \Theta _l e^{{\mathrm i}l \cdot \varphi }\) and \(z(\varphi ,x) = \sum _{l \in \mathbb Z^\nu , j \in S^c} z_{lj} e^{{\mathrm i}(l \cdot \varphi + jx)}\) [for \(\Pi _n y(\varphi )\) similar definition as for \(\Pi _n \Theta (\varphi )\)]. We define \( \Pi _n^\bot := I - \Pi _n \). The classical smoothing properties hold: for all \(\alpha , s \ge 0\),
We define the constants
where \( \mu := \mu ({\tau }, \nu ) \) is the “loss of regularity” defined in Theorem 3 (see (6.35)) and \( C_1 \) is fixed below.
Theorem 6
(Nash–Moser) Assume that \( f \in C^q \) with \( q > s_0 + \beta _1 + \mu + 3 \). Let \( {\tau }\ge \nu + 2 \). Then there exist \( C_1 > \max \{ \mu _1 + \alpha , C_0 \} \) [where \( C_0 := C_0 ({\tau }, \nu ) \) is the one in Theorem 4], \( \delta _0 := \delta _0 ({\tau }, \nu ) > 0 \) such that, if
then, for all \( n \ge 0 \):
-
\((\mathcal{P}1)_{n}\) there exists a function \(({\mathfrak {I}}_n, \zeta _n) : \mathcal{G}_n \subseteq \Omega _\varepsilon \rightarrow E_{n-1} \times \mathbb R^\nu \), \(\omega \mapsto ({\mathfrak {I}}_n(\omega ), \zeta _n(\omega ))\), \( ({\mathfrak {I}}_0, \zeta _0) := 0 \), \( E_{-1} := \{ 0 \} \), satisfying \( | \zeta _n |^{\mathrm {Lip}(\gamma )}\le C \Vert \mathcal{F}(U_n) \Vert _{s_0}^{\mathrm {Lip}(\gamma )}\),
$$\begin{aligned} \Vert {\mathfrak {I}}_n \Vert _{s_0 + \mu }^{\mathrm{Lip}(\gamma )} \le C_* \varepsilon ^{b_*} \gamma ^{-1}, \quad \Vert \mathcal{F}(U_n)\Vert _{s_0 + \mu + 3}^{\mathrm{Lip}(\gamma )} \le C_*\varepsilon ^{b_*} , \end{aligned}$$(9.6)where \(U_n := (i_n, \zeta _n)\) with \(i_n(\varphi ) = (\varphi ,0,0) + {\mathfrak {I}}_n(\varphi )\). The sets \(\mathcal{G}_{n} \) are defined inductively by:
$$\begin{aligned} \mathcal{G}_{0}:= & {} \Bigg \{\omega \in \Omega _\varepsilon \, : \, |\omega \cdot l| \ge \frac{2 \gamma }{\langle l \rangle ^{{\tau }}} \, \ \forall l \in \mathbb Z^\nu {\setminus } \{0\} \Bigg \} ,\nonumber \\ \mathcal{G}_{n+1}:= & {} \Bigg \{ \omega \in \mathcal{G}_{n} \, : \, |{\mathrm i}\omega \cdot l + \mu _j^\infty ( i_n) - \mu _k^\infty ( i_n )| \ge \frac{2\gamma _{n} |j^{3}-k^{3}|}{\left\langle l\right\rangle ^{{\tau }}} \nonumber \\&\quad \forall j , k \in S^c \cup \{0\}, \ l \in \mathbb Z^{\nu } \Bigg \}, \end{aligned}$$(9.7)where \( \gamma _{n}:=\gamma (1 + 2^{-n}) \) and \(\mu _j^\infty (\omega ) := \mu _j^\infty (\omega , i_n(\omega )) \) are defined in (8.61) [and \( \mu _0^\infty (\omega ) = 0 \)]. The difference \(\widehat{\mathfrak I}_n := {\mathfrak I}_n - {\mathfrak I}_{n - 1} \) (where we set \( \widehat{\mathfrak {I}}_0 := 0 \)) is defined on \(\mathcal {G}_n\), and it satisfies
$$\begin{aligned} \Vert \widehat{\mathfrak I}_1 \Vert _{ s_0 + \mu }^{{\mathrm {Lip}(\gamma )}} \le C_* \varepsilon ^{b_*} \gamma ^{-1} , \quad \Vert \widehat{\mathfrak I}_n \Vert _{ s_0 + \mu }^{{\mathrm {Lip}(\gamma )}} \le C_* \varepsilon ^{b_*} \gamma ^{-1} N_{n - 1}^{-\alpha _1} \quad \forall n > 1. \end{aligned}$$(9.8) -
\((\mathcal{P}2)_{n}\) \( \Vert \mathcal{F}(U_n) \Vert _{ s_{0}}^{\mathrm{Lip}(\gamma )} \le C_* \varepsilon ^{b_*} N_{n - 1}^{- \alpha }\) where we set \(N_{-1} := 1\).
-
\((\mathcal{P}3)_{n}\) (High norms). \( \Vert {\mathfrak {I}}_n \Vert _{ s_{0}+ \beta _1}^{\mathrm{Lip}(\gamma )} \le C_* \varepsilon ^{b_*} \gamma ^{-1} N_{n - 1}^{\kappa } \) and \( \Vert \mathcal{F}(U_n ) \Vert _{ s_{0}+\beta _1}^{\mathrm{Lip}(\gamma )} \le C_* \varepsilon ^{b_*} N_{n - 1}^{\kappa } \).
-
\((\mathcal{P}4)_{n}\) (Measure). The measure of the “Cantor-like” sets \( \mathcal{G}_n \) satisfies
$$\begin{aligned} | \Omega _\varepsilon {\setminus } \mathcal{G}_0 | \le C_* \varepsilon ^{2(\nu - 1)} \gamma , \quad \big | \mathcal{G}_n {\setminus } \mathcal{G}_{n+1} \big | \le C_* \varepsilon ^{2(\nu - 1)} \gamma N_{n - 1}^{-1} . \end{aligned}$$(9.9)
All the Lip norms are defined on \( \mathcal{G}_{n} \), namely \(\Vert \ \Vert _s^{\mathrm{Lip}(\gamma )} = \Vert \ \Vert _{s,\mathcal {G}_n}^{\mathrm{Lip}(\gamma )}\).
Proof
To simplify notations, in this proof we denote \(\Vert \, \Vert ^{\mathrm{Lip}(\gamma )}\) by \(\Vert \, \Vert \).
Step 1: Proof of \((\mathcal{P}1, 2, 3)_0\). Recalling (5.6) we have \( \Vert \mathcal{F}( U_0 ) \Vert _s = \Vert \mathcal{F}(\varphi , 0 , 0, 0 ) \Vert _s = \Vert X_P(\varphi , 0 , 0 ) \Vert _s \le _s \varepsilon ^{5-2b} \) by Lemma 5. Hence (recall that \( b_* := 5 - 2 b \)) the smallness conditions in \((\mathcal{P}1)_0\)–\((\mathcal{P}3)_0\) hold taking \( C_* := C_* (s_0 + \beta _1) \) large enough.
Step 2: Assume that \((\mathcal{P}1,2,3)_n\) hold for some \(n \ge 0\), and prove \((\mathcal{P}1,2,3)_{n+1}\). The proof of this step closely follows Step 2 in the proof of Theorem 9.1 of [5]. We just mention the main changes: here it is convenient to define
while the corresponding quantities defined in (9.18) of [5] have \(\varepsilon \) instead of \(\varepsilon ^2\) (and then, with definition (9.10), the bounds (9.19) of [5] are also valid here without changes). In the present case, the estimates (9.20) and (9.21) of [5] for the quadratic Taylor remainder have to be adapted by replacing the factor \(\varepsilon \) with \(\varepsilon ^2\). The reason for this improvement is that the nonlinearity in the mKdV equation is cubic, whereas in the KdV equation considered in [5] the nonlinearity is just quadratic. \(\square \)
Remark 11
Since the KdV nonlinearity is quadratic while the mKdV nonlinearity is cubic, the smallness condition required in [5] for the convergence of the Nash–Moser scheme is stronger than the one for Theorem 6: it is \( \varepsilon \Vert \mathcal{F}(\varphi , 0, 0 ) \Vert _{s_0+ \mu } \gamma ^{-2} \ll 1 \) instead of \( \varepsilon ^2 \Vert \mathcal{F}(\varphi , 0, 0 ) \Vert _{s_0+ \mu } \gamma ^{-2} \ll 1 \). As a consequence, fewer steps of Birkhoff normal form are required here (namely, fewer monomials to work out in the original Hamiltonian) to reach the smallness \(\mathcal {F}(U_0) = O( \varepsilon ^{5-2b}) \), which is sufficient for the Nash–Moser scheme to converge (in [5] one needs \(\mathcal {F}(U_0) = O( \varepsilon ^{6-2b}) \)).
Step 3: Prove \((\mathcal{P}4)_n\) for all \(n \ge 0\). For all \(n \ge 0\), the difference \(\mathcal {G}_n {\setminus } \mathcal {G}_{n+1}\) is the union over \(l \in \mathbb Z^\nu \), \(j,k \in S^c \cup \{ 0 \}\) of the sets \(R_{ljk}(i_n)\), where
Since \(R_{ljk}(i_n) = \emptyset \) for \(j = k\), in the sequel we assume that \(j \ne k\).
Lemma 25
For \(n \ge 1\), \(|l| \le N_{n - 1}\), one has the inclusion \(R_{ljk}(i_n) \subseteq R_{ljk}(i_{n - 1}) \).
Proof
The proof closely follows the one of Lemma 5.2 in [3]. The differences are that here the vector \(\omega \) is not confined along a fixed direction, here we have \(N_{n-1}\) instead of \(N_n\), and the factor \(\varepsilon \) in (5.28) and (5.33) of [3] is replaced here by \(\varepsilon ^7 \gamma ^{-2} = \varepsilon ^{3-2a}\).
In the proof we use (8.25), (8.37), (8.59) and (9.8), and the bounds (4.25), (4.26) and (4.34) of [3] adapted to the present case (the bounds (4.25) and (4.26) of [3] hold here with \(\varepsilon ^{1+b}\) instead of \(\varepsilon \), as already pointed out in the proof of Theorem 4; the bound (4.34) of [3] holds here with no change). \(\square \)
By definition, \( R_{ljk} (i_n) \subseteq \mathcal{G}_n \) (see (9.11)). By Lemma 25, for \(n \ge 1\) and \( |l| \le N_{n-1} \) we also have \(R_{ljk}(i_n) \subseteq R_{ljk}(i_{n - 1}) \). On the other hand, \( R_{ljk}(i_{n-1}) \cap \mathcal{G}_{n} = \emptyset \) (see (9.7)). As a consequence, \( R_{ljk} (i_n) = \emptyset \) for all \( |l| \le N_{n-1} \), and
Lemma 26
Let \(n \ge 0\). If \(R_{ljk}(i_n) \ne \emptyset \), then \(|l| \ge C_1 |j^3 - k^3| \ge \frac{1}{2} C_1 (j^2 + k^2) \) for some constant \(C_1 > 0\) (independent of \(l,j,k,n,i_n,\omega \)).
Proof
Follow the proof of Lemma 5.3 of [3], also using (8.62). Note that \(|\omega | \le 2 |\bar{\omega }|\) for all \(\omega \in \Omega _\varepsilon \), for \(\varepsilon \) small enough, by (4.10) and (5.2). \(\square \)
Now we study the measure of the resonant sets \(R_{ljk}(i_n)\) defined in (9.11). We have to analyze in more detail the sublevels of the function
appearing in (9.11) (\(\phi \) also depends on \(l,j,k,i_n\)).
Lemma 27
There exists \(C_0 > 0\) such that for all \(j \ne k\), with \(j^2 + k^2 > C_0\), the set \(R_{ljk}(i_n)\) has Lebesgue measure \(|R_{ljk}(i_n)| \le C \varepsilon ^{2(\nu -1)} \gamma \langle l \rangle ^{-{\tau }}\).
Proof
For \(l \ne 0\), decompose \(\omega = s \hat{l} + v\), where \(\hat{l} := l / |l|\), \(s \in \mathbb R\), and \(l \cdot v = 0\) (so that \(\omega \cdot l = s |l|\)). Let \(\psi (s) := \phi (s \hat{l} + v)\). The eigenvalues \(\mu _j^\infty \) are given in (8.61). By (5.4) and (8.36), \( \varepsilon ^2 |c(\xi )|^\mathrm {lip}\le C_2 \) for some constant \(C_2 > 0 \) depending only on the set S of the tangential sites. Then, by (2.2) and (8.62),
for some \(C > 0\) and \(\varepsilon \) small enough, where, with a slight abuse of notations, we have written
for \(\varepsilon \) small enough and \(j^2 + k^2 + jk > C_0 := 12 C_2 / C_1\). As a consequence, the set \(\Delta _{ljk}(i_n) := \{ s :s \hat{l} + v \in R_{ljk}(i_n) \}\) has Lebesgue measure
for some \(C > 0\). The lemma follows by Fubini’s theorem. \(\square \)
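The measure bound for \(\Delta _{ljk}(i_n)\) rests on an elementary sublevel estimate, sketched here with placeholder constants \( c, \eta > 0 \) that are not notation from the paper. If \( \psi \) satisfies \( |\psi (s_1) - \psi (s_2)| \ge c |s_1 - s_2| \) for all \( s_1, s_2 \), then any two points of the sublevel set \( \{ |\psi | < \eta \} \) satisfy \( c |s_1 - s_2| \le |\psi (s_1)| + |\psi (s_2)| < 2 \eta \), so the set has diameter at most \( 2\eta / c \) and
$$\begin{aligned} \big | \{ s : |\psi (s)| < \eta \} \big | \le \frac{2 \eta }{c} . \end{aligned}$$
In the proof above the threshold is \( \eta = 2 \gamma _n |j^3 - k^3| \langle l \rangle ^{-{\tau }} \), as in (9.7), and we assume that the omitted displays provide a constant \( c \) of order \( |l| \). By Lemma 26, \( |j^3 - k^3| / |l| \le 1 / C_1 \), which gives a bound of order \( \gamma \langle l \rangle ^{-{\tau }} \) for each one-dimensional section. Fubini's theorem then multiplies this by the \( (\nu - 1) \)-dimensional size of the sections of \( \Omega _\varepsilon \) orthogonal to \( l \), which accounts for the factor \( \varepsilon ^{2(\nu -1)} \) if \( \Omega _\varepsilon \) has diameter of order \( \varepsilon ^2 \) (an assumption on (5.2), which is not reproduced in this section).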
Remark 12
When \( K = H + \lambda M^2 \), \( \lambda = 3 \varsigma / 4 \), using (8.40), the conclusion of Lemma 27 holds without restrictions on j, k.
It remains to estimate the measure of the finitely many resonant sets \(R_{ljk}(i_n)\) for \(j^2 + k^2 \le C_0\). Recalling (8.36) and the parity \(\xi _{-j} = \xi _j\), we write \(c(\xi ) = 6 \varsigma \mathbf {1} \cdot \xi \) where \(\mathbf {1}\) is the vector \((1, \ldots , 1) \in \mathbb R^\nu \) and \(\xi = (\xi _j)_{j \in S^+} \in \mathbb R^\nu \). Hence, by (5.4),
where \(\mathbb {A}^{-T}\) is the transpose of \(\mathbb {A}^{-1}\). We write the function \( \phi (\omega ) \) in (9.13) as
where
(and \(\tilde{m}_3, \tilde{m}_1, \xi , r_j^\infty , r_k^\infty \) all depend on \(\omega \)). By (8.62) and since \( j^2 + k^2 \le C_0 \) we deduce that \( |q_{jk}|^{\mathrm {Lip}(\gamma )}\le C \varepsilon ^{3-2a} \). Recalling (2.2) we get
so that \( \phi (\omega ) \) is a small perturbation of the affine function \( \omega \mapsto a_{jk} + b_{ljk} \cdot \omega \). By the next lemma, the hypothesis (1.12) on the tangential sites S allows us to verify that this function does not vanish identically.
Lemma 28
Assume (1.12). Then, for all \(j \ne k\) with \(j^2 + k^2 \le C_0\), one has \(a_{jk} \ne 0\).
Proof
Using formulae (1.19) and (4.11), we calculate
Hence
by assumption (1.12) on the set S. \(\square \)
Lemma 28 implies that \(\delta := \min \{ |a_{jk}| :j^2 + k^2 \le C_0, \ j \ne k \} > 0\).
Lemma 29
Assume (1.12). If \(j^2 + k^2 \le C_0\), then \(|R_{ljk}(i_n)| \le C \varepsilon ^{2(\nu -1)} \gamma \langle l \rangle ^{-{\tau }}\).
Proof
Denote \(b := b_{ljk}\) for brevity. For \( j^2 + k^2 \le C_0\), \(\omega \in R_{ljk}(i_n)\), one has, by (9.11) and (9.15),
for \(\varepsilon \) small enough. On the other hand, \(| b \cdot \omega | \le 2 | \bar{\omega }| |b|\) because \(|\omega | \le 2 |\bar{\omega }|\) (see (4.10) and (5.2)). Hence \(|b| \ge \delta _1\) where \(\delta _1 := \delta / (4 |\bar{\omega }|) > 0\). Split \(\omega = s \hat{b} + v\) where \(\hat{b} := b / |b|\) and \(v \cdot b = 0\). Let \(\psi (s) := \phi ( s \hat{b} + v )\). By (9.15), for \(\varepsilon \) small enough, we get
Then we proceed similarly as in the proof of Lemma 27. \(\square \)
The proof of (9.9) follows from Lemmata 25–29, proceeding as in [3] (see the conclusion of the proof of Theorem 5.1 in [3]).
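The final summation can be sketched as follows, with generic constants \( C \) and under the reading that, for \( n \ge 1 \), the set \( \mathcal{G}_n {\setminus } \mathcal{G}_{n+1} \) is covered by the sets \( R_{ljk}(i_n) \) with \( |l| > N_{n-1} \) (as the discussion after Lemma 25 indicates). By Lemma 26, a non-empty \( R_{ljk}(i_n) \) has \( j^2 + k^2 \le 2 |l| / C_1 \), so for each \( l \) there are at most \( C |l| \) admissible pairs \( (j,k) \). Using the bounds of Lemmata 27 and 29,
$$\begin{aligned} \big | \mathcal{G}_n {\setminus } \mathcal{G}_{n+1} \big | \le \sum _{|l| > N_{n-1}} \sum _{j,k} | R_{ljk}(i_n) | \le C \varepsilon ^{2(\nu -1)} \gamma \sum _{|l| > N_{n-1}} |l| \, \langle l \rangle ^{-{\tau }} \le C \varepsilon ^{2(\nu -1)} \gamma \, N_{n-1}^{\nu + 1 - {\tau }} . \end{aligned}$$
The last step counts roughly \( r^{\nu - 1} \) integer vectors \( l \) with \( |l| \) close to \( r \), so the series behaves like \( \sum _{r > N_{n-1}} r^{\nu - {\tau }} \), which converges for \( {\tau } > \nu + 1 \). The assumption \( {\tau }\ge \nu + 2 \) of Theorem 6 gives \( \nu + 1 - {\tau }\le -1 \), matching the factor \( N_{n-1}^{-1} \) in (9.9). For \( n = 0 \) the sum runs over all \( l \ne 0 \) and the same computation gives a bound of order \( \varepsilon ^{2(\nu -1)} \gamma \).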
Proof of Theorem 2 concluded. The conclusion of the proof of Theorem 2 follows exactly as in [5] (see “Proof of Theorem 5.1 concluded” in [5]).
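A rough count, under assumptions not stated in this section, shows why (9.9) leaves a set of frequencies of asymptotically full relative measure. Assume that \( N_n = N_0^{\chi ^n} \) with \( \chi > 1 \) and \( N_0 \ge 2 \), and that \( |\Omega _\varepsilon | \) is of size \( \varepsilon ^{2\nu } \) (this is an assumption on the definition (5.2), which is not reproduced here). Since the sets \( \mathcal{G}_n \) are decreasing, summing (9.9) over \( n \), with \( N_{-1} = 1 \), gives
$$\begin{aligned} \Big | \Omega _\varepsilon {\setminus } \bigcap _{n \ge 0} \mathcal{G}_n \Big | \le | \Omega _\varepsilon {\setminus } \mathcal{G}_0 | + \sum _{n \ge 0} \big | \mathcal{G}_n {\setminus } \mathcal{G}_{n+1} \big | \le C_* \varepsilon ^{2(\nu -1)} \gamma \Big ( 1 + \sum _{n \ge 0} N_{n-1}^{-1} \Big ) \le C \varepsilon ^{2(\nu -1)} \gamma , \end{aligned}$$
because the series \( \sum _{n \ge 1} N_0^{-\chi ^{n-1}} \) converges. Under the size assumption on \( \Omega _\varepsilon \), the relative measure of the discarded frequencies is therefore \( O( \gamma \varepsilon ^{-2} ) \), which tends to zero whenever \( \gamma = o(\varepsilon ^2) \). For instance, the identity \( \varepsilon ^7 \gamma ^{-2} = \varepsilon ^{3-2a} \) used in the proof of Lemma 25 corresponds to \( \gamma = \varepsilon ^{2+a} \), for which the relative measure is \( O(\varepsilon ^{a}) \).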
Remark 13
By Remark 12, Lemma 28 (which is the only point in the paper where assumption (1.12) is used) is not needed any more. Thus Theorem 1 applies to \( K = H + (3\varsigma /4) M^2\) without assuming hypothesis (1.12).
References
Alazard, T., Baldi, P.: Gravity capillary standing water waves. Arch. Ration. Mech. Anal. 217(3), 741–830 (2015)
Baldi, P.: Periodic solutions of fully nonlinear autonomous equations of Benjamin–Ono type. Ann. Inst. H. Poincaré (C) Anal. Non Linéaire 30, 33–77 (2013)
Baldi, P., Berti, M., Montalto, R.: KAM for quasi-linear and fully nonlinear forced perturbations of Airy equation. Math. Ann. 359, 471–536 (2014)
Baldi, P., Berti, M., Montalto, R.: KAM for quasi-linear KdV. C. R. Acad. Sci. Paris Ser. I 352, 603–607 (2014)
Baldi, P., Berti, M., Montalto, R.: KAM for autonomous quasi-linear perturbations of KdV. Ann. Inst. H. Poincaré (C) Anal. Non Linéaire. pp. 15–89. doi:10.1016/j.anihpc.2015.07.003
Baldi, P., Floridia, G., Haus, E.: Exact controllability for quasi-linear perturbations of KdV. Preprint arXiv:1510.07538
Berti, M., Biasco, P., Procesi, M.: KAM theory for the Hamiltonian DNLW. Ann. Sci. Éc. Norm. Supér. (4) 46, 301–373 (2013) (fascicule 2)
Berti, M., Biasco, P., Procesi, M.: KAM theory for the reversible derivative wave equation. Arch. Ration. Mech. Anal. 212, 905–955 (2014)
Berti, M., Bolle, P.: Quasi-periodic solutions with Sobolev regularity of NLS on \( {\mathbb{T}}^d \) with a multiplicative potential. J. Eur. Math. Soc. 15, 229–286 (2013)
Berti, M., Bolle, P.: A Nash–Moser approach to KAM theory. Fields Institute Communications. In: Hamiltonian PDEs and Applications, vol. 75. pp. 255–284
Berti, M., Corsi, L., Procesi, M.: An abstract Nash–Moser theorem and quasi-periodic solutions for NLW and NLS on compact Lie groups and homogeneous manifolds. Commun. Math. Phys. 334(3), 1413–1454 (2015)
Berti, M., Montalto, R.: KAM for gravity capillary water waves. Preprint arXiv:1602.02411
Bourgain, J.: Gibbs measures and quasi-periodic solutions for nonlinear Hamiltonian partial differential equations. In: Gelfand Math. Sem, pp. 23–43. Birkhäuser, Boston (1996)
Feola, R., Procesi, M.: Quasi-periodic solutions for fully nonlinear forced reversible Schrödinger equations. J. Differ. Equ. 259(7), 3389–3447 (2015)
Guan, H., Kuksin, S.: The KdV equation under periodic boundary conditions and its perturbations. Nonlinearity 27(9), R61–R88 (2014)
Iooss, G., Plotnikov, P.I.: Small divisor problem in the theory of three-dimensional water gravity waves. In: Mem. Am. Math. Soc. 200, vol. 940 (2009)
Iooss, G., Plotnikov, P.I., Toland, J.F.: Standing waves on an infinitely deep perfect fluid under gravity. Arch. Ration. Mech. Anal. 177(3), 367–478 (2005)
Kappeler, T., Pöschel, J.: KAM and KdV. Springer, New York (2003)
Kappeler, T., Topalov, P.: Global well-posedness of mKdV in \( L^2 (T, R)\). Commun. Partial Differ. Equ. 30(1–3), 435–449 (2005)
Klainerman, S., Majda, A.: Formation of singularities for wave equations including the nonlinear vibrating string. Commun. Pure Appl. Math. 33, 241–263 (1980)
Kuksin, S.: Hamiltonian perturbations of infinite-dimensional linear systems with imaginary spectrum. Funktsional. Anal. i Prilozhen. 21(3), 22–37, 95 (1987)
Kuksin, S.: A KAM theorem for equations of the Korteweg–de Vries type. Rev. Math. Phys. 10(3), 1–64 (1998)
Kuksin, S.: Analysis of Hamiltonian PDEs. In: Oxford Lecture Series in Mathematics and its Applications, vol. 19, pp. xii+212. Oxford University Press, Oxford (2000)
Kuksin, S., Pöschel, J.: Invariant Cantor manifolds of quasi-periodic oscillations for a nonlinear Schrödinger equation. Ann. Math. 2(143), 149–179 (1996)
Lax, P.: Development of singularities of solutions of nonlinear hyperbolic partial differential equations. J. Math. Phys. 5, 611–613 (1964)
Liu, J., Yuan, X.: A KAM theorem for Hamiltonian partial differential equations with unbounded perturbations. Commun. Math. Phys. 307(3), 629–673 (2011)
Pöschel, J.: Quasi-periodic solutions for a nonlinear wave equation. Comment. Math. Helv. 71(2), 269–296 (1996)
Procesi, M., Procesi, C.: A normal form for the Schrödinger equation with analytic non-linearities. Commun. Math. Phys. 312, 501–557 (2012)
Taylor, M.E.: Pseudodifferential operators and nonlinear PDEs. In: Progress in Mathematics. Birkhäuser, Boston (1991)
Zhang, J., Gao, M., Yuan, X.: KAM tori for reversible partial differential equations. Nonlinearity 24, 1189–1228 (2011)
Zehnder, E.: Generalized implicit function theorems with applications to some small divisors problems I–II. Commun. Pure Appl. Math. 28, 91–140 (1975) [and 29, 49–113 (1976)]
Acknowledgments
This research was supported by the European Research Council under FP7, ERC Project 306414 HamPDEs, and PRIN 2012 “Variational and perturbative aspects of nonlinear differential problems”. This research was carried out in the frame of Programme STAR, financially supported by UniNA and Compagnia di San Paolo.
Cite this article
Baldi, P., Berti, M. & Montalto, R. KAM for autonomous quasi-linear perturbations of mKdV. Boll Unione Mat Ital 9, 143–188 (2016). https://doi.org/10.1007/s40574-016-0065-1
DOI: https://doi.org/10.1007/s40574-016-0065-1