1 Introduction

The present work aims to treat the perturbations of a linear string in the framework of classical Hamiltonian field theory. The unperturbed base model we have in mind, the linear string, is described by the one-dimensional wave equation

$$\displaystyle \begin{aligned} q_{tt}\;=\;c^2q_{xx} \, , \end{aligned} $$
(1)

where \(q:\mathbb {R}\times D\to \mathbb {R}:(t,x)\mapsto q(t,x)\) is the unknown, real-valued field, and c is a real, positive parameter, the speed of the wave. As usual, partial derivatives are denoted by subscripts, i.e. \(q_t=\partial_tq\), \(q_x=\partial_xq\) and so on. Concerning the space domain D and the boundary conditions of the field q, we here focus on the 1-periodic case, namely \(D=\mathbb {T}:=\mathbb {R}/\mathbb {Z}\) (the L-periodic case, with \(D=\mathbb {R}/(L\mathbb {Z})\), can always be reduced to the case L = 1 by rescaling both the independent variables to \(x'=x/L\), \(t'=t/L\)).

Solving Eq. (1), for any initial condition q(0, x), \(q_t(0,x)\) defined on \(\mathbb {T}\) and regular enough, is a standard exercise in Fourier analysis. Indeed, substituting \(q(t,x)=\sum _{k\in \mathbb {Z}}\hat {q}_k(t)e^{\imath 2\pi k x}\) (ı is the imaginary unit) into (1), one gets

$$\displaystyle \begin{aligned} \frac{d^2\hat{q}_k}{dt^2}=-4\pi^2c^2k^2 \, \hat{q}_k\ , \end{aligned}$$

which implies \(\hat {q}_k(t)=a_ke^{\imath \omega _kt}+\bar {a}_{-k}e^{-\imath \omega _kt}\), where the \(a_k\) are complex constants (the bar denoting complex conjugation), and

$$\displaystyle \begin{aligned} \omega_k:=2\pi c |k|\ ;\ \ k\in\mathbb{Z}\ . \end{aligned} $$
(2)

Observe that \(\omega_{-k}=\omega_k\), which implies \(\overline {\hat {q}}_k=\hat {q}_{-k}\), i.e. q is real. Relation (2) defines the dispersion relation of the wave equation. A given space-periodic system, characterised by a certain dispersion relation \(k\mapsto\omega_k\), is said to be non-dispersive if \(\omega_{k+1}-\omega_k\) is piecewise constant, i.e. if \(\omega_k\) is piecewise linear in k; this is clearly the case for the wave equation. One can check that the solution q(t, x) of the problem is time periodic for all initial conditions, the period being \(2\pi/\omega_1=1/c\).
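The two claims above (reality of q and time periodicity with period 1∕c) can be checked numerically on a truncated Fourier sum. The following is only a minimal sketch: the function name `wave_solution` and the sample amplitudes are illustrative assumptions, not part of the original treatment.

```python
import cmath
import math

def wave_solution(a, c):
    """Build q(t, x) from a finite dict of complex amplitudes a[k], k integer.

    Uses q_hat_k(t) = a_k e^{i w_k t} + conj(a_{-k}) e^{-i w_k t}, w_k = 2 pi c |k|.
    Missing amplitudes are taken to be zero (an assumption of this sketch).
    """
    ks = sorted(set(a) | {-k for k in a})

    def q(t, x):
        total = 0j
        for k in ks:
            w = 2 * math.pi * c * abs(k)
            q_hat = (a.get(k, 0j) * cmath.exp(1j * w * t)
                     + a.get(-k, 0j).conjugate() * cmath.exp(-1j * w * t))
            total += q_hat * cmath.exp(2j * math.pi * k * x)
        return total

    return q

c = 2.0
q = wave_solution({1: 0.3 + 0.1j, -2: -0.2j, 3: 0.05 + 0j}, c)
# imaginary part (should vanish) and mismatch after one period 1/c (should vanish)
print(abs(q(0.37, 0.21).imag), abs(q(0.37 + 1 / c, 0.21) - q(0.37, 0.21)))
```

Both printed quantities should vanish up to rounding, for any choice of the amplitudes.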

It is almost impossible to give a complete account of the physical phenomena that, to the first linear approximation, are described by the wave equation. Let us just mention, as concrete examples that we are going to analyse later, wave propagation in fluids and long-wavelength vibrations of interacting particle chains. In all these problems, the need to go beyond the first approximation arises, in order to take into account the effects of both nonlinearity and dispersion, which typically determine whether some interesting form of energy localisation may take place, as opposed to a fast spreading of energy among the degrees of freedom of the system. One is thus led to look for a general treatment of the possible perturbations of Eq. (1), regardless of the specific physical problem giving rise to it. This in turn calls for the restriction to a mathematical context where the possible perturbations constitute a well-defined, ordered class of objects. We do this within the framework of Hamiltonian field theory, at the price of excluding, among others, all dissipative effects from the theory (no claim is made here about their irrelevance; quite the opposite: see, for example, the enlightening discussion by Nekhoroshev in [33]). Moreover, we consider nonlinear and dispersive perturbations depending on \(q_x\), p and their higher order derivatives, but not on q. Indeed, all systems made of interacting particles, such as solids, fluids and gases, in the absence of external forces and on a sufficiently large space scale, are described by a certain wave equation at the linear level, with perturbations depending, in principle, only on the space derivatives of the field (and possibly on its momentum). This is due to the fact that interactions in matter depend on differences of coordinates, which in the continuum approximation correspond to derivatives.

On the other hand, considering smooth perturbations of the wave equation depending on q (not only through derivatives) would be interesting as well. For example, as shown by Bambusi and Nekhoroshev and by Nekhoroshev [6, 7, 33], the smooth perturbations of the wave equation depending on q only (no derivatives) give rise to very nice, long-lasting localisation phenomena. Whether it is possible to include such a class of problems in our treatment, drawing meaningful conclusions, is unclear at present.

Although we decided to focus on one-dimensional systems, it is worth mentioning that the techniques presented here can be generalised to study problems in higher space dimension. In this case one can predict, for example, energy localisation for a certain class of anisotropic rectangular lattices [22].

The paper is organised as follows. In Sect. 2 we introduce the Hamiltonian formalism of classical field theory, at the end of which we provide an informal presentation of the main results. Section 3 contains the elements of perturbation theory framed in the more general context of Poisson systems, which is the one appropriate to our purposes. Section 4 contains the formal statements and proofs of the results. The application of such results to the FPU problem and to the water wave problem is treated in Sect. 5. Finally, a short list of open problems is provided in Sect. 6.

2 Outline of the Method and Results

2.1 Hamiltonian Field Theory

For the sake of completeness, we report here a short review of what is meant by Hamiltonian field theory. The reader is referred to the monographs [16, 25] and [32] for details and/or a more extensive treatment of the subject.

In Hamiltonian field theory the dynamical variables (e.g. coordinates and conjugate momenta) are points in a certain function space, the phase space of the system, and the observables, including the Hamiltonian, are functionals, admitting a density, defined on the phase space.

In order to specify the notations used below, let us first consider the space of smooth functions, or fields \(u:\mathbb {T}\to \mathbb {R}\). A functional F[u], with density \(\mathscr {F}\) depending on x and on u(x) and its derivatives up to a given order, is defined as

$$\displaystyle \begin{aligned} F[u]=\oint \mathscr{F}(x,u,u_x,u_{xx},\dots) \, dx\ , \end{aligned} $$
(3)

where here and in the sequel we make use of the shorthand notation \(\oint :=\int _{\mathbb {T}}\). The functional derivative (or variational derivative) of F with respect to u, denoted by \(\delta F/\delta u\), is defined by the relation

$$\displaystyle \begin{aligned} \delta{F}[u,\delta u]:=\frac{d}{d\epsilon}F[u+\epsilon \delta u]\big|{}_{\epsilon=0}=\oint \frac{\delta F}{\delta u}\delta u\ dx\ , \end{aligned} $$
(4)

for any smooth finite increment δu defined on \(\mathbb {T}\). Through repeated integrations by parts, and discarding the boundary terms (which vanish by periodicity), one finds

$$\displaystyle \begin{aligned} \frac{\delta F}{\delta u}=\sum_{j\geq0} (-1)^j \frac{d^j}{dx^j} \frac{\partial \mathscr{F}}{\partial (\partial_x^ju)}= \frac{\partial \mathscr{F}}{\partial u}-\frac{d}{dx}\frac{\partial \mathscr{F}}{\partial u_x}+ \frac{d^2}{dx^2}\frac{\partial \mathscr{F}}{\partial u_{xx}}+\cdots, \end{aligned} $$
(5)

the sum above being finite if \(\mathscr {F}\) is a polynomial in u and its derivatives up to a given finite order (as will be the case here). Relation (4) defines the Gateaux, or weak, differential of the functional F at u with increment δu, which under further requirements coincides with the Fréchet, or strong, differential of F; see e.g. [39]. The functional derivative is also referred to, in the mathematical literature, as the \(L_2\)-gradient of F with respect to u. Indeed, in the Hilbert space \(L_2(\mathbb {T})\) of square integrable functions on \(\mathbb {T}\), endowed with the usual scalar product \(\langle f,g\rangle :=\oint fg\ dx\), one can rewrite (4) as \(\delta F=\langle \delta F/\delta u,\delta u\rangle :=\langle \nabla F,\delta u\rangle\), identical in form to its finite-dimensional counterpart.
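As a simple illustration of formula (5), chosen here only as an example and not taken from the models treated below, consider the density \(\mathscr{F}=\frac{1}{2}(u_x)^2+\frac{1}{3}u^3\). Only the terms j = 0 and j = 1 of the sum contribute, and one finds

$$\displaystyle \begin{aligned} \frac{\partial \mathscr{F}}{\partial u}=u^2\ ,\quad \frac{\partial \mathscr{F}}{\partial u_x}=u_x \quad\Longrightarrow\quad \frac{\delta F}{\delta u}=u^2-\frac{d}{dx}u_x=u^2-u_{xx}\ . \end{aligned} $$

The same result follows directly from definition (4), since \(\frac{d}{d\epsilon}F[u+\epsilon\delta u]|_{\epsilon=0}=\oint (u_x\,\delta u_x+u^2\delta u)\,dx=\oint (u^2-u_{xx})\,\delta u\ dx\), the boundary term of the integration by parts vanishing by periodicity.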

In the Hamiltonian field theory considered in the present paper, the phase space Γ of the system is the space of two-component, smooth, real-valued fields (q(x), p(x)) defined on \(\mathbb {T}\). The observables of the theory are the functionals \(F:\varGamma \to \mathbb {R}\) admitting a density \(\mathscr {F}\) which is a polynomial in q(x), p(x) and their space derivatives up to a finite order, with coefficients possibly depending on x. One then selects, among the observables, the Hamiltonian defining the given system, namely

$$\displaystyle \begin{aligned} H[q,p]:=\oint\mathscr{H}(x,q,p,q_x,p_x,\dots)\ dx\ . \end{aligned} $$
(6)

The motion of the system, a certain curve \(\gamma:[t_1,t_2]\ni t\mapsto (q,p)(t)\in\varGamma\), is then specified by a stationary action principle, as in the finite-dimensional case. Indeed, defining the action functional S[q, p] as

$$\displaystyle \begin{aligned} S[q,p]:=\int_{t_1}^{t_2}\big[\langle p,q_t\rangle-H\big]\ dt=\int_{t_1}^{t_2}\oint \left[pq_t-\mathscr{H}\right]dt\ dx\ , \end{aligned} $$
(7)

one defines the actual motion of the system as the critical point of S in the space of smooth curves (q(t, x), p(t, x)) in Γ with fixed ends on the first component: \(q(t_1,x):=q_1(x)\), \(q(t_2,x):=q_2(x)\), \(q_1\) and \(q_2\) being two assigned fields on \(\mathbb {T}\). The smooth increment curves (δq, δp)(t) must then satisfy the condition \(\delta q(t_1,x)=\delta q(t_2,x)=0\). With the notation just introduced, and performing simple integrations by parts, one gets the differential δS of the action S, namely

$$\displaystyle \begin{aligned} \delta S=\int_{t_1}^{t_2}\oint \left[\left(q_t-\frac{\delta H}{\delta p}\right)\delta p- \left(p_t+\frac{\delta H}{\delta q}\right)\delta q\right]dt\ dx\ . \end{aligned} $$
(8)

This is zero for any increment (δq, δp)(t) if and only if the following Hamilton equations hold:

$$\displaystyle \begin{aligned} q_t=\frac{\delta H}{\delta p}\ \ ;\ \ p_t=-\frac{\delta H}{\delta q}\ . \end{aligned} $$
(9)

This is the Hamilton principle of stationary action in classical field theory.
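As a consistency check with Eq. (1), one may apply (9) to the quadratic density \(\mathscr{H}=\frac{1}{2}\left(p^2+c^2(q_x)^2\right)\); this is an assumed, standard choice for the linear string, written here only for illustration. Using formula (5),

$$\displaystyle \begin{aligned} \frac{\delta H}{\delta p}=p\ ,\quad \frac{\delta H}{\delta q}=-\frac{d}{dx}\left(c^2q_x\right)=-c^2q_{xx} \quad\Longrightarrow\quad q_t=p\ ,\ \ p_t=c^2q_{xx}\ , \end{aligned} $$

so that, eliminating p, one recovers the wave equation \(q_{tt}=c^2q_{xx}\).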

In this work, we restrict our attention to scalar fields q and p defined on the (flat) unit circle \(\mathbb {T}\). However, all the above construction and most of the results presented below can be extended to vector fields defined on any multi-dimensional space domain (not necessarily a torus).

Consider now a functional \(F[q,p]:=\oint \mathscr {F}(x,q,p,q_x,p_x,\dots )dx\). Its time derivative along the solutions of the Hamilton equations (9) associated with H is computed by means of repeated integrations by parts with respect to x. The result can be written as \(dF/dt=\left \{F,H\right \}_{q,p}\), where

$$\displaystyle \begin{aligned} \left\{F,H\right\}_{q,p}:= \oint\left(\frac{\delta F}{\delta q}\frac{\delta H}{\delta p}-\frac{\delta F}{\delta p}\frac{\delta H}{\delta q}\right)\ dx:=\left\langle\nabla F,\mathsf{J}_2\nabla H\right\rangle \end{aligned} $$
(10)

is the Poisson bracket of the functionals F and H. In the second definition above, \(\mathsf{J}_2:=\begin{pmatrix}0&1\\-1&0\end{pmatrix}\) is the standard 2 × 2 symplectic matrix, \(\nabla F:=\left(\delta F/\delta q,\ \delta F/\delta p\right)^T\), and the same for H. The product \(\xi^T\mathsf{J}_2\eta=\xi_1\eta_2-\xi_2\eta_1\), for any pair of vectors \(\xi ,\eta \in \mathbb {R}^2\), defines the symplectic 2-form. The Poisson bracket (10) defines a bilinear, skew-symmetric product on the algebra of functionals defined on Γ, and it satisfies the Jacobi identity \(\{\{F,G\}_{q,p},H\}_{q,p}+\{\{G,H\}_{q,p},F\}_{q,p}+\{\{H,F\}_{q,p},G\}_{q,p}\equiv 0\) and the Leibniz rule \(\{FG,H\}_{q,p}=F\{G,H\}_{q,p}+\{F,H\}_{q,p}G\) for any triple of functionals F, G, H. The algebra of functionals on Γ endowed with the Poisson bracket becomes a Poisson algebra and is typically referred to as the algebra of observables.

Remark 1

Given any skew-symmetric bilinear product on an algebra, the Jacobi identity characterises it as a Lie bracket. The latter, by further assuming the Leibniz rule, becomes a Poisson bracket (by definition). Thus, a Poisson algebra is a Lie algebra of Leibniz type.

The fundamental Poisson brackets of the Hamiltonian field theory on \(\mathbb {T}\) are

$$\displaystyle \begin{aligned} \{q(x),p(y)\}_{q,p}=\delta(x-y)\ \ ;\ \ \{q(x),q(y)\}_{q,p}=\{p(x),p(y)\}_{q,p}=0\ , \end{aligned} $$
(11)

where δ(x) is the Dirac delta distribution on \(\mathbb {T}\). This is proved by considering the identity \(\oint \delta (x-y)f(y)dy=f(x)\), valid for any continuous function on \(\mathbb {T}\), from which δf(x)∕δf(y) = δ(x − y) follows. As a consequence, the Hamilton equations (9) can be written in the form

$$\displaystyle \begin{aligned} q_t=\{q,H\}_{q,p}\ \ ;\ \ p_t=\{p,H\}_{q,p}\ . \end{aligned} $$
(12)

2.2 Results: Informal Presentation

Within the Hamiltonian formalism just introduced, we study a well-defined class of problems, defined as follows. We introduce a “bookkeeping parameter” λ and give a weight \(\lambda^2\) to both \(q_x\) and p, weighting any successive derivative \(\partial_x\) of them by λ. Defining \(r:=q_x\), this amounts to assuming a “grading” (perturbative ordering of the dynamical variables and their derivatives) \(r\sim p\ll r_x\sim p_x\ll r_{xx}\sim p_{xx}\dots\), and \((r_x)^2\sim r^3\), where, in a loose notation, ∼ and ≪ mean “of the same order as” and “of an order smaller than”, respectively. For the sake of simplicity, we assume the smooth density \(\mathscr {H}\) of H to be a function of \(q_x\), p and their space derivatives up to order four. Such a limitation is due to the fact that, in the present paper, we do not consider λ-expansions of the Hamiltonian H to degree higher than four, and with the chosen grading, derivatives of \(q_x\) and p of order higher than four enter the perturbative problem from degree five on (in λ). The parameter λ is formal: it is necessary to define the grading and to trace the perturbative ordering, and it can be set to one at the end of the computations.

Definition 1

The class of problems considered in the present work is defined by the family of Hamiltonians of the form

$$\displaystyle \begin{aligned} H_\lambda:=\frac{1}{\lambda^4}\oint \mathscr{H}(\lambda^2q_{x},\lambda^2p,\lambda^3q_{xx},\lambda^3p_x, \dots,\lambda^6q_{xxxxx},\lambda^6p_{xxxx})\ dx\ , \end{aligned} $$
(13)

with the condition

$$\displaystyle \begin{aligned} \left(\frac{\partial^2\mathscr{H}}{\partial q_x^2}\Big|{}_{\lambda=0}\right)\ \left(\frac{\partial^2\mathscr{H}}{\partial p^2}\Big|{}_{\lambda=0}\right)>0\ . \end{aligned} $$
(14)

By Taylor expanding \(\mathscr {H}\) in powers of λ, close to λ = 0, and assuming without loss of generality that \(\mathscr {H}|{ }_{(q,p)=0}=0\), one gets a perturbative ordering of the Hamiltonian of the form

$$\displaystyle \begin{aligned} H_\lambda=H_0+ \lambda H_1 +\lambda^2 H_2+\lambda^3 H_3 +\lambda^4 H_4 +\cdots\ . \end{aligned} $$
(15)

We here observe that the absence of a term proportional to \(1/\lambda^2\) in the latter expansion is due to the conservation of the total momentum \(\oint p\ dx\), which can always be set to zero.
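To see how the singular terms drop out, one can write the first terms of the Taylor expansion of (13) explicitly. The following is only a sketch: the symbols α, β, α′, β′ are introduced here for the first partial derivatives of \(\mathscr{H}\) at the origin, which are constants because the density in (13) does not depend explicitly on x.

$$\displaystyle \begin{aligned} H_\lambda=\frac{1}{\lambda^2}\oint \left(\alpha\, q_x+\beta\, p\right)dx+\frac{1}{\lambda}\oint \left(\alpha'\, q_{xx}+\beta'\, p_x\right)dx+H_0+O(\lambda)\ . \end{aligned} $$

In the first integral, \(\oint q_x\ dx=0\) by periodicity of q, while \(\oint p\ dx\) is the conserved total momentum, set to zero. The second integral vanishes identically, its integrand being an exact x-derivative. The leading term \(H_0\) then collects the contributions of order \(\lambda^4\) in the density, i.e. the quadratic terms in \(q_x\) and p.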

The main results are now presented in an informal way, their precise statements and proofs being provided below. The condition (14), which characterises the elliptic nature of the fixed point q = p = 0, implies that there exists a canonical transformation bringing the unperturbed Hamiltonian \(H_0\) into the standard wave form

$$\displaystyle \begin{aligned} K_0:=\oint \frac{p^2+(q_x)^2}{2}\ dx\ , \end{aligned} $$
(16)

and leaving the perturbative expansion (15) unaltered. The equations of motion associated with the latter Hamiltonian are \(q_t=p\), \(p_t=q_{xx}\), i.e. in second-order form, the wave equation \(q_{tt}=q_{xx}\).

Now, in terms of the variables \(r:=q_x\) and p, the expanded Hamiltonian (15) reads \(K_0+\lambda H_1+\lambda^2H_2+\cdots\), where \(K_0=\frac {1}{2}\oint (p^2+r^2)dx\), and the \(H_j\) are functionals whose density is a homogeneous polynomial of “grade” j in r, p and their derivatives. One then conveniently performs the change of field variables (r, p)↦(u, v) defined by \(u=(r+ p)/\sqrt {2}\), \(v=(r-p)/\sqrt {2}\), in terms of which \(K_0=\frac {1}{2}\oint (u^2+v^2)dx\), and its flow separates the left-travelling from the right-travelling wave: \(u_t=u_x\), \(v_t=-v_x\), so that u and v are simply the left and right translations of the corresponding initial data, respectively.
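The decoupling can be verified in one line. Differentiating \(q_t=p\) with respect to x and using \(p_t=q_{xx}\), the unperturbed equations in the variables (r, p) read \(r_t=p_x\), \(p_t=r_x\), whence

$$\displaystyle \begin{aligned} u_t=\frac{r_t+p_t}{\sqrt{2}}=\frac{p_x+r_x}{\sqrt{2}}=u_x\ ,\qquad v_t=\frac{r_t-p_t}{\sqrt{2}}=\frac{p_x-r_x}{\sqrt{2}}=-v_x\ . \end{aligned} $$

The solutions are \(u(t,x)=u_0(x+t)\) and \(v(t,x)=v_0(x-t)\), where \(u_0\) and \(v_0\) denote the initial data (a notation introduced here only for this check).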

The key idea is now to decouple the left from the right dynamics to higher orders. To such an end, we build up an explicit transformation of the field variables

$$\displaystyle \begin{aligned} \mathscr{T}_\lambda:\ (u,v)\mapsto(\tilde{u},\tilde{v})\ , \end{aligned}$$

λ-close to the identity, which sets the Hamiltonian \(H=K_0+\lambda H_1+\lambda^2H_2+\cdots\) (expressed in the (u, v) variables) into normal form to order s, with 1 ≤ s ≤ 4, with respect to \(K_0\). This means, by definition, that \(H\circ \mathscr {T}^{-1}_\lambda =K_0+\lambda Z_1+\lambda ^2Z_2+\cdots \) is such that the \(Z_j\) are first integrals of \(K_0\), for 1 ≤ j ≤ s.

The results proved below are the following. In the general case, i.e. with no further hypotheses added to Definition 1, we show that the normal form Hamiltonian to order s = 2 has the form \(K_0+\lambda^2Z_2+\cdots\), and the corresponding dynamics of the variables \(\tilde {u},\tilde {v}\) reads

$$\displaystyle \begin{aligned} \left\{ \begin{aligned} \tilde{u}_t\;&=\; c_l \tilde{u}_x+a_l\kappa_3(\tilde{u})+\cdots \\ \tilde{v}_t\;&=\;- c_r \tilde{v}_x-a_r\kappa_3(\tilde{v})+\cdots \end{aligned} \right.\ . \end{aligned} $$
(17)

On the other hand, in certain relevant cases, such as the “mechanical” one, where \(\mathscr {H}= p^2/2+\mathscr {U}(q_x,q_{xx},\dots ,q_{xxxxx})\), or that of the water waves, one has \(H_1=H_3\equiv 0\). In such situations the normal form Hamiltonian to order s = 4 has the form \(K_0+\lambda^2Z_2+\lambda^4Z_4+\cdots\), whose associated dynamics reads

$$\displaystyle \begin{aligned} \left\{ \begin{aligned} \tilde{u}_t\;&=\; c_l \tilde{u}_x+a_l\kappa_3(\tilde{u})+b_l\kappa_5(\tilde{u})+\cdots \\ \tilde{v}_t\;&=\;- c_r \tilde{v}_x-a_r\kappa_3(\tilde{v})-b_r\kappa_5(\tilde{v})+\cdots \end{aligned} \right.\ . \end{aligned} $$
(18)

In systems (17) and (18), \(a_{l,r}\), \(b_{l,r}\) and \(c_{l,r}\) are certain constants (depending on the model, on the parameter λ and on the initial condition), whereas \(\kappa_3\) and \(\kappa_5\) are the vector fields of the first and second integral in the KdV hierarchy [1], namely

$$\displaystyle \begin{aligned} \begin{array}{rcl} \kappa_3(w)& =&\displaystyle \gamma w w_x+w_{xxx} =\partial_x\frac{\delta I_3}{\delta w} \, , \end{array} \end{aligned} $$
(19)
$$\displaystyle \begin{aligned} \begin{array}{rcl} \kappa_5(w)& =&\displaystyle \frac{5}{6} \gamma^2 w^2 w_x+\frac{10}{3}\gamma w_x w_{xx} + \frac{5}{3} \gamma w w_{xxx} + w_{xxxxx} = \partial_x\frac{\delta I_5}{\delta w}\, . \end{array} \end{aligned} $$
(20)

Here \(\gamma \in \mathbb {R}\) is a parameter, whose value is explicitly determined by the first order normal form transformation, whereas the first two integrals \(I_3\) and \(I_5\) of the KdV hierarchy are given by

$$\displaystyle \begin{aligned} \begin{array}{rcl} I_3& =&\displaystyle \oint \left(\frac{\gamma}{6}w^3-\frac{1}{2}(w_{x})^2 \right) \, dx \, , \end{array} \end{aligned} $$
(21)
$$\displaystyle \begin{aligned} \begin{array}{rcl} I_5& =&\displaystyle \oint \left(\frac{5\gamma^2}{72} w^4 + \frac{5\gamma}{12} w^2 w_{xx} +\frac{1}{2} (w_{xx})^2 \right) \, dx \, . \end{array} \end{aligned} $$
(22)
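The identities \(\kappa_3=\partial_x\,\delta I_3/\delta w\) and \(\kappa_5=\partial_x\,\delta I_5/\delta w\) can be checked mechanically with exact rational arithmetic, applying formula (5) to the densities in (21) and (22), with the fifth-derivative term of (20) taken as \(w_{xxxxx}\). The code below is a minimal sketch: the dictionary representation of differential polynomials and all function names (`term`, `add`, `partial`, `times_jet`, `D`, `euler`) are assumptions made for this illustration only.

```python
from fractions import Fraction

N = 8  # jet variables w, w_x, ..., 7th derivative; key slot 0 holds the power of gamma

def term(c, g, *pw):
    """Monomial c * gamma^g * w^pw[0] * (w_x)^pw[1] * ... as a one-entry dict."""
    return {(g,) + tuple(pw) + (0,) * (N - len(pw)): Fraction(c)}

def add(*polys):
    """Sum of differential polynomials, dropping zero coefficients."""
    out = {}
    for p in polys:
        for k, c in p.items():
            out[k] = out.get(k, 0) + c
    return {k: c for k, c in out.items() if c != 0}

def partial(p, j):
    """Partial derivative with respect to the j-th x-derivative of w."""
    out = {}
    for k, c in p.items():
        e = k[j + 1]
        if e:
            out[k[:j + 1] + (e - 1,) + k[j + 2:]] = c * e
    return out

def times_jet(p, j):
    """Multiply by the j-th x-derivative of w."""
    return {k[:j + 1] + (k[j + 1] + 1,) + k[j + 2:]: c for k, c in p.items()}

def D(p):
    """Total x-derivative (valid as long as p does not involve the top jet variable)."""
    return add(*[times_jet(partial(p, j), j + 1) for j in range(N - 1)])

def euler(p):
    """Variational derivative, formula (5): sum_j (-1)^j D^j (dp/dw_j)."""
    out = {}
    for j in range(N):
        q = partial(p, j)
        for _ in range(j):
            q = D(q)
        out = add(out, {k: c * (-1) ** j for k, c in q.items()})
    return out

# densities of I_3 and I_5, Eqs. (21)-(22)
I3 = add(term(Fraction(1, 6), 1, 3), term(Fraction(-1, 2), 0, 0, 2))
I5 = add(term(Fraction(5, 72), 2, 4), term(Fraction(5, 12), 1, 2, 0, 1),
         term(Fraction(1, 2), 0, 0, 0, 2))

# right-hand sides of Eqs. (19)-(20)
kappa3 = add(term(1, 1, 1, 1), term(1, 0, 0, 0, 0, 1))
kappa5 = add(term(Fraction(5, 6), 2, 2, 1), term(Fraction(10, 3), 1, 0, 1, 1),
             term(Fraction(5, 3), 1, 1, 0, 0, 1), term(1, 0, 0, 0, 0, 0, 0, 1))

print(D(euler(I3)) == kappa3, D(euler(I5)) == kappa5)
```

Treating γ as an extra commuting symbol keeps the check valid for every value of the parameter.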

The conclusion is that both in the general and in the special case, the dynamics of the perturbed wave equation is integrable in the KdV hierarchy sense to the second perturbative order included.

Remark 2

The standard Hamiltonian normal form construction to leading order always leads to (17). On the other hand, in order to get (18), the second step of Hamiltonian normalisation is not enough, in general. With the aid of Hamiltonian transformations, we generally succeed in decoupling equations of motion for the two independent variables to higher orders but, in general, this is not enough to conjugate the equations of motion to those of the KdV integrable hierarchy. It is remarkable that, at this point, each of the two decoupled equations of motion falls in a class that was analysed by Kodama [26, 29,30,31] (and whose results have been extended to equations on the torus in [23]). Without entering the details, which could deserve an entire work, the idea is the following. One starts from a PDE of the form

$$\displaystyle \begin{aligned} u_t \;=\; F(u)\;:=\; F_0(u)+\lambda F_1(u)+\lambda^2 F_2(u)+O(\lambda^3) \end{aligned} $$
(23)

and one considers the effect of a change of variables \(u\mapsto u+\lambda G(u)\). Denoting with [⋅, ⋅] the commutator of two vector fields, the effect of the transformation on the RHS of the PDE (23) is

$$\displaystyle \begin{aligned} F(u) \mapsto e^{\lambda[G,\cdot]} F(u)=&F_0(u)+\lambda\big(F_1(u)+[G,F_0](u)\big)\\ +&\lambda^2 \left(F_2(u)+[G,F_1](u)+\frac{1}{2} [G,[G,F_0]](u)\right)+O(\lambda^3) \, . \end{aligned} $$
(24)

The latter conjugation of the vector field F holds in general, i.e. for any G. The Kodama transformation consists in making use of the natural grading of the KdV equation in order to choose a G consisting of a finite sum of monomials and satisfying two fundamental requirements. The first one is \([G,F_0]=0\), which leaves \(F_1\) in the KdV hierarchy, as it is given by the normal form construction. The second one consists just in “forcing” \(F_2+[G,F_1]\) to fit the KdV hierarchy, even though \(F_2\) does not. This part of the theory is only sketched in the present review and we refer to [23, 26] for details.

Remark 3

The treatment of the general case to orders s = 3 and s = 4 requires three and four perturbative steps, respectively, and is currently in progress.

3 Abstract Setting: Perturbation Theory in Poisson Systems

In order to treat our problem, we need to frame our Hamiltonian field theory in the more general context of Poisson systems [32, 36]. Such a short digression is adapted to our present purposes and does not aim at any generality.

3.1 Poisson Formalism

Definition 2

Let Γ be the phase space of the system and let \(\mathscr {A}(\varGamma )\) be the algebra of real-valued smooth functions defined on Γ. A binary application, or product, \(\{\cdot , \cdot \}:\mathscr {A}(\varGamma ) \times \mathscr {A}(\varGamma ) \to \mathscr {A}(\varGamma )\) is called a Poisson bracket on Γ if it satisfies the following properties

(i) Skew-symmetry: {F, G} = −{G, F};

(ii) Left-linearity: {αF + βG, H} = α{F, H} + β{G, H};

(iii) Jacobi identity: {F, {G, H}} + {G, {H, F}} + {H, {F, G}} = 0;

(iv) Leibniz rule: {FG, H} = F{G, H} + {F, H}G,

\(\forall F,G,H \in \mathscr {A}(\varGamma )\) and \(\alpha ,\beta \in \mathbb {R}\). The pair \((\mathscr {A},\{\ ,\})\) is called a Poisson algebra.

Remark 4

The bracket \(\{\cdot,\cdot\}_{q,p}\) defined in (10) satisfies axioms (i)-(iv) in the above definition. Thus, the axiomatic definition above contains both the usual Hamiltonian mechanics and the field theory (as well as quantum mechanics).

For the sake of concreteness, let us consider the case where Γ is the space of two-component, smooth, real-valued fields \(u(x)=(u_1(x),u_2(x))^T\) defined on \(\mathbb {T}\) (what we show can be extended to the case of n-component, complex-valued fields on a d-dimensional domain D).

By analogy with the standard case (10), a bilinear, skew-symmetric, Leibniz bracket on such a space is defined by the formula

$$\displaystyle \begin{aligned} \{F,G\}_J:= \langle \nabla F, J \nabla G \rangle:=\oint \sum_{i,j=1}^2\frac{\delta F[u]}{\delta u_i} J_{ij}[u]\frac{\delta G[u]}{\delta u_j}\ dx \ , \end{aligned} $$
(25)

where \(J_{ij}[u]\) is a tensor-valued operator, skew-symmetric with respect to the \(L_2\) scalar product 〈 , 〉, functionally dependent on u. Notice that with the choice \(J=\mathsf{J}_2\), and denoting \(u_1=q\), \(u_2=p\), (25) coincides with (10). On the other hand, the bracket (25) does not satisfy the Jacobi identity (hypothesis (iii) above), in general. We state without proof the following Proposition [32], which characterises the Poisson brackets of the form (25).

Proposition 1

The bracket (25) satisfies the Jacobi identity, so that it is a Poisson bracket, if and only if the skew-symmetric tensor J[u] satisfies the Schouten identity

$$\displaystyle \begin{aligned} \sum_{s=1}^2\left(J_{is}D_{u_s} J_{jk}+J_{js}D_{u_s} J_{ki}+J_{ks}D_{u_s} J_{ij}\right)=0 \end{aligned} $$
(26)

for all u and all i, j, k = 1, 2.

Here \(D_{u_s}\) denotes the weak partial derivative with respect to \(u_s\), defined in the usual way:

$$\displaystyle \begin{aligned} \left(D_{u_s}f\right)h:=\frac{d}{d\epsilon}f[u_s+\epsilon h]\Big|{}_{\epsilon=0}\ , \end{aligned} $$
(27)

for any f functionally dependent on u. Observe that, for example, \(D_{u_1}u_1=1\), \(D_{u_2}\partial _xu_2=\partial _x\) and so on. Thus, any skew-symmetric tensor J[u] satisfying the identity (26) is a Poisson tensor, i.e. it defines through (25) a Poisson bracket. An obvious but fundamental consequence of Proposition 1 is the following

Corollary 1

Any skew-symmetric tensor J independent of u (i.e. constant on the phase space) is a Poisson tensor.

Remark 5

One does not require J[u] to be non-degenerate, so that J is allowed to have a nontrivial kernel. The functionals F such that \(J\nabla F=0\) are called Casimir invariants of the given Poisson structure and represent constants of motion for all Hamiltonian systems: {H, F} = 0 for any \(H \in \mathscr {A}(\varGamma )\).

Within this framework, fixing a Hamiltonian H[u] in the given Poisson algebra, the associated dynamics is defined in the usual way, namely

$$\displaystyle \begin{aligned} u_t=\{u,H\}_J=J\nabla_u H\ , \end{aligned} $$
(28)

to be read by components, \(\nabla_uH\) being the functional gradient of H[u]. Of course, any functional F evolves along the solutions of (28) according to \(F_t=\{F,H\}_J\). Hamiltonian dynamical systems, in the generalised Poisson sense, have the form (28), which includes the standard (symplectic) case.

The fundamental feature of generalised Hamiltonian systems is their invariant character under any change of variables.

Proposition 2

Any smooth change of variables \(f:u\mapsto \tilde {u}=f[u]\) maps the Hamiltonian system \(u_t=J\nabla_uH\) into the Hamiltonian system \(\tilde {u}_t=\tilde {J}\nabla _{\tilde {u}} \tilde {H}\), where \(\tilde {H}=H\circ f^{-1}\), whereas the transformed Poisson tensor \(\tilde {J}\) is given by

$$\displaystyle \begin{aligned} \tilde{J}[\tilde{u}]:=(D_uf)J(D_uf)^T\Big|{}_{u=f^{-1}[\tilde{u}]}\ . \end{aligned} $$
(29)

The corresponding Poisson brackets are related, for any \(F,G\in \mathscr {A}(\varGamma )\) , by

$$\displaystyle \begin{aligned} \{F,G\}_J\circ f^{-1}=\{F\circ f^{-1},G\circ f^{-1}\}_{\tilde J}\ . \end{aligned} $$
(30)

In formula (29), \(D_uf\) denotes the weak Jacobian of f with respect to u, as defined in (27). The proof of the above Proposition is direct and is not reported. The important point is the following: if J is a Poisson tensor, its transform \(\tilde {J}\) under any f is a Poisson tensor. Of course, the Hamilton equations are not, in general, invariant in form under f: this happens if and only if \(\tilde J=J\). Canonical transformations are then defined as those transformations f leaving the Poisson tensor invariant. In order to check the canonicity of a transformation f, it is easier to make use of (30) which, with \(J=\tilde J\), yields \(\{F,G\}_J\circ f^{-1}=\{F\circ f^{-1},G\circ f^{-1}\}_J\).
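As an illustration of the transformation law (29), consider, at a formal level, the map \(f:(q,p)\mapsto (r,p)\) with \(r=q_x\), used in Sect. 2.2. This is only a sketch under two assumptions made here: the transpose in (29) is read as the \(L_2\)-adjoint, so that \((\partial_x)^T=-\partial_x\), and the lack of invertibility of f on the mean value of q is disregarded. Starting from \(J=\mathsf{J}_2\), one gets

$$\displaystyle \begin{aligned} \tilde{J}=\begin{pmatrix}\partial_x&0\\0&1\end{pmatrix} \begin{pmatrix}0&1\\-1&0\end{pmatrix} \begin{pmatrix}-\partial_x&0\\0&1\end{pmatrix} =\begin{pmatrix}0&\partial_x\\ \partial_x&0\end{pmatrix}\ . \end{aligned} $$

The tensor \(\tilde J\) is skew-symmetric and independent of the fields, hence a Poisson tensor by Corollary 1. The corresponding equations of motion (28) read \(r_t=\partial_x\,\delta H/\delta p\), \(p_t=\partial_x\,\delta H/\delta r\), which for \(H=K_0=\frac{1}{2}\oint (p^2+r^2)dx\) reduce to \(r_t=p_x\), \(p_t=r_x\), in agreement with the wave equation \(q_{tt}=q_{xx}\).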

Remark 6

If \(J=\mathsf{J}_2\), the transformation law (29), together with the canonicity condition \(\tilde J=J\), yields the requirement that the Jacobian \(D_uf\) be symplectic.

The equation of motion (28) can be rewritten as \(u_t=\mathscr {L}_Hu\), where the operator \(\mathscr {L}_{H}\cdot =\{\cdot , H \}_J\), such that \(\mathscr {L}_HF=\{F,H\}_J\) for any F, is the Lie derivative of F in the direction of the Hamiltonian vector field \(J\nabla H\). One can then formally solve the equation by exponentiation, which defines the flow \(\varPhi _H^t\) of the system, namely

$$\displaystyle \begin{aligned} u(t)=e^{t\mathscr{L}_H}w:=\varPhi^t_H(w)\ , \end{aligned} $$
(31)

where w = u(0) is an arbitrary initial condition. Of course the exponential operator above is defined, as usual, by its formal series

$$\displaystyle \begin{aligned} e^{t \mathscr{L}_H}= 1+ t\mathscr{L}_H+\frac{t^2}{2} \mathscr{L}_H^2+O(t^3) \, . \end{aligned} $$
(32)

Now, since the evolution equation \(F_t=\{F,H\}_J=\mathscr {L}_HF\) of any functional F is solved by \(e^{t\mathscr {L}_H}F(w)\), which must equal \(F[u(t)]=F[\varPhi ^t_H(w)]\) for any initial condition w, one gets the useful relation

$$\displaystyle \begin{aligned} e^{t\mathscr{L}_H}F=F\circ \varPhi^t_H\ , \end{aligned} $$
(33)

which is known as the exchange Lemma; we will make use of it below.
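The exchange Lemma (33) can be tested on a finite-dimensional toy model, which is an assumption of this sketch and not one of the field-theoretic systems treated here: the harmonic oscillator \(H=(q^2+p^2)/2\) with the standard bracket, and linear observables F = aq + bp. In this case \(\mathscr{L}_HF=\{F,H\}=ap-bq\), so \(\mathscr{L}_H\) acts on the coefficient pair as (a, b)↦(−b, a), and the series (32) can be summed numerically. The function names are illustrative.

```python
import math

def lie_series(a, b, t, terms=40):
    """Coefficients of exp(t L_H) F for F = a*q + b*p and H = (q^2 + p^2)/2.

    The series (32) is truncated after `terms` terms.
    """
    ca, cb = a, b          # coefficients of L_H^n F
    sa, sb = 0.0, 0.0      # partial sums of the series
    factor = 1.0           # t^n / n!
    for n in range(terms):
        sa += factor * ca
        sb += factor * cb
        ca, cb = -cb, ca   # one more application of L_H
        factor *= t / (n + 1)
    return sa, sb

def flow(q, p, t):
    """Exact flow of q_t = p, p_t = -q."""
    return (q * math.cos(t) + p * math.sin(t),
            p * math.cos(t) - q * math.sin(t))

a, b, t = 1.3, -0.4, 0.7
q0, p0 = 0.5, 2.0
sa, sb = lie_series(a, b, t)
lhs = sa * q0 + sb * p0          # (exp(t L_H) F)(w)
qt, pt = flow(q0, p0, t)
rhs = a * qt + b * pt            # F(Phi_H^t(w))
print(abs(lhs - rhs))
```

The printed difference should be at the level of rounding errors, for any choice of a, b, t and of the initial condition.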

The Hamiltonian flow \(\varPhi ^t_H:\varGamma \to \varGamma \) represents a one-parameter family of canonical transformations of Γ into itself (the family is a group if the flow is global).

Proposition 3

For any t such that \(\varPhi ^t_H\) exists, and any pair of functionals F and G, one has

$$\displaystyle \begin{aligned} \{F,G\}_J \circ \varPhi_H^t \;=\; \{F \circ \varPhi_H^t,G\circ \varPhi_H^t\}_J\ . \end{aligned} $$
(34)

Proof

Define Δ(t) as the difference between the left- and the right-hand side of (34), and observe that Δ(0) ≡ 0. Making use of relation (33), and of the Jacobi identity, one gets \(d\varDelta (t)/dt=\{\varDelta (t),H\}_J=\mathscr {L}_H\varDelta (t)\), whose solution is \(\varDelta (t)=e^{t\mathscr {L}_H}\varDelta (0)\equiv 0\). □

Remark 7

In the above treatment, the Hamiltonian H is arbitrary. It follows that any functional G, regarded as a Hamiltonian, generates a one-parameter family of canonical transformations, which is given by its flow \(\varPhi ^s_G=e^{s\mathscr {L}_G}\), where \(\mathscr {L}_G=\{\ ,G\}_J\). In the jargon, G is called the generating Hamiltonian, and \(\mathscr {L}_G=d\varPhi _G^s/ds|{ }_{s=0}\) the generator of the transformation.

As a final point of this section, we state a simple version of the Noether theorem in the Poisson framework.

Theorem 1

If the Hamiltonian H[u] is invariant with respect to the flow \(e^{s\mathscr {L}_K}\) of generator \(\mathscr {L}_K=\{\ ,K\}_J\), i.e. \(e^{s\mathscr {L}_K}H=H\) for any s close to zero, then \(\{H,K\}_J=0\).

Proof

The derivative of \(e^{s\mathscr {L}_K}H=H\) with respect to s, at s = 0, gives the result. □

In practice, one usually “sees” a certain symmetry of H, i.e. one is able to write down a certain transformation \(\varPsi^s\) such that \(\varPsi^0=1\) and \(H\circ\varPsi^s=H\) for any s around zero. Then, if \(\varPsi^s\) is a Hamiltonian flow, its generating Hamiltonian K is a constant of motion of the given system.

3.2 Perturbation Theory

The target of Hamiltonian perturbation theory, which goes back to Poincaré and Birkhoff, is the following. Given a Hamiltonian

$$\displaystyle \begin{aligned} H\;=\;H_0+\lambda H_1+\lambda^2 H_2 + O(\lambda^3)\ , \end{aligned} $$
(35)

formally ordered with respect to the small parameter λ, one looks for a canonical transformation, λ-close to the identity, erasing completely or in part the perturbation terms \(H_j\), j ≥ 1, up to a given order (possibly infinite, as in the KAM theory). As is well known, the complete removal of the perturbation terms, even to the first few orders, is not possible, in general. The best one can do is instead to find a canonical transformation setting H in normal form, according to the following definition.

Definition 3

The Hamiltonian \(H_0+\lambda Z_1+\cdots+\lambda^nZ_n+O(\lambda^{n+1})\) is said to be in normal form to order n ≥ 1 with respect to \(H_0\) if \(\mathscr {L}_{H_0}Z_j=\{Z_j,H_0\}=0\) for any j = 1, …, n.

Observe that \(Z_j\equiv 0\) fits the normal form requirement, which means that the definition includes the possibility of complete removal of some perturbation terms.

The canonical transformation bringing the Hamiltonian (35) into normal form with respect to \(H_0\), to order \(\lambda^2\) included, is given by composing the flows of two unknown Hamiltonians \(G_1\) and \(G_2\), namely

$$\displaystyle \begin{aligned} u\mapsto \tilde{u}= e^{-\lambda^2 \mathscr{L}_2} e^{-\lambda \mathscr{L}_1} u\ , \end{aligned} $$
(36)

where \(\mathscr {L}_j:=\mathscr {L}_{G_j}\), j = 1, 2. The inverse transformation maps the Hamiltonian (35) into

$$\displaystyle \begin{aligned} \tilde{H}=\ &e^{\lambda^2 \mathscr{L}_2} e^{\lambda \mathscr{L}_1} H= H_0+ \lambda\left(\mathscr{L}_1 H_0+H_1\right) + \\ + & \lambda^2\left( \mathscr{L}_2 H_0+\mathscr{L}_1 H_1 + \frac{1}{2} \mathscr{L}_1^2 H_0 + H_2 \right)+O(\lambda^3)\ , \end{aligned} $$
(37)

which is obtained by expanding the exponentials. The two generating Hamiltonians are then found by imposing that, according to Definition 3, the quantities

$$\displaystyle \begin{aligned} Z_1 \;&:=\; H_1+\mathscr{L}_1 H_0\ , \\ Z_2 \;&:=\; \mathscr{L}_2 H_0+\mathscr{L}_1 H_1 + \frac{1}{2} \mathscr{L}_1^2 H_0 + H_2 \end{aligned} $$
(38)

be first integrals of \(H_0\). Observing that \(\mathscr {L}_j H_0=\{H_0,G_j\}=-\mathscr {L}_{H_0}G_j\), the latter two equations for the four unknowns \(Z_j\) and \(G_j\) can be rewritten in the form

$$\displaystyle \begin{aligned} \mathscr{L}_{H_0}G_1 \;&=\; H_1-Z_1\ , \\ \mathscr{L}_{H_0}G_2 \;&=\; \mathscr{L}_1 H_1 + \frac{1}{2} \mathscr{L}_1^2 H_0 + H_2-Z_2\ . \end{aligned} $$
(39)

These equations have one and the same structure, namely

$$\displaystyle \begin{aligned} \mathscr{L}_{H_0}G_j=S_j-Z_j\ ,\ \ (j=1,2) \end{aligned} $$
(40)

with obvious definitions of the \(S_j\).

Remark 8

Looking for a transformation to an arbitrary order n, one finds at any order j = 1, …, n an equation of the form (40), where \(S_j\) is a known quantity once all the equations up to order j − 1 have been solved.

Equation (40) is known as the homological equation of order j; it has to be solved for the unknowns \(Z_j\) and \(G_j\) under the condition \(\mathscr {L}_{H_0}Z_j=0\).

In what follows we suppose that the flow \(\varPhi _{H_0}^s\) of \(H_0\) is global (i.e. it exists for all \(s\in \mathbb {R}\)) and uniformly bounded with respect to s.

Definition 4

The time average of any F along the unperturbed flow of \(H_0\) is denoted by

$$\displaystyle \begin{aligned} \langle F \rangle_0: = \lim_{t\to\infty}\frac{1}{t} \int_0^t F\circ\varPhi_{H_0}^s\ ds\ . \end{aligned} $$
(41)

If the flow of \(H_0\) is τ-periodic, i.e. \(\varPhi _{H_0}^\tau =1\), then \(\langle F\rangle _0=\frac {1}{\tau }\int _0^\tau F\circ \varPhi _{H_0}^s ds\).

Lemma 1

$$\displaystyle \begin{aligned} \mathscr{L}_{H_0}\langle F\rangle_0=0\ . \end{aligned} $$
(42)

Proof

Composing the left and right-hand side of (41) with the flow \(\varPhi _{H_0}^r\), one gets, on the right-hand side, \(\lim \frac {1}{t} \int _0^t F\circ \varPhi _{H_0}^{s+r}\ ds= \lim \frac {1}{t} \left (\int _r^0+\int _0^t+\int _t^{t+r}\right ) F\circ \varPhi _{H_0}^{a}\ da =\lim \frac {1}{t} \int _0^t F\circ \varPhi _{H_0}^{a}\ da\), since the first and the third integral stay bounded as t → ∞ and thus vanish in the limit after division by t. Thus \(\langle F\rangle _0\circ \varPhi _{H_0}^r=\langle F\rangle _0\) for any r, which implies (42), and vice versa. □

Lemma 2

The solution of the homological equation (40) is given by

$$\displaystyle \begin{aligned} Z_j=\langle S_j\rangle_0\ \ ;\ \ G_j=\langle G_j\rangle_0+\lim_{t\to\infty}\frac{1}{t} \int_0^t(s-t)e^{s\mathscr{L}_{H_0}}\left(S_j-\langle S_j\rangle_0\right)ds\ . \end{aligned} $$
(43)

If the flow of \(H_0\) is τ-periodic, \(G_j=\langle G_j\rangle _0+\frac {1}{\tau }\int _0^\tau s\ e^{s\mathscr {L}_{H_0}}\left (S_j-\langle S_j\rangle _0\right )ds\).

Proof

Applying \(e^{s\mathscr {L}_{H_0}}\) to Eq. (40), taking into account the invariance of \(Z_j\) (by the definition of normal form), and taking the time average, one gets the first of (43) in the limit. By the latter result, the homological equation becomes \(\mathscr {L}_{H_0}G_j=S_j-\langle S_j\rangle _0\). Applying \((s-t)e^{s\mathscr {L}_{H_0}}\) to the latter equation and time averaging, one gets the second of (43) in the limit. □

Remark 9

The generating Hamiltonians \(G_j\) solving the homological equation are defined up to their average along the flow of \(H_0\), i.e. up to an arbitrary constant of motion of \(H_0\). Thus, neither the normal form Hamiltonian nor the transformation leading to it is unique. In the sequel, we make the choice \(\langle G_j\rangle _0\equiv 0\).

Theorem 2 (Averaging Principle)

The canonical transformation

$$\displaystyle \begin{aligned} u\mapsto \tilde u= e^{-\lambda^2 \mathscr{L}_2} e^{-\lambda \mathscr{L}_1} u\ , \end{aligned}$$

generated by

$$\displaystyle \begin{aligned} G_1= & \lim_{t\to\infty}\frac{1}{t}\int_0^t(s-t)e^{s \mathscr{L}_{H_0}} \left(H_1-\langle H_1\rangle_0\right)ds\ ; \\ G_2= & \lim_{t\to\infty}\frac{1}{t}\int_0^t(s-t)e^{s \mathscr{L}_{H_0}} \left(S_2-\langle S_2\rangle_0\right)ds\ ; \\ S_2:= &H_2+\frac{1}{2}\left\{H_1,G_1\right\}+\frac{1}{2}\{\langle H_1\rangle_0,G_1\}\ , \end{aligned} $$
(44)

maps the perturbed Hamiltonian \(H=H_0+\lambda H_1+\lambda ^2H_2+O(\lambda ^3)\) into the normal form \(\tilde H=e^{\lambda ^2 \mathscr {L}_2} e^{\lambda \mathscr {L}_1} H=H_0+\lambda Z_1+\lambda ^2 Z_2+O(\lambda ^3)\), explicitly given by

$$\displaystyle \begin{aligned} \tilde H=H_0+\lambda \langle H_1\rangle_0+ \lambda^2\left( \langle H_2\rangle_0 +\frac{1}{2}\langle \{H_1,G_1\}\rangle_0\right)+O(\lambda^3)\ . \end{aligned} $$
(45)

Proof

By Lemma 2, solving the first of the homological equations (39) yields \(Z_1\) and \(G_1\). By substituting \(\mathscr {L}_1H_0=Z_1-H_1=\langle H_1\rangle _0-H_1\) into the right-hand side of the second of the homological equations (39), one gets the latter in the form \(\mathscr {L}_{H_0}G_2=S_2-Z_2\), with \(S_2\) as in (44). Solving by Lemma 2 again yields \(Z_2\) and \(G_2\). □

Remark 10

As a matter of fact, in order to get the normal form Hamiltonian (45), one does not need to compute \(G_2\). This is a general fact: \(Z_{j+1}\) depends only on \(G_1,\dots ,G_j\).
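As a finite-dimensional illustration of Lemma 2 and Theorem 2, the following sketch treats a toy model that is not taken from the text: one degree of freedom with the standard bracket \(\{F,G\}=F_qG_p-F_pG_q\), \(H_0=(q^2+p^2)/2\), \(H_1=q^3\) and \(H_2=0\). The flow of \(H_0\) is a rotation of period 2π, the averages are computed by quadrature and \(\partial G_1/\partial p\) by finite differences. All function names are choices made here; the closed forms \(G_1=-q^2p-\frac {2}{3}p^3\) and \(Z_2=-\frac {15}{16}(q^2+p^2)^2\) used for comparison were worked out by hand for this toy model only.

```python
import math

TAU = 2.0 * math.pi  # period of the flow of H0 = (q^2 + p^2) / 2


def flow(q, p, s):
    """Flow of H0 at time s: q' = p, p' = -q (a rotation in the (q, p) plane)."""
    c, sn = math.cos(s), math.sin(s)
    return q * c + p * sn, -q * sn + p * c


def h1(q, p):
    """Toy perturbation H1 = q^3 (hypothetical example)."""
    return q ** 3


def simpson(f, a, b, n=400):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def average(F, q, p, n=64):
    """Average over one period; the rectangle rule is exact for low harmonics."""
    return sum(F(*flow(q, p, TAU * i / n)) for i in range(n)) / n


def g1(q, p):
    """G1 = (1/tau) int_0^tau s (H1 - <H1>) o flow^s ds (periodic case of Lemma 2)."""
    z1 = average(h1, q, p)
    return simpson(lambda s: s * (h1(*flow(q, p, s)) - z1), 0.0, TAU) / TAU


def bracket_h1_g1(q, p, eps=1e-4):
    """{H1, G1} = dH1/dq * dG1/dp - dH1/dp * dG1/dq, with dH1/dp = 0 here."""
    dg1_dp = (g1(q, p + eps) - g1(q, p - eps)) / (2.0 * eps)
    return 3.0 * q ** 2 * dg1_dp


def z2(q, p):
    """Z2 = <H2> + (1/2) <{H1, G1}> as in (45), with H2 = 0; G2 is never needed."""
    return 0.5 * average(bracket_h1_g1, q, p)


q0, p0 = 0.6, -0.3
g1_closed = -q0 ** 2 * p0 - 2.0 * p0 ** 3 / 3.0
z2_closed = -15.0 / 16.0 * (q0 ** 2 + p0 ** 2) ** 2
print(abs(g1(q0, p0) - g1_closed), abs(z2(q0, p0) - z2_closed))
```

Both printed differences should be small (quadrature and finite-difference error only). Note that the second-order term is obtained without ever computing \(G_2\).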

4 Hamiltonian Field Theory Close to \(q_{tt}=q_{xx}\)

We now come back to our problem and solve it by applying all the tools introduced in the previous section.

Let us start by considering a Hamiltonian \(H=\oint \mathscr {H} dx\), whose density \(\mathscr {H}\) does not depend explicitly on t and x and is an analytic function of \(q_x\), p and their spatial derivatives up to a certain finite order, in a neighbourhood of the origin. Since \(\mathscr {H}\) is invariant under time, space and q translations, Theorem 1 (Nöther) applies.

Proposition 4

\(H=\oint \mathscr {H} dx\), \(I=\oint q_xp\ dx\) and \(P=\oint p\ dx\) are the three first integrals corresponding to the symmetries \(t\mapsto t+s\), \(x\mapsto x+s\) and \(q\mapsto q+s\), respectively. Moreover, {I, P} = 0, so that the three first integrals are in involution.

Proof

The conservation of H is obvious. The Hamilton equations for I at time s are \(q_s=q_x\) and \(p_s=p_x\), whose solution is q(t, x + s) and p(t, x + s), clearly corresponding to the x-translation. The Hamilton equations for P are \(q_s=1\), \(p_s=0\), solved by q(t, x) + s and p(t, x), corresponding to the q-translation. Finally, observe that \(\{I,P\}_{q,p}=\oint (\delta I/\delta q) (\delta P/\delta p)dx = -\oint p_x\ dx = 0\). □

Remark 11

One can always restrict the dynamics to the submanifold \(P=\oint p\ dx=0\) by the canonical transformation q = q′, p = P + p′.

For the sake of convenience, we repeat below the definition of the class of Hamiltonian functionals considered, with the appropriate grading.

Definition 5

The perturbative ordering of the Hamiltonian H is defined by the following scaling:

$$\displaystyle \begin{aligned} H_\lambda:=\frac{1}{\lambda^4}\oint \mathscr{H}(\lambda^2q_{x},\lambda^2p,\lambda^3q_{xx},\lambda^3p_x,\dots,\lambda^6q_{xxxxx},\lambda^6p_{xxxx})\ dx\ . \end{aligned} $$
(46)

By Taylor expanding in powers of λ, close to λ = 0, assuming without loss of generality that \(\mathscr {H}|{ }_{(q,p)=0}=0\), and taking into account Remark 11, one gets

$$\displaystyle \begin{aligned} H_\lambda=H_0+ \lambda H_1 +\lambda^2 H_2+\lambda^3 H_3 +\lambda^4 H_4 +\cdots\ , \end{aligned} $$
(47)

where

$$\displaystyle \begin{aligned} H_0=\oint \frac{a p^2+ b (q_x)^2}{2} dx +c I \end{aligned} $$
(48)

with a, b and c some constants and \(I=\oint q_xp\ dx\);

$$\displaystyle \begin{aligned} H_1=\oint d_1 q_xp_x\ dx\ ; \end{aligned} $$
(49)
$$\displaystyle \begin{aligned} \begin{array}{rcl} H_2=\oint& &\displaystyle \left[ e_1 (q_x)^3 +e_2 p^3 +e_3 (q_x)^2p +e_4 q_xp^2 + e_5(q_{xx})^2 +\right.\\ +& &\displaystyle \left. e_6 (p_x)^2+e_7 q_{xx}p_x\right]\ dx\ ; \end{array} \end{aligned} $$
(50)
$$\displaystyle \begin{aligned} H_3=\oint\left[f_1 (q_x)^2p_x+ f_2 q_{xx}p^2 +f_3 q_{xx}p_{xx} \right]\ dx \, ; \end{aligned} $$
(51)
$$\displaystyle \begin{aligned} \begin{array}{rcl} {} H_4=\oint& &\displaystyle \left[g_1(q_x)^4+g_2p^4+g_3 (q_x)^2p^2+g_4(q_x)^3p+g_5q_xp^3\right. +\\ +& &\displaystyle \left. g_6(q_{xx})^2q_x + g_7(q_{xx})^2p+g_8(p_x)^2q_x+g_9(p_x)^2p \right. +\\ +& &\displaystyle \left. g_{10} q_{xxx} p^2+g_{11}(q_x)^2p_{xx}+g_{12}(q_{xxx})^2+g_{13}(p_{xx})^2 \right. +\\ +& &\displaystyle \left. g_{14}q_{xxxx}p_x\right]\ dx\ , \end{array} \end{aligned} $$
(52)

and so on. Here \(d_1,e_1,\dots ,g_{14}\) are given constants.
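The grading behind (47)–(52) can be read off directly from the scaling (46): each factor \(\partial _x^nq\) (n ≥ 1) carries \(\lambda ^{n+1}\), each factor \(\partial _x^np\) (n ≥ 0) carries \(\lambda ^{n+2}\), and the prefactor removes \(\lambda ^4\). The short sketch below encodes this bookkeeping; the function name and the encoding of monomials as lists of (field, number of derivatives) pairs are conventions introduced here for illustration.

```python
def lambda_order(monomial):
    """Power of lambda carried by a monomial of the density under the scaling (46).

    monomial: list of (field, n) pairs, n = number of x-derivatives of the field.
    """
    weight = 0
    for field, n in monomial:
        if field == "q" and n >= 1:
            weight += n + 1          # q_x -> lambda^2, q_xx -> lambda^3, ...
        elif field == "p" and n >= 0:
            weight += n + 2          # p -> lambda^2, p_x -> lambda^3, ...
        else:
            raise ValueError("unsupported factor: %r" % ((field, n),))
    return weight - 4                # overall prefactor 1 / lambda^4


terms = {
    "p^2": [("p", 0), ("p", 0)],
    "q_x p": [("q", 1), ("p", 0)],
    "q_x p_x": [("q", 1), ("p", 1)],
    "(q_x)^3": [("q", 1)] * 3,
    "(q_xx)^2": [("q", 2)] * 2,
    "q_xx p_xx": [("q", 2), ("p", 2)],
    "(q_x)^4": [("q", 1)] * 4,
    "q_xxxx p_x": [("q", 4), ("p", 1)],
}
orders = {name: lambda_order(m) for name, m in terms.items()}
for name, order in orders.items():
    print(name, "-> lambda^%d" % order)
```

Each sample monomial lands in the \(H_j\) in which it is listed above: order 0 for the terms of (48), 1 for (49), 2 for (50), 3 for (51) and 4 for (52).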

Remark 12

Since \(\mathscr {H}\) is independent of x, the density of each \(H_j\) is independent of x. It follows that \(\{I,H_j\}=0\) for any j ≥ 0.

Proposition 5

If the constants \(a:=\partial ^2\mathscr {H}/\partial p^2|{ }_0\) and \(b:=\partial ^2\mathscr {H}/\partial (q_x)^2|{ }_0\) appearing in (48) are different from zero and have the same sign, there exists a time-dependent canonical transformation which brings the Hamiltonian \(H_0\) into the canonical wave equation form \(K_0=\frac {1}{2}\oint [p^2+(q_x)^2]dx\) and preserves the structure of the perturbations \(H_j\) to any order j ≥ 0.

Proof

Let a = σ|a| and b = σ|b|, with σ = ±1. One first performs the canonical rescaling \(q=\sqrt {|a|}\ q'\), \(p=\sqrt {|b|}\ p'\), H = σ|ab|H′, t = σt′, which brings \(H_0\) into \(K_0+c'I\), where \(c'=\sigma c/\sqrt {|ab|}\). Then one performs the transformation \((q',p')=\varPhi _{c'I}^t(q'',p'')=\varPhi _I^{c't}(q'',p'')\), where \(\varPhi _I^t\) denotes the flow of \(I=\oint q_xp\ dx\). The latter transformation is canonical and erases c′I. Clearly, neither transformation changes the structure of any \(H_j\) or the value of the coefficients of the Hamiltonians \(H_1,\dots ,H_4\). Observe that the flow of I is the left translation of (q, p), so that it is global and preserves the regularity of the initial condition. □

Remark 13

Consider \(K_0+\lambda H_1=\frac {1}{2}\oint [p^2+(q_x)^2+2\lambda d_1q_xp_x]dx\). Its Hamilton equations read

$$\displaystyle \begin{aligned} q_t=p-\lambda d_1q_{xx}\ \ ;\ \ p_t=q_{xx}+\lambda d_1p_{xx}\ . \end{aligned}$$

Both q and p satisfy the linear Boussinesq equation

$$\displaystyle \begin{aligned} u_{tt}=u_{xx}+(\lambda d_1)^2u_{xxxx}\ . \end{aligned}$$
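The passage from the first-order system to the linear Boussinesq equation can be checked mode by mode. On a Fourier mode \(e^{\imath kx}\) the Hamilton equations above become a 2 × 2 linear system with matrix M, and the second-order equation holds for both q and p exactly when \(M^2\) equals the Boussinesq symbol \(-k^2+(\lambda d_1)^2k^4\) times the identity. The sketch below verifies this numerically; the function names and the parameter values are illustrative choices.

```python
import math


def mode_matrix(k, lam, d1):
    """Matrix of (q_t, p_t) = M (q, p) on the mode exp(i k x).

    q_xx -> -k^2 q, so q_t = p + mu q and p_t = -k^2 q - mu p, mu = lam * d1 * k^2.
    """
    mu = lam * d1 * k * k
    return [[mu, 1.0], [-k * k, -mu]]


def square(m):
    """Square of a 2 x 2 matrix given as nested lists."""
    (a, b), (c, d) = m
    return [[a * a + b * c, a * b + b * d], [c * a + d * c, c * b + d * d]]


def boussinesq_symbol(k, lam, d1):
    """Fourier symbol of u_xx + (lam d1)^2 u_xxxx."""
    return -k * k + (lam * d1) ** 2 * k ** 4


lam, d1 = 0.1, 0.7  # illustrative values
max_error = 0.0
for n in range(1, 6):
    k = 2.0 * math.pi * n
    m2 = square(mode_matrix(k, lam, d1))
    sym = boussinesq_symbol(k, lam, d1)
    max_error = max(max_error, abs(m2[0][0] - sym), abs(m2[1][1] - sym),
                    abs(m2[0][1]), abs(m2[1][0]))
print(max_error)
```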

The condition on a and b in Proposition 5 above identifies the elliptic fixed points in the given class of Hamiltonians. One is then left with the problem of simplifying the dynamics of \(K_0+\lambda H_1+\lambda ^2H_2+\cdots \). The perturbations of the various orders have the structure listed above and no further simplification can be made, in general. However, there is a relevant class of Hamiltonians that display a much simpler structure, namely the class of mechanical Hamiltonians of the form \(\mathscr {H}=p^2/2+\mathscr {U}\), where \(\mathscr {U}\) depends only on \(q_x\) and its derivatives. Such Hamiltonians usually arise as the continuum limit of some lattice system, the notable case being just that of the vibrating string.

Proposition 6

Suppose that \(\mathscr {H}=p^2/2+\mathscr {U}(q_x,q_{xx},\dots ,q_{xxxxx})\). Then, if the condition \(b:=\partial ^2\mathscr {U}/\partial (q_x)^2|{ }_0>0\) holds, \(H_0\) can be brought into the canonical wave form \(K_0\), \(H_1=H_3\equiv 0\), and

$$\displaystyle \begin{aligned} H_2=\oint \left[\alpha_1 (q_x)^3 +\alpha_2(q_{xx})^2\right]\ dx\ ; \end{aligned}$$
$$\displaystyle \begin{aligned} H_4=\oint\left[\beta_1(q_x)^4+\beta_2(q_{xx})^2q_x+\beta_3(q_{xxx})^2\right]\ dx\ . \end{aligned}$$

Proof

The momentum p cannot appear outside \(H_0\), by definition. Notice that in this case there is no term proportional to I in \(H_0\). □

In the latter significant case one can obviously rename \(H_2\to H_1\), \(H_4\to H_2\) and \(\lambda ^2\to \lambda \).

4.1 Traveling Waves

The equations of motion associated with \(K_0=\oint \frac {p^2+(q_x)^2}{2}\ dx\) reduce to the wave equation for the field q:

$$\displaystyle \begin{aligned} q_t=p\ \ ;\ \ p_t=q_{xx}\ , \qquad \Longleftrightarrow \qquad q_{tt}=q_{xx} \, . \end{aligned} $$
(53)

In order to simplify the analysis of perturbations of the wave equation, it is convenient to perform a change of variables that maps the functions (q, p) into the Riemann invariants (u, v):

$$\displaystyle \begin{aligned} u \;=\; \dfrac{q_x+p}{\sqrt{2}}\ \ ;\ \ v \;=\; \dfrac{q_x-p}{\sqrt{2}}\ . \end{aligned} $$
(54)

The equations of motion for u and v are the left and right translation equation, respectively:

$$\displaystyle \begin{aligned} \begin{cases} u_t=u_x \\ v_t=-v_x \end{cases} \, . \end{aligned} $$
(55)

Indeed, the solution of the above system corresponding to the initial condition (u 0(x), v 0(x)) is (u 0(x + t), v 0(x − t)), i.e. a rigid translation of the initial profiles. The flow of the wave equation, that is used to compute normal forms, is particularly manageable in these new variables, being a left translation for u and a right translation for v (at positive times).
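The rigid translation of the Riemann invariants can be made concrete on an explicit solution of the wave equation. The sketch below takes the d'Alembert solution q(t, x) = F(x + t) + G(x − t) with the hypothetical profiles \(F(y)=\sin (2\pi y)/(2\pi )\) and \(G(y)=\cos (4\pi y)/(4\pi )\), builds u and v from (54) with \(p=q_t\), and checks that u(t, x) = u(0, x + t) and v(t, x) = v(0, x − t). The profiles and function names are illustrative assumptions.

```python
import math

SQRT2 = math.sqrt(2.0)


def q_x(t, x):
    """x-derivative of q = F(x + t) + G(x - t) for the sample profiles."""
    return math.cos(2.0 * math.pi * (x + t)) - math.sin(4.0 * math.pi * (x - t))


def p(t, x):
    """Momentum p = q_t = F'(x + t) - G'(x - t)."""
    return math.cos(2.0 * math.pi * (x + t)) + math.sin(4.0 * math.pi * (x - t))


def u(t, x):
    """Riemann invariant u = (q_x + p) / sqrt(2)."""
    return (q_x(t, x) + p(t, x)) / SQRT2


def v(t, x):
    """Riemann invariant v = (q_x - p) / sqrt(2)."""
    return (q_x(t, x) - p(t, x)) / SQRT2


t = 0.37
grid = [i / 20.0 for i in range(20)]
translation_error = max(
    max(abs(u(t, x) - u(0.0, x + t)), abs(v(t, x) - v(0.0, x - t))) for x in grid
)
print(translation_error)
```

For this solution u carries only the left-moving profile and v only the right-moving one, which is why the normal form computations below reduce to averages over translations.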

The change of variables (54) is not canonical and it maps the standard Poisson tensor \(J_2\) into the Gardner tensor [24]

$$\displaystyle \begin{aligned} J\;=\; \begin{pmatrix} \partial_x & 0 \\ 0 & -\partial_x \end{pmatrix} \, . \end{aligned} $$
(56)

In particular, as can be checked, formula (29) for the transformation (54) reads

$$\displaystyle \begin{aligned} D_{q,p}(u,v)\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} D_{q,p}^T(u,v)= \begin{pmatrix} \partial_x & 0 \\ 0 & -\partial_x \end{pmatrix}\ . \end{aligned}$$

The Hamiltonian \(K_0\), expressed in terms of (u, v), reads \(K_0=\oint \frac {u^2+v^2}{2}\ dx\), so that the translation equations for u and v are the Hamilton equations associated with \(K_0\) in the Gardner structure.

The explicit expression of the Hamiltonians (48)–(52) in the (u, v) variables is:

$$\displaystyle \begin{aligned} K_0\;=\;\oint \frac{u^2+v^2}{2} \, dx \, ; \end{aligned} $$
(57)
$$\displaystyle \begin{aligned} H_1\;=\; \oint \frac{d_1}{\sqrt{|ab|}} u v_x \, dx \, ; \end{aligned} $$
(58)
$$\displaystyle \begin{aligned} H_2 \;=\; \oint &\Big\{ \frac{1}{2^{3/2}} \Big[ \big({\textstyle \frac{e_1}{|b|{}^{3/2}}+\frac{e_2}{|a|{}^{3/2}}+\frac{e_3}{|b| \sqrt{|a|}}+\frac{e_4}{|a| \sqrt{|b|}}}\big)u^3 \\ &+\big({\textstyle \frac{e_1}{|b|{}^{3/2}}-\frac{e_2}{|a|{}^{3/2}}-\frac{e_3}{|b| \sqrt{|a|}}+\frac{e_4}{|a| \sqrt{|b|}}}\big)v^3\\ &+\big({\textstyle \frac{3 e_1}{|b|{}^{3/2} }-\frac{3 e_2}{|a|{}^{3/2}}+\frac{e_3}{|b|\sqrt{|a|}}-\frac{e_4}{|a|\sqrt{|b|}}} \big)u^2 v \\ &+\big({\textstyle \frac{3 e_1}{|b|{}^{3/2} }+\frac{3 e_2}{|a|{}^{3/2}}-\frac{e_3}{|b|\sqrt{|a|}}-\frac{e_4}{|a|\sqrt{|b|}}} \big)u v^2 \Big]+ \\ &+\frac{1}{2} \Big[ \big({\textstyle \frac{e_5}{|b|} + \frac{e_6}{|a|} + \frac{e_7}{\sqrt{|ab|}}} \big)u_x^2 +\\ &+\big({\textstyle \frac{e_5}{|b|} + \frac{e_6}{|a|} - \frac{e_7}{\sqrt{|ab|}}} \big)v_x^2 \Big] \Big\} \, dx \, . \end{aligned} $$
(59)

4.2 The Generic Case

In order to perform a canonical transformation as stated in Proposition 2, one has to compute time averages, as required in Theorem 2. General formulas applying to the case of an unperturbed flow consisting of left/right translations are provided in the next lemma.

Lemma 3

Suppose that f and g are continuous functions on \(\mathbb {T}\) . Then

$$\displaystyle \begin{aligned} \begin{array}{rcl} \oint \oint f(x\pm s) \, dx \, ds & =&\displaystyle \oint f(x) \, dx \ ; {} \end{array} \end{aligned} $$
(60)
$$\displaystyle \begin{aligned} \begin{array}{rcl} \oint \oint f(x\pm s) g(x \mp s) \, dx \, ds & =&\displaystyle \oint f(x) \, dx \oint g(y) \, dy \ ;{} \end{array} \end{aligned} $$
(61)
$$\displaystyle \begin{aligned} \int_0^1 \oint s \, f(x\pm s) g(x \mp s) \, dx \, ds \;=\; \frac{1}{2} \oint f(x) \, dx \oint g(y) \, dy \pm \frac{1}{2} \oint g(x) \, \partial_x^{-1} f(x) \, dx \ , \end{aligned} $$
(62)

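where \(\partial _x^{-1}f(x)\) denotes the unique primitive of f with zero average on \(\mathbb {T}\) (if f has non-zero average, the zero-average primitive of \(f-\oint f\, dx\) is understood, in agreement with the Fourier multiplier used in the proof).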

Proof

All these proofs consist of straightforward computations in Fourier space. First, we prove (61):

$$\displaystyle \begin{aligned} \oint \oint f(x\pm s) g(x \mp s) \, dx \, ds \;&=\; \int_0^1 \int_0^1 \sum_{k,k' \in \mathbb{Z}} \hat{f}_k \hat{g}_{k'} e^{2 \pi \imath k(x \pm s)} e^{2 \pi \imath k' (x \mp s)} \, dx \, ds \\ &=\;\sum_{k,k' \in \mathbb{Z}} \hat{f}_k \hat{g}_{k'} \delta_{k+k',0} \delta_{k-k',0} \;=\; \hat{f}_0 \hat{g}_0 \, . \end{aligned}$$

From here, (60) follows by choosing g = 1. In order to prove (62), we Fourier transform the LHS:

$$\displaystyle \begin{aligned} \int_0^1 \oint s \, f(x\pm s) g(x \mp s) \, dx \, ds \;=\; \sum_{k \in \mathbb{Z}} \hat{f}_k \hat{g}_{-k} \int_0^1 s e^{\pm 4 \pi \imath k s } \, ds\ . \end{aligned}$$

It remains to notice that

$$\displaystyle \begin{aligned} \int_0^1 s e^{\pm 4 \pi \imath k s } \, ds \;=\; \delta_{k,0} \int_0^1 s \, ds + (1-\delta_{k,0}) \int_0^1 s e^{\pm 4 \pi \imath k s } \, ds = \frac{1}{2} \delta_{k,0} \pm \frac{1}{2} \frac{1}{2 \pi \imath k}(1-\delta_{k,0}) \end{aligned}$$

and to recognise that 1∕(2πık) is the Fourier-multiplier corresponding to the operator \(\partial _x^{-1}\). □
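Formulas (61) and (62) are easy to test numerically on trigonometric polynomials. In the sketch below f and g are sample functions chosen here, `primitive_f` is the zero-average primitive of f minus its mean (written by hand for this particular f), the integrals over \(\mathbb {T}\) use the rectangle rule (exact for low harmonics) and the integral in s of the non-periodic integrand in (62) uses Simpson's rule. All names are illustrative.

```python
import math

TWO_PI = 2.0 * math.pi


def f(x):
    return 1.0 + math.cos(TWO_PI * x) + 0.5 * math.sin(2.0 * TWO_PI * x)


def g(x):
    return 2.0 - math.sin(TWO_PI * x) + 0.3 * math.cos(2.0 * TWO_PI * x)


def primitive_f(x):
    """Zero-average primitive of f - mean(f), i.e. the operator d_x^{-1} applied to f."""
    return math.sin(TWO_PI * x) / TWO_PI - 0.5 * math.cos(2.0 * TWO_PI * x) / (2.0 * TWO_PI)


def circle_integral(h, n=32):
    """Integral over the unit circle by the rectangle rule."""
    return sum(h(i / n) for i in range(n)) / n


def simpson(h, a, b, n=400):
    """Composite Simpson rule with n (even) subintervals."""
    step = (b - a) / n
    total = h(a) + h(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * h(a + i * step)
    return total * step / 3.0


def lhs_61(sign):
    return circle_integral(
        lambda s: circle_integral(lambda x: f(x + sign * s) * g(x - sign * s)))


def lhs_62(sign):
    return simpson(
        lambda s: s * circle_integral(lambda x: f(x + sign * s) * g(x - sign * s)),
        0.0, 1.0)


def rhs_62(sign):
    means = circle_integral(f) * circle_integral(g)
    return 0.5 * means + sign * 0.5 * circle_integral(lambda x: g(x) * primitive_f(x))


for sign in (+1, -1):
    print(sign, lhs_61(sign), lhs_62(sign), rhs_62(sign))
```

For these sample functions the means are 1 and 2, so the left-hand side of (61) should be close to 2 for both signs, while the two sides of (62) should agree up to quadrature error.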

Proposition 7

There exists a (formal) near-to-identity, canonical transformation \((u,v) \mapsto (\tilde {u},\tilde {v})\) mapping \(H_\lambda \) into

$$\displaystyle \begin{aligned} \tilde{H}_\lambda \;=\; K_0 + \lambda^2 Z_2+O(\lambda^3)\ , \end{aligned} $$
(63)

where

$$\displaystyle \begin{aligned} K_0 \;=\; \oint \frac{\tilde{u}^2+\tilde{v}^2}{2} \, dx \, ; \end{aligned} $$
(64)
$$\displaystyle \begin{aligned} Z_2 \;=\; \oint &\Big\{ \frac{1}{2^{3/2}} \Big[ \big({\textstyle \frac{e_1}{|b|{}^{3/2}}+\frac{e_2}{|a|{}^{3/2}}+\frac{e_3}{|b| \sqrt{|a|}}+\frac{e_4}{|a| \sqrt{|b|}}}\big)\tilde{u}^3 + \\ &+\big({\textstyle \frac{e_1}{|b|{}^{3/2}}-\frac{e_2}{|a|{}^{3/2}}-\frac{e_3}{|b| \sqrt{|a|}}+\frac{e_4}{|a| \sqrt{|b|}}}\big)\tilde{v}^3 \Big]+ \\ &+\frac{1}{2} \Big[ \big({\textstyle \frac{e_5}{|b|} + \frac{e_6}{|a|} + \frac{e_7}{\sqrt{|ab|}}-\frac{d_1^2}{2|ab|}} \big)\tilde{u}_x^2 +\\ &+\big({\textstyle \frac{e_5}{|b|} + \frac{e_6}{|a|} - \frac{e_7}{\sqrt{|ab|}}-\frac{d_1^2}{2|ab|}} \big)\tilde{v}_x^2 \Big] \Big\} \, dx \, . \end{aligned} $$
(65)

Proof

First perturbative step: Using (45) and (61) one has \(Z_1=0\):

$$\displaystyle \begin{aligned} Z_1 \;&=\; \int_0^1 e^{s \mathscr{L}_{H_0}} H_1 \, ds \\ &=\; \int_0^1 \oint \frac{d_1}{\sqrt{|ab|}} u(x+s) v_x(x-s) \, dx \, ds \\ &\overset{\text{(61)}}{=}\; \frac{d_1}{\sqrt{|ab|}}\oint u(x) \, dx \oint v_y(y) \, dy \;=\; 0, \end{aligned}$$

where in the last step we used that \(v_y\) has zero average.

Additional term at second order: We need the expression of \(G_1\) to compute \(Z_2\). Using (44) and (62) we have

$$\displaystyle \begin{aligned} G_1\;&=\; \int_0^1 s e^{s \mathscr{L}_{H_0}} H_1 \, ds \\ &=\;\frac{d_1}{\sqrt{|ab|}} \int_0^1 \oint s \, u(x+s) v_x(x-s) \, dx \, ds \\ &\overset{\text{(62)}}{=}\; -\frac{d_1}{2\sqrt{|ab|}} \oint u v \, dx \, . \end{aligned}$$

The computation of functional derivatives yields:

$$\displaystyle \begin{aligned} \frac{\delta G_1}{\delta u} \;=\; - \frac{d_1}{2 \sqrt{|ab|}} v \, ; \qquad \frac{\delta G_1}{\delta v} \;=\; -\frac{d_1}{2 \sqrt{|ab|}} u \, ; \end{aligned}$$
$$\displaystyle \begin{aligned} \frac{\delta H_1}{\delta u} \;=\; \frac{d_1}{\sqrt{|ab|}} v_x \, ; \qquad \frac{\delta H_1}{\delta v} \;=\; -\frac{d_1}{\sqrt{|ab|}} u_x\ , \end{aligned}$$

and one finally obtains

$$\displaystyle \begin{aligned} \{H_1,G_1\} \;&=\; \oint \left(\frac{\delta H_1}{\delta u} \partial_x \frac{\delta G_1}{\delta u} - \frac{\delta H_1}{\delta v} \partial_x \frac{\delta G_1}{\delta v} \right) \, dx \\ &=\;-\frac{d_1^2}{2 |ab|} \oint \left(v_x^2+u_x^2 \right) \, dx \, . \end{aligned}$$

Computation of the second-order normal form: Using (45), one has to time average (with respect to the unperturbed flow of \(K_0\)) the following expression:

$$\displaystyle \begin{aligned} H_2+\frac{1}{2} &\{H_1-Z_1,G_1\} \;=\; \oint \Big\{ \frac{1}{2^{3/2}} \Big[ \big({\textstyle \frac{e_1}{|b|{}^{3/2}}+\frac{e_2}{|a|{}^{3/2}}+\frac{e_3}{|b| \sqrt{|a|}}+\frac{e_4}{|a| \sqrt{|b|}}}\big)u^3 \\ &+\big({\textstyle \frac{e_1}{|b|{}^{3/2}}-\frac{e_2}{|a|{}^{3/2}}-\frac{e_3}{|b| \sqrt{|a|}}+\frac{e_4}{|a| \sqrt{|b|}}}\big)v^3+\big({\textstyle \frac{3 e_1}{|b|{}^{3/2} }-\frac{3 e_2}{|a|{}^{3/2}}+\frac{e_3}{|b|\sqrt{|a|}}-\frac{e_4}{|a|\sqrt{|b|}}} \big)u^2 v \\ &+\big({\textstyle \frac{3 e_1}{|b|{}^{3/2} }+\frac{3 e_2}{|a|{}^{3/2}}-\frac{e_3}{|b|\sqrt{|a|}}-\frac{e_4}{|a|\sqrt{|b|}}} \big)u v^2 \Big]+ \\ &+\frac{1}{2} \Big[ \big({\textstyle \frac{e_5}{|b|} + \frac{e_6}{|a|} + \frac{e_7}{\sqrt{|ab|}}-\frac{d_1^2}{2 |ab|}} \big)u_x^2 +\big({\textstyle \frac{e_5}{|b|} + \frac{e_6}{|a|} - \frac{e_7}{\sqrt{|ab|}}-\frac{d_1^2}{2 |ab|}} \big)v_x^2 \Big] \Big\} \, dx\ . \end{aligned}$$

As a consequence of (61) and under the assumption of \(\oint u \, dx = \oint v \, dx =0\):

$$\displaystyle \begin{aligned} \int_0^1 \oint u^2(x+s) v(x-s) \, dx \, ds \;=\; \left(\oint u^2(x) \, dx \right) \left( \oint v(x) \, dx \right) \;=\; 0\ ; \end{aligned}$$
$$\displaystyle \begin{aligned} \int_0^1 \oint u(x+s) v^2(x-s) \, dx \, ds \;=\; \left(\oint u(x) \, dx \right) \left( \oint v^2(x) \, dx \right) \;=\; 0\ . \end{aligned}$$

Moreover

$$\displaystyle \begin{aligned} \int_0^1 \oint u^3(x+s) \, dx \, ds \;=\; \oint u^3(x) \, dx\ ; \end{aligned}$$
$$\displaystyle \begin{aligned} \int_0^1 \oint v^3(x-s) \, dx \, ds \;=\; \oint v^3(x) \, dx\ ; \end{aligned}$$
$$\displaystyle \begin{aligned} \int_0^1 \oint u_x^2 (x+s) \, dx \, ds \;=\; \oint u_x^2(x) \, dx\ ; \end{aligned}$$
$$\displaystyle \begin{aligned} \int_0^1 \oint v_x^2 (x-s) \, dx \, ds \;=\; \oint v_x^2(x) \, dx\ , \end{aligned}$$

and this completes the proof. □
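A quick consistency check on the correction \(-d_1^2/(2|ab|)\) appearing in (65) comes from the linearised dynamics: the quadratic part of the Hamiltonian is transformed by a linear canonical map, which cannot change the frequencies. Set all the \(e_j\) to zero and write d for the coefficient of \(\oint uv_x\, dx\) in (58) (a shorthand introduced here). On the mode \(e^{\imath (kx+\omega t)}\) the linear Boussinesq equation of Remark 13 gives \(\omega ^2=k^2-(\lambda d)^2k^4\), whereas the equation \(\tilde u_t=\partial _x\delta \tilde H_\lambda /\delta \tilde u\) generated by \(K_0+\lambda ^2Z_2\) gives \(\omega =k-\frac {1}{2}(\lambda d)^2k^3\). The sketch below checks that the two frequencies differ by \(O(\lambda ^4)\) only; parameter values and function names are illustrative.

```python
import math


def omega_boussinesq(k, lam, d):
    """Exact frequency of the linear Boussinesq equation (valid for lam*d*k < 1)."""
    return k * math.sqrt(1.0 - (lam * d * k) ** 2)


def omega_normal_form(k, lam, d):
    """Frequency of the u-equation generated by K0 - lam^2 (d^2 / 4) * oint u_x^2 dx."""
    return k - 0.5 * (lam * d) ** 2 * k ** 3


k, d = 2.0 * math.pi, 0.5  # illustrative values
gap_coarse = abs(omega_boussinesq(k, 0.02, d) - omega_normal_form(k, 0.02, d))
gap_fine = abs(omega_boussinesq(k, 0.01, d) - omega_normal_form(k, 0.01, d))
print(gap_coarse, gap_fine, gap_coarse / gap_fine)
```

Halving λ should reduce the gap by a factor close to 16, as expected for a remainder of order \(\lambda ^4\).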

Remark 14

\(\tilde {H}_\lambda \) is always the Hamiltonian of a pair of counter-propagating Korteweg-de Vries equations (up to a small remainder), i.e. its vector field \(J\nabla \tilde H_\lambda \) is of the form (17). Such a result is to be expected, and it agrees with the results available in the literature for particular cases, among which those concerning the FPU problem (starting with the seminal work of Zabusky and Kruskal [37]) and the propagation of surface water waves (where the first deduction of the KdV equation goes back to Boussinesq [13]).

4.3 The Mechanical Case

For mechanical Hamiltonians of the form \(\mathscr {H}=p^2/2+\mathscr {U}\), where \(\mathscr {U}\) depends on \(q_x\) and its derivatives, starting from Proposition 6 and repeating the analysis made in the general case, we perform the change of variables (q, p)↦(u, v), which yields

$$\displaystyle \begin{aligned} K_0\;=\; \oint\frac{u^2+v^2}{2} \, dx \, , \end{aligned} $$
(66)
$$\displaystyle \begin{aligned} H_2\;=\; \oint\left[\frac{\alpha_1}{2^{3/2}} \left(u^3+3u^2v+3 u v^2 + v^3 \right)+\frac{\alpha_2}{2} \left((u_x)^2+2u_xv_x+(v_x)^2 \right)\right] \, dx \, , \end{aligned} $$
(67)
$$\displaystyle \begin{aligned} H_4\;&=\;\oint \Big\{\beta_1 \left[\frac{u^4+4u^3v+6u^2v^2+4u v^3+v^4}{4}\right]\\ &\quad +\beta_2 \left[\frac{(u_x)^2+2u_x v_x +(v_x)^2}{2} \right]\frac{u+v}{\sqrt{2}}\\ &\quad +\frac{\beta_3}{2} [(u_{xx})^2+2u_{xx} v_{xx}+(v_{xx})^2] \Big\} \, dx \, . \end{aligned} $$
(68)

Proposition 8

There exists a (formal) near-to-identity, canonical transformation \((u,v) \mapsto (\tilde {u},\tilde {v})\) mapping \(H_\lambda \) into

$$\displaystyle \begin{aligned} \tilde{H}_\lambda \;=\; K_0 + \lambda^2 Z_2+\lambda^4 Z_4 +O(\lambda^6)\ , \end{aligned} $$
(69)

where

$$\displaystyle \begin{aligned} K_0 \;=\; \oint \frac{\tilde{u}^2+\tilde{v}^2}{2} \, dx \, , \end{aligned} $$
(70)
$$\displaystyle \begin{aligned} Z_2 \;=\; \oint \left[ \frac{\alpha_1}{2^{3/2}}(\tilde{u}^3+\tilde{v}^3)+\frac{\alpha_2}{2} (\tilde{u}_x^2+\tilde{v}_x^2) \right] \, dx \, , \end{aligned} $$
(71)
$$\displaystyle \begin{aligned} Z_4 \;=\; &\oint \Big\{ \Big(\frac{\beta_1}{4}-\frac{9\alpha_1^2}{32} \Big)(\tilde{u}^4+\tilde{v}^4)+\Big(\frac{\beta_2}{2^{3/2}}-\frac{3\alpha_1\alpha_2}{2^{3/2}} \Big)\big[\tilde{u}(\tilde{u}_x)^2+\tilde{v} (\tilde{v}_x)^2 \big]+ \\ & +\Big(\frac{\beta_3}{2}-\frac{\alpha_2^2}{4} \Big) \big[ (\tilde{u}_{xx})^2+(\tilde{v}_{xx})^2 \big] \Big\} \, dx +\Big(\frac{3 \beta_1}{2}-\frac{9 \alpha_1^2}{4} \Big) \langle \tilde{u}^2 \rangle \langle \tilde{v}^2 \rangle+\\ &+\frac{9 \alpha_1^2}{32} \big( \langle \tilde{u}^2 \rangle^2+\langle \tilde{v}^2 \rangle^2 \big) \, . \end{aligned} $$
(72)

Proof

First perturbative step: using Proposition 2 we have

$$\displaystyle \begin{aligned} Z_2\;&=\; \int_0^1 e^{s \mathscr{L}_{H_0}} H_2 \, ds \\ &\overset{\text{(61)}}{=}\;\oint \Big\{ \frac{\alpha_1}{2^{3/2}}(u^3+v^3)+\frac{\alpha_2}{2}[(u_x)^2+(v_x)^2] \Big\} \, dx \\ &+\frac{3 \alpha_1}{2^{3/2}} \big(\langle u^2 \rangle \langle v \rangle + \langle u \rangle \langle v^2 \rangle \big)\ ; \end{aligned}$$

here the last term vanishes because \(\langle u\rangle =\oint u \, dx =0\) and \(\langle v\rangle =\oint v \, dx = 0\).

Generator of the first order transformation:

$$\displaystyle \begin{aligned} G_2 \;&=\; \int_0^1 s e^{s \mathscr{L}_{H_0}} (H_2-Z_2) \, ds \\ &=\int_0^1 \oint s \Big\{\frac{3 \alpha_1}{2^{3/2}} \big[u^2(x+s) v(x-s)+ u(x+s) v^2(x-s)\big]\\ &\qquad +\alpha_2 u_x(x+s)v_x(x-s) \Big\} \, dx \, ds\\ &\overset{\text{(62)}}{=}\frac{3 \alpha_1}{2^{5/2}} \big( \langle u^2 \rangle \langle v \rangle + \langle u \rangle \langle v^2 \rangle \big)+\frac{3 \alpha_1}{2^{5/2}} \Big( \oint v^2 \partial_x^{-1} u \, dx + \oint v \partial_x^{-1} u^2 \, dx \Big)\\ &\qquad + \frac{\alpha_2}{2} \oint u v_x \, dx \\ &=\;\frac{3 \alpha_1}{2^{5/2}} \Big( \oint v^2 \partial_x^{-1} u \, dx + \oint v \partial_x^{-1} u^2 \, dx \Big) + \frac{\alpha_2}{2} \oint u v_x \, dx\ , \end{aligned}$$

where in the last step we used 〈u〉 = 0 and 〈v〉 = 0. Making use of the functional derivatives

$$\displaystyle \begin{aligned} \frac{\delta G_2}{\delta u} \;=\; \frac{3 \alpha_1}{2^{3/2}} \Big[-u \partial_x^{-1} v - \frac{1}{2} \partial_x^{-1} v^2 \Big] + \frac{\alpha_2}{2} v_x \, ; \end{aligned}$$
$$\displaystyle \begin{aligned} \frac{\delta G_2}{\delta v} \;=\; \frac{3 \alpha_1}{2^{3/2}} \Big[\frac{1}{2}\partial_x^{-1} u^2+v \partial_x^{-1} u \Big] - \frac{\alpha_2}{2}u_x\, ; \end{aligned}$$
$$\displaystyle \begin{aligned} \frac{\delta (H_2-Z_2)}{\delta u} \;=\; \frac{3 \alpha_1}{2^{3/2}} \big[ 2u v + v^2\big]-\alpha_2 v_{xx}\, ; \end{aligned}$$
$$\displaystyle \begin{aligned} \frac{\delta (H_2-Z_2)}{\delta v} \;=\; \frac{3 \alpha_1}{2^{3/2}} \big[ u^2+ 2u v\big]-\alpha_2 u_{xx}\, , \end{aligned}$$

we can compute the Poisson bracket

$$\displaystyle \begin{aligned} \{ H_2-Z_2,G_2\} \;&=\; \oint \Big[\frac{\delta (H_2-Z_2)}{\delta u} \partial_x \frac{\delta G_2}{\delta u} - \frac{\delta (H_2-Z_2)}{\delta v} \partial_x \frac{\delta G_2}{\delta v} \Big] \, dx \, . \end{aligned}$$

Since we do not need its full expression, we can use (61) to simplify computations and consider only those terms that do not vanish after taking the average with respect to the flow of \(K_0\). We obtain

$$\displaystyle \begin{aligned} \Big\langle \{H_2-Z_2,G_2\} \Big \rangle_{0} \;&=\;\oint \Big[-\frac{9 \alpha_1^2}{16} \big( u^4+v^4 \big)-\frac{3\alpha_1\alpha_2}{\sqrt{2}} \big(u_x^2 u+v_x^2 v\big)+ \\ &-\frac{\alpha_2^2}{2}\big( (u_{xx})^2+(v_{xx})^2 \big) \Big] \, dx+\frac{9 \alpha_1^2}{16} \big( \langle u^2 \rangle^2+\langle v^2 \rangle^2 \big)\\ & -\frac{9 \alpha_1^2}{2} \langle u^2 \rangle \langle v^2 \rangle\ , \end{aligned}$$

whereas

$$\displaystyle \begin{aligned} \langle H_4 \rangle_{0} \;&=\; \oint \Big\{ \frac{\beta_1}{4}\big(u^4+v^4\big)+\frac{\beta_2}{2^{3/2}} \big[ u(u_x)^2+v(v_x)^2 \big]+\frac{\beta_3}{2} \big[ (u_{xx})^2+(v_{xx})^2 \big]\Big\} \, dx \\ &+\frac{3 \beta_1}{2} \langle u^2 \rangle \langle v^2 \rangle \, . \end{aligned}$$

According to (45), applied with \(H_2\), \(H_4\) and \(G_2\) in place of \(H_1\), \(H_2\) and \(G_1\), one has \(Z_4=\langle H_4\rangle _0+\frac {1}{2}\langle \{H_2-Z_2,G_2\}\rangle _0\) (the term \(\{Z_2,G_2\}\) has zero average, since \(\langle G_2\rangle _0=0\)). Adding the last expression to one half of the previous one we get

$$\displaystyle \begin{aligned} Z_4 \;&=\; \oint \Big\{ \Big(\frac{\beta_1}{4}-\frac{9\alpha_1^2}{32} \Big)(u^4+v^4)+\Big(\frac{\beta_2}{2^{3/2}}-\frac{3\alpha_1\alpha_2}{2^{3/2}} \Big)\big[u(u_x)^2+v (v_x)^2 \big]+ \\ &\qquad +\Big(\frac{\beta_3}{2}-\frac{\alpha_2^2}{4} \Big) \big[ (u_{xx})^2+(v_{xx})^2 \big] \Big\} \, dx +\Big(\frac{3 \beta_1}{2}-\frac{9 \alpha_1^2}{4} \Big) \langle u^2 \rangle \langle v^2 \rangle\\& \qquad +\frac{9 \alpha_1^2}{32} \big( \langle u^2 \rangle^2+\langle v^2 \rangle^2 \big) \, . \end{aligned}$$

Here, as in the generic case, \(Z_2\) is in the KdV hierarchy, i.e. the vector field \(J\nabla (\tilde K_0+\lambda ^2 Z_2)\) has the form of the right-hand side of (17). On the other hand, \(Z_4\) is not, in general, in the KdV hierarchy: the two components of its vector field \(J\nabla Z_4\) are not proportional to \(\kappa _5\) (as defined in (20)), because its parameters cannot, in general, satisfy all the required constraints. However, it is still possible to get a dynamics within the KdV hierarchy to order \(\lambda ^4\) by applying the Kodama normalisation procedure to the vector field \(J\nabla (\tilde K_0+\lambda ^2 Z_2+\lambda ^4 Z_4)\). Although such a normalisation is, in principle, noncanonical, it actually yields a system of equations of the form (18). Neglecting the remainder, these equations turn out to be Hamiltonian a fortiori, with the correct Gardner-Poisson tensor (56). The deep reason behind this fact is far from being understood at present.

Concrete examples are discussed in Sect. 5, where we also provide an explicit example of Kodama transformation.

5 Applications

5.1 The Fermi-Pasta-Ulam Problem

The Fermi-Pasta-Ulam (FPU) chain consists of N identical (unit) masses connected by nonlinear springs to their nearest neighbours. The dynamics is generated by the Hamiltonian

$$\displaystyle \begin{aligned} H=\;\sum_{j\in \mathbb{Z}_N} \left[\frac{p_j^2}{2}+\phi(q_{j+1}-q_j) \right]\ , \end{aligned} $$
(73)

where \(\mathbb {Z}_N:=\mathbb {Z}/(N\mathbb {Z})\), and ϕ is the potential

$$\displaystyle \begin{aligned} \phi(z) \;:=\; \frac{z^2}{2}+\alpha \frac{z^3}{3}+\beta \frac{z^4}{4}+O(z^5)\ , \end{aligned} $$
(74)

and α, β, … are the parameters measuring the strength of the nonlinear terms. One usually refers to the α-model if α is the only non-zero parameter; to the β-model if β is the only non-zero parameter; to the α + β-model if both α and β are non-zero; and to the generalised FPU model if the lowest degree of the nonlinearity is greater than or equal to 5.

When all the parameters in the nonlinearity are set to zero, the Hamiltonian (73) reduces to that of a harmonic chain, where particles interact through linear forces only. The latter system is integrable in the sense of Liouville, and the Hamiltonian is diagonalised by the (discrete) Fourier transform

$$\displaystyle \begin{aligned} p_j \;=\; \frac{1}{\sqrt{2N}} \sum_{k=-N}^{N-1} \hat{p}_k e^{\imath \pi \frac{j k}{N}}\ , \end{aligned} $$
(75)

and similarly for q j. The integrals of motion are the energies of the Fourier modes

$$\displaystyle \begin{aligned} E_k \;=\; \frac{|\hat{p}_k|{}^2+ \omega_k^2 |\hat{q}_k|{}^2}{2} \, , \qquad k=-N,\dots,N-1\ , \end{aligned} $$
(76)

where \(\omega _k:=2 \big | \sin \big (\frac {k \pi }{2N} \big ) \big |\) are the proper frequencies of oscillation. Observe that \(E_k=E_{-k}\) for all k.
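The frequencies \(\omega _k\) can be checked directly on the equations of motion of the harmonic chain, \(\ddot q_j=q_{j+1}+q_{j-1}-2q_j\): a Fourier mode is an eigenvector of the discrete Laplacian with eigenvalue \(-\omega _k^2\). Since the exponential in (75) is 2N-periodic in j, the sketch below assumes a periodic chain of 2N sites; this reading, like the function names, is an assumption made here.

```python
import math


def omega(k, n_half):
    """Proper frequency omega_k = 2 |sin(k pi / (2N))|, with N = n_half."""
    return 2.0 * abs(math.sin(k * math.pi / (2.0 * n_half)))


def laplacian_residual(k, n_half):
    """Max over sites of |(q_{j+1} + q_{j-1} - 2 q_j) + omega_k^2 q_j| for mode k."""
    sites = 2 * n_half
    q = [math.cos(math.pi * j * k / n_half) for j in range(sites)]
    worst = 0.0
    for j in range(sites):
        # q[j - 1] wraps around for j = 0, which implements periodicity
        lap = q[(j + 1) % sites] + q[j - 1] - 2.0 * q[j]
        worst = max(worst, abs(lap + omega(k, n_half) ** 2 * q[j]))
    return worst


N = 8
residual = max(laplacian_residual(k, N) for k in range(-N, N))
symmetric = all(abs(omega(k, N) - omega(-k, N)) < 1e-15 for k in range(1, N))
print(residual, symmetric)
```

The symmetry \(\omega _k=\omega _{-k}\) checked at the end is one of the two ingredients of \(E_k=E_{-k}\); the other is the reality of \(q_j\) and \(p_j\).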

The nonlinear model (73) was introduced by Fermi, Pasta and Ulam (FPU), supported by Tsingou [18], with the purpose of analysing its thermalisation process. The authors expected that the interaction between the Fourier modes due to the nonlinear terms, and the consequent energy sharing between them, would drive the system to thermal equilibrium on a short time scale. In particular, as a detector of thermal equilibrium, they expected to observe the “equipartition of energy”, i.e. a final state of the system where, on time average, all Fourier energies have almost the same value, i.e. \(E_k\simeq E/N\), where E is the total energy. Their numerical simulations showed instead a completely different scenario: by initially exciting the lowest frequency mode (k = 1), within their available computation time, energy sharing was observed to effectively take place only among the first few modes and, instead of a continuous trend to equipartition, the dynamics showed an almost recurrent behaviour. The first explanation of the latter phenomenon goes back to Zabusky and Kruskal [37], who approximated the traveling wave dynamics of the system by the KdV equation, and based their argument on the recurrent behaviour of its solitons. On the Hamiltonian side, the first correct computation of the resonant normal form of the lattice system, in action-angle variables, is due to [35]. Such a construction was only later recognised to include that of Zabusky and Kruskal [8, 10].

Nowadays, it is well known that a key role in the explanation of the FPU phenomenon, or paradox, is played by the integrability of the resonant normal form either of the lattice system or of its infinite-dimensional approximation (we refer to [4, 19] and the references therein). Indeed, the KdV equation admits a complete set of (infinitely many) integrals of motion, whose conservation prevents a fast energy sharing among the Fourier modes. Moreover, the preservation of the analyticity of the initial condition causes an exponential decay of its Fourier energies [28]. These two aspects closely resemble the observations in the FPU experiment.

In fact, the connection FPU-KdV can be made rigorous using the normal form construction of Theorem 2, as follows. As a preliminary step, we perform the canonical change of variables (q, p)↦(s, r) defined by the generating function

$$\displaystyle \begin{aligned} F(q,s) \;=\;\sum_{j \in \mathbb{Z}_N} s_j(q_j-q_{j+1})\ , \end{aligned} $$
(77)

which gives

$$\displaystyle \begin{aligned} r_j \;&= \;-\frac{\partial F}{\partial s_j}=q_{j+1}-q_j \, ,\\ p_j \;&=\; \frac{\partial F}{\partial q_j}=s_j-s_{j-1} \, . \end{aligned} $$
(78)

In terms of the new variables (s, r) the Hamiltonian (73) reads

$$\displaystyle \begin{aligned} H=\sum_{j\in\mathbb{Z}_N}\left[\frac{(s_{j+1}-s_j)^2}{2}+\phi(r_j)\right]\ , \end{aligned} $$
(79)

whose equations of motion are

$$\displaystyle \begin{aligned} \dot{s}_j\ &=\frac{\partial H}{\partial r_j}=\phi'(r_j)\ ,\\ \dot{r}_j\ &=-\frac{\partial H}{\partial s_j}=s_{j+1}+s_{j-1}-2s_j\ . \end{aligned} $$
(80)
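The structure of (80) can be explored numerically. The following is a minimal sketch, not the scheme used in the cited works: it assumes the standard FPU potential \(\phi(r)=r^2/2+\alpha r^3/3+\beta r^4/4\), a periodic chain of n sites, and a Strang splitting of the separable Hamiltonian (79). All function names and parameter values are illustrative.

```python
import math

def phi(r, alpha=1.0, beta=0.0):
    # Assumed FPU potential: r^2/2 + alpha r^3/3 + beta r^4/4.
    return 0.5 * r * r + alpha * r ** 3 / 3.0 + beta * r ** 4 / 4.0

def dphi(r, alpha=1.0, beta=0.0):
    return r + alpha * r * r + beta * r ** 3

def laplacian(s):
    # Periodic second difference s_{j+1} + s_{j-1} - 2 s_j.
    n = len(s)
    return [s[(j + 1) % n] + s[(j - 1) % n] - 2.0 * s[j] for j in range(n)]

def energy(s, r, alpha=1.0, beta=0.0):
    # Hamiltonian (79) in the (s, r) variables.
    n = len(s)
    kinetic = sum(0.5 * (s[(j + 1) % n] - s[j]) ** 2 for j in range(n))
    return kinetic + sum(phi(rj, alpha, beta) for rj in r)

def step(s, r, dt, alpha=1.0, beta=0.0):
    # Strang splitting: half step in r, full step in s, half step in r.
    r = [rj + 0.5 * dt * lj for rj, lj in zip(r, laplacian(s))]
    s = [sj + dt * dphi(rj, alpha, beta) for sj, rj in zip(s, r)]
    r = [rj + 0.5 * dt * lj for rj, lj in zip(r, laplacian(s))]
    return s, r

def run(n=32, amp=0.05, dt=0.01, steps=2000):
    # Long-wavelength initial datum in r, with s initially at rest.
    s = [0.0] * n
    r = [amp * math.sin(2.0 * math.pi * j / n) for j in range(n)]
    e0 = energy(s, r)
    for _ in range(steps):
        s, r = step(s, r, dt)
    return e0, energy(s, r), sum(r)
```

Since the splitting scheme is symplectic, the energy (79) is expected to be conserved up to an error of order dt², while \(\sum_j r_j\) keeps its initial value because the periodic second difference sums to zero.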

Remark 15

The periodicity of the \(q_j\) implies \(\sum _{j\in \mathbb {Z}_N}r_j=0\), whereas the periodicity of the \(s_j\) implies \(\sum _{j\in \mathbb {Z}_N}p_j=0\).

We now assume the existence of a pair of analytic functions \(R,S:\mathbb {T} \times \mathbb {R} \to \mathbb {R}\) such that

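$$\displaystyle \begin{aligned} s_j(t) \;=\; \frac{\sqrt{\varepsilon}}{h}\, S(x,\tau)\Big|{}_{x=hj,\ \tau=ht} \, , \qquad r_j(t) \;=\; \sqrt{\varepsilon}\, R(x,\tau)\Big|{}_{x=hj,\ \tau=ht} \, . \end{aligned} $$
(81)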

Notice that the choice of the functions R and S is not unique. For example, one can add to them any linear combination of the form \(\sum _{m\in \mathbb {Z}}c_m\sin {}(\pi mx/h)\), which vanishes at the lattice sites x = hj. Having in mind long-wavelength initial conditions, a natural choice consists in restricting R and S to the Fourier polynomials supported on the first few harmonics at τ = 0, and in regarding the discrete system as a sampling of the continuous one at any τ > 0. This is allowed by the following proposition.

Proposition 9

Consider the Hamiltonian functional

$$\displaystyle \begin{aligned} \mathscr{H}[S,R]=\oint \Big[\frac{1}{\varepsilon} \phi(\sqrt{\varepsilon} R(x,\tau))-\frac{1}{2} S(x,\tau) \varDelta_h S(x,\tau) \Big] \, dx\ , \end{aligned} $$
(82)

where

$$\displaystyle \begin{aligned} \varDelta_h:= \frac{4}{h^2}\sinh^2\left(\frac{h}{2}\partial_x\right)= \partial_x^2+\frac{h^2}{12} \partial_x^4 + O(h^4) \end{aligned} $$
(83)

is the discrete Laplacian. Then, its Hamilton equations restricted to the lattice coincide with the FPU equations (80).

Proof

Considering (S, R) as a canonical pair coordinate-momentum, one has

$$\displaystyle \begin{aligned} S_\tau=&\frac{\delta\mathscr{H}}{\delta R}=\frac{1}{\sqrt{\varepsilon}}\phi'(\sqrt{\varepsilon}R)\ ,\\ R_\tau=&-\frac{\delta\mathscr{H}}{\delta S}=\varDelta_h S\ . \end{aligned} $$
(84)

The latter equations, restricted to the lattice, i.e. to the points x = hj, coincide with those obtained by substituting (81) into (80). □
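The identification of \(\varDelta_h\) in (83) with the lattice second difference can be checked on a single Fourier mode. The sketch below (function names and the sample values of k and h are illustrative) compares the second difference of \(e^{\imath kx}\), the symbol of \(\frac{4}{h^2}\sinh^2(\frac{h}{2}\partial_x)\) obtained by replacing \(\partial_x\) with \(\imath k\), and the truncated expansion \(\partial_x^2+\frac{h^2}{12}\partial_x^4\).

```python
import cmath

def second_difference_symbol(k, h):
    # (f(x+h) + f(x-h) - 2 f(x)) / h^2 applied to f(x) = exp(i k x), at x = 0.
    f = lambda x: cmath.exp(1j * k * x)
    return ((f(h) + f(-h) - 2.0 * f(0.0)) / h ** 2).real

def sinh_symbol(k, h):
    # Symbol of (4/h^2) sinh^2(h d_x / 2) with d_x replaced by i k.
    return ((4.0 / h ** 2) * cmath.sinh(0.5j * h * k) ** 2).real

def truncated_symbol(k, h):
    # Symbol of d_x^2 + (h^2/12) d_x^4 with d_x replaced by i k.
    return -k ** 2 + h ** 2 * k ** 4 / 12.0
```

For small kh the first two functions agree up to rounding, while the truncation differs from them by at most \(h^4k^6/360\), the size of the first neglected term of the alternating series.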

One thus embeds the dynamics of the FPU lattice (80) within that of the infinite-dimensional Hamiltonian system (84). The latter consists of a system of nonlinear dispersive Hamiltonian PDEs for any expansion to finite order of the discrete Laplacian (83). Moreover, making use of the latter expansion and of the explicit expression (74) of ϕ, one observes that the Hamiltonian (82) has the grading of Definition 5 with

$$\displaystyle \begin{aligned} \lambda \sim \sqrt{\varepsilon} \sim h^2\ . \end{aligned} $$
(85)

Let us see in which sense the KdV equation allows us to explain rigorously, in the case of the α-chain, the FPU phenomenon, namely the fact that, if one low-frequency mode is initially excited, then the energy quickly flows to a small packet of modes whose energy, on time average, decreases exponentially with the mode index. The main result is conveniently formulated in terms of the quantities

$$\displaystyle \begin{aligned} \kappa\;:=\;\frac{k}{N}\ \ ;\ \ \mathscr{E}_\kappa \;:=\; \frac{E_k}{N}\ , \end{aligned} $$
(86)

denoting the specific mode index (or wave number) and the corresponding specific energy, respectively. We are interested in the evolution of initial data supported on one harmonic mode of long wavelength, i.e. with specific index \(\kappa_0=k_0/N\ll 1\).

Theorem 3 (Bambusi-Ponno [8])

Consider an initial condition of the form

$$\displaystyle \begin{aligned} \mathscr{E}_{\kappa_0} (0)\; = \; C_0\mu^4 , \qquad \mathscr{E}_\kappa(0) =\;0 , \qquad \forall \kappa \neq \kappa_0 \, , \end{aligned} $$
(87)

where \(C_0\) is any fixed constant and \(\mu:=\kappa_0=k_0/N\ll 1\).

Then, for any fixed time \(T_f\) there exist positive constants \(\mu_*\), σ, \(C_1\) and \(C_2\) (dependent on \(C_0\) and \(T_f\)) such that, for all κ, \(\mu<\mu_*\) and \(|t|\leq T_f/\mu^3\):

  1. (i)
    $$\displaystyle \begin{aligned} \mathscr{E}_\kappa(t)\; \leq\; \mu^4 C_1 e^{- \sigma \kappa/\mu}\ ; \end{aligned} $$
    (88)
  2. (ii)

    there exists a sequence of almost periodic functions \(\{F_n(t)\}_{ n\in \mathbb {N}}\) and an associated specific sequence

    $$\displaystyle \begin{aligned} F_\kappa=\mu^4F_n\ ,\quad \mathit{\text{if}}\quad \kappa=n\kappa_0\ ;\quad F_\kappa =0 \quad \mathit{\text{otherwise}}\ , \end{aligned} $$
    (89)

    such that

    $$\displaystyle \begin{aligned} |\mathscr{E}_\kappa(t) - F_\kappa(t)| \;\leq\; C_2 \mu^5\ . \end{aligned} $$
    (90)

The proof of this theorem is based on the fact that a solution of the KdV equation with an analytic initial datum on the torus remains analytic for all times [28]. In particular, analyticity implies the exponential decay of Fourier coefficients, which in turn implies the exponential decay of the Fourier coefficients for the FPU system.

On the other hand, technical difficulties arise when comparing the dynamics of the discrete system with the dynamics of the continuous one, due to the contribution of the singular remainder of the discrete Laplacian, which contains higher-order derivatives. The latter problem is overcome by a combined use of the analyticity property of the KdV flow, the closeness to the identity of the canonical transformation and the Grönwall lemma [8].

However, when comparing the above result with the numerical simulations and with the recent results relating the FPU dynamics to that of the Toda lattice [5], one realises that it is not optimal: the numerically observed time scale of closeness to the KdV dynamics turns out to be longer than \(t\sim\mu^{-3}\sim\varepsilon^{-3/4}\). In fact, there is a concrete hope of improving the latter result, which rests on the fact that the normal form of the FPU problem is in the KdV hierarchy not only to the first but also to the second perturbative order. Then, an extension of Theorem 3 could work with a second-order normal form transformation, yielding the (presumably) optimal result of localisation of the Fourier spectrum on time scales \(\sim\mu^{-5}\sim\varepsilon^{-5/4}\).

Within this context, we present below the normal form construction of the FPU problem, including the Kodama transformation.

Proposition 10

The Hamiltonian (82) can be mapped into the normal form

$$\displaystyle \begin{aligned} \tilde{H}=K_0+Z_1+Z_2+\dots\ , \end{aligned} $$
(91)

with

$$\displaystyle \begin{aligned} \begin{array}{rcl} K_0& =&\displaystyle \oint \frac{{\tilde{u}}^2+{\tilde{v}}^2}{2} \, dx {} \end{array} \end{aligned} $$
(92)
$$\displaystyle \begin{aligned} \begin{array}{rcl} Z_1& =&\displaystyle \frac{h^2}{4! 2} \oint \left[\frac{4 \alpha \sqrt{2 \varepsilon}}{h^2}\big({\tilde{u}}^3+{\tilde{v}}^3 \big)+{\tilde{u}}{\tilde{u}}_{xx}+{\tilde{v}}{\tilde{v}}_{xx} \right] \, dx {} \end{array} \end{aligned} $$
(93)
$$\displaystyle \begin{aligned} \begin{array}{rcl} Z_2& =&\displaystyle {\textstyle\frac{3}{20} \frac{h^4}{(4!)^2}}\oint \Big[{\textstyle\left(\frac{\beta}{\alpha^2}-\frac{1}{2}\right) \frac{240 \alpha^2 \varepsilon}{h^4}}({\tilde{u}}^4+{\tilde{v}}^4)+{\textstyle\frac{20 \alpha \sqrt{2 \varepsilon}}{h^2}}\big({\tilde{u}}^2{\tilde{u}}_{xx}+{\tilde{v}}^2{\tilde{v}}_{xx} \big) \\ & \,&\displaystyle +({\tilde{u}}_{xx})^2+({\tilde{v}}_{xx})^2 \Big] \, dx + \Big({\textstyle\frac{3 \beta \varepsilon}{8}-\frac{\alpha^2 \varepsilon}{4} }\Big) \Big( \oint {\tilde{u}}^2 \, d x \Big) \Big(\oint {\tilde{v}}^2 \, d x \Big) + \\ & \,&\displaystyle +\frac{\alpha^2 \varepsilon}{32} \Big(\langle {\tilde{u}}^2 \rangle^2+\langle {\tilde{v}}^2 \rangle^2 \Big) {} \end{array} \end{aligned} $$
(94)

Proof

By introducing the Riemann variables

$$\displaystyle \begin{aligned} u:=\frac{S_x+R}{\sqrt{2}}\ \ ;\ \ v:=\frac{S_x-R}{\sqrt{2}}\ , \end{aligned} $$
(95)

the result is actually a corollary of Proposition 8, with the substitutions \(\alpha _1=\frac {\alpha \sqrt {\varepsilon }}{6 \sqrt {2}}\), \(\alpha _2=-\frac {h^2}{4! 2}\) and \(\beta _1=\frac {\beta \varepsilon }{4}\). □

Remark 16

The equations of motion of \(K_0+Z_1\) are those of two counter-propagating KdV equations, i.e. of the form (17), for any α. On the other hand, the equations of motion of \(K_0+Z_1+Z_2\) are not in the KdV hierarchy, i.e. in the form (18), unless the special condition \(\beta=5\alpha^2/6\) holds.

In order to bring the continuous FPU equations of motion into the KdV hierarchy form (18), one must look for a suitable Kodama transformation, as sketched in Remark 2 [20].

Proposition 11

The Kodama transformation

$$\displaystyle \begin{aligned} \tilde u = w + g(w)\ \ ;\ \ \tilde v = z + g(z)\ , \end{aligned} $$
(96)

where

$$\displaystyle \begin{aligned} g(w):=&\frac{h^2}{4!}\left(\frac{7}{2}-\frac{9}{2}\frac{\beta}{\alpha^2}\right)w_{xx}+ \frac{\alpha\sqrt{\varepsilon}}{\sqrt{2}}\left(\frac{13}{12}-\frac{3}{2}\frac{\beta}{\alpha^2}\right) \left(w^2-\oint w^2\ dx\right) + \\ -&\frac{1}{6}\left(w_x\partial_x^{-1}w-\oint w^2\ dx\right)\ , \end{aligned} $$
(97)

maps the equations of motion of the Hamiltonian normal form (91) into the integrable KdV form (18).

Proof

The proof consists in a long, though direct computation. Details can be found in [20]. Observe that, according to the grading (85), g ∼ λ, which does not affect the first order normal form. □

A natural question now arises, namely whether it is possible to construct a normal form transformation, including the Kodama procedure, conjugating the continuous FPU equations to those of the KdV hierarchy to perturbative orders higher than the second one. This is an open problem for initial data generically supported on lower modes, but it has recently been addressed for initial data close to the traveling wave. In [23] it is proved that, for almost-traveling waves, the conjugation to third order works only if the parameters lie on a curve in the space of parameters containing the Toda lattice.

In general, it is expected that the FPU normal form is in the KdV hierarchy to a finite perturbative order, depending on the model. This is easily seen by considering the family of generalised FPU-systems [9] defined by a Hamiltonian of the form (73) with

$$\displaystyle \begin{aligned} \phi(z)=\frac{z^2}{2}+\frac{z^p}{p} \, , \qquad p \geq 3\ . \end{aligned} $$
(98)

Instead of fixing a model and going on with the perturbative order, we here consider how the first order normal form depends on the exponent p. The Hamiltonian (82) with potential (98) reads

$$\displaystyle \begin{aligned} H \;=\; \oint \Big[\frac{R^2}{2}+\gamma \varepsilon^{\frac{p-2}{2}} \frac{R^p}{p}+\frac{1}{2} (S_x)^2-\frac{h^2}{12}(S_{xx})^2 \Big] \, dx + O(h^4)\ . \end{aligned} $$
(99)

Passing to the (u, v) variables (95), one gets \(H=K_0+H_1\), where

$$\displaystyle \begin{aligned} \begin{array}{rcl} K_0& =&\displaystyle \oint \frac{u^2+v^2}{2} \, dx\ , \end{array} \end{aligned} $$
(100)
$$\displaystyle \begin{aligned} \begin{array}{rcl} H_1& =&\displaystyle \oint \left[\gamma \varepsilon^{\frac{p-2}{2}} \frac{(u-v)^p}{2^{p/2} p}-\frac{h^2}{24}\big((u_{x})^2 + 2 u_x v_x + (v_x)^2\big) \right] \, dx\ . \end{array} \end{aligned} $$
(101)

Averaging \(H_1\) (using (61)) one computes the normal form \(\tilde H=K_0+Z_1+\cdots \) of the system, where

$$\displaystyle \begin{aligned} Z_1= \left\langle H_1\right\rangle\;&=\; \oint \frac{\gamma \varepsilon^{\frac{p-2}{2}}}{2^{p/2} p} \big(u^p+(-v)^p\big)-\frac{h^2}{24} \big((u_x)^2+(v_x)^2 \big) \, dx \\ & + \frac{\gamma \varepsilon^{\frac{p-2}{2}}}{2^{p/2} p} \sum_{j=1}^{p-1} (-1)^j \binom{p}{j} \Big(\oint u^{p-j} \, dx \Big) \Big(\oint v^j \, dx \Big) \, . \end{aligned} $$
(102)
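The combinatorial structure of (102) can be checked numerically. The sketch below assumes that the average in (61) is the time average along the flow of \(K_0\), and that this flow transports u and v rigidly in opposite directions, \(u(x)\mapsto u(x-t)\), \(v(x)\mapsto v(x+t)\); the grid, the test profiles and the function names are illustrative choices. On a periodic grid with an odd number of points the map \((x,t)\mapsto(x-t,x+t)\) is a bijection, so the averaged integral of \((u-v)^p\) factorises exactly into the binomial sum of products of means.

```python
import math

def torus_mean(values):
    # Discrete analogue of the integral over the unit torus.
    return sum(values) / len(values)

def flow_average(u, v, p):
    """Average over grid shifts t of mean_x (u(x - t) - v(x + t))**p."""
    m = len(u)
    total = 0.0
    for t in range(m):
        for x in range(m):
            total += (u[(x - t) % m] - v[(x + t) % m]) ** p
    return total / (m * m)

def binomial_formula(u, v, p):
    """Mean of u^p + (-v)^p plus the mixed products of means, as in (102)."""
    pure = torus_mean([a ** p for a in u]) + torus_mean([(-b) ** p for b in v])
    mixed = sum((-1) ** j * math.comb(p, j)
                * torus_mean([a ** (p - j) for a in u])
                * torus_mean([b ** j for b in v])
                for j in range(1, p))
    return pure + mixed

# Hypothetical test profiles on an odd grid of m = 33 points.
m = 33
u = [math.cos(2 * math.pi * j / m) + 0.3 * math.sin(4 * math.pi * j / m)
     for j in range(m)]
v = [0.5 * math.sin(2 * math.pi * j / m) - 0.2 * math.cos(4 * math.pi * j / m) + 0.1
     for j in range(m)]
```

The common prefactor \(\gamma\varepsilon^{(p-2)/2}/(2^{p/2}p)\) is omitted, since it multiplies both sides; the dispersive cross term \(u_xv_x\) is likewise left out of this check.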

For p = 3 one finds that \(K_0+Z_1\) is the Hamiltonian of two uncoupled KdV equations, as expected. For p = 4, the so-called β-model, \(K_0+Z_1\) is the Hamiltonian of two uncoupled modified KdV (mKdV) equations. Thus, the first order normal form is integrable for p = 3, 4. On the other hand, for p ≥ 5 the first order normal form Hamiltonian is that of two generalised, nonintegrable KdV equations, which are also nonlinearly coupled for p ≥ 6. For this class of models the integrability of the normal form, and the consequent FPU phenomenon of energy localisation due to closeness to integrability, are lost to first order if the degree of nonlinearity is high enough (p ≥ 5). More than this, in [9] it is suggested that the blow-up of solutions characterising the nonintegrable KdV equations might play a relevant role in the problem.

As a last point, we stress that the method of infinite-dimensional perturbation theory allows one to analyse the FPU system, treated in Proposition 10, in the singular limit h → 0 with fixed, small specific energy ε. Such a limit is justified on short time scales, where dispersion is expected to play a minor role with respect to nonlinearity, which explains why the normal modes start to effectively share their energy. Taking the limit h → 0, at fixed ε, of the FPU terms (92), (93) and (94), one finds

$$\displaystyle \begin{aligned} H=K_0+Z_1+Z_2+\dots \end{aligned} $$
(103)

with

$$\displaystyle \begin{aligned} \begin{array}{rcl} K_0& =&\displaystyle \oint \frac{u^2+v^2}{2} \, dx \, , \end{array} \end{aligned} $$
(104)
$$\displaystyle \begin{aligned} \begin{array}{rcl} Z_1& =&\displaystyle \frac{\alpha \sqrt{\varepsilon}}{2\sqrt{2}} \oint \frac{u^3+v^3 }{3} \, dx \, , \end{array} \end{aligned} $$
(105)
$$\displaystyle \begin{aligned} \begin{array}{rcl} Z_2& =&\displaystyle \Big(\frac{\beta}{\alpha^2}-\frac{1}{2} \Big) \frac{\alpha^2 \varepsilon}{4} \oint \frac{u^4+v^4}{4} \, dx \, . {} \end{array} \end{aligned} $$
(106)

The equations of motion associated with this normal form Hamiltonian consist of a pair of uncoupled, generalised Burgers equations, whose solution displays a gradient catastrophe at a finite shock time \(t_s\). It has recently been proved that the Fourier energy spectrum of such a system displays a power law decay characterised by the universal exponent −8/3 exactly at \(t_s\). Such a prediction fits very well the numerical spectrum of the FPU system [21]. Of course, the dynamics on times longer than \(t_s\) cannot be described in this limit and dispersive effects must be re-included, in agreement with the grading (85).

5.2 Water Waves

Consider an ideal fluid occupying, at rest, the domain

$$\displaystyle \begin{aligned} \varOmega_{0,L}\;:=\; \big\{(x,z) \in [0,L] \times \mathbb{R} \,:\, -h<z<0 \big\} \, , \end{aligned} $$
(107)

with L > 0. We study the evolution of the free surface under the action of gravity, in the irrotational regime. Thus, given a periodic function \(\eta :[0,L] \to \mathbb {R}\), we define the domain

$$\displaystyle \begin{aligned} \varOmega_{\eta,L} \;:=\;\big\{(x,z) \in [0,L] \times \mathbb{R} \, : \, -h<z<\eta(x) \big\} \, . \end{aligned} $$
(108)

Irrotationality makes it possible to describe the velocity u of the fluid as the gradient of a function ϕ, called the velocity potential: u = ∇ϕ. This problem admits a Hamiltonian formulation [14, 15, 38], and the conjugate variables are the wave profile η(x) and the trace of the velocity potential at the free surface:

$$\displaystyle \begin{aligned} \psi(x) \;:=\; \phi(x,\eta(x)) \, . \end{aligned} $$
(109)

The Hamiltonian of the system is

$$\displaystyle \begin{aligned} H(\eta,\psi) \;=\; \oint \Big(\frac{1}{2}g \eta^2+\frac{1}{2} \psi G(\eta) \psi \Big) \, dx, \end{aligned} $$
(110)

where G(η) is the Dirichlet-to-Neumann operator, defined as follows. Given a function ψ(x), consider the boundary value problem

$$\displaystyle \begin{aligned} \begin{array}{rcl} \varDelta \phi & =&\displaystyle 0 \, , \qquad (x,z) \in \varOmega_{\eta,L} \end{array} \end{aligned} $$
(111)
$$\displaystyle \begin{aligned} \begin{array}{rcl} \phi_z\Big|{}_{z=-h}& =&\displaystyle 0 \end{array} \end{aligned} $$
(112)
$$\displaystyle \begin{aligned} \begin{array}{rcl} \phi(0) & =&\displaystyle \phi(L) {} \end{array} \end{aligned} $$
(113)
$$\displaystyle \begin{aligned} \begin{array}{rcl} \phi \Big|{}_{z=\eta(x)} & =&\displaystyle \psi \end{array} \end{aligned} $$
(114)

and let ϕ be its solution. Then

$$\displaystyle \begin{aligned} G(\eta)\psi\;:=\; \sqrt{1+\eta_x^2} \partial_n \phi\big|{}_{z=\eta(x)} =(\phi_z-\eta_x\phi_x)\big|{}_{z=\eta(x)}, \end{aligned} $$
(115)

where \(\partial_n\) denotes the derivative in the direction normal to z = η(x).

We are interested in solutions of the form

$$\displaystyle \begin{aligned} \eta(x)\;=\;\mu^2 h^3 \sqrt{2} \tilde{\eta}(\mu x) \, , \qquad \psi(x)\;=\; \mu \sqrt{2 g h} h^2 \tilde{\psi}(\mu x) \, , \qquad \mu=1/L \ll 1 \, ,\end{aligned} $$
(116)

which corresponds to a canonical transformation once time is rescaled to

$$\displaystyle \begin{aligned} \tilde{t} \;=\; \frac{t}{\mu \sqrt{gh}} \, \end{aligned} $$
(117)

and the physical space becomes the torus of unit length.

Note that the dependence on η of the Dirichlet-to-Neumann operator causes the Hamiltonian (110) not to fall within the class of mechanical Hamiltonians of Sect. 4.3.

The small parameter of the theory is \(\lambda=(h\mu)^2\). Expanding the Hamiltonian in λ one gets (see Footnote 1)

$$\displaystyle \begin{aligned} H\;=\;H_0+\lambda H_1 + \lambda^2 H_2 + O(\lambda^3) \end{aligned} $$
(118)

with \(H_0\) of the same form as (92), but with renamed variables:

$$\displaystyle \begin{aligned} \begin{array}{rcl} H_0 & =&\displaystyle \oint \frac{\tilde{\eta}^2+\tilde{\psi}_y^2}{2} \, dy \, , \end{array} \end{aligned} $$
(119)
$$\displaystyle \begin{aligned} \begin{array}{rcl} H_1& =&\displaystyle \frac{1}{2} \oint \Big(-\frac{1}{3}\tilde{\psi}_{yy}^2+\sqrt{2} \tilde{\eta} \tilde{\psi}_y^2 \Big) \, dy \, , \end{array} \end{aligned} $$
(120)
$$\displaystyle \begin{aligned} \begin{array}{rcl} H_2 & =&\displaystyle \frac{1}{2} \oint \Big(\frac{2}{15} \tilde{\psi}_{yyy}^2-\sqrt{2} \tilde{\eta} \tilde{\psi}_{yy}^2 \Big) \, dy \, . \end{array} \end{aligned} $$
(121)
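The numerical coefficients of the quadratic terms in \(\tilde\psi\) in (119)–(121), namely 1, −1/3 and 2/15, are those of the Taylor expansion \(\tanh x=x-\frac{x^3}{3}+\frac{2x^5}{15}+\dots\). This can be seen in the flat-surface case η = 0, where the boundary value problem (111)–(114) is solved mode by mode and G(0) acts on \(\cos(kx)\) as multiplication by \(k\tanh(kh)\). The following sketch illustrates this in the unscaled variables; the function names, the sample values and the finite-difference checks are illustrative assumptions, not part of the original construction.

```python
import math

def potential(x, z, k, h):
    """Harmonic function with phi = cos(k x) on z = 0 and phi_z = 0 on z = -h."""
    return math.cosh(k * (z + h)) / math.cosh(k * h) * math.cos(k * x)

def flat_dtn_symbol(k, h):
    """Symbol of G(0): phi_z at z = 0 divided by the boundary datum cos(k x)."""
    return k * math.tanh(k * h)

def long_wave_symbol(k, h):
    """Taylor polynomial of k tanh(k h) up to order (k h)^5."""
    return h * k ** 2 - h ** 3 * k ** 4 / 3.0 + 2.0 * h ** 5 * k ** 6 / 15.0

def laplacian_fd(x, z, k, h, d=1e-3):
    """Five-point finite-difference Laplacian of the potential."""
    f = lambda a, b: potential(a, b, k, h)
    return (f(x + d, z) + f(x - d, z) + f(x, z + d) + f(x, z - d)
            - 4.0 * f(x, z)) / d ** 2

def dz_fd(x, z, k, h, d=1e-3):
    """Central finite-difference approximation of phi_z."""
    return (potential(x, z + d, k, h) - potential(x, z - d, k, h)) / (2.0 * d)
```

For kh small the difference between `flat_dtn_symbol` and `long_wave_symbol` is bounded by the next term of the series, \(17h^7k^8/315\).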

Note that the Hamiltonian contains terms with the product of \(\tilde {\eta }\) and \(\tilde {\psi }\), and thus does not fit the definition of mechanical Hamiltonian given above. Nevertheless, as for the FPU problem, it is convenient to use the characteristic variables (u, v) defined by

$$\displaystyle \begin{aligned} \begin{array}{rcl} \tilde{\eta}(y,t) & =&\displaystyle \frac{u(y,t)+v(y,t)}{\sqrt{2}} \, , \end{array} \end{aligned} $$
(122)
$$\displaystyle \begin{aligned} \begin{array}{rcl} \tilde{\psi}_y(y,t) & =&\displaystyle \frac{u(y,t)-v(y,t)}{\sqrt{2}} \end{array} \end{aligned} $$
(123)

We then obtain

$$\displaystyle \begin{aligned} \begin{array}{rcl} K_0& =&\displaystyle \oint \frac{u^2+v^2}{2} \, dy \, , \end{array} \end{aligned} $$
(124)
$$\displaystyle \begin{aligned} \begin{array}{rcl} H_1& =&\displaystyle \oint \Big(-\frac{1}{12}(u_y^2+v_y^2)+\frac{u^3+v^3}{4}+\frac{u_yv_y}{6}-\frac{u^2v+uv^2}{4} \Big) \, dy \, , \end{array} \end{aligned} $$
(125)
$$\displaystyle \begin{aligned} \begin{array}{rcl} H_2& =&\displaystyle \oint\Big( \frac{1}{2} \frac{u_{yy}^2+v_{yy}^2}{15}-\frac{1}{4} (u u_y^2+ v v_y^2)-\frac{1}{15} u_{yy} v_{yy} \end{array} \end{aligned} $$
(126)
$$\displaystyle \begin{aligned} \begin{array}{rcl} & \, &\displaystyle -\frac{1}{4} (uv^2_y-2 u u_y v_y+vu_y^2-2vu_yv_y) \Big) \, dy \, . \end{array} \end{aligned} $$
(127)
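The passage from (120)–(121) to (125)–(127) is a pointwise substitution of (122)–(123) into the Hamiltonian densities, with no integration by parts involved. A minimal sketch of a numerical check at random sample points follows; the function names, the sampling range and the seed are illustrative.

```python
import math
import random

SQRT2 = math.sqrt(2.0)

def h1_original(eta, psi_y, psi_yy):
    # Density of (120).
    return 0.5 * (-(psi_yy ** 2) / 3.0 + SQRT2 * eta * psi_y ** 2)

def h1_characteristic(u, v, uy, vy):
    # Density of (125).
    return (-(uy ** 2 + vy ** 2) / 12.0 + (u ** 3 + v ** 3) / 4.0
            + uy * vy / 6.0 - (u * u * v + u * v * v) / 4.0)

def h2_original(eta, psi_yy, psi_yyy):
    # Density of (121).
    return 0.5 * ((2.0 / 15.0) * psi_yyy ** 2 - SQRT2 * eta * psi_yy ** 2)

def h2_characteristic(u, v, uy, vy, uyy, vyy):
    # Density of (126)-(127).
    return (0.5 * (uyy ** 2 + vyy ** 2) / 15.0
            - 0.25 * (u * uy ** 2 + v * vy ** 2)
            - uyy * vyy / 15.0
            - 0.25 * (u * vy ** 2 - 2 * u * uy * vy + v * uy ** 2 - 2 * v * uy * vy))

def max_mismatch(samples=1000, seed=0):
    """Largest pointwise difference between the two forms of each density."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(samples):
        u, v, uy, vy, uyy, vyy = (rng.uniform(-1.0, 1.0) for _ in range(6))
        eta = (u + v) / SQRT2            # (122)
        psi_y = (u - v) / SQRT2          # (123)
        psi_yy = (uy - vy) / SQRT2       # derivative of (123)
        psi_yyy = (uyy - vyy) / SQRT2    # second derivative of (123)
        d1 = abs(h1_original(eta, psi_y, psi_yy) - h1_characteristic(u, v, uy, vy))
        d2 = abs(h2_original(eta, psi_yy, psi_yyy)
                 - h2_characteristic(u, v, uy, vy, uyy, vyy))
        worst = max(worst, d1, d2)
    return worst
```

Treating u, v and their derivatives as independent numbers is legitimate here precisely because the identity holds at the level of densities.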

Applying the techniques of Theorem 2 one has

Proposition 12

Within the normal form procedure outlined above, Hamiltonian (118) can be mapped into the normal form

$$\displaystyle \begin{aligned} \tilde{H} \;=\; \tilde{K}_0+\lambda Z_1+ \lambda^2 Z_2+\dots \end{aligned} $$
(128)

with

$$\displaystyle \begin{aligned} \begin{array}{rcl} Z_1& =&\displaystyle \oint \Bigg[\frac{\tilde{u}^3+\tilde{v}^3}{4}-\frac{1}{12} \big(\tilde{u}_y^2+\tilde{v}_y^2\big)\Bigg] \, d y {} \, , \end{array} \end{aligned} $$
(129)
$$\displaystyle \begin{aligned} \begin{array}{rcl} Z_2& =&\displaystyle \oint \Bigg[\frac{1}{64} \big(\tilde{u}^4+\tilde{v}^4 \big)+\frac{7}{48} \big(\tilde{u}^2 \tilde{u}_{yy}+\tilde{v}^2 \tilde{v}_{yy} \big) + \frac{29}{720} \big( \tilde{u}_{yy}^2+\tilde{v}_{yy}^2\big) \Bigg] \, d y \\ & &\displaystyle \quad + \frac{1}{8} \langle \tilde{u}^2 \rangle \langle \tilde{v}^2 \rangle \, .{} \end{array} \end{aligned} $$
(130)

Proof

This result is proved by computing the normal form according to Proposition 8.

First perturbative step: We use (61) to average \(H_1\) along the flow of \(H_0\), obtaining the expression for \(Z_1\) in (129). The Hamiltonian generating the canonical transformation can be computed using (62):

$$\displaystyle \begin{aligned} G_1\;=\; -\oint \bigg[\frac{1}{12}v_yu+\frac{1}{8}u^2 \partial_y^{-1} v - \frac{1}{8} v^2 \partial_y^{-1} u \bigg] \, dy \, . \end{aligned}$$

We can therefore compute the \(L^2\)-gradient of \(G_1\) and of \(H_1-Z_1\), obtaining

$$\displaystyle \begin{aligned} \frac{\delta G_1}{\delta u} \;&=\; -\frac{1}{12} v_y - \frac{1}{4} u \partial_y^{-1} v - \frac{1}{8} \partial_y^{-1} v^2 \, ,\\ \frac{\delta G_1}{\delta v} \;&=\; \frac{1}{12} u_y + \frac{1}{8} \partial_y^{-1} u^2 + \frac{1}{4} v \partial_y^{-1} u \, , \end{aligned}$$
$$\displaystyle \begin{aligned} \frac{\delta (H_1-Z_1)}{\delta u} \;&=\; - \frac{1}{6} v_{yy}-\frac{1}{2} uv -\frac{1}{4} v^2 \, , \\ \frac{\delta (H_1-Z_1)}{\delta v} \;&=\; - \frac{1}{6} u_{yy}-\frac{1}{2} uv - \frac{1}{4} u^2 \, . \end{aligned}$$

Second perturbative step: We use (61) to average \(H_2\) and \(\{H_1-Z_1,G_1\}\), obtaining:

$$\displaystyle \begin{aligned} \big\langle \{H_1,G_1\} \big\rangle_{0} \;&=\; \frac{1}{8} \oint \Big[\frac{1}{9}(u_{yy}^2+v_{yy}^2)+\frac{1}{3}(u^2 u_{yy} +v^2 v_{yy})+\frac{1}{4}(u^4+v^4) \Big] \, dy \\ &\qquad + \frac{1}{4} \langle u^2 \rangle \langle v^2 \rangle \, , \end{aligned}$$
$$\displaystyle \begin{aligned} \langle H_2 \rangle_0 \;=\; \oint \Big[\frac{u_{yy}^2+v_{yy}^2}{30}+\frac{1}{8} \big( u^2 u_{yy}+v^2 v_{yy} \big) \Big] \, dy \, . \end{aligned}$$

We obtain \(Z_2=\langle H_2 \rangle _0 + \frac {1}{2} \langle \{H_1,G_1 \} \rangle _0\) that is precisely (130). □
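The last step is plain rational arithmetic on the coefficients of each monomial, and can be reproduced exactly. The sketch below assumes that the nonlocal term \(\langle u^2\rangle\langle v^2\rangle\) enters \(\langle\{H_1,G_1\}\rangle_0\) with a plus sign and coefficient 1/4; the dictionary keys are merely labels for the monomials (the same coefficients hold for the corresponding terms in v).

```python
from fractions import Fraction as F

# Coefficients read off from <H_2>_0.
avg_H2 = {
    "u_yy^2": F(1, 30),
    "u^2 u_yy": F(1, 8),
    "u^4": F(0),
    "<u^2><v^2>": F(0),
}

# Coefficients read off from <{H_1, G_1}>_0 (local terms carry the prefactor 1/8).
avg_bracket = {
    "u_yy^2": F(1, 8) * F(1, 9),
    "u^2 u_yy": F(1, 8) * F(1, 3),
    "u^4": F(1, 8) * F(1, 4),
    "<u^2><v^2>": F(1, 4),
}

# Z_2 = <H_2>_0 + (1/2) <{H_1, G_1}>_0, monomial by monomial.
Z2 = {key: avg_H2[key] + F(1, 2) * avg_bracket[key] for key in avg_H2}
```

The resulting values 29/720, 7/48, 1/64 and 1/8 are the coefficients appearing in (130).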

As for the FPU lattice, the equations of motion of these Hamiltonians are not in the Korteweg-de Vries hierarchy. Exactly as in the previous case, Kodama’s theory solves the problem: a close-to-identity change of variables maps the Hamiltonian into

$$\displaystyle \begin{aligned} H_{\mathrm{NF}}(u,v)\;=\;K_0(u)+\lambda K_1(u)+\lambda^2 c_2 K_2(u) +K_0(v)+\lambda K_1(v)+\lambda^2 c_2 K_2(v) \,\end{aligned} $$
(131)

with \(c_2\) being some explicit constant.

In case μ is a small free parameter not related to L and the water waves are studied on the whole real line (that is, \(x \in \mathbb {R}\), thus imposing \(\lim_{|x|\to\infty}\phi(x)=0\) instead of ϕ(0) = ϕ(L) in (113)), the following result holds.

Theorem 4 (Bambusi [3])

For any s′, there exist \(\lambda_*>0\) and \(s, s''\) s.t., if \(0<\lambda<\lambda_*\), then there exists a map \(T_\lambda :B_1^s \to W^{s'',1} \times W^{s'',1}\) with the following properties:

  1. (i)

    \(\sup _{(u,v) \in B_1^s} \Vert T_\lambda (u,v)-(u,v) \Vert _{W^{s'',1} \times W^{s'',1}} \leq C \lambda \),

  2. (ii)

    Let \(I_\lambda\) be an interval containing the origin and let \(z(\cdot )=(u(\cdot ),v(\cdot )) \in C^1(I_\lambda ;B_1^s)\) be a solution of the Hamiltonian system (131) with \(c_2=\frac {299}{389}\). Define

    $$\displaystyle \begin{aligned} z_a=(u_a,v_a)\;:=\;T_\lambda(u,v) \, . \end{aligned} $$
    (132)

    Then there exists \(R \in C^1(I_\lambda ,W^{s',2}\times W^{s',2})\) s.t. one has

    $$\displaystyle \begin{aligned} \dot{z}_a \;=\; J \nabla H (z_a(t))+\lambda^3 R(t) \, \qquad \forall t \in I_\lambda \, , \end{aligned} $$
    (133)

    where H is the Hamiltonian of water waves problem in the variables u and v.

A non-trivial piece of dynamical information that one can obtain from this theorem concerns the quality of the approximation provided by the normal form dynamics. That is, for smooth enough initial data, it is possible to go back to the original non-scaled variables and to get the estimate on the wave profile

$$\displaystyle \begin{aligned} \sup_{|t| \leq T^*/{\mu^3 \sqrt{gh}}} \Vert \eta(t)-\eta_a(t) \Vert_{L^\infty} \leq C\mu^6 \, . \end{aligned} $$
(134)

Note that the difference between wave profiles can be proved to be small only for times over which the second perturbative correction is negligible. Thus, as for the FPU system, an interesting open problem is to understand which results can hold on longer time scales.

We are confident that these two results can be proved also in the periodic setting presented above.

6 Conclusions and Open Problems

In the framework of Hamiltonian field theory, the continuum limit of the FPU chain for long-wavelength excitations and the Hamiltonian of water waves belong to the same wider class of perturbations of the wave equation. This is not the case for other lattice models, such as the Klein-Gordon lattice, for which one has to take into account the presence of the mass term.

Recently, the analysis of lattice models using the machinery of water waves has received a certain interest, especially for systems in two spatial dimensions [22, 27] or for the analysis of higher-order normal forms for one-dimensional systems [23].

By comparison, water waves are currently a very active research topic. The main goals in the field are results on well-posedness, regularity of solutions, and the existence of quasi-periodic or traveling wave solutions (see e.g. the recent results [2, 11, 12, 17]).

In this sense, many questions remain open and can hopefully be addressed in the near future:

  • The analysis at second order performed in Subsec. 5.1 does not allow us to conclude that the dynamics of the integrable system is close to the dynamics of the original system. Actually, it is known how to obtain a result on the dynamics, but only over times on which the effect of the second-order term is invisible. One of the major open problems is to understand how to go beyond the time scale of Theorem 3.

  • From the point of view of statistical physics, the regime in which Theorem 3 is proved is not significant, since the specific energy of the system scales as \(\varepsilon\sim 1/N^4\). The thermodynamic limit would require ε to be constant and independent of the size of the system. In terms of the normal form construction, this corresponds to a zero-dispersion limit of the Korteweg-de Vries equation. It would be interesting to study the effect of this limit.

  • Last, little attention has been given to the analysis of the FPU model when the dispersion is neglected (see [34]). An interesting question to address is whether Eqs. (103)–(106) can be used to explain some properties of the dynamics, especially for short time scales, low Fourier modes or in the regime of high specific energy.