1 Introduction

Long-crested water waves propagating shoreward are commonplace in the shallow-water zone of large bodies of water. Waves of this general form are easily generated in laboratory settings as well. If a standard xyz-coordinate system is adopted in which z increases in the direction opposite to that in which gravity acts, such waves are often taken to propagate along the x-axis, say in the direction of increasing values, and to be independent of the y-coordinate. In this case, if dissipation and surface tension effects are ignored and the fluid is assumed to be incompressible and the motion irrotational, the standard representation of the velocity field and the free surface is provided by the Euler equations for the motion of a perfect fluid with the boundary behavior at the free surface determined by the Bernoulli condition. On typical geophysical length scales, these equations provide reasonably good approximations of what is actually observed in nature. In detail, this system has the form

$$\begin{aligned} {\left\{ \begin{array}{ll} \Delta {\varphi } = 0,\qquad \qquad &{} 0< z < h_0 + \eta (x,t), \\ \partial _z\varphi = 0, &{} z = 0, \\ \partial _t \eta = \partial _z \varphi - \partial _x{\eta } \cdot \partial _x{\varphi }, &{} z = h_0 + \eta (x,t), \\ \partial _t \varphi = -g\eta - \frac{1}{2} (\partial _x\varphi )^2 - \frac{1}{2} (\partial _z \varphi )^2, &{} z = h_0 + \eta (x,t). \end{array}\right. } \end{aligned}$$
(1.1)

Here, the bottom is taken to be flat, horizontal and located at \(z = 0\), though theory with a slowly varying bottom can easily be derived along the same lines (see Bona and Chen 1997). The undisturbed depth is \(h_0\) while the dependent variable, \(\eta = \eta (x, t)\), is the deviation of the free surface from its rest position \((x,h_0)\) at time t. Thus, the depth of the water column over the spatial point (x, 0) on the bottom, at time t, is \(h(x, t) = h_0+\eta (x, t)\). The dependent variable \(\varphi = \varphi (x, z, t)\) is the velocity potential, which is defined throughout the flow domain and whose existence owes to the fact that the fluid is incompressible and the motion irrotational. Hence, \((u(x, z, t), v(x, z, t)) = \nabla \varphi (x, z, t)\) is the velocity field at the point (x, z) in the flow domain at time t. Here, \(\nabla \) denotes the gradient with respect to the spatial variables only. Of course, for this formulation to make sense, it must be the case that the free surface remains a graph over the bottom, a presumption that underlies the developments here. It deserves remark that the system (1.1) can be rewritten in a Hamiltonian form, as Zakharov (1968) pointed out almost 50 years ago.

Beginning already in the first half of the nineteenth century, simpler models have been posited, in part because the approximation using (1.1) is both analytically and computationally recalcitrant. Note in particular that the location of the free surface is part of the problem, so that two boundary conditions at the free surface are needed for its determination. Observe also that the temporal derivatives only appear in the boundary conditions, making the problem further nonstandard. Moreover, the precision one might hope for from using the Euler equations is not always useful in practice. If the input data have significant error, there may be little point in the higher accuracy afforded by the Euler system (1.1) as opposed to cruder approximations.

The largest steps forward in the nineteenth century study of approximate models were taken by Boussinesq in the 1870s (see especially his opus Boussinesq 1877). The coupled systems of equations which now bear his name are well known to theoreticians and they and their relatives find frequent use in practical situations (see, e.g., Boczar-Karakiewicz et al. 2003; Bona and Chen 1997). In addition to the presumption that the wave motion is long-crested, so sensibly one-dimensional, they subsist on the assumption that the wave amplitudes and wavelengths encountered in the evolution are, respectively, small and large relative to the undisturbed depth \(h_0\) of the liquid over the horizontal, featureless bottom. More precisely, their derivation needs that

$$\begin{aligned} \alpha = \frac{A}{h_0}\ll 1, \qquad \beta = \frac{h_0^2}{l^2}\ll 1, \qquad S = \frac{\alpha }{\beta } = \frac{Al^2}{h_0^3}\approx 1. \end{aligned}$$
(1.2)

Here, A is a typical amplitude of the wave motion in question while l is a typical wavelength. The assumption that the Stokes’ number \(S = \frac{\alpha }{\beta }\) is of order one effectively means that nonlinear and dispersive effects are balanced. Boussinesq also derived a model, now called the Korteweg–de Vries (KdV) equation, which was a specialization of the coupled systems, formally valid for waves traveling only in one direction, say in the direction of increasing values of x.
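As a rough illustration of the size conditions (1.2), the following Python sketch evaluates \(\alpha \), \(\beta \) and S for a given amplitude, wavelength and depth. The function name `boussinesq_parameters` and the sample values are assumptions made for this illustration only; they do not come from the text.

```python
def boussinesq_parameters(amplitude, wavelength, depth):
    """Return (alpha, beta, S) as defined in (1.2).

    All three inputs must be positive and given in the same unit of length.
    """
    if min(amplitude, wavelength, depth) <= 0:
        raise ValueError("amplitude, wavelength and depth must be positive")
    alpha = amplitude / depth              # nonlinearity parameter A / h0
    beta = depth ** 2 / wavelength ** 2    # dispersion parameter h0^2 / l^2
    stokes = alpha / beta                  # Stokes number A l^2 / h0^3
    return alpha, beta, stokes


# Hypothetical wave: 0.1 m amplitude, 10 m wavelength, 2 m undisturbed depth.
alpha, beta, stokes = boussinesq_parameters(0.1, 10.0, 2.0)
print(f"alpha = {alpha:.3f}, beta = {beta:.3f}, S = {stokes:.3f}")
```

For these sample values both \(\alpha \) and \(\beta \) are small while S is of order one, which is the regime the derivation presumes.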

Almost a century later, Peregrine (1966) and Benjamin et al. (1972) returned to Boussinesq’s unidirectional model

$$\begin{aligned} \eta _t + \eta _x + \frac{3}{2} \eta \eta _x + \frac{1}{6} \eta _{xxx} = 0 \end{aligned}$$
(1.3)

(the Korteweg–de Vries equation, commonly referred to as the KdV equation) and derived an equivalent version known as the regularized long-wave equation (RLW equation) or the BBM equation. In terms of the dependent variable \(\eta (x,t)\), this equation takes the form

$$\begin{aligned} \eta _t + \eta _x + \frac{3}{2} \eta \eta _x - \frac{1}{6} \eta _{xxt} = 0 \end{aligned}$$
(1.4)

in the unscaled, non-dimensional variables

$$\begin{aligned} x = \frac{1}{h_0} \, \bar{x}, \quad t = \sqrt{\frac{g}{h_0}} \, \bar{t} \quad \mathrm{and} \quad \eta = \frac{1}{h_0}\, \bar{\eta }. \end{aligned}$$

Here, the constant g is the acceleration due to gravity while \(\bar{x}, \bar{t}\) and \(\bar{\eta }\) are laboratory or field variables, all measured in the unit of length consistent with the values of \(h_0\) and g.
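For orientation, undoing this change of variables in (1.3) returns the KdV equation in laboratory variables. The following is a short worked check; the shorthand \(c_0 = \sqrt{gh_0}\) for the linear long-wave speed is introduced here only for convenience. Since \(\partial _x = h_0\, \partial _{\bar{x}}\) and \(\partial _t = \sqrt{h_0/g}\, \partial _{\bar{t}}\), multiplying (1.3) by \(c_0\) gives

$$\begin{aligned} \bar{\eta }_{\bar{t}} + c_0\, \bar{\eta }_{\bar{x}} + \frac{3}{2} \frac{c_0}{h_0}\, \bar{\eta }\, \bar{\eta }_{\bar{x}} + \frac{1}{6}\, c_0 h_0^2\, \bar{\eta }_{\bar{x}\bar{x}\bar{x}} = 0. \end{aligned}$$

The same substitution in (1.4) replaces the last term by \(-\frac{1}{6} h_0^2\, \bar{\eta }_{\bar{x}\bar{x}\bar{t}}\).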

Models like the BBM and KdV equations are known to provide good approximations of unidirectional solutions of the full water wave problem (1.1) on the so-called Boussinesq timescale, \(\frac{1}{\beta } \approx \frac{1}{\alpha }\) (see Alazman et al. 2006; Bona et al. 2005, 1983). They are also known to predict laboratory observations with reasonable accuracy on similar timescales (see Bona et al. 1981; Hammack 1973; Hammack and Segur 1974).

In some applications, notably coastal engineering and ocean wave modeling, the waves need to be followed on timescales longer than the Boussinesq timescale (for example, see Boczar-Karakiewicz et al. 2003 and references therein). In such situations, a higher-order approximation to the water wave problem might prove to be useful as it would be formally valid on the square \(\frac{1}{\beta ^2} \approx \frac{1}{\alpha ^2} \) of the long, Boussinesq timescale. Such models have appeared in the literature before (see Olver 1984a, b for early examples). It is our purpose here to put forward a class of such higher-order correct, unidirectional evolution equations and to provide analysis relating to the fundamental issue of Hadamard well-posedness for a subclass. Models will be isolated that are not only formally second-order correct approximations of the full, two-dimensional water wave problem, but also possess a Hamiltonian structure. As Olver pointed out in his pioneering work (Olver 1984b), this helpful aspect is more difficult to attain in higher-order models that formally are faithful to the overlying Euler equations than in the first-order correct KdV or BBM models. Indeed, the fifth-order model appearing in Olver (1984b) does not in fact have a Hamiltonian structure, as Olver points out.

The notion of well-posedness which is featured here was put forward by Hadamard more than a century ago in a lecture the well-known French mathematician gave at Princeton University (see Hadamard 1902). In his conception, a problem is well-posed subject to given auxiliary data when there corresponds a unique solution which depends continuously on variations in the specified supplementary data. Hadamard points out that if the problem is lacking these properties, it will probably be useless in practical applications. Auxiliary data brought from real-world situations typically features at least a small amount of error. If the model were to respond discontinuously to these small perturbations, the reproducibility of the model predictions in laboratory and field settings would be compromised and likewise their use in real situations would be suspect.

To clarify the role of the size restrictions (1.2), it is often helpful to rescale the variables. For example, in the context of Eq. (1.4), change variables by letting \(\eta \hookrightarrow \alpha \eta \), and \((x,t) \hookrightarrow \sqrt{\beta }(x,t)\). In the new variables, \(\eta \) and its first few partial derivatives with respect to x and t are presumed to be of order one and the equation takes the form

$$\begin{aligned} \eta _t + \eta _x + \frac{3}{2} \alpha \eta \eta _x - \frac{1}{6} \beta \eta _{xxt} = 0. \end{aligned}$$
(1.5)

In this scaling, the role of the small parameters is more apparent. Moreover, the error term made in the approximation, which is set to zero in (1.5), is quadratic in the small parameters \(\alpha \) and \(\beta \). Because of this latter aspect, even though the solution and its derivatives remain of order one, the ignored error can accumulate and have an order-one effect on the solution on a timescale of size \(\frac{1}{\alpha ^2} \approx \frac{1}{\beta ^2}\). Hence the need for a higher-order correct model if longer spatial distances are in question.
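The passage from (1.4) to (1.5) can be checked in a few lines. As a sketch, write the solution of (1.4) as \(\eta (x,t) = \alpha N(X,T)\) with \(X = \beta ^{1/2} x\) and \(T = \beta ^{1/2} t\), where the names N, X and T are used only in this illustration for the order-one quantities. The chain rule gives

$$\begin{aligned} \eta _t = \alpha \beta ^{1/2} N_T, \qquad \eta _x = \alpha \beta ^{1/2} N_X, \qquad \eta \eta _x = \alpha ^2 \beta ^{1/2} N N_X, \qquad \eta _{xxt} = \alpha \beta ^{3/2} N_{XXT}, \end{aligned}$$

so that (1.4), after division by \(\alpha \beta ^{1/2}\), becomes

$$\begin{aligned} N_T + N_X + \frac{3}{2} \alpha N N_X - \frac{1}{6} \beta N_{XXT} = 0, \end{aligned}$$

which is (1.5) once the order-one variables are renamed \(\eta \), x and t.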

The starting point of our derivation of higher-order KdV–BBM-type equations is the paper Bona et al. (2002) (and see also the earlier note Bona and Chen 1997) where a several-parameter variant of the classical Boussinesq system of two coupled equations was derived. These Boussinesq systems are derived without the assumption of one-way propagation and can therefore countenance long-crested waves propagating in both directions. The theory in Bona et al. (2002) assumes incompressibility, irrotationality, long-crestedness and the size conditions enunciated in (1.2). Boussinesq systems were formally derived at both first and second order in the small parameters \(\alpha \) and \(\beta \). In dimensionless, scaled variables as appearing in (1.5), the family of formally first-order correct systems has the form

$$\begin{aligned} {\left\{ \begin{array}{ll} \eta _t +w_x +\alpha (w\eta )_x + \beta \big (aw_{xxx}-b\eta _{xxt}\big )=0,\\ w_t +\eta _x +\alpha ww_x +\beta \big (c\eta _{xxx} -dw_{xxt}\big ) =0. \end{array}\right. } \end{aligned}$$
(1.6)

The variable \(\eta \) is proportional to the deviation of the free surface from its rest position at the point x at time t, as it was in (1.4), while \(w = w(x,t)\) is proportional to the horizontal velocity at a certain depth \(z_0\), say, at the point \((x,z_0,t)\) in the flow domain. (The velocity w is scaled by \(\sqrt{gh_0}\) to make it non-dimensional and then by \(\alpha \) to make it of order one.) The constants a, b, c and d are not arbitrary. They satisfy the relations

$$\begin{aligned} {\left\{ \begin{array}{ll} a=\frac{1}{2}\Big (\theta ^2-\frac{1}{3}\Big )\lambda , \qquad &{} b=\frac{1}{2}\Big (\theta ^2-\frac{1}{3}\Big )(1-\lambda ),\\ c=\frac{1}{2}(1-\theta ^2)\mu , \qquad \quad &{}d=\frac{1}{2}(1-\theta ^2)(1-\mu ), \end{array}\right. } \end{aligned}$$
(1.7)

so that \(a + b + c + d = \frac{1}{3}\). In the same, order-one, independent and dependent variables, the second-order correct systems are

$$\begin{aligned} {\left\{ \begin{array}{ll} \begin{aligned} &{}\eta _t +w_x + \beta \left( a w_{xxx} -b\eta _{xxt}\right) +\beta ^2\left( a_1w_{xxxxx} + b_1\eta _{xxxxt}\right) \\ &{}\quad = -\alpha (\eta w)_x + \alpha \beta \left( b(\eta w)_{xxx} -\left( a+b-\frac{1}{3}\right) (\eta w_{xx})_x\right) ,\\ &{}w_t+\eta _x+\beta \left( c\eta _{xxx}-d w_{xxt}\right) + \beta ^2\left( c_1\eta _{xxxxx} + d_1w_{xxxxt} \right) \\ &{}\quad =-\alpha ww_x +\alpha \beta \left( (c+d) ww_{xxx}-c(ww_x)_{xx} -(\eta \eta _{xx})_x +(c+d-1) w_xw_{xx}\right) , \end{aligned} \end{array}\right. } \end{aligned}$$
(1.8)

where the additional constants \(a_1, b_1, c_1, d_1\) are

$$\begin{aligned} {\left\{ \begin{array}{ll} \begin{aligned} a_1&{}= -\frac{1}{4}\Big (\theta ^2-\frac{1}{3}\Big )^2(1-\lambda )+\frac{5}{24}\Big (\theta ^2-\frac{1}{5}\Big )^2\lambda _1,\\ b_1&{} = -\frac{5}{24}\Big (\theta ^2-\frac{1}{5}\Big )^2(1-\lambda _1),\\ c_1&{} =\frac{5}{24}(1-\theta ^2)\Big (\theta ^2-\frac{1}{5}\Big )(1-\mu _1),\\ d_1&{}=-\frac{1}{4}\big (1-\theta ^2\big )^2\mu -\frac{5}{24}(1-\theta ^2)\Big (\theta ^2-\frac{1}{5}\Big )\mu _1. \end{aligned} \end{array}\right. } \end{aligned}$$
(1.9)

The parameter \(\theta \) has physical significance. It is determined by the height above the bottom at which the horizontal velocity is specified initially and whose evolution is being followed. In the earlier notation, \(\theta = 1-z_0\). Because the vertical variable is scaled by the undisturbed depth \(h_0\) in these descriptions, \(\theta \) must lie in the interval [0, 1]. The other values, \(\lambda , \mu , \lambda _1\) and \(\mu _1\) are modeling parameters and can in principle take any real value. Thus, the coefficients appearing in the higher-order Boussinesq systems form a restricted, eight-parameter family. Notice that if terms quadratic in \(\alpha \) and \(\beta \) are dropped, the second-order system (1.8) reduces to the first-order system (1.6).
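The formulas (1.7) and (1.9) are easy to evaluate exactly. The sketch below does so in rational arithmetic. The function name `boussinesq_coefficients`, the choice of passing \(\theta ^2\) rather than \(\theta \) (so that values such as 2/3 stay rational) and the sample parameter values are all assumptions made for illustration.

```python
from fractions import Fraction


def boussinesq_coefficients(theta_sq, lam, mu, lam1, mu1):
    """Evaluate the constants of (1.7) and (1.9) in exact rational arithmetic.

    theta_sq is theta**2 and must lie in [0, 1]; the remaining arguments
    are the modeling parameters lambda, mu, lambda_1 and mu_1.
    """
    s, lam, mu, lam1, mu1 = (Fraction(v) for v in (theta_sq, lam, mu, lam1, mu1))
    if not 0 <= s <= 1:
        raise ValueError("theta_sq must lie in [0, 1]")
    p = s - Fraction(1, 3)   # theta^2 - 1/3
    q = 1 - s                # 1 - theta^2
    r = s - Fraction(1, 5)   # theta^2 - 1/5
    k = Fraction(5, 24)
    return {
        "a": p * lam / 2,
        "b": p * (1 - lam) / 2,
        "c": q * mu / 2,
        "d": q * (1 - mu) / 2,
        "a1": -p**2 * (1 - lam) / 4 + k * r**2 * lam1,
        "b1": -k * r**2 * (1 - lam1),
        "c1": k * q * r * (1 - mu1),
        "d1": -q**2 * mu / 4 - k * q * r * mu1,
    }


# Sample choice theta^2 = 2/3 with all modeling parameters set to zero.
coeffs = boussinesq_coefficients(Fraction(2, 3), 0, 0, 0, 0)
print({name: str(value) for name, value in coeffs.items()})
print(coeffs["a"] + coeffs["b"] + coeffs["c"] + coeffs["d"] == Fraction(1, 3))
```

With \(\theta ^2 = \frac{2}{3}\) and \(\lambda = \mu = 0\), the first-order constants come out as \(a = c = 0\) and \(b = d = \frac{1}{6}\), and the sum \(a+b+c+d\) equals \(\frac{1}{3}\) as it must for any admissible choice.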

The velocity field in the rest of the flow is determined by an associated approximation of the velocity potential in the flow domain. The latter is derived from a knowledge of w (see Bona et al. 2002, 2013).

Local in time well-posedness of the Cauchy problem for the systems (1.6) and (1.8) was studied in Bona et al. (2002) and Bona et al. (2004). Not all of these systems are even linearly well-posed. Indeed, the recent foray (Ambrose et al. 2017) shows that many of those not linearly well-posed are in fact not locally well-posed when the nonlinearity is taken into account. The fact that some of the family is ill-posed has the advantage of eliminating them from consideration when real-world approximation is the goal.

These systems were further extended in Bona et al. (2005) to include waves that are fully three-dimensional, and not just long-crested motions. Rigorous estimates were also provided for the difference between solutions of the full water wave problem and solutions of the first-order models. A further extension of Bona et al. (2005) is given in Lannes and Saut (2006), where Boussinesq systems in the Kadomtsev–Petviashvili (KP) scaling are derived. The latter situation is intermediate between the long-crested regime where transverse motion is ignored entirely and three-dimensional Boussinesq systems that allow strong transverse disturbance, a regime that is often referred to as allowing for weakly transverse long waves. A detailed survey of results of this sort can be found in Saut’s lecture notes (Saut) or the recent monograph of Lannes (2013).

As hinted already, when long-crested waves are essentially moving in only one direction, one might prefer to use a unidirectional model because less auxiliary data are needed to initiate it. Theory developed in Alazman et al. (2006) has shown rigorously that predictions of first-order Boussinesq systems and those of their unidirectional counterpart (1.4) are the same to the neglected order, provided the wave motion is initiated unidirectionally. This gives rigorous credence to the utility of such unidirectional models since the bidirectional models are known to be a good approximation of solutions of the full Euler system in the Boussinesq regime of small amplitude and long wavelength.

We stress that while the higher-order, unidirectional models put forward here are formally correct on the square of the Boussinesq timescale, no proof of this exists. Indeed, considering the difficulty encountered in showing the first-order correct, Boussinesq systems are faithful to the full, inviscid water wave problem (1.1) on the Boussinesq timescale and showing the KdV–BBM approximations (1.3)–(1.4) are true to their overlying Boussinesq system, a rigorous result for the systems derived here on the square of the Boussinesq timescale is likely to be challenging. One can show that the higher-order terms do not do damage to the original KdV–BBM approximation of the full water wave problem on the Boussinesq timescale, provided sufficiently smooth initial data are countenanced. This point is not addressed here as it would take us afield of the main developments. It is also the case that one can show directly and rigorously that the linearized, higher-order, unidirectional model is faithful to the linearized Boussinesq system on this very long timescale, again, provided the initial data have enough regularity. However, these results are far from what one would like to have in hand.

The present contribution proceeds as follows. In the next section, we derive formally from the second-order Boussinesq equations a class of second-order KdV–BBM-type equations. Also in the next section, function class notation is introduced and our main results about the higher-order, unidirectional models are stated. Section 3 provides proofs of the results stated in Sect. 2.2, while Sect. 4 features commentary about the choice of the parameters \(\theta , \lambda , \mu , \lambda _1, \mu _1\) and another parameter \(\rho \) to be introduced presently. Section 5 is devoted to a discussion of the linear dispersion relation. Finally, in Sect. 6 some concluding remarks are recorded.

2 Derivation of the Models and the Main Results

The formal derivation of a class of higher-order, unidirectional equations, together with a precise statement of results about their well-posedness is the topic of this section.

2.1 Model Equations

The starting point is the collection (1.8) of higher-order Boussinesq systems derived in Bona et al. (2002). The parameters \(a, b, \ldots , c_1, d_1\) are those presented in (1.7) and (1.9). As we are working in the Boussinesq regime where the Stokes’ number \(S = \frac{\alpha }{\beta } \approx 1\), the two small parameters \(\alpha \) and \(\beta \) are treated on an equal footing. Thus, \(O(\alpha ) = O(\beta ), \, O(\alpha \beta ) = O(\beta ^2)\), etc.

In case the wave motion is essentially in one direction, say in the direction of increasing values of x, we will show how to reduce such Boussinesq systems to the single, fifth-order model,

$$\begin{aligned} \begin{aligned}&\eta _t+\eta _x- \beta \gamma _1\eta _{xxt}+ \beta \gamma _2\eta _{xxx}+\beta ^2 \delta _1 \eta _{xxxxt}+\beta ^2 \delta _2 \eta _{xxxxx}+\alpha \frac{3}{4}(\eta ^2)_x\\&\quad +\,\alpha \beta \Big (\gamma (\eta ^2)_{xxx}-\frac{7}{48}(\eta _x^2)_x \Big ) -\alpha ^2\frac{1}{8}(\eta ^3)_x=0. \end{aligned} \end{aligned}$$
(2.1)

The constants \(\gamma _1, \gamma _2, \delta _1, \delta _2\) and \(\gamma \) depend upon the parameters \(a, b, \ldots \) in (1.8) and will be displayed presently.

Passage from the Boussinesq systems (1.8) to the unidirectional models (2.1) follows the same line of argument as did the passage from the first-order system (1.6) to the mixed KdV–BBM equations

$$\begin{aligned} \eta _t + \eta _x + \frac{3}{2} \alpha \eta \eta _x + \nu \beta \eta _{xxx} -\Big (\frac{1}{6} - \nu \Big )\beta \eta _{xxt} \, = \, 0, \end{aligned}$$
(2.2)

where \(\nu =\frac{1}{2}( a+c) = \frac{1}{4}\big [\theta ^2(\lambda - \mu ) - \frac{1}{3} \lambda + \mu \big ]\) depends upon \(\theta , \lambda \) and \(\mu \) and can formally take any real value. (See Alazman et al. 2006; Constantin and Lannes 2009 and, in the internal wave context, Duchêne 2014. A special case of this model may be found in Bona and Varlamov 2005 for a moving boundary problem.)

As described in Bona (2000), at the lowest order of approximation wherein the parameters are small enough that even the first-order terms in \(\alpha \) and \(\beta \) may be dropped, the system (1.8) becomes the one-dimensional wave equation, viz.

$$\begin{aligned} {\left\{ \begin{array}{ll} \eta _t +w_x=0,\qquad \quad w_t+\eta _x =0,\\ \eta (x,0) =f(x), \qquad w(x,0) = g(x), \end{array}\right. } \end{aligned}$$
(2.3)

where f(x) and g(x) are the initial disturbances of the surface and the horizontal velocity, respectively. The solution to (2.3) is

$$\begin{aligned} {\left\{ \begin{array}{ll} \begin{aligned} \eta (x,t)= \frac{1}{2}\Big [f(x+t)+f(x-t)\Big ] - \frac{1}{2}\Big [g(x+t)-g(x-t)\Big ],\\ w(x,t)= \frac{1}{2}\Big [g(x+t)+g(x-t)\Big ] -\frac{1}{2}\Big [f(x+t)-f(x-t)\Big ]. \end{aligned}\end{array}\right. } \end{aligned}$$

For the left-propagating component to vanish, one must have \(f=g\), in which case \(\eta (x,t) =f(x-t)\),

$$\begin{aligned} \eta _t + \eta _x = 0 \quad \mathrm{and} \quad w = \eta . \end{aligned}$$
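The d'Alembert formula above, and the way the choice \(f = g\) removes the left-going component, can be checked numerically. The following is a minimal sketch; the helper names `dalembert` and `pulse` and the Gaussian profile are assumptions chosen only for this check.

```python
import math


def dalembert(f, g):
    """Return (eta, w) solving eta_t + w_x = 0, w_t + eta_x = 0
    with eta(x, 0) = f(x) and w(x, 0) = g(x)."""
    def eta(x, t):
        return 0.5 * (f(x + t) + f(x - t)) - 0.5 * (g(x + t) - g(x - t))

    def w(x, t):
        return 0.5 * (g(x + t) + g(x - t)) - 0.5 * (f(x + t) - f(x - t))

    return eta, w


def pulse(x):
    """Hypothetical smooth initial profile used only for this check."""
    return math.exp(-x * x)


# Unidirectional data f = g: the left-going part cancels.
eta, w = dalembert(pulse, pulse)
x, t = 0.3, 1.7
print(eta(x, t), pulse(x - t))   # agree up to rounding
print(w(x, t), eta(x, t))        # w coincides with eta up to rounding
```

For data with \(f \ne g\) the same functions return a superposition of left- and right-going parts, which is why the one-way reduction requires the initial velocity to be tied to the initial surface profile.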

Notice in particular that in the Boussinesq regime, when most of the propagation is to the right, it appears that

$$\begin{aligned} \eta _t =-\eta _x+O(\alpha , \beta ), \quad {\text {as}}\quad \alpha , \beta \rightarrow 0, \end{aligned}$$
(2.4)

a point that will play a significant role in what follows.

At the next order when one keeps terms of first order in \(\alpha \) and \(\beta \), the standard ansatz used in Alazman et al. (2006) was that

$$\begin{aligned} w = \eta + \alpha A + \beta B \end{aligned}$$
(2.5)

where \(A = A(\eta , \ldots )\) and \(B = B(\eta _{xx}, \eta _{xt}, \ldots )\) turn out to be simple polynomial functions of \(\eta \) and its first few partial derivatives. Indeed, substituting (2.5) into the first-order system (1.6) and dropping all terms of quadratic order in the small parameters \(\alpha \) and \(\beta \) leads to the pair

$$\begin{aligned} {\left\{ \begin{array}{ll} \begin{aligned} \eta _t + \eta _x + \alpha A_x + \beta B_x + \alpha (\eta ^2)_x + \beta \big (a\eta _{xxx} - b \eta _{xxt}\big ) \, = \, 0, \\ \eta _t + \alpha A_t + \beta B_t + \eta _x + \alpha \eta \eta _x + \beta \big (c \eta _{xxx} - d \eta _{xxt} \big ) \, = \, 0, \end{aligned} \end{array}\right. } \end{aligned}$$
(2.6)

of equations. Demanding that these be consistent, and making use of the fact derived from (2.4) that \(A_t = -A_x + O(\alpha , \beta )\) and similarly for B, it is determined that

$$\begin{aligned} A = -\frac{1}{4} \eta ^2 \qquad \mathrm{and} \qquad B = \frac{1}{2}\Big ((c-a )\eta _{xx} + (b-d)\eta _{xt}\Big ). \end{aligned}$$
(2.7)

Using these relations in either of the equations in (2.6) leads to the KdV–BBM equations (2.2) with \(\nu \) as advertised above.
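In slightly more detail, the computation runs as follows. To the order retained, the nonlinear term in the second equation of (2.6) is \(\alpha \eta \eta _x\), because \(ww_x = \eta \eta _x + O(\alpha , \beta )\). Subtracting the second equation from the first and replacing \(A_t\) by \(-A_x\) and \(B_t\) by \(-B_x\) gives, up to terms of quadratic order,

$$\begin{aligned} 2\alpha A_x + 2\beta B_x + \alpha \eta \eta _x + \beta \big [(a-c)\eta _{xxx} - (b-d)\eta _{xxt}\big ] = 0. \end{aligned}$$

Matching the coefficients of \(\alpha \) and \(\beta \) separately yields \(A_x = -\frac{1}{2}\eta \eta _x\) and \(B_x = \frac{1}{2}\big [(c-a)\eta _{xxx} + (b-d)\eta _{xxt}\big ]\), which integrate to (2.7). Inserting these into the first equation of (2.6) produces

$$\begin{aligned} \eta _t + \eta _x + \frac{3}{2}\alpha \eta \eta _x + \frac{1}{2}(a+c)\beta \eta _{xxx} - \frac{1}{2}(b+d)\beta \eta _{xxt} = 0, \end{aligned}$$

and since \(a+b+c+d = \frac{1}{3}\), one has \(\frac{1}{2}(b+d) = \frac{1}{6} - \nu \), which is (2.2).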

If one again makes use of the low-order relation (2.4) between \(\partial _x\) and \(\partial _t\), Eq. (2.2) can be reduced further to the pure BBM equation (1.5). (The same equation can also be obtained by particular choices of \(\theta , \lambda \) and \(\mu \)).

It was shown in Alazman et al. (2006) that this procedure does more than lead formally to KdV–BBM-type equations of the form displayed in (2.2). If the Boussinesq system is initiated with data \((\eta _0, w_0)\) that satisfy (2.5), then its solution \((\eta ,w)\) has \(\eta \) well approximated by the solution \(\eta _\mathrm{BBM}\) of (1.5), initiated with \(\eta _0\). Moreover, the velocity w that the Boussinesq system generates is shown to be well approximated by the BBM horizontal velocity \(w_\mathrm{BBM}\) obtained by inserting the BBM amplitude \(\eta _\mathrm{BBM}\) into the formula (2.5).

If a higher-order approximation is needed, then it is natural to posit the higher-order ansatz

$$\begin{aligned} w = \eta +\alpha A+\beta B +\alpha \beta C+\beta ^2D+\alpha ^2E \end{aligned}$$
(2.8)

analogous to (2.5) (see, for example, Dullin et al. 2003; Lannes 2013). The functions A, B, C, D and E will again turn out to be polynomial functions of \(\eta \) and its partial derivatives. It deserves remark that the presumption (2.8) was already pursued in Olver (1984a) and in subsequent publications, but the fifth-order partial differential equations that emerged do not have a Hamiltonian structure.

Substituting (2.8) into the system (1.8) and ignoring terms that are at least cubic in the small parameters \(\alpha \) and \(\beta \) leads to the pair of equations

$$\begin{aligned} {\left\{ \begin{array}{ll} \begin{aligned} \eta _t&{}=-\,\eta _x-\alpha A_x -\beta B_x -\alpha \beta C_x -\beta ^2 D_x -\alpha ^2E_x+b\beta \eta _{xxt}-b_1\beta ^2\eta _{xxxxt} -a\beta \eta _{xxx}\\ &{}\quad -\,a\alpha \beta A_{xxx}-a\beta ^2B_{xxx}-a_1\beta ^2\eta _{xxxxx} -(\alpha \eta ^2+\alpha ^2A\eta +\alpha \beta B\eta )_x \\ &{}\quad +\, b\alpha \beta (\eta ^2)_{xxx} -(a+b-\frac{1}{3})\alpha \beta (\eta \eta _{xx})_x,\\ \eta _t&{}=-\,\eta _x -\alpha A_t-\beta B_t-\alpha \beta C_t-\beta ^2 D_t-\alpha ^2E_t+d\beta \eta _{xxt}+d\alpha \beta A_{xxt}+d\beta ^2B_{xxt}\\ &{}\quad -\, d_1\beta ^2 \eta _{xxxxt}-c\beta \eta _{xxx}-c_1\beta ^2\eta _{xxxxx} -\alpha \eta \eta _x-\alpha ^2(\eta A)_x-\alpha \beta (\eta B)_x\\ &{}\quad -\,c\alpha \beta (\eta \eta _x)_{xx} +(c+d)\alpha \beta \eta \eta _{xxx}-\alpha \beta (\eta \eta _{xx})_x +(c+d-1)\alpha \beta \eta _x\eta _{xx}. \end{aligned}\end{array}\right. } \end{aligned}$$
(2.9)

Demanding that these two equations be consistent (at the first order) leads to the formulas (2.7) for A and B at order \(\alpha \) and \(\beta \), respectively, as one would expect. Our goal is to derive a fifth-order, one-way model which, in addition to being Hamiltonian, has a linear dispersion relation which matches that of the full water wave system (1.1) up to and including the order \(\beta ^2\) terms, so presenting an error which is formally of order \(\beta ^3\) (recall that \(\alpha \approx \beta \) in the present development). The laboratory experiments reported in Bona et al. (1983) make it clear that the error in the phase velocity dominates the overall error, at least for moderately sized waves. Hence, getting the dispersion relation right to the order we are working seems important. Indeed, if the dispersion relation is not correct to order \(\beta ^2\), the model definitely is not second-order correct in the limit of very small values of \(\alpha \) (e.g., linear theory).

It will be helpful to introduce an auxiliary parameter \(\rho \), viz.

$$\begin{aligned} B=\frac{1}{2} (c-a+\rho )\eta _{xx} +\frac{1}{2}(b-d+\rho )\eta _{xt}. \end{aligned}$$
(2.10)

Of course, at the first order, this is equivalent to the version with \(\rho = 0\), but at the next order, \(\rho \) can be chosen so that the resulting second-order, one-way model has certain, desirable properties. This will be discussed in more detail in Sect. 4. Of special interest will be the value

$$\begin{aligned} \rho =b+d-\frac{1}{6}. \end{aligned}$$
(2.11)

This will turn out to be perspicuous, though we do not insist on it for the nonce.

With this value of B, the mixed KdV–BBM equation (2.2) resulting from the first-order approximation turns out to be

$$\begin{aligned} \eta _t + \eta _x + \frac{3}{2} \alpha \eta \eta _x + {\tilde{\nu }} \beta \eta _{xxx} -\Big (\frac{1}{6} - {\tilde{\nu }} \Big )\beta \eta _{xxt} \, = \, 0, \end{aligned}$$

where \({\tilde{\nu }} =\frac{1}{2}( a+c+\rho )\). Notice that if (2.11) holds, then \({\tilde{\nu }} = \frac{1}{12}\). Therefore, to insist on the consistency of the two equations in (2.9) at the second order in \(\alpha \) and \(\beta \), we use the approximation

$$\begin{aligned} \eta _t =-\eta _x - \frac{3}{2} \alpha \eta \eta _x -{\tilde{\nu }} \beta \eta _{xxx} +\Big (\frac{1}{6} - {\tilde{\nu }} \Big )\beta \eta _{xxt} +O(\alpha ^2, \beta ^2, \alpha \beta ), \quad {\text {as}}\quad \alpha , \beta \rightarrow 0. \end{aligned}$$
(2.12)

Using the approximation (2.12) along with the forms of A and B given, respectively, in (2.7) and (2.10) in the system (2.9), there appear more terms involving order \(\alpha \beta , \beta ^2\) and \(\alpha ^2\). Equating the terms of order \(\alpha \beta \) in (2.9) leads to the equation

$$\begin{aligned} C= & {} \left[ \frac{1}{8}(a+4b+2c-d)+\frac{3}{16}(a+b-c-d)+\frac{3}{8}\rho \right] (\eta ^2)_{xx}\\&+\frac{13}{24}\eta \eta _{xx} +\frac{11}{48}\eta _x^2. \end{aligned}$$

Likewise, equating the terms containing \(\beta ^2\) in (2.9) yields

$$\begin{aligned} \begin{aligned} D=&-\left[ \frac{1}{2}(b_1-d_1)+\frac{1}{4}(b-d+\rho )\left( a-d+\frac{1}{6}\right) +\frac{1}{4} d(c-a+\rho )\right] \eta _{xxxt}\\&- \left[ \frac{1}{2}(a_1-c_1)+\frac{1}{4}(c-a+\rho )\left( a+\frac{1}{6}\right) -\frac{1}{12}\rho \right] \eta _{xxxx}. \end{aligned} \end{aligned}$$

Finally, balancing the terms containing \(\alpha ^2\) in the system (2.9), one obtains

$$\begin{aligned} E= \frac{1}{8}\eta ^3. \end{aligned}$$
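The value of E can be recovered by a short computation, sketched here. With \(A = -\frac{1}{4}\eta ^2\) and the approximation (2.12),

$$\begin{aligned} A_x - A_t = -\frac{1}{2}\eta \eta _x + \frac{1}{2}\eta \eta _t = -\eta \eta _x - \frac{3}{4}\alpha \eta ^2\eta _x + O(\beta ). \end{aligned}$$

When the second equation in (2.9) is subtracted from the first, the terms \(\alpha ^2(A\eta )_x\) cancel, \(E_t\) may be replaced by \(-E_x\) to the order retained, and the only other contribution of order \(\alpha ^2\) comes from \(-\alpha (A_x - A_t)\). The balance at order \(\alpha ^2\) therefore reads

$$\begin{aligned} -2E_x + \frac{3}{4}\eta ^2\eta _x = 0, \qquad \text {so that} \qquad E_x = \frac{3}{8}\eta ^2\eta _x = \frac{1}{8}(\eta ^3)_x. \end{aligned}$$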

Putting the expressions for A, B, C, D and E in either of the equations in (2.9), using the relation (2.12) and taking note of the formula \(\eta \eta _{xxx}= \frac{1}{2} (\eta ^2)_{xxx}-\frac{3}{2}(\eta _x^2)_x\), there appears the evolution equation

$$\begin{aligned} \begin{aligned}&\eta _t+\eta _x -\gamma _1\beta \eta _{xxt}+\gamma _2\beta \eta _{xxx}+\delta _1\beta ^2\eta _{xxxxt}+\delta _2\beta ^2\eta _{xxxxx}\\&\quad + \frac{3}{2}\alpha \eta \eta _x + \alpha \beta \Big (\gamma (\eta ^2)_{xxx} -\frac{7}{48}(\eta _x^2)_x\Big )-\frac{1}{8}\alpha ^2(\eta ^3)_x=0, \end{aligned} \end{aligned}$$
(2.13)

where

$$\begin{aligned} {\left\{ \begin{array}{ll} \gamma _1=\frac{1}{2}(b+d-\rho ),\\ \gamma _2=\frac{1}{2}(a+c+\rho ),\\ \delta _1= \frac{1}{4}\big [2(b_1+d_1)-(b-d+\rho )\big (\frac{1}{6}-a-d\big )-d(c-a+\rho )\big ],\\ \delta _2= \frac{1}{4}\big [2(a_1+c_1) -(c-a+\rho )\big (\frac{1}{6}-a\big )+\frac{1}{3}\rho \big ],\\ \gamma =\frac{1}{24}\big [5-9(b+d)+9\rho \big ]. \end{array}\right. } \end{aligned}$$
(2.14)

Remark 2.1

As our analysis so far has been predicated on the abcd-system (1.8), the relation \(a+b+c+d=\frac{1}{3}\) has been used while calculating C and D, and consequently the values of the parameters introduced in (2.14). In this situation, one readily obtains that \(\gamma _1+\gamma _2 =\frac{1}{6}, \gamma =\frac{1}{24}(5-18\gamma _1)\) and \(\delta _2-\delta _1 = \frac{19}{360}-\frac{1}{6}\gamma _1\) (see (4.3) below). Thus, Eq. (2.13) effectively has only two free parameters, namely \(\gamma _1\) and \(\delta _1 \). This aspect plays no particular role in the well-posedness theory to follow. However, it does become important when the issue of ensuring that the system is Hamiltonian is addressed. Detailed discussion of these issues may be found in Sects. 4 and 5.
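The three relations in this remark can be confirmed in exact arithmetic for any particular choice of parameters. The sketch below evaluates (2.14) from (1.7) and (1.9). The function name `unidirectional_coefficients`, the use of \(\theta ^2\) as an input and the sample parameter values are assumptions made purely for illustration.

```python
from fractions import Fraction


def unidirectional_coefficients(theta_sq, lam, mu, lam1, mu1, rho):
    """Evaluate gamma_1, gamma_2, delta_1, delta_2 and gamma of (2.14) exactly,
    starting from the parameters entering (1.7) and (1.9)."""
    s, lam, mu, lam1, mu1, rho = (
        Fraction(v) for v in (theta_sq, lam, mu, lam1, mu1, rho)
    )
    p, q, r = s - Fraction(1, 3), 1 - s, s - Fraction(1, 5)
    k, sixth = Fraction(5, 24), Fraction(1, 6)
    # First-order constants (1.7).
    a, b = p * lam / 2, p * (1 - lam) / 2
    c, d = q * mu / 2, q * (1 - mu) / 2
    # Second-order constants (1.9).
    a1 = -p**2 * (1 - lam) / 4 + k * r**2 * lam1
    b1 = -k * r**2 * (1 - lam1)
    c1 = k * q * r * (1 - mu1)
    d1 = -q**2 * mu / 4 - k * q * r * mu1
    # Coefficients (2.14) of the one-way model.
    gamma1 = (b + d - rho) / 2
    gamma2 = (a + c + rho) / 2
    delta1 = (2 * (b1 + d1) - (b - d + rho) * (sixth - a - d) - d * (c - a + rho)) / 4
    delta2 = (2 * (a1 + c1) - (c - a + rho) * (sixth - a) + rho / 3) / 4
    gamma = (5 - 9 * (b + d) + 9 * rho) / 24
    return gamma1, gamma2, delta1, delta2, gamma


# Arbitrary rational sample values, chosen only to exercise the formulas.
g1, g2, dl1, dl2, g = unidirectional_coefficients(
    Fraction(1, 2), Fraction(1, 3), Fraction(2, 5),
    Fraction(-1, 7), Fraction(3, 4), Fraction(1, 9),
)
print(g1 + g2 == Fraction(1, 6))
print(g == (5 - 18 * g1) / 24)
print(dl2 - dl1 == Fraction(19, 360) - g1 / 6)
```

Each comparison tests one of the relations stated in the remark. Because the arithmetic is rational, the comparisons are exact rather than subject to rounding.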

If instead, one were to relax the relation \(a+b+c+d=\frac{1}{3}\) when computing C, D and elsewhere, the resulting model would be

$$\begin{aligned} \begin{aligned}&\eta _t+\eta _x -\gamma _1\beta \eta _{xxt}+\gamma _2\beta \eta _{xxx}+\delta _1\beta ^2\eta _{xxxxt}+\delta _2\beta ^2\eta _{xxxxx}\\&\quad + \frac{3}{2}\alpha \eta \eta _x + \alpha \beta \Big (\sigma _1(\eta ^2)_{xxx} -\sigma _2(\eta _x^2)_x\Big )-\frac{1}{8}\alpha ^2(\eta ^3)_x=0, \end{aligned} \end{aligned}$$
(2.15)

where \(\gamma _1, \gamma _2\) are as in (2.14), \(\delta _1, \delta _2\) satisfy the relation

$$\begin{aligned} \delta _2-\delta _1 = \frac{1}{4}\rho (a+b+c+d)+\frac{1}{8}\big [(b-d)^2-(a-c)^2\big ] +\frac{1}{2}(a_1-b_1+c_1-d_1) \end{aligned}$$

and \(\sigma _1\), \(\sigma _2\) are given by

$$\begin{aligned} {\left\{ \begin{array}{ll} \sigma _1=\frac{1}{24}\big [4+3(a-2b+c-2d)+9\rho \big ],\\ \sigma _2=\frac{1}{48}\big [4+9(a+b+c+d)\big ]. \end{array}\right. } \end{aligned}$$

The more general Eq. (2.15) reduces to (2.13) when \(a+b+c+d=\frac{1}{3}\). An in-depth analysis of the general model (2.15) could be interesting. Such a more general model might arise if surface tension effects were taken into account in the original Boussinesq system. Depending upon the undisturbed depth, another small parameter may arise in this situation and one must deal with its relation to \(\alpha \) and \(\beta \). What the corresponding second-order correct model looks like would depend upon how these parameters compare to one another. This potentially interesting project is not pursued here. Our focus remains upon the one-way model (2.13) corresponding to the second-order water wave system (1.8) for which dispersion considerations mentioned earlier demand that \(a+b+c+d=\frac{1}{3}\).

While the derivation is formal, we expect the equation (2.13) to have the same sort of properties that its first-order correct analog (1.5) does as regards approximating unidirectional solutions of the second-order Boussinesq system (1.8) and, consequently, solutions of the full water wave problem. However, as already mentioned, rigorous theory to this effect is not available as it is at first order.

Models like (2.13) have appeared in the literature before (cf. Dullin et al. 2003 when the surface tension is set to 0 and the wide ranging article Johnson 2002 together with the references contained in these articles). For example, the equation (2.19) in Dullin et al. (2003), in the zero surface tension regime, appears in our class of equations (see the discussion in Sects. 4 and 5). Especially interesting in the present context is the class of models introduced in Dullin et al. (2004). These models are derived formally by using a Kodama transformation combined with a smoothing operator to derive a family of integrable water wave equations that includes the Camassa–Holm equation and a fifth-order KdV-type equation. While these models are Hamiltonian and all have the correct linear dispersion relation to the second order, the Hamiltonian structure does not allow a global well-posedness theory to be mounted (for local well-posedness, see Mustafa 2006 and the several references to earlier work contained therein). Indeed, it transpires that these models do not in fact have global smooth solutions, but can form singularities in finite time. This is of course at odds with the underlying presumptions about regularity that go into the sort of formal expansions used in Dullin et al. (2004) and in the present essay. Another problem with this class of models is that the dependent variable that eventually emerges is not a physical one, and so not amenable to direct measurement. To return to the wave amplitude, what we call \(\eta \), requires applying the Kodama transformation, a non-local, nonlinear operator involving \(\partial _x^{-1}\). As \(\partial _x\) has a kernel, its non-local inverse does not necessarily act well on the sort of Sobolev-type function classes appearing here and in some of the well-posedness theory for the models derived in Dullin et al. (2004). 
The present family of models is written directly in terms of \(\eta \) and is globally well posed, and so does not suffer from these drawbacks. Moreover, if lateral boundary conditions arise, as they always do in practice, passing back and forth between \(\eta \) and their new variable u is going to present problems (see Chen 2018 for recent theory concerning boundary value problems for one of the models derived here).

It is also worth noting that if \(\alpha = O(\beta ^{\frac{1}{2}})\) instead of \(\alpha = O(\beta )\), then a Camassa–Holm-type equation emerges, namely

$$\begin{aligned} \begin{aligned}&\eta _t+\eta _x -\gamma _1\beta \eta _{xxt}+\gamma _2\beta \eta _{xxx} +\frac{3}{2}\alpha \eta \eta _x \\&\quad + \alpha \beta \Big (\gamma (\eta ^2)_{xxx}-\frac{7}{48}(\eta _x^2)_x\Big )-\frac{1}{8}\alpha ^2(\eta ^3)_x=0. \end{aligned} \end{aligned}$$

The two higher-order, linear, dispersive terms drop off because they are now negligible compared to the remaining terms. However, as one would expect for models where the nonlinear effects are more dominant, the formal temporal range of validity for this model, in terms of the wavelength parameter \(\beta \), is only of order \(O(\beta ^{-1})\). That is to say, the formal error between the model predictions and those of the full water wave problem is of order \(O(\beta ^2 t)\). If the two fifth-order dispersive terms are left in place, then higher-order nonlinear terms deserve keeping as well to maintain a uniform level of approximation. On the other hand, insofar as the largest part of the error resides in incorrect phase speeds, keeping these terms could be useful in practical situations, even in this more nonlinear situation. After all, the experiments in Bona et al. (1983) show that BBM-type equations maintain engineering-level approximation in the long-wave regime, even for Stokes numbers in the mid-20s, which is to say \(\alpha /\beta \approx 25\).

For the analysis that follows, the small parameters \(\alpha \) and \(\beta \) are not relevant. Reverting to non-dimensional, but unscaled variables, which are denoted by a surmounting tilde, namely \(\tilde{\eta }(\tilde{x},\tilde{t}) = \alpha ^{-1} \eta (\beta ^{\frac{1}{2}}\tilde{x},\beta ^{\frac{1}{2}}\tilde{t})\), and then dropping the tildes yields the fifth-order, KdV–BBM-type equation

$$\begin{aligned}&\eta _t+\eta _x - \gamma _1\eta _{xxt}+\gamma _2\eta _{xxx}+\delta _1\eta _{xxxxt}+\delta _2\eta _{xxxxx}\nonumber \\&\quad +\,\frac{3}{4}(\eta ^2)_x+\gamma (\eta ^2)_{xxx}-\frac{7}{48}(\eta _x^2)_x -\frac{1}{8}(\eta ^3)_x=0. \end{aligned}$$
(2.16)

In many circumstances, boundary value problems may be the most practically interesting. However, one usually starts with the pure initial value problem to get an idea of what may be true for more complicated problems. This latter problem, wherein we search for a solution of (2.16) subject to \(\eta (x,0)\) being specified for all \(x \in {\mathbb {R}}\), will be the subject of further mathematical consideration.

We conclude this subsection with the observation that approximate models like the one displayed in (2.16) can be derived by expanding the Dirichlet–Neumann operator in the Zakharov–Craig–Sulem formulation (see, for example, Lannes 2013 and the references therein). An approach using the Dirichlet to Neumann operator does have as a component the rigorous theory pertaining to this operator. And if one is expanding the Hamiltonian rather than the dependent variables themselves, one is guaranteed a Hamiltonian equation. However, it does not guarantee that the dispersion relation so obtained fits the full dispersion to the order of the terms being kept. Nor does it guarantee that the resulting equation provides a well-posed problem. A good example of what can go wrong appears in Ambrose et al. (2012) and Ambrose et al. (2014), where this technique was applied to a deep-water situation. Similar problems arise for the Kaup–Boussinesq system, which is formally Hamiltonian, but is ill-posed even in smooth function classes (see Ambrose et al. 2017).

The classical expansion used here allows for choices of parameters that guarantee both local well-posedness and, in a special case, Hamiltonian structure. It also has the advantage of producing a model that behaves well with respect to the imposition of non-trivial boundary conditions (see Chen 2018).

2.2 Mathematical Theory

Equation (2.16) above formally describes the propagation of unidirectional waves. Naturally, one would like to have a theory that shows solutions of this system closely track associated solutions of the higher-order Boussinesq systems (1.8) on the longer timescale \(O\left( \frac{1}{\beta ^2}\right) \). Logically prior to such a result is the fundamental issue of the well-posedness of the Cauchy problem associated with (2.16). It is to this latter issue that attention is now turned. To be useful in comparing the unidirectional model with its overlying bidirectional analog, one naturally needs a well-posedness theory that is valid at least on the longer timescale \(O(\frac{1}{\beta ^2})\). Better still would be a global well-posedness theory so that issues of finite time singularity formation do not intrude upon the practical use of such models.

As mentioned earlier, the notion of well-posedness used is the standard one. We say that the Cauchy problem for an equation is locally well-posed in a Banach space X of functions of the spatial variable if corresponding to given initial data in X there exists a non-trivial time interval [0, T] and a unique continuous curve in X, defined at least for \(t \in [0, T]\) that solves the equation in an appropriate sense. It is also demanded that this solution varies continuously with variations of the initial data. If the above properties are true for any bounded time interval, we say that the Cauchy problem is globally well-posed in X.

For the local well-posedness theory, it is important that the coefficients \(\gamma _1\) and \(\delta _1\) appearing, respectively, in front of the \(\eta _{xxt}\) and \(\eta _{xxxxt}\)-terms be nonnegative. The problem is linearly ill-posed if this is not the case, as one can see by taking the linear part of equation (3.1) in the next section. (The special case where \(\delta _1 = 0\) and \(\gamma _1 > 0\) is also locally well-posed, but will not be considered here.) It will be presumed henceforth that \(\gamma _1 > 0\) and \( \delta _1 > 0\). Discussion of concrete conditions for this to be the case is forthcoming in Sect. 4. Indeed, it will be shown that there are plenty of choices of the fundamental parameters \(\theta , \lambda , \mu , \lambda _1, \mu _1\) and \(\rho \) for which \(\gamma _1, \delta _1\) are positive.
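The role of the signs can be illustrated through the factor \(1+\gamma _1\xi ^2+\delta _1\xi ^4\) that multiplies \(\widehat{\eta }_t\) once the equation is Fourier transformed in space. The following is a minimal sketch, not part of the theory itself: it lists the frequencies at which this factor vanishes, so that the equation cannot be solved for \(\widehat{\eta }_t\) there. The function name and the sample coefficients are hypothetical.

```python
import math


def positive_roots_in_xi_squared(gamma1, delta1):
    """Return the values y = xi^2 >= 0 at which 1 + gamma1*y + delta1*y^2 = 0."""
    if delta1 == 0:
        # Linear case: 1 + gamma1*y = 0 has a root y > 0 only if gamma1 < 0.
        return [-1.0 / gamma1] if gamma1 < 0 else []
    disc = gamma1 * gamma1 - 4.0 * delta1
    if disc < 0:
        return []
    sq = math.sqrt(disc)
    roots = [(-gamma1 - sq) / (2.0 * delta1), (-gamma1 + sq) / (2.0 * delta1)]
    return sorted(y for y in roots if y >= 0)


# Hypothetical sample coefficients.
print(positive_roots_in_xi_squared(1.0, 1.0))   # both coefficients positive
print(positive_roots_in_xi_squared(1.0, -1.0))  # delta1 negative
```

When both coefficients are positive the list is empty, so the factor is bounded below by 1. When \(\delta _1 < 0\) the quadratic in \(\xi ^2\) always has a positive root, whatever the value of \(\gamma _1\).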

Local well-posedness will be obtained by using multilinear estimates combined with a contraction mapping argument. The local theory does not depend upon special choices of the parameters in the problem other than the positivity of \(\gamma _1\) and \(\delta _1\). In general, Eq. (2.16) does not have an obvious Hamiltonian structure. However, by suitably choosing the parameters, it can be put into Hamiltonian form. The Hamiltonian structure allows one to infer bounds on solutions that lead to global well-posedness. As seen in the recent simulations of solutions of some of the first-order systems (Bona and Chen 2016), lack of Hamiltonian structure often seems to go along with lack of global well-posedness for arbitrarily sized data.

While solutions of the system (2.16) will not approximate solutions of the full water wave problem (1.1) without considerable smoothness (see Bona et al. 2005), a modern thrust in the analysis of dispersive partial differential equations is to provide local and global well-posedness theory in relatively large function classes. While mostly of mathematical interest, theory in such low-regularity classes can be useful in the analysis of numerical schemes for approximating solutions of such equations, especially when the lower-order norms can be given time-independent bounds.

To obtain a global well-posedness result for initial data with lower-order Sobolev regularity, we use a high-low frequency splitting technique. Such splitting methods have roots at least as far back as the work of M. Schonbek and her collaborators (see Amick et al. 1989; Schonbek 1981 for example). In the context of BBM-type equations, it was applied in Bona and Chen (2009) and Bona and Tzvetkov (2009) to obtain sharper well-posedness results. More subtle splitting appears in the work of Bourgain [see, e.g., Bourgain 1998 and the references therein, as well as the further developments in Fonseca et al. (1999), Fonseca et al. (2002) for example].

Before announcing the main results, the mostly standard notation that will be used throughout is recorded. If f is a function defined on the real line \({\mathbb {R}}\), then \(\hat{f}\) denotes its Fourier transform, namely

$$\begin{aligned} \hat{f}(\xi ) = \frac{1}{\sqrt{2\pi }}\int _{{\mathbb {R}}}e^{-ix\xi }f(x)\hbox {d}x. \end{aligned}$$

The space of square-integrable, measurable functions defined on a measurable subset \(\Omega \) of Euclidean space will be denoted \(L^2(\Omega )\). In fact, throughout, \(\Omega \) will always be \(\mathbb {R}\) or \(\mathbb {R}^2\) and we will usually not bother to display the set, but just write \(L^2\) for \(L^2(\mathbb {R})\), etc. The \(L^2\)-based Sobolev space of order \(s \in {\mathbb {R}}\) will be denoted by \(H^s = H^s({\mathbb {R}}) = (1-\Delta )^{-s/2}L^2\) as usual. If \(f:{\mathbb {R}}\times [0, T] \rightarrow {\mathbb {R}}\), the mixed \(L_T^q L_x^p\)-norm of f is

$$\begin{aligned} \Vert f\Vert _{L_T^q L_x^p} = \left( \int _0^T \left( \int _{{\mathbb {R}}} |f(x, t)|^p\,\hbox {d}x \right) ^{q/p}\,\hbox {d}t\right) ^{1/q}, \end{aligned}$$

with the usual modification when p or q is \(\infty \). An analogous definition is used for the other mixed norms \(L_x^pL_T^q\), with the order of integration in time and space interchanged. In the notation \(L_x^pL_T^q\) or \(L_T^pL_x^q, T\) is replaced by t when the interval [0, T] is instead the whole real line \({\mathbb {R}}\). For \(T>0\) and \(s\in {\mathbb {R}}, C([0,T];H^s)\) denotes the space of continuous maps from [0, T] to \(H^s\) with its usual norm, \(\Vert u\Vert _{C([0,T];H^s)}:= \sup _{t\in [0, T]}\Vert u(\cdot , t)\Vert _{H^s}\).

We use c or C to denote various space- and time-independent constants whose exact values may vary from one line to the next. The notation \( A \lesssim B\) connotes an estimate of the form \(A\le cB\) for some c, while \(A \sim B\) means \( A \lesssim B\) and \(B \lesssim A\). The notation \(a+\) stands for \(a +\epsilon \) for any \(\epsilon > 0\), no matter how small.

Here are the main results. The first one is about the local well-posedness in \(H^s ({\mathbb {R}})\), \(s \ge 1\).

Theorem 2.2

Assume \(\gamma _1, \delta _1 >0\). For any \(s\ge 1\) and for given \(\eta _0\in H^s({\mathbb {R}})\), there exist a time \(T =T(\Vert \eta _0\Vert _{H^s})\) and a unique function \(\eta \in C([0,T];H^s)\) which is a solution of the IVP for (2.16), posed with initial data \(\eta _0\). The solution \(\eta \) varies continuously in \(C([0,T];H^s)\) as \(\eta _0\) varies in \(H^s\).

With more regularity and a further restriction on the coefficients of the equation, global well-posedness holds, as the next theorem attests.

Theorem 2.3

Assume \(\gamma _1, \delta _1 > 0\). Let \(s\ge \frac{3}{2}\) and \(\gamma =\frac{7}{48}\). Then the solution to the IVP associated with (2.16) given by Theorem 2.2 can be extended to arbitrarily large time intervals [0, T]. Hence the problem is globally well-posed in this case.

3 Well-Posedness Theory in \(H^s, s\ge 1\)

Local well-posedness will be established using multilinear estimates combined with a contraction mapping argument. Global well-posedness in the spaces \(H^s\) with \(s \ge 2\) is obtained via energy-type arguments together with the local theory. For values of s below 2, the global theory results from splitting the initial data into a small, rough part and a smooth part and writing evolution equations for each of these in such a way that the sum of the results of the separate evolutions provides a solution of the original problem.

3.1 Local Well-Posedness

This section will focus upon local well-posedness issues for the Cauchy problem associated with (2.16) for given data \(\eta (x,0) =\eta _0(x)\) in \(H^s({\mathbb {R}})\). The first step is to write (2.16) in an equivalent integral equation format. Taking the Fourier transform of Eq. (2.16) with respect to the spatial variable yields

$$\begin{aligned}&{\widehat{\eta }}_t +i\xi {\widehat{\eta }} +\gamma _1 \xi ^2{\widehat{\eta }}_t - i \gamma _2 \xi ^3{\widehat{\eta }}+\delta _1\xi ^4{\widehat{\eta }}_t\\&\quad +\,\delta _2i\xi ^5{\widehat{\eta }}+ \frac{3}{4} i\xi \widehat{\eta ^2} -\gamma i\xi ^3\widehat{\eta ^2} -\frac{1}{8} i\xi \widehat{\eta ^3} -\frac{7}{48} i\xi \widehat{\eta _x^2}=0, \end{aligned}$$

or what is the same,

$$\begin{aligned} \Big (1+\gamma _1\xi ^2+\delta _1 \xi ^4\Big )i{\widehat{\eta }}_t= & {} \xi \left( 1-\gamma _2\xi ^2+\delta _2\xi ^4\right) {\widehat{\eta }}+\frac{1}{4}\left( 3\xi -4\gamma \xi ^3\right) \widehat{\eta ^2}\nonumber \\&-\,\frac{1}{8} \xi \widehat{\eta ^3} -\frac{7}{48}\xi \widehat{\eta _x^2}. \end{aligned}$$
(3.1)

Because \(\gamma _1, \delta _1\) are taken to be positive, the fourth-order polynomial

$$\begin{aligned} \varphi (\xi ) := 1 + \gamma _1\xi ^2+\delta _1\xi ^4, \end{aligned}$$

is strictly positive. Define the three Fourier multiplier operators \(\phi (\partial _x), \psi (\partial _x)\) and \(\tau (\partial _x)\) via their symbols, viz.

$$\begin{aligned} \widehat{\phi (\partial _x)f}(\xi ):= & {} \phi (\xi )\widehat{f}(\xi ), \nonumber \\ \widehat{\psi (\partial _x)f}(\xi ):= & {} \psi (\xi )\widehat{f}(\xi ) \;\;\; \mathrm{and }\nonumber \\ \widehat{\tau (\partial _x)f}(\xi ):= & {} \tau (\xi )\widehat{f}(\xi ), \end{aligned}$$
(3.2)

where

$$\begin{aligned} \phi (\xi )=\frac{\xi (1-\gamma _2\xi ^2+\delta _2\xi ^4)}{\varphi (\xi )}, \quad \psi (\xi )=\frac{\xi }{\varphi (\xi )} \quad \mathrm{and} \quad \tau (\xi )=\frac{3\xi -4\gamma \xi ^3}{4\varphi (\xi )}. \end{aligned}$$

With this notation, the Cauchy problem associated with Eq. (2.16) can be written in the form

$$\begin{aligned} {\left\{ \begin{array}{ll} i\eta _t = \phi (\partial _x)\eta + \tau (\partial _x)\eta ^2 - \frac{1}{8}\psi (\partial _x)\eta ^3 -\frac{7}{48}\psi (\partial _x)\eta _x^2\, ,\\ \eta (x,0) = \eta _0(x). \end{array}\right. } \end{aligned}$$
(3.3)

Consider first the linear IVP

$$\begin{aligned} {\left\{ \begin{array}{ll} i\eta _t = \phi (\partial _x)\eta ,\\ \eta (x,0) = \eta _0(x), \end{array}\right. } \end{aligned}$$
(3.4)

whose solution is given by \(\eta (t) = S(t)\eta _0\), where \(\widehat{S(t)\eta _0} = e^{-i\phi (\xi )t}\widehat{\eta _0}\) is defined via its Fourier transform. Clearly, S(t) is a unitary operator on \(H^s\) for any \(s \in {\mathbb {R}}\), so that

$$\begin{aligned} \Vert S(t)\eta _0\Vert _{H^s} = \Vert \eta _0\Vert _{H^s}, \end{aligned}$$
(3.5)

for all \(t > 0\). Duhamel’s formula allows us to write the IVP (3.3) in the equivalent integral equation form,

$$\begin{aligned} \eta (x,t) = S(t)\eta _0 -i\int _0^tS(t-t')\Big (\tau (\partial _x)\eta ^2 - \frac{1}{8} \psi (\partial _x)\eta ^3 -\frac{7}{48}\psi (\partial _x)\eta _x^2\Big )(x, t') \hbox {d}t'.\nonumber \\ \end{aligned}$$
(3.6)

In what follows, a short-time solution of (3.6) will be obtained via the contraction mapping principle in the space \(C([0,T];H^s)\). This will provide a proof of Theorem 2.2.
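The isometry (3.5) rests on the fact that the symbol \(\phi (\xi )\) is real valued, so that the multiplier \(e^{-i\phi (\xi )t}\) has modulus one at every frequency and leaves the weighted \(L^2\)-norm defining \(H^s\) unchanged. A small numerical sketch of this observation follows. The function names, the sample frequencies and the parameter values are hypothetical choices made only for illustration.

```python
import cmath


def varphi(xi, gamma1, delta1):
    """Denominator 1 + gamma1*xi^2 + delta1*xi^4 of the symbols."""
    return 1.0 + gamma1 * xi**2 + delta1 * xi**4


def phi(xi, gamma1, gamma2, delta1, delta2):
    """Real-valued symbol of the linear operator in (3.4)."""
    return xi * (1.0 - gamma2 * xi**2 + delta2 * xi**4) / varphi(xi, gamma1, delta1)


def propagator(xi, t, params):
    """Fourier multiplier exp(-i*phi(xi)*t) of the solution operator S(t)."""
    return cmath.exp(-1j * phi(xi, *params) * t)


# Hypothetical parameters (gamma1, gamma2, delta1, delta2) with gamma1, delta1 > 0.
params = (0.1, 1.0 / 6.0 - 0.1, 0.02, 0.02 + 19.0 / 360.0 - 0.1 / 6.0)
moduli = [abs(propagator(xi, 3.7, params)) for xi in (-50.0, -1.0, 0.0, 0.5, 10.0, 200.0)]
print(moduli)
```

Each modulus equals one up to rounding error, which is the pointwise statement behind (3.5).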

3.1.1 Multilinear Estimates

Various multilinear estimates are now established that will be useful in the proof of the local well-posedness result. First, we record the following “sharp” bilinear estimate obtained in Bona and Tzvetkov (2009).

Lemma 3.1

For \(s \ge 0\), there is a constant \(C = C_s\) for which

$$\begin{aligned} \Vert \omega (\partial _x) (u v)\Vert _{H^s} \le C\Vert u\Vert _{H^s}\Vert v\Vert _{H^s} \end{aligned}$$
(3.7)

where \(\omega (\partial _x)\) is the Fourier multiplier operator with symbol

$$\begin{aligned} \omega (\xi ) \, = \, \frac{|\xi |}{1 + \xi ^2}. \end{aligned}$$

It is worth noting that there is a counterexample in Bona and Tzvetkov (2009) showing that the inequality (3.7) is false if \(s<0\).

Corollary 3.2

For any \(s \ge 0\), there is a constant \(C = C_s\) such that the inequality

$$\begin{aligned} \Vert \tau (\partial _x) \eta ^2\Vert _{H^s} \le C \Vert \eta \Vert _{H^s} ^2 \end{aligned}$$
(3.8)

holds, where the operator \(\tau (\partial _x)\) is defined in (3.2).

Proof

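Since \(\delta _1>0\), the symbol \(\tau \) vanishes linearly at \(\xi = 0\) and decays like \(|\xi |^{-1}\) as \(|\xi | \rightarrow \infty \), just as \(\omega \) does, so \(|\tau (\xi )| \le C \omega (\xi )\) for some constant \(C>0\). The estimate (3.8) thus follows from Lemma 3.1. \(\square \)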

Proposition 3.3

For \(s \ge \frac{1}{6}\), there is a constant \(C = C_s\) such that

$$\begin{aligned} \Vert \psi (\partial _x) \eta ^3\Vert _{H^s} \le C \Vert \eta \Vert _{H^s} ^3. \end{aligned}$$
(3.9)

Proof

Consider first the case \(\frac{1}{6} \le s < \frac{5}{2}\). In this case, one sees that

$$\begin{aligned} \Big |(1+|\xi |)^s \,\psi (\xi )\Big |=\Big |\frac{ (1+|\xi |)^s \xi }{(1 +\gamma _1 \xi ^2+\delta _1\xi ^4)}\Big | \le C \frac{1}{(1+|\xi |)^{3-s}}. \end{aligned}$$

The last inequality implies that

$$\begin{aligned} \begin{aligned} \Vert \psi (\partial _x) \eta ^3\Vert _{H^s}&= \Vert (1+|\xi |)^s \,\psi (\xi )\widehat{\eta ^3}(\xi )\Vert _{L^2} \le C\left\| \frac{1}{(1+|\xi |)^{3-s}} \widehat{\eta ^3}(\xi )\right\| _{L^2}\\&\le C \left\| \frac{1}{(1+|\xi |)^{3-s}}\right\| _{L^2}\Vert \widehat{\eta ^3}\Vert _{L^{\infty }} \le C\Vert \eta \Vert _{L^3}^3. \end{aligned} \end{aligned}$$

In one dimension, the Sobolev embedding theorem states in part that \(H^{\frac{1}{6}}\) is embedded in \(L^3\), so

$$\begin{aligned} \Vert \eta \Vert _{L^{3}} \le C \Vert \eta \Vert _{H^\frac{1}{6}}, \end{aligned}$$

whence

$$\begin{aligned} \Vert \psi (\partial _x) \eta ^3\Vert _{H^s} \le C\Vert \eta \Vert _{H^s} ^3 \end{aligned}$$

whenever \(\frac{1}{6} \le s < \frac{5}{2}\).
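The restriction \(s < \frac{5}{2}\) in this first case is exactly what makes the weight square integrable. As a quick check, the norm in question can be computed explicitly:

$$\begin{aligned} \left\| \frac{1}{(1+|\xi |)^{3-s}}\right\| _{L^2}^2 = 2\int _0^\infty \frac{\hbox {d}\xi }{(1+\xi )^{6-2s}} = \frac{2}{5-2s}, \end{aligned}$$

which is finite precisely when \(6-2s > 1\), that is, when \(s < \frac{5}{2}\).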

On the other hand, if \(s >1/2\), the Sobolev space \(H^s\) is a Banach algebra. Since \(|\psi (\xi )| \le C\omega (\xi )\), Lemma 3.1 implies that

$$\begin{aligned} \Vert \psi (\partial _x) (\eta \eta ^2)\Vert _{H^s} \le C\Vert \eta \Vert _{H^s} \Vert \eta ^2 \Vert _{H^s}\le C\Vert \eta \Vert _{H^s}^3, \end{aligned}$$

which completes the proof of Proposition 3.3. \(\square \)

Remark 3.4

The reader will appreciate presently that this result is only used in case \(s > \frac{1}{2}\), so the full power of the last proposition is not needed in our theory. We thought it interesting that the result holds down to \(s = \frac{1}{6}\) and note that the inequality at this level could be useful in the setting of internal waves in the deep ocean. This point will be investigated in future research.

Lemma 3.5

For \(s \ge 1\), the inequality

$$\begin{aligned} \Vert \psi (\partial _x) \eta _x^2\Vert _{H^s} \le C \Vert \eta \Vert _{H^s} ^2 \end{aligned}$$
(3.10)

holds.

Proof

Observe that

$$\begin{aligned} |\psi (\xi )| \le C\omega (\xi ) \frac{1}{1+ |\xi |}. \end{aligned}$$

The inequality (3.7) then allows the conclusion

$$\begin{aligned} \Vert \psi (\partial _x) \eta _x^2\Vert _{H^s} \le C\Vert \omega (\partial _x) \eta _x^2\Vert _{H^{s-1}} \le C\Vert \eta _x\Vert _{H^{s-1}}\Vert \eta _x\Vert _{H^{s-1}} \le C\Vert \eta \Vert _{H^{s}}^2, \end{aligned}$$

since \(s-1 \ge 0\). \(\square \)
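The symbol comparisons used in the proofs of Corollary 3.2 and Lemma 3.5 can be sanity-checked numerically. The sketch below evaluates the ratios \(|\tau |/\omega \) and \(|\psi |(1+|\xi |)/\omega \), after cancelling the common factor \(|\xi |\), on a logarithmic grid of frequencies. It is an illustration under assumed parameter values, not a proof, and the function names are hypothetical.

```python
def varphi(xi, gamma1, delta1):
    """Denominator 1 + gamma1*xi^2 + delta1*xi^4 of the symbols."""
    return 1.0 + gamma1 * xi**2 + delta1 * xi**4


def tau_over_omega(xi, gamma, gamma1, delta1):
    """|tau(xi)| / omega(xi) with the common factor |xi| cancelled."""
    return abs(3.0 - 4.0 * gamma * xi**2) * (1.0 + xi**2) / (4.0 * varphi(xi, gamma1, delta1))


def psi_over_omega_weighted(xi, gamma1, delta1):
    """|psi(xi)| * (1 + |xi|) / omega(xi) with the common factor |xi| cancelled."""
    return (1.0 + xi**2) * (1.0 + abs(xi)) / varphi(xi, gamma1, delta1)


# Hypothetical sample parameters with gamma1, delta1 > 0.
gamma, gamma1, delta1 = 7.0 / 48.0, 0.1, 0.02

# Frequencies 0 and 10^-3 ... 10^6 (both ratios are even functions of xi).
grid = [0.0] + [10.0 ** (k / 20.0) for k in range(-60, 121)]

sup_tau = max(tau_over_omega(xi, gamma, gamma1, delta1) for xi in grid)
sup_psi = max(psi_over_omega_weighted(xi, gamma1, delta1) for xi in grid)
print(sup_tau, sup_psi)
```

Both ratios remain bounded because \(\delta _1 > 0\) makes the denominator grow like \(\xi ^4\). The first ratio tends to \(|\gamma |/\delta _1\) and the second tends to zero as \(|\xi | \rightarrow \infty \).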

The preceding ingredients are assembled to provide a proof of the local well-posedness theorem.

Proof of Theorem 2.2

Define a mapping

$$\begin{aligned} \Psi \eta (x,t) = S(t)\eta _0 -i\int _0^tS(t-t')\Big (\tau (\partial _x)\eta ^2 - \frac{1}{8} \psi (\partial _x)\eta ^3 -\frac{7}{48} \psi (\partial _x)\eta _x^2\Big )(x, t') \hbox {d}t'.\nonumber \\ \end{aligned}$$
(3.11)

The immediate goal is to show that this mapping is a contraction on a closed ball \({\mathcal {B}}_r\) with radius \(r > 0\) and center at the origin in \(C([0,T];H^s)\).

As remarked earlier, S(t) is a unitary group in \(H^s({\mathbb {R}})\) [see (3.5)], and therefore,

$$\begin{aligned} \Vert \Psi \eta \Vert _{H^s} \le \Vert \eta _0\Vert _{H^s} +CT\Big [\big \Vert \tau (\partial _x)\eta ^2 - \frac{1}{8} \psi (\partial _x)\eta ^3 -\frac{7}{48}\psi (\partial _x)\eta _x^2\big \Vert _{C([0,T];H^s)}\Big ]. \end{aligned}$$

The inequalities (3.8), (3.9) and (3.10) lead immediately to

$$\begin{aligned} \Vert \Psi \eta \Vert _{H^s} \le \Vert \eta _0\Vert _{H^s} +CT\Big [\big \Vert \eta \big \Vert _{C([0,T];H^s)}^2 + \big \Vert \eta \big \Vert _{C([0,T];H^s)}^3 +\big \Vert \eta \big \Vert _{C([0,T];H^s)}^2\Big ].\nonumber \\ \end{aligned}$$
(3.12)

If, in fact, \(\eta \in {\mathcal {B}}_r\), then (3.12) yields

$$\begin{aligned} \Vert \Psi \eta \Vert _{H^s} \le \Vert \eta _0\Vert _{H^s} +CT\big [2r +r^2 \big ]r. \end{aligned}$$

If we choose \(r= 2\Vert \eta _0\Vert _{H^s}\) and \(T= \frac{1}{2Cr(2 + r) }\), then \(\Vert \Psi \eta \Vert _{H^s} \le r\), showing that \(\Psi \) maps the closed ball \({\mathcal {B}}_r\) in \(C([0,T];H^s)\) into itself. With the same choice of r and T and the same sort of estimates, one discovers that \(\Psi \) is a contraction on \({\mathcal {B}}_r\) with contraction constant equal to \(\frac{1}{2}\) as it happens. The rest of the proof is standard. \(\square \)
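The arithmetic behind the choice of r and T can be checked in exact rational arithmetic. In the sketch below, the function names and the sample values of \(\Vert \eta _0\Vert _{H^s}\) and of the constant C are hypothetical. The point is only that the right-hand side of the bound following (3.12) equals r for this choice, and that T agrees with the expression displayed in (3.13).

```python
from fractions import Fraction as F


def ball_radius_and_time(norm_eta0, C):
    """Radius r = 2*||eta_0|| and time T = 1/(2*C*r*(2 + r)) from the proof."""
    r = 2 * norm_eta0
    T = 1 / (2 * C * r * (2 + r))
    return r, T


def self_map_bound(norm_eta0, C, r, T):
    """Right-hand side ||eta_0|| + C*T*(2r + r^2)*r of the bound on ||Psi eta||."""
    return norm_eta0 + C * T * (2 * r + r**2) * r


# Hypothetical sample values for the norm of the data and the constant C.
norm_eta0, C = F(3, 2), F(5)
r, T = ball_radius_and_time(norm_eta0, C)
bound = self_map_bound(norm_eta0, C, r, T)
T_bar = 1 / (8 * C * norm_eta0 * (1 + norm_eta0))  # expression in (3.13)
print(bound == r, T == T_bar)
```

The second term of the bound equals \(\frac{r}{2}\) by the choice of T, and \(\Vert \eta _0\Vert _{H^s} = \frac{r}{2}\) by the choice of r, which is why the sum is exactly r.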

Remark 3.6

The following points follow immediately from the proof of Theorem 2.2:

  1. (1)

    The maximal existence time \(T_s\) of the solution satisfies

    $$\begin{aligned} T_s\ge \bar{T} = \frac{1}{8C_s\Vert \eta _0\Vert _{H^s}(1+\Vert \eta _0\Vert _{H^s})}, \end{aligned}$$
    (3.13)

    where the constant \(C_s\) depends only on s.

  2. (2)

    The solution cannot grow too much on the interval \([0,{\bar{T}}]\) since

    $$\begin{aligned} \Vert \eta (\cdot ,t)\Vert _{H^s} \le r = 2\Vert \eta _0\Vert _{H^s} \end{aligned}$$
    (3.14)

    for t in this interval, where \(\bar{T}\) is as above in (3.13).

3.2 Global Well-Posedness

In this section, a priori deduced bounds are obtained with an eye toward extending the local well-posedness just established. The present theory countenances the spaces \(H^s({\mathbb {R}}), s \ge \frac{3}{2}\). However, we begin with a global well-posedness result in \(H^s({\mathbb {R}})\) for \(s\ge 2\).

3.2.1 Global Well-Posedness in \(H^2\)

The aim here is to derive an a priori estimate in \(H^2({\mathbb {R}})\), subject to certain restrictions on the parameters that appear in (2.16). Multiplying Eq. (2.16) by \(\eta \), integrating over the spatial domain \({\mathbb {R}}\) and integrating by parts yields

$$\begin{aligned} \frac{1}{2} \frac{\hbox {d}}{\hbox {d}t} \int _{\mathbb {R}} \left( \eta ^2 + \gamma _1\eta _x^2+\delta _1\eta _{xx}^2\, \right) \hbox {d}x+ \gamma \int _{\mathbb {R}}(\eta ^2)_{xxx}\, \eta \,\hbox {d}x-\frac{7}{48} \int _{\mathbb {R}} (\eta _x^2)_x\, \eta \,\hbox {d}x=0. \end{aligned}$$

Further integrations by parts give

$$\begin{aligned} \frac{1}{2} \frac{\hbox {d}}{\hbox {d}t} \int _{\mathbb {R}} \left( \eta ^2 + \gamma _1\eta _x^2+\delta _1\eta _{xx}^2\, \right) \hbox {d}x= \left( \gamma -\frac{7}{48}\right) \int _{\mathbb {R}}\eta _x^3\,\hbox {d}x. \end{aligned}$$
(3.15)
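For the reader's convenience, the integrations by parts behind (3.15) can be spelled out as follows, assuming that \(\eta \) is smooth and decays, together with its derivatives, as \(|x| \rightarrow \infty \):

$$\begin{aligned} \int _{\mathbb {R}}(\eta ^2)_{xxx}\,\eta \,\hbox {d}x&= \int _{\mathbb {R}}(\eta ^2)_{x}\,\eta _{xx}\,\hbox {d}x = \int _{\mathbb {R}}\eta \,(\eta _x^2)_x\,\hbox {d}x = -\int _{\mathbb {R}}\eta _x^3\,\hbox {d}x, \\ \int _{\mathbb {R}}(\eta _x^2)_x\,\eta \,\hbox {d}x&= -\int _{\mathbb {R}}\eta _x^3\,\hbox {d}x. \end{aligned}$$

Here the middle step in the first line uses \((\eta ^2)_x\,\eta _{xx} = 2\eta \eta _x\eta _{xx} = \eta \,(\eta _x^2)_x\). Substituting these two identities into the preceding display produces the right-hand side \(\left( \gamma -\frac{7}{48}\right) \int _{\mathbb {R}}\eta _x^3\,\hbox {d}x\) of (3.15).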

Of course, these calculations involve derivatives of higher order than are guaranteed to exist by assuming the initial data lies only in \(H^2\). However, one can make the calculations using smoother solutions and then pass to the limit of rougher data making use of the continuous dependence result. The idea is standard and we pass over the details (cf. Bona and Kalisch 2000).

From (3.15), it is clear that an a priori estimate obtains when \(\gamma =\frac{7}{48}\). That such a condition can be imposed while respecting the other mathematical limitations \(\gamma _1>0\) and \(\delta _1 > 0\) will be discussed in Sect. 4. For the time being, we presume that \(\theta , \lambda , \mu , \lambda _1, \mu _1\) and \(\rho \) have been chosen so that \(\gamma = \frac{7}{48}\) and \(\gamma _1, \delta _1 > 0\) still hold. In this case, Eq. (2.16) becomes

$$\begin{aligned}&\eta _t+\eta _x-\gamma _1 \eta _{xxt}+\gamma _2\eta _{xxx}+\delta _1\eta _{xxxxt}+\delta _2\eta _{xxxxx}\nonumber \\&\quad +\,\frac{3}{4}(\eta ^2)_x+\gamma \big (\eta ^2\big )_{xxx}-\gamma \big (\eta _x^2\big )_x -\frac{1}{8}\big (\eta ^3\big )_x=0. \end{aligned}$$
(3.16)

In this form, it has the conserved quantity

$$\begin{aligned} E(\eta (\cdot ,t)):= \frac{1}{2}\int _{\mathbb {R}} \eta ^2 + \gamma _1 (\eta _x)^2+\delta _1(\eta _{xx})^2\, \hbox {d}x= E(\eta _0). \end{aligned}$$
(3.17)

Remark 3.7

In fact, with the restriction \(\gamma =\frac{7}{48}\), the equation is Hamiltonian, for there is a second conserved quantity, namely

$$\begin{aligned} \Theta (\eta ) = \frac{1}{2}\int _{\mathbb {R}}\left( -\eta ^2 -\frac{1}{2}\eta ^3 + \frac{1}{16}\eta ^4 + \frac{7}{24} \eta \eta _x^2+\gamma _2\eta _x^2 - \delta _2 \eta _{xx}^2\right) \hbox {d}x. \end{aligned}$$

The system itself may be written in the Hamiltonian format

$$\begin{aligned} \frac{\partial }{\partial t} \nabla E(\eta ) \, = \, \frac{\partial }{\partial x} \nabla \Theta (\eta ) \end{aligned}$$

where \(\nabla E\) is the Euler derivative of E and similarly \(\nabla \Theta \) the Euler derivative of \(\Theta \).
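As a check on this format, the two Euler derivatives can be written out explicitly. The following is a direct computation from the definitions of E and \(\Theta \) given above, assuming sufficient smoothness and decay to discard boundary terms:

$$\begin{aligned} \nabla E(\eta )&= \eta -\gamma _1\eta _{xx}+\delta _1\eta _{xxxx}, \\ \nabla \Theta (\eta )&= -\eta -\frac{3}{4}\eta ^2+\frac{1}{8}\eta ^3 -\frac{7}{48}\big (\eta _x^2+2\eta \eta _{xx}\big ) -\gamma _2\eta _{xx}-\delta _2\eta _{xxxx}. \end{aligned}$$

Since \(\eta _x^2+2\eta \eta _{xx} = (\eta ^2)_{xx}-\eta _x^2\), differentiating \(\nabla \Theta (\eta )\) with respect to x and equating the result to \(\partial _t \nabla E(\eta )\) recovers (3.16) term by term with \(\gamma = \frac{7}{48}\).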

The conservation law (3.17), which is essentially the \(H^2\)-norm, immediately points to the following global well-posedness result.

Theorem 3.8

Let \(s\ge 2\) and suppose \(\gamma _1, \delta _1 > 0\) and \(\gamma =\frac{7}{48}\). Then the IVP for Eq. (2.16) is globally well-posed in \(H^s({\mathbb {R}})\).

Proof

Following a standard argument, the global well-posedness in \(H^2({\mathbb {R}})\) is a consequence of the local theory and the a priori bound implied by the conserved quantity (3.17). To prove global well-posedness in \(H^k\), where \(k \ge 3\) is an integer, we proceed by induction on k.

Assume that \(\eta _0\) lies in \(H^3\). The local well-posedness theory then delivers a solution in \(C([0,T];H^3)\) for some \(T > 0\). If a priori bounds on the \(H^3\)-norm of \(\eta \) which are finite on finite time intervals hold, then the local theory can be iterated and a global solution results.

Differentiate Eq. (3.16) with respect to the spatial variable, multiply the resulting equation by \(\eta _x\) and integrate over \({\mathbb {R}}\). After integrations by parts in the spatial variable, there obtains

$$\begin{aligned} \begin{aligned}&\frac{1}{2} \frac{\hbox {d}}{\hbox {d}t} \int _{\mathbb {R}} \left( \eta _x^2 + \gamma _1 \eta _{xx}^2+\delta _1\eta _{xxx}^2 \right) \,\hbox {d}x +\frac{3}{4} \int _{\mathbb {R}}\eta _{x}^3 \, \hbox {d}x \\&\quad -3 \gamma \int _{\mathbb {R}}\eta _{xx}^2 \eta _{x}\,dx-\frac{3}{8} \int _{\mathbb {R}} \, \eta _x^3\,\eta \, \hbox {d}x=0. \end{aligned} \end{aligned}$$
(3.18)

Standard Sobolev embedding results show that for any time t for which the solution exists,

$$\begin{aligned} \begin{aligned} \Vert \eta \Vert _{L_x^2}^2&\le 2 E_0, \quad \Vert \eta _x\Vert _{L_x^2}^2 \le \frac{2}{\gamma _1}E_0, \quad \Vert \eta _{xx}\Vert _{L_x^2}^2 \le \frac{2}{\delta _1}E_0, \\ \Vert \eta \Vert _{L_x^{\infty }}^2&\le \frac{4}{\sqrt{\gamma _1}}\,E_0, \quad \Vert \eta _x\Vert _{L_x^{\infty }}^2 \le \frac{4}{\sqrt{\delta _1\gamma _1}}\,\,E_0, \end{aligned} \end{aligned}$$
(3.19)

where \(E_0 = E(\eta _0)\). After integrating (3.18) with respect to time over the interval [0, t], making elementary estimates of all the terms not involving a third derivative and using (3.19) systematically, there obtains the inequality

$$\begin{aligned} \delta _1\int _{\mathbb {R}} \eta _{xxx}^2\, \hbox {d}x\le & {} \,\int _{\mathbb {R}} \left( (\eta _{0x})^2+\gamma _1(\eta _{0xx})^2 + \delta _1(\eta _{0xxx})^2\right) \,\hbox {d}x \\&+\,C \int _0^t \Vert \eta _x\Vert _{L_x^{\infty }} \left( \Vert \eta _x\Vert _{L_x^2}^2+ \Vert \eta _{xx}\Vert _{L_x^2}^2+\Vert \eta _x\Vert _{L_x^2}^2\,\Vert \eta \Vert _{L_x^{\infty }}\right) \hbox {d}t' \\\le & {} \delta _1\int _{\mathbb {R}} (\eta _{0xxx})^2\, \hbox {d}x+ CE_0 + C E_0^{3/2}\left( 1+ E_0^{1/2}\right) t, \end{aligned}$$

from which the desired \(H^3\)-bound follows.

Assuming there are in hand \(H^k\) bounds, an entirely similar energy-type calculation reveals that the solution \(\eta \) has \(H^{k+1}\)-bounds as soon as the initial data \(\eta _0\) lies in \(H^{k+1}\).

To obtain global well-posedness in the fractional-order Sobolev spaces \(H^s, s \ge 2\) not an integer, nonlinear interpolation theory (see Bona and Scott 1976; Bona et al. 2014) may be applied in a straightforward way, thereby completing the proof of the theorem. \(\square \)

3.2.2 Global Well-Posedness in \(H^s, s\ge \frac{3}{2}\)

The object of this subsection is to prove the second main result, Theorem 2.3. To establish well-posedness below the level where a priori bounds obtain, a Fourier splitting technique will be employed wherein the data \(\eta _0\) is decomposed into a small, rough part and a smooth part. As already mentioned, such decompositions are commonplace in various contexts in the theory of partial differential equations.

Let there be given initial data \(\eta _0 \in H^s\) where \(1 \le s < 2\) and a \(T>0\). As advertised, the data \(\eta _0\) is decomposed into a small part and a smooth part, viz.

$$\begin{aligned} \eta _0 = w_0 \, + \, v_0 \qquad \mathrm{where} \qquad w_0 \in H^\infty \,\,\, \mathrm{and} \,\,\, v_0 \in {H^s} \end{aligned}$$

is small. Such a decomposition can be effected in many ways. One that is especially helpful in what follows is the one-parameter family \(\{w_0^\epsilon \}_{\epsilon > 0}\) defined by way of their Fourier transforms to be

$$\begin{aligned} \widehat{w_0^\epsilon } = \zeta (\epsilon \xi )\widehat{\eta _0}(\xi ) \end{aligned}$$

where \(\zeta \) is an even, \(C^\infty \)-function defined on \({\mathbb {R}}, 0 \, \le \zeta \le 1, \, \zeta (0) = 1\) and such that \(1-\zeta (\xi )\) has a zero of infinite order at \(\xi = 0\) while \(\zeta \) decays exponentially to 0 as \(|\xi | \rightarrow \infty \). (For example, \(\zeta \) could be a cutoff function which is identically equal to 1 on the interval \([-1,1]\) and has support in \([-2,2]\).) It follows by a straightforward computation in the Fourier transformed variables that if \(\eta _0 \in H^s\), then for \(r \ge 0\),

$$\begin{aligned} \Vert w_0^\epsilon \Vert _{H^{s+r}} \,=\, O\left( \epsilon ^{-r} \right) \quad \mathrm{and}\qquad \Vert \eta _0 - w_0^\epsilon \Vert _{H^{s-r}} \, = \,o\left( \epsilon ^r\right) \end{aligned}$$
(3.20)

as \(\epsilon \downarrow 0\) [see, for example, Lemma 5 in Bona and Smith (1975)]. Define \(v_0 = v^\epsilon _0 = \eta _0 - w_0^{\epsilon }\). For the moment, the dependence of both \(v_0\) and \(w_0\) upon \(\epsilon \) will be suppressed. The values of \(\epsilon \) will be appropriately limited presently.

By choosing \(\epsilon \) small enough so that \(\Vert v_0\Vert _{H^s} \le 1\) and \(\Vert v_0\Vert _{H^s} \le \frac{1}{12C_sT}\), the local well-posedness theory adduced in Theorem 2.2 assures us that if we pose \(v_0\) as initial data for our evolution Eq. (3.16), then the solution v emanating from it will lie in \(C([0,T];H^s)\) and it will not be larger than \(2\Vert v_0\Vert _{H^s}\) over the entire time interval [0, T] (see Remark 3.6). It can also be ensured that

$$\begin{aligned} \Vert v(\cdot ,t)\Vert _{H^1} \le 2\Vert v_0\Vert _{H^1}\quad \mathrm{for \, \, all} \,\, t \in [0,T], \end{aligned}$$

simply by imposing the further restriction \(\Vert v_0\Vert _{H^1} \le \frac{1}{12C_1T}\). This follows since the integral operator \(\Psi \) in (3.11) will simultaneously satisfy (3.13) and (3.14) for both the Sobolev indices s and 1. The solutions, which are the fixed points of \(\Psi \) in the two spaces, must be the same by uniqueness in the larger space.

Once v is fixed and known to exist on the entire time interval [0, T], the smooth part \(w_0\) of the initial data is evolved according to the variable coefficient IVP

$$\begin{aligned} {\left\{ \begin{array}{ll} w_t+w_x -\gamma _1 w_{xxt}+\gamma _2 w_{xxx}+\delta _1 w_{xxxxt}+\delta _2 w_{xxxxx}+G(v,w)=0, \\ w(x,0)=w_0(x), \end{array}\right. } \end{aligned}$$
(3.21)

where

$$\begin{aligned} G(v,w):= & {} \frac{3}{2}(v w)_x +\frac{3}{4}(w^2)_x +2 \gamma (vw)_{xxx}+\gamma (w^2)_{xxx}\nonumber \\&-2\gamma (v_xw_x)_{x}-\gamma (w_x^2)_{x}-\frac{3}{8}(v^2w)_{x}\nonumber \\&-\frac{3}{8}(vw^2)_{x}-\frac{1}{8}(w^3)_{x}. \end{aligned}$$
(3.22)

If a solution w exists in \(C([0,T];H^s)\), then \(v + w\) provides a solution on the time interval [0, T] of the original problem for Eq. (3.16) with initial value \(\eta _0\). As T was arbitrary, global existence is thereby concluded. Well-posedness then follows from the local theory. That is, the continuous dependence of the solution on the initial data and the uniqueness of solutions within the function class \(C([0,T];H^s)\) derive from the previously elucidated local well-posedness results. Thus, Theorem 2.3 will be established as soon as (3.21) is shown to have a solution in \(C([0,T];H^s)\).

Proof of Theorem 2.3

As already discussed, the variable coefficient v appearing in the nonlinearity (3.22) lies in \(C([0,T];H^s) \subset C([0,T];H^1)\). As a first step, it is important to show that the IVP (3.21) for w is locally well-posed in \(H^2\) and not just in \(H^s\). To this end, write the IVP (3.21) in the equivalent, integral equation form

$$\begin{aligned} {\left\{ \begin{array}{ll} w(x,t) = S(t)w_0 -i \displaystyle \int _0^tS(t-t')\Big (\tau (\partial _x)w^2 + 2\tau (\partial _x)wv - \frac{1}{8} \psi (\partial _x)w^3 \\ - \frac{3}{8} \psi (\partial _x)w^2v - \frac{3}{8} \psi (\partial _x)wv^2 - \gamma \psi (\partial _x)w_x^2 - 2\gamma \psi (\partial _x)w_xv_x \Big )(x, t') \hbox {d}t' \\ =\, \Phi (w)(x,t), \end{array}\right. }\qquad \end{aligned}$$
(3.23)

where the Fourier multiplier operators \(\psi (\partial _x)\) and \(\tau (\partial _x)\) are as defined already in (3.2) and the unitary family S(t) is the solution group for the linear Eq. (3.4).

This integral equation is studied in \(C([0,T];H^2)\) when the variable coefficient v lies in \(C([0,T];H^s)\). As \(w_0\) lies in \(H^\infty \), it is clear that \(S(t)w_0\) lies in \(C({\mathbb {R}};H^2)\). Just as in the earlier analysis of the integral Eq. (3.6), the argument proceeds by showing that the mapping \( w \mapsto \Phi (w)\) defined by the right-hand side of (3.23) is a contraction on a ball \({\mathcal {B}}_r\) of radius r about 0 in the space \(C([0,T_0];H^2)\) for r large enough and \(T_0\) small enough. This will establish the local well-posedness needed for the next step in the analysis.

The summands in the integral equation that only feature w may be handled just as before and suitable estimates are forthcoming since we are working in \(H^2\) (see the proof of Theorem 2.2). The following lemma provides the extra information needed to complete the argument in favor of \(\Phi \) being a contraction mapping on \({\mathcal {B}}_r \subset C([0,T_0];H^2)\) for suitable \(T_0\) and r.

Lemma 3.9

Suppose \(1 \le s < 2\). Then for \(f \in H^s\) and \(g \in H^2\), there are constants C depending only on s such that

$$\begin{aligned} \begin{aligned} \Vert \tau (\partial _x)fg\Vert _{H^2}&\le C \Vert f \Vert _{H^s} \Vert g \Vert _{H^2}, \qquad \Vert \psi (\partial _x)f^2g \Vert _{H^2} \le C \Vert f \Vert ^2_{H^s} \Vert g \Vert _{H^2}, \\ \Vert \psi (\partial _x)fg^2 \Vert _{H^2}&\le C \Vert f \Vert _{H^s} \Vert g \Vert ^2_{H^2}, \qquad \Vert \psi (\partial _x)f_xg_x \Vert _{H^2} \le C \Vert f \Vert _{H^s} \Vert g \Vert _{H^2}. \end{aligned}\nonumber \\ \end{aligned}$$
(3.24)

Proof

As \(\tau (\partial _x)\) is a bounded map from \(H^r\) to \(H^{r+1}\), it follows that

$$\begin{aligned} \Vert \tau (\partial _x)fg\Vert _{H^2} \le C\Vert fg \Vert _{H^1} \le C\Vert f \Vert _{H^1} \Vert g \Vert _{H^1} \le C\Vert f \Vert _{H^s} \Vert g \Vert _{H^2}. \end{aligned}$$

The operator \(\psi (\partial _x)\) maps \(H^r\) to \(H^{r+3}\). Consequently, we have

$$\begin{aligned} \Vert \psi (\partial _x) f^2g \Vert _{H^2}&\le C\Vert f ^2g\Vert _{H^1} \le C\Vert f \Vert ^2_{H^1} \Vert g \Vert _{H^1} \le C\Vert f \Vert ^2_{H^s}\Vert g \Vert _{H^2}, \\ \Vert \psi (\partial _x) fg^2 \Vert _{H^2}&\le C\Vert f g^2\Vert _{H^1} \le C\Vert f \Vert _{H^1} \Vert g \Vert ^2_{H^1} \le C\Vert f \Vert _{H^s}\Vert g \Vert ^2_{H^2},\\ \Vert \psi (\partial _x)f_xg_x \Vert _{H^2}&\le C\Vert f_x g_x\Vert _{L^2} \le C\Vert f_x \Vert _{L^2} \Vert g_x \Vert _{L^\infty } \le C\Vert f \Vert _{H^s}\Vert g \Vert _{H^2}, \end{aligned}$$

and the results are established. \(\square \)

It is straightforward to use the smoothing estimates (3.24) to show that the mapping \(\Phi \) is a contraction on a suitably chosen ball about the origin in \(C([0,T_0];H^2)\) for \(T_0\) small enough, which is the content of the following proposition.

Proposition 3.10

The IVP (3.21) is locally well-posed in \(H^2\).

It remains only to show that the local in time solution w of (3.21) can be continued to the entire time interval [0, T]. This in turn will be settled as soon as a priori bounds on w in \(H^2\) are provided which are valid on [0, T]. To see such a bound obtains, multiply Eq. (3.21) by w, integrate over \({\mathbb {R}}\) and integrate by parts in the spatial variable to obtain

$$\begin{aligned} \begin{aligned} \frac{1}{2} \frac{\partial }{\partial t} \int _{\mathbb {R}} \left( w^2 +\gamma _1 w_x^2+\delta _1 w_{xx}^2\,\right) \hbox {d}x&= \frac{3}{2}\int _{\mathbb {R}} v w w_x \,\hbox {d}x-2\gamma \int _{\mathbb {R}}(v w)_{x}w_{xx}\,\hbox {d}x\\&\quad -2\gamma \int _{\mathbb {R}}\,v_x (w_x)^2\, \hbox {d}x\\&\quad -\frac{3}{8} \int _{\mathbb {R}}v^2 w\,w_{x}\, \hbox {d}x-\frac{3}{8} \int _{\mathbb {R}}\,v\, w^2 \, w_x\,\hbox {d}x. \end{aligned} \end{aligned}$$
(3.25)

The intermediate computations are justified as before by the use of the continuous dependence results in \(H^2\) for w and \(H^1\) for v. Let \({\mathcal {X}}(t):=\int _{\mathbb {R}} \left( w^2 +\gamma _1 w_x^2+\delta _1 w_{xx}^2\,\right) \hbox {d}x\). Then, \({\mathcal {X}}(t)\) is equivalent to the square of the \(H^2\)-norm of \(w(\cdot ,t)\).

The next task is to obtain an upper bound on the right-hand side of (3.25) in terms of \(\Vert w\Vert _{H^2}\) and \(\Vert v\Vert _{H^1}\). The fact that \(\Vert w\Vert _{L^\infty }\) and \(\Vert w_x\Vert _{L^\infty }\) are both bounded by \(\Vert w\Vert _{H^2}\), together with elementary estimates, implies that

$$\begin{aligned} \begin{aligned} \frac{\partial {\mathcal {X}}(t)}{\partial t}&\le C\Big (\big (\Vert v\Vert _{H^1} + \Vert v\Vert ^2_{H^1}\big )\Vert w\Vert ^2_{H^2} + \Vert v\Vert _{H^1}\Vert w\Vert ^3_{H^2}\Big ) \\&\le C\Big (\big (\Vert v\Vert _{H^1} + \Vert v\Vert ^2_{H^1}\big ){\mathcal {X}}(t) + \Vert v\Vert _{H^1}{\mathcal {X}}(t)^{\frac{3}{2}}\Big ). \end{aligned} \end{aligned}$$
(3.26)

Recall that \(\Vert v(\cdot ,t)\Vert _{H^1} \le 2\Vert v_0\Vert _{H^1}\) on the entire interval [0, T]. In consequence, (3.26) can be extended as follows:

$$\begin{aligned} \frac{\partial {\mathcal {X}}(t)}{\partial t} \, \le \, 2C\Vert v_0\Vert _{H^1}\left( {\mathcal {X}}(t) + {\mathcal {X}}(t)^{\frac{3}{2}} \right) . \end{aligned}$$

Notice that, because of (3.20),

$$\begin{aligned} \Vert v_0\Vert _{H^1} = o(\epsilon ^{s-1}) = \nu (\epsilon )\epsilon ^{s-1} \quad \mathrm{where} \quad \nu (\epsilon ) \rightarrow 0 \quad \mathrm{as} \quad \epsilon \rightarrow 0. \end{aligned}$$
(3.27)

If \(\Sigma (t)\) is the solution of

$$\begin{aligned} \frac{\hbox {d}\Sigma }{\hbox {d}t} = 2C\Vert v_0\Vert _{H^1}\left( \Sigma (t) + \Sigma (t)^{\frac{3}{2}}\right) \end{aligned}$$
(3.28)

with \(\Sigma (0) = {\mathcal {X}}(0)\), then a Gronwall-type argument implies that \({\mathcal {X}}(t) \le \Sigma (t)\) for all t for which \(\Sigma \) is finite. The solution of (3.28) is

$$\begin{aligned} \sigma (t) = \frac{\sigma (0)e^{Ct\Vert v_0\Vert _{H^1}}}{1 - \sigma (0)\left( e^{Ct\Vert v_0\Vert _{H^1}} - 1\right) } \le \frac{\sigma (0)e^{CT\Vert v_0\Vert _{H^1}}}{1 - \sigma (0)\left( e^{CT\Vert v_0\Vert _{H^1}} - 1\right) }, \end{aligned}$$
(3.29)

provided the right-hand side is positive and finite, where \(\sigma (t)^2 = \Sigma (t)\). Of course, as long as \(0 \le y \le 1\), say, then \(e^y - 1 \le ey\). Since T is fixed and \(\Vert v_0\Vert _{H^1}\) is small for small values of \(\epsilon \), the right-hand side of (3.29) may be bounded above by

$$\begin{aligned} \frac{\sigma (0)e^{CT \Vert v_0\Vert _{H^1}}}{1 - CTe\sigma (0)\Vert v_0\Vert _{H^1}}. \end{aligned}$$

The latter will provide the desired upper bound needed to continue the solution w to the entire time interval [0, T] as soon as

$$\begin{aligned} \sigma (0) \Vert v_0\Vert _{H^1} < \frac{1}{CeT}. \end{aligned}$$
(3.30)

As \(\sigma (0)\) is equivalent to the \(H^2\)-norm of \(w_0\), (3.20) implies that \( \sigma (0) \, \le \, C\epsilon ^{s-2}. \) Combining this with (3.27), it is seen that

$$\begin{aligned} \sigma (0)\Vert v_0\Vert _{H^1} \, = \, o\left( \epsilon ^{2s - 3} \right) \quad \mathrm{as} \quad \epsilon \downarrow 0. \end{aligned}$$

Consequently, if \(s \ge \frac{3}{2}\) and \(\epsilon \) small enough, (3.30) is valid and the result is proved. \(\square \)
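The closed form (3.29) and the bound that follows it can be checked numerically. The sketch below is only an illustration under stated assumptions: the symbol `K` stands for the product \(C\Vert v_0\Vert _{H^1}\), the sample values of `T`, `sigma0` and `K` are hypothetical, and the ordinary differential equation \(\sigma ' = K(\sigma + \sigma ^2)\) is the one obtained from (3.28) by writing \(\Sigma = \sigma ^2\).

```python
import math


def sigma_closed_form(t, sigma0, K):
    """Closed form (3.29) for sigma' = K*(sigma + sigma**2), sigma(0) = sigma0."""
    growth = math.exp(K * t)
    denom = 1.0 - sigma0 * (growth - 1.0)
    if denom <= 0.0:
        raise ValueError("the closed form is no longer finite at time t")
    return sigma0 * growth / denom


def sigma_rk4(t_end, sigma0, K, steps=2000):
    """Integrate sigma' = K*(sigma + sigma**2) with the classical RK4 scheme."""
    def rhs(s):
        return K * (s + s * s)

    h = t_end / steps
    s = sigma0
    for _ in range(steps):
        k1 = rhs(s)
        k2 = rhs(s + 0.5 * h * k1)
        k3 = rhs(s + 0.5 * h * k2)
        k4 = rhs(s + h * k3)
        s += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return s


def upper_bound(T, sigma0, K):
    """Bound displayed after (3.29); needs K*T <= 1 and condition (3.30)."""
    denom = 1.0 - K * T * math.e * sigma0
    if denom <= 0.0:
        raise ValueError("condition (3.30) fails for these values")
    return sigma0 * math.exp(K * T) / denom


# Hypothetical sample values; K plays the role of C * ||v_0||_{H^1}.
T, sigma0, K = 1.0, 5.0, 0.05
exact = sigma_closed_form(T, sigma0, K)
numeric = sigma_rk4(T, sigma0, K)
bound = upper_bound(T, sigma0, K)
print(exact, numeric, bound)
```

For these sample values \(\sigma (0)K = 0.25\) lies below \(1/(eT)\), so (3.30) holds and the closed form stays finite on [0, T]. Raising `sigma0` or `K` until (3.30) fails makes `upper_bound` raise an error, which mirrors the role of the smallness condition in the proof.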

4 Parameter Restrictions

The partial differential equations in the class (2.16) are all formally equivalent models for long-crested, small-amplitude long waves on the surface of an ideal fluid over a flat bottom. The hope is that they approximate solutions of the full water wave problem for an ideal fluid with an error of order \(O\left( \beta ^3 t\right) \) over a timescale at least of order \(O\left( \beta ^{-2}\right) \). Rigorous theory to this effect, but only on the shorter, Boussinesq timescale \(O\left( \beta ^{-1}\right) \), is available for the lower order, unidirectional models (2.2) by combining results in Alazman et al. (2006), Bona et al. (2005) and Bona et al. (1983).

It deserves remark that various models already existing in the literature are specializations of the class of models displayed in (2.13). For example, the model derived in Dullin et al. (2003), in its zero surface tension limit (see also Johnson 2002 and Marchant and Smyth 1990), appears by taking \(\rho = b + d\) and an appropriate choice of \(\lambda _1\). As will be clear momentarily, this model, like the one in Olver (1984b), is not Hamiltonian.

Despite the fact that the models are formally equivalent, they may have very different mathematical properties. When it comes time to choose one of the models for use in a real-world situation, one naturally wants to have good mathematical properties at hand. This was discussed in some detail in Bona et al. (2002) and Bona et al. (2004) in the context of the lower-order system (1.6)–(1.7).

In the present account, theory has been developed that implies the local well-posedness of the initial value problem for a subclass of our unidirectional models. Local well-posedness is a minimal requirement for the use of such models in practice. We also found an additional condition which allows the local theory to be continued indefinitely. It is especially noteworthy that this condition implies that the equation has a Hamiltonian structure. The full water wave model also has a Hamiltonian structure, and experience indicates that maintaining such a Hamiltonian arrangement in approximate models is likely to lead to better qualitative agreement with the full model. Hence, our recommendation is to use the special versions of our equation displayed in (3.16).

Interest is now turned to specifying conditions under which the various restrictions on the coefficients \(\gamma _1, \delta _1\) and \(\gamma \) that cropped up during our analysis are valid. Recall that these conditions were

$$\begin{aligned} \gamma _1\,>\,0, \quad \delta _1 \, > \, 0\quad \mathrm{and} \quad \gamma = \frac{7}{48} \end{aligned}$$
(4.1)

(see Theorem 3.8). The models satisfying these three conditions appear to have a more satisfactory mathematical theory. It is worth reiterating that comparison results indicating that such models approximate solutions of the full water wave problem rely upon smoothness (see Alazman et al. 2006; Bona et al. 2005; Lannes 2013, for example). The fact that, with the restrictions (4.1), the model is globally well-posed in smooth function classes is therefore potentially very useful.

4.1 Hamiltonian Structure

The Hamiltonian structure displayed in Remark 3.7 is the key to our global well-posedness results. It also engenders other good features in the model which are not entered upon here.

So far, the condition \(\gamma = \frac{7}{48}\) is the only one under which a Hamiltonian structure is known to exist. Imposing \(\gamma =\frac{7}{48}\) in the formula for \(\gamma \) given in (2.14) leads to

$$\begin{aligned} \frac{1}{24}\big [5-9(b+d)+9\rho \big ] = \frac{7}{48}. \end{aligned}$$

Thus, the Hamiltonian structure is guaranteed if one chooses \(\rho \) by the formula

$$\begin{aligned} \rho = b+d-\frac{1}{6}, \end{aligned}$$
(4.2)

which is exactly the one advertised in (2.11). In terms of the fundamental parameters \(\theta , \lambda \) and \(\mu \), the value of \(\rho \) given in (4.2) is written as

$$\begin{aligned} \rho = \frac{1}{6}\left[ 1-3\left( \theta ^2-\frac{1}{3}\right) \lambda -3\left( 1-\theta ^2\right) \mu \right] = \frac{1}{6}-(a+c), \end{aligned}$$

where the relation \(a+b+c+d=\frac{1}{3}\) has been used.

4.2 Well-Posedness

As mentioned already, Eq. (2.16) is easily seen to be linearly ill-posed in Sobolev classes unless the parameters \(\gamma _1\) and \(\delta _1\) are positive. These are the more important of the three restrictions in (4.1) as far as well-posedness is concerned. We fix the value of \(\rho = b + d -\frac{1}{6}\) given by (4.2) for which \(\gamma _1 = \gamma _2=\frac{1}{12}\). In particular, \(\gamma _1 > 0\), so that condition is met. In what follows, we discuss the condition \(\delta _1>0\).

As noted in Remark 2.1, a straightforward calculation reveals that

$$\begin{aligned} \delta _2 - \delta _1 + \frac{1}{6} \gamma _1 \, = \, \frac{19}{360}, \end{aligned}$$
(4.3)

regardless of the choice of the various fundamental parameters. As \(\gamma _1 = 1/12\), it is further deduced that

$$\begin{aligned} \delta _2 \,=\, \delta _1 + \frac{7}{180}. \end{aligned}$$
(4.4)

Thus, the condition \(\gamma = 7/48\) implies (4.2). This in turn yields (4.4). So, any value of \(\delta _1 > 0\) may be specified as long as it is consistent with choices of \(\theta , \lambda , \mu , \lambda _1\) and \(\mu _1\).
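The chain of implications just described is short enough to confirm with exact rational arithmetic. The following sketch uses the formula for \(\gamma \) from the display preceding (4.2) and the relation (4.3); the sample values of \(b+d\) are hypothetical and serve only to show that the outcome does not depend on them.

```python
from fractions import Fraction


def gamma_from(b_plus_d, rho):
    """gamma = (5 - 9(b + d) + 9 rho) / 24, as in the display preceding (4.2)."""
    return (5 - 9 * b_plus_d + 9 * rho) / Fraction(24)


# Hypothetical sample values of b + d; the choice (4.2) should always give 7/48.
samples = [Fraction(0), Fraction(1, 3), Fraction(-2, 7)]
gammas = [gamma_from(s, s - Fraction(1, 6)) for s in samples]

# With gamma_1 = 1/12, relation (4.3) fixes the gap delta_2 - delta_1 of (4.4).
gamma1 = Fraction(1, 12)
delta_gap = Fraction(19, 360) - gamma1 / 6

print(gammas, delta_gap)
```

Using `Fraction` rather than floating-point numbers keeps every comparison exact, so the identities are verified rather than approximated.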

Using the formula (2.14) for \(\delta _1\) together with the formulas (1.7) and (1.9) for the coefficients \(a, b, \cdots , c_1, d_1\) and (4.2) for \(\rho \), a little algebra shows that in terms of the fundamental parameters \(\theta , \lambda , \mu , \lambda _1\) and \(\mu _1\),

$$\begin{aligned} \begin{aligned} \delta _1&=\delta _1(\theta , \lambda , \mu , \lambda _1,\mu _1)\\&=\frac{1}{2}(b_1+d_1)-\frac{1}{4}\Big (2b-\frac{1}{6}\Big )\Big (\frac{1}{6}-a-d\Big )-\frac{1}{4}d\Big (\frac{1}{6}-2a\Big )\\&= -\frac{5}{48}\big (\theta ^2-\frac{1}{5}\big )\Big [\big (\theta ^2-\frac{1}{5}\big )(1-\lambda _1)+(1-\theta ^2)\mu _1\Big ]\\&\quad -\frac{1}{4}\Big [\Big (\theta ^2-\frac{1}{3}\Big )(1-\lambda )-\frac{1}{6}\Big ]\Big [\frac{1}{6}-\frac{1}{2}\Big (\theta ^2-\frac{1}{3}\Big )\lambda -\frac{1}{2}(1-\theta ^2)(1-\mu )\Big ]\\&\quad -\frac{1}{8}(1-\theta ^2)(1-\mu )\Big [\frac{1}{6}-\Big (\theta ^2-\frac{1}{3}\Big )\lambda \Big ]\\&=\frac{5}{48}\big (\theta ^2-\frac{1}{5}\big )^2\lambda _1-\frac{5}{48}\big (\theta ^2-\frac{1}{5}\big )(1-\theta ^2)\mu _1+P(\theta , \lambda , \mu ), \end{aligned}\qquad \quad \end{aligned}$$
(4.5)

where

$$\begin{aligned} \begin{aligned} P(\theta , \lambda , \mu )&=-\frac{(3\theta ^2-1)^2}{72}\lambda ^2 + \frac{(3\theta ^2-1)(6\theta ^2-1)}{144} \lambda \\&\quad -\frac{(1-\theta ^2)}{24}\mu -\frac{(5\theta ^4-30\theta ^2+14)}{240}, \end{aligned} \end{aligned}$$
(4.6)

is a polynomial in \(\theta , \lambda \) and \(\mu \). A study of (4.5) reveals that there are two separate cases to consider.

Case 1: \(\theta \in [0,1] \setminus \{ \frac{1}{\sqrt{5}}\}\). In this case \(\delta _1>0\) if and only if

$$\begin{aligned} \lambda _1> \frac{(1-\theta ^2)\mu _1}{\big (\theta ^2-\frac{1}{5}\big )}-\frac{48}{5}\frac{P(\theta , \lambda , \mu )}{\big (\theta ^2-\frac{1}{5}\big )^2}=:{\mathcal {H}}(\theta , \lambda , \mu , \mu _1). \end{aligned}$$
(4.7)

Since \({\mathcal {H}}(\theta , \lambda , \mu , \mu _1)\) is finite for any given values of \(\theta , \lambda , \mu \) and \(\mu _1\), it is always possible to choose an appropriate \(\lambda _1\) such that the inequality (4.7) holds true. Indeed, there are many choices that work.
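A small numerical experiment makes the threshold in (4.7) concrete. The sketch below transcribes the polynomial (4.6), the last line of (4.5) and the function \({\mathcal {H}}\) from (4.7); the function names and the sample parameter values are hypothetical choices made for illustration.

```python
def P(theta, lam, mu):
    """Polynomial P(theta, lambda, mu) from (4.6)."""
    t2 = theta * theta
    return (-(3 * t2 - 1) ** 2 / 72 * lam ** 2
            + (3 * t2 - 1) * (6 * t2 - 1) / 144 * lam
            - (1 - t2) / 24 * mu
            - (5 * t2 ** 2 - 30 * t2 + 14) / 240)


def delta1(theta, lam, mu, lam1, mu1):
    """delta_1 as given by the last line of (4.5)."""
    t2 = theta * theta
    q = t2 - 0.2
    return (5 / 48 * q * q * lam1
            - 5 / 48 * q * (1 - t2) * mu1
            + P(theta, lam, mu))


def H(theta, lam, mu, mu1):
    """Threshold H from (4.7); only defined when theta**2 != 1/5."""
    t2 = theta * theta
    q = t2 - 0.2
    return (1 - t2) * mu1 / q - 48 / 5 * P(theta, lam, mu) / (q * q)


# Hypothetical sample parameters with theta**2 different from 1/5.
theta, lam, mu, mu1 = 0.8, 0.5, 0.3, 1.0
threshold = H(theta, lam, mu, mu1)
at = delta1(theta, lam, mu, threshold, mu1)
above = delta1(theta, lam, mu, threshold + 1.0, mu1)
below = delta1(theta, lam, mu, threshold - 1.0, mu1)
print(threshold, at, above, below)
```

At \(\lambda _1 = {\mathcal {H}}\) the value of \(\delta _1\) vanishes up to rounding, and moving \(\lambda _1\) by one unit in either direction changes \(\delta _1\) by \(\pm \frac{5}{48}\big (\theta ^2-\frac{1}{5}\big )^2\), which is why any \(\lambda _1\) above the threshold works.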

Case 2: \(\theta = \frac{1}{\sqrt{5}}\). In this case

$$\begin{aligned} \delta _1\Big (\frac{1}{\sqrt{5}}, \lambda , \mu , \lambda _1,\mu _1\Big )=P\Big (\frac{1}{\sqrt{5}}, \lambda , \mu \Big ) = -\frac{1}{450} \lambda ^2 -\frac{1}{1800}\lambda -\frac{1}{30}\mu -\frac{41}{1200}. \end{aligned}$$

Observe that the quadratic equation

$$\begin{aligned} P\Big (\frac{1}{\sqrt{5}}, \lambda , \mu \Big )=0, \end{aligned}$$

with \(P\) as in (4.6), defines a downward-opening parabola. The region in \(\lambda \)-\(\mu \) space where \(\delta _1=P\left( \frac{1}{\sqrt{5}}, \lambda , \mu \right) >0\) is the shaded region inside the parabola shown in Fig. 1.

Fig. 1 Region where \(P\left( \frac{1}{\sqrt{5}}, \lambda , \mu \right) > 0\) is shaded
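The coefficients displayed in Case 2 can be reproduced exactly from (4.6), since \(P\) depends on \(\theta \) only through \(\theta ^2\). In the sketch below, the function names are hypothetical, and `mu_boundary` is obtained by solving \(P = 0\) for \(\mu \); it is an illustration of the shaded region rather than a formula quoted from the text.

```python
from fractions import Fraction


def P(theta_sq, lam, mu):
    """Polynomial (4.6), written in terms of theta**2 so rationals stay exact."""
    t2 = theta_sq
    return (-(3 * t2 - 1) ** 2 / Fraction(72) * lam ** 2
            + (3 * t2 - 1) * (6 * t2 - 1) / Fraction(144) * lam
            - (1 - t2) / Fraction(24) * mu
            - (5 * t2 ** 2 - 30 * t2 + 14) / Fraction(240))


def P_case2(lam, mu):
    """Displayed form of P at theta = 1/sqrt(5)."""
    return (-Fraction(1, 450) * lam ** 2 - Fraction(1, 1800) * lam
            - Fraction(1, 30) * mu - Fraction(41, 1200))


def mu_boundary(lam):
    """Curve P = 0 solved for mu; delta_1 > 0 exactly when mu < mu_boundary(lam)."""
    return -Fraction(1, 15) * lam ** 2 - Fraction(1, 60) * lam - Fraction(41, 40)


# Hypothetical sample points in the (lambda, mu)-plane.
points = [(Fraction(0), Fraction(0)), (Fraction(1, 2), Fraction(-3)),
          (Fraction(-7, 3), Fraction(5, 4))]
agree = all(P(Fraction(1, 5), l, m) == P_case2(l, m) for l, m in points)
on_curve = P_case2(Fraction(2), mu_boundary(Fraction(2)))
inside = P_case2(Fraction(2), mu_boundary(Fraction(2)) - 1)
print(agree, on_curve, inside)
```

Points strictly below the curve `mu_boundary` give \(\delta _1 > 0\), which corresponds to the region inside the downward-opening parabola.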

5 The Dispersion Relation

The models derived here depend upon choices of six parameters, which have been denoted \(\lambda , \lambda _1, \mu , \mu _1, \theta \) and \(\rho \). The parameter \(\theta \) has physical significance, whereas the others are modeling parameters and, in principle, can take any real value.

As will be seen in a moment, the linearized dispersion relation for the class of models derived here always matches that of the full water wave problem through second order in the small parameter \(\beta \). More precisely, if any of these models is linearized about the rest state, the resulting linear partial differential equation has a dispersion relation relating phase speed c to wave number k. A brief calculation shows this to be

$$\begin{aligned} c_\mathrm{model}(k) \, = \, 1 - \big (\gamma _1 + \gamma _2\big )k^2 + \big (\delta _2 - \delta _1 + \gamma _1^2 + \gamma _1\gamma _2 \big ) k^4+ {\mathcal {F}}k^6 \end{aligned}$$

where k is the wave number and the coefficient \({\mathcal {F}}\) is

$$\begin{aligned} {\mathcal {F}} \, = \, {\mathcal {F}}(\theta ,\lambda ,\mu ,\lambda _1,\mu _1, \rho ) \, = \, -\gamma _1 \delta _2 -\gamma _2(-\delta _1 + \gamma _1^2) + 2 \gamma _1\delta _1 - \gamma _1^3.\quad \end{aligned}$$
(5.1)

As \(\gamma _1 + \gamma _2 = 1/6\) holds independently of the choice of parameters, the second and third terms simplify, viz.

$$\begin{aligned} c_\mathrm{model}(k) \, = \, 1 - \frac{1}{6} k^2 + \left( \delta _2 - \delta _1 + \frac{1}{6} \gamma _1 \right) k^4+ {\mathcal {F}}k^6. \end{aligned}$$

Making use of (4.3) leads to the final result

$$\begin{aligned} c_\mathrm{model}(k) \, = \, 1 - \frac{1}{6} k^2 + \left( \frac{19}{360}\right) k^4+ {\mathcal {F}}k^6, \end{aligned}$$

regardless of the choice of the various parameters.
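The expansion of \(c_\mathrm{model}\) can be checked with exact rational arithmetic. The sketch below rests on an assumption that is a derivation of its own rather than a formula quoted from the text: inserting a plane wave \(e^{i(kx-\omega t)}\) into the linear part of the model gives \(c = \omega /k = (1-\gamma _2k^2+\delta _2k^4)/(1+\gamma _1k^2+\delta _1k^4)\). The function name and the sample parameter values are hypothetical.

```python
from fractions import Fraction


def phase_speed_series(g1, g2, d1, d2, terms=4):
    """Coefficients, in powers of k**2, of (1 - g2 k^2 + d2 k^4)/(1 + g1 k^2 + d1 k^4)."""
    num = [Fraction(1), -g2, d2] + [Fraction(0)] * terms
    den = [Fraction(1), g1, d1]
    coeffs = []
    for n in range(terms):
        # Power-series division: num[n] = sum_j den[j] * coeffs[n - j], den[0] = 1.
        acc = num[n]
        for j in (1, 2):
            if n - j >= 0:
                acc -= den[j] * coeffs[n - j]
        coeffs.append(acc)
    return coeffs


def F_coefficient(g1, g2, d1, d2):
    """The k^6 coefficient as written in (5.1)."""
    return -g1 * d2 - g2 * (-d1 + g1 ** 2) + 2 * g1 * d1 - g1 ** 3


# Hypothetical parameter values obeying gamma_1 + gamma_2 = 1/6 and (4.3).
g1 = Fraction(1, 5)
g2 = Fraction(1, 6) - g1
d1 = Fraction(1, 7)
d2 = d1 + Fraction(19, 360) - g1 / 6

series = phase_speed_series(g1, g2, d1, d2)
print(series)
```

The first three entries of `series` come out as 1, \(-\frac{1}{6}\) and \(\frac{19}{360}\) for any admissible sample values, while the fourth entry agrees with (5.1) and changes with the parameters.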

For the two-dimensional water wave problem displayed in (1.1), the linearized dispersion relationship is exactly

$$\begin{aligned} c_\mathrm{Euler}(k) = \pm \sqrt{\frac{\tanh (k)}{k}}. \end{aligned}$$
(5.2)

For waves moving to the right, the \(+\)-sign is appropriate. One recognizes that the Taylor expansion of the function on the right-hand side of (5.2) in the long-wave regime (small wave number k) is

$$\begin{aligned} c_\mathrm{Euler}(k) \, = \, 1 - \frac{1}{6} k^2 + \frac{19}{360} k^4 - \frac{55}{3024} k^6 + O(k^8). \end{aligned}$$

In consequence, all the models put forward here are seen to satisfy the full, linear dispersion relation through order \(k^4\). Of course, if the derivation is done correctly, this has to be the case. If one rescales the variables so the long wavelength assumption is measured by \(\beta \) as in the formalities of the derivation, then one sees that the error in the linear part of the approximation is at worst of order \(\beta ^3\).
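The Taylor coefficients of (5.2) quoted above can be reproduced with exact rationals. The following sketch, with a hypothetical function name, builds the series of \(\tanh (k)/k\) in powers of \(k^2\) by dividing the series of \(\sinh (k)/k\) by that of \(\cosh (k)\), and then takes a formal square root term by term.

```python
from fractions import Fraction
from math import factorial


def euler_phase_speed_series(terms=4):
    """Taylor coefficients of sqrt(tanh(k)/k) in powers of k**2, as exact rationals."""
    sinh_over_k = [Fraction(1, factorial(2 * n + 1)) for n in range(terms)]
    cosh = [Fraction(1, factorial(2 * n)) for n in range(terms)]

    # Series division: ratio * cosh = sinh_over_k, using cosh[0] = 1.
    ratio = []
    for n in range(terms):
        acc = sinh_over_k[n] - sum(cosh[j] * ratio[n - j] for j in range(1, n + 1))
        ratio.append(acc)

    # Series square root: root * root = ratio, using root[0] = 1.
    root = [Fraction(1)]
    for n in range(1, terms):
        cross = sum(root[j] * root[n - j] for j in range(1, n))
        root.append((ratio[n] - cross) / 2)
    return root


coefficients = euler_phase_speed_series(4)
print(coefficients)
```

Working in the variable \(k^2\) halves the number of terms to track, since only even powers of k appear in the expansion.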

It is tempting to choose the parameters \(\theta , \lambda , \mu , \lambda _1, \mu _1\) and \(\rho \) so that \({\mathcal {F}}\) matches the next order in the dispersion relation exactly, as was done at the lower order in Bona and Chen (1998). Hence, if the auxiliary parameters are chosen so that

$$\begin{aligned} {\mathcal {F}}(\theta ,\lambda ,\mu ,\lambda _1,\mu _1,\rho ) = -\frac{55}{3024}, \end{aligned}$$
(5.3)

then the linear dispersion in the model would match that of the linear water wave problem up to and including order \(\beta ^3\). Such a choice could have a salutary effect on the detailed accuracy of the model, though it does not improve the overall formal level of approximation.

Of course, one needs that the criteria for local well-posedness continue to hold in the light of this choice. A study of the formula (5.1) for \({\mathcal {F}}\) shows that

$$\begin{aligned} \begin{aligned} {\mathcal {F}}&= -\gamma _1 \delta _2 -\gamma _2\big (-\delta _1 + \gamma _1^2\big ) + 2 \gamma _1\delta _1 - \gamma _1^3 \\&= \, -\gamma _1 \delta _2 + \delta _1\big ( \gamma _2 + 2 \gamma _1 \big ) -\gamma _1^2\big (\gamma _1 + \gamma _2 \big ) \\&= \, -\gamma _1\delta _2 + \delta _1\left( \gamma _1 + \frac{1}{6}\right) - \frac{1}{6} \gamma _1^2 \\&= \, \gamma _1\left( \delta _1 - \delta _2 - \frac{1}{6}\gamma _1 \right) + \frac{1}{6} \delta _1 \\&= \, -\frac{19}{360} \gamma _1 + \frac{1}{6} \delta _1, \end{aligned} \end{aligned}$$
(5.4)

where the fact that \(\gamma _1 + \gamma _2 = 1/6\) and the relation (4.3) have been used. It is interesting to know whether or not the relation (5.3), which implies the model dispersion relation agrees with the exact linear dispersion relation up to order \(k^6\), is consistent with the conditions \(\delta _1> 0, \gamma _1 > 0\) and \(\gamma = \frac{7}{48}\) implying global well-posedness. The condition \(\gamma = \frac{7}{48}\) requires that \(\rho = b + d - \frac{1}{6}\) as in (4.2). This in turn implies that \(\gamma _1 = \frac{1}{12} > 0\). That the parameters can be chosen so that (5.3) holds is clear upon consulting the formula (4.5) for \(\delta _1\), which already presumes that \(\rho = b + d - \frac{1}{6}\). For example, choose \(\theta ^2 \in (\frac{1}{5},1)\), and fix \(\lambda , \mu \) and \(\mu _1\). Then \(\delta _1\) is seen to have the form

$$\begin{aligned} \delta _1 = M + N\lambda _1 \end{aligned}$$

where \(N > 0\). Clearly any value of \(\delta _1\) can be achieved by a suitable choice of \(\lambda _1\) and so any value of \({\mathcal {F}}\) can be achieved under the restriction \(\rho = b + d - \frac{1}{6}\). However, notice that (5.4) and (5.3) yield

$$\begin{aligned} \delta _1 = 6\left( \frac{19}{360}\frac{1}{12} - \frac{55}{3024} \right) = -\frac{139}{1680} \, < \, 0. \end{aligned}$$

Hence, the requirement of Hamiltonian structure together with local well-posedness is not consistent with the model approximating the dispersion relation at the next order without considering \(O(\alpha ^2, \beta ^2, \alpha \beta )\) terms in (2.12) and a new correction parameter like \(\rho \).
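The arithmetic behind this incompatibility is easy to confirm exactly. The sketch below solves the last line of (5.4) for \(\delta _1\) under the Hamiltonian choice \(\gamma _1 = \frac{1}{12}\) and the target value in (5.3); the variable names are hypothetical.

```python
from fractions import Fraction

gamma1 = Fraction(1, 12)          # forced by rho = b + d - 1/6
target = Fraction(-55, 3024)      # k^6 coefficient of the Euler phase speed

# Last line of (5.4): F = -(19/360) * gamma1 + delta1 / 6, solved for delta1.
delta1_needed = 6 * (target + Fraction(19, 360) * gamma1)

# With delta1 > 0 the coefficient F always exceeds this value.
lowest_F_with_positive_delta1 = -Fraction(19, 360) * gamma1

print(delta1_needed, lowest_F_with_positive_delta1)
```

Since \(-\frac{55}{3024}\) lies below \(-\frac{19}{4320}\), no positive \(\delta _1\) can produce the target coefficient when \(\gamma _1 = \frac{1}{12}\), in agreement with the negative value found above.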

6 Concluding Remarks

Derived here is a class of unidirectional models for long-crested water waves that are formally second-order correct. Basic analysis of the pure initial value problem for our models has been developed. A local well-posedness theory in relatively weak spaces is established under conditions on the two parameters \(\delta _1\) and \(\gamma _1\) that appear in the model, and which depend upon the other parameters. Global well-posedness is only established in case the equation has a special, Hamiltonian structure. Conditions under which both aspects obtain are given.

A comment is deserved about the focus maintained throughout on unidirectional models. Boussinesq himself understood that his one-way model was simpler than the coupled pair of two-way models that he first derived. It was also simpler than a second-order in time, unidirectional model equation he had derived earlier. In both these instances, a modern perspective on this issue is that the unidirectional model can be posed with half the auxiliary data needed to initiate the coupled system. However, unidirectionality places a severe limitation on the wave motion when it is posed as an initial value problem. More precisely, a strict relationship between the initial wave profile and the velocity field is implied. On the other hand, it is known that for Boussinesq-type systems, if the initial disturbance is suitably localized and small, then on certain temporal scales, the disturbance will decompose into a left- and a right-going wave, each of which approximately satisfies a unidirectional equation (see Schneider and Wayne 2000; Bona et al. 2005). Finally, it is worth noting that even fairly steep beaches do not reflect all that much energy (see Mahony and Pritchard 1980). For very gently shelving beaches such as obtain in many nearshore zones, the reflection is negligible as regards its effect on shaping and erosive processes. Hence, unidirectional models seem to suffice in such circumstances.

Finally, we remark that when choosing the depth parameter \(\theta \), it is advisable to take it well inside the interval [0, 1]. While the horizontal velocity does not appear in the unidirectional model, a formal corollary of its derivation is a prediction of the horizontal velocity at the depth \(1-\theta ^2\). This is comprised of the formula (2.8) expressing the horizontal velocity in terms of the functions A, B, C, D and E together with the forms (2.7) determined for A and B and those for C, D and E. It is hard to measure the horizontal velocity very close to the free surface, while in actual fact, there is essentially zero velocity on the bottom because of the viscous boundary layer. Typical velocity measurements in laboratory and field situations are made somewhere in the middle of the water column.