1 Introduction

Nonlinear, dispersive wave equations arise in a number of important application areas. Because of this, and because their mathematical properties are interesting and subtle, their theory and applications have seen enormous development since the 1960s when they first came to the fore (see Miura [29] for a sketch of the early history of the subject). The theory for a single nonlinear, dispersive wave equation is well developed by now, though there are still interesting open issues. The theory for coupled systems of such equations is much less developed, though they, too, arise as models of a range of physical phenomena. Considered here is a paradigm class of such systems, namely coupled Korteweg–de Vries equations. The systems we have in mind take the form

$$\begin{aligned} \left\{ \begin{aligned}&u_t+u_{xxx}+P(u, v)_x=0, \\&v_t +v_{xxx}+Q(u,v)_x=0, \end{aligned}\right. \end{aligned}$$
(1.1)

which comprise two independent linear Korteweg–de Vries equations coupled through nonlinear terms. Here, the dependent variables \(u=u(x,t)\) and \(v=v(x,t)\) are real-valued functions and subscripts connote partial differentiation. The nonlinearities are taken to be homogeneous quadratic polynomials in u and v, viz.

$$\begin{aligned} P(u,v)=Au^2+Buv+Cv^2, \qquad Q(u,v)=Du^2+Euv+Fv^2, \end{aligned}$$

with given real coefficients \(A, B, \ldots , F\). Such systems and their close relatives arise as models for waves in a number of situations. For example, the model for Madden-Julian atmospheric oscillations recently developed by Majda and Biello [28] fits exactly into this class of systems. The surface water wave models put forward in [9] and [10] have specializations with the same sort of coupled KdV structure as in (1.1) (see also [12] and [13]). The Gear–Grimshaw system [22] arising in internal wave propagation likewise has features similar to the simpler models in (1.1). A particular system of the type displayed above, but with BBM-type dispersion, was studied by Hakkaev [23].

1.1 Local and Global Well-Posedness

The pure initial-value problem for these systems was studied in [3] and [11]. It transpires that the system (1.1) is always locally well posed in the \(L^{2}({\mathbb {R}})\)–based Sobolev spaces \(H^s({\mathbb {R}})\times H^s({\mathbb {R}})\) for any \(s>-\frac{3}{4}.\) This result follows the general lines of development available already for the single Korteweg–de Vries equation

$$\begin{aligned} u_t+u_{xxx}+uu_x=0 \end{aligned}$$
(1.2)

(see [11, 20]). It was also shown that for solutions of (1.1) corresponding to \(s \ge 0\), the integral

$$\begin{aligned} \varOmega (u, v) =\int _\mathbb {R} \left( au^2+buv+cv^2\right) \,dx \end{aligned}$$
(1.3)

is independent of time. The constants \(a, b, c\) comprise any nontrivial solution of the system

$$\begin{aligned} \left\{ \begin{aligned} 2Ba+(E-2A)b-4Dc=0, \\ 4Ca+(2F-B)b-2Ec=0, \end{aligned} \right. \end{aligned}$$
(1.4)

of two linear equations in three unknowns. In case \(s \ge 1\), the integral

$$\begin{aligned} \varTheta (u,v)=\int _\mathbb {R} \Big (a u_x^2+bu_xv_x+cv_x^2-R(u,v)\Big )\,dx, \end{aligned}$$
(1.5)

with

$$\begin{aligned} R(u,v)=\frac{\alpha }{3}u^3+\beta u^2v+\gamma uv^2+\frac{\delta }{3}v^3, \end{aligned}$$
(1.6)

is also independent of time if \((u, v)\) solves (1.1). Here, the real constants \(\alpha , \beta , \gamma \) and \(\delta \) depend upon the original coefficients, \(A, B, \ldots , F\). When the quadratic form \(\Upsilon (x,y) = ax^2 + bxy + cy^2\) is positive definite, which is to say, \(4ac - b^2 > 0,\) these invariants immediately allow the local theory to be extended to global well-posedness if \(s \ge 0\). Pursuing a further, energy-type argument as in [20] and an observation in [6], this can be improved and it is now known that (1.1) is globally well posed for arbitrarily sized data in \(H^s(\mathbb {R}) \times H^s(\mathbb {R})\) for any \(s > -\frac{3}{4}\) when \(4ac - b^2 > 0\).
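
As a hedged illustration of how the linear system (1.4) and the definiteness condition might be examined in practice, the following Python sketch computes a basis for the solutions \((a, b, c)\) of (1.4) from a singular value decomposition and evaluates \(4ac - b^2\). The function names and the sample coefficients are assumptions made for this illustration only; they are not taken from [3] or [11].

```python
import numpy as np


def invariant_coefficients(A, B, C, D, E, F, tol=1e-12):
    """Return an orthonormal basis (as rows) of the solutions (a, b, c) of (1.4)."""
    system = np.array([
        [2.0 * B, E - 2.0 * A, -4.0 * D],   # first equation of (1.4)
        [4.0 * C, 2.0 * F - B, -2.0 * E],   # second equation of (1.4)
    ])
    # With the default full_matrices=True, vh is 3x3 and its trailing rows
    # span the null space of the 2x3 system.
    _, singular_values, vh = np.linalg.svd(system)
    rank = int(np.sum(singular_values > tol))
    return vh[rank:]


def discriminant(a, b, c):
    """Return 4ac - b^2, the quantity whose positivity is required in the text."""
    return 4.0 * a * c - b * b


# Hypothetical sample: P = u^2 + v^2, Q = 2uv.
basis = invariant_coefficients(1.0, 0.0, 1.0, 0.0, 2.0, 0.0)
print(basis)                          # two rows: the solution space is two-dimensional here
print(discriminant(1.0, 0.0, 1.0))    # (a, b, c) = (1, 0, 1) solves (1.4) for this sample
```

For the sample coefficients the solution space is two-dimensional, so the choice of \((a, b, c)\) is not unique even up to scaling; the triple \((1, 0, 1)\) is one choice for which the discriminant is positive.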

Theory for the periodic initial-value problem for the Korteweg–de Vries equation (1.2) is slightly different from the problem posed in Sobolev spaces on \({\mathbb {R}}\). Thanks to the work of Kappeler and Topalov [24], we know the periodic initial-value problem is globally well posed in \(H^s(\mathbb {T})\) for \(s \ge -1\). Here, \({\mathbb {T}} = {\mathbb {R}}/{\mathbb {Z}}\) is the one-dimensional torus. However, the proof in such large spaces was made using the inverse-scattering formulation of the problem. As far as we can tell, the systems (1.1) do not generally possess an inverse-scattering theory, so the Kappeler–Topalov result does not easily generalize to them. Theory using harmonic analysis techniques has been developed by Bourgain in a series of papers and later improved by Kenig, Ponce and Vega. The crowning achievement so far for the periodic initial-value problem using harmonic analysis and energy estimates is found in the paper of Colliander, Keel, Staffilani, Takaoka and Tao [20] where detailed references to the work of Bourgain and Kenig, Ponce and Vega can be found. The current state of the art using these techniques is that the problem is globally well posed in \(H^s(\mathbb {T})\) provided \(s \ge -\frac{1}{2}\).

For the systems (1.1), the well posedness in \(H^s(\mathbb {T})\) has not yet been dealt with. A straightforward Bona-Smith argument [14] together with the a priori \(H^1(\mathbb {T})\)-bound provided by the invariants (1.3) and (1.5) when \(4ac - b^2 > 0\) (so that the quadratic form \(\Upsilon \) is positive definite) yields global well-posedness for \(s \ge 1\). It seems likely this can be improved, but global well-posedness in \(H^s({\mathbb {T}}) \times H^s(\mathbb {T})\) for values of \(s \ge 1\) suffices to provide the backdrop for the error estimates which are one of the main contributions of the present essay.

1.2 Solitary Waves

It was shown in [6] that when \(4ac-b^2>0\), the systems (1.1) always possess solitary-wave solutions of the special form

$$\begin{aligned} (u_s(x,t),v_s(x,t)) = (\mu _1,\mu _2) \phi _{\omega }(x-\omega {t})=3\omega (\mu _1,\mu _2){\text {sech}}^2\left( \frac{\sqrt{\omega }}{2}\, (x-\omega {t})\right) .\qquad \end{aligned}$$
(1.7)

The parameters \(\mu _1\) and \(\mu _2\) are real constants that are independent of the speed \(\omega \) of propagation. It transpires that it is only the ratio \(\mu = \mu _2/\mu _1\) that is in question here (or in special cases where \(\mu _1 = 0\), the ratio \(\nu = \mu _1/\mu _2\)). These ratios satisfy the cubic equation

$$\begin{aligned} C\mu ^3+(B-F)\mu ^2+(A-E)\mu -D=0 \end{aligned}$$

subject to the side condition \(A+B\mu +C\mu ^2\ne 0\). In case \(\mu _1 = 0\), this becomes

$$\begin{aligned} D \nu ^3 + (E-A) \nu ^2 + (F-B) \nu - C = 0 \end{aligned}$$

subject to \(D\nu ^2+E\nu +F\ne 0\). For details, see [6] or [7].
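
The cubic equation for \(\mu \) lends itself to a quick numerical check. Substituting (1.7) into (1.1) and using the fact that \(\phi _\omega \) is a traveling-wave solution of (1.2) leads, by a direct calculation that is assumed here rather than quoted from [6] or [7], to the algebraic conditions \(P(\mu _1,\mu _2) = \mu _1/2\) and \(Q(\mu _1,\mu _2) = \mu _2/2\), whence \(\mu _1 = 1/\big (2(A+B\mu +C\mu ^2)\big )\). The Python sketch below finds the real roots of the cubic and the corresponding amplitudes; the function name and the sample coefficients are hypothetical.

```python
import numpy as np


def proportional_solitary_waves(A, B, C, D, E, F, tol=1e-10):
    """Return triples (mu, mu1, mu2) for the real roots mu = mu2/mu1 of the cubic.

    Roots violating the side condition A + B*mu + C*mu^2 != 0 are skipped.
    The case mu1 = 0 (handled in the text through nu = mu1/mu2) is not covered.
    """
    roots = np.roots([C, B - F, A - E, -D])
    waves = []
    for root in roots:
        if abs(root.imag) > tol:        # keep real roots only
            continue
        mu = float(root.real)
        denominator = A + B * mu + C * mu ** 2
        if abs(denominator) < tol:      # side condition fails
            continue
        mu1 = 0.5 / denominator         # from P(mu1, mu2) = mu1 / 2 (assumed derivation)
        waves.append((mu, mu1, mu * mu1))
    return waves


# Hypothetical sample: P = u^2 + v^2, Q = 2uv, for which the cubic is mu^3 - mu = 0.
for mu, mu1, mu2 in proportional_solitary_waves(1.0, 0.0, 1.0, 0.0, 2.0, 0.0):
    print(mu, mu1, mu2)
```

For this sample the cubic has the three real roots \(\mu = 0, \pm 1\), which illustrates that proportional solitary waves need not be unique.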

As the speed \(\omega \) is independent of \(\mu \) or \(\nu \), these solutions comprise smooth curves in function space of traveling waves. Such solutions were termed proportional solitary waves since both u and v have the same shape function. While such solutions always exist when \(4ac-b^2>0\), they need not be unique even among proportional solutions. They will be used to implement accuracy tests in the numerical experiments. Introductory remarks about the numerical schemes are presented next.

1.3 Background for the Numerical Schemes

Design and analysis of numerical schemes to approximate solutions of the systems (1.1) is the focus of the present essay. Naturally, our treatment of this approximation problem is guided by the very considerable literature devoted to approximating solutions of the single KdV equation (1.2). The work described here finds its inspiration in the body of work on discontinuous Galerkin (DG) methods for KdV-type equations and their relatives.

Despite the existence of a surfeit of numerical schemes for the KdV equation, rigorous error estimates for them were rare until the new millennium. Early work that featured rigorous theory may be found in the papers [31] of Wahlbin, [32] of Winther and [4] of Baker et al.

The main obstacle to deriving error bounds is the difficulty of constructing a viable projection for the dispersive (third derivative) term that would play a role similar to that of the elliptic projection in the context of parabolic and hyperbolic problems. The work of Chi-Wang Shu and collaborators, in particular the articles [19, 33, 34], opened a new path for constructing such projections in the setting of the DG method and its local version, the LDG method. These projections allowed error estimates for both the local and non-local versions of the DG method for the full nonlinear problem.

Later, and building again on the work of Cheng and Shu [19], a DG scheme was put forward in [8] that had the salutary property of preserving up to round-off error the first two invariants of the KdV equation. Such schemes are called conservative, whereas the earlier schemes mentioned above were dissipative. That is, the scheme itself introduces dissipation into the approximation. As is often the case with Hamiltonian systems, schemes that preserve invariants of the motion have improved long-time behaviour of the errors. This theme was pursued further in [27] through the development of a DG scheme preserving the third invariant, which can act as a Hamiltonian for the KdV equation.

More recently, a posteriori error estimates were obtained in [25] and [26] for both DG and LDG versions of schemes for the Generalized KdV equation. The key step introduced in [25] was a reconstruction operator that was used to obtain the first such estimates for nonlinear time-dependent equations of KdV-type. This operator can be adapted to the treatment of derivatives of arbitrarily high order and applies equally well to conservative and dissipative numerical schemes.

The present work is an extension of the ideas and techniques alluded to above for a single, nonlinear, dispersive wave equation to coupled systems of such equations. It constitutes a first attempt at the construction of approximations of solutions to such systems including rigorous analysis of these approximations. Both conservative and dissipative formulations for the system (1.1) will be considered, with an eye to understanding the advantages and disadvantages of each. The conservative formulation has the property that the invariant \(\varOmega (u(\cdot ,t),v(\cdot ,t))\) defined in (1.3), with \(\mathbb {R}\) replaced by [0, 1] and solutions \((u, v)\) that are periodic of period one in space, is constant in time, up to round-off error. On the other hand, for the dissipative method, \(\varOmega (u(\cdot ,t),v(\cdot ,t))\) is a monotone decreasing function of time. Of course, up to round-off error, both schemes are such that the so-called ‘mass’ quantities \(\int _0^1 u(x,t)\, dx\) and \(\int _0^1 v(x,t)\, dx\) are time independent.

Both approaches appear to be worth studying and have differing characteristics on the theoretical as well as the computational level. For one, while the conservative approach offers the promise of improved error behaviour over long time intervals, the analysis becomes much more arduous. Indeed, the definition of the projection operator is not local anymore and its well-posedness can be established only after some conditions are imposed on the degree of the polynomials used and the parity of the number of cells. While such restrictions might strike the reader as odd, they have been seen already in a number of related works (e.g. [8, 27] and [26]).

The paper is organized as follows: Section 2 is devoted to notation and other preliminary material including the function spaces that are relevant to the analysis that follows. The finite element spaces are then introduced. These consist of continuous piecewise polynomial functions defined on the period domain \({\mathbb {T}}\). The requirement of continuity is a small departure from the standard DG method. It is put into place to keep down the technical difficulties arising in the treatment of the nonlinearities (see Proposition 3.3 in [8]). Weak forms are also introduced for the nonlinear and dispersive terms. The forms considered herein for the dispersive term come in two flavors, conservative and dissipative, denoted \(\mathcal {D}\) and \(\tilde{\mathcal {D}}\), respectively [see (2.9) and (2.12)]. They are inspired by the corresponding forms used in [8] and [19], respectively. The analysis of the numerical schemes is effected by appropriate projection operators. These are constructed and their well-posedness as well as approximation properties examined in detail. As already mentioned, conservation causes the projection operator to be nonlocal, a fact that has as a consequence the above mentioned restrictions on the degree q of the piecewise polynomials and the number of cells in the mesh. For the conservative formulation, it is shown that the approximation power of the projection is optimal, which is to say \(O(h^{q+1})\), only if the mesh is uniform. Otherwise, it is \(O(h^q)\). On the other hand, the dissipative projection is entirely local and its well-posedness and optimal approximation properties are readily established without any conditions on the degree q or the mesh.

In Sect. 3, the conservative semi-discrete formulation is introduced. The integral \(\varOmega (u_h,v_h)\) of the semi-discrete approximations \((u_h,v_h)\) is conserved and the problem is seen to be globally well posed whenever a solution \((a, b, c)\) of (1.4) satisfies the positivity condition \(4ac- b^2 > 0\). It is also shown that the pair \((u_h,v_h)\) converges to \((u, v)\) at the rate of \(O(h^q)\) if the mesh is uniform, but \(O(h^{q-1})\) otherwise. A similar analysis is carried out using the dissipative form for the dispersive term to obtain semi-discrete approximations \(({\tilde{u}}_h,{\tilde{v}}_h)\). The analysis of well-posedness and convergence follows lines akin to those of the conservative method, so the details are only sketched. In particular, the convergence rate for this approach is shown to be \(O(h^q)\) without any restrictions on q or the mesh. In Sect. 3.4, fully discrete schemes based on Implicit Runge–Kutta (IRK) time-stepping methods belonging to the Gauss–Legendre family (cf. [21]) are put forward. The choice of this particular class of methods is motivated by their excellent accuracy and stability properties. A special feature of the stability for these methods allows us to show that when used in tandem with the conservative form \(\mathcal {D}\), they yield fully discrete approximations \((u^n,v^n)\) for which \(\varOmega (u^n,v^n)\) is constant, which is to say, \(\varOmega (u^{n+1},v^{n+1}) = \varOmega (u^{n},v^{n})\) at each time level n, up to round-off error.
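
The mechanism by which Gauss–Legendre methods preserve quadratic invariants can be glimpsed in a toy setting. The one-stage member of the family is the implicit midpoint rule. The Python sketch below applies it to a linear system \(y' = Sy\) with S skew-symmetric, for which \(|y|^2\) is invariant. This toy system is only a stand-in chosen for illustration and is not the fully discrete scheme of Sect. 3.4; all names are hypothetical.

```python
import numpy as np


def implicit_midpoint_step(S, y, dt):
    """One step of the implicit midpoint rule (one-stage Gauss-Legendre) for y' = S y."""
    identity = np.eye(len(y))
    # Solve (I - dt/2 S) y_new = (I + dt/2 S) y_old.
    return np.linalg.solve(identity - 0.5 * dt * S, (identity + 0.5 * dt * S) @ y)


def quadratic_invariant_drift(S, y0, dt, steps):
    """Return | |y_n|^2 - |y_0|^2 | after the given number of steps."""
    y = y0.copy()
    for _ in range(steps):
        y = implicit_midpoint_step(S, y, dt)
    return abs(y @ y - y0 @ y0)


rng = np.random.default_rng(0)
R = rng.standard_normal((6, 6))
S = R - R.T                      # skew-symmetric, so |y|^2 is invariant for y' = S y
y0 = rng.standard_normal(6)
print(quadratic_invariant_drift(S, y0, dt=0.1, steps=200))   # expected at round-off level
```

For skew-symmetric S the update matrix is a Cayley transform and hence orthogonal, so any drift observed is due to floating-point arithmetic alone.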

In the last section, results of numerical experiments are reported. These include actual convergence rates, behaviour of \(\varOmega \) as a function of time and an appraisal of the relative performance of the conservative and dissipative methods over long time intervals.

2 The Numerical Approximation

Details of the numerical approximations are now set forth. This begins with a discussion of the spatial discretization which leads directly to a semi-discrete approximation of the continuous problem.

2.1 The Meshes

Let \(\mathcal {T}_h\) denote a partition of the real interval [0, 1] of the form \( 0=x_0< x_1< \cdots < x_M = 1\). We will also say that \(\mathcal {T}_h\) is a mesh on [0, 1]. The points \(x_m\) are called nodes while the intervals \(I_m = [x_m,x_{m+1}]\) will be referred to as cells. The subscript h will connote the maximum length of the cells \(I_m, m = 0, \ldots M\!-\!1\). The notation \(x_m^-=x_m^+=x_m\) will be useful in taking account, respectively, of left- and right-hand limits of discontinuous functions. The caveat followed throughout is that \(x_0^-=x_M^-\) and \(x_M^+=x_0^+\) corresponding to the underlying spatial periodicity of the solutions being approximated.

2.2 Function Spaces

For a real interval \(I = [a_1,a_2]\), the Sobolev spaces \(W^{s,p} = W^{s,p}(I)\), equipped with their usual norms will appear frequently. When \(p=2\), we also use \(H^s = H^s(I)\) to denote \( W^{s,2}(I)\). An unadorned norm \(\Vert \cdot \Vert \) will always indicate the \(L^2\)–norm on [0,1]. Use will also be made of the so-called broken Sobolev spaces \(W^{s,p}(\mathcal {T}_h)\) that are defined as the finite Cartesian products \(\Pi _{I \in \mathcal {T}_h} W^{s,p}(I)\). Note that when \(sp>1\), the elements of \(W^{s,p}(\mathcal {T}_h)\) are uniformly continuous when restricted to a given cell but may be discontinuous across nodes. For the purpose of indicating these potential discontinuities, the following notation is used: For \(v \in H^k(\mathcal {T}_h) = W^{k,2}(\mathcal {T}_h)\) with \(k \ge 1\), let \(v_m^+\) and \(v_m^-\) denote the right-hand and left-hand limits, respectively, of v at the node \(x_m\). The jump \([v]_m\) of v at \(x_m\) is \(v_m^+ - v_m^-\), whereas the average \(\{v\}_m\) of v at \(x_m\) is \(\frac{1}{2} (v_m^+ + v_m^-)\). These are standard notations in the context of DG–methods. In all cases, the definitions are meant to adhere to the convention that \(v_0^- = v_M^-\) and \(v_M^+ = v_0^+\) which is tantamount to identifying \(x_0\) and \(x_M\).

For integer \(m\ge 0\), \(C^m[0,1]\) is the classical space of functions that are, together with derivatives of order up to m, continuous on [0, 1]. The periodic versions of these spaces, namely

$$\begin{aligned} C_{per}^m[0,1] =\{ v \in C^m[0,1], v^{(j)}(0) = v^{(j)}(1), \ j=0,\dots ,m\}, \end{aligned}$$

will also arise. For \(m \ge 1\), set \(H_{per}^m(\mathcal {T}_h) = C_{per}^0[0,1] \cap H^m(\mathcal {T}_h)\).

The following, basic embedding inequality (see [2]) will find frequent use. For \(\mathcal {T}_h\) a partition, \(v \in H^1(\mathcal {T}_h)\) and any cell \(I \in \mathcal {T}_h\), there is a constant c which is independent of the cell I such that

$$\begin{aligned} \Vert v\Vert _{L^\infty (I)} \le c \left( h_I^{-1/2} \Vert v\Vert _{L^2(I)} + h_I^{1/2} \Vert v_x\Vert _{L^2(I)}\right) , \end{aligned}$$
(2.1)

where \(h_I\) is the length of I. Indeed, the dependence of (2.1) on the value of \(h_I\) is easily ascertained by a simple scaling argument. Note that (2.1) may also be viewed as a trace inequality.
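
For completeness, the scaling argument may be sketched as follows, under the assumption that the inequality \(\Vert {\hat{v}}\Vert _{L^\infty ({\hat{I}})} \le c \big (\Vert {\hat{v}}\Vert _{L^2({\hat{I}})} + \Vert {\hat{v}}_\xi \Vert _{L^2({\hat{I}})}\big )\) is available on the reference cell \({\hat{I}} = [0,1]\). The hatted notation is introduced here only for this illustration. Writing \(I = [x_m, x_m + h_I]\), \(x = x_m + h_I \xi \) and \({\hat{v}}(\xi ) = v(x)\), a change of variables gives

$$\begin{aligned} \Vert v\Vert _{L^\infty (I)} = \Vert {\hat{v}}\Vert _{L^\infty ({\hat{I}})}, \qquad \Vert v\Vert _{L^2(I)} = h_I^{1/2} \Vert {\hat{v}}\Vert _{L^2({\hat{I}})}, \qquad \Vert v_x\Vert _{L^2(I)} = h_I^{-1/2} \Vert {\hat{v}}_\xi \Vert _{L^2({\hat{I}})}. \end{aligned}$$

Inserting the last two relations into the reference inequality yields

$$\begin{aligned} \Vert v\Vert _{L^\infty (I)} \le c \left( h_I^{-1/2} \Vert v\Vert _{L^2(I)} + h_I^{1/2} \Vert v_x\Vert _{L^2(I)}\right) , \end{aligned}$$

which is (2.1) with a constant c that does not depend on I.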

2.3 The Finite Element Spaces

The spatial approximations will be sought in the space of continuous and periodic piecewise polynomial functions \(V_h^q\) subordinate to the mesh \(\mathcal {T}_h\), viz.

$$\begin{aligned} V_h^q = \{ v \in C_{per}^0[0,1] : \, v \big |_{I_m} \in \mathcal {P}_q(I_m), \, m=0,\ldots ,M\!-\!1\} \end{aligned}$$

where \(\mathcal {P}_q\) is the space of polynomials of degree at most q. The spaces \(V_h^q\) are subspaces of \(H_{per}^m(\mathcal {T}_h)\) for any \(m\ge 1\) and have well-known, local approximation and inverse properties which are spelled out here for convenience (cf. [16]). Let \(q \ge 2\) be fixed and let \(i, j\) be such that \(0 \le j \le i \le q+1\). Then, for any cell I and any v in \(W^{i,p}(I)\), there exists a \(\chi \in \mathcal {P}_q(I)\) such that

$$\begin{aligned} |v-\chi |_{W^{j,p}(I)} \le c h_I^{i-j} |v|_{W^{i,p}(I)}, \quad p=2,\infty , \end{aligned}$$
(2.2)

where \(|v|_{W^{j,p}(I)}\) denotes the top-order seminorm on the Sobolev space \(W^{j,p}(I)\) and the constant c is independent of \(h_I\). The equally well-known inverse inequalities

$$\begin{aligned} |\chi |_{W^{j,p}(I)} \le c h_I^{\frac{1}{p}-\frac{1}{2}-j} |\chi |_I, \quad 0\le j\le q+1,\ p=2,\infty , \end{aligned}$$
(2.3)

for all \(\chi \in \mathcal {P}_q(I)\) (see again [16]) will also find frequent use.

2.4 The Weak Formulations

The weak formulation of the problem begins with consideration of the nonlinear terms in (1.1), which are all of the form \((uv)_x\). Keeping in mind that we will be working with continuous periodic functions, it is natural to define the form \(\mathcal {N}\) via integration by parts, viz.

$$\begin{aligned} \mathcal {N}(u,v;\chi ) = -\sum _{I \in \mathcal {T}_h} (uv,\chi _x)_I, \quad u,v, \chi \in H^1(\mathcal {T}_h), \end{aligned}$$
(2.4)

where \((\cdot ,\cdot )_I\) denotes the \(L^2\) inner product over the cell I. The form \(\mathcal {N}\) is actually a trilinear form and is well-defined since \(H^1(\mathcal {T}_h)\) is a Banach Algebra. It is also obviously symmetric in its first two arguments. By virtue of the Riesz Representation Theorem, this form defines the associated nonlinear operator \(\mathcal {N}: H^1(\mathcal {T}_h) \times H^1(\mathcal {T}_h) \rightarrow V_h^q\) whose \(L^2([0,1])\)-inner product with any \(\chi \in V_h^q\) is

$$\begin{aligned} (\mathcal {N}(u,v),\chi ) = - \sum _{I \in \mathcal {T}_h}(uv,\chi _x)_I. \end{aligned}$$
(2.5)

Other properties of \(\mathcal {N}\) are encapsulated in the following lemma.

Lemma 1

  1. (i)

    The nonlinear form \(\mathcal {N}\) defined by (2.4) is consistent in the sense that

    $$\begin{aligned} \mathcal {N}(u,v;\chi ) = ((uv)_x,\chi ), \quad u,v, \chi \in H_{per}^1(\mathcal {T}_h). \end{aligned}$$
    (2.6)
  2. (ii)

    For \(u,v \in H_{per}^1(\mathcal {T}_h)\),

    $$\begin{aligned} \mathcal {N}(u,v;v) = \frac{1}{2} \sum _{I \in \mathcal {T}_h} (v^2,u_x)_I. \end{aligned}$$
    (2.7)

Proof

  1. (i)

    Integration by parts yields

    $$\begin{aligned} \mathcal {N}(u,v;\chi ) = \sum _{I \in \mathcal {T}_h}((uv)_x,\chi )_I + \sum _{m=0}^{M-1} [uv\chi ]_m =((uv)_x,\chi ), \end{aligned}$$

    since the jump terms vanish because of the assumed continuity and periodicity conditions.

  2. (ii)

    Since u and v are continuous and periodic, integration by parts yields

    $$\begin{aligned} \mathcal {N}(u,v;v) = -\frac{1}{2} \sum _{I\in \mathcal {T}_h} ((v^2)_x,u)_I = \frac{1}{2} \sum _{I\in \mathcal {T}_h} (v^2,u_x)_I. \end{aligned}$$

\(\square \)

A simple but important consequence of periodicity is that the integral \(\int _0^1 (u^2)_xu\,dx\) vanishes for \(u \in C_{per}^1[0,1]\). As a direct consequence of the above lemma, we see that the form \(\mathcal {N}\) preserves this property on \(H_{per}^1(\mathcal {T}_h)\) and in particular on \(V_h^q\).

Corollary 1

The form \(\mathcal {N}\) is conservative in the sense that

$$\begin{aligned} \mathcal {N}(u,u;u) = 0, \quad u \in H_{per}^1(\mathcal {T}_h). \end{aligned}$$
(2.8)

Proof

Let \(u \in H_{per}^1(\mathcal {T}_h)\). Formula (2.4) implies that \(\mathcal {N}(u,u;u) = -\sum _{I\in \mathcal {T}_h}(u^2,u_x)_I\) whereas (2.7) shows that \(\mathcal {N}(u,u;u) = \frac{1}{2} \sum _{I\in \mathcal {T}_h}(u^2,u_x)_I\). The result follows. \(\square \)
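
The identities (2.7) and (2.8) can be checked numerically on a small example. The Python sketch below builds continuous, periodic piecewise quadratics on a nonuniform mesh and evaluates the form \(\mathcal {N}\) of (2.4) by exact polynomial integration on each cell. The helper names, the mesh and the random data are assumptions made for this illustration.

```python
import numpy as np
from numpy.polynomial import Polynomial


def periodic_piecewise_quadratic(nodes, nodal_values, bubbles):
    """Continuous, 1-periodic piecewise quadratic with given values at x_0..x_{M-1}."""
    M = len(nodes) - 1
    pieces = []
    for m in range(M):
        x0, x1 = float(nodes[m]), float(nodes[m + 1])
        left = float(nodal_values[m])
        right = float(nodal_values[(m + 1) % M])   # wrap-around enforces u(1) = u(0)
        s = Polynomial([-x0, 1.0])                 # x - x_m
        t = Polynomial([x1, -1.0])                 # x_{m+1} - x
        pieces.append(left + (right - left) / (x1 - x0) * s + float(bubbles[m]) * s * t)
    return pieces


def cell_integrals(integrands, nodes):
    """Sum over cells of the exact integral of each polynomial integrand."""
    total = 0.0
    for m, f in enumerate(integrands):
        antiderivative = f.integ()
        total += antiderivative(nodes[m + 1]) - antiderivative(nodes[m])
    return total


def form_N(u, v, chi, nodes):
    """The trilinear form (2.4): minus the sum over cells of (u v, chi_x)_I."""
    return -cell_integrals([u[m] * v[m] * chi[m].deriv() for m in range(len(u))], nodes)


rng = np.random.default_rng(1)
nodes = np.array([0.0, 0.15, 0.4, 0.55, 0.8, 1.0])
u = periodic_piecewise_quadratic(nodes, rng.standard_normal(5), rng.standard_normal(5))
v = periodic_piecewise_quadratic(nodes, rng.standard_normal(5), rng.standard_normal(5))
print(form_N(u, u, u, nodes))    # (2.8): expected to vanish up to round-off
print(form_N(u, v, v, nodes) - 0.5 * cell_integrals(
    [v[m] * v[m] * u[m].deriv() for m in range(5)], nodes))   # (2.7): likewise
```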

Attention is now turned to constructing a bilinear form to represent the third derivative terms. It is similar in spirit to the form \(\mathcal {D}\) introduced in [8] with the difference being that the prevailing spaces are globally continuous. For \(u, \chi \in H^2(\mathcal {T}_h)\), define \(\mathcal {D}\) by

$$\begin{aligned} \mathcal {D}(u,\chi ) = \sum _{I\in \mathcal {T}_h} (u_x,\chi _{xx})_I + \sum _{m=0}^{M-1} \{u_x\}_m [\chi _x]_m. \end{aligned}$$
(2.9)

The next lemma delineates properties of \(\mathcal {D}\) that justify the particular form chosen in (2.9).

Lemma 2

  1. (i)

    The bilinear form \(\mathcal {D}\) defined by (2.9) is consistent in the sense that

    $$\begin{aligned} \mathcal {D}(u,\chi ) = (u_{xxx},\chi ), \quad u\in H^3(\mathcal {T}_h) \cap C_{per}^2[0,1],\ \chi \in H_{per}^2(\mathcal {T}_h). \end{aligned}$$
    (2.10)
  2. (ii)

    The form \(\mathcal {D}\) is skew-adjoint so that

    $$\begin{aligned} \mathcal {D}(v,v) = 0 \quad \forall v \in H^2(\mathcal {T}_h). \end{aligned}$$
    (2.11)

Proof

  1. (i)

    Integrating by parts twice yields

    $$\begin{aligned} \mathcal {D}(u,\chi ) = \sum _{I\in \mathcal {T}_h} (u_{xxx},\chi )_I - \sum _{m=0}^{M-1} [u_x\chi _x]_m + \sum _{m=0}^{M-1} [u_{xx}\chi ]_m + \sum _{m=0}^{M-1} \{u_x\}_m[\chi _x]_m. \end{aligned}$$

    Since \(u_{xx}\) and \(\chi \) are continuous and periodic, it follows that the jumps \([u_{xx}\chi ]_m\) vanish. The identity \([u_x\chi _x]_m = \{u_x\}_m[\chi _x]_m + [u_x]_m \{\chi _x\}_m\) holds for \(u,\chi \in H^{2}(\mathcal {T}_h)\). Since the jumps \([u_x]_m\) also vanish, the result follows.

  2. (ii)

    To establish (2.11), note that \(v_x v_{xx} = \frac{1}{2} (v_x^2)_x\). Thus

    $$\begin{aligned} \sum _{I\in \mathcal {T}_h} (v_x,v_{xx})_I = -\frac{1}{2} \sum _{m=0}^{M-1} [v_x^2]_m. \end{aligned}$$

    The result now follows from the observation that \([v_x^2]_m = 2 \{v_x\}_m[v_x]_m\).\(\square \)

It will also be convenient to define the operator counterpart \(\mathcal {D}: H^2(\mathcal {T}_h) \rightarrow V_h^q\) of the bilinear form \(\mathcal {D}(\cdot ,\cdot )\) via the requirement

$$\begin{aligned} (\mathcal {D}u, \chi ) = \mathcal {D}(u,\chi ), \quad \forall \chi \in V_h^q. \end{aligned}$$

The dissipative version

$$\begin{aligned} \tilde{\mathcal {D}}(u,\chi ) = \sum _{I\in \mathcal {T}_h} (u_x,\chi _{xx})_I + \sum _{m=0}^{M-1} u_x(x_m^+) [\chi _x]_m \end{aligned}$$
(2.12)

of the bilinear form \(\mathcal {D}\) may also be used to represent the third derivative terms. The following lemma summarizes the properties of \(\tilde{\mathcal {D}}\) that are the counterparts of those of \(\mathcal {D}\) shown in Lemma 2. The proofs are similar and therefore omitted.

Lemma 3

  1. (i)

    The bilinear form \(\tilde{\mathcal {D}}\) defined in (2.12) is consistent in the sense that

    $$\begin{aligned} \tilde{\mathcal {D}}(u,\chi )=(u_{xxx},\chi ), \quad u\in H^3(\mathcal {T}_h) \cap C_{per}^2[0,1],\ \chi \in H_{per}^2(\mathcal {T}_h). \end{aligned}$$
    (2.13)
  2. (ii)

    The form \(\tilde{\mathcal {D}}\) is dissipative, which is to say,

    $$\begin{aligned} \tilde{\mathcal {D}}(v,v) = \frac{1}{2} \sum _{m=0}^{M-1} [v_x]_m^2, \quad v \in H^2(\mathcal {T}_h). \end{aligned}$$
    (2.14)
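
The contrast between (2.11) and (2.14) can be observed directly. The Python sketch below evaluates \(\mathcal {D}(v,v)\) and \(\tilde{\mathcal {D}}(v,v)\) for a piecewise cubic v with random, generally discontinuous pieces, using exact polynomial integration on each cell and the periodic convention for one-sided limits at \(x_0\). The function name, the mesh and the data are hypothetical choices made for this illustration.

```python
import numpy as np
from numpy.polynomial import Polynomial


def dispersive_forms_on_diagonal(nodes, pieces):
    """Return (D(v,v), Dtilde(v,v), jumps of v_x) for piecewise polynomial v.

    pieces[m] is the polynomial representing v on the cell [nodes[m], nodes[m+1]].
    """
    M = len(pieces)
    cell_sum = 0.0
    for m in range(M):
        vx = pieces[m].deriv(1)
        antiderivative = (vx * pieces[m].deriv(2)).integ()   # exact integral of v_x v_xx
        cell_sum += antiderivative(nodes[m + 1]) - antiderivative(nodes[m])
    # One-sided derivatives at x_m: right limit from cell m, left limit from
    # cell m-1, where the left limit at x_0 is taken at x_M (periodicity).
    right = np.array([pieces[m].deriv(1)(nodes[m]) for m in range(M)])
    left = np.array([pieces[m - 1].deriv(1)(nodes[m] if m > 0 else nodes[M])
                     for m in range(M)])
    jump = right - left
    average = 0.5 * (right + left)
    conservative = cell_sum + average @ jump     # form (2.9) with u = chi = v
    dissipative = cell_sum + right @ jump        # form (2.12) with u = chi = v
    return conservative, dissipative, jump


rng = np.random.default_rng(2)
nodes = np.array([0.0, 0.2, 0.45, 0.7, 1.0])
pieces = [Polynomial(rng.standard_normal(4)) for _ in range(4)]
D, D_tilde, jump = dispersive_forms_on_diagonal(nodes, pieces)
print(D)                              # (2.11): expected to vanish up to round-off
print(D_tilde - 0.5 * jump @ jump)    # (2.14): likewise
```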

2.5 The Dispersive Projections

A conservative projection operator \(P \!\! : \! C_{per}^1[0,1] \! \rightarrow \! V_h^q\) adapted to the bilinear form \(\mathcal {D}\) is now constructed following the ideas in [8]. In contrast to [8], the range of this projection is comprised of (globally) continuous, periodic functions.

For u in \(C_{per}^1[0,1]\), the projection of u is the function \(w = Pu\) in \(V_h^q\) that satisfies the conditions

$$\begin{aligned} \begin{array}{ll} (w,\chi )_I =(u,\chi )_I, &{} \quad \forall \chi \in \mathcal {P}_{q-3}(I),\ I \in \mathcal {T}_h, \\ w(x_m^+) = u(x_m^+), &{}\quad m=0,\ldots ,M-1, \\ w(x_m^-) = u(x_m^-), &{}\quad m=0,\ldots ,M-1, \\ \{w_x\}_m = \{u_x\}_m, &{} \quad m=0,\ldots , M-1. \end{array} \end{aligned}$$
(2.15)

Note that for \(q=2\) the first condition is vacuous. Also, since u and the elements of \(V_h^q\) are continuous and periodic, the second and third conditions can be equivalently expressed as \(w(x_m)=u(x_m),\ m=0,\dots ,M\). However, there is no guarantee that such a projection will be well defined. Indeed, due to the nonlocal nature of the fourth condition in (2.15), existence will be shown only when certain conditions on q and M are satisfied.

When it exists, the operator P plays a central role in establishing error estimates for the conservative, semi-discrete method and is adapted, as the next lemma shows, to the bilinear form \(\mathcal {D}\).

Lemma 4

For u in \(C_{per}^1[0,1]\), if its projection \(w = Pu\) defined in (2.15) exists, then it satisfies

$$\begin{aligned} \mathcal {D}(w,\chi ) = \mathcal {D}(u,\chi ), \quad \chi \in V_h^q. \end{aligned}$$
(2.16)

Proof

First, note that \([u\chi _{xx}]_m = [u]_m\chi _{xx}(x_m^-)+u_m^+[\chi _{xx}]_m\). Since u is continuous and periodic, the jumps \([u]_m\) are all zero. Because \(w_m^+ = u_m^+\) and both u and w are continuous, it transpires that

$$\begin{aligned}{}[u\chi _{xx}]_m = u_m^+[\chi _{xx}]_m = w_m^+[\chi _{xx}]_m = [w]_m\chi _{xx}(x_m^-)+w_m^+[\chi _{xx}]_m = [w\chi _{xx}]_m. \end{aligned}$$

In consequence, upon integrating by parts the term \(\sum _{I\in \mathcal {T}_h}(u_x,\chi _{xx})_I\) in (2.9), it follows at once from the above identity and the definition of w that

$$\begin{aligned} \mathcal {D}(u,\chi )= & {} - \sum _{I\in \mathcal {T}_h}(u,\chi _{xxx})_I - \sum _{m=0}^{M-1} [u\chi _{xx}]_m + \sum _{m=0}^{M-1} \{u_x\}_m [\chi _x]_m \\= & {} - \sum _{I\in \mathcal {T}_h}(w,\chi _{xxx})_I - \sum _{m=0}^{M-1} [w\chi _{xx}]_m + \sum _{m=0}^{M-1} \{w_x\}_m [\chi _x]_m. \end{aligned}$$

A further integration by parts then yields

$$\begin{aligned} \mathcal {D}(u,\chi ) = \sum _{I\in \mathcal {T}_h}(w_x,\chi _{xx})_I + \sum _{m=0}^{M-1} \{w_x\}_m [\chi _x]_m = \mathcal {D}(w,\chi ). \end{aligned}$$

The proof is complete. \(\square \)

Combining this result with (2.10), one sees that for \(u \in C_{per}^3[0,1]\)

$$\begin{aligned} \mathcal {D}w = P_0 u_{xxx}, \end{aligned}$$
(2.17)

where \(P_0\) is the \(L^2\)-projection operator into \(V_h^q\).

Next a related projection operator \({\tilde{P}} : C_{per}^1[0,1] \rightarrow V_h^q\) is defined which is adapted to the bilinear form \(\tilde{\mathcal {D}}(\cdot ,\cdot )\). It will be used in the analysis of the dissipative semi-discrete formulation. Interestingly, it also plays a role in the analysis of the projection P.

For \(u \in C_{per}^1[0,1]\), its projection \({\tilde{w}}= {\tilde{P}} u\) is defined by the conditions

$$\begin{aligned} \begin{array}{ll} ({\tilde{w}},\chi )_I=(u,\chi )_I, &{} \quad \forall \chi \in \mathcal {P}_{q-3}(I),\ I \in \mathcal {T}_h,\\ {\tilde{w}}(x_m^+) = u(x_m^+), &{}\quad m=0,\ldots ,M-1, \\ {\tilde{w}}(x_m^-) = u(x_m^-), &{}\quad m=0,\ldots ,M-1, \\ {\tilde{w}}_x(x_m^+) = u_x(x_m^+) &{} \quad m=0,\ldots ,M-1, \end{array} \end{aligned}$$
(2.18)

In contrast to (2.15), the conditions in (2.18) are entirely local to each cell. Hence, the proofs of existence and uniqueness of this projection are straightforward. Furthermore, classical finite element approximation theory (see again [16, 18]) can be brought to bear to show that \({\tilde{w}}\) is an optimal approximation to suitably regular u.

Proposition 1

The projection operator \({\tilde{P}}\) is well-defined for \(q\ge 2\). Furthermore, for any element \(u \in W^{q+1,p}(\mathcal {T}_h) \cap C_{per}^{1}[0,1]\), the optimal approximation properties

$$\begin{aligned} \Vert u-{\tilde{w}}\Vert _{W^{j,p}(I)} \le ch_I^{q+1-j} |u|_{W^{q+1,p}(I)}, \quad I \in \mathcal {T}_h, \quad j=0,1, \quad p=2,\infty , \end{aligned}$$
(2.19)

obtain, where \(h_I\) is the length of cell I and c is independent of u and I.

Proof

The definition of \({\tilde{P}}\) is local to each cell and involves linear equations. Hence to prove existence and uniqueness it suffices to show that \(u=0\) implies \({\tilde{w}}=0\).

For \(\ell \ge 0\), let \(P_{\ell }(t)\) be the usual Legendre polynomials that are orthogonal on \([-1,1]\), normalized so that \(P_{\ell }(1)=1\). Given a cell \(I_m=[x_m,x_{m+1}], \ m=0,\dots ,M-1\), consider the affine map

$$\begin{aligned} x= x(\xi ) = \frac{h_m}{2} \xi + \frac{x_m+x_{m+1}}{2}, \quad -1 \le \xi \le 1, \end{aligned}$$
(2.20)

that maps \([-1,1]\) onto \(I_m\). The family of rescaled Legendre polynomials \(P_{m,\ell }(x)\) is defined by \(P_{m,\ell }(x)=P_{\ell }(\xi )\) where x and \(\xi \) are related by (2.20). The polynomials \(P_{m,\ell }\) are orthogonal with respect to the \(L^2\)-inner product on \(I_m\).

Let \({\tilde{w}}_m\) denote the restriction of \({\tilde{w}}\) to \(I_m\). It can be expressed in terms of the rescaled Legendre polynomials as follows:

$$\begin{aligned} {\tilde{w}}_m(x) = \sum _{\ell =0}^q \alpha _{m,\ell } P_{m,\ell }(x) = \sum _{\ell =0}^q \alpha _{m,\ell } P_{\ell }(\xi ). \end{aligned}$$

The first equation in (2.18) and the orthogonality of the Legendre polynomials imply that

$$\begin{aligned} \alpha _{m,\ell }=0, \quad \ell =0,\ldots ,q-3. \end{aligned}$$
(2.21)

Since \(P_\ell (\pm 1) = (\pm 1)^\ell \), the second and third equations in (2.18) translate to

$$\begin{aligned} 0= & {} {\tilde{w}}_m(x_m^+) = \sum _{\ell =q-2}^q \alpha _{m,\ell } P_\ell (-1) = (-1)^{q-2} \left( \alpha _{m,q-2} - \alpha _{m,q-1} + \alpha _{m,q} \right) \\ 0= & {} {\tilde{w}}_m(x_{m+1}^-) = \sum _{\ell =q-2}^q \alpha _{m,\ell } P_\ell (1) = \alpha _{m,q-2} + \alpha _{m,q-1} + \alpha _{m,q}. \end{aligned}$$

It follows from these two equations that

$$\begin{aligned} \alpha _{m,q-1} = 0 \quad \text{ and }\quad \alpha _{m,q-2} + \alpha _{m,q} = 0. \end{aligned}$$
(2.22)

The fourth equation in (2.18) together with the well known identities \(P'_{\ell }(\pm 1)=\frac{1}{2} (\pm 1)^{\ell -1} \ell (\ell +1), \ \ell = 0,\ldots \) lead to

$$\begin{aligned} 0 = {\tilde{w}}_x(x_m^+)= & {} \sum _{\ell =q-2}^q \alpha _{m,\ell } P_{m,\ell }'(x_m^+) = \frac{2}{h_m} \sum _{\ell =q-2}^q \alpha _{m,\ell } P_\ell '(-1) \\= & {} \frac{1}{h_m} (-1)^{q-3} \left( (q-2)(q-1)\alpha _{m,q-2} -(q-1)q \, \alpha _{m,q-1} + q(q+1)\alpha _{m,q}\right) . \nonumber \end{aligned}$$
(2.23)

Using the fact that \(\alpha _{m,q-1}=0\) and the second equation in (2.22) together with the fact that the values of \((q-2)(q-1)\) and \(q(q+1)\) are never the same reveals that \(\alpha _{m,q-2}= \alpha _{m,q}=0\). Thus \({\tilde{w}}=0\) whenever \(u=0\), so the projection operator is indeed well-defined.

The arguments above show that \({\tilde{P}} u = u\) whenever u belongs to \(V_h^q\). The approximation properties (2.19) are established by an application of the Bramble-Hilbert Lemma. This completes the proof. \(\square \)
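The uniqueness argument above amounts to the statement that the \(3\times 3\) homogeneous system for \((\alpha _{m,q-2},\alpha _{m,q-1},\alpha _{m,q})\), formed by the two endpoint conditions and the bracket in (2.23), has only the trivial solution. The following minimal Python sketch checks this by computing the determinant for a few values of q; the function names and the sampled range \(2\le q\le 10\) are our own illustrative choices and are not part of the method.

```python
def uniqueness_matrix(q):
    """Coefficient rows, in the unknowns (alpha_{q-2}, alpha_{q-1}, alpha_q):
    the two endpoint conditions (common sign factors dropped) and the
    bracket appearing in (2.23)."""
    return [
        [1, -1, 1],
        [1, 1, 1],
        [(q - 2) * (q - 1), -(q - 1) * q, q * (q + 1)],
    ]


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


# Illustrative range of polynomial degrees (an assumption of this sketch).
determinants = {q: det3(uniqueness_matrix(q)) for q in range(2, 11)}
print(determinants)
```

For the sampled degrees the determinant equals \(8q-4\), which is never zero for integer q, in agreement with the conclusion \(\alpha _{m,q-2}=\alpha _{m,q-1}=\alpha _{m,q}=0\).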

Proposition 2

For even values of q, the projection operator P is not well defined in general. On the other hand, it is well defined for odd values of \(q \ge 3\) provided the number M of cells is also odd.

Proof

The machinery set up in the proof of Proposition 1 is applied to the difference \(w-{\tilde{w}}\) rather than w. Existence, uniqueness and approximation properties of w will be deduced from those that can be established for this difference.

Letting e denote the restriction of \(w-{\tilde{w}}\) to \(I_m\), it follows from the first three equations of (2.15) and (2.18) that \((e,\chi )_{I_m} = 0, \ \chi \in \mathcal {P}_{q-3}(I)\) and \(e(x_m^+)=e(x_m^-)=0\). Expanding in terms of the rescaled Legendre polynomials as before gives \(e = \sum _{\ell =0}^q \alpha _{m,\ell } P_{m,\ell }(x) = \sum _{\ell =0}^q \alpha _{m,\ell } P_{\ell }(\xi )\), and just as before, the analogs of (2.21) and (2.22) hold. Consequently, there results

$$\begin{aligned} \alpha _{m,\ell } =0, \ \ell =0,\dots ,q-3, \quad \alpha _{m,q-1} = 0, \quad \alpha _{m,q-2} =- \alpha _{m,q}. \end{aligned}$$
(2.24)

On the other hand, from the fourth equations in (2.15) and (2.18), there obtains

$$\begin{aligned} \{e_x\}_m = \frac{1}{2} \left( u_x(x_m) - {\tilde{w}}_x(x_m^-)\right) := \eta _m. \end{aligned}$$
(2.25)

This in turn translates into

$$\begin{aligned} \eta _m= & {} \frac{1}{h_m} \sum _{\ell =0}^q \alpha _{m,\ell } P_\ell '(-1) + \frac{1}{h_{m-1}} \sum _{\ell =0}^q \alpha _{m-1,\ell } P_\ell '(1) \\= & {} (-1)^{q-3} \left( (q-2)(q-1)\alpha _{m,q-2} - (q-1)q \alpha _{m,q-1} + q(q+1) \alpha _{m,q} \right) / (2h_m) \\&\ +\, \left( (q-2)(q-1)\alpha _{m-1,q-2} + (q-1)q \alpha _{m-1,q-1} + q(q+1) \alpha _{m-1,q} \right) / (2h_{m-1}), \end{aligned}$$

using the fact that \(\alpha _{m-1,\ell } = \alpha _{m,\ell }=0, \ \ell =0,\dots ,q-3\). This last result can be further simplified since \(\alpha _{m,q-1} = 0\) and \(\alpha _{m,q-2} =- \alpha _{m,q}\). The upshot is that

$$\begin{aligned} \begin{aligned} \eta _m = (-1)^{q-3} \beta _{m}+\beta _{m-1}, \quad \beta _{m} := (2q-1)\alpha _{m,q}/h_m, \ m=0,\dots ,M-1 \\ \end{aligned} \end{aligned}$$
(2.26)

with the understanding that the indices are modulo M. This is a system of M linear equations in the unknowns \(\beta _{0},\dots ,\beta _{M-1}\) and the projection w is well defined if and only if the system is uniquely solvable.

If q is even, the coefficient matrix of this linear system is circulant with first row \([-1,0,\dots ,0,1]\). The nullspace of this matrix is the span of the vector \((1,1,\dots ,1)\); hence the system (2.26) does not have a solution unless \(\sum _{m=0}^{M-1} \eta _m = 0\), a property that cannot be assumed to hold in general.

On the other hand, if q is odd, the matrix is again circulant but with first row \([1,0,\dots ,0,1]\). If M is even, the nullspace of the coefficient matrix is the span of the vector \((1,-1,\dots ,1,-1)\) and the matrix is again singular. However, it is invertible if M is odd and its inverse is also circulant with first row \(\frac{1}{2} [1,1,-1,1,\dots ,1,-1]\). To summarize, if q and M are odd, then the coefficients \(\{\alpha _{m,q}\}_{m=0}^{M-1}\) are well defined and therefore so is \(e=w-{\tilde{w}}\), which in turn implies that w is well defined since \({\tilde{w}}\) is already uniquely defined. This completes the proof. \(\square \)
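The solvability pattern just described can be checked directly. The sketch below, which uses exact rational arithmetic from the Python standard library, assembles the circulant matrix of (2.26) and computes its rank; the helper names `coupling_matrix` and `rank` are hypothetical and serve only this illustration.

```python
from fractions import Fraction


def coupling_matrix(M, q):
    """Matrix of (2.26): eta_m = (-1)^(q-3) beta_m + beta_{m-1}, indices mod M."""
    sign = 1 if q % 2 == 1 else -1  # (-1)^(q-3) is +1 for odd q, -1 for even q
    rows = [[0] * M for _ in range(M)]
    for m in range(M):
        rows[m][m] += sign
        rows[m][(m - 1) % M] += 1
    return rows


def rank(rows):
    """Rank by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in r] for r in rows]
    n_rows, n_cols = len(a), len(a[0])
    r = 0
    for col in range(n_cols):
        piv = next((i for i in range(r, n_rows) if a[i][col] != 0), None)
        if piv is None:
            continue
        a[r], a[piv] = a[piv], a[r]
        for i in range(r + 1, n_rows):
            factor = a[i][col] / a[r][col]
            a[i] = [x - factor * y for x, y in zip(a[i], a[r])]
        r += 1
    return r


for q, M in [(3, 5), (3, 6), (4, 5), (4, 6)]:
    print(f"q={q}, M={M}: rank {rank(coupling_matrix(M, q))} of {M}")
```

Full rank occurs only in the case q odd, M odd; in the other three sampled cases the rank is \(M-1\), reflecting the one-dimensional nullspaces identified in the proof.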

Proposition 3

Suppose the assumptions of Proposition 2 are satisfied so that the conservative projection operator P is well defined. Then for \(u\in C_{per}^1[0,1]\cap W^{q+1,\infty }(\mathcal {T}_h)\), \(j=0,1\) and \( p=2,\infty \), the following quasi-optimal approximation properties hold:

$$\begin{aligned} \begin{aligned} \Vert u-Pu\Vert _{W^{j,p}(I)} \le c h_I^{1+\frac{1}{p}-j} \sum _{J \in \mathcal {T}_h} h_J^{q} \Vert u^{(q+1)}\Vert _{L^\infty (J)}. \end{aligned} \end{aligned}$$
(2.27)

Here, \(h_J\) is the length of the interval J as before. If in addition the mesh \(\mathcal {T}_h\) is uniform and u also belongs to \(W^{q+2,\infty }(\mathcal {T}_h)\), the optimal approximation estimates

$$\begin{aligned} \Vert u-Pu\Vert _{W^{j,p}(I)} \le c h^{\frac{1}{p}+q+1-j} \Vert u\Vert _{W^{q+2,\infty } (0,1)}, \end{aligned}$$
(2.28)

are valid. Here, h is the uniform length of the intervals comprising the mesh.

Proof

The approximation properties of w will follow from those of \({\tilde{w}}\) displayed in (2.19) by finding suitable bounds for \(e=w-{\tilde{w}}\). As seen in the proof of Proposition 2, for q and M odd, the system exhibited in (2.26) is uniquely solvable and yields

$$\begin{aligned} \begin{aligned} \beta _{m} = \frac{1}{2} \left( \eta _m + \sum _{k=1} ^\frac{M-1}{2} \left( \eta _{m+2k-1} - \eta _{m+2k} \right) \right) , \,\, m=0,\ldots ,M-1, \\ \end{aligned} \end{aligned}$$
(2.29)

where the indices in the sum are all read modulo M.
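As a sanity check on (2.29), the following short Python sketch builds \(\eta \) from a given \(\beta \) via (2.26) with q odd and then recovers \(\beta \) from the alternating sum. The function names and the sample vector are illustrative assumptions only.

```python
from fractions import Fraction


def eta_from_beta(beta):
    """System (2.26) for odd q: eta_m = beta_m + beta_{m-1}, indices mod M."""
    M = len(beta)
    return [beta[m] + beta[(m - 1) % M] for m in range(M)]


def beta_from_eta(eta):
    """Formula (2.29); requires an odd number M of cells."""
    M = len(eta)
    if M % 2 == 0:
        raise ValueError("formula (2.29) requires M odd")
    beta = []
    for m in range(M):
        total = Fraction(eta[m])
        for k in range(1, (M - 1) // 2 + 1):
            total += eta[(m + 2 * k - 1) % M] - eta[(m + 2 * k) % M]
        beta.append(total / 2)
    return beta


sample_beta = [3, -1, 4, 1, -5]  # arbitrary test data, M = 5
recovered = beta_from_eta(eta_from_beta(sample_beta))
print(recovered == sample_beta)
```

The round trip reproduces the original vector exactly, since the alternating sum telescopes to \(2\beta _m\) when M is odd.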

Recall that \(\alpha _{m,\ell }=0, \ \ell =0,\dots ,q-3\), \(\alpha _{m,q-1}=0\) and \(\alpha _{m,q-2} = -\alpha _{m,q}\). From the orthogonality of the \(P_{m,\ell }\) and the well-known fact that \(\Vert P_{m,\ell }\Vert _{I_m}^2 = h_m/(2\ell +1)\), it follows that

$$\begin{aligned} \Vert e\Vert _{I_m}^2 = \sum _{\ell =0}^q \alpha _{m,\ell }^2 \Vert P_{m,\ell }\Vert _{I_m}^2 = \frac{4q-2}{(2q-3)(2q+1)} h_m\alpha _{m,q}^2. \end{aligned}$$
(2.30)

It is clear from the inequalities in (2.19) that

$$\begin{aligned} |\eta _\ell |=\frac{1}{2}|u_x(x_\ell )-{\tilde{w}}_x(x_\ell ^-)| \le c h_{\ell -1}^q \Vert u^{(q+1)}\Vert _{L^\infty (I_{\ell -1})}, \quad \ell =0,\dots ,M-1. \end{aligned}$$

(Keep in mind that the indices \(\ell \) are all taken modulo M.) Hence, it follows from (2.29), ignoring for the time being the alternating signs, that

$$\begin{aligned} |\alpha _{m,q}| = \frac{h_{m}}{2q-1}|\beta _{m}|\le ch_{m} \sum _{J \in \mathcal {T}_h} h_J^{q} \Vert u^{(q+1)}\Vert _{L^\infty (J)}. \end{aligned}$$
(2.31)

Combining this with (2.30) gives

$$\begin{aligned} \Vert e\Vert _{I} \le c h_I^{3/2} \sum _{J\in \mathcal {T}_h} h_J^{q} \Vert u^{(q+1)}\Vert _{L^\infty (J)}. \end{aligned}$$
(2.32)

Since \(u-w=u-{\tilde{w}}-e\), (2.19) and (2.32) lead to (2.27) with \(p=2,j=0\). Moreover, using (2.32) and the inequalities (2.1) and (2.3) applied to e together with (2.19) gives (2.27) for the remaining cases \(p=\infty , \ j=0,1\) and \(p = 2\), \(j = 1\).

The estimate (2.27) can be improved by exploiting the alternating signs in (2.29). Indeed, the techniques appearing in the proof of Proposition 3.2 of [8] show that

$$\begin{aligned} \eta _\ell = h_{\ell -1}^q \sum _{j=0}^{q-1} \rho _j u^{(q+1)}(\zeta _{{\ell -1},j}), \quad \zeta _{\ell -1,j} \in I_{\ell -1}, \end{aligned}$$
(2.33)

where the constants \(\rho _j, \ j=0,\dots ,q-1\), depend only on q. If \(h_{\ell }=h\) for all \(\ell \), then (2.33) and the Mean-Value Theorem can be used to extract an extra power of h, viz.

$$\begin{aligned} \begin{aligned} |\eta _{\ell } - \eta _{\ell +1}|&\le {} h^{q} \sum _{j=0}^{q-1} |\rho _j| \left| u^{(q+1)}(\zeta _{{\ell -1},j}) -u^{(q+1)}(\zeta _{{\ell },j}) \right| \\&\le c h^{q+1} \Vert u^{(q+2)} \Vert _{L^\infty (I_{\ell -1} \cup I_{\ell })}. \end{aligned} \end{aligned}$$
(2.34)

The proof of (2.28) follows as before. \(\square \)

Remark 1

Commentary is in order concerning the conditions imposed in Propositions 2 and 3.

  (i) It should be emphasized that the condition that both q and M are odd pertains only to the existence and approximation properties of the conservative projection operator. The conservative semi-discrete approximation (3.1) is well defined, independently of this assumption.

  (ii) Obviously, there is no problem with creating meshes \(\mathcal {T}_h\) with an odd number M of cells. Moreover, this property is easily preserved in a process of repeated refinement or coarsening at later times in the temporal integration. Numerical experiments indicate that the convergence rates are the same, whether or not the mesh possesses an odd number of cells, and so we have tentatively concluded that this restriction is simply an artifact of our proof, which relies upon the projection.

  (iii) With \(h = \max h_I\), using the fact that \(\sum _{J\in \mathcal {T}_h} h_J = 1\), it follows from (2.27) that

    $$\begin{aligned} \begin{aligned} \Vert u-w\Vert \le c h^q \Vert u\Vert _{W^{q+1,\infty } (0,1)}, \end{aligned} \end{aligned}$$
    (2.35)

    which is quasi-optimal. Similarly, if the mesh is uniform, the estimate (2.28) leads to the optimal bound

    $$\begin{aligned} \begin{aligned} \Vert u-w\Vert \le c h^{q+1} \Vert u\Vert _{W^{q+2,\infty } (0,1)}. \end{aligned} \end{aligned}$$
    (2.36)

    Note that the characterization of optimality refers to the exponent of h and not the regularity required of u. Henceforth, we shall write \(\Vert u-w\Vert \le c h^\mu \) with \(\mu =q+1\) or q (depending on whether the mesh is uniform or not), omitting the dependence of c on u.

3 The Semi-Discrete Approximations: Existence, Uniqueness and Error Estimates

3.1 The Conservative Semi-Discrete Formulation

The stage is set for defining the semi-discrete approximation to solutions of the system (1.1). For a given partition \({\mathcal {T}}_h\), let \(u_h,v_h\) in \(V_h^q\times [0,T]\) be the solutions of the coupled system

$$\begin{aligned} \left[ \begin{array}{l} u_{ht} \\ v_{ht} \end{array} \right] + \left[ \begin{array}{lll} A &{} B &{} C \\ D &{} E &{} F \end{array} \right] \left[ \begin{array}{l} \mathcal {N}(u_h,u_h) \\ \mathcal {N}(u_h,v_h) \\ \mathcal {N}(v_h,v_h) \end{array} \right] + \left[ \begin{array}{l} \mathcal {D}u_h \\ \mathcal {D}v_h \end{array} \right] = \left[ \begin{array}{l} 0 \\ 0 \end{array} \right] , \end{aligned}$$
(3.1)

of first-order in time operator equations with initial data \(u_h^0, v_h^0 \in V_h^q\). The initial data will of course be a suitable approximation of initial data \(u_0, v_0\) for (1.1) (see Theorem 3.2).

Expanding \(u_h\) and \(v_h\) in terms of a basis for the finite-dimensional space \(V_h^q\), it is readily seen that (3.1) is equivalent to an initial-value problem for a system of ordinary differential equations of the form \((U_t,V_t) = \mathcal F(U,V)\) where \({\mathcal {F}}\) is quadratic and so locally Lipschitz continuous. It is then immediate that this system has a unique solution corresponding to given initial data \(u_h^0, v_h^0\), at least locally in time.

The following lemma is central to our development.

Lemma 5

Let \((a,b,c)\) be a non-trivial solution of the system (1.4). If \(u,v \in V_h^q\), the following identity holds:

$$\begin{aligned} \mathcal {I}(u,v) :=\int _0^1 [2au+bv, \ \ bu+2cv] \left[ \begin{array}{lll} A &{} B &{} C \\ D &{} E &{} F \end{array} \right] \left[ \begin{array}{l} \mathcal {N}(u,u) \\ \mathcal {N}(u,v) \\ \mathcal {N}(v,v) \end{array} \right] \, dx = 0. \end{aligned}$$
(3.2)

Proof

In detail, \(\mathcal {I}(u,v)\) is written as

$$\begin{aligned} \mathcal {I}(u,v)=&2aA \mathcal {N}(u,u;u) +bA \mathcal {N}(u,u;v) + 2aB\mathcal {N}(u,v;u) + bB\mathcal {N}(u,v;v) \\&+\, 2aC\mathcal {N}(v,v;u) + bC\mathcal {N}(v,v;v) + bD\mathcal {N}(u,u;u) + 2c D \mathcal {N}(u,u;v) \\&+\, bE\mathcal {N}(u,v;u) + 2c E\mathcal {N}(u,v;v) + bF\mathcal {N}(v,v;u) + 2cF\mathcal {N}(v,v;v). \end{aligned}$$

The four terms of the form \({\mathcal {N}}(w,w;w)\) all vanish on account of (2.8). For the remaining terms, the elementary relations

$$\begin{aligned} {\mathcal {N}}(u,u;v) = -2 {\mathcal {N}}(u,v;u) \quad \mathrm{and} \quad {\mathcal {N}}(v,v;u) = -2 {\mathcal {N}}(u,v;v) \end{aligned}$$

lead to the formula

$$\begin{aligned} \mathcal {I}(u,v) = {\mathcal {N}}(u,v;u) \Big [2Ba+(E-2A)b-4Dc \Big ] + {\mathcal {N}}(u,v;v) \Big [-4Ca+(B -2F)b +2Ec\Big ]. \end{aligned}$$

The fact that \((a,b,c)\) is a solution to (1.4) thus implies that \(\mathcal {I}(u,v)=0\) and concludes the proof. \(\square \)
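The regrouping in the proof of Lemma 5 is pure bookkeeping and can be checked mechanically. The sketch below does so in Python for integer coefficients. It assumes, as the proof indicates, that system (1.4) amounts to the vanishing of the two brackets in the last display; under that assumption a solution \((a,b,c)\) can be produced as the cross product of the two coefficient rows. All function names are hypothetical.

```python
import random


def brackets(A, B, C, D, E, F, a, b, c):
    """The two bracketed factors multiplying N(u,v;u) and N(u,v;v)."""
    return (2 * B * a + (E - 2 * A) * b - 4 * D * c,
            -4 * C * a + (B - 2 * F) * b + 2 * E * c)


def reduced_coefficients(A, B, C, D, E, F, a, b, c):
    """Collect the twelve terms of I(u,v), then apply N(w,w;w) = 0,
    N(u,u;v) = -2 N(u,v;u) and N(v,v;u) = -2 N(u,v;v)."""
    uuv = b * A + 2 * c * D   # coefficient of N(u,u;v)
    uvu = 2 * a * B + b * E   # coefficient of N(u,v;u)
    uvv = b * B + 2 * c * E   # coefficient of N(u,v;v)
    vvu = 2 * a * C + b * F   # coefficient of N(v,v;u)
    return (uvu - 2 * uuv, uvv - 2 * vvu)


def cross_solution(A, B, C, D, E, F):
    """A vector (a,b,c) orthogonal to both coefficient rows (assumed form of (1.4))."""
    r1 = (2 * B, E - 2 * A, -4 * D)
    r2 = (-4 * C, B - 2 * F, 2 * E)
    return (r1[1] * r2[2] - r1[2] * r2[1],
            r1[2] * r2[0] - r1[0] * r2[2],
            r1[0] * r2[1] - r1[1] * r2[0])


random.seed(0)
coeffs = [random.randint(-5, 5) for _ in range(6)]
abc = [random.randint(-5, 5) for _ in range(3)]
regrouping_ok = reduced_coefficients(*coeffs, *abc) == brackets(*coeffs, *abc)
invariant_ok = brackets(*coeffs, *cross_solution(*coeffs)) == (0, 0)
print(regrouping_ok, invariant_ok)
```

The cross product is nontrivial precisely when the two coefficient rows are linearly independent; whether it also satisfies \(b^2<4ac\) depends on the particular values of \(A,\ldots ,F\).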

Theorem 3.1

Suppose \((a,b,c)\) is a solution of the system (1.4). Let \((u_h,v_h)\) be a solution pair of (3.1). Then the quantity \(\varOmega (u_h,v_h)\) is time independent. Furthermore, if \(b^2<4ac\), the solution is global in time and uniformly bounded.

Proof

Multiply the first equation in (3.1) by \(2au_h+bv_h\), the second equation by \(bu_h+2cv_h\) and integrate the sum of these over the period domain [0, 1]. The result is

$$\begin{aligned} \frac{d}{dt} \varOmega (u_h,v_h) + \mathcal {I}(u_h,v_h)+ & {} 2a\mathcal {D}(u_h,u_h) + b\left( \mathcal {D}(u_h,v_h)+\mathcal {D}(v_h,u_h)\right) \nonumber \\+ & {} 2c \mathcal {D}(v_h,v_h) = 0. \end{aligned}$$
(3.3)

Lemma 5 reveals that \(\mathcal {I}(u_h,v_h)=0\). The skew-symmetry of \(\mathcal {D}\) implies that \(\mathcal {D}(u_h,u_h)\), \(\mathcal {D}(v_h,v_h)\) and \(\mathcal {D}(u_h,v_h)+\mathcal {D}(v_h,u_h)\) vanish. Therefore,

$$\begin{aligned} \frac{d}{dt} \varOmega (u_h,v_h)=0, \end{aligned}$$
(3.4)

whence \(\varOmega (u_h,v_h)\) is time independent.

If \(b^2 < 4ac\) then there is a positive constant \(\sigma \) such that \(\sigma \! \int _0^1 (u_h^2+v_h^2)dx \le \varOmega (u_h,v_h)\) and hence both \(\Vert u_h\Vert \) and \(\Vert v_h\Vert \) remain bounded as long as the solution exists. In view of the equivalence of norms on the finite-dimensional space \(V_h^q\), \(\Vert u_h\Vert _\infty \) and \(\Vert v_h\Vert _\infty \) and consequently the components of U and V are also uniformly bounded as long as the solution exists. This conclusion allows the local existence theory made via Picard iteration to be continued indefinitely, leading to a global solution which is necessarily uniformly bounded in time. \(\square \)
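The constant \(\sigma \) in the coercivity bound can be made explicit. The sketch below assumes that \(\varOmega (u,v)=\int _0^1 (au^2+buv+cv^2)\,dx\), which is the form consistent with (3.3), and that \(a>0\) in addition to \(b^2<4ac\). Under these assumptions \(\sigma \) may be taken to be the smaller eigenvalue of the symmetric matrix with diagonal entries a, c and off-diagonal entries b/2. The function names are illustrative.

```python
import math
import random


def coercivity_constant(a, b, c):
    """Smaller eigenvalue of [[a, b/2], [b/2, c]]; it is positive exactly
    when a > 0 and b^2 < 4ac."""
    return 0.5 * (a + c - math.sqrt((a - c) ** 2 + b * b))


def quadratic_form(a, b, c, x, y):
    """Pointwise integrand a x^2 + b x y + c y^2 of the assumed functional."""
    return a * x * x + b * x * y + c * y * y


a, b, c = 2.0, 1.0, 3.0  # sample values with b^2 < 4ac and a > 0
sigma = coercivity_constant(a, b, c)

random.seed(1)
samples = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(1000)]
bound_holds = all(
    quadratic_form(a, b, c, x, y) >= sigma * (x * x + y * y) - 1e-12
    for x, y in samples
)
print(sigma > 0, bound_holds)
```

Integrating the pointwise inequality over [0, 1] gives \(\sigma \int _0^1 (u_h^2+v_h^2)\,dx \le \varOmega (u_h,v_h)\). When \(b^2 \ge 4ac\) the same formula returns a value that is not positive, and no such bound is available.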

3.2 Error Estimates for the Conservative Semi-Discrete Approximation

A preliminary observation is helpful. Let u and v be smooth and periodic solutions of the system (1.1). In view of the consistency of the form \(\mathcal {N}\) shown in (2.6) (equivalently that of the operator \(\mathcal {N}\)), it follows that \(\mathcal {N}(u,u) = P_0 ((u^2)_x), \ \mathcal {N}(u,v) = P_0 ((uv)_x)\) and \(\mathcal {N}(v,v) = P_0 ((v^2)_x)\) where \(P_0\) is the \(L^2\)-projection onto \(V_h^q\) as before. Since \(\mathcal {D}\) is similarly consistent [see (2.10)], \(\mathcal {D}u = P_0 u_{xxx},\ \mathcal {D}v = P_0 v_{xxx}\). By applying \(P_0\) to the system (1.1), there obtains

$$\begin{aligned} \left[ \begin{array}{l} P_0 u_t \\ P_0 v_t \end{array} \right] + \left[ \begin{array}{lll} A &{} B &{} C \\ D &{} E &{} F \end{array} \right] \left[ \begin{array}{l} \mathcal {N}(u,u) \\ \mathcal {N}(u,v) \\ \mathcal {N}(v,v) \end{array} \right] + \left[ \begin{array}{l} \mathcal {D}u \\ \mathcal {D}v \end{array} \right] = \left[ \begin{array}{l} 0 \\ 0 \end{array} \right] . \end{aligned}$$
(3.5)

The strategy is to make a comparison of \(u_h,v_h\) with the projections \(w^{(u)}:= Pu, \, w^{(v)}:=Pv\) of u and v, respectively, as defined via (2.15). Consider the new quantities

$$\begin{aligned} \zeta ^{(u)}= u_h-w^{(u)}, \quad \zeta ^{(v)}= v_h-w^{(v)},\quad \eta ^{(u)}= u-w^{(u)}, \quad \eta ^{(v)}= v-w^{(v)}. \end{aligned}$$

Replace \(\mathcal {D}u\) by \(\mathcal {D}w^{(u)}\) and \(\mathcal {D}v\) by \(\mathcal {D}w^{(v)}\) in (3.5) using Lemma 4 and then subtract the result from (3.1). There emerges the error equation

$$\begin{aligned} \left[ \begin{array}{l} \zeta ^{(u)}_t \\ \zeta ^{(v)}_t \end{array} \right]+ & {} \left[ \begin{array}{lll} A &{} B &{} C \\ D &{} E &{} F \end{array} \right] \left[ \begin{array}{l} \mathcal {N}(u_h,u_h) - \mathcal {N}(w^{(u)},w^{(u)}) \\ \mathcal {N}(u_h,v_h) - \mathcal {N}(w^{(u)},w^{(v)}) \\ \mathcal {N}(v_h,v_h) - \mathcal {N}(w^{(v)},w^{(v)}) \end{array} \right] + \left[ \begin{array}{l} \mathcal {D}\zeta ^{(u)}\\ \mathcal {D}\zeta ^{(v)}\end{array} \right] \\= & {} \left[ \begin{array}{l} P_0 \eta ^{(u)}_t \\ P_0 \eta ^{(v)}_t \end{array} \right] + \left[ \begin{array}{lll} A &{} B &{} C \\ D &{} E &{} F \end{array} \right] \left[ \begin{array}{l} \mathcal {N}(u,u) - \mathcal {N}(w^{(u)},w^{(u)}) \\ \mathcal {N}(u,v) - \mathcal {N}(w^{(u)},w^{(v)}) \\ \mathcal {N}(v,v) - \mathcal {N}(w^{(v)},w^{(v)}) \end{array} \right] , \end{aligned}$$

where \((w^{(u)}_t,w^{(v)}_t)\) has been subtracted from both sides. This system is written as \(Q_t +Q_1 +Q_2=Q_3+Q_4\) for convenience.

As in the proof of Lemma 5, multiply the latter system from the left by the vector \({\tilde{Q}}(t)=[2a\zeta ^{(u)}+b\zeta ^{(v)}, \ \ b\zeta ^{(u)}+2c\zeta ^{(v)}]\) and integrate over [0, 1] to derive

$$\begin{aligned} \int _0^1 {\tilde{Q}}Q_t \, dx = \frac{d}{dt} \varOmega (\zeta ^{(u)},\zeta ^{(v)}). \end{aligned}$$
(3.6)

A previous calculation shows that

$$\begin{aligned} \int _0^1 {\tilde{Q}}Q_2 \, dx = 2a \mathcal {D}(\zeta ^{(u)},\zeta ^{(u)})+b\left( \mathcal {D}(\zeta ^{(u)},\zeta ^{(v)}) + \mathcal {D}(\zeta ^{(v)},\zeta ^{(u)})\right) + 2c\mathcal {D}(\zeta ^{(v)},\zeta ^{(v)})=0.\nonumber \\ \end{aligned}$$
(3.7)

Applying the identities

$$\begin{aligned} \mathcal {N}(u_h,u_h) - \mathcal {N}(w^{(u)},w^{(u)})= & {} \mathcal {N}(\zeta ^{(u)},\zeta ^{(u)}) + 2\mathcal {N}(w^{(u)},\zeta ^{(u)}) \nonumber \\ \mathcal {N}(u_h,v_h) - \mathcal {N}(w^{(u)},w^{(v)})= & {} \mathcal {N}(\zeta ^{(u)},\zeta ^{(v)}) + \mathcal {N}(w^{(v)},\zeta ^{(u)}) + \mathcal {N}(w^{(u)},\zeta ^{(v)}) \nonumber \\ \mathcal {N}(v_h,v_h) - \mathcal {N}(w^{(v)},w^{(v)})= & {} \mathcal {N}(\zeta ^{(v)},\zeta ^{(v)}) + 2\mathcal {N}(w^{(v)},\zeta ^{(v)}), \end{aligned}$$
(3.8)

allows the conclusion

$$\begin{aligned} \int _0^1 {\tilde{Q}}Q_1 \, dx= & {} \mathcal {I}(\zeta ^{(u)},\zeta ^{(v)}) \nonumber \\&+\, \mathcal {N}(w^{(u)},\zeta ^{(u)};\zeta ^{(u)})(4Aa+2Db) + \mathcal {N}(w^{(v)},\zeta ^{(u)};\zeta ^{(u)})(2Ba+Eb) \nonumber \\&+\, \mathcal {N}(w^{(u)},\zeta ^{(v)};\zeta ^{(v)})(Bb+2Ec) + \mathcal {N}(w^{(v)},\zeta ^{(v)};\zeta ^{(v)})(2Cb+4Fc) \nonumber \\&+ \,\mathcal {N}(w^{(u)},\zeta ^{(v)};\zeta ^{(u)})(2Ba+Eb) + \mathcal {N}(w^{(u)},\zeta ^{(u)};\zeta ^{(v)})(2Ab+4Dc) \nonumber \\&+ \,\mathcal {N}(w^{(v)},\zeta ^{(v)};\zeta ^{(u)})(4Ca+2Fb) + \mathcal {N}(w^{(v)},\zeta ^{(u)};\zeta ^{(v)})(Bb+2Ec).\nonumber \\ \end{aligned}$$
(3.9)

On account of Lemma 5, \(\mathcal {I}(\zeta ^{(u)},\zeta ^{(v)})=0\). Let \(Q_{1,1}\) denote the first four of the last eight terms in (3.9). Formula (2.7) permits this to be rewritten as

$$\begin{aligned} Q_{1,1}= & {} \left( 2Aa+Db\right) \sum _{I\in \mathcal {T}_h} \left( (\zeta ^{(u)})^2, w^{(u)}_x\right) _I + \left( Ba+Eb/2\right) \sum _{I\in \mathcal {T}_h} \left( (\zeta ^{(u)})^2,w^{(v)}_x\right) _I \\&+ \,\left( Bb/2+Ec\right) \sum _{I\in \mathcal {T}_h} \left( (\zeta ^{(v)})^2, w^{(u)}_x\right) _I +\big (Cb+2Fc\big ) \sum _{I\in \mathcal {T}_h} \left( (\zeta ^{(v)})^2,w^{(v)}_x\right) _I. \end{aligned}$$

Now it results from (2.27) with \(p=\infty , \, j=1\) that if \(h =\max h_I\), then

$$\begin{aligned} \Vert w^{(u)}_x\Vert _{L^\infty (I)} + \Vert w^{(v)}_x\Vert _{L^\infty (I)}\le & {} c \sum _{J\in \mathcal {T}_h} h_J^{q} \left( \Vert u^{(q+1)}\Vert _{L^\infty (J)}+ \Vert v^{(q+1)}\Vert _{L^\infty (J)}\right) \\\le & {} ch^{q-1} \left( \Vert u^{(q+1)}\Vert _{L^\infty (0,1)}+ \Vert v^{(q+1)}\Vert _{L^\infty (0,1)}\right) \sum _{J\in \mathcal {T}_h} h_J. \end{aligned}$$

Since \(\sum _{J\in \mathcal {T}_h} h_J=1\), it follows that

$$\begin{aligned} \Vert w^{(u)}_x\Vert _{L^\infty (I)}+\Vert w^{(v)}_x\Vert _{L^\infty (I)} \le c. \end{aligned}$$
(3.10)

This in turn leads to the bound

$$\begin{aligned} \big | Q_{1,1}\big | \le c \left( \Vert \zeta ^{(u)}\Vert ^2+\Vert \zeta ^{(v)}\Vert ^2\right) . \end{aligned}$$
(3.11)

In handling the next two terms, which are denoted by \(Q_{1,2}\), it is crucial that \(2Ba+Eb=2Ab+4Dc\), a fact that flows from the first equation in (1.4). Because of this relation,

$$\begin{aligned} Q_{1,2}= & {} - (2Ba+Eb) \sum _{I\in \mathcal {T}_h} \left( \left( w^{(u)}\zeta ^{(v)},\zeta ^{(u)}_x\right) _I + \left( w^{(u)}\zeta ^{(u)},\zeta ^{(v)}_x\right) _I \right) \\= & {} (2Ba+Eb) \sum _{I\in \mathcal {T}_h} \left( w^{(u)}_x, \zeta ^{(u)}\zeta ^{(v)}\right) _I. \end{aligned}$$

In view of (3.10), it is concluded that

$$\begin{aligned} \big | Q_{1,2}\big | \le c \Vert \zeta ^{(u)}\Vert \Vert \zeta ^{(v)}\Vert \le \frac{c}{2}\Big (\Vert \zeta ^{(u)}\Vert ^2+\Vert \zeta ^{(v)}\Vert ^2\Big ). \end{aligned}$$
(3.12)

It is also the case that \(4Ca+2Fb=Bb+2Ec\), so that if \(Q_{1,3}\) denotes the final two terms in (3.9), then

$$\begin{aligned} Q_{1,3}= & {} - (4Ca+2Fb) \sum _{I\in \mathcal {T}_h} \Big ( \big (w^{(v)}\zeta ^{(v)},\zeta ^{(u)}_x\big )_I + \big (w^{(v)}\zeta ^{(u)},\zeta ^{(v)}_x\big )_I \Big ) \\= & {} (4Ca+2Fb) \sum _{I\in \mathcal {T}_h} \big (w^{(v)}_x ,\zeta ^{(u)}\zeta ^{(v)}\big )_I. \end{aligned}$$

Consequently, the inequality

$$\begin{aligned} \big | Q_{1,3}\big | \le c \Vert \zeta ^{(u)}\Vert \Vert \zeta ^{(v)}\Vert \le \frac{c}{2}\big (\Vert \zeta ^{(u)}\Vert ^2+\Vert \zeta ^{(v)}\Vert ^2\big ) \end{aligned}$$
(3.13)

may be extracted. Combining (3.11), (3.12) and (3.13) yields

$$\begin{aligned} \int _0^1 {\tilde{Q}}Q_1 \, dx \le c \big (\Vert \zeta ^{(u)}\Vert ^2+\Vert \zeta ^{(v)}\Vert ^2\big ). \end{aligned}$$
(3.14)

Next consider the term \(\int _0^1 {\tilde{Q}}Q_3 \, dx\). Since the time derivative commutes with P, it transpires that

$$\begin{aligned} \begin{aligned} \left| \int _0^1 {\tilde{Q}} Q_3 \, dx \right| \le ch^\mu \Big (\Vert u_t\Vert _{W^{\mu +1,\infty } (0,1)} + \Vert v_t\Vert _{W^{\mu +1,\infty } (0,1)}\Big ) \Big ( \Vert \zeta ^{(u)}\Vert +\Vert \zeta ^{(v)}\Vert \Big )\qquad \end{aligned} \end{aligned}$$
(3.15)

(see Proposition 3 and part (iii) of Remark 1 following its proof). It remains to estimate the term \(\int _0^1 {\tilde{Q}} Q_4 \, dx\). Integration by parts yields

$$\begin{aligned} \int _0^1 {\tilde{Q}} Q_4 \, dx \!=\! \sum _{\alpha ,\beta ,\gamma } \sum _{I\in \mathcal {T}_h} \left( c^{1}_{\alpha ,\beta ,\gamma } \big (\eta ^{(\alpha )} \eta _x^{(\beta )}, \zeta ^{(\gamma )}\big )_I \!+\! c^{2}_{\alpha ,\beta ,\gamma } \big (w^{(\alpha )} \eta _x^{(\beta )}, \zeta ^{(\gamma )}\big )_I \!+\! c^{3}_{\alpha ,\beta ,\gamma } \big (w_x^{(\alpha )} \eta ^{(\beta )}, \zeta ^{(\gamma )}\big )_I \right) \end{aligned}$$

where each of the indices \(\alpha ,\beta ,\gamma \) can be u or v and the \(c^i_{\alpha ,\beta ,\gamma }\) are constants depending on \(A,\ldots ,F\) as well as on a, b and c. Unfortunately, the cancellations that occurred in the estimation of \(\int _0^1 {\tilde{Q}} Q_1 \, dx\) do not appear here and, as will be seen, the terms in the second sum will cause a loss of one power of h.

Using (2.27) and (2.28) with \(p=\infty , \ j=0,1\), it follows that

$$\begin{aligned} \begin{aligned} \Vert \eta ^{(\beta )}\Vert _{L^\infty (I)} \le c h^{\mu }\, \mathrm{and}\, \Vert \eta _x ^{(\beta )}\Vert _{L^\infty (I)} \le c h^{\mu -1}. \end{aligned} \end{aligned}$$
(3.16)

Since \(\Vert w^{(\alpha )}\Vert _{L^\infty (I)}\) and \(\Vert w_x^{(\alpha )}\Vert _{L^\infty (I)}\) are bounded, it is deduced that

$$\begin{aligned} \int _0^1 {\tilde{Q}} Q_4 \, dx \le ch^{\mu -1} \left( \Vert \zeta ^{(u)}\Vert +\Vert \zeta ^{(v)}\Vert \right) . \end{aligned}$$
(3.17)

The pieces above are now assembled to establish the convergence of the conservative semi-discrete approximations.

Theorem 3.2

Suppose the solutions u, v of the system (1.1) are sufficiently smooth and periodic. Assume also that the relation \(b^2 < 4ac\) holds. Then for initial data \(u_h^0, \, v_h^0\) that are \(O(h^{q})\) approximations of \(u(\cdot ,0), \, v(\cdot ,0),\) respectively, the error bound

$$\begin{aligned} \Vert u(t)-u_h(t)\Vert + \Vert v(t)-v_h(t)\Vert \le ce^{ct} h^{q-1} \end{aligned}$$
(3.18)

holds for the conservative semi-discrete approximations. If the mesh is uniform, then the bound on the right side is replaced by \(ce^{ct} h^{q}\).

Proof

Gathering (3.6), (3.7) and (3.14)–(3.17) gives the inequality

$$\begin{aligned} \frac{d}{dt} \varOmega (\zeta ^{(u)}(t),\zeta ^{(v)}(t)) \le c h^{2\mu -2}+c\big (\Vert \zeta ^{(u)}\Vert ^2 +\Vert \zeta ^{(v)}\Vert ^2 \big ). \end{aligned}$$
(3.19)

As seen before, \(b^2 < 4ac\) implies that

$$\begin{aligned} \sigma \big (\Vert \zeta ^{(u)}\Vert ^2 +\Vert \zeta ^{(v)}\Vert ^2 \big ) \le \varOmega (\zeta ^{(u)},\zeta ^{(v)}). \end{aligned}$$
(3.20)

Hence an application of Gronwall’s inequality to (3.19) yields

$$\begin{aligned} \varOmega (\zeta ^{(u)}(t),\zeta ^{(v)}(t)) \le c e^{ct} \Big ( h^{2\mu -2} + \varOmega (\zeta ^{(u)}(0),\zeta ^{(v)}(0)) \Big ). \end{aligned}$$
(3.21)

Another application of (3.20) and the triangle inequality gives the bound (3.18). Finally, recall that \(\mu = q+1\) when the mesh is uniform. This gives the \(O(h^q)\) error bound and concludes the proof. \(\square \)

Remark 2

It is worth emphasizing the role played by the fact that the constants a, b and c satisfy (1.4). This is precisely what allowed the quantities \(Q_{1,2}\) and \(Q_{1,3}\) to be written in a form that ultimately resulted in the estimates (3.12) and (3.13) being expressed solely in terms of the \(L^2\)-norms of \(\zeta ^{(u)}\) and \(\zeta ^{(v)}\).

3.3 The Dissipative Formulation

The dissipative formulation is obtained by using the operator \(\tilde{\mathcal {D}}\) instead of \(\mathcal {D}\) in (3.1). The analysis of this method follows closely that of the conservative one and so we content ourselves with highlighting the minor differences.

As far as the existence and uniqueness of the approximations \({\tilde{u}}_h,{\tilde{v}}_h\) are concerned, (3.3) still holds and \(\mathcal {I}({\tilde{u}}_h,{\tilde{v}}_h)\) vanishes also since the nonlinear forms are identical for both methods. Furthermore, using the identity \([{\tilde{u}}_{hx}{\tilde{v}}_{hx}]_m = {\tilde{u}}_{hx}(x_m^+)[{\tilde{v}}_{hx}]_m+{\tilde{v}}_{hx}(x_m^-)[{\tilde{u}}_{hx}]_m\), it is seen that \(\tilde{\mathcal {D}}({\tilde{u}}_h,{\tilde{v}}_h) + \tilde{\mathcal {D}}({\tilde{v}}_h,{\tilde{u}}_h) = \sum _{m=0}^{M-1} [{\tilde{u}}_{hx}]_m[{\tilde{v}}_{hx}]_m\). Combining this with (2.14) leads to

$$\begin{aligned} \frac{d}{dt} \varOmega ({\tilde{u}}_h,{\tilde{v}}_h) + \sum _{m=0}^{M-1} \big ( a [{\tilde{u}}_{hx}]_m^2 + b [{\tilde{u}}_{hx}]_m[{\tilde{v}}_{hx}]_m + c[{\tilde{v}}_{hx}]_m^2\big ) = 0. \end{aligned}$$

The condition \(b^2<4ac\) implies that the sum appearing above is nonnegative. In fact it is bounded from below by \(\sigma \sum _{m=0}^{M-1} \big ( [{\tilde{u}}_{hx}]_m^2 + [{\tilde{v}}_{hx}]_m^2 \big )\). It is therefore the case that \(\frac{d}{dt} \varOmega ({\tilde{u}}_h,{\tilde{v}}_h) \le 0\) which means that \(\varOmega ({\tilde{u}}_h,{\tilde{v}}_h)\) is bounded for all time by its value at \(t=0\). This indeed motivates calling this method dissipative. The global existence and uniqueness of \({\tilde{u}}_h,{\tilde{v}}_h\) follow as before.

The error estimates for the dissipative method can be established using the same approach followed for the conservative method. The major difference is that \({\tilde{u}}_h\) and \({\tilde{v}}_h\) are now compared to \({\tilde{w}}^{(u)}\) and \({\tilde{w}}^{(v)}\), respectively. The optimal, local estimates (2.19) are used, together with the bounds on \({\tilde{w}}^{(u)}, \, {\tilde{w}}^{(u)}_x, \, {\tilde{w}}^{(v)}, \, {\tilde{w}}^{(v)}_x\) in the \(L^\infty \)-norm. Note that the latter hold without any quasi uniformity restrictions on the mesh. One power of h is still lost due to the nonlinear terms.

Theorem 3.3

Suppose the solution \((u,v)\) of the system (1.1) is sufficiently smooth and that the relation \(b^2 < 4ac\) holds. Then for initial data \({\tilde{u}}_h^0, \, {\tilde{v}}_h^0\) that are \(O(h^{q})\) approximations to \(u(\cdot ,0), \, v(\cdot ,0)\), respectively, the error bound

$$\begin{aligned} \Vert u(t)-{\tilde{u}}_h(t)\Vert + \Vert v(t)-{\tilde{v}}_h(t)\Vert \le ce^{ct} h^q \end{aligned}$$
(3.22)

is valid for the dissipative, semi-discrete approximations.

3.4 The Fully Discrete Approximations

In this subsection, consideration is given to adding time-stepping to the semi-discrete scheme studied above. It will turn out that a good choice of the time-stepping, namely the implicit Runge–Kutta (IRK) methods belonging to the Gauss–Legendre family, results in a fully discrete scheme that continues to preserve the functional \(\varOmega \), up to roundoff error of course.

Let \(0=t^0<t^1<\dots <t^N=T\) be a partition of the temporal interval [0, T] with \(t^n = n\kappa ,\ n=0,\dots ,N\) where \(\kappa \) is the stepsize. The IRK methods are specified by an \(s\times s\) matrix \(A = (a_{ij})\), an s-vector \({\tau }=\{\tau _1,\dots ,\tau _s\}\) of temporal interpolation nodes and an s-vector \({\mathbf {w}}=\{w_1,\dots ,w_s\}\) of weights. These are often represented in tableau form, \(\frac{A|\tau }{{\mathbf {w}}| }\). For the initial-value problem of an ordinary differential equation \(y'=f(t,y)\), the discrete time-stepping is given by the mapping \(y^n \rightarrow y^{n+1}\) with \(y^{n+1} = y^n + \kappa \sum _{i=1}^s w_i f(t^n+\kappa \tau _i, y^{n,i})\) where the s intermediate values \(\left\{ y^{n,i}\right\} _{i=1}^s\) are solutions of the system

$$\begin{aligned} y^{n,i} = y^n + \kappa \sum _{j=1}^s a_{ij} f(t^n+\kappa \tau _j, y^{n,j}), \quad i=1,\dots ,s. \end{aligned}$$
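To make the step map concrete, the following Python sketch applies the simplest member of the Gauss–Legendre family, the one-stage method with \(a_{11}=\frac{1}{2}\), \(\tau _1=\frac{1}{2}\), \(w_1=1\) (the implicit midpoint rule), to the linear test problem \(y'=Jy\) with \(J\) the \(2\times 2\) rotation generator. The test problem is our own choice for illustration and is not one of the systems studied here; because it is linear, the stage equation can be solved in closed form.

```python
def implicit_midpoint_step(y, kappa):
    """One step of the one-stage Gauss-Legendre IRK method for
    y' = J y with J = [[0, -1], [1, 0]] (illustrative test problem)."""
    y1, y2 = y
    half = 0.5 * kappa
    # Stage value Y solves Y = y + (kappa/2) J Y, i.e. (I - (kappa/2) J) Y = y.
    denom = 1.0 + half * half
    s1 = (y1 - half * y2) / denom
    s2 = (half * y1 + y2) / denom
    # Update: y_new = y + kappa * w_1 * J Y with w_1 = 1 and J Y = (-s2, s1).
    return (y1 - kappa * s2, y2 + kappa * s1)


def quadratic_invariant(y):
    """The quantity y1^2 + y2^2 conserved by the exact flow."""
    return y[0] * y[0] + y[1] * y[1]


state = (1.0, 0.0)
initial_value = quadratic_invariant(state)
for _ in range(1000):
    state = implicit_midpoint_step(state, kappa=0.1)
drift = abs(quadratic_invariant(state) - initial_value)
print(drift)
```

In this sketch the quadratic invariant is reproduced up to roundoff regardless of the stepsize, which is the behavior one hopes to carry over to the functional \(\varOmega \).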

Certain classes of IRK methods possess a particularly strong type of stability, namely that of algebraic stability, viz.

$$\begin{aligned}&w_i \ge 0, \, i=1,\dots ,s, \\&\text{ the } s\times s \text{ matrix } m_{ij} := a_{ij}w_i + a_{ji} w_j - w_iw_j \text{ is } \text{ positive } \text{ semidefinite }. \nonumber \end{aligned}$$
(3.23)

Two classes of IRK schemes with this property are the Radau-IIA and Gauss–Legendre methods. For the latter, the matrix \((m_{ij})\) in (3.23) vanishes identically. It is precisely this feature that endows them with the conservation properties that are sought here. The next table shows the first three members of the Gauss–Legendre family, corresponding to \(s=1,2\) and 3 (Table 1).

Table 1 Gauss–Legendre Implicit Runge–Kutta methods: \(s=1,2,3\)

For problems with smooth solutions the order of accuracy is 2s. The first two methods are used in the numerical experiments reported in Sect. 4. They are referred to as RK1 and RK2, respectively.
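The vanishing of the matrix \((m_{ij})\) in (3.23) is easy to confirm for the two methods used in the experiments. The Python sketch below uses the standard coefficients of the one- and two-stage Gauss–Legendre methods, entered here by hand, and contrasts them with the implicit Euler method (the one-stage Radau-IIA scheme), for which \((m_{ij})\) is positive rather than zero. The function name `stability_matrix` is hypothetical.

```python
import math


def stability_matrix(a, w):
    """Entries m_ij = a_ij w_i + a_ji w_j - w_i w_j from (3.23)."""
    s = len(w)
    return [[a[i][j] * w[i] + a[j][i] * w[j] - w[i] * w[j] for j in range(s)]
            for i in range(s)]


r = math.sqrt(3.0) / 6.0

# (A, w) pairs; the nodes tau are not needed for this check.
gauss1 = ([[0.5]], [1.0])
gauss2 = ([[0.25, 0.25 - r], [0.25 + r, 0.25]], [0.5, 0.5])
implicit_euler = ([[1.0]], [1.0])

for name, (a, w) in [("RK1", gauss1), ("RK2", gauss2), ("implicit Euler", implicit_euler)]:
    largest = max(abs(entry) for row in stability_matrix(a, w) for entry in row)
    print(name, largest)
```

For RK1 and RK2 every entry is zero up to floating-point roundoff, whereas the implicit Euler method gives \(m_{11}=1\), which signals dissipation of quadratic functionals rather than conservation.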

The fully discrete method that arises from using an IRK method to approximate solutions of the semi-discrete formulation (3.1) of our systems is as follows:

$$\begin{aligned}&u^{n+1} = u^n - \kappa \sum _{i=1}^s w_i f_u^i, \quad u^{n,i} = u^n - \kappa \sum _{j=1}^s a_{ij} f_u^j, \quad i=1,\dots ,s, \end{aligned}$$
(3.24)
$$\begin{aligned}&v^{n+1} = v^n - \kappa \sum _{i=1}^s w_i f_v^i, \quad v^{n,i} = v^n - \kappa \sum _{j=1}^s a_{ij} f_v^j, \quad i=1,\dots ,s \end{aligned}$$
(3.25)

where the quantities \(f_u^j,f_v^j\) given by

$$\begin{aligned} \left[ \begin{array}{l} f_u^i\\ f_v^i\end{array} \right] = \left[ \begin{array}{lll} A &{} B &{} C \\ D &{} E &{} F \end{array} \right] \left[ \begin{array}{l} \mathcal {N}(u^{n,i},u^{n,i}) \\ \mathcal {N}(u^{n,i},v^{n,i}) \\ \mathcal {N}(v^{n,i},v^{n,i}) \end{array} \right] + \left[ \begin{array}{l} \mathcal {D}u^{n,i}\\ \mathcal {D}v^{n,i}\end{array} \right] \end{aligned}$$
(3.26)

are introduced to simplify the notation.

Theorem 3.4

The fully discrete scheme using the conservative spatial formulation together with the Gauss–Legendre IRK methods has the property that

$$\begin{aligned} \varOmega (u^{n+1},v^{n+1}) = \varOmega (u^n,v^n), \ n \ge 0. \end{aligned}$$
(3.27)

That is, the continuous invariant \(\varOmega \), when evaluated on the discrete approximations, takes the same value at every time step.

Proof

From (3.24), it is seen that

$$\begin{aligned} (u^{n+1})^2= & {} {} (u^n)^2 - 2 \kappa \sum _{j=1}^s w_j u^nf_u^j+ \kappa ^2 \sum _{i,j=1}^s w_i w_jf_u^if_u^j \nonumber \\= & {} {} (u^n)^2 - 2 \kappa \sum _{j=1}^s w_j \Big (u^{n,j}+ \kappa \sum _{i=1}^s a_{ji}f_u^i\Big ) f_u^j+ \kappa ^2 \sum _{i,j=1}^s w_iw_j f_u^if_u^j \nonumber \\= & {} {} (u^n)^2 - 2 \kappa \sum _{j=1}^s w_j u^{n,j}f_u^j- \kappa ^2 \sum _{i,j=1}^s m_{ij} f_u^if_u^j \nonumber \\= & {} {} (u^n)^2 - 2 \kappa \sum _{j=1}^s w_j u^{n,j}f_u^j, \nonumber \\ \end{aligned}$$
(3.28)

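where the third identity follows by symmetrizing the double sum, \(2\sum _{i,j=1}^s w_j a_{ji} f_u^if_u^j = \sum _{i,j=1}^s (a_{ij}w_i + a_{ji}w_j) f_u^if_u^j\), and recalling the definition of \(m_{ij}\) in (3.23), while the last identity reflects the vanishing of the array elements \(m_{ij}\) for the Gauss-Legendre IRK methods. By entirely similar calculations, there appears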

$$\begin{aligned} (v^{n+1})^2= & {} (v^n)^2 - 2 \kappa \sum _{j=1}^s w_j v^{n,j}f_v^j\nonumber \\ u^{n+1}v^{n+1}= & {} u^nv^n- \kappa \sum _{j=1}^s w_j u^{n,j}f_v^j- \kappa \sum _{j=1}^s w_j v^{n,j}f_u^j. \end{aligned}$$
(3.29)

Multiply (3.28) by a, the first equation of (3.29) by c and the second equation in (3.29) by b, sum the results and integrate over [0, 1]. These calculations, which are almost identical to those seen in the proofs of Lemma 5 and Theorem 3.1, show that

$$\begin{aligned} \varOmega (u^{n+1},v^{n+1})= & {} \varOmega (u^n,v^n) - \kappa \sum _{j=1}^s w_j \Big ( 2a (f_u^j,u^{n,j}) + b (f_u^j,v^{n,j}) \nonumber \\&+\, b (f_v^j,u^{n,j})+2c(f_v^j,v^{n,j}) \Big ) \nonumber \\= & {} \varOmega (u^n,v^n) \!- \!\kappa \sum _{j=1}^s w_j \Big ( \mathcal {I}(u^{n,j},v^{n,j}) + 2a \mathcal {D}(u^{n,j},u^{n,j}) + b\big ( \mathcal {D}(u^{n,j},v^{n,j}) \nonumber \\&+\,\mathcal {D}(v^{n,j},u^{n,j}) \big ) +2c \mathcal {D}(v^{n,j},v^{n,j}) \Big ). \end{aligned}$$
(3.30)

In particular, it emerged from those previous proofs that

$$\begin{aligned} \mathcal {I}(u^{n,j},v^{n,j}) = \mathcal {D}(u^{n,j},u^{n,j}) = \mathcal {D}(u^{n,j},v^{n,j}) + \mathcal {D}(v^{n,j},u^{n,j}) = \mathcal {D}(v^{n,j},v^{n,j})=0. \end{aligned}$$
(3.31)

These identities, when used in (3.30), establish the temporal conservation property (3.27) of the fully discrete schemes. \(\square \)

Remark 3

In the dissipative spatial formulation that uses the form \(\tilde{\mathcal {D}}\) instead of \(\mathcal {D}\), we still have \(\mathcal {I}(u^{n,j},v^{n,j})=0\). However, the remaining terms in (3.31) do not vanish. Instead, it can be shown that the quantity \( 2a \tilde{\mathcal {D}}(u^{n,j},u^{n,j}) + b\big ( \tilde{\mathcal {D}}(u^{n,j},v^{n,j}) + \tilde{\mathcal {D}}(v^{n,j},u^{n,j}) \big ) +2c \tilde{\mathcal {D}}(v^{n,j},v^{n,j})\) appearing in (3.30) is equal to \( \sum _{m=0}^{M-1} \Big ( a[u^{n,j}_x]_m^2 + b [u^{n,j}_x]_m[v^{n,j}_x]_m + c[v^{n,j}_x]_m^2 \Big )\). Since we are operating under the assumption that \(4ac-b^2>0\), the latter sum is bounded below by \(\sigma \sum _{m=0}^{M-1} \left( [u^{n,j}_x]_m^2 + [v^{n,j}_x]_m^2 \right) \), for some positive \(\sigma \), and is therefore nonnegative. In view of this and the nonnegativity of \(\kappa \) and the weights \(w_j\), the inequality \(\varOmega (u^{n+1},v^{n+1}) \le \varOmega (u^n,v^n)\) obtains for the dissipative formulation.

Remark 4

For other algebraically stable IRK methods, the fully discrete approximations will no longer enjoy the conservation property (3.27) that obtains for those generated by the Gauss–Legendre methods. On the other hand, it is clear from the proof above that the stability result \(\varOmega (u^{n+1},v^{n+1}) \le \varOmega (u^n,v^n)\) will replace (3.27).
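The contrast between Theorem 3.4 and Remark 4 can be seen on a toy problem. The sketch below is not the scheme (3.24)-(3.26); it applies two one-stage IRK methods to the linear oscillator \(u'=v,\ v'=-u\), chosen here purely for illustration because its quadratic invariant \(u^2+v^2\) plays the role of \(\varOmega \). With \(a_{11}=\frac{1}{2}\) (Gauss-Legendre, \(m_{11}=0\)) the invariant should be preserved up to roundoff, whereas with \(a_{11}=1\) (implicit Euler, i.e. one-stage Radau-IIA, \(m_{11}=1\)) it should decay. The function names are hypothetical.

```python
def one_stage_step(u, v, kappa, a11):
    """One step of a one-stage IRK method (weight w_1 = 1) for u' = v, v' = -u.

    a11 = 1/2 is the Gauss-Legendre method with s = 1 (implicit midpoint);
    a11 = 1 is implicit Euler, the one-stage Radau-IIA method.
    """
    h = kappa * a11
    # Solve the linear stage equation Y = y + h*J*Y with J = [[0, 1], [-1, 0]].
    yu = (u + h * v) / (1.0 + h * h)
    yv = (v - h * u) / (1.0 + h * h)
    # Update y_new = y + kappa*f(Y), where f(Y) = J*Y = (yv, -yu).
    return u + kappa * yv, v - kappa * yu


def energy_after(n_steps, kappa, a11, u=1.0, v=0.0):
    """Value of the quadratic invariant u^2 + v^2 after n_steps steps."""
    for _ in range(n_steps):
        u, v = one_stage_step(u, v, kappa, a11)
    return u * u + v * v


e_gauss = energy_after(1000, 0.1, 0.5)  # conservative case, m_11 = 0
e_radau = energy_after(1000, 0.1, 1.0)  # dissipative case, m_11 = 1
print(e_gauss, e_radau)
```

For implicit Euler each step multiplies the invariant by \(1/(1+\kappa ^2)\), so the decay is monotone, in line with the inequality stated in Remark 4.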

4 Numerical Experiments

4.1 The Test Cases

To adapt the experiments to the interval [0, 1], the third derivative terms in (1.1) are multiplied by a small parameter \(\epsilon \) to be specified shortly. The parameters \(A,\dots ,F\) are taken to be

$$\begin{aligned} A = \frac{1}{8}, \;\;\; B = \frac{1}{8}, \;\;\; C= \frac{1}{32}, \;\;\; D= \frac{1}{8}, \;\;\; E = 1, \;\;\; F= -\frac{9}{32}. \end{aligned}$$
(4.1)

These choices result in \(a = \frac{118}{17}, \ b = -\frac{28}{17}\) and \(\ c=1\), which in turn yield the positive discriminant \(4ac-b^2 = \frac{7240}{289}\).
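The arithmetic behind these values can be confirmed in exact rational arithmetic. The sketch below checks the discriminant and also two linear relations between \((a,b,c)\) and \((A,\dots ,F)\). These relations are an assumption of this example: they are what one obtains by requiring the nonlinear contribution to \(\frac{d}{dt}\int (au^2+buv+cv^2)\,dx\) to be an exact derivative, and the earlier sections of the paper may state the conditions in a different but equivalent form. The last check confirms that \(Q(u,2u)=2P(u,2u)\), which is what allows proportional solutions \(v=2u\).

```python
from fractions import Fraction as Fr

# Coefficients (4.1) of the quadratic nonlinearities P and Q.
A, B, C = Fr(1, 8), Fr(1, 8), Fr(1, 32)
D, E, F = Fr(1, 8), Fr(1), Fr(-9, 32)

# Coefficients of the invariant quoted in the text.
a, b, c = Fr(118, 17), Fr(-28, 17), Fr(1)

disc = 4 * a * c - b * b

# Assumed compatibility relations (see lead-in); both should vanish.
rel_u = 2 * B * a + (E - 2 * A) * b - 4 * D * c
rel_v = 4 * C * a + (2 * F - B) * b - 2 * E * c

# P and Q evaluated at (u, v) = (1, 2).
P12 = A + 2 * B + 4 * C
Q12 = D + 2 * E + 4 * F

print(disc, rel_u, rel_v, P12, Q12)
```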

To check the accuracy and convergence rates, two types of solutions of (1.1) were used. These are both proportional traveling waves of the form \((u,v) = (u,2u)\). The first is adapted from the well-known cnoidal-wave solution of the KdV equation,

$$\begin{aligned} u(x,t)= \lambda \, \text {cn}^2\big (4K(m)(x-\omega t-x_0):m \big ) \end{aligned}$$
(4.2)

where \(\text {cn}(z) = \text {cn}(z\!:\!m)\) is the Jacobi elliptic function with modulus \(m \in (0,1)\) (see [1]) and the parameters have the values \(m = 0.9, \lambda = 192m\epsilon K(m)^2\), \(\omega =64\epsilon (2m-1)K(m)^2\), \(\epsilon = \frac{1}{576}\) whilst \(x_0 = \frac{1}{2}\) centers the initial value in the middle of the interval. Here, the function \(K=K(m)\) is the complete elliptic integral of the first kind and the parameters are so organized that u and v have spatial period 1.
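For orientation, the amplitude and speed of the cnoidal wave can be evaluated numerically. The sketch below computes \(K(m)\) by the arithmetic-geometric mean and then \(\lambda \) and \(\omega \) from the formulas above. It assumes that m enters \(K(m)\) as the parameter (the square of the classical modulus k), which is the convention under which speed formulas containing \(2m-1\) are usually written; if m is meant as the modulus itself, K would have to be evaluated at \(m^2\) instead. The function name `ellipk` is an illustrative choice.

```python
import math


def ellipk(m, n_iter=30):
    """Complete elliptic integral of the first kind K(m), m being the parameter.

    Uses the identity K(m) = pi / (2 * AGM(1, sqrt(1 - m))).
    """
    a, g = 1.0, math.sqrt(1.0 - m)
    for _ in range(n_iter):  # the AGM iteration converges quadratically
        a, g = 0.5 * (a + g), math.sqrt(a * g)
    return math.pi / (2.0 * a)


m = 0.9
eps = 1.0 / 576.0
K = ellipk(m)

lam = 192.0 * m * eps * K ** 2                 # amplitude lambda in (4.2)
omega = 64.0 * eps * (2.0 * m - 1.0) * K ** 2  # phase speed omega in (4.2)

print(K, lam, omega)
```

Under this assumption the amplitude comes out close to 2, and the ratio \(\lambda /\omega \) equals \(3m/(2m-1)\) independently of \(\epsilon \).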

The second class of solutions is an approximation of the proportional solitary-wave solutions that were discussed in Sect. 1.1. The parameters \(A, B, \ldots , F\) are the same as for the cnoidal-wave type solutions, and it is still the case that \((u,v) = (u,2u)\), but now

$$\begin{aligned} u(x,t)= \Lambda \,\text{ sech }^2 \big (K(x- \omega t -x_0) \big ), \end{aligned}$$
(4.3)

with \(\Lambda = 1\), \(\omega = \Lambda /3\), \(\epsilon = \frac{1}{5760}\), \(K= \frac{1}{2}\sqrt{\frac{\Lambda }{3\epsilon }}\) and \(x_0 = \frac{1}{2}\) to again center the initial wave profile. With \(v = 2u\), this is an exact solution of the system which is manifestly not periodic in space. Owing to its symmetry about its crest and its exponential decay away from the crest, the initial data can be rendered periodic by simply restricting the above solution at \(t = 0\) to the computational domain [0, 1] and imposing periodic boundary conditions across \(x = 0\) and \(x = 1\). The resulting periodized initial data yields a periodic solution of the system. It is known from previous theory that the resulting solution is approximated to within order \(\epsilon \) by the restriction of (u, 2u) to the period domain [0, 1] over a time interval of order \(\frac{1}{\epsilon }\) (cf. [5] and [17]). The small value of \(\epsilon \) used in the experiments with proportional solitary waves thus yields a solution, the accuracy of whose numerical approximation can be determined by comparison with the exact solution (u, 2u) with u as in (4.3). Much of the numerical work on the KdV equation has made use of this small trick to check for accuracy and convergence, especially when issues surrounding solitary waves are under consideration.
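Two claims in this paragraph lend themselves to a quick numerical check. Substituting \(v=2u\) and the coefficients (4.1) into the system, with the dispersive terms scaled by \(\epsilon \), reduces both equations to \(u_t+\epsilon u_{xxx}+uu_x=0\). The sketch below evaluates the residual of this scalar equation for (4.3) using closed-form derivatives of \(\text {sech}^2\), and also evaluates the profile at the ends of [0, 1] to show how small the mismatch introduced by periodization is. The function names are illustrative.

```python
import math

EPS = 1.0 / 5760.0
LAM = 1.0
OMEGA = LAM / 3.0
K = 0.5 * math.sqrt(LAM / (3.0 * EPS))
X0 = 0.5


def u_exact(x, t):
    """Solitary-wave profile (4.3)."""
    return LAM / math.cosh(K * (x - OMEGA * t - X0)) ** 2


def residual(x, t):
    """Residual of u_t + EPS*u_xxx + u*u_x for (4.3), using exact derivatives."""
    z = K * (x - OMEGA * t - X0)
    s = 1.0 / math.cosh(z) ** 2  # sech^2(z)
    th = math.tanh(z)
    u = LAM * s
    u_x = -2.0 * K * LAM * s * th
    u_t = -OMEGA * u_x           # travelling wave: u_t = -omega * u_x
    u_xxx = LAM * K ** 3 * (-8.0 * s * th + 24.0 * s * s * th)
    return u_t + EPS * u_xxx + u * u_x


max_residual = max(abs(residual(i / 400.0, 0.3)) for i in range(401))
left_value, right_value = u_exact(0.0, 0.0), u_exact(1.0, 0.0)
print(max_residual, left_value, right_value)
```

The residual is at roundoff level, and the end values agree by symmetry and are of order \(10^{-9}\), so the periodized data differ only negligibly from the solitary wave at \(t=0\).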

4.2 Spatial Convergence Rates

A set of experiments was designed to measure the spatial convergence rates of our schemes. In particular, we wanted to assess the extent of agreement of what is observed empirically with the theoretical predictions made earlier. Of course, there is always interest in the asymptotic constants whose existence is suggested by the error estimates. (There is a considerable practical difference between a scheme that throws up an error of \(10^2 h^2\), say, and one whose error looks like \(10^{-2} h^2\).) Particular interest also attaches to discovering whether the parity of q has a significant impact on the convergence rates for the conservative formulation (see the discussion in Sect. 3).

Tables 2, 3, 4 and 5 correspond to \(q=2,3,4\) and 5, respectively. They show the results obtained with uniform meshes of M cells for both conservative and dissipative methods and for the cnoidal and solitary-wave solutions just described. The time step \(\kappa \) was taken as \(10^{-5}\) in Tables 2, 3 and 4, whilst in Table 5, the results pertaining to several different time steps are displayed. The \(L^2\)-norms of the errors are shown only for u. Since v is proportional to u, it is not surprising that the errors for v appear to be proportional to those for u.

Before engaging in a discussion of the results displayed in Tables 2, 3, 4 and 5, it is worth remarking that the determination of convergence rates is somewhat delicate, especially for the larger values of q. This is due to the very small errors achieved even for relatively small numbers M of spatial intervals. The errors are sufficiently small that the effects of roundoff, which are very difficult to estimate precisely, cannot be ignored and may affect the rates.

For \(q=2\), Table 2 clearly shows convergence rates of 2 and 3 for the conservative and dissipative methods, respectively, for both the cnoidal and solitary-wave solutions. For \(q=4\), the rates shown in Table 4, while less definitive, seem to indicate a similar reduction of order for the conservative method. For the odd values \(q=3\) and \(q=5\), the rates appear to be \(q+1\) for both methods. One notable exception is in Table 5(b), where rates of 3.62 and 8.41 are seen for the conservative method. However, using the errors at \(M=25\) and \(M=100\), we obtain the robust rate of 6.02.
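The rates quoted here are presumably observed orders of the usual form \(\log (e_{M_1}/e_{M_2})/\log (M_2/M_1)\). The sketch below implements that formula and illustrates why a rate taken across two refinements smooths out erratic successive rates. It assumes that the rates 3.62 and 8.41 refer to the refinements \(M=25\rightarrow 50\) and \(M=50\rightarrow 100\); the error values are hypothetical numbers manufactured to reproduce those two rates and are not data from Table 5.

```python
import math


def observed_rate(err_coarse, err_fine, m_coarse, m_fine):
    """Observed convergence order between meshes of m_coarse and m_fine cells."""
    return math.log(err_coarse / err_fine) / math.log(m_fine / m_coarse)


# Hypothetical errors, constructed so that successive rates are 3.62 and 8.41.
e25 = 1.0
e50 = e25 / 2.0 ** 3.62
e100 = e50 / 2.0 ** 8.41

rate_25_50 = observed_rate(e25, e50, 25, 50)
rate_50_100 = observed_rate(e50, e100, 50, 100)
rate_25_100 = observed_rate(e25, e100, 25, 100)
print(rate_25_50, rate_50_100, rate_25_100)
```

Because each refinement doubles M, the rate from \(M=25\) to \(M=100\) is the average of the two successive rates, about 6.015 for these rounded inputs, which is consistent with the value 6.02 reported above.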

Table 2 \(q=2, T=1, \kappa = 10^{-5}\), RK1
Table 3 \(q=3, T=1, \kappa =10^{-5}\), RK2
Table 4 \(q=4\), \(T=1\), \(\kappa =10^{-5}\), RK2
Table 5 \(q=5\), \(T=1\), \(\kappa =1/160,1/1280,1/10240\), RK1

As far as the conservative method is concerned, the data strongly indicate an order reduction by one for even q. On the other hand, the parity of the number of cells appeared to be immaterial. As for the loss of one order attributed to the nonlinear terms in Theorems 3.2 and 3.3, which affects both methods, the numerical evidence suggests that it is an artifact of the proof. If it indeed does not reflect a fundamental phenomenon, removing it from the analysis is an interesting open problem.

4.3 Long Time Simulations

Overall, the errors for the two methods are comparable, at least on time scales of order 1. The conservative method has a slight edge for the odd values \(q = 3,5\) of the order of the piecewise polynomials that comprise the spatial basis, the opposite being true for the even orders \(q=2,4\). A comparative study of the long-time behaviour of the two methods is undertaken next. The working expectation here is that the conservation of the invariant \(\varOmega \) should have beneficial consequences when it comes to long-time calculations.

Fig. 1 \(q=2, M = 80\): (a) \(t = 250\), (b) \(t = 500\), (c) \(t = 750\), (d) \(t = 1000\), (e) computed values of \(\varOmega \), (f) phase error

Fig. 2 \(q=2, M = 160\): (a) \(t = 250\), (b) \(t = 500\), (c) \(t = 750\), (d) \(t = 1000\), (e) computed values of \(\varOmega \), (f) phase error

Fig. 3 \(q=3, M = 80\): (a) \(t = 250\), (b) \(t = 500\), (c) \(t = 750\), (d) \(t = 1000\), (e) computed values of \(\varOmega \), (f) phase error

Fig. 4 \(q=3, M = 160\): (a) \(t = 250\), (b) \(t = 500\), (c) \(t = 750\), (d) \(t = 1000\), (e) computed values of \(\varOmega \), (f) phase error

Figures 1, 2, 3 and 4 provide a descriptive view of the approximations of solitary-wave solutions up to time \(T=1000\) for the two methods. They correspond to \(q=2, \, M=80\); \(q=2, \, M=160\); \(q=3, \, M=80\); and \(q=3, \, M=160\), respectively.

The graphs of the quantity \(\varOmega \) exhibited in the parts (e) of these figures show agreement with the theory. That is to say, \(\varOmega \) is sensibly invariant when the integration is performed using the conservative method, whereas it is monotonically decreasing when the dissipative method is employed. This decrease appears to be a linear function of time as can be seen clearly in the more accurate simulations corresponding to \(q=3\).

Before embarking on an interpretation of the remaining graphs, some commentary is helpful. In the context of the numerical approximation of solitary-wave solutions of nonlinear dispersive equations of KdV type, a plethora of experiments have shown that a lag develops between the locations of the crest of the solution and its approximations. This lag, or phase error, grows in time to a point where it becomes the majority contributor to the overall error. It can be mitigated by increased accuracy and the use of conservative numerical methods, but it cannot be completely eliminated. In fact, the solitary-wave solution being approximated is known to be orbitally stable (see [6]). This means that if the system is initiated with initial data \((u_0,v_0)\) that is close to a solitary-wave solution \((r(x-\omega t), s(x - \omega t))\) at \(t=0\), then the solution \((u,v)\) emanating from \((u_0,v_0)\) will always be close to that solitary wave in shape. More precisely, because \(4ac - b^2 > 0\) in conjunction with another condition spelled out in [6], the quantity

$$\begin{aligned} \inf _{\theta \in \mathbb {R}} \int _\mathbb {R} \Big (u(x,t) - r(x - \theta ) \Big )^2 + \Big (v(x,t) - s(x - \theta ) \Big )^2 \, dx \end{aligned}$$
(4.4)

is uniformly small for all time. The minimization (4.4) defines a function \(\theta = \theta (t)\) and the phase error is then taken to be

$$\begin{aligned} e(t) \, = \, |\theta (t) - \omega t|. \end{aligned}$$
(4.5)
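In computations, the minimization in (4.4) has to be carried out on discrete, periodic data. The sketch below is one simple way to do this and is not taken from the paper: the integral over \(\mathbb {R}\) is replaced by a sum over one period, the minimizer is found by a plain grid search over \(\theta \in [0,1)\), and the phase error (4.5) is measured modulo the period. The "numerical solution" is a synthetic stand-in, namely the periodized profile \((r,2r)\) with \(r=\text {sech}^2\) displaced so as to lag the true crest by 0.02. All function names and the grid-search resolution are illustrative assumptions.

```python
import math


def sech2_profile(x, K, x0=0.5):
    """Periodized sech^2 profile of unit amplitude with crest at x0 (period 1)."""
    d = (x - x0 + 0.5) % 1.0 - 0.5  # signed distance to the crest, in [-1/2, 1/2)
    return 1.0 / math.cosh(K * d) ** 2


def best_shift(u, v, K, n_theta=2000):
    """Grid search over theta in [0, 1) minimizing a discrete form of (4.4)."""
    M = len(u)
    best_theta, best_val = 0.0, float("inf")
    for k in range(n_theta):
        theta = k / n_theta
        val = 0.0
        for m in range(M):
            r = sech2_profile(m / M - theta, K)
            val += (u[m] - r) ** 2 + (v[m] - 2.0 * r) ** 2
        if val < best_val:
            best_theta, best_val = theta, val
    return best_theta


def periodic_distance(p, q):
    """Distance between two positions on a circle of length 1."""
    d = (p - q) % 1.0
    return min(d, 1.0 - d)


# Synthetic stand-in for numerical output: exact shape, crest lagging by 0.02.
K = 0.5 * math.sqrt(1.0 / (3.0 / 5760.0))
omega, t, lag = 1.0 / 3.0, 0.75, 0.02
M = 200
shift = omega * t - lag
u_num = [sech2_profile(m / M - shift, K) for m in range(M)]
v_num = [2.0 * value for value in u_num]

theta = best_shift(u_num, v_num, K)
phase_error = periodic_distance(theta, omega * t)
print(theta, phase_error)
```

A grid search is crude but has no difficulty with the nearly flat tails of the objective; in practice one might refine its result with a local one-dimensional minimizer.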

What is important to realize is that while a perturbation stays close in shape, the corresponding solution \((u,v)\) appears to resolve into a solitary wave which in general has a slightly different speed of propagation. In consequence, the phase error necessarily grows linearly with time. The growth cannot be too fast since it is known that

$$\begin{aligned} |\theta '(t) - \omega | \end{aligned}$$

always remains small (see e.g. [15, 30]).

One can think of the numerical scheme as a perturbation of the continuous initial-value problem. Viewed in this light, and taking into account the above discussion of the continuous problem, one expects a phase error to be associated with the numerical approximation of the solution. The main purpose of the longer-time experiments is to offer a numerical study of the phase error, by which we mean the gap between the location of the crest of the solitary wave solution at time t and its numerical approximation at the same time. Thus in (4.4), \((u,v)\) will be the numerical approximation of the solution and \((r,s)\) will be the relevant proportional solitary wave. For \(q=2, \, M=80\) (Fig. 1), both conservative and dissipative methods show significant phase errors with the former proving superior to the latter. Another striking aspect is the marked loss of amplitude the approximation suffers when using the dissipative method, whereas the conservative approximations appear to be remarkably accurate in shape. Furthermore, the dissipative approximation seems to have spread to the point where the solution no longer gets back to zero away from its crest. In Fig. 2, doubling the number of cells has resulted in a quantitative improvement for both methods with the conservative method continuing to hold an edge. The phase errors persist for both methods, however.

Figures 3 and 4 correspond to \(q=3, \, M=80\) and \(q=3, \, M=160\) respectively. Since q is odd, the conservative method has the same convergence rate as the dissipative method for these runs. The phase error appears to be eliminated when using the conservative scheme, whereas it still persists albeit to a less severe degree for the dissipative method. For \(M=80\), a small decrease in amplitude for the latter method is still present. Finally, we note that there is indication of superlinear growth of the phase error in all the runs involving the dissipative method.

5 Summary and Future Work

We have constructed conservative and dissipative finite element methods for a system of Korteweg–de Vries type equations coupled through their nonlinearities. The associated semi-discrete approximations have been investigated in detail and shown to be globally well posed when certain relationships hold between the coefficients of the nonlinear terms. These are the same conditions that had been used previously in proving the well-posedness of the continuous system (1.1).

The semi-discrete approximations were shown to converge to the associated solutions of the full system of PDEs and spatial convergence rates were provided. For the dissipative scheme, this follows standard lines. However, interesting additional conditions on the parity of the degree of the piecewise polynomial basis for the finite element space and on the number of cells were needed for the analysis of the conservative method. While a little puzzling, such conditions have appeared in other works associated with the analysis of conservative methods.

A number of tests have been conducted to provide a limited but illuminating assessment of the actual impact of these conditions and also to provide a comparative view of the performance of the conservative and dissipative methods. Based on this evidence, it is safe to conclude that the conservative method with \(q=3\) offers an efficient and effective tool for simulations over very long time intervals.

In a companion project, further investigation of systems of the form (1.1) is in progress. This includes questions of blow-up for the systems not satisfying the condition \(4ac - b^2 \ge 0\) that figured so prominently in the present work, as well as instability and interaction results for the proportional solitary waves.