1 Introduction

In this paper, we develop and analyze a new hybridizable discontinuous Galerkin (HDG) method for the following initial-boundary value problem of the Korteweg–de Vries (KdV) type equation on a finite domain

$$\begin{aligned} \begin{array}{llll} u_t+u_{xxx} + F(u)_x\,=\,f&{} \quad {\text {for}}\; x\in \Omega :=(a,\;b), t\in (0, T],\\ u\,=\,u_0&{}\quad {\text {in}}\;\Omega \text { for }t=0,\\ u \,=\, u_D &{}\quad {\text {on}}\; \partial \Omega := \{a,b\},\\ u_x \,=\, q_N &{}\quad {\text {on}}\;\partial \Omega _N:=\{b\}. \end{array} \end{aligned}$$
(1.1)

Here \(f \in L^2(\Omega )\) and \(F(u)=\beta u^m\), where \(\beta \) is a constant and \(m\ge 0\) is an integer. The well-posedness of the problem (1.1) and the properties of its solution have been studied both theoretically and numerically; see [3,4,5, 17, 18, 30] and the references therein.

KdV type equations play an important role in applications, such as fluid mechanics [7, 25, 26], nonlinear optics [1, 19], acoustics [28, 33], plasma physics [6, 29, 32, 37], and Bose–Einstein condensates [21, 31], among other fields. They have also had an enormous impact on the development of nonlinear mathematical science and theoretical physics; many modern areas were opened up as a consequence of basic research on KdV equations. Due to their importance in applications and theoretical studies, there has been considerable interest in developing accurate and efficient numerical methods for KdV equations. In particular, an ongoing effort to develop discontinuous Galerkin (DG) methods for KdV type equations has been made over the last decade. The first DG method for the KdV equation, the local discontinuous Galerkin (LDG) method, was introduced in 2002 by Yan and Shu [36] and further studied for the linear case in [20, 23, 34, 35]. In [10], a DG method for the KdV equation was devised by using repeated integration by parts. Recently, several conservative DG methods [2, 9, 22] were developed for KdV type equations to preserve quantities such as the mass and the \(L^2\)-norm of the solutions. When solving KdV equations, one can use these DG methods for the spatial discretization together with explicit time-marching schemes if the coefficient of the third-order derivative is very small. When this coefficient is of order one, however, implicit time-marching methods might be the methods of choice.

Traditional DG methods, despite their prominent features such as hp-adaptivity and local conservativity, have been criticized for having a larger number of degrees of freedom than continuous finite element methods when solving steady-state problems or problems that require implicit-in-time solvers. Here, we develop an HDG method which is very well suited to solving KdV equations when implicit time-marching is used. HDG methods [11, 13,14,15] were first introduced for diffusion problems, and they provide optimal approximations to both the potential and the flux. Because their globally coupled degrees of freedom live only on element interfaces, they are significantly advantageous for solving steady-state problems or time-dependent problems that require implicit time-marching. In [8], we introduced the first family of HDG methods for stationary third-order linear equations, which allow the approximations to the exact solution u and its derivatives \(u_x\) and \(u_{xx}\) to have different polynomial degrees. We proved superconvergence properties of these methods for the projections of the errors and the numerical traces, and numerical results indicate that the HDG method using the same polynomial degree k for all three variables is quite robust with respect to the choice of the stabilization function and provides a postprocessed solution converging with order \(2k+1\) using the fewest degrees of freedom. This suggests that the HDG method using the same polynomial degree for all variables is the method of choice for solving one-dimensional third-order problems. Therefore, in this paper we extend this HDG method to time-dependent third-order KdV type equations.

To construct the HDG method for KdV equations, we follow the approach used in [8] for stationary third-order equations. That is, given any mesh of the domain, we show that the exact solution can be obtained by solving the equation on each element with boundary data determined by transmission conditions. We then define HDG methods as a discrete version of this characterization, which ensures that the only globally coupled degrees of freedom are those associated with the numerical traces on element interfaces. In [8], it was shown that HDG methods derived by providing boundary data to the local problems in different ways are in fact equivalent to each other when the stabilization function is finite and nonzero. So here we only need to consider the one that takes the numerical trace of u at both ends of the interval and the numerical trace of \(u_{xx}\) at the right end as boundary data for the local problems. Our method is different from the HDG method in [27], which was designed from an implementation point of view; that HDG method involves two sets of numerical traces for \(u_x\), and there is no error analysis for it.

Our way of devising HDG methods from the characterization of the exact solution allows us to carry out a stability and error analysis. We first apply an energy argument to find conditions on the stabilization function in the numerical traces under which the HDG method has a unique solution for KdV type equations. Then, by deriving four energy identities and combining them, we prove that the method provides optimal approximations to u as well as to its derivatives \(u_x\) and \(u_{xx}\) for linear equations; this technique is similar to that in [35]. In the implementation, implicit time-marching schemes such as BDF or DIRK methods can be used, and at each time step a stationary third-order equation is solved by the HDG method together with the Newton–Raphson method (see “Appendix”). Due to the one-dimensional setting of the KdV equations, the global matrix of the HDG method that needs to be numerically inverted at each time step is independent of the polynomial degree of the approximations, its size is only \(2N+1\), where N is the number of intervals of the mesh, and its condition number is of order \(h^{-2}\), where h denotes the size of the intervals of the mesh.

The paper is organized as follows. In Sect. 2, we define the HDG method for third-order KdV type equations and state and discuss our main results. The details of all the proofs are given in Sect. 3. We show numerical results in Sect. 4 and some concluding remarks in Sect. 5. The details on implementation of the method are in “Appendix”.

2 Main Results

In this section, we state and discuss our main results. We begin by describing the characterizations of the exact solution that the HDG method is a discrete version of. We then introduce our HDG method for KdV type equations, and state our stability result and optimal a priori error estimate.

2.1 Characterizations of the Exact Solution

To display the characterizations of the exact solution we are going to work with, let us first rewrite our third-order model equation as the following first-order system:

$$\begin{aligned} q - u_x \,=\, 0, \quad p - q_x\,=\, 0,\quad u_t+ p_x + F(u)_x\,=\,f \quad {\text {for} }\;x\in \Omega , t\in (0, T], \end{aligned}$$
(2.1a)

with the initial and boundary conditions

$$\begin{aligned} u\,&=\,u_0\quad {\text { in } }\Omega \text { for }t=0, \end{aligned}$$
(2.1b)
$$\begin{aligned} u \,&=\, u_D \quad {\text {on} }\; \partial \Omega , \end{aligned}$$
(2.1c)
$$\begin{aligned} q \,&=\, q_N \quad {\text {on} }\;\partial \Omega _N. \end{aligned}$$
(2.1d)

We partition the domain \(\Omega \) as

$$\begin{aligned} {\mathcal {T}}_h=\left\{ I_i:=(x_{i-1}, x_i){:}\,a=x_0<x_1<\cdots<x_{N-1}<x_N=b\right\} , \end{aligned}$$

and introduce the set of the boundaries of its elements, \(\partial {\mathcal {T}}_h:=\{ \partial I_i{:}\,i=1,\ldots ,N\}\). We also set \(\mathscr {E}_h:=\{x_i\}_{i=0}^N\), \(h_i := x_i - x_{i-1}\), and \(h:=\max _{1\le i\le N} h_i\).
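For illustration only (this snippet is ours, not part of the method), such a partition and the associated mesh data can be generated as follows, with hypothetical values of a, b, and N:

```python
import numpy as np

# Illustrative, nonuniform partition of Omega = (a, b); the values are hypothetical.
a, b, N = 0.0, 1.0, 8
nodes = a + (b - a) * np.linspace(0.0, 1.0, N + 1) ** 2  # a = x_0 < ... < x_N = b

h_i = np.diff(nodes)   # element sizes h_i = x_i - x_{i-1}
h = h_i.max()          # global mesh size h = max_i h_i
```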

We know that, when f is smooth enough, if we provide the values \(\{\widehat{u}_i\}_{i=0}^N\) and \(\{\widehat{p}_i\}_{i=1}^N\) and, for each \(i=1,\ldots ,N\), solve the local problem

$$\begin{aligned}&Q - U_x=0, \quad P - Q_x=0, \quad U_t+P_x + F(U)_x=f \quad \text { in } I_i,\\&U =u_0\;\; \text { for } t=0, \quad U(x^+_{i-1})=\widehat{u}_{i-1}, \quad U(x^-_{i})=\widehat{u}_{i}, \quad P(x^-_{i})=\widehat{p}_{i}, \end{aligned}$$

then (PQU) coincides with the solution (pqu) of (2.1) if and only if the transmission conditions

$$\begin{aligned} Q(x_i^-)=Q(x_i^+),\quad P(x_i^-)=P(x_i^+),\quad i=1,\ldots ,N-1 \end{aligned}$$

and the boundary conditions

$$\begin{aligned} U=u_D \;\;{\text {on} }\; \partial \Omega , \quad Q =q_N \;\; {\text {on} }\; \partial \Omega _N \end{aligned}$$

are satisfied. There are other possible characterizations of the exact solution corresponding to different choices of boundary data for the local problem; see [8]. Note that for these characterizations, the boundary data of the local problems are the unknowns of a global problem obtained from the transmission conditions and boundary conditions, and the system of equations for the global unknowns is square.

2.2 HDG Method

To define our HDG method, we first introduce the finite element spaces to be used. We let the approximations \((u_h, q_h, p_h, \widehat{u}_h, \widehat{q}_h, \widehat{p}_h)\) to \((u|_\Omega ,q|_\Omega ,p|_\Omega ,u|_{\mathscr {E}_h},q|_{\mathscr {E}_h},p|_{\mathscr {E}_h})\) be in the space \(W_h^k\times W_h^k \times W_h^k \times L^2(\mathscr {E}_h)\times L^2(\partial {\mathcal {T}}_h) \times L^2(\partial {\mathcal {T}}_h)\) where

$$\begin{aligned} {W}_h^k = \left\{ w\in L^2(\mathcal {T}_h){:}\,w|{_{I_i}} \in {P}_{k}(I_i)\quad \forall \; i=1,\ldots ,N\right\} . \end{aligned}$$

Here \(P_k(I_i)\) is the space of polynomials of degree at most k on the domain \(I_i\). For any function \(\zeta \) lying in \(L^2(\partial {\mathcal {T}}_h)\), we denote its values on \(\partial I_i:=\{x_{i-1}^+, x_i^-\}\) by \(\zeta (x_{i-1}^+)\) (or simply \(\zeta ^+_{i-1}\)) and \(\zeta (x_i^-)\) (or simply \(\zeta ^-_i\)). Note that \(\zeta (x_{i}^+)\) is not necessarily equal to \(\zeta (x_i^-)\). In contrast, for any \(\eta \) in the space \(L^2(\mathscr {E}_h)\), its value at \(x_i\), denoted \(\eta (x_i)\) (or simply \(\eta _i\)), is uniquely defined; in this case, \(\eta (x_i^-)\) and \(\eta (x_{i}^+)\) both mean nothing but \(\eta (x_i)\).

To obtain the HDG formulation, we use a discrete version of the characterization of the exact solution. Assuming that the values \(\{\widehat{u}_{hi}\}_{i=0}^N\) and \(\{\widehat{p}^{\;-}_{hi}\}_{i=1}^N\) are given, for each \(i=1,\ldots ,N\), we solve a local problem on the element \(I_i\) by using a Galerkin method. To describe it, let us introduce the following notation. By \((\varphi , v)_{I_i}\), we denote the integral of \(\varphi \) times v on the interval \(I_i\), and by \(\left\langle \varphi ,v n\right\rangle _{\partial I_i}\) we simply mean the expression \(\varphi (x_i^-)v(x_i^-)n(x_i^-) +\varphi (x_{i-1}^+)v(x_{i-1}^+)n(x_{i-1}^+)\). Here n denotes the outward unit normal to \(I_i\): \(n(x_{i-1}^+):=-1\) and \(n(x_i^-):=1\).

On the element \(I_i=(x_{i-1}, x_i)\), we give f and the boundary data \(\widehat{u}_{h\, i-1}, \widehat{u}_{h\,i}\) and \(\widehat{p}^{-}_{h\, i}\) and take the HDG approximate solutions \((p_h,q_h,u_h)\in P_{k}(I_i)\times P_{k}(I_i)\times P_{k}(I_i)\) to be the solution of the equations

$$\begin{aligned}&({q}_h,{v})_{I_i} + (u_h,v_x)_{I_i} - \left\langle \widehat{u}_h,{v}n \right\rangle _{\partial I_i} = 0, \\&({p}_h,{z})_{{I_i}} + (q_h,z_x)_{I_i} - \left\langle \widehat{q}_h,{z}n \right\rangle _{\partial I_i} = 0,\\&({{u}_h}_t,{w})_{I_i}-(p_h+F(u_h),w_x)_{I_i} + \left\langle \widehat{p}_h+\widehat{F}_h,{w}n \right\rangle _{\partial I_i} = (f,{w})_{I_i}, \end{aligned}$$

for all \((v, z, w)\,\in \,P_{k}(I_i)\times P_{k}(I_i) \times P_{k}(I_i)\), where the remaining undefined numerical traces are given by

$$\begin{aligned} {\left\{ \begin{array}{ll} \widehat{p}_h=p_h + \tau _{pu}\,({\widehat{u}_{h\,i-1}} - u_h)\,n &{} \text{ at } x_{i-1}^+,\\ \widehat{q}_h=q_h + \tau _{qu}\,( {\widehat{u}_{h\, i-1}} - u_h)\,n &{} \text{ at } x_{i-1}^+,\\ \widehat{q}_h= q_h + \tau _{qu} \,( {\widehat{u}_{h\, i}} - u_h)\,n +\tau _{qp}\,( {\widehat{p}^{\;-}_{h\,i}} -p_h)\,n&{} \text{ at } x_{i}^-,\\ \widehat{F}_h=F(\widehat{u}_h) -\tau _F(\widehat{u}_h, u_h) (\widehat{u}_h -u_h)\, n &{} \text{ at } x_{i-1}^+ \text{ and } x_i^-.\\ \end{array}\right. } \end{aligned}$$

The functions \(\tau _{qu}, \tau _{pu}, \tau _{qp}\), and \(\tau _F(\widehat{u}_h, u_h)\) are defined on \(\partial {\mathcal {T}}_h\) and are called the components of the stabilization function; they have to be properly chosen to ensure that the above problem has a unique solution. In particular, due to the nonlinearity of F, the function \(\tau _F(\cdot ,\cdot ){:}\,\partial \mathcal {T}_h\rightarrow \mathbb {R}\) can be nonlinear in terms of \(\widehat{u}_h\) and \(u_h\). In the case of \(F=0\), we simply take \(\tau _F=0\).

It remains to impose the transmission conditions

$$\begin{aligned}{}[\![\widehat{q}_h ]\!] (x_i) = 0 \quad \text{ and } \quad [\![\widehat{p}_h+\widehat{F}_h]\!] (x_i) = 0 \quad \text{ for } \text{ all } i =1,\ldots ,N-1, \end{aligned}$$

and the boundary conditions

$$\begin{aligned} \widehat{u}_h = u_D \;\;{\text {on} }\; \partial \Omega \quad \text{ and } \quad \widehat{q}_h = q_N \;\;{\text {on} }\; \partial \Omega _N. \end{aligned}$$

Here, \([\![\zeta ]\!](x_i):=\zeta (x_i^-)-\zeta (x_i^+)\). This completes the definition of the HDG methods using the characterization of the exact solution. Note that this way of defining the HDG methods immediately provides a way to implement them.

On the other hand, the above presentation of the HDG methods is not very well suited for their analysis. Thus, we now rewrite it in a more compact form using the notation

$$\begin{aligned} (\varphi , v):=\sum _{i=1}^N (\varphi , v)_{I_i}, \quad \left\langle \varphi , v n\right\rangle :=\sum _{i=1}^N \left\langle \varphi , v n\right\rangle _{\partial I_i}. \end{aligned}$$

Let

$$\begin{aligned} M_h(g):=\left\{ \zeta \in L^2(\mathscr {E}_h){:}\; \zeta |_{\partial \Omega }=g\right\} ,\quad \tilde{M}_h:= L^2(\mathscr {E}_h{\setminus }\{a\}). \end{aligned}$$

The approximation provided by the HDG method, \((u_h, q_h, p_h, \widehat{u}_h, \widehat{p}_h^{\,-})\), is the element of \(W_h^{k}\times W_h^{k} \times W_h^{k} \times M_h(u_D)\times \tilde{M}_h\) which solves the equations

$$\begin{aligned}&({q}_h,{v}) + (u_h,v_x) - \left\langle \widehat{u}_h,{v}n \right\rangle = 0, \end{aligned}$$
(2.2a)
$$\begin{aligned}&({p}_h,{z}) + (q_h,z_x) - \left\langle \widehat{q}_h,{z}n \right\rangle = 0, \end{aligned}$$
(2.2b)
$$\begin{aligned}&({u_h}_t,{w}) -(p_h+F(u_h),w_x) + \left\langle \widehat{p}_h+\widehat{F}_h,{w}n \right\rangle = (f,{w}), \end{aligned}$$
(2.2c)

and

$$\begin{aligned} \left\langle \widehat{q}_h, \mu n\right\rangle = \left\langle q_N, \mu n\right\rangle _{\partial \Omega _N}, \quad \left\langle \widehat{p}_h+\widehat{F}_h, \chi n\right\rangle = 0 \end{aligned}$$
(2.2d)

for all \((v, z, w,\mu ,\chi )\,\in \,W_h^{k}\times W_h^{k} \times W_h^{k}\times \tilde{M}_h\times M_h(0)\), where, on \(\partial {\mathcal {T}}_h\), we have

$$\begin{aligned} {\left\{ \begin{array}{ll} \widehat{p}_h^{\,+}=p_h^+ + \tau _{pu}^+\,(\widehat{u}_h - u_h^+)\,n^+,\\ \widehat{q}_h^{\,+}=q_h^+ + \tau _{qu}^+\,(\widehat{u}_{h} - u_h^+)\,n^+,\\ \widehat{q}_h^{\,-}=q_h^- + \tau _{qu}^-\,(\widehat{u}_h - u_h^-)\,n^- +\tau _{qp}^-\,({\widehat{p}_h^{\,-}} -p_h^-)\,n^-,\\ \widehat{F}_h= F(\widehat{u}_h) -\tau _F(\widehat{u}_h,u_h)\,(\widehat{u}_h -u_h)\,n. \end{array}\right. } \end{aligned}$$
(2.2e)

It is not difficult to define HDG methods that are associated to other characterizations of the exact solution, but these methods are actually the same, provided that the corresponding stabilization function allows for the transition from one characterization to the other; see [8, 16]. In fact, the choice of characterization is more relevant for the actual implementation of the HDG method than for its definition. The implementation of the HDG method (2.2) is discussed in the “Appendix”.

When the above scheme is discretized in time, we can choose the initial approximation \((u_h^0, q_h^0, p_h^0, \widehat{u}_h^0, \widehat{p}_h^0)\) to be the HDG approximate solution of the stationary equation \(v + v_{xxx} + F(v)_x = g,\) where \(g=u_0+(u_0)_{xxx}+F(u_0)_x\) and \(u_0\) is the initial data of the time-dependent problem (1.1); see [8] for HDG methods for stationary third-order equations. The initial approximation \((u_h^0, q_h^0, p_h^0, \widehat{u}_h^0, \widehat{p}_h^{0})\) is the element of \(W_h^{k}\times W_h^{k} \times W_h^{k} \times M_h(u_D)\times \tilde{M}_h\) which solves the equations

$$\begin{aligned}&({q}_h^0,{v}) + (u_h^0,v_x) - \left\langle \widehat{u}_h^0,{v}n \right\rangle = 0, \\&({p}_h^0,{z}) + (q_h^0,z_x) - \left\langle \widehat{q}_h^0,{z}n \right\rangle = 0,\\&({u_h^0},{w}) -(p_h^0+F(u_h^0),w_x) + \left\langle \widehat{p}_h^0+\widehat{F}_h^0,{w}n \right\rangle = (g,{w}),\\&\left\langle \widehat{q}_h^0, \mu n\right\rangle = \left\langle q_N, \mu n\right\rangle _{\partial \Omega _N},\quad \left\langle \widehat{p}_h^0+\widehat{F}_h^0, \chi n\right\rangle = 0 \end{aligned}$$

for all \((v, z, w,\mu ,\chi )\,\in \,W_h^{k}\times W_h^{k} \times W_h^{k}\times \tilde{M}_h\times M_h(0)\), where \(\widehat{q}_h^0, \widehat{p}_h^0\), and \(\widehat{F}_h^0\) are defined in the same way as \(\widehat{q}_h, \widehat{p}_h\), and \(\widehat{F}_h\) in (2.2e). Note that the equations above are almost the same as those in (2.2), except for the third one. This way of choosing initial data for time-dependent problems by solving the corresponding stationary problems has been used in [9, 12].

Next, we present our stability result and a priori error estimate of the HDG method under some conditions on the stabilization function.

2.3 Stability

To discuss the \(L^2\)-stability of the HDG method, we let

$$\begin{aligned} \tilde{\tau }(u_h, \widehat{u}_h):= \frac{1}{(u_h-\widehat{u}_h)^2}{\int _{\widehat{u}_h}^{u_h}(F(s)-F(\widehat{u}_h)) n\, ds}. \end{aligned}$$
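For instance, for the quadratic flux \(F(s)=\beta s^2\) (the case \(m=2\)), the integral above can be evaluated in closed form, giving \(\tilde{\tau } = \beta n\,(u_h+2\widehat{u}_h)/3\). The following symbolic check (ours, for illustration only) verifies this:

```python
import sympy as sp

u, uhat, beta, n, s = sp.symbols('u uhat beta n s')
F = lambda t: beta * t**2                    # quadratic flux F(s) = beta*s^2 (m = 2)

# tilde_tau = (1/(u - uhat)^2) * int_{uhat}^{u} (F(s) - F(uhat)) n ds
tilde_tau = sp.integrate((F(s) - F(uhat)) * n, (s, uhat, u)) / (u - uhat)**2

# Closed form: beta * n * (u + 2*uhat) / 3
assert sp.simplify(tilde_tau - beta * n * (u + 2*uhat) / 3) == 0
```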

We have the following stability result.

Theorem 2.1

Assume that \(u_D=q_N=0\). If the stabilization function satisfies

$$\begin{aligned} \begin{aligned}&(\tau _F^+-\tilde{\tau }^+) -\tau _{pu}^+ -\frac{1}{2}(\tau _{qu}^+)^2 \ge 0,\text { and } \\&(\tau _F^--\tilde{\tau }^-) +\frac{1}{2}(\tau _{qu}^-)^2\ge 0, \quad (\tau _F^--\tilde{\tau }^-)(\tau _{qp}^-)^2+\tau _{qu}^-\tau _{qp}^- -\frac{1}{2}\ge 0, \end{aligned} \end{aligned}$$
(2.3)

then for the HDG method (2.2), we have

$$\begin{aligned} \frac{d}{dt}\Vert u_h\Vert ^2 \le 2(f,u_h). \end{aligned}$$

Note that if the nonlinear term \(F=0\), then we have \(\tau _F =\tilde{\tau }=0\) and the condition (2.3) in the theorem above simplifies to

$$\begin{aligned} -\tau _{pu}^+ -\frac{1}{2}{(\tau _{qu}^+)}^2\ge 0\quad \text { and } \quad \tau _{qu}^-\tau _{qp}^- -\frac{1}{2}\ge 0. \end{aligned}$$
(2.4)
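As a sanity check (our own, with illustrative values not prescribed by the analysis), condition (2.4) is straightforward to test for a given choice of the stabilization function, e.g. \(\tau _{pu}^+=-1\), \(\tau _{qu}^+=0\), \(\tau _{qu}^-=\tau _{qp}^-=1\):

```python
def satisfies_2_4(tau_pu_p, tau_qu_p, tau_qu_m, tau_qp_m):
    """Check the simplified stability condition (2.4), valid when F = 0."""
    return (-tau_pu_p - 0.5 * tau_qu_p**2 >= 0.0
            and tau_qu_m * tau_qp_m - 0.5 >= 0.0)

# An admissible choice ...
assert satisfies_2_4(tau_pu_p=-1.0, tau_qu_p=0.0, tau_qu_m=1.0, tau_qp_m=1.0)
# ... and one violating the first inequality.
assert not satisfies_2_4(tau_pu_p=1.0, tau_qu_p=0.0, tau_qu_m=1.0, tau_qp_m=1.0)
```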

If \(F(u)\ne 0\), we just need to have \(\tau _F\ge \tilde{\tau }\) and take \(\tau _{qu}^\pm , \tau _{pu}^+\) and \(\tau _{qp}^-\) to satisfy (2.4). Since

$$\begin{aligned} \tilde{\tau }=\frac{1}{(u_h-\widehat{u}_h)^2}{\int _{\widehat{u}_h}^{u_h}F'(\xi )(s-\widehat{u}_h) n\, ds}\le \frac{1}{2} \sup _{s\in J(u_h, \widehat{u}_h)}|F'(s)|, \end{aligned}$$

where \(J(u_h, \widehat{u}_h)=[\min \{u_h,\widehat{u}_h\}, \max \{u_h,\widehat{u}_h\}]\), the stabilization function \(\tau _F\) satisfies the condition \(\tau _F\ge \tilde{\tau }\) if

$$\begin{aligned} \tau _F\ge \frac{1}{2}\sup _{s\in J(u_h, \widehat{u}_h)}|F'(s)|. \end{aligned}$$

For other choices of \(\tau _F\) that satisfy the condition \(\tau _F\ge \tilde{\tau }\), see [24].
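For the power-law flux \(F(u)=\beta u^m\) of (1.1) with \(m\ge 1\), the right-hand side of this bound is explicit, since \(|F'(s)|=|\beta |\,m\,|s|^{m-1}\) attains its supremum over \(J(u_h,\widehat{u}_h)\) at the endpoint of largest absolute value. A small sketch (ours, for illustration):

```python
def tau_F_lower_bound(u_h, u_hat, beta, m):
    """(1/2) * sup_{s in J(u_h, u_hat)} |F'(s)| for F(s) = beta * s**m, m >= 1.

    J is the interval with endpoints u_h and u_hat; |F'(s)| = |beta|*m*|s|**(m-1)
    is maximized at the endpoint of J with the largest absolute value.
    """
    return 0.5 * abs(beta) * m * max(abs(u_h), abs(u_hat)) ** (m - 1)

# F(s) = s**2 (beta = 1, m = 2): the bound is max(|u_h|, |u_hat|).
assert tau_F_lower_bound(2.0, -3.0, beta=1.0, m=2) == 3.0
```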

2.4 A Priori Error Estimate for Linear Equations

Now we consider the convergence properties of our HDG method for linear equations in which \(F=0\). We proceed as follows. We first define an auxiliary projection and state its optimal approximation property. Then, we provide an estimate for the \(L^2\)-norm of the projections of the errors in the primary and auxiliary variables.

Let us introduce a key auxiliary projection that is tailored to the numerical traces. The projection \(\varPi (u, q, p) := (\varPi u, \varPi {q},\varPi p)\) of a function \((u,q,p)\in H^1(\mathcal {T}_h)\times H^1(\mathcal {T}_h)\times H^1(\mathcal {T}_h)\) is defined as follows. On an element \({I_i}=(x_{i-1}, x_i)\), the projection is the element of \({{ P }}_{k}({I_i})\times { { P }}_{k}({I_i})\times P_{k}({I_i})\) which solves the following equations:

$$\begin{aligned} (\delta _u,v)_{I_i}&= 0 \quad \forall \,\,v \in { P }_{k-1}({I_i}), \end{aligned}$$
(2.5a)
$$\begin{aligned} (\delta _q,z)_{I_i}&= 0 \quad \forall \,\,z \in { P }_{k-1}({I_i}), \end{aligned}$$
(2.5b)
$$\begin{aligned} (\delta _p,w)_{I_i}&= 0 \quad \forall \,\,w \in { P }_{k-1}({I_i}), \end{aligned}$$
(2.5c)
$$\begin{aligned} \delta _p - \tau _{pu}^+\,\delta _u \,n&=0 \quad \text{ on } x_{i-1}^+, \end{aligned}$$
(2.5d)
$$\begin{aligned} \delta _q - \tau _{qu}^+\,\delta _u\,n&=0 \quad \text{ on } x_{i-1}^+, \end{aligned}$$
(2.5e)
$$\begin{aligned} \delta _q - \tau _{qu}^-\,\delta _u\,n -\tau _{qp}^-\,\delta _p\, n&=0 \quad \text{ on } x_i^-, \end{aligned}$$
(2.5f)

where we use the notation \(\delta _\omega := \omega - \varPi \omega \) for \(\omega = u, q\), and p. Note that the last three equations have exactly the same structure as the numerical traces of the HDG method in (2.2e).

The following result for the optimal approximation properties of the projection \(\varPi \) was shown in [8]. To state it, we use the following notation. The \(H^s(D)\)-norm is denoted by \(\Vert \cdot \Vert _{s, D}\). We drop the first subindex if \(s=0\), and the second one if \(D=\Omega \) or \(D={\mathcal T}_h\).

Lemma 2.2

Suppose that

$$\begin{aligned} \tau _{qu}^+ +\tau _{qu}^- -\tau _{pu}^+\tau _{qp}^- \ne 0. \end{aligned}$$
(2.6)

Then the projection \(\varPi \) in (2.5) is well defined on any interval \(I_i\). In addition, if \(\tau _{qu}^+, \tau _{qu}^-, \tau _{pu}^+\) and \(\tau _{qp}^-\) are constants, we have that, for \(\omega =u,q\) and p, there is a constant C such that

$$\begin{aligned} \Vert \omega -\varPi \omega \Vert _{I_i}&\le C\, h^{s+1} \quad \text{ for } s\in [1, {k}], \end{aligned}$$

provided \(\omega \in H^{s+1}(I_i)\).

Next, we provide estimates for the \(L^2\)-norms of the projections of the errors

$$\begin{aligned} \epsilon _u :=\; \varPi u - u_h,\quad \epsilon _{q} :=\; \varPi {q} - {q}_h, \quad \epsilon _{p} :=\; \varPi {p} - {p}_h, \end{aligned}$$

and deduce from them the estimates for the \(L^2\)-norms of the errors

$$\begin{aligned} e_u :=\; u - u_h,\quad e_{q} :=\;{q} - {q}_h, \quad e_{p} :=\; {p} - {p}_h. \end{aligned}$$

Theorem 2.3

Suppose that \(F(u)=0\) in the problem (2.1) and the exact solution \((u,q,p)\in W^{2,\infty }((0, T]; H^{k+1}(\mathcal {T}_h))\times W^{1,\infty }((0, T]; H^{k+1}(\mathcal {T}_h))\times W^{1,\infty }((0, T]; H^{k+1}(\mathcal {T}_h))\). If the stabilization function of the HDG method (2.2) satisfies the condition

$$\begin{aligned} \begin{aligned}&\tau _{qu}^->0, \,\,\tau _{qu}^-\tau _{qp}^-=1, \text { and }\\&\tau _{qu}^+\in [0, 1], \,\,\tau _{pu}^+\in \left[ -1-\sqrt{1-{\tau _{qu}^+}^2},-\frac{1}{2}-\frac{1}{2}{\tau _{qu}^+}^2\right] , \end{aligned} \end{aligned}$$
(2.7)

then for \(k>0\) and h small enough, we have

$$\begin{aligned} \Vert \epsilon _u(t)\Vert + \Vert \epsilon _q(t)\Vert + \Vert \epsilon _p(t)\Vert +\Vert {\epsilon _u}_t(t)\Vert \le Ch^{k+1}\quad \text { for } 0\le t\le T, \end{aligned}$$

where C is independent of h.

It is easy to see that if the stabilization function satisfies the condition (2.7), then it also satisfies the conditions (2.4) and (2.6). Using Lemma 2.2, Theorem 2.3 and the triangle inequality, we immediately get the following \(L^2\) error estimate for the actual errors.

Theorem 2.4

Suppose that the hypotheses of Theorem 2.3 are satisfied. Then we have

$$\begin{aligned} \Vert e_u(t)\Vert +\Vert e_q(t)\Vert +\Vert e_p(t)\Vert +\Vert e_{u_t}(t)\Vert \le Ch^{k+1} \quad \text { for } 0\le t\le T, \end{aligned}$$

where C is independent of h.
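The claim that (2.7) implies (2.4) and (2.6) can also be spot-checked numerically. The sketch below (ours, sampling illustrative values with \(\tau _{qu}^-=1\), hence \(\tau _{qp}^-=1\)) verifies both conditions over the admissible range of \((\tau _{qu}^+, \tau _{pu}^+)\):

```python
import math

def satisfies_2_4_and_2_6(tau_qu_m, tau_qu_p, tau_pu_p):
    tau_qp_m = 1.0 / tau_qu_m                       # enforce tau_qu^- * tau_qp^- = 1
    c24 = (-tau_pu_p - 0.5 * tau_qu_p**2 >= 0.0
           and tau_qu_m * tau_qp_m - 0.5 >= 0.0)    # stability condition (2.4)
    c26 = abs(tau_qu_p + tau_qu_m - tau_pu_p * tau_qp_m) > 1e-12  # condition (2.6)
    return c24 and c26

# Sample (2.7): tau_qu^+ in [0, 1] and
# tau_pu^+ in [-1 - sqrt(1 - (tau_qu^+)^2), -1/2 - (tau_qu^+)^2 / 2].
for tau_qu_p in (0.0, 0.5, 1.0):
    lo = -1.0 - math.sqrt(1.0 - tau_qu_p**2)
    hi = -0.5 - 0.5 * tau_qu_p**2
    for t in (0.0, 0.5, 1.0):
        assert satisfies_2_4_and_2_6(1.0, tau_qu_p, lo + t * (hi - lo))
```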

3 Proofs

In this section, we provide detailed proofs of our main results. We first prove Theorem 2.1 on the \(L^2\)-stability of the HDG method for general KdV type equations. Then we combine several energy identities to prove the error estimate in Theorem 2.3 for linear third-order equations.

3.1 \(L^2\)-Stability

Now let us prove Theorem 2.1 on the stability of the HDG method for the KdV equation. We treat the nonlinear term in a way similar to that in [24].

Proof

Taking \(w =u_h, v=-p_h\) and \(z=q_h\) in (2.2a)–(2.2c) and adding the three equations together, we get

$$\begin{aligned} (f, u_h)&=({u_h}_t,u_h)-(p_h+F(u_h), {u_h}_x) + \left\langle \widehat{p}_h+\widehat{F}_h, u_h n\right\rangle \\&\quad -\,(q_h, p_h)- (u_h, {p_h}_x) + \left\langle \widehat{u}_h, p_h n\right\rangle \\&\quad +\,(p_h,q_h)+(q_h, {q_h}_x)-\left\langle \widehat{q}_h, q_h n\right\rangle . \end{aligned}$$

Using integration by parts and (2.2d), we have

$$\begin{aligned} \begin{aligned} (f,u_h)&=\frac{1}{2}\frac{d}{dt}\Vert u_h\Vert ^2 -(F(u_h),{u_h}_x)-\left\langle \widehat{p}_h+\widehat{F}_h-p_h, (\widehat{u}_h-u_h)n \right\rangle \\&\quad +\,\frac{1}{2}\left\langle (\widehat{q}_h-q_h)^2, n\right\rangle + \frac{1}{2}\widehat{q}_h^{\, 2}(x_0). \end{aligned} \end{aligned}$$
(3.1)

Let G(s) be such that \(d G(s)/ds=F(s)\). It is easy to see that

$$\begin{aligned} -(F(u_h),{u_h}_x)= -\left( \frac{d}{dx} G(u_h), 1\right) = -\left\langle G(u_h), n\right\rangle =-\left\langle \int _{\widehat{u}_h}^{u_h} F(s) ds, n\right\rangle . \end{aligned}$$

Using it for the second term on the right hand side of (3.1), we get that

$$\begin{aligned} (f,u_h)=\frac{1}{2}\frac{d}{dt}\Vert u_h\Vert ^2+ \Phi + \frac{1}{2}\widehat{q}_h^{\, 2}(x_0), \end{aligned}$$

where

$$\begin{aligned} \Phi&= -\left\langle \int _{\widehat{u}_h}^{u_h} (F(s)-F(\widehat{u}_h)) ds, n\right\rangle -\left\langle \widehat{F}_h-F(\widehat{u}_h), (\widehat{u}_h-u_h)n \right\rangle \\&\quad -\,\left\langle \widehat{p}_h-p_h, (\widehat{u}_h-u_h)n \right\rangle +\frac{1}{2}\left\langle (\widehat{q}_h-q_h)^2, n\right\rangle . \end{aligned}$$

Next, we just need to show that \(\Phi \ge 0\). Let

$$\begin{aligned} \tilde{\tau }:=\frac{1}{(\widehat{u}_h-u_h)^2}\int _{\widehat{u}_h}^{u_h} (F(s)-F(\widehat{u}_h))n ds. \end{aligned}$$

Using the definition of \(\widehat{F}_h\) in (2.2e), we have

$$\begin{aligned} \Phi =\left\langle \tau _F-\tilde{\tau },(\widehat{u}_h-u_h)^2 \right\rangle -\left\langle \widehat{p}_h-p_h, (\widehat{u}_h-u_h)n\right\rangle +\frac{1}{2}\left\langle (\widehat{q}_h-q_h)^2, n\right\rangle . \end{aligned}$$

By the definition of \(\widehat{p}_h\) and \(\widehat{q}_h\) in (2.2e), we get

$$\begin{aligned} \Phi ^+&:=\Phi |_{\partial \mathcal {T}_h^+} =\left\langle \tau _F^+-\tilde{\tau }^+-\tau _{pu}^+ -\frac{1}{2}(\tau _{qu}^+)^2, (\widehat{u}_h-u_h)^2 \right\rangle _{\partial \mathcal {T}_h^+},\\ \Phi ^-&:=\Phi |_{\partial \mathcal {T}_h^-}= \left\langle \tau _F^- -\tilde{\tau }^-+\frac{1}{2}(\tau _{qu}^-)^2, (\widehat{u}_h-u_h)^2 \right\rangle _{\partial \mathcal {T}_h^-}+\left\langle \frac{1}{2}(\tau _{qp}^-)^2, (\widehat{p}_h-p_h)^2\right\rangle _{\partial \mathcal {T}_h^-}\\&\quad +\,\left\langle \tau _{qu}^-\tau _{qp}^- -1, (\widehat{p}_h-p_h)(\widehat{u}_h-u_h)n \right\rangle _{\partial \mathcal {T}_h^-}. \end{aligned}$$

It is easy to check that if the stabilization function satisfies the condition (2.3), then we get \(\Phi ^+\ge 0\) and \(\Phi ^-\ge 0\). This shows that

$$\begin{aligned} \frac{1}{2}\frac{d}{dt}\Vert u_h\Vert ^2\le (f,u_h). \end{aligned}$$

\(\square \)
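The identity \(-(F(u_h),{u_h}_x)=-\left\langle \int _{\widehat{u}_h}^{u_h} F(s)\, ds, n\right\rangle\) used in the proof is simply the chain rule combined with the fundamental theorem of calculus on each element. A symbolic check (ours, with an arbitrary polynomial in place of \(u_h\) and the sample flux \(F(s)=s^2\)) confirms the elementwise identity:

```python
import sympy as sp

x, x0, x1 = sp.symbols('x x0 x1')      # x0, x1 play the roles of x_{i-1}, x_i
u = 1 + 2*x - x**3                     # an arbitrary polynomial standing in for u_h
F = lambda t: t**2                     # sample flux; antiderivative G(t) = t**3 / 3
G = lambda t: t**3 / 3

lhs = sp.integrate(F(u) * sp.diff(u, x), (x, x0, x1))   # (F(u_h), u_h,x)_{I_i}
rhs = G(u.subs(x, x1)) - G(u.subs(x, x0))               # <G(u_h), n>_{partial I_i}
assert sp.simplify(lhs - rhs) == 0
```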

3.2 Error Analysis

In this section, we prove the optimal error estimate for the projections of the errors in Theorem 2.3 for linear equations with \(F=0\). First, we obtain the equations for the projections of the errors.

3.2.1 The Error Equations

From the equations defining the HDG method, (2.2a)–(2.2c), and the fact that the exact solution also satisfies these equations, we obtain the following error equations:

$$\begin{aligned}&(e_q,{v}) + (e_u,v_x) - \left\langle \widehat{e}_u,{v} n\right\rangle =0,\\&(e_p,{z}) + (e_q,z_x) -\left\langle \widehat{e}_q,{z}n \right\rangle =0,\\&({e_u}_t,{w}) -(e_p,w_x) + \left\langle \widehat{e}_p,{w} n\right\rangle =0, \end{aligned}$$

for all \((v, z, w)\,\in \,W_h^{k}\times W_h^{k} \times W_h^{k}\), where \(\widehat{e}_\omega =\omega -\widehat{\omega }_h\) for \(\omega =u, q\), and p. From (2.2e) and (2.2d), it is easy to see that

$$\begin{aligned} {\left\{ \begin{array}{ll} \widehat{e}_p^{\,+}&{}=e_p^+ +\tau _{pu}^+\,(\widehat{e}_u - e_u^+)\,n^+,\\ \widehat{e}_q^{\,+}&{}=e_q^+ + \tau _{qu}^+\,(\widehat{e}_{u} - e_u^+)\,n^+,\\ \widehat{e}_q^{\,-}&{}=e_q^- + \tau _{qu}^-\,(\widehat{e}_u - e_u^-)\,n^- +\tau _{qp}^-\,({\widehat{e}_p^{\,-}} -e_p^-)\,n^-, \end{array}\right. } \end{aligned}$$

and

$$\begin{aligned} \left\langle \widehat{e}_q, \mu \,n\right\rangle = 0, \quad \left\langle \widehat{e}_p, \chi \,n\right\rangle = 0 \end{aligned}$$

for all \((\mu ,\chi )\in \tilde{M}_h\times M_h(0)\). Now we set

$$\begin{aligned} \widehat{\epsilon }_u=\widehat{e}_u\quad \text { and }\quad \widehat{\epsilon }_p^{\,-}=\widehat{e}_p^{\,-}, \end{aligned}$$

and let

$$\begin{aligned} {\left\{ \begin{array}{ll} \widehat{\epsilon }_p^{\,+}= \epsilon _p^+ + \tau _{pu}^+\,(\widehat{\epsilon }_u - \epsilon _u^+)\,n^+,\\ \widehat{\epsilon }_q^{\,+}= \epsilon _q^+ + \tau _{qu}^+\,(\widehat{\epsilon }_u - \epsilon _u^+)\,n^+,\\ \widehat{\epsilon }_q^{\,-}= \epsilon _q^- + \tau _{qu}^-\,(\widehat{\epsilon }_u - \epsilon _u^-)\,n^- +\tau _{qp}^-\,({\widehat{\epsilon }_p^{\,-}} -\epsilon _p^-)\,n^-. \end{array}\right. } \end{aligned}$$
(3.2)

Using Eqs. (2.5d)–(2.5f), after some simple algebraic manipulations we get that

$$\begin{aligned} \widehat{\epsilon }_p^{\,+}=\widehat{e}_p^{\,+} \quad \text { and } \quad \widehat{\epsilon }_q^{\,\pm }=\widehat{e}_q^{\,\pm }. \end{aligned}$$

Therefore, by the definition of the projection \(\varPi \), (2.5a)–(2.5c), we easily obtain the following equations for the projections of the errors:

$$\begin{aligned}&(\epsilon _q,{v})+(\delta _q, v) + (\epsilon _u,v_x) - \left\langle \widehat{\epsilon }_u,{v} n\right\rangle =0, \end{aligned}$$
(3.3a)
$$\begin{aligned}&(\epsilon _p,{z})+(\delta _p, z) + (\epsilon _q,z_x) - \left\langle \widehat{\epsilon }_q,{z}n \right\rangle =0, \end{aligned}$$
(3.3b)
$$\begin{aligned}&(\epsilon _{ut},{w})+(\delta _{ut}, w)-(\epsilon _p,w_x) + \left\langle \widehat{\epsilon }_p,{w} n\right\rangle =0, \end{aligned}$$
(3.3c)
$$\begin{aligned}&\left\langle \widehat{\epsilon }_q, \mu \,n \right\rangle = 0, \quad \left\langle \widehat{\epsilon }_p, \chi \,n\right\rangle = 0 \end{aligned}$$
(3.3d)

for all \((v, z, w,\mu ,\chi )\,\in \,W_h^{k }\times W_h^{k } \times W_h^{k }\times \tilde{M}_h\times M_h(0)\).

3.2.2 Energy Identities

To prove the \(L^2\)-error estimate in Theorem 2.3, we begin by establishing a key identity involving the quantity

$$\begin{aligned} \Vert \epsilon \Vert ^2:=\Vert \epsilon _u\Vert ^2+\Vert \epsilon _q\Vert ^2+\Vert \epsilon _p\Vert ^2+\Vert \epsilon _{ut}\Vert ^2 \end{aligned}$$

by energy arguments.

Lemma 3.1

We have that

$$\begin{aligned} \frac{1}{2}\frac{d}{dt}\Vert \epsilon \Vert ^2+ S+\Psi =0, \end{aligned}$$

where

$$\begin{aligned} S&=(\delta _{ut},\epsilon _u)-(\delta _q, \epsilon _p)+(\delta _p, \epsilon _q) +({\delta _q}_t, \epsilon _q)+(\delta _p, \epsilon _{ut})-(\delta _{ut},\epsilon _p)\\&\quad +\,({\delta _p}_t,\epsilon _p) -({\delta _u}_t, {\epsilon _q}_t) +({\delta _q}_t,\epsilon _{ut}) +(\delta _{utt},\epsilon _{ut}) -({\delta _q}_t, {\epsilon _p}_t)+({\delta _p}_t, {\epsilon _q}_t),\\ \Psi&= -\left\langle \widehat{\epsilon }_p-\epsilon _p, (\widehat{\epsilon }_u-\epsilon _u)n\right\rangle +\frac{1}{2}\left\langle (\widehat{\epsilon }_q-\epsilon _q)^2, n\right\rangle \\&\quad +\,\left\langle \widehat{\epsilon }_q-\epsilon _q, (\widehat{\epsilon }_{ut}-\epsilon _{ut})n\right\rangle +\frac{1}{2}\left\langle (\widehat{\epsilon }_p-\epsilon _p)^2, n\right\rangle \\&\quad +\,\left\langle \widehat{\epsilon }_{qt}-\epsilon _{qt}, (\widehat{\epsilon }_p-\epsilon _p)n\right\rangle +\frac{1}{2}\left\langle (\widehat{\epsilon }_{ut}-\epsilon _{ut})^2, n\right\rangle \\&\quad -\,\left\langle \widehat{\epsilon }_{pt}-\epsilon _{pt},(\widehat{\epsilon }_{ut}-\epsilon _{ut})n\right\rangle +\frac{1}{2}\left\langle (\widehat{\epsilon }_{qt}-\epsilon _{qt})^2,n\right\rangle \\&\quad +\,\frac{1}{2}\widehat{\epsilon }_q^{\;2}(x_0)+\frac{1}{2}(\widehat{\epsilon }_p+\widehat{\epsilon }_{qt})^2(x_0)-\frac{1}{2}\widehat{\epsilon }_p^{\;2}(x_N). \end{aligned}$$

Proof

Differentiating the error equations (3.3a)–(3.3c) with respect to t, we get

$$\begin{aligned}&({\epsilon _q}_t,{v})+({\delta _q}_t, v) + ({\epsilon _u}_t,v_x) - \left\langle \widehat{\epsilon }_{ut},{v} n\right\rangle =0, \end{aligned}$$
(3.4a)
$$\begin{aligned}&({\epsilon _p}_t,{z})+({\delta _p}_t, z) + ({\epsilon _q}_t,z_x) -\left\langle \widehat{\epsilon }_{qt},{z}n \right\rangle =0, \end{aligned}$$
(3.4b)
$$\begin{aligned}&({\epsilon _u}_{tt},{w})+({\delta _u}_{tt}, w)-({\epsilon _p}_t,w_x)+ \left\langle \widehat{\epsilon }_{pt},{w} n\right\rangle =0. \end{aligned}$$
(3.4c)

Next, we use (3.3) and (3.4) to get four energy identities.

(i) Taking \(w=\epsilon _u, v=-\epsilon _p,\) and \(z=\epsilon _q\) in (3.3) and adding the three equations together, we have

$$\begin{aligned} 0&=({{\epsilon }_u}_t,\epsilon _u)+(\delta _{ut},\epsilon _u)-(\epsilon _p, {{\epsilon }_u}_x) + \left\langle \widehat{{\epsilon }}_p, {\epsilon }_u n\right\rangle \\&\quad -\,({\epsilon }_q, {\epsilon }_p)- (\delta _q, {\epsilon }_p)-({\epsilon }_u, {{\epsilon }_p}_x) + \left\langle \widehat{{\epsilon }}_u, {\epsilon }_p n\right\rangle \\&\quad +\,({\epsilon }_p, {\epsilon }_q)+(\delta _p, {\epsilon }_q)+ ({\epsilon }_q, {{\epsilon }_q}_x)-\left\langle \widehat{{\epsilon }}_q, {\epsilon }_q n\right\rangle . \end{aligned}$$

Using integration by parts, (3.3d), and the fact that

$$\begin{aligned} \widehat{{\epsilon }}_u|_{\partial \Omega }=\widehat{e}_u|_{\partial \Omega }=0, \quad \widehat{{\epsilon }}_q|_{\partial \Omega _N}=\widehat{e}_q|_{\partial \Omega _N}=0, \end{aligned}$$

we get

$$\begin{aligned} \begin{aligned} 0&=\frac{1}{2}\frac{d}{dt}\Vert {\epsilon }_u\Vert ^2 +({\delta _u}_t,\epsilon _u) -(\delta _q,{\epsilon }_p)+(\delta _p,{\epsilon }_q)\\&\quad -\,\left\langle \widehat{{\epsilon }}_p-{\epsilon }_p, (\widehat{{\epsilon }}_u-{\epsilon }_u)n \right\rangle +\frac{1}{2}\left\langle (\widehat{{\epsilon }}_q-{\epsilon }_q)^2, n\right\rangle +\frac{1}{2}\widehat{\epsilon }_q^{\;2}(x_0). \end{aligned} \end{aligned}$$
(3.5)

(ii) Similar to (i), taking \(v={\epsilon }_q\) in (3.4a), \(z={{\epsilon }_u}_t\) in (3.3b), and \(w=-{\epsilon }_p\) in (3.3c) and adding the three equations together, we get

$$\begin{aligned} \begin{aligned} 0&= \frac{1}{2}\frac{d}{dt} \Vert {\epsilon }_q\Vert ^2 + ({\delta _q}_t,{\epsilon }_q) +(\delta _p, {{\epsilon }_u}_t) -({\delta _u}_t, {\epsilon }_p)\\&\quad +\,\left\langle \widehat{{\epsilon }}_q -{\epsilon }_q, (\widehat{{\epsilon }}_{ut}-{\epsilon }_{ut})n\right\rangle +\frac{1}{2}\left\langle (\widehat{{\epsilon }}_p -{\epsilon }_p)^2, n\right\rangle -\frac{1}{2} \widehat{\epsilon }_p^{\;2}(x_N)+\frac{1}{2} \widehat{\epsilon }_p^{\;2}(x_0). \end{aligned} \end{aligned}$$
(3.6)

(iii) Taking \(v={{\epsilon }_u}_t\) in (3.4a), \(z={\epsilon }_p\) in (3.4b), and \(w=-{{\epsilon }_{q}}_t\) in (3.3c) and adding the equations together, we get

$$\begin{aligned} \begin{aligned} 0&= \frac{1}{2}\frac{d}{dt} \Vert {\epsilon }_p\Vert ^2 + ({\delta _p}_t,{\epsilon }_p) +({\delta _{q}}_t, {{\epsilon }_u}_t) -({\delta _u}_t, {{\epsilon }_{q}}_t)\\&\quad +\,\left\langle \widehat{{\epsilon }}_{qt} -{\epsilon }_{qt}, (\widehat{{\epsilon }}_{p}-{\epsilon }_{p})n\right\rangle +\frac{1}{2}\left\langle (\widehat{{\epsilon }}_{ut} -{\epsilon }_{ut})^2, n\right\rangle +\widehat{\epsilon }_{qt}\widehat{\epsilon }_p (x_0). \end{aligned} \end{aligned}$$
(3.7)

(iv) Taking \(v=-{{\epsilon }_p}_t, z= {{\epsilon }_q}_t\), and \(w={{\epsilon }_u}_t\) in (3.4a)–(3.4c) and adding the equations together, we get

$$\begin{aligned} \begin{aligned} 0&= \frac{1}{2}\frac{d}{dt} \Vert {{\epsilon }_u}_t\Vert ^2 + ({\delta _u}_{tt},{{\epsilon }_u}_t) -({\delta _{q}}_t, {{\epsilon }_p}_t) +({\delta _p}_t, {{\epsilon }_{q}}_t)\\&\quad -\,\left\langle \widehat{{\epsilon }}_{pt} -{\epsilon }_{pt}, (\widehat{{\epsilon }}_{ut}-{\epsilon }_{ut})n\right\rangle +\frac{1}{2}\left\langle (\widehat{{\epsilon }}_{qt} -{\epsilon }_{qt})^2, n\right\rangle +\frac{1}{2}\widehat{\epsilon }_{qt}^{\;2}(x_0). \end{aligned} \end{aligned}$$
(3.8)

The proof is completed by adding the four equations (3.5)–(3.8) together. \(\square \)

3.2.3 Proof of the \(L^2\)-Error Estimate

Using Lemma 3.1, we first get the following result.

Lemma 3.2

If the stabilization function satisfies the condition (2.7), then we have

$$\begin{aligned} \Vert {\epsilon }(t)\Vert ^2 \le \Vert {\epsilon }(0)\Vert ^2 +\Theta (0)+ \int _0^t\widehat{\epsilon }_p^{\;2}(x_N)\,dt+2\,|\int _0^t S\, dt| \quad \text { for } 0\le t\le T, \end{aligned}$$

where

$$\begin{aligned} \Theta =\left\langle \tau _{qu}^+ - \tau _{pu}^+ \tau _{qu}^+, (\widehat{{\epsilon }}_u-{\epsilon }_u)^2 \right\rangle _{\partial \mathcal {T}_h^+} +\left\langle 1, \tau _{qu}^-(\widehat{{\epsilon }}_u-{\epsilon }_u)^2+\tau _{qp}^-(\widehat{{\epsilon }}_p-{\epsilon }_p)^2\right\rangle _{\partial \mathcal {T}_h^-}, \end{aligned}$$

and S is the same as in Lemma 3.1.

Proof

Using the definitions of \(\widehat{{\epsilon }}_p^+\) and \(\widehat{{\epsilon }}_q\) in (3.2), we can split the term \(\Psi \) in Lemma 3.1 as

$$\begin{aligned} \Psi = \Psi ^+ +\Psi ^-, \end{aligned}$$

where

$$\begin{aligned} \Psi ^+&= - \left\langle \tau _{pu}^+, (\widehat{{\epsilon }}_u-{\epsilon }_u)^2\right\rangle _{\partial \mathcal {T}_h^+} - \frac{1}{2}\left\langle (\tau _{qu}^+)^2, (\widehat{{\epsilon }}_u-{\epsilon }_u)^2\right\rangle _{\partial \mathcal {T}_h^+}\\&\quad +\,\left\langle \tau _{qu}^+, (\widehat{{\epsilon }}_u-{\epsilon }_u)(\widehat{{\epsilon }}_u-{\epsilon }_u)_t \right\rangle _{\partial \mathcal {T}_h^+} - \frac{1}{2}\left\langle (\tau _{pu}^+)^2, (\widehat{{\epsilon }}_u-{\epsilon }_u)^2\right\rangle _{\partial \mathcal {T}_h^+}\\&\quad -\,\left\langle \tau _{pu}^+ \tau _{qu}^+, (\widehat{{\epsilon }}_u-{\epsilon }_u)(\widehat{{\epsilon }}_u-{\epsilon }_u)_t \right\rangle _{\partial \mathcal {T}_h^+} -\frac{1}{2}\left\langle 1, (\widehat{{\epsilon }}_{ut}-{\epsilon }_{ut})^2 \right\rangle _{\partial \mathcal {T}_h^+}\\&\quad -\,\left\langle \tau _{pu}^+, (\widehat{{\epsilon }}_{ut}-{\epsilon }_{ut})^2 \right\rangle _{\partial \mathcal {T}_h^+} -\frac{1}{2}\left\langle (\tau _{qu}^+)^2, (\widehat{{\epsilon }}_{ut}-{\epsilon }_{ut})^2\right\rangle _{\partial \mathcal {T}_h^+}\\&\quad +\,\frac{1}{2}\widehat{\epsilon }_q^{\;2}(x_0)+\frac{1}{2}(\widehat{\epsilon }_p+\widehat{\epsilon }_{qt})^2(x_0) \end{aligned}$$

and

$$\begin{aligned} \Psi ^-&= -\left\langle \widehat{{\epsilon }}_p-{\epsilon }_p, \widehat{{\epsilon }}_u-{\epsilon }_u\right\rangle _{\partial \mathcal {T}_h^-} + \frac{1}{2}\left\langle 1,(\tau _{qu}^-(\widehat{{\epsilon }}_u-{\epsilon }_u)+\tau _{qp}^-(\widehat{{\epsilon }}_p-{\epsilon }_p))^2\right\rangle _{\partial \mathcal {T}_h^-}\\&\quad +\,\left\langle \tau _{qu}^-(\widehat{{\epsilon }}_u-{\epsilon }_u)+\tau _{qp}^-(\widehat{{\epsilon }}_p-{\epsilon }_p), (\widehat{{\epsilon }}_u-{\epsilon }_u)_t \right\rangle _{\partial \mathcal {T}_h^-} + \frac{1}{2}\left\langle 1,(\widehat{{\epsilon }}_p-{\epsilon }_p)^2\right\rangle _{\partial \mathcal {T}_h^-}\\&\quad +\,\left\langle \tau _{qu}^-(\widehat{{\epsilon }}_u-{\epsilon }_u)_t + \tau _{qp}^-(\widehat{{\epsilon }}_p-{\epsilon }_p)_t, \widehat{{\epsilon }}_p-{\epsilon }_p\right\rangle _{\partial \mathcal {T}_h^-} +\frac{1}{2}\left\langle 1, (\widehat{{\epsilon }}_{ut}-{\epsilon }_{ut})^2\right\rangle _{\partial \mathcal {T}_h^-}\\&\quad -\,\left\langle \widehat{{\epsilon }}_{pt}-{\epsilon }_{pt}, \widehat{{\epsilon }}_{ut}-{\epsilon }_{ut}\right\rangle _{\partial \mathcal {T}_h^-} +\frac{1}{2}\left\langle 1,(\tau _{qu}^-(\widehat{{\epsilon }}_{ut}-{\epsilon }_{ut})+\tau _{qp}^-( \widehat{{\epsilon }}_{pt}-{\epsilon }_{pt}))^2\right\rangle _{\partial \mathcal {T}_h^-}\\&\quad -\,\frac{1}{2}\widehat{\epsilon }_p^{\;2}(x_N). \end{aligned}$$

We can rewrite the term \(\Psi ^+ \) as

$$\begin{aligned} \Psi ^+= \Gamma _1 + \frac{1}{2}\frac{d}{dt} \Theta _1, \end{aligned}$$

where

$$\begin{aligned} \Gamma _1&= \left\langle -\tau _{pu}^+ -\frac{1}{2} (\tau _{qu}^+)^2 -\frac{1}{2}(\tau _{pu}^+)^2, (\widehat{{\epsilon }}_u-{\epsilon }_u)^2 \right\rangle _{\partial \mathcal {T}_h^+} \\&\quad +\,\left\langle -\frac{1}{2}-\tau _{pu}^+ -\frac{1}{2}(\tau _{qu}^+)^2, (\widehat{{\epsilon }}_{ut}-{\epsilon }_{ut})^2 \right\rangle _{\partial \mathcal {T}_h^+} +\frac{1}{2}\widehat{\epsilon }_q^{\;2}(x_0)+\frac{1}{2}(\widehat{\epsilon }_p+\widehat{\epsilon }_{qt})^2(x_0),\\ \Theta _1&= \left\langle \tau _{qu}^+ - \tau _{pu}^+ \tau _{qu}^+, (\widehat{{\epsilon }}_u-{\epsilon }_u)^2 \right\rangle _{\partial \mathcal {T}_h^+}. \end{aligned}$$

Similarly, if we assume that \(\tau _{qu}^-\tau _{qp}^-=1\), after some calculations we get

$$\begin{aligned} \Psi ^-= \Gamma _2+ \frac{1}{2}\frac{d}{dt} \Theta _2-\frac{1}{2}\widehat{\epsilon }_p^{\;2}(x_N), \end{aligned}$$

where

$$\begin{aligned} \Gamma _2&= \left\langle \left( \frac{1}{2}\tau _{qu}^-\right) ^2, (\widehat{{\epsilon }}_u-{\epsilon }_u)^2\right\rangle _{\partial \mathcal {T}_h^-} +\left\langle \frac{1}{2}, \Big (\tau _{qp}^-(\widehat{{\epsilon }}_p-{\epsilon }_p)+(\widehat{{\epsilon }}_u-{\epsilon }_u)_t\Big )^2\right\rangle _{\partial \mathcal {T}_h^-} \\&\quad +\,\left\langle \left( \frac{1}{2}\tau _{qp}^-\right) ^2, (\widehat{{\epsilon }}_{pt}-{\epsilon }_{pt})^2\right\rangle _{\partial \mathcal {T}_h^-} + \left\langle \frac{1}{2}, \Big ( (\widehat{{\epsilon }}_p-{\epsilon }_p)+ \tau _{qu}^-(\widehat{{\epsilon }}_u-{\epsilon }_u)_t \Big )^2\right\rangle _{\partial \mathcal {T}_h^-},\\ \Theta _2&= \left\langle 1, \tau _{qu}^-(\widehat{{\epsilon }}_u-{\epsilon }_u)^2+\tau _{qp}^-(\widehat{{\epsilon }}_p-{\epsilon }_p)^2\right\rangle _{\partial \mathcal {T}_h^-}. \end{aligned}$$

So from Lemma 3.1 we get

$$\begin{aligned} \frac{1}{2}\frac{d}{dt}(\Vert {\epsilon }\Vert ^2 +\Theta _1 +\Theta _2) +\Gamma _1+\Gamma _2 = \frac{1}{2}\widehat{\epsilon }_p^{\;2}(x_N)-S. \end{aligned}$$
(3.9)

Now we integrate (3.9) with respect to \(t\) and get

$$\begin{aligned}&\frac{1}{2}\Big (\Vert {\epsilon }(t)\Vert ^2+ \Theta _1(t)+\Theta _2(t)\Big )+\int _0^t (\Gamma _1+\Gamma _2)dt\\&\quad = \frac{1}{2}\Big (\Vert {\epsilon }(0)\Vert ^2 +\Theta _1(0)+\Theta _2(0)\Big )+\frac{1}{2}\int _0^t\widehat{\epsilon }_p^{\;2}(x_N)dt-\int _0^t S \,dt. \end{aligned}$$

It is easy to check that if \(\tau _{qu}^\pm , \tau _{pu}^+\) and \(\tau _{qp}^-\) satisfy the condition (2.7), we have

$$\begin{aligned} \Theta _1\ge 0, \;\;\Theta _2\ge 0,\;\; \Gamma _1\ge 0,\;\; \Gamma _2\ge 0 \;\; \text { for any } t\in [0, T]. \end{aligned}$$

Therefore,

$$\begin{aligned} \Vert {\epsilon }(t)\Vert ^2 \le \Vert {\epsilon }(0)\Vert ^2 +\Theta (0)+ \int _0^t\widehat{\epsilon }_p^{\;2}(x_N)\,dt+2\,|\int _0^t S \,dt|, \end{aligned}$$

where \(\Theta =\Theta _1+\Theta _2.\) \(\square \)

To prove Theorem 2.3, we also need the following lemma on the errors of the initial approximations at \(t=0\) (see Theorems 2.2 and 2.3 in [8]).

Lemma 3.3

If \(\tau _{qu}^\pm , \tau _{pu}^+, \tau _{qp}^-\) satisfy the condition (2.6), then for \(k>0\),

$$\begin{aligned}&\Vert {\epsilon }_u(0)\Vert +\Vert {\epsilon }_q(0)\Vert +\Vert {\epsilon }_p(0)\Vert \le C h^{k+2},\\&\Vert \widehat{e}_u(0)\Vert _{\mathscr {E}_h}+\Vert \widehat{e}_q(0)\Vert _{\mathscr {E}_h}+\Vert \widehat{e}_p(0)\Vert _{\mathscr {E}_h}\le Ch^{2k+1}. \end{aligned}$$

In addition, we need an estimate for \({\epsilon }_{ut}\) at \(t=0\).

Lemma 3.4

If \(\tau _{qu}^\pm , \tau _{pu}^+, \tau _{qp}^-\) satisfy the condition (2.6), then for \(k>0\)

$$\begin{aligned}&\Vert {\epsilon }_{ut}(0)\Vert \le C h^{k+1}. \end{aligned}$$

Proof

Taking \(t=0\) and \(w={{\epsilon }_u}_t(0)\) in the error equation (3.3c), we have

$$\begin{aligned} (\epsilon _{ut}(0),{{\epsilon }_u}_t(0))+(\delta _{ut}(0), {{\epsilon }_u}_t(0))-(\epsilon _p(0),{{{\epsilon }_u}_t}_x(0)) + \left\langle \widehat{\epsilon }_p(0),{{\epsilon }_u}_t(0) n\right\rangle =0. \end{aligned}$$

By the Cauchy–Schwarz inequality, the trace inequality, and the inverse inequality, we get

$$\begin{aligned} \Vert {{\epsilon }_u}_t(0)\Vert ^2\le C\Vert \delta _{ut}(0)\Vert ^2+Ch^{-2}\Vert {\epsilon }_p(0)\Vert ^2 +Ch^{-1}\Vert \widehat{{\epsilon }}_p(0)\Vert _{\mathscr {E}_h}^2. \end{aligned}$$

Then the conclusion follows by using Lemmas 2.2 and 3.3. \(\square \)

Now we complete the proof of Theorem 2.3 by estimating the right-hand side of the inequality in Lemma 3.2 and applying Lemmas 3.3 and 3.4.

Proof

We first estimate the term \(\int _0^t\;\widehat{\epsilon }_p^{\;2}(x_N)\, dt\). Taking \(w\) to be \(\omega _1:=\frac{x-x_0}{x_N-x_0}\) in (3.3c), we get

$$\begin{aligned} \widehat{\epsilon }_p(x_N)=-(\epsilon _{ut}, \omega _1)-(\delta _{ut},\omega _1)+\left( \epsilon _p, \frac{1}{x_N-x_0}\right) \end{aligned}$$

by the fact that \(\omega _1(x_0)=0\) and \(\omega _1(x_N)=1\). Using the Cauchy–Schwarz inequality, we have

$$\begin{aligned} |\widehat{\epsilon }_p(x_N)| \le&\, |(\epsilon _{ut}, \omega _1)|+|(\delta _{ut},\omega _1)|+\left| \left( \epsilon _p, \frac{1}{x_N-x_0}\right) \right| \\ \le&\, C \left( \Vert \epsilon _{ut}\Vert + \Vert \delta _{ut}\Vert +\Vert \epsilon _p\Vert \right) . \end{aligned}$$

Then by the approximation property of the projection \(\Pi \) in Lemma 2.2, we obtain

$$\begin{aligned} \int _0^t\widehat{\epsilon }_p^{\;2}(x_N)\,dt \le \; Ch^{2k+2}+\int _0^t \Vert \epsilon \Vert ^2 dt. \end{aligned}$$
(3.10)

Next, we estimate the term \(|\int _0^t S \, dt|\). Let

$$\begin{aligned} S=S_1+S_2, \end{aligned}$$

where

$$\begin{aligned} S_1&=(\delta _{ut},\epsilon _u)-(\delta _q, \epsilon _p)+(\delta _p, \epsilon _q) +({\delta _q}_t, \epsilon _q)+(\delta _p, \epsilon _{ut})-(\delta _{ut},\epsilon _p)\\&\quad +\,({\delta _p}_t,\epsilon _p) +({\delta _q}_t,\epsilon _{ut}) +(\delta _{utt},\epsilon _{ut}),\\ S_2&=-({\delta _u}_t, {\epsilon _q}_t)-({\delta _q}_t, {\epsilon _p}_t)+({\delta _p}_t, {\epsilon _q}_t). \end{aligned}$$

Using the Cauchy–Schwarz inequality and the approximation property of the projection \(\Pi \) in Lemma 2.2, we get

$$\begin{aligned} \int _0^t |S_1|dt \le Ch^{k+1}\int _0^t\Vert {\epsilon }\Vert dt. \end{aligned}$$

Integrating \(S_2\) with respect to t, we have

$$\begin{aligned} \int _0^t S_2 dt&= - ({\delta _u}_{t}, {\epsilon }_q)|_0^t + \int _0^t({\delta _u}_{tt},{\epsilon }_q)dt -({\delta _q}_t, {\epsilon _p})|_0^t+\int _0^t({\delta _q}_{tt}, {\epsilon _p}) dt\\&\quad +\,({\delta _p}_t, {\epsilon _q})|_0^t -\int _0^t({\delta _p}_{tt}, {\epsilon _q})dt. \end{aligned}$$

By the approximation property of the projection \(\Pi \) in Lemma 2.2,

$$\begin{aligned} \biggl |\int _0^t S_2 dt\biggr | \le Ch^{2k+2}+C\Vert {\epsilon }(0)\Vert ^2+\frac{1}{4}\Vert {\epsilon }(t)\Vert ^2+Ch^{k+1}\int _0^t \Vert {\epsilon }\Vert dt. \end{aligned}$$

So we get

$$\begin{aligned} \begin{aligned} \Big |\int _0^t S dt\Big |&\le \int _0^t|S_1|dt +\left| \int _0^t S_2\, dt\right| \\&\le Ch^{2k+2}+C\Vert {\epsilon }(0)\Vert ^2+\frac{1}{4}\Vert {\epsilon }(t)\Vert ^2+Ch^{k+1}\int _0^t \Vert {\epsilon }\Vert dt. \end{aligned} \end{aligned}$$
(3.11)

Applying (3.10) and (3.11) to Lemma 3.2, we have

$$\begin{aligned} \Vert {\epsilon }(t)\Vert ^2 \le&C \Vert {\epsilon }(0)\Vert ^2 +C \Theta (0)+C h^{2k+2}+C\int _0^t\Vert {\epsilon }(t)\Vert ^2 dt. \end{aligned}$$

Since

$$\begin{aligned} \Theta (0)\le C \left( \Vert \widehat{{\epsilon }}_u(0)\Vert _{\mathscr {E}_h}^2+\Vert \widehat{{\epsilon }}_p(0)\Vert _{\mathscr {E}_h}^2\right) +Ch^{-1}\left( \Vert {\epsilon }_u(0)\Vert ^2+\Vert {\epsilon }_p(0)\Vert ^2\right) \end{aligned}$$

by Lemma 3.3 and the trace inequality, we have

$$\begin{aligned} \Vert {\epsilon }(t)\Vert ^2 \le&Ch^{2k+2}+C\int _0^t\Vert {\epsilon }(t)\Vert ^2 dt \end{aligned}$$

using Lemmas 3.3 and 3.4. Now we use Grönwall’s inequality and get

$$\begin{aligned} \Vert {\epsilon }(t)\Vert ^2\le C h^{2k+2}, \end{aligned}$$

where C depends on t but not on h. This completes the proof of Theorem 2.3. \(\square \)

4 Numerical Results

In this section, we carry out several numerical experiments to study the accuracy and capability of our HDG method. In the first and the second numerical experiments, we examine the orders of convergence of the method for linear and nonlinear third-order problems. In the third and the fourth experiments, we apply the method to solve some well-known dispersive wave problems. For all the experiments, we use the following second-order midpoint rule [2, 9] for time discretization. Let \(0=t_0<t_1<\cdots <t_J=T\) be a partition of the interval [0, T] and \(\Delta t_j=t_{j+1}-t_j\). For \(j=0, \ldots , J-1\) and \(\omega \in \{u_h, q_h, p_h\}\), let \(\omega ^{j+1}\in W_h^k\) be defined as

$$\begin{aligned} \omega ^{j+1}=2\omega ^{j,1}-\omega ^j, \end{aligned}$$

where \(\omega ^{j,1}\) is the solution of the equation

$$\begin{aligned} \frac{\omega ^{j,1}-\omega ^j}{\frac{1}{2}\,\Delta t_j} +(\omega ^{j,1})_{xxx}+F(\omega ^{j,1})_x=0. \end{aligned}$$
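To illustrate the stage-then-extrapolate form of this time-marching scheme, the following Python sketch applies the same update to the scalar model problem \(u_t=\lambda u\) (a toy stand-in for the discretized spatial operator; the function and variable names are ours, not from the paper) and observes the expected second-order accuracy in time.

```python
import math

def midpoint_march(lam, u0, T, J):
    """Advance u' = lam*u from u0 to time T in J steps, in the
    stage-then-extrapolate form used above:
    (stage - u)/(dt/2) = lam*stage, then u_next = 2*stage - u."""
    dt = T / J
    u = u0
    for _ in range(J):
        stage = u / (1.0 - 0.5 * dt * lam)  # implicit half-step stage
        u = 2.0 * stage - u                 # extrapolation to t_{j+1}
    return u

lam, u0, T = -1.0, 1.0, 1.0
exact = u0 * math.exp(lam * T)
e1 = abs(midpoint_march(lam, u0, T, 20) - exact)
e2 = abs(midpoint_march(lam, u0, T, 40) - exact)
order = math.log2(e1 / e2)  # observed temporal order, close to 2
```

Halving the time step reduces the error by roughly a factor of four, consistent with the second-order accuracy of the midpoint rule.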

The components of the stabilization function, \((\tau _{qu}^+, \tau _{pu}^+, \tau _{qu}^-, \tau _{qp}^-)\), are taken to be \((0, -1, 1, 1)\) in all the following numerical tests.

Numerical Experiment 1 In this test, we use the HDG method to solve the time-dependent third-order linear problem

$$\begin{aligned} u_t + u_{xxx} =f, \end{aligned}$$

where f is chosen so that the exact solution is \(u(x,t)=\sin (x+t)\) on the domain \((x,t)\in [0, 1]\times [0, 0.1]\). The initial condition is \(u_0=\sin (x)\) and the boundary conditions are \(u(0,t)=\sin (t), u(1,t)=\sin (1+t)\) and \(u_x(1,t)=\cos (1+t)\). We take \(h={2^{-n}}\) for \(n=1,\ldots , 5\). The time-step size is \(\Delta t=0.1\,h^2\) for \(k=0, 1\), and \(\Delta t=0.1\,h^3\) for \(k=2, 3\), so that the temporal errors are very small. We compute the orders of convergence of \(u_h, q_h, p_h\) at the final time \(T=0.1\), and the orders we observe in the numerical experiments are listed in Table 1.
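The orders reported in such tables are obtained from the errors on successively refined meshes; since \(h\) is halved between rows, the observed order between two meshes is \(\log_2(e_h/e_{h/2})\). A minimal sketch of this computation, using illustrative error values with the expected \(O(h^{k+1})\) decay for \(k=1\) (not the actual data of Table 1):

```python
import math

# Hypothetical L2 errors on meshes h = 2^{-n} (illustrative numbers,
# NOT the values from Table 1), decaying like O(h^2) as for k = 1.
errors = [1.20e-2, 3.05e-3, 7.70e-4, 1.93e-4, 4.84e-5]

# Observed order between consecutive meshes: log2(e_h / e_{h/2}).
orders = [math.log2(e_coarse / e_fine)
          for e_coarse, e_fine in zip(errors, errors[1:])]
```

For errors decaying like \(O(h^2)\), every entry of `orders` is close to 2, matching the optimal rate \(k+1\).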

Our numerical results indicate that the orders of convergence of \((e_u,e_q,e_p)\) are optimal, as predicted by the error estimate in Theorem 2.4 for any \(k>0\). For \(k=0\), although our error analysis is inconclusive, we observe that the method still converges optimally in the numerical experiment.

Table 1 The error \((e_u, e_q, e_p)\) and their convergence orders for the linear problem in the numerical experiment 1

Numerical Experiment 2 Now we use the HDG method to solve the nonlinear third-order equation

$$\begin{aligned} u_t + u_{xxx}+(3u^2)_x =f. \end{aligned}$$

The function f, the initial condition and the boundary conditions are chosen so that the exact solution is \(u(x,t)=\sin (2x+t)\) in the domain \((x,t)\in [0, \pi ]\times [0, 0.1]\). Here, we take the stabilization function \(\tau _F=3\), given that \(F(u)=3u^2\) and \(\frac{1}{2} |F'(u)|=3|u|\le 3\) for the solution u. The mesh size for the HDG method is \(h={2^{-n}}\) for \(n=3,\ldots , 7\). The time-step size is \(\Delta t=0.1\,h^2\) for \(k=0, 1\) and \(\Delta t=0.1\,h^3\) for \(k=2, 3\), so that the temporal errors are much smaller than the spatial errors. The orders of convergence of \(u_h, q_h, p_h\) at the final time \(T=0.1\) are displayed in Table 2. Our numerical results show that the orders of convergence of \((e_u,e_q,e_p)\) are also optimal for any \(k\ge 0\) for this nonlinear problem.

Table 2 The error \((e_u, e_q, e_p)\) and their convergence orders for the nonlinear problem in the numerical experiment 2

In the previous two tests, we have observed optimal convergence rates of the HDG method for both linear and nonlinear third-order problems. In the next two tests, we apply the method to solve the KdV equation

$$\begin{aligned} u_t+ u_{xxx}+ (3u^2)_x=0. \end{aligned}$$
(4.1)

Numerical Experiment 3 In this test, we consider the KdV equation (4.1) in the domain \((x,t)\in [-10,0]\times [0,2]\) with the initial condition \(u_0=2{{\mathrm{sech}}}^2(x+4)\) and the boundary conditions \(u(-10, t)=2{{\mathrm{sech}}}^2(-10-4t+4), \,u(0,t)=2{{\mathrm{sech}}}^2(-4t+4), \,u_x(0, t)= -4{{\mathrm{sech}}}^2(-4t+4)\tanh (-4t+4)\). The exact solution to this initial-boundary value problem is the classical solitary-wave solution [2, 27]

$$\begin{aligned} u(x,t)=2{{\mathrm{sech}}}^2(x-4t+4). \end{aligned}$$
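As a quick consistency check (not part of the paper's computation), one can verify numerically that this profile satisfies (4.1) by evaluating the residual \(u_t+u_{xxx}+(3u^2)_x = u_t+u_{xxx}+6uu_x\) with central finite differences; the function names below are ours.

```python
import math

def u(x, t):
    # classical solitary wave u(x,t) = 2 sech^2(x - 4t + 4)
    return 2.0 / math.cosh(x - 4.0 * t + 4.0) ** 2

def kdv_residual(x, t, h=1e-3):
    # central differences for u_t, u_x, u_xxx
    ut = (u(x, t + h) - u(x, t - h)) / (2.0 * h)
    ux = (u(x + h, t) - u(x - h, t)) / (2.0 * h)
    uxxx = (u(x + 2*h, t) - 2.0*u(x + h, t)
            + 2.0*u(x - h, t) - u(x - 2*h, t)) / (2.0 * h**3)
    return ut + uxxx + 6.0 * u(x, t) * ux  # (3u^2)_x = 6 u u_x

# sample the residual over the computational domain [-10, 0] at t = 1
res = max(abs(kdv_residual(-10.0 + 0.25 * i, 1.0)) for i in range(41))
```

The sampled residual is at the level of the finite-difference truncation error, i.e. far below the amplitude 2 of the wave, confirming that the profile is a solution of (4.1).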
Fig. 1

Space-time graphs of one soliton in the domain \((x, t)\in [-10, 0]\times [0, 2]\). Evolution of the HDG approximate solution (left) and the exact solution (right) of (a): u, (b): q, and (c): p

Fig. 2

Space-time graphs of the interaction of two solitary waves in the domain \((x, t)\in [-20, 0]\times [0, 2]\). Evolution of the HDG approximate solution (left) and the exact solution (right) of (a): u, (b): q, (c): p

In the computation, we use 100 elements, piecewise cubic polynomials, and time-step size \(\Delta t=10^{-3}\), and take \(\tau _F=(F'(\widehat{u}_h))^2+\frac{1}{4}\) so that \(\tau _F > \frac{1}{2}|F'(\widehat{u}_h)|\). The space-time graphs of the computed solution \((u_h, q_h, p_h)\) as well as the exact solution \((u, q, p)\) up to the final time \(T=2\) are displayed in Fig. 1. We observe a good match between the approximate and the exact solutions.
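This choice of \(\tau _F\) indeed satisfies the required bound: by the AM–GM inequality \(a^2+b^2\ge 2ab\) with \(a=|F'(\widehat{u}_h)|\) and \(b=\frac{1}{4}\),

```latex
\tau _F=(F'(\widehat{u}_h))^2+\frac{1}{4}
  > (F'(\widehat{u}_h))^2+\frac{1}{16}
  \ge 2\cdot |F'(\widehat{u}_h)|\cdot \frac{1}{4}
  = \frac{1}{2}\,|F'(\widehat{u}_h)|.
```

so the strict inequality holds pointwise, regardless of the size of \(\widehat{u}_h\).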

Numerical Experiment 4 In this test, we simulate the interaction of two solitary waves with different propagation speeds using our HDG method. We consider the KdV equation (4.1) in the domain \((x,t)\in [-20,0]\times [0,2]\) with the initial condition

$$\begin{aligned} u_0(x)= 5 \frac{4.5 {{\mathrm{csch}}}^2[1.5(x+14.5)]+2{{\mathrm{sech}}}^2(x+12)}{\{3\coth [1.5(x+14.5)]-2\tanh (x+12)\}^2} \end{aligned}$$

and boundary data \(u(-20,t), u(0,t), u_x(0,t)\), which admits the solution (see [27])

$$\begin{aligned} u(x,t)=5\frac{4.5{{\mathrm{csch}}}^2[1.5(x-9t+14.5)]+2{{\mathrm{sech}}}^2(x-4t+12)}{\{3\coth [1.5(x-9t+14.5)]-2\tanh (x-4t+12)\}^2}. \end{aligned}$$

In our computation, we use 50 elements, piecewise cubic polynomials, and the time-step size \(\Delta t=10^{-4}\). The stabilization function \(\tau _F\) is taken in the same way as in the previous test. The space-time graphs of the HDG approximate solutions and the exact solutions are displayed in Fig. 2. From the side-by-side comparison, we see that the HDG solutions are good approximations to the exact solutions. They show that the two waves move in the same direction: the faster soliton catches up with the slower one, the two overlap around \(t=0.5\), and afterwards the faster soliton continues to propagate while the slower one falls behind.

5 Concluding Remarks

In this paper, we develop a new HDG method for time-dependent third-order equations in one space dimension, based on the characterization of the exact solution as the solution of local problems that are "glued" together by transmission conditions. We find conditions on the stabilization function under which the method is \(L^2\) stable for KdV type equations, and we obtain optimal error estimates for the linear third-order equation. Our numerical results verify the theoretical error analysis and show that the method accurately simulates solitary-wave solutions of the KdV equation. In future work, we plan to develop and analyze HDG methods for fifth-order KdV equations, for third-order equations in several space dimensions, and for more complex systems.