1 Introduction

This paper considers the numerical methods for solving the generalized Benjamin–Ono (gBO) equation

$$\begin{aligned} {\left\{ \begin{array}{ll} u_t=-(-\mathcal Hu_x+\frac{1}{m}u^m)_x, \quad x \in \mathbb {R}, \,\, t>0, \,\, m \in \mathbb {Z}^+ ,\\ u(x,0)=u_0, \end{array}\right. } \end{aligned}$$
(1.1)

where \(u \in L^1(\mathbb {R})\) is the velocity potential, and the Hilbert transform \(\mathcal {H}\) is defined by

$$\begin{aligned} \mathcal Hf(x)=\frac{1}{\pi }\text{ p.v. }\int _{-\infty }^{\infty } \frac{f(y)}{x-y} dy, \end{aligned}$$
(1.2)

where p.v. stands for the principal value of the integral, or equivalently, \(\widehat{\mathcal Hf}(\xi )=-i\text{ sgn }(\xi )\hat{f}(\xi )\) on the Fourier frequency side. When \(m=2\), it is the well-known Benjamin–Ono (BO) equation

$$\begin{aligned} u_t -\mathcal H u_{xx} + u_x u =0, \end{aligned}$$
(1.3)

derived by Benjamin [5] in 1967 and Ono [54] in 1975. Equation (1.3) models one-dimensional waves in deep water. The BO equation is closely related to the Korteweg-de Vries (KdV) equation, in which the Hilbert transform term \(\mathcal {H}u_{xx}\) is replaced by \(u_{xxx}\). The KdV equation models one-dimensional shallow water waves. Both equations, BO and KdV, are completely integrable, and the Lax pair can be constructed as described in, e.g., [2, 3, 27, 34, 52]. More generally, one can consider other nonlinearities, for example the power nonlinearities in Eq. (1.1). They are relevant in various other models of water waves, e.g., see [1, 8, 9, 22]. When \(m=3\), Eq. (1.1) is typically referred to as the modified Benjamin–Ono (mBO) equation. When \(m \ge 3\), Eq. (1.1) is typically referred to as the generalized Benjamin–Ono (gBO) equation. In general, the gBO equation (1.1) conserves the following three quantities

$$\begin{aligned}&I[u(t)] {\mathop {=}\limits ^\mathrm{{def}}}\int u(x,t) dx =I[u_0]; \end{aligned}$$
(1.4)
$$\begin{aligned}&M[u(t)] {\mathop {=}\limits ^\mathrm{{def}}}\int [u(x,t)]^2 dx =M[u_0]; \end{aligned}$$
(1.5)
$$\begin{aligned}&E[u(t)] {\mathop {=}\limits ^\mathrm{{def}}}\int \left[ \frac{1}{2} \left( (\mathcal H \partial _x)^{\frac{1}{2}} u(x,t)\right) ^2-\frac{1}{m(m+1)} \left( u(x,t)\right) ^{m+1} \right] dx = E[u_0]. \end{aligned}$$
(1.6)

The first quantity is called the momentum (also referred to as the first integral, or the hyperbolic conservation law), and the last two are often called the mass and the energy (Hamiltonian), respectively.

Besides its physical applications, the gBO equation is of great mathematical interest, since it is a good model problem for the study of fractional partial differential equations. The well-posedness theory for the Cauchy problem was first discussed in [37, 63]. Further improvements on the well-posedness questions were obtained in [16, 17, 35, 41, 50, 51, 67, 70]. The gBO equation possesses a scaling invariance: if \(u(x,t)\) is a solution to (1.1), then \(u_{\lambda }(x,t)=\lambda ^{\frac{1}{m-1}}u(\lambda x, \lambda ^2 t)\) is also a solution to (1.1) for any constant \(\lambda >0\). Moreover, the homogeneous Sobolev norm is invariant under this scaling, \(\Vert u\Vert _{\dot{H}^{s_c}}=\Vert u_{\lambda }\Vert _{\dot{H}^{s_c}}\) (where \(\Vert u\Vert _{\dot{H}^{s_c}}=\Vert |\xi |^{s_c}\hat{u}(\xi )\Vert _{L^2}\)), precisely when \(s_c=\frac{1}{2}-\frac{1}{m-1}\). When \(s_c<0\) (\(m<3\)), the equation is \(L^2\)-subcritical; when \(s_c=0\) (\(m=3\)), it is \(L^2\)-critical; and when \(s_c>0\) (\(m>3\)), it is \(L^2\)-supercritical. The soliton resolution conjecture is one of the most interesting topics in the \(L^2\)-subcritical case; in the \(L^2\)-critical and \(L^2\)-supercritical cases, there may exist blow-up solutions. This was numerically observed in [10] and our recent paper [59]. Besides the blow-up solutions, there are still many open questions, such as the soliton stability and the dispersion limit. These kinds of questions have been studied both numerically and analytically. Compared with the (generalized) KdV equation (e.g., [11, 30, 31], etc.), however, the gBO equation is less well studied (e.g., [49, 56] and review [62]). Therefore, a stable, efficient and accurate numerical algorithm is desirable for future study.
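The classification by the critical index can be tabulated directly from the relation \(s_c=\frac{1}{2}-\frac{1}{m-1}\). The following is a minimal sketch; the function names `critical_index` and `regime` are hypothetical and introduced only for illustration.

```python
from fractions import Fraction

def critical_index(m: int) -> Fraction:
    """Scaling-critical Sobolev index s_c = 1/2 - 1/(m-1) of the gBO equation, m >= 2."""
    if m < 2:
        raise ValueError("the scaling index is only defined for m >= 2")
    return Fraction(1, 2) - Fraction(1, m - 1)

def regime(m: int) -> str:
    """Classify the power m by the sign of s_c."""
    s_c = critical_index(m)
    if s_c < 0:
        return "L2-subcritical"
    if s_c == 0:
        return "L2-critical"
    return "L2-supercritical"

for m in (2, 3, 4, 5):
    print(m, critical_index(m), regime(m))
```

For instance, the BO equation (\(m=2\)) gives \(s_c=-\frac{1}{2}\), while the mBO equation (\(m=3\)) gives \(s_c=0\).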

Numerical investigations of the BO equation started some time ago. Related articles can be found in [10, 24, 68] for the domain truncation approach; [12, 13, 71] for the computation of the Hilbert transform on \(\mathbb {R}\); [15, 32] for the pseudo-spectral method with the rational basis functions; and [14] for a comparison between the domain truncation and the pseudo-spectral method on \(\mathbb {R}\). Despite some years of investigation, there are still far fewer studies of numerical methods for the gBO equations than for the gKdV equations. To the best of our knowledge, there are no results so far concerning conservative schemes for the gBO equation on the whole real line \(\mathbb {R}\). On the other hand, conservative schemes are generally preferable in simulating PDEs with conserved quantities, especially for studying the long time behavior of solutions, since they generally possess good accuracy and stability. One possible reason for this gap is that the numerical approximation of the Hilbert transform on \(\mathbb {R}\) is not as well studied as on a finite domain. Moreover, with a conventional domain truncation strategy for the spatial discretization (e.g., finite difference or Fourier spectral methods), the Hilbert transform usually leads to a slowly decaying function, and consequently, to a relatively large domain truncation error.

The purpose of this paper is to construct conservative schemes for the gBO equation (1.1) on the whole real line \(\mathbb {R}\), with arbitrarily high order accuracy in time. The spatial discretization is achieved by the rational basis functions with the pseudo-spectral approach from [32]. We first notice that the pseudo-spectral discretization with the rational basis functions results in operators with the summation-by-parts (“SBP”) property discussed in [57]. Next, we prove that by reformulating the equation in different forms, and applying the Hermitian or anti-Hermitian properties of the resulting spatial semi-discretized system, either the spatial semi-discretized mass or the energy is preserved. For the temporal discretization, the Crank–Nicolson method with the conventional reformulation of the nonlinear potential term (e.g., see [25] for the nonlinear Schrödinger (NLS) equation case) leads to the conservation of the energy in the discrete time flow. Furthermore, high order mass and energy conservative schemes can be constructed from the scalar auxiliary variable (SAV) approach (see [21, 65, 66] for applications to dispersive PDEs). By using the symplectic Runge–Kutta (SRK) method, the three invariant quantities (1.4)–(1.6) [with proper modifications for the energy (1.6)] are preserved exactly in the discrete time flow. However, due to the limitation of the spatial discretization, we can only conserve either the discrete mass or the discrete energy in the space-time fully discrete sense. In fact, this strategy is universal. By a similar space-time discretization, it is easy to construct conservative schemes for the gKdV equations and structure-preserving schemes (the discrete mass and energy are preserved exactly at the same time) for the NLS equations, as well as their high dimensional generalizations by applying the tensor product. This will be useful in studying the long time behavior of the solutions of those equations, as well as slowly decaying solutions, since the traditional domain truncation strategy (e.g., [29, 45, 46, 73]) requires a large computational domain, and consequently, a large number of nodes in the spatial discretization.

This paper is organized as follows. In Sect. 2, we introduce the pseudo-spectral spatial discretization strategy based on the rational basis functions. Then, we define the discrete inner product with respect to the collocation points of these rational basis functions. Finally, we give the mass-conservative and the energy-conservative spatial semi-discretized forms of the gBO equation (1.1). In Sect. 3, we first introduce the Crank–Nicolson-type temporal discretizations. We show that the Crank–Nicolson method, with its conventional modification of the nonlinear term, preserves the energy exactly in the discrete time flow. Combining this with the results of Sect. 2, we give two fully discretized schemes for the gBO equation, which conserve either the discrete mass or the discrete energy. Next, we consider the high-order conservative schemes achieved by the symplectic Runge–Kutta method with the SAV reformulation. From the classical argument on the symplectic Runge–Kutta method (e.g. [20, 60]), we show that the reformulated system preserves the quantities (1.4)–(1.6) exactly in the discrete time flow. Again, combined with the spatial discretization results in Sect. 2, we give two fully discretized schemes with high order temporal accuracy: one conserves the discrete mass and the other conserves the discrete energy. In Sect. 4, we present numerical examples. As a comparison, we also show the numerical results obtained from the non-conservative semi-implicit Leap-Frog scheme. Our numerical results show that the proposed schemes preserve the designated quantities according to the type of conservative scheme we choose. Compared with the non-conservative scheme, these conservative schemes also possess better accuracy in most cases, especially in simulating soliton-type solutions, which are of great physical interest. Additionally, the error from the temporal discretization decreases at the expected order (second order for the IRK2 and Leap-Frog schemes, and fourth order for the IRK4 schemes). These results show the validity and efficiency of the proposed numerical methods.

2 Spatial Discretization

In this section, we describe the rational basis functions on \(\mathbb {R}\) used for the spatial discretization. A review of these basis functions can be found in [19, 71]. One advantage of this discretization is that it can easily represent the Hilbert transform. Then, we define the discrete inner product corresponding to the collocation points of the rational basis functions. Finally, we introduce two types of spatial discretization for the gBO equation (1.1): one is mass-conservative and the other one is energy-conservative.

2.1 Rational Basis Functions

Consider the rational basis functions on the whole real line \(\mathbb {R}\), which come from the Fourier transform of the Laguerre functions (functions of parabolic cylinder), and behave as \(x^{-1}\) at infinity, i.e.,

$$\begin{aligned} u(x,t)=\sum _{k=-\infty }^{\infty } \hat{u}_{k}(t)\rho _k(x), \quad \rho _k(x)=\frac{(\alpha +ix)^k}{(\alpha -ix)^{k+1}}, \end{aligned}$$
(2.1)

where i is the imaginary unit, and \(\alpha \) is a mapping parameter that we will describe later. In [19], it is shown that \(\{ \rho _k(x) \}_{k=-\infty }^{\infty }\) form a complete orthogonal basis in \(L^2(-\infty , \infty )\) with the following orthogonality

$$\begin{aligned} \int _{-\infty }^\infty \rho _j(x) \overline{\rho _k(x)}dx = {\left\{ \begin{array}{ll} \pi /\alpha , &{} \quad j=k,\\ 0, &{} \quad j\ne k, \end{array}\right. } = \frac{\pi }{\alpha } \delta _{j,k} . \end{aligned}$$
(2.2)

Therefore, we have

$$\begin{aligned} \hat{u}_k(t)=\frac{\alpha }{\pi }\int _{-\infty }^{\infty } u(x,t)\overline{\rho _k(x)}dx. \end{aligned}$$

From the rational expansion (2.1), the Hilbert transform can be easily calculated [71] by

$$\begin{aligned} \mathcal {H}u(x,t)=\sum _{k=-\infty }^{\infty } -i \hat{u}_k(t) \text {sgn}(k)\rho _k(x), \end{aligned}$$
(2.3)

with \(\textrm{sgn}(k)=1\) when \(k=0\). Meanwhile, the derivatives of \(u(x,t)\) can be computed by using the relation \(\rho _k+\rho _{k-1}=\frac{2\alpha (\alpha +ix)^{k-1}}{(\alpha -ix)^{k+1}}\), and consequently,

$$\begin{aligned} u_x(x,t)=\sum _{k=-\infty }^{\infty } \frac{i}{2\alpha }[k \hat{u}_{k-1}+(2k+1) \hat{u}_k+(k+1) \hat{u}_{k+1}]\rho _k(x). \end{aligned}$$
(2.4)

Similarly to the computation of higher order derivatives for the Chebyshev or Legendre basis approximation in [64, Chapter 3], the second (or higher) order derivative of \(u(x,t)\) can be computed iteratively as follows:

$$\begin{aligned} u_{xx}(x,t)&= \sum _{k=-\infty }^{\infty } -\frac{1}{4\alpha ^2} [k(k-1) \hat{u}_{k-2}+4k^2\hat{u}_{k-1} \nonumber \\&\quad + (6k^2+6k+2) \hat{u}_k+4(k+1)^2 \hat{u}_{k+1}+(k+2)(k+1) \hat{u}_{k+2}]\rho _k(x) . \end{aligned}$$
(2.5)

In numerical computations, a truncated N-term interpolation function \(I_Nu\) is used to approximate the function u(x). It uses the basis \(\lbrace \rho _k(x) \rbrace _{-N/2}^{N/2-1}\) to interpolate the node values \(u(x_{-N/2}),\cdots , u(x_{N/2-1})\), i.e.,

$$\begin{aligned} u(x,t)\approx I_Nu:= \hat{\textbf{u}}^T \vec {\rho }:=\sum _{k=-N/2}^{N/2-1} \hat{u}_k(t)\rho _k(x), \end{aligned}$$

where \(\hat{\textbf{u}}=(\hat{u}_{-N/2},\hat{u}_{-N/2+1}, \cdots , \hat{u}_{N/2-1})^T\) is the vector of the truncated coefficients, and \(\vec {\rho }=(\rho _{-N/2}(x), \cdots , \rho _{N/2-1}(x))^T\) is the vector function of \(\rho _k(x)\). This leads to the sparse matrix forms

$$\begin{aligned} u_x \approx [ \mathbf {S_1} \hat{\textbf{u}}]^T \vec {\rho }, \quad u_{xx} \approx [\mathbf {S_2} \hat{\textbf{u}}]^T \vec {\rho }, \quad \mathcal {H}u \approx [\textbf{H}\hat{\textbf{u}}]^T \vec {\rho }, \end{aligned}$$
(2.6)

where \(\mathbf {S_{1}}\) and \(\mathbf {S_{2}}\) are given in (2.4) and (2.5) via the coefficients of \(\lbrace \hat{u}_k \rbrace \), and

$$\begin{aligned} \textbf{H}=-i \textrm{diag}(\textrm{sgn}(-N/2+0.5), \cdots , \textrm{sgn}(N/2-0.5)) \end{aligned}$$

is the diagonal matrix representing the approximation of the Hilbert transform in (2.3).
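As a small illustration of (2.3), the sketch below applies the diagonal matrix \(\textbf{H}\) to the coefficients of \(u(x)=1/(1+x^2)\), which for \(\alpha =1\) equals \((\rho _0+\rho _{-1})/2\), and compares the result with the known transform \(\mathcal {H}u=x/(1+x^2)\). The helper `evaluate` and the choices \(N=16\), \(\alpha =1\) are assumptions made only for this example; the basis is evaluated through \(\rho _k(x)=e^{ik\theta }/(\alpha -ix)\) with \(x=\alpha \tan (\theta /2)\).

```python
import numpy as np

N, alpha = 16, 1.0
k = np.arange(-N // 2, N // 2)            # modes k = -N/2, ..., N/2-1
theta = 2 * np.pi * k[1:] / N             # finite nodes only (theta = -pi is x = -infinity)
x = alpha * np.tan(theta / 2)

def evaluate(coeffs):
    """Evaluate sum_k coeffs[k] * rho_k(x_j), using rho_k = exp(i k theta) / (alpha - i x)."""
    return (np.exp(1j * np.outer(theta, k)) @ coeffs) / (alpha - 1j * x)

# Coefficients of u = 1/(1+x^2) = (rho_0 + rho_{-1}) / 2 for alpha = 1.
u_hat = np.where((k == 0) | (k == -1), 0.5, 0.0).astype(complex)

# Diagonal Hilbert matrix; the +0.5 shift realizes the convention sgn(0) = 1.
H = -1j * np.diag(np.sign(k + 0.5))

u = evaluate(u_hat)
Hu = evaluate(H @ u_hat)
print(np.max(np.abs(Hu - x / (1 + x**2))))
```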

From (2.4) to (2.6), it is easy to see that the matrices \(\mathbf {S_1}\) and \(\textbf{H}\) are anti-Hermitian, and the matrix \(\mathbf {S_2}\) is real and symmetric. The anti-Hermitian and Hermitian properties for \(\mathbf {S_1}\) and \(\mathbf {S_2}\) are crucial in constructing the conservative schemes.
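These structural properties can be checked numerically on the truncated matrices. The sketch below assembles \(\mathbf {S_1}\), \(\mathbf {S_2}\) and \(\textbf{H}\) from the coefficients in (2.4), (2.5) and (2.3); the function name `spectral_matrices` and the simple truncation (entries outside the index range are dropped) are assumptions of this illustration.

```python
import numpy as np

def spectral_matrices(N, alpha):
    """Truncated S1 (2.4), S2 (2.5) and H (2.3) for modes k = -N/2, ..., N/2-1."""
    k = np.arange(-N // 2, N // 2)
    S1 = np.zeros((N, N), dtype=complex)
    S2 = np.zeros((N, N))
    for r, kk in enumerate(k):
        # Row r holds the factors multiplying u_hat_{k-2}, ..., u_hat_{k+2}.
        S1[r, r] = 2 * kk + 1
        S2[r, r] = 6 * kk**2 + 6 * kk + 2
        if r >= 1:
            S1[r, r - 1] = kk
            S2[r, r - 1] = 4 * kk**2
        if r >= 2:
            S2[r, r - 2] = kk * (kk - 1)
        if r <= N - 2:
            S1[r, r + 1] = kk + 1
            S2[r, r + 1] = 4 * (kk + 1) ** 2
        if r <= N - 3:
            S2[r, r + 2] = (kk + 2) * (kk + 1)
    S1 *= 1j / (2 * alpha)
    S2 *= -1.0 / (4 * alpha**2)
    H = -1j * np.diag(np.sign(k + 0.5))   # sgn(0) = 1 via the +0.5 shift
    return S1, S2, H

S1, S2, H = spectral_matrices(16, 1.0)
print(np.allclose(S1.conj().T, -S1))      # S1 anti-Hermitian
print(np.allclose(S2, S2.T))              # S2 real symmetric
print(np.allclose((H @ S1).imag, 0.0))    # H S1 is real
```

Away from the first and last rows, where the truncation removes neighbouring modes, the matrix from (2.5) coincides with \(\mathbf {S_1}^2\).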

Now, consider the change of variable

$$\begin{aligned} x=\alpha \tan \frac{\theta }{2}, \quad \textrm{or} \,\, \textrm{equivalently}, \quad e^{i\theta }=\frac{\alpha +ix}{\alpha -ix}, \quad -\pi \le \theta \le \pi , \end{aligned}$$

and a spatial discretization \(x_j=\alpha \tan \frac{\theta _j}{2}, \theta _j=jh, h=2\pi /N, j=-N/2, \cdots , N/2-1\), where the \(\alpha \) is the mapping parameter indicating that N/2 collocation points are located in the interval \([-\alpha , \alpha ]\). Notice that

$$\begin{aligned} u(x_j)=\sum _{k=-N/2}^{N/2-1}\hat{u}_k \rho _k(x_j) \,\, \Rightarrow \,\, u(x_j)(\alpha -ix_j)=\sum _{k=-N/2}^{N/2-1}\hat{u}_k e^{ik\theta _j}, \end{aligned}$$
(2.7)

hence, the fast Fourier transform (FFT) can be applied to obtain the coefficients \(\hat{u}_k\). We note that the above discretization in space is not uniform in x, but uniform in \(\theta \), and the singularity at \(x_{-N/2}=-\infty \) can be removed by imposing the boundary condition \(u(-\infty )=0\), i.e., setting \(u_{-N/2}=0\).

We denote by \(\textbf{F}\) the standard discrete Fourier transform matrix with respect to \(\{ k\theta _j \}\), i.e.,

$$\begin{aligned} \textbf{F}_{kj}= \frac{1}{N} e^{-ik \theta _j}, \quad \textbf{F}^{-1}_{jk}= e^{ik \theta _j}, \quad -N/2 \le j,k \le N/2-1. \end{aligned}$$

Note that, instead of forming the matrices explicitly, the action of \(\textbf{F}\) and \(\textbf{F}^{-1}\) can be computed by the FFT (see, e.g., [64, Chapter 2] and [69, Chapter 3]). Denote the diagonal matrix \(\textbf{P}=\text {diag}(\alpha -ix_{-N/2}, \cdots , \alpha -ix_{N/2-1})\) to be the weight matrix, which comes from (2.7). The coefficients \(\hat{u}_k\) in the vector form can be represented by

$$\begin{aligned} \hat{\textbf{u}}=\textbf{FPu}, \end{aligned}$$
(2.8)

where \(\textbf{u}=(u_{-N/2},\cdots , u_{N/2-1})^T\) and \(u_j=u(x_j)\).
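The relation (2.8) can be sketched as follows, both with the explicit matrix \(\textbf{F}\) and with an FFT. The example assumes \(\alpha =1\), \(N=16\) and the test function \(u=1/(1+x^2)=(\rho _0+\rho _{-1})/2\), for which \((\alpha -ix)u\) vanishes at the node at infinity, so setting that entry to zero is exact here. The `ifftshift`/`fftshift` calls are an implementation choice of this sketch that maps the centred index range \(-N/2,\ldots ,N/2-1\) to the ordering used by `numpy.fft`.

```python
import numpy as np

N, alpha = 16, 1.0
idx = np.arange(-N // 2, N // 2)          # shared index range of j and k
theta = 2 * np.pi * idx / N               # theta_j = j h, h = 2 pi / N
x = np.full(N, -np.inf)                   # x_{-N/2} = -infinity
x[1:] = alpha * np.tan(theta[1:] / 2)

u = np.zeros(N)                           # u_{-N/2} = 0 at the node at infinity
u[1:] = 1.0 / (1.0 + x[1:] ** 2)

Pu = np.zeros(N, dtype=complex)           # (alpha - i x_j) u_j, set to 0 at infinity
Pu[1:] = (alpha - 1j * x[1:]) * u[1:]

# Explicit matrix F_{kj} = exp(-i k theta_j) / N, as in the definition of F.
F = np.exp(-1j * np.outer(idx, theta)) / N
u_hat = F @ Pu

# Same coefficients through the FFT, with shifts for the centred ordering.
u_hat_fft = np.fft.fftshift(np.fft.fft(np.fft.ifftshift(Pu))) / N

print(np.round(u_hat.real, 12))
```

Only the two modes \(k=-1\) and \(k=0\) are nonzero in this example, each with coefficient \(\frac{1}{2}\).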

Now, we define the discrete inner product with respect to the rational basis function. Denote the inner product between the two functions u(x) and v(x) on \(\mathbb {R}\) by

$$\begin{aligned} \langle u(x),v(x) \rangle := \int _{\mathbb {R}} u(x) \bar{v}(x) dx. \end{aligned}$$

Recall that the interpolation function \(I_N u\) is the approximation of \(u(x) \approx I_Nu = \sum _{k=-N/2}^{N/2-1} \hat{u}_{k}\rho _k(x)\), then, the approximation of the inner product for functions u and v will be

$$\begin{aligned} \langle u, v \rangle&\approx \int _{\mathbb {R}} I_Nu \overline{I_N v} \, dx = \frac{\pi }{\alpha } \sum _{k=-N/2}^{N/2-1} \hat{u}_k \overline{\hat{v}}_k \nonumber \\&=\frac{\pi }{\alpha }\overline{\mathbf {({FPv})}}^\textbf{T}\mathbf {(FPu)}=\frac{\pi }{\alpha }\bar{\textbf{v}}^\textbf{T} \bar{\textbf{P}} \bar{\textbf{F}}^\textbf{T} \textbf{F} \textbf{Pu}=\frac{\pi }{\alpha N}\bar{\textbf{v}}^\textbf{T} \bar{\textbf{P}} \textbf{Pu}, \end{aligned}$$
(2.9)

from the orthogonal property (2.2), the relation (2.8), and \(\bar{\textbf{F}}^\textbf{T}=\frac{1}{N} \mathbf {F^{-1}}\) (e.g., see [64, Chapter 2]). Denote the diagonal matrix \(\textbf{W}=\textbf{P}\bar{\textbf{P}}=\textrm{diag}(\alpha ^2+x^2_{-N/2}, \cdots , \alpha ^2+x^2_{N/2-1})\) to be the product of the two diagonal matrices \(\textbf{P}\) and \(\bar{\textbf{P}}\). According to (2.9), we can define the discrete inner product with respect to the collocation points \(\lbrace x_j \rbrace \) from the rational basis function:

$$\begin{aligned} \langle \textbf{u}, \textbf{v} \rangle _h := \frac{\pi }{N \alpha } \bar{\textbf{v}}^\textbf{T}{} \textbf{Wu}=\frac{\pi }{N \alpha } \sum _{j=-N/2}^{N/2-1} w_j u_j \bar{v}_j, \end{aligned}$$
(2.10)

where \(w_j=\alpha ^2+x^2_{j}\) can be considered as the weights for the quadrature rule in (2.10).
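A minimal sketch of the quadrature (2.10) is given below. The names `rational_nodes` and `inner_h` are hypothetical, and the weight of the node at infinity is set to zero, which assumes that the functions involved vanish there. For \(u=1/(1+x^2)\) and \(\alpha =1\), the discrete mass reproduces \(\int _{\mathbb {R}} (1+x^2)^{-2}dx=\pi /2\).

```python
import numpy as np

def rational_nodes(N, alpha):
    """Collocation points x_j = alpha * tan(theta_j / 2) and weights w_j = alpha^2 + x_j^2."""
    j = np.arange(-N // 2, N // 2)
    theta = 2 * np.pi * j / N
    x = np.full(N, -np.inf)               # x_{-N/2} = -infinity
    x[1:] = alpha * np.tan(theta[1:] / 2)
    w = np.zeros(N)                       # assumption: drop the node at infinity
    w[1:] = alpha**2 + x[1:] ** 2
    return x, w

def inner_h(u, v, w, alpha):
    """Discrete inner product <u, v>_h = pi / (N alpha) * sum_j w_j u_j conj(v_j)."""
    return np.pi / (len(u) * alpha) * np.sum(w * u * np.conj(v))

N, alpha = 32, 1.0
x, w = rational_nodes(N, alpha)
u = np.zeros(N)
u[1:] = 1.0 / (1.0 + x[1:] ** 2)

mass = inner_h(u, u, w, alpha)
print(mass, np.pi / 2)
```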

Now, for simplicity, we define the first order and second order differentiation matrices \(\mathbf {D_{1,2}}=\mathbf {P^{-1}F^{-1} S_{1,2} FP}\). Then, \(u_x(x_j) \approx \mathbf {D_1u}(j)\), the jth element of the vector \(\mathbf {D_1u}\), and similarly \(u_{xx}(x_j) \approx \mathbf {D_2u}(j)\) for the second order derivative.

We notice that for \(u,v \in \mathbb {V}_N\), where \(\mathbb {V}_N\) is the space spanned by \(\lbrace \rho _k(x) \rbrace _{-N/2}^{N/2-1}\), the operator \(\mathbf {D_1}\) satisfies the summation-by-parts (SBP) property, i.e.,

$$\begin{aligned} \int u_x v dx =\langle \mathbf {D_1u,}\bar{\textbf{v}} \rangle _h=-\langle \mathbf {u,D_1} {\bar{\textbf{v}}} \rangle _h=-\int u v_x dx. \end{aligned}$$
(2.11)

Indeed, from (2.9), we have

$$\begin{aligned} \int u_x v dx&=\langle \mathbf {D_1u,}\bar{\textbf{v}} \rangle _h =\frac{\pi }{\alpha N}\mathbf {v^TWD_1u}=\frac{\pi }{\alpha N}(\mathbf {F^TS_1^T(F^{-1})^T}\bar{\textbf{P}}{} \textbf{v})^\textbf{T}{} \textbf{Pu}\\&=\frac{\pi }{\alpha N}\overline{\left( \mathbf {P^{-1}}\bar{\textbf{F}}^\textbf{T}\bar{\textbf{S}}_\textbf{1}^\textbf{T}(\bar{\textbf{F}}^{\mathbf{-1}})^\textbf{TP}\bar{\textbf{v}} \right) ^\textbf{T}}\bar{\textbf{P}}{} \textbf{Pu}=-\frac{\pi }{\alpha N}(\overline{\textbf{FPD}_\textbf{1}\bar{\textbf{v}}})^\textbf{T} \textbf{FPu}\\&=-\langle \mathbf {u,D_1}\bar{\textbf{v}} \rangle _h=-\int u v_x dx. \end{aligned}$$
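The summation-by-parts identity can be checked numerically. The sketch below forms \(\mathbf {D_1}=\mathbf {P^{-1}F^{-1}S_1FP}\) explicitly for a small N. It assumes that the entries of \(\textbf{P}\), \(\mathbf {P^{-1}}\) and \(\textbf{W}\) belonging to the node at infinity are set to zero (consistent with \(u_{-N/2}=0\)), and it keeps only the real part of \(\mathbf {D_1}\), since this operator maps real grid functions to real grid functions up to rounding. The name `build_D1` is hypothetical.

```python
import numpy as np

def build_D1(N, alpha):
    """Nodes x, weights W and D1 = P^{-1} F^{-1} S1 F P (node at infinity masked out)."""
    idx = np.arange(-N // 2, N // 2)
    k = idx.astype(float)
    theta = 2 * np.pi * idx / N
    x = np.full(N, -np.inf)
    x[1:] = alpha * np.tan(theta[1:] / 2)
    P = np.zeros(N, dtype=complex)
    P[1:] = alpha - 1j * x[1:]
    Pinv = np.zeros(N, dtype=complex)
    Pinv[1:] = 1.0 / P[1:]
    W = np.abs(P) ** 2
    F = np.exp(-1j * np.outer(idx, theta)) / N
    Finv = np.exp(1j * np.outer(theta, idx))
    # Tridiagonal coefficients of (2.4): k, 2k+1, k+1.
    S1 = 1j / (2 * alpha) * (np.diag(2 * k + 1) + np.diag(k[1:], -1) + np.diag(k[:-1] + 1, 1))
    D1 = (Pinv[:, None] * (Finv @ S1 @ F) * P[None, :]).real
    return x, W, D1

N, alpha = 16, 1.0
x, W, D1 = build_D1(N, alpha)

def inner_h(a, b):
    """Discrete inner product (2.10) for real vectors."""
    return np.pi / (N * alpha) * np.sum(W * a * b)

# Derivative of u = 1/(1+x^2), which lies in V_N for alpha = 1.
u = np.zeros(N)
u[1:] = 1.0 / (1.0 + x[1:] ** 2)
exact = -2 * x[1:] / (1.0 + x[1:] ** 2) ** 2
deriv_err = np.max(np.abs((D1 @ u)[1:] - exact))

# Summation by parts for arbitrary real grid vectors.
rng = np.random.default_rng(0)
a, b = rng.standard_normal(N), rng.standard_normal(N)
sbp_defect = inner_h(D1 @ a, b) + inner_h(a, D1 @ b)
print(deriv_err, sbp_defect)
```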

Unfortunately, the operator \(\mathbf {D_1}\) does not satisfy the chain rule or the integration property. For example, let \(u \in \mathbb {V}_N\) and consider the power nonlinearity \(u^3\). For any test function \(v \in \mathbb {V}_N\), we would like to have \(\int (u^3)_x\bar{v} dx=\int 3u^2u_x\bar{v} dx\). However,

$$\begin{aligned} \int (u^3)_x\bar{v} dx \approx \int I_N((u^3)_x)\bar{v} dx = \langle \mathbf {D_1u^3,}\bar{\textbf{v}} \rangle _h \ne \langle \mathbf {diag(3u^2)D_1u,}\bar{\textbf{v}} \rangle _h=\int I_N(3u^2u_x)\bar{v} dx. \end{aligned}$$

Similarly, \(\langle \mathbf {D_1u,1} \rangle _h \ne 0\) (which approximates \(\int u_x dx\)), since 1 is not in the function space \(\mathbb {V}_N\) unless \(N\rightarrow \infty \). This implies that, unlike the Fourier spectral method, the rational basis function spectral method here does not preserve the first integral.

2.2 Conservative Spatial Discretization

To discuss the conservative spatial discretizations, we first define the spatial semi-discretized \(L^1\)-type integral, mass and energy from (1.4)–(1.6). Let \(\textbf{1}=(1,1, \cdots ,1)^T\) be the \(N \times 1\) vector. For simplicity, we also denote by \(\mathbf {u^m}=(u_{-N/2}^m, \cdots u_{N/2-1}^m)^T\) to be the pointwise power of the vector \(\textbf{u}\). Then, the spatial semi-discretized \(L^1\)-type integral, mass and energy are defined as follows

$$\begin{aligned}&I_h=\langle \textbf{u}, \textbf{1} \rangle _h; \end{aligned}$$
(2.12)
$$\begin{aligned}&M_h=\langle \textbf{u}, \textbf{u} \rangle _h; \end{aligned}$$
(2.13)
$$\begin{aligned}&E_h=\frac{1}{2} \langle \mathbf {P^{-1}F^{-1} HS_1 FP u}, \textbf{u} \rangle _h -\frac{1}{m(m+1)} \langle \mathbf {u^m}, \textbf{u} \rangle _h. \end{aligned}$$
(2.14)

It is easy to see that if \(\textbf{u} \in \mathbb {R}^N\), then \(\frac{d}{dt}I_h=\langle \textbf{u}_t, \textbf{1} \rangle _h\), and \(\frac{d}{dt}M_h=2\langle \textbf{u}_t, \textbf{u} \rangle _h\) from (2.10). We also note that

$$\begin{aligned} \frac{d}{dt}E_h&= \frac{1}{2} \left( \langle \mathbf {P^{-1}F^{-1} HS_1 FP u}_t, \textbf{u} \rangle _h+\langle \mathbf {P^{-1}F^{-1} HS_1 FP u}, \textbf{u}_t \rangle _h \right) \nonumber \\&\quad - \frac{1}{m(m+1)} \left( \langle (\mathbf {u^m})_t, \textbf{u} \rangle _h +\langle \mathbf {u^m}, \textbf{u}_t \rangle _h \right) \nonumber \\&={\text {Re}}\left( \langle \mathbf {P^{-1}F^{-1} HS_1 FP u}, \textbf{u}_t \rangle _h \right) -\frac{1}{m} \langle \mathbf {u^m}, \textbf{u}_t \rangle _h. \end{aligned}$$
(2.15)

The last equality is obtained from the following argument. We first notice that the matrix \(\mathbf {HS_1}\) is real and symmetric, and \(\textbf{P}\) is diagonal. Consequently, using (2.10) and \(\bar{\textbf{F}}^\textbf{T}=\frac{1}{N}\mathbf {F^{-1}}\), we obtain

$$\begin{aligned}&\langle \mathbf {P^{-1}F^{-1} HS_1 FP u}, \textbf{u}_t \rangle _h= \frac{\pi }{N\alpha }\bar{\textbf{u}^T}_t \textbf{P} \bar{\textbf{P}} \mathbf{P^{-1}F^{-1}HS_1FPu} \\&\quad = \frac{\pi }{N\alpha } \mathbf {u^TP^TF^THS_1(F^{-1})}^\textbf{T}\bar{\textbf{P}}\bar{\textbf{u}}_t = \frac{\pi }{N\alpha } \mathbf {u^TP}\bar{\textbf{P}} \bar{\textbf{P}}^{-1}\bar{\textbf{F}}^{-1}\overline{\mathbf{HS_1}}\bar{F}\bar{P}\bar{\textbf{u}}_t\\&\quad = \frac{\pi }{N\alpha } \mathbf {u^TP}\bar{\textbf{P}}\overline{\mathbf {P^{-1}F^{-1}HS_1FPu_t}} =\overline{\langle \mathbf {P^{-1}F^{-1}HS_1FPu_t, u} \rangle _h}. \end{aligned}$$

Similarly, we have

$$\begin{aligned} \langle \mathbf {(u^m)}_t, \textbf{u} \rangle _h =m \langle \mathbf {u^{m-1}u}_t, \textbf{u} \rangle _h =m\langle \mathbf {u^m}, \textbf{u}_t \rangle _h. \end{aligned}$$

Recall that the energy conservation (1.6) is obtained by taking the inner product of both sides of the gBO equation (1.1) with \(\left( -\mathcal {H}u_x+\frac{1}{m}u^m \right) \). On the other hand, the mass conservation (1.5) of the gBO equation (1.1) is obtained by first rewriting it in the form

$$\begin{aligned} u_t=\mathcal {H}u_{xx} -\frac{1}{m+1}\left( (u^m)_x+u^{m-1}u_x \right) , \end{aligned}$$

and then taking the inner product with u on both sides. When the equation is discretized with our rational basis functions, we note that \(u_x \approx \mathbf {P^{-1}F^{-1}S_1FPu}\). Thus, we can consider the matrix \(\mathbf {P^{-1}F^{-1}S_1FP}\) to be the approximation of the first order derivative operator \(\partial _x\). Similarly, we have the approximations of \(\mathcal {H}\partial _x\) and \(\partial _{xx}\) from \(\mathbf {P^{-1}F^{-1}HS_1FP}\) and \(\mathbf {P^{-1}F^{-1}S_2FP}\), respectively. Using this idea, we propose the following proposition for the spatial conservative discretizations.

Proposition 2.1

The following spatial semi-discretized equation to the gBO equation (1.1)

$$\begin{aligned} \textbf{u}_t=-\mathbf {P^{-1}F^{-1}S_1FP}\left( \mathbf {-P^{-1}F^{-1}HS_1FP u}+ \frac{1}{m}\mathbf {u^m}\right) \end{aligned}$$
(2.16)

conserves the spatial semi-discretized energy, i.e.,

$$\begin{aligned} \frac{d}{dt}E_h=0. \end{aligned}$$
(2.17)

On the other hand, the following spatial semi-discretized equation to the gBO equation (1.1)

$$\begin{aligned} \textbf{u}_t=\mathbf {P^{-1}F^{-1}HS_2FP u}- \frac{1}{m+1} \left( \textrm{diag}(\mathbf {u^{m-1}) P^{-1}F^{-1}S_1FP u} +\mathbf {P^{-1}F^{-1}S_1FPu^m} \right) \end{aligned}$$
(2.18)

conserves the spatial semi-discretized mass, i.e.,

$$\begin{aligned} \frac{d}{dt}M_h=0. \end{aligned}$$
(2.19)

Proof

Putting the Eq. (2.16) in (2.15) yields

$$\begin{aligned} \frac{d}{dt}E_h&={\text {Re}}\left( \langle \mathbf {P^{-1}F^{-1}HS_1FP u}- \frac{1}{m}\mathbf {u^m}, \mathbf {P^{-1}F^{-1}S_1FP} (\mathbf {P^{-1}F^{-1}HS_1FP u}- \frac{1}{m}\mathbf {u^m}) \rangle _h \right) \nonumber \\&=\frac{\pi }{\alpha N} {\text {Re}}\left[ \overline{ \left( \mathbf {P^{-1}F^{-1}S_1FP} (\mathbf {P^{-1}F^{-1}HS_1FP u}- \frac{1}{m}\mathbf {u^m}) \right) } ^{\textbf{T}} \textbf{W} \left( \mathbf {P^{-1}F^{-1}HS_1FP u}- \frac{1}{m}\mathbf {u^m} \right) \right] \nonumber \\&=\frac{\pi }{\alpha N} {\text {Re}}\left[ \overline{\left( \mathbf {P^{-1}F^{-1}HS_1FP u}- \frac{1}{m}\mathbf {u^m} \right) }^{\textbf{T}} \overline{(\mathbf {P^{-1}F^{-1}S_1FP})}^{\textbf{T}} \textbf{W} \left( \mathbf {P^{-1}F^{-1}HS_1FP u}- \frac{1}{m}\mathbf {u^m} \right) \right] \nonumber \\&= \frac{\pi }{\alpha N} {\text {Re}}\left[ \overline{ \left( \mathbf {P^{-1}F^{-1}HS_1FP u}- \frac{1}{m}\mathbf {u^m}\right) }^{\textbf{T}} \overline{\left( \bar{\textbf{P}}\mathbf {F^{-1}S_1FP} \right) }^{\textbf{T}} \left( \mathbf {P^{-1}F^{-1}HS_1FP u}- \frac{1}{m}\mathbf {u^m} \right) \right] =0, \end{aligned}$$
(2.20)

since \(\textbf{W}=\bar{\textbf{P}}\textbf{P}\), and the matrix \(\frac{1}{N} \bar{\textbf{P}}\mathbf {F^{-1}S_1FP}\) is anti-Hermitian, so that the real part of the corresponding quadratic form vanishes: \({\text {Re}}(\bar{\textbf{f}}^\textbf{T}{} \textbf{Df})=0\) if \(\textbf{D}\) is an anti-Hermitian matrix.

Similarly, we first note that \(\textbf{W}\textrm{diag}(\mathbf{u^{m-1}})=\textrm{diag}(\mathbf {u^{m-1}})\textbf{W}\), since both of them are diagonal matrices. Then,

$$\begin{aligned} \frac{d}{dt}M_h&=2\langle \mathbf {u_t}, \textbf{u} \rangle _h=2\langle \mathbf {P^{-1}F^{-1}HS_2FP u}, \textbf{u} \rangle _h \nonumber \\&\quad -\frac{2}{m+1} \langle \left( \textrm{diag}(\mathbf {u^{m-1}}) \mathbf{P^{-1}F^{-1}S_1FP u} +\mathbf { P^{-1}F^{-1}S_1FPu^m } \right) , \textbf{u} \rangle _h \nonumber \\&=0-\frac{2\pi }{(m+1) \alpha N}\left( (\mathbf {u^{m}})^\textbf{T} \bar{\textbf{P}}\mathbf {F^{-1}S_1FP u}+ \mathbf {u^T} \bar{\textbf{P}}\mathbf {F^{-1}S_1FP u^m} \right) =0. \end{aligned}$$
(2.21)

The identity holds for the following reasons. The first part, \(2\langle \mathbf {P^{-1}F^{-1}HS_2FP u}, \textbf{u} \rangle _h=0\) in (2.21), results from the quadratic form \(\frac{2\pi }{\alpha N}\bar{\textbf{u}}^\textbf{T}\bar{\textbf{P}}\mathbf {F^{-1}HS_2FP u}\) according to the definition (2.10), and the matrix \(\frac{1}{N} \bar{\textbf{P}}\mathbf {F^{-1}HS_2FP}\) is anti-Hermitian. For the second part, \((\mathbf {u^{m}})^\textbf{T} \bar{\textbf{P}}\mathbf {F^{-1}S_1FP u}+ \textbf{u}^\textbf{T} \bar{\textbf{P}}{} \mathbf{F^{-1}S_1FP u^m} =0\) in (2.21), we first note the identity

$$\begin{aligned} \mathbf {f^T} \bar{\textbf{P}}{} \mathbf{F^{-1}S_1FPg }= N(\bar{\textbf{F}}\bar{\textbf{P}}\textbf{f})^{\textbf{T}}\mathbf {S_1FPg} \end{aligned}$$

when \(\textbf{f}\) and \(\textbf{g}\) are real vectors. Then, taking \(\textbf{f}=\mathbf {u^m}\) and \(\textbf{g}=\textbf{u}\) yields \((\mathbf {u^{m}})^\textbf{T} \bar{\textbf{P}}{} \mathbf{F^{-1}S_1FP u}+ \mathbf {u^T} \bar{\textbf{P}}{} \mathbf{F^{-1}S_1FP u^m} =0\), since \(\mathbf {S_1}\) is anti-Hermitian.

\(\square \)
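Proposition 2.1 can be illustrated numerically by evaluating the right-hand sides of (2.16) and (2.18) for a given real grid function and checking that \(\frac{d}{dt}E_h\) from (2.15) and \(\frac{d}{dt}M_h\) vanish up to rounding. The sketch below makes the following assumptions: the matrices are formed explicitly (small N only), the entries of \(\textbf{P}\), \(\mathbf {P^{-1}}\) and \(\textbf{W}\) at the node at infinity are set to zero, and the real parts of the physical-space operators are used. The function name `build_operators` and the test profile are hypothetical.

```python
import numpy as np

def build_operators(N, alpha):
    """Return x, W and the real matrices D1 ~ d/dx, A ~ H d/dx, B ~ H d^2/dx^2."""
    idx = np.arange(-N // 2, N // 2)
    k = idx.astype(float)
    theta = 2 * np.pi * idx / N
    x = np.full(N, -np.inf)
    x[1:] = alpha * np.tan(theta[1:] / 2)
    P = np.zeros(N, dtype=complex)        # node at infinity masked out (assumption)
    P[1:] = alpha - 1j * x[1:]
    Pinv = np.zeros(N, dtype=complex)
    Pinv[1:] = 1.0 / P[1:]
    W = np.abs(P) ** 2
    F = np.exp(-1j * np.outer(idx, theta)) / N
    Finv = np.exp(1j * np.outer(theta, idx))
    S1 = 1j / (2 * alpha) * (np.diag(2 * k + 1) + np.diag(k[1:], -1) + np.diag(k[:-1] + 1, 1))
    S2 = -1.0 / (4 * alpha**2) * (
        np.diag(6 * k**2 + 6 * k + 2)
        + np.diag(4 * k[1:] ** 2, -1) + np.diag(4 * (k[:-1] + 1) ** 2, 1)
        + np.diag(k[2:] * (k[2:] - 1), -2) + np.diag((k[:-2] + 2) * (k[:-2] + 1), 2))
    Hd = -1j * np.sign(k + 0.5)           # diagonal of H, with sgn(0) = 1

    def physical(M):
        return (Pinv[:, None] * (Finv @ M @ F) * P[None, :]).real

    return x, W, physical(S1), physical(Hd[:, None] * S1), physical(Hd[:, None] * S2)

N, alpha, m = 16, 1.0, 3
x, W, D1, A, B = build_operators(N, alpha)

def inner_h(a, b):
    """Discrete inner product (2.10) for real vectors."""
    return np.pi / (N * alpha) * np.sum(W * a * b)

u = np.zeros(N)
u[1:] = (1.0 + 0.5 * x[1:]) / (1.0 + x[1:] ** 2) ** 2

# Energy-conservative form (2.16): u_t = D1 (A u - u^m / m).
g = A @ u - u**m / m
ut_energy = D1 @ g
dE_dt = inner_h(g, ut_energy)             # right-hand side of (2.15)

# Mass-conservative form (2.18).
ut_mass = B @ u - (u ** (m - 1) * (D1 @ u) + D1 @ u**m) / (m + 1)
dM_dt = 2 * inner_h(ut_mass, u)

print(dE_dt, dM_dt)
```

Both printed values should be at the level of rounding errors, whereas the mass is in general not conserved by the form (2.16), nor the energy by the form (2.18).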

3 Temporal and Full Discretization

In this section, we first discuss the temporal discretization, and then, the space-time full discretization of the gBO equation (1.1). We start with the most commonly used Crank–Nicolson-type scheme. After that, we consider the high order conservative schemes. This is achieved by the symplectic Runge–Kutta method, such as the Gauss–Legendre Runge–Kutta method. When considering the energy conservation, the scalar auxiliary variable (SAV) approach from [21, 65] will be incorporated. With this kind of approach, one can easily construct the conservative numerical scheme with arbitrarily high order accuracy in time.

3.1 Crank–Nicolson-Type Scheme

We first introduce the notations. Assume that our simulation is on the finite time interval \(t\in [0,T]\). Define \(\tau \) to be the time step and \(t_n=n\tau \) to be the time at the nth time step. Denote \(u^n \approx u(x,t_n)\) to be the semi-discretization in time. Denote \(I^n=I[u^n]\), \(M^n=M[u^n]\) and \(E^n=E[u^n]\) to be the momentum, mass and energy from (1.4)–(1.6) at time \(t=t_n\). For convenience, the half-time step is denoted as \(u^{n+\frac{1}{2}}=\frac{1}{2}(u^n+u^{n+1})\) from the linear interpolation. We also denote the full discretization by \(u^n_j \approx u(x_j,t_n)\), and the column vector \(\textbf{u}^n \approx u(\textbf{x},t_n)\). Now, we define the discrete first integral, mass and energy as follows:

$$\begin{aligned}&I_h^n=\langle \textbf{u}^n, \textbf{1} \rangle _h; \end{aligned}$$
(3.1)
$$\begin{aligned}&M_h^n=\langle \textbf{u}^n, \textbf{u}^n \rangle _h; \end{aligned}$$
(3.2)
$$\begin{aligned}&E_h^n=\frac{1}{2} \langle \mathbf {P^{-1}F^{-1} HS_1 FP} \textbf{u}^n, \textbf{u}^n \rangle _h -\frac{1}{m(m+1)} \langle (\textbf{u}^n)^{\textbf{m}}, \textbf{u}^n \rangle _h. \end{aligned}$$
(3.3)

We propose the following Crank–Nicolson-type mass-conservative scheme.

Theorem 3.1

The scheme

$$\begin{aligned} \frac{\textbf{u}^{n+1}-\textbf{u}^n}{\tau }&=\mathbf {P^{-1}F^{-1}HS_2FP u}^{n+\frac{1}{2}}\nonumber \\&\quad - \frac{1}{m+1} \left( \textrm{diag}\left( (\textbf{u}^{{n+\frac{1}{2}}})^{\mathbf {m-1}} \right) \mathbf {P^{-1}F^{-1}S_1FP} \textbf{u}^{n+\frac{1}{2}} +\mathbf {P^{-1}F^{-1}S_1FP}(\textbf{u}^{{n+\frac{1}{2}}})^{\textbf{m}} \right) \end{aligned}$$
(3.4)

conserves the discrete mass (3.2) exactly in time, i.e.,

$$\begin{aligned} M_h^{n+1}=M_h^{n}. \end{aligned}$$

Proof

The proof is straightforward. Taking the discrete inner product (2.10) of Eq. (3.4) with the vector \(\textbf{u}^{n+\frac{1}{2}}\), and using the identity (2.21) in Proposition 2.1, yields the result. \(\square \)

For the Crank–Nicolson-type energy-conservative scheme, by modifying the nonlinear term, we have the following theorem.

Theorem 3.2

The scheme

$$\begin{aligned} \frac{\textbf{u}^{n+1}-\textbf{u}^n}{\tau }&=\mathbf {P^{-1}F^{-1}S_1FP} \Big [ \mathbf {P^{-1}F^{-1}HS_1FP u}^{n+\frac{1}{2}} \nonumber \\&\quad - \frac{2}{m(m+1)} \textrm{diag}\left( \frac{(\textbf{u}^{n+1})^{\mathbf {m+1}}-(\textbf{u}^{n})^{\mathbf {m+1}}}{(\textbf{u}^{n+1})^{\textbf{2}}-(\textbf{u}^{n})^{\textbf{2}} } \right) \textbf{u}^{n+\frac{1}{2}} \Big ] \end{aligned}$$
(3.5)

conserves the discrete energy (3.3) exactly in time, i.e.,

$$\begin{aligned} E_h^{n+1}=E_h^{n}. \end{aligned}$$

Proof

Following the same idea as in Theorem 3.1, we take the discrete inner product (2.10) of Eq. (3.5) with the vector

$$\begin{aligned} \mathbf {P^{-1}F^{-1}HS_1FPu}^{n+\frac{1}{2}}- \frac{2}{m(m+1)} \textrm{diag}\left( \frac{(\textbf{u}^{n+1})^{\mathbf {m+1}}-(\textbf{u}^{n})^{\mathbf {m+1}}}{(\textbf{u}^{n+1})^{\textbf{2}}-(\textbf{u}^{n})^{\textbf{2}} } \right) \textbf{u}^{n+\frac{1}{2}}. \end{aligned}$$

Then, using the identity (2.20) in Proposition 2.1 yields the result. \(\square \)
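The two Crank–Nicolson-type schemes can be sketched as follows, under several assumptions that are not part of the theorems. The linear part is treated implicitly and the nonlinear part is resolved by a plain fixed-point iteration (any nonlinear solver could be used instead). The matrices are formed explicitly for a small N, with the node at infinity masked out. In the energy scheme, the product of the difference quotient with \(\textbf{u}^{n+\frac{1}{2}}\) is written in the division-free form \(\frac{1}{2}\sum _{i=0}^{m}(\textbf{u}^{n+1})^{i}(\textbf{u}^{n})^{m-i}\), and the nonlinear bracket is normalized so that it reduces to \(\frac{1}{m}\mathbf {u^m}\) when the two time levels coincide, consistent with (2.16). The names `build_operators` and `cn_step` are hypothetical.

```python
import numpy as np

def build_operators(N, alpha):
    """Return x, W and the real matrices D1 ~ d/dx, A ~ H d/dx, B ~ H d^2/dx^2."""
    idx = np.arange(-N // 2, N // 2)
    k = idx.astype(float)
    theta = 2 * np.pi * idx / N
    x = np.full(N, -np.inf)
    x[1:] = alpha * np.tan(theta[1:] / 2)
    P = np.zeros(N, dtype=complex)        # node at infinity masked out (assumption)
    P[1:] = alpha - 1j * x[1:]
    Pinv = np.zeros(N, dtype=complex)
    Pinv[1:] = 1.0 / P[1:]
    F = np.exp(-1j * np.outer(idx, theta)) / N
    Finv = np.exp(1j * np.outer(theta, idx))
    S1 = 1j / (2 * alpha) * (np.diag(2 * k + 1) + np.diag(k[1:], -1) + np.diag(k[:-1] + 1, 1))
    S2 = -1.0 / (4 * alpha**2) * (
        np.diag(6 * k**2 + 6 * k + 2)
        + np.diag(4 * k[1:] ** 2, -1) + np.diag(4 * (k[:-1] + 1) ** 2, 1)
        + np.diag(k[2:] * (k[2:] - 1), -2) + np.diag((k[:-2] + 2) * (k[:-2] + 1), 2))
    Hd = -1j * np.sign(k + 0.5)
    phys = lambda M: (Pinv[:, None] * (Finv @ M @ F) * P[None, :]).real
    return x, np.abs(P) ** 2, phys(S1), phys(Hd[:, None] * S1), phys(Hd[:, None] * S2)

def cn_step(u, tau, L, nonlinear, iters=200, tol=1e-14):
    """One step of (a - u)/tau = L (a + u)/2 + nonlinear(a, u), solved by fixed-point iteration."""
    I = np.eye(len(u))
    lhs = I - 0.5 * tau * L
    rhs_lin = (I + 0.5 * tau * L) @ u
    new = u.copy()
    for _ in range(iters):
        nxt = np.linalg.solve(lhs, rhs_lin + tau * nonlinear(new, u))
        done = np.max(np.abs(nxt - new)) < tol
        new = nxt
        if done:
            break
    return new

N, alpha, m, tau = 16, 1.0, 3, 1e-3
x, W, D1, A, B = build_operators(N, alpha)
inner_h = lambda a, b: np.pi / (N * alpha) * np.sum(W * a * b)
mass = lambda v: inner_h(v, v)
energy = lambda v: 0.5 * inner_h(A @ v, v) - inner_h(v**m, v) / (m * (m + 1))

def nonlinear_mass(a, b):                 # nonlinear part of scheme (3.4)
    mid = 0.5 * (a + b)
    return -(mid ** (m - 1) * (D1 @ mid) + D1 @ mid**m) / (m + 1)

def nonlinear_energy(a, b):               # division-free nonlinear part, see the lead-in
    G = sum(a**i * b ** (m - i) for i in range(m + 1)) / (m * (m + 1))
    return -(D1 @ G)

u0 = np.zeros(N)
u0[1:] = 1.0 / (1.0 + x[1:] ** 2)
um, ue = u0.copy(), u0.copy()
for _ in range(10):
    um = cn_step(um, tau, B, nonlinear_mass)
    ue = cn_step(ue, tau, D1 @ A, nonlinear_energy)

print(mass(um) - mass(u0), energy(ue) - energy(u0))
```

In this sketch the conservation holds up to the tolerance of the fixed-point iteration and rounding errors, since the conservation argument assumes that the nonlinear system is solved exactly.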

The construction of the conservative schemes in Theorems 3.1 and 3.2 is standard. If we only consider the semi-discretization in time, the scheme (3.4) is the midpoint rule, or the implicit 2nd order Runge–Kutta method (IRK2), which is also known as the symplectic (quadratic preserving) Runge–Kutta method, e.g., see [63], and thus, the quadratic quantity (mass) is conserved. We split the potential into the form

$$\begin{aligned} \frac{1}{m}(u^m)_x= \frac{1}{m+1}[u^{m-1}u_x +(u^m)_x] \end{aligned}$$

in order to create the symmetry needed for the mass conservation in the spatial discretization, which we discussed in the previous section. For the energy conservation, we need to reformulate the potential part, a technique that is widely used in the literature; see, e.g., [23, 43, 48] for the NLS case.

We next discuss the numerical schemes with higher order temporal accuracy. For simplicity and conciseness, we only consider the semi-discretization in time. The space-time full discretization results can be easily generalized together with the results from Sect. 2.

3.2 High Order Conservative Schemes

The high order temporal conservative schemes can be achieved by the symplectic Runge–Kutta (SRK) method. We first briefly review the RK method before showing our results. Consider the problem

$$\begin{aligned} u_t=f(u). \end{aligned}$$
(3.6)

From the time \(t=t_n\) to \(t=t_{n+1}\), let \(b_i\), \(a_{ij}\) (\(i,j=1,\cdots ,s\)) be real numbers, and let \(c_i=\sum _{j=1}^s a_{ij}\) be the collocation points. Denote by \(U_i\) the intermediate values approximating the solution of (3.6) at the intermediate times \(t^i=t_n+\tau c_i\). Then, the intermediate values \(U_i\) are calculated by

$$\begin{aligned} U_i=u^n+\tau \sum _{j=1}^s a_{ij}f_j, \end{aligned}$$
(3.7)

where \(f_i=f(U_i)\). The solution \(u^{n+1}\) is updated by

$$\begin{aligned} u^{n+1}=u^n+\tau \sum _{i=1}^s b_if_i. \end{aligned}$$
(3.8)

We usually write the coefficients \(\textbf{A}=(a_{ij})\), \(\textbf{b}=(b_1, b_2,\cdots ,b_s)\) and \(\textbf{c}=(c_1, c_2,\cdots ,c_s)^T\) in a Butcher tableau [18]:

$$\begin{aligned} \begin{array} {c|c} \textbf{c}&{} \textbf{A}\\ \hline &{} \textbf{b} \end{array}. \end{aligned}$$

For example, we list two commonly used Runge–Kutta methods as Butcher tableaus in Table 1. They are the s-stage Runge–Kutta methods with \(s=1,2\), respectively. These methods come from the Gauss–Legendre quadrature and are known as the IRK2 and IRK4 methods, since their temporal accuracy is of order 2 and 4, respectively. We use these methods in our numerical simulations in the next section. There are many other types of Runge–Kutta methods as well; we refer the interested reader to [6, 20, 28, 60, 61].

Table 1 Butcher tableaus of the s-stage Gauss–Legendre collocation Runge–Kutta methods with \(s=1,2\)

We prove the following theorem for the mass-conservative scheme.

Theorem 3.3

The s-stage symplectic (quadratic preserving) Runge–Kutta method, which satisfies

$$\begin{aligned} b_ia_{ij}+b_ja_{ji}=b_ib_j, \qquad \text{ for } \quad i,j=1,\cdots ,s, \end{aligned}$$
(3.9)

conserves the discrete mass exactly in time for the spatial discretized gBO equation (2.18), i.e.,

$$\begin{aligned} M_h^{n+1}=M_h^{n}. \end{aligned}$$

Proof

The proof is standard. From the standard RK theory (e.g., [20, 60]), we can show that the temporal semi-discretized scheme conserves the mass exactly in the discrete time flow. Indeed, the RK theory shows that

$$\begin{aligned} M^{n+1}-M^n= 2\tau \sum _{i=1}^s b_i \langle U_i, f(U_i) \rangle - \tau ^2 \sum _{i,j=1}^s (b_ia_{ij}+b_ja_{ji}-b_ib_j) \langle f(U_i), f(U_j) \rangle =0, \end{aligned}$$

since the second sum vanishes by (3.9), and \(\langle U_i, f(U_i) \rangle =0\) when f(U) is taken in the form of (1.1).

When considering the space-time full discretization, we have

$$\begin{aligned} M^{n+1}_h-M^n_h= 2\tau \sum _{i=1}^s b_i \langle \textbf{U}_i, f(\textbf{U}_i) \rangle _h- \tau ^2 \sum _{i,j=1}^s (b_ia_{ij}+b_ja_{ji}-b_ib_j) \langle f(\textbf{U}_i), f(\textbf{U}_j) \rangle _h=0, \end{aligned}$$

where the vector \(\textbf{U}_i\) is the discretized version of the intermediate value \(U_i\) for \(i=1, \cdots , s\), and \(\langle \textbf{U}_i, f(\textbf{U}_i) \rangle _h=0 \) by using the relation (2.21). \(\square \)
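As an illustration of Theorem 3.3, the following Python sketch checks the condition (3.9) for the standard 1-stage and 2-stage Gauss–Legendre coefficients (which we assume coincide with those of Table 1) and then integrates the scalar test problem \(u_t=i\omega u\), for which \(\langle u, f(u)\rangle =\textrm{Re}(\bar{u}\, i\omega u)=0\). The test problem and the function names are assumptions made for this example only; they are not the gBO discretization itself.

```python
import math

SQRT3 = math.sqrt(3.0)
# Standard Gauss-Legendre coefficients (assumed to match Table 1).
IRK2 = {"A": [[0.5]], "b": [1.0]}
IRK4 = {"A": [[0.25, 0.25 - SQRT3 / 6.0],
              [0.25 + SQRT3 / 6.0, 0.25]],
        "b": [0.5, 0.5]}


def symplecticity_defect(tab):
    """max over i,j of |b_i a_ij + b_j a_ji - b_i b_j|, cf. condition (3.9)."""
    A, b = tab["A"], tab["b"]
    s = len(b)
    return max(abs(b[i] * A[i][j] + b[j] * A[j][i] - b[i] * b[j])
               for i in range(s) for j in range(s))


def step_linear(tab, u, z):
    """One RK step for u' = lam * u, with z = tau * lam (s = 1 or s = 2)."""
    A, b = tab["A"], tab["b"]
    if len(b) == 1:
        U = [u / (1.0 - z * A[0][0])]
    else:
        # Solve (I - z A) U = (u, u)^T by Cramer's rule.
        m11, m12 = 1.0 - z * A[0][0], -z * A[0][1]
        m21, m22 = -z * A[1][0], 1.0 - z * A[1][1]
        det = m11 * m22 - m12 * m21
        U = [u * (m22 - m12) / det, u * (m11 - m21) / det]
    return u + z * sum(bi * Ui for bi, Ui in zip(b, U))


for name, tab in (("IRK2", IRK2), ("IRK4", IRK4)):
    u = 1.0 + 0.5j
    for _ in range(1000):
        u = step_linear(tab, u, 0.3j)   # tau * omega = 0.3
    print(name, symplecticity_defect(tab), abs(u) ** 2 - abs(1.0 + 0.5j) ** 2)
```

Both tableaus satisfy (3.9) up to rounding, and the quadratic quantity \(|u|^2\) stays constant over the 1000 steps up to accumulated rounding errors.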

The symplectic Runge–Kutta method cannot preserve the discrete energy. In order to construct the energy-preserving scheme, we need to reformulate the potential term in the same spirit as in (3.5). This is achieved by using the scalar auxiliary variable (SAV) approach from [21, 65]. We reformulate the Eq. (1.1) into an equivalent system as follows:

$$\begin{aligned} {\left\{ \begin{array}{ll} u_t=-\left( -\mathcal H u_{x}+\dfrac{1}{m} \frac{u^m v}{\sqrt{\langle u^m,u \rangle +C_0}} \right) _x,\\ v_t=\frac{m+1}{2\sqrt{\langle u^m,u \rangle +C_0}} \langle u^m ,u_t \rangle , \end{array}\right. } \end{aligned}$$
(3.10)

with the initial condition

$$\begin{aligned} u(x,0)=u_0, \qquad v_0=\sqrt{\langle u_0^m,u_0 \rangle +C_0}. \end{aligned}$$

Then, the energy of the system (1.1) is rewritten in the equivalent form

$$\begin{aligned} E[u(t), v(t)] := \frac{1}{2} \langle \mathcal {H}u_{x},u \rangle -\frac{1}{m(m+1)} (v^2 -C_0) \equiv E[u_0, v_0]. \end{aligned}$$
(3.11)

Here, we slightly abuse the notation \(E[u,v]\) to represent the modified energy for convenience, since it is equivalent to the energy E[u] in (1.6) in the continuous sense. The constant \(C_0\) is chosen to make sure that the term \(\langle u^m, u\rangle +C_0\) is positive for all time \(t \in [0,T]\). In the actual computation, \(C_0\) is adjustable during the time evolution, and thus we only need to choose the constant \(C_0\) such that \(\langle u^m, u\rangle +C_0>0\) in the time interval \(t\in [t_n,t_{n+1}]\). This is easily fulfilled, since we only consider solutions that are smooth in time. We will discuss the \(C_0\) adjustment process at the end of this subsection.

Denote \(v^n \approx v(t_n)\) to be the semi-discretization of v in time, and also \(v_h \approx v(\textbf{u})\) to be the semi-discretization of v in space. We write the space-time full discretization of v as \(v_h^n \approx v(\textbf{u}^n,t_n)\). The reformulated equation system (3.10) can be discretized by the rational basis functions into the following form

$$\begin{aligned} {\left\{ \begin{array}{ll} \textbf{u}_t=-\mathbf {P^{-1}F^{-1}S_1FP}\left( \mathbf {-P^{-1}F^{-1}HS_1FP u}+ \dfrac{1}{m}\frac{\mathbf {u^m} v_h}{\sqrt{\langle \mathbf {u^m},\textbf{u} \rangle _h+C_0}} \right) := f(\textbf{u},v_h),\\ (v_h)_t=\frac{m+1}{2\sqrt{\langle \mathbf {u^m} ,\textbf{u} \rangle _h+C_0}} \langle \mathbf {u^m} ,\textbf{u}_t \rangle _h := g(\textbf{u},v_h), \end{array}\right. } \end{aligned}$$
(3.12)

with the initial conditions

$$\begin{aligned} \textbf{u}^0=u(\textbf{x},0), \quad \quad v_h^0=\sqrt{\langle (\textbf{u}^0)^{\textbf{m}},\textbf{u}^0 \rangle _h+C_0}. \end{aligned}$$

The fully discrete modified energy is defined as follows

$$\begin{aligned} E_h^n=\frac{1}{2} \langle \mathbf {P^{-1}F^{-1} HS_1 FP} \textbf{u}^n, \textbf{u}^n \rangle _h -\frac{1}{m(m+1)} \left( (v_h^n)^2-C_0 \right) . \end{aligned}$$
(3.13)

Next, we prove the following theorem for the high order energy-conservative schemes.

Theorem 3.4

The s-stage symplectic Runge–Kutta method, which satisfies (3.9), conserves the \(L^1\)-type integral (1.4), mass (1.5) and modified energy (3.11) in the discrete time flow for the reformulated gBO equation system (3.10), i.e.,

$$\begin{aligned} I^{n+1}=I^n, \quad M^{n+1}=M^{n}, \quad \text{ and } \quad E^{n+1}=E^{n}. \end{aligned}$$
(3.14)

Furthermore, the symplectic Runge–Kutta method preserves the discrete energy (3.13) for the spatial semi-discretized system (3.12), i.e.,

$$\begin{aligned} E_h^n=E_h^{n-1}=\cdots =E_h^0. \end{aligned}$$
(3.15)

Proof

The proof for the conservation of the temporal semi-discretized first integral, mass and energy (3.14) is standard, e.g., see [20, 47, 60, 72].

For the proof of the discrete energy conservation (3.15), replacing the inner product by its discrete counterpart, straightforward calculations yield

$$\begin{aligned} E^{n+1}_h&=\frac{1}{2} \langle \mathbf {P^{-1}F^{-1} HS_1 FP} \textbf{u}^{n+1}, \textbf{u}^{n+1} \rangle _h -\frac{1}{m(m+1)} \left( (v_h^{n+1})^2-C_0 \right) \\&=\frac{1}{2} \langle \mathbf {P^{-1}F^{-1} HS_1 FP} (\textbf{u}^n+\tau \sum _{i=1}^{s} b_if(\textbf{U}_i,V_i)), \textbf{u}^n+\tau \sum _{i=1}^{s} b_if(\textbf{U}_i,V_i) \rangle _h\\&\quad -\frac{1}{m(m+1)} \left( (v_h^{n}+\tau \sum _{i=1}^{s} b_i g(\textbf{U}_i,V_i))^2-C_0 \right) \\&=E_h^n+ \tau \sum _{i=1}^s b_i \left( \langle \mathbf {P^{-1}F^{-1} HS_1 FP} f(\textbf{U}_i,V_i),\textbf{U}_i \rangle _h -\frac{2}{m(m+1)} V_i g(\textbf{U}_i,V_i) \right) \\&\quad +\tau ^2 \sum _{i,j=1}^s (b_ia_{ij}+b_ja_{ji}-b_ib_j) \left( \langle f(\textbf{U}_i,V_i),f(\textbf{U}_j,V_j) \rangle _h +g(\textbf{U}_i,V_i)g(\textbf{U}_j,V_j) \right) \\&=E^n_h \end{aligned}$$

by using the relations (2.20), (3.12) and (3.9), where \(V_i\) is defined as the intermediate value of \(v_h^n\) for (3.12), similarly to (3.7) and (3.8). \(\square \)

Remark 3.1

Comparing with the proof in [72, Theorem 3.1], the symplectic Runge–Kutta method can preserve all three quantities in the discrete time flow for the reformulated system (3.10). However, due to the limitation of the spatial discretization, only the discrete energy is preserved in the fully discrete sense. The conservation of the discrete momentum (1.4) for the gKdV equations in [72] is obtained by using the circulant and anti-symmetric property of the first order differentiation matrix from the Fourier pseudo-spectral discretization; the rational basis functions used here do not possess such a property.

The adjustment process for the constant \(C_0\) from [72] can be adapted here. Suppose that at \(t=t_n\), as the time evolves, the solution \(u^n\) leads to \(\int (u^n)^{m+1} dx +C_0<Tol\), where Tol is a given positive number (e.g., \(Tol=5\)). Then, we need to choose \(\tilde{C_0}\) to ensure \(\int (u^{n+1})^{m+1} dx +\tilde{C_0}>0\). The constant \(\tilde{C_0}\) can be chosen such that \(\int (u^n)^{m+1} dx +\tilde{C_0}> Tol\). For example, we can take \(\tilde{C_0}=10-\int (u^n)^{m+1} dx\), which leads to a new \(\tilde{v}^n \approx \sqrt{10}\) (and generally ensures the positivity of \(\int (u^{n+1})^{m+1} dx +\tilde{C_0}\)). Then, by requiring \(E[u^n,v^n]=E[u^n,\tilde{v}^n]\) in (3.11), we obtain the new \(\tilde{v}^n\)

$$\begin{aligned} \tilde{v}^n=\sqrt{(v^n)^2+\tilde{C_0}-C_0}. \end{aligned}$$
(3.16)

Finally, we substitute the \(v^n\) and \(C_0\) in (3.10) with \(\tilde{v}^n\) and \(\tilde{C_0}\), and then, continue with the time evolution for \(t=t_{n+1},t_{n+2}, \cdots \).
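The adjustment step can be sketched in a few lines of Python. This is a minimal illustration, assuming that the integral \(\int (u^n)^{m+1} dx\) has already been computed by some quadrature; the function name `adjust_sav` and the argument names are hypothetical, while the values \(Tol=5\) and the target 10 are the examples quoted above.

```python
import math


def adjust_sav(v, c0, nonlinear_integral, tol=5.0, target=10.0):
    """Return (v_new, c0_new) following the adjustment leading to (3.16).

    nonlinear_integral approximates int (u^n)^{m+1} dx (assumed given).
    If nonlinear_integral + c0 is still at least tol, nothing is changed.
    """
    if nonlinear_integral + c0 >= tol:
        return v, c0
    c0_new = target - nonlinear_integral          # e.g. C0~ = 10 - int u^{m+1}
    v_new = math.sqrt(v * v + c0_new - c0)        # formula (3.16)
    return v_new, c0_new


# Example with made-up numbers: int u^{m+1} dx = -2.5, C0 = 3, v^2 = 0.5.
v_old, c0_old = math.sqrt(0.5), 3.0
v_new, c0_new = adjust_sav(v_old, c0_old, -2.5)
print(v_new, c0_new)
print(v_new ** 2 - c0_new, v_old ** 2 - c0_old)   # the two values agree
```

The last line shows the design choice behind (3.16): the combination \(v^2-C_0\), which is the only way v and \(C_0\) enter the modified energy (3.11), is left unchanged by the adjustment.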

Remark 3.2

Note that \(v^2=\int u^{m+1} dx +C_0\) holds only at the collocation points \(t=t_n+\tau c_i\) for each \(i=1,2,\cdots ,s\) in \(t\in [t_n,t_{n+1}]\). However, the constant \(c_i\) may not necessarily be equal to 0 or 1, e.g., see Table 1. This means that \(v^2=\int u^{m+1} dx +C_0\) does not hold at \(t_n\) in the discrete time flow. Therefore, the new \(\tilde{v}^n\) can only be evaluated by (3.16) to keep the discrete energy (3.11) invariant.

4 Numerical Results

In this section, we present numerical examples for the proposed schemes. Before discussing the examples, we first describe a fixed point iteration solver from [21, 72], which can be easily adapted here for solving the nonlinear system resulting from the IRK methods, with the total computational cost on the order of \( \mathcal {O}(N\log (N))\) from the FFT. Since the solution to (1.1) is continuous in time, \(u^{n}\) is expected to be close to \(u^{n+1}\) for a sufficiently small time step \(\tau \). Thus, \(u^n\) is a good initial guess for the fixed point iteration. While we do not give an analytic proof of the convergence or conservation properties during the fixed point iteration process, the iteration always converges to the given accuracy level in our numerical experiments (usually less than \(10^{-12}\), which is our fixed point tolerance), and thus the mass or energy is preserved from \(u^n\) to \(u^{n+1}\). We also refer interested readers to a recent paper [7], which studies the convergence speed and hyperbolic conservation laws for different nonlinear algebraic system solvers resulting from the hyperbolic equations. Now, we take the IRK4 (\(s=2\)) case as an example to describe our fixed point iteration solver. Recall \(\mathbf {D_1}=\mathbf {P^{-1}F^{-1}S_1FP}\), and denote \(\mathbf {D_{H1}}=\mathbf {P^{-1}F^{-1}HS_1FP}\). The IRK4 scheme for (3.12) is rewritten as the system

$$\begin{aligned} {\left\{ \begin{array}{ll} \mathbf {U_i}=\textbf{u}^n+\tau \sum \limits _{j=1}^{2} a_{ij}\textbf{f}_j, \\ \Phi _i=\frac{(\mathbf {U_i})^{\textbf{m}}}{\sqrt{\langle (\mathbf {U_i})^{\textbf{m}},\mathbf {U_i} \rangle _h +C_0}},\quad g_i=\frac{m+1}{2}\langle \Phi _i,\textbf{f}_i\rangle _h, \\ V_i=v_h^n+\tau \sum \limits _{j=1}^{2} a_{ij}{g}_j, \quad i=1,2, \end{array}\right. } \end{aligned}$$
(4.1)

and

$$\begin{aligned} {\left\{ \begin{array}{ll} \textbf{f}_1=-\mathbf {D_1}\left( -\mathbf {D_{H1}} \Big (\textbf{u}^n+\tau \sum \limits _{j=1}^{2} a_{1j}\textbf{f}_j\Big ) +\frac{1}{m}\Phi _1V_1 \right) , \\ \textbf{f}_2=-\mathbf {D_1}\left( -\mathbf {D_{H1}} \Big (\textbf{u}^n+\tau \sum \limits _{j=1}^{2} a_{2j}\textbf{f}_j\Big ) +\frac{1}{m}\Phi _2V_2 \right) , \end{array}\right. } \end{aligned}$$
(4.2)

where the signs follow the semi-discrete system (3.12). Then, \(\textbf{u}^{n+1}\) and \(v_h^{n+1}\) can be updated by (3.8).

The system (4.1) and (4.2) can be solved by the fixed point iteration. At the lth iteration, the linear part is treated implicitly and the nonlinear part is evaluated at the previous iterate, which gives

$$\begin{aligned} {\left\{ \begin{array}{ll} \left( \textbf{I}-\tau a_{11} \mathbf {D_{H2}} \right) \textbf{f}_1^{l+1}-\tau a_{12} \mathbf {D_{H2}f}_2^{l+1} =\mathbf {D_{H2}u}^n-\frac{1}{m}\mathbf {D_1}(\Phi _1^lV_1^l),\\ -\tau a_{21} \mathbf {D_{H2}f}_1^{l+1} + \left( \textbf{I}-\tau a_{22} \mathbf {D_{H2}} \right) \textbf{f}_2^{l+1}=\mathbf {D_{H2}u}^n-\frac{1}{m}\mathbf {D_1}(\Phi _2^lV_2^l), \end{array}\right. } \end{aligned}$$
(4.3)

where \(\textbf{I}\) is the identity matrix, and \(\mathbf {D_{H2}}=\mathbf {D_1D_{H1}}=\mathbf {P^{-1}F^{-1}HS_2FP}\). The system (4.3) can first be transformed into a sparse system by multiplying by the diagonal matrix \(\textbf{P}\) and then applying the matrix operation \(\textbf{F}\) by FFT. Finally, standard sparse matrix solvers can be applied, since \(\mathbf {{HS_2}}\) is a sparse matrix. After obtaining \(\textbf{f}_i^{l+1}\), we can use (4.1) to obtain the values of the other variables \(U_i^{l+1}, \Phi _i^{l+1}, g_i^{l+1},\) and \( V_i^{l+1}\). The total computational cost remains on the same order.

We set \(\textbf{f}_{1}^{0}=\textbf{f}_{2}^{0}=f(\textbf{u}^n,v^n_h)\) to start the iteration and the fixed point iteration (4.3) converges in our numerical simulations, since the values from previous time step are usually close enough to the actual solution. The iteration terminates when

$$\begin{aligned} \max _{i} \Vert \textbf{f}_i^{l+1}-\textbf{f}_i^{l} \Vert _{\infty }<\epsilon , \end{aligned}$$

where \(i=1,2\), and \(\epsilon \) is typically set to be \(\epsilon =10^{-12}\) in our simulations.
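The structure of this iteration can be illustrated on a scalar toy problem. The Python sketch below is not the solver used for the gBO system: it assumes the model equation \(u_t=Lu+N(u)\) with a complex scalar \(L=i\omega \) standing in for the linear operator and \(N(u)=-i|u|^2u\) standing in for the nonlinear term (both chosen so that \(|u|^2\) is conserved), and the function name is hypothetical. As in the iteration described above, the linear part is treated implicitly, the nonlinear part is evaluated at the previous iterate, and the iteration stops when the change in the stage slopes falls below \(\epsilon \).

```python
SQRT3 = 3.0 ** 0.5
# 2-stage Gauss-Legendre (IRK4) coefficients.
A = [[0.25, 0.25 - SQRT3 / 6.0], [0.25 + SQRT3 / 6.0, 0.25]]
B = [0.5, 0.5]


def irk4_fixed_point_step(u, tau, lin, nonlin, eps=1e-12, max_iter=100):
    """One IRK4 step for u' = lin*u + nonlin(u); returns (u_new, iterations)."""
    # 2x2 matrix I - tau*lin*A of the implicit linear part (fixed over iterations).
    m11, m12 = 1.0 - tau * A[0][0] * lin, -tau * A[0][1] * lin
    m21, m22 = -tau * A[1][0] * lin, 1.0 - tau * A[1][1] * lin
    det = m11 * m22 - m12 * m21
    f = [lin * u + nonlin(u)] * 2            # initial guess f_1 = f_2 = f(u^n)
    for it in range(1, max_iter + 1):
        # Stage values built from the previous iterate.
        U = [u + tau * (A[i][0] * f[0] + A[i][1] * f[1]) for i in range(2)]
        # Right-hand side: known linear part plus lagged nonlinear part.
        r = [lin * u + nonlin(Ui) for Ui in U]
        # Solve the 2x2 linear system by Cramer's rule.
        f_new = [(r[0] * m22 - m12 * r[1]) / det,
                 (m11 * r[1] - m21 * r[0]) / det]
        change = max(abs(f_new[i] - f[i]) for i in range(2))
        f = f_new
        if change < eps:
            break
    else:
        raise RuntimeError("fixed point iteration did not converge")
    return u + tau * (B[0] * f[0] + B[1] * f[1]), it


u = 1.0 + 0.0j
for _ in range(200):
    u, iterations = irk4_fixed_point_step(
        u, 0.05, 2.0j, lambda w: -1j * abs(w) ** 2 * w)
print(abs(u), iterations)   # |u| stays at 1 up to the iteration tolerance
```

In the full scheme the scalar `lin` is replaced by the matrix \(\mathbf {D_{H2}}\), and the 2x2 solve becomes a block system handled through the FFT and a sparse solver, as described above.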

We denote by IRK2-MC and IRK4-MC the mass-conservative schemes for solving (2.18) by using the 1-stage and 2-stage RK methods with Gauss–Legendre collocation points from Table 1, which are 2nd and 4th order accurate in time, respectively. We also denote by IRK2-EC and IRK4-EC the energy-conservative schemes for solving the reformulated system (3.12), which are still 2nd and 4th order accurate in time. As a comparison, we use the commonly used 2nd order non-conservative semi-implicit Leap-Frog scheme as follows

$$\begin{aligned} \frac{\textbf{u}^{n+1}-\textbf{u}^{n-1}}{2 \tau } = -\mathbf {P^{-1}F^{-1}S_1FP}\left( -\mathbf {P^{-1}F^{-1}HS_1FP} \left( \frac{\textbf{u}^{n+1}+\textbf{u}^{n-1}}{2} \right) + \frac{1}{m}(\textbf{u}^{n})^{\textbf{m}} \right) , \end{aligned}$$
(4.4)

denoted as Leap-Frog. It is worth noting that the Leap-Frog scheme (4.4) may not be unconditionally stable. Indeed, if we consider the linear simplified model

$$\begin{aligned} u_t=ik^2u-\lambda ik u, \end{aligned}$$

where \(ik^2\) can be considered as the eigenvalues of the operator \(\mathcal {H}\partial _{xx}\) ranging from \(k=-N/2,\cdots ,N/2-1\), and similarly, ik is the eigenvalue of the operator \(\partial _x\), and \(\lambda \) is the constant for approximating the term \(\frac{1}{m} u^{m-1}\). The Leap-Frog scheme of this linear problem yields

$$\begin{aligned} \frac{u^{n+1}-u^{n-1}}{2\tau }-\frac{1}{2} ik^2(u^{n+1}+u^{n-1})+\lambda ik u^n=0. \end{aligned}$$

By taking \(z=k\tau \), its characteristic polynomial yields

$$\begin{aligned} (1-izk)\xi ^2+2iz\lambda \xi -(1+izk)=0, \end{aligned}$$

which gives two roots with respect to \(\xi \), i.e.,

$$\begin{aligned} \xi _{1,2}=\frac{1}{1-izk} \left( -iz\lambda \pm \sqrt{-z^2\lambda ^2+(1+z^2k^2)} \right) . \end{aligned}$$

The stability condition is that \(|\xi |\le 1\). One sufficient condition for \(|\xi |\le 1\) is

$$\begin{aligned} -z^2\lambda ^2+(1+z^2k^2)\ge 0, \end{aligned}$$
(4.5)

which yields \(|\xi |^2=1\) from direct calculation.

Thus, for each \(k=k_0\), when \(k_0^2 \ge \lambda ^2\), (4.5) is automatically fulfilled. However, when \(k_0^2 < \lambda ^2\), we need \(|z|\le \frac{1}{\sqrt{\lambda ^2-k_0^2}}\), and this yields the restriction on the time step:

$$\begin{aligned} \tau \le \frac{1}{|k_0| \sqrt{\lambda ^2-k_0^2}}. \end{aligned}$$

Since \(|k_0|\sqrt{\lambda ^2-k_0^2}\) reaches its maximum \(\lambda ^2/2\) at \(|k_0|=\frac{\lambda }{\sqrt{2}}\), the restriction holds for all \(k_0\) provided that

$$\begin{aligned} \tau \le 2/\lambda ^2. \end{aligned}$$

This indicates that when \(\lambda \) (or \(\Vert u^m\Vert _{L^{\infty }}\) in the nonlinear case we consider) increases, we need to shrink the time step \(\tau \) proportionally to \(\lambda ^{-2}\) to ensure the stability. In other words, the instability is driven by the constant \(\lambda \) from the nonlinear part rather than by the high frequencies (large values of |k|) from the linear part.
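The restriction can be checked numerically by computing the two roots of the characteristic polynomial directly. The following Python sketch does this for the linear model problem above; the parameter values (\(\lambda =10\), \(k=7\) and \(k=100\)) are chosen only for illustration, and the function names are hypothetical.

```python
import cmath


def leapfrog_roots(k, lam, tau):
    """Roots of (1 - i z k) xi^2 + 2 i z lam xi - (1 + i z k) = 0, z = k*tau."""
    z = k * tau
    a = 1.0 - 1j * z * k
    b = 2j * z * lam
    c = -(1.0 + 1j * z * k)
    disc = cmath.sqrt(b * b - 4.0 * a * c)
    return (-b + disc) / (2.0 * a), (-b - disc) / (2.0 * a)


def max_amplification(k, lam, tau):
    """Largest modulus of the two amplification factors."""
    return max(abs(r) for r in leapfrog_roots(k, lam, tau))


lam = 10.0
# For k = 7 the bound reads tau <= 1 / (7 * sqrt(100 - 49)), about 0.0200.
print(max_amplification(7, lam, 0.019))    # |xi| = 1: stable
print(max_amplification(7, lam, 0.021))    # |xi| > 1: unstable
print(max_amplification(100, lam, 0.021))  # k^2 >= lam^2: |xi| = 1 again
```

The last call illustrates the point made above: with the same time step, the high frequency \(k=100\) is harmless, while the low frequency \(k=7\) near \(\lambda /\sqrt{2}\) is the one that becomes unstable.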

We track the following quantities at \(t=t_n\) to check the accuracy:

$$\begin{aligned}&\mathcal {E}^n=\Vert u_{exact}^n-\textbf{u}^n\Vert _{\infty }; \end{aligned}$$
(4.6)
$$\begin{aligned}&\mathcal {E}_I^n=\max _{l<n}|I_h^l-I_h^0|; \end{aligned}$$
(4.7)
$$\begin{aligned}&\mathcal {E}^n_M=\max _{l<n}|M_h^l-M_h^0|; \end{aligned}$$
(4.8)
$$\begin{aligned}&\mathcal {E}^n_E=\max _{l<n}|E_h^l-E_h^0|. \end{aligned}$$
(4.9)
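For concreteness, these diagnostics can be computed from stored time series as in the short Python sketch below. It is a minimal illustration with hypothetical function names; it assumes that the discrete quantities \(I_h^l\), \(M_h^l\) or \(E_h^l\) have been recorded in a list at every time step, and it takes the running maximum up to and including the current step.

```python
def linf_error(u_num, u_ref):
    """Maximum pointwise difference between two grid functions, cf. (4.6)."""
    return max(abs(a - b) for a, b in zip(u_num, u_ref))


def running_max_deviation(values):
    """Running maximum of |values[l] - values[0]|, cf. (4.7)-(4.9)."""
    ref, current, out = values[0], 0.0, []
    for v in values:
        current = max(current, abs(v - ref))
        out.append(current)
    return out


# Made-up mass history, only to show the call.
mass_history = [2.0, 2.0 + 1e-13, 2.0 - 3e-13, 2.0 + 2e-13]
print(running_max_deviation(mass_history))
```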

When the SAV approach is not applied (IRK2-MC, IRK4-MC and Leap-Frog), the discrete energy \(E_h^n\) is computed from (3.3); when the SAV approach is applied, the discrete energy \(E_h^n\) is computed from the modified version (3.13). We mention here that it is easy to see the equivalence between the Crank–Nicolson scheme (3.4) and the IRK2-MC scheme. The energy-conservative Crank–Nicolson (CNEC) scheme (3.5), which reformulates the potential, also shares the same idea as the IRK2-EC scheme.

Now, we are ready to illustrate examples for our numerical simulations.

Example 1. Our first example considers the soliton solution for the BO (\(m=2\)) equation, \(u(x,t)=\frac{4c}{1+c^2(x-x_0-ct)^2}\). This type of solution comes from the smooth, positive, decaying at infinity solitary wave solution to the profile equation

$$\begin{aligned} \mathcal {H}Q_x+cQ-\frac{1}{m}Q^m=0, \end{aligned}$$
(4.10)

where c is a constant that determines the speed of the traveling wave as well as its magnitude. The solutions are expected to travel to the right as solitons, for example from the spectral stability result in [4, 36] and the inverse scattering theory [27, 33].
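That \(Q(x)=\frac{4c}{1+c^2x^2}\) solves the profile equation (4.10) for \(m=2\) can be verified with a few lines of Python. The sketch below relies on the classical Hilbert transform pair \(\mathcal {H}\big [\frac{1}{1+y^2}\big ](x)=\frac{x}{1+x^2}\) for the definition (1.2), which gives \(\mathcal {H}Q(x)=\frac{4c^2x}{1+c^2x^2}\), and it approximates the derivative by a central difference. The function names and the step `eps` are choices made for this example.

```python
def soliton(x, c):
    """BO soliton profile Q(x) = 4c / (1 + c^2 x^2)."""
    return 4.0 * c / (1.0 + (c * x) ** 2)


def hilbert_of_soliton(x, c):
    """Closed-form Hilbert transform of Q, using H[1/(1+y^2)](x) = x/(1+x^2)."""
    return 4.0 * c * c * x / (1.0 + (c * x) ** 2)


def profile_residual(x, c, eps=1e-5):
    """Residual of H Q_x + c Q - Q^2 / 2 = 0, i.e. (4.10) with m = 2."""
    hq_x = (hilbert_of_soliton(x + eps, c)
            - hilbert_of_soliton(x - eps, c)) / (2.0 * eps)
    q = soliton(x, c)
    return hq_x + c * q - 0.5 * q * q


grid = [-5.0 + 0.25 * j for j in range(41)]
print(max(abs(profile_residual(x, 2.0)) for x in grid))  # small, limited by eps
```

The residual is at the level of the finite difference error, which is consistent with the exact cancellation obtained when the three terms are combined over the common denominator \((1+c^2x^2)^2\).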

In our numerical simulations, we take \(\alpha =25\) with \(N=1024\). We take the traveling speed \(c=2\) and the starting point \(x_0=-20\). The time step \(\tau \) is taken to be \(\tau =\frac{1}{20}\) for all four IRK type methods as well as the Crank–Nicolson method, and \(\tau =\frac{1}{40}\) for the Leap-Frog scheme (4.4), since taking \(\tau =\frac{1}{20}\) leads to numerical instability for the Leap-Frog scheme in our computations. We stop our numerical simulation at \(T=20\).

Figure 1 shows the solution profile obtained from the IRK4-EC scheme. The left subplot is the initial condition \(u_0\), and the right subplot is the time evolution. One can see that the solution travels as a solitary wave, as previously observed in [32, 59] and as expected.

Fig. 1 The solution profile for Example 1 from the IRK4-EC. Left: \(u_0\). Right: \(u(x,t)\)

Fig. 2 The errors in Example 1 by different time integrators: IRK2-MC (solid blue); IRK2-EC (dash red); IRK4-MC (dash dot orange); IRK4-EC (dot purple); Leap-Frog (circle green); CNEC (dot blue). Top left: \(\Vert u-u_{exact}\Vert _{\infty }\). Top right: discrete momentum error. Bottom left: discrete mass error. Bottom right: discrete energy error (Color figure online)

Figure 2 tracks the results obtained from the different time integrators. The top left subplot in Fig. 2 shows \(\Vert \textbf{u}^n-u_{exact}\Vert _{\infty }\) with respect to time, where \(u_{exact}=u(\textbf{x},t_n)\) is the exact solution. One can see that the Leap-Frog scheme (4.4) has the largest error (see the green circle line). Moreover, the 4th order schemes (IRK4-MC and IRK4-EC) have better accuracy than the second order schemes (IRK2-MC, IRK2-EC and Leap-Frog), as expected from their higher order temporal accuracy. We also find that the IRK2-EC scheme performs better than the CNEC scheme and the IRK2-MC scheme. Furthermore, we observe that the energy-conservative schemes (dash red line for IRK2-EC and dot purple line for IRK4-EC) perform better than the mass-conservative schemes (blue solid line for IRK2-MC and orange dash-dot line for IRK4-MC).

The top right subplot in Fig. 2 tracks the error of the discrete first integral (4.7) at different times. One can see that the more accurate the time integrator is, the better this quantity is preserved, though it is not conserved exactly because of the rational basis function spatial discretization.

The bottom two subplots in Fig. 2 track the errors of the discrete mass (4.8) and energy (4.9), respectively. The mass-conservative schemes (IRK2-MC and IRK4-MC) keep the error of the discrete mass below the level of \(10^{-12}\), which is the tolerance of the fixed point iteration in solving the nonlinear system resulting from the implicit Runge–Kutta method. On the other hand, the discrete mass error for the other types of schemes is relatively large, especially for the Leap-Frog scheme (thus, referred to as the least accurate).

The bottom right subplot tracks the error of the discrete energy. It shows that the energy-conservative schemes (IRK2-EC and IRK4-EC) keep the error of the discrete energy below the level of \(10^{-12}\). This justifies the validity of our schemes. Among the other time integrators, the Leap-Frog scheme performs the worst even with a smaller time step \(\tau \); the two mass-conservative schemes keep the error of the discrete energy around the level of \(10^{-4}\).

We also list the \(L^{\infty }\) error \(\mathcal {E}^n\) at \(t=T\) with different time steps \(\tau \) for these five time integrators in Table 2. Denote the error at time \(t=T\) for the time step \(\tau \) by \(\mathcal {E}(\tau )=\Vert u(\textbf{x},T)-\textbf{u}(T)\Vert _{\infty }\), and define the rate as \(\text{ rate }=\frac{\mathcal {E}(2\tau )}{\mathcal {E}(\tau )}\). One can see that the errors of the Leap-Frog, IRK2-MC and IRK2-EC schemes decrease with a ratio around 4, as expected for second order schemes (\(4=2^2\)). On the other hand, the ratio for the IRK4-MC and IRK4-EC schemes is around \(2^4=16\), since they are 4th order methods. When the time step \(\tau \) is small, the ratio is slightly below 16 (see the last row in Table 2). This is probably because the temporal error becomes comparable to the spatial discretization error, which consequently affects the ratio.
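The rate defined above, together with the corresponding observed order \(\log _2(\text{rate})\), can be computed as in the following Python sketch. The error values used in the example are synthetic (exactly proportional to \(\tau ^4\)) and are not taken from Table 2; the function name is hypothetical.

```python
import math


def convergence_table(taus, errors):
    """Return rows (tau, ratio, observed order) for successive time steps.

    ratio = E(tau_prev) / E(tau) and order = log(ratio) / log(tau_prev / tau),
    which reduces to log2(ratio) when the time step is halved.
    """
    rows = []
    for i in range(1, len(taus)):
        ratio = errors[i - 1] / errors[i]
        order = math.log(ratio) / math.log(taus[i - 1] / taus[i])
        rows.append((taus[i], ratio, order))
    return rows


# Synthetic 4th order data, E(tau) = 3 * tau^4 (illustration only).
taus = [0.05, 0.025, 0.0125]
errors = [3.0 * t ** 4 for t in taus]
for tau, ratio, order in convergence_table(taus, errors):
    print(tau, ratio, order)
```

For measured data, a ratio that drifts below \(2^p\) at the smallest time steps typically signals that another error source, such as the spatial discretization mentioned above, has become comparable to the temporal error.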

Table 2 The convergence rates of Leap-Frog, IRK2-MC, IRK2-EC, IRK4-MC and IRK4-EC in Example 1
Table 3 The comparison between the IRK4-MC, RK4 and the RRK4-MC method

We also want to compare the numerical efficiency of these implicit methods with that of explicit methods, such as the standard explicit RK4 and the recently proposed mass-conservative relaxation Runge–Kutta methods from [38], e.g., the relaxation RK4 (RRK4-MC) method. These explicit methods need a time step \(\tau \) small enough to satisfy the CFL condition (\(\tau \sim 1/N^2\)). Thus, these explicit methods will usually have better accuracy than the proposed implicit methods, since the small time step is a necessity. On the other hand, a smaller time step means more time steps, resulting in more computational time. In Table 3, \(\tau _{\text{ max }}\) denotes the time step \(\tau \) that we use, chosen such that the numerical simulation fails if we take the time step to be \(2\tau \). Table 3 shows the numerical tests for Example 1. It shows that with the explicit methods we can take \(\tau =1.6e-3\) and obtain more accurate results in shorter computational time compared with the implicit methods (e.g., IRK4-MC). This shows that explicit time integrators are also an option for the simulation of the gBO type equations, since the linear operator \(\mathcal {H}\partial _{xx}\) is not as stiff as \(\partial _{xxx}\) in the KdV case. The RRK methods can also preserve the mass, or the energy after a slight change of the scheme (see [57, 58] for details). They are more efficient than the implicit schemes such as the IRK4-MC scheme in this BO case. However, besides having to change the time step \(\tau \) with the number of nodes N because of the CFL restriction, we also observe that a severe instability issue occurs for the explicit methods when the nonlinear power gets higher (see the case \(m=5\) in Example 3), and thus the implicit methods become more efficient in those cases.
Another concern about the RRK4 scheme is that it takes the non-uniform time step \(\gamma _n\tau \) (\(\gamma _n\) is the constant chosen so that the mass or energy is preserved) at each time step. The scheme may lose one order of accuracy if we want to interpret the solution at the given time \(t_n=n\tau \), which is called the incremental direction technique (IDT) method in this case. In summary, both the RRK schemes and the IRK schemes are good conservative schemes that may be useful in future numerical investigations of gBO type equations. The choice of the time integrator depends on the case at hand.

Example 2. We next consider the scattering solution for the BO equation (\(m=2\)) with the initial condition \(u_0=- 2 {\text {sech}}^2(x)\). Its KdV version has been studied for questions on the dispersion limit, see, e.g., [31, 40]. Here, we expect that a similar solution behavior may happen, since the BO equation only changes the dispersion term \(u_{xxx}\) from the KdV equation to \(\mathcal {H} u_{xx}\) (a smaller amount of dispersion if viewed on the Fourier frequency side). Note that a negative value for \(\int u^3 dx\) may occur, and thus the \(C_0\) adjustment process in (3.16) is used to keep v(t) positive and the algorithm applicable for all time. The exact solution is not explicitly available, due to the negative sign in the initial condition and the coefficients chosen.

In this example, we still take \(N=1024\) and \(\alpha =25\) for the spatial discretization. The time step is \(\tau =\frac{1}{400}\) (\(\tau =\frac{1}{800}\) for the Leap-Frog) and the stopping time is \(T=2\). We compute the reference solution \(u_{\textrm{ref}}\) by both the IRK4-MC and IRK4-EC methods independently with a very small time step (\(\tau =1/6400\)), denoted as \(u_{\mathrm {ref-MC}}\) and \(u_{\mathrm {ref-EC}}\), respectively. Since we intend to track the convergence rate with respect to time, to minimize the influence of the spatial discretization error, we use \(u_{\mathrm {ref-MC}}\) to compute the \(L^{\infty }\) error \(\Vert \textbf{u}^n-u_{\textrm{ref}} \Vert _{\infty } \) when \(\textbf{u}^n\) is obtained by the mass-conservative schemes (IRK2-MC and IRK4-MC), and use \(u_{\mathrm {ref-EC}}\) to compute the \(L^{\infty }\) error \(\Vert \textbf{u}^n-u_{\textrm{ref}}\Vert _{\infty }\) when \(\textbf{u}^n\) is obtained by the energy-conservative schemes (IRK2-EC and IRK4-EC) and the Leap-Frog scheme. The spatial discretization error accumulates as the time evolves, see Fig. 5. From Fig. 5, the difference between these two solutions increases to the level of \(10^{-6}\) by the time we terminate the simulation.

Figure 3 shows the solution profile obtained from the IRK4-EC method. The left subplot shows the solution profile at different times t. The right plot shows the solution at the terminal time \(t=2\). We can see that the solution radiates to the right with fast oscillations. On the other hand, compared with the similar type of solutions to the KdV case (e.g., in [39, 72]), the frequency is smaller. This indicates that the lower order dispersion (\(\mathcal {H}\partial _{xx}\) compared with \(\partial _{xxx}\)) generates slower oscillations.

Fig. 3 The solution profile in Example 2 from IRK4-EC. Left: \(u(x,t)\). Right: \(u(x,t)\) at \(t=2\)

Figure 4 tracks the \(L^{\infty }\) error and the errors of the discrete first integral, mass and energy with respect to time. One can see that the results are similar to the previous example, and also agree with our analysis in Sects. 2 and 3.

Fig. 4 The errors in Example 2 by different time integrators: IRK2-MC (solid blue); IRK2-EC (dash red); IRK4-MC (dash dot orange); IRK4-EC (dot purple); Leap-Frog (circle green). Top left: \(\Vert \textbf{u}^n-u_{\textrm{ref}}\Vert _{\infty }\). Top right: discrete momentum error. Bottom left: discrete mass error. Bottom right: discrete energy error

Fig. 5 The difference between the reference solutions obtained by the IRK4-EC scheme and the IRK4-MC scheme. One can see that the difference (mainly caused by the spatial discretization) keeps increasing to the level \(10^{-6}\)

Table 4 shows the \(L^{\infty }\) error at \(t=T\) with respect to the different time steps \(\tau \). The convergence is of 2nd order for the 2nd order schemes (Leap-Frog, IRK2-MC and IRK2-EC), and of 4th order for the 4th order schemes (IRK4-MC and IRK4-EC). Surprisingly, the IRK types of schemes (IRK2-MC and IRK2-EC, IRK4-MC and IRK4-EC) generate almost the same error (up to the decimals that we report) from the different reference solutions (\(u_{\mathrm {ref-MC}}\) and \(u_{\mathrm {ref-EC}}\)). This suggests that some cancellations may occur between the spatial discretization errors.

Table 4 The convergence rates of Leap-Frog, IRK2-MC, IRK2-EC, IRK4-MC and IRK4-EC in Example 2

Example 3. Our final example considers the mBO (\(m=3\)) and the gBO (\(m=4\)) cases. We take the initial condition \(u_0=0.99Q\), where Q is the soliton solution from (4.10) with \(c=1\). In these cases, while there is no explicit form for Q, the profile of Q can be obtained numerically, e.g., by the Petviashvili iteration from [55, 59], with its convergence analysis in [42, 44, 53]. From [26], when \(u_0 =0.99Q\), which indicates that the solution is below the mass-energy threshold, the solution is proven to exist globally in time. A recent numerical study in [59] shows that the solution blows up when \(u_0=1.01Q\). In this paper, we consider the globally existing solutions, and thus, we take \(u_0=0.99Q\) in our example. We take \(N=1024\), \(\alpha =25\), \(\tau =0.02\) (\(\tau =0.01\) for the Leap-Frog scheme due to the stability issue) in our simulation. We run until \(T=10\) for \(m=3\), and \(T=5\) for \(m=4\).

We also test the \(m=5\) case. In this case, we take \(N=512\), \(\alpha =10\), \(T=1\) and \(\tau =0.001\) for the IRK schemes and the Crank–Nicolson scheme. It is worth noting that these implicit schemes still suffer from numerical instability. Indeed, when taking \(N=1024\), we cannot obtain the solution even for \(\tau =1e-4\), and the solution starts to behave normally only from \(\tau =1e-5\) in this case. This is an interesting question but unfortunately, to the authors’ best knowledge, no nonlinear stability analysis has been carried out for this type of question so far. On the other hand, for the given \(N=512\) and \(\alpha =10\), the RRK4 method still fails even when we take \(\tau =1e-6\). Therefore, in the high nonlinearity case, the implicit schemes are recommended, with extra care.

Figures 6, 7 and 8 show the solution profiles obtained from the scheme IRK4-EC for \(m=3\), \(m=4\) and \(m=5\), respectively. The left subplots show the solution profiles \(u(x,t)\) at different times. The right subplots show the solution at the final time T (blue solid line) in comparison with the initial condition \(u_0\) (red dashed line). For \(m=3\), the solution travels to the right with some radiation scattering to the left. However, for the \(m=4,5\) cases, the solution completely radiates to the left. Indeed, \(m=3\) is the \(L^2\)-critical case, while \(m=4,5\) are \(L^2\)-supercritical cases. They fall into different categories despite the similar form of the nonlinearity. These different scattering behaviors deserve further study from both the numerical and the analytical points of view.

Fig. 6 The solution profile in Example 3 from IRK4-EC with \(m=3\). Left: \(u(x,t)\). Right: \(u(x,t)\) at \(t=10\)

Fig. 7 The solution profile in Example 3 from IRK4-EC with \(m=4\). Left: \(u(x,t)\). Right: \(u(x,t)\) at \(t=5\)

Fig. 8 The solution profile in Example 3 from IRK4-EC with \(m=5\). Left: \(u(x,t)\). Right: \(u(x,t)\) at \(t=1\)

Figures 9, 10 and 11 track the errors of the discrete first integral (left subplot), mass (middle subplot) and energy (right subplot) over time for \(m=3,4,5\), respectively. Again, the discrete mass or energy is preserved by choosing the mass-conservative or the energy-conservative scheme, respectively, which agrees with the analysis in Sects. 2 and 3.
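For readers who wish to monitor such quantities in their own experiments, the sketch below evaluates discrete analogues of the three invariants for grid samples of u. It is written under stated assumptions and is not the quadrature rule of this paper: a periodic Fourier grid on a truncated interval \([-L,L)\) is used instead of the rational basis, the energy is assumed to take the form \(E[u]=\frac{1}{2}\int u\,\mathcal Hu_x\,dx-\frac{1}{m(m+1)}\int u^{m+1}dx\), and the function name `gbo_invariants` is hypothetical.

```python
import numpy as np

def gbo_invariants(u, L, m):
    """Approximate I, M, E from samples of u on a uniform periodic grid of [-L, L).

    Assumptions: periodic trapezoid rule, and the energy
    E = 1/2 * int(u * H u_x) - 1/(m*(m+1)) * int(u**(m+1)).
    """
    N = u.size
    dx = 2.0 * L / N
    k = 2.0 * np.pi * np.fft.fftfreq(N, d=dx)
    # H d/dx acts as multiplication by |k| in Fourier space
    Hux = np.real(np.fft.ifft(np.abs(k) * np.fft.fft(u)))
    I = dx * np.sum(u)
    M = dx * np.sum(u**2)
    E = dx * np.sum(0.5 * u * Hux - u**(m + 1) / (m * (m + 1)))
    return I, M, E

# Check on the profile 4/(1+x^2) with m = 2; on the whole line the
# integrals above evaluate to I = 4*pi, M = 8*pi, E = -2*pi.
L, N = 200.0, 8192
x = -L + 2.0 * L * np.arange(N) / N
I0, M0, E0 = gbo_invariants(4.0 / (1.0 + x**2), L, m=2)
print(I0, M0, E0)
```

The errors plotted in figures of this type are then differences such as \(|M[u^n]-M[u^0]|\) evaluated at each time level. Note that the first integral converges slowly under domain truncation for algebraically decaying data, since the neglected tails are of order \(1/L\).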

Fig. 9 The errors in Example 3 (\(m=3\)) by different time integrators: IRK2-MC (solid blue); IRK2-EC (dash red); IRK4-MC (dash dot orange); IRK4-EC (dot purple); Leap-Frog (circle green). Left: discrete momentum error. Middle: discrete mass error. Right: discrete energy error (Color figure online)

Fig. 10 The errors in Example 3 (\(m=4\)) by different time integrators: IRK2-MC (solid blue); IRK2-EC (dash red); IRK4-MC (dash dot orange); IRK4-EC (dot purple); Leap-Frog (circle green). Left: discrete momentum error. Middle: discrete mass error. Right: discrete energy error (Color figure online)

Fig. 11 The errors in Example 3 (\(m=5\)) by different time integrators: IRK2-MC (solid blue); IRK2-EC (dash red); IRK4-MC (dash dot orange); IRK4-EC (dot purple); CNEC (dot green). Left: discrete momentum error. Middle: discrete mass error. Right: discrete energy error (Color figure online)

5 Conclusion and Other Discussion

The rational basis functions are powerful basis functions for spectral spatial discretization. Besides being eigenfunctions of the Hilbert transform, admitting the fast Fourier transform and yielding sparse differentiation matrices, they allow spatially conservative schemes to be constructed from their (anti-)Hermitian properties. A quadrature rule for this spectral method is also proposed.

Combined with conservative time integrators, such as the classical symplectic Runge–Kutta schemes, the SRK schemes with the scalar auxiliary variable reformulation, or the Relaxation Runge–Kutta schemes, arbitrarily high order mass-conservative or energy-conservative numerical schemes can be constructed. The explicit Relaxation Runge–Kutta schemes can be good conservative time integrators for the BO (\(m=2\)) equation, but as the power of the nonlinearity increases, the numerically observed nonlinear instability suggests using the implicit schemes instead.

This pseudo-spectral approach can be extended to construct conservative schemes for the gKdV equations, i.e.,

$$\begin{aligned} u_t=-u_{xxx}-\frac{1}{m}(u^m)_x; \end{aligned}$$

and also the mass-energy conservative (structure-preserving) schemes for the NLS equations, i.e.,

$$\begin{aligned} u_t=i\left( u_{xx}+|u|^{m-1}u \right) . \end{aligned}$$

For example, the spatially semi-discretized form for the gKdV equation, corresponding to Eq. (3.12), will be

$$\begin{aligned} {\left\{ \begin{array}{ll} \textbf{u}_t=-\mathbf {P^{-1}F^{-1}S_1FP}\left( \mathbf {-P^{-1}F^{-1}S_2FP u}+ \dfrac{1}{m}\frac{\mathbf {u^m} v_h}{\sqrt{\langle \mathbf {u^m},\textbf{u} \rangle _h+C_0}} \right) ,\\ (v_h)_t=\frac{m+1}{2\sqrt{\langle \mathbf {u^m} ,\textbf{u} \rangle _h+C_0}} \langle \mathbf {u^m} ,\textbf{u}_t \rangle _h. \end{array}\right. } \end{aligned}$$
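
As a short consistency check of the factor \(\frac{m+1}{2}\) in the second equation, assume (as is standard for the scalar auxiliary variable approach) that \(v_h=\sqrt{\langle \mathbf {u^m},\textbf{u} \rangle _h+C_0}\), and that \(\langle \cdot ,\cdot \rangle _h\) is a weighted sum over the nodes with time-independent weights \(w_j\) and componentwise powers, so that \(\langle \mathbf {u^m},\textbf{u} \rangle _h=\sum _j w_j u_j^{m+1}\). Under these assumptions the chain rule gives

$$\begin{aligned} (v_h)_t=\frac{1}{2\sqrt{\langle \mathbf {u^m},\textbf{u} \rangle _h+C_0}}\,\frac{d}{dt}\sum _j w_j u_j^{m+1} =\frac{m+1}{2\sqrt{\langle \mathbf {u^m},\textbf{u} \rangle _h+C_0}}\,\langle \mathbf {u^m},\textbf{u}_t \rangle _h. \end{aligned}$$

In particular, the ratio \(v_h/\sqrt{\langle \mathbf {u^m},\textbf{u} \rangle _h+C_0}\) remains equal to 1 along the exact semi-discrete flow whenever it equals 1 initially.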

Similarly, the semi-discretized form for the NLS equation will be

$$\begin{aligned} {\left\{ \begin{array}{ll} \textbf{u}_t=i\left( \mathbf {-P^{-1}F^{-1}S_2FP u}+ \frac{\mathbf {|u|^{m-1}u} v_h}{\sqrt{\langle \mathbf {|u|^{m-1}u},\textbf{u} \rangle _h+C_0}} \right) ,\\ (v_h)_t=\frac{m}{2\sqrt{\langle \mathbf {|u|^{m-1}u} ,\textbf{u} \rangle _h+C_0}} {\text {Re}}\left( \langle \mathbf {|u|^{m-1}u} ,\textbf{u}_t \rangle _h \right) . \end{array}\right. } \end{aligned}$$

A straightforward adaptation of the proof of Theorem 3.4 shows the energy-conservative result for the gKdV equations, and the mass- and energy-preserving result for the NLS equations. These results can also be easily extended to higher dimensions (e.g., the Zakharov–Kuznetsov equation or the d-dimensional NLS equation) by applying the tensor product in the spatial discretization. We omit the proofs and numerical examples here for conciseness.

In summary, by applying the rational basis functions, the conservative schemes illustrated above increase the computational efficiency significantly, especially in tracking the long time behavior of solutions, or slowly decaying solutions (e.g., \(u_0=\frac{1}{1+x^2}\)), since far fewer nodes are needed compared with the traditional domain truncation approaches.