This chapter presents the construction of arbitrary order extended Runge–Kutta–Nyström (ERKN) integrators. In general, ERKN methods are more effective than traditional Runge–Kutta–Nyström (RKN) methods in dealing with oscillatory Hamiltonian systems. However, the theoretical analysis for ERKN methods, such as the order conditions, the symplecticity conditions and the symmetry conditions, is much more complicated than that for RKN methods, and constructing high-order ERKN methods efficiently therefore remains a bottleneck. This chapter first establishes the ERKN group \(\varOmega \) for ERKN methods and the RKN group G for RKN methods, respectively, and then shows that ERKN methods are a natural extension of RKN methods. That is, there exists an epimorphism \(\eta \) of the ERKN group \(\varOmega \) onto the RKN group G. This epimorphism gives a global insight into the structure of the ERKN group through the analysis of its kernel and the corresponding RKN group G. We also establish a particular mapping \(\varphi \) of G into \(\varOmega \) such that each image element is an ideal representative element of its congruence class in \(\varOmega \). Furthermore, an elementary theoretical analysis shows that this mapping \(\varphi \) preserves many structural properties, such as the order, the symmetry and the symplecticity. From the epimorphism \(\eta \) together with its section \(\varphi \), we may gain knowledge about the structure of the ERKN group \(\varOmega \) through the RKN group G.

6.1 Introduction

We are concerned in this chapter with initial value problems (IVP) of second-order oscillatory differential equations

$$\begin{aligned} \left\{ \begin{aligned}&y''(t)+My(t)=f(y(t)), \\&y(t_0)=y_0,\ y'(t_0)=y'_0, \end{aligned}\right. \end{aligned}$$
(6.1)

with M a (symmetric) positive semi-definite matrix and \(\Vert M\Vert \gg 1\), which frequently arise in many aspects of scientific and engineering computing, such as celestial mechanics, theoretical physics, chemistry and electronics. Effective numerical methods for solving this type of problem are of great importance (see, e.g. [4, 7,8,9,10, 13, 14]). Using the oscillatory structure introduced by the linear term My in (6.1), Yang et al. [34] proposed extended Runge–Kutta–Nyström (ERKN) methods. Much research effort has been devoted to ERKN methods, which show notable efficiency and higher accuracy than traditional Runge–Kutta–Nyström (RKN) methods in dealing with (6.1) (see, e.g. [28, 29, 31, 32, 35, 36]). It is clear that (6.1) becomes a Hamiltonian system once \(f(y)=-\nabla U(y),\) where U(y) is a smooth potential function. The symmetry conditions and symplecticity conditions for ERKN methods have also been investigated [15, 16, 25, 27, 30]. However, it is very difficult to obtain a high-order ERKN method with important structural properties, even though the order conditions, the symmetry conditions and the symplecticity conditions have been well established.

On the one hand, we know an important property of ERKN methods: when \(M\rightarrow \mathbf {0}\), each ERKN method reduces to a classical RKN method. This property implies that there exists an intrinsic relation between ERKN and RKN methods. On the other hand, the structural properties such as symmetry and symplecticity of RKN methods have been studied by many authors and very useful results have been achieved [1,2,3, 19,20,21, 23, 24, 26]. Taking account of these two points, in this chapter we attempt to clarify this intrinsic relation between ERKN and RKN methods by introducing an epimorphism \(\eta \) from the ERKN group \(\varOmega \) to the RKN group G. In particular, we establish a particular mapping \(\varphi \) from G to \(\varOmega \). Consequently, the properties of ERKN methods, including the order, the symmetry, and the symplecticity, are inherited from the classical RKN methods via the mapping \(\varphi \).

The plan of this chapter is as follows. In Sect. 6.2 we briefly review the classical RKN methods and then construct the RKN group G. In Sect. 6.3, the theories associated with the ERKN group \(\varOmega \) are established. In particular, we show that there exists an epimorphism \(\eta \) from \(\varOmega \) to G. In Sect. 6.4, we address the particular mapping \(\varphi \) from G to \(\varOmega \) in detail. It turns out that this mapping preserves the order, the symplecticity, and, under a mild assumption, the symmetry. In Sect. 6.5 we carry out some numerical experiments for the high-order structure-preserving ERKN methods derived from the theoretical analysis in Sect. 6.4. The last section is concerned with conclusions and discussions.

6.2 Classical RKN Methods and the RKN Group

This section begins with an overview of the results on classical RKN methods for second-order initial value problems

$$\begin{aligned} \left\{ \begin{aligned}&y''(t)=f(y(t)), \\&y(t_0)=y_0,\ y'(t_0)=y'_0, \end{aligned}\right. \end{aligned}$$
(6.2)

where the right-hand-side function f does not depend on the derivative \(y'\) and time t. As is well known, to approximate this autonomous system more efficiently than with traditional Runge–Kutta (RK) methods, the so-called Runge–Kutta–Nyström (RKN) methods were proposed [18]. An s-stage classical RKN method with a stepsize h for the problem (6.2) is defined as

$$\begin{aligned} \left\{ \begin{aligned}&Y_{i}=y_{n} + c_{i}hy'_{n} + h^{2}\sum _{j=1}^{s}a_{ij}f(Y_{j}), \quad i=1,\ldots ,s, \\&y_{n+1}=y_{n} + hy'_{n} + h^{2}\sum _{i=1}^{s}\bar{b}_{i}f(Y_{i}), \\&y'_{n+1}=y'_{n} + h\sum _{i=1}^{s}b_{i}f(Y_{i}), \end{aligned} \right. \end{aligned}$$
(6.3)

where \(a_{ij},\bar{b}_{i},b_{i}\) for \(i,j=1,\ldots ,s\) are real constants. Usually, the RKN method (6.3) can be briefly expressed in a Butcher Tableau

$$\begin{aligned} \begin{array}{ccc} \begin{array}{c|c} {{ c}} &{} {A }\\ \hline \\ { } &{}{{\bar{b}}^{\intercal }} \\ \hline { } &{}{{b}^{\intercal }} \end{array} &{}{\doteq }&{} \begin{array}{ c|ccc} c_1 &{}a_{11} &{} \cdots &{}a_{1s} \\ \vdots &{}\vdots &{} &{}\vdots \\ c_s &{} a_{s1} &{} \cdots &{}a_{ss} \\ \hline \\ &{}{\bar{b}_1} &{}{ \cdots } &{}{\bar{b}_s} \\ \hline &{}{b_1} &{}{ \cdots } &{}{b_s} \end{array} \end{array}\,. \end{aligned}$$
(6.4)
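To make the scheme (6.3) concrete, the following sketch advances one step of an explicit RKN method, assuming a scalar problem and a strictly lower-triangular coefficient matrix A; the two-stage tableau used in the test is an illustrative placeholder, not a method advocated in this chapter.

```python
# One step of an explicit s-stage RKN method (6.3) for a scalar problem
# y'' = f(y); A is assumed strictly lower triangular, so each stage Y_i
# depends only on previously computed stage values.
def rkn_step(y, yp, h, c, A, bbar, b, f):
    s = len(c)
    F = []  # F[j] stores f(Y_j)
    for i in range(s):
        Yi = y + c[i] * h * yp + h**2 * sum(A[i][j] * F[j] for j in range(i))
        F.append(f(Yi))
    y_new = y + h * yp + h**2 * sum(bbar[i] * F[i] for i in range(s))
    yp_new = yp + h * sum(b[i] * F[i] for i in range(s))
    return y_new, yp_new
```

For the constant forcing f(y) = 1, any tableau with \(\sum _i\bar{b}_i=1/2\) and \(\sum _ib_i=1\) reproduces the exact solution \(y_0+hy'_0+h^2/2\) in one step.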

In order to establish an RKN group conveniently, we will specify an RKN method \(\varPhi \) with a stepsize h by \(\varPhi _{h}\). Then \(\varPhi _{\gamma h}\) and \(\varPhi _{\beta h}\) are regarded as two different elements once \(\gamma \ne \beta \), even though they share the same coefficients. To construct a group related to RKN methods, a binary composition is needed. Similarly to the approach of Hairer and Wanner [11] in 1974, we consider the composition of two RKN methods, making allowance for their corresponding stepsizes.

We then introduce the following definition.

Definition 6.1

Suppose that \(\varPhi _h\) is an s-stage RKN method defined by (6.3) for the problem (6.2). Then \(\varPhi '_h\) is called the essential 0-stepsize form of \(\varPhi _h\) if the formula for \(\varPhi '_h\) reads

$$\begin{aligned} \left\{ \begin{aligned}&Y_{i}=y_{n} + c_{i}hy'_{n} + h^{2}\sum _{j=1}^{s}a_{ij}f(Y_{j}), \quad i=1,\ldots ,s, \\&y_{n+1}=y_{n} + h^{2}\sum _{i=1}^{s}\bar{b}_{i}f(Y_{i}), \\&y'_{n+1}=y'_{n} + h\sum _{i=1}^{s}b_{i}f(Y_{i}). \end{aligned} \right. \end{aligned}$$
(6.5)

Accordingly, \(\varPhi _h\) is called an h-stepsize form of \(\varPhi '_h\).

Remark 6.1

By Definition 6.1, the only difference from (6.3) lies in the second equation, and this difference is essential. That is, the equation

$$y_{n+1}=y_{n} + hy'_{n} + h^{2}\sum _{i=1}^{s}\bar{b}_{i}f(Y_{i})\ ,$$

for \(\varPhi _h\) has been changed into

$$y_{n+1}=y_{n} + h^{2}\sum _{i=1}^{s}\bar{b}_{i}f(Y_{i})\ ,$$

in (6.5) for \(\varPhi '_h\). This means that the numerical solution \((y_{n+1},y'_{n+1})=\varPhi _{ h}(y_{n},y'_{n})\) approximates the exact solution at \(t_n+h\), whereas \((y_{n+1},y'_{n+1})=\varPhi '_h(y_{n},y'_{n})\) only approximates \((y(t_{n}),y'(t_{n}))\) at \(t_n\). It is noted that if \(\varPsi _h\) is a classical RKN method, then \(\varPsi _{0\cdot h}\) is just the identity I. This implies that an RKN method \(\varPsi _{0\cdot h}\) with the stepsize 0 is totally different from its essential 0-stepsize form \(\varPsi '_h\) under our new definition.

Suppose that \(\varPhi ^{1}_h\) and \(\varPhi ^{2}_h\) are two RKN methods with \(s_1\) stages and \(s_2\) stages, respectively. Their coefficients are respectively denoted by \(c=(c_1,\ldots ,c_{s_1})^{\intercal }, b=(b_1,\ldots ,b_{s_1})^{\intercal }, \bar{b}=(\bar{b}_1,\ldots ,\bar{b}_{s_1})^{\intercal }, A=\big (a_{ij}\big )_{s_1\times s_1}\) and \(c^{*}=(c_1^{*},\ldots ,c_{s_2}^{*})^{\intercal }, b^{*}=(b_1^{*},\ldots ,b_{s_2}^{*})^{\intercal }, \bar{b}^{*}=(\bar{b}_1^{*},\ldots ,\bar{b}_{s_2}^{*})^{\intercal }, A^{*}=\big (a_{ij}^{*}\big )_{s_2\times s_2}\). We next consider the composition of \(\varPhi _{\gamma h}^{1}\) and \(\varPhi _{\beta h}^{2}\). Taking \((y_0,y'_0)\) as the starting value at \(t_0\) and \((y_1,y'_1)\) as the updated value after one step, we can express the composition law of \((y_1,y'_1)=\big (\varPhi _{\beta h}^{2} \circ \varPhi _{\gamma h}^{1}\big )(y_0,y'_0)\) as

$$\begin{aligned} \left\{ \begin{aligned}&Y_i=y_0 + \gamma c_{i}hy'_0 + \gamma ^{2}h^{2}\sum _{j=1}^{s_1}a_{ij} f(Y_{j}), \quad i=1,\ldots ,s_{1}, \\&\widetilde{y}_1=y_0 + \gamma hy'_0 + \gamma ^{2}h^{2}\sum _{i=1}^{s_1}\bar{b}_{i} f(Y_{i}), \\&\widetilde{y}'_1=y'_0 + \gamma h\sum _{i=1}^{s_1}b_{i}f(Y_{i}), \\&\widetilde{Y}_k=\widetilde{y}_1 + \beta c_{k}^{*}h\widetilde{y}'_1 + \beta ^{2}h^{2}\sum _{j=1}^{s_2}a_{kj}^{*}f(\widetilde{Y}_{j}), \quad k=1,\ldots ,s_{2}, \\&y_1=\widetilde{y}_1 + \beta h\widetilde{y}'_1 + \beta ^{2}h^{2} \sum _{i=1}^{s_2}\bar{b}_{i}^{*}f(\widetilde{Y}_i), \\&y'_1=\widetilde{y}'_1 + \sum _{i=1}^{s_2}\beta hb_{i}^{*} f(\widetilde{Y}_i). \end{aligned} \right. \end{aligned}$$
(6.6)

Canceling \(\widetilde{y}_1\) and \(\widetilde{y}'_1\) from (6.6), we obtain the following simplified form

$$\begin{aligned} \left\{ \begin{aligned}&Y_i=y_0 + \gamma c_{i}hy'_0 + h^{2}\sum _{j=1}^{s_1}\gamma ^{2}a_{ij} f(Y_{j}), \quad i=1,\ldots ,s_{1} \\&\widetilde{Y}_k=y_0 + (\gamma + \beta c_{k}^{*})hy'_0 +h^2\Big (\sum _{j=1}^{s_1}(\gamma ^2\bar{b}_j+\gamma \beta c_k^{*}b_j) f(Y_j) + \sum _{j=1}^{s_2}\beta ^{2}a_{kj}^{*}f(\widetilde{Y}_{j})\Big ), \quad k=1,\ldots ,s_{2} \\&y_1=y_0 + (\gamma + \beta )hy'_0 +h^2\Big (\sum _{i=1}^{s_1}(\gamma ^2\bar{b}_i+\gamma \beta b_i) f(Y_i)+ \sum _{i=1}^{s_2}\beta ^{2}\bar{b}_{i}^{*}{ f(\widetilde{Y}_{i})}\Big ), \\&y'_1=y'_0 +h\Big (\sum _{i=1}^{s_1}\gamma b_i f(Y_i) + \sum _{i=1}^{s_2} \beta b_{i}^{*}f(\widetilde{Y}_{i})\Big ). \end{aligned} \right. \end{aligned}$$
(6.7)

Now let us have a further discussion on the formula (6.7). If \(\gamma +\beta \ne 0\), we observe that (6.7) is just an RKN method \(\varPsi _{(\gamma +\beta )h}\) with the stepsize \((\gamma +\beta )h\). Meanwhile, a careful calculation shows that the Butcher tableau of the RKN method \(\varPsi _h\) reads

$$\begin{aligned} \begin{array}{c|cc} {{ \gamma c/\delta }} &{} { {\gamma ^2 A/\delta ^2 } } &{} \\ {{(\gamma e + \beta c^{*})/\delta } } &{}{\tilde{A}/\delta ^2} &{}{\beta ^2 A^{*}/\delta ^2} \\ \hline \\ {} &{}{\tilde{b}^{\intercal }/\delta ^2} &{}{\beta ^2 \bar{b}^{*\intercal }/\delta ^2} \\ \hline \\ {} &{}{\gamma b^{\intercal }/\delta } &{}{\beta b^{*\intercal }/\delta } \end{array}\,, \end{aligned}$$
(6.8)

where \(\delta =\gamma +\beta \), \(\tilde{A}_{ij}=\gamma ^2\bar{b}_{j}+\gamma \beta c_{i}^{*}b_{j}\), \(\tilde{b}_{j}=\gamma ^2\bar{b}_{j}+\gamma \beta b_{j}\) for \(i=1,\ldots ,s_2,\ j=1,\ldots ,s_1\), and \(e=(1,\ldots ,1)^{\intercal }\) is the \(s_2\times 1\) vector of ones. It is clear that the updated value \((y_1,y'_1)\) approximates the exact value at \(t_0+(\gamma +\beta )h\).
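The composed tableau (6.8) can be generated mechanically. Below is a sketch for scalar coefficients, assuming \(\delta =\gamma +\beta \ne 0\); the one-stage tableau in the test is illustrative only.

```python
# Butcher tableau (6.8) of the composition Phi2_{beta h} o Phi1_{gamma h},
# rescaled to the stepsize delta*h with delta = gamma + beta != 0.
def compose_rkn(gamma, beta, c, A, bbar, b, cs, As, bbars, bs):
    d = gamma + beta
    s1, s2 = len(c), len(cs)
    C = [gamma * ci / d for ci in c] + [(gamma + beta * ck) / d for ck in cs]
    # block matrix [[gamma^2 A, 0], [Atilde, beta^2 A*]] / delta^2
    top = [[gamma**2 * A[i][j] / d**2 for j in range(s1)] + [0.0] * s2
           for i in range(s1)]
    bot = [[(gamma**2 * bbar[j] + gamma * beta * cs[k] * b[j]) / d**2
            for j in range(s1)]
           + [beta**2 * As[k][j] / d**2 for j in range(s2)]
           for k in range(s2)]
    Bbar = [(gamma**2 * bbar[j] + gamma * beta * b[j]) / d**2
            for j in range(s1)] + [beta**2 * bj / d**2 for bj in bbars]
    B = [gamma * bj / d for bj in b] + [beta * bj / d for bj in bs]
    return C, top + bot, Bbar, B
```

Composing a consistent method with itself (\(\gamma =\beta =1/2\)) keeps \(\sum _i b_i=1\) and \(\sum _i\bar{b}_i=1/2\), as the weight rows of (6.8) predict.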

However, for the case of \(\gamma +\beta =0\), the formula (6.7) is no longer of classical RKN type. In this case, \(\varPhi _{\beta h}^{2}\circ \varPhi _{\gamma h}^{1}\) is just an essential 0-stepsize RKN method, whose corresponding h-stepsize form can be expressed in the following Butcher tableau

$$\begin{aligned} \begin{array}{c|cc} {{ \gamma c}} &{} {{\gamma ^2 A } } &{} \\ {{\gamma e + \beta c^{*}} } &{}{\tilde{A}} &{}{\beta ^2 A^{*}} \\ \hline \\ { } &{}{\tilde{b}^{\intercal }} &{}{\beta ^2 \bar{b}^{*\intercal }} \\ \hline { } &{}{\gamma b^{\intercal }} &{}{\beta b^{*\intercal }} \end{array}\,, \end{aligned}$$
(6.9)

where \(\tilde{A}_{ij}\) and \(\tilde{b}_{j}\) are the same as in formula (6.8). In this case, it should be noted that \(\sum _i\gamma b_i + \sum _i\beta b^*_i=0\) when \(\gamma +\beta =0\) and \(\sum _ib_i=\sum _ib^*_i=1\). Although this case is not significant in practice, it will be indispensable in the construction of an RKN group in the remainder of this chapter.

Define

\(G_1:=\{\varPhi _{\alpha h}\ |\ \varPhi _h\ \text {is a classical RKN method},\ \alpha \in \mathbb {R} \},\)

\(G_0:=\{\varPhi '_{\alpha h}\ |\ \varPhi '_{\alpha h}\ \text {is the essential 0-stepsize form of } \varPhi _{\alpha h}\ \text {and}\ \varPhi _{\alpha h}\in G_1 \ \text {with}\ \sum _ib_i=0\}, \) and \(G=G_1\bigcup G_0\).

We then have the following result.

Theorem 6.1

\((G,\circ ,I)\) is a group with respect to the composition \(\circ \) and the identity I.

Proof

It is clear that the composition \(\circ \) is associative, and for each element \(\varTheta \in G\) we certainly have \(\varTheta \circ I=I\circ \varTheta =\varTheta \). Moreover, if \(\varPhi \) and \(\varPsi \) are two arbitrary elements in G, from the formula (6.7) and the above analysis we know that \(\varPhi \circ \varPsi \in G\). This shows the closure property of G under the product \(\circ \). We next show that each element in G is invertible.

For an s-stage RKN method \(\varLambda _h\) defined by (6.3), the existing results [19] have revealed the existence of its adjoint method \(\varLambda _{h}^{*}\). If the coefficients of the adjoint method are denoted by \(c^{*}=(c_1^{*},\ldots ,c_{s}^{*})^{\intercal }, b^{*}=(b_1^{*},\ldots ,b_{s}^{*})^{\intercal }, \bar{b}^{*}=(\bar{b}_1^{*},\ldots ,\bar{b}_{s}^{*})^{\intercal }\), and \(A^{*}=\big (a_{ij}^{*}\big )_{s\times s}\), then they satisfy

$$\begin{aligned} \left\{ \begin{aligned}&c_{i}^{*}=1-c_{s+1-i}\ , \\&a_{ij}^{*}=(1-c_{s+1-i})b_{s+1-j}-\bar{b}_{s+1-j}+a_{s+1-i,s+1-j}\ , \\&\bar{b}_j^{*}=b_{s+1-j}-\bar{b}_{s+1-j}\ , \\&b_j^{*}=b_{s+1-j}\ , \end{aligned} \right. \end{aligned}$$
(6.10)

for \(1\leqslant i,j\leqslant s\). Certainly \(\varLambda _{h}^{*}\) belongs to G, and hence \(\varLambda _{-h}^{*}\in G\). Furthermore, from the definition of adjoint methods, we have \(\varLambda _{h}^{-1}=\varLambda _{-h}^{*}\) straightforwardly. Consequently, \(\varLambda _{h}^{-1}\in G\), and the same holds for the inverse of the essential 0-stepsize form \(\varLambda '_{h}\), namely \(\varLambda _{h}^{'-1}\in G\). This completes the proof.    \(\square \)
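Formula (6.10) can also be checked programmatically: applying the adjoint map twice returns the original coefficients, so the map is an involution. A sketch for scalar coefficients follows; the tableau in the test is an arbitrary illustration, not a consistent method.

```python
# Adjoint coefficients (6.10) of an s-stage RKN method (0-indexed lists);
# applying this map twice returns the original tableau.
def rkn_adjoint(c, A, bbar, b):
    s = len(c)
    cs = [1 - c[s - 1 - i] for i in range(s)]
    As = [[(1 - c[s - 1 - i]) * b[s - 1 - j] - bbar[s - 1 - j]
           + A[s - 1 - i][s - 1 - j] for j in range(s)] for i in range(s)]
    bbars = [b[s - 1 - j] - bbar[s - 1 - j] for j in range(s)]
    bs = [b[s - 1 - j] for j in range(s)]
    return cs, As, bbars, bs
```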

Remark 6.2

Here, the above way of defining an RKN group differs in some nonessential respects from the definition of the RK group by Hairer and Wanner [11]. These differences stem from the following fact. If \(\varPhi _h\) and \(\varPsi _h\) are two different RKN methods and they are not adjoint to each other, then the composition \(\varPhi _h\circ \varPsi ^{*}_{-h}\) does not belong to \(G_1\) any more. Here \(\varPsi ^{*}_{h}\) denotes the adjoint method of \(\varPsi _{h}\). That is why we have additionally introduced Definition 6.1 and the set \(G_0\). Likewise, another new definition (Definition 6.2) will be needed when constructing the ERKN group in the next section.

6.3 ERKN Group and Related Issues

6.3.1 Construction of ERKN Group

In this section, we are concerned with the group-structure analysis of efficient integrators for the oscillatory second-order initial value problem (6.1). It might seem that classical RKN methods could still be applied to these problems as numerical integrators, since one may move the term My from the left-hand side to the right-hand side of the differential equation so that the problem (6.1) is transformed into the form of (6.2). However, when \(||M||\gg 1\), RKN methods may not be very effective for solving (6.1) and can exhibit poor numerical behaviour. This is mainly caused by the highly oscillatory effect introduced by the linear term My. Taking account of this point, the extended Runge–Kutta–Nyström (ERKN) methods were proposed and designed especially for the oscillatory problem (6.1).

Based on the matrix-variation-of-constants formula [33], an s-stage ERKN method [34] for IVP (6.1) is defined by

$$\begin{aligned} \left\{ \begin{aligned}&Y_{i}=\phi _0(c_{i}^{2}V)y_n + c_{i}h\phi _1(c_{i}^{2}V)y'_{n} + h^2\sum _{j=1}^{s}a_{ij}(V)f(Y_j),\quad i=1,\ldots ,s, \\&y_{n+1}=\phi _0(V)y_n + h\phi _1(V)y'_{n} + h^2\sum _{i=1}^{s}\bar{b}_{i}(V)f(Y_i), \\&y'_{n+1}=-hM\phi _1(V)y_n + \phi _0(V)y'_{n} + h\sum _{i=1}^{s}b_{i}(V)f(Y_i). \end{aligned} \right. \end{aligned}$$
(6.11)

Here, \(c_1,\ldots ,c_s\) are real constants, and \(b_i(V), \bar{b}_i(V)\) and \(a_{ij}(V)\) for \(i,j=1,\ldots ,s\) are matrix-valued functions of \(V\equiv h^2M\), usually expressed as formal series in powers of V

$$\begin{aligned} b_i(V)=\sum _{k=0}^{\infty }\frac{b_i^{(2k)}}{(2k)!}V^k, \quad \bar{b}_i(V)=\sum _{k=0}^{\infty }\frac{\bar{b}_i^{(2k)}}{(2k)!}V^k, \quad a_{ij}(V)=\sum _{k=0}^{\infty }\frac{a_{ij}^{(2k)}}{(2k)!}V^k, \end{aligned}$$
(6.12)

and

$$\begin{aligned} {\phi _j(V):=\sum _{k=0}^{\infty }\frac{(-1)^{k}V^k}{(2k+j)!}, \quad j=0,1,\ldots .} \end{aligned}$$
(6.13)
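For a scalar argument the series (6.13) can be summed directly, and for \(V>0\) the closed forms \(\phi _0(V)=\cos \sqrt{V}\) and \(\phi _1(V)=\sin \sqrt{V}/\sqrt{V}\) are available. A truncated-series sketch, adequate for moderate |V|:

```python
import math

# phi_j(V) of (6.13) for scalar V, summing the first `terms` terms of the
# alternating series; note phi_j(0) = 1/j!.
def phi(j, V, terms=30):
    return sum((-V)**k / math.factorial(2 * k + j) for k in range(terms))
```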

The properties related to \(\phi _j(V)\) for \(j=0,1,\ldots \) can be found in [31] and the details are omitted here. We can also express the ERKN method (6.11) in a Butcher tableau

$$\begin{aligned} \begin{array}{ccc} \begin{array}{c|c} {{ c}} &{} {A(V) }\\ \hline \\ { } &{}{{\bar{b}(V)}^{\intercal }} \\ \hline { } &{}{{b(V)}^{\intercal }} \end{array} &{}=&{} \begin{array}{ c|ccc} c_1 &{}a_{11}(V) &{} \cdots &{}a_{1s}(V) \\ \vdots &{}\vdots &{} &{}\vdots \\ c_s &{} a_{s1}(V) &{} \cdots &{}a_{ss}(V) \\ \hline \\ &{}{\bar{b}_1(V)} &{}{ \cdots } &{}{\bar{b}_s(V)} \\ \hline &{}{b_1(V)} &{}{ \cdots } &{}{b_s(V)} \end{array} \end{array}. \end{aligned}$$
(6.14)

Proceeding in the same spirit as for RKN methods, we also introduce a new definition for ERKN methods as follows.

Definition 6.2

Suppose that \(\varPsi _h\) is an s-stage ERKN method defined by (6.11) for the problem (6.1). Then \(\varPsi '_h\) is called the essential 0-stepsize form of \(\varPsi _h\) if the formula for \(\varPsi '_h\) reads

$$\begin{aligned} \left\{ \begin{aligned}&Y_{i}=\phi _0(c_{i}^{2}V)y_n + c_{i}h\phi _1(c_{i}^{2}V)y'_{n} + h^2\sum _{j=1}^{s}a_{ij}(V)f(Y_j),\quad i=1,\ldots ,s, \\&y_{n+1}=y_n + h^2\sum _{i=1}^{s}\bar{b}_{i}(V)f(Y_i), \\&y'_{n+1}=y'_{n} + h\sum _{i=1}^{s}b_{i}(V)f(Y_i). \end{aligned} \right. \end{aligned}$$
(6.15)

Then \(\varPsi _h\) is called an h-stepsize form of \(\varPsi '_h\).

Suppose that \(\Upsilon _h^1\) and \(\Upsilon _h^2\) are two ERKN methods with \(s_1\) stages and \(s_2\) stages, respectively. The coefficients of \(\Upsilon _h^1\) are denoted as (\(c,b,\bar{b},A\)), and those of \(\Upsilon _h^2\) are additionally denoted with a star (\(c^*,b^*,\bar{b}^*,A^*\)). We now consider the composition of \(\Upsilon _{\gamma h}^1\) and \(\Upsilon _{\beta h}^2\). After a careful calculation, we derive the scheme \(\Upsilon _{\beta h}^2\circ \Upsilon _{\gamma h}^1\) as follows

$$\begin{aligned} \left\{ \begin{aligned}&Y_i=\phi _0(\gamma ^{2}c_{i}^{2}V)y_0 + \gamma c_{i}h\phi _1(\gamma ^{2}c_{i}^{2}V)y'_{0} +h^2\sum _{j=1}^{s_1}\gamma ^{2}A_{ij}(\gamma ^{2}V)f(Y_j),\quad i=1,\ldots ,s_1, \\&\widetilde{Y}_k=\phi _0((\gamma +\beta c_{k}^{*})^{2}V)y_0 + (\gamma +\beta c_{k}^{*})h\phi _1((\gamma +\beta c_{k}^{*})^{2}V)y'_0 \\&\qquad + h^2\Big ( \sum _{j=1}^{s_1}\big (\gamma ^2\bar{b}_j(\gamma ^{2}V)\phi _0(\beta ^{2}c^{*2}_{k}V) + \gamma \beta c^{*}_{k}b_{j}(\gamma ^{2}V)\phi _1(\beta ^{2}c^{*2}_{k}V)\big )f(Y_j) \\ {}&\qquad +\sum _{j=1}^{s_2}\beta ^{2}A_{kj}^{*}(\beta ^{2} V)f(\widetilde{Y}_j) \Big ), \quad k=1,\ldots ,s_{2}, \\&y_1=\phi _0((\gamma +\beta )^{2}V)y_0 + (\gamma +\beta )h\phi _1((\gamma +\beta )^{2}V)y'_0 \\&\qquad + h^2\Big ( \sum _{j=1}^{s_1}\big (\gamma ^2\bar{b}_j(\gamma ^{2}V)\phi _0(\beta ^{2}V) + \gamma \beta b_{j}(\gamma ^{2}V)\phi _1(\beta ^{2}V)\big )f(Y_j) +\sum _{i=1}^{s_2}\beta ^{2}\bar{b}_{i}^{*}(\beta ^{2}V)f(\widetilde{Y}_i) \Big ), \\&y'_1=-(\gamma +\beta ) hM\phi _1((\gamma +\beta )^{2}V)y_0 + \phi _0((\gamma +\beta )^{2}V)y'_0 \\&\qquad + h\Big ( \sum _{j=1}^{s_1}\big (-\gamma ^2\beta V\bar{b}_j(\gamma ^{2}V)\phi _1(\beta ^{2}V) + \gamma b_j(\gamma ^{2}V)\phi _0(\beta ^{2}V)\big )f(Y_j) +\sum _{i=1}^{s_2}\beta b_{i}^{*}(\beta ^{2}V)f(\widetilde{Y}_i) \Big ). \end{aligned} \right. \end{aligned}$$

For the case \(\gamma +\beta \ne 0\), this shows that the composition \(\Upsilon _{\beta h}^2\circ \Upsilon _{\gamma h}^1\) is a new method \(\Upsilon _{\delta h}\), namely, an \((s_1+s_2)\)-stage ERKN method with the stepsize \(\delta h=(\gamma +\beta )h\) and the Butcher tableau

$$\begin{aligned} \begin{array}{c|cc} {{ \gamma c/\delta }} &{} {\gamma ^{2}A(\gamma ^{2}/\delta ^2 V)/\delta ^2 } &{} \\ {(\gamma e + \beta c^{*})/\delta } &{}{\bar{A}(V/\delta ^2)/\delta ^2} &{}{\beta ^{2}A^{*}(\beta ^{2}/\delta ^2 V)/\delta ^2} \\ \hline \\ { } &{}{\bar{B}^{\intercal }(V/\delta ^2)/\delta ^2} &{}{\beta ^2 \bar{b}^{*\intercal }(\beta ^{2}/\delta ^2 V)/\delta ^2} \\ \hline { } &{}{B^{\intercal }(V/\delta ^2)/\delta } &{}{\beta b^{*\intercal }(\beta ^{2}/\delta ^2 V)/\delta } \end{array}\,, \end{aligned}$$
(6.16)

where

$$\begin{aligned} \left\{ \begin{aligned}&\bar{A}_{ij}(V)=\gamma ^2\bar{b}_j(\gamma ^{2}V)\phi _0(\beta ^{2}c^{*2}_{i}V) + \gamma \beta c^{*}_{i}b_{j}(\gamma ^{2}V)\phi _1(\beta ^{2}c^{*2}_{i}V), \\&\bar{B}_{j}(V)= \gamma ^2\bar{b}_j(\gamma ^{2}V)\phi _0(\beta ^{2}V) + \gamma \beta b_{j}(\gamma ^{2}V)\phi _1(\beta ^{2}V), \\&B_{j}(V)=-\gamma ^2\beta V\bar{b}_j(\gamma ^{2}V)\phi _1(\beta ^{2}V) + \gamma b_j(\gamma ^{2}V)\phi _0(\beta ^{2}V), \end{aligned} \right. \end{aligned}$$
(6.17)

for \(i=1,\ldots ,s_2,\ j=1,\ldots ,s_1\).

If \(\gamma +\beta =0\), the composition \(\Upsilon '=\Upsilon _{\beta h}^2\circ \Upsilon _{\gamma h}^1\) is again an essential 0-stepsize ERKN method, whose corresponding h-stepsize form can be expressed in the following Butcher tableau

$$\begin{aligned} \begin{array}{c|cc} {{ \gamma c}} &{} {\gamma ^{2}A(\gamma ^{2}V) } &{} \\ {\gamma e + \beta c^{*} } &{}{\bar{A}(V)} &{}{\beta ^{2}A^{*}(\beta ^{2} V)} \\ \hline \\ { } &{}{\bar{B}^{\intercal }(V)} &{}{\beta ^2 \bar{b}^{*\intercal }(\beta ^{2} V)} \\ \hline { } &{}{B^{\intercal }(V)} &{}{\beta b^{*\intercal }(\beta ^{2} V)} \end{array}\,, \end{aligned}$$
(6.18)

where \(\bar{A}_{ij}(V),\bar{B}_{j}(V),B_{j}(V)\) are the same as in (6.17), and \(\sum _i\gamma b_i^{(0)} + \sum _i\beta b^{*(0)}_i=0\).

Define

\(\varOmega _1:=\{\varPhi _{\alpha h}\ |\ \varPhi _h\ \text {is an ERKN method},\ \alpha \in \mathbb {R} \}\),

\(\varOmega _0:=\{\varPhi '_{\alpha h}\ |\ \varPhi '_{\alpha h}\ \text {is the essential 0-stepsize form of}\ \varPhi _{\alpha h},\ \varPhi _{\alpha h}\in \varOmega _1\ \text {with}\ \sum _ib_i^{(0)}=0\}\), and \(\varOmega =\varOmega _1\bigcup \varOmega _0\).

Then we have the following theorem.

Theorem 6.2

\((\varOmega ,\circ ,I)\) is a group with respect to the composition \(\circ \) and the identity I.

The proof is similar to that of Theorem 6.1, except that the coefficients of the adjoint method of (6.11) can be expressed as

$$\begin{aligned} \left\{ \begin{aligned}&c_{i}^{*}=1-c_{s+1-i}\ , \\&a_{ij}^{*}(V)=\phi _0(c_{s+1-i}^2 V)\bar{b}_{j}(V) - c_{s+1-i}\phi _1(c_{s+1-i}^2 V)b_{j}(V)+a_{s+1-i,s+1-j}(V)\ , \\&\bar{b}_j^{*}(V)=\phi _1(V)b_{s+1-j}(V)-\phi _0(V)\bar{b}_{s+1-j}(V)\ , \\&b_j^{*}(V)=V\phi _1(V)\bar{b}_{s+1-j}(V)+\phi _0(V)b_{s+1-j}(V)\ , \end{aligned} \right. \end{aligned}$$
(6.19)

for \(1\leqslant i,j\leqslant s\). Hence, we omit the details here.
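As with (6.10), one can check numerically that the adjoint map (6.19) is an involution; for scalar \(V>0\) this rests on the identity \(\phi _0(V)^2+V\phi _1(V)^2=1\). A sketch, where each coefficient function is represented by its value at a fixed V and the tableau in the test is an arbitrary illustration:

```python
import math

# Adjoint coefficients (6.19) for scalar V > 0; A, bbar, b hold the values
# of the coefficient functions evaluated at this V.
def phi0(V): return math.cos(math.sqrt(V))
def phi1(V): return math.sin(math.sqrt(V)) / math.sqrt(V)

def erkn_adjoint(V, c, A, bbar, b):
    s = len(c)
    cs = [1 - c[s - 1 - i] for i in range(s)]
    As = [[phi0(c[s - 1 - i]**2 * V) * bbar[j]
           - c[s - 1 - i] * phi1(c[s - 1 - i]**2 * V) * b[j]
           + A[s - 1 - i][s - 1 - j] for j in range(s)] for i in range(s)]
    bbars = [phi1(V) * b[s - 1 - j] - phi0(V) * bbar[s - 1 - j]
             for j in range(s)]
    bs = [V * phi1(V) * bbar[s - 1 - j] + phi0(V) * b[s - 1 - j]
          for j in range(s)]
    return cs, As, bbars, bs
```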

6.3.2 The Relation Between the RKN Group G and the ERKN Group \(\varOmega \)

In the previous sections, we have established the RKN group \((G,\circ ,I)\) and the ERKN group \((\varOmega ,\circ ,I)\). A direct observation shows that when \(M\rightarrow \mathbf {0}\), the oscillatory problem (6.1) becomes the traditional second-order initial value problem (6.2), and the ERKN method (6.11) hence reduces to the RKN method (6.3). This point indicates that there exists an inherent relationship between RKN methods and ERKN methods. Thus, ERKN methods are usually regarded as an extension of RKN methods. In the following, we will rigorously demonstrate this extension relationship.

In what follows, we will denote the coefficients of an RKN method in lower-case by (\(c,b,\bar{b},a\)) and those of an ERKN method in upper-case by (\(C,B,\bar{B},A\)). As ERKN methods depend on the matrix M, for each element \(\varPsi \in \varOmega \) we will write \(\varPsi (M)\) to show this dependence where necessary. The reduction of an ERKN method as \(M\rightarrow \mathbf {0}\) can then be formalized as the map

$$\begin{aligned} \eta : \quad \varOmega \longrightarrow G, \quad \eta (\varPsi )=\lim _{M\rightarrow \mathbf {0}}\varPsi (M), \quad \forall \ \ \varPsi \in \varOmega . \end{aligned}$$

We then arrive at the following useful theorem.

Theorem 6.3

The map \(\eta \) is an epimorphism of the group \(\varOmega \) onto the group G.

Proof

Suppose that \(\varPsi ^1\) and \(\varPsi ^2\) are two elements of \(\varOmega \) respectively with the stepsizes \(\gamma h\) and \(\beta h\), \(\varPhi ^1=\eta (\varPsi ^1)\), and \(\varPhi ^2=\eta (\varPsi ^2)\). From the composition laws (6.8) and (6.9) of RKN methods and those of ERKN methods (6.16) and (6.18), it can be easily verified that \(\eta (\varPsi ^2\circ \varPsi ^1)= \varPhi ^2\circ \varPhi ^1=\eta (\varPsi ^2)\circ \eta (\varPsi ^1)\). In addition, from the fact that \(\eta (I)=I\), we conclude that \(\eta \) is a homomorphism of \(\varOmega \) into G.

We next show that \(\eta \) is surjective. For each element \(\varPhi \in G\), which is denoted by the coefficients (\(c,b,\bar{b},a\)), there exists \(\varPsi \in \varOmega \), whose coefficients can be expressed as

$$\begin{aligned} C=c,\ B(V)=b\otimes E_{n},\ \bar{B}(V)=\bar{b}\otimes E_{n}, \ \ A(V)=a\otimes E_{n}, \end{aligned}$$
(6.20)

where \(\otimes \) is the Kronecker product, \(E_{n}\) is an \(n\times n\) identity matrix and n is the dimension of square matrix M. Obviously the coefficients (\(C,B,\bar{B},A\)) define an element in \(\varOmega \), and thus \(\eta \) is surjective. This completes the proof.    \(\square \)

Corollary 6.1

Let K be the kernel of \(\eta \), i.e. \(K=\eta ^{-1}(I)\). Then K is a normal subgroup of \(\varOmega \). Moreover, the induced map \(\bar{\eta }\) is an isomorphism of the quotient group \(\overline{\varOmega }=\varOmega /K\) onto the group G.

Theorem 6.3 actually gives a global view of ERKN methods by connecting them with classical RKN methods via the epimorphism map \(\eta \). From Corollary 6.1, the map \(\eta \) defines a congruence relation \(\equiv \) by the normal subgroup K, where

$$\varPhi \equiv \varPsi \pmod K\quad \text {if}\quad \varPhi ^{-1}\circ \varPsi \in K.$$

Then by finding a representative element \(\varPsi \) for each congruence class \(\overline{\varPsi }\in \overline{\varOmega }\), we can theoretically give all the elements in \(\overline{\varPsi }\), since for each \(\varTheta \in \overline{\varPsi }\) there exists \(\Gamma \in K\) such that \(\varTheta =\varPsi \circ \Gamma \). This fact indicates that \(\overline{\varPsi }\) is the coset of \(\varPsi \) relative to K, i.e. \(\overline{\varPsi }=\varPsi \circ K\). Hence it only remains to describe the normal subgroup K in detail. This can be easily obtained from the following definition of K

$$K=\big \{\varPsi \in \varOmega _0\ |\ {b}_j^{(0)}=0\ \text {and}\ \bar{b}_j^{(0)}=0,\ \forall j\big \}.$$

6.4 A Particular Mapping of G into \(\varOmega \)

In Sect. 6.3, we have investigated the ERKN group as a whole. However, as mentioned in the previous section, we have only a theoretical description of each congruence class \(\overline{\varPsi }\in \overline{\varOmega }\), and this description is not associated with the important properties of a method, such as the symplecticity, the symmetry and the order. Recalling Corollary 6.1 again, and taking account of the fact that \(\overline{\varPsi }=\varPsi \circ K\), it is of great importance to select a representative element \(\varPsi \) with favourable properties for the congruence class \(\overline{\varPsi }\), even though we cannot give a detailed analysis for each element in \(\overline{\varPsi }\).

Meanwhile, because \(\eta (\overline{\varPsi })=\varPhi \in G\), \(\varPhi \) inherits all the advantages of the ERKN elements in \(\overline{\varPsi }\). Hence the ERKN elements in \(\overline{\varPsi }\) cannot have better structural properties than the reduced RKN element \(\varPhi \). Taking account of this point, we may find an appropriate representative element \(\varPsi \) with the help of the corresponding reduced RKN element \(\varPhi \). In fact, what we are seeking is a natural mapping \(\varphi \) from G into \(\varOmega \), such that \(\varphi (\varPhi )\) preserves as many properties as the original RKN method \(\varPhi \) does. An immediate requirement on such a mapping \(\varphi \) is that it should be a section of \(\eta \), that is, the composition \(\eta \circ \varphi \) is the identity on G. In this sense, the mapping defined by (6.20) may be a straightforward candidate for \(\varphi \). Unfortunately, most properties cannot be preserved in this easy way, and we have to seek a more appropriate mapping \(\varphi \).

From the variation-of-constants formula (see, e.g. [31, 32]) for the problem (6.2) and the problem (6.1), as well as the corresponding RKN method for (6.2) and ERKN integrator (6.11), we consider the following mapping:

$$\begin{aligned} \begin{aligned} \varphi :\ G\longrightarrow \varOmega , \qquad \begin{array}{c|c} {{ c}} &{} {a }\\ \hline \\ { } &{}{{\bar{b}}^{\intercal }} \\ \hline { } &{}{{b}^{\intercal }} \end{array} \ \longmapsto \ \begin{array}{c|c} {{ C}} &{} {A(V) }\\ \hline \\ { } &{}{{\bar{B}(V)}^{\intercal }} \\ \hline { } &{}{{B(V)}^{\intercal }} \end{array}\,, \end{aligned} \end{aligned}$$
(6.21)

where

$$\begin{aligned} \left\{ \begin{aligned}&C_i=c_i, \\&A_{ij}(V)=a_{ij}\phi _1((c_i-c_j)^2 V), \\&\bar{B}_{i}(V)=\bar{b}_{i}\phi _1((1-c_i)^2 V), \\&B_{i}(V)=b_{i}\phi _0((1-c_i)^2 V), \end{aligned} \right. \end{aligned}$$

for \(1\leqslant i,j\leqslant s\), where s is the number of stages of the RKN method. This mapping naturally maps a classical RKN method \(\varPhi \) to an ERKN method \(\varphi (\varPhi )\). Meanwhile, taking \(\varphi (\varPhi )\) as the representative element of the congruence class \(\overline{\varphi (\varPhi )}\), we will show in the following theorems that \(\varphi (\varPhi )\) preserves almost all the properties of the original RKN method \(\varPhi \).
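A sketch of the mapping (6.21) for scalar V follows. Since \(\phi _0(0)=\phi _1(0)=1\), evaluating the lifted coefficients at \(V=0\) recovers the original RKN tableau, which illustrates that \(\varphi \) is a section of \(\eta \); the tableau in the test is an illustrative placeholder.

```python
import math

# The section (6.21): lift an RKN tableau (c, A, bbar, b) to ERKN
# coefficients evaluated at a scalar V >= 0.
def phi0(V): return math.cos(math.sqrt(V)) if V > 0 else 1.0
def phi1(V): return math.sin(math.sqrt(V)) / math.sqrt(V) if V > 0 else 1.0

def lift(V, c, A, bbar, b):
    s = len(c)
    AV = [[A[i][j] * phi1((c[i] - c[j])**2 * V) for j in range(s)]
          for i in range(s)]
    BbarV = [bbar[i] * phi1((1 - c[i])**2 * V) for i in range(s)]
    BV = [b[i] * phi0((1 - c[i])**2 * V) for i in range(s)]
    return c, AV, BbarV, BV
```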

Theorem 6.4

If \(\varPhi \in G\) is symplectic, then \(\varphi (\varPhi )\in \varOmega \) is symplectic.

Proof

From the definition of G, we need only verify that the result holds for RKN methods. Hence we suppose that \(\varPhi \) is an s-stage symplectic RKN method. The results of Suris [26] and of Okunbor and Skeel [19] show that the coefficients of \(\varPhi \) satisfy the following symplecticity conditions:

$$\begin{aligned} \left\{ \begin{aligned}&\bar{b}_i=(1-c_i)b_i,\quad 1\le i \le s, \\&\bar{b}_i b_j + b_i a_{ij} =\bar{b}_j b_i + b_j a_{ji}, \quad 1\le i,j \le s. \end{aligned} \right. \end{aligned}$$
(6.22)

We next show that the ERKN method \(\varphi (\varPhi )\), whose coefficients \(C,A(V),\bar{B}(V),B(V)\) are defined by (6.21), is symplectic. Although no necessary and sufficient conditions for the symplecticity of ERKN methods are available, we will prove that \(\varphi (\varPhi )\) satisfies the following conditions:

$$\begin{aligned} \left\{ \begin{aligned}&\phi _0(V)B_{i}(V)+V\phi _1(V)\bar{B}_{i}(V)=d_i\phi _0(c_i^2 V), \quad d_i \in \mathbb {R}, i = 1, 2, . . . , s, \\&\phi _1(V)B_{i}(V)-\phi _0(V)\bar{B}_{i}(V)=c_i d_i\phi _1(c_i^2 V), \quad i = 1, 2, . . . , s, \\&\bar{B}_i B_j + d_i A_{ij} =\bar{B}_j B_i + d_j A_{ji}, \quad i,j = 1, 2, . . . , s, \end{aligned} \right. \end{aligned}$$
(6.23)

which are sufficient conditions for symplectic ERKN methods originally proposed by Wu et al. [30].

The expressions \(\bar{B}_{i}(V)=\bar{b}_{i}\phi _1((1-c_i)^2 V)\) and \(B_{i}(V)=b_{i}\phi _0((1-c_i)^2 V)\), together with \(\bar{b}_i=(1-c_i)b_i\), yield the first two equations of (6.23) with \(d_i=b_i\). Then, inserting the expressions of \(A(V),\bar{B}(V),B(V)\) into the third equation of (6.23), we obtain

$$\begin{aligned} \begin{aligned}&\bar{B}_i B_j + d_i A_{ij}-(\bar{B}_j B_i + d_j A_{ji}) \\ =&(1-c_i)b_i b_j \phi _1((1-c_i)^2 V)\phi _0((1-c_j)^2 V) + b_i a_{ij}\phi _1((c_i-c_j)^2 V) \\&\quad -\big ((1-c_j)b_i b_j \phi _1((1-c_j)^2 V)\phi _0((1-c_i)^2 V) + b_j a_{ji}\phi _1((c_i-c_j)^2 V)\big ) \\ =&\big (b_i b_j(c_j-c_i)+b_i a_{ij}-b_j a_{ji}\big )\phi _1((c_i-c_j)^2 V) \\ =&0. \end{aligned} \end{aligned}$$

The last equation directly follows from (6.22). This completes the proof.    \(\square \)
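The argument above can also be checked numerically in the scalar case. The sketch below is hypothetical: it assumes the scalar representations \(\phi _0(x)=\cos \sqrt{x}\), \(\phi _1(x)=\sin \sqrt{x}/\sqrt{x}\) and a sample tableau chosen to satisfy (6.22), namely \(a_{ij}=b_j(c_i-c_j)\) for \(j\le i\) (zero otherwise) with \(\bar{b}_i=b_i(1-c_i)\); it then verifies the first and third conditions of (6.23) with \(d_i=b_i\).

```python
import math

def phi0(x):
    return math.cos(math.sqrt(x))

def phi1(x):
    s = math.sqrt(x)
    return math.sin(s) / s if s > 1e-8 else 1.0 - x / 6.0

# hypothetical coefficients chosen to satisfy the symplectic conditions (6.22)
c = [0.2, 0.5, 0.9]
b = [0.3, 0.4, 0.3]
bbar = [bi * (1 - ci) for bi, ci in zip(b, c)]
a = [[b[j] * (c[i] - c[j]) if j <= i else 0.0 for j in range(3)]
     for i in range(3)]

V = 2.7  # an arbitrary scalar value of V
A = [[a[i][j] * phi1((c[i] - c[j]) ** 2 * V) for j in range(3)] for i in range(3)]
Bbar = [bbar[i] * phi1((1 - c[i]) ** 2 * V) for i in range(3)]
B = [b[i] * phi0((1 - c[i]) ** 2 * V) for i in range(3)]

# first condition of (6.23) with d_i = b_i
for i in range(3):
    lhs = phi0(V) * B[i] + V * phi1(V) * Bbar[i]
    assert abs(lhs - b[i] * phi0(c[i] ** 2 * V)) < 1e-12

# third condition of (6.23)
for i in range(3):
    for j in range(3):
        lhs = Bbar[i] * B[j] + b[i] * A[i][j]
        rhs = Bbar[j] * B[i] + b[j] * A[j][i]
        assert abs(lhs - rhs) < 1e-12
```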

Theorem 6.5

If \(\varPhi \in G\) is symmetric and the coefficients satisfy the simplifying assumption \(\bar{b}_i=b_i (1-c_i)\), then \(\varphi (\varPhi )\in \varOmega \) is symmetric.

Proof

As in the proof of Theorem 6.4, we only need to verify the case where \(\varPhi \) is an s-stage RKN method. Hence, we need to derive the symmetric conditions of ERKN methods [16, 27]

$$\begin{aligned} \left\{ \begin{aligned}&c_{i}=1-c_{s+1-i}\ , \\&A_{ij}(V)=\phi _0(c_{s+1-i}^2 V)\bar{B}_{j}(V) - c_{s+1-i}\phi _1(c_{s+1-i}^2 V)B_{j}(V)+A_{s+1-i,s+1-j}(V)\ , \\&\bar{B}_j(V)=\phi _1(V)B_{s+1-j}(V)-\phi _0(V)\bar{B}_{s+1-j}(V)\ , \\&B_j(V)=V\phi _1(V)\bar{B}_{s+1-j}(V)+\phi _0(V)B_{s+1-j}(V)\ , \end{aligned} \right. \end{aligned}$$
(6.24)

from the symmetric conditions of RKN methods [12],

$$\begin{aligned} \left\{ \begin{aligned}&c_{i}=1-c_{s+1-i}\ , \\&a_{ij}=(1-c_{s+1-i})b_{s+1-j}-\bar{b}_{s+1-j}+a_{s+1-i,s+1-j}\ , \\&\bar{b}_j=b_{s+1-j}-\bar{b}_{s+1-j}\ , \\&b_j=b_{s+1-j}\ . \end{aligned} \right. \end{aligned}$$
(6.25)

The first equation of (6.24) naturally holds. On noting that \(b_j=b_{s+1-j}\) and \(\bar{b}_j=b_j(1-c_j)\), we have

$$\begin{aligned} \begin{aligned}&\phi _1(V)B_{s+1-j}-\phi _0(V)\bar{B}_{s+1-j} \\ =&b_{s+1-j}\phi _1(V)\phi _0((1-c_{s+1-j})^2 V) - b_{s+1-j}(1-c_{s+1-j})\phi _0(V)\phi _1((1-c_{s+1-j})^2 V) \\ =&b_{j}\phi _1(V)\phi _0(c_j^2 V)-b_{j}c_j\phi _0(V)\phi _1(c_j^2 V) \\ =&b_{j}(1-c_j)\phi _1((1-c_{j})^2 V) \\ =&\bar{B}_j(V), \end{aligned} \end{aligned}$$

and

$$\begin{aligned} \begin{aligned}&V\phi _1(V)\bar{B}_{s+1-j}+\phi _0(V)B_{s+1-j} \\ =&b_{s+1-j}(1-c_{s+1-j})V\phi _1(V)\phi _1((1-c_{s+1-j})^2 V) + b_{s+1-j}\phi _0(V)\phi _0((1-c_{s+1-j})^2 V) \\ =&b_{j}\big (c_j V\phi _1(V)\phi _1(c_j^2 V)+\phi _0(V)\phi _0(c_j^2 V)\big ) \\ =&b_{j}\phi _0((1-c_{j})^2 V) \\ =&B_j(V). \end{aligned} \end{aligned}$$

These give the third and fourth equations of (6.24). Furthermore, it follows from (6.25) and the simplifying assumption that \(a_{ij}=b_{j}(c_i-c_j)+a_{s+1-i,s+1-j}\). We thus have

$$\begin{aligned} \begin{aligned}&\phi _0(c_{s+1-i}^2 V)\bar{B}_{j}(V) - c_{s+1-i}\phi _1(c_{s+1-i}^2 V)B_{j}(V)+A_{s+1-i,s+1-j}(V)\ \\ =&b_{j}(1-c_{j})\phi _0((1-c_{i})^2 V)\phi _1((1-c_{j})^2 V) - b_{j}(1-c_{i})\phi _1((1-c_{i})^2 V)\phi _0((1-c_{j})^2 V)\\ {}&+ a_{s+1-i,s+1-j}\phi _1((c_i-c_{j})^2 V) \\ =&\big (b_{j}(c_i-c_j)+a_{s+1-i,s+1-j}\big )\phi _1((c_i-c_{j})^2 V), \\ =&a_{ij}\phi _1((c_i-c_{j})^2 V) \\ =&A_{ij}(V), \end{aligned} \end{aligned}$$

and consequently the second equation of (6.24) is satisfied. This completes the proof.    \(\square \)
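The two \(\phi \)-function identities used in the computations above are the angle-addition formulas for sine and cosine in disguise. Assuming the scalar representations \(\phi _0(x)=\cos \sqrt{x}\) and \(\phi _1(x)=\sin \sqrt{x}/\sqrt{x}\) (an assumption of this sketch, not a statement from the text), they can be checked directly:

```python
import math

def phi0(x):
    return math.cos(math.sqrt(x))

def phi1(x):
    s = math.sqrt(x)
    return math.sin(s) / s if s > 1e-8 else 1.0 - x / 6.0

for c in (0.1, 0.3, 0.7):
    for V in (0.5, 2.0, 9.0):
        # phi_1(V) phi_0(c^2 V) - c phi_0(V) phi_1(c^2 V) = (1-c) phi_1((1-c)^2 V)
        lhs = phi1(V) * phi0(c * c * V) - c * phi0(V) * phi1(c * c * V)
        assert abs(lhs - (1 - c) * phi1((1 - c) ** 2 * V)) < 1e-12
        # c V phi_1(V) phi_1(c^2 V) + phi_0(V) phi_0(c^2 V) = phi_0((1-c)^2 V)
        lhs = c * V * phi1(V) * phi1(c * c * V) + phi0(V) * phi0(c * c * V)
        assert abs(lhs - phi0((1 - c) ** 2 * V)) < 1e-12
```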

Remark 6.3

Although the condition \(\bar{b}_i=b_i (1-c_i)\) required by Theorem 6.5 looks like an additional simplifying assumption, it is in fact already contained in the symplectic conditions (6.22) for RKN methods used in Theorem 6.4.

Fig. 6.1

Figure of tree \(\tau =\tau _1\times \tau _2\times \cdots \times \tau _n\times \big (W_{+}b_{+}(b_{+}B_{+})^{p_1}(\widetilde{\tau }_1)\big )\times \cdots \times \big (W_{+}b_{+}(b_{+}B_{+})^{p_k}(\widetilde{\tau }_k)\big )\)

Fig. 6.2

Figure of tree \(\tau =(W_{+}b_{+}(b_{+}B_{+})^{0}(\tau _1))\times \tau '\)

The following theorem is related to the order of a numerical method and the corresponding order conditions. It is therefore helpful to recall the special Nyström trees (SNT) and the simplified special extended Nyström trees (SSENT), which are designed to handle the order conditions of RKN and ERKN methods, respectively. Further details concerning SNT and SSENT can be found in [12, 35]. For the convenience of the proof, we introduce the following two definitions and a basic lemma, which will be used later in the proof of the theorem.

Definition 6.3

The degree of merge node \(d(\tau )\) on SSENT is recursively defined as follows.

1. \(d(\tau )=0\),   if \(\tau \in SNT\);

2. \(d(\tau )=k+\sum _{j=1}^{n}d(\tau _j)+\sum _{i=1}^{k}d(\widetilde{\tau }_i)\),    if \(\tau =\tau _1\times \tau _2\times \cdots \times \tau _n\times \big (W_{+}b_{+}(b_{+}B_{+})^{p_1}(\widetilde{\tau }_1)\big )\times \cdots \times \big (W_{+}b_{+}(b_{+}B_{+})^{p_k}(\widetilde{\tau }_k)\big )\) and \(\tau _i,\widetilde{\tau }_j \in SSENT\), \(p_i\in \mathbb {N}_{+}\) (see Fig. 6.1).

Definition 6.4

If \(\tau =\big (W_{+}b_{+}(b_{+}B_{+})^{0}(\tau _1)\big )\times \tau '\) (see Fig. 6.2), then we define \(\tau _1\) to be the first generation of \(\tau \). We recursively define \(\tau _n\) to be the nth (\(n\ge 2\)) generation of \(\tau \) if there exists \(\tau _0\in SSENT\) such that \(\tau _n\) is the first generation of \(\tau _0\) and \(\tau _0\) is the \((n-1)\)th generation of \(\tau \).

Lemma 6.1

If \(\tau =\tau _1\times \tau _2,\ \tau _1,\tau _2\in SSENT\), then the order \(\rho (\tau )\), the sign \(s(\tau )\), the density \(\gamma (\tau )\), and the weight \(\varPhi _i(\tau )\) satisfy

$$\begin{aligned} \begin{aligned} \rho (\tau )=&\rho (\tau _1)+\rho (\tau _2)-1,&s(\tau )=s(\tau _1)\cdot s(\tau _2), \\ \gamma (\tau )=&\rho (\tau )\cdot \frac{\gamma (\tau _1)}{\rho (\tau _1)} \cdot \frac{\gamma (\tau _2)}{\rho (\tau _2)},&\varPhi _i(\tau )=\varPhi _i(\tau _1)\cdot \varPhi _i(\tau _2).\end{aligned} \end{aligned}$$
(6.26)

This lemma follows directly from the definitions of the order, sign, density and weight of an SSENT; hence we omit the details of the proof.

Theorem 6.6

If \(\varPsi \in G\) is of order \(p\ (p\ge 1)\), then \(\varphi (\varPsi )\in \varOmega \) is also of order p.

Proof

Suppose that \(\varPsi \) is an s-stage RKN method. The theorem can be stated as follows.

If the order conditions [12]

$$\begin{aligned} \left\{ \begin{aligned}&\sum _{i=1}^{s}\bar{b}_{i}\varPhi _i(\tau )=\frac{1}{(\rho (\tau )+1)\gamma (\tau )}, \quad \forall \tau \in SNT_m,\quad m\le p-1, \\&\sum _{i=1}^{s}b_{i}\varPhi _i(\tau )=\frac{1}{\gamma (\tau )}, \quad \forall \tau \in SNT_m,\quad m\le p, \end{aligned} \right. \end{aligned}$$
(6.27)

hold for \(\varPsi \), then the order conditions [35]

$$\begin{aligned} \left\{ \begin{aligned}&\sum _{i=1}^{s}\bar{B}_{i}\varPhi _i(\tau )=\frac{\rho (\tau )!}{\gamma (\tau )s(\tau )} \phi _{\rho (\tau )+1} + \mathscr {O}(h^{p-\rho (\tau )}),\quad \forall \tau \in SSENT_m,\quad m\le p-1, \\&\sum _{i=1}^{s}B_{i}\varPhi _i(\tau )=\frac{\rho (\tau )!}{\gamma (\tau )s(\tau )} \phi _{\rho (\tau )}+ \mathscr {O}(h^{p-\rho (\tau )+1}),\quad \forall \tau \in SSENT_m,\quad m\le p, \end{aligned} \right. \end{aligned}$$
(6.28)

also hold for \(\varphi (\varPsi )\) under the mapping (6.21).

We will prove this theorem by induction, using the degree of merge node \(d(\tau )\) as an indicator. We split the proof into two parts, treating \(d(\tau )=0\) and \(d(\tau )>0\) for \(\tau \in SSENT\). As stated in [35], we first note that SNT is in fact a subset of SSENT. In the first part of the proof we show that the statement (6.28) holds for each \(\tau \in SNT\), i.e. for \(d(\tau )=0\).

Noting that \(s(\tau )=1\) holds for all \(\tau \in SNT\), we can rewrite (6.28) as an equivalent form

$$\begin{aligned} \left\{ \begin{aligned}&\sum _{i=1}^{s}\bar{B}_{i}^{(2l)}\varPhi _i(\tau )=\frac{\rho (\tau )!}{\gamma (\tau )} \frac{(-1)^{l}(2l)!}{(\rho (\tau )+1+2l)!} ,\quad \forall \tau \in SNT_m, \quad 2l\le p-m-2, \\&\sum _{i=1}^{s}B_{i}^{(2l)}\varPhi _i(\tau )=\frac{\rho (\tau )!}{\gamma (\tau )} \frac{(-1)^{l}(2l)!}{(\rho (\tau )+2l)!} ,\quad \forall \tau \in SNT_m, \quad 2l\le p-m-1, \end{aligned} \right. \end{aligned}$$
(6.29)

with the definitions of the matrix-valued functions (6.12)–(6.13). Furthermore, taking account of the mapping (6.21) and (6.12)–(6.13), we obtain the following equations

$$\begin{aligned} \left\{ \begin{aligned}&A_{ij}^{(2k)}=a_{ij}(c_i-c_j)^{2k}\dfrac{(-1)^{k}}{2k+1}, \\&\bar{B}_{j}^{(2k)}=\bar{b}_{j}(1-c_j)^{2k}\dfrac{(-1)^{k}}{2k+1}, \\&B_{j}^{(2k)}=b_{j}(1-c_j)^{2k}(-1)^{k}, \end{aligned} \right. \end{aligned}$$
(6.30)

for the constants \(A_{ij}^{(2k)},B_{j}^{(2k)},\bar{B}_{j}^{(2k)}\) by comparing the corresponding coefficients of each term \(V^{k}\). Inserting the new expressions of (6.30) into (6.29) gives

$$\begin{aligned} \left\{ \begin{aligned}&\sum _{i=1}^{s}\bar{b}_{i}(1-c_i)^{2l}\varPhi _i(\tau )=\frac{1}{\gamma (\tau )} \frac{\rho (\tau )!(2l+1)!}{(\rho (\tau )+1+2l)!} ,\quad \forall \tau \in SNT_m, \quad m+2l+1\le p-1, \\&\sum _{i=1}^{s}b_{i}(1-c_i)^{2l}\varPhi _i(\tau )=\frac{1}{\gamma (\tau )} \frac{\rho (\tau )!(2l)!}{(\rho (\tau )+2l)!} ,\quad \forall \tau \in SNT_m, \quad m+2l+1\le p. \end{aligned} \right. \end{aligned}$$
(6.31)

This means that we only need to show (6.31) instead of (6.28).

Noting that \(\varPhi _i(\tau )\) is the weight of the SNT tree \(\tau \), \(c_i^k\varPhi _i(\tau )\) is then the weight of a new SNT tree \(\tau '=\tau _0\times \tau \), where \(\tau _0\) denotes the SNT whose weight is \(\varPhi _i(\tau _0)=c_i^k\).

Considering Lemma 6.1, and noting that \(\gamma (\tau _0)=\rho (\tau _0)=k+1\), we then have

$$\begin{aligned} \rho (\tau ')=\rho (\tau )+k, \quad \gamma (\tau ')=\rho (\tau ')\cdot \frac{\gamma (\tau _0)}{\rho (\tau _0)} \cdot \frac{\gamma (\tau )}{\rho (\tau )} =\gamma (\tau )\frac{\rho (\tau )+k}{\rho (\tau )}, \quad \varPhi _i(\tau ')=c_i^k\varPhi _i(\tau ). \end{aligned}$$
(6.32)

For any \(k\le 2l\), it can be deduced that \(k+\rho (\tau )\le p\), i.e. \(\rho (\tau ')\le p\). Thus, applying the order conditions (6.27) to the special SNT tree \(\tau '\) and using (6.32), we see that the equations

$$\begin{aligned} \left\{ \begin{aligned}&\sum _{i=1}^{s}\bar{b}_{i}\big (c_i^k\varPhi _i(\tau )\big ) =\frac{\rho (\tau )}{\gamma (\tau )(\rho (\tau )+k)(\rho (\tau )+k+1)}, \ \forall \tau \in SNT_m,\ m+k+1\le p-1, \\&\sum _{i=1}^{s}b_{i}\big (c_i^k\varPhi _i(\tau )\big ) =\frac{\rho (\tau )}{\gamma (\tau )(\rho (\tau )+k)}, \ \forall \tau \in SNT_m, \ m+k+1\le p. \end{aligned} \right. \end{aligned}$$
(6.33)

are satisfied. Multiplying both sides of (6.33) by \((-1)^k C_{2l}^k\) and summing over \(k\) from 0 to \(2l\), we obtain

$$\begin{aligned} \left\{ \begin{aligned} \sum _{k=0}^{2l}(-1)^k C_{2l}^k\sum _{i=1}^{s}\bar{b}_{i}\big (c_i^k\varPhi _i(\tau )\big )&=\sum _{i=1}^{s}\bar{b}_{i}(1-c_i)^{2l}\varPhi _i(\tau ) \\ {}&=\sum _{k=0}^{2l}(-1)^k C_{2l}^k\frac{\rho (\tau )}{\gamma (\tau )(\rho (\tau )+k)(\rho (\tau )+k+1)}, \\ \sum _{k=0}^{2l}(-1)^k C_{2l}^k\sum _{i=1}^{s}b_{i}\big (c_i^k\varPhi _i(\tau )\big )&=\sum _{i=1}^{s}b_{i}(1-c_i)^{2l}\varPhi _i(\tau ) \\ {}&=\sum _{k=0}^{2l}(-1)^k C_{2l}^k\frac{\rho (\tau )}{\gamma (\tau )(\rho (\tau )+k)}. \end{aligned} \right. \end{aligned}$$
(6.34)

Comparing (6.31) with (6.34), it can be concluded that if the two conditions,

$$\begin{aligned} \left\{ \begin{aligned}&\sum _{k=0}^{2l}(-1)^k C_{2l}^k\frac{\rho (\tau )}{(\rho (\tau )+k)(\rho (\tau )+k+1)} =\frac{\rho (\tau )!(2l+1)!}{(\rho (\tau )+1+2l)!} \\&\sum _{k=0}^{2l}(-1)^k C_{2l}^k\frac{\rho (\tau )}{(\rho (\tau )+k)} =\frac{\rho (\tau )!(2l)!}{(\rho (\tau )+2l)!}, \end{aligned} \right. \end{aligned}$$
(6.35)

hold for any \(\rho (\tau )\le p\), then Eq. (6.31) will be satisfied. Here \(C_{2l}^k\) denotes the binomial coefficient \(\frac{(2l)!}{k!(2l-k)!}\). It is clear that (6.35) is just a special case of the two identities

$$\begin{aligned} \left\{ \begin{aligned}&\sum _{k=0}^{2l}(-1)^k C_{2l}^k\frac{n}{(n+k)(n+k+1)} =\frac{n!(2l+1)!}{(n+1+2l)!},\quad \forall n\in \mathbb {N}_{+}, \\&\sum _{k=0}^{2l}(-1)^k C_{2l}^k\frac{n}{(n+k)} =\frac{n!(2l)!}{(n+2l)!},\quad \forall n\in \mathbb {N}_{+}. \end{aligned} \right. \end{aligned}$$
(6.36)

Hence, the proof of this part is complete.
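Although (6.36) is a classical pair of identities (both follow from the Beta-function integral \(\int _0^1 x^{n-1}(1-x)^{2l}\,dx\)), it is easy to confirm them with exact rational arithmetic. The script below is only a sanity check, not part of the proof, and all names in it are illustrative.

```python
from fractions import Fraction
from math import comb, factorial

def lhs1(n, l):
    # left-hand side of the first identity in (6.36)
    return sum(Fraction((-1) ** k * comb(2 * l, k) * n,
                        (n + k) * (n + k + 1)) for k in range(2 * l + 1))

def lhs2(n, l):
    # left-hand side of the second identity in (6.36)
    return sum(Fraction((-1) ** k * comb(2 * l, k) * n, n + k)
               for k in range(2 * l + 1))

for n in range(1, 9):
    for l in range(0, 5):
        assert lhs1(n, l) == Fraction(factorial(n) * factorial(2 * l + 1),
                                      factorial(n + 1 + 2 * l))
        assert lhs2(n, l) == Fraction(factorial(n) * factorial(2 * l),
                                      factorial(n + 2 * l))
```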

For the second part of the proof, we suppose that the order conditions for \(\varphi (\varPsi )\) hold for all \(\tau \) with \(d(\tau )=K\) (\(\rho (\tau )\le p\)). This means that the equations

$$\begin{aligned} \left\{ \begin{aligned}&\sum _{i=1}^{s}\bar{B}_{i}\varPhi _i(\tau )=\frac{\rho (\tau )!}{\gamma (\tau )s(\tau )} \phi _{\rho (\tau )+1} + \mathscr {O}(h^{p-\rho (\tau )}), \\&\sum _{i=1}^{s}B_{i}\varPhi _i(\tau )=\frac{\rho (\tau )!}{\gamma (\tau )s(\tau )} \phi _{\rho (\tau )}+ \mathscr {O}(h^{p-\rho (\tau )+1}), \end{aligned} \right. \end{aligned}$$
(6.37)

are satisfied for \(\tau \in SSENT,\ d(\tau )=K\) (\(\rho (\tau )\le p\)). We turn to showing that (6.37) also holds for any \(\tau \in SSENT\) with \(d(\tau )=K+1\) (\(\rho (\tau )\le p\)).

Suppose that \(\tau \in SSENT\), \(\rho (\tau )\le p\) and \(d(\tau )=K+1\). It follows from Definitions 6.3 and 6.4 that there must exist integers \(l\ge 1\), \(n\ge 0\) and a corresponding SSENT tree \(\tau _n\) in \(\tau \), where \(\tau _n\), of the particular form \(\tau _n=\big (W_{+}b_{+}(b_{+}B_{+})^{l}(\tau _0)\big )\), is the nth generation of \(\tau \). Here, it is convenient to suppose that \(\tau _{k+1}\) is the first generation of \(\tau _k\) for \(1\le k\le n-1\) and \(\tau _1\) is the first generation of \(\tau \), that is

$$\begin{aligned} \tau =\big (W_{+}b_{+}(b_{+}B_{+})^{0}(\tau _1)\big )\times \tau _1', \quad \tau _k=\big (W_{+}b_{+}(b_{+}B_{+})^{0}(\tau _{k+1})\big )\times \tau _{k+1}', \end{aligned}$$

where the \(\tau _k'\) are SSENT trees depending on \(\tau \). Using Lemma 6.1, we have the following formulae

$$\begin{aligned} \left\{ \begin{aligned}&s(\tau _k)=s(\tau _{k+1})\cdot s(\tau _{k+1}'), \\&\rho (\tau _k)=\rho (\tau _{k+1})+\rho (\tau _{k+1}')+1, \\&\gamma (\tau _k)=\rho (\tau _k)\big (\rho (\tau _{k+1})+1\big )\gamma (\tau _{k+1}) \frac{\gamma (\tau _{k+1}')}{\rho (\tau _{k+1}')}. \end{aligned} \right. \end{aligned}$$
(6.38)

Recursively iterating (6.38) implies that

$$\begin{aligned} \left\{ \begin{aligned}&s(\tau )=s(\tau _{n})\cdot \prod _{k=1}^{n}s(\tau _{k}'), \\&\rho (\tau )=\rho (\tau _{n})+n+\sum _{k=1}^{n}\rho (\tau _{k}'), \\&\gamma (\tau )=\rho (\tau )\big (\rho (\tau _{n})+1\big )\gamma (\tau _n) \cdot \prod _{k=1}^{n-1}\rho (\tau _k)\big (\rho (\tau _{k})+1\big ) \cdot \prod _{k=1}^{n}\frac{\gamma (\tau _{k}')}{\rho (\tau _{k}')}. \end{aligned} \right. \end{aligned}$$
(6.39)

Modifying the SSENT tree \(\tau \) by merely replacing \(\tau _n\) with \(\tilde{\tau }_n\), we obtain a new tree \(\tilde{\tau }\) (each \(\tau _k\) then becomes a new tree \(\tilde{\tau }_k\), while the \(\tau _k'\) remain unchanged). Let \(\delta =\rho (\tau _n)-\rho (\tilde{\tau }_n)\). Then it follows from (6.39) that

$$\begin{aligned} \left\{ \begin{aligned}&s(\tilde{\tau })=s(\tau )\cdot \frac{s(\tilde{\tau }_n)}{s(\tau _n)}, \\&\rho (\tau )-\rho (\tilde{\tau })=\rho (\tau _1)-\rho (\tilde{\tau }_1) =\cdots =\rho (\tau _n)-\rho (\tilde{\tau }_n)=\delta , \\&\gamma (\tilde{\tau })=\gamma (\tau )\frac{\gamma (\tilde{\tau }_n)}{\gamma (\tau _n)} \big (1-\frac{\delta }{\rho (\tau )}\big )\big (1-\frac{\delta }{\rho (\tau _n)+1}\big ) \cdot \prod _{k=1}^{n-1}\big (1-\frac{\delta }{\rho (\tau _k)}\big ) \big (1-\frac{\delta }{\rho (\tau _k)+1}\big ). \end{aligned} \right. \end{aligned}$$
(6.40)
Fig. 6.3

Figure of tree \(\tau _n=\big (W_{+}b_{+}(b_{+}B_{+})^{l}(\tau _0)\big )\)

Fig. 6.4

Figure of tree \(\tilde{\tau }_n\)

For \(\tau _n=\big (W_{+}b_{+}(b_{+}B_{+})^{l}(\tau _0)\big )\) (see Fig. 6.3), we can derive

$$\begin{aligned} \left\{ \begin{aligned}&s(\tau _n)=(-1)^{l}s(\tau _0), \\&\rho (\tau _n)=\rho (\tau _0)+2l+2, \\&\gamma (\tau _n)=\gamma (\tau _0)\frac{(\rho (\tau _0)+2l+2)!}{\rho (\tau _0)!(2l)!}. \end{aligned} \right. \end{aligned}$$
(6.41)

We now consider the tree \(\tilde{\tau }_n\) with the particular form shown in Fig. 6.4, which depends on an integer parameter \(k\) with \(0\le k\le 2l\).

We then have

$$\begin{aligned} \left\{ \begin{aligned}&s(\tilde{\tau }_n)=s(\tau _0), \\&\rho (\tilde{\tau }_n)=\rho (\tau _0)+2l+2, \\&\gamma (\tilde{\tau }_n)=\big (\rho (\tau _0)+k\big )\big (\rho (\tau _0)+k+1\big ) \big (\rho (\tau _0)+2l+2\big )\frac{\gamma (\tau _0)}{\rho (\tau _0)}. \end{aligned} \right. \end{aligned}$$
(6.42)

Combining (6.40) with (6.41)–(6.42), we derive the following equations

$$\begin{aligned} \left\{ \begin{aligned}&s(\tilde{\tau })=s(\tau )\cdot (-1)^{l}, \\&\rho (\tilde{\tau })=\rho (\tau ),\ i.e.\ \delta =0, \\&\gamma (\tilde{\tau })=\gamma (\tau )\frac{\big (\rho (\tau _0)+k\big ) \big (\rho (\tau _0)+k+1\big )(\rho (\tau _0)-1)!(2l)!}{(\rho (\tau _0)+2l+1)!}. \end{aligned} \right. \end{aligned}$$
(6.43)

Keep in mind that the weights of \(\tau \) and \(\tilde{\tau }(k)\) (we now write \(\tilde{\tau }(k)\) for the new tree \(\tilde{\tau }\), since it depends on \(k\)) can be respectively expressed as

$$\begin{aligned} \left\{ \begin{aligned}&\varPhi _i(\tau )=\sum _{\mu =1}^{s}\sum _{\nu =1}^{s} \Delta _{\mu }A_{\mu \nu }^{(2l)}\Xi _{\nu } =\sum _{\mu =1}^{s}\sum _{\nu =1}^{s} \Delta _{\mu }a_{\mu \nu }(c_{\mu }-c_{\nu })^{2l} \frac{(-1)^{l}}{2l+1}\Xi _{\nu }, \\&\varPhi _i(\tilde{\tau }(k))=\sum _{\mu =1}^{s}\sum _{\nu =1}^{s} c_{\mu }^{2l-k}\Delta _{\mu }a_{\mu \nu }c_{\nu }^{k}\Xi _{\nu }, \end{aligned} \right. \end{aligned}$$
(6.44)

where \(\Delta _{\mu },\Xi _{\nu }\) are sums depending on the other branches of \(\tau \). It follows from (6.44) that

$$\begin{aligned} \varPhi _i(\tau )=\sum _{k=0}^{2l}\frac{(-1)^{l+k}C_{2l}^{k}}{2l+1} \varPhi _i(\tilde{\tau }(k)). \end{aligned}$$
(6.45)

Moreover, from Definition 6.3, the equation \(d(\tilde{\tau }(k))=d(\tau )-1=K\) holds for any \(0\le k\le 2l\). By the induction hypothesis, the order conditions (6.37) are satisfied for each such \(\tilde{\tau }(k)\ (0\le k\le 2l)\). Combining Eq. (6.45) with the order conditions for \(\tilde{\tau }(k)\), we have

$$\begin{aligned} \left\{ \begin{aligned} \sum _{i=1}^{s}\bar{B}_{i}\varPhi _i(\tau )&=\sum _{k=0}^{2l}\frac{(-1)^{l+k}C_{2l}^{k}}{2l+1} \sum _{i=1}^{s}\bar{B}_{i}\varPhi _i(\tilde{\tau }(k)) \\&=\sum _{k=0}^{2l}\frac{(-1)^{l+k}C_{2l}^{k}}{2l+1} \frac{\rho (\tilde{\tau }(k))!}{\gamma (\tilde{\tau }(k))s(\tilde{\tau }(k))} \phi _{\rho (\tilde{\tau }(k))+1} + \mathscr {O}(h^{p-\rho (\tilde{\tau }(k))}), \\ \sum _{i=1}^{s}B_{i}\varPhi _i(\tau )&=\sum _{k=0}^{2l}\frac{(-1)^{l+k}C_{2l}^{k}}{2l+1} \sum _{i=1}^{s}B_{i}\varPhi _i(\tilde{\tau }(k)) \\&=\sum _{k=0}^{2l}\frac{(-1)^{l+k}C_{2l}^{k}}{2l+1} \frac{\rho (\tilde{\tau }(k))!}{\gamma (\tilde{\tau }(k))s(\tilde{\tau }(k))} \phi _{\rho (\tilde{\tau }(k))}+ \mathscr {O}(h^{p-\rho (\tilde{\tau }(k))+1}). \end{aligned} \right. \end{aligned}$$
(6.46)

Taking account of the formula (6.43), which relates \(\tau \) to the new SSENT trees \(\tilde{\tau }(k)\), the equations in (6.46) imply that

$$\begin{aligned} \left\{ \begin{aligned} \sum _{i=1}^{s}\bar{B}_{i}\varPhi _i(\tau )&=\sum _{k=0}^{2l}\frac{(-1)^{k}C_{2l}^{k}(\rho (\tau _0)+2l+1)!}{\big (\rho (\tau _0)+k\big )\big (\rho (\tau _0)+k+1\big )(\rho (\tau _0)-1)!(2l+1)!} \cdot \frac{\rho (\tau )!}{\gamma (\tau )s(\tau )}\phi _{\rho (\tau )+1}\\ {}&+ \mathscr {O}(h^{p-\rho (\tau )}), \\ \sum _{i=1}^{s}B_{i}\varPhi _i(\tau )&=\sum _{k=0}^{2l}\frac{(-1)^{k}C_{2l}^{k}(\rho (\tau _0)+2l+1)!}{\big (\rho (\tau _0)+k\big )\big (\rho (\tau _0)+k+1\big )(\rho (\tau _0)-1)!(2l+1)!} \cdot \frac{\rho (\tau )!}{\gamma (\tau )s(\tau )}\phi _{\rho (\tau )}\\ {}&+ \mathscr {O}(h^{p-\rho (\tau )+1}), \end{aligned} \right. \end{aligned}$$
(6.47)

on replacing \(s(\tilde{\tau }(k)),\gamma (\tilde{\tau }(k)),\rho (\tilde{\tau }(k))\) by \(s(\tau ),\gamma (\tau ),\rho (\tau )\). Comparing (6.47) with (6.37), we observe that the order conditions are satisfied for \(\tau \) provided the following equation holds:

$$\begin{aligned} \sum _{k=0}^{2l}\frac{(-1)^{k}C_{2l}^{k}(\rho (\tau _0)+2l+1)!}{\big (\rho (\tau _0)+k\big )\big (\rho (\tau _0)+k+1\big )(\rho (\tau _0)-1)!(2l+1)!} =1. \end{aligned}$$
(6.48)

As before, (6.48) is just a special case of the identity

$$\begin{aligned} \sum _{k=0}^{2l}\frac{(-1)^{k}C_{2l}^{k}(n+2l+1)!}{\big (n+k\big )\big (n+k+1\big )(n-1)!(2l+1)!} =1, \quad n,l\in \mathbb {N}_{+}. \end{aligned}$$

Hence, (6.37) also holds for any \(\tau \in SSENT\) such that \(d(\tau )=K+1\) (\(\rho (\tau )\le p\)).
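Like (6.36), this identity can be confirmed with exact rational arithmetic (it reduces to the same Beta-function sum). The following check is an independent sanity test, not part of the proof.

```python
from fractions import Fraction
from math import comb, factorial

def S(n, l):
    # left-hand side of the identity generalizing (6.48)
    num = factorial(n + 2 * l + 1)
    den = factorial(n - 1) * factorial(2 * l + 1)
    return sum(Fraction((-1) ** k * comb(2 * l, k) * num,
                        (n + k) * (n + k + 1) * den)
               for k in range(2 * l + 1))

assert all(S(n, l) == 1 for n in range(1, 9) for l in range(1, 6))
```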

Since both the base case and the inductive step have been established, the proof of the theorem is complete.    \(\square \)

The theorems established in this section essentially reveal the relation between classical RKN methods and ERKN methods. The original and natural way to construct high-order ERKN methods is based on the order conditions (6.28); in this way only ERKN methods of general order five or six have been found so far, and finding an ERKN method of arbitrarily high order is quite difficult due to the high complexity. However, the theoretical results stated above provide another, much simpler way to construct high-order ERKN methods: we only need to find the corresponding reduced RKN method, and such methods have been well studied in the literature. Furthermore, ERKN methods with particular properties, such as symmetry and symplecticity, can also be obtained via the mapping (6.21) applied to their reduced RKN methods. In this way, we can gain knowledge of ERKN methods by studying RKN methods instead of ERKN methods themselves, especially for the construction of high-order ERKN methods.

6.5 Numerical Experiments

In order to illustrate applications of the results presented in the previous section, we conduct some numerical experiments. First, we select the following classical RKN methods:

  • RKN3s4: the three-stage symmetric symplectic Runge–Kutta–Nyström method of order four proposed by Forest and Ruth [6];

  • RKN7s6: the seven-stage symmetric symplectic Runge–Kutta–Nyström method of order six given by Okunbor and Skeel [21];

  • RKN6s6: the six-stage Runge–Kutta–Nyström method of order six given by Papakostas and Tsitouras [22];

  • RKN16s10: the sixteen-stage Runge–Kutta–Nyström method of order ten presented by Dormand, El-Mikkawy and Prince [5].

Then from the mapping (6.21), their corresponding ERKN methods are also obtained with the individual properties maintained. We denote their corresponding ERKN methods as ERKN3s4, ERKN7s6, ERKN6s6, and ERKN16s10, respectively.

During the numerical experiments, we display the efficiency curves and the energy conservation for each Hamiltonian system. It should be noted that the numerical solution obtained by RKN16s10 with a small stepsize is used as the reference solution whenever the analytical solution cannot be given explicitly.

Problem 6.1

We first consider an orbital problem with perturbation [29]

$$\begin{aligned} \left\{ \begin{aligned}&q_1''=-q_1-\frac{2\varepsilon +\varepsilon ^2}{r^5}q_1, \quad q_1(0)=1,\quad q_1'(0)=0, \\&q_2''=-q_2-\frac{2\varepsilon +\varepsilon ^2}{r^5}q_2, \quad q_2(0)=0,\quad q_2'(0)=1+\varepsilon , \end{aligned} \right. \end{aligned}$$

where \(r=\sqrt{q_1^2+q_2^2}\), and the analytical solution is given by

$$\begin{aligned} q_1(t)=\cos (t+\varepsilon t),\quad q_2(t)=\sin (t+\varepsilon t), \end{aligned}$$

with the Hamiltonian

$$\begin{aligned} H=\frac{p_1^2+p_2^2}{2}+\frac{q_1^2+q_2^2}{2}- \frac{2\varepsilon +\varepsilon ^2}{3(q_1^2+q_2^2)^{\frac{3}{2}}}. \end{aligned}$$
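Since \(q_1^2+q_2^2\equiv 1\) along the analytic solution, the Hamiltonian is constant there. A few lines of code make this explicit and provide a cheap test for any implementation; this is an illustrative sketch with \(\varepsilon =10^{-3}\) as in the experiment, and all names in it are assumptions of the sketch.

```python
import math

eps = 1e-3

def solution(t):
    # analytic solution of the perturbed orbital problem and its momenta
    w = 1 + eps
    return (math.cos(w * t), math.sin(w * t),
            -w * math.sin(w * t), w * math.cos(w * t))

def H(q1, q2, p1, p2):
    r2 = q1 * q1 + q2 * q2
    return ((p1 * p1 + p2 * p2) / 2 + r2 / 2
            - (2 * eps + eps * eps) / (3 * r2 ** 1.5))

H0 = H(*solution(0.0))
assert all(abs(H(*solution(t)) - H0) < 1e-12 for t in (1.0, 10.0, 1000.0))
```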

We numerically integrate the problem on the interval [0, 1000] with \(\varepsilon =10^{-3}\). The efficiency curves in Fig. 6.5a clearly show that the ERKN methods are usually superior to their corresponding reduced RKN methods with respect to the global error (GE), and that a high-order RKN/ERKN method also performs better than a low-order one on an oscillatory problem. Figure 6.5b demonstrates that the symplectic methods (RKN3s4, RKN7s6, ERKN3s4, ERKN7s6) conserve the Hamiltonian well, while the non-symplectic methods lead to a linear energy dissipation over long times. Detailed results on the energy conservation of ERKN3s4 and ERKN7s6 are shown in Fig. 6.6. All these results in Figs. 6.5 and 6.6 are consistent with the behavior of classical numerical methods, and show that the ERKN methods obtained by applying the map \(\varphi \) to the reduced RKN methods are remarkably efficient and effective.

Fig. 6.5

Results for Problem 6.1: a The log-log plot of maximum global error GE against number of function evaluations; b the logarithm of the maximum global error of Hamiltonian \(GEH=max|H_n-H_0|\) against \(\log _{10}(t)\) with the stepsize \(h=1\)

Fig. 6.6

Results for Problem 6.1: the global error for Hamiltonian of symplectic methods ERKN3s4 and ERKN7s6 with the stepsize \(h=2\)

Problem 6.2

We consider the Hénon–Heiles system

$$\begin{aligned} \left\{ \begin{aligned}&q''_1+q_1=-2q_1q_2, \\&q''_2+q_2=-q_1^2+q_2^2, \end{aligned} \right. \end{aligned}$$
(6.49)

with the initial conditions \(q_1(0)=\sqrt{\frac{5}{48}},p_2(0)=\frac{1}{4},q_2(0)=p_1(0)=0\). The Hamiltonian of the system is given by

$$\begin{aligned} H(p,q)=\frac{1}{2}(p_1^2+p_2^2)+\frac{1}{2}(q_1^2+q_2^2) +q_1^2q_2-\frac{1}{3}q_2^3. \end{aligned}$$

We first integrate this problem on the interval [0, 1000] with different stepsizes. The efficiency curves for each method are shown in Fig. 6.7a, which indicate that the ERKN methods are merely comparable in efficiency with their corresponding reduced RKN methods, since \(\Vert M\Vert \) now has nearly the same magnitude as \(\Vert \frac{\partial f}{\partial q}\Vert \). The same phenomenon occurs in Fig. 6.7b, where the energy-conservation curve of each method is plotted. We can also observe from Fig. 6.7 that the difference between the symplectic methods is a little more noticeable than that between the non-symplectic ones. The good energy conservation of the symplectic ERKN methods (ERKN3s4 and ERKN7s6) is clearly shown in Fig. 6.8, which demonstrates that symplecticity is maintained by the mapping \(\varphi \) very well.
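The qualitative behavior described above is easy to reproduce. The sketch below integrates (6.49) with a plain Störmer–Verlet (leapfrog) scheme, which is not one of the methods tested here but is likewise symplectic, so its energy error stays bounded without secular drift; the stepsize and tolerance are illustrative choices.

```python
import math

def accel(q1, q2):
    # right-hand sides of (6.49) written as q'' = a(q)
    return -q1 - 2 * q1 * q2, -q2 - q1 * q1 + q2 * q2

def H(q1, q2, p1, p2):
    return (0.5 * (p1 * p1 + p2 * p2) + 0.5 * (q1 * q1 + q2 * q2)
            + q1 * q1 * q2 - q2 ** 3 / 3)

q1, q2 = math.sqrt(5 / 48), 0.0
p1, p2 = 0.0, 0.25
H0 = H(q1, q2, p1, p2)

h = 0.01
a1, a2 = accel(q1, q2)
max_drift = 0.0
for _ in range(100_000):        # integrate up to t = 1000
    p1 += 0.5 * h * a1; p2 += 0.5 * h * a2
    q1 += h * p1;       q2 += h * p2
    a1, a2 = accel(q1, q2)
    p1 += 0.5 * h * a1; p2 += 0.5 * h * a2
    max_drift = max(max_drift, abs(H(q1, q2, p1, p2) - H0))

assert max_drift < 1e-3   # bounded energy error, no linear drift
```

The trajectory stays in the bounded region of the Hénon–Heiles potential since \(H_0=1/12<1/6\).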

Fig. 6.7

Results for Problem 6.2: a The log-log plot of maximum global error GE against the number of function evaluations; b the logarithm of the maximum global error of Hamiltonian \(GEH=max|H_n-H_0|\) against \(\log _{10}(t)\) with the stepsize \(h=0.5\)

Fig. 6.8

Results for Problem 6.2: the global error for Hamiltonian of symplectic methods ERKN3s4 and ERKN7s6 with the stepsize \(h=0.5\)

Problem 6.3

We consider the sine-Gordon equation [13] with the periodic boundary conditions

$$\begin{aligned} \left\{ \begin{aligned}&\frac{\partial ^2 u}{\partial t^2} = \frac{\partial ^2 u}{\partial x^2} - \sin u, \ -5\le x \le 5, \ t\ge 0, \\&u(-5,t)=u(5,t). \end{aligned}\right. \end{aligned}$$
(6.50)

A semi-discretization on the spatial variable with the second-order symmetric differences gives the following differential equations in time

$$\begin{aligned} \frac{d^2 U}{dt^2} + MU = F(U), \end{aligned}$$
(6.51)

where \(U(t)=(u_1(t),\ldots ,u_N(t))^{\intercal }\) with \(u_i(t)\approx u(x_i,t),\) \(x_i=-5+i\Delta x\) for \(i=1,\ldots ,N\), \(\Delta x=10/N\), and

$$\begin{aligned} \begin{aligned}&M=\frac{1}{\Delta x^2}\begin{pmatrix} 2 &{} -1 &{} &{} &{} -1 \\ -1 &{} 2 &{} -1&{} &{} \\ &{} \ddots &{}\ddots &{}\ddots &{} \\ &{} &{} -1&{} 2 &{} -1 \\ -1 &{} &{} &{} -1 &{} 2\end{pmatrix} \\&F(U)=-\sin (U)=-(\sin u_1,\ldots ,\sin u_N)^{\intercal }. \end{aligned} \end{aligned}$$
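The semi-discretization is easy to set up directly. The following self-contained sketch (variable names illustrative) builds the circulant matrix M and the nonlinearity F for \(N=64\), and confirms two structural facts: the rows of M sum to zero, so M is positive semi-definite but singular, and the eigenvalues of this circulant matrix are \((2-2\cos (2\pi k/N))/\Delta x^2\).

```python
import math

N = 64
dx = 10.0 / N

# circulant second-difference matrix of the periodic semi-discretization
M = [[0.0] * N for _ in range(N)]
for i in range(N):
    M[i][i] = 2.0 / dx ** 2
    M[i][(i - 1) % N] = -1.0 / dx ** 2
    M[i][(i + 1) % N] = -1.0 / dx ** 2

def F(U):
    return [-math.sin(u) for u in U]

# eigenvalues of a circulant matrix with first row (2, -1, 0, ..., 0, -1)/dx^2
eigs = [(2 - 2 * math.cos(2 * math.pi * k / N)) / dx ** 2 for k in range(N)]

assert all(abs(sum(row)) < 1e-9 for row in M)   # M (1,...,1)^T = 0, so M is singular
assert min(eigs) >= -1e-12                      # positive semi-definite
assert abs(max(eigs) - 4 / dx ** 2) < 1e-9      # largest eigenvalue 4/dx^2
```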

The corresponding Hamiltonian is given by

$$\begin{aligned} H(U',U)=\dfrac{1}{2}U'^{\intercal }U' + \dfrac{1}{2}U^{\intercal }MU -\big (\cos u_1 + \ldots + \cos u_N\big ). \end{aligned}$$

For this problem, we take the initial conditions as

$$\begin{aligned} U(0)=(\pi )_{i=1}^N, \ U_t(0)=\sqrt{N}\big (0.01 +\sin (\frac{2\pi i}{N})\big )_{i=1}^{N}, \end{aligned}$$

with \(N=64\). For the efficiency curves in Fig. 6.9a, we integrate the problem up to \(t_{end}=10\) with different stepsizes. Figure 6.9a shows the good efficiency and accuracy of all the ERKN methods. In Fig. 6.9b, all methods give rise to energy dissipation, even the symplectic ones. This phenomenon is mainly caused by the chaotic behavior of the problem, in which a sufficiently small perturbation may lead to a significant error after a long time, and this growth is typically exponential. It can be observed from Fig. 6.10 that the numerical reference solution obtained by RKN16s10 shows a notable difference between the initial interval [0, 100] and the terminal interval [900, 1000]. Figure 6.11 further demonstrates that the global errors of RKN7s6 and ERKN7s6 grow nearly exponentially with time t. This may lead to non-conservation for symplectic methods in practical numerical computations.

Fig. 6.9

Results for Problem 6.3: a The log-log plot of maximum global error GE against number of function evaluations; b the logarithm of the maximum global error of Hamiltonian \(GEH=max|H_n-H_0|\) against \(\log _{10}(t)\) with the stepsize \(h=0.01\)

Fig. 6.10

Results for Problem 6.3: the numerical reference solution on different intervals obtained by RKN16s10 with the stepsize \(h=0.001\)

Problem 6.4

We consider the Fermi-Pasta-Ulam problem (see, e.g. [10]), which can be expressed by a Hamiltonian system with the Hamiltonian

$$\begin{aligned} \begin{aligned} H(y,x)=\dfrac{1}{2}\sum _{i=1}^{2m}y_i^2 + \dfrac{\omega ^2}{2}\sum _{i=1}^{m}x_{m+i}^2 + \dfrac{1}{4}\Big ( (x_1-x_{m+1})^4\\ + \sum _{i=1}^{m-1}(x_{i+1}-x_{m+i+1}-x_i-x_{m+i})^4 + (x_m+x_{2m})^4 \Big ), \end{aligned}\end{aligned}$$
(6.52)

where \(x_i\) represents a scaled displacement of the ith stiff spring, \(x_{m+i}\) is a scaled expansion (or compression) of the ith stiff spring, and \(y_i, y_{m+i}\) are their velocities (or momenta).

Fig. 6.11

Results for Problem 6.3: the global error for RKN7s6 and ERKN7s6 with the stepsize \(h=0.01\), respectively

The corresponding Hamiltonian system is given by

$$\begin{aligned} \left\{ \begin{aligned}&x'=H_y(y,x), \\&y'=-H_x(y,x), \end{aligned}\right. \end{aligned}$$
(6.53)

which can also be written in the equivalent form of the oscillatory second-order differential equation

$$\begin{aligned} x''(t)+Mx(t)=-\nabla _x U(x), \end{aligned}$$
(6.54)

where

$$\begin{aligned} \begin{aligned}&y=x',\ M=\begin{pmatrix} \mathbf {0}_{m\times m} &{} \mathbf {0}_{m\times m} \\ \mathbf {0}_{m\times m} &{} \omega ^2 I_{m\times m} \end{pmatrix}, \\&U(x)=\dfrac{1}{4}\Big ( (x_1-x_{m+1})^4 + \sum _{i=1}^{m-1}(x_{i+1}-x_{m+i+1}-x_i-x_{m+i})^4 + (x_m+x_{2m})^4 \Big ). \end{aligned} \end{aligned}$$
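The gradient \(\nabla _x U\) needed on the right-hand side of (6.54) can be coded and checked against central finite differences. The sketch below assumes \(m=3\) and 0-based indexing (so \(x_1\) maps to x[0], and so on); the function names and the test point are illustrative assumptions, not from the text.

```python
m = 3

def U(x):
    s = (x[0] - x[m]) ** 4
    for i in range(1, m):
        s += (x[i] - x[m + i] - x[i - 1] - x[m + i - 1]) ** 4
    s += (x[m - 1] + x[2 * m - 1]) ** 4
    return s / 4

def gradU(x):
    # analytic gradient of the quartic potential U
    g = [0.0] * (2 * m)
    d = (x[0] - x[m]) ** 3
    g[0] += d; g[m] -= d
    for i in range(1, m):
        d = (x[i] - x[m + i] - x[i - 1] - x[m + i - 1]) ** 3
        g[i] += d; g[m + i] -= d; g[i - 1] -= d; g[m + i - 1] -= d
    d = (x[m - 1] + x[2 * m - 1]) ** 3
    g[m - 1] += d; g[2 * m - 1] += d
    return g

# finite-difference check at a generic point
x = [0.3, -0.1, 0.2, 0.05, -0.04, 0.01]
g = gradU(x)
h = 1e-6
for i in range(2 * m):
    xp, xm = list(x), list(x)
    xp[i] += h; xm[i] -= h
    fd = (U(xp) - U(xm)) / (2 * h)
    assert abs(fd - g[i]) < 1e-7
```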

In the experiment, we choose

$$\begin{aligned} m=3,\ x_1(0)=1,\ y_1(0)=1,\ x_4(0)=\frac{1}{\omega },\ y_4(0)=1,\ \omega =200, \end{aligned}$$

and choose zero for the remaining initial values. The numerical results are shown in Fig. 6.12. As with Problem 6.3, we integrate the equations over a short interval with \(t_{end}=20\) to reduce the influence of the chaotic behavior. Both panels of Fig. 6.12 show the good efficiency of the ERKN methods in both the global error and the energy error. In particular, symplecticity is again maintained by the map \(\varphi \): the symplectic methods ERKN3s4 and ERKN7s6 display stable energy conservation, in the sense of numerical computation, in Fig. 6.13.

Fig. 6.12 Results for Problem 6.4: a the log-log plot of the maximum global error \(GE\) against the number of function evaluations; b the logarithm of the maximum global error of the Hamiltonian \(GEH=\max |H_n-H_0|\) against \(\log _{10}(t)\) with the stepsize \(h=0.005\)

Fig. 6.13 Results for Problem 6.4: the error in the Hamiltonian for the symplectic methods ERKN3s4 and ERKN7s6 with the stepsize \(h=0.01\)

6.6 Conclusions and Discussions

In this chapter, we studied in greater depth the ERKN methods for solving (6.1) based on the group structure of numerical methods. After constructing the RKN group and the ERKN group, we first presented the inherent relationship between ERKN and RKN methods: there exists an epimorphism \(\eta \) of the ERKN group onto the RKN group. This epimorphism gives a clear and exact meaning to the word 'extension' from RKN methods to ERKN methods, and describes the ERKN group in terms of the RKN group in the sense of structure preservation. Moreover, we established the particular mapping \(\varphi \) defined by (6.21), which maps an RKN method to an ERKN method. A series of theorems about this mapping shows that the image element can be regarded as an ideal representative element of each congruence class of the ERKN group; that is, the image ERKN element preserves almost all the properties of its RKN preimage. The mapping \(\varphi \) also provides an effective approach to constructing ERKN methods of arbitrarily high order (symmetric or symplectic), whereas the original approach based directly on the order conditions (and the symmetric or symplectic conditions) is much more complicated. Furthermore, the numerical simulations in Sect. 6.5 strongly support the theoretical analysis in Sect. 6.4, and the numerical results are very promising. The high-order structure-preserving ERKN methods obtained in this simple and effective way show better efficiency and accuracy than their corresponding reduced methods (obtained by setting \(V = 0\)), namely, the RKN methods.

Recall that the exponential Fourier collocation methods for first-order differential equations were derived and analysed in Sect. 6.3. Accordingly, the next chapter will present trigonometric collocation methods for multi-frequency and multidimensional second-order oscillatory systems.

The material of this chapter is based on the recent work by Mei and Wu [17].