1 Introduction

Convolution operators play an important role in numerous applications which are modelled by linear time-invariant nonhomogeneous evolution equations. This includes wave and heat propagation problems in time and space-time which are formulated either as ordinary or partial differential equations or as the corresponding integral equations.

The discretization will be based on the convolution quadrature (CQ) method which has been developed originally by Lubich, see [12, 13, 15, 16] for parabolic problems and [14] for hyperbolic ones. The idea is to express the convolution kernel k as the inverse Laplace transform of some transfer operator K and to formulate the problem as an integro-differential equation in the Laplace domain.

The discretization then consists of approximating the (time-dependent) differential equation in the Laplace domain by a time stepping method—besides multistep methods, also Runge-Kutta methods have been proposed and analyzed for this purpose [5, 12, 13, 15]. The transformation back to the time domain results in a discrete convolution equation which can then be solved numerically. This method is nowadays one of the most popular methods in this field.

However, the CQ method as well as its analysis relies strongly on the use of constant time steps. In [9, 11], the generalized convolution quadrature (gCQ) has been introduced which allows for variable time stepping. The approach was limited to the first-order implicit Euler scheme.

The goal of this paper is to introduce the Runge-Kutta generalized convolution quadrature, which results in a method with much faster convergence rates as well as improved long-time behavior of the approximation compared to the implicit Euler method. The possibility to use variable time steps allows one to adaptively resolve non-smooth behavior of the temporal solution, which occurs, e.g., in the short time range after an electric circuit is switched on and before it has reached a periodic state.

The paper is structured as follows. In Sect. 2 we will briefly recall the definition of one-sided convolution operators and define the class of convolution kernels which we will consider in this paper. In Sect. 3 we will introduce the Runge-Kutta generalized convolution quadrature for the discretization of convolution operators. Its stability and convergence will be analyzed in Sect. 4, where the summation-by-parts formula for divided Runge-Kutta differences will be derived for this purpose. Section 5 is devoted to the numerical solution of convolution equations. We will present the discrete equations and derive an associativity property for the composition of Runge-Kutta generalized convolution operators which allows us to use the stability and error analysis of Sect. 4 to derive corresponding estimates for the discrete solution. Finally, we will report in Sect. 6 the results of numerical experiments which illustrate that, for problems where the regularity of the solution is not uniformly distributed over the time interval, our method converges with optimal rates while other CQ-type methods converge suboptimally.

2 The class of problems

We will consider the class of convolution operators as described in [14, Sect. 2.1] and recall its definition. Let B and D denote some normed vector spaces and let \(\mathcal {L}\left( B,D\right) \) be the space of continuous, linear mappings. As a norm in \(\mathcal {L}\left( B,D\right) \) we take the usual operator norm

$$\begin{aligned} \left\| \mathcal {F}\right\| _{D\leftarrow B}:=\sup _{u\in B\backslash \left\{ 0\right\} }\frac{\left\| \mathcal {F}u\right\| _{D}}{\left\| u\right\| _{B}}. \end{aligned}$$

For given \(\phi :\mathbb {R}_{\ge 0}\rightarrow B\), we consider the convolution

$$\begin{aligned} \int _{0}^{t}k\left( t-\tau \right) \phi \left( \tau \right) d\tau \quad \text {in }D\quad \text { for all }\;t\in \left[ 0,T\right] . \end{aligned}$$
(1)

The kernel operator k is defined as the inverse Laplace transform of a given transfer operator K. The class of problems under consideration is defined as follows. For \(\sigma \in \mathbb {R}\) we introduce the half-plane \(\mathbb {C}_{\sigma }:=\left\{ z\in \mathbb {C}:{\text {Re}}z>\sigma \right\} \).

Assumption 1

For some \(\sigma _{K}\in \mathbb {R}\) (describing the analyticity region) and some \(\mu \in \mathbb {R}\) (describing the growth behavior), the class \(\mathcal {A}_{\sigma _{K}}^{\mu }\left( B,D\right) \) of transfer operators consists of operator valued mappings \(K:\mathbb {C}_{\sigma _{K} }\rightarrow \mathcal {L}\left( B,D\right) \) which satisfy:

  1.

    \(K:\mathbb {C}_{\sigma _{K}}\rightarrow \mathcal {L}\left( B,D\right) \) is analytic.

  2.

    For any \(\sigma >\sigma _{K}\), there exists a constant \(C_{\sigma }>0\) such that K satisfies the algebraic growth estimate

    $$\begin{aligned} \left\| K\left( z\right) \right\| _{D\leftarrow B}\le C_{\sigma }\left( 1+\left| z\right| \right) ^{\mu }\quad \forall \; z\in \mathbb {C}_{\sigma }. \end{aligned}$$
    (2)

For \(j\in \mathbb {Z}\), we define

$$\begin{aligned} K_{j}\left( z\right) :=z^{-j}K\left( z\right) . \end{aligned}$$
(3)

For any

$$\begin{aligned} \nu \in \mathbb {N}_{0}\quad \text {such that }\;\nu >\mu +1, \end{aligned}$$
(4)

the Laplace inversion formula

$$\begin{aligned} k_{\nu }\left( t\right) :=\frac{1}{2\pi \text {i}}\int _{\gamma }\text {e}^{zt}K_{\nu }\left( z\right) dz \end{aligned}$$
(5)

for a contour \(\gamma :=\sigma +\text {i}\mathbb {R}\), with \(\sigma > \sigma _{K}\), defines a continuous and exponentially bounded operator \(k_{\nu }\left( t\right) \), which by Cauchy’s integral theorem vanishes for \(t<0\).

Let

$$\begin{aligned} C_{0}^{j}\left( \left[ 0,T\right] , B\right) :=\left\{ \psi \in C^{j}\left( \left[ 0,T\right] , B\right) \mid \,\, \forall \, 0\le r\le j-1:\;\psi ^{\left( r\right) }\left( 0\right) =0\right\} . \end{aligned}$$

As in [14] we denote the convolution \(k*\phi \) for \(\phi \in C_{0}^{\nu }\left( \left[ 0,T\right] , B\right) \) and \(\nu \) as in (4) by

$$\begin{aligned} \left( K\left( \partial _{t}\right) \phi \right) \left( t\right) :=\int _{0}^{t}k_{\nu }\left( \tau \right) \partial _{t}^{\nu }\phi \left( t-\tau \right) d\tau . \end{aligned}$$
(6)

Then

$$\begin{aligned} \left( K\left( \partial _{t}\right) \phi \right) \left( t\right) =\frac{1}{2\pi \text {i}}\int _{\gamma }K_{\nu }\left( z\right) \left( \int _{0}^{t}\text {e}^{z\left( t-\tau \right) }\partial _{t}^{\nu }\phi \left( \tau \right) d\tau \right) dz, \end{aligned}$$
(7)

where the integrals exist as Riemann integrals.
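To make (6) concrete, consider the scalar example (ours, not taken from the paper) \(K\left( z\right) =\left( z+a\right) ^{-1}\) with \(a>0\): Assumption 1 holds with \(\sigma _{K}=-a\) and \(\mu =-1\), so that \(\nu =1\) in (4). Then \(k_{1}\left( t\right) =\left( 1-\text {e}^{-at}\right) /a\) and, for \(\phi \in C_{0}^{1}\left( \left[ 0,T\right] ,\mathbb {C}\right) \),

$$\begin{aligned} \left( K\left( \partial _{t}\right) \phi \right) \left( t\right) =\int _{0}^{t}\frac{1-\text {e}^{-a\tau }}{a}\,\phi ^{\prime }\left( t-\tau \right) d\tau =\int _{0}^{t}\text {e}^{-a\left( t-\tau \right) }\phi \left( \tau \right) d\tau , \end{aligned}$$

where the second equality follows by integration by parts and \(\phi \left( 0\right) =0\), i.e., (6) reproduces the classical convolution with the kernel \(k\left( t\right) =\text {e}^{-at}\).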

Remark 2

Equation (7) can be rewritten as the coupled system

$$\begin{aligned} \left( K\left( \partial _{t}\right) \phi \right) \left( t\right) =\frac{1}{2\pi \text {i}}\int _{\gamma }K_{\nu }\left( z\right) u_{\nu }\left( z,t\right) dz \end{aligned}$$
(8a)

with the solution \(u_{\nu }\) of

$$\begin{aligned} \partial _{t}u_{\nu }(z,t)=zu_{\nu }(z,t)+\partial _{t}^{\nu }\phi (t),\quad u_{\nu }(z,0)=0, \end{aligned}$$
(8b)

and \(\gamma \) a suitable contour in the complex plane: either a vertical contour running from \(\sigma -\text {i}\infty \) to \(\sigma +\text {i}\infty \), for some \(\nu \) which satisfies (4), or a suitable closed contour clockwise oriented.

3 Runge-Kutta generalized convolution quadrature

3.1 Runge-Kutta methods

The discretization of the convolution (6) will be based on a discretization of the ordinary differential equation by a Runge-Kutta method with variable time steps. In this section, we will introduce the class of Runge-Kutta methods which we will consider and collect some basic properties—for proofs and further details we refer to [8].

We consider Runge-Kutta methods with s stages given by the Butcher table \(\mathbf {A}=\left( a_{i,j}\right) _{i,j=1}^{s}\), \(\mathbf {b}=\left( b_{i}\right) _{i=1}^{s}\), \(\mathbf {c}=\left( c_{i}\right) _{i=1}^{s}\). For the discretization we employ a sequence of time points \(\Theta :=(t_{n})_{n=0}^{N}\) with

$$\begin{aligned} 0=t_{0}<t_{1}<\cdots <t_{N}=T,\quad \Delta _{j}=t_{j}-t_{j-1},\quad \Delta :=\max _{1\le j\le N}\Delta _{j}. \end{aligned}$$
(9)

The local quasi-uniformity of the mesh is defined as the constant

$$\begin{aligned} c_{\Theta }:=\frac{1}{2}\max _{2\le i\le N}\left( \frac{\Delta _{i}}{\Delta _{i-1}}+\frac{\Delta _{i-1}}{\Delta _{i}}\right) . \end{aligned}$$
(10a)

As a further (mild) assumption on the mesh width we impose the condition on the maximal mesh width

$$\begin{aligned} \Delta \le C_{\Theta } T/N. \end{aligned}$$
(10b)
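Such meshes arise, e.g., from algebraic grading towards \(t=0\); the following sketch (our own illustration, not part of the paper's algorithm; the grading exponent g is an arbitrary choice) computes the constants in (10) for a graded mesh.

```python
import numpy as np

def graded_mesh(T, N, g):
    """Algebraically graded time mesh t_j = T*(j/N)**g, refined towards t = 0."""
    return T * (np.arange(N + 1) / N) ** g

T, N, g = 1.0, 20, 3.0
t = graded_mesh(T, N, g)
dt = np.diff(t)                                              # step sizes Delta_j
c_theta = 0.5 * np.max(dt[1:] / dt[:-1] + dt[:-1] / dt[1:])  # constant in (10a)
C_theta = np.max(dt) / (T / N)                               # constant in (10b)
print(f"c_Theta = {c_theta:.3f}, Delta <= {C_theta:.3f} * T/N")
```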

Notation 3

The internal time points are defined by \(t_{n,i}=t_{n-1}+c_{i}\Delta _{n}\), \(i=1,\ldots ,s\). For a function g which is defined in the time interval \(\left[ 0,T\right] \), we introduce

$$\begin{aligned} \mathbf {g}^{\left( n\right) }:=\left( g\left( t_{n,i}\right) \right) _{i=1}^{s}\in \mathbb {C}^{s}. \end{aligned}$$

The time step n is denoted as a superscript for vectors and matrices in order to avoid confusion with their components. The m-th time derivative of a function u is denoted by \(\partial _{t}^{m}u\) and its evaluation at some time point \(t_{k}\) is

$$\begin{aligned} \partial _{t}^{m}u^{\left( k\right) }:=\frac{d^{m}u}{dt^{m}}\left( t_{k}\right) . \end{aligned}$$

Further, we introduce \(\mathbf {1}=\left( 1\right) _{i=1}^{s}\) and, for vectors \(\mathbf {v},\mathbf {w}\in \mathbb {C}^{s}\), the bilinear (not sesquilinear!) form

$$\begin{aligned} \mathbf {v}\cdot \mathbf {w}:= \sum _{j=1}^{s}v_{j}w_{j}. \end{aligned}$$

We also recall here the Hadamard product of two vectors \(\mathbf {v},\mathbf {w}\in \mathbb {C}^{s}\), given by

$$\begin{aligned} \mathbf {v\odot w}=\left( v_{i}w_{i}\right) _{i=1}^{s} \quad \text {and} \quad \mathbf {v}^{m\mathbf {\odot }}=\underset{m\text {-times}}{\underbrace{\mathbf {v\odot \ldots \odot v}}}. \end{aligned}$$

The application of the s-stage Runge-Kutta method to the initial value problem \(y^{\prime }=f\left( t,y\right) \), \(y\left( 0\right) =y_{0}\) can be written as the following recursion

$$\begin{aligned} Y_{i}^{\left( n\right) }&=y^{\left( n-1\right) }+ \Delta _{n} \sum _{j=1}^{s}a_{i,j}f\left( t_{n-1}+c_{j}\Delta _{n},Y_{j}^{\left( n\right) }\right) \quad i=1,\ldots ,s\\ y^{\left( n\right) }&=y^{\left( n-1\right) }+ \Delta _{n} \sum _{j=1}^{s}b_{j}f\left( t_{n-1}+c_{j}\Delta _{n},Y_{j}^{\left( n\right) }\right) . \end{aligned}$$
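As an illustration (ours), one implicit step of this recursion for the linear test equation \(f\left( t,y\right) =\lambda y\) with the two-stage Radau IIA table (used again in later sketches) reads:

```python
import numpy as np

# Two-stage Radau IIA Butcher table (classical order p = 3, stage order q = 2)
A = np.array([[5/12, -1/12],
              [3/4,   1/4]])
b = np.array([3/4, 1/4])
c = np.array([1/3, 1.0])

def rk_step_linear(y, dt, lam):
    """One implicit Runge-Kutta step for y' = lam*y: solve the stage system."""
    s = len(b)
    # Stages satisfy Y = y*1 + dt*lam*A@Y, i.e. (I - dt*lam*A) Y = y*1
    Y = np.linalg.solve(np.eye(s) - dt * lam * A, y * np.ones(s))
    return y + dt * lam * b @ Y          # update y^{(n)}

lam, T, N = -4.0, 1.0, 40
y, dt = 1.0, T / N
for _ in range(N):
    y = rk_step_linear(y, dt, lam)
print(y, np.exp(lam * T))                # both ~0.01832
```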

The Runge-Kutta method has (classical) order \(p\ge 1\) and stage order q if for sufficiently smooth right-hand side f

$$\begin{aligned} Y_{i}^{\left( 1\right) }-y\left( c_{i}\Delta _{1}\right) =\mathcal {O} \left( \Delta _{1}^{q+1}\right) \quad \forall \; i=1,\ldots ,s\quad \text {and}\quad y^{\left( 1\right) }-y\left( t_{1}\right) =\mathcal {O} \left( \Delta _{1}^{p+1}\right) , \end{aligned}$$

as \(\Delta _{1}\rightarrow 0\).

For the analysis of the Runge-Kutta method, the stability function

$$\begin{aligned} R\left( z\right) :=1+z\mathbf {b}\cdot \left( \mathbf {I}-z\mathbf {A}\right) ^{-1}\mathbf {1} \end{aligned}$$
(11)

plays a central role; here, and in the following \(\mathbf {I}\) denotes the identity matrix. Throughout the paper we assume that the Runge-Kutta method satisfies the following assumption.

Assumption 4

The Runge-Kutta method is A-stable, that is,

$$\begin{aligned} \left| R\left( z\right) \right| \le 1\quad \forall \; z\in \mathbb {C}\;\text { with }\;{\text {Re}}z\le 0, \end{aligned}$$
(12)

with classical order \(p\ge 1\) and stage order \(q\le p\), and it is stiffly accurate, that is,

$$\begin{aligned} \mathbf {b}= \mathbf {A}^{\intercal }\mathbf {e}_{s}\quad \text {with}\quad \mathbf {e}_{s} =\left( 0,\ldots ,0,1\right) ^{\intercal } \in \mathbb {R}^{s}. \end{aligned}$$
(13)

We will further assume that R is a Padé approximation to the exponential function and \(c_{s}=1\).

Two important families of methods satisfying Assumption 4 are the Radau IIA and Lobatto IIIC methods.
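Both parts of Assumption 4 are easy to test numerically; the following sketch (ours) checks stiff accuracy (13), A-stability on the imaginary axis, and \(R\left( \infty \right) =0\) for the two-stage Radau IIA table introduced above.

```python
import numpy as np

A = np.array([[5/12, -1/12], [3/4, 1/4]])
b = np.array([3/4, 1/4])
one = np.ones(2)

def R(z):
    """Stability function R(z) = 1 + z * b . (I - zA)^{-1} 1, cf. (11)."""
    return 1 + z * b @ np.linalg.solve(np.eye(2) - z * A, one)

print(np.allclose(b, A.T @ np.array([0.0, 1.0])))      # stiff accuracy (13)
print(max(abs(R(1j * y)) for y in np.linspace(-100, 100, 2001)) <= 1 + 1e-12)
print(abs(R(1e8)))                                     # R(z) -> 0 as |z| -> infinity
```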

Remark 5

In what follows we will repeatedly use the following properties of Runge-Kutta methods satisfying Assumption 4:

  1.

    Since R is a Padé approximation of the exponential function, Theorem 4.12 in [8] implies that the coefficient matrix \(\mathbf {A}\) is diagonalizable and all eigenvalues \(d_{i}\), \(1\le i\le s\), have strictly positive real part. In particular, \(\mathbf {A}\) is invertible.

  2.

    Condition (13) implies \(R\left( \infty \right) =0\) [8, Chap. IV, Prop. 3.8].

  3.

    If the method has stage order q, it holds ([8, (15.5)])

    $$\begin{aligned} \mathbf {Ac}^{\left( m-1\right) \odot }=\frac{1}{m}\mathbf {c}^{m\odot }\quad \forall \; 1\le m\le q. \end{aligned}$$
    (14)
  4.

    If the method has order p, it follows (cf. [1, 16]; see also the numerical check after this remark)

    $$\begin{aligned} \mathbf {b}\cdot \mathbf {A}^{\ell }\mathbf {c}^{\left( k-1\right) \odot }=\mathbf {b}\cdot \mathbf {A}^{\ell -1}\mathbf {c}^{k\odot }/k = \frac{(k-1)!}{(k+\ell )!},\quad \forall \; k+\ell \le p. \end{aligned}$$
    (15)
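The following small check (ours) confirms (14) and (15) — including the factor \((k-1)!/(k+\ell )!\) — for the two-stage Radau IIA method with \(p=3\), \(q=2\).

```python
import numpy as np
from math import factorial

A = np.array([[5/12, -1/12], [3/4, 1/4]])
b = np.array([3/4, 1/4])
c = np.array([1/3, 1.0])
p, q = 3, 2

# (14): A c^{(m-1) Hadamard} = c^{m Hadamard} / m for 1 <= m <= q
for m in range(1, q + 1):
    assert np.allclose(A @ c**(m - 1), c**m / m)

# (15): b . A^l c^{(k-1) Hadamard} = (k-1)! / (k+l)! for k + l <= p
for k in range(1, p + 1):
    for l in range(0, p - k + 1):
        lhs = b @ np.linalg.matrix_power(A, l) @ c**(k - 1)
        assert np.isclose(lhs, factorial(k - 1) / factorial(k + l))
print("order conditions (14), (15) verified for two-stage Radau IIA")
```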

3.2 Discretization of the convolution operator

The starting point of the discretization of the convolution operator is the representation (8). We will add more flexibility in the discretization by replacing the regularization parameter \(\nu \) by a parameter \(\rho \in \mathbb {N}_{0}\). The stability and convergence analysis will show that \(\rho \) can be chosen in the range

$$\begin{aligned} \nu -(q+1)\le \rho \le p+\nu -(q+1), \end{aligned}$$
(16)

where \(\nu >\mu +1\) is as in (4)–(5), p is the order of the Runge-Kutta method which we will employ for the discretization and q is the stage order; some hints for the choice of \(\rho \) will be given in Remarks 7 and 20.

The discretization will be based on an approximation of the ordinary differential equation (cf. (8b))

$$\begin{aligned} \partial _{t}u_{\rho }(z,t)=zu_{\rho }(z,t)+\partial _{t}^{\rho }\phi (t),\quad u_{\rho }(z,0)=0. \end{aligned}$$

Assumption 4 includes the stiff accuracy condition (13), so that the approximation at \(t_{n-1}\) is given by the last stage of the previous step and the chosen Runge-Kutta method can be written in the form

$$\begin{aligned} \mathbf {u}_{\rho }^{\left( n\right) }\left( z\right) =\left( \mathbf {e}_{s}\cdot \mathbf {u}_{\rho }^{\left( n-1\right) }(z)\right) \mathbf {1}+\Delta _{n}\mathbf {A}\left( z\mathbf {u}_{\rho }^{\left( n\right) }(z)+\partial _{t}^{\rho } \varvec{\phi } ^{\left( n\right) }\right) . \end{aligned}$$
(17)

We can write (17) as a recurrence for \(\mathbf {u}_{\rho }^{\left( n\right) }\)

$$\begin{aligned} \mathbf {u}_{\rho }^{\left( n\right) }(z)= & {} (\mathbf {I}-\Delta _{n}z\mathbf {A})^{-1}\left( \left( \mathbf {e}_{s}\cdot \mathbf {u}_{\rho }^{\left( n-1\right) }(z)\right) \mathbf {1}+\Delta _{n}\mathbf {A\partial }_{t}^{\rho } \varvec{\phi } ^{\left( n\right) }\right) \nonumber \\= & {} \left( \mathbf {e}_{s}\cdot \mathbf {u}_{\rho }^{\left( n-1\right) }\left( z\right) \right) \mathbf {R}\left( \Delta _{n}z\right) +\Delta _{n}\left( \mathbf {I}-z\Delta _{n}\mathbf {A}\right) ^{-1}\mathbf {A}\partial _{t}^{\rho } \varvec{\phi } ^{\left( n\right) } \end{aligned}$$
(18)

with

$$\begin{aligned} \mathbf {R}\left( z\right) :=\left( \mathbf {I}-z\mathbf {A}\right) ^{-1}\mathbf {1}. \end{aligned}$$
(19)

From the identity

$$\begin{aligned} \left( \mathbf {I}-z\mathbf {A}\right) ^{-1}\mathbf {A}=\frac{1}{z}\left( \mathbf {I}-z\mathbf {A}\right) ^{-1}-\frac{1}{z}\mathbf {I} \end{aligned}$$
(20)

which holds whenever \(\mathbf {I}-z\mathbf {A}\) is invertible, we conclude that the last component \(\mathbf {e}_{s}\cdot \mathbf {R}\) equals the stability function \(R\) (cf. (11)).
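This identity is easily verified numerically; a minimal sketch (ours) for the two-stage Radau IIA table:

```python
import numpy as np

A = np.array([[5/12, -1/12], [3/4, 1/4]])
b = np.array([3/4, 1/4])
e_s, one = np.array([0.0, 1.0]), np.ones(2)

for z in [0.3 + 1j, -2.0, 5j]:
    res = np.linalg.solve(np.eye(2) - z * A, one)    # bold-R(z), cf. (19)
    assert np.isclose(e_s @ res, 1 + z * b @ res)    # last component = R(z), cf. (11)
print("e_s . bold-R(z) coincides with the stability function R(z)")
```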

The last component in (18), \(\left( \mathbf {u}_{\rho }^{\left( n\right) }\right) _{s}\), then defines the approximation of \(u_{\rho }\left( z,t_{n}\right) \).

Definition 6

(Runge-Kutta generalized convolution quadrature) Let the transfer operator K satisfy (2) and let \(\nu \in \mathbb {N}_{0}\) be the smallest integer such that \(\nu >\mu +1\). Let \(\phi \in C_{0}^{\nu }\left( \left[ 0,T\right] , B\right) \) and consider the convolution operation

$$\begin{aligned} \left( K\left( \partial _{t}\right) \phi \right) \left( t\right) =\frac{1}{2\pi \text {i}}\int _{\gamma }K_{\nu }\left( z\right) u_{\nu }\left( z,t\right) dz\quad \text {with }u_{\nu }\text { as in (8b)}. \end{aligned}$$
(21)

Let a Runge-Kutta method be given which satisfies Assumption 4. Then the discretization of (21) by Runge-Kutta generalized convolution quadrature is given by

$$\begin{aligned} \left( K_{\rho }\left( \partial _{t}^{\Theta }\right) \partial _{t}^{\rho }\phi \right) ^{\left( n\right) }:=\frac{1}{2\pi \text {i}}\int _{\gamma }K_{\rho }\left( z\right) \mathbf {u}_{\rho }^{\left( n\right) }\left( z\right) dz \end{aligned}$$
(22)

with \(\mathbf {u}_{\rho }^{\left( 0\right) }=\mathbf {0}\) and

$$\begin{aligned} \mathbf {u}_{\rho }^{\left( n\right) }(z)=\left( \mathbf {e}_{s} \cdot \mathbf {u}_{\rho }^{\left( n-1\right) }\left( z\right) \right) \mathbf {R}\left( \Delta _{n}z\right) +\Delta _{n}\left( \mathbf {I} -z\Delta _{n}\mathbf {A}\right) ^{-1}\mathbf {A}\partial _{t}^{\rho } \varvec{\phi } ^{\left( n\right) },\quad n=1,2,\ldots . \end{aligned}$$

The approximation of \(K\left( \partial _{t}\right) \phi \) at time point \(t_{n}\) is given by the last component \(\mathbf {e}_{s}\cdot \left( K_{\rho }\left( \partial _{t}^{\Theta }\right) \partial _{t}^{\rho }\phi \right) ^{\left( n\right) }\). Here, \(\rho \in \mathbb {N}_{0}\) is a regularization parameter (cf. (16)) which can be chosen in the range

$$\begin{aligned} \nu -\left( q+1\right) \le \rho \le p+\nu -\left( q+1\right) , \end{aligned}$$

where p is the classical order of the Runge-Kutta method and q denotes the stage order.

Remark 7

It is important to mention that \(\gamma \) in (22), typically, is not chosen as the vertical contour but as a finite closed contour which encircles the poles of \(\mathbf {u}_{\rho }^{\left( n\right) }\) and is counter clockwise oriented. For the practical realization the contour integral in (22) has to be approximated by numerical quadrature (see also Remark 20); for the implicit Euler method this has been developed and analyzed in [10, 11] while for Runge-Kutta methods this is the topic of a forthcoming paper.
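Purely to make Definition 6 concrete (Remark 7 defers the design of an efficient quadrature to a forthcoming paper), the following sketch (ours: the kernel \(K(z)=(z+1)^{-1}\), the data \(\phi (t)=t^{2}\), the truncated vertical contour and its naive discretization are all illustrative choices, not the quadrature advocated here) evaluates (22) with \(\rho =0\) for the two-stage Radau IIA method and compares with the exact convolution.

```python
import numpy as np

A = np.array([[5/12, -1/12], [3/4, 1/4]])    # two-stage Radau IIA
c = np.array([1/3, 1.0])

def gcq(K, phi, t, sigma=1.0, Y=500.0, M=20001):
    """Sketch of (22) with rho = 0 on the vertical contour sigma + i[-Y, Y],
    discretized by an elementary quadrature (all parameter choices are ours)."""
    z = sigma + 1j * np.linspace(-Y, Y, M)   # contour nodes
    h = z[1] - z[0]
    u = np.zeros((M, 2), dtype=complex)      # stage vectors u^{(n)}(z)
    w = [0.0]
    for n in range(1, len(t)):
        dt = t[n] - t[n - 1]
        rhs = u[:, -1, None] * np.ones(2) + dt * (A @ phi(t[n - 1] + c * dt))
        # recursion (18): solve (I - dt*z*A) u^{(n)} = rhs for every node z
        u = np.linalg.solve(np.eye(2) - dt * z[:, None, None] * A, rhs[..., None])[..., 0]
        w.append((h * np.sum(K(z) * u[:, -1]) / (2j * np.pi)).real)
    return np.array(w)

t = np.linspace(0.0, 1.0, 21)
w = gcq(lambda z: 1 / (z + 1), lambda x: x**2, t)   # K(z)=1/(z+1), phi(t)=t^2
exact = t**2 - 2*t + 2 - 2*np.exp(-t)               # int_0^t e^{-(t-s)} s^2 ds
print(np.max(np.abs(w - exact)))   # small (~1e-3), dominated by truncation at Y
```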

4 Error analysis of Runge-Kutta generalized convolution quadrature

The analysis of the Runge-Kutta gCQ consists of several steps: First, we will resolve the recursion in (18) to express \(\mathbf {u}_{\rho }^{\left( n\right) }\) as a sum over the history. This allows us to employ a summation-by-parts formula which gains negative powers of z (and hence a faster decay of the integrand for large z) at the expense of increased smoothness requirements on the input function \(\phi \).

4.1 Summation-by-parts

The recursion (18) can be resolved and we obtain

$$\begin{aligned} \mathbf {u}_{\rho }^{\left( n\right) }\left( z\right)&=\Delta _{n}\left( \mathbf {I}-z\Delta _{n}\mathbf {A}\right) ^{-1}\mathbf {A}\partial _{t}^{\rho }\varvec{\phi }^{\left( n\right) }\\&\quad +\sum _{k=1}^{n-1}\Delta _{k}\left( {\displaystyle \prod \limits _{\ell =k+1}^{n-1}} R\left( \Delta _{\ell }z\right) \right) \left( \mathbf {e}_{s}\cdot \left( \mathbf {I}-z\Delta _{k}\mathbf {A}\right) ^{-1}\mathbf {A}\partial _{t}^{\rho }\varvec{\phi }^{\left( k\right) }\right) \mathbf {R}\left( \Delta _{n}z\right) . \end{aligned}$$

For the last component \(\mathbf {e}_{s}\cdot \mathbf {u}_{\rho }^{\left( n\right) }\left( z\right) \) this formula simplifies and we obtain

$$\begin{aligned} \mathbf {e}_{s}\cdot \mathbf {u}_{\rho }^{\left( n\right) }\left( z\right) =\sum _{k=1}^{n}\Delta _{k}\left( {\displaystyle \prod \limits _{\ell =k+1}^{n}} R\left( \Delta _{\ell }z\right) \right) \left( \mathbf {e}_{s}\cdot \left( \mathbf {I}-z\Delta _{k}\mathbf {A}\right) ^{-1}\mathbf {A}\partial _{t}^{\rho }\varvec{\phi }^{\left( k\right) }\right) . \end{aligned}$$
(23)

For the forthcoming analysis it is convenient to write this equation by using tensor calculus. Let us define the tensor product of vectors for \(k\ge 1\) by

$$\begin{aligned} \mathbf {e}_{s}^{k\otimes }:= {\displaystyle \bigotimes \limits _{\ell =1}^{k}} \mathbf {e}_{s},\qquad \mathbf {1}^{k\otimes }:= {\displaystyle \bigotimes \limits _{\ell =1}^{k}} \mathbf {1} \end{aligned}$$
(24)

and the Kronecker tensor product of matrices

$$\begin{aligned} \mathbb {A}^{\left( k,n\right) }\left( z\right) := {\displaystyle \bigotimes \limits _{\ell =k}^{n}} \left( \mathbf {I}-z\Delta _{\ell }\mathbf {A}\right) ^{-1}. \end{aligned}$$

Recall that a Kronecker matrix \(\bigotimes _{j=1}^{d}\mathbf {B}^{(j)}\ \)is applied to a tensor \(\bigotimes _{j=1}^{d}\mathbf {v}^{\left( j\right) }\) of vectors \(\mathbf {v}^{\left( j\right) }\) by means of

$$\begin{aligned} \left( \bigotimes _{j=1}^{d}\mathbf {B}^{(j)}\right) \left( \bigotimes _{j=1}^{d}\mathbf {v}^{\left( j\right) }\right) = {\displaystyle \bigotimes \limits _{j=1}^{d}} \mathbf {B}^{\left( j\right) }\mathbf {v}^{\left( j\right) }. \end{aligned}$$

The canonical extension of the bilinear form \(\mathbf {v} \cdot \mathbf {w}\) to tensors is

$$\begin{aligned} \left( \bigotimes _{j=1}^{d}\mathbf {v}^{\left( j\right) }\right) \cdot \left( \bigotimes _{j=1}^{d}\mathbf {w}^{\left( j\right) }\right) = {\displaystyle \prod \limits _{j=1}^{d}} \mathbf {v}^{\left( j\right) }\cdot \mathbf {w}^{\left( j\right) }. \end{aligned}$$

Finally, the vectorization is given by

$$\begin{aligned} \left( \left( \bigotimes _{j=1}^{d-1}\mathbf {v}^{\left( j\right) }\right) \otimes \bullet \right) \cdot \left( \bigotimes _{j=1}^{d}\mathbf {w}^{\left( j\right) }\right)&:= \left( \left( \bigotimes _{j=1}^{d-1} \mathbf {v}^{\left( j\right) }\right) \cdot \left( \bigotimes _{j=1} ^{d-1}\mathbf {w}^{\left( j\right) }\right) \right) \mathbf {w}^{\left( d\right) }\end{aligned}$$
(25)
$$\begin{aligned}&= \left( {\displaystyle \prod \limits _{j=1}^{d-1}} \mathbf {v}^{\left( j\right) }\cdot \mathbf {w}^{\left( j\right) }\right) \mathbf {w}^{\left( d\right) }. \end{aligned}$$
(26)

Then, we have

$$\begin{aligned} \mathbf {u}_{\rho }^{\left( n\right) }\left( z\right) =\sum _{k=1}^{n} \Delta _{k}\left( \mathbf {e}_{s}^{\left( n-k\right) \otimes }\otimes \bullet \right) \cdot \left( \mathbb {A}^{\left( k,n\right) }\left( z\right) \left( \mathbf {A}\partial _{t}^{\rho } \varvec{\phi } ^{\left( k\right) }\otimes \mathbf {1}^{\left( n-k\right) \otimes }\right) \right) . \end{aligned}$$
(27)
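Identifying Kronecker matrices and tensors with numpy's kron (a realization choice of ours), the two calculus rules above read:

```python
import numpy as np

rng = np.random.default_rng(0)
B1, B2 = rng.standard_normal((2, 2)), rng.standard_normal((2, 2))
v1, v2 = rng.standard_normal(2), rng.standard_normal(2)
w1, w2 = rng.standard_normal(2), rng.standard_normal(2)

# (B1 (x) B2)(v1 (x) v2) = (B1 v1) (x) (B2 v2)
lhs = np.kron(B1, B2) @ np.kron(v1, v2)
print(np.allclose(lhs, np.kron(B1 @ v1, B2 @ v2)))

# (v1 (x) v2) . (w1 (x) w2) = (v1 . w1)(v2 . w2)   (bilinear, no conjugation)
print(np.isclose(np.kron(v1, v2) @ np.kron(w1, w2), (v1 @ w1) * (v2 @ w2)))
```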

In the next step, we will introduce difference operators which are related to the time steps \(t_{k}\) and we will discuss their relation to Newton’s divided differences later. Let again \(\Theta :=(t_{n})_{n=0}^{N}\) denote the time grid with steps \(\Delta _{j}=t_{j}-t_{j-1}\). Formally we extend the time grid to the negative time axis by setting \(t_{-j}=-j\Delta _{1}\), \(j\in \mathbb {N}\).

Definition 8

(Divided Runge-Kutta differences) Let a Runge-Kutta method be given by the Butcher table \(\mathbf {A}\), \(\mathbf {b}\), \(\mathbf {c}\) with non-singular \(\mathbf {A}\) and \(c_{s}=1\). For a subset \(\mathcal {I} \subset \mathbb {Z}\) of consecutive integers, let \(\Theta _{\mathcal {I}}:=\left( x_{k}\right) _{k\in \mathcal {I}}\subset \mathbb {R}\) denote a sequence of strictly increasing points with steps \(\Delta _{k}=x_{k}-x_{k-1}\). We set

$$\begin{aligned} \mathcal {I}^{\prime }=\left\{ k\in \mathbb {Z}\mid \left\{ k-1,k\right\} \subset \mathcal {I}\right\} . \end{aligned}$$

For a function v which is defined in the points \(x_{k,r}:=x_{k-1} +c_{r}\Delta _{k}\), for all \(k\in \mathcal {I}^{\prime }\) and \(1\le r\le s\), the Runge-Kutta differences \( [\![\ldots ]\!]v\) are given by the recursion:

$$\begin{aligned}{}[\![x_{k} ]\!]v:=\mathbf {v}^{\left( k\right) }:=\left( v\left( x_{k,r}\right) \right) _{r=1}^{s}\quad \forall \; k\in \mathcal {I}^{\prime } \end{aligned}$$
(28)

and for all \(i,k\in \mathcal {I}^{\prime }\) with \(i<k\)

$$\begin{aligned}{}[\![x_{i},x_{i+1},\ldots ,x_{k} ]\!]v:=\left( \Delta _{k}\mathbf {A}\right) ^{-1}\left( [\![x_{i+1},\ldots ,x_{k} ]\!]v-\left( \mathbf {e}_{s} \cdot [\![ x_{i},\ldots ,x_{k-1}]\!] v \right) \mathbf {1} \right) . \end{aligned}$$
(29)

For \(m\in \mathbb {N}_{0}\), the tuple of m-th order Runge-Kutta differences is given by

$$\begin{aligned}{}[\![\Theta ]\!]^{m}v:=\left( [\![t_{k-m},\ldots ,t_{k} ]\!]v\right) _{k\in \mathcal {I}^{\prime }}. \end{aligned}$$
(30)

For a tuple \(\mathbb {V}=\left( \mathbf {v}^{\left( k\right) }\right) _{k\in \mathcal {I}^{\prime }}\) of vectors \(\mathbf {v}^{\left( k\right) }=\left( v_{r}^{\left( k\right) }\right) _{r=1}^{s}\) we set

$$\begin{aligned}{}[\![x_{i},\ldots ,x_{k} ]\!]\mathbb {V}:= [\![x_{i},\ldots ,x_{k} ]\!]v \end{aligned}$$

for any continuous function v which interpolates \(\mathbb {V}\) at the mesh points, i.e., \(v\left( x_{k,r}\right) =v_{r}^{\left( k\right) }\) for all \(k\in \mathcal {I}^{\prime }\) and \(1\le r\le s\).

In particular we have (cf. (28))

$$\begin{aligned}{}[\![x_{k-1},x_{k} ]\!]v=\left( \Delta _{k}\mathbf {A}\right) ^{-1}\left( \mathbf {v}^{\left( k\right) }-\left( \mathbf {e}_{s} \cdot \mathbf {v}^{\left( k-1\right) } \right) \mathbf {1} \right) . \end{aligned}$$
(31)
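A direct transcription of the recursion (28)–(31) (our sketch; the mesh and the test function are arbitrary choices) shows how first-order Runge-Kutta differences act like derivatives at the stage points; this is made precise in Proposition 11 below.

```python
import numpy as np

A = np.array([[5/12, -1/12], [3/4, 1/4]])
c = np.array([1/3, 1.0])
e_s = np.array([0.0, 1.0])

def rk_divided_difference(x, v, i, k):
    """[[x_i, ..., x_k]] v via the recursion (29); x is the mesh, v a callable."""
    if i == k:                                   # (28): stage values on [x_{k-1}, x_k]
        return v(x[k - 1] + c * (x[k] - x[k - 1]))
    dk = x[k] - x[k - 1]
    upper = rk_divided_difference(x, v, i + 1, k)
    lower = rk_divided_difference(x, v, i, k - 1)
    return np.linalg.solve(dk * A, upper - (e_s @ lower) * np.ones(2))

x = np.array([0.0, 0.1, 0.25, 0.45])             # a non-uniform mesh (our choice)
# first-order difference [[x_2, x_3]] sin approximates cos at the stage points
print(rk_divided_difference(x, np.sin, 2, 3))
print(np.cos(x[2] + c * (x[3] - x[2])))          # close; error O(Delta^q)
```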

Proposition 9

(Summation by parts formula) Let a Runge-Kutta method satisfy Assumption 4. Let \(w:\mathbb {R}_{\ge 0}\rightarrow \mathbb {C}\) be a function which can be continuously extended to \(\mathbb {R}_{<0}\) by zero. The time mesh satisfies (9) and is extended by \(t_{-j} =-j\Delta _{1}\) for \(j\in \mathbb {N}\). Set \(\mathbf {w}^{\left( j\right) }=\left( w\left( t_{j,r}\right) \right) _{r=1}^{s}\in \mathbb {C}^{s}\), \(j\in \mathbb {Z}_{\le N}\) and let \(\mathbf {e}_{s}^{r\otimes }\), \(\mathbf {1} ^{r\otimes }\) be as in (24). Then, for any \(m\in \mathbb {N}_{0}\)

$$\begin{aligned}&\sum _{k=0}^{n}\Delta _{k}\left( \mathbf {e}_{s}^{\left( n-k\right) \otimes }\otimes \bullet \right) \cdot \left( \mathbb {A}^{\left( k,n\right) }\left( z\right) \left( \mathbf {Aw}^{\left( k\right) }\otimes \mathbf {1}^{\left( n-k\right) \otimes }\right) \right) \nonumber \\&\quad =-\sum _{\ell =0}^{m-1}\frac{ [\![t_{n-\ell },\ldots ,t_{n} ]\!]w}{z^{\ell +1}}\nonumber \\&\qquad +\frac{1}{z^{m}}\sum _{k=0}^{n}\Delta _{k}\left( \mathbf {e} _{s}^{\left( n-k\right) \otimes }\otimes \bullet \right) \cdot \left( \mathbb {A}^{\left( k,n\right) }\left( z\right) \left( \mathbf {A} [\![t_{k-m},\ldots ,t_{k} ]\!]w\otimes \mathbf {1}^{\left( n-k\right) \otimes }\right) \right) .\nonumber \\ \end{aligned}$$
(32)

For the corresponding generalized discrete convolution operator it holds

$$\begin{aligned} K\left( \partial _{t}^{\Theta }\right) w = K_{m}\left( \partial _{t}^{\Theta }\right) [\![\Theta ]\!]^{m}w. \end{aligned}$$
(33)

Proof

Applying (20) to each summand of the left-hand side in (32) and rearranging the resulting sums by means of the difference recursion (29), (31) gives the case \(m=1\) of (32). This onefold summation by parts can be iterated and leads to the assertion.

The second relation (33) is a simple consequence of Cauchy’s integral theorem.   \(\square \)

The following proposition states the boundedness of the right-hand side in (32) with respect to a decreasing step size in terms of the stage order of the underlying Runge–Kutta method.

Definition 10

Let \(r\in \mathbb {N}_{0}\), \(T>0\), and V be a normed vector space with norm \(\left\| \cdot \right\| _{V}\). For a vector-valued function \(\mathbf {v}\in V^{s}\), we set

$$\begin{aligned} \left\| \mathbf {v}\right\| _{V}:=\max _{1\le i\le s}\left\| v_{i}\right\| _{V} \end{aligned}$$

if no confusion is possible.

For a function \(w\in C^{r}\left( \left[ 0,T\right] , V\right) \) and any interval \(\tau \subset \left[ 0,T\right] \), we set

$$\begin{aligned} \left| w\right| _{C^{r}\left( \tau ,V\right) }:=\frac{1}{r!} \sup _{t\in \tau }\left\| \partial ^{r}w\left( t\right) \right\| _{V} \quad \text {and} \quad \left\| w\right\| _{C^{r}\left( \tau ,V\right) }:=\max _{0\le \ell \le r}\left| w\right| _{C^{\ell }\left( \tau ,V\right) }. \end{aligned}$$

Proposition 11

Let a Runge–Kutta method be given by the Butcher table \(\mathbf {A}\), \(\mathbf {b}\), \(\mathbf {c}\) with non-singular \(\mathbf {A}\) and \(c_{s}=1\). Let V be a normed vector space. If the method has stage order q then for \(0\le \ell \le q+1\) and any \(w\in C^{q+1}\left( \left[ t_{k-\ell },t_{k}\right] , V\right) \) it holds

$$\begin{aligned}{}[\![t_{k-\ell },t_{k+1-\ell },\ldots ,t_{k} ]\!]w&=\partial _{t}^{\ell }\mathbf {w}^{\left( k\right) }+\mathbf {T} _{q+1-\ell }^{\left( k\right) },\\ \left\| \mathbf {T}_{q+1-\ell }^{\left( k\right) }\right\| _{V}&\le C\left| w\right| _{C^{q+1}\left( \left[ t_{k-\ell },t_{k}\right] ,V\right) }\Delta _{k}^{q+1-\ell }, \end{aligned}$$

where C depends on \(c_{\Theta }\) (cf. (10a)), q, and \(\mathbf {A}\).

Proof

The proof is by induction. For \(\ell =0\) the result is obvious and we even have equality: \( [\![t_{k} ]\!]w=\mathbf {w}^{\left( k\right) }\) so that \(\mathbf {T}_{q+1}^{\left( k\right) }=\mathbf {0}.\)

Let us assume now that the result is true for \(\ell -1\). Then for \(\ell \) we have

$$\begin{aligned}{}[\![t_{k-\ell },t_{k-\ell +1},\dots ,t_{k} ]\!]w&=\Delta _{k}^{-1}\mathbf {A}^{-1}\left( [\![t_{k-\ell +1},\dots ,t_{k} ]\!]w\!-\!\left( \mathbf {e}_{s}\cdot [\![t_{k-\ell },\dots ,t_{k-1} ]\!]w\right) \mathbf {1}\right) \nonumber \\&=\Delta _{k}^{-1}\mathbf {A}^{-1}\left( \partial _{t}^{\ell -1}\mathbf {w} ^{\left( k\right) }-\left( \mathbf {e}_{s}\cdot \partial _{t}^{\ell -1}\mathbf {w}^{\left( k-1\right) }\right) \mathbf {1}+\tilde{\mathbf {T}}_{q+1-\ell }^{\left( k\right) }\right) \nonumber \\&=\Delta _{k}^{-1}\mathbf {A}^{-1}\left( \left( \int _{t_{k-1}}^{t_{k,m} }\partial _{t}^{\ell }w\right) _{m=1}^{s}+\tilde{\mathbf {T}}_{q+1-\ell }^{\left( k\right) }\right) , \end{aligned}$$
(34)

where

$$\begin{aligned} \tilde{\mathbf {T}}_{q+1-\ell }^{\left( k\right) }:=\mathbf {T}_{q+2-\ell }^{\left( k\right) }-\left( \mathbf {e}_{s}\cdot \mathbf {T}_{q+2-\ell }^{\left( k-1\right) }\right) \mathbf {1}. \end{aligned}$$

Conditions (14) imply

$$\begin{aligned} \Delta _{j}\mathbf {A\partial }_{t}^{\ell }\mathbf {w}^{(j)}=\left( \int _{t_{j-1} }^{t_{j,m}}\partial _{t}^{\ell }w\right) _{m=1}^{s}+ \varvec{\xi } ^{\left( j\right) }, \end{aligned}$$

with \(\tau _{j}:=\left[ t_{j-1},t_{j}\right] \) and

$$\begin{aligned} \left\| \varvec{\xi } ^{\left( j\right) }\right\| _{V}\le C_{q}\Delta _{j}^{r+1}\left\| \partial _{t}^{\ell +r}w\right\| _{C^{0}\left( \tau _{j},V\right) } \quad \forall \; 0\le r\le q. \end{aligned}$$

We apply this for \(r=q+1-\ell \) and obtain

$$\begin{aligned} \Delta _{j}^{-1}\mathbf {A}^{-1}\left( \int _{t_{j-1}}^{t_{j,m}}\partial _{t}^{\ell }w\right) _{m=1}^{s}=\partial _{t}^{\ell }\mathbf {w}^{\left( j\right) }+\widetilde{ \varvec{\xi } }^{\left( j\right) } \end{aligned}$$
(35)

with

$$\begin{aligned} \left\| \widetilde{ \varvec{\xi } }^{\left( j\right) }\right\| _{V}\le C_{q}C_{\mathbf {A}}\left| w\right| _{C^{q+1}\left( \left[ t_{j-1},t_{j}\right] , V\right) } \Delta _{j}^{q+1-\ell }. \end{aligned}$$
(36)

The combination of (34) with the induction hypothesis, (35), and (36) yields

$$\begin{aligned}{}[\![t_{k-\ell },t_{k-\ell +1},\dots ,t_{k} ]\!]w=\partial _{t}^{\ell }\mathbf {w}^{\left( k\right) }+\mathbf {T}_{q+1-\ell }^{\left( k\right) } \end{aligned}$$

with

$$\begin{aligned} \left\| \mathbf {T}_{q+1-\ell }^{\left( k\right) }\right\| _{V}\le C\left| w\right| _{C^{q+1}\left( \left[ t_{k-\ell },t_{k}\right] ,V\right) }\Delta _{k}^{q+1-\ell } \end{aligned}$$

and the result follows. \(\square \)

4.2 Stability

The starting point of the error estimates for the Runge-Kutta gCQ is the summation formula with summation by parts (cf. (33)):

$$\begin{aligned} K_{\rho }\left( \partial _{t}^{\Theta }\right) \partial _{t}^{\rho }\phi =K_{\rho +m}\left( \partial _{t}^{\Theta }\right) [\![\Theta ]\!]^{m}\partial _{t}^{\rho }\phi . \end{aligned}$$
(37)

In the next lemma we collect some elementary estimates.

Lemma 12

Let a Runge–Kutta method be given which satisfies Assumption 4. Let \(d_{i}\), \(i=1,\dots ,s\), be the eigenvalues of the coefficient matrix \(\mathbf {A}\). We set

$$\begin{aligned} r_{0}:=\min _{1\le i\le s}{\text {Re}}\frac{1}{d_{i}}=\min _{1\le i\le s}\frac{{\text {Re}}d_{i}}{\left| d_{i}\right| ^{2}}\quad \text {and}\quad \alpha _{0}:=\min _{1\le i\le s}\left| d_{i}\right| . \end{aligned}$$
(38)
  (i)

    There exists a constant C depending on \(r_{0}\) and the Runge-Kutta coefficients such that

    $$\begin{aligned} \left| R\left( z\right) \right| \le 1+C\left( {\text {Re}}z\right) _{+}\quad \forall \; z\in \mathbb {C}\;\text { with }\;{\text {Re}}z\le \frac{r_{0}}{2}, \end{aligned}$$
    (39)

    where \(\left( x\right) _{+}:=\max \left\{ 0,x\right\} \).

  (ii)

    Let \(\mathbf {A}=\mathbf {V}^{-1}\mathbf {DV}\) (cf. Remark 5). Then, it holds

    $$\begin{aligned} \left\| \left( \mathbf {I}-z\mathbf {A}\right) ^{-1}\right\| \le \beta _{0}:=\frac{2}{\alpha _{0}r_{0}}\left\| \mathbf {V}^{-1}\right\| \left\| \mathbf {V}\right\| \quad \forall \; z\in \mathbb {C}\;\text { with }\;{\text {Re}}z\le \frac{r_{0}}{2}. \end{aligned}$$
    (40)

Proof

  (i)

    By using \({\text {Re}}\left( \frac{1}{\zeta }\right) =\left( {\text {Re}}\zeta \right) /\left| \zeta \right| ^{2}\), we conclude that R is analytic for all \(z\in \mathbb {C}\) with \({\text {Re}}z<r_{0}\). Then there exists \(C_{R}>0\) such that \(\left| R\left( z\right) \right| \le C_{R}\) for all \(z\in \mathbb {C}\) with \({\text {Re}} z\le \frac{3}{4}r_{0}\). We conclude from Cauchy’s integral theorem that \(\left| R^{\prime }\left( z\right) \right| \le \frac{4C_{R}}{r_{0}}\) for all \(z\in \mathbb {C}\) with \({\text {Re}}z\le \frac{r_{0}}{2}\). Taylor’s theorem gives us the estimate

    $$\begin{aligned} \left| R\left( z\right) \right| \le \left| R\left( \text {i}\,{\text {Im}}z\right) \right| +\frac{4C_{R}}{r_{0}}{\text {Re}}z\quad \forall \; z\in \mathbb {C}\;\text { with }\;0\le {\text {Re}}z\le \frac{r_{0}}{2}. \end{aligned}$$

    Since A-stability implies \(\left| R\left( \text {i}\,{\text {Im}}z\right) \right| \le 1\), we conclude that

    $$\begin{aligned} \left| R\left( z\right) \right| \le 1+C{\text {Re}} z\quad \forall \; z\in \mathbb {C}\;\text { with }\;0\le {\text {Re}}z\le r_{0}/2 \end{aligned}$$

    holds. Estimate (39) is trivial for \({\text {Re}}z\le 0\) (cf. (12)).

  (ii)

    By Remark 5 we can estimate

    $$\begin{aligned} \left\| \left( \mathbf {I}-z\mathbf {A}\right) ^{-1}\right\| \le \left\| \mathbf {V}^{-1}\right\| \left\| \mathbf {V}\right\| \max _{1\le i\le s}\left\{ \frac{1}{\left| 1-zd_{i}\right| }\right\} . \end{aligned}$$

    Writing \(z=x+\text {i}y\) and \(d_{i}=u+\text {i}v\), we obtain

    $$\begin{aligned} \left| 1-zd_{i}\right| ^{2}=\left( 1-xu+yv\right) ^{2}+\left( yu+xv\right) ^{2}=:\kappa \left( y\right) . \end{aligned}$$

    The quadratic function \(\kappa \) attains its minimum at \(y=-\frac{v}{u^{2}+v^{2}}\) so that

    $$\begin{aligned} \kappa \left( y\right) \ge \frac{\left( u-x\left( u^{2}+v^{2}\right) \right) ^{2}}{u^{2}+v^{2}}. \end{aligned}$$

    Note that for \(0\le x\le \frac{u}{2\left( u^{2}+v^{2}\right) }\), it holds

    $$\begin{aligned} \kappa \left( y\right) \ge \frac{\left( x\left( u^{2}+v^{2}\right) -u\right) ^{2}}{u^{2}+v^{2}}\ge \frac{1}{4}\frac{u^{2}}{u^{2}+v^{2}}. \end{aligned}$$

    This proves (40). \(\square \)

Theorem 13

Let a Runge-Kutta method be given which satisfies Assumption 4. Fix \(\sigma >\sigma _{K}\) and let the maximal step \(\Delta \) satisfy

$$\begin{aligned} \frac{r_{0}}{2}-\Delta \sigma \ge 0. \end{aligned}$$
(41)

Let \(\tilde{\rho }\in \mathbb {N}_{0}\) be such that \(\nu -\left( q+1\right) \le \tilde{\rho }\le \nu \) holds. Assume that \(\phi \in C_{0}^{\tilde{\rho }}\left( \left[ 0,T\right] , B\right) \). Then, for any \(\tilde{m} \in \mathbb {N}_{0}\) with

$$\begin{aligned} \mu -\tilde{\rho }+1<\tilde{m}\le q+1, \end{aligned}$$
(42)

the stability estimate

$$\begin{aligned} \left\| \left( K_{\tilde{\rho }}\left( \partial _{t}^{\Theta }\right) \partial _{t}^{\tilde{\rho }}\phi \right) ^{\left( n\right) }\right\| _{D}\le C\,t_{n}\max _{1\le k\le n}\left\| [\![t_{k-\tilde{m}},\ldots ,t_{k} ]\!]\partial _{t}^{\tilde{\rho }}\phi \right\| _{B} \end{aligned}$$
(43)

holds. If \(\phi \in C^{\tilde{\rho }+\tilde{m}}\left( \left[ 0,T\right] ,B\right) \) then

$$\begin{aligned} \left\| \left( K_{\tilde{\rho }}\left( \partial _{t}^{\Theta }\right) \partial _{t}^{\tilde{\rho }}\phi \right) ^{\left( n\right) }\right\| _{D}\le C\,t_{n}\left\| \partial _{t}^{\tilde{\rho }}\phi \right\| _{C^{\tilde{m}}\left( \left[ 0,t_{n}\right] ,B\right) }. \end{aligned}$$
(44)

Proof

By Proposition 11 the \(\tilde{m}\)-th order divided Runge-Kutta differences of \(\partial _{t}^{\tilde{\rho }}\phi \) are bounded and we apply \(\tilde{m}\)-times summation by parts, i.e., consider (37) for \(\tilde{m}\) as in (42). The assumption (42) ensures that the contour in the definition of the generalized convolution \(K_{\tilde{\rho }+\tilde{m}}\left( \partial _{t}^{\Theta }\right) \) can be chosen as the vertical contour \(\gamma =\sigma +\text {i}\mathbb {R}\). Note that (37) equals

$$\begin{aligned} \left( K_{\tilde{\rho }}\left( \partial _{t}^{\Theta }\right) \partial _{t}^{\tilde{\rho }}\phi \right) ^{\left( n\right) }=\frac{1}{2\pi \text {i}}\int _{\gamma }K_{\tilde{\rho }+\tilde{m}}\left( z\right) \sum _{k=1}^{n}\Delta _{k}\left( \mathbf {e}_{s}^{\left( n-k\right) \otimes }\otimes \bullet \right) \cdot \left( \mathbb {A}^{\left( k,n\right) }\left( z\right) \left( \mathbf {A} [\![t_{k-\tilde{m}},\ldots ,t_{k} ]\!]\partial _{t}^{\tilde{\rho }}\phi \otimes \mathbf {1}^{\left( n-k\right) \otimes }\right) \right) dz. \end{aligned}$$
(45)

Assumption (13) implies that

$$\begin{aligned} R\left( z\right) =\mathbf {e}_{s}\cdot \left( \mathbf {I}-z\mathbf {A}\right) ^{-1}\mathbf {1} \end{aligned}$$

and then by Lemma 12 we can bound, for \(z\in \gamma \) (note that (41) guarantees \({\text {Re}}\left( \Delta _{\ell }z\right) =\Delta _{\ell }\sigma \le r_{0}/2\)),

$$\begin{aligned} {\displaystyle \prod \limits _{\ell =k+1}^{n}} \left| R\left( \Delta _{\ell }z\right) \right| \le {\displaystyle \prod \limits _{\ell =k+1}^{n}} \left( 1+C\Delta _{\ell }\sigma \right) \le \text {e}^{C\sigma T}. \end{aligned}$$
Furthermore, we have

$$\begin{aligned} \left\| \mathbf {e}_{s}\cdot \left( z\Delta _{k}\mathbf {I}-\mathbf {A} ^{-1}\right) ^{-1} [\![t_{k-\tilde{m}},\ldots ,t_{k} ]\!]\partial _{t}^{\tilde{\rho }}\phi \right\| _{B}\le \beta _{0}\left\| \mathbf {A}\right\| \left\| [\![t_{k-\tilde{m}},\ldots ,t_{k} ]\!]\partial _{t}^{\tilde{\rho }}\phi \right\| _{B} \end{aligned}$$

and

$$\begin{aligned} \left\| \left( z\Delta _{n}\mathbf {I}-\mathbf {A}^{-1}\right) ^{-1}\left( [\![t_{k-\tilde{m}},\ldots ,t_{k} ]\!]\partial _{t}^{\tilde{\rho }}\phi \right) \right\| _{B}\le \beta _{0}\left\| \mathbf {A}\right\| \left\| [\![t_{k-\tilde{m}},\ldots ,t_{k} ]\!]\partial _{t}^{\tilde{\rho }}\phi \right\| _{B}. \end{aligned}$$

Hence,

$$\begin{aligned} \left\| \left( K_{\tilde{\rho }}\left( \partial _{t}^{\Theta }\right) \partial _{t}^{\tilde{\rho }}\phi \right) ^{\left( n\right) }\right\| _{D}\le \frac{\sqrt{s}\left( \beta _{0}\left\| \mathbf {A}\right\| \right) ^{2}}{2\pi }\,t_{n}\int _{\gamma }\left\| K\left( z\right) \right\| _{D\leftarrow B}\left| z\right| ^{-\tilde{\rho }-\tilde{m}}\left| dz\right| \max _{1\le k\le n}\left\| [\![t_{k-\tilde{m}},\ldots ,t_{k} ]\!]\partial _{t}^{\tilde{\rho }}\phi \right\| _{B} \end{aligned}$$
(46)

with an adjusted value of \(\beta _{0}\). The choice of \(\tilde{\rho }\) as stated in the lemma implies

$$\begin{aligned} \left\| \left( K_{\tilde{\rho }}\left( \partial _{t}^{\Theta }\right) \partial _{t}^{\tilde{\rho }}\phi \right) ^{\left( n\right) }\right\| _{D}\le C\,t_{n}\max _{1\le k\le n}\left\| [\![t_{k-\tilde{m}},\ldots ,t_{k} ]\!]\partial _{t}^{\tilde{\rho }}\phi \right\| _{B} \end{aligned}$$
(47)

with \(C:=\sqrt{s}\frac{\left( \beta _{0}\left\| \mathbf {A}\right\| \right) ^{2}}{2\pi }\int _{\gamma }\left| z\right| ^{\mu -\tilde{\rho }-\tilde{m}}dz\), which is (43). The combination with Proposition 11 gives (44). \(\square \)

4.3 Convergence

For given \(K\in \mathcal {A}_{\sigma _{K}}^{\mu }\left( B,D\right) \) we set

$$\begin{aligned} w:=K\left( \partial _{t}\right) \phi \quad \text {and}\quad w_{\rho }^{\left( n\right) }:=\mathbf {e}_{s}\cdot \left( K_{\rho }\left( \partial _{t}^{\Theta }\right) \partial _{t}^{\rho }\phi \right) ^{\left( n\right) }, \end{aligned}$$
(48)

for \(\rho \) being the regularization parameter in (22). We distinguish two main cases according to the choice of \(\rho \).

If \(\mu -\rho <-1\) then (8a) holds with \(\nu =\rho \) and the error can be written

$$\begin{aligned} w\left( t_{n}\right) -w_{\rho }^{\left( n\right) }=\frac{1}{2\pi \text {i}}\int _{\gamma }K_{\rho }\left( z\right) \left( u_{\rho }\left( z,t_{n}\right) -u_{\rho }^{\left( n\right) }\left( z\right) \right) dz. \end{aligned}$$
(49)

For \(\sigma >\sigma _{K}\), we choose the contour \(\gamma :=\sigma +\text {i}\mathbb {R}\) and split it into

$$\begin{aligned} \gamma =\gamma _{\text {near}}\cup \gamma _{\text {far}},\quad \gamma _{\text {near}}:=\left\{ z\in \gamma :\left| z\right| \le \frac{b}{\Delta }\right\} ,\quad \gamma _{\text {far}}:=\gamma \backslash \gamma _{\text {near}}, \end{aligned}$$
(50)

with some \(b>0\) which will be fixed later. This induces the error splitting

$$\begin{aligned} w\left( t_{n}\right) -w_{\rho }^{\left( n\right) }=w_{\text {near}}^{\left( n\right) }+w_{\text {far}}^{\left( n\right) },\quad w_{\text {near}/\text {far}}^{\left( n\right) }:=\frac{1}{2\pi \text {i}}\int _{\gamma _{\text {near}/\text {far}}}K_{\rho }\left( z\right) \left( u_{\rho }\left( z,t_{n}\right) -u_{\rho }^{\left( n\right) }\left( z\right) \right) dz. \end{aligned}$$
(51)

In the next two lemmas we estimate (49).

Lemma 14

Assume that \(\mu -\rho <-1\) and \(\phi \in C^{\rho +p+1}\left( \left[ 0,T\right] , B\right) \) with

$$\begin{aligned} \phi ^{(r)}(0)=0\quad \forall \; r=0,\dots ,\rho +q. \end{aligned}$$

Then the far field component of the error in (51) can be bounded by

(52)

Proof

We assume, more generally, that

$$\begin{aligned} \phi ^{(r)}(0)=0\quad \forall \, r=0,\dots ,\rho +m-1 \end{aligned}$$

and choose \(m\le q+1\) later in an appropriate way.

Further we introduce the solution of the Runge-Kutta gCQ with right-hand side \([\![ \Theta ]\!] ^{m}\partial _{t}^{\rho }\phi \), given by (see (30) and (27))

$$\begin{aligned} \mathbf {u}_{\rho ,m}^{\left( n\right) }\left( z\right) =\sum _{k=1} ^{n}\Delta _{k}\left( \mathbf {e}_{s}^{\left( n-k\right) \otimes } \otimes \bullet \right) \cdot \left( \mathbb {A}^{\left( k,n\right) }\left( z\right) \left( \mathbf {A} [\![ t_{k-m},\ldots ,t_{k}]\!] \partial _{t}^{\rho }\phi \otimes \mathbf {1}^{\left( n-k\right) \otimes }\right) \right) . \end{aligned}$$
(53)

As usual in this paper, the last component is denoted by \(u_{\rho ,m}^{\left( n\right) }:=\mathbf {e}_{s} \cdot \mathbf {u}_{\rho ,m}^{\left( n\right) }\left( z\right) \).

In order to estimate the component of (49) which is related to the far field we will estimate the difference \(u_{\rho }\left( z,t_{n}\right) -u_{\rho }^{\left( n\right) }\left( z\right) \) for \(z\in \gamma _{\text {far}}\). On the one hand we observe that the exact solution of the ODE is given by

$$\begin{aligned} u_{\rho }\left( z,t\right) =\int _{0}^{t}\text {e}^{z\left( t-\tau \right) }\partial _{t}^{\rho }\phi \left( \tau \right) d\tau . \end{aligned}$$
(54)

Since \(\partial _{t}^{\rho +\ell }\phi \left( 0\right) =0\) for \(0\le \ell \le m-1\le q\) and \(\phi \in C^{\rho +m}\left( \left[ 0,T\right] \right) \), we get via integration by parts

$$\begin{aligned} u_{\rho }\left( z,t\right) =-\sum _{\ell =0}^{m-1}\frac{\partial _{t}^{\rho +\ell }\phi \left( t\right) }{z^{\ell +1}}+\frac{u_{\rho +m}\left( z,t\right) }{z^{m}}. \end{aligned}$$
(55)
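For the reader's convenience, the first step (\(\ell =0\)) of this integration by parts reads

$$\begin{aligned} u_{\rho }\left( z,t\right) =\int _{0}^{t}\text {e}^{z\left( t-\tau \right) }\partial _{t}^{\rho }\phi \left( \tau \right) d\tau =-\frac{\partial _{t}^{\rho }\phi \left( t\right) }{z}+\frac{\text {e}^{zt}\,\partial _{t}^{\rho }\phi \left( 0\right) }{z}+\frac{u_{\rho +1}\left( z,t\right) }{z}, \end{aligned}$$

where the middle term vanishes by \(\partial _{t}^{\rho }\phi \left( 0\right) =0\); iterating m times gives (55).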

On the other hand, we recall that the numerical approximation by the Runge–Kutta method can be written by using tensor notation as in (27), that is

$$\begin{aligned} \mathbf {u}_{\rho }^{\left( n\right) }\left( z\right) =\sum _{k=1}^{n} \Delta _{k}\left( \mathbf {e}_{s}^{\left( n-k\right) \otimes }\otimes \bullet \right) \cdot \left( \mathbb {A}^{\left( k,n\right) }\left( z\right) \left( \mathbf {A}\partial _{t}^{\rho } \varvec{\phi } ^{\left( k\right) }\otimes \mathbf {1}^{\left( n-k\right) \otimes }\right) \right) . \end{aligned}$$

Summation by parts (Proposition 9) yields

$$\begin{aligned} \mathbf {u}_{\rho }^{\left( n\right) }\left( z\right) =\mathbf {-}\sum _{\ell =0}^{m-1}\frac{ [\![t_{n-\ell },\ldots ,t_{n} ]\!]\partial _{t}^{\rho }\phi }{z^{\ell +1}}+\frac{\mathbf {u}_{\rho ,m}^{\left( n\right) }\left( z\right) }{z^{m}}, \end{aligned}$$
(56)

with \(\mathbf {u}_{\rho ,m}^{(n)}\) as in (53). Since \(u_{\rho }^{\left( n\right) }=\mathbf {e}_{s}\cdot \mathbf {u}_{\rho }^{\left( n\right) }\) the error can be written in the form

(57)

with

Proposition 11 implies

$$\begin{aligned}&\left\| \mathbf {e}_{s}\cdot [\![t_{n-\ell },\ldots ,t_{n} ]\!]\partial _{t}^{\rho }\phi -\partial _{t}^{\rho +\ell }\phi \left( t_{n}\right) \right\| _{B}\\&\quad \le C\left| \phi \right| _{C^{\rho +m}\left( \left[ t_{n-\ell },t_{n}\right] , B\right) }\Delta _{n}^{m-\ell },\quad \forall \; \ 0\le \ell \le m, \end{aligned}$$

so that the combination with (2) yields

(58)

To estimate \(w_{\text {far},m}^{\left( n\right) }\), we substitute \(\gamma \) by \(\gamma _{\text {far}}\) in the right-hand side of (45), multiply by \(\mathbf {e}_{s}\cdot \) from the left, and observe that “\(\left( K_{\rho }\left( \partial _{t}^{\Theta }\right) \left( \partial _{t}^{\rho }\phi \right) \right) ^{\left( n\right) } \)” in (45) then has to be substituted by “\(w_{\text {far},m}^{\left( n\right) }\)”. From Proposition 11 and the proof of Theorem 13 we then deduce (cf. (46))

The last term in (57) can be estimated by using (54):

and, in turn,

The estimate of the far field follows by choosing \(m=q+1\). \(\square \)

Lemma 15

Assume that \(\mu -\rho <-1\) and fix \(\sigma >\sigma _{K}\). Let the maximal step \(\Delta \) (cf. (9)) satisfy

$$\begin{aligned} \frac{r_{0}}{2}-\Delta \sigma \ge 0, \end{aligned}$$
(59)

with \(r_{0}\) in (38). Assume further that \(\phi \in C^{\rho +p+1}\left( \left[ 0,T\right] , B\right) \) with \(\phi ^{(r)}(0)=0\) for all \(r=0,\dots ,\rho +q\).

Then the error estimate

(60)

holds with \(w\left( t_{n}\right) \), \(w_{\rho }^{\left( n\right) }\) as in (48) and

$$\begin{aligned} c_{\nu }\left( \Delta \right) :=\left\{ \begin{array} [c]{ll} 1 &{} \nu \ne -1,\\ \log \frac{1}{\Delta } &{} \nu =-1. \end{array} \right. \end{aligned}$$
(61)

Proof

The far field component of the error (49) is bounded in Lemma 14. As in the proof of Lemma 14, we assume, more generally, that

$$\begin{aligned} \phi ^{(r)}(0)=0\quad \forall \; r=0,\dots ,\rho +m-1 \end{aligned}$$

for some \(m \le q+1\).

To estimate the near field component we start from the differential equation in (8b) (with \(\rho \) in place of \(\nu \)),

$$\begin{aligned} \partial _{t}u_{\rho }\left( z,t\right) =zu_{\rho }\left( z,t\right) +\partial _{t}^{\rho }\phi \left( t\right) , \end{aligned}$$

and notice that after differentiating this relation k times for some \(k\le p+1\) we obtain, for any \(z\in \mathbb {C}\),

$$\begin{aligned} \partial _{t}^{k+1}u_{\rho }\left( z,t\right) =z\,\partial _{t}^{k}u_{\rho }\left( z,t\right) +\partial _{t}^{\rho +k}\phi \left( t\right) . \end{aligned}$$

Hence, we obtain

(62)

Solving the error recursion.

In order to estimate

$$\begin{aligned} w_{\text {near}}^{\left( n\right) }=\frac{1}{2\pi \text {i}}\int _{\gamma _{\text {near}}}K_{\rho }\left( z\right) \left( u_{\rho }\left( z,t_{n}\right) -u_{\rho }^{\left( n\right) }\left( z\right) \right) dz \end{aligned}$$
(63)

we analyze the error

$$\begin{aligned} e_{n}\left( z\right) :=u_{\rho }\left( z,t_{n}\right) -u_{\rho }^{\left( n\right) }\left( z\right) . \end{aligned}$$
(64)

Following [16, proof of Theorem 3.3], we set

$$\begin{aligned} d_{i}^{\left( n\right) }(z)&=u_{\rho }(z,t_{n-1}+c_{i}\Delta _{n} )-u_{\rho }(z,t_{n-1})-\Delta _{n}\sum _{j=1}^{s}a_{ij}u_{\rho }^{\prime }(z,t_{n-1}+c_{j}\Delta _{n}),\\ d^{\left( n\right) }(z)&=u_{\rho }\left( z,t_{n}\right) -u_{\rho }(z,t_{n-1})-\Delta _{n}\sum _{j=1}^{s}b_{j}u_{\rho }^{\prime }(z,t_{n-1} +c_{j}\Delta _{n})=d_{s}^{\left( n\right) }. \end{aligned}$$

We set \(\mathbf {D}^{\left( n\right) }=(d_{i}^{\left( n\right) })_{i=1} ^{s}\) and

$$\begin{aligned} \varvec{\delta } ^{\left( k\right) }:=\left( \delta _{i}^{(k)}\right) _{i=1}^{s}:=\frac{1}{(k-1)!}\left( \mathbf {Ac}^{\left( k-1\right) \odot }-\frac{1}{k}\mathbf {c}^{k\odot }\right) . \end{aligned}$$

By inserting the exact solution into the Runge–Kutta scheme and performing Taylor expansion around \(t_{n}\) we obtain

$$\begin{aligned} \begin{array}{l} \displaystyle \mathbf {D}^{\left( n\right) }\left( z\right) =\sum _{k=q+1}^{p}\Delta _{n}^{k}\partial _{t}^{k}u_{\rho }\left( z,t_{n}\right) \varvec{\delta } ^{(k)}+\Delta _{n}^{p}\mathbf {Q}^{\left( n\right) }\left( z\right) , \\ \displaystyle d^{\left( n\right) }\left( z\right) =\Delta _{n}^{p}\int _{t_{n-1}}^{t_{n} }\kappa \left( \frac{t-t_{n-1}}{\Delta _{n}}\right) \partial _{t}^{p+1}u_{\rho }(z,t)\,dt, \end{array} \end{aligned}$$
(65)

where

$$\begin{aligned} \mathbf {Q}^{\left( n\right) }\left( z\right) :=\int _{t_{n-1}}^{t_{n}} \varvec{\kappa } \left( \frac{t-t_{n-1}}{\Delta _{n}}\right) \partial _{t}^{p+1}u_{\rho }(z,t)\,dt \end{aligned}$$

and \( \varvec{\kappa } =\left( \kappa _{i}\right) _{i=1}^{s}\), \(\kappa \) are bounded Peano kernels. Note that this implies

$$\begin{aligned} \left\| \mathbf {Q}^{\left( n\right) }\left( z\right) \right\| _{B}+\Delta _{n}^{-p}\left\| d^{\left( n\right) }\left( z\right) \right\| _{B}\le C\Delta _{n}\sup _{t\in \left[ t_{n-1},t_{n}\right] }\left\| \partial _{t}^{p+1}u_{\rho }\left( z,t\right) \right\| _{B}. \end{aligned}$$
Thus, the error satisfies the recursion

$$\begin{aligned} e_{n}(z)=R(\Delta _{n}z)e_{n-1}(z)-\Delta _{n}z\mathbf {b}\cdot (\mathbf {I} -\Delta _{n}z\mathbf {A})^{-1}\mathbf {D}^{\left( n\right) }(z)+d^{\left( n\right) }(z), \end{aligned}$$

for the stability function R of the Runge–Kutta method (11). Solving the recursion and using that \(e_{0}=0\) we obtain

$$\begin{aligned} e_{n}(z)=\sum _{j=1}^{n}\left( \prod _{\ell =j+1}^{n}R(\Delta _{\ell }z)\right) \left( \Delta _{j}z\mathbf {b}\cdot (\mathbf {I}-\Delta _{j}z\mathbf {A} )^{-1}\mathbf {D}^{\left( j\right) }(z)+d^{\left( j\right) }(z)\right) . \end{aligned}$$

By Lemma 12, for \(\Delta \) small enough we can estimate

$$\begin{aligned} {\displaystyle \prod \limits _{\ell =j+1}^{n}} \left| R\left( \Delta _{\ell }z\right) \right| \le {\displaystyle \prod \limits _{\ell =j+1}^{n}} \left( 1+C\Delta _{\ell }\left( {\text {Re}}z\right) _{+}\right) \le \text {e}^{C\sigma T} \end{aligned}$$
(66)

so that

$$\begin{aligned} \left\| e_{n}\left( z\right) \right\| _{B}\le C\sum _{j=1}^{n}\left( \left\| \Delta _{j}z\mathbf {b}\cdot \left( \mathbf {I}-\Delta _{j}z\mathbf {A}\right) ^{-1}\mathbf {D}^{\left( j\right) }\left( z\right) \right\| _{B}+\left\| d^{\left( j\right) }\left( z\right) \right\| _{B}\right) . \end{aligned}$$
(67)

The combination of the order condition (15) with (65) allows to bound the first norm in the right-hand side of (67) by

$$\begin{aligned} \left\| \Delta _{j}z\mathbf {b}\cdot (\mathbf {I}-\Delta _{j}z\mathbf {A} )^{-1}\mathbf {D}^{\left( j\right) }\left( z\right) \right\| _{B}&\le \sum _{k=q+1}^{p}\Delta _{j}^{k}\left\| \partial _{t}^{k}u_{\rho }\left( z,t_{j}\right) \right\| _{B}\nonumber \\ {}&\quad \times \left\| \Delta _{j}z\mathbf {b} \cdot (\mathbf {I}-\Delta _{j}z\mathbf {A})^{-1} \varvec{\delta } ^{(k)}\right\| \nonumber \\&\quad +\Delta _{j}^{p}\left\| \Delta _{j}z\mathbf {b}\cdot (\mathbf {I}-\Delta _{j}z\mathbf {A})^{-1}\mathbf {Q}^{\left( j\right) }\right\| _{B}. \end{aligned}$$
(68)

For sufficiently small \(b\) in (50) we have \(\left\| \Delta _{j} z\mathbf {A}\right\| <1\) for all \(z\in \gamma _{\text {near}}\) so that a Neumann series argument gives us

$$\begin{aligned} \left\| \Delta _{j}z\mathbf {b}\cdot \left( \mathbf {I}-\Delta _{j} z\mathbf {A}\right) ^{-1} \varvec{\delta } ^{(k)}\right\| \le \frac{\left( C\Delta _{j}\left| z\right| \right) ^{p-k+2}}{(k-1)!} \end{aligned}$$

where C depends on \(\mathbf {A},\mathbf {b},\mathbf {c}\). Recall that \(m\le q+1\). Thus, for all \(z\in \gamma _{\text {near}}\) it holds (cf. (62))

For the second term in the right-hand side of (68) we get in a similar fashion

so that

This estimate allows us to bound the near field error by using (2)

The combination with the far field estimates leads to the assertion for \(m=q+1\). \(\square \)

Our approximation (22) implicitly contains a certain regularization of the integrand according to the results in Sect. 4.1. As a consequence, it is not necessary to choose \(\rho >\mu +1\) for convergence in (22); it is actually enough to choose \(\rho >\mu -q\). The full convergence result according to the choice of \(\rho \) is stated in the following theorem.

Theorem 16

Let \(K\in \mathcal {A}_{\sigma _{K}}^{\mu }\left( B,D\right) \) be a transfer operator and let \(\nu \in \mathbb {N}_{0}\) denote the smallest integer with \(\nu >\mu +1\). Let a Runge-Kutta method be given which satisfies Assumption 4. Fix \(\sigma >\sigma _{K}\) and let the maximal step \(\Delta \) (cf. (9)) satisfy

$$\begin{aligned} \frac{r_{0}}{2}-\Delta \sigma \ge 0, \end{aligned}$$
(69)

with \(r_{0}\) in (38).

For any \(\rho \in \mathbb {N}_{0}\) in (22) with \(\rho \ge \nu -\left( q+1\right) \) and \(\phi \in C_{0}^{\nu }\left( \left[ 0,T\right] ,B\right) \) let

$$\begin{aligned} w:=K\left( \partial _{t}\right) \phi \quad \text {and}\quad w_{\rho }^{\left( n\right) }:=\mathbf {e}_{s}\cdot \left( K_{\rho }\left( \partial _{t}^{\Theta }\right) \partial _{t}^{\rho }\phi \right) ^{\left( n\right) }. \end{aligned}$$

Then, the error estimate

$$\begin{aligned} \left\| w\left( t_{n}\right) -w_{\rho }^{\left( n\right) }\right\| _{D}\le C\,c_{\nu }\left( \Delta \right) \Delta ^{\min \left\{ p,\,q+1-\left( \nu -\rho \right) \right\} }\left\| \phi \right\| _{C^{\nu +p+1}\left( \left[ 0,T\right] ,B\right) } \end{aligned}$$
(70)

holds with \(c_{\nu }\left( \Delta \right) \) as in (61), provided that \(\phi ^{(r)}(0)=0\) for all \(r=0,\dots ,\rho +q\) and \(\phi \in C^{\nu +p+1}\left( \left[ 0,T\right] ,B\right) \).

Note that estimate (70) implies that the choice \(\rho =p+\nu -\left( q+1\right) \) (cf. (16)) leads to a convergence order \(\mathcal {O}\left( \Delta ^{p}\right) \) for sufficiently smooth and compatible data; for a further discussion see Remarks 7 and 20.

Proof

The case \(\mu -\rho < -1\) is fully addressed in Lemma 15. Let us then assume that \(\mu -\rho \ge -1\) and let \(\nu \in \mathbb {N}_{0}\) be the smallest integer such that \(\nu >\mu +1\) holds. Then the contour integral in

$$\begin{aligned} \left( K\left( \partial _{t}\right) \phi \right) \left( t\right) =\frac{1}{2\pi \text {i}}\int _{\gamma }K_{\nu }\left( z\right) u_{\nu }\left( z,t\right) dz \end{aligned}$$

is well defined for all \(\phi \in C_{0}^{\nu }\left( \left[ 0,T\right] ,B\right) \). Since \(\nu \) is large enough we may choose \(\gamma \) as any suitable contour in the complex plane: either a vertical contour \(\gamma _{\perp }\) running from \(\sigma -\text {i}\infty \) to \(\sigma +\text {i}\infty \) or a suitable closed contour \(\gamma _{\circlearrowright }\) clockwise oriented.

The representation of the discrete solution \(w_{\rho }^{\left( n\right) }=\mathbf {e}_{s}\cdot \left( K_{\rho }\left( \partial _{t}^{\Theta }\right) \partial _{t}^{\rho }\phi \right) ^{\left( n\right) }\) (cf. (22)) is well defined by Theorem 13, (44) if we choose a closed contour \(\gamma _{\circlearrowright }\) which encircles the spectra \(\sigma \left( \left( \Delta _{\ell }\mathbf {A}\right) ^{-1}\right) \), \(1\le \ell \le n\). The error at time step \(t_{n}\) is given by

(71)

Following the notation introduced in the proof of Lemma 15, we add and subtract \(u_{\nu }^{\left( n\right) }\) and split the error into two terms

$$\begin{aligned} w\left( t_{n}\right) -w_{\rho }^{\left( n\right) }=\underset{=:T_{1}^{\left( n\right) }}{\underbrace{w\left( t_{n}\right) -\mathbf {e}_{s}\cdot \left( K_{\nu }\left( \partial _{t}^{\Theta }\right) \partial _{t}^{\nu }\phi \right) ^{\left( n\right) }}}+\underset{=:T_{2}^{\left( n\right) }}{\underbrace{\mathbf {e}_{s}\cdot \left( K_{\nu }\left( \partial _{t}^{\Theta }\right) \partial _{t}^{\nu }\phi \right) ^{\left( n\right) }-w_{\rho }^{\left( n\right) }}}.\end{aligned}$$
The term \(T_{1}^{\left( n\right) }\) can be estimated by using Lemma 15 with the substitution \(\rho \leftarrow \nu \) therein and we get

$$\begin{aligned} \left\| T_{1}^{\left( n\right) }\right\| _{D}\le C\,c_{\nu }\left( \Delta \right) \Delta ^{\min \left\{ p,\,\nu +q-\mu \right\} }\left\| \phi \right\| _{C^{\nu +p+1}\left( \left[ 0,T\right] ,B\right) }. \end{aligned}$$
(72)

By Proposition 9, the term \(T_{2}^{\left( n\right) }\) is the s-th component of

$$\begin{aligned} \mathbf {T}_{2}^{\left( n\right) }=\left( K_{\nu }\left( \partial _{t}^{\Theta }\right) \left( \partial _{t}^{\nu }\phi - [\![\Theta ]\!]^{\nu -\rho }\partial _{t}^{\rho }\phi \right) \right) ^{\left( n\right) }. \end{aligned}$$

Following the proof of Theorem 13 for the choices \(\tilde{m}\leftarrow 0\) and \(\tilde{\rho }\leftarrow \nu \), which is allowed because

$$\begin{aligned} \mu -\tilde{\rho }+1<\tilde{m}\le q+1, \end{aligned}$$

we obtain

$$\begin{aligned} \left\| T_{2}^{\left( n\right) }\right\| _{D}\le C\,t_{n}\max _{1\le k\le n}\left\| \partial _{t}^{\nu } \varvec{\phi } ^{\left( k\right) }- [\![t_{k-\left( \nu -\rho \right) },\ldots ,t_{k} ]\!]\partial _{t}^{\rho }\phi \right\| _{B}. \end{aligned}$$

Proposition 11 now gives the estimate

$$\begin{aligned} \left\| T_{2}^{\left( n\right) }\right\| _{D}\le C\,\Delta ^{q+1-\left( \nu -\rho \right) }\left\| \partial _{t}^{\rho }\phi \right\| _{C^{q+1}\left( \left[ 0,T\right] ,B\right) }. \end{aligned}$$
(73)

Our assumptions

$$\begin{aligned} \rho \le \mu +1<\nu \end{aligned}$$

imply that the \(\Delta \)-exponents in (72) and (73) satisfy

$$\begin{aligned} \nu +q-\mu > q+1\ge q+1- \left( \nu -\rho \right) , \end{aligned}$$

which leads to the final error estimate (70). \(\square \)

5 Runge-Kutta generalized convolution quadrature for solving convolution equations

5.1 Discretization

In this section we will consider the solution of one-sided convolution equations: for given g, find \(\phi \) such that

$$\begin{aligned} K\left( \partial _{t}\right) \phi =g. \end{aligned}$$
(74)

We assume that the transfer operator K satisfies

$$\begin{aligned} K\in \mathcal {A}_{\sigma _{+}}^{\theta }\left( B,D\right) \quad \text {for some }\;\sigma _{+},\theta \in \mathbb {R} \end{aligned}$$
(75a)

and, in analogy to (4), we choose \(m\in \mathbb {N}_{0}\) as the smallest integer such that \(m>\theta +1\). In view of (6) we are seeking the solution \(\phi \) of (74) in \(C_{0}^{m}\left( \left[ 0,T\right] , B\right) \).

To ensure existence of a solution of (74) we assume

$$\begin{aligned} K^{-1}:\mathbb {C}_{\sigma _{-}}\!\rightarrow \!\mathcal {L}\left( D,B\right) \quad \text { exists and }\quad K^{-1}\in \mathcal {A}_{\sigma _{-}}^{\mu }\left( D,B\right) \quad \text { for some}\quad \sigma _{-},\mu \in \mathbb {R}. \end{aligned}$$
(75b)

We define \(\nu \) according to (4) but emphasize that \(\mu \), this time, denotes the growth exponent of the inverse operator \(K^{-1}\). We thus assume in what follows that

$$\begin{aligned} \left\| K^{-1}(z) \right\| _{B\leftarrow D} \le C \left( 1+|z| \right) ^{\mu }. \end{aligned}$$
(76)

Proposition 17

Let (75) be satisfied. If \(g\in C_{0}^{\nu }\left( \left[ 0,T\right] , D\right) \), then

$$\begin{aligned} \phi \left( t\right) :=\left( K^{-1}\left( \partial _{t}\right) g\right) \left( t\right) =\frac{1}{2\pi \text {i}}\int _{\gamma }\left( K^{-1}\right) _{\nu }\left( z\right) u_{\nu }\left( z,t\right) dz, \end{aligned}$$
(77)

with \(u_{\nu }\) as in (8b) with \(\partial _{t}^{\nu }g\) in place of \(\partial _{t}^{\nu }\phi \), for a contour \(\gamma =\sigma +\text {i}\mathbb {R}\) and \(\sigma >\sigma _{-}\), is well defined.

If \(g\in C_{0}^{\nu +m}\left( \left[ 0,T\right] , D\right) \), it holds \(\phi \in C_{0}^{m}\left( \left[ 0,T\right] , B\right) \) so that \(K\left( \partial _{t}\right) \phi \) is well defined and \(\phi \) as in (77) satisfies (74).

Proof

The choice of \(\nu \) and the smoothness assumption on g imply that \(\phi \) in (77) is well defined (cf. (7)). By differentiating (77) and using \(g\in C_{0}^{\nu +m}\left( \left[ 0,T\right] , D\right) \), we obtain \(\phi ^{\left( r\right) }\left( 0\right) =0\) for \(0\le r\le m-1\). Thus, the associativity for one-sided convolutions (see [14, (2.3), (2.22)])

$$\begin{aligned} V\left( \partial _{t}\right) W\left( \partial _{t}\right) =\left( VW\right) \left( \partial _{t}\right) \end{aligned}$$
(78)

yields \(K\left( \partial _{t}\right) \phi =\left( KK^{-1}\right) \left( \partial _{t}\right) g=g\). \(\square \)

The inversion formula (77) allows us to discretize the convolution equation (74) by the same method as developed for the forward equation (cf. Sect. 3):

$$\begin{aligned} \varvec{\phi } _{\rho }^{\left( n\right) }:=\left( \left( K^{-1}\right) _{\rho }\left( \partial _{t}^{\Theta }\right) \partial _{t}^{\rho }g\right) ^{\left( n\right) } \end{aligned}$$
(79a)

and the approximation of \(\phi \) at time point \(t_{n}\) is given by the last component

$$\begin{aligned} \phi \left( t_{n}\right) \approx \phi _{\rho }^{\left( n\right) } :=\mathbf {e}_{s} \cdot \varvec{\phi } _{\rho }^{\left( n\right) }. \end{aligned}$$
(79b)

Remark 18

The representation of the generalized convolution quadrature in the form (79) is well suited for theoretical investigations but not for the practical implementation: For important applications such as, e.g., the solution of the space-time wave equation, the operator \(K^{-1}\left( s\right) \) is infinite dimensional and not available explicitly so that its discretization would be prohibitively expensive. Instead, we will prove that the associativity of continuous convolutions (78) is inherited by the Runge-Kutta gCQ: under assumptions which will be detailed in Theorem 28 it holds

$$\begin{aligned} V\left( \partial _{t}^{\Theta }\right) \circ W\left( \partial _{t}^{\Theta }\right) =\left( VW\right) \left( \partial _{t}^{\Theta }\right) \end{aligned}$$
(80)

so that (79a) can be written in the form (cf. Remark 22, Corollary 29)

Definition 19

(Runge-Kutta gCQ for solving convolution equations) Let the transfer operator K satisfy (75) and let \(\nu ,m\in \mathbb {N}_{0}\) be the smallest integers such that \(\nu >\mu +1\) and \(m>\theta +1\). Let \(g\in C_{0}^{\nu +m}\left( \left[ 0,T\right] , D\right) \). We consider the problem: find \(\phi \in C_{0}^{m}\left( \left[ 0,T\right] ,B\right) \) such that

$$\begin{aligned} K\left( \partial _{t}\right) \phi =g. \end{aligned}$$
(81)

Let a Runge-Kutta method be given which satisfies Assumption 4. Then the discretization of (81) by Runge-Kutta generalized convolution quadrature is given by

$$\begin{aligned} \varvec{\phi } _{\rho }^{\left( n\right) }:=\left( \left( K^{-1}\right) _{\rho }\left( \partial _{t}^{\Theta }\right) \partial _{t}^{\rho }g\right) ^{\left( n\right) },\quad n=1,\ldots ,N, \end{aligned}$$
(82)

and the approximation of \(\phi \) at time \(t_{n}\) by the last component \(\phi _{\rho }^{\left( n\right) }:=\mathbf {e}_{s}\cdot \varvec{\phi } _{\rho }^{\left( n\right) }\). Here, \(\rho \in \mathbb {N}_{0}\) is a regularization parameter which can be chosen in the range

$$\begin{aligned} \nu -\left( q+1\right) \le \rho \le p+\nu -\left( q+1\right) , \end{aligned}$$
(83)

where p denotes the order and q the stage order of the Runge-Kutta method.

Remark 20

For the algorithmic realization of the Runge-Kutta gCQ (cf. (82)) one has to approximate the contour integrals in

$$\begin{aligned} \varvec{\phi } _{\rho }^{\left( n\right) }=\frac{1}{2\pi \text {i}}\int _{\gamma }\left( K^{-1}\right) _{\rho }\left( z\right) \mathbf {u}_{\rho }^{\left( n\right) }\left( z\right) dz, \end{aligned}$$
(84)

with \(\mathbf {u}_{\rho }^{\left( n\right) }\) as in Definition 6 with \(\partial _{t}^{\rho }g\) as data,

by numerical quadrature. For the implicit Euler gCQ, such a quadrature scheme has been proposed and analyzed in [10, 11].

On the one hand, Theorem 16 indicates that the upper bound in (83) for the choice of \(\rho \) improves the convergence rates up to the optimal order \(\mathcal {O}\left( \Delta ^{p}\right) \) for sufficiently smooth and compatible data; on the other hand, smaller choices of \(\rho \) lead to a milder growth behavior of the integrand in (84) and simplify the numerical quadrature. This also shows the importance of the summation-by-parts representation, which allows us to achieve a faster decay of the integrand in the error estimates without further increasing the numerical parameter \(\rho \).

5.2 Associativity

The stability and convergence analysis of the approximation \(\phi _{\rho }^{\left( n\right) }\) as in Definition 19 follows directly from Theorems 13 and 16 if we prove the inversion formula

$$\begin{aligned} \left( K^{-1}\right) \left( \partial _{t}^{\Theta }\right) =\left( K\left( \partial _{t}^{\Theta }\right) \right) ^{-1}. \end{aligned}$$

In more generality, we will prove (80). This requires reformulating the contour integrals via tensorial divided differences, which we will introduce, and proving a Leibniz rule for tensorial divided differences in order to derive the associativity property for the composition of discrete generalized convolution operators. We refer to [7] and [6] for an introduction to tensor calculus and advanced topics.

For \(i,j,i^{\prime },j^{\prime }\in \left\{ 1,\ldots ,N\right\} \), we consider sequences \(\mathbf {B}^{\left( k\right) }\in \mathbb {C}^{s\times s}\), \(k\in \left\{ i,\ldots ,j\right\} \) and \(\mathbf {C}^{\left( k\right) } \in \mathbb {C}^{s\times s}\), \(k\in \left\{ i^{\prime },\ldots ,j^{\prime }\right\} \), of matrices. In Sect. 4.1 we introduced the Kronecker products of matrices and their application to tensors of vectors. The composition of Kronecker matrices is defined as the tensor of the “matching” matrix products by

$$\begin{aligned} \left( {\displaystyle \bigotimes \limits _{k=i}^{j}} \mathbf {B}^{\left( k\right) }\right) \circ \left( {\displaystyle \bigotimes \limits _{k=i^{\prime }}^{j^{\prime }}} \mathbf {C}^{\left( k\right) }\right) = {\displaystyle \bigotimes \limits _{k=\min \left\{ i,i^{\prime }\right\} } ^{\max \left\{ j,j^{\prime }\right\} }} \mathbf {B}^{\left( k\right) }\mathbf {C}^{\left( k\right) }, \end{aligned}$$

where we set \(\mathbf {B}^{\left( k\right) }=\mathbf {I}\) for \(k\notin \left\{ i,\ldots ,j\right\} \) and \(\mathbf {C}^{\left( k\right) }=\mathbf {I}\) for \(k\notin \left\{ i^{\prime },\ldots ,j^{\prime }\right\} \). For \(i=i^{\prime }\) and \(j=j^{\prime }\) we suppress the composition sign “\(\circ \)” as is usual for matrix–matrix multiplication.

Finally we define the resolvent matrix for \(\mathbf {C}\in \mathbb {C}^{s\times s}\) by

$$\begin{aligned} \mathbf {R}_{z}\left( \mathbf {C}\right) \in \mathbb {C}^{s\times s} \quad \text {with}\quad \mathbf {R}_{z}\left( \mathbf {C}\right) :=\left( z\mathbf {I}-\mathbf {C}\right) ^{-1}. \end{aligned}$$

Definition 21

For a set of matrices \(\mathbf {C}^{\left( k\right) } \in \mathbb {C}^{s\times s}\), \(1\le k\le n\), and a function f which is analytic in a complex neighborhood \(\mathcal {U}\) of \(\bigcup _{k=1}^{n}\sigma \left( \mathbf {C}^{\left( k\right) }\right) \), the tensorial divided difference is a Kronecker matrix given by

$$\begin{aligned} \left[ \mathbf {C}^{\left( 1\right) },\ldots ,\mathbf {C}^{\left( n\right) }\right] f:=\frac{1}{2\pi \text {i}}\int _{\Gamma }f\left( z\right) {\displaystyle \bigotimes \limits _{k=1}^{n}} \mathbf {R}_{z}\left( \mathbf {C}^{\left( k\right) }\right) dz \end{aligned}$$
(85)

for a counterclockwise oriented closed contour \(\Gamma \) in \(\mathcal {U}\) which encircles \(\bigcup _{k=1}^{n}\sigma \left( \mathbf {C}^{\left( k\right) }\right) \).

Notice that for \(s=1\) we have \(\mathbf {C}^{(k)}=c^{(k)}\in \mathbb {C}\), for \(1\le k\le n\), and the resolvent matrices are just the scalar factors \((z-c^{(k)})^{-1}\). Then formula (85) becomes nothing but the standard Newton divided difference of f at the arguments \(c^{(k)}\). This is a consequence of the following contour integral representation [4]

$$\begin{aligned} \left[ c^{\left( 1\right) },\ldots ,c^{\left( n\right) }\right] f=\frac{1}{2\pi \text {i}}\int _{\Gamma }\frac{f\left( z\right) }{ {\displaystyle \prod \nolimits _{k=1}^{n}} \left( z-c^{\left( k\right) }\right) }dz \end{aligned}$$
(86)

for \(\Gamma \) enclosing the arguments \(c^{(k)}\). In this way, tensorial divided differences are generalizations of Newton’s divided differences for matrix-valued arguments via the Cauchy formula (85). In Lemma 25 we will derive an alternative representation of tensorial divided differences which mimics the recurrence relation for classical divided differences.
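The scalar identity (86) is easy to confirm numerically; in the sketch below (ours), the arguments, the circle, and \(f=\exp \) are arbitrary choices, and the tensorial case (85) would replace the scalar factors by Kronecker products of resolvents.

```python
import numpy as np

def newton_dd(xs, f):
    """Classical Newton divided difference [x_1, ..., x_n] f (recursive table)."""
    tbl = [f(x) for x in xs]
    for lvl in range(1, len(xs)):
        tbl = [(tbl[i + 1] - tbl[i]) / (xs[i + lvl] - xs[i])
               for i in range(len(tbl) - 1)]
    return tbl[0]

def contour_dd(cs, f, center=0.0, radius=2.0, M=4000):
    """Divided difference via the Cauchy formula (86), trapezoidal rule on a circle."""
    th = 2 * np.pi * np.arange(M) / M
    z = center + radius * np.exp(1j * th)
    dz = 1j * radius * np.exp(1j * th)            # dz/dtheta
    vals = f(z) / np.prod([z - c for c in cs], axis=0)
    return np.sum(vals * dz) / (2j * np.pi) * (2 * np.pi / M)

cs = [0.3, 0.7, 1.1]
print(newton_dd(cs, np.exp), contour_dd(cs, np.exp).real)   # both ~1.0204
```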

These tensorial divided differences allow us to express the generalized discrete convolution (22), (27) via

(87)

for \(1\le n\le N\). The result is an N-tuple of vectors in \(\mathbb {C}^{s}\).

The function \(\omega _{n,j}\) is given by

$$\begin{aligned} \omega _{n,j}\left( z\right) := {\displaystyle \prod _{\ell =j+1}^{n}} \left( z-\Delta _{\ell }^{-1}\right) . \end{aligned}$$
(88)

Remark 22

This representation shows that the generalized discrete convolution depends only on the discrete values \(\partial _{t}^{\rho } \mathbf {g}^{\left( k\right) }\) and hence can also be applied to tuples of stage vectors; therefore, the composition of generalized discrete convolutions is well defined.

Representation (87) extends the definition of generalized discrete convolutions based on the implicit Euler method (cf. [11]) to Runge-Kutta methods, as the following remark shows.

Remark 23

In [9, first formula in the proof of Lemma 4.1], it was shown that the gCQ based on the implicit Euler method with variable step size can be written in the form

$$\begin{aligned} \phi _{\rho }^{\left( n\right) }=\sum _{k=1}^{n}\omega _{n,k}\left( 0\right) \left( \left[ \frac{1}{\Delta _{k}},\frac{1}{\Delta _{k+1}},\dots ,\frac{1}{\Delta _{n}}\right] \left( K^{-1}\right) _{\rho }\right) \partial _{t}^{\rho }g^{\left( k\right) }, \end{aligned}$$
(89)

where \(\omega _{n,k}\) is as in (88). By using the contour integral representation (86) and taking into account the clockwise orientation of the contour \(\gamma \), (89) can be expressed in terms of contour integrals as

(90)

Alternatively, we consider equation (87) for the implicit Euler method. In this case we have \(\mathbf {A}=\left( 1\right) \in \mathbb {R} ^{1\times 1}\) and, in turn,

This is the same expression as (90) and we see that (87) defines an extension of the divided difference representation of scalar generalized convolution quadrature based on the implicit Euler method to Runge-Kutta methods.
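For a scalar transfer operator, formula (89) can be implemented directly with the helpers from the sketch after (86). The following is a hedged illustration only: we take \(\rho =0\), so that \((K^{-1})_{0}=K^{-1}\), choose the toy kernel \(K(s)=1/(s+1)\) as an assumption, and require pairwise distinct step sizes so that the divided differences are well defined.

```python
import numpy as np

def omega(n, k, dt, z=0.0):
    """omega_{n,k}(z) = prod_{l = k+1}^{n} (z - 1/Delta_l), cf. (88); n, k are 1-based."""
    return np.prod([z - 1.0 / dt[l - 1] for l in range(k + 1, n + 1)])

def gcq_euler(Kinv, g, dt):
    """Implicit Euler gCQ via (89): phi^(n) = sum_k omega_{n,k}(0) [1/D_k,...,1/D_n]K^{-1} g^(k)."""
    N = len(dt)
    phi = np.zeros(N)
    for n in range(1, N + 1):
        for k in range(1, n + 1):
            nodes = [1.0 / dt[l - 1] for l in range(k, n + 1)]
            phi[n - 1] += omega(n, k, dt) * newton_dd(Kinv, nodes) * g[k - 1]
    return phi

# usage on a quadratically graded mesh (all step sizes distinct):
t = (np.arange(1, 11) / 10.0) ** 2
dt = np.diff(np.concatenate([[0.0], t]))
phi = gcq_euler(lambda s: s + 1.0, np.sin(t), dt)   # K(s) = 1/(s+1), so K^{-1}(s) = s + 1
```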

The key role in writing (79a) as a forward equation will be played by an elegant inversion formula (which is well known for Runge-Kutta convolution quadrature with constant time steps).

In order to prove the associativity property of our discretization we develop a tensorial Leibniz formula and a composition rule for tensorial divided differences.

By Cauchy’s integral theorem it is easy to see that \(\left[ \mathbf {C}\right] f\) is the value of the function f applied to the matrix \(\mathbf {C}\), which is the analogue of the standard zeroth-order divided difference. For higher order divided differences we first introduce the tensorial difference \(\ominus ^{\left( k,j\right) }\left( \mathbf {A},\mathbf {B}\right) \) as the Kronecker matrix defined by

$$\begin{aligned} \ominus ^{\left( k,j\right) }\left( \mathbf {A},\mathbf {B}\right) =\left( {\displaystyle \bigotimes \limits _{\ell =1}^{k-1}} \mathbf {I}\right) \otimes \mathbf {A}\otimes {\displaystyle \bigotimes \limits _{\ell =k+1}^{n}} \mathbf {I}-\left( {\displaystyle \bigotimes \limits _{\ell =1}^{j-1}} \mathbf {I}\right) \otimes \mathbf {B}\otimes {\displaystyle \bigotimes \limits _{\ell =j+1}^{n}} \mathbf {I}. \end{aligned}$$

If \(\mathbf {A}\) and \(\mathbf {B}\) are simultaneously diagonalizable, that is, \(\mathbf {A}=\mathbf {V}^{-1}\mathbf {D}^{\left( 1\right) }\mathbf {V}\) and \(\mathbf {B}=\mathbf {V}^{-1}\mathbf {D}^{\left( 2\right) }\mathbf {V}\) for some \(\mathbf {V}\) and diagonal matrices \(\mathbf {D}^{\left( 1\right) }\), \(\mathbf {D}^{\left( 2\right) }\), we have

$$\begin{aligned}&\left( {\displaystyle \bigotimes \limits _{i=1}^{n}} \mathbf {v}^{\left( i\right) }\right) \cdot \left( \ominus ^{\left( k,j\right) }\left( \mathbf {A},\mathbf {B}\right) {\displaystyle \bigotimes \limits _{i=1}^{n}} \mathbf {w}^{\left( i\right) }\right) \\&\quad =\left( {\displaystyle \bigotimes \limits _{i=1}^{n}} \mathbf {V}^{-\intercal }\mathbf {v}^{\left( i\right) }\right) \cdot \left( \ominus ^{\left( k,j\right) }\left( \mathbf {D}^{\left( 1\right) },\mathbf {D}^{\left( 2\right) }\right) {\displaystyle \bigotimes \limits _{i=1}^{n}} \mathbf {Vw}^{\left( i\right) }\right) . \end{aligned}$$

Remark 24

The eigenvalues of \(\ominus ^{\left( k,j\right) }\left( \mathbf {A},\mathbf {B}\right) \) are given by \(\lambda _{i_{1}}^{\left( 1\right) }-\lambda _{i_{2}}^{\left( 2\right) }\), where \(\lambda _{i_{1}}^{\left( 1\right) }\) are the eigenvalues of \(\mathbf {A}\) and \(\lambda _{i_{2}}^{\left( 2\right) }\) those of \(\mathbf {B}\). Hence, \(\ominus ^{\left( k,j\right) }\left( \mathbf {A},\mathbf {B}\right) \) is regular if and only if the spectra of \(\mathbf {A}\) and \(\mathbf {B}\) are disjoint, i.e., \(\lambda _{i_{1}}^{\left( 1\right) }\ne \lambda _{i_{2}}^{\left( 2\right) }\) for all \(i_{1},i_{2}\). In this case, \(\left( \ominus ^{\left( k,j\right) }\left( \mathbf {A},\mathbf {B}\right) \right) ^{-1}\) exists, i.e.,

$$\begin{aligned} \left( \ominus ^{\left( k,j\right) }\left( \mathbf {A},\mathbf {B}\right) \right) ^{-1}\ominus ^{\left( k,j\right) }\left( \mathbf {A},\mathbf {B} \right) =\ominus ^{\left( k,j\right) }\left( \mathbf {A},\mathbf {B}\right) \left( \ominus ^{\left( k,j\right) }\left( \mathbf {A},\mathbf {B}\right) \right) ^{-1}= {\displaystyle \bigotimes \limits _{i=1}^{n}} \mathbf {I} \end{aligned}$$

but, in general, it is not a Kronecker matrix. Further note that

$$\begin{aligned}&\left( {\displaystyle \bigotimes \limits _{i=1}^{n}} \mathbf {v}^{\left( i\right) }\right) \cdot \left( \left( \ominus ^{\left( k,j\right) }\left( \mathbf {A},\mathbf {B}\right) \right) ^{-1} {\displaystyle \bigotimes \limits _{i=1}^{n}} \mathbf {w}^{\left( i\right) }\right) \\&\qquad \qquad =\left( {\displaystyle \bigotimes \limits _{i=1}^{n}} \mathbf {V}^{-\intercal }\mathbf {v}^{\left( i\right) }\right) \cdot \left( \left( \ominus ^{\left( k,j\right) }\left( \mathbf {D}^{\left( 1\right) },\mathbf {D}^{\left( 2\right) }\right) \right) ^{-1} {\displaystyle \bigotimes \limits _{i=1}^{n}} \mathbf {Vw}^{\left( i\right) }\right) . \end{aligned}$$
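The eigenvalue statement of Remark 24 can be checked with a few lines of NumPy. In the sketch below (our own illustration), \(\ominus ^{(k,j)}(\mathbf {A},\mathbf {B})\) is assembled literally from its definition; its eigenvalues then match the pairwise differences of the eigenvalues of \(\mathbf {A}\) and \(\mathbf {B}\), each with multiplicity \(s^{n-2}\), up to rounding.

```python
import numpy as np

def tensorial_minus(A, B, k, j, n):
    """ominus^{(k,j)}(A, B): A embedded in slot k minus B embedded in slot j (1-based slots)."""
    s = A.shape[0]
    def embed(M, slot):
        out = np.eye(1)
        for i in range(1, n + 1):
            out = np.kron(out, M if i == slot else np.eye(s))
        return out
    return embed(A, k) - embed(B, j)

rng = np.random.default_rng(1)
s, n = 2, 3
A, B = rng.standard_normal((s, s)), rng.standard_normal((s, s))
T = tensorial_minus(A, B, k=3, j=1, n=n)

diffs = np.array([a - b for a in np.linalg.eigvals(A) for b in np.linalg.eigvals(B)])
expected = np.sort_complex(np.repeat(diffs, s ** (n - 2)))
print(np.abs(np.sort_complex(np.linalg.eigvals(T)) - expected).max())   # ~ 1e-14
```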

Lemma 25

For a set of matrices \(\mathbf {C}^{\left( k\right) }\in \mathbb {C}^{s\times s}\), \(1\le k\le n\), which are simultaneously diagonalizable, i.e.,

$$\begin{aligned} \mathbf {C}^{\left( k\right) }=\mathbf {V}^{-1}\mathbf {D}^{\left( k\right) }\mathbf {V}, \end{aligned}$$
(91)

it holds

$$\begin{aligned} \left[ \mathbf {C}^{\left( 1\right) },\ldots ,\mathbf {C}^{\left( n\right) }\right] f=\left( {\displaystyle \bigotimes \limits _{k=1}^{n}} \mathbf {V}^{-1}\right) \left( \left[ \mathbf {D}^{\left( 1\right) },\ldots ,\mathbf {D}^{\left( n\right) }\right] f\right) \left( {\displaystyle \bigotimes \limits _{k=1}^{n}} \mathbf {V}\right) . \end{aligned}$$
(92)

Furthermore, if the intersection of the spectra of any pair \(\mathbf {C} ^{(k)},\mathbf {C}^{(j)}\), \(k\ne j\), is empty, the following recursion for tensorial divided differences holds true

$$\begin{aligned}&\left[ \mathbf {C}^{\left( 1\right) },\ldots ,\mathbf {C}^{\left( k\right) }\right] f\nonumber \\&\quad =\left( \left( \mathbf {I}\otimes \left[ \mathbf {C}^{\left( 2\right) },\ldots ,\mathbf {C}^{\left( k\right) }\right] f\right) -\left( \left[ \mathbf {C}^{\left( 1\right) },\ldots ,\mathbf {C}^{\left( k-1\right) }\right] f\otimes \mathbf {I}\right) \right) \nonumber \\ {}&\qquad \times \left( \ominus ^{\left( k,1\right) }\left( \mathbf {C}^{\left( k\right) },\mathbf {C}^{\left( 1\right) }\right) \right) ^{-1}. \end{aligned}$$
(93)

Proof

Statement (92) is trivial.

Since the matrices \(\mathbf {C}^{\left( k\right) }\) are simultaneously diagonalizable, it is sufficient to prove the recursion for diagonal matrices \(\mathbf {C}^{\left( k\right) }=\mathbf {D}^{\left( k\right) }\); the statement then follows from the corresponding recurrence for standard divided differences. \(\square \)
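For diagonal arguments the tensorial divided difference (85) is itself diagonal, with entries given by scalar Newton divided differences of the diagonal entries; this makes the recursion (93) easy to verify numerically. The sketch below (our own illustration, reusing newton_dd from the sketch after (86)) does this for \(s=2\), \(n=3\) and arguments with pairwise disjoint spectra.

```python
import numpy as np
from itertools import product

def tensorial_dd_diag(f, ds):
    """Tensorial divided difference for diagonal arguments: ds is a list of 1d
    arrays of diagonal entries; entry (i_1,...,i_n) is [d^(1)_{i_1},...,d^(n)_{i_n}] f."""
    return np.diag([newton_dd(f, list(idx)) for idx in product(*ds)])

f = np.exp
ds = [np.array([0.1, 0.9]), np.array([0.4, 1.3]), np.array([0.7, 2.0])]
s = 2

lhs = tensorial_dd_diag(f, ds)                                    # [D1, D2, D3] f
ominus = (np.kron(np.eye(s ** 2), np.diag(ds[-1]))                # ominus^{(3,1)}(D3, D1)
          - np.kron(np.diag(ds[0]), np.eye(s ** 2)))
rhs = (np.kron(np.eye(s), tensorial_dd_diag(f, ds[1:]))           # I x [D2, D3] f
       - np.kron(tensorial_dd_diag(f, ds[:-1]), np.eye(s))) @ np.linalg.inv(ominus)
print(np.abs(lhs - rhs).max())   # ~ machine precision
```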

Lemma 26

(Leibniz rule for tensorial divided differences) Let \(\mathbf {C}^{\left( j\right) }\), \(1\le j\le n\), be simultaneously diagonalizable (91), and let f be as in Definition 21. For mappings f, g analytic in a neighborhood of the union of the spectra \(\bigcup _{k=1}^{n}\sigma \left( \mathbf {C}^{\left( k\right) }\right) \), the tensorial Leibniz rule for divided differences holds:

$$\begin{aligned} \left[ \mathbf {C}^{\left( 1\right) },\ldots ,\mathbf {C}^{\left( n\right) }\right] \left( fg\right) =\sum _{j=1}^{n}\left( \left[ \mathbf {C}^{\left( j\right) },\ldots ,\mathbf {C}^{\left( n\right) }\right] f\right) \circ \left( \left[ \mathbf {C}^{\left( 1\right) },\ldots ,\mathbf {C}^{\left( j\right) }\right] g\right) . \end{aligned}$$
(94)

Proof

Since the matrices \(\mathbf {C}^{\left( k\right) }\) are assumed to be simultaneously diagonalizable it is sufficient to prove the statement for diagonal matrices \(\mathbf {C}^{\left( k\right) }=\mathbf {D}^{\left( k\right) }\), \(1\le k\le n\). Furthermore, continuity of divided differences with respect to the arguments \(\mathbf {C}^{\left( k\right) }\), \(1\le k\le n\), implies that it is enough to prove (94) for matrices with pairwise disjoint spectra, cf. [4].

The statement is trivial for \(n=1\). We now assume that the assertion holds for all \(m<n\) and derive it for n.

From Lemma 25, we deduce

$$\begin{aligned}&\left[ \mathbf {D}^{\left( 1\right) },\ldots ,\mathbf {D}^{\left( n\right) }\right] \left( fg\right) \\&\quad =\left( \left( \mathbf {I}\otimes \left[ \mathbf {D}^{\left( 2\right) },\ldots ,\mathbf {D}^{\left( n\right) }\right] \left( fg\right) \right) -\left( \left[ \mathbf {D}^{\left( 1\right) },\ldots ,\mathbf {D}^{\left( n-1\right) }\right] \left( fg\right) \otimes \mathbf {I}\right) \right) \\ {}&\qquad \times \left( \ominus ^{\left( n,1\right) }\left( \mathbf {D}^{\left( n\right) },\mathbf {D}^{\left( 1\right) }\right) \right) ^{-1}\\&\quad \overset{\text {ind. assump.}}{=}\left( \mathbf {I}\otimes \sum _{j=2}^{n}\left( \left[ \mathbf {D}^{\left( j\right) },\ldots ,\mathbf {D}^{\left( n\right) }\right] f\right) \circ \left( \left[ \mathbf {D}^{\left( 2\right) },\ldots ,\mathbf {D}^{\left( j\right) }\right] g\right) \right. \\&\quad \quad \left. -\left( \sum _{j=1}^{n-1}\left[ \mathbf {D}^{\left( j\right) },\ldots ,\mathbf {D}^{\left( n-1\right) }\right] f\circ \left[ \mathbf {D}^{\left( 1\right) },\ldots ,\mathbf {D}^{\left( j\right) }\right] g\right) \otimes \mathbf {I}\right) \\ {}&\qquad \times \left( \ominus ^{\left( n,1\right) }\left( \mathbf {D}^{\left( n\right) },\mathbf {D}^{\left( 1\right) }\right) \right) ^{-1}\\&\quad =\left( \sum _{j=2}^{n}\left( \left[ \mathbf {D}^{\left( j\right) },\ldots ,\mathbf {D}^{\left( n\right) }\right] f\circ \left[ \mathbf {D} ^{\left( 1\right) },\ldots ,\mathbf {D}^{\left( j\right) }\right] g\right) \ominus ^{\left( j,1\right) }\left( \mathbf {D}^{\left( j\right) },\mathbf {D}^{\left( 1\right) }\right) \right. \\&\quad \quad \left. +\sum _{j=1}^{n-1}\left( \left[ \mathbf {D}^{\left( j\right) },\ldots ,\mathbf {D}^{\left( n\right) }\right] f\circ \left[ \mathbf {D}^{\left( 1\right) },\ldots ,\mathbf {D}^{\left( j\right) }\right] g\right) \ominus ^{\left( n,j\right) }\left( \mathbf {D}^{\left( n\right) },\mathbf {D}^{\left( j\right) }\right) \right) \\ {}&\qquad \times \left( \ominus ^{\left( n,1\right) }\left( \mathbf {D}^{\left( n\right) },\mathbf {D}^{\left( 1\right) }\right) \right) ^{-1} . \end{aligned}$$

Since \(\ominus ^{\left( 1,1\right) }\left( \mathbf {D}^{\left( 1\right) },\mathbf {D}^{\left( 1\right) }\right) =\ominus ^{\left( n,n\right) }\left( \mathbf {D}^{\left( n\right) },\mathbf {D}^{\left( n\right) }\right) =0\) the first sum can be extended to \(j=1\) and the second one to \(j=n\) without changing the values. Since \(\ominus ^{\left( j,1\right) }\left( \mathbf {D}^{\left( j\right) },\mathbf {D}^{\left( 1\right) }\right) +\ominus ^{\left( n,j\right) }\left( \mathbf {D}^{\left( n\right) },\mathbf {D}^{\left( j\right) }\right) =\ominus ^{\left( n,1\right) }\left( \mathbf {D}^{\left( n\right) },\mathbf {D}^{\left( 1\right) }\right) \), the result follows. \(\square \)
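In the scalar case \(s=1\), (94) reduces to the classical Leibniz (Steffensen) formula for divided differences, which gives a quick numerical sanity check (again reusing newton_dd from the sketch after (86)):

```python
import numpy as np

# [c_1,...,c_n](f g) = sum_j [c_j,...,c_n] f * [c_1,...,c_j] g   (scalar case of (94))
f, g = np.exp, np.sin
c = [0.2, 0.5, 0.9, 1.4]
lhs = newton_dd(lambda z: f(z) * g(z), c)
rhs = sum(newton_dd(f, c[j:]) * newton_dd(g, c[:j + 1]) for j in range(len(c)))
print(lhs - rhs)   # ~ machine precision
```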

Finally, we will need a result for the composition of tensorized bilinear forms and employ the notation of vectorization as in (25).

Lemma 27

For vectors \(\mathbf {v}^{\left( j\right) },\mathbf {w} ^{\left( j\right) }\in \mathbb {C}^{s}\), let

$$\begin{aligned} \mathbf {q}^{\left( k+1\right) }:= & {} \alpha ^{\left( m+1,k\right) } \mathbf {B}^{\left( k+1\right) }\mathbf {w}^{\left( k+1\right) }\\&\text {with} \ \alpha ^{\left( m+1,k\right) }:=\left( {\displaystyle \bigotimes \limits _{j=m+1}^{k}} \mathbf {v}^{\left( j\right) }\right) \cdot \left( {\displaystyle \bigotimes \limits _{j=m+1}^{k}} \mathbf {B}^{\left( j\right) }\right) {\displaystyle \bigotimes \limits _{j=m+1}^{k}} \mathbf {w}^{\left( j\right) }. \end{aligned}$$

Then

$$\begin{aligned}&\left( {\displaystyle \bigotimes \limits _{\ell =k+1}^{n}} \mathbf {v}^{\left( \ell \right) }\otimes \bullet \right) \cdot \left( {\displaystyle \bigotimes \limits _{\ell =k+1}^{n+1}} \mathbf {C}^{\left( \ell \right) }\right) \left( \mathbf {q}^{\left( k+1\right) }\otimes {\displaystyle \bigotimes \limits _{\ell =k+2}^{n+1}} \mathbf {w}^{\left( \ell \right) }\right) \nonumber \\&\qquad \qquad =\left( {\displaystyle \bigotimes \limits _{\ell =m+1}^{n}} \mathbf {v}^{\left( \ell \right) }\otimes \bullet \right) \cdot \left( {\displaystyle \bigotimes \limits _{\ell =k+1}^{n+1}} \mathbf {C}^{\left( \ell \right) }\right) \circ \left( {\displaystyle \bigotimes \limits _{\ell =m+1}^{k+1}} \mathbf {B}^{\left( \ell \right) }\right) {\displaystyle \bigotimes \limits _{\ell =m+1}^{n+1}} \mathbf {w}^{\left( \ell \right) }. \end{aligned}$$
(95)

Proof

We denote the left-hand side in (95) by . Then,

and this is the assertion. \(\square \)

Theorem 28

(Associativity) Let a Runge-Kutta method be given which satisfies Assumption 4. Let \(W\left( s\right) \in \mathcal {L}\left( B,D\right) \) and \(V\left( s\right) \in \mathcal {L}\left( D,E\right) \) denote transfer operators which are analytic in some complex neighborhood \(\mathcal {U}\) of the spectra \(\sigma \left( \left( \Delta _{n}\mathbf {A}\right) ^{-1}\right) \), \(1\le n\le N\). It holds

$$\begin{aligned} V\left( \partial _{t}^{\Theta }\right) \circ W\left( \partial _{t}^{\Theta }\right) =\left( VW\right) \left( \partial _{t}^{\Theta }\right) . \end{aligned}$$
(96)

Proof

We set

The left-hand side in (96) can be written in the form

Next we apply the tensorial Leibniz rule for divided differences (cf. Lemma 26) to obtain

\(\square \)

Corollary 29

(Inversion formula) Let a Runge-Kutta method be given which satisfies Assumption 4. Equation (79a) has an explicit inversion formula. It holds

(97)

Proof

We employ Theorem 28 with \(V:=K_{-\rho }\) and \(W:=\left( K^{-1}\right) _{\rho }\) to obtain

with the identity mapping \({\text {Id}}\). Hence, only the summand with \(m=n\) is different from zero and the assertion follows. \(\square \)

6 Implementation and experiment

Our implementation of the Runge–Kutta gCQ is based on quadrature applied to definition (22). If a suitable quadrature with nodes \(z_{\ell }\) and weights \(w_{\ell }\), \(\ell =1,\dots ,N_{Q}\), is available, it is clear how to approximate the action of the (forward) convolution \(K(\partial _{t}^{\Theta })\phi \) by a Runge-Kutta time stepping method applied to

$$\begin{aligned} \partial _{t}u_{\rho }(z_{\ell },t)=z_{\ell }u_{\rho }(z_{\ell },t)+\partial _{t}^{\rho }\phi (t),\quad u_{\rho }(z_{\ell },0)=0,\quad \ell =1,\dots ,N_{Q}. \end{aligned}$$
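As a concrete (and deliberately simplified) reading of this recipe, the sketch below steps the ODEs above with the implicit Euler method, the simplest Runge-Kutta method covered by our assumptions, and assembles \(K(\partial _{t}^{\Theta })\phi \) at each grid point; \(\rho =0\) and the availability of suitable nodes and weights are assumptions of the illustration.

```python
import numpy as np

def forward_convolution(K, phi, t, z, w):
    """Approximate K(d_t) phi on the grid t via quadrature in (22):
    step u' = z_l u + phi for every node z_l (implicit Euler) and sum w_l K(z_l) u."""
    u = np.zeros(len(z), dtype=complex)              # u(z_l, t_n) for all nodes at once
    out = np.zeros(len(t), dtype=complex)
    t_prev = 0.0
    for n in range(len(t)):
        dt = t[n] - t_prev
        u = (u + dt * phi(t[n])) / (1.0 - dt * z)    # implicit Euler step at all nodes
        out[n] = np.sum(w * K(z) * u)
        t_prev = t[n]
    return out
```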

The solution of the convolution equation \(K(\partial _{t}^{\Theta })\phi =g\), for given g, avoids the evaluation of the inverse convolution \(\phi =K^{-1}(\partial _{t}^{\Theta })g\) by employing the following algorithm which is based on K and not on its inverse. We compute approximations \({\varvec{\widetilde{\phi }}}^{(n)}\approx {\varvec{\phi }^{(n)}}\) from

$$\begin{aligned} K_{-\rho }\left( (\Delta _{n}\mathbf {A})^{-1}\right) {\varvec{\widetilde{\phi }}}^{(n)}=\mathbf {g}^{(n)}-\sum _{\ell =1}^{N_{Q}}w_{\ell }K_{-\rho }(z_{\ell })\left( \mathbf {e}_{s}\cdot \mathbf {u}^{(n-1)}(z_{\ell })\right) (\mathbf {I}-\Delta _{n}z_{\ell } \mathbf {A})^{-1}\mathbf {1} \end{aligned}$$

in the following way.

Algorithm 30 (Runge–Kutta gCQ for the solution of \(K(\partial _{t}^{\Theta })\phi =g\))
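A hedged sketch of this stepping scheme for a scalar transfer operator is given below; it is our own condensed reading of Algorithm 30, not a verbatim transcription. We take \(\rho =0\) (so that \(K_{-\rho }=K\)) and assume that the Runge-Kutta matrix \(\mathbf {A}\) is invertible and diagonalizable, that the method is stiffly accurate, and that K accepts complex array arguments elementwise.

```python
import numpy as np

def solve_convolution_equation(K, g, t, A, z, w):
    """Solve K(d_t^Theta) phi = g on the grid t. g[n] is the stage data at step n;
    z, w are contour quadrature nodes/weights; A is the Runge-Kutta matrix."""
    s = A.shape[0]
    I, one = np.eye(s), np.ones(s)
    u = np.zeros(len(z), dtype=complex)          # e_s . u^{(n-1)}(z_l) for every node
    phi, t_prev = [], 0.0
    for n in range(len(t)):
        dt = t[n] - t_prev
        # history term: sum_l w_l K(z_l) u^{(n-1)}(z_l) (I - dt z_l A)^{-1} 1
        hist = np.zeros(s, dtype=complex)
        for l in range(len(z)):
            hist += w[l] * K(z[l]) * u[l] * np.linalg.solve(I - dt * z[l] * A, one)
        # local solve K((dt A)^{-1}) phi^(n) = g^(n) - hist, via diagonalization
        lam, V = np.linalg.eig(np.linalg.inv(dt * A))
        phi_n = V @ (np.linalg.solve(V, g[n] - hist) / K(lam))
        phi.append(phi_n)
        # propagate: stages U = (I - dt z A)^{-1} (u 1 + dt A phi^(n)), keep last stage
        for l in range(len(z)):
            U = np.linalg.solve(I - dt * z[l] * A, u[l] * one + dt * (A @ phi_n))
            u[l] = U[-1]                         # stiffly accurate: e_s . U
        t_prev = t[n]
    return phi
```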

For gCQ based on the implicit Euler method the quadrature problem has been fully solved in [10] and several experiments are reported in [11]. The contour of choice in this case is the circle centered at \(\Delta _{\min }^{-1}\) with radius \(\Delta _{\min }^{-1}\), which coincides with the boundary of the region \(|R(\Delta _{\min }z)|=1\). The parameterization of this circle uses Jacobi elliptic functions in order to optimally exploit the analyticity domain of the integrand in (22), whose poles are located in the real segment \([\Delta ^{-1},\Delta _{\min }^{-1}]\).

For higher order Runge–Kutta methods the poles of the integrand in (22) are typically located in a sector around the positive real axis and the boundary of the stability region \(|R(\Delta _{\min }z)|=1\) is more complicated than a circle. In Fig. 1 we show the location of the poles, the curve \(|R(\Delta _{\min }z)|=1\) and our contour of choice for the grid

$$\begin{aligned} t_{j}=\left( \frac{j}{20}\right) ^{2},\quad j=1,\dots ,20, \end{aligned}$$

both for the implicit Euler method and RadauIIA5. In both cases we choose a circle as the integration contour, but for RadauIIA5 the radius is much larger, namely \(M=5\max (|\lambda |)/\Delta _{\min }\) for the eigenvalues \(\lambda \) of the Runge-Kutta matrix \(\mathbf {A}\). This implies that the contour becomes more vertical at \(z=0\) and thus avoids penetrating too far into the region \(|R(\Delta _{\min }z)|>1\) close to the origin. For this contour, the number of quadrature nodes needed to produce the error plot in Fig. 2 was \(N_{Q}=3N\log ^{2}(N)\). The optimization of the integration contour and a rigorous error and complexity analysis are the subject of ongoing research.
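For completeness, a sketch of this contour choice, under our (assumed) reading that the circle is centered at M on the positive real axis, mirroring the implicit Euler choice, so that it passes through the origin:

```python
import numpy as np

def contour_quadrature(A, dt_min, N):
    """Equispaced nodes and trapezoidal weights w = dz/(2 pi i) on a circle of
    radius M through the origin; M and N_Q are chosen as reported in the text."""
    M = 5.0 * np.max(np.abs(np.linalg.eigvals(A))) / dt_min
    NQ = int(np.ceil(3 * N * np.log(N) ** 2))
    theta = 2.0 * np.pi * (np.arange(NQ) + 0.5) / NQ   # offset keeps z = 0 off the nodes
    z = M + M * np.exp(1j * theta)
    w = M * np.exp(1j * theta) / NQ                    # dz/(2 pi i) with dz = i M e^{i theta} d theta
    return z, w
```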

Fig. 1 Poles of the integrand in (22), integration contour and curve \(|R(\Delta _{\min }z)|=1\) for 20 steps quadratically graded towards the origin. Left: implicit Euler method. Right: RadauIIA5

Fig. 2 Error with respect to the number of steps for g in (99). The straight lines indicate slopes 1, 2 and 3, respectively

In order to illustrate the performance of high order Runge–Kutta gCQ in comparison with the original CQ with uniform steps, we consider the following one-dimensional example: find \(\phi \) such that \(K(\partial _{t})\phi =g\) with

(99)

The exact solution to this problem is computed in [17] and is given by

$$\begin{aligned} \phi (t)=2\sum _{k=0}^{\lfloor t/2\rfloor }g^{\prime }(t-2k). \end{aligned}$$
(100)
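Evaluating (100) for the error computation is straightforward; in the sketch below, gprime is a placeholder for \(g^{\prime }\) from (99), which must be supplied by the reader:

```python
import numpy as np

def exact_solution(t, gprime):
    """phi(t) = 2 sum_{k=0}^{floor(t/2)} g'(t - 2k), cf. (100)."""
    return 2.0 * sum(gprime(t - 2 * k) for k in range(int(np.floor(t / 2)) + 1))
```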

We approximate \(\phi (t)\) for \(t\in [0,1]\) by applying Algorithm 30 with RadauIIA5 and \(\rho =0\). Then we have \(\mu =1\) in Assumption 1, \(p=5\) and \(q=3\). The right-hand side g satisfies \(g^{(\ell )}(0)=0\) for \(\ell =0,1,2\) and is not three times differentiable at \(t=0\). This lack of regularity suggests using a time grid which is algebraically graded towards the origin. We heuristically choose a quadratically graded mesh with points

$$\begin{aligned} \Theta =(t_{j})_{j=1}^{N}\quad \text{ with } \quad t_{j}=\left( \frac{j}{N}\right) ^{\alpha } \end{aligned}$$

and \(\alpha =2\). In this case we have \(\Delta =N^{-1}\) and \(\Delta _{\min }=N^{-2}\). For a comparison with uniform steps we set \(\alpha =1\). Figure 2 shows that the convergence rate is \(\mathcal {O}(\Delta ^{3})\) for the graded mesh and about \(\mathcal {O}(\Delta ^{1.6})\) for the uniform mesh. For this example we have \(\mu =1\), and thus the minimal integer \(\nu >\mu +1\) is \(\nu =3\). For \(\rho =0\), with \(\mu -\rho =1>-1\), Theorem 16 then predicts a convergence rate of \(\mathcal {O}(\Delta ^{3+1-3})=\mathcal {O}(\Delta )\). The theoretical estimate provided by this theorem is of order 3 or higher only for \(\rho \ge 2\); more precisely, it is \(\mathcal {O}\left( \Delta ^{3}\right) \) for \(\rho =2\) and \(\mathcal {O}(\Delta ^{5})\) for \(\rho =3\). We believe this is due to a limitation of our theory, which in principle does not allow choosing a fractional value of \(\nu \). In the (not allowed) limit case \(\nu =2\), the theory would actually yield an estimate of order \(\mathcal {O}(\Delta ^{2})\). However, we observe a better convergence rate for \(\rho =0\) than predicted by our theory; in fact, the observed rate coincides with the optimal convergence rate for smooth and compatible problems with uniform time steps derived in [1]. It is an open problem whether there exist examples where a bigger value of \(\rho \) is necessary for variable steps than for uniform steps, or whether our theory yields a suboptimal estimate in terms of this parameter.