1 Introduction

In many physical applications, e.g. acoustic and electromagnetic scattering, it is necessary to solve an exterior boundary value problem for the three-dimensional wave equation. Such problems can be effectively treated with the use of time-domain boundary integral equations (TDBIE). The well-posedness of such formulations for the wave equation was analyzed in [3, 4].

The solution of time-domain boundary integral equations is usually performed with Galerkin time-space methods [16], collocation methods [11] or Laplace-domain approaches. A review of these methods can be found in [10]. However, compared to the field of elliptic problems, fast solvers for time-domain boundary integral equations are not as extensively developed. A particularly efficient approach to the solution of retarded potential boundary integral equations is offered in [14, 15].

One of the methods for the solution of TDBIE, convolution quadrature [22–24], combines Laplace-domain and time-stepping techniques. It is stable and efficient, and, in contrast to standard Galerkin time-space methods, does not require intricate, highly accurate quadrature in space. The applicability of the method to external boundary-value problems for the wave equation was justified in [24]. These results were supported by extensive numerical experiments in [5], where it was also shown that Runge–Kutta convolution quadrature [25] is preferable to multistep convolution quadrature whenever the scattering domain is non-convex. A theoretical justification of this fact was given in [6].

In this paper we discretize the time-domain single layer boundary operator by an \(m\)-stage A-stable Runge–Kutta convolution quadrature [25]. The weights of the quadrature are boundary integral operators. The main part of this paper is devoted to the investigation of the behaviour of the kernels \(w_{n}^{h}(d)\) of these operators. We prove estimates that show an exponential decay of \(\Vert w_n^h(d)\Vert \) away from a neighbourhood of \(nh \approx d\), where \(h>0\) denotes the timestep. The paper ends with an illustration of how these results can be used to speed up existing algorithms for the computation of convolution weights.

2 Statement of the problem

Let \(\varOmega \subset {\mathbb {R}}^3\) be a bounded Lipschitz domain with boundary \(\varGamma \) and let \(\varOmega ^{c}={\mathbb {R}}^{3}\setminus \overline{\varOmega }\) be its complement.

We will consider the homogeneous wave equation set in \(\varOmega ^{c}\),

$$\begin{aligned}&\frac{\partial ^2{u}}{\partial {t}^2}-\varDelta u = 0\quad \text {in }[0, T]\times \varOmega ^{c},\nonumber \\&u(0, .) = \frac{\partial {u}}{\partial {t}}(0, .) = 0 \quad \text {in }\varOmega ^{c}, \\&u(t, x) = g(t, x) \quad \text {on }[0, T]\times \varGamma .\nonumber \end{aligned}$$
(2.1)

The solution \(u\) of the above system can be represented as the single-layer potential of an unknown density \(\lambda \)

$$\begin{aligned} u(t, {\tilde{x}})= ({\fancyscript{S}}\lambda )(t, {\tilde{x}})&= \int _{0}^{t}\int _{\varGamma }\frac{\delta (t-\tau - \Vert {\tilde{x}}-y\Vert )}{4\pi \Vert {\tilde{x}}-y\Vert }\lambda (\tau ,y) d\varGamma _{y} d\tau , \\&= \int _{\varGamma }\frac{\lambda (t-\Vert {\tilde{x}}-y\Vert ,y)}{4\pi \Vert {\tilde{x}}-y\Vert }d\varGamma _y,\quad (t, {\tilde{x}})\in [0,T]\times {\varOmega }^{c}, \end{aligned}$$

where \(\delta (\cdot )\) denotes the Dirac delta function. The single layer potential \({\fancyscript{S}}\lambda \) is also known as the retarded potential, the name being justified by the second expression above. For any density \(\lambda \), the function \(u = {\fancyscript{S}} \lambda \) satisfies the first two equations in (2.1); therefore, to solve (2.1) it remains to choose \(\lambda \) so that the boundary condition is also satisfied. The single layer potential \({\fancyscript{S}}\lambda \) is continuous across \(\varGamma \), so letting \({\tilde{x}}\rightarrow x \in \varGamma \) in the above equation and using the boundary condition from (2.1), we obtain a boundary integral equation for the unknown density \(\lambda \)

$$\begin{aligned} g(t, x)=({\fancyscript{V}}\lambda )(t, x)&= \int _{0}^{t}\int _{\varGamma }\frac{\delta (t-\tau - \Vert x-y\Vert )}{4\pi \Vert x-y\Vert }\lambda (\tau ,y) d\varGamma _{y} d\tau , \nonumber \\&\quad \forall (t, x)\in [0, T]\times \varGamma . \end{aligned}$$
(2.2)

Here, the operator \({{\fancyscript{V}}}\) is called the single layer boundary integral operator. For the existence and uniqueness of solutions of this equation see [3].

For further discussion we will require the Laplace transforms of \({{\fancyscript{S}}}\) and \({{\fancyscript{V}}}\). With the Laplace transform defined by

$$\begin{aligned} {\fancyscript{L}} f(s)=\int _0^\infty e^{-st} f(t) dt, \quad \mathrm{Re }s >0, \end{aligned}$$

and causal \(f\), i.e., \(f(t) = 0\) for \(t \le 0\), it holds that

$$\begin{aligned} {\fancyscript{L}} (f(\cdot -r)) (s)= \int _0^\infty e^{-st} f(t-r) dt = e^{-sr} {\fancyscript{L}} f(s),\quad r \ge 0. \end{aligned}$$
(2.3)

Hence the Laplace transforms of \({{\fancyscript{S}}}\) and \({{\fancyscript{V}}}\) are given respectively by

$$\begin{aligned} S(s) \varphi ({\tilde{x}}) = \int _\varGamma \frac{e^{-s\Vert {\tilde{x}}-y\Vert }}{4\pi \Vert {\tilde{x}}-y\Vert }\varphi (y) d\varGamma _y, \quad {\tilde{x}} \in \varOmega ^c, \end{aligned}$$

and

$$\begin{aligned} V(s) \varphi (x) = \int _\varGamma \frac{e^{-s\Vert x-y\Vert }}{4\pi \Vert x-y\Vert }\varphi (y) d\varGamma _y, \quad x \in \varGamma . \end{aligned}$$

Next, we address the time-discretization of retarded potentials.

2.1 Convolution quadrature based on backward differences

Let \(h > 0\) denote the timestep and \(t_j = jh\) the equally spaced time-points. Then the derivative of a causal function \(f\) can be approximated by

$$\begin{aligned} f'(t) \approx \tfrac{1}{h}(f(t)-f(t-h)). \end{aligned}$$
(2.4)

Taking the Laplace transform of the above approximation and applying (2.3) gives

$$\begin{aligned} s{\fancyscript{L}} f(s) \approx \frac{1-e^{-sh}}{h} {\fancyscript{L}} f(s). \end{aligned}$$
(2.5)

We can also reverse this procedure: approximate the differentiation symbol \(s\) in the Laplace domain by \(\tfrac{1-e^{-sh}}{h} = s+O(s^{2}h)\), see (2.5), and compute the inverse Laplace transform of the approximation to obtain the backward difference approximation of the derivative (2.4).

Convolution quadrature of \({{\fancyscript{S}}} \lambda \) and \({{\fancyscript{V}}} \lambda \) proceeds in a similar way. First the approximation in the Laplace domain is made

$$\begin{aligned} V(s) {\fancyscript{L}} \lambda \approx V\left( \tfrac{1-e^{-sh}}{h}\right) {\fancyscript{L}} \lambda \end{aligned}$$
(2.6)

and then the inverse Laplace transform of the approximation is computed and used as an approximation of \({{\fancyscript{V}}} \lambda \). The convolution weights \(\omega _j(V)\) are defined by the expansion

$$\begin{aligned} V\left( \tfrac{1-e^{-sh}}{h}\right) = \sum _{j = 0}^\infty \omega _j(V) e^{-shj} = \sum _{j = 0}^\infty \omega _j(V) e^{-s t_j},\quad \mathrm{Re }s > 0. \end{aligned}$$

Remark 2.1

Note that \(V(s)\) is an analytic function of \(s\) for \(\mathrm{Re }s >0\) and satisfies the bound

$$\begin{aligned} \Vert V(s)\Vert _{H^{1/2}(\varGamma ) \leftarrow H^{-1/2}(\varGamma )} \le C(\sigma ) |s|,\quad \mathrm{Re }s \ge \sigma > 0, \end{aligned}$$

see [3]. Also, \(\tfrac{1-e^{-sh}}{h}\) is an analytic function of \(\zeta = e^{-sh}\) and \(\mathrm{Re }(1-e^{-sh}) > 0\) for \(\mathrm{Re }s > 0\). Hence, the above expansion is well defined and the linear operators \(\omega _j(V): H^{-1/2}(\varGamma ) \rightarrow H^{1/2}(\varGamma )\) are bounded.

Using (2.3) again, we see that the inverse Laplace transform of the approximation in (2.6) is given by

$$\begin{aligned} {{\fancyscript{V}}} \lambda (t) \approx \sum _{j = 0}^\infty \omega _j(V) \lambda (t-t_j). \end{aligned}$$

Assuming causality of \(\lambda \) we obtain the convolution quadrature approximation at \(t = t_n\):

$$\begin{aligned} \sum _{j = 0}^n \omega _j(V) \lambda (t_n-t_j) = \sum _{j = 0}^n \omega _{n-j}(V) \lambda (t_j). \end{aligned}$$
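For this first order method the weights can be written down explicitly. Substituting \(s=\tfrac{1-\zeta }{h}\) into the kernel of \(V(s)\) at a fixed distance \(d=\Vert x-y\Vert \) gives \(\mathrm{e}^{-d/h}\mathrm{e}^{\zeta d/h}/(4\pi d)\), so the kernel of \(\omega _n(V)\) equals \(\mathrm{e}^{-d/h}(d/h)^n/(4\pi d\, n!)\), a Poisson-type profile peaked at \(n\approx d/h\). A minimal Python sketch of this closed form (SciPy assumed; the parameter values are purely illustrative) already displays the concentration around \(n\approx d/h\) that is studied for Runge–Kutta methods in Sect. 3.

```python
import numpy as np
from scipy.special import gammaln

# BDF1 convolution weights of the kernel e^{-s d}/(4 pi d):
# substituting s -> (1 - zeta)/h gives
#   exp(-d/h) * exp(zeta d/h) / (4 pi d) = sum_n w_n^h(d) zeta^n,
# hence w_n^h(d) = exp(-d/h) (d/h)^n / (n! 4 pi d), a Poisson-type profile.
h, d = 0.1, 2.0                       # illustrative timestep and distance
n = np.arange(60)
w = np.exp(-d/h + n*np.log(d/h) - gammaln(n + 1)) / (4*np.pi*d)
print(n[np.argmax(w)])                # peak near d/h = 20
print(w[0], w[-1])                    # exponentially small away from n ~ d/h
```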

Error and stability analysis of this first order time-discretization method and of convolution quadrature based on other A-stable linear multistep methods can be found in [24].

2.2 Runge–Kutta based convolution quadrature

Since the order of A-stable linear multistep methods is restricted to \(p \le 2\) (Dahlquist's second barrier), Runge–Kutta based methods need to be considered for higher order. For the importance of high order methods in wave propagation problems see [5, 6] and for the analysis of the resulting discrete systems see [25] and [8, 9].

Let an \(m\)-stage Runge–Kutta method be given by its Butcher tableau

$$\begin{aligned} \begin{array}{c|c} c &{} A \\ \hline &{} b^{T} \end{array}\;, \qquad A \in {\mathbb {R}}^{m\times m},\quad b, c \in {\mathbb {R}}^{m}. \end{aligned}$$

In terms of \(A\), \(b\), and \(c\), an \(m\)-stage Runge–Kutta discretization of the initial value problem \(y' = f(t,y)\), \(y(0) = y_0\), is given by the recurrence

$$\begin{aligned} Y_{ni}&= y_n + h \sum _{j = 1}^m a_{ij} f(t_n+c_jh,Y_{nj}),\quad i =1,\ldots ,m,\\ y_{n+1}&= y_n + h \sum _{j = 1}^m b_j f(t_n+c_jh,Y_{nj}); \end{aligned}$$

here, \(h\) is the time-step and \(t_j = jh\). The values \(Y_{ni}\) and \(y_n\) are approximations to \(y(t_n+c_i h)\) and \(y(t_n)\), respectively. This Runge–Kutta method is said to be of (classical) order \(p \ge 1\) and stage order \(q\) if for sufficiently smooth right-hand sides \(f\),

$$\begin{aligned} Y_{0i} - y(c_ih)= O(h^{q+1}), \text { for } i = 1,\dots ,m, \quad \text {and} \quad y_1 - y(t_1) = O(h^{p+1}), \end{aligned}$$

as \(h \rightarrow 0\).

The corresponding stability function is defined by \(R(z)=1+zb^T(I-Az)^{-1}1\!\!1\), where \(1\!\!1=(1\; \ldots \; 1)^{T}\) and the following approximation property holds

$$\begin{aligned} R(z)=\mathrm{e}^{z}+O(z^{p+1}). \end{aligned}$$
(2.7)

We will make the following assumptions on the stability function of the Runge–Kutta method and will further assume that the coefficient matrix \(A\) is invertible. These assumptions are required by the theory of Runge–Kutta convolution quadrature for hyperbolic problems as described in [9].

Assumption 2.1

  1. (a)

    The Runge–Kutta method is A-stable, namely \(|R(z)|\le 1\) for all \(z\) with \(\mathrm{Re } z\le 0\).

  2. (b)

    \(|R(\infty )|=1-b^TA^{-1}1\!\!1=0\).

  3. (c)

    \(|R(iy)|<1\), for all \(y\in {\mathbb {R}}\setminus \{0\}\).

Note that the second assumption above implies \(c_m = 1\). Examples of Runge–Kutta methods satisfying these assumptions are the Radau IIA and Lobatto IIIC methods.
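As a quick numerical illustration, these assumptions can be checked directly from the tableau data; a small Python sketch for the 2-stage Radau IIA method (the tableau is standard; the sampling grid and helper name are illustrative):

```python
import numpy as np

# 2-stage Radau IIA (order p = 3); the tableau data are standard.
A = np.array([[5/12, -1/12],
              [3/4,   1/4]])
b = np.array([3/4, 1/4])
one = np.ones(2)

def R(z):
    """Stability function R(z) = 1 + z b^T (I - zA)^{-1} 1."""
    return 1 + z * (b @ np.linalg.solve(np.eye(2) - z*A, one))

# Assumption (b): |R(infinity)| = |1 - b^T A^{-1} 1| = 0
print(abs(1 - b @ np.linalg.solve(A, one)))
# Assumption (c): |R(iy)| < 1 for y != 0 (sampled on a grid)
y = np.linspace(1e-3, 100.0, 10000)
print(max(abs(R(1j*t)) for t in y))
```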

As in the previous section we will want to approximate the derivative of a function \(f\), but this time at the vector of stages:

$$\begin{aligned} f(t+ch) = \begin{pmatrix} f(t+c_1 h) \\ \vdots \\ f(t+c_m h) \end{pmatrix}. \end{aligned}$$

Using (2.3) and proceeding as in the previous section we see that an approximation in the Laplace domain of the form

$$\begin{aligned} s e^{csh} {\fancyscript{L}} f (s) \approx \frac{\varDelta (e^{-sh})}{h} e^{csh} {\fancyscript{L}} f (s) \end{aligned}$$

is required, where

$$\begin{aligned} e^{csh} = \begin{pmatrix} e^{c_1sh} \\ \vdots \\ e^{c_m sh} \end{pmatrix} \end{aligned}$$

and \(\varDelta (\zeta ): {\mathbb {C}} \rightarrow {\mathbb {C}}^{m \times m}\) is a matrix-valued function with the property

$$\begin{aligned} \varDelta (e^{-z})e^{c z} \approx z e^{c z}. \end{aligned}$$

The next lemma defines such a function and proves some of its properties.

Lemma 2.1

Let

$$\begin{aligned} \varDelta (\zeta ) =\left( A+\frac{\zeta }{1-\zeta }1\!\!1b^{T}\right) ^{-1}=A^{-1}-\zeta A^{-1}1\!\!1b^{T}A^{-1}. \end{aligned}$$
(2.8)

Then the following hold:

  1. (a)

    For \(z \rightarrow 0\) it holds

    $$\begin{aligned} \varDelta (e^{-z})e^{c z} = z e^{c z}+ O(z^{q+1}). \end{aligned}$$
  2. (b)

    If \(\mu \notin \sigma (A^{-1}),\) then \(R(\mu ) = \zeta ^{-1}\) if and only if \(\mu \in \sigma (\varDelta (\zeta )).\)

Proof

Note that the equality in (2.8) is readily proved using the Sherman–Morrison formula and \(b^TA^{-1}1\!\!1= 1\), where the latter is implied by Assumption 2.1(b). Result (b) follows from the expression

$$\begin{aligned} (zI-\varDelta (\zeta ))^{-1} = A(zA-I)^{-1} - \frac{\zeta }{1-R(z)\zeta } (zA-I)^{-1}1\!\!1b^T (zA-I)^{-1} \end{aligned}$$

proved in [25, Lemma 2.4].

Proof of (a) requires a few more steps. In [9, Lemma 2.5], it has been shown that

$$\begin{aligned} z b^Te^{cz} = e^z-1 + O(z^{p+1}) \quad \text {and}\quad zAe^{cz} = e^{cz}-1\!\!1+ O(z^{q+1}). \end{aligned}$$

Hence,

$$\begin{aligned} \varDelta (e^{-z})^{-1} ze^{cz}&= e^{cz}-1\!\!1+ \frac{1}{1-e^{-z}} 1\!\!1-\frac{e^{-z}}{1-e^{-z}} 1\!\!1+ O(z^{q+1})\\&= e^{cz} + O(z^{q+1}). \end{aligned}$$

\(\square \)
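Both parts of the lemma are easy to confirm numerically; a sketch for the 2-stage Radau IIA method (tableau data as above; the test value of \(\zeta \) is arbitrary):

```python
import numpy as np

A = np.array([[5/12, -1/12], [3/4, 1/4]])   # 2-stage Radau IIA
b = np.array([3/4, 1/4]); one = np.ones(2); I = np.eye(2)
Ainv = np.linalg.inv(A)

def Delta(zeta):
    """Right-hand side of (2.8)."""
    return Ainv - zeta * np.outer(Ainv @ one, b @ Ainv)

def R(z):
    return 1 + z * (b @ np.linalg.solve(I - z*A, one))

zeta = 0.3 + 0.2j                                   # arbitrary test value
# equality in (2.8): (A + zeta/(1-zeta) 1 b^T)^{-1} = A^{-1} - zeta A^{-1} 1 b^T A^{-1}
lhs = np.linalg.inv(A + zeta/(1 - zeta) * np.outer(one, b))
print(np.max(np.abs(lhs - Delta(zeta))))            # ~ machine precision
# Lemma 2.1(b): every eigenvalue mu of Delta(zeta) satisfies R(mu) = 1/zeta
for mu in np.linalg.eigvals(Delta(zeta)):
    print(abs(R(mu) - 1/zeta))                      # ~ machine precision
```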

Therefore we are in a similar position as in the previous section. The last step is to construct the expansion

$$\begin{aligned} V\left( \frac{\varDelta (\zeta )}{h}\right) = \sum _{j = 0}^\infty W_j^h(V) \zeta ^j. \end{aligned}$$

Remark 2.2

Note that A-stability and Lemma 2.1(b) imply that the eigenvalues of \(\varDelta (\zeta )\) for \(|\zeta | < 1\) all lie in the right-half complex plane. Therefore the same arguments as in Remark 2.1 tell us that the above expansion is well defined and that the convolution weights \(W_n^h(V)\) are \(m \times m\) matrices of bounded linear operators mapping from \(H^{-1/2}(\varGamma )\) to \(H^{1/2}(\varGamma )\).

Recalling the definition of \(V(s)\) we see that \(W_{n}^{h}\equiv W_{n}^{h}(V)\) are integral operators defined by

$$\begin{aligned} W_n^h \lambda (x) = \int _\varGamma w_n^h(\Vert x-y\Vert ) \lambda (y) d\varGamma _y, \quad x \in \varGamma , \end{aligned}$$

where the kernels \(w_n^h(d)\) are defined by the corresponding expansion

$$\begin{aligned} \frac{\exp \left( -\frac{\varDelta (\zeta )}{h}d\right) }{4\pi d}=\sum \limits _{n=0}^{\infty }w_{n}^{h}(d) \zeta ^{n}. \end{aligned}$$
(2.9)

Let us denote by \(g_{n}\) and \({\varvec{g}}_{n}\) the following functions

$$\begin{aligned} g_{n}(x) = g(nh, x), \quad {\varvec{g}}_n(x)= \begin{pmatrix} g(nh+c_{1}h, x)\\ \vdots \\ g(nh+c_{m}h, x) \end{pmatrix}. \end{aligned}$$

With this notation the Runge–Kutta convolution quadrature of (2.2) is given by

$$\begin{aligned} {\varvec{g}}_{n}(x)=\sum _{i=0}^{n} \left( W_{n-i}^{h}{\varvec{\lambda }}_{i}\right) (x), \end{aligned}$$

where \({\varvec{\lambda }}_n\) denotes

$$\begin{aligned} {\varvec{\lambda }}_n(x)= \begin{pmatrix} \lambda (nh+c_{1}h, x)\\ \vdots \\ \lambda (nh+c_{m}h, x) \end{pmatrix}. \end{aligned}$$
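The discrete system above is block lower triangular Toeplitz and can be solved by marching in time (forward substitution), with one \(m\times m\) solve involving \(W_{0}^{h}\) per step. A purely illustrative Python sketch: the boundary operator is replaced by the scalar model kernel \(\mathrm{e}^{-sd}/(4\pi d)\) at a single fixed distance \(d\), the data \(g\) are made up, and the weights are computed on the unit circle, which is adequate for \(d\le Kh\) (cf. Proposition 4.1 below).

```python
import numpy as np
from scipy.linalg import expm

# 2-stage Radau IIA data
A = np.array([[5/12, -1/12], [3/4, 1/4]])
b = np.array([3/4, 1/4]); c = np.array([1/3, 1.0])
one = np.ones(2); Ainv = np.linalg.inv(A)

def weights(d, h, L):
    """W_n = w_n^h(d) for the scalar kernel e^{-s d}/(4 pi d), computed by the
    trapezoidal rule on the unit circle (adequate for d <= K h, cf. Prop. 4.1)."""
    zeta = np.exp(2j*np.pi*np.arange(L)/L)
    K = np.stack([expm(-(Ainv - z*np.outer(Ainv @ one, b @ Ainv))*d/h)/(4*np.pi*d)
                  for z in zeta])
    return (np.fft.fft(K, axis=0)/L).real

h, d, N = 0.1, 0.05, 40                     # illustrative parameters
W = weights(d, h, 128)
g = [np.sin(2*np.pi*(n + c)*h)**2 for n in range(N + 1)]   # stage data g_n
lam = []
for n in range(N + 1):                      # forward substitution (marching)
    rhs = g[n] - sum(W[n - i] @ lam[i] for i in range(n))
    lam.append(np.linalg.solve(W[0], rhs))
print(lam[-1])
```

The sketch only demonstrates the triangular structure of the marching scheme; in the actual method each \(W_{n}^{h}\) is an \(m\times m\) matrix of boundary integral operators and the solve at each step requires a boundary-element discretization.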

In the remainder of the paper we will require the scaled convolution kernels \(\mathrm{w}_{n}(d):=4\pi d w_{n}^{h}(hd)\). Notice that \(\mathrm{w}_{n}(d)\) are the coefficients of the following expansion:

$$\begin{aligned} \exp \left( -\varDelta (\zeta )d\right) =\sum \limits _{n=0}^{\infty }\mathrm{w}_{n}(d)\zeta ^{n}. \end{aligned}$$
(2.10)
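Setting \(\zeta =0\) in (2.10) gives \(\mathrm{w}_{0}(d)=\exp (-\varDelta (0)d)=\exp (-A^{-1}d)\), and higher coefficients can be extracted numerically by the trapezoidal rule on the unit circle; a short sketch (2-stage Radau IIA assumed, grid size and helper name illustrative):

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[5/12, -1/12], [3/4, 1/4]])   # 2-stage Radau IIA
b = np.array([3/4, 1/4]); one = np.ones(2)
Ainv = np.linalg.inv(A)

def w(n, d, L=256):
    """n-th coefficient of exp(-Delta(zeta) d), trapezoidal rule on |zeta| = 1."""
    zeta = np.exp(2j*np.pi*np.arange(L)/L)
    K = np.stack([expm(-(Ainv - z*np.outer(Ainv @ one, b @ Ainv))*d) for z in zeta])
    return ((np.exp(-2j*np.pi*np.arange(L)*n/L)[:, None, None]*K).mean(axis=0)).real

print(np.max(np.abs(w(0, 3.0) - expm(-Ainv*3.0))))   # w_0(d) = exp(-A^{-1} d)
print(np.linalg.norm(w(3, 0.0)))                     # w_n(0) = 0 for n >= 1
```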

3 Sparsity of Runge–Kutta convolution weights

Our task in this section is to find estimates for the convolution weights \(w_{n}^{h}(d)\) in terms of \(d\) and \(n\). To do so, we first derive bounds for the scaled convolution weights \(\mathrm{w}_{n}(d)\) and then use these results to show that similar bounds hold also for \(w_{n}^{h}(d)\).

The scaled convolution weight \(\mathrm{w}_{n}(d)\) for \(d>0\) can also be expressed as

$$\begin{aligned} \mathrm{w}_{n}(d) = \frac{1}{2\pi i} \oint \limits _{\gamma } R(z)^{n-1}(I-Az)^{-1}1\!\!1b^{T}(I-Az)^{-1} \mathrm{e}^{-zd} dz, \end{aligned}$$
(3.1)

see [25]. Here, \(\gamma \) represents a contour that encloses all the eigenvalues of \(A^{-1}\).

To prove the main estimates, we need to choose the contour \(\gamma \) carefully. First, we consider the domain \(\varUpsilon _{r}\), \(r>0\):

$$\begin{aligned} \varUpsilon _{r} = \left\{ z\in {\mathbb {C}} \; :\; |R(z)|>r \right\} . \end{aligned}$$

We denote by \(\gamma _{r}\) the boundary of this domain, i.e., \(\gamma _r := \partial \varUpsilon _{r}\). Hence, \(|R(z)|=r\) holds for all \(z\in \gamma _{r}\). Next, we prove some properties of domains \(\varUpsilon _r\).

Let

$$\begin{aligned} A_+ =\left\{ z \in {\mathbb {C}} \;:\; |R(z)| > |e^z|,\; \mathrm{Re }z > 0 \right\} \end{aligned}$$

denote the order star of \(R\) restricted to the right-half complex plane, see [20]. In fact, \(A_+\) consists of the \(m\) bounded fingers containing the \(m\) singularities of \(R\) (counting multiplicities). Since \(|R(iy)| < 1\) for \(y \ne 0\), the origin is the only point of intersection of the closure of the order star with the imaginary axis, and hence

$$\begin{aligned} \overline{A}_+ \subset \varUpsilon _1 \cup \{0\}. \end{aligned}$$

Recall that the stability function of the Runge–Kutta method is a rational function

$$\begin{aligned} R(z)=\frac{P(z)}{Q(z)}, \end{aligned}$$

where \(P(z)\) and \(Q(z)\) are polynomials, see [20]. Without loss of generality we can assume that \(Q(0)=1\). The so-called \(E\)-polynomial is defined by

$$\begin{aligned} E(y)=|Q(iy)|^2-|P(iy)|^2=e_{0}y^{2s}+O(y^{2s+2}). \end{aligned}$$
(3.2)

It can be shown that for Runge–Kutta methods satisfying Assumption 2.1 the constant \(e_{0}\) is positive.

We will need the following lemma. We believe this result to be known; however, we were not able to find an exact reference. Therefore, we provide its proof in the appendix.

Lemma 3.1

For an \(A\)-stable Runge–Kutta method of order \(p\) there exist \(q,\nu >0,\) such that the domain

$$\begin{aligned} \{(x, y)\in {\mathbb {R}}^{2}\;:\; |y|< \nu x^{\frac{1}{\ell }},\; 0<x<q\} \end{aligned}$$

belongs to \(\varUpsilon _1\) (and intersects all the order star fingers). Here

$$\begin{aligned} \ell =\left\{ \begin{array}{l@{\quad }r} p+1,&{} \text {if }p\text { is odd, }\\ 2s,&{} \text {if }p\text { is even,}\\ \end{array} \right. \end{aligned}$$

where \(s\) is defined by (3.2) with \(e_0>0.\)

Lemma 3.2

Under Assumption 2.1, the domain \(\varUpsilon _1\) is located in the open right-half plane and is bounded and connected (possibly multiply connected).

Proof

The boundedness follows directly from the assumption \(R(\infty )=0\). A-stability and the bound \(|R(iy)|<1, \; y\in {\mathbb {R}}\setminus \{0\}\) imply that \(\varUpsilon _1\) is located in the open right-half plane.

Let \(\widetilde{\varUpsilon }_1\) be a connected component of \(\varUpsilon _1\) (possibly multiply connected). Then, by the maximum principle, \(\widetilde{\varUpsilon }_1\) must contain a singularity of \(R(z)\) and the closure of the corresponding finger (minus the origin). According to Lemma 3.1, the intersection of \(\widetilde{\varUpsilon }_1\) with all the other fingers is nonempty. Since \(\widetilde{\varUpsilon }_1\) then contains all the singularities of \(R(z)\), by the maximum modulus principle applied to \(R(z)\), it coincides with \(\varUpsilon _1\). \(\square \)

Remark 3.1

The domain \(\varUpsilon _1\) is not necessarily simply connected and can contain a hole \(\varUpsilon '\). Namely, a bounded domain \(\varUpsilon '\) can exist, s.t. \(R(z)\) vanishes at one of its interior points, \(|R(z)|<1\) inside \(\varUpsilon '\) and \(\partial \varUpsilon ' \subset \partial \varUpsilon _1\). This is the case for the diagonally implicit Runge–Kutta (DIRK) method of order 2, see [1, Theorem 5], defined by the Butcher tableau

(3.3)

Its stability function is given by \(R(z)=\frac{1+(1-2a)z}{(1-az)^2}\) and satisfies Assumption 2.1, see also [20, Section 4.6, p.98]. The boundary of the domain \(\varUpsilon _1\) for this method is shown in Fig. 1.

Fig. 1 The boundary of \(\varUpsilon _{1}\) for an \(A\)-stable DIRK method defined by (3.3)

Remark 3.2

Note that, by continuity of \(R(z)\), the domain \(\varUpsilon _{r}\) stays bounded and connected for all \(r\) in a sufficiently small neighbourhood of \(r = 1\).

Corollary 3.1

If the stability function of a Runge–Kutta method that satisfies Assumption 2.1 coincides with a Padé approximant for the exponential, the domain \(\varUpsilon _1\) is simply connected. Further, for small enough \(\varepsilon > 0\), the perturbed domain \(\varUpsilon _{1\pm \varepsilon }\) is also simply connected.

Proof

For the proof we need two ingredients:

  1. 1.

    Ehle’s conjecture, proved in [26, Theorem 7]: a Padé approximation \(R(z)=\frac{P(z)}{Q(z)}\), \(\mathrm{deg }{P}=k\), \(\mathrm{deg }{Q}=m\), is \(A\)-stable iff \(m-2\le k \le m\).

  2. 2.

    All zeros of such Padé approximants lie in the open left-half plane, see [12, 13].

Hence, the existence of a bounded domain \(\varUpsilon '\), s.t. \(|R(z)|<1\) inside \(\varUpsilon '\) and \(\partial \varUpsilon ' \subset \partial \varUpsilon _1\) (i.e. a hole in \(\varUpsilon _1\)), contradicts the maximum modulus principle applied to the analytic function \(\frac{1}{R(z)}\), \(z\in {\mathbb {C}}_{+}\). The proof for the perturbed domain is similar. The main thing to notice is that if \(\varepsilon \) is small enough, then there exists \(\varepsilon ' > 0\) such that the zeros of \(R(z)\) are contained in the half-plane \(\mathrm{Re }z < -\varepsilon '\) and that \(\varUpsilon _{1\pm \varepsilon }\) is contained in \(\mathrm{Re }z > -\varepsilon '\).   \(\square \)

Again, the stability functions of the Radau IIA and Lobatto IIIC methods satisfy the assumptions of the above corollary.

Lemma 2.1(b) provides us with an easy way to draw the curves \(\gamma _r = \partial \varUpsilon _r\), i.e., by plotting the eigenvalues of \(\varDelta (\zeta )\) for all \(|\zeta | = 1/r\). In [5] the multiplicity of the eigenvalues of \(\varDelta (\zeta )\) for the 2- and 3-stage Radau IIA Runge–Kutta methods was discussed. In both cases \(\varDelta (\zeta )\) has only simple eigenvalues for \(|\zeta | = 1\), as explained also by Corollary 3.1. For the 2-stage version, eigenvalues of multiplicity greater than 1 occur only for \(\zeta = -5 \pm 3\sqrt{3}\). For a plot of these curves at the critical values and \(r = 1\) see Fig. 2.

Fig. 2 Curves \(\gamma _r\), for the 2-stage Radau IIA method, plotted for \(r = 1\) (the middle curve, in blue) and the critical values \(r = r_1 = 5+3\sqrt{3}\) (the outer curve, in green) and \(r = r_2 = 3\sqrt{3}-5\) (the inner curve, in red). For \(r > r_1\) and \(r < r_2\) the curve splits into two disjoint curves
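Curves such as those in Fig. 2 can be produced directly from Lemma 2.1(b); a matplotlib sketch for the 2-stage Radau IIA method (the values of \(r\) are those from the figure caption; the helper name is illustrative):

```python
import numpy as np
import matplotlib.pyplot as plt

A = np.array([[5/12, -1/12], [3/4, 1/4]])   # 2-stage Radau IIA
b = np.array([3/4, 1/4]); one = np.ones(2)
Ainv = np.linalg.inv(A)

def gamma_r(r, M=2000):
    """Points of the level set |R(z)| = r: eigenvalues of Delta(zeta), |zeta| = 1/r."""
    pts = []
    for t in np.linspace(0, 2*np.pi, M, endpoint=False):
        zeta = np.exp(1j*t)/r
        pts.extend(np.linalg.eigvals(Ainv - zeta*np.outer(Ainv @ one, b @ Ainv)))
    return np.array(pts)

for r, col in [(1.0, 'b'), (5 + 3*np.sqrt(3), 'g'), (3*np.sqrt(3) - 5, 'r')]:
    z = gamma_r(r)
    plt.plot(z.real, z.imag, '.', markersize=1, color=col)
plt.gca().set_aspect('equal')
plt.show()
```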

Since the domain \(\varUpsilon _{1}\) is connected, by Remark 3.2, \(\varUpsilon _{r}\) is also connected (not necessarily simply) for sufficiently small \(|r-1|\). Let \(\delta _{*}\) be such that

$$\begin{aligned} \text {the domain } \varUpsilon _r \text { is bounded and connected } \text { for } |r-1| < \delta _*. \end{aligned}$$
(3.4)

From now on we will make use of \(\gamma \) chosen as the positively oriented boundary of a domain \(\varUpsilon _{r}\), i.e. \(\gamma =\gamma _{r}\), with \(|r-1| < \delta _*\).

Remark 3.3

Note that the total length of \(\gamma _r\) is bounded, see [2, Lemma 3], by

$$\begin{aligned} |\gamma _{r}|\le 4 m d(\gamma _{r}), \end{aligned}$$

where \(d(\gamma _{r})\) is the diameter of \(\gamma _r\).

From (3.1) it follows that the Euclidean norm of \(\mathrm{w}_{n}(d)\) is bounded as

$$\begin{aligned} \Vert \mathrm{w}_{n}(d)\Vert&\le \frac{1}{2\pi } \left\| \int \limits _{\gamma _{r}}R(z)^{n-1}\mathrm{e}^{-zd} (I-Az)^{-1}1\!\!1b^{T}(I-Az)^{-1} dz\right\| \\&\le \frac{1}{2\pi } |\gamma _{r}|r^{n-1}\max \limits _{z \in \gamma _{r}}\Vert \mathrm{e}^{-zd}(I-Az)^{-1}1\!\!1b^{T}(I-Az)^{-1}\Vert . \end{aligned}$$

Setting \(U(z)=(I-Az)^{-1}\), we can deduce the bound

$$\begin{aligned} \max _{z\in \gamma _{r}}\Vert (I-Az)^{-1}1\!\!1b^{T}(I-Az)^{-1}\Vert&\le \max _{z\in \gamma _{r}}\Vert (I-Az)^{-1}\Vert ^2 \Vert 1\!\!1b^{T}\Vert \\&\le \max _{z\in \gamma _{r}}\Vert U(z)\Vert ^{2} \Vert b\Vert \sqrt{m}, \end{aligned}$$

which implies that

$$\begin{aligned} \Vert \mathrm{w}_{n}(d)\Vert \le \frac{1}{2\pi } r^{n-1} |\gamma _{r}| \Vert b\Vert \sqrt{m}\max \limits _{z \in \gamma _{r}}|\mathrm{e}^{-zd}| \max _{z\in \gamma _{r}}\Vert U(z)\Vert ^{2}. \end{aligned}$$
(3.5)

To understand the behaviour of a scaled convolution weight \(\mathrm{w}_{n}(d)\) we need to find a bound on \(\max \nolimits _{z \in \gamma _{r}}|\mathrm{e}^{-zd}|\). To do so, we use the fact that the stability function \(R(z)\) approximates \(\mathrm{e}^{z}\), see (2.7), and thus \(\max \nolimits _{z \in \gamma _{r}}|\mathrm{e}^{-zd}|\) can be expressed via the value of \(|R(z)|\) on \(\gamma _{r}\).

For a Runge–Kutta method of order \(p\) we can write

$$\begin{aligned} R(z)=\mathrm{e}^{z}+f(z), \end{aligned}$$

where \(f(z)=O(z^{p+1})\).

Let us consider \(z \in \gamma _{r}\) and \(d\in {\mathbb {R}}_{> 0}\). Multiplying the last equation by \(\mathrm{e}^{-z}R(z)^{-1}\), we get the following identity:

$$\begin{aligned} \mathrm{e}^{-z}=R(z)^{-1}\left( 1+f(z)\mathrm{e}^{-z}\right) . \end{aligned}$$
(3.6)

Clearly,

$$\begin{aligned} \max \limits _{z\in \gamma _{r}}{|\mathrm{e}^{-z}|}=\mathrm{e}^{{-x_{0}}}, \end{aligned}$$

where \(x_{0}=\min \limits _{z\in \gamma _{r}}\mathrm{Re }{z}\). Let \(z_{0} =x_{0}+iy_{0} \in \gamma _r\) be a point such that

$$\begin{aligned} \mathrm{Re }{z'}\ge \mathrm{Re }{z_{0}},\quad \text {for all }z'\in \gamma _{r}. \end{aligned}$$
(3.7)

Note that such a point is not necessarily unique. Taking the modulus and raising both sides of (3.6) to the \(d\)th power, we obtain

$$\begin{aligned} \max _{z\in \gamma _{r}}|\mathrm{e}^{-zd}|=|\mathrm{e}^{-z_{0}}|^d =|R(z_{0})^{-1}\left( 1+f(z_{0})\mathrm{e}^{-z_{0}}\right) |^d. \end{aligned}$$

Hence

$$\begin{aligned} \max _{z\in \gamma _{r}}|\mathrm{e}^{-zd}|=r^{-d}{\left| 1+f(z_{0})\mathrm{e}^{-z_{0}}\right| ^d}, \end{aligned}$$
(3.8)

where \(z_{0} =x_{0}+iy_{0}\) is a point satisfying (3.7).

Remark 3.4

The points \(z_{0} \in \gamma _r\) with the smallest real part can alternatively be characterized by the following two properties:

  • For all \(z\) such that \(\mathrm{Re }{z}<\mathrm{Re }{z_{0}}\), it holds that

    $$\begin{aligned} |R(z_{0})|>|R(z)|. \end{aligned}$$
    (3.9)

    This is due to the analyticity of \(R(z)\), as well as the definition of \(\gamma _{r}\) and points \(z_{0}\).

  • Let us fix \(x_{0}=\mathrm{Re }{z_{0}}\). Note that \(|R(x_{0}+iy)|\) is continuous and tends to \(0\) as \(y\rightarrow \infty \). Then \(y_{0}\), see (3.7), are the points at which \(|R(x_{0}+iy)|\) achieves its maximum. Let us show this. We assume the contrary, namely, that there exists \(y'\in {\mathbb {R}}\) s.t.

    $$\begin{aligned} |R(x_{0}+iy')|>|R(x_{0}+iy_{0})|. \end{aligned}$$

    Since \(R(z)\) is analytic in \(z'=x_{0}+iy'\), there exists an \(\epsilon \)-neighborhood of \(z'\) s.t.

    $$\begin{aligned} |R(z)|>|R(x_{0}+iy_{0})|,\quad |z-z'|\le \epsilon . \end{aligned}$$

    Taking \(z=z'-\epsilon =(x_{0}-\epsilon )+iy'\), we arrive at a contradiction to (3.9).

In order to bound this product we need to understand how \(x_{0}\) and \(x_{0}+iy_{0}\) behave. This question has been studied in [19], which examines the behaviour of \(R(z)\) in the order star [26]. Namely, when \(r\rightarrow 1\), all such points \(z_{0}\) satisfy \(|z_{0}|\le C|r-1|^{a}\) for some \(C,a>0\). This and the fact that \(f(z)=O(z^{p+1})\) will allow us to obtain bounds on the scaled convolution weights. We will employ the following results from [19].

Definition 3.1

[19] Given a rational function \(R(z)\) we define the error growth function as the real-valued function \(\phi (x):=\sup \nolimits _{\mathrm{Re }{z}<x}|R(z)|\).

Theorem 3.1

(Theorem 7 in [19]) Let \(R(z)=\frac{P(z)}{Q(z)}\) be an A-stable approximation to \(\mathrm{e}^{z}\) of exact order \(p\ge 1,\) i.e.\(,\)

$$\begin{aligned} R(z)=\frac{P(z)}{Q(z)}=\mathrm{e}^{z}+C_{p+1}z^{p+1}+O(z^{p+2}), \quad \text {for }z\rightarrow 0,\; C_{p+1}\ne 0. \end{aligned}$$
(3.10)

Furthermore\(,\) assume \(|R(iy)|<1\) for \(y\ne 0\) and \(|R(\infty )|<1.\) Then we have for \(x\rightarrow 0\):

  • if \(p\) is odd\(,\)

    $$\begin{aligned} \phi (x)=\mathrm{e}^{x}+O(x^{p+1}). \end{aligned}$$
  • if \(p\) is even and \((-1)^{p/2}C_{p+1}x<0,\)

    $$\begin{aligned} \phi (x)=\mathrm{e}^{x}+O(x^{p+1}). \end{aligned}$$
  • if \(p\) is even and \((-1)^{p/2}C_{p+1}x>0,\)

    $$\begin{aligned} \phi (x)=\mathrm{e}^{x}+O(x^{1+p/(2s-p)}), \end{aligned}$$

    where \(s\) is defined by

    $$\begin{aligned} E(y)=|Q(iy)|^2-|P(iy)|^2=e_{0}y^{2s}+O(y^{2s+2}),\; e_{0}>0. \end{aligned}$$
    (3.11)

Remark 3.5

[19] For \(x<\mathrm{Re }\lambda _{min}\), with \(\lambda _{min}\) being an eigenvalue of \(A^{-1}\) with the smallest real part, \(\phi (x)\) is a strictly monotonically increasing continuous function.

The following proposition shows that for \(r\rightarrow 1\), \(x_{0}=\min \nolimits _{z\in \gamma _{r}}\mathrm{Re }{z}\) is close to \(r-1\).

Proposition 3.1

Let \(R(z)\) be the stability function of the Runge–Kutta method satisfying Assumption 2.1, let (3.10) hold and let \(x_{0}=\min \nolimits _{z\in \gamma _{r}}\mathrm{Re }{z}.\) Then for \(r\rightarrow 1\):

$$\begin{aligned} x_0=r-1+o(|r-1|). \end{aligned}$$

Proof

By definition, \(|R(z)|=r\) on \(\gamma _{r}\). Since the error growth function \(\phi (x)\) is a strictly monotonically increasing continuous function, see Remark 3.5, \(\phi (x_{0})=r\). The statement of the proposition follows from the application of the implicit function theorem to the 3 cases of Theorem 3.1 and the fact that \(\phi (0)=1\), \(\frac{d\phi }{dx}(0)=1\). \(\square \)

The next proposition shows that when \(r\approx 1\), the point \(z_{0}\) defined by (3.7) lies in a small circle centered at the origin.

Proposition 3.2

Let \(R(z)\) be the stability function of the Runge–Kutta method satisfying Assumption 2.1 and (3.10). Then there exist \(\delta _{0}>0\) and \(K>0,\) s.t.\(,\) for all \(r,\) with \(|r-1|<\delta _{0},\) the point \(z_{0}\in \gamma _{r}\) defined by (3.7) lies inside one of the circles specified below:

  1. 1.

    for \(p\) odd:

    $$\begin{aligned} |z_{0}|\le K |r-1|. \end{aligned}$$
  2. 2.

    for \(p\) even:

    1. (a)

      if \(r>1\) and \((-1)^\frac{p}{2}C_{p+1}<0\) or \(r<1\) and \((-1)^\frac{p}{2}C_{p+1}>0,\)

      $$\begin{aligned} |z_{0}|\le K|r-1|. \end{aligned}$$
    2. (b)

      if \(r>1\) and \((-1)^\frac{p}{2}C_{p+1}>0\) or \(r<1\) and \((-1)^\frac{p}{2}C_{p+1}<0,\)

      $$\begin{aligned} |z_{0}|\le K|r-1|^{\frac{1}{2s-p}}, \end{aligned}$$

      where \(s\) is defined by (3.11).

Proof

The proof of this statement follows closely the proof of Theorem 7 in [19]. Recall that \(z_{0}=x_0+iy_0\) is a (not necessarily unique) point at which \(|R(x_0+iy)|\), \(y\in {\mathbb {R}}\), achieves its maximum, see Remark 3.4. As argued in [19], for \(x_0\rightarrow 0\) the maximum of \(|R(x_0+iy)|\), \(y\in {\mathbb {R}}\), lies inside the order star close to the origin. We consider the following cases for \(r\rightarrow 1\) (and, consequently, \(x_0\rightarrow 0\), see Proposition 3.1):

  1. 1.

    \(p\) is odd. As shown in the proof of Theorem 7 in [19], the local extrema of \(|R(x_0+iy)|\) lie asymptotically (as \(x_0\rightarrow 0\)) on the lines \(y=x_0 \tan {(k\pi /p)}\), \(k=0,1,\ldots ,p-1\). Since \(|R(z)|\) achieves an extremum at \(z_{0}\), it follows that \(|z_{0}| \le C|x_{0}|\), where \(C>0\) and depends on the Runge–Kutta method. Proposition 3.1 gives an expression for \(x_{0}\) (using \(\phi (x_0)=r\)):

    $$\begin{aligned} x_{0}=r-1+o(|r-1|). \end{aligned}$$

    Hence,

    $$\begin{aligned} |z_{0}| \le K|r-1|, \end{aligned}$$

    for some \(K>0\).

  2. 2.

    \(p\) is even.

    1. (a)

      As proved in Theorem 7 in [19], for \((-1)^{p/2}C_{p+1}x_{0}<0\) and \(x_0 \rightarrow 0\), the point \(z_0\) satisfies \(|z_{0}|\le C|x_0|,\; C>0\). The statement of the proposition is then obtained with similar arguments as in the previous case and the fact that \({\mathop {{\mathrm {sgn}}}}{x_{0}}={\mathop {{\mathrm {sgn}}}}{(r-1)}\).

    2. (b)

      For the last case, namely \((-1)^{p/2}C_{p+1}x_0>0\), in the proof of Theorem 7 in [19] it was shown that the local extrema of \(|R(x_0+iy)|,\; y\in {\mathbb {R}}\), for \(x_0\rightarrow 0\), are achieved near the imaginary axis and lie on a curve \(y^{2s-p}=Dx_0\), for some \(D\in {\mathbb {R}}\) and with \(s\) defined in (3.11). Then there exists \(C>0\), such that for all sufficiently small \(x_{0}\),

      $$\begin{aligned} |z_{0}|\le C\left( |x_0|+|x_0|^{1/(2s-p)}\right) =C|x_{0}|^{1/(2s-p)}\left( \left| x_{0}\right| ^{1-1/(2s-p)}+1\right) . \end{aligned}$$

      According to Proposition 3.4 in [20], \(2s\ge p+1\); therefore, for even \(p\), \(2s\ge p+2\). This implies that \(|x_{0}|^{1-1/(2s-p)}=o(1)\), hence

      $$\begin{aligned} |z_{0}|\le K|r-1|^{\frac{1}{2s-p}}, \end{aligned}$$

      for some \(K>0\). \(\square \)

Now we have all the estimates necessary to prove the next proposition on the decay of the scaled convolution weights.

Proposition 3.3

Let \(R(z)\) be the stability function of an \(m\)-stage Runge–Kutta method of order \(p\) satisfying Assumption 2.1 and (3.10).

Let \(s\) be defined by (3.11). Then there exist positive constants \({G,\;G',\;C,\; C'}\) and \({\bar{\delta }}\in (0,\; 1),\) such that for \(n\ge 1\) and \(0<\delta <{\bar{\delta }}\) the following estimates hold:

  1. 1.

    \(p\) is odd

    $$\begin{aligned} \begin{aligned} \Vert \mathrm{w}_{n}(d)\Vert&\le G(1-\delta )^{n-d}(1+C\delta ^{p+1})^d \quad \text {for } d\le n,\\ \Vert \mathrm{w}_{n}(d)\Vert&\le G'(1+\delta )^{n-d}(1+C'\delta ^{p+1})^d \quad \text {for } d>n; \end{aligned} \end{aligned}$$
    (3.12)
  2. 2.

    \(p\) is even

    1. (a)

      \(C_{p+1}(-1)^\frac{p}{2}>0\)

      $$\begin{aligned} \begin{aligned} \Vert \mathrm{w}_{n}(d)\Vert \le&G(1-\delta )^{n-d}(1+C\delta ^{p+1})^d\quad \text {for } d\le n,\\ \Vert \mathrm{w}_{n}(d)\Vert \le&G'(1+\delta )^{n-d}(1+C'\delta ^{\frac{p+1}{2s-p}})^d\quad \text {for } d>n; \end{aligned} \end{aligned}$$
      (3.13)
    2. (b)

      \(C_{p+1}(-1)^\frac{p}{2}<0\)

      $$\begin{aligned} \begin{aligned} \Vert \mathrm{w}_{n}(d)\Vert&\le G(1-\delta )^{n-d}(1+C\delta ^{\frac{p+1}{2s-p}})^d\quad \text {for } d\le n,\\ \Vert \mathrm{w}_{n}(d)\Vert&\le G'(1+\delta )^{n-d}(1+C'\delta ^{p+1})^d\quad \text {for } d>n. \end{aligned} \end{aligned}$$
      (3.14)

The scaled convolution weight \(\mathrm{w}_{0}(d)\) satisfies:

$$\begin{aligned} \Vert \mathrm{w}_{0}(d)\Vert \le \exp (-\mu d), \end{aligned}$$
(3.15)

for some \(\mu >0\).

Constants \(G,G',C,C',{\bar{\delta }},\mu \) depend only on the Runge–Kutta method and are independent of \(n\) and \(d\).

Proof

Let us start with \(\mathrm{w}_{0}(d)\). From the definition of the scaled convolution weights

$$\begin{aligned}&\exp \left( -\varDelta (\zeta )d\right) =\sum \limits _{n=0}^{\infty }\mathrm{w}_{n}\left( d\right) \zeta ^{n},\\&\varDelta (\zeta )=A^{-1}-\zeta A^{-1}1\!\!1b^{T}A^{-1}, \end{aligned}$$

it follows that \(\mathrm{w}_{0}(d)=\exp (-A^{-1}d)\). All the eigenvalues of \(A\) lie to the right of the imaginary axis (due to the A-stability of the Runge–Kutta method and the invertibility of \(A\)) and hence the same holds for the eigenvalues of \(A^{-1}\). The bound on \(\mathrm{w}_{0}(d)\) can then be obtained from the definition of the matrix exponential.

For the general case \(\mathrm{w}_{n}(d)\), \(n\ge 1\), we use the bounds derived before, inserting (3.8) into (3.5):

$$\begin{aligned} \Vert \mathrm{w}_{n}(d)\Vert&\le \frac{1}{2\pi } r^{n-1} \Vert b\Vert \sqrt{m}|\gamma _{r}|\max \limits _{z \in \gamma _{r}}|\mathrm{e}^{-zd}| \max _{z\in \gamma _{r}}\Vert U(z)\Vert ^{2}\nonumber \\&= \frac{1}{2\pi } r^{n-d-1}|\gamma _{r}|\max _{z\in \gamma _{r}}\Vert U(z)\Vert ^{2} \Vert b\Vert \sqrt{m} {|1+f(z_{0})\mathrm{e}^{-z_{0}}|^d}, \end{aligned}$$
(3.16)

where \(f(z)=R(z)-\mathrm{e}^{z}\) and \(z_{0}\) is such that \(\mathrm{Re }{z'}\ge \mathrm{Re }{z_{0}}\) for all \(z'\in \gamma _{r}\).

Let us first derive a bound for \(|1+f(z_{0})\mathrm{e}^{-z_{0}}|\). For \(|z|<\frac{1}{\lambda _{0}}\), where \(\lambda _{0}\) is the spectral radius of \(A\), we can expand \(R(z)=1+zb^{T}(I-Az)^{-1}1\!\!1\) with the help of the Neumann series to obtain an explicit expression for \(f(z)\):

$$\begin{aligned} f(z)=R(z)-\mathrm{e}^{z}=z\sum _{l=p}^{\infty }b^TA^l1\!\!1z^{l}-\sum _{l=p+1}^{\infty }\frac{z^{l}}{l!}. \end{aligned}$$

For \(|z|<\frac{1}{\Vert A\Vert }\), we can trivially bound

$$\begin{aligned} |1+f(z)\mathrm{e}^{-z}|^d \le \left( 1+C|z|^{p+1}\right) ^{d}, \end{aligned}$$
(3.17)

where \(C\) depends on the Runge–Kutta method, but not on \(z\) or \(d\).

Now let \(n>d\). We choose \(r<1\), \(r=1-\delta \), \(0<\delta <\min \left\{ \delta _{*}, \; \frac{1}{\Vert A\Vert },\; 1\right\} \). Here \(\delta _{*}\) is the constant from (3.4).

Then the bound (3.16), using (3.17), can be written as

$$\begin{aligned} \Vert \mathrm{w}_{n}(d)\Vert \le \frac{1}{2\pi } (1-\delta )^{n-d-1}|\gamma _{1-\delta }|\max _{z\in \gamma _{1-\delta }}\Vert U(z)\Vert ^{2} \Vert b\Vert \sqrt{m} {\left( 1+C|z_{0}|^{p+1}\right) ^d}, \end{aligned}$$

where \(z_{0}\) is such that \(\mathrm{Re }{z_{0}}\le \mathrm{Re }{z}\) for all \(z\in \gamma _{1-\delta }\).

The total length of \(\gamma _{1-\delta }\) as well as \(\max _{z\in \gamma _{1-\delta }}\Vert U(z)\Vert \) can be bounded by constants that depend on the Runge–Kutta method, see also Lemma 3.2 and Remarks 3.2 and 3.3. Applying Proposition 3.2 to estimate \((1+C|z_{0}|^{p+1})^d\), we obtain the required expressions for the case \(n>d\).

The bound for \(n<d\) can be obtained similarly by setting \(r=1+\delta \), with \(0<\delta <\min \left\{ \delta _{*}, \; \frac{1}{\Vert A\Vert },\; 1\right\} \). \(\square \)

Remark 3.6

Note that, for even \(p\), the above bounds imply exponential decay of the scaled convolution weights whenever \(2s-p<p+1\). However, since \(2s\le 2m\) (\(m\) being the number of stages and the degree of the denominator in \(R(z)=\frac{P(z)}{Q(z)}\)), for exponential decay it suffices that \(p\ge m\).

Remark 3.7

From the proof it can be seen that the increase of the widths of the (approximate) supports of the convolution weights is due to the term \(|1+f(z_{0})\mathrm{e}^{-z_{0}}|\), which we bounded by a constant greater than \(1\). In numerical experiments we have not observed a case where this term is noticeably smaller than \(1\). Note that if this term were smaller than \(1\), the convolution weights \(\mathrm{w}_{n}(d)\) would decay exponentially with increasing \(n\approx d\).

We have shown that scaled convolution weights \(\mathrm{w}_{n}(d)\) exhibit exponential decay outside of a neighborhood of \(n\approx d\), which is an expression of the strong Huygens principle. Additionally, the above estimates suggest that the size of the approximate support of a convolution weight \(\mathrm{w}_{n}(d)\) increases with \(n\). Let us examine this in more detail.

We consider Runge–Kutta methods satisfying the following assumption.

Assumption 3.1

Let the order of the Runge–Kutta method be \(p\ge 1\). If \(p\) is even, then let \(2s-p<p+1\), where \(s\) is as in (3.11), see also Remark 3.6.

In particular, this assumption holds true for the Radau IIA and Lobatto IIIC methods. For \(n \ge 1\), \(\Vert \mathrm{w}_{n}(d)\Vert \) is bounded by \(\epsilon >0\) outside the interval \(\left[ d_{1}^{(n,\epsilon )},\; d_{2}^{(n,\epsilon )}\right] \), where

$$\begin{aligned} d_{1}^{(n,\epsilon )}&= \sup \Big \{d\ge 0: \; \Vert \mathrm{w}_{n}(d')\Vert <\epsilon , \;\text { for all } 0\le d'<d\Big \},\\ d_{2}^{(n,\epsilon )}&= \inf \Big \{d\ge 0: \; \Vert \mathrm{w}_{n}(d')\Vert <\epsilon , \; \text { for all }d'>d\Big \}. \end{aligned}$$

The set

$$\begin{aligned} \Big \{d: \; \Vert \mathrm{w}_{n}(d')\Vert <\epsilon , \; 0\le d'<d\Big \} \end{aligned}$$

is non-empty for \(n\ge 1\), since \(\mathrm{w}_{n}(0)=0\) (as we show in the proof of Proposition 3.4) and \(\mathrm{w}_{n}(d)\) is smooth as a function of \(d\ge 0\). The set

$$\begin{aligned} \Big \{d: \; \Vert \mathrm{w}_{n}(d')\Vert <\epsilon , \; d'>d\Big \} \end{aligned}$$

is non-empty for all \(n\ge 0\) by Proposition 3.3 for the Runge–Kutta methods under consideration. Hence, the values \(d_{1}^{(n,\epsilon )}\), \(d_{2}^{(n,\epsilon )}\) are defined for all \(n\ge 1\).

To find estimates on \(d_{1}^{(n,\epsilon )}\) and \(d_{2}^{(n,\epsilon )},\; n>0\), we make use of the bounds of Proposition 3.3 that can be written in a more general form:

$$\begin{aligned} \Vert \mathrm{w}_{n}(d)\Vert&\le G(1-\delta )^{n-d}(1+C\delta ^{\alpha })^{d} \quad \text {for all } d\le n,\\ \Vert \mathrm{w}_{n}(d)\Vert&\le G'(1+\delta )^{n-d}(1+C'\delta ^{\alpha '})^{d} \quad \text {for all } d> n, \end{aligned}$$

for all \(0<\delta <{\bar{\delta }}<1\) and constants \(C,C',G,G',\alpha ,\alpha ',{\bar{\delta }}>0\) depending only on the Runge–Kutta method. We can estimate \(d_{1}^{(n,\epsilon )}\) and \(d_{2}^{(n,\epsilon )}\) from

$$\begin{aligned}&G(1-\delta )^{n-d}(1+C\delta ^{\alpha })^{d}\le \epsilon ,\end{aligned}$$
(3.18)
$$\begin{aligned}&G'(1+\delta )^{n-d}(1+C'\delta ^{\alpha '})^{d}\le \epsilon . \end{aligned}$$
(3.19)

The first inequality is satisfied if

$$\begin{aligned} d&\le n\log \frac{1}{1-\delta }\left( \log \frac{1}{1-\delta }+ \log (1+C\delta ^{\alpha })\right) ^{-1}\\&-\log \frac{G}{\epsilon }\left( \log \frac{1}{1-\delta }+\log (1+C\delta ^{\alpha })\right) ^{-1}\\&= n-\left( \log \frac{1}{1-\delta }+\log (1+C\delta ^{\alpha })\right) ^{-1} \left( n\log \left( 1+C\delta ^{\alpha }\right) +\log \frac{G}{\epsilon }\right) . \end{aligned}$$

Hence, for small enough \(\delta \), there exist constants \(c_{1},c_2>0\) s.t. (recall that \(\alpha >1\))

$$\begin{aligned} d\le n-\frac{c_{1}n\delta ^{\alpha }+\log \frac{G}{\epsilon }}{\delta (1-c_2\delta )} \end{aligned}$$

implies (3.18). This in turn implies that

$$\begin{aligned} d_{1}^{(n,\epsilon )}\ge n-\frac{c_{1}n\delta ^{\alpha -1}+ \delta ^{-1}\log \frac{G}{\epsilon }}{1-c_2\delta }. \end{aligned}$$

Let us choose \(\delta \) to minimize the expression \(c_{1}n\delta ^{\alpha -1}+\delta ^{-1}\log \frac{G}{\epsilon }\) as \(n\rightarrow \infty \). In particular, setting

$$\begin{aligned} \delta =cn^{-\frac{1}{\alpha }}\log ^{\frac{1}{\alpha }}\frac{G}{\epsilon }, \end{aligned}$$
(3.20)

with \(c\) being a small constant independent of \(n,\epsilon \) (chosen so that \(\delta <{\bar{\delta }}\), where \({\bar{\delta }}\) is defined in Proposition 3.3), ensures that

$$\begin{aligned} d_{1}^{(n,\epsilon )}\ge n-C_{1}n^{\frac{1}{\alpha }}\log ^{1-\frac{1}{\alpha }}\frac{G}{\epsilon }, \end{aligned}$$
(3.21)

for some \(C_{1}\) depending on the Runge–Kutta method. Similarly,

$$\begin{aligned} d_{2}^{(n,\epsilon )}\le n+C_{1}'n^{\frac{1}{\alpha '}}\log ^{1-\frac{1}{\alpha '}}\frac{G'}{\epsilon }, \end{aligned}$$
(3.22)

for \(C_{1}'>0\) that depends on the Runge–Kutta method. To check this estimate, we plot the dependence of numerically determined values \(\varDelta _{1}^{n,\epsilon }=n-d_{1}^{n,\epsilon }\) and \(\varDelta _{2}^{n,\epsilon }=d_{2}^{n,\epsilon }-n\) on \(n\) in Fig. 3 for different Runge–Kutta methods.

Fig. 3 Dependence of \(\varDelta _{1}^{n,10^{-4}},\; \varDelta _{2}^{n, 10^{-4}}\) on \(n\) for different Runge–Kutta methods

Our estimates predict that for methods with odd order \(p\), namely BDF1 (\(p=1\)), 2-stage Radau IIA (\(p=3\)) and 3-stage Radau IIA (\(p=5\)), \(\varDelta _{1}^{n,\epsilon }, \; \varDelta _{2}^{n,\epsilon }\) increase as \(O\left( n^{\frac{1}{p+1}}\right) \). This is in close agreement with the numerical results shown in Fig. 3.

For Runge–Kutta methods of even order, the obtained estimates predict that the width of a convolution weight grows with \(n\) and \(d\) in a non-symmetric manner: \(\varDelta _{1}^{n,\epsilon }\) can grow with increasing \(n\) faster than \(\varDelta _{2}^{n,\epsilon }\), or vice versa, and for larger \(n\) and \(d\) the asymmetry becomes more visible. This can be illustrated by the Lobatto IIIC method of 6th order. Numerical experiments indicate that, with increasing \(n\), the part of the approximate support of the convolution weight \(\mathrm{w}_{n}(d)\) of the Lobatto IIIC method corresponding to \(d<n\) increases more slowly than the part corresponding to \(d>n\). This effect can be explained by the estimates (3.13) as follows. The stability function of the 4-stage Lobatto IIIC method is the \((2, 4)\)-Padé approximation to \(\mathrm{e}^{z}\). For such approximants the sign of \(C_{p+1}\) is negative (see, for example, [20, Theorem 3.11]); then the sign of \(C_{p+1}(-1)^{\frac{p}{2}}\) is positive. According to the estimates (3.13), \(\varDelta _{1}^{n,\epsilon }=O\left( n^{\frac{1}{p+1}}\right) \), with \(p=6\), while \(\varDelta _{2}^{n,\epsilon }=O\left( n^{\frac{2s-p}{p+1}}\right) \), with \(s=4\) for Lobatto IIIC (this value can be obtained by examining \(E(y)=O(y^{2s})\) for small \(y\in {\mathbb {R}}\)). Again, the results in Fig. 3 are in close agreement with these estimates.
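The values \(\varDelta _{1}^{n,\epsilon }\), \(\varDelta _{2}^{n,\epsilon }\) can be determined numerically by scanning \(\Vert \mathrm{w}_{n}(d)\Vert \) over a grid in \(d\); a sketch for the 2-stage Radau IIA method (the threshold, grid and helper name are illustrative):

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[5/12, -1/12], [3/4, 1/4]])   # 2-stage Radau IIA (p = 3)
b = np.array([3/4, 1/4]); one = np.ones(2)
Ainv = np.linalg.inv(A)

def wnorm(n, d, L=256):
    """||w_n(d)|| via the expansion (2.10) and the trapezoidal rule on |zeta| = 1."""
    zeta = np.exp(2j*np.pi*np.arange(L)/L)
    K = np.stack([expm(-(Ainv - z*np.outer(Ainv @ one, b @ Ainv))*d) for z in zeta])
    wn = (np.exp(-2j*np.pi*np.arange(L)*n/L)[:, None, None]*K).mean(axis=0)
    return np.linalg.norm(wn)

eps, n = 1e-4, 40
dgrid = np.linspace(0.0, 2*n, 201)
nrm = np.array([wnorm(n, d) for d in dgrid])
inside = np.where(nrm >= eps)[0]           # indices of the approximate support
d1, d2 = dgrid[inside[0]], dgrid[inside[-1]]
print(n - d1, d2 - n)                       # Delta_1^{n,eps}, Delta_2^{n,eps}
```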

Remark 3.8

The choice of \(\delta \) provided by (3.20) is valid only if

$$\begin{aligned} \left( \frac{\log \frac{G}{\epsilon }}{n}\right) ^{\frac{1}{\alpha }} \end{aligned}$$

can be bounded by a constant (otherwise it is impossible to choose \(c>0\) independent of \(n,\epsilon >0\) in (3.20)). Hence, the suggested dependence of \(\varDelta _{1}^{n,\epsilon },\; \varDelta _{2}^{n,\epsilon }\) on the accuracy \(\epsilon >0\) is not necessarily valid in the regime when \(n\) is fixed and \(\epsilon \rightarrow 0\). In this case an optimal choice of \(\delta \) is \(\delta =c>0\), and \(\varDelta _{1}^{n,\epsilon },\varDelta _{2}^{n,\epsilon }\) increase as \(O\left( \log \frac{1}{\epsilon }\right) \) rather than \(O\left( \log ^{1-\frac{1}{\alpha }}\frac{1}{\epsilon }\right) \), cf. (3.21) and (3.22).

This can be seen in Fig. 4, where we plot \(\varDelta ^{10,\epsilon }_{2}\) and \(\varDelta ^{10^{3},\epsilon }_{2}\) for BDF1 convolution weights for a range of \(\epsilon >0\).

Fig. 4 Dependence of \(\varDelta ^{n,\epsilon }_{2}\) for the BDF1 method on \(\epsilon \). The left plot shows the dependence of \(\varDelta ^{10,\epsilon }_{2}\) on \(\log \frac{1}{\epsilon }\); the right plot shows the dependence of \(\varDelta ^{10^{3},\epsilon }_{2}\) on \(\log ^{\frac{1}{2}}\frac{1}{\epsilon }\). The rate of growth of \(\varDelta ^{10^{3},\epsilon }_{2}\) is closer to \(O(\log ^{\frac{1}{2}}\frac{1}{\epsilon })\) than to \(O(\log \frac{1}{\epsilon })\)

The next proposition is a corollary of Proposition 3.3 and shows that the (unscaled) convolution weights \(w_{n}^{h}(d)\) also exhibit exponential decay away from \(\frac{d}{h}\approx n\).

Proposition 3.4

Let \(R(z)\) be the stability function of an \(m\)-stage Runge–Kutta method of order \(p\) satisfying Assumption 2.1 and (3.10).

Let \(s\) be defined by (3.11). Then there exist positive constants \({G,\;G',\;C,\; C'}\) and \({\bar{\delta }}\in (0,\; 1)\), such that for \(n\ge 1\) and \(0<\delta <{\bar{\delta }}\) the following estimates hold:

  1. 1.

    \(p\) is odd

    $$\begin{aligned} \begin{aligned} \Vert w^{h}_{n}(d)\Vert&\le \frac{G}{h}(1-\delta )^{n-\frac{d}{h}}(1+C\delta ^{p+1})^\frac{d}{h} \quad \text {for }\frac{d}{h}\le n,\\ \Vert w^{h}_{n}(d)\Vert&\le \frac{G'}{d}(1+\delta )^{n-\frac{d}{h}}(1+C'\delta ^{p+1})^\frac{d}{h} \quad \text {for }\frac{d}{h}>n; \end{aligned} \end{aligned}$$
    (3.23)
  2. 2.

    \(p\) is even

    1. (a)

      \(C_{p+1}(-1)^\frac{p}{2}>0\)

      $$\begin{aligned} \begin{aligned} \Vert w^{h}_{n}(d)\Vert&\le \frac{G}{h}(1-\delta )^{n-\frac{d}{h}}(1+C\delta ^{p+1})^\frac{d}{h} \quad \text {for }\frac{d}{h}\le n,\\ \Vert w^{h}_{n}(d)\Vert&\le \frac{G'}{d}(1+\delta )^{n-\frac{d}{h}} (1+C'\delta ^{\frac{p+1}{2s-p}})^\frac{d}{h}\quad \text {for }\frac{d}{h}>n; \end{aligned} \end{aligned}$$
      (3.24)
    2. (b)

      \(C_{p+1}(-1)^\frac{p}{2}<0\)

      $$\begin{aligned} \begin{aligned} \Vert w^{h}_{n}(d)\Vert&\le \frac{G}{h}(1-\delta )^{n-\frac{d}{h}}(1+C \delta ^{\frac{p+1}{2s-p}})^\frac{d}{h}\quad \text {for }\frac{d}{h}\le n,\\ \Vert w^{h}_{n}(d)\Vert&\le \frac{G'}{d}(1+\delta )^{n-\frac{d}{h}}(1+C' \delta ^{p+1})^\frac{d}{h}\quad \text {for }\frac{d}{h}>n. \end{aligned} \end{aligned}$$
      (3.25)

The convolution weight \(w^{h}_{0}(d)\) satisfies:

$$\begin{aligned} \Vert w^{h}_{0}(d)\Vert \le \frac{\exp (-\mu \frac{d}{h})}{4\pi d}, \end{aligned}$$

for some \(\mu >0.\)

Constants \(G,G',C,C',{\bar{\delta }},\mu \) depend only on the Runge–Kutta method and do not depend on \(n,d\) and \(h.\)

Proof

Let us again start with \(w_{0}^{h}(d)\). From the definition of the scaled convolution weights it follows that \(w_{0}^{h}(d)=\frac{\mathrm{w}_{0}(\frac{d}{h})}{4\pi d}\), and the required bound can be obtained from (3.15). Note, however, that the convolution weight \(w_{0}^{h}(d)\) has a singularity at \(d=0\).

Bounds for the case \(\frac{d}{h}>n\) can be obtained straightforwardly from the expressions (3.12)–(3.14) applied to \(w_{n}^{h}(d)=\frac{\mathrm{w}_{n}(\frac{d}{h})}{4\pi d}\).

The case \(\frac{d}{h}\le n\) has to be treated separately: we cannot directly apply Proposition 3.3 for bounding \(w_{n}^{h}(d)=\frac{\mathrm{w}_{n}(\frac{d}{h})}{4\pi d}\), since for small \(d\) this bound would be far from optimal. We will proceed as follows. First, we will show that a scaled convolution weight \(\mathrm{w}_{n}(d)\) has a zero at \(d=0\) of multiplicity at least \(n\). Next, this fact and ideas from the proof of Proposition 3.3 will be used to demonstrate that away from a neighbourhood of \(d\approx nh\) convolution weights \(w_{n}^{h}(d)\) decay exponentially.

Let us first expand the generating function of the scaled convolution weights \(\mathrm{e}^{-\varDelta (\zeta )d}\), see (2.10), into a Taylor series in \(\zeta \) and then into a series in \(d\):

$$\begin{aligned} \exp \left( -\varDelta (\zeta )d\right)&= \sum _{n=0}^{\infty }\mathrm{w}_{n}(d)\zeta ^n,\\ \exp \left( -\varDelta (\zeta )d\right)&= \sum _{n=0}^{\infty } \frac{\left( -\varDelta (\zeta )\right) ^{n}}{ n!}d^{n}. \end{aligned}$$

Matching the powers of \(\zeta \) we obtain the following expansion for \(\mathrm{w}_{n}(d)\), \(n\ge 0\):

$$\begin{aligned} \mathrm{w}_{n}(d)=\sum _{m=n}^{\infty }d^{m}f_{m}^{n}\left( A, b, h\right) , \end{aligned}$$

where \(f_{m}^{n}(A, b, h)\) are matrix-valued functions of \(A\), \(b\) and \(h\).

Therefore, \(\mathrm{w}_{n}(0)=0\), \(n\ge 1\). The above also implies that convolution weights \(w_{n}^{h}(d)\), \(n\ge 1\), have a zero at \(d=0\) of order at least \(n-1\).

Let us recall the representation of the scaled convolution weights (3.1):

$$\begin{aligned} \mathrm{w}_{n}(d)= \frac{1}{2\pi i} \oint \limits _{\gamma } R(z)^{n-1}(I-Az)^{-1}1\!\!1b^{T}(I-Az)^{-1} \mathrm{e}^{-zd} dz, \end{aligned}$$
(3.26)

where \(\gamma \) is a contour that encloses all the eigenvalues of \(A^{-1}\), \(n\ge 1\). From this it follows that

$$\begin{aligned} \mathrm{w}_{n}(0) = \frac{1}{2\pi i} \oint \limits _{\gamma } R(z)^{n-1}(I-Az)^{-1}1\!\!1b^{T}(I-Az)^{-1}dz=0. \end{aligned}$$
(3.27)

Now let us prove the bounds (3.23)–(3.25) for \(\frac{d}{h}<n\). Let \(d\ne 0\). We express \(\mathrm{e}^{-z\frac{d}{h}}\) as an integral over a parameter \(0\le \rho \le 1\):

$$\begin{aligned} \mathrm{e}^{-z\frac{d}{h}} = 1-\frac{zd}{h}\int _{0}^{1}\mathrm{e}^{-z\frac{d}{h}\rho }d\rho . \end{aligned}$$

The definition (3.26) can then be rewritten:

$$\begin{aligned} w_{n}^{h}(d)&= \frac{1}{8\pi ^{2} i d} \oint \limits _{\gamma } R(z)^{n-1}(I-Az)^{-1}1\!\!1b^{T}(I-Az)^{-1} \mathrm{e}^{-z\frac{d}{h}} dz\\&= \frac{1}{8\pi ^{2} i d} \oint \limits _{\gamma } R(z)^{n-1}(I-Az)^{-1}1\!\!1b^{T}(I-Az)^{-1}dz \\&-\frac{1}{8\pi ^{2} i h} \int \limits _{0}^{1} \oint \limits _{\gamma } R(z)^{n-1}(I-Az)^{-1}1\!\!1b^{T} (I-Az)^{-1}z \mathrm{e}^{-z\frac{d}{h}\rho } dz d\rho . \end{aligned}$$

The first term in the above sum equals \(0\), due to (3.27). The modulus of the second term, namely,

$$\begin{aligned} \frac{1}{8\pi ^{2} i h} \int \limits _{0}^{1} \oint \limits _{\gamma } R(z)^{n-1}(I-Az)^{-1}1\!\!1b^{T}(I-Az)^{-1}z \mathrm{e}^{-z\frac{d}{h}\rho } dz d\rho , \end{aligned}$$

can be estimated using the mean value theorem. We first bound the value of the integral

$$\begin{aligned} I(\rho , d)= \frac{1}{8\pi ^{2} i h} \oint \limits _{\gamma } R(z)^{n-1}(I-Az)^{-1}1\!\!1b^{T}(I-Az)^{-1}z \mathrm{e}^{-z\frac{d}{h}\rho } dz \end{aligned}$$

repeating the arguments of the proof of Proposition 3.3. Note that two changes have to be made. First, \(d\) has to be replaced with \(\frac{d}{h}\rho \). Further, instead of bounding \(\Vert (I-Az)^{-1}1\!\!1b^{T}(I-Az)^{-1}\Vert \) we now bound \(\Vert (I-Az)^{-1}1\!\!1b^{T}(I-Az)^{-1}z\Vert \), for \(z\) lying on \(\gamma =\gamma _{r}\), by a constant that does not depend on \(d\), \(h\) or \(n\), but does depend on the Runge–Kutta method.

It is not difficult to see that there exist positive constants \({G,\;C}\), \({\bar{\delta }}\in (0,\; 1)\), \(q>0\) such that for \(n\ge 1\) and \(0<\delta <{\bar{\delta }}\) the following estimate holds:

$$\begin{aligned} |I(\rho , d)|\le \frac{1}{h} G(1-\delta )^{n-\frac{d}{h}\rho }(1+C\delta ^q)^{\frac{d}{h}\rho }, \end{aligned}$$

for \(\frac{d}{h}\rho \le n\). In the above expression \(q\) is either \(p+1\) or \(\frac{p+1}{2s-p}\), as in Proposition 3.3. Clearly this estimate is valid for all \(d\) such that \(\frac{d}{h}<n\) and \(\rho \in [0,1]\).

Next, we bound \(\int \limits _{0}^{1} I(\rho , d)d\rho \) as:

$$\begin{aligned} \left| \int \limits _{0}^{1} I(\rho , d)d\rho \right|&\le \max \limits _{\rho \in \left[ 0,\; 1\right] }|I(\rho , d)| \\&\le \frac{1}{h} G (1-\delta )^{n} \max \limits _{\rho \in \left[ 0,\; 1\right] }\left( \frac{1+C\delta ^q}{1-\delta }\right) ^{\frac{d}{h}\rho }\\&\le \frac{1}{h} G(1-\delta )^{n-\frac{d}{h}}\left( 1+C\delta ^{q} \right) ^{\frac{d}{h}}. \end{aligned}$$

This finishes the proof of the statement. \(\square \)

4 Computation of convolution weights

Let us write

$$\begin{aligned} {\fancyscript{K}}_{d}(\zeta )=\frac{\exp \left( {-\varDelta (\zeta )\frac{d}{h}}\right) }{4\pi d}. \end{aligned}$$

The expansion (2.9) shows that \(w_n^h(d)\) is the \(n\)th Taylor coefficient of \({\fancyscript{K}}_{d}(\zeta )\). Therefore, the Cauchy integral formula gives another representation of \(w_n^h(d)\),

$$\begin{aligned} w_n^h(d) = \frac{1}{2\pi i}\oint _{{\fancyscript{C}}} {\fancyscript{K}}_{d}(\zeta ) \zeta ^{-n-1}d\zeta . \end{aligned}$$

Let us choose the contour \({\fancyscript{C}}\) to be the circle centered at the origin with radius \(\varrho < 1\). Discretizing this integral with the composite trapezoidal rule gives the approximation

$$\begin{aligned} w_n^h(d) = \frac{\varrho ^{-n}}{N+1} \sum _{j = 0}^N {\fancyscript{K}}_{d}\left( \varrho e^{ij \tfrac{2\pi }{N+1}}\right) e^{-ijn \tfrac{2\pi }{N+1}} + O(\varrho ^{N+1}),\quad n = 0,1,\dots , N. \end{aligned}$$
(4.1)

In practice, the parameter \(\varrho > 0\) cannot be chosen arbitrarily small in finite precision arithmetic. If \(\mathrm{{eps}}\) denotes the machine precision, the best accuracy that can be achieved is \(\sqrt{\mathrm{{eps}}}\), attained with the choice \(\varrho = \mathrm{{eps}}^{\tfrac{1}{2N}}\), see [23]. Using the FFT, \(w_n^h(d)\) can be computed in \(O(N \log N)\) time for all \(n = 0,1,\ldots ,N\). However, if \(d\) is bounded by \(Kh\), for a constant \(K>0\), it is possible to avoid the scaling parameter \(\varrho \), as described in the next proposition.
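A direct implementation of (4.1) takes only a few lines; a Python sketch with 2-stage Radau IIA data (the parameter values and the helper name are illustrative):

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[5/12, -1/12], [3/4, 1/4]])   # 2-stage Radau IIA
b = np.array([3/4, 1/4]); one = np.ones(2)
Ainv = np.linalg.inv(A)

def cq_weights(d, h, N):
    """w_n^h(d), n = 0..N, via the scaled trapezoidal rule (4.1) and the FFT."""
    rho = np.finfo(float).eps**(1/(2*N))     # rho = eps^(1/(2N)), see [23]
    zeta = rho*np.exp(2j*np.pi*np.arange(N + 1)/(N + 1))
    K = np.stack([expm(-(Ainv - z*np.outer(Ainv @ one, b @ Ainv))*d/h)/(4*np.pi*d)
                  for z in zeta])
    w = np.fft.fft(K, axis=0)/(N + 1)        # (1/(N+1)) sum_j K_j e^{-2 pi i j n/(N+1)}
    return (w/rho**np.arange(N + 1)[:, None, None]).real

W = cq_weights(d=1.0, h=0.05, N=64)
print([np.linalg.norm(W[n]) for n in (0, 20, 40)])   # concentration near n = d/h = 20
```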

Proposition 4.1

Let \(w_{n}^{h},\; n\ge 0,\) be the Runge–Kutta convolution weights (2.9), and let the Runge–Kutta method satisfy Assumption 2.1.

Let \(K>0\) be fixed. There exist \(\mu _{1},\mu _{2},\mu _{3}>0,\) s.t. for all \(\epsilon >0,h>0,L\in {\mathbb {N}}\) satisfying

$$\begin{aligned} L\ge \mu _{1}\log \frac{1}{\epsilon h}+\mu _{2}K+\mu _{3}, \end{aligned}$$

and all \(0<d\le Kh\) the following holds true.

  1. 1.

    There exists an \(L\)-term approximation to the convolution kernel \({\fancyscript{K}}_{d}(\xi )=\frac{\exp \left( {-\varDelta (\xi )\frac{d}{h}}\right) }{4\pi d}\):

    $$\begin{aligned} \left| {\fancyscript{K}}_{d}(\xi )-\sum _{\ell =0}^{L-1} w_{\ell }^{h}(d)\xi ^{\ell }\right| \le \epsilon \end{aligned}$$

    for all \(\xi \in {\mathbb {C}}:\; |\xi |\le 1\).

  2. 2.

    Convolution weights can be approximated with an accuracy \(\epsilon \) by an \(L\)-term discrete Fourier transform of the convolution kernel

    $$\begin{aligned} \left| w_{n}^{h}(d)-\frac{1}{L} \sum _{\ell =0}^{L-1} {\fancyscript{K}}_{d}(\mathrm{e}^{i\ell \frac{2\pi }{L}}) \mathrm{e}^{-i\ell n{\frac{2\pi }{L}}}\right| \le \epsilon \end{aligned}$$

    for all \(n<L\).

Proof

Let us prove the first statement using the bounds on convolution weights derived in Proposition 3.4. The second statement then follows directly from the first statement by the application of the aliasing formula.

By definition, for all \(|\xi |<1\)

$$\begin{aligned} {\fancyscript{K}}_{d}(\xi )=\sum _{\ell =0}^{\infty }w_{\ell }^{h}(d)\xi ^{\ell } =\sum \limits _{\ell =0}^{L-1}w_{\ell }^{h}(d)\xi ^{\ell } +\sum \limits _{\ell =L}^{\infty }w_{\ell }^{h}(d)\xi ^{\ell }. \end{aligned}$$

Let us show that given \(\epsilon >0\), there exists \(L\) s.t.

$$\begin{aligned} E_{L}(\xi ,d)=\left\| \sum \limits _{\ell =L}^{\infty } w_{\ell }^{h}(d)\xi ^{\ell }\right\| \le \epsilon . \end{aligned}$$
(4.2)

First, let \(L>K\). In a generalized form, the bounds on the convolution weights \(w_{\ell }^{h}(d)\) for \(\ell \ge L\) and \(d\le Kh<\ell h\) can be stated as

$$\begin{aligned} \left\| w_{\ell }^{h}(d)\right\| \le \frac{G}{h}\left( 1-\delta \right) ^{\ell - \frac{d}{h}}(1+A\delta ^{\alpha })^{\frac{d}{h}}, \end{aligned}$$

for any \(0<\delta <{\bar{\delta }}\), where \(A,G,\alpha >0\) and \({\bar{\delta }}\in (0,1)\) are constants. Inserting this bound into the expression (4.2) for \(E_{L}(\xi ,d)\) yields

$$\begin{aligned} E_{L}(\xi ,d)&\le \frac{G}{h}\left( \frac{1+A\delta ^{\alpha }}{1-\delta }\right) ^{\frac{d}{h}}\sum _{\ell =L}^{\infty }(1-\delta )^{\ell }\nonumber \\&\le \frac{G}{h}\left( \frac{1+A\delta ^{\alpha }}{1-\delta }\right) ^{\frac{d}{h}}(1-\delta )^{L}\delta ^{-1}\le \frac{G}{h \delta }\left( \frac{1+A\delta ^{\alpha }}{1-\delta }\right) ^{K}(1-\delta )^{L},\qquad \quad \end{aligned}$$
(4.3)

where we used that \(d\le Kh\). From the above it can be seen that \(L\) must be chosen larger than \(K\) and such that

$$\begin{aligned} \frac{G}{h\delta }\left( \frac{1+A\delta ^{\alpha }}{1-\delta }\right) ^{K}(1-\delta )^{L}<\epsilon . \end{aligned}$$

Namely,

$$\begin{aligned} L&\ge K\left( \log (1+A\delta ^{\alpha })\log ^{-1} \frac{1}{1-\delta }+1\right) \\&+\log \frac{1}{h\epsilon }\log ^{-1}\frac{1}{1-\delta }+ \log \frac{G}{\delta } \log ^{-1}\frac{1}{1-\delta }. \end{aligned}$$

This proves the statement of the proposition for \(|\xi |<1\). For \(|\xi |=1\) the claim follows from the bound (4.3), which remains valid in this case as well. \(\square \)
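The proposition is easy to test numerically. A quick check (our own, with hypothetical parameters; Delta and weights are the routines from the sketch above) compares the unscaled \(L\)-point DFT on the unit circle with the scaled-contour reference:

```python
def weights_unscaled(d, h, L):
    # Proposition 4.1(2): for d <= K*h the scaling radius rho is unnecessary
    xi = np.exp(2j * np.pi * np.arange(L) / L)     # nodes on |xi| = 1
    K = np.array([expm(-Delta(x) * d / h) for x in xi]) / (4 * np.pi * d)
    return np.real(np.fft.fft(K, axis=0)) / L

h, d = 0.1, 0.25                                   # d <= K*h with, say, K = 3
err = np.abs(weights_unscaled(d, h, L=48)[:8] - weights(d, h, N=128)[:8]).max()
print(err)                                         # shrinks exponentially in L
```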

Remark 4.1

Note that in the above statement the dependence of \(L\) on \(\log \frac{1}{h}\) cannot be removed. To illustrate this fact, in Fig. 5 we plot \(n_{*}=\sup \{n\in {\mathbb {N}}\;|\; \Vert w_{\ell }^{h}\left( d\right) \Vert <\epsilon ,\; \text {for all }\ell \ge n,\; d\le D\}\) for different values of \(h\) and \(\epsilon \) and fixed \(\frac{D}{h}=10\), for the 3-stage Radau IIA method of order five.

Fig. 5 Dependence of \(n_{*}=\sup \{n\in {\mathbb {N}}\;|\; \Vert w_{\ell }^{h}(d)\Vert <\epsilon ,\text { for all }\ell \ge n,\; d\le D\}\) on \(\frac{1}{h}\), for \(\epsilon =10^{-4},\,10^{-5}\) and fixed \(\frac{D}{h}=10\), for the 3-stage Radau IIA method

A similar statement holds for the scaled convolution weights.

Proposition 4.2

Let \(\mathrm{w}_{n}, \; n\ge 0,\) be the scaled Runge–Kutta convolution weights (2.10), and let the Runge–Kutta method satisfy Assumption 2.1.

Let \(K>0\) be fixed. There exist \(\mu _{1},\mu _{2},\mu _{3}>0,\) s.t. for all \(\epsilon >0\) and \(L\in {\mathbb {N}}\) satisfying

$$\begin{aligned} L\ge \mu _{1}\log \frac{1}{\epsilon }+\mu _{2}K+\mu _{3}, \end{aligned}$$

the following holds true for \(0\le d\le K.\)

  1.

    There exists an \(L\)-term approximation to the convolution kernel \(K_{d}(\xi )=\exp \left( {-\varDelta (\xi )d}\right) \):

    $$\begin{aligned} \left| K_{d}(\xi )-\sum \limits _{\ell =0}^{L-1} \mathrm{w}_{\ell }(d)\xi ^{\ell }\right| \le \epsilon \end{aligned}$$

    for all \(\xi \in {\mathbb {C}}:\; |\xi |\le 1\).

  2.

    The scaled convolution weights can be approximated with accuracy \(\epsilon \) by an \(L\)-term discrete Fourier transform of the convolution kernel \(K_{d}(\xi )\):

    $$\begin{aligned} \left| \mathrm{w}_{n}(d)-\frac{1}{L} \sum \limits _{\ell =0}^{L-1} K_{d}(\mathrm{e}^{i\ell \frac{2\pi }{L}}) \mathrm{e}^{-i\ell n{\frac{2\pi }{L}}}\right| \le \epsilon \end{aligned}$$

    for all \(n<L.\)

Proof

The proof of the proposition mimics the proof of Proposition 4.1. \(\square \)
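For the scaled weights the same recipe applies, without the \(\frac{1}{4\pi d}\) factor. A minimal sketch (ours, reusing Delta from the sketch in Sect. 4):

```python
def scaled_weights(d, L):
    # w_n(d), n = 0..L-1, as an L-point DFT of K_d(xi) = exp(-Delta(xi) d)
    xi = np.exp(2j * np.pi * np.arange(L) / L)
    K = np.array([expm(-Delta(x) * d) for x in xi])
    return np.real(np.fft.fft(K, axis=0)) / L
```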

5 Applications

The idea of using the sparsity of the convolution weights to speed up calculations is not new. The most straightforward approach is to observe that, for large enough \(n\), the weights \(W^h_j\), \(j > n\), need not be computed at all but can simply be approximated by zero. To describe more advanced algorithms we briefly introduce the Galerkin boundary element discretization of the convolution weights.

Let the boundary \(\varGamma \) be split into \(M\) disjoint panels \(\tau _1,\ldots ,\tau _M\) so that \(\varGamma = \cup _i \bar{\tau }_i\). The span of piecewise constant basis functions

$$\begin{aligned} b_i(x) = {\left\{ \begin{array}{ll} 1 &{} x\in \tau _i\\ 0 &{} x\notin \tau _i\\ \end{array}\right. },\quad i = 1,2,\ldots ,M, \end{aligned}$$

defines a finite dimensional subspace \(X \subset H^{-1/2}(\varGamma )\).

The corresponding Galerkin discretization of a convolution weight \(W_n^h\) is given by

$$\begin{aligned} \left( {\mathbf {W}}_n^h\right) _{ij}&= \int _\varGamma \left( W_n^h b_i\right) (x) b_j(x) d \varGamma _x \\&= \int _\varGamma \int _\varGamma w_n^h(\Vert x-y\Vert ) b_i(y) b_j(x) d \varGamma _yd \varGamma _x\\&= \int _{\tau _i} \int _{\tau _j} w_n^h(\Vert x-y\Vert ) d \varGamma _yd \varGamma _x. \end{aligned}$$
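For orientation, here is a deliberately crude assembly sketch (our own simplification: one-point quadrature per panel pair, with the singular diagonal skipped, whereas a production code needs proper singular quadrature there). The arrays centers and areas describing the panels, and the callable w returning the \(m\times m\) kernel \(w_n^h(d)\), are hypothetical inputs.

```python
def assemble_weight(centers, areas, w, m):
    """One-point panel-pair quadrature for the Galerkin matrix of one weight."""
    M = len(centers)
    G = np.zeros((M, M, m, m))
    for i in range(M):
        for j in range(M):
            if i == j:
                continue        # singular self-interaction: needs real quadrature
            d = np.linalg.norm(centers[i] - centers[j])
            G[i, j] = w(d) * areas[i] * areas[j]
    return G
```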

Remark 5.1

Convolution weights \(W_n^h\) are \(m \times m\)-matrices of operators bounded as mappings from \(H^{-1/2}(\varGamma )\) to \(H^{1/2}(\varGamma )\). Therefore if

$$\begin{aligned} W_n^h = \begin{pmatrix} W_n^{11} &{} \cdots &{} W_n^{1m}\\ \vdots &{} \cdots &{} \vdots \\ W_n^{m1} &{} \cdots &{} W_n^{mm} \end{pmatrix}, \end{aligned}$$

then \({\mathbf {W}}_n^h\) should be understood as a matrix of matrices (a tensor) and

$$\begin{aligned} \left( {\mathbf {W}}_n^h\right) _{ij} = \begin{pmatrix} \int _\varGamma \left( W_n^{11}b_i\right) (x) b_j(x) d \varGamma _x &{} \cdots &{} \int _\varGamma \left( W_n^{1m}b_i\right) (x) b_j(x) d \varGamma _x\\ \vdots &{} \cdots &{} \vdots \\ \int _\varGamma \left( W_n^{m1}b_i\right) (x) b_j(x) d \varGamma _x &{} \cdots &{} \int _\varGamma \left( W_n^{mm}b_i\right) (x) b_j(x) d \varGamma _x \end{pmatrix}. \end{aligned}$$

The sparsity of the convolution weights implies that many entries of the Galerkin matrices need not be computed. This fact has already been used, albeit only for linear multistep based methods, in [17, 18, 21] to construct and analyse an efficient algorithm. In the next section we analyse the complexity of such an algorithm for Runge–Kutta convolution quadrature.

5.1 Sparse Runge–Kutta convolution quadrature

In this section, using the new estimates proved in this paper, we analyse the complexity of sparse Runge–Kutta convolution quadrature. We will throughout consider Runge–Kutta methods satisfying Assumptions 2.1 and 3.1.

The main idea behind the a priori cut-off strategy of [17, 18, 21] is to approximate \({\mathbf {W}}_{n}^{h},\; n=0,\ldots , N\) (for a fixed time interval \([0,T]\), \(N = \lceil T/h\rceil \)) by sparse matrices. For Runge–Kutta based methods we can use the results of Proposition 3.3 to decide what proportion of the entries in the matrices can be set to zero. More precisely, let \(i, j, n\) be s.t. for all \(x\in \tau _i,\; y\in \tau _j\)

$$\begin{aligned} \left\| \mathrm{w}_{n}\left( \frac{\Vert x-y\Vert }{h}\right) \right\| <\epsilon , \end{aligned}$$

where \(\epsilon >0\) is a given accuracy. Then, using \(w_{n}^{h}(d)=\frac{\mathrm{w}_n\left( \frac{d}{h}\right) }{4\pi d}\), we obtain

$$\begin{aligned}&\left\| \left( {\mathbf {W}}_n^h\right) _{ij}\right\| = \left\| \int \limits _{\tau _i}\int \limits _{\tau _j}\frac{\mathrm{w}_n\left( \frac{\Vert x-y\Vert }{h}\right) }{4\pi \Vert x-y\Vert }b_{i}(x) b_{j}(y)d\varGamma _{x}d\varGamma _{y}\right\| \nonumber \\&\quad \le \int _{\tau _i}\int \limits _{\tau _j} \frac{\epsilon }{4\pi \Vert x-y\Vert }\left| b_{i}(x) b_{j}(y)\right| d\varGamma _{x}d\varGamma _{y}\le C_{\varGamma }\epsilon \Vert b_{i}\Vert _{L_{2} (\varGamma )}\Vert b_{j}\Vert _{L_{2}(\varGamma )},\qquad \quad \end{aligned}$$
(5.1)

for some constant \(C_{\varGamma }>0\) that depends only on \(\varGamma \). The last inequality follows from the \(L_{2}\)-continuity of the Laplace single layer boundary operator; see also [21].

Let us denote the spatial meshwidth by \(\varDelta x\) and assume that

$$\begin{aligned} c\varDelta x<\mathrm{diam }\tau _i<\varDelta x,\quad i=1,\ldots , M, \end{aligned}$$

for some \(c>0\) independent of \(\varDelta x\). Hence there exists a constant \(C'>0\) s.t. \(\left\| \left( {\mathbf {W}}_n^h\right) _{ij}\right\| \le C'\left( \varDelta x\right) ^2\epsilon \).
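Schematically, the cut-off can be applied as follows (a purely illustrative sketch of ours: the test is evaluated only at the panel centers, while the analysis requires the bound on the whole panel pair; scaled_weights, from the sketch after Proposition 4.2, is wastefully recomputed for every pair, and L must exceed n):

```python
def sparsify(G_n, n, centers, h, eps, L):
    """Zero the (i,j) block of the n-th Galerkin matrix when the scaled
    weight falls below the tolerance at the panel centers."""
    M = len(centers)
    for i in range(M):
        for j in range(M):
            d = np.linalg.norm(centers[i] - centers[j]) / h
            if np.linalg.norm(scaled_weights(d, L)[n], 2) < eps:
                G_n[i, j] = 0.0
    return G_n
```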

In [18] it is shown that the stability and convergence of the method based on the a priori cut-off strategy for BDF2 convolution quadrature are ensured if \(\epsilon \) is chosen s.t.

$$\begin{aligned} \log \frac{1}{\epsilon }=O(\log {M}). \end{aligned}$$
(5.2)

An analysis similar to that in [18] shows that the same bound on \(\epsilon \) can be used here.

Additionally, let us assume that

$$\begin{aligned} \varDelta x=O(h^{\nu }),\quad \nu >0. \end{aligned}$$
(5.3)

The case \(\nu =1\) is of particular interest in applications, e.g., the scattering of waves with high-frequency content, where the time step and the meshwidth are restricted more by the highest frequency present in the system than by the required accuracy. Namely, given the highest frequency \(f\) present in the system, the time step is chosen as \(h\approx Cf^{-1}\), for some \(C>0\), and the meshwidth as \(\varDelta x\approx Kh,\; K>0\); see [14]. Under the assumption \(M=O\left( \left( \varDelta x\right) ^{-2}\right) \), this implies the following relation between \(N\) (the number of time steps) and \(M\) (the size of the spatial discretization):

$$\begin{aligned} M=O\left( N^{2\nu }\right) . \end{aligned}$$
(5.4)

To estimate the number of non-zero elements in \(\left( {\mathbf {W}}_{n}^{h}\right) ^{kl},\; k,l=1,\ldots ,m\), \(0\le n\le N\), recall that for some \(\alpha > 1\), \(\mathrm{w}_n\left( \frac{\Vert x-y\Vert }{h}\right) \) is negligible if

$$\begin{aligned} \left| \frac{\Vert x-y\Vert }{h}-n\right| \ge \max (\varDelta _1^{n,\epsilon }, \varDelta _2^{n,\epsilon }) \sim n^{1/\alpha }\log ^{1-1/\alpha }\tfrac{1}{\epsilon }. \end{aligned}$$

Hence the support of the kernel is widest for \(n = N\). Therefore, for each of the \(M^2\) pairs \((i,j)\), \(i,j = 1,\ldots ,M\), the corresponding entry in the Galerkin matrix \(\left( {\mathbf {W}}_{n}^{h}\right) _{ij}\) is non-zero for at most \(O(N^{1/\alpha }\log ^{1-1/\alpha }\tfrac{1}{\epsilon })\) of the convolution weights. Putting this together, we obtain

$$\begin{aligned} O(M^2 N^{1/\alpha }\log ^{1-1/\alpha }\tfrac{1}{\epsilon }) = O(M^2 N^{1/\alpha }\log ^{1-1/\alpha }M) \end{aligned}$$

as an estimate of the number of non-zero entries in the sparse approximation of the \(N\) convolution weights.
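The counting argument can be mirrored in a few lines (our sketch; the constant c in the support width is method-dependent and is left as a free parameter):

```python
def is_negligible(d, h, n, eps, alpha, c=1.0):
    # drop w_n(d/h) when |d/h - n| >= c n^{1/alpha} log(1/eps)^{1-1/alpha}
    width = c * max(n, 1) ** (1 / alpha) * np.log(1 / eps) ** (1 - 1 / alpha)
    return np.abs(d / h - n) >= width

def count_nonzero(centers, h, N, eps, alpha=6):  # alpha = p+1 = 6: 3-stage Radau IIA
    dist = np.linalg.norm(centers[:, None, :] - centers[None, :, :], axis=-1)
    return sum(np.count_nonzero(~is_negligible(dist, h, n, eps, alpha))
               for n in range(N + 1))
```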

For the 3-stage Radau IIA method \(\alpha = p+1 = 6\). Taking \(\nu = 1\), i.e., \(M = O(N^2)\), we obtain as the total storage cost for \(N\) convolution weights

$$\begin{aligned} O(M^2N^{1/6}\log ^{5/6}M) = O(M^{2+1/12}\log ^{5/6}M). \end{aligned}$$

Hence a straightforward application of sparsity does not give an algorithm of linear complexity. Nonetheless, the memory requirements are very close to those of many marching-on-in-time (MOT) and Galerkin methods, which scale as \(O(M^2)\).

As a remedy, in [21] it was suggested to combine the a priori cut-off strategy with panel clustering. Although the extension of these ideas to Runge–Kutta CQ may lead to a method with improved memory requirements, the total complexity of such an algorithm would still not scale better than \(O(MN^2)\), owing to the use of back substitution to solve the convolution system. This difficulty needs to be circumvented to allow for efficient large-scale Runge–Kutta convolution quadrature algorithms. Improved complexity can be recovered if FFT methods are used. In the next section we briefly describe how the FFT can be combined with the sparsity of the convolution weights to good effect.

5.2 FFT and sparsity

The use of the FFT, as in (4.1), usually destroys any sparsity, but in [7] it was shown that advantage can still be taken of sparsity in this case. Here we briefly explain the main idea.

In (4.1) we have seen that the convolution kernels can be approximated by a discrete Fourier transform, and hence the same is true of the Galerkin matrices \({{\mathbf {W}}}_n^h\) of the convolution weights:

$$\begin{aligned} {{\mathbf {W}}}_n^h \approx \varrho ^{-n} \frac{1}{N+1}\sum _{\ell = 0}^N \mathbf{V}(\varrho \mathrm{e}^{i\ell \frac{2\pi }{N+1}}) \mathrm{e}^{-i\ell n{\frac{2\pi }{N+1}}}, \end{aligned}$$

for \(n = 0, 1,\ldots , N\). This shows that \(N\) convolution weights can be computed in \(O(N \log N)\) time. Recall that for a fixed time interval \([0,T]\), \(N = \lceil T/h\rceil \).

From Proposition 3.3 we know that for a fixed \(n_0 > 0\)

$$\begin{aligned} \Vert w_n^h(d)\Vert =\frac{1}{4\pi d}\left\| \mathrm{w}_{n}\left( \frac{d}{h}\right) \right\| \le \frac{C}{4\pi d} (1-\delta )^{n-\tfrac{d}{h}}\le \frac{C}{4\pi d} (1-\delta )^{n-n_0}, \end{aligned}$$

for all \(d \in (0,\; n_0 h)\) and \(n > n_0\), with a constant \(C>0\). Hence there exists \(n_1 \propto \log \epsilon ^{-1}\) such that

$$\begin{aligned} \Vert w_n^h(d)\Vert \le \frac{\epsilon }{4\pi d},\quad \text {for all }\; 0 < d < n_0h\text { and } n > n_1. \end{aligned}$$

Therefore for \(n > n_1\), the "near-field" of the matrices \({{\mathbf {W}}}_n^h\) can be approximated by zero [cf. (5.1)]:

$$\begin{aligned} \left( {{\mathbf {W}}}_n^h\right) _{ij} = \int _{\tau _i} \int _{\tau _j} w_n^h(\Vert x-y\Vert ) d \varGamma _yd \varGamma _x \approx 0, \quad \text {if } \mathrm{dist }(\tau _i,\tau _j) < n_0h, \; n > n_1. \end{aligned}$$

Unfortunately the same is not true for the \(N+1\) matrices \(\mathbf{V}(\varrho \mathrm{e}^{i\ell \frac{2\pi }{N+1}})\); a further problem is that the near-field is precisely the part of the matrix to which fast methods, such as hierarchical matrices and the fast multipole method, cannot be applied. To considerably reduce the computational cost, we now show how to reduce the number of matrices for which the near-field needs to be computed from \(O(N)\) to \(O\left( \log \frac{1}{\epsilon }\right) \) while still making effective use of the FFT. First we compute the \(n_1+1\) weights \({\mathbf {W}}_n^h\), \(n = 0,1,\ldots , n_1,\) using

$$\begin{aligned} {{\mathbf {W}}}_n^h \approx \frac{\varrho ^{-n}}{n_1+1}\sum _{\ell = 0}^{n_1} \mathbf{V}(\varrho \mathrm{e}^{i\ell \frac{2\pi }{n_1+1}}) \mathrm{e}^{-i\ell n{\frac{2\pi }{n_1+1}}}, \quad n =0,1,\ldots ,n_1. \end{aligned}$$

This is not an expensive operation since \(n_1\) depends only logarithmically on the desired accuracy \(\epsilon \); see Proposition 4.2 and (5.1). The remaining weights are computed, using a formula similar to the one above, in \(O(N \log N)\) time

$$\begin{aligned} {{\mathbf {W}}}_n^h \approx \frac{\varrho ^{-n}}{N+1}\sum _{\ell = 0}^N \widetilde{\mathbf{V}}(\varrho \mathrm{e}^{i\ell \frac{2\pi }{N+1}}) \mathrm{e}^{-i\ell n{\frac{2\pi }{N+1}}}, \quad n = n_1+1, \ldots ,N, \end{aligned}$$

where

$$\begin{aligned} (\widetilde{\mathbf{V}}(s))_{ij} = {\left\{ \begin{array}{ll} 0 &{}\quad \! \text {if } \mathrm{dist }(\tau _i,\tau _j) < n_0h\\ ( \mathbf{V}(s))_{ij} &{}\quad \! \text {otherwise}. \end{array}\right. } \end{aligned}$$

Thereby \(N\) convolution weights can still be computed in \(O(N\log N)\) time, but the near-field needs to be computed for only \(O\left( \log \frac{1}{\epsilon }\right) \) of the evaluations of \(\mathbf{V}(s)\).
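In code, the two passes above combine into the following schematic driver (ours; the callables V(s) and V_far(s), returning the Galerkin matrix of the kernel at the point s with and without the near-field respectively, are assumed given, and the error control of [7] is glossed over):

```python
def cq_weights_two_pass(V, V_far, N, n1, rho):
    # Pass 1: full matrices, (n1+1)-point DFT, yields weights n = 0..n1.
    s1 = rho * np.exp(2j * np.pi * np.arange(n1 + 1) / (n1 + 1))
    W1 = np.fft.fft(np.array([V(s) for s in s1]), axis=0) / (n1 + 1)
    # Pass 2: near-field-free matrices, (N+1)-point DFT, yields n = n1+1..N.
    s2 = rho * np.exp(2j * np.pi * np.arange(N + 1) / (N + 1))
    W2 = np.fft.fft(np.array([V_far(s) for s in s2]), axis=0) / (N + 1)
    return [np.real(rho ** -n * (W1[n] if n <= n1 else W2[n]))
            for n in range(N + 1)]
```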

For a more thorough explanation of how to use such ideas in a full algorithm for solving the discretized equations see [7].

6 Conclusions

We have analysed the behaviour of the convolution weights of Runge–Kutta convolution quadrature. We have proved that the weights \(w_n^h(d)\) decay exponentially away from \(nh \approx d\). The estimates obtained explain how the size of the approximate support of a convolution weight depends on the order of the underlying Runge–Kutta method. The results of this work can be used for the design and complexity analysis of fast, sparsity-based algorithms for the solution of TDBIE for the three-dimensional wave equation.