1 Introduction

More than four decades ago, Grigori N. Milstein proposed a new numerical method for the approximate integration of stochastic ordinary differential equations (SODEs) in [18] (see [19] for an English translation). This scheme is nowadays called the Milstein method and offers a higher order of accuracy than the classical Euler–Maruyama scheme. In fact, G. N. Milstein showed that, under suitable conditions on the coefficient functions of the SODE, his method converges with order 1 to the exact solution with respect to the root mean square norm, while the Euler–Maruyama scheme is, in general, only convergent of order \(\frac{1}{2}\).

In its simplest form, that is for scalar stochastic differential equations driven by a scalar Wiener process W, the Milstein method is given by the recursion

$$\begin{aligned} X_h(t + h)&= X_h(t) + h f(t, X_h(t) ) + g(t, X_h(t)) \Delta _h W(t) \nonumber \\&\quad +\, \frac{1}{2} \big ( \frac{\partial g}{\partial x} \circ g \big ) (t, X_{h} (t) ) \big ( \Delta _h W(t)^2 - h \big ), \end{aligned}$$
(1)

where h denotes the step size, \(\Delta _h W(t) = W(t + h) - W(t)\) is the stochastic increment, and f and g are the drift and diffusion coefficient functions of the underlying SODE (Eq. (3) below shows the SODE in the full generality considered in this paper).
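For illustration only, here is a minimal NumPy sketch of one step of the recursion (1); the concrete coefficients `f`, `g`, `dg_dx` below are placeholder assumptions of ours, not taken from this paper:

```python
import numpy as np

def milstein_step(x, t, h, f, g, dg_dx, rng):
    """One step of the scalar Milstein scheme (1)."""
    dW = np.sqrt(h) * rng.standard_normal()   # increment Delta_h W(t) ~ N(0, h)
    return (x + h * f(t, x) + g(t, x) * dW
            + 0.5 * dg_dx(t, x) * g(t, x) * (dW ** 2 - h))

# illustrative coefficients: dX = -X dt + sin(X) dW
f = lambda t, x: -x
g = lambda t, x: np.sin(x)
dg_dx = lambda t, x: np.cos(x)

rng = np.random.default_rng(0)
x, h = 1.0, 1e-3
for n in range(1000):                         # integrate on [0, 1]
    x = milstein_step(x, n * h, h, f, g, dg_dx, rng)
```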

Since the derivation of the Milstein method in [18] relies on an iterated application of the Itō formula, the error analysis requires the boundedness and continuity of the coefficient functions f and g and their partial derivatives up to the fourth order. Similar conditions also appear in the standard literature on this topic [12, 20, 21].

In more recent publications these conditions have been relaxed: For instance, in [13] it is proved that the strong order 1 result for the scheme (1) remains true if the coefficient functions are only twice continuously differentiable with bounded partial derivatives, provided the exact solution has sufficiently high moments and the mapping \(x \mapsto \left( \frac{\partial g}{\partial x} \circ g \right) (t, x)\) is globally Lipschitz continuous for every \(t \in [0,T]\). On the other hand, from the results in [10] it follows that the explicit Euler–Maruyama method is divergent in the strong and weak sense if the coefficient functions grow super-linearly. Since the same reasoning also applies to the Milstein scheme (1), it is necessary to consider suitable variants in this situation.

One possibility to treat super-linearly growing coefficient functions is proposed in [27]. Here the authors combine the Milstein scheme with the taming strategy from [11]. This allows them to prove the strong convergence rate 1 in the case of SODEs whose drift coefficient functions satisfy a one-sided Lipschitz condition. The same approach is used in [15], where the authors consider SODEs driven by Lévy noise. However, both papers still require that the diffusion coefficient functions be globally Lipschitz continuous.

This is not needed for the implicit variant of the Milstein scheme considered in [8], where the strong convergence result also applies to certain SODEs with super-linearly growing diffusion coefficient functions. However, the authors only consider scalar SODEs and do not determine the order of convergence. The first result bypassing all these restrictions is found in [29], which deals with an explicit first order method based on a variant of the taming idea. A more recent result based on the taming strategy is also given in [16].

In this paper we propose two further variants of the Milstein scheme which apply to multi-dimensional SODEs of the form (3). First, we follow an idea from [2] and study the projected Milstein method, consisting of the standard explicit Milstein scheme together with a nonlinear projection onto a ball whose radius grows like a negative power of the step size. The second scheme is a Milstein-type variant of the split-step backward Euler scheme (see [7]), termed the split-step backward Milstein method.

For both schemes we prove the optimal strong convergence rate 1 in the following sense: Let \(X :[0,T] \times \Omega \rightarrow {\mathbb {R}}^d\) and \(X_h :\{t_0, t_1, \ldots ,t_N\} \times \Omega \rightarrow {\mathbb {R}}^d\) denote the exact solution and its numerical approximation with corresponding step size h. Then, there exists a constant C independent of h such that

$$\begin{aligned} \max _{n \in \{1,\ldots ,N\} } \Vert X(t_n) - X_h(t_n) \Vert _{L^2(\Omega ;{\mathbb {R}}^d)} \le C h, \end{aligned}$$
(2)

where \(t_n = n h\) and \(t_N \le T\). For the proof we essentially impose the global monotonicity condition (4) and certain local Lipschitz assumptions on the first order derivatives of the drift and diffusion coefficient functions. For a precise statement of all our assumptions and the two convergence results we refer to Assumption 2.1 and Theorems 2.2 and 2.3 below. Together with the result on the balanced scheme found in [29], these theorems are the first results which determine the optimal strong convergence rate for some Milstein-type schemes without any linear growth or global Lipschitz assumption on the diffusion coefficient functions and for multi-dimensional SODEs.

Let us note that the error analysis presented in this paper is based on the notions of stochastic C-stability and B-consistency developed in [2]. As already indicated above, several further approaches to derive convergence rates of discretization schemes for SODEs with super-linearly growing coefficient functions are found in the literature. For additional references and a more detailed comparison with other methods of proof we refer to [2]. However, many of these approaches rely on the interpolation of the discretization scheme to continuous time in order to apply perturbation results for Itō processes, as for instance in [9]. For implicit and split-step schemes the interpolation step is not easily accomplished. This explains why we prefer to apply a framework that estimates the mean-square error on the discrete time level, following strategies from standard textbooks on numerical analysis for deterministic ODEs, such as [3, 6, 25].

The remainder of this paper is organized as follows: In Sect. 2 we introduce the projected Milstein method and the split-step backward Milstein scheme in full detail. We state all assumptions and the convergence results, which are then proved in later sections. Further, we apply the convergence results to SODEs with additive noise, for which the Milstein-type schemes coincide with the corresponding Euler-type schemes.

The proofs follow the same steps as the error analysis in [2]. In order to keep this paper as self-contained as possible we briefly recall the notions of C-stability and B-consistency and the abstract convergence theorem from [2] in Sect. 3. Then, in the following four sections we verify that the two considered Milstein-type schemes are indeed stable and consistent in the sense of Sect. 3. Finally, in Sect. 8 we report on a couple of numerical experiments which illustrate our theoretical findings. Note that both examples include non-globally Lipschitz continuous coefficient functions, which are not covered by the standard results found in [12, 20].

2 Assumptions and Main Results

This section contains a detailed description of our assumptions on the stochastic differential equation, under which our strong convergence results hold. Further, we introduce the projected Milstein method and the split-step backward Milstein scheme and we state our main results.

Our starting point is the stochastic ordinary differential equation (3) below. We apply the same notation as in [2] and we fix \(d,m \in {\mathbb {N}}\), \(T \in (0,\infty )\), and a filtered probability space \((\Omega , {\mathcal {F}}, ({\mathcal {F}}_t)_{t \in [0,T]},{\mathbf {P}})\) satisfying the usual conditions. By \(X :[0,T] \times \Omega \rightarrow {\mathbb {R}}^d\) we denote a solution to the SODE

$$\begin{aligned} \,\mathrm {d}X(t)&= f(t,X(t)) \,\mathrm {d}t + \sum _{r=1}^m g^r(t,X(t)) \,\mathrm {d}W^r(t),\quad t \in [0,T], \nonumber \\ X(0)&=X_0. \end{aligned}$$
(3)

Here \(f:[0,T] \times {\mathbb {R}}^d \rightarrow {\mathbb {R}}^d\) stands for the drift coefficient function, while \(g^r :[0,T]\times {\mathbb {R}}^d \rightarrow {\mathbb {R}}^d\), \(r=1,\ldots ,m\), are the diffusion coefficient functions. By \(W^r :[0,T] \times \Omega \rightarrow {\mathbb {R}}\), \(r = 1,\ldots ,m\), we denote a family of independent real-valued standard \(({\mathcal {F}}_t)_{t\in [0,T]}\)-Brownian motions on \((\Omega ,\mathcal {F},{\mathbf {P}})\). For a sufficiently large \(p \in [2,\infty )\) the initial condition \(X_0\) is assumed to be an element of the space \(L^p(\Omega ,{\mathcal {F}}_0,{\mathbf {P}};{\mathbb {R}}^d)\).

Let us fix some further notation: We write \(\langle \cdot , \cdot \rangle \) and \(|\cdot |\) for the Euclidean inner product and the Euclidean norm on \({\mathbb {R}}^d\), respectively. Further, we denote by \({\mathcal {L}}({\mathbb {R}}^d) = {\mathcal {L}}({\mathbb {R}}^d,{\mathbb {R}}^d)\) the set of all bounded linear operators on \({\mathbb {R}}^d\) endowed with the matrix norm \(| \cdot |_{{\mathcal {L}}({\mathbb {R}}^d)}\) induced by the Euclidean norm. For a sufficiently smooth mapping \(f :[0,T] \times {\mathbb {R}}^d \rightarrow {\mathbb {R}}^d\) and a given \(t \in [0,T]\) we denote by \(\frac{\partial f}{\partial x} (t,x) \in {\mathcal {L}}({\mathbb {R}}^d)\) the Jacobian matrix of the mapping \({\mathbb {R}}^d \ni x \mapsto f(t,x) \in {\mathbb {R}}^d\).

Having established this, we formulate the conditions on the drift and the diffusion coefficient functions:

Assumption 2.1

The mappings \(f :[0,T] \times {\mathbb {R}}^d \rightarrow {\mathbb {R}}^d\) and \(g^r :[0,T] \times {\mathbb {R}}^d \rightarrow {\mathbb {R}}^d\), \(r = 1,\ldots ,m\), are continuously differentiable. Further, there exist \(L \in (0,\infty )\) and \(\eta \in (\frac{1}{2},\infty )\) such that for all \(t \in [0,T]\) and \(x_1,x_2 \in {\mathbb {R}}^d\) it holds

$$\begin{aligned} \big \langle f(t,x_1) - f(t,x_2), x_1-x_2 \big \rangle + \eta \sum _{r = 1}^m \big | g^r(t,x_1) - g^r(t,x_2) \big |^2&\le L | x_1 - x_2 |^2. \end{aligned}$$
(4)

In addition, there exists \(q \in [2,\infty )\) such that

$$\begin{aligned} \big | \tfrac{\partial f}{\partial x} (t,x_1) - \tfrac{\partial f}{\partial x} (t,x_2) \big |_{\mathcal {L}({\mathbb {R}}^d)} \le L \big ( 1 + |x_1| + |x_2 | \big )^{q-2} | x_1 - x_2 | \end{aligned}$$
(5)

and, for every \(r = 1,\ldots ,m\),

$$\begin{aligned} \big | \tfrac{\partial g^r}{\partial x} (t,x_1) - \tfrac{\partial g^r}{\partial x} (t,x_2)\big |_{\mathcal {L}({\mathbb {R}}^d)}&\le L \big ( 1 + |x_1| + |x_2 | \big )^{\frac{q-3}{2}} | x_1 - x_2 | \end{aligned}$$
(6)

for all \(t \in [0,T]\) and \(x_1,x_2 \in {\mathbb {R}}^d\). Moreover, it holds

$$\begin{aligned} \big | \tfrac{\partial f}{\partial t} (t,x) \big | \le L \big ( 1 + |x | \big )^q, \quad \big | \tfrac{\partial g^r}{\partial t} (t,x) \big |&\le L \big ( 1 + |x|\big )^{\frac{q+1}{2}} , \end{aligned}$$
(7)

for all \(t \in [0,T]\), \(x \in {\mathbb {R}}^d\), and all \(r = 1,\ldots ,m\).

First we note that Assumption 2.1 is slightly weaker, in terms of smoothness requirements on the coefficient functions, than the conditions imposed in [29, Lemma 4.2]. Further, we recall that Eq. (4) is often termed the global monotonicity condition in the literature. It is easy to check that Assumption 2.1 is satisfied (with \(q=3\)) if f and \(g^r\) and all their first order partial derivatives are globally Lipschitz continuous. However, Assumption 2.1 includes several SODEs which cannot be treated by the standard results found in [12, 20]. We refer to Sect. 8 for two more concrete examples.
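For illustration (this is not one of the examples treated in Sect. 8), consider the scalar case \(d = m = 1\) with the autonomous coefficients \(f(x) = -x^3\) and \(g(x) = x^2\), neither of which is globally Lipschitz continuous. One computes

$$\begin{aligned} \big ( f(x_1) - f(x_2) \big ) (x_1-x_2) + \tfrac{3}{4} \big | g(x_1) - g(x_2) \big |^2&= \Big ( - \big ( x_1^2 + x_1 x_2 + x_2^2 \big ) + \tfrac{3}{4} \big ( x_1 + x_2 \big )^2 \Big ) | x_1 - x_2 |^2 \\&= - \tfrac{1}{4} | x_1 - x_2 |^4 \le 0, \end{aligned}$$

so that the global monotonicity condition (4) holds with \(\eta = \frac{3}{4}\) and any \(L \in (0,\infty )\). Moreover, \(\big | \tfrac{\partial f}{\partial x}(x_1) - \tfrac{\partial f}{\partial x}(x_2) \big | = 3 |x_1 + x_2| \, |x_1 - x_2|\) and \(\big | \tfrac{\partial g}{\partial x}(x_1) - \tfrac{\partial g}{\partial x}(x_2) \big | = 2 |x_1 - x_2|\), so that (5) and (6) are satisfied with \(q = 3\), while the time derivatives in (7) vanish. Hence Assumption 2.1 holds with \(q = 3\).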

For a possibly enlarged L the following estimates are an immediate consequence of Assumption 2.1 and the mean value theorem: For all \(t,t_1,t_2 \in [0,T]\) and \(x,x_1,x_2 \in {\mathbb {R}}^d\) it holds

$$\begin{aligned} |f(t,x) |&\le L \big ( 1 + |x | \big )^q, \end{aligned}$$
(8)
$$\begin{aligned} \big | \tfrac{\partial f}{\partial x} (t,x) \big |_{\mathcal {L}({\mathbb {R}}^d)}&\le L \big ( 1 + |x | \big )^{q-1}, \end{aligned}$$
(9)
$$\begin{aligned} | f(t_1,x) - f(t_2,x) |&\le L \big ( 1 + |x|\big )^q |t_1 - t_2|, \end{aligned}$$
(10)
$$\begin{aligned} | f(t,x_1) - f(t,x_2) |&\le L \big ( 1 + |x_1| + |x_2 | \big )^{q-1} | x_1 - x_2 |, \end{aligned}$$
(11)

and, for all \(r=1,\ldots ,m\),

$$\begin{aligned} | g^r(t,x)|&\le L \big ( 1 + |x | \big )^{\frac{q+1}{2}}, \end{aligned}$$
(12)
$$\begin{aligned} \big | \tfrac{\partial g^r}{\partial x} (t,x) \big |_{\mathcal {L}({\mathbb {R}}^d)}&\le L \big ( 1 + |x | \big )^{\frac{q-1}{2}}, \end{aligned}$$
(13)
$$\begin{aligned} | g^r(t_1,x) - g^r(t_2,x) |&\le L \big ( 1 + |x| \big )^{\frac{q+1}{2}} |t_1 - t_2|, \end{aligned}$$
(14)
$$\begin{aligned} | g^r(t,x_1) - g^r(t,x_2) |&\le L \big ( 1 + |x_1| + |x_2 | \big )^{\frac{q-1}{2}} | x_1 - x_2 |. \end{aligned}$$
(15)

Thus, Assumption 2.1 implies [2, Assumption 2.1] and all results of that paper also hold true in the situation considered here. Note that in this paper we use the weights \((1+|x|)^p\) instead of \(1+|x|^p\) as in [2]. For \(p \ge 0\) this makes no difference; however, in condition (6) we may have \(p=\frac{q-3}{2}<0\) if \(2\le q <3\), so that the local Lipschitz constants actually decrease at infinity.

In the following it will be convenient to introduce the abbreviation

$$\begin{aligned} g^{r_1,r_2}(t,x) := \frac{\partial g^{r_1}}{\partial x} (t,x) g^{r_2}(t,x), \quad t \in [0,T],\; x \in {\mathbb {R}}^d, \end{aligned}$$
(16)

for \(r_1,r_2 = 1,\ldots ,m\). As above, one easily verifies under Assumption 2.1 that the mappings \(g^{r_1,r_2}\) satisfy (for a possibly larger L) the polynomial growth bound

$$\begin{aligned} \big | g^{r_1,r_2}(t,x) \big |&\le L \big ( 1 + |x| \big )^{q} \end{aligned}$$
(17)

as well as the local Lipschitz bound

$$\begin{aligned} \big | g^{r_1,r_2}(t,x_1) - g^{r_1,r_2}(t,x_2) \big | \le L \big ( 1 + |x_1| + |x_2 | \big )^{q-1} | x_1 - x_2 | \end{aligned}$$
(18)

for all \(x_1,x_2 \in {\mathbb {R}}^d\), \(t \in [0,T]\), and \(r_1,r_2 = 1,\ldots ,m\). For this conclusion to hold in the case \(q<3\), it is essential to use the modified weight function in (6).

We say that an almost surely continuous and \(({\mathcal {F}}_t)_{t \in [0,T]}\)-adapted stochastic process \(X :[0,T] \times \Omega \rightarrow {\mathbb {R}}^d\) is a solution to (3) if it satisfies \({\mathbf {P}}\)-almost surely the integral equation

$$\begin{aligned} X(t) = X_0 + \int _{0}^{t} f(s,X(s)) \,\mathrm {d}s + \sum _{r = 1}^m \int _{0}^t g^r(s,X(s)) \,\mathrm {d}W^r(s) \end{aligned}$$
(19)

for all \(t \in [0,T]\). It is well-known that Assumption 2.1 is sufficient to ensure the existence of a unique solution to (3); see, for instance, [14], [17, Chap. 2.3], or [23, Chap. 3].

In addition, the exact solution has finite p-th moments, that is

$$\begin{aligned} \sup _{t \in [0,T]} \big \Vert X(t) \big \Vert _{L^p(\Omega ;{\mathbb {R}}^d)} < \infty , \end{aligned}$$
(20)

if the following global coercivity condition is satisfied: There exist \(C \in (0,\infty )\) and \(p \in [2, \infty )\) such that

$$\begin{aligned} \big \langle f(t,x), x \big \rangle + \frac{p-1}{2} \sum _{r = 1}^m \big | g^r(t,x) \big |^2 \le C \big (1 + | x |^2 \big ) \end{aligned}$$
(21)

for all \(x \in {\mathbb {R}}^d\), \(t \in [0,T]\). A proof is found, for example, in [17, Chap. 2.4].

For the formulation of the numerical methods we recall the following terminology from [2]: By \(\overline{h} \in (0,T]\) we denote an upper step size bound. Then, for every \(N \in {\mathbb {N}}\) we say that \(h = (h_1,\ldots ,h_{N}) \in (0,\overline{h}]^{N}\) is a vector of (deterministic) step sizes if \(\sum _{i = 1}^N h_i = T\). Every vector of step sizes h induces a set of temporal grid points \(\mathcal {T}_h\) given by

$$\begin{aligned} \mathcal {T}_h := \Big \{ t_n := \sum _{i = 1}^n h_i \, : \, n = 0,\ldots ,N \Big \}, \end{aligned}$$

where \(\sum _{\emptyset } = 0\). For short we write

$$\begin{aligned} |h| := \max _{i \in \{ 1,\ldots ,N\}} h_i \end{aligned}$$

for the maximal step size in h.
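In code, these conventions amount to a cumulative sum; a short NumPy sketch (the uniform choice \(h_i = T/N\) is just one admissible vector of step sizes):

```python
import numpy as np

T, N = 1.0, 100
h = np.full(N, T / N)                        # vector of step sizes, sum(h) == T
t = np.concatenate(([0.0], np.cumsum(h)))    # grid points t_0 = 0, ..., t_N = T
h_max = h.max()                              # maximal step size |h|
```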

Moreover, we recall from [12, 20] the following notation for the stochastic increments: Let \(t,s \in [0,T]\) with \(s < t\). Then we define

$$\begin{aligned} I_{(r)}^{s,t} := \int _s^t \,\mathrm {d}W^{r}(\tau ), \end{aligned}$$
(22)

for \(r \in \{1,\ldots ,m\}\) and, similarly,

$$\begin{aligned} I_{(r_1,r_2)}^{s,t} := \int _s^t \int _s^{\tau _1} \,\mathrm {d}W^{r_1}(\tau _2) \,\mathrm {d}W^{r_2}(\tau _1), \end{aligned}$$
(23)

where \(r_1, r_2 \in \{1,\ldots ,m\}\). The joint family of iterated stochastic integrals \(\left( I_{(r_1,r_2)}^{s,t}\right) _{r_1,r_2 = 1}^m\) is not easily generated on a computer. Apart from special cases such as commutative noise, one has to rely on an additional approximation method from, e.g., [4, 24, 28]. We also refer to the corresponding discussion in [12, Chap. 10.3].
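In the single-noise case \(m = 1\) (and, more generally, on the diagonal \(r_1 = r_2\)) the iterated integral has the closed form \(I_{(r,r)}^{s,t} = \frac{1}{2} \big ( (I_{(r)}^{s,t})^2 - (t-s) \big )\), which follows from the Itō formula. A NumPy sketch for this special case (the general case \(r_1 \ne r_2\) would require one of the cited approximation methods):

```python
import numpy as np

def increments_single_noise(s, t, rng):
    """Sample I_{(1)}^{s,t} and I_{(1,1)}^{s,t} for a single Wiener process (m = 1)."""
    delta = t - s
    I1 = np.sqrt(delta) * rng.standard_normal()  # I_{(1)}^{s,t} = W(t) - W(s) ~ N(0, delta)
    I11 = 0.5 * (I1 ** 2 - delta)                # closed form on the diagonal
    return I1, I11
```

For commutative noise one can instead exploit the identity \(I_{(r_1,r_2)}^{s,t} + I_{(r_2,r_1)}^{s,t} = I_{(r_1)}^{s,t} I_{(r_2)}^{s,t}\) for \(r_1 \ne r_2\), since then only the symmetric part of the iterated integrals enters the scheme.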

The first numerical scheme we study in this paper is an explicit one-step scheme termed the projected Milstein method (PMil). It is the Milstein-type counterpart of the projected Euler–Maruyama method from [2] and consists of the standard Milstein scheme combined with a projection onto a ball in \({\mathbb {R}}^d\) whose radius grows like a negative power of the step size.

To be more precise, let \(h \in (0,\overline{h}]^N\), \(N \in {\mathbb {N}}\), be an arbitrary vector of step sizes with upper step size bound \(\overline{h} \in (0,1]\). For a given parameter \(\alpha \in (0,\infty )\) the PMil method is determined by the recursion

$$\begin{aligned} \overline{X}_h^{\mathrm {PMil}}(t_i)&= \min \big ( 1 , h_i^{-\alpha } \big | X_h^{\mathrm {PMil}}(t_{i-1}) \big |^{-1} \big ) X_h^{\mathrm {PMil}}(t_{i-1}),\nonumber \\ X_h^{\mathrm {PMil}}(t_i)&= \overline{X}_h^{\mathrm {PMil}}(t_{i}) + h_i f\left( t_{i-1}, \overline{X}_h^{\mathrm {PMil}}(t_{i}) \right) + \sum _{r = 1}^m g^r\left( t_{i-1}, \overline{X}_h^{\mathrm {PMil}}(t_{i})\right) I_{(r)}^{t_{i-1},t_{i}}\nonumber \\&\quad + \sum _{r_1, r_2 = 1}^m g^{r_1,r_2} \left( t_{i-1}, \overline{X}_{h}^{\mathrm {PMil}}(t_i)\right) I_{(r_2,r_1)}^{t_{i-1},t_{i}}, \quad \text { for } 1 \le i \le N, \end{aligned}$$
(24)

where \(X_h^{\mathrm {PMil}}(0) := X_0\). The results of Sect. 4 indicate that the parameter \(\alpha \) is optimally chosen as \(\alpha = \frac{1}{2(q-1)}\), depending on the growth rate q appearing in Assumption 2.1.
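As an informal illustration, here is a minimal sketch of one PMil step for the single-noise case \(m = 1\), reusing `increments_single_noise` from the sketch above; the functions `f`, `g`, and `g11` (the latter realizing \(g^{1,1}\) from (16)) are user-supplied placeholders:

```python
import numpy as np

def pmil_step(x, t_prev, h, alpha, f, g, g11, rng):
    """One step of the projected Milstein method (24) for m = 1."""
    radius = h ** (-alpha)                    # projection radius expands as h -> 0
    norm_x = np.linalg.norm(x)
    x_bar = x if norm_x <= radius else (radius / norm_x) * x
    I1, I11 = increments_single_noise(t_prev, t_prev + h, rng)
    return (x_bar + h * f(t_prev, x_bar)
            + g(t_prev, x_bar) * I1 + g11(t_prev, x_bar) * I11)
```

One aim of this paper is the proof of the following strong convergence result for the PMil method. It follows directly from Theorems 4.4 and 5.1 together with Theorem 3.5.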

Theorem 2.2

Let Assumption 2.1 be satisfied with polynomial growth rate \(q \in [2,\infty )\). If the exact solution X to (3) satisfies \(\sup _{\tau \in [0,T]} \Vert X(\tau ) \Vert _{L^{8q-6}(\Omega ;{\mathbb {R}}^d)} < \infty \), then the projected Milstein method (24) with parameter value \(\alpha = \frac{1}{2(q-1)}\) and with arbitrary upper step size bound \(\overline{h} \in (0,1]\) is strongly convergent of order \(\gamma = 1\).

Next, we come to the second numerical scheme, which is called split-step backward Milstein method (SSBM). For a suitable upper step size bound \(\overline{h} \in (0,T]\) and a given vector of step sizes \(h = (h_1,\ldots ,h_N) \in (0,\overline{h}]^N\), \(N \in {\mathbb {N}}\), this method is defined by setting \(X_h^{\mathrm {SSBM}}(0) = X_0\) and by the recursion

$$\begin{aligned} \overline{X}_h^{\mathrm {SSBM}}(t_i)&= X_h^{\mathrm {SSBM}}(t_{i-1}) + h_i f\left( t_{i}, \overline{X}_h^{\mathrm {SSBM}}(t_i)\right) ,\nonumber \\ X_h^{\mathrm {SSBM}}(t_{i})&= \overline{X}_h^{\mathrm {SSBM}}(t_i) + \sum _{r = 1}^m g^r\left( t_{i}, \overline{X}_{h}^{\mathrm {SSBM}}(t_i)\right) I_{(r)}^{t_{i-1},t_{i}}\nonumber \\&\quad + \sum _{r_1, r_2 = 1}^m g^{r_1,r_2}\left( t_{i}, \overline{X}_{h}^{\mathrm {SSBM}}(t_i)\right) I_{(r_2,r_1)}^{t_{i-1},t_{i}}, \end{aligned}$$
(25)

for every \(i = 1,\ldots ,N\).
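For illustration, a sketch of one SSBM step in the same setting \(m = 1\) as above; solving the implicit drift equation with `scipy.optimize.fsolve` is merely our choice for this sketch and is not prescribed by the scheme (any solver for the equation \(y - h f(t, y) = x\) would do):

```python
import numpy as np
from scipy.optimize import fsolve

def ssbm_step(x, t_prev, t_next, f, g, g11, rng):
    """One step of the split-step backward Milstein method (25) for m = 1."""
    h = t_next - t_prev
    # implicit drift step: solve x_bar = x + h * f(t_next, x_bar)
    x_bar = fsolve(lambda y: y - x - h * f(t_next, y), x)
    I1, I11 = increments_single_noise(t_prev, t_next, rng)
    return x_bar + g(t_next, x_bar) * I1 + g11(t_next, x_bar) * I11
```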

Let us note that the recursion defining the SSBM method evaluates the diffusion coefficient functions \(g^r\) at time \(t_{i}\) in the i-th step. The same is already the case in the definition of the split-step backward Euler method in [2]. It turns out that this modification avoids some technical issues in the proofs, since condition (26) is applied to f and \(g^r\), \(r = 1,\ldots ,m\), simultaneously at the same time point \(t \in [0,T]\). Compare also with inequality (50) further below.

It is shown in Sect. 6 that the SSBM scheme is a well-defined stochastic one-step method under Assumption 2.1. The second main result of this paper is the proof of the following strong convergence result:

Theorem 2.3

Let Assumption 2.1 be satisfied with \(L \in (0, \infty )\) and \(q \in [2,\infty )\). In addition, we assume that there exist \(\eta _1 \in (1,\infty )\) and \(\eta _2 \in (0,\infty )\) such that it holds

$$\begin{aligned}&\big \langle f(t,x_1) - f(t,x_2), x_1-x_2 \big \rangle + \eta _1 \sum _{r = 1}^m \big | g^r(t,x_1) - g^r(t,x_2) \big |^2 \nonumber \\&\quad + \eta _2 \sum _{r_1,r_2 = 1}^m \big | g^{r_1,r_2}(t,x_1) - g^{r_1,r_2}(t,x_2) \big |^2 \le L | x_1 - x_2 |^2 \end{aligned}$$
(26)

for all \(t \in [0,T]\) and all \(x_1, x_2 \in {\mathbb {R}}^d\). If the solution X to (3) satisfies \(\sup _{\tau \in [0,T]} \Vert X(\tau ) \Vert _{L^{6q-4}(\Omega ;{\mathbb {R}}^d)} < \infty \), then the split-step backward Milstein method (25) with arbitrary upper step size bound \(\overline{h} \in (0, \max (L^{-1},\frac{2 \eta _2}{\eta _1}))\) is strongly convergent of order \(\gamma = 1\).

As we show below this theorem follows directly from Theorem 3.5 together with Theorems 6.3 and 7.1. Note that (26) is more restrictive than the global monotonicity condition (4) if the mappings \(g^{r_1,r_2}\) are not globally Lipschitz continuous for all \(r_1,r_2 = 1,\ldots ,m\).

In the remainder of this section we briefly summarize the corresponding convergence results in the case of stochastic differential equations with additive noise, that is, if the mappings \(g^r\), \(r = 1,\ldots ,m\), do not depend explicitly on the state of X. In this case it is well-known that Milstein-type schemes coincide with their Euler-type counterparts.

To be more precise, we consider the solution \(X :[0,T] \times \Omega \rightarrow {\mathbb {R}}^d\) to an SODE of the form

$$\begin{aligned} \,\mathrm {d}X(t)&= f(t,X(t)) \,\mathrm {d}t + \sum _{r=1}^m g^r(t) \,\mathrm {d}W^r(t),\quad t \in [0,T], \nonumber \\ X(0)&=X_0. \end{aligned}$$
(27)

In this case, the conditions on the drift coefficient function \(f:[0,T] \times {\mathbb {R}}^d \rightarrow {\mathbb {R}}^d\) and the diffusion coefficient functions \(g^r :[0,T] \rightarrow {\mathbb {R}}^d\), \(r=1,\ldots ,m\), in Assumption 2.1 simplify as follows:

Assumption 2.4

(Additive noise) The coefficient functions \(f :[0,T] \times {\mathbb {R}}^d \rightarrow {\mathbb {R}}^d\) and \(g^r :[0,T] \rightarrow {\mathbb {R}}^d\), \(r = 1,\ldots ,m\), are continuously differentiable, and there exist constants \(L \in (0,\infty )\), \(q \in [2,\infty )\) such that for all \(t \in [0,T]\) and \(x,x_1,x_2 \in {\mathbb {R}}^d\) the following properties hold

$$\begin{aligned} \big \langle f(t,x_1) - f(t,x_2), x_1-x_2 \big \rangle&\le L | x_1 - x_2 |^2,\\ \big | \tfrac{\partial f}{\partial t} (t,x) \big |&\le L \big ( 1 + |x | \big )^q,\\ \big | \tfrac{\partial f}{\partial x} (t,x_1) - \tfrac{\partial f}{\partial x} (t,x_2) \big |_{\mathcal {L}({\mathbb {R}}^d)}&\le L \big ( 1 + |x_1| + |x_2 | \big )^{q-2} | x_1 - x_2 | . \end{aligned}$$

Under this assumption it follows directly that \(g^{r_1,r_2} \equiv 0\) for all \(r_1,r_2 = 1,\ldots ,m\), where \(g^{r_1,r_2}\) denote the mappings defined in (16). Consequently, the PMil method and the SSBM scheme coincide with the PEM method and the SSBE scheme from [2], respectively.

Let us note that Assumption 2.4 implies the global coercivity condition (21) for every \(p \in [2,\infty )\). Indeed, for all \(t \in [0,T]\) and \(x \in {\mathbb {R}}^d\) we can estimate
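$$\begin{aligned} \big \langle f(t,x), x \big \rangle + \frac{p-1}{2} \sum _{r = 1}^m \big | g^r(t) \big |^2&= \big \langle f(t,x) - f(t,0), x - 0 \big \rangle + \big \langle f(t,0), x \big \rangle + \frac{p-1}{2} \sum _{r = 1}^m \big | g^r(t) \big |^2 \\&\le L |x|^2 + \sup _{s \in [0,T]} \big | f(s,0) \big | \, \big ( 1 + |x|^2 \big ) + \frac{p-1}{2} \sum _{r = 1}^m \sup _{s \in [0,T]} \big | g^r(s) \big |^2 \le C \big ( 1 + |x|^2 \big ), \end{aligned}$$

where we used the one-sided Lipschitz condition from Assumption 2.4 with \(x_2 = 0\), the elementary bound \(|x| \le 1 + |x|^2\), and the fact that the suprema are finite by the continuity of f and \(g^r\) on the compact interval \([0,T]\). Consequently, under Assumption 2.4 the exact solution to (27) has finite p-th moments for every \(p \in [2,\infty )\). From this and Theorems 2.2 and 2.3 we directly obtain the following convergence result: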

Corollary 2.5

Let Assumption 2.4 be satisfied with \(L \in (0, \infty )\) and \(q \in [2,\infty )\). Then it holds that

(i) the projected Euler–Maruyama method with \(\alpha = \frac{1}{2(q-1)}\) and arbitrary upper step size bound \(\overline{h} \in (0,1]\) is strongly convergent of order \(\gamma = 1\);

(ii) the split-step backward Euler method with arbitrary upper step size bound \(\overline{h} \in (0, L^{-1})\) is strongly convergent of order \(\gamma = 1\).

3 A Reminder on Stochastic C-Stability and B-Consistency

In this section we give a brief overview of the notions of stochastic C-stability and B-consistency introduced in [2]. We also state the abstract convergence theorem, which, roughly speaking, can be summarized by

$$\begin{aligned} \text {stoch. C-stability } + \text { stoch. B-consistency} \quad \Rightarrow \quad \text { strong convergence.} \end{aligned}$$

We first recall some additional notation from [2]: For an arbitrary upper step size bound \(\overline{h} \in (0,T]\) we define the set \(\mathbb {T} := \mathbb {T}(\overline{h}) \subset [0,T) \times (0,\overline{h}]\) to be

$$\begin{aligned} \mathbb {T}(\overline{h}) := \big \{ (t,\delta ) \in [0,T) \times (0,\overline{h}] \, : \, t+\delta \le T \big \}. \end{aligned}$$

Further, for a given vector of step sizes \(h \in (0,\overline{h}]^N\), \(N \in {\mathbb {N}}\), we denote by \(\mathcal {G}^{2}(\mathcal {T}_h)\) the space of all adapted and square integrable grid functions, that is

$$\begin{aligned} \mathcal {G}^{2}(\mathcal {T}_h) := \big \{ Z :\mathcal {T}_h \times \Omega \rightarrow {\mathbb {R}}^d\, : \, Z(t_n) \in L^2(\Omega ,{\mathcal {F}}_{t_n},{\mathbf {P}};{\mathbb {R}}^d)\text { for all } n = 0,1,\ldots ,N \big \}. \end{aligned}$$

The next definition describes the abstract class of stochastic one-step methods which we consider in this section.

Definition 3.1

Let \(\overline{h} \in (0,T]\) be an upper step size bound and \(\Psi :{\mathbb {R}}^d \times \mathbb {T} \times \Omega \rightarrow {\mathbb {R}}^d\) be a mapping satisfying the following measurability and integrability condition: For every \((t,\delta ) \in \mathbb {T}\) and \(Z \in L^2(\Omega ,{\mathcal {F}}_{t},{\mathbf {P}};{\mathbb {R}}^d)\) it holds

$$\begin{aligned} \Psi (Z,t,\delta ) \in L^2(\Omega ,{\mathcal {F}}_{t+\delta },{\mathbf {P}};{\mathbb {R}}^d). \end{aligned}$$
(28)

Then, for every vector of step sizes \(h \in (0,\overline{h}]^N\), \(N \in {\mathbb {N}}\), we say that a grid function \(X_h \in \mathcal {G}^2(\mathcal {T}_h)\) is generated by the stochastic one-step method \((\Psi ,\overline{h},\xi )\) with initial condition \(\xi \in L^2(\Omega ,{\mathcal {F}}_{0},{\mathbf {P}};{\mathbb {R}}^d)\) if

$$\begin{aligned} X_h(t_i)&= \Psi (X_h(t_{i-1}), t_{i-1}, h_i), \quad 1 \le i \le N,\nonumber \\ X_h(t_0)&= \xi . \end{aligned}$$
(29)

We call \(\Psi \) the one-step map of the method.

For the formulation of the next definition we denote by \({\mathbb {E}}[ Y | {\mathcal {F}}_t]\) the conditional expectation of a random variable \(Y \in L^1(\Omega ;{\mathbb {R}}^d)\) with respect to the sigma-field \({\mathcal {F}}_t\). Note that if Y is square integrable, then \({\mathbb {E}}[ Y | {\mathcal {F}}_t]\) coincides with the orthogonal projection onto the closed subspace \(L^2(\Omega , {\mathcal {F}}_t, {\mathbf {P}}; {\mathbb {R}}^d)\). By \((\mathrm {id}- {\mathbb {E}}[ \cdot | {\mathcal {F}}_t])\) we denote the associated projector onto the orthogonal complement.

Definition 3.2

A stochastic one-step method \((\Psi ,\overline{h},\xi )\) is called stochastically C-stable (with respect to the norm in \(L^2(\Omega ;{\mathbb {R}}^d)\)) if there exist a constant \(C_{\mathrm {stab}}\) and a parameter value \(\nu \in (1,\infty )\) such that for all \((t,\delta ) \in \mathbb {T}\) and all random variables \(Y, Z \in L^2(\Omega ,{\mathcal {F}}_{t},{\mathbf {P}};{\mathbb {R}}^d)\) it holds

$$\begin{aligned}&\big \Vert {\mathbb {E}}\big [ \Psi (Y,t,\delta ) - \Psi (Z,t,\delta ) | {\mathcal {F}}_{t} \big ] \big \Vert _{L^2(\Omega ;{\mathbb {R}}^d)}^2\nonumber \\&\qquad + \nu \big \Vert \big ( \mathrm {id}- {\mathbb {E}}[ \, \cdot \, | {\mathcal {F}}_{t} ] \big ) \big ( \Psi (Y,t,\delta ) - \Psi (Z,t,\delta ) \big ) \big \Vert ^2_{L^2(\Omega ;{\mathbb {R}}^d)}\nonumber \\&\quad \le \big (1 + C_{\mathrm {stab}} \delta \big ) \big \Vert Y - Z \big \Vert _{L^2(\Omega ;{\mathbb {R}}^d)}^2. \end{aligned}$$
(30)

A first consequence of the notion of stochastic C-stability is the following a priori estimate: Let \((\Psi ,\overline{h},\xi )\) be a stochastically C-stable one-step method. If there exists a constant \(C_0\) such that for all \((t,\delta ) \in \mathbb {T}\) it holds

$$\begin{aligned} \big \Vert {\mathbb {E}}\big [ \Psi (0, t, \delta ) | {\mathcal {F}}_{t} \big ] \big \Vert _{L^2(\Omega ;{\mathbb {R}}^d)}&\le C_0 \delta , \end{aligned}$$
(31)
$$\begin{aligned} \big \Vert \big ( \mathrm {id}- {\mathbb {E}}\big [ \, \cdot \, | {\mathcal {F}}_{t} \big ] \big ) \Psi (0, t, \delta ) \big \Vert _{L^2(\Omega ;{\mathbb {R}}^d)}&\le C_0 \delta ^{\frac{1}{2}}, \end{aligned}$$
(32)

then there exists a positive constant C with

$$\begin{aligned} \max _{n \in \{0,\ldots ,N\}} \Vert X_h(t_n) \Vert _{L^2(\Omega ;{\mathbb {R}}^d)} \le \mathrm {e}^{C T} \Big ( \Vert \xi \Vert _{L^2(\Omega ;{\mathbb {R}}^d)}^2 + C C_0^2 T \Big )^{\frac{1}{2}}, \end{aligned}$$

for every vector of step sizes \(h \in (0,\overline{h}]^N\), \(N \in {\mathbb {N}}\), where \(X_h\) denotes the grid function generated by \((\Psi ,\overline{h},\xi )\) with step sizes h. A proof for this result is found in [2, Cor. 3.6].

Definition 3.3

A stochastic one-step method \((\Psi ,\overline{h},\xi )\) is called stochastically B-consistent of order \(\gamma > 0\) to (3) if there exists a constant \(C_{\mathrm {cons}}\) such that for every \((t,\delta ) \in \mathbb {T}\) it holds

$$\begin{aligned} \big \Vert {\mathbb {E}}\big [ X(t+\delta ) - \Psi (X(t),t,\delta ) | {\mathcal {F}}_{t} \big ] \big \Vert _{L^2(\Omega ;{\mathbb {R}}^d)} \le C_{\mathrm {cons}} \delta ^{ \gamma + 1} \end{aligned}$$
(33)

and

$$\begin{aligned} \big \Vert \big ( \mathrm {id}- {\mathbb {E}}[ \, \cdot \, | {\mathcal {F}}_{t} ] \big ) \big ( X(t + \delta ) - \Psi (X(t),t,\delta ) \big ) \big \Vert _{L^2(\Omega ;{\mathbb {R}}^d)} \le C_{\mathrm {cons}} \delta ^{ \gamma + \frac{1}{2}}, \end{aligned}$$
(34)

where \(X :[0,T] \times \Omega \rightarrow {\mathbb {R}}^d\) denotes the exact solution to (3).

Finally, it remains to give our definition of strong convergence.

Definition 3.4

A stochastic one-step method \((\Psi ,\overline{h},\xi )\) converges strongly with order \(\gamma > 0\) to the exact solution of (3) if there exists a constant C such that for every vector of step sizes \(h \in (0,\overline{h}]^N\), \(N \in {\mathbb {N}}\), it holds

$$\begin{aligned} \max _{n \in \{0,\ldots ,N\}} \big \Vert X_h(t_n) - X(t_n) \big \Vert _{L^{2}(\Omega ;{\mathbb {R}}^d)} \le C |h|^{\gamma }. \end{aligned}$$

Here X denotes the exact solution to (3) and \(X_h \in \mathcal {G}^2(\mathcal {T}_h)\) is the grid function generated by \((\Psi ,\overline{h},\xi )\) with step sizes \(h \in (0,\overline{h}]^N\).
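In numerical experiments this definition is typically checked empirically by regressing observed mean-square errors against step sizes on a log-log scale. A sketch with hypothetical error data (not results from this paper):

```python
import numpy as np

def estimated_order(errors, step_sizes):
    """Least-squares slope of log(error) versus log(step size)."""
    return np.polyfit(np.log(step_sizes), np.log(errors), 1)[0]

# hypothetical errors behaving exactly like C * h, i.e. order gamma = 1
h_vals = np.array([2.0 ** (-k) for k in range(4, 9)])
errors = 0.3 * h_vals
print(estimated_order(errors, h_vals))       # prints 1.0 (approximately)
```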

We close this section with the following abstract convergence theorem, which is proved in [2, Theorem 3.7].

Theorem 3.5

Let the stochastic one-step method \((\Psi ,\overline{h},\xi )\) be stochastically C-stable and stochastically B-consistent of order \(\gamma > 0\). If \(\xi = X_0\), then there exists a constant C depending on \(C_{\mathrm {stab}}\), \(C_\mathrm {cons}\), T, \(\overline{h}\), and \(\nu \) such that for every vector of step sizes \(h \in (0,\overline{h}]^N\), \(N \in {\mathbb {N}}\), it holds

$$\begin{aligned} \max _{n \in \{0,\ldots ,N\}} \big \Vert X(t_n) - X_h(t_n) \big \Vert _{L^2(\Omega ;{\mathbb {R}}^d)} \le C |h|^{\gamma }, \end{aligned}$$

where X denotes the exact solution to (3) and \(X_h\) the grid function generated by \((\Psi ,\overline{h},\xi )\) with step sizes h. In particular, \((\Psi ,\overline{h},\xi )\) is strongly convergent of order \(\gamma \).

4 C-Stability of the Projected Milstein Method

In this section we prove that the projected Milstein (PMil) method defined in (24) is stochastically C-stable.

Throughout this section we assume that Assumption 2.1 is satisfied with growth rate \(q \in [2,\infty )\). First, we choose an arbitrary upper step size bound \(\overline{h} \in (0, 1]\) and a parameter value \(\alpha \in (0,\infty )\). Later it will turn out to be optimal to set \(\alpha = \frac{1}{2(q-1)}\), depending on the growth rate q in Assumption 2.1.

For the definition of the one-step map of the PMil method it is convenient to introduce the following shorthand notation: For every \(\delta \in (0,\overline{h}]\), we denote the projection of \(x \in {\mathbb {R}}^d\) onto the ball of radius \(\delta ^{-\alpha }\) by

$$\begin{aligned} x^\circ := \min (1, \delta ^{-\alpha } |x|^{-1}) x. \end{aligned}$$
(35)

Then, the one-step map \(\Psi ^{\mathrm {PMil}} :{\mathbb {R}}^d \times \mathbb {T} \times \Omega \rightarrow {\mathbb {R}}^d\) is given by

$$\begin{aligned} \Psi ^{\mathrm {PMil}}(x,t,\delta ) := x^\circ + \delta f(t, x^\circ ) + \sum _{r = 1}^m g^r(t, x^\circ ) I_{(r)}^{t,t+\delta } + \sum _{r_1,r_2 = 1}^m g^{r_1,r_2}(t,x^\circ ) I_{(r_2,r_1)}^{t,t+\delta } \end{aligned}$$
(36)

for every \(x \in {\mathbb {R}}^d\) and \((t,\delta ) \in \mathbb {T}\). Recall (22) and (23) for the definition of the stochastic increments.

First, we check that the PMil method is a stochastic one-step method in the sense of Definition 3.1. At the same time we verify that the one-step map satisfies conditions (31) and (32).

Proposition 4.1

Let the functions f and \(g^r\), \(r = 1,\ldots ,m\), satisfy Assumption 2.1 with \(L \in (0,\infty )\) and let \(\overline{h} \in (0,1]\). For every initial value \(\xi \in L^2(\Omega ,{\mathcal {F}}_{0},{\mathbf {P}};{\mathbb {R}}^d)\) and for every \(\alpha \in (0,\infty )\) it holds that \((\Psi ^{\mathrm {PMil}}, \overline{h}, \xi )\) is a stochastic one-step method.

In addition, there exists a constant \(C_0\) only depending on L and m such that

$$\begin{aligned} \big \Vert {\mathbb {E}}\big [ \Psi ^{\mathrm {PMil}}( 0, t,\delta ) | {\mathcal {F}}_{t} \big ] \big \Vert _{L^2(\Omega ;{\mathbb {R}}^d)}&\le C_0 \delta , \end{aligned}$$
(37)
$$\begin{aligned} \big \Vert \big ( \mathrm {id}- {\mathbb {E}}[ \, \cdot \, | {\mathcal {F}}_{t} ] \big ) \Psi ^{\mathrm {PMil}}( 0, t,\delta ) \big \Vert _{L^2(\Omega ;{\mathbb {R}}^d)}&\le C_0 \delta ^{\frac{1}{2}} \end{aligned}$$
(38)

for all \((t,\delta ) \in \mathbb {T}\).

Proof

We first verify that \(\Psi ^{\mathrm {PMil}}\) satisfies (28). For this let us fix arbitrary \((t, \delta ) \in \mathbb {T}\) and \(Z \in L^2(\Omega ,{\mathcal {F}}_t,{\mathbf {P}};{\mathbb {R}}^d)\). Then, the continuity and boundedness of the mapping \({\mathbb {R}}^d \ni x \mapsto x^\circ = \min (1, \delta ^{-\alpha } |x|^{-1} ) x \in {\mathbb {R}}^d\) yield

$$\begin{aligned} Z^\circ \in L^\infty (\Omega ,{\mathcal {F}}_t,{\mathbf {P}};{\mathbb {R}}^d). \end{aligned}$$

Consequently, by the smoothness of the coefficient functions and by (8), (12), and (17) it follows that

$$\begin{aligned} f(t,Z^\circ ),\, g^{r_1}(t,Z^\circ ),\, g^{r_1,r_2}(t,Z^\circ ) \in L^\infty (\Omega ,{\mathcal {F}}_t,{\mathbf {P}};{\mathbb {R}}^d) \end{aligned}$$

for every \(r_1,r_2 = 1,\ldots ,m\). Therefore, \(\Psi ^{\mathrm {PMil}}(Z,t,\delta ) :\Omega \rightarrow {\mathbb {R}}^d\) is an \({\mathcal {F}}_{t+\delta } / {\mathcal {B}}({\mathbb {R}}^d)\)-measurable random variable satisfying condition (28).

It remains to show (37) and (38). From (8) we get immediately that

$$\begin{aligned} \big \Vert {\mathbb {E}}\big [ \Psi ^{\mathrm {PMil}}( 0, t,\delta ) | {\mathcal {F}}_{t} \big ] \big \Vert _{L^2(\Omega ;{\mathbb {R}}^d)} = \big | \delta f(t, 0) \big | \le L \delta . \end{aligned}$$

Next, recall that the stochastic increments \((I_{(r)}^{t,t+\delta })_{r = 1}^m\) and \((I_{(r_1,r_2)}^{t,t+\delta })_{r_1,r_2 = 1}^m\) are pairwise uncorrelated. Therefore, we obtain that

$$\begin{aligned}&\big \Vert \big ( \mathrm {id}- {\mathbb {E}}[ \, \cdot \, | {\mathcal {F}}_{t} ] \big ) \Psi ^{\mathrm {PMil}}( 0, t,\delta ) \big \Vert _{L^2(\Omega ;{\mathbb {R}}^d)}^2\\&\quad = \Big \Vert \sum _{r = 1}^m g^r(t, 0 ) I_{(r)}^{t,t+\delta } + \sum _{r_1,r_2 = 1}^m g^{r_1,r_2}(t, 0 ) I_{(r_1,r_2)}^{t,t+\delta } \Big \Vert ^2_{L^2(\Omega ;{\mathbb {R}}^d)}\\&\quad = \delta \sum _{r = 1}^m \big | g^r(t, 0 ) \big |^2 + \frac{\delta ^2}{2} \sum _{r_1,r_2 = 1}^m \big | g^{r_1,r_2}(t, 0 ) \big |^2 \le L^2 m \delta + \frac{1}{2}L^2 m^2 \delta ^2, \end{aligned}$$

where the last step follows from (12) and (17). Since \(\delta \le \overline{h} \le 1\) this verifies (38). \(\square \)

The next result is concerned with the projection onto the ball of radius \(\delta ^{-\alpha }\). The proof is found in [2, Lem. 6.2].

Lemma 4.2

For every \(\alpha \in (0,\infty )\) and \(\delta \in (0,1]\) the mapping \({\mathbb {R}}^d \ni x \mapsto x^\circ \in {\mathbb {R}}^d\) defined in (35) is globally Lipschitz continuous with Lipschitz constant 1. In particular, it holds

$$\begin{aligned} \big | x^\circ _1 - x^\circ _2 \big | \le \big | x_1- x_2 \big | \end{aligned}$$
(39)

for all \(x_1, x_2 \in {\mathbb {R}}^d\).

The following inequality (40) follows from the global monotonicity condition (4) and plays an important role in the stability analysis of the PMil method. The proof is given in [2, Lem. 6.3].

Lemma 4.3

Let the functions f and \(g^r\), \(r = 1,\ldots ,m\), satisfy Assumption 2.1 with \(L \in (0,\infty )\), \(q \in [2,\infty )\), and \(\eta \in (\frac{1}{2},\infty )\). Consider the mapping \({\mathbb {R}}^d \ni x \mapsto x^\circ \in {\mathbb {R}}^d\) defined in (35) with \(\alpha \in (0, \frac{1}{2(q-1)}]\) and \(\delta \in (0,1]\). Then there exists a constant C only depending on L such that

$$\begin{aligned}&\big | x^\circ _1 - x_2^\circ + \delta \big ( f(t,x_1^\circ ) - f(t,x_2^\circ ) \big ) \big |^2 + 2 \eta \delta \sum _{r = 1}^m \big | g^r(t,x_1^\circ ) - g^r(t,x_2^\circ ) \big |^2\nonumber \\&\qquad \le (1 + C \delta ) | x_1 - x_2 |^2 \end{aligned}$$
(40)

for all \(x_1, x_2 \in {\mathbb {R}}^d\).

The next theorem verifies that the PMil method is stochastically C-stable.

Theorem 4.4

Let the functions f and \(g^r\), \(r = 1,\ldots ,m\), satisfy Assumption 2.1 with \(L \in (0,\infty )\), \(q \in [2,\infty )\), and \(\eta \in (\frac{1}{2},\infty )\). Further, let \(\overline{h} \in (0, 1]\). Then, for every \(\xi \in L^2(\Omega ,{\mathcal {F}}_0,{\mathbf {P}};{\mathbb {R}}^d)\) the projected Milstein method \((\Psi ^{\mathrm {PMil}},\overline{h},\xi )\) with \(\alpha = \frac{1}{2(q-1)}\) is stochastically C-stable.

Proof

Let \((t,\delta ) \in \mathbb {T}\) and \(Y, Z \in L^2(\Omega ,{\mathcal {F}}_t,{\mathbf {P}};{\mathbb {R}}^d)\) be arbitrary. By recalling (36) we obtain

$$\begin{aligned} {\mathbb {E}}\big [ \Psi ^{\mathrm {PMil}}(Y,t,\delta ) - \Psi ^{\mathrm {PMil}}(Z,t,\delta ) | {\mathcal {F}}_{t} \big ] = Y^\circ + \delta f(t, Y^\circ ) - ( Z^\circ + \delta f(t,Z^\circ )) \end{aligned}$$

and

$$\begin{aligned}&\big ( \mathrm {id}- {\mathbb {E}}[ \, \cdot \, | {\mathcal {F}}_{t} ] \big ) \big ( \Psi ^{\mathrm {PMil}}(Y,t,\delta ) - \Psi ^{\mathrm {PMil}}(Z,t,\delta ) \big )\\&\quad = \sum _{r = 1}^m \big ( g^r(t , Y^\circ ) - g^r(t , Z^\circ ) \big ) I_{(r)}^{t,t+\delta } + \sum _{r_1,r_2 = 1}^m \big ( g^{r_1,r_2}(t , Y^\circ ) - g^{r_1,r_2}(t , Z^\circ ) \big ) I_{(r_2,r_1)}^{t,t+\delta }. \end{aligned}$$

In order to verify (30) with \(\nu = 2 \eta \in (1,\infty )\) let us note that the stochastic increments are pairwise uncorrelated and independent of \(Y^\circ \) and \(Z^\circ \). Hence it follows

$$\begin{aligned}&\big \Vert {\mathbb {E}}\big [ \Psi ^{\mathrm {PMil}}(Y,t,\delta ) - \Psi ^{\mathrm {PMil}}(Z,t,\delta ) | {\mathcal {F}}_{t} \big ] \big \Vert ^2_{L^2(\Omega ;{\mathbb {R}}^d)}\\&\qquad + \nu \big \Vert \big ( \mathrm {id}- {\mathbb {E}}[ \, \cdot \, | {\mathcal {F}}_{t} ] \big ) \big ( \Psi ^{\mathrm {PMil}}(Y,t,\delta ) - \Psi ^{\mathrm {PMil}}(Z,t,\delta ) \big ) \big \Vert ^2_{L^2(\Omega ;{\mathbb {R}}^d)}\\&\quad = \big \Vert Y^\circ + \delta f(t, Y^\circ ) - ( Z^\circ + \delta f(t,Z^\circ )) \big \Vert _{L^2(\Omega ;{\mathbb {R}}^d)}^2\\&\qquad + \nu \delta \sum _{r = 1}^m \big \Vert g^r(t , Y^\circ ) - g^r(t , Z^\circ ) \big \Vert ^2_{L^2(\Omega ;{\mathbb {R}}^d)}\\&\qquad + \frac{1}{2}\nu \delta ^2 \sum _{r_1,r_2 = 1}^m \big \Vert g^{r_1,r_2}(t , Y^\circ ) - g^{r_1,r_2}(t , Z^\circ ) \big \Vert ^2_{L^2(\Omega ;{\mathbb {R}}^d)}. \end{aligned}$$

An application of Lemma 4.3 with \(\nu = 2 \eta \) shows that the first two terms are dominated by

$$\begin{aligned}&{\mathbb {E}}\Big [ \big | Y^\circ + \delta f(t, Y^\circ ) - ( Z^\circ + \delta f(t,Z^\circ )) \big |^2 + \nu \delta \sum _{r = 1}^m \big | g^r(t , Y^\circ ) - g^r(t , Z^\circ ) \big |^2 \Big ]\\&\quad \le (1 + C \delta ) \big \Vert Y - Z \big \Vert ^2_{L^2(\Omega ;{\mathbb {R}}^d)}. \end{aligned}$$

In addition, applications of (18) and (39) yield

$$\begin{aligned} \big | g^{r_1,r_2}(t,x_1^\circ ) - g^{r_1,r_2}(t,x_2^\circ ) \big |&\le L \big ( 1 + |x_1^\circ | + |x_2^\circ | \big )^{q-1} \big |x_1^\circ - x_2^\circ \big | \\&\le L \big ( 1 + 2 \delta ^{-\alpha } \big )^{q-1} \big |x_1 - x_2 \big |, \end{aligned}$$

where we made use of the fact that \(|x_1^\circ |, |x_2^\circ | \le \delta ^{-\alpha }\). Since \(\alpha (q-1) = \frac{1}{2}\) and \(\delta \in (0,1]\) it follows \(\delta ^{\frac{1}{2}}( 1 + 2 \delta ^{-\alpha } )^{q-1} \le 3^{q-1}\) and, therefore,

$$\begin{aligned} \nu \delta ^2 \sum _{r_1,r_2 = 1}^m \big \Vert g^{r_1,r_2}(t , Y^\circ ) - g^{r_1,r_2}(t , Z^\circ ) \big \Vert ^2_{L^2(\Omega ;{\mathbb {R}}^d)} \le 3^{2(q-1)} \nu m^2 L^2 \delta \big \Vert Y - Z \big \Vert ^2_{L^2(\Omega ;{\mathbb {R}}^d)}. \end{aligned}$$

This completes the proof. \(\square \)

5 B-Consistency of the Projected Milstein Method

In this section we show that the PMil method is stochastically B-consistent of order \(\gamma = 1\). To be more precise, we prove the following result:

Theorem 5.1

Let f and \(g^r\), \(r = 1,\ldots ,m\), satisfy Assumption 2.1 with \(L \in (0,\infty )\) and \(q \in [2,\infty )\). Let \(\overline{h} \in (0,1]\) be arbitrary. If the exact solution X to (3) satisfies \(\sup _{\tau \in [0,T]} \Vert X(\tau ) \Vert _{L^{8q-6}(\Omega ;{\mathbb {R}}^d)} < \infty \), then the projected Milstein method \((\Psi ^{\mathrm {PMil}},\overline{h},X_0)\) with \(\alpha = \frac{1}{2(q-1)}\) is stochastically B-consistent of order \(\gamma = 1\).

In preparation for the proof of Theorem 5.1 we introduce several more technical lemmas. The first is cited from [2, Lemma 6.5]. It formalizes a method of proof already found in [7, Theorem 2.2].

Lemma 5.2

For arbitrary \(\alpha \in (0,\infty )\) and \(\delta \in (0,1]\) consider the mapping \({\mathbb {R}}^d \ni x \mapsto x^\circ \in {\mathbb {R}}^d\) defined in (35). Let \(L \in (0,\infty )\), \(\kappa \in [1, \infty )\) and let \(\varphi :{\mathbb {R}}^d \rightarrow {\mathbb {R}}^d\) be a measurable mapping which satisfies

$$\begin{aligned} |\varphi (x) | \le L \big (1 + |x|^\kappa \big ) \end{aligned}$$

for all \(x \in {\mathbb {R}}^d\). For some \(p \in (2,\infty )\) let \(Y \in L^{p\kappa }(\Omega ;{\mathbb {R}}^d)\). Then there exists a constant C only depending on L and p with

$$\begin{aligned} \big \Vert \varphi (Y) - \varphi (Y^\circ ) \big \Vert _{L^2(\Omega ;{\mathbb {R}}^d)} \le C \big ( 1 + \Vert Y\Vert _{L^{p\kappa }(\Omega ;{\mathbb {R}}^d)}^{\frac{1}{2}p \kappa } \big ) \delta ^{\frac{1}{2} \alpha (p -2 ) \kappa }. \end{aligned}$$

The proof of consistency also depends on the Hölder continuity of the exact solution to (3) with respect to the norm in \(L^p(\Omega ;{\mathbb {R}}^d)\) for some \(p \in [2,\infty )\). A proof is given in [2, Proposition 5.4].

Proposition 5.3

Let f and \(g^r\), \(r = 1,\ldots ,m\), satisfy Assumption 2.1 with \(L \in (0,\infty )\) and \(q \in [2,\infty )\). For every \(p \in [2,\infty )\) there exists a constant \(C=C(L,q,p)\) such that

$$\begin{aligned} \big \Vert X(t_1) - X(t_2) \big \Vert _{L^p(\Omega ;{\mathbb {R}}^d)} \le C \left( 1 + \sup _{t \in [0,T]} \Vert X(t)\Vert _{L^{pq}(\Omega ;{\mathbb {R}}^d)}^q \right) | t_1 - t_2|^{\frac{1}{2}} \end{aligned}$$
(41)

holds for all \(t_1, t_2 \in [0,T]\) and for every solution \(X :[0,T] \times \Omega \rightarrow {\mathbb {R}}^d\) to the SODE (3) satisfying \(\sup _{t \in [0,T]}\Vert X(t)\Vert _{L^{pq}(\Omega ;{\mathbb {R}}^d)} < \infty \).

The next auxiliary result combines the Hölder estimate (41) with polynomially growing weight functions.

Proposition 5.4

Let \(q_1 \ge 0, q_2 >0\) and consider an \({\mathbb {R}}^d\)-valued process \(X(t),t\in [0,T]\) satisfying (41) for \(pq=2(q q_2+ q_1)\) and \( \sup _{t\in [0,T]} \Vert X(t)\Vert _{L^{pq}(\Omega ;{\mathbb {R}}^d)} < \infty \). Then there exists a constant C such that for all \(0 \le t_1\le t_2 \le T\)

$$\begin{aligned}&\Vert \big (1 + |X(t_1)|+|X(t_2)|\big )^{q_1} \big | X(t_1) - X(t_2) \big |^{q_2} \Vert _{L^2(\Omega ;{\mathbb {R}})}\nonumber \\&\quad \le C \left( 1 + \sup _{t \in [0,T]} \Vert X(t)\Vert _{L^{pq}(\Omega ;{\mathbb {R}}^d)}^{\frac{pq}{2}} \right) | t_1 - t_2|^{\frac{q_2}{2}}. \end{aligned}$$
(42)

Proof

We apply a Hölder estimate with arbitrary \(\nu > 1\), \(\nu '=\frac{\nu }{\nu -1}\) and use (41),

$$\begin{aligned}&\Vert \left( 1 + |X(t_1)|+|X(t_2)|\right) ^{q_1} \big | X(t_1) - X(t_2) \big |^{q_2} \Vert _{L^2(\Omega ;{\mathbb {R}})}\\&\quad \le C \Vert 1+|X(t_1)|+ |X(t_2)| \Vert _{L^{2 \nu ' q_1}(\Omega ;{\mathbb {R}})}^{q_1} \Vert X(t_2)-X(t_1)\Vert _{L^{2 \nu q_2 }(\Omega ;{\mathbb {R}}^d)}^{q_2}\\&\quad \le C \left( 1+\sup _{t\in [0,T]}\Vert X(t) \Vert _{L^{2 \nu ' q_1}(\Omega ;{\mathbb {R}}^d)}^{q_1}\right) \left( 1+\sup _{t\in [0,T]}\Vert X(t) \Vert _{L^{2\nu q q_2 }(\Omega ;{\mathbb {R}}^d)}^{q q_2}\right) |t_1 -t_2|^{\frac{q_2}{2}} . \end{aligned}$$

The norms are balanced if we choose \(2 \nu 'q_1 = 2 q \nu q_2\), which leads to \(\nu = 1 + \frac{q_1}{q q_2}\) and \(2 \nu q q_2= 2(q q_2 + q_1)\). This shows our assertion for \(q_1 >0\). In the case \(q_1=0\) it is enough to apply (41) directly with \(p=2 q_2\). \(\square \)

The following lemma is quoted from [2, Lemma 5.5].

Lemma 5.5

Let Assumption 2.1 be satisfied by f and \(g^r\), \(r = 1,\ldots ,m\), with \(L \in (0,\infty )\) and \(q \in [2,\infty )\). Further, let the exact solution X to the SODE (3) satisfy \(\sup _{t \in [0,T]} \Vert X(t) \Vert _{L^{4q-2}(\Omega ;{\mathbb {R}}^d)} < \infty \). Then, there exists a constant C such that for all \(t_1, t_2,s \in [0,T]\) with \(0 \le t_1 \le s \le t_2 \le T\) it holds

$$\begin{aligned}&\int _{t_1}^{t_2} \big \Vert f(\tau ,X(\tau )) - f(s,X(t_1)) \big \Vert _{L^2(\Omega ;{\mathbb {R}}^d)} \,\mathrm {d}\tau \\&\quad \le C \left( 1 + \sup _{t \in [0,T]} \big \Vert X(t) \big \Vert _{L^{4q-2}(\Omega ;{\mathbb {R}}^d)}^{2q-1} \right) |t_1 - t_2|^{\frac{3}{2}}. \end{aligned}$$

The order of convergence indicated by Lemma 5.5 can be increased if we insert the conditional expectation with respect to the \(\sigma \)-field \({\mathcal {F}}_{t_1}\):

Lemma 5.6

Let Assumption 2.1 be satisfied by f and \(g^r\), \(r = 1,\ldots ,m\) with \(L \in (0,\infty )\) and \(q \in [2,\infty )\). Further, let the exact solution X to the SODE (3) satisfy \(\sup _{t \in [0,T]} \Vert X(t) \Vert _{L^{6q-4}(\Omega ;{\mathbb {R}}^d)} < \infty \). Then, there exists a constant C such that for all \(t_1, t_2,s \in [0,T]\) with \(0 \le t_1 \le s \le t_2 \le T\) it holds

$$\begin{aligned}&\int _{t_1}^{t_2} \big \Vert {\mathbb {E}}\big [ f(\tau ,X(\tau )) - f(s,X(t_1)) | {\mathcal {F}}_{t_1} \big ] \big \Vert _{L^2(\Omega ;{\mathbb {R}}^d)} \,\mathrm {d}\tau \\&\quad \le C \left( 1 + \sup _{t \in [0,T]} \big \Vert X(t) \big \Vert _{L^{6q-4}(\Omega ;{\mathbb {R}}^d)}^{3q-2} \right) |t_1 - t_2|^{2}. \end{aligned}$$

Proof

Since \(\Vert {\mathbb {E}}[ Y | {\mathcal {F}}_{t_1}] \Vert _{L^2(\Omega ;{\mathbb {R}}^d)} \le \Vert Y \Vert _{L^2(\Omega ;{\mathbb {R}}^d)}\) for all \(Y \in L^2(\Omega ;{\mathbb {R}}^d)\) the integrand is estimated by

$$\begin{aligned}&\big \Vert {\mathbb {E}}\big [ f(\tau ,X(\tau )) - f(s,X(t_1)) | {\mathcal {F}}_{t_1} \big ] \big \Vert _{L^2(\Omega ;{\mathbb {R}}^d)}\\&\quad \le \big \Vert f(\tau ,X(\tau )) - f(s,X(\tau )) \big \Vert _{L^2(\Omega ;{\mathbb {R}}^d)} + \big \Vert {\mathbb {E}}\big [ f(s,X(\tau )) - f(s,X(t_1)) | {\mathcal {F}}_{t_1} \big ] \big \Vert _{L^2(\Omega ;{\mathbb {R}}^d)} \end{aligned}$$

for every \(\tau \in [t_1,t_2]\). From (10) it follows that

$$\begin{aligned} \big \Vert f(\tau ,X(\tau )) - f(s,X(\tau )) \big \Vert _{L^2(\Omega ;{\mathbb {R}}^d)}&\le L \Vert 1 + | X(\tau )| \Vert _{L^{2q}(\Omega ;{\mathbb {R}})}^q |\tau - s | \nonumber \\&\le C \left( 1 + \sup _{t \in [0,T]} \Vert X(t) \Vert _{L^{2q}(\Omega ;{\mathbb {R}}^d)}^q \right) |t_2 - t_1 |, \end{aligned}$$
(43)

which, after integrating over \(\tau \), yields the desired estimate since \(2q \le 6q-4\) for \(q \ge 1\).

Next, from the mean value theorem we obtain

$$\begin{aligned} f(s,X(\tau )) - f(s,X(t_1)) = \frac{\partial f}{\partial x}(s, X(t_1)) \big ( X(\tau ) - X(t_1) \big ) + R_f, \end{aligned}$$

where the remainder term \(R_f\) is given by

$$\begin{aligned} R_f= \int _{0}^{1} \Big ( \frac{\partial f}{\partial x}\big (s, X(t_1) + \rho (X(\tau ) - X(t_1)) \big ) - \frac{\partial f}{\partial x}(s, X(t_1)) \Big ) \,\mathrm {d}\rho \, \big ( X(\tau ) - X(t_1) \big ). \end{aligned}$$

Using the SODE (3) we obtain

$$\begin{aligned} {\mathbb {E}}\Big [ \frac{\partial f}{\partial x}(s, X(t_1)) \big ( X(\tau ) - X(t_1) \big ) \Big | {\mathcal {F}}_{t_1} \Big ] = {\mathbb {E}}\Big [ \frac{\partial f}{\partial x}(s, X(t_1)) \int _{t_1}^{\tau } f(\sigma ,X(\sigma ) ) \,\mathrm {d}\sigma \Big | {\mathcal {F}}_{t_1} \Big ]. \end{aligned}$$

After taking the \(L^2\)-norm and inserting (8) and (9) we arrive at

$$\begin{aligned}&\Big \Vert {\mathbb {E}}\Big [ \frac{\partial f}{\partial x}(s, X(t_1)) \big ( X(\tau ) - X(t_1) \big ) \Big | {\mathcal {F}}_{t_1} \Big ] \Big \Vert _{L^2(\Omega ;{\mathbb {R}}^d)}\nonumber \\&\quad \le \int _{t_1}^{\tau } \big \Vert L ( 1 + |X(t_1)| \big )^{q-1} L \big ( 1 + |X(\sigma )| \big )^{q} \big \Vert _{L^2(\Omega ;{\mathbb {R}})} \,\mathrm {d}\sigma \nonumber \\&\quad \le C \left( 1 + \sup _{t \in [0,T]} \big \Vert X(t) \big \Vert _{L^{4q-2}(\Omega ;{\mathbb {R}}^d)}^{2q-1} \right) |\tau - t_1|. \end{aligned}$$
(44)

Hence, we also obtain the desired estimate for this term after integrating over \(\tau \).

Finally, we have to estimate the \(L^2\)-norm of the remainder term \(R_f\). For this we make use of (5) and get

$$\begin{aligned} | R_f |\le & {} \int _0^1 L \big ( 1 + | X(t_1) + \rho (X(\tau ) - X(t_1)) | + | X(t_1) | \big )^{q-2} \,\mathrm {d}\rho \, \big | X(\tau ) - X(t_1) \big |^2\nonumber \\\le & {} C \big ( 1 + \big | X(t_1) \big | + \big | X(\tau ) \big | \big )^{q-2} \big | X(\tau ) - X(t_1) \big |^2 \end{aligned}$$
(45)

for a constant C only depending on L and q. Applying Proposition 5.4 with \(q_2=2, q_1=q-2\) yields \(2(q q_2 + q_1) = 6q-4\) and therefore,

$$\begin{aligned} \Vert R_f \Vert _{L^2(\Omega ;{\mathbb {R}}^d)} \le C \left( 1 + \sup _{t \in [0,T]} \Vert X(t) \Vert ^{3q - 2}_{L^{6q -4}(\Omega ;{\mathbb {R}}^d)} \right) |\tau - t_1|. \end{aligned}$$
(46)

\(\square \)

The next lemma contains the corresponding estimate for the stochastic integral.

Lemma 5.7

Let Assumption 2.1 be satisfied by f and \(g^r\), \(r = 1,\ldots ,m\) with \(L \in (0,\infty )\) and \(q \in [2,\infty )\). Further, let the exact solution X to (3) satisfy \(\sup _{t \in [0,T]} \Vert X(t) \Vert _{L^{6q-4}(\Omega ;{\mathbb {R}}^d)} < \infty \). Then, there exists a constant C such that for all \(r =1,\ldots ,m\) and \(t_1, t_2,s \in [0,T]\) with \(0 \le t_1 \le s \le t_2 \le T\) it holds

$$\begin{aligned}&\Big \Vert \int _{t_1}^{t_2} g^{r}(\tau ,X(\tau )) - g^{r}(s,X(t_1)) \,\mathrm {d}W^{r}(\tau ) - \sum _{r_2 = 1}^m g^{r,r_2}(s,X(t_1)) I_{(r_2,r)}^{t_1,t_2} \Big \Vert _{L^2(\Omega ;{\mathbb {R}}^d)}\\&\quad \le C \left( 1 + \sup _{t \in [0,T]} \big \Vert X(t) \big \Vert _{L^{6q-4}(\Omega ;{\mathbb {R}}^d)}^{3q - 2} \right) |t_1 - t_2|^{\frac{3}{2}}. \end{aligned}$$

Proof

Let us fix an arbitrary \(r \in \{1,\ldots ,m\}\). We first consider the square of the \(L^2\)-norm and by recalling (22) and (23) we get

$$\begin{aligned}&{\mathbb {E}}\Big [ \Big | \int _{t_1}^{t_2} \Big ( g^{r}(\tau ,X(\tau )) - g^{r}(s,X(t_1)) - \sum _{r_2 = 1}^m g^{r,r_2}(s,X(t_1)) I_{(r_2)}^{t_1,\tau }\Big ) \,\mathrm {d}W^{r}(\tau ) \Big |^2 \Big ]\\&\quad = \int _{t_1}^{t_2} {\mathbb {E}}\Big [ \Big | g^{r}(\tau ,X(\tau )) - g^{r}(s,X(t_1)) - \sum _{r_2 = 1}^m g^{r,r_2}(s,X(t_1)) I_{(r_2)}^{t_1,\tau } \Big |^2 \Big ] \,\mathrm {d}\tau \end{aligned}$$

by an application of the Itō isometry. Thus, the assertion is proved if there exists a constant C independent of \(\tau \), \(t_1\), \(t_2\), and s such that

$$\begin{aligned} \Gamma (\tau )&:= \Big \Vert g^{r}(\tau ,X(\tau )) - g^{r}(s,X(t_1)) - \sum _{r_2 = 1}^m g^{r,r_2}(s,X(t_1)) I_{(r_2)}^{t_1,\tau } \Big \Vert _{L^2(\Omega ;{\mathbb {R}}^d)}\\&\le C \left( 1 + \sup _{t \in [0,T]} \big \Vert X(t) \big \Vert _{L^{6q-4}(\Omega ;{\mathbb {R}}^d)}^{3q-2} \right) |t_1 - t_2| \end{aligned}$$

for every \(\tau \in [t_1,t_2]\). For this we first estimate \(\Gamma (\tau )\) by

$$\begin{aligned} \Gamma (\tau )&\le \big \Vert g^{r}(\tau ,X(\tau )) - g^{r}(s,X(\tau )) \big \Vert _{L^2(\Omega ;{\mathbb {R}}^d)} \\&\quad + \Big \Vert g^{r}(s,X(\tau )) - g^{r}(s,X(t_1)) - \sum _{r_2 = 1}^m g^{r,r_2}(s,X(t_1)) I_{(r_2)}^{t_1,\tau } \Big \Vert _{L^2(\Omega ;{\mathbb {R}}^d)}. \end{aligned}$$

In the same way as in (43) one shows for the first term

$$\begin{aligned} \big \Vert g^{r}(\tau ,X(\tau )) - g^{r}(s,X(\tau )) \big \Vert _{L^2(\Omega ;{\mathbb {R}}^d)} \le C \left( 1 + \sup _{t\in [0,T]} \Vert X(t) \Vert ^{\frac{q+1}{2}}_{L^{q+1}(\Omega ;{\mathbb {R}}^d)} \right) |t_2 - t_1 | \end{aligned}$$

and notes \(q+1 \le 6 q -4\). Next, we again apply the mean value theorem

$$\begin{aligned} g^r(s,X(\tau )) - g^r(s,X(t_1)) = \frac{\partial g^r}{\partial x}(s, X(t_1)) \big ( X(\tau ) - X(t_1) \big ) + R_g, \end{aligned}$$

where this time the remainder term \(R_g\) is given by

$$\begin{aligned} R_g:= \int _{0}^{1} \Big ( \frac{\partial g^r}{\partial x}\big (s, X(t_1) + \rho (X(\tau ) - X(t_1)) \big ) - \frac{\partial g^r}{\partial x}(s, X(t_1)) \Big ) \,\mathrm {d}\rho \, \big ( X(\tau ) - X(t_1) \big ). \end{aligned}$$

Using the condition (6) we get

$$\begin{aligned} |R_g| \le C \big ( 1+ |X(t_1)|+ |X(\tau )| \big )^{q_1} |X(\tau )-X(t_1)|^2, \quad \text {where} \;q_1= \tfrac{(q-3)_+}{2}. \end{aligned}$$

Therefore, Proposition 5.4 applies with \(q_2=2\) and leads to

$$\begin{aligned} \Vert R_g \Vert _{L^2(\Omega ;{\mathbb {R}}^d)} \le C \left( 1 + \sup _{t \in [0,T]} \Vert X(t) \Vert ^{3q-2}_{L^{6q-4}(\Omega ;{\mathbb {R}}^d)} \right) |\tau - t_1|, \end{aligned}$$

since \(2(q q_2+q_1)=\max (5q-3,4q)\le 6q-4\) for \(q\ge 2\). It remains to give a corresponding estimate for

$$\begin{aligned} \Gamma _2(\tau ) := \Big \Vert \frac{\partial g^r}{\partial x}(s, X(t_1)) \big ( X(\tau ) - X(t_1) \big ) - \sum _{r_2 = 1}^m g^{r,r_2}(s,X(t_1)) I_{(r_2)}^{t_1,\tau } \Big \Vert _{L^2(\Omega ;{\mathbb {R}}^d)}. \end{aligned}$$

After inserting (19) we finally arrive at the two terms

$$\begin{aligned} \Gamma _2(\tau )&\le \Big \Vert \frac{\partial g^r}{\partial x}(s, X(t_1)) \int _{t_1}^{\tau } f(\sigma ,X(\sigma )) \,\mathrm {d}\sigma \Big \Vert _{L^2(\Omega ;{\mathbb {R}}^d)}\\&\quad + \sum _{r_2 = 1}^m \Big \Vert \frac{\partial g^r}{\partial x}(s, X(t_1)) \int _{t_1}^{\tau } g^{r_2}(\sigma ,X(\sigma )) \,\mathrm {d}W^{r_2}(\sigma ) - g^{r,r_2}(s,X(t_1)) I_{(r_2)}^{t_1,\tau } \Big \Vert _{L^2(\Omega ;{\mathbb {R}}^d)}. \end{aligned}$$

Using (13), the first term is estimated analogously to (44),

$$\begin{aligned}&\Big \Vert \frac{\partial g^r}{\partial x}(s, X(t_1)) \int _{t_1}^{\tau } f(\sigma ,X(\sigma )) \,\mathrm {d}\sigma \Big \Vert _{L^2(\Omega ;{\mathbb {R}}^d)}\\&\quad \le C \Big (1+\sup _{t\in [0,T]}\Vert X(t)\Vert _{L^{3q-1}(\Omega ;{\mathbb {R}}^d)}^{\frac{3q-1}{2}} \Big )|\tau -t_1|. \end{aligned}$$

For the second term we insert (16) and (22) and obtain from Itō’s isometry

$$\begin{aligned}&\sum _{r_2 = 1}^m \Big \Vert \frac{\partial g^r}{\partial x}(s, X(t_1)) \int _{t_1}^{\tau } g^{r_2}(\sigma ,X(\sigma )) \,\mathrm {d}W^{r_2}(\sigma ) - g^{r,r_2}(s,X(t_1)) I_{(r_2)}^{t_1,\tau } \Big \Vert _{L^2(\Omega ;{\mathbb {R}}^d)}\\&\quad = \sum _{r_2 = 1}^m \Big \Vert \int _{t_1}^{\tau } \frac{\partial g^r}{\partial x}(s, X(t_1)) \big (g^{r_2}(\sigma ,X(\sigma )) - g^{r_2}(s,X(t_1)) \big ) \,\mathrm {d}W^{r_2}(\sigma ) \Big \Vert _{L^2(\Omega ;{\mathbb {R}}^d)}\\&\quad = \sum _{r_2 = 1}^m \Big ( \int _{t_1}^{\tau } \Big \Vert \frac{\partial g^r}{\partial x}(s, X(t_1)) \big (g^{r_2}(\sigma ,X(\sigma )) - g^{r_2}(s,X(t_1)) \big ) \Big \Vert _{L^2(\Omega ;{\mathbb {R}}^d)}^2 \,\mathrm {d}\sigma \Big )^{\frac{1}{2}}. \end{aligned}$$

Now, it follows from (14) and (15) that

$$\begin{aligned}&\big | g^{r_2}(\sigma ,X(\sigma )) - g^{r_2}(s,X(t_1)) \big |\\&\quad \le \big | g^{r_2}(\sigma ,X(\sigma )) - g^{r_2}(\sigma ,X(t_1)) \big | + \big | g^{r_2}(\sigma ,X(t_1)) - g^{r_2}(s,X(t_1)) \big |\\&\quad \le L \big ( 1 + |X(t_1)| + |X(\sigma )| \big )^{\frac{q-1}{2}} |X(\sigma ) -X(t_1) | + L \big ( 1 + |X(t_1)| \big )^{\frac{q+1}{2}} |\sigma - s |. \end{aligned}$$

Hence, the growth estimate (13) and Proposition 5.4 with \(q_1=q-1,q_2=1\) yield

$$\begin{aligned}&\Big \Vert \frac{\partial g^r}{\partial x}(s, X(t_1)) \big (g^{r_2}(\sigma ,X(\sigma )) - g^{r_2}(s,X(t_1)) \big ) \Big \Vert _{L^2(\Omega ;{\mathbb {R}}^d)} \\&\quad \le L \big \Vert \big (1 + |X(t_1)|)^{\frac{q-1}{2}} \big | g^{r_2}(\sigma ,X(\sigma )) - g^{r_2}(s,X(t_1)) \big | \big \Vert _{L^2(\Omega ,{\mathbb {R}})}\\&\quad \le L^2 \big \Vert \big (1 + |X(t_1)| + |X(\sigma )|\big )^{q-1} \big | X(\sigma ) - X(t_1) \big |\big \Vert _{L^2(\Omega ,{\mathbb {R}})}\\&\qquad + L^2 \Vert \big (1+|X(t_1)|\big )^q \Vert _{L^2(\Omega ,{\mathbb {R}})} |\sigma -s| \\&\quad \le C \Big ( \big (1+ \sup _{t\in [0,T]}\Vert X(t) \Vert _{L^{4q-2}(\Omega ,{\mathbb {R}}^d)}^{2q-1}\big ) |\sigma - t_1|^{\frac{1}{2}} \\&\qquad + \big (1+\sup _{t\in [0,T]}\Vert X(t) \Vert _{L^{2q}(\Omega ,{\mathbb {R}}^d)}^{q}\big )|\sigma -s| \Big ). \end{aligned}$$

To sum up, we have shown

$$\begin{aligned} \Gamma _2(\tau ) \le C \big ( 1 + \sup _{t \in [0,T]} \Vert X(t) \Vert _{L^{4q - 2}(\Omega ;{\mathbb {R}}^d)}^{2q-1} \big ) |t_1 - t_2|. \end{aligned}$$

Since \(4q-2 \le 6q-4\), this completes the proof. \(\square \)

The proof shows that it is sufficient to have bounds for moments of order \(\max (5q-3,4q)\) instead of \(6q-4\). However, in view of the weaker estimate in Lemma 5.6, this does not improve the result of Theorem 5.1.

Now we are well-prepared for the proof of Theorem 5.1.

Proof of Theorem 5.1

We first verify (33) with \(\gamma = 1\) for the PMil method. For this, let \((t,\delta ) \in \mathbb {T}\) be arbitrary. After inserting (19) and (36) we obtain

$$\begin{aligned}&\big \Vert {\mathbb {E}}\big [ X(t + \delta ) - \Psi ^{\mathrm {PMil}}(X(t),t,\delta ) \big | {\mathcal {F}}_{t} \big ] \big \Vert _{L^2(\Omega ;{\mathbb {R}}^d)}\\&\quad = \Big \Vert {\mathbb {E}}\Big [ X(t) + \int _{t}^{t+\delta } f(\tau ,X(\tau )) \,\mathrm {d}\tau - X^{\circ }(t) - \delta f(t,X^{\circ }(t)) \Big | {\mathcal {F}}_{t} \Big ] \Big \Vert _{L^2(\Omega ;{\mathbb {R}}^d)}\\&\quad \le \big \Vert X(t) - X^{\circ }(t) \big \Vert _{L^2(\Omega ;{\mathbb {R}}^d)} + \delta \big \Vert f(t,X(t)) - f(t,X^{\circ }(t)) \big \Vert _{L^2(\Omega ;{\mathbb {R}}^d)}\\&\qquad + \int _{t}^{t+\delta } \big \Vert {\mathbb {E}}\big [ f(\tau ,X(\tau )) - f(t,X(t)) \big | {\mathcal {F}}_t \big ] \big \Vert _{L^2(\Omega ;{\mathbb {R}}^d)} \,\mathrm {d}\tau . \end{aligned}$$

By applying Lemma 5.2 with \(\varphi = \mathrm {id}\), \(\kappa = 1\), \(p = 8q - 6\), and \(\alpha = \frac{1}{2(q-1)}\) we obtain

$$\begin{aligned} \big \Vert X(t) - X^{\circ }(t) \big \Vert _{L^2(\Omega ;{\mathbb {R}}^d)} \le C \big ( 1 + \big \Vert X(t) \big \Vert _{L^{8q - 6}(\Omega ;{\mathbb {R}}^d)}^{4q - 3} \big ) \delta ^2, \end{aligned}$$

since \(\frac{1}{2}\alpha (p-2)\kappa = \frac{8q-8}{4(q-1)} = 2\). Similarly, we estimate the second term by Lemma 5.2 with \(\varphi = f(t,\cdot )\), \(\kappa = q\), and \(p = 6 - \frac{4}{q}\). Since in this case \(\frac{1}{2}\alpha (p-2)\kappa = 1\) we get

$$\begin{aligned} \delta \big \Vert f(t,X(t)) - f(t,X^{\circ }(t)) \big \Vert _{L^2(\Omega ;{\mathbb {R}}^d)} \le C \big ( 1 + \big \Vert X(t) \big \Vert _{L^{6q - 4}(\Omega ;{\mathbb {R}}^d)}^{3q-2} \big ) \delta ^2. \end{aligned}$$
(47)

The last term is estimated by Lemma 5.6 with \(t_1 = s = t\) and \(t_2= t+\delta \),

$$\begin{aligned}&\int _{t}^{t+\delta } \big \Vert {\mathbb {E}}\big [ f(\tau ,X(\tau )) - f(t,X(t)) \big | {\mathcal {F}}_t \big ] \big \Vert _{L^2(\Omega ;{\mathbb {R}}^d)} \,\mathrm {d}\tau \\&\quad \le C \left( 1 + \sup _{t\in [0,T]} \big \Vert X(t) \big \Vert _{L^{6q - 4}(\Omega ;{\mathbb {R}}^d)}^{3q-2} \right) \delta ^2. \end{aligned}$$

This completes the proof of (33). For the proof of (34) we first insert (19) and (36). Then, in the same way as above we obtain the following four terms

$$\begin{aligned}&\big \Vert ( \mathrm {id}- {\mathbb {E}}[\,\cdot \,|{\mathcal {F}}_t]) \big ( X(t + \delta ) - \Psi ^{\mathrm {PMil}}(X(t),t,\delta ) \big ) \big \Vert _{L^2(\Omega ;{\mathbb {R}}^d)}\\&\le \sum _{r = 1}^m \big \Vert \big ( g^r(t,X(t)) - g^r(t,X^{\circ }(t)) \big ) I_{(r)}^{t,t+\delta } \big \Vert _{L^2(\Omega ;{\mathbb {R}}^d)} \\&\quad +\sum _{r,r_2 = 1}^m \big \Vert \big ( g^{r,r_2}(t,X(t)) - g^{r,r_2}(t,X^{\circ }(t)) \big ) I_{(r,r_2)}^{t,t+\delta } \big \Vert _{L^2(\Omega ;{\mathbb {R}}^d)}\\&\quad +\Big \Vert ( \mathrm {id}- {\mathbb {E}}[\,\cdot \,|{\mathcal {F}}_t]) \int _{t}^{t+\delta } f(\tau ,X(\tau )) \,\mathrm {d}\tau \Big \Vert _{L^2(\Omega ;{\mathbb {R}}^d)}\\&\quad + \sum _{r = 1}^m \Big \Vert \int _{t}^{t + \delta } g^{r}(\tau ,X(\tau )) - g^{r}(t,X(t)) \,\mathrm {d}W^{r}(\tau ) - \sum _{r_2 = 1}^m g^{r,r_2}(t,X(t)) I_{(r_2,r)}^{t,t+\delta } \Big \Vert _{L^2(\Omega ;{\mathbb {R}}^d)}. \end{aligned}$$

Since the stochastic increment \(I_{(r)}^{t,t+\delta }\) is independent of \({\mathcal {F}}_t\) it directly follows that

$$\begin{aligned}&\big \Vert \big ( g^r(t,X(t)) - g^r(t,X^{\circ }(t)) \big ) I_{(r)}^{t,t+\delta } \big \Vert _{L^2(\Omega ;{\mathbb {R}}^d)}\\&\quad = \delta ^{\frac{1}{2}} \big \Vert g^r(t,X(t)) - g^r(t,X^{\circ }(t)) \big \Vert _{L^2(\Omega ;{\mathbb {R}}^d)}. \end{aligned}$$

Then, we apply Lemma 5.2 with \(\varphi = g^r(t,\cdot )\), \(\kappa = \frac{q+1}{2}\), and \(p = 10 - \frac{16}{q+1}\). As above, this yields \(\frac{1}{2}\alpha (p-2)\kappa = 1\) and we get

$$\begin{aligned} \delta ^{\frac{1}{2}} \big \Vert g^r(t,X(t)) - g^r(t,X^{\circ }(t)) \big \Vert _{L^2(\Omega ;{\mathbb {R}}^d)} \le C \big ( 1 + \sup _{t\in [0,T]} \big \Vert X(t) \big \Vert _{L^{5q - 3}(\Omega ;{\mathbb {R}}^d)}^{\frac{5}{2}q- \frac{3}{2}} \big ) \delta ^{\frac{3}{2}} \end{aligned}$$

for every \(r = 1,\ldots ,m\). In the same way we obtain for the second term

$$\begin{aligned}&\big \Vert \big ( g^{r,r_2}(t,X(t)) - g^{r,r_2}(t,X^{\circ }(t)) \big ) I_{(r,r_2)}^{t,t+\delta } \big \Vert _{L^2(\Omega ;{\mathbb {R}}^d)}\\&\quad = \frac{1}{\sqrt{2}} \delta \big \Vert g^{r,r_2}(t,X(t)) - g^{r,r_2}(t,X^{\circ }(t)) \big \Vert _{L^2(\Omega ;{\mathbb {R}}^d)}. \end{aligned}$$

Then, a further application of Lemma 5.2 with \(\varphi = g^{r,r_2}(t,\cdot )\), \(\kappa = q\), and \(p = 4 - \frac{2}{q}\) gives

$$\begin{aligned} \delta \big \Vert g^{r,r_2}(t,X(t)) - g^{r,r_2}(t,X^{\circ }(t)) \big \Vert _{L^2(\Omega ;{\mathbb {R}}^d)} \le C \big ( 1 + \sup _{t\in [0,T]} \big \Vert X(t) \big \Vert _{L^{4q - 2}(\Omega ;{\mathbb {R}}^d)}^{2q- 1} \big ) \delta ^{\frac{3}{2}} \end{aligned}$$

for every \(r,r_2 = 1,\ldots ,m\), since in this case \(\frac{1}{2}\alpha (p-2)\kappa = \frac{1}{2}\).

Next, since \(f(t,X(t))\) is \({\mathcal {F}}_t\)-measurable it follows for the third term that

$$\begin{aligned}&\Big \Vert ( \mathrm {id}- {\mathbb {E}}[\,\cdot \,|{\mathcal {F}}_t]) \int _{t}^{t+\delta } f(\tau ,X(\tau )) \,\mathrm {d}\tau \Big \Vert _{L^2(\Omega ;{\mathbb {R}}^d)}\\&\quad = \Big \Vert ( \mathrm {id}- {\mathbb {E}}[\,\cdot \,|{\mathcal {F}}_t]) \int _{t}^{t+\delta } f(\tau ,X(\tau )) - f(t,X(t)) \,\mathrm {d}\tau \Big \Vert _{L^2(\Omega ;{\mathbb {R}}^d)}. \end{aligned}$$

By making use of \(\Vert ( \mathrm {id}- {\mathbb {E}}[\,\cdot \,|{\mathcal {F}}_t]) Y \Vert _{L^2(\Omega ;{\mathbb {R}}^d)} \le \Vert Y \Vert _{L^2(\Omega ;{\mathbb {R}}^d)}\) one directly deduces the desired estimate from Lemma 5.5. Finally, the last term is estimated by Lemma 5.7. \(\square \)

6 C-Stability of the Split-Step Backward Milstein Method

In this section we verify that Assumption 2.1 and condition (26) are sufficient for the C-stability of the split-step backward Milstein method.

The results of Proposition 6.1 below are needed in order to show that the SSBM method is a well-defined one-step method in the sense of Definition 3.1. Further, the inequality (50) plays a key role in the proof of the C-stability of the SSBM method and generalizes a similar estimate for the split-step backward Euler method from [2, Corollary 4.2].

Proposition 6.1

Let the functions \(f :[0,T] \times {\mathbb {R}}^d \rightarrow {\mathbb {R}}^d\) and \(g^r :[0,T] \times {\mathbb {R}}^d \rightarrow {\mathbb {R}}^d\), \(r = 1,\ldots ,m\), satisfy Assumption 2.1 and condition (26) with \(L \in (0,\infty )\), \(\eta _1 \in (1,\infty )\), and \(\eta _2 \in (0,\infty )\). Let \(\overline{h} \in (0,L^{-1})\) be given and define for every \(\delta \in (0,\overline{h}]\) the mapping \(F_\delta :[0,T] \times {\mathbb {R}}^d \rightarrow {\mathbb {R}}^d\) by \(F_\delta (t,x) = x - \delta f(t, x)\). Then, the mapping \({\mathbb {R}}^d \ni x \mapsto F_\delta (t,x) \in {\mathbb {R}}^d\) is a homeomorphism for every \(t \in [0,T]\).

In addition, the inverse \(F_\delta ^{-1}(t,\cdot ) :{\mathbb {R}}^d \rightarrow {\mathbb {R}}^d\) satisfies

$$\begin{aligned} \big | F_\delta ^{-1}(t,x_1)-F_\delta ^{-1}(t,x_2) \big |&\le (1 - L \delta )^{-1} | x_1 - x_2 |, \end{aligned}$$
(48)
$$\begin{aligned} \big | F_\delta ^{-1}(t,x) \big |&\le (1 - L \delta )^{-1} \big ( L \delta + | x | \big ), \end{aligned}$$
(49)

for every \(x,x_1, x_2 \in {\mathbb {R}}^d\) and \(t \in [0,T]\). Moreover, there exists a constant \(C_1\) only depending on L and \(\overline{h}\) such that

$$\begin{aligned}&\big | F_\delta ^{-1}(t,x_1) - F_\delta ^{-1}(t,x_2) \big |^2 + \eta _1 \delta \sum _{r = 1}^m \big | g^r(t, F_\delta ^{-1}(t,x_1)) - g^r(t, F_\delta ^{-1}(t,x_2)) \big |^2\nonumber \\&\quad + \eta _2 \delta \sum _{r_1,r_2 = 1}^{m} \big | g^{r_1,r_2}(t, F_\delta ^{-1}(t,x_1)) - g^{r_1,r_2}(t,F_\delta ^{-1}(t,x_2)) \big |^2 \le (1 + C_1 \delta ) \big | x_1 - x_2 \big |^2 \end{aligned}$$
(50)

for every \(x_1, x_2 \in {\mathbb {R}}^d\) and \(t \in [0,T]\).

Proof

The first part is a direct consequence of the Uniform Monotonicity Theorem (see, for instance, [22, Chap. 6.4] or [26, Theorem C.2]). The estimates (48) and (49) are standard and a proof is found, for example, in [2, Sec. 4].

Regarding (50) it first follows from (26) that

$$\begin{aligned}&\langle F_\delta (t,x_1) - F_\delta (t,x_2), x_1 - x_2 \rangle \\&\quad = | x_1 - x_2 |^2 - \delta \langle f(t,x_1) - f(t,x_2), x_1 - x_2 \rangle \\&\quad \ge (1 - L \delta ) | x_1 - x_2 |^2 + \eta _1 \delta \sum _{r = 1}^m \big | g^r(t,x_1) - g^r(t,x_2) \big |^2\\&\qquad + \eta _2 \delta \sum _{r_1,r_2 = 1}^m \big | g^{r_1,r_2}(t,x_1) - g^{r_1,r_2}(t,x_2) \big |^2 \end{aligned}$$

for all \(x_1, x_2 \in {\mathbb {R}}^d\). For arbitrary \(y_1, y_2 \in {\mathbb {R}}^d\) we substitute \(x_1 = F_\delta ^{-1}(t,y_1)\) and \(x_2 = F_\delta ^{-1}(t,y_2)\) into this inequality. Then, after some rearranging we obtain

$$\begin{aligned}&\big |F_\delta ^{-1}(t,y_1) - F_\delta ^{-1}(t,y_2)\big |^2 + \eta _1 \delta \sum _{r = 1}^m \big | g^r(t,F_\delta ^{-1}(t,y_1)) - g^r(t,F_\delta ^{-1}(t,y_2)) \big |^2\\&\qquad + \eta _2 \delta \sum _{r_1,r_2 = 1}^m \big | g^{r_1,r_2}(t,F_\delta ^{-1}(t,y_1)) - g^{r_1,r_2}(t,F_\delta ^{-1}(t,y_2)) \big |^2 \\&\quad \le \big \langle y_1 - y_2, F_\delta ^{-1}(t,y_1) - F_\delta ^{-1}(t,y_2) \big \rangle + L \delta \big | F_\delta ^{-1}(t,y_1) - F_\delta ^{-1}(t,y_2)\big |^2. \end{aligned}$$

Next, as in the proof of [2, Corollary 4.2] we apply the Cauchy-Schwarz inequality and (48). This yields

$$\begin{aligned}&\big \langle y_1 - y_2, F_\delta ^{-1}(t,y_1) - F_\delta ^{-1}(t,y_2) \big \rangle + L \delta \big | F_\delta ^{-1}(t,y_1) - F_\delta ^{-1}(t,y_2)\big |^2\\&\quad \le |y_1 - y_2 | \big |F_\delta ^{-1}(t,y_1) - F_\delta ^{-1}(t,y_2)\big | + L \delta \big | F_\delta ^{-1}(t,y_1) - F_\delta ^{-1}(t,y_2)\big |^2\\&\quad \le (1 - L \delta )^{-1} \big ( 1 + (1 - L \delta )^{-1} L \delta \big ) |y_1 - y_2 |^2 =(1-L \delta )^{-2}|y_1-y_2|^2 \end{aligned}$$

for all \(y_1, y_2 \in {\mathbb {R}}^d\). Finally, note that \(b(\delta )=(1-L \delta )^{-2}\) is a convex function and is therefore bounded by its chord on \([0,\overline{h}]\); hence for all \(\delta \in [0,\overline{h}]\),

$$\begin{aligned} (1 - L \delta )^{-2} \le 1 + C_1 \delta , \quad \text {with } C_1 = \frac{b(\overline{h})-b(0)}{\overline{h}} = \frac{1}{\overline{h}} \cdot \frac{L \overline{h} (2-L\overline{h})}{(1-L\overline{h})^{2}} = L(2-L\overline{h}) (1- L \overline{h})^{-2}, \end{aligned}$$

and inequality (50) is verified. \(\square \)

Proposition 6.1 ensures that the implicit step of the SSBM method admits a unique solution if f satisfies Assumption 2.1 with one-sided Lipschitz constant L. To be more precise, for a given \(\overline{h} \in (0,L^{-1})\) let us consider an arbitrary vector of step sizes \(h \in (0,\overline{h}]^N\), \(N \in {\mathbb {N}}\). Then, it follows from Proposition 6.1 that the nonlinear equations

$$\begin{aligned} \overline{X}_h^{\mathrm {SSBM}}(t_i)&= X_h^{\mathrm {SSBM}}(t_{i-1}) + h_i f(t_{i}, \overline{X}_h^{\mathrm {SSBM}}(t_i)), \quad 1 \le i \le N, \end{aligned}$$

are uniquely solvable. Further, there exists a homeomorphism \(F_{h_i}(t_i,\cdot ) :{\mathbb {R}}^d \rightarrow {\mathbb {R}}^d\) such that \(\overline{X}_h^{\mathrm {SSBM}}(t_i) = F_{h_i}^{-1}(t_i,X_h^{\mathrm {SSBM}}(t_{i-1}))\). Therefore, the one-step map \(\Psi ^{\mathrm {SSBM}} :{\mathbb {R}}^d \times \mathbb {T} \times \Omega \rightarrow {\mathbb {R}}^d\) of the split-step backward Milstein method is given by

$$\begin{aligned} \Psi ^{\mathrm {SSBM}}(x,t,\delta )&= F_{\delta }^{-1}(t+\delta ,x) + \sum _{r=1}^m g^r(t + \delta , F_\delta ^{-1}(t+\delta , x)) I_{(r)}^{t,t+\delta }\nonumber \\&\quad + \sum _{r_1, r_2 = 1}^m g^{r_1,r_2}(t+\delta , F_{\delta }^{-1}(t+\delta ,x) ) I_{(r_2,r_1)}^{t,t + \delta } \end{aligned}$$
(51)

for every \(x \in {\mathbb {R}}^d\) and \((t,\delta ) \in \mathbb {T}\), where the stochastic increments are defined in (22) and (23). A computational sketch of this one-step map is given below; afterwards we verify that \(\Psi ^{\mathrm {SSBM}}\) satisfies condition (28) as well as (31) and (32).
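To make the scheme concrete, the following Python sketch (our own naming, not the authors' code) evaluates \(\Psi ^{\mathrm {SSBM}}\) from (51) for a scalar equation \(d = m = 1\), in which case the iterated integral reduces to \(I_{(1,1)}^{t,t+\delta } = \frac{1}{2} \big ( (I_{(1)}^{t,t+\delta })^2 - \delta \big )\); the implicit drift step is solved by a few Newton iterations.

```python
import numpy as np

def ssbm_step(x, t, delta, f, df, g, gg, rng, newton_steps=10):
    """One step of the split-step backward Milstein map (51) for d = m = 1.

    f, df : drift and its spatial derivative
    g     : diffusion coefficient g(t, x)
    gg    : Milstein coefficient g^{1,1}(t, x) = (dg/dx * g)(t, x)
    """
    s = t + delta
    # Implicit drift step: solve xbar - delta * f(s, xbar) = x by Newton's
    # method; for delta < 1/L the root is unique by Proposition 6.1.
    xbar = x
    for _ in range(newton_steps):
        xbar -= (xbar - delta * f(s, xbar) - x) / (1.0 - delta * df(s, xbar))
    # Stochastic increments over [t, t + delta].
    dW = rng.normal(0.0, np.sqrt(delta))   # I_(1)
    I11 = 0.5 * (dW**2 - delta)            # I_(1,1)
    return xbar + g(s, xbar) * dW + gg(s, xbar) * I11
```

In the experiments of Sect. 8 below, the implicit equation is solved either exactly by Cardano's formula or by Newton's method with three iteration steps.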

Proposition 6.2

Let the functions f and \(g^r\), \(r = 1,\ldots ,m\), satisfy Assumption 2.1 and condition (26) with \(L \in (0,\infty )\), \(q \in [2,\infty )\), \(\eta _1 \in (1,\infty )\), and \(\eta _2 \in (0,\infty )\). For every \(\overline{h} \in (0, \min (L^{-1},\frac{2 \eta _2}{\eta _1}))\) and initial value \(\xi \in L^2(\Omega ,{\mathcal {F}}_{0},{\mathbf {P}};{\mathbb {R}}^d)\) it holds that \((\Psi ^{\mathrm {SSBM}}, \overline{h}, \xi )\) is a stochastic one-step method.

In addition, there exists a constant \(C_0\) depending on L, q, m, and \(\overline{h}\), such that

$$\begin{aligned} \big \Vert {\mathbb {E}}\big [ \Psi ^{\mathrm {SSBM}}( 0, t,\delta ) | {\mathcal {F}}_{t} \big ] \big \Vert _{L^2(\Omega ;{\mathbb {R}}^d)}&\le C_0 \delta , \end{aligned}$$
(52)
$$\begin{aligned} \big \Vert \big ( \mathrm {id}- {\mathbb {E}}[ \, \cdot \, | {\mathcal {F}}_{t} ] \big ) \Psi ^{\mathrm {SSBM}}( 0, t,\delta ) \big \Vert _{L^2(\Omega ;{\mathbb {R}}^d)}&\le C_0 \delta ^{\frac{1}{2}} \end{aligned}$$
(53)

for all \((t,\delta ) \in \mathbb {T}\).

Proof

Regarding the first assertion we show that \(\Psi ^{\mathrm {SSBM}}\) satisfies (28). For this we fix arbitrary \((t, \delta ) \in \mathbb {T}\) and \(Z \in L^2(\Omega ,{\mathcal {F}}_t,{\mathbf {P}};{\mathbb {R}}^d)\). Then, we obtain from Proposition 6.1 that the mapping \(F_\delta ^{-1}(t+\delta , \cdot ) :{\mathbb {R}}^d \rightarrow {\mathbb {R}}^d\) is a homeomorphism satisfying the linear growth bound (49). Hence, we have

$$\begin{aligned} F_\delta ^{-1}(t+\delta , Z) \in L^2(\Omega ,{\mathcal {F}}_t,{\mathbf {P}};{\mathbb {R}}^d). \end{aligned}$$

Consequently, by the continuity of \(g^r\) and \(g^{r_1,r_2}\) the mappings

$$\begin{aligned} \Omega \ni \omega \mapsto g^r(t+\delta , F_\delta ^{-1}(t+\delta , Z(\omega ))) \in {\mathbb {R}}^d \end{aligned}$$

and

$$\begin{aligned} \Omega \ni \omega \mapsto g^{r_1,r_2}(t+\delta , F_\delta ^{-1}(t+\delta , Z(\omega ))) \in {\mathbb {R}}^d \end{aligned}$$

are \({\mathcal {F}}_t / {\mathcal {B}}({\mathbb {R}}^d)\)-measurable for every \(r,r_1,r_2 = 1,\ldots ,m\). Hence, \(\Psi ^{\mathrm {SSBM}}(Z,t,\delta ) :\Omega \rightarrow {\mathbb {R}}^d\) is an \({\mathcal {F}}_{t+\delta } / {\mathcal {B}}({\mathbb {R}}^d)\)-measurable random variable.

Next, we show that \(\Psi ^{\mathrm {SSBM}}(Z,t,\delta )\) is square integrable. First, it follows from (49) that

$$\begin{aligned} \big \Vert {\mathbb {E}}\big [ \Psi ^{\mathrm {SSBM}}( Z, t,\delta ) | {\mathcal {F}}_{t} \big ] \big \Vert _{L^2(\Omega ;{\mathbb {R}}^d)}&= \big \Vert F_\delta ^{-1}(t+\delta , Z) \big \Vert _{L^2(\Omega ;{\mathbb {R}}^d)}\\&\le (1 - L\delta )^{-1} \big ( L \delta + \Vert Z\Vert _{L^2(\Omega ;{\mathbb {R}}^d)} \big ). \end{aligned}$$

In particular, if \(Z = 0 \in L^2(\Omega ;{\mathbb {R}}^d)\) we get

$$\begin{aligned} \big \Vert {\mathbb {E}}\big [ \Psi ^{\mathrm {SSBM}}( 0, t,\delta ) | {\mathcal {F}}_{t} \big ] \big \Vert _{L^2(\Omega ;{\mathbb {R}}^d)} \le (1 - L\overline{h})^{-1} L \delta , \end{aligned}$$

which is (52). Further, since the stochastic increments \(I_{(r)}^{t,t+\delta }\) and \(I_{(r_1,r_2)}^{t,t+\delta }\) are pairwise uncorrelated and satisfy \({\mathbb {E}}[ |I_{(r)}^{t,t+\delta }|^2 ] = \delta \) and \({\mathbb {E}}[ |I_{(r_1,r_2)}^{t,t+\delta }|^2 ] = \frac{1}{2} \delta ^2\) we obtain

$$\begin{aligned}&\big \Vert \big ( \mathrm {id}- {\mathbb {E}}[ \, \cdot \, | {\mathcal {F}}_{t} ] \big ) \Psi ^{\mathrm {SSBM}}( 0, t,\delta ) \big \Vert _{L^2(\Omega ;{\mathbb {R}}^d)}^2\\&\quad = \Big \Vert \sum _{r = 1}^m g^r(t+\delta , F_\delta ^{-1}(t+\delta , 0) ) I_{(r)}^{t,t+\delta }\Big \Vert ^2_{L^2(\Omega ;{\mathbb {R}}^d)}\\&\qquad + \Big \Vert \sum _{r_1, r_2 = 1}^m g^{r_1,r_2}(t+\delta , F_{\delta }^{-1}(t+\delta ,0) ) I_{(r_2,r_1)}^{t,t + \delta } \Big \Vert ^2_{L^2(\Omega ;{\mathbb {R}}^d)}\\&\quad = \delta \sum _{r = 1}^m \big | g^r(t+\delta , F_\delta ^{-1}(t+\delta , 0) ) \big |^2 + \frac{1}{2} \delta ^2 \sum _{r_1,r_2 = 1}^m \big | g^{r_1,r_2} (t+\delta , F_\delta ^{-1}(t+\delta , 0) ) \big |^2. \end{aligned}$$
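Here the second of these two moments can be verified by two applications of Itō's isometry: viewing the iterated integral from (23) as an Itō integral whose integrand is the inner Wiener integral, one obtains

$$\begin{aligned} {\mathbb {E}}\big [ | I_{(r_1,r_2)}^{t,t+\delta } |^2 \big ] = \int _{t}^{t+\delta } (\sigma - t) \,\mathrm {d}\sigma = \frac{1}{2} \delta ^2. \end{aligned}$$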

Then, applications of (12) and (49) yield

$$\begin{aligned} \big | g^r(t+\delta , F_\delta ^{-1}(t+\delta , 0) ) \big |&\le L \big ( 1 + \big | F_\delta ^{-1}(t+\delta , 0) \big | \big )^{\frac{q+1}{2}}\\&\le L \big ( 1 + (1 - L \overline{h})^{-1} L \overline{h} \big )^{\frac{q+1}{2}} \end{aligned}$$

and, similarly, by (17)

$$\begin{aligned} \big | g^{r_1,r_2}(t+\delta , F_\delta ^{-1}(t+\delta , 0) ) \big |&\le L \big ( 1 + \big | F_\delta ^{-1}(t+\delta , 0) \big | \big )^q\\&\le L \big ( 1 + ( 1 - L \overline{h})^{-1} L \overline{h} \big )^{q}. \end{aligned}$$

Therefore, there exists a constant \(C_0\) depending on L, q, m, and \(\overline{h}\), such that (53) is satisfied. In particular, this proves that \(\Psi ^{\mathrm {SSBM}}( 0,t,\delta ) \in L^2(\Omega ,{\mathcal {F}}_{t+\delta },{\mathbf {P}};{\mathbb {R}}^d)\).

Next, for arbitrary \(Z \in L^2(\Omega ,{\mathcal {F}}_t,{\mathbf {P}};{\mathbb {R}}^d)\) the same arguments as above yield

$$\begin{aligned}&\big \Vert \Psi ^{\mathrm {SSBM}}(Z,t,\delta ) - \Psi ^{\mathrm {SSBM}}( 0, t,\delta ) \big \Vert ^2_{L^2(\Omega ;{\mathbb {R}}^d)}\\&\quad = \big \Vert F_\delta ^{-1}(t+\delta , Z) - F_\delta ^{-1}(t+\delta , 0) \big \Vert _{L^2(\Omega ;{\mathbb {R}}^d)}^2 \\&\qquad + \delta \sum _{r = 1}^m \big \Vert g^r(t+\delta , F_\delta ^{-1}(t+\delta , Z) ) - g^r(t+\delta , F_\delta ^{-1}(t+\delta , 0) ) \big \Vert _{L^2(\Omega ;{\mathbb {R}}^d)}^2\\&\qquad + \frac{\delta ^2}{2} \sum _{r_1,r_2 = 1}^m \big \Vert g^{r_1,r_2}(t+\delta , F_\delta ^{-1}(t+\delta , Z) ) - g^{r_1,r_2}(t+\delta , F_\delta ^{-1}(t+\delta , 0) ) \big \Vert _{L^2(\Omega ;{\mathbb {R}}^d)}^2. \end{aligned}$$

Note that \(\eta _1 > 1\) and \(\frac{\delta ^2}{2} \le \frac{\overline{h}}{2} \delta \le \eta _2 \delta \). Thus, the inequality (50) is applicable and we obtain

$$\begin{aligned}&\big \Vert \Psi ^{\mathrm {SSBM}}(Z,t,\delta ) - \Psi ^{\mathrm {SSBM}}( 0, t,\delta ) \big \Vert ^2_{L^2(\Omega ;{\mathbb {R}}^d)} \le (1 + C_1 \delta ) \Vert Z \Vert ^2_{L^2(\Omega ;{\mathbb {R}}^d)}. \end{aligned}$$

Hence \(\Psi ^{\mathrm {SSBM}}(Z,t,\delta ) \in L^2(\Omega ,{\mathcal {F}}_{t+\delta },{\mathbf {P}};{\mathbb {R}}^d)\). \(\square \)

Theorem 6.3

Let the functions f and \(g^r\), \(r = 1,\ldots ,m\), satisfy Assumption 2.1 and condition (26) with \(L \in (0,\infty )\), \(\eta _1 \in (1,\infty )\), and \(\eta _2 \in (0,\infty )\). Further, let \(\overline{h} \in \left( 0, \min (L^{-1},\frac{2 \eta _2}{\eta _1})\right) \). Then, for every \(\xi \in L^2(\Omega ,{\mathcal {F}}_0,{\mathbf {P}};{\mathbb {R}}^d)\) the SSBM scheme \((\Psi ^{\mathrm {SSBM}},\overline{h},\xi )\) is stochastically C-stable.

Proof

Let \((t,\delta ) \in \mathbb {T}\) be arbitrary. For every \(Y \in L^2(\Omega ,{\mathcal {F}}_t,{\mathbf {P}};{\mathbb {R}}^d)\) we have

$$\begin{aligned} {\mathbb {E}}\big [ \Psi ^{\mathrm {SSBM}}(Y,t,\delta )| {\mathcal {F}}_t \big ] = F_{\delta }^{-1}(t+\delta ,Y) \end{aligned}$$

and

$$\begin{aligned} \big ( \mathrm {id}- {\mathbb {E}}[\, \cdot \, | {\mathcal {F}}_t] \big ) \Psi ^{\mathrm {SSBM}}(Y,t,\delta )&= \sum _{r = 1}^m g^r(t+\delta , F_\delta ^{-1}(t+\delta ,Y)) I_{(r)}^{t,t+\delta } \\&\quad + \sum _{r_1, r_2 = 1}^m g^{r_1,r_2}(t + \delta , F_{\delta }^{-1}(t+\delta ,Y) ) I_{(r_2,r_1)}^{t,t + \delta }. \end{aligned}$$

For the computation of the \(L^2\)-norm, we make use of the facts that the stochastic increments are independent of \({\mathcal {F}}_t\) and pairwise uncorrelated. Further, since \({\mathbb {E}}[ |I_{(r)}^{t,t+\delta }|^2 ] = \delta \) and \({\mathbb {E}}[ |I_{(r_1,r_2)}^{t,t+\delta }|^2 ] = \frac{1}{2} \delta ^2\) it follows for the left-hand side of (30) with \(\nu = \eta _1\) that

$$\begin{aligned}&\big \Vert {\mathbb {E}}\big [ \Psi ^{\mathrm {SSBM}}(Y,t,\delta ) - \Psi ^{\mathrm {SSBM}}(Z,t,\delta )| {\mathcal {F}}_t \big ] \big \Vert ^2_{L^2(\Omega ;{\mathbb {R}}^d)}\\&\qquad + \nu \big \Vert \big ( \mathrm {id}- {\mathbb {E}}[ \cdot | {\mathcal {F}}_t] \big ) \big ( \Psi ^{\mathrm {SSBM}}(Y,t,\delta ) - \Psi ^{\mathrm {SSBM}}(Z,t,\delta ) \big ) \big \Vert ^2_{L^2(\Omega ;{\mathbb {R}}^d)}\\&\quad = {\mathbb {E}}\big [ \big | F_{\delta }^{-1}(t+\delta ,Y) - F_{\delta }^{-1}(t+\delta ,Z) \big |^2 \big ] \\&\qquad + \eta _1 \delta \sum _{r = 1}^m {\mathbb {E}}\big [ \big | g^r(t+\delta , F_\delta ^{-1}(t+\delta ,Y)) - g^r(t+\delta ,F_\delta ^{-1}(t+\delta ,Z)) \big |^2 \big ]\\&\qquad + \frac{1}{2} \eta _1 \delta ^2 \sum _{r_1,r_2 = 1}^{m} {\mathbb {E}}\big [ \big | g^{r_1,r_2}(t+\delta , F_\delta ^{-1}(t+\delta ,Y)) - g^{r_1,r_2}(t+\delta ,F_\delta ^{-1}(t+\delta ,Z)) \big |^2 \big ]. \end{aligned}$$

Due to \(\frac{1}{2} \eta _1 \delta \le \frac{1}{2} \eta _1 \overline{h} < \eta _2\) an application of inequality (50) yields

$$\begin{aligned}&\big \Vert {\mathbb {E}}\big [ \Psi ^{\mathrm {SSBM}}(Y,t,\delta ) - \Psi ^{\mathrm {SSBM}}(Z,t,\delta )| {\mathcal {F}}_t \big ] \big \Vert ^2_{L^2(\Omega ;{\mathbb {R}}^d)}\\&\qquad + \nu \big \Vert \big ( \mathrm {id}- {\mathbb {E}}[\, \cdot \, | {\mathcal {F}}_t] \big ) \big ( \Psi ^{\mathrm {SSBM}}(Y,t,\delta ) - \Psi ^{\mathrm {SSBM}}(Z,t,\delta ) \big ) \big \Vert ^2_{L^2(\Omega ;{\mathbb {R}}^d)}\\&\quad \le (1 + C_1 \delta ) \big \Vert Y - Z \big \Vert _{L^2(\Omega ;{\mathbb {R}}^d)}^2, \end{aligned}$$

which is the C-stability condition (30) with \(\nu = \eta _1 \in (1,\infty )\). \(\square \)

7 B-Consistency of the Split-Step Backward Milstein Method

This section is devoted to the proof of the following result, which is concerned with the B-consistency of the SSBM method.

Theorem 7.1

Let the functions f and \(g^r\), \(r = 1,\ldots ,m\), satisfy Assumption 2.1 with \(L \in (0,\infty )\) and \(q \in [2,\infty )\). Let \(\overline{h} \in (0,L^{-1})\). If the exact solution X to (3) satisfies \(\sup _{\tau \in [0,T]} \Vert X(\tau ) \Vert _{L^{6q-4}(\Omega ;{\mathbb {R}}^d)} < \infty \), then the split-step backward Milstein method \((\Psi ^{\mathrm {SSBM}},\overline{h},X_0)\) is stochastically B-consistent of order \(\gamma = 1\).

For the proof we recall some estimates of the homeomorphism \(F_\delta ^{-1}\) from [2, Lemma 4.3], which will be useful for the estimate of the local truncation error.

Lemma 7.2

Consider the same situation as in Proposition 6.1. Then there exist constants \(C_2\), \(C_3\) only depending on L, \(\overline{h}\) and q such that for every \(\delta \in (0,\overline{h}]\) the inverse \(F_\delta ^{-1}(t,\cdot ) :{\mathbb {R}}^d \rightarrow {\mathbb {R}}^d\) satisfies the estimates

$$\begin{aligned} \big | F_\delta ^{-1}(t,x) - x \big |&\le \delta C_2 \big ( 1 + | x|^q \big ), \end{aligned}$$
(54)
$$\begin{aligned} \big | F_\delta ^{-1}(t,x) - x - \delta f(t,x) \big |&\le \delta ^2 C_3 \big ( 1 + |x|^{2q-1} \big ) \end{aligned}$$
(55)

for every \(x \in {\mathbb {R}}^d\) and \(t \in [0,T]\).

Proof of Theorem 7.1

The proof follows the same steps as the proof of [2, Theorem 5.7]. Let us fix arbitrary \((t,\delta ) \in \mathbb {T}\). Then, by inserting (19) and (51) we obtain the following representation of the local truncation error

$$\begin{aligned}&X(t + \delta ) - \Psi ^{\mathrm {SSBM}} (X(t),t,\delta )\\&\quad = X(t) + \delta f(t + \delta ,X(t)) - F_{\delta }^{-1}(t+\delta ,X(t))\\&\qquad + \int _{t}^{t + \delta } \big ( f(\tau ,X(\tau )) - f(t + \delta ,X(t)) \big ) \,\mathrm {d}\tau \\&\qquad + \Big ( \sum _{r = 1}^m \int _{t}^{t + \delta } \big ( g^r(\tau ,X(\tau )) - g^r(t + \delta ,X(t)) \big ) \,\mathrm {d}W^r(\tau )\\&\qquad - \sum _{r_1,r_2 = 1}^m g^{r_1,r_2}(t+\delta ,X(t)) I_{(r_2,r_1)}^{t,t+\delta } \Big ) \\&\qquad +\sum _{r = 1}^m \big ( g^r(t+ \delta ,X(t)) - g^r(t+\delta ,F_{\delta }^{-1}(t + \delta ,X(t))) \big ) I_{(r)}^{t,t+\delta }\\&\qquad +\sum _{r_1,r_2 = 1}^m \big (g^{r_1,r_2}(t+ \delta ,X(t)) - g^{r_1,r_2}(t+\delta ,F_{\delta }^{-1}(t + \delta ,X(t))) \big ) I_{(r_2,r_1)}^{t,t+\delta }\\&\quad =: T_1 + T_2 + T_3 + T_4 + T_5. \end{aligned}$$

We discuss the five terms separately. It is already shown in the proof of [2, Theorem 5.7] that by applying (55) the \(L^2\)-norm of the term \(T_1\) is dominated by

$$\begin{aligned} \Vert T_1 \Vert _{L^2(\Omega ;{\mathbb {R}}^d)}&= \big \Vert X(t) + \delta f(t + \delta ,X(t)) - F_{\delta }^{-1}(t + \delta ,X(t)) \big \Vert _{L^2(\Omega ;{\mathbb {R}}^d)}\nonumber \\&\le C_3 \big \Vert 1 + | X(t) |^{2q-1} \big \Vert _{L^2(\Omega ;{\mathbb {R}})} \delta ^{2} \nonumber \\&\le C_3 \Big (1 + \sup _{\tau \in [0,T]} \Vert X(\tau ) \Vert _{L^{4q - 2}(\Omega ;{\mathbb {R}}^d)}^{2q-1} \Big ) \delta ^2. \end{aligned}$$
(56)

Moreover, if we consider the conditional expectation of the term \(T_2\) with respect to \({\mathcal {F}}_t\), then after taking the \(L^2\)-norm we arrive at

$$\begin{aligned} \big \Vert {\mathbb {E}}\big [ T_2 | {\mathcal {F}}_t \big ] \big \Vert _{L^2(\Omega ;{\mathbb {R}}^d)} = \Big \Vert {\mathbb {E}}\Big [ \int _{t}^{t + \delta } \big ( f(\tau ,X(\tau )) - f(t + \delta ,X(t)) \big ) \,\mathrm {d}\tau \, \big | {\mathcal {F}}_t \Big ] \Big \Vert _{L^2(\Omega ;{\mathbb {R}}^d)}. \end{aligned}$$

Hence, an application of Lemma 5.6 with \(t_1 =t\) and \(s=t+\delta \) yields

$$\begin{aligned} \big \Vert {\mathbb {E}}\big [ T_2 | {\mathcal {F}}_t \big ] \big \Vert _{L^2(\Omega ;{\mathbb {R}}^d)} \le C \big ( 1 + \sup _{t \in [0,T]} \big \Vert X(t) \big \Vert ^{3q -2}_{L^{6q-4}(\Omega ;{\mathbb {R}}^d)} \big ) \delta ^2. \end{aligned}$$
(57)

Since \({\mathbb {E}}[ T_1 | {\mathcal {F}}_t ] = T_1\) and \({\mathbb {E}}[ T_i | {\mathcal {F}}_t ] = 0\) for \(i \in \{3,4,5\}\) we get from (56) and (57)

$$\begin{aligned}&\big \Vert {\mathbb {E}}\big [ X(t + \delta ) - \Psi ^{\mathrm {SSBM}} (X(t),t,\delta ) \big | {\mathcal {F}}_t \big ] \big \Vert _{L^2(\Omega ;{\mathbb {R}}^d)}\\&\quad \le \Vert T_1 \Vert _{L^2(\Omega ;{\mathbb {R}}^d)} + \big \Vert {\mathbb {E}}\big [ T_2 | {\mathcal {F}}_t \big ] \big \Vert _{L^2(\Omega ;{\mathbb {R}}^d)}\\&\quad \le C \big ( 1 + \sup _{t \in [0,T]} \big \Vert X(t) \big \Vert ^{3q -2}_{L^{6q-4}(\Omega ;{\mathbb {R}}^d)} \big ) \delta ^2. \end{aligned}$$

This proves (33) with \(\gamma = 1\) and it remains to show (34). For this we estimate

$$\begin{aligned}&\big \Vert \big ( \mathrm {id}- {\mathbb {E}}[ \, \cdot \, | {\mathcal {F}}_t] \big ) \big ( X(t + \delta ) - \Psi ^{\mathrm {SSBM}} (X(t),t,\delta ) \big ) \big \Vert _{L^{2}(\Omega ;{\mathbb {R}}^d)}\\&\quad \le \sum _{i = 1}^{5} \big \Vert \big ( \mathrm {id}- {\mathbb {E}}[ \, \cdot \, | {\mathcal {F}}_t] \big ) T_i \big \Vert _{L^{2}(\Omega ;{\mathbb {R}}^d)}. \end{aligned}$$

Then, note that \(\Vert ( \mathrm {id}- {\mathbb {E}}[ \, \cdot \, | {\mathcal {F}}_t] ) T_1 \Vert _{L^{2}(\Omega ;{\mathbb {R}}^d)} = 0\) since \(T_1\) is \({\mathcal {F}}_t\)-measurable. Further, by making use of the fact that \(\Vert ( \mathrm {id}- {\mathbb {E}}[ \, \cdot \, | {\mathcal {F}}_t] ) Y \Vert _{L^{2}(\Omega ;{\mathbb {R}}^d)} \le \Vert Y \Vert _{L^2(\Omega ;{\mathbb {R}}^d)}\) for all \(Y \in L^2(\Omega ;{\mathbb {R}}^d)\) we get

$$\begin{aligned} \big \Vert \big ( \mathrm {id}- {\mathbb {E}}[ \, \cdot \, | {\mathcal {F}}_t] \big ) T_2 \big \Vert _{L^{2}(\Omega ;{\mathbb {R}}^d)} \le \Vert T_2 \Vert _{L^2(\Omega ;{\mathbb {R}}^d)}. \end{aligned}$$

After inserting \(T_2\) it follows from Lemma 5.5 that

$$\begin{aligned} \Vert T_2 \Vert _{L^2(\Omega ;{\mathbb {R}}^d)}&\le \int _{t}^{t+\delta } \big \Vert f(\tau ,X(\tau )) - f(t+\delta ,X(t)) \big \Vert _{L^2(\Omega ;{\mathbb {R}}^d)} \,\mathrm {d}\tau \\&\le C \big ( 1 + \sup _{t \in [0,T]} \big \Vert X(t) \big \Vert _{L^{4q-2}(\Omega ;{\mathbb {R}}^d)}^{2q-1} \big ) \delta ^{\frac{3}{2}}. \end{aligned}$$

Regarding the term \(T_3\) we first couple the summation indices \(r = r_1\). Then, the triangle inequality yields

$$\begin{aligned} \Vert T_3 \Vert _{L^2(\Omega ;{\mathbb {R}}^d)}&\le \sum _{r = 1}^m \Big \Vert \int _{t}^{t+\delta } \big ( g^{r}(\tau ,X(\tau )) - g^r(t+\delta ,X(t)) \big ) \,\mathrm {d}W^r(\tau )\\&\qquad - \sum _{r_2=1}^m g^{r,r_2}(t+\delta ,X(t)) I_{(r_2,r)}^{t,t+\delta } \Big \Vert _{L^2(\Omega ;{\mathbb {R}}^d)}. \end{aligned}$$

Hence, we are in the situation of Lemma 5.7 with \(t_1 = t\) and \(t_2 = s = t + \delta \) and we obtain

$$\begin{aligned} \Vert T_3 \Vert _{L^2(\Omega ;{\mathbb {R}}^d)}&\le C \big ( 1 + \sup _{t \in [0,T]} \Vert X(t) \Vert ^{3q-2}_{L^{6q-4}(\Omega ;{\mathbb {R}}^d)} \big ) \delta ^{\frac{3}{2}}. \end{aligned}$$

The \(L^2\)-norm estimates of the remaining terms \(T_4\) and \(T_5\) follow the same line of arguments as the last part of the proof of [2, Theorem 5.7]. For instance, the term \(T_5\) is estimated as follows: From (18), (49), and (54) we obtain

$$\begin{aligned}&\big | g^{r_1,r_2}(t + \delta ,X(t)) - g^{r_1,r_2}(t + \delta ,F_{\delta }^{-1}(t + \delta ,X(t))) \big |\\&\quad \le L \big ( 1 + | X(t) | + |F_{\delta }^{-1}(t + \delta ,X(t))| \big )^{q-1} \big |X(t) - F_{\delta }^{-1}(t + \delta ,X(t)) \big | \\&\quad \le C_2 L \big ( 1 + | X(t) | + (1 - L \delta )^{-1} ( L \delta + | X(t) | ) \big )^{q-1} \big ( 1 + | X(t) |^{q} \big ) \delta \\&\quad \le C \big ( 1 + | X(t) |^{2q-1} \big )\delta , \end{aligned}$$

for a constant C only depending on \(C_2\), L, q, and \(\overline{h}\). Therefore,

$$\begin{aligned}&\Big \Vert \sum _{r_1,r_2 = 1}^m \big ( g^{r_1,r_2}(t + \delta ,X(t)) - g^{r_1,r_2}(t + \delta ,F_{\delta }^{-1}(t + \delta ,X(t))) \big ) I_{(r_2,r_1)}^{t,t+\delta } \Big \Vert _{L^2(\Omega ;{\mathbb {R}}^d)}^2\\&\quad = \frac{1}{2} \delta ^2 \sum _{r_1,r_2 = 1}^m \big \Vert g^{r_1,r_2} (t + \delta ,X(t)) - g^{r_1,r_2}(t + \delta ,F_{\delta }^{-1}(t + \delta ,X(t))) \big \Vert _{L^2(\Omega ;{\mathbb {R}}^d)}^2 \\&\quad \le C m^2 \Big ( 1 + \sup _{\tau \in [0,T]} \Vert X(\tau ) \Vert ^{2q-1}_{L^{4q-2}(\Omega ;{\mathbb {R}}^d)} \Big )^2 \delta ^4, \end{aligned}$$

which is the desired estimate of \(T_5\). The corresponding estimate of \(T_4\) reads

$$\begin{aligned} \Vert T_4 \Vert _{L^2(\Omega ;{\mathbb {R}}^d)} \le C m \Big ( 1 + \sup _{\tau \in [0,T]} \Vert X(\tau ) \Vert ^{\frac{3}{2}q-\frac{1}{2}}_{L^{3q-1}(\Omega ;{\mathbb {R}}^d)} \Big ) \delta ^{\frac{3}{2}} \end{aligned}$$

and is obtained in the same way as for \(T_5\) but with (15) in place of (18). Altogether, this completes the proof of (34) with \(\gamma = 1\). \(\square \)

8 Numerical Experiments

In this section we perform several numerical experiments which illustrate the preceding theory for two characteristic examples.

8.1 Double Well Dynamics with Multiplicative Noise

Consider the following stochastic differential equation

$$\begin{aligned} \,\mathrm {d}X(t)&=X(t)(1-X(t)^2)\,\mathrm {d}t+\sigma (1-X(t)^2)\,\mathrm {d}W(t), \quad t \in [0,T],\nonumber \\ X(0)&=X_0, \end{aligned}$$
(58)

where \(\sigma >0\). The coefficient functions of (58) are given by \(f(x):=x(1-x^2)\) and \(g(x):=\sigma (1-x^2)\), \(x\in {\mathbb {R}}\). Note that \(-f\) is the gradient of the double well potential \(V(x) = \frac{1}{4} x^4 - \frac{1}{2} x^2\).
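In code, the coefficient functions and the associated Milstein coefficient \((g g')(x)\) read as follows (a minimal sketch with our own naming):

```python
SIGMA = 0.3  # noise intensity; the values 0.3 and 1 are used below

def f(x):
    """Drift: minus the gradient of the double well potential V."""
    return x * (1.0 - x**2)

def g(x):
    """Diffusion coefficient."""
    return SIGMA * (1.0 - x**2)

def gg(x):
    """Milstein coefficient (dg/dx * g)(x) = -2*sigma^2*x*(1 - x^2)."""
    return -2.0 * SIGMA**2 * x * (1.0 - x**2)
```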

In our experiments we compare the projected Milstein scheme (24) and the split-step backward Milstein method (25). In addition, we compare with the projected Euler–Maruyama method (PEM) proposed in [2],

$$\begin{aligned} \overline{X}_h(t_i)&=\min (1,h_i^{-\alpha }|X_h(t_{i-1})|^{-1})X_h(t_{i-1}),\nonumber \\ X_h(t_i)&=\overline{X}_h(t_i)+h_i f(t_{i-1},\overline{X}_h(t_i))+\sum _{r=1}^m g^r(t_{i-1},\overline{X}_h(t_i))I_{(r)}^{t_{i-1},t_i}. \end{aligned}$$
(59)

As before, we have \(1\le i\le N\), \(h\in (0,1]^N\), \(X_h(0):=X_0\), and \(\alpha =\frac{1}{2(q-1)}\).
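A minimal Python sketch of one PEM step (59), with our own naming and assuming a d-dimensional state x and the m Wiener increments collected in an array dW, reads:

```python
import numpy as np

def pem_step(x, t_prev, h, alpha, f, g, dW):
    """One step of the projected Euler-Maruyama scheme (59).

    g  : callable returning the d x m diffusion matrix at (t, x)
    dW : increments I_(r)^{t_{i-1}, t_i}, shape (m,)
    """
    # Projection onto the ball of radius h^{-alpha}.
    norm = np.linalg.norm(x)
    xbar = min(1.0, h**(-alpha) / norm) * x if norm > 0 else x
    # Explicit Euler-Maruyama step starting from the projected value.
    return xbar + h * f(t_prev, xbar) + g(t_prev, xbar) @ dW
```

The projected Milstein scheme (24) differs from this step only by the additional Milstein correction term, which is evaluated at the same projected value.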

The equation (58) satisfies Assumption 2.1 and the coercivity condition (21) with polynomial growth rate \(q=3\). Table 1 contains the restrictions on \(\eta \) and \(\sigma \) which imply condition (4) in Assumption 2.1. Moreover, we summarize the p-th moment bounds under which the assumptions of Theorems 2.2 and 2.3 as well as of Theorem 6.7 in [2] are satisfied.

Table 1 Convergence conditions for the three methods

Note that the additional condition (26) for the SSBM method is never satisfied in our example, since the square of the term \(g(x)g'(x)=-2\sigma ^2x(1-x^2)\), \(x\in {\mathbb {R}}\), is of sixth order and cannot be controlled by the fourth-order f-term in condition (26).

Since there is no explicit expression for the solution of (58), we replace the exact solution by a numerical reference approximation obtained with an extremely small step size \(\Delta t=2^{-17}\). For this step size, the projection onto the \((\Delta t)^{-\alpha }\)-ball never occurs. The implicit step in the SSBM scheme employs Cardano's method in order to solve the nonlinear equation exactly, as sketched below. The projection parameter was set to \(\alpha =\frac{1}{4}\) as prescribed by the results in Sect. 4 above and in [2]. Figure 1 shows the strong convergence errors for seven different step sizes \(h=2^k\Delta t\), \(k=7,\dots ,13\). The parameter values are \(\sigma =0.3\) and \(X_0=2\), for which the coercivity condition (21) always holds.
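For the drift of (58) the implicit SSBM step \(\overline{x} = y + h f(\overline{x})\) is equivalent to the cubic equation \(h \overline{x}^3 + (1-h) \overline{x} - y = 0\), which has exactly one real root for \(h < 1\). A sketch of the exact solve (our own implementation):

```python
import numpy as np

def ssbm_drift_step(y, h):
    """Unique real root of h*x^3 + (1 - h)*x - y = 0, i.e. the implicit step
    xbar = y + h*f(xbar) with f(x) = x*(1 - x^2), via Cardano's formula.

    For 0 < h < 1 the depressed cubic x^3 + p*x + q = 0 has p > 0, so its
    discriminant is positive and there is exactly one real root.
    """
    p = (1.0 - h) / h
    q = -y / h
    root = np.sqrt((q / 2.0)**2 + (p / 3.0)**3)
    return np.cbrt(-q / 2.0 + root) + np.cbrt(-q / 2.0 - root)
```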

Fig. 1 Strong convergence errors for the approximation of the double well dynamics with parameter \(\sigma =0.3\) and \(X_0=2\)

The strong error is measured at the endpoint \(T=1\) by

$$\begin{aligned} \mathrm {error}=({\mathbb {E}}[|X_h(T)-X(T)|^2])^\frac{1}{2}, \end{aligned}$$
(60)

with a Monte Carlo simulation using \(2\cdot 10^6\) samples. For this number of samples we also estimated the associated confidence intervals; they turned out to be two orders of magnitude smaller than the error values themselves for all methods and parameters shown, and would hardly be visible on the scale of Fig. 1.
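A sketch of this estimator for (60), including the normal-approximation confidence interval for the mean squared error (our own helper; it assumes paired terminal values computed from the same Brownian paths):

```python
import numpy as np

def strong_error(Xh_T, Xref_T, z=1.96):
    """RMS error (60) from paired samples of shape (num_samples,), with an
    approximate 95% confidence interval obtained via the CLT for the MSE."""
    sq = np.abs(Xh_T - Xref_T)**2
    mse = sq.mean()
    half = z * sq.std(ddof=1) / np.sqrt(sq.size)
    return np.sqrt(mse), (np.sqrt(max(mse - half, 0.0)), np.sqrt(mse + half))
```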

In Fig. 1 one observes strong order \(\gamma =1\) for the two Milstein-type schemes and strong order \(\gamma =\frac{1}{2}\) for the projected Euler–Maruyama method, at least for the smaller step sizes.

Table 2 Strong convergence errors for the approximation of the double well dynamics with parameter \(\sigma =0.3\) and \(X_0=2\)

Table 2 contains the values of the computed errors and of the corresponding experimental order of convergence defined by

$$\begin{aligned} \mathrm {EOC}=\frac{\log (\mathrm {error}(h_i))-\log (\mathrm {error}(h_{i-1}))}{ \log (h_i)-\log (h_{i-1})}, \end{aligned}$$

which support the theoretical results. Moreover, as in [2] we are interested in the number of samples for which the trajectories of the PEM method and the PMil scheme leave the sphere of radius \(h^{-\alpha }\), i.e. the total number of trajectories for which we observed the events

$$\begin{aligned}&\{i=1,\dots ,N:|X^{\mathrm {PEM}}_h(t_i)|>h^{-\alpha }\} \ne \emptyset , \end{aligned}$$
(61)
$$\begin{aligned}&\{i=1,\dots ,N:|X^{\mathrm {PMil}}_h(t_i)|>h^{-\alpha }\} \ne \emptyset , \end{aligned}$$
(62)

respectively. This information is provided in the fourth and seventh columns of Table 2.

Fig. 2 Strong convergence errors for the approximation of the double well dynamics. Parameter values \(\sigma =1\) and \(X_0=2\)

Table 3 Strong convergence errors for the approximation of the double well dynamics

Figure 2 and Table 3 show the strong convergence errors when conditions (4) and (21) in Assumption 2.1 are violated by choosing \(\sigma =1\). The error estimates are based on Monte Carlo simulations with the same number \(2 \cdot 10^6\) of samples as above. As in the first experiment, the confidence intervals are two orders of magnitude smaller than the estimated values. Therefore, we believe that the slightly irregular behavior of the convergence errors is not due to a too small number of samples. Rather, we suspect that the violation of the convergence conditions influences the expected order of convergence, see the numerical EOC values in Table 3. This effect certainly deserves further investigation. For an illustration we include two runs for parameter values \(\sigma =0.3\) and \(\sigma =1\) in Fig. 3.

Fig. 3 Single trajectories for the methods PEM, PMil, and SSBM with step size \(h= 2^{-5}\) versus reference solution (exact) with parameter values \(\alpha =\frac{1}{4}\), \(X_0=2\), and \(T=1\). a The case \(\sigma =0.3\). b The case \(\sigma =1\)

Fig. 4 CPU times versus \(L^2\)-errors of the PEM, PMil, and SSBM methods for the double well dynamics

For a fair comparison of computational costs we compiled Fig. 4, which shows the \(L^2\)-errors versus computing times (measured by tic and toc in MATLAB). One clearly observes that PMil and PEM outperform the split-step backward Milstein method SSBM. Moreover, PMil has a slight advantage over PEM when high accuracy is required.

8.2 A Stochastic Oscillator with Commutative Noise

Next, we consider a system in polar coordinates with diagonal noise of the following form

$$\begin{aligned} \begin{pmatrix} \,\mathrm {d}r(t) \\ \,\mathrm {d}\varphi (t) \end{pmatrix} = \begin{pmatrix} r f_1(r^2) \\ f_2(\varphi ) \end{pmatrix} \,\mathrm {d}t + \begin{pmatrix} r g_1(r^2) \,\mathrm {d}W^1(t) \\ g_2(\varphi ) \,\mathrm {d}W^2(t) \end{pmatrix}, \end{aligned}$$
(63)

where \(f_1,f_2\) are smooth functions on \([0,\infty )\) and \(g_1,g_2\) are smooth \(2 \pi \)-periodic functions on \({\mathbb {R}}\). We transform to Euclidean coordinates by applying Itō’s formula to \(x(r,\varphi )=(r \cos (\varphi ),r \sin (\varphi ))\),

$$\begin{aligned} \,\mathrm {d}x(t)= & {} \left[ \begin{pmatrix} x_1 &{} - x_2 \\ x_2 &{} x_1 \end{pmatrix} \begin{pmatrix} f_1(r^2) \\ f_2(\varphi ) \end{pmatrix} -\frac{1}{2} (g_2(\varphi ))^2 \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \right] \,\mathrm {d}t\nonumber \\&+ g_1(r^2) \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \,\mathrm {d}W^1(t) + g_2(\varphi ) \begin{pmatrix} - x_2 \\ x_1 \end{pmatrix} \,\mathrm {d}W^2(t). \end{aligned}$$
(64)

Here we set \(r^2 = x_1^2 + x_2^2\) and replace \(g_j(\varphi ),j=1,2\) by \(g_j(\mathrm {arg}(x_1 +ix_2))\) with the argument \(\mathrm {arg}(r e^{i \varphi })=\varphi \) taken from \((-\pi ,\pi ]\), for example. In the following computation we treat the special case

$$\begin{aligned} f_1(r^2)=\mu - r^2, \quad f_2(\varphi ) = \theta , \quad g_1(r^2)=\sigma _1, \quad g_2(\varphi ) = \sigma _2 \end{aligned}$$
(65)

with parameters \(\mu ,\theta ,\sigma _1,\sigma _2 \in {\mathbb {R}}\) still to be chosen. This is a generalization of a system studied in [5]. When the parameter \(\mu \) varies, this may be considered as a model problem for stochastic Hopf bifurcation (cf. [1, Chap. 9.4.2]).
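For the special case (65) the drift and diffusion of (64) take the following concrete form in code (a sketch with our own naming; the parameter values are those used in the experiments below):

```python
import numpy as np

MU, THETA, SIG1, SIG2 = 0.4, 1.0, 0.5, 0.6

def drift(x):
    """Drift of (64) for the choice (65), including the Ito correction."""
    x1, x2 = x
    r2 = x1**2 + x2**2
    return np.array([(MU - r2) * x1 - THETA * x2 - 0.5 * SIG2**2 * x1,
                     (MU - r2) * x2 + THETA * x1 - 0.5 * SIG2**2 * x2])

def diffusion(x):
    """Diffusion matrix of (64) for (65); columns are g^1(x) and g^2(x)."""
    x1, x2 = x
    return np.array([[SIG1 * x1, -SIG2 * x2],
                     [SIG1 * x2,  SIG2 * x1]])
```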

Fig. 5 Single trajectory of the PMil scheme with step size \(h=2^{-4}\) for the stochastic oscillator dynamics. Two projected intermediate steps are indicated by dashed lines. Parameter values \(\mu =0.4\), \(\sigma _1=0.5\), \(\sigma _2=0.6\), \(\theta =1\), \(r_0=1.97\), and \(\varphi _0=\frac{\pi }{4}\). a Single trajectories of exact solution and PMil scheme. b Zoom in on the first few steps of the trajectory from a

In this case the system (64) with initial condition \(x(0) =(r_0\cos (\varphi _0),r_0 \sin (\varphi _0))\) can be solved explicitly via (63), since the radial equation is a stochastic Ginzburg–Landau equation while the angular equation can be integrated directly. From [12, Chap. 4.4] we obtain

$$\begin{aligned} r(t)= & {} r_0 \exp \big ( \big (\mu -\frac{\sigma _1^2}{2}\big )t+ \sigma _1 W^1(t)\big )\nonumber \\&\times \Big (1+2 r_0^2\int _0^t\exp \big ((2\mu -\sigma _1^2)s+ 2 \sigma _1 W^1(s) \big ) \,\mathrm {d}s \Big )^{-1/2}, \nonumber \\ \varphi (t)= & {} \varphi _0 + \theta t + \sigma _2 W^2(t). \end{aligned}$$
(66)

Since \(g^1,g^2\) are linear and f is cubic with a uniform one-sided Lipschitz bound, we find that Assumption 2.1 is satisfied with \(q=3\). Moreover, the system has commutative noise [12, Chap. 10.3], since

$$\begin{aligned} g^{1,2}(x)=g^{2,1}(x)= \sigma _1 \sigma _2 \begin{pmatrix} -x_2 \\ x_1 \end{pmatrix}. \end{aligned}$$

As in [12, Chap. 10, (3.16)] the double sum in (24), (25) then takes the explicit form

$$\begin{aligned} \frac{1}{2}\left[ \sigma _1^2( ( I_{(1)}^{t_{i-1},t_i})^2 - h_i) - \sigma _2^2( ( I_{(2)}^{t_{i-1},t_i} )^2 - h_i) \right] Y + \sigma _1 \sigma _2 I_{(1)}^{t_{i-1},t_i} I_{(2)}^{t_{i-1},t_i} \begin{pmatrix} -Y_2 \\ Y_1 \end{pmatrix}, \end{aligned}$$

where \(Y=\overline{X}_h^{\mathrm {PMil}}(t_i)\) and \(Y=\overline{X}_h^{\mathrm {SSBM}}(t_i)\), respectively.
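In code, this correction term reads (a sketch with our own naming; Y denotes the respective intermediate value and dW1, dW2 the Wiener increments over a step of size h):

```python
import numpy as np

def milstein_correction(Y, dW1, dW2, h, sig1=0.5, sig2=0.6):
    """Commutative-noise form of the double sum in (24), (25)."""
    Y1, Y2 = Y
    radial = 0.5 * (sig1**2 * (dW1**2 - h) - sig2**2 * (dW2**2 - h))
    return radial * np.asarray(Y) + sig1 * sig2 * dW1 * dW2 * np.array([-Y2, Y1])
```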

Figure 5a shows the simulation of a single path generated by the exact solution and the projected Milstein method with equidistant step size \(h=2^{-4}\) and parameters \(\alpha =\frac{1}{4}\), \(\mu =0.4\), \(\sigma _1=0.5\), \(\sigma _2=0.6\), \(\theta =1\), \(r_0=1.97\), and \(\varphi _0=\frac{\pi }{4}\); the corresponding initial value is \(X_0=(1.39,1.39)\). Further, we use a highly accurate approximation of the integral in (66) by a Riemann sum with step size \(\Delta t=2^{-18}\), as sketched below.
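A sketch of this reference computation for the radial part of (66), with the time integral approximated by a left-point Riemann sum on a fine grid (our own naming):

```python
import numpy as np

def exact_radius(t_grid, W1, r0, mu, sig1):
    """Radial solution r(t) from (66) on an equidistant grid t_grid, given a
    sampled Brownian path W1 on the same grid."""
    dt = t_grid[1] - t_grid[0]
    integrand = np.exp((2.0 * mu - sig1**2) * t_grid + 2.0 * sig1 * W1)
    # Left-point Riemann sum of the integral in (66) up to each grid point.
    integral = np.concatenate(([0.0], np.cumsum(integrand[:-1]) * dt))
    return (r0 * np.exp((mu - 0.5 * sig1**2) * t_grid + sig1 * W1)
            / np.sqrt(1.0 + 2.0 * r0**2 * integral))

# The angle is explicit: phi(t) = phi0 + theta * t + sig2 * W2(t).
```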

As already mentioned in [2], we are interested in trajectories of the PMil scheme which do not coincide with trajectories generated by the standard Milstein method. Such a case is shown in Fig. 5a, where the exact trajectory and the PMil trajectory are displayed. Figure 5b shows a close-up of the projected Milstein scheme near the circle of radius \(h^{-\alpha }=2\). In the first and in the third step the trajectory leaves the ball, creating in the next step the intermediate values \(\overline{X}_h^{\mathrm {PMil}}(t_2)\) and \(\overline{X}_h^{\mathrm {PMil}}(t_4)\), which are connected by dashed lines to their predecessors and successors. Naturally, this event occurs more often when the starting point is close to the circle and when the values of \(\sigma _1\) and \(\sigma _2\) are large.

Fig. 6 Strong convergence errors for the approximation of the stochastic oscillator. Parameter values \(\mu =0.4\), \(\sigma _1=0.5\), \(\sigma _2=0.6\), \(\theta =1\), \(r_0=1.97\), and \(\varphi _0=\frac{\pi }{4}\)

Table 4 Strong convergence errors for the approximation of the stochastic oscillator

Figure 6 and Table 4 show the estimated strong convergence errors for the PEM scheme, the PMil method, and the SSBM scheme. The nonlinear equations in the SSBM scheme are solved by Newton's method with three iteration steps. The parameters and the initial value are as in Fig. 5.

The error estimates, given by (60) at the endpoint \(T=1\) for seven different step sizes \(h=2^k\Delta t\), \(k=8,\dots ,14\), are again based on Monte Carlo simulations with \(2\cdot 10^6\) samples. As above, the associated confidence intervals are two orders of magnitude smaller than the estimated errors.

The numerical results in Fig. 6 and in Table 4 confirm the theoretical orders of convergence, though with some loss towards smaller step sizes for the SSBM method. As in our first example we provide a diagram of error versus computing time in Fig. 7. This time PEM has a slight advantage for very coarse accuracies, but it is worse than PMil and SSBM when higher accuracy is required. PMil always outperforms SSBM.

Fig. 7 CPU times versus \(L^2\)-errors of the PEM, PMil, and SSBM methods for the stochastic oscillator