1 Introduction

Stochastic differential equations (SDEs for short) with singular coefficients have been studied extensively in recent years (see [12, 25,26,27,28] and the references therein). Meanwhile, in order to understand the numerical approximation of SDEs with irregular coefficients, various numerical schemes have been established, and the strong and weak convergence rates of Euler-Maruyama’s (abbreviated as EM’s) scheme for irregular SDEs have been obtained (see [2, 3, 6, 7, 11, 13, 14, 16, 17, 19, 20, 23] for instance). The works [5, 8,9,10, 15, 18, 22] investigated the Lp-approximation of solutions to SDEs with a discontinuous drift and obtained the corresponding Lp-error rates under different assumptions on the coefficients. More precisely, [9] established an Lp-error rate of at least 1/2 with \(p\in [1,\infty )\) for scalar SDEs with a piecewise Lipschitz drift and a Lipschitz diffusion coefficient that is non-zero at the discontinuity points of the drift; this result was extended to scalar jump-diffusion SDEs in [22]. Based on the assumptions in [9, 22], the works [8, 10] showed an Lp-error rate of at least 3/4 under additional piecewise smoothness assumptions on the coefficients, employing a novel technique that studies equations with coupled noise, and also showed that the Lp-error rate \(\frac {3}{4}\) cannot be improved in general. Under a Sobolev-Slobodeckij-type regularity condition of order κ ∈ (0,1), [18] obtained the L2-error rate \(\min \limits \{3/4,(1+\kappa )/2\}-\varepsilon \) (for arbitrarily small ε > 0) of the equidistant EM’s scheme for scalar SDEs with irregular drift and additive noise by using an explicit Zvonkin-type transformation and the Girsanov transformation. Using a suitable non-equidistant discretization, [18] also obtains the strong convergence order \(\frac {1+\kappa }{2}-\varepsilon \) for the corresponding EM’s scheme.

In this paper, we shall investigate the weak error of EM’s scheme for the following SDE on \(\mathbb {R}^{d}\)

$$ \mathrm{d} X_{t}= b(X_{t})\mathrm{d} t+\sigma \mathrm{d} W_{t},~X_{0}=x\in\mathbb{R}^{d}, $$
(1.1)

where (Wt)t≥ 0 is a d-dimensional Brownian motion on a complete filtered probability space \(({\Omega }, ({\mathscr{F}}_{t})_{t\ge 0}, {\mathscr{F}}, \mathbb P)\). The associated EM’s scheme reads as follows: for any δ ∈ (0,1),

$$ \mathrm{d} X_t^{(\delta)}=b(X_{t_{\delta}}^{(\delta)})\mathrm{d} t+\sigma\mathrm{d} W_t,~X_0^{(\delta)}=x, $$
(1.2)

where tδ = [t/δ]δ and [t/δ] denotes the integer part of t/δ. The weak error concerns the convergence of the distribution of the EM’s scheme; precisely, it concerns the approximation of \(\mathbb {E} f(X_{t})\) by \(\mathbb {E} f(X_{t}^{(\delta )})\) for a given test function f. Weak error estimates have been obtained for some SDEs with discontinuous drifts in [7, 11, 21]; it is worth noting that the test function f in these references is assumed to be Hölder continuous. When the test function f is relaxed to be merely bounded and measurable, a weak convergence rate of EM’s scheme was obtained in [1] for SDEs with smooth coefficients. Recently, [4, 23] investigated the weak convergence rate of EM’s scheme for SDEs with irregular coefficients by using Girsanov’s transformation, and [3] used an integrability condition to obtain strong convergence rates for multidimensional SDEs with the aid of the Krylov estimate and the Gaussian-type heat kernel estimate established by the parametrix method in [16]. Inspired by [3] and [4, 23], we give a note on the weak error for (1.1) when b satisfies an integrability condition (see (H2) below), which is similar to (A2’) in [3], and the test function f is only bounded and measurable on \(\mathbb {R}^{d}\). We say that functions satisfying (H2) have Gauss-Besov regularity (see the comments after (A2’) of [3]). Discontinuous functions can still possess a Sobolev-Slobodeckij-type regularity, which in turn implies the Gauss-Besov regularity described by (H2); see the examples in Section 2.2 or [3, Example 4.3]. Thus, we say that the drift b has “low regularity” rather than calling it irregular. Moreover, (H2) allows the drift to satisfy a sub-linear growth condition (see (H1) below).
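
To make the object of study concrete, the following minimal Python sketch (illustration only, not part of the analysis) simulates the equidistant EM’s scheme (1.2) with additive noise and compares Monte Carlo estimates of \(\mathbb {E} f(X_{T}^{(\delta )})\) for a coarse and a fine step size. The drift, the test function and all parameter values are hypothetical choices, the fine-step estimate merely serves as a proxy for \(\mathbb {E} f(X_{T})\), and whether this particular drift satisfies (H2) is not checked here.

    import numpy as np

    def euler_maruyama(b, sigma, x0, T, delta, n_paths, rng):
        """Simulate X_T^{(delta)} for dX_t = b(X_t) dt + sigma dW_t with constant sigma."""
        d = x0.size
        n_steps = int(round(T / delta))          # assumes delta divides T
        X = np.tile(x0.astype(float), (n_paths, 1))
        for _ in range(n_steps):
            dW = rng.normal(scale=np.sqrt(delta), size=(n_paths, d))
            X = X + b(X) * delta + dW @ sigma.T  # drift frozen at the left grid point
        return X

    rng = np.random.default_rng(0)
    sigma = np.eye(1)
    b = lambda x: np.sign(x) * np.minimum(np.abs(x), 1.0) ** 0.5  # bounded, discontinuous at 0
    f = lambda x: (x[:, 0] > 0.0).astype(float)                   # bounded measurable test function
    T, x0, n_paths = 1.0, np.array([0.1]), 200_000

    coarse = f(euler_maruyama(b, sigma, x0, T, 2.0 ** -4, n_paths, rng)).mean()
    fine = f(euler_maruyama(b, sigma, x0, T, 2.0 ** -10, n_paths, rng)).mean()
    print(f"coarse: {coarse:.4f}  fine: {fine:.4f}  weak gap (up to MC error): {abs(coarse - fine):.4f}")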

The remainder of this paper is organized as follows: The main result is presented in Section 2. All the proofs are given in Section 3.

2 Main result and examples

2.1 Assumption and main result

Let |⋅| denote the Euclidean norm, 〈⋅,⋅〉 the Euclidean inner product, and ∥⋅∥ the operator norm. For any bounded and measurable function f on \(\mathbb {R}^{d}\), we write \(\|f\|_{\infty }=\sup _{x\in \mathbb {R}^{d}}|f(x)|\). Throughout this paper, we assume that the coefficients of (1.1) satisfy the following assumptions:

  1. (H1)

    \(b:\mathbb {R}^{d}\rightarrow \mathbb {R}^{d}\) is measurable and σ is an invertible d × d-matrix. There exist β ∈ [0,1) and nonnegative constants L1, L2 such that

    $$|b(x)|\leq L_{1}+L_{2}|x|^{\beta},~x\in\mathbb{R}^{d}.$$
  2. (H2)

    There exist p0 ≥ 2, α > 0, l > 0 and \(\phi \in C((0,+\infty );(0,+\infty ))\) which is non-increasing on (0,l) and satisfies \({{\int \limits }_{0}^{l}}\phi ^{2}(s)\mathrm {d} s<\infty \), such that

    $$ \sup_{z\in\mathbb{R}^d}{\int}_{\mathbb{R}^d\times\mathbb{R}^d}|b(y)-b(x)|^{p_0}\frac {\mathrm{e}^{-\frac {|x-z|^2} s-\frac {|y-x|^2} r}} {s^{\frac d 2}r^{\frac d 2}}\mathrm{d} x\mathrm{d} y\leq (\phi(s)r^{\alpha})^{p_0} ,~s>0,r\in [0,1]. $$

It is clear that (1.2) also has a unique strong solution. The index α in (H2) characterizes the order of continuity, while the function ϕ characterizes the type of continuity. The examples in the next subsection show that functions sharing the same order of continuity can have different types of continuity.

We now formulate the main result.

Theorem 2.1

Assume (H1)–(H2). Then, for any T > 0 and any bounded measurable function f on \(\mathbb {R}^{d}\), there exists a constant \(C_{T,p_{0},\sigma ,x}>0\) such that

$$ |\mathbb{E} f(X_t)-\mathbb{E} f(X_t^{(\delta)})|\le C_{T,p_0,\sigma,x}\|f\|_{\infty}\delta^{\alpha},~t\in[0,T], $$
(2.1)

where p0 is defined in (H2). If the growth condition in (H1) is replaced by |b(x)|≤ L1 + L2|x|, then (2.1) also holds for T,L2,p0 and σ satisfying

$$ TL_2 \|\sigma^{-1}\| \|\sigma\| \frac {\sqrt{2(p_0+1)(p_0+3)}} {p_0-1}<1. $$
(2.2)
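
For orientation only, the factor \(\frac {\sqrt {2(p_{0}+1)(p_{0}+3)}}{p_{0}-1}\) in (2.2) is decreasing in p0: it equals \(\sqrt {30}\approx 5.48\) for p0 = 2 and tends to \(\sqrt 2\approx 1.41\) as \(p_{0}\to \infty \). Hence, for p0 = 2, the linear-growth variant of (2.1) is guaranteed on time horizons with

$$ TL_2\|\sigma^{-1}\|\|\sigma\|<\frac 1 {\sqrt{30}}\approx 0.18, $$

and a larger admissible p0 in (H2) relaxes this restriction towards \(TL_{2}\|\sigma ^{-1}\|\|\sigma \|<1/\sqrt 2\).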

Remark 2.1

By [28, Theorem 1.1], (1.1) has a unique strong solution under (H1). It is also clear that (1.2) has a unique pathwise solution. For bounded and irregular b, there are many results on the strong and weak error of EM’s scheme (see, e.g., [3, 7, 11, 18] and references therein), and the weak error cannot be derived from the strong error directly if f is just a bounded and measurable function. We would like to highlight that the authors of [18] have obtained the rate of strong convergence for one-dimensional SDEs when b is in \(L^{1}(\mathbb {R})\) and bounded, and satisfies a Sobolev-Slobodeckij-type regularity. This result is better than the present one in Theorem 2.1. However, the results in [18] rely on a Zvonkin-type transformation, which can be given explicitly in dimension one but loses some favorable properties in higher dimensions. Here, only Girsanov’s transformation is used, while we allow the SDE to be multi-dimensional and the drift to satisfy a sub-linear growth condition. Moreover, we obtain the same convergence rate when b has linear growth, as long as (2.2) holds. Our assumption (H2) also covers the Sobolev-Slobodeckij-type regularity, see Example 2.4 in the next subsection. To obtain a higher convergence rate as in [18], it seems that a deeper investigation of the Zvonkin-type transformation is needed.

In the assumption (H2), if α is a decreasing function of p0, then we can choose p0 = 2 and obtain the highest convergence rate in (2.1), see Example 2.3.

Remark 2.2

In [3], strong convergence and its rate are investigated for a drift that is bounded and satisfies an integrability condition. Here we obtain the weak convergence rate of EM’s scheme, where the drift need not be bounded and the test function f in (2.1) is only bounded and measurable. Moreover, the convergence rate is better than the one obtained in [3, Theorem 1.3].

From the examples in the next subsection, one can see that the drift may be discontinuous. This means that we have extended the results in [1], where the coefficients must be smooth. However, our result is not optimal in the smooth case, since the classical order of the weak error is α = 1 for SDEs with smooth coefficients in [1].

Remark 2.3

In [19, 21], the authors considered the weak convergence rate of the EM’s scheme for (1.1) with a drift b of sub-linear growth of the form b = bH + bA, where bH is α-Hölder continuous for some α ∈ (0,1) and bA belongs to a class \(\mathcal {A}\) which does not contain any nontrivial Hölder continuous functions. The order of the convergence rate obtained in [21] is \(\frac {\alpha } 2\wedge \frac 1 4\), even if bA ≡ 0. In contrast, the order of the convergence rate in Theorem 2.1 comes from the continuity order α in (H2), and it can be greater than \(\frac 1 4\).

The class \(\mathcal {A}\) in [19, 21] is given by an \(\mathcal {A}\)-approximation. In contrast to the \(\mathcal {A}\)-approximation, our condition (H2) is more explicit. Moreover, any time-independent function ζ in the class \(\mathcal {A}\) of [19] satisfies (H2) with \(p_{0}=2,\alpha =\frac 1 4\) and \(\phi (s)=s^{-\frac 1 4}\sqrt {1+\sqrt s}\). In fact, according to [19, Definition 2.1], ζ is bounded and there exists a sequence {ζn}n≥ 1 such that \(\zeta _{n}\in C^{1}(\mathbb {R}^{d})\) is uniformly bounded and converges to ζ locally in \(L^{1}(\mathbb {R}^{d})\), and there exists K > 0 such that

$$ \sup_{n\geq 1,~a\in\mathbb{R}^d}{\int}_{\mathbb{R}^d}\|\nabla\zeta_n(x+a)\|\frac {e^{-\frac {|x|^2} {s}}} {s^{(d-1)/2}}\mathrm{d} x\leq K(1+\sqrt s). $$
(2.3)

Noting that

$$ \sup_{x\ge0}(x^{\gamma^{\prime}}\operatorname{e}^{-\gamma x^2})=\Big(\frac{{\gamma^{\prime}}}{2\operatorname{e}\gamma}\Big)^{{\gamma^{\prime}}/2},~~~~~~{\gamma^{\prime}},\gamma>0, $$
(2.4)

we then obtain from (2.3) and (2.4) that

$$ \begin{array}{@{}rcl@{}} &&{\int}_{\mathbb{R}^d\times\mathbb{R}^d}|\zeta(x)-\zeta(y)|^2\frac {e^{-\frac {|x-z|^2} s-\frac {|x-y|^2} r}} {(sr)^{\frac d 2}}\mathrm{d} x\mathrm{d} y\\ &&\qquad \leq \|\zeta\|_{\infty}\lim_{n\rightarrow+\infty}{\int}_{\mathbb{R}^d\times\mathbb{R}^d}|\zeta_n(x)-\zeta_n(y)|\frac {e^{-\frac {|x-z|^2} s-\frac {|x-y|^2} r}} {(sr)^{\frac d 2}}\mathrm{d} x\mathrm{d} y\\ &&\qquad =\|\zeta\|_{\infty}\lim_{n\rightarrow+\infty}{\int}_0^1{\int}_{\mathbb{R}^d\times\mathbb{R}^d}\|\nabla_{y-x}\zeta_n(x+\theta(y-x))\|\frac {e^{-\frac {|x-z|^2} s-\frac {|x-y|^2} r}} {(sr)^{\frac d 2}}\mathrm{d} x\mathrm{d} y\mathrm{d} \theta\\ &&\qquad \leq \|\zeta\|_{\infty}\lim_{n\rightarrow+\infty}{\int}_0^1\left( {\int}_{\mathbb{R}^d}\left( {\int}_{\mathbb{R}^d}\|\nabla \zeta_n(x+\theta h+z)\|\frac {e^{-\frac {|x|^2} s}} {s^{d/2}}\mathrm{d} x\right)\frac {|h|e^{\frac {-|h|^2} r}} {r^{\frac d 2}} \mathrm{d} h\right)\mathrm{d} \theta\\ &&\qquad \leq \|\zeta\|_{\infty}{\int}_{\mathbb{R}^d} Ks^{-\frac 1 2}(1+\sqrt s)\frac {|h|e^{\frac {-|h|^2} r}} {r^{\frac d 2}} \mathrm{d} h\\ &&\qquad \leq C\|\zeta\|_{\infty} s^{-\frac 1 2}(1+\sqrt s) r^{\frac 1 2}, \end{array} $$

where the constant C is independent of z. The class \(\mathcal {A}\) used in [21] allows its functions to be merely exponentially bounded; however, the drift there is assumed to be of sub-linear growth. There is no example showing that the class \(\mathcal {A}\) used in [21] contains functions which are more irregular than those in the class \(\mathcal {A}\) of [19].

2.2 Illustrative examples

In this subsection, we provide several examples to illustrate the condition (H2) and the order of the convergence rate. First, we give some comments on (H2). According to the proof of Theorem 2.1, both \(X_{t}^{(\delta )}\) and Xt can be represented, on suitable probability spaces, through the same process Yt = X0 + σWt. By the Girsanov transformation, the error between \(X_{t}^{(\delta )}\) and Xt mainly comes from the following term

$$\Big|{{\int}_{0}^{T}}\langle \sigma^{-1}(b(Y_{s})-b(Y_{s_{\delta}})),\mathrm{d} W_{s} \rangle\Big|.$$

Since Yt is a Gaussian process, (H2) is convenient for estimating the above stochastic integral; see (3.20) and the proof of Lemma 3.3 for more details. By comparison with the definition of the Besov space (see [24, (1.13)]), we call the set consisting of functions satisfying (H2) the Gauss-Besov class. The exponential weights in the integrand of (H2) allow b to grow to infinity as |x| increases.

Example 2.2

If b is Hölder continuous with exponent β, i.e.,

$$|b(y)-b(x)|\le L|x-y|^{\beta},$$

then (H2) holds with \(\alpha =\frac {\beta }{2}\) and a constant function ϕ(s). It is clear that b has sublinear growth if β < 1. Then, for any T > 0, (2.1) holds with \(\alpha =\frac {\beta } 2\).

Proof

By the Hölder continuity and (2.4), the assertion follows from the following inequality

$$ \begin{array}{@{}rcl@{}} &&\sup_{z\in\mathbb{R}^d}{\int}_{\mathbb{R}^d\times\mathbb{R}^d}|b(y)-b(x)|^{p_0}\frac {\operatorname{e}^{-\frac {|x-z|^2} s-\frac {|y-x|^2} r}} {s^{\frac d 2}r^{\frac d 2}}\mathrm{d} x\mathrm{d} y\\ &&\le L^{p_0}\sup_{z\in\mathbb{R}^d}{\int}_{\mathbb{R}^d\times\mathbb{R}^d}|y-x|^{\beta {p_0}}\frac {\operatorname{e}^{-\frac {|x-z|^2} s-\frac {|y-x|^2} r}} {s^{\frac d 2}r^{\frac d 2}}\mathrm{d} x\mathrm{d} y\\ &&\le L^{p_0}\frac{1}{{s^{\frac d 2}r^{\frac d 2}}}\bigg(\frac{\beta {p_0}r}{\operatorname{e}}\bigg)^{\frac{\beta {p_0}}{2}}\sup_{z\in\mathbb{R}^d}{\int}_{\mathbb{R}^d\times\mathbb{R}^d}\operatorname{e}^{-\frac {|x-z|^2}{s}}\operatorname{e}^{-\frac {|y-x|^2}{2r}}\mathrm{d} x\mathrm{d} y\\ &&\le CL^{p_0} \left( \frac{\beta {p_0}r}{\operatorname{e}}\right)^{\frac{\beta {p_0}}{2}}. \end{array} $$

The following example shows that (H2) can hold even if the drift term b is not piecewise continuous.

Example 2.3

Let A be the Smith-Volterra-Cantor set on [0,1], which is constructed in the following way. In the first step, we let \(I_{1,1}=\left (\frac 3 8,\frac 5 8\right )\), \(J_{1,1}=\left [0,\frac 3 8\right ],\) \(J_{1,2}=[\frac 5 8,1]\) and remove the open interval I1,1 from [0,1]. In the second step, we remove the middle open intervals of length \(\frac 1 {4^{2}}\), denoted by I2,1 and I2,2, from J1,1 and J1,2 respectively, i.e., \(I_{2,1}=\left (\frac 5 {32}, \frac {7} {32}\right )\), \(I_{2,2}=\left (\frac {25} {32},\frac {27} {32}\right )\). The remaining intervals are denoted by J2,1,J2,2,J2,3,J2,4, i.e.,

$$J_{2,1}=\left[0,\frac 5 {32}\right],J_{2,2}=\left[\frac 7 {32},\frac 3 8\right],J_{2,3}=\left[\frac 5 {8}, \frac {25} {32}\right],J_{2,4}=\left[\frac {27} {32},1\right].$$

In the n-th step, we remove the middle open intervals of length \(\frac 1 {4^{n}}\), denoted by \(I_{n,1},\cdots , I_{n,2^{n-1}}\), from \(J_{n-1,1},\cdots , J_{n-1,2^{n-1}}\) respectively, and the remaining intervals are denoted by \(J_{n,1},\cdots , J_{n, 2^{n}}\). Let

$$A=\bigcap_{n=1}^{\infty}\left( \bigcup_{k=1}^{2^{n}} J_{n,k}\right).$$

Then, A is a nowhere dense set and the Lebesgue measure of A is 1/2, since the total length removed equals \({\sum }_{n=1}^{\infty }2^{n-1}4^{-n}=\frac 1 2\). Define

All the endpoints of the intervals \(\bar I_{n,j}\) are discontinuity points of b, and these endpoints are dense in A. Hence, any interval I ⊂ [0,1] with I ∩ A ≠ ∅ contains discontinuity points of b, whereas any interval I ⊂ [0,1] with I ∩ A = ∅ is a subset of some In,j. Therefore, b is not a piecewise continuous function. In the following, we shall show that b satisfies condition (H2) with p0 = 2, \(\alpha =\frac 1 4\) and \(\phi (s)=Cs^{-\frac 1 4}\).

Proof

For u > 0 and any interval (a1,a2) (it is similar for [a1,a2]),

For u < 0, we obtain that

Hence, it follows from Jensen’s inequality that

Combining this with (2.4), we obtain that

$$ \begin{array}{@{}rcl@{}} &&\sup_{z\in\mathbb{R}}{\int}_{\mathbb{R}\times\mathbb{R}}|b(y)-b(x)|^2\frac{{\operatorname{e}^{-\frac {|x-z|^2} s}\operatorname{e}^{-\frac {|y-x|^2} r}}} {s^{\frac 1 2}r^{\frac 1 2}}\mathrm{d} x\mathrm{d} y\\ &&\qquad\le\frac{1}{{s^{\frac 1 2}r^{\frac 1 2}}}{\int}_{\mathbb{R}}\operatorname{e}^{-\frac {|u|^2} r}{\int}_{\mathbb{R}}|b(x+u)-b(x)|^2\mathrm{d} x\mathrm{d} u\\ &&\qquad\le\frac{4}{{s^{\frac 1 2}r^{\frac 1 2}}}{\int}_{\mathbb{R}}\operatorname{e}^{-\frac {|u|^2} r}|u|\mathrm{d} u =\left( Cs^{-\frac 1 4}r^{\frac 1 4}\right)^2. \end{array} $$
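
The interval-removal construction above is straightforward to reproduce numerically. The following short Python sketch (illustration only; the step counts are hypothetical) generates the remaining intervals Jn,k and confirms that their total length decreases to 1/2, in line with the removed length \({\sum }_{n=1}^{\infty }2^{n-1}4^{-n}=\frac 1 2\).

    def svc_intervals(n_steps):
        """Closed intervals J_{n,1},...,J_{n,2^n} left after n_steps removal steps."""
        intervals = [(0.0, 1.0)]
        for n in range(1, n_steps + 1):
            half_gap = 0.5 * 4.0 ** (-n)      # half of the removed length 4^{-n}
            new_intervals = []
            for a, b in intervals:
                c = 0.5 * (a + b)             # remove the middle open interval (c - half_gap, c + half_gap)
                new_intervals += [(a, c - half_gap), (c + half_gap, b)]
            intervals = new_intervals
        return intervals

    for n in (1, 2, 5, 10, 20):
        remaining = sum(b - a for a, b in svc_intervals(n))
        print(f"after step {n}: total remaining length = {remaining:.6f}")
    # the printed lengths 0.75, 0.625, ... decrease to 1/2, consistent with the Lebesgue measure of A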

A general class of functions satisfying (H2) is given by the fractional Sobolev space \(W^{\beta ,p}(\mathbb {R}^{d})\), as the following example shows.

Example 2.4

If there exist β > 0 and \(p\in [2,\infty )\cap (d,+\infty )\) such that the Gagliardo seminorm of b is finite, i.e.,

$$ \left[b\right]_{W^{\beta,p}}:=\left( {\int}_{\mathbb{R}^d\times\mathbb{R}^d} \frac {|b(x)-b(y)|^p} {|x-y|^{d+\beta p}}\mathrm{d} x\mathrm{d} y\right)^{\frac 1 p}<\infty, $$

then (H2) holds for p0 = p with \(\alpha =\frac {\beta } 2\) and \(\phi (s)= C_{1}s^{-\frac {d}{2p}}\left [b\right ]_{W^{\beta ,p}}\) for some constant C1 > 0 (note that \({{\int \limits }_{0}^{l}}\phi ^{2}(s)\mathrm {d} s<\infty \) precisely because p > d). Hence, if b satisfies (H1) and \(\left [b\right ]_{W^{\beta ,p}}<\infty \) with \(p\in [2,\infty )\cap (d,+\infty )\), then (2.1) holds with \(\alpha =\frac {\beta } 2\).

Proof

Indeed, by Hölder’s inequality and (2.4), it follows that

$$ \begin{array}{@{}rcl@{}} &&\frac{1}{(rs)^{\frac{d}{2}}}{\int}_{\mathbb{R}^d\times\mathbb{R}^d}|b(y)-b(x)|^{p}\operatorname{e}^{-\frac {|x-z|^2} s-\frac {|y-x|^2} r}\mathrm{d} x\mathrm{d} y\\ &&=\frac{1}{(rs)^{\frac{d}{2}}}{\int}_{\mathbb{R}^d\times\mathbb{R}^d}\frac {|b(x)-b(y)|^{p}} {|x-y|^{d+\beta {p}}}\operatorname{e}^{-\frac {|x-z|^2} s-\frac {|y-x|^2} r}|x-y|^{d+\beta {p}}\mathrm{d} x\mathrm{d} y\\ &&\le C_1s^{-\frac{d}{2}}r^{\frac{\beta p}{2}}{\int}_{\mathbb{R}^d\times\mathbb{R}^d}\frac {|b(x)-b(y)|^{p}} {|x-y|^{d+\beta {p}}}\operatorname{e}^{-\frac {|x-z|^2} s-\frac {|y-x|^2} {2r}}\mathrm{d} x\mathrm{d} y\\ &&\le C_1s^{-\frac{d}{2}}r^{\frac{\beta p}{2}}\left[b\right]_{W^{\beta,p}}^p. \end{array} $$

3 Proof of Theorem 2.1

The key point in proving the main result is to construct a reference SDE. By Girsanov’s theorem, the reference SDE provides representations of (1.1) and of its EM’s approximation (1.2) under different probability measures.

We denote by Yt = x + σWt the reference SDE of (1.1). Then, Yt is a time-homogeneous Markov process whose heat kernel with respect to the Lebesgue measure is given by

$$ p_t(x,y)=\frac{\exp\Big\{-\frac{\langle(\sigma\sigma^{*})^{-1}(y-x),(y-x)\rangle}{2t}\Big\}}{\sqrt{(2t\pi)^{d}\det(\sigma\sigma^{*})}},~~x,y\in\mathbb{R}^d. $$
(3.1)

To prove Theorem 2.1, we give three auxiliary lemmas.

The first lemma is on the exponential estimate of |b(Yt)|. Here, we use the condition (H1’), which is weaker than (H1).

  1. (H1’)

    There exist β ∈ [0,1), nonnegative constants L1,L2 and a nonnegative function \(F \in {L^{p_{1}}(\mathbb {R}^{d})}\) for some p1 > d such that

    $$ |b(x)|\leq L_1+L_2|x|^{\beta}+F(x). $$
    (3.2)

Lemma 3.1

Assume (H1’) holds. Then, for all T,λ > 0, it holds that

$$ \mathbb{E}\exp\Big\{\lambda{\int}_0^{T}|\sigma^{-1}b(Y_s)|^2\mathrm{d} s\Big\}<\infty. $$
(3.3)

Proof

Note that for any ε > 0

$$ L_1+L_2|x|^{\beta}\le L_1+(1-\beta)L_2^{\frac{1}{1-\beta}}\left( \frac{\beta}{\varepsilon}\right)^{\frac{\beta}{1-\beta}}+\varepsilon|x|=:L(\varepsilon)+\varepsilon|x|, $$
(3.4)

and for any a,b,c,ε1,ε2 > 0

$$(a+b+c)^{2}\le \left( 2+\frac{1}{\varepsilon_{1}}\right)a^{2}+(1+\varepsilon_{1}+\varepsilon_{2})b^{2}+\left( 2+\frac{1}{\varepsilon_{2}}\right)c^{2}.$$

Combining these with (3.2) and the Hölder inequality, we have that

$$ \begin{array}{@{}rcl@{}} &&\mathbb{E}\exp\Big\{\lambda{\int}_0^{T}|\sigma^{-1}b(Y_s)|^2\mathrm{d} s\Big\}\\ &&\le \mathbb{E}\exp\Big\{\lambda {\int}_0^{T} \|\sigma^{-1}\|^2 \left( L(\varepsilon)+\varepsilon|Y_s|+F(Y_s)\right)^2\mathrm{d} s\Big\}\\ &&\le \mathbb{E}\exp\Big\{\lambda {\int}_0^{T} \|\sigma^{-1}\|^2 \left( (L(\varepsilon)+\varepsilon|x|)+\varepsilon|Y_s-x|+F(Y_s)\right)^2\mathrm{d} s\Big\}\\ &&\leq \mathbb{E}\exp\Big\{\lambda {\int}_0^{T} \|\sigma^{-1}\|^2 \Big(\left( 2+\varepsilon_1^{-1}\right)(L(\varepsilon)+\varepsilon|x|)^2\\ &&\qquad\qquad +(1+\varepsilon_1+\varepsilon_2)\varepsilon^2|Y_s-x|^2+(2+\varepsilon_2^{-1}) F^2(Y_s)\Big) \mathrm{d} s\Big\}\\ &&\le \exp\{\lambda T \|\sigma^{-1}\|^2(L(\varepsilon)+\varepsilon|x|)^2\left( 2+\varepsilon_1^{-1}\right)\}\\ &&\quad \times\left( \mathbb{E}\exp\left\{\lambda (1+\varepsilon_1+\varepsilon_2)^2 \varepsilon^2\|\sigma^{-1}\|^2{\int}_0^T|Y_s-x|^2\mathrm{d} s\right\}\right)^{ \frac{1}{1+\varepsilon_1+\varepsilon_2}}\\ &&\quad \times \left( \!\mathbb{E}\exp\!\left\{ \frac{\lambda(2+\varepsilon_2^{-1})(1\!+\varepsilon_1+\varepsilon_2)}{\varepsilon_1+\varepsilon_2}\|\sigma^{-1}\|^2{\int}_0^T F^2(Y_s)\mathrm{d} s\right\}\right)^{\frac {\varepsilon_1+\varepsilon_2}{1+\varepsilon_1+\varepsilon_2}}. \end{array} $$
(3.5)

Let

$$ \begin{array}{@{}rcl@{}} I_{1,T}&=&\mathbb{E}\exp\left\{\lambda (1+\varepsilon_1+\varepsilon_2)^2 \varepsilon^2\|\sigma^{-1}\|^2{\int}_0^T|Y_s-x|^2\mathrm{d} s\right\},\\ I_{2,T}&=&\mathbb{E}\exp\left\{ \frac{\lambda(2+\varepsilon_2^{-1})(1+\varepsilon_1+\varepsilon_2)}{\varepsilon_1+\varepsilon_2}\|\sigma^{-1}\|^2{\int}_0^T F^2(Y_s)\mathrm{d} s\right\}. \end{array} $$

Since \(F\in L^{p_{1}}(\mathbb {R}^{d})\), for any 0 ≤ S ≤ T and q satisfying \(\frac d {p_{1}} +\frac 1 q<1\), we obtain that (see, e.g., [12])

$$ \mathbb{E} \left[{\int}_S^T F^2(Y_s)\mathrm{d} s\Big| \mathscr{F}_S\right]\leq (T-S)^{\frac 1 q}\|F\|_{L^{p_1}}. $$
(3.6)

This yields the following Khasminskii’s estimate (see, e.g., [27, Lemma 3.5]): for any C > 0

$$ \mathbb{E}\exp\left\{C{\int}_0^T F^2(Y_s)\mathrm{d} s\right\}<\infty. $$
(3.7)
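
For completeness, here is a brief sketch of the Khasminskii-type argument behind (3.7) (cf. [27, Lemma 3.5]). Choose h ∈ (0,T] so small that \(Ch^{\frac 1 q}\|F\|_{L^{p_{1}}}\le \frac 1 2\). Expanding the exponential and estimating the time-ordered iterated integrals by (3.6) gives, for every S ∈ [0,T − h],

$$ \mathbb{E}\Big[\exp\Big\{C{\int}_S^{S+h}F^2(Y_s)\mathrm{d} s\Big\}\Big|\mathscr{F}_S\Big]\le\sum\limits_{k=0}^{\infty}\Big(Ch^{\frac 1 q}\|F\|_{L^{p_1}}\Big)^{k}\le 2, $$

and iterating this bound over at most \(\lceil T/h\rceil \) consecutive subintervals via the tower property yields (3.7).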

Thus, for any λ,ε1,ε2 > 0, one has

$$ I_{2,T}<\infty. $$
(3.8)

For I1,T, since ε, ε1 and ε2 are arbitrary, for any T > 0 we can choose them sufficiently small such that

$$1-2T^{2}(1+\varepsilon_{1}+\varepsilon_{2})^{2}\lambda \varepsilon^{2}\|\sigma^{-1}\|^{2}\|\sigma \|^{2} =:\hat{\lambda}>0.$$

This, together with the Jensen inequality and the heat kernel (3.1), yields that

$$ \begin{array}{@{}rcl@{}} I_{1,T}&=&\mathbb{E}\exp\left\{\lambda (1+\varepsilon_1+\varepsilon_2)^2 \varepsilon^2\|\sigma^{-1}\|^2{\int}_0^T|Y_s-x|^2\mathrm{d} s\right\}\\ & \leq& \frac 1 T{\int}_0^T\mathbb{E}\exp\left\{T\lambda (1+\varepsilon_1+\varepsilon_2)^2 \varepsilon^2\|\sigma^{-1}\|^2 |Y_s-x|^2 \right\}\mathrm{d} s\\ & = &{\int}_0^T{\int}_{\mathbb{R}^d}\frac {\exp\left\{T\lambda (1+\varepsilon_1+\varepsilon_2)^2\varepsilon^2\|\sigma^{-1}\|^2 | y|^2 -\frac {|\sigma^{-1}y|^2} {2s}\right\}} { T\sqrt{(2s\pi)^{d}\det(\sigma\sigma^{*})}}\mathrm{d} y \mathrm{d} s\\ & \leq& {\int}_0^T{\int}_{\mathbb{R}^d}\frac {\exp\left\{ -(\frac{1-2sT\lambda (1+\varepsilon_1+\varepsilon_2)^2\varepsilon^2 \|\sigma^{-1}\|^2\|\sigma \|^2}{2s})|\sigma^{-1}y|^2\right\}} { T\sqrt{(2s\pi)^{d}\det(\sigma\sigma^{*})}}\mathrm{d} y \mathrm{d} s\\ &\le&{\int}_0^T{\int}_{\mathbb{R}^d}\frac {\exp\left\{ -(\frac{\hat{\lambda}}{2s})|\sigma^{-1}y|^2\right\}} { T\sqrt{(2s\pi)^{d}\det(\sigma\sigma^{*})}}\mathrm{d} y \mathrm{d} s \\ &=&\hat{\lambda}^{-\frac{d}{2}} <\infty. \end{array} $$
(3.9)

Plugging (3.9) and (3.8) into (3.5), then (3.3) follows. □

The following lemma deals with the exponential estimate of \(|b(Y_{t_{\delta }})|\), where \(\{Y_{t_{\delta }}\}_{t\in [0,T]}\) is the reference process evaluated along the discretization grid of the EM’s scheme. The Krylov estimate (3.6) fails for \(Y_{s_{\delta }}\) (see [3, Remark 2.5] or [23]). Hence, we use (H1) instead of (H1’) in Lemma 3.2.

Lemma 3.2

Assume (H1). Then, for all T > 0,λ > 0, we have

$$ \sup_{0<\delta < 1\wedge T}\mathbb{E}\exp\Big\{\lambda{\int}_0^{T}|\sigma^{-1}b(Y_{s_{\delta}})|^2\mathrm{d} s\Big\}<\infty. $$
(3.10)

Proof

Splitting the interval [0,T] at δ and applying (3.4), it follows from an elementary inequality that

$$ \begin{array}{@{}rcl@{}} &&\mathbb{E}\exp\Big\{\lambda{\int}_0^{T}|\sigma^{-1}b(Y_{s_{\delta}})|^2\mathrm{d} s\Big\}\\ &&=\mathbb{E}\Big\{\exp\Big\{\lambda{\int}_0^{ \delta}|\sigma^{-1}b(Y_{s_{\delta}})|^2\mathrm{d} s\Big\}\exp\Big\{\lambda{\int}_{\delta}^{T}|\sigma^{-1}b(Y_{s_{\delta}})|^2\mathrm{d} s\Big\}\Big\}\\ &&\le \exp\left\{\lambda\delta\|\sigma^{-1}\|^2(L(\varepsilon)+\varepsilon |x|)^2\right\} \mathbb{E}\exp\Big\{\lambda{\int}_{\delta}^{T}\|\sigma^{-1}\|^2\left(L(\varepsilon)+\varepsilon |x|+\varepsilon |Y_{s_{\delta}}-x|\right)^2\mathrm{d} s\Big\}\\ &&\le \exp\{\lambda\delta\|\sigma^{-1}\|^2(L(\varepsilon)+\varepsilon |x|)^2\}\exp\{\lambda (T-\delta) \|\sigma^{-1}\|^2(L(\varepsilon)+\varepsilon|x|)^2\left( 1+\varepsilon_1^{-1}\right)\}\\ &&\quad\times\mathbb{E}\exp\left\{\lambda (1+\varepsilon_1) \varepsilon^2\|\sigma^{-1}\|^2{\int}_{\delta}^T|Y_{s_{\delta}}-x|^2\mathrm{d} s\right\}\\ &&\le \exp\{\lambda T\|\sigma^{-1}\|^2(L(\varepsilon)+\varepsilon |x|)^2\left( 1+\varepsilon_1^{-1}\right)\}\\ &&\quad\times\mathbb{E}\exp\left\{\lambda (1+\varepsilon_1) \varepsilon^2\|\sigma^{-1}\|^2{\int}_{\delta}^T|Y_{s_{\delta}}-x|^2\mathrm{d} s\right\}. \end{array} $$
(3.11)

For any T,λ > 0, we choose ε and ε1 sufficiently small such that

$$ 1-2T^2\lambda (1+\varepsilon_1) \varepsilon^2\|\sigma^{-1}\|^2\|\sigma \|^2=:\breve{\lambda}>0. $$

This, together with the Jensen inequality and (3.1), yields that

$$ \begin{array}{@{}rcl@{}} &&\mathbb{E}\exp\left\{\lambda (1+\varepsilon_1) \varepsilon^2\|\sigma^{-1}\|^2{\int}_{\delta}^T|Y_{s_{\delta}}-x|^2\mathrm{d} s\right\}\\ &&\le\frac{1}{T-\delta}{\int}_{\delta}^T\mathbb{E}\exp\{(T-\delta)\lambda (1+\varepsilon_1) \varepsilon^2\|\sigma^{-1}\|^2|Y_{s_{\delta}}-x|^2\}\mathrm{d} s\\ && \le{\int}_{\delta}^T{\int}_{\mathbb{R}^d}\frac{\exp\{(T-\delta)\lambda (1+\varepsilon_1) \varepsilon^2\|\sigma^{-1}\|^2|{ y } |^2-\frac{\langle(\sigma\sigma^{*})^{-1}y,y\rangle}{2s_{\delta}}\}}{(T-\delta)\sqrt{(2\pi s_{\delta})^d\det{(\sigma\sigma^{*})}}}\mathrm{d} y\mathrm{d} s\\ && \le{\int}_{\delta}^T{\int}_{\mathbb{R}^d}\frac{\exp\{(T-\delta)\lambda (1+\varepsilon_1) \varepsilon^2\|\sigma^{-1}\|^2\|\sigma \|^2|\sigma^{-1}y |^2-\frac{|\sigma^{-1}y|^2}{2s_{\delta}}\}}{(T-\delta)\sqrt{(2\pi s_{\delta})^d\det{(\sigma\sigma^{*})}}}\mathrm{d} y\mathrm{d} s\\ &&\le {\int}_{\delta}^T{\int}_{\mathbb{R}^d}\frac{\exp\{- \frac{(1-2(T-\delta)^2\lambda (1+\varepsilon_1) \varepsilon^2\|\sigma^{-1}\|^2\|\sigma \|^2)}{2s_{\delta}}|\sigma^{-1}y|^2\}} {(T-\delta)\sqrt{(2\pi s_{\delta})^d\det{(\sigma\sigma^{*})}}}\mathrm{d} y\mathrm{d} s\\ &&\le {\int}_{\delta}^T{\int}_{\mathbb{R}^d}\frac{\exp\{- \frac{(1-2T^2\lambda (1+\varepsilon_1) \varepsilon^2\|\sigma^{-1}\|^2\|\sigma \|^2)}{2s_{\delta}}|\sigma^{-1}y|^2\}} {(T-\delta)\sqrt{(2\pi s_{\delta})^d\det{(\sigma\sigma^{*})}}}\mathrm{d} y\mathrm{d} s\\ &&=\breve{\lambda}^{-\frac{d}{2}}<\infty. \end{array} $$
(3.12)

Combining this with (3.11), we have that (3.10) holds. □

Remark 3.1

According to the proofs of Lemmas 3.1 and 3.2 (see especially (3.9), (3.12), and the definitions of \(\hat {\lambda }\) and \(\breve {\lambda }\)), we have that ε = O(T− 1) as \(T\rightarrow +\infty \). Then, the constant \(TL^{2}(\varepsilon )\) appearing in (3.5) and (3.11) is of order \((1-\beta )^{2}(L_{2}^{\frac {2}{1+\beta }}T)^{\frac {1+\beta } {1-\beta }}\). Hence, for larger \(L_{2}^{\frac 2 {1+\beta }}T\), the closer β is to 1, the larger the upper bounds in (3.3) and (3.10).

Lemmas 3.1 and 3.2 serve to verify the Novikov condition in the proof of Theorem 2.1. For the case β < 1, the constant λ in both lemmas can be arbitrary. For the case β = 1, taking ε = L2 and L(ε) = L1 in (3.4), one can see from (3.9), (3.12) and the definitions of \(\hat {\lambda }\) and \(\breve {\lambda }\) that (3.3) and (3.10) hold for λ > 0 and T > 0 satisfying the following condition

$$ 2T^2\lambda L_2^2\|\sigma^{-1}\|^2\|\sigma\|^2<1, $$
(3.13)

and sufficiently small ε1 and ε2.

Lemma 3.3

Assume (H2). Then, there exists a constant Cσ > 0 depending only on σ such that for all 0 < s ≤ t ≤ T we have

$$ \mathbb{E}|b(Y_t)-b(Y_s)|^{p_0}\le C_{\sigma}(\phi(2s\|\sigma\|^2)(2(t-s)\|\sigma\|^2)^{\alpha})^{p_0}. $$
(3.14)

Proof

By the definition of the reference SDE, it is easy to see that

$$ \mathbb{E}|b(Y_t)-b(Y_{s})|^{p_0}=\mathbb{E}|b(x+\sigma W_t)-b(x+\sigma W_{s})|^{p_0}. $$

Noting that Wt − Ws and Ws are mutually independent, we obtain from (3.1) and (H2) that

$$ \begin{array}{@{}rcl@{}} &&\mathbb{E}|b(x+\sigma W_t)-b(x+\sigma W_{s})|^{p_0}\\ &&={\int}_{\mathbb{R}^d}{\int}_{\mathbb{R}^d}|b(x+y)-b(x+z)|^{p_0}p_{t-s}(x+z,x+y)p_s(x,x+z)\mathrm{d} y\mathrm{d} z\\ &&={\int}_{\mathbb{R}^d}{\int}_{\mathbb{R}^d}|b(x+y)-b(x+z)|^{p_0}\frac{\operatorname{e}^{-\frac{\langle(\sigma\sigma^{*})^{-1}(y-z),(y-z)\rangle}{2(t-s)}}}{\sqrt{(2\pi (t-s) )^d\det(\sigma\sigma^{*})}} \frac{\operatorname{e}^{-\frac{\langle(\sigma\sigma^{*})^{-1}z,z\rangle}{2s}}}{\sqrt{(2\pi s)^d\det(\sigma\sigma^{*})}}\mathrm{d} y\mathrm{d} z\\ &&\le\frac{\|\sigma\|^{2d}}{\pi^d\det(\sigma\sigma^{*})}{\int}_{\mathbb{R}^d}{\int}_{\mathbb{R}^d}|b(x+y)-b(x+z)|^{p_0}\frac{\operatorname{e}^{-\frac{|y-z|^2}{2\|\sigma\|^2(t-s)}}\operatorname{e}^{-\frac{|z|^2}{2\|\sigma\|^2s}}}{(2 (t-s)\|\sigma\|^2 )^{d/2}(2 s\|\sigma\|^2 )^{d/2}}\mathrm{d} y\mathrm{d} z\\ &&=\frac{\|\sigma\|^{2d}}{\pi^d\det(\sigma\sigma^{*})}{\int}_{\mathbb{R}^d}{\int}_{\mathbb{R}^d}|b(u)-b(v)|^{p_0}\frac{\operatorname{e}^{-\frac{|u-v|^2}{2\|\sigma\|^2(t-s)}}\operatorname{e}^{-\frac{|v-x|^2}{2\|\sigma\|^2s}}}{(2 (t-s)\|\sigma\|^2 )^{d/2}(2 s\|\sigma\|^2 )^{d/2}}\mathrm{d} u\mathrm{d} v\\ &&\le\sup_{x\in\mathbb{R}^d}\frac{\|\sigma\|^{2d}}{\pi^d\det(\sigma\sigma^{*})}{\int}_{\mathbb{R}^d}{\int}_{\mathbb{R}^d}|b(u)-b(v)|^{p_0}\frac{\operatorname{e}^{-\frac{|u-v|^2}{2\|\sigma\|^2(t-s)}}\operatorname{e}^{-\frac{|v-x|^2}{2\|\sigma\|^2s}}}{(2 (t-s)\|\sigma\|^2 )^{d/2}(2 s\|\sigma\|^2 )^{d/2}}\mathrm{d} u\mathrm{d} v\\ &&\le\frac{\|\sigma\|^{2d}}{\pi^d\det(\sigma\sigma^{*})}(\phi(2s\|\sigma\|^2)(2(t-s)\|\sigma\|^2)^{\alpha})^{p_0}, \end{array} $$

which implies that (3.14) holds by taking \(C_{\sigma }=\frac {\|\sigma \|^{2d}}{\pi ^{d}\det (\sigma \sigma ^{*})}\). □

We are now in a position to complete the proof of Theorem 2.1.

Proof of Theorem 2.1

Let

$$ \begin{array}{@{}rcl@{}} \hat W_t &=& W_t-{\int}_0^t\sigma^{-1}b(Y_s)\mathrm{d} s,\quad \tilde{W}_t =W_t-{\int}_0^t\sigma^{-1} b(Y_{s_{\delta}}) \mathrm{d} s,\\ R_{1,T}&=&\exp\Big\{{\int}_0^T\langle\sigma^{-1}b(Y_s),\mathrm{d} W_s\rangle-\frac{1}{2}{\int}_0^T|\sigma^{-1}b(Y_s)|^2\mathrm{d} s\Big\},\\ R_{2,T}&=&\exp\Big\{{\int}_0^T\langle\sigma^{-1} b(Y_{s_{\delta}}),\mathrm{d} W_s\rangle-\frac{1}{2}{\int}_0^T\big|\sigma^{-1} b(Y_{s_{\delta}})\big|^2\mathrm{d} s\Big\}. \end{array} $$

The proof is divided into two steps:

Step (i), we shall prove that the assertion holds under (H1) and (H2).

We first show that \(\{\hat W_{t}\}_{t\in [0,T]}\) is a Brownian motion under \(\mathbb {Q}_{1}:=R_{1,T}\mathbb P\), and \(\{\tilde W_{t}\}_{t\in [0,T]}\) is a Brownian motion under \(\mathbb {Q}_{2}:=R_{2,T}\mathbb P\). In view of Lemma 3.1 and Novikov’s condition, {R1,t}t∈[0,T] is a martingale, and the Girsanov theorem implies that \(\{\hat W_{t}\}_{t\in [0,T]}\) is a Brownian motion under \(\mathbb {Q}_{1}\). Similarly, it follows from Lemma 3.2 and Novikov’s condition that \(\{\tilde W_{t}\}_{t\in [0,T]}\) is a Brownian motion under \(\mathbb {Q}_{2}\).

Then, we can reformulate Yt = x + σWt as follows:

$$Y_{t}=x+{{\int}_{0}^{t}} b(Y_{s})\mathrm{d} s+\sigma \hat W_{t},$$

which means that \((Y_{t},\hat {W_{t}})\) under \(\mathbb {Q}_{1}\) is a weak solution of (1.1). Hence, Yt under \(\mathbb {Q}_{1}\) has the same law as Xt under \(\mathbb P\) due to the pathwise uniqueness of the solutions to (1.1) (see Remark 2.1). Similarly, reformulating Yt = x + σWt as follows:

$$ Y_t= x+{\int}_0^t b(Y_{s_{\delta}}) \mathrm{d} s+\sigma \tilde{W}_t. $$
(3.15)

Then, \((Y_{t},\tilde {W}_{t})\) under \(\mathbb {Q}_{2}\) is a weak solution of (1.2). Hence, Yt under \(\mathbb {Q}_{2}\) has the same law as \(X_{t}^{(\delta )}\) under \(\mathbb P\) due to the pathwise uniqueness of solutions to (1.2).

From these equivalence relations, we obtain that for any bounded and measurable function f on \(\mathbb {R}^{d}\)

$$ \begin{array}{@{}rcl@{}} &&|\mathbb{E} f(X_t)-\mathbb{E} f(X_t^{(\delta)})|=|\mathbb{E}_{\mathbb{Q}_1}f(Y_t)-\mathbb{E}_{\mathbb{Q}_2}f(Y_t)|\\ &&=\mathbb{E}|(R_{1,T}-R_{2,T})f(Y_t)|\le\|f\|_{\infty}\mathbb{E}|R_{1,T}-R_{2,T}|. \end{array} $$

Using the inequality \(|\operatorname {e}^{x}-\operatorname {e}^{y}|\le (\operatorname {e}^{x}\vee \operatorname {e}^{y})|x-y|\), Hölder’s inequality and Minkowski’s inequality, we derive from the definitions of R1,T and R2,T that

$$ \begin{array}{@{}rcl@{}} &&\mathbb{E}|R_{1,T}-R_{2,T}|\\ &&\le\mathbb{E}\Big\{(R_{1,T}\vee R_{2,T})\Big|{\int}_0^T\langle\sigma^{-1}(b(Y_s)-b(Y_{s_{\delta}})),\mathrm{d} W_s\rangle\\ &&\qquad +\frac{1}{2}{\int}_0^T\Big(|\sigma^{-1}b(Y_{s_{\delta}})|^2-|\sigma^{-1}b(Y_{s})|^2\Big)\mathrm{d} s\Big|\Big\} \end{array} $$
$$ \begin{array}{@{}rcl@{}} &&\le\mathbb{E}\Big[(R_{1,T}\vee R_{2,T})\Big|{\int}_0^T\langle\sigma^{-1}(b(Y_s)-b(Y_{s_{\delta}})),\mathrm{d} W_s\rangle\Big|\Big]\\ &&\quad+\frac{1}{2}\mathbb{E}\Big[(R_{1,T}\vee R_{2,T})\Big|{\int}_0^T\Big(|\sigma^{-1}b(Y_{s_{\delta}})|^2-|\sigma^{-1}b(Y_{s})|^2\Big)\mathrm{d} s\Big|\Big]\\ &&\le\Big(\mathbb{E}(R_{1,T}\vee R_{2,T})^{\frac {p_0} {p_0-1}}\Big)^{\frac {p_0-1} {p_0}} \Big(\mathbb{E}\Big|{\int}_0^T\langle\sigma^{-1}(b(Y_s)-b(Y_{s_{\delta}})),\mathrm{d} W_s\rangle\Big|^{p_0 }\Big)^{\frac 1 {p_0}}\\ &&\quad+\frac{1}{2}\Big(\mathbb{E}(R_{1,T}\vee R_{2,T})^{\frac {p_0+1} {p_0-1}}\Big)^{\frac {p_0-1} {p_0+1}}\Big(\mathbb{E}\Big|{\int}_0^T\big(|\sigma^{-1}b(Y_{s_{\delta}})|^2-|\sigma^{-1}b(Y_{s})|^2\big)\mathrm{d} s\Big|^{\frac {p_0+1} 2}\Big)^{\frac 2 {p_0+1}}\\ &&\le\Big(\mathbb{E}(R_{1,T}\vee R_{2,T})^{\frac {p_0} {p_0-1}}\Big)^{\frac {p_0-1} {p_0}} \Big(\mathbb{E}\Big|{\int}_0^T\langle\sigma^{-1}(b(Y_s)-b(Y_{s_{\delta}})),\mathrm{d} W_s\rangle\Big|^{p_0 }\Big)^{\frac 1 {p_0}}\\ &&\quad+\frac{1}{2}\Big(\mathbb{E}(R_{1,T}\vee R_{2,T})^{\frac {p_0+1} {p_0-1}}\Big)^{\frac {p_0-1} {p_0+1}}{\int}_0^T\Big(\mathbb{E}\big||\sigma^{-1}b(Y_{s_{\delta}})|^2-|\sigma^{-1}b(Y_{s})|^2\big|^{\frac {p_0+1} 2}\Big)^{\frac 2 {p_0+1}}\mathrm{d} s\\ &&=:\Big(\mathbb{E}(R_{1,T}\vee R_{2,T})^{\frac {p_0} {p_0-1}}\Big)^{\frac {p_0-1} {p_0}}G_{1,T}+\frac{1}{2}\Big(\mathbb{E}(R_{1,T}\vee R_{2,T})^{\frac {p_0+1} {p_0-1}}\Big)^{\frac {p_0-1} {p_0+1}}G_{2,T}. \end{array} $$
(3.16)

Let

$$ M_{t}={\int}_0^t\langle\sigma^{-1}b(Y_s),\mathrm{d} W_s\rangle~~\text{and}~~\hat{M}_{t}(q)=\operatorname{e}^{2qM_{t}-2q^2\langle M_{\cdot}\rangle_t},~q>0. $$

By Lemma 3.1, for any q > 1, \(\hat M_{t}(q)\) is an exponential martingale. Then, the Hölder inequality implies that

$$ \begin{array}{@{}rcl@{}} \mathbb{E} R_{1,T}^{\frac {p_0} {p_0-1}}&=&\mathbb{E}\exp\left\{\frac {p_0} {p_0-1}{\int}_0^{T}\langle\sigma^{-1}b(Y_s),\mathrm{d} W_s\rangle\right.\\ &&\qquad\quad\left.-\frac {p_0} {2(p_0-1)}{\int}_0^{T}|\sigma^{-1}b(Y_s)|^2\mathrm{d} s\right\}\\ &\le& \left( \mathbb{E} \hat{M}_{T}(\frac {p_0}{p_0-1})\right)^{1/2}\left( \mathbb{E}\exp\left\{\frac {p_0(p_0+1)} {(p_0-1)^2}{\int}_0^{T}|\sigma^{-1}b(Y_s)|^2\mathrm{d} s\right\}\right)^{1/2}\\ &=&\left( \mathbb{E}\exp\left\{\frac {p_0(p_0+1)} {(p_0-1)^2}{\int}_0^{T}|\sigma^{-1}b(Y_s)|^2\mathrm{d} s\right\}\right)^{1/2} \end{array} $$

and

$$ \begin{array}{@{}rcl@{}} \mathbb{E} R_{1,T}^{\frac {p_0+1} {p_0-1}}&\leq& \left( \mathbb{E} \hat{M}_{T}(\frac {p_0+1} {p_0-1})\right)^{1/2}\left( \mathbb{E}\exp\left\{\frac {(p_0+3)(p_0+1)} {(p_0-1)^2}{\int}_0^{T}|\sigma^{-1}b(Y_s)|^2\mathrm{d} s\right\}\right)^{1/2}\\ &=&\left( \mathbb{E}\exp\left\{\frac {(p_0+3)(p_0+1)} {(p_0-1)^2}{\int}_0^{T}|\sigma^{-1}b(Y_s)|^2\mathrm{d} s\right\}\right)^{1/2}. \end{array} $$

Then, it follows from Lemma 3.1 again that

$$ \mathbb{E}\left( R_{1,T }^{\frac {p_0+1} {p_0-1}}+R_{1,T}^{\frac {p_0} {p_0-1}}\right)<\infty. $$
(3.17)

Similarly, we can prove by using Hölder’s inequality and Lemma 3.2 that

$$ \mathbb{E}\left( R_{2,T }^{\frac {p_0+1} {p_0-1}}+R_{2,T}^{\frac {p_0} {p_0-1}}\right)<\infty. $$
(3.18)

Since \(\phi \in C((0,+\infty ),(0,+\infty ))\), there is C > 0 depending on l,T,σ such that

$$\phi^{2}(r)\leq C \phi^{2}(s),~l\leq s\leq r\leq 2\|\sigma\|^{2}T.$$

Combining this with the facts that ϕ is non-increasing on (0,l) and that \({{\int \limits }_{0}^{l}}\phi ^{2}(s)\mathrm {d} s<\infty \), which together yield \({\int \limits }_{0}^{2\|\sigma \|^{2}T}\phi ^{2}(s)\mathrm {d} s<\infty \), we obtain that

$$ \begin{array}{@{}rcl@{}} {\sum}_{k=1}^{[T/\delta] }\phi^2(2k\delta\|\sigma\|^2)\delta &\leq& {\sum}_{k=1}^{[T/\delta] }{\int}_{((k-1)\delta)\wedge \frac l {2\|\sigma\|^2} }^{(k\delta)\wedge \frac l {2\|\sigma\|^2} }\phi^2(2 \|\sigma\|^2 r)\mathrm{d} r+C{\int}_{\frac l {2\|\sigma\|^2} }^{T}\phi^2(2\|\sigma\|^2r)\mathrm{d} r\\ && ={\int}_{0}^{\frac l {2\|\sigma\|^2} }\phi^2(2\|\sigma\|^2r)\mathrm{d} r+C{\int}_{\frac l {2\|\sigma\|^2} }^{T}\phi^2(2\|\sigma\|^2r)\mathrm{d} r\\ && =\frac{1\vee C}{2\|\sigma\|^2}{\int}_0^{2\|\sigma\|^2T}\phi^2(s)\mathrm{d} s<\infty. \end{array} $$
(3.19)

This, together with the B-D-G inequality and Lemma 3.3, yields that for p0 ≥ 2

$$ \begin{array}{@{}rcl@{}} &&G_{1,T} =\Big(\mathbb{E}\Big|{\int}_0^T\langle\sigma^{-1}(b(Y_s)-b(Y_{s_{\delta}})),\mathrm{d} W_s\rangle\Big|^{p_0}\Big)^{1/p_0}\\ &&\le \left( \frac {p_0} {p_0-1}\right)^{\frac {p_0} 2}\left( \frac {p_0(p_0-1)} 2\right)^{\frac 1 2}\|\sigma^{-1}\| \left( {\int}_0^T\big(\mathbb{E}|b(Y_s)-b(Y_{s_{\delta}})|^{p_0 }\big)^{\frac 2 {p_0 }}\mathrm{d} s\right)^{\frac 1 2}\\ &&\le \delta^{\alpha}\frac {2^{\alpha}\|\sigma\|^{\frac {2d} {p_0}+2\alpha}\|\sigma^{-1}\|} {\left( \pi^d\det(\sigma\sigma^{*})\right)^{\frac 1 {p_0}}}\left( \frac {p_0} {p_0-1}\right)^{\frac {p_0} 2}\left( \frac {p_0(p_0-1)} 2\right)^{\frac 1 2}\Big({\int}_0^{T}\phi^2(2s_{\delta}\|\sigma\|^2)\mathrm{d} s\Big)^{\frac{1}{2}}\\ &&\le \delta^{\alpha}\frac {\sqrt{1\vee C}2^{\alpha-\frac{1}{2}}\|\sigma\|^{\frac {2d} {p_0}+2\alpha-1}\|\sigma^{-1}\|} {\left( \pi^d\det(\sigma\sigma^{*})\right)^{\frac 1 {p_0}}}\left( \frac {p_0} {p_0-1}\right)^{\frac {p_0} 2}\left( \frac {p_0(p_0-1)} 2\right)^{\frac 1 2}\Big({\int}_0^{2\|\sigma\|^2T}\phi^2(s)\mathrm{d} s\Big)^{\frac{1}{2}}\\ &&= C_{T,p_0,\sigma,\alpha,\phi}\delta^{\alpha}. \end{array} $$
(3.20)

Noting that for any p ≥ 1, one has

$$ \mathbb{E}|Y_t|^p\le2^{p-1}\left( |x|^p+(\sqrt t \|\sigma\|)^p\mathbb{E}| W_1|^p\right), $$
(3.21)

we derive from (3.4) and (3.21) that

$$ \begin{array}{@{}rcl@{}} &&\left( \mathbb{E}|b(Y_{s})+b(Y_{s_{\delta}})|^{\frac {p_{0}(p_{0}+1)} {p_{0}-1}}\right)^{\frac {p_{0}-1} {p_{0}(p_{0}+1)}}\\ &&\qquad \leq \left( \mathbb{E}\left( 2L(\varepsilon)+\varepsilon(|Y_{s}|+|Y_{s_{\delta}}|)\right)^{\frac {p_{0}(p_{0}+1)} {p_{0}-1}}\right)^{\frac {p_{0}-1} {p_{0}(p_{0}+1)}}\\ &&\qquad \leq 2\left\{L(\varepsilon)+2^{\frac{{p_{0}^{2}}+1}{p_{0}(p_{0}+1)}}\varepsilon\left( |x|+\sqrt {T}\|\sigma\|\left( \mathbb{E} |W_{1}|^{\frac {p_{0}(p_{0}+1)} {p_{0}-1}}\right)^{\frac {p_{0}-1} {p_{0}(p_{0}+1)}}\right)\right\}\\ &&\qquad =: C_{T,p_{0},\sigma,L(\varepsilon),\varepsilon,x}. \end{array} $$

Combining this with Lemma 3.3, (3.19) and Hölder’s inequality, we obtain

$$ \begin{array}{@{}rcl@{}} &&G_{2,T}=\frac{1}{2}{\int}_0^T\left( \mathbb{E}\Big||\sigma^{-1}b(Y_{s_{\delta}})|^2-|\sigma^{-1}b(Y_{s})|^2\Big|^{\frac {p_0+1} 2}\right)^{\frac 2 {p_0+1}}\mathrm{d} s\\ &&\le\frac{\|\sigma^{-1}\|^2}{2}{\int}_0^T\Big(\mathbb{E} |b(Y_s)-b(Y_{s_{\delta}})|^{\frac {p_0+1} 2}|b(Y_s)+b(Y_{s_{\delta}})|^{\frac {p_0+1} 2}\Big)^{\frac 2 {p_0+1}}\mathrm{d} s\\ &&\leq \frac{\|\sigma^{-1}\|^2}{2}{\int}_0^T\left( \mathbb{E} |b(Y_s)-b(Y_{s_{\delta}})|^{p_0}\right)^{\frac 1 {p_0}}\left( \mathbb{E}|b(Y_s)+b(Y_{s_{\delta}})|^{\frac {p_0(p_0+1)} {p_0-1}}\right)^{\frac {p_0-1} {p_0(p_0+1)}}\mathrm{d} s\\ &&\leq \frac{\|\sigma^{-1}\|^2}{2}C_{T,p_0,\sigma,L(\varepsilon),\varepsilon,x}{\int}_0^T\left( \mathbb{E} |b(Y_s)-b(Y_{s_{\delta}})|^{p_0}\right)^{\frac 1 {p_0}}\mathrm{d} s\\ &&\leq C_{T,p_0,\sigma,L(\varepsilon),\varepsilon,\phi,x}\delta^{\alpha}, \end{array} $$
(3.22)

where

$$ C_{T,p_0,\sigma,L(\varepsilon),\varepsilon,\phi,x}=\frac {{1\vee C}2^{\alpha-2}\|\sigma\|^{\frac {2d} {p_0}+2\alpha-2}\|\sigma^{-1}\|^2C_{T,p_0,\sigma,L(\varepsilon),\varepsilon,x} } {(\pi^d \det(\sigma\sigma^{*}))^{\frac 1 {p_0}}} {\int}_0^{2\|\sigma\|^2T}\phi(s)\mathrm{d} s. $$

The desired assertion (2.1) is proved by substituting (3.17), (3.18), (3.20) and (3.22) into (3.16). Therefore, the conclusion holds under (H1) and (H2).

Step (ii), we prove that if b satisfies the linear growth condition, then the conclusion (2.1) holds for T satisfying (2.2).

By Remark 3.1, the conclusions of Lemmas 3.1 and 3.2 hold for any λ,T satisfying (3.13). By (2.2) and \(\frac {(p_{0}+3)(p_{0}+1)} {(p_{0}-1)^{2}}> \frac {p_{0}(p_{0}+1)} {(p_{0}-1)^{2}}>\frac 1 2\), we can choose \(\lambda =\frac {(p_{0}+3)(p_{0}+1)} {(p_{0}-1)^{2}}\) in Lemmas 3.1 and 3.2. Then, repeating Step (i), we arrive at (3.16). Moreover, (3.17) and (3.18) hold by the same argument together with a stopping time technique. The second assertion then follows from (3.20) and (3.22). The proof is therefore complete. □

Remark 3.2

According to the proof of this theorem, the key reason why f in (2.1) can be merely bounded and measurable is that the distributions of \(X^{(\delta )}_{t}\) and Xt are both obtained from the same process Yt = x + σWt. This fails in the multiplicative noise case.