1 Introduction

Consider the stochastic differential equation

$$\begin{aligned} X(t) = X_0 +\int _{0}^{t}b(X(s))ds + \sigma B(t), \end{aligned}$$
(1)

where B is a fractional Brownian motion of Hurst index \(H\in ]1/2,1[\), \(b :\mathbb R\rightarrow \mathbb R\) is a continuous map and \(\sigma \in \mathbb R^*\).

Over the last two decades, many authors have studied statistical inference based on observations drawn from stochastic differential equations driven by fractional Brownian motion.

Most references on the estimation of the trend component in Eq. (1) deal with parametric estimators. In Kleptsyna and Le Breton (2001) and Hu and Nualart (2010), estimators of the trend component in Langevin’s equation are studied. Kleptsyna and Le Breton (2001) provide a maximum likelihood estimator, where the stochastic integral with respect to the solution of Eq. (1) reduces to an Itô integral. Tudor and Viens (2007) extend this estimator to equations with a drift function depending linearly on the unknown parameter. Hu and Nualart (2010) provide a least squares estimator, where the stochastic integral with respect to the solution of Eq. (1) is taken in the sense of Skorokhod. Hu et al. (2018) extend this estimator to equations with a drift function depending linearly on the unknown parameter.

In Neuenkirch and Tindel (2014), the authors study a least squares-type estimator defined by an objective function tailor-made with respect to the main result of Tudor and Viens (2009) on the rate of convergence of the quadratic variation of the fractional Brownian motion. Chronopoulou and Tindel (2013) provide a likelihood-based numerical procedure to estimate a parameter involved in both the drift and the volatility functions of a stochastic differential equation with multiplicative fractional noise.

On the nonparametric estimation of the trend component in Eq. (1), there are only a few references. Saussereau (2014) and Mishra and Prakasa Rao (2011) study the consistency of some Nadaraya–Watson-type estimators of the drift function b in Eq. (1). On nonparametric estimation in Itô’s calculus framework, the reader is referred to Kutoyants (2004).

Let \(K :\mathbb R\rightarrow \mathbb R_+\) be a kernel, that is, a nonnegative function with integral equal to 1. The paper deals with the consistency and a rate of convergence of the Nadaraya–Watson estimator

$$\begin{aligned} \widehat{b}_{T,h}(x) := \frac{\displaystyle {\int _{0}^{T}K\left( \frac{X(s) - x}{h}\right) \delta X(s)}}{\displaystyle {\int _{0}^{T}K\left( \frac{X(s) - x}{h}\right) ds}};\quad x\in \mathbb R \end{aligned}$$
(2)

of the drift function b in Eq. (1), where the stochastic integral with respect to X is taken in the sense of Skorokhod. Since computing the Skorokhod integral is a challenge, denoting by \(X_{x_0}\) the solution of Eq. (1) with initial condition \(x_0\in \mathbb R\), the following estimator is also studied:

$$\begin{aligned} \widehat{b}_{T,h,\varepsilon }(x):= & {} \frac{\displaystyle {\int _{0}^{T}K\left( \frac{X_{x_0}(s) - x}{h}\right) d X_{x_0}(s)}}{\displaystyle {\int _{0}^{T}K\left( \frac{X_{x_0}(s) - x}{h}\right) ds}}\nonumber \\&-\,\alpha _H\sigma ^2\frac{ \displaystyle {\frac{1}{h}\int _{0}^{T} \int _{0}^{u}K'\left( \frac{X_{x_0}(u) - x}{h}\right) \frac{X_{x_0 +\varepsilon }(u) - X_{x_0}(u)}{X_{x_0 +\varepsilon }(v) - X_{x_0}(v)}|u - v|^{2H - 2}dvdu}}{\displaystyle {\int _{0}^{T}K\left( \frac{X_{x_0}(s) - x}{h}\right) ds}}\nonumber \\ \end{aligned}$$
(3)

with \(\varepsilon > 0\) and \(x\in \mathbb R\). In this second estimator, the stochastic integral is taken pathwise. It depends on H, but an estimator of this parameter is provided, for instance, in Kubilius and Skorniakov (2016).

As detailed in Sect. 2.2, the Skorokhod integral is defined via the divergence operator, which is the adjoint of the Malliavin derivative for the fractional Brownian motion. If \(H = 1/2\), then the Skorokhod integral coincides with Itô’s integral on its domain. When \(H\in ]1/2,1[\), it is more difficult to compute the Skorokhod integral, but not impossible, as explained at the end of Sect. 2.2. Note that the pathwise stochastic integral defined in Sect. 2.1 would have been a more natural choice, but unfortunately it does not provide a consistent estimator (see Proposition 3.3).

Clearly, to be computable, the estimator \(\widehat{b}_{T,h,\varepsilon }(x)\) requires an observed path of the solution of Eq. (1) for two close but different values of the initial condition. This is not possible in every context, but we have in mind the following application field: if \(t\mapsto X_{x_0}(\omega ,t)\) denotes the concentration of a drug along time during its elimination by a patient \(\omega \) with initial dose \(x_0 > 0\), then \(t\mapsto X_{x_0 +\varepsilon }(\omega ,t)\) could be approximated by replicating the exact same protocol on patient \(\omega \), but with initial dose \(x_0 +\varepsilon \), after the complete elimination of the previous dose.

We mention that we do not study the additional error which occurs when only discrete-time observations with step \(\Delta \) on [0, T] (\(T = n\Delta \)) are available. Formula (3) then has to be discretized, and a study in the spirit of Saussereau (2014) (Sect. 4.3) must be conducted.
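As an illustration only, the following Python sketch shows one possible such discretization of formula (3), assuming equally spaced observations of the two paths, a known (or pre-estimated) Hurst index H, and left-point Riemann sums for all the integrals; the function and variable names are ours, and the error of this scheme is precisely what would remain to be studied.

```python
import numpy as np

def drift_estimator(x, X, X_eps, dt, h, sigma, H, K, dK):
    """Sketch of a discretized version of estimator (3).

    X, X_eps: observations of X_{x_0} and X_{x_0 + eps} on a grid of
    step dt; K, dK: kernel and its derivative (vectorized); sigma, H:
    assumed known. All integrals become left-point Riemann sums.
    """
    n = len(X) - 1
    t = dt * np.arange(n + 1)
    alpha_H = H * (2.0 * H - 1.0)
    w = K((X[:-1] - x) / h)
    denom = np.sum(w) * dt                 # int_0^T K((X(s) - x) / h) ds
    num = np.sum(w * np.diff(X))           # pathwise int_0^T K(...) dX(s)
    D = X_eps - X                          # finite-difference proxy for d/dx_0 X
    corr = 0.0                             # double integral over 0 <= v < u <= T
    for i in range(1, n + 1):
        kp = dK((X[i] - x) / h) / h        # (1/h) K'((X(u) - x) / h)
        if kp != 0.0:
            corr += kp * dt * dt * np.sum(
                (D[i] / D[:i]) * (t[i] - t[:i]) ** (2 * H - 2))
    return (num - alpha_H * sigma ** 2 * corr) / denom
```

Here K and dK could be any kernel fulfilling Assumption 3.1 below and its derivative.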

Section 2 deals with some preliminary results on stochastic integrals with respect to the fractional Brownian motion and an ergodic theorem for the solution of Eq. (1). The consistency and a rate of convergence of the Nadaraya–Watson estimator studied in this paper are stated in Sect. 3. Almost all the proofs of the paper are provided in Sect. 4.

Notations:

  (1)

    The vector space of Lipschitz continuous maps from \(\mathbb R\) into itself is denoted by \(\text {Lip}(\mathbb R)\) and equipped with the Lipschitz semi-norm \(\Vert .\Vert _{\text {Lip}}\) defined by

    $$\begin{aligned} \Vert \varphi \Vert _{\text {Lip}} := \sup \left\{ \frac{|\varphi (y) -\varphi (x)|}{|y - x|} \text { ; } x,y\in \mathbb R \text { and } x\not = y\right\} \end{aligned}$$

    for every \(\varphi \in \text {Lip}(\mathbb R)\).

  (2)

    For every \(m\in \mathbb N\),

    $$\begin{aligned} C_{b}^{m}(\mathbb R) := \left\{ \varphi \in C^m(\mathbb R) : \max _{k\in \llbracket 0,m\rrbracket } \Vert \varphi ^{(k)}\Vert _{\infty } <\infty \right\} . \end{aligned}$$
  (3)

    For every \(m\in \mathbb N^*\),

    $$\begin{aligned} \text {Lip}_{b}^{m}(\mathbb R) := \left\{ \varphi \in C^m(\mathbb R) : \varphi \in \text {Lip}(\mathbb R) \text { and } \max _{k\in \llbracket 1,m\rrbracket } \Vert \varphi ^{(k)}\Vert _{\infty } <\infty \right\} \end{aligned}$$

    and for every \(\varphi \in \text {Lip}_{b}^{m}(\mathbb R)\),

    $$\begin{aligned} \Vert \varphi \Vert _{\text {Lip}_{b}^{m}} := \Vert \varphi \Vert _{\text {Lip}}\vee \max _{k\in \llbracket 1,m\rrbracket }\Vert \varphi ^{(k)}\Vert _{\infty }. \end{aligned}$$

    The map \(\Vert .\Vert _{\text {Lip}_{b}^{m}}\) is a semi-norm on \(\text {Lip}_{b}^{m}(\mathbb R)\). Note that for every \(m\in \mathbb N^*\),

    $$\begin{aligned} C_{b}^{m}(\mathbb R) \subset \text {Lip}_{b}^{m}(\mathbb R). \end{aligned}$$
  (4)

    Consider \(n\in \mathbb N^*\). The vector space of infinitely continuously differentiable maps \(f :\mathbb R^n\rightarrow \mathbb R\) such that f and all its partial derivatives have polynomial growth is denoted by \(C_{p}^{\infty }(\mathbb R^n,\mathbb R)\).

2 Stochastic integrals with respect to the fractional Brownian motion and an ergodic theorem for fractional SDE

On the one hand, this section presents two different methods to define a stochastic integral with respect to the fractional Brownian motion. The first one is based on the pathwise properties of the fractional Brownian motion. Even if this approach is very natural, it is proved in Proposition 3.3 that the pathwise stochastic integral is not appropriate to get a consistent estimator of the drift function b in Eq. (1). Another stochastic integral with respect to the fractional Brownian motion is defined via the Malliavin divergence operator. This stochastic integral is called Skorokhod’s integral with respect to B. If \(H = 1/2\), which means that B is a Brownian motion, the Skorokhod integral defined via the divergence operator coincides with Itô’s integral on its domain. This integral is appropriate for the estimation of the drift function b in Eq. (1). On the other hand, an ergodic theorem for the solution of Eq. (1) is stated in Sect. 2.3.

2.1 The pathwise stochastic integral

This subsection deals with some definitions and basic properties of the pathwise stochastic integral with respect to the fractional Brownian motion of Hurst index greater than 1/2.

Definition 2.1

Consider two continuous functions x and w from [0, T] into \(\mathbb R\). Consider a partition \(D := (t_k)_{k\in \llbracket 0,m\rrbracket }\) of [s, t] with \(m\in \mathbb N^*\) and \(s,t\in [0,T]\) such that \(s < t\). The Riemann sum of x with respect to w on [s, t] for the partition D is

$$\begin{aligned} J_{x,w,D}(s,t) := \sum _{k = 0}^{m - 1}x(t_k)(w(t_{k + 1}) - w(t_k)). \end{aligned}$$

Notation. With the notations of Definition 2.1, the mesh of the partition D is

$$\begin{aligned} \delta (D) := \max _{k\in \llbracket 0,m - 1\rrbracket } |t_{k + 1} - t_k|. \end{aligned}$$

The following theorem ensures the existence and the uniqueness of Young’s integral (see Friz and Victoir 2010, Theorem 6.8).

Theorem 2.2

Let x (resp. w) be an \(\alpha \)-Hölder (resp. \(\beta \)-Hölder) continuous map from [0, T] into \(\mathbb R\) with \(\alpha ,\beta \in ]0,1]\) such that \(\alpha +\beta > 1\). There exists a unique continuous map \(J_{x,w} : [0,T]\rightarrow \mathbb R\) such that for every \(s,t\in [0,T]\) satisfying \(s < t\) and any sequence \((D_n)_{n\in \mathbb N}\) of partitions of [s, t] such that \(\delta (D_n)\rightarrow 0\) as \(n\rightarrow \infty \),

$$\begin{aligned} \lim _{n\rightarrow \infty } |J_{x,w}(t) - J_{x,w}(s) - J_{x,w,D_n}(s,t)| = 0. \end{aligned}$$

The map \(J_{x,w}\) is the Young integral of x with respect to w and \(J_{x,w}(t) - J_{x,w}(s)\) is denoted by

$$\begin{aligned} \int _{s}^{t}x(u)dw(u) \end{aligned}$$

for every \(s,t\in [0,T]\) such that \(s < t\).

The following proposition is a change of variable for Young’s integral.

Proposition 2.3

Let x be an \(\alpha \)-Hölder continuous map from [0, T] into \(\mathbb R\) with \(\alpha \in ]1/2,1[\). For every \(\varphi \in {\text {Lip}}_{b}^{1}(\mathbb R)\) and \(s,t\in [0,T]\) such that \(s < t\),

$$\begin{aligned} \varphi (x(t)) -\varphi (x(s)) = \int _{s}^{t}\varphi '(x(u))dx(u). \end{aligned}$$

For any \(\alpha \in ]1/2,H[\), the paths of B are \(\alpha \)-Hölder continuous (see Nualart 2006, Section 5.1). So, for every process \(Y := (Y(t))_{t\in [0,T]}\) with \(\beta \)-Hölder continuous paths from [0, T] into \(\mathbb R\) such that \(\alpha +\beta > 1\), by Theorem 2.2, it is natural to define the pathwise stochastic integral of Y with respect to B by

$$\begin{aligned} \left( \int _{0}^{t}Y(s)dB(s)\right) (\omega ) := \int _{0}^{t}Y(\omega ,s)dB(\omega ,s) \end{aligned}$$

for every \(\omega \in \Omega \) and \(t\in [0,T]\).
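As a side illustration (with identifiers of our own choosing), the following Python sketch simulates an fBm path by Cholesky factorization of its covariance and checks numerically that the Riemann sums of Definition 2.1 converge, via the change of variable formula of Proposition 2.3 applied with \(\varphi (y) = y^2/2\) (the formula extends pathwise to this \(\varphi \) since the paths are bounded on [0, T]):

```python
import numpy as np

rng = np.random.default_rng(0)

def fbm_path(n, T, H):
    """Exact-in-law fBm on n + 1 grid points via Cholesky factorization
    of the covariance E[B(s)B(t)] = (s^2H + t^2H - |t - s|^2H) / 2
    (slow for large n, but exact)."""
    t = np.linspace(0.0, T, n + 1)[1:]
    cov = 0.5 * (t[:, None] ** (2 * H) + t[None, :] ** (2 * H)
                 - np.abs(t[:, None] - t[None, :]) ** (2 * H))
    return np.concatenate(([0.0], np.linalg.cholesky(cov) @ rng.standard_normal(n)))

def riemann_sum(x, w):
    """Left-point Riemann sum J_{x,w,D}(0, T) of Definition 2.1."""
    return np.sum(x[:-1] * np.diff(w))

n, T, H = 2000, 1.0, 0.7
B = fbm_path(n, T, H)
# change of variable with phi(y) = y^2 / 2, phi'(y) = y:
print(abs(0.5 * B[-1] ** 2 - riemann_sum(B, B)))  # -> small, since 2H > 1
```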

2.2 The Skorokhod integral

This subsection deals with some definitions and results on Malliavin calculus in order to define and to provide a suitable expression of Skorokhod’s integral.

Consider the vector space

$$\begin{aligned} \mathcal H := \left\{ \varphi :\mathbb R_+\rightarrow \mathbb R : \int _{0}^{\infty } \int _{0}^{\infty } |t - s|^{2H - 2}|\varphi (s)|\cdot |\varphi (t)|dsdt <\infty \right\} . \end{aligned}$$

Equipped with the scalar product

$$\begin{aligned} \langle \varphi ,\psi \rangle _{\mathcal H} := H(2H - 1)\int _{0}^{\infty }\int _{0}^{\infty } |t - s|^{2H - 2}\varphi (s)\psi (t)dsdt \text { ; } \varphi ,\psi \in \mathcal H, \end{aligned}$$

\(\mathcal H\) is the reproducing kernel Hilbert space of B. Let \(\mathbf B\) be the map defined on \(\mathcal H\) by

$$\begin{aligned} \mathbf B(h) := \int _{0}^{.}h(s)dB(s);\quad h\in \mathcal H \end{aligned}$$

which is the Wiener integral of h with respect to B. The family \((\mathbf B(h))_{h\in \mathcal H}\) is an isonormal Gaussian process.

Definition 2.4

The Malliavin derivative of a smooth functional

$$\begin{aligned} F = f( \mathbf B(h_1),\dots , \mathbf B(h_n)) \end{aligned}$$

where \(n\in \mathbb N^*\), \(f\in C_{p}^{\infty }(\mathbb R^n,\mathbb R)\) and \(h_1,\dots ,h_n\in \mathcal H\) is the \(\mathcal H\)-valued random variable

$$\begin{aligned} \mathbf DF := \sum _{k = 1}^{n} \partial _k f ( \mathbf B(h_1),\dots , \mathbf B(h_n))h_k. \end{aligned}$$
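For instance, if \(F = \mathbf B(h)^2\) with \(h\in \mathcal H\) (take \(n = 1\) and \(f(y) = y^2\)), then \(\mathbf DF = 2\mathbf B(h)h\); in particular, since \(B(t) = \mathbf B(\mathbf 1_{[0,t]})\), \(\mathbf D(B(t)^2) = 2B(t)\mathbf 1_{[0,t]}\).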

Proposition 2.5

The map \(\mathbf D\) is closable from \(L^2(\Omega ,\mathcal A,\mathbb P)\) into \(L^2(\Omega ;\mathcal H)\). Its domain in \(L^2(\Omega ,\mathcal A,\mathbb P)\) is denoted by \(\mathbb D^{1,2}\) and is the closure of the space of smooth functionals for the norm \(\Vert .\Vert _{1,2}\) defined by

$$\begin{aligned} \Vert F\Vert _{1,2}^{2} := \mathbb E(|F|^2) + \mathbb E(\Vert \mathbf DF\Vert _{\mathcal H}^{2}) < \infty \end{aligned}$$

for every \(F\in L^2(\Omega ,\mathcal A,\mathbb P)\).

For a proof, see Nualart (2006, Proposition 1.2.1).

Definition 2.6

The adjoint \(\delta \) of the Malliavin derivative \(\mathbf D\) is the divergence operator. The domain of \(\delta \) is denoted by \( {\text {dom}}(\delta )\) and \(u\in {\text {dom}}(\delta )\) if and only if there exists a deterministic constant \(c > 0\) such that for every \(F\in \mathbb D^{1,2}\),

$$\begin{aligned} |\mathbb E(\langle \mathbf DF,u\rangle _{\mathcal H})| \leqslant c\mathbb E(|F|^2)^{1/2}. \end{aligned}$$

For every process \(Y := (Y(s))_{s\in \mathbb R_+}\) and every \(t > 0\), if \(Y\mathbf 1_{[0,t]}\in \text {dom}(\delta )\), then its Skorokhod integral with respect to B is defined on [0, t] by

$$\begin{aligned} \int _{0}^{t}Y(s)\delta B(s) := \delta (Y\mathbf 1_{[0,t]}). \end{aligned}$$

With the same notations:

$$\begin{aligned} \int _{0}^{t} Y(s)\delta X(s) := \int _{0}^{t}Y(s)b(X(s))ds + \sigma \int _{0}^{t}Y(s)\delta B(s). \end{aligned}$$
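Note that for a deterministic integrand \(h\in \mathcal H\), \(\delta (h\mathbf 1_{[0,t]}) = \mathbf B(h\mathbf 1_{[0,t]})\): on deterministic integrands, the Skorokhod integral reduces to the Wiener integral defined above.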

The following proposition provides the link between the Skorokhod integral and the pathwise stochastic integral of Sect. 2.1.

Proposition 2.7

If \(b\in {\text {Lip}}_{b}^{1}(\mathbb R)\), then Eq. (1) with initial condition \(x\in \mathbb R\) has a unique solution \(X_x\) with \(\alpha \)-Hölder continuous paths for every \(\alpha \in ]0,H[\). Moreover, for every \(\varphi \in {\text {Lip}}_{b}^{1}(\mathbb R)\),

$$\begin{aligned} \int _{0}^{t}\varphi (X_x(u))\delta X_x(u)= & {} \int _{0}^{t}\varphi (X_x(u))dX_x(u)\\&-\,\alpha _H\sigma ^2 \int _{0}^{t} \int _{0}^{u}\varphi '(X_x(u))\frac{\partial _x X_x(u)}{\partial _x X_x(v)}|u - v|^{2H - 2}dvdu, \nonumber \end{aligned}$$
(4)

where \(\alpha _H = H(2H - 1)\).

Moreover, we can prove the following corollary, which allows us to propose a computable form of the estimator.

Corollary 2.8

Assume that \(b\in {\text {Lip}}_{b}^{2}(\mathbb R)\) and there exists a constant \(M > 0\) such that

$$\begin{aligned} b'(x)\leqslant -M;\quad \forall x\in \mathbb R. \end{aligned}$$

For every \(\varphi \in {\text {Lip}}_{b}^{1}(\mathbb R)\), \(x\in \mathbb R\) and \(\varepsilon ,t > 0\),

$$\begin{aligned} \left| \int _{0}^{t}\varphi (X_x(u))\delta X_x(u) - S_{\varphi }(x,\varepsilon ,t)\right| \leqslant C_{\varphi }\varepsilon t^{2H - 1}, \end{aligned}$$

where

$$\begin{aligned} S_{\varphi }(x,\varepsilon ,t):= & {} \int _{0}^{t}\varphi (X_x(u))dX_x(u)\\&-\,\alpha _H\sigma ^2 \int _{0}^{t} \int _{0}^{u}\varphi '(X_x(u))\frac{X_{x +\varepsilon }(u) - X_x(u)}{X_{x +\varepsilon }(v) - X_x(v)}|u - v|^{2H - 2}dvdu \end{aligned}$$

and

$$\begin{aligned} C_{\varphi } := H\sigma ^2\frac{\Vert b''\Vert _{\infty }\Vert \varphi '\Vert _{\infty }}{2M^2}. \end{aligned}$$

As mentioned in the Introduction, the formula for \( S_{\varphi }(x,\varepsilon ,t)\) can be used if two paths of X can be observed with different but close initial conditions.
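For example, in the Langevin case \(b(y) = -\lambda y\) with \(\lambda > 0\), \(\partial _x X_x(u)/\partial _x X_x(v) = e^{-\lambda (u - v)}\) and \(X_{x +\varepsilon }(s) - X_x(s) =\varepsilon e^{-\lambda s}\), so the finite-difference ratio in \(S_{\varphi }(x,\varepsilon ,t)\) coincides exactly with \(\partial _x X_x(u)/\partial _x X_x(v)\); consistently, \(b'' = 0\) gives \(C_{\varphi } = 0\) in Corollary 2.8.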

Lastly, the following theorem, recently proved by Hu et al. (2018) (see Proposition 4.4), provides a suitable control of Skorokhod’s integral to study its long-time behavior.

Theorem 2.9

Assume that \(b\in {\text {Lip}}_{b}^{2}(\mathbb R)\) and there exists a constant \(M > 0\) such that

$$\begin{aligned} b'(x)\leqslant -M;\quad \forall x\in \mathbb R. \end{aligned}$$

There exists a deterministic constant \(C > 0\), not depending on T, such that for every \(\varphi \in {\text {Lip}}_{b}^{1}(\mathbb R)\):

$$\begin{aligned} \mathbb E\left( \left| \int _{0}^{T}\varphi (X(s))\delta B(s)\right| ^2\right)\leqslant & {} C\left( \left( \int _{0}^{T}\mathbb E(|\varphi (X(s))|^{1/H})ds\right) ^{2H}\right. \\&+\left. \left( \int _{0}^{T}\mathbb E(|\varphi '(X(s))|^2)^{1/(2H)}ds\right) ^{2H}\right) <\infty . \end{aligned}$$

2.3 Ergodic theorem for the solution of a fractional SDE

On the ergodicity of fractional SDEs, the reader can refer to Hairer (2005), Hairer and Ohashi (2007) and Hairer and Pillai (2013) (see Sect. 4.3 for details).

In the sequel, the map b fulfills the following condition.

Assumption 2.10

The map b belongs to \( {\text {Lip}}_{b}^{\infty }(\mathbb R)\) and there exists a constant \(M > 0\) such that

$$\begin{aligned} b'(x)\leqslant -M;\quad \forall x\in \mathbb R. \end{aligned}$$
(5)

Remark

  (1)

    Since \(b\in \text {Lip}_{b}^{1}(\mathbb R)\), Eq. (1) has a unique solution.

  (2)

    Under Assumption 2.10, the dissipativity conditions of Hairer (2005), Hairer and Ohashi (2007) and Hu et al. (2018) are fulfilled by b:

    $$\begin{aligned} (x - y)(b(x) - b(y)) \leqslant -M(x - y)^2;\quad \forall x,y\in \mathbb R \end{aligned}$$

    and there exists a constant \(M' > 0\) such that

    $$\begin{aligned} xb(x)\leqslant M'(1 - x^2);\quad \forall x\in \mathbb R. \end{aligned}$$

    Therefore, Assumption 2.10 is sufficient to apply the results proved in Hairer (2005), Hairer and Ohashi (2007) and Hu et al. (2018) in the sequel.

Proposition 2.11

Consider a measurable map \(\varphi :\mathbb R\rightarrow \mathbb R_+\) such that there exists a nonempty compact subset C of \(\mathbb R\) satisfying \(\varphi (C)\subset ]0,\infty [\). Under Assumption 2.10, there exists a deterministic constant \(l(\varphi ) > 0\) such that

$$\begin{aligned} \frac{1}{T} \int _{0}^{T} \varphi (X(t))dt \xrightarrow [T\rightarrow \infty ]{ {\text {a.s./L}^2}} l(\varphi ) > 0. \end{aligned}$$

3 Convergence of the Nadaraya–Watson estimator of the drift function

This section deals with the consistency and rate of convergence of the Nadaraya–Watson estimator of the drift function b in Eq. (1).

In the sequel, the kernel K fulfills the following assumption.

Assumption 3.1

\( {\text {supp}}(K) = [-1,1]\) and \(K\in C_{b}^{1}(\mathbb R,\mathbb R_+)\).
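For instance, the biweight kernel \(K(y) := \frac{15}{16}(1 - y^2)^2\mathbf 1_{[-1,1]}(y)\) fulfills Assumption 3.1: it is nonnegative with integral 1, and since \(K(\pm 1) = K'(\pm 1) = 0\), it belongs to \(C_{b}^{1}(\mathbb R,\mathbb R_+)\).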

3.1 Why the pathwise integral is inadequate

First of all, let us prove that, even if it seems very natural, the pathwise Nadaraya–Watson estimator

$$\begin{aligned} \widetilde{b}_{T,h}(x) := \frac{\displaystyle {\int _{0}^{T}K\left( \frac{X(s) - x}{h}\right) dX(s)}}{\displaystyle {\int _{0}^{T}K\left( \frac{X(s) - x}{h}\right) ds}} = \frac{\displaystyle {\frac{1}{Th}\int _{0}^{T}K\left( \frac{X(s) - x}{h}\right) dX(s)}}{\widehat{f}_{T,h}(x)} \end{aligned}$$

where

$$\begin{aligned} \widehat{f}_{T,h}(x) := \frac{1}{Th} \int _{0}^{T} K\left( \frac{X(s) - x}{h}\right) ds, \end{aligned}$$
(6)

is not consistent.

For this, we need the following lemma providing a convergence result for \(\widehat{f}_{T,h}(x)\). It will also be used to prove Proposition 3.4.

Lemma 3.2

Under Assumptions 2.10 and 3.1, there exists a deterministic constant \(l_h(x) > 0\) such that

$$\begin{aligned} \widehat{f}_{T,h}(x) \xrightarrow [T\rightarrow \infty ]{ {\text {a.s./L}^2}} l_h(x) > 0. \end{aligned}$$

Proof

Under Assumption 3.1, the map

$$\begin{aligned} y\in \mathbb R\longmapsto \frac{1}{h}K\left( \frac{y - x}{h}\right) \end{aligned}$$

satisfies the condition on \(\varphi \) of Proposition 2.11, which thus applies here and gives the result. \(\square \)

Now, we state the result proving that \(\widetilde{b}_{T,h}(x)\) is not a consistent estimator of b(x).

Proposition 3.3

Under Assumptions 2.10 and 3.1:

$$\begin{aligned} \widetilde{b}_{T,h}(x) \xrightarrow [T\rightarrow \infty ]{\mathbb P} 0. \end{aligned}$$

Proof

Let \(\mathcal K\) be a primitive function of K. By the change of variable formula for Young’s integral (Proposition 2.3):

$$\begin{aligned} \mathcal K\left( \frac{X(T) - x}{h}\right) - \mathcal K\left( \frac{X(0) - x}{h}\right)= & {} \frac{1}{h} \int _{0}^{T}K\left( \frac{X(s) - x}{h}\right) dX(s)\\= & {} T\widehat{f}_{T,h}(x)\widetilde{b}_{T,h}(x). \end{aligned}$$

Then,

$$\begin{aligned} \widetilde{b}_{T,h}(x) = \frac{1}{T\widehat{f}_{T,h}(x)}\left( \mathcal K\left( \frac{X(T) - x}{h}\right) - \mathcal K\left( \frac{X(0) - x}{h}\right) \right) . \end{aligned}$$

Since \(\mathcal K\) is differentiable with bounded derivative K:

$$\begin{aligned} |\widetilde{b}_{T,h}(x)| \leqslant \frac{\Vert K\Vert _{\infty }}{Th\widehat{f}_{T,h}(x)} |X(T) - X(0)|. \end{aligned}$$

Finally, as we know by Hairer (2005, Proposition 3.12) that

$$\begin{aligned} t\in \mathbb R_+\longmapsto \mathbb E(|X(t)|) \end{aligned}$$

is uniformly bounded, and by Lemma 3.2 that \(\widehat{f}_{T,h}(x)\) converges almost surely to \(l_h(x) > 0\) as \(T\rightarrow \infty \), it follows that \(\widetilde{b}_{T,h}(x)\) converges to 0 in probability as \(T\rightarrow \infty \). \(\square \)

This is why the Skorokhod integral replaces the pathwise stochastic integral in \(\widehat{b}_{T,h}(x)\).

3.2 Convergence of the Nadaraya–Watson estimator

This subsection deals with the consistency and rate of convergence of the estimators.

The Nadaraya–Watson estimator \(\widehat{b}_{T,h}(x)\) defined by Eq. (2) can be decomposed as follows:

$$\begin{aligned} \widehat{b}_{T,h}(x) - b(x) = \frac{B_{T,h}(x)}{\widehat{f}_{T,h}(x)} + \frac{S_{T,h}(x)}{\widehat{f}_{T,h}(x)}, \end{aligned}$$
(7)

where \(\widehat{f}_{T,h}(x)\) is defined by (6),

$$\begin{aligned} B_{T,h}(x) := \frac{1}{Th} \int _{0}^{T} K\left( \frac{X(s) - x}{h}\right) (b(X(s)) - b(x))ds \end{aligned}$$

and

$$\begin{aligned} S_{T,h}(x) := \frac{\sigma }{Th} \int _{0}^{T} K\left( \frac{X(s) - x}{h}\right) \delta B(s). \end{aligned}$$

By using the Lipschitz property of b from Assumption 2.10 together with the technical lemmas proved in Sect. 2, the estimators \( \widehat{b}_{T,h}(x)\text { and } \widehat{b}_{T,h,\varepsilon }(x) \) can be studied.

Proposition 3.4

Under Assumptions 2.10 and 3.1,

$$\begin{aligned} |\widehat{b}_{T,h}(x) - b(x)| \leqslant \Vert b\Vert _{ {\text {Lip}}}h + \frac{|S_{T,h}(x)|}{\widehat{f}_{T,h}(x)}, \end{aligned}$$

and there exists a positive constant C such that

$$\begin{aligned} {\mathbb E}(S_{T,h}(x)^2) \leqslant \frac{C}{h^4 T^{2(1-H)}}. \end{aligned}$$

As a consequence, for fixed \(h>0\), we have

$$\begin{aligned} T^{\beta }V_{T,h}(x) \xrightarrow [T\rightarrow \infty ]{\mathbb P}0;\quad \forall \beta \in [0,1 - H[, \text{ where } \; V_{T,h}(x) := \left| \frac{S_{T,h}(x)}{\widehat{f}_{T,h}(x)}\right| . \end{aligned}$$
(8)

Moreover, for \(\widehat{b}_{T,h, \varepsilon }\) defined by (3), \(\forall \varepsilon >0\),

$$\begin{aligned} |\widehat{b}_{T,h,\varepsilon }(x)-\widehat{b}_{T,h}(x)|\leqslant C \frac{\varepsilon h^{-2}T^{2H-2}}{\widehat{f}_{T,h}(x)}. \end{aligned}$$
(9)

Heuristically, Proposition 3.4 says that the pointwise quadratic risk of the kernel estimator \(\widehat{b}_{T,h}(x)\) involves a squared bias of order \(h^2\) and a variance term of order \(1/(h^4T^{2(1 - H)})\). The best possible rate is thus \(T^{-\frac{2}{3} (1-H)}\), attained with a bandwidth of order \(T^{-\frac{1}{3} (1-H)}\). A more rigorous formulation is stated below.
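Concretely, balancing the two terms in \(h\mapsto h^2 + h^{-4}T^{-2(1 - H)}\) gives \(h^6 = T^{-2(1 - H)}\), that is \(h = T^{-\frac{1}{3}(1 - H)}\), and then a risk of order \(h^2 = T^{-\frac{2}{3}(1 - H)}\). For instance, for \(H = 0.7\), the bandwidth is of order \(T^{-0.1}\) and the rate is \(T^{-0.2}\).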

Note also that it follows from (9) that the rate of \(\widehat{b}_{T,h,\varepsilon }(x)\) is preserved for any small \(\varepsilon \).

We want to emphasize that no order condition is imposed on the kernel, and the bias term is not bounded in the usual way for the kernel setting (see e.g. Tsybakov (2004), Chapter 1). Indeed, we cannot treat the expectation of the numerator as a convolution product, because the existence of a stationary density is not ensured. Were it to exist, it would be difficult to set adequate regularity conditions on it.

Now, consider a decreasing function \(h : [t_0,\infty [\rightarrow ]0,1[\) (\(t_0\in \mathbb R_+\)) such that

$$\begin{aligned} \lim _{T\rightarrow \infty } h(T) = 0 \text { and } \lim _{T\rightarrow \infty } Th(T) =\infty \end{aligned}$$

and assume that \(\widehat{f}_{T,h(T)}(x)\) fulfills the following assumption.

Assumption 3.5

There exists \(l(x)\in ]0,\infty ]\) such that \(\widehat{f}_{T,h(T)}(x)\) converges to l(x) in probability as \(T\rightarrow \infty \).

Section 3.3 deals with the special case of fractional SDE with Gaussian solution in order to prove that Assumption 3.5 holds in this setting.

In Proposition 3.6, the result of Proposition 3.4 is extended to the estimator \(\widehat{b}_{T,h(T)}(x)\) under Assumption 3.5.

Proposition 3.6

Under Assumptions 2.10, 3.1 and 3.5:

  (1)

    If there exists \(\beta \in ]0,1 - H[\) such that \(T^{-\beta } =_{T\rightarrow \infty } o(h(T)^2)\), then

    $$\begin{aligned} \widehat{b}_{T,h(T)}(x) \xrightarrow [T\rightarrow \infty ]{\mathbb P} b(x). \end{aligned}$$
  (2)

    If \(\gamma \in ]0,\beta [\) is such that

    $$\begin{aligned} h(T) =_{T\rightarrow \infty } o(T^{-\gamma }) \text { and } T^{H - 1 +\gamma } =_{T\rightarrow \infty } o(h(T)^2), \end{aligned}$$

    then

    $$\begin{aligned} T^{\gamma }|\widehat{b}_{T,h(T)}(x) - b(x)| \xrightarrow [T\rightarrow \infty ]{\mathbb P} 0. \end{aligned}$$

Example

Consider

$$\begin{aligned} \beta \in \left]\frac{2}{3}(1 - H),1 - H\right[ \text { and } h(T) := T^{\frac{H - 1}{3}}. \end{aligned}$$
  • \(Th(T) = T^{H +\frac{2}{3}(1 - H)}\xrightarrow [T\rightarrow \infty ]{}\infty \).

  • \(T^{-\beta }/h(T)^2 = T^{-\beta +\frac{2}{3}(1 - H)}\xrightarrow [T\rightarrow \infty ]{} 0\).

  • For every \(\gamma \in ]0,(1 - H)/3[\), \(h(T)/T^{-\gamma } = T^{\frac{H - 1}{3} +\gamma }\xrightarrow [T\rightarrow \infty ]{} 0\).

  • For every \(\gamma \in ]0,(1 - H)/3[\), \(T^{H - 1 +\gamma }/h(T)^2 = T^{\frac{H - 1}{3} +\gamma }\xrightarrow [T\rightarrow \infty ]{} 0\).

In Corollary 3.7, the result of Proposition 3.6 is extended to \(\widehat{b}_{T,h(T),\varepsilon (T)}(x)\) where

$$\begin{aligned} \lim _{T\rightarrow \infty } \varepsilon (T) = 0. \end{aligned}$$

Corollary 3.7

Under Assumptions 2.10, 3.1 and 3.5:

  (1)

    If there exists \(\beta \in ]0,1 - H[\) such that

    $$\begin{aligned} T^{-\beta } =_{T\rightarrow \infty } o(h(T)^2) \text { and } \varepsilon (T)h(T)^{-2}T^{2H - 2} =_{T\rightarrow \infty } o(1), \end{aligned}$$

    then

    $$\begin{aligned} \widehat{b}_{T,h(T),\varepsilon (T)}(x) \xrightarrow [T\rightarrow \infty ]{\mathbb P} b(x). \end{aligned}$$
  (2)

    If \(\gamma \in ]0,\beta [\) is such that

    $$\begin{aligned} \left\{ \begin{array}{l} h(T) =_{T\rightarrow \infty } o(T^{-\gamma }),\\ T^{H - 1 +\gamma } =_{T\rightarrow \infty } o(h(T)^2),\\ \varepsilon (T)h(T)^{-2}T^{2H - 2} =_{T\rightarrow \infty } o(T^{-\gamma }), \end{array}\right. \end{aligned}$$

    then

    $$\begin{aligned} T^{\gamma }|\widehat{b}_{T,h(T),\varepsilon (T)}(x) - b(x)| \xrightarrow [T\rightarrow \infty ]{\mathbb P} 0. \end{aligned}$$

Example

One can take \(\varepsilon (T) := h(T)^2\).

3.3 Special case of fractional SDE with Gaussian solution

The purpose of this subsection is to show that Assumption 3.5 holds when the drift function in Eq. (1) is linear with a negative slope. Note also that if \(H = 1/2\), then \(\widehat{f}_{T,h(T)}\) is a consistent estimator of the stationary density for Eq. (1) (see Kutoyants 2004, Section 4.2).

Assume that Eq. (1) has a centered Gaussian stationary solution X and consider the normalized process \(Y := X/\sigma _0\) where \(\sigma _0 :=\sqrt{\text {var}(X_0)}\).

Throughout this subsection, \(\nu \) is the standard normal density and the autocorrelation function \(\rho \) of Y fulfills the following assumption.

Assumption 3.8

\(\displaystyle {\int _{0}^{T}\int _{0}^{T} |\rho (v - u)|dvdu =_{T\rightarrow \infty } O(T^{2H})}\).

The following proposition ensures that under Assumption 3.8, \(\widehat{f}_{T,h(T)}\) fulfills Assumption 3.5 for every \(x\in \mathbb R^*\).

Proposition 3.9

Under Assumptions 2.10 and 3.1, if Eq. (1) has a centered, Gaussian, stationary solution X, the autocorrelation function \(\rho \) of \(Y := X/\sigma _0\) satisfies Assumption 3.8 and \(T^{2H - 2} =_{T\rightarrow \infty } o(h(T))\), then

$$\begin{aligned} \widehat{f}_{T,h(T)}(x) \xrightarrow [T\rightarrow \infty ]{\mathbb P} \frac{1}{\sigma _0}\nu \left( \frac{x}{\sigma _0}\right) > 0 \end{aligned}$$
(10)

for every \(x\in \mathbb R^*\).

Now, consider the fractional Langevin equation

$$\begin{aligned} X(t) = X_0 -\lambda \int _{0}^{t}X(s)ds +\sigma B(t), \end{aligned}$$
(11)

where \(\lambda ,\sigma > 0\). Equation (11) has a unique solution, called the fractional Ornstein–Uhlenbeck process.
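Recall from Cheridito et al. (2003, Section 2) that the stationary variance of this process is \({\text {var}}(X_0) =\sigma ^2H\Gamma (2H)\lambda ^{-2H}\), which gives the \(\sigma _0\) of this subsection explicitly here.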

On the one hand, the drift function of Eq. (11) fulfills Assumption 2.10. So, under Assumption 3.1, by Proposition 3.4,

$$\begin{aligned} |\widehat{b}_{T,h}(x) +\lambda x| \leqslant \Vert b\Vert _{ {\text {Lip}}}h + V_{T,h}(x), \end{aligned}$$

where

$$\begin{aligned} T^{\beta }V_{T,h}(x) \xrightarrow [T\rightarrow \infty ]{\mathbb P}0;\quad \forall \beta \in [0,1 - H[. \end{aligned}$$

On the other hand, by Cheridito et al. (2003, Section 2), Eq. (11) has a centered, Gaussian, stationary solution X such that:

$$\begin{aligned} X(t) = \sigma \int _{-\infty }^{t}e^{-\lambda (t - u)}dB(u);\quad \forall t\in \mathbb R_+. \end{aligned}$$

Moreover, by Cheridito et al. (2003, Theorem 2.3), the autocorrelation function \(\rho \) of \(Y := X/\sigma _0\) satisfies

$$\begin{aligned} \rho (T) =_{T\rightarrow \infty } O(T^{2H - 2}). \end{aligned}$$

So, \(\rho \) fulfills Assumption 3.8.

Consider \(\beta \in ]0,1 - H[\) and \(\gamma \in ]0,\beta [\) such that

$$\begin{aligned} h(T) =_{T\rightarrow \infty } o(T^{-\gamma }) \text { and } T^{H - 1 +\gamma } =_{T\rightarrow \infty } o(h(T)^2). \end{aligned}$$

Then,

$$\begin{aligned} \lim _{T\rightarrow \infty } \frac{T^{2H - 2}}{h(T)} = \lim _{T\rightarrow \infty } h(T)T^{H - 1 -\gamma }\frac{T^{H - 1 +\gamma }}{h(T)^2} = 0. \end{aligned}$$

Therefore, by Propositions 3.6 and 3.9:

$$\begin{aligned} T^{\gamma }|\widehat{b}_{T,h(T)}(x) +\lambda x| \xrightarrow [T\rightarrow \infty ]{\mathbb P} 0;\quad \forall x\in \mathbb R^*. \end{aligned}$$

4 Proofs

4.1 Proof of Proposition 2.7

On the existence, uniqueness and regularity of the paths of the solution of Eq. (1), see Lejay (2010).

Now, let us prove (4).

Let \(X_x\) be the solution of Eq. (1) with initial condition \(x\in \mathbb R\). Consider also \(\varphi \in \text {Lip}_{b}^{1}(\mathbb R)\) and \(t > 0\). By Nualart (2006, Proposition 5.2.3):

$$\begin{aligned} \int _{0}^{t}\varphi (X_x(u))\delta X_x(u)= & {} \int _{0}^{t}\varphi (X_x(u))b(X_x(u))du +\sigma \int _{0}^{t}\varphi (X_x(u))\delta B(u)\\= & {} \int _{0}^{t}\varphi (X_x(u))dX_x(u)\\&-\,\alpha _H\sigma \int _{0}^{t} \int _{0}^{t}\varphi '(X_x(u))\mathbf D_vX_x(u)|u - v|^{2H - 2}dvdu. \end{aligned}$$

Consider \(u,v\in [0,t]\). On the one hand,

$$\begin{aligned} \mathbf D_vX_x(u) = \sigma \mathbf 1_{[0,u]}(v) + \int _{0}^{u}b'(X_x(r))\mathbf D_vX_x(r)dr. \end{aligned}$$

Then,

$$\begin{aligned} \mathbf D_vX_x(u) = \sigma \mathbf 1_{[0,u]}(v)\exp \left( \int _{v}^{u}b'(X_x(r))dr\right) . \end{aligned}$$

On the other hand,

$$\begin{aligned} \partial _xX_x(u) = 1 + \int _{0}^{u}b'(X_x(r))\partial _x X_x(r)dr. \end{aligned}$$

Then,

$$\begin{aligned} \partial _xX_x(u) = \exp \left( \int _{0}^{u}b'(X_x(r))dr\right) . \end{aligned}$$

Therefore,

$$\begin{aligned} \mathbf D_vX_x(u) = \sigma \mathbf 1_{[0,u]}(v) \frac{\partial _x X_x(u)}{\partial _x X_x(v)} \end{aligned}$$

and

$$\begin{aligned} \int _{0}^{t}\varphi (X_x(u))\delta X_x(u)= & {} \int _{0}^{t}\varphi (X_x(u))dX_x(u)\\&-\,\alpha _H\sigma ^2 \int _{0}^{t} \int _{0}^{u}\varphi '(X_x(u))\frac{\partial _x X_x(u)}{\partial _x X_x(v)}|u - v|^{2H - 2}dvdu. \end{aligned}$$

4.2 Proof of Corollary 2.8

Consider \(x\in \mathbb R\) and \(\varepsilon ,t > 0\). For every \(s\in [0,t]\),

$$\begin{aligned} \partial _x X_x(s) = 1 + \int _{0}^{s}b'(X_x(r))\partial _x X_x(r)dr \end{aligned}$$

and, by Taylor’s formula,

$$\begin{aligned} X_{x +\varepsilon }(s) - X_x(s) =\varepsilon +\int _{0}^{s} (X_{x +\varepsilon }(r) - X_x(r)) \int _{0}^{1}b'(X_x(r) +\theta (X_{x +\varepsilon }(r) - X_x(r)))d\theta dr. \end{aligned}$$

So, for every \((u,v)\in [0,t]^2\) such that \(v < u\),

$$\begin{aligned} \frac{\partial _x X_x(u)}{\partial _x X_x(v)} = \exp \left( \int _{v}^{u}b'(X_x(r))dr\right) \end{aligned}$$

and

$$\begin{aligned} \frac{X_{x +\varepsilon }(u) - X_x(u)}{X_{x +\varepsilon }(v) - X_x(v)} = \exp \left( \int _{v}^{u}\int _{0}^{1}b'(X_x(r) +\theta (X_{x +\varepsilon }(r) - X_x(r)))d\theta dr\right) . \end{aligned}$$

For a given \(\varphi \in \text {Lip}_{b}^{1}(\mathbb R)\), by Proposition 2.7,

$$\begin{aligned} \Delta _{\varphi }^{S}(x,\varepsilon ,t)\leqslant \alpha _H\sigma ^2\int _{0}^{t}\int _{0}^{u} |\varphi '(X_x(u))|\Delta _{\varphi }(x,\varepsilon ,u,v)(u - v)^{2H - 2}dvdu, \end{aligned}$$

where

$$\begin{aligned} \Delta _{\varphi }^{S}(x,\varepsilon ,t) := \left| \int _{0}^{t}\varphi (X_x(u))\delta X_x(u) - S_{\varphi }(x,\varepsilon ,t)\right| \end{aligned}$$

and, for every \((u,v)\in [0,t]^2\) such that \(v < u\),

$$\begin{aligned} \Delta _{\varphi }(x,\varepsilon ,u,v) := \left| \frac{\partial _x X_x(u)}{\partial _x X_x(v)} - \frac{X_{x +\varepsilon }(u) - X_x(u)}{X_{x +\varepsilon }(v) - X_x(v)}\right| . \end{aligned}$$

Since \(b'(\mathbb R)\subset ]-\infty ,0]\) and b is twice continuously differentiable,

$$\begin{aligned} \Delta _{\varphi }(x,\varepsilon ,u,v)= & {} \left| \exp \left( \int _{v}^{u}b'(X_x(r))dr\right) \right. \\&-\left. \exp \left( \int _{v}^{u}\int _{0}^{1}b'(X_x(r) +\theta (X_{x +\varepsilon }(r) - X_x(r)))d\theta dr\right) \right| \\\leqslant & {} \sup _{z\in b'(\mathbb R)} e^z\\&\times \int _{v}^{u}\left| b'(X_x(r)) - \int _{0}^{1}b'(X_x(r) +\theta (X_{x +\varepsilon }(r) - X_x(r)))d\theta \right| dr\\\leqslant & {} \int _{v}^{u}\int _{0}^{1}|b'(X_x(r)) - b'(X_x(r) +\theta (X_{x +\varepsilon }(r) - X_x(r)))|d\theta dr\\\leqslant & {} \frac{\Vert b''\Vert _{\infty }}{2}\int _{v}^{u}|X_{x +\varepsilon }(r) - X_x(r)|dr. \end{aligned}$$

Consider \(s\in \mathbb R_+\). By Eq. (1):

$$\begin{aligned} (X_{x +\varepsilon }(s) - X_x(s))^2= & {} \varepsilon ^2 + 2\int _{0}^{s} (X_{x +\varepsilon }(r) - X_x(r))d(X_{x +\varepsilon } - X_x)(r)\\= & {} \varepsilon ^2 + 2\int _{0}^{s} (X_{x +\varepsilon }(r) - X_x(r))(b(X_{x +\varepsilon }(r)) - b(X_x(r)))dr. \end{aligned}$$

By the mean-value theorem, there exists \(x_s\in \mathbb R\) such that

$$\begin{aligned} \partial _s(X_{x +\varepsilon }(s) - X_x(s))^2= & {} 2(X_{x +\varepsilon }(s) - X_x(s))^2 \frac{b(X_{x +\varepsilon }(s)) - b(X_x(s))}{X_{x +\varepsilon }(s) - X_x(s)}\\= & {} 2(X_{x +\varepsilon }(s) - X_x(s))^2 b'(x_s)\leqslant -2M(X_{x +\varepsilon }(s) - X_x(s))^2 \end{aligned}$$

and then,

$$\begin{aligned} |X_{x +\varepsilon }(s) - X_x(s)| \leqslant \varepsilon e^{-Ms}. \end{aligned}$$

Therefore,

$$\begin{aligned} \Delta _{\varphi }(x,\varepsilon ,u,v)\leqslant & {} \frac{\Vert b''\Vert _{\infty }}{2}\varepsilon \int _{v}^{u}e^{-Mr}dr\\= & {} \frac{\Vert b''\Vert _{\infty }}{2M}\varepsilon (e^{-Mv} - e^{-Mu}) \leqslant \frac{\Vert b''\Vert _{\infty }}{2M}\varepsilon e^{-Mv}. \end{aligned}$$

Finally, using the above bounds, and in a second stage, the integration by parts formula, we get:

$$\begin{aligned} \Delta _{\varphi }^{S}(x,\varepsilon ,t)\leqslant & {} \alpha _H\sigma ^2\frac{\Vert b''\Vert _{\infty }}{2M}\varepsilon \int _{0}^{t}\int _{0}^{u}|\varphi '(X_x(u))|e^{-Mv}(u - v)^{2H - 2}dvdu\\\leqslant & {} \alpha _H\sigma ^2\frac{\Vert b''\Vert _{\infty }\Vert \varphi '\Vert _{\infty }}{2M}\varepsilon \int _{0}^{t}e^{-Mv}\int _{v}^{t}(u - v)^{2H - 2}dudv\\= & {} \alpha _H\sigma ^2\frac{\Vert b''\Vert _{\infty }\Vert \varphi '\Vert _{\infty }}{2M(2H - 1)}\varepsilon \int _{0}^{t} e^{-Mv}(t - v)^{2H - 1}dv\\= & {} \alpha _H\sigma ^2\frac{\Vert b''\Vert _{\infty }\Vert \varphi '\Vert _{\infty }}{2M^2}\varepsilon \left( \frac{t^{2H - 1}}{2H - 1} -\int _{0}^{t}e^{-Mv}(t - v)^{2H - 2}dv\right) \\\leqslant & {} \alpha _H\sigma ^2\frac{\Vert b''\Vert _{\infty }\Vert \varphi '\Vert _{\infty }}{2M^2(2H - 1)}\varepsilon t^{2H - 1} = C_{\varphi }\varepsilon t^{2H - 1}. \end{aligned}$$

4.3 Proof of Proposition 2.11

Consider \(\gamma \in ]1/2,H[\), \(\delta \in ]H -\gamma ,1 -\gamma [\) and \(\Omega :=\Omega _-\times \Omega _+\), where \(\Omega _-\) (resp. \(\Omega _+\)) is the completion of \(C_{0}^{\infty }(\mathbb R_-,\mathbb R)\) (resp. \(C_{0}^{\infty }(\mathbb R_+,\mathbb R)\)) with respect to the norm \(\Vert .\Vert _-\) (resp. \(\Vert .\Vert _+\)) defined by

$$\begin{aligned} \Vert \omega _-\Vert _- := \sup _{s < t\leqslant 0} \frac{|\omega _-(t) -\omega _-(s)|}{|t - s|^{\gamma }(1 + |s| + |t|)^{\delta }};\quad \forall \omega _-\in \Omega _- \end{aligned}$$

(resp.

$$\begin{aligned} \Vert \omega _+\Vert _+ := \sup _{0\leqslant s < t} \frac{|\omega _+(t) -\omega _+(s)|}{|t - s|^{\gamma }(1 + |s| + |t|)^{\delta }};\quad \forall \omega _+\in \Omega _+). \end{aligned}$$

By Hairer (2005, Section 3) or more clearly by Hairer and Ohashi (2007, Lemmas 4.1 and 4.2), there exist a Borel probability measure \(\mathbb P\) on \(\Omega \) and a transition kernel P from \(\Omega _-\) to \(\Omega _+\) such that:

  • The process generated by \((\Omega ,\mathbb P)\) is a two-sided fractional Brownian motion \(\widetilde{B}\).

  • For every Borel set U (resp. V) of \(\Omega _-\) (resp. \(\Omega _+\)),

    $$\begin{aligned} \mathbb P(U\times V) = \int _U P(\omega _-,V)\mathbb P_-(d\omega _-) \end{aligned}$$

    where \(\mathbb P_-\) is the probability distribution of \((\widetilde{B}(t))_{t\in \mathbb R_-}\).

Let \(I :\mathbb R\times \Omega _+\rightarrow C^0(\mathbb R_+,\mathbb R)\) be the Itô (solution) map for Eq. (1). In general, I(x, .) with \(x\in \mathbb R\) is not a Markov process. However, the solution of Eq. (1) can be coupled with the past of the driving signal in order to bypass this difficulty. In other words, consider the enhanced Itô map \(\mathfrak I :\mathbb R\times \Omega \rightarrow C^0(\mathbb R_+,\mathbb R\times \Omega _-)\) such that for every \((x,\omega _-,\omega _+)\in \mathbb R\times \Omega \) and \(t\in \mathbb R_+\),

$$\begin{aligned} \mathfrak I(x,\omega _-,\omega _+)(t) := (I(x,\omega _+)(t),p_{\Omega _-}(\theta (\omega _-,\omega _+)(t))) \end{aligned}$$

where \(p_{\Omega _-}\) is the projection from \(\Omega \) onto \(\Omega _-\),

$$\begin{aligned} \theta (\omega _-,\omega _+)(t) := (\omega _-\sqcup \omega _+)(t +\cdot ) - (\omega _-\sqcup \omega _+)(t) \end{aligned}$$

and \(\omega _-\sqcup \omega _+\) is the concatenation of \(\omega _-\) and \(\omega _+\). By Hairer (2005, Lemma 2.12), the process \(\mathfrak I(x,.)\) is Markovian and has a Feller transition semigroup \((Q(t))_{t\in \mathbb R_+}\) such that for every \(t\in \mathbb R_+\), \((x,\omega _-)\in \mathbb R\times \Omega _-\) and every Borel set U (resp. V) of \(\mathbb R\) (resp. \(\Omega _-\)),

$$\begin{aligned} Q(t ; (x,\omega _-),U\times V) = \int _{V}\delta _{I(x,\omega _+)(t)}(U)P(t ; \omega _-,d\omega _+) \end{aligned}$$

where \(\delta _y\) is the delta measure located at \(y\in \mathbb R\) and \(P(t;\omega _-,.)\) is the pushforward measure of \(P(\omega _-,.)\) by \(\theta (\omega _-,.)(t)\).

In order to prove Proposition 2.11, let us first state the following result from Hairer (2005) and Hairer and Ohashi (2007).

Theorem 4.1

Under Assumption 2.10:

  (1)

    (Irreducibility) There exists \(\tau \in ]0,\infty [\) such that for every \((x,\omega _-)\in \mathbb R\times \Omega _-\) and every nonempty open set \(U\subset \mathbb R\),

    $$\begin{aligned} Q(\tau ; (x,\omega _-),U\times \Omega _-) > 0. \end{aligned}$$
  (2)

    There exists a unique probability measure \(\mu \) on \(\mathbb R\times \Omega _-\) such that \(\mu (p_{\Omega _-}\in \cdot ) =\mathbb P_-\) and

    $$\begin{aligned} Q(t)\mu =\mu ;\quad \forall t\in \mathbb R_+. \end{aligned}$$

For a proof of Theorem 4.1.(1), see Hairer and Ohashi (2007, Proposition 5.8). For a proof of Theorem 4.1.(2), see Hairer (2005, Theorem 6.1) which is a consequence of Proposition 2.18, Lemma 2.20 and Proposition 3.12.

Since the Feller transition semigroup Q has exactly one invariant measure \(\mu \) by Theorem 4.1, \(\mu \) is ergodic, and since the first component of the process generated by Q is a solution of Eq. (1), by the ergodic theorem for Markov processes:

$$\begin{aligned} \frac{1}{T} \int _{0}^{T} \varphi (X(t))dt= & {} \frac{1}{T} \int _{0}^{T} (\varphi \circ p_{\mathbb R})(\mathfrak I(X_0,.)(t))dt\\&\xrightarrow [T\rightarrow \infty ]{ {\text {a.s./L}^2}} \mu (\varphi \circ p_{\mathbb R}). \end{aligned}$$

Moreover, \(\mu = Q(\tau )\mu \). So,

$$\begin{aligned} \mu (\varphi \circ p_{\mathbb R})= & {} \int _{\mathbb R\times \Omega _-}(\varphi \circ p_{\mathbb R})(x,\omega _-)(Q(\tau )\mu )(dx,d\omega _-)\\= & {} \int _{\mathbb R\times \Omega _-}\varphi (x)\int _{\mathbb R\times \Omega _-} Q(\tau ; (\bar{x},\bar{\omega }_-),(dx,d\omega _-))\mu (d\bar{x},d\bar{\omega }_-)\\\geqslant & {} \min _{x\in C}\varphi (x)\cdot \int _{C\times \Omega _-} \int _{C\times \Omega _-} Q(\tau ; (\bar{x},\bar{\omega }_-),(dx,d\omega _-))\mu (d\bar{x},d\bar{\omega }_-)\\\geqslant & {} \min _{x\in C}\varphi (x)\cdot \int _{C\times \Omega _-} Q(\tau ; (\bar{x},\bar{\omega }_-),\text {int}(C)\times \Omega _-)\mu (d\bar{x},d\bar{\omega }_-). \end{aligned}$$

Since

$$\begin{aligned} Q(\tau ; (\bar{x},\bar{\omega }_-),\text {int}(C)\times \Omega _-) > 0;\quad \forall (\bar{x},\bar{\omega }_-)\in \mathbb R\times \Omega _- \end{aligned}$$

by Theorem 4.1.(1), then

$$\begin{aligned} \int _{C\times \Omega _-} Q(\tau ; (\bar{x},\bar{\omega }_-),\text {int}(C)\times \Omega _-)\mu (d\bar{x},d\bar{\omega }_-) > 0. \end{aligned}$$

Therefore, \(\mu (\varphi \circ p_{\mathbb R}) > 0\).

4.4 Proof of Proposition 3.4

First write that, under Assumption 2.10, for any \(s\in [0,T]\) such that \(X(s)\in [x - h,x + h]\),

$$\begin{aligned} |b(X(s)) - b(x)|\leqslant \Vert b\Vert _{\text {Lip}}h. \end{aligned}$$

So,

$$\begin{aligned} \left| \frac{B_{T,h}(x)}{\widehat{f}_{T,h}(x)}\right| \leqslant \Vert b\Vert _{\text {Lip}}h. \end{aligned}$$
(12)

Next, the following lemma provides a suitable control of \(\mathbb E(|S_{T,h}(x)|^2)\).

Lemma 4.2

Under Assumptions 2.10 and 3.1, there exists a deterministic constant \(C > 0\), not depending on h and T, such that:

$$\begin{aligned} \mathbb E(|S_{T,h}(x)|^2) \leqslant CT^{2(H - 1)}h^{-4}. \end{aligned}$$

Proof

Since K belongs to \(C_{b}^{1}(\mathbb R,\mathbb R_+)\), the map

$$\begin{aligned} \varphi _h : y\in \mathbb R\longmapsto \varphi _h(y) := K\left( \frac{y - x}{h}\right) \end{aligned}$$

belongs to \(\text {Lip}_{b}^{1}(\mathbb R)\). Moreover, since K and \(K'\) are continuous with bounded support \([-1,1]\),

$$\begin{aligned} \left( \int _{0}^{T}\mathbb E(|\varphi _h(X(s))|^{1/H})ds\right) ^{2H} \leqslant \Vert K\Vert _{\infty }^2T^{2H} \end{aligned}$$

and

$$\begin{aligned} \left( \int _{0}^{T}\mathbb E(|\varphi _h'(X(s))|^2)^{1/(2H)}ds\right) ^{2H} \leqslant \Vert K'\Vert _{\infty }^2T^{2H}h^{-2}. \end{aligned}$$

Therefore, by Theorem 2.9, there exists a deterministic constant \(C > 0\), not depending on h and T, such that:

$$\begin{aligned} \mathbb E(|S_{T,h}(x)|^2)= & {} \frac{\sigma ^2}{T^2h^2}\mathbb E\left( \left| \int _{0}^{T}\varphi _h(X(s))\delta B(s)\right| ^2\right) \\\leqslant & {} CT^{2(H - 1)}h^{-4}. \end{aligned}$$

\(\square \)

First, by Inequality (12) and Eq. (7),

$$\begin{aligned} |\widehat{b}_{T,h}(x) - b(x)| \leqslant \Vert b\Vert _{ {\text {Lip}}}h + V_{T,h}(x) \end{aligned}$$

where \(V_{T,h}(x)\) is defined by (8).

Consider \(\beta \in [0,1 - H[\). By Lemma 4.2:

$$\begin{aligned} T^{2\beta }\mathbb E(|S_{T,h}(x)|^2) \leqslant CT^{2(H - 1 +\beta )}h^{-4} \xrightarrow [T\rightarrow \infty ]{} 0. \end{aligned}$$

So,

$$\begin{aligned} T^{\beta }|S_{T,h}(x)| \xrightarrow [T\rightarrow \infty ]{\mathbb P}0. \end{aligned}$$

Moreover, by Lemma 3.2:

$$\begin{aligned} \frac{1}{\widehat{f}_{T,h}(x)} \xrightarrow [T\rightarrow \infty ]{\mathcal D} \frac{1}{l_h(x)} > 0. \end{aligned}$$

Therefore, by Slutsky’s lemma:

$$\begin{aligned} T^{\beta }V_{T,h}(x) \xrightarrow [T\rightarrow \infty ]{\mathbb P} 0. \end{aligned}$$

Lastly, the bound (9) follows from the following lemma.

Lemma 4.3

Under Assumptions 2.10 and 3.1, there exists a deterministic constant \(C > 0\), not depending on \(\varepsilon \), h and T, such that:

$$\begin{aligned} |\widehat{b}_{T,h,\varepsilon }(x) -\widehat{b}_{T,h}(x)| \leqslant C\frac{\varepsilon h^{-2}T^{2H - 2}}{\widehat{f}_{T,h}(x)}. \end{aligned}$$

Proof

Since K belongs to \(C_{b}^{1}(\mathbb R,\mathbb R_+)\), the map

$$\begin{aligned} \varphi _h : y\in \mathbb R\longmapsto \varphi _h(y) := K\left( \frac{y - x}{h}\right) \end{aligned}$$

belongs to \(\text {Lip}_{b}^{1}(\mathbb R)\). Consider

$$\begin{aligned} S_h(x_0,\varepsilon ,T):= & {} \int _{0}^{T}\varphi _h(X_{x_0}(u))dX_{x_0}(u)\\&-\alpha _H\sigma ^2 \int _{0}^{T} \int _{0}^{u}\varphi _h'(X_{x_0}(u))\frac{X_{x_0 +\varepsilon }(u) - X_{x_0}(u)}{X_{x_0 +\varepsilon }(v) - X_{x_0}(v)}|u - v|^{2H - 2}dvdu. \end{aligned}$$

By Corollary 2.8:

$$\begin{aligned} \left| \int _{0}^{T}\varphi _h(X_{x_0}(u))\delta X_{x_0}(u) - S_h(x_0,\varepsilon ,T)\right|\leqslant & {} H\sigma ^2\frac{\Vert b''\Vert _{\infty }\Vert \varphi _h'\Vert _{\infty }}{M^2} \varepsilon T^{2H - 1}\\\leqslant & {} C\varepsilon h^{-1}T^{2H - 1}, \end{aligned}$$

where

$$\begin{aligned} C := \frac{H\sigma ^2\Vert b''\Vert _{\infty }\Vert K'\Vert _{\infty }}{M^2}. \end{aligned}$$

Therefore,

$$\begin{aligned} |\widehat{b}_{T,h,\varepsilon }(x) -\widehat{b}_{T,h}(x)| \leqslant C\frac{\varepsilon h^{-2}T^{2H - 2}}{\widehat{f}_{T,h}(x)}. \end{aligned}$$

\(\square \)

4.5 Proof of Proposition 3.6

On the one hand, assume that there exists \(\beta \in ]0,1 - H[\) such that

$$\begin{aligned} T^{-\beta } =_{T\rightarrow \infty } o(h(T)^2) \end{aligned}$$

in order to show the consistency of the estimator \(\widehat{b}_{T,h(T)}(x)\). First, let us prove that

$$\begin{aligned} \frac{S_{T,h(T)}(x)}{\widehat{f}_{T,h(T)}(x)} \xrightarrow [T\rightarrow \infty ]{\mathbb P} 0. \end{aligned}$$
(13)

For \(\varepsilon > 0\) arbitrarily chosen:

$$\begin{aligned} \mathbb P\left( \left| \frac{S_{T,h(T)}(x)}{\widehat{f}_{T,h(T)}(x)}\right| \geqslant \varepsilon \right) \leqslant \mathbb P(|S_{T,h(T)}(x)|\geqslant \varepsilon T^{H +\beta - 1}) + \mathbb P(\widehat{f}_{T,h(T)}(x) < T^{H +\beta - 1}). \end{aligned}$$

By Lemma 4.2:

$$\begin{aligned} \mathbb P(|S_{T,h(T)}(x)|\geqslant \varepsilon T^{H +\beta - 1}) \leqslant C\varepsilon ^{-2}|h(T)^{-2}T^{-\beta }|^2 \xrightarrow [T\rightarrow \infty ]{} 0. \end{aligned}$$

So, since \(T^{H +\beta - 1}\xrightarrow [T\rightarrow \infty ]{} 0\) (because \(\beta < 1 - H\)) and

$$\begin{aligned} \widehat{f}_{T,h(T)}(x) \xrightarrow [T\rightarrow \infty ]{\mathbb P} l(x)\in ]0,\infty ], \end{aligned}$$

the convergence result (13) is true.

Moreover, by Inequality (12):

$$\begin{aligned} \frac{B_{T,h(T)}(x)}{\widehat{f}_{T,h(T)}(x)} \xrightarrow [T\rightarrow \infty ]{\text {a.s.}} 0. \end{aligned}$$
(14)

Therefore, by the convergence results (13) and (14) together with Eq. (7):

$$\begin{aligned} \widehat{b}_{T,h(T)}(x) \xrightarrow [T\rightarrow \infty ]{\mathbb P}b(x). \end{aligned}$$

On the other hand, let \(\gamma \in ]0,\beta [\) be arbitrarily chosen such that

$$\begin{aligned} h(T) =_{T\rightarrow \infty } o(T^{-\gamma }) \text { and } T^{H - 1 +\gamma } =_{T\rightarrow \infty } o(h(T)^2) \end{aligned}$$

in order to show that

$$\begin{aligned} T^{\gamma }|\widehat{b}_{T,h(T)}(x) - b(x)| \xrightarrow [T\rightarrow \infty ]{\mathcal D} 0. \end{aligned}$$
(15)

First, by Inequality (12) and Eq. (7):

$$\begin{aligned} T^{\gamma }|\widehat{b}_{T,h(T)}(x) - b(x)| \leqslant \Vert b\Vert _{ {\text {Lip}}}T^{\gamma }h(T) + T^{\gamma }V_{T,h(T)}(x). \end{aligned}$$
(16)

By Lemma 4.2:

$$\begin{aligned} T^{2\gamma }\mathbb E(|S_{T,h(T)}(x)|^2) \leqslant C|h(T)^{-2}T^{H - 1 +\gamma }|^2 \xrightarrow [T\rightarrow \infty ]{} 0. \end{aligned}$$

So, since

$$\begin{aligned} \widehat{f}_{T,h(T)}(x) \xrightarrow [T\rightarrow \infty ]{\mathbb P} l(x)\in ]0,\infty ], \end{aligned}$$

by Slutsky’s lemma:

$$\begin{aligned} T^{\gamma }V_{T,h(T)}(x) \xrightarrow [T\rightarrow \infty ]{\mathcal D} 0. \end{aligned}$$

Finally, since \(h(T) =_{T\rightarrow \infty } o(T^{-\gamma })\), by Eq. (16), the convergence result (15) is true.

4.6 Proof of Corollary 3.7

In order to establish a rate of convergence for \(\widehat{b}_{T,h,\varepsilon }(x)\), Lemmas 4.2 and 4.3 provide a suitable control.

Indeed, by Lemma 4.3, there exists a deterministic constant \(C > 0\) such that:

$$\begin{aligned} |\widehat{b}_{T,h(T),\varepsilon (T)}(x) - b(x)|\leqslant & {} |\widehat{b}_{T,h(T),\varepsilon (T)}(x) -\widehat{b}_{T,h(T)}(x)| + |\widehat{b}_{T,h(T)}(x) - b(x)|\\\leqslant & {} C \frac{\varepsilon (T)h(T)^{-2}T^{2H - 2}}{\widehat{f}_{T,h(T)}(x)} + |\widehat{b}_{T,h(T)}(x) - b(x)|. \end{aligned}$$

Proposition 3.6 allows us to conclude.

4.7 Proof of Proposition 3.9

Consider a random variable \(U\rightsquigarrow \mathcal N(0,1)\) and

$$\begin{aligned} \mathcal G := \{G :\mathbb R\rightarrow \mathbb R : \mathbb E(G(U)) = 0 \text { and }\mathbb E(G(U)^2) <\infty \}, \end{aligned}$$

which is a subset of \(L^2(\mathbb R,\nu (y)dy)\).

The Hermite polynomials

$$\begin{aligned} H_q(y) := (-1)^qe^{y^2/2}\frac{d^q}{dy^q}e^{-y^2/2};\quad y\in \mathbb R \text {, } q\in \mathbb N \end{aligned}$$

form a complete orthogonal system in \(L^2(\mathbb R,\nu (y)dy)\) such that

$$\begin{aligned} \mathbb E(H_q(U)H_p(U)) = q!\delta _{p,q};\quad \forall p,q\in \mathbb N. \end{aligned}$$
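For instance, \(H_0(y) = 1\), \(H_1(y) = y\) and \(H_2(y) = y^2 - 1\); the facts \(H_0\equiv 1\) and \(H_1(y) = y\) are used below in the computation of \(J_{T,x}(1)\).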

By Taqqu (1975) (see p. 291) and Puig et al. (2002, Lemma 3.3):

  (1)

    For any \(G\in \mathcal G\) and \(y\in \mathbb R\),

    $$\begin{aligned} G(y) =\sum _{q = m(G)}^{\infty }\frac{J(q)}{q!}H_q(y) \end{aligned}$$
    (17)

    in \(L^2(\mathbb R,\nu (y)dy)\), where

    $$\begin{aligned} J(q) :=\mathbb E(G(U)H_q(U));\quad \forall q\in \mathbb N \end{aligned}$$

    and

    $$\begin{aligned} m(G) := \inf \{q\in \mathbb N : J(q)\not = 0\}. \end{aligned}$$
  (2)

    (Mehler’s formula) For any centered, normalized and stationary Gaussian process Z of autocorrelation function R:

    $$\begin{aligned} \mathbb E(H_q(Z(u))H_p(Z(v))) = q!R(v - u)^q\delta _{p,q};\quad \forall u,v\in \mathbb R_+ \text {, } \forall p,q\in \mathbb N. \end{aligned}$$
    (18)
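    (For \(q = p = 1\), Mehler's formula simply reads \(\mathbb E(Z(u)Z(v)) = R(v - u)\), which is the definition of the autocorrelation function.)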

Consider the map \(K_T :\mathbb R\rightarrow \mathbb R\) defined by:

$$\begin{aligned} K_T(y) := \frac{1}{h(T)}K\left( \frac{y}{h(T)}\right) ;\quad \forall y\in \mathbb R. \end{aligned}$$

In order to use (17) and (18) to prove the convergence result (10), note that \(\widehat{f}_{T,h(T)}(x)\) can be rewritten as

$$\begin{aligned} \widehat{f}_{T,h(T)}(x) = \frac{1}{T} \int _{0}^{T}G_{T,x}(Y(s))ds + R_{T,x}, \end{aligned}$$

where

$$\begin{aligned} R_{T,x} := \frac{1}{\sigma _0}\left( K_T*\nu \left( \frac{.}{\sigma _0}\right) \right) (x) \end{aligned}$$

and

$$\begin{aligned} G_{T,x}(y) := K_T(\sigma _0y - x) - R_{T,x}. \end{aligned}$$

Lemma 4.4

The map \(G_{T,x}\) belongs to \(\mathcal G\) and there exists \(T_x > 0\) such that

$$\begin{aligned} m(G_{T,x}) = 1;\quad \forall T > T_x. \end{aligned}$$

Proof

On the one hand, since \(K_T\) is continuous and its support is compact, \(G_{T,x}\in L^2(\mathbb R,\nu (y)dy)\). Moreover,

$$\begin{aligned} \mathbb E(G_{T,x}(U))= & {} \int _{-\infty }^{\infty }G_{T,x}(y)\nu (y)dy\\= & {} \int _{-\infty }^{\infty }K_T(\sigma _0y - x)\nu (y)dy - R_{T,x} = 0. \end{aligned}$$

So, \(G_{T,x}\in \mathcal G\).

On the other hand, for every \(q\in \mathbb N\), by putting \(J_{T,x}(q) :=\mathbb E(G_{T,x}(U)H_q(U))\),

$$\begin{aligned} J_{T,x}(1)= & {} \int _{-\infty }^{\infty } G_{T,x}(y)H_1(y)\nu (y)dy\\= & {} \int _{(x - h(T))/\sigma _0}^{(x + h(T))/\sigma _0} K_T(\sigma _0y - x)\nu (y)ydy - R_{T,x}\int _{-\infty }^{\infty }H_0(y)H_1(y)\nu (y)dy\\= & {} \int _{(x - h(T))/\sigma _0}^{(x + h(T))/\sigma _0} K_T(\sigma _0y - x)\nu (y)ydy. \end{aligned}$$

For any \(x > 0\), there exists \(T_{x}^{+} > 0\) such that for every \(T > T_{x}^{+}\),

$$\begin{aligned} I_{T,x} := \left[ \frac{x - h(T)}{\sigma _0} ; \frac{x + h(T)}{\sigma _0}\right] \subset ]0,\infty [. \end{aligned}$$

For every \(T > T_{x}^{+}\), since the integrand \(y\mapsto K_T(\sigma _0y - x)\nu (y)y\) is continuous, nonnegative on \(I_{T,x}\) and positive on a nonempty open subset of \(I_{T,x}^{\circ }\), \(J_{T,x}(1) > 0\). Symmetrically, for every \(x < 0\), there exists \(T_{x}^{-} > 0\) such that for every \(T > T_{x}^{-}\), \(J_{T,x}(1) < 0\). This concludes the proof. \(\square \)

Lemma 4.5

For every \(x\in \mathbb R^*\),

$$\begin{aligned} \sum _{q = 1}^{\infty } \frac{J_{T,x}(q)^2}{q!} =_{T\rightarrow \infty } O\left( \frac{1}{h(T)}\right) . \end{aligned}$$

Proof

Since \(G_{T,x}\in L^2(\mathbb R,\nu (y)dy)\), by Parseval's identity:

$$\begin{aligned} \sum _{q = 1}^{\infty } \frac{J_{T,x}(q)^2}{q!}= & {} \mathbb E(G_{T,x}(U)^2)\\= & {} \int _{-\infty }^{\infty } (K_T(\sigma _0y - x) - R_{T,x})^2\nu (y)dy\\\leqslant & {} 2\int _{-\infty }^{\infty } K_T(\sigma _0y - x)^2\nu (y)dy + 2R_{T,x}^{2}. \end{aligned}$$

On the one hand,

$$\begin{aligned} R_{T,x} \xrightarrow [T\rightarrow \infty ]{} \frac{1}{\sigma _0}\nu \left( \frac{x}{\sigma _0}\right) . \end{aligned}$$

So,

$$\begin{aligned} R_{T,x}^{2} =_{T\rightarrow \infty } O(1). \end{aligned}$$

On the other hand,

$$\begin{aligned} \int _{-\infty }^{\infty }K_T(\sigma _0y - x)^2\nu (y)dy= & {} \frac{1}{\sigma _0h(T)}\int _{-1}^{1}K(y)^2\nu \left( \frac{h(T)y + x}{\sigma _0}\right) dy\\\leqslant & {} \frac{2\Vert K\Vert _{\infty }^{2}\Vert \nu \Vert _{\infty }}{\sigma _0h(T)}. \end{aligned}$$

Therefore,

$$\begin{aligned} \sum _{q = 1}^{\infty }\frac{J_{T,x}(q)^2}{q!} =_{T\rightarrow \infty } O\left( \frac{1}{h(T)}\right) . \end{aligned}$$

\(\square \)

In order to prove the convergence result (10), since

$$\begin{aligned} R_{T,x} \xrightarrow [T\rightarrow \infty ]{} \frac{1}{\sigma _0}\nu \left( \frac{x}{\sigma _0}\right) , \end{aligned}$$

let us prove that

$$\begin{aligned} \left| \frac{1}{T}\int _{0}^{T}G_{T,x}(Y(s))ds\right| \xrightarrow [T\rightarrow \infty ]{\text {L}^2} 0. \end{aligned}$$
(19)

By the decomposition (17) and Mehler’s formula (18) applied to \(G_{T,x}\) and Y, for every \(u,v\in [0,T]\),

$$\begin{aligned} \mathbb E(G_{T,x}(Y(u))G_{T,x}(Y(v))) = \sum _{q = 1}^{\infty }\frac{J_{T,x}(q)^2}{q!}\rho (v - u)^q. \end{aligned}$$

So, since \(\rho \) is a \([-1,1]\)-valued function,

$$\begin{aligned} \mathbb E\left( \left| \int _{0}^{T}G_{T,x}(Y(s))ds\right| ^2 \right)\leqslant & {} \int _{0}^{T}\int _{0}^{T} |\mathbb E(G_{T,x}(Y(u))G_{T,x}(Y(v)))|dudv\\\leqslant & {} \sum _{q = 1}^{\infty }\frac{J_{T,x}(q)^2}{q!} \int _{0}^{T}\int _{0}^{T}|\rho (v - u)|^qdudv\\\leqslant & {} \left( \int _{0}^{T}\int _{0}^{T}|\rho (v - u)|dudv\right) \sum _{q = 1}^{\infty }\frac{J_{T,x}(q)^2}{q!}. \end{aligned}$$

Then, by Assumption 3.8 and Lemma 4.5, there exists a deterministic constant \(c > 0\), not depending on T, such that

$$\begin{aligned} \mathbb E\left( \left| \frac{1}{T} \int _{0}^{T}G_{T,x}(Y(s))ds\right| ^2\right) \leqslant c\frac{T^{2H - 2}}{h(T)} \xrightarrow [T\rightarrow \infty ]{} 0, \end{aligned}$$

since \(T^{2H - 2} =_{T\rightarrow \infty } o(h(T))\) by assumption.

Therefore, the convergence result (19) is true.