
1 Introduction

Stochastic differential equations driven by fractional Brownian motion (fBm) have been the subject of active research for the last two decades. The main reason is that these equations seem to be one of the most suitable tools to model the so-called long-range dependence in many applied areas, such as physics, finance, biology, network studies, etc. In modeling, the problem of statistical estimation of model parameters is of particular importance, so the growing number of papers devoted to statistical methods for equations with fractional noise is not surprising.

In this paper, we concentrate on the estimation of an unknown drift parameter \(\theta \) in the fractional diffusion process given as the solution to the equation

$$\begin{aligned} X_t= X_0+\theta \int _0^t a(s,X_s)\,ds+\int _0^t b(s,X_s)\,dB^H_s, \end{aligned}$$
(1)

where \(B^H\) is a fBm with known Hurst index H. The integral with respect to fBm is understood in the path-wise sense. Special attention is given to the fractional Ornstein–Uhlenbeck process, which is a solution of the following Langevin equation

$$\begin{aligned} X_t= X_0+\theta \int _0^t X_s\,ds+B^H_t, \end{aligned}$$
(2)

and to its generalizations. This model has been studied since the early 2000s and, compared to the general case, is by now well developed. Asymptotic and explicit distributions of various estimators have been obtained, and almost sure limit theorems, large deviation principles, and Berry–Esséen bounds have been established for this model. In the general case, to the best of our knowledge, only strong consistency results are available.

Note that when \(H=1/2\) we obtain a diffusion model driven by standard Brownian motion. The statistical inference for such models has been thoroughly studied by now, presented in many papers, and summarized in several books, see, e.g., [11, 28, 33, 35, 44, 46, 66, 70] and references cited therein. At the same time, we can mention only the book [67] devoted to fractional diffusions (some fractional models are also considered in [11, 51]). In the present article, we try to present the most recent achievements in this field focusing on rather general models.

We also study the following mixed model

$$\begin{aligned} X_t= X_0+\theta \int _0^t a(s,X_s)\,ds+\int _0^t b(s,X_s)\,dB^H_s+\int _0^t c(s,X_s)\,dW_s, \end{aligned}$$
(3)

which contains both standard and fractional Brownian motion. The motivation to consider such equations comes, in particular, from financial mathematics. When it is necessary to model randomness on a financial market, it is useful to distinguish between two main sources of this randomness. The first source is the stock exchange itself with thousands of agents. The noise coming from this source can be assumed white and is best modeled by a Wiener process. The second source has a financial and economic background. The random noise coming from this source usually has the long-range dependence property, which can be modeled by an fBm \(B^H\) with Hurst parameter \(H > 1/2\). As examples of Eq. (3), we consider linear and mixed Ornstein–Uhlenbeck models.

Note that in the present paper the parameter H is considered to be known. The problem of the Hurst parameter estimation in stochastic differential equations driven by fBm was studied in [9, 40, 41], for mixed models see [21].

Let us mention briefly some related models that are not considered in this article. First note that if \(a=b=c\equiv 1\), then we get the simple models \(X_t=\theta t + B^H_t\) and \(X_t=\theta t + B^H_t+W_t\), which were studied in [8, 16, 32]. Recently, a similar mixed model with two fractional Brownian motions was considered in [52, 56]. Prakasa Rao [67] investigated the equation \(dX_t = [a(t,X_t)+\theta b(t,X_t)]\,dt+\sigma (t)\,dB^H_t\). He studied maximum likelihood, Bayes, and instrumental variable estimation in this model. Multiparameter equations with additive fractional noise were considered in [18, 71]. A multidimensional model was investigated in [60]. In [19, 50], the so-called sub-fractional Ornstein–Uhlenbeck process was studied, where the process \(B^H_t\) in (2) was replaced with a sub-fractional Brownian motion. A model with more general Gaussian noise was considered in [22]. For the parameter estimation in the so-called fractional Ornstein–Uhlenbeck process of the second kind, see [1, 2]. Linear and Ornstein–Uhlenbeck models with multifractional Brownian motion were studied in [20]. The parameter estimation for partially observed fractional models related to the fractional Ornstein–Uhlenbeck process was investigated in [7, 14, 15, 23].

The paper is organized as follows. In Sect. 2 the basic facts about fBm, path-wise stochastic integration, pure and mixed stochastic differential equations with fBm are given. Section 3 is devoted to the case of estimation by continuous-time observations in the fractional model (1), when the whole trajectory of the solution is observed. In Sect. 4, we consider the discrete-time versions of this model. Mixed models are discussed in Sect. 5.

2 Basic Facts

In this section, we review basic properties of the fBm (Sect. 2.1), consider the path-wise integration using the fractional calculus (Sect. 2.2), and give the existence and uniqueness theorems for stochastic differential equations driven by fBm with \(H>1/2\) (Sect. 2.3) and for mixed stochastic differential equations with long-range dependence, involving both Wiener process and fBm with \(H>1/2\) (Sect. 2.4).

2.1 Fractional Brownian Motion

Let \((\varOmega , \mathcal {F},\overline{\mathcal {F}}, \mathbf {P})\) be a complete probability space with filtration \(\overline{\mathcal {F}}=\{\mathcal {F}_t, t\in \mathbb {R}^+\}\) satisfying the standard assumptions. It is assumed that all processes under consideration are adapted to filtration \(\overline{\mathcal {F}}\).

Definition 2.1

Fractional Brownian motion (fBm) with Hurst index \(H \in (0,1)\) is a Gaussian process \(B^{H}=\left\{ B_{t}^{H},\; t \in \mathbb {R}^+\right\} \) on \((\varOmega , \mathcal {F}, \mathbf {P})\) featuring the properties

  1. (a)

    \(B_{0}^{H}=0\);

  2. (b)

    \({}\mathbf {E}B_{t}^{H}=0\), \(t \in \mathbb {R}^+\);

  3. (c)

    \({}\mathbf {E}B_{t}^{H}B_{s}^{H}=\frac{1}{2}\left( t ^{2H}+ s ^{2H}-|t-s|^{2H}\right) \), \(s,t\in \mathbb {R}^+\).

It is not hard to see that for \(H=1/2\) fBm is a Brownian motion. For \(H\ne 1/2\) the fBm is neither a semimartingale nor a Markov process.
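Throughout this survey, several constructions can be illustrated numerically. A convenient starting point is the simulation of the fBm itself: since the covariance is given explicitly by property (c) of Definition 2.1, a sample path can be generated by the Cholesky factorization of the covariance matrix. The following Python sketch is a minimal illustration (the helper name fbm_cholesky is ours; the \(O(n^3)\) cost makes it unsuitable for long paths):

```python
import numpy as np

def fbm_cholesky(n, H, T=1.0, rng=None):
    """Sample an fBm path at t_k = k T / n, k = 0, ..., n, via the Cholesky
    factor of the covariance from Definition 2.1 (property (c))."""
    rng = np.random.default_rng(rng)
    t = np.linspace(T / n, T, n)                    # positive grid points
    s, u = np.meshgrid(t, t, indexing="ij")
    cov = 0.5 * (s**(2 * H) + u**(2 * H) - np.abs(s - u)**(2 * H))
    L = np.linalg.cholesky(cov)
    B = np.concatenate(([0.0], L @ rng.standard_normal(n)))   # B^H_0 = 0
    return B, np.concatenate(([0.0], t))

B, t = fbm_cholesky(500, H=0.7)
```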

The fBm was first considered in [37]. Stochastic calculus for fBm was developed by Mandelbrot and van Ness [48], who obtained the following integral representation:

$$ B^H_t=a_H \left\{ \int _{-\infty }^0 \left[ (t-s)^{H-\frac{1}{2}}-(-s)^{H-\frac{1}{2}}\right] dW_s+\int _0^t(t-s)^{H-\frac{1}{2}}dW_s\right\} , $$

where \(W=\{W_t,\; t\in \mathbb {R}\}\) is a Wiener process, and \(a_H=\sqrt{\frac{2H\varGamma (\frac{3}{2}-H)}{\varGamma (H+\frac{1}{2})\varGamma (2-2H)}}\), \(\varGamma \) denotes the Gamma function.

Another representation of the fBm was obtained in [61]:

$$ B_t^{H} = \int _0^t g_H(t,s)\,dW_s,\quad t\in [0,T], $$

where \(W=\{W_t,\; t\ge 0\}\) is a Wiener process, and

$$\begin{aligned} g_H(t,s)&= a_H\left[ \left( \frac{t}{s}\right) ^{H-\frac{1}{2}}(t-s)^{H-\frac{1}{2}} -\left( H-\tfrac{1}{2}\right) s^{\frac{1}{2}-H}\int _s^t(v-s)^{H-\frac{1}{2}}v^{H-\frac{3}{2}}\,dv\right] . \end{aligned}$$

For \(H>1/2\) this expression can be slightly simplified:

$$\begin{aligned} g_H(t,s)&= \left( H-\tfrac{1}{2}\right) a_H s^{\frac{1}{2}-H} \int _s^t(v-s)^{H-\frac{3}{2}}v^{H-\frac{1}{2}}\,dv. \end{aligned}$$

Definition 2.1 implies that the fBm is self-similar with the self-similarity parameter H, that is, \(\left\{ B^H_{ct}\right\} {\mathop {=}\limits ^{\mathcal D}}\left\{ c^H B^H_t\right\} \) for any \(c> 0\), where \({\mathop {=}\limits ^{\mathcal D}}\) denotes the distributional equivalence.

The fBm has stationary increments in the sense that \(\mathbf {E}\left( B_{t}^{H}-B_{s}^{H}\right) ^{2}=|t-s|^{2H}\). Taking into account that the process \(B^H\) is Gaussian, one can deduce from the Kolmogorov theorem that it has a continuous (and even Hölder continuous up to order H) modification. In what follows, we consider this modification of the fBm.

The increments of the fBm are independent only in the case \(H=1/2\). They are negatively correlated for \(H\in (0,1/2)\) and positively correlated for \(H\in (1/2,1)\). Moreover, for \(H\in (1/2,1)\) the fBm has the property of long-range dependence. This means that \(\sum _{n=1}^\infty \left| r(n)\right| =\infty \), where \(r(n)=\mathbf {E}B^H_1\left( B^H_{n+1}-B^H_n\right) \) is the autocovariance function.
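The divergence of \(\sum _{n=1}^\infty \left| r(n)\right| \) for \(H>1/2\) is easy to check numerically: property (c) of Definition 2.1 gives \(r(n)=\frac{1}{2}\left( (n+1)^{2H}-2n^{2H}+(n-1)^{2H}\right) \sim H(2H-1)n^{2H-2}\), so the partial sums grow without bound. A short sketch:

```python
import numpy as np

H = 0.7
n = np.arange(1, 10**6, dtype=float)
# r(n) = E B^H_1 (B^H_{n+1} - B^H_n), from property (c) of Definition 2.1
r = 0.5 * ((n + 1)**(2 * H) - 2 * n**(2 * H) + (n - 1)**(2 * H))
print(np.abs(r).sum())   # partial sums grow like n^{2H-1} for H > 1/2
```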

Jost [34] established the formula for the transformation of an fBm with positively correlated increments into an fBm with negatively correlated increments, and vice versa. Let \(B^H=\left\{ B^H_t,t\in [0,T]\right\} \) be an fBm with Hurst index \(H\in (0,1)\). Then there exists a unique (up to modification) fBm \(B^{1-H}=\left\{ B^{1-H}_t,t\in [0,T]\right\} \) with Hurst index \(1-H\) such that

$$ B^H_t=\left( \frac{2H}{\varGamma (2H)\varGamma (3-2H)}\right) ^{\frac{1}{2}} \int _0^t(t-s)^{2H-1}\,dB^{1-H}_s, $$

where the integral with respect to fBm is a fractional Wiener integral.

For more details on fBm we refer to the books [10, 51, 62].

2.2 Elements of Fractional Calculus and Fractional Integration

In this subsection, we describe a construction of the path-wise integral following the approach developed by Zähle [76,77,78]. We start by introducing fractional integrals and derivatives, see [69] for the details on the concept of fractional calculus.

Definition 2.2

Let \(f\in L_1(a,b)\). The Riemann–Liouville left- and right-sided fractional integrals of order \(\alpha >0\) are defined for almost all \(x\in (a,b)\) by

$$\begin{aligned} \mathcal {I}^\alpha _{a+} f(x)&:=\frac{1}{\varGamma (\alpha )} \int _a^x(x-y)^{\alpha -1}f(y)\,dy ,\\ \mathcal {I}^\alpha _{b-} f(x)&:=\frac{(-1)^{-\alpha }}{\varGamma (\alpha )} \int _x^b(y-x)^{\alpha -1}f(y)\,dy, \end{aligned}$$

respectively, where \((-1)^{-\alpha }=e^{-i\pi \alpha }\).

Definition 2.3

For a function \(f:[a,b]\rightarrow \mathbb {R}\) the Riemann–Liouville left- and right-sided fractional derivatives of order \(\alpha \) (\(0<\alpha <1\)) are defined by

$$\begin{aligned} \mathcal {D}^\alpha _{a+} f(x)&:=\mathbbm {1}_{(a,b)}(x)\,\frac{1}{\varGamma (1-\alpha )}\,\frac{d}{dx} \int _a^x\frac{f(y)}{(x-y)^{\alpha }}\,dy ,\\ \mathcal {D}^\alpha _{b-} f(x)&:=\mathbbm {1}_{(a,b)}(x)\,\frac{(-1)^{1+\alpha }}{\varGamma (1-\alpha )}\,\frac{d}{dx} \int _x^b\frac{f(y)}{(y-x)^{\alpha }}\,dy. \end{aligned}$$

Denote by \(\mathcal {I}^\alpha _{a+}(L_p)\) (resp. \(\mathcal {I}^\alpha _{b-}(L_p)\)) the class of functions f that can be presented as \(f=\mathcal {I}^\alpha _{a+}\varphi \) (resp. \(f=\mathcal {I}^\alpha _{b-}\varphi \)) for \(\varphi \in L_p(a,b)\). For \(f\in \mathcal {I}^\alpha _{a+}(L_p)\) (resp. \(f\in \mathcal {I}^\alpha _{b-}(L_p)\)), \(p\ge 1\), the corresponding Riemann–Liouville fractional derivatives admit the following Weyl representation

$$\begin{aligned} \mathcal {D}^\alpha _{a+} f(x)&=\frac{1}{\varGamma (1-\alpha )}\left( \frac{f(x)}{(x-a)^\alpha }+ \alpha \int _a^x\frac{f(x)-f(y)}{(x-y)^{\alpha +1}}\,dy\right) \mathbbm {1}_{(a,b)}(x),\\ \mathcal {D}^\alpha _{b-} f(x)&=\frac{(-1)^{\alpha }}{\varGamma (1-\alpha )}\left( \frac{f(x)}{(b-x)^\alpha }+ \alpha \int _x^b\frac{f(x)-f(y)}{(y-x)^{\alpha +1}}\,dy\right) \mathbbm {1}_{(a,b)}(x), \end{aligned}$$

where the convergence of the integrals holds pointwise for a. a. \(x\in (a,b)\) for \(p = 1\) and in \(L_p(a,b)\) for \(p>1\).
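The Weyl representation is also convenient for numerical evaluation, since it avoids differentiating an integral. A rough midpoint-rule sketch in Python (the helper name and the discretization are ours; it assumes f is Hölder continuous of order greater than \(\alpha \) near x, so that the integrand is integrable):

```python
import numpy as np
from scipy.special import gamma

def weyl_left_derivative(f, a, x, alpha, n=10_000):
    """Midpoint-rule evaluation of D^alpha_{a+} f(x) via the Weyl
    representation above; crude, for illustration only."""
    y = a + (np.arange(n) + 0.5) * (x - a) / n       # midpoints of (a, x)
    integral = np.sum((f(x) - f(y)) / (x - y)**(alpha + 1)) * (x - a) / n
    return (f(x) / (x - a)**alpha + alpha * integral) / gamma(1 - alpha)

# sanity check: D^alpha_{0+} of f(y) = y equals x^(1-alpha) / Gamma(2-alpha)
print(weyl_left_derivative(lambda y: y, 0.0, 1.0, 0.3), 1 / gamma(1.7))
```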

Let \(f,g:[a,b]\rightarrow \mathbb {R}\). Assume that the limits

$$ f(u+):=\lim _{\delta \downarrow 0}f(u+\delta ) \quad \text {and}\quad g(u-):=\lim _{\delta \downarrow 0}g(u-\delta ) $$

exist for \(a\le u\le b\). Denote

$$\begin{aligned} f_{a+}(x)&=(f(x)-f(a+))\mathbbm {1}_{(a,b)}(x),\\ g_{b-}(x)&=(g(b-)-g(x))\mathbbm {1}_{(a,b)}(x). \end{aligned}$$

Definition 2.4

([76]) Assume that \(f_{a+}\in \mathcal {I}^\alpha _{a+}(L_p),\ g_{b-}\in \mathcal {I}^{1-\alpha }_{b-}(L_q)\) for some \(1/p+1/q\le 1\), \(0<\alpha <1\). The generalized (fractional) Lebesgue–Stieltjes integral of f with respect to g is defined by

$$\begin{aligned} \int _a^b f(x)\,dg(x):=&(-1)^\alpha \int _a^b \mathcal {D}^\alpha _{a+}f_{a+}(x)\,\mathcal {D}^{1-\alpha }_{b-}g_{b-}(x)\,dx\\ &+f(a+)\bigl (g(b-)-g(a+)\bigr ). \end{aligned}$$
(4)

Note that this definition is correct, i.e., independent of the choice of \(\alpha \) ([76, Proposition 2.1]). If \(\alpha p<1\), then (4) can be simplified to

$$ \int _a^b f(x)\,dg(x):=(-1)^\alpha \int _a^b \mathcal {D}^\alpha _{a+}f(x)\,\mathcal {D}^{1-\alpha }_{b-}g_{b-}(x)\,dx. $$

In particular, Definition 2.4 allows us to integrate Hölder continuous functions.

Definition 2.5

Let \(0<\lambda \le 1\). A function \(f:[a,b]\rightarrow \mathbb {R}\) belongs to \(C^\lambda [a,b]\), if there exists a constant \(C>0\) such that

$$ \left| f(s)-f(t)\right| \le C\left| s-t\right| ^\lambda ,\quad s,t\in [a,b]. $$

Proposition 2.6

([76, Theorem 4.2.1]) Let \(f\in C^\lambda [a,b]\), \(g\in C^\mu [a,b]\) with \(\lambda +\mu >1\). Then the assumptions of Definition 2.4 are satisfied with any \(\alpha \in (1-\mu ,\lambda )\) and \(p=q=\infty \). Moreover, the generalized Lebesgue–Stieltjes integral \(\int _a^b f(x)\,dg(x)\) defined by (4) coincides with the Riemann–Stieltjes integral

$$\begin{aligned} \{R-S\}\int _a^b f(x)\,dg(x):=\lim _{\left| \pi \right| \rightarrow 0}\sum _i f(x_i^*)(g(x_{i+1})-g(x_i)), \end{aligned}$$

where \(\pi =\{a=x_0\le x_0^*\le x_1\le \ldots \le x_{n-1}\le x_{n-1}^*\le x_n=b\}\), \(\left| \pi \right| =\max _i\left| x_{i+1}-x_i\right| \).

Recall that for any \(\mu \in (0,H)\) the trajectories of the fBm \(B^H\) are \(\mu \)-Hölder continuous. Therefore, if \(Z=\left\{ Z_t, t\ge 0\right\} \) is a stochastic process whose trajectories are \(\lambda \)-Hölder continuous with \(\lambda > 1-H\), then the path-wise integral \(\int _0^TZ_t\,dB^H_t\) is well defined and coincides with the Riemann–Stieltjes integral.
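In practice, this means that such integrals can be approximated by ordinary left-point Riemann–Stieltjes sums. A minimal sketch (assuming numpy and, for the commented example, an fBm path from the fbm_cholesky sketch of Sect. 2.1):

```python
import numpy as np

def rs_integral(f_vals, g_vals):
    """Left-point Riemann-Stieltjes sum: sum_i f(x_i)(g(x_{i+1}) - g(x_i)).
    By Proposition 2.6, it approximates the generalized Lebesgue-Stieltjes
    integral when the Hölder exponents of f and g sum to more than 1."""
    return float(np.sum(f_vals[:-1] * np.diff(g_vals)))

# e.g., with B from the fbm_cholesky sketch (H > 1/2), rs_integral(B, B)
# should be close to B[-1]**2 / 2, since the path-wise (Young) integral
# obeys the ordinary change-of-variables formula.
```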

Remark 2.7

There are many papers devoted to stochastic differential equations with fBm with different definitions of the stochastic integral. In the present paper, we concentrate only on the path-wise definition proposed in [76] for \(H>1/2\). We refer to the book [10] (see also [51]) for the extended survey on various approaches on stochastic integration with respect to fBm and the relations between different types of integrals.

2.3 Stochastic Differential Equations Driven by fBm

Consider a stochastic differential equation driven by fBm \(B^H=\left\{ B_t^H,\; t\in [0,T]\right\} \), \(H\in (1/2, 1)\) on a complete probability space \((\varOmega , \mathcal {F}, \mathbf {P})\):

$$\begin{aligned} X_t=X_0+\int _0^ta(s, X_s)ds+\int _0^tb(s, X_s)dB^H_s, \quad t\in [0, T]. \end{aligned}$$
(5)

Let the function \(b=b(t, x):[0, T]\times \mathbb {R}\rightarrow \mathbb {R}\) satisfy the assumptions: b is differentiable in x, there exist \(M>0\), \(0<\gamma , \kappa \le 1\) and for any \(R>0\) there exists \(M_R>0\) such that  

(A\(_{1}\)):

b is Lipschitz continuous in x:

$$|b(t, x)-b(t, y)|\le M|x-y|, \quad \forall t\in [0, T], x, y \in \mathbb {R}; $$
(A\(_{2}\)):

x-derivative of b is locally Hölder in x:

$$|b_x(t, x)-b_x(t, y)|\le M_R|x-y|^\kappa , \quad \forall |x|, |y|\le R, t\in [0, T]; $$
(A\(_{3}\)):

b and its spatial derivative are Hölder in time:

$$|b(t, x)-b(s, x)|+|b_x(t, x)-b_x(s, x)|\le M|t-s|^\gamma , \quad \forall x\in \mathbb {R}, t, s\in [0, T]. $$

 

Let the function \(a=a(t, x):[0, T]\times \mathbb {R}\rightarrow \mathbb {R}\) satisfy the assumptions  

(A\(_{4}\)):

for any \(R\ge 0\) there exists \(L_R>0\) such that

$$|a(t, x)-a(t, y)|\le L_R|x-y|, \quad \forall |x|, |y|\le R, \forall t\in [0, T]; $$
(A\(_{5}\)):

there exists the function \(a_0\in L_p[0, T]\) and \(L>0\) such that

$$|a(t, x)|\le L|x|+a_0(t),\quad \forall (t, x)\in [0, T]\times \mathbb {R}. $$

 

Fix a parameter \(\alpha \in (0,1/2)\). Let \(W_\infty ^\alpha [0,T]\) be the space of real-valued measurable functions \(f:[0,T]\rightarrow \mathbb {R}\) such that

$$ \left\| f\right\| _{\infty ,\alpha ;T}= \sup _{s\in [0,T]}\left( \left| f(s)\right| +\int _0^s \left| f(s)-f(u)\right| (s-u)^{-1-\alpha } du\right) <\infty . $$

Theorem 2.8

([65]) Let the coefficients a and b satisfy (A\(_{1}\))–(A\(_{5}\)) with \(p=(1-H+\varepsilon )^{-1}\) for some \(0<\varepsilon <H-1/2\), \(\gamma >1-H\), \(\kappa >H^{-1}-1\) (the constants \(M, M_R, R, L_R\), and the function \(a_0\) may depend on \(\omega \)). Then there exists a unique solution \(X=\{X_t, t\in [0, T]\}\) of Eq. (5), \(X\in L_0(\varOmega , \mathcal {F}, \mathbf {P}, W_\infty ^{1-H+\varepsilon }[0, T])\), with a.a. trajectories from \(C^{H-\varepsilon }[0, T]\).

Remark 2.9

Here we restrict ourselves to the one-dimensional case, but it is worth mentioning that Theorem 2.8 was proved in [65] for the case of multidimensional processes. It also admits multiparameter [54] and multifractional [68] generalizations.

When \(b(t,x)\equiv 1\), we obtain the following equation:

$$\begin{aligned} X_t=X_0+\int _0^ta(s, X_s)ds+B^H_t, \quad t\in [0, T]. \end{aligned}$$
(6)

Since this equation does not contain integration with respect to the fractional Brownian motion, it can be considered for all \(H\in (0,1)\). Nualart and Ouknine [63] proved the existence and uniqueness of a strong solution to Eq. (6) under the following weak regularity assumptions on the coefficient a(t, x).

Theorem 2.10

([63])

  1. (i)

    If \(H\le 1/2\) (singular case), we assume the linear growth condition

    $$ \left| a(t, x)\right| \le C(1 + |x|). $$
  2. (ii)

    If \(H>1/2\) (regular case), we assume that a is Hölder continuous of order \(\alpha \in \left( 1-\frac{1}{2H},1\right) \) in x and of order \(\gamma >H-1/2\) in time:

    $$ \left| a(t,x) - a(s,y)\right| \le C\left( |x - y|^\alpha + |t - s|^\gamma \right) . $$

Then Eq. (6) has a unique strong solution.

Remark 2.11

The existence and uniqueness of a strong solution to (6) can be obtained under weaker conditions on a(t, x). In particular, equations with locally unbounded drift for \(H<1/2\) were studied in [64]. For \(H>1/2\), Hu et al. [31] considered the case when the coefficient a(t, x) has a singularity at \(x=0\).
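Since Eq. (6) contains no integral with respect to \(B^H\), its solution is straightforward to approximate once a path of the noise is available. The following Euler-type sketch is our own simplistic discretization (reusing the fbm_cholesky helper from Sect. 2.1), given only for illustration:

```python
import numpy as np

def euler_additive(a, x0, B, t):
    """Euler scheme for Eq. (6): X_{k+1} = X_k + a(t_k, X_k) dt_k + dB^H_k."""
    X = np.empty_like(B)
    X[0] = x0
    for k in range(len(t) - 1):
        X[k + 1] = X[k] + a(t[k], X[k]) * (t[k + 1] - t[k]) + (B[k + 1] - B[k])
    return X

B, t = fbm_cholesky(1000, H=0.7, T=5.0)
X = euler_additive(lambda s, x: -x, x0=1.0, B=B, t=t)   # a(t, x) = -x
```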

2.4 Mixed Stochastic Differential Equations with Long-Range Dependence

Let (\(\varOmega ,\mathcal {F},\left\{ \mathcal {F}_t\right\} _{t\in [0,T]},\mathbf {P}\)) be a complete probability space equipped with a filtration satisfying the standard assumptions, and let \(W=\{W_t, t \in [0,T]\}\) be a standard \(\mathcal {F}_t\)-Wiener process. In this subsection, we investigate a more general model than (3): instead of the fBm we consider an \(\mathcal {F}_t\)-adapted stochastic process \(Z=\{Z_t, t \in [0,T]\}\), which is almost surely Hölder continuous with exponent \(\gamma >1/2\). The processes W and Z can be dependent. We study the mixed stochastic differential equation

$$\begin{aligned} X_t =X_0 +\int _0^t a(s,X_{s})\,ds+\int _0^tb(s,X_{s})\,dZ_s +\int _0^tc(s,X_{s})\,dW_s,\ t\in [0,T]\;. \end{aligned}$$
(7)

The integral w.r.t. Wiener process W is the standard Itô integral, and the integral w.r.t. Z is path-wise generalized Lebesgue–Stieltjes integral, see Definition 2.4.

We will assume that for some \(K>0\), \(\beta >1/2\), and for any \(t,s\in [0,T]\), \(x,y\in \mathbb R\),  

(B\(_{1}\)):

\(| a(t,x)| +|b(t,x)| + |c(t,x)| \le K(1+\left| x\right| )\),

(B\(_{2}\)):

\(|a(t,x)-a(t,y)|+|c(t,x)-c(t,y)|\le K|x-y|\),

(B\(_{3}\)):

\(|a(s,x)-a(t,x)|+|b(s,x)-b(t,x)|+|c(s,x)-c(t,x)|+|\partial _x{}b(s,x)-\partial _x{}b(t,x)|\le {K}|s-t|^{\beta }\),

(B\(_{4}\)):

\(|\partial _x{}b(t,x)-\partial _x{}b(t,y)|\le K|x-y|\),

(B\(_{5}\)):

\(\left| \partial _x b(t,x)\right| \le K\).

 

Theorem 2.12

([53]) Let \(\alpha \in (1-\gamma ,\frac{1}{2}\wedge \beta )\). If the coefficients of Eq. (7) satisfy conditions (B\(_{1}\))–(B\(_{5}\)), then it has a unique solution X such that \(\left\| X\right\| _{\infty ,\alpha ,T}<\infty \) a.s.

Remark 2.13

It was proved in [58] that Eq. (7) is uniquely solvable when assumptions (B\(_{1}\))–(B\(_{5}\)) hold and if additionally c is bounded as follows:  

(B\(_{6}\)):

\(|c(t,x)|\le K_1\) for some \(K_1>0\).

  Later, the existence and uniqueness theorem was obtained in [53] without assumption (B\(_{6}\)). Equation (7) with \(Z=B^H\), a fractional Brownian motion, was first considered in [39], where the existence and uniqueness of a solution were proved for time-independent coefficients and zero drift. For inhomogeneous coefficients, unique solvability was established in [51] for \(H\in (3/4,1)\) and bounded coefficients, and in [27] for any \(H>1/2\), but under the assumption that W and \(B^H\) are independent.

3 Drift Parameter Estimation by Continuous Observations

This section is devoted to the drift parameter estimation in the model (1) by continuous observations of the process X. We discuss the construction of the maximum likelihood estimator based on the Girsanov transform. Then we study a nonstandard estimator. These results are applied to linear models. In the last subsection, various estimators in the fractional Ornstein–Uhlenbeck model are considered.

3.1 General Fractional Model

Assume that \(H>\frac{1}{2}\) and consider the equation

$$\begin{aligned} X_t =x_0 +\theta \int _0^t a(s,X_{s})ds+ \int _0^tb(s,X_{s})dB_s^H,\quad t\in \mathbb {R}^+,\end{aligned}$$
(8)

where \(x_0\in \mathbb {R}\) is the initial value, \(\theta \) is the unknown parameter to be estimated, the first integral in the right-hand side of (8) is the Lebesgue–Stieltjes integral, and the second integral is the generalized Lebesgue–Stieltjes integral introduced in Definition 2.4.

3.1.1 The Standard Maximum Likelihood Estimator

Let the following assumptions hold:  

(C\(_{1}\)):

Linear growth of ab in space: for any \(t\in [0,T]\) and \(x\in \mathbb {R}\)

$$| a(t,x)| +|b(t,x)| \le K(1+\left| x\right| ),$$
(C\(_{2}\)):

Lipschitz continuity of a, b in space: for any \(t\in [0,T]\) and \(x,y\in \mathbb R\)

$$|a(t,x)-a(t,y)|+|b(t,x)-b(t,y)|\le K|x-y|,$$
(C\(_{3}\)):

Hölder continuity of a, b, \(\partial _x{}b\) in time: there exists \(\beta >1/2\) such that for any \(t,s\in [0,T]\) and \(x\in \mathbb {R}\)

$$|a(s,x)-a(t,x)|+|b(s,x)-b(t,x)|+|\partial _x{}b(s,x)-\partial _x{}b(t,x)|\le {K}|s-t|^{\beta },$$
(C\(_{4}\)):

Hölder continuity of \(\partial _x{}b\) in space: there exists such \(\rho \in (3/2 - H,1)\) that for any \(t\in [0,T]\) and \( x,y\in \mathbb {R}\)

$$ |\partial _x{}b(t,x)-\partial _x{}b(t,y)|\le {D}|x-y|^\rho ,$$

  Then, according to Theorem 2.8, a solution of Eq. (8) exists on any interval [0, T] and is unique in the class of processes satisfying

$$\begin{aligned} \left\| X\right\| _{\infty ,\alpha ,T}<\infty \quad \text {a.s.} \end{aligned}$$
(9)

for some \(\alpha >1-H\).

In addition, suppose that the following assumption holds:  

(D\(_{1}\)):

\(b(t,X_t)\ne 0, t\in [0,T]\) and \(\frac{a(t,X_t)}{b(t,X_t)}\) is a.s. Lebesgue integrable on [0, T] for any \(T>0\).

  Denote \(\psi (t, x)=\frac{a(t,x)}{b(t,x)}\), \(\varphi (t):=\psi (t, X_t)\). Also, define the kernel

$$l_{H}(t,s)=c_{H}s^{\frac{1}{2}-H}(t-s)^{\frac{1}{2}-H}\mathbbm {1}_{\{0<s<t\}},$$

with \(c_{H}=\left( \frac{\varGamma (3-2H)}{2H\varGamma (\frac{3}{2}-H)^{3}\varGamma (H+\frac{1}{2})}\right) ^ {\frac{1}{2}}\), and introduce the integral

$$\begin{aligned} J_t=\int _0^t l_{H}(t,s)\varphi (s)ds=c_{H} \int _0^t (t-s)^{\frac{1}{2}-H}s^{\frac{1}{2}-H}\varphi (s)ds. \end{aligned}$$
(10)

Finally, let \( M_t^H =\int _0^t l_H(t,s)dB_s^H \) be the Gaussian martingale with angle bracket \(\langle M^H\rangle _t=t^{2-2H}\) (the Molchan martingale, see [61]).

Consider the following two processes:

$$Y_t=\int _0^tb^{-1}(s, X_s)dX_s=\theta \int _0^t\varphi (s)ds+B_t^H$$

and

$$Z_t=\int _0^t l_{H}(t,s)dY_s=\theta J_t+M_t^H.$$

Remark 3.1

Note that the transformation from X to Z does not lead to a loss of information, since we can recover Y (and consequently X) from Z via the Volterra kernel introduced in [61, Theorem 5.2]. So, these processes generate the same filtration.

Also, note that we can rewrite the process Z as

$$Z_t=\int _0^tl_{H}(t,s)b^{-1}(s, X_s)dX_s,$$

so Z is a functional of the observable process X. The following smoothness condition on the function \(\psi \) ([51, Lemma 6.3.2]) ensures the semimartingale property of Z.

Lemma 3.2

Let \(\psi =\psi (t, x)\in C^1(\mathbb {R}^+)\times C^2(\mathbb {R}).\) Then for any \(t>0\)

$$\begin{aligned} J'(t)&=(2-2H)C_H\psi (0,x_0)t^{1-2H}\\ &\quad +\int _0^tl_H(t,s)\left( \frac{\partial \psi }{\partial t}(s,X_s) +\theta \frac{\partial \psi }{\partial x}(s,X_s)a(s,X_s)\right) ds\\ &\quad -\left( H-\tfrac{1}{2}\right) c_H\int _0^ts^{-\frac{1}{2}-H}(t-s)^{\frac{1}{2}-H}\int _0^s\left( \frac{\partial \psi }{\partial t}(u,X_u)+\theta \frac{\partial \psi }{\partial x}(u,X_u)a(u,X_u)\right) du\,ds\\ &\quad +(2-2H)c_Ht^{1-2H}\int _0^t s^{2H-3}\int _0^su^{\frac{3}{2}-H}(s-u)^{\frac{1}{2}-H}\frac{\partial \psi }{\partial x}(u,X_u)b(u,X_u)\,dB_u^H\,ds\\ &\quad +c_Ht^{-1}\int _0^tu^{\frac{3}{2}-H}(t-u)^{\frac{1}{2}-H}\frac{\partial \psi }{\partial x}(u,X_u)b(u,X_u)\,dB_u^H, \end{aligned}$$
(11)

where \(C_H=B(\frac{3}{2}-H,\frac{3}{2}-H)c_H=\Big (\frac{\varGamma (\frac{3}{2}-H)}{2H\varGamma (H+\frac{1}{2})\varGamma (3-2H)}\Big )^{\frac{1}{2}}\), \(B(\cdot ,\cdot )\) denotes the Beta function, and all of the involved integrals exist a.s.

Remark 3.3

Suppose that \(\psi (t, x)\in C^1(\mathbb {R}^+)\times C^2(\mathbb {R})\) and the limit \(\varsigma (0)=\lim _{s\rightarrow 0}\varsigma (s)\) exists a.s., where \(\varsigma (s)=s^{\frac{1}{2}-H}\varphi (s)\). In this case J(t) can be presented as

$$J(t)=c_H\int _0^t(t-s)^{\frac{1}{2}-H}\varsigma (s)ds = \frac{c_Ht^{\frac{3}{2}-H}}{\frac{3}{2}-H}\varsigma (0)+c_H\int _0^t\frac{(t-s)^{\frac{3}{2}-H}}{\frac{3}{2}-H}\varsigma '(s)ds,$$

and \(J'(t)\) from (11) can be simplified to

$$\begin{aligned} J'(t)&=c_Ht^{\frac{1}{2}-H}\varsigma (0)+\int _0^tl_H(t,s)\left( \left( \tfrac{1}{2}-H\right) s^{-1}\varphi (s) +\frac{\partial \psi }{\partial t}(s,X_s)+\theta \frac{\partial \psi }{\partial x}(s,X_s)a(s,X_s)\right) ds\\ &\quad +\int _0^tl_H(t,s)\frac{\partial \psi }{\partial x}(s,X_s)b(s,X_s)\,dB_s^H. \end{aligned}$$

In the same way as Z, the processes J and \(J'\) are functionals of X. It is more convenient to consider the process \(\chi (t)=(2-2H)^{-1}J'(t)t^{2H-1}\), so that

$$Z_t=(2-2H)\theta \int _0^t\chi (s)s^{1-2H}ds+M_t^H=\theta \int _0^t\chi (s)d\langle M^H\rangle _s+M_t^H.$$

Suppose that the following conditions hold:  

(D\(_{2}\)):

\(\mathbf {E}I_T:=\mathbf {E}\int _0^T\chi ^2_sd\langle M^H\rangle _s<\infty \) for any \(T>0\),

(D\(_{3}\)):

\(I_\infty :=\int _0^\infty \chi ^2_sd\langle M^H\rangle _s=\infty \) a.s.

  Then we can consider the maximum likelihood estimator (MLE)

$$\theta ^{(1)}_T=\frac{\int _0^T\chi _sdZ_s}{\int _0^T\chi ^2_sd\langle M^H\rangle _s}=\theta +\frac{\int _0^T\chi _sdM^H_s}{\int _0^T\chi ^2_sd\langle M^H\rangle _s}.$$

Condition (D\(_{2}\)) ensures that the process \(\int _0^t\chi _sdM^H_s\), \(t>0\), is a square-integrable martingale, and condition (D\(_{3}\)), together with the law of large numbers for martingales, ensures that \(\frac{\int _0^T\chi _sdM^H_s}{\int _0^T\chi ^2_sd\langle M^H\rangle _s}\rightarrow 0\) a.s. as \(T\rightarrow \infty \). Summarizing, we arrive at the following result.

Theorem 3.4

([51]) Let \(\psi (t, x)\in C^1(\mathbb {R}^+)\times C^2(\mathbb {R}) \) and assumptions (C\(_{1}\))–(C\(_{4}\)) and (D\(_{1}\))–(D\(_{3}\)) hold. Then the estimator \(\theta ^{(1)}_T\) is strongly consistent as \(T\rightarrow \infty \).

Remark 3.5

In [57] the explicit form of the likelihood ratio was established. It was shown that the MLE can be represented as a function of the observed process X, namely

$$ \widehat{\theta }_t^{\,(1)} =\frac{\int _0^t\left( \varphi (s) + (H-\tfrac{1}{2}) s^{2H-1} \int _0^s\frac{s^{\frac{1}{2}-H}\varphi (s)-u^{\frac{1}{2}-H}\varphi (u)}{(s-u)^{H+\frac{1}{2}}}du\right) d\widetilde{Y}_s}{\int _0^ts^{2H-1}\left( \frac{\varphi (s)}{s^{2H-1}} + (H-\tfrac{1}{2})\int _0^s\frac{s^{\frac{1}{2}-H}\varphi (s)-u^{\frac{1}{2}-H}\varphi (u)}{(s-u)^{H+\frac{1}{2}}}du\right) ^2ds}, $$

where \(\widetilde{Y}_s=\int _0^sv^{\frac{1}{2}-H}(s-v)^{\frac{1}{2}-H}b^{-1}(v,X_v)\,dX_v\).

Remark 3.6

Tudor and Viens [74] constructed the MLE for the following model

$$ X_t = \int _0^ta(X_s)\,ds+B^H_t,\quad X_0 = 0. $$

Under some regularity conditions on the coefficient a(x) they proved the strong consistency of the MLE in both cases \(H<1/2\) and \(H>1/2\).

3.1.2 A Nonstandard Estimator

It is possible to construct another estimator of the parameter \(\theta \), preserving the structure of the standard MLE. A similar approach was applied in [29] to the fractional Ornstein–Uhlenbeck process with constant coefficients (see the estimator (19) below). We shall use the process Y to define the estimator as follows:

$$\begin{aligned} \widehat{\theta }^{(2)}_T=\frac{\int _0^T\varphi _s\,dY_s}{\int _0^T\varphi ^2_s\,ds} =\theta +\frac{\int _0^T\varphi _sdB^H_s}{\int _0^T\varphi ^2_sds}.\end{aligned}$$
(12)
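Before turning to its theoretical properties, note that (12) is easy to approximate from a discretized trajectory by left-point Riemann sums (this anticipates the discrete-time estimators of Sect. 4). A minimal Python sketch with our own helper name:

```python
import numpy as np

def theta_hat_2(X, t, a, b):
    """Riemann-sum approximation of (12): phi_s = a(s, X_s) / b(s, X_s),
    dY_s = b(s, X_s)^{-1} dX_s, theta_hat = int phi dY / int phi^2 ds."""
    phi = a(t[:-1], X[:-1]) / b(t[:-1], X[:-1])
    dY = np.diff(X) / b(t[:-1], X[:-1])
    return float(np.sum(phi * dY) / np.sum(phi**2 * np.diff(t)))
```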

Theorem 3.7

([38]) Let assumptions (C\(_{1}\))–(C\(_{4}\)), (D\(_{1}\)), and (D\(_{2}\)) hold and let the function \(\varphi \) satisfy the following assumption:

(D\(_{4}\)):

There exist \(\alpha >1-H\) and \(p>1\) such that

$$\begin{aligned} \rho _{\alpha , p,T}:=\frac{T^{H+\alpha -1}(\log T)^p\int _0^T|(\mathcal {D}_{0+}^{\alpha }\varphi )(s)|ds}{\int _0^T \varphi ^2_sds}\rightarrow 0\quad \text {a.s. as}\quad T\rightarrow \infty . \end{aligned}$$
(13)

 

Then the estimator \(\widehat{\theta }^{(2)}_T\) is correctly defined and strongly consistent as \(T\rightarrow \infty \).

Relation (13) ensures the convergence \(\frac{\int _0^T\varphi _s\,dB^H_s}{\int _0^T\varphi ^2_s\,ds}\rightarrow 0\) a.s. in the general case. In the particular case when the function \(\varphi \) is nonrandom, the integral \(\int _0^T\varphi _s\,dB^H_s\) is a Wiener integral w.r.t. the fractional Brownian motion; the conditions for the existence of this integral are simpler, and assumption (13) can be simplified as well.

Theorem 3.8

([38]) Let assumptions (C\(_{1}\))–(C\(_{4}\)), (D\(_{1}\)), and (D\(_{2}\)) hold and let the function \(\varphi \) be nonrandom and satisfy the following assumption:

(D\(_{5}\)):

There exists \(p>0\) such that

$$\begin{aligned}\limsup _{T\rightarrow \infty } \frac{T^{2H-1+p}}{\int _0^T\varphi ^2(t)dt}<\infty . \end{aligned}$$

  Then the estimator \(\widehat{\theta }^{(2)}_T\) is strongly consistent as \(T\rightarrow \infty \).

In the next subsection, we consider some examples of \(\varphi \) and establish not only the convergence to zero but the rate of convergence as well.

3.1.3 Examples of the Remainder Terms with the Estimation of the Rate of Convergence to Zero

We start with the simplest case when \(\varphi \) is a power function, \(\varphi (t)=t^a\), \(a\ge 0\), \(t\ge 0\). This means that \(a(t,x)=b(t,x)t^a\). If the coefficient b(t, x) satisfies assumptions (C\(_{1}\))–(C\(_{4}\)) and \(b(t,X_t)\ne 0\), \(t\in [0,T]\), then a(t, x) satisfies assumptions (C\(_{1}\))–(C\(_{4}\)) on any interval [0, T] and condition (D\(_{1}\)) holds. Consequently, Eq. (8) has a unique solution, the estimator \(\widehat{\theta }^{(2)}_T\) is correctly defined, and we can study the properties of the remainder term \(\rho _{\alpha , p,T}\).

Lemma 3.9

([3]) Let \(\varphi (t)=t^a\), \(a\ge 0\), \(t\ge 0\). Then \(\rho _{\alpha , p,T}=C_aT^{H-a-1}(\log T)^p\rightarrow 0\) as \(T\rightarrow \infty \), where

$$C_a= \frac{(2a+1)\varGamma (a+1)}{\varGamma (a-\alpha +2)}.$$

Remark 3.10

As to the rate of convergence to zero, we can say that

$$\rho _{\alpha , p,T}=O\left( T^{H-1-a+\varepsilon }\right) $$

as \(T\rightarrow \infty \) for any \(\varepsilon >0.\)

Now consider the case where \(\varphi \) is a polynomial function. In this case, similarly to the monomial case, the solution of Eq. (8) exists and is unique, and the estimator is correctly defined. As an immediate generalization of Lemma 3.9, we get the following statement.

Lemma 3.11

([3]) Let \(N \in {\mathbb N} \setminus \{ 0 \}\) and \(\displaystyle \varphi _N(t)= \sum _{k=0}^N \alpha _k t^{a_k}\), \(t\ge 0\), where \(0\le a_0< a_1<\ldots < a_N\) are nonnegative exponents and \((\alpha _k)\) are nonnegative coefficients with \(\alpha _N>0\). Then \(\rho _{\alpha , p,T} \rightarrow 0\) as \(T\rightarrow \infty \), and the rate of convergence to zero is \(\rho _{\alpha , p,T}=O\left( T^{H-1-a_N+\varepsilon }\right) \) for any \(\varepsilon >0\).

Now consider the case of the trigonometric function.

Lemma 3.12

([3]) Let \(\varphi (t)=\sin (\lambda t)\), \(\lambda \ge 0\). Then estimator \(\widehat{\theta }^{(2)}_T\) is strongly consistent as \(T\rightarrow \infty \).

Remark 3.13

We see that in the case of power and polynomial functions (Remark 3.10 and Lemma 3.11) we obtain not only the convergence to zero but also the rate of convergence, whereas in the case of the trigonometric function we only get the convergence. The reason can be seen from the following result.

Lemma 3.14

([3]) Let \(\varphi (t)=\sin (\lambda t)\), \(\lambda \ge 0\). Then

$$ \lim _{T \rightarrow + \infty }\rho _{\alpha , p,T}=\lim _{T \rightarrow + \infty } \frac{ T^{H+\alpha -1} (\log T)^p \int _0^T |( \mathcal {D}_{0^+}^\alpha \varphi )(x)|dx}{\int _0^T \varphi ^2(x) dx} = + \infty .$$

Remark 3.15

Note for completeness that for \(\frac{ T^{H+\alpha -1} (\log T)^p \int _0^T ( \mathcal {D}_{0^+}^\alpha \varphi )(x)dx}{\int _0^T \varphi ^2(x) dx}\) the situation is different; more precisely,

$$\lim _{T \rightarrow + \infty } \frac{ T^{H+\alpha -1} (\log T)^p \int _0^T ( \mathcal {D}_{0^+}^\alpha \varphi )(x)dx}{\int _0^T \varphi ^2 (x) dx}=0.$$

Lemma 3.16

([3]) Let \(\varphi (t)=\exp (-\lambda t)\), \(\lambda > 0\). Then

$$ \lim _{T \rightarrow + \infty }\rho _{\alpha , p,T}=\lim _{T \rightarrow + \infty } \frac{ T^{H+\alpha -1} (\log T)^p \int _0^T |( \mathcal {D}_{0^+}^\alpha \varphi )(x)|dx}{\int _0^T \varphi ^2(x) dx} = 0.$$

Remark 3.17

It is easy to deduce from the previous calculations that in the latter case

$$\rho _{\alpha , p,T}=O\left( T^{H-1+\varepsilon }\right) $$

as \(T\rightarrow \infty \) for any \(\varepsilon >0.\)

Lemma 3.18

([3]) Let \(\varphi (t)=\exp (\lambda t)\), \(\lambda > 0\). Then

$$ \lim _{T \rightarrow + \infty }\rho _{\alpha , p,T}=\lim _{T \rightarrow + \infty } \frac{ T^{H+\alpha -1} (\log T)^p \int _0^T |( \mathcal {D}_{0^+}^\alpha \varphi )(x)|dx}{\int _0^T \varphi ^2(x) dx} =0.$$

Remark 3.19

In this case

$$\rho _{\alpha , p,T}=O\left( e^{-(\lambda -\varepsilon )T}\right) =o\left( T^{-\varepsilon }\right) $$

as \(T\rightarrow \infty \) for any \(\varepsilon >0.\)

Lemma 3.20

([3]) Let \(\varphi (t)= \log (1+ t)\). Then

$$ \lim _{T \rightarrow + \infty }\rho _{\alpha , p,T}=\lim _{T \rightarrow + \infty } \frac{ T^{H+\alpha -1} (\log T)^p \int _0^T |( \mathcal {D}_{0^+}^\alpha \varphi )(x)|dx}{\int _0^T \varphi ^2(x) dx} =0.$$

Remark 3.21

In this case

$$\rho _{\alpha , p,T}=O\left( T^{H-1+\varepsilon } \right) $$

as \(T\rightarrow \infty \) for any \(\varepsilon >0.\)

3.1.4 Sequential Estimators

Suppose that conditions (D\(_{1}\))–(D\(_{3}\)) hold. For any \(h>0\) consider the stopping time

$$\tau (h)=\inf \left\{ t>0: \int _0^t\chi ^2_sd\langle M^H\rangle _s=h\right\} .$$

Under conditions (D\(_{1}\))–(D\(_{2}\)) we have \(\tau (h)<\infty \) a.s. and \(\int _0^{\tau (h)}\chi ^2_sd\langle M^H\rangle _s=h\). The sequential MLE has the form

$$ \widehat{\theta }_{\tau (h)}^{(1)}=\frac{\int _0^{\tau (h)}\chi _sdZ_s}{h} =\theta +\frac{\int _0^{\tau (h)}\chi _sdM^H_s}{h}.$$

A sequential version of the estimator \(\widehat{\theta }_T^{(2)}\) has the form

$$\widehat{\theta }_{\upsilon (h)}^{(2)}=\theta +\frac{\int _0^{\upsilon (h)}\varphi _sdB^H_s}{h}, $$

where

$$\upsilon (h)=\inf \left\{ t>0: \int _0^t\varphi ^2 (s)ds=h\right\} .$$

Theorem 3.22

([38])

  1. (a)

    Let assumptions (D\(_{1}\))–(D\(_{3}\)) hold. Then the estimator \(\widehat{\theta }_{\tau (h)}^{(1)}\) is unbiased, efficient, strongly consistent, \(\mathbf {E}\left( \widehat{\theta }_{\tau (h)}^{(1)}-\theta \right) ^2=\frac{1}{h}\), and for any estimator of the form

    $$\widehat{\theta }_{\tau }=\frac{\int _0^{\tau }\chi _sdZ_s}{\int _0^{\tau }\chi ^2_sd\langle M^H\rangle _s} =\theta +\frac{\int _0^{\tau }\chi _sdM^H_s}{\int _0^{\tau }\chi ^2_sd\langle M^H\rangle _s}$$

    with \(\tau <\infty \) a.s. and \(\mathbf {E}\int _0^{\tau }\chi ^2_sd\langle M^H\rangle _s\le h\) we have that

    $$\mathbf {E}\left( \widehat{\theta }_{\tau (h)}^{(1)}-\theta \right) ^2\le \mathbf {E}(\widehat{\theta }_{\tau }-\theta )^2.$$
  2. (b)

    Let the function \(\varphi \) be separated from zero, i.e., \(|\varphi (s)|\ge c>0\) a.s., and satisfy the following assumption: for some \(1-H<\alpha <1\) and \(p>0\)

    $$\begin{aligned} \frac{\int _0^{\upsilon (h)}|(\mathcal {D}_{0+}^{\alpha }\varphi )(s)|ds}{(\upsilon (h))^{2-\alpha -H-p}}\rightarrow 0\quad \text {a.s.}\end{aligned}$$
    (14)

    as \(h\rightarrow \infty \). Then the estimator \(\widehat{\theta }_{\upsilon (h)}^{(2)}\) is strongly consistent.

Remark 3.23

The assumption (14) holds, for example, for a bounded and Lipschitz function \(\varphi \).

3.2 Linear Models

Consider the linear version of model (8):

$$dX_t=\theta a(t)X_tdt+b(t) X_tdB_t^H, $$

where a and b are locally bounded nonrandom measurable functions. In this case the solution X exists, is unique, and admits the explicit representation

$$X_t=x_0+\theta \int _0^t a(s)X_sds+\int _0^tb(s) X_sdB_s^H=x_0\exp \left\{ \theta \int _0^t a(s) ds+\int _0^tb(s) dB_s^H\right\} .$$

Suppose that the function b is nonzero and note that in this model

$$\varphi (t)=\frac{a(t)}{b(t)}.$$

Suppose that \(\varphi (t)\) is also locally bounded and consider the maximum likelihood estimator \(\widehat{\theta }_T^{\,(1)}\). According to (10), to guarantee the existence of the process \(J'\), we have to assume that the fractional derivative of order \(\frac{3}{2}-H\) of the function \(\varsigma (s):=\varphi (s)s^{\frac{1}{2}-H}\) exists and is integrable. Sufficient conditions for the existence of fractional derivatives can be found in [69]. One of these conditions states the following:

(D\(_{6}\)):

Functions \(\varphi \) and \(\varsigma \) are differentiable and their derivatives are locally integrable.

  So, it is hard to determine the behavior of the MLE for an arbitrary locally bounded function \(\varphi \). Suppose that condition (D\(_{6}\)) holds and the limit \(\varsigma _0=\lim _{s\rightarrow 0}\varsigma (s)\) exists. In this case, according to Lemma 3.2 and Remark 3.3, the process \(J'\) admits both of the following representations:

$$\begin{aligned} J'(t)&=(2-2H)C_H\varphi (0) t^{1-2H}+\int _0^tl_H(t,s)\varphi '(s)ds\\&\quad -\Big (H-\frac{1}{2}\Big )c_H\int _0^ts^{-\frac{1}{2}-H}(t-s)^{\frac{1}{2}-H}\int _0^s\varphi '(u)duds\\&=c_H\varsigma _0t^{\frac{1}{2}-H}+c_H\int _0^t(t-s)^{\frac{1}{2}-H}\varsigma '(s)ds, \end{aligned}$$

and assuming (D\(_{3}\)) also holds true, the estimator \(\widehat{\theta }^{(1)}_T\) is strongly consistent. Let us formulate some simple conditions sufficient for the strong consistency.

Lemma 3.24

([38]) If the function \(\varphi \) is nonrandom and locally bounded, satisfies (D\(_{6}\)), the limit \(\varsigma (0)\) exists, and one of the following assumptions holds:

  1. (a)

    function \(\varphi \) is not identically zero and \(\varphi '\) is nonnegative and nondecreasing;

  2. (b)

    derivative \(\varsigma '\) preserves the sign and is separated from zero;

  3. (c)

    derivative \(\varsigma '\) is nondecreasing and has a nonzero limit,

then the estimator \(\widehat{\theta }_T^{(1)}\) is strongly consistent as \(T\rightarrow \infty \).

Example 3.25

If the coefficients are constant, \(a(s)=a\ne 0\) and \(b(s)=b\ne 0\), then the estimator has the form \(\widehat{\theta }^{(1)}_T=\theta +\frac{b M^H_T}{aC_HT^{2-2H}}\) and is strongly consistent. In this case assumption (a) holds. In addition, power functions \(\varphi (s)=s^\rho \) are appropriate for \(\rho >H-1\): this can be verified directly from (10).

Let us now apply the estimator \(\widehat{\theta }^{(2)}_T\) to the same model. It has the form (12). We can use Theorem 3.8 directly: under assumption (D\(_{5}\)) the estimator \(\widehat{\theta }^{(2)}_T\) is strongly consistent. Note that we do not need any assumptions on the smoothness of \(\varphi \), which is a clear advantage of \(\widehat{\theta }^{(2)}_T\). We shall consider two more examples.

Example 3.26

If the coefficients are constant, \(a(s)=a\ne 0\) and \(b(s)=b\ne 0\), then the estimator has the form \(\widehat{\theta }^{(2)}_T=\theta +\frac{b B^H_T}{aT}\). In this case both estimators \(\widehat{\theta }^{(1)}_T\) and \(\widehat{\theta }^{(2)}_T\) are strongly consistent, and \(\mathbf {E}\left( \theta -\widehat{\theta }^{(1)}_T\right) ^2=\frac{b^2 T^{2H-2}}{a^2C_H^2}\) has the same asymptotic behavior as \(\mathbf {E}\left( \theta -\widehat{\theta }^{(2)}_T\right) ^2=\frac{b^2 T^{2H-2}}{a^2}\).

Example 3.27

If the nonrandom functions \(\varphi \) and \(\varsigma \) are bounded on some fixed interval \([0,t_0]\), but \(\varsigma \) is sufficiently irregular on this interval and has no fractional derivative of order \(\frac{3}{2}-H\) or higher, then we cannot even calculate \(J'(t)\) on this interval, and it is hard to analyze the behavior of the maximum likelihood estimator. However, if we assume that \(\varphi (t)\sim t^{H-1+\rho }\) at infinity with some \(\rho >0\), then assumption (D\(_{5}\)) holds and the estimator \(\widehat{\theta }_T^{(2)}\) is strongly consistent as \(T\rightarrow \infty \). In this sense, the estimator \(\widehat{\theta }_T^{(2)}\) is more flexible. The estimator \(\widehat{\theta }_T^{(1)}\) was considered in [45].

3.3 Fractional Ornstein–Uhlenbeck Model

3.3.1 General Case

Consider the fractional Ornstein–Uhlenbeck, or Vasicek, model with nonconstant coefficients. It has the form

$$dX_t = \theta (a(t)X_t+b(t))dt + \gamma (t) dB^H_t,\,t\ge 0,$$

where a, b, and \(\gamma \) are nonrandom measurable functions. Suppose they are locally bounded and \(\gamma =\gamma (t)>0\). The solution of this equation is a Gaussian process and has the form

$$\begin{aligned} X_t=e^{\theta A(t)}\Big (x_0+\theta \int _0^tb(s)e^{-\theta A(s)}ds+\int _0^t\gamma (s)e^{-\theta A(s)}dB^H_s\Big ):=E(t)+G(t), \end{aligned}$$

where \(A(t)=\int _0^ta(s)ds\), \(E(t)=e^{\theta A(t)}\Big (x_0+\theta \int _0^tb(s)e^{-\theta A(s)}ds\Big )\) is a nonrandom function, \(G(t)=e^{\theta A(t)}\int _0^t\gamma (s)e^{-\theta A(s)}dB^H_s\) is a Gaussian process with zero mean.

Denote \(c(t)=\frac{a(t)}{\gamma (t)}, d(t)=\frac{b(t)}{\gamma (t)}.\) Now we shall state the conditions for strong consistency of the maximum likelihood estimator.

Theorem 3.28

([38]) Let functions a, c, d, and \(\gamma \) satisfy the following assumptions:  

(D\(_{7}\)):

\(-a_1\le a(s)\le -a_2<0\), \(-c_1\le c(s)\le -c_2<0\), \(0<\gamma _1\le \gamma (s)\le \gamma _2\), functions c and d are continuously differentiable, \(c'\) is bounded, \(c'(s)\ge 0\), and \(c'(s)\rightarrow 0\) as \(s\rightarrow \infty \).

  Then the estimator \(\widehat{\theta }_T^{(1)}\) is strongly consistent as \(T\rightarrow \infty \).

Remark 3.29

The assumptions of the theorem are fulfilled, for example, if \(a(s)=-1\), \(b(s)=b\in \mathbb {R}\) and \(\gamma (s)=\gamma >0\). In this case we deal with a standard Ornstein–Uhlenbeck process X with constant coefficients that satisfies the equation

$$dX_t = \theta (b-X_t )dt + \gamma dB^H_t,\,t\ge 0.$$

3.3.2 The Case of Constant Coefficients

Consider a simple version of the Ornstein–Uhlenbeck model where \(a=\gamma =1\), \(b=x_0=0\). The corresponding stochastic differential equation has the form

$$ dX_t = \theta X_tdt + dB^H_t,\,t\ge 0 $$

with the evident solution \(X_t=e^{\theta t}\int _0^te^{-\theta s}dB_s^H\). We start with the maximum likelihood estimator \(\widehat{\theta }_T^{\,(1)}\). According to [36], it has the following form

$$\begin{aligned} \widehat{\theta }_T^{\,(1)}=\frac{\int _0^TQ(s)\,dZ_s}{\int _0^TQ^2(s)\,dw^H_s}, \end{aligned}$$
(15)

where \(w^H_t=\frac{t^{2-2H}\varGamma (3/2-H)}{2H\varGamma (3-2H)\varGamma (H+1/2)}\), \(Q(t)=\frac{d}{dw^H_t}\int _0^tk_H(t,s)X_s\,ds\), \(Z_t=\int _0^tk_H(t,s)\,dX_s\), \(k_H(t,s)=\frac{s^{1/2-H}(t-s)^{1/2-H}}{2H\varGamma (3/2-H)\varGamma (H+1/2)}\).

Theorem 3.30

([14, 36, 72, 73]) Let \(H\in [\frac{1}{2},1)\).

  1. 1.

    For any \(\theta \in \mathbb {R}\) the estimator \(\widehat{\theta }_T^{\,(1)}\) defined by (15) is strongly consistent.

  2. 2.

    Denote \(B(\theta ,T)=\mathbf {E}\left( \widehat{\theta }_T^{\,(1)}-\theta \right) \), \(V(\theta ,T)=\mathbf {E}\left( \widehat{\theta }_T^{\,(1)}-\theta \right) ^2\). The following properties hold:

    1. (i)

      If \(\theta <0\), then, as \(T\rightarrow \infty \),

      $$\begin{aligned} B(\theta ,T)\sim -2T^{-1};\quad V(\theta ,T)\sim 2\left| \theta \right| T^{-1}, \end{aligned}$$
      (16)
    2. (ii)

      If \(\theta =0\), then, for all T,

      $$\begin{aligned} B(0,T)=B(0,1)T^{-1};\quad V(0,T)=V(0,1)T^{-2}, \end{aligned}$$
    3. (iii)

      If \(\theta >0\), then, as \(T\rightarrow \infty \),

      $$\begin{aligned} B(\theta ,T)&\sim -2\sqrt{\pi \sin \pi H}\theta ^{3/2}e^{-\theta T}\sqrt{T};\end{aligned}$$
      (17)
      $$\begin{aligned} V(\theta ,T)&\sim 2\sqrt{\pi \sin \pi H}\theta ^{5/2}e^{-\theta T}\sqrt{T}. \end{aligned}$$
      (18)
  3. 3.
    1. (i)

      If \(\theta <0\), then, as \(T\rightarrow \infty \),

      $$\begin{aligned} \sqrt{T}\left( \widehat{\theta }_T^{\,(1)}-\theta \right) \xrightarrow {\mathcal L}\mathcal {N}(0,-2\theta ), \end{aligned}$$
    2. (ii)

      If \(\theta =0\), then, for all T,

      $$ T\widehat{\theta }_T^{\,(1)}{\mathop {=}\limits ^{\mathcal D}}\widehat{\theta }_1^{(1)} $$
    3. (iii)

      If \(\theta >0\), then, as \(T\rightarrow \infty \),

      $$\begin{aligned} \frac{e^{\theta T}}{2\theta }\left( \widehat{\theta }_T^{\,(1)}-\theta \right) \xrightarrow {\mathcal L}\sqrt{\sin \pi H}\,\mathcal {C}(1), \end{aligned}$$

      where \(\mathcal {C}(1)\) is the standard Cauchy distribution, and \(\xrightarrow {\mathcal L}\) denotes the convergence in law.

Remark 3.31

The MLE for the fractional Ornstein–Uhlenbeck process was first studied in [36]. The authors derived the formula for the MLE, proved its strong consistency, and obtained the asymptotic properties of the bias and the mean square error. The asymptotic normality in the case \(\theta <0\) was established in [14]. The asymptotic distributions for \(\theta =0\) and \(\theta >0\) were obtained in [72, 73]. The large deviation properties of the MLE were investigated in [5] (see also [6, 26]). The exact distribution of the MLE was computed in [72, 73].

Remark 3.32

It holds that \(\widehat{\theta }_{H,T}^{(1)}{\mathop {=}\limits ^{\mathcal D}}\widehat{\theta }_{1-H,T}^{(1)}\), where \(\widehat{\theta }_{H,T}^{(1)}\) is the MLE under the Hurst parameter H and the time span T (see [14] for \(\theta <0\), [72] for \(\theta =0\), and [73] for \(\theta >0\)). The MLE for \(H<1/2\) was also considered in [74], where the relations (16)–(18) were proved for \(H<1/2\).

Remark 3.33

The properties of estimators in the fractional Ornstein–Uhlenbeck model substantially depend on the sign of \(\theta \). The hypothesis testing of the drift parameter sign was studied in [43, 59, 72, 73].

Consider for \(H\in (\frac{1}{2},1)\) the estimator \(\widehat{\theta }_T^{(2)}\):

$$\begin{aligned} \widehat{\theta }_T^{(2)}=\frac{\int _0^T X_sdX_s}{\int _0^T X_s^2ds}=\theta +\frac{\int _0^TX_sdB_s^H}{\int _0^TX_s^2ds}. \end{aligned}$$
(19)

It admits the following representation

$$\begin{aligned} \widehat{\theta }_T^{(2)}=\frac{X_T^2}{2\int _0^T X_s^2ds}. \end{aligned}$$
(20)

Note that this form of the estimator is well defined for all \(H\in (0,1)\).
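Representation (20) is also convenient for simulation. A sketch combining an Euler discretization of the fractional Ornstein–Uhlenbeck equation (our own simplification) with the fbm_cholesky helper from Sect. 2.1:

```python
import numpy as np

def fou_path(theta, H, T, n, rng=None):
    """Euler scheme for dX_t = theta X_t dt + dB^H_t, X_0 = 0."""
    B, t = fbm_cholesky(n, H, T, rng)
    X = np.zeros_like(B)
    for k in range(n):
        X[k + 1] = X[k] + theta * X[k] * (t[k + 1] - t[k]) + B[k + 1] - B[k]
    return X, t

X, t = fou_path(theta=0.5, H=0.7, T=10.0, n=2000)
theta_2 = X[-1]**2 / (2 * np.sum(X[:-1]**2 * np.diff(t)))   # estimator (20)
```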

Theorem 3.34

([4, 22]) Let \(\theta >0\), \(H\in (0,1)\). Then the estimator \(\widehat{\theta }_T^{(2)}\) given by (20) is strongly consistent as \(T\rightarrow \infty \). Moreover,

$$ e^{\theta T}\left( \widehat{\theta }_T^{(2)}-\theta \right) \xrightarrow {\mathcal L}2\theta \mathcal {C}(1), $$

as \(T\rightarrow \infty \), where \(\mathcal {C}(1)\) is the standard Cauchy distribution.

Remark 3.35

If \(\theta <0\), then \(\widehat{\theta }_T^{(2)}\) converges to zero in \(L_2(\varOmega ,\mathbf {P})\) ([29], see the remark at the end of Sect. 3). If the path-wise integral in (19) is replaced by the divergence-type integral, then the estimator (19) is strongly consistent and asymptotically normal [29, Theorems 3.2, 3.4]. The divergence-type integral is the limit of the Riemann sums defined in terms of the Wick product. Since it is not suitable for simulation and discretization, Hu and Nualart [29] proposed the following estimator for the ergodic case \(\theta <0\)

$$\begin{aligned} \widehat{\theta }_T^{(3)}=-\left( \frac{1}{H\varGamma (2H)T}\int _0^TX_s^2\,ds\right) ^{-\frac{1}{2H}}. \end{aligned}$$

Theorem 3.36

([29, 43]) Let \(\theta <0\), \(H\in (0,1)\). Then the estimator \(\widehat{\theta }_T^{(3)}\) is strongly consistent as \(T\rightarrow \infty \). If \(H\in (\frac{1}{2},\frac{3}{4})\), then

$$ \sqrt{T}\left( \widehat{\theta }_T^{(3)}-\theta \right) \xrightarrow {\mathcal L}\mathcal {N}\left( 0,-\theta \sigma ^2_H\right) , $$

as \(T\rightarrow \infty \), where

$$\begin{aligned} \sigma ^2_H=\frac{4H-1}{(2H)^2}\left( 1+\frac{\varGamma (3-4H)\varGamma (4H-1)}{\varGamma (2-2H)\varGamma (2H)}\right) . \end{aligned}$$
(21)

To construct the estimator for all \(\theta \in \mathbb {R}\), Moers [59] combined \(\widehat{\theta }_T^{(2)}\) and \(\widehat{\theta }_T^{(3)}\) as follows (assuming \(x_0\in \mathbb {R}\) is arbitrary):

$$\widehat{\theta }_T^{(4)}=\frac{X_T^2-x_0^2}{2\int _0^TX_t^2dt} -\left( \frac{1}{H\varGamma (2H)T}\int _0^TX_t^2dt\right) ^{-\frac{1}{2H}}.$$

Theorem 3.37

([59]) Let \(H\in [\frac{1}{2},1)\). Then the estimator \(\widehat{\theta }_T^{(4)}\) is strongly consistent for all \(\theta \in \mathbb {R}\). As \(T\rightarrow \infty \),

$$\begin{aligned} \sqrt{\left| \theta \right| T}\left( \widehat{\theta }_T^{(4)}-\theta \right) &\xrightarrow {\mathcal L}\mathcal {N}\left( 0,\theta ^2\sigma ^2_H\right) ,\quad \theta <0,\;H\in \left[ \tfrac{1}{2},\tfrac{3}{4}\right) ,\\ T\widehat{\theta }_T^{(4)}&\xrightarrow {\mathcal L}\psi _{H},\quad \theta =0,\\ e^{\theta T}\left( \widehat{\theta }_T^{(4)}-\theta \right) &\xrightarrow {\mathcal L}2\theta \frac{\eta _1}{\eta _2+x_0 b_H},\quad \theta >0, \end{aligned}$$

where \(\sigma ^2_H\) is defined in (21), \(b_H=\frac{\theta ^H}{\sqrt{H\varGamma (2H)}}\),

$$ \psi _{H}=\frac{\left( B^H_1\right) ^2}{2\int _0^1\left( B^H_t\right) ^2dt} -\left( \frac{1}{H\varGamma (2H)}\int _0^1\left( B^H_t\right) ^2dt\right) ^{-\frac{1}{2H}}, $$

and \(\eta _1\) and \(\eta _2\) are independent standard normal random variables.

Bishwal [12] studied for \(H\in (1/2,1)\) and \(\theta <0\) the following minimum contrast estimator

$$\begin{aligned} \widehat{\theta }_T^{(5)}=-\frac{T}{2\int _0^TQ^2(s)\,dw^H_s}, \end{aligned}$$
(22)

and proved the same asymptotic normality as the MLE (see statement 3(i) of Theorem 3.30). The distribution of \(\widehat{\theta }_T^{(5)}\) was computed in [72].

4 Drift Parameter Estimation by Discrete Observations

4.1 General Fractional Model

Consider a stochastic differential equation

$$\begin{aligned} X_t= X_0 + \theta \int _0^t a(X_s)ds+ \int _0^t b(X_s)dB^H_s, \end{aligned}$$
(23)

where \(X_0\) is a nonrandom initial value. In [47] it is shown that this equation has a unique solution under the following assumptions: there exist constants \(K>0\), \(L>0\), \(\delta \in (1/H-1,1]\), and for every \(N>0\) there exists \(R_N>0\) such that

(E\(_{1}\)):

\(\left| a(x)\right| +\left| b(x)\right| \le K\)    for all \(x\in \mathbb {R}\),

(E\(_{2}\)):

\(\left| a(x)-a(y)\right| +\left| b(x)-b(y)\right| \le L\left| x-y\right| \)    for all \(x,y\in \mathbb {R}\),

(E\(_{3}\)):

\(\left| b'(x)-b'(y)\right| \le R_N\left| x-y\right| ^\delta \)    for all \(x\in [-N,N], y\in [-N,N]\).

 

Our main problem is to construct an estimator for \(\theta \) based on discrete observations of X. Specifically, we will assume that for some \(n\ge 1\) we observe the values \(X_{t_k^n}\) at the following uniform partition of \([0,2^n]\): \(t_k^n=k 2^{-n}\), \(k=0,1,\dots ,2^{2n}\).

In order to construct consistent estimators for \(\theta \), we need another technical assumption, in addition to conditions (E\(_{1}\))–(E\(_{3}\)):  

(E\(_{4}\)):

a(x) and b(x) are separated from zero.

 

We now define an estimator, which is a discretized version of a maximum likelihood estimator for F(X), where \(F(x) = \int _0^x b(y)^{-1} dy\):

$$ \widetilde{\theta }_n^{(1)}=\frac{2^n\sum _{k=1}^{2^{2n}}\left( t_k^n\right) ^{-\alpha } \left( 2^n-t_k^n\right) ^{-\alpha } b^{-1}\left( X_{t_{k-1}^n}\right) \left( X_{t_{k}^n}-X_{t_{k-1}^n}\right) }{\sum _{k=1}^{2^{2n}}\left( t_{k}^n\right) ^{-\alpha } \left( 2^n-t_{k}^n\right) ^{-\alpha } b^{-1}\left( X_{t_{k-1}^n}\right) a\left( X_{t_{k-1}^n}\right) }. $$

Theorem 4.1

([55]) Under conditions (E\(_{1}\))–(E\(_{4}\)), \(\widetilde{\theta }_n^{(1)}\) is strongly consistent. Moreover, for any \(\beta \in (1/2,H)\) and \(\gamma >1/2\) there exists a random variable \(\eta =\eta _{\beta ,\gamma }\) with all finite moments such that \(\left| \widetilde{\theta }_n^{(1)}-\theta \right| \le \eta n^{\kappa +\gamma } 2^{-\tau n}\), where \(\kappa =\gamma /\beta \), \(\tau = (1-H)\wedge (2\beta -1)\).

Consider a simpler estimator:

$$\widetilde{\theta }_n^{(2)}=\frac{2^n\sum _{k=1}^{2^{2n}} b^{-1}\left( X_{t_{k-1}^n}\right) \left( X_{t_{k}^n}-X_{t_{k-1}^n}\right) }{\sum _{k=1}^{2^{2n}} b^{-1}\left( X_{t_{k-1}^n}\right) a\left( X_{t_{k-1}^n}\right) }.$$

This is a discretized maximum likelihood estimator for \(\theta \) in Eq. (23), where \(B^H\) is replaced by a Wiener process.
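In code, both \(\widetilde{\theta }_n^{(1)}\) and \(\widetilde{\theta }_n^{(2)}\) are elementary to evaluate; for instance, the simpler one reads as follows on a uniform grid with step \(\varDelta =2^{-n}\), so that \(2^n=1/\varDelta \) (the helper name is ours):

```python
import numpy as np

def theta_tilde_2(X, a, b, delta):
    """Discretized MLE-type estimator: X holds the observations on the
    uniform grid of [0, 2^n] with step delta = 2^{-n}."""
    binv = 1.0 / b(X[:-1])
    return float(np.sum(binv * np.diff(X)) / (delta * np.sum(binv * a(X[:-1]))))
```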

Theorem 4.2

([55]) Theorem 4.1 holds for \(\widetilde{\theta }_n^{(2)}\).

Now let us define a discretized version of \(\widehat{\theta }^{(2)}_T\) defined in (12). Put

$$\widetilde{\theta }^{(3)}_n:=\frac{2^n\sum _{k=1}^{2^{2n}}a\left( X_{t_{k-1}^n}\right) b^{-2}\left( X_{t_{k-1}^n}\right) \left( X_{t_{k}^n}-X_{t_{k-1}^n}\right) }{\sum _{k=1}^{2^{2n}}a^2\left( X_{t_{k-1}^n}\right) b^{-2}\left( X_{t_{k-1}^n}\right) }.$$

Let \(\varphi (t)=\frac{a(X_t)}{b(X_t)}\),

$$\widehat{\varphi }_n(t):=\sum _{k=0}^{2^{2n}-1}\varphi (t_k^n)\mathbbm {1}_{[t_k^n,t_{k+1}^n)}(t).$$

Theorem 4.3

([57]) Under conditions (E\(_{1}\))–(E\(_{4}\)), assume that there exist constants \(\beta >1-H\) and \(p>1\) such that

$$\frac{2^{n(H+\beta )}n^p\int _0^{2^n}\left| \left( D^\beta _{0+}\widehat{\varphi }_n\right) (s)\right| ds}{\sum _{k=1}^{2^{2n}}\varphi ^2(t_{k-1}^n)} \rightarrow 0 \quad \text {a.s. as }n\rightarrow \infty .$$

Then \(\widetilde{\theta }_n^{(3)}\) is strongly consistent.

4.2 Fractional Ornstein–Uhlenbeck Model

In this subsection, we consider discretized versions of the estimators \(\widehat{\theta }_T^{(2)}\) and \(\widehat{\theta }_T^{(3)}\) in the fractional Ornstein–Uhlenbeck model with constant coefficients

$$dX_t = \theta X_tdt + dB^H_t,\,t\ge 0.$$

We start with the case \(\theta >0\). Assume that a trajectory of \(X=X(t)\) is observed at the points \(t_{k,n}=k\varDelta _n\), \(0\le k\le n\), \(n\ge 1\), and \(T_n=n\varDelta _n\) denotes the length of the “observation window”. Let us consider the following two estimators:

$$ \widetilde{\theta }_n^{(4)}=\frac{\sum _{i=1}^{n}X_{t_{i-1}}\left( X_{t_i}-X_{t_{i-1}}\right) }{\varDelta _n\sum _{i=1}^{n}X_{t_{i-1}}^2}, $$
$$ \widetilde{\theta }_n^{(5)}=\frac{X_{t_n}^2}{2\varDelta _n\sum _{i=1}^{n}X_{t_{i-1}}^2}. $$

These estimators are discretized versions of \(\widehat{\theta }_T^{(2)}\), obtained from representations (19) and (20).
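Both are direct transcriptions of the sums above; a sketch (the function name is ours):

```python
import numpy as np

def theta_tilde_4_5(X, delta):
    """Discretized fOU drift estimators from representations (19) and (20);
    X holds the observations X_{t_0}, ..., X_{t_n} with t_k = k * delta."""
    S = delta * np.sum(X[:-1]**2)
    return np.sum(X[:-1] * np.diff(X)) / S, X[-1]**2 / (2 * S)
```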

Theorem 4.4

([24]) Let \(\theta >0\), \(H\in (\frac{1}{2},1)\). Suppose that \(\varDelta _n\rightarrow 0\) and \(n\varDelta _n^{1+\alpha }\rightarrow 0\) as \(n\rightarrow \infty \) for some \(\alpha >0\). Then the estimators \(\widetilde{\theta }_n^{(4)}\) and \(\widetilde{\theta }_n^{(5)}\) are strongly consistent as \(n\rightarrow \infty \).

An estimator similar to \(\widetilde{\theta }_n^{(4)}\) was considered in [42]. Let \(n\ge 1\), \(t_{k,n}=\frac{k}{n}\), \(0\le k\le n^m\), where \(m\in \mathbb {N}\) is some fixed integer. Suppose that we observe X at the points \(\{t_{k,n},n\ge 1,0\le k\le n^m\}\). Consider the estimator

$$ \widetilde{\theta }_n^{(6)}(m)=\frac{\sum _{k=0}^{n^m-1}X_{k,n}\varDelta X_{k,n}}{\frac{1}{n}\sum _{k=0}^{n^m-1}X_{k,n}^2}, $$

where \(X_{k,n}=X_{t_{k,n}}\), \(\varDelta X_{k,n}=X_{k+1,n}-X_{k,n}\).

Theorem 4.5

([42]) Let \(\theta >0\), \(H\in (0,1)\). Then for any \(m>1\) the estimator \(\widetilde{\theta }_n^{(6)}(m)\) is strongly consistent.

Now let \(\theta <0\). In [30, 75] the following discretized version of the estimator \(\widehat{\theta }_T^{(3)}\) was considered:

$$ \widetilde{\theta }_n^{(7)}=-\left( \frac{1}{nH\varGamma (2H)}\sum _{k=1}^nX_{k\varDelta }^2\right) ^{-\frac{1}{2H}}, $$

where the process X was observed at the points \(\varDelta ,2\varDelta ,\dots ,n\varDelta \) for some fixed \(\varDelta >0\).
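
The following sketch (again under our own assumptions: the path is Euler-simulated on a finer internal grid and then subsampled at the spacing \(\varDelta \); all numerical values are illustrative) evaluates \(\widetilde{\theta }_n^{(7)}\):

```python
import numpy as np
from scipy.special import gamma

def fbm(N, H, dt, rng):
    # Cholesky simulation of fBm at the points k*dt, k = 0,...,N
    t = dt * np.arange(1, N + 1)
    cov = 0.5 * (t[:, None]**(2*H) + t[None, :]**(2*H)
                 - np.abs(t[:, None] - t[None, :])**(2*H))
    return np.concatenate(([0.0], np.linalg.cholesky(cov) @ rng.standard_normal(N)))

rng = np.random.default_rng(2)
theta, H = -1.0, 0.6                      # ergodic case theta < 0
n, Delta, sub = 300, 0.5, 5               # observations at k*Delta; internal step Delta/sub
N, dt = n * sub, Delta / sub
B = fbm(N, H, dt, rng)
Y = np.zeros(N + 1)
for k in range(N):                        # Euler scheme for dX = theta*X dt + dB^H
    Y[k + 1] = Y[k] + theta * Y[k] * dt + B[k + 1] - B[k]
X = Y[::sub]                              # X_0, X_Delta, ..., X_{n*Delta}

theta_7 = -(np.mean(X[1:]**2) / (H * gamma(2 * H)))**(-1 / (2 * H))
print(theta_7)                            # approaches theta as n grows
```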

Theorem 4.6

([30]) Let \(\theta <0\), \(H\in [\frac{1}{2},1)\). Then the estimator \(\widetilde{\theta }_n^{(7)}\) is strongly consistent as \(n\rightarrow \infty \). If \(H\in [\frac{1}{2},\frac{3}{4})\), then

$$ \sqrt{n}\left( \widetilde{\theta }_n^{(7)}-\theta \right) \xrightarrow {\mathcal L}\mathcal {N}\left( 0,\frac{\theta ^2}{2H^2}\right) , $$

as \(n\rightarrow \infty \).

Remark 4.7

The discretization of the MLE was considered in [74]. Discrete approximations to the minimum contrast estimator (22) were studied in [12].

Remark 4.8

For the case \(\theta <0\), a drift parameter estimator based on polynomial variations was proposed in [25].

Remark 4.9

In [13, 79], a more general situation was studied, where the equation has the form \(dX_t=\theta X_t dt+\sigma dB_t^H\), \(t>0\), and \(\vartheta =(\theta ,\sigma ,H)\) is the unknown parameter, \(\theta <0\). Consistent and asymptotically Gaussian estimators of the parameter \(\theta \) were proposed using discrete observations of the sample path \((X_{k\varDelta _n}, k=0,\dots ,n)\) for \(H\in (\frac{1}{2},\frac{3}{4})\), where \(n\varDelta _n^p\rightarrow \infty \), \(p>1\), and \(\varDelta _n\rightarrow 0\) as \(n\rightarrow \infty \). In [79], a strongly consistent estimator was constructed for the scheme where \(H> \frac{1}{2}\), the time interval [0, T] is fixed, and the process is observed at the points \(h_n,2h_n,\dots ,nh_n\), where \(h_n=\frac{T}{n}\).

5 Drift Parameter Estimation in Mixed Models

5.1 General Mixed Model

Let us take a Wiener process \(W=\{W_t, t\in \mathbb {R}^+\}\) on a probability space \((\varOmega , \mathcal {F},\overline{\mathcal {F}}, P)\), possibly correlated with \(B^H\). Assume that \(H>\frac{1}{2}\) and consider a one-dimensional mixed stochastic differential equation involving both the Wiener process and the fractional Brownian motion

$$\begin{aligned} X_t =x_0 +\theta \int _0^t a(s,X_{s})ds+ \int _0^tb(s,X_{s})dB_s^H+\int _0^tc(s,X_{s})dW_s,\ t\in \mathbb {R}^+,\end{aligned}$$
(24)

where \(x_0\in \mathbb {R}\) is the initial value, \(\theta \) is the unknown parameter to be estimated, the first integral on the right-hand side of (24) is the Lebesgue–Stieltjes integral, the second integral is the generalized Lebesgue–Stieltjes integral introduced in Definition 2.4, and the third one is the Itô integral. From now on, we shall assume that the coefficients of Eq. (24) satisfy the assumptions (B\(_{1}\))–(B\(_{6}\)) on any interval [0, T]. It was proved in [58] that under these assumptions there exists a solution \(X=\{X_t, \mathcal {F}_t, t\in [0,T]\}\) to Eq. (24) on any interval [0, T] which satisfies (9) for any \(\alpha \in (1-H,\kappa )\), where \(\kappa =\frac{1}{2}\wedge \beta \). This solution is unique in the class of processes satisfying (9) for some \(\alpha >1-H\).

Remark 5.1

In the case where the components W and \(B^H\) are independent, the assumptions on the coefficients can be relaxed, as was shown in [27]. More specifically, the coefficient c may have linear growth, and \(\partial _x b\) may be Hölder continuous of some order less than 1.

If we consider the general equation (24) with nonzero c, then it is impossible to construct a reasonable MLE of the parameter \(\theta \). Therefore, we construct an estimator of the same type as in (12). More precisely, suppose that the following assumption holds:

(F\(_{1}\)):

\(c(t,X_t)\ne 0\), \(t\in [0,T]\); \(\frac{a(t,X_t)}{c(t,X_t)}\) is a.s. Lebesgue integrable on [0, T] for any \(T>0\); and the generalized Lebesgue–Stieltjes integral \(\int _0^T \frac{b(t,X_t)}{c(t,X_t)}dB_t^H\) exists.

 

Define the functions \(\psi _1(t, x)=\frac{a(t,x)}{c(t,x)}\) and \(\psi _2(t, x)=\frac{b(t,x)}{c(t,x)}\), the processes \(\varphi _i(t)=\psi _i(t, X_t)\), \(i=1,2\), and the process

$$Y_t=\int _0^tc^{-1}(s, X_s)dX_s=\theta \int _0^t\varphi _1(s)ds+\int _0^t\varphi _2(s)dB_s^H+W_t.$$

Evidently, Y is a functional of X and is observable. Assume additionally that the generalized Lebesgue–Stieltjes integral \(\int _0^T \varphi _1(t)\varphi _2(t)dB_t^H\) exists and  

(F\(_{2}\)):

for any \(T>0\), \(\mathbf {E}\int _0^T\varphi _1^2(s)ds<\infty \).

 

Denote \(\vartheta (s)=\varphi _1(s)\varphi _2(s)\). We can consider the following estimator of the parameter \(\theta \):

$$\begin{aligned} \widehat{\theta }_T=\frac{\int _0^T\varphi _1(s)dY_s}{\int _0^T\varphi _1^2(s)ds} =\theta +\frac{\int _0^T\vartheta (s)dB^H_s}{\int _0^T\varphi _1^2(s)ds}+\frac{\int _0^T\varphi _1(s)dW_s}{\int _0^T\varphi _1^2(s)ds}.\end{aligned}$$
(25)

The estimator \(\widehat{\theta }_T\) preserves the traditional form of the MLE for diffusion models; the right-hand side of (25) provides its stochastic representation.
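
In practice, the integrals in (25) have to be replaced by sums over a discrete sample. The sketch below is our own construction, not taken from [38]: it simulates Eq. (24) by an Euler scheme with independent W and \(B^H\) (Cholesky-simulated) and illustrative coefficients, and then forms the Riemann-sum analogue of \(\widehat{\theta }_T\).

```python
import numpy as np

def fbm(N, H, dt, rng):
    # Cholesky simulation of fBm at the points k*dt, k = 0,...,N
    t = dt * np.arange(1, N + 1)
    cov = 0.5 * (t[:, None]**(2*H) + t[None, :]**(2*H)
                 - np.abs(t[:, None] - t[None, :])**(2*H))
    return np.concatenate(([0.0], np.linalg.cholesky(cov) @ rng.standard_normal(N)))

rng = np.random.default_rng(3)
H, theta, x0 = 0.7, 0.8, 1.0
N, dt = 1000, 0.02                        # T = 20
a = lambda t, x: -x                       # illustrative coefficients
b = lambda t, x: 0.5
c = lambda t, x: 1.0 + 0.1 * np.sin(x)    # nonvanishing, cf. assumption (F1)
B = fbm(N, H, dt, rng)
W = np.concatenate(([0.0], np.cumsum(np.sqrt(dt) * rng.standard_normal(N))))
s = dt * np.arange(N)
X = np.empty(N + 1)
X[0] = x0
for k in range(N):                        # Euler scheme for Eq. (24)
    X[k + 1] = (X[k] + theta * a(s[k], X[k]) * dt
                + b(s[k], X[k]) * (B[k + 1] - B[k])
                + c(s[k], X[k]) * (W[k + 1] - W[k]))

cc = c(s, X[:-1])
dY = np.diff(X) / cc                      # increments of Y_t = int_0^t c^{-1} dX
phi1 = a(s, X[:-1]) / cc
theta_hat = np.sum(phi1 * dY) / (np.sum(phi1**2) * dt)
print(theta_hat)
```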

Theorem 5.2

([38]) Let assumptions (F\(_{1}\)) and (F\(_{2}\)) hold, and, in addition,  

(F\(_{3}\)):

\(\int _0^T\varphi _1^2(s)ds\rightarrow \infty \) a.s. as \(T\rightarrow \infty \).

(F\(_{4}\)):

There exist \(\alpha >1-H\) and \(p>1\) such that

$$\begin{aligned}\frac{T^{H+\alpha -1}(\log T)^p\int _0^T|(\mathcal {D}_{0+}^{\alpha }\vartheta )(s)|ds}{\int _0^T \varphi ^2_{1}(s)ds}\rightarrow 0\quad \text {a.s. as}\quad T\rightarrow \infty . \end{aligned}$$

 

Then the estimator \(\widehat{\theta }_T\) is strongly consistent as \(T\rightarrow \infty \).

Similarly to Theorem 3.8, the conditions of Theorem 5.2 can be simplified in the case where the function \(\vartheta \) is nonrandom.

Theorem 5.3

([38]) Let assumptions (F\(_{1}\)) and (F\(_{2}\)) hold. If the functions \(\varphi _1\) and \(\varphi _2\) are nonrandom, \(\varphi _1\) satisfies condition (D\(_{5}\)), and \(\varphi _2\) is bounded, then the estimator \(\widehat{\theta }_T\) is strongly consistent as \(T\rightarrow \infty \).

The sequential version of the estimator \(\widehat{\theta }_T\) has the form

$$\widehat{\theta }_{\upsilon _1(h)}=\theta +\frac{\int _0^{\upsilon _1(h)}\vartheta (s)dB^H_s}{h}+\frac{\int _0^{\upsilon _1(h)}\varphi _1(s)dW_s}{h}, $$

where

$$\upsilon _1(h)=\inf \left\{ t>0: \int _0^t\varphi ^2_1(s)ds=h\right\} .$$
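
On a discrete grid, the stopping rule can be approximated by the first grid point at which the Riemann sum of \(\varphi _1^2\) reaches the level h; a minimal sketch (the helper below is hypothetical, not from [38]):

```python
import numpy as np

def upsilon_1(phi1, dt, h):
    """First grid time t with int_0^t phi_1^2(s) ds >= h (hypothetical helper).

    phi1 -- values of phi_1 at the grid points k*dt, k = 0, 1, ...
    Returns np.inf if the level h is not reached on the observed grid.
    """
    cum = np.cumsum(phi1**2) * dt         # Riemann sum of the integral
    idx = np.searchsorted(cum, h)
    return (idx + 1) * dt if idx < len(cum) else np.inf
```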

Theorem 5.4

([38])

  1. (a)

    Let the function \(\varphi _1\) be separated from zero, i.e., \(|\varphi _1(s)|\ge c>0\) a.s., and let the function \(\vartheta \) satisfy the following assumption: for some \(1-H<\alpha <1\) and \(p>0\)

    $$\begin{aligned} \frac{\int _0^{\upsilon _1(h)}|(\mathcal {D}_{0+}^{\alpha }\vartheta )(s) |ds}{(\upsilon _1(h))^{2-\alpha -H-p}}\rightarrow 0\quad \text {a.s.}\end{aligned}$$
    (26)

    as \(h\rightarrow \infty \). Then the estimator \(\widehat{\theta }_{\upsilon _1(h)}\) is strongly consistent.

  2. (b)

    Let the function \(\vartheta \) be nonrandom, bounded, and positive, and let \(\varphi _1\) be separated from zero. Then the estimator \(\widehat{\theta }_{\upsilon _1(h)}\) is consistent in the following sense: for any \(p>0\), \(\mathbf {E}\Big |\theta -\widehat{\theta }_{\upsilon _1(h)}\Big |^p\rightarrow 0\) as \(h\rightarrow \infty \).

Remark 5.5

Assumption (26) holds, for example, for any bounded Lipschitz function \(\vartheta \).

5.2 Linear Model

Consider a mixed linear model of the form

$$\begin{aligned} dX_t=X_t\left( \theta a(t)dt+b(t)dB_t^H+c(t)dW_t\right) , \end{aligned}$$
(27)

where a, b, and c are nonrandom measurable functions, which we assume to be locally bounded. In this case, the solution X to Eq. (27) exists, is unique, and can be presented in the explicit form

$$X_t=x_0\exp \left\{ \theta \int _0^t a(s) ds+\int _0^tb(s) dB_s^H+\int _0^tc(s) dW_s-\tfrac{1}{2}\int _0^tc^2(s)ds\right\} .$$

Assume that \(c(s)\ne 0\). Then \(\varphi _1(t)=\frac{a(t)}{c(t)}\) and \(\varphi _2(t)=\frac{b(t)}{c(t)}\), and the estimator \(\widehat{\theta }_T\) takes the form

$$ \widehat{\theta }_T=\frac{\int _0^T\varphi _1(s)dY_s}{\int _0^T\varphi _1^2(s)ds} =\theta +\frac{\int _0^T\varphi _1(s)\varphi _2(s)dB^H_s}{\int _0^T\varphi _1^2(s)ds}+\frac{\int _0^T\varphi _1(s)dW_s}{\int _0^T\varphi _1^2(s)ds}. $$

In accordance with Theorem 5.3, assume that the function \(\varphi _1\) satisfies (D\(_{5}\)) and \(\varphi _2\) is bounded. Then the estimator \(\widehat{\theta }_T\) is strongly consistent. Evidently, these assumptions hold for constant coefficients.
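
For constant coefficients the whole procedure is transparent; in the sketch below (our illustration: X is generated from the explicit exponential formula, the integrals are replaced by Riemann sums, and all numerical values are illustrative) \(\varphi _1=a/c\) is constant:

```python
import numpy as np

def fbm(N, H, dt, rng):
    # Cholesky simulation of fBm at the points k*dt, k = 0,...,N
    t = dt * np.arange(1, N + 1)
    cov = 0.5 * (t[:, None]**(2*H) + t[None, :]**(2*H)
                 - np.abs(t[:, None] - t[None, :])**(2*H))
    return np.concatenate(([0.0], np.linalg.cholesky(cov) @ rng.standard_normal(N)))

rng = np.random.default_rng(4)
H, theta, x0 = 0.7, 0.5, 1.0
a, b, c = 1.0, 0.3, 1.0                   # constant coefficients of Eq. (27)
N, T = 1000, 50.0
dt = T / N
t = dt * np.arange(N + 1)
BH = fbm(N, H, dt, rng)
W = np.concatenate(([0.0], np.cumsum(np.sqrt(dt) * rng.standard_normal(N))))
# explicit solution of Eq. (27) with constant coefficients
X = x0 * np.exp(theta * a * t + b * BH + c * W - 0.5 * c**2 * t)

phi1 = a / c                              # phi_1 is constant here
dY = np.diff(X) / (c * X[:-1])            # increments of Y_t = int_0^t (c X_s)^{-1} dX_s
theta_hat = phi1 * np.sum(dY) / (phi1**2 * T)
print(theta_hat)                          # strongly consistent as T grows
```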

5.3 Mixed Fractional Ornstein–Uhlenbeck Model

Chigansky and Kleptsyna [17] considered maximum likelihood estimation in the mixed fractional Ornstein–Uhlenbeck model

$$ X_t=X_0+\theta \int _0^tX_s\,ds+V_t $$

with \(V = B + B^H\), where B and \(B^H\), \(H\in (0,1)\setminus \left\{ \frac{1}{2}\right\} \), are independent standard and fractional Brownian motions. Let g(s, t) be the solution of the Wiener–Hopf type integro-differential equation:

$$ g(s,t)+\frac{d}{ds}\int _0^tg(r,t)H\left| s-r\right| ^{2H-1}{{\mathrm{sign}}}(s-r)\,dr=1, \quad 0<s\ne t\le T. $$

Then the process \(M_t=\int _0^tg(s,t)\,dV_s\), \(t\in [0,T]\), is a Gaussian martingale with quadratic variation \(\langle M\rangle _t=\int _0^tg(s,t)\,ds\), \(t\in [0,T]\). The MLE of \(\theta \) is given by

$$ \widehat{\theta }_T=\frac{\int _0^TQ_t(X)\,dZ_t}{\int _0^TQ_t(X)^2\,d\langle M\rangle _t}, $$

where \(Q_t(X)=\frac{d}{d\langle M\rangle _t}\int _0^tg(s,t)X_s\,ds\), and \(Z_t=\int _0^tg(s,t)\,dX_s\).

Theorem 5.6

([17]) For \(\theta <0\) the estimator \(\widehat{\theta }_T\) is asymptotically normal:

$$ \sqrt{T}\left( \widehat{\theta }_T-\theta \right) \xrightarrow {\mathcal L}\mathcal {N}(0,-2\theta ), \quad \text {as }T\rightarrow \infty . $$

Large deviation properties of this estimator were investigated in [49].