1 Introduction

Fractional Brownian motion (fBm) is a widely used generalization of standard Brownian motion that incorporates long-range dependence while preserving self-similarity and Gaussianity. These features make it a popular stochastic process in the mathematical modeling of various complex systems, from financial applications to surface growth (e.g., [1, 2]). The need to detect and analyze such systems has led to the development of many, now classical, statistical instruments, including wavelet analysis, R/S estimators, generalized variations, etc., and we refer to the monographs [3, 10, 20] for further details.

Fractional Brownian motion is homogeneous in space. Many physical systems, however, are subject to an external force and are confined to certain locations with high probability. It may therefore be appropriate to consider stochastic differential equations (SDE) as a generalization of the fBm model. A fractional Brownian motion then takes the role of a random forcing which heavily influences the underlying dynamics. Thus, the identification of this random forcing is of particular interest.

In our work, we will consider an SDE with additive fractional Brownian noise, i.e.,

$$\begin{aligned} X_{t}= x+\int _{0} ^{t} f(s, X_{s}) \hbox {d}s+ B ^{H} _{t}, \quad t\in T, \end{aligned}$$
(1)

where T is an interval in \([0, \infty )\), \(x\in \mathbb {R}\) is the initial state, and f is a measurable deterministic function satisfying suitable assumptions. Stochastic models such as (1) appear in mathematical finance. (For example, in [14] the log-volatility is assumed to satisfy an SDE of the form (1).) Other potential applications come from climatology, where stochastic processes as in (1) (with Gaussian or non-Gaussian noise) are utilized as models for problems related to the earth’s energy balance, see [9, 11].

In fBm-related models particular interest lies in the estimation of the Hurst index (see Sect. 2 below) since it constitutes the characteristic parameter of the driving forcing.

The purpose of our endeavor is to estimate the Hurst index H in (1) based on the discrete observations of the process \((X_{t}) _{t\in T}\). Here we will deploy a well-known method, based on the quadratic variations of the observed process X. This method is known to work well for self-similar processes (see, e.g., [20] and references therein).

We will show that it can also be applied to (1), although X is not a self-similar process. The method exploits the fact that the absolutely continuous integral component of (1) does not affect the roughness captured by the quadratic variation of X.

We will define the sequence \(V (a, n, \Delta , B ^{H})\) (see Eq. (7) below) of the so-called quadratic a-variations, defined in terms of higher-order increments of the process over the filter a. We determine the limit behavior in distribution of this sequence, making use of Malliavin calculus to handle the correlations, and combine it with already known results concerning the variations of the fBm. Interestingly, in this setting the case \(H<\frac{1}{2}\) is easier to handle than the case \(H>\frac{1}{2}\). This is due to the fact that, when studying the limit of the sequence (7), one needs to take into account not only the correlations between the increments of the fBm, but also the correlations between the increments of the fBm and the increments of the Lebesgue integral in (1). If \(H<\frac{1}{2}\) these joint correlations are always dominated by those of the fBm, which is not the case for \(H>\frac{1}{2}\), when a supplementary assumption is needed on the mesh \(\Delta \) in (7).

By a standard procedure, we then construct a quadratic variation estimator for the Hurst index of model (1). We prove its consistency and its asymptotic normality, using the limit behavior of the quadratic a-variations, and we also analyze the estimator numerically.

Our work is structured as follows: In Sect. 2 we introduce the objects of study, the fBm, filters a and the corresponding quadratic a-variations of a process. We also quote the underlying results on the behavior of the a-variations of the fBm. Section 3 studies the properties and the a-variations of the solution to (1), via the techniques of Malliavin calculus. Section 4 is devoted to the parameter estimation of the Hurst index from discrete observations of X. In Sect. 5 we present simulations and discuss statistical properties of the derived estimators. In the Appendix we give a short review of the definitions and results from Malliavin calculus relevant to this work.

2 Preliminaries: Fractional Brownian Motion and Its Variations

A fractional Brownian motion \((B^H_t)_{t\in T}\) with Hurst index \(H\in (0,\,1)\) is a centered Gaussian process on the interval T (in this work we consider \(T=[0,\,1]\) or \(T={\mathbb {R}}^+\)) with covariance function

$$\begin{aligned} {\mathbb {E}}[B^H_t B^H_s]=\frac{1}{2}(t^{2H}+s^{2H}-|t-s|^{2H}) \quad \text {for}\ s,\,t\in T. \end{aligned}$$
(2)

From this definition it is apparent that for \(H=\tfrac{1}{2}\) we have that \({\mathbb {E}}[B^H_t B^H_s]=\min (s,t)\) and the fBm with \(H=\tfrac{1}{2}\) is the standard Brownian motion.
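
To make the covariance structure (2) concrete, the following short NumPy sketch (an illustration of ours, not part of the original analysis; the helper names fbm_covariance and fbm_sample are hypothetical) builds the covariance matrix of \(B^H\) on a uniform grid and draws a sample path via a Cholesky factorization.

```python
import numpy as np

def fbm_covariance(t, H):
    """Covariance matrix E[B^H_t B^H_s] from (2) on a grid of times t."""
    s, u = t[:, None], t[None, :]
    return 0.5 * (s**(2 * H) + u**(2 * H) - np.abs(s - u)**(2 * H))

def fbm_sample(n, H, T=1.0, seed=None):
    """One fBm path on the grid k*T/n, k = 1,...,n, via a Cholesky factorization."""
    rng = np.random.default_rng(seed)
    t = np.linspace(T / n, T, n)                      # t = 0 is omitted since B^H_0 = 0
    L = np.linalg.cholesky(fbm_covariance(t, H) + 1e-12 * np.eye(n))  # small jitter for stability
    return t, L @ rng.standard_normal(n)

t, path = fbm_sample(1000, H=0.7, seed=1)
```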

For a natural number \(p\in {\mathbb {N}}\) let us consider a \((p+1)\)-tuple \(a=(a_0,\dots , a_p)\) of real numbers with zero sum, i.e.,

$$\begin{aligned} \sum _{q=0} ^{p} a_{q}=0. \end{aligned}$$

Such a \((p+1)\)-tuple \(a=(a_0,\dots , a_p)\) will be called a filter of length \(p+1\). The order of a filter a, denoted by M(a), is defined to be the order of the first nonzero moment of the \((p+1)\)-tuple a, i.e.,

$$\begin{aligned} \sum _{i=0}^{p}a_i i^k=0 \text { for }0\le k < M(a)\quad \text { and }\quad \sum _{i=0}^{p}a_i\, i^{M(a)}\ne 0. \end{aligned}$$
(3)

We will frequently need the partial sum of the components of a and use the notation

$$\begin{aligned} b_{i}=\sum _{k=0} ^{i} a_{k} \quad \text{ for }\quad i=0,1,\dots ,p. \end{aligned}$$
(4)

Since by definition a filter has zero sum, clearly for any filter a we have \(M(a)\ge 1\). For instance, \(a=(a_{0}, a_{1}) = (-1, 1) \) is a filter of order 1 and of length 2, while \(a=(a_{0}, a_{1}, a_{2})=(1,-2, 1) \) is a filter of order 2 and of length 3.
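
As a small illustration of these notions (our own sketch; the helper names are hypothetical), the following NumPy snippet computes the order M(a) from (3) and the partial sums \(b_i\) from (4), reproducing the two examples above.

```python
import numpy as np

def filter_order(a):
    """Order M(a): the smallest k >= 1 with sum_i a_i * i**k != 0, cf. (3)."""
    a = np.asarray(a, dtype=float)
    i = np.arange(len(a))
    k = 1                                   # k = 0 is excluded since a has zero sum
    while np.isclose(np.sum(a * i**k), 0.0):
        k += 1
    return k

def partial_sums(a):
    """Partial sums b_i = a_0 + ... + a_i as in (4)."""
    return np.cumsum(np.asarray(a, dtype=float))

print(filter_order([-1, 1]), filter_order([1, -2, 1]))   # 1 2
print(partial_sums([1, -2, 1]))                          # [ 1. -1.  0.]
```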

For a process \((X_t)_{t\in T}\), a filter a of length \(p+1\) and a fixed mesh size \(\Delta >0\) we consider the sum

$$\begin{aligned} \Delta _a X_j := \sum _{i=0}^p a_i X_{(i+j)\Delta } \end{aligned}$$

with \(j\in {\mathbb {N}}\) such that \((p+j)\Delta \in T\). For example, if \(a=(a_{0}, a_{1}) = (-1, 1) \), then \(\Delta _a X_j =X_{(j+1)\Delta }- X_{j\Delta }\), while if \(a=(a_{0}, a_{1}, a_{2})=(1,-2, 1) \) then \(\Delta _a X_j =X_{j\Delta } -2X_{(j+1) \Delta } + X_{(j+2)\Delta }\).

We will refer to \(\Delta _a X_j\) as the increment of the process X at time j over the filter a. Note that due to the zero-sum condition on a we can rewrite \(\Delta _a X_j\) in the following way:

$$\begin{aligned} \begin{aligned} \Delta _a X_j&= \sum _{i=0}^p a_i X_{(i+j)\Delta } \\&=a_0 (X_{j\Delta }-X_{(j+1)\Delta })+(a_0+a_1)X_{(j+1)\Delta }+a_2 X_{(j+2)\Delta }+\dots \\&\quad + a_p X_{(j+p)\Delta } =\cdots \\&= a_0 (X_{j\Delta }-X_{(j+1)\Delta })+\dots +\sum _{i=0}^{p-1} a_i(X_{(j+p-1)\Delta }-X_{(j+p)\Delta }) \\&\quad +\underbrace{\sum _{i=0}^p a_i X_{(j+p)\Delta }}_{=0}\\&=\sum _{i=0}^{p-1} \sum _{k=0}^i a_k (X_{(i+j)\Delta }-X_{(i+j+1)\Delta })\\&=\sum _{i=0}^{p-1} b_i (X_{(i+j)\Delta }-X_{(i+j+1)\Delta }). \end{aligned} \end{aligned}$$
(5)

We will refer to this form as differences representation. The correlation of increments of the fBm over a filter plays an important role in our calculations. From (2) we can see that for a zero-sum vector \(a\in {\mathbb {R}}^{(p+1)}\) and \(i,\,j\in \mathbb {N}\) such that \(\Delta _a B^H_i\), \(\Delta _a B^H_j\) are well defined one has

$$\begin{aligned} {\text {Cov}}(\Delta _a B^H_i,\,\Delta _a B^H_j)=-\frac{\Delta ^{2H}}{2}\sum _{k=0}^p \sum _{l=0}^p a_k a_l |i+k-j-l|^{2H}. \end{aligned}$$
(6)

For a filter \(a\in {\mathbb {R}}^{p+1}\) and a fractional Brownian motion \((B^H_t)_{t\in T}\) we define its (normalized) quadratic a-variation in the following way:

$$\begin{aligned} V(a,\, n, \Delta , \,B^H)= \frac{1}{n}\sum _{j=1}^{n-p} \left( \frac{\left( \Delta _a B^H_j\right) ^2}{\sigma _{a,\, \Delta }}-1\right) , \end{aligned}$$
(7)

where \(n\in {\mathbb {N}}\) with \(n-p>0\) is the number of discrete observations on a grid of mesh size \(\Delta >0\) (that may depend on n) and

$$\begin{aligned} \sigma _{a,\, \Delta }={\text {Var}}\left( \Delta _a B^H_j\right) =-\frac{\Delta ^{2H}}{2}\sum _{k=0}^p \sum _{l=0}^p a_k a_l \vert k-l\vert ^{2H}. \end{aligned}$$
(8)

In what follows, we will assume that \(\Delta =n^{-\alpha }\) for some \(\alpha >0\) whenever the quadratic a-variation is considered. Observe that the choice \(a=(-1,\,1)\) and \(\alpha =1\) would yield the formula for the usual normalized quadratic variation of the fBm (see, e.g., [20]).
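
For illustration, a minimal NumPy sketch of (7) and (8) (ours; assuming the observations \(B^H_{j\Delta }\), \(j=0,\dots ,n\), are stored in an array path) could read as follows.

```python
import numpy as np

def sigma_a_delta(a, delta, H):
    """Variance (8) of an increment of B^H over the filter a."""
    a = np.asarray(a, dtype=float)
    k = np.arange(len(a))
    lags = np.abs(k[:, None] - k[None, :]).astype(float)
    return -0.5 * delta**(2 * H) * np.sum(a[:, None] * a[None, :] * lags**(2 * H))

def quadratic_a_variation(a, path, delta, H):
    """Normalized quadratic a-variation (7); path[j] = B^H_{j*delta}, j = 0,...,n."""
    a = np.asarray(a, dtype=float)
    p, n = len(a) - 1, len(path) - 1
    sig = sigma_a_delta(a, delta, H)
    inc = np.array([a @ path[j:j + p + 1] for j in range(1, n - p + 1)])
    return np.sum(inc**2 / sig - 1.0) / n
```

The same formula, applied to discrete observations of the process X instead of \(B^H\), gives the quantity \(V(a,\,n,\,\Delta ,\,X)\) used in Sect. 3.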

Let us recall the main result concerning the asymptotic behavior of the quadratic a-variation of fBm (see [7] and [13]). It states that the sequence (7) converges to zero almost surely for any filter a and that, if \(M(a)> H +\frac{1}{4}\), it satisfies a central limit theorem.

Theorem 1

Let \((B^H_t)_{t\in T}\) be a fractional Brownian motion with Hurst index \(H\in (0, 1)\), and let a be a filter of order M(a) and of length \(p+1\) with \(p\ge 1\). Then

  (a) \( V(a,\, n,\, \Delta ,\, B^H) {\mathop {\rightarrow }\limits ^{\text {a.s.}}} 0 \) as \(n\rightarrow \infty .\)

  (b) If \(M(a)> H+\frac{1}{4} \), then

    $$\begin{aligned} \sqrt{n} V(a,\, n,\, \Delta ,\, B^H){\mathop {\rightarrow }\limits ^{(d)}}N (0,\,\sigma _H), \end{aligned}$$

    where \(\sigma _{H}>0\) is an explicit constant.

In the construction of the estimators in Sect. 4 we need a multidimensional version of this statement which can also be found in [13] and [7]:

Theorem 2

Let \((B^H_t)_{t\in T}\) be a fractional Brownian motion with Hurst index \(H\in (0, 1)\). For \(i=1,\dots , N\) let \(a^i\) be a filter of order \(M(a^i)> H+\frac{1}{4} \). Then

$$\begin{aligned} \sqrt{n} (V(a^1,\, n,\, \Delta ,\, B^H),\dots , V(a^N,\, n,\, \Delta ,\, B^H) ) {\mathop {\rightarrow }\limits ^{(d)}}N (0,\,\Sigma _H), \end{aligned}$$

where \(\Sigma _{H}\) is a positive definite matrix.

Notice that, in the case of a filter of order 1, we have the well-known restriction \(H<\frac{3}{4}\) for the CLT of the sequence \(V(a,\, n,\, \Delta ,\, B^H)\) to hold. This restriction can be lifted by taking a filter of order greater than one, i.e., by considering higher-order increments of the fBm.

3 Stochastic Differential Equations Driven by the Fractional Brownian Motion

This section introduces the main object of our study: stochastic differential equations driven by the fractional Brownian motion. We will then study how the central limit theorems of the previous section carry over to the a-variation of the solution to the SDE.

3.1 Stochastic Differential Equations with fBm

For \(t\in T\subseteq \mathbb {R}^+\) we consider the SDE

$$\begin{aligned} X_t = x + \int _0^t f(s, X_s) \hbox {d}s + B^H_t, \end{aligned}$$
(9)

where \((B^H_t)_{t\in T}\) is a fractional Brownian motion with Hurst index \(H\in (0,1)\) and x is a real number. We assume that \(f\in C^{0,\, 1}_b (T\times \mathbb {R})\) (i.e., f is continuous in the time variable, continuously differentiable in the space variable, and bounded together with its partial derivative with respect to the space variable) with

$$\begin{aligned} ||f ||_{\infty }+||f' ||_{\infty }\le M, \end{aligned}$$
(10)

for a fixed constant \(M> 0\). We denote by \(f' \) the derivative of f with respect to its second variable, while \(\Vert \cdot \Vert _{\infty } \) stands for the infinity norm.

In the course of the paper we will be interested in two cases:

  (S1): the SDE (9) with \(T=[0,\,1]\),

  (S2): the SDE (9) with \(T=\mathbb {R}^+\) for \(H\le \frac{1}{2}\).

The existence of pathwise solutions to (9) in both cases can be seen by considering the process \(X-B^H\) and thus reducing (9) to an ordinary differential equation. We refer to [18, 4] or [16], among others, for the existence and uniqueness of the solution under assumption (10).

Clearly, solutions to (S2) also solve (S1), if restricted to the unit interval. Therefore, statements made for (S2) also apply to (S1). We make this distinction mainly for notational reasons: In (S1) we consider only \(\alpha \ge 1\) (where \(\Delta = n^{-\alpha }\)), so that we do not leave the unit interval as the number of observations n grows. In (S2), \(\alpha \) is allowed to be less than one, and the observation window \(n\Delta \) grows with n.

We further denote by Y the absolutely continuous component of Eq. (9), i.e.,

$$\begin{aligned} Y_t=\int _0^t f(s, X_s)\hbox {d}s, \text{ for } \text{ every } t\in T. \end{aligned}$$
(11)

If X is the solution to SDE (9) (in either the setting (S1) or (S2)) and \(a\in \mathbb {R}^{p+1}\) is a filter of length \(p+1\) we define the quadratic a-variation of X to be

$$\begin{aligned} V(a,\, n, \Delta , \,X)= \frac{1}{n}\sum _{j=1}^{n-p} \left( \frac{(\Delta _a X_j)^2}{\sigma _{a,\, \Delta }}-1\right) , \end{aligned}$$

where the number of observations \(n\in \mathbb {N}\) satisfies \(n-p>0\) and the mesh size is chosen to satisfy \(\Delta =n^{-\alpha }\) for some \(\alpha >0\). Here \( \sigma _{a,\, \Delta }=Var (\Delta _a B^H_j)\) is again the variance of the increment of \(B^H\) along the filter a.

3.2 Central Limit Theorems for the Quadratic a-Variation of SDE

This section comprises the core results of this work, combining the limit results of the fBm from Sect. 2 with estimates for the solution of SDE obtained via Malliavin calculus. We obtain central limit theorems for SDE (9) providing the basis for the estimation of H to be discussed in Sect. 4. The results are presented in Theorems 3 and 4 where the cases \(H<\frac{1}{2}\) and \(H\in [\frac{1}{2},\,1)\) are treated separately. The proof of Theorem 3 consists of direct estimates and does not require any assumptions on the order M(a) of the filter a.

If \(H\ge \frac{1}{2}\) these estimates are not sufficient and we rely on the Malliavin calculus to provide a finer analysis of the correlation. The actual proof of Theorem 4 is postponed to Sect. 3.4 after some auxiliary results.

First, we will consider the case \(H<\frac{1}{2}\). Notice that in this case we have \(M(a) >H+\frac{1}{4}\) for any filter a.

Theorem 3

Assume \(H<\frac{1}{2}\), let X be a solution of the SDE (S2) and let a be a filter with \(p+1\) components. For \(\alpha > \frac{1}{2(1-H)}\) (with \(\Delta = n^{-\alpha }\)) we have

$$\begin{aligned} \sqrt{n} V(a,\, n,\, \Delta ,\, X){\mathop {\rightarrow }\limits ^{(d)}} N (0,\, \sigma _H), \end{aligned}$$

with \(\sigma _H>0\) from Theorem 1.

Proof

By a binomial expansion we can write:

$$\begin{aligned} \sqrt{n}V(a,\, n,\, \Delta ,\, X) = \sqrt{n} V(a,\, n,\, \Delta ,\, B^H) + R_n \end{aligned}$$

with

$$\begin{aligned} R_n=n^{-1/2}\Delta ^ {-2H}\sum _{r=0}^1 C^{r} \sum _{i=0}^{n-p} (\Delta _a B^H_{i})^r (\Delta _a Y_{i})^{2-r}=:R_{n,\, 0}+R_{n,\, 1} \end{aligned}$$

with some constants \(C^0\), \(C^1\). Due to Slutsky’s lemma and part (b) of Theorem 1 it is enough to show that \(\mathbb {E}[|R_{n,\,r}|]\rightarrow 0\) for \(r=0,\,1\). Using differences representation (5) we obtain with \(b_{j}\) given by (4)

$$\begin{aligned} |\Delta _a Y_i|=\Bigl |\sum _{j=0}^{p-1}b_j\int _{(i+j)\Delta }^{(i+j+1)\Delta } f(s, X_s) \hbox {d}s \Bigr | \lesssim \Delta , \end{aligned}$$
(12)

since f is uniformly bounded by (10). The notation \(f\lesssim g\) means that there exists a strictly positive constant c such that \(f\le cg\). Thus, we deduce for \(R_{n,\,r}, r=0,1\):

$$\begin{aligned} \mathbb {E}[|R_{n,\,r}|]\lesssim n^{-1/2}\Delta ^{-2H}\Delta ^{2-r}\sum _{i=0}^{n-p}\mathbb {E}[|\Delta _a B^H_i|^r]. \end{aligned}$$

Moreover, since for \(i\in \{0,\dots , n-p\}\), \(\Delta _a B^H_i\) is normally distributed, \(\mathbb {E}[|\Delta _a B^H_i|]\) can be calculated directly:

$$\begin{aligned} \mathbb {E}[|\Delta _a B^H_i|]=\frac{2\sqrt{\sigma _{a,\,\Delta }}}{\sqrt{2\pi }}\lesssim \Delta ^H, \end{aligned}$$

since by (8), \(\sigma _{a,\, \Delta }= -\Delta ^{2H}\underset{i>j}{\sum \limits _{i,\,j =0}^{p}} a_i a_j (i-j)^{2H}.\) Hence, we have

$$\begin{aligned} \mathbb {E}[|R_{n,\, 0}|]\lesssim \Delta ^{-2H} n^{-1/2}\Delta ^2 n= n^{1/2-\alpha (2-2H)} \end{aligned}$$

as well as

$$\begin{aligned} \mathbb {E}[|R_{n,\, 1}|]\lesssim \Delta ^{-2H} n^{-1/2}\Delta n \Delta ^H= n^{1/2-\alpha (1-H)}, \end{aligned}$$

both of which converge to zero because of our assumption on \(\alpha \). \(\square \)

Remark 1

Note that for \(H<1/2\) one can choose \(\alpha = 1\), thus considering the usual equidistant partition of the interval \([0,\, 1]\). But it is also possible to choose a mesh \(\Delta \) less than \(\frac{1}{n}\).

In the second theorem the case \(H\in \big [\frac{1}{2},\,1\big )\) is being considered. For this case we will assume \(\alpha \ge 1\), which means that the observations do not leave the interval \([0,\,1]\). For \(H\ge \frac{1}{2}\) the bounds on the correlation terms in the above proof fail to converge for \(\alpha = 1\); therefore, we need more elaborate techniques to handle the remainder in Sect. 3.4. The assumption \(\alpha \ge 1\) is purely technical: It permits the use of Lemma 1 in Sect. 3.3, the core estimate in our Malliavin calculus approach.

Theorem 4

Let \((B^H_t)_{t\in [0,\,1]}\) be a fractional Brownian motion with Hurst index \(H\in \big [\frac{1}{2},\,1\big )\) and a filter a satisfying \(M(a) >H+\frac{1}{4}\). If the process \((X_t)_{t\in [0,\,1]}\) solves (S1) and if \(\alpha \ge 1\), \(\alpha > \max \left( \frac{2H-1}{2-2H},\, \frac{1}{4-4H}\right) \) (with \(\Delta = n^{-\alpha }\)), we have

$$\begin{aligned} \sqrt{n} V(a,\, n,\, \Delta ,\, X) {\mathop {\rightarrow }\limits ^{(d)}} N (0,\, \sigma _H), \end{aligned}$$

where \(\sigma _H\) is the constant from Theorem 1.

Remark 2

All in all, for \(H\in \left( \frac{1}{2},\,\frac{3}{4}\right) \) we have the convergence conditions \(\alpha > \frac{1}{4-4H}\) and \(\alpha > \frac{2H-1}{2-2H}\), both of which are satisfied for \(\alpha \ge 1\). When \(H>\frac{3}{4}\), we need to choose \(\alpha >1\).

Remark 3

The proofs of Theorems 3 and 4 were based upon the demonstration that the differences \(\sqrt{n}(V(a,\, n,\, \Delta ,\, X)- V(a,\, n,\, \Delta ,\, B^H))\) converge to zero in \(L^1\). This implies that both theorems can be generalized to their multivariate versions, again by means of Slutsky’s lemma and Theorem 2.

3.3 Auxiliary Results from Malliavin Calculus

We postpone to Appendix the technical definitions and results from Malliavin calculus. We refer to [4] for the fact that the solution X of (S1) belongs to the Sobolev space \(\mathbb {D}^{1,\,2}\), implying that the process Y defined above is also Malliavin differentiable (compare also the proofs of Lemma 5.1 and Lemma 5.3 in [16] for a generalization with respect to the time dependence of the drift). Moreover, the Malliavin derivative of X (with respect to the fBm \(B^ {H}\)) satisfies \(D_s X_t=0\) for \(s>t\) and

$$\begin{aligned} D_s X_t = \int _s^t f'(r,\,X_r)D_s X_r \hbox {d}r + 1 \end{aligned}$$

for \(s\le t\). This linear equation has the explicit solution

$$\begin{aligned} D_s X_t = e^{\int _s^t f'(r,\,X_r)\hbox {d}r}{,} \end{aligned}$$

and similarly to the equation considered in [4] this provides the bound

$$\begin{aligned} e^{-tM}\le D_s X_t\le e^{tM} \end{aligned}$$

almost everywhere for \(s\le t\), where M is the bound on \(\Vert f'\Vert _{\infty }\) defined above in (10).

Moreover, we need the following technical result, which analyzes the correlation between the increments of the processes \(B^{H}\) and Y. Below, \(\mathcal {H}\) is the canonical Hilbert space of the fBm (see Sect. 6).

Lemma 1

Let \(H \ge \frac{1}{2}\) and Y be defined by (11) for the SDE (S1). Then

  (a) for every \(s,\,t,\,u,\,v\in [0,\,1]\) with \(s<t\), we have

    $$\begin{aligned} \mathbb {E}[|\left\langle D_\cdot (Y_{u}-Y_{v}),\, 1_{[{s},\,{t}]}(\cdot ) \right\rangle _{\mathcal {H}}|]\lesssim ({t}-{s})\,|{u}-{v}|\ , \end{aligned}$$
  (b) for every \(s_i,\,t_i,\,u_i,\,v_i\in [0,\,1]\) with \(s_i<t_i\), \(i=1,\,2\),

    $$\begin{aligned}&\mathbb {E}[|\left\langle D_\cdot (Y_{u_1}-Y_{v_1}),\, 1_{[{s_1},\,{t_1}]}(\cdot ) \right\rangle _{\mathcal {H}}\left\langle D_\cdot (Y_{u_2}-Y_{v_2}),\, 1_{[{s_2},\,{t_2}]}(\cdot ) \right\rangle _{\mathcal {H}}|]\\&\quad \lesssim (t_1-s_1)\,|u_1-v_1|\,(t_2-s_2)\,|u_2-v_2|. \end{aligned}$$

Proof

First note that for \(s\in [0,\,1]\)

$$\begin{aligned} D_s(Y_u-Y_v)=D_s\int _v^u f(a, X_a) \hbox {d}a =\int _v^u D_s f(a, X_a) \hbox {d}a. \end{aligned}$$

This can be seen by approximating the integrals using the fact that the integrands are bounded. Since the derivative operator is closed, it carries over to the limit. The chain rule yields

$$\begin{aligned} \int _v^u D_s f(a, X_a) \hbox {d}a =\int _v^u \underbrace{f'(a, X_a)}_{|\dots |\le M} \underbrace{D_s X_a}_{|\dots |\le e^M} \hbox {d}a, \end{aligned}$$

which results in the bound

$$\begin{aligned} |D_s(Y_u-Y_v)|\lesssim |u-v|. \end{aligned}$$
(13)

Now we will consider the case \(H=\frac{1}{2}\), in which \(B^H\) is the standard Brownian motion. Since the space \(\mathcal {H}\) coincides with the space \( L^{2}([0,\,1])\), we have

$$\begin{aligned}&\mathbb {E}[|\langle D_\cdot (Y_u-Y_v), \, 1_{[s,\,t]}(\cdot )\rangle _{\mathcal {H}}|]=\mathbb {E}[|\langle D_\cdot (Y_u-Y_v), \, 1_{[s,\,t]}(\cdot )\rangle _{L^2}|]\\&\quad \le \mathbb {E}\left[ \int _s^t |D_{\beta } (Y_u-Y_v)| \hbox {d}\beta \right] \lesssim |u-v|(t-s), \end{aligned}$$

applying in the last step the above bound (13) on \(|D_{\beta } (Y_u-Y_v)|\). Part (a) is proven for \(H=\frac{1}{2}\). Part (b) is obtained similarly by the same arguments and bounds.

For the case \(H> \frac{1}{2}\) we first calculate (recall that the norm in \(\vert \mathcal {H}\vert \) is defined in (23))

$$\begin{aligned} \Vert D(Y_{u}-Y_{v}) \Vert _{\vert \mathcal {H} \vert } ^ {2}= & {} \int _0^1 \int _0^1 |D_s(Y_u-Y_v)||D_r(Y_u-Y_v)||s-r|^{2H-2} \hbox {d}s \hbox {d}r\\\lesssim & {} (u-v)^2\underbrace{\int _0^1 \int _0^1 |s-r|^{2H-2} \hbox {d}s \hbox {d}r}_{=\frac{1}{\alpha _H}\langle 1_{[0,\,1]},\, 1_{[0,\,1]}\rangle _{\mathcal {H}}}<\infty ; \end{aligned}$$

consequently, \(D_\cdot (Y_u-Y_v)\in |\mathcal {H}|\), which enables us to use the scalar product representation given in (24). We get for case (a), by (13),

$$\begin{aligned}&\mathbb {E}[|\left\langle D_\cdot (Y_u-Y_v),\, 1_{[s,\,t]}(\cdot ) \right\rangle _{\mathcal {H}}|] \\&\quad \le \mathbb {E}\left[ \alpha _H\int _0^1\int _0^1 |D_{\beta }(Y_u-Y_v)| 1_{[s,\,t]}(\alpha ) |\alpha -\beta |^{2H-2}\hbox {d}\beta \hbox {d}\alpha \right] \\&\quad =\alpha _H\int _s^t\int _0^1 \mathbb {E}[|D_{\beta }(Y_u-Y_v)|] |\alpha -\beta |^{2H-2}\hbox {d}\beta \hbox {d}\alpha \\&\quad \lesssim |u-v| \int _s^t\int _0^1 |\alpha -\beta |^{2H-2}\hbox {d}\beta \hbox {d}\alpha . \end{aligned}$$

Moreover, it holds that

$$\begin{aligned} \int _s^t\int _0^1 |\alpha -\beta |^{2H-2}\hbox {d}\beta \hbox {d}\alpha= & {} \frac{1}{\alpha _H}\langle 1_{[s,\,t]},\,1_{[0,\,1]}\rangle _{\mathcal {H}}\\= & {} \frac{1}{2}(t^{2H}-s^{2H})+\frac{1}{2} ((1-s)^{2H}-(1-t)^{2H})\lesssim (t-s), \end{aligned}$$

since for \(0<a<b<1\) and \(H>\frac{1}{2}\)

$$\begin{aligned} b^{2H}-a^{2H}=2H\int _0^{b-a}(x+a)^{2H-1}dx\le 2H b^{2H-1} (b-a)\lesssim (b-a). \end{aligned}$$

This proves part (a) of the lemma for \(H>\frac{1}{2}\). The proof of part (b) follows analogously, since

$$\begin{aligned} \mathbb {E}[|D_{\beta _1}(Y_{u_1}-Y_{v_1})D_{\beta _2}(Y_{u_2}-Y_{v_2})|]\lesssim |u_1-v_1|\,|u_2-v_2|. \end{aligned}$$

\(\square \)

3.4 Proof of Theorem 4

The correlation estimates of Lemma 1 allow us to prove the theorem.

Proof

As in the proof of Theorem 3, we consider the sum \(R_{n,\, 0}+R_{n,\,1}\). For the summand \(R_{n,\,0}\) we get, similarly to the previous theorem, by the boundedness of f

$$\begin{aligned} \mathbb {E}[|R_{n,\,0}|]= & {} Cn^{-1/2}\Delta ^{-2H} \sum _{i=0}^{n-p}\mathbb {E}[(\Delta _a Y_i)^2]\ \lesssim n^{-1/2}\Delta ^{-2H} n \Delta ^2 \\= & {} \Delta ^{2-2H}n^{1/2}=n^ {\alpha (2H-2)+\frac{1}{2}}, \end{aligned}$$

which converges to zero for \(\alpha > \frac{1}{2(2-2H)}\). We rewrite \(R_{n,\,1}\) with the help of differences representation (5). Then

$$\begin{aligned} R_{n,\,1}&= Cn^{-1/2}\Delta ^{-2H}\sum _{j=0}^{n-p}(\Delta _a Y_j)(\Delta _a B^H_j)\nonumber \\&=C n^{-1/2}\Delta ^{-2H}\sum _{j=0}^{n-p}(\Delta _a Y_j)\left( \sum _{i=0}^{p-1} b_i (B^H_{(i+j)\Delta }-B^H_{(i+j+1)\Delta })\right) \nonumber \\&= -C n^{-1/2}\Delta ^{-2H}\sum _{j=0}^{n-p}\sum _{i=0}^{p-1}b_i(\Delta _a Y_j) \, B^H({\varvec{1}}_{[(i+j)\Delta ,\,(i+j+1)\Delta ]})\nonumber \\&= C n^{-1/2}\Delta ^{-2H}\sum _{j=0}^{n-p}(\Delta _a Y_j) \, B^H(h_j)\nonumber \\&= C n^{-1/2}\Delta ^{-2H}\Big ( \delta (u) + \sum _{j=0}^{n-p} \langle D_\cdot \Delta _a Y_j, \, h_j \rangle _{{\mathcal {H}}}\Big )\ \end{aligned}$$
(14)

where \(\delta \) denotes the Skorokhod integral. This decomposition follows directly from the integration by parts formula (25) applied to \(u=\sum _{j=0}^{n-p} F_jh_j\) for

$$\begin{aligned} F_j=\Delta _a Y_j\in \mathbb {D}^{1,\,2}, \qquad h_j=-\sum _{i=0}^{p-1}b_i 1_{[(i+j)\Delta ,\, (i+j+1)\Delta ]}(\cdot ), \quad j\in \{0,\dots ,n-p\}\ . \end{aligned}$$

Notice that (see (5)) \(h_j = D_\cdot \Delta _a B^H_j\). We first estimate the second summand in (14). We can write

$$\begin{aligned}&\mathbb {E}\left[ \left| \left\langle D_\cdot \Delta _a Y_j,\,h_j\right\rangle _{\mathcal {H}}\right| \right] \\&\quad = \mathbb {E}\left[ \left| \left\langle D_\cdot \left( \sum _{k=0}^{p-1} b_k (Y_{(k+j)\Delta }-Y_{(k+j+1)\Delta })\right) ,\,\sum _{i=0}^{p-1}\left( b_i 1_{[(i+j)\Delta ,\,(i+j+1)\Delta ]}(\cdot )\right) \right\rangle _{\mathcal {H}}\right| \right] \\&\quad \le \sum _{i,k =0}^{p-1}|b_i b_k| \cdot \mathbb {E}\left[ \left| \left\langle D_\cdot (Y_{(k+j)\Delta }-Y_{(k+j+1)\Delta }),\,1_{[(i+j)\Delta ,\,(i+j+1)\Delta ]}(\cdot )\right\rangle _{\mathcal {H}}\right| \right] \\&\quad \lesssim p^2\max _{0\le k \le p-1}\{ b_k^2\}\, \Delta ^2 \lesssim \Delta ^2 \end{aligned}$$

for any \(j\in \{0,\dots , n-p\}\), where the last inequality follows from part (a) of Lemma 1. Hence, we obtain

$$\begin{aligned} \mathbb {E}[|n^{-1/2}\Delta ^{-2H} \sum _{j=0}^{n-p}\langle D_\cdot \Delta _a Y_j,\, h_j\rangle _{\mathcal {H}} | ]\lesssim n^{1/2}\Delta ^{2-2H}, \end{aligned}$$
(15)

which goes to zero if \(\alpha >\frac{1}{4-4H}\). Let us turn to the first component \(n^{-1/2}\Delta ^{-2H} \delta (u)\) and show that it vanishes in \(L^2\) (implying convergence in probability).

$$\begin{aligned} \begin{aligned}&\mathbb {E}\bigl [ | \delta (u) |^2 \bigr ] =\mathbb {E}\bigl [ | \sum _{j=0}^{n-p} \delta ((\Delta _aY_j)h_j) |^2 \bigr ] =\sum _{j,k=0}^{n-p} \mathbb {E}\bigl [ \delta ((\Delta _aY_j)h_j) \cdot \delta ((\Delta _aY_k)h_k) \bigr ] \\&\quad =\sum _{j,k=0}^{n-p} \Bigg ( \mathbb {E}\bigl [ \bigl \langle (\Delta _aY_j)h_j,\, (\Delta _aY_k)h_k\bigr \rangle _{{\mathcal {H}}}\bigr ] + \mathbb {E}\bigl [ \bigl \langle D_\cdot \Delta _aY_j,h_k \bigr \rangle _{{\mathcal {H}}} \bigl \langle D_\cdot \Delta _a Y_k,\, h_j\bigr \rangle _{{\mathcal {H}}}\bigr ] \Bigg ) \ , \end{aligned} \end{aligned}$$
(16)

which follows from a direct application of (26). An expansion of differences representation (5) for \(\Delta _aY_j, \Delta _aY_k\) and the definition of \(h_j,h_k\) gives

$$\begin{aligned}&\mathbb {E}\bigl [ \bigl \langle D_\cdot \Delta _aY_j,h_k \bigr \rangle _{{\mathcal {H}}} \bigl \langle D_\cdot \Delta _a Y_k,\, h_j\bigr \rangle _{{\mathcal {H}}}\bigr ] \\&\quad =\sum _{i,\,l,\,\mu ,\,\nu =0}^{p-1}b_i b_l b_\mu b_\nu \cdot \mathbb {E}\Big [ \bigl \langle D_\cdot (Y_{(i+j+1)\Delta }-Y_{(i+j)\Delta }),\, 1_{[(\nu +k)\Delta ,\, (\nu +k+1)\Delta ]}(\cdot )\bigr \rangle _{\mathcal {H}}\\&\qquad \cdot \bigl \langle D_\cdot (Y_{(l+k+1)\Delta }-Y_{(l+k)\Delta }),\, 1_{[(\mu +j)\Delta ,\, (\mu +j+1)\Delta ]}(\cdot )\bigr \rangle _{\mathcal {H}}\Big ]. \end{aligned}$$

With the use of Lemma 1 applied to each expectation this is easily dominated by

$$\begin{aligned} \bigl | \mathbb {E}\bigl [ \bigl \langle D_\cdot \Delta _aY_j,h_k \bigr \rangle _{{\mathcal {H}}} \bigl \langle D_\cdot \Delta _a Y_k,\, h_j\bigr \rangle _{{\mathcal {H}}}\bigr ] \bigr | \lesssim p^4 \max _{0\le k \le p-1}\{ b_k^4\}\, \Delta ^4 \lesssim \Delta ^4. \end{aligned}$$

Analogously, we obtain

$$\begin{aligned} \begin{aligned}&\bigl |\mathbb {E}\bigl [ \bigl \langle (\Delta _aY_j)h_j,\, (\Delta _aY_k)h_k\bigr \rangle _{{\mathcal {H}}}\bigr ]\bigr | \\&\quad =\sum _{i,\,l,\,\mu ,\,\nu =0}^{p-1} \bigl | b_i b_l b_\mu b_\nu \cdot \mathbb {E}\Big [ (Y_{(i+j+1)\Delta }-Y_{(i+j)\Delta })(Y_{(l+k+1)\Delta }-Y_{(l+k)\Delta }) \Big ]\\&\qquad \cdot \langle 1_{[(\mu +j)\Delta ,\, (\mu +j+1)\Delta ]}(\cdot ),\, 1_{[(\nu +k)\Delta ,\, (\nu +k+1)\Delta ]}(\cdot )\rangle _{\mathcal {H}} \bigr | \\&\quad \le \max _{0\le k \le p-1}\{ b_k^4\} M^2\Delta ^2\cdot \sum _{i,\,l,\,\mu ,\,\nu =0}^{p-1} \bigl | \langle 1_{[(\mu +j)\Delta ,\, (\mu +j+1)\Delta ]}(\cdot ),\, 1_{[(\nu +k)\Delta ,\, (\nu +k+1)\Delta ]}(\cdot )\rangle _{\mathcal {H}} \bigr |, \end{aligned} \end{aligned}$$
(17)

where we used the fact that the expectations are again bounded by \(M^2\Delta ^2\).

Observe that for fixed i, l, \(\mu \), \(\nu \) we will make the sum in (16) larger if we estimate

$$\begin{aligned} \begin{aligned}&\sum _{j,k=0}^{n-p} \mathbb {E}\bigl [ \bigl |\bigl \langle (\Delta _aY_j)h_j,\, (\Delta _aY_k)h_k\bigr \rangle _{{\mathcal {H}}}\bigr |\bigr ] \\&\quad \lesssim \Delta ^2\sum _{j,\,k=0}^{n-p+\eta } \mathbb {E}\bigl [ \bigl |\bigl \langle 1_{[j\Delta ,\, (j+1)\Delta ]}(\cdot ),\, 1_{[k\Delta ,\, (k+1)\Delta ]}(\cdot )\bigr \rangle _{\mathcal {H}}\bigr |\bigr ], \end{aligned} \end{aligned}$$
(18)

where \(\eta = \max (\mu ,\,\nu )\).

Now the inner product satisfies

$$\begin{aligned}&\langle 1_{[i\Delta ,\, (i+1)\Delta ]}(\cdot ),\, 1_{[j\Delta ,\, (j+1)\Delta ]}(\cdot )\rangle _{\mathcal {H}} \\&\quad =\Delta ^{2H}\cdot \frac{1}{2}\left( |i-j+1|^{2H}+|i-j-1|^{2H}-2|i-j|^{2H}\right) . \end{aligned}$$

We note that for \(H=\frac{1}{2}\) the indicator functions are orthogonal. On the diagonal we have for \(i=j\)

$$\begin{aligned} \langle 1_{[i\Delta ,\, (i+1)\Delta ]}(\cdot ),\, 1_{[j\Delta ,\, (j+1)\Delta ]}(\cdot )\rangle _{\mathcal {H}} =\Delta ^{2H}. \end{aligned}$$

For \(H>\frac{1}{2}\) also the off-diagonal elements contribute. For \(r:=i-j\) large we have \((r+1)^{2H}-2r^{2H}+(r-1)^{2H} \sim 2H(2H-1) r^{2H-2}\) since it uniformly approximates the second derivative of the function \(x\mapsto x^{2H}\) at \(x=r\). Indeed, a Taylor expansion shows that for \(r> 1\)

$$\begin{aligned} (r+1)^{2H}+(r-1)^{2H} - 2r^{2H} = 2H(2H-1)r^{2H-2} + O\left( (r-1)^{2H-2} \right) . \end{aligned}$$

We can now rearrange the off-diagonal part of the sum in (18) and use this approximation:

$$\begin{aligned}&\Delta ^2\sum _{\begin{array}{c} j,\,k=0\\ j\ne k \end{array}}^{n-p+\eta } \mathbb {E}\bigl [ \bigl |\bigl \langle 1_{[t_{j},\, t_{j+1}]}(\cdot ),\, 1_{[t_{k},\, t_{k+1}]}(\cdot )\bigr \rangle _{\mathcal {H}}\bigr |\bigr ]\\&\quad =\Delta ^{2+2H} \frac{1}{2}2\sum _{r=1}^{n-p+\eta } \left( \left( n-p+\eta \right) +1-r\right) \left( (r+1)^{2H}-2r^{2H}+(r-1)^{2H}\right) \\&\quad \lesssim \Delta ^{2+2H} \sum _{r=1}^{n-p+\eta } \left( \left( n-p+\eta \right) +1-r\right) r^{2H-2}. \end{aligned}$$

Recall now that for the sum \(\sum _{k=1}^n \frac{1}{k^s}\) with \(n\in \mathbb {N}\) and \(s\in \mathbb {R}\backslash \{1\}\) the following asymptotic equality stems from the Euler–Maclaurin formula.

$$\begin{aligned} \sum _{k=1}^n \frac{1}{k^s}\sim \zeta (s)-\frac{n^{1-s}}{s-1}\left( 1+O\left( \frac{1}{n}\right) \right) , \end{aligned}$$

where \(\zeta (s)\) is the Riemann zeta function. Therefore, since for \(H>\frac{1}{2}\) the exponent \(s=2-2H\) cannot equal 1,

$$\begin{aligned} \sum _{r=1}^{n-p+\eta }&\left( \left( n-p+\eta \right) +1-r\right) r^{2H-2}\\&\lesssim n \sum _{r=1}^{n-p+\eta } r^{2H-2} \lesssim n+n\cdot n^{1-(2-2H)}\lesssim n^{2H}. \end{aligned}$$

As a final step, we verify that we have for the correctly scaled first summand in (14)

$$\begin{aligned}&\mathbb {E}\bigl [ | n^{-1/2}\Delta ^{-2H} \delta (u) |^2 \bigr ] \\&\quad = n^{-1}\Delta ^{-4H} \sum _{j,k=0}^{n-p} \Bigg ( \mathbb {E}\bigl [ \bigl \langle (\Delta _aY_j)h_j,\, (\Delta _aY_k)h_k\bigr \rangle _{{\mathcal {H}}}\bigr ] \\&\qquad + \mathbb {E}\bigl [ \bigl \langle D_\cdot \Delta _aY_j,h_k \bigr \rangle _{{\mathcal {H}}} \bigl \langle D_\cdot \Delta _a Y_k,\, h_j\bigr \rangle _{{\mathcal {H}}}\bigr ] \Bigg ) \\&\quad \lesssim n^{-1}\Delta ^{-4H} \Bigg ( \Delta ^{2H+2}n^{2H} + n^2 \Delta ^4 \Bigg ) = n \Delta ^{4-4H}+ n^{2H-1}\Delta ^{2-2H}. \end{aligned}$$

Substituting \(\Delta =n^{-\alpha }\) gives us the condition \(\alpha >\frac{1}{4-4H}\) as above for the vanishing of the first term and the additional condition \(\alpha >\frac{2H-1}{2-2H}\) for the second. This bound, combined with (15) and (14), implies the conclusion. \(\square \)

4 Estimation of the Hurst Parameter

Here we will construct estimators for the Hurst parameter H of the fractional Brownian motion \(B^H\) driving the SDE and derive their properties, using the previous results. In particular, we will transfer the almost sure convergence result from Theorem 1 to the quadratic a-variation of the solution of the SDE and apply the central limit theorems from the previous section.

We shall fix in the sequel a filter \(a=(a_0,\dots , a_p)\) of order M(a).

For a stochastic process \((Z_t)_{t\in [0,\,1]}\) we define

$$\begin{aligned} U(a,\,n,\,\Delta ,Z):=\frac{1}{n}\sum _{j=1}^{n-p} (\Delta _a Z_j)^2 \end{aligned}$$

with n such that \(n-p >0\) and \(\Delta = n^{-\alpha }\) for some \(\alpha > 0\). Then

$$\begin{aligned} V(a,\,n,\,\Delta ,B^H)=\frac{U(a,\,n,\,\Delta ,B^H)}{\sigma _{a,\, \Delta }}-\frac{n-p}{n} \end{aligned}$$

holds, and we deduce from Theorem 1 that

$$\begin{aligned} \frac{U(a,\,n,\,\Delta ,B^H)}{\sigma _{a,\, \Delta }} -1{\mathop {\rightarrow }\limits ^{\text {a.s.}}}0. \end{aligned}$$

The following theorem shows the same result for the solution X of (S1).

Theorem 5

Let \((B^H_t)_{t\in [0,\,1]}\) be a fractional Brownian motion with Hurst parameter \(H\in (0,1)\), and let \((X_t)_{t\in [0,\,1]}\) be the solution of the SDE (S1) driven by the fBm \(B^H\). Then

$$\begin{aligned} \frac{U(a,\,n,\,\Delta ,X)}{\sigma _{a,\, \Delta }} -1{\mathop {\rightarrow }\limits ^{\text {a.s.}}}0. \end{aligned}$$

Proof

We have

$$\begin{aligned} U(a,\,n,\,\Delta ,X)=U(a,\,n,\,\Delta ,B^H)+U(a,\,n,\,\Delta ,Y)+2\frac{1}{n}\sum _{j=1}^{n-p} (\Delta _a Y_j)(\Delta _a B^H_j), \end{aligned}$$

so it is enough to show that the last two summands divided by \(\sigma _{a,\, \Delta }\) (\(\sim \Delta ^{2H}\)) converge to zero almost surely. Recall that we have from (12) \(|\Delta _a Y_j|\lesssim \Delta \). It follows that

$$\begin{aligned} \left| \frac{U(a,\,n,\,\Delta ,Y)}{\sigma _{a,\, \Delta }}\right| {\mathop {\lesssim }\limits ^{\text {a.s.}}}\Delta ^{-2H} n^{-1}n\Delta ^2=\Delta ^{2-2H}, \end{aligned}$$

which goes to 0 for every \(\alpha \) as n tends to infinity.

To estimate \(\Delta _a B^H_j\) we refer to [17] for the fact that almost all sample paths of a fractional Brownian motion with the Hurst index H are Hölder continuous of order \(H-\varepsilon \) for any \(\varepsilon > 0\). (This follows from the Kolmogorov continuity theorem, since \(B^H\) is self-similar with stationary increments.) We get for almost every trajectory, using the differences representation (5):

$$\begin{aligned} |\Delta _a B^H_j|\le \sum _{i=0}^{p-1}|b_i||B_{(i+j+1)\Delta }-B_{(i+j)\Delta }|\le \sum _{i=0}^{p-1}|b_i| C \Delta ^{H-\varepsilon }\lesssim \Delta ^{H-\varepsilon } \end{aligned}$$

with a positive random variable C. Then we get

$$\begin{aligned} \left| \frac{1}{n\sigma _{a,\, \Delta }}\sum _{j=1}^{n-p} (\Delta _a Y_j)(\Delta _a B^H_j)\right| {\mathop {\lesssim }\limits ^{\text {a.s.}}} n^{-1}\Delta ^{-2H}n\Delta \Delta ^{H-\varepsilon }=\Delta ^{1-H-\varepsilon }. \end{aligned}$$

We can ensure that it converges to zero by setting \(\varepsilon := \min \left( \frac{1-H}{2},\,\frac{H}{2}\right) \), and the claim follows. \(\square \)

Note that for the special case \(a=(-1,\,1)\) and \(\Delta = n^{-1}\) we can deduce directly from this result that the “standard” estimator \(\hat{H}\) given by

$$\begin{aligned} \hat{H}=\hat{H}(X):=\frac{\log (n U(a,\,n,\,\Delta ,\,X))}{-2\log (n)}+\frac{1}{2} \end{aligned}$$
(19)

converges almost surely to H.
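
For concreteness, a minimal sketch of (19) in NumPy (our own helper; it assumes the observations \(X_{j/n}\), \(j=0,\dots ,n\), are stored in an array x) might look as follows.

```python
import numpy as np

def hurst_standard(x):
    """Standard estimator (19) with a = (-1, 1) and Delta = 1/n; x[j] = X_{j/n}, j = 0,...,n."""
    n = len(x) - 1
    d = np.diff(x)[1:]                     # increments X_{(j+1)/n} - X_{j/n}, j = 1,...,n-1
    u = np.sum(d**2) / n                   # U(a, n, Delta, X)
    return np.log(n * u) / (-2.0 * np.log(n)) + 0.5
```

By the preceding result, the returned value should be close to H for large n.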

Remark 4

Since the relationship between \(V(a,\,n,\,\Delta ,X)\) and \(U(a,\,n,\,\Delta ,X)\) is the same as for the process \(B^H\), it follows from Theorem 5 that \(V(a,\,n,\,\Delta ,X){\mathop {\rightarrow }\limits ^{\text {a.s.}}}0\).

Now we are in a position to construct an estimator \(\hat{H}_1\) in the way it was done in [13] (see also [7]) for processes with stationary increments. We know from Theorem 5 that \(U(a,\,n,\,\Delta ,X)\) asymptotically equals \(\sigma _{a,\,\Delta }\), and a simple rearrangement argument gives us

$$\begin{aligned} \sigma _{a,\,\Delta }=-\frac{1}{2}\Delta ^{2H}\sum _{k,\,l =0}^{p}a_k a_l |k-l|^{2H}=-\sum _{d=1}^{p} (\Delta d)^{2H}\sum _{k=0}^{p-d}a_k a_{k+d}. \end{aligned}$$

Let us consider for \(i\in \{1,\dots ,m\}\) (\(m\in {\mathbb {N}}\backslash \{0\}\)) a family of filters \(a^{(i)}=(a_0^{(i)},\dots , a_{p_i}^{(i)})\). For each member the associated variance \(\sigma _{a,\Delta }\) is a linear combination of the \( (\Delta d)^{2H}\), \(d\in \{1,\dots ,P\}\), where \(P=\max _i p_i\), the numbers \(p_i\) being not necessarily different. So if we consider a matrix \(A=(A_{ij})\in \mathbb {R}^{m\times P}\) defined by

$$\begin{aligned} A_{ij}=-\sum _{k=0}^{p_i-j}a^{(i)}_k a^{(i)}_{k+j}\text { for }j\le p_i\text { and }A_{ij}=0\text { otherwise}, \end{aligned}$$

then \(U_n:=(U(a^{(i)},\,n,\,\Delta ,X))_{i=1,\dots ,m}\) tends almost surely to AD, where D denotes the vector \(((\Delta d)^{2H})_{d=1,\dots ,P}\).

We choose the \(a^{(i)}\) such that A is of full rank and estimate D by linear regression (preserving the almost sure convergence property): \(\hat{D}=(A^TA)^{-1} A^T U_n\). Now we can consider \(\log |\hat{D}|\), which tends to \(2H (\log (d \Delta ))_{d=1,\dots ,P}\), and estimate H by another simple linear regression:

$$\begin{aligned} \hat{H}_1=\hat{H}_{1}(X):=\frac{\sum \nolimits _{d=1}^P \log |\hat{D}_d|\log (d\Delta )}{2\sum \nolimits _{d=1}^P \log (d\Delta )^2 }. \end{aligned}$$
(20)

If we know the observations to be equidistant but do not have access to the actual mesh size \(\Delta \), we can estimate D in the same way (because \(U_n\) does not directly depend on \(\Delta \)) and then perform a regression with the intercept term \(2H \log (\Delta )\). This yields the estimator

$$\begin{aligned} \hat{H}_2=\hat{H}_{2}(X):=\frac{\sum \nolimits _{d=1}^P \log |\hat{D}_d|\log (d)-\frac{1}{P}\sum \nolimits _{d=1}^P \log |\hat{D}_d|\sum \nolimits _{d=1}^P \log (d)}{2 \left( \sum \nolimits _{d=1}^P \log (d)^2-\frac{1}{P}\left( \sum \nolimits _{d=1}^P \log (d)\right) ^2\right) }. \end{aligned}$$

To establish some properties of the estimators \(\hat{H}_1\) and \(\hat{H}_2\), we first recall that, as a consequence of Theorem 5, we have the convergence result \(U_n{\mathop {\rightarrow }\limits ^{\text {a.s.}}}AD\) and therefore

$$\begin{aligned} \hat{D}=(A^TA)^{-1} A^T U_n{\mathop {\rightarrow }\limits ^{\text {a.s.}}} D. \end{aligned}$$

Then, invoking the continuous mapping theorem, we obtain \(\hat{H}_1{\mathop {\rightarrow }\limits ^{\text {a.s.}}}H\), \(\hat{H}_2{\mathop {\rightarrow }\limits ^{\text {a.s.}}} H\). In other words, both estimators are strongly consistent.
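
To make the construction of \(\hat{H}_1\) and \(\hat{H}_2\) explicit, here is a hedged NumPy sketch (our own helper names; it presumes the filters are chosen so that A has full rank and, for \(\hat{H}_2\), that \(P\ge 2\)).

```python
import numpy as np

def U_stat(a, x):
    """U(a, n, Delta, X) = (1/n) * sum_{j=1}^{n-p} (Delta_a X_j)^2, with x[j] = X_{j*Delta}."""
    a = np.asarray(a, dtype=float)
    p, n = len(a) - 1, len(x) - 1
    inc = np.array([a @ x[j:j + p + 1] for j in range(1, n - p + 1)])
    return np.sum(inc**2) / n

def hurst_regression(filters, x, delta=None):
    """Estimator (20) if delta is known, otherwise the intercept version H_2."""
    P = max(len(a) - 1 for a in filters)
    A = np.zeros((len(filters), P))
    for i, a in enumerate(filters):
        a = np.asarray(a, dtype=float)
        p = len(a) - 1
        for d in range(1, p + 1):              # column d of A corresponds to the lag d
            A[i, d - 1] = -np.sum(a[:p + 1 - d] * a[d:])
    U = np.array([U_stat(a, x) for a in filters])
    D_hat, *_ = np.linalg.lstsq(A, U, rcond=None)   # D_hat estimates ((d*Delta)^{2H})_d
    y = np.log(np.abs(D_hat))
    if delta is not None:
        z = np.log(delta * np.arange(1, P + 1))
        return np.sum(y * z) / (2.0 * np.sum(z**2))                       # H_1, cf. (20)
    z = np.log(np.arange(1, P + 1))
    return (np.sum(y * z) - np.sum(y) * np.sum(z) / P) / \
           (2.0 * (np.sum(z**2) - np.sum(z)**2 / P))                      # H_2
```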

The question of asymptotic normality depends upon whether Theorem 3 or Theorem 4 is applicable. Choosing filters of order 2 one can ensure that the filter condition is always satisfied, but the other assumptions require more care. In particular, if there is no prior knowledge about the true value of H, one has to assume that the mesh size condition \(\alpha > \max \left( \frac{2H-1}{2-2H},\, \frac{1}{4-4H}\right) \) is satisfied. If a certain bound for H is known, say, if \(H\in \left( 0,\,\frac{3}{4}\right) \), then under the assumption that \(\alpha \ge 1\) (which includes the usual partition of \([0,\,1]\)) we obtain asymptotic normality. The following theorem summarizes our observations.

Theorem 6

Let X be a solution of the SDE (S2), and let the mesh size \(\Delta \) be chosen such that the assumptions of Theorem 3 or Theorem 4 (according to the value of H) are satisfied. Moreover, let \(a^{(i)}\) be filters of respective lengths \(p^{(i)}+1\) (for \(i=1,\dots , m\)) such that \(M(a^{(i)})> H+ \frac{1}{4}\). Then the sequences \(\sqrt{n} (\hat{H}_1-H)\) and \(\sqrt{n} (\hat{H}_2-H)\) converge weakly to normally distributed, centered random variables.

Proof

This is an application of the delta method (mentioned, for example, in [8]) for the vector \(V_n:=(V(a^{(i)},\,n,\,\Delta ,X))_{i=1,\dots ,m}\). Its almost sure convergence was shown in Remark 4 and the multivariate convergence in law of \(\sqrt{n}V_n\) follows due to Remark 3. Since in the construction of \(\hat{H}_1\) and \(\hat{H}_2\) this vector undergoes only linear and logarithmic transformations, the obtained estimators are indeed asymptotically normal. \(\square \)

5 Simulation Study

Similarly to [13] and [7] two filters \((a^{(1)}_i)_{i\in \{0,\dots ,p \}}\) and \((a^{(2)}_i)_{i\in \{0,\dots ,2p \}}\) are considered, where \(a^{(2)}\) is obtained by “thinning” the filter \(a^{(1)}\) (i.e., \(a^{(2)}_{2k}:= a^{(1)}_k\) for \(k\in \{0,\dots ,p \} \) and zero otherwise). In this case the estimator \(\hat{H}_1\) simplifies to

$$\begin{aligned} \hat{H}_1 =\frac{1}{2}\log _2 \left( \frac{U(a^{(2)}, n, \Delta , X)}{U(a^{(1)}, n, \Delta , X)} \right) {.} \end{aligned}$$

Note that this estimator is independent of the scaling: Its strong consistency has been shown for all \(\alpha >0\); hence, it is enough to know that the observations are equidistant, and the mesh size \(\Delta \) can then be chosen appropriately. It is, moreover, by construction independent of deterministic multiplicative scaling factors of the fBm involved, which allows us to include the (slightly more general) case

$$\begin{aligned} X_t = x + \int _0^t f(s, X_s) \hbox {d}s + \sigma B^H_t \end{aligned}$$
(21)

with some unknown \(\sigma >0\) and observed over an unknown interval in our simulation study. From these observations different settings for the simulations arise.
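
A sketch of this simplified estimator (ours, with hypothetical helper names; x again holds equidistant observations of X) is given below.

```python
import numpy as np

def thin(a):
    """Thinned filter: a2_{2k} = a_k and a2_i = 0 otherwise."""
    a = np.asarray(a, dtype=float)
    a2 = np.zeros(2 * len(a) - 1)
    a2[::2] = a
    return a2

def hurst_dilated(a1, x):
    """Simplified H_1 = 0.5 * log2(U(a^{(2)}) / U(a^{(1)})) with a^{(2)} the thinned filter."""
    a1 = np.asarray(a1, dtype=float)
    def U(a):
        p, n = len(a) - 1, len(x) - 1
        inc = np.array([a @ x[j:j + p + 1] for j in range(1, n - p + 1)])
        return np.sum(inc**2) / n
    return 0.5 * np.log2(U(thin(a1)) / U(a1))
```

Note that neither the mesh size \(\Delta \) nor the scale factor \(\sigma \) enters this computation, in line with the discussion above.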

We simulate 100 trajectories of a process X defined by

$$\begin{aligned} dX_t = \sin (X_t + t)dt+\sigma dB^H_t,\quad X_0=0. \end{aligned}$$
(22)

The simulations were made with the R package yuima. The implemented method for simulations uses the Euler–Maruyama scheme, defined by \(X^ {n}_{t}= X ^ {n}_{t_{k}} + f(t_{k}, X ^ {n}_{t_{k}}) (t-t_{k}) + (B ^ {H} _{t}- B ^ {H} _{t_{k}})\) for \(t_{k}\le t\le t_{k+1}\) with \(X ^ {n}_{0}=x\) (see, e.g., [6]). This constitutes an additional source of error. However, it follows from [12] that, if X satisfies (9), the bound

$$\begin{aligned} \sup _{0\le t\le T}{\mathbb {E}} [|X_t-X_t^n|^p]^{1/p}\le C \gamma _n^{-1} \end{aligned}$$

with

$$\begin{aligned} \gamma _n={\left\{ \begin{array}{ll} n^{2H-1/2} &{} \text { if }\frac{1}{2}<H<\frac{3}{4},\\ \frac{n}{\sqrt{\log (n)}} &{} \text { if } H=\frac{3}{4},\\ n &{} \text { if }\frac{3}{4}<H<1 \end{array}\right. } \end{aligned}$$

holds for all \(p\ge 1\).

Note that SDE (22) satisfies the assumptions of the paper [12]. By a Borel–Cantelli argument given in [15] we can conclude that for all \(\varepsilon >0\) there exist random variables \(\xi _{H,\,\varepsilon }\) such that

$$\begin{aligned} \Vert X-X^n\Vert _{\infty }\le \xi _{H,\,\varepsilon }n^{-\beta + \varepsilon } \end{aligned}$$

with \(\beta = 2H-\frac{1}{2}\) for \(\frac{1}{2}<H<\frac{3}{4}\) and \(\beta = 1\) for \(\frac{3}{4}\le H<1\).
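
The paper’s simulations rely on the R package yuima; purely for illustration, a self-contained NumPy sketch of the Euler–Maruyama scheme for (22) (with the fBm generated by a Cholesky factorization of the covariance (2); all helper names are ours) could look as follows.

```python
import numpy as np

def simulate_sde(H, n, sigma=1.0, T=1.0, x0=0.0, seed=None):
    """Euler-Maruyama approximation of dX_t = sin(X_t + t) dt + sigma dB^H_t, X_0 = x0, cf. (22)."""
    rng = np.random.default_rng(seed)
    dt = T / n
    t = np.linspace(dt, T, n)
    # fBm on the grid via a Cholesky factorization of the covariance (2)
    C = 0.5 * (t[:, None]**(2 * H) + t[None, :]**(2 * H) - np.abs(t[:, None] - t[None, :])**(2 * H))
    B = np.concatenate(([0.0], np.linalg.cholesky(C + 1e-12 * np.eye(n)) @ rng.standard_normal(n)))
    X = np.empty(n + 1)
    X[0] = x0
    for k in range(n):
        X[k + 1] = X[k] + np.sin(X[k] + k * dt) * dt + sigma * (B[k + 1] - B[k])
    return X

x = simulate_sde(H=0.7, n=2000, seed=1)
```

Trajectories of this kind can then be fed into the estimators sketched in Sect. 4.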

Let \(\hat{H}\) and \(\hat{H}_{1}\) be the estimators given by (19) and (20), respectively. For their empirical mean squared errors we can write

$$\begin{aligned} \frac{1}{m}\sum _{i=1}^m (\hat{H}(X_i)-H)^2&\le 2\frac{1}{m}\sum _{i=1}^m (\hat{H}(X_i)-\hat{H}(X_i^n))^2+ 2\frac{1}{m}\sum _{i=1}^m (\hat{H}(X_i^n)-H)^2,\\ \frac{1}{m}\sum _{i=1}^m (\hat{H}_1(X_i)-H)^2&\le 2\frac{1}{m}\sum _{i=1}^m (\hat{H}_1(X_i)-\hat{H}_1(X_i^n))^2+ 2\frac{1}{m}\sum _{i=1}^m (\hat{H}_1(X_i^n)-H)^2 \end{aligned}$$

for m independent copies of the process X and their respective approximations. We also have

$$\begin{aligned} \frac{1}{m}\sum _{i=1}^m (\hat{H}(X^n_i)-H)^2&\le 2\frac{1}{m}\sum _{i=1}^m (\hat{H}(X_i)-\hat{H}(X_i^n))^2+ 2\frac{1}{m}\sum _{i=1}^m (\hat{H}(X_i)-H)^2,\\ \frac{1}{m}\sum _{i=1}^m (\hat{H}_1(X_i^n)-H)^2&\le 2\frac{1}{m}\sum _{i=1}^m (\hat{H}_1(X_i)-\hat{H}_1(X_i^n))^2+ 2\frac{1}{m}\sum _{i=1}^m (\hat{H}_1(X_i)-H)^2. \end{aligned}$$

For the simulation error we note that the functions \(\hat{H}\), \(\hat{H}_1\) are linear combinations of logarithms of terms \(nU(a,\, n,\, \Delta ,\, X)\) and are, therefore, Lipschitz continuous on intervals \([c,\infty )\) for any positive c. Since the terms \(nU(a,\, n,\, \Delta ,\, X)\) are growing, the simulation error converges to zero at the rate \(n^{2\beta - 2\varepsilon }\) (up to a constant) for \(\beta \) as above, which is faster than the rate n for all \(H\in (1/2,\,1)\). Seeing that this is also faster than the second-order speed of convergence of our estimators, the simulation error can be considered asymptotically negligible as long as \(H\in (1/2 ,\, 1)\) is considered, which gives us a reason to work with the approximated process. Indeed, the empirical mean squared errors of our simulations exhibit n as their rate of convergence although a simulation error is present.

Some trajectories are displayed in Fig. 1 for different values of H.

We calculate the (empirical) mean squared error for 1000, 2000, 4000 and 8000 observations of the approximation process in the following settings:

  (S1) \(H=0.7\) on an interval \([0,\,1]\) with the “standard” estimator \(\hat{H}\), \(\sigma = 1\),

  (S2) \(H=0.7\) on an interval \([0,\,1]\) with \(\hat{H}_1\) for \(a^{(1)}:=(-1,\,1)\), \(\sigma = 1\),

  (S3) \(H=0.7\) on an interval \([0,\,1]\) with \(\hat{H}_1\) for \(a^{(1)}:=(-1,\,1)\), \(\sigma = 5\),

  (S4) \(H=0.7\) on an interval \([0,\,10]\) with \(\hat{H}_1\) for \(a^{(1)}:=(-1,\,1)\), \(\sigma = 1\),

  (S5) \(H=0.98\) on an interval \([0,\,1]\) with the “standard” estimator \(\hat{H}\), \(\sigma = 1\),

  (S6) \(H=0.98\) on an interval \([0,\,1]\) with \(\hat{H}_1\) for \(a^{(1)}:=(1,\,-2,\,1)\), \(\sigma = 1\),

  (S7) \(H=0.98\) on an interval \([0,\,10]\) with \(\hat{H}_1\) for \(a^{(1)}:=(1,\,-2,\,1)\), \(\sigma = 1\).

We obtain the following results:

Fig. 1: Simulated paths of SDE (22) for \(H=0.3\), 0.5 and 0.8, respectively.

Fig. 2: A boxplot diagram for \(\hat{H}\) in the settings (S5) and (S6).

| n    | (S1)                     | (S2)                     | (S3)                    | (S4)                     | (S5)   | (S6)    | (S7)    |
|------|--------------------------|--------------------------|-------------------------|--------------------------|--------|---------|---------|
| 1000 | \(2.23 \times 10^{-5}\)  | 0.00051                  | 0.0004                  | 0.0005                   | 0.004  | 0.001   | 0.00085 |
| 2000 | \(9.27 \times 10^{-6}\)  | 0.00027                  | 0.0002                  | 0.00025                  | 0.003  | 0.00074 | 0.00054 |
| 4000 | \(4.31 \times 10^{-6}\)  | 0.00014                  | 0.00011                 | 0.00012                  | 0.003  | 0.00034 | 0.00025 |
| 8000 | \(1.82 \times 10^{-6}\)  | \(7.67 \times 10^{-5}\)  | \(4.8 \times 10^{-5}\)  | \(5.95 \times 10^{-5}\)  | 0.0025 | 0.00014 | 0.00015 |

Indeed, one can see that the estimators perform well for a broad range of conditions. For \(H<\frac{3}{4}\), in the case of known \(\sigma \) and on the unit interval, the simplest estimator \(\hat{H}\) seems to perform best. However, for large H its convergence becomes slower compared to that of \(\hat{H}_1\) with the order-2 filter. It is a known result (see, e.g., [5]) that for the filter \((-1,\,1)\) the asymptotic rates of convergence for the quadratic variation of an fBm are \(\sqrt{n/\log (n)}\) in the case \(H=\frac{3}{4}\) and \(n^{2-2H}\) for \(H>\frac{3}{4}\). This constraint seems to transfer to the process X.

Figure 2 provides a visualization of the difference in the convergence rates. The boxplot diagrams are obtained by evaluating \(\hat{H}\) for 100 trajectories. They are read in the following way: The upper and lower whiskers indicate the range of the obtained data, the boxes are bounded by the first and third quartiles, and the line within each box marks the median.