Abstract
We consider adaptive maximum-likelihood-type estimators and adaptive Bayes-type estimators for discretely observed ergodic diffusion processes with observation noise whose variance is constant. The quasi-likelihood functions for the diffusion and drift parameters are introduced, and polynomial-type large deviation inequalities for those quasi-likelihoods are shown in order to establish the asymptotic properties of the adaptive Bayes-type estimators and the convergence of moments of both the adaptive maximum-likelihood-type estimators and the adaptive Bayes-type estimators.
1 Introduction
We consider a d-dimensional ergodic diffusion process defined by the following stochastic differential equation:
$$\begin{aligned} {\mathrm {d}}X_{t}=b\left( X_{t},\beta \right) {\mathrm {d}}t+a\left( X_{t},\alpha \right) {\mathrm {d}}w_{t},\quad X_{0}=x_{0}, \end{aligned}$$
where \(\left\{ w_{t}\right\} _{t\ge 0}\) is an r-dimensional Wiener process, \(x_{0}\) is a random variable independent of \(\left\{ w_{t}\right\} _{t\ge 0}\), \(\alpha \in \Theta _{1}\) and \(\beta \in \Theta _{2}\) are unknown parameters, \(\Theta _{1}\subset \mathbf {R}^{m_{1}}\) and \(\Theta _{2}\subset \mathbf {R}^{m_{2}}\) are bounded, open, and convex sets in \(\mathbf {R}^{m_{i}}\) admitting Sobolev’s inequalities for embedding \(W^{1,p}\left( \Theta _{i}\right) \hookrightarrow C\left( \overline{\Theta }_{i}\right) \) for \(i=1,2\), \(\theta ^{\star }=\left( \alpha ^{\star },\beta ^{\star }\right) \) is the true value of the parameter, and \(a:\mathbf {R}^{d}\times \Theta _{1}\rightarrow \mathbf {R}^{d}\otimes \mathbf {R}^{r}\) and \(b:\mathbf {R}^{d}\times \Theta _{2}\rightarrow \mathbf {R}^{d}\) are known functions.
A matter of interest is to estimate the parameter \(\theta =\left( \alpha ,\beta \right) \) with partial and indirect observation of \(\left\{ X_{t}\right\} _{t\ge 0}\): the observation is discretised and contaminated by exogenous noise. The observation sequence \(\left\{ Y_{ih_{n}}\right\} _{i=0,\ldots ,n}\), on which our parametric estimation is based, is defined as
$$\begin{aligned} Y_{ih_{n}}=X_{ih_{n}}+\Lambda ^{1/2}\varepsilon _{ih_{n}},\quad i=0,\ldots ,n, \end{aligned}$$
where \(h_{n}>0\) is the discretisation step such that \(h_{n}\rightarrow 0\) and \(T_{n}=nh_{n}\rightarrow \infty \), \(\left\{ \varepsilon _{ih_{n}}\right\} _{i=0,\ldots ,n}\) is an i.i.d. sequence of random variables independent of \(\left\{ w_{t}\right\} _{t\ge 0}\) and \(x_0\) such that \(\mathbf {E}_{\theta ^{\star }}\left[ \varepsilon _{ih_{n}}\right] =0\) and \(\mathrm {Var}_{\theta ^{\star }}\left( \varepsilon _{ih_{n}}\right) =I_{d}\), where \(I_{m}\) is the identity matrix in \(\mathbf {R}^{m}\otimes \mathbf {R}^{m}\) for every \(m\in \mathbf {N}\), and \(\Lambda \in \mathbf {R}^{d}\otimes \mathbf {R}^{d}\) is a positive semi-definite matrix which is the variance of the noise. We also assume that the half vectorisation of \(\Lambda \) has a bounded, open, and convex parameter space \(\Theta _{\varepsilon }\), and we write \(\Xi :=\Theta _{\varepsilon }\times \Theta _{1}\times \Theta _{2}\). We denote the true value of \(\Lambda \) by \(\Lambda _{\star }\), its half vectorisation by \(\theta _{\varepsilon }^{\star }=\mathrm {vech}\Lambda _{\star }\), and set \(\vartheta ^{\star }=\left( \theta _{\varepsilon }^{\star },\alpha ^{\star },\beta ^{\star }\right) \). That is to say, our interest is parametric inference for an ergodic diffusion under long-term and high-frequency noisy observation. A concrete example is the wind velocity data provided by NWTC Information Portal (2018), for which the test for noise detection (Nakakita and Uchida 2019a) indicates statistically significant contamination by exogenous noise (Fig. 1).
In the existing literature, Nakakita and Uchida (2019a) propose the following estimators \(\hat{\Lambda }_{n}\), \(\hat{\alpha }_{n}\), and \(\hat{\beta }_{n}\) such that
where for every matrix A, \(A^\mathrm{{T}}\) is the transpose of A and \(A^{\otimes 2}=AA^\mathrm{{T}}\), \(\mathbb {H}_{1,n}^{\tau }\) and \(\mathbb {H}_{2,n}\) are the adaptive quasi-likelihood functions of \(\alpha \) and \(\beta \), respectively, defined in Sect. 3, and \(\tau \in \left( 1,2\right] \) is a tuning parameter. Nakakita and Uchida (2019a) show that these estimators are asymptotically normal and, in particular, that the drift estimator is asymptotically efficient. To obtain the convergence rates of the estimators, it is necessary to examine the asymptotic properties of the quasi-likelihood functions. Both of them are functions of the local means of the observation defined as
where \(k_{n}\) is the number of partitions given for the observation, \(p_{n}\) is the number of observations in each partition, and \(\Delta _{n}=p_{n}h_{n}\) is the time length of each partition; note that these parameters satisfy \(k_{n}\rightarrow \infty \), \(p_{n}\rightarrow \infty \) and \(\Delta _{n}\rightarrow 0\). Intuitively speaking, \(k_{n}\) and \(\Delta _{n}\) correspond to n and \(h_{n}\) in the observation scheme without exogenous noise, and the divergence of \(p_{n}\) eliminates the influence of the noise by the law of large numbers. Hence, we obtain asymptotic normality with the convergence rates \(\sqrt{k_{n}}\) and \(\sqrt{T_{n}}\) for \(\alpha \) and \(\beta \), respectively; that is,
where \(\xi \) is a zero-mean \(\left( m_{1}+m_{2}\right) \)-dimensional Gaussian random variable.
The statistical inference for diffusion processes with discretised observation has been investigated for several decades; see Florens-Zmirou (1989), Yoshida (1992), Bibby and Sørensen (1995), and Kessler (1995, 1997). In practice, it is necessary to ask whether exogenous noise exists in the observation, and it has been pointed out that observational noise, known as microstructure noise, certainly exists in high-frequency financial data, one of the major disciplines where statistics for diffusion processes is applied. Inference for diffusions under noisy and discretised observation on the fixed time interval [0, 1] is discussed by Jacod et al. (2009). Favetto (2014, 2016) examines the same model as our study and shows that simultaneous maximum-likelihood-type (ML-type) estimation has consistency when the variance of the noise is unknown and asymptotic normality when the variance is known. As mentioned above, Nakakita and Uchida (2019a) propose adaptive ML-type estimation, which has asymptotic normality even if the variance of the noise is unknown, and a test for noise detection, which shows that a real data set (NWTC Information Portal 2018) is contaminated by observational noise.
Our study aims to obtain polynomial-type large deviation inequalities for statistical random fields and to construct the adaptive Bayes-type estimators of both drift and diffusion parameters. Moreover, it is shown that both the adaptive ML-type estimators proposed in Nakakita and Uchida (2019a) and the adaptive Bayes-type estimators have not only asymptotic normality but also a certain type of convergence of moments. It is well known that asymptotic normality is one of the desirable properties that estimators are expected to have; for instance, Nakakita and Uchida (2019b) utilise this result to compose likelihood-ratio-type statistics and related ones for parametric tests and prove convergence in distribution to a \(\chi ^2\)-distribution under the null hypothesis and consistency of the test under the alternative. However, asymptotic normality alone is not sufficient for arguments requiring convergence of moments, such as information criteria. In concrete terms, it is necessary to show the convergence of moments such that for every \(f\in C\left( \mathbf {R}^{m_{1}}\times \mathbf {R}^{m_{2}}\right) \) with at most polynomial growth and the adaptive ML-type estimators \(\hat{\alpha }_{n}\) and \(\hat{\beta }_{n}\),
This is a stronger property than mere asymptotic normality: if we take f to be a bounded and continuous function, asymptotic normality follows immediately.
To establish the asymptotic properties of the adaptive Bayes-type estimators and the convergence of moments for the adaptive ML-type estimators, we can utilise the polynomial-type large deviation inequalities (PLDI) and quasi-likelihood analysis (QLA) proposed by Yoshida (2011), which have been widely used to discuss the asymptotic properties of Bayes-type estimators and the convergence of moments of both ML-type and Bayes-type estimators in statistical inference for continuous-time stochastic processes. This approach is developed from the exponential-type large deviation and likelihood analysis introduced by Ibragimov and Has’minskii (1972, 1973, 1981) and Kutoyants (1984, 1994, 2004) for continuously observed stochastic processes. Yoshida (2011) also discusses the asymptotic properties of the simultaneous and adaptive Bayes-type estimators and the convergence of moments for both the adaptive ML-type and Bayes-type estimation for ergodic diffusions with \(nh_{n}\rightarrow \infty \) and \(nh_{n}^{2}\rightarrow 0\). Uchida and Yoshida (2012, 2014) examine the same problem for adaptive ML-type and adaptive Bayes-type estimation of ergodic diffusions under the more relaxed condition \(nh_{n}\rightarrow \infty \) and \(nh_{n}^{p}\rightarrow 0\) for some \(p\ge 2\). Ogihara and Yoshida (2011) study the asymptotic properties of the adaptive Bayes-type estimators and the convergence of moments for both the ML-type and Bayes-type estimators of ergodic jump-diffusion processes in the scheme \(nh_{n}\rightarrow \infty \) and \(nh_{n}^{2}\rightarrow 0\). For Bayes-type estimation for diffusion-type processes in a general setting, see Ogihara (2018, 2019). Clinet and Yoshida (2017) show the PLDI for the quasi-likelihood function for ergodic point processes and the convergence of moments for the corresponding ML-type and Bayes-type estimators.
From the viewpoint of computational statistics, it is crucial to obtain Bayes-type estimators for diffusion-type processes from high-frequency data. In particular, Bayesian methods work well for the estimation of unknown nonlinear parameters based on multimodal quasi-likelihood functions for diffusion-type processes. To obtain the ML-type estimators for diffusion-type processes, hybrid-type estimators with initial Bayes-type estimators are studied by Kamatani and Uchida (2015) and Kaino and Uchida (2018a, b). For the convergence of moments for Z-estimators of ergodic diffusion processes, see Negri and Nishiyama (2017). As applications of the convergence of moments for estimators of stochastic differential equations, Uchida (2010) constructs an AIC-type information criterion for ergodic diffusion processes, and Eguchi and Masuda (2018) propose a BIC-type one for local asymptotic quadratic statistical experiments including some schemes for diffusion processes. For the convergence of moments of regularized estimators for a discretely observed ergodic diffusion process, see Masuda and Shimizu (2017).
The paper is organised as follows: in Sect. 2, we set up notation and assumptions; Sect. 3 proposes the adaptive Bayes-type estimators of both drift and diffusion parameters and gives the main results of our study, namely, the QLA for our ergodic diffusion plus noise model and the convergence of moments of both the adaptive ML-type estimators and the adaptive Bayes-type estimators for the model; Sect. 4 studies a concrete example of the adaptive Bayes-type estimators for an ergodic diffusion plus noise model and shows the results of a computational simulation; Sect. 5 provides the technical proofs of the main results in Sect. 3.
2 Notation and assumption
We set the following notations.
-
For every matrix A, \(A^\mathrm{{T}}\) is the transpose of A, and \(A^{\otimes 2}:=AA^\mathrm{{T}}\).
-
For every set of matrices A and B whose dimensions coincide, \(A\left[ B\right] :=\mathrm {tr}\left( AB^\mathrm{{T}}\right) \). Moreover, for any \(m\in \mathbf {N}\), \(A\in \mathbf {R}^{m}\otimes \mathbf {R}^{m}\) and \(u,v\in \mathbf {R}^{m}\), \(A\left[ u,v\right] :=v^\mathrm{{T}}Au\).
-
Let us denote the \(\ell \)th element of any vector v as \(v^{\left( \ell \right) }\) and \(\left( \ell _{1},\ell _{2}\right) \)th one of any matrix A as \(A^{\left( \ell _{1},\ell _{2}\right) }\).
-
For any vector v and any matrix A, \(\left| v\right| :=\sqrt{\mathrm {tr}\left( v^\mathrm{{T}}v\right) }\) and \(\left\| A\right\| :=\sqrt{\mathrm {tr}\left( A^\mathrm{{T}}A\right) }\).
-
For every \(p>0\), \(\left\| \cdot \right\| _{p}\) is the \(L^{p}\left( P_{\theta ^{\star }}\right) \)-norm.
-
\(A\left( x,\alpha \right) :=a\left( x,\alpha \right) ^{\otimes 2}\), \(a\left( x\right) :=a\left( x,\alpha ^{\star }\right) \), \(A\left( x\right) :=A\left( x,\alpha ^{\star }\right) \) and \(b\left( x\right) :=b\left( x,\beta ^{\star }\right) \).
-
For given \(\tau \in \left( 1,2\right] \), \(p_{n}:=h_{n}^{-1/\tau }\), \(\Delta _{n}:=p_{n}h_{n}\), and \(k_{n}:=n/p_{n}\), and we define the sequence of local means such that
$$\begin{aligned} \bar{Z}_{j}=\frac{1}{p_{n}}\sum _{i=0}^{p_{n}-1}Z_{j\Delta _n+ih_{n}},\ j=0,\ldots ,k_{n}-1, \end{aligned}$$where \(\left\{ Z_{ih_{n}}\right\} _{i=0,\ldots ,n}\) indicates an arbitrary sequence defined on the mesh \(\left\{ ih_{n}\right\} _{i=0,\ldots ,n}\) such as \(\left\{ Y_{ih_{n}}\right\} _{i=0,\ldots ,n}\), \(\left\{ X_{ih_{n}}\right\} _{i=0,\ldots ,n}\) and \(\left\{ \varepsilon _{ih_{n}}\right\} _{i=0,\ldots ,n}\).
Remark 1
Since the observation is masked by the exogenous noise, it must be transformed to recover the underlying process \(\left\{ X_{t}\right\} _{t\ge 0}\). As illustrated by Nakakita and Uchida (2019a), the sequence \(\left\{ \bar{Y}_{j}\right\} _{j=0,\ldots ,k_{n}-1}\) extracts the state of the latent process \(\left\{ X_{t}\right\} _{t\ge 0}\) in the sense of Lemma 2.
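As a concrete illustration, the block-averaging that produces the local means \(\bar{Y}_{j}\) can be sketched in a few lines. This is our own minimal implementation (the function name and the toy data are ours, not the paper's):

```python
import numpy as np

def local_means(Y, p_n):
    """Compute local means Ybar_j = (1/p_n) * sum_{i=0}^{p_n-1} Y_{j*Delta_n + i*h_n}.

    Y   : array of shape (n+1, d), observations on the mesh {i*h_n}.
    p_n : number of observations averaged in each block.
    Returns an array of shape (k_n, d) with k_n = n // p_n.
    """
    n = Y.shape[0] - 1
    k_n = n // p_n
    # Drop trailing observations that do not fill a complete block,
    # then average within each block of length p_n.
    blocks = Y[:k_n * p_n].reshape(k_n, p_n, -1)
    return blocks.mean(axis=1)

# Toy usage: the i.i.d. noise is averaged out at rate 1/p_n.
rng = np.random.default_rng(0)
n, p_n, d = 10_000, 100, 2
X = np.cumsum(rng.normal(scale=0.01, size=(n + 1, d)), axis=0)  # latent path
Y = X + 0.1 * rng.normal(size=(n + 1, d))                       # noisy data
Ybar = local_means(Y, p_n)
assert Ybar.shape == (n // p_n, d)
```

Averaging within each block leaves the slowly varying signal essentially intact while shrinking the noise variance by the factor \(1/p_{n}\), which is exactly the mechanism described above.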
-
\(\mathcal {G}_{t}:=\sigma \left( x_{0},w_{s}:s\le t\right) \), \(\mathcal {G}_{j,i}^{n}:=\mathcal {G}_{j\Delta _{n}+ih_{n}}\), \(\mathcal {G}_{j}^{n}:=\mathcal {G}_{j,0}^{n}\), \(\mathcal {A}_{j,i}^{n}:=\sigma \left( \varepsilon _{\ell h_{n}}:\ell \le jp_{n}+i-1\right) \), \(\mathcal {A}_{j}^{n}:=\mathcal {A}_{j,0}^{n}\), \(\mathcal {H}_{j,i}^{n}:=\mathcal {G}_{j,i}^{n}\vee \mathcal {A}_{j,i}^{n}\) and \(\mathcal {H}_{j}^{n}:=\mathcal {H}_{j,0}^{n}\).
-
We define the real-valued function V as follows: for \(l_{1},l_{2},l_{3},l_{4}=1,\ldots ,d\),
$$\begin{aligned}&V\left( (l_1,l_2),(l_3,l_4)\right) \\&\quad :=\sum _{k=1}^{d}\left( \Lambda _{\star }^{1/2}\right) ^{(l_1,k)}\left( \Lambda _{\star }^{1/2}\right) ^{(l_2,k)}\left( \Lambda _{\star }^{1/2}\right) ^{(l_3,k)}\left( \Lambda _{\star }^{1/2}\right) ^{(l_4,k)} \left( \mathbf {E}_{\theta ^{\star }}\left[ \left| \varepsilon _{0}^{\left( k\right) }\right| ^4\right] -3\right) \\&\qquad +\frac{3}{2}\left( \Lambda _{\star }^{(l_1,l_3)}\Lambda _{\star }^{(l_2,l_4)}+\Lambda _{\star }^{(l_1,l_4)}\Lambda _{\star }^{(l_2,l_3)}\right) , \end{aligned}$$and, with the function \(\sigma \) defined for \(i=1,\ldots ,d\) and \(j=i,\ldots ,d\) by
$$\begin{aligned} \sigma \left( i,j\right) :={\left\{ \begin{array}{ll} j&{}\text { if }i=1,\\ \sum _{\ell =1}^{i-1}\left( d-\ell +1\right) +j-i+1 &{} \text { if }i>1, \end{array}\right. } \end{aligned}$$we define the matrix \(W_{1}\) componentwise for \(i_{1},i_{2}=1,\ldots ,d(d+1)/2\) by
$$\begin{aligned} W_{1}^{\left( i_{1},i_{2}\right) }:=V\left( \sigma ^{-1}\left( i_{1}\right) ,\sigma ^{-1}\left( i_{2}\right) \right) . \end{aligned}$$ -
Let
$$\begin{aligned}&\left\{ B_{\kappa }(x)\left| \kappa =1,\ldots ,m_1,\ B_{\kappa }=(B_{\kappa }^{(j_1,j_2)})_{j_1,j_2}\right. \right\} ,\\&\left\{ f_{\lambda }(x)\left| \lambda =1,\ldots ,m_2,\ f_{\lambda }=(f^{(1)}_{\lambda },\ldots ,f^{(d)}_{\lambda })\right. \right\} \end{aligned}$$be sequences of \(\mathbf {R}^d\otimes \mathbf {R}^d\)-valued functions and \(\mathbf {R}^d\)-valued ones, respectively, whose components and their derivatives with respect to x are of polynomial growth for all \(\kappa \) and \(\lambda \). Then, we define the following matrix-valued functionals, for \(\bar{B}_{\kappa }:=\frac{1}{2}\left( B_{\kappa }+B_{\kappa }^\mathrm{{T}}\right) \),
$$\begin{aligned}&\left( W_2^{(\tau )}\left( \left\{ B_{\kappa }:\kappa =1,\ldots ,m_{1}\right\} \right) \right) ^{(\kappa _1,\kappa _2)}\\&\qquad :={\left\{ \begin{array}{ll} \nu \left( \mathrm {tr}\left\{ \left( \bar{B}_{\kappa _1}A\bar{B}_{\kappa _2}A\right) (\cdot )\right\} \right) &{}\text { if }\tau \in (1,2),\\ \nu \left( \mathrm {tr}\left\{ \left( \bar{B}_{\kappa _1}A\bar{B}_{\kappa _2}A+4\bar{B}_{\kappa _1}A\bar{B}_{\kappa _2}\Lambda _{\star }+12\bar{B}_{\kappa _1}\Lambda _{\star }\bar{B}_{\kappa _2}\Lambda _{\star }\right) (\cdot )\right\} \right) &{}\text { if }\tau =2, \end{array}\right. }\\&\left( W_3\left( \left\{ f_{\lambda }:\lambda =1,\ldots ,m_{2}\right\} \right) \right) ^{(\lambda _1,\lambda _2)}\\&\qquad := \nu \left( \left( f_{\lambda _1}A\left( f_{\lambda _2}\right) ^\mathrm{{T}}\right) (\cdot )\right) , \end{aligned}$$where \(\nu =\nu _{\theta ^{\star }}\) is the invariant measure of \(X_{t}\) discussed in assumption [A1]-(iv), and for every function f on \(\mathbf {R}^{d}\), \(\nu \left( f\left( \cdot \right) \right) :=\int _{\mathbf {R}^{d}} f\left( x\right) \nu \left( {\mathrm {d}}x\right) \).
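The index map \(\sigma \) pairs each upper-triangular entry (i, j) with a coordinate of the half vectorisation. A minimal Python sketch (the function name is ours) checks that it is a bijection onto \(\{1,\ldots ,d(d+1)/2\}\), which is what the definition of \(W_{1}\) requires:

```python
def sigma(i, j, d):
    """vech index of the (i, j) entry (1-based, j >= i) of a symmetric d x d
    matrix, enumerating the upper triangle row by row as in the display above."""
    if i == 1:
        return j
    return sum(d - l + 1 for l in range(1, i)) + j - i + 1

# sigma maps {(i, j) : 1 <= i <= j <= d} bijectively onto {1, ..., d(d+1)/2}.
d = 3
values = [sigma(i, j, d) for i in range(1, d + 1) for j in range(i, d + 1)]
assert sorted(values) == list(range(1, d * (d + 1) // 2 + 1))
```

For instance, with \(d=2\) the map gives \(\sigma (1,1)=1\), \(\sigma (1,2)=2\), \(\sigma (2,2)=3\), matching the usual ordering of \(\mathrm {vech}\).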
With respect to \(X_{t}\), we assume the following conditions.
-
[A1]
-
(i)
\(\inf _{x,\alpha }\det A\left( x,\alpha \right) >0\).
-
(ii)
There exists a constant C such that for all \(x_{1},x_{2}\in \mathbf {R}^{d}\),
$$\begin{aligned} \sup _{\alpha \in \Theta _{1}}\left\| a\left( x_{1},\alpha \right) -a\left( x_{2},\alpha \right) \right\| + \sup _{\beta \in \Theta _{2}}\left| b\left( x_{1},\beta \right) -b\left( x_{2},\beta \right) \right| \le C\left| x_{1}-x_{2}\right| \end{aligned}$$ -
(iii)
For all \(p\ge 0\), \(\sup _{t\ge 0}\mathbf {E}_{\theta ^{\star }}\left[ \left| X_{t}\right| ^{p}\right] <\infty \).
-
(iv)
There exists a unique invariant measure \(\nu =\nu _{0}\) on \(\left( \mathbf {R}^{d},\mathcal {B}\left( \mathbf {R}^{d}\right) \right) \) and for all \(p\ge 1\) and \(f\in L^{p}\left( \nu \right) \) with polynomial growth,
$$\begin{aligned} \frac{1}{T}\int _{0}^{T}f\left( X_{t}\right) {\mathrm {d}}t\rightarrow ^{P}\int _{\mathbf {R}^{d}}f\left( x\right) \nu \left( {\mathrm {d}}x\right) . \end{aligned}$$ -
(v)
For any polynomial growth function \(g:\mathbf {R}^{d}\rightarrow \mathbf {R}\) satisfying \(\int _{\mathbf {R}^{d}}g\left( x\right) \nu \left( {\mathrm {d}}x\right) = 0\), there exist G(x), \(\partial _{x^{\left( i\right) }}G(x)\) with at most polynomial growth for \(i=1,\ldots ,d\) such that for all \(x\in \mathbf {R}^{d}\),
$$\begin{aligned} L_{\theta ^{\star }}G\left( x\right) =-g\left( x\right) , \end{aligned}$$where \(L_{\theta ^{\star }}\) is the infinitesimal generator of \(X_{t}\).
Remark 2
Pardoux and Veretennikov (2001) show a sufficient condition for [A1]-(v). Uchida and Yoshida (2012) also give a sufficient condition for [A1]-(iii)–(v) under [A1]-(i)–(ii), \(\sup _{x,\alpha }A\left( x,\alpha \right) <\infty \), and the existence of \(c_{0}>0\), \(M_{0}>0\), and \(\gamma \ge 0\) such that for all \(\beta \in \Theta _{2}\) and \(x\in \mathbf {R}^{d}\) satisfying \(\left| x\right| \ge M_{0}\),
-
[A2]
There exists \(C>0\) such that \(a:\mathbf {R}^{d}\times \Theta _{1}\rightarrow \mathbf {R}^{d}\otimes \mathbf {R}^{r}\) and \(b:\mathbf {R}^{d}\times \Theta _{2}\rightarrow \mathbf {R}^{d}\) have continuous derivatives satisfying
$$\begin{aligned} \sup _{\alpha \in \Theta _{1}}\left| \partial _{x}^{j}\partial _{\alpha }^{i}a\left( x,\alpha \right) \right|&\le C\left( 1+\left| x\right| \right) ^{C},\ 0\le i\le 4,\ 0\le j\le 2,\\ \sup _{\beta \in \Theta _{2}}\left| \partial _{x}^{j}\partial _{\beta }^{i}b\left( x,\beta \right) \right|&\le C\left( 1+\left| x\right| \right) ^{C},\ 0\le i\le 4,\ 0\le j\le 2. \end{aligned}$$
With the invariant measure \(\nu \), we define
where \(A^{\tau }\left( x,\alpha ,\Lambda \right) :=A\left( x,\alpha \right) +3\Lambda \mathbf {1}_{\left\{ 2\right\} }\left( \tau \right) \). For these functions, let us assume the following identifiability conditions hold.
-
[A3]
For all \(\tau \in \left( 1,2\right] \), there exists a constant \(\chi \left( \theta ^{\star }\right) >0\) such that \(\mathbb {Y}_{1}^{\tau }\left( \alpha ;\theta ^{\star }\right) \le -\chi \left( \theta ^{\star }\right) \left| \alpha -\alpha ^{\star }\right| ^{2}\) for all \(\alpha \in \Theta _{1}\).
-
[A4]
For all \(\tau \in \left( 1,2\right] \), there exists a constant \(\chi '\left( \theta ^{\star }\right) >0\) such that \(\mathbb {Y}_{2}\left( \beta ;\theta ^{\star }\right) \le -\chi '\left( \theta ^{\star }\right) \left| \beta -\beta ^{\star }\right| ^{2}\) for all \(\beta \in \Theta _{2}\).
The next assumption is with respect to the moments of noise.
-
[A5]
For any \(k > 0\), \(\varepsilon _{ih_{n}}\) has a kth moment, and the components of \(\varepsilon _{ih_{n}}\) are independent of one another, of \(\left\{ w_{t}\right\} _{t\ge 0}\), and of \(x_{0}\) for all i. In addition, for every odd integer k, \(i=0,\ldots ,n\), \(n\in \mathbf {N}\), and \(\ell =1,\ldots ,d\), \(\mathbf {E}_{\theta ^{\star }}\left[ \left( \varepsilon _{ih_{n}}^{\left( \ell \right) }\right) ^{k}\right] =0\), and \(\mathbf {E}_{\theta ^{\star }}\left[ \varepsilon _{ih_{n}}^{\otimes 2}\right] =I_{d}\).
The following assumption determines the balance of convergence or divergence of the several rate parameters. Note that \(\tau \) is a tuning parameter, and hence we can choose it arbitrarily within \(\left( 1,2\right] \).
-
[A6]
\(p_{n}=h_{n}^{-1/\tau }\), \(\tau \in \left( 1,2\right] \), \(h_{n}\rightarrow 0\), \(T_{n}=nh_{n}\rightarrow \infty \), \(k_{n}=n/p_{n}\rightarrow \infty \), \(k_{n}\Delta _{n}^{2}\rightarrow 0\) for \(\Delta _{n}:=p_{n}h_{n}\). Furthermore, there exists \(\epsilon _{0}>0\) such that \(nh_{n}\ge k_{n}^{\epsilon _{0}}\) for sufficiently large n.
Remark 3
The tuning parameter \(\tau \) controls the number of local means and the number of observations used for each local mean. If we set \(\tau \) close to 2, we can enlarge \(\sqrt{k_{n}}\), which is the convergence rate of the adaptive maximum-likelihood-type estimator \(\hat{\alpha }_{n}\) and the adaptive Bayes-type one \(\tilde{\alpha }_{n}\) defined later. Note that estimation with larger values of \(\tau \) does not necessarily outperform that with smaller values if the noise variance is large (e.g. see the simulation study in Nakakita and Uchida 2019a).
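To make the tradeoff concrete, the scheme quantities in [A6] can be computed directly. The helper below is ours; with the values \(n=10^{6}\) and \(h_{n}=6.309573\times 10^{-5}\) of Sect. 4, setting \(\tau =2\) gives \(p_{n}\approx 126\) and \(k_{n}\approx 7.9\times 10^{3}\), close (after integer rounding) to the \(p_{n}=125\), \(k_{n}=8000\) used there:

```python
def scheme(n, h_n, tau):
    """Sampling-scheme quantities from [A6]: p_n = h_n^(-1/tau),
    k_n = n / p_n, Delta_n = p_n * h_n, T_n = n * h_n."""
    p_n = h_n ** (-1.0 / tau)
    k_n = n / p_n
    Delta_n = p_n * h_n
    return p_n, k_n, Delta_n, n * h_n

n, h_n = 10**6, 6.309573e-5
for tau in (1.5, 1.8, 2.0):
    p_n, k_n, Delta_n, T_n = scheme(n, h_n, tau)
    # larger tau -> smaller p_n, larger k_n, hence a faster rate sqrt(k_n)
    print(f"tau={tau}: p_n={p_n:.0f}, k_n={k_n:.0f}, sqrt(k_n)={k_n**0.5:.1f}")
```

The printout makes visible that increasing \(\tau \) shrinks each averaging block and increases the number of blocks, i.e. improves the rate for \(\alpha \) at the cost of less noise reduction per block.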
Remark 4
Let \(\epsilon _{1}=\epsilon _{0}/2\) and let \(f\in \mathcal {C}^{1,1}\left( \mathbf {R}^{d}\times \Xi \right) \) be such that f and the components of its derivatives are of polynomial growth with respect to x uniformly in \(\vartheta \in \Xi \). Then, the discussion in Uchida (2010) verifies that under [A1] and [A6], for all \(M>0\),
3 Quasi-likelihood analysis
First of all, we introduce and analyse the quasi-likelihood functions and estimators defined in Nakakita and Uchida (2019a). The quasi-likelihood functions for the diffusion parameter \(\alpha \) and the drift parameter \(\beta \) based on the local means are as follows:
where \(A_{n}^{\tau }\left( x,\alpha ,\Lambda \right) :=A\left( x,\alpha \right) +3\Delta _{n}^{\frac{2-\tau }{\tau -1}}\Lambda \). We set the adaptive ML-type estimators \(\hat{\Lambda }_{n}\), \(\hat{\alpha }_{n}\), and \(\hat{\beta }_{n}\) such that
Assume that the prior densities \(\pi _{\ell }\), \(\ell =1,2\), are continuous and satisfy \(0<\inf _{\theta _{\ell }\in \Theta _{\ell }}\pi _{\ell }\left( \theta _{\ell }\right)<\sup _{\theta _{\ell }\in \Theta _{\ell }}\pi _{\ell }\left( \theta _{\ell }\right) <\infty \), and define the adaptive Bayes-type estimators
Our purpose is to show the polynomial-type large deviation inequalities for the quasi-likelihood functions defined above in the framework introduced by Yoshida (2011), and the convergences of moments for the adaptive ML-type estimators and the Bayes-type estimators as applications of them. Note that the asymptotic properties of the Bayes-type estimators are shown by using the polynomial-type large deviation inequality. Let us define the following statistical random fields for \(u_{1}\in \mathbf {R}^{m_{1}}\) and \(u_{2}\in \mathbf {R}^{m_{2}}\):
and some sets
and for \(r\ge 0\),
We use the same notation as Nakakita and Uchida (2019a) for the information matrices
where for \(i_1,i_2\in \left\{ 1,\ldots ,m_1\right\} \),
and for \(j_1,j_2\in \left\{ 1,\ldots ,m_2\right\} \),
We also denote \(\hat{\theta }_{\varepsilon ,n}:=\mathrm {vech}\hat{\Lambda }_{n}\) and \(\theta _{\varepsilon }^{\star }:=\mathrm {vech}\Lambda _{\star }\).
Theorem 1
Under [A1]–[A6], we have the following results.
-
1.
The polynomial-type large deviation inequalities hold: for all \(L>0\), there exists a constant \(C\left( L\right) \) such that for all \(r>0\),
$$\begin{aligned} P_{\theta ^{\star }}\left[ \sup _{u_{1}\in V_{1,n}^{\tau }\left( r,\alpha ^{\star }\right) }\mathbb {Z}_{1,n}^{\tau }\left( u_{1};\hat{\Lambda }_{n},\alpha ^{\star }\right) \ge e^{-r}\right]&\le \frac{C\left( L\right) }{r^{L}},\\ P_{\theta ^{\star }}\left[ \sup _{u_{2}\in V_{2,n}\left( r,\beta ^{\star }\right) }\mathbb {Z}_{2,n}^{\mathrm {ML}}\left( u_{2};\hat{\alpha }_{n},\beta ^{\star }\right) \ge e^{-r}\right]&\le \frac{C\left( L\right) }{r^{L}},\\ P_{\theta ^{\star }}\left[ \sup _{u_{2}\in V_{2,n}\left( r,\beta ^{\star }\right) }\mathbb {Z}_{2,n}^{\mathrm {Bayes}}\left( u_{2};\tilde{\alpha }_{n},\beta ^{\star }\right) \ge e^{-r}\right]&\le \frac{C\left( L\right) }{r^{L}}. \end{aligned}$$ -
2.
The convergences of moments hold:
$$\begin{aligned} \mathbf {E}_{\theta ^{\star }}\left[ f\left( \sqrt{n}\left( \hat{\theta }_{\varepsilon ,n}-\theta _{\varepsilon }^{\star }\right) , \sqrt{k_{n}}\left( \hat{\alpha }_{n}-\alpha ^{\star }\right) , \sqrt{T_{n}}\left( \hat{\beta }_{n}-\beta ^{\star }\right) \right) \right]&\rightarrow \mathbb {E}\left[ f\left( \zeta _{0},\zeta _{1},\zeta _{2}\right) \right] ,\\ \mathbf {E}_{\theta ^{\star }}\left[ f\left( \sqrt{n}\left( \hat{\theta }_{\varepsilon ,n}-\theta _{\varepsilon }^{\star }\right) ,\sqrt{k_{n}}\left( \tilde{\alpha }_{n}-\alpha ^{\star }\right) , \sqrt{T_{n}}\left( \tilde{\beta }_{n}-\beta ^{\star }\right) \right) \right]&\rightarrow \mathbb {E}\left[ f\left( \zeta _{0},\zeta _{1},\zeta _{2}\right) \right] , \end{aligned}$$where
$$\begin{aligned} \left( \zeta _{0},\zeta _{1},\zeta _{2}\right) \sim N_{d\left( d+1\right) /2+m_{1}+m_{2}}\left( \mathbf {0},\left( \mathcal {J}^{\tau }\left( \vartheta ^{\star }\right) \right) ^{-1}\left( \mathcal {I}^{\tau }\left( \vartheta ^{\star }\right) \right) \left( \mathcal {J}^{\tau }\left( \vartheta ^{\star }\right) \right) ^{-1} \right) \end{aligned}$$and f is an arbitrary continuous function of at most polynomial growth.
Remark 5
It is worth noting that the adaptive Bayes-type estimators are newly proposed for ergodic diffusion plus noise models in this study, and that not only their asymptotic normality but also the convergence of their moments is shown. As an application of the Bayes-type estimation, Kaino et al. (2018) study hybrid estimators with initial Bayes-type estimators for our ergodic diffusion plus noise model and give an example and simulation results for the hybrid estimators.
4 Example and simulation results
We consider the following two-dimensional diffusion process:
where
Moreover, the true parameter values are \((\alpha _1^*, \alpha _2^*, \alpha _3^*)= (1,0,2)\) and \((\beta _1^*, \beta _2^*, \beta _3^*, \beta _4^*, \beta _5^*, \beta _6^*) = (1,0.1,0.1,1,1,1)\). The parameter space of \(\alpha \) is \(\Theta _1 = \{(\alpha _1,\alpha _2,\alpha _3) \in [0.1,50]\times [-25,25] \times [0.1,50] | \alpha _1 \alpha _3 - \alpha _2^2 \ge 10^{-3} \}\), and the parameter space of \(\beta \) is \(\Theta _2 = [-25,25]^6\).
The noisy data \(\left\{ Y_{ih_{n}}\right\} _{i=0,\ldots ,n}\) are defined by, for all \(i=0,\ldots ,n\),
$$\begin{aligned} Y_{ih_{n}}=X_{ih_{n}}+\Lambda ^{1/2}\varepsilon _{ih_{n}}, \end{aligned}$$
where \(n=10^6\), \(h_n=6.309573 \times 10^{-5}\), \(T=nh_n=63.09573\), \(\Lambda =10^{-3} I_2\), \(I_2\) is the \(2\times 2\)-identity matrix, \(\left\{ \varepsilon _{ih_{n}}\right\} _{i=0,\ldots ,n}\) is the i.i.d. sequence of two-dimensional normal random vectors with \(\mathbf {E}\left[ \varepsilon _{0}\right] =\mathbf {0}\) and \(\mathrm {Var}\left( \varepsilon _{0}\right) =I_{2}\).
For the true model, 1000 independent sample paths are generated by the Euler–Maruyama scheme, and the mean, the standard deviation (SD), the theoretical standard deviation, and the mean squared error of the estimators in Theorem 1 are computed and shown in Tables 1, 2, 3, 4, and 5. A personal computer with an Intel i7-6950X (3.00 GHz) processor was used for the simulations.
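The data-generating step can be sketched as follows. The coefficient functions below are illustrative placeholders, not the paper's concrete a and b, and all names are ours; the sketch only shows the Euler–Maruyama recursion plus the additive-noise step:

```python
import numpy as np

def simulate_noisy_path(b, a, x0, alpha, beta, n, h_n, Lam_sqrt, rng):
    """Euler-Maruyama scheme for dX_t = b(X_t, beta) dt + a(X_t, alpha) dw_t,
    then noisy observations Y_{i h_n} = X_{i h_n} + Lam_sqrt @ eps_i."""
    d = len(x0)
    X = np.empty((n + 1, d))
    X[0] = x0
    for i in range(n):
        ai = a(X[i], alpha)
        dw = rng.normal(scale=np.sqrt(h_n), size=ai.shape[1])
        X[i + 1] = X[i] + b(X[i], beta) * h_n + ai @ dw
    eps = rng.normal(size=(n + 1, d))          # i.i.d., mean 0, Var = I_d
    return X, X + eps @ Lam_sqrt.T             # latent path, noisy data

# Illustrative coefficients only (the paper's model is given in the display above):
b_fun = lambda x, beta: -beta[0] * x                # linear drift
a_fun = lambda x, alpha: alpha[0] * np.eye(2)       # constant diffusion
rng = np.random.default_rng(1)
X, Y = simulate_noisy_path(b_fun, a_fun, np.zeros(2), [1.0], [1.0],
                           n=10_000, h_n=6.309573e-5,
                           Lam_sqrt=np.sqrt(1e-3) * np.eye(2), rng=rng)
```

With \(\Lambda =10^{-3}I_{2}\), the added noise has standard deviation about 0.032 per component, matching the setting of this section.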
Table 1 shows the simulation results of the estimator \(\hat{\Lambda }_n =(\hat{\Lambda }_{n,i,j})_{i,j=1,2}\) of \(\Lambda =(\Lambda _{ij})_{i,j=1,2}\).
Tables 2 and 3 show the simulation results of the adaptive ML-type estimator \((\hat{\alpha }_{A, n}, \hat{\beta }_{A, n})\) with the initial value being the true value, where
the quasi-likelihood functions \(\mathbb {H}_{1, n}^{\tau }\) and \(\mathbb {H}_{2, n}\) are given by
the local mean \(\left\{ \bar{Y}_{j}\right\} _{j=0,\ldots ,k_n-1}\) is defined as
Here, \(\tau =2.0\), \(k_n=8000\), \(p_n=125\), \(\Delta _n=0.007886967\), \(T_n=k_n \Delta _n=63.09573\), \(A_{n}^{\tau }\left( x,\alpha ,\Lambda \right) = A\left( x,\alpha \right) +3\Delta _{n}^{\frac{2-\tau }{\tau -1}}\Lambda \), and \(A(x, \alpha )=a a^\mathrm{{T}} (x,\alpha )\). The adaptive ML-type estimator \((\hat{\alpha }_{A, n}, \hat{\beta }_{A, n})\) is obtained by means of optim() with the “L-BFGS-B” method in R.
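The noise-variance step can be sketched with a realised-variance-type estimator; the form below is our hedged guess at \(\hat{\Lambda }_{n}\) (the paper's exact definition is in its displays, which we do not reproduce), chosen because as \(h_{n}\rightarrow 0\) the diffusion increments are negligible and each squared observation increment contributes \(2\Lambda \) in expectation:

```python
import numpy as np

def lambda_hat(Y):
    """Realised-variance-type estimator of the noise variance:
    (1/(2n)) * sum_i (Y_{(i+1)h} - Y_{ih})(Y_{(i+1)h} - Y_{ih})^T."""
    dY = np.diff(Y, axis=0)
    n = dY.shape[0]
    return dY.T @ dY / (2.0 * n)

# Toy data mimicking this section: small diffusion increments plus noise
# with Lambda = 1e-3 * I_2 (the latent path here is a plain random walk).
rng = np.random.default_rng(2)
n, d = 200_000, 2
Lam = 1e-3 * np.eye(d)
X = np.cumsum(rng.normal(scale=np.sqrt(6.3e-5), size=(n + 1, d)), axis=0)
Y = X + rng.normal(size=(n + 1, d)) @ np.sqrt(Lam)
print(lambda_hat(Y))  # close to 1e-3 * I_2
```

Note the small upward bias of order \(h_{n}/2\) coming from the diffusion increments, which vanishes in the high-frequency limit.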
From Tables 1, 2, and 3, we see that all the estimators have good behaviour.
Tables 4 and 5 show the simulation results of Bayes-type estimators with uniform priors defined as
The Bayes-type estimators of \(\alpha \) and \(\beta \) are calculated with the MpCN method proposed by Kamatani (2018), with \(10^4\) Markov chain iterations after \(10^3\) burn-in iterations.
From Tables 4 and 5, we can see that the Bayes-type estimators have good behaviour. Furthermore, the performance of the Bayes-type estimators is almost the same as that of the estimators in Tables 2 and 3.
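The adaptive Bayes-type estimators are quasi-posterior means. A generic random-walk Metropolis sketch (deliberately not the MpCN sampler of Kamatani (2018) that is actually used; all names and the toy target are ours) illustrates the computation with a uniform prior:

```python
import numpy as np

def posterior_mean_rwm(log_q, theta0, n_iter=10_000, burn=1_000, step=0.1, rng=None):
    """Quasi-posterior mean under a uniform prior via random-walk Metropolis.
    log_q is a quasi-log-likelihood; exp(log_q) times the flat prior is the
    quasi-posterior whose mean is the Bayes-type estimator."""
    rng = np.random.default_rng(0) if rng is None else rng
    theta, lp = np.atleast_1d(np.float64(theta0)), log_q(theta0)
    draws = []
    for i in range(n_iter + burn):
        prop = theta + step * rng.normal(size=theta.shape)
        lp_prop = log_q(prop)
        if np.log(rng.uniform()) < lp_prop - lp:   # Metropolis accept/reject
            theta, lp = prop, lp_prop
        if i >= burn:
            draws.append(theta.copy())
    return np.mean(draws, axis=0)

# Toy check: Gaussian quasi-likelihood centred at 2.0 -> posterior mean near 2.0.
est = posterior_mean_rwm(lambda th: -0.5 * np.sum((th - 2.0) ** 2) / 0.04, 0.0)
```

In practice the MpCN proposal is preferred for heavy-tailed or anisotropic quasi-posteriors, but the posterior-mean structure of the estimator is the same.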
Figures 2, 4, and 6 show the plots of the empirical distribution functions, the Q–Q plots, and the histograms of the adaptive ML-type estimators \(\hat{\beta }_{A, n}^{(i)} (i = 2, 3, 6)\). Figures 3, 5, and 7 show those of the Bayes-type estimators \(\tilde{\beta }_{ n}^{(i)} (i = 2, 3, 6)\). In Figs. 2, 3, 4, 5, 6, and 7, the left panel shows the empirical distribution function (solid line) and the theoretical cumulative distribution function (dotted line), the middle panel shows the Q–Q plot, and the right panel shows the histogram together with the theoretical probability density function (dotted line). From Figs. 2, 3, 4, 5, 6 and 7, we can see that both the adaptive ML-type estimators and the Bayes-type estimators have good behaviour.
5 Proofs
5.1 Evaluation for local means
First, we give some evaluations related to the local means. Some of the tools are inherited from the previous studies by Nakakita and Uchida (2017, 2019a). We define the following random variables:
The next lemma is Lemma 11 in Nakakita and Uchida (2019a).
Lemma 1
\(\zeta _{j+1,n}\) and \(\zeta _{j+1,n}'\) are \(\mathcal {G}_{j+1}^{n}\)-measurable, independent of \(\mathcal {G}_{j}^{n}\) and Gaussian. These variables have the next decompositions:
The following evaluations of conditional expectations hold:
where \(m_{n}=\left( \frac{1}{3}+\frac{1}{2p_{n}}+\frac{1}{6p_{n}^2}\right) \), \(m_{n}'=\left( \frac{1}{3}-\frac{1}{2p_{n}}+\frac{1}{6p_{n}^2}\right) \), and \(\chi _{n}=\frac{1}{6}\left( 1-\frac{1}{p_{n}^2}\right) \).
The next lemma can be obtained by the same argument as Proposition 12 in Nakakita and Uchida (2019a).
Lemma 2
Assume that the components of the function \(f\in C^{1}\left( \mathbf {R}^{d}\times \Xi ;\ \mathbf {R}\right) \) and \(\partial _{x}f\) are of polynomial growth uniformly in \(\vartheta \in \Xi \). Then for all \(p\ge 1\), there exists \(C\left( p\right) >0\) such that for all \(n\in \mathbf {N}\),
Lemma 3
Assume that the function \(f\in C^{1}\left( \mathbf {R}^{d}\times \Xi ;\ \mathbf {R}\right) \) and \(\partial _{x}f\) are of polynomial growth in \(x\) uniformly in \(\vartheta \in \Xi \). Then, for all \(p\ge 1\), there exists \(C\left( p\right) >0\) such that for all \(n\in \mathbf {N}\),
Proof
By Lemma 2,
\(\square \)
Lemma 4
Assume that the functions \(f,g\in C^{2}\left( \mathbf {R}^{d};\ \mathbf {R}\right) \), \(\partial _{x}f\), \(\partial _{x}g\), \(\partial _{x}^{2}f\), and \(\partial _{x}^{2}g\) are of polynomial growth. Then, we have
Proof
By Taylor’s expansion, we have
and the Itô–Taylor expansion together with Proposition 3.2 in Favetto (2014) verifies
It holds that
and Proposition 3.2 in Favetto (2014) leads to
Hence, we obtain the result.\(\square \)
Lemma 5
-
(i)
The next expansion holds:
$$\begin{aligned} \bar{Y}_{j+1}-\bar{Y}_{j}= \Delta _{n}b\left( X_{j\Delta _{n}}\right) + a\left( X_{j\Delta _{n}}\right) \left( \zeta _{j+1,n}+\zeta _{j+2,n}'\right) +e_{j,n} +\left( \Lambda _{\star }\right) ^{1/2}\left( \bar{\varepsilon }_{j+1}-\bar{\varepsilon }_{j}\right) \end{aligned}$$where \(e_{j,n}\) is a \(\mathcal {H}_{j+2}^{n}\)-measurable random variable such that \(\left\| e_{j,n}\right\| _{p}\le C\left( p\right) \Delta _{n}\), for \(j=1,\ldots ,k_{n}-2\), \(n\in \mathbf {N}\) and \(p\ge 1\).
-
(ii)
For any \(p\ge 1\) and \(\mathcal {H}_{j}^{n}\)-measurable \(\mathbf {R}^{d}\otimes \mathbf {R}^{r}\)-valued random variable \(\mathbb {B}_{j}^{n}\) such that \(\sup _{j,n}\mathbf {E}\left[ \left\| \mathbb {B}_{j}^{n}\right\| ^{m}\right] <\infty \) for all \(m\in \mathbf {N}\), we have the next \(L^{p}\)-boundedness:
$$\begin{aligned} \mathbf {E}_{\theta ^{\star }}\left[ \left| \sum _{j=1}^{k_{n}-2}\mathbb {B}_{j}^{n}\left[ e_{j,n}\left( \zeta _{j+1,n}+\zeta _{j+2,n}'\right) ^\mathrm{{T}}\right] \right| ^{p}\right] ^{1/p}\le C\left( p\right) k_{n}\Delta _{n}^{2}. \end{aligned}$$ -
(iii)
For any \(p\ge 1\) and \(\mathcal {H}_{j}^{n}\)-measurable \(\mathbf {R}^{d}\)-valued random variable \(\mathbb {C}_{j}^{n}\) such that \(\sup _{j,n}\mathbf {E}\left[ \left| \mathbb {C}_{j}^{n}\right| ^{m}\right] <\infty \) for all \(m\in \mathbf {N}\), we have the next \(L^{p}\)-boundedness:
$$\begin{aligned} \mathbf {E}_{\theta ^{\star }}\left[ \left| \sum _{j=1}^{k_{n}-2}\mathbb {C}_{j}^{n}\left[ e_{j,n}\right] \right| ^{p}\right] ^{1/p}\le C\left( p\right) k_{n}\Delta _{n}^{3/2}. \end{aligned}$$
Proof
Firstly, we prove (i). Without loss of generality, assume p is an even number. It holds
and
where \(e_{j,n}=\sum _{l=1}^{3}\left( r_{j,n}^{\left( l\right) }+s_{j,n}^{\left( l\right) }\right) \),
using Lemma 1. By the BDG inequality, Hölder’s inequality, and the triangle inequality for the \(L^{p/2}\)-norm, we have
and we also have \(\left\| s_{j,n}^{\left( 1\right) }\right\| _{p}\le C\left( p\right) \Delta _{n}\) which can be obtained in the analogous manner. For \(r_{j,n}^{\left( 2\right) }\), we obtain
because of the BDG inequality, Hölder’s inequality, Fubini’s theorem, and the fact that \(h_{n}=\Delta _{n}/p_{n}\le \Delta _{n}^2\); the same evaluation can be proved for \(s_{j,n}^{\left( 2\right) }\). It also holds
by Hölder’s inequality and Fubini’s theorem, and the same evaluation holds for \(s_{j,n}^{\left( 3\right) }\): \(\left\| s_{j,n}^{\left( 3\right) }\right\| _{p}\le C\left( p\right) \Delta _{n}^{3/2}\). Hence, we obtain the evaluation for \(\left\| e_{j,n}\right\| _{p}\).
Next, we show that (ii) holds. Note that it is sufficient to see only the moments for \(r_{j,n}^{\left( 1\right) }\zeta _{j+1,n}^\mathrm{{T}}\) and \(s_{j,n}^{\left( 1\right) }\left( \zeta _{j+2,n}'\right) ^\mathrm{{T}}\) because Hölder’s inequality and orthogonality are applicable for the others. We have the following expression for \(r_{j,n}^{\left( 1\right) }\) and \(s_{j,n}^{\left( 1\right) }\):
Let us define for all \(\ell =p_{n},\ldots ,\left( k_{n}-2\right) p_{n}+p_{n}-1\), \(\ell _{1}\left( \ell \right) =\left\lfloor \ell /p_{n}\right\rfloor \), and \(\ell _{2}\left( \ell \right) =\ell -p_{n}\ell _{1}\left( \ell \right) \),
and then we have \(\sum _{j=1}^{k_{n}-2}\mathbb {B}_{j}^{n}\left[ r_{j,n}^{\left( 1\right) }\left( \zeta _{j+1,n}\right) ^\mathrm{{T}}\right] =\mathbb {D}_{\left( k_{n}-2\right) p_{n}+p_{n}-1}^{n} \). We can easily observe that \(\mathbb {D}_{\ell }^{n}-\mathbf {D}_{\ell }^{n}\) is a martingale with respect to \(\left\{ \mathcal {H}_{\ell _{1}\left( \ell \right) ,\ell _{2}\left( \ell \right) }^{n}\right\} \). Then, Burkholder’s inequality is applicable and it follows that
Hence, we have \(\left\| \mathbb {D}_{\left( k_{n}-2\right) p_{n}+p_{n}-1}^{n}-\mathbf {D}_{\left( k_{n}-2\right) p_{n}+p_{n}-1}^{n}\right\| _{p}\le C\left( p\right) k_{n}^{1/2}\Delta _{n}\). Furthermore, let us define
and clearly, we have \(\mathbf {D}_{\ell }^{n}=\mathbf {D}_{\ell }^{1,n}+\mathbf {D}_{\ell }^{2,n}\). In addition, we see \(\left\{ \mathbf {D}_{jp_{n}+p_{n}-1}^{1,n}\right\} _{j=1,\ldots ,k_{n}-2}\) is a martingale with respect to \(\left\{ \mathcal {H}_{j}^{n}\right\} _{j=1,\ldots ,k_{n}-2}\), and then, Burkholder’s inequality leads to
Regarding \(\mathbf {D}_{\ell }^{2,n}\), we have
since \(\left\| \mathbf {E}_{\theta ^{\star }}\left[ a\left( X_{j\Delta _{n}+ih_{n}}\right) |\mathcal {H}_{j}^{n}\right] -a\left( X_{j\Delta _{n}}\right) \right\| \le C\Delta _{n}\left( 1+\left| X_{j\Delta _{n}}\right| \right) ^{C}\). The same evaluation holds for \(s_{j,n}^{\left( 1\right) }\), and hence, we obtain the result.
Finally, we check that (iii) holds. It is only necessary to verify it for \(r_{j,n}^{\left( 1\right) }\) and \(s_{j,n}^{\left( 1\right) }\), and we give the proof for \(r_{j,n}^{\left( 1\right) }\). Since \(\left\{ \sum _{k=1}^{j}\mathbb {C}_{k,n}\left[ r_{k,n}^{\left( 1\right) }\right] \right\} \) for \(j\le k_{n}-2\) is a martingale with respect to \(\left\{ \mathcal {H}_{j}^{n}\right\} \), we can utilise Burkholder’s inequality, and then
and we can have the same evaluation for \(s_{j,n}^{\left( 1\right) }\).\(\square \)
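The decompositions above rest on the local means \(\bar{Y}_{j}\) being block averages of \(p_{n}\) noisy observations, so that the averaged noise \(\bar{\varepsilon }_{j}\) has variance of order \(p_{n}^{-1}\) (cf. Lemma 6(a) below). The following one-dimensional simulation is purely illustrative: the Ornstein–Uhlenbeck dynamics and all numerical values are our own choices, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sampling scheme: n observations at step h_n, grouped into
# k_n blocks of p_n points each, so that Delta_n = p_n * h_n.
n, p_n, h_n = 100_000, 50, 1e-3
k_n = n // p_n
lam = 0.5  # noise variance Lambda (scalar case)

# Toy 1-d ergodic diffusion (Ornstein-Uhlenbeck, Euler scheme):
# dX_t = -X_t dt + dw_t, observed as Y = X + sqrt(Lambda) * eps.
x = np.empty(n)
x[0] = 0.0
dw = rng.normal(0.0, np.sqrt(h_n), n - 1)
for i in range(n - 1):
    x[i + 1] = x[i] - x[i] * h_n + dw[i]
y = x + np.sqrt(lam) * rng.normal(size=n)

# Local means: averaging p_n noisy points per block shrinks the noise
# variance from Lambda to Lambda / p_n.
y_bar = y.reshape(k_n, p_n).mean(axis=1)
x_bar = x.reshape(k_n, p_n).mean(axis=1)
noise_var = np.var(y_bar - x_bar)
print(noise_var, lam / p_n)  # the two values should be close
```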
Remark 6
When the evaluation \(\left\| e_{j,n}\right\| _{p}\le C\left( p\right) \Delta _{n}\) is sufficient, the term \(\Delta _{n}b\left( X_{j\Delta _{n}}\right) \) on the right-hand side can be absorbed into \(e_{j,n}\).
Lemma 6
-
(a)
For all \(p\ge 1\), there exists \(C\left( p\right) >0\) such that for all \(j=0,\ldots ,k_{n}-1\) and \(n\in \mathbf {N}\),
$$\begin{aligned} \left\| \bar{\varepsilon }_{j}\right\| _{p}&\le C\left( p\right) p_{n}^{-1/2}. \end{aligned}$$ -
(b)
For all \(p\ge 1\), there exists \(C\left( p\right) >0\) such that for all \(n\in \mathbf {N}\)
$$\begin{aligned} \left\| \hat{\Lambda }_{n}-\Lambda _{\star }\right\| _{p}\le C\left( p\right) \left( h_{n}+\frac{1}{\sqrt{n}}\right) . \end{aligned}$$
Proof
(a) Because of Hölder’s inequality, it is enough to evaluate it in the case where p is an even integer. We easily obtain
by [A5].
(b) As in (a), it is enough to consider the case where p is an even integer. Then, we have
The first term of the right-hand side has the evaluation
We can evaluate the second term of the right-hand side as
and hence
The evaluation for the third term can be obtained in the same manner. For the fourth term, we have
and the same evaluation holds for the fifth term. Finally, we obtain
Hence, the evaluation for \(L^{p}\)-norm stated above holds.\(\square \)
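The rate in Lemma 6(b) can also be seen numerically. Since \(Y_{\left( i+1\right) h_{n}}-Y_{ih_{n}}\) is the sum of an \(O_{p}\left( h_{n}^{1/2}\right) \) diffusion increment and a noise difference with variance \(2\Lambda _{\star }\), half the mean squared increment estimates \(\Lambda _{\star }\) with error of order \(h_{n}+n^{-1/2}\). The scalar sketch below is our own illustration of this rate, with arbitrary simulation parameters; it is not the paper's estimator verbatim.

```python
import numpy as np

rng = np.random.default_rng(1)

def lambda_hat(y: np.ndarray) -> float:
    """Half the mean squared increment of the noisy observations.

    E[(Y_{(i+1)h} - Y_{ih})^2] = 2 * Lambda + O(h), so this estimates
    the noise variance with error of order h_n + n^{-1/2} (scalar case).
    """
    dy = np.diff(y)
    return float(np.mean(dy ** 2) / 2)

# Toy Ornstein-Uhlenbeck path plus observation noise (parameters ours).
n, h_n, lam = 200_000, 1e-3, 0.25
x = np.empty(n)
x[0] = 0.0
dw = rng.normal(0.0, np.sqrt(h_n), n - 1)
for i in range(n - 1):
    x[i + 1] = x[i] - x[i] * h_n + dw[i]
y = x + np.sqrt(lam) * rng.normal(size=n)

print(lambda_hat(y))  # should be close to lam = 0.25
```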
Lemma 7
For every function \(f\in C^{1}\left( \mathbf {R}^{d}\times \Xi ;\ \mathbf {R}\right) \) such that all the elements of f and its derivatives are of polynomial growth with respect to x uniformly in \(\vartheta \),
Proof
We have
\(\square \)
5.2 LAN for the quasi-likelihoods and proof for the main theorem
To prove the main theorem, we prepare some additional lemmas. Before the discussion, let us define the statistical random fields:
We give the locally asymptotically quadratic expansions at \(\vartheta ^{\star }\in \Xi \) for \(u_{1}\in \mathbf {R}^{m_{1}}\) and \(u_{2}\in \mathbf {R}^{m_{2}}\),
where
and
and
and
We evaluate the moments of these random variables and fields in the following lemmas.
Lemma 8
-
(a)
For every \(p>1\),
$$\begin{aligned} \sup _{n\in \mathbf {N}}\mathbf {E}_{\theta ^{\star }}\left[ \left| \Delta _{1,n}^{\tau }\left( \vartheta ^{\star }\right) \right| ^{p}\right] <\infty . \end{aligned}$$ -
(b)
Let \(\epsilon _{1}=\epsilon _{0}/2\). Then, for every \(p>0\),
$$\begin{aligned} \sup _{n\in \mathbf {N}}\mathbf {E}_{\theta ^{\star }}\left[ \left( \sup _{\alpha \in \Theta _{1}}k_{n}^{\epsilon _{1}}\left| \mathbb {Y}_{1,n}^{\tau }\left( \alpha ;\vartheta ^{\star }\right) -\mathbb {Y}_{1}^{\tau }\left( \alpha ;\vartheta ^{\star }\right) \right| \right) ^{p} \right] <\infty . \end{aligned}$$
Proof
We start with the proof for (a). By Lemma 5, we obtain a decomposition
for
where
with the following property
because of Lemma 1, \(\Delta _{n}=p_{n}^{1-\tau }\), \(\Delta _{n}^{\frac{1}{1-\tau }}=p_{n}\) and \(\left( \Delta _{n}p_{n}\right) ^{-1}=\Delta _{n}^{\frac{2-\tau }{\tau -1}}\). Furthermore, we have the \(L^{p}\)-boundedness such that
because of \(\left\| \zeta _{j+1,n}+\zeta _{j+2,n}'\right\| _{p}\le C\left( p\right) \Delta _{n}^{1/2}\) and \(\left\| \bar{\varepsilon }_{j}\right\| _{p}\le C\left( p\right) p_{n}^{-1/2}\) for all \(j=0,\ldots ,k_{n}-1\) and \(n\in \mathbf {N}\), and the Taylor expansion for \(f\left( x\right) =\sqrt{1+x}\) around \(x=0\). The \(L^{p}\)-boundedness of \(R_{1,n}^{\tau \left( 1\right) }\) follows from Lemma 4 and Burkholder’s inequality for martingales, and that of \(R_{1,n}^{\tau \left( 2\right) }\) can be easily obtained by Lemma 6. With respect to \(R_{1,n}^{\tau \left( 3\right) }\), we use the decomposition \(R_{1,n}^{\tau \left( 3\right) }=\sum _{i=0}^{2}R_{i,1,n}^{\tau \left( 3\right) }\) where
We only evaluate \(R_{0,1,n}^{\tau \left( 3\right) }\), and only for the case where p is an even number. The next inequality holds because of the \(L^{p}\)-boundedness shown above:
We easily obtain the evaluation for the first term in the right-hand side
and that for the second term
because of Lemmas 5 and 6. For the third term, we can replace \(\hat{\Lambda }_{n}\) with \(\Lambda _{\star }\) and \(\bar{Y}_{3j-1}\) with \(X_{3j\Delta _{n}}\) because of Lemma 6 and the result obtained by combining Lemma 1 with Proposition 12 in Nakakita and Uchida (2019a). We denote \( \eta _{3j,n}\left( u_{1}\right) =\left( a\left( X_{3j\Delta _{n}}\right) \right) ^\mathrm{{T}}\left( \partial _{\alpha }A_{n}^{\tau }\left( X_{3j\Delta _{n}}, \alpha ^{\star },\Lambda _{\star }\right) \left[ u_{1}\right] \right) b\left( X_{3j\Delta _{n}}\right) \), which is an \(\mathcal {H}_{3j}^{n}\)-measurable random variable. Because of Lemma 1 and the BDG inequality, we have
It is obvious that the fourth term can be evaluated as bounded because \(\left\{ \varepsilon _{ih_{n}}\right\} \) is independent of X and i.i.d. Therefore, we obtain \(\left\| R_{0,1,n}^{\tau \left( 3\right) }\right\| _{p}<\infty \) and \(\left\| R_{1,n}^{\tau \left( 3\right) }\right\| _{p}<\infty \).
With respect to \(M_{1,n}^{\tau }\), we utilise Burkholder’s inequality for martingales: let us define \(M_{i,1,n}^{\tau }\) for \(i=0,1,2\) in the same way as \(R_{i,1,n}^{\tau \left( 3\right) }\), and then
because of the integrability.
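Before turning to (b), note that the exponent relations invoked above, \(\Delta _{n}=p_{n}^{1-\tau }\), \(\Delta _{n}^{1/\left( 1-\tau \right) }=p_{n}\), and \(\left( \Delta _{n}p_{n}\right) ^{-1}=\Delta _{n}^{\left( 2-\tau \right) /\left( \tau -1\right) }\), are pure algebra and can be checked numerically; the value of \(\tau \) below is an arbitrary illustrative choice with \(\tau >1\), so that \(\Delta _{n}\rightarrow 0\) as \(p_{n}\rightarrow \infty \).

```python
from math import isclose

# Check of the exponent algebra: with Delta_n = p_n^(1 - tau) and tau > 1,
# p_n = Delta_n^(1 / (1 - tau)) and
# 1 / (Delta_n * p_n) = Delta_n^((2 - tau) / (tau - 1)).
tau = 1.8    # illustrative tuning parameter, tau > 1
p_n = 500.0
delta_n = p_n ** (1 - tau)

assert isclose(delta_n ** (1 / (1 - tau)), p_n, rel_tol=1e-9)
assert isclose(1 / (delta_n * p_n), delta_n ** ((2 - tau) / (tau - 1)), rel_tol=1e-9)
print("exponent identities hold; Delta_n =", delta_n)
```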
Next, we give the proof for (b). Let us denote
Define \(R_{1,n}^{\tau \left( \dagger \right) }\) by \( R_{1,n}^{\tau \left( \dagger \right) }=\mathbb {Y}_{1,n}^{\tau }\left( \alpha ;\vartheta ^{\star }\right) -\mathbb {Y}_{1,n}^{\tau \left( \dagger \right) }\left( \alpha ;\vartheta ^{\star }\right) -M_{1,n}^{\tau \left( \dagger \right) } \) for
Firstly, we show the \(L^p\)-boundedness of \(k_{n}^{\epsilon _{1}}R_{1,n}^{\tau \left( \dagger \right) }\) uniformly in n and \(\alpha \) for every p. We have the representation
Because of Lemma 5, the following evaluation holds:
Hence, we have the evaluation \( \sup _{\alpha \in \Theta _{1}}\sup _{n\in \mathbf {N}}\left\| R_{1,n}^{\tau \left( \dagger \right) }\right\| _{p} \le C\left( p\right) \Delta _{n}+C\left( p\right) \Delta _{n}^{1/2}\le C\Delta _{n}^{1/2}\), and therefore,
Next, we show the same uniform \(L^{p}\)-boundedness of \(k_{n}^{\epsilon _{1}}M_{1,n}^{\tau \left( \dagger \right) }\) for every p. As an approximation, we set \(M_{1,n}^{\tau \left( \ddagger \right) }:=\sum _{i=0}^{2}M_{i,1,n}^{\tau \left( \ddagger \right) }\) where for \(i=0,1,2\), \(M_{i,1,n}^{\tau \left( \ddagger \right) }:=-\frac{1}{2k_{n}}\sum _{1\le 3j+i\le k_{n}-2}\mu _{3j+i,n}\),
where
It is easy to show \(\mathbf {E}\left[ \sup _{\alpha \in \Theta _{1}}k_{n}^{\epsilon _{1}}\left| M_{1,n}^{\tau \left( \dagger \right) }-M_{1,n}^{\tau \left( \ddagger \right) }\right| ^{p}\right] ^{1/p}\le C\left( p\right) k_{n}^{\epsilon _{1}}n^{-1/2}\le C\left( p\right) h_{n}^{1/2}\rightarrow 0\) by Lemma 6. For simplicity, we only evaluate \(k_{n}^{\epsilon _{1}}M_{0,1,n}^{\tau \left( \ddagger \right) }\). We have, for all p,
Hence, by Burkholder’s inequality, for all p,
and then \(\sup _{n,\theta ^{\star }}\left\| k_{n}^{\epsilon _{1}}M_{1,n}^{\tau \left( \ddagger \right) }\right\| _{p}<\infty \). With the same procedure, we obtain the uniform \(L^p\)-boundedness of \(k_{n}^{\epsilon _{1}}\partial _{\alpha }R_{1,n}^{\tau \left( \dagger \right) }\) and \(k_{n}^{\epsilon _{1}}\partial _{\alpha }M_{1,n}^{\tau \left( \ddagger \right) }\). Sobolev’s inequality leads to \(\sup _{n\in \mathbf {N}}\left\| \sup _{\alpha \in \Theta _{1}}\left| k_{n}^{\epsilon _{1}}R_{1,n}^{\tau \left( \dagger \right) }\right| \right\| _{p}<\infty \) and \(\sup _{n\in \mathbf {N}}\left\| \sup _{\alpha \in \Theta _{1}}\left| k_{n}^{\epsilon _{1}}M_{1,n}^{\tau \left( \ddagger \right) }\right| \right\| _{p}<\infty \), and then \(\sup _{n\in \mathbf {N}}\left\| \sup _{\alpha \in \Theta _{1}}\left| k_{n}^{\epsilon _{1}}M_{1,n}^{\tau \left( \dagger \right) }\right| \right\| _{p}<\infty \). Note that for
we can evaluate \(\sup _{n\in \mathbf {N}}\left\| \sup _{\alpha \in \Theta _{1}}\left| k_{n}^{\epsilon _{1}}\left( \mathbb {Y}_{1,n}^{\tau \left( \ddagger \right) }\left( \alpha ;\vartheta ^{\star }\right) -\mathbb {Y}_{1,n}^{\tau \left( \dagger \right) }\left( \alpha ;\vartheta ^{\star }\right) \right) \right| \right\| _{p}<\infty \) because of Lemmas 3 and 7. Hence, the discussion of Remark 4 leads to the proof.\(\square \)
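Burkholder’s inequality, used repeatedly in the proof above, bounds the \(L^{p}\)-norm of a martingale sum of k increments with uniformly bounded moments by \(C\left( p\right) \sqrt{k}\). The toy Monte Carlo below (entirely our own illustration, not part of the argument) exhibits this \(\sqrt{k}\) scaling for the empirical \(L^{4}\)-norm of sums of i.i.d. standard Gaussian increments.

```python
import numpy as np

rng = np.random.default_rng(2)

# For S_k = sum of k i.i.d. N(0, 1) martingale increments, the L^4 norm
# satisfies ||S_k||_4 = (3 k^2)^{1/4} = 3^{1/4} * sqrt(k), consistent
# with the sqrt(k) bound from Burkholder's inequality.
ratios = []
for k in (100, 400, 1600):
    s = rng.normal(size=(4_000, k)).sum(axis=1)  # 4000 Monte Carlo paths
    l4_norm = (np.abs(s) ** 4).mean() ** 0.25    # empirical L^4 norm
    ratios.append(l4_norm / np.sqrt(k))
    print(k, ratios[-1])  # the ratio stays near 3 ** 0.25
```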
Lemma 9
-
(a)
For any \(M_{3}>0\),
$$\begin{aligned} \sup _{n\in \mathbf {N}}\mathbf {E}_{\theta ^{\star }}\left[ \left( k_{n}^{-1}\sup _{\vartheta \in \Xi } \left| \partial _{\alpha }^{3}\mathbb {H}_{1,n}^{\tau }\left( \alpha ;\Lambda \right) \right| \right) ^{M_{3}}\right] <\infty . \end{aligned}$$ -
(b)
Let \(\epsilon _{1}=\epsilon _{0}/2\). Then for \(M_{4}>0\),
$$\begin{aligned} \sup _{n\in \mathbf {N}}\mathbf {E}_{\theta ^{\star }}\left[ \left( k_{n}^{\epsilon _{1}}\left| \Gamma _{1,n}^{\tau }\left( \alpha ^{\star };\vartheta ^{\star }\right) -\Gamma _{1}^{\tau }\left( \vartheta ^{\star }\right) \right| \right) ^{M_{4}}\right] <\infty . \end{aligned}$$
Proof
With respect to (a), we have
and hence
For (b), the same discussion as in Lemma 8 leads to the result.\(\square \)
Proposition 1
For any \(p>0\),
Proof
Theorem 3 in Yoshida (2011) together with Lemmas 8 and 9 leads to the following polynomial-type large deviation inequality: \( P_{\theta ^{\star }}\left[ \sup _{u_{1}\in V_{1,n}^{\tau }\left( r,\alpha ^{\star }\right) }\mathbb {Z}_{1,n}^{\tau }\left( u_{1};\hat{\Lambda }_{n},\alpha ^{\star }\right) \ge e^{-r}\right] \le \frac{C\left( L\right) }{r^{L}}\) for all \(r>0\) and \(n\in \mathbf {N}\). The \(L^{p}\)-boundedness of \(\sqrt{k_{n}}\left( \hat{\alpha }_{n}-\alpha ^{\star }\right) \) is then obtained by an argument parallel to that in Yoshida (2011).
With respect to the Bayes-type estimator, we need to verify the following boundedness: there exist \(\delta _{1}>0\) and \(C>0\) such that \(\sup _{n\in \mathbf {N}}\mathbf {E}_{\theta ^{\star }}\left[ \left( \int _{u_{1}:\left| u_{1}\right| \le \delta _{1}}\mathbb {Z}_{1,n}^{\tau }\left( u_{1};\hat{\Lambda }_{n},\alpha ^{\star }\right) \right. \right. \left. \left. {\mathrm {d}}u_{1}\right) ^{-1}\right] <\infty \). Because of Lemma 2 in Yoshida (2011), it is sufficient to show that for some \(p>d\), \(\delta >0\) and \(C>0\), \(\sup _{n\in \mathbf {N}}\mathbf {E}_{\theta ^{\star }}\left[ \left| \log \mathbb {Z}_{1,n}^{\tau }\left( u_{1};\hat{\Lambda }_{n},\alpha ^{\star }\right) \right| ^{p}\right] \le C\left| u_{1}\right| ^{p}\) for all \(u_{1}\) such that \(\left| u_{1}\right| \le \delta \), which is easily obtained by Lemmas 8 and 9.\(\square \)
Lemma 10
-
(a)
For every \(p>0\),
$$\begin{aligned} \sup _{n\in \mathbf {N}}\mathbf {E}_{\theta ^{\star }}\left[ \left| \Delta _{2,n}^{\mathrm {ML}}\left( \vartheta ^{\star }\right) \right| ^{p}\right]<\infty ,\ \sup _{n\in \mathbf {N}}\mathbf {E}_{\theta ^{\star }}\left[ \left| \Delta _{2,n}^{\mathrm {Bayes}}\left( \vartheta ^{\star }\right) \right| ^{p}\right] <\infty . \end{aligned}$$ -
(b)
Let \(\epsilon _{1}=\epsilon _{0}/2\). Then, for every \(p>0\),
$$\begin{aligned} \sup _{n\in \mathbf {N}}\left\| \sup _{\beta \in \Theta _{2}}\left( k_{n}\Delta _{n}\right) ^{\epsilon _{1}}\left| \mathbb {Y}_{2,n}^{\mathrm {ML}}\left( \beta ;\vartheta ^{\star }\right) -\mathbb {Y}_{2}\left( \beta ;\vartheta ^{\star }\right) \right| \right\| _{p}&<\infty ,\\ \sup _{n\in \mathbf {N}}\left\| \sup _{\beta \in \Theta _{2}}\left( k_{n}\Delta _{n}\right) ^{\epsilon _{1}}\left| \mathbb {Y}_{2,n}^{\mathrm {Bayes}}\left( \beta ;\vartheta ^{\star }\right) -\mathbb {Y}_{2}\left( \beta ;\vartheta ^{\star }\right) \right| \right\| _{p}&<\infty . \end{aligned}$$
Proof
We only show the proof for \(\Delta _{2,n}^{\mathrm {ML}}\) and \(\mathbb {Y}_{2,n}^{\mathrm {ML}}\) since the proof for \(\Delta _{2,n}^{\mathrm {Bayes}}\) and \(\mathbb {Y}_{2,n}^{\mathrm {Bayes}}\) is quite parallel. For (a), we decompose
\( \Delta _{2,n}^{\mathrm {ML}}\left( \vartheta ^{\star }\right) \left[ u_{2}\right] =M_{2,n}^{\mathrm {ML}}+R_{2,n}^{\mathrm {ML}}\),
where
We can use the \(L^{p}\)-boundedness of \(\sqrt{k_{n}}\left( \hat{\alpha }_{n}-\alpha ^{\star }\right) \) and Burkholder’s inequality; then, we obtain \(\sup _{n\in \mathbf {N}}\mathbf {E}_{\theta ^{\star }}\left[ \left| M_{2,n}^{\mathrm {ML}}\right| ^{p}\right] ^{1/p} \le C\left( p\right) \), and for the residuals, Lemmas 3 and 5 lead to \( \mathbf {E}_{\theta ^{\star }}\left[ \left| R_{2,n}^{\mathrm {ML}}\right| ^{p}\right] ^{1/p} \le C\left( p\right) \sqrt{k_{n}}\Delta _{n}\rightarrow 0\). Then, we obtain (a). Next, we prove (b). We decompose \(\mathbb {Y}_{2,n}^{\mathrm {ML}}\left( \beta ;\vartheta ^{\star }\right) \) as \( \mathbb {Y}_{2,n}^{\mathrm {ML}}\left( \beta ;\vartheta ^{\star }\right) =M_{2,n}^{\mathrm {ML}\left( \dagger \right) }\left( \hat{\alpha }_{n},\beta \right) +R_{2,n}^{\mathrm {ML}\left( \dagger \right) }\left( \hat{\alpha }_{n},\beta \right) +\mathbb {Y}_{2,n}^{\mathrm {ML}\left( \dagger \right) }\left( \beta ;\vartheta ^{\star }\right) \), where
It is easy to obtain \(\sup _{n\in \mathbf {N}}\mathbf {E}_{\theta ^{\star }}\left[ \sup _{\theta \in \Theta }\left| M_{2,n}^{\mathrm {ML}\left( \dagger \right) }\right| ^{p}\right] \le C\left( p\right) \left( k_{n}\Delta _{n}\right) ^{-p/2}\) using the \(L^{p}\)-boundedness of \(\sqrt{k_{n}}\left( \hat{\alpha }_{n}-\alpha ^{\star }\right) \), Burkholder’s inequality, and Sobolev’s inequality, and \(\sup _{n\in \mathbf {N}}\mathbf {E}_{\theta ^{\star }}\left[ \sup _{\theta \in \Theta }\left| R_{2,n}^{\mathrm {ML}\left( \dagger \right) }\right| ^{p}\right] \le C\left( p\right) \Delta _{n}^{p/2}\) because of Lemma 5. Let us define
and then because of \(L^{p}\)-boundedness of \(\sqrt{k_{n}}\left( \hat{\alpha }_{n}-\alpha ^{\star }\right) \), and Lemma 3, we obtain
Then, \(L^{p}\)-boundedness of \(\sup _{\beta \in \Theta _{2}}\left( k_{n}\Delta _{n}\right) ^{\epsilon _{1}}\left| \mathbb {Y}_{2,n}^{\mathrm {ML}\left( \ddagger \right) }\left( \beta ;\vartheta ^{\star }\right) -\mathbb {Y}_{2}\left( \beta ;\vartheta ^{\star }\right) \right| \) is obtained by the discussion in Remark 4 and it verifies (b).\(\square \)
Lemma 11
-
(a)
For every \(M_{3}>0\),
$$\begin{aligned} \sup _{n\in \mathbf {N}}\mathbf {E}_{\theta ^{\star }}\left[ \left( \left( k_{n}\Delta _{n}\right) ^{-1}\sup _{\beta \in \Theta _{2}}\left| \partial _{\beta }^{3}\mathbb {H}_{2,n}\left( \hat{\alpha }_{n},\beta \right) \right| \right) ^{M_{3}}\right]&<\infty ,\\ \sup _{n\in \mathbf {N}}\mathbf {E}_{\theta ^{\star }}\left[ \left( \left( k_{n}\Delta _{n}\right) ^{-1}\sup _{\beta \in \Theta _{2}}\left| \partial _{\beta }^{3}\mathbb {H}_{2,n}\left( \tilde{\alpha }_{n},\beta \right) \right| \right) ^{M_{3}}\right]&<\infty . \end{aligned}$$ -
(b)
Let \(\epsilon _{1}=\epsilon _{0}/2\). Then, for every \(M_{4}>0\),
$$\begin{aligned} \sup _{n\in \mathbf {N}}\mathbf {E}_{\theta ^{\star }}\left[ \left( \left( k_{n}\Delta _{n}\right) ^{\epsilon _{1}}\left| \Gamma _{2,n}^{\mathrm {ML}}\left( \beta ^{\star };\vartheta ^{\star }\right) -\Gamma _{2}\left( \vartheta ^{\star }\right) \right| \right) ^{M_{4}}\right]&<\infty ,\\ \sup _{n\in \mathbf {N}}\mathbf {E}_{\theta ^{\star }}\left[ \left( \left( k_{n}\Delta _{n}\right) ^{\epsilon _{1}}\left| \Gamma _{2,n}^{\mathrm {Bayes}}\left( \beta ^{\star };\vartheta ^{\star }\right) -\Gamma _{2}\left( \vartheta ^{\star }\right) \right| \right) ^{M_{4}}\right]&<\infty . \end{aligned}$$
Proof
With respect to (a), we have for all \(\alpha \in \Theta _{1}\) and \(\beta \in \Theta _{2}\),
Hence, the evaluation of (a) can be obtained because of the integrability of \(\left\{ \bar{Y}_{j}\right\} _{j=0,\ldots ,k_{n}-1}\).
For (b), the proof is quite analogous to that of (b) in Lemma 10.\(\square \)
Proof (Proof of Theorem 1)
The first polynomial-type large deviation inequality is shown in Proposition 1, and the second and third ones are also consequences of Lemmas 10 and 11 and Theorem 3 in Yoshida (2011). These results, Lemma 6, and the convergence in distribution shown by Nakakita and Uchida (2019a) complete the proof of the convergence of moments for the adaptive ML-type estimator.
Let us define the following statistical random fields, for all \(u_{0}\in \mathbf {R}^{d\left( d+1\right) /2}\) and \(n\in \mathbf {N}\) such that \(\theta _{\varepsilon }^{\star }+n^{-1/2}u_{0}\in \Theta _{\varepsilon }\),
where \(\theta _{\varepsilon }=\mathrm {vech}\Lambda \) and \(Z_{i+1}=\mathrm {vech}\left\{ \left( Y_{\left( i+1\right) h_{n}}-Y_{ih_{n}}\right) ^{\otimes 2}\right\} \). Note that \(\hat{\theta }_{\varepsilon ,n}\) maximises \(\mathbb {H}_{0,n}\). Now we prove the convergence in distribution such that for all \(R>0\),
where for \(\Delta _{0}\sim N_{d\left( d+1\right) /2}\left( \mathbf {0},\mathcal {I}^{\left( 1,1\right) }\left( \vartheta ^{\star }\right) \right) \), \(\Delta _{1}^{\tau }\sim N_{m_{1}}\left( \mathbf {0},\mathcal {I}^{\left( 2,2\right) ,\tau }\left( \vartheta ^{\star }\right) \right) \), \(\Delta _{2}\sim N_{m_{2}}\left( \mathbf {0},\mathcal {I}^{\left( 3,3\right) }\left( \vartheta ^{\star }\right) \right) \) such that \(\Delta _{0}\), \(\Delta _{1}^{\tau }\) and \(\Delta _{2}\) are mutually independent,
and \(\mathcal {C}\left( B\left( R;\mathbf {R}^{m}\right) \right) \) is the metric space of continuous functions on the closed ball \(B\left( R;\mathbf {R}^{m}\right) =\left\{ u\in \mathbf {R}^{m};\left| u\right| \le R\right\} \), equipped with the supremum norm. To prove this, it is sufficient to show the finite-dimensional convergence of
and the tightness of \(\left\{ \log \mathbb {Z}_{0,n}\left( u_{0}\right) |_{C(B(R))};n\in \mathbf {N}\right\} \), \(\left\{ \log \mathbb {Z}_{1,n}^{\tau }\left( u_{1}\right) |_{C(B(R))};n\in \mathbf {N}\right\} \), and \(\left\{ \log \mathbb {Z}_{2,n}\left( u_{2}\right) |_{C(B(R))};n\in \mathbf {N}\right\} \). The finite-dimensional convergence is a simple consequence of Nakakita and Uchida (2019a), and the tightness can be obtained if we can show
as in Ogihara and Yoshida (2011) or Yoshida (2011). The first evaluation follows by a simple computation, and the remaining ones by Lemmas 8, 9, 10, and 11. Hence, we obtain the convergences in distribution in \(\mathcal {C}\left( B\left( R;\mathbf {R}^{d\left( d+1\right) /2+m_{1}+m_{2}}\right) \right) \).
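The half-vectorisation in the definition of \(Z_{i+1}=\mathrm {vech}\left\{ \left( Y_{\left( i+1\right) h_{n}}-Y_{ih_{n}}\right) ^{\otimes 2}\right\} \) above stacks the lower-triangular part of the symmetric outer product into a vector of length \(d\left( d+1\right) /2\). A minimal helper sketch (the function names are ours):

```python
import numpy as np

def vech(m: np.ndarray) -> np.ndarray:
    """Half-vectorisation: stack the lower-triangular part of a symmetric
    d x d matrix column by column into a vector of length d(d+1)/2."""
    d = m.shape[0]
    return np.concatenate([m[j:, j] for j in range(d)])

def z_increments(y: np.ndarray) -> np.ndarray:
    """Z_{i+1} = vech{(Y_{(i+1)h} - Y_{ih})^{otimes 2}} for an (n, d) array."""
    dy = np.diff(y, axis=0)
    return np.array([vech(np.outer(v, v)) for v in dy])

# Tiny 2-dimensional example: two increments (1, 2) and (0.5, -1).
y = np.array([[0.0, 0.0], [1.0, 2.0], [1.5, 1.0]])
print(z_increments(y))  # each row is the vech of one outer product
```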
Finally, it is necessary to show the following evaluations for the proof utilising Theorem 10 in Yoshida (2011): there exist \(\delta _{1}>0\) and \(\delta _{2}>0\) such that
Because of Lemma 2 in Yoshida (2011), it is sufficient to show that for some \(p>d\), \(\delta >0\) and \(C>0\),
for all \(u_{1}\), \(u_{2}\) satisfying \(\left| u_{1}\right| +\left| u_{2}\right| \le \delta \), which is easily obtained by Lemmas 8, 9, 10, and 11. These results lead to the following convergences because of Theorem 10 in Yoshida (2011):
in \(\mathcal {C}\left( B\left( R;\mathbf {R}^{d\left( d+1\right) /2}\right) \right) \) for the functions \(f_{1}\) and \(f_{2}\) of at most polynomial growth, and the continuous mapping theorem verifies
Moreover, in a similar way as in the proof of Theorem 8 in Yoshida (2011), one has that for every \(p>0\), \( \sup _{n\in \mathbf {N}} \mathbf {E}_{\theta ^{\star }}\left[ \left| \sqrt{T_n}(\tilde{\beta }_{n}-\beta ^{\star })\right| ^{p}\right] <\infty \), which completes the proof.\(\square \)
References
Bibby, B. M., Sørensen, M. (1995). Martingale estimating functions for discretely observed diffusion processes. Bernoulli, 1, 17–39.
Clinet, S., Yoshida, N. (2017). Statistical inference for ergodic point processes and application to limit order book. Stochastic Processes and Their Applications, 127(6), 1800–1839.
Eguchi, S., Masuda, H. (2018). Schwarz-type model comparison for LAQ models. Bernoulli, 24(3), 2278–2327.
Favetto, B. (2014). Parameter estimation by contrast minimization for noisy observations of a diffusion process. Statistics, 48(6), 1344–1370.
Favetto, B. (2016). Estimating functions for noisy observations of ergodic diffusions. Statistical Inference for Stochastic Processes, 19, 1–28.
Florens-Zmirou, D. (1989). Approximate discrete time schemes for statistics of diffusion processes. Statistics, 20(4), 547–557.
Ibragimov, I. A., Has’minskii, R. Z. (1972). The asymptotic behaviour of certain statistical estimates in the smooth case. I. Investigation of the likelihood ratio. Teorija Verojatnostei i ee Primenenija, 17, 469–486. (Russian).
Ibragimov, I. A., Has’minskii, R. Z. (1973). Asymptotic behaviour of certain statistical estimates. II. Limit theorems for a posteriori density and for Bayesian estimates. Teorija Verojatnostei i ee Primenenija, 18, 78–93. (Russian).
Ibragimov, I. A., Has’minskii, R. Z. (1981). Statistical estimation. New York: Springer.
Jacod, J., Li, Y., Mykland, P. A., Podolskij, M., Vetter, M. (2009). Microstructure noise in the continuous case: The pre-averaging approach. Stochastic Processes and Their Applications, 119(7), 2249–2276.
Kaino, Y., Uchida, M. (2018a). Hybrid estimators for small diffusion processes based on reduced data. Metrika, 81(7), 745–773.
Kaino, Y., Uchida, M. (2018b). Hybrid estimators for stochastic differential equations from reduced data. Statistical Inference for Stochastic Processes, 21(2), 435–454.
Kaino, Y., Nakakita, S. H., Uchida, M. (2018). Hybrid estimation for ergodic diffusion processes based on noisy discrete observations. arXiv:1812.07497.
Kamatani, K. (2018). Efficient strategy for the Markov chain Monte Carlo in high-dimension with heavy-tailed target probability distribution. Bernoulli, 24(4B), 3711–3750.
Kamatani, K., Uchida, M. (2015). Hybrid multi-step estimators for stochastic differential equations based on sampled data. Statistical Inference for Stochastic Processes, 18(2), 177–204.
Kessler, M. (1995). Estimation des parametres d’une diffusion par des contrastes corriges. Comptes rendus de l’Académie des sciences. Série 1, Mathématique, 320(3), 359–362.
Kessler, M. (1997). Estimation of an ergodic diffusion from discrete observations. Scandinavian Journal of Statistics, 24, 211–229.
Kutoyants, Y. A. (1984). Parameter estimation for stochastic processes (B. L. S. Prakasa Rao, Ed., Trans.). Berlin: Heldermann.
Kutoyants, Y. A. (1994). Identification of dynamical systems with small noise. Dordrecht: Kluwer.
Kutoyants, Y. A. (2004). Statistical inference for ergodic diffusion processes. London: Springer.
Masuda, H., Shimizu, Y. (2017). Moment convergence in regularized estimation under multiple and mixed-rates asymptotics. Mathematical Methods of Statistics, 26(2), 81–110.
Nakakita, S. H., Uchida, M. (2017). Adaptive estimation and noise detection for an ergodic diffusion with observation noises. arXiv:1711.04462.
Nakakita, S. H., Uchida, M. (2019a). Inference for ergodic diffusions plus noise. Scandinavian Journal of Statistics, 46(2), 470–516.
Nakakita, S. H., Uchida, M. (2019b). Adaptive test for ergodic diffusions plus noise. Journal of Statistical Planning and Inference, 203, 131–150.
Negri, I., Nishiyama, Y. (2017). Moment convergence of \(Z\)-estimators. Statistical Inference for Stochastic Processes, 20(3), 387–397.
NWTC Information Portal. (2018). NWTC 135-m meteorological towers data repository. Retrieved from https://nwtc.nrel.gov/135mdata. Accessed 15 Feb 2018.
Ogihara, T. (2018). Parametric inference for nonsynchronously observed diffusion processes in the presence of market microstructure noise. Bernoulli, 24(4B), 3318–3383.
Ogihara, T. (2019). On the asymptotic properties of Bayes-type estimators with general loss functions. Journal of Statistical Planning and Inference, 199, 136–150.
Ogihara, T., Yoshida, N. (2011). Quasi-likelihood analysis for the stochastic differential equation with jumps. Statistical Inference for Stochastic Processes, 14(3), 189–229.
Pardoux, E., Veretennikov, A. Y. (2001). On the Poisson equation and diffusion approximation. I. The Annals of Probability, 29(3), 1061–1085.
Uchida, M. (2010). Contrast-based information criterion for ergodic diffusion processes from discrete observations. Annals of the Institute of Statistical Mathematics, 62(1), 161–187.
Uchida, M., Yoshida, N. (2012). Adaptive estimation of an ergodic diffusion process based on sampled data. Stochastic Processes and their Applications, 122(8), 2885–2924.
Uchida, M., Yoshida, N. (2014). Adaptive Bayes-type estimators of ergodic diffusion processes from discrete observations. Statistical Inference for Stochastic Processes, 17(2), 181–219.
Yoshida, N. (1992). Estimation for diffusion processes from discrete observation. Journal of Multivariate Analysis, 41(2), 220–242.
Yoshida, N. (2011). Polynomial-type large deviation inequalities and quasi-likelihood analysis for stochastic differential equations. Annals of the Institute of Statistical Mathematics, 63, 431–479.
Acknowledgements
The authors would like to thank the referee for valuable comments and suggestions. This work was partially supported by JST CREST, JSPS KAKENHI Grant Number JP17H01100 and Cooperative Research Program of the Institute of Statistical Mathematics.
Nakakita, S.H., Kaino, Y. & Uchida, M. Quasi-likelihood analysis and Bayes-type estimators of an ergodic diffusion plus noise. Ann Inst Stat Math 73, 177–225 (2021). https://doi.org/10.1007/s10463-020-00746-3