1 Introduction

Affine processes are applied in mathematical finance in several models including interest rate models (e.g. the Cox–Ingersoll–Ross, Vasiček or general affine term structure short rate models), option pricing (e.g. the Heston model) and credit risk models, see e.g. Duffie et al. (2003), Filipović (2009), Baldeaux and Platen (2013), and Alfonsi (2015). In this paper we consider two-factor affine processes, i.e. affine processes with state-space \([0, \infty ) \times \mathbb {R}\). Dawson and Li (2006) derived a jump-type stochastic differential equation (SDE) for such processes. Specializing this result to the diffusion case, i.e. two-factor affine processes without jumps, we obtain that for every \(a \in [0, \infty )\), \(b, \alpha , \beta , \gamma \in \mathbb {R}\), \(\sigma _1, \sigma _2, \sigma _3 \in [0, \infty )\) and \(\varrho \in [-1, 1]\), the SDE

$$\begin{aligned} {\left\{ \begin{array}{ll} \mathrm {d}Y_t = (a - b Y_t) \, \mathrm {d}t + \sigma _1 \sqrt{Y_t} \, \mathrm {d}W_t, \\ \mathrm {d}X_t = (\alpha - \beta Y_t - \gamma X_t) \, \mathrm {d}t + \sigma _2 \sqrt{Y_t} \, (\varrho \, \mathrm {d}W_t {+} \sqrt{1 {-} \varrho ^2} \, \mathrm {d}B_t) {+} \sigma _3 \, \mathrm {d}L_t, \end{array}\right. } \qquad t \in [0, \infty ), \end{aligned}$$
(1.1)

with an arbitrary initial value \((Y_0, X_0)\) satisfying \(\mathbb {P}(Y_0 \in [0, \infty )) = 1\) and independent of a 3-dimensional standard Wiener process \((W_t, B_t, L_t)_{t\in [0, \infty )}\), has a pathwise unique strong solution which is a two-factor affine diffusion process; conversely, every two-factor affine diffusion process is a pathwise unique strong solution of an SDE (1.1) with appropriate parameters \(a \in [0, \infty )\), \(b, \alpha , \beta , \gamma \in \mathbb {R}\), \(\sigma _1, \sigma _2, \sigma _3 \in [0, \infty )\) and \(\varrho \in [-1, 1]\), see Proposition 2.1.
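Although the paper works with exact conditional moments rather than discretization schemes, the dynamics in (1.1) are easy to explore numerically. Below is a minimal Euler–Maruyama sketch with a truncation \(Y_t \vee 0\) inside the square root so that the diffusion coefficient stays well defined; all parameter values and the function name are illustrative, not taken from the paper.

```python
import numpy as np

def simulate_affine_2f(a, b, alpha, beta, gamma, s1, s2, s3, rho,
                       y0, x0, T=1.0, n=1000, seed=0):
    """Euler-Maruyama sketch of SDE (1.1); Y is truncated at 0 inside
    the square root (full-truncation-style fix)."""
    rng = np.random.default_rng(seed)
    dt = T / n
    Y = np.empty(n + 1)
    X = np.empty(n + 1)
    Y[0], X[0] = y0, x0
    for i in range(n):
        dW, dB, dL = rng.normal(0.0, np.sqrt(dt), size=3)
        yp = max(Y[i], 0.0)  # keep sqrt(Y) well defined
        Y[i + 1] = Y[i] + (a - b * Y[i]) * dt + s1 * np.sqrt(yp) * dW
        X[i + 1] = (X[i] + (alpha - beta * Y[i] - gamma * X[i]) * dt
                    + s2 * np.sqrt(yp) * (rho * dW + np.sqrt(1 - rho**2) * dB)
                    + s3 * dL)
    return Y, X

Y, X = simulate_affine_2f(a=0.5, b=1.0, alpha=0.2, beta=0.1, gamma=1.0,
                          s1=0.3, s2=0.3, s3=0.1, rho=-0.5, y0=0.5, x0=0.0)
```

The truncation only matters when the Euler step pushes \(Y\) slightly below zero; the exact solution of (1.1) stays non-negative by Proposition 2.1.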

The aim of this paper is to study the asymptotic properties of the conditional least squares estimators (CLSE) \((\widehat{a}_T, \widehat{b}_T, \widehat{\alpha }_T, \widehat{\beta }_T, \widehat{\gamma }_T)\) of the drift parameters \((a, b, \alpha , \beta , \gamma )\) based on continuous time observations \((Y_t, X_t)_{t\in [0,T]}\) with \(T > 0\). This estimator is the high frequency limit in probability as \(n \rightarrow \infty \) of the CLSE based on discrete time observations \((Y_{k/n}, X_{k/n})_{k\in \{0,\ldots ,{\lfloor nT\rfloor }\}}\), \(n \in \mathbb {N}\). We do not estimate the parameters \(\sigma _1\), \(\sigma _2\), \(\sigma _3\) and \(\varrho \), since for all \(T \in (0, \infty )\), they are measurable functions (i.e., statistics) of \((Y_t, X_t)_{t\in [0,T]}\), see Appendix C in the extended arXiv version Bolyog and Pap (2017) of this paper. For the calculation of \((\widehat{a}_T, \widehat{b}_T, \widehat{\alpha }_T, \widehat{\beta }_T, \widehat{\gamma }_T)\) one does not need to know the values of the diffusion coefficients \(\sigma _1\), \(\sigma _2\), \(\sigma _3\) and \(\varrho \), see (3.4).

The first coordinate process Y in (1.1) is called a Cox–Ingersoll–Ross (CIR) process (see Cox et al. 1985). In the submodel consisting only of the process Y, Overbeck and Rydén (1997, Theorems 3.4, 3.5 and 3.6) derived the CLSE of \((a, b)\) based on continuous time observations \((Y_t)_{t\in [0,T]}\) with \(T > 0\), i.e., the limit in probability as \(n \rightarrow \infty \) of the CLSE based on discrete time observations \((Y_{k/n})_{k\in \{0,\ldots ,{\lfloor nT\rfloor }\}}\), \(n \in \mathbb {N}\), which turns out to be the same as the CLSE \((\widehat{a}_T, \widehat{b}_T)\) of \((a, b)\) based on continuous time observations \((Y_t, X_t)_{t\in [0,T]}\), and they proved strong consistency and asymptotic normality in case of a subcritical CIR process Y, i.e., when \(b > 0\) and the initial distribution is the unique stationary distribution of the model.

Barczy et al. (2014) considered a submodel of (1.1) with \(a \in (0, \infty )\), \(\beta = 0\), \(\sigma _1 = 1\), \(\sigma _2 = 1\), \(\varrho = 0\) and \(\sigma _3 = 0\). The estimator of the parameters \((\alpha , \gamma )\) based on continuous time observations \((X_t)_{t\in [0,T]}\) with \(T > 0\) (which they call a least square estimator) is in fact the CLSE, i.e., the limit in probability as \(n \rightarrow \infty \) of the CLSE based on discrete time observations \((X_{k/n})_{k\in \{0,\ldots ,{\lfloor nT\rfloor }\}}\), \(n \in \mathbb {N}\), which can be shown by the method of the proof of Lemma 3.3. They proved strong consistency and asymptotic normality in case of a subcritical process (YX), i.e., when \(b > 0\) and \(\gamma > 0\).

Barczy et al. (2016) considered the so-called Heston model, which is a submodel of (1.1) with \(a, \sigma _1, \sigma _2 \in (0, \infty )\), \(\gamma = 0\), \(\varrho \in (-1, 1)\) and \(\sigma _3 = 0\). The estimator of the parameters \((a, b, \alpha , \beta )\) based on continuous time observations \((Y_t, X_t)_{t\in [0,T]}\) with \(T > 0\) (which they call a least squares estimator) is in fact the CLSE, i.e., the limit in probability as \(n \rightarrow \infty \) of the CLSE based on discrete time observations \((Y_{k/n}, X_{k/n})_{k\in \{0,\ldots ,{\lfloor nT\rfloor }\}}\), \(n \in \mathbb {N}\), which can be shown by the method of the proof of Lemma 3.3. They proved strong consistency and asymptotic normality in case of a subcritical process (YX), i.e., when \(b > 0\). Note that Barczy and Pap (2016) studied the maximum likelihood estimator (MLE) \((\widetilde{a}_T, \widetilde{b}_T, \widetilde{\alpha }_T, \widetilde{\beta }_T)\) of the parameters \((a, b, \alpha , \beta )\) in this Heston model under the additional assumption \(a \geqslant \frac{\sigma _1^2}{2}\). In the subcritical case, i.e., when \(b > 0\), for \((\widetilde{a}_T, \widetilde{b}_T, \widetilde{\alpha }_T, \widetilde{\beta }_T)\), they proved strong consistency and asymptotic normality in case of \(a > \frac{\sigma _1^2}{2}\), and weak consistency in case of \(a = \frac{\sigma _1^2}{2}\). In the critical case, namely, if \(b = 0\), under the additional assumption \(a > \frac{\sigma _1^2}{2}\), they showed weak consistency of \((\widetilde{a}_T, \widetilde{b}_T, \widetilde{\alpha }_T, \widetilde{\beta }_T)\), asymptotic normality of \((\widetilde{a}_T, \widetilde{\alpha }_T)\), and determined the asymptotic behavior of \((\widetilde{a}_T, \widetilde{b}_T, \widetilde{\alpha }_T, \widetilde{\beta }_T)\).
In the supercritical case, namely, when \(b < 0\), they showed that \(\widetilde{b}_T\) is strongly consistent, \(\widetilde{\beta }_T\) is weakly consistent, \((\widetilde{b}_T, \widetilde{\beta }_T)\) is asymptotically mixed normal, and determined the asymptotic behavior of \((\widetilde{a}_T, \widetilde{b}_T, \widetilde{\alpha }_T, \widetilde{\beta }_T)\). Barczy et al. (2018a, b) studied the asymptotic behavior of maximum likelihood estimators for a jump-type Heston model and for the growth rate of a jump-type CIR process, respectively, based on continuous time observations.

We consider general two-factor affine diffusions (1.1). In the subcritical case, i.e., when \(b > 0\) and \(\gamma > 0\), we prove strong consistency and asymptotic normality of \((\widehat{a}_T, \widehat{b}_T, \widehat{\alpha }_T, \widehat{\beta }_T, \widehat{\gamma }_T)\) under the additional assumptions \(a > 0\), \(\sigma _1 > 0\) and \((1 - \varrho ^2) \sigma _2^2 + \sigma _3^2 > 0\). In a special critical case, namely if \(b = 0\) and \(\gamma = 0\), we show weak consistency of \((\widehat{b}_T, \widehat{\beta }_T, \widehat{\gamma }_T)\) and determine the asymptotic behavior of \((\widehat{a}_T, \widehat{b}_T, \widehat{\alpha }_T, \widehat{\beta }_T, \widehat{\gamma }_T)\) under the additional assumptions \(\beta = 0\) and \((1 - \varrho ^2) \sigma _2^2 + \sigma _3^2 > 0\). In a special supercritical case, namely, when \(\gamma < b < 0\), we show strong consistency of \(\widehat{b}_T\), weak consistency of \((\widehat{\beta }_T, \widehat{\gamma }_T)\) and prove asymptotic mixed normality of \((\widehat{a}_T, \widehat{b}_T, \widehat{\alpha }_T, \widehat{\beta }_T, \widehat{\gamma }_T)\) under the additional assumptions \(\alpha \beta \leqslant 0\), \(\sigma _1 > 0\), and either \(\sigma _3 > 0\), or \(\bigl (a - \frac{\sigma _1^2}{2}\bigr ) (1 - \varrho ^2) \sigma _2^2 > 0\). Note that we decided to deal with the CLSE of \((a, b, \alpha , \beta , \gamma )\), since the MLE of \((a, b, \alpha , \beta , \gamma )\) contains, for example, \(\int _0^T \frac{X_t}{(1-\varrho ^2)\sigma _2^2Y_t+\sigma _3^2} \, \mathrm {d}t\), and the question of the asymptotic behavior of this integral as \(T \rightarrow \infty \) is still open in the critical and supercritical cases. For the sake of brevity, some simple proofs and calculation steps are omitted from the paper. However, all these details are included in the extended arXiv version Bolyog and Pap (2017) of this paper.

2 The affine two-factor model

Let \(\mathbb {N}\), \(\mathbb {Z}_+\), \(\mathbb {R}\), \(\mathbb {R}_+\), \(\mathbb {R}_{++}\), \(\mathbb {R}_-\), \(\mathbb {R}_{--}\) and \(\mathbb {C}\) denote the sets of positive integers, non-negative integers, real numbers, non-negative real numbers, positive real numbers, non-positive real numbers, negative real numbers and complex numbers, respectively. For \(x, y \in \mathbb {R}\), we will use the notations \(x \wedge y := \min (x, y)\) and \(x \vee y := \max (x, y)\). By \(C^2_\mathrm {c}(\mathbb {R}_+ \times \mathbb {R}, \mathbb {R})\), we denote the set of twice continuously differentiable real-valued functions on \(\mathbb {R}_+ \times \mathbb {R}\) with compact support. Let \((\varOmega , \mathcal {F}, \mathbb {P})\) be a probability space equipped with the augmented filtration \((\mathcal {F}_t)_{t\in \mathbb {R}_+}\) corresponding to \((W_t, B_t, L_t)_{t\in \mathbb {R}_+}\) and a given initial value \((\eta _0, \xi _0)\) being independent of \((W_t, B_t, L_t)_{t\in \mathbb {R}_+}\) such that \(\mathbb {P}(\eta _0 \in \mathbb {R}_+) = 1\), constructed as in Karatzas and Shreve (1991, Sect. 5.2). Note that \((\mathcal {F}_t)_{t\in \mathbb {R}_+}\) satisfies the usual conditions, i.e., the filtration \((\mathcal {F}_t)_{t\in \mathbb {R}_+}\) is right-continuous and \(\mathcal {F}_0\) contains all the \(\mathbb {P}\)-null sets in \(\mathcal {F}\). We will denote the convergence in distribution, convergence in probability, almost sure convergence and equality in distribution by \({\mathop {\longrightarrow }\limits ^{\mathcal {D}}}\), \({\mathop {\longrightarrow }\limits ^{\mathbb {P}}}\), \({\mathop {\longrightarrow }\limits ^{{\mathrm {a.s.}}}}\) and \({\mathop {=}\limits ^{\mathcal {D}}}\), respectively. By \(\Vert {\varvec{x}}\Vert \) and \(\Vert {\varvec{A}}\Vert \), we denote the Euclidean norm of a vector \({\varvec{x}}\in \mathbb {R}^d\) and the spectral norm of a matrix \({\varvec{A}}\in \mathbb {R}^{d \times d}\), respectively.
By \({\varvec{I}}_d \in \mathbb {R}^{d \times d}\), we denote the \(d\times d\) unit matrix. For square matrices \({\varvec{A}}_1, \ldots , {\varvec{A}}_k\), \({\text {diag}}({\varvec{A}}_1, \ldots , {\varvec{A}}_k)\) will denote the square block matrix containing the matrices \({\varvec{A}}_1, \ldots , {\varvec{A}}_k\) in its diagonal.

The next proposition is about the existence and uniqueness of a strong solution of the SDE (1.1), see Bolyog and Pap (2016, Proposition 2.2).

Proposition 2.1

Let \((\eta _0, \xi _0)\) be a random vector independent of the process \((W_t, B_t, L_t)_{t\in \mathbb {R}_+}\) satisfying \(\mathbb {P}(\eta _0 \in \mathbb {R}_+) = 1\). Then for all \(a \in \mathbb {R}_+\), \(b, \alpha , \beta , \gamma \in \mathbb {R}\), \(\sigma _1, \sigma _2, \sigma _3 \in \mathbb {R}_+\), \(\varrho \in [-1, 1]\), there is a (pathwise) unique strong solution \((Y_t, X_t)_{t\in \mathbb {R}_+}\) of the SDE (1.1) such that \(\mathbb {P}((Y_0, X_0) = (\eta _0, \xi _0)) = 1\) and \(\mathbb {P}(Y_t \in \mathbb {R}_+ \text { for all } t \in \mathbb {R}_+) = 1\). Further, for all \(s, t \in \mathbb {R}_+\) with \(s \leqslant t\), we have

$$\begin{aligned} Y_t = \mathrm {e}^{-b(t-s)} Y_s + a \int _s^t \mathrm {e}^{-b(t-u)} \, \mathrm {d}u + \sigma _1 \int _s^t \mathrm {e}^{-b(t-u)} \sqrt{Y_u} \, \mathrm {d}W_u \end{aligned}$$
(2.1)

and

$$\begin{aligned} \begin{aligned} X_t&= \mathrm {e}^{-\gamma (t-s)} X_s + \int _s^t \mathrm {e}^{-\gamma (t-u)} (\alpha - \beta Y_u) \, \mathrm {d}u \\&\quad + \sigma _2 \int _s^t \mathrm {e}^{-\gamma (t-u)}\sqrt{Y_u} \, (\varrho \, \mathrm {d}W_u + \sqrt{1 - \varrho ^2} \, \mathrm {d}B_u) + \sigma _3 \int _s^t \mathrm {e}^{-\gamma (t-u)} \, \mathrm {d}L_u. \end{aligned} \end{aligned}$$
(2.2)

Moreover, \((Y_t, X_t)_{t\in \mathbb {R}_+}\) is a two-factor affine process with infinitesimal generator

$$\begin{aligned} \begin{aligned} (\mathcal {A}_{(Y,X)} f)(y, x)&= (a - b y) f_1'(y, x) + (\alpha - \beta y - \gamma x) f_2'(y, x) \\&\quad + \frac{1}{2} y \bigl [ \sigma _1^2 f_{1,1}''(y, x) + 2 \varrho \sigma _1 \sigma _2 f_{1,2}''(y, x) + \sigma _2^2 f_{2, 2}''(y,x) \bigr ] \\&\quad + \frac{1}{2} \sigma _3^2 f_{2, 2}''(y,x), \end{aligned} \end{aligned}$$
(2.3)

where \((y,x) \in \mathbb {R}_+ \times \mathbb {R}\), \(f \in \mathcal {C}^2_c(\mathbb {R}_+ \times \mathbb {R}, \mathbb {R})\), and \(f_i'\), \(i \in \{1, 2\}\), and \(f_{i,j}''\), \(i, j \in \{1, 2\}\), denote the first order partial derivatives of f with respect to its i-th variable and the second order partial derivatives of f with respect to its i-th and j-th variables, respectively.

Conversely, every two-factor affine diffusion process is a (pathwise) unique strong solution of an SDE (1.1) with suitable parameters \(a \in \mathbb {R}_+\), \(b, \alpha , \beta , \gamma \in \mathbb {R}\), \(\sigma _1, \sigma _2, \sigma _3 \in \mathbb {R}_+\) and \(\varrho \in [-1, 1]\).

The next proposition gives the asymptotic behavior of the first moment of the process \((Y_t, X_t)_{t\in \mathbb {R}_+}\) as \(t \rightarrow \infty \), see Bolyog and Pap (2016, Prop. 2.3).

Proposition 2.2

Let us consider the two-factor affine diffusion model (1.1) with \(a \in \mathbb {R}_+\), \(b, \alpha , \beta , \gamma \in \mathbb {R}\), \(\sigma _1, \sigma _2, \sigma _3 \in \mathbb {R}_+\), \(\varrho \in [-1, 1]\). Suppose that \(\mathbb {E}(Y_0 |X_0|) < \infty \).

(i) If \(b, \gamma \in \mathbb {R}_{++}\) then \(\lim _{t\rightarrow \infty } \mathbb {E}(Y_t) = \frac{a}{b}\) and \(\lim _{t\rightarrow \infty } \mathbb {E}(X_t) = \frac{\alpha }{\gamma } - \frac{a\beta }{b\gamma }\).

(ii) If \(b \in \mathbb {R}_{++}\) and \(\gamma = 0\) then \(\lim _{t\rightarrow \infty } \mathbb {E}(Y_t) = \frac{a}{b}\) and \(\lim _{t\rightarrow \infty } t^{-1} \mathbb {E}(X_t) = \alpha - \frac{a\beta }{b}\).

(iii) If \(b = 0\) and \(\gamma \in \mathbb {R}_{++}\) then \(\lim _{t\rightarrow \infty } t^{-1} \mathbb {E}(Y_t) = a\) and \(\lim _{t\rightarrow \infty } t^{-1} \mathbb {E}(X_t) = - \frac{a\beta }{\gamma }\).

(iv) If \(b = \gamma = 0\) then \(\lim _{t\rightarrow \infty } t^{-1} \mathbb {E}(Y_t) = a\) and \(\lim _{t\rightarrow \infty } t^{-2} \mathbb {E}(X_t) = - \frac{1}{2} a \beta \).

(v) Otherwise, there exists \(c \in \mathbb {R}_{++}\) such that \(\lim _{t\rightarrow \infty } \mathrm {e}^{-ct} \mathbb {E}(Y_t) \in \mathbb {R}\) or \(\lim _{t\rightarrow \infty } \mathrm {e}^{-ct} \mathbb {E}(X_t) \in \mathbb {R}\).
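The limits in Proposition 2.2 can be read off from the first-moment ODE system implied by (1.1): taking expectations kills the martingale terms and leaves \(m_Y'(t) = a - b\,m_Y(t)\), \(m_X'(t) = \alpha - \beta \,m_Y(t) - \gamma \,m_X(t)\). A small numerical sketch of case (i), with purely illustrative parameter values:

```python
# Mean ODEs implied by (1.1): the martingale parts have zero expectation, so
#   m_Y'(t) = a - b * m_Y(t),   m_X'(t) = alpha - beta * m_Y(t) - gamma * m_X(t).
# Illustrative parameters for case (i): b > 0 and gamma > 0.
a, b, alpha, beta, gamma = 0.5, 1.0, 0.2, 0.1, 2.0
mY, mX = 0.0, 0.0             # starting from E(Y_0) = E(X_0) = 0
dt, T = 1e-3, 40.0
for _ in range(int(T / dt)):  # explicit Euler scheme for the linear ODE system
    mY, mX = (mY + (a - b * mY) * dt,
              mX + (alpha - beta * mY - gamma * mX) * dt)

limit_Y = a / b                                  # Proposition 2.2 (i)
limit_X = alpha / gamma - a * beta / (b * gamma)
```

By time \(T = 40\) the transients \(\mathrm {e}^{-bT}\) and \(\mathrm {e}^{-\gamma T}\) are negligible, so the iterates sit essentially at the stated limits.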

Based on the asymptotic behavior of the first moment of the process \((Y_t, X_t)_{t\in \mathbb {R}_+}\) as \(t \rightarrow \infty \), we can classify two-factor affine diffusions in the following way.

Definition 2.3

Let \((Y_t, X_t)_{t\in \mathbb {R}_+}\) be the unique strong solution of the SDE (1.1) satisfying \(\mathbb {P}(Y_0 \in \mathbb {R}_+) = 1\). We call \((Y_t, X_t)_{t\in \mathbb {R}_+}\) subcritical, critical or supercritical if \(b \wedge \gamma \in \mathbb {R}_{++}\), \(b \wedge \gamma = 0\) or \(b \wedge \gamma \in \mathbb {R}_{--}\), respectively.
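Definition 2.3 depends on the drift parameters only through \(b \wedge \gamma \); a one-line helper (hypothetical, for illustration only):

```python
def criticality(b: float, gamma: float) -> str:
    """Classify a two-factor affine diffusion by the sign of b ^ gamma
    (minimum of b and gamma), following Definition 2.3."""
    m = min(b, gamma)
    if m > 0:
        return "subcritical"
    return "critical" if m == 0 else "supercritical"
```

For instance, the special supercritical case studied later, \(\gamma< b < 0\), indeed has \(b \wedge \gamma = \gamma < 0\).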

3 CLSE based on continuous time observations

Overbeck and Rydén (1997) investigated the CIR process Y, and for each \(T \in \mathbb {R}_{++}\), they defined a CLSE \((\widehat{a}_T, \widehat{b}_T)\) of \((a, b)\) based on continuous time observations \((Y_t)_{t\in [0,T]}\) as the limit in probability of the CLSE \((\widehat{a}_{T,n}, \widehat{b}_{T,n})\) of \((a, b)\) based on discrete time observations \((Y_{\frac{iT}{n}})_{i\in \{0,1,\ldots ,n\}}\) as \(n \rightarrow \infty \).

We consider a two-factor affine diffusion process \((Y_t, X_t)_{t\in \mathbb {R}_+}\) given in (1.1) with known \(\sigma _1 \in \mathbb {R}_{++}\), \(\sigma _2, \sigma _3 \in \mathbb {R}_+\) and \(\varrho \in [-1, 1]\), and with a random initial value \((\eta _0, \xi _0)\) independent of \((W_t, B_t, L_t)_{t\in \mathbb {R}_+}\) satisfying \(\mathbb {P}(\eta _0 \in \mathbb {R}_+) = 1\), and we will consider \({\varvec{\theta }}= (a, b, \alpha , \beta , \gamma )^\top \in \mathbb {R}_+ \times \mathbb {R}^4\) as a parameter. The aim of the following discussion is to construct a CLSE of \({\varvec{\theta }}\) based on continuous time observations \((Y_t, X_t)_{t\in [0,T]}\) with some \(T \in \mathbb {R}_{++}\).

Let us recall the CLSE \({\widehat{{\varvec{\theta }}}}_{T,n}\) of \({\varvec{\theta }}\) based on discrete time observations \((Y_{\frac{i}{n}}, X_{\frac{i}{n}})_{i\in \{0,1,\ldots ,{\lfloor nT\rfloor }\}}\) with some \(n \in \mathbb {N}\), which can be obtained by solving the extremum problem

$$\begin{aligned} {\widehat{{\varvec{\theta }}}}_{T,n} := \mathop {\mathrm{arg\,min}}\limits _{{\varvec{\theta }}\in \mathbb {R}^5} \sum _{i=1}^{\lfloor nT\rfloor }\left[ \Bigl (Y_{\frac{i}{n}} - \mathbb {E}\Bigl (Y_{\frac{i}{n}} \,\Big |\, \mathcal {F}_{\frac{i-1}{n}}\Bigr )\Bigr )^2 + \Bigl (X_{\frac{i}{n}} - \mathbb {E}\Bigl (X_{\frac{i}{n}} \,\Big |\, \mathcal {F}_{\frac{i-1}{n}}\Bigr )\Bigr )^2\right] . \end{aligned}$$

By (2.1) and (2.2), together with Proposition 3.2.10 in Karatzas and Shreve (1991), for all \(i \in \mathbb {N}\), we obtain

$$\begin{aligned} \mathbb {E}\Bigl (Y_{\frac{i}{n}} \,|\,\mathcal {F}_{\frac{i-1}{n}}\Bigr ) = \mathrm {e}^{-\frac{b}{n}} Y_{\frac{i-1}{n}} + a \int _0^{\frac{1}{n}} \mathrm {e}^{-bw} \, \mathrm {d}w \end{aligned}$$

and

$$\begin{aligned} \mathbb {E}\Bigl (X_{\frac{i}{n}} \,|\,\mathcal {F}_{\frac{i-1}{n}}\Bigr )= & {} \mathrm {e}^{-\frac{\gamma }{n}} X_{\frac{i-1}{n}} + \alpha \int _0^{\frac{1}{n}} \mathrm {e}^{-\gamma w} \, \mathrm {d}w - \beta Y_{\frac{i-1}{n}} \int _0^{\frac{1}{n}} \mathrm {e}^{(\gamma -b)w-\frac{\gamma }{n}} \, \mathrm {d}w \\&- a \beta \int _0^{\frac{1}{n}} \mathrm {e}^{\gamma w-\frac{\gamma }{n}} \biggl (\int _0^w \mathrm {e}^{-b(w-v)} \, \mathrm {d}v\biggr ) \mathrm {d}w. \end{aligned}$$

Consequently,

$$\begin{aligned} {\widehat{{\varvec{\theta }}}}_{T,n} = \mathop {\mathrm{arg\,min}}\limits _{(a,b,\alpha ,\beta ,\gamma )^\top \in \mathbb {R}^5} \sum _{i=1}^{\lfloor nT\rfloor }\biggl [&\Bigl (Y_{\frac{i}{n}} - Y_{\frac{i-1}{n}} - \Bigl (c - d Y_{\frac{i-1}{n}}\Bigr )\Bigr )^2 \nonumber \\&+ \Bigl (X_{\frac{i}{n}} - X_{\frac{i-1}{n}} - \Bigl (\delta - \varepsilon Y_{\frac{i-1}{n}} - \zeta X_{\frac{i-1}{n}}\Bigr )\Bigr )^2\biggr ], \end{aligned}$$
(3.1)

where

$$\begin{aligned} (c, d, \delta , \varepsilon , \zeta ) := (c_n(a, b), d_n(b), \delta _n(a, b, \alpha , \beta , \gamma ), \varepsilon _n(b, \beta , \gamma ), \zeta _n(\gamma )) := g_n(a, b, \alpha , \beta , \gamma ) \end{aligned}$$
(3.2)

with

$$\begin{aligned} c:= & {} c_n(a, b) := a \int _0^{\frac{1}{n}} \mathrm {e}^{-bw} \, \mathrm {d}w, \qquad d := d_n(b) := 1 - \mathrm {e}^{-\frac{b}{n}}, \\ \delta:= & {} \delta _n(a, b, \alpha , \beta , \gamma ) := \alpha \int _0^{\frac{1}{n}} \mathrm {e}^{-\gamma w} \, \mathrm {d}w - a \beta \int _0^{\frac{1}{n}} \mathrm {e}^{\gamma w-\frac{\gamma }{n}} \biggl (\int _0^w \mathrm {e}^{-b(w-v)} \, \mathrm {d}v\biggr ) \mathrm {d}w, \\ \varepsilon:= & {} \varepsilon _n(b, \beta , \gamma ) := \beta \int _0^{\frac{1}{n}} \mathrm {e}^{(\gamma -b)w-\frac{\gamma }{n}} \, \mathrm {d}w, \qquad \zeta := \zeta _n(\gamma ) := 1 - \mathrm {e}^{-\frac{\gamma }{n}}. \end{aligned}$$
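The transformed parameters are straightforward to evaluate numerically; the following sketch (function names illustrative) computes \((c, d, \delta , \varepsilon , \zeta ) = g_n(a, b, \alpha , \beta , \gamma )\) with the integrals over \([0, \frac{1}{n}]\) approximated by the trapezoidal rule:

```python
import numpy as np

def _trap(y, x):
    """Composite trapezoidal rule for samples y on the grid x."""
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)

def g_n(a, b, alpha, beta, gamma, n, m=400):
    """Transformed parameters (c, d, delta, epsilon, zeta) of (3.2);
    the integrals over [0, 1/n] are evaluated numerically."""
    w = np.linspace(0.0, 1.0 / n, m)
    c = a * _trap(np.exp(-b * w), w)
    d = 1.0 - np.exp(-b / n)
    # inner integral int_0^w e^{-b(w - v)} dv for every grid point w
    inner = np.array([_trap(np.exp(-b * (wi - np.linspace(0.0, wi, m))),
                            np.linspace(0.0, wi, m)) for wi in w])
    delta = (alpha * _trap(np.exp(-gamma * w), w)
             - a * beta * _trap(np.exp(gamma * w - gamma / n) * inner, w))
    epsilon = beta * _trap(np.exp((gamma - b) * w - gamma / n), w)
    zeta = 1.0 - np.exp(-gamma / n)
    return c, d, delta, epsilon, zeta
```

For \(b = \gamma = 0\) the integrands are constant or linear, so the quadrature is exact there: \(c = \frac{a}{n}\), \(d = \zeta = 0\), \(\varepsilon = \frac{\beta }{n}\) and \(\delta = \frac{\alpha }{n} - \frac{a\beta }{2n^2}\), consistent with the first order approximation \(g_n({\varvec{\theta }}) \approx \frac{1}{n}{\varvec{\theta }}\) of Remark 3.2.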

Since the function \(g_n : \mathbb {R}^5 \rightarrow \mathbb {R}\times (-\infty , 1) \times \mathbb {R}^2 \times (-\infty , 1)\) is bijective, first we determine the CLSE \((\widehat{c}_{T,n}, \widehat{d}_{T,n}, \widehat{\delta }_{T,n}, \widehat{\varepsilon }_{T,n}, \widehat{\zeta }_{T,n})\) of the transformed parameters \((c, d, \delta , \varepsilon , \zeta )\) by minimizing the sum on the right-hand side of (3.1) with respect to \((c, d, \delta , \varepsilon , \zeta )\). We have

$$\begin{aligned} \bigl (\widehat{c}_{T,n}, \widehat{d}_{T,n}\bigr )&= \mathop {\mathrm{arg\,min}}\limits _{(c,d)^\top \in \mathbb {R}^2} \sum _{i=1}^{\lfloor nT\rfloor }\Bigl (Y_{\frac{i}{n}} - Y_{\frac{i-1}{n}} - \Bigl (c - d Y_{\frac{i-1}{n}}\Bigr )\Bigr )^2, \\ \bigl (\widehat{\delta }_{T,n}, \widehat{\varepsilon }_{T,n}, \widehat{\zeta }_{T,n}\bigr )&= \mathop {\mathrm{arg\,min}}\limits _{(\delta ,\varepsilon ,\zeta )^\top \in \mathbb {R}^3} \sum _{i=1}^{\lfloor nT\rfloor }\Bigl (X_{\frac{i}{n}} - X_{\frac{i-1}{n}} - \Bigl (\delta - \varepsilon Y_{\frac{i-1}{n}} - \zeta X_{\frac{i-1}{n}}\Bigr )\Bigr )^2, \end{aligned}$$

hence, as on page 675 in Barczy et al. (2013), we get

$$\begin{aligned} \begin{bmatrix} \widehat{c}_{T,n} \\ \widehat{d}_{T,n} \end{bmatrix} = \bigl ({\varvec{\varGamma }}_{T,n}^{(1)}\bigr )^{-1} {\varvec{\varphi }}_{T,n}^{(1)}, \qquad \begin{bmatrix} \widehat{\delta }_{T,n} \\ \widehat{\varepsilon }_{T,n} \\ \widehat{\zeta }_{T,n} \end{bmatrix} = \bigl ({\varvec{\varGamma }}_{T,n}^{(2)}\bigr )^{-1} {\varvec{\varphi }}_{T,n}^{(2)} \end{aligned}$$
(3.3)

with

$$\begin{aligned} {\varvec{\varGamma }}_{T,n}^{(1)}:= & {} \begin{bmatrix} {\lfloor nT\rfloor }&\quad - \sum \limits _{i=1}^{\lfloor nT\rfloor }Y_{\frac{i-1}{n}} \\ - \sum \limits _{i=1}^{\lfloor nT\rfloor }Y_{\frac{i-1}{n}}&\quad \sum \limits _{i=1}^{\lfloor nT\rfloor }Y_{\frac{i-1}{n}}^2 \end{bmatrix}, \qquad {\varvec{\varphi }}_{T,n}^{(1)} := \begin{bmatrix} Y_{\frac{{\lfloor nT\rfloor }}{n}} - Y_0 \\ - \sum \limits _{i=1}^{\lfloor nT\rfloor }\Bigl (Y_{\frac{i}{n}} - Y_{\frac{i-1}{n}}\Bigr ) Y_{\frac{i-1}{n}} \end{bmatrix},\\ {\varvec{\varGamma }}_{T,n}^{(2)}:= & {} \begin{bmatrix} {\lfloor nT\rfloor }&\quad - \sum \limits _{i=1}^{\lfloor nT\rfloor }Y_{\frac{i-1}{n}}&\quad - \sum \limits _{i=1}^{\lfloor nT\rfloor }X_{\frac{i-1}{n}} \\ - \sum \limits _{i=1}^{\lfloor nT\rfloor }Y_{\frac{i-1}{n}}&\quad \sum \limits _{i=1}^{\lfloor nT\rfloor }Y_{\frac{i-1}{n}}^2&\quad \sum \limits _{i=1}^{\lfloor nT\rfloor }Y_{\frac{i-1}{n}} X_{\frac{i-1}{n}} \\ - \sum \limits _{i=1}^{\lfloor nT\rfloor }X_{\frac{i-1}{n}}&\quad \sum \limits _{i=1}^{\lfloor nT\rfloor }Y_{\frac{i-1}{n}} X_{\frac{i-1}{n}}&\quad \sum \limits _{i=1}^{\lfloor nT\rfloor }X_{\frac{i-1}{n}}^2 \end{bmatrix}, \\ {\varvec{\varphi }}_{T,n}^{(2)}:= & {} \begin{bmatrix} X_{\frac{{\lfloor nT\rfloor }}{n}} - X_0 \\ - \sum \limits _{i=1}^{\lfloor nT\rfloor }\Bigl (X_{\frac{i}{n}} - X_{\frac{i-1}{n}}\Bigr ) Y_{\frac{i-1}{n}} \\ - \sum \limits _{i=1}^{\lfloor nT\rfloor }\Bigl (X_{\frac{i}{n}} - X_{\frac{i-1}{n}}\Bigr ) X_{\frac{i-1}{n}} \end{bmatrix} \end{aligned}$$

on the event where the random matrices \({\varvec{\varGamma }}_{T,n}^{(1)}\) and \({\varvec{\varGamma }}_{T,n}^{(2)}\) are invertible.
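Equations (3.3) are the normal equations of the two least squares regressions displayed above. A sketch (function name and data arrays are illustrative, not from the paper) assembling \({\varvec{\varGamma }}_{T,n}^{(i)}\) and \({\varvec{\varphi }}_{T,n}^{(i)}\) directly from the observations:

```python
import numpy as np

def clse_transformed(Y, X):
    """CLSE (c, d, delta, epsilon, zeta) of the transformed parameters
    via (3.3); Y, X hold the observations (Y_{i/n}, X_{i/n}),
    i = 0, ..., floor(nT)."""
    Yp, Xp = Y[:-1], X[:-1]          # lagged values Y_{(i-1)/n}, X_{(i-1)/n}
    dY, dX = np.diff(Y), np.diff(X)  # increments
    N = len(dY)                      # floor(nT)
    G1 = np.array([[N, -Yp.sum()],
                   [-Yp.sum(), (Yp**2).sum()]])
    f1 = np.array([dY.sum(),         # telescopes to Y_{floor(nT)/n} - Y_0
                   -(dY * Yp).sum()])
    G2 = np.array([[N, -Yp.sum(), -Xp.sum()],
                   [-Yp.sum(), (Yp**2).sum(), (Yp * Xp).sum()],
                   [-Xp.sum(), (Yp * Xp).sum(), (Xp**2).sum()]])
    f2 = np.array([dX.sum(),
                   -(dX * Yp).sum(),
                   -(dX * Xp).sum()])
    c, d = np.linalg.solve(G1, f1)
    delta, epsilon, zeta = np.linalg.solve(G2, f2)
    return c, d, delta, epsilon, zeta
```

Since \({\varvec{\varGamma }}_{T,n}^{(1)}\) and \({\varvec{\varGamma }}_{T,n}^{(2)}\) are exactly the Gram matrices of the regression designs \((1, -Y_{\frac{i-1}{n}})\) and \((1, -Y_{\frac{i-1}{n}}, -X_{\frac{i-1}{n}})\), the output coincides with an ordinary least squares fit of the increments.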

Lemma 3.1

Let us consider the two-factor affine diffusion model (1.1) with \(a \in \mathbb {R}_+\), \(b, \alpha , \beta , \gamma \in \mathbb {R}\), \(\sigma _1 \in \mathbb {R}_{++}\), \(\sigma _2, \sigma _3 \in \mathbb {R}_+\) and \(\varrho \in [-1, 1]\) with a random initial value \((\eta _0, \xi _0)\) independent of \((W_t, B_t, L_t)_{t\in \mathbb {R}_+}\) satisfying \(\mathbb {P}(\eta _0 \in \mathbb {R}_+) = 1\). Suppose that \((1 - \varrho ^2) \sigma _2^2 + \sigma _3^2 > 0\). Then for each \(T \in \mathbb {R}_{++}\) and \(n \in \mathbb {N}\), the random matrices \({\varvec{\varGamma }}_{T,n}^{(1)}\) and \({\varvec{\varGamma }}_{T,n}^{(2)}\) are invertible almost surely, and hence there exists a unique CLSE \(\bigl (\widehat{c}_{T,n}, \widehat{d}_{T,n}, \widehat{\delta }_{T,n}, \widehat{\varepsilon }_{T,n}, \widehat{\zeta }_{T,n}\bigr )\) of \((c, d, \delta , \varepsilon , \zeta )\) taking the form given in (3.3).

A proof can be found in the extended arXiv version Bolyog and Pap (2017) of this paper.

Remark 3.2

The first order Taylor approximation of \(g_n(a, b, \alpha , \beta , \gamma )\) at (0, 0, 0, 0, 0) is \(\frac{1}{n}(a, b, \alpha , \beta , \gamma )\), hence we obtain the first order Taylor approximations

$$\begin{aligned}&Y_{\frac{i}{n}} - \mathbb {E}\Bigl (Y_{\frac{i}{n}} \,|\,\mathcal {F}_{\frac{i-1}{n}}\Bigr ) \approx Y_{\frac{i}{n}} - Y_{\frac{i-1}{n}} - \frac{1}{n} \Bigl (a - b Y_{\frac{i-1}{n}}\Bigr ), \\&X_{\frac{i}{n}} - \mathbb {E}\Bigl (X_{\frac{i}{n}} \,|\,\mathcal {F}_{\frac{i-1}{n}}\Bigr ) \approx X_{\frac{i}{n}} - X_{\frac{i-1}{n}} - \frac{1}{n} \Bigl (\alpha - \beta Y_{\frac{i-1}{n}} - \gamma X_{\frac{i-1}{n}}\Bigr ). \end{aligned}$$

Using these approximations, one can define an approximate CLSE \({\widehat{{\varvec{\theta }}}}_{T,n}^{\mathrm {approx}}\) of \({\varvec{\theta }}\) based on discrete time observations \((Y_{\frac{i}{n}}, X_{\frac{i}{n}})_{i\in \{0,1,\ldots ,{\lfloor nT\rfloor }\}}\), \(n \in \mathbb {N}\), by solving the extremum problem

$$\begin{aligned} {\widehat{{\varvec{\theta }}}}_{T,n}^{\mathrm {approx}} := \mathop {\mathrm{arg\,min}}\limits _{(a,b,\alpha ,\beta ,\gamma )^\top \in \mathbb {R}^5} \sum _{i=1}^{\lfloor nT\rfloor }\biggl [&\Bigl (Y_{\frac{i}{n}} - Y_{\frac{i-1}{n}} - \frac{1}{n} \Bigl (a - b Y_{\frac{i-1}{n}}\Bigr )\Bigr )^2 \\&+ \Bigl (X_{\frac{i}{n}} - X_{\frac{i-1}{n}} - \frac{1}{n} \Bigl (\alpha - \beta Y_{\frac{i-1}{n}} - \gamma X_{\frac{i-1}{n}}\Bigr )\Bigr )^2\biggr ], \end{aligned}$$

hence \({\widehat{{\varvec{\theta }}}}_{T,n}^{\mathrm {approx}} = n \bigl (\widehat{c}_{T,n}, \widehat{d}_{T,n}, \widehat{\delta }_{T,n}, \widehat{\varepsilon }_{T,n}, \widehat{\zeta }_{T,n}\bigr )^\top \). This definition of the approximate CLSE corresponds to the definition of the LSE given in Hu and Long (2009a, formula (1.2)) for generalized Ornstein–Uhlenbeck processes driven by \(\alpha \)-stable motions, see also Hu and Long (2009b, formula (3.1)). For a heuristic motivation of the estimator \({\widehat{{\varvec{\theta }}}}_{T,n}^{\mathrm {approx}}\) based on discrete observations, see, e.g., Hu and Long (2007, p. 178) (formulated for Langevin equations). \(\square \)

We have

$$\begin{aligned}&\frac{1}{n} {\varvec{\varGamma }}_{T,n}^{(1)} {\mathop {\longrightarrow }\limits ^{{\mathrm {a.s.}}}}\begin{bmatrix} T&\quad -\int _0^T Y_s \, \mathrm {d}s \\ - \int _0^T Y_s \, \mathrm {d}s&\quad \int _0^T Y_s^2 \, \mathrm {d}s \end{bmatrix} =: {\varvec{G}}_T^{(1)}, \\&\frac{1}{n} {\varvec{\varGamma }}_{T,n}^{(2)} {\mathop {\longrightarrow }\limits ^{{\mathrm {a.s.}}}}\begin{bmatrix} T&\quad -\int _0^T Y_s \, \mathrm {d}s&\quad -\int _0^T X_s \, \mathrm {d}s\\ -\int _0^T Y_s \, \mathrm {d}s&\quad \int _0^T {Y_s}^2 \, \mathrm {d}s&\quad \int _0^T X_s Y_s \, \mathrm {d}s\\ -\int _0^T X_s \, \mathrm {d}s&\quad \int _0^T X_s Y_s \, \mathrm {d}s&\quad \int _0^T X_s^2 \, \mathrm {d}s \end{bmatrix} =: {\varvec{G}}_T^{(2)} \end{aligned}$$

as \(n \rightarrow \infty \), since \((Y_t, X_t)_{t\in \mathbb {R}_+}\) is almost surely continuous. By Proposition I.4.44 in Jacod and Shiryaev (2003) with the Riemann sequence of deterministic subdivisions \(\left( \frac{i}{n} \wedge T\right) _{i\in \mathbb {N}}\), \(n \in \mathbb {N}\), we obtain

$$\begin{aligned} {\varvec{\varphi }}_{T,n}^{(1)} {\mathop {\longrightarrow }\limits ^{\mathbb {P}}}\begin{bmatrix} Y_T - Y_0 \\ -\int _0^T Y_s \, \mathrm {d}Y_s \end{bmatrix} =: {\varvec{f}}_T^{(1)}, \qquad {\varvec{\varphi }}_{T,n}^{(2)} {\mathop {\longrightarrow }\limits ^{\mathbb {P}}}\begin{bmatrix} X_T - X_0 \\ -\int _0^T Y_s\, \mathrm {d}X_s \\ -\int _0^T X_s\, \mathrm {d}X_s \end{bmatrix} =: {\varvec{f}}_T^{(2)}, \end{aligned}$$

as \(n \rightarrow \infty \). By Slutsky’s lemma, using also Lemma 3.1, we conclude

$$\begin{aligned} {\widehat{{\varvec{\theta }}}}_{T,n}^{\mathrm {approx}} = n \begin{bmatrix} \widehat{c}_{T,n} \\ \widehat{d}_{T,n} \\ \widehat{\delta }_{T,n} \\ \widehat{\varepsilon }_{T,n} \\ \widehat{\zeta }_{T,n} \end{bmatrix} {\mathop {\longrightarrow }\limits ^{\mathbb {P}}}\begin{bmatrix} ({\varvec{G}}_T^{(1)})^{-1} {\varvec{f}}_T^{(1)} \\ ({\varvec{G}}_T^{(2)})^{-1} {\varvec{f}}_T^{(2)} \end{bmatrix} =: \begin{bmatrix} \widehat{a}_T \\ \widehat{b}_T \\ \widehat{\alpha }_T \\ \widehat{\beta }_T \\ \widehat{\gamma }_T \end{bmatrix} =: {\widehat{{\varvec{\theta }}}}_T \qquad \text {as} \ n \rightarrow \infty , \end{aligned}$$
(3.4)

whenever the random matrices \({\varvec{G}}_T^{(1)}\) and \({\varvec{G}}_T^{(2)}\) are invertible.
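The convergence of the forward Riemann sums in \({\varvec{\varphi }}_{T,n}^{(i)}\) can be illustrated on a smooth deterministic stand-in path, where the quadratic variation vanishes and the limit integral \(\int _0^T y \, \mathrm {d}y\) equals \(\frac{1}{2}(y(T)^2 - y(0)^2)\) (for the stochastic paths of (1.1) an Itô correction would appear instead). The path \(y(t) = \sin t + 2\) below is purely illustrative:

```python
import numpy as np

# Forward Riemann sums of the kind appearing in phi_{T,n}, evaluated on the
# smooth deterministic stand-in path y(t) = sin(t) + 2; for such a path the
# quadratic variation vanishes, so the limit is the Stieltjes integral
# int_0^T y dy = (y(T)^2 - y(0)^2) / 2.
T, n = 1.0, 100_000
t = np.linspace(0.0, T, n + 1)
y = np.sin(t) + 2.0
# sum of y_{(i-1)/n} * (y_{i/n} - y_{(i-1)/n}), as in phi_{T,n}^{(1)}
forward_sum = float(np.sum(y[:-1] * np.diff(y)))
exact = (y[-1] ** 2 - y[0] ** 2) / 2.0
```

The discrepancy is \(-\frac{1}{2}\sum (\Delta y)^2 = O(n^{-1})\), which is exactly the term that survives, as the quadratic variation, for the Itô integrals in \({\varvec{f}}_T^{(1)}\) and \({\varvec{f}}_T^{(2)}\).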

Lemma 3.3

Let us consider the two-factor affine diffusion model (1.1) with \(a \in \mathbb {R}_+\), \(b, \alpha , \beta , \gamma \in \mathbb {R}\), \(\sigma _1 \in \mathbb {R}_{++}\), \(\sigma _2, \sigma _3 \in \mathbb {R}_+\) and \(\varrho \in [-1, 1]\) with a random initial value \((\eta _0, \xi _0)\) independent of \((W_t, B_t, L_t)_{t\in \mathbb {R}_+}\) satisfying \(\mathbb {P}(\eta _0 \in \mathbb {R}_+) = 1\). Suppose that \((1 - \varrho ^2) \sigma _2^2 + \sigma _3^2 > 0\). Then for each \(T \in \mathbb {R}_{++}\), the random matrices \({\varvec{G}}_T^{(1)}\) and \({\varvec{G}}_T^{(2)}\) are invertible almost surely, and hence \({\widehat{{\varvec{\theta }}}}_T\) given in (3.4) exists almost surely. Moreover, \({\widehat{{\varvec{\theta }}}}_{T,n} {\mathop {\longrightarrow }\limits ^{\mathbb {P}}}{\widehat{{\varvec{\theta }}}}_T\) as \(n \rightarrow \infty \).

Proof

A proof of the first statement can be found in the extended arXiv version Bolyog and Pap (2017) of this paper. Next we are going to show \({\widehat{{\varvec{\theta }}}}_{T,n} {\mathop {\longrightarrow }\limits ^{\mathbb {P}}}{\widehat{{\varvec{\theta }}}}_T\) as \(n \rightarrow \infty \). The function \(g_n\) introduced in (3.2) admits an inverse \(g_n^{-1} : \mathbb {R}\times (-\infty , 1) \times \mathbb {R}^2 \times (-\infty , 1) \rightarrow \mathbb {R}^5\) satisfying

$$\begin{aligned} g_n^{-1}(c, d, \delta , \varepsilon , \zeta ) = (a, b, \alpha , \beta , \gamma ) \end{aligned}$$

with

$$\begin{aligned} b= & {} -n \log (1 - d), \qquad a = \frac{c}{\int _0^{\frac{1}{n}} \mathrm {e}^{-bw} \, \mathrm {d}w}, \qquad \gamma = -n \log (1 - \zeta ), \\ \beta= & {} \frac{\varepsilon }{\int _0^{\frac{1}{n}} \mathrm {e}^{(\gamma -b)w-\frac{\gamma }{n}} \, \mathrm {d}w}, \qquad \alpha = \frac{\delta +a\beta \int _0^{\frac{1}{n}} \mathrm {e}^{\gamma w-\frac{\gamma }{n}} \bigl (\int _0^w \mathrm {e}^{-b(w-v)} \, \mathrm {d}v\bigr ) \mathrm {d}w}{\int _0^{\frac{1}{n}} \mathrm {e}^{-\gamma w} \, \mathrm {d}w}. \end{aligned}$$
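These inverse formulas can be checked numerically by a round trip through the transformed parameters. The sketch below (function name illustrative) implements \(g_n^{-1}\) with the integrals over \([0, \frac{1}{n}]\) evaluated by the trapezoidal rule:

```python
import numpy as np

def g_n_inv(c, d, delta, epsilon, zeta, n, m=400):
    """Inverse map g_n^{-1} from the display above; the integrals over
    [0, 1/n] are evaluated by the trapezoidal rule."""
    def trap(y, x):  # composite trapezoidal rule
        return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)
    b = -n * np.log(1.0 - d)
    gamma = -n * np.log(1.0 - zeta)
    w = np.linspace(0.0, 1.0 / n, m)
    a = c / trap(np.exp(-b * w), w)
    beta = epsilon / trap(np.exp((gamma - b) * w - gamma / n), w)
    # double integral int_0^{1/n} e^{gamma*w - gamma/n} int_0^w e^{-b(w-v)} dv dw
    inner = np.array([trap(np.exp(-b * (wi - np.linspace(0.0, wi, m))),
                           np.linspace(0.0, wi, m)) for wi in w])
    dbl = trap(np.exp(gamma * w - gamma / n) * inner, w)
    alpha = (delta + a * beta * dbl) / trap(np.exp(-gamma * w), w)
    return a, b, alpha, beta, gamma
```

Applied to the transformed parameters of the case \(b = \gamma = 0\) (where \(c = \frac{a}{n}\), \(d = \zeta = 0\), \(\delta = \frac{\alpha }{n} - \frac{a\beta }{2n^2}\), \(\varepsilon = \frac{\beta }{n}\)), it recovers \((a, b, \alpha , \beta , \gamma )\).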

Convergence (3.4) yields \((\widehat{c}_{T,n}, \widehat{d}_{T,n}, \widehat{\delta }_{T,n}, \widehat{\varepsilon }_{T,n}, \widehat{\zeta }_{T,n}) {\mathop {\longrightarrow }\limits ^{\mathbb {P}}}{\varvec{0}}\) as \(n \rightarrow \infty \), hence \(\widehat{d}_{T,n} \in (-\infty , 1)\) and \(\widehat{\zeta }_{T,n} \in (-\infty , 1)\) with probability tending to one as \(n \rightarrow \infty \). Consequently, \(g_n^{-1}(\widehat{c}_{T,n}, \widehat{d}_{T,n}, \widehat{\delta }_{T,n}, \widehat{\varepsilon }_{T,n}, \widehat{\zeta }_{T,n}) ={\widehat{{\varvec{\theta }}}}_{T,n}\) with probability tending to one as \(n \rightarrow \infty \). We have

$$\begin{aligned} \widehat{b}_{T,n} = -n \log (1 - \widehat{d}_{T,n}) = n \widehat{d}_{T,n} h_1(\widehat{d}_{T,n}) \end{aligned}$$

with probability tending to one as \(n \rightarrow \infty \), where the continuous function \(h_1 : (-\infty , 1) \rightarrow \mathbb {R}\) is given by

$$\begin{aligned} h_1(x) := {\left\{ \begin{array}{ll} - \frac{1}{x} \log (1 - x) &{} \text {if }\ x \ne 0, \\ 1 &{} \text {if }\ x = 0. \end{array}\right. } \end{aligned}$$

By (3.4), we have \(n \widehat{d}_{T,n} {\mathop {\longrightarrow }\limits ^{\mathbb {P}}}\widehat{b}_T\) and \(\widehat{d}_{T,n} {\mathop {\longrightarrow }\limits ^{\mathbb {P}}}0\), thus we obtain \(h_1(\widehat{d}_{T,n}) {\mathop {\longrightarrow }\limits ^{\mathbb {P}}}h_1(0) = 1\), and hence \(\widehat{b}_{T,n} {\mathop {\longrightarrow }\limits ^{\mathbb {P}}}\widehat{b}_T\) as \(n \rightarrow \infty \).

Moreover,

$$\begin{aligned} \widehat{a}_{T,n} = \frac{\widehat{c}_{T,n}}{\int _0^{\frac{1}{n}} \mathrm {e}^{-\widehat{b}_{T,n}w} \, \mathrm {d}w} = \frac{n\widehat{c}_{T,n}}{n\int _0^{\frac{1}{n}} \mathrm {e}^{-\widehat{b}_{T,n}w} \, \mathrm {d}w} = \frac{n\widehat{c}_{T,n}}{\int _0^1 \exp \bigl \{-n^{-1}\widehat{b}_{T,n}v\bigr \} \, \mathrm {d}v} = \frac{n\widehat{c}_{T,n}}{h_2(n^{-1}\widehat{b}_{T,n})} \end{aligned}$$

with probability tending to one as \(n \rightarrow \infty \), where the continuous function \(h_2 : \mathbb {R}\rightarrow \mathbb {R}\) is given by

$$\begin{aligned} h_2(x) := \int _0^1 \mathrm {e}^{-xv} \, \mathrm {d}v = {\left\{ \begin{array}{ll} \frac{1-\mathrm {e}^{-x}}{x} &{} \text {if }\ x \ne 0, \\ 1 &{} \text {if }\ x = 0. \end{array}\right. } \end{aligned}$$

We have already shown \(\widehat{b}_{T,n} {\mathop {\longrightarrow }\limits ^{\mathbb {P}}}\widehat{b}_T\), yielding \(n^{-1} \widehat{b}_{T,n} {\mathop {\longrightarrow }\limits ^{\mathbb {P}}}0\), and hence \(h_2(n^{-1} \widehat{b}_{T,n}) {\mathop {\longrightarrow }\limits ^{\mathbb {P}}}h_2(0) = 1\) as \(n \rightarrow \infty \). By (3.4), we have \(n \widehat{c}_{T,n} {\mathop {\longrightarrow }\limits ^{\mathbb {P}}}\widehat{a}_T\), thus we obtain \(\widehat{a}_{T,n} {\mathop {\longrightarrow }\limits ^{\mathbb {P}}}\widehat{a}_T\) as \(n \rightarrow \infty \).
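The function \(h_2\) and its limit at zero can likewise be verified directly; in the sketch below (our own naming), the closed form is compared against a midpoint-rule evaluation of the defining integral:

```python
import math

def h2(x):
    """h2(x) = integral_0^1 exp(-x*v) dv = (1 - exp(-x)) / x, with h2(0) := 1."""
    return 1.0 if x == 0.0 else (1.0 - math.exp(-x)) / x

def h2_quadrature(x, m=100000):
    """Midpoint-rule approximation of the defining integral, for comparison."""
    return sum(math.exp(-x * (j + 0.5) / m) for j in range(m)) / m
```

For instance `h2(2.0)` and `h2_quadrature(2.0)` agree to many digits, and \(h_2(x) \rightarrow 1\) as \(x \rightarrow 0\), matching the continuity used in the proof.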

In a similar way,

$$\begin{aligned} \widehat{\gamma }_{T,n} = -n \log (1 - \widehat{\zeta }_{T,n}) = n \widehat{\zeta }_{T,n} h_1(\widehat{\zeta }_{T,n}) \end{aligned}$$

with probability tending to one as \(n \rightarrow \infty \). By (3.4), we have \(n \widehat{\zeta }_{T,n} {\mathop {\longrightarrow }\limits ^{\mathbb {P}}}\widehat{\gamma }_T\) and \(\widehat{\zeta }_{T,n} {\mathop {\longrightarrow }\limits ^{\mathbb {P}}}0\), thus we obtain \(h_1(\widehat{\zeta }_{T,n}) {\mathop {\longrightarrow }\limits ^{\mathbb {P}}}h_1(0) = 1\), and hence \(\widehat{\gamma }_{T,n} {\mathop {\longrightarrow }\limits ^{\mathbb {P}}}\widehat{\gamma }_T\) as \(n \rightarrow \infty \).

Further,

$$\begin{aligned} \widehat{\beta }_{T,n} = \frac{\widehat{\varepsilon }_{T,n}}{\int _0^{\frac{1}{n}} \mathrm {e}^{(\widehat{\gamma }_{T,n}-\widehat{b}_{T,n})w-\frac{\widehat{\gamma }_{T,n}}{n}} \, \mathrm {d}w} =\frac{n\widehat{\varepsilon }_{T,n}\mathrm {e}^{\frac{\widehat{\gamma }_{T,n}}{n}}}{h_2(n^{-1}(\widehat{b}_{T,n}-\widehat{\gamma }_{T,n}))} \end{aligned}$$

with probability tending to one as \(n \rightarrow \infty \). We have already shown \(\widehat{b}_{T,n} {\mathop {\longrightarrow }\limits ^{\mathbb {P}}}\widehat{b}_T\) and \(\widehat{\gamma }_{T,n} {\mathop {\longrightarrow }\limits ^{\mathbb {P}}}\widehat{\gamma }_T\), yielding \(n^{-1} \widehat{b}_{T,n} {\mathop {\longrightarrow }\limits ^{\mathbb {P}}}0\) and \(n^{-1} \widehat{\gamma }_{T,n} {\mathop {\longrightarrow }\limits ^{\mathbb {P}}}0\), and hence \(\mathrm {e}^{\frac{\widehat{\gamma }_{T,n}}{n}} {\mathop {\longrightarrow }\limits ^{\mathbb {P}}}1\) and \(h_2(n^{-1} (\widehat{b}_{T,n} - \widehat{\gamma }_{T,n})) {\mathop {\longrightarrow }\limits ^{\mathbb {P}}}h_2(0) = 1\) as \(n \rightarrow \infty \). By (3.4), we have \(n \widehat{\varepsilon }_{T,n} {\mathop {\longrightarrow }\limits ^{\mathbb {P}}}\widehat{\beta }_T\), thus we obtain \(\widehat{\beta }_{T,n} {\mathop {\longrightarrow }\limits ^{\mathbb {P}}}\widehat{\beta }_T\) as \(n \rightarrow \infty \).

Finally,

$$\begin{aligned} \widehat{\alpha }_{T,n} = \frac{\widehat{\delta }_{T,n} +\widehat{a}_{T,n}\widehat{\beta }_{T,n} \int _0^{\frac{1}{n}} \mathrm {e}^{\widehat{\gamma }_{T,n} w-\frac{\widehat{\gamma }_{T,n}}{n}} \bigl (\int _0^w \mathrm {e}^{-\widehat{b}_{T,n}(w-v)} \, \mathrm {d}v\bigr ) \mathrm {d}w}{\int _0^{\frac{1}{n}} \mathrm {e}^{-\widehat{\gamma }_{T,n}w} \, \mathrm {d}w} = \frac{n\widehat{\delta }_{T,n}+\widehat{a}_{T,n}\widehat{\beta }_{T,n}\mathrm {e}^{-\frac{\widehat{\gamma }_{T,n}}{n}}I_{T,n}}{h_2(n^{-1}\widehat{\gamma }_{T,n})} \end{aligned}$$

with probability tending to one as \(n \rightarrow \infty \), where

$$\begin{aligned} I_{T,n} = n \int _0^{\frac{1}{n}} \mathrm {e}^{\widehat{\gamma }_{T,n}w} \bigl (\int _0^w \mathrm {e}^{-\widehat{b}_{T,n}(w-v)} \, \mathrm {d}v\bigr ) \mathrm {d}w \leqslant \frac{1}{n} \mathrm {e}^{\frac{|\widehat{\gamma }_{T,n}|}{n}} \mathrm {e}^{\frac{|\widehat{b}_{T,n}|}{n}}. \end{aligned}$$

We have already shown \(\widehat{a}_{T,n} {\mathop {\longrightarrow }\limits ^{\mathbb {P}}}\widehat{a}_T\), \(\widehat{b}_{T,n} {\mathop {\longrightarrow }\limits ^{\mathbb {P}}}\widehat{b}_T\), \(\widehat{\beta }_{T,n} {\mathop {\longrightarrow }\limits ^{\mathbb {P}}}\widehat{\beta }_T\) and \(\widehat{\gamma }_{T,n} {\mathop {\longrightarrow }\limits ^{\mathbb {P}}}\widehat{\gamma }_T\), yielding \(n^{-1} \widehat{b}_{T,n} {\mathop {\longrightarrow }\limits ^{\mathbb {P}}}0\) and \(n^{-1} \widehat{\gamma }_{T,n} {\mathop {\longrightarrow }\limits ^{\mathbb {P}}}0\), and hence \(h_2(n^{-1} \widehat{\gamma }_{T,n}) {\mathop {\longrightarrow }\limits ^{\mathbb {P}}}h_2(0) = 1\), \(\mathrm {e}^{\frac{\widehat{\gamma }_{T,n}}{n}} {\mathop {\longrightarrow }\limits ^{\mathbb {P}}}1\), \(\mathrm {e}^{\frac{|\widehat{\gamma }_{T,n}|}{n}} {\mathop {\longrightarrow }\limits ^{\mathbb {P}}}1\) and \(\mathrm {e}^{\frac{|\widehat{b}_{T,n}|}{n}} {\mathop {\longrightarrow }\limits ^{\mathbb {P}}}1\), implying \(I_{T,n} {\mathop {\longrightarrow }\limits ^{\mathbb {P}}}0\) as \(n \rightarrow \infty \). By (3.4), we have \(n \widehat{\delta }_{T,n} {\mathop {\longrightarrow }\limits ^{\mathbb {P}}}\widehat{\alpha }_T\), thus we obtain \(\widehat{\alpha }_{T,n} {\mathop {\longrightarrow }\limits ^{\mathbb {P}}}\widehat{\alpha }_T\) as \(n \rightarrow \infty \). \(\square \)

Using the SDE (1.1) and Corollary 3.2.20 in Karatzas and Shreve (1991), one can check that

$$\begin{aligned} {\widehat{{\varvec{\theta }}}}_T - {\varvec{\theta }}= \begin{bmatrix} \widehat{a}_T - a \\ \widehat{b}_T -b \\ \widehat{\alpha }_T - \alpha \\ \widehat{\beta }_T - \beta \\ \widehat{\gamma }_T - \gamma \end{bmatrix} = \begin{bmatrix} ({\varvec{G}}_T^{(1)})^{-1} {\varvec{h}}_T^{(1)} \\ ({\varvec{G}}_T^{(2)})^{-1} {\varvec{h}}_T^{(2)} \end{bmatrix} = {\varvec{G}}_T^{-1} {\varvec{h}}_T \end{aligned}$$
(3.5)

on the event where the random matrices \({\varvec{G}}_T^{(1)}\) and \({\varvec{G}}_T^{(2)}\) are invertible, where

$$\begin{aligned} {\varvec{G}}_T := \begin{bmatrix} {\varvec{G}}_T^{(1)}&{\varvec{0}}\\ {\varvec{0}}&{\varvec{G}}_T^{(2)} \end{bmatrix}, \qquad {\varvec{h}}_T:= \begin{bmatrix} {\varvec{h}}_T^{(1)} \\ {\varvec{h}}_T^{(2)} \end{bmatrix}, \end{aligned}$$

with

$$\begin{aligned} {\varvec{h}}_T^{(1)} := \sigma _1 \int _0^T \sqrt{Y_s} \begin{bmatrix} 1 \\ - Y_s \end{bmatrix} \mathrm {d}W_s, \qquad {\varvec{h}}_T^{(2)} := \int _0^T \begin{bmatrix} 1 \\ - Y_s \\ - X_s \end{bmatrix} (\sigma _2 \sqrt{Y_s} \, \mathrm {d}\widetilde{W}_s + \sigma _3 \, \mathrm {d}L_s), \end{aligned}$$

where

$$\begin{aligned} \widetilde{W}_s := \varrho W_s + \sqrt{1-\varrho ^2} B_s, \qquad s \in \mathbb {R}_+, \end{aligned}$$
(3.6)

is a standard Wiener process, independent of L. For details see the arXiv version of this paper, Bolyog and Pap (2017).
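The SDE (1.1) is straightforward to simulate approximately. The sketch below is an Euler–Maruyama discretization under our own naming and parameter choices, with the usual ad-hoc truncation of \(Y\) at zero so the square root stays defined; it is an illustration only, not a scheme taken from the paper:

```python
import math, random

def euler_maruyama(a, b, alpha, beta, gamma, s1, s2, s3, rho,
                   y0, x0, T, n, seed=0):
    """Approximate a path of (Y, X) from the SDE (1.1) on [0, T] with n steps.
    Y is truncated at 0 before the square root is taken (an ad-hoc fix for
    the discretization; the exact solution is nonnegative automatically)."""
    rng = random.Random(seed)
    dt = T / n
    sdt = math.sqrt(dt)
    y, x = y0, x0
    ys, xs = [y], [x]
    for _ in range(n):
        dW, dB, dL = (rng.gauss(0.0, sdt) for _ in range(3))
        sy = math.sqrt(y)                 # y >= 0 is maintained below
        y_new = y + (a - b * y) * dt + s1 * sy * dW
        x += ((alpha - beta * y - gamma * x) * dt
              + s2 * sy * (rho * dW + math.sqrt(1.0 - rho * rho) * dB)
              + s3 * dL)
        y = max(y_new, 0.0)
        ys.append(y)
        xs.append(x)
    return ys, xs
```

In the subcritical case \(b > 0\), the time average of \(Y\) over a long horizon should be close to \(a/b\) (compare Sect. 4).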

4 Consistency of CLSE

First we consider subcritical two-factor affine diffusion models, i.e., the case \(b \in \mathbb {R}_{++}\).

Theorem 4.1

Let us consider the two-factor affine diffusion model (1.1) with \(a, b \in \mathbb {R}_{++}\), \(\alpha , \beta \in \mathbb {R}\), \(\gamma \in \mathbb {R}_{++}\), \(\sigma _1 \in \mathbb {R}_{++}\), \(\sigma _2, \sigma _3 \in \mathbb {R}_+\) and \(\varrho \in [-1, 1]\) with a random initial value \((\eta _0, \zeta _0)\) independent of \((W_t, B_t, L_t)_{t\in \mathbb {R}_+}\) satisfying \(\mathbb {P}(\eta _0 \in \mathbb {R}_+) = 1\). Suppose that \((1 - \varrho ^2) \sigma _2^2 + \sigma _3^2 > 0\). Then the CLSE of \({\varvec{\theta }}= (a, b, \alpha , \beta , \gamma )^\top \) is strongly consistent, i.e., \({\widehat{{\varvec{\theta }}}}_T = \bigl (\widehat{a}_T, \widehat{b}_T, \widehat{\alpha }_T, \widehat{\beta }_T, \widehat{\gamma }_T\bigr )^\top {\mathop {\longrightarrow }\limits ^{{\mathrm {a.s.}}}}{\varvec{\theta }}= (a, b, \alpha , \beta , \gamma )^\top \) as \(T \rightarrow \infty \).

Proof

By (3.5), we have

$$\begin{aligned} {\widehat{{\varvec{\theta }}}}_T - {\varvec{\theta }}= (T^{-1} {\varvec{G}}_T)^{-1} (T^{-1} {\varvec{h}}_T) \end{aligned}$$
(4.1)

on the event where the random matrix \({\varvec{G}}_T\) is invertible, which has probability 1, see Lemma 3.3.

By Theorem A.2, we obtain

$$\begin{aligned} T^{-1} {\varvec{G}}_T {\mathop {\longrightarrow }\limits ^{{\mathrm {a.s.}}}}\mathbb {E}({\varvec{G}}_\infty ) \qquad \text {as }\ T \rightarrow \infty , \end{aligned}$$
(4.2)

where

$$\begin{aligned} {\varvec{G}}_\infty := \begin{bmatrix} {\varvec{G}}_\infty ^{(1)}&{\varvec{0}}\\ {\varvec{0}}&{\varvec{G}}_\infty ^{(2)} \end{bmatrix} \end{aligned}$$
(4.3)

with

$$\begin{aligned} {\varvec{G}}_\infty ^{(1)} := \begin{bmatrix} 1&\quad -Y_\infty \\ -Y_\infty&\quad Y_\infty ^2 \end{bmatrix}, \qquad {\varvec{G}}_\infty ^{(2)} := \begin{bmatrix} 1&\quad -Y_\infty&\quad -X_\infty \\ -Y_\infty&\quad Y_\infty ^2&\quad Y_\infty X_\infty \\ -X_\infty&\quad Y_\infty X_\infty&\quad X_\infty ^2 \end{bmatrix}, \end{aligned}$$

where the random vector \((Y_\infty , X_\infty )\) is given by Theorem A.1, since, by Theorem B.2, the entries of \(\mathbb {E}({\varvec{G}}_\infty )\) exist and are finite.
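It is classical that the stationary distribution of the CIR factor is a Gamma law with shape \(2a/\sigma _1^2\) and scale \(\sigma _1^2/(2b)\); this fact is not restated in the present excerpt, so the following sketch should be read under that assumption. It checks by simulation the first two moments of \(Y_\infty \) entering \(\mathbb {E}({\varvec{G}}_\infty ^{(1)})\); the closed-form second moment is our own computation from the Gamma moments:

```python
import random

a, b, s1 = 1.0, 1.0, 0.5                                 # illustrative values
shape, scale = 2.0 * a / s1**2, s1**2 / (2.0 * b)        # assumed Gamma law of Y_infinity

rng = random.Random(0)
sample = [rng.gammavariate(shape, scale) for _ in range(200000)]
m1 = sum(sample) / len(sample)                           # should be near a / b
m2 = sum(y * y for y in sample) / len(sample)            # near (a/b) * (a + s1**2/2) / b
```

The simulated moments match \(\mathbb {E}(Y_\infty ) = a/b\) and \(\mathbb {E}(Y_\infty ^2) = (a/b)(a + \sigma _1^2/2)/b\) within Monte Carlo error.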

The matrix \(\mathbb {E}({\varvec{G}}_\infty ^{(1)})\) is strictly positive definite, since for all \({\varvec{x}}\in \mathbb {R}^2 \setminus \{{\varvec{0}}\}\), we have \({\varvec{x}}^\top \mathbb {E}({\varvec{G}}_\infty ^{(1)}) {\varvec{x}}> 0\). Indeed, for all \({\varvec{x}}= (x_1, x_2)^\top \in \mathbb {R}^2 \setminus \{{\varvec{0}}\}\),

$$\begin{aligned} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}^\top \mathbb {E}({\varvec{G}}_\infty ^{(1)}) \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \mathbb {E}\bigl [(x_1 - x_2 Y_\infty )^2\bigr ] > 0, \end{aligned}$$

since, by Theorem A.2, the distribution of \(Y_\infty \) is absolutely continuous, hence \(x_1 - x_2 Y_\infty \ne 0\) with probability 1. In a similar way, the matrix \(\mathbb {E}({\varvec{G}}_\infty ^{(2)})\) is strictly positive definite, since for all \({\varvec{x}}\in \mathbb {R}^3 \setminus \{{\varvec{0}}\}\), we have \({\varvec{x}}^\top \mathbb {E}({\varvec{G}}_\infty ^{(2)}) {\varvec{x}}> 0\). Indeed, for all \({\varvec{x}}= (x_1, x_2, x_3)^\top \in \mathbb {R}^3 \setminus \{{\varvec{0}}\}\),

$$\begin{aligned} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}^\top \mathbb {E}({\varvec{G}}_\infty ^{(2)}) \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \mathbb {E}\bigl [(x_1 - x_2 Y_\infty - x_3 X_\infty )^2\bigr ] > 0, \end{aligned}$$

since, by Theorem A.2, the distribution of \((Y_\infty , X_\infty )\) is absolutely continuous, hence \(x_1 - x_2 Y_\infty - x_3 X_\infty \ne 0\) with probability 1. Thus the matrices \(\mathbb {E}({\varvec{G}}_\infty ^{(1)})\) and \(\mathbb {E}({\varvec{G}}_\infty ^{(2)})\) are invertible, whence we conclude

$$\begin{aligned} (T^{-1} {\varvec{G}}_T)^{-1} {\mathop {\longrightarrow }\limits ^{{\mathrm {a.s.}}}}\begin{bmatrix} [\mathbb {E}({\varvec{G}}_\infty ^{(1)})]^{-1}&{\varvec{0}}\\ {\varvec{0}}&[\mathbb {E}({\varvec{G}}_\infty ^{(2)})]^{-1} \end{bmatrix} = [\mathbb {E}({\varvec{G}}_\infty )]^{-1} \qquad \text {as }\ T \rightarrow \infty . \end{aligned}$$
(4.4)

The aim of the next discussion is to show convergence

$$\begin{aligned} T^{-1} {\varvec{h}}_T {\mathop {\longrightarrow }\limits ^{{\mathrm {a.s.}}}}{\varvec{0}}\qquad \text {as }\ T \rightarrow \infty . \end{aligned}$$
(4.5)

We have

$$\begin{aligned} \frac{1}{T} \int _0^T \sqrt{Y_s} \, \mathrm {d}W_s = \frac{1}{T} \int _0^T Y_s \, \mathrm {d}s \cdot \frac{\int _0^T \sqrt{Y_s} \, \mathrm {d}W_s}{\int _0^T Y_s \, \mathrm {d}s} {\mathop {\longrightarrow }\limits ^{{\mathrm {a.s.}}}}0 \qquad \text {as }\ T \rightarrow \infty . \end{aligned}$$

Indeed, we have already proved

$$\begin{aligned} \frac{1}{T} \int _0^T Y_s \, \mathrm {d}s {\mathop {\longrightarrow }\limits ^{{\mathrm {a.s.}}}}\mathbb {E}(Y_\infty ) = \frac{a}{b} \in \mathbb {R}_{++} \qquad \text {as }\ T \rightarrow \infty , \end{aligned}$$

and the strong law of large numbers for continuous local martingales (see, e.g., Theorem C.1) implies

$$\begin{aligned} \frac{\int _0^T \sqrt{Y_s} \, \mathrm {d}W_s}{\int _0^T Y_s \, \mathrm {d}s} {\mathop {\longrightarrow }\limits ^{{\mathrm {a.s.}}}}0 \qquad \text {as }\ T \rightarrow \infty , \end{aligned}$$

since we have

$$\begin{aligned} \int _0^T Y_s \, \mathrm {d}s = T \cdot \frac{1}{T} \int _0^T Y_s \, \mathrm {d}s {\mathop {\longrightarrow }\limits ^{{\mathrm {a.s.}}}}\infty \qquad \text {as }\ T \rightarrow \infty . \end{aligned}$$
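The last three displays combine the ergodic average of the integral functional with the strong law for the martingale part. A minimal Euler–Maruyama sketch (with illustrative parameter values of our own choosing) shows the ratio indeed becoming small for large \(T\):

```python
import math, random

def martingale_ratio(a, b, s1, y0, T, n, seed=0):
    """Returns int_0^T sqrt(Y) dW / int_0^T Y ds along one simulated
    Euler-Maruyama path of the CIR factor (illustration only)."""
    rng = random.Random(seed)
    dt, sdt = T / n, math.sqrt(T / n)
    y, mart, clock = y0, 0.0, 0.0
    for _ in range(n):
        dW = rng.gauss(0.0, sdt)
        mart += math.sqrt(y) * dW      # increment of int sqrt(Y) dW
        clock += y * dt                # increment of int Y ds
        y = max(y + (a - b * y) * dt + s1 * math.sqrt(y) * dW, 0.0)
    return mart / clock
```

With \(a = b = 1\) the clock \(\int _0^T Y_s \, \mathrm {d}s\) grows like \(T\), so the ratio has standard deviation of order \(T^{-1/2}\).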

Further,

$$\begin{aligned} \frac{1}{T} \int _0^T (\sigma _2 \sqrt{Y_s} \, \mathrm {d}\widetilde{W}_s + \sigma _3 \, \mathrm {d}L_s) = \frac{1}{T} \int _0^T (\sigma _2^2 Y_s + \sigma _3^2) \mathrm {d}s \cdot \frac{\int _0^T (\sigma _2 \sqrt{Y_s} \, \mathrm {d}\widetilde{W}_s + \sigma _3 \, \mathrm {d}L_s)}{\int _0^T (\sigma _2^2 Y_s + \sigma _3^2) \, \mathrm {d}s} {\mathop {\longrightarrow }\limits ^{{\mathrm {a.s.}}}}0 \end{aligned}$$

as \(T \rightarrow \infty \). Indeed, we have already proved

$$\begin{aligned} \frac{1}{T} \int _0^T (\sigma _2^2 Y_s + \sigma _3^2) \, \mathrm {d}s {\mathop {\longrightarrow }\limits ^{{\mathrm {a.s.}}}}\mathbb {E}(\sigma _2^2 Y_\infty + \sigma _3^2) = \sigma _2^2 \frac{a}{b} + \sigma _3^2 \in \mathbb {R}_{++} \qquad \text {as }\ T \rightarrow \infty , \end{aligned}$$

and the strong law of large numbers for continuous local martingales (see, e.g., Theorem C.1) implies

$$\begin{aligned} \frac{\int _0^T (\sigma _2 \sqrt{Y_s} \, \mathrm {d}\widetilde{W}_s + \sigma _3 \, \mathrm {d}L_s)}{\int _0^T (\sigma _2^2 Y_s + \sigma _3^2) \, \mathrm {d}s} {\mathop {\longrightarrow }\limits ^{{\mathrm {a.s.}}}}0 \qquad \text {as }\ T \rightarrow \infty , \end{aligned}$$

since we have

$$\begin{aligned} \int _0^T (\sigma _2^2 Y_s + \sigma _3^2) \, \mathrm {d}s = T \cdot \frac{1}{T} \int _0^T (\sigma _2^2 Y_s + \sigma _3^2) \, \mathrm {d}s {\mathop {\longrightarrow }\limits ^{{\mathrm {a.s.}}}}\infty \qquad \text {as }\ T \rightarrow \infty . \end{aligned}$$

One can check

$$\begin{aligned} \frac{1}{T} \int _0^T Y_s \sqrt{Y_s} \, \mathrm {d}W_s {\mathop {\longrightarrow }\limits ^{{\mathrm {a.s.}}}}0, \qquad \frac{1}{T} \int _0^T Y_s (\sigma _2 \sqrt{Y_s} \, \mathrm {d}\widetilde{W}_s + \sigma _3 \, \mathrm {d}L_s) {\mathop {\longrightarrow }\limits ^{{\mathrm {a.s.}}}}0, \\ \frac{1}{T} \int _0^T X_s (\sigma _2 \sqrt{Y_s} \, \mathrm {d}\widetilde{W}_s + \sigma _3 \, \mathrm {d}L_s) {\mathop {\longrightarrow }\limits ^{{\mathrm {a.s.}}}}0 \end{aligned}$$

as \(T \rightarrow \infty \) in the same way, since

$$\begin{aligned}&\frac{1}{T} \int _0^T Y_s^3 \, \mathrm {d}s {\mathop {\longrightarrow }\limits ^{{\mathrm {a.s.}}}}\mathbb {E}(Y_\infty ^3) \in \mathbb {R}_{++}, \\&\frac{1}{T} \int _0^T Y^2_s (\sigma _2^2 Y_s + \sigma _3^2) \, \mathrm {d}s {\mathop {\longrightarrow }\limits ^{{\mathrm {a.s.}}}}\mathbb {E}\bigl [Y^2_\infty (\sigma _2^2 Y_\infty + \sigma _3^2)\bigr ] \in \mathbb {R}_{++}, \\&\frac{1}{T} \int _0^T X^2_s (\sigma _2^2 Y_s + \sigma _3^2) \, \mathrm {d}s {\mathop {\longrightarrow }\limits ^{{\mathrm {a.s.}}}}\mathbb {E}\bigl [X^2_\infty (\sigma _2^2 Y_\infty + \sigma _3^2)\bigr ] \in \mathbb {R}_{++} \end{aligned}$$

as \(T \rightarrow \infty \). Consequently, we conclude (4.5). Finally, by (4.4) and (4.5), we obtain the statement. \(\square \)
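To see Theorem 4.1 at work on the \((a, b)\) block, one can simulate a subcritical CIR path and evaluate the CLSE with the integrals replaced by Riemann and Itô sums; the formula for \(\widehat{b}_T\) below is the one appearing later in the proof of Theorem 4.5, and \(\widehat{a}_T\) is recovered from the first normal equation. Discretization and parameter values are our own choices:

```python
import math, random

def cir_clse(a, b, s1, y0, T, n, seed=0):
    """Simulate a CIR path by Euler-Maruyama and return the continuous-time
    CLSE (a_hat, b_hat) of (a, b), with integrals replaced by discrete sums."""
    rng = random.Random(seed)
    dt, sdt = T / n, math.sqrt(T / n)
    ys = [y0]
    for _ in range(n):
        y = ys[-1]
        ys.append(max(y + (a - b * y) * dt + s1 * math.sqrt(y) * rng.gauss(0.0, sdt), 0.0))
    iy = sum(ys[k] * dt for k in range(n))                      # int_0^T Y_s ds
    iy2 = sum(ys[k] ** 2 * dt for k in range(n))                # int_0^T Y_s^2 ds
    iydy = sum(ys[k] * (ys[k + 1] - ys[k]) for k in range(n))   # int_0^T Y_s dY_s
    b_hat = ((ys[-1] - ys[0]) * iy - T * iydy) / (T * iy2 - iy ** 2)
    a_hat = (ys[-1] - ys[0] + b_hat * iy) / T                   # first normal equation
    return a_hat, b_hat
```

With a long horizon the estimates land near the true \((a, b)\), in line with the strong consistency just proved.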

In order to handle supercritical two-factor affine diffusion models when \(b \in \mathbb {R}_{--}\), we need the following integral version of the Toeplitz Lemma, due to Dietz and Kutoyants (1997).

Lemma 4.2

Let \(\{\varphi _T : T \in \mathbb {R}_+\}\) be a family of probability measures on \(\mathbb {R}_+\) such that \(\varphi _T([0,T]) = 1\) for all \(T \in \mathbb {R}_+\), and \(\lim _{T\rightarrow \infty } \varphi _T([0,K]) = 0\) for all \(K \in \mathbb {R}_{++}\). Then for every bounded and measurable function \(f : \mathbb {R}_+ \rightarrow \mathbb {R}\) for which the limit \(f(\infty ) := \lim _{t\rightarrow \infty } f(t)\) exists, we have

$$\begin{aligned} \lim _{T\rightarrow \infty } \int _0^\infty f(t) \, \varphi _T(\mathrm {d}t) = f(\infty ). \end{aligned}$$

As a special case, we have the following integral version of the Kronecker Lemma, see Küchler and Sørensen (1997, Lemma B.3.2).

Lemma 4.3

Let \(a : \mathbb {R}_+ \rightarrow \mathbb {R}_+\) be a measurable function. Put \(b(T) := \int _0^T a(t) \, \mathrm {d}t\), \(T \in \mathbb {R}_+\). Suppose that \(\lim _{T\rightarrow \infty } b(T) = \infty \). Then for every bounded and measurable function \(f : \mathbb {R}_+ \rightarrow \mathbb {R}\) for which the limit \(f(\infty ) := \lim _{t\rightarrow \infty } f(t)\) exists, we have

$$\begin{aligned} \lim _{T\rightarrow \infty } \frac{1}{b(T)} \int _0^T a(t) f(t) \, \mathrm {d}t = f(\infty ). \end{aligned}$$
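Lemma 4.3 is easy to illustrate numerically. In the sketch below (our own example functions), \(a(t) = \mathrm {e}^t\), so \(b(T) \rightarrow \infty \), and \(f(t) = 2 + \mathrm {e}^{-t}\) has \(f(\infty ) = 2\):

```python
import math

def weighted_average(a, f, T, m=20000):
    """(1/b(T)) * int_0^T a(t) f(t) dt with b(T) = int_0^T a(t) dt,
    both integrals computed by the midpoint rule (illustration only)."""
    dt = T / m
    ts = [(j + 0.5) * dt for j in range(m)]
    num = sum(a(t) * f(t) for t in ts) * dt
    den = sum(a(t) for t in ts) * dt
    return num / den

# The exponentially growing weight concentrates mass near T, so the
# weighted average approaches f(infinity) = 2 already for moderate T.
val = weighted_average(math.exp, lambda t: 2.0 + math.exp(-t), 30.0)
```

In closed form the weighted average equals \(2 + T/(\mathrm {e}^T - 1)\), which is within \(10^{-11}\) of \(2\) at \(T = 30\).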

Next we present an auxiliary lemma in the supercritical case on the asymptotic behavior of \(Y_t\) as \(t \rightarrow \infty \).

Lemma 4.4

Let us consider the two-factor affine diffusion model (1.1) with \(a \in \mathbb {R}_+\), \(b \in \mathbb {R}_{--}\), \(\alpha , \beta , \gamma \in \mathbb {R}\), \(\sigma _1, \sigma _2, \sigma _3 \in \mathbb {R}_+\) and \(\varrho \in [-1, 1]\) with a random initial value \((\eta _0, \zeta _0)\) independent of \((W_t, B_t, L_t)_{t\in \mathbb {R}_+}\) satisfying \(\mathbb {P}(\eta _0 \in \mathbb {R}_+) = 1\). Then there exists a random variable \(V_Y\) such that

$$\begin{aligned} \mathrm {e}^{bt} Y_t {\mathop {\longrightarrow }\limits ^{{\mathrm {a.s.}}}}V_Y \qquad \text {as }\ t \rightarrow \infty \end{aligned}$$
(4.6)

with \(\mathbb {P}(V_Y \ne 0) = 1\), and, for each \(k \in \mathbb {N}\),

$$\begin{aligned} \mathrm {e}^{kbt} \int _0^t Y_u^k \, \mathrm {d}u {\mathop {\longrightarrow }\limits ^{{\mathrm {a.s.}}}}-\frac{V_Y^k}{kb} \qquad \text {as }\ t \rightarrow \infty . \end{aligned}$$
(4.7)

Proof

By (2.1),

$$\begin{aligned} \mathbb {E}( Y_t \,|\,\mathcal {F}_s ) = \mathbb {E}( Y_t \,|\,Y_s ) = \mathrm {e}^{-b(t-s)} Y_s + a \int _s^t \mathrm {e}^{-b(t-u)} \, \mathrm {d}u \end{aligned}$$

for all \(s, t \in \mathbb {R}_+\) with \(0 \leqslant s \leqslant t\). Thus

$$\begin{aligned} \mathbb {E}( \mathrm {e}^{bt} Y_t \,|\,\mathcal {F}^Y_s ) = \mathrm {e}^{bs} Y_s + a \int _s^t \mathrm {e}^{bu} \, \mathrm {d}u \geqslant \mathrm {e}^{bs} Y_s \end{aligned}$$

for all \(s, t \in \mathbb {R}_+\) with \(0 \leqslant s \leqslant t\), consequently, the process \((\mathrm {e}^{bt} Y_t)_{t\in \mathbb {R}_+}\) is a non-negative submartingale with respect to the filtration \((\mathcal {F}^Y_t)_{t\in \mathbb {R}_+}\). Moreover, \(b \in \mathbb {R}_{--}\) implies

$$\begin{aligned} \mathbb {E}(\mathrm {e}^{bt} Y_t) = y_0 + a \int _0^t \mathrm {e}^{bu} \, \mathrm {d}u \leqslant y_0 + a \int _0^\infty \mathrm {e}^{bu} \, \mathrm {d}u = y_0 - \frac{a}{b} < \infty , \qquad t \in \mathbb {R}_+, \end{aligned}$$

hence, by the submartingale convergence theorem, there exists a non-negative random variable \(V_Y\) such that (4.6) holds.

The distribution of \(V_Y\) coincides with the distribution of \(\widetilde{\mathcal {Y}}_{-1/b}\), where \((\widetilde{\mathcal {Y}}_t)_{t\in \mathbb {R}_+}\) is a CIR process given by the SDE

$$\begin{aligned} \mathrm {d}\widetilde{\mathcal {Y}}_t = a \mathrm {d}t + \sigma _1 \sqrt{\widetilde{\mathcal {Y}}_t} \, \mathrm {d}\mathcal {W}_t, \qquad t \in \mathbb {R}_+, \end{aligned}$$

with initial value \(\widetilde{\mathcal {Y}}_0 = y_0\), where \((\mathcal {W}_t)_{t\in \mathbb {R}_+}\) is a standard Wiener process, see Ben Alaya and Kebaier (2012, Proposition 3). Consequently, \(\mathbb {P}(V_Y \in \mathbb {R}_{++}) = 1\), since \(\widetilde{\mathcal {Y}}_t\), \(t \in \mathbb {R}_{++}\), are absolutely continuous random variables.

If \(\omega \in \varOmega \) such that \(\mathbb {R}_+ \ni t \mapsto Y_t(\omega )\) is continuous and \(\mathrm {e}^{bt} Y_t(\omega ) \rightarrow V_Y(\omega )\) as \(t \rightarrow \infty \), then, by the integral Kronecker Lemma 4.3 with \(f(t) = \mathrm {e}^{kbt} Y_t(\omega )^k\) and \(a(t) = \mathrm {e}^{-kbt}\), \(t \in \mathbb {R}_+\), we have

$$\begin{aligned} \frac{1}{\int _0^t \mathrm {e}^{-kbu} \, \mathrm {d}u} \int _0^t \mathrm {e}^{-kbu} (\mathrm {e}^{kbu} Y_u(\omega )^k) \, \mathrm {d}u \rightarrow V_Y(\omega )^k \qquad \text {as }\ t \rightarrow \infty . \end{aligned}$$

Here \(\int _0^t \mathrm {e}^{-kbu} \, \mathrm {d}u = - \frac{\mathrm {e}^{-kbt} - 1}{kb}\), \(t \in \mathbb {R}_+\), thus we conclude (4.7). \(\square \)
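Convergence (4.6) can be visualized on a simulated path: with \(b < 0\) the rescaled process \(\mathrm {e}^{bt} Y_t\) settles near a positive random value. The Euler–Maruyama sketch below uses illustrative parameters of our own choosing:

```python
import math, random

def supercritical_path(a, b, s1, y0, T, n, seed=0):
    """Euler-Maruyama path of the CIR factor with b < 0, returning the
    rescaled values exp(b*t) * Y_t along the time grid."""
    rng = random.Random(seed)
    dt, sdt = T / n, math.sqrt(T / n)
    y, out = y0, [y0]
    for k in range(1, n + 1):
        y = max(y + (a - b * y) * dt + s1 * math.sqrt(y) * rng.gauss(0.0, sdt), 0.0)
        out.append(math.exp(b * k * dt) * y)
    return out

# with b = -0.5 the raw path Y_t explodes like exp(0.5 t), while the
# rescaled path exp(b*t) * Y_t stabilizes near a positive random limit V_Y
vals = supercritical_path(1.0, -0.5, 0.3, 1.0, 20.0, 20000, seed=3)
```

The tail of `vals` is nearly constant, mirroring \(\mathrm {e}^{bt} Y_t {\mathop {\longrightarrow }\limits ^{{\mathrm {a.s.}}}}V_Y\) with \(\mathbb {P}(V_Y > 0) = 1\).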

The next theorem states strong consistency of the CLSE of b in the supercritical case.

Theorem 4.5

Let us consider the two-factor affine diffusion model (1.1) with \(a \in \mathbb {R}_+\), \(b \in \mathbb {R}_{--}\), \(\alpha , \beta , \gamma \in \mathbb {R}\), \(\sigma _1 \in \mathbb {R}_{++}\), \(\sigma _2, \sigma _3 \in \mathbb {R}_+\) and \(\varrho \in [-1, 1]\) with a random initial value \((\eta _0, \zeta _0)\) independent of \((W_t, B_t, L_t)_{t\in \mathbb {R}_+}\) satisfying \(\mathbb {P}(\eta _0 \in \mathbb {R}_+) = 1\). Then the CLSE of b is strongly consistent, i.e., \(\widehat{b}_T {\mathop {\longrightarrow }\limits ^{{\mathrm {a.s.}}}}b\) as \(T \rightarrow \infty \).

Proof

By Lemma 3.3, there exists a unique CLSE \(\widehat{b}_T\) of b for all \(T \in \mathbb {R}_{++}\), which has the form given in (3.4). By Itô’s formula,

$$\begin{aligned} \int _0^T Y_s \, \mathrm {d}Y_s = \frac{1}{2} ( Y_T^2 - Y_0^2) - \frac{1}{2} \sigma _1^2 \int _0^T Y_s \, \mathrm {d}s, \qquad T \in \mathbb {R}_+, \end{aligned}$$

hence, by (4.6) and (4.7), we have

$$\begin{aligned} \widehat{b}_T&= \frac{(Y_T-Y_0)\int _0^T Y_s\,\mathrm {d}s-T\int _0^T Y_s\,\mathrm {d}Y_s}{T\int _0^T Y_s^2\,\mathrm {d}s-\bigl (\int _0^T Y_s\,\mathrm {d}s\bigr )^2}\\&=\frac{(Y_T-Y_0)\int _0^T Y_s\,\mathrm {d}s-\frac{T}{2}(Y_T^2-Y_0^2) +\frac{T}{2}\sigma _1^2\int _0^T Y_s\,\mathrm {d}s}{T\int _0^T Y_s^2\,\mathrm {d}s-\bigl (\int _0^T Y_s\,\mathrm {d}s\bigr )^2} \\&= \frac{\frac{1}{T}\bigl (\mathrm {e}^{bT}Y_T-\mathrm {e}^{bT}Y_0\bigr ) \bigl (\mathrm {e}^{bT}\int _0^T Y_s\,\mathrm {d}s \bigr ) -\frac{1}{2}\bigl (\mathrm {e}^{2bT}Y_T^2-\mathrm {e}^{2bT}Y_0^2\bigr ) +\frac{1}{2}\sigma _1^2\mathrm {e}^{bT}\bigl (\mathrm {e}^{bT}\int _0^T Y_s\,\mathrm {d}s\bigr )}{\mathrm {e}^{2bT}\int _0^T Y_s^2\,\mathrm {d}s -\frac{1}{T}\bigl (\mathrm {e}^{bT}\int _0^T Y_s\,\mathrm {d}s\bigr )^2} \\&{\mathop {\longrightarrow }\limits ^{{\mathrm {a.s.}}}}\frac{0(V_Y-0)\bigl (-\frac{V_Y}{b}\bigr )-\frac{1}{2}(V_Y^2-0) +\frac{1}{2}\sigma _1^2 0\bigl (-\frac{V_Y}{b}\bigr )}{-\frac{V_Y^2}{2b}-0\bigl (-\frac{V_Y}{b}\bigr )^2}= b \end{aligned}$$

as \(T \rightarrow \infty \). \(\square \)
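The Itô-formula identity for \(\int _0^T Y_s \, \mathrm {d}Y_s\) used in the proof has a transparent discrete counterpart: \(\sum _k Y_k \Delta Y_k = \frac{1}{2}(Y_N^2 - Y_0^2) - \frac{1}{2} \sum _k (\Delta Y_k)^2\) holds exactly, and the quadratic-variation sum approximates \(\sigma _1^2 \int _0^T Y_s \, \mathrm {d}s\). A sketch with our own discretization choices:

```python
import math, random

# Euler-Maruyama path of the CIR factor (illustrative parameter values)
a, b, s1, T, n = 1.0, 1.0, 0.5, 10.0, 100000
rng = random.Random(4)
dt, sdt = T / n, math.sqrt(T / n)
ys = [1.0]
for _ in range(n):
    y = ys[-1]
    ys.append(max(y + (a - b * y) * dt + s1 * math.sqrt(y) * rng.gauss(0.0, sdt), 0.0))

# both sides of the Ito identity, with integrals as discrete sums
lhs = sum(ys[k] * (ys[k + 1] - ys[k]) for k in range(n))            # int Y dY
rhs = 0.5 * (ys[-1] ** 2 - ys[0] ** 2) \
      - 0.5 * s1 ** 2 * sum(ys[k] * dt for k in range(n))
```

The two sides agree up to the small error of approximating the quadratic variation \(\sum _k (\Delta Y_k)^2\) by \(\sigma _1^2 \int _0^T Y_s \, \mathrm {d}s\).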

Remark 4.6

For critical two-factor affine diffusion models, it will turn out that the CLSEs of a and \(\alpha \) are not even weakly consistent, but the CLSEs of b, \(\beta \) and \(\gamma \) are weakly consistent, see Theorem 6.2. \(\square \)

Remark 4.7

For supercritical two-factor affine diffusion models, it will turn out that the CLSEs of a and \(\alpha \) are not even weakly consistent, but the CLSEs of \(\beta \) and \(\gamma \) are weakly consistent, see Theorem 7.3. \(\square \)

5 Asymptotic behavior of CLSE: subcritical case

Theorem 5.1

Let us consider the two-factor affine diffusion model (1.1) with \(a, b \in \mathbb {R}_{++}\), \(\alpha , \beta \in \mathbb {R}\), \(\gamma \in \mathbb {R}_{++}\), \(\sigma _1 \in \mathbb {R}_{++}\), \(\sigma _2, \sigma _3 \in \mathbb {R}_+\) and \(\varrho \in [-1, 1]\) with a random initial value \((\eta _0, \zeta _0)\) independent of \((W_t, B_t, L_t)_{t\in \mathbb {R}_+}\) satisfying \(\mathbb {P}(\eta _0 \in \mathbb {R}_+) = 1\). Suppose that \((1 - \varrho ^2) \sigma _2^2 + \sigma _3^2 > 0\). Then the CLSE of \({\varvec{\theta }}= (a, b, \alpha , \beta , \gamma )^\top \) is asymptotically normal, namely,

$$\begin{aligned} T^{\frac{1}{2}} ({\widehat{{\varvec{\theta }}}}_T - {\varvec{\theta }}) {\mathop {\longrightarrow }\limits ^{\mathcal {D}}}\mathcal {N}_5({\varvec{0}}, [\mathbb {E}({\varvec{G}}_\infty )]^{-1} \mathbb {E}(\widetilde{{\varvec{G}}}_\infty ) [\mathbb {E}({\varvec{G}}_\infty )]^{-1}) \qquad \text {as }\ T \rightarrow \infty , \end{aligned}$$
(5.1)

where \({\varvec{G}}_\infty \) is given in (4.3) and \(\widetilde{{\varvec{G}}}_\infty \) has the form

$$\begin{aligned} \begin{bmatrix} \sigma _1^2 Y_\infty&- \sigma _1^2 Y_\infty ^2&\varrho \sigma _1 \sigma _2 Y_\infty&- \varrho \sigma _1 \sigma _2 Y_\infty ^2&- \varrho \sigma _1 \sigma _2 Y_\infty X_\infty \\ - \sigma _1^2 Y_\infty ^2&\sigma _1^2 Y_\infty ^3&- \varrho \sigma _1 \sigma _2 Y_\infty ^2&\varrho \sigma _1 \sigma _2 Y_\infty ^3&\varrho \sigma _1 \sigma _2 Y_\infty ^2 X_\infty \\ \varrho \sigma _1 \sigma _2 Y_\infty&- \varrho \sigma _1 \sigma _2 Y_\infty ^2&\sigma _2^2 Y_\infty + \sigma _3^2&- (\sigma _2^2 Y_\infty + \sigma _3^2) Y_\infty&- (\sigma _2^2 Y_\infty + \sigma _3^2) X_\infty \\ - \varrho \sigma _1 \sigma _2 Y_\infty ^2&\varrho \sigma _1 \sigma _2 Y_\infty ^3&- (\sigma _2^2 Y_\infty + \sigma _3^2) Y_\infty&(\sigma _2^2 Y_\infty + \sigma _3^2) Y_\infty ^2&(\sigma _2^2 Y_\infty + \sigma _3^2) Y_\infty X_\infty \\ - \varrho \sigma _1 \sigma _2 Y_\infty X_\infty&\varrho \sigma _1 \sigma _2 Y_\infty ^2 X_\infty&- (\sigma _2^2 Y_\infty + \sigma _3^2) X_\infty&(\sigma _2^2 Y_\infty + \sigma _3^2) Y_\infty X_\infty&(\sigma _2^2 Y_\infty + \sigma _3^2) X_\infty ^2 \end{bmatrix}, \end{aligned}$$

where the random vector \((Y_\infty , X_\infty )\) is given by Theorem A.1.

Proof

By (3.5), we have

$$\begin{aligned} T^{\frac{1}{2}} ({\widehat{{\varvec{\theta }}}}_T - {\varvec{\theta }}) = (T^{-1} {\varvec{G}}_T)^{-1} (T^{-\frac{1}{2}} {\varvec{h}}_T) \end{aligned}$$
(5.2)

on the event where \({\varvec{G}}_T\) is invertible, which holds almost surely, see Lemma 3.3. We have \((T^{-1} {\varvec{G}}_T)^{-1} {\mathop {\longrightarrow }\limits ^{{\mathrm {a.s.}}}}[\mathbb {E}({\varvec{G}}_\infty )]^{-1}\) as \(T \rightarrow \infty \) by (4.4). The process \(({\varvec{h}}_t)_{t\in \mathbb {R}_+}\) is a 5-dimensional continuous local martingale with quadratic variation process \(\langle {\varvec{h}}\rangle _t = \widetilde{{\varvec{G}}}_t\), \(t \in \mathbb {R}_+\), where

$$\begin{aligned} \widetilde{{\varvec{G}}}_t := \int _0^t \begin{bmatrix} \sigma _1^2 Y_s&- \sigma _1^2 Y_s^2&\varrho \sigma _1 \sigma _2 Y_s&- \varrho \sigma _1 \sigma _2 Y_s^2&- \varrho \sigma _1 \sigma _2 Y_s X_s \\ - \sigma _1^2 Y_s^2&\sigma _1^2 Y_s^3&- \varrho \sigma _1 \sigma _2 Y_s^2&\varrho \sigma _1 \sigma _2 Y_s^3&\varrho \sigma _1 \sigma _2 Y_s^2 X_s \\ \varrho \sigma _1 \sigma _2 Y_s&- \varrho \sigma _1 \sigma _2 Y_s^2&\sigma _2^2 Y_s + \sigma _3^2&- (\sigma _2^2 Y_s + \sigma _3^2) Y_s&- (\sigma _2^2 Y_s + \sigma _3^2) X_s \\ - \varrho \sigma _1 \sigma _2 Y_s^2&\varrho \sigma _1 \sigma _2 Y_s^3&- (\sigma _2^2 Y_s + \sigma _3^2) Y_s&(\sigma _2^2 Y_s + \sigma _3^2) Y_s^2&(\sigma _2^2 Y_s + \sigma _3^2) Y_s X_s \\ - \varrho \sigma _1 \sigma _2 Y_s X_s&\varrho \sigma _1 \sigma _2 Y_s^2 X_s&- (\sigma _2^2 Y_s + \sigma _3^2) X_s&(\sigma _2^2 Y_s + \sigma _3^2) Y_s X_s&(\sigma _2^2 Y_s + \sigma _3^2) X_s^2 \end{bmatrix} \mathrm {d}s. \end{aligned}$$

By Theorem A.2, we obtain

$$\begin{aligned} T^{-1} \widetilde{{\varvec{G}}}_T {\mathop {\longrightarrow }\limits ^{{\mathrm {a.s.}}}}\mathbb {E}(\widetilde{{\varvec{G}}}_\infty ) \qquad \text {as }\ T \rightarrow \infty , \end{aligned}$$
(5.3)

since, by Theorem B.2, the entries of \(\mathbb {E}(\widetilde{{\varvec{G}}}_\infty )\) exist and are finite. Using (5.3), Theorem C.2 yields \(T^{-\frac{1}{2}} {\varvec{h}}_T {\mathop {\longrightarrow }\limits ^{\mathcal {D}}}\mathcal {N}_5({\varvec{0}}, \mathbb {E}(\widetilde{{\varvec{G}}}_\infty ))\) as \(T \rightarrow \infty \). Hence, by (5.2) and by Slutsky’s lemma,

$$\begin{aligned} T^{\frac{1}{2}} ({\widehat{{\varvec{\theta }}}}_T - {\varvec{\theta }}) {\mathop {\longrightarrow }\limits ^{\mathcal {D}}}[\mathbb {E}({\varvec{G}}_\infty )]^{-1} \mathcal {N}_5\bigl ({\varvec{0}}, \mathbb {E}(\widetilde{{\varvec{G}}}_\infty )\bigr ) =\mathcal {N}_5\bigl ({\varvec{0}}, [\mathbb {E}({\varvec{G}}_\infty )]^{-1} \mathbb {E}(\widetilde{{\varvec{G}}}_\infty ) \bigl ([\mathbb {E}({\varvec{G}}_\infty )]^{-1}\bigr )^\top \bigr ) \end{aligned}$$

as \(T \rightarrow \infty \). \(\square \)

6 Asymptotic behavior of CLSE: critical case

First we present an auxiliary lemma. A proof can be found in the arXiv version of this paper, Bolyog and Pap (2017).

Lemma 6.1

If \((\mathcal {Y}_t, \mathcal {X}_t)_{t\in \mathbb {R}_+}\) and \((\widetilde{\mathcal {Y}}_t, \widetilde{\mathcal {X}}_t)_{t\in \mathbb {R}_+}\) are continuous semimartingales such that \((\mathcal {Y}_t, \mathcal {X}_t)_{t\in \mathbb {R}_+} {\mathop {=}\limits ^{\mathcal {D}}}(\widetilde{\mathcal {Y}}_t, \widetilde{\mathcal {X}}_t)_{t\in \mathbb {R}_+}\), then

$$\begin{aligned}&\biggl (\mathcal {Y}_1, \mathcal {X}_1, \int _0^1 \mathcal {X}_s \, \mathrm {d}\mathcal {Y}_s, \int _0^1 \mathcal {Y}_s^k \mathcal {X}_s^\ell \, \mathrm {d}s: k, \ell \in \mathbb {Z}_+, k + \ell \leqslant n\biggr ) \\&{\mathop {=}\limits ^{\mathcal {D}}}\biggl (\widetilde{\mathcal {Y}}_1, \widetilde{\mathcal {X}}_1, \int _0^1 \widetilde{\mathcal {X}}_s \, \mathrm {d}\widetilde{\mathcal {Y}}_s, \int _0^1 \widetilde{\mathcal {Y}}_s^k \widetilde{\mathcal {X}}_s^\ell \, \mathrm {d}s : k, \ell \in \mathbb {Z}_+, k + \ell \leqslant n\biggr ) \end{aligned}$$

for each \(n \in \mathbb {N}\).

Theorem 6.2

Let us consider the two-factor affine diffusion model (1.1) with \(a \in \mathbb {R}_+\), \(b = 0\), \(\alpha \in \mathbb {R}\), \(\beta = 0\), \(\gamma = 0\), \(\sigma _1, \sigma _2, \sigma _3 \in \mathbb {R}_+\) and \(\varrho \in [-1, 1]\) with a random initial value \((\eta _0, \zeta _0)\) independent of \((W_t, B_t, L_t)_{t\in \mathbb {R}_+}\) satisfying \(\mathbb {P}(\eta _0 \in \mathbb {R}_+) = 1\). Suppose that \((1 - \varrho ^2) \sigma _2^2 + \sigma _3^2 > 0\). Then

$$\begin{aligned} \begin{bmatrix} \widehat{a}_T - a \\ T \widehat{b}_T \\ \widehat{\alpha }_T - \alpha \\ T \widehat{\beta }_T \\ T \widehat{\gamma }_T \end{bmatrix} {\mathop {\longrightarrow }\limits ^{\mathcal {D}}}\begin{bmatrix} \left( \int _0^1 \begin{bmatrix} 1 \\ - \mathcal {Y}_s \end{bmatrix} \begin{bmatrix} 1 \\ - \mathcal {Y}_s \end{bmatrix}^\top \mathrm {d}s\right) ^{-1} \begin{bmatrix} \mathcal {Y}_1 - a \\ -\frac{1}{2} \mathcal {Y}_1^2 + \bigl (a + \frac{\sigma _1^2}{2}\bigr ) \int _0^1 \mathcal {Y}_s \, \mathrm {d}s \end{bmatrix} \\ \left( \int _0^1 \begin{bmatrix} 1 \\ - \mathcal {Y}_s \\ - \mathcal {X}_s \end{bmatrix} \begin{bmatrix} 1 \\ - \mathcal {Y}_s \\ - \mathcal {X}_s \end{bmatrix}^\top \mathrm {d}s\right) ^{-1} \begin{bmatrix} \mathcal {X}_1 - \alpha \\ - \mathcal {Y}_1 \mathcal {X}_1 + (\alpha + \varrho \sigma _1 \sigma _2) \int _0^1 \mathcal {Y}_s \, \mathrm {d}s + \int _0^1 \mathcal {X}_s \, \mathrm {d}\mathcal {Y}_s \\ - \frac{1}{2} \mathcal {X}_1^2 + \alpha \int _0^1 \mathcal {X}_s \, \mathrm {d}s + \frac{\sigma _2^2}{2} \int _0^1 \mathcal {Y}_s \, \mathrm {d}s \end{bmatrix} \end{bmatrix} \end{aligned}$$
(6.1)

as \(T \rightarrow \infty \), where \((\mathcal {Y}_t, \mathcal {X}_t)_{t\in \mathbb {R}_+}\) is the unique strong solution of the SDE

$$\begin{aligned} {\left\{ \begin{array}{ll} \mathrm {d}\mathcal {Y}_t = a \, \mathrm {d}t + \sigma _1 \sqrt{\mathcal {Y}_t} \, \mathrm {d}W_t, \\ \mathrm {d}\mathcal {X}_t = \alpha \, \mathrm {d}t + \sigma _2 \sqrt{\mathcal {Y}_t} \, (\varrho \, \mathrm {d}W_t + \sqrt{1 - \varrho ^2} \, \mathrm {d}B_t), \end{array}\right. } \qquad t \in [0, \infty ), \end{aligned}$$
(6.2)

with initial value \((\mathcal {Y}_0, \mathcal {X}_0) = (0, 0)\).

Proof

By (3.5), we have

$$\begin{aligned} \begin{bmatrix} \widehat{a}_T - a \\ T \widehat{b}_T \end{bmatrix}&= \begin{bmatrix} 1&\quad 0 \\ 0&\quad T \end{bmatrix} \begin{bmatrix} \widehat{a}_T - a \\ \widehat{b}_T \end{bmatrix} = {\text {diag}}(1, T) ({\varvec{G}}_T^{(1)})^{-1} {\varvec{h}}_T^{(1)} \\&= \bigl ({\text {diag}}(T^{-\frac{1}{2}}, T^{-\frac{3}{2}}) {\varvec{G}}_T^{(1)} {\text {diag}}(T^{-\frac{1}{2}}, T^{-\frac{3}{2}})\bigr )^{-1} {\text {diag}}(T^{-1}, T^{-2}) {\varvec{h}}_T^{(1)} \\&= \begin{bmatrix} 1&\quad - \frac{1}{T^2} \int _0^T Y_s \, \mathrm {d}s \\ - \frac{1}{T^2} \int _0^T Y_s \, \mathrm {d}s&\quad \frac{1}{T^3} \int _0^T Y_s^2 \, \mathrm {d}s \end{bmatrix}^{-1} \begin{bmatrix} \frac{\sigma _1}{T} \int _0^T Y_s^{\frac{1}{2}} \, \mathrm {d}W_s \\ - \frac{\sigma _1}{T^2} \int _0^T Y_s^{\frac{3}{2}} \, \mathrm {d}W_s \end{bmatrix}. \end{aligned}$$

In a similar way,

$$\begin{aligned} \begin{bmatrix} \widehat{\alpha }_T - \alpha \\ T \widehat{\beta }_T \\ T \widehat{\gamma }_T \end{bmatrix}= & {} \begin{bmatrix} 1&\quad - \frac{1}{T^2} \int _0^T Y_s \, \mathrm {d}s&\quad - \frac{1}{T^2} \int _0^T X_s \, \mathrm {d}s \\ - \frac{1}{T^2} \int _0^T Y_s \, \mathrm {d}s&\quad \frac{1}{T^3} \int _0^T Y_s^2 \, \mathrm {d}s&\quad \frac{1}{T^3} \int _0^T Y_s X_s \, \mathrm {d}s \\ - \frac{1}{T^2} \int _0^T X_s \, \mathrm {d}s&\quad \frac{1}{T^3} \int _0^T Y_s X_s \, \mathrm {d}s&\quad \frac{1}{T^3} \int _0^T X_s^2 \, \mathrm {d}s \end{bmatrix}^{-1}\\&\times \begin{bmatrix} \frac{\sigma _2}{T} \int _0^T Y_s^{\frac{1}{2}} \, \mathrm {d}\widetilde{W}_s + \frac{\sigma _3}{T} L_T \\ - \frac{\sigma _2}{T^2} \int _0^T Y_s^{\frac{3}{2}} \, \mathrm {d}\widetilde{W}_s - \frac{\sigma _3}{T^2} \int _0^T Y_s \, \mathrm {d}L_s \\ - \frac{\sigma _2}{T^2} \int _0^T Y_s^{\frac{1}{2}} X_s \, \mathrm {d}\widetilde{W}_s - \frac{\sigma _3}{T^2} \int _0^T X_s \, \mathrm {d}L_s \end{bmatrix}. \end{aligned}$$
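
The scaling by \({\text {diag}}(T^{-\frac{1}{2}}, T^{-\frac{3}{2}})\) can also be checked numerically: reading \({\varvec{G}}_T^{(1)}\) off the first display as the \(2 \times 2\) matrix with entries \(T\), \(-\int _0^T Y_s \, \mathrm {d}s\) and \(\int _0^T Y_s^2 \, \mathrm {d}s\), a Riemann-sum sketch (the function name and the deterministic stand-in path are ours) is:

```python
import numpy as np

def normalized_gram(Y, T):
    """Riemann-sum sketch of diag(T^{-1/2}, T^{-3/2}) G_T^{(1)} diag(T^{-1/2}, T^{-3/2}),
    with G_T^{(1)} = [[T, -int Y], [-int Y, int Y^2]] as read off from the display above.
    """
    n = len(Y) - 1
    dt = T / n
    iY = np.sum(Y[:-1]) * dt        # left Riemann sum for int_0^T Y_s ds
    iY2 = np.sum(Y[:-1] ** 2) * dt  # left Riemann sum for int_0^T Y_s^2 ds
    G = np.array([[T, -iY], [-iY, iY2]])
    D = np.diag([T ** -0.5, T ** -1.5])
    return D @ G @ D

T = 4.0
Y = np.linspace(0.0, 2.0, 1001)  # deterministic stand-in path, for illustration only
M = normalized_gram(Y, T)
```

The (1,1) entry equals 1 exactly, as in the third display above, and the off-diagonal entries are \(-T^{-2} \int _0^T Y_s \, \mathrm {d}s\).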

The aim of the following discussion is to prove

$$\begin{aligned}&\biggl (\frac{1}{T} Y_T, \frac{1}{T} X_T, \frac{1}{T^2} \int _0^T X_s \, \mathrm {d}Y_s, \frac{1}{T^{k+\ell +1}} \int _0^T Y_s^k X_s^\ell \, \mathrm {d}s : k, \ell \in \mathbb {Z}_+, k + \ell \leqslant 2\biggr )\nonumber \\&\quad {\mathop {\longrightarrow }\limits ^{\mathcal {D}}}\biggl (\mathcal {Y}_1, \mathcal {X}_1, \int _0^1 \mathcal {X}_s \, \mathrm {d}\mathcal {Y}_s, \int _0^1 \mathcal {Y}_s^k \mathcal {X}_s^\ell \, \mathrm {d}s : k, \ell \in \mathbb {Z}_+, k + \ell \leqslant 2\biggr ) \end{aligned}$$
(6.3)

as \(T \rightarrow \infty \). By part (ii) of Remark 2.7 in Barczy et al. (2013), we have

$$\begin{aligned} \bigl (\widetilde{\mathcal {Y}}_t^{(T)}, \widetilde{\mathcal {X}}_t^{(T)}\bigr )_{t\in \mathbb {R}_+} := \Bigl (\frac{1}{T} \mathcal {Y}_{Tt}, \frac{1}{T} \mathcal {X}_{Tt} \Bigr )_{t\in \mathbb {R}_+} {\mathop {=}\limits ^{\mathcal {D}}}(\mathcal {Y}_t, \mathcal {X}_t)_{t\in \mathbb {R}_+} \qquad \text {for all }\ T \in \mathbb {R}_{++}, \end{aligned}$$

since, by Proposition 2.1, \((\mathcal {Y}_t, \mathcal {X}_t)_{t\in \mathbb {R}_+}\) is an affine process with infinitesimal generator

$$\begin{aligned} (\mathcal {A}_{(\mathcal {Y},\mathcal {X})} f)(y, x)= & {} a f_1'(y, x) + \alpha f_2'(y, x)\\&+\,\, \frac{1}{2} y \bigl [\sigma _1^2 f_{1,1}''(y, x) + 2 \varrho \sigma _1 \sigma _2 f_{1,2}''(y, x) + \sigma _2^2 f_{2, 2}''(y,x) \bigr ]. \end{aligned}$$
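
The scaling identity can also be read off from the SDE (6.2) directly (a sketch, using pathwise uniqueness): for fixed \(T \in \mathbb {R}_{++}\), the processes \(\widetilde{W}_t^{(T)} := T^{-\frac{1}{2}} W_{Tt}\) and \(\widetilde{B}_t^{(T)} := T^{-\frac{1}{2}} B_{Tt}\), \(t \in \mathbb {R}_+\), are again independent standard Wiener processes, and

$$\begin{aligned} \mathrm {d}\widetilde{\mathcal {Y}}_t^{(T)} = \frac{1}{T} \, \mathrm {d}\mathcal {Y}_{Tt} = a \, \mathrm {d}t + \frac{\sigma _1}{T} \sqrt{\mathcal {Y}_{Tt}} \, \mathrm {d}W_{Tt} = a \, \mathrm {d}t + \sigma _1 \sqrt{\widetilde{\mathcal {Y}}_t^{(T)}} \, \mathrm {d}\widetilde{W}_t^{(T)}, \end{aligned}$$

since \(\mathrm {d}W_{Tt} = T^{\frac{1}{2}} \, \mathrm {d}\widetilde{W}_t^{(T)}\); the same computation applies to \(\widetilde{\mathcal {X}}_t^{(T)}\). Thus \((\widetilde{\mathcal {Y}}^{(T)}, \widetilde{\mathcal {X}}^{(T)})\) and \((\mathcal {Y}, \mathcal {X})\) solve (6.2) with the same initial value \((0, 0)\), so they share the same law.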

Hence, by Lemma 6.1, we obtain

$$\begin{aligned}&\biggl (\mathcal {Y}_1, \mathcal {X}_1, \int _0^1 \mathcal {X}_s \, \mathrm {d}\mathcal {Y}_s, \int _0^1 \mathcal {Y}_s^k \mathcal {X}_s^\ell \, \mathrm {d}s : k, \ell \in \mathbb {Z}_+, k + \ell \leqslant 2\biggr ) \\&{\mathop {=}\limits ^{\mathcal {D}}}\biggl (\widetilde{\mathcal {Y}}_1^{(T)}, \widetilde{\mathcal {X}}_1^{(T)}, \int _0^1 \widetilde{\mathcal {X}}_s^{(T)} \, \mathrm {d}\widetilde{\mathcal {Y}}_s^{(T)}, \int _0^1 \bigl (\widetilde{\mathcal {Y}}_s^{(T)}\bigr )^k \bigl (\widetilde{\mathcal {X}}_s^{(T)}\bigr )^\ell \, \mathrm {d}s : k, \ell \in \mathbb {Z}_+, k + \ell \leqslant 2\biggr ) \\&= \biggl (\frac{1}{T} \mathcal {Y}_T, \frac{1}{T} \mathcal {X}_T, \frac{1}{T^2} \int _0^T \mathcal {X}_s \, \mathrm {d}\mathcal {Y}_s, \frac{1}{T^{k+\ell +1}} \int _0^T \mathcal {Y}_s^k \mathcal {X}_s^\ell \, \mathrm {d}s : k, \ell \in \mathbb {Z}_+, k + \ell \leqslant 2\biggr ) \end{aligned}$$

for all \(T \in \mathbb {R}_{++}\). Then, by Slutsky’s lemma, in order to prove (6.3), it suffices to show the convergences

$$\begin{aligned}&\frac{1}{T} (Y_T - \mathcal {Y}_T) {\mathop {\longrightarrow }\limits ^{\mathbb {P}}}0, \qquad \frac{1}{T} (X_T - \mathcal {X}_T) {\mathop {\longrightarrow }\limits ^{\mathbb {P}}}0, \end{aligned}$$
(6.4)
$$\begin{aligned}&\frac{1}{T^2} \left( \int _0^T X_s \, \mathrm {d}Y_s - \int _0^T \mathcal {X}_s \, \mathrm {d}\mathcal {Y}_s\right) {\mathop {\longrightarrow }\limits ^{\mathbb {P}}}0, \qquad \frac{1}{T^{k+\ell +1}} \int _0^T (Y_s^k X_s^\ell - \mathcal {Y}_s^k \mathcal {X}_s^\ell ) \, \mathrm {d}s {\mathop {\longrightarrow }\limits ^{\mathbb {P}}}0 \nonumber \\ \end{aligned}$$
(6.5)

as \(T \rightarrow \infty \) for all \(k, \ell \in \mathbb {Z}_+\) with \(k + \ell \leqslant 2\). By (3.21) in Barczy et al. (2013), we have

$$\begin{aligned} \mathbb {E}(|Y_s - \mathcal {Y}_s|) \leqslant \mathbb {E}(Y_0), \qquad s \in \mathbb {R}_+, \end{aligned}$$
(6.6)

hence

$$\begin{aligned}&\mathbb {E}\left( \left| \frac{1}{T} (Y_T - \mathcal {Y}_T)\right| \right) \leqslant \frac{1}{T} \mathbb {E}(Y_0) \rightarrow 0, \\&\mathbb {E}\left( \left| \frac{1}{T^2} \int _0^T (Y_s - \mathcal {Y}_s) \, \mathrm {d}s\right| \right) \leqslant \frac{1}{T^2} \int _0^T \mathbb {E}(|Y_s - \mathcal {Y}_s|) \, \mathrm {d}s \leqslant \frac{1}{T} \mathbb {E}(Y_0) \rightarrow 0, \end{aligned}$$

as \(T \rightarrow \infty \), implying \(\frac{1}{T} (Y_T - \mathcal {Y}_T) {\mathop {\longrightarrow }\limits ^{\mathbb {P}}}0\) and \(\frac{1}{T^2} \int _0^T (Y_s - \mathcal {Y}_s) \, \mathrm {d}s {\mathop {\longrightarrow }\limits ^{\mathbb {P}}}0\) as \(T \rightarrow \infty \), i.e., the first convergence in (6.4) and the second convergence in (6.5) for \((k, \ell ) = (1, 0)\).

As in (3.23) in Barczy et al. (2013), we have \(\mathbb {E}(|X_s - \mathcal {X}_s|) \leqslant \mathbb {E}(|X_0|) + \sqrt{(\sigma _2^2 \mathbb {E}(Y_0) + \sigma _3^2) s}\) for all \(s \in \mathbb {R}_+\), hence

$$\begin{aligned} \sup _{s\in [0,T]} \mathbb {E}(|X_s - \mathcal {X}_s|) = {\text {O}}(T^{\frac{1}{2}}) \qquad \text {as }\ T \rightarrow \infty , \end{aligned}$$
(6.7)

thus

$$\begin{aligned} \mathbb {E}\left( \left| \frac{1}{T} (X_T - \mathcal {X}_T)\right| \right) =&\frac{1}{T} {\text {O}}(T^{\frac{1}{2}}) \rightarrow 0, \\ \mathbb {E}\left( \left| \frac{1}{T^2} \int _0^T (X_s - \mathcal {X}_s) \, \mathrm {d}s\right| \right) \leqslant&\frac{1}{T^2} \int _0^T \mathbb {E}(|X_s - \mathcal {X}_s|) \, \mathrm {d}s = \frac{1}{T^2} \int _0^T {\text {O}}(T^{\frac{1}{2}}) \, \mathrm {d}s\\ =&\frac{1}{T^2} {\text {O}}(T^{\frac{3}{2}}) \rightarrow 0, \end{aligned}$$

as \(T \rightarrow \infty \), implying \(\frac{1}{T} (X_T - \mathcal {X}_T) {\mathop {\longrightarrow }\limits ^{\mathbb {P}}}0\) and \(\frac{1}{T^2} \int _0^T (X_s - \mathcal {X}_s) \, \mathrm {d}s {\mathop {\longrightarrow }\limits ^{\mathbb {P}}}0\) as \(T \rightarrow \infty \), i.e., the second convergence in (6.4) and the second convergence in (6.5) for \((k, \ell ) = (0, 1)\).

As in (3.25) in Barczy et al. (2013), we have \(\mathbb {E}[(Y_s - \mathcal {Y}_s)^2] \leqslant 2 \mathbb {E}(Y_0^2) + 2 s \sigma _1^2 \mathbb {E}(Y_0)\) for all \(s \in \mathbb {R}_+\), hence

$$\begin{aligned} \sup _{s\in [0,T]} \mathbb {E}[(Y_s - \mathcal {Y}_s)^2] = {\text {O}}(T) \qquad \text {as }\ T \rightarrow \infty . \end{aligned}$$
(6.8)

By Proposition B.1, \(\mathbb {E}(Y_s^2) = \mathbb {E}(Y_0^2) + (2 a + \sigma _1^2) \bigl (\mathbb {E}(Y_0) s + a \frac{s^2}{2}\bigr )\) for all \(s \in \mathbb {R}_+\), hence

$$\begin{aligned} \sup _{s\in [0,T]} \mathbb {E}(Y_s^2) = {\text {O}}(T^2) \qquad \text {as }\ T \rightarrow \infty , \end{aligned}$$
(6.9)

and \(\sup _{s\in [0,T]} \mathbb {E}(\mathcal {Y}_s^2) = {\text {O}}(T^2)\) as \(T \rightarrow \infty \). We have

$$\begin{aligned} \mathbb {E}(|Y_s^2 - \mathcal {Y}_s^2|) = \mathbb {E}(|(Y_s - \mathcal {Y}_s) (Y_s + \mathcal {Y}_s)|)&\leqslant \sqrt{\mathbb {E}[(Y_s - \mathcal {Y}_s)^2] \mathbb {E}[(Y_s + \mathcal {Y}_s)^2]} \\&\leqslant \sqrt{2 \mathbb {E}[(Y_s - \mathcal {Y}_s)^2] (\mathbb {E}(Y_s^2) + \mathbb {E}(\mathcal {Y}_s^2))}, \end{aligned}$$

yielding

$$\begin{aligned} \sup _{s\in [0,T]} \mathbb {E}(|Y_s^2 - \mathcal {Y}_s^2|) = \sqrt{2 {\text {O}}(T) ({\text {O}}(T^2) + {\text {O}}(T^2))} = {\text {O}}(T^{\frac{3}{2}}) \qquad \text {as }\ T \rightarrow \infty , \end{aligned}$$

thus

$$\begin{aligned} \mathbb {E}\left( \left| \frac{1}{T^3} \int _0^T (Y_s^2 - \mathcal {Y}_s^2) \, \mathrm {d}s\right| \right)\leqslant & {} \frac{1}{T^3} \int _0^T \mathbb {E}(|Y_s^2 - \mathcal {Y}_s^2|) \, \mathrm {d}s = \frac{1}{T^3} \int _0^T {\text {O}}(T^{\frac{3}{2}}) \, \mathrm {d}s\\= & {} \frac{1}{T^3} {\text {O}}(T^{\frac{5}{2}}) \rightarrow 0, \end{aligned}$$

as \(T \rightarrow \infty \), implying \(\frac{1}{T^3} \int _0^T (Y_s^2 - \mathcal {Y}_s^2) \, \mathrm {d}s {\mathop {\longrightarrow }\limits ^{\mathbb {P}}}0\) as \(T \rightarrow \infty \), i.e., the second convergence in (6.5) for \((k, \ell ) = (2, 0)\).

In a similar way, \(\mathbb {E}[(X_s - \mathcal {X}_s)^2] \leqslant 2 \mathbb {E}(X_0^2) + 2 s (\sigma _2^2 \mathbb {E}(Y_0) + \sigma _3^2)\) for all \(s \in \mathbb {R}_+\), hence

$$\begin{aligned} \sup _{s\in [0,T]} \mathbb {E}[(X_s - \mathcal {X}_s)^2] = {\text {O}}(T) \qquad \text {as }\ T \rightarrow \infty . \end{aligned}$$
(6.10)

By Proposition B.1, \(\mathbb {E}(X_s^2) = \mathbb {E}(X_0^2) + 2 \alpha \bigl (s \mathbb {E}(X_0) + \alpha \frac{s^2}{2}\bigr ) + \sigma _2^2 \bigl (s \mathbb {E}(Y_0) + a \frac{s^2}{2}\bigr ) + \sigma _3^2 s\), thus \(\sup _{s\in [0,T]} \mathbb {E}(X_s^2) = {\text {O}}(T^2)\) and \(\sup _{s\in [0,T]} \mathbb {E}(\mathcal {X}_s^2) = {\text {O}}(T^2)\) as \(T \rightarrow \infty \). We have

$$\begin{aligned} \mathbb {E}(|X_s^2 - \mathcal {X}_s^2|) \leqslant \sqrt{2 \mathbb {E}[(X_s - \mathcal {X}_s)^2] (\mathbb {E}(X_s^2) + \mathbb {E}(\mathcal {X}_s^2))}, \end{aligned}$$

yielding

$$\begin{aligned} \sup _{s\in [0,T]} \mathbb {E}(|X_s^2 - \mathcal {X}_s^2|) = \sqrt{2 {\text {O}}(T) ({\text {O}}(T^2) + {\text {O}}(T^2))} = {\text {O}}(T^{\frac{3}{2}}) \qquad \text {as }\ T \rightarrow \infty , \end{aligned}$$

thus

$$\begin{aligned} \mathbb {E}\left( \left| \frac{1}{T^3} \int _0^T (X_s^2 - \mathcal {X}_s^2) \, \mathrm {d}s\right| \right)\leqslant & {} \frac{1}{T^3} \int _0^T \mathbb {E}(|X_s^2 - \mathcal {X}_s^2|) \, \mathrm {d}s = \frac{1}{T^3} \int _0^T {\text {O}}(T^{\frac{3}{2}}) \, \mathrm {d}s\\= & {} \frac{1}{T^3} {\text {O}}(T^{\frac{5}{2}}) \rightarrow 0, \end{aligned}$$

as \(T \rightarrow \infty \), implying \(\frac{1}{T^3} \int _0^T (X_s^2 - \mathcal {X}_s^2) \, \mathrm {d}s {\mathop {\longrightarrow }\limits ^{\mathbb {P}}}0\) as \(T \rightarrow \infty \), i.e., the second convergence in (6.5) for \((k, \ell ) = (0, 2)\).

Further,

$$\begin{aligned} \mathbb {E}(|Y_s X_s - \mathcal {Y}_s \mathcal {X}_s|)\leqslant & {} \mathbb {E}(|Y_s - \mathcal {Y}_s| |X_s|) + \mathbb {E}(\mathcal {Y}_s |X_s - \mathcal {X}_s|) \\\leqslant & {} \sqrt{\mathbb {E}[(Y_s - \mathcal {Y}_s)^2] \mathbb {E}(X_s^2)}+ \sqrt{\mathbb {E}(\mathcal {Y}_s^2) \mathbb {E}[(X_s - \mathcal {X}_s)^2]} \end{aligned}$$

yields

$$\begin{aligned} \sup _{s\in [0,T]} \mathbb {E}(|Y_s X_s - \mathcal {Y}_s \mathcal {X}_s|) = \sqrt{{\text {O}}(T) {\text {O}}(T^2)} + \sqrt{{\text {O}}(T^2) {\text {O}}(T)} = {\text {O}}(T^{\frac{3}{2}}) \qquad \text {as }\ T \rightarrow \infty , \end{aligned}$$

thus

$$\begin{aligned} \mathbb {E}\left( \left| \frac{1}{T^3} \int _0^T (Y_s X_s - \mathcal {Y}_s \mathcal {X}_s) \, \mathrm {d}s\right| \right)\leqslant & {} \frac{1}{T^3} \int _0^T \mathbb {E}(|Y_s X_s - \mathcal {Y}_s \mathcal {X}_s|) \, \mathrm {d}s = \frac{1}{T^3} \int _0^T {\text {O}}(T^{\frac{3}{2}}) \, \mathrm {d}s\\= & {} \frac{1}{T^3} {\text {O}}(T^{\frac{5}{2}}) \rightarrow 0, \end{aligned}$$

as \(T \rightarrow \infty \), implying \(\frac{1}{T^3} \int _0^T (Y_s X_s - \mathcal {Y}_s \mathcal {X}_s) \, \mathrm {d}s {\mathop {\longrightarrow }\limits ^{\mathbb {P}}}0\) as \(T \rightarrow \infty \), i.e., the second convergence in (6.5) for \((k, \ell ) = (1, 1)\).

Using the triangle and Cauchy–Schwarz inequalities, we obtain

$$\begin{aligned} \mathbb {E}\left( \left| \int _0^T X_s \, \mathrm {d}Y_s - \int _0^T \mathcal {X}_s \, \mathrm {d}\mathcal {Y}_s\right| \right)\leqslant & {} \mathbb {E}\left( \left| \int _0^T (X_s - \mathcal {X}_s) \, \mathrm {d}Y_s\right| \right) + \mathbb {E}\left( \left| \int _0^T \mathcal {X}_s \, \mathrm {d}(Y_s - \mathcal {Y}_s)\right| \right) \\\leqslant & {} \sqrt{E_1(T)} + \sqrt{E_2(T)} \end{aligned}$$

with

$$\begin{aligned} E_1(T) := \mathbb {E}\left( \left| \int _0^T (X_s - \mathcal {X}_s) \, \mathrm {d}Y_s\right| ^2\right) , \qquad E_2(T) := \mathbb {E}\left( \left| \int _0^T \mathcal {X}_s \, \mathrm {d}(Y_s - \mathcal {Y}_s)\right| ^2\right) . \end{aligned}$$

Using \(\mathrm {d}Y_s = a \, \mathrm {d}s + \sigma _1 \sqrt{Y_s} \, \mathrm {d}W_s\), we have

$$\begin{aligned} E_1(T)= & {} \mathbb {E}\left( \left| a \int _0^T (X_s - \mathcal {X}_s) \, \mathrm {d}s + \sigma _1 \int _0^T (X_s - \mathcal {X}_s) \sqrt{Y_s} \, \mathrm {d}W_s\right| ^2\right) \\\leqslant & {} 2 a^2 E_{1,1}(T) + 2 \sigma _1^2 E_{1,2}(T) \end{aligned}$$

with

$$\begin{aligned} E_{1,1}(T) := \mathbb {E}\left( \left| \int _0^T (X_s - \mathcal {X}_s) \, \mathrm {d}s\right| ^2\right) , \qquad E_{1,2}(T) := \mathbb {E}\left( \left| \int _0^T (X_s - \mathcal {X}_s) \sqrt{Y_s} \, \mathrm {d}W_s\right| ^2\right) . \end{aligned}$$

Applying (6.10), we obtain

$$\begin{aligned} E_{1,1}(T)&= \mathbb {E}\left( \int _0^T \int _0^T (X_s - \mathcal {X}_s) (X_u - \mathcal {X}_u) \, \mathrm {d}s \, \mathrm {d}u\right) \\&= \int _0^T \int _0^T \mathbb {E}[(X_s - \mathcal {X}_s) (X_u - \mathcal {X}_u)] \, \mathrm {d}s \, \mathrm {d}u \\&\leqslant \int _0^T \int _0^T \sqrt{\mathbb {E}[(X_s - \mathcal {X}_s)^2] \mathbb {E}[(X_u - \mathcal {X}_u)^2]} \, \mathrm {d}s \, \mathrm {d}u\\&= \int _0^T \int _0^T \sqrt{{\text {O}}(T) {\text {O}}(T)} \, \mathrm {d}s \, \mathrm {d}u = {\text {O}}(T^3). \end{aligned}$$

By the Itô isometry and the Cauchy–Schwarz inequality, we obtain

$$\begin{aligned} E_{1,2}(T)= & {} \mathbb {E}\left( \int _0^T (X_s - \mathcal {X}_s)^2 Y_s \, \mathrm {d}s\right) = \int _0^T \mathbb {E}[(X_s - \mathcal {X}_s)^2 Y_s] \, \mathrm {d}s\\\leqslant & {} \int _0^T \sqrt{\mathbb {E}[(X_s - \mathcal {X}_s)^4] \mathbb {E}(Y_s^2)} \, \mathrm {d}s. \end{aligned}$$

Using \(X_t = X_0 + \alpha t + \sigma _2 \int _0^t \sqrt{Y_s} \, \mathrm {d}\widetilde{W}_s + \sigma _3 L_t\) and \(\mathcal {X}_t = \alpha t + \sigma _2 \int _0^t \sqrt{\mathcal {Y}_s} \, \mathrm {d}\widetilde{W}_s\), we get \(X_t - \mathcal {X}_t = X_0 + \sigma _2 \int _0^t (\sqrt{Y_s} - \sqrt{\mathcal {Y}_s}) \, \mathrm {d}\widetilde{W}_s + \sigma _3 L_t\), and, applying the Minkowski inequality and a martingale moment inequality (Karatzas and Shreve 1991, 3.3.25), we obtain

$$\begin{aligned} (\mathbb {E}[(X_t - \mathcal {X}_t)^4])^{\frac{1}{4}}&\leqslant [\mathbb {E}(X_0^4)]^{\frac{1}{4}} + \sigma _2\left( \mathbb {E}\left[ \left( \int _0^t (\sqrt{Y_s} - \sqrt{\mathcal {Y}_s}) \, \mathrm {d}\widetilde{W}_s\right) ^4 \right] \right) ^{\frac{1}{4}}+ \sigma _3 [\mathbb {E}(L_t^4)]^{\frac{1}{4}} \\&\leqslant [\mathbb {E}(X_0^4)]^{\frac{1}{4}}+ \sigma _2\left( (2 \cdot 3)^2 t \mathbb {E}\left( \int _0^t (\sqrt{Y_s} - \sqrt{\mathcal {Y}_s})^4 \, \mathrm {d}s\right) \right) ^{\frac{1}{4}}+ \sigma _3 \root 4 \of {3} \sqrt{t} \\&\leqslant [\mathbb {E}(X_0^4)]^{\frac{1}{4}} + \sigma _2 \left( 36 t \int _0^t \mathbb {E}[(Y_s - \mathcal {Y}_s)^2] \, \mathrm {d}s\right) ^{\frac{1}{4}} + \sigma _3 \root 4 \of {3} \sqrt{t}. \end{aligned}$$

Applying (6.8), we get

$$\begin{aligned} \sup _{t\in [0,T]} \mathbb {E}[(X_t - \mathcal {X}_t)^4] = {\text {O}}(T^3) \qquad \text {as }\ T \rightarrow \infty , \end{aligned}$$
(6.11)

which, by (6.9), implies \(E_{1,2}(T) = \int _0^T \sqrt{{\text {O}}(T^3) {\text {O}}(T^2)} \, \mathrm {d}s = {\text {O}}(T^{\frac{7}{2}})\) as \(T \rightarrow \infty \). Using \(E_{1,1}(T) = {\text {O}}(T^3)\) as \(T \rightarrow \infty \), we conclude \(E_1(T) = {\text {O}}(T^3) + {\text {O}}(T^{\frac{7}{2}})= {\text {O}}(T^{\frac{7}{2}})\) as \(T \rightarrow \infty \).

Using \(\mathrm {d}Y_s = a \, \mathrm {d}s + \sigma _1 \sqrt{Y_s} \, \mathrm {d}W_s\) and \(\mathrm {d}\mathcal {Y}_s = a \, \mathrm {d}s + \sigma _1 \sqrt{\mathcal {Y}_s} \, \mathrm {d}W_s\), we obtain \(\mathrm {d}(Y_t - \mathcal {Y}_t) = \sigma _1 (\sqrt{Y_t} - \sqrt{\mathcal {Y}_t}) \, \mathrm {d}W_t\), thus

$$\begin{aligned} E_2(T)&= \sigma _1^2 \mathbb {E}\left( \int _0^T \mathcal {X}_s^2 (\sqrt{Y_s} - \sqrt{\mathcal {Y}_s})^2 \, \mathrm {d}s\right) \leqslant \sigma _1^2 \int _0^T \mathbb {E}[\mathcal {X}_s^2 |Y_s - \mathcal {Y}_s|] \, \mathrm {d}s\\&\leqslant \sigma _1^2 \int _0^T \sqrt{\mathbb {E}(\mathcal {X}_s^4) \mathbb {E}[(Y_s - \mathcal {Y}_s)^2]} \, \mathrm {d}s. \end{aligned}$$

Using \(\mathcal {X}_t = \alpha t + \sigma _2 \int _0^t \sqrt{\mathcal {Y}_s} \, \mathrm {d}\widetilde{W}_s\), we obtain

$$\begin{aligned}{}[\mathbb {E}(\mathcal {X}_t^4)]^{\frac{1}{4}}&\leqslant |\alpha | t + \sigma _2 \left( \mathbb {E}\left[ \left( \int _0^t \sqrt{\mathcal {Y}_s} \, \mathrm {d}\widetilde{W}_s\right) ^4 \right] \right) ^{\frac{1}{4}} \leqslant |\alpha | t + \sigma _2 \left( (2 \cdot 3)^2 t \mathbb {E}\left( \int _0^t \mathcal {Y}_s^2 \, \mathrm {d}s\right) \right) ^{\frac{1}{4}} \\&= |\alpha | t + \sigma _2 \left( 36 t \int _0^t a \left( a + \frac{\sigma _1^2}{2}\right) s^2 \, \mathrm {d}s \right) ^{\frac{1}{4}} = \left( |\alpha | + \sigma _2 \root 4 \of {6 a (2 a + \sigma _1^2)}\right) t, \end{aligned}$$

hence we conclude

$$\begin{aligned} \sup _{s\in [0,T]} \mathbb {E}(\mathcal {X}_s^4) = {\text {O}}(T^4) \qquad \text {as }\ T \rightarrow \infty . \end{aligned}$$
(6.12)

Using (6.8), we obtain \(E_2(T) = \int _0^T \sqrt{{\text {O}}(T^4) {\text {O}}(T)} \, \mathrm {d}s = {\text {O}}(T^{\frac{7}{2}})\) as \(T \rightarrow \infty \). Hence

$$\begin{aligned} \mathbb {E}\left( \left| \frac{1}{T^2} \left( \int _0^T X_s \, \mathrm {d}Y_s - \int _0^T \mathcal {X}_s \, \mathrm {d}\mathcal {Y}_s\right) \right| \right) \leqslant \frac{1}{T^2} \bigl (\sqrt{E_1(T)} + \sqrt{E_2(T)}\bigr ) = \frac{1}{T^2} {\text {O}}(T^{\frac{7}{4}}) \rightarrow 0 \end{aligned}$$

as \(T \rightarrow \infty \), implying \(\frac{1}{T^2} \left( \int _0^T X_s \, \mathrm {d}Y_s - \int _0^T \mathcal {X}_s \, \mathrm {d}\mathcal {Y}_s\right) {\mathop {\longrightarrow }\limits ^{\mathbb {P}}}0\) as \(T \rightarrow \infty \), i.e., the first convergence in (6.5). Thus we conclude convergence (6.3).

Applying the first equation of (1.1) and using \(b = 0\), we obtain

$$\begin{aligned} \frac{\sigma _1}{T} \int _0^T Y_s^{\frac{1}{2}} \, \mathrm {d}W_s = \frac{1}{T} (Y_T - Y_0) - a. \end{aligned}$$

By Itô’s formula and using \(b = 0\),

$$\begin{aligned} \mathrm {d}(Y_t^2)= & {} 2 Y_t \, \mathrm {d}Y_t + \sigma _1^2 Y_t \, \mathrm {d}t = 2 Y_t (a \, \mathrm {d}t + \sigma _1 Y_t^{\frac{1}{2}} \, \mathrm {d}W_t) + \sigma _1^2 Y_t \, \mathrm {d}t\\= & {} \,(2 a + \sigma _1^2) Y_t \, \mathrm {d}t + 2 \sigma _1 Y_t^{\frac{3}{2}} \, \mathrm {d}W_t, \end{aligned}$$

hence

$$\begin{aligned} Y_T^2 = Y_0^2 + (2 a + \sigma _1^2) \int _0^T Y_s \, \mathrm {d}s + 2 \sigma _1 \int _0^T Y_s^{\frac{3}{2}} \, \mathrm {d}W_s. \end{aligned}$$

Consequently,

$$\begin{aligned} - \frac{\sigma _1}{T^2} \int _0^T Y_s^{\frac{3}{2}} \, \mathrm {d}W_s = - \frac{1}{2T^2} (Y_T^2 - Y_0^2) + \frac{2 a + \sigma _1^2}{2T^2} \int _0^T Y_s \, \mathrm {d}s. \end{aligned}$$
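
The identity for \(Y_T^2\) above can be sanity-checked on any discretized path: for a sequence \((Y_k)\), the telescoping identity \(Y_n^2 - Y_0^2 = \sum _k \bigl (2 Y_k \, \Delta Y_k + (\Delta Y_k)^2\bigr )\) holds exactly, and the quadratic term \(\sum _k (\Delta Y_k)^2\) is the discrete counterpart of \(\sigma _1^2 \int _0^T Y_s \, \mathrm {d}s\). A minimal sketch (all parameter values are illustrative):

```python
import numpy as np

# Discrete analogue of the Ito expansion of Y_T^2: telescoping gives
# Y_n^2 - Y_0^2 = sum(2 Y_k dY_k + dY_k^2) exactly, for any path.
rng = np.random.default_rng(1)
a, sigma1 = 1.0, 0.4
dt, n = 1e-3, 5_000
Y = np.empty(n + 1)
Y[0] = 0.5
for k in range(n):
    Y[k + 1] = Y[k] + a * dt + sigma1 * np.sqrt(max(Y[k], 0.0)) * rng.normal(0.0, np.sqrt(dt))
dY = np.diff(Y)
lhs = Y[-1] ** 2 - Y[0] ** 2
rhs = np.sum(2.0 * Y[:-1] * dY + dY ** 2)
```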

In a similar way, applying the second equation of (1.1) and using \(\beta = 0\) and \(\gamma = 0\), we obtain

$$\begin{aligned} \frac{\sigma _2}{T} \int _0^T Y_s^{\frac{1}{2}} \, \mathrm {d}\widetilde{W}_s + \frac{\sigma _3}{T} L_T = \frac{1}{T} (X_T - X_0) - \alpha . \end{aligned}$$

By Itô’s formula and using \(\beta = 0\) and \(\gamma = 0\),

$$\begin{aligned} \mathrm {d}(Y_t X_t) =&Y_t \, \mathrm {d}X_t + X_t \, \mathrm {d}Y_t + \varrho \sigma _1 \sigma _2 Y_t \, \mathrm {d}t = Y_t (\alpha \, \mathrm {d}t + \sigma _2 Y_t^{\frac{1}{2}} \, \mathrm {d}\widetilde{W}_t + \sigma _3 \, \mathrm {d}L_t)\\&+ X_t \, \mathrm {d}Y_t + \varrho \sigma _1 \sigma _2 Y_t \, \mathrm {d}t \\ =&\,\, (\alpha + \varrho \sigma _1 \sigma _2) Y_t \, \mathrm {d}t + \sigma _2 Y_t^{\frac{3}{2}} \, \mathrm {d}\widetilde{W}_t + X_t \, \mathrm {d}Y_t + \sigma _3 Y_t \, \mathrm {d}L_t, \end{aligned}$$

hence

$$\begin{aligned} Y_T X_T = Y_0 X_0 + (\alpha + \varrho \sigma _1 \sigma _2) \int _0^T Y_s \, \mathrm {d}s + \sigma _2 \int _0^T Y_s^{\frac{3}{2}} \, \mathrm {d}\widetilde{W}_s + \int _0^T X_s \, \mathrm {d}Y_s + \sigma _3 \int _0^T Y_s \, \mathrm {d}L_s. \end{aligned}$$

Consequently,

$$\begin{aligned}&- \frac{\sigma _2}{T^2} \int _0^T Y_s^{\frac{3}{2}} \, \mathrm {d}\widetilde{W}_s - \frac{\sigma _3}{T^2} \int _0^T Y_s \, \mathrm {d}L_s\\&\quad = - \frac{1}{T^2} (Y_T X_T - Y_0 X_0) + \frac{\alpha + \varrho \sigma _1 \sigma _2}{T^2} \int _0^T Y_s \, \mathrm {d}s+ \frac{1}{T^2} \int _0^T X_s \, \mathrm {d}Y_s. \end{aligned}$$

Again by Itô’s formula and using \(\beta = 0\) and \(\gamma = 0\),

$$\begin{aligned} \mathrm {d}(X_t^2) = 2 X_t \, \mathrm {d}X_t + (\sigma _2^2 Y_t + \sigma _3^2) \, \mathrm {d}t = 2 X_t (\alpha \, \mathrm {d}t + \sigma _2 Y_t^{\frac{1}{2}} \, \mathrm {d}\widetilde{W}_t + \sigma _3 \, \mathrm {d}L_t) + (\sigma _2^2 Y_t + \sigma _3^2) \, \mathrm {d}t, \end{aligned}$$

hence

$$\begin{aligned} X_T^2 = X_0^2 + \int _0^T (2 \alpha X_s + \sigma _2^2 Y_s + \sigma _3^2) \, \mathrm {d}s + 2 \sigma _2 \int _0^T Y_s^{\frac{1}{2}} X_s \, \mathrm {d}\widetilde{W}_s + 2 \sigma _3 \int _0^T X_s \, \mathrm {d}L_s. \end{aligned}$$

Consequently,

$$\begin{aligned}&- \frac{\sigma _2}{T^2} \int _0^T Y_s^{\frac{1}{2}} X_s \, \mathrm {d}\widetilde{W}_s - \frac{\sigma _3}{T^2} \int _0^T X_s \, \mathrm {d}L_s \\&\quad = - \frac{1}{2T^2} (X_T^2 - X_0^2) + \frac{\alpha }{T^2} \int _0^T X_s \, \mathrm {d}s+ \frac{\sigma _2^2}{2T^2} \int _0^T Y_s \, \mathrm {d}s + \frac{\sigma _3^2}{2T}. \end{aligned}$$

Applying (6.3) and the continuous mapping theorem, we obtain

$$\begin{aligned}&\begin{bmatrix} 1&\quad - \frac{1}{T^2} \int _0^T Y_s \, \mathrm {d}s \\ - \frac{1}{T^2} \int _0^T Y_s \, \mathrm {d}s&\quad \frac{1}{T^3} \int _0^T Y_s^2 \, \mathrm {d}s \end{bmatrix} {\mathop {\longrightarrow }\limits ^{\mathcal {D}}}\begin{bmatrix} 1&\quad - \int _0^1 \mathcal {Y}_s \, \mathrm {d}s \\ - \int _0^1 \mathcal {Y}_s \, \mathrm {d}s&\quad \int _0^1 \mathcal {Y}_s^2 \, \mathrm {d}s \end{bmatrix}, \\&\begin{bmatrix} \frac{1}{T} (Y_T - Y_0) - a \\ - \frac{1}{2T^2} (Y_T^2 - Y_0^2) + \frac{2 a + \sigma _1^2}{2T^2} \int _0^T Y_s \, \mathrm {d}s \end{bmatrix} {\mathop {\longrightarrow }\limits ^{\mathcal {D}}}\begin{bmatrix} \mathcal {Y}_1 - a \\ - \frac{1}{2} \mathcal {Y}_1^2 + \frac{2 a + \sigma _1^2}{2} \int _0^1 \mathcal {Y}_s \, \mathrm {d}s \end{bmatrix}, \\&\begin{bmatrix} 1&\quad - \frac{1}{T^2} \int _0^T Y_s \, \mathrm {d}s&\quad - \frac{1}{T^2} \int _0^T X_s \, \mathrm {d}s \\ - \frac{1}{T^2} \int _0^T Y_s \, \mathrm {d}s&\quad \frac{1}{T^3} \int _0^T Y_s^2 \, \mathrm {d}s&\quad \frac{1}{T^3} \int _0^T Y_s X_s \, \mathrm {d}s \\ - \frac{1}{T^2} \int _0^T X_s \, \mathrm {d}s&\quad \frac{1}{T^3} \int _0^T Y_s X_s \, \mathrm {d}s&\quad \frac{1}{T^3} \int _0^T X_s^2 \, \mathrm {d}s \end{bmatrix}\\&\quad {\mathop {\longrightarrow }\limits ^{\mathcal {D}}}\begin{bmatrix} 1&\quad - \int _0^1 \mathcal {Y}_s \, \mathrm {d}s&\quad - \int _0^1 \mathcal {X}_s \, \mathrm {d}s \\ - \int _0^1 \mathcal {Y}_s \, \mathrm {d}s&\quad \int _0^1 \mathcal {Y}_s^2 \, \mathrm {d}s&\quad \int _0^1 \mathcal {Y}_s \mathcal {X}_s \, \mathrm {d}s \\ - \int _0^1 \mathcal {X}_s \, \mathrm {d}s&\quad \int _0^1 \mathcal {Y}_s \mathcal {X}_s \, \mathrm {d}s&\quad \int _0^1 \mathcal {X}_s^2 \, \mathrm {d}s \end{bmatrix}, \\&\begin{bmatrix} \frac{1}{T} (X_T - X_0) - \alpha \\ - \frac{1}{T^2} (Y_T X_T - Y_0 X_0) + \frac{\alpha + \varrho \sigma _1 \sigma _2}{T^2} \int _0^T Y_s \, \mathrm {d}s + \frac{1}{T^2} \int _0^T X_s \, \mathrm {d}Y_s \\ - \frac{1}{2T^2} (X_T^2 - 
X_0^2) + \frac{\alpha }{T^2} \int _0^T X_s \, \mathrm {d}s + \frac{\sigma _2^2}{2T^2} \int _0^T Y_s \, \mathrm {d}s + \frac{\sigma _3^2}{2T} \end{bmatrix}\\&\quad {\mathop {\longrightarrow }\limits ^{\mathcal {D}}}\begin{bmatrix} \mathcal {X}_1 - \alpha \\ - \mathcal {Y}_1 \mathcal {X}_1 + (\alpha + \varrho \sigma _1 \sigma _2) \int _0^1 \mathcal {Y}_s \, \mathrm {d}s + \int _0^1 \mathcal {X}_s \, \mathrm {d}\mathcal {Y}_s \\ - \frac{1}{2} \mathcal {X}_1^2 + \alpha \int _0^1 \mathcal {X}_s \, \mathrm {d}s + \frac{\sigma _2^2}{2} \int _0^1 \mathcal {Y}_s \, \mathrm {d}s \end{bmatrix} \end{aligned}$$

jointly as \(T \rightarrow \infty \). Applying again the continuous mapping theorem, we conclude (6.1), since the limiting random matrices in the first and third convergences above are almost surely invertible by Lemma 3.1. \(\square \)
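
As a numerical illustration (not part of the proof), the normalized error \((\widehat{a}_T - a, T \widehat{b}_T)\) can be evaluated along a simulated path via the path-functional form derived above; the Euler discretization and all parameter values below are illustrative.

```python
import numpy as np

# Evaluate (a_hat_T - a, T * b_hat_T) = M^{-1} v along one simulated path,
# with M and v the path functionals appearing in the proof (b = 0).
rng = np.random.default_rng(2)
a, sigma1 = 1.0, 0.4
T, n = 100.0, 50_000
dt = T / n
dW = rng.normal(0.0, np.sqrt(dt), size=n)
Y = np.empty(n + 1)
Y[0] = 0.0
for k in range(n):
    Y[k + 1] = Y[k] + a * dt + sigma1 * np.sqrt(max(Y[k], 0.0)) * dW[k]
iY = np.sum(Y[:-1]) * dt        # int_0^T Y_s ds
iY2 = np.sum(Y[:-1] ** 2) * dt  # int_0^T Y_s^2 ds
M = np.array([[1.0, -iY / T ** 2],
              [-iY / T ** 2, iY2 / T ** 3]])
v = np.array([(Y[-1] - Y[0]) / T - a,
              -(Y[-1] ** 2 - Y[0] ** 2) / (2 * T ** 2)
              + (2 * a + sigma1 ** 2) * iY / (2 * T ** 2)])
err = np.linalg.solve(M, v)  # approximates (a_hat_T - a, T * b_hat_T)
```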

7 Asymptotic behavior of CLSE: supercritical case

First we present an auxiliary lemma about the asymptotic behavior of \(\mathbb {E}(X_t^2)\) as \(t \rightarrow \infty \).

Lemma 7.1

Let us consider the two-factor affine diffusion model (1.1) with \(a \in \mathbb {R}_+\), \(b \in \mathbb {R}_{--}\), \(\alpha , \beta \in \mathbb {R}\), \(\gamma \in (-\infty , b)\), \(\sigma _1 \in \mathbb {R}_{++}\), \(\sigma _2, \sigma _3 \in \mathbb {R}_+\) and \(\varrho \in [-1, 1]\) with a random initial value \((\eta _0, \zeta _0)\) independent of \((W_t, B_t, L_t)_{t\in \mathbb {R}_+}\) satisfying \(\mathbb {P}(\eta _0 \in \mathbb {R}_+) = 1\). Then \(\sup _{t\in \mathbb {R}_+} \mathrm {e}^{2\gamma t} \mathbb {E}(X_t^2) < \infty \).

Proof

By Proposition B.1,

$$\begin{aligned} \sup _{t\in \mathbb {R}_+} \mathrm {e}^{bt} \mathbb {E}(Y_t) = \sup _{t\in \mathbb {R}_+} \bigl (\mathbb {E}(Y_0) + a \int _0^t \mathrm {e}^{bu} \, \mathrm {d}u\bigr ) = \mathbb {E}(Y_0) + a \int _0^\infty \mathrm {e}^{bu} \, \mathrm {d}u < \infty , \end{aligned}$$

since \(b < 0\). Moreover,

$$\begin{aligned} \sup _{t\in \mathbb {R}_+} \mathrm {e}^{\gamma t} |\mathbb {E}(X_t)| =&\sup _{t\in \mathbb {R}_+} \biggl |\mathbb {E}(X_0) + \alpha \int _0^t \mathrm {e}^{\gamma u} \, \mathrm {d}u - \beta \int _0^t \mathrm {e}^{\gamma u} \mathbb {E}(Y_u) \, \mathrm {d}u\biggr | \\ \leqslant&|\mathbb {E}(X_0)| + |\alpha | \int _0^\infty \mathrm {e}^{\gamma u} \, \mathrm {d}u \\&+ |\beta | \biggl (\sup _{u\in \mathbb {R}_+} \mathrm {e}^{bu} \mathbb {E}(Y_u)\biggr ) \int _0^\infty \mathrm {e}^{(\gamma -b)u} \, \mathrm {d}u < \infty , \end{aligned}$$

using \(\gamma < 0\) and \(\gamma - b < 0\). Again by Proposition B.1,

$$\begin{aligned} \sup _{t\in \mathbb {R}_+} \mathrm {e}^{2bt} \mathbb {E}(Y_t^2)&= \sup _{t\in \mathbb {R}_+} \biggl (\mathbb {E}(Y_0^2) + (2a + \sigma _1^2) \int _0^t \mathrm {e}^{2bu} \mathbb {E}(Y_u) \, \mathrm {d}u\biggr ) \\&\leqslant \mathbb {E}(Y_0^2) + (2a + \sigma _1^2) \biggl (\sup _{u\in \mathbb {R}_+} \mathrm {e}^{bu} \mathbb {E}(Y_u)\biggr ) \int _0^\infty \mathrm {e}^{bu} \, \mathrm {d}u < \infty , \end{aligned}$$

using \(b < 0\). Hence

$$\begin{aligned} \sup _{t\in \mathbb {R}_+} \mathrm {e}^{(b+\gamma )t} |\mathbb {E}(Y_t X_t)| =&\sup _{t\in \mathbb {R}_+} \biggl |\mathbb {E}(Y_0 X_0) + a \int _0^t \mathrm {e}^{(b+\gamma )u} \mathbb {E}(X_u) \, \mathrm {d}u \\&+ (\alpha + \varrho \sigma _1 \sigma _2) \int _0^t \mathrm {e}^{(b+\gamma )u} \mathbb {E}(Y_u) \, \mathrm {d}u - \beta \int _0^t \mathrm {e}^{(b+\gamma )u} \mathbb {E}(Y_u^2) \, \mathrm {d}u\biggr | \\ \leqslant&|\mathbb {E}(Y_0 X_0)| + a \biggl (\sup _{u\in \mathbb {R}_+} \mathrm {e}^{\gamma u} |\mathbb {E}(X_u)|\biggr ) \int _0^\infty \mathrm {e}^{bu} \, \mathrm {d}u\\&+ (|\alpha | + |\varrho | \sigma _1 \sigma _2) \biggl (\sup _{u\in \mathbb {R}_+} \mathrm {e}^{bu} \mathbb {E}(Y_u)\biggr ) \int _0^\infty \mathrm {e}^{\gamma u} \, \mathrm {d}u \\&+ |\beta | \biggl (\sup _{u\in \mathbb {R}_+} \mathrm {e}^{2bu} \mathbb {E}(Y_u^2)\biggr ) \int _0^\infty \mathrm {e}^{(\gamma -b)u} \, \mathrm {d}u < \infty , \end{aligned}$$

using \(b < 0\), \(\gamma < 0\) and \(\gamma - b < 0\). Consequently,

$$\begin{aligned} \sup _{t\in \mathbb {R}_+} \mathrm {e}^{2\gamma t} \mathbb {E}(X_t^2)&= \sup _{t\in \mathbb {R}_+} \biggl (\mathbb {E}(X_0^2) + 2 \alpha \int _0^t \mathrm {e}^{2\gamma u} \mathbb {E}(X_u) \, \mathrm {d}u - 2 \beta \int _0^t \mathrm {e}^{2\gamma u} \mathbb {E}(Y_u X_u) \, \mathrm {d}u \\&+ \sigma _2^2 \int _0^t \mathrm {e}^{2\gamma u} \mathbb {E}(Y_u) \, \mathrm {d}u + \sigma _3^2 \int _0^t \mathrm {e}^{2\gamma u} \, \mathrm {d}u\biggr ) \\&\leqslant \mathbb {E}(X_0^2) + 2 |\alpha | \biggl (\sup _{u\in \mathbb {R}_+} \mathrm {e}^{\gamma u} |\mathbb {E}(X_u)|\biggr ) \int _0^\infty \mathrm {e}^{\gamma u} \, \mathrm {d}u \\&+ 2 |\beta | \biggl (\sup _{u\in \mathbb {R}_+} \mathrm {e}^{(b+\gamma )u} |\mathbb {E}(Y_u X_u)|\biggr ) \int _0^\infty \mathrm {e}^{(\gamma -b)u} \, \mathrm {d}u \\&+ \sigma _2^2 \biggl (\sup _{u\in \mathbb {R}_+} \mathrm {e}^{bu} \mathbb {E}(Y_u)\biggr ) \int _0^\infty \mathrm {e}^{(2\gamma -b) u} \, \mathrm {d}u + \sigma _3^2 \int _0^\infty \mathrm {e}^{2\gamma u} \, \mathrm {d}u < \infty \end{aligned}$$

using \(\gamma < 0\), \(\gamma - b < 0\) and \(2 \gamma - b < 0\). \(\square \)

Next we present an auxiliary lemma about the asymptotic behavior of \(X_t\) as \(t \rightarrow \infty \).

Lemma 7.2

Let us consider the two-factor affine diffusion model (1.1) with \(a \in \mathbb {R}_+\), \(b \in \mathbb {R}_{--}\), \(\alpha , \beta \in \mathbb {R}\), \(\gamma \in (-\infty , b)\), \(\sigma _1 \in \mathbb {R}_{++}\), \(\sigma _2, \sigma _3 \in \mathbb {R}_+\) and \(\varrho \in [-1, 1]\) with a random initial value \((\eta _0, \zeta _0)\) independent of \((W_t, B_t, L_t)_{t\in \mathbb {R}_+}\) satisfying \(\mathbb {P}(\eta _0 \in \mathbb {R}_+) = 1\). Suppose that \(\alpha \beta \in \mathbb {R}_-\). Then there exists a random variable \(V_X\) such that

$$\begin{aligned} \mathrm {e}^{\gamma t} X_t {\mathop {\longrightarrow }\limits ^{{\mathrm {a.s.}}}}V_X \qquad \text {as }\ t \rightarrow \infty \end{aligned}$$
(7.1)

and, for each \(k, \ell \in \mathbb {Z}_+\) with \(k + \ell > 0\),

$$\begin{aligned} \mathrm {e}^{(kb+\ell \gamma )t} \int _0^t Y_u^k X_u^\ell \, \mathrm {d}u {\mathop {\longrightarrow }\limits ^{{\mathrm {a.s.}}}}-\frac{V_Y^k V_X^\ell }{kb+\ell \gamma } \qquad \text {as }\ t \rightarrow \infty , \end{aligned}$$
(7.2)

where \(V_Y\) is given in (4.6). If, in addition, \(\sigma _3 \in \mathbb {R}_{++}\) or \(\bigl (a - \frac{\sigma _1^2}{2}\bigr ) (1 - \varrho ^2) \sigma _2^2 \in \mathbb {R}_{++}\), then the distribution of the random variable \(V_X\) is absolutely continuous. In particular, \(\mathbb {P}(V_X \ne 0) = 1\).

Proof

By (2.2),

$$\begin{aligned} \mathbb {E}( X_t \,|\,\mathcal {F}_s ) = \mathbb {E}( X_t \,|\,Y_s, X_s ) = \mathrm {e}^{-\gamma (t-s)} X_s + \int _s^t \mathrm {e}^{-\gamma (t-u)} \bigl (\alpha - \beta \mathbb {E}( Y_u \,|\,\mathcal {F}_s )\bigr ) \, \mathrm {d}u \end{aligned}$$

for all \(s, t \in \mathbb {R}_+\) with \(0 \leqslant s \leqslant t\). If \(\alpha \in \mathbb {R}_+\) and \(\beta \in \mathbb {R}_-\), then

$$\begin{aligned} \mathbb {E}( \mathrm {e}^{\gamma t} X_t \,|\,\mathcal {F}^{Y,X}_s ) = \mathrm {e}^{\gamma s} X_s + \int _s^t \mathrm {e}^{\gamma u} \bigl (\alpha - \beta \mathbb {E}( Y_u \,|\,\mathcal {F}^{Y,X}_s )\bigr ) \, \mathrm {d}u \geqslant \mathrm {e}^{\gamma s} X_s \end{aligned}$$

for all \(s, t \in \mathbb {R}_+\) with \(0 \leqslant s \leqslant t\), consequently, the process \((\mathrm {e}^{\gamma t} X_t)_{t\in \mathbb {R}_+}\) is a submartingale with respect to the filtration \((\mathcal {F}^{Y,X}_t)_{t\in \mathbb {R}_+}\). If \(\alpha \in \mathbb {R}_-\) and \(\beta \in \mathbb {R}_+\), then

$$\begin{aligned} \mathbb {E}( \mathrm {e}^{\gamma t} X_t \,|\,\mathcal {F}^{Y,X}_s ) = \mathrm {e}^{\gamma s} X_s + \int _s^t \mathrm {e}^{\gamma u} (\alpha - \beta Y_u) \, \mathrm {d}u \leqslant \mathrm {e}^{\gamma s} X_s \end{aligned}$$

for all \(s, t \in \mathbb {R}_+\) with \(0 {\leqslant } s {\leqslant } t\), consequently, the process \((\mathrm {e}^{\gamma t} X_t)_{t\in \mathbb {R}_+}\) is a supermartingale with respect to the filtration \((\mathcal {F}^{Y,X}_t)_{t\in \mathbb {R}_+}\), hence the process \((-\mathrm {e}^{\gamma t} X_t)_{t\in \mathbb {R}_+}\) is a submartingale with respect to the filtration \((\mathcal {F}^{Y,X}_t)_{t\in \mathbb {R}_+}\). In both cases, \(\sup _{t\in \mathbb {R}_+} \mathbb {E}(|\mathrm {e}^{\gamma t} X_t|^2) < \infty \), see Lemma 7.1. Hence, by the submartingale convergence theorem, there exists a random variable \(V_X\) such that (7.1) holds.

If \(\omega \in \varOmega \) is such that \(\mathbb {R}_+ \ni t \mapsto (Y_t(\omega ), X_t(\omega ))\) is continuous and \((\mathrm {e}^{bt} Y_t(\omega ), \mathrm {e}^{\gamma t} X_t(\omega )) \rightarrow (V_Y(\omega ), V_X(\omega ))\) as \(t \rightarrow \infty \), then, by the integral Kronecker Lemma 4.3 with \(f(t) = \mathrm {e}^{(kb+\ell \gamma )t} Y_t(\omega )^k X_t(\omega )^\ell \) and \(a(t) = \mathrm {e}^{-(kb+\ell \gamma )t}\), \(t \in \mathbb {R}_+\), we have

$$\begin{aligned}&\frac{1}{\int _0^t \mathrm {e}^{-(kb+\ell \gamma )u} \, \mathrm {d}u} \int _0^t \mathrm {e}^{-(kb+\ell \gamma )u} (\mathrm {e}^{(kb+\ell \gamma )u} Y_u(\omega )^k X_u(\omega )^\ell ) \, \mathrm {d}u \\&\quad \rightarrow V_Y(\omega )^k V_X(\omega )^\ell \qquad \text {as }\ t \rightarrow \infty . \end{aligned}$$

Here \(\int _0^t \mathrm {e}^{-(kb+\ell \gamma )u} \, \mathrm {d}u = - \frac{\mathrm {e}^{-(kb+\ell \gamma )t} - 1}{kb+\ell \gamma }\), \(t \in \mathbb {R}_+\), thus we conclude (7.2).
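The integral Kronecker Lemma 4.3 invoked here is a Cesàro-type averaging statement: if \(f(t) \rightarrow f_\infty \) and \(\int _0^t a(u) \, \mathrm {d}u \rightarrow \infty \), then the \(a\)-weighted average of \(f\) over \([0, t]\) tends to \(f_\infty \) as well. A minimal numerical sketch (with hypothetical \(f\) and \(a\), not tied to the model):

```python
import numpy as np

# Integral Kronecker lemma, weighted-average form: if f(t) -> f_inf and
# the weight a(t) has divergent integral, then
#   (1 / int_0^t a(u) du) * int_0^t a(u) f(u) du  ->  f_inf.
f_inf = 2.0
f = lambda u: f_inf + np.exp(-u)     # converges to f_inf = 2
a = lambda u: np.exp(0.75 * u)       # divergent weight, like e^{-(kb + l*gamma)u}

u = np.linspace(0.0, 40.0, 400_001)
du = u[1] - u[0]
w = a(u)
weighted_avg = np.sum(w * f(u)) * du / (np.sum(w) * du)
print(weighted_avg)                  # close to 2
```

The exponentially growing weight concentrates the average at large \(u\), where \(f\) is already close to its limit.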

Now suppose that \(\sigma _3 \in \mathbb {R}_{++}\) or \(\bigl (a - \frac{\sigma _1^2}{2}\bigr ) (1 - \varrho ^2) \sigma _2^2 \in \mathbb {R}_{++}\). We are going to show that the distribution of the random variable \(V_X\) is absolutely continuous. Put \(Z_t := X_t - r Y_t\), \(t \in \mathbb {R}_+\), with \(r := \frac{\sigma _2 \varrho }{\sigma _1}\). Then the process \((Y_t, Z_t)_{t\in \mathbb {R}_+}\) is an affine process satisfying

$$\begin{aligned} {\left\{ \begin{array}{ll} \mathrm {d}Y_t = (a - b Y_t) \, \mathrm {d}t + \sigma _1 \sqrt{Y_t} \, \mathrm {d}W_t, \\ \mathrm {d}Z_t = (A - B Y_t - \gamma Z_t) \, \mathrm {d}t + \varSigma _2 \sqrt{Y_t} \, \mathrm {d}B_t + \sigma _3 \, \mathrm {d}L_t. \end{array}\right. } \qquad t \in \mathbb {R}_+, \end{aligned}$$

where \(A := \alpha - r a\), \(B := \beta - r (b - \gamma )\) and \(\varSigma _2 := \sigma _2 \sqrt{1 - \varrho ^2}\), see (Bolyog and Pap 2016, Proposition 2.5). We have

$$\begin{aligned} \mathrm {e}^{\gamma t} X_t&=\,\, r \mathrm {e}^{\gamma t} Y_t + \mathrm {e}^{\gamma t} Z_t \\ {}&= r \mathrm {e}^{\gamma t} Y_t + Z_0 + \int _0^t \mathrm {e}^{\gamma u} (A - B Y_u) \, \mathrm {d}u + \varSigma _2 \int _0^t \mathrm {e}^{\gamma u} \sqrt{Y_u} \, \mathrm {d}B_u\\&\quad + \sigma _3 \int _0^t \mathrm {e}^{\gamma u} \, \mathrm {d}L_u, \end{aligned}$$

where we used (2.2) with \(s = 0\), multiplying both sides by \(\mathrm {e}^{\gamma t}\). Thus the conditional distribution of \(\mathrm {e}^{\gamma t} X_t\) given \((Y_u)_{u\in [0,t]}\) and \(X_0\) is normal with mean \(r \mathrm {e}^{\gamma t} Y_t + Z_0 + \int _0^t \mathrm {e}^{\gamma u} (A - B Y_u) \, \mathrm {d}u\) and variance \(\varSigma _2^2 \int _0^t \mathrm {e}^{2\gamma u} Y_u \, \mathrm {d}u + \sigma _3^2 \int _0^t \mathrm {e}^{2\gamma u} \, \mathrm {d}u\). Hence

$$\begin{aligned}&\mathbb {E}\bigl (\mathrm {e}^{\mathrm {i}\lambda \mathrm {e}^{\gamma t} X_t} \,\big |\,(Y_u)_{u\in [0,t]}, X_0\bigr ) \\&\quad = \exp \biggl \{\mathrm {i}\lambda \biggl (r \mathrm {e}^{\gamma t} Y_t + Z_0 + \int _0^t \mathrm {e}^{\gamma u} (A - B Y_u) \, \mathrm {d}u\biggr )\\&\qquad - \frac{\lambda ^2}{2} \biggl (\varSigma _2^2 \int _0^t \mathrm {e}^{2\gamma u} Y_u \, \mathrm {d}u + \sigma _3^2 \int _0^t \mathrm {e}^{2\gamma u} \, \mathrm {d}u\biggr )\biggr \}. \end{aligned}$$

Consequently,

$$\begin{aligned} \bigl |\mathbb {E}\bigl (\mathrm {e}^{\mathrm {i}\lambda \mathrm {e}^{\gamma t} X_t}\bigr )\bigr |&= \bigl |\mathbb {E}\bigl (\mathbb {E}\bigl (\mathrm {e}^{\mathrm {i}\lambda \mathrm {e}^{\gamma t} X_t} \,\big |\,(Y_u)_{u\in [0,t]}, X_0\bigr )\bigr )\bigr | \\&\quad = \biggl |\mathbb {E}\biggl (\exp \biggl \{\mathrm {i}\lambda \biggl (r \mathrm {e}^{\gamma t} Y_t + Z_0 + \int _0^t \mathrm {e}^{\gamma u} (A - B Y_u) \, \mathrm {d}u\biggr )\\&\qquad - \frac{\lambda ^2}{2} \biggl (\varSigma _2^2 \int _0^t \mathrm {e}^{2\gamma u} Y_u \, \mathrm {d}u + \sigma _3^2 \int _0^t \mathrm {e}^{2\gamma u} \, \mathrm {d}u\biggr )\biggr \}\biggr ) \biggr | \\&\quad \leqslant \mathbb {E}\biggl (\biggl |\exp \biggl \{\mathrm {i}\lambda \biggl (r \mathrm {e}^{\gamma t} Y_t + Z_0 + \int _0^t \mathrm {e}^{\gamma u} (A - B Y_u) \, \mathrm {d}u\biggr )\\&\qquad - \frac{\lambda ^2}{2} \biggl (\varSigma _2^2 \int _0^t \mathrm {e}^{2\gamma u} Y_u \, \mathrm {d}u + \sigma _3^2 \int _0^t \mathrm {e}^{2\gamma u} \, \mathrm {d}u\biggr )\biggr \}\biggr | \biggr ) \\&\quad = \mathbb {E}\biggl (\exp \biggl \{- \frac{\lambda ^2}{2} \biggl (\varSigma _2^2 \int _0^t \mathrm {e}^{2\gamma u} Y_u \, \mathrm {d}u + \sigma _3^2 \int _0^t \mathrm {e}^{2\gamma u} \, \mathrm {d}u\biggr )\biggr \}\biggr ). \end{aligned}$$

Convergence (7.1) implies \(\mathrm {e}^{\gamma t} X_t {\mathop {\longrightarrow }\limits ^{\mathcal {D}}}V_X\) as \(t \rightarrow \infty \), hence, by the continuity theorem and by the monotone convergence theorem,

$$\begin{aligned} \bigl |\mathbb {E}\bigl (\mathrm {e}^{\mathrm {i}\lambda V_X}\bigr )\bigr |&= \lim _{t\rightarrow \infty } \bigl |\mathbb {E}\bigl (\mathrm {e}^{\mathrm {i}\lambda \mathrm {e}^{\gamma t} X_t}\bigr )\bigr |\\&\leqslant \lim _{t\rightarrow \infty } \mathbb {E}\biggl (\exp \biggl \{- \frac{\lambda ^2}{2} \biggl (\varSigma _2^2 \int _0^t \mathrm {e}^{2\gamma u} Y_u \, \mathrm {d}u + \sigma _3^2 \int _0^t \mathrm {e}^{2\gamma u} \, \mathrm {d}u\biggr )\biggr \}\biggr ) \\&= \mathbb {E}\biggl (\exp \biggl \{- \frac{\lambda ^2}{2} \biggl (\varSigma _2^2 \int _0^\infty \mathrm {e}^{2\gamma u} Y_u \, \mathrm {d}u + \sigma _3^2 \int _0^\infty \mathrm {e}^{2\gamma u} \, \mathrm {d}u\biggr )\biggr \}\biggr ) \end{aligned}$$

for all \(\lambda \in \mathbb {R}\). If \(\sigma _3 \in \mathbb {R}_{++}\), then we have

$$\begin{aligned} \bigl |\mathbb {E}\bigl (\mathrm {e}^{\mathrm {i}\lambda V_X}\bigr )\bigr | \leqslant \exp \left\{ - \frac{\sigma _3^2}{4(-\gamma )} \lambda ^2\right\} \end{aligned}$$

for all \(\lambda \in \mathbb {R}\), hence \(\int _{-\infty }^\infty \bigl |\mathbb {E}\bigl (\mathrm {e}^{\mathrm {i}\lambda V_X}\bigr )\bigr | \mathrm {d}\lambda < \infty \), implying absolute continuity of the distribution of \(V_X\).
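The last step rests on the classical Gaussian integral \(\int _{-\infty }^\infty \mathrm {e}^{-c\lambda ^2} \, \mathrm {d}\lambda = \sqrt{\pi /c}\) for \(c \in \mathbb {R}_{++}\), applied here with \(c = \frac{\sigma _3^2}{4(-\gamma )}\). A quick numerical confirmation (with an arbitrary positive \(c\)):

```python
import numpy as np

# Gaussian integral behind the integrability of the characteristic
# function: int exp(-c * lam^2) d lam = sqrt(pi / c) for c > 0.
c = 0.37                                    # arbitrary positive constant
lam = np.linspace(-30.0, 30.0, 2_000_001)   # tails beyond |lam| = 30 are negligible
integral = np.sum(np.exp(-c * lam**2)) * (lam[1] - lam[0])
print(integral, np.sqrt(np.pi / c))
```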

If \(\bigl (a - \frac{\sigma _1^2}{2}\bigr ) (1 - \varrho ^2) \sigma _2^2 \in \mathbb {R}_{++}\), then we have

$$\begin{aligned} \bigl |\mathbb {E}\bigl (\mathrm {e}^{\mathrm {i}\lambda V_X}\bigr )\bigr | \leqslant \mathbb {E}\biggl (\exp \biggl \{- \frac{\varSigma _2^2}{2} \lambda ^2 \int _0^\infty \mathrm {e}^{2\gamma u} Y_u \, \mathrm {d}u\biggr \}\biggr ) \leqslant \mathbb {E}\biggl (\exp \biggl \{- \frac{\varSigma _2^2\mathrm {e}^{4\gamma }}{2} \lambda ^2 \int _1^2 Y_u \, \mathrm {d}u\biggr \}\biggr ) \end{aligned}$$

for all \(\lambda \in \mathbb {R}\). Applying the comparison theorem (see, e.g., Karatzas and Shreve 1991, 5.2.18), we obtain \(\mathbb {P}(\mathcal {Y}_t \leqslant Y_t \ \text {for all }\ t \in \mathbb {R}_+) = 1\), where \((\mathcal {Y}_t)_{t\in \mathbb {R}_+}\) is the unique strong solution of the SDE

$$\begin{aligned} \mathrm {d}\mathcal {Y}_t = (a - b \mathcal {Y}_t) \, \mathrm {d}t + \sigma _1 \sqrt{\mathcal {Y}_t} \, \mathrm {d}W_t, \qquad t \in [0, \infty ), \end{aligned}$$

with initial value \(\mathcal {Y}_0 = 0\). Consequently, taking into account \(\varSigma _2 = \sigma _2 \sqrt{1 - \varrho ^2} > 0\), we obtain

$$\begin{aligned}&\int _{-\infty }^\infty \bigl |\mathbb {E}\bigl (\mathrm {e}^{\mathrm {i}\lambda V_X}\bigr )\bigr | \, \mathrm {d}\lambda \leqslant \int _{-\infty }^\infty \mathbb {E}\biggl (\exp \biggl \{- \frac{\varSigma _2^2\mathrm {e}^{4\gamma }}{2} \lambda ^2 \int _1^2 \mathcal {Y}_u \, \mathrm {d}u\biggr \}\biggr ) \mathrm {d}\lambda \\&\quad = \mathbb {E}\biggl (\int _{-\infty }^\infty \exp \biggl \{- \frac{\varSigma _2^2\mathrm {e}^{4\gamma }}{2} \lambda ^2 \int _1^2 \mathcal {Y}_u \, \mathrm {d}u\biggr \} \mathrm {d}\lambda \biggr )\\&\quad = \mathbb {E}\left( \frac{\sqrt{2\pi }}{\varSigma _2\mathrm {e}^{2\gamma }\sqrt{\int _1^2 \mathcal {Y}_u \, \mathrm {d}u}}\right) = \frac{\sqrt{2\pi }}{\varSigma _2\mathrm {e}^{2\gamma }} \mathbb {E}\left( \frac{1}{\sqrt{\int _1^2 \mathcal {Y}_u \, \mathrm {d}u}}\right) < \infty , \end{aligned}$$

whenever

$$\begin{aligned} \mathbb {E}\left( \frac{1}{\sqrt{\int _1^2 \mathcal {Y}_u \, \mathrm {d}u}}\right) < \infty . \end{aligned}$$
(7.3)

By the Cauchy–Schwarz inequality, we have

$$\begin{aligned} 1 = \biggl (\int _1^2 \sqrt{\mathcal {Y}_u} \cdot \frac{1}{\sqrt{\mathcal {Y}_u}} \, \mathrm {d}u\biggr )^2 \leqslant \int _1^2 \mathcal {Y}_u \, \mathrm {d}u \int _1^2 \frac{1}{\mathcal {Y}_u} \, \mathrm {d}u, \end{aligned}$$

hence

$$\begin{aligned} \mathbb {E}\left( \frac{1}{\sqrt{\int _1^2 \mathcal {Y}_u \, \mathrm {d}u}}\right) \leqslant \mathbb {E}\left( \sqrt{\int _1^2 \frac{1}{\mathcal {Y}_u} \, \mathrm {d}u}\right) \leqslant \sqrt{\mathbb {E}\left( \int _1^2 \frac{1}{\mathcal {Y}_u} \, \mathrm {d}u\right) } = \sqrt{\int _1^2 \mathbb {E}\biggl (\frac{1}{\mathcal {Y}_u}\biggr ) \mathrm {d}u}. \end{aligned}$$

For each \(u \in \mathbb {R}_{++}\), we have \(\mathcal {Y}_u {\mathop {=}\limits ^{\mathcal {D}}}c(u) \xi \), where \(\xi \) has a chi-square distribution with \(\frac{4a}{\sigma _1^2}\) degrees of freedom and \(c(u) := \frac{\sigma _1^2}{4} \int _0^u \mathrm {e}^{-bv} \, \mathrm {d}v = \frac{\sigma _1^2(\mathrm {e}^{-bu}-1)}{4(-b)}\), see Proposition B.1. Hence

$$\begin{aligned} \mathbb {E}\biggl (\frac{1}{\mathcal {Y}_u}\biggr ) = \frac{1}{c(u)} \mathbb {E}\biggl (\frac{1}{\xi }\biggr ), \end{aligned}$$

where \(\mathbb {E}\bigl (\frac{1}{\xi }\bigr ) < \infty \), since the density of \(\xi \) has the form

$$\begin{aligned} \frac{1}{2^{\frac{2a}{\sigma _1^2}} \varGamma \bigl (\frac{2a}{\sigma _1^2}\bigr )} \, x^{\frac{2a}{\sigma _1^2}-1} \mathrm {e}^{-\frac{x}{2}}, \qquad x \in \mathbb {R}_{++}, \end{aligned}$$

and the assumption \(a - \frac{\sigma _1^2}{2} > 0\) yields \(\frac{2a}{\sigma _1^2} - 1 > 0\), hence \(\int _0^1 x^{\frac{2a}{\sigma _1^2}-2} \, \mathrm {d}x < \infty \). Consequently,

$$\begin{aligned} \int _1^2 \mathbb {E}\bigl (\frac{1}{\mathcal {Y}_u}\bigr ) \mathrm {d}u = \mathbb {E}\biggl (\frac{1}{\xi }\biggr ) \int _1^2 \frac{1}{c(u)} \, \mathrm {d}u = \mathbb {E}\biggl (\frac{1}{\xi }\biggr ) \int _1^2 \frac{4(-b)}{\sigma _1^2(\mathrm {e}^{-bu}-1)} \, \mathrm {d}u < \infty , \end{aligned}$$

thus we obtain (7.3), and hence \(\int _{-\infty }^\infty \bigl |\mathbb {E}\bigl (\mathrm {e}^{\mathrm {i}\lambda V_X}\bigr )\bigr | \mathrm {d}\lambda < \infty \), and we conclude absolute continuity of the distribution of \(V_X\). \(\square \)
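The finiteness of \(\mathbb {E}\bigl (\frac{1}{\xi }\bigr )\) used above even has a closed form: \(\mathbb {E}\bigl (\frac{1}{\xi }\bigr ) = \frac{1}{k-2}\) for a chi-square variable with \(k > 2\) degrees of freedom, here with \(k = \frac{4a}{\sigma _1^2} > 2\). A Monte Carlo sketch with an arbitrary admissible choice of \(k\):

```python
import numpy as np

# For xi ~ chi-square(k) with k > 2 degrees of freedom, E(1/xi) = 1/(k - 2);
# the proof only needs finiteness, guaranteed by k = 4a/sigma_1^2 > 2.
rng = np.random.default_rng(42)
k = 16                          # e.g. a = 1 and sigma_1 = 0.5 give k = 4a/sigma_1^2 = 16
xi = rng.chisquare(k, size=1_000_000)
mc_mean = np.mean(1.0 / xi)
print(mc_mean, 1.0 / (k - 2))
```

For \(k \leqslant 2\) the same Monte Carlo average would fail to stabilize, reflecting the non-integrable singularity of the density at zero.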

Theorem 7.3

Let us consider the two-factor affine diffusion model (1.1) with \(a \in \mathbb {R}_+\), \(b \in \mathbb {R}_{--}\), \(\alpha , \beta \in \mathbb {R}\), \(\gamma \in (-\infty , b)\), \(\sigma _1 \in \mathbb {R}_{++}\), \(\sigma _2, \sigma _3 \in \mathbb {R}_+\) and \(\varrho \in [-1, 1]\) with a random initial value \((\eta _0, \zeta _0)\) independent of \((W_t, B_t, L_t)_{t\in \mathbb {R}_+}\) satisfying \(\mathbb {P}(\eta _0 \in \mathbb {R}_+) = 1\). Suppose that \(\alpha \beta \in \mathbb {R}_-\), and that \(\sigma _3 \in \mathbb {R}_{++}\) or \(\bigl (a - \frac{\sigma _1^2}{2}\bigr ) (1 - \varrho ^2) \sigma _2^2 \in \mathbb {R}_{++}\). Then

$$\begin{aligned} \begin{bmatrix} T \mathrm {e}^{\frac{bT}{2}} (\widehat{a}_T - a) \\ \mathrm {e}^{-\frac{bT}{2}} (\widehat{b}_T - b) \\ T \mathrm {e}^{\frac{bT}{2}} (\widehat{\alpha }_T - \alpha ) \\ \mathrm {e}^{-\frac{bT}{2}} (\widehat{\beta }_T - \beta ) \\ \mathrm {e}^{\frac{(b-2\gamma )T}{2}} (\widehat{\gamma }_T - \gamma ) \end{bmatrix} {\mathop {\longrightarrow }\limits ^{\mathcal {D}}}{\varvec{V}}^{-1} {\varvec{\eta }}{\varvec{\xi }}\end{aligned}$$
(7.4)

as \(T \rightarrow \infty \) with

$$\begin{aligned} {\varvec{V}}:= \begin{bmatrix} 1&\quad \frac{V_Y}{b}&\quad 0&\quad 0&\quad 0 \\ 0&\quad -\frac{V_Y^2}{2b}&\quad 0&\quad 0&\quad 0 \\ 0&\quad 0&\quad 1&\quad \frac{V_Y}{b}&\quad \frac{V_X}{\gamma } \\ 0&\quad 0&\quad 0&\quad -\frac{V_Y^2}{2b}&\quad -\frac{V_YV_X}{b+\gamma } \\ 0&\quad 0&\quad 0&\quad -\frac{V_YV_X}{b+\gamma }&\quad -\frac{V_X^2}{2\gamma } \end{bmatrix}, \end{aligned}$$

where \(V_Y\) and \(V_X\) are given in (4.6) and (7.1), respectively, \({\varvec{\eta }}\) is a \(5 \times 5\) random matrix such that

$$\begin{aligned} {\varvec{\eta }}{\varvec{\eta }}^\top = \begin{bmatrix} - \frac{\sigma _1^2 V_Y}{b}&\quad \frac{\sigma _1^2 V_Y^2}{2b}&\quad - \frac{\varrho \sigma _1\sigma _2 V_Y}{b}&\quad \frac{\varrho \sigma _1\sigma _2 V_Y^2}{2b}&\quad \frac{\varrho \sigma _1\sigma _2 V_Y V_X}{b+\gamma } \\ \frac{\sigma _1^2 V_Y^2}{2b}&\quad - \frac{\sigma _1^2 V_Y^3}{3b}&\quad \frac{\varrho \sigma _1\sigma _2 V_Y^2}{2b}&\quad - \frac{\varrho \sigma _1\sigma _2 V_Y^3}{3b}&\quad - \frac{\varrho \sigma _1\sigma _2 V_Y^2 V_X}{2b+\gamma } \\ - \frac{\varrho \sigma _1\sigma _2 V_Y}{b}&\quad \frac{\varrho \sigma _1\sigma _2 V_Y^2}{2b}&\quad - \frac{\sigma _2^2 V_Y}{b}&\quad \frac{\sigma _2^2 V_Y^2}{2b}&\quad \frac{\sigma _2^2 V_Y V_X}{b+\gamma } \\ \frac{\varrho \sigma _1\sigma _2 V_Y^2}{2b}&\quad - \frac{\varrho \sigma _1\sigma _2 V_Y^3}{3b}&\quad \frac{\sigma _2^2 V_Y^2}{2b}&\quad - \frac{\sigma _2^2 V_Y^3}{3b}&\quad - \frac{\sigma _2^2 V_Y^2 V_X}{2b+\gamma } \\ \frac{\varrho \sigma _1\sigma _2 V_Y V_X}{b+\gamma }&\quad - \frac{\varrho \sigma _1\sigma _2 V_Y^2 V_X}{2b+\gamma }&\quad \frac{\sigma _2^2 V_Y V_X}{b+\gamma }&\quad - \frac{\sigma _2^2 V_Y^2 V_X}{2b+\gamma }&\quad - \frac{\sigma _2^2 V_Y V_X^2}{b+2\gamma } \end{bmatrix}, \end{aligned}$$

and \({\varvec{\xi }}\) is a 5-dimensional standard normally distributed random vector independent of \((V_Y, V_X)\).

Proof

We have

$$\begin{aligned} \begin{bmatrix} T \mathrm {e}^{\frac{bT}{2}} (\widehat{a}_T - a) \\ \mathrm {e}^{-\frac{bT}{2}} (\widehat{b}_T - b) \\ T \mathrm {e}^{\frac{bT}{2}} (\widehat{\alpha }_T - \alpha ) \\ \mathrm {e}^{-\frac{bT}{2}} (\widehat{\beta }_T - \beta ) \\ \mathrm {e}^{\frac{(b-2\gamma )T}{2}} (\widehat{\gamma }_T - \gamma ) \end{bmatrix} = {\text {diag}}\Bigl (T \mathrm {e}^{\frac{bT}{2}}, \mathrm {e}^{-\frac{bT}{2}}, T \mathrm {e}^{\frac{bT}{2}}, \mathrm {e}^{-\frac{bT}{2}}, \mathrm {e}^{\frac{(b-2\gamma )T}{2}}\Bigr ) \bigl ({\widehat{{\varvec{\theta }}}}_T - {\varvec{\theta }}\bigr ), \end{aligned}$$

where, by (3.5),

$$\begin{aligned} {\widehat{{\varvec{\theta }}}}_T - {\varvec{\theta }}= {\varvec{G}}_T^{-1} {\varvec{h}}_T = \begin{bmatrix} {\varvec{G}}_T^{(1)}&\quad {\varvec{0}}\\ {\varvec{0}}&\quad {\varvec{G}}_T^{(2)} \end{bmatrix}^{-1} \begin{bmatrix} {\varvec{h}}_T^{(1)} \\ {\varvec{h}}_T^{(2)} \end{bmatrix}. \end{aligned}$$

We are going to apply Theorem C.2 to the continuous local martingale \(({\varvec{h}}_T)_{T\in \mathbb {R}_+}\) with quadratic variation process \(\langle {\varvec{h}}\rangle _T = \widetilde{{\varvec{G}}}_T\), \(T \in \mathbb {R}_+\) (introduced in the proof of Theorem 5.1). With the scaling matrix

$$\begin{aligned} {\varvec{Q}}(T) := {\text {diag}}\Bigl (\mathrm {e}^{\frac{bT}{2}}, \mathrm {e}^{\frac{3bT}{2}}, \mathrm {e}^{\frac{bT}{2}}, \mathrm {e}^{\frac{3bT}{2}}, \mathrm {e}^{\frac{(b+2\gamma )T}{2}}\Bigr ), \qquad T \in \mathbb {R}_{++}, \end{aligned}$$

by (7.2), we have

$$\begin{aligned} {\varvec{Q}}(T) \langle {\varvec{h}}\rangle _T {\varvec{Q}}(T)^\top {\mathop {\longrightarrow }\limits ^{{\mathrm {a.s.}}}}\begin{bmatrix} - \frac{\sigma _1^2 V_Y}{b}&\frac{\sigma _1^2 V_Y^2}{2b}&- \frac{\varrho \sigma _1\sigma _2 V_Y}{b}&\frac{\varrho \sigma _1\sigma _2 V_Y^2}{2b}&\frac{\varrho \sigma _1\sigma _2 V_Y V_X}{b+\gamma } \\ \frac{\sigma _1^2 V_Y^2}{2b}&- \frac{\sigma _1^2 V_Y^3}{3b}&\frac{\varrho \sigma _1\sigma _2 V_Y^2}{2b}&- \frac{\varrho \sigma _1\sigma _2 V_Y^3}{3b}&- \frac{\varrho \sigma _1\sigma _2 V_Y^2 V_X}{2b+\gamma } \\ - \frac{\varrho \sigma _1\sigma _2 V_Y}{b}&\frac{\varrho \sigma _1\sigma _2 V_Y^2}{2b}&- \frac{\sigma _2^2 V_Y}{b}&\frac{\sigma _2^2 V_Y^2}{2b}&\frac{\sigma _2^2 V_Y V_X}{b+\gamma } \\ \frac{\varrho \sigma _1\sigma _2 V_Y^2}{2b}&- \frac{\varrho \sigma _1\sigma _2 V_Y^3}{3b}&\frac{\sigma _2^2 V_Y^2}{2b}&- \frac{\sigma _2^2 V_Y^3}{3b}&- \frac{\sigma _2^2 V_Y^2 V_X}{2b+\gamma } \\ \frac{\varrho \sigma _1\sigma _2 V_Y V_X}{b+\gamma }&- \frac{\varrho \sigma _1\sigma _2 V_Y^2 V_X}{2b+\gamma }&\frac{\sigma _2^2 V_Y V_X}{b+\gamma }&- \frac{\sigma _2^2 V_Y^2 V_X}{2b+\gamma }&- \frac{\sigma _2^2 V_Y V_X^2}{b+2\gamma } \end{bmatrix} = {\varvec{\eta }}{\varvec{\eta }}^\top \end{aligned}$$

as \(T \rightarrow \infty \). Hence by Theorem C.2, for each random matrix \({\varvec{A}}\) defined on \((\varOmega , \mathcal {F}, \mathbb {P})\), we obtain

$$\begin{aligned} ({\varvec{Q}}(T){\varvec{h}}_T, {\varvec{A}}) {\mathop {\longrightarrow }\limits ^{\mathcal {D}}}({\varvec{\eta }}{\varvec{\xi }}, {\varvec{A}}) \qquad \text {as }\ T \rightarrow \infty , \end{aligned}$$
(7.5)

where \({\varvec{\xi }}\) is a 5-dimensional standard normally distributed random vector independent of \(({\varvec{\eta }}, {\varvec{A}})\). The aim of the following discussion is to include appropriate scaling matrices for \({\varvec{G}}_T\). The matrices \({\varvec{G}}_T^{(1)}\) and \({\varvec{G}}_T^{(2)}\) can be written in the form

$$\begin{aligned} {\varvec{G}}_T^{(1)} = {\text {diag}}\bigl (T^{\frac{1}{2}}, \mathrm {e}^{-bT}\bigr ) \begin{bmatrix} 1&-\frac{\mathrm {e}^{bT}}{\sqrt{T}} \int _0^T Y_s \, \mathrm {d}s \\ -\frac{\mathrm {e}^{bT}}{\sqrt{T}} \int _0^T Y_s \, \mathrm {d}s&\mathrm {e}^{2bT} \int _0^T Y_s^2 \, \mathrm {d}s \end{bmatrix} {\text {diag}}\bigl (T^{\frac{1}{2}}, \mathrm {e}^{-bT}\bigr ) \end{aligned}$$

and

$$\begin{aligned} {\varvec{G}}_T^{(2)}&= {\text {diag}}\bigl (T^{\frac{1}{2}}, \mathrm {e}^{-bT}, \mathrm {e}^{-\gamma T}\bigr )\\&\quad \quad \times \begin{bmatrix} 1&-\frac{\mathrm {e}^{bT}}{\sqrt{T}} \int _0^T Y_s \, \mathrm {d}s&-\frac{\mathrm {e}^{\gamma T}}{\sqrt{T}} \int _0^T X_s \, \mathrm {d}s \\ -\frac{\mathrm {e}^{bT}}{\sqrt{T}} \int _0^T Y_s \, \mathrm {d}s&\mathrm {e}^{2bT} \int _0^T Y_s^2 \, \mathrm {d}s&\mathrm {e}^{(b+\gamma )T} \int _0^T Y_s X_s \, \mathrm {d}s \\ -\frac{\mathrm {e}^{\gamma T}}{\sqrt{T}} \int _0^T X_s \, \mathrm {d}s&\mathrm {e}^{(b+\gamma )T} \int _0^T Y_s X_s \, \mathrm {d}s&\mathrm {e}^{2\gamma T} \int _0^T X_s^2 \, \mathrm {d}s \end{bmatrix}\\&\quad \quad \times {\text {diag}}\bigl (T^{\frac{1}{2}}, \mathrm {e}^{-bT}, \mathrm {e}^{-\gamma T}\bigr ), \end{aligned}$$

hence the matrices \(({\varvec{G}}_T^{(1)})^{-1}\) and \(({\varvec{G}}_T^{(2)})^{-1}\) can be written in the form

$$\begin{aligned} ({\varvec{G}}_T^{(1)})^{-1} = {\text {diag}}\bigl (T^{-\frac{1}{2}}, \mathrm {e}^{bT}\bigr ) \begin{bmatrix} 1&-\frac{\mathrm {e}^{bT}}{\sqrt{T}} \int _0^T Y_s \, \mathrm {d}s \\ -\frac{\mathrm {e}^{bT}}{\sqrt{T}} \int _0^T Y_s \, \mathrm {d}s&\mathrm {e}^{2bT} \int _0^T Y_s^2 \, \mathrm {d}s \end{bmatrix}^{-1} {\text {diag}}\bigl (T^{-\frac{1}{2}}, \mathrm {e}^{bT}\bigr ) \end{aligned}$$

and

$$\begin{aligned} ({\varvec{G}}_T^{(2)})^{-1}&= {\text {diag}}\bigl (T^{-\frac{1}{2}}, \mathrm {e}^{bT}, \mathrm {e}^{\gamma T}\bigr )\\&\quad \quad \times \begin{bmatrix} 1&-\frac{\mathrm {e}^{bT}}{\sqrt{T}} \int _0^T Y_s \, \mathrm {d}s&-\frac{\mathrm {e}^{\gamma T}}{\sqrt{T}} \int _0^T X_s \, \mathrm {d}s \\ -\frac{\mathrm {e}^{bT}}{\sqrt{T}} \int _0^T Y_s \, \mathrm {d}s&\mathrm {e}^{2bT} \int _0^T Y_s^2 \, \mathrm {d}s&\mathrm {e}^{(b+\gamma )T} \int _0^T Y_s X_s \, \mathrm {d}s \\ -\frac{\mathrm {e}^{\gamma T}}{\sqrt{T}} \int _0^T X_s \, \mathrm {d}s&\mathrm {e}^{(b+\gamma )T} \int _0^T Y_s X_s \, \mathrm {d}s&\mathrm {e}^{2\gamma T} \int _0^T X_s^2 \, \mathrm {d}s \end{bmatrix}^{-1}\\&\qquad \times {\text {diag}}\bigl (T^{-\frac{1}{2}}, \mathrm {e}^{bT}, \mathrm {e}^{\gamma T}\bigr ). \end{aligned}$$

We have

$$\begin{aligned}&{\text {diag}}\Bigl (T \mathrm {e}^{\frac{bT}{2}}, \mathrm {e}^{-\frac{bT}{2}}, T \mathrm {e}^{\frac{bT}{2}}, \mathrm {e}^{-\frac{bT}{2}}, \mathrm {e}^{\frac{(b-2\gamma )T}{2}}\Bigr ) {\text {diag}}\Bigl (T^{-\frac{1}{2}}, \mathrm {e}^{bT}, T^{-\frac{1}{2}}, \mathrm {e}^{bT}, \mathrm {e}^{\gamma T}\Bigr )\\&\quad = {\text {diag}}\Bigl (T^{\frac{1}{2}} \mathrm {e}^{\frac{bT}{2}}, \mathrm {e}^{\frac{bT}{2}}, T^{\frac{1}{2}} \mathrm {e}^{\frac{bT}{2}}, \mathrm {e}^{\frac{bT}{2}}, \mathrm {e}^{\frac{bT}{2}}\Bigr ) \end{aligned}$$

and

$$\begin{aligned}&{\text {diag}}\Bigl (T^{-\frac{1}{2}}, \mathrm {e}^{bT}, T^{-\frac{1}{2}}, \mathrm {e}^{bT}, \mathrm {e}^{\gamma T}\Bigr ) {\varvec{Q}}(T)^{-1} \\&= {\text {diag}}\Bigl (T^{-\frac{1}{2}}, \mathrm {e}^{bT}, T^{-\frac{1}{2}}, \mathrm {e}^{bT}, \mathrm {e}^{\gamma T}\Bigr ) {\text {diag}}\Bigl (\mathrm {e}^{-\frac{bT}{2}}, \mathrm {e}^{-\frac{3bT}{2}}, \mathrm {e}^{-\frac{bT}{2}}, \mathrm {e}^{-\frac{3bT}{2}}, \mathrm {e}^{-\frac{(b+2\gamma )T}{2}}\Bigr ) \\&= {\text {diag}}\Bigl (T^{-\frac{1}{2}} \mathrm {e}^{-\frac{bT}{2}}, \mathrm {e}^{-\frac{bT}{2}}, T^{-\frac{1}{2}} \mathrm {e}^{-\frac{bT}{2}}, \mathrm {e}^{-\frac{bT}{2}}, \mathrm {e}^{-\frac{bT}{2}}\Bigr ). \end{aligned}$$
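The two diagonal-matrix identities above amount to adding exponents entrywise; they can be confirmed symbolically (a sketch using sympy, with \(g\) standing for \(\gamma \); the signs of \(b\) and \(\gamma \) are irrelevant for these algebraic identities):

```python
import sympy as sp

# Entrywise check of the two diagonal scaling identities above.
T, b, g = sp.symbols('T b g', positive=True)  # g stands for gamma

left1 = sp.diag(T * sp.exp(b * T / 2), sp.exp(-b * T / 2),
                T * sp.exp(b * T / 2), sp.exp(-b * T / 2),
                sp.exp((b - 2 * g) * T / 2))
right1 = sp.diag(1 / sp.sqrt(T), sp.exp(b * T), 1 / sp.sqrt(T),
                 sp.exp(b * T), sp.exp(g * T))
expected1 = sp.diag(sp.sqrt(T) * sp.exp(b * T / 2), sp.exp(b * T / 2),
                    sp.sqrt(T) * sp.exp(b * T / 2), sp.exp(b * T / 2),
                    sp.exp(b * T / 2))

Qinv = sp.diag(sp.exp(-b * T / 2), sp.exp(-3 * b * T / 2),
               sp.exp(-b * T / 2), sp.exp(-3 * b * T / 2),
               sp.exp(-(b + 2 * g) * T / 2))
expected2 = sp.diag(sp.exp(-b * T / 2) / sp.sqrt(T), sp.exp(-b * T / 2),
                    sp.exp(-b * T / 2) / sp.sqrt(T), sp.exp(-b * T / 2),
                    sp.exp(-b * T / 2))

ok1 = sp.simplify(left1 * right1 - expected1) == sp.zeros(5, 5)
ok2 = sp.simplify(right1 * Qinv - expected2) == sp.zeros(5, 5)
print(ok1, ok2)
```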

Moreover,

$$\begin{aligned}&{\text {diag}}\Bigl (T^{\frac{1}{2}} \mathrm {e}^{\frac{bT}{2}}, \mathrm {e}^{\frac{bT}{2}}\Bigr ) \begin{bmatrix} 1&-\frac{\mathrm {e}^{bT}}{\sqrt{T}} \int _0^T Y_s \, \mathrm {d}s \\ -\frac{\mathrm {e}^{bT}}{\sqrt{T}} \int _0^T Y_s \, \mathrm {d}s&\mathrm {e}^{2bT} \int _0^T Y_s^2 \, \mathrm {d}s \end{bmatrix} {\text {diag}}\Bigl (T^{-\frac{1}{2}} \mathrm {e}^{-\frac{bT}{2}}, \mathrm {e}^{-\frac{bT}{2}}\Bigr ) \\&= \begin{bmatrix} 1&-\mathrm {e}^{bT} \int _0^T Y_s \, \mathrm {d}s \\ -\frac{\mathrm {e}^{bT}}{T} \int _0^T Y_s \, \mathrm {d}s&\mathrm {e}^{2bT} \int _0^T Y_s^2 \, \mathrm {d}s \end{bmatrix} =: {\varvec{J}}_T^{(1)} \end{aligned}$$

and

$$\begin{aligned}&{\text {diag}}\Bigl (T^{\frac{1}{2}} \mathrm {e}^{\frac{bT}{2}}, \mathrm {e}^{\frac{bT}{2}}, \mathrm {e}^{\frac{bT}{2}}\Bigr ) \begin{bmatrix} 1&-\frac{\mathrm {e}^{bT}}{\sqrt{T}} \int _0^T Y_s \, \mathrm {d}s&-\frac{\mathrm {e}^{\gamma T}}{\sqrt{T}} \int _0^T X_s \, \mathrm {d}s \\ -\frac{\mathrm {e}^{bT}}{\sqrt{T}} \int _0^T Y_s \, \mathrm {d}s&\mathrm {e}^{2bT} \int _0^T Y_s^2 \, \mathrm {d}s&\mathrm {e}^{(b+\gamma )T} \int _0^T Y_s X_s \, \mathrm {d}s \\ -\frac{\mathrm {e}^{\gamma T}}{\sqrt{T}} \int _0^T X_s \, \mathrm {d}s&\mathrm {e}^{(b+\gamma )T} \int _0^T Y_s X_s \, \mathrm {d}s&\mathrm {e}^{2\gamma T} \int _0^T X_s^2 \, \mathrm {d}s \end{bmatrix}\\&\times {\text {diag}}\Bigl (T^{-\frac{1}{2}} \mathrm {e}^{-\frac{bT}{2}}, \mathrm {e}^{-\frac{bT}{2}}, \mathrm {e}^{-\frac{bT}{2}}\Bigr ) \\&\quad = \begin{bmatrix} 1&-\mathrm {e}^{bT} \int _0^T Y_s \, \mathrm {d}s&-\mathrm {e}^{\gamma T} \int _0^T X_s \, \mathrm {d}s \\ -\frac{\mathrm {e}^{bT}}{T} \int _0^T Y_s \, \mathrm {d}s&\mathrm {e}^{2bT} \int _0^T Y_s^2 \, \mathrm {d}s&\mathrm {e}^{(b+\gamma )T} \int _0^T Y_s X_s \, \mathrm {d}s \\ -\frac{\mathrm {e}^{\gamma T}}{T} \int _0^T X_s \, \mathrm {d}s&\mathrm {e}^{(b+\gamma )T} \int _0^T Y_s X_s \, \mathrm {d}s&\mathrm {e}^{2\gamma T} \int _0^T X_s^2 \, \mathrm {d}s \end{bmatrix} =: {\varvec{J}}_T^{(2)}. \end{aligned}$$

Consequently,

$$\begin{aligned} \begin{bmatrix} T \mathrm {e}^{\frac{bT}{2}} (\widehat{a}_T - a) \\ \mathrm {e}^{-\frac{bT}{2}} (\widehat{b}_T - b) \\ T \mathrm {e}^{\frac{bT}{2}} (\widehat{\alpha }_T - \alpha ) \\ \mathrm {e}^{-\frac{bT}{2}} (\widehat{\beta }_T - \beta ) \\ \mathrm {e}^{\frac{(b-2\gamma )T}{2}} (\widehat{\gamma }_T - \gamma ) \end{bmatrix} = {\text {diag}}\bigl ({\varvec{J}}_T^{(1)}, {\varvec{J}}_T^{(2)}\bigr )^{-1} {\varvec{Q}}(T) {\varvec{h}}_T, \end{aligned}$$

where, by Lemma 7.2,

$$\begin{aligned} {\text {diag}}\bigl ({\varvec{J}}_T^{(1)}, {\varvec{J}}_T^{(2)}\bigr ) {\mathop {\longrightarrow }\limits ^{\mathbb {P}}}{\varvec{V}}\qquad \text {as } \ T \rightarrow \infty . \end{aligned}$$
(7.6)

By (7.5) with \({\varvec{A}}= {\varvec{V}}\), by (7.6) and by Theorem 2.7 (iv) of van der Vaart (1998), we obtain

$$\begin{aligned} \bigl ({\varvec{Q}}(T){\varvec{h}}_T, {\text {diag}}\bigl ({\varvec{J}}_T^{(1)}, {\varvec{J}}_T^{(2)}\bigr )\bigr ) {\mathop {\longrightarrow }\limits ^{\mathcal {D}}}({\varvec{\eta }}{\varvec{\xi }}, {\varvec{V}}) \qquad \text {as }\ T \rightarrow \infty . \end{aligned}$$

The random matrix \({\varvec{V}}\) is invertible almost surely, since

$$\begin{aligned} \det ({\varvec{V}}) = - \frac{(b-\gamma )^2V_Y^4V_X^2}{8(b+\gamma )^2b^2\gamma } > 0 \end{aligned}$$

almost surely by Lemma 7.2. Consequently, \({\text {diag}}\bigl ({\varvec{J}}_T^{(1)}, {\varvec{J}}_T^{(2)}\bigr )^{-1} {\varvec{Q}}(T) {\varvec{h}}_T {\mathop {\longrightarrow }\limits ^{\mathcal {D}}}{\varvec{V}}^{-1} {\varvec{\eta }}{\varvec{\xi }}\) as \(T \rightarrow \infty \). \(\square \)
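The determinant formula for \({\varvec{V}}\) used at the end of the proof can be verified symbolically (a sketch using sympy; \(g\) stands for \(\gamma \)):

```python
import sympy as sp

# Symbolic verification of det(V) for the block matrix V in Theorem 7.3.
b, g, VY, VX = sp.symbols('b g V_Y V_X', real=True, nonzero=True)
V = sp.Matrix([
    [1, VY / b, 0, 0, 0],
    [0, -VY**2 / (2 * b), 0, 0, 0],
    [0, 0, 1, VY / b, VX / g],
    [0, 0, 0, -VY**2 / (2 * b), -VY * VX / (b + g)],
    [0, 0, 0, -VY * VX / (b + g), -VX**2 / (2 * g)],
])
claimed = -(b - g)**2 * VY**4 * VX**2 / (8 * (b + g)**2 * b**2 * g)
diff = sp.simplify(V.det() - claimed)
print(diff)  # 0 confirms the formula
```

Note that positivity of the determinant additionally uses \(\gamma < 0\) and \(\mathbb {P}(V_Y \ne 0) = \mathbb {P}(V_X \ne 0) = 1\).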

8 Summary

The following table summarizes the results of the present paper on the asymptotic properties of the CLSE \((\widehat{a}_T, \widehat{b}_T, \widehat{\alpha }_T, \widehat{\beta }_T, \widehat{\gamma }_T)\) for the drift parameters \((a, b, \alpha , \beta , \gamma )\) of general two-factor affine diffusions (1.1). We recall that \(a \in [0, \infty )\), \(b, \alpha , \beta , \gamma \in \mathbb {R}\), \(\sigma _1, \sigma _2, \sigma _3 \in [0, \infty )\) and \(\varrho \in [-1, 1]\).

figure a

For comparison, the following table summarizes the results of Barczy and Pap (2016) on the asymptotic properties of the MLE \((\widetilde{a}_T, \widetilde{b}_T, \widetilde{\alpha }_T, \widetilde{\beta }_T)\) for the drift parameters \((a, b, \alpha , \beta )\) of a Heston model, which is a submodel of (1.1) with \(a \geqslant \frac{\sigma _1^2}{2}\), \(\sigma _1, \sigma _2 \in (0, \infty )\), \(\gamma = 0\), \(\varrho \in (-1, 1)\) and \(\sigma _3 = 0\).

figure b