1 Introduction

According to a very common model for financial returns, \(r_t\), the term volatility refers to the coefficient \(\sigma _t\) in the equation:

$$\begin{aligned} r_t=\mu _t+\sigma _t\varepsilon _t \quad \varepsilon _t\sim NID(0,1) \end{aligned}$$
(1)

where \(\varepsilon _t\) is a unit-variance innovation, Gaussian distributed in the simplest case; when \(\sigma _t\) does not depend on \(\varepsilon _t\), \(\mu _t\) is the expected value of \(r_t\) given all information known at time \(t-1\) (Taylor 1986).

Two main classes of volatility models are known in the literature:

  • Models for conditional heteroskedasticity;

  • Stochastic Volatility models.

In the models for conditional heteroskedasticity, generally identified as GARCH models (Bollerslev 1986), volatility is a function of the previous information, \(I_{t-1}\), and \(\sigma ^2_t\) corresponds to the conditional variance of the returns: \(\sigma _t^2 = Var(r_t \mid I_{t-1})\).

An alternative approach to volatility modelling treats \(\sigma ^2_t\) as a latent stochastic process whose logarithm, \(h_t\), is usually represented as an autoregressive process, in most cases of order one:

$$\begin{aligned} h_{t+1} = \omega + \beta h_{t}+\eta _t \quad \eta _t\sim NID(0,\sigma ^2_{\eta }) \end{aligned}$$
(2)

Models of the form (1)–(2) are known in the literature as Stochastic Volatility (SV) models (Melino and Turnbull 1990; Taylor 1994) and are discrete approximations to various diffusion processes proposed in asset-pricing theory (Hull and White 1987; Wiggins 1987).

Unlike GARCH family models, which are easily estimated by the Maximum Likelihood (ML) method, the ML estimation of SV models is a tough challenge since the likelihood is defined by a T-dimensional integral, which is hard to manage (Harvey and Shephard 1996).

Following Sandmann and Koopman (1998), methods for estimating SV models can be divided into two groups: (i) methods that aim to reconstruct the exact likelihood of the SV model or of a related model; (ii) methods based on more workable, but sub-optimal, criteria. Among all the methods, the Quasi Maximum Likelihood (QML) method (Ruiz 1994; Harvey et al. 1994) appears to be a good compromise between simplicity and efficiency: it maximizes a function that is not the exact likelihood, but it provides an optimal linear estimator (and predictor) of \(h_t\), and is consistent and asymptotically Gaussian according to the results of Dunsmuir (1979).

The method is based on the Kalman filter (Harvey 1990), which filters past and present volatilities and predicts future ones; Ruiz (1994) suggests that the method works well for the sample sizes usually encountered in financial economics.

On the other hand, the application of the basic QML (see Footnote 1) faces some issues when the volatility innovation, \(\eta _t\), and the return innovation, \(\varepsilon _t\), are correlated, as explained in Sect. 2. This correlation is fundamental for modelling the asymmetric effect of negative and positive returns on the expected volatility, a behaviour often observed in stock prices and known as the leverage effect (Christie 1982; Engle and Ng 1993).

This paper presents a new procedure for applying the QML method to asymmetric SV models. The procedure produces a volatility predictor that differs from the one known in the literature and is preferable to it in some cases. A brief presentation of the known procedure is given in Sect. 2, while the new procedure is illustrated in Sect. 3. In Sect. 4, a simulation study evaluates the properties of the estimators obtained with the new proposal. Section 5 illustrates the application of the old and new procedures to three financial series. The main results are summarized in Sect. 6.

2 Leverage effect and stochastic volatility

In financial series, increases in volatility are often observed after negative returns, and generally the greater the loss, the greater the increase. This evidence is called the “leverage effect” because a drop in a stock's price decreases the value of the firm's equity and increases its leverage ratio. The increased leverage ratio implies a higher risk on the equity, which will be more volatile in the next period (Black 1976).

One way to include the leverage effect in (2) is to assume that the volatility innovation \(\eta _t\) is correlated with the return innovation \(\varepsilon _t\), as in the following general Gaussian SV model:

$$\begin{aligned} \begin{array}{l} x_t = \exp (h_t/2)\varepsilon _t \\ h_{t+1} = \omega + \beta h_{t} + \eta _{t} \end{array} \quad \Bigg [ \begin{array}{c} \varepsilon _t \\ \eta _t \end{array} \Bigg ] \sim NID\Bigg ( \Bigg [ \begin{array}{c} 0 \\ 0 \end{array} \Bigg ], \Bigg [ \begin{array}{cc} 1 &{} \gamma \\ \gamma &{} \sigma ^2_{\eta } \end{array} \Bigg ] \Bigg ) \end{aligned}$$
(3)

where: \(x_t=r_t-\mu _t\) is the mean-adjusted return on an asset, simply “return” in the following; \(h_t=\ln \sigma ^2_t\) is the log-square volatility; \(\varepsilon _t\) and \(\eta _t\) are zero-mean Gaussian innovations, serially independent, with \(E(\eta _t\varepsilon _{t-k})=\gamma\) if \(k=0\), and zero otherwise; then, the correlation between \(\varepsilon _t\) and \(\eta _t\) is \(\rho =\gamma /\sigma _{\eta }\).
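As an illustration of Model (3), the following minimal sketch simulates a series of returns with correlated innovations. It relies on the decomposition \(\eta _t=\gamma \varepsilon _t+\sqrt{\sigma ^2_{\eta }-\gamma ^2}\,z_t\) with \(z_t\) an independent standard normal draw. The function name `simulate_asv`, the seed handling and the choice of starting \(h_1\) at its unconditional mean are assumptions made for this example, not part of the original text.

```python
import math
import random

def simulate_asv(T, omega, beta, sigma2_eta, rho, seed=0):
    """Simulate Model (3); returns (x, h, eps, eta) as lists of length T."""
    rng = random.Random(seed)
    gamma = rho * math.sqrt(sigma2_eta)                   # Cov(eps_t, eta_t)
    resid_sd = math.sqrt(max(sigma2_eta - gamma**2, 0.0)) # sd of the part of eta_t unrelated to eps_t
    h = omega / (1.0 - beta)                              # assumed start: unconditional mean of h_t
    xs, hs, eps, etas = [], [], [], []
    for _ in range(T):
        e = rng.gauss(0.0, 1.0)
        eta = gamma * e + resid_sd * rng.gauss(0.0, 1.0)
        xs.append(math.exp(h / 2.0) * e)                  # x_t = exp(h_t / 2) * eps_t
        hs.append(h)
        eps.append(e)
        etas.append(eta)
        h = omega + beta * h + eta                        # h_{t+1} = omega + beta * h_t + eta_t
    return xs, hs, eps, etas
```

With \(\rho <0\), a negative \(\varepsilon _t\) tends to come with a positive \(\eta _t\), so \(h_{t+1}\) rises after a loss.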

In Model (3), the leverage effect on \(h_{t+1}\) corresponds to:

$$\begin{aligned} E(\eta _t\mid \varepsilon _t)=\gamma \varepsilon _t \end{aligned}$$
(4)

which is proportional to the size of \(\varepsilon _t\) and of opposite sign if \(\gamma <0\).

The basic QML estimation of a Gaussian SV model consists in estimating, via the Kalman filter, the auxiliary state space model:

$$\begin{aligned} \begin{array}{l} y_t = -1.27 + h_{t} + \xi _t \\ h_{t+1} = \omega + \beta h_{t} + \eta _{t} \\ \end{array} \quad \Bigg [\begin{array}{c} \xi _t \\ \eta _t \end{array}\Bigg ] \sim ID\Bigg ( \Bigg [ \begin{array}{c} 0 \\ 0 \end{array} \Bigg ], \Bigg [ \begin{array}{cc} 4.93 &{} 0 \\ 0 &{} \sigma ^2_{\eta } \\ \end{array} \Bigg ] \Bigg ) \end{aligned}$$
(5)

where: \(y_t=\ln x^2_t\), and \(\xi _t = \ln \varepsilon ^2_t + 1.27\); \(-\) 1.27 and 4.93 are the values of \(E(\ln \varepsilon ^2_t)\) and \(Var(\ln \varepsilon ^2_t)\) respectively, when \(\varepsilon _t\) is Gaussian (Zelen and Severo 1972).
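The two constants can be checked with a short computation. For Gaussian \(\varepsilon _t\), \(\varepsilon ^2_t\) is chi-square with one degree of freedom, so \(E(\ln \varepsilon ^2_t)=\psi _0(1/2)+\ln 2\) and \(Var(\ln \varepsilon ^2_t)=\psi _1(1/2)=\pi ^2/2\). The sketch below evaluates these closed forms; the helper `log_squared_returns` is a hypothetical name introduced only for this example.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant

# psi_0(1/2) = -EULER_GAMMA - 2 ln 2, hence E[ln eps^2] = -EULER_GAMMA - ln 2
mean_log_chi2 = -EULER_GAMMA - math.log(2.0)
# psi_1(1/2) = pi^2 / 2
var_log_chi2 = math.pi ** 2 / 2.0

def log_squared_returns(x):
    """y_t = ln x_t^2; zero returns must be handled beforehand (ln 0 is undefined)."""
    return [math.log(v * v) for v in x]

print(f"{mean_log_chi2:.2f} {var_log_chi2:.2f}")  # prints -1.27 4.93
```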

Unfortunately, the basic QML does not allow the parameter \(\gamma\) (or \(\rho\)) to be estimated directly, since the disturbances \(\xi _t\) and \(\eta _t\) are uncorrelated when the distribution of \(\varepsilon _t\) and \(\eta _t\) is symmetric, e.g. Gaussian or Student’s t, so that the original covariance (correlation) between \(\varepsilon _t\) and \(\eta _t\) is entirely lost in (5) (Harvey et al. 1994).

Harvey and Shephard (1996) proposed a method, here briefly named HS-QML, to estimate the parameters of an SV model with correlated disturbances. According to this method, the parameters of (3) can be estimated by applying the QML method to the time-varying linear state space model (see Footnote 2):

$$\begin{aligned} \begin{array}{rl} y_t &{}= -1.27 + h_{t} + \xi _t \\ h_{t+1} &{} = \omega + 0.80\gamma s_t + \beta h_{t} + \eta ^*_{t} \end{array} \nonumber \\ \Bigg [\begin{array}{c} \xi _t \\ \eta ^*_t \end{array}\Bigg ]\mid s_t \sim ID\Bigg ( \Bigg [ \begin{array}{c} 0 \\ 0 \end{array} \Bigg ], \Bigg [ \begin{array}{cc} 4.93 &{} 1.11 \gamma s_t\\ 1.11\gamma s_t &{} \sigma ^2_{\eta }-0.64\gamma ^2 \\ \end{array} \Bigg ] \Bigg ) \end{aligned}$$
(6)

where \(s_t\) indicates the sign of \(x_t\): it is equal to 1 (\(-\,1\)) when \(x_t\), i.e. \(\varepsilon _t\), is positive (negative).
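The numerical coefficients in (6) are consistent with moments of a standard normal variable conditional on its sign. Writing \(\varepsilon _t=s_t|\varepsilon _t|\) and \(\eta _t=\gamma \varepsilon _t+\eta ^+_t\), one obtains \(E(\eta _t\mid s_t)=\gamma s_tE|\varepsilon _t|\), \(Cov(\eta _t,\xi _t\mid s_t)=\gamma s_tCov(|\varepsilon _t|,\ln \varepsilon ^2_t)\) and \(Var(\eta _t\mid s_t)=\sigma ^2_{\eta }-\gamma ^2(E|\varepsilon _t|)^2\). The sketch below evaluates these quantities in closed form; it is offered as a plausible reading of where 0.80, 1.11 and 0.64 come from, and the variable names are illustrative.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant

e_abs = math.sqrt(2.0 / math.pi)                    # E|eps|
e_log = -EULER_GAMMA - math.log(2.0)                # E[ln eps^2]
e_abs_log = e_abs * (math.log(2.0) - EULER_GAMMA)   # E[|eps| ln eps^2]

drift_coef = e_abs                     # multiplies gamma * s_t in the transition equation
cov_coef = e_abs_log - e_abs * e_log   # multiplies gamma * s_t in Cov(xi_t, eta*_t | s_t)
var_coef = e_abs ** 2                  # multiplies gamma^2 in Var(eta*_t | s_t)

print(f"{drift_coef:.2f} {cov_coef:.2f} {var_coef:.2f}")  # prints 0.80 1.11 0.64
```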

The one-step-ahead predictor of \(h_t\) provided by the Kalman filter (see Appendix A) combines the mechanisms of the EGARCH (Nelson 1991) and the Threshold-ARCH predictors (Glosten et al. 1993; Zakoian 1994):

$$\begin{aligned} {\hat{h}}_{t+1\mid t}= & {} \omega + \beta {\hat{h}}_{t\mid t-1}+ 0.80\gamma s_t +\kappa _t (y_t-{\hat{y}}_{t\mid t-1}) \end{aligned}$$
(7)
$$\begin{aligned} \kappa _t= & {} \frac{\beta P_{t\mid t-1} + 1.11\gamma s_t}{ P_{t\mid t-1} +4.93} \end{aligned}$$
(8)

where \(\kappa _t\) is the gain of the Kalman filter (Harvey 1990); \({\hat{y}}_{t\mid t-1}=-1.27+{\hat{h}}_{t\mid t-1}\) is the one-step-ahead prediction of \(y_t\); \(P_{t\mid t-1}=E[({\hat{h}}_{t\mid t-1}-h_t)^2]\) is the MSE of \({\hat{h}}_{t\mid t-1}\).

If \(\gamma <0\), Formula (8) entails that \((\kappa _t\mid P_{t\mid t-1}, s_t=-1)>(\kappa _t\mid P_{t\mid t-1}, s_t=+1)\). Therefore the leverage effect predicted by (7) can be broken down into two parts: (i) a part, \(0.80\gamma s_t\), depending on the sign but not on the size of \(\varepsilon _t\) (as in the Threshold-ARCH model); (ii) a part, \(\kappa _t (y_t-{\hat{y}}_{t\mid t-1})\), depending on both the sign and the size of \(\varepsilon _t\) (see Footnote 3). This forecast only partially reflects the characteristics of the leverage effect as expressed in (4).
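The recursion in (7)–(8) can be sketched for given parameter values as follows. The update of \(P_{t\mid t-1}\) is not shown in this section, so the code uses the textbook Kalman recursion for a scalar state with correlated disturbances; this, the initial conditions and the function name `hs_qml_predict` are assumptions made for the example.

```python
def hs_qml_predict(y, s, omega, beta, gamma, sigma2_eta):
    """One-step-ahead predictions of h_t and gains kappa_t following Eqs. (7)-(8)."""
    h_pred = omega / (1.0 - beta)             # assumed start: unconditional mean of h_t
    P = sigma2_eta / (1.0 - beta ** 2)        # assumed start: unconditional variance of h_t
    q = sigma2_eta - 0.64 * gamma ** 2        # Var(eta*_t | s_t) in Model (6)
    preds, gains = [], []
    for y_t, s_t in zip(y, s):
        preds.append(h_pred)
        F = P + 4.93                                  # variance of the prediction error of y_t
        kappa = (beta * P + 1.11 * gamma * s_t) / F   # Eq. (8)
        v = y_t - (-1.27 + h_pred)                    # y_t minus its one-step-ahead prediction
        h_pred = omega + beta * h_pred + 0.80 * gamma * s_t + kappa * v  # Eq. (7)
        P = beta ** 2 * P + q - kappa ** 2 * F        # assumed textbook MSE update
        gains.append(kappa)
    return preds, gains
```

For a fixed \(P_{t\mid t-1}\) and \(\gamma <0\), calling the function with \(s_t=-1\) yields a larger gain than with \(s_t=+1\), in line with the inequality stated above.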

3 Iterative QML for asymmetric SV models

Having assumed that \(\eta _t\) and \(\varepsilon _t\) are bivariate normal with \(E(\eta _t\varepsilon _t)=\gamma\), the auxiliary model (5) can be rewritten as:

$$\begin{aligned} \begin{array}{l} y_t = -1.27 + h_{t} + \xi _t \\ h_{t+1}=\omega + \gamma \varepsilon _t + \beta h_{t} + \eta _t^+ \end{array} \nonumber \\ \Bigg [\begin{array}{c} \xi _t \\ \eta _t^+ \end{array}\Bigg ] \sim ID\Bigg ( \Bigg [ \begin{array}{c} 0 \\ 0 \end{array} \Bigg ], \Bigg [ \begin{array}{cc} 4.93 &{} 0 \\ 0 &{} \sigma ^2_{\eta } -\gamma ^2\\ \end{array} \Bigg ] \Bigg ) \end{aligned}$$
(9)

where the disturbance \(\eta ^+_t\) represents the exogenous innovation on \(h_t\).
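If \(\varepsilon _t\) in (9) is replaced by a known proxy, the model is linear and its Gaussian quasi-log-likelihood follows from the prediction error decomposition of the Kalman filter. The sketch below shows this for a scalar state; the function name `qml_loglik_model9`, the argument `eps_proxy` and the initial conditions are assumptions made for illustration.

```python
import math

def qml_loglik_model9(y, eps_proxy, omega, beta, gamma, sigma2_eta):
    """Gaussian quasi-log-likelihood of Model (9), treating eps_proxy as known regressors."""
    q = sigma2_eta - gamma ** 2               # Var(eta+_t); must be positive
    h_pred = omega / (1.0 - beta)             # assumed initial state mean
    P = sigma2_eta / (1.0 - beta ** 2)        # assumed initial state variance
    loglik = 0.0
    for y_t, e_t in zip(y, eps_proxy):
        v = y_t - (-1.27 + h_pred)            # one-step prediction error of y_t
        F = P + 4.93                          # variance of the prediction error
        loglik += -0.5 * (math.log(2.0 * math.pi * F) + v * v / F)
        K = beta * P / F                      # Kalman gain (no cross-covariance in (9))
        h_pred = omega + gamma * e_t + beta * h_pred + K * v
        P = beta ** 2 * P + q - K * K * F
    return loglik
```

Maximizing this function over \((\omega ,\beta ,\gamma ,\sigma ^2_{\eta })\) with a numerical optimizer would give the QML estimates for a given proxy series.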

As \(\varepsilon _t=x_t/\exp (h_{t}/2)\), Model (9) is not linear, so it cannot be estimated by means of basic QML. Nevertheless, we can perform the following procedure:

  1. \(\varepsilon _t\) is initially set to \(x_t/s_X\), where \(s_X\) is the sample standard deviation of the series of returns \((x_1, x_2,\ldots ,x_T)\);

  2. Model (9), now linear, is estimated by basic QML;

  3. \(\varepsilon _t\) is updated to \(x_t/\exp {({\tilde{h}}_{t}/2)}\), where \({\tilde{h}}_{t}\) are the smoothed values of \(h_t\) provided by the Kalman smoother;

  4. steps 2 and 3 are repeated until a pre-set stopping rule is met (see Footnote 4).
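The control flow of the four steps can be sketched as below. The estimation and smoothing routines are deliberately left to the caller: `fit_qml` and `smooth_h` are hypothetical callables standing in for a basic QML estimator of Model (9) and a Kalman smoother. The stopping rule (a maximum number of iterations combined with a tolerance on the change in the parameter estimates) is an assumption, since the rule actually adopted is not stated in this section.

```python
import math

def iqml(x, fit_qml, smooth_h, max_iter=20, tol=1e-4):
    """Iterate steps 1-4; returns the last parameter estimates and the final proxy of eps_t."""
    T = len(x)
    mean_x = sum(x) / T
    s_x = math.sqrt(sum((v - mean_x) ** 2 for v in x) / (T - 1))
    eps = [v / s_x for v in x]                     # step 1: initial proxy x_t / s_X
    y = [math.log(v * v) for v in x]               # y_t = ln x_t^2 (requires x_t != 0)
    params = None
    for _ in range(max_iter):
        new_params = fit_qml(y, eps)               # step 2: basic QML on the linear model
        h_smooth = smooth_h(y, eps, new_params)    # step 3: smoothed h_t ...
        eps = [v / math.exp(h / 2.0) for v, h in zip(x, h_smooth)]  # ... used to update eps_t
        converged = params is not None and max(
            abs(a - b) for a, b in zip(new_params, params)) < tol
        params = new_params
        if converged:                              # step 4: assumed stopping rule
            break
    return params, eps
```

Here `fit_qml(y, eps)` is expected to return a tuple of parameter estimates and `smooth_h(y, eps, params)` a list of smoothed log-variances of the same length as `x`.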

Empirical evidence shows that the parameter estimates converge to realistic values after a few steps. The trick of the iterative procedure, IQML in the following, consists in not smoothing \(\varepsilon _t\) and \(h_t\) jointly: treating \(\varepsilon _t\) and \(h_t\) separately does not compromise the linearity of the second equation in (9).

The IQML can be viewed as a variant of the EM approach (Dempster et al. 1977) proposed by Shumway and Stoffer (1982) for smoothing and forecasting time series, but some differences should be highlighted. In the case of Model (9), the approach of Shumway and Stoffer (1982) would consist in: (i) setting initial values of the model parameters; (ii) smoothing \(h_t\) in order to build a likelihood function (expectation step); (iii) estimating the model parameters by maximizing the likelihood function (maximization step); (iv) iterating steps (ii)–(iii) until the estimates and the likelihood function are stable. However, the direct smoothing of \(h_t\) (step ii) is complicated by the non-linearity of the model. This problem is bypassed in IQML because \(\varepsilon _t\) and \(h_t\) are smoothed separately: \(\varepsilon _t\) is smoothed before the maximization (estimation) step, whereas \(h_t\) is smoothed after the maximization step in order to update \(\varepsilon _t\). As a result, the maximization step differs between the algorithms: in IQML it consists in maximizing a pseudo (quasi) log-likelihood given \(y_t\) and \({\tilde{\varepsilon }}_t\); in EM it consists in the ML estimation of a regression model, the second equation in (9), given \({\tilde{h}}_t\) (see Footnote 5). The IQML procedure is more viable than the EM approach described above, but involves an inevitable loss of efficiency due to smoothing \(\varepsilon _t\) and \(h_t\) separately.

It is interesting to note the form of the predictor of \(h_t\) provided by the Kalman Filter:

$$\begin{aligned} {\hat{h}}_{t+1\mid t} = \omega + \beta {\hat{h}}_{t\mid t-1} + \gamma {\tilde{\varepsilon }}_t + \kappa _{t} (y_t-{\hat{y}}_{t\mid t-1}) \end{aligned}$$
(10)

where \(\kappa _{t}\) is the gain of the Kalman filter. Since Model (9) satisfies the steady-state conditions (see Footnote 6), \(\kappa _{t}\) converges to a constant that we name \(\alpha\). Therefore, the steady-state predictor of \(h_t\) can be formalized as:

$$\begin{aligned} {\hat{h}}_{t+1\mid t} = \omega + \beta {\hat{h}}_{t\mid t-1} + \gamma {\tilde{\varepsilon }}_{t} +\alpha \xi (| {\tilde{\varepsilon }}_{t}| ) \end{aligned}$$
(11)

where the function \(\xi (| {\tilde{\varepsilon }}_{t}| )=2\ln | {\tilde{\varepsilon }}_{t}| +1.27\) is a monotonically increasing, mean-corrected function of the (approximated) magnitude of \(\varepsilon _t\). Predictor (11) shows a clear similarity with the EGARCH predictor, in which \(\xi (| \varepsilon _{t}| )=|\varepsilon _{t}| -\sqrt{2/\pi }\) is the mean-corrected magnitude of \(\varepsilon _t\) when \(\varepsilon _t\sim N(0,1)\). Predictor (11) may also be viewed as a Log-GARCH predictor (Geweke 1986; Pantula 1986) with leverage effect.
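The steady-state gain \(\alpha\) can be approximated numerically by iterating the scalar Riccati recursion implied by Model (9) until \(P_{t\mid t-1}\) stops changing. The following sketch assumes the textbook recursion \(P_{t+1\mid t}=\beta ^2P_{t\mid t-1}\cdot 4.93/(P_{t\mid t-1}+4.93)+\sigma ^2_{\eta }-\gamma ^2\) and gain \(\kappa _t=\beta P_{t\mid t-1}/(P_{t\mid t-1}+4.93)\); the function names and the starting value are illustrative choices.

```python
import math

def steady_state_gain(beta, sigma2_eta, gamma, r=4.93, tol=1e-12, max_iter=10000):
    """Iterate the scalar Riccati recursion of Model (9); returns (alpha, P)."""
    q = sigma2_eta - gamma ** 2                  # Var(eta+_t)
    P = sigma2_eta / (1.0 - beta ** 2)           # assumed starting value
    for _ in range(max_iter):
        P_new = beta ** 2 * P * r / (P + r) + q
        if abs(P_new - P) < tol:
            P = P_new
            break
        P = P_new
    return beta * P / (P + r), P

def xi(eps_tilde):
    """Mean-corrected log magnitude appearing in predictor (11)."""
    return 2.0 * math.log(abs(eps_tilde)) + 1.27
```

The fixed point can be cross-checked against the positive root of \(P^2+P\,[4.93(1-\beta ^2)-(\sigma ^2_{\eta }-\gamma ^2)]-4.93(\sigma ^2_{\eta }-\gamma ^2)=0\), which follows from setting \(P_{t+1\mid t}=P_{t\mid t-1}\) in the recursion.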

The leverage effect predicted by (11) tries to replicate the form of the leverage effect in (4): it is proportional to the amplitude of the estimated return innovation, \({\tilde{\varepsilon }}_t\), and of opposite sign (if \(\gamma <0\)). Formulas (7) and (11) show that the HS-QML and IQML methods involve two different predictors of \(h_t\), whose performances are evaluated in Sects. 4 and 5.

3.1 Student’s \(\textbf{t}\) return innovations

The IQML method can be generalized to the case where \(\varepsilon _t\) has a scaled Student’s t-distribution with v degrees of freedom, scaled in order to have unit variance, i.e. \(\varepsilon _t\sim t_v\sqrt{(v-2)/v}\). In this case (see “Appendix B”):

$$\begin{aligned} E[\ln \varepsilon ^2_t]= & {} g_1(v) = -1.27+\ln (v/2-1)-\psi _0(v/2) \end{aligned}$$
(12)
$$\begin{aligned} Var[\ln \varepsilon ^2_t]= & {} g_2(v) = 4.93+\psi _1(v/2) \end{aligned}$$
(13)

where \(\psi _0\) and \(\psi _1\) are the digamma and trigamma functions, respectively (Davis 1972).

If the parameter v is assumed known, the model estimation procedure is the IQML described above, with the values \(-\)1.27 and 4.93 now replaced by the values of \(g_1(v)\) and \(g_2(v)\), respectively. Alternatively, \(-\)1.27 and 4.93 are replaced by the parametric formulas of \(g_1(v)\) and \(g_2(v)\), and v is treated as an additional unknown parameter.
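Formulas (12) and (13) are straightforward to evaluate once the digamma and trigamma functions are available. The sketch below implements both with an upward recurrence followed by a standard asymptotic series, so that it depends only on the Python standard library; in practice a special-functions library would normally be used instead. The function names are illustrative.

```python
import math

def digamma(x):
    """psi_0(x) for x > 0: shift x above 6 by recurrence, then use the asymptotic series."""
    acc = 0.0
    while x < 6.0:
        acc -= 1.0 / x                 # psi_0(x) = psi_0(x + 1) - 1/x
        x += 1.0
    inv2 = 1.0 / (x * x)
    return acc + math.log(x) - 0.5 / x - inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))

def trigamma(x):
    """psi_1(x) for x > 0: shift x above 6 by recurrence, then use the asymptotic series."""
    acc = 0.0
    while x < 6.0:
        acc += 1.0 / (x * x)           # psi_1(x) = psi_1(x + 1) + 1/x^2
        x += 1.0
    inv2 = 1.0 / (x * x)
    return acc + 1.0 / x + 0.5 * inv2 + inv2 / x * (1.0 / 6 - inv2 * (1.0 / 30 - inv2 / 42))

def g1(v):
    """Eq. (12): E[ln eps_t^2] for unit-variance Student's t innovations (v > 2)."""
    return -1.27 + math.log(v / 2.0 - 1.0) - digamma(v / 2.0)

def g2(v):
    """Eq. (13): Var[ln eps_t^2] for unit-variance Student's t innovations (v > 2)."""
    return 4.93 + trigamma(v / 2.0)
```

As \(v\) grows, \(g_1(v)\) and \(g_2(v)\) approach the Gaussian values \(-1.27\) and 4.93, while small \(v\) gives a lower mean and a larger variance of \(\ln \varepsilon ^2_t\).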

4 Finite sample properties of the IQML estimator

By means of a simulation study, Harvey and Shephard (1996) provided empirical results in favour of the consistency of the estimates obtained with the HS-QML method. In the study, series of different lengths, T, were simulated 1000 times (\(n=1000\)) from Model (3) with “empirically reasonable” parameter values. On each series, the model parameters were estimated using the HS-QML method, under some practical constraints (see Footnote 7).

Table 1 reports some estimation results of that simulation study: the average and the root mean square error, RMSE (figures in brackets), of the estimates of \(\beta\), \(\ln \sigma ^2_{\eta }\) and \(\rho\); \(\ln \sigma ^2_{\eta }\) was preferred to \(\sigma ^2_{\eta }\) because the estimates of the former are closer to being normally distributed than those of the latter. We can note that the RMSEs decrease as the series length increases and, ceteris paribus, they decrease if the absolute value of \(\rho\) (i.e. \(\gamma\)) increases.

The same approach, based on simulations, is followed here to assess the finite sample properties of the IQML estimators (see Footnote 8). With the same settings adopted by Harvey and Shephard, 1000 series were simulated from Model (3), and on each series the model was estimated with the IQML method (Table 2). As in HS-QML, the RMSEs of the IQML estimators decrease as the strength of the correlation increases and as the length of the series increases. The RMSEs of the new method are slightly lower than those of the HS-QML method.

Table 1 Simulations results of the HS-QML method (\(n=1000\))
Table 2 Simulations results of the IQML method (\(n=1000\))

The normality of the IQML estimates has been tested using the Shapiro-Wilk and the Jarque-Bera tests. Table 3 reports the p-values of the tests on the estimates of \(\omega\), \(\gamma\), \(\beta\), and \(\ln \sigma ^2_{\eta }\), both with \(T=3000\) and \(T=6000\). With series of length \(T = 6000\), the estimates of \(\omega\), \(\gamma\) and \(\ln \sigma ^2_{\eta }\) can already be considered normally distributed at a significance level of 0.03. On the other hand, the sampling distribution of \({\hat{\beta }}\) cannot yet be approximated by the normal, because the simulated value, 0.975, is very close to the upper bound of the stationarity region for the volatility (\(\beta =1\)). As a result, the finite sample distribution of \({\hat{\beta }}\) has a slightly longer left tail than right tail, although the skewness decreases as the correlation \(\rho\) becomes more negative (Fig. 1). In this case the Gaussian approximation should become possible with series longer than those considered.
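For reference, the Jarque-Bera statistic is \(JB=\frac{n}{6}\left[ S^2+\frac{(K-3)^2}{4}\right]\), where \(S\) and \(K\) are the sample skewness and kurtosis of the \(n\) estimates, and under normality it is asymptotically chi-square with two degrees of freedom. The sketch below computes the statistic and the asymptotic p-value; it is a generic illustration, not the software actually used for Table 3, and it does not cover the Shapiro-Wilk test.

```python
import math

def jarque_bera(values):
    """Return the Jarque-Bera statistic and its asymptotic chi-square(2) p-value."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skew = m3 / m2 ** 1.5                    # sample skewness S
    kurt = m4 / m2 ** 2                      # sample kurtosis K
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    return jb, math.exp(-jb / 2.0)           # chi-square(2) survival function is exp(-x/2)
```

Applied to the 1000 simulated estimates of a parameter, a p-value above the chosen significance level would not reject normality.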

Table 3 IQML—p value of normality tests (\(n=1000\))
Fig. 1 Estimated density of \({\hat{\beta }}\) in the simulation study (\(n=1000\))

4.1 Goodness of the IQML filtered and smoothed volatilities

A simulation was conducted to evaluate the goodness of the filtered and smoothed volatilities provided by the IQML and HS-QML methods. To this end: (i) 100 series of length \(T=1000\) were simulated from model (3) with \(\beta =0.97\), \(\rho =-0.90\), \(\ln (\sigma ^2_{\eta })=-4\) (i.e. \(\gamma =-0.122\) and \(\sigma ^2_{\eta }=0.0183\)); (ii) on each series, the IQML and HS-QML methods were applied to filter and smooth the log-square volatility \(h_t\) using the estimated parameters provided by each method, then filtered and smoothed \(\sigma ^2_t\)s were obtained by exponential transformation of the corresponding log-square volatilities; (iii) the closeness of the filtered and smoothed \(\sigma ^2_t\)s to the simulated ones was measured on each series using the (average) loss functions MSE, MAE and Qlike (Patton 2011; Hansen and Lunde 2005):

$$\begin{aligned} \textrm{MSE}= & {} T^{-1}\sum _{t=1}^{T} (\sigma ^2_t - \exp {h}_t)^2 \end{aligned}$$
(14)
$$\begin{aligned} \textrm{MAE}= & {} T^{-1}\sum _{t=1}^{T} |\sigma ^2_t - \exp {h}_t| \end{aligned}$$
(15)
$$\begin{aligned} \mathrm {Qlike^*}= & {} T^{-1}\sum _{t=1}^{T}\left( \frac{\sigma ^2_t}{\exp {h}_t} - 1 -\ln \frac{\sigma ^2_t}{\exp {h}_t}\right) \end{aligned}$$
(16)

The loss function (16) is the variant of Qlike proposed by Patton (2011, p. 252) to make the function homogeneous. Formally, it is the relative difference of \(\sigma ^2_t\) from \(\exp {h}_t\) minus the corresponding log-difference. Given the relationship between the two differences, \(\mathrm {Qlike^*}\) grows as the gap between \(\sigma ^2_t\) and \(\exp h_t\) widens, and grows faster when \(\exp h_t\) falls below \(\sigma ^2_t\), i.e. when \(\sigma ^2_t\) is underestimated (Fig. 2). MSE and MAE, on the other hand, penalize overestimation and underestimation symmetrically, but MSE is more sensitive than MAE to large gaps between \(\sigma ^2_t\) and \(\exp h_t\) (due to sharp changes in volatility or outliers).
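The three loss functions (14)–(16) translate directly into code. In the sketch below, `sigma2` holds the reference variances \(\sigma ^2_t\) and `sigma2_hat` the filtered or smoothed values \(\exp h_t\); the function and argument names are illustrative. The final lines reproduce the asymmetry of \(\mathrm {Qlike^*}\) for \(\sigma ^2=2\) and estimates that miss by one unit in either direction.

```python
import math

def mse(sigma2, sigma2_hat):
    """Eq. (14): average squared difference."""
    return sum((a - b) ** 2 for a, b in zip(sigma2, sigma2_hat)) / len(sigma2)

def mae(sigma2, sigma2_hat):
    """Eq. (15): average absolute difference."""
    return sum(abs(a - b) for a, b in zip(sigma2, sigma2_hat)) / len(sigma2)

def qlike_star(sigma2, sigma2_hat):
    """Eq. (16): relative difference minus log-difference, averaged over t."""
    return sum(a / b - 1.0 - math.log(a / b) for a, b in zip(sigma2, sigma2_hat)) / len(sigma2)

under = qlike_star([2.0], [1.0])   # estimate below the reference value
over = qlike_star([2.0], [3.0])    # estimate above it by the same amount
print(under > over)                # prints True
```

MSE and MAE return the same value (1.0) for both cases, which illustrates their symmetry.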

Fig. 2 Behaviour of the loss functions given \(\sigma ^2=2\)

Table 4 reports the overall average of the loss functions for the methods (see Footnote 9), and the number of series, \(n^*\), where IQML yields the lowest loss. We can see that the values of all three loss functions are lower when the IQML method is used, both for filtered and for smoothed \(\sigma ^2_t\)s. These results seem to suggest that the IQML method provides filtered and smoothed \(\sigma ^2_t\)s that fit the actual ones better. In particular, the IQML method appears to better limit: (i) large gaps between actual volatility and filtered (smoothed) volatility; (ii) underestimation of actual volatility.

Finally, the \(n^*\) counter shows that the IQML method outperforms the HS-QML in a large percentage of simulated series.

Table 4 Overall average of the loss functions on the simulated series (\(n=100; T=1000\))

5 Empirical applications

The IQML method was applied to several financial series for the estimation of the asymmetric SV model (3). This section illustrates the application of the method to three series of financial indices:

  • NASDAQ Composite (IXIC)

  • DAX index (GDAXI)

  • CAC40 index (FCHI)

over the period from 04-01-2021 to 30-12-2022. The returns we consider are percentage log differences of the daily index closing values. The estimation results are compared with those obtained with the HS-QML method (see Footnote 10).
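The return definition used here can be sketched as follows: percentage log differences of consecutive closing values, followed by mean adjustment to obtain \(x_t\). The constant-mean adjustment is a simplifying assumption of this example (the text does not specify how \(\mu _t\) is modelled), and the function names are illustrative.

```python
import math

def pct_log_returns(closes):
    """r_t = 100 * (ln p_t - ln p_{t-1}) for a list of daily closing values."""
    return [100.0 * math.log(b / a) for a, b in zip(closes, closes[1:])]

def mean_adjust(returns):
    """x_t = r_t - mean(r), assuming a constant mean (a simplification)."""
    m = sum(returns) / len(returns)
    return [r - m for r in returns]
```

A series of \(T+1\) closing values yields \(T\) returns; any zero values of \(x_t\) would need special treatment before computing \(y_t=\ln x^2_t\).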

Table 5 reports the estimates of \(\omega , \gamma , \beta\), and the standard deviation of the exogenous innovation on \(h_t\) (i.e. \(\sigma _{\eta ^+}\)). The standard errors of the estimates are calculated on the basis of the “Outer Product of the Gradient” (OPG) method (see Footnote 11). The quasi log-likelihood, lly, of models (6) and (9) is also reported.

The two methods provide fairly close estimates, and the standard errors are also close. We note that the IQML standard errors are moderately smaller than the corresponding HS-QML standard errors in the IXIC and FCHI series, but not in GDAXI. The quasi log-likelihood values, lly, of the two methods are also very close in each series: the lly of IQML is slightly higher in IXIC and FCHI, and slightly lower in GDAXI.

Table 5 Comparison of the methods on three financial series

The goodness of the filtered and smoothed \(\sigma ^2_t\)s provided by the methods is assessed by the MSE (14) and \(\mathrm {Qlike^*}\) (16) loss functions computed using the squared return \(x^2_t\) as a proxy of \(\sigma ^2_t\) (see Table 6). As stated by Patton (2011), these loss functions are robust to noise when the volatility proxy is the squared return. Based on these results, neither method appears clearly better than the other: the IQML seems slightly better for the IXIC filtered \(\sigma ^2_t\)s and the GDAXI smoothed \(\sigma ^2_t\)s; the HS-QML seems slightly better for the IXIC smoothed \(\sigma ^2_t\)s; for the FCHI series, IQML seems preferable if the criterion is MSE, but HS-QML could be better if Qlike is the criterion. Nevertheless, all differences in the loss functions are too small to indicate a clear superiority of one method over the other. In the case of financial series, the choice of the most appropriate method should be made on a case-by-case basis, considering more than one criterion.

Table 6 MSE and Qlike losses of the filtered and smoothed \(\sigma ^2_t\)

6 Conclusion

The IQML method consists in iterating the basic QML method over an asymmetric SV model. The procedure is made possible by using a proxy of the return innovation \(\varepsilon _t\). This simple procedure allows the user to estimate the parameters of an SV model in which the return innovation and the volatility innovation are correlated. This goal can also be achieved with the modified QML method proposed by Harvey and Shephard (1996) (HS-QML), but the two methods provide different volatility predictors. The IQML predictor is conceptually similar to the EGARCH predictor, whereas the HS-QML predictor is more similar to the Threshold-ARCH predictor. A simulation study shows that the IQML filtered and smoothed squared volatilities generally fit the simulated \(\sigma ^2_t\)s better than those of the HS-QML method. On the other hand, the empirical applications suggest comparing the two methods, using loss functions, to identify the most suitable one for the series under study.

The simulation study also shows that IQML estimators exhibit decreasing RMSEs as the series length increases and finite sample distributions that can be approximated by the Gaussian distribution; the approximation improves as the sample size increases.

The IQML method can be viewed as a variant of the Expectation-Maximization (EM) algorithm proposed by Shumway and Stoffer (1982), although the algorithms differ in some methodological aspects as specified in Sect. 3.

Finally, the method can be applied to the case with Student’s t return innovations in order to treat returns with high kurtosis.