1 Introduction

Stochastic volatility (SV) models, initially proposed by Taylor (1982, 2008), constitute a well-known class of models for estimating volatility. These models have gained attention in the financial econometrics literature because of their flexibility in capturing the nonlinear behavior observed in financial return series. Unlike GARCH models, SV models define volatility as truly contemporaneous, so that their measure of volatility includes not only expected but also unexpected volatility. According to Melino and Turnbull (1990) and Carnero et al. (2004), an appealing aspect of the SV model is its close association with financial economic theories and its ability to capture, in a more appropriate way, the stylized facts often observed in daily series of financial returns.

The daily asymmetric relation between equity market returns and volatility has received a lot of attention in the financial literature; see, for instance, Black (1976), Campbell and Hentschel (1992), and Bekaert and Wu (2000). Asymmetric equity market volatility is important for at least three reasons. First, it is an important characteristic of market volatility dynamics, has asset pricing implications and is a feature of priced risk factors. Second, it plays an important role in risk prediction, hedging and option pricing. Finally, asymmetric volatility implies negatively skewed return distributions, i.e., it may help explain the market's exposure to large losses.

On the other hand, the relation between expected returns and expected volatility has been extensively examined in recent years. Overall, there appears to be stronger evidence of a negative relationship between unexpected returns and innovations to the volatility process, which French et al. (1987) interpreted as indirect evidence of a positive correlation between the expected risk premium and ex ante volatility. If expected volatility and expected returns are positively related and future cash flows are unaffected, an increase in expected volatility raises the required return, so the current stock index price should fall; conversely, small shocks to the return process lead to an increase in contemporaneous stock index prices. This theory, known as the volatility feedback theory, hinges on two assumptions: first, the existence of a positive relation between the expected components of the return and volatility processes and, second, volatility persistence. An alternative explanation for asymmetric volatility, in which causality runs in the opposite direction, is the leverage effect put forward by Black (1976), who asserted that a negative (positive) return shock leads to an increase (decrease) in the firm's financial leverage ratio, which has an upward (downward) effect on the volatility of its stock returns. However, French et al. (1987) and Schwert (1989) argued that leverage alone cannot account for the magnitude of the negative relationship. For example, Campbell and Hentschel (1992) found evidence of both volatility feedback and leverage effects, whereas Bekaert and Wu (2000) presented results suggesting that the volatility feedback effect dominates the leverage effect empirically.

The volatility of daily stock returns has frequently been estimated with SV models, but the results have relied on an extensive pre-modeling of these series to avoid the problem of simultaneous estimation of the mean and variance. Koopman and Uspensky (2002) introduced the SV in mean (SVM) model to deal with this problem. In the SVM setup, the unobserved volatility is incorporated as an explanatory variable in the mean equation of the returns under the normality assumption of the innovations. Moreover, they derived an exact maximum likelihood estimator based on Monte Carlo simulation methods. Abanto-Valle et al. (2012) extended the SVM model to the class of scale mixtures of normal distributions and developed a Markov chain Monte Carlo (MCMC) algorithm to sample the parameters and the log-volatilities from a Bayesian perspective. Recently, Abanto-Valle et al. (2021) applied Hamiltonian Monte Carlo (HMC) and Riemann manifold HMC (RMHMC) methods within an MCMC algorithm to update the log-volatilities and the parameters of the SVM model, respectively. However, the resulting MCMC algorithm has some undesirable features. In particular, the procedure is quite involved, requiring a large amount of computer-intensive simulation, and the computational cost increases rapidly with the sample size.

This paper has two objectives. The first is to offer an algorithm that requires less computational time for simulation and estimation, even as the sample size grows, than the MCMC algorithms proposed in Abanto-Valle et al. (2021). To this end, this article applies an alternative approximate Bayesian estimation method to the SVM model considered by Abanto-Valle et al. (2012) and Abanto-Valle et al. (2021). First, we approximate the likelihood function by integrating out the log-volatilities, as suggested by Langrock (2011), Langrock et al. (2012) and Abanto-Valle et al. (2017). Second, we obtain the maximum a posteriori (MAP) estimate using a numerical optimization routine. Third, we use importance sampling to draw from the posterior distribution of the parameters, with a multivariate normal importance density whose mean is the MAP estimate and whose covariance is the inverse of the Hessian matrix evaluated at the MAP estimate.

The second objective is to provide empirical evidence by estimating the SVM model using daily data for five Latin American stock markets. Time-varying volatility of developed economies' financial variables has been studied extensively; see, for instance, Liesenfeld and Jung (2000), Jacquier et al. (2004) and Abanto-Valle et al. (2010). However, empirical studies of the volatility characteristics of the financial markets in Latin America are scarce and far from thorough despite the growth of these markets in recent years; see Abanto-Valle et al. (2011), Rodríguez (2016, 2017a, 2017b), Lengua Lafosse and Rodríguez (2018), and Alanya and Rodríguez (2019). Moreover, Abanto-Valle et al. (2021) use HMC and RMHMC methods to analyze the SVM model using Latin American markets. For this reason, we perform a detailed empirical study of five Latin American indexes: MERVAL (Argentina), IBOVESPA (Brazil), IPSA (Chile), MEXBOL (Mexico) and IGBVL (Peru), in the context of the SVM model using the HMM approach. We also include the S&P 500 (USA), FTSE 100 (England), NIKKEI 225 (Japan) and SZSE (China) returns in order to perform some comparisons. All the results are in line with Abanto-Valle et al. (2021).

The remainder of this paper is organized as follows. Section 2 describes the SVM model, the approximate likelihood of the SVM model based on hidden Markov model (HMM) techniques, and the Bayesian inference procedure. In Sect. 3, we conduct a simulation study to verify the frequentist properties of the estimators and to compare the method, including its computational cost, with the ones used in Abanto-Valle et al. (2021). Section 4 is devoted to the application of the proposed methodology to five indexes of Latin American countries, the S&P 500 (USA), FTSE 100 (England), NIKKEI 225 (Japan) and SZSE (China). Finally, some concluding remarks and suggestions for future developments are given in Sect. 5.

2 The Stochastic Volatility in Mean (SVM) Model

The inclusion of variance as one of the determinants of the mean facilitates the examination of the relationship between returns and volatility. The SVM model is defined by

$$\begin{aligned} y_{t}=\; & {} \beta _{0}+\beta _{1}y_{t-1}+\beta _{2}e^{h_{t}}+e^{\frac{h_{t}}{2} }\epsilon _{t}, \end{aligned}$$
(1a)
$$\begin{aligned} h_{t+1}=\; & {} \mu +\phi (h_{t}-\mu )+\sigma \eta _{t}, \end{aligned}$$
(1b)

where \(y_{t}\) and \(h_{t}\) are, respectively, the compounded return and the log-volatility at time t, for \(t=1,\ldots ,T\). We assume that \(|\beta _1|<1\) and \(|\phi |<1\), i.e., that the returns and the log-volatility process are stationary, and that the initial value \(h_{1}\sim \mathcal {N}(\mu ,\frac{\sigma ^{2}}{1-\phi ^{2}})\). Furthermore, the innovations \(\epsilon _{t}\) and \(\eta _{t}\) are assumed to be mutually independent and normally distributed with mean zero and unit variance. The SVM model incorporates the unobserved volatility as an explanatory variable in the mean equation in such a way that the parameter \(\beta _{2}\) measures the volatility-in-mean effect. In other words, the estimate of \(\beta _{2}\) in the SVM model captures both the ex-ante relation between returns and volatility and the volatility feedback effect.
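As an illustration of the data-generating process in (1a)-(1b), the following R sketch simulates a path from the SVM model. The function name and the use of plain R (rather than the Rcpp implementation referred to in Sect. 3) are ours; the parameter values are those used later in the simulation study.

```r
# Minimal sketch: simulate T observations from the SVM model (1a)-(1b).
# Plain R code with illustrative names; parameter values match Sect. 3.
simulate_svm <- function(T, beta0, beta1, beta2, mu, phi, sigma, y0 = 0) {
  h <- numeric(T); y <- numeric(T)
  h[1] <- rnorm(1, mean = mu, sd = sigma / sqrt(1 - phi^2))  # stationary start
  y_prev <- y0
  for (t in 1:T) {
    y[t] <- beta0 + beta1 * y_prev + beta2 * exp(h[t]) + exp(h[t] / 2) * rnorm(1)
    y_prev <- y[t]
    if (t < T) h[t + 1] <- mu + phi * (h[t] - mu) + sigma * rnorm(1)
  }
  list(y = y, h = h)
}

set.seed(1)
sim <- simulate_svm(T = 6000, beta0 = 0.14, beta1 = 0.03, beta2 = -0.10,
                    mu = 0.3, phi = 0.98, sigma = 0.2, y0 = 0.2)
```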

2.1 Likelihood evaluation by iterated numerical integration and fast evaluation of the approximate likelihood using HMM techniques

To formulate the likelihood, we require the conditional pdfs of the random variables \(y_t\), given \(h_t\) and \(y_{t-1}\) (\(t=1,\ldots ,T\)), and of the random variables \(h_t\), given \(h_{t-1}\) (\(t=2,\ldots ,T\)). We denote these by \(p(y_t \mid y_{t-1}, h_t)\) and \(p(h_t \mid h_{t-1})\), respectively. The likelihood of the SVM model defined by equations (1a) and (1b) can then be derived as

$$\begin{aligned} \mathcal {L}= & {} \int \ldots \int p(y_1,\ldots ,y_T,h_1,\ldots ,h_T \mid y_0) d h_T \ldots d h_1\\= & {} \int \ldots \int p(y_1,\ldots ,y_T \mid y_0, h_1,\ldots ,h_T) p(h_1,\ldots ,h_T) d h_T\ldots d h_1\\= & {} \int \ldots \int p(h_1) p(y_1 \mid y_0,h_1) \prod _{t=2}^T p(y_t \mid y_{t-1},h_t)p(h_t \mid h_{t-1}) d h_T\ldots d h_1. \end{aligned}$$

Hence, the likelihood is a high-order multiple integral that cannot be evaluated analytically. Through numerical integration, using a simple rectangular rule based on m equidistant intervals, \(B_i = (b_{i-1},b_i)\), \(i=1, \ldots , m\), with midpoints \(b_i^*\) and length b, the likelihood can be approximated as follows:

$$\begin{aligned} \mathcal {L}\approx & {} \; b^{T} \sum _{i_{1}=1}^{m}\ldots \sum _{i_{T}=1}^{m} p(h_{1}=b_{i_{1}}^{*})\, p(y_{1}\mid y_{0},h_{1}=b_{i_{1}}^{*}) \nonumber \\{} & {} \times \prod _{t=2}^{T} p(y_{t}\mid y_{t-1},h_{t}=b_{i_{t}}^{*})\, p(h_{t}=b_{i_{t}}^{*}\mid h_{t-1}=b_{i_{t-1}}^{*}) = \mathcal {L}_{\text {approx}}. \end{aligned}$$
(2)

This approximation can be made arbitrarily accurate by increasing m, provided that the interval \((b_0,b_m)\) covers the essential range of the log-volatility process. We note that this simple midpoint quadrature is by no means the only way in which the integral can be approximated (cf. Langrock et al., 2012). The numerical evaluation of the approximate likelihood given in (2) would usually be computationally intractable, since it involves \(m^T\) summands. However, it can be evaluated numerically using well-established tools for HMMs, which are models that have exactly the same dependence structure as the stochastic volatility in mean model, but with a finite and hence discrete state space (cf. Langrock, 2011, Langrock et al., 2012). In the given scenario, the discrete states correspond to the intervals \(B_i\), \(i=1, \ldots , m\), into which the state space has been partitioned. A fundamental property of HMMs, which we exploit here, is that the likelihood can be evaluated efficiently using the so-called forward algorithm, a recursive scheme that traverses forward along the time series, updating the likelihood and the state probabilities in each step (Zucchini et al., 2016). Applying the forward algorithm yields a convenient closed-form matrix product expression for the likelihood, which for the SVM model is given by:

$$\begin{aligned} \mathcal {L}_{\text {approx}} = \varvec{\delta } \textbf{P}(y_1)\varvec{\Gamma }\textbf{P}(y_2)\varvec{\Gamma }\textbf{P}(y_3) \cdots \varvec{\Gamma }\textbf{P}(y_{T-1}) \varvec{\Gamma }\textbf{P}(y_T) \textbf{1}' \, . \end{aligned}$$
(3)

Note that, in equation (3), the \(m \times m\) matrix \(\varvec{\Gamma }=\bigl (\gamma _{ij}\bigr )\) plays the role of the transition probability matrix of an HMM, defined by \(\gamma _{ij}= p(h_t=b_{j}^* \mid h_{t-1}= b_{i}^*) \cdot b\), which is an approximation of the probability of the log-volatility process moving from some value in the interval \(B_i\) to some value in the interval \(B_j\); this conditional density is determined by (1b). The vector \(\varvec{\delta }\) is the analogue of the initial distribution of the Markov chain of an HMM. It is defined such that \(\delta _i\) is the density of the \(\mathcal {N}(\mu ,\frac{\sigma ^2}{1-\phi ^2})\) distribution (the stationary distribution of the log-volatility process) evaluated at \(b_i^*\), multiplied by b. Furthermore, \(\textbf{P}(y_t)\) is an \(m \times m\) diagonal matrix whose ith diagonal entry is \(p(y_t \mid y_{t-1},h_t=b^*_{i})\), determined by (1a). Finally, \(\textbf{1}'\) is a column vector of ones. Using the matrix product expression given in (3), the computational effort required to evaluate the approximate likelihood is linear in the number of observations T and quadratic in the number of intervals m used in the discretization. In other words, the likelihood can be calculated in a fraction of a second, even for large values of T and m. Furthermore, the approximation can be made arbitrarily accurate by increasing m (potentially widening the interval \([b_0,b_m]\)).

Although we are using the HMM forward algorithm to evaluate the (approximate) likelihood, the specifications of \(\varvec{\delta }\), \(\varvec{\Gamma }\) and \(\textbf{P}(y_t)\) given above do not define an HMM exactly. In general, the row sums of \(\varvec{\Gamma }\) will only be approximately equal to one, and the components of \(\varvec{\delta }\) will only approximately sum to one. If desired, this can be remedied by scaling each row of \(\varvec{\Gamma }\) and the vector \(\varvec{\delta }\) to total 1.
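To make the construction of \(\varvec{\delta }\), \(\varvec{\Gamma }\) and \(\textbf{P}(y_t)\) concrete, the following R sketch evaluates the approximate log-likelihood corresponding to (3). It rescales the forward probabilities at every step and accumulates the log-likelihood, a standard variant of the forward algorithm that avoids numerical underflow; function and variable names are illustrative and not taken from the authors' code.

```r
# Sketch: approximate log-likelihood (3) of the SVM model evaluated with the
# HMM forward algorithm. Forward probabilities are rescaled at every step and
# the log-likelihood accumulated, to avoid numerical underflow.
svm_loglik_hmm <- function(y, y0, beta0, beta1, beta2, mu, phi, sigma,
                           m = 200, b0 = -4, bm = 4) {
  b     <- (bm - b0) / m                    # interval length
  bstar <- b0 + (1:m) * b - b / 2           # midpoints b_i^*
  # Gamma: transition matrix implied by the AR(1) log-volatility equation (1b)
  Gamma <- outer(bstar, bstar, function(hprev, hnext)
             dnorm(hnext, mean = mu + phi * (hprev - mu), sd = sigma) * b)
  # delta: stationary N(mu, sigma^2 / (1 - phi^2)) density at the midpoints
  delta <- dnorm(bstar, mean = mu, sd = sigma / sqrt(1 - phi^2)) * b
  n      <- length(y)
  y_prev <- c(y0, y[-n])                    # y_{t-1}
  alpha  <- delta
  loglik <- 0
  for (t in 1:n) {
    # diagonal of P(y_t): conditional densities from the mean equation (1a)
    p_t <- dnorm(y[t],
                 mean = beta0 + beta1 * y_prev[t] + beta2 * exp(bstar),
                 sd   = exp(bstar / 2))
    alpha  <- if (t == 1) alpha * p_t else (alpha %*% Gamma) * p_t
    s      <- sum(alpha)                    # scale factor
    loglik <- loglik + log(s)
    alpha  <- alpha / s
  }
  loglik
}
```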

2.2 Bayesian Inference for the SVM Model

Because the original parameter space of the SVM model is constrained \((|\beta _1|<1,|\phi | < 1,\sigma > 0)\), we consider the following transformations of the parameters: \(\gamma =\log \bigl (\frac{1+\beta _1}{1-\beta _1}\bigr )\), \(\psi =\log \bigl (\frac{1+\phi }{1-\phi }\bigr )\), and \(\omega =\log (\sigma )\). Let \(\varvec{\theta }=(\beta _0,\gamma ,\beta _2,\mu ,\psi ,\omega )'\) and let \(p(\varvec{\theta })\) be the prior distribution of \(\varvec{\theta }\). Since the likelihood function is invariant under one-to-one transformations, from equation (3) we obtain the posterior distribution up to a normalization constant, namely:

$$\begin{aligned} p(\varvec{\theta }\mid y_0,\textbf{y}_T)\propto p(\varvec{\theta })\mathcal {L}_{\text {approx}}(\varvec{\theta }), \end{aligned}$$
(4)

where \(\textbf{y}_T=(y_1,\ldots ,y_T)'\). Suppose we wish to calculate an expectation \(E_{p(\varvec{\theta }\mid y_0,\textbf{y}_T)}[h(\varvec{\theta })]\), which can be computed using an importance density \(q(\varvec{\theta })\) as follows:

$$\begin{aligned} E_{p(\varvec{\theta }\mid y_0,\textbf{y}_T)}[h(\varvec{\theta })]= & {} \frac{\int h(\varvec{\theta })p(\varvec{\theta }\mid y_0,\textbf{y}_T)d\varvec{\theta }}{\int p(\varvec{\theta }\mid y_0,\textbf{y}_T) d\varvec{\theta }}\nonumber \\= & {} \frac{\int \frac{h(\varvec{\theta })p(\varvec{\theta }\mid y_0, \textbf{y}_T)}{q(\varvec{\theta })}q(\varvec{\theta })d\varvec{\theta }}{\int \frac{p(\varvec{\theta }\mid y_0,\textbf{y}_T)}{q(\varvec{\theta })}q(\varvec{\theta })d\varvec{\theta }}= \frac{E_{q(\varvec{\theta })}\biggr [h(\varvec{\theta })\varvec{\omega }(\varvec{\theta })\biggr ]}{E_{q(\varvec{\theta })}\biggr [\varvec{\omega }(\varvec{\theta })\biggr ]}, \end{aligned}$$
(5)

where \(\varvec{\omega }(\varvec{\theta })=\frac{p(\varvec{\theta }\mid y_0,\textbf{y}_T)}{q(\varvec{\theta })}\) and \(E_q[\cdot ]\) denotes an expected value with respect to the importance density \(q(\varvec{\theta })\). Therefore, a sample of independent draws \(\varvec{\theta }_1,\ldots ,\varvec{\theta }_N\) from \(q(\varvec{\theta })\) can be used to estimate \(E_{p(\varvec{\theta }\mid y_0,\textbf{y}_T)}[h(\varvec{\theta })]\) by

$$\begin{aligned} \bar{h}= & {} \frac{\sum _{i=1}^N h(\varvec{\theta }_i) \varvec{\omega }(\varvec{\theta }_i) }{\sum _{i=1}^N \varvec{\omega }(\varvec{\theta }_i)}. \end{aligned}$$
(6)

Using a single sample of draws \(\varvec{\theta }_i\) to estimate the ratio in (5) is more efficient than using two samples (one for the numerator and another for the denominator); see Chen et al. (2008). It follows from the strong law of large numbers that \(\bar{h}\rightarrow E_{p(\varvec{\theta }\mid y_0,\textbf{y}_T)}[h(\varvec{\theta })]\) almost surely as \(N \rightarrow \infty \); see Geweke (1989). In the same way, the variance of \(\bar{h}\) can be consistently estimated by \(\sum _{i=1}^N \varvec{\omega }(\varvec{\theta }_i)^2 [h(\varvec{\theta }_i)-\bar{h}]^2/[\sum _{i=1}^N \varvec{\omega }(\varvec{\theta }_i)]^2\).
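A minimal sketch of the self-normalized estimator (6) and its variance estimate, assuming a multivariate normal importance density, is given below; log_post (an unnormalized log-posterior) and h (a scalar-valued function of \(\varvec{\theta }\)) are placeholders to be supplied by the user, and the mvtnorm package provides the multivariate normal density and sampler.

```r
# Sketch of the self-normalized importance sampling estimator (6) and its
# variance estimate. 'log_post' is an unnormalized log-posterior and 'h' a
# scalar-valued function of theta; both are placeholders supplied by the user.
library(mvtnorm)

is_estimate <- function(log_post, theta_map, Sigma_map, h, N = 1000) {
  draws <- rmvnorm(N, mean = theta_map, sigma = Sigma_map)   # theta_1, ..., theta_N
  log_w <- apply(draws, 1, log_post) -
           dmvnorm(draws, mean = theta_map, sigma = Sigma_map, log = TRUE)
  w     <- exp(log_w - max(log_w))            # stabilized importance weights
  hvals <- apply(draws, 1, h)
  hbar  <- sum(w * hvals) / sum(w)            # equation (6)
  vhat  <- sum(w^2 * (hvals - hbar)^2) / sum(w)^2   # variance estimate
  c(estimate = hbar, variance = vhat)
}
```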

2.3 Forecasting

The HMM approach offers a convenient way to obtain forecast distributions for SVM models. In particular, the cumulative distribution function of the one-step-ahead forecast distribution on day \(t-1\) is easily obtained for the approximating HMM. This represents the conditional distribution of the return on day t, given all previous observations, and is given by:

$$\begin{aligned} F(y_t\mid y_0,y_1,\ldots ,y_{t-1})= & {} \sum _{i=1}^m \zeta _i F(y_t \mid y_{t-1},h_t=b_i^*) \end{aligned}$$
(7)

where \(\zeta _i\) is the ith entry of the vector \(\varvec{\alpha }_{t-1}\varvec{\Gamma }/(\varvec{\alpha }_{t-1}\textbf{1}')\), obtained from the forward probabilities:

$$\begin{aligned} \varvec{\alpha }_t = \varvec{\delta } \textbf{P}(y_1) \varvec{\Gamma }\textbf{P}(y_2)\varvec{\Gamma } \cdots \varvec{\Gamma }\textbf{P}(y_{t}) \end{aligned}$$

where \(\varvec{\delta }\), \(\textbf{P}(y_k)\) and \(\varvec{\Gamma }\) are defined as in Sect. 2.1. Moreover, the forecast distribution provided in equation (7) can be used for model checking through the examination of residuals (Kim et al., 1998). The one-step-ahead forecast pseudo-residual (or quantile residual) can be expressed as follows:

$$\begin{aligned} r_t=\; & {} \Phi ^{-1}( F(y_t\mid y_0,y_1,\ldots ,y_{t-1})) \end{aligned}$$
(8)

for \(t=1,\ldots ,T\). If the model is correctly specified, the \(r_t\) follow a standard normal distribution (Kim et al., 1998; Smith, 1985). Therefore, forecast pseudo-residuals can be employed to detect extreme values and to assess the adequacy of the model, for example by means of QQ-plots or formal normality tests.
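A sketch of how (7) and (8) can be computed under the approximating HMM is given below; it reuses the discretization of Sect. 2.1 and, for \(t=1\), uses the initial distribution \(\varvec{\delta }\) in place of the filtered probabilities. Names are illustrative.

```r
# Sketch: one-step-ahead forecast cdf (7) and pseudo-residuals (8) under the
# approximating HMM, reusing the discretization of Sect. 2.1. For t = 1 the
# initial distribution delta replaces the filtered probabilities.
svm_pseudo_residuals <- function(y, y0, beta0, beta1, beta2, mu, phi, sigma,
                                 m = 200, b0 = -4, bm = 4) {
  b     <- (bm - b0) / m
  bstar <- b0 + (1:m) * b - b / 2
  Gamma <- outer(bstar, bstar, function(hprev, hnext)
             dnorm(hnext, mean = mu + phi * (hprev - mu), sd = sigma) * b)
  delta <- dnorm(bstar, mean = mu, sd = sigma / sqrt(1 - phi^2)) * b
  n      <- length(y)
  y_prev <- c(y0, y[-n])
  r      <- numeric(n)
  filt   <- delta / sum(delta)               # filtered state probabilities
  for (t in 1:n) {
    zeta   <- if (t == 1) filt else as.vector(filt %*% Gamma)   # weights in (7)
    mean_t <- beta0 + beta1 * y_prev[t] + beta2 * exp(bstar)
    sd_t   <- exp(bstar / 2)
    Ft     <- sum(zeta * pnorm(y[t], mean = mean_t, sd = sd_t)) # forecast cdf (7)
    r[t]   <- qnorm(Ft)                                         # pseudo-residual (8)
    p_t    <- dnorm(y[t], mean = mean_t, sd = sd_t)
    filt   <- zeta * p_t / sum(zeta * p_t)   # filter update for the next step
  }
  r
}
```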

3 Simulation Study

To assess the performance of the methodology described in the previous section, we conducted some simulation experiments. All the calculations were performed using stand-alone code developed by the authors with the Rcpp interface within R. First, we simulated a data set comprising \(T=6000\) observations from the SVM model, specifying \(\varvec{\beta }=(0.14,0.03,-0.10)'\), \(\mu =0.3\), \(\phi =0.98\), \(\sigma =0.2\) and \(y_0=0.2\), which correspond to typical values found in daily series of returns; see for example Leão et al. (2017) and Abanto-Valle et al. (2017). The resulting transformed true parameter vector is \(\varvec{\theta }=(\beta _0,\gamma ,\beta _2,\mu ,\psi ,\omega )'=(0.14,0.06,-0.10, 0.30,4.5951,-1.6094)'\). Figure 1 shows the resulting artificial data set. We set the prior distributions as follows: \((\beta _0,\gamma ,\beta _2)\sim \mathcal {N}_3(\mathbf{0_3},100{\textbf{I}_3})\), \(\mu \sim \mathcal {N}(0,100)\), \(\psi \sim \mathcal {N}(4.5,100)\) and \(\omega \sim \mathcal {N}(-1.5,100)\), where \(\mathcal {N}_r(.,.)\) and \(\mathcal {N}(.,.)\) denote the \(r\)-variate and univariate normal distributions and \(\textbf{0}_r\) and \(\textbf{I}_r\) are the \(r\times 1\) vector of zeros and the \(r\times r\) identity matrix, respectively.
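For concreteness, a possible sketch of the MAP step under these priors is given below. It combines the normal log-priors on the transformed parameters with the approximate log-likelihood (the svm_loglik_hmm sketch of Sect. 2.1, applied here to the simulated data) and minimizes the negative log-posterior, so that the returned Hessian can be inverted directly to give the covariance of the importance density of Sect. 2.2. All names are ours.

```r
# Sketch: MAP estimation in the transformed parameter space
# theta = (beta0, gamma, beta2, mu, psi, omega)'. Relies on the
# svm_loglik_hmm() sketch of Sect. 2.1; prior standard deviations are sqrt(100) = 10.
log_posterior <- function(theta, y, y0, m = 100, b0 = -4, bm = 4) {
  beta1 <- tanh(theta[2] / 2)   # inverse of gamma = log((1 + beta1) / (1 - beta1))
  phi   <- tanh(theta[5] / 2)   # inverse of psi
  sigma <- exp(theta[6])        # inverse of omega
  prior_mean <- c(0, 0, 0, 0, 4.5, -1.5)
  sum(dnorm(theta, prior_mean, 10, log = TRUE)) +
    svm_loglik_hmm(y, y0, beta0 = theta[1], beta1 = beta1, beta2 = theta[3],
                   mu = theta[4], phi = phi, sigma = sigma,
                   m = m, b0 = b0, bm = bm)
}

# minimize the negative log-posterior; the Hessian then gives the curvature
# needed for the importance density of Sect. 2.2
fit <- optim(par = c(0.1, 0, -0.05, 0.3, 4.5, -1.5),
             fn = function(theta, ...) -log_posterior(theta, ...),
             y = sim$y, y0 = 0.2, method = "BFGS", hessian = TRUE)
theta_map <- fit$par              # maximum a posteriori
Sigma_map <- solve(fit$hessian)   # inverse Hessian at the MAP
```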

Fig. 1

Simulated data set from the SVM model with \(\varvec{\beta }=(0.14,0.03,-0.10)'\), \(\mu =0.3\), \(\phi =0.98\) and \(\sigma =0.2\). The vertical dotted lines (red) indicate the sample sizes \(T=1500\), 3000 and 6000, respectively

In order to investigate the influence of the choice of m on the accuracy of the likelihood approximation in the posterior distribution, and of the sample size T on the computing time, we fitted the SVM model using \(m=50, 100, 150, 200\) (i.e., different levels of accuracy) and \(b_{m}=-b_{0}=4\) to subsamples of length \(T=1500, 3000, 6000\) of the original simulated series. Table 1 reports the maximum a posteriori (MAP) results. In general, we observe that all MAP estimates approach their true values as we go from \(T=1500\) to \(T=6000\). This convergence is faster and clearer for \(\psi \) and \(\omega \). In the case of \(\mu \), the MAP estimates differ from the true value for \(T=1500\) observations, but they converge to it rapidly as the sample size increases. In particular, the MAP estimate of \(\beta _2\) approaches its true value when \(T=6000\). The log-likelihood value is fairly stable for any value of m and sample size. Most of the MAP estimates obtained by numerical maximization become stable for values of m around 100, for all the sample sizes considered here. However, when \(T=6000\), stabilization is reached only for \(m=150\) or more. Therefore, we recommend using \(m=150\) or 200 in empirical applications.

Next, the multivariate normal distribution whose mean is the MAP estimate and whose covariance matrix is the inverse of the Hessian evaluated at the MAP is used as the importance density. We draw a sample of size 1000 using sampling importance resampling. Based on it, we calculate the expected value (posterior mean) and the standard deviation on the original scale using equation (6). The results are reported in Table 2. In all cases, the 95% posterior credibility intervals contain the true values of the parameters. For a time series of size 6000, the proposed methodology takes 174.18 seconds to simulate and report the results, which makes our proposal useful in real-time applications.
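A compact sketch of this step, reusing the hypothetical objects theta_map, Sigma_map, log_posterior and sim introduced in the sketches above (and the mvtnorm package loaded in Sect. 2.2), could be:

```r
# Sketch: sampling importance resampling (SIR) from the multivariate normal
# importance density, followed by posterior summaries on the original scale.
set.seed(2)
N     <- 1000
draws <- rmvnorm(N, mean = theta_map, sigma = Sigma_map)
log_w <- apply(draws, 1, log_posterior, y = sim$y, y0 = 0.2) -
         dmvnorm(draws, mean = theta_map, sigma = Sigma_map, log = TRUE)
w     <- exp(log_w - max(log_w))
idx   <- sample.int(N, size = N, replace = TRUE, prob = w)    # resampling step
post  <- draws[idx, ]

# back-transform to the original scale (beta0, beta1, beta2, mu, phi, sigma)
orig <- cbind(beta0 = post[, 1], beta1 = tanh(post[, 2] / 2), beta2 = post[, 3],
              mu = post[, 4], phi = tanh(post[, 5] / 2), sigma = exp(post[, 6]))
round(rbind(mean = colMeans(orig), sd = apply(orig, 2, sd)), 4)
```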

We also investigated the influence of the choice of \(b_{0}\) and \(b_{m}\) (the results are not presented here for space reasons). Overall, we observe that the estimator performance is not affected much. However, when these values are chosen either too small (not covering the support of the log-volatility process, e.g., \(b_{m}=-b_{0}=2\)) or too large (leading to a partition of the support into unnecessarily wide intervals and poor approximation of the likelihood, e.g. \(m=50\) and \(b_{m}=-b_{0}=15\)), the estimator performance could be affected. In practice, it can easily be checked post-hoc if the chosen range, specified by \(b_{0}\) and \(b_{m}\), is adequate, by investigating the stationary distribution of the fitted log-volatility process.

The second simulation experiment studies the properties of the estimators of the SVM model parameters. We generated 300 data sets from the SVM model, specifying \(\varvec{\beta }=(0.14,0.03,-0.10)'\), \(\phi =0.98\), \(\sigma =0.2\) and \(\mu =0.30\). For each generated data set, we fitted the SVM model using \(m=50,100,150,200\) and \(b_{m}=-b_0=4\), for \(T=1500\), \(T=3000\) and \(T=6000\). Tables 3, 4 and 5 report the sample mean, the mean relative bias (MRB), the mean relative absolute deviation (MRAD) and the mean squared error (MSE) of the parameter estimates for \(T=1500, 3000, 6000\), respectively.

For all the sample sizes, i.e., \(T=1500\), \(T=3000\) and \(T=6000\), higher values of MRAD are found for the estimators of \(\beta _1\) and \(\mu \), while none of the other estimators exhibits a notable bias. The bias found for \(\mu \) does not substantially affect the resulting model or its forecasting performance, since it merely indicates a minor shift of the volatility process. It is important to stress that the MSEs are smaller for the larger sample sizes, as expected. The results obtained for \(m=50\) are similar to those using higher values of m, indicating that even a relatively coarse discretization already yields an acceptable approximation of the likelihood.
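The summary measures reported in Tables 3, 4 and 5 can be obtained from the replicate estimates as in the following sketch; the definitions of MRB, MRAD and MSE below reflect our reading of these standard measures, and the object names are ours.

```r
# Sketch: Monte Carlo summary measures over replicate fits.
# est:   matrix of parameter estimates, one row per replicate, one column per parameter
# truth: vector of true parameter values (same column order as est)
mc_summary <- function(est, truth) {
  err     <- sweep(est, 2, truth, "-")         # estimation errors
  rel_err <- sweep(err, 2, truth, "/")         # relative errors
  data.frame(mean = colMeans(est),
             MRB  = colMeans(rel_err),         # mean relative bias
             MRAD = colMeans(abs(rel_err)),    # mean relative absolute deviation
             MSE  = colMeans(err^2))           # mean squared error
}
```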

Overall, it can be concluded that using the HMM machinery to numerically maximize the approximate posterior distribution of the SVM model leads to good estimator performance at a modest computational cost.

Table 1 SVM Model, simulated data set: Maximum a posteriori estimates of the parameters and computing times in seconds for the HMM method (\(b_{m}=-b_{0}=4\)).
Table 2 SVM Model estimation results using HMM approach, simulated data set.
Table 3 SVM Model: Simulation study results based on 300 replicates using the HMM method (\(b_{m}=-b_{0}=4\) and \(T=1500\)).
Table 4 SVM Model: Simulation study results based on 300 replicates using the HMM method (\(b_{m}=-b_{0}=4\) and \(T=3000\)).
Table 5 SVM Model: Simulation study results based on 300 replicates using the HMM method (\(b_{m}=-b_{0}=4\) and \(T=6000\)).

4 Empirical Application

We consider the daily closing prices of five Latin American stock markets: MERVAL (Argentina), IBOVESPA (Brazil), IPSA (Chile), MEXBOL (Mexico) and IGBVL (Peru). We use the S&P 500 (USA), FTSE 100 (England), NIKKEI 225 (Japan) and SZSE (China) in order to compare their behavior with that of the Latin American stock markets. The data sets were obtained from the Yahoo Finance website (http://finance.yahoo.com). The period of analysis runs from January 6, 1998, until December 30, 2016. Stock returns are computed as \( y_{t}=100\times (\log P_{t}-\log P_{t-1})\), where \(P_{t}\) is the (adjusted) closing price on day t.
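For instance, given a vector of adjusted closing prices ordered by trading day, the returns can be computed as in the following sketch (the data frame and column name in the usage comment are hypothetical):

```r
# Sketch: percentage log returns from a vector of (adjusted) closing prices,
# assumed to be ordered by trading day.
log_returns <- function(prices) 100 * diff(log(prices))

# e.g., if 'sp500' were a data frame with a (hypothetical) column 'adj_close':
# y <- log_returns(sp500$adj_close)
```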

Table 6 shows the number of observations and summary descriptive statistics. The sample size differs across countries due to holidays and stock market non-trading days. According to Table 6, the IGBVL, S&P 500, FTSE 100, NIKKEI 225 and SZSE returns are negatively skewed, whereas the rest are positively skewed. The IGBVL returns are the most negatively skewed, with \(-0.3915\), and the IBOVESPA returns the most positively skewed, with 0.5313. Regarding the kurtosis, all the daily returns are leptokurtic (all kurtosis coefficients are higher than 3). Brazil, Peru and Chile are the markets with the highest degree of kurtosis, with the USA close to Chile's value. The SZSE returns show the lowest kurtosis. Although there are large differences between the minimum and maximum values, the most extreme values correspond to Argentina and Brazil.

We further observe that the IGBVL and IPSA returns show the highest level of first-order autocorrelation. These values decrease quickly at higher orders of autocorrelation. For returns, high first-order autocorrelation reflects the effects of non-synchronous or thin trading. The squared returns show a high level of first-order autocorrelation, which can be seen as an indication of volatility clustering. We further observe that higher-order autocorrelations of the squared returns remain high and decrease slowly. The Q(12) test statistic, a joint test of the hypothesis that the first twelve autocorrelation coefficients are equal to zero, indicates that this hypothesis must be rejected at the 5% significance level for all returns and squared returns series.

We set the prior distributions as follows: \((\beta _0,\gamma ,\beta _2)\sim \mathcal {N}_3(\mathbf{0_3},100{\textbf{I}_3})\), \(\mu \sim \mathcal {N}(0,100)\), \(\psi \sim \mathcal {N}(4.5,100)\) and \(\omega \sim \mathcal {N}(-1.5,100)\), where \(\mathcal {N}_r(.,.)\) and \(\mathcal {N}(.,.)\) denote the \(r\)-variate and univariate normal distributions and \(\textbf{0}_r\) and \(\textbf{I}_r\) are the \(r\times 1\) vector of zeros and the \(r\times r\) identity matrix, respectively. We apply the procedure described in Sect. 2 with \(b_m=-b_0=4\) and \(m=200\) to ensure numerical stability of the results. We use the multivariate normal distribution with mean and covariance given by the mode and the inverse of the Hessian matrix evaluated at the mode as the importance density. For comparison, we use the MCMC procedure based on the HMC and RMHMC algorithms described in Abanto-Valle et al. (2021), with 30,000 iterations, discarding the first 10,000 as burn-in and storing only every 10th value of the chain.

Table 7 summarizes these results for the five Latin American stock markets and the S&P 500, and Table 8 summarizes the results for the FTSE 100, NIKKEI 225 and SZSE returns using our proposal and the MCMC methods of Abanto-Valle et al. (2021). It is important to note that all the estimates are similar for all markets considered here; the estimated values obtained with the two approaches are practically identical. Volatility persistence estimates (values of \(\phi \)) are all highly significant and quite similar among all markets (ranging from 0.9523 for Argentina to 0.9851 for Mexico) with the HMM approach. The MEXBOL, IBOVESPA, FTSE 100 and S&P 500 are more persistent than the other markets. The main difference between the two sets of results is for Brazil, where the posterior mean of \(\phi \) is 0.9841 with the HMM approach and 0.9730 with MCMC.

The posterior mean estimates of \(\sigma \) are similar for all returns, ranging from 0.1330 to 0.2757. The highest value, 0.2757 for the IGBVL, together with the estimate of \(\phi \), indicates that the IGBVL is the most volatile stock market index in the region. Regarding the posterior mean of \(\mu \), we find that the estimates are statistically significant for the MERVAL, IBOVESPA, IPSA, NIKKEI 225 and SZSE markets. For the MEXBOL, IGBVL, S&P 500 and FTSE 100 markets, the parameter \(\mu \) is not significant because the credibility interval contains the null value.

We observe that, with the exception of the SZSE returns, the posterior mean of the parameter \(\beta _{0}\) is always positive and statistically significant for all series of returns. The value of \(\beta _{1}\), which measures the autocorrelation of returns, is small, as expected, and very similar to the first-order autocorrelation coefficients reported in Table 6. The estimates of \(\beta _{1}\) are statistically significant for Argentina, Chile, Mexico, Peru and the USA, and not significant for Brazil, England, Japan and China. Although in the cases of Chile and Peru these values are 0.1876 and 0.1873, respectively, they still indicate weak persistence with rapid mean reversion.

Regarding the parameter of interest, \(\beta _{2}\), the estimate is most negative for the USA, Brazil and Chile. Intermediate values are observed for Argentina and Mexico, while Peru presents the smallest value in absolute terms. England, Japan and the USA have similar negative values, which is interesting since they represent the three most developed markets in the sample, with high transaction volumes. Moreover, while most countries have a credibility interval that excludes zero, this is not the case for Peru and China, so it is difficult to argue for an uncertainty effect in these markets. It is important to note that the right end of the credibility interval is very close to zero in all markets except the USA, England and Japan. Therefore, the posterior mean of the \(\beta _{2}\) parameter, which measures both the ex ante relationship between returns and volatility and the volatility feedback effect, is negative for all series and statistically significant for all of them with the exception of Peru and China. These findings are very similar to, and consistent with, those found by Abanto-Valle et al. (2021).

Following Koopman and Uspensky (2002), the negative volatility feedback effect dominates the positive effect that links returns with expected volatility. Our estimates are more negative than those of Koopman and Uspensky (2002), where the hypothesis that \(\beta _{2}=0\) could never be rejected at the conventional 5% significance level. Therefore, the volatility feedback effect is clearly dominant in our results (except for Peru and China) in comparison with those of Koopman and Uspensky (2002). Our results are similar to those found using the Hamiltonian Monte Carlo (HMC) and Riemann manifold HMC methods of Abanto-Valle et al. (2021). These results confirm the hypothesis that investors require higher expected returns when unanticipated increases in future volatility are highly persistent.

It is important to stress that these findings are consistent with higher values of \(\phi \) combined with larger negative values of the in-mean parameter; see the cases of Brazil and the USA, for example. We thus have indirect evidence of a positive intertemporal relation between expected excess market returns and their volatility, which is one of the assumptions underlying the volatility feedback hypothesis.

Figures 2 and 3 show, for all the series analyzed, the returns in absolute value (full gray line), the estimates of \(\text {e}^{\frac{h_{t}}{2}}\) from the SVM and SV models obtained with the HMM approximation (dotted red and green lines) and the smoothed mean of \(\text {e}^{\frac{h_{t}}{2}}\) from the SVM and SV models obtained with MCMC (black and blue solid lines). Several comments can be extracted from both figures: (i) the results from the two methods are very similar, since they all follow the evolution of the returns in absolute value; (ii) there are high-volatility clusters that are common to all markets and correspond to international financial crises with a global effect, such as the Russian/Brazilian crisis (1998) and the great global financial crisis of 2007–2008; (iii) some other, smaller high-volatility clusters correspond to domestic events that have altered certain markets but with a limited effect at the national level. For example, in the case of the IGBVL (Peru), there are periods of financial stress in 2006, 2011 and, somewhat smaller, in 2016, all of which correspond to presidential elections in which the possibility of success for candidates from the political left affected the stock market. In the case of Argentina, there are small clusters of high volatility linked to problems with different exchange rate adjustment systems. In the case of Brazil, except for the 1998 crisis and the global financial crisis, the remaining periods seem calm, as in Chile. In the case of the S&P 500, a long cluster of high volatility can be observed between 1998 and 2001 and a shorter one in 2011, although of reduced size compared to the global financial crisis. The period between 1998 and 2001 is related to the dot-com bubble that affected the S&P 500, the FTSE 100 and, to a lesser extent, the NIKKEI 225 and SZSE markets.

From a practitioner's viewpoint, implementing the MCMC procedure developed in Abanto-Valle et al. (2021) requires writing about 800 lines of C++ code. In stark contrast, the approach discussed in the present paper is easily implemented using Rcpp inside R with fewer than 200 lines of code. In all the applications considered here, the MCMC procedure takes about 40 min and our HMM procedure about 20 min. Therefore, from the practical viewpoint, there is substantial merit in considering the HMM approach as an alternative to MCMC schemes.

To compare the in-sample fit of the SVM model, we also fit the basic SV model to each of the indexes considered in this study. We then calculate the log-predictive score (LPS, Delatola and Griffin, 2011) and the deviance information criterion (DIC, Spiegelhalter et al., 2002). In both cases, the best model is the one with the smallest LPS (DIC) value. Table 9 reports the LPS and DIC values for all the indexes considered. According to both criteria, the SVM model provides the best fit for each data set.

Next, we perform an out-of-sample analysis of forecast performance for the models covered in Table 9. We consider a validation sample from January 3, 2017, until January 31, 2019. The Jarque-Bera test was applied to the pseudo-residuals obtained from the different indexes under the SV and SVM models; the corresponding \(p\)-values are listed in Table 10. The Jarque-Bera test does not reject the hypothesis of normality of the residuals under the two models at the 5% level for the MERVAL, IBOVESPA, IPSA, MEXBOL and IGBVL, and rejects it for the S&P 500, FTSE 100, NIKKEI 225 and SZSE. Figures 4 and 5 show the QQ-plots of the residuals and reveal a poor fit in the left tail for all the series. The indicated misspecification could be caused by correlation between the innovations of the returns and the volatility, or by fat tails, neither of which is modeled in this paper since they are not part of its central objective; both could be considered as future avenues of research.

The previous discussion of plots and tests is valuable for evaluating the overall fit of a model. However, when it comes to assessing the risk associated with a share or index, particular attention is given to the extreme left tail of the forecast distribution. This tail region is crucial for determining the value-at-risk (VaR), which represents the maximum potential loss of a portfolio at a given confidence level over a specified period. For instance, the 1-day 1% VaR corresponds to the 0.01-quantile of the one-day-ahead forecast distribution (see Table 11). An exception is said to have occurred when the return falls below that quantile. If the forecasting model is accurate, the number of exceptions, denoted as X, over n days follows a Binomial\((n, \alpha )\) distribution, where \(\alpha \) is the nominal tail probability (e.g., 1% or 5%). This distributional result allows for backtesting, which involves comparing the observed number of exceptions with the corresponding theoretical distribution to assess the adequacy of the time series model. An established approach to testing the accuracy of VaR forecasts is to evaluate the violation rate, estimated as \( \hat{\alpha }= X/n\). To examine the accuracy of VaR forecasts, we adopt the unconditional coverage test introduced by Kupiec (1995), which employs a likelihood ratio test with a test statistic distributed as \(\chi ^2_1\):

$$\begin{aligned} LR_{uc}=2\left\{ \log \left[ {\hat{\alpha }}^x (1-{\hat{\alpha }})^{n-x}\right] -\log \left[ \alpha ^x (1-\alpha )^{n-x}\right] \right\} . \end{aligned}$$

The null hypothesis states that the observed violation rate is equal to the predetermined nominal probability \(\alpha \). Further details can be found in Kupiec (1995). Based on the results of the unconditional coverage test, we do not reject the null hypothesis, indicating that the achieved violation rate is equal to 5% for all returns under the SVM model, except for the FTSE 100 index. Similarly, for the SV model, we do not reject the null hypothesis for all indexes, except for the IBOVESPA index. However, we reject the null hypothesis that the achieved violation rate is 1% only for the S&P 500 index under both the SV and SVM models.
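The unconditional coverage test is straightforward to implement; the sketch below computes the LR statistic and its \(p\)-value for x observed exceptions in n one-day-ahead forecasts. The numbers in the usage example are illustrative only and are not taken from Table 11.

```r
# Sketch: Kupiec (1995) unconditional coverage test.
# x: number of VaR exceptions observed in n one-day-ahead forecasts
# alpha: nominal VaR level (e.g., 0.01 or 0.05); assumes 0 < x < n
kupiec_uc <- function(x, n, alpha) {
  ahat <- x / n                                   # observed violation rate
  lr   <- 2 * (x * log(ahat) + (n - x) * log(1 - ahat) -
               x * log(alpha) - (n - x) * log(1 - alpha))
  c(violation_rate = ahat, LR = lr,
    p_value = pchisq(lr, df = 1, lower.tail = FALSE))
}

# illustrative numbers only: 30 exceptions in 500 forecasts at the 5% level
kupiec_uc(x = 30, n = 500, alpha = 0.05)
```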

Table 6 Summary statistics for daily stock returns data
Table 7 Estimation of the SVM Model using HMC and HMM machinery with Importance Sampling
Table 8 Estimation of the SVM Model using HMC and HMM machinery with Importance Sampling
Table 9 Model comparison criteria.
Table 10 p-values of Jarque-Bera test applied to one-step-ahead forecast pseudo residuals
Table 11 Violation rate (VR) as a percentage in n one-day-ahead forecasts, and \(p\)-values of the unconditional coverage test at the 1% and 5% levels

5 Discussion

This paper has two objectives. The first is to show gains in computational time using HMM methods versus the MCMC methods developed and used, for example, in Abanto-Valle et al. (2021). The second objective is to estimate empirically the effect of volatility on the mean (the SVM model) using five Latin American stock markets and the S&P 500, FTSE 100, NIKKEI 225 and SZSE markets, comparing the results with those obtained by Koopman and Uspensky (2002) and Abanto-Valle et al. (2021). Regarding the first goal, this article introduces approximate Bayesian inference, via importance sampling, for the SVM model proposed by Koopman and Uspensky (2002), with the likelihood function approximated using HMM machinery. The empirical application reveals similar results between our proposed methodology and the MCMC methods used by Abanto-Valle et al. (2021). However, our proposal is less time-consuming, which is very important in real-time applications.

The SVM model allows us to investigate the dynamic relationship between returns and their time-varying volatility. Therefore, concerning the second goal, we illustrated our methods with an empirical application to five Latin American return series and the S&P 500, FTSE 100, NIKKEI 225 and SZSE returns. The \(\beta _{2}\) estimate, which measures both the ex-ante relationship between returns and volatility and the volatility feedback effect, was negative and significant for all the indexes considered here except the IGBVL and SZSE. The results are in line with those of French et al. (1987), who found a similar relationship between unexpected volatility dynamics and returns, and they confirm the hypothesis that investors require higher expected returns when unanticipated increases in future volatility are highly persistent. This is consistent with our findings of higher values of \(\phi \) combined with larger negative values of the in-mean parameter.

Future research will consider extending the model and the algorithm to include time-varying parameters, including the in-mean parameter, which would allow a comparison with other algorithms, such as the one proposed in Chan (2017). Another extension is to incorporate heavy tails, as in Abanto-Valle et al. (2012), or skewness and heavy tails simultaneously, as in Leão et al. (2017).

Fig. 2

MERVAL, IBOVESPA, IPSA, MEXBOL, IGBVL and S&P 500 returns data sets: absolute returns (full gray line), \(\text {e}^{\frac{h_t}{2}}\) estimate from the SVM model using HMM machinery (dotted red line), \(\text {e}^{\frac{h_t}{2}}\) estimate from the SV model using HMM machinery (dotted green line), posterior smoothed mean of \(\text {e}^{\frac{h_t}{2}}\) from the SVM model using MCMC (black solid line) and posterior smoothed mean of \(\text {e}^{\frac{h_t}{2}}\) from the SV model using MCMC (blue solid line)

Fig. 3

FTSE 100, NIKKEI 225 and SZSE returns data sets: absolute returns (full gray line), \(\text {e}^{\frac{h_t}{2}}\) estimate from the SVM model using HMM machinery (dotted red line), \(\text {e}^{\frac{h_t}{2}}\) estimate from the SV model using HMM machinery (dotted green line), posterior smoothed mean of \(\text {e}^{\frac{h_t}{2}}\) from the SVM model using MCMC (black solid line) and posterior smoothed mean of \(\text {e}^{\frac{h_t}{2}}\) from the SV model using MCMC (blue solid line)

Fig. 4

QQ-plots of the forecast pseudo-residuals for SV (left) and SVM (right) models for the MERVAL, IBOVESPA, IPSA, MEXBOL and IGBVL, respectively

Fig. 5

QQ-plots of the forecast pseudo-residuals for the SV (left) and SVM (right) models for the S&P 500, FTSE 100, NIKKEI 225 and SZSE, respectively