1 Introduction

Over the past three decades, the stochastic volatility (SV) models introduced by Taylor (1982) have played an important role in modelling financial time series characterized by time-varying volatility. This class of models is often viewed as a better formal alternative to ARCH-type models because the volatility is itself driven by an exogenous innovation, a fact that is consistent with finance theory, although it makes the model relatively more difficult to estimate. Several extensions of the original SV formulation have been proposed in the literature to account for further volatility features such as long memory, simultaneous dependence, excess kurtosis, the leverage effect and changes in regime (e.g. Harvey et al. 1994; Ghysels et al. 1996; Breidt 1997; Breidt et al. 1998; So et al. 1998; Chib et al. 2002; Carvalho and Lopes 2007; Omori et al. 2007; Nakajima and Omori 2009). However, most of the proposed formulations involve time-invariant volatility parameters and hence cannot meaningfully describe time series whose volatility structure changes over time, in particular volatility displaying a stochastic periodic pattern that cannot be accounted for by time-invariant SV-type models.

In order to describe periodicity in the volatility, Tsiakas (2006) proposed various interesting and parsimonious time-varying stochastic volatility models in which the volatility parameters are expressed as deterministic periodic functions of time with appropriate exogenous variables. The proposed models, called “periodic stochastic volatility” (PSV), have been successfully applied to model the evolution of daily S&P 500 returns. This is evidence that a periodically changing structure may characterize time series volatility. However, the PSV formulations are by definition especially well adapted to a kind of deterministic periodicity in the second moment and hence may neglect a possible stochastic periodicity in these moments (see e.g. Hylleberg et al. (1990) and Ghysels and Osborn (2001) for the difference between deterministic and stochastic periodicity). A complementary approach which seems appropriate for capturing stochastic periodicity in the volatility is to consider a linear time-invariant representation for the volatility equation involving seasonal lags, leading to a seasonal SV specification (see e.g. Ghysels et al. 1996). However, because of the time-invariance of the volatility parameters, the seasonal SV model may be too restrictive in representing periodicity, and a model with periodically time-varying parameters seems more relevant. Indeed, as pointed out by Bollerslev and Ghysels (1996, p. 140), for many financial time series encountered in practice, neglecting periodic time-variation in the corresponding volatility equation gives rise to a loss in forecast efficiency, which is more severe in the GARCH model than in linear ARMA models. This motivated Bollerslev and Ghysels (1996) to propose the periodic GARCH (PGARCH) formulation, in which the parameters vary periodically over time in order to capture a stochastic periodicity pattern in the conditional second moment. At present the PGARCH model is among the most important models for describing periodic time series volatility (see e.g. Bollerslev and Ghysels 1996; Taylor 2006; Koopman et al. 2007; Osborn et al. 2008; Regnard and Zakoïan 2011; Sigauke and Chikobvu 2011; Aknouche and Al-Eid 2012). However, despite the recognized relevance of the PGARCH model, an alternative periodic SV model for stochastic periodicity is in fact needed for many reasons. First, it is well known that an SV-type model is more flexible than a GARCH-type model because the volatility in the latter is only driven by the past of the observed process, which constitutes a serious limitation. Second, compared to SV-type models, the probability structure of PGARCH models is relatively more complex to obtain (Aknouche and Bibi 2009). Finally, compared to the PGARCH, the PAR-SVs model proposed below easily allows simple multivariate generalizations.

In this paper we propose to model stochastic periodicity in the volatility through a model that generalizes the standard SV equation so that the parameters vary periodically over time. Thus, in the proposed model, termed periodic autoregressive stochastic volatility (PAR-SVs), the log-volatility process follows a first-order periodic autoregression and may be generalized so as to have any linear periodic representation. This model may be seen as an extension of the models of Tsiakas (2006) that includes a periodic feature in the autoregressive dynamics of the log-volatility equation. The structure and probability properties of the proposed model, such as periodic stationarity, the autocovariance structure and the relationship with multivariate stochastic volatility models, are first studied. In particular, periodic ARMA (PARMA) representations for the logarithm of the squared PAR-SVs process are proposed. Then, parameter estimation is conducted via the quasi-maximum likelihood (QML) method, properties of which are discussed. In addition, a Bayesian estimation approach using Markov chain Monte Carlo (MCMC) techniques is also considered. Specifically, a Gibbs sampler is used to estimate the joint posterior distribution of the parameters and the augmented volatility, calling on the Griddy Gibbs procedure when estimating the conditional posterior distribution of the augmented parameters. On the other hand, selection of the period of the PAR-SVs model is carried out using the (conditional) deviance information criterion (DIC). Simulation experiments are undertaken to assess the finite-sample performance of the QMLE and the Bayesian Griddy Gibbs methods. Moreover, empirical applications to modeling series of daily, quarterly and monthly S&P 500 returns are conducted in order to appreciate the usefulness of the proposed PAR-SVs model. In the particular daily return case, a variant of the PAR-SVs model with missing values, dealing with the “day-of-the-week” effect, is applied.

The rest of this paper proceeds as follows. Section 2 proposes the PAR-SVs model and studies its main probabilistic properties. In Sect. 3, the quasi-maximum likelihood method via prediction error decomposition and Kalman filtering is adopted. Moreover, a single-move Bayesian approach by means of the Griddy Gibbs (BGG) sampler is proposed. In particular, some MCMC diagnostic tools are presented and period selection in PAR-SVs models is carried out using the DIC. Through a simulation study, Sect. 4 examines the behavior of the QML and BGG methods in finite samples. Section 5 applies the PAR-SVs specification to model daily, quarterly and monthly S&P 500 returns using the Bayesian Griddy Gibbs method. Finally, Sect. 6 concludes.

2 The PAR-SVs and its main probabilistic properties

In this paper, we say that a stochastic process \(\left\{ \varepsilon _{t},t\in {\mathbb {Z}}\right\} \) has a periodic autoregressive stochastic volatility representation with period S (PAR-SVs in short) if it is given by

$$\begin{aligned} \left\{ \begin{array}{l} \varepsilon _{t}=\sqrt{h_{t}}\eta _{t} \\ \log \left( h_{t}\right) =\alpha _{t}+\beta _{t}\log \left( h_{t-1}\right) +\sigma _{t}e_{t} \end{array} \right. \, t\in {\mathbb {Z}}, \end{aligned}$$
(2.1a)

where the parameters \(\alpha _{t},\) \(\beta _{t},\) and \(\sigma _{t}\) are S-periodic over t (i.e. \(\alpha _{t}=\alpha _{t+Sn}\) \(\forall n\in {\mathbb {Z}} \) and so on) and the period \(S\ge 1\) is the smallest positive integer verifying the latter relationship. The sequence of random vectors \(\left\{ (\eta _{t},e_{t}),t\in {\mathbb {Z}}\right\} \) is assumed to be independent and identically distributed (iid in short) with mean \((0,0)^{\prime }\) and covariance matrix \(I_{2}\) (\(I_{2}\) stands for the identity matrix of dimension 2). We call model (2.1a) periodic autoregressive stochastic volatility rather than simply periodic stochastic volatility because the log-volatility is driven by a first-order periodic autoregression, and also in order to distinguish model (2.1a) from the periodic stochastic volatility (PSV) model proposed by Tsiakas (2006). In fact, the PAR-SVs model (2.1a) may be generalized so that \(h_{t}\) satisfies any stable periodic ARMA (henceforth PARMA) representation.

Note that when \(\beta _{t}=0\), model (2.1a) reduces to Tsiakas’s (2006) model if we take \(\alpha _{t}\) to be an appropriate deterministic periodic function of time. In that case, the effect of any current shock in the innovation \(e_{t}\) only influences the present volatility and does not affect its future evolution. This is what is called deterministic periodicity (Hylleberg et al. 1990). If, in contrast, \(\beta _{t}\ne 0\) for some t, the log-volatility equation involves lagged values of the log-volatility process. Therefore, the log-volatility consists at any time of an accumulation of past shocks, so that present shocks affect the future log-volatility evolution to an extent that depends on the stability of the log-volatility equation (see the periodic stationarity condition (2.5) below). This case is commonly named stochastic periodicity in the volatility.

It should be noted that although \(h_{t}\) is conventionally called volatility, it is not the conditional variance of the observed process given its past information in the familiar sense of ARCH-type models. This is because \(h_{t}\) is not \(\mathcal {F}_{t-1}\)-measurable, so that \(E\left( \varepsilon _{t}^{2}/\mathcal {F}_{t-1}\right) =E\left( h_{t}/\mathcal {F}_{t-1}\right) \ne h_{t}\), where \(\mathcal {F}_{t}\) is the \(\sigma \)-algebra generated by \(\left\{ \varepsilon _{u},u\le t\right\} \). Nevertheless, \(E\left( h_{t}\right) =E\left( \varepsilon _{t}^{2}\right) \) and \(E\left( \varepsilon _{t}^{2}/h_{t}\right) =h_{t}\) as in the ARCH-type case.

To emphasize the periodicity of the model, let \(t=nS+v\) for \(n\in {\mathbb {Z}}\) and \(1\le v\le S\). Then model (2.1a) may be written as follows

$$\begin{aligned} \left\{ \begin{array}{l} \varepsilon _{nS+v}=\sqrt{h_{nS+v}}\eta _{nS+v} \\ \log \left( h_{nS+v}\right) =\alpha _{v}+\beta _{v}\log \left( h_{nS+v-1}\right) +\sigma _{v}e_{nS+v} \end{array} \right. \quad n\in {\mathbb {Z}},\quad 1\le v\le S, \end{aligned}$$
(2.1b)

where by season v (\(1\le v\le S\)) we mean the channel \(\left\{ \ldots ,v-S,v,v+S,v+2S,\ldots \right\} \) with corresponding parameters \(\alpha _{v},\) \(\beta _{v}\) and \(\sigma _{v}\).
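
To fix ideas, the seasonal form (2.1b) is straightforward to simulate. The following minimal Python sketch (our illustration, assuming mutually independent Gaussian \(\eta _{t}\) and \(e_{t}\) and arbitrary parameter values; seasons are indexed \(0,\ldots ,S-1\) rather than \(1,\ldots ,S\)) generates N full cycles of a PAR-SVs series:

```python
import numpy as np

def simulate_par_sv(alpha, beta, sigma, N, seed=None):
    """Simulate N cycles of the PAR-SVs model (2.1b); the period is S = len(alpha).
    Seasons are indexed 0,...,S-1 here rather than 1,...,S as in the text."""
    rng = np.random.default_rng(seed)
    alpha = np.asarray(alpha, float)
    beta = np.asarray(beta, float)
    sigma = np.asarray(sigma, float)
    S = alpha.size
    T = N * S
    log_h = np.empty(T)
    prev = 0.0                    # arbitrary start-up value; discard a burn-in in practice
    for t in range(T):
        v = t % S
        prev = alpha[v] + beta[v] * prev + sigma[v] * rng.standard_normal()
        log_h[t] = prev
    eps = np.exp(log_h / 2.0) * rng.standard_normal(T)
    return eps, np.exp(log_h)

# illustrative parameters with S = 2 satisfying |beta_1 beta_2| < 1 (condition (2.5))
eps, h = simulate_par_sv(alpha=[-0.2, 0.1], beta=[0.9, 0.95], sigma=[0.3, 0.2], N=1000)
```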

From (2.1b), the log-volatility appears to be a Markov chain which is not homogeneous, as in time-invariant stochastic volatility models, but rather periodically homogeneous due to the periodic time-variation of the parameters. This somewhat complicates the study of the probabilistic structure of the PAR-SVs model. As is common in periodic time-varying modeling, a routine approach is to write (2.1b) as a time-invariant multivariate SV model by embedding the seasons v, \(1\le v\le S\) (see e.g. Gladyshev 1961 and Tiao and Grupe 1980 for periodic linear models) and then to study the properties of the latter. More precisely, define the S-variate sequences \( \left\{ H_{n},n\in {\mathbb {Z}}\right\} \), \(\left\{ \mathbf {\varepsilon }_{n},n\in {\mathbb {Z}}\right\} \) by \(H_{n}=\left( h_{nS+1},\ldots ,h_{nS+S}\right) ^{\prime }\) and \( \mathbf {\varepsilon }_{n}=\left( \varepsilon _{nS+1},\ldots ,\varepsilon _{nS+S}\right) ^{\prime }\). Then model (2.1b) may be cast in the following multivariate SV form

$$\begin{aligned} \left\{ \begin{array}{l} \mathbf {\varepsilon }_{n}= \mathrm{diag}\left( H_{n}^{\frac{1}{2}}\right) \Lambda _{n}\\ \log H_{n}=B\log H_{n-1}+\xi _{n} \end{array} \right. \, n\in {\mathbb {Z}}, \end{aligned}$$
(2.2)

where \(\Lambda _{n}=\left( \eta _{nS+1},\ldots ,\eta _{nS+S}\right) ^{\prime }\) and \(\mathrm{diag}\left( a\right) \) stands for the diagonal matrix formed by the entries of the vector a in the given order. The notations \(H_{n}^{\frac{1}{2}}\) and \(\log H_{n}\) denote the S-vectors defined respectively by \(H_{n}^{\frac{1}{2}}\left( v\right) =\sqrt{h_{nS+v}}\) and \(\log H_{n}\left( v\right) =\log \left( h_{nS+v}\right) \) (\(1\le v\le S\)). The matrix B and the vector \(\xi _{n}\) in (2.2) are given by

$$\begin{aligned} B=\left( \begin{array}{cccc} 0 &{}\quad \ldots &{}\quad 0 &{}\quad \beta _{1} \\ 0 &{}\quad \ldots &{}\quad 0 &{}\quad \beta _{2}\beta _{1} \\ \vdots &{}\quad \vdots &{}\quad \vdots &{}\quad \vdots \\ 0 &{}\quad \ldots &{}\quad 0 &{}\quad \prod \limits _{v=1}^{S}\beta _{S-v} \end{array} \right) _{S\times S}, \quad \xi _{n}=\left( \begin{array}{c} \lambda _{nS+1} \\ \beta _{2}\lambda _{nS+1}+\lambda _{nS+2} \\ \vdots \\ \sum \limits _{k=1}^{S}\prod \limits _{v=0}^{S-k-1}\beta _{S-v}\lambda _{nS+k} \end{array} \right) _{S\times 1}, \end{aligned}$$

with \(\lambda _{nS+v}=\alpha _{v}+\sigma _{v}e_{nS+v}\) \((1\le v\le S)\).

However, this approach has the main drawback that available methods for analyzing multivariate SV models do not take into account the particular structure of the coefficients in (2.2), so it may be difficult to draw conclusions about model (2.1) from them. Thus, studying the probabilistic and statistical properties of model (2.1) directly may be simpler and better than studying them through model (2.2). This suggests that periodic stochastic volatility modelling cannot be trivially deduced from existing multivariate SV analysis. In the sequel, we study the structure of model (2.1) mainly using the direct approach.

Throughout this paper, we frequently use solutions of the following ordinary difference equation

$$\begin{aligned} u_{t}=a_{t}+b_{t}u_{t-1}, \quad t\in {\mathbb {Z}}, \end{aligned}$$
(2.3a)

with S-periodic coefficients \(a_{t}\) and \(b_{t}\). Recall that, under the requirement that \(\left| \prod \limits _{v=1}^{S}b_{v}\right| <1\), the unique S-periodic solution of (2.3a) is given by

$$\begin{aligned} u_{nS+v}=\left( 1-\prod \limits _{v=1}^{S}b_{v}\right) ^{-1}\sum \limits _{j=0}^{S-1}\prod \limits _{i=0}^{j-1}b_{v-i}a_{v-j},\quad 1\le v\le S, \quad n\in {\mathbb {Z}}. \end{aligned}$$
(2.3b)
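
As a quick numerical illustration (a sketch of ours, with arbitrary coefficients and 0-based season indices), iterating (2.3a) until the influence of the initial value vanishes reproduces the closed form (2.3b):

```python
import numpy as np

a = np.array([0.5, -0.3, 0.2])               # S = 3, arbitrary coefficients
b = np.array([0.8, 0.6, 0.9])                # |b_1 b_2 b_3| < 1
S = len(a)

u = 1.0                                      # arbitrary initial value
T = 600                                      # long enough for the start-up to die out
for t in range(T):
    u = a[t % S] + b[t % S] * u              # after this step u sits at season t % S

v = (T - 1) % S                              # season of the last computed u
closed = sum(np.prod(b[(v - np.arange(j)) % S]) * a[(v - j) % S]
             for j in range(S)) / (1.0 - np.prod(b))
print(u, closed)                             # the two values coincide
```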

Before studying the probabilistic properties of model (2.1), it is useful to recall some probability properties related to periodically time-varying stochastic difference equations, namely strict periodic stationarity and periodic ergodicity. A real-valued stochastic process \(\left\{ Y_{t}, t\in {\mathbb {Z}}\right\} \) defined on a probability space \((\Omega ,\mathcal {F}, P)\) is said to be strictly periodically stationary (henceforth s.p.s.) with period \(S\ge 1\) if its infinite-dimensional distribution is invariant under a shift multiple of S for every channel v \(\left( 1\le v\le S\right) \), i.e. the probability distribution of \(\left( \ldots ,Y_{v},Y_{v+1},Y_{v+2},\ldots \right) \) is the same as that of \( (\ldots ,Y_{v+hS},Y_{v+1+hS},\) \(Y_{v+2+hS},\ldots )\) for all \(1\le v\le S\) and all \(h\in {\mathbb {Z}} \), where S is the smallest positive integer verifying the latter property. For instance, the simplest s.p.s. process is a sequence \(\left\{ u_{t},t\in {\mathbb {Z}} \right\} \) of independent and periodically distributed random variables (henceforth ipd), i.e. \(\left\{ u_{t},t\in {\mathbb {Z}} \right\} \) is independent and \(u_{t}\) has the same distribution as \(u_{t+nS}\) for all \(t,n\in {\mathbb {Z}} \). Thus an s.p.s. process with \(S=1\) is a strictly stationary one and an ipd sequence with \(S=1\) reduces to an iid sequence. Like the ergodic theorem for strictly stationary processes, the periodic ergodic theorem for strictly periodically stationary sequences can be stated as follows. If \( \left\{ Y_{t}, t\in {\mathbb {Z}} \right\} \) is s.p.s. and if f is a measurable function from \({\mathbb {R}}^{\mathbb {Z}}\) to \({\mathbb {R}}\) such that \(E\left( f\left( \ldots ,Y_{t-S},Y_{t},Y_{t+S},\ldots \right) \right) <\infty \) for all \(t\in {\mathbb {Z}}\), then

$$\begin{aligned} \frac{1}{n}\sum _{k=0}^{n-1}f\left( \ldots Y_{v+\left( k-1\right) S},Y_{v+kS},Y_{v+\left( k+1\right) S},\ldots \right) \overset{a.s.}{\underset{ n\rightarrow \infty }{\rightarrow }}Y_{v}^{*},\quad \forall \quad 1\le v\le S, \end{aligned}$$

for some random variable \(Y_{v}^{*}\). When \(\left\{ Y_{nS+v}, n\in {\mathbb {Z}} \right\} \) satisfies, for every channel \(1\le v\le S\), a certain irreducibility property called periodic ergodicity, which roughly means that \(\left\{ Y_{nS+v}, n\in {\mathbb {Z}} \right\} \) may reach any non-P-negligible subclass of the state space for all \(1\le v\le S\), then the limiting random variable \(Y_{v}^{*}\) is almost surely constant, namely

$$\begin{aligned} Y_{v}^{*}=E\left( f\left( \ldots ,Y_{v-S},Y_{v},Y_{v+S},\ldots \right) \right) \quad a.s., \quad 1\le v\le S. \end{aligned}$$

To define periodic ergodicity, let \(T: {\mathbb {R}}^{\mathbb {Z}}\rightarrow {\mathbb {R}}^{\mathbb {Z}}\) denote the shift transformation defined for any \(\mathbf {x}_{v}=\left( \ldots ,x_{v},x_{v+1},x_{v+2},\ldots \right) \in {\mathbb {R}}^{ {\mathbb {Z}} }\) by \(T\mathbf {x}_{v}=\left( \ldots ,x_{v+1},x_{v+2},x_{v+3},\ldots \right) \) \( \left( 1\le v\le S\right) \) and write \(T^{S}\) for the S-th power of T: \(T^{S}=T\circ T\circ \ldots \circ T\), S times. Thus, \(\left\{ Y_{t}, t\in {\mathbb {Z}}\right\} \) is s.p.s. if and only if \(T^{S}\) preserves the probability measure \(P_{Y_{v}}\) for all \(1\le v\le S\) (\(P_{Y_{v}}\) being the image measure of P by the process \(\left\{ Y_{nS+v}, n\in {{\mathbb {Z}}}\right\} \)). A Borel set \(C_{v}\subset {\mathbb {R}}^{{{ {\mathbb {Z}} }}}\) of the form \(C_{v}=\left\{ \mathbf {x}_{v}\in {\mathbb {R}}^{ {\mathbb {Z}} }:\mathbf {x}_{v}=\left( \ldots ,x_{v},x_{v+1},x_{v+2},\ldots \right) \right\} \) is said to be S-invariant along the channel v \(\left( 1\le v\le S\right) \) if \(T^{-S}\left( C_{v}\right) =C_{v}\), where \(T^{-S}\left( C_{v}\right) =\left\{ \mathbf {x}_{v}\in {\mathbb {R}}^{{{ {\mathbb {Z}} }}}:T^{S}\mathbf {x}_{v}\in C_{v}\right\} \). An s.p.s. process \(\left\{ Y_{t}, t\in {{ {\mathbb {Z}} }}\right\} \) is said to be periodically ergodic if, for all \(1\le v\le S\), \(P\left( \left( \ldots ,Y_{v},Y_{v+1},Y_{v+2},\ldots \right) \in C_{v}\right) =0\) or 1 for every S-invariant Borel set \(C_{v}\) over the channel v. As with strict periodic stationarity, the simplest periodically ergodic process is a sequence of ipd random variables. Like strict stationarity and ergodicity (see e.g. Billingsley 1995, Theorem 36.4), strict periodic stationarity and periodic ergodicity are preserved under certain transformations. Indeed, if \(\left\{ Y_{t}, t\in {\mathbb {Z}} \right\} \) is s.p.s. and periodically ergodic and if \(\left\{ Z_{t},t\in {\mathbb {Z}} \right\} \) is given by \(Z_{t}=f_{t}\left( \ldots ,Y_{t},Y_{t+1},Y_{t+2},\ldots \right) \), where \(f_{t}\) is a measurable function from \( {\mathbb {R}}^{{{ {\mathbb {Z}} }}}\) into \( {\mathbb {R}} \), periodic in t with period S (\(f_{t}=f_{t+nS} \) for all n and t) and possibly depending on S-periodically time-varying parameters, then \(\left\{ Z_{t},t\in {{ {\mathbb {Z}}}}\right\} \) is also s.p.s. and periodically ergodic. Thus a sequence of ipd random variables may be seen as a “building block” for the class of s.p.s. and periodically ergodic processes.

Now, we have the following result which provides a necessary and sufficient condition for strict periodic stationarity and periodic ergodicity of model (2.1).

Theorem 2.1

(Strict periodic stationarity)

The PAR-SVs equation given by (2.1) admits a unique (nonanticipative) strictly periodically stationary and periodically ergodic solution given for \(n\in {\mathbb {Z}}\) and \(1\le v\le S\) by

$$\begin{aligned} \varepsilon _{nS+v}=\eta _{nS+v}\exp \left\{ \frac{1}{2}\left( \dfrac{ \sum _{j=0}^{S-1}\prod \limits _{i=0}^{j-1}\beta _{v-i}\alpha _{v-j}}{ 1-\prod \limits _{v=1}^{S}\beta _{v}}+\sum _{j=0}^{\infty }\prod \limits _{i=0}^{j-1}\beta _{v-i}\sigma _{v-j}e_{nS+v-j}\right) \right\} , \end{aligned}$$
(2.4)

where the series in (2.4) converges almost surely, if and only if,

$$\begin{aligned} \left| \prod \limits _{v=1}^{S}\beta _{v}\right| <1. \end{aligned}$$
(2.5)

Proof

The result obviously follows from standard linear periodic autoregression (PAR) theory while using (2.3) (see e.g. Aknouche and Bibi 2009). So, details are omitted. \(\square \)

From (2.5) we see that the monodromy coefficient \(\prod \nolimits _{v=1}^{S} \beta _{v}\) is the analog of the persistence parameter of time-invariant SV and standard GARCH models.

Other properties, such as periodic geometric ergodicity and strong mixing, follow straightforwardly. Let us first say that a strictly periodically stationary stochastic process \(\left\{ \varepsilon _{t},t\in {\mathbb {Z}}\right\} \) is geometrically periodically ergodic if and only if the corresponding multivariate strictly stationary process \(\left\{ \mathbf {\varepsilon }_{n},n\in {\mathbb {Z}}\right\} \) given by \(\mathbf {\varepsilon }_{n}=\left( \varepsilon _{nS+1},\ldots ,\varepsilon _{nS+S}\right) ^{\prime }\) is geometrically ergodic in the classical sense (see e.g. Meyn and Tweedie (2009) for the definition of geometric ergodicity). Similarly, \(\left\{ \varepsilon _{t},t\in {\mathbb {Z}}\right\} \) is said to be periodically \(\beta \)-mixing if and only if \(\left\{ \mathbf {\varepsilon }_{n},n\in {\mathbb {Z}}\right\} \) is \(\beta \)-mixing.

Theorem 2.2

(Geometric periodic ergodicity)

Under the condition \(\left| \prod \nolimits _{v=1}^{S}\beta _{v}\right| <1\), the process \(\left\{ \varepsilon _{t},t\in {\mathbb {Z}}\right\} \) defined by (2.1) is geometrically periodically ergodic and hence is periodically \(\beta \)-mixing.

Proof

The result follows from geometric ergodicity of the vector autoregression \(\{\log H_{n},n\) \(\in {\mathbb {Z}}\}\) given by (2.2), which may be easily established using Meyn and Tweedie’s (2009) results (see also Davis and Mikosch 2009). \(\square \)

Given the form of the strictly periodically stationary solution (2.4), it is easy to give its second-order properties. Assume first the following conditions hold

$$\begin{aligned} E\left( \mathop {\prod }_{j=0}^{\infty }\Delta _{v,j}\right)= & {} \mathop {\prod }_{j=0}^{\infty }E\left( \Delta _{v,j}\right) \quad \text { for all }\quad 1\le v\le S, \end{aligned}$$
(2.6a)
$$\begin{aligned} \mathop {\prod }_{j=0}^{\infty }\delta _{v,j}< & {} \infty \quad \text { for all }\quad 1\le v\le S, \end{aligned}$$
(2.6b)

where

$$\begin{aligned} \Delta _{v,j}=\exp \left( \prod \limits _{i=0}^{j-1}\beta _{v-i}\sigma _{v-j}e_{v-j}\right) \quad \text {and}\quad \delta _{v,j}=E\left( \Delta _{v,j}\right) . \end{aligned}$$

As pointed out by an Associate Editor, equality (2.6a) is not always satisfied for any independent sequence \(\left\{ \Delta _{j},j\in {\mathbb {N}} \right\} \) and one can exhibit examples of independent sequences for which (2.6a) is not fulfilled. Nevertheless, from the dominated convergence theorem, a sufficient condition for (2.6a) to be satisfied is that

$$\begin{aligned} \prod \limits _{j=1}^{n}\Delta _{v,j}\le W_{v}\; a.s.\quad \text { for all }\quad n\in {\mathbb {N}}, \end{aligned}$$
(2.6c)

for some integrable random variable \(W_{v}\) \(\left( 1\le v\le S\right) \).

Thus under (2.5) and (2.6) the following result provides sufficient conditions for model (2.1) to have a unique strictly periodically stationary solution with finite second moment.

Theorem 2.3

(Second-order periodic stationarity)

Under conditions (2.5) and (2.6), the series in (2.4) also converges in the mean square sense and the process given by (2.4) is strictly periodically stationary with \( E\left( \varepsilon _{v}^{2}\right) <\infty \) \(\left( 1\le v\le S\right) \).

Proof

Routine computation shows that under (2.5) and (2.6) the series in (2.4),

$$\begin{aligned} \sum _{j=0}^{\infty }\prod \limits _{i=0}^{j-1}\beta _{v-i}\sigma _{v-j}e_{nS+v-j},\quad 1\le v\le S, \end{aligned}$$

converges in mean square. Moreover, under these conditions, it is clear that \(\left\{ \varepsilon _{t},t\in {\mathbb {Z}}\right\} \) given by (2.4) is a white noise with periodic variance (henceforth periodic white noise) since \(E\left( \varepsilon _{t}\right) =0\), \(E\left( \varepsilon _{t}\varepsilon _{t-h}\right) =0\) (\(h>0\)) and, while using (2.3),

$$\begin{aligned} Var\left( \varepsilon _{nS+v}\right)= & {} E\left( \exp \left( \dfrac{ \sum _{j=0}^{S-1}\prod \limits _{i=0}^{j-1}\beta _{v-i}\alpha _{v-j}}{ 1-\prod \limits _{v=1}^{S}\beta _{v}}+\sum _{j=0}^{\infty }\prod \limits _{i=0}^{j-1}\beta _{v-i}\sigma _{v-j}e_{nS+v-j}\right) \right) \nonumber \\= & {} \exp \left( \dfrac{\sum _{j=0}^{S-1}\prod \limits _{i=0}^{j-1}\beta _{v-i} \alpha _{v-j}}{1-\prod \limits _{v=1}^{S}\beta _{v}}\right) \mathop {\prod }_{j=0}^{\infty }\delta _{v,j},\quad 1\le v\le S. \end{aligned}$$
(2.7)

\(\square \)

In the case of Gaussian log-volatility innovations \(\left\{ e_{t},t\in {\mathbb {Z}}\right\} \) (i.e. \(e_{t}\sim N(0,1)\)), it is also possible to obtain more explicit results, thereby simplifying the assumptions of Theorem 2.3. Using the fact that if \(X\sim N(0,1)\) then \(E(\exp (\phi X))=\exp (\frac{\phi ^{2}}{2})\) for any real constant \(\phi \), we obtain

$$\begin{aligned} \delta _{v,j}=\exp \frac{1}{2}\left( \sigma _{v-j}^{2}\prod \limits _{i=0}^{j-1}\beta _{v-i}^{2}\right) , \end{aligned}$$
(2.8)

and condition (2.6b) of finiteness of \(\mathop {\prod }_{j=0}^{\infty }\delta _{v,j}\) reduces to the periodic stationarity condition (2.5): \(\left| \prod \nolimits _{v=1}^{S}\beta _{v}\right| <1\). Moreover, using (2.8) and (2.3) the variance of the process given by (2.7) may be expressed more explicitly as follows

$$\begin{aligned} Var\left( \varepsilon _{nS+v}\right)= & {} \exp \left( \dfrac{ \sum _{j=0}^{S-1}\prod \limits _{i=0}^{j-1}\beta _{v-i}\alpha _{v-j}}{ 1-\prod \limits _{v=1}^{S}\beta _{v}}\right) \mathop {\prod }_{j=0}^{\infty }\exp \left( \frac{1}{2}\sigma _{v-j}^{2}\prod \limits _{i=0}^{j-1}\beta _{v-i}^{2}\right) \nonumber \\= & {} \exp \left( \dfrac{\sum _{j=0}^{S-1}\prod \limits _{i=0}^{j-1}\beta _{v-i} \alpha _{v-j}}{1-\prod \limits _{v=1}^{S}\beta _{v}}\right) \exp \left( \frac{1}{2}\sum _{j=0}^{\infty }\prod \limits _{i=0}^{j-1} \beta _{v-i}^{2}\sigma _{v-j}^{2}\right) \nonumber \\= & {} \exp \left( \dfrac{\sum \limits _{j=0}^{S-1}\prod \limits _{i=0}^{j-1} \beta _{v-i}\alpha _{v-j}}{1-\prod \limits _{v=1}^{S}\beta _{v}} +\frac{1}{2}\dfrac{\sum \limits _{j=0}^{S-1} \prod \limits _{i=0}^{j-1}\beta _{v-i}^{2} \sigma _{v-j}^{2}}{1-\prod \limits _{v=1}^{S} \beta _{v}^{2}}\right) . \end{aligned}$$
(2.9)

For example, the variance \(Var\left( \varepsilon _{nS+v}\right) \) of the process is given respectively, for \(S=2\) and \(S=3\), by

$$\begin{aligned} Var\left( \varepsilon _{2n+1}\right)= & {} \exp \left( \frac{\alpha _{1}+\beta _{1}\alpha _{2}}{1-\beta _{1}\beta _{2}}+\frac{1}{2} \frac{\sigma _{1}^{2}+\beta _{1}^{2}\sigma _{2}^{2}}{1-\beta _{1}^{2} \beta _{2}^{2}} \right) , \\ Var\left( \varepsilon _{2n+2}\right)= & {} \exp \left( \frac{\alpha _{2}+\beta _{2}\alpha _{1}}{1-\beta _{1}\beta _{2}}+\frac{1}{2} \frac{\sigma _{2}^{2}+\beta _{2}^{2}\sigma _{1}^{2}}{1-\beta _{1}^{2} \beta _{2}^{2}} \right) , \\ Var\left( \varepsilon _{3n+1}\right)= & {} \exp \left( \frac{\alpha _{1} +\beta _{1}\alpha _{3}+\beta _{1}\beta _{3}\alpha _{2}}{1-\beta _{1}\beta _{2} \beta _{3}}+\frac{1}{2}\frac{\sigma _{1}^{2}+\beta _{1}^{2}\sigma _{3}^{2}+ \beta _{1}^{2}\beta _{3}^{2}\sigma _{2}^{2}}{1-\beta _{1}^{2}\beta _{2}^{2}\beta _{3}^{2}}\right) , \\ Var\left( \varepsilon _{3n+2}\right)= & {} \exp \left( \frac{\alpha _{2} +\beta _{2}\alpha _{1}+\beta _{2}\beta _{1}\alpha _{3}}{1-\beta _{1}\beta _{2} \beta _{3}}+\frac{1}{2}\frac{\sigma _{2}^{2}+\beta _{2}^{2}\sigma _{1}^{2} +\beta _{2}^{2}\beta _{1}^{2}\sigma _{3}^{2}}{1-\beta _{1}^{2}\beta _{2}^{2}\beta _{3}^{2}}\right) , \\ Var\left( \varepsilon _{3n+3}\right)= & {} \exp \left( \frac{\alpha _{3}+\beta _{3}\alpha _{2}+\beta _{3}\beta _{2}\alpha _{1}}{1-\beta _{1}\beta _{2}\beta _{3}}+\frac{1}{2}\frac{\sigma _{3}^{2} +\beta _{3}^{2}\sigma _{2}^{2}+\beta _{3}^{2}\beta _{2}^{2}\sigma _{1}^{2}}{1-\beta _{1}^{2} \beta _{2}^{2}\beta _{3}^{2}}\right) . \end{aligned}$$
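
These closed-form variances are easily checked by simulation. The sketch below (our illustration, with arbitrary \(S=2\) parameters and Gaussian innovations) compares the sample variance over the first season with the corresponding expression above:

```python
import numpy as np

alpha = np.array([-0.2, 0.1]); beta = np.array([0.7, 0.5]); sigma = np.array([0.3, 0.2])
rng = np.random.default_rng(0)

# closed-form Var(eps_{2n+1}) from the first expression above
var1 = np.exp((alpha[0] + beta[0] * alpha[1]) / (1 - beta[0] * beta[1])
              + 0.5 * (sigma[0]**2 + beta[0]**2 * sigma[1]**2)
                    / (1 - beta[0]**2 * beta[1]**2))

# Monte Carlo counterpart
T = 500_000
log_h = np.empty(T); prev = 0.0
for t in range(T):
    v = t % 2
    prev = alpha[v] + beta[v] * prev + sigma[v] * rng.standard_normal()
    log_h[t] = prev
eps = np.exp(log_h / 2) * rng.standard_normal(T)
print(var1, eps[1000::2].var())              # season v = 1 observations, after burn-in
```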

Next, the autocovariance function of the squared process \(\left\{ \varepsilon _{t}^{2},t\in {\mathbb {Z}} \right\} \) is provided. It is useful for identifying the model and for deriving certain estimation methods such as simple and generalized methods of moments. Let \(\gamma _{v}^{\varepsilon ^{2}}\left( h\right) =E\left( \varepsilon _{nS+v}^{2}\varepsilon _{nS+v-h}^{2}\right) -E\left( \varepsilon _{nS+v}^{2}\right) E\left( \varepsilon _{nS+v-h}^{2}\right) \).

Theorem 2.4

(Autocovariance structure of \(\left\{ \varepsilon _{t}^{2},t\in {\mathbb {Z}} \right\} \))

Under (2.5), (2.6) and the conditions \( \mathop {\prod }_{j=0}^{\infty }\delta _{v,j}\delta _{v-h,j}<\infty \) and \( E\left( \eta _{1}^{4}\right) <\infty \) we have

$$\begin{aligned} \gamma _{v}^{\varepsilon ^{2}}\left( 0\right)= & {} \exp \left( 2\dfrac{ \sum \limits _{j=0}^{S-1}\prod \limits _{i=0}^{j-1}\beta _{v-i}\alpha _{v-j}}{ 1-\prod \limits _{v=1}^{S}\beta _{v}}\right) \nonumber \\&\times \left( E\left( \eta _{1}^{4}\right) E\left( \exp \left( 2\sum \limits _{j=0}^{\infty }\prod \limits _{i=0}^{j-1}\beta _{v-i}\sigma _{v-j}e_{v-j}\right) \right) -\mathop {\prod }_{j=0}^{\infty }\delta _{v,j}^{2}\right) , \end{aligned}$$
(2.10a)
$$\begin{aligned} \gamma _{v}^{\varepsilon ^{2}}\left( h\right)= & {} \left( E\left( \exp \left( \sum \limits _{j=0}^{h-1}\prod \limits _{i=0}^{j-1} \beta _{v-i}\sigma _{v-j}e_{v-j}+\left( 1+\prod \limits _{i=0}^{h-1}\beta _{v-i}^{-1}\right) \sum \limits _{j=h}^{\infty }\prod \limits _{i=0}^{j-1}\beta _{v-i}\sigma _{v-j} e_{v-j}\right) \right) \right. \nonumber \\&\quad -\left. \mathop {\prod }_{j=0}^{\infty }\delta _{v,j}\delta _{v-h,j}\right) \exp \left( \dfrac{\sum \limits _{j=0}^{S-1}\prod \limits _{i=0}^{j-1} \beta _{v-i}\alpha _{v-j}+\sum \limits _{j=0}^{S-1}\prod \limits _{i=0}^{j-1} \beta _{v-h-i}\alpha _{v-h-j}}{1-\prod \limits _{v=1}^{S}\beta _{v}}\right) ,\quad h>0.\nonumber \\ \end{aligned}$$
(2.10b)

Proof

Using (2.4) direct calculation gives

$$\begin{aligned} E\left( \varepsilon _{nS+v}^{2}\varepsilon _{nS+v-h}^{2}\right)= & {} E\left( \eta _{nS+v}^{2}\eta _{nS+v-h}^{2}\right) \exp \left( \dfrac{\sum \limits _{j=0}^{S-1}\prod \limits _{i=0}^{j-1} \beta _{v-i}\alpha _{v-j}+\sum \limits _{j=0}^{S-1}\prod \limits _{i=0}^{j-1}\beta _{v-h-i}\alpha _{v-h-j}}{ 1-\prod \limits _{v=1}^{S}\beta _{v}}\right) \nonumber \\&\times \, E\left( \exp \left( \sum \limits _{j=0}^{\infty }\prod \limits _{i=0}^{j-1}\beta _{v-i}\sigma _{v-j}e_{nS+v-j} +\sum \limits _{j=0}^{\infty }\prod \limits _{i=0}^{j-1}\beta _{v-h-i}\sigma _{v-h-j}e_{nS+v-h-j}\right) \right) , \end{aligned}$$
(2.11)

provided the latter expectations are finite. In particular, when \(h=0\), combining (2.7) and (2.11) we get (2.10a) under finiteness of \(E\left( \eta _{1}^{4}\right) \).

For \(h>0\), because of the independence structure of \(\left\{ \eta _{t},t\in {\mathbb {Z}} \right\} \) one obtains

$$\begin{aligned} E\left( \varepsilon _{nS+v}^{2}\varepsilon _{nS+v-h}^{2}\right)= & {} \exp \left( \frac{\sum \limits _{j=0}^{S-1}\prod \limits _{i=0}^{j-1}\beta _{v-i} \alpha _{v-j}\,+\,\sum \limits _{j=0}^{S-1}\prod \limits _{i=0}^{j-1} \beta _{v-h-i}\alpha _{v-h-j}}{1-\prod \limits _{v=1}^{S}\beta _{v}}\right) \\&\times \;E\left( \exp \left( \sum \limits _{j=0}^{h-1}\prod \limits _{i=0}^{j-1}\beta _{v-i} \sigma _{v-j}e_{v-j}\,+\,\sum \limits _{j=h}^{\infty }\prod \limits _{i=0}^{j-1}\beta _{v-i}\sigma _{v-j}e_{v-j} \right. \right. \\&\left. \left. \qquad +\,\sum \limits _{j=0}^{\infty }\prod \limits _{i=0}^{j-1}\beta _{v-h-i}\sigma _{v-h-j}e_{v-h-j}\right) \right) \\= & {} \exp \left( \dfrac{\sum \limits _{j=0}^{S-1}\prod \limits _{i=0}^{j-1} \beta _{v-i}\alpha _{v-j}+\sum \limits _{j=0}^{S-1}\prod \limits _{i=0}^{j-1}\beta _{v-h-i}\alpha _{v-h-j}}{1-\prod \limits _{v=1}^{S}\beta _{v}}\right) \\&\times E\left( \exp \left( \sum \limits _{j=0}^{h-1}\prod \limits _{i=0}^{j-1}\beta _{v-i} \sigma _{v-j}e_{v-j}+\left( 1+\prod \limits _{i=0}^{h-1}\beta _{v-i}^{-1}\right) \right. \right. \\&\left. \left. \times \, \sum \limits _{j=h}^{\infty }\prod \limits _{i=0}^{j-1}\beta _{v-i}\sigma _{v-j}e_{v-j}\right) \right) , \end{aligned}$$

giving (2.10b). \(\square \)

The S kurtoses \({ Kurt}\left( v\right) \) \(\left( 1\le v\le S\right) \) of the PAR-SVs model may be given from (2.9) and (2.10) as follows

$$\begin{aligned} {{ Kurt}}\left( v\right)= & {} E\left( \eta _{1}^{4}\right) \frac{ \mathop {\prod }_{j=0}^{\infty }E\left( \exp \left( \prod \limits _{i=0}^{j-1}\beta _{v-i}\sigma _{v-j}e_{v-j}\right) ^{2}\right) }{\mathop {\prod }_{j=0}^{\infty }\left( E\left( \exp \prod \limits _{i=0}^{j-1}\beta _{v-i}\sigma _{v-j}e_{v-j}\right) \right) ^{2}},\quad 1\le v\le S, \\\ge & {} E\left( \eta _{1}^{4}\right) . \nonumber \end{aligned}$$
(2.12)

By the Cauchy–Schwarz inequality, this clearly shows that the PAR-SVs model may be characterized by excess kurtosis on all channels \(\left\{ 1,\ldots ,S\right\} \). In particular, under the normality assumption on the innovations, the conditions for second-order periodic stationarity reduce to (2.6a), \(E(\eta _{1}^{4})<\infty \) and \(\left| \prod \limits _{v=1}^{S}\beta _{v}\right| <1\). So from (2.8), expression (2.12) reduces to

$$\begin{aligned} {{ Kurt}}\left( v\right) =E\left( \eta _{1}^{4}\right) \exp \left( \dfrac{\sum \limits _{j=0}^{S-1}\prod \limits _{i=0}^{j-1}\beta _{v-i}^{2}\sigma _{v-j}^{2}}{1-\prod \limits _{v=1}^{S}\beta _{v}^{2}}\right) ,\quad 1\le v\le S. \end{aligned}$$

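Since this expression follows from (2.12) and (2.8), it is straightforward to evaluate numerically. The following sketch (our illustration; the function name and parameter values are ours) computes the S seasonal kurtoses for Gaussian \(e_{t}\):

```python
import numpy as np

def par_sv_kurtosis(beta, sigma2, eta4=3.0):
    """Seasonal kurtoses Kurt(v), v = 1..S, for Gaussian e_t, per the closed form above.
    beta, sigma2: length-S seasonal parameters; eta4 = E(eta^4) (= 3 for Gaussian eta)."""
    beta = np.asarray(beta, float)
    sigma2 = np.asarray(sigma2, float)
    S = beta.size
    denom = 1.0 - np.prod(beta**2)
    kurt = np.empty(S)
    for v in range(S):
        B_v = sum(np.prod(beta[(v - np.arange(j)) % S]**2) * sigma2[(v - j) % S]
                  for j in range(S)) / denom
        kurt[v] = eta4 * np.exp(B_v)
    return kurt

print(par_sv_kurtosis(beta=[0.7, 0.5], sigma2=[0.09, 0.04]))   # e.g. S = 2
```
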
The autocovariance function also has a more explicit form in the case of Gaussian \(\left\{ e_{t},t\in {\mathbb {Z}}\right\} \).

Corollary 2.1

(Autocovariance structure of \(\left\{ \varepsilon _{t}^{2},t\in {\mathbb {Z}} \right\} \) under normality of \(\left\{ e_{t},t\in {\mathbb {Z}}\right\} \))

Under the assumptions of Theorem 2.4, if \(\left\{ e_{t},t\in {\mathbb {Z}}\right\} \) is in addition Gaussian, then

$$\begin{aligned} \gamma _{v}^{\varepsilon ^{2}}\left( 0\right)= & {} \;\exp \left( 2\left( 1-\prod \limits _{v=1}^{S}\beta _{v}\right) ^{-1}\sum \limits _{j=0}^{S-1} \prod \limits _{i=0}^{j-1}\beta _{v-i} \alpha _{v-j}+\left( 1-\prod \limits _{v=1}^{S}\beta _{v}^{2}\right) ^{-1} \sum \limits _{j=0}^{S-1}\prod \limits _{i=0}^{j-1}\beta _{v-i}^{2} \sigma _{v-j}^{2}\right) \nonumber \\&\times \left( E\left( \eta _{1}^{4}\right) \exp \left( \left( 1-\prod \limits _{v=1}^{S}\beta _{v}^{2}\right) ^{-1}\sum \limits _{j=0}^{S-1}\prod \limits _{i=0}^{j-1}\beta _{v-i}^{2}\sigma _{v-j}^{2}\right) -1\right) , \end{aligned}$$
(2.13a)
$$\begin{aligned} \gamma _{v}^{\varepsilon ^{2}}\left( h\right)= & {} \exp \left( \frac{ \sum \limits _{j=0}^{S-1}\prod \limits _{i=0}^{j-1}\beta _{v-i}\alpha _{v-j} +\sum \limits _{j=0}^{S-1}\prod \limits _{i=0}^{j-1}\beta _{v-h-i} \alpha _{v-h-j}}{1-\prod \limits _{v=1}^{S}\beta _{v}} +\frac{1}{2}\,\frac{\sum \limits _{j=0}^{S-1} \prod \limits _{i=0}^{j-1}\beta _{v-i}^{2}\sigma _{v-j}^{2} +\sum \limits _{j=0}^{S-1}\prod \limits _{i=0}^{j-1}\beta _{v-h-i}^{2} \sigma _{v-h-j}^{2}}{1-\prod \limits _{v=1}^{S}\beta _{v}^{2}}\right) \nonumber \\&\times \left( \exp \left( \dfrac{\sum \limits _{j=0}^{S-1}\prod \limits _{i=0}^{j-1} \beta _{v-h-i}^{2}\sigma _{v-h-j}^{2}}{1-\prod \limits _{v=1}^{S}\beta _{v}^{2}} \prod \limits _{i=0}^{h-1}\beta _{v-i}\right) -1\right) , \quad h>0. \end{aligned}$$
(2.13b)

Proof

For Gaussian innovations, we use again the fact that if \( X\sim N(0,1)\) then \(E(\exp (\phi X))=\exp (\frac{\phi ^{2}}{2})\). Therefore, (2.13a) follows from (2.10a) and (2.9). For \(h>0\) we have

$$\begin{aligned}&E\left( \varepsilon _{nS+v}^{2}\varepsilon _{nS+v-h}^{2}\right) \\&\quad =\exp \left( \left( 1-\prod \limits _{v=1}^{S}\beta _{v}\right) ^{-1}\left( \sum \limits _{j=0}^{S-1}\prod \limits _{i=0}^{j-1}\beta _{v-i}\alpha _{v-j} +\sum \limits _{j=0}^{S-1}\prod \limits _{i=0}^{j-1}\beta _{v-h-i}\alpha _{v-h-j} \right) \right) \\&\qquad \times \mathop {\prod }_{j=0}^{h-1}\exp \left( \frac{1}{2}\prod \limits _{i=0}^{j-1} \beta _{v-i}^{2}\sigma _{v-j}^{2}\right) \mathop {\prod }_{j=h}^{\infty }\exp \left( \frac{1}{2}\left( 1+\prod \limits _{i=0}^{h-1}\beta _{v-i}^{-1}\right) ^{2} \prod \limits _{i=0}^{j-1}\beta _{v-i}^{2}\sigma _{v-j}^{2}\right) . \end{aligned}$$

After tedious but straightforward calculation, the autocovariance function at lag h (\(h>0\)) simplifies for Gaussian innovations to

$$\begin{aligned} \gamma _{v}^{\varepsilon ^{2}}\left( h\right)= & {} \exp \left( \frac{ \sum \limits _{j=0}^{S-1}\prod \limits _{i=0}^{j-1}\beta _{v-i}\alpha _{v-j} +\sum \limits _{j=0}^{S-1}\prod \limits _{i=0}^{j-1}\beta _{v-h-i}\alpha _{v-h-j}}{1-\prod \limits _{v=1}^{S}\beta _{v}}\right) \left[ \exp \left( \frac{1}{2}\sum \limits _{j=0}^{h-1}\prod \limits _{i=0}^{j-1}\beta _{v-i}^{2} \sigma _{v-j}^{2}+\frac{1}{2}\left( 1+\prod \limits _{i=0}^{h-1}\beta _{v-i}^{-1}\right) ^{2}\sum \limits _{j=h}^{\infty }\prod \limits _{i=0}^{j-1}\beta _{v-i}^{2}\sigma _{v-j}^{2}\right) \right. \\&\left. -\exp \left( \frac{1}{2}\sum \limits _{j=0}^{\infty }\prod \limits _{i=0}^{j-1}\beta _{v-i}^{2}\sigma _{v-j}^{2}+\frac{1}{2}\sum \limits _{j=0}^{\infty }\prod \limits _{i=0}^{j-1}\beta _{v-h-i}^{2}\sigma _{v-h-j}^{2}\right) \right] \\= & {} \exp \left( \frac{\sum \limits _{j=0}^{S-1}\prod \limits _{i=0}^{j-1}\beta _{v-i} \alpha _{v-j}+\sum \limits _{j=0}^{S-1}\prod \limits _{i=0}^{j-1} \beta _{v-h-i}\alpha _{v-h-j}}{1-\prod \limits _{v=1}^{S}\beta _{v}}+\frac{1}{2}\,\frac{ \sum \limits _{j=0}^{S-1}\prod \limits _{i=0}^{j-1}\beta _{v-i}^{2} \sigma _{v-j}^{2}+\sum \limits _{j=0}^{S-1}\prod \limits _{i=0}^{j-1}\beta _{v-h-i}^{2}\sigma _{v-h-j}^{2}}{1-\prod \limits _{v=1}^{S}\beta _{v}^{2}}\right) \\&\times \left( \exp \left( \frac{\sum \limits _{j=0}^{S-1}\prod \limits _{i=0}^{j-1} \beta _{v-h-i}^{2}\sigma _{v-h-j}^{2}}{1-\prod \limits _{v=1}^{S}\beta _{v}^{2}} \prod \limits _{i=0}^{h-1}\beta _{v-i}\right) -1\right) , \end{aligned}$$

where we used \(\sum \nolimits _{j=h}^{\infty }\prod \nolimits _{i=0}^{j-1}\beta _{v-i}^{2}\sigma _{v-j}^{2}=\prod \nolimits _{i=0}^{h-1}\beta _{v-i}^{2}\sum \nolimits _{j=0}^{\infty }\prod \nolimits _{i=0}^{j-1}\beta _{v-h-i}^{2}\sigma _{v-h-j}^{2}\),

which is (2.13b). \(\square \)

It is worth noting that expanding the exponential function in (2.13b) under the periodic stationarity condition (2.5), the autocovariance function \(\gamma _{v}^{\varepsilon ^{2}}\left( h\right) \) of the squared process \(\left\{ \varepsilon _{t}^{2},t\in {\mathbb {Z}} \right\} \) has the following equivalent form as \(h\rightarrow \infty \)

$$\begin{aligned} \gamma _{v}^{\varepsilon ^{2}}\left( h\right) \sim K\prod \limits _{i=0}^{h-1}\beta _{v-i}\sim K\left( \prod \limits _{v=1}^{S}\beta _{v}\right) ^{h/S}, \end{aligned}$$

and so \(\gamma _{v}^{\varepsilon ^{2}}\left( h\right) \) converges geometrically to zero as \(h\rightarrow \infty \), where K is an appropriate real constant. However, this decay pattern is not compatible with the recurrence equation satisfied by periodic ARMA (PARMA) autocovariances, and we can conclude that the squared process \(\left\{ \varepsilon _{t}^{2},t\in {\mathbb {Z}} \right\} \) does not admit a PARMA autocovariance representation.

Nevertheless, the logged squared process \(\left\{ \log \left( \varepsilon _{t}^{2}\right) , t\in {\mathbb {Z}} \right\} \) does have a PARMA autocovariance structure. Using the notations \(Y_{t}=\log \left( \varepsilon _{t}^{2}\right) \), \(X_{t}=\log h_{t}\), \(u_{t}=\log \left( \eta _{t}^{2}\right) \), \(E\left( \log \left( \eta _{t}^{2}\right) \right) =\mu _{u}\) and \(Var\left( \log \left( \eta _{t}^{2}\right) \right) =\Delta _{u}^{2}\), we have from (2.1)

$$\begin{aligned} Y_{t}=X_{t}+u_{t}. \end{aligned}$$
(2.14)

Theorem 2.5

\((\mathrm{PARMA}\left( 1,1\right) \) representation of \( \left\{ \log \left( \varepsilon _{t}^{2}\right) , t\in {\mathbb {Z}} \right\} \))

Under assumption (2.5) and finiteness of \(\Delta _{u}^{2}\) the process \(\left\{ Y_{t},t\in {\mathbb {Z}} \right\} \) has a PARMAs \(\left( 1,1\right) \) representation given by

$$\begin{aligned} Y_{nS+v}-\mu _{v}^{Y}= & {} \beta _{v}\left( Y_{nS+v-1}-\mu _{v-1}^{Y}\right) +\zeta _{nS+v}-\psi _{v}\zeta _{nS+v-1},\nonumber \\&\quad {1\le v\le S,}\;{n\in {\mathbb {Z}}}, \end{aligned}$$
(2.15a)

where \(\mu _{v}^{Y}=E\left( Y_{nS+v}\right) \),

$$\begin{aligned} \psi _{v}=\left\{ \begin{array}{cc} \dfrac{\left( 1+\beta _{v}^{2}\right) \Delta _{u}^{2}+\sigma _{v}^{2}-\sqrt{ \left( \left( 1+\beta _{v}^{2}\right) \Delta _{u}^{2}+\sigma _{v}^{2}\right) \left( \left( 1-\beta _{v}^{2}\right) \Delta _{u}^{2}+\sigma _{v}^{2}\right) }}{2\beta _{v}\Delta _{u}^{2}}&{}\quad \text { if }\quad \Delta _{u}^{2}\ne 0 \\ 0&{}\quad \text { if }\quad \Delta _{u}^{2}=0 \end{array} \right. , \quad {1\le v\le S,}\nonumber \\ \end{aligned}$$
(2.15b)

and \(\left\{ \zeta _{t},t\in {\mathbb {Z}} \right\} \) is a periodic white noise with periodic variance

$$\begin{aligned} \sigma _{\zeta , v}^{2}=Var\left( \zeta _{nS+v}\right) =\left\{ \begin{array}{cc} \dfrac{\beta _{v}\Delta _{u}^{2}}{\psi _{v}}&{}\quad \text { if }\quad \prod \limits _{v=1}^{S}\beta _{v}\ne 0 \\ 0&{}\quad \text { if }\quad \prod \limits _{v=1}^{S}\beta _{v}=0 \end{array} \right. ,\quad {1\le v\le S.} \end{aligned}$$
(2.15c)

Proof

The second-order structure of \(\left\{ X_{t},t\in {\mathbb {Z}} \right\} \) is obtained from (2.1) while using (2.3):

$$\begin{aligned} \mu _{v}^{X}= & {} E\left( X_{nS+v}\right) =\alpha _{v}+\beta _{v}E\left( X_{nS+v-1}\right) =\left( 1-\prod \limits _{v=1}^{S}\beta _{v}\right) ^{-1} \sum \limits _{j=0}^{S-1}\prod \limits _{i=0}^{j-1}\beta _{v-i} \alpha _{v-j},\\ \gamma _{v}^{X}\left( 0\right)= & {} Var\left( X_{nS+v}\right) = \beta _{v}^{2}Var\left( X_{nS+v-1}\right) +\sigma _{v}^{2}=\left( 1-\prod \limits _{v=1}^{S}\beta _{v}^{2}\right) ^{-1}\sum \limits _{j=0}^{S-1} \prod \limits _{i=0}^{j-1}\beta _{v-i}^{2}\sigma _{v-j}^{2},\\ \gamma _{v}^{X}\left( h\right)= & {} {{ Cov}}\left( X_{nS+v},X_{nS+v-h}\right) =\beta _{v}\gamma _{v-1}^{X}\left( h-1\right) ,\quad h>0. \end{aligned}$$

Therefore, using (2.14) we have

$$\begin{aligned}&\left. \begin{array}{l} \mu _{v}^{Y}=E\left( Y_{nS+v}\right) =E\left( X_{nS+v}\right) +E\left( u_{nS+v}\right) =\left( 1-\prod \limits _{v=1}^{S}\beta _{v}\right) ^{-1}\sum \limits _{j=0}^{S-1} \prod \limits _{i=0}^{j-1}\beta _{v-i}\alpha _{v-j}+\mu _{u}, \\ \gamma _{v}^{Y}(0)=Var\left( Y_{nS+v}\right) =Var\left( X_{nS+v}\right) +\Delta _{u}^{2}=\left( 1-\prod \limits _{v=1}^{S}\beta _{v}^{2}\right) ^{-1}\sum \limits _{j=0}^{S-1} \prod \limits _{i=0}^{j-1}\beta _{v-i}^{2}\sigma _{v-j}^{2}+\Delta _{u}^{2} \end{array} \right. \\&\left. \begin{array}{c} \gamma _{v}^{Y}(h)=\gamma _{v}^{X}(h)=\beta _{v}\gamma _{v-1}^{X}(h-1)= \beta _{v}\beta _{v-1}\ldots \beta _{v-h+1}\gamma _{v-h}^{X}(0) \\ =\beta _{v}\beta _{v-1}\ldots \beta _{v-h+1}\left( 1-\prod \limits _{v=1}^{S}\beta _{v}^{2}\right) ^{-1}\sum \limits _{j=0}^{S-1} \prod \limits _{i=0}^{j-1}\beta _{v-h-i}^{2}\sigma _{v-h-j}^{2} \end{array},\quad h>0.\right. \end{aligned}$$

Clearly the process \(\left\{ Y_{t},t\in {\mathbb {Z}} \right\} \) has a PARMA representation since

$$\begin{aligned} \gamma _{v}^{Y}(h)=\beta _{v}\gamma _{v-1}^{Y}(h-1)\quad \text { for any }\quad h>1. \end{aligned}$$

To identify the parameters of such a representation we use expressions of \( \gamma _{v}^{Y}(h)\) for \(h=0,1\). If \(\left\{ Y_{t},t\in {\mathbb {Z}} \right\} \) has indeed a PARMA representation (2.15a) then for all \(1\le v\le S\),

$$\begin{aligned} \gamma _{v}^{Y}(0) =\,&\beta _{v}\gamma _{v}^{Y}(1)+\sigma _{\zeta ,v}^{2}\left( 1+\psi _{v}\left( \psi _{v}-\beta _{v}\right) \right) \nonumber \\ \gamma _{v}^{Y}(1) =\,&\beta _{v}\gamma _{v-1}^{Y}(0)-\psi _{v}\sigma _{\zeta , v}^{2}{.} \end{aligned}$$
(2.15d)

Hence, if \(\Delta _{u}^{2}\ne 0\) we have for all \(1\le v\le S,\)

$$\begin{aligned} \frac{1+\psi _{v}\left( \psi _{v}-\beta _{v}\right) }{\psi _{v}}&=\frac{ \gamma _{v}^{Y}(0)-\beta _{v}\gamma _{v}^{Y}(1)}{\beta _{v} \gamma _{v-1}^{Y}(0)-\gamma _{v}^{Y}(1)} \nonumber \\&=\dfrac{\gamma _{v}^{Y}(0)-\beta _{v}^{2}\gamma _{v-1}^{X}(0)}{\beta _{v}\left( \gamma _{v-1}^{X}(0)+\Delta _{u}^{2}\right) -\beta _{v}\gamma _{v-1}^{X}(0)} \nonumber \\&=\frac{\gamma _{v}^{Y}(0)-\left( \gamma _{v}^{X}(0)-\sigma _{v}^{2}\right) }{\beta _{v}\Delta _{u}^{2}} \nonumber \\&=\frac{\gamma _{v}^{Y}(0)-\left( \gamma _{v}^{Y}(0)-\sigma _{v}^{2} -\Delta _{u}^{2}\right) }{\beta _{v}\Delta _{u}^{2}} \nonumber \\&=\frac{\sigma _{v}^{2}+\Delta _{u}^{2}}{\beta _{v}\Delta _{u}^{2}}. \end{aligned}$$
(2.15e)

The latter equation admits, for all \(1\le v\le S\), two solutions, one of which has modulus less than 1 (\(\left| \psi _{v}\right| <1\)) and is given by (2.15b). Such a choice clearly ensures that \( \prod \nolimits _{v=1}^{S}\left| \psi _{v}\right| <1\), but it is not unique. Moreover, when \(\prod \nolimits _{v=1}^{S}\beta _{v}\ne 0\), using (2.15d), the variance of \(\left\{ \zeta _{t},t\in {\mathbb {Z}} \right\} \) is

$$\begin{aligned} \sigma _{\zeta , v}^{2}= & {} \frac{\beta _{v}\gamma _{v-1}^{Y}(0) -\gamma _{v}^{Y}(1)}{\psi _{v}} \\= & {} \frac{\beta _{v}\left( \gamma _{v-1}^{X}(0)+\Delta _{u}^{2}\right) -\beta _{v}\gamma _{v-1}^{X}(0)}{\psi _{v}}=\frac{\beta _{v}\Delta _{u}^{2}}{\psi _{v}}, \end{aligned}$$

showing (2.15c).

If, however, \(\Delta _{u}^{2}=0\) the relationship \(\gamma _{v}^{Y}(h)=\beta _{v}\gamma _{v-1}^{Y}(h-1)\) also holds for \(h=1\) and so the process \(\left\{ Y_{t},t\in {\mathbb {Z}} \right\} \) is a pure first-order periodic autoregression (PAR(1)) with \( \psi _{v}=0\) for all v. When \(\prod \nolimits _{v=1}^{S}\beta _{v}=0\), the process \(\left\{ Y_{t},t\in {\mathbb {Z}} \right\} \) is a strong periodic white noise (an ipd sequence) and so \(\psi _{v}=0\) for all v (see also Francq and Zakoïan (2006) for the particular non-periodic case \(S=1\)). \(\square \)
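
For concreteness, the following sketch (our illustration; the default \(\Delta _{u}^{2}=\pi ^{2}/2\) corresponds to Gaussian \(\eta _{t}\), see Sect. 3.1) computes \(\psi _{v}\) and \(\sigma _{\zeta ,v}^{2}\) from (2.15b) and (2.15c) for given seasonal parameters:

```python
import numpy as np

def parma_representation(beta, sigma2, delta2=np.pi**2 / 2):
    """MA parameters psi_v (2.15b) and innovation variances sigma^2_{zeta,v} (2.15c)
    of the PARMA(1,1) representation of Y_t = log(eps_t^2).
    delta2 = Var(log eta_t^2); pi^2/2 corresponds to Gaussian eta_t."""
    beta = np.asarray(beta, float)
    sigma2 = np.asarray(sigma2, float)
    if delta2 == 0.0 or np.prod(beta) == 0.0:    # degenerate cases in (2.15b)/(2.15c)
        return np.zeros_like(beta), np.zeros_like(beta)
    c1 = (1.0 + beta**2) * delta2 + sigma2
    c2 = (1.0 - beta**2) * delta2 + sigma2
    psi = (c1 - np.sqrt(c1 * c2)) / (2.0 * beta * delta2)   # the root with |psi_v| < 1
    var_zeta = beta * delta2 / psi
    return psi, var_zeta

psi, var_zeta = parma_representation(beta=[0.9, 0.95], sigma2=[0.09, 0.04])
```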

It is worth noting that representation (2.15a) is not unique. Indeed, in contrast with time-invariant ARMA models, for which an ARMA process may be uniquely identified from its autocovariance function (see Brockwell and Davis 1991), it is not always possible to build a unique PARMA model from an autocovariance function having a PARMA structure. However, we may enumerate all possible representations by solving (2.15d) and choose the one that best fits the observed series. The resulting representation will, by abuse of terminology, be called the PARMA representation. Such a representation is useful for obtaining predictions of the process \(\left\{ \log \left( \varepsilon _{t}^{2}\right) , t\in {\mathbb {Z}} \right\} \). It may also be used to obtain approximate predictions for the squared process \(\left\{ \varepsilon _{t}^{2},t\in {\mathbb {Z}} \right\} \), since the latter does not admit a PARMA representation (see Sect. 4.2). If we denote by \({\widehat{\varepsilon }}_{t+h/t}^{2}=E\left( \varepsilon _{t+h}^{2}/\varepsilon _{t}^{2},\varepsilon _{t-1}^{2},\ldots \right) \) the mean-square prediction of \(\varepsilon _{t+h}^{2} \) based on \(\varepsilon _{t}^{2},\varepsilon _{t-1}^{2},\ldots ,\) then \({\widehat{\varepsilon }}_{t+h/t}^{2}\) may be approximated by

$$\begin{aligned} C\exp \left( {{\widehat{\log \left( \varepsilon _{t+h/t}^{2}\right) }}}\right) , \end{aligned}$$

where

$$\begin{aligned} {{\widehat{\log \left( \varepsilon _{t+h/t}^{2}\right) }}}=E\left( \log \left( \varepsilon _{t+h}^{2}\right) /\log \left( \varepsilon _{t}^{2}\right) , \log \left( \varepsilon _{t-1}^{2}\right) ,\ldots \right) , \end{aligned}$$

and C is a normalization factor. The constant C is introduced to reduce the bias caused by incorrectly using the relationship

$$\begin{aligned} \exp \left( {{\widehat{\log \left( \varepsilon _{t+h/t}^{2}\right) }}}\right) = {{\widehat{\exp \log \left( \varepsilon _{t+h/t}^{2}\right) }}}, \end{aligned}$$

as we know from Jensen’s inequality that the latter equality is in fact not true. Typically, one can take C as the sample variance of \(\left( \log \left( \varepsilon _{t}^{2}\right) , t=1,\ldots ,T\right) \).

3 Parameter estimation of the PAR-SVs model

In this section we consider two estimation methods for the PAR-SVs model. The first one is a QML method based on the prediction-error decomposition of a corresponding linear periodic state-space model. This method, which uses Kalman filtering to obtain linear predictors and prediction-error variances, serves as a benchmark for the second proposed method, which is based on the Bayesian approach. In the latter method, starting from given conjugate priors, the conditional posteriors are obtained from the Gibbs sampler, in which the conditional posteriors of the augmented volatilities are derived via the Griddy–Gibbs technique. In the rest of this section we consider a series \(\varepsilon =\left( \varepsilon _{1},\ldots , \varepsilon _{T}\right) ^{\prime }\) generated from model (2.1) with sample size \(T=NS\), assumed without loss of generality to be a multiple of the period S. The vector of model parameters is denoted by \(\theta =\left( \omega ^{\prime },\sigma ^{2\prime }\right) ^{\prime }\) where \(\omega =\left( \omega _{1}^{\prime },\omega _{2}^{\prime },\ldots ,\omega _{S}^{\prime }\right) ^{\prime }\) with \( \omega _{v}=\left( \alpha _{v},\beta _{v}\right) ^{\prime }\) (\(1\le v\le S\)) and \(\sigma ^{2}=\left( \sigma _{1}^{2},\sigma _{2}^{2},\ldots , \sigma _{S}^{2}\right) ^{\prime }\).

3.1 QMLE via prediction error decomposition and Kalman filtering

Taking in (2.1) the logarithm of the square of \(\varepsilon _{t}\), we obtain the following linear periodic state-space model

$$\begin{aligned} \left\{ \begin{array}{l} Y_{nS+v}=\mu +X_{nS+v}+{\widetilde{u}}_{nS+v} \\ X_{nS+v}=\alpha _{v}+\beta _{v}X_{nS+v-1}+\sigma _{v}e_{nS+v} \end{array} \right. , \quad n\in {\mathbb {Z}},\quad 1\le v\le S, \end{aligned}$$
(3.1)

where, as above, \(Y_{nS+v}=\log \left( \varepsilon _{nS+v}^{2}\right) \), \(X_{nS+v}=\log \left( h_{nS+v}\right) \), \(u_{nS+v}=\log \left( \eta _{nS+v}^{2}\right) \), \(\mu =E\left( u_{nS+v}\right) \), \({\widetilde{u}}_{nS+v}=u_{nS+v}-\mu \) and \(\Delta _{u}^{2}=Var\left( u_{nS+v}\right) \). When \(\left\{ \eta _{t},t\in {\mathbb {Z}} \right\} \) is standard Gaussian, the mean and variance of \(\log \left( \eta _{nS+v}^{2}\right) \) are given by \(\psi \left( \frac{1}{2}\right) -\ln \left( \frac{1}{2}\right) \approx -1.27\) and \(\pi ^{2}/2\) respectively, where \(\psi \left( .\right) \) is the digamma function (e.g. Ruiz 1994). Note, however, that the linear state-space model (3.1) is not Gaussian, unless (i) \(e_{1}\) is Gaussian, (ii) \(e_{1}\) and \(\eta _{1}\) are independent and (iii) \(\eta _{1}\) has the same distribution as \(\exp \left( X/2\right) \) for some X normally distributed with mean zero and variance 1. In what follows we assume for simplicity of exposition that \(\eta _{1}\) is standard Gaussian, but the QML method we present below is still valid when \(\eta _{1}\) is not Gaussian and even when \(\mu \) and \(\Delta _{u}^{2}\) are unknown.
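
These two constants are easy to confirm by simulation (a quick sketch of ours):

```python
import numpy as np

# u_t = log(eta_t^2) for Gaussian eta_t: mean psi(1/2) - log(1/2) ~ -1.27 and
# variance pi^2/2 ~ 4.93
u = np.log(np.random.default_rng(0).standard_normal(2_000_000)**2)
print(u.mean(), u.var(), np.pi**2 / 2)
```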

Let \(Y=\left( Y_{1},\ldots , Y_{T}\right) ^{\prime }\) be the series of log-squares corresponding to \(\varepsilon =\left( \varepsilon _{1},\ldots , \varepsilon _{T}\right) ^{\prime }\) (i.e. \(Y_{t}=\log \left( \varepsilon _{t}^{2}\right) , \) \(1\le t\le T)\), which is generated from (3.1) with true parameter \(\theta _{0}\). The quasi-likelihood function \(l_{Q}(\theta ;Y) \) evaluated at a generic parameter \(\theta \) may be written via the prediction error decomposition as follows

$$\begin{aligned} \log (l_{Q}(\theta ;Y))=-\frac{T}{2}\log (2\pi )-\frac{1}{2} \sum \limits _{t=1}^{T} \left( \log (F_{t}) +\dfrac{\left( Y_{t}-{\widehat{Y}}_{{t}/{t-1}}\right) ^{2}}{F_{t}}\right) , \end{aligned}$$
(3.2)

where \({\widehat{Y}}_{t/{t-1}}={\widehat{X}}_{t \mid {t-1}}+\mu \), \({\widehat{X}}_{t\mid {t-1}}\) is the best predictor of the state \(X_{t}\) based on the observations \(Y_{1},\ldots ,Y_{t-1}\), with mean square error \( P_{t/{t-1}}=E\left( X_{t}-{\widehat{X}}_{t/{t-1}}\right) ^{2}\), and \( F_{t}=E\left( Y_{t}-{\widehat{Y}}_{t/{t-1}}\right) ^{2}\). A QML estimate \({\widehat{\theta }}_\mathrm{QML}\) of the true \(\theta _{0}\) is the maximizer of \(\log (l_{Q}(\theta ;Y))\) over some compact parameter space \(\Theta \), where \(l_{Q}(\theta ;Y)\) is evaluated as if the linear state-space model (3.1) were Gaussian. Thus the best state predictor \({\widehat{X}}_{t\mid {t-1}}\) and the state prediction-error variance \(P_{t/{t-1}}\) may be recursively computed using the Kalman filter, which in the context of model (3.1) is described by the following recursions

$$\begin{aligned} \begin{array}{l} {\widehat{X}}_{t/{t-1}}=\beta _{t}\left( {\widehat{X}}_{{t-1}/{t-2}}+P_{{t-1}/{t-2}}F_{t-1}^{-1}\left( Y_{t-1} -{\widehat{X}}_{{t-1}/{t-2}}-\mu \right) \right) +\alpha _{t} \\ P_{t/{t-1}}=\beta _{t}^{2}\left( P_{{t-1}/{t-2}}-P_{{t-1}/{t-2}}^{2}F_{t-1}^{-1}\right) +\sigma _{t}^{2} \\ F_{t}=P_{t/{t-1}}+\Delta _{u}^{2} \end{array}, \quad 2\le t\le T,\nonumber \\ \end{aligned}$$
(3.3a)

while remembering that \(\alpha _{t}\), \(\beta _{t}\) and \(\sigma _{t}^{2}\) are S-periodic over t. The start-up values of (3.3a) are calculated on the basis of \({\widehat{X}}_{1/0}=E\left( X_{1}\right) \) and \(P_{1/0}=Var\left( X_{1}\right) \). Using results of Sect. 2, we then get

$$\begin{aligned} {\widehat{X}}_{1/0}=\dfrac{\sum _{j=0}^{S-1}{\prod \limits _{i=0}^{j-1}} \beta _{1-i}\alpha _{1-j}}{1-{\prod \limits _{v=1}^{S}}\beta _{v}} \quad \text {and}\quad P_{1/0}= \frac{\sum _{j=0}^{S-1}{\mathop {\prod }\limits _{i=0}^{j-1}}{\beta _{1-i}^{2}} {\sigma _{1-j}^{2}}}{1-{\prod \limits _{v=1}^{S}}\beta _{v}^{2}}. \end{aligned}$$
(3.3b)

Recursions (3.3) may also be used in a reverse form for smoothing purposes, i.e. to obtain the best linear predictor \({\widetilde{X}}_{t}\) of \( X_{t}\) based on \(Y_{1},\ldots , Y_{T}\), from which we get estimates of the unobserved volatilities \(h_{t}\) \(\left( 1\le t\le T\right) \).
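
The recursions (3.3) translate directly into code. The following Python sketch (our illustration, assuming Gaussian \(\eta _{t}\) so that \(\mu \approx -1.27\) and \(\Delta _{u}^{2}=\pi ^{2}/2\) are known; the packing of \(\theta \) is our convention) evaluates the quasi-log-likelihood (3.2):

```python
import numpy as np
from scipy.optimize import minimize

def quasi_loglik(theta, Y, S):
    """Gaussian quasi-log-likelihood (3.2) of model (3.1), computed with the Kalman
    recursions (3.3). theta stacks (alpha_1..alpha_S, beta_1..beta_S, log sigma_1^2..).
    Seasons are indexed 0,...,S-1; Y[0] is assumed to belong to season 1 (index 0)."""
    theta = np.asarray(theta, float)
    alpha, beta = theta[:S], theta[S:2 * S]
    sig2 = np.exp(theta[2 * S:3 * S])
    mu, delta2 = -1.27, np.pi**2 / 2          # moments of log(eta_t^2), Gaussian eta
    prodb = np.prod(beta)
    if abs(prodb) >= 1.0:                     # outside the stationarity region (2.5)
        return -np.inf
    # start-up values (3.3b) for season v = 1 (index 0)
    X = sum(np.prod(beta[(-np.arange(j)) % S]) * alpha[(-j) % S]
            for j in range(S)) / (1.0 - prodb)
    P = sum(np.prod(beta[(-np.arange(j)) % S]**2) * sig2[(-j) % S]
            for j in range(S)) / (1.0 - prodb**2)
    ll = 0.0
    for t, y in enumerate(Y):
        F = P + delta2                        # prediction-error variance F_t
        e = y - X - mu                        # prediction error Y_t - Yhat_{t/t-1}
        ll -= 0.5 * (np.log(2 * np.pi) + np.log(F) + e * e / F)
        v = (t + 1) % S                       # season of the next time point
        X = beta[v] * (X + P * e / F) + alpha[v]
        P = beta[v]**2 * (P - P * P / F) + sig2[v]
    return ll

# QML estimation: maximize the criterion over theta, e.g. with
# Y = np.log(eps**2); S = 2
# res = minimize(lambda th: -quasi_loglik(th, Y, S), x0=np.zeros(3 * S),
#                method="Nelder-Mead")
```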

Consistency and asymptotic normality of the QML estimate may be established using standard theory of linear (non-Gaussian) signal plus noise models with time-invariant parameters (Dunsmuir 1979). For this, we invoke the corresponding multivariate time-invariant model (2.2) which we transform to a linear form as follows

$$\begin{aligned} \left\{ \begin{array}{l} \mathbf {Y}_{n}=\log H_{n}+\Xi _{n} \\ \log H_{n}=B\log H_{n-1}+\xi _{n} \end{array} \right. \quad n\in {\mathbb {Z}}, \end{aligned}$$
(3.4)

where \(\mathbf {Y}_{n}\) and \(\Xi _{n}\ \)are S-vectors such that \(\mathbf {Y}_{n}\left( v\right) =Y_{nS+v}\), and \(\Xi _{n}\left( v\right) =u_{nS+v}\) \( (1\le v\le S)\) and where \(\log H_{n},\) B and \(\xi _{n}\) are given by (2.2). In view of (3.4), we can use the theory in Dunsmuir (1979) to yield the asymptotic variance of the QMLE of \(\theta _{0}\) under the finiteness of \(E\left( Y_{v}^{4}\right) \) \((1\le v\le S)\) (see also Ruiz 1994 and Harvey et al. 1994).

Note finally that if we assume that: (i) \(e_{t}\) is Gaussian, (ii) \(e_{1}\) and \(\eta _{1}\) are independent, and (iii) \(\log \left( \eta _{1}^{2}\right) \sim N(0,1)\) (i.e. \(\eta _{1}\) has the same distribution as \(\exp \left( X/2\right) \) for some \(X\sim N(0,1)\)), then the linear state space (3.1) would be Gaussian and the QMLE of \(\theta _{0}\) would reduce to the exact maximum likelihood estimate (MLE), which is then asymptotically efficient. However, the assumption that \(\log \left( \eta _{1}^{2}\right) \sim N(0,1)\) seems to be of little interest in practice.

3.2 Bayesian inference via Gibbs sampling

Adopting the Bayesian approach, the parameter vector \(\theta \) of the model and the unobserved volatilities \(h=\left( h_{1},h_{2},\ldots ,h_{T}\right) ^{\prime }\), which are also considered as augmented parameters, are viewed as random with a certain prior distribution \(f\left( \theta , h\right) \). Given a series \(\varepsilon =\left( \varepsilon _{1},\ldots , \varepsilon _{T}\right) ^{\prime }\) generated from the PAR-SVs model (2.1) with Gaussian innovations, the goal is to make inference about the joint posterior distribution, \(f\left( \theta , h/\varepsilon \right) \), of \(\left( \theta , h\right) \) given \(\varepsilon \). Because of the periodic structure of the PAR-SVs model, it is natural to assume that the parameters h, \(\omega , \) \(\sigma _{1}^{2},\sigma _{2}^{2},\ldots ,\sigma _{S}^{2}\) are independent of each other. Thus, the joint posterior distribution \(f\left( \theta , h/\varepsilon \right) =f\left( \omega , \sigma ^{2},h/\varepsilon \right) \) can be estimated using Gibbs sampling provided we can draw samples from any of the \(S+2\) conditional posterior distributions \(f\left( \omega /\varepsilon , \sigma ^{2},h\right) \), \(f\left( \sigma _{v}^{2}/\varepsilon , \omega ,\sigma _{-\left\{ v\right\} }^{2},h\right) \) \( \left( 1\le v\le S\right) \) and \(f\left( h/\varepsilon , \omega , \sigma ^{2}\right) \), where \(x_{-\left\{ t\right\} }\) denotes the vector obtained from x after removing its t-th component \(x_{t}\). Since the posterior distribution of the volatility \(f\left( h/\varepsilon , \omega ,\sigma ^{2}\right) \) has a rather complicated expression, we sample it element-by-element as done by Jacquier et al. (1994). Thus, the “single-move” Gibbs sampler for the joint posterior distribution \(f\left( \omega , \sigma ^{2},h/\varepsilon \right) \) reduces to drawing samples from any of the \(T+S+1\) conditional posterior distributions \(f\left( \omega /\varepsilon , \sigma ^{2},h\right) \), \(f\left( \sigma _{v}^{2}/\varepsilon , \omega , \sigma _{-\left\{ v\right\} }^{2},h\right) \), \((1\le v\le S\)) and \(f\left( h_{t}/\varepsilon , \omega , \sigma ^{2},h_{-\left\{ t\right\} }\right) \), \(\left( 1\le t\le T\right) \). Under normality of the volatility proxies and using standard linear regression theory, with an appropriate adaptation to the periodic AR form of the log-volatility equation (2.1), the conditional posteriors \(f\left( \omega /\varepsilon , \sigma ^{2},h\right) \) and \(f\left( \sigma _{v}^{2}/\varepsilon , \omega , \sigma _{-\left\{ v\right\} }^{2},h\right) \), \((1\le v\le S\)) may be determined directly from given conjugate priors \(f\left( \omega \right) \) and \(f\left( \sigma _{v}^{2}\right) \), \(\left( 1\le v\le S\right) \). However, as in the non-periodic SV case (Jacquier et al. 1994), direct draws from the distribution \(f\left( h_{t}/\varepsilon , \omega , \sigma ^{2},h_{-\left\{ t\right\} }\right) \) are not possible because it has an unusual form. Nevertheless, unlike Jacquier et al. (1994), who used a Metropolis–Hastings chain after determining the form of \(f\left( h_{t}/\varepsilon , \omega , \sigma ^{2},h_{-\left\{ t\right\} }\right) \) up to a scaling factor, we use the Griddy–Gibbs procedure as in Tsay (2010) because, in our periodic context, its implementation seems much simpler.

3.2.1 Prior and posterior sampling analysis

(a) Sampling the log-volatility periodic autoregressive parameter \(\omega \)

Before giving the conditional posterior distribution \(f\left( \omega /\varepsilon , \sigma ^{2},h\right) \) through conjugate prior distributions and linear regression theory, we first write the PAR log-volatility equation in standard linear regression form. Setting \( \mathcal {H}_{nS+v}=\left( \underset{2(v-1)\text { times}}{\underbrace{0,\ldots ,0}}, 1,\log \left( h_{nS+v-1}\right) , \underset{2(S-v)\text { times}}{\underbrace{ 0,\ldots ,0}}\right) ^{\prime }\), model (2.1b) for \(t=1,\ldots ,NS\) may be rewritten as the following periodically homoskedastic linear regression

$$\begin{aligned} \log \left( h_{nS+v}\right) =\mathcal {H}_{nS+v}^{\prime }\omega +\sigma _{v}e_{nS+v},\quad 1\le v\le S,\quad 0\le n\le N-1, \end{aligned}$$
(3.5a)

or, equivalently, as the standard regression

$$\begin{aligned} \frac{\log \left( h_{nS+v}\right) }{\sigma _{v}}=\frac{1}{\sigma _{v}} \mathcal {H}_{nS+v}^{\prime }\omega +e_{nS+v},\quad 1\le v\le S,\quad 0\le n\le N-1, \end{aligned}$$
(3.5b)

with iid Gaussian errors. Assuming the variances \(\sigma _{v}^{2}\) \( \left( 1\le v\le S\right) \) and the initial observation \(h_{0}\) known, the least squares estimate \({\widehat{\omega }}_{WLS}\) of \(\omega \) based on (3.5b) (which is just the weighted least squares estimate of \(\omega \) based on (3.5a)) has the following form

$$\begin{aligned} {\widehat{\omega }}_{WLS}=\left( \sum _{n=0}^{N-1}\sum _{v=1}^{S}\frac{1}{\sigma _{v}^{2}}\mathcal {H}_{nS+v}\mathcal {H}_{nS+v}^{\prime }\right) ^{-1}\sum _{n=0}^{N-1}\sum _{v=1}^{S}\frac{1}{\sigma _{v}^{2}} \mathcal {H}_{nS+v}\log \left( h_{nS+v}\right) , \end{aligned}$$

and is normally distributed with mean \(\omega \) and covariance matrix

$$\begin{aligned} {{\Gamma }}=\left( \sum _{n=0}^{N-1}\sum _{v=1}^{S}\frac{1}{ \sigma _{v}^{2}}\mathcal {H}_{nS+v}\mathcal {H}_{nS+v}^{\prime }\right) ^{-1}. \end{aligned}$$
(3.6)

Under assumption (3.5b), the information the data carry about \(\omega \) is contained in the weighted least squares estimate \({\widehat{\omega }}_{WLS}\) of \(\omega \). To get a closed-form expression for the conditional posterior \( f\left( \omega /\varepsilon , \sigma ^{2},h\right) \) we use a conjugate prior for \(\omega \). This prior distribution is Gaussian, i.e. \(\omega \sim N\left( \omega ^{0},\Sigma ^{0}\right) \), where the hyperparameters \(\omega ^{0},\Sigma ^{0}\) are known and are fixed so as to obtain a prior that is reasonably diffuse yet informative.

Thus from standard regression theory (e.g. Box and Tiao 1973; Tsay 2010) the conditional posterior distribution of \(\omega \) given \(\varepsilon ,\sigma ^{2},h\) is:

$$\begin{aligned} \omega /\varepsilon , \sigma ^{2},h\sim N\left( \omega ^{*},\Sigma ^{*}\right) , \end{aligned}$$
(3.7a)

where

$$\begin{aligned} \Sigma ^{*}= & {} \left( \sum _{n=0}^{N-1}\sum _{v=1}^{S}\frac{1}{\sigma _{v}^{2}}\mathcal {H}_{nS+v}\mathcal {H}_{nS+v}^{\prime }+\left( \Sigma ^{0}\right) ^{-1}\right) ^{-1} \end{aligned}$$
(3.7b)
$$\begin{aligned} \omega ^{*}= & {} \Sigma ^{*}\left( \sum _{n=0}^{N-1}\sum _{v=1}^{S}\frac{ 1}{\sigma _{v}^{2}}\mathcal {H}_{nS+v}\log \left( h_{nS+v}\right) +\left( \Sigma ^{0}\right) ^{-1}\omega ^{0}\right) . \end{aligned}$$
(3.7c)

Some remarks are in order:

  1. (i)

    The matrix \({{\Gamma }}\) given by (3.6) is block diagonal. So if we assume that \(\Sigma ^{0}\) is also block diagonal, then we obtain the same result as if we assumed the seasonal parameters \(\omega _{1},\omega _{2},\ldots ,\omega _{S}\) to be independent of each other, each with a conjugate prior whose hyperparameters, say \(\omega _{v}^{0}\) and \( \Sigma _{v}^{0}\) (\(1\le v\le S\)), are the appropriate components of \( \omega ^{0}\) and \(\Sigma ^{0}\).

  2. (ii)

    Faster and more stable computation of \(\omega ^{*}\) and \(\Sigma ^{*}\) in (3.7), avoiding the matrix inversion in (3.7b), may be obtained by setting \(\omega ^{*}=\omega _{NS}^{*}\), \(\Sigma ^{*}=\Sigma _{NS}^{*}\) and computing these quantities recursively with the well-known recursive least squares (RLS) algorithm (see Ljung and Söderström 1983, Lemma 2.2), which is given by

    $$\begin{aligned} \begin{array}{l} \omega _{nS+v}^{*}=\omega _{nS+v-1}^{*} +\dfrac{\Sigma _{nS+v-1}^{*}\mathcal {H}_{nS+v}\left( \log \left( h_{nS+v}\right) - \mathcal {H}_{nS+v}^{\prime }\omega _{nS+v-1}^{*}\right) }{\sigma _{v}^{2}+\mathcal {H}_{nS+v}^{\prime }\Sigma _{nS+v-1}^{*}\mathcal {H}_{nS+v}} \\ \Sigma _{nS+v}^{*}=\Sigma _{nS+v-1}^{*} -\dfrac{\Sigma _{nS+v-1}^{*}\mathcal {H}_{nS+v} \mathcal {H}_{nS+v}^{\prime }\Sigma _{nS+v-1}^{*}}{\sigma _{v}^{2}+\mathcal {H}_{nS+v}^{\prime } \Sigma _{nS+v-1}^{*}\mathcal {H}_{nS+v}} \end{array}, \begin{array}{c} {1\le v\le S} \\ {0\le n\le N-1,} \end{array}\nonumber \\ \end{aligned}$$
    (3.8a)

    with starting values

    $$\begin{aligned} \omega _{0}^{*}=\omega ^{0}\quad \text { and }\quad \Sigma _{0}^{*}=\Sigma ^{0}. \end{aligned}$$
    (3.8b)

    This may improve the numerical stability and the computation time of the whole estimation method, especially for a large period S; a code sketch of the recursion is given below.
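To illustrate, the recursion (3.8) may be coded in a few lines. The sketch below is in Python/NumPy (the author's programs are in Matlab; this is merely an illustrative translation under our notational assumptions), with the parameter ordering \(\omega =\left( \alpha _{1},\beta _{1},\ldots ,\alpha _{S},\beta _{S}\right) ^{\prime }\) used throughout and illustrative function names.

```python
import numpy as np

def rls_omega_posterior(logh, sigma2, omega0, Sigma0, S):
    """RLS recursion (3.8): posterior mean omega* and covariance Sigma*
    of omega, starting from the prior (omega^0, Sigma^0).

    logh   : array containing log(h_0), ..., log(h_{NS})
    sigma2 : array of length S with the seasonal variances sigma_v^2
    omega0 : prior mean, shape (2S,); Sigma0 : prior covariance, (2S, 2S)
    """
    omega, Sigma = omega0.copy(), Sigma0.copy()
    T = len(logh) - 1                       # T = N*S observations
    for t in range(1, T + 1):
        v = (t - 1) % S                     # season v - 1 in 0-based indexing
        H = np.zeros(2 * S)                 # regressor H_t of (3.5a)
        H[2 * v] = 1.0                      # picks the intercept alpha_v
        H[2 * v + 1] = logh[t - 1]          # picks beta_v * log(h_{t-1})
        denom = sigma2[v] + H @ Sigma @ H
        gain = Sigma @ H / denom
        omega = omega + gain * (logh[t] - H @ omega)
        Sigma = Sigma - np.outer(gain, H @ Sigma)
    return omega, Sigma
```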

(b) Sampling the periodic variance parameters \(\sigma _{v}^{2},\quad \) \(1\le v\le S\)

We also use conjugate priors for \(\sigma _{v}^{2},\) \(1\le v\le S\), to get a closed-form expression for the conditional posterior of \(\sigma _{v}^{2}\) given the data and the other parameters \(\sigma _{-\left\{ v\right\} }^{2}\). Such priors are provided by the inverted chi-squared distribution:

$$\begin{aligned} \tfrac{a_{v}\lambda _{v}}{\sigma _{v}^{2}}\sim \chi _{a_{v}}^{2},\quad 1\le v\le S, \end{aligned}$$
(3.9a)

where \(a_{v}\lambda _{v}=1\) \(\left( 1\le v\le S\right) \). Given the parameters \(\omega \ \)and h, if we define

$$\begin{aligned} e_{nS+v}=\log \left( h_{nS+v}\right) -\alpha _{v}-\beta _{v}\log \left( h_{nS+v-1}\right) ,\quad 1\le v\le S, \quad 0\le n\le N-1,\nonumber \\ \end{aligned}$$
(3.9b)

then \(e_{v},e_{v+S},\ldots ,e_{\left( N-1\right) S+v}\) are iid \(N\left( 0,\sigma _{v}^{2}\right) \), \(1\le v\le S\). From standard Bayesian linear regression theory (see e.g. Tsay 2010), the conditional posterior distribution of \( \sigma _{v}^{2}\) (\(1\le v\le S\)) given the data and the remaining parameters is an inverted chi-squared distribution with \(a_{v}+N-1\) degrees of freedom, that is

$$\begin{aligned} \dfrac{a_{v}\lambda _{v}+\sum _{n=0}^{N-1}e_{nS+v}^{2}}{\sigma _{v}^{2}} /\varepsilon , \omega , \sigma _{-\left\{ v\right\} }^{2},h\sim \chi _{a_{v}+N-1}^{2},\quad 1\le v\le S. \end{aligned}$$
(3.9c)
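In code, a draw from (3.9c) only requires the seasonal residual sums of squares computed from (3.9b); a minimal sketch (Python/NumPy, same data layout and illustrative names as in the RLS sketch above) is:

```python
import numpy as np

def draw_sigma2(logh, omega, S, a, lam, rng=np.random.default_rng()):
    """Draw sigma_v^2 (v = 1..S) from the inverted chi-squared posterior (3.9c).

    logh : array log(h_0), ..., log(h_{NS}); omega : (alpha_v, beta_v) pairs
    a, lam : arrays of prior hyperparameters a_v and lambda_v
    """
    T = len(logh) - 1
    N = T // S
    sigma2 = np.empty(S)
    for v in range(S):                      # v is the 0-based season index
        t = np.arange(v + 1, T + 1, S)      # times t = nS + v + 1, n = 0..N-1
        e = logh[t] - omega[2 * v] - omega[2 * v + 1] * logh[t - 1]  # (3.9b)
        # (a_v lambda_v + sum_n e_{nS+v}^2) / sigma_v^2 ~ chi^2_{a_v + N - 1}
        sigma2[v] = (a[v] * lam[v] + e @ e) / rng.chisquare(a[v] + N - 1)
    return sigma2
```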

(c) Sampling the augmented volatility parameters \({\underline{h}}=\left( h_{1},h_{2},\ldots ,h_{T}\right) ^{\prime }\)

Now, it remains to sample from the conditional posterior distribution \( f\left( h_{t}/\varepsilon , \theta , h_{-\left\{ t\right\} }\right) \) for \( t=1,2,\ldots ,T\). Let us first give the expression of this distribution (up to a multiplicative constant) and then show how to (indirectly) draw samples from it using the Griddy–Gibbs technique. Because of the Markovian (but non-homogeneous) structure of the volatility process \(\left\{ h_{t},t\in {\mathbb {Z}} \right\} \) and the conditional independence of \(\varepsilon _{t}\) and \( h_{t-h}\) (\(h\ne 0\)) given \(h_{t}\), it follows that for any \(1<t<T\),

$$\begin{aligned} f\left( h_{t}/\varepsilon , \theta , h_{-\left\{ t\right\} }\right) \varpropto f\left( \varepsilon _{t}/\theta , h_{t}\right) f\left( h_{t}/\theta , h_{t-1}\right) f\left( h_{t+1}/\theta , h_{t}\right) . \end{aligned}$$
(3.10)

Using the fact that \(\varepsilon _{t}/\theta , h_{t}\equiv \varepsilon _{t}/h_{t}\sim N\left( 0,h_{t}\right) \), \(\log \left( h_{t}\right) /\log \left( h_{t-1}\right) , \theta \sim N\left( \alpha _{t}+\beta _{t}\log \left( h_{t-1}\right) ,\sigma _{t}^{2}\right) , \) and \(d\log (h_{t})=\frac{1}{h_{t}} dh_{t}\), formula (3.10) becomes

$$\begin{aligned} f\left( h_{t}/\varepsilon , \theta , h_{-\left\{ t\right\} }\right) \varpropto \frac{1}{\sqrt{h_{t}^{3}}}\exp \left( -\frac{\varepsilon _{t}^{2}}{2h_{t}}- \frac{1}{2\Omega _{t}}\left( \log \left( h_{t}\right) -\mu _{t}\right) ^{2}\right) ,\quad 1<t<T,\nonumber \\ \end{aligned}$$
(3.11a)

where

$$\begin{aligned} \mu _{t}= & {} \frac{\sigma _{t+1}^{2}\left( \alpha _{t}+\beta _{t}\log \left( h_{t-1}\right) \right) +\sigma _{t}^{2}\beta _{t+1}\left( \log \left( h_{t+1}\right) -\alpha _{t+1}\right) }{\sigma _{t+1}^{2} +\sigma _{t}^{2}\beta _{t+1}^{2}} \end{aligned}$$
(3.11b)
$$\begin{aligned} \Omega _{t}= & {} \frac{\sigma _{t+1}^{2}\sigma _{t}^{2}}{\sigma _{t+1}^{2} +\sigma _{t}^{2}\beta _{t+1}^{2}}. \end{aligned}$$
(3.11c)

Note that in (3.11a) we have used the well-known formula (see Box and Tiao 1973, p. 418) \(A\left( x-a\right) ^{2}+B\left( x-b\right) ^{2}=\left( x-c\right) ^{2}\left( A+B\right) +\left( a-b\right) ^{2}\dfrac{AB}{A+B}\), where \(c=(Aa+Bb)/(A+B)\) provided that \(A+B\ne 0\).
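In particular, applying this identity to the two normal kernels in \(\log \left( h_{t}\right) \) appearing in (3.10), namely with

$$\begin{aligned} A=\frac{1}{\sigma _{t}^{2}},\quad a=\alpha _{t}+\beta _{t}\log \left( h_{t-1}\right) ,\quad B=\frac{\beta _{t+1}^{2}}{\sigma _{t+1}^{2}},\quad b=\frac{\log \left( h_{t+1}\right) -\alpha _{t+1}}{\beta _{t+1}}, \end{aligned}$$

(assuming \(\beta _{t+1}\ne 0\); otherwise the second kernel does not involve \(\log \left( h_{t}\right) \)) gives \(A+B=\Omega _{t}^{-1}\) and \(c=\left( Aa+Bb\right) /\left( A+B\right) =\mu _{t}\), which is exactly (3.11b) and (3.11c).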

For the two end-points \(h_{1}\) and \(h_{T}\) we may simply use a naive approach which consists of assuming \(h_{1}\) fixed, so that the sampling starts at \(t=2\), and using the fact that \(\log \left( h_{T}\right) /\theta , \log \left( h_{T-1}\right) \sim N(\alpha _{T}+\beta _{T}\log \left( h_{T-1}\right) ,\sigma _{T}^{2})\). Alternatively, we may use a forecast of \(h_{T+1}\) and a backward prediction of \(h_{0}\) and employ formula (3.11) again for \(0<t<T+1\). In that case, we forecast \(h_{T+1}\) on the basis of the log-volatility equation of model (2.1) by using a 2-step ahead forecast \({\widehat{\log \left( h_{T-1}\right) }}\left( 2\right) \) at the origin \(T-1\), which is given from (2.1) by \({\widehat{\log \left( h_{T-1}\right) }}\left( 2\right) =\alpha _{T+1}+\beta _{T+1}\alpha _{T}+\beta _{T+1}\beta _{T}\log \left( h_{T-1}\right) \). The backward forecast of \( h_{0} \) is obtained using a 2-step ahead backward forecast on the basis of the backward periodic autoregression (Sakai and Ohno 1997) associated with the PAR log-volatility equation.

Once the conditional posterior \(f\left( h_{t}/\varepsilon , \theta ,h_{-\left\{ t\right\} }\right) \) is determined up to a scale factor, we may use an indirect sampling algorithm to draw the volatility \(h_{t}\). Jacquier et al. (1994) used a rejection Metropolis–Hastings algorithm. Alternatively, following Tsay (2010), we use the Griddy–Gibbs technique (Ritter and Tanner 1992), which consists of:

  1. (i)

    Choosing a grid of m points from a given interval \([h_{t1},h_{tm}]\) of \( h_{t}\): \(h_{t1}\le h_{t2}\le \ldots \le h_{tm}\); then evaluating the conditional posterior \(f\left( h_{t}/\varepsilon , \theta , h_{-\left\{ t\right\} }\right) \) via (3.11) (ignoring the normalization constant) at each one of these points, giving \(f_{ti}=f\left( h_{ti}/\varepsilon , \theta ,h_{-\left\{ t\right\} }\right) \), \(i=1,\ldots ,m\).

  2. (ii)

    Building from the values \(f_{t1},f_{t2},\ldots ,f_{tm}\) the discrete distribution \(p\left( .\right) \) defined at \(h_{ti}\) (\(1\le i\le m\)) by \( p\left( h_{ti}\right) =\dfrac{f_{ti}}{\sum _{j=1}^{m}f_{tj}}\). This provides a discrete approximation to \(f\left( h_{t}/\varepsilon , \theta , h_{-\left\{ t\right\} }\right) \), whose cumulative distribution can then be inverted.

  3. (iii)

    Generating a number from the uniform distribution on (0, 1) and transforming it using the inverse of the cumulative distribution associated with \(p\left( .\right) \) obtained in (ii) to get a random draw for \(h_{t}\).

It is worth noting that the choice of the grid \([h_{t1},h_{tm}]\) is crucial for the efficiency of the Griddy–Gibbs algorithm. We follow here a device similar to that of Tsay (2010), which consists of taking the range of \(h_{t}\) at the lth Gibbs iteration to be \([h_{t}^{*m},h_{t}^{*M}]\), where

$$\begin{aligned} h_{t}^{*m}=0.6\min \left( h_{t}^{(0)},h_{t}^{(l-1)}\right) ,\quad h_{t}^{*M}=1.4\max \left( h_{t}^{(0)},h_{t}^{(l-1)}\right) , \end{aligned}$$
(3.12)

\(h_{t}^{\left( l-1\right) }\) and \(h_{t}^{\left( 0\right) }\) being, respectively, the draw of \(h_{t}\) at the \((l-1)\)-th iteration and its initial value.
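Steps (i)–(iii), combined with the range rule (3.12), translate directly into code. The sketch below (Python/NumPy; function and variable names are ours, not the author's Matlab code) evaluates the kernel (3.11a) on the grid in log form for numerical stability and then draws from the normalized discrete distribution, which is equivalent to the inverse-cdf step (iii); it applies to interior points \(1<t<T\).

```python
import numpy as np

def griddy_draw_h(t, h, eps, alpha, beta, sigma2, S, h0_init, m=500,
                  rng=np.random.default_rng()):
    """Single-move Griddy-Gibbs draw of h_t from the kernel (3.11a), 1 < t < T.

    h : current volatility path h[0..T]; eps : returns with eps[1..T] used
    alpha, beta, sigma2 : seasonal parameters indexed 0..S-1
    h0_init : initial estimate h_t^{(0)} entering the range rule (3.12)
    """
    v, vnext = (t - 1) % S, t % S           # seasons of times t and t + 1
    # posterior mean and variance of log(h_t), formulas (3.11b)-(3.11c)
    den = sigma2[vnext] + sigma2[v] * beta[vnext] ** 2
    mu = (sigma2[vnext] * (alpha[v] + beta[v] * np.log(h[t - 1]))
          + sigma2[v] * beta[vnext] * (np.log(h[t + 1]) - alpha[vnext])) / den
    Omega = sigma2[vnext] * sigma2[v] / den
    # grid range (3.12); min/max keep the interval non-empty
    grid = np.linspace(0.6 * min(h0_init, h[t]), 1.4 * max(h0_init, h[t]), m)
    logk = (-1.5 * np.log(grid) - eps[t] ** 2 / (2 * grid)
            - (np.log(grid) - mu) ** 2 / (2 * Omega))    # log of (3.11a)
    p = np.exp(logk - logk.max())
    return rng.choice(grid, p=p / p.sum())               # inverse-cdf draw
```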

3.2.2 Bayes Griddy Gibbs sampler for PAR-SVs

The following algorithm summarizes the Gibbs sampler for drawing from the conditional posterior distribution \(f\left( \theta ,h/\varepsilon \right) \) given \(\varepsilon \). For \(l=0,1,\ldots ,M\), consider the notation \(h^{\left( l\right) }=\left( h_{1}^{\left( l\right) },\ldots ,h_{T}^{\left( l\right) }\right) ^{\prime }\), \(\omega ^{\left( l\right) }=\left( \alpha _{1}^{\left( l\right) },\beta _{1}^{\left( l\right) },\ldots ,\alpha _{S}^{\left( l\right) },\beta _{S}^{\left( l\right) }\right) ^{\prime }\) and \(\sigma ^{2\left( l\right) }=\left( \sigma _{1}^{2\left( l\right) },\sigma _{2}^{2\left( l\right) },\ldots ,\sigma _{S}^{2\left( l\right) }\right) ^{\prime }\).

Algorithm 3.1

Step 0 Specify starting values \(h^{\left( 0\right) }\), \(\omega ^{\left( 0\right) }\) and \(\sigma ^{2\left( 0\right) }\).

Step 1 Repeat for \(l=0,1,...,M-1,\)

                  Draw \(\omega ^{\left( l+1\right) }\) from \(f\left( \omega /\varepsilon ,\sigma ^{2\left( l\right) },h^{\left( l\right) }\right) \) using (3.7a) and (3.8).

                  Draw \(\sigma ^{2\left( l+1\right) }\) from \(f\left( \sigma ^{2}/\varepsilon ,\omega ^{\left( l+1\right) },h^{\left( l\right) }\right) \) using (3.9b) and (3.9c).

                  Repeat for \(t=1,2,...,T=NS\)

                   Griddy Gibbs:

                        Select a grid of m points \(\left( h_{ti}^{\left( l+1\right) }\right) \): \(h_{t1}^{\left( l+1\right) }\le h_{t2}^{\left( l+1\right) }\le ...\le h_{tm}^{\left( l+1\right) }\).

                        For \(1\le i\le m\) calculate \(f_{ti}^{\left( l+1\right) }=f\left( h_{ti}^{\left( l+1\right) }/\varepsilon ,\theta ^{\left( l+1\right) },h_{-\left\{ t\right\} }^{\left( l\right) }\right) \) from (3.11).

                        Define the discrete distribution \(p\left( h_{ti}^{\left( l+1\right) }\right) =\dfrac{f_{ti}^{\left( l+1\right) }}{\sum _{j=1}^{m}f_{tj}^{\left( l+1\right) }}\), \(1\le i\le m\).

                        Generate a number u from the uniform (0, 1) distribution.

                        Transform u using the inverse cumulative distribution associated with \(p\left( .\right) \) to get \(h_{t}^{\left( l+1\right) }\), which is

                        considered as a draw from \(f\left( h_{t}/\varepsilon ,\theta ^{\left( l+1\right) },h_{-\left\{ t\right\} }^{\left( l\right) }\right) \).

Step 2 Return the values \(h^{\left( l\right) }\), \(\omega ^{\left( l\right) }\) and \(\sigma ^{2\left( l\right) }\), \(l=1,...,M\). \(\square \)
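For concreteness, Algorithm 3.1 may be assembled as the following loop, reusing the functions rls_omega_posterior, draw_sigma2 and griddy_draw_h sketched above; this is a schematic Python illustration under our notational assumptions, not the author's Matlab implementation (the end points of h are kept fixed here for brevity; see the discussion after (3.11)).

```python
import numpy as np

def bgg_sampler(eps, S, omega0, Sigma0, a, lam, h_init, M=5500, burn=500,
                rng=np.random.default_rng()):
    """Single-move Bayes Griddy-Gibbs sampler for the PAR-SVs model (2.1)."""
    T = len(eps) - 1                        # eps[1..T] hold the returns
    h = h_init.copy()                       # h[0..T]; h[0] is held fixed
    sigma2 = np.full(S, 0.1)                # arbitrary starting variances
    draws = []
    for l in range(M):
        logh = np.log(h)
        # draw omega from N(omega*, Sigma*) of (3.7), computed via RLS (3.8)
        omega_star, Sigma_star = rls_omega_posterior(logh, sigma2,
                                                     omega0, Sigma0, S)
        omega = rng.multivariate_normal(omega_star, Sigma_star)
        # draw the seasonal variances from the posterior (3.9c)
        sigma2 = draw_sigma2(logh, omega, S, a, lam, rng)
        # single-move Griddy-Gibbs update of the volatilities
        alpha, beta = omega[0::2], omega[1::2]
        for t in range(1, T):               # interior points only
            h[t] = griddy_draw_h(t, h, eps, alpha, beta, sigma2, S,
                                 h_init[t], rng=rng)
        if l >= burn:
            draws.append((omega, sigma2, h.copy()))
    return draws
```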

3.2.3 Inference and prediction using the Gibbs sampler for PAR-SVs

Once samples from the posterior distribution \(f\left( \theta ,h/\varepsilon \right) \) are available, statistical inference for the PAR-SVs model may easily be made.

The Bayes Griddy–Gibbs parameter estimate \({\widehat{\theta }}_{BGG}\) of \( \theta \) is taken to be the posterior mean \({\overline{\theta }}=E\left( \theta /\varepsilon \right) \) which is, under the Markov chain ergodic theorem, approximated with any desired degree of accuracy by

$$\begin{aligned} {\widehat{\theta }}_{BGG}=\tfrac{1}{M}\sum _{l=l_{0}}^{M+l_{0}}\theta ^{\left( l\right) }, \end{aligned}$$

where \(\theta ^{\left( l\right) }\) is the l-th draw of \(\theta \) from \( f\left( \theta , h/\varepsilon \right) \) given by Algorithm 3.1, \(l_{0}\) is the burn-in size, i.e. the number of initial draws discarded, and M is the number of draws.

Smoothed and forecasted volatilities are obtained as by-products of the Bayes Griddy–Gibbs method. The smoothed value, \({\overline{h}}_{t}=E\left( h_{t}/\varepsilon \right) \), of \(h_{t}\) (\(1\le t\le T\)) is obtained by sampling from the distribution \(f\left( h_{t}/\varepsilon \right) \), which is the corresponding marginal of the posterior distribution \(f\left( \theta , h/\varepsilon \right) \). So \(E\left( h_{t}/\varepsilon \right) \) may be accurately approximated by \(\frac{1}{M}\sum _{l=l_{0}}^{M+l_{0}}h_{t}^{\left( l\right) }\), where \(h_{t}^{\left( l\right) }\) is the l-th draw of \(h_{t}\) given by Algorithm 3.1. Forecasts of the future values \(h_{T+1},h_{T+2},\ldots ,h_{T+k}\) are obtained either, as above, from the log-volatility equation evaluated at the Bayes parameter estimates, or directly by sampling from the predictive distribution \(f\left( h_{T+1},h_{T+2},\ldots ,h_{T+k}/\varepsilon \right) \) (see also Jacquier et al. 1994).

3.2.4 MCMC diagnostics

It is important to discuss the numerical properties of the proposed BGG method, in which the volatilities are sampled element by element. Despite its ease of implementation, it is well documented that the main drawback of the single-move approach (e.g. Kim et al. 1998) is that the posterior draws are often highly correlated, resulting in slow mixing and hence slow convergence. Among several MCMC diagnostic measures, we consider here the relative numerical inefficiency (RNI) (e.g. Geweke 1989; Geyer 1992), which is given by

$$\begin{aligned} { RNI}=1+2\sum \limits _{k=1}^{B}K\left( \tfrac{k}{B}\right) {\widehat{\rho }}_{k}, \end{aligned}$$

where \(B=500\) is the bandwidth, \(K\left( .\right) \) is the Parzen kernel (e.g. Kim et al. 1998) and \({\widehat{\rho }}_{k}\) is the sample autocorrelation at lag k of the BGG parameter draws. The RNI quantifies the inefficiency due to the serial correlation of the BGG draws (see also Geweke 1989; Tsiakas 2006). Another MCMC diagnostic measure (Geweke 1989) we use here is the numerical standard error (NSE), which is the square root of the estimated asymptotic variance of the MCMC estimator. The NSE is given by

$$\begin{aligned} { NSE}=\sqrt{\tfrac{1}{M}\left( {\widehat{\gamma }}_{0}+2\sum \limits _{k=1}^{B}K \left( \tfrac{k}{B}\right) {\widehat{\gamma }}_{k}\right) }, \end{aligned}$$

where \({\widehat{\gamma }}_{k}\) is the sample autocovariance at lag k of the BGG parameter draws, \(K\left( .\right) \) is the Parzen kernel and M is the number of draws.
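Both diagnostics are simple functionals of the sample autocovariances of the draws. A possible implementation (Python/NumPy, with the Parzen lag window written out explicitly; it assumes the chain is longer than the bandwidth B) is:

```python
import numpy as np

def parzen(z):
    """Parzen lag window K(z) on [0, 1]."""
    z = abs(z)
    if z <= 0.5:
        return 1.0 - 6.0 * z**2 + 6.0 * z**3
    return 2.0 * (1.0 - z) ** 3 if z <= 1.0 else 0.0

def rni_nse(draws, B=500):
    """RNI and NSE of a scalar chain of M > B MCMC parameter draws."""
    x = np.asarray(draws, dtype=float)
    M = len(x)
    xc = x - x.mean()
    # sample autocovariances gamma_0, ..., gamma_B
    gamma = np.array([xc[: M - k] @ xc[k:] / M for k in range(B + 1)])
    w = np.array([parzen(k / B) for k in range(1, B + 1)])
    rni = 1.0 + 2.0 * (w * gamma[1:] / gamma[0]).sum()
    nse = np.sqrt((gamma[0] + 2.0 * (w * gamma[1:]).sum()) / M)
    return rni, nse
```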

3.2.5 Period selection via the Deviance Information Criterion

An important issue in PAR-SVs modeling is the selection of the period S. This problem is especially pronounced when modeling daily returns because their periodicity is not as obvious as in quarterly or monthly data. Although many authors (e.g. Franses and Paap 2000; Tsiakas 2006) have emphasized the day-of-the-week effect in daily stock returns, which often entails a period of \(S=5\), period selection in periodic volatility models remains challenging. Standard order-selection measures such as the AIC and BIC, which require the specification of the number of free parameters in each model, are not applicable for comparing complex Bayesian hierarchical models like the PAR-SVs model. This is because in the PAR-SVs model the number of free parameters, once augmented by the latent volatilities, which are not independent but Markovian, is not well defined (cf. Berg et al. 2004). For a long time, the Bayes factor has been viewed as the best way to carry out Bayesian model comparison. However, its calculation rests on evaluating the marginal likelihood, which requires extremely high-dimensional integration; this is all the more computationally demanding for the PAR-SVs model, whose number of parameters, once augmented by the volatilities, exceeds the sample size.

In this paper, we instead carry out period selection using the deviance information criterion (DIC), which may be viewed as a trade-off between model adequacy and model complexity (Spiegelhalter et al. 2002). This criterion, a Bayesian generalization of the AIC, is easily obtained from MCMC draws and requires no extra calculations. The (conditional) DIC introduced by Spiegelhalter et al. (2002) is defined in the PAR-SVs context as

$$\begin{aligned} { DIC}\left( S\right) =-4E_{\theta , h/\varepsilon }\left( \log \left( f\left( \varepsilon /\theta , h\right) \right) \right) +2\log \left( f\left( \varepsilon /{\overline{\theta }},{\overline{h}}\right) \right) , \end{aligned}$$

where \(f\left( \varepsilon /\theta , h\right) \) is the (conditional) likelihood of the PAR-SVs model for a given period S and \(\left( {\overline{\theta }},{\overline{h}}\right) =E\left( (\theta ,h)/\varepsilon \right) \) is the posterior mean of \((\theta , h)\). From the Griddy–Gibbs draws, the expectation \(E_{\theta , h/\varepsilon }\left( \log \left( f\left( \varepsilon /\theta , h\right) \right) \right) \) can be estimated by averaging the conditional log-likelihood, \(\log f\left( \varepsilon /\theta ,h\right) \), over the posterior draws of \((\theta , h)\). Further, the joint posterior mean \(({\overline{\theta }},{\overline{h}})\) can be approximated by the mean of the posterior draws \((\theta ^{\left( l\right) },h^{\left( l\right) })\). Using the fact that \(\log f\left( \varepsilon /\theta , h\right) =\log f\left( \varepsilon /h\right) =-\frac{1}{2} \sum _{t=1}^{T}\left( \log \left( 2\pi h_{t}\right) +\frac{\varepsilon _{t}^{2}}{h_{t}}\right) \), the DIC\(\left( S\right) \) is estimated by

$$\begin{aligned} \tfrac{2}{M}\sum _{l=l_{0}}^{l_{0}+M}\sum _{t=1}^{T}\left( \log \left( 2\pi h_{t}^{\left( l\right) }\right) +\frac{\varepsilon _{t}^{2}}{h_{t}^{\left( l\right) }}\right) -\sum _{t=1}^{T}\left( \log \left( 2\pi {\overline{h}}_{t}\right) +\frac{\varepsilon _{t}^{2}}{{\overline{h}}_{t}}\right) , \end{aligned}$$

where \(h_{t}^{\left( l\right) }\) denotes the l-th BGG draw of \(h_{t}\), M is the number of draws, \(l_{0}\) is the burn-in size and \({\overline{h}}_{t}:=E\left( h_{t}/\varepsilon \right) \) is estimated by \(\frac{1}{M} \sum _{l=l_{0}}^{l_{0}+M}h_{t}^{\left( l\right) }\) (\(1\le t\le T\)). Of course, the model with the smallest DIC value is preferred.
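Given the stored draws, the above estimate of DIC(S) is a short computation; the sketch below (Python/NumPy, illustrative names) assumes h_draws is an \((M,T)\) array of post-burn-in volatility draws aligned with the return series eps.

```python
import numpy as np

def dic_par_sv(eps, h_draws):
    """Conditional DIC estimate for a fitted PAR-SVs model.

    eps : (T,) array of returns; h_draws : (M, T) post-burn-in draws of h_t
    """
    dev = np.log(2 * np.pi * h_draws) + eps**2 / h_draws   # (M, T) deviances
    h_bar = h_draws.mean(axis=0)                           # posterior mean of h
    dev_bar = np.log(2 * np.pi * h_bar) + eps**2 / h_bar   # (T,) at the mean
    return 2 * dev.sum(axis=1).mean() - dev_bar.sum()
```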

Since the DIC is random, and for the same fitted series its value may change from one MCMC run to another, it is useful to obtain its numerical standard error. However, as pointed out by Berg et al. (2004), no efficient method has been developed for calculating reasonably accurate Monte Carlo standard errors of the DIC. Nevertheless, following the recommendation of Zhu and Carlin (2000), we simply replicate the calculation of the DIC G times and estimate Var(DIC) by the sample variance, giving a broad indication of the variability of the DIC.

Note finally that for the class of latent variable models to which the PAR-SVs model belongs, there are in fact several alternative definitions of the DIC, depending on the concept of likelihood used (complete, observed, conditional); the one we work with here is the conditional DIC in the categorization of Celeux et al. (2006). We have avoided the observed DIC because, like the Bayes factor, it is based on evaluating the marginal likelihood, whose computation is typically very time-consuming.

4 Simulation study: finite-sample performance of the QML and BGG estimates

In this section, a simulation study is undertaken to assess the performance of the QML and BGG Bayes estimates in finite samples.

Concerning the finite-sample properties of the QML and BGG estimates, three instances of the Gaussian PAR-SVs model with period \(S=2\) are considered; they are reported respectively in Tables 1, 2 and 3. The parameter vector \(\theta =\left( \alpha _{1},\beta _{1}, \alpha _{2},\beta _{2}, \sigma _{1}^{2},\sigma _{2}^{2}\right) ^{\prime }\) is chosen for each instance so as to be in accordance with empirical evidence. In particular, in the three instances the persistence parameter \(\beta _{1}\beta _{2}\) equals 0.90, 0.95 and 0.99, respectively. We have also set small values for \(\sigma _{1}^{2}\) and \(\sigma _{2}^{2}\) because this is a critical case for the performance of the QMLE, as pointed out by Ruiz (1994) and Harvey et al. (1994) in the standard SV case. The choice of \(S=2\) is motivated by computational considerations. For each instance, we have considered 1000 replications of PAR-SVs series with sample size 1500, for which we calculated the QML and BGG estimates. Means of the estimates (\({\widehat{\theta }}_{QML}\) and \({\widehat{\theta }}_{BGG}\)) and their standard deviations (Std) over the 1000 replications are reported in Tables 1, 2 and 3.

Table 1 Instance 1—simulation results for QML and BGG on a Gaussian PAR-SV\(_{2}\) with \(T=1500\)
Table 2 Instance 2—simulation results for QML and BGG on a Gaussian PAR-SV\(_{2}\) with \(T=1500\)
Table 3 Instance 3—simulation results for QML and BGG on a Gaussian PAR-SV\(_{2}\) with \(T=1500\)

For the QML method a nonlinear optimization routine is required. We have applied a Gauss–Newton type algorithm started from different initial values of \(\theta \). For the Bayes Griddy Gibbs estimate, we have taken the same prior distributions for \(\omega =\left( \alpha _{1},\beta _{1},\alpha _{2},\beta _{2}\right) ^{\prime }\) across instances:

$$\begin{aligned} \begin{array}{c} \omega \sim N\left( \omega _{0},{ diag}\left( 0.05,0.5,0.05,0.5\right) \right) , \quad \omega _{0}=\left( 0,0,0,0\right) ^{\prime }, \\ \frac{1}{\sigma _{1}^{2}}\sim \chi _{5}^{2}, \frac{1}{\sigma _{2}^{2}}\sim \chi _{5}^{2}, \end{array} \end{aligned}$$

which are quite diffuse, but proper. Concerning initial parameter values, the initial volatility \(h^{\left( 0\right) }\) in the Gibbs sampler is taken to be the volatility generated by the fitted GARCH\(\left( 1,1\right) \), that is \(h^{\left( 0\right) }=h^{G}\) where

$$\begin{aligned} \left\{ \begin{array}{l} \varepsilon _{t}=\sqrt{h_{t}^{G}}\eta _{t} \\ h_{t}^{G}=\varphi _{0}+\varphi _{1}\varepsilon _{t-1}^{2}+\psi h_{t-1}^{G} \end{array} \right. , \quad t\in {\mathbb {Z}}, \end{aligned}$$

while the initial log-volatility parameter estimate \(\theta ^{\left( 0\right) }\) is taken to be the ordinary least-squares estimate of \(\theta \) based on the series \(\log \left( h^{\left( 0\right) }\right) \). Furthermore, in the Griddy Gibbs iteration, \(h_{t}\) is generated using 500 grid points and the range of \(h_{t}\) at the l-th Gibbs iteration is taken as in (3.12). Finally, the Gibbs sampler is run for 5500 iterations from which we discarded the first 500 iterations.
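The GARCH(1,1)-based initialization is a simple recursive filter. A sketch (Python; it assumes the fitted parameters \(\varphi _{0},\varphi _{1},\psi \) have already been obtained from any GARCH estimation routine) is:

```python
import numpy as np

def garch_volatility(eps, phi0, phi1, psi):
    """Filter the GARCH(1,1) volatility h_t^G used as h^{(0)} in the sampler."""
    h = np.empty(len(eps))
    h[0] = np.var(eps)          # start at the unconditional sample variance
    for t in range(1, len(eps)):
        h[t] = phi0 + phi1 * eps[t - 1] ** 2 + psi * h[t - 1]
    return h
```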

It can be observed that the parameters are quite well estimated by the two methods, with an obvious superiority of the Bayes estimate over the QMLE. Indeed, in all instances the BGG estimate (BGGE) dominates the QMLE in the sense that it has smaller bias and standard deviations. We also observe that the QMLE performs poorly when the variance parameters \(\sigma _{1}^{2}\) and \(\sigma _{2}^{2}\) are small.

From a theoretical point of view, it would be interesting to compare the QMLE and BGGE when \(\log \left( \eta _{1}^{2}\right) \sim N\left( 0,1\right) \), i.e. when \(\eta _{1}\sim \exp \left( X/2\right) \) with \(X\sim N\left( 0,1\right) \). In that case, as emphasized in Sect. 3, the QMLE reduces to the MLE and would be (asymptotically) more efficient than the BGGE. So through simulations, the QMLE would in principle perform better than the BGGE for PAR-SVs series with quite large sample sizes. However, the BGG method would have to be adapted to the distribution \(\eta _{1}\sim \exp \left( X/2\right) \), which may entail considerable effort for a distribution (\(\exp \left( N\left( 0,1\right) /2\right) \)) that seems to be of little interest in practice.

5 Application to the S&P 500 returns

For the sake of illustration, we propose to fit Gaussian PAR-SVs models (2.1) with various periods to the returns on the S&P 500 (closing value) index. In order to examine several possible values of the PAR-SVs period, three types of datasets are considered, namely daily, quarterly and monthly S&P 500 returns. For the three series considered, we use the Bayes Griddy Gibbs estimate, given its good finite-sample properties, with \(M=5000\) iterations and a burn-in of 500. As in Sect. 4, we take the initial volatility \(h^{\left( 0\right) }\) to be the volatility generated by the fitted GARCH\(\left( 1,1\right) \), while the initial log-volatility parameter estimate \(\theta ^{\left( 0\right) }\) is taken to be the ordinary least-squares estimate of \(\theta \) based on the series \(\log \left( h^{\left( 0\right) }\right) \). We have avoided using the volatility fitted by the periodic GARCH (PGARCH\(\left( 1,1\right) \)) model as the initial value \(h^{\left( 0\right) }\) because of numerical difficulties in the corresponding QML estimation when S becomes large (once \(S\ge 3\)). In the Gibbs step, the volatility \(h^{\left( l\right) }\) is drawn across PAR-SVs models using the Griddy–Gibbs technique with the same devices as in Sect. 4, i.e. 500 grid points and the range of \(h_{t}\) at the l-th Gibbs iteration taken as in (3.12). All procedures have been implemented on a personal computer using Matlab 2013. The BGG programs are available from the author upon request.

5.1 Daily S&P 500 returns: day-of-the-week effect

5.1.1 The data

The first dataset consists of the daily S&P 500 returns (in decimals) over the sample period from January 1, 2007 to December 31, 2012, a total of \(T=1509\) observations. The time series plots of the index (panel (a)) and its return (panel (b)) are presented in Fig. 1. The same data have also been considered by Chan and Grant (2014).

Fig. 1 Daily S&P 500 from January 2007 to December 2012. a Level, b return

Table 4 shows some descriptive statistics for the returns, the absolute returns, the squared returns and the log-absolute returns, where it may be seen that the data exhibit negative skewness, high kurtosis and low autocorrelation. Moreover, unreported sample autocorrelations at high lags show that the absolute and squared returns are characterized by high persistence, with clearly higher correlations for the absolute returns than for the squares. Finally, the log-absolute returns look much closer to Gaussian than do \(\left( \varepsilon _{t}\right) , \) \(\left( |\varepsilon _{t}|\right) \) and \(\left( \varepsilon _{t}^{2}\right) \). The same finding has been observed by Tsiakas (2006) for the S&P 500 returns, but over a different sample period.

Table 4 Some descriptive statistics for the daily S&P 500 returns

It is by now well documented (Bollerslev and Ghysels 1996; Franses and Paap 2000; Tsiakas 2006) that daily S&P 500 returns are characterized by the day-of-the-week effect, which suggests the presence of periodicity in volatility with period \(S=5\). While the sample period chosen here is different from those taken by e.g. Franses and Paap (2000) and Tsiakas (2006) for the same daily S&P 500 variable, it may be observed from Table 5 that the average return and the volatility (approximated by the absolute value) are somewhat different from one day to another. Of course, the significance of these differences could be studied more formally using, e.g., a bootstrap approximation of the distribution of the returns for each day, as done by Tsiakas (2006). However, this is beyond the scope of this application, which is intended for illustration purposes only.

Table 5 Day-of-the-week effect in the daily S&P 500 returns

5.1.2 The models and prior distributions

In order to identify the period of the best-fitting PAR-SVs model according to the DIC, we estimate six PAR-SVs models (2.1), one for each \(S\in \left\{ 1,\ldots ,6\right\} \). For \(S=5\), because of the presence of holidays, model (2.1), in which the parameters at time \(t=nS+v\) are \(\omega _{v}\) and \(\sigma _{v}\) (\(1\le v\le S\), \(n\in {\mathbb {Z}} \)), seems unsuitable. This is because under model (2.1) a given day of a week may have a different specification from the same day of the week before. So when \(S=5\) we also estimate the following variant of model (2.1) (henceforth PAR-SV\(_{5}^{*}\)):

$$\begin{aligned} \left\{ \begin{array}{l} \varepsilon _{t}=\sqrt{h_{t}}\eta _{t} \\ \log \left( h_{t}\right) =\alpha _{d\left( t\right) }+\beta _{d\left( t\right) }\log \left( h_{t-1}\right) +\sigma _{d\left( t\right) }e_{t} \end{array} \right. , \quad 1\le t\le T, \end{aligned}$$
(5.1)

in which d(t) is instead defined to be

$$\begin{aligned} d\left( t\right) =\left\{ \begin{array}{c} 1\quad \text { if the day corresponding to }\quad t\text { is a Monday} \\ 2\quad \text { if the day corresponding to }\quad t\text { is a Tuesday} \\ \vdots \\ 5\quad \text { if the day corresponding to }\quad t\text { is a Friday.} \end{array} \right. \end{aligned}$$

Such a specification with missing values (see e.g. Franses and Paap 2000; Regnard and Zakoïan 2011 in the periodic GARCH case) seems well adapted to capturing the day-of-the-week effect; the mapping may be computed as in the sketch below.
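In practice the mapping \(d\left( t\right) \) is read directly off the trading calendar; for instance, with pandas one may use the weekday of each observation date (a sketch, assuming dates is the DatetimeIndex of the return series):

```python
import pandas as pd

def day_of_week_regime(dates: pd.DatetimeIndex) -> pd.Series:
    """d(t) = 1 (Monday), ..., 5 (Friday), matching specification (5.1);
    holidays are simply absent from the trading calendar."""
    return pd.Series(dates.dayofweek + 1, index=dates, name="d")
```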

In calculating the BGG estimates across models, the prior distributions chosen for all the candidate PAR-SVs models are reported in Table 6. These priors are informative, but reasonably flat (cf. Figs. 2, 3). When \(S=1\), the prior distributions in Table 6 are similar to those proposed by Tsay (2010, Example 12.3) for his SV model. For the variant PAR-SV\(_{5}^{*}\) model (5.1), we use the same priors as for the PAR-SV\(_{5}\) model (2.1). Note that in Table 6, the diagonal matrix \(D_{k}\) (\(k=2,4,10\)) is defined by

$$\begin{aligned} D_{k}\left( i,j\right) =\left\{ \begin{array}{ll} 0&{}\quad \text { if }\quad i\ne j \\ 0.05&{}\quad \text { if }\quad i=j\text { is odd} \\ 0.5&{}\quad \text { if }\quad i=j\text { is even} \end{array} \right. ,\quad 1\le i,j\le k. \end{aligned}$$
(5.2)
Table 6 Prior distributions of \(\omega \) and \(\sigma ^{2}\) for the candidate PAR-SVs models, \(1\le S\le 6\) (\(D_{k}, 0_{k\times 1}\) and \(\mathbf {1}_{k}\) denote respectively the diagonal matrix given by (5.2), the null vector with k components and the k-vector with all components equal to 1)
Fig. 2 Prior (dashed line) and BGG posterior (solid line) distributions of parameters in the PAR-SV\(_{1}\) (aperiodic SV) model

Fig. 3 Prior (dashed line) and BGG posterior (solid line) distributions of parameters in the PAR-SV\(_{5}^{*}\) model

The standard GARCH(1, 1) specification fitted to the daily S&P 500 returns by the QML method is given by

$$\begin{aligned} h_{t}^{G}= \underset{\left( 0.00000081\right) }{0.00000314}+\underset{ \left( 0.0121\right) }{0.1022}\varepsilon _{t-1}^{2}+\underset{ \left( 0.0138\right) }{0.8822}h_{t-1}^{G}, \end{aligned}$$

with standard deviations of the estimates in parentheses. The resulting volatility \( h^{G}\) is used to initialize the volatility parameter \(h^{\left( 0\right) }\) in the Gibbs sampler for all estimated PAR-SVs (and PAR-SV\(_{5}^{*}\)) models.

5.1.3 Results

The estimated DICs across PAR-SVs models, their computation times (in minutes), their numerical standard errors (approximated by the standard deviations over G replications) and the monodromy parameters of all estimated models are reported in Table 7. In computing the standard errors of the DIC, we replicated the BGG procedure (Algorithm 3.1) \( G=500\) times.

From Table 7, some broad conclusions may be drawn. Firstly, the DICs corresponding to the PAR-SVs (\(1\le S\le 6\)) models given by (2.1) are very close to each other. So, in view of the standard errors of the DICs, which are reasonably small, it is difficult to distinguish between the corresponding PAR-SVs (\(1\le S\le 6\)) models despite the ranking reported in Table 7. On the other hand, the DIC favors the PAR-SV\(_{5}^{*}\) given by (5.1), whose value (\(-8881.8162\)) is markedly smaller than the others. Secondly, while the BGG method is relatively time-consuming for all PAR-SVs models, a fact that is well known for the single-move approach, the computation time is almost the same across PAR-SVs models in spite of the increasing number of parameters as S grows. So fitting a periodic PAR-SVs model entails no increase in computational cost compared to the non-periodic SV. Thirdly, the monodromy parameters \(\prod \nolimits _{v=1}^{S} \beta _{v}\) across models are quite large, which suggests strong persistence in volatility.

According to the DIC, the best model is the PAR-SV\(_{5}^{*}\) given by (5.1), whose parameter estimates, MCMC standard deviations (Std), NSE and RNI are reported in Table 9. As a benchmark, Table 8 reports the same information for the second-best model ranked by the DIC, which is the standard SV corresponding to PAR-SV\(_{1}\). For lack of space the remaining estimated models are not presented here, but they are available from the author. Further, prior and posterior distributions of the estimates for the PAR-SV\(_{1}\) and PAR-SV\(_{5}^{*}\) are plotted in Figs. 2 and 3, respectively.

Table 7 Estimated DIC, standard deviation, ranking, computation time (in minutes) and the monodromy (Monod.) estimate for the candidate PAR-SVs (\(1\le S\le 6\)) and PAR-SV\(_{5}^{*}\) models. Results concerning the selected model are displayed in bold
Table 8 BGG parameter estimates for the PAR-SV\(_{1}\) (standard SV)
Table 9 BGG parameter estimates for the PAR-SV\(_{5}^{*}\)

It may be seen from Tables 8 and 9 that the parameters appear quite well estimated, as shown by their low MCMC standard deviations, low RNI and small NSE. The latter clearly shows that even with the single-move approach, when a suitable choice of the range of h in the Griddy–Gibbs procedure is made, the MCMC estimates mix well. This is confirmed by the low autocorrelations of the estimates (cf. Fig. 4). Moreover, from Table 9 it can be observed that the parameters are quite different from one day to another, especially the \(\alpha _{v}\) and \(\beta _{v}\) (\(1\le v\le 5\)). On the other hand, the estimates are comparable with those of similar models in the literature when \(S=1\). The prior distributions used are, as pointed out above, relatively noninformative, while the posterior distributions are quite concentrated. In addition, from Fig. 5 the volatilities induced by the GARCH(1, 1) (dashed line) and PAR-SV\(_{5}^{*}\) (solid line) models have a similar pattern. Note finally that these results were quite stable to the use of different initial values, priors and numbers of iterations for the Gibbs sampler. However, the efficiency of the Gibbs sampler depends greatly on the choice of the range of h in the Griddy–Gibbs step.

Fig. 4 Sample autocorrelations of the PAR-SV\(_{5}^{*}\) parameter estimates

Fig. 5 Volatilities induced by the GARCH\(\left( 1,1\right) \), the SV and PAR-SV\(_{5}^{*}\)

5.2 Quarterly S&P 500 returns

The second dataset consists of the quarterly S&P 500 returns over the sample period from the first quarter (Q1) of 1871 to the fourth quarter (Q4) of 2012, a total of \(T=567\) observations. The quarterly index is computed as the average price per share over the month ending the quarter. The time series plots of the index series and its return are displayed in Fig. 6. The data are taken from Shiller (2015).

Fig. 6 Quarterly S&P 500 index: a level and b return

We estimated five PAR-SVs models (2.1), one for each \(S\in \left\{ 1,\ldots ,5\right\} \), using the same prior distributions as in Table 6 (for \(1\le v\le 5\)). The volatility estimated via the GARCH(1, 1) model, which is used to initialize the volatility in the Gibbs sampler, has the following specification

$$\begin{aligned} h_{t}^{G}=\underset{\left( 0.0035\right) }{0.0010}+\underset{\left( 0.0577\right) }{0.1792}\varepsilon _{t-1}^{2}+\underset{\left( 0.0674\right) }{0.6796}h_{t-1}^{G}, \end{aligned}$$

with standard errors of the estimates in parentheses. The estimated DICs across PAR-SVs models, their computation times (in minutes), their numerical standard errors and the corresponding monodromy parameters are reported in Table 10. The standard errors of the DIC are calculated as above.

Table 10 Estimated DIC, standard deviations, ranking, computation time (in minutes) and the monodromy (Monod.) estimates for the candidate PAR-SVs (\(1\le S\le 5\)) models. Results concerning the selected model are displayed in bold

From Table 10, the DIC selects the four-periodic PAR-SV\(_{4}\), with the smallest value \(\mathbf {-1211.9735}\). This value is not far from those of the remaining PAR-SVs models (\(S\ne 4\)) in view of their numerical standard errors. On the other hand, the corresponding computation times are quite comparable, while the monodromy parameters are smaller than in the daily return case. The parameters of the selected PAR-SV\(_{4}\) model, their MCMC standard deviations, their NSE and their RNI are listed in Table 11.

Table 11 BGG parameter estimates for the selected PAR-SV\(_{4}\) model

The same conclusions as in the daily return case may be drawn: the parameters are quite well estimated in view of their low standard deviations, low RNI and small NSE (cf. Table 11). Moreover, the posterior distributions are fairly concentrated (cf. Fig. 7). On the other hand, the parameters are quite different from one quarter to another, especially the \(\alpha _{v}\) and \(\beta _{v}\). However, overall, the estimates seem slightly less accurate than in the daily return case, perhaps owing to the smaller sample size. Finally, Fig. 8 plots the volatilities generated by the GARCH(1, 1) (panel (a)) and the PAR-SV\(_{4}\) (panel (b)), where it may be seen that they display a very similar pattern.

Fig. 7 Prior (dashed line) and posterior (solid line) distributions of parameters in the PAR-SV\(_{4}\) model

Fig. 8 Volatilities induced by the GARCH\(\left( 1,1\right) \) and PAR-SV\(_{4}\)

5.3 Monthly S&P 500 returns

The third dataset consists of the returns of the monthly S&P 500 index from January 1950 to January 2015, comprising 780 observations. The returns are computed using the first adjusted closing index of each month. Plots of the S&P 500 index and its return are given in Fig. 9. A similar monthly series, but over a different sample period, has been studied by Tsay (2010, Example 12.3) via an SV model.

Fig. 9 Monthly S&P 500 index: a level and b return

We estimated twelve PAR-SVs models (2.1) corresponding to each \( S\in \left\{ 1,\ldots ,12\right\} \) using the prior distributions presented in Table 12.

Table 12 Prior distributions of \(\omega \) and \(\sigma ^{2}\) for the candidate PAR-SVs models, \(1\le S\le 12\) (\(D_{k}\), \(0_{k\times 1}\) and \(\mathbf {1}_{k}\) denote respectively the diagonal matrix given by (5.2), the null vector with k components and the k-vector with all components equal to 1)
Table 13 Estimated DIC, standard deviations, ranking, computation time (in minutes) and the monodromy (Monod.) estimates for the candidate PAR-SVs (\(1\le S\le 12\)) models. Results concerning the selected model are displayed in bold
Table 14 BGG parameter estimates for the selected PAR-SV\(_{12}\) model
Fig. 10 Prior (dashed line) and posterior (solid line) distributions of parameters in the PAR-SV\(_{12}\) model

The volatility generated by the GARCH(1, 1) model, which is used to initialize the volatility in the Gibbs sampler across estimated PAR-SV\(_{S} \) models, is given by

$$\begin{aligned} h_{t}^{G}=\underset{\left( 0.00002\right) }{0.0001}+\underset{\left( 0.0245\right) }{0.1058}\varepsilon _{t-1}^{2}+\underset{\left( 0.0271\right) }{0.8502}h_{t-1}^{G}, \end{aligned}$$

with standard errors of the estimates in parentheses. The estimated DICs for the PAR-SVs models, their computation times (in minutes), their numerical standard errors and the corresponding monodromy parameters are reported in Table 13. The standard errors of the DIC are calculated using 500 replications of the BGG procedure.

According to the DIC, the best model is the 12-periodic PAR-SV\(_{12}\), with value \(-2686.6698\). However, the DICs in Table 13 are very close to each other, so in view of their standard errors it is difficult to discriminate between the corresponding models. On the other hand, as in the quarterly return case, the monodromy estimates are around one half, while the computation times are close to each other. The specification of the selected PAR-SV\(_{12}\) model is given in Table 14.

From Table 14 and Fig. 10, it may be concluded that the estimates are relatively good in spite of the small sample size relative to the large number of parameters to estimate. The MCMC Std, the RNI and the NSE are fairly low, while the posterior distributions are quite concentrated. Moreover, the parameters seem different from one month to another, especially the \(\alpha _{v}\) and \(\beta _{v}\). Finally, from Fig. 11, the volatilities induced by the GARCH(1, 1) (upper panel) and the 12-periodic PAR-SV\(_{12}\) (lower panel) have a similar behavior in both shape and magnitude.

Fig. 11 Volatilities induced by the GARCH\(\left( 1,1\right) \) and PAR-SV\(_{12}\)

6 Conclusion

In this paper we have proposed a stochastic volatility model whose log-volatility follows a periodic autoregression. This model may be seen as a flexible complement to the periodic GARCH process because it overcomes the limitation that the volatility is driven only by the past of the process. Moreover, the periodic time-variation of the parameters allows more flexible periodic volatility modelling than the time-invariant seasonal SV or the deterministic periodic SV. As we have seen, statistical inference for this model may easily be carried out using the Bayesian MCMC approach without additional computational cost compared to the standard SV case. While the PAR-SVs model can capture financial features such as periodicity in volatility, volatility clustering and excess kurtosis, it seems incapable of representing some other observed facts. In particular, the excess kurtosis implied by the PAR-SVs model (2.1) is limited in magnitude and is generally smaller than the kurtosis generated by PAR-SVs with heavy-tailed innovations \(\eta _{1}\), such as the Student-t distribution. So, various generalizations of the proposed PAR-SVs model to account for additional features like large excess kurtosis, leverage effect, change in volatility regime and simultaneous volatility dependence are needed and may constitute future research. In particular, PAR-SVs with heavy-tailed innovations such as a Student-t or Gaussian mixture distribution, Markov-switching PAR-SVs, PAR-SVs models with correlated error terms, and multivariate versions of the PAR-SVs are appealing. Finally, a multi-move MCMC approach for estimating PAR-SVs models would be of great interest.