1 Introduction

Time series prediction is an important and challenging task in finance. Whereas traditional time series analysis emphasizes modeling the conditional first moment, Engle [1] and Bollerslev [2] developed the generalized autoregressive conditional heteroscedasticity (GARCH) model to bring the dependency of the conditional second moments into the modeling. Since this accommodates the increasingly important demand to explain and model risk and uncertainty in financial time series, the GARCH model has been a main tool for volatility forecasting [3–8]. To further enhance the forecasting performance of GARCH, Perez-Cruz proposed the GARCH-SVM model and showed that forecasting volatility using the support vector machine (SVM) is not only feasible but also effective [9].

SVMs were originally used for classification, but their principles extend readily to regression and time series prediction [10–12]. The prediction performance of an SVM depends greatly on the kernel [13]. Many support vector kernels exist, such as the Gaussian and polynomial kernels, which map the data from the input space to a high-dimensional feature space in which the problem becomes linearly separable. Since wavelet functions can describe stock time series both at various locations and at varying time granularities [14–17], they should describe the clustering feature of volatility well. One of the basic methods for constructing wavelets uses spline functions, which are probably the simplest functions with small supports [18, 19]. It is therefore worth investigating whether a desirable performance can be achieved by combining SVM with spline wavelet theory. In this paper, we construct a novel spline wavelet kernel for SVM to forecast volatility.

The objective of this paper is to evaluate the performance of the spline wavelet kernel in a spline wavelet support vector machine (SWSVM) for volatility prediction under the GARCH model, by comparing it with the Gaussian kernel in SVM. This paper is organized as follows: Sect. 2 provides a brief introduction to the multi-resolution theory of wavelets. Section 3 describes how to construct the spline wavelet kernel and proves that it is an admissible support vector kernel. Section 4 discusses the experimental results on simulated and real data sets, followed by conclusions in the last section.

2 Multi-resolution theory

Multi-resolution analysis is the decomposition of the Hilbert space \( L_{2}(R) \) into a nested sequence of closed subspaces \( \{V_{j}\}_{j \in Z} \), which satisfy the following relations [15]:

  1. \( \cdots \subset V_{2} \subset V_{1} \subset V_{0} \subset V_{-1} \subset V_{-2} \subset \cdots \)

  2. \( {\rm {close}}\left\{ \bigcup\limits_{j \in Z} V_{j} \right\} = L_{2}(R), \quad \bigcap\limits_{j \in Z} V_{j} = \{0\} \)

  3. \( \forall f \in L_{2}(R),\ \forall j \in Z: \quad f(x) \in V_{j} \Leftrightarrow f(2x) \in V_{j - 1} \)

  4. \( \forall f \in L_{2}(R),\ \forall k \in Z: \quad f(x) \in V_{0} \Leftrightarrow f(x - k) \in V_{0} \)

  5. There exists \( \varphi \in V_{0} \) such that \( \{\varphi(x - k)\}_{k \in Z} \) is a Riesz basis of \( V_{0} \).

By dilations and translations of the scaling function φ(x), \( \{\varphi_{j,k}(x) = \varphi(2^{-j} x - k)\}_{k \in Z} \) is a Riesz basis of \( V_{j} \). Let \( W_{j} \) denote the complement of the subspace \( V_{j} \) in the space \( V_{j-1} \), so that \( V_{j-1} = V_{j} \oplus W_{j} \), \( j \in Z \). Similarly, \( \{\psi_{j,k}(x) = \psi(2^{-j} x - k)\}_{k \in Z} \), obtained by dilations and translations of the wavelet function ψ(x), is a Riesz basis of \( W_{j} \).

3 Spline wavelet kernel and SWSVM

Theorem 1: Let x ∈ R. The scaling function \( \varphi^{(N)}(x) \) is the B-spline of order (N − 1) with compact support on the interval [0, N] over the integer division \( \{x_{k} = k,\ k \in Z\} \), and the following recursion formula is valid [20]:

$$ \begin{aligned}\varphi_{k}^{(N)} (x) & = \frac{x-k}{N - 1}\varphi_{k}^{(N - 1)} (x) + \frac{k + N - x}{N - 1}\varphi_{k + 1}^{(N - 1)} (x),\quad N = 2,3, \ldots, \\ \varphi_{k}^{(1)} (x) & = A_{[k,k + 1)} (x) = \left\{\begin{gathered} 1, \quad k \le x < k + 1 \hfill \\ 0,\quad {\rm {otherwise}}, \hfill \\ \end{gathered} \right. \\ \end{aligned} $$
(1)

where \( \varphi_{k}^{(N)}(x) = \varphi^{(N)}(x - k) \).
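As an illustration, the following Python sketch evaluates \( \varphi_{k}^{(N)}(x) \) directly from the recursion (1); the function name and the vectorization are our own choices, not part of the original paper.

```python
import numpy as np

def bspline(x, N, k=0):
    """Cardinal B-spline phi_k^{(N)}(x) of order N-1 on integer knots,
    evaluated via the recursion in Eq. (1)."""
    x = np.asarray(x, dtype=float)
    if N == 1:
        # phi_k^{(1)} is the indicator of [k, k+1)
        return np.where((x >= k) & (x < k + 1), 1.0, 0.0)
    return ((x - k) * bspline(x, N - 1, k)
            + (k + N - x) * bspline(x, N - 1, k + 1)) / (N - 1)
```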

Theorem 2: The compactly supported spline wavelet \( \psi^{(N)}(x) \) can be expressed by the wavelet equation \( \psi^{(N)} (x) = \sum_{k} {d(k)} \varphi^{(N)} (2x - k), \) where the scaling function is the B-spline \( \varphi^{(N)}(x) \) given in (1), and the coefficients d(k) are equal to [19]

$$ d(k) = \frac{(-1)^{k}}{2^{N - 1}} \sum\limits_{l = 0}^{N} \binom{N}{l} \varphi^{(2N)} (k - l + 1), \quad k = 0, \ldots, 3N - 2 $$
(2)
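Continuing the sketch above (and reusing `bspline`), the coefficients (2) and the wavelet itself might be computed as follows; the helper names are ours.

```python
from math import comb

import numpy as np

def wavelet_coeffs(N):
    """Coefficients d(k), k = 0, ..., 3N-2, from Eq. (2)."""
    return np.array([(-1) ** k / 2 ** (N - 1)
                     * sum(comb(N, l) * float(bspline(k - l + 1, 2 * N))
                           for l in range(N + 1))
                     for k in range(3 * N - 1)])

def spline_wavelet(x, N, d=None):
    """psi^{(N)}(x) = sum_k d(k) * phi^{(N)}(2x - k), the wavelet
    equation of Theorem 2."""
    d = wavelet_coeffs(N) if d is None else d
    return sum(dk * bspline(2 * np.asarray(x, dtype=float) - k, N)
               for k, dk in enumerate(d))
```

For N = 2 this reproduces the familiar piecewise-linear spline wavelet with coefficients (1/12)[1, −6, 10, −6, 1].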

Theorem 3: Let \( \psi^{(N)}(x) \) be a mother wavelet, and let j and k denote the dilation and translation, respectively. All calculations are done on integer divisions, with the finest resolution-level division \( \{x_{k} = k,\ k = 0, \ldots, 2^{m}\} \). If \( s, t \in R^{M} \), then the dot-product wavelet kernel is:

$$ K(s,t) = \prod\limits_{i = 1}^{M} \sum\limits_{j = 1}^{J} \sum\limits_{k = 1 - N}^{L_{j} - N} \psi_{j,k}^{(N)}(s_{i})\, \psi_{j,k}^{(N)}(t_{i}), \quad L_{j} = 2^{m - j} $$
(3)

Proof: Let \( x^{1}, \ldots, x^{l} \in R^{M} \) and \( a_{1}, \ldots, a_{l} \in R \). Then

$$ \begin{aligned} \sum\limits_{p,q = 1}^{l} a_{p} a_{q} K(x^{p}, x^{q}) & = \sum\limits_{p,q = 1}^{l} a_{p} a_{q} \prod\limits_{i = 1}^{M} \sum\limits_{j = 1}^{J} \sum\limits_{k = 1 - N}^{L_{j} - N} \psi_{j,k}^{(N)}(x_{i}^{p})\, \psi_{j,k}^{(N)}(x_{i}^{q}) \\ & = \left( \sum\limits_{p = 1}^{l} a_{p} \prod\limits_{i = 1}^{M} \sum\limits_{j = 1}^{J} \sum\limits_{k = 1 - N}^{L_{j} - N} \psi_{j,k}^{(N)}(x_{i}^{p}) \right) \left( \sum\limits_{q = 1}^{l} a_{q} \prod\limits_{i = 1}^{M} \sum\limits_{j = 1}^{J} \sum\limits_{k = 1 - N}^{L_{j} - N} \psi_{j,k}^{(N)}(x_{i}^{q}) \right) \\ & = \left( \sum\limits_{p = 1}^{l} a_{p} \prod\limits_{i = 1}^{M} \sum\limits_{j = 1}^{J} \sum\limits_{k = 1 - N}^{L_{j} - N} \psi_{j,k}^{(N)}(x_{i}^{p}) \right)^{2} \ge 0 \end{aligned} $$

Hence, the dot-product kernel satisfies Mercer's condition, and it is therefore an admissible support vector kernel.

The Gaussian kernel requires computing an l × l kernel matrix, at cost O(l²). The main additional cost of the spline wavelet kernel lies in the construction stage, where the multi-resolution frame is built. Hence, if h frame elements describe the frame-based kernel, the computational complexity is O(h² × l²).
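Putting the pieces together, a minimal sketch of the kernel (3) could look as follows. It reuses `bspline`, `wavelet_coeffs` and `spline_wavelet` from the sketches above and assumes the dilation convention \( \psi_{j,k}^{(N)}(x) = \psi^{(N)}(2^{-j}x - k) \), which is consistent with \( L_{j} = 2^{m-j} \) but is our reading rather than the authors' stated implementation; the defaults for N, J and m are likewise illustrative.

```python
import numpy as np

def spline_wavelet_kernel(s, t, N=2, J=3, m=5):
    """Dot-product wavelet kernel K(s, t) of Eq. (3).

    Assumes psi_{j,k}(x) = psi(2^{-j} x - k), with translates
    k = 1-N, ..., L_j - N and L_j = 2^{m-j} at level j.
    """
    s, t = np.atleast_1d(s), np.atleast_1d(t)
    d = wavelet_coeffs(N)  # compute once, reuse at every level
    K = 1.0
    for s_i, t_i in zip(s, t):
        level_sum = 0.0
        for j in range(1, J + 1):
            for k in range(1 - N, 2 ** (m - j) - N + 1):
                level_sum += (float(spline_wavelet(2.0 ** -j * s_i - k, N, d))
                              * float(spline_wavelet(2.0 ** -j * t_i - k, N, d)))
        K *= level_sum
    return K
```

Such a callable can then be used to fill an l × l Gram matrix for any SVR implementation that accepts a custom or precomputed kernel (e.g. scikit-learn's `SVR(kernel='precomputed')`).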

4 Experimental analysis

Two simulated data sets are examined in the first series of experiments. Each data set consists of 1,560 samples generated by a GARCH(1,1) model:

$$ y_{t} = \mu + \sigma_{t} \varepsilon_{t} $$
(4)
$$ \sigma_{t}^{2} = \omega + \alpha y_{t - 1}^{2} + \beta \sigma_{t - 1}^{2} $$
(5)

where \( y_{t} \) is the daily return and \( \varepsilon_{t} \) is the innovation, an uncorrelated process with zero mean and unit variance. For the sake of simplicity, the mean μ of a financial return series is often neglected. In addition, the parameters ω, α and β must satisfy ω > 0 and α, β ≥ 0 to ensure that the conditional variance \( \sigma_{t}^{2} \) is positive. The experimental setup is as follows: μ = 0, ω = 0.1, α = 0.4 and β = 0.5, with the disturbance term \( \varepsilon_{t} \) distributed first as a Gaussian and then as a Student's t with four degrees of freedom. The second distribution models the excess kurtosis that appears in real financial series. The two resulting series are referred to as Data-1 and Data-2.
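As a hedged illustration (not the authors' code), the two series could be simulated under these settings as follows; the initialization at the unconditional variance, the rescaling of the t innovations to unit variance, and the seed are our assumptions.

```python
import numpy as np

def simulate_garch(n=1560, omega=0.1, alpha=0.4, beta=0.5, mu=0.0,
                   dist="gaussian", df=4, seed=0):
    """Simulate returns y_t from the GARCH(1,1) of Eqs. (4)-(5)."""
    rng = np.random.default_rng(seed)
    y = np.empty(n)
    sigma2 = omega / (1 - alpha - beta)  # start at unconditional variance
    for t in range(n):
        if dist == "gaussian":
            eps = rng.standard_normal()
        else:  # Student's t, rescaled to unit variance (df > 2)
            eps = rng.standard_t(df) * np.sqrt((df - 2) / df)
        y[t] = mu + np.sqrt(sigma2) * eps
        sigma2 = omega + alpha * (y[t] - mu) ** 2 + beta * sigma2
    return y

data1 = simulate_garch(dist="gaussian")  # Data-1
data2 = simulate_garch(dist="student")   # Data-2
```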

The real data examined in the experiment comprise the following daily indices: DAXINDX, FRCAC40, FTSE100, JAPDOWA and SPCOMP. These stock market index levels \( P_{t} \) are transformed into daily returns \( y_{t} \) as 100 times their log differences:

$$ y_{t} = 100 \ln \left( P_{t} / P_{t - 1} \right) $$
(6)

All the index data cover the period from 1 January 1992 to 31 December 1997, giving 1,560 observations for each time series of daily returns. Each whole data set is divided into several overlapping training and test sets according to the walk-forward testing routine [21] (see the sketch below). Each training and test set pair is moved forward through the time series by 130 observations; there are 520 observations in each training set and 520 in each test set. The optimal values of C, ε and γ for the SVM with the Gaussian kernel are chosen by fivefold cross-validation; the same method is used for the SWSVM to choose C, ε and the dilation j. All parameters are shown in Table 1. The results are collated, and the best results, obtained from the first sets (January 1992–January 1996), are recorded below.
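A minimal sketch of this walk-forward scheme, including the returns transform of Eq. (6), might read as follows; the function names and the resulting number of splits are our own illustration.

```python
import numpy as np

def to_returns(p):
    """Eq. (6): index levels P_t -> daily returns y_t."""
    return 100.0 * np.diff(np.log(p))

def walk_forward_splits(n, train_len=520, test_len=520, step=130):
    """Yield overlapping (train, test) index arrays, moved forward
    through the series by `step` observations each time."""
    start = 0
    while start + train_len + test_len <= n:
        yield (np.arange(start, start + train_len),
               np.arange(start + train_len, start + train_len + test_len))
        start += step

# With n = 1,560 this yields five overlapping train/test pairs.
for train_idx, test_idx in walk_forward_splits(1560):
    pass  # fit the SVM/SWSVM on train_idx, evaluate on test_idx
```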

Table 1 Parameters of the models

The prediction performance is evaluated using the following statistical metrics: normalized mean squared error (NMSE), normalized mean absolute error (NMAE) and the hit rate (HR). These metrics are calculated as follows:

$$ {\rm{NMSE}} = \sqrt{ \frac{\sum\nolimits_{t = 1}^{N} \left( \hat{\sigma}_{t}^{2} - y_{t}^{2} \right)^{2}}{\sum\nolimits_{t = 1}^{N} \left( y_{t - 1}^{2} - y_{t}^{2} \right)^{2}} } $$
(7)
$$ {\rm{NMAE}} = \frac{\sum\nolimits_{t = 1}^{N} \left| \hat{\sigma}_{t}^{2} - y_{t}^{2} \right|}{\sum\nolimits_{t = 1}^{N} \left| y_{t - 1}^{2} - y_{t}^{2} \right|} $$
(8)
$$ {\rm{HR}} = \frac{1}{N} \sum\limits_{t = 1}^{N} q_{t}, \quad q_{t} = \begin{cases} 1, & \left( \hat{\sigma}_{t}^{2} - y_{t - 1}^{2} \right) \left( y_{t}^{2} - y_{t - 1}^{2} \right) \ge 0 \\ 0, & {\rm{otherwise}} \end{cases} $$
(9)

where N represents the total number of data points in the test set, \( \hat{\sigma}_{t}^{2} \) denotes the predicted conditional variance, and \( y_{t} \) denotes the actual return. The NMSE relates the mean square error of the volatility \( \hat{\sigma}_{t}^{2} \) predicted by the SVM to the mean square error of the naive model \( \hat{\sigma}_{t}^{2} = y_{t - 1}^{2} \). The NMAE is more robust against outliers than the NMSE. Both measure the deviation between the actual and predicted values: the smaller they are, the closer the predicted series is to the actual one. Conversely, HR measures how often the model predicts the correct direction of change of volatility, so the larger its value, the better the prediction performance.
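The three metrics could be computed as in the following sketch; the alignment convention (scoring starts at t = 1 so that the naive forecast \( y_{t-1}^{2} \) is always available) is our assumption.

```python
import numpy as np

def volatility_metrics(sigma2_hat, y):
    """NMSE, NMAE and hit rate of Eqs. (7)-(9).

    sigma2_hat[t] predicts y[t]**2; the naive model predicts y[t-1]**2.
    """
    y2, naive = y[1:] ** 2, y[:-1] ** 2
    s2 = sigma2_hat[1:]
    nmse = np.sqrt(np.sum((s2 - y2) ** 2) / np.sum((naive - y2) ** 2))
    nmae = np.sum(np.abs(s2 - y2)) / np.sum(np.abs(naive - y2))
    hr = np.mean((s2 - naive) * (y2 - naive) >= 0)  # direction-of-change hits
    return nmse, nmae, hr
```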

The smaller the cross-validation error, the stronger the generalization ability of the model. After obtaining the parameters by cross-validation to avoid over-fitting, we first inspect the prediction results on the training set, which supplement the prediction results on the test set.

The results on the training set are listed in Table 2. It can be observed that for all the daily indices, the spline wavelet kernel yields the smaller NMSE values. It also yields the smaller NMAE values, with the exception of FTSE100 and JAPDOWA. As for HR, only on Data-1 and DAXINDX does the Gaussian kernel yield the larger values. A paired t-test [22] is performed to determine whether there is a significant difference between the two kernels based on the NMSE of the training set. The calculated t-value indicates that the spline wavelet kernel outperforms the Gaussian kernel at the 5% significance level in a one-tailed test.

Table 2 Results on the training set

The results on the test set in Table 3 provide a better basis for comparing the two kernels, since over-fitting issues can be neglected there. As expected, the results on the test set are worse than those on the training set in terms of NMSE, NMAE and HR, but a similar conclusion can still be drawn. The table shows that apart from Data-1, the smaller NMSE values are found with the spline wavelet kernel, and the smaller NMAE values are found with the spline wavelet kernel except for DAXINDX and FTSE100. The larger HR values all occur with the spline wavelet kernel. A paired t-test on the NMSE of the test set also shows that the spline wavelet kernel outperforms the Gaussian kernel at the 5% significance level in a one-tailed test.

Table 3 Results on the test set

The squared observations \( y_{t}^{2} \) and the predicted values \( \hat{\sigma}_{t}^{2} \) from both kernels on the test sets are illustrated in Figs. 1 and 2. Only the FRCAC40 is drawn, since its NMSE and NMAE values are representative of all the daily indices. In this investigation, all of the parameters are given in Table 1. It is clear that both the Gaussian and the spline wavelet kernels are able to capture the features reflected by the naive model. The predictions made by the two kernels are very similar according to Figs. 1 and 2, although, as Table 3 shows, the performance of the spline wavelet kernel is better than that of the Gaussian kernel.

Fig. 1 Squared observations and forecasted volatility by the Gaussian kernel

Fig. 2 Squared observations and forecasted volatility by the spline wavelet kernel

All experiments are run on a 1.6 GHz Intel processor with 2 GB of main memory under Windows XP Professional. The training time of the SVM is 3.5688 CPU-seconds, while that of the SWSVM is 3.1182 × 10³ CPU-seconds. The SWSVM is slower than the SVM because the better prediction performance gained by constructing the wavelet kernel on a multi-scale frame comes at the cost of additional computation time.

5 Conclusion and discussion

This paper presents an effective spline wavelet kernel for volatility forecasting, constructed by combining spline theory and wavelet methods with SVM. The admissibility of the spline wavelet kernel is proven first, and the forecasting performance is then evaluated on two simulated data sets and five real daily indices. As demonstrated in the experiments, the spline wavelet kernel forecasts significantly better than the Gaussian kernel. Its superior performance mostly lies in the fact that spline wavelets form a set of bases that can approximate arbitrary functions. Future work will involve a theoretical analysis of the multi-scale frame on splines. More sophisticated spline wavelet kernels that can closely follow volatility clustering will be explored to further improve the performance of the SWSVM in volatility forecasting.