Deep Nonlinear Ensemble Framework for Stock Index Forecasting and Uncertainty Analysis

Wang, Jujie; Feng, Liu; Li, Yang; He, Junjie; Feng, Chunchen

doi:10.1007/s12559-021-09961-3

Published: 13 November 2021

Volume 13, pages 1574–1592, (2021)

Cognitive Computation

Abstract

Stock index forecasting plays an important role in avoiding risk and increasing returns for financial regulators and investors. However, due to the volatility and uncertainty of the stock market, forecasting stock indices accurately is challenging. In this paper, a deep nonlinear ensemble framework is proposed for stock index forecasting and uncertainty analysis. (1) Singular spectrum analysis (SSA) is utilized to extract features from a raw stock index and eliminate noise interference. (2) An enhanced weighted support vector machine (EWSVM) is proposed for forecasting each decomposed component; its penalty weights are based on the time order of the samples, and its hyperparameters are optimized using the simulated annealing algorithm. (3) A recurrent neural network (RNN) is used to integrate the forecasts of the components into the final point forecast. (4) Gaussian process regression (GPR) is applied to obtain the interval forecast of the original stock index. Two practical cases (Nikkei 225 Index, Japan and Hang Seng Index, Hong Kong, China) are utilized to evaluate the performance of the proposed model. In terms of point forecasting, the MAE, ${R}^{2}$, MAPE, and RMSE for the Nikkei 225 Index are 66.0745, 0.9972, 0.0066, and 80.0381, and those for the Hang Seng Index are 79.2145, 0.9968, 0.0073, and 96.7740. In terms of interval forecasting, the ${CP}_{95\%}$, ${MWP}_{95\%}$, and ${MC}_{95\%}$ for the Nikkei 225 Index are 0.89979, 0.05746, and 0.06385, and those for the Hang Seng Index are 0.97985, 0.28223, and 0.28803. Forecasting stock indices accurately is crucial for investment decisions and risk management and is highly valuable to investors and financial regulators. In this paper, the SSA-EWSVM-RNN-GPR model is used to forecast the closing prices of stock indices; compared with eight benchmark models, it is shown to be an effective tool for both point and interval forecasting of stock indices.

Introduction

At present, stock investment has become a popular way for people to invest and manage money. However, stock index prices are unstable, nonlinear, and easily disturbed by external factors, and they have a low signal-to-noise ratio [1], which makes the stock index trend very difficult to predict. Accurate stock index prediction can not only help investors gain insight into the fluctuation characteristics of the stock index and make sound decisions to obtain returns, but also help government departments supervise and guide the market in a timely and effective manner, so as to avoid financial risks [2]. Therefore, establishing an effective stock index prediction framework is a necessary and challenging research topic.

Although current research on stock index prediction has made some progress, several problems remain to be solved. In the existing literature, machine learning methods overcome the limitations of fundamental analysis and statistical methods in stock index prediction and thereby achieve more accurate prediction results. Among them, the support vector machine (SVM) has been applied to stock index prediction by many scholars and has achieved good prediction performance. However, the standard SVM does not consider the time characteristics of stock index data, and the optimization of its kernel parameters and penalty parameters also deserves further research. In addition, previous stock index prediction studies mainly focused on point prediction. In most cases, point prediction cannot effectively address the uncertainty in stock index data, whereas interval prediction can quantify the changes in prediction results caused by uncertain factors. Therefore, interval prediction contains more information than point prediction.

In view of the above considerations, a deep nonlinear ensemble framework, SSA-EWSVM-RNN-GPR, is designed for point and interval prediction of stock indices. First, the framework applies the data decomposition technique SSA to preprocess the original closing price, which can effectively reduce the interference of noise and improve the performance of the framework. Then, an enhanced weighted support vector machine (EWSVM) model is proposed, and its hyperparameters are optimized by the simulated annealing algorithm (SAA) to produce a predictor with good generalization ability. The EWSVM model optimized by SAA is used to predict the n components obtained from data preprocessing. Next, a deep nonlinear ensemble paradigm based on RNN is developed to integrate the predictions of all components and other indicators (opening price, high price, low price, and change) into the final point prediction. Finally, GPR is utilized to realize the confidence interval prediction based on the previous point prediction. Systematic and comprehensive experimental simulation and analysis show that the prediction framework is an effective prediction tool.

The main innovations of this paper are summarized as follows:

(a) Instead of modelling the original data directly, this study uses a novel SSA to decompose the original stock index signal into several components for feature extraction and denoising. Additionally, a signal-decomposition-based hybrid model is built to improve the performance of stock index forecasting.
(b) Considering the shortcomings of the standard SVM model, this study proposes a novel EWSVM model for forecasting each component that was decomposed via SSA, for which the penalty weights are based on the time order and the hyperparameters are optimized via SAA.
(c) To reconstruct the complicated nonlinear relationships between components that were decomposed by SSA, this study uses RNN to integrate the forecasting result of each component nonlinearly.
(d) A two-stage mixed learning strategy is applied to capture the complicated features of stock index changes, and it utilizes EWSVM to forecast each component that was decomposed by SSA and, subsequently, applies RNN to nonlinearly integrate these forecasting results into the final point forecast.
(e) To provide more uncertainty information for investors and regulators, this study uses the GPR model to identify the confidence interval for forecasting based on the previous point forecasting result.
(f) A novel deep nonlinear ensemble model for stock index forecasting and uncertainty analysis, namely, SSA-EWSVM-RNN-GPR, is proposed for obtaining precise point forecasting and reliable interval forecasting results, and it has been shown to outperform various benchmark models.

The remainder of this paper is organized as follows: “Literature Review” provides a detailed literature review and analysis, and draws some conclusions. “Related Methodology” describes the related basic methods that are utilized in the developed models. Subsequently, the developed models are presented in “Proposed Model”. “Case Study” presents the experimental results and analysis, and the conclusions of this study are summarized in the final section.

Literature Review

Many methods have been proposed for stock market forecasting, and these methods can be divided into three main categories: fundamental analysis, statistical methods, and machine learning methods [3]. Fundamental analysis forecasts the trend of a stock price by analyzing the operating conditions, financial conditions, and macroeconomic policies of a company. Wafi et al. [4] analyzed the strengths and disadvantages of several fundamental analysis models and concluded that the residual income model (RIM) was the superior stock market forecasting model. However, fundamental analysis focuses on the long-term trends of the stock market; hence, it is unable to capture the recent trends of the stock market.

Statistical methods regard a stock index as a time series and utilize statistical models to identify hidden properties of the sequence and realize satisfactory forecasting performance [5]. Auto-regressive moving average (ARMA) and generalized auto-regressive conditional heteroskedasticity (GARCH) are classical statistical forecasting models [6,7,8,9,10]. Shi et al. [8] built a hybrid model, which includes ARMA, for forecasting the linear component of the stock price. Wei et al. [9] utilized a new GARCH-class model to study the long-term volatility of China’s stock market. Although statistical methods such as ARMA and GARCH realize relatively satisfactory performances in forecasting, they are weak in terms of nonlinear processing capability; namely, they cannot capture nonlinear characteristics that are hidden in the sequence.

To overcome the problem that is discussed above, machine learning methods, which include shallow learning and deep learning methods, are proposed, and they possess excellent nonlinear mapping capacity [11, 12]. Shallow learning methods include artificial neural network (ANN) and support vector machine (SVM) [13,14,15,16,17,18]. Wang [14] designed an improved ANN model for forecasting stock prices in the Turkish stock market, which mitigated the well-known problems of overfitting and underfitting. Grigoryan [15] applied the SVM technique to the forecasting of non-stationary stock price series satisfactorily. Xiao et al. [18] applied SVM to the forecasting of stock prices and demonstrated that SVM was a promising approach for stock price forecasting. Nevertheless, the standard SVM method assigns the same penalty weight ($C$) to all training samples that exceed the specified error ($\varepsilon$). The standard SVM method is suitable for series without time characteristics; however, a stock index is a featured time series; namely, the significance of the data and the influence on the method may not be the same across periods. Weighted support vector machine (WSVM) considers the timing of data and assigns different penalty weights to samples in different periods. Typically, the weight of a near-term sample is larger than that of a long-term sample [19,20,21]. Therefore, WSVM can obtain a more accurate forecasting result than SVM. Chen et al. [20] utilized WSVM with the Shanghai stock market and the Shenzhen stock market to forecast the turning signals of the stock and proved that the model can be applied in practice effectively. In addition, various methods are used to complete the difficult parameter optimization of SVM. Fayed et al. [22] used a sped-up exhaustive grid-search method to optimize the kernel parameters and the penalty parameter of SVM. Bergstra et al. [23] concluded that random search outperformed grid search and manual search within a small fraction of their computation times. Nevertheless, grid search and random search have drawbacks due to their low search efficiencies and long search times. To overcome these drawbacks, SAA was proposed for optimizing the hyperparameters of machine learning methods [24, 25]. SAA has the characteristics of strong robustness, few initial constraints, and excellent global optimization performance. It has achieved remarkable results in solving complex nonlinear optimization problems. Dias et al. [24] applied SAA to solve the dual quadratic optimization problem of SVM. The results show that SAA can effectively improve the performance of the SVM model. Superior to shallow learning methods, various deep learning methods, such as RNN, can mine the deep intrinsic characteristics from complicated stock index data and have been utilized widely over the years [26, 27]. Qiu et al. [27] proposed a hybrid RNN model for stock trend prediction; they concluded that the developed model was highly promising and could be considered a feasible and effective stock market timing tool.

Recently, numerous signal decomposition methods have been used to strengthen the performances of mainstream forecasting models. These signal decomposition methods are often combined with forecasting methods for the construction of a hybrid model. Signal decomposition can smooth the series, extract the time and frequency features of the series, denoise the original series, and decompose the series into various components. For instance, Wang et al. [28] proposed a new approach for forecasting stock prices via the wavelet decomposition (WD)–based backpropagation network (BP), which was proven to be superior to the single BP model. Wei [29] combined empirical mode decomposition (EMD) with ANN to improve the performance of stock price forecasting. However, WD and EMD both have disadvantages: (1) the decomposition performance of WD depends strongly upon the choices of the decomposition levels and the wavelet basis; (2) the performance of WD is easily influenced by the white noise in the original data; and (3) EMD lacks a strict mathematical foundation and physical meaning. To address these problems, SSA was proposed by scholars for realizing superior signal decomposition and data denoising [30,31,32,33,34]. Wen et al. [34] used SSA to decompose a stock price into the trend, the market fluctuation, and the noise with various economic features, and they introduced these components into SVM for forecasting.

After the forecasting of each component that was decomposed via signal decomposition methods, the signal integration method is applied and is the most significant part of the signal decomposition and integration-based hybrid forecasting model. The simple integration method refers to the linear accumulation of components’ forecasting results while ignoring the complex nonlinear relationships between the components. In addition, due to the excellent nonlinear mapping performance, machine learning methods, especially deep learning methods, are applied to integrate the forecasts of the components. For instance, Wang et al. [35] utilized SSA to decompose the original price series into several components, forecast each component, and, finally, integrated them together using RNN, which resulted in error reductions and improvements compared to the single models.

Currently, most of the improvement efforts for stock index forecasting methods focus on increasing the accuracy of point forecasting, while less work has been conducted on realizing reliable interval forecasting of stock indices. Although accurate point forecasting is important for investors and financial regulators, reliable interval forecasting with consideration of uncertainty information is also beneficial for controlling and avoiding unnecessary risk in the stock market. Therefore, researchers have applied statistical learning methods such as GPR to conduct uncertainty analyses of time series forecasting [36,37,38,39,40]. For instance, Fang et al. [38] utilized GPR for probabilistic forecasting of carbon dioxide emission and generated satisfactory forecasting intervals. Zhang et al. [40] presented a new wind speed forecasting model that is based upon a shared-weight long short-term memory network (SWLSTM) and GPR, which is used to realize high-precision point forecasting and reliable interval forecasting, and the SWLSTM-GPR model was proven to perform effectively. However, GPR is rarely used in financial forecasting studies.

Through the review and analysis of the above literature, we can draw the following conclusions: (1) Fundamental analysis models have limitations in capturing the recent fluctuation trend of a stock index, and statistical methods cannot effectively deal with the nonlinear patterns of stock index data. These deficiencies may lead to poor prediction performance in practice. (2) Stock index data are time series data. Therefore, it is necessary to design a more advanced SVM method that builds on previous work, and to use the SAA method, which has wide applicability and strong global optimization performance, to optimize the parameters of the model. This is conducive to developing an optimal weighted support vector machine model with good generalization ability and stability. (3) In current prediction models, selecting a practical and effective data preprocessing technique to improve the prediction performance is one of the key points of a hybrid model. A good decomposition algorithm can effectively remove the noise in the original data and significantly improve the prediction results. In addition, under the decomposition and integration framework, deep learning methods show excellent performance in integration, and the RNN model has performed well in past ensemble research. (4) Research on interval prediction of stock indices is of great significance to the stock market, because interval prediction can analyze the uncertainty of a stock index and quantify the changes in prediction results caused by uncertain factors, so as to provide more useful reference information for stock market participants. However, most existing studies focus only on point prediction of stock indices and ignore the significance of interval prediction.

According to the review and analysis of the existing literature, we can reasonably infer that research on the uncertainty analysis of stock indices is very insufficient. Therefore, developing an innovative stock index prediction framework for both point prediction and interval prediction is of great significance to the stock market.

Related Methodology

In this section, five related basic methods in the developed SSA-EWSVM-RNN-GPR model are described.

Singular Spectrum Analysis

As a novel signal decomposition method, SSA can be used for feature extraction and denoising of an original time series signal. This study uses SSA to decompose the original stock index signal into several components for feature extraction and denoising. The following is the detailed procedure of SSA:

Step 1: Embedding: the analysis object of SSA is a centralized one-dimensional series $\left[{x}_{1},{x}_{2},\cdots ,{x}_{N}\right]$, where $N$ denotes the length of the series. The trajectory matrix $X$ can be represented as follows:

$$X=\left[\begin{array}{cccc}{x}_{1}& {x}_{2}& \cdots & {x}_{N-J+1}\\ {x}_{2}& {x}_{3}& \cdots & {x}_{N-J+2}\\ \vdots & \vdots & \ddots & \vdots \\ {x}_{J}& {x}_{J+1}& \cdots & {x}_{N}\end{array}\right]$$

(1)

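where $J$ denotes the window length (the number of rows of $X$), an integer that determines the number of components (one main trend component and other detail components) into which the series is decomposed.

Step 2: Singular value decomposition (SVD): the decomposition of the trajectory matrix can be represented as: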

$$X=\sum_{m=1}^{J}\sqrt{{\lambda }_{m}}{A}_{m}{B}_{m}^{T}$$

(2)

where ${B}_{m}$ denotes the right eigenvector, ${A}_{m}$ denotes the left eigenvector, which is called the time-empirical orthogonal function (T-EOF), and ${\lambda }_{m}$ denotes the eigenvalue that corresponds to ${A}_{m}$.

Step 3: Grouping: in this step, the series is grouped as follows:

$$X={X}_{{I}_{1}}+\dots +{X}_{{I}_{c}}$$

(3)

where $c$ denotes the number of disjoint groups that contain $J$ subsequences, which include one main trend component and other detail components, and $I$ denotes the disjoint groups.

Step 4: Reconstruction: the projection of the lagged vector ${X}_{i}$ on ${A}_{m}$ can be represented as follows:

$${a}_{i}^{m}={X}_{i}{A}_{m}=\sum_{j=1}^{J}{x}_{i+j}{A}_{m,j}, \;\;\;0\le i\le N-J$$

(4)

Then, T-EOF is used to conduct the following reconstruction:

$${x}_{i}^{k}=\left\{\begin{array}{ll}\frac{1}{i}{\sum }_{j=1}^{i}{a}_{i-j}^{k}{A}_{k,j}, & 1\le i\le J-1 \\ \frac{1}{J}{\sum }_{j=1}^{J}{a}_{i-j}^{k}{A}_{k,j}, & J\le i\le N-J+1 \\ \frac{1}{N-i+1}{\sum }_{j=i-N+J}^{J}{a}_{i-j}^{k}{A}_{k,j}, & N-J+2\le i\le N\end{array}\right.$$

(5)
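The four steps can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' implementation: the function name `ssa_decompose` and the argument `window` (playing the role of $J$) are assumptions, the grouping step is skipped so that every elementary matrix forms its own group, and the reconstruction is written as the equivalent averaging over the anti-diagonals of each elementary matrix.

```python
import numpy as np

def ssa_decompose(series, window):
    """Split a 1-D series into additive components via basic SSA."""
    x = np.asarray(series, dtype=float)
    n = x.size
    k = n - window + 1                      # number of lagged columns
    # Step 1 (embedding): J x K trajectory matrix with X[r, c] = x[r + c]
    traj = np.array([x[r:r + k] for r in range(window)])
    # Step 2 (SVD): traj = sum_m s[m] * outer(u[:, m], vt[m])
    u, s, vt = np.linalg.svd(traj, full_matrices=False)
    components = np.zeros((len(s), n))
    for m in range(len(s)):
        elementary = s[m] * np.outer(u[:, m], vt[m])
        # Step 4 (reconstruction): average each anti-diagonal r + c = idx
        for idx in range(n):
            anti_diag = np.diag(elementary[::-1], idx - window + 1)
            components[m, idx] = anti_diag.mean()
    return components                        # one row per component
```

Because the averaging is linear and the trajectory matrix is constant along its anti-diagonals, the rows of the returned array sum back to the input series. Dropping the rows that are judged to be noise then gives a denoised series.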

Support Vector Machine

SVM aims at identifying a hyperplane such that the total deviation between the samples and the hyperplane is minimized, and it is utilized widely in the field of forecasting. The procedure of SVM is presented in detail below.

The optimization problem of SVM is expressed as follows:

$$\underset{\varphi ,d,{\xi }_{i},\widehat{{\xi }_{i}} }{\mathrm{min}}\frac{1}{2}{\varphi }^{T}\varphi +C\sum_{i=1}^{n}\left({\xi }_{i}+\widehat{{\xi }_{i}}\right)$$

(6)

$$\mathrm{s.t.} \left\{\begin{array}{l}g\left({x}_{i}\right)-{y}_{i}\le \varepsilon +{\xi }_{i} \\ {y}_{i}-g\left({x}_{i}\right)\le \varepsilon +\widehat{{\xi }_{i}} \\ {\xi }_{i}\ge 0,\;\widehat{{\xi }_{i}}\ge 0 \end{array}\right.\quad \left(i=1,2,\dots ,n\right)$$

where $g(x)$ denotes the outcome, ${y}_{i}$ denotes the observation value, $n$ denotes the number of training samples, $\varepsilon$ denotes the threshold error, ${\xi }_{i}$ and $\widehat{{\xi }_{i}}$ are two relaxation variables, and $C$ is a constant. A schematic diagram is presented in Fig. 1.

The above expression can be rewritten in dual form as follows:

$$\underset{{\beta }_{i},{\widehat{\beta }}_{i}}{\mathrm{max}}\sum_{i=1}^{n}\left[{y}_{i}\left({\widehat{\beta }}_{i}-{\beta }_{i}\right)-\varepsilon \left({\widehat{\beta }}_{i}+{\beta }_{i}\right)\right]-\frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n}\left({\widehat{\beta }}_{i}-{\beta }_{i}\right)\left({\widehat{\beta }}_{j}-{\beta }_{j}\right){x}_{i}^{T}{x}_{j}$$

(7)

$$\mathrm{s.t.}\left\{\begin{array}{l}\sum_{i=1}^{n}\left({\widehat{\beta }}_{i}-{\beta }_{i}\right)=0\\ 0\le {\widehat{\beta }}_{i},{\beta }_{i}\le C\end{array}\right.$$

where ${\widehat{\beta }}_{i}\mathrm{ and }{\beta }_{i}$ denote the Lagrange multipliers and ${x}_{i}$ denotes the training data input.

Then, the solution of SVM is expressed as:

$$g\left(x\right)=\sum_{i=1}^{n}\left({\widehat{\beta }}_{i}-{\beta }_{i}\right)K\left(x,{x}_{i}\right)+d$$

(8)

where $K\left(x,{x}_{i}\right)$ denotes the kernel function.

Additionally, the radial basis function (RBF) is chosen to be the kernel function of SVM, and the kernel function is expressed as follows:

$$K\left(x,{x}_{i}\right)=\exp\left(-\frac{1}{2{\sigma }^{2}}{\Vert x-{x}_{i}\Vert }^{2}\right)$$

(9)
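As a small illustration of Eqs. (8) and (9), the prediction of a trained model is a kernel-weighted sum over the training inputs plus the bias. The sketch below assumes the dual coefficients ${\widehat{\beta }}_{i}-{\beta }_{i}$ and the bias $d$ have already been obtained by solving the dual problem; the function and argument names are hypothetical.

```python
import math

def rbf_kernel(x, xi, sigma):
    """Eq. (9): K(x, xi) = exp(-||x - xi||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, xi))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def svr_predict(x, train_x, dual_coef, bias, sigma):
    """Eq. (8): g(x) = sum_i (beta_hat_i - beta_i) * K(x, x_i) + d."""
    weighted = sum(c * rbf_kernel(x, xi, sigma)
                   for c, xi in zip(dual_coef, train_x))
    return weighted + bias
```

Only samples with a nonzero coefficient (the support vectors) contribute to the sum, so in practice the loop can be restricted to them.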

Simulated Annealing Algorithm

In contrast to other parameter optimizing methods, the SAA introduces random factors; namely, SAA accepts a solution that is worse than the current solution with a specified probability when iteratively updating a feasible solution. The procedure of SAA is described in detail in the following, and the process is presented graphically in Fig. 2.

Step 1: Define the maximum (initial) temperature ${T}_{0}$, the initial parameter values, their ranges, and the number of iterations.

Step 2: The cooling trend function is defined as follows:

$${T}_{i+1}=b\times {T}_{i}$$

(10)

where $b$ is a constant and its value is defined as 0.8.

Step 3: Generate a new solution ${s}_{i+1}$ based on the current solution ${s}_{i}$, and calculate the corresponding objective function value $F\left({s}_{i+1}\right)$. The change in the objective function is:

$$\Delta F = F \left({s}_{i+1}\right)-F \left({s}_{i}\right)$$

(11)

Step 4: According to the Metropolis criterion, the new solution is accepted with a specified probability. If $\Delta F <0$, the new solution ${s}_{i+1}$ is accepted as the new current solution. If $\Delta F\ge 0$, the new solution ${s}_{i+1}$ is accepted with probability $P$:

$$P={e}^{-\frac{\Delta F}{{T}_{i}}}$$

(12)

Step 5: Repeat steps 3 and 4 for the specified number of iterations at temperature ${T}_{i}$.

Step 6: Determine whether the temperature has reached the value of ${T}_{end}$. If not, return to step 2 and continue the optimization at the next temperature. If so, the algorithm terminates and returns the optimal solution.
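The six steps map onto a short loop. The following is a minimal sketch under stated assumptions rather than the authors' code: the Gaussian proposal, the clipping to the parameter ranges, the default values of `t0`, `t_end`, `iters_per_temp`, and `step`, and the bookkeeping of the best solution seen so far are all illustrative choices; only the cooling constant $b=0.8$ is taken from the text.

```python
import math
import random

def simulated_annealing(objective, start, bounds, t0=100.0, t_end=1e-3,
                        b=0.8, iters_per_temp=50, step=0.1, seed=0):
    """Minimize `objective` over box-bounded parameters with SAA."""
    rng = random.Random(seed)
    current = list(start)                                  # Step 1
    f_current = objective(current)
    best, f_best = list(current), f_current
    temp = t0
    while temp > t_end:                                    # Step 6
        for _ in range(iters_per_temp):                    # Step 5
            # Step 3: perturb the current solution, clipped to its range
            candidate = [min(hi, max(lo, v + rng.gauss(0.0, step * (hi - lo))))
                         for v, (lo, hi) in zip(current, bounds)]
            f_candidate = objective(candidate)
            delta = f_candidate - f_current                # Eq. (11)
            # Step 4: Metropolis criterion with P = exp(-delta / T), Eq. (12)
            if delta < 0 or rng.random() < math.exp(-delta / temp):
                current, f_current = candidate, f_candidate
                if f_current < f_best:
                    best, f_best = list(current), f_current
        temp *= b                                          # Step 2, Eq. (10)
    return best, f_best
```

At high temperatures almost every worse candidate is accepted, which lets the search leave local minima; as the temperature falls, the acceptance probability of worse candidates shrinks and the search settles into a basin.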

Recurrent Neural Network

RNN considers sequence data as input while conducting recursion in the evolution direction. RNN shows strong performance in natural language processing problems and can be employed on various practical problems, such as speech recognition and price forecasting. The following is the detailed procedure of RNN.

RNN outperforms simple machine learning methods in nonlinear forecasting. The structure of the utilized RNN is illustrated in Fig. 3.

In Fig. 3, $x$ denotes an input, $h$ denotes a hidden cell, $o$ denotes an output, $L$ denotes the loss function, $y$ denotes a label of the training set, $t$ denotes the time, and $V$, $U$, and $W$ denote the weight matrices. At time $t$, the relation formula can be expressed as follows:

$${h}_{t}=\phi \left(U{x}_{t}+W{h}_{t-1}+e\right)$$

(13)

where $\phi$ denotes the activation function and $e$ denotes the bias coefficient.

The output can be represented as follows, where $c$ denotes the bias of the output layer:

$${o}_{t}=V{h}_{t}+c$$

(14)

Thus, the final forecast can be expressed as follows:

$${\widehat{y}}_{t}=\tau \left({o}_{t}\right)$$

(15)

where $\tau$ denotes the activation function that is utilized for classification.
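Equations (13) to (15) amount to a single loop over the time steps. The sketch below is illustrative only: it assumes tanh for the hidden activation, a zero initial hidden state, and an identity output activation (a natural choice when the output is a real-valued forecast rather than a class label); none of these choices is specified in the text.

```python
import numpy as np

def rnn_forward(inputs, U, W, V, e, c):
    """Run Eqs. (13)-(15) over a sequence and return one output per step."""
    h = np.zeros(W.shape[0])              # initial hidden state (assumed zero)
    outputs = []
    for x_t in inputs:
        h = np.tanh(U @ x_t + W @ h + e)  # Eq. (13) with tanh activation
        o_t = V @ h + c                   # Eq. (14)
        outputs.append(o_t)               # Eq. (15) with identity activation
    return np.array(outputs)
```

The same matrices $U$, $W$, and $V$ are reused at every step, so the hidden state $h$ is the only channel through which earlier inputs influence later outputs.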

Gaussian Process Regression

GPR is based upon Bayesian theory, and it is a nonparametric model that uses a Gaussian process (GP) to conduct regression analysis. Equipped with the GP and their kernel functions, GPR is more convenient and has been applied to numerous tasks, such as time series analysis, image processing, and trend forecasting. The procedure of GPR is presented in detail in the following.

According to the assumption, a regression model that contains noise is represented as:

$$y=r\left(x\right)+\omega$$

(16)

where $x$ denotes the input vector, $r$ denotes the function value, and $y$ denotes the observations. It is assumed further that $\omega \sim N\left(0,{\sigma }_{n}^{2}\right)$.

Then, the joint distribution of $r$ and $y$ and the prior distribution of $y$ are represented as follows:

$$\left\{\begin{array}{c}y\sim N\left(0,{\sigma }_{n}^{2}{Z}_{n}+E\left(X,X\right)\right) \\ \left[\begin{array}{c}y\\ r\end{array}\right]\sim N\left(0,\left[\begin{array}{cc}{\sigma }_{n}^{2}{Z}_{n}+E\left(X,X\right)& E\left(X,x\right)\\ E\left(x,X\right)& e\left(x,x\right)\end{array}\right]\right)\end{array} \right.$$

(17)

where $E\left(X,X\right)={E}_{n}=\left({e}_{ij}\right)$ is an $n\times n$ symmetric positive-definite covariance matrix, matrix element ${e}_{ij}=e\left({x}_{i},{x}_{j}\right)$ measures the correlation between ${x}_{i}$ and ${x}_{j}$, $E\left(X,x\right)=E{\left(x,X\right)}^{T}$ denotes the $n\times 1$ covariance matrix between test point $x$ and the training inputs $X$, $e\left(x,x\right)$ denotes the covariance of $x$ with itself, and ${Z}_{n}$ denotes the $n$-dimensional identity matrix.

Finally, the posterior distribution of $r$ is expressed as follows:

$$r\mid X,y,x\sim N\left(\overline{r },Cov\left(r\right)\right)$$

(18)

In the above expression:

$$\left\{\begin{array}{c}\overline{r }=E\left(x,X\right){\left[E\left(X,X\right)+{\sigma }_{n}^{2}{Z}_{n}\right]}^{-1}y \\ Cov\left(r\right)=e\left(x,x\right)-E\left(x,X\right)\times {\left[E\left(X,X\right)+{\sigma }_{n}^{2}{Z}_{n}\right]}^{-1}E\left(X,x\right)\end{array} \right.$$

(19)

The commonly used covariance function can be represented as follows:

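$$e\left(x,{x}^{\prime}\right)={\sigma }_{r}^{2}\exp\left(-\frac{1}{2}{\left(x-{x}^{\prime}\right)}^{T}{M}^{-1}\left(x-{x}^{\prime}\right)\right)$$

(20)

where ${\sigma }_{r}^{2}$ denotes the signal variance and $M$ is a symmetric positive-definite scaling matrix (commonly a diagonal matrix of squared length scales).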
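Equations (19) and (20) translate directly into a few matrix operations. The NumPy sketch below is a hedged illustration, not the configuration used in the paper: it assumes $M={\ell }^{2}I$ with a single length scale `length_scale`, fixed (not learned) hyperparameters, and a hypothetical helper `interval_95` that turns the posterior mean and variance into an approximate 95% interval for a new noisy observation.

```python
import numpy as np

def sq_exp_kernel(a, b, signal_var=1.0, length_scale=1.0):
    """Eq. (20) with M = length_scale**2 * I (an assumption)."""
    sq_dist = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return signal_var * np.exp(-0.5 * sq_dist / length_scale ** 2)

def gpr_predict(X, y, X_new, noise_var=1e-2, **kernel_args):
    """Posterior mean and variance of r at the rows of X_new, Eq. (19)."""
    K = sq_exp_kernel(X, X, **kernel_args) + noise_var * np.eye(len(X))
    K_s = sq_exp_kernel(X_new, X, **kernel_args)           # E(x, X)
    mean = K_s @ np.linalg.solve(K, y)
    cov = (sq_exp_kernel(X_new, X_new, **kernel_args)
           - K_s @ np.linalg.solve(K, K_s.T))
    var = np.clip(np.diag(cov), 0.0, None)                 # guard round-off
    return mean, var

def interval_95(mean, var, noise_var=1e-2):
    """Approximate 95% interval for a new observation y = r + omega."""
    half_width = 1.96 * np.sqrt(var + noise_var)
    return mean - half_width, mean + half_width
```

The posterior variance shrinks near the training inputs and returns to the prior signal variance far away from them, which is what makes the resulting interval widths informative about uncertainty.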

Proposed Model

Enhanced Weighted Support Vector Machine

Definition of the Penalty Weight Function

The available hybrid forecasting models often ignore the accuracy of the intermediate forecasting result, and the accuracy of the final forecasting result has substantial room for improvement. Therefore, this paper focuses on increasing the accuracy of intermediate forecasting. Since standard SVM ignores the timing of the training samples, EWSVM is proposed to improve the forecasting performance. A stock index is a time series; namely, the importance of the data differs among periods. Typically, it is assumed that recent data and the information that they provide have a stronger impact on the model than earlier data. In contrast to standard SVM, EWSVM lets the penalty weight of each sample increase over time, so that more recent samples receive larger weights. In EWSVM, the weight function is defined as follows:

$${w}_{i}=\frac{1}{1+{e}^{8-\frac{16i}{n}}}$$

(21)

In this expression, $i$ denotes the index of the $i$th day’s sample in the training set in chronological order, $n$ represents the length of the complete training set, and ${w}_{i}$ represents the weight of the $i$th day’s sample. In addition, ${w}_{i}\cdot C$ replaces $C$ in formulas (6) and (7).
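To make the shape of Eq. (21) concrete, the weights can be computed directly. The helper name `penalty_weights` below is hypothetical; the formula is the one given above.

```python
import math

def penalty_weights(n):
    """Eq. (21): w_i = 1 / (1 + exp(8 - 16 * i / n)) for i = 1..n,
    with samples ordered from oldest (i = 1) to newest (i = n)."""
    return [1.0 / (1.0 + math.exp(8.0 - 16.0 * i / n)) for i in range(1, n + 1)]
```

The weights follow an S-shaped curve: they are close to 0 for the oldest samples, equal 0.5 at $i=n/2$, and approach 1 for the most recent samples. For readers who want to experiment, the `sample_weight` argument of `fit` in scikit-learn's `SVR` rescales $C$ per sample, which corresponds to replacing $C$ with ${w}_{i}\cdot C$; whether the authors used that library is not stated.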

Optimizing Parameters with an Intelligence Algorithm

Previous studies often used grid search, random search, and ordinary Monte Carlo algorithms to optimize parameters. Grid and random search have huge computational complexities and low search efficiencies, while the ordinary Monte Carlo algorithm easily falls into local extrema. The SAA method can overcome these limitations. As a general-purpose algorithm, SAA can in theory reach the global optimum with a certain probability. At present, SAA has been widely used in production scheduling, machine learning, signal processing, and other fields. Therefore, in this research, SAA is selected to optimize EWSVM, and the optimized hyperparameters are the penalty coefficient and threshold error of EWSVM. The main steps for SAA to optimize the EWSVM model are as follows:

Step 1: Define the initial parameter values and parameter ranges of SAA and EWSVM, and the number of iterations.

Step 2: Determine the objective function of SAA. The objective function of this paper is designed as:

$$\mathrm{min}\;\mathrm{Obj}=\mathrm{RMSE}=\sqrt{\frac{1}{L}{\sum }_{l=1}^{L}{({\widehat{t}}_{l}-{t}_{l})}^{2}}$$

(22)

where $L$ denotes the number of samples, and ${\widehat{t}}_{l}$ and ${t}_{l}$ represent the predicted value and the original value at point $l$, respectively.

Step 3: According to the Metropolis criterion, update the location of the solution.

Step 4: Repeat steps 2 and 3 for the specified number of iterations until the iteration stop condition is reached.

Step 5: Stop the iteration and get the optimal parameters of the EWSVM model.

SSA-EWSVM-RNN-GPR Model

As was introduced in the “Introduction” section, this study designed an innovative signal decomposition and interval forecasting-based hybrid model for precise point and reliable interval forecasting of a stock index. Figure 4 shows the flow chart of the proposed SSA-EWSVM-RNN-GPR model. The details of the designed prediction framework are described below.

Module I: Data decomposition: in this research, the closing price signal of the stock index is decomposed into N subseries by the SSA method, where the decomposition results include the main trend sequence, detail components, and noise signals. The noise component is removed from the decomposition result, and the remaining sequences are used as the input variables of the next module.
Module II : Preliminary prediction: the EWSVM model designed in this paper is used to predict the main trend series and detail components derived from the decomposition of stock index data. Among them, the enhanced weighted support vector machine considers the timing of data and includes an S-shaped weight function. In addition, SAA method is used to optimize the penalty coefficient and threshold error of EWSVM, so as to seek the minimum RMSE of the EWSVM result. The excellent global optimization of the algorithm is helpful to obtain the EWSVM model with the best prediction accuracy and stability.
Module III : Nonlinear ensemble prediction: an excellent deep learning method RNN model is selected for the nonlinear ensemble of the proposed stock index prediction framework to achieve accurate point prediction. The input variables of this module are the preliminary prediction results of EWSVM and four stock indexes (opening price, high price, low price, and change). Furthermore, the proposed SSA-EWSVM-RNN model is compared with several benchmark models (including SSA-LSSVM-RNN, SSA-SVM-RNN, SSA-RNN, RNN, LSTM [41], DWT-RNN [42], AHP-LSSVM [43], and WEIGHT-LSSVM [44]) to verify the prediction performance of the proposed model.
Module IV : Interval prediction: GPR is used in the interval prediction of the stock index prediction framework designed in this research. Based on the point prediction results obtained by the first three modules, the GPR method is applied to predict the confidence interval, so as to provide more abundant future stock index trend information. In addition, in order to prove that the proposed model has prediction advantages in interval prediction, the proposed SSA-EWSVM-RNN-GPR model is compared with SSA-LSSVM-RNN-GPR, SSA-SVM-RNN-GPR, SSA-RNN-GPR, RNN-GPR, LSTM-GPR, DWT-RNN-GPR, AHP-LSSVM-GPR, and WEIGHT-LSSVM-GPR models.

Case Study

This section describes in detail the data, evaluation indicators, and several experiments that are utilized to evaluate the performance of the SSA-EWSVM-RNN-GPR model.

Data Collection

This study uses two data sets to evaluate the performance of the proposed model: the Nikkei 225 Index and the Hang Seng Index. The Nikkei 225 data set consists of 4690 consecutive records of closing price, opening price, high price, low price, and change from January 5th, 2001, to January 16th, 2020. The Hang Seng data set consists of 4714 consecutive records of the same variables from December 5th, 2000, to January 16th, 2020. The closing price sequences of the two data sets are plotted in Figs. 5 and 6, respectively. In the following modelling process, the first $80\%$ of each data set is used as the training set, while the remaining $20\%$ is used as the test set.
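The chronological 80/20 split can be illustrated with a short sketch. The function name and the rounding of the cut point (floor) are assumptions, since the exact boundary index is not stated; the placeholder lists stand in for the real price records.

```python
def chronological_split(series, train_num=4, train_den=5):
    """Split a time-ordered sequence into a leading training part and a trailing test part."""
    # Integer arithmetic avoids floating-point rounding at the cut point.
    cut = len(series) * train_num // train_den
    return series[:cut], series[cut:]


# Placeholder sequences with the record counts reported for the two data sets.
nikkei_train, nikkei_test = chronological_split(list(range(4690)))
hang_seng_train, hang_seng_test = chronological_split(list(range(4714)))
```

Under this floor-rounding assumption the split gives 3752 training and 938 test records for the Nikkei 225 Index, and 3771 training and 943 test records for the Hang Seng Index. No shuffling is applied, so every test record is later in time than every training record.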

Evaluation Indicators

In order to comprehensively evaluate the performance of the SSA-EWSVM-RNN-GPR prediction framework, seven evaluation criteria are used in this paper: mean absolute error (MAE), root mean square error (RMSE), mean absolute percentage error (MAPE), coefficient of determination (R²), coverage probability (${CP}_{\alpha }$), mean width percentage (${MWP}_{\alpha }$), and a newly designed criterion, ${MC}_{\alpha }$. MAE, RMSE, MAPE, and R² are used to evaluate the point prediction results, and ${CP}_{\alpha }$, ${MWP}_{\alpha }$, and ${MC}_{\alpha }$ are used to evaluate the interval prediction results. The mathematical formulas of these indicators are shown in Table 1.

Table 1 Evaluation indicators of performance

In these expressions, R denotes the real price, $\overline{R }$ denotes the mean value of the real data, P denotes the forecast price, α denotes the confidence level, c denotes the number of real prices that fall into the forecasting interval, T is the number of samples in the test set, and up and down are the upper and lower limits of the prediction interval.
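Since the formulas of Table 1 are not reproduced in this text, the following sketch shows one plausible way to compute the seven indicators. MAE, RMSE, MAPE, and R² use their standard definitions; CP is taken as c/T from the symbol definitions above, and MC as MWP divided by CP as stated in the Abbreviations. The form of MWP, the mean of (up − down)/R, is an assumption, as are the function names.

```python
import math


def point_metrics(real, pred):
    """MAE, RMSE, MAPE (as a fraction) and R^2 for paired real and forecast prices."""
    n = len(real)
    errors = [p - r for r, p in zip(real, pred)]
    mean_real = sum(real) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((r - mean_real) ** 2 for r in real)
    return {
        "MAE": sum(abs(e) for e in errors) / n,
        "RMSE": math.sqrt(ss_res / n),
        "MAPE": sum(abs(e / r) for e, r in zip(errors, real)) / n,
        "R2": 1.0 - ss_res / ss_tot,
    }


def interval_metrics(real, down, up):
    """CP, MWP and MC for prediction intervals [down, up]."""
    t = len(real)
    # c: number of real prices that fall inside their interval.
    c = sum(1 for r, lo, hi in zip(real, down, up) if lo <= r <= hi)
    cp = c / t
    # Assumed form of MWP: interval width relative to the real price, averaged.
    mwp = sum((hi - lo) / r for r, lo, hi in zip(real, down, up)) / t
    # MC = MWP / CP; undefined when no real price is covered (cp == 0).
    return {"CP": cp, "MWP": mwp, "MC": mwp / cp}


# Small made-up example, for illustration only.
pm = point_metrics([100.0, 200.0, 400.0], [110.0, 190.0, 400.0])
im = interval_metrics([100.0, 200.0, 400.0], [105.0, 190.0, 380.0], [115.0, 210.0, 420.0])
```

Because MC divides the relative width by the coverage, a narrow interval is rewarded only if it still covers the real prices, and a smaller MC is better.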

Results of Data Decomposition

SSA is used to decompose the closing price of each stock index into signal components with different characteristics, which reduces the randomness of the data and yields more stable series. The original data are decomposed into ten signal components, including a trend signal, detail components, and noise signals. Because the noise signals may degrade the prediction results, the last two components are discarded as noise, and the remaining eight components (one main trend component and seven detail components) are selected for subsequent experiments. The decomposition results of the Nikkei 225 Index and the Hang Seng Index are shown in Figs. 7 and 8.

Results of Point Prediction

The actual closing price of the stock index is decomposed by the SSA method into eight retained sub-sequences, and each sub-sequence is separately input into EWSVM for model construction. Since variables often differ in their units and degrees of variation, modelling the original data directly makes it difficult to achieve satisfactory forecasting performance. Standardization is therefore used to eliminate this negative effect. The standardization formula is expressed as follows:

$${x}^{*}=\frac{x-\mu }{\sigma }$$

(23)

In this expression, $x$ denotes the original stock price, $\mu$ denotes the average value of $x$, $\sigma$ denotes the standard deviation of $x,$ and ${x}^{*}$ denotes the standardized closing price.
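Eq. (23) can be sketched as follows. Two choices here are assumptions rather than details given in the paper: $\mu$ and $\sigma$ are estimated on the training portion only and then reused for the test portion, and $\sigma$ is the population standard deviation. The function names are illustrative.

```python
import statistics


def fit_standardizer(train):
    """Estimate mu and sigma of Eq. (23) from the training values (assumed practice)."""
    mu = statistics.fmean(train)
    sigma = statistics.pstdev(train)  # population standard deviation (assumed)
    return mu, sigma


def standardize(values, mu, sigma):
    """Apply x* = (x - mu) / sigma element-wise."""
    return [(x - mu) / sigma for x in values]


def destandardize(values, mu, sigma):
    """Invert the transform to map forecasts back to price units."""
    return [x * sigma + mu for x in values]


# Made-up numbers, for illustration only.
train = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
mu, sigma = fit_standardizer(train)
scaled = standardize(train, mu, sigma)
restored = destandardize(scaled, mu, sigma)
```

Forecasts produced on the standardized scale need the inverse transform before error measures such as MAE or RMSE are reported in index points.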

The standardized training set is used to construct the SAA-EWSVM model, in which SAA optimizes the penalty coefficient and threshold error of EWSVM. For parameter optimization, a search space is established in which the penalty coefficient follows the uniform distribution on [0.001, 5000] and the threshold error follows the uniform distribution on [0.001, 100]. The number of iterations of parameter optimization is 1000. The kernel function of EWSVM is the radial basis function. Figure 9 shows the parameter optimization process for the Nikkei 225 Index and the Hang Seng Index, and Table 2 shows the corresponding optimization results. In this table, C denotes the penalty coefficient and ε denotes the threshold error.

Table 2 Optimal parameters of EWSVM that are based on the Nikkei 225 Index and the Hang Seng Index

According to the optimized parameters, the optimal EWSVM model is constructed. Figures 10 and 11 describe the EWSVM results of the Nikkei 225 Index and the Hang Seng Index, including the fitting results of the training set and the prediction results of the test set of eight sub-sequences.

The preliminary results of EWSVM are part of the input variables of the RNN nonlinear ensemble. In addition, other influencing factors, namely opening price, high price, low price, and change, are also considered in this research. A multivariable RNN model is used for the nonlinear integration of these 12 sequences (eight EWSVM outputs and four additional variables). The RNN model designed in this study includes an input layer and an output layer, with 100 and 1 neurons, respectively. The activation function is the rectified linear unit, the dropout is set to 0.15, and the number of iterations is 500. Furthermore, the lag period of the model is 5 days; that is, the data of the previous five days are used to predict the data of the next day. The final point prediction results are shown in Fig. 12. Table 3 shows the point forecast evaluation results of the Nikkei 225 Index and the Hang Seng Index.
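The 5-day lag can be made concrete with a sketch of how supervised samples might be assembled for the RNN. The layout (one row of features per day, a window of the previous five rows as input, the closing price of the following day as target) and the function name are assumptions; the paper does not describe its data pipeline at this level.

```python
def make_lagged_samples(features, target, lag=5):
    """Build (window, next-day target) pairs from time-ordered data.

    features: list of per-day feature rows (12 values per row in the paper's setting)
    target:   list of per-day closing prices, aligned with `features`
    """
    inputs, outputs = [], []
    for t in range(lag, len(target)):
        inputs.append(features[t - lag:t])  # days t-lag, ..., t-1
        outputs.append(target[t])           # day t
    return inputs, outputs


# Made-up data with 2 features per day, for illustration only.
demo_features = [[d, 10 * d] for d in range(8)]
demo_target = list(range(100, 108))
windows, next_day = make_lagged_samples(demo_features, demo_target, lag=5)
```

With 8 days of data and a lag of 5, this yields 3 samples; each window has shape (5, number of features), which is the usual input layout for a recurrent layer.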

Table 3 Evaluation of the point forecasting result that was obtained with SSA-EWSVM-RNN

Results of Interval Prediction

Based on the previous point prediction results (Figs. 12 and 13), GPR is used to predict the 95% confidence interval of the closing price and thus obtain additional uncertainty information. The kernel function of the Gaussian process is the optimized radial basis function, whose width is searched over the range [0.01, 10000]. The interval forecast results are shown in Fig. 14, and the interval forecast evaluation results for the Nikkei 225 Index and the Hang Seng Index are given in Table 4.
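As a reminder of how a 95% interval follows from a Gaussian predictive distribution, the sketch below converts a predictive mean and standard deviation into lower and upper limits. It assumes that the GPR output at each test point is summarized by such a mean and standard deviation; the function name and the example numbers are illustrative, not results from the paper.

```python
from statistics import NormalDist


def gaussian_interval(mean, std, confidence=0.95):
    """Central interval of a normal predictive distribution: mean -/+ z * std."""
    # z is the standard normal quantile at 0.5 + confidence / 2 (about 1.96 for 95%).
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return mean - z * std, mean + z * std


# Made-up predictive mean and standard deviation, for illustration only.
down, up = gaussian_interval(100.0, 10.0)
```

The limits `down` and `up` obtained in this way are the quantities that enter the coverage and width indicators of the interval evaluation.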

Table 4 Evaluation of the interval forecasting result that was obtained with SSA-EWSVM-RNN-GPR

Comparison and Analysis

In order to clearly prove the prediction advantages of the proposed stock index prediction framework, SSA-EWSVM-RNN-GPR is compared with eight benchmark models from the perspectives of prediction results and running time.

Comparison of Results

In order to evaluate the performance of the proposed model, several groups of comparative experiments are designed to further prove the rationality and superiority of the stock index prediction framework. Figure 13 and Table 5 show the point prediction evaluation results of various models, and Fig. 15 and Table 6 describe the interval prediction results of them.

Table 5 Comparisons of the point forecasting results that were obtained with various models

Table 6 Comparisons of the interval forecasting results that were obtained with various models

(a)
Comparison of SSA-EWSVM-RNN-GPR, SSA-LSSVM-RNN-GPR and SSA-SVM-RNN-GPR.

Experiment (a) is used to verify the advantages of the EWSVM model. Comparing the point and interval prediction results of SSA-EWSVM-RNN-GPR, SSA-LSSVM-RNN-GPR, and SSA-SVM-RNN-GPR shows that the model proposed in this research achieves the best prediction performance. For example, in the case of the Hang Seng Index, the SSA-EWSVM-RNN-GPR model achieved the optimal MAE, R², MAPE, RMSE, ${CP}_{95\%}$, ${MWP}_{95\%}$, and ${MC}_{95\%}$ with 79.215, 0.997, 0.007, 96.774, 0.97985, 0.28223, and 0.28803, respectively, which reveals the superiority of the EWSVM model proposed in this research over the other support vector machine models.
(b)
Comparison of SSA-EWSVM-RNN-GPR and SSA-RNN-GPR.

Experiment (b) aims to illustrate the role of the EWSVM module in the proposed stock index forecasting framework. Through the prediction results of the SSA-EWSVM-RNN-GPR and SSA-RNN-GPR models, it can be seen that SSA-EWSVM-RNN-GPR has better prediction performance. Therefore, EWSVM is indispensable in this framework and can improve the overall predictive ability of the designed framework.
(c)
Comparison of SSA-EWSVM-RNN-GPR and RNN-GPR.

In experiment (c), the prediction results of the SSA-EWSVM-RNN-GPR and RNN-GPR models show that the prediction ability of the hybrid model under the decomposition-integration framework is significantly better than that of the single model. For example, in the case of the Nikkei 225 Index, the proposed model obtains the optimal MAE, R², MAPE, RMSE, ${CP}_{95\%}$, ${MWP}_{95\%}$, and ${MC}_{95\%}$ with 66.074, 0.997, 0.007, 80.038, 0.89979, 0.05746, and 0.06385, respectively.
(d)
Comparison of SSA-EWSVM-RNN-GPR and published models.

Experiment (d) aims to demonstrate the prediction superiority of the proposed stock index prediction framework, and to prove it by comparing SSA-EWSVM-RNN-GPR with other published models. Since there are very few stock index forecasting models for interval forecasting, this research provides interval forecasting results of other published models based on GPR for comparison and analysis. Based on the point prediction results and interval prediction results of the SSA-EWSVM-RNN-GPR, LSTM-GPR, DWT-RNN-GPR, AHP-LSSVM-GPR, and WEIGHT-LSSVM-GPR models, it is observed that SSA-EWSVM-RNN-GPR has better predictive performance. Therefore, it can be concluded that the prediction framework proposed in this paper is more effective than other models.

Comparison of Running Time

In order to discuss the computational complexity of the model, the average running time of the various models, that is, the time each model takes per iteration over the data, is compared in Table 7.

Table 7 Comparison of the average running time of various models

In the Nikkei 225 Index and Hang Seng Index cases, the average running time of the SSA-EWSVM-RNN-GPR model is 1.0212 and 1.0473, respectively, which ranks in the middle among the compared models. The running time of RNN-GPR, LSTM-GPR, and DWT-RNN-GPR is relatively short, and the running time of SSA-SVM-RNN-GPR and SSA-RNN-GPR is close to that of the proposed model. However, the proposed framework achieves clearly higher forecast accuracy than these models at an acceptable time cost. Therefore, the moderate computational complexity of this model is acceptable.

Discussion

Table 5 and Table 6 show the performance results of different models in point prediction and interval prediction. Based on these results, the following key points are summarized:

(1)
According to the comparison between SSA-EWSVM-RNN and SSA-SVM-RNN, SSA-EWSVM-RNN, which improves upon the standard SVM, can be more accurate and more self-correcting when forecasting the original closing price series. The comparison results between SSA-EWSVM-RNN and SSA-LSSVM-RNN prove the prediction superiority of the improved weighted support vector machine method in this research.
(2)
The prediction results of SSA-EWSVM-RNN-GPR and SSA-RNN-GPR verify the role of EWSVM in the proposed prediction framework, and also show that emphasizing the accuracy of intermediate prediction is helpful to improve the final prediction accuracy.
(3)
According to the comparison between SSA-EWSVM-RNN-GPR and RNN-GPR, the accuracy difference between the two models shows that the hybrid model based on signal decomposition is better than the single prediction model.
(4)
From the comparison between SSA-EWSVM-RNN-GPR and eight benchmark models, it can be concluded that the accuracy of point prediction has a great influence on the performance of interval prediction.
(5)
Compared with other models, the proposed SSA-EWSVM-RNN-GPR model proved to be an effective model for point prediction and interval prediction.

Conclusions

Forecasting stock indices accurately is crucial for investment decisions and risk management and is extremely meaningful to investors and financial regulators. In this paper, the SSA-EWSVM-RNN-GPR model is used to forecast the closing prices of stock indices, and it is shown to be effective in both point and interval forecasting. The SSA-EWSVM-RNN-GPR model utilizes singular spectrum analysis, an enhanced weighted support vector machine, the simulated annealing algorithm, a recurrent neural network, and Gaussian process regression. Singular spectrum analysis is used to decompose the original stock index signals into several components, to extract meaningful components, and to discard the noise components. The enhanced weighted support vector machine, which considers the time order of the data, is employed to separately forecast each component obtained via singular spectrum analysis, and the simulated annealing algorithm is used to optimize its hyperparameters. To nonlinearly integrate these component forecasts, a recurrent neural network is utilized to obtain accurate point forecasting results. Furthermore, to obtain uncertainty information regarding the forecasted closing price, Gaussian process regression is applied for confidence interval forecasting based on the point forecasting results. Compared with eight benchmark models, the proposed SSA-EWSVM-RNN-GPR model can be an effective tool for both point and interval forecasting of stock indices.

Although the developed stock index forecasting framework shows good accuracy in both point and interval forecasting, some limitations remain. For example, better optimization algorithms could be applied to the stock index prediction model in the future. In addition, this research only considers the closing price, opening price, high price, low price, and change of the stock index. In future research, more factors can be considered within the framework of stock index prediction.

Abbreviations

ANN: Artificial neural network
ARMA: Auto-regressive moving average model
BP: Back propagation network
CP_α: Coverage probability
EMD: Empirical mode decomposition
EWSVM: Enhanced weighted support vector machine
GARCH: Generalized auto-regressive conditional heteroskedasticity
GP: Gaussian process
GPR: Gaussian process regression
MAE: Mean absolute error
MAPE: Mean absolute percentage error
MC_α: MWP_α divided by CP_α
MSE: Mean square error
MWP_α: Mean width percentage
RBF: Radial basis function
RIM: Residual income model
RNN: Recurrent neural network
RMSE: Root mean square error
R²: Coefficient of determination
SAA: Simulated annealing algorithm
SSA: Singular spectrum analysis
SVM: Support vector machine
SVD: Singular value decomposition
SWLSTM: Shared weight long short-term memory network
T-EOF: Time-empirical orthogonal function
WD: Wavelet decomposition
WSVM: Weighted support vector machine

References

[1] Long W, Lu Z, Cui L. Deep learning-based feature engineering for stock price movement prediction. Knowl-Based Syst. 2019;164:163–73.
[2] Zhang K, Zhong G, Dong J, Wang S, Wang Y. Stock market prediction based on generative adversarial network. Proced Comput Sci. 2019;147:400–6.
[3] Wang Y, Wang L, Yang F, Di W, Chang Q. Advantages of direct input-to-output connections in neural networks: the Elman network for stock index forecasting. Inf Sci. 2021;547:1066–79.
[4] Wafi AS, Hassan H, Mabrouk A. Fundamental analysis models in financial markets–review study. Proced Econ Financ. 2015;30:939–47.
[5] Rounaghi MM, Zadeh FN. Investigation of market efficiency and financial stability between S&P 500 and London stock exchange: monthly and yearly forecasting of time series stock returns using ARMA model. Physica A. 2016;456:10–21.
[6] Chen S, Jeong K, Härdle WK. Recurrent support vector regression for a non-linear ARMA model with applications to forecasting financial returns. Computation Stat. 2015;30(3):821–43.
[7] Francq C, Wintenberger O, Zakoïan JM. Goodness-of-fit tests for Log-GARCH and EGARCH models. TEST. 2018;27(1):27–51.
[8] Shi S, Liu W, Jin M. Stock price forecasting based on a combined model of ARMA and BP neural network and Markov model. J Inform Process Manage. 2013;4(3):215–21.
[9] Wei Y, Yu Q, Liu J, Cao Y. Hot money and China’s stock market volatility: further evidence using the GARCH–MIDAS model. Physica A. 2018;492:923–30.
[10] Zhang X, Frey R. Improving ARMA-GARCH forecasts for high frequency data with regime-switching ARMA-GARCH. J Comput Anal Appl. 2015;18(1):727–51.
[11] Yu J, Tan M, Zhang H, Tao D, Rui Y. Hierarchical deep click feature prediction for fine-grained image recognition. IEEE Trans Pattern Anal Mach Intell. 2019;99:1–1.
[12] Yu J, Tao D, Wang M, Rui Y. Learning to rank using user clicks and visual features for image retrieval. IEEE Trans Cybern. 2015;45(4):767–79.
[13] Pan Y, Xiao Z, Wang X, Yang D. A multiple support vector machine approach to stock index forecasting with mixed frequency sampling. Knowl-Based Syst. 2017;122:90–102.
[14] Wang J, Wang J. Forecasting stock market indexes using principle component analysis and stochastic time effective neural networks. Neurocomputing. 2015;156:68–78.
[15] Grigoryan H. A Stock market prediction method based on support vector machines (SVM) and independent component analysis (ICA). Database Sys J. 2016;7(1):12–21.
[16] Moghaddam AH, Moghaddam MH, Esfandyari M. Stock market index prediction using artificial neural network. Journal of Economics Financ Administr S. 2016;21(41):89–93.
[17] Zhang D, Lou S. The application research of neural network and BP algorithm in stock price pattern classification and prediction. Future Gener Comp Sy. 2021;115:872–9.
[18] Xiao J, Zhu X, Huang C, Yang X, Wen F, Zhong M. A new approach for stock price analysis and prediction based on SSA and SVM. J Inform Techn Decis Making. 2019;18(01):287–310.
[19] Chandar SK. Hybrid models for intraday stock price forecasting based on artificial neural networks and metaheuristic algorithms. Pattern Recogn Lett. 2021;147:124–33.
[20] Chen Y, Hao Y. Integrating principle component analysis and weighted support vector machine for stock trading signals prediction. Neurocomputing. 2018;321:381–402.
[21] Huang C, Zhou J, Chen J, Yang J, Clawson K, Peng Y. A feature weighted support vector machine and artificial neural network algorithm for academic course performance prediction. Neural Comput Appl. 2021; 1–13.
[22] Fayed HA, Atiya AF. Speed up grid-search for parameter selection of support vector machines. Appl Soft Comput. 2019;80:202–10.
[23] Bergstra J, Bengio Y. Random search for hyper-parameter optimization. J Mach Learn Res. 2012;13:281–305.
[24] Tao Y, Yan H, Gao H, Sun Y, Li G. Application of SVR optimized by modified simulated annealing (MSA-SVR) air conditioning load prediction model. J Ind Inf Integr. 2019;15:247–51.
[25] Winters-Hilt S. Clustering via support vector machine boosting with simulated annealing. Int J Comput Optim. 2017;4(1):53–89.
[26] Gao T, Chai Y. Improving stock closing price prediction using recurrent neural network and technical indicators. Neural Computat. 2018;1–22.
[27] Qiu Y, Yang HY, Lu S, Chen W. A novel hybrid model based on recurrent neural networks for stock market timing. Soft Comput. 2020; 1–18.
[28] Wang JZ, Wang JJ, Zhang ZG, Guo SP. Forecasting stock indices with back propagation neural network. Expert Syst Appl. 2011;38(11):14346–55.
[29] Wei LY. A hybrid ANFIS model based on empirical mode decomposition for stock time series forecasting. Appl Soft Comput. 2016;42:368–76.
[30] Ghodsi M, Hassani H, Rahmani D, Silva ES. Vector and recurrent singular spectrum analysis: which is better at forecasting. J Appl Stat. 2018;45(10):1872–99.
[31] Lahmiri S. Minute-ahead stock price forecasting based on singular spectrum analysis and support vector regression. Appl Math Comput. 2018;320:444–51.
[32] Leles MC, Moreira MG, Vale-Cardoso AS, Nascimento CL, Sbruzzi EF, Guimarães HN. Comparision between Basic and Toeplitiz SSA applied to non-stationary time-series. Stat Interface. 2019;12(4):527–36.
[33] Liu H, Mi X, Li Y, Duan Z, Xu Y. Smart wind speed deep learning based multi-step forecasting model using singular spectrum analysis, convolutional gated recurrent unit network and support vector regression. Renew Energ. 2019;143:842–54.
[34] Wen F, Xiao J, He Z, Gong X. Stock price prediction based on SSA and SVM. Proced Comput Sci. 2014;31:625–31.
[35] Wang J, Wang Z, Li X, Zhou H. Artificial bee colony-based combination approach to forecasting agricultural commodity prices. J Forecast. 2019; https://doi.org/10.1016/j.ijforecast.2019.08.006
[36] Bishoyi A, Wang X, Dey DK. Learning semiparametric regression with missing covariates using Gaussian process models. Bayesian Anal. 2020;15(1):215–39.
[37] Chandorkar M, Camporeale E, Wing S. Probabilistic forecasting of the disturbance storm time index: an autoregressive Gaussian process approach. Space Weather. 2017;15(8):1004–19.
[38] Fang D, Zhang X, Yu Q, Jin TC, Tian L. A novel method for carbon dioxide emission forecasting based on improved Gaussian processes regression. J clean Prod. 2018;173:143–50.
[39] Zhang C, Wei H, Zhao X, Liu T, Zhang K. A Gaussian process regression based hybrid approach for short-term wind speed prediction. Energ Convers Manage. 2016;126:1084–92.
[40] Zhang Z, Ye L, Qin H, Liu Y, Wang C, Yu X, et al. Wind speed prediction method using shared weight long short-term memory network and Gaussian process regression. Appl Energ. 2019;247:270–84.
[41] Liu K, Zhou J, Dong D. Improving stock price prediction using the long short-term memory model combined with online social networks. J Behav Exp Finance. 2021;30:100507.
[42] Hajiabotorabi Z, Kazemi A, Samavati FF, Ghaini FMM. Improving DWT-RNN model via B-spline wavelet multiresolution to forecast a high-frequency time series. Expert Syst Appl. 2019;138:112842.
[43] Marković I, Stojanović M, Stanković J, Stanković M. Stock market trend prediction using AHP and weighted kernel LS-SVM. Soft Comput. 2017;21(18):5387–98.
[44] Chen TT, Lee SJ. A weighted LS-SVM based learning system for time series forecasting. Inf Sci. 2015;299:99–116.

Funding

This research was supported by the National Natural Science Foundation of China (Grant No. 71971122 and 71501101).

Author information

Authors and Affiliations

School of Management Science and Engineering, Nanjing University of Information Science and Technology, Nanjing, 210044, China
Jujie Wang
Changwang School of Honors, Nanjing University of Information Science and Technology, Nanjing, 210044, China
Liu Feng, Yang Li & Chunchen Feng
School of Computer and Software, Nanjing University of Information Science and Technology, Nanjing, 210044, China
Junjie He

Corresponding author

Correspondence to Jujie Wang.

Ethics declarations

Ethical Approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Informed Consent

Informed consent was obtained from all individual participants included in the study.

Conflict of Interest

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Wang, J., Feng, L., Li, Y. et al. Deep Nonlinear Ensemble Framework for Stock Index Forecasting and Uncertainty Analysis. Cogn Comput 13, 1574–1592 (2021). https://doi.org/10.1007/s12559-021-09961-3

Received: 31 March 2021
Accepted: 05 November 2021
Published: 13 November 2021
Issue Date: November 2021
DOI: https://doi.org/10.1007/s12559-021-09961-3
