
1 Introduction

The stock market is characterized by high returns and high risks [1], so the analysis and forecasting of stock prices has long attracted attention. The complexity of the internal structure of the stock price system and the diversity of external factors (national policy, bank rates, price indices, the performance of quoted companies and the psychology of investors) determine the complexity of the stock market and the uncertainty and difficulty of the stock price forecasting task [2]. Because stock prices are collected in temporal order, they actually form a complex nonlinear time series [3]. Some traditional stock market analysis methods, such as stock price chart analysis (e.g., the K-line chart [4]), cannot deeply reveal the intrinsic relationships among stock prices, so their prediction results are not ideal. Stock price prediction methodologies fall into three broad categories: fundamental analysis, technical analysis (charting) and technological methods.

From the viewpoint of mathematics, the key to effective stock price prediction is to discover the intrinsic mapping or function, and to fit and approximate it. The mixture of Gaussian processes (MGP) model [5], which has developed rapidly, is a powerful tool for solving this problem. However, most MGP models are very complex and involve a large number of parameters and hyper-parameters, which makes their application very difficult [6]. Thus, we adopt the MGP model proposed in [7], which excludes unnecessary priors and carefully selects the model structure and gating function. This MGP model retains the main structure, features and advantages of the original MGP model. Moreover, it can be effectively applied to the modeling and prediction of nonlinear time series via the precise hard-cut EM algorithm. In fact, the precise hard-cut EM algorithm is more efficient than the soft EM algorithm, since the hyper-parameters of each GP component can be learned independently in the M-step. Experimental results have demonstrated that this precise hard-cut EM algorithm for the MGP model gives more precise predictions than some typical regression models and algorithms.

Along this direction, we apply the MGP model to short-term stock price forecasting via the precise hard-cut EM algorithm. The experimental results show that this MGP based method can find potential rules in historical datasets, and its forecasting results are stable and accurate.

The rest of this paper is organized as follows. In Sect. 2, we give a brief review of the MGP model and introduce the precise hard-cut EM algorithm. Section 3 presents the framework of stock price forecasting and the experimental results of the MGP based method as well as the comparisons of the regression models and algorithms. Finally, we give a brief conclusion in Sect. 4.

2 The Precise Hard-cut EM Algorithm for MGPs

2.1 The MGP Model

We consider the MGP model as described in [7]. In fact, it can be viewed as a special mixture model where each component is a GP. The whole set of indicators \( Z = [z_1, z_2, \ldots, z_N]^T \), inputs X and outputs Y are sequentially generated, and the MGP model is mathematically defined as follows:

$$ p(z_t = c) = \pi_c, \quad t = 1, 2, \ldots, N, $$
(1)
$$ p(x_t \mid z_t = c, \theta_c) = N(x_t \mid \mu_c, S_c), \quad t = 1, 2, \ldots, N, $$
(2)
$$ p(\mathbf{y} \mid X, \theta) = \prod_{c=1}^{C} N(\mathbf{y}_c \mid \mathbf{0}, K(X_c, X_c \mid \theta_c) + \sigma_c^2 I_{N_c}) $$
(3)

where \( K(x_i, x_j) = g^2 \exp\{-\frac{1}{2}(x_i - x_j)^T B (x_i - x_j)\} \), \( B = \mathrm{diag}\{b_1^2, b_2^2, \ldots, b_d^2\} \), and Eq. (2) adopts the Gaussian input assumption used in most generative MGP models [8–10]. \( \theta_c = \{\pi_c, \mu_c, S_c, g_c, b_{c,1}, b_{c,2}, \ldots, b_{c,d}, \sigma_c\} \) denotes the parameters of the c-th GP component and \( \theta = \{\theta_c\}_{c=1}^{C} \) denotes all the parameters of the mixture model.
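As a concrete illustration, the kernel above can be evaluated as follows (a minimal NumPy sketch; the function name and the vectorized form are ours, not from [7]):

```python
import numpy as np

def se_ard_kernel(X1, X2, g, b):
    """Squared-exponential ARD kernel
    K(x_i, x_j) = g^2 exp(-0.5 (x_i - x_j)^T B (x_i - x_j)),
    with B = diag(b_1^2, ..., b_d^2)."""
    # Scaling each dimension by b_k turns the B-weighted quadratic form
    # into an ordinary squared Euclidean distance.
    X1 = np.atleast_2d(X1) * b
    X2 = np.atleast_2d(X2) * b
    sq = (np.sum(X1 ** 2, axis=1)[:, None]
          + np.sum(X2 ** 2, axis=1)[None, :]
          - 2.0 * X1 @ X2.T)
    return g ** 2 * np.exp(-0.5 * np.maximum(sq, 0.0))
```

Each component c would use its own \( g_c, b_{c,1}, \ldots, b_{c,d} \) with this function.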

The MGP model has a clear and prominent generative structure and avoids complicated parameter settings. Since the Gaussian means \( \mu_c \) differ across GP components, each component concentrates on a different region of the input space, so the mixture model can fit multimodal datasets.

2.2 The Precise Hard-cut EM Algorithm

To avoid the computational complexity of the Q-function, it is reasonable to use the hard-cut version of the EM algorithm, so that the parameters of the MGP model can be learned efficiently. In fact, the precise hard-cut EM algorithm [7] is a good choice, and we summarize its procedure as follows:
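The full listing is given in [7]; a heavily simplified, runnable sketch of the hard-cut EM loop is shown below. This is our own toy version, not the authors' implementation: the E-step uses the marginal output likelihood \( N(y_t \mid 0, \sigma_c^2) \) as a stand-in for the full GP likelihood, and the M-step fits only the gate parameters and the noise level, not the kernel hyper-parameters.

```python
import numpy as np

def log_gauss(x, mu, var):
    # log density of a diagonal Gaussian, summed over dimensions
    return -0.5 * np.sum(np.log(2 * np.pi * var) + (x - mu) ** 2 / var, axis=-1)

def hard_cut_em(X, y, C=2, n_iter=20, seed=0):
    """Toy hard-cut EM loop. M-step: re-estimate each component's Gaussian
    gate and output noise level independently from its assigned points.
    E-step: hard MAP re-assignment of every sample."""
    rng = np.random.default_rng(seed)
    N, d = X.shape
    z = rng.integers(0, C, size=N)          # random initial partition
    for _ in range(n_iter):
        # M-step (independent per component, as in the precise hard-cut EM)
        counts = np.array([(z == c).sum() for c in range(C)]).clip(min=1)
        pi = counts / counts.sum()
        mu = np.stack([X[z == c].mean(0) if (z == c).any() else X.mean(0)
                       for c in range(C)])
        var = np.stack([X[z == c].var(0) + 1e-6 if (z == c).any() else X.var(0) + 1e-6
                        for c in range(C)])
        sig2 = np.array([y[z == c].var() + 1e-6 if (z == c).any() else y.var() + 1e-6
                         for c in range(C)])
        # hard E-step: MAP assignment by gate density times output likelihood
        ll = np.stack([np.log(pi[c]) + log_gauss(X, mu[c], var[c])
                       - 0.5 * (np.log(2 * np.pi * sig2[c]) + y ** 2 / sig2[c])
                       for c in range(C)], axis=1)
        z_new = ll.argmax(axis=1)
        if np.array_equal(z_new, z):        # converged: partition is stable
            break
        z = z_new
    return z, pi, mu, var
```

In the actual algorithm, the per-component M-step maximizes the exact GP marginal likelihood over the kernel hyper-parameters, which is precisely what makes the hard-cut scheme efficient.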

After the convergence of the precise hard-cut EM algorithm, we obtain the estimates of all the parameters of the MGP. For a test input \( x^* \), we classify it into the z-th component of the MGP by the MAP criterion as follows:

$$ z = \mathop{\mathrm{argmax}}\limits_{c}\, p(z^* = c \mid x^*) = \mathop{\mathrm{argmax}}\limits_{c}\, \pi_c N(x^* \mid \mu_c, S_c) $$
(8)

Based on such a classification, we can predict the output of the test input via the corresponding GP using

$$ \hat{y}^* = K(x^*, X)\left[K(X, X) + \sigma^2 I\right]^{-1} y $$
(9)
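Equations (8) and (9) combine into a short prediction routine. The sketch below is illustrative only: the function name, the dictionary layout of a component, and the diagonal form of \( S_c \) are our own assumptions.

```python
import numpy as np

def mgp_predict(x_star, comps, kern):
    """Pick a component by the MAP rule of Eq. (8), then apply the GP
    predictive mean of Eq. (9) with that component's training data.
    Each element of comps is a dict with keys pi, mu, S (diagonal),
    X, y, sigma2."""
    # Eq. (8): z = argmax_c pi_c N(x* | mu_c, S_c), in the log domain for stability
    scores = [np.log(c["pi"])
              - 0.5 * np.sum(np.log(2 * np.pi * c["S"])
                             + (x_star - c["mu"]) ** 2 / c["S"])
              for c in comps]
    z = int(np.argmax(scores))
    Xc, yc, s2 = comps[z]["X"], comps[z]["y"], comps[z]["sigma2"]
    # Eq. (9): y* = K(x*, X)[K(X, X) + sigma^2 I]^{-1} y
    A = kern(Xc, Xc) + s2 * np.eye(len(Xc))
    k_star = kern(x_star[None, :], Xc)
    mean = (k_star @ np.linalg.solve(A, yc))[0]
    return float(mean), z
```

With a near-zero noise variance, the predictive mean interpolates the training outputs at the training inputs, which gives an easy sanity check.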

In the next section, the precise hard-cut EM algorithm for the MGP model will be used for the stock closing price prediction, and the obtained results will be compared with the classical regression models and algorithms.

3 Stock Price Prediction

3.1 The General Prediction Model

The time series can be denoted as \( \{s(t)\}_{t=1}^{\infty} \). For the time series prediction task, under certain conditions Takens' Theorem [11] ensures that for some embedding dimension \( d \in \mathbb{N}^+ \) and almost every time delay \( \tau \in \mathbb{N}^+ \), there is a smooth function \( f: \mathbb{R}^d \to \mathbb{R} \) such that \( s(t) = f[s(t - d\tau), \ldots, s(t - 2\tau), s(t - \tau)] \). Thus, a natural choice of the training dataset is \( \{x_t, y_t\}_{t=1}^{N} \), where \( x_t = [s(t - d\tau), \ldots, s(t - 2\tau), s(t - \tau)] \) and \( y_t = s(t) \); the test dataset \( \{x_t^*, y_t^*\}_{t=1}^{L} \) is constructed in the same way. In this way, the time series prediction task is transformed into a regression problem which aims at estimating and approximating the unknown function f.
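The construction of the \( (x_t, y_t) \) pairs by time-delay embedding can be sketched as follows (the function name is our own):

```python
import numpy as np

def embed_series(s, d, tau):
    """Turn a series {s(t)} into pairs
    x_t = [s(t - d*tau), ..., s(t - 2*tau), s(t - tau)],  y_t = s(t)."""
    s = np.asarray(s, dtype=float)
    X, y = [], []
    for t in range(d * tau, len(s)):
        X.append([s[t - k * tau] for k in range(d, 0, -1)])  # oldest lag first
        y.append(s[t])
    return np.array(X), np.array(y)
```

For example, with \( d = 3 \) and \( \tau = 2 \), the first usable target is \( s(6) \), with input \( [s(0), s(2), s(4)] \).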

We use the Shanghai Composite Index (stock code: 000001) and Donghua Energy (stock code: 002221) closing price datasets from 2011 to 2013, downloaded via the Dazhihui software, and generate the training and test datasets shown as the blue and red curves, respectively, in Fig. 1.

Fig. 1.

Shanghai and Donghua stock closing price curves from 2011 to 2013; the blue curve represents the 600 training data points and the red curve the 100 test data points. (a) Shanghai stock closing price curve. (b) Donghua stock closing price curve. (Color figure online)

For each combination of \( d = 1, 2, 3, 4 \) and \( \tau = 1, 2, 3, 4 \), we first generate 700 samples, each being a \( (d+1) \)-dimensional vector whose first d entries form the input and whose last entry is the output. Secondly, we normalize all the training and test outputs by \( y \to (y - m)/\sigma \), where m and \( \sigma \) denote the mean and the standard deviation of the training outputs, respectively. Finally, the 700 samples are divided into two parts: 600 training samples and 100 test samples.
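The split and normalization step can be sketched as follows (names are our own); note that only the training statistics are used for scaling, so no test information leaks into the model:

```python
import numpy as np

def split_and_normalize(X, y, n_train=600):
    """Split into train/test and normalize outputs with *training*
    statistics, y -> (y - m) / sigma; return a de-normalizer for
    mapping predictions back to the original scale."""
    X_tr, y_tr = X[:n_train], y[:n_train]
    X_te, y_te = X[n_train:], y[n_train:]
    m, s = y_tr.mean(), y_tr.std()
    norm = lambda v: (v - m) / s
    denorm = lambda v: v * s + m     # y_hat -> y_hat * sigma + m
    return X_tr, norm(y_tr), X_te, norm(y_te), denorm
```

The returned `denorm` implements the de-normalization \( \hat{y} \to \hat{y}\sigma + m \) applied to predictions later in Sect. 3.2.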

3.2 Prediction Results and Comparisons

We implement the precise hard-cut EM algorithm for MGPs (referred to as PreHard-cut) on the training dataset and verify its performance on the test dataset. Specifically, we run it on each of the 16 normalized training datasets, obtain the trained MGP model and make the prediction. We finally de-normalize the prediction by \( \hat{y} \to \hat{y}\sigma + m \). To compare its prediction performance, we also run the MGP model with two other EM algorithms and a classical regression model, as follows:

(1) The LOOCV hard-cut EM algorithm (referred to as LOOCV) proposed in [12] for MGPs, which approximates the posteriors and the Q-function via the leave-one-out cross-validation mechanism;

(2) The variational hard-cut EM algorithm (referred to as VarHard-cut) proposed in [13] for MGPs, which approximates the posteriors via variational inference;

(3) The Radial Basis Function neural network with Gaussian kernel function (referred to as RBF), a classical regression model which makes predictions by linear combinations of radial basis functions.

The prediction accuracy is evaluated by the root mean squared error (RMSE) of each experiment, which is mathematically defined as follows:

$$ \mathrm{RMSE} = \sqrt{\frac{1}{L}\sum_{t=1}^{L} (\hat{y}_t - y_t)^2}, $$
(10)

where \( y_t \) and \( \hat{y}_t \) denote the true output value of the t-th test sample and its predicted value, respectively. Meanwhile, we compare the efficiency of these algorithms by the total time consumed for both parameter learning and prediction; all the experiments were run in MATLAB R2014a on a machine with an Intel(R) Core(TM) i5 CPU and 16.00 GB of RAM.
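Equation (10) corresponds directly to a one-line computation (a trivial NumPy sketch):

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error of Eq. (10) over the L test samples."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_pred - y_true) ** 2)))
```

When the outputs have been normalized, the RMSE should be computed after de-normalizing the predictions so that it is reported on the original price scale.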

Before the parameter learning, some prior parameters have to be specified, including the number C of GP components for the MGP model, the number of pseudo inputs (PI) for the variational hard-cut EM algorithm and the number of neurons in the hidden layer (HL) for the RBF model. Unless otherwise stated, several typical values of these parameters were tested, and the values yielding the smallest prediction RMSEs are selected and reported.

The RMSEs as well as the best values of the predetermined parameters for each algorithm on each dataset are listed in Table 1. In terms of prediction accuracy, the precise hard-cut EM algorithm ranks first on the dataset with \( d = 3, \tau = 1 \), which demonstrates its advantage in Shanghai and Donghua stock closing price prediction. Its predictive results are also better than those obtained with the generalized RBF neural network in [14, 15]. The variational hard-cut EM algorithm for the MGP model is comparable with the precise hard-cut algorithm in accuracy, but the latter is more stable and uniformly optimal with \( d = 3, \tau = 1 \) on Shanghai and Donghua stock price prediction. The LOOCV hard-cut EM algorithm for the MGP model and the RBF model are not well suited to stock price prediction. Besides, Table 1 also shows a generally decreasing trend of the prediction RMSEs as the embedding dimension d increases, since a larger d means more information in the inputs.

Table 1. The RMSEs for Shanghai and Donghua stock closing price prediction.

Moreover, the proposed technique scales well; for stock price prediction, 600 days of closing price data are sufficient, since this time span already covers more than two years.

Figures 2 and 3 show the best forecasting results with the parameters \( d = 3 \) and \( \tau = 1 \), which intuitively demonstrate the validity of the predictions. On the Shanghai and Donghua test samples, the true and predicted values over the next 100 days agree closely, and the best prediction RMSEs are 21.0782 and 0.2183, respectively, shown in bold in Table 1. The true values of the test samples are in good agreement with the predicted values, and the corresponding prediction errors lie within an acceptable range, mainly within \( \pm 0.2 \) and \( \pm 0.4 \), respectively, as shown in Figs. 2 and 3.

Fig. 2.

(a). The prediction results of Shanghai stock closing price data; (b). The corresponding errors of Shanghai stock closing price data. (Color figure online)

Fig. 3.

(a). The prediction results of Donghua stock closing price data; (b). The corresponding errors of Donghua stock closing price data. (Color figure online)

The total time consumptions are shown in Table 2. We see that the precise hard-cut EM algorithm takes slightly longer. Nevertheless, no algorithm takes longer than 6 min, which is negligible for daily forecasting. Therefore, accuracy is the key factor in selecting the appropriate model and algorithm for stock price forecasting, and the precise hard-cut EM algorithm for the MGP model is an excellent choice.

Table 2. The time consumptions for Shanghai and Donghua stock closing price prediction.

The best predictive curve for each algorithm is shown in Fig. 4. The precise hard-cut EM algorithm and the variational hard-cut EM algorithm fit the true stock price extremely well except when the stock price reaches a peak or a trough, where the price turns dramatically. Even during these periods, however, the two predictive curves remain within a small and acceptable range around the true stock price. Besides, at some moments, the prediction of the precise hard-cut EM algorithm is closer to the true stock price than that of the variational hard-cut EM algorithm. The LOOCV hard-cut EM algorithm and the RBF model are not suitable for stock price forecasting.

Fig. 4.

(a) Comparison of the predictive curves of each algorithm for the Shanghai stock closing price on the 100 test days. (b) Comparison of the predictive curves of each algorithm for the Donghua stock closing price on the 100 test days. (Color figure online)

A remarkable observation from Figs. 2, 3 and 4 is that the predicted prices appear to be displaced by a roughly constant time lag. This is because the predicted price \( s(t) \) is based only on the d previous prices \( s(t - d\tau), \ldots, s(t - 2\tau), s(t - \tau) \).

To further explore how to improve the performance of the precise hard-cut EM algorithm, we plot the prediction RMSEs for \( d = 1, 2, 3, 4, 5 \) and \( \tau = 1, 2, 3, 4 \) in Fig. 5. It can be observed from Fig. 5 that the RMSE generally decreases as d increases and \( \tau \) decreases. When \( d \ge 3 \), the RMSE is considerably low and its variation with d and \( \tau \) is very small. Therefore, an appropriately large embedding dimension d ensures precise stock price forecasting.
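The grid evaluation over \( (d, \tau) \) can be organized as in the sketch below. To keep the sketch self-contained and runnable, the "model" here is the naive last-value predictor \( \hat{s}(t) = s(t - \tau) \); in the actual experiments, the trained MGP takes its place.

```python
import numpy as np

def embed(s, d, tau):
    # time-delay embedding: x_t = [s(t - d*tau), ..., s(t - tau)], y_t = s(t)
    T = range(d * tau, len(s))
    X = np.array([[s[t - k * tau] for k in range(d, 0, -1)] for t in T])
    return X, np.array([s[t] for t in T])

def grid_rmse(s, ds=(1, 2, 3, 4, 5), taus=(1, 2, 3, 4), n_train=600):
    """RMSE on the held-out tail of the series for every (d, tau) pair.
    The naive predictor X_te[:, -1] (i.e. s(t - tau)) stands in for the
    trained MGP purely to make this sketch executable."""
    out = {}
    for d in ds:
        for tau in taus:
            X, y = embed(np.asarray(s, dtype=float), d, tau)
            X_te, y_te = X[n_train:], y[n_train:]
            pred = X_te[:, -1]              # predict s(t) by s(t - tau)
            out[(d, tau)] = float(np.sqrt(np.mean((pred - y_te) ** 2)))
    return out
```

Plotting `out` as a surface over d and \( \tau \) reproduces the kind of comparison shown in Fig. 5.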

Fig. 5.

(a) The predictive RMSEs of the precise hard-cut EM algorithm for the Shanghai stock closing price on the 100 test days, for various values of d and \( \tau \). (b) The predictive RMSEs of the precise hard-cut EM algorithm for the Donghua stock closing price on the 100 test days, for various values of d and \( \tau \).

4 Conclusion

We have successfully applied the MGP model, learned via the precise hard-cut EM algorithm, to modeling and predicting stock price time series. The experimental results demonstrate that this MGP based method is valid, feasible and highly competitive in prediction accuracy with acceptable time consumption, and that it outperforms some typical regression models and algorithms.