Abstract
In this paper, the mixture of Gaussian processes (MGP) is applied to model and predict the time series of stock prices. Methodologically, the precise hard-cut expectation-maximization (EM) algorithm for MGPs is used to learn the parameters of the MGP model from stock price data. The experiments demonstrate that the MGP model with the precise hard-cut EM algorithm can be successfully applied to stock price prediction and outperforms typical regression models and algorithms.
1 Introduction
The stock market is characterized by high returns and high risk [1], so the analysis and forecasting of stock prices has long attracted attention. The complexity of the internal structure of the stock price system and the diversity of external factors (national policy, bank interest rates, price indices, the performance of listed companies and the psychology of investors) account for the complexity of the stock market and for the uncertainty and difficulty of the stock price forecasting task [2]. Because stock prices are collected in temporal order, they form a complex nonlinear time series [3]. Traditional stock market analysis methods, such as stock price chart analysis (e.g., the K-line chart [4]), cannot deeply reveal the intrinsic relationships in the data, so their predictions are often unsatisfactory. Stock price prediction methodologies fall into three broad categories: fundamental analysis, technical analysis (charting) and technological methods.
From a mathematical point of view, the key to effective stock price prediction is to discover the intrinsic mapping or function underlying the data, and to fit and approximate it. The mixture of Gaussian processes (MGP) model [5], which has developed rapidly, is a powerful tool for this problem. However, most MGP models are very complex and involve a large number of parameters and hyper-parameters, which makes them difficult to apply [6]. We therefore adopt the MGP model proposed in [7], which excludes unnecessary priors and carefully selects the model structure and gating function. This model retains the main structure, features and advantages of the original MGP model. Moreover, it can be effectively applied to the modeling and prediction of nonlinear time series via the precise hard-cut EM algorithm. In fact, the precise hard-cut EM algorithm is more efficient than the soft EM algorithm, since the hyper-parameters of each GP can be learned independently in the M-step. Experimental results have demonstrated that the precise hard-cut EM algorithm for the MGP model gives more precise predictions than several typical regression models and algorithms.
Along this direction, we apply the MGP model to short-term stock price forecasting via the precise hard-cut EM algorithm. The experimental results show that this MGP-based method can discover latent patterns in historical data, and that its forecasts are more stable and accurate.
The rest of this paper is organized as follows. In Sect. 2, we give a brief review of the MGP model and introduce the precise hard-cut EM algorithm. Section 3 presents the framework of stock price forecasting and the experimental results of the MGP based method as well as the comparisons of the regression models and algorithms. Finally, we give a brief conclusion in Sect. 4.
2 The Precise Hard-cut EM Algorithm for MGPs
2.1 The MGP Model
We consider the MGP model described in [7]. It can be viewed as a special mixture model in which each component is a GP. The whole set of indicators \( Z = [z_1, z_2, \ldots, z_N]^T \), the inputs X and the outputs Y are generated sequentially, and the MGP model is mathematically defined as follows:
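(The displayed model equations are not reproduced in this version of the text. Based on the parameter list below and the standard generative MGP formulation of [7], they can be reconstructed as follows; this is a reconstruction, not the authors' exact typesetting.)

```latex
P(z_n = c) = \pi_c, \quad c = 1, \ldots, C, \qquad
x_n \mid (z_n = c) \sim \mathcal{N}(\mu_c, S_c), \qquad
Y_c \mid X_c \sim \mathcal{N}\!\left(0,\; K(X_c, X_c) + \sigma_c^2 \mathbf{I}\right),
```

where \( (X_c, Y_c) \) collects the samples assigned to the c-th component.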
where \( K(x_i, x_j) = g^2 \exp\{-\frac{1}{2}(x_i - x_j)^T B (x_i - x_j)\} \) with \( B = \mathrm{diag}\{b_1^2, b_2^2, \ldots, b_d^2\} \), and Eq. (2) adopts the Gaussian inputs used in most generative MGP models [8–10]. Here \( \theta_c = \{\pi_c, \mu_c, S_c, g_c, b_{c,1}, b_{c,2}, \ldots, b_{c,d}, \sigma_c\} \) denotes the parameters of the c-th GP component, and \( \theta = \{\theta_c\}_{c=1}^{C} \) denotes all the parameters of the mixture model.
The generative structure of the MGP model is prominent and clear, and the model avoids complicated parameter settings. The Gaussian means \( \mu_c \) differ across GP components, so each component concentrates on a different region of the input space and the mixture can fit multimodal datasets.
2.2 The Precise Hard-cut EM Algorithm
To avoid the computational complexity of the Q function, it is reasonable to use the hard-cut version of the EM algorithm, which allows the parameters of the MGP model to be learned efficiently. In fact, the precise hard-cut EM algorithm [7] is a good choice, and its procedure is summarized as follows:
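To make the alternation concrete, the following is a heavily simplified sketch of the hard-cut EM loop: hard (MAP) assignments in the E-step and closed-form gating updates in the M-step. In the full precise algorithm, the M-step also learns each component's GP hyper-parameters by maximizing its marginal likelihood, and the E-step also uses the GP likelihood of the outputs; both are omitted here for brevity, so this is an illustrative skeleton rather than the authors' implementation.

```python
import numpy as np
from scipy.stats import multivariate_normal

def hard_cut_em(X, Y, C=2, n_iter=20, seed=0):
    """Simplified hard-cut EM sketch for an MGP (gating part only).
    Y would enter via each component's GP marginal likelihood in the
    full precise algorithm; it is unused in this sketch."""
    rng = np.random.default_rng(seed)
    N, d = X.shape
    z = rng.integers(0, C, size=N)            # random initial hard labels
    for _ in range(n_iter):
        # keep every component non-empty
        for c in range(C):
            if not np.any(z == c):
                z[rng.integers(0, N)] = c
        # M-step: maximum-likelihood gating parameters per component
        pi = np.array([(z == c).mean() for c in range(C)])
        mu = np.array([X[z == c].mean(axis=0) for c in range(C)])
        S = [np.cov(X[z == c].T, bias=True) + 1e-6 * np.eye(d)
             for c in range(C)]
        # E-step: reassign every sample to its MAP component
        logp = np.column_stack([
            np.log(pi[c]) + multivariate_normal.logpdf(X, mean=mu[c], cov=S[c])
            for c in range(C)])
        z = logp.argmax(axis=1)
    return z, pi, mu, S
```

The key efficiency property carries over from the full algorithm: given the hard assignments, each component's parameters can be updated independently.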
After the convergence of the precise hard-cut EM algorithm, we have obtained the estimates of all the parameters for the MGP. For a test input \( {\text{x}}^{ *} \), we can classify it into the z-th component of the MGP by the MAP criterion as follows:
Based on such a classification, we can predict the output of the test input via the corresponding GP using
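(The classification and prediction equations are likewise missing from this version of the text; under the model of Sect. 2.1 they take the standard forms below, given here as a reconstruction.)

```latex
\hat{z} = \arg\max_{c = 1, \ldots, C} \; \pi_c \, \mathcal{N}(x^* \mid \mu_c, S_c),
\qquad
\hat{y}^* = K(x^*, X_{\hat{z}}) \left[ K(X_{\hat{z}}, X_{\hat{z}}) + \sigma_{\hat{z}}^2 \mathbf{I} \right]^{-1} Y_{\hat{z}},
```

where \( (X_{\hat{z}}, Y_{\hat{z}}) \) are the training samples assigned to the selected component.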
In the next section, the precise hard-cut EM algorithm for the MGP model will be used for the stock closing price prediction, and the obtained results will be compared with the classical regression models and algorithms.
3 Stock Price Prediction
3.1 The General Prediction Model
A time series can be denoted as \( \{s(t)\}_{t=1}^{\infty} \). For the time series prediction task, Takens' theorem [11] ensures that, under certain conditions, for some embedding dimension \( d \in \mathbb{N}^+ \) and almost every time delay \( \tau \in \mathbb{N}^+ \), there is a smooth function \( f: \mathbb{R}^d \to \mathbb{R} \) such that \( s(t) = f[s(t - d\tau), \ldots, s(t - 2\tau), s(t - \tau)] \). Thus, a natural choice of training dataset is \( \{x_t, y_t\}_{t=1}^{N} \) with \( x_t = [s(t - d\tau), \ldots, s(t - 2\tau), s(t - \tau)] \) and \( y_t = s(t) \), and the test dataset \( \{x_t^*, y_t^*\}_{t=1}^{L} \) can be constructed in the same way. In this way, the time series prediction task is transformed into a regression problem that aims at estimating and approximating the unknown function f.
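The delay-embedding construction above can be sketched as follows (an illustrative helper, not the authors' code):

```python
import numpy as np

def delay_embed(s, d, tau):
    """Build a regression dataset from a scalar series, following Takens'
    embedding: x_t = [s(t - d*tau), ..., s(t - 2*tau), s(t - tau)], y_t = s(t)."""
    s = np.asarray(s, dtype=float)
    X, y = [], []
    for t in range(d * tau, len(s)):
        # lags ordered from oldest (k = d) to most recent (k = 1)
        X.append([s[t - k * tau] for k in range(d, 0, -1)])
        y.append(s[t])
    return np.array(X), np.array(y)
```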
We use the Shanghai Composite Index (stock code: 000001) and Donghua Energy (stock code: 002221) closing price datasets from 2011 to 2013, downloaded with the Dazhihui software, and generate training and test datasets, shown as the blue and red curves in Fig. 1, respectively.
For \( d = 1, 2, 3, 4 \) and \( \tau = 1, 2, 3, 4 \), we first generate 700 samples, each a \( (d+1) \)-dimensional vector whose first d entries form the input and whose last entry is the output. Second, we normalize all training and test outputs by \( y \to (y - m)/\sigma \), where m and \( \sigma \) denote the mean and the standard deviation of the training outputs, respectively. Finally, the 700 samples are divided into two parts: 600 training samples and 100 test samples.
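The split-then-normalize step can be sketched as follows; note that, as in the text, the statistics m and σ are computed from the training outputs only (illustrative helper, not the authors' code):

```python
import numpy as np

def normalize_split(y_all, n_train=600):
    """Split outputs into train/test and z-score both using the
    *training* statistics only, mirroring y -> (y - m) / sigma."""
    y_all = np.asarray(y_all, dtype=float)
    y_train, y_test = y_all[:n_train], y_all[n_train:]
    m, sigma = y_train.mean(), y_train.std()
    return (y_train - m) / sigma, (y_test - m) / sigma, m, sigma

def denormalize(y_hat, m, sigma):
    """Map predictions back to the price scale: y_hat -> y_hat * sigma + m."""
    return np.asarray(y_hat) * sigma + m
```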
3.2 Prediction Results and Comparisons
We implement the precise hard-cut EM algorithm for MGPs (referred to as PreHard-cut) on the training dataset and verify its performance on the test dataset. Specifically, we run it on each of the 16 normalized training datasets, obtain the trained MGP model and make predictions, finally de-normalizing each prediction by \( \hat{y} \to \hat{y}\sigma + m \). To assess its prediction performance, we compare it with the MGP model under other EM algorithms and with some typical regression models and algorithms:
(1) The LOOCV hard-cut EM algorithm (referred to as LOOCV) for MGPs, proposed in [12], which approximates the posteriors and the Q function via the leave-one-out cross-validation mechanism;

(2) The variational hard-cut EM algorithm (referred to as VarHard-cut) for MGPs, proposed in [13], which approximates the posteriors via variational inference;

(3) The radial basis function neural network with Gaussian kernel (referred to as RBF), a classical regression method that predicts by linear combinations of radial basis functions.
The prediction accuracy of each experiment is evaluated by the root mean squared error (RMSE), mathematically defined as

\( \text{RMSE} = \sqrt{\frac{1}{L}\sum_{t=1}^{L} (y_t - \hat{y}_t)^2} \)
where \( y_t \) and \( \hat{y}_t \) denote the true output of the t-th test sample and its predicted value, respectively. Meanwhile, we compare the efficiency of the algorithms by the total time consumed for parameter learning and prediction; all experiments are run in Matlab R2014a on an Intel(R) Core(TM) i5 CPU with 16.00 GB of RAM.
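For concreteness, the error measure above amounts to the following (illustrative helper, not the authors' code):

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error over the L test samples."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))
```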
Before parameter learning, some prior parameters have to be specified, including the number C of GP components for the MGP model, the number of pseudo-inputs (PI) for the variational hard-cut EM algorithm and the number of neurons in the hidden layer (HL) for the RBF model. Unless otherwise noted, several typical values of these parameters were tested, and the values yielding the lowest prediction RMSEs are reported.
The RMSEs, together with the best values of the predetermined parameters for each algorithm on each dataset, are listed in Table 1. In terms of prediction accuracy, the precise hard-cut EM algorithm ranks first on the dataset with \( d = 3, \tau = 1 \), which demonstrates its advantage in predicting the Shanghai and Donghua closing prices. Its predictions are also better than those of the generalized RBF neural networks in [14, 15]. The variational hard-cut EM algorithm for the MGP model is comparable to the precise hard-cut algorithm in accuracy, but the latter is more stable and uniformly optimal with \( d = 3, \tau = 1 \) on both datasets. The LOOCV hard-cut EM algorithm for the MGP model and the RBF model are not competitive for stock price prediction. Table 1 also shows a generally decreasing trend of the prediction RMSE with the embedding dimension d, since a larger d puts more information into the inputs.
Moreover, the proposed technique scales well; for stock price prediction, however, 600 days of closing prices are sufficient, since this time span already covers about two years of trading.
Figures 2 and 3 show the best forecasting results, obtained with \( d = 3 \) and \( \tau = 1 \), which intuitively confirm the validity of the predictions. On the Shanghai and Donghua test samples, the real and predicted values over the next 100 days agree closely, and the best prediction RMSEs are 21.0782 and 0.2183, respectively, shown in bold in Table 1. The true values of the test samples are in good agreement with the predicted values, and the corresponding prediction errors lie within an acceptable range, mainly within ±0.2 and ±0.4, respectively, as shown in Figs. 2 and 3.
The total time consumptions are shown in Table 2. The precise hard-cut EM algorithm takes slightly longer than the others. Nevertheless, no algorithm takes longer than 6 min, a cost that is negligible for daily forecasting. Accuracy is therefore the key factor in selecting a model and algorithm for stock price forecasting, and the precise hard-cut EM algorithm for the MGP model is an excellent choice.
The best predictive curve of each algorithm is shown in Fig. 4. The precise hard-cut EM algorithm and the variational hard-cut EM algorithm fit the true stock price extremely well, except when the price reaches a peak or a trough and turns sharply. Even during these periods, however, the two predictive curves remain within a small and acceptable range around the true price. Moreover, at some moments the prediction of the precise hard-cut EM algorithm is closer to the true price than that of the variational hard-cut EM algorithm. The LOOCV hard-cut EM algorithm and the RBF model are not suitable for stock price forecasting.
A notable observation from Figs. 2, 3 and 4 is that the predicted prices appear to lag the true prices by a roughly constant delay. This is because the predicted price \( s(t) \) is based only on the d previous prices \( s(t - d\tau), \ldots, s(t - 2\tau), s(t - \tau) \).
To further explore how to improve the performance of the precise hard-cut EM algorithm, we plot the prediction RMSEs for \( d = 1, 2, 3, 4, 5 \) and \( \tau = 1, 2, 3, 4 \) in Fig. 5. The RMSE generally decreases as d increases and \( \tau \) decreases. When \( d \ge 3 \), the RMSE is considerably low and varies only slightly with d and \( \tau \). Therefore, an appropriately large embedding dimension d ensures precise stock price forecasting.
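The sweep over (d, τ) described above can be sketched as follows. The persistence predictor used here (ŷ = s(t − τ)) is a hypothetical stand-in for the trained MGP, included only to show the evaluation loop; `embed` and `sweep` are illustrative names, not the authors' code:

```python
import numpy as np

def embed(s, d, tau):
    # delay embedding: rows are [s(t - d*tau), ..., s(t - tau)], targets s(t)
    X = [[s[t - k * tau] for k in range(d, 0, -1)] for t in range(d * tau, len(s))]
    return np.array(X, dtype=float), np.array(s[d * tau:], dtype=float)

def sweep(s, ds=(1, 2, 3, 4), taus=(1, 2, 3, 4), n_test=100):
    """Evaluate a naive persistence predictor on the last n_test points
    for every (d, tau) pair and return the best pair plus all RMSEs."""
    results = {}
    for d in ds:
        for tau in taus:
            X, y = embed(list(s), d, tau)
            y_hat = X[-n_test:, -1]            # last lag = s(t - tau)
            err = y[-n_test:] - y_hat
            results[(d, tau)] = float(np.sqrt(np.mean(err ** 2)))
    return min(results, key=results.get), results
```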
4 Conclusion
We have successfully applied the MGP model, via the precise hard-cut EM algorithm, to modeling and predicting stock price time series. The experimental results demonstrate that this MGP-based method is valid, feasible and highly competitive in prediction accuracy with acceptable time consumption, and that it outperforms several typical regression models and algorithms.
References
Sun, W., Guo, J., Xia, B.: Discussion about stock prediction theory based on RBF neural network. Heilongjiang Sci. Technol. Inf. (22), 130 (2010)
Fu, C., Fu, M., Que, J.: Prediction of stock price base on radial basic function neural networks. Technol. Dev. Enterp. 23(4), 14–15, 38 (2004)
Liu, H., Bai, Y.: Analysis of AR model and neural network for forecasting stock price. Math. Pract. Theory 41(4), 14–19 (2011)
Zhu, Y.: Research on Stock Prediction Methods. Master Thesis, Northwestern Polytechnical University, Xi'an (2006)
Tresp, V.: Mixtures of Gaussian processes. In: Advances in Neural Information Processing Systems, vol. 13, pp. 654–660 (2000)
Meeds, E., Osindero, S.: An alternative infinite mixture of Gaussian process experts. In: Advances in Neural Information Processing Systems, vol. 18, pp. 883–890 (2006)
Chen, Z., Ma, J., Zhou, Y.: A precise hard-cut EM algorithm for mixtures of Gaussian processes. In: Huang, D.-S., Jo, K.-H., Wang, L. (eds.) ICIC 2014. LNCS, vol. 8589, pp. 68–75. Springer, Heidelberg (2014)
Joseph, J., Doshi-Velez, F., Huang, A., Roy, N.: A Bayesian nonparametric approach to modeling motion patterns. Auton. Robots 31(4), 383–400 (2011)
Sun, S.: Infinite mixtures of multivariate Gaussian processes. In: International Conference on IEEE Machine Learning and Cybernetics (ICMLC), vol. 3, pp. 1011–1016 (2013)
Tayal, A., Poupart, P., Li, Y.: Hierarchical double Dirichlet process mixture of Gaussian processes. In: AAAI, pp. 1126–1133 (2012)
Zhou, Y., Zhang, T., Sun, J.: Multi-scale Gaussian processes: a novel model for chaotic time series prediction. Chin. Phys. Lett. 24(1), 42–45 (2007)
Yang, Y., Ma, J.: An efficient EM approach to parameter learning of the mixture of Gaussian processes. In: Liu, D., Zhang, H., Polycarpou, M., Alippi, C., He, H. (eds.) ISNN 2011, Part II. LNCS, vol. 6676, pp. 165–174. Springer, Heidelberg (2011)
Nguyen, T., Bonilla, E.: Fast allocation of Gaussian process experts. In: Proceedings of the 31st International Conference on Machine Learning, pp. 145–153 (2014)
Liu, S., Ma, J.: The application of diagonal generalized RBF neural network to stock price prediction. Highlights Sciencepaper Online 7(13), 1296–1306 (2014)
Zheng, W., Ma, J.: Diagonal log-normal generalized RBF neural network for stock price prediction. In: Zeng, Z., Li, Y., King, I. (eds.) ISNN 2014. LNCS, vol. 8866, pp. 576–583. Springer, Heidelberg (2014)
Acknowledgement
This work was supported by the Natural Science Foundation of China under Grant 61171138.
© 2016 Springer International Publishing Switzerland
Liu, S., Ma, J. (2016). Stock Price Prediction Through the Mixture of Gaussian Processes via the Precise Hard-cut EM Algorithm. In: Huang, DS., Han, K., Hussain, A. (eds) Intelligent Computing Methodologies. ICIC 2016. Lecture Notes in Computer Science(), vol 9773. Springer, Cham. https://doi.org/10.1007/978-3-319-42297-8_27