Abstract
In this paper, the mixture of Gaussian processes (MGP) is applied to model and predict the time series of stock prices. Methodologically, the precise hard-cut expectation-maximization (EM) algorithm for MGPs is used to learn the parameters of the MGP model from stock price data. The experiments demonstrate that the MGP model with the precise hard-cut EM algorithm can be successfully applied to stock price prediction and outperforms typical regression models and algorithms.
1 Introduction
The stock market is characterized by high returns and high risk [1], so the analysis and forecasting of stock prices has long attracted attention. The complexity of the internal structure of the stock price system and the diversity of external factors (national policy, bank interest rates, price indices, the performance of listed companies and the psychology of investors) account for the complexity of the stock market and for the uncertainty and difficulty of the stock price forecasting task [2]. Because stock prices are collected in temporal order, they form a complex nonlinear time series [3]. Traditional stock market analysis methods, such as stock price chart analysis (e.g., the K-line chart [4]), cannot deeply reveal the intrinsic relationships in the data, so their predictions are often unsatisfactory. Stock price prediction methodologies fall into three broad categories: fundamental analysis, technical analysis (charting) and technological methods.
From a mathematical point of view, the key to effective stock price prediction is to discover the intrinsic mapping or function underlying the data, and to fit and approximate it. The mixture of Gaussian processes (MGP) model [5], which has developed rapidly, is a powerful tool for this problem. However, most MGP models are very complex and involve a large number of parameters and hyper-parameters, which makes them difficult to apply [6]. We therefore adopt the MGP model proposed in [7], which excludes unnecessary priors and carefully selects the model structure and gating function. This model retains the main structure, features and advantages of the original MGP model. Moreover, it can be effectively applied to the modeling and prediction of nonlinear time series via the precise hard-cut EM algorithm. In fact, the precise hard-cut EM algorithm is more efficient than the soft EM algorithm, since the hyper-parameters of each GP can be learned independently in the M-step. Experimental results have demonstrated that the precise hard-cut EM algorithm for the MGP model gives more precise predictions than several typical regression models and algorithms.
Along this direction, we apply the MGP model to short-term stock price forecasting via the precise hard-cut EM algorithm. The experimental results show that this MGP-based method can discover latent patterns in historical data, and that its forecasts are more stable and accurate.
The rest of this paper is organized as follows. In Sect. 2, we give a brief review of the MGP model and introduce the precise hard-cut EM algorithm. Section 3 presents the framework of stock price forecasting and the experimental results of the MGP based method as well as the comparisons of the regression models and algorithms. Finally, we give a brief conclusion in Sect. 4.
2 The Precise Hard-cut EM Algorithm for MGPs
2.1 The MGP Model
We consider the MGP model described in [7]. It can be viewed as a special mixture model in which each component is a GP. The whole set of indicators \( Z = [z_1, z_2, \ldots, z_N]^T \), the inputs X and the outputs Y are generated sequentially, and the MGP model is mathematically defined as follows:
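(The displayed model equations are not reproduced in this version of the text. Based on the parameter list below and the standard generative MGP formulation of [7], they can be reconstructed as follows; this is a reconstruction, not the authors' exact typesetting.)

```latex
P(z_n = c) = \pi_c, \quad c = 1, \ldots, C, \qquad
x_n \mid (z_n = c) \sim \mathcal{N}(\mu_c, S_c), \qquad
Y_c \mid X_c \sim \mathcal{N}\!\left(0,\; K(X_c, X_c) + \sigma_c^2 \mathbf{I}\right),
```

where \( (X_c, Y_c) \) collects the samples assigned to the c-th component.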
where \( K(x_i, x_j) = g^2 \exp\{-\frac{1}{2}(x_i - x_j)^T B (x_i - x_j)\} \) with \( B = \mathrm{diag}\{b_1^2, b_2^2, \ldots, b_d^2\} \), and Eq. (2) adopts the Gaussian inputs used in most generative MGP models [8–10]. Here \( \theta_c = \{\pi_c, \mu_c, S_c, g_c, b_{c,1}, b_{c,2}, \ldots, b_{c,d}, \sigma_c\} \) denotes the parameters of the c-th GP component, and \( \theta = \{\theta_c\}_{c=1}^{C} \) denotes all the parameters of the mixture model.
The generative structure of the MGP model is prominent and clear, and the model avoids complicated parameter settings. The Gaussian means \( \mu_c \) differ across GP components, so each component concentrates on a different region of the input space and the mixture can fit multimodal datasets.
2.2 The Precise Hard-cut EM Algorithm
To avoid the computational complexity of the Q function, it is reasonable to use the hard-cut version of the EM algorithm, which allows the parameters of the MGP model to be learned efficiently. In fact, the precise hard-cut EM algorithm [7] is a good choice, and its procedure is summarized as follows:
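To make the alternation concrete, the following is a heavily simplified sketch of the hard-cut EM loop: hard (MAP) assignments in the E-step and closed-form gating updates in the M-step. In the full precise algorithm, the M-step also learns each component's GP hyper-parameters by maximizing its marginal likelihood, and the E-step also uses the GP likelihood of the outputs; both are omitted here for brevity, so this is an illustrative skeleton rather than the authors' implementation.

```python
import numpy as np
from scipy.stats import multivariate_normal

def hard_cut_em(X, Y, C=2, n_iter=20, seed=0):
    """Simplified hard-cut EM sketch for an MGP (gating part only).
    Y would enter via each component's GP marginal likelihood in the
    full precise algorithm; it is unused in this sketch."""
    rng = np.random.default_rng(seed)
    N, d = X.shape
    z = rng.integers(0, C, size=N)            # random initial hard labels
    for _ in range(n_iter):
        # keep every component non-empty
        for c in range(C):
            if not np.any(z == c):
                z[rng.integers(0, N)] = c
        # M-step: maximum-likelihood gating parameters per component
        pi = np.array([(z == c).mean() for c in range(C)])
        mu = np.array([X[z == c].mean(axis=0) for c in range(C)])
        S = [np.cov(X[z == c].T, bias=True) + 1e-6 * np.eye(d)
             for c in range(C)]
        # E-step: reassign every sample to its MAP component
        logp = np.column_stack([
            np.log(pi[c]) + multivariate_normal.logpdf(X, mean=mu[c], cov=S[c])
            for c in range(C)])
        z = logp.argmax(axis=1)
    return z, pi, mu, S
```

The key efficiency property carries over from the full algorithm: given the hard assignments, each component's parameters can be updated independently.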
After the convergence of the precise hard-cut EM algorithm, we have obtained the estimates of all the parameters for the MGP. For a test input \( {\text{x}}^{ *} \), we can classify it into the z-th component of the MGP by the MAP criterion as follows:
Based on such a classification, we can predict the output of the test input via the corresponding GP using
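(The classification and prediction equations are likewise missing from this version of the text; under the model of Sect. 2.1 they take the standard forms below, given here as a reconstruction.)

```latex
\hat{z} = \arg\max_{c = 1, \ldots, C} \; \pi_c \, \mathcal{N}(x^* \mid \mu_c, S_c),
\qquad
\hat{y}^* = K(x^*, X_{\hat{z}}) \left[ K(X_{\hat{z}}, X_{\hat{z}}) + \sigma_{\hat{z}}^2 \mathbf{I} \right]^{-1} Y_{\hat{z}},
```

where \( (X_{\hat{z}}, Y_{\hat{z}}) \) are the training samples assigned to the selected component.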
In the next section, the precise hard-cut EM algorithm for the MGP model will be used for the stock closing price prediction, and the obtained results will be compared with the classical regression models and algorithms.
3 Stock Price Prediction
3.1 The General Prediction Model
A time series can be denoted as \( \{s(t)\}_{t=1}^{\infty} \). For the time series prediction task, Takens' theorem [11] ensures that, under certain conditions, for some embedding dimension \( d \in \mathbb{N}^+ \) and almost every time delay \( \tau \in \mathbb{N}^+ \), there is a smooth function \( f: \mathbb{R}^d \to \mathbb{R} \) such that \( s(t) = f[s(t - d\tau), \ldots, s(t - 2\tau), s(t - \tau)] \). Thus, a natural choice of training dataset is \( \{x_t, y_t\}_{t=1}^{N} \) with \( x_t = [s(t - d\tau), \ldots, s(t - 2\tau), s(t - \tau)] \) and \( y_t = s(t) \), and the test dataset \( \{x_t^*, y_t^*\}_{t=1}^{L} \) can be constructed in the same way. In this way, the time series prediction task is transformed into a regression problem that aims at estimating and approximating the unknown function f.
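The delay-embedding construction above can be sketched as follows (an illustrative helper, not the authors' code):

```python
import numpy as np

def delay_embed(s, d, tau):
    """Build a regression dataset from a scalar series, following Takens'
    embedding: x_t = [s(t - d*tau), ..., s(t - 2*tau), s(t - tau)], y_t = s(t)."""
    s = np.asarray(s, dtype=float)
    X, y = [], []
    for t in range(d * tau, len(s)):
        # lags ordered from oldest (k = d) to most recent (k = 1)
        X.append([s[t - k * tau] for k in range(d, 0, -1)])
        y.append(s[t])
    return np.array(X), np.array(y)
```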
We use the Shanghai Composite Index (stock code: 000001) and Donghua Energy (stock code: 002221) closing price datasets from 2011 to 2013, downloaded with the Dazhihui software, and generate training and test datasets, shown as the blue and red curves in Fig. 1, respectively.
For \( d = 1, 2, 3, 4 \) and \( \tau = 1, 2, 3, 4 \), we first generate 700 samples, each a \( (d+1) \)-dimensional vector whose first d entries form the input and whose last entry is the output. Second, we normalize all training and test outputs by \( y \to (y - m)/\sigma \), where m and \( \sigma \) denote the mean and the standard deviation of the training outputs, respectively. Finally, the 700 samples are divided into two parts: 600 training samples and 100 test samples.
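The split-then-normalize step can be sketched as follows; note that, as in the text, the statistics m and σ are computed from the training outputs only (illustrative helper, not the authors' code):

```python
import numpy as np

def normalize_split(y_all, n_train=600):
    """Split outputs into train/test and z-score both using the
    *training* statistics only, mirroring y -> (y - m) / sigma."""
    y_all = np.asarray(y_all, dtype=float)
    y_train, y_test = y_all[:n_train], y_all[n_train:]
    m, sigma = y_train.mean(), y_train.std()
    return (y_train - m) / sigma, (y_test - m) / sigma, m, sigma

def denormalize(y_hat, m, sigma):
    """Map predictions back to the price scale: y_hat -> y_hat * sigma + m."""
    return np.asarray(y_hat) * sigma + m
```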
3.2 Prediction Results and Comparisons
We implement the precise hard-cut EM algorithm for MGPs (referred to as PreHard-cut) on the training dataset and verify its performance on the test dataset. Specifically, we run it on each of the 16 normalized training datasets, obtain the trained MGP model and make predictions, finally de-normalizing each prediction by \( \hat{y} \to \hat{y}\sigma + m \). To assess its prediction performance, we compare it with the MGP model under other EM algorithms and with some typical regression models and algorithms:
(1) The LOOCV hard-cut EM algorithm (referred to as LOOCV) for MGPs, proposed in [12], which approximates the posteriors and the Q function via the leave-one-out cross-validation mechanism;

(2) The variational hard-cut EM algorithm (referred to as VarHard-cut) for MGPs, proposed in [13], which approximates the posteriors via variational inference;

(3) The radial basis function neural network with Gaussian kernel (referred to as RBF), a classical regression method that predicts by linear combinations of radial basis functions.
The prediction accuracy of each experiment is evaluated by the root mean squared error (RMSE), mathematically defined as

\( \text{RMSE} = \sqrt{\frac{1}{L}\sum_{t=1}^{L} (y_t - \hat{y}_t)^2} \)
where \( y_t \) and \( \hat{y}_t \) denote the true output of the t-th test sample and its predicted value, respectively. Meanwhile, we compare the efficiency of the algorithms by the total time consumed for parameter learning and prediction; all experiments are run in Matlab R2014a on an Intel(R) Core(TM) i5 CPU with 16.00 GB of RAM.
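For concreteness, the error measure above amounts to the following (illustrative helper, not the authors' code):

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error over the L test samples."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))
```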
Before parameter learning, some prior parameters have to be specified, including the number C of GP components for the MGP model, the number of pseudo-inputs (PI) for the variational hard-cut EM algorithm and the number of neurons in the hidden layer (HL) for the RBF model. Unless otherwise noted, several typical values of these parameters were tested, and the values yielding the lowest prediction RMSEs are reported.
The RMSEs, together with the best values of the predetermined parameters for each algorithm on each dataset, are listed in Table 1. In terms of prediction accuracy, the precise hard-cut EM algorithm ranks first on the dataset with \( d = 3, \tau = 1 \), which demonstrates its advantage in predicting the Shanghai and Donghua closing prices. Its predictions are also better than those of the generalized RBF neural networks in [14, 15]. The variational hard-cut EM algorithm for the MGP model is comparable to the precise hard-cut algorithm in accuracy, but the latter is more stable and uniformly optimal with \( d = 3, \tau = 1 \) on both datasets. The LOOCV hard-cut EM algorithm for the MGP model and the RBF model are not competitive for stock price prediction. Table 1 also shows a generally decreasing trend of the prediction RMSE with the embedding dimension d, since a larger d puts more information into the inputs.
Moreover, the proposed technique scales well; for stock price prediction, however, 600 days of closing prices are sufficient, since this time span already covers about two years of trading.
Figures 2 and 3 show the best forecasting results, obtained with \( d = 3 \) and \( \tau = 1 \), which intuitively confirm the validity of the predictions. On the Shanghai and Donghua test samples, the real and predicted values over the next 100 days agree closely, and the best prediction RMSEs are 21.0782 and 0.2183, respectively, shown in bold in Table 1. The true values of the test samples are in good agreement with the predicted values, and the corresponding prediction errors lie within an acceptable range, mainly within ±0.2 and ±0.4, respectively, as shown in Figs. 2 and 3.
The total time consumptions are shown in Table 2. The precise hard-cut EM algorithm takes slightly longer than the others. Nevertheless, no algorithm takes longer than 6 min, a cost that is negligible for daily forecasting. Accuracy is therefore the key factor in selecting a model and algorithm for stock price forecasting, and the precise hard-cut EM algorithm for the MGP model is an excellent choice.
The best predictive curve of each algorithm is shown in Fig. 4. The precise hard-cut EM algorithm and the variational hard-cut EM algorithm fit the true stock price extremely well, except when the price reaches a peak or a trough and turns sharply. Even during these periods, however, the two predictive curves remain within a small and acceptable range around the true price. Moreover, at some moments the prediction of the precise hard-cut EM algorithm is closer to the true price than that of the variational hard-cut EM algorithm. The LOOCV hard-cut EM algorithm and the RBF model are not suitable for stock price forecasting.
A notable observation from Figs. 2, 3 and 4 is that the predicted prices appear to lag the true prices by a roughly constant delay. This is because the predicted price \( s(t) \) is based only on the d previous prices \( s(t - d\tau), \ldots, s(t - 2\tau), s(t - \tau) \).
To further explore how to improve the performance of the precise hard-cut EM algorithm, we plot the prediction RMSEs for \( d = 1, 2, 3, 4, 5 \) and \( \tau = 1, 2, 3, 4 \) in Fig. 5. The RMSE generally decreases as d increases and \( \tau \) decreases. When \( d \ge 3 \), the RMSE is considerably low and varies only slightly with d and \( \tau \). Therefore, an appropriately large embedding dimension d ensures precise stock price forecasting.
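The sweep over (d, τ) described above can be sketched as follows. The persistence predictor used here (ŷ = s(t − τ)) is a hypothetical stand-in for the trained MGP, included only to show the evaluation loop; `embed` and `sweep` are illustrative names, not the authors' code:

```python
import numpy as np

def embed(s, d, tau):
    # delay embedding: rows are [s(t - d*tau), ..., s(t - tau)], targets s(t)
    X = [[s[t - k * tau] for k in range(d, 0, -1)] for t in range(d * tau, len(s))]
    return np.array(X, dtype=float), np.array(s[d * tau:], dtype=float)

def sweep(s, ds=(1, 2, 3, 4), taus=(1, 2, 3, 4), n_test=100):
    """Evaluate a naive persistence predictor on the last n_test points
    for every (d, tau) pair and return the best pair plus all RMSEs."""
    results = {}
    for d in ds:
        for tau in taus:
            X, y = embed(list(s), d, tau)
            y_hat = X[-n_test:, -1]            # last lag = s(t - tau)
            err = y[-n_test:] - y_hat
            results[(d, tau)] = float(np.sqrt(np.mean(err ** 2)))
    return min(results, key=results.get), results
```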
4 Conclusion
We have successfully applied the MGP model, via the precise hard-cut EM algorithm, to modeling and predicting stock price time series. The experimental results demonstrate that this MGP-based method is valid, feasible and highly competitive in prediction accuracy with acceptable time consumption, and that it outperforms several typical regression models and algorithms.
References
Sun, W., Guo, J., Xia, B.: Discussion about stock prediction theory based on RBF neural network. Heilongjiang Sci. Technol. Inf. (22), 130 (2010)
Fu, C., Fu, M., Que, J.: Prediction of stock price base on radial basic function neural networks. Technol. Dev. Enterp. 23(4), 14–15, 38 (2004)
Liu, H., Bai, Y.: Analysis of AR model and neural network for forecasting stock price. Math. Pract. Theory 41(4), 14–19 (2011)
Zhu, Y.: Research on Stock Prediction Methods. Master Thesis, Northwestern Polytechnical University, Xi'an (2006)
Tresp, V.: Mixtures of Gaussian processes. In: Advances in Neural Information Processing Systems, vol. 13, pp. 654–660 (2000)
Meeds, E., Osindero, S.: An alternative infinite mixture of Gaussian process experts. In: Advances in Neural Information Processing Systems, vol. 18, pp. 883–890 (2006)
Chen, Z., Ma, J., Zhou, Y.: A precise hard-cut EM algorithm for mixtures of Gaussian processes. In: Huang, D.-S., Jo, K.-H., Wang, L. (eds.) ICIC 2014. LNCS, vol. 8589, pp. 68–75. Springer, Heidelberg (2014)
Joseph, J., Doshi-Velez, F., Huang, A., Roy, N.: A Bayesian nonparametric approach to modeling motion patterns. Auton. Robots 31(4), 383–400 (2011)
Sun, S.: Infinite mixtures of multivariate Gaussian processes. In: International Conference on IEEE Machine Learning and Cybernetics (ICMLC), vol. 3, pp. 1011–1016 (2013)
Tayal, A., Poupart, P., Li, Y.: Hierarchical double Dirichlet process mixture of Gaussian processes. In: AAAI, pp. 1126–1133 (2012)
Zhou, Y., Zhang, T., Sun, J.: Multi-scale Gaussian processes: a novel model for chaotic time series prediction. Chin. Phys. Lett. 24(1), 42–45 (2007)
Yang, Y., Ma, J.: An efficient EM approach to parameter learning of the mixture of Gaussian processes. In: Liu, D., Zhang, H., Polycarpou, M., Alippi, C., He, H. (eds.) ISNN 2011, Part II. LNCS, vol. 6676, pp. 165–174. Springer, Heidelberg (2011)
Nguyen, T., Bonilla, E.: Fast allocation of Gaussian process experts. In: Proceedings of the 31st International Conference on Machine Learning, pp. 145–153 (2014)
Liu, S., Ma, J.: The application of diagonal generalized RBF neural network to stock price prediction. Highlights Sciencepaper Online 7(13), 1296–1306 (2014)
Zheng, W., Ma, J.: Diagonal log-normal generalized RBF neural network for stock price prediction. In: Zeng, Z., Li, Y., King, I. (eds.) ISNN 2014. LNCS, vol. 8866, pp. 576–583. Springer, Heidelberg (2014)
Acknowledgement
This work was supported by the Natural Science Foundation of China under Grant 61171138.
© 2016 Springer International Publishing Switzerland
Liu, S., Ma, J. (2016). Stock Price Prediction Through the Mixture of Gaussian Processes via the Precise Hard-cut EM Algorithm. In: Huang, DS., Han, K., Hussain, A. (eds) Intelligent Computing Methodologies. ICIC 2016. Lecture Notes in Computer Science(), vol 9773. Springer, Cham. https://doi.org/10.1007/978-3-319-42297-8_27