1 Introduction

The non-linear and non-stationary behavior of financial time series makes them difficult to predict. Several statistical and computational intelligence techniques have been proposed, and [1] and [2] compare and analyze their role in stock price prediction. Statistical models such as AutoRegressive (AR), Moving Average (MA), AutoRegressive Moving Average (ARMA) and AutoRegressive Integrated Moving Average (ARIMA) models assume that a financial time series is stationary (for ARIMA, after differencing) and follows a normal distribution, which is rarely the case in the real world. To overcome this, non-linear models such as the Autoregressive Conditional Heteroskedastic (ARCH) model, the Generalized Autoregressive Conditional Heteroskedastic (GARCH) model and their extensions have been proposed. Although they model stock return series that alternate between low and high variability and handle some non-linearities, they do not completely capture highly irregular phenomena in financial markets [16].

In the past decade, artificial intelligence techniques have captured the attention of researchers from various domains such as Computer Science, Operations Research, Statistics and Finance. Their ability to model non-stationary and non-linear data has led to the wide adoption of techniques including Artificial Neural Network (ANN) and Support Vector Regression (SVR). These models have limitations of their own: both can overfit, and ANN training in particular can get trapped in local optima.

There are broadly two ways of improving the accuracy of forecasts: (1) data preprocessing, and (2) improvement in algorithm. Data preprocessing is part of data mining where the data is transformed into a format that reveals certain characteristics of data. Decomposition of time series is one such technique, where the time series is deconstructed into several components. There are two types of decomposition models: (i) classical decomposition, and (ii) non-classical decomposition models.

In the classical decomposition model, the time series is separated into trend, seasonal and error components. This model works best with linear data. Moreover, the error components are ignored when predicting future values, which leads to information loss and thus reduces forecast accuracy [21].

Empirical Mode Decomposition (EMD) and Discrete Wavelet Transform (DWT) fall under the category of non-classical decomposition. Both EMD and DWT are signal processing techniques that decompose time series in time domain and time-frequency domain, respectively [11, 12, 14, 15].

Proposed by [10] as part of the Hilbert-Huang Transform (HHT), EMD decomposes a non-linear and non-stationary time series into a set of adaptive basis functions called Intrinsic Mode Functions (IMFs). Unlike DWT, it is a non-parametric technique that does not require prior information on the scale or the number of decomposition levels. Further, it does not suffer from leakage between levels [6].

Feature selection is one of the important components of any model building process. Here, the IMFs obtained using EMD are the features supplied to the machine learning models. There are two approaches to feature selection: (i) the scale-based approach, and (ii) the feature vector based approach. In the scale-based approach, the IMFs are predicted independently and the predictions are then recombined to obtain the final forecast. The feature vector based approach instead treats the values of all IMFs at a particular time point as one feature vector from which the predicted value is obtained [17].

The paper focuses on the scale-based approach and integrates the advantages of both ANN and EMD to obtain 1-period ahead predictions of the weekly Nifty stock index.

Organization of this paper is as follows: Sect. 2 presents the hybrid EMD-ANN framework. Section 3 discusses prediction of stock index followed by results and discussion in Sect. 4. Section 5 concludes the paper.

2 Hybrid EMD-ANN Framework

The steps of hybrid EMD-ANN model are listed below:

  1. Decompose the original series using EMD into a set of sub-series.

  2. Using ANN, predict each sub-series independently.

  3. Recombine the predicted sub-series to obtain the aggregated time series.

  4. Calculate the error measures using the aggregated series and the original series.

Figure 1 shows the flow chart of the above described procedure.

Fig. 1. Flow chart of the hybrid approach
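The four steps can be summarized as a short Python sketch. It is only an outline under stated assumptions: `hybrid_forecast`, `toy_decompose` and `persistence` are hypothetical names, and the toy decomposition and last-value predictor merely stand in for EMD and the trained neural networks.

```python
def hybrid_forecast(series, decompose, forecast_one_step):
    """Decompose the series, forecast each sub-series, and sum the forecasts."""
    sub_series = decompose(series)                            # step 1
    forecasts = [forecast_one_step(s) for s in sub_series]    # step 2
    return sum(forecasts)                                     # step 3


def toy_decompose(series):
    """Placeholder for EMD: two additive halves that sum to the input."""
    half = [0.5 * v for v in series]
    return [half, list(half)]


def persistence(sub_series):
    """Placeholder for a trained ANN: repeat the last observed value."""
    return sub_series[-1]


next_value = hybrid_forecast([1.0, 2.0, 4.0], toy_decompose, persistence)
```

Step 4 then compares such forecasts with the held-out observations using the error measures of Sect. 4.1.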

2.1 Steps of EMD

The original time series of stock index F(t) is decomposed as follows [10, 19]:

  1. Formation of Lower and Upper Envelopes: Identify all local minima in F(t) and interpolate them using the cubic spline method to generate a lower envelope \(F_{l}(t)\). Similarly, identify all local maxima and interpolate them to obtain an upper envelope \(F_{u}(t)\).

  2. Calculation of Mean Envelope: Calculate the mean envelope using \(M(t) = (F_{l}(t)+F_{u}(t))/2\).

  3. Local Detail: Obtain the local detail G(t) by subtracting M(t) from the original series F(t), i.e., \(G(t) = F(t) - M(t)\).

  4. Sifting: Repeat steps 1–3 on G(t) until one of the following stopping criteria is reached: (a) the value of the mean envelope approaches zero, (b) the difference between the number of zero crossings and the number of local extrema is at most 1, or (c) the user-defined number of iterations is reached. This process is called sifting. The resulting G(t) represents the first intrinsic mode function \(IMF_1(t)\), and the residue \(R_1(t)\) is obtained using \(R_1(t) = F(t) - G(t)\).

  5. Repeat the process: Repeat steps 1–4, treating the residue \(R_1(t)\) as the new input series, to obtain the subsequent IMFs and the final residual component.

The original series F(t) and its decomposed series are represented as

$$\begin{aligned} F(t) = \sum _{i=1}^N IMF_i(t) + R_N(t) \end{aligned}$$
(1)
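The sifting procedure can be sketched in plain Python. This is a minimal illustration rather than the implementation used in the study: it assumes piecewise-linear envelopes in place of cubic splines, pins both envelopes to the end points of the series, and applies only stopping criteria (a) and (c). The function names, the tolerance and the iteration limits are all assumptions.

```python
import math


def local_extrema(x):
    """Return the indices of interior local maxima and minima of x."""
    maxima, minima = [], []
    for i in range(1, len(x) - 1):
        if x[i - 1] < x[i] >= x[i + 1]:
            maxima.append(i)
        elif x[i - 1] > x[i] <= x[i + 1]:
            minima.append(i)
    return maxima, minima


def envelope(x, idx):
    """Piecewise-linear envelope through x at idx (assumption: no spline)."""
    knots = [0] + idx + [len(x) - 1]
    env = [0.0] * len(x)
    for a, b in zip(knots, knots[1:]):
        for t in range(a, b + 1):
            w = (t - a) / (b - a)
            env[t] = (1 - w) * x[a] + w * x[b]
    return env


def sift(x, max_iter=50, tol=1e-3):
    """Extract one IMF candidate G(t) from x (steps 1 to 4)."""
    g = list(x)
    for _ in range(max_iter):                      # criterion (c)
        maxima, minima = local_extrema(g)
        if len(maxima) + len(minima) < 2:
            break
        upper = envelope(g, maxima)
        lower = envelope(g, minima)
        mean = [(u + l) / 2 for u, l in zip(upper, lower)]
        g = [gi - mi for gi, mi in zip(g, mean)]
        if max(abs(m) for m in mean) < tol:        # criterion (a)
            break
    return g


def emd(x, max_imfs=6):
    """Return (imfs, residue) such that sum(imfs) + residue equals x."""
    imfs, residue = [], list(x)
    for _ in range(max_imfs):
        maxima, minima = local_extrema(residue)
        if len(maxima) + len(minima) < 2:          # nothing left to sift
            break
        imf = sift(residue)
        imfs.append(imf)
        residue = [r - c for r, c in zip(residue, imf)]   # step 5
    return imfs, residue


# Synthetic example: two oscillations plus a slow trend.
series = [math.sin(0.3 * t) + 0.5 * math.sin(2.0 * t) + 0.01 * t
          for t in range(200)]
imfs, residue = emd(series)
```

Because every IMF is subtracted from the running residue, the components add back to the input series exactly as in Eq. (1), whatever interpolation scheme is chosen for the envelopes.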

2.2 Steps of ANN

Artificial Neural Network (ANN) is one of the most commonly used machine learning techniques and is inspired by the structure and functioning of the human brain. Its ability to model non-linear data and its robust performance have led to its wide acceptance. The following factors affect the performance of the neural network:

Input data format: Since each sub-series is predicted independently, the number of neural networks equals the number of sub-series obtained after decomposition. The input layer of each neural network consists of the lagged values of its sub-series. The lags are determined using AR and ARIMA models, with the Partial Auto-correlation Function (PACF) and the Auto-correlation Function (ACF) as the criteria.

Network Structure: A three-layer resilient feed forward neural network consisting of input layer, hidden layer and output layer is considered for this study. In previous studies, three layered network structure was found to be efficient for predicting non-linear time series [4, 14, 15].

Training Algorithm: Resilient Back Propagation (RBP) [20] is adopted for training the model because it performs better than the most commonly used Back Propagation algorithm [15]. Further, training with RBP is faster and does not require training parameters such as a learning rate to be specified.
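The idea behind resilient back propagation can be illustrated with a small sketch of its weight update. The variant shown (no weight backtracking, update skipped after a sign change) and the constants are assumptions based on commonly cited defaults, not settings reported in this paper; `rprop_minimize` is a hypothetical helper, and a simple quadratic stands in for the network error surface.

```python
def rprop_minimize(grad, w, n_steps=200, step0=0.1,
                   eta_plus=1.2, eta_minus=0.5,
                   step_min=1e-6, step_max=50.0):
    """Minimize a function using only the sign of each partial derivative."""
    w = list(w)
    steps = [step0] * len(w)      # one adaptive step size per weight
    prev = [0.0] * len(w)         # gradient from the previous iteration
    for _ in range(n_steps):
        g = list(grad(w))
        for i in range(len(w)):
            same_sign = prev[i] * g[i]
            if same_sign > 0:     # same direction: accelerate
                steps[i] = min(steps[i] * eta_plus, step_max)
            elif same_sign < 0:   # overshoot: slow down, skip this update
                steps[i] = max(steps[i] * eta_minus, step_min)
                g[i] = 0.0
            if g[i] > 0:
                w[i] -= steps[i]
            elif g[i] < 0:
                w[i] += steps[i]
            prev[i] = g[i]
    return w


# Toy error surface: (w0 - 3)^2 + 10 * (w1 + 1)^2, minimum at (3, -1).
def toy_gradient(w):
    return [2 * (w[0] - 3), 20 * (w[1] + 1)]


w_opt = rprop_minimize(toy_gradient, [0.0, 0.0])
```

Since only the sign of the gradient is used, the very different curvatures of the two coordinates do not call for a hand-tuned learning rate, which is the property referred to above.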

The final forecasted value is obtained by aggregating the predicted sub-series.

3 Analysis

3.1 Data Description

The original time series comprises weekly closing prices of Nifty from September 2007 to July 2015, covering a period of 8 years. The data was collected from Yahoo! Finance. Nifty is the stock index of the National Stock Exchange, India, comprising 50 stocks covering 22 sectors. It is the benchmark index for the Indian equity market.

3.2 EMD

The weekly closing prices of Nifty were decomposed using EMD, resulting in a total of 7 components: six IMFs and a residual component. It can be observed from Fig. 2 that the IMFs produced are relatively stationary. These sub-series are predicted independently using ANN. The Box-Jenkins methodology is adopted to determine the model parameters of the ANN model [23].

Fig. 2. Decomposition of F(t) using EMD

3.3 Box-Jenkins Methodology

Figure 3 represents the Box-Jenkins methodology ([3, 18]) adopted to identify the model parameters of ANN. The steps are detailed as below:

  1. Stationarity Check: In this step, the commonly adopted Augmented Dickey-Fuller (ADF) test ([7, 8]) is used to check the stationarity of each IMF and of the residual component obtained using EMD, independently. For instance, \(IMF_1\) was found to be stationary. If a series is found to be non-stationary, its first difference is obtained and tested again for stationarity using the ADF test. The process continues until the series becomes stationary or the maximum number of iterations is reached. In the case of \(IMF_5\), the series and its first difference were found to be non-stationary, hence the second difference of the series was determined. Since the second difference series was found to be stationary, the iteration process for this series terminates.

  2. Identification of Lag Parameter: The lag parameter of the series obtained from the previous step is determined using ACF and PACF. The lag parameter of \(IMF_1\) is found to be 4, while that of the second difference of \(IMF_5\) is 5.

Steps 1 and 2 are repeated for each sub-series.
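The lag identification step can be sketched in plain Python. The sample ACF and the Durbin-Levinson recursion for the PACF are standard; the cut-off rule in `cutoff_lag` (count leading lags whose PACF lies outside the approximate 95% band of ±1.96/√n) is an assumption made for illustration and is not necessarily the rule used in the study. The simulated AR(1) series is likewise only example input.

```python
import math
import random


def acf(x, max_lag):
    """Sample autocorrelations r_0 .. r_max_lag."""
    n = len(x)
    m = sum(x) / n
    c0 = sum((v - m) ** 2 for v in x)
    return [sum((x[t] - m) * (x[t + k] - m) for t in range(n - k)) / c0
            for k in range(max_lag + 1)]


def pacf(x, max_lag):
    """Partial autocorrelations for lags 1 .. max_lag (Durbin-Levinson)."""
    r = acf(x, max_lag)
    phi_prev, out = [], []
    for k in range(1, max_lag + 1):
        num = r[k] - sum(phi_prev[j] * r[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(phi_prev[j] * r[j + 1] for j in range(k - 1))
        phi_kk = num / den
        phi_new = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j]
                   for j in range(k - 1)]
        phi_new.append(phi_kk)
        phi_prev = phi_new
        out.append(phi_kk)
    return out


def cutoff_lag(x, max_lag=10):
    """Number of leading lags with PACF outside the 95% band (assumed rule)."""
    bound = 1.96 / math.sqrt(len(x))
    lag = 0
    for k, value in enumerate(pacf(x, max_lag), start=1):
        if abs(value) <= bound:
            break
        lag = k
    return lag


# Example input: a simulated AR(1) process with coefficient 0.8.
random.seed(0)
ar1 = [0.0]
for _ in range(499):
    ar1.append(0.8 * ar1[-1] + random.gauss(0.0, 1.0))
suggested = cutoff_lag(ar1)
```

The stationarity check itself (the ADF test) is not reproduced here, since it relies on tabulated critical values that are best taken from a statistics library.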

Fig. 3. The Box-Jenkins methodology for ANN input data format

Table 1. Performance of ANN for different IMFs and residue

3.4 ANN

The ANN model was used to obtain the 1-period ahead forecast for each IMF and for the residue component. 70 % of the data was used for training the model and the remaining 30 % for testing it.

The lag parameter, which captures the interrelationship of a time series with its past values, determines the input of the neural network. ACF and PACF are used to identify the lag parameter for each sub-series (as explained in the previous section). For instance, the lag parameter of the first sub-series \(IMF_1\) is 4 (Fig. 4), since its PACF cuts off at lag 4, which indicates an autoregressive process. This means that the sub-series \(IMF_1\) at point t depends on its past 4 values; hence, the number of neurons in the input layer is four. This can be expressed mathematically as:

$$\begin{aligned} X(t) = f[X(t-1), X(t-2), X(t-3), X(t-4)] \end{aligned}$$
(2)
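Equation (2) corresponds to a supervised data set in which each row holds the four previous values and the target is the current value. A minimal sketch follows; the helper names are hypothetical, and applying the 70/30 split chronologically after building the lagged samples is an assumption about the ordering of these two steps.

```python
def make_lagged(series, n_lags):
    """Build inputs [X(t-1), ..., X(t-n_lags)] and targets X(t)."""
    inputs, targets = [], []
    for t in range(n_lags, len(series)):
        inputs.append([series[t - k] for k in range(1, n_lags + 1)])
        targets.append(series[t])
    return inputs, targets


def chronological_split(inputs, targets, train_fraction=0.7):
    """Keep the earliest samples for training and the latest for testing."""
    cut = int(len(inputs) * train_fraction)
    return inputs[:cut], targets[:cut], inputs[cut:], targets[cut:]


X, y = make_lagged([1, 2, 3, 4, 5, 6], n_lags=4)
X_train, y_train, X_test, y_test = chronological_split(X, y)
```

A chronological split avoids training on observations that lie after the test period, which matters for one-step-ahead forecasting.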
Fig. 4. ACF and PACF plot for \(IMF_1\)

Since a single 1-period ahead value is predicted, the number of neurons in the output layer is 1. The number of neurons in the hidden layer is selected on the basis of the best performance of the model. Table 1 lists the number of neurons in the various layers of each neural network.

Neural networks have the limitation of getting trapped in local minima. To mitigate this, three normalization schemes were tried: z-scores, scaling to [−1, 1], and scaling to [0, 1] [5]. The model using z-score normalization showed better performance than the other two. In addition, data normalization speeds up the training of the neural network [22]. The predicted values are denormalized before the error measures are calculated. In the same way, the first and second differences applied to the sub-series are transformed back. In the case of sub-series \(IMF_5\), the second difference and its past 5 values are used as the output and input neurons, respectively, in the neural network training model. Hence, the predicted values of this sub-series are transformed back to the original form.

For comparison, 1-period ahead predicted values are also obtained using SVR and ANN without decomposition, in order to analyze the effectiveness of the hybrid EMD-ANN model.
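The normalization and differencing transforms, together with their inverses, can be sketched as follows. The helper names are hypothetical, and the use of the population standard deviation estimated on the training data is an assumption. The back-transform of a second difference follows from \(\Delta^2 X(t) = X(t) - 2X(t-1) + X(t-2)\).

```python
from statistics import mean, pstdev


def zscore_fit(train):
    """Estimate the mean and standard deviation on the training data only."""
    return mean(train), pstdev(train)


def zscore(values, mu, sigma):
    return [(v - mu) / sigma for v in values]


def inverse_zscore(scaled, mu, sigma):
    return [z * sigma + mu for z in scaled]


def difference(values, order=1):
    """Apply first differencing `order` times."""
    for _ in range(order):
        values = [b - a for a, b in zip(values, values[1:])]
    return values


def undo_second_difference(predicted_d2, x_prev1, x_prev2):
    """Recover X(t) from a predicted second difference and X(t-1), X(t-2)."""
    return predicted_d2 + 2 * x_prev1 - x_prev2


mu, sigma = zscore_fit([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
scaled = zscore([9.0], mu, sigma)
restored = inverse_zscore(scaled, mu, sigma)

squares = [1.0, 4.0, 9.0, 16.0, 25.0]
second_diff = difference(squares, order=2)
next_square = undo_second_difference(second_diff[-1], squares[-1], squares[-2])
```

In the usage above, the second difference of the squares is constant, so carrying the last value forward and undoing the differencing yields the next square.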

4 Results and Discussion

4.1 Error Measures

The predicted values obtained using the SVR, ANN and EMD-ANN models are compared using two error measures: (a) Directional Accuracy (DA), and (b) Root Mean Square Error (RMSE). Both are calculated on the test data. DA measures the number of times the predicted value matched the direction of movement of the original series, expressed as a percentage. The higher the value, the better the predictive model.

The error \(E_i(t)\) is defined as the difference between the original series \(F_i(t)\) and the predicted value \(P_i(t)\). RMSE is calculated as the square root of the mean of the squared error values. These are expressed mathematically as:

$$\begin{aligned} E_i(t) = F_i(t) - P_i(t) \end{aligned}$$
(3)
$$\begin{aligned} RMSE = \sqrt{\sum ^n_{i=1} E_i(t)^2/n} \end{aligned}$$
(4)

The lower the value of RMSE, the better the predictive model.
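Both error measures are straightforward to compute. In the sketch below, RMSE follows Eqs. (3) and (4). No formula for DA is given in the text, so the definition used here (a hit when the predicted move from the previous actual value has the same sign as the actual move) is an assumption, and the numbers are made up for illustration.

```python
import math


def rmse(actual, predicted):
    """Root mean square error, Eqs. (3) and (4)."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def directional_accuracy(actual, predicted):
    """Percentage of periods in which the predicted direction was correct."""
    hits = 0
    for t in range(1, len(actual)):
        actual_move = actual[t] - actual[t - 1]
        predicted_move = predicted[t] - actual[t - 1]
        if actual_move * predicted_move > 0:
            hits += 1
    return 100.0 * hits / (len(actual) - 1)


actual = [10.0, 12.0, 11.0, 13.0]
predicted = [10.0, 11.5, 12.5, 12.0]
error = rmse(actual, predicted)
da = directional_accuracy(actual, predicted)
```

For these illustrative values, two of the three moves are predicted in the right direction, and the squared errors 0, 0.25, 2.25 and 1 average to 0.875.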

The error measures of the models under consideration are shown in Table 2. The hybrid EMD-ANN model shows superior performance: its RMSE is lower than that of the other two models, and its DA is clearly better than that of the ANN model and relatively better than that of the SVR model. The 1-period ahead predicted values obtained using the three models are shown in Fig. 5.

Table 2. Error measures

4.2 Significance Test

The Wilcoxon Signed-Rank Test (WSRT), one of the most commonly used techniques for this purpose, is a non-parametric, distribution-free test for comparing the predictive capabilities of two different models [9, 13]. In this test, the signs and the ranks of the paired differences are compared to identify whether the two predictive models differ. Here, WSRT is used to analyze whether the presented hybrid model outperformed both the SVR and ANN models without decomposition.

A two-tailed WSRT was carried out on the RMSE values; the results are shown in Table 3. From the table, it can be seen that the value of the z statistic lies outside (−1.96, 1.96), hence the null hypothesis that the two models perform the same is rejected. The results are significant at 99 % confidence level (\(\alpha = 0.01\)). The sign “\(+\)” in the table indicates that the predictive capability of the hybrid EMD-ANN model is superior to that of the two traditional computational intelligence techniques, namely, SVR and ANN. The signs “−” and “\(=\)” (not shown in the table) refer to underperformance and similar performance of the hybrid model compared to the other two models, respectively. The WSRT results confirm that the hybrid EMD-ANN model outperformed the traditional SVR and ANN models.
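The z statistic of the test can be sketched with the usual normal approximation. The function name, the pairing of per-observation loss values and the sample numbers are assumptions made for illustration; zero differences are dropped, tied absolute differences receive average ranks, and no tie correction is applied to the variance.

```python
import math


def wilcoxon_z(loss_a, loss_b):
    """z statistic of the signed-rank test; positive when model A has lower loss."""
    diffs = [b - a for a, b in zip(loss_a, loss_b) if b != a]
    n = len(diffs)
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over a run of tied absolute differences.
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean_w = n * (n + 1) / 4
    sd_w = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return (w_plus - mean_w) / sd_w


# Made-up losses: model A is better than model B in every pair.
loss_hybrid = [1.0] * 10
loss_other = [1.0 + k for k in range(1, 11)]
z = wilcoxon_z(loss_hybrid, loss_other)
```

The normal approximation is generally considered adequate only for moderately large samples; for small samples, exact tables or a statistics library are preferable.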

Table 3. Wilcoxon signed-rank test on RMSE of hybrid EMD-ANN with SVR and ANN

Fig. 5. Forecasts obtained using ANN, SVR and EMD-ANN

5 Conclusion

The paper presented a hybrid Empirical Mode Decomposition - Artificial Neural Network model to predict the Nifty stock index. The model incorporates the advantages of both methods. In the first stage, EMD was used to decompose the stock index into a set of sub-series. In the second stage, ANN was used to predict each sub-series independently. The predicted sub-series were then recombined to obtain the final predictions. The presented hybrid model exhibited better performance than the SVR and ANN models. It can be concluded that EMD enhanced the performance of the machine learning model, namely, ANN. Further, this hybrid model can be used for predicting non-stationary and non-linear time series.

The paper also analyzed the effect of data normalization procedure on ANN model. The performance of ANN model based on z-score normalization was consistently better than the model based on [0, 1] and [−1, 1] normalization.

The paper dealt with the scale-based decomposition approach for predicting Nifty. A comparative analysis of the feature vector based and scale-based approaches on Nifty can be carried out as future work.