1 Introduction

In the business environment, we wish to forecast various kinds of financial variables accurately and efficiently, in order to develop successful strategies and avoid large losses [43]. Researchers have studied financial time series forecasting since the 1980s, with the objective of beating the financial market. A huge number of factors (economic, political, environmental, and psychological) make financial forecasting an interesting and challenging field. Further, financial time series are inherently noisy, nonstationary, and deterministically chaotic [2, 38].

Most conventional prediction techniques rely on statistical methods such as time series and multivariate analyses. However, researchers have started to apply artificial intelligence (AI) methods to financial markets, because of recent successful developments such as artificial neural networks (ANNs), support vector machines (SVMs), particle swarm optimization (PSO), genetic algorithms (GAs), and fuzzy technologies. Refenes et al. [34], Tsibouris and Zeidenberg [40], and Steiner and Wittkemper [37] used ANN models to predict stock prices in the UK, US, and German markets, respectively. Wittkemper and Steiner [42] and Shazly et al. [36] used ANNs with GAs (hybrid models) to predict stock prices and currency exchange rates in Germany, the United Kingdom, Japan, and Switzerland.

Vapnik [41] introduced SVM methods to overcome the problems of ANNs (such as getting trapped in local minima, overfitting to training data, and long training times). Since then, several authors have proposed pricing financial instruments using SVMs. For example, Tay and Cao [38] and Cao and Tay [2] developed pricing models for five specific financial futures in the US market using SVMs, and Gestel et al. [7] used a least squares SVM (LS-SVM) for Treasury bill (T-bill) rate and stock index pricing in the US and German markets. Their simulation results showed that SVMs outperform ANNs. Moreover, SVMs have also been shown to perform better than ANNs and other statistical methods in other domains [23, 24].

Nicholas and Ravi [25] published an exhaustive survey on SVMs for time series prediction. They surveyed papers in the areas of financial market prediction, electric utility load forecasting, environmental state and weather prediction, and reliability forecasting. In their survey, they noted that the choice of the SVM's free parameters is significant. The experimental results of Kim [14] showed that SVM predictions are sensitive to these free parameters, and that it is important to select optimal values; improperly selected free parameters can cause over- or under-fitting problems [14]. Because financial time series data are nonlinear, we use nonlinear kernel functions such as the Gaussian and polynomial functions, which require appropriate, user-chosen parameter(s). Current approaches for choosing these free parameters are typically based on domain knowledge, trial and error, and ergodic search methods [4]. Several studies proposed selecting the optimal free parameters for SVMs/ANNs using PSO, GAs, artificial bee colonies (ABCs), ant colony optimization (ACO), differential evolution (DE), simulated annealing (SA), and so on [3, 11, 12, 21, 22, 44]. However, the optimization model itself introduces additional user-specified controlling parameter(s), making the user's task even more complex. For example, GAs need optimal controlling parameter values for crossover and mutation probabilities; PSO needs specified optimal controlling parameters such as inertia weight and social and cognitive parameters; SA needs a cooling temperature and cooling constants; DE requires a differentiation factor and a crossover probability; and ABC requires optimal controlling parameters for the number of bees (employed, scout, and onlooker), limits, and so on. Variations in the controlling parameters alter the effectiveness of the optimization algorithm.

Rao et al. [30] proposed the teaching–learning-based optimization (TLBO) algorithm, an optimization technique originally developed for mechanical design problems that does not require user-defined parameters. They tested the technique using different benchmark functions. Their results show that TLBO can outperform many optimization algorithms, such as particle evolutionary swarm optimization, ABC, and cultural DE. Similarly, Rao et al. [31] compared TLBO with well-known optimization techniques such as GA, ABC, PSO, HS, DE, and hybrid-PSO, by applying the methods to different benchmark problems [such as Griewank (\(D = 10\)), Hyper Sphere (\(D = 6\)), Rosenbrock (\(D = 1\), \(D = 3\)), Rastrigin, and Ackley]. They considered the effectiveness of TLBO in terms of different performance criteria (such as the average number of function evaluations, success rate, convergence rate, and mean solution). These results also showed that the TLBO method performed better than other nature-inspired optimization techniques for the considered benchmark functions. The TLBO technique developed by Rao et al. [30] has performed well in many studies [26–33, 35, 45].

In this study, we propose an SVM–TLBO hybrid regression prediction model for forecasting the multicommodity futures index (COMDEX) traded on the Multi Commodity Exchange of India Limited (MCX). We use the TLBO algorithm to select the free parameter(s) of the SVM and of the kernel function, and we compare the standard SVM method with the SVM–TLBO hybrid technique. The commodity futures index under consideration is a significant indicator of the performance of the Indian commodities market. MCX COMDEX is composed of futures contracts on 15 physical commodities with three subindices, representing the key commodity sectors within the index: metals, energy, and agriculture. Investors can use MCX COMDEX futures to efficiently hedge commodity and inflation exposure and lay off residual risk [1]. We developed the SVM–TLBO hybrid regression model because the most important consideration when using the standard SVM model is to properly select the free parameters [\(C\) (regularization) and \(\varepsilon\) (insensitive loss function radius)] and the kernel parameter(s) for training the data.

TLBO does not require any user-defined controlling parameter(s), which means that it can effectively determine the free parameter(s) of the SVM without any user input. Our experimental results show that the proposed hybrid SVM–TLBO regression model produces better forecasts than the PSO + SVM hybrid and standard SVM models. The remainder of this paper is structured as follows. In Sect. 2, we provide a summary of SVM regression and the SVM–TLBO hybrid regression model for selecting the optimal free parameters. Section 3 contains the proposed method for predicting the commodity futures index, followed by our results, comparisons, and analysis in Sect. 4. Section 5 concludes the study and outlines some future work.

2 SVM for regression and the SVM–TLBO hybrid regression model

2.1 SVM for regression

Vapnik and his coworkers developed the SVM technique for regression, which can be presented as follows.

Given a training data set \(\{ (x_{1}, y_{1}), \ldots, (x_{l}, y_{l})\}\) (where each \(x_{i} \in X \subset R^{n}\), and \(X\) denotes the input sample space), and matching target values \(y_{i} \in R\) for \(i = 1, \ldots, l\) (where \(l\) corresponds to the size of the training data), the objective of the regression problem is to find a function \(f: R^{n} \to R\) that can approximate the value of \(y\) when \(x\) is not in the training set.

The estimating function \(f\) is defined as

$$f(x) = w^{T} \varPhi (x) + b,$$
(1)

where \(w \in R^{m} ,\,b \in R\) is the bias, and \(\varPhi\) denotes a nonlinear function from \(R^{n}\) to high-dimensional space \(R^{m}\) (\(m\,\, > \,\,n\)). The aim is to find \(w\) and \(b\) such that the value of \(f(x)\) can be determined by minimizing the risk

$$R_{\text{reg}} (f) = C\sum\limits_{i = 1}^{l} {L_{\varepsilon} (y_{i}, f(x_{i}))} + \frac{1}{2}\left\| w \right\|^{2}.$$
(2)

Here, \(L_{\varepsilon}\) is the \(\varepsilon\)-insensitive loss function originally proposed by Vapnik [41], which is defined as

$$L_{\varepsilon}(y, z) = \left\{ {\begin{array}{*{20}c} {|y - z| - \varepsilon,} & {|y - z| \ge \varepsilon} \\ {0,} & {\text{otherwise}} \\ \end{array}} \right.$$
(3)

By introducing the slack variables \(\zeta_{i}\) and \(\zeta_{i}^{*}\), the problem in Eq. (2) can be reformulated as follows.

(P) Minimize \(C\left[ {\sum\limits_{i = 1}^{l} {(\zeta_{i} + \zeta_{i}^{*})}} \right] + \frac{1}{2}\left\| w \right\|^{2}\) subject to

$$\begin{aligned} y_{i} - w^{T} \varPhi (x_{i}) - b & \le \varepsilon + \zeta_{i}, \\ w^{T} \varPhi (x_{i}) + b - y_{i} & \le \varepsilon + \zeta_{i}^{*}, \\ \zeta_{i} & \ge 0, \\ \zeta_{i}^{*} & \ge 0, \\ \end{aligned}$$
(4)

where \(i = 1, \ldots, l\), and \(C\) is a user-specified constant known as the regularization parameter.

We can solve (P) using the primal–dual method to get the following dual problem.

Determine the Lagrange multipliers \(\{\alpha_{i}\}_{i = 1}^{l}\) and \(\{\alpha_{i}^{*}\}_{i = 1}^{l}\) that maximize the objective function

$$Q(\alpha_{i}, \alpha_{i}^{*}) = \sum\limits_{i = 1}^{l} {y_{i} (\alpha_{i} - \alpha_{i}^{*})} - \varepsilon \sum\limits_{i = 1}^{l} {(\alpha_{i} + \alpha_{i}^{*})} - \frac{1}{2}\sum\limits_{i = 1}^{l} {\sum\limits_{j = 1}^{l} {(\alpha_{i} - \alpha_{i}^{*})(\alpha_{j} - \alpha_{j}^{*}) K(x_{i}, x_{j})}},$$
(5)

subject to

$$\sum\limits_{i = 1}^{l} {(\alpha_{i} - \alpha_{i}^{*} ) = 0,}$$
(6)

and

$$0 \le \alpha_{i} \le C,\,\,\,0 \le \alpha_{i}^{*} \le C.$$
(7)

Here, \(i\,\, = 1,\, \ldots ,\,l\), and \(K:\,X\,\, \times X\, \to \,R\) is the Mercer kernel defined by

$$K(x,z) = \varPhi (x)^{T} \,\varPhi (z).$$
(8)

The solution of the primal dual method yields

$$w = \sum\limits_{i = 1}^{l} {(\alpha_{i} - \alpha_{i}^{*}) \varPhi (x_{i})},$$
(9)

and \(b\) is calculated using the Karush–Kuhn–Tucker (KKT) conditions. That is,

$$\begin{aligned} \alpha_{i} (\varepsilon + \zeta_{i} - y_{i} + w^{T} \varPhi (x_{i} ) + b) = 0, \hfill \\ \alpha_{i}^{*} (\varepsilon + \zeta_{i}^{*} + y_{i} - w^{T} \varPhi (x_{i} ) - b) = 0, \hfill \\ \end{aligned}$$
(10)
$$(C - \alpha_{i}) \zeta_{i} = 0 \quad {\text{and}} \quad (C - \alpha_{i}^{*}) \zeta_{i}^{*} = 0, \quad {\text{where}}\; i = 1, \ldots, l.$$
(11)

Since \(\alpha_{i} \cdot \alpha_{i}^{*} = 0\), both \(\alpha_{i}\) and \(\alpha_{i}^{*}\) cannot be simultaneously nonzero; there exists some \(i\) for which either \(\alpha_{i} \in (0, C)\) or \(\alpha_{i}^{*} \in (0, C)\), and hence \(b\) can be computed using

$$\begin{aligned} b & = y_{i} - \sum\limits_{j = 1}^{l} {(\alpha_{j} - \alpha_{j}^{*}) K(x_{j}, x_{i})} - \varepsilon \quad {\text{for}}\;\, 0 < \alpha_{i} < C, \\ b & = y_{i} - \sum\limits_{j = 1}^{l} {(\alpha_{j} - \alpha_{j}^{*}) K(x_{j}, x_{i})} + \varepsilon \quad {\text{for}}\;\, 0 < \alpha_{i}^{*} < C. \\ \end{aligned}$$
(12)

The \(x_{i}\) corresponding to \(0 < \alpha_{i} \, < C\) and \(0 < \alpha_{i}^{*} < C\) are called support vectors. Using the expressions for \(w\) and \(b\) in Eqs. (9) and (12), \(f(x)\) can be computed using

$$\begin{aligned} f(x) & = \sum\limits_{i = 1}^{l} {(\alpha_{i} - \alpha_{i}^{*})(\varPhi (x_{i})^{T} \varPhi (x))} + b \\ & = \sum\limits_{i = 1}^{l} {(\alpha_{i} - \alpha_{i}^{*}) K(x_{i}, x)} + b. \\ \end{aligned}$$
(13)

Note that we do not need the mapping \(\varPhi\) explicitly to compute \(f(x)\), which is an advantage of using the kernel.
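To make this concrete, the following minimal Python sketch evaluates the decision function of Eq. (13) for a new input, given a dual solution. It assumes the multipliers \(\alpha_{i}\), \(\alpha_{i}^{*}\), and the bias \(b\) have already been obtained from a QP solver such as LIBSVM; all names are illustrative.

```python
import numpy as np

def gaussian_kernel(x, z, sigma2):
    """Mercer kernel K(x, z) = exp(-||x - z||^2 / sigma^2), cf. Eq. (8)."""
    return np.exp(-np.sum((x - z) ** 2) / sigma2)

def svr_predict(x, X_train, alpha, alpha_star, b, sigma2):
    """Evaluate f(x) = sum_i (alpha_i - alpha_i*) K(x_i, x) + b, cf. Eq. (13).

    Only the support vectors (where alpha_i - alpha_i* is nonzero) actually
    contribute to the sum, so Phi is never needed explicitly.
    """
    coef = alpha - alpha_star
    k = np.array([gaussian_kernel(x_i, x, sigma2) for x_i in X_train])
    return float(coef @ k + b)
```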

Advantages of SVM

SVMs have become a well-established tool within machine learning. Conceptually, they have many advantages, which include the following.

  a. The technique is methodical and derived from statistical learning theory.

  b. The SVM training process requires convex function optimization, so there is a unique optimal solution (a global minimum).

  c. The model depends explicitly on a subset of the data points (the support vectors), which improves model design.

  d. The relatively easy training process is a major strength of SVM.

  e. There are no local optima, unlike ANNs.

  f. The method scales moderately well to high-dimensional data, and the tradeoff between model complexity and error can be explicitly controlled using appropriate optimal parameters.

Disadvantage of SVM

SVMs have the following disadvantage.

  • The training time is roughly between a quadratic and cubic function of the number of samples in the training set.

2.2 Teaching–learning-based optimization technique

TLBO is a recently developed, effective, meta-heuristic, population-based optimization algorithm [30], similar in spirit to PSO, GAs, and ABC. TLBO is modeled on the transfer of knowledge within a classroom, where learners (students) first acquire knowledge from a teacher (teacher phase) and then from their peers (student phase). The population in TLBO consists of a group of learners. As in other optimization algorithms, there are decision variables: the different decision variables in TLBO are equivalent to the different subjects offered to students, and the students' grades are equivalent to the "fitness" in other population-based optimization methods. A flow chart for the TLBO algorithm is presented in Fig. 1.

Fig. 1
figure 1

Flow chart for the TLBO algorithm

Salient features of TLBO

TLBOs have the following features.

  • Similar to other population-based methods (e.g., GAs, PSO, and ABC), TLBO uses a population of solutions to proceed toward the optimal solution.

  • We do not need to tune any additional algorithm-specific controlling parameter.

  • It uses the best solution of the current iteration to modify the existing solution in the population, which increases the convergence rate.

  • The mean value of the population is used to update the solution.

  • A good solution is accepted using a greediness approach.

  • The population is not divided, unlike methods such as the ABC algorithm.

2.2.1 Steps involved in the TLBO algorithm

The following steps of the TLBO algorithm were described by Rao et al. [30].

Step 1: Define the optimization problem and create a solution space

In the initial phase, we identify the decision variable(s) in the problem to be optimized and assign each a range (the minimum and maximum of the variable) within which we will search for the optimal solution. If the solution space and ranges are not properly defined, the optimization may take more time.

Step 2: Identify the fitness function

In this step, we design or identify the fitness function, which accurately represents, as a single number, how well a candidate solution fits our problem. The TLBO algorithm uses the fitness function to evaluate its candidate solutions and obtains the optimal solution by minimizing \(f(X)\).

Step 3: Initializing learners (or students)

Each learner (based on the population size) is initialized using random values for each of the variables (within the appropriate ranges).

The ith learner is represented by row vector \(X_{i}\), defined as

$$X_{i} = \,\left[ {x_{i,1} ,x_{i,2} ,x_{i,3} , \ldots ,x_{i,D} } \right],\,\,\,i\, = \,1,2, \ldots ,N ,$$
(14)

where \(D\) is the number of decision variables, and N is the number of learners. Each decision variable \(x_{i,j}\) is randomly assigned a value using

$$x_{i,j} = x_{j}^{\hbox{min}} + rand()*\left( {x_{j}^{\hbox{max}} - x_{j}^{\hbox{min}}} \right), \quad j = 1,2, \ldots, D,$$
(15)

where \(x_{j}^{\hbox{min}}\) and \(x_{j}^{\hbox{max}}\) are the minimum and maximum values of the jth decision variable, and \(rand()\) is the random number function that returns a number between 0 and 1.
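As an illustration, a minimal Python sketch of this initialization step (Eqs. (14)–(15)) is given below; the function name and the use of NumPy are our own choices, not part of [30].

```python
import numpy as np

def initialize_learners(n_learners, x_min, x_max, rng=None):
    """Step 3: create N learners uniformly at random within the search ranges.

    x_min, x_max: length-D arrays with the per-variable minimum and maximum.
    Returns an (N, D) matrix whose ith row is learner X_i, cf. Eqs. (14)-(15).
    """
    rng = rng or np.random.default_rng()
    x_min, x_max = np.asarray(x_min, float), np.asarray(x_max, float)
    return x_min + rng.random((n_learners, x_min.size)) * (x_max - x_min)
```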

Step 4: Teacher phase

  (a) Compute the mean value of each of the learners' decision variables, and denote the population mean as

$$X_{mean} = \left[ {\bar{x}_{1} ,\bar{x}_{2} , \ldots ,\bar{x}_{j} , \ldots ,\bar{x}_{D} } \right]\,,\,\,{\text{where}}\,\,\,\bar{x}_{j} \, = \,\frac{{\sum\nolimits_{i = 1}^{N} {x_{i,j} } }}{N}.$$
  (b) Compute the fitness value of each learner \(X\) using the fitness function \(f(X)\). The learner with the best fitness value (solution) is identified as the teacher (\(X_{teacher}\)) for the teacher phase.

  (c) The teacher (\(X_{teacher}\)) then transfers their knowledge and tries to improve the fitness of the other learners (\(X_{i}\)) using

$$X_{new} = \,X_{i} + \,rand()\,\,*\,(X_{teacher} \,\, - \,(TF)\,\,*\,X_{mean} )\,\,,\,\,\,\,{\text{for}}\quad\,i\, = \,1,2, \ldots ,N,$$
(16)

where

$$TF = \,round\left[ {1\, + \,rand\,(0,1)} \right]\,.$$
(17)

Here, \(TF\) is the teaching factor (either 1 or 2), and \(rand()\,\) is the random number function that returns a number between 0 and 1.

Note that \(TF\) is not a parameter of the TLBO algorithm. The value of \(TF\) is not provided as input to the TLBO, but its value is randomly chosen by the algorithm using Eq. (17).

  (d) If the updated solution (\(X_{new}\)) is better than the existing solution (\(X_{i}\)), then we accept the new solution; otherwise, we reject it. A sketch of this phase is given below.
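The following minimal Python sketch implements the teacher phase (Eqs. (16)–(17)) with greedy acceptance, continuing the initialization sketch above. Here fitness holds the current \(f(X_{i})\) values and lower is better (minimization); drawing one random number per variable, rather than a single scalar, is a common implementation variant, and bound handling is omitted for brevity.

```python
import numpy as np

def teacher_phase(pop, fitness, f, rng):
    """One teacher-phase pass with greedy acceptance, cf. Eqs. (16)-(17)."""
    x_mean = pop.mean(axis=0)               # population mean of each variable
    x_teacher = pop[np.argmin(fitness)]     # best learner acts as the teacher
    for i in range(pop.shape[0]):
        tf = rng.integers(1, 3)             # teaching factor TF: 1 or 2, Eq. (17)
        x_new = pop[i] + rng.random(pop.shape[1]) * (x_teacher - tf * x_mean)
        f_new = f(x_new)
        if f_new < fitness[i]:              # accept only if the learner improves
            pop[i], fitness[i] = x_new, f_new
    return pop, fitness
```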

Step 5: Student phase

In the student phase, the learners (students) enhance their knowledge by communicating with other learners in the classroom. An individual learner therefore gains knowledge when the other individual has more knowledge.

  (a) Randomly select two solutions \(X_{i}\) and \(X_{j}\) such that \(i \ne j\).

  (b) If the fitness value \(f(X_{i})\) of \(X_{i}\) is better than \(f(X_{j})\), then we update \(X_{i}\) to \(X_{new}\) using

$$X_{new} = \,X_{i} + \,rand()\,\,*\,(X_{i} \,\, - \,X_{j} )\,$$
(18)

otherwise, we update it to

$$X_{new} = \,X_{i} + \,rand()\,\,*\,(X_{j} \,\, - \,X_{i} )\, .$$
(19)

Here, \(rand()\) is the random number function that returns a number between 0 and 1.
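The student phase (Eqs. (18)–(19)) admits a similarly small sketch, again with greedy acceptance and illustrative names:

```python
import numpy as np

def student_phase(pop, fitness, f, rng):
    """One student-phase pass with greedy acceptance, cf. Eqs. (18)-(19)."""
    n = pop.shape[0]
    for i in range(n):
        j = rng.choice([k for k in range(n) if k != i])  # partner with j != i
        if fitness[i] < fitness[j]:
            step = pop[i] - pop[j]   # Eq. (18): X_i is better, move away from X_j
        else:
            step = pop[j] - pop[i]   # Eq. (19): X_j is better, move toward it
        x_new = pop[i] + rng.random(pop.shape[1]) * step
        f_new = f(x_new)
        if f_new < fitness[i]:
            pop[i], fitness[i] = x_new, f_new
    return pop, fitness
```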

Step 6: Iterate until the termination criteria are satisfied

We then repeat Steps 4 and 5 until our termination conditions are satisfied, i.e., the average value of the fitness function over all learners no longer improves, or we reach the maximum number of generations. The \(X_{i}\) that minimizes \(f(X_{i})\) is the final solution of the optimization problem.
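Putting Steps 3–6 together, and reusing the helper functions sketched above, a compact driver loop might look as follows; for simplicity it stops after a fixed number of generations rather than also monitoring the average fitness.

```python
import numpy as np

def tlbo(f, x_min, x_max, n_learners=15, max_iter=30, seed=None):
    """Minimize f over the box [x_min, x_max] using TLBO (Steps 3-6)."""
    rng = np.random.default_rng(seed)
    pop = initialize_learners(n_learners, x_min, x_max, rng)
    fitness = np.array([f(x) for x in pop])
    for _ in range(max_iter):             # Step 6: iterate teacher + student
        pop, fitness = teacher_phase(pop, fitness, f, rng)
        pop, fitness = student_phase(pop, fitness, f, rng)
    best = int(np.argmin(fitness))
    return pop[best], fitness[best]       # best learner and its fitness value
```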

2.3 SVM–TLBO hybrid regression model

We propose a hybrid SVM–TLBO regression model, which uses SVM for predictions and TLBO for determining the SVM parameters. SVM can use many kernels, for example, linear, polynomial, sigmoid, wavelet, and Gaussian kernels. We considered the Gaussian (radial basis) kernel function, which produces better financial time series forecasts [2, 38] because the data are complex and nonlinear. An SVM with a Gaussian kernel has three parameters that must be optimized: \(C\) (regularization), \(\sigma\) (kernel width), and \(\varepsilon\) (insensitive loss function radius).

We designed the proposed SVM–TLBO hybrid regression model to work in a two-dimensional solution space, that is, to optimize \(C\) and \(\sigma\). We keep the \(\varepsilon\) parameter constant at a reasonable value (0.0001), because the number of support vectors decreases as \(\varepsilon\) increases beyond 0.01 [2]. A flow chart of the SVM–TLBO hybrid regression model is presented in Fig. 2.

Fig. 2
figure 2

Flow chart representation of the SVM–TLBO hybrid regression model
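To illustrate how the pieces fit together, the sketch below defines a plausible fitness function for the hybrid model: the cross-validated RMSE of an SVR trained with a candidate (\(C\), \(\sigma^{2}\)) pair. We use scikit-learn's SVR (a LIBSVM wrapper) for brevity; the paper does not specify this implementation, and the choice of RMSE as the fitness value is our assumption.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVR

def svm_fitness(params, X, y):
    """Fitness of a candidate [C, sigma2]: 5-fold cross-validated RMSE.

    sklearn parameterizes the RBF kernel as exp(-gamma * ||x - z||^2),
    so gamma = 1 / sigma2; epsilon is held fixed at 0.0001, as in the text.
    """
    C = float(np.clip(params[0], 0.01, 35000.0))     # keep TLBO steps in the box
    sigma2 = float(np.clip(params[1], 0.0001, 32.0))
    model = SVR(kernel="rbf", C=C, gamma=1.0 / sigma2, epsilon=0.0001)
    mse = -cross_val_score(model, X, y, cv=5,
                           scoring="neg_mean_squared_error").mean()
    return float(np.sqrt(mse))

# TLBO (Sect. 2.2) then searches the two-dimensional solution space, e.g.:
# best, rmse = tlbo(lambda p: svm_fitness(p, X_train, y_train),
#                   x_min=[0.01, 0.0001], x_max=[35000.0, 32.0])
```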

3 Proposed methodology

3.1 Dataset

We applied our forecasting model to real multicommodity futures index (MCX COMDEX) data collected from the MCX (http://www.mcxindia.com). MCX COMDEX is a collection of futures contracts on 15 physical commodities, with a simple weighted average of three subindices (MCX AGRI, MCX METAL, and MCX ENERGY) that represent the key commodity sectors within the index. The index thus incorporates futures contracts drawn on metals, energy, and agricultural commodities traded on the MCX. We collected 1332 daily trading data points from MCX COMDEX, from January 1, 2010, to May 7, 2014. The time series data consist of the daily open, low, high, and closing prices, and the traded date. The raw daily prices were used to calculate our financial technical indicator inputs. The time span covers many important and significant economic events, which we believe makes the data appropriate for training the models. Table 1 describes the data set in terms of high, low, mean, median, standard deviation, kurtosis (a measure of the flatness of the distribution), and skewness (the degree of asymmetry of a distribution around its mean). The raw daily closing prices are plotted in Fig. 3. The data description in Table 1 and the plot in Fig. 3 clearly show that the data are well spread. Therefore, an SVM trained with these data should be a well-generalized model.

Table 1 Description of MCX COMDEX dataset
Fig. 3
figure 3

Closing prices of MCX COMDEX

3.2 Preprocessing of data

We derived 17 financial technical indicators from the collected data, and used these indicators as inputs to the SVM regression model to forecast the closing price of the futures index. The technical indicators were computed using the formulas in Table 2. Financial technical indicators are a class of metrics whose values are derived from generic price activity in financial markets, and are extensively used by traders to predict future price levels of a financial instrument by looking at past patterns. These indicators smooth out random price fluctuations in the market and offer a clearer perspective, because they are trend-following or lagging indicators. The 17 financial technical indicators used in our study are based on previous work by Kim and Han [15], Kim [14], Kim and Lee [16], Tsang et al. [39], Ince and Trafalis [9], Huang and Tsai [8], Liang et al. [19], Lai et al. [17], and Chih-Ming [6], and on feedback from domain experts. The indicators are (1) 10-day moving average, (2) 20-day bias, (3) moving average convergence/divergence (MACD), (4) stochastic indicator %K, (5) stochastic indicator %D, (6) stochastic slow %D, (7) Larry William's %R, (8) rate of change (ROC), (9) relative strength index (RSI), (10) commodity channel index (CCI), (11) psychological line, (12) buying/selling momentum indicator, (13) buying/selling willingness indicator, (14) momentum, (15) disparity 5, (16) disparity 10, and (17) moving average oscillator (MAO). After processing the 1332 raw data points, we obtained 1307 transformed data points with dates from February 1, 2010 to May 7, 2014. The 25 data points from January 1, 2010 to January 31, 2010 are not available because of the definitions of some technical indicators. For example, the buying/selling momentum and willingness indicators require 26 days of data.

Table 2 Technical indicators (features)

We linearly normalized the technical indicators so that they have a range of [0, 1]. This normalization minimizes forecasting errors and prevents variables with larger numeric ranges from dominating those with smaller numeric ranges. We applied this to both the input technical indicators and the output closing prices, which were normalized using

$$Y_{i} = \frac{{\left( {P_{i} - P_{\hbox{min}}} \right)}}{{\left( {P_{\hbox{max}} - P_{\hbox{min}}} \right)}}, \quad {\text{for}}\; i = 1,2,3, \ldots, N,$$
(20)

where \(Y_{i}\) is the normalized value,\(P_{i}\) is the original value, \(P_{\hbox{min} }\) and \(P_{\hbox{max} }\) are the minimum and maximum values in the original data, and \(N\) is the total number of trading days.
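A short NumPy sketch of this normalization, together with the inverse mapping needed to convert normalized forecasts back to price units, is given below (the function names are our own):

```python
import numpy as np

def min_max_normalize(p):
    """Scale a series to [0, 1], cf. Eq. (20)."""
    p = np.asarray(p, dtype=float)
    return (p - p.min()) / (p.max() - p.min())

def denormalize(y, p_min, p_max):
    """Map normalized predictions back to the original price scale."""
    return np.asarray(y) * (p_max - p_min) + p_min
```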

The normalized data were segregated into training and test groups, approximately in the ratio of 5:1. Hence, 1085 data points were used for training with 5-fold cross-validation, and the remaining 222 were used to test the model. We considered three different forecasts of the closing prices: (1) 1 day ahead; (2) 3 days ahead; and (3) 5 days ahead.

In the 1-day-ahead forecasting case, the normalized technical indicators for each trading day from February 1, 2010 to April 30, 2014, and the normalized closing price for the next trading day (from February 2, 2010 to May 1, 2014, 1 day ahead) were partitioned into training and testing sets. The data were split up in a similar way for the 3 and 5-days-ahead forecasts.

3.3 Performance criteria

We evaluated the performance of the proposed model using standard statistical metrics: root mean square error (RMSE), normalized mean squared error (NMSE), mean absolute error (MAE), and directional symmetry (DS) [2, 38, 43]. Detailed descriptions and definitions of these performance criteria are given in Table 3. RMSE, MAE, and NMSE measure the deviation between the actual and forecasted futures index prices, so smaller values are preferred. The accuracy of the direction of the prediction is provided by DS (in %). Larger DS values indicate a better forecast.

Table 3 Performance metrics
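A sketch of these metrics in Python is shown below. The formulas follow the usual definitions in [2, 38, 43] (NMSE normalizes the squared error by the variance of the actual series; DS counts days on which the predicted and actual price changes share a sign); the exact variants used here should be checked against Table 3.

```python
import numpy as np

def forecast_metrics(actual, pred):
    """RMSE, MAE, NMSE, and DS (in %) for a forecast series."""
    a = np.asarray(actual, float)
    p = np.asarray(pred, float)
    err = a - p
    rmse = float(np.sqrt(np.mean(err ** 2)))
    mae = float(np.mean(np.abs(err)))
    nmse = float(np.mean(err ** 2) / np.var(a, ddof=1))
    # DS: percentage of days on which the forecast moves in the same
    # direction as the actual series.
    ds = float(100.0 * np.mean((np.diff(a) * np.diff(p)) >= 0))
    return {"RMSE": rmse, "MAE": mae, "NMSE": nmse, "DS": ds}
```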

3.4 Computation techniques

We implemented Vapnik's SVM regression technique using LIBSVM, which is an SVM toolbox [5]. SVMs for financial time series forecasting commonly use the polynomial kernel \(k(x, y) = (x \cdot y + 1)^{d}\) or the Gaussian kernel \(k(x, y) = \exp ( - \left\| {x - y} \right\|^{2} /\sigma^{2})\), where \(d\) is the degree of the polynomial kernel and \(\sigma^{2}\) is the width (bandwidth) of the Gaussian kernel. We used the Gaussian (radial basis) kernel function, because it performs well under general smoothness assumptions. Additionally, the Gaussian kernel has fewer parameters than the polynomial kernel, which produces inferior results and requires more training time [2, 21, 38, 43]. We used an Intel Core i7 CPU, 4 GB memory PC for our simulations.

Traditional procedures for optimizing the parameters of the SVM model and the kernel function use grid search [13] or cross-validation [10] methods. However, both of these methods are computationally expensive and data intensive [12]. Grid search is a local search technique that often becomes trapped in local optima, and it is sometimes hard to determine its search interval [21]. In this study, we used grid search with cross-validation to find the best values of \(C\) and \(\sigma^{2}\) for the baseline model: we considered different pairs of (\(C\), \(\sigma^{2}\)) and then selected the pair that minimized the error, which we then used in our comparisons. In the simulation experiment, we used \(C\) values in the range 0.01 to 35,000, and \(\sigma^{2}\) values between 0.0001 and 32 (Table 4). After determining the final (\(C\), \(\sigma^{2}\)) values for all three forecasting cases (i.e., 1, 3, and 5 days ahead), we trained the model again to generate the final forecasting model. The index prices obtained using the standard SVM regression model are shown in Fig. 4a–c.
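For reference, this baseline tuning step can be reproduced with a standard grid search; the grid below spans the ranges in Table 4, but its spacing is our own choice, since the exact grid used is not reported.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVR

# C in [0.01, 35000] and sigma^2 in [0.0001, 32]; gamma = 1 / sigma^2.
param_grid = {
    "C": np.logspace(np.log10(0.01), np.log10(35000.0), 8),
    "gamma": 1.0 / np.logspace(np.log10(0.0001), np.log10(32.0), 8),
}
search = GridSearchCV(SVR(kernel="rbf", epsilon=0.0001), param_grid,
                      cv=5, scoring="neg_mean_squared_error")
# search.fit(X_train, y_train); search.best_params_ then holds the
# (C, gamma) pair used for the final standard-SVM model.
```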

Table 4 (a) SVM and (b) TLBO parameters used in our experiments
Fig. 4
figure 4

Actual and predicted futures index prices using the SVM regression, PSO + SVM hybrid model, and SVM–TLBO hybrid regression models, for the a 1-day-ahead forecast, b 3-days-ahead forecast, and c 5-days-ahead forecast

We wrote our own code to implement TLBO for the proposed SVM–TLBO hybrid regression model. The TLBO algorithm was defined in two dimensions, to optimize \(\sigma^{2}\) (the bandwidth of the Gaussian kernel) and \(C\) (the regularization parameter of the SVM). In our experimental runs of the TLBO algorithm, there were no significant changes to \(\sigma^{2}\) and \(C\) after 25–30 iterations, when using a population size (learners/students) of 15. Rao and Patel [29], Pawar and Rao [26], Rao et al. [28], and Rao and Waghmare [33] also observed that TLBO requires only a small population and few iterations (generations). With this in mind, we fixed the maximum number of iterations for the TLBO at 30, with a population size of 15 (Table 4). We observed that the value of our objective function decreased when the algorithm went from the teacher to the student phase within the same iteration, and decreased further with the number of iterations. Similar observations were made by Rao et al. [31]. When defining the solution space for TLBO, the range of \(C\) was set to 0.01–35,000, and the range of \(\sigma^{2}\) was set to 0.0001–32 [20]. The hybrid regression model algorithm ran as per the flow chart provided in Fig. 2, and the simulation results are shown in Fig. 4 and Table 8. In these results, the kth test day means the (1085 + k)th day from our reference date (February 1, 2010), because we used the first 1085 days of data for training and the remaining 222 days for testing. We compared the results of our proposed SVM–TLBO hybrid regression approach with standard SVM regression and the PSO + SVM model of Lin et al. [21]. We used a sequential minimal optimization (SMO)-based algorithm to train the SVM regression, because it is fast and efficient for large data sets.

4 Results and discussion

The RMSE results of the SVM regression model in the training and testing phases, and the final values of \(C\) and \(\sigma^{2}\), are presented in Table 5 for all three forecasting cases.

Table 5 Model performance and final parameter settings using the standard SVM regression model

The results for the proposed SVM–TLBO hybrid regression model, and the optimal parameters are summarized in Table 6.

Table 6 Model performance and optimal parameters achieved by proposed SVM-TLBO hybrid regression model

4.1 Comparisons of results

The RMSE, MAE, and NMSE values presented in Table 7 (best performance in bold) show that the SVM–TLBO hybrid regression model outperformed the standard SVM regression and PSO + SVM hybrid approaches in all three forecasting cases. With regard to the DS performance metric, SVM–TLBO performed better than the standard SVM and PSO + SVM models in two forecasting cases (3 and 5 days ahead), but standard SVM performed better for the 1-day-ahead forecast. Financial market practitioners evaluate forecasting models using both minimum forecast error and directional accuracy [18]; the aim is to achieve a directional accuracy of over 50 % [43]. In our study, the DS values for the SVM–TLBO hybrid and standard SVM methods were greater than 50 % in all cases. The DS values for the PSO + SVM hybrid approach were greater than 50 % for the 1-day-ahead and 3-days-ahead forecasts, but fell to 48.15 % for the 5-days-ahead forecast.

Table 7 Comparison of the results of the standard SVM, PSO + SVM hybrid, and SVM–TLBO hybrid regression models

Figure 4 shows the actual futures index prices, and the prices predicted using standard SVM regression, the PSO + SVM hybrid model, and the proposed SVM–TLBO hybrid regression model, for the three types of forecasts. Table 8 presents the forecasting results in terms of index prices for a few data samples using the standard SVM, PSO + SVM hybrid, and SVM–TLBO hybrid regression models. Table 8 clearly shows that the index prices from the proposed SVM–TLBO hybrid model were more accurate than those from the standard SVM, and much better than those from the PSO + SVM hybrid model.

Table 8 Forecasting results using the SVM regression, SVM-TLBO hybrid regression, and PSO + SVM hybrid models

5 Conclusions and future work

In this research, we examined the feasibility of applying the recently developed TLBO algorithm to select optimal free parameters for an SVM regression model of financial time series data. We used multicommodity futures index data collected from the MCX. Our experimental results show that the proposed SVM–TLBO hybrid regression model effectively found the optimal parameters, and produced better predictions than the standard SVM method. Compared with standard SVM regression, the proposed model improved the MAE by 65.87 % (1-day-ahead forecast), 55.83 % (3-days-ahead forecast), and 67.03 % (5-days-ahead forecast), and improved the RMSE by 55.64 % (1 day ahead), 55.74 % (3 days ahead), and 57.3 % (5 days ahead). There were similar improvements in terms of MAE and RMSE when we compared the proposed SVM–TLBO hybrid regression method with the PSO + SVM hybrid model. Moreover, our experiments demonstrate that the proposed SVM–TLBO hybrid regression model is more efficient than the standard SVM and PSO + SVM hybrid models for financial time series forecasting. The proposed model avoids the user-specified control parameters that are required by optimization methods such as PSO, GAs, and ACO.

In our current model, we selected the technical indicators (features) using previous research in this area and expert feedback. We could enhance the accuracy of the forecasts by including relevant macroeconomic features. In future work, the proposed model can also be applied to other domains, to validate and extend it.