1 Introduction

The application of correlation to Forex market analysis has been explored in previous research. Correlation in the forex market is normally investigated through technical and/or fundamental analysis. For the correlation between previous market trends and observed time series data, technical analysis extracts similar patterns from historical market trends, and the significance of such correlation has been verified in forecasting future market direction [1]. For the correlation between macroeconomic data and observed time series data, fundamental analysis directly measures and examines the qualitative and quantitative impact of macroeconomic data on the current market status [2]. Regarding the use of the two analysis methods, the Bank of England conducted a questionnaire survey in 1992 among chief foreign exchange dealers based in London [3]. The results revealed that at least 90% of respondents preferred to use technical analysis [4–9] to conduct correlation analysis for the forex market when forming views at one or more time horizons. In 2002, the Bank of Canada evaluated fundamental analysis [10, 11], finding that correlations from fundamental analysis provide strong evidence for forex market trend variation, and suggested that such correlations must be considered by forex traders.

In statistics, the standard correlation analysis method calculates correlation coefficients through a distance-based covariance calculation over every time point of the time line [12]. It is worth noting that standard correlation measures only distance similarity. Significant correlation knowledge on trend similarity, however, is lost because time point mismatches occur in most financial time series. For example, given two periods of time series with a similar increasing trend but different zigzag paths, the apparent correlation in trend is very likely to be ignored, because the statistical correlation calculation often gives a coefficient lower than the predefined threshold due to mismatches between the two time series (e.g. the peak of one market coincides with the trough of another).

In the context of forex market analysis, this paper proposes a correlation-aided support vector regression model (cSVR) capable of conducting technical analysis and fundamental analysis in conjunction to extract correlation data from the market and enhance the performance of time series prediction. In our experiments on forex market trend analysis, the proposed cSVR is implemented for exchange rate prediction on 5 contracts, in which correlation analysis is conducted on: (1) correlation to the historical market; (2) correlation to other currencies; and (3) correlation to microeconomic variables.

The rest of the paper is organized as follows. Section 2 introduces related research and the motivations of the presented research. Section 3 describes the proposed methods for informative correlation data extraction. Section 4 describes the proposed cSVR time series prediction method. Results of the correlation validity evaluation are reported in Sect. 5. Finally, conclusions and directions for future research are given in Sect. 6.

2 Literature review and motivations

2.1 Correlation extraction methods

Correlation in statistics indicates the strength and direction of a relationship between two random variables [13]. Depending on the distribution of the data, correlation can be categorized into two main types: Pearson’s correlation (e.g. positive or negative linear correlation) [14] and non-parametric correlation (e.g. Spearman correlation, Kendall’s tau correlation, Gamma correlation) [15]. The most popular correlation extraction method for forex market analysis is Pearson’s correlation.

Pearson’s correlation [14] is summarized as follows. Given time series \(X=\{x_1,x_2, \ldots, x_N\}\) and \(Y=\{y_1,y_2,\ldots,y_N\},\) the Pearson product-moment correlation coefficient \(\rho_{X,Y}\) is calculated as:

$$ \rho_{X,Y} = \frac{cov(X,Y)}{\sigma_X\sigma_Y} = \frac{E((X - \mu_X)(Y - \mu_Y))}{\sigma_X\sigma_Y}, $$
(1)

where cov is the covariance; \(\sigma_X\) and \(\sigma_Y\) are the standard deviations; \(\mu_X\) and \(\mu_Y\) are the expected values; and E is the expected value operator. In practice, in addition to \(\rho_{X,Y}\), Pearson’s correlation returns a probability value p. In statistical hypothesis testing, the p value is the probability of obtaining a test statistic at least as extreme as the one actually observed (Y to X), assuming that the null hypothesis is true. The null hypothesis is typically a statement of no difference or no effect. The fact that p values are based on this assumption is crucial to their correct interpretation. The lower the p value, the less likely the result under the null hypothesis, and hence the more “significant” the result in the sense of statistical significance. p is calculated as:

$$ p = \frac{1}{N-1}\sum^{N-1}_{i=1} p_i $$
(2)

where,

$$ p_i = \left\{ \begin{array}{cc} 0 & \hbox{if } \Delta x_i > 0\,\hbox{and}\,\Delta y_i > 0\\ 1 & \hbox{if } \Delta x_i < 0\,\hbox{and}\,\Delta y_i > 0\\ 1 & \hbox{if } \Delta x_i > 0\,\hbox{and}\,\Delta y_i < 0\\ \end{array}\right. $$
(3)

Consider \( \sigma_X^2 = E[(X - E(X))^2] = E(X^2) - E^2(X) \), since \(\mu_X = E(X)\), and likewise for Y. Also, \(E[(X - E(X))(Y - E(Y))] = E(XY) - E(X)E(Y)\). Equation (1) is therefore often formulated together with p as:

$$ \begin{aligned} &\rho_{X,Y} =\frac{E(XY) - E(X)E(Y)}{\sqrt{E(X ^ 2) - E ^ 2(X)} \sqrt{E(Y ^ 2) - E ^ 2(Y)}}\\ & \hbox{subject to:}\,p < 0.05, \end{aligned} $$
(4)

\(\rho_{X,Y}\) ranges from −1 to +1. It follows that Pearson’s correlation includes positive correlation and negative correlation. A positive correlation (\(\rho_{X,Y} \rightarrow 1\)) means that as one variable/time series (X) becomes large, the other (Y) also becomes large, and vice versa. \(\rho_{X,Y} = +1\) means a perfect positive linear relationship between X and Y. In the case of negative correlation (\(\rho_{X,Y} \rightarrow -1\)), as one variable (X) increases, the other (Y) decreases, and vice versa. Note that Pearson’s correlation \(\rho_{X,Y}\) is considered statistically significant only if p is less than 0.05.
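As an illustration, the expectation form of the coefficient in (4) can be evaluated directly. The following is a minimal Python sketch; the function name `pearson` and the sample values are illustrative assumptions and are not taken from the original study, and the significance condition on p is not evaluated here.

```python
import math

def pearson(x, y):
    """Pearson coefficient via the expectation form:
    (E(XY) - E(X)E(Y)) / (sqrt(E(X^2) - E^2(X)) * sqrt(E(Y^2) - E^2(Y)))."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("series must have the same length (at least 2)")
    ex = sum(x) / n
    ey = sum(y) / n
    exy = sum(a * b for a, b in zip(x, y)) / n
    var_x = sum(a * a for a in x) / n - ex * ex
    var_y = sum(b * b for b in y) / n - ey * ey
    if var_x <= 0 or var_y <= 0:
        raise ValueError("correlation is undefined for a constant series")
    return (exy - ex * ey) / (math.sqrt(var_x) * math.sqrt(var_y))

# Hypothetical sample series (not market data).
rising = [1.0, 2.0, 3.0, 4.0]
print(pearson(rising, [2.0, 4.0, 6.0, 8.0]))  # perfect positive linear relation
print(pearson(rising, [8.0, 6.0, 4.0, 2.0]))  # perfect negative linear relation
```

The two calls show the limiting cases discussed above: a coefficient near +1 for a perfectly increasing linear relation and near −1 for a perfectly decreasing one.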

The advantage of using Pearson’s correlation is that more accurate prediction can be made when a strong correlation exists among variables/time series patterns. The suitability of Pearson’s correlation for financial market forecasting has been demonstrated by Kondratenko and Kuperin [16]. They used Pearson’s correlation to aid neural networks (NN) in forecasting the exchange rates between the US Dollar and four other major currencies: the Japanese Yen, Swiss Franc, British Pound, and Euro. The results show that the NN performs better with information extracted by Pearson’s correlation than without it. Also, a recent study [17] found that an NN model working with Pearson’s correlation yields a better average internode distance on ten exchange rates than with other correlation methods. However, both articles found that their Pearson’s correlation-aided time series prediction is not reliable.

In contrast to Pearson correlation, which is influenced by outliers, unequal variances, and non-normality, non-parametric correlation is calculated by applying the Pearson correlation formula to the ranks of the data rather than to the actual data values themselves. In doing so, many of the distortions that plague the Pearson correlation are reduced considerably. In the literature, Chi-square correlation [18], Point biserial correlation [19], Spearman’s correlation [20], and Kendall’s correlation [21] are some of the well-known non-parametric correlation methods.
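The rank-based idea can be sketched for Spearman’s correlation, which applies the Pearson formula to ranks. The Python sketch below is illustrative only; the helper names, the average-rank treatment of ties, and the sample values are assumptions.

```python
import math

def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def pearson(x, y):
    """Pearson coefficient in its centred (covariance) form."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman correlation: the Pearson formula applied to the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical data: monotone relation with one outlier in y.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.0, 2.0, 3.0, 4.0, 100.0]
print(pearson(x, y), spearman(x, y))
```

In this example the outlier pulls the Pearson coefficient well below 1, while the rank-based coefficient still reports a perfectly monotone relation.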

It is worth noting that the efficiency of a particular non-parametric correlation method depends on the type of probability distribution inherent in the data. Thus, different non-parametric correlations have their characteristic applications in practice. Chi-square correlation works well for age-adjusted death rates, life table analysis [22], lung cancer analysis [23], and cardiac re-synchronization therapy (CRT) in heart failure (HF) [24]. Point biserial correlation is particularly suited to the analysis of children’s reading attainment [25], schizophrenia research [26], and academic achievement prediction [27]. Spearman’s correlation usually performs well on psoriasis disease analysis [28], analysis of lung inflammation in asthma [29], and Gaucher disease prediction [30]. Kendall’s correlation is particularly suited to the analysis of drug composition [31], network-coupled motions [32], and information ordering evaluation [33].

2.2 Motivation of cSVR

It has been confirmed that correlation information/knowledge has a unique, sometimes even decisive, role in market trend analysis and forecasting, despite the chaotic variation of the forex market. From the viewpoint of technical analysis, correlation data are essential for any computational market analysis in addition to the observed market data, especially when insufficient market data are available for analysis, or when the observed market data give little indication of the future direction of the market.

Aiming to extract correlation data significant to the observed market, we studied a new correlation computing and synthesis approach, in which correlation knowledge is derived and encoded over historical data from the observed currency pair, relevant currency pairs, and important domestic/international microeconomic variables. Based on computational analysis of all available market data, the extracted correlation is expected to enable an ordinary trader to conduct expert market trend analysis in the same way as a financial professional does with years of experience in traditional technical and fundamental analysis.

Motivated by this, we model market trend similarity in addition to the standard distance similarity for correlation analysis, employing a channel method followed by a parameterized Pearson’s correlation method to extract all patterns most similar and correlative to the observed time series. We utilize technical analysis and fundamental analysis in conjunction to select informative correlation data for learning, assisting computational inference models such as SVR for enhanced forex market trend analysis/forecasting.

3 Informative correlation data extraction

3.1 Channel correlation extraction

The channel correlation models a concrete arc and graphically approximates the trend similarity between two time series. Figure 1 gives the diagrams of 4 typical trend patterns: fast growing, slowly increasing, fast dropping, and slowly decreasing.

Fig. 1 Four trend patterns used for channel approximation

Straightforwardly, each of the above trend patterns can be measured graphically by one arc, with its function formulated as a segment of a circle as shown in Fig. 2. In this paper, we call this arc function a channel pattern. Accordingly, we describe the 4 types of channel patterns by the following 4 arc functions, respectively,

$$ (x-x_0)^2 + (y-y_0)^2 = R^2 \left| \begin{gathered} x_0 = 0,\ y_0 = R \\ x \in \left[0,\ \sin\alpha \cdot R\sqrt{2(1 - \cos 2\alpha)}\right] \\ \text{see Fig. 2a} \end{gathered} \right. $$
(5)
$$ (x-x_0)^2 + (y-y_0)^2 = R^2 \left| \begin{gathered} x_0 = R,\ y_0 = 0 \\ x \in \left[0,\ \sin\alpha \cdot R\sqrt{2(1 - \cos(\pi - 2\alpha))}\right] \\ \text{see Fig. 2b} \end{gathered} \right. $$
(6)
$$ (x-x_0)^2 + (y-y_0)^2 = R^2 \left| \begin{gathered} x_0 = 0,\ y_0 = 0 \\ x \in \left[0,\ (1 - \cos\alpha) \cdot R\sqrt{2(1 - \cos(\pi - 2\alpha))}\right] \\ \text{see Fig. 2c} \end{gathered} \right. $$
(7)
$$ (x-x_0)^2 + (y-y_0)^2 = R^2 \left| \begin{gathered} x_0 = R,\ y_0 = R \\ x \in \left[0,\ (1 - \cos\alpha) \cdot R\sqrt{2(1 - \cos 2\alpha)}\right] \\ \text{see Fig. 2d} \end{gathered} \right. $$
(8)

In (5)–(8), \(\angle\alpha\in(0,\pi/4)\) is the parameter reflecting the speed of market price variation (increasing or decreasing). The radius R determines the length of the trend pattern, corresponding to the time period of observation. In practice, a discrete channel pattern (i.e. arc) with a specified \(\angle\alpha\) is generated according to the length of the time series for channel approximation.
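As a worked illustration of how a discrete channel pattern may be generated, consider the first arc function, (5). Since \(\sqrt{2(1-\cos 2\alpha)} = 2\sin\alpha\) for \(\alpha\in(0,\pi/4)\), the upper bound of x in (5) simplifies to \(2R\sin^2\alpha\), which is always smaller than R. The Python sketch below samples this arc at a given number of points. It assumes the branch of the circle passing through the origin, \(y = R - \sqrt{R^2 - x^2}\), and uniform spacing in x; neither choice is specified in the text, and the function name is hypothetical.

```python
import math

def channel_pattern_eq5(alpha, radius, n_points):
    """Sample the arc of (5): circle centred at (0, R), x in [0, x_max].

    Assumption: the branch through the origin is used, y = R - sqrt(R^2 - x^2),
    and the n_points samples are uniformly spaced in x.
    """
    if not 0.0 < alpha < math.pi / 4:
        raise ValueError("alpha must lie in (0, pi/4)")
    if n_points < 2:
        raise ValueError("at least two points are needed")
    # Upper bound of x as written in (5); equals 2 * R * sin(alpha)^2.
    x_max = math.sin(alpha) * radius * math.sqrt(2.0 * (1.0 - math.cos(2.0 * alpha)))
    xs = [x_max * i / (n_points - 1) for i in range(n_points)]
    ys = [radius - math.sqrt(radius * radius - x * x) for x in xs]
    return xs, ys

xs, ys = channel_pattern_eq5(math.pi / 6, 1.0, 5)
print(xs)
print(ys)
```

A larger α extends the arc further along the circle, so the sampled pattern rises more steeply toward its end; how the sampled arc is scaled to the price range of a given time series is left open here.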

Fig. 2 Four types of arc ruler corresponding to the 4 trend patterns shown in Fig. 1

Given an observed time series X with N data points and another time series Y with T points, \(N\leq{T},\) (5)–(8) are applied to X in turn, and one of the 4 types of channel pattern is selected with α tuned to best suit X,

$$ p^* =\arg\min_{\alpha,i\in[1, 4]}\frac{\sum_{t=1}^{N}\|p^{i}_{t}-x_{t}\|}{N}. $$
(9)
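The selection in (9) amounts to a minimum search over candidate patterns. A minimal Python sketch follows, assuming the candidates (one per pattern type and α value) have already been discretized to the length of X; the labels and numbers are hypothetical.

```python
def mean_distance(pattern, series):
    """Mean point-wise distance between a discrete pattern and a series."""
    if len(pattern) != len(series):
        raise ValueError("pattern and series must have the same length")
    return sum(abs(p - x) for p, x in zip(pattern, series)) / len(series)

def select_channel_pattern(candidates, x):
    """Return (label, pattern) of the candidate closest to x, as in (9).

    candidates: dict mapping a label, e.g. (pattern type, alpha), to a
    discrete pattern with len(x) points.
    """
    best = min(candidates, key=lambda label: mean_distance(candidates[label], x))
    return best, candidates[best]

# Hypothetical observed series and two hypothetical candidate patterns.
x = [0.0, 0.1, 0.4, 0.9]
candidates = {
    "rising": [0.0, 0.1, 0.3, 1.0],
    "falling": [1.0, 0.3, 0.1, 0.0],
}
label, p_star = select_channel_pattern(candidates, x)
print(label, p_star)
```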

To discover the correlation of Y to X, the mean Euclidean distance from each length-N subperiod of Y to the channel pattern p* is estimated at every starting time point t,

$$ d_{t} =\frac{\sum_{i=1}^{N}{\|p^{*}_{i}-y_{t+i-1}\|}}{N}, $$
(10)

and correlation data are selected within a process of shifting distance comparison as,

$$ \begin{aligned} {\mathcal{C}}_{c}(X,Y) = \{y_{t},y_{t+1},\ldots,y_{t+N-1}\}\\ \hbox{subject to: } d_{t}<\xi, t=1,\ldots, T-N+1, \end{aligned} $$
(11)

where a subperiod time series of Y is judged correlated to X only if its distance to the channel pattern p* is less than the distance threshold ξ.

The threshold ξ is normally set to the average distance between the selected channel pattern p* and the observed time series X,

$$ \xi =\frac{\sum_{t=1}^{N}{\|p^{*}_{t}-x_{t}\|}}{N}. $$
(12)
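The shifting distance comparison of (10)–(12) can be sketched as a sliding window over Y. The Python code below is an illustration under stated assumptions: each window has the same length as the pattern, indices are 0-based, and all names and sample values are hypothetical.

```python
def window_distance(pattern, y, start):
    """Mean distance between the pattern and the window of y beginning at start."""
    n = len(pattern)
    return sum(abs(pattern[i] - y[start + i]) for i in range(n)) / n

def channel_extract(pattern, x, y):
    """Select windows of y whose distance to the pattern is below xi.

    xi is the mean distance between the pattern and the observed series x,
    following (12). Returns xi and a list of (start index, window) pairs.
    """
    n = len(pattern)
    if len(x) != n or len(y) < n:
        raise ValueError("need len(x) == len(pattern) <= len(y)")
    xi = sum(abs(p - v) for p, v in zip(pattern, x)) / n
    hits = []
    for start in range(len(y) - n + 1):
        if window_distance(pattern, y, start) < xi:
            hits.append((start, y[start:start + n]))
    return xi, hits

# Hypothetical pattern, observed series and candidate series.
p_star = [0.0, 1.0, 2.0]
x = [0.2, 1.1, 2.3]
y = [5.0, 0.1, 1.0, 2.1, 7.0, 0.0, 1.0, 2.0]
xi, hits = channel_extract(p_star, x, y)
print(xi, [start for start, _ in hits])
```

Because the comparison in (11) is strict, a pattern that fits X exactly gives ξ = 0 and no window can be selected; the closer the fit, the fewer windows pass.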

Alternatively, ξ can be fixed by the minimum distance of Y to p* within [1, T]. In this way, correlation data are selected by an exhaustive minimum search as,

$$ \begin{aligned} {\mathcal{C}}_{c}(X,Y) = \{y_{t},y_{t+1},\ldots,y_{t+N-1}\}\\ \hbox{subject to: } d_{t}\leq\min_{s\in[1, T]} d_{s}, \end{aligned} $$
(13)

where \(d_t\) is the distance defined in (10), and \(\min_{s\in[1, T]} d_{s}\) is the minimum distance of Y to p* within [1, T].
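The minimum-distance variant can be sketched in the same way: every window of Y is scored against p*, and the window or windows attaining the smallest distance are kept. The Python sketch below assumes windows of the same length as the pattern and 0-based indices; names and values are hypothetical.

```python
def closest_windows(pattern, y):
    """Return (d_min, hits), where hits lists every (start, window) of y
    whose mean distance to the pattern equals the minimum d_min."""
    n = len(pattern)
    if len(y) < n:
        raise ValueError("y must be at least as long as the pattern")
    distances = [
        sum(abs(pattern[i] - y[start + i]) for i in range(n)) / n
        for start in range(len(y) - n + 1)
    ]
    d_min = min(distances)
    hits = [(s, y[s:s + n]) for s, d in enumerate(distances) if d == d_min]
    return d_min, hits

# Hypothetical pattern and candidate series.
p_star = [0.0, 1.0, 2.0]
y = [5.0, 0.1, 1.0, 2.1, 7.0, 0.0, 1.0, 2.0]
d_min, hits = closest_windows(p_star, y)
print(d_min, hits)
```

Unlike the thresholded selection, this variant always returns at least one window, even when no subperiod of Y resembles the pattern closely.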

3.2 The parameterized Pearson’s correlation

To overcome the drawbacks of the channel method, the typical Pearson’s correlation [14] is extended for distance correlation extraction with minimized noise and, at the same time, minimal loss of useful information.

According to [14], correlation extraction by Pearson’s correlation is subject to the p condition. In the forex market, however, two similar time series are often found to have a high p value due to the time point mismatches between the two variables. This implies that significant correlation information is likely to be missed because of the high p value, and Pearson’s correlation analysis alone is therefore ineffective for extracting useful information for forex market analysis.

To develop a practically feasible and effective correlation extraction through Pearson’s correlation analysis, the standard Pearson’s correlation analysis is parameterized by setting a hyperplane on each side of the perfect positive correlation (Y = X) to exclude noisy correlation data,

$$ \begin{aligned} r=&\frac{\sum{XY}-\frac{\sum{X}\sum{Y}}{N}}{\sqrt{\left(\sum{X^{2}} -\frac{(\sum{X})^{2}}{N}\right)\left(\sum{Y^{2}}-\frac{(\sum{Y})^{2}}{N}\right)}}\\ \hbox{subject\ to:\ } & Y-X-\alpha<0,\,\hbox{and}\,Y-X+\alpha>0, \\ \end{aligned} $$
(14)

where α is the similarity margin identifying the relatedness to the perfect linear correlation (i.e. Y = X). Figure 3 gives an illustration of this parameterized Pearson’s correlation analysis.

Fig. 3 Illustration of weighted Pearson’s correlation extraction

When conducting correlation extraction, the distance from the point \((x_i,y_{t+i-1})\) to the perfect Pearson’s correlation line Y = X is calculated for every pair of corresponding time points of X and the subperiod of Y starting at t, and then averaged,

$$ d_{t}=\frac{\sum_{i=1}^{N}{\frac{|x_{i}-y_{t+i-1}|}{\sqrt{2}}}}{N}. $$
(15)

Then, similar to the above channel method, correlation data are selected from Y through a shifting distance comparison as,

$$ \begin{aligned}&{\mathcal{C}}_{p}(X,Y) = \{y_{t},y_{t+1},\ldots,y_{t+N-1}\}\\&\hbox{subject\ to:\ } d_{t}<\alpha, t=1,\ldots,T-N+1. \end{aligned}$$
(16)
(16)

Note that the parameter α identifies the width of the correlation margin. In the proposed correlation extraction, α measures the trade-off between distance similarity and trend similarity. A smaller α means a strict distance requirement for correlation extraction, while a large α indicates that the correlation relies more on the trend than on the distance similarity. In practice, α can be determined experimentally by cross validation tests.
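The margin-based selection can be sketched as follows, using the perpendicular distance \(|x - y|/\sqrt{2}\) from a point (x, y) to the line Y = X, averaged over each window of Y. This Python sketch is illustrative only: the function names and the sample rates are hypothetical, windows are assumed to have the same length as X, and indices are 0-based.

```python
import math

def line_distance(x, y, start):
    """Mean perpendicular distance from the points (x[i], y[start + i])
    to the line Y = X."""
    n = len(x)
    total = sum(abs(x[i] - y[start + i]) for i in range(n))
    return total / (n * math.sqrt(2.0))

def pearson_margin_extract(x, y, alpha):
    """Select the windows of y whose mean distance to Y = X is below alpha."""
    n = len(x)
    if len(y) < n:
        raise ValueError("y must be at least as long as x")
    return [
        (start, y[start:start + n])
        for start in range(len(y) - n + 1)
        if line_distance(x, y, start) < alpha
    ]

# Hypothetical series; 0.07 is the margin value reported in the experiments.
x = [1.00, 1.02, 1.01]
y = [1.50, 1.01, 1.03, 1.00, 1.40]
hits = pearson_margin_extract(x, y, alpha=0.07)
print([start for start, _ in hits])
```

Widening the margin admits windows that follow the trend of X at a larger offset, which is the trade-off between distance and trend similarity described above.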

3.3 Correlation synthesis

As discussed above, the channel correlation graphically traces the trend similarity, while the parameterized Pearson’s correlation approximates the distance similarity of two time series. Both approaches have certain limitations. However, the combination of channel and parameterized Pearson analysis potentially presents an optimal correlation extraction, as it computes correlation in a complementary manner with both trend and distance evaluations.

When the selected channel pattern matches the observed time series well, a small ξ often results in no correlation data being obtained by the channel correlation extraction of (11) or (13). In this case, however, the parameterized Pearson method is always able to extract correlation data within a proper correlation margin α, as it counts the general trend similarity rather than the strict point-to-point distance similarity. On the other hand, when the observed time series follows a zigzag path, the parameterized Pearson method also produces no correlation output, because two zigzag-shaped time series easily cause a large mismatch in (16). In this case, the channel method is able to trace the trend similarity between X and Y, as (10) surely produces a large \(d_t\) on time series in a zigzag path.

Apparently, the combination of channel and parameterized Pearson correlation characterizes the trade-off between trend similarity and distance similarity for correlation knowledge extraction. The correlation data obtained in a complementary manner by the two methods are expected to carry more weight than the data from either method alone. Therefore, we compose our correlation data by merging the correlation data from the two methods as,

$$ {\mathcal{C}}(X,Y) = {\mathcal{C}}_{c}(X,Y) \cup {\mathcal{C}}_{p} (X,Y). $$
(17)
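The union in (17) can be illustrated by merging the windows selected by the two methods. In the Python sketch below, each extracted subperiod is represented as a (start index, window) pair taken from the same series Y; this representation, the function name, and the sample values are assumptions made for illustration.

```python
def synthesize(channel_hits, pearson_hits):
    """Union of the correlation data from the two methods.

    Both inputs are lists of (start index, window) pairs taken from the
    same series Y, so equal start indices denote the same window.
    Returns the merged list and the start indices found by both methods.
    """
    merged = dict(channel_hits)
    merged.update(pearson_hits)
    common = sorted(set(s for s, _ in channel_hits) & set(s for s, _ in pearson_hits))
    return sorted(merged.items()), common

# Hypothetical outputs of the channel and parameterized Pearson extraction.
c_hits = [(1, [0.1, 1.0, 2.1]), (5, [0.0, 1.0, 2.0])]
p_hits = [(5, [0.0, 1.0, 2.0]), (7, [0.2, 0.9, 1.8])]
merged, common = synthesize(c_hits, p_hits)
print([start for start, _ in merged], common)
```

The second return value lists the windows selected by both methods; the union itself does not distinguish them, so any extra weighting of such windows would have to be handled separately.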

For a specific forex analysis, the above correlation analysis is carried out in our experiment for extracting: (1) correlation to historical market; (2) correlation to other currencies; and (3) correlation to microeconomic variables [34], respectively. The obtained correlation data are modeled as below for a correlation-aided SVR time series prediction.

4 Correlation-aided SVR time series prediction

Support vector regression (SVR) is the application of support vector machines (SVM) [35–37] to general regression analysis. SVR departs from more traditional time series prediction methodologies in that, in the strict sense, there is no “model”: the prediction depends only on the data [38].

Consider a forex time series x(t), where t represents the time point. Suppose the present time point is N; then a prediction of x for t > N is computed over the training data \(\mathcal{X}(t)=\{x(1),x(2),\ldots,x(N)\}\). Thus, the goal is to find a function f(x) that matches the actually obtained targets x(t) at the next time point for all the training data. According to [35], a non-linear support vector regression estimation of f(x) is computed as (18),

$$ f_{\hbox{SVR}}(x) = (w \cdot \phi(x)) + b, $$
(18)

where “\(\cdot\)” denotes a dot product, and \(\phi(x)\) is the feature mapping associated with the kernel function \(k(x,x^{\prime}) = \left\langle \phi(x),\phi(x^{\prime})\right\rangle,\) which enables performing a linear regression in a higher dimensional feature space.

To find an optimal set of parameters, the weight w and threshold b, two criteria are considered: firstly, the flatness of the weights is measured by the Euclidean norm (\({\left\|w\right\|}^2\)), and secondly, the empirical risk (error) is generated by the estimation process of the value. Thus, the overall goal is the minimization of the regularized risk \(R_{reg}(f)\),

$$ R_{reg}(f) =\frac{1}{N}\sum^{N-1}_{i=0}L(x(i),y(i),f(x(i),w)) +\frac{\lambda}{2}{\left\|w\right\|}^2, $$
(19)

where \(\frac{1}{N}\sum^{N-1}_{i=0}L(x(i),y(i),f(x(i),w))\) is the empirical risk, i is an index over the discrete time points t = {0, 1, 2,…,N − 1}, and y(i) is the predicted value being sought. L(·) is a “loss function” to be defined. λ is the capacity control factor, a scale factor regarded as a regularization constant, which reduces “over-fitting” of the data and minimizes negative effects on generalization.

To solve for the optimal weights and minimize the regularized risk, a quadratic programming problem is formed using the \(\epsilon\)-insensitive loss function, which is the most common loss function,

$$ \min\frac{1}{2}{\left\|w\right\|}^2 + C\sum^{N}_{i=1}L(y(i),f(x(i),w)), $$
(20)

where

$$ L(y(i),f(x(i),w)) = \left\{ \begin{array}{ll} 0 & \hbox{if }\left|y(i) - f(x(i),w)\right| \leq \epsilon\\ \left|y(i) - f(x(i),w)\right| - \epsilon & \hbox{otherwise.} \end{array} \right. $$
(21)
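The loss in (21) and the objective in (20) can be written out in a few lines. The Python sketch below evaluates them for given predictions; the function names and the numerical values are hypothetical, and no optimization is performed.

```python
def eps_insensitive_loss(y_true, y_pred, eps):
    """Loss of (21): zero inside the eps tube, linear outside it."""
    return max(0.0, abs(y_true - y_pred) - eps)

def svr_objective(w, c, targets, predictions, eps):
    """Objective of (20): 0.5 * ||w||^2 + C * sum of eps-insensitive losses."""
    flatness = 0.5 * sum(wi * wi for wi in w)
    risk = sum(eps_insensitive_loss(t, p, eps) for t, p in zip(targets, predictions))
    return flatness + c * risk

# A deviation of 0.05 lies inside a tube of width 0.1 and costs nothing;
# a deviation of 0.3 is penalized by the excess over eps.
print(eps_insensitive_loss(1.0, 1.05, 0.1))
print(eps_insensitive_loss(1.0, 1.3, 0.1))
print(svr_objective([3.0, 4.0], 2.0, [1.0, 1.0], [1.05, 1.3], 0.1))
```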

C is a positive constant that includes the (1/N) summation normalization factor, and \(\epsilon\) refers to the precision by which the function is to be approximated. Both are user-defined constants and can typically be determined by cross validation tests. Solving (20) is an exercise in convex optimization; thus, it is straightforward to use Lagrange multipliers and form the dual optimization problem as,

$$ \begin{array}{cc} \hbox{Maximize: } & -\frac{1}{2}\sum^{N}_{i,j=1}(a_i - a^{*}_{i})(a_j - a^{*}_{j})\left\langle x(i), x(j)\right\rangle \\ & -\epsilon\sum^{N}_{i=1}(a_i + a^{*}_{i}) + \sum^{N}_{i=1}y(i)(a_i - a^{*}_{i}) \\ \hbox{subject to: } & \sum^{N}_{i=1}(a_i - a^{*}_{i}) = 0 : a_i, a^{*}_{i} \in \left[0,C\right]. \end{array} $$
(22)

In this way, f(x) is approximated as the sum of the optimal weights times the dot products between the data points as:

$$ f(x) = \sum^{N}_{i=1}(a_i - a^{*}_{i})\left\langle x,x(i)\right\rangle + b , $$
(23)

where those data points on or outside the \(\epsilon\) tube with non-zero Lagrange multipliers a are defined as support vectors.
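Once the multipliers are known, evaluating (23) is a weighted sum of dot products. The Python sketch below illustrates this with hypothetical multipliers that satisfy the equality constraint of the dual problem but were not obtained by actually solving it; all names and values are assumptions.

```python
def dot(u, v):
    """Dot product of two equally long vectors."""
    return sum(a * b for a, b in zip(u, v))

def svr_predict(x, train_points, a, a_star, b):
    """Evaluate (23): sum_i (a_i - a*_i) <x, x(i)> + b."""
    return sum(
        (ai - asi) * dot(x, xi)
        for ai, asi, xi in zip(a, a_star, train_points)
    ) + b

def support_vector_indices(a, a_star):
    """Indices of the training points with a non-zero coefficient a_i - a*_i."""
    return [i for i, (ai, asi) in enumerate(zip(a, a_star)) if ai - asi != 0.0]

# Hypothetical training points and multipliers (the sum of a_i - a*_i is zero).
train_points = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
a = [0.5, 0.0, 0.0]
a_star = [0.0, 0.5, 0.0]
print(svr_predict([2.0, 4.0], train_points, a, a_star, b=0.1))
print(support_vector_indices(a, a_star))
```

Only the points with non-zero coefficients contribute to the sum, which is why the prediction depends on the support vectors alone.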

In comparison with traditional SVR time series prediction in (18), the proposed cSVR enhances the performance by incorporating correlation data for SVR model construction [39]. Thus, given an observed time series X and correlation data \(\mathcal{C}\) obtained by (17), then (18) is extended for correlation-aided SVR time series prediction as,

$$ f_{\hbox{cSVR}}(x') =(w\cdot\phi(x')) + b,\quad x'\in[X\cup{{\mathcal{C}}}], $$
(24)

where the observed time series X plus the correlation data \(\mathcal{C}\), extracted by the combined channel and parameterized Pearson analysis, are utilized jointly for SVR model construction. In other words, \(f_{\hbox{cSVR}}\) differs from \(f_{\hbox{SVR}}\) in that (21) and (22) are trained on \({[X\cup{\mathcal{C}}]}\) instead of X.
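One possible way to assemble such a joint training set is sketched below in Python. The text does not specify how the inputs x′ are encoded, so the lagged-window encoding (a fixed number of past values predicting the next value), the function names, and the sample numbers are all assumptions made for illustration.

```python
def lagged_pairs(series, lags):
    """Turn a series into (input window, next value) training pairs."""
    return [
        (series[i:i + lags], series[i + lags])
        for i in range(len(series) - lags)
    ]

def csvr_training_set(x, correlation_segments, lags):
    """Training pairs from the observed series x plus every correlated segment.

    Assumption: each segment of the correlation data is embedded with the
    same lag length as x and simply appended to the training set.
    """
    pairs = lagged_pairs(x, lags)
    for segment in correlation_segments:
        pairs.extend(lagged_pairs(segment, lags))
    features = [window for window, _ in pairs]
    targets = [target for _, target in pairs]
    return features, targets

# Hypothetical observed series and one hypothetical correlated segment.
x = [1.0, 2.0, 3.0, 4.0]
segments = [[10.0, 11.0, 12.0]]
features, targets = csvr_training_set(x, segments, lags=2)
print(features)
print(targets)
```

The resulting features and targets could then be passed to any SVR solver in place of the pairs built from X alone, leaving the SVR formulation itself unchanged.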

5 Experiments and discussions

We examined the proposed cSVR for exchange rate time series prediction on five contracts: NZD–AUD, NZD–EUD, NZD–GBP, NZD–JPY, and NZD–USD. The time periods used are listed in Table 1, and the daily closing prices are used as the data sets. The presented study used stock market data from each observed country as assistant analysis data. Table 2 gives the information on the 5 stock markets (NZX 50, S&P/ASX 200, FTSE 100, Nikkei 225, NYSE).

Table 1 Five future contracts
Table 2 Five assistant analysis data sets

The proposed cSVR is implemented in MATLAB (version 7.6.0) on a 1.86 GHz Intel Core 2 PC with 2 GB RAM. In the experiment, we conducted cross validation tests and finally set γ to 250 for the RBF SVR and the parameter α to 0.07 for the parameterized Pearson’s correlation analysis. The regression period of the time series, N, is generally determined by traders’ experience. In our experiment, N is fixed at 20 by cross validation prediction tests on NZD/AUD for 2006. To exhibit the advantages of our method, we evaluate prediction performance by means of the directional symmetry (DS), mean squared error (MSE), root mean squared error (RMSE), normalized mean square error (NMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE).
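For reference, these evaluation measures can be computed as in the Python sketch below. The exact definitions used in the experiments are not given in the text; the sketch assumes commonly used forms, namely NMSE as the MSE divided by the variance of the actual series, MAPE and DS expressed in percent, and DS as the share of time steps at which the predicted and actual changes have the same sign. The sample values are hypothetical.

```python
import math

def forecast_errors(actual, predicted):
    """Error measures for a forecast (definitions assumed, see lead-in).

    Requires equally long series with at least two points, non-zero actual
    values (for MAPE) and a non-constant actual series (for NMSE).
    """
    n = len(actual)
    if n != len(predicted) or n < 2:
        raise ValueError("series must have the same length (at least 2)")
    errors = [a - p for a, p in zip(actual, predicted)]
    mse = sum(e * e for e in errors) / n
    mean_actual = sum(actual) / n
    variance = sum((a - mean_actual) ** 2 for a in actual) / n
    same_direction = sum(
        1
        for i in range(1, n)
        if (actual[i] - actual[i - 1]) * (predicted[i] - predicted[i - 1]) >= 0
    )
    return {
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "NMSE": mse / variance,
        "MAE": sum(abs(e) for e in errors) / n,
        "MAPE": 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / n,
        "DS": 100.0 * same_direction / (n - 1),
    }

# Hypothetical actual and predicted rates.
metrics = forecast_errors([1.0, 2.0, 4.0, 3.0], [1.0, 2.5, 3.5, 3.0])
print(metrics)
```

Lower values are better for the five error measures, whereas a higher DS indicates that the direction of change is predicted correctly more often.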

5.1 Experimental results

Table 3 shows the results of forex time series prediction from 2 Jan, 2007 to 31 Dec, 2008 for the 5 currency pairs, respectively. As seen from the table, the cSVR in general shows a clearly better capability than SVR on forex time series prediction in terms of MSE, RMSE, NMSE, MAE, and MAPE. On DS, although the cSVR does not outperform SVR, there is no particular difference between the DS of SVR and that of cSVR.

Table 3 Statistical results of cSVR versus SVR for 5 contracts exchange rates prediction

Among the 5 currency pairs, it is worth noting that the most obvious evidence for cSVR is shown in the NZD/JPY prediction. As can be observed in Table 3d, the MSE produced by cSVR is over 3 times smaller than that produced by SVR in both 2007 and 2008. The RMSE of the cSVR prediction for 2007 is about 6 times smaller than that of SVR. The NMSE of cSVR is 40 times smaller in 2007 and 3 times smaller in 2008 than that of the SVR prediction. Also for MAE and MAPE, the cSVR gives significantly smaller errors than SVR.

For the daily exchange rate forecast by SVR, the left diagrams in Figs. 4 and 5 show the differences between the predicted and the actual time series of the 5 contracts’ exchange rates for 2008. As seen from the diagrams, the predicted prices do not fit the actual prices well in the prediction of the five future contracts. Obvious gaps between the two curves indicate the high level of prediction errors from SVR.

Fig. 4 cSVR vs. SVR on daily exchange rates (NZD/AUD, NZD/EUD, and NZD/GBP) prediction in 2008

Fig. 5 cSVR vs. SVR on daily exchange rates (NZD/JPY and NZD/USD) prediction in 2008

As a comparison, the right diagrams in Figs. 4 and 5 present the daily exchange rate forecast from cSVR. As seen, the prediction from cSVR is consistently better than the prediction from SVR for NZD/AUD, NZD/GBP, NZD/JPY, and NZD/USD in 2008. It is noticeable that the gaps occurring in the SVR prediction have either disappeared or been mostly reduced in the cSVR prediction. However, some downward/upward overfitting occurs in the cSVR prediction, which makes cSVR perform worse than SVR at some points. For example, for the prediction of NZD/AUD during 07 to 11 Jun, 2008, cSVR is seen in Fig. 4b to suddenly lose accuracy, performing even worse than SVR. This could be explained by the fact that the correlation data might present a trend different from, or in conflict with, the state indicated in the observed time series, which eventually causes overfitting in cSVR training [40, 41]. Nevertheless, the general contribution of the extracted correlation knowledge to forex market trend prediction is confirmed by the statistics for the predictions over the whole of 2008.

6 Conclusions and directions for future work

In the literature, support vector regression (SVR) has been researched for financial time series forecasting. The SVR studies fall mostly into three categories: (1) Modified SVR. For instance, Tay et al. [42] proposed the C-ascending support vector machine, a modified version of support vector machines to model non-stationary financial time series; Van et al. [43] applied the Bayesian evidence framework to least squares support vector machine (LS-SVM) regression to infer non-linear models for predicting a financial time series and the related volatility; and Cao et al. [44] proposed dynamic support vector machines (DSVMs), modifying SVR to model non-stationary time series. (2) Integrated SVR. For instance, Lu et al. [45] developed a two-stage approach using independent component analysis (ICA) and SVR in financial time series forecasting, and Huang [46] and Cao [47] considered hybridizing SVR with the self-organizing map (SOM) to reduce the cost of training time and to improve prediction accuracy. (3) Parameter adapted SVR. In this category, Cao [48, 49] and Min [50] extensively studied the variability of SVR with respect to its parameters, toward developing parameter adaptive SVR by diverse means.

Unlike the above SVR finance applications, this paper models the composition of SVR training data in the context of forex time series prediction by incorporating informative correlation data to the observed market (e.g. forex NZD/USD) time series into standard SVR learning. In other words, the original SVR is unchanged; however, SVR is empowered by adding additional correlation data for training. Thus, we concentrate on computational correlation extraction over available forex market and microeconomic data. The discovered correlation is a synthesis of channel and parameterized Pearson’s correlation, in which the channel method traces the trend similarity of two time series, and the parameterized Pearson’s correlation filters noise in correlation extraction. The proposed cSVR is tested on time series prediction with 5 future contracts (NZD/AUD, NZD/EUD, NZD/GBP, NZD/JPY, and NZD/USD) within the period from January 2007 to December 2008. The experimental results show that the cSVR outperforms SVR consistently for exchange rate prediction on all 5 contracts in terms of the error functions MSE, RMSE, NMSE, MAE, and MAPE.

The cSVR prediction is sometimes found to drift unexpectedly far away from the true value, which implies that, despite the significance of the proposed correlation, how to use and fuse correlation into the present market data remains a challenge that prevents us from further enhancing market understanding through computational analysis. In addition, the selection of macroeconomic factors and the determination of the time period N for analysis are two computationally essential points worth addressing further in future forex market correlation analysis.