1 Introduction

The forex (foreign exchange) market is a global, decentralised market on which the main instruments are currencies. It is often described as the most liquid market in the world, with a daily turnover running into billions of dollars. On the foreign exchange market, every instrument is a pair of currencies: the first element is the base currency and the second element is the quote currency. The exchange rate of a currency pair changes dynamically over time and many factors have a strong impact on it. In general, the price of every currency pair results from the balance of demand and supply. The market consists of four overlapping sessions located in different time zones (the hours below are given in GMT+1):

  • London (European session): 8 a.m.–4 p.m.;

  • New York: 1 p.m.–9 p.m.;

  • Sydney: 10 p.m.–6 a.m.;

  • Tokyo: midnight–8 a.m.
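
The session hours listed above can be encoded directly. The following is a minimal sketch: the helper name `active_sessions` is hypothetical, and treating the closing hour as exclusive is an assumption of the sketch, not something the text specifies. It shows which sessions are open at a given hour (GMT+1), including the Sydney session, which wraps past midnight.

```python
# Session hours in GMT+1 as (open_hour, close_hour), taken from the list above.
# Close hours are treated as exclusive (an assumption of this sketch).
SESSIONS = {
    "London": (8, 16),
    "New York": (13, 21),
    "Sydney": (22, 6),   # wraps past midnight
    "Tokyo": (0, 8),
}


def active_sessions(hour):
    """Return the names of the sessions open at the given hour (0-23, GMT+1)."""
    open_now = []
    for name, (start, end) in SESSIONS.items():
        if start < end:
            is_open = start <= hour < end
        else:  # the session spans midnight
            is_open = hour >= start or hour < end
        if is_open:
            open_now.append(name)
    return open_now
```

For example, `active_sessions(14)` falls into the overlap of the London and New York sessions.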

One of the most important mechanisms on this market is leverage. As opposed to regular markets (such as the stock exchange), trading on the foreign exchange market does not require a considerable amount of money. In its simplest definition, leverage allows a trader to open a position on any currency pair while covering only part of its value with their own capital. Such an approach makes the market considerably more accessible to persons with small capital, and it is one of the most important features that attract small, private investors. More details on the forex market can be found, e.g. in [22].

Stock data (including foreign exchange market data) are often considered to be chaotic data without a visible point attractor. In other words, there can be no expectation that over a longer time horizon the values of a currency pair will converge to a single value (a constant price). In general, there are three different attractors: the constant point attractor, the cycle attractor and the chaotic attractor. Many articles have indicated that for the first two attractors future values can be predicted with high probability. As opposed to those, predictions for chaotic data often carry a large error [8]. On the other hand, there is a large group of mathematical methods known as technical analysis. Such methods can be very helpful in determining the direction of a price movement as well as its range. One of the most popular tools based on the mathematical and statistical approach is the moving average, a value calculated on the basis of the last n price values. The high efficiency of such tools was shown, for example, in [9].
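
As an illustration of the moving average mentioned above, the following minimal sketch computes a simple (unweighted) moving average over the last n prices. The function name `simple_moving_average` is hypothetical, and the unweighted variant is an assumption; the text does not say which type of moving average is meant.

```python
def simple_moving_average(prices, n):
    """Return the simple moving average over the last n prices at each step.

    The result has len(prices) - n + 1 values; the first one covers prices[0:n].
    """
    if n <= 0 or n > len(prices):
        raise ValueError("n must be between 1 and len(prices)")
    window_sum = sum(prices[:n])
    averages = [window_sum / n]
    for i in range(n, len(prices)):
        # Slide the window: add the newest price, drop the oldest one.
        window_sum += prices[i] - prices[i - n]
        averages.append(window_sum / n)
    return averages
```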

The whole concept of technical analysis is based on three rules: the market discounts everything, prices move on the basis of trends, and history repeats itself. Those assumptions were described in Dow theory, which is often considered a precursor of technical analysis. Dow theory already existed at the end of the 19th century but was described in detail in the 20th century. The work of Rhea [17] is considered to be one of the most important on Dow theory. At the end of the 20th century interest in technical analysis was still growing. In 1992, one of the first formal studies on the effectiveness of technical analysis was presented in [20]. A similar study was presented in [19].

In this article we propose a novel approach to transforming forex data into a decision table. We assume that in every time step there are three possibilities: a buy signal, which allows a buy position to be opened; a sell signal, which allows a sell position to be opened; and wait, which means staying idle. This article may be considered a preliminary study on the effectiveness of such a classification approach, in which the decision classes are visibly unequal in size. The research is conducted on the basis of well-known decision tree algorithms. The article is organised as follows: first we provide the background for the problem, then we propose a novel concept of acquiring financial data and transforming the data into decision tables. We also give a brief description of decision tree algorithms. Finally, we present the preliminary experiments and end with some conclusions.

2 Related Works

Automated forex systems have attracted much attention over the past few years. The high popularity of such systems is the result of continually growing interest in technical analysis. Besides fundamental analysis, it is one of the major components of modern trading systems. However, the effectiveness of technical analysis is still unclear. In [6] the authors investigated over 90 different articles related to technical analysis, and only 56 of them had positive conclusions. In 20 articles, technical analysis did not generate satisfactory results.

The concept of building an automated forex system based on computer science algorithms is very popular. One of the most often used approaches involves artificial neural networks. The effectiveness of this type of algorithm was recently demonstrated in multiple articles, such as [21], thus it is natural to use this concept in economics. One of the first articles involving a neural network as a tool for generating trading signals was [5]. The popularity of such an approach has grown with time and many similar concepts have been introduced, e.g. in [14] the authors proposed a web-based decision support system built on neural networks. One of the newest articles on the application of neural networks to the forex market is [7].

The second subject of interest in the foreign market is the data prediction problem. One of the most interesting articles dealing with this problem is [10]. In this article the authors propose a transformation of the financial time series into fuzzy grain particle sequences. The main goal of this approach is to use a support vector machine to calculate the upper and lower bounds of those particles. There are also very complex systems based on classical fuzzy time series analysis, e.g. [1]. In this approach the authors propose an algorithm consisting of elements such as voting, statistical analysis and emotional decision-making agents.

Besides the above articles, there are a few approaches involving evolutionary computation, e.g. [3], in which the evolutionary programming mechanism is developed. Genetic algorithms were a point of interest in [12]. The authors tested their approach on the French stock market and the overall results were compared with one of the most well-known strategies: “buy and hold”. A method based on the recognition of patterns was proposed in [4]. An interesting fact is that Japanese candlesticks, which are one of the most frequently used elements of technical analysis, have not been a very popular subject in research articles. Modern financial data visualisations, such as Renko or Kagi charts, still remain beyond mainstream research interest.

3 Data Preparation

The data acquisition process is one of the most important elements of the proposed approach. Currently, there are many different concepts for deciding when to open a position on the forex market. These involve using advanced charts (e.g. Renko or Kagi), predefined sets of technical indicators and even fundamental analysis. The emerging question is as follows: how does one precisely define the price direction on the basis of technical analysis indicator values? In other words, the problem is to define an element of the decision table with a set of attributes, i.e. values of different indicators at some period of time. Such an element should also have a properly assigned value from the 3-element set: buy, sell and wait. We assume that the buy decision describes a situation in which the price of a given instrument will rise by some predefined value. The sell decision describes a situation in which the price of the instrument drops by some predefined value. All remaining situations are classified as the wait position. An example fragment of the data set creation process is presented in Fig. 1.

Fig. 1 Data set creation process

As can be observed, every situation on the market is described by a number of parameters. The decision to buy or sell is set only if a significant price movement is observed on the chart. In other cases the decision takes the default value (wait). Such an approach ensures that the signal to open an order is generated only on the basis of the price movement and technical analysis indicators, thus the Commodity Channel Index or the Relative Strength Index do not have an effect on the decision. Unfortunately, due to the lack of space we omit detailed information on the technical analysis indicators used in this paper. More information on technical analysis can be found, e.g. in [11]. The proposed approach points to a significant problem related to the number of objects in each of the three decision classes. It is understandable that the number of elements belonging to the wait class will be much larger than the size of the two remaining classes. Such a situation raises the question of whether classification of objects in such a data set can be effective. We will look closer at this problem in the “Experiments” section.
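
The labelling rule described in this section can be sketched as follows. This is a minimal sketch under stated assumptions, not the exact procedure used to build the data sets: the function name `label_decisions`, the look-ahead `horizon` (the text does not say how far ahead the price movement is measured), the use of closing prices, and the choice of whichever threshold is reached first are all assumptions made for illustration.

```python
def label_decisions(closes, threshold_pips, horizon, pip_size=0.0001):
    """Assign BUY, SELL or WAIT to each time step from future closing prices.

    BUY:  within the next `horizon` steps the price first rises by at least
          `threshold_pips` pips above the current close.
    SELL: the price first drops by at least that amount.
    WAIT: neither happens within the horizon (the default class).
    """
    threshold = threshold_pips * pip_size
    labels = []
    for t, price in enumerate(closes):
        label = "WAIT"
        for future in closes[t + 1 : t + 1 + horizon]:
            if future - price >= threshold:
                label = "BUY"
                break
            if price - future >= threshold:
                label = "SELL"
                break
        labels.append(label)
    return labels
```

With a large threshold and a short horizon, most time steps fall through to WAIT, which is the source of the class imbalance discussed above.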

4 Decision Trees

We refer readers who want to learn more about decision trees to [18]. Below we give only brief information on the features that are crucial in the context of using decision trees for financial data. Decision trees have a very simple and intuitive construction and may be described as an acyclic, directed graph in which all vertices are called nodes, the edges are branches, nodes without descendants are leaves, and the root of the decision tree is the node without a parent. All internal nodes contain tests on the conditional attributes, which are created on the basis of some predefined splitting rule. This splitting rule divides the data according to the values of those attributes, and every outcome of a test is represented by a branch.

A decision tree is constructed from a training set which consists of objects. Each object is completely described by a set of attributes and a class label. In our problem the set of attributes comprises values of technical analysis indicators and the class label is one of the decisions: “sell”, “buy” and “wait”. All internal nodes contain splits which test the value of an expression of the attributes. Arcs from an internal node t to its children are labeled with distinct outcomes of the test at t. Finally each leaf node has one class label.
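
The structure described above (internal nodes with tests, branches for test outcomes, leaves with class labels) can be sketched as a small data structure. The class names `Node` and `Leaf`, the helper `classify`, and the thresholds in `example_tree` are hypothetical and purely illustrative; they are not taken from the experiments, and binary threshold tests are only one possible form of split.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    label: str  # one of "BUY", "SELL", "WAIT"


@dataclass
class Node:
    attribute: str      # name of the tested indicator
    threshold: float    # test: value <= threshold
    if_true: Union["Node", Leaf]
    if_false: Union["Node", Leaf]


def classify(tree, obj):
    """Follow the tests from the root until a leaf is reached."""
    while isinstance(tree, Node):
        passed = obj[tree.attribute] <= tree.threshold
        tree = tree.if_true if passed else tree.if_false
    return tree.label


# Hypothetical tree; the thresholds are illustrative only.
example_tree = Node(
    "RSI", 30.0,
    if_true=Leaf("BUY"),
    if_false=Node("RSI", 70.0, if_true=Leaf("WAIT"), if_false=Leaf("SELL")),
)
```

Each root-to-leaf path of such a tree can be read as one decision rule, e.g. "if RSI <= 30 then BUY".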

One of the important advantages of decision trees is the possibility of performing a fast analysis (significantly faster than with classical methods). Moreover, decision trees can easily be stored as decision rules, which allows them to be used in systems that operate directly on decision rules. This advantage promotes future development of the technique. In addition, groups of classifiers presented as decision forests seem especially useful when decision trees are built with stochastic methods. In general, the construction of a single decision tree is based on the “divide and conquer” rule, i.e. recursive division of the data. Such an approach divides one single problem into smaller problems.

The task of constructing a tree from the training set is called tree induction, tree building or tree growing. Most tree induction systems use the greedy approach, i.e. trees are induced top-down, where the first selected attribute (possibly the best one) becomes the root of the tree. The child nodes are split according to the same rule. Usually, pruning is carried out just after the decision tree is created in order to prevent overtraining of the classifier. In the literature we can find multiple algorithms for decision tree construction. Some of the most popular state-of-the-art algorithms are CART and C4.5. A detailed overview of other similar algorithms can be found in [15].

4.1 The CART Algorithm

The CART method addresses the classification and regression problem by building a binary decision tree according to some splitting rule. Each split is based on a single variable; some variables may be used several times while others may not be used at all. Each sub-leaf is then split further based on independent rules.

Decision trees constructed by the CART algorithm are mostly generated with the Gini split criterion:

$$\begin{aligned} D_t = \sum _{i \ne j} p(i|t) \cdot p(j|t) = 1 - \sum _i p(i|t)^2 \end{aligned}$$
(1)

where p(i|t) is the probability of decision class i in node t. The Gini splitting rule is based on the Gini index, a measure of the concentration of a random variable. The main goal in this case is to obtain child nodes that are as homogeneous as possible. The procedure finishes when no admissible split remains. In general, the final tree overfits the available data, which inflates the prediction error. The main goal is to minimise that prediction error; to achieve this, the construction of such a tree involves a pruning procedure. Pruning methods try to produce a subtree whose expected prediction error is lower than the error before pruning.
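
The Gini index from Eq. (1) can be computed directly from the class labels in a node. The following is a minimal sketch; the function names `gini_index` and `weighted_gini` are hypothetical, and weighting the child nodes by their size is the usual way to score a binary split rather than something stated in the text.

```python
from collections import Counter


def gini_index(labels):
    """Gini index 1 - sum_i p(i|t)^2 for the class labels in one node t."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / total) ** 2 for c in counts.values())


def weighted_gini(left_labels, right_labels):
    """Gini index of a binary split, weighted by the child node sizes."""
    total = len(left_labels) + len(right_labels)
    return (len(left_labels) * gini_index(left_labels)
            + len(right_labels) * gini_index(right_labels)) / total
```

A node with nine WAIT objects and one BUY object already has a Gini index of about 0.18, so on data dominated by a single class there is little impurity left for a split to remove.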

4.2 The C4.5 Algorithm

The C4.5 algorithm was proposed by Quinlan [16] and is an improvement of his earlier ID3 algorithm. The main improvements concern the splitting criterion, which makes it possible to limit the classification error, and the handling of missing attribute values, which are no longer a problem. C4.5 uses a splitting rule based on information gain, which is calculated for every candidate attribute (the attribute with the highest value is selected):

$$\begin{aligned} Gain(S, A) = Entropy(S) - I(S, A) = Entropy(S) - \sum _i \frac{|S_i|}{|S|} \cdot Entropy(S_i) \end{aligned}$$
(2)

where S is the set of examples and S_i is the i-th subset of S obtained by splitting on attribute A. Pruning is also used in the C4.5 algorithm. During the learning process it is possible to work with objects with some missing attribute values. The algorithm also handles continuous attributes.
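
The gain from Eq. (2) can be sketched in a few lines. The function names `entropy` and `information_gain` are hypothetical, and the base-2 logarithm is an assumption, since the text does not state the base. The sketch covers only the gain itself, not the remaining parts of C4.5 such as pruning or the handling of missing values.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of the class labels in a set S."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(labels, subsets):
    """Gain(S, A) = Entropy(S) - sum_i |S_i|/|S| * Entropy(S_i).

    `subsets` holds the label lists of the subsets S_i produced by attribute A.
    """
    total = len(labels)
    remainder = sum(len(s) / total * entropy(s) for s in subsets if s)
    return entropy(labels) - remainder
```

A split that separates the classes perfectly recovers the full entropy of S as gain, while a split that leaves S unchanged yields a gain of zero.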

In our experiments we used the two algorithms described above. The CART algorithm is very often considered a state-of-the-art algorithm. It may also be used to point out the weak elements of the proposed approach. The second choice was C4.5, which is the best known and (probably) the most widely used learning algorithm.

5 Experiments

Our goal was to test experimentally whether it is possible to present forex market data in the form of rules and to use them in the classification problem. The real-world data were adapted to the format of the WEKA system, which was used to generate the results of the experiments. All of the results and their analysis are presented below. All experiments were conducted for three algorithms (with default settings): the CART algorithm, the C4.5 algorithm and Random Tree.

5.1 Data Sets

We prepared 9 real-world data sets covering three currency pairs. EURUSD was chosen because it is the most liquid instrument on the currency market. The GBPUSD pair is often considered to be one of the most volatile pairs. Finally, USDJPY was chosen as the pair most strongly correlated with technical analysis. For every instrument we prepared 3 sets which differed in the number of pips (1 pip is a 0.0001 change in the price of the instrument). The numbers were equal to 30, 45 and 60; e.g. 45 means that a movement of at least 45 pips must be observed for a set-up to be assigned to the BUY or SELL class. It is also worth noting that every data set consists of three decision classes: SELL, BUY and WAIT. Due to the specifics of these data, the vast majority of the elements belong to the WAIT class. Detailed parameters of the data sets are presented in Table 1.

Table 1 Original parameters in data sets

The data sets were randomly divided (sampling without replacement) into two sets: a training set (9000 objects) and a test set (2998 objects). Every data set consisted of 8 conditional attributes (numeric) and one decision attribute (nominal). Conditional attributes were selected on the basis of the most often used technical analysis indicators and are given as follows: Alligator, Bulls strength, Bears strength, CCI (Commodity Channel Index), Demarker, RSI (Relative Strength Index), Stochastic Oscillator, and Williams indicator. All indicators were used with their standard parameters.
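
The random division described above can be sketched as follows. This is a minimal sketch assuming each object is stored as an (attributes, decision) pair; the function names `split_without_replacement` and `class_share` and the fixed `seed` are hypothetical. With 9000 training objects and 2998 test objects, each data set holds 11998 objects in total.

```python
import random


def split_without_replacement(objects, train_size, seed=0):
    """Randomly split objects into a training set and a test set.

    Every object lands in exactly one of the two sets.
    """
    shuffled = list(objects)
    random.Random(seed).shuffle(shuffled)
    return shuffled[:train_size], shuffled[train_size:]


def class_share(objects, label):
    """Fraction of (attributes, decision) objects whose decision is `label`."""
    return sum(1 for _, decision in objects if decision == label) / len(objects)
```

Comparing `class_share(train, "SELL")` with `class_share(test, "SELL")` after the split shows how far the class ratios of the two sets drift apart; a stratified split would keep them equal, but the text describes a plain random division.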

5.2 Results of Experiments

As can be observed in Table 1, the proposed data sets are very difficult to analyse. The vast majority of elements (86–98 %) in the training set belong to the WAIT class. The same situation occurs in the test set, where 86–99 % of the elements are in the WAIT class. An even more serious problem is that the random division in many cases leads to a different ratio of objects (in the same class) in the training and test sets. In many cases that difference is as large as 50 %, e.g. for EURUSD 30p, 6.7 % of objects fall into the SELL class, whereas only 2.9 % of objects in the training set belong to this class.

Because of the difficulty of the data sets, several different measures were used. All preliminary results are presented in Tables 2, 3 and 4. All results should be analysed with respect to the different measures as well as the parameters given in Table 1.

An analysis of the results given in Table 2 (the accuracy of the classification and the size of the decision tree) suggests that it is possible to use algorithms for decision tree construction as a tool for financial data prediction. However, it should be pointed out that in some cases the best results (the best according to the accuracy measure) are given by the CART algorithm with decision trees of height 1. As may be expected, such trees classify objects by assigning all of them to the most numerous WAIT class.

Table 2 Comparative study—accuracy rate
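
To see why accuracy alone is misleading on such data, it helps to compute the accuracy of a classifier that always predicts the most frequent training class, which is what a tree of height 1 effectively does here. The following is a minimal sketch; the function name `majority_baseline_accuracy` is hypothetical.

```python
from collections import Counter


def majority_baseline_accuracy(train_labels, test_labels):
    """Accuracy of always predicting the most frequent training class."""
    majority_class, _ = Counter(train_labels).most_common(1)[0]
    hits = sum(1 for label in test_labels if label == majority_class)
    return majority_class, hits / len(test_labels)
```

If 86 % of the test objects belong to the WAIT class, this baseline reaches an accuracy of 0.86 without ever issuing a BUY or SELL signal, so the accuracy of any tree should be read against this value.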

For a more effective analysis, average values of the remaining measures are given. The set of measures contains: recall, precision, F-measure and ROC area. The results for those measures (Table 3) confirm that the observation regarding the predominance of a single class is correct. Of course, in some examples other objects are properly classified as well. Especially interesting are the results for the Random Tree algorithm, for which the ROC area value is frequently better than for either the CART or C4.5 algorithms.

Table 3 Average values of different measures for currency pairs (italicised numbers denote the best results)

This article is a preliminary study in which we dealt with the possibility of using decision trees as a tool for developing an effective transaction system. Thus the confusion matrix is one of the crucial elements of our experiments. The results for the selected currency pairs and algorithms are presented in Table 4. These are example matrices of the kind that occurred most frequently in the analysed experiments. The values of these matrices point to the problem related to the WAIT class: the vast majority of objects are assigned to this class. It is very important to point out that in real-world problems such a situation is, from the viewpoint of the transaction system, far better than wrongly assigning an object to the BUY or SELL class. Such an approach coincides with the transaction systems used on the forex market, where the most frequent and safest option is to stay out of the market and not to open an order if the situation is not clear. It is worth noting that in some cases the selected algorithms do surprisingly well even for the two remaining decision classes.

Table 4 Confusion matrix for selected pairs
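
Precision, recall and F-measure for each decision class follow directly from a confusion matrix. The following is a minimal sketch: the function name `per_class_metrics` and the class order are assumptions, and the matrix `example` is hypothetical and not taken from Table 4.

```python
CLASSES = ["BUY", "SELL", "WAIT"]


def per_class_metrics(matrix):
    """Precision, recall and F-measure per class from a confusion matrix.

    matrix[i][j] is the number of objects of actual class i predicted as class j.
    """
    metrics = {}
    for k, name in enumerate(CLASSES):
        true_pos = matrix[k][k]
        predicted = sum(row[k] for row in matrix)   # column sum
        actual = sum(matrix[k])                     # row sum
        precision = true_pos / predicted if predicted else 0.0
        recall = true_pos / actual if actual else 0.0
        denom = precision + recall
        f_measure = 2 * precision * recall / denom if denom else 0.0
        metrics[name] = (precision, recall, f_measure)
    return metrics


# Hypothetical matrix (not taken from Table 4): rows are actual classes,
# columns are predicted classes, both in the order BUY, SELL, WAIT.
example = [
    [5, 0, 45],
    [0, 0, 50],
    [5, 0, 895],
]
```

In this hypothetical matrix the WAIT class has a high recall, while BUY has a precision of 0.5 and a recall of only 0.1. From the viewpoint of a transaction system, the precision of BUY and SELL matters most, because a false signal opens a losing position whereas a missed signal only forgoes a trade.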

6 Conclusions

The analysis presented in the article confirms that it is possible to effectively transform forex financial data into a decision table which may be used to generate decision trees and to perform satisfactory classification. The results given in the “Experiments” section indicate that in the future it will be possible to use such algorithms in the process of building forex systems. It should be pointed out that several different classification measures have to be considered, because only a general view allows such results to be interpreted properly.

It is worth noting that even in this preliminary study the algorithms did surprisingly well at avoiding rash decisions, although it would be beneficial to improve the quality of the classification, especially for the two remaining classes (BUY and SELL). The analysis of the results suggests that for this type of problem approximate algorithms should be far more effective than the classical approaches. Algorithms which can be optimised for specific classification measures, such as [13], or ensemble methods [2] should be checked in the near future.