1 Introduction

Foreign Exchange (FOREX) refers to the foreign exchange market in which the currencies of the world are traded and where an exchange rate is formed as a single price that enables the transfer of funds from one currency to another [7]. It is the largest and most liquid market in the world. An advantage of this market is the possibility to trade twenty-four hours a day, except Saturday and Sunday. The main function of this market is the exchange of money from one currency to another for investing, travel, international trade, etc. The market is important because many states have different national currencies and need to trade with each other.

The motivation for the presented experiments included two factors: a better understanding of the behaviour of the foreign exchange market, and the possible use of the designed system in real business cases with the aim of making a profit through trading on the Forex market. It is important to state that profit was not the primary goal of our work, but it represents one of the specified main success criteria.

In general, the main reason for working with Forex data is the prediction of their future development. A correct prediction represents a major advantage for market participants. Several approaches are used to define suitable strategies, but the problem with strategies based on technical analysis is that they do not specify the conditions under which the strategy should be deployed. The result is a highly variable prediction accuracy, caused by changing conditions that the strategy does not account for. This fact motivated us to design our own strategy, whose main requirement was to determine which currency is globally strong and which is globally weak, and to apply the chosen strategy to a currency pair formed from a strong and a weak currency. The result is trading on the financial markets under the best conditions, i.e. with the highest chance of success.

To meet this requirement, we designed and implemented a decision support system (DSS) for trading on the foreign exchange market. The system exploits the possibility to speculate on this market and recommends the future trading direction in line with extracted rules, economic news and outputs of the technical analysis. The rules extracted from historical Forex data by means of suitable data mining methods represent one main aspect of the proposed DSS. This extraction, in combination with the selected methods of technical analysis, is the crucial part of the whole recommendation process. Our goal was not to generate a model with the highest accuracy but to extract rules that really appear on the market. Besides the main tasks described above, the data mining should help to answer two questions: what is the probability that negative economic news will affect the whole market negatively and that positive economic news will have a positive impact, and whether including the fundamental news and technical analysis in the decision process will have a significant positive impact.

We think that it is important to find out whether determining the global strength of a currency creates an opportunity for trading in terms of its future development. The aim of this work is to verify the hypothesis that in this market we can determine the direction of a currency’s movement based on the movement of the related currency pairs containing this currency. From all relevant currency pairs, we derived a statement whether the currency is strong, weak or has a neutral development. Therefore, the idea of the proposed experiments is first to abstract the actual trend from the price, then to determine the strength of the currency, and finally to make a prediction for the future development. The success criterion is at least 55% accuracy of the generated model in determining the currency’s strength. All experiments were performed on the major world currencies, i.e. the Euro, US dollar, British pound, Australian dollar, New Zealand dollar, Japanese yen, Canadian dollar and Swiss franc [1]. We took this decision due to their high liquidity compared to other currencies and relatively low fees.
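The idea of deriving a currency’s strength from all pairs that contain it can be illustrated with a minimal sketch. It is not the procedure used in the experiments: the simple vote over pair directions, the function name `usd_strength` and the threshold `min_margin` are assumptions made only for illustration.

```python
def usd_strength(pair_moves, min_margin=2):
    """Classify the U.S. dollar as 'strong', 'weak' or 'neutral'.

    pair_moves maps a pair name such as "EUR/USD" to the sign of its
    latest move: +1 (pair rose), -1 (pair fell) or 0 (flat).
    min_margin is a hypothetical threshold on the vote difference.
    """
    score = 0
    for pair, move in pair_moves.items():
        base, quote = pair.split("/")
        if base == "USD":
            score += move   # USD/XXX rising means a stronger dollar
        elif quote == "USD":
            score -= move   # XXX/USD rising means a weaker dollar
    if score >= min_margin:
        return "strong"
    if score <= -min_margin:
        return "weak"
    return "neutral"
```

In this sketch a falling EUR/USD and a rising USD/JPY both count as votes for a strong dollar, so the label reflects agreement across pairs rather than the movement of a single pair.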

The article contains four main parts. The first introduces trading on the forex markets and our motivation to design a DSS to support this trading. The second describes the proposed DSS and the performed experiments in accordance with the CRISP-DM methodology [17]. The third is devoted to the evaluation of the obtained rules with the aim to confirm or reject the specified hypothesis and to meet the specified objectives and criteria. The last one concludes the article with proposed directions for future work.

1.1 Related Works

Fredrik Larsen in his work [10] proposed an automatic stock market trading system based on technical analysis. This system focused on monitoring the development of 24 stocks. It works with data from 1970 to 1998 and the whole system consists of 26 agents. Each agent represents one of the indicators or a related price formation from the technical analysis. The author used several methods of artificial intelligence and machine learning, such as neural networks to determine the purchase or sale, and the decision trees ID3 and J48 available within the supporting tool Weka. Each agent works alone and produces a signal evaluated by a weight. These weights are optimised by a genetic algorithm and combined according to a specified formula to determine the future direction of the prediction. The result is a system in which each agent has its own accuracy and this factor determines the strength of its voting rights. Users cannot use the proposed system for real trading because its experimental evaluation with real data was not successful. In addition, it does not take into account business parameters such as Stop Loss and Take Profit, or economic fundamental news.

Peachavanish in his work [14] proposed a method using cluster analysis to identify a group of stocks that has the best trend and momentum characteristics at a given time. The author performed his experiments on five-year historical price data of stocks listed on the Stock Exchange of Thailand. In order to compare the price trend and momentum of the different stocks, he calculated technical indicators for both the long-term and the short-term trend. Since the technical indicators used in the proposed method by their nature lag behind prices, it took some time for the selection process to recover and outperform the market in the long run.

A collective of authors [5] proposed a FOREX trading expert system based on some new technical analysis indicators and a new approach to rule-based evidential reasoning. The authors designed this approach to solve the issue that traditional fuzzy logic rules lose important information when dealing with intersecting fuzzy classes. The limitation is that this system selects the stocks for trading based on historical data, indicators of the technical analysis and proposed trading rules, and does not consider other important inputs such as political policy, macroeconomics, etc. [13].

In the work of Lai et al. [9] a web-based DSS using neural networks for FOREX trading is described. This system consists of two main modules: a neural network and a DSS available on-line. The neural network generated a model used for the prediction of the future development of the major currency pairs. The proposed DSS module used the generated model to recommend possible directions to the user by means of specified rules. The authors did not mention the concrete accuracy of their system, but the DSS module contained the following four types of rules. The first type determines the trading strategy in a very simple way, i.e. IF prediction > current price THEN buy. The second type filters the cases created by the first group of rules: the buy or sell recommendation is not applied if the difference between the actual and predicted price is very low. The third type of rules determines the likelihood of a profit when buying or selling. The fourth type of rules takes into account the risk of trading.

The authors in [6] describe a similar approach using neural networks. At first, they proposed ten criteria calculated from the stock financial statement reports for use in the fundamental analysis. Next, they applied a multi-layer perceptron to the five-year historical stock dataset to classify the good return stocks, i.e. those likely to beat the market in the future. In addition, the authors analysed the short historical prices of the good return stocks by using technical factors to identify the buying or selling signal in the decision support process. The performance of the proposed approach was evaluated on real stock price information. The average returns of the portfolios following the system suggestions increased continuously.

Three students described in their work [11] a trading system based on testing simple, well-known strategies of the technical analysis. These strategies consist of indicators such as moving averages, Bollinger Bands, Keltner Channel, Commodity Channel Index, Volume Oscillator and Volume Ratio. The authors implemented the whole system on the TradeStation platform using a programming language called EasyLanguage, based on predefined indicators. They proposed four strategies using the indicators listed above and tested them with the aim to evaluate their performance. In the next step, the authors selected the best two strategies to create the trading system. In addition, they optimised the system with a sample of data from the two previous months and tested it on the next month. The most powerful strategy, based on the Commodity Channel Index and the Volume Oscillator, reached 72.73% accuracy with eight profitable and three losing trades.

Based on the reviewed research, we can state that many of the existing systems use indicators and technical analysis to create the decision rules. In recent years, fundamental analysis has been gaining importance as a way to improve the success rate of the recommendations. These systems mainly consist of an integrated set of modules including a voting system for the final prediction of the future development of the related currency pair. At first, these systems test and verify the accuracy of the prediction, and then build a recommendation to buy or to sell on this prediction. Many of the existing systems do not provide information about their structure because of trade secrets and the protection of intellectual property. The overall aim is to design a method or a combination of suitable methods to provide a decision support system with the highest possible profit and the lowest risk.

2 DSS Design and Creation

The core of the proposed system consists of the knowledge obtained through suitable data mining methods, represented by decision rules describing frequent events in the market. Fig. 1 visualises the process by which the proposed DSS was constructed.

Fig. 1. Simple visualisation of the creation process of the proposed system

In addition, we extended the set of basic rules with the outputs of the fundamental and technical analysis. The first was represented by a set of fundamental news whose impact on the market we can predict with higher accuracy, and the second contained well-known indicators such as moving averages and Bollinger Bands [2]. We used the resulting knowledge to create the trading platform, which was further used for testing and evaluating the input knowledge with the aim to ensure a competitive advantage. We performed many experiments with different values of the parameters. We stored the result of each experiment and evaluated whether the related rules were still valid. If the recommendation is successful, the weight of the rule increases. Otherwise, the weight is reduced by a constant. The system works with two types of external inputs. The first is the market price of a currency pair, which is constantly changing, and the second contains the fundamental messages published at certain times on certain days of the month.

2.1 Rules Extraction Using CRISP-DM Methodology

As specified above, the success criterion is at least 55% accuracy of the prediction of the future direction based on the strength of the related currency. In addition, it is important to confirm that the currency’s strength has inertia and that determining this global factor at a given hour has a positive impact.

We used eight data sources for our experimental research. Seven data sources contained data about the combination of one of the seven currencies (Euro, British pound, Australian dollar, New Zealand dollar, Canadian dollar, Swiss franc and Japanese yen) with the U.S. dollar. The last source contained the economic news. The number of records was different for each source, so it was necessary to integrate the data and to define a common time interval (hourly) for all currency pairs. The result was a set of databases describing the price development of the currency pairs related to the US dollar between 1st October 2014 and 17th April 2015, and an economic news database for the same time interval. One row in the database contained information about the date, time, volume, open price, close price, minimum price and maximum price in a given hour, i.e. seven attributes describing the development of the currency pairs. The economic news was described by the date and time of the news, its name, the state issuing the report, a degree of volatility, the actual value, the previous value and the expert consensus.
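The integration to a common hourly interval can be sketched as follows. This is only an illustration under assumptions: the input format (time-ordered tuples of timestamp, price and volume) and the function name `to_hourly` are hypothetical, and the original data may already have been delivered in a different granularity.

```python
from datetime import datetime

def to_hourly(ticks):
    """Aggregate (timestamp, price, volume) records into hourly bars.

    ticks must be sorted by timestamp. Each bar holds the hour start,
    open, close, min and max price and the summed volume.
    """
    bars = {}
    for ts, price, volume in ticks:
        hour = ts.replace(minute=0, second=0, microsecond=0)
        bar = bars.get(hour)
        if bar is None:
            # First record of the hour sets the opening price.
            bars[hour] = {"time": hour, "open": price, "close": price,
                          "min": price, "max": price, "volume": volume}
        else:
            # Later records update close, extremes and volume.
            bar["close"] = price
            bar["min"] = min(bar["min"], price)
            bar["max"] = max(bar["max"], price)
            bar["volume"] += volume
    return list(bars.values())
```

Aligning every currency pair to the same hour boundaries is what makes it possible to compare their movements row by row in the later steps.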

The initial data understanding of the relevant attributes showed that each currency pair has concrete price boundaries that limit the trading price zone. We can use this information as a trading guide because it is riskier to sell the currency near the minimum than near the maximum. The same rule is relevant for buying, but in the reverse relationship. From our point of view, it was not necessary to calculate the correlations between the attributes because, e.g., the attributes close, open, min and max are correlated by their nature. These attributes helped us only to determine a short-term trend used in the next steps of the decision process. The relationship between the input attributes and the target attribute was evaluated using the \( \chi^{2} \) independence test. We removed those attributes that do not depend on the target attribute.
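For readers unfamiliar with the independence test, the following minimal sketch computes the Pearson chi-square statistic for a contingency table of an input attribute against the target attribute. The function name and the table layout are assumptions; the statistic itself is the standard one, and for one degree of freedom the critical value at the 5% significance level is 3.841.

```python
def chi_square_statistic(table):
    """Return (statistic, degrees_of_freedom) for a contingency table.

    table is a list of rows of observed counts: rows are values of an
    input attribute, columns are classes of the target attribute.
    Every row and column must contain at least one observation.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if the attributes were independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof
```

An attribute whose statistic stays below the critical value for its degrees of freedom shows no detectable dependence on the target and would be a candidate for removal.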

We did not find any redundant data in the currency pair databases; all attributes were complete and did not contain any missing values. Each database contained more than 2 600 valid records. The situation was different for the database of the economic news. We identified missing values for all attributes, e.g. some attributes had more than 20% missing values. We replaced these missing values with a neutral value. Further analysis showed that the affected news items are various meetings and statements of the central banks. It was not possible to identify a possible influence of this news on the market because it evokes irregular volatility lasting for only a few hours. In addition, we smoothed the noise and removed the outliers. In the processing phase, we created twenty-one new attributes, e.g. a strength/weakness attribute that labels a weakening or strengthening of the currency pair, or a minimal price range that smooths the noise in cases when the opening and closing prices were too close.
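The strength/weakness attribute combined with a minimal price range might look like the sketch below. The threshold value `min_range=0.0005` and the label names are hypothetical; the text does not state the values that were actually used.

```python
def label_move(open_price, close_price, min_range=0.0005):
    """Label one hourly bar of a currency pair.

    min_range is a hypothetical threshold: moves smaller than it are
    treated as noise and labelled 'neutral'.
    """
    change = close_price - open_price
    if abs(change) < min_range:
        return "neutral"
    return "strengthening" if change > 0 else "weakening"
```

The neutral band prevents hours in which the price barely moved from being counted as evidence of a trend.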

The modelling phase included an application of the selected algorithms to the processed data. We selected the following methods: CART [3] and C5.0 [16]. C5.0 generates a decision model in the form of trees or rule sets; it is faster than C4.5, requires less memory, generates smaller decision trees and has a greater ability to process classes that have a very low representation in the training data [8]. The Classification And Regression Trees (CART) implementation is very similar to C4.5, but this algorithm constructs the tree based on a numerical splitting criterion recursively applied to the data. The resulting binary trees are more sparing with data and detect more patterns/rules/structures before too little data are left for learning [12].

In addition, we divided the processed data into two sets, 80% for training and 20% for testing, based on previous experience. We performed a number of experiments to find the model with the best accuracy. The C5.0 model achieved the best accuracy, 56.03%, with the boosting parameter set to 10. The best models showed that a strict reduction of the input attributes makes the prediction worse, but the newly created attributes improve the accuracy.
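The evaluation protocol can be sketched in a few lines. The sketch assumes a chronological split, in which the oldest 80% of the records are used for training; the text does not state whether the split was chronological or random, and both function names are illustrative.

```python
def chronological_split(records, train_share=0.8):
    """Split time-ordered records into a training and a testing part."""
    cut = int(len(records) * train_share)
    return records[:cut], records[cut:]

def accuracy(predicted, actual):
    """Share of predictions that match the actual class labels."""
    hits = sum(p == a for p, a in zip(predicted, actual))
    return hits / len(actual)
```

A chronological split is a common choice for time series because it avoids training on records that lie in the future of the test set.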

Finally, we extracted 236 rules categorised into four groups:

  • The rules describing the loss trade (80 rules).

  • The rules describing the indecision of the U.S. dollar (55 rules).

  • The rules describing the weakening of the U.S. dollar (59 rules).

  • The rules describing the strengthening of the U.S. dollar (69 rules).

The first two categories do not add value to the proposed DSS because we need rules describing a tendency of the US dollar to continue with the current trend. Therefore, we selected the best twenty rules from the third and fourth categories based on their accuracy.

In addition, we followed the U.S. economic reports published by the US Department of Labor or the U.S. Federal Reserve System (FED) annualised during the period between 2nd February 2014 and 31st December 2015. Our experiments showed that in about 60% of the cases the real development of the currency was equal to the expected consequences of the published economic news. However, only the news published by the FED had the highest expected impact, i.e. if the FED statement about an interest rate was oriented towards weakening the U.S. dollar, the market reacted in the same way with a probability near 100%. In addition, we were able to predict the influence of factors such as the unemployment rate, the durable goods orders and the ISM (Institute for Supply Management) services sector index with an accuracy higher than 70%.

After testing, we concluded that fundamental analysis would not improve the system in any way and that it could lead to higher drawdowns. We would change our decision if we found a larger set of news with a better precision. However, the chosen technical analysis indicators improved the overall precision of the system by 7%, so we decided to apply them in the final version.

We can state that we met the specified success criterion with the best achieved accuracy of 56.03%, which is typical for this type of data [4, 15]. The system’s ability to recommend the right decision strongly depends on the ratio between risk and profit and not so much on the particular accuracy. We will discuss this statement in the next section.
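The dependence on the ratio between risk and profit can be made concrete with a short worked sketch. It assumes that every trade either reaches a fixed target profit or a fixed risk limit and it ignores spreads and fees; both function names and the numeric values in the usage note are illustrative only.

```python
def break_even_accuracy(take_profit, stop_loss):
    """Share of winning trades needed for a zero expected profit."""
    return stop_loss / (take_profit + stop_loss)

def expected_profit(accuracy, take_profit, stop_loss):
    """Expected profit per trade for a given share of winning trades."""
    return accuracy * take_profit - (1 - accuracy) * stop_loss
```

With equal target profit and risk, the break-even accuracy is 50%, so an accuracy of 56.03% yields a positive expectation. If the target profit were only a third of the risk, the break-even accuracy would rise to 75% and the same model would lose money on average.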

2.2 Decision-Making Process Proposal

We proposed the decision-making process as an iterative service in the complete trading platform (see Fig. 2). Each change in the price triggers this process. It starts with a determination of the input parameters. After these parameters are set up, the DSS switches to sleeping mode, in which it awaits an external factor. If this factor occurs, the DSS identifies a list of the valid rules, selects those whose weight is higher than the specified boundary and sorts the rules by confidence in descending order.
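The rule selection step described above can be sketched as follows. The `Rule` class and its fields are hypothetical stand-ins for the internal rule representation, which is not described in detail; the default values follow the settings used during testing.

```python
from dataclasses import dataclass

@dataclass
class Rule:
    name: str
    confidence: float
    weight: float = 0.5   # initial weight used during testing

def select_rules(valid_rules, boundary=0.1):
    """Keep rules whose weight exceeds the boundary, best confidence first."""
    eligible = [r for r in valid_rules if r.weight > boundary]
    return sorted(eligible, key=lambda r: r.confidence, reverse=True)
```

Separating the weight filter from the confidence ordering keeps the two roles distinct: the weight reflects recent trading performance, while the confidence comes from the mined model.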

Fig. 2. Visualization of the decision-making process

For each rule in the sorted list, the DSS iteratively opens a new trade; the number of these open trades is limited by the relevant parameter. The system analyses the result of each closed trade and determines a new weight for the relevant rule. During testing, we used 0.5 as the initial weight of the rules, 0.1 as the boundary and likewise 0.1 as the incremental or decremental change of the rule’s weight. It is important to say that the closing of a trade is an event that may not occur in every iteration.
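A minimal sketch of the weight adaptation and the open-trade limit is given below, using the parameter values from the testing. The function names are illustrative, and the rounding is added only to avoid floating-point drift; it is not part of the described method.

```python
def update_weight(weight, trade_won, step=0.1):
    """Raise the rule weight after a winning trade, lower it otherwise."""
    new_weight = weight + step if trade_won else weight - step
    return round(new_weight, 10)   # rounding only guards against float drift

def can_open_trade(open_trades, max_open_trades=1):
    """A new trade is allowed only below the configured limit."""
    return open_trades < max_open_trades
```

Under these settings, a rule that starts at 0.5 reaches the boundary of 0.1 after four consecutive losing trades and is then no longer selected, because selection requires a weight strictly higher than the boundary.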

3 Evaluation

At first, it is important to explain that the number of open trades has a direct impact on the number of monitored rules. If the DSS reaches the maximal number of allowed open trades, the next trade will not be opened even if a valid rule exists. The particular trades are not limited by their duration, but by their expected profit and risk. This was the reason why we selected only the top 10 buying and selling rules.

The following tables contain a list of profitable rules obtained with the same settings: boundary 0.1, initial weight 0.5, incremental and decremental change 0.1, a maximal number of open trades of 1 and a trade volume of 0.1 lot. In all experiments, we used an account with 1:200 leverage.
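To give a sense of scale for these settings, the sketch below computes the margin and the value of one pip for a position of 0.1 lot. It assumes the common convention that one standard lot equals 100,000 units of the base currency; brokers may use different conventions, and the price used in the usage note is purely illustrative.

```python
STANDARD_LOT = 100_000   # assumed units of base currency per standard lot

def required_margin(lots, price, leverage=200):
    """Margin in the quote currency needed to open the position."""
    return lots * STANDARD_LOT * price / leverage

def pip_value(lots, pip_size=0.0001):
    """Profit or loss in the quote currency for a move of one pip."""
    return lots * STANDARD_LOT * pip_size
```

For EUR/USD at an illustrative price of 1.10, a position of 0.1 lot would need a margin of about 55 USD at 1:200 leverage, and each pip of movement would change the result by about 1 USD.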

The optimisation was performed on data for the currency pair EUR/USD between 1st July 2015 and 31st October 2015. After this optimisation, the initial set of rules from the first group (see Table 1) was reduced: rules 2, 6 and 10 were omitted because they generated a loss. The decision process therefore did not include these rules, nor the rules with a low frequency of validity (no. 5, 8 and 9). Rule no. 3 had a high drawdown, so we also eliminated it from the decision-making process.

Table 1. The profit rules from the category describing the strengthening of the U.S. dollar.

We also eliminated rule no. 7 from the second group of rules (see Table 2) because it was generally loss-making even after optimisation of the parameters. Rule no. 8 is the best one in this group because it has a higher frequency of validity, above-average earnings, a reasonable ratio between the target profit and the risk, and a relatively low drawdown. In the next step, we eliminated rule no. 9 too, due to its low frequency and low profit compared to other rules.

Table 2. The profit rules from the category describing the weakening of the U.S. dollar.

The result is a list of profitable rules that have a standard drawdown and a sufficient frequency. The testing and optimisation showed that we cannot label all rules as profitable and that an effective selection can improve the overall accuracy.

All described operations resulted in a modified set of rules. Table 3 shows the performance of the optimised DSS before and after the modification of the rules selection process on the currency pair EUR/USD between 1st November 2015 and 22nd February 2016.

Table 3. The obtained results before and after the rules modification.

The previous rules provided a greater total net profit, but their usage has the disadvantage of a low share of profitable positions (14.52%). Such a system strongly depends on a few highly profitable trades. Since the aim of our work is to design a reliable system, the system containing the modified rules, with 88.73% accuracy, is more suitable.

In order to describe the overall performance of the proposed DSS, it is necessary to divide the investigated time interval into smaller parts, specifically into sixteen weeks. This evaluation provides information about a consistency of the system performance and a distribution of the frequencies.

We can see in Table 4 that the trade frequency in the particular weeks ranges from two to eight, except for the Christmas period. We can label eleven of the weeks as profitable. It can be stated that the proposed DSS was successful even in the 6th week, in which only 40% of the recommended trades were right. The difference between the total net profit in Table 3 and the sum of all total net profits in Table 4 is caused by trades that were opened at the end of one week and finished at the beginning of the following week.

Table 4. DSS performance evaluation for 16 weeks.
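The weekly breakdown can be reproduced along the lines of the following sketch. It assumes that each trade is recorded as a tuple of opening date, closing date and profit, that weeks follow the ISO calendar, and that trades spanning two weeks are left out of the weekly rows, which is one way to obtain the difference between the overall and the summed weekly net profit mentioned above. The function name and data layout are hypothetical.

```python
from collections import defaultdict
from datetime import date

def weekly_summary(trades):
    """Summarise trades per ISO week; return (weeks, spanning_profit).

    trades is an iterable of (open_date, close_date, profit). Trades
    opened in one week and closed in another are not assigned to any
    week; their profit is accumulated in spanning_profit instead.
    """
    weeks = defaultdict(lambda: {"trades": 0, "wins": 0, "net_profit": 0.0})
    spanning_profit = 0.0
    for open_date, close_date, profit in trades:
        open_week = tuple(open_date.isocalendar())[:2]
        close_week = tuple(close_date.isocalendar())[:2]
        if open_week != close_week:
            spanning_profit += profit
            continue
        row = weeks[close_week]
        row["trades"] += 1
        row["wins"] += 1 if profit > 0 else 0
        row["net_profit"] += profit
    return dict(weeks), spanning_profit
```

Dividing `wins` by `trades` for each week gives the weekly share of right trades, which allows a week with a low share to be checked against its net profit.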

4 Conclusion

The aim of this work was to design a new way of using data mining methods to improve the decision support process for trading on the forex market. For this purpose, we proposed to use decision rules extracted from historical data representing various trades on the forex market. These rules take into account different aspects of the macro analysis and micro analysis of the related currency. The rules represented the basis of the proposed decision support system, which had to be optimised to ensure its long-term profitability through continuous adaptation of the rules, based on changes of their weights in accordance with their performance. At the beginning, we specified the hypothesis that in this market we can determine the direction of a currency’s movement based on the movement of the related currency pairs containing this currency. The obtained results confirm this hypothesis: it is important to determine the global currency direction, i.e. whether the currency is globally strong or weak against other major currencies, and this knowledge has a positive influence on the prediction of the currency development. People who want to trade on the forex market can use this system not only as a recommender but also as an automatic trading system without human intervention.

We see our future work in several directions, e.g. it will be possible to extend the list of indicators, resulting in more complex rules, or to change the classification method used. We selected decision trees because they are easy to understand and read.