1 Introduction

Time series are usually analyzed to understand the past and to predict the future, enabling managers or policy makers to make properly informed decisions. Time series analysis quantifies the main features in data, like the random variation. These facts, combined with improved computing power, have made time series methods widely applicable in government, industry, and commerce. In most branches of science, engineering, and commerce, there are variables measured sequentially in time. Reserve banks record interest rates and exchange rates each day. The government statistics department will compute the country’s gross domestic product on a yearly basis. Newspapers publish yesterday’s noon temperatures for capital cities from around the world. Meteorological offices record rainfall at many different sites with differing resolutions. When a variable is measured sequentially in time over or at a fixed interval, known as the sampling interval, the resulting data form a time series [1].

Time series predictions are very important because they let us analyze past events to anticipate the possible behavior of future events, and thus take preventive or corrective decisions that help avoid unwanted circumstances.

The choice and implementation of an appropriate prediction method has always been a major issue for enterprises that seek to ensure the profitability and survival of their business. Predictions give a company the ability to make decisions in the medium and long term, and the accuracy or inaccuracy of the predicted data can mean the difference between growth and profits on the one hand and financial losses on the other. It is very important for companies to know how their business is likely to develop, so that they can make decisions that improve the company’s activities and avoid unwanted situations, which in some cases can lead to the company’s failure. In this paper we propose a hybrid approach for time series prediction that uses an ensemble neural network whose architecture is optimized with particle swarm optimization. Recent work on time series prediction can be found in the literature [2,3,4,5,6,7,8,9,10].

2 Preliminaries

In this section we present basic concepts that are used in this proposed method:

2.1 Time Series and Prediction

The word “prognosis”, a synonym of prediction, comes from the Latin prognosticum, which means “I know in advance”. To predict is to issue a statement about what is likely to happen in the future, based on analysis and consideration of experiments. Making a forecast means obtaining knowledge about uncertain events that are important in decision-making [6]. Time series prediction tries to predict the future based on past data: it takes a series of real data \( x_{t-n}, \ldots, x_{t-2}, x_{t-1}, x_{t} \) and then obtains the prediction of the data \( x_{t+1}, x_{t+2}, \ldots, x_{t+n} \). The goal of a time series prediction model is to observe the series of real data so that future data may be accurately predicted [1, 11].
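The mapping from past values to future values described above is commonly prepared as a set of sliding windows. The following is a minimal sketch of that preparation step, not the authors' code: the function name `make_windows` and the parameters `n_lags` and `horizon` are assumptions introduced here for illustration.

```python
def make_windows(series, n_lags, horizon=1):
    """Split a 1-D series into (inputs, targets) pairs.

    Each input holds n_lags consecutive past values ending at x_t, and the
    matching target holds the next `horizon` values x_{t+1}, ..., x_{t+horizon}.
    """
    series = [float(v) for v in series]
    n_samples = len(series) - n_lags - horizon + 1
    if n_samples <= 0:
        raise ValueError("series is too short for the requested window")
    inputs = [series[i:i + n_lags] for i in range(n_samples)]
    targets = [series[i + n_lags:i + n_lags + horizon]
               for i in range(n_samples)]
    return inputs, targets


# Toy series 0, 1, ..., 9 with 3 past values (delays) per sample.
X, y = make_windows(range(10), n_lags=3)
```

Each row of `X` can then be presented to a predictor as input, with the matching row of `y` as the value to be predicted.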

2.2 Neural Networks

Neural networks (NNs) are composed of many elements (artificial neurons) that are grouped into layers and highly interconnected (through synapses). This structure has several inputs and outputs and is trained to react (or give values) in a desired way to input stimuli. These systems emulate, in some way, the human brain. Neural networks must learn how to behave (learning), and someone must be responsible for teaching or training them (training), based on prior knowledge of the problem environment [12, 13].

2.3 Ensemble Neural Networks

An ensemble neural network is a learning paradigm where many neural networks are jointly used to solve a problem [14]; that is, a collection of a finite number of neural networks is trained for the same task [15]. It originates from Hansen and Salamon’s work [16], which shows that the generalization ability of a neural network system can be significantly improved through ensembling a number of neural networks, i.e. training many neural networks and then combining their predictions. Because this approach performs remarkably well, it has recently become a very active topic in both the neural network and machine learning communities [17], and it has already been successfully applied to diverse areas such as face recognition [18, 19], optical character recognition [20,21,22], scientific image analysis [23], medical diagnosis [24, 25], and seismic signal classification [26].

In general, a neural network ensemble is constructed in two steps, i.e. training a number of component neural networks and then combining the component predictions.

There are also many other approaches for training the component neural networks. For example, Hampshire and Waibel [22] use different objective functions to train distinct component neural networks.
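The second step of the construction described above, combining the component predictions, can be illustrated with simple averaging. This is only a baseline sketch under stated assumptions: the function `combine_by_average` and the sample values are hypothetical, and the method proposed in this paper combines the modules with fuzzy integration rather than with an average.

```python
def combine_by_average(component_predictions):
    """Average the predictions of several component models, point by point.

    component_predictions: list of equal-length lists, one per component.
    """
    if not component_predictions:
        raise ValueError("need at least one component")
    n_points = len(component_predictions[0])
    if any(len(p) != n_points for p in component_predictions):
        raise ValueError("all components must predict the same number of points")
    n_models = len(component_predictions)
    return [sum(p[i] for p in component_predictions) / n_models
            for i in range(n_points)]


# Hypothetical outputs of three component networks on four time steps.
preds = [[1.0, 2.0, 3.0, 4.0],
         [1.2, 1.8, 3.3, 3.9],
         [0.8, 2.2, 2.7, 4.1]]
ensemble = combine_by_average(preds)
```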

2.4 Fuzzy Systems as Methods of Integration

There exists a diversity of methods of integration or aggregation of information, and we mention some of these methods below.

Fuzzy logic was proposed for the first time in the mid-sixties at the University of California, Berkeley, by the engineer Lotfi A. Zadeh, who proposed what is called the principle of incompatibility: as the complexity of a system increases, our ability to make precise yet meaningful statements about its behavior decreases, up to a threshold beyond which precision and meaning become almost mutually exclusive characteristics. He then introduced the concept of a fuzzy set, which rests on the idea that the elements on which human thinking is built are not numbers but linguistic labels. Fuzzy logic can represent common knowledge, which is mostly linguistic and qualitative rather than quantitative, in a mathematical language by means of fuzzy set theory and the membership functions associated with fuzzy sets [12].

2.5 Optimization

Optimization is the process of obtaining the ‘best’, provided that it is possible to measure and change what is ‘good’ or ‘bad’. In practice, one wishes the ‘most’ or ‘maximum’ (e.g., salary) or the ‘least’ or ‘minimum’ (e.g., expenses). Therefore, the word ‘optimum’ takes the meaning of ‘maximum’ or ‘minimum’ depending on the circumstances; ‘optimum’ is a technical term which implies quantitative measurement and is a stronger word than ‘best’, which is more appropriate for everyday use. Likewise, the word ‘optimize’, which means to achieve an optimum, is a stronger word than ‘improve’. Optimization theory is the branch of mathematics encompassing the quantitative study of optima and methods for finding them. Optimization practice, on the other hand, is the collection of techniques, methods, procedures, and algorithms that can be used to find the optima [27].

2.6 Particle Swarm Optimization

Particle Swarm Optimization (PSO) is a bio-inspired optimization method proposed by R. Eberhart and J. Kennedy [28] in 1995. PSO is a search algorithm based on the behavior of biological communities that exhibit both individual and social behavior [29]; examples of such communities are flocks of birds, schools of fish, and swarms of bees.

A PSO algorithm maintains a swarm of particles, where each particle represents a potential solution. In analogy with the paradigms of evolutionary computation, a swarm is similar to a population, while a particle is similar to an individual. In simple terms, the particles are “flown” through a multidimensional search space, where the position of each particle is adjusted according to its own experience and that of its neighbors. Let \( x_{i}(t) \) denote the position of particle \( i \) in the search space at time step \( t \); unless otherwise stated, \( t \) denotes discrete time steps. The position of the particle is changed by adding a velocity, \( v_{i}(t + 1) \), to the current position, i.e.

$$ \begin{aligned} & x_{i} \left( {t + 1} \right) = x_{i} (t) + v_{i} \left( {t + 1} \right) \\ & {\text{with}}\,x_{i} (0)\;\sim \; U \left( {X_{min} , X_{max} } \right). \\ \end{aligned} $$
(1)
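Equation (1) only specifies the position update; the text does not spell out how the velocity is computed. The sketch below assumes the widely used global-best velocity rule with an inertia weight `w` and acceleration coefficients `c1` and `c2`, and it clamps positions to the search bounds. The function name `pso_minimize`, its default parameter values, and the `sphere` test function are assumptions made for illustration; the experiments in this paper report C1 = C2 = 2 with a linearly decreasing inertia weight instead.

```python
import random


def pso_minimize(objective, dim, x_min, x_max, n_particles=20,
                 n_iterations=100, w=0.729, c1=1.494, c2=1.494, seed=0):
    """Minimize `objective` over [x_min, x_max]^dim with a global-best PSO."""
    rng = random.Random(seed)
    # x_i(0) ~ U(x_min, x_max) as in Eq. (1); zero initial velocity is an assumption.
    x = [[rng.uniform(x_min, x_max) for _ in range(dim)]
         for _ in range(n_particles)]
    v = [[0.0] * dim for _ in range(n_particles)]
    p_best = [xi[:] for xi in x]            # best position seen by each particle
    p_val = [objective(xi) for xi in x]
    g = min(range(n_particles), key=p_val.__getitem__)
    g_best, g_val = p_best[g][:], p_val[g]  # best position seen by the swarm

    for _ in range(n_iterations):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Assumed velocity rule: inertia + cognitive + social terms.
                v[i][d] = (w * v[i][d]
                           + c1 * r1 * (p_best[i][d] - x[i][d])
                           + c2 * r2 * (g_best[d] - x[i][d]))
                # Position update of Eq. (1), clamped to the search bounds.
                x[i][d] = min(x_max, max(x_min, x[i][d] + v[i][d]))
            value = objective(x[i])
            if value < p_val[i]:
                p_val[i], p_best[i] = value, x[i][:]
                if value < g_val:
                    g_val, g_best = value, x[i][:]
    return g_best, g_val


def sphere(point):
    """Toy objective with its minimum (0) at the origin."""
    return sum(c * c for c in point)


best_x, best_val = pso_minimize(sphere, dim=2, x_min=-5.0, x_max=5.0)
```

Because personal and global bests are only replaced by strictly better values, the returned `best_val` can never be worse than the best initial particle.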

3 Problem Statement and Proposed Method

The goal of this work was to implement particle swarm optimization to optimize the ensemble neural network architectures. In this case the optimization is performed for each of the modules, in order to find a neural network architecture that yields optimum results for each time series considered. As shown in Fig. 1, the historical data of each time series is provided to the modules of the ensemble network, which are optimized with particle swarm optimization, and the outputs of these modules are then combined by type-1 and type-2 fuzzy integration.

Fig. 1 General architecture of the proposed ensemble model

Historical data of the Taiwan Stock Exchange time series was used to train the ensemble neural network, where each module was fed with the same information, unlike modular networks, where each module is fed with different data, which leads to architectures that are not uniform.

The Taiwan Stock Exchange (Taiwan Stock Exchange Corporation, TWSE) is a financial institution that was founded in 1961 in Taipei and began to operate as a stock exchange on 9 February 1962. It is regulated by the Financial Supervisory Commission. The index of the Taiwan Stock Exchange is the TAIEX [30].

Data of the Taiwan Stock Exchange time series: we use 800 points that correspond to the period from 03/04/2011 to 05/07/2014 (as shown in Fig. 2). We used 70% of the data to train the ensemble neural network and 30% to test it [30].

Fig. 2 Taiwan Stock Exchange
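For 800 points, the 70%/30% division described above gives 560 training points and 240 test points. The sketch below shows one way to perform such a split; it assumes the split is chronological (earliest points for training), which the text does not state, and the function name `chronological_split` and the placeholder data are hypothetical.

```python
def chronological_split(series, train_fraction=0.7):
    """Return (train, test), keeping the earliest points for training."""
    n_train = int(round(len(series) * train_fraction))
    return series[:n_train], series[n_train:]


points = list(range(800))  # placeholder standing in for the 800 observations
train, test = chronological_split(points)
```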

The objective function is defined to minimize the prediction error as follows:

$$ EM = \frac{1}{D}\sum\limits_{i = 1}^{D} \left| a_{i} - x_{i} \right| $$
(2)

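where \( a_{i} \) corresponds to the predicted data, which depend on the output of the network modules, \( x_{i} \) represents the real data, \( D \) is the number of data points, and \( EM \) is the resulting prediction error, i.e. the mean absolute error over the \( D \) points.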
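Equation (2) can be computed directly as follows. This is a minimal sketch; the function name `prediction_error` and the sample values are chosen here for illustration only.

```python
def prediction_error(predicted, real):
    """Mean absolute difference between predicted and real values, Eq. (2)."""
    if len(predicted) != len(real) or not real:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - x) for a, x in zip(predicted, real)) / len(real)


# Hypothetical values: absolute errors are 0, 0.5 and 1, so EM = 1.5 / 3.
em = prediction_error([1.0, 2.5, 3.0], [1.0, 2.0, 4.0])
```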

The corresponding particle structure is shown in Fig. 3.

Fig. 3 Particle structure to optimize the ensemble neural network

Figure 3 shows the particle structure used to optimize the ensemble neural network. The optimized parameters are the number of modules, the number of layers, and the number of neurons per layer; PSO determines the values of these parameters that achieve the lowest prediction error.

The parameters of the particle swarm optimization algorithm are: 100 particles, 100 iterations, cognitive component (C1) = 2, social component (C2) = 2, constriction coefficient with linear increase (C) from 0 to 0.9, and inertia weight with linear decrease (W) from 0.9 to 0. The search space covers 1 to 5 modules, 1 to 3 layers, and 1 to 30 neurons per layer.
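To make the search space concrete, the sketch below decodes a real-valued particle into an architecture within the stated ranges and computes a linearly decreasing inertia weight. The flat layout of the particle (number of modules first, then, for every possible module, a layer count followed by three neuron counts) is a hypothetical encoding chosen for illustration, since the exact layout of Fig. 3 is not reproduced here; the names `decode_particle`, `clamp_int`, and `inertia_weight` are likewise assumptions.

```python
MAX_MODULES, MAX_LAYERS, MAX_NEURONS = 5, 3, 30
PARTICLE_LENGTH = 1 + MAX_MODULES * (1 + MAX_LAYERS)  # 21 values


def clamp_int(value, low, high):
    """Round a real-valued coordinate and keep it inside [low, high]."""
    return max(low, min(high, int(round(value))))


def decode_particle(position):
    """Hypothetical layout: [n_modules, (n_layers, n1, n2, n3) per module]."""
    if len(position) != PARTICLE_LENGTH:
        raise ValueError("unexpected particle length")
    n_modules = clamp_int(position[0], 1, MAX_MODULES)
    modules = []
    for m in range(n_modules):
        base = 1 + m * (1 + MAX_LAYERS)
        n_layers = clamp_int(position[base], 1, MAX_LAYERS)
        neurons = [clamp_int(v, 1, MAX_NEURONS)
                   for v in position[base + 1:base + 1 + n_layers]]
        modules.append(neurons)
    return modules  # one list of layer sizes per module


def inertia_weight(iteration, n_iterations, w_start=0.9, w_end=0.0):
    """Inertia weight decreasing linearly from w_start to w_end."""
    return w_start + (w_end - w_start) * iteration / (n_iterations - 1)


# A particle encoding two modules with layer sizes (23, 9) and (9, 15);
# the remaining coordinates are unused for this architecture.
particle = [2, 2, 23, 9, 1, 2, 9, 15, 1] + [1] * 12
architecture = decode_particle(particle)
```

Rounding and clamping is only one way to map continuous PSO coordinates to integer design choices; other encodings are possible.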

The responses of the optimized ensemble neural network are aggregated with type-1 and type-2 fuzzy systems. In this work the fuzzy system has 5 inputs, corresponding to the number of modules of the neural network ensemble, and one output. Each input and output linguistic variable of the fuzzy system uses 2 Gaussian membership functions. The performance of the type-2 fuzzy aggregators is analyzed under different levels of uncertainty to find the best design of the membership functions for the 32 rules of the fuzzy system. Previous tests were performed only with a three-input fuzzy system; here the fuzzy system changes according to the responses of the neural network in order to give a better prediction error. In the type-2 fuzzy system we also vary the level of uncertainty to obtain the best prediction error.

Figure 4 shows this fuzzy system, with its 5 inputs and one output. Previous experiments were performed with triangular and Gaussian membership functions, and the Gaussian functions produced the best prediction results.

Fig. 4 Fuzzy inference system for integration of the ensemble neural network

Figure 5 shows the 32 possible rules of the fuzzy system (\( 2^{5} = 32 \)): there are 5 inputs with 2 membership functions each, and the output also has 2 membership functions. These fuzzy rules are used for both the type-1 and type-2 fuzzy systems. In previous work several tests were performed with 3 inputs; the prediction error obtained was significant and the number of rules was greater, which is why we changed to 2 inputs.

Fig. 5 Rules of the type-2 fuzzy system
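The count of 32 rules follows from combining 2 membership functions over 5 inputs. The sketch below enumerates the rule antecedents and evaluates a type-1 firing strength for each. The labels `Low` and `High`, the membership function parameters, and the use of the minimum as the conjunction operator are all assumptions for illustration; the rule consequents of Fig. 5 and the type-2 footprint of uncertainty are not modeled here.

```python
import math
from itertools import product

LABELS = ("Low", "High")  # hypothetical linguistic labels
N_INPUTS = 5
# Hypothetical (center, sigma) of each Gaussian, for inputs scaled to [0, 1].
MF_PARAMS = {"Low": (0.0, 0.3), "High": (1.0, 0.3)}


def gaussian_mf(x, center, sigma):
    """Type-1 Gaussian membership degree of x."""
    return math.exp(-0.5 * ((x - center) / sigma) ** 2)


def firing_strength(inputs, antecedent):
    """Degree to which a rule fires, using min as the AND operator."""
    return min(gaussian_mf(x, *MF_PARAMS[label])
               for x, label in zip(inputs, antecedent))


# Every combination of one label per input: 2 ** 5 = 32 antecedents.
antecedents = list(product(LABELS, repeat=N_INPUTS))

sample = [0.0, 0.0, 1.0, 1.0, 0.0]  # hypothetical module outputs
strongest = max(antecedents, key=lambda rule: firing_strength(sample, rule))
```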

4 Simulation Results

In this section we present the simulation results obtained with the genetic algorithm and particle swarm optimization for the Taiwan Stock Exchange.

We considered a genetic algorithm to optimize the structure of an ensemble neural network, and the best architecture obtained is shown in Fig. 6.

Fig. 6 Prediction with the optimized ensemble neural network with GA of the TAIEX

In this architecture each module has two layers. Module 1 has 23 neurons in the first layer and 9 neurons in the second layer, and module 2 has 9 neurons in the first layer and 15 neurons in the second layer. The Levenberg-Marquardt (LM) training method was used, and 3 delays were considered for the network.

Table 1 shows the particle swarm optimization results (see also Fig. 6), where the prediction error is 0.0013066.

Table 1 Particle swarm optimization result for the ensemble neural network

Fuzzy integration is first performed with a type-1 fuzzy system, for which the best result is the experiment in row 8 of Table 2, with an error of 0.0235.

Table 2 Result PSO for the type-1 fuzzy integration of the TAIEX

As a second phase, a type-2 fuzzy system is implemented to integrate the results of the optimized ensemble neural network. The best results obtained are as follows: with a degree of uncertainty of 0.3 the forecast error is 0.01098, with a degree of uncertainty of 0.4 the error is 0.01122, and with a degree of uncertainty of 0.5 the error is 0.001244, as shown in Table 3.

Table 3 Result PSO for the type-2 fuzzy integration of the TAIEX

Figure 7 shows the plot of real data against the predicted data generated by the ensemble neural network optimized with the particle swarm optimization.

Fig. 7 Prediction with the optimized ensemble neural network with PSO of the TAIEX

5 Conclusions

The best result when applying particle swarm optimization to optimize the ensemble neural network was a prediction error of 0.0013066 (as shown in Fig. 6 and Table 1). We also implemented a type-2 fuzzy system to integrate the ensemble neural network, for which the best results were as follows: a degree of uncertainty of 0.3 yielded a forecast error of 0.01098, a degree of uncertainty of 0.4 an error of 0.01122, and a degree of uncertainty of 0.5 an error of 0.01244, as shown in Table 3. With these results we have verified the efficiency of the algorithms applied to optimize the neural network ensemble architecture. In this case the method was efficient, but it also has certain disadvantages: sometimes the results are not as good. Nevertheless, genetic algorithms can be considered a good technique for solving search and optimization problems.