1 Introduction

In this paper we present an ensemble of three neural networks. The final result of the ensemble is obtained with average integration and with type-2 fuzzy integration. Time series prediction is the case study of this paper, and in particular the Mackey-Glass time series is used to test the proposed approach.

This research manages the weights of the neural networks with type-2 fuzzy inference systems. Because the weights affect the performance of the learning process of a neural network, the use of type-2 fuzzy weights is an important part of the training phase for managing uncertainty.

The supervised neural network of most interest in this study is the backpropagation network and its variations, since this type of network is the most commonly used in the areas mentioned above.

The weights of a neural network are an important part of the training phase, because they affect the performance of the learning process of the neural network.

This observation is based on practical experience with networks of this type: several research works have shown that neural networks trained for the same problem, but initialized with different weights or with their weights adjusted in different ways, can still reach similar results.

The next section presents the basic concepts of neural networks and type-2 fuzzy logic. Section 3 presents a review of research about modifications of the backpropagation algorithm, different management strategies of weights in neural networks and time series prediction. Section 4 explains the proposed ensemble neural network. Section 5 describes the simulation results for the ensemble neural network with average integration and the type-2 fuzzy integrator proposed in this paper. Finally, in Sect. 6, some conclusions are presented.

2 Basic Concepts

2.1 Neural Network

An artificial neural network (ANN) is a distributed computing scheme based on the structure of the nervous system of humans. The architecture of a neural network is formed by connecting multiple elementary processors; it is an adaptive system that has an algorithm to adjust its weights (free parameters) to achieve the performance requirements of the problem based on representative samples [1, 2]. The most important property of artificial neural networks is their ability to learn from a training set of patterns, i.e., they are able to find a model that fits the data [3, 4].

The artificial neuron consists of several parts (see Fig. 1): the inputs, the weights, the summation, and finally the activation function. The input values are multiplied by the weights and added: \(\sum {x_{i} w_{ij} }\). The sum is completed with the addition of a threshold term \(\theta_{i}\), which has the same effect as an input with value −1 and allows the sum to be shifted to the left or right of the origin. After the addition, the function f is applied to the sum, resulting in the final output value, also called \(y_{i}\) [5], given by the following equation.

$$y_{i} = f\left( {\sum\limits_{i = 1}^{n} {x_{i} w_{ij} } } \right).$$
(1)

where f may be a nonlinear function with binary output \(\pm 1\), a linear function f(z) = z, or a sigmoidal logistic function:

$$f(z) = \frac{1}{{1 + e^{ - z} }}.$$
(2)
Fig. 1. Scheme of an artificial neuron
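
As an illustration of Eqs. 1 and 2, the following minimal Python sketch computes the output of a single artificial neuron with a logistic activation; the input values, weights and threshold are arbitrary example data, not taken from the paper.

import math

def logistic(z):
    # Sigmoidal logistic function, Eq. 2: f(z) = 1 / (1 + e^(-z)).
    return 1.0 / (1.0 + math.exp(-z))

def neuron_output(inputs, weights, threshold=0.0):
    # Eq. 1: weighted sum of the inputs, shifted by the threshold,
    # then passed through the activation function f.
    z = sum(x * w for x, w in zip(inputs, weights)) - threshold
    return logistic(z)

# Example with arbitrary values.
print(neuron_output([0.5, -0.2, 0.8], [0.4, 0.1, -0.3], threshold=0.1))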

2.2 Type-2 Fuzzy Logic

The concept of a type-2 fuzzy set was introduced by Zadeh (1975) as an extension of the concept of an ordinary fuzzy set (henceforth called a “type-1 fuzzy set”). A type-2 fuzzy set is characterized by a fuzzy membership function, i.e., the membership grade for each element of the set is itself a fuzzy set in [0, 1], unlike a type-1 set, where the membership grade is a crisp number in [0, 1] [6, 7].

Such sets can be used in situations where there is uncertainty about the membership grades themselves, e.g., uncertainty in the shape of the membership function or in some of its parameters [8]. Consider the transition from ordinary sets to fuzzy sets. When we cannot determine the membership of an element in a set as 0 or 1, we use fuzzy sets of type-1 [9,10,11]. Similarly, when the situation is so fuzzy that we have trouble determining the membership grade even as a crisp number in [0, 1], we use fuzzy sets of type-2 [12,13,14,15,16,17].
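
To make this concrete, the sketch below evaluates an interval type-2 Gaussian membership function with uncertain mean: instead of a single membership grade, each element receives an interval [lower, upper] contained in [0, 1]. The parameter values are illustrative assumptions, not taken from the paper.

import math

def gaussian(x, mean, sigma):
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2)

def it2_gaussian_uncertain_mean(x, m1, m2, sigma):
    # Interval type-2 Gaussian MF with the mean uncertain in [m1, m2]:
    # the membership grade of x is the interval [lower, upper].
    if m1 <= x <= m2:
        upper = 1.0                       # x lies inside the uncertain mean interval
    else:
        upper = max(gaussian(x, m1, sigma), gaussian(x, m2, sigma))
    lower = min(gaussian(x, m1, sigma), gaussian(x, m2, sigma))
    return lower, upper

# Example: membership interval of x = 0.3 (illustrative parameters).
print(it2_gaussian_uncertain_mean(0.3, m1=0.4, m2=0.6, sigma=0.2))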

3 Historical Development

The backpropagation algorithm and its variations are the most widely used basic training methods in neural network research. When the basic backpropagation algorithm is applied to practical problems, the training time can be very long. Several methods have been proposed in the literature to accelerate the convergence of the algorithm [18,19,20,21].

There are many works on the adjustment or management of weights, but only those most important and relevant for this research are considered here [22,23,24,25].

Ishibuchi et al. [26] proposed a fuzzy network where the weights are given as trapezoidal fuzzy numbers, each denoted by the four parameters of its trapezoidal membership function.

Ishibuchi et al. [27] proposed a fuzzy neural network architecture with symmetric triangular fuzzy numbers for the fuzzy weights and biases, denoted by the lower, middle and upper limits of the triangular fuzzy numbers.

Momentum method: Rumelhart, Hinton and Williams suggested adding a momentum term \(\beta\) to the weight-update expression, to filter out the oscillations that a high learning rate can produce, which lead to large changes in the weights [5, 28].

Adaptive learning rate: focuses on improving the performance of the algorithm by allowing the learning rate to change (increase or decrease) during the training process [28].
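
A minimal sketch of these two modifications over a generic gradient-descent update; the momentum coefficient and the rate-adaptation factors are illustrative values, not the settings of any specific reference.

def update_weight(w, grad, velocity, lr, beta=0.9):
    # Momentum: the previous update, scaled by beta, is added to the new
    # step to filter out oscillations caused by a large learning rate.
    velocity = beta * velocity - lr * grad
    return w + velocity, velocity

def adapt_learning_rate(lr, error, prev_error, up=1.05, down=0.7):
    # Adaptive learning rate: increase the rate while the error decreases,
    # decrease it when the error grows.
    return lr * up if error < prev_error else lr * down

# Example: one update step with arbitrary values.
w, v = update_weight(w=0.5, grad=0.1, velocity=0.0, lr=0.01)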

Castro et al. [29] proposed interval type-2 fuzzy neurons for the antecedents and interval type-1 fuzzy neurons for the consequents of the rules.

Kamarthi and Pittner [30] focused on obtaining a prediction of the network weights at a future epoch using extrapolation. Feuring [31] developed a learning algorithm in which backpropagation is used to compute the new lower and upper limits of the fuzzy weights; the modal value of the new fuzzy weight is calculated as the average of the newly computed limits.

Recent works on type-2 fuzzy logic have been developed in time series prediction, like that of Castro et al. [32], and other researchers [33, 34].

4 Proposed Ensemble Neural Network

The focus of this work is to use an ensemble of three neural networks with type-2 fuzzy weights, to allow the neural networks to handle data with uncertainty; we used an average integration approach and a type-2 fuzzy integrator for the final result of the ensemble. The approach is applied to time series prediction for the Mackey-Glass time series (for \(\tau\) = 17).
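
For reference, the Mackey-Glass series can be generated by numerically integrating \(dx/dt = \beta x(t-\tau)/(1 + x(t-\tau)^{10}) - \gamma x(t)\) with \(\beta = 0.2\), \(\gamma = 0.1\) and \(\tau = 17\); the sketch below uses a simple Euler scheme with an illustrative step size and initial condition (the paper does not state how its data set was produced).

def mackey_glass(n_points, tau=17, beta=0.2, gamma=0.1, n=10, dt=1.0, x0=1.2):
    # Euler integration of the Mackey-Glass delay differential equation.
    delay = int(tau / dt)
    history = [x0] * (delay + 1)          # constant initial history
    series = []
    for _ in range(n_points):
        x_t, x_tau = history[-1], history[-(delay + 1)]
        dx = beta * x_tau / (1.0 + x_tau ** n) - gamma * x_t
        history.append(x_t + dt * dx)
        series.append(history[-1])
    return series

data = mackey_glass(1000)                 # 1000 illustrative sample points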

The three neural networks work with type-2 fuzzy weights [35]. One network works with two-sided Gaussian interval type-2 membership functions with uncertain mean and standard deviation in the two type-2 fuzzy inference systems (FIST2) used to obtain the weights (one in the connections between the input and hidden layer, and the other between the hidden and output layer); the other two networks work with triangular interval type-2 membership functions and with triangular interval type-2 membership functions with uncertain standard deviation, respectively (see Fig. 2).

Fig. 2. Proposed ensemble neural network architecture with interval type-2 fuzzy weights using average integration or the type-2 fuzzy integrator

We considered an architecture of three neural networks, each with 30 neurons in the hidden layer and 1 neuron in the output layer. These neural networks handle type-2 fuzzy weights in the hidden layer and the output layer. In both layers we work with a type-2 fuzzy inference system that obtains new weights in each epoch of training [36,37,38,39].

We used two similar type-2 fuzzy inference systems to obtain the type-2 fuzzy weights in the hidden and output layers of each neural network.

The weight management in the three neural networks is performed differently from the traditional management of weights carried out by the backpropagation algorithm (see Fig. 3); the method works with interval type-2 fuzzy weights, which changes the way the neuron works internally (see Fig. 4) [40].

Fig. 3. Schematic of the management of numerical weights for the input of each neuron

Fig. 4. Schematic of the management of interval type-2 fuzzy weights for the input of each neuron
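
A minimal sketch of the change inside the neuron shown in Fig. 4, assuming each weight is kept as an interval [lower, upper] and that the weighted sum and a monotonic activation are evaluated on both bounds; the actual method in [40] may reduce the interval output differently.

import math

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

def it2_neuron(inputs, interval_weights):
    # Each weight is an interval (w_low, w_high); the weighted sum is
    # propagated as an interval, and the monotonic activation is applied
    # to both bounds, yielding an interval output.
    low = high = 0.0
    for x, (wl, wu) in zip(inputs, interval_weights):
        products = (x * wl, x * wu)       # order depends on the sign of x
        low += min(products)
        high += max(products)
    return logistic(low), logistic(high)

# Example with arbitrary values: two inputs with interval weights.
print(it2_neuron([0.5, -0.3], [(0.2, 0.4), (-0.1, 0.1)]))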

The activation function f(·) used in this research was the sigmoid function in the neurons of the hidden layer and the linear function in the neurons of the output layer, for all three neural networks.

The three neural networks used two type-2 fuzzy inference systems with the same structure (see Fig. 5), which have two inputs (the current weight in the current epoch and the change of the weight for the next epoch) and one output (the new weight for the next epoch).

Fig. 5. Structure of the six type-2 fuzzy inference systems used in the three neural networks
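
The role of these inference systems in training can be sketched as the update loop below, where fis_new_weight is a hypothetical stand-in for evaluating the FIST2 of Fig. 5 (current weight and weight change in, new weight out); it is a placeholder, not an implemented type-2 inference engine.

def train_epoch(weights, weight_changes, fis_new_weight):
    # One epoch of weight management: instead of w + delta_w, as in classic
    # backpropagation, each new weight is produced by the type-2 fuzzy
    # inference system from the current weight and its proposed change.
    return [fis_new_weight(w, dw) for w, dw in zip(weights, weight_changes)]

def fis_new_weight(w, dw):
    # Placeholder for the FIST2 evaluation; the real system applies the
    # fuzzy rules of Fig. 12 to produce the new weight.
    return w + 0.5 * dw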

In the first neural network, the inputs and the output for the type-2 fuzzy inference systems used between the input and hidden layer are delimited with two Gaussian membership functions with their corresponding range (see Fig. 6); and the inputs and output for the type-2 fuzzy inference systems used between the hidden and output layer are delimited with two Gaussian membership functions with their corresponding range (see Fig. 7).

Fig. 6. Inputs (a and b) and output (c) of the type-2 fuzzy inference system used between the input and hidden layer for the first neural network

Fig. 7. Inputs (a and b) and output (c) of the type-2 fuzzy inference system used between the hidden and output layer for the first neural network

In the second neural network, the inputs and the output for the type-2 fuzzy inference systems used between the input and hidden layer are delimited with two triangular membership functions with their corresponding ranges (see Fig. 8); and the inputs and output for the type-2 fuzzy inference systems used between the hidden and output layer are delimited with two triangular membership functions with their corresponding ranges (see Fig. 9).

Fig. 8. Inputs (a and b) and output (c) of the type-2 fuzzy inference system used between the input and hidden layer for the second neural network

Fig. 9. Inputs (a and b) and output (c) of the type-2 fuzzy inference system used between the hidden and output layer for the second neural network

In the third neural network, the inputs and the output for the type-2 fuzzy inference systems used between the input and hidden layer are delimited with two triangular membership functions with uncertainty in the standard deviation, with their corresponding ranges (see Fig. 10); and the inputs and output for the type-2 fuzzy inference systems used between the hidden and output layer are delimited with two triangular membership functions with uncertainty in the standard deviation, with their corresponding ranges (see Fig. 11).

Fig. 10. Inputs (a and b) and output (c) of the type-2 fuzzy inference system used between the input and hidden layer for the third neural network

Fig. 11. Inputs (a and b) and output (c) of the type-2 fuzzy inference system used between the hidden and output layer for the third neural network

The rules for the six type-2 fuzzy inference systems are the same: we used six rules, corresponding to the four combinations of the two membership functions on each input, plus two rules for the case when the change of the weight is null (see Fig. 12).

Fig. 12. Rules of the type-2 fuzzy inference system used in the six FIST2 for the neural networks with type-2 fuzzy weights

We obtain the prediction result for the ensemble neural network using the average integration and type-2 fuzzy integrator.

The average integration is performed with Eq. 3, where NNGMF is the prediction of the neural network with the FIST2 using Gaussian MFs, NNTMF the prediction of the network with triangular MFs, NNTsdMF the prediction of the network with triangular MFs with uncertain standard deviation, #NN the number of neural networks in the ensemble, and PE the prediction of the ensemble.

$$PE = \frac{{\text{NNGMF} + \text{NNTMF} + \text{NNTsdMF}}}{{\#\text{NN}}}$$
(3)
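
Equation 3 is a plain arithmetic mean of the three network outputs, as in this small sketch (the prediction values are arbitrary examples):

def average_integration(predictions):
    # Eq. 3: PE = (NNGMF + NNTMF + NNTsdMF) / #NN
    return sum(predictions) / len(predictions)

pe = average_integration([0.91, 0.88, 0.93])   # arbitrary example outputs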

The structure of the type-2 fuzzy integrator consists of three inputs (the predictions of the neural networks with type-2 fuzzy weights using Gaussian membership functions (MFs), triangular MFs, and triangular MFs with uncertainty in the standard deviation) and one output (the final prediction of the integration) (see Fig. 13).

Fig. 13. Structure of the type-2 fuzzy integrator

We used three triangular membership functions in the inputs and output of the type-2 fuzzy integrator (T2FI), with the range established in the interval from 0 to 1.5 (see Fig. 14). The footprint of uncertainty and the positions of the membership functions were established empirically.

Fig. 14. Membership functions of the type-2 fuzzy integrator

In the type-2 fuzzy integrator we used 30 rules: 27 for the combinations of the three membership functions over the three inputs (3^3 = 27) with the “and” operator, plus 3 rules using the “or” operator (see Fig. 15).

Fig. 15. Rules for the type-2 fuzzy integrator

5 Simulation Results

The results of the experiments for the ensemble neural network with average integration (ENNAI) are shown in Table 1 and Fig. 16. The best prediction error is 0.0346, and the average error is 0.0485.

Table 1 Results for the ensemble neural network with average integration for the Mackey-Glass time series

Fig. 16. Plot of real data against predicted data of ENNAI for the Mackey-Glass time series

We present 10 simulation experiments for the ensemble neural network with the average integration and with the type-2 fuzzy integrator, but the average error was calculated over 30 experiments with the same parameters and conditions. The results of the experiments for the ensemble neural network with the type-2 fuzzy integrator (ENNT2FI) are shown in Table 2. The best prediction error is 0.0265, and the average error is 0.0561.

Table 2 Results for the ensemble neural network with the type-2 fuzzy integrator for the Mackey-Glass time series

Table 3 shows a comparison of the prediction results for the Mackey-Glass time series between the monolithic neural network (MNN), the neural network with type-2 fuzzy weights (NNT2FW), the ensemble neural network with average integration (ENNAI) and the ensemble neural network with the type-2 fuzzy integrator (ENNT2FI).

Table 3 Comparison results for the Mackey-Glass time series

6 Conclusions

In the experiments, we observe that using an ensemble neural network with average integration and with a type-2 fuzzy integrator, we can achieve better results than the monolithic neural network and the single neural network with type-2 fuzzy weights for the Mackey-Glass time series. The ensemble with the type-2 fuzzy integrator presents better results than the optimization with PSO in almost all the experiments.