1 Introduction

In recent years, the number of studies on the brain has increased dramatically. This research has produced a new way for the brain to communicate with a computer using only electroencephalogram (EEG) processing. This approach is called a brain–computer interface (BCI). A BCI does not rely on muscular or neuromuscular inputs: the use of nerves, muscles, and movements is replaced by the processing of electrophysiological signals with dedicated software and hardware [1]. This enables subjects with muscular and neuromuscular disabilities to communicate and to control their environment without performing any movement. Therefore, BCIs can improve existing assistive technologies. BCIs offer benefits for disabled subjects [2] and in applications such as neuro-feedback for stroke rehabilitation [3], medical uses such as the treatment of sleep disorders [6], epileptic seizure prediction [4, 7], and alertness detection and cognitive load monitoring [8]. BCIs are also useful as a supplementary technology in robotics [9], computer games [5], virtual reality [10], etc.

EEG signals of different motor imagery (MI) classes exhibit distinct event-related brain potentials (ERP), which can have specific attributes such as event-related desynchronization (ERD), which manifests as a decrease in power, and event-related synchronization (ERS), which manifests as an increase in power, in the mu and beta rhythms over the sensorimotor cortex [11].

Classification is an important computational procedure in a BCI. A two-class EEG-based classifier that distinguishes the fatigue state from the alert state, using independent component analysis by entropy rate-bound minimization (ERBM-ICA) for source separation, autoregressive (AR) modeling for feature extraction, and a Bayesian neural network as the classifier, is presented in [15]. An optimal probe selection method for EEG classification in a BCI with three mental tasks, using a blind source separation (BSS) technique based on independent component analysis (ICA) with back-projection of the scalp map, is presented in [14]; there, the four channels P3, O1, C4, and O2 are reported as the prominent channels with more scattered features. EEG classification of pre- and post-mental-load tasks for mental fatigue detection, by a system combining principal component analysis (PCA) for dimension reduction, power spectral density (PSD) for feature extraction, and a Bayesian neural network (BNN) as the classifier, is presented in [16]. An EEG classification system for fatigue detection that uses PSD as the feature extractor, a fuzzy swarm-based artificial neural network (ANN) as the classifier, and independent component analysis by entropy rate-bound minimization (ICA-ERBM) for source separation is presented in [17]. EEG classification has also been studied for fatigue detection [15,16,17] and for user identification in biometric systems [18,19,20].

This paper presents a novel EEG classification procedure based on time-series prediction. A new class of recurrent adaptive neuro-fuzzy inference system (ANFIS) is developed for predicting EEG time series. Autoregressive moving average (ARMA) prediction models, a typical feedforward ANFIS, and an ANN are also used as predictors in the same framework so that the developed recurrent ANFIS can be compared with existing systems.

ANNs are widely used for pattern recognition, including MI classification [11,12,13]. Fuzzy systems can describe objects and processes that are not defined precisely or whose description involves some uncertainty, and they play a valuable role in handling uncertainty. Therefore, there is growing interest in fuzzy systems in modern applications such as production techniques, medical studies, information technology, pattern recognition, decision making, data analysis, and diagnostics [3,4,5].

Neuro-fuzzy systems combine fuzzy systems and ANNs and exploit the benefits of both. The adaptive neuro-fuzzy inference system (ANFIS) is an important neuro-fuzzy approach that has shown notable potential in modeling nonlinear systems. In ANFIS, the parameters of the fuzzy membership functions are learned from a dataset that describes the system’s behavior [9, 10]. Successful uses of ANFIS in biomedical applications have been reported, such as EEG classification [21,22,23,24,25] and data analysis in biosignal processing [2, 26].

In this paper, a new recurrent structure based on ANFIS is proposed for EEG classification. The presented approach involves eight recurrent ANFISes and classifies four classes of EEG signals using raw time-domain EEG data as inputs. These recurrent ANFISes are trained with the back-propagation gradient descent method combined with the least squares method. After the training phase, each recurrent ANFIS is specialized in the kind of signal it was trained on and can distinguish signals of that class from signals of the other classes.

2 Experiment datasets

Dataset IIIa from BCI Competition III and dataset 2a from BCI Competition IV are used in this paper [28, 29]. Dataset IIIa was recorded from three healthy subjects (S1–S3), and dataset 2a was recorded from nine healthy subjects (S4–S12). Both datasets consist of four classes: left-hand movement imagination (class 1), right-hand movement imagination (class 2), foot movement imagination (class 3), and tongue movement imagination (class 4).

For dataset IIIa, signals were recorded in ten sessions for each subject. Subject S1 completed 360 trials, and subjects S2 and S3 completed 240 trials each [29, 30]. The signals for dataset 2a were recorded in two sessions on different days for each subject. Each session comprises six recording parts separated by short breaks. Each recording part contained 48 trials (12 trials for each class), giving 288 trials per session and 576 trials in total.

Figure 1 shows the timing diagram of the recording procedure [27,28,29,30]. The subjects sat in a comfortable armchair. A cue arrow was then shown on the screen, and the subjects were asked to imagine a movement according to the orientation of the arrow. The orientation of the arrow was selected randomly, and no feedback was provided to the subjects. The sampling frequency was 250 Hz. The EEG in dataset IIIa was filtered with a 1–50-Hz band-pass filter with the 50-Hz notch filter on. The EEG in dataset 2a was filtered with a 0.5–100-Hz band-pass filter and an additional 50-Hz notch filter to remove line noise [27,28,29,30]. The EEG recording systems for datasets IIIa and 2a were a 64-channel EEG amplifier from Neuroscan and a system of 22 Ag/AgCl electrodes, respectively, as shown in Fig. 2. The recording systems use the right mastoid as ground and the left mastoid as reference [28, 29].

Fig. 1 Timing of the paradigm: (a) dataset IIIa, (b) dataset 2a [28, 29]

Fig. 2 Position of EEG electrodes: (a) dataset IIIa, (b) dataset 2a [28, 29]

Two electrodes, C3 and C4, are used in this paper. They are placed over the left and right sensorimotor areas, respectively, which are the main active areas during the imagination of movements [31]. Their locations are marked in Fig. 2. Usually, EEG classification is preceded by a pre-processing step for artifact and noise removal [32,33,34]. In this paper, the signals that are highly affected by artifacts are removed from the database by an expert, and no other artifact removal procedure is used.

3 Recurrent ANFIS architecture

ANFIS is a kind of neural network combined with a Takagi–Sugeno fuzzy inference system. Because ANFIS merges neural networks and fuzzy systems, it gains the advantages of both in a single structure [9]. ANFIS has been a leading approach among neuro-fuzzy networks in recent years and is considered one of the best function approximators among existing neuro-fuzzy models [35]. To explain how ANFIS operates, two first-order Takagi–Sugeno fuzzy if–then rules are assumed:

$$ \begin{aligned} & {\text{Rule}}\,{1:} {\text{if}}\, x \,{\text{is}}\, A_{1 } \,{\text{and}} \,y \,{\text{is}} \,B_{1 } \, {\text{then}}\, f_{1} = p_{1 } x + q_{1 } y + r_{1 } \\ & {\text{Rule}}\,{2:} {\text{if}} \,x \,{\text{is}} \,A_{2 } \,{\text{and}}\, y \,{\text{is}}\, B_{2 } \, {\text{then}}\, f_{2} = p_{2 } x + q_{2 } y + r_{2 } \\ \end{aligned} $$
(1)

where x and y are the inputs, \( A_{i} \) and \( B_{i} \) are the fuzzy sets, \( f_{i} \) are the outputs within the fuzzy region specified by the fuzzy rules, and \( p_{i} \), \( q_{i} \), and \( r_{i} \) are the design parameters determined during the training phase. Figure 3 shows the ANFIS architecture that implements these two if–then rules. This network consists of five layers. If the output of node i of layer L is denoted by \( o_{L,i} \), the function of this network is described as follows:

Fig. 3 ANFIS architecture

The first layer calculates the membership degrees of the inputs in the fuzzy membership functions:

$$ \begin{aligned} o_{1,i} & = \mu_{{A_{i} }} \left( x \right) \quad i = 1,2 \\ o_{1,i} & = \mu_{{B_{i - 2} }} \left( y \right)\quad i = 3,4 \\ \end{aligned} $$
(2)

where \( \mu_{{A_{i} }} \left( x \right) \) and \( \mu_{{B_{i - 2} }} \left( y \right) \) are the fuzzy membership functions. For example, with a Gaussian function, \( \mu_{{A_{i} }} \left( x \right) \) is written as:

$$ \mu_{{A_{i} }} \left( x \right) = e^{{ - \left( {\frac{{x - c_{i} }}{{\sigma_{i} }}} \right)^{2} }} $$
(3)

where \( c_{i} \) and \( \sigma_{i} \) are the membership function parameters computed in the training procedure.

The output of the second layer is obtained by multiplying the output values of the first layer:

$$ o_{2,i} = w_{i} = \mu_{{A_{i} }} \left( x \right) \cdot \mu_{{B_{i} }} \left( y \right) \quad i = 1, 2. $$
(4)

The output of the third layer is calculated by normalizing the output values of the second layer:

$$ o_{3,i} = \bar{w}_{i} = \frac{{w_{i} }}{{w_{1} + w_{2} }}\quad i = 1, 2. $$
(5)

The output of the fourth layer is computed by multiplying the output values of the previous layer with a first-order polynomial. For the first-order Takagi–Sugeno fuzzy if–then rules, it can be calculated as:

$$ o_{4,i} = \bar{w}_{i} f_{i} = \bar{w}_{i} \left( {p_{i } x + q_{i } y + r_{i } } \right)\quad i = 1, 2 $$
(6)

Finally, the output of the network is calculated by the single node of layer five as:

$$ o_{5,1} = o = \mathop \sum \limits_{i = 1}^{2} \bar{w}_{i} f_{i} = \frac{{\mathop \sum \nolimits_{i = 1}^{2} w_{i} f_{i} }}{{\mathop \sum \nolimits_{i = 1}^{2} w_{i} }} $$
(7)

where o is the output of the network, and \( \bar{w}_{i} \) is the output of the third layer.

The architecture of ANFIS implementing these two rules is described in detail in [9].
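As an illustration, the five layers of Eqs. (2)–(7) can be sketched in Python for the two-rule case. This is a minimal sketch and not the implementation used in this paper; the function names, the parameter layout, and the numerical values are assumptions chosen only for the example.

```python
import math


def gaussian_mf(x, c, sigma):
    """Gaussian membership function of Eq. (3)."""
    return math.exp(-((x - c) / sigma) ** 2)


def anfis_forward(x, y, premise, consequent):
    """Forward pass of a two-input ANFIS following Eqs. (2)-(7).

    premise:    one (c_A, sigma_A, c_B, sigma_B) tuple per rule (hypothetical layout)
    consequent: one (p, q, r) tuple per rule
    """
    # Layers 1 and 2: membership degrees and firing strengths w_i
    w = [gaussian_mf(x, ca, sa) * gaussian_mf(y, cb, sb)
         for ca, sa, cb, sb in premise]
    # Layer 3: normalised firing strengths
    total = sum(w)
    w_bar = [wi / total for wi in w]
    # Layer 4: normalised strengths times the first-order rule outputs f_i
    o4 = [wb * (p * x + q * y + r)
          for wb, (p, q, r) in zip(w_bar, consequent)]
    # Layer 5: overall network output
    return sum(o4)


# Illustrative parameter values (not taken from the paper)
premise = [(-1.0, 1.0, -1.0, 1.0), (1.0, 1.0, 1.0, 1.0)]
consequent = [(1.0, 0.0, 0.0), (0.0, 1.0, 1.0)]
output = anfis_forward(0.0, 0.0, premise, consequent)
```

At the input (0, 0) both rules fire equally in this example, so the output is the average of the two rule outputs.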

However, a major deficiency of existing neuro-fuzzy networks is that, because of their feedforward structure, their effectiveness is limited to static problems. Consequently, they are less effective at representing dynamic functions than recurrent networks [36,37,38]. Upgrading the typical feedforward ANFIS to a recurrent type improves its ability to recognize time-series patterns, since the recurrent structure has a greater prediction capacity than the feedforward one. Therefore, in this paper, time-series prediction is performed by a modified ANFIS structure with an output-to-input feedback loop.

To convert the network architecture to the recurrent type, the prediction error is applied to the network as an input. The prediction error is thus fed back to the network, which yields a closed-loop system. This architecture is shown in Fig. 4. If the output of the typical ANFIS is considered a function of the inputs, o = f(x, y), the output of the recurrent ANFIS becomes o = f(x, y, e). With this change, the output of the network, which was a function of the inputs only, becomes a function of the inputs and of the errors of previous estimations.

Fig. 4 Recurrent ANFIS architecture

There are two groups of parameters that must be calculated to match the ANFIS output to the training data: the fuzzy membership function parameters, called premise parameters, and the linear parameters, called design parameters. The least squares and gradient descent methods are combined into a hybrid algorithm to calculate these parameters during training. The hybrid algorithm consists of a forward pass and a backward pass. In the forward pass, the least squares method adjusts the design parameters optimally while the premise parameters are held fixed. Once the optimal design parameters are found, the back-propagation gradient descent method is used in the backward pass to determine the optimal premise parameter values [9].
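The forward pass of the hybrid algorithm relies on the fact that, with the premise parameters fixed, the network output is linear in the design parameters \( p_{i} \), \( q_{i} \), and \( r_{i} \). The following Python sketch illustrates only this least squares step for the two-rule, two-input network of Eqs. (1)–(7). It is not the training code of this paper: the helper names, the use of the normal equations, and the synthetic data are assumptions made for the example, and the gradient descent backward pass is omitted.

```python
import math


def normalised_weights(x, y, premise):
    """Layers 1-3: Gaussian memberships, firing strengths, normalisation."""
    w = [math.exp(-((x - ca) / sa) ** 2) * math.exp(-((y - cb) / sb) ** 2)
         for ca, sa, cb, sb in premise]
    total = sum(w)
    return [wi / total for wi in w]


def regression_row(x, y, premise):
    """Regressors multiplying (p_i, q_i, r_i) when the premise is fixed."""
    row = []
    for wb in normalised_weights(x, y, premise):
        row.extend([wb * x, wb * y, wb])
    return row


def solve(a, b):
    """Solve a square linear system by Gaussian elimination with pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit_design_parameters(inputs, targets, premise):
    """Least squares estimate from the normal equations (A^T A) theta = A^T t."""
    rows = [regression_row(x, y, premise) for x, y in inputs]
    n = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    atb = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(n)]
    return solve(ata, atb)


# Synthetic check: targets generated from known design parameters
premise = [(-1.0, 1.0, -1.0, 1.0), (1.0, 1.0, 1.0, 1.0)]
true_theta = [1.0, -2.0, 0.5, 0.3, 0.7, -1.0]   # p1, q1, r1, p2, q2, r2
grid = [i * 0.5 - 2.0 for i in range(9)]
inputs = [(x, y) for x in grid for y in grid]
targets = [sum(c * v for c, v in zip(true_theta, regression_row(x, y, premise)))
           for x, y in inputs]
theta = fit_design_parameters(inputs, targets, premise)
```

In this noise-free example the fitted vector `theta` recovers the parameters that generated the targets, which is the sense in which the forward pass is optimal for a fixed premise.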

The network is applied differently in the training and test phases, because the error in the training phase differs from the error in the test phase. This problem is explained as follows:

Consider the sequence of inputs in the training phase to be (x1, y1, e1) and (x2, y2, e2) with outputs o1 and o2, respectively, and the sequence of inputs in the test phase to be (x1′, y1′, e1′) and (x2′, y2′, e2′) with outputs o1′ and o2′, respectively. If x2 = x2′ and y2 = y2′, one would expect o2 = o2′. However, if x1 ≠ x1′ and/or y1 ≠ y1′, then o1 ≠ o1′ and hence e2 ≠ e2′, because each error is affected by the previous step. As a result, although x2 = x2′ and y2 = y2′, the outputs differ (o2 ≠ o2′) because e2 ≠ e2′. This is a problem for time-series prediction in recurrent form. To resolve it, an error estimation system (EES) is used in the test phase. The EES is trained on the network training data, with x and y as inputs and e as output. Then, at step i of the test phase, the EES provides a suitable ei according to xi and yi (in the training phase, ei is affected by step i − 1).

The network can predict well in the test phase without the EES, but using the EES decreases the prediction error. Because the classification is based on the mean squared error (MSE) of prediction, the classification accuracy (CA) increases as a result. In the training phase, the network is configured according to Fig. 4. In the test phase, the e input is supplied by the EES, and the whole network is configured according to Fig. 5.

Fig. 5 Recurrent ANFIS predictor architecture

4 Methodology

4.1 Network configuration

In [39, 40], a feedforward neural network is used for EEG time-series prediction and EEG classification. With a feedforward network, previous values of the signal are used to predict the future value of the time series: the signal values at sample indexes t − 1 to t − n are used to predict the signal value at sample index t, which can be written as:

$$ \hat{x}\left( t \right) = f\left[ {x\left( {t - 1} \right),x\left( {t - 2} \right), \ldots ,x\left( {t - n} \right)} \right] $$
(8)

where \( x\left( t \right) \) is the signal value at sample index t, \( \hat{x}\left( t \right) \) is the predicted value of the signal at sample index t, and f is a feedforward network function that predicts \( \hat{x}\left( t \right) \) from prior values of the signal.

Using the errors of former predictions improves the estimation and reduces the prediction errors at subsequent steps, which yields a better prediction. With this consideration, \( \hat{x}\left( t \right) \) becomes a function of the prior values of the signal and the previous values of the prediction error:

$$ \hat{x}\left( t \right) = f\left[ {x\left( {t - 1} \right), \ldots ,x\left( {t - n} \right),e\left( {t - 1} \right), \ldots ,e\left( {t - m} \right)} \right] $$
(9)

where e(t) is the error of prediction at sample index t which is obtained as:

$$ e\left( t \right) = x\left( t \right) - \hat{x}\left( t \right) $$
(10)
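The closed loop defined by Eqs. (9) and (10) can be sketched in Python as follows. The sketch is generic: `f` stands for any trained predictor (a recurrent ANFIS in this paper), and `toy_predictor` is a hypothetical stand-in used only to make the example runnable. Setting the errors to zero before the first prediction is an assumption of the sketch, not something stated in the paper.

```python
def predict_with_error_feedback(signal, f, n=3, m=3):
    """One-step-ahead prediction with error feedback, Eqs. (9) and (10).

    f(past_values, past_errors) returns the prediction of x(t), where
    past_values = [x(t-1), ..., x(t-n)] and past_errors = [e(t-1), ..., e(t-m)].
    """
    errors = [0.0] * m                    # assumed initial condition
    predictions, error_history = [], []
    for t in range(n, len(signal)):
        past_values = [signal[t - k] for k in range(1, n + 1)]
        x_hat = f(past_values, errors)
        e = signal[t] - x_hat             # Eq. (10)
        predictions.append(x_hat)
        error_history.append(e)
        errors = ([e] + errors)[:m]       # newest error first, keep m values
    return predictions, error_history


def toy_predictor(past_values, past_errors):
    """Hypothetical predictor: last value plus half of the last error."""
    return past_values[0] + 0.5 * past_errors[0]


signal = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
predictions, error_history = predict_with_error_feedback(signal, toy_predictor)
```

Even with this crude stand-in, the fed-back error pulls each new prediction toward the ramp, which is the mechanism that Eq. (9) is meant to exploit.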

The values of n and m are chosen empirically to obtain a high and acceptable CA. They were initially set to small values and increased step by step until a high classification accuracy was reached. In this paper, n = 3 and m = 3, which give a high estimation accuracy. Higher values of n and m give a higher classification accuracy, but the training time increases dramatically (training neuro-fuzzy networks is typically time-consuming): with two fuzzy membership functions for each input, increasing either parameter by one doubles the number of fuzzy if–then rules and therefore roughly doubles the amount of computation. In this work, we aim for an acceptable classification accuracy with the minimum values of n and m.
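The growth in the number of rules can be made concrete with a short calculation. The sketch below assumes grid partitioning, i.e., one rule for every combination of membership functions across the n + m inputs; this is consistent with the twofold increase described above but is not stated explicitly in the text, and the function name is hypothetical.

```python
def rule_count(n, m, mfs_per_input=2):
    """Number of fuzzy if-then rules under the grid-partitioning assumption."""
    return mfs_per_input ** (n + m)


rules_3_3 = rule_count(3, 3)   # configuration used in this paper
rules_4_3 = rule_count(4, 3)   # one extra signal input
rules_4_4 = rule_count(4, 4)   # one extra signal input and one extra error input
```

Under this assumption, the chosen configuration has 64 rules, and each additional input doubles that number.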

Choosing m = 3 and n = 3, (9) is converted to (11); therefore, the signal and error values from time instants t − 3 to t − 1 are used to predict the measurement at time t:

$$ \hat{x}\left( t \right) = f\left[ {x\left( {t - 1} \right),x\left( {t - 2} \right),x\left( {t - 3} \right), e\left( {t - 1} \right), e\left( {t - 2} \right),e\left( {t - 3} \right)} \right] $$
(11)

To implement (11) for EEG time-series prediction, the network is configured in the training phase according to Fig. 6. After the training phase, once the prediction errors have been obtained, the EES is trained with x(t − 1), x(t − 2), and x(t − 3) as inputs and e(t − 1), e(t − 2), and e(t − 3) as outputs. Then, in the test phase, the final system that performs time-series prediction for the unknown signal is configured according to Fig. 7. The prediction can be done well without the EES, but using the EES improves the prediction, which in turn improves the CA. In this paper, the EES is a two-layer multilayer perceptron (MLP) with 8 neurons in the hidden layer.
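The test-phase configuration can be sketched in Python as follows. Both `f` (the trained predictor) and `ees` (the trained error estimation system) are passed in as callables; the two toy functions below are hypothetical stand-ins and do not represent the ANFIS or the MLP used in this paper.

```python
def predict_with_ees(signal, f, ees, n=3):
    """Test-phase prediction of Eq. (11): the error inputs come from the EES.

    ees(past_values) returns estimates of [e(t-1), e(t-2), e(t-3)] from
    past_values = [x(t-1), x(t-2), x(t-3)].
    """
    predictions = []
    for t in range(n, len(signal)):
        past_values = [signal[t - k] for k in range(1, n + 1)]
        estimated_errors = ees(past_values)
        predictions.append(f(past_values, estimated_errors))
    return predictions


def toy_predictor(past_values, errors):
    """Hypothetical predictor: last value corrected by the latest error."""
    return past_values[0] + errors[0]


def toy_ees(past_values):
    """Hypothetical EES: uses first differences as error estimates."""
    return [past_values[0] - past_values[1],
            past_values[1] - past_values[2],
            0.0]


test_signal = [0.0, 1.0, 2.0, 3.0, 4.0]
test_predictions = predict_with_ees(test_signal, toy_predictor, toy_ees)
```

The point of the design is that the error inputs in the test phase depend only on the current signal values, so they do not inherit the error history of a different signal.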

Fig. 6 Illustration of recurrent ANFIS architecture for EEG time-series prediction

Fig. 7 Recurrent ANFIS predictor architecture for EEG time-series prediction

Training a single predictor on the EEG signals of all classes is not ideal because of the complexity and nonstationarity of EEG data across the classes. According to the assumption underlying the neural time-series prediction framework, if more than one channel is used for the signals of each class, useful additional information about the differences between the classes can be extracted to enhance the separability of the overall features and, consequently, to improve BCI performance [40].

The number of EEG channels and the number of classes specify the number of predictor networks as:

$$ N = C \times E $$
(12)

where N is the number of predictor networks, E is the number of selected EEG channels, and C is the number of classes. Two electrodes, C3 and C4, are chosen, and there are four classes (right-hand, left-hand, tongue, and foot movement), so eight networks are employed as predictors for the EEG time-series data, and each network is trained on the EEG time series of one channel-class pair.
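Eq. (12) can be illustrated by enumerating the channel-class pairs. The labels below follow the naming used for the networks in this paper (channel name followed by the first letter of the class); the variable names are otherwise arbitrary.

```python
channels = ["C3", "C4"]
classes = {"L": "left hand", "R": "right hand", "F": "foot", "T": "tongue"}

# One predictor network per channel-class pair, N = C x E in Eq. (12)
network_labels = [channel + code for code in classes for channel in channels]
num_networks = len(classes) * len(channels)
```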

4.2 Classification procedure

The classification procedure consists of three phases. The first phase is the configuration and separate training of eight recurrent ANFISes to perform one-step-ahead prediction, using the sequence [x(t − 1), x(t − 2), x(t − 3), e(t − 1), e(t − 2), e(t − 3)] for each channel-class EEG time series, as described in Sect. 4.1. This phase (the training phase of the networks) is presented in Fig. 8.

Fig. 8 Classification procedure in the training phase

After the training stage, the networks are ready to classify unlabeled test signals in the two remaining phases, as shown in Fig. 9. In Fig. 9, the recurrent ANFISes are labeled according to the channel-class EEG time series on which they were trained: ‘C3L’ and ‘C4L’ for left-hand data from electrodes C3 and C4, ‘C3R’ and ‘C4R’ for right-hand data, ‘C3F’ and ‘C4F’ for foot data, and ‘C3T’ and ‘C4T’ for tongue data.

Fig. 9 Classification procedure in the test phase

In the second stage, according to Fig. 9, to find the class of an unknown signal, the signals recorded from the C3 and C4 electrodes are fed to the relevant networks (i.e., the signal recorded at C3 is fed to all networks labeled with C3, and likewise for the signal recorded at C4). Each network performs one-step-ahead prediction for the input signal in each trial. Then, in the third stage, the MSE of the prediction, i.e., the mean squared difference between the original test signal and the predicted signal, is calculated for the trial as follows:

$$ {\text{MSE}} = \frac{1}{M} \mathop \sum \limits_{k = 1}^{M} \left[ {x\left( k \right) - \hat{x}\left( k \right)} \right]^{2} $$
(13)

where x(k) is the actual signal value and \( \hat{x}\left( k \right) \) is the predicted signal value at sample k, k runs from 1 to M, and M is the number of samples in the input trial. The unlabeled signal is then assigned to the class of the network that achieves the lowest MSE of prediction. The MSE measures the difference between the original test signal and the predicted signal. The assumption underlying this classification procedure is that, because each network is trained on signals of just one class, it predicts signals of that class with low error, whereas the other networks produce larger errors on such signals because the structure of their training signals is different. Thus, the network whose training signals match the test signal predicts it with lower error than the other networks and thereby identifies the similarity between the test signal and its training signals. In this paper, common feature extraction methods are not used. Since the classification is based on time-series prediction, the network with the lowest prediction error is chosen as the winner, and the test signal is assigned the class of that network. For example, if the C3T network shows the lowest prediction error, the signal is classified as tongue movement. In other words, the prediction errors (there are 8 networks, so there are 8 prediction errors) can be regarded as the features and the minimum function as the classifier.
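The decision rule can be sketched in Python as follows. The sketch assumes that each network's predictions are already available and aligned sample by sample with the trial; the data structures, the function names, and the toy numbers are assumptions made for the example, and only four of the eight networks are shown for brevity.

```python
CLASS_NAMES = {"L": "left hand", "R": "right hand", "F": "foot", "T": "tongue"}


def mse(actual, predicted):
    """Mean squared prediction error of Eq. (13)."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def classify_trial(trial, predictions):
    """Pick the class of the network with the lowest prediction MSE.

    trial:       channel name -> recorded samples, e.g. {"C3": [...], "C4": [...]}
    predictions: network label -> predicted samples, e.g. {"C3T": [...]}
    """
    # Each network is compared with the channel it was trained on (label[:2])
    errors = {label: mse(trial[label[:2]], predicted)
              for label, predicted in predictions.items()}
    winner = min(errors, key=errors.get)
    return CLASS_NAMES[winner[2]], winner, errors


# Toy data, for illustration only
trial = {"C3": [1.0, 2.0, 3.0], "C4": [3.0, 2.0, 1.0]}
predictions = {
    "C3L": [1.0, 2.0, 4.0],
    "C4L": [0.0, 0.0, 0.0],
    "C3T": [1.0, 2.0, 3.5],
    "C4T": [3.0, 2.0, 2.0],
}
label, winner, errors = classify_trial(trial, predictions)
```

Here the ‘C3T’ stand-in tracks the C3 signal most closely, so the toy trial is labeled as tongue movement.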

This procedure can be used for any trial length by means of a sliding window. To do so, k in (13) ranges from k = α to M, where α and M are specified before each classification step (initially k = 1 and M = window size). The samples before index k = α are discarded as the window slides from k = α toward the end of the trial. The benefit of the sliding window is that the starting point of the user's communication does not need to be known; therefore, online classification is possible. An important point is the selection of the optimal window size: with a properly selected window size, maximum CA is obtained as the signal passes through the window. In [39], the optimal window is determined by an automated iterative search procedure. The optimal window size ranges between 2.3 and 3.2 s, which, at a sampling frequency of 250 Hz, corresponds to M in the interval [575, 800].
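The conversion from window length to number of samples and the windowed MSE can be sketched as follows. The step of one sample between successive windows and the function names are assumptions of the sketch; the paper does not specify the step size.

```python
def window_samples(seconds, fs=250):
    """Window length in samples for a given duration and sampling frequency."""
    return round(seconds * fs)


def sliding_mse(actual, predicted, window):
    """MSE of Eq. (13) evaluated over each position of a sliding window."""
    values = []
    for start in range(len(actual) - window + 1):
        squared = [(a - p) ** 2
                   for a, p in zip(actual[start:start + window],
                                   predicted[start:start + window])]
        values.append(sum(squared) / window)
    return values


shortest = window_samples(2.3)   # lower end of the optimal range
longest = window_samples(3.2)    # upper end of the optimal range
toy_mse = sliding_mse([1.0, 2.0, 3.0, 4.0], [1.0, 1.0, 1.0, 1.0], window=2)
```

In an online setting, the classification rule of Sect. 4.2 would be applied to the windowed MSE of every network at each window position.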

4.3 The comparison between the presented recurrent ANFIS and the other similar predictor systems like ANFIS, MLP, and ARMA

The presented recurrent ANFIS is compared with similar predictor systems: a typical feedforward ANFIS, an MLP, and an ARMA model. The ARMA model is a parametric model that consists of an autoregressive (AR) part and a moving average (MA) part and is described by:

$$ \hat{x}\left( t \right) = \mathop \sum \limits_{i = 1}^{m} a_{i} x\left( {t - i} \right) + \mathop \sum \limits_{j = 0}^{n} b_{j} e\left( {t - j} \right) $$
(14)

where \( a_{i} \) and \( b_{j} \) are the AR and MA parameters, respectively [41, 42].

For the comparison, the classification is carried out with the method described in Sect. 4.2. All stages of classification are the same; only the time-series predictor, which is a recurrent ANFIS in our work, is replaced by the MLP, the feedforward ANFIS, or the ARMA model. These systems predict the EEG time series in the same framework, and the CA of each is then obtained from its prediction errors. The comparison is made by setting m = 3 and n = 3 in (14) for the ARMA model and n = 6 in (8) for both the MLP and the feedforward ANFIS. Thus, the MLP and the feedforward ANFIS take 6 EEG signal values as inputs, and the ARMA model takes 3 EEG signal values and 3 prediction error values as inputs. The MLP used here has one hidden layer with 8 neurons, and the feedforward ANFIS has two membership functions in the first layer.
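For reference, one-step-ahead prediction with an ARMA model can be sketched in Python as follows. Eq. (14) is written with the MA sum starting at j = 0; the sketch uses only past errors, e(t − 1) to e(t − n), which is an assumption made here so that the prediction can be computed before e(t) is known, and which matches the three error inputs stated above. The coefficients are illustrative and are not fitted values from this paper.

```python
def arma_predict(signal, a, b):
    """One-step-ahead ARMA prediction using past values and past errors.

    a[i] multiplies x(t-1-i) and b[j] multiplies e(t-1-j); errors before the
    first prediction are assumed to be zero.
    """
    m, n = len(a), len(b)
    errors = [0.0] * n                      # e(t-1), ..., e(t-n), newest first
    predictions = []
    for t in range(m, len(signal)):
        ar_part = sum(a[i] * signal[t - 1 - i] for i in range(m))
        ma_part = sum(b[j] * errors[j] for j in range(n))
        x_hat = ar_part + ma_part
        predictions.append(x_hat)
        errors = ([signal[t] - x_hat] + errors)[:n]
    return predictions


# Illustrative coefficients: linear extrapolation plus an error correction
a = [2.0, -1.0, 0.0]
b = [0.5, 0.0, 0.0]
ramp_predictions = arma_predict([0.0, 1.0, 2.0, 3.0, 4.0, 5.0], a, b)
kink_predictions = arma_predict([0.0, 1.0, 2.0, 4.0, 6.0], a, b)
```

On the ramp the AR part alone is exact, whereas on the second signal the error made at the change of slope is carried into the next prediction through the MA term.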

5 Results and discussion

The performance of the presented procedure is demonstrated in two parts: the effectiveness of the presented recurrent ANFIS in predicting EEG time series, and the CA of the classifier based on time-series prediction. The accuracy of the proposed classification procedure is assessed using fivefold cross-validation.
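The fivefold cross-validation can be sketched in Python as follows. How the trials were assigned to folds is not described in this paper; the interleaved assignment without shuffling used below, and the function name, are assumptions of the sketch.

```python
def kfold_indices(num_trials, k=5):
    """Yield (train_indices, test_indices) for k-fold cross-validation."""
    folds = [list(range(i, num_trials, k)) for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Example with the 288 trials of one session of dataset 2a
splits = list(kfold_indices(288))
```

Each trial is used exactly once for testing, and the reported accuracy is the average over the five folds.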

Figure 10 shows an example of the performance of the recurrent ANFIS in predicting EEG time-series data. According to Fig. 10, the proposed network can model the EEG signal with negligible error. Although EEG signals are highly nonstationary and time-variant, the network tracks them well.

Fig. 10 Predicted signal versus actual EEG

Table 1 compares the MSE of the proposed network, the ARMA model, the MLP-ANN, and the feedforward ANFIS. In Tables 1 and 2, subjects S1–S3 are from dataset IIIa and subjects S4–S12 are from dataset 2a. Table 1 shows that the proposed network has a smaller MSE than the other methods. The classification system operates on the MSE of each network, and the unlabeled signal is classified according to the network with the lowest MSE, so the classifier is more accurate when the predictor achieves a lower MSE. The lower MSE of the recurrent ANFIS in Table 1 means that it identifies the structure of EEG signals better than the other methods. The MSE of prediction influences the CA directly: as the MSE decreases, the CA increases.

Table 1 MSE comparison for 12 subjects (S1–S12) of EEG time-series prediction for each framework, including recurrent ANFIS, feedforward ANFIS, ARMA model, and neural network

Table 2 Comparison of average percentage classification accuracy rates under fivefold cross-validation for 12 subjects (S1–S12) of EEG time-series prediction for each framework, including recurrent ANFIS, feedforward ANFIS, ARMA model, and neural network

The percentage of CA for the twelve subjects S1 to S12 is presented in Table 2. The percentage of CA is calculated as:

$$ {\text{CA}}\% = \frac{\text{the number of correctly classified signals}}{\text{the number of all test signals}} \times 100. $$
(15)
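Eq. (15) corresponds to the following short Python function. The labels in the example are made up for illustration and are not results from this paper.

```python
def classification_accuracy(true_labels, predicted_labels):
    """Percentage of correctly classified test signals, Eq. (15)."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return 100.0 * correct / len(true_labels)


# Made-up labels: classes 1-4 as defined for the datasets
example_ca = classification_accuracy([1, 2, 3, 4], [1, 2, 3, 1])
```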

The classification is based on the morphological differences between EEG signals for different imagined actions, because the EEG signals are classified in the time domain and the differences in their shapes drive the classification. According to Table 2, the CA of the recurrent ANFIS is higher than that of the other methods, owing to its lower MSE of prediction. The lower MSE is achieved because of the feedback structure of the designed network.

The time course of the CA for the proposed recurrent ANFIS is shown in Fig. 11. Because the imagination in the recording procedure takes place from 3 to 7 s, this period is considered in this paper. The CA starts from 0% at 3 s and rises sharply, and after a while, no tangible changes are observed. According to Fig. 11, the CA becomes stable after a while despite the intrinsic time-to-time variation in EEG, which is an important property for a BCI system.

Fig. 11 Time course of CA % for recurrent ANFIS based on time-series prediction for 12 subjects (S1–S12)

This approach shows significant potential for the classification of such complex and nonstationary EEG signals. The designed network is obtained by feeding back the errors of former predictions, which forms a closed-loop system. Using the errors of former predictions improves the estimation and reduces the prediction errors at subsequent steps, which yields a better prediction. As a result, by using the error information, the system can handle the uncertainty and the time-variant nature of the EEG signals and can model them despite their nonlinearity and nonstationarity. A higher modeling accuracy leads to a higher CA. Another potential advantage is that the classification can be carried out without knowing the starting point of communication; thus, this method has potential for online EEG processing. Moreover, because the signal processing and classification are done in the time domain, there is no need to map the signals to other domains (e.g., the frequency domain) or to apply typical feature extraction and dimension reduction methods.

6 Conclusion

This paper presents a new class of recurrent adaptive fuzzy neural network that uses time-series prediction to classify EEG signals in brain–computer interface systems. The recurrent ANFIS is built by feeding back the errors of previous predictions as inputs to form a closed-loop system. With feedback loops, this recurrent ANFIS can exploit memorized past information to strengthen its capability in temporal prediction problems, such as modeling complex and nonstationary EEG signals. Overall, it is shown that the developed recurrent ANFIS structure performs well in modeling and tracking.

A network is assigned to the signals of each class. Each recurrent ANFIS is trained on the EEG signals of one class and recognizes a similar structure in a test signal by predicting it with less error than the other networks. The classification is based on the morphological differences between MI signals for different tasks. Results of online classification for twelve subjects are presented, which demonstrate the potential of this prediction and classification method using recurrent ANFIS for use in a BCI.

A new class of recurrent ANFIS is developed that can be used for EEG modeling. The proposed network can also be used for biosignal modeling and processing. Because of the inherent day-to-day variability in EEG signals, feature extraction must be handled with care. The proposed classification method operates in the time domain and does not need typical feature extraction and dimension reduction methods. Moreover, classification based on the proposed time-series prediction does not need any knowledge of the starting point of signal recording. As a result, it can be useful for online asynchronous classification.