Introduction

Epilepsy is a common chronic brain disease, particularly in resource-poor countries where anti-seizure medications are not available. In many parts of the world, the true nature of epilepsy has long been distorted by myths, fear, and mistaken notions about the disorder. The WHO is working with ministries of health and partners to improve access to treatment for epilepsy [1, 2]. Mortality due to epilepsy is a significant concern: patients with epilepsy die at a rate much greater than that shown in reports. Detection and classification of seizures at an early stage is therefore essential to reduce mortality. Several machine learning-based detection and classification methods are available but, in practice, the detection and classification of seizures remain complex and difficult. Because of the unpredictable nature of seizures and the difficulty of classification, researchers have proposed a range of detection and classification methods. Numerous researchers have recently developed classifiers for epileptic seizures using machine learning techniques, and some of the recent literature is reviewed below.

Abbasi et al. [3] used statistical characteristics of electroencephalogram (EEG) signals, obtained with the discrete wavelet transform, and a multilayer perceptron (MLP) neural network to detect epileptic seizures and achieved an accuracy of 98.33%. Sriraam et al. [4] combined multiple features with an MLP and achieved a sensitivity, specificity, and false detection rate of 97.1%, 97.8%, and 1 h⁻¹, respectively. Yuan et al. [5] proposed a novel diffusion distance (DD) feature and the wavelet transform for feature extraction, and a Bayesian linear discriminant analysis (BLDA) classifier for classification of seizures. The prediction method was evaluated on a public intracranial EEG dataset consisting of 577.67 h of recordings from 21 patients with 87 seizures. With a seizure prediction horizon of 10 s, the sensitivity was 85.11% for a seizure occurrence period of 30 min and 93.62% for a seizure occurrence period of 50 min, and the false prediction rate was low at 0.08/h. Gupta et al. [6] proposed a method for classifying epileptic seizures based on rhythms obtained with the Fourier–Bessel series expansion (FBSE) of EEG signals and on weighted multiscale Renyi permutation entropy (WMRPE). The resulting WMRPE feature values are fed to several classifiers, including regression, random forest (RF), and least squares support vector machine (LS-SVM). Tenfold cross-validation is used to confirm the robustness of the method, which is also examined under additive white Gaussian noise (AWGN) at various SNR levels. Sharma et al. [7] used an analytic time–frequency flexible wavelet transform (ATFFWT) with the LS-SVM classifier and achieved 100% sensitivity.

Arti et al. [8] suggested an automatic learning framework for identifying epileptic seizures from EEG data based on the Fourier–Bessel series expansion-based empirical wavelet transform (FBSE-EWT). The top-ranked cumulative features were classified with least squares support vector machine (LS-SVM), support vector machine (SVM), k-nearest neighbor (k-NN), and ensemble bagged tree classifiers. The framework was evaluated on two publicly accessible benchmark datasets: the Bonn EEG dataset and the Children's Hospital Boston and Massachusetts Institute of Technology (CHB-MIT) scalp EEG dataset. Using tenfold cross-validation, it obtained 100% classification accuracy on the Bonn EEG dataset and 99.84% on the CHB-MIT scalp EEG dataset. Yu et al. [9] proposed a kernel robust probabilistic collaborative representation-based classifier for seizure classification and obtained an accuracy of 99.3%. Sachin et al. [10] proposed features based on analytic intrinsic mode functions (AIMFs); with the LS-SVM classifier, they achieved an accuracy of 97.56%, sensitivity of 96.45%, specificity of 98.96%, positivity of 99.2%, negativity of 95.2%, and a minimal detection error rate of 4.28%.

Recent studies have emphasized the development of machine learning-based methods for the prediction and classification of epileptic seizures from EEG. Abhijit et al. [11] developed a tunable-Q wavelet transform-based multiscale entropy measure for automatic classification of epileptic EEG data and, with an SVM classifier and a wrapper-based feature selection method, achieved 99.5% accuracy. Bhattacharyya et al. [12] applied multivariate fuzzy entropy (mv) to multivariate sub-band signals with LS-SVM and obtained an accuracy of 84.675%. The time–time transform and harmony search local linear radial basis function network developed by Sreelekha et al. [13] achieved an accuracy of 99.45%.

Harikumar et al. [14] proposed singular value decomposition and an extreme learning machine (ELM) for the classification of seizures and achieved an average detection of 98.94% and a GDR of 97.83%. Atemangoh et al. [15] proposed Laguerre polynomial wavelets with a support vector machine and achieved 100% accuracy with threefold, fivefold, and tenfold cross-validation in some cases. Tuncer et al. [16] recognized epilepsy attacks from EEG signals using a 1D octal pattern and the wavelet transform, attaining 96.0% accuracy for five categories. Baykara et al. [17] proposed the Stockwell transform with an extreme learning machine and achieved sensitivity, specificity, and accuracy of 95%, 82%, and 90%, respectively. Zhou et al. [18] proposed a graph-based extreme learning machine for the recognition of imbalanced epileptic EEG signals. Deivasigamani et al. [19] proposed an adaptive neuro-fuzzy inference system (ANFIS) and neural networks (NN) for the diagnosis of epilepsy from EEG signals and achieved 96.7% sensitivity, 98.1% specificity, and 99.7% accuracy.

Hybrid machine learning models can enhance classifier performance and shorten the computational time of the classification process. Motivated by the limitations of the methods above, we propose a hybrid WCA–PSO optimization of the EELM model for the classification of seizure, non-seizure, and normal categories of EEG signals. To validate the proposed WCA–PSO algorithm, three benchmark functions are optimized, and the results are compared with those of the existing PSO [20], WCA [20], SCA [21], and ABC [22] optimization algorithms. The objective of the research is to develop a hybrid WCA–PSO weight-optimized ensemble extreme learning machine model that improves the classification of seizure and non-seizure EEG signals.

The contributions are as follows:

  • A novel water cycle algorithm–particle swarm optimization (WCA–PSO) hybrid optimization technique is developed for weight optimization of an ensemble extreme learning machine (EELM).

  • To validate the robustness of the WCA–PSO hybrid optimization technique, three benchmark functions are optimized, and the results are compared with those of conventional algorithms.

  • Different ELM-based classification models, such as the EELM, WCA–EELM, and proposed WCA–PSO–EELM models, are analyzed on the University of Bonn dataset.

The rest of the paper is organized as follows: the "Materials and Methods" section presents the research workflow block diagram and the proposed WCA–PSO-based EELM model, the "Results and Discussion" section presents the results of the research, the "Discussion" section discusses the results, and the "Conclusion" section concludes the paper.

Materials and Methods

Research Work Flow Diagram

The flow of the research work is shown in Fig. 1; it focuses on the classification of EEG seizure signals with a hybrid machine learning and soft computing model. In the first step, the signals are decomposed by Daubechies, Coiflet, and Symlet wavelet transforms, and the decomposition levels with the lowest frequencies are considered for feature extraction. In the second step, the features are fed to the proposed WCA–PSO-based EELM model for classification and error calculation. In the third step, the features are submitted as input to the EELM and WCA–EELM models for comparison of classification performance.

Fig. 1 Research work flow diagram

Proposed WCA–PSO–Ensemble Extreme Learning Model

Because tuning the learning parameters in PSO involves complex calculation, the hybrid WCA–PSO has been developed to maximize the performance of the EELM. In the PSO algorithm [23, 24], the velocity update equation is as follows:

$$v_{i} (n + 1) = \kappa v_{i} (n) + \beta_{1} r_{1} \left( {p_{i}^{\rm best} - x_{i} (n)} \right) + \beta_{2} r_{2} \left( {p^{\rm gbest} - x_{i} (n)} \right)$$
(1)

And the position update equation is given by

$$x_{i} (n + 1) = x_{i} (n) + v_{i} (n + 1)$$
(2)

where κ is the inertia coefficient, the learning factors β1 and β2 are the local and global position weight coefficients, respectively, and r1 and r2 are random numbers, conventionally drawn uniformly from [0, 1].
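
The two update rules can be illustrated with a minimal sketch of one PSO iteration. The function name `pso_step` and the default values of `kappa`, `beta1`, and `beta2` are illustrative assumptions and are not taken from this work.

```python
import numpy as np

def pso_step(x, v, p_best, g_best, kappa=0.7, beta1=1.5, beta2=1.5, rng=None):
    """One PSO iteration following Eqs. (1) and (2).

    x, v, p_best: arrays of shape (n_particles, D); g_best: array of shape (D,).
    kappa, beta1, beta2 are illustrative defaults, not values from the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    # r1, r2 ~ U[0, 1), drawn independently per particle and dimension
    r1 = rng.random(x.shape)
    r2 = rng.random(x.shape)
    # Eq. (1): inertia term + pull toward personal best + pull toward global best
    v_new = kappa * v + beta1 * r1 * (p_best - x) + beta2 * r2 * (g_best - x)
    # Eq. (2): move each particle by its new velocity
    x_new = x + v_new
    return x_new, v_new
```

When a particle already sits on both its personal best and the global best, the two attraction terms vanish and only the inertia term moves it.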

Water Cycle Algorithm

The WCA [20] is inspired by the water cycle, in which streams and rivers flow downhill and merge into the sea.

The water cycle algorithm starts with rainfall (precipitation), which generates an initial population of streams, i.e., candidate vectors of design variables. The best stream, i.e., the one with the minimum objective function value (for minimization problems), is selected as the sea, as shown in Fig. 2. A set of streams whose objective values are closest to the best objective value is then selected as the rivers, and the remaining streams are assumed to flow toward the rivers and the sea. Each stream, i.e., each candidate solution, is represented as a 1 × D dimensional array.

Fig. 2 New positions of streams flowing to the sea

Taking the positions and velocities from the PSO algorithm, the streams form a matrix of water courses of size \(S_{\rm Population} \times D\), where \(D\) is the dimension of the problem, and the corresponding position matrix is given by

$$S_{\rm Totpop} = \left[ {\begin{array}{*{20}c} {\rm Sea} \\ {{\rm riv}_{1} } \\ {{\rm riv}_{2} } \\ \vdots \\ {{\rm str}_{S_{\rm sr} + 1} } \\ {{\rm str}_{S_{\rm sr} + 2} } \\ {{\rm str}_{S_{\rm sr} + 3} } \\ \vdots \\ {{\rm str}_{S_{\rm pop} } } \\ \end{array} } \right] = \left[ {\begin{array}{*{20}c} {x_{1}^{1} } & {x_{2}^{1} } & \cdots & {x_{D}^{1} } \\ {x_{1}^{2} } & {x_{2}^{2} } & \cdots & {x_{D}^{2} } \\ \vdots & \vdots & \ddots & \vdots \\ {x_{1}^{S_{\rm pop} } } & {x_{2}^{S_{\rm pop} } } & \cdots & {x_{D}^{S_{\rm pop} } } \\ \end{array} } \right]$$
(3)

where the first \(S_{{\rm sr}}\) rows are selected as the sea and rivers. The cost of every stream is calculated by evaluating the cost function for that stream, e.g., cost = \(f\left( {{\rm River}_{n} } \right)\) for the n-th river. \(S_{{\rm sr}}\) is a user parameter equal to the number of rivers plus one (the sea). Within this group, the stream with the lowest objective value is chosen as the sea and the others are taken as the rivers; the remaining \(S_{{\rm Stream}}\) streams flow to the rivers or to the sea:

$$S_{{\rm sr}} = {\rm No}.\,{\rm of}\,{\rm rivers} + 1({\rm sea})$$
(4)
$$S_{{\rm Stream}} = S_{{\rm population}} - S_{{\rm sr}}$$
(5)
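
The partition of the population into sea, rivers, and streams can be sketched as follows. This is a minimal illustration that assumes a minimization problem and uniform random initialization within box bounds; the function name `init_wca_population` and its parameters are hypothetical.

```python
import numpy as np

def init_wca_population(cost_fn, s_pop, dim, n_rivers, lower, upper, rng=None):
    """Generate streams and split them into sea, rivers, and streams.

    Uniform initialization in [lower, upper] is an assumption of this sketch.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Each row is one stream (a 1 x D candidate solution), as in Eq. (3)
    pop = rng.uniform(lower, upper, size=(s_pop, dim))
    costs = np.apply_along_axis(cost_fn, 1, pop)
    # Sort ascending so the lowest-cost stream comes first
    order = np.argsort(costs)
    pop, costs = pop[order], costs[order]
    s_sr = n_rivers + 1          # Eq. (4): rivers plus the sea
    s_stream = s_pop - s_sr      # Eq. (5): remaining streams
    sea = pop[0]
    rivers = pop[1:s_sr]
    streams = pop[s_sr:]
    assert streams.shape[0] == s_stream
    return sea, rivers, streams, costs
```

For example, with a population of 20 and 4 rivers, the first row is the sea, the next 4 rows are rivers, and the remaining 15 rows are streams.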

Ensemble Extreme Learning Machine

Bayesian methods are utilized [23, 25, 26] to learn the output weights of the ELM to achieve better generalization. The proposed model is shown in Fig. 3.

Fig. 3 WCA–PSO-based EELM model

Consider a training dataset of N samples, \(D = \left\{ {\left( {x_{i} ,d_{i} } \right)} \right\},\;i = 1, \ldots ,N\), where each \(x_{i}\) is an input vector and \(d_{i}\) is the expected output. The output function of an ELM with L hidden neurons is represented by

$$y = \sum\limits_{k = 0}^{L} {\beta_{k} h_{k} \left( {w_{k} ;\,x} \right)}$$
(6)

where \(h\left( {w;\,x} \right) = \left[ {1,h_{1} \left( {w_{1} ;\,x} \right), \ldots ,h_{L} \left( {w_{L} ;\,x} \right)} \right]\) is the hidden layer output vector, \(\beta\) is the vector of output weights of the hidden neurons, and \(h_{k} ( \cdot )\) is the activation function of the k-th hidden neuron. For the N training samples, Eq. (6) can be written as follows:

$${\rm H}\beta = y$$
(7)

where \({\rm H}\) is the \({\rm N} \times \left( {L + 1} \right)\) hidden layer feature-mapping matrix, whose elements are as follows:

$${\rm H} = \left[ {\begin{array}{*{20}c} 1 & {h_{1} \left( {w_{1} ;\,x_{1} } \right)} & \cdots & {h_{L} \left( {w_{L} ;\,x_{1} } \right)} \\ \vdots & \vdots & \ddots & \vdots \\ 1 & {h_{1} \left( {w_{1} ;\,x_{N} } \right)} & \cdots & {h_{L} \left( {w_{L} ;\,x_{N} } \right)} \\ \end{array} } \right]$$
(8)

Here \(h_{L} \left( {w_{L} ;\,x_{N} } \right) = \left[ {w_{1} x_{1} + w_{2} x_{2} + \cdots + w_{N} x_{N} } \right] \cdot e^{{\left( {\frac{{ - \left\| {x_{N} - c_{j} } \right\|^{2} }}{{2\sigma_{n}^{2} }}} \right)}}\), where \(\sigma_{n}^{2}\) is the parameter that controls the width of the Gaussian kernel, and \(\left\| {x_{N} - c_{j} } \right\|\) is the Euclidean distance between the input and the centre \(c_{j}\). Equation (7) is a linear system, which is solved by

$${\text{H}}^{\dag } = \left( {{\text{H}}^{T} {\text{H}}} \right)^{{ - 1}} {\text{H}}^{T}$$
(9)
$$\hat{\beta } = {\text{H}}^{\dag } d,$$
(10)

where \({\text{H}}^{\dag }\) is the Moore–Penrose generalized inverse of matrix \({\rm H}\).
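
The least-squares solution via the Moore–Penrose inverse can be illustrated with a short sketch. It assumes randomly drawn hidden weights and a tanh activation purely for illustration (the activation functions used in this work differ), and the names `hidden_matrix`, `elm_fit`, and `elm_predict` are hypothetical.

```python
import numpy as np

def hidden_matrix(X, W, b):
    """Build the N x (L + 1) matrix H with a leading column of ones."""
    G = np.tanh(X @ W + b)  # assumed activation, for illustration only
    return np.hstack([np.ones((X.shape[0], 1)), G])

def elm_fit(X, d, n_hidden, rng=None):
    """Draw random hidden weights, then solve for the output weights."""
    rng = np.random.default_rng() if rng is None else rng
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    H = hidden_matrix(X, W, b)
    # np.linalg.pinv(H) equals (H^T H)^{-1} H^T when H has full column rank
    beta = np.linalg.pinv(H) @ d
    return W, b, beta

def elm_predict(X, W, b, beta):
    """Evaluate y = H beta for new inputs."""
    return hidden_matrix(X, W, b) @ beta
```

Only the output weights are fitted; the hidden weights stay fixed after they are drawn, which is what makes the training step a single linear solve.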

Two activation functions are used in the ensemble ELM [24]: the radial basis function and a variable activation function [27, 28]. Mapping the streams and rivers of the WCA onto the weights of the ELM model, we have

$${\rm Ws}_{n} = {\rm round}\left\{ {\left| {\frac{{f\left( {{\rm River}_{n} } \right)}}{{\sum\limits_{i = 1}^{{W_{{\rm sr}} }} {f\left( {{\rm River}_{i} } \right)} }}} \right| \times W_{{\rm Stream}} } \right\},\,\,\,\,\,\,\,n = 1,2,3\,\,...W_{{\rm sr}}$$
(11)

where \(Ws_{n}\) is the number of streams that flow to the n-th river or to the sea, \(f\) is the evaluation function, and \(W = \left[ {W_{1} , \ldots ,W_{4} , \ldots ,W_{n} } \right]\) are the weights of the ELM model.
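
The allocation in Eq. (11) can be sketched as follows. Because rounding does not guarantee that the counts add up to the number of streams, this sketch assigns any remainder to the sea; that correction, and the function name `allocate_streams`, are assumptions of the example. The costs are assumed to have a nonzero sum.

```python
import numpy as np

def allocate_streams(costs_sr, n_streams):
    """Number of streams assigned to the sea (index 0) and to each river."""
    costs_sr = np.asarray(costs_sr, dtype=float)
    # Eq. (11): share of each river/sea in the total cost, scaled by stream count
    share = np.abs(costs_sr / costs_sr.sum())
    ws = np.round(share * n_streams).astype(int)
    # Assumption of this sketch: put any rounding remainder on the sea
    ws[0] += n_streams - ws.sum()
    return ws
```

Note that `np.round` rounds exact halves to the nearest even integer, so tie cases may differ from a round-half-up convention.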

With the optimized weights \(W\), the hidden layer matrix \({\rm H}\) becomes

$$H_{w} = \left[ {\begin{array}{*{20}c} 1 & {h_{1} \left( {W_{1} ;\,x_{1} } \right)} & \cdots & {h_{L} \left( {W_{L} ;\,x_{1} } \right)} \\ \vdots & \vdots & \ddots & \vdots \\ 1 & {h_{1} \left( {W_{1} ;\,x_{N} } \right)} & \cdots & {h_{L} \left( {W_{L} ;\,x_{N} } \right)} \\ \end{array} } \right]$$
(12)

and, by applying the orthogonal projection method, the output weight vector is obtained in either of two equivalent forms:

$$\hat{\beta }_{w} = \left\{ {\begin{array}{*{20}l} {H_{w}^{T} \left( {\frac{I}{\lambda } + H_{w} H_{w}^{T} } \right)^{ - 1} d} \\ {\left( {\frac{I}{\lambda } + H_{w}^{T} H_{w} } \right)^{ - 1} H_{w}^{T} d} \\ \end{array} } \right.$$

where \(I\) is the identity matrix, \(\lambda\) is the regularization parameter, \(d = \left[ {d_{1} ,d_{2} , \ldots ,d_{N} } \right]^{T}\) is the vector of expected outputs, and \(\hat{\beta } = \left[ {\beta_{0} ,\beta_{1} , \ldots ,\beta_{L} } \right]^{T}\) is the vector of output weights. The first form inverts an \(N \times N\) matrix and is preferable when the number of training samples is small relative to the number of hidden neurons; the second inverts an \((L + 1) \times (L + 1)\) matrix and is preferable otherwise.
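
The two expressions for the output weights are the dual and primal forms of the same regularized least-squares solution. The sketch below assumes the standard regularized ELM form, with the identity matrix scaled by 1/λ and with \(H_{w}^{T} H_{w}\) in the second branch so that the dimensions agree; the function name `ridge_output_weights` and the value of `lam` in any call are hypothetical.

```python
import numpy as np

def ridge_output_weights(H, d, lam):
    """Regularized output weights, choosing the cheaper of the two forms."""
    n, m = H.shape  # n samples, m = L + 1 hidden-layer columns
    if n <= m:
        # Dual form: solve an n x n system, then map back with H^T
        return H.T @ np.linalg.solve(np.eye(n) / lam + H @ H.T, d)
    # Primal form: solve an m x m system
    return np.linalg.solve(np.eye(m) / lam + H.T @ H, H.T @ d)
```

Using `np.linalg.solve` instead of forming the inverse explicitly is a numerical design choice of the sketch; both branches return the same vector up to rounding error.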

Validation of Proposed WCA–PSO Algorithm

To assess the performance of the proposed WCA–PSO hybrid algorithm, three benchmark functions [21] are optimized. The proposed WCA–PSO optimization technique is compared with the existing WCA [20], PSO [21], SCA (sine cosine algorithm) [21], and ABC (artificial bee colony algorithm) [22] metaheuristic algorithms, and the comparison results are presented in Figs. 4, 5, and 6.

Fig. 4 Validation of function F1 using WCA–PSO hybrid algorithm

Fig. 5 Validation of function F2 using WCA–PSO hybrid algorithm

Fig. 6 Validation of function F3 using WCA–PSO hybrid algorithm

Database Description

The EEG signals used for the experiments were collected at the University of Bonn, Germany [29]. The dataset includes five sets of EEG recordings, denoted A, B, C, D, and E, each containing 100 single-channel EEG segments of 4097 samples (23.6-s duration) recorded at a sampling frequency of 173.61 Hz [24]. A sample signal from subset E is presented in Fig. 7.
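
As a quick consistency check on these figures, the segment duration follows from the sample count and the sampling frequency:

$$T = \frac{4097}{173.61\,{\rm Hz}} \approx 23.6\,{\rm s}$$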

Fig. 7 Seizure EEG signal representation of E-subset data

Wavelet Transform

The discrete wavelet transform (DWT) is especially suitable for the analysis of EEG signals [30]. The present work uses three types of wavelet for EEG analysis to attain the best seizure detection accuracy at low computational cost. PCA [31, 32] is a dimensionality reduction technique that projects the dataset onto a lower-dimensional space by matrix factorization, at the cost of losing some variance.
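
A multilevel DWT can be illustrated with the Haar wavelet, which is the simplest member of the Daubechies family (db1). This is a minimal sketch only: the wavelet orders and the number of decomposition levels used in this work are not stated here, and the function names and the padding of odd-length inputs are assumptions of the example.

```python
import numpy as np

def haar_dwt(signal):
    """Single-level Haar DWT: returns (approximation, detail) coefficients."""
    x = np.asarray(signal, dtype=float)
    if x.size % 2:
        # Assumption of this sketch: pad odd-length input with its last sample
        x = np.append(x, x[-1])
    approx = (x[0::2] + x[1::2]) / np.sqrt(2.0)  # low-pass branch
    detail = (x[0::2] - x[1::2]) / np.sqrt(2.0)  # high-pass branch
    return approx, detail

def haar_wavedec(signal, levels):
    """Multilevel decomposition, returned as [A_L, D_L, ..., D_1]."""
    coeffs = []
    approx = np.asarray(signal, dtype=float)
    for _ in range(levels):
        # Each level splits the current approximation into two half-length bands
        approx, detail = haar_dwt(approx)
        coeffs.append(detail)
    coeffs.append(approx)
    return coeffs[::-1]
```

Each level halves the bandwidth of the approximation, so successive detail bands cover progressively lower frequency ranges. In practice, a wavelet library such as PyWavelets (`pywt.wavedec`) would typically be used for Daubechies, Coiflet, and Symlet filters of higher order.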

Feature Extraction

Choosing significant features is crucial for the proper classification of epileptic seizures. Six features are extracted: power spectral density (PSD), Shannon entropy, energy, mean, kurtosis, and standard deviation.
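
The six features can be computed along the following lines. The exact definitions used in this work are not given, so the choices below are assumptions: the PSD is summarized as the mean of a periodogram estimate, the Shannon entropy is taken over the normalized squared amplitudes, and the kurtosis is the fourth standardized moment. The function name `extract_features` is hypothetical, and the input is assumed to be non-constant.

```python
import numpy as np

def extract_features(x, fs=173.61):
    """Six summary features of a 1-D signal or coefficient vector."""
    x = np.asarray(x, dtype=float)
    n = x.size
    mean = x.mean()
    std = x.std()
    energy = np.sum(x ** 2)
    # Kurtosis as the fourth standardized moment (equals 3 for a Gaussian)
    kurtosis = np.mean((x - mean) ** 4) / std ** 4
    # Periodogram PSD estimate, summarized here by its mean value
    psd = np.abs(np.fft.rfft(x)) ** 2 / (fs * n)
    psd_mean = psd.mean()
    # Shannon entropy (bits) of the normalized energy distribution
    p = x ** 2 / energy
    p = p[p > 0]
    shannon_entropy = -np.sum(p * np.log2(p))
    return {
        "psd_mean": psd_mean,
        "shannon_entropy": shannon_entropy,
        "energy": energy,
        "mean": mean,
        "kurtosis": kurtosis,
        "std": std,
    }
```

Applying such a function to each wavelet sub-band of each EEG segment yields one feature vector per segment for the classifier.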

Results and Discussion

Wavelet Transform Results

Classification Performance Results

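Sensitivity, also known as the true positive rate (TPR), specificity, also known as the true negative rate (TNR), and accuracy are used to assess the performance of the classifiers. A 10 × 10 cross-validation procedure is used for this experiment. With TP, TN, FP, and FN denoting the numbers of true positives, true negatives, false positives, and false negatives, respectively, the measures are defined as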

$${\rm Sensitivity} = {\rm TPR} = \frac{{\rm TP}}{{{\rm TP} + {\rm FN}}}$$
$${\rm Specificity} = {\rm TNR} = \frac{{\rm TN}}{{{\rm TN} + {\rm FP}}}$$
$${\rm Accuracy} = \frac{{\rm TP} + {\rm TN}}{{{\rm TP} + {\rm TN} + {\rm FP} + {\rm FN}}}$$
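
These three measures translate directly into code. The function name and the counts in the usage example are hypothetical and are not results from this study.

```python
def classification_metrics(tp, tn, fp, fn):
    """Sensitivity, specificity, and accuracy from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)                 # TPR
    specificity = tn / (tn + fp)                 # TNR
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return sensitivity, specificity, accuracy

# Hypothetical counts: 90 seizures detected, 10 missed, 95 correct rejections, 5 false alarms
sens, spec, acc = classification_metrics(tp=90, tn=95, fp=5, fn=10)
```

With these counts, the sensitivity is 90/100, the specificity is 95/100, and the accuracy is 185/200.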

Discussion

The pseudocode for the WCA–PSO algorithm is presented in Table 1. Figure 1 presents the implementation steps of the research, whereas Fig. 2 illustrates the WCA process, showing the sea, the rivers, and the positions of the streams. Figure 3 shows the architecture of the proposed WCA–PSO-based EELM model. Table 2 lists the three benchmark functions used to validate the proposed WCA–PSO algorithm. The normalized statistical features of the A, B, C, D, and E EEG datasets are shown in Table 3.

Figures 4, 5, and 6 present the validation of functions F1, F2, and F3 using the SCA, WCA, PSO, ABC, and WCA–PSO algorithms. In each case, the SCA, WCA, PSO, and ABC algorithms converge more slowly than the proposed WCA–PSO method. For function F1 (Fig. 4), SCA, WCA, PSO, and ABC required approximately 300, 150, 660, and 850 iterations, respectively, whereas the proposed WCA–PSO required about 70 iterations to converge. SCA, WCA, PSO, and ABC achieved optimal values of 0.65629, 0.41259, 0.31254, and 0.24318, respectively, whereas the proposed WCA–PSO achieved an optimal value of 0.19778. For function F2 (Fig. 5), the proposed WCA–PSO took about 100 iterations, whereas SCA, WCA, PSO, and ABC required about 600, 250, 630, and 1000 iterations, respectively, to converge. SCA, WCA, PSO, and ABC achieved optimal values of 0.72184, 0.42838, 0.25427, and 0.2371, respectively, whereas the proposed WCA–PSO achieved an optimal value of 0.18337. For function F3 (Fig. 6), the proposed WCA–PSO took about 125 iterations, whereas SCA, WCA, PSO, and ABC required about 450, 180, 610, and 1000 iterations, respectively, to converge. SCA, WCA, PSO, and ABC achieved optimal values of 0.65248, 0.39457, 0.19241, and 0.18731, respectively, whereas the proposed WCA–PSO achieved an optimal value of 0.17293.

The seizure EEG signal from the E subset, which has a length of 4097 data points, is shown in Fig. 7. Figure 8 shows the Coiflet wavelet decomposition of the seizure EEG signal, in which the alpha band shows the seizure region, whereas Fig. 9 shows the Symlet wavelet decomposition, in which the theta band shows the seizure region. Figure 10 shows the gamma, beta, alpha, theta, and delta decomposition levels for the Daubechies wavelet.

Table 1 PSEUDOCODE: WCA–PSO algorithm for weight optimization of EELM model
Table 2 Benchmark functions for validation of the proposed WCA–PSO algorithm
Table 3 Normalized statistical features of A, B, C, D, and E EEG dataset
Fig. 8 Coiflet wavelet decomposition of seizure EEG signal

Fig. 9 Symlet wavelet decomposition of seizure EEG signal

Fig. 10 Daubechies wavelet decomposition of seizure EEG signal

The maximum occurs at 106.00 Hz for gamma, 44.00 Hz for beta, 28.00 Hz for alpha, 13.00 Hz for theta, and 7.00 Hz for delta. In this case, alpha shows the seizure region, so the Daubechies wavelet was chosen as the reference for feature extraction. A total of 5 × 600 = 3000 feature data points were used as input to the ELM, EELM, WCA–EELM, and proposed WCA–PSO–EELM models. The proposed WCA–PSO–EELM model achieved a classification accuracy of 99.12%. Figure 11 shows that the WCA–PSO–EELM model reached convergence in approximately 40 iterations, whereas the ELM, EELM, and WCA–EELM models required 450, 400, and 150 iterations, respectively. The mean square error of each of the classification models is compared in Fig. 12. The proposed model also required less computational time than the other classifiers: it took 24.1128 s to converge, whereas the ELM, EELM, and WCA–EELM models took 98.4839 s, 65.3782 s, and 34.2569 s, respectively. The details of the performance evaluation of the classifiers are presented in Table 4.

Fig. 11 Training accuracy results

Fig. 12 Mean square error results

Table 4 Performance evaluation of classifiers

Conclusion

This research work presented a novel WCA–PSO-based EELM model for classification, with the wavelet transform used for the identification of seizure signals and for feature extraction. The proposed WCA–PSO algorithm was used for weight optimization of the EELM model. To test the robustness of the hybrid WCA–PSO algorithm, three benchmark functions were considered, and the results were compared with those of the SCA, WCA, PSO, and ABC algorithms. Six features extracted from the EEG signals were fed to the proposed WCA–PSO-based EELM for classification, and the results were compared with those of the ELM, EELM, and WCA–EELM models. The proposed WCA–PSO-based EELM model gives better classification results than these models. Even though its computational time is close to that of the WCA–EELM model, its classification accuracy is better. The proposed WCA–PSO-based EELM model has shown good potential for classifying EEG signals into non-seizure and seizure categories.