1 Introduction

Epilepsy is a brain disorder that affects nerve cell activity. It has many negative effects on human life, including seizures, unusual behavior, and even loss of consciousness [6, 9]. Epileptic activity in the brain can be detected by experts through electroencephalography (EEG). Machine learning techniques help experts classify EEG signals better and diagnose disorders more accurately [7, 14, 39]. Some epilepsy patients are resistant to drug treatment because a particular portion of the brain causes this resistance. EEG signals are used to locate this brain portion. Surgical removal of the portion is an accepted treatment method; hence, determining the correct location is of the utmost importance. EEG signals recorded from the parts of the brain affected by epileptic seizures are classified as focal, whereas signals recorded from brain regions not affected by epileptic seizures are called nonfocal [25].

1.1 Motivation and the proposed method

As epilepsy is one of the most common brain disorders worldwide and affects patients’ quality of life, any new method that facilitates its diagnosis and treatment deserves attention [12]. The characteristics of focal and nonfocal EEG signals differ; therefore, machine learning methods can detect these differences automatically. In this study, an automated and highly accurate method is recommended for classifying EEG signals. This work’s primary goal is to use appropriate preprocessing, feature generation, and feature selection methods to reach a high prediction rate using shallow classifiers. Thus, multi-scale principal component analysis (MSPCA) is utilized as a denoising method in preprocessing. Multileveled or multilayered feature generators have high-, moderate-, and low-level feature extraction capabilities, as in deep models [8]. Therefore, a multileveled feature generation method is preferred in this work. The tunable Q-factor wavelet transform (TQWT) [30] is employed to create levels. A new local histogram-based feature generation function, namely the cube pattern, is proposed for feature extraction. Two-dimensional (2D) graph-based generators (generally called local graph structures) are commonly used in the literature. In this study, the feature generation and extraction abilities of a three-dimensional (3D) shape are investigated through the cube pattern. An appropriate feature selector must be used to reach high performance and to decrease the classifiers’ training and testing time. Therefore, the neighborhood component analysis (NCA) [15] selector is used in this phase. Twenty-five different classifiers are used to demonstrate the general success of this model.

1.2 Contributions

Contributions of this work can be summarized as follows:

  • A new 3D pattern-based feature generation function, the cube pattern, is presented.

  • The proposed model yielded high accuracy rates by employing 25 shallow/conventional classifiers. Our approach focuses on combining the proposed cube pattern, TQWT, and NCA to obtain a highly accurate EEG classification model. Moreover, this model outperforms the existing methods (see Table 6) and has low computational complexity.

1.3 Literature review

Manual classification of the EEG signals can be subjective and error-prone. Thus, automated methods are proposed in the literature to aid medical professionals. A literature survey was performed to acquire related work covering recent studies (after 2019) that conduct EEG signal classification on Bern-Barcelona and Bonn datasets and have a minimum accuracy of 89%. Selected studies are listed in Table 1.

Table 1 Literature Review

Considering the studies listed in Table 1, it can be observed that hand-crafted feature generation models have been widely used for focal and nonfocal EEG signal classification. However, hand-modeled methods generally generate low-level features. To improve classification ability, deep networks have been applied to EEG datasets. Daoud and Bayoumi [10] extracted features using a deep convolutional autoencoder, and the generated features were classified using a multi-layer perceptron. They used the Bonn and Bern-Barcelona EEG datasets, and their model achieved 96.0% and 93.21% accuracies for these datasets, respectively.

Convolutional neural networks (CNNs) have generally been used to solve computer vision problems in the literature [20]. However, San-Segundo et al. [29] used a CNN together with the Fourier transform to present a deep EEG signal classification model, and their accuracy reached 98.90%. Fraiwan and Alkhodari [13] presented a recurrent neural network (RNN) based EEG classification model; bi-directional long short-term memory (Bi-LSTM) was applied to the Bern-Barcelona dataset, and 99.60% accuracy was reached. As Table 1 denotes, this model [13] has the highest result among the listed methods applied to the Bern-Barcelona dataset.

Although deep network-based EEG classification models attain high accuracies, they are expensive in terms of computational complexity. Therefore, a lightweight and highly accurate model is needed. Our work aims to provide this kind of solution.

1.4 Organization

The organization of the rest of the article is as follows. Material and method are explained in Section 2, results are given in Section 3, discussions are presented in Section 4, and conclusions are given in Section 5.

2 Material and method

2.1 Material

The recommended model is evaluated on the Bern-Barcelona EEG dataset, which is one of the most widely preferred EEG datasets in machine learning studies [4, 7, 31, 32]. Its common usage allows us to compare our method with other techniques that use the same dataset. The dataset contains EEG signals belonging to two classes: focal and nonfocal. These signals were collected from the Fz and Pz channels, and their sampling rate is 512 Hz. In this work, the EEG signals are divided into frames of 5120 samples. Therefore, 5000 focal and 5000 nonfocal EEG signals have been used [3].
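For illustration, the framing step can be sketched as follows. This is a minimal example assuming each recording is already loaded as a NumPy array; a frame length of 5120 samples corresponds to 10 s at the 512 Hz sampling rate.

```python
import numpy as np

def segment(recording, frame_len=5120):
    """Split a 1-D recording into non-overlapping frames of frame_len samples.

    A minimal sketch: 5120 samples correspond to 10 s at 512 Hz; any trailing
    samples that do not fill a complete frame are discarded.
    """
    n_frames = len(recording) // frame_len
    return recording[: n_frames * frame_len].reshape(n_frames, frame_len)
```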

2.2 The proposed method

The fundamental objective of this work is to investigate the feature generation ability of a 3D shape-based pattern. Therefore, a new cube pattern is presented for feature generation, and a graph-based textural feature extractor is analyzed on a focal EEG dataset. This work recommends a new-generation, hand-crafted-feature-based, simple, and effective classification model for focal and nonfocal EEG signals. The model uses MSPCA-based denoising, TQWT, the cube pattern feature generation network, selection of the most discriminative features via NCA, and classification with 25 conventional classifiers. The primary objective of this model is to yield a high performance rate with a low time cost. An overview of the presented EEG signal classification model is given in the following steps; detailed explanations are provided in the rest of the article.

  1. Denoise (reduce the noise in) the loaded raw EEG signals by employing MSPCA.

  2. Apply TQWT to the denoised signals to create levels. Values of 1, 3, and 6 are assigned to the Q-factor, redundancy, and level-number parameters of TQWT, respectively. In this step, seven sub-bands are generated by applying TQWT with these parameters.

  3. Generate 1024 features using the presented cube pattern function. The cube pattern feature generation function extracts 128 features from a one-dimensional signal. The cube pattern is applied to the denoised EEG signal and its seven TQWT sub-bands; thus, 1024 (8 × 128) features are extracted.

  4. Generate weights for the extracted 1024 features by applying NCA and select the most informative/discriminative 128 of them.

  5. Forward the chosen 128 features to the 25 classifiers and calculate the performance metrics by comparing the actual outputs with the predicted results.

A visual representation of the presented model is shown in Fig. 1.

Fig. 1
figure 1

A visual representation of the presented TQWT and Cube Pattern-based method (SB stands for sub-band)

2.2.1 Preprocessing

The first phase of this model is denoising. Existing noise in the EEG signals is removed by using MSPCA. The MSPCA denoising method decomposes the EEG signals by combining the wavelet transform and PCA. MSPCA generalizes PCA for a multivariate signal represented as a matrix by simultaneously performing a PCA on the detail coefficient matrices of the different decomposition levels. Another PCA is also performed on the coarser approximation coefficient matrix in the wavelet domain and on the final reconstructed matrix. By choosing the number of key components retained, simplified signals are reconstructed, and in this way noise removal is performed [22, 38]. Thanks to this combination, the benefits of both techniques are obtained. The generated features are directly affected by noise; therefore, this phase has critical importance for yielding a high classification rate.
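As a rough illustration of the idea behind MSPCA (wavelet decomposition combined with per-level PCA), the following Python sketch denoises a multichannel signal matrix. It is a simplified stand-in, not the exact implementation used here; the wavelet name, decomposition level, and number of retained components are illustrative choices.

```python
import numpy as np
import pywt
from sklearn.decomposition import PCA

def mspca_denoise(X, wavelet="db4", level=4, n_components=1):
    """MSPCA-style denoising sketch for a signal matrix X of shape (n_samples, n_channels)."""
    n_channels = X.shape[1]
    # Wavelet-decompose each channel: coeffs[ch] = [cA_level, cD_level, ..., cD_1].
    coeffs = [pywt.wavedec(X[:, ch], wavelet, level=level) for ch in range(n_channels)]
    denoised = []
    for lvl in range(level + 1):
        # Stack the lvl-th coefficient vectors of all channels into one matrix.
        C = np.stack([coeffs[ch][lvl] for ch in range(n_channels)], axis=1)
        # Keep only the leading principal components and project back.
        pca = PCA(n_components=min(n_components, C.shape[1]))
        denoised.append(pca.inverse_transform(pca.fit_transform(C)))
    # Reconstruct each channel from its denoised coefficients.
    rec = [pywt.waverec([denoised[lvl][:, ch] for lvl in range(level + 1)], wavelet)
           for ch in range(n_channels)]
    return np.stack(rec, axis=1)[: X.shape[0], :]
```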

2.2.2 Feature generation

In the proposed model, a multileveled feature generation method is employed. The feature generator uses TQWT and the cube pattern together. TQWT is a third-generation wavelet transformation method introduced by Selesnick [30]. It is very effective for decomposition and feature generation; therefore, hand-crafted feature extractors have used TQWT to generate high-level features. TQWT is a parametric decomposition model: several wavelet filters can be obtained by adjusting its parameters, namely the Q-factor (oscillatory value), the redundancy, and the number of levels. The Q-factor determines the oscillation of the signal; if the Q-factor is set to one, a non-oscillatory decomposition is applied. The redundancy parameter can be defined by the user, and the number-of-levels parameter depends on the length of the signals. By using this parametric transformation mechanism, variable wavelet coefficients are calculated according to the particular problem. The used feature generation model is described in detail below (L is used as the abbreviation of Level, describing the different levels of the method):

  1. L1: Decompose the EEG signal by using TQWT. The Q-factor (Q), the redundancy (r), and the level number (J) are assigned as 1, 3, 6, respectively.

    $$ SB^{j} = TQWT(Signal, 1, 3, 6),\quad j = \{1, 2, \dots, 7\} $$
    (1)
  2. L2: Generate features from the signal and the sub-bands. The cube pattern extracts 128 features from a one-dimensional signal. In mathematical notation, the cube pattern is defined as the CP(.) function.

    $$ X(k, j) = CP(Signal),\quad k = \{1, 2, \dots, NS\},\quad j = \{1, 2, \dots, 128\} $$
    (2)
    $$ X(k, 128 \ast i + j) = CP(SB^{i}),\quad i = \{1, 2, \dots, 7\} $$
    (3)

    where X denotes the generated features and NS defines the number of signals. Equations 2 and 3 represent feature generation and concatenation. The sub-steps of the cube pattern are as follows:

    1. L2.1: Divide the one-dimensional signal (EEG signal) into overlapping blocks of eight values.

      $$ bl^{h}(k) = Signal(h + k - 1),\quad h = \{1, 2, \dots, len(Signal) - 7\},\quad k = \{1, 2, \dots, 8\} $$
      (4)

      where bl^h defines the h-th overlapped block, and len(.) represents the length calculation function.

      A cube has eight corners. Therefore, each value is used as a corner of the cube. An eight-sized block is shown in Fig. 2.

      Fig. 2
      figure 2

      An eight-sized block. Each value in the block corresponds to a corner of the cube

    2. L2.2: Create the cube by using the values of the block. The edges of the cube denote the relationships between the values. Binary features are generated by using these relationships and the signum function. A cube has 12 edges (see Fig. 3), and 12 bits are generated from them.

      Fig. 3
      figure 3

      The presented cube pattern

      $$ \left[\begin{array}{c} bit_1 \\ bit_2 \\ bit_3 \\ bit_4 \\ bit_5 \\ bit_6 \\ bit_7 \\ bit_8 \\ bit_9 \\ bit_{10} \\ bit_{11} \\ bit_{12} \end{array}\right] = Sign\left(\left[\begin{array}{c} P1, P2 \\ P1, P4 \\ P1, P5 \\ P2, P3 \\ P2, P6 \\ P3, P4 \\ P3, P7 \\ P4, P8 \\ P5, P8 \\ P5, P6 \\ P6, P7 \\ P7, P8 \end{array}\right]\right) $$
      (5)
      $$ Sign(t, i) = \left\{\begin{array}{ll} 0, & t - i < 0 \\ 1, & t - i \ge 0 \end{array}\right. $$
      (6)

      where Sign(., .) is the signum function, t and i are its input parameters, and bit_i is the i-th extracted binary feature.

    3. L2.3: Divide the generated 12 bits into left and right sections. The first six bits form the left section, and the remaining six bits form the right section.

      $$ left(j) = bit(j),\quad j = \{1, 2, \dots, 6\} $$
      (7)
      $$ right(j) = bit(j + 6) $$
      (8)
    4. L2.4: Calculate the left and right map signals.

      $$ map^{left}(h) = \sum_{j=1}^{6} left(j) \ast 2^{j-1} $$
      (9)
      $$ map^{right}(h) = \sum_{j=1}^{6} right(j) \ast 2^{j-1} $$
      (10)

      where map^left and map^right define the left and right map signals, respectively.

    5. L2.5: Extract histogram values of the generated map signals. Equations 9 and 10 denote that these map signals are coded with six bits. Therefore, the length of each map signal histogram is calculated as 2^6 = 64.

    6. L2.6: Combine the extracted histograms to generate the feature vector of the cube pattern.

      $$ feature(s) = Hist^{left}(s),\quad s = \{1, 2, \dots, 64\} $$
      (11)
      $$ feature(s + 64) = Hist^{right}(s) $$
      (12)

      where Hist^left and Hist^right are the histograms of the left and right map signals, respectively.

      The given six steps define the CP(.) function. A minimal implementation sketch of this function is given below.
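The following NumPy sketch implements the CP(.) function as described in steps L2.1–L2.6: overlapping eight-sample blocks (Eq. 4), twelve edge comparisons per block (Eqs. 5 and 6), six-bit left/right map signals (Eqs. 9 and 10), and two 64-bin histograms concatenated into 128 features (Eqs. 11 and 12). The only notational change is that the cube corners P1–P8 are indexed from zero.

```python
import numpy as np

# Edge list of the cube: each pair (a, b) produces one bit via Sign(P_a, P_b),
# following Eq. (5); indices are zero-based (P1 -> 0, ..., P8 -> 7).
EDGES = [(0, 1), (0, 3), (0, 4), (1, 2), (1, 5), (2, 3),
         (2, 6), (3, 7), (4, 7), (4, 5), (5, 6), (6, 7)]

def cube_pattern(signal):
    """Extract the 128-dimensional cube pattern feature vector from a 1-D signal."""
    signal = np.asarray(signal, dtype=float)
    # All overlapping 8-sample blocks at once (Eq. 4): shape (len-7, 8).
    blocks = np.lib.stride_tricks.sliding_window_view(signal, 8)
    # Twelve binary edge comparisons per block (Eqs. 5-6): 1 if P_a - P_b >= 0.
    bits = np.stack([(blocks[:, a] - blocks[:, b] >= 0).astype(int)
                     for a, b in EDGES], axis=1)          # shape (len-7, 12)
    weights = 2 ** np.arange(6)                           # 1, 2, 4, ..., 32
    map_left = bits[:, :6] @ weights                      # Eq. 9: 6-bit codes (0..63)
    map_right = bits[:, 6:] @ weights                     # Eq. 10
    hist_left = np.bincount(map_left, minlength=64)       # 64-bin histogram
    hist_right = np.bincount(map_right, minlength=64)
    return np.concatenate([hist_left, hist_right])        # 128 features (Eqs. 11-12)
```

Applying this function to the denoised EEG signal and to each of its seven TQWT sub-bands yields the 8 × 128 = 1024 features used in the next phase.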

2.2.3 Feature selection

Feature selection is a critical phase of the proposed method. The recommended feature extraction method generates 1024 features, and this phase aims to select the most discriminative 128 of them. NCA, one of the widely preferred weight-based selectors, is used in this work. NCA is a variant of kNN that allows feature selection. It is a distance-based selector presented by Goldberger et al. [15]. Fixed initial weights are assigned in NCA, and a distance function (Manhattan) together with an optimization function (stochastic gradient descent) is used to create positive weights. By using the created weights, the most valuable features are selected [15]. The steps (abbreviated as ‘S’) of this phase are listed below; an illustrative sketch is given after the list.

  • S1: Normalize each feature individually.

    $$ maxi(j) = \max\left(X(:, j)\right) $$
    (13)
    $$ mini(j) = \min\left(X(:, j)\right) $$
    (14)
    $$ X^{N}(k, j) = \frac{X(k, j) - mini(j)}{maxi(j) - mini(j)} $$
    (15)

    Herein, X(:, j) defines the jth feature values, maxi(j) represents the maximum value of the jth feature, mini(j) is the minimum value of the jth feature, and XN expresses normalized features.

  • S2: Calculate weights deploying NCA.

  • S3: Sort the generated weights to find indices by descending.

    $$ [w^{S}, idx] = sort(w) $$
    (16)

    where w denotes the feature weights generated by NCA, idx expresses the sorted indices, and sort(.) is the (descending) sorting function.

  • S4: Select 128 of the most informative/discriminative features (X^S) by using X^N and idx.

    $$ X^{S}(k, c) = X^{N}(k, idx(c)),\quad c = \{1, 2, \dots, 128\} $$
    (17)
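A sketch of steps S1–S4 is given below. Since MATLAB's NCA feature-weighting routine has no direct scikit-learn counterpart, the ANOVA F-score is used here as a stand-in weight generator purely to illustrate the normalize → weight → sort → select flow of this phase; the paper itself uses NCA weights.

```python
import numpy as np
from sklearn.feature_selection import f_classif

def select_top_features(X, y, n_select=128):
    """Illustrative normalize/weight/sort/select pipeline (steps S1-S4)."""
    # S1: min-max normalize each feature individually (Eqs. 13-15).
    mini, maxi = X.min(axis=0), X.max(axis=0)
    Xn = (X - mini) / (maxi - mini + 1e-12)
    # S2: compute one weight per feature (stand-in for the NCA weights).
    weights, _ = f_classif(Xn, y)
    weights = np.nan_to_num(weights)
    # S3: sort the weights in descending order to obtain indices (Eq. 16).
    idx = np.argsort(weights)[::-1]
    # S4: keep the n_select most discriminative features (Eq. 17).
    return Xn[:, idx[:n_select]], idx[:n_select]
```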

2.2.4 Classification

The MATLAB (2020a) Classification Learner toolbox is used to apply 25 conventional classifiers. X^S is used as the input of these classifiers, and all classifiers are used with their default settings. 10-fold cross-validation is used for training and testing. The used classifiers are listed in Table 2.

Table 2 Used classifiers
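The evaluation protocol can be sketched as follows. An RBF-kernel SVM is used as the closest scikit-learn analogue of MATLAB's Medium Gaussian SVM, and the remaining classifiers of Table 2 can be substituted into the same loop; the parameter values shown are illustrative defaults, not the toolbox's exact settings.

```python
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.svm import SVC

def evaluate(Xs, y):
    """10-fold cross-validation of one classifier on the selected features Xs."""
    clf = SVC(kernel="rbf", gamma="scale", C=1.0)  # RBF SVM with default-like settings
    cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
    scores = cross_val_score(clf, Xs, y, cv=cv, scoring="accuracy")
    return scores.mean(), scores.std()
```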

3 Results

Results of this study have been calculated by using 25 conventional classifiers. Different metrics are used to evaluate the performance of the presented method. Mathematical expressions of these metrics are listed below [37].

$$ Accuracy=\frac{Tp+ Tn}{Tp+ Tn+ Fp+ Fn} $$
(18)
$$ F1=\frac{2 Tp}{2 Tp+ Fp+ Fn} $$
(19)
$$ Precision=\frac{Tp}{Tp+ Fp} $$
(20)
$$ Sensitivity=\frac{Tp}{Tp+ Fn} $$
(21)
$$ Specificity=\frac{Tn}{Tn+ Fp} $$
(22)
$$ Geometric\ mean=\sqrt{\frac{Tn}{Tn+ Fp}\ast \frac{Tp}{Tp+ Fn}} $$
(23)

where Tp, Tn, Fp, and Fn express true positives, true negatives, false positives, and false negatives, respectively. By using these parameters (see Eqs. 18–23) and with the help of the 25 classifiers, the results are calculated as shown in Table 3.
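For completeness, a small helper that evaluates Eqs. 18–23 from the actual and predicted labels might look as follows; this is an illustrative sketch using scikit-learn's confusion matrix for a binary problem.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

def binary_metrics(y_true, y_pred):
    """Compute the metrics of Eqs. (18)-(23) from a binary confusion matrix."""
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
    accuracy    = (tp + tn) / (tp + tn + fp + fn)   # Eq. 18
    f1          = 2 * tp / (2 * tp + fp + fn)       # Eq. 19
    precision   = tp / (tp + fp)                    # Eq. 20
    sensitivity = tp / (tp + fn)                    # Eq. 21
    specificity = tn / (tn + fp)                    # Eq. 22
    g_mean      = np.sqrt(sensitivity * specificity)  # Eq. 23
    return dict(accuracy=accuracy, f1=f1, precision=precision,
                sensitivity=sensitivity, specificity=specificity, g_mean=g_mean)
```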

Table 3 The calculated results of the presented EEG signal classification model

Table 3 denotes that the most accurate results are obtained from the Medium Gaussian SVM, which reaches 99.97% classification accuracy. Twenty-four of the 25 used classifiers result in >99% classification accuracy. The worst classifier is the Fine Gaussian SVM, which yields 98.08% accuracy. The results demonstrate the high classification capability of the presented TQWT and cube pattern-based focal/nonfocal EEG classification model. The ROC (Receiver Operating Characteristic) curve of the Medium Gaussian SVM is shown in Fig. 4.

Fig. 4
figure 4

The calculated ROC curve of the Medium Gaussian SVM. According to this figure, a 100.0% AUC value was calculated

Moreover, the time complexity of the presented model is calculated using Big O notation. This model consists of four main phases (see Section 2.2): preprocessing, feature generation, feature selection, and classification. Table 4 shows the time burden of the presented model for each phase and in total.

Table 4 Time complexities of the presented EEG detection model

In Table 4, n defines the length of the EEG signal, while k and t represent the time complexity variables of NCA and the used classifiers, respectively. In this model, several classifiers with different time burdens are used to calculate the results. For instance, the time complexity of kNN is O(nd), while O(nd^3) is the time burden of the SVM. The memory complexity of the presented EEG detection model is also given in Table 5.

Table 5 The memory burden of the presented EEG detection model

In Table 5, n, f, p, and d are the length of the signal, the number of features, the number of selected features, and the number of used EEG observations, respectively. Considering Tables 4 and 5, it can be concluded that the proposed method is lightweight.

4 Discussions

TQWT is an effective and fast decomposition model that provides various wavelet decompositions by changing the Q, r, and J parameters. The recommended cube pattern is a microstructure for feature generation. We have investigated the feature generation ability of 3D shapes/graphs by proposing the cube pattern. The presented model yielded very high accuracy using 25 different classifiers; the calculated accuracies range from 98.08% to 99.97%. The accuracy rates of previously presented EEG classification models and our method are listed in Table 6.

Table 6 Classification accuracies (%) of the existing classification models and the presented model (sorted per the increasing accuracy rate)

Table 6 denotes that our method yields the best accuracy rate. The best of the other models reached 99.92% classification accuracy, and we achieved better results than that model with five classifiers (Quadratic SVM, Cubic SVM, Medium Gaussian SVM, Fine kNN, Subspace kNN). The generated and selected separable features also have a positive effect on this high classification performance. A scatter plot showing the distribution of these features is given in Fig. 5, and their statistical properties are provided in Fig. 6 by using boxplot analysis.

Fig. 5
figure 5

A scatter plot of the selected features. a Statistical properties of the generated and selected focal EEG features. b Statistical properties of the generated and selected nonfocal EEG features

Fig. 6
figure 6

Boxplot denotation of the generated and selected features

Figures 5 and 6 demonstrate the distinctiveness of the features, which contributes to the high calculated performance. Moreover, a t-test was applied to the generated and chosen 128 features, and the calculated p values are shown in Fig. 7. The minimum p value is zero, and the average value is 4.0940e-13. Figures 5, 6, and 7 clearly denote the discriminative attributes of the features obtained using TQWT, the recommended cube pattern, and the NCA selector.

Fig. 7
figure 7

The calculated p-values of the features

Per these findings and results, the following points can be highlighted as the advantages of the presented model:

  • A new 3D pattern, the cube pattern, is presented in this research. By applying the cube pattern on the chosen dataset, the effectiveness of the 3D shape-based pattern is investigated.

  • The presented multileveled hand-crafted feature generation model has low-level and high-level feature generation abilities.

  • The most discriminative features are generated and selected for classification by using TQWT, the cube pattern-based generation, and the NCA selection models.

  • 25 different classifiers have been used for calculating the results. The range of the yielded accuracies is from 98.08% to 99.97%.

  • Simple/basic methods have been used together. Therefore, the model can be classified as lightweight.

5 Conclusions

This research presents a new 3D pattern-based feature generation method for classifying focal and nonfocal EEG signals. The primary goal of the presented TQWT and cube pattern-based model is to generate discriminative features to solve the EEG signal classification problem with high accuracy. Therefore, basic/simple methods were utilized together to create a useful model. The Bern-Barcelona dataset was used to perform the experiments, and the presented TQWT and cube pattern-based model was tested with 25 different classifiers. The proposed EEG classification model yielded >98% classification accuracy for all the used classifiers. The best-performing classifier is the Medium Gaussian SVM, whose accuracy is calculated as 99.97%. The results and findings denote the success of the presented model: by applying it, high accuracy rates were obtained with all of the used shallow classifiers.

In the near future, we are planning to develop an automatic EEG abnormality detection application to solve real-world problems. In this system, we will develop a new EEG signal classification model using a large EEG dataset, and this dataset will be used to train our presented 3D shape-based EEG classification models. In medical centers, the EEG signals will be sent to the trained model through a graphical user interface (GUI), and our cloud-based model will return its responses to the developed GUI. The intended system will help medical professionals and will speed up the diagnosis process.

Moreover, similar to the cube pattern (a 3D shape-based feature generator) presented in this research, other shapes or graphs can be used to propose novel transformations, decomposition techniques, and feature generation models. These models can be employed in advanced signal and image processing methods. Novel deep learning models can also be presented by deploying shape/graph-based feature generators.