
1 Introduction

Epilepsy has become the third most common neurological disorder after stroke and dementia. According to the World Health Organization (WHO), epilepsy affects 0.5–1.5% of the world population, mainly children under 10 and people over 65, and it is more common in developing countries and disadvantaged socioeconomic classes [2]. Seizures are the hallmark of an epilepsy diagnosis: they are recurrent, though infrequent, unprovoked events caused by the synchronized electrical discharge of a large number of neurons [9]. Since epilepsy arises when the localized electrical activity of neurons becomes imbalanced, analyzing electroencephalographic (EEG) signals is one of the most suitable methods to diagnose this disorder.

Most computer-aided systems for diagnosing epilepsy use EEG because it allows rapid, visual inspection of seizures, not only while they are occurring, but also before and between seizures [12]. The common procedure for developing automatic diagnostic-assistance systems based on EEG has five stages (citation): EEG signal acquisition, pre-processing, signal characterization, classification, and in-context interpretation (visualization). Zhou and colleagues [13] developed an epileptic seizure detector that uses the raw EEG signals (a temporal approach), while Tsipouras [10] studied epilepsy classification using the spectral information of EEG signals.

This work contributes an exploratory study of feature extraction and classification techniques for seizure detection. The proposed framework comprises the typical stages described above: a simple amplitude normalization as pre-processing; feature extraction using statistical measures on both the original signals and their spectral transformation (Discrete Wavelet Transform, DWT); feature selection with methods such as BestFirst and Ranker; and classification of the selected features with four of the most representative approaches for EEG: Linear Discriminant Analysis (LDA), Quadratic Discriminant Analysis (QDA), K-Nearest Neighbor (KNN) and Support Vector Machine (SVM). The framework uses the “Epileptic Seizure Recognition Data Set” to run several tests that explore the proposed features and classifiers as thoroughly as possible. The outcome of the study points to DWT-based features combined with the SVM classifier as the most suitable combination.

2 Materials and Methods

2.1 Dataset

The “Epileptic Seizure Recognition Data Set” is composed of recordings from 500 individuals, each consisting of 4097 data points acquired over 23.6 seconds. Every 4097-point recording was divided and shuffled into 23 chunks of 178 data points, each corresponding to one second of signal [8]. The dataset therefore forms a matrix of dimension \(11500 \times 178\), with an additional final column holding the class label (1, 2, 3, 4, 5) as follows:

  1. Seizure activity
  2. EEG signal from the area where the tumor is located
  3. EEG activity from the healthy brain area
  4. EEG signal recorded with eyes closed
  5. EEG signal recorded with eyes open.
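As an illustration, assuming the dataset is stored as a CSV file whose rows contain the 178 data points followed by the class label in a final column (the file name data.csv is hypothetical), a minimal MATLAB sketch for loading it is:

```matlab
% Minimal loading sketch. Assumption: the CSV file holds the 178 data
% points per row followed by the class label (1-5) in the last column;
% the file name is hypothetical.
raw = readmatrix('data.csv');   % 11500 x 179 numeric matrix
X   = raw(:, 1:178);            % one-second EEG chunks (178 samples each)
y   = raw(:, end);              % class labels 1-5
```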

2.2 EEG Pipeline

Pre-processing: The signals are normalized to remove offset levels using Eq. 1.

$$\begin{aligned} S = \frac{S - \bar{S}}{\max \left| S \right| } \end{aligned}$$
(1)

where \(S\) is the signal, \(\max \left| S \right| \) is the maximum absolute value of the signal, and \(\bar{S}\) is the mean of the signal.
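A one-line MATLAB sketch of this normalization, applied row-wise to a matrix X whose rows are the one-second chunks, could look as follows:

```matlab
% Normalize each one-second chunk (each row of X) as in Eq. 1:
% subtract the mean and divide by the maximum absolute value of the signal.
Xn = (X - mean(X, 2)) ./ max(abs(X), [], 2);
```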

Signal Decomposition: The Discrete Wavelet Transform (DWT) recursively decomposes a signal into two sub-signals of lower resolution, known as the approximation and detail coefficients [6]. In this work, the Daubechies family of order four (db4) is used through the MATLAB function wavedec.
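For reference, a db4 decomposition of a single normalized chunk with wavedec might look as follows; the number of decomposition levels (four) is an illustrative assumption, since the paper only specifies the wavelet order:

```matlab
% DWT of one normalized chunk using Daubechies order 4 (db4).
% Requires the Wavelet Toolbox; the decomposition level is an assumption.
s      = Xn(1, :);                      % one 178-sample chunk
level  = 4;
[C, L] = wavedec(s, level, 'db4');      % concatenated coefficients + bookkeeping vector
cA     = appcoef(C, L, 'db4', level);   % approximation coefficients at the coarsest level
cD     = detcoef(C, L, 1:level);        % cell array with the detail coefficients per level
```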

Characterization: Following previous works on EEG signal characterization [1, 11], the following features are used:

Number | Type | Description
\(x^{(1)} \cdots x^{(15)}\) | Temporal | Statistical features, entropy
\(x^{(16)} \cdots x^{(27)}\) | Morphological | Area under the curve, amplitude change, energy
\(x^{(28)} \cdots x^{(35)}\) | Spectral | Fourier transform (best features)
\(x^{(36)} \cdots x^{(221)}\) | Representative | DWT (temporal and morphological features)

This results in a feature matrix of dimension \(11500 \times 221\), where 221 is the total number of features, each normalized through Eq. 1.
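The paper does not list the exact definition of every feature, but, continuing from the snippets above (Xn and cD), a sketch of how a few representative temporal, morphological and DWT-based features could be computed per chunk is shown below; the concrete choices (e.g. Shannon entropy via wentropy, trapezoidal area under the curve) are illustrative assumptions:

```matlab
% Illustrative feature computation for one normalized chunk s; the concrete
% feature set is an assumption, not the paper's exact list of 221 features.
% Requires the Statistics and Machine Learning Toolbox and the Wavelet Toolbox.
s            = Xn(1, :);
featTemporal = [mean(s), std(s), skewness(s), kurtosis(s), ...
                wentropy(s, 'shannon')];           % statistical features + entropy
featMorph    = [trapz(abs(s)), ...                 % area under the curve
                sum(abs(diff(s))), ...             % total amplitude change
                sum(s.^2)];                        % energy
featDWT      = cellfun(@(d) sum(d.^2), cD);        % energy of each DWT detail band
featureVec   = [featTemporal, featMorph, featDWT]; % one row of the feature matrix
```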

Feature Selection: In order to keep only the most significant features, two methods from Weka [5] are used to create a smaller subset:

  (i) Using CfsSubsetEval as attribute evaluator and BestFirst as search method [4]. This is applied to the whole feature matrix.

  (ii) Using InfoGainAttributeEval as attribute evaluator and Ranker as search method. This is applied to the outcome of the previous step.

This yields a final feature matrix of size \(11500 \times 9\), where 9 is the total number of features used in this work; most of them come from the DWT subset.
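Since the feature selection is performed in Weka rather than MATLAB, one way to reproduce it is to export the feature matrix to ARFF and call Weka's AttributeSelection filter from MATLAB via system; the file names and the exact command-line flags below follow Weka's generic filter interface but are assumptions that should be checked against the installed Weka version:

```matlab
% Hedged sketch: invoke Weka's attribute selection from MATLAB.
% File names and command-line flags are assumptions.
cfsCmd  = ['java -cp weka.jar weka.filters.supervised.attribute.AttributeSelection ' ...
           '-E "weka.attributeSelection.CfsSubsetEval" ' ...
           '-S "weka.attributeSelection.BestFirst" ' ...
           '-c last -i features.arff -o features_cfs.arff'];
rankCmd = ['java -cp weka.jar weka.filters.supervised.attribute.AttributeSelection ' ...
           '-E "weka.attributeSelection.InfoGainAttributeEval" ' ...
           '-S "weka.attributeSelection.Ranker -N 9" ' ...
           '-c last -i features_cfs.arff -o features_selected.arff'];
system(cfsCmd);    % step (i): CfsSubsetEval + BestFirst on the whole feature matrix
system(rankCmd);   % step (ii): InfoGainAttributeEval + Ranker on the previous output
```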

2.3 Classification

These features are used to train and evaluate four different classifiers, one representative method from each typology: distance-based (k-NN), model-based (LDA, QDA) and data-driven (SVM) [3, 7]:

  • Linear Discriminant Analysis (LDA): LDA projects the data using a pooled covariance matrix to build a hyperplane that separates the classes, and assigns a new sample to the class with the maximum estimated probability.

  • Quadratic Discriminant Analysis (QDA): A variant of LDA in which an individual covariance matrix is estimated for every class of observations.

  • K-Nearest Neighbour (k-NN): k-NN assigns the label that is most common among the k nearest neighbours of a sample, where proximity is computed with a distance metric.

  • Support Vector Machine (SVM): SVM finds the optimal hyperplane (in an N-dimensional space) that separates the data by maximizing the margin between the classes.

For all the experiments (see the next section), each classifier was trained over 10 iterations, in each one using a random 80% of the data for training and the remaining 20% for testing. All experiments were run in MATLAB using the classifier settings shown in Table 1. Finally, Fig. 1 graphically summarizes the methodology used in this research.
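A condensed MATLAB sketch of this training protocol is given below; F denotes the \(11500 \times 9\) matrix of selected features and yBinary the Experiment 1 labels (both hypothetical variable names), and the hyper-parameters are illustrative assumptions, the actual settings being those of Table 1:

```matlab
% Hedged sketch of the 10-iteration, 80/20 hold-out protocol (binary case).
% Requires the Statistics and Machine Learning Toolbox; hyper-parameters are
% illustrative, the settings actually used are those reported in Table 1.
nIter  = 10;
errSVM = zeros(nIter, 1);
for it = 1:nIter
    cv  = cvpartition(yBinary, 'HoldOut', 0.2);            % 80% train / 20% test
    Xtr = F(training(cv), :);  ytr = yBinary(training(cv));
    Xte = F(test(cv), :);      yte = yBinary(test(cv));

    lda = fitcdiscr(Xtr, ytr, 'DiscrimType', 'linear');    % LDA
    qda = fitcdiscr(Xtr, ytr, 'DiscrimType', 'quadratic'); % QDA
    knn = fitcknn(Xtr, ytr, 'NumNeighbors', 5);            % k-NN
    svm = fitcsvm(Xtr, ytr, 'KernelFunction', 'rbf');      % SVM (binary only;
                                                            % use fitcecoc for 3 classes)
    errSVM(it) = mean(predict(svm, Xte) ~= yte);            % test error rate
    % errors for lda, qda and knn are computed analogously
end
```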

Table 1. Classifiers settings

Fig. 1. Block diagram of the proposed methodology.

3 Results and Discussion

This work uses the Dunn test (Kruskal-Wallis with Bonferroni correction) to perform comparisons among the classifiers. In order to evaluate the performance of the classifiers as thoroughly as possible, the final matrix of selected features is used in five experiments. All experiments are performed with respect to class 1 (seizure activity), which is the most important class for this research. Additionally, to evaluate the capacity of the methodology, the five classes of the original dataset were restructured into binary and three-class problems rather than tackling the full five-class problem, which is more complex and challenging. The class combinations are described as follows:
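As an example of how the class restructuring and the statistical comparison could be implemented in MATLAB, the sketch below relabels the data for Experiments 1 and 5 and then applies a Kruskal-Wallis test followed by Bonferroni-corrected pairwise comparisons via multcompare as a stand-in for the Dunn test; the error-rate vectors are hypothetical variable names:

```matlab
% Class restructuring (y holds the original labels 1-5).
yExp1 = double(y == 1);                 % Experiment 1: seizure vs. all other classes
yExp5 = zeros(size(y));                 % Experiment 5: three-class grouping
yExp5(y == 1 | y == 2) = 1;             %   seizure + tumor area
yExp5(y == 3)          = 2;             %   healthy brain area
yExp5(y == 4 | y == 5) = 3;             %   eyes closed / eyes open

% Dunn-style comparison of per-iteration error rates of the four classifiers
% (errLDA, errQDA, errKNN, errSVM are hypothetical nIter x 1 vectors).
errAll        = [errLDA errQDA errKNN errSVM];
[p, ~, stats] = kruskalwallis(errAll, {'LDA', 'QDA', 'KNN', 'SVM'}, 'off');
c             = multcompare(stats, 'CType', 'bonferroni');   % Bonferroni-corrected pairs
```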

3.1 Experiment 1

In this experiment, classes 2, 3, 4 and 5 are merged into a single class, with class 1 as the target for classification. As a result, there are significant differences (chi-squared = 30.8512, df = 3) between LDA and QDA on the one hand and KNN and SVM on the other. SVM achieves the lowest error rate, followed closely by KNN. Figure 2 shows the result.

Fig. 2. Experiment 1 - Comparison of classifiers. Significant difference p-value \(*\le 0.05\), \(**\le 0.001\), \(***\le 0.0001\).

3.2 Experiment 2

For this experiment, we tried to classify the region where the seizure occurs; therefore, only the seizure (class 1) and tumor localization (class 2) classes are kept, and the rest of the classes are removed from the dataset. We found that LDA differs significantly from SVM, while QDA differs from both SVM and KNN (chi-squared = 34.929, df = 3). SVM again presents the lowest error rate, followed closely by KNN. The worst classifier is QDA. Figure 3 shows the result.

Fig. 3. Experiment 2 - Comparison of classifiers. Significant difference p-value \(*\le 0.05\), \(**\le 0.001\), \(***\le 0.0001\).

3.3 Experiment 3

Here, we intended to distinguish the seizure activity from the area where the tumor is located, so classes 1 and 2 are kept as individual classes and the remaining classes are merged into a single one. This experiment yields results similar to the previous one: QDA is the worst classifier, with significant differences from both SVM and KNN, while LDA differs only from SVM. SVM again achieves the lowest error rate. Figure 4 shows the result.

Fig. 4. Experiment 3 - Comparison of classifiers. Significant difference p-value \(*\le 0.05\), \(**\le 0.001\), \(***\le 0.0001\).

Fig. 5. Experiment 4 - Comparison of classifiers. Significant difference p-value \(*\le 0.05\), \(**\le 0.001\), \(***\le 0.0001\).

Fig. 6. Experiment 5 - Comparison of classifiers. Significant difference p-value \(*\le 0.05\), \(**\le 0.001\), \(***\le 0.0001\).

Fig. 7. Comparison of classifiers throughout their accuracies.

3.4 Experiment 4

In this experiment, classes 1, 2 and 3 are kept as individual classes, with the rest of the classes removed from the dataset; class 3 thus allows the healthy brain area to be distinguished. The results show that SVM achieves the lowest error rate and differs significantly from the other classifiers (chi-squared = 31.7648, df = 3), while QDA obtains the highest error rate. Figure 5 shows the result.

3.5 Experiment 5

Continuing with the three-class setting, classes 1 and 2 were merged into a single class, class 3 was kept as an individual class, and classes 4 and 5 were merged into another single class. The results again present SVM as the best classifier. There are significant differences between LDA and SVM, and between QDA and both SVM and KNN. Again, the worst classifier is QDA. Figure 6 shows the result.

Finally, Fig. 7 presents the comparison among the classifiers across the five experiments, confirming SVM as the best classifier.

4 Conclusions and Future Work

The comparison of the explored techniques suggests that features from the DWT decomposition combined with a Support Vector Machine are the best alternative for building a computer-aided EEG analysis system for epilepsy. Combining several classes made it possible to study classifier performance under different settings, showing that for binary classification both KNN and SVM are good alternatives, while for the three-class problem QDA shows the worst performance and SVM the best. Indeed, the seizure data form classes that are hard to separate; therefore, a data-driven method such as SVM with a kernel solution was necessary. Other alternatives should be considered in future studies, among them different feature extraction methods and deep learning approaches. Moreover, given the large number of available methods, the study should be scaled with additional performance measures to allow comparison with other studies.