1 Introduction

Epilepsy is a chronic neurological disorder characterized by recurrent seizures, and it affects about 1% of the world’s population [1]. It impairs consciousness and may cause random, recurrent body convulsions. Developing countries account for about 85% of the 60 million people affected by epilepsy worldwide. An electroencephalogram (EEG) recording characterizes the electrical activity of the brain by measuring the electrical potential of neurons. The EEG is regarded as a prominent tool in the detection of epileptic seizures, and epilepsy identification includes the recognition of spikes [2]. Further, accurate diagnosis of epilepsy helps in deciding the course of antiepileptic medication. An automated seizure detection system generally involves three steps, namely preprocessing, signal transformation with feature extraction, and classification [3]. Joint time-frequency techniques produce a two-dimensional (2D) time-frequency representation of the filtered EEG [4], for example using the Hilbert–Huang Transform (HHT) and its constituent Empirical-Mode Decomposition (EMD) [5].

The proposed method employs two signal processing techniques to separate normal and epileptic EEG signals. In the first step, the EMD method is applied to the EEG signals to extract intrinsic-mode functions (IMFs), followed by the second-order difference plot (SODP), which measures the rate of variability of individual IMFs [6]. In the second step, the 95% confidence ellipse areas of the SODPs of two IMFs are computed and used as a feature set for classifying the EEG signals with a Cosine Similarity Measure Support Vector Machine (CSM-SVM) classifier [7].

The remainder of the paper is organized as follows: Sect. 2 explains the data set, EMD, SODP and ellipse area formulation, selection of features based on CSM, CSM-SVM-based classification, and estimation of statistical parameters. The results and discussion are presented in Sect. 3. Conclusions are drawn in Sect. 4.

2 Methodology

2.1 Data Set

The publicly available online EEG data set described in [8] is used to validate the results of this work. The data set comprises five subsets of normal and seizure EEG signals, denoted A, B, C, D and E, each containing 100 single-channel EEG signals. The duration of each signal is 23.6 s with a sampling rate of 173.61 Hz. Subsets A and B are surface EEG recordings obtained with the international 10–20 standard electrode placement scheme; set A is from healthy subjects with eyes open and set B with eyes closed. Subsets C and D are recorded during seizure-free intervals from five patients, in the hippocampal formation of the opposite hemisphere of the brain (subset C) and in the epileptogenic zone (subset D). Subset E contains the seizure recordings. Typical normal and seizure EEG signals are shown in Fig. 1.

Fig. 1 EEG signals: (a) normal and (b) seizure

2.2 Empirical-Mode Decomposition

EMD is a data-driven technique that splits a signal into a finite set of amplitude- and frequency-modulated (AM-FM) oscillating components, called IMFs. An IMF is a function that satisfies the following conditions: (1) the number of extrema and the number of zero crossings are either equal or differ by at most one; (2) the mean value of the envelopes defined by the local maxima and the local minima is zero. This signal-dependent decomposition is carried out adaptively for a predefined number of levels. Moreover, the decomposition of long-duration signals does not require the signal to be stationary or linear. The EMD algorithm is used to obtain a number of band-limited functions from a signal x(t), as outlined below [5]:

  1. The original signal is set as P1(t) = x(t).

  2. The local maxima and minima of P1(t) are located.

  3. The maxima and the minima are each joined by cubic spline interpolation to form the upper and lower envelopes, envupper(t) and envlower(t), respectively.

  4. The local mean is computed as m(t) = (envupper(t) + envlower(t))/2.

  5. The local mean is subtracted from the signal, P1(t) = P1(t) − m(t), since an IMF should have zero local mean.

  6. P1(t) is tested against the conditions defined above to check whether it is an IMF.

  7. Steps (2)–(6) are repeated until P1(t) is an IMF, at which point the sifting for this IMF stops.
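
One pass through steps (2)–(5) can be sketched as follows. This is a minimal illustration and not the MATLAB implementation used in this work: to stay dependency-free it joins the extrema with piecewise-linear envelopes instead of the cubic splines named in step (3), and the function names `local_extrema`, `envelope` and `sift_once` are hypothetical.

```python
def local_extrema(p):
    """Step (2): indices of the strict local maxima and minima of p."""
    maxima = [i for i in range(1, len(p) - 1) if p[i - 1] < p[i] > p[i + 1]]
    minima = [i for i in range(1, len(p) - 1) if p[i - 1] > p[i] < p[i + 1]]
    return maxima, minima


def envelope(p, idx):
    """Step (3), simplified: piecewise-linear envelope through (i, p[i]) for i in idx.

    Assumption: linear interpolation replaces the cubic spline, and the
    envelope is held constant outside the first and last extremum.
    """
    env = []
    for n in range(len(p)):
        if n <= idx[0]:
            env.append(float(p[idx[0]]))
        elif n >= idx[-1]:
            env.append(float(p[idx[-1]]))
        else:
            # Bracketing extrema: idx[k] <= n < idx[k + 1]
            k = max(j for j in range(len(idx)) if idx[j] <= n)
            i0, i1 = idx[k], idx[k + 1]
            w = (n - i0) / (i1 - i0)
            env.append((1.0 - w) * p[i0] + w * p[i1])
    return env


def sift_once(p):
    """Steps (2)-(5): subtract the local mean of the two envelopes from p.

    Returns None when p has no maximum or no minimum, i.e. nothing is left to sift.
    """
    maxima, minima = local_extrema(p)
    if not maxima or not minima:
        return None
    upper = envelope(p, maxima)
    lower = envelope(p, minima)
    mean = [(u + l) / 2.0 for u, l in zip(upper, lower)]  # step (4)
    return [v - m for v, m in zip(p, mean)]               # step (5)


# A fast oscillation riding on a constant offset of 5: one pass removes the offset.
candidate = sift_once([4, 6, 4, 6, 4, 6, 4])
```

In a full decomposition this pass would be repeated, with the IMF conditions checked after each pass as in steps (6) and (7).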

The sifting is continued until the last IMF is generated. The IMFs and the residues are related by:

$$ x(t) - P_{1} (t) = r_{1} (t),\;r_{1} (t) - P_{2} (t) = r_{2} (t), \ldots ,r_{M - 1} (t) - P_{M} (t) = r_{M} (t) $$
(1)

where rM(t) is the final residue. The decomposition therefore represents the signal x(t) as:

$$ x(t) = \sum\limits_{p = 1}^{M} {P_{p} (t)} + r_{M} (t) $$
(2)

where M is the number of IMFs and rM(t) is the final residue.

The EMD algorithm is implemented in MATLAB. The decomposition of a 23.6 s EEG signal (4097 samples) is shown in Fig. 2.

Fig. 2 Intrinsic-mode functions of normal EEG signal

2.3 Second-Order Difference Plot and Calculation of Ellipse Area

A nonlinear signal can be viewed from a different perspective, and continuous chaotic modelling is an effective tool for understanding the nonlinearity in long data series. The SODP of the IMFs of EEG signals provides important parameters for classifying normal and seizure EEG signals.

For the signal x(n), the SODP is obtained by plotting X(n) against Y(n), which are defined as [6]:

$$ X(n) = x(n + 1) - x(n) $$
(3)
$$ Y(n) = x(n + 2) - x(n + 1) $$
(4)
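
As a minimal sketch of Eqs. (3) and (4), the SODP coordinates of a sampled sequence can be computed as below. The function name `sodp` and the sample values are illustrative assumptions, not part of the original method description.

```python
def sodp(x):
    """Return the SODP coordinates X(n), Y(n) of the sequence x.

    X(n) = x(n+1) - x(n) and Y(n) = x(n+2) - x(n+1), for n = 0 .. len(x) - 3.
    """
    n_points = len(x) - 2
    X = [x[n + 1] - x[n] for n in range(n_points)]
    Y = [x[n + 2] - x[n + 1] for n in range(n_points)]
    return X, Y


# Hypothetical five-sample sequence; a real input would be one IMF of an EEG signal.
X, Y = sodp([0.0, 1.0, 3.0, 2.0, 2.5])
# X is [1.0, 2.0, -1.0] and Y is [2.0, -1.0, 0.5]
```

Plotting each pair (X(n), Y(n)) as a point gives the SODP. Note that Y(n) equals X(n + 1), so the plot shows each successive difference against the one that follows it.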

Chaotic equations are used to generate graphs known as Poincaré plots. The equation below, the logistic map, is an example of a chaotic system:

$$ a_{n} = Aa_{n - 1} (1 - a_{n - 1} ),\quad 2 \le A \le 4 $$
(5)

where A is a constant. The behaviour of the map depends on this value.
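
The dependence on A can be explored with a short iteration of Eq. (5). This is only an illustrative sketch; the function name `logistic_orbit` and the starting values are assumptions chosen for the example.

```python
def logistic_orbit(A, a0, steps):
    """Iterate a_n = A * a_{n-1} * (1 - a_{n-1}) and return [a_0, ..., a_steps]."""
    orbit = [a0]
    for _ in range(steps):
        prev = orbit[-1]
        orbit.append(A * prev * (1.0 - prev))
    return orbit


# For A = 2 the value 0.5 is a fixed point, so the orbit never moves.
steady = logistic_orbit(2.0, 0.5, 3)

# For A = 4 the orbit wanders irregularly but stays inside [0, 1].
irregular = logistic_orbit(4.0, 0.3, 50)
```

Plotting successive pairs (a_n, a_{n+1}) of such an orbit gives a plot of the kind described above.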

Recently, the variability measured from the SODP has been used for the analysis of EEG signals [6]. The 95% confidence ellipse area measured from the SODP of the IMFs of EEG signals provides a set of parameters for separating normal and seizure EEG signals. Figure 2 shows the IMFs of a normal signal. The elliptical shape of their SODPs motivates computing the ellipse area of the SODP of IMFs to separate the EEG signals. The 95% confidence ellipse area is calculated from the SODP as follows [6]. The root-mean-square values of X(n) and Y(n) and their mean cross product are computed as:

$$ S_{X} = \sqrt {\frac{1}{N}\sum\limits_{n = 0}^{N - 1} {X(n)^{2} } } \quad S_{Y} = \sqrt {\frac{1}{N}\sum\limits_{n = 0}^{N - 1} {Y(n)^{2} } } \quad S_{XY} = \frac{1}{N}\sum\limits_{n = 0}^{N - 1} {X(n)Y(n)} $$
(6)

The D value is computed as:

$$ D = \sqrt {\left( {S_{X}^{2} + S_{Y}^{2} } \right)^{2} - 4\left( {S_{X}^{2} \,S_{Y}^{2} - S_{XY}^{2} } \right)} $$
(7)

The ellipse area is formulated as:

$$ A_{\text{ellipse}} = \pi ab;\quad a = \sqrt {3 \times \left( {S_{X}^{2} + S_{Y}^{2} + D} \right)} ;\quad b = \sqrt {3 \times \left( {S_{X}^{2} + S_{Y}^{2} - D} \right)} $$
(8)
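
The ellipse area computation of Eqs. (6)–(8) can be sketched as below. Two points are assumptions of this sketch, made so that every term is dimensionally consistent: the cross term is taken as S_XY = (1/N) Σ X(n)Y(n), and the first bracket under the root in D is squared. The function name `ellipse_area` and the test points are hypothetical.

```python
import math


def ellipse_area(X, Y):
    """95% confidence ellipse area of the SODP points (X(n), Y(n))."""
    N = len(X)
    sx2 = sum(v * v for v in X) / N               # S_X squared
    sy2 = sum(v * v for v in Y) / N               # S_Y squared
    sxy = sum(p * q for p, q in zip(X, Y)) / N    # S_XY (assumed mean cross product)

    # Assumed form of D, with (S_X^2 + S_Y^2) squared; clamped at zero
    # to guard against small negative values from rounding.
    d_squared = (sx2 + sy2) ** 2 - 4.0 * (sx2 * sy2 - sxy ** 2)
    D = math.sqrt(max(d_squared, 0.0))

    a = math.sqrt(3.0 * (sx2 + sy2 + D))
    b = math.sqrt(3.0 * max(sx2 + sy2 - D, 0.0))
    return math.pi * a * b


# Uncorrelated points with unit spread on both axes: D = 0 and a = b = sqrt(6).
area_round = ellipse_area([1.0, -1.0, 1.0, -1.0], [1.0, 1.0, -1.0, -1.0])

# Perfectly correlated points lie on a line, so the ellipse collapses.
area_line = ellipse_area([1.0, -1.0], [1.0, -1.0])
```

Scaling both X and Y by a factor of 2 multiplies the area by 4, which is a quick sanity check on any implementation.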

2.4 Cosine Similarity Measure Support Vector Machine (CSM-SVM) Classifier

The SVM is a widely used machine learning algorithm for regression and classification that can produce accurate classification results. SVMs are an important example of “kernel methods”, one of the key areas in machine learning, and the Radial Basis Function (RBF) kernel has been employed in this work. The RBF kernel is given by:

$$ K\left( {x_{i} ,x_{j} } \right) = \exp \left( { - \left( {\left\| {x_{i} - x_{j} } \right\|^{2} /2\,\rho^{2} } \right)} \right) $$
(9)

where \( \rho > 1 \) is the parameter controlling the width of the kernel. The Cosine Similarity Index (CSI) between two data vectors is a standard criterion for measuring how close two data samples are; it is the cosine of the angle between them. The CSI is obtained by solving the dot-product identity for \( \cos \theta \):

$$ \vec{a} \cdot \vec{b} = \left\| {\vec{a}} \right\|\left\| {\vec{b}} \right\|\,\cos \theta ;\quad {\text{CSI}} = \cos \theta = \frac{{\vec{a} \cdot \vec{b}}}{{\left\| {\vec{a}} \right\|\left\| {\vec{b}} \right\|}} $$
(10)

where CSI represents the whitened cosine similarity value between vectors a and b [9].
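
Equations (9) and (10) translate directly into code. The sketch below is illustrative only: the function names `rbf_kernel` and `cosine_similarity` are hypothetical, the input vectors are made up, and the plain cosine similarity is computed without any whitening step.

```python
import math


def rbf_kernel(xi, xj, rho):
    """RBF kernel of Eq. (9): exp(-||xi - xj||^2 / (2 * rho^2))."""
    sq_dist = sum((p - q) ** 2 for p, q in zip(xi, xj))
    return math.exp(-sq_dist / (2.0 * rho ** 2))


def cosine_similarity(a, b):
    """CSI of Eq. (10): dot(a, b) / (||a|| * ||b||), for non-zero vectors."""
    dot = sum(p * q for p, q in zip(a, b))
    norm_a = math.sqrt(sum(p * p for p in a))
    norm_b = math.sqrt(sum(q * q for q in b))
    return dot / (norm_a * norm_b)


k_same = rbf_kernel([1.0, 2.0], [1.0, 2.0], rho=2.0)   # identical points give 1.0
k_far = rbf_kernel([0.0, 0.0], [3.0, 4.0], rho=5.0)    # exp(-25 / 50)
csi_orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
csi_parallel = cosine_similarity([1.0, 2.0], [2.0, 4.0])
```

The CSI depends only on the direction of the two vectors, so rescaling either vector by a positive factor leaves it unchanged, whereas the RBF kernel depends on the distance between them.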

2.5 Performance Evaluation

The assessment of the SVM-based classifier for classification of seizure and non-seizure EEG signals is carried out by calculating the sensitivity, specificity and accuracy. The sensitivity (SE), specificity (SP) and accuracy (AC) can be defined as:

$$ {\text{SE}} = \frac{\text{TP}}{{{\text{TP}} + {\text{FN}}}} \times 100\% ;\quad {\text{SP}} = \frac{\text{TN}}{{{\text{TN}} + {\text{FP}}}} \times 100\% ; $$
(11)
$$ {\text{AC}} = \frac{{{\text{TP}} + {\text{TN}}}}{{{\text{TP}} + {\text{TN}} + {\text{FP}} + {\text{FN}}}} \times 100\% $$
(12)

where TP and TN denote the total numbers of correctly detected positive and negative patterns, respectively, FP denotes the number of patterns erroneously detected as positive, and FN the number erroneously detected as negative. Positive and negative patterns correspond to detected seizure and detected normal EEG signals, respectively.
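
A minimal sketch of the three measures is given below. It assumes the standard definition of accuracy, (TP + TN) divided by the total number of patterns; the function name `performance` and the counts in the example are hypothetical and are not results of this work.

```python
def performance(tp, tn, fp, fn):
    """Return sensitivity, specificity and accuracy in percent."""
    se = 100.0 * tp / (tp + fn)                   # detected seizures among all seizures
    sp = 100.0 * tn / (tn + fp)                   # detected normals among all normals
    ac = 100.0 * (tp + tn) / (tp + tn + fp + fn)  # correct decisions among all patterns
    return se, sp, ac


# Hypothetical confusion counts: 50 seizure and 50 normal test signals.
se, sp, ac = performance(tp=45, tn=40, fp=10, fn=5)
# se is 90.0, sp is 80.0 and ac is 85.0
```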

3 Results and Discussion

In the first step, EEG signals from data sets B and E are decomposed by the EMD method to obtain nine IMFs, as shown in Fig. 2. The first four IMFs are selected to compute the SODP and the 95% confidence ellipse area, because the maximum frequency variation and nonlinearity of the signal are captured by the first few IMFs. The classification of normal and seizure EEG signals is performed using the ellipse area parameters of the SODP for pairs of the first four IMFs. Figure 3 shows the SODPs of the IMFs obtained by the EMD process. To compute the 95% confidence ellipse area of the SODP for combinations of the first four IMFs, six pairs have been selected: IMF12, IMF13, IMF14, IMF23, IMF24 and IMF34.
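
The six pair labels are the two-element combinations of the first four IMF indices. As a small illustration (the variable name `pairs` is an assumption of this sketch), they can be enumerated as:

```python
from itertools import combinations

# All unordered pairs (i, j) with 1 <= i < j <= 4, labelled as in the text.
pairs = ["IMF%d%d" % (i, j) for i, j in combinations(range(1, 5), 2)]
# ['IMF12', 'IMF13', 'IMF14', 'IMF23', 'IMF24', 'IMF34']
```

Four IMFs give 4 × 3 / 2 = 6 pairs, which matches the six feature pairs used here.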

Fig. 3 Second-order difference plot of IMFs of normal EEG signal

The 95% confidence ellipse areas of these six pairs are shown in Fig. 4. Generally, the first pair of IMFs captures the frequency variation with the largest values. The 95% confidence area is computed from each pair of IMFs as a statistical feature because it clearly represents the nature of the underlying EEG signal. The maximum and minimum scaled values of the 95% confidence area are shown in Table 1 for normal and seizure data samples. The values for the IMF12 pair are 19,642.13 and 123.92, respectively. The first IMF and its associated SODP show that the feature set can distinguish between normal and seizure signals. The elliptical shape of the SODP of IMFs encourages computation of the 95% confidence ellipse area of the SODP of IMFs of EEG signals. Similar observations hold for ictal EEG signals as for the seizure-free case. The 95% confidence ellipse area parameters have been computed for the ictal and non-ictal classes using the SODP of IMFs for several signal lengths (500, 1000, 1500 and 4097 samples). The class-discriminating performance of the ellipse area feature is verified using box plots.

Fig. 4 Ellipse area of IMF pairs for initial four IMFs of normal signal

Table 1 95% confidence area for six IMF pairs for normal and seizure EEG data (80 samples of normal EEG)

The ellipse area parameters of the SODP of IMFs differ considerably between the two classes of EEG signals (normal and seizure). The MATLAB function “tcdf” is used to compute the p-value of the classified data for the defined degrees of freedom. A value of p ≤ 0.01 is obtained, which corresponds to the 1% significance level, i.e. a chance of less than 1 in 100 of being wrong (refer to Fig. 4).

The comparison of classifier performance reveals that the division of the EEG time series into four sub-bands results in varied performance parameters. The performance for 1500 and 4097 data samples is given in Table 2. The ellipse area features obtained from the shape of the SODP of IMFs are most suitable for a data segment of 1500 samples. Figure 5 shows the box plots of the ellipse area of the SODP of IMFs for normal and seizure EEG signals. The experimental results show noteworthy discrimination between normal and ictal signals.

Table 2 The performance of LS-SVM classifier for 1500 and 4097 data samples

Fig. 5 Box plots of ellipse area of SODP of IMFs for normal and seizure EEG signals

The CSM-SVM is an optimized form of SVM that uses a wrapper method to select features suitable for classification, and it is implemented in MATLAB. The toolbox is built around a fast LS-SVM training and simulation algorithm [10]. The LS-SVM functions have been used for classification as well as for plotting the performance.

4 Conclusion

Epilepsy is a chronic disorder, and EEG-based automated seizure detection is evolving as an alternative diagnostic option whose main requirement is effective signal processing algorithms. This paper presents the performance of a classifier that uses the ellipse area of the SODP of IMFs as a feature set. To classify ictal and non-ictal EEG signals, the IMFs are obtained using the EMD algorithm and the ellipse areas of their SODPs form the feature space. The ellipse area parameters of the first four IMFs are used to generate six pairs. The ellipse areas of the first and second IMFs provided the best classification accuracy. The seizure EEG signals have a significantly different ellipse area from the normal EEG signals. The increased variation in seizure signals produces the desired variation in the ellipse area of the SODP of IMF pairs. The CSM-SVM provided the highest classification accuracy of 96.4% for the first pair of IMFs. The application of this method in an automated seizure detection system will be tested on out-of-sample data sets. The system may also assist caretakers of epileptic patients by detecting the occurrence of seizures.