1 Introduction

Epilepsy is a central nervous system disorder caused by abnormal changes in neural activity inside the brain. According to the World Health Organization (WHO), epilepsy affects 45–55 million people worldwide [1]. Electroencephalography (EEG) captures the brain’s electrical activity and is a standard clinical tool for epileptic seizure detection [2, 3]. Traditional methods for seizure detection usually require EEG recordings lasting several hours. Detecting seizures in EEG signals by visual examination not only requires high expertise, but is also time-consuming and prone to error. With the development of technology, building an automated seizure detection framework is therefore of great interest nowadays. Epileptic seizure detection in EEG signals can be treated as a classification problem in which the task is to classify an EEG signal as either a seizure or a non-seizure signal. Feature extraction and classification are the two key steps in this process, and an efficient combination of the two enables an automated system to run faster. A number of methods have been proposed in the literature for epileptic seizure detection in EEG signals.

In recent years, feature extraction techniques based on time series signal analysis with linear prediction error energy [4], correlation [5], fractional linear prediction [6], frequency domain signal analysis with fast Fourier transform [7], time–frequency domain signal analysis techniques based on short-time Fourier transform [8], and wavelets [9,10,11,12,13,14,15,16,17] have been proposed for seizure detection.

Feature extraction based on different entropy schemes has shown effectiveness for seizure detection in EEG signals [18,19,20]. Dimensionality reduction techniques like PCA, ICA, and LDA have been successfully applied for epileptic EEG signal classification with high accuracy [21]. The combination of PCA and wavelets has been reported for automated classification of epileptic activity with higher accuracy [22]. Techniques like recurrence quantification analysis [23, 24], higher-order spectra features [25], the continuous wavelet transform [26], higher-order cumulant features [27], empirical mode decomposition [28,29,30,31], and the Hilbert–Huang transform [32, 33] are well known in the field of epileptic seizure detection.

As mentioned earlier, the efficiency of a feature extraction technique depends not only on how informative the extracted features are, but also on the computational complexity of the extraction process; the technique should therefore be computationally efficient. The local binary pattern (LBP) has gained popularity in the field of face recognition [34]. LBP focuses on preserving the structural properties of a pattern. Because of its effectiveness in various pattern recognition applications, a one-dimensional LBP (1D-LBP) scheme was proposed for signal processing [35]. Recently, the 1D-LBP technique has been used for epileptic EEG signal classification [36, 37]. 1D-LBP focuses on the local pattern of a signal to extract quantitative features for classification. However, 1D-LBP is sensitive to local variation, i.e., any structural change in the local pattern of a signal.

In this study, we propose two effective feature extraction techniques, namely the local centroid pattern (LCP) and the one-dimensional local ternary pattern (1D-LTP), for the classification of epileptic EEG signals. Both techniques are computationally simple, easy to implement, and insensitive to local and global variations. This insensitivity overcomes the main limitation of 1D-LBP. Both techniques (LCP and 1D-LTP) work in two phases. In the first phase, the local patterns are transformed into codes whose histogram is formed; the histogram encodes the structural description of the signal and serves as the feature vector of the corresponding EEG signal. In the second phase, the histogram-based feature vectors are classified. The classification has been carried out with four different machine learning classifiers, and the performance is evaluated with 10-fold cross validation in terms of sensitivity, specificity, and accuracy.

The remainder of this paper is organized as follows: the methodology and materials used are discussed in Sect. 2. Experimental results are presented in Sect. 3 and discussed in Sect. 4. Finally, Sect. 5 concludes the article with future directions.

2 Methodology and Materials

This section briefly discusses the LBP, 1D-LBP, LCP, and 1D-LTP feature extraction techniques, the dataset, and the classifiers used.

2.1 Local Binary Pattern (LBP)

LBP is a well known technique used for face recognition and two-dimensional (2D) image processing [35]. For each pixel in an image, it produces a binary code by comparing the pixel with its surrounding (neighbor) pixels in a 3 × 3 neighborhood. Each binary code captures the structural distribution of a small section of the image and is converted into its decimal equivalent (the LBP code) to uniquely represent the pattern structure. Once all the LBP codes have been computed, a histogram is formed from these codes. The histogram summarizes the structural distribution of patterns across the image: the horizontal axis lists the LBP codes in increasing order and the vertical axis gives the frequency (number of occurrences) of each code. Usually, 8 surrounding pixels are considered for each pixel while computing the LBP code. The LBP code of a pixel S_c from its m surrounding pixels P_i (i = 0…m − 1) is computed as:

$$ S_{c}^{LBP} = \sum\limits_{i = 0}^{m - 1} s(P_{i} - S_{c})\,2^{i} $$
(1)

where

$$ s(x) = \begin{cases} 1, & \text{if } x \ge 0 \\ 0, & \text{otherwise} \end{cases} $$

One example of the LBP code is shown in Fig. 1.

Fig. 1
figure 1

The LBP code for image processing
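
To make Eq. (1) concrete, a minimal Python sketch is given below. It is a hedged illustration rather than the authors' implementation; in particular, the clockwise neighbor ordering and the example patch are assumptions of this sketch.

```python
import numpy as np

def lbp_code(patch):
    """LBP code of the centre pixel of a 3 x 3 patch, following Eq. (1)."""
    center = patch[1, 1]
    # m = 8 neighbours, taken clockwise starting from the top-left pixel (assumed convention)
    neighbors = [patch[0, 0], patch[0, 1], patch[0, 2], patch[1, 2],
                 patch[2, 2], patch[2, 1], patch[2, 0], patch[1, 0]]
    # s(P_i - S_c) is 1 when P_i >= S_c; the i-th comparison is weighted by 2^i
    return sum(int(p >= center) << i for i, p in enumerate(neighbors))

patch = np.array([[6, 5, 2],
                  [7, 6, 1],
                  [9, 8, 7]])
print(lbp_code(patch))  # -> 241 for this example patch
```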

2.2 One-Dimensional Local Binary Pattern (1D-LBP)

1D-LBP is a variant of LBP that was introduced for signal processing [35]. Its working principle is similar to that of LBP, but it operates on one-dimensional signals. In 1D-LBP, the transformation code for a signal point S_c, considering m surrounding points P_i (i = 0…m − 1), is computed as:

$$ S_{c}^{1D\text{-}LBP} = \sum\limits_{i = 0}^{m - 1} s(P_{i} - S_{c})\,2^{i} $$
(2)

where

$$ s(x) = \begin{cases} 1, & \text{if } x \ge 0 \\ 0, & \text{otherwise} \end{cases} $$

In the above technique, different weights (2^i, i = 0…m − 1) are assigned to the comparison outcomes of the different points in order to convert the binary code into a unique 1D-LBP code [35]. The 1D-LBP code of a signal point S_c is shown in Fig. 2.

Fig. 2
figure 2

The 1D-LBP code for signal processing
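
A corresponding one-dimensional sketch of Eq. (2) is shown below; again this is an illustrative assumption rather than the authors' code, and the convention of concatenating the m/2 points before and m/2 points after S_c in that order is assumed.

```python
def one_d_lbp_code(signal, idx, m=8):
    """1D-LBP code of signal[idx] from m/2 backward and m/2 forward neighbours (Eq. 2)."""
    half = m // 2
    center = signal[idx]
    # assumed ordering: backward neighbours first, then forward neighbours
    neighbors = list(signal[idx - half:idx]) + list(signal[idx + 1:idx + half + 1])
    return sum(int(p >= center) << i for i, p in enumerate(neighbors))

x = [4, 7, 3, 8, 5, 6, 9, 2, 7, 5]   # toy signal
print(one_d_lbp_code(x, 4))          # code of the point x[4] = 5 -> 186
```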

2.3 Proposed Techniques Based on Local Transformed Features

As mentioned before, in 1D-LBP each point in the raw signal is compared with its neighboring points, and features are extracted from the signal by focusing on the local pattern. Recently, 1D-LBP has gained popularity in the field of EEG signal classification [36, 37]. While recording an EEG signal, each action or abnormality usually possesses some unique local patterns, and 1D-LBP can detect these patterns. However, 1D-LBP is sensitive to noise, which limits its capability to detect the hidden patterns. In order to overcome this limitation, the LCP and 1D-LTP techniques are proposed. In LCP, the centroid (mean) of the neighboring points is computed and each neighboring point is compared with this centroid; the binary code obtained from these comparisons is converted into the transformation code. The centroid is less sensitive to noise and still represents the local pattern structure. 1D-LTP is a generalization of 1D-LBP: it operates with a user-defined threshold and can detect unique patterns. While in 1D-LBP and LCP each comparison results in a binary value (0 or 1), 1D-LTP produces a ternary value (+1, 0, or −1) for each comparison depending on the threshold limit. Once the comparisons between the center point and its neighboring points are finished, the transformation code is computed from the ternary code. Compared with 1D-LBP, 1D-LTP is more descriptive in nature. In this section, the LCP and 1D-LTP feature extraction techniques are discussed one by one in detail. Figure 3 depicts the flowchart of the proposed methods.

Fig. 3
figure 3

Flowchart of proposed methods

2.3.1 Local Centroid Pattern (LCP)

LCP is a novel feature extraction technique based on the centroid of the surrounding points. Centroid or mean often captures the structure of a pattern. The various steps of the LCP feature extraction technique are as follows:

  1. Set the number of neighboring points m.

  2. For each signal point S_c, select m/2 neighboring points in the forward and backward directions.

  3. Compute the centroid (c) of the neighboring points: \( c = \frac{1}{m}\sum\nolimits_{i = 0}^{m - 1} p_{i} \).

  4. Compute the difference between each neighboring point and the centroid:

     $$ d_{i} = p_{i} - c, \quad \text{for } i = 0 \ldots m - 1. $$

  5. Compute the LCP code:

     $$ S_{c}^{LCP} = \sum\limits_{i = 0}^{m - 1} s(d_{i})\,2^{i} $$
     (3)

     where

     $$ s(x) = \begin{cases} 1, & \text{if } x \ge 0 \\ 0, & \text{otherwise} \end{cases} $$

    The various steps involved in the LCP are shown in Fig. 4; an illustrative code sketch follows the figure.

    Fig. 4
    figure 4

    LCP code for the point S_c
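
The sketch below implements steps 1–5 in Python; it is a hedged illustration of the procedure (the neighbor ordering and the toy signal are assumptions of this sketch), not the reference implementation.

```python
import numpy as np

def lcp_code(signal, idx, m=8):
    """LCP code of signal[idx] following steps 1-5 and Eq. (3)."""
    half = m // 2                                                  # step 1: m neighbours in total
    neighbors = np.concatenate([signal[idx - half:idx],            # step 2: m/2 backward and
                                signal[idx + 1:idx + half + 1]])   #         m/2 forward points
    c = neighbors.mean()                                           # step 3: centroid of neighbours
    d = neighbors - c                                              # step 4: d_i = p_i - c
    return sum(int(x >= 0) << i for i, x in enumerate(d))          # step 5: s(d_i) weighted by 2^i

x = np.array([4.0, 7.0, 3.0, 8.0, 5.0, 6.0, 9.0, 2.0, 7.0, 5.0])   # toy signal
print(lcp_code(x, 4))                                              # -> 186 for this example
```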

2.3.2 One-Dimensional Local Ternary Pattern (1D-LTP)

Like LBP, the LTP feature extraction technique was proposed for face recognition with two-dimensional (2D) face images [34, 38]. Recently, LTP has gained popularity in different pattern recognition applications [39, 40]. Although the 1D-LBP feature extraction technique has been proposed for signal processing [35] and successfully applied to epileptic EEG signal classification [36, 37], an LTP-based technique is yet to be proposed for this task. In this section, a 1D-LTP based feature extraction technique is introduced for epileptic EEG signal classification. The 1D-LTP technique works in a similar way to 1D-LBP, but produces a ternary code. As in 1D-LBP, the difference is computed between the neighboring points and the center point; however, a user-defined threshold (t) is set in order to tolerate small variations. The various steps of the proposed 1D-LTP feature extraction technique are as follows:

  1. Set the number of neighboring points m and a threshold t.

  2. For each signal point S_c, select m/2 neighboring points in the forward and backward directions.

  3. Compute the differences between the center point S_c and the neighboring points P_i:

     $$ d_{i} = P_{i} - S_{c}, \quad \text{for } i = 0 \ldots m - 1. $$

  4. Compute the LTP code:

     $$ S_{c}^{LTP} = \sum\limits_{i = 0}^{m - 1} s(d_{i})\,3^{i} $$
     (4)

     where

     $$ s(x) = \begin{cases} 1, & \text{if } x \ge t \\ 0, & \text{if } -t < x < t \\ -1, & \text{if } x \le -t \end{cases} $$

In order to reduce the range of the LTP code, the technique suggested in [35] is followed: the ternary pattern is partitioned into positive (LTP^pos) and negative (LTP^neg) binary patterns, which are then concatenated as given below.

$$ LTP = \{ LTP^{pos} ,\;LTP^{neg} \} $$

where,

$$ LTP^{pos} = \sum\limits_{i = 0}^{m - 1} s^{pos} (d_{i} )2^{i} $$
(5)
$$ s^{pos}(x) = \begin{cases} 1, & \text{if } x \ge t \\ 0, & \text{otherwise} \end{cases} $$

and

$$ LTP^{neg} = \sum\limits_{i = 0}^{m - 1} s^{neg} (d_{i} )2^{i} $$
(6)
$$ s^{neg}(x) = \begin{cases} 1, & \text{if } x \le -t \\ 0, & \text{otherwise} \end{cases} $$

One LTP pattern along with the positive and negative parts is shown in Fig. 5.

Fig. 5
figure 5

1D-LTP code for the point S_c
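
A minimal Python sketch of Eqs. (5) and (6) is given below, under the same illustrative assumptions as the earlier snippets (neighbor ordering, toy signal); the threshold t is in μV, as in the experiments reported later.

```python
import numpy as np

def one_d_ltp_codes(signal, idx, m=8, t=10):
    """Positive and negative 1D-LTP codes of signal[idx] (Eqs. 5 and 6)."""
    half = m // 2
    center = signal[idx]
    neighbors = np.concatenate([signal[idx - half:idx], signal[idx + 1:idx + half + 1]])
    d = neighbors - center                                        # d_i = P_i - S_c
    ltp_pos = sum(int(x >= t) << i for i, x in enumerate(d))      # s_pos(d_i) weighted by 2^i
    ltp_neg = sum(int(x <= -t) << i for i, x in enumerate(d))     # s_neg(d_i) weighted by 2^i
    return ltp_pos, ltp_neg

x = np.array([30.0, 45.0, 18.0, 25.0, 20.0, 33.0, 8.0, 22.0, 41.0, 27.0])  # toy signal (muV)
print(one_d_ltp_codes(x, 4))   # -> (147, 32) for this example
```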

Once the transformation codes (1D-LBP, LCP, or 1D-LTP) have been computed for all the signal points, the histogram of these codes forms the feature vector of the EEG signal and is then fed to the classifier to carry out the classification. In all three methods, each transformation code lies between 0 and 2^m − 1 (inclusive).
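
The histogram step can be sketched as follows; the compact lcp_code helper is repeated here only to keep the snippet self-contained, and the random placeholder signal is purely illustrative.

```python
import numpy as np

def lcp_code(signal, idx, m=8):
    """LCP code of signal[idx] (see Sect. 2.3.1)."""
    half = m // 2
    nbrs = np.concatenate([signal[idx - half:idx], signal[idx + 1:idx + half + 1]])
    return sum(int(x >= 0) << i for i, x in enumerate(nbrs - nbrs.mean()))

def histogram_feature(codes, n_bins):
    """Histogram of transformation codes (values 0 .. n_bins - 1), used as the feature vector."""
    return np.bincount(np.asarray(codes), minlength=n_bins)

m = 8
x = np.random.randn(4097)                                   # placeholder signal, Bonn segment length
codes = [lcp_code(x, i, m) for i in range(m // 2, len(x) - m // 2)]
feature_vector = histogram_feature(codes, n_bins=2 ** m)    # 256-bin LCP histogram
```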

2.4 Time Complexity of LCP and 1D-LTP

Let X_{N×d} represent a set containing N signals, each with d points. In both techniques, if m (m < d) neighboring points are considered for the computation of the transformation codes, then the time complexity (Tc) of both LCP and 1D-LTP for processing one signal is O(md). Since the set X contains N signals, the time complexity of both techniques for processing X is O(N·m·d).

$$ Tc(LCP) = O(N \cdot m \cdot d) $$
$$ Tc(1D\text{-}LTP) = O(N \cdot m \cdot d). $$

For the same set X_{N×d}, the time complexities of some well known techniques, such as PCA, LDA, the discrete Fourier transform (DFT), the fast Fourier transform (FFT), and the wavelet transform (WT), are as follows.

$$ Tc(PCA) = O(N \cdot d^{2} + d^{3}) $$
$$ Tc(LDA) = O(N \cdot d \cdot t + t^{3}) $$

where t = min(N, d)

$$ Tc(DFT) = O(N \cdot d^{2}) $$
$$ Tc(FFT) = O(N \cdot d \log d) $$
$$ Tc(WT) = O(N \cdot d). $$

It can be observed that the time complexity of the proposed techniques is lower than that of some existing techniques (PCA, LDA, DFT).

2.5 1D-LBP, LCP, and 1D-LTP in Case of Noise

Tolerance to noise is one of the important properties of a feature extraction technique. Noise may cause a local or a global variation, and these variations can affect the pattern structure of a signal. Since all these cases still belong to the same pattern, the transformation technique should produce the same transformation code for all of them. 1D-LBP generates the transformation code by comparing the center point directly with its surrounding points, so even a small variation in the pattern structure can change the transformation code; as a result, the codes for the original and the noisy pattern differ, which makes 1D-LBP sensitive to noise. In the LCP technique, on the other hand, the transformation code is generated by computing the mean of the surrounding points and then comparing each surrounding point with this mean. If a subpart of a pattern sequence is affected by noise, the mean changes only slightly relative to the surrounding points, so LCP tends to produce the same transformation code for both the original and the noisy pattern; this makes LCP insensitive to such noise. Similarly, the code computation in 1D-LTP depends not only on the center point and the surrounding points but also on the user-defined threshold limit; this threshold is set so that variations caused by noise are absorbed and the same transformation code is produced. The behavior of the 1D-LBP, LCP, and 1D-LTP (with threshold t = 10 μV) techniques under local and global variations of different patterns is shown in Fig. 6. It should be noted that Fig. 6 is only an example.

Fig. 6
figure 6

a 1D-LBP, b LCP, and c 1D-LTP in case of local and global variations

For noise-free signals, patterns with similar structures should be represented by the same transformation code. It can be seen in Fig. 6 that LCP and 1D-LTP assign the same transformation code under both local and global variations, whereas 1D-LBP is sensitive to local variation. This insensitivity of LCP and 1D-LTP overcomes the limitation of 1D-LBP.
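
As a toy illustration of this behavior (with assumed values, m = 4 neighbors, and t = 10 μV, not the patterns of Fig. 6), the snippet below perturbs a local pattern by a few μV: the 1D-LBP code changes, while the LCP and 1D-LTP codes do not.

```python
import numpy as np

def lbp(center, nbrs):
    return sum(int(p >= center) << i for i, p in enumerate(nbrs))

def lcp(nbrs):
    c = np.mean(nbrs)
    return sum(int(p >= c) << i for i, p in enumerate(nbrs))

def ltp(center, nbrs, t=10):
    d = np.asarray(nbrs, dtype=float) - center
    return (sum(int(x >= t) << i for i, x in enumerate(d)),
            sum(int(x <= -t) << i for i, x in enumerate(d)))

clean = [40, 21, 20, 45]   # assumed neighbours (muV) around a centre point of 20 muV
noisy = [38, 18, 18, 46]   # the same local pattern after a few muV of noise

print(lbp(20, clean), lbp(20, noisy))   # 15 vs 9      -> 1D-LBP code changes
print(lcp(clean), lcp(noisy))           # 9 vs 9       -> LCP code unchanged
print(ltp(20, clean), ltp(20, noisy))   # (9, 0) both  -> 1D-LTP codes unchanged
```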

2.6 Classification

The nearest neighbor (NN), decision tree (DT), support vector machine (SVM), and artificial neural network (ANN) are some of the well known classifiers in machine learning and data mining [41]. In this study, all four classifiers have been used and their classification results are reported.

2.7 Cross Validation

10-fold cross validation has been used to evaluate the performance of the proposed techniques. In 10-fold cross validation, the dataset is divided into ten parts; nine parts are used for training and the remaining part is used for testing. This process is repeated ten times with different training and testing sets, and the mean accuracy over all iterations is reported as the final accuracy [42].
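
For reference, a minimal 10-fold cross-validation sketch with scikit-learn is shown below; the study itself used MATLAB, and the feature matrix and labels here are placeholders.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X = np.random.rand(200, 256)        # placeholder histogram feature vectors (200 signals, 2^8 bins)
y = np.random.randint(0, 2, 200)    # placeholder seizure / non-seizure labels

scores = cross_val_score(SVC(kernel='linear'), X, y, cv=10)   # one accuracy value per fold
print('mean accuracy: %.2f%%' % (100 * scores.mean()))
```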

2.8 Statistical Parameters

The statistical parameters used for evaluating the performance of the proposed method are sensitivity (Sen), specificity (Spe), and accuracy (Acc). These are calculated as follows:

$$ Sen(\% ) = \frac{Tp}{Tp + Fn} \times 100 $$
(7)
$$ Spe(\% ) = \frac{Tn}{Tn + Fp} \times 100 $$
(8)
$$ Acc(\% ) = \frac{Tp + Tn}{Tp + Tn + Fp + Fn} \times 100 $$
(9)

where, true positive (Tp): correctly identified seizure signals, true negative (Tn): correctly identified non-seizure signals, false positive (Fp): incorrectly marked as seizure signals, and false negative (Fn): incorrectly marked as non-seizure signals.
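
Equations (7)–(9) translate directly into code; a small helper with hypothetical names is sketched below.

```python
def sen_spe_acc(tp, tn, fp, fn):
    """Sensitivity, specificity and accuracy in percent (Eqs. 7-9)."""
    sen = 100.0 * tp / (tp + fn)
    spe = 100.0 * tn / (tn + fp)
    acc = 100.0 * (tp + tn) / (tp + tn + fp + fn)
    return sen, spe, acc

# e.g. 98 seizure segments detected, 2 missed, 99 non-seizure segments correct, 1 false alarm
print(sen_spe_acc(tp=98, tn=99, fp=1, fn=2))   # -> (98.0, 99.0, 98.5)
```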

2.9 Dataset

In this research, the publicly available benchmark epilepsy EEG dataset provided by the University of Bonn, Germany, is used [43]. The dataset consists of five subsets (A, B, C, D, E). The standard 10–20 system of electrode placement was followed for signal acquisition. Each subset contains 100 single-channel EEG segments of 23.6 s duration with 4097 data points each. The signals were recorded with a 128-channel amplifier system using a common average reference, and the sampling rate was 173.61 Hz. Subsets A and B contain EEG recordings of five healthy volunteers with their eyes open and closed, respectively. The signals in subsets C and D were recorded from epilepsy patients during seizure-free intervals, from the hippocampal formation of the opposite hemisphere and from within the epileptogenic zone, respectively. The EEG signals in subset E were recorded from patients during seizure activity. The description of the dataset is provided in Table 1.

Table 1 Dataset

In this study, all five subsets have been used and different experimental cases have been tested in order to verify the effectiveness of the proposed approaches. An example EEG signal from each subset is shown in Fig. 7.

Fig. 7
figure 7

EEG epilepsy data set

3 Results

All five subsets (A, B, C, D, E) have been used in this study. In both techniques, the first step is the computation of the transformation code for each signal point. Once the codes for all the signal points have been computed, they are arranged in the form of a histogram. The histogram represents the feature vector of the corresponding EEG signal and is subsequently used for classification with different machine learning classifiers.

The length of the feature vector (l) depends on the number of neighboring points (m) considered in evaluating the transformation code.

For the LCP feature extraction technique, the length of the feature vector is computed as follows:

$$ l_{LCP} = 2^{m} . $$
(10)

For the 1D-LTP feature extraction technique, the length of the feature vector is computed as follows:

$$ l_{1D - LTP} = 2^{m + 1} . $$
(11)
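
For example, with m = 8 neighboring points, Eq. (10) gives l_LCP = 2^8 = 256 bins, while Eq. (11) gives l_1D-LTP = 2^9 = 512 bins (256 bins for LTP^pos concatenated with 256 bins for LTP^neg).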

Small segments of the histogram-based feature vectors (with m = 8) obtained with the LCP and 1D-LTP techniques for the different subsets are shown in Fig. 8.

Fig. 8
figure 8

Small segments of histogram based feature vector

The four machine learning classifiers used in this study are NN, SVM, ANN, and DT. For the NN classifier, the built-in MATLAB functions ClassificationKNN.fit() and predict() have been used for training and testing on the feature vectors, respectively. The svmtrain() and svmclassify() functions have been used for training and classification with a linear-kernel SVM classifier; the kernel parameter was set to 1. For the DT and ANN classifiers, the fitctree() and patternnet() functions have been used. The multilayer perceptron neural network consists of three layers: an input layer, one hidden layer, and an output layer. The input layer nodes represent the extracted features of a signal. After several experiments, it was found that the highest accuracy was achieved when the number of neurons in the hidden layer was set between 30 and 70. The maximum number of iterations and the minimum gradient were set to 1000 and 10^−6, respectively. The scaled conjugate gradient method with the hyperbolic tangent sigmoid transfer function was used. The cvpartition() function was used to randomly partition the input dataset into training and testing sets. The various experimental cases considered in this research are shown in Table 2.
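
An approximate open-source equivalent of this setup can be sketched with scikit-learn; the original work used the MATLAB functions listed above, and scikit-learn has no scaled-conjugate-gradient trainer, so the ANN settings below are only a rough stand-in, not the authors' configuration.

```python
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import cross_val_score

classifiers = {
    'NN':  KNeighborsClassifier(n_neighbors=1),
    'SVM': SVC(kernel='linear', C=1.0),
    'DT':  DecisionTreeClassifier(),
    'ANN': MLPClassifier(hidden_layer_sizes=(50,),   # one hidden layer; 30-70 neurons reported
                         activation='tanh',          # hyperbolic tangent transfer function
                         max_iter=1000, tol=1e-6),   # iteration and gradient limits as reported
}

# X: histogram feature vectors, y: class labels (see the beginning of Sect. 3)
# for name, clf in classifiers.items():
#     print(name, cross_val_score(clf, X, y, cv=10).mean())
```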

Table 2 Experimental cases considered in this research

A number of experiments have been carried out considering different numbers of neighboring points (m). The best results were obtained when the number of neighboring points was set to 8. With m = 8, the mean classification accuracy (Acc) obtained after 10-fold cross validation for the different experimental cases, using the LCP and 1D-LTP feature extraction techniques with different machine learning classifiers, is shown in Tables 3 and 4, respectively. For the 1D-LTP technique, the threshold (t) was set empirically to 10 μV.

Table 3 Mean classification accuracy (%) of LCP with different classifiers after 10-fold cross validation
Table 4 Mean classification accuracy (%) of 1D-LTP with different classifiers after 10-fold cross validation

Among the four machine learning classifiers used in this research, ANN achieved the highest classification accuracy. The sensitivity, specificity, and classification accuracy achieved for selected experimental cases by both techniques (LCP and 1D-LTP) with the ANN classifier are shown in Tables 5 and 6, respectively.

Table 5 Mean sensitivity (%), specificity (%), and accuracy (%) of LCP with ANN
Table 6 Mean sensitivity (%), specificity (%), and accuracy (%) of 1D-LTP with ANN

The classification accuracy achieved by 1D-LBP, LCP, and 1D-LTP feature extraction techniques with different classifiers is shown in Fig. 9.

Fig. 9
figure 9

Mean classification accuracy (%) of 1D-LBP, LCP, and 1D-LTP for different experimental cases after 10-fold cross validation. Different experimental cases: Case 1 (A–E), Case 2 (B–E), Case 3 (C–E), Case 4 (D–E), Case 5 (A–D), Case 6 (AB–E), Case 7 (CD–E), Case 8 (ABCD–E), and Case 9 (A–D–E)

4 Discussion

The benchmark dataset has been used to carry out a fair comparison between the proposed techniques and different methods from the literature. After conducting several experiments, it was found that the LCP and 1D-LTP feature extraction techniques achieve high classification accuracy with the ANN classifier (Tables 5, 6). The experimental results of the proposed techniques and of different methods reported in the literature are presented in Table 7.

Table 7 Authors, year, methods and classification accuracy obtained for some cases in the literature

For case 1 (A–E), Srinivasan et al. [45] reported 100% classification accuracy with entropy features and a neural network. Similarly, Kumar et al. [50] achieved the highest classification accuracy with approximate entropy and SVM. Recently, Lee et al. [52] and Tawfik et al. [55] reported classification accuracies of 98.17 and 99.5%, respectively. In this study, both proposed techniques (LCP and 1D-LTP) achieved 100% classification accuracy with the ANN classifier.

The classification accuracies achieved by LCP with the ANN classifier for cases 2–4 are 99.00, 97.50, and 99.00%, respectively. Similarly, 1D-LTP with ANN achieved classification accuracies of 99.50, 99.50, and 98.00% for these cases. Nicolaou and Georgiou [47] reported classification accuracies of 82.88, 88.00, and 78.98% for these experimental cases with the combination of permutation entropy and SVM. Recently, Kumar et al. [50] conducted a number of experiments and achieved maximum classification accuracies of 100, 99.60, and 95.85% for these cases, respectively.

For cases 5–7, LCP and 1D-LTP achieved classification accuracies (%) of 99.50, 99.33, 98.67 and 100, 99.00, 99.00, respectively. Recently, Pachori and Patidar [53] reported a classification accuracy of 97.67% for case 7 (CD–E) with the combination of intrinsic mode functions and an ANN classifier. For the same case, Kumar et al. [37] reported a classification accuracy of 98.33% with the application of a Gabor filter, LBP, and an NN classifier. The classification accuracies achieved by LCP and 1D-LTP for case 8 (ABCD–E) are 98.60 and 98.20%, respectively. For this case, Kumar et al. [50] achieved an accuracy of 97.38% with approximate entropy and SVM. Recently, for cases 6–8, Tiwari et al. [56] reported high classification accuracies of 100, 99.45, and 99.31%, respectively, with the combination of key-point based LBP and SVM. For case 9 (A–D–E), the classification accuracies achieved by LCP and 1D-LTP are 98.00 and 98.33%, respectively. For case 9, Acharya et al. [22] reported a classification accuracy of 99.00% with the combination of PCA and a GMM model, Kaya et al. [36] reported 95.67% with 1D-LBP, and Orhan et al. [46] reported 96.67% with k-means clustering and an ANN classifier.

These results show that both LCP and 1D-LTP achieve better classification accuracy than several existing techniques proposed in the literature (Table 7). In addition, the proposed techniques are computationally simple and easy to implement. Like 1D-LBP [35], both techniques can be used for processing other one-dimensional signals. The time complexity of the proposed techniques is lower than that of some well known techniques such as PCA, LDA, and DFT. Both techniques also extract features directly from the raw EEG signal.

5 Conclusions

A number of feature extraction techniques have been proposed in the past for epileptic EEG signal classification. Recently, 1D-LBP has gained popularity in this field; however, it is sensitive to local variation. In order to overcome this issue, we have proposed two effective feature extraction techniques called LCP and 1D-LTP. Nine different experimental cases have been tested to validate the effectiveness of the proposed approaches. The highest classification accuracies (%) achieved with LCP and 1D-LTP for the experimental cases A–E, B–E, C–E, D–E, A–D, AB–E, CD–E, ABCD–E, and A–D–E are 100, 99.50, 97.50, 99.00, 99.50, 99.33, 98.67, 98.60, 98.00 and 100, 99.50, 99.50, 98.00, 100, 99.00, 99.00, 98.20, 98.33, respectively. Given this promising performance on the benchmark dataset, it can be concluded that LCP and 1D-LTP are effective feature extraction techniques for EEG signal processing. The proposed techniques are easy to implement and computationally simple, and their time complexity is lower than that of some well known techniques. This research strengthens the direction of developing local transformed feature based techniques for epileptic EEG signal classification. In future work, the effectiveness of these feature extraction techniques may be verified on larger datasets. It is also observed that the histogram-based feature vectors are long; different feature reduction techniques could therefore be applied to reduce their length. Future research also includes the processing of other biomedical signals, such as the electrocardiogram (ECG) and electromyogram (EMG), for the classification of normal and abnormal states using local transformed features.