1 Introduction

Since the first EEG signal acquisition by Berger (1929), researchers have devoted considerable effort to the processing of these signals. EEG signal processing has a variety of applications: medical applications such as seizure detection and prediction (Tzallas et al. 2012; Alickovic et al. 2018), and non-medical applications such as games and safety (Scherer et al. 2013; Makeig et al. 2012). Epilepsy-related studies are dominant in the field of EEG signal processing. Epilepsy is a brain disorder affecting about 1% of the world population, and it is characterized by seizures that are observable in the EEG (Thurman et al. 2011; Tzimourta et al. 2018). The objective of seizure detection is to determine whether a seizure exists in EEG signals, whereas seizure prediction aims to anticipate the seizure with as long a prediction time as possible. Early anticipation of seizures helps to alert patients or caregivers to possible hazards, for example by deploying the prediction system as a communication link between an EEG headset and a mobile device. The efficiency of a seizure prediction algorithm is determined by its prediction rate, false-alarm rate, and prediction horizon.

Different trends have been investigated in the literature for EEG seizure prediction using time-domain techniques, signal transforms, and signal decompositions. Zandi et al. (2010, 2013) adopted a time-domain zero-crossing-rate seizure prediction approach based on histograms of different window intervals. They achieved a sensitivity of 88.34%, a false-prediction rate of 0.155/h, and an average prediction time of 22.5 min. Arabi and He (2012) adopted statistical features including correlation entropy, correlation dimension, Lempel–Ziv complexity, noise level, largest Lyapunov exponent, and nonlinear independence in their patient-specific approach. In their simulation experiments, the maximum sensitivity was 90.2% and the average false-prediction rate was 0.11/h. Shelter et al. (2011) presented a seizure prediction algorithm based on the interaction between pairs of EEG signals; this algorithm achieved a sensitivity of 60%. Wang et al. (2010) presented a seizure prediction system based on reinforcement learning with online monitoring. They achieved an accuracy of 70%. Li et al. (2013) investigated the use of morphological operations and averaging filters for EEG seizure prediction. They achieved a sensitivity of 75.8% and a false-alarm rate of 0.09/h.

Wavelet transform and its versions have also been used for EEG seizure prediction. Hung et al. (2010) developed a wavelet-based seizure prediction algorithm using correlation dimension and its correlation coefficients. They achieved an average sensitivity of 87% with a false-alarm rate of 0.24/h and an average warning time of 27 min. Chiang et al. (2011) developed a wavelet-based seizure prediction algorithm adopting nonlinear independence, cross-correlation, difference of Lyapunov exponents, and phase locking. This algorithm achieved a sensitivity of 74.2% on the MIT database. Gadhoumi et al. (2013) developed a wavelet-based seizure prediction method for iEEG signals that depends on measuring the similarity with a reference signal. They achieved a sensitivity of 85% with a false-alarm rate of 0.35/h. Wang et al. (2013) exploited Lyapunov exponent, correlation dimension, Hurst exponent, and entropy features in the wavelet domain for seizure prediction. They achieved an average sensitivity of 73% and a specificity of 76%. Costa et al. (2008) developed a seizure prediction method based on wavelet energy features. They achieved an average sensitivity of 83% and an average accuracy of 96%. Table 1 gives a comparison between some different seizure prediction methods in the time and wavelet domains.

Table 1 A comparison between some time and wavelet-based seizure prediction methods

The compression of EEG signals is of great importance in the biomedical field, as it helps to overcome channel-capacity limitations, reduce memory usage, and avoid high-bandwidth transmission. Effective compression of EEG signals is very difficult due to the random nature of these signals (Sriraam 2012). Compression techniques are basically classified into lossy and lossless techniques (Ruchi et al. 2016).

In the lossless compression process, no data is lost, and the original data can be exactly reconstructed from their compressed form. All information is saved with no distortion. In contrast, lossy compression is an irreversible process that provides only an approximate version of the original data with some losses. Lossy compression can provide high compression rates. Various algorithms have been adopted for EEG data compression based on lossy techniques.

This paper presents a time-domain approach for EEG channel selection, and hence seizure prediction, based on simple statistics. Its main idea is to discriminate between different signal activities based on their probability density functions (PDFs). Simulation experiments have shown that if the signals are segmented into non-overlapping segments, the PDFs of these segments differ even for segments of the same category. This means that we can treat the bins of each PDF as random variables across segments and select the most appropriate bins for discrimination through simple thresholding processes. Section 2 introduces a brief description of compression techniques. Section 3 gives a detailed explanation of the proposed seizure prediction approach. Section 4 gives the pre-processing steps implemented prior to PDF calculation. Section 5 gives the simulation results. Finally, Sect. 6 gives the concluding remarks.

2 EEG signal compression

Compression of EEG signals is one of the most important solutions to speed up signal transfer and save storage capacity. Compression techniques are basically partitioned into two main branches (Aiupkumar and Bej 2013): lossy and lossless compression techniques.

2.1 Lossless compression techniques

2.1.1 Huffman coding

Huffman coding is a lossless compression technique and a form of entropy encoding. In Huffman coding, variable-length codes are generated for the input symbols based on their probabilities of occurrence, so that frequent symbols receive shorter codes. It can be applied to encode EEG signals for compression purposes.
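The code-assignment step can be sketched in Python as an illustration; the `huffman_codes` helper and the toy symbol stream are assumptions for the sketch, not the paper's implementation:

```python
import heapq
from collections import Counter

def huffman_codes(symbols):
    """Build a Huffman code table from a sequence of symbols.

    Illustrative sketch: the two least-frequent subtrees are merged
    repeatedly, so more frequent symbols end up with shorter codes."""
    freq = Counter(symbols)
    if len(freq) == 1:  # degenerate case: a single distinct symbol
        return {next(iter(freq)): "0"}
    # Heap items: (frequency, tie-breaker, {symbol: code-so-far}).
    heap = [(f, i, {s: ""}) for i, (s, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)
        f2, _, c2 = heapq.heappop(heap)
        # Prefix '0' to one subtree's codes and '1' to the other's.
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (f1 + f2, counter, merged))
        counter += 1
    return heap[0][2]

# Toy quantized samples: the frequent value 0 gets a shorter code.
data = [0, 0, 0, 0, 1, 1, 2, 3]
codes = huffman_codes(data)
```

The resulting table is prefix-free, so a concatenated bit stream can be decoded unambiguously.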

2.1.2 Shannon Fano coding

Shannon–Fano coding is another lossless compression technique, which also falls under the category of entropy encoding. The main concept of this technique is the generation of variable-length codes for the symbols based on Shannon's theorem.

2.1.3 Lempel–Ziv–Welch (LZW) compression

This compression technique is named after Abraham Lempel, Jacob Ziv, and Terry Welch. It builds a dynamic dictionary while scanning the original file: each sub-string read from the input is matched against the dictionary. If the sub-string is found, its dictionary reference is emitted; if it is not found, a new entry with a new reference is added to the dictionary.

2.2 Lossy compression techniques

2.2.1 Discrete sine transform (DST) technique

The DST is similar to the discrete Fourier transform (DFT) but with a purely real transform matrix. The DST is given by Eq. 1. The basic idea of DST compression is to discard some coefficients in the DST domain based on their significance. Figure 1 illustrates an example of an EEG signal before and after compression.

$$y(k)=\sum\limits_{n=0}^{N-1} x(n)\sin \left( \pi \frac{(k+1)(n+1)}{N+1} \right)$$
(1)

where \(k=0, 1, 2, \ldots, N-1\), \(n=0, 1, 2, \ldots, N-1\), and \(y(k)\) is the DST of \(x(n)\).
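Eq. 1 (a DST-I) and the coefficient-discarding idea can be illustrated directly in NumPy. The `keep_ratio` parameter and the test signal are assumptions for this sketch; the inverse uses the DST-I orthogonality property \(S\,S=\tfrac{N+1}{2}I\):

```python
import numpy as np

def dst_matrix(N):
    # S[k, n] = sin(pi (k+1)(n+1) / (N+1)), as in Eq. 1 (DST-I).
    k = np.arange(N).reshape(-1, 1)
    n = np.arange(N).reshape(1, -1)
    return np.sin(np.pi * (k + 1) * (n + 1) / (N + 1))

def dst_compress(x, keep_ratio=0.25):
    """Keep only the largest-magnitude DST coefficients; zero the rest."""
    N = len(x)
    S = dst_matrix(N)
    y = S @ x                              # forward DST (Eq. 1)
    k = max(1, int(keep_ratio * N))
    thresh = np.sort(np.abs(y))[-k]        # magnitude of k-th largest coeff
    y_kept = np.where(np.abs(y) >= thresh, y, 0.0)
    # The DST-I matrix is self-inverse up to a scale: S @ S = ((N+1)/2) I,
    # so reconstruction is a second DST followed by scaling.
    x_rec = (2.0 / (N + 1)) * (S @ y_kept)
    return y_kept, x_rec

t = np.linspace(0, 1, 64, endpoint=False)
x = np.sin(2 * np.pi * 3 * t)              # smooth toy "EEG" signal
y_kept, x_rec = dst_compress(x, keep_ratio=0.25)
```

With `keep_ratio=1.0` the reconstruction is exact; smaller ratios trade reconstruction quality for compression, which is the lossy behaviour shown in Fig. 1.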

Fig. 1
figure 1

EEG signal a before compression, b after compression in DST  domain, c after IDST

3 Proposed seizure prediction approach

The proposed channel selection and seizure prediction approach depends mainly on estimating the PDFs of the signal amplitude, derivative, local mean, local variance, and median of the different signal channels, as illustrated in Fig. 2. The approach comprises two phases, training and testing, as shown in Fig. 3. In the training phase, a few hours are selected randomly for normal activities, and two or three intervals for ictal and pre-ictal activities. The selected multi-channel periods are segmented into 10-s segments. For each channel in each segment, five PDFs are estimated: for the amplitude, derivative, local mean, local variance, and median of the signal.

Fig. 2
figure 2

PDFs estimated from EEG signals for channel selection and seizure prediction

Fig. 3
figure 3

Training and testing phases of the proposed channel selection and seizure prediction algorithm. a Training phase, b testing phase

We treat each of the 9 PDF bins as a random variable across segments for each of the normal, ictal, and pre-ictal histogram classes. Based on predefined false-alarm and prediction probability thresholds, the bins and channels that discriminate between the normal and pre-ictal classes are selected and used in the testing phase. The effect of DST compression is also studied with the proposed approach to estimate its sensitivity to lossy EEG signal compression.

4 Pre-processing of EEG signals

The proposed approach depends on some pre-processing operations carried out on the signal channels, including the derivative, local mean, local variance, and median, as discussed below. The derivative of an EEG signal reinforces the rapid transitions in the signal and damps slow transitions; hence, the different activities of the signal yield more distinguishable derivatives through their PDFs. The local mean is a good indication of the signal trend, and the local variance characterizes the signal power from sample to sample very well. The median filtering process removes spikes that may result from impulsive noise during the signal recording process. Based on the five estimated PDFs (signal amplitude, derivative, local mean, local variance, and median) of any EEG segment, we can discriminate between normal, ictal, and pre-ictal signal segments.

4.1 Signal derivative

In EEG signals, abnormal activities are accompanied by abrupt changes in signal amplitude. To reinforce these abrupt changes, a signal differentiator can be utilized. We use a digital first-order differentiator filter for this purpose. This filter is given by (Kuo et al. 2006; Milić et al. 2013):

$$H\,\left( z \right)=1 - {z^{ - 1}}$$
(2)
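A minimal sketch of the differentiator of Eq. 2, assuming a zero initial condition for the first sample:

```python
import numpy as np

# First-order digital differentiator H(z) = 1 - z^-1 (Eq. 2):
# each output sample is the difference between consecutive inputs.
def differentiate(x):
    y = np.empty_like(x, dtype=float)
    y[0] = x[0]              # zero initial condition (an assumption)
    y[1:] = x[1:] - x[:-1]
    return y

x = np.array([1.0, 1.0, 5.0, 5.0, 1.0])
y = differentiate(x)
# Abrupt changes (1 -> 5, 5 -> 1) yield large derivative samples,
# while flat regions yield zeros.
```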

4.2 Local mean

We can estimate the local mean of a signal X(n) as follows (Abd El-Samie 2011):

$$\widehat {X}(n)=\frac{1}{{(2K+1)}}\sum\limits_{{k=n - K}}^{{n+K}} {X(k)}$$
(3)

where \(2K+1\) is the number of samples in the short segment used in the estimation.

4.3 Local variance

We can estimate the local variance of a signal X (n) as follows (Abd El-Samie 2011):

$$\widehat {\sigma }_{X}^{2}(n)=\frac{1}{{(2K+1)}}\sum\limits_{{k=n - K}}^{{n+K}} {{{\left( {X(k) - \widehat {X}(n)} \right)}^2}}$$
(4)
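Eqs. 3 and 4 can be sketched directly; the clipped window at the segment edges and the value of K are illustrative assumptions, since the text does not specify boundary handling:

```python
import numpy as np

def local_mean_var(x, K=2):
    """Local mean (Eq. 3) and local variance (Eq. 4) over a sliding
    window of 2K+1 samples centered at each sample n; edges use a
    clipped window (an assumption)."""
    n_samples = len(x)
    mean = np.empty(n_samples)
    var = np.empty(n_samples)
    for n in range(n_samples):
        lo, hi = max(0, n - K), min(n_samples, n + K + 1)
        window = x[lo:hi]
        mean[n] = window.mean()                    # Eq. 3
        var[n] = np.mean((window - mean[n]) ** 2)  # Eq. 4
    return mean, var

# A step signal: the variance peaks at the transition, where the
# local power changes, while the mean tracks the signal trend.
x = np.array([0.0, 0.0, 0.0, 10.0, 10.0, 10.0])
mean, var = local_mean_var(x, K=1)
```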

4.4 Median filtering

Median filtering is a sort of nonlinear smoothing of signals. It aims at reducing some of the spikes that may occur in the signals due to impulsive noise. In the median filtering process, an odd number of signal samples is stored and sorted, and the middle value after sorting is extracted. For a median filter of length \(N=2K+1\), the filter output is given as (Yin et al. 1996):

$$Y\left( n \right)=MED\left[ {X\left( {n - K} \right), \ldots ,X\left( n \right), \ldots ,X\left( {n+K} \right)} \right]$$
(5)

where \(X\left( n \right)\) and \(Y\left( n \right)\) refer to the nth sample of the input and output sequences, respectively.

This type of median filtering is non-recursive in the sense that an estimate of the median filter output at any sample time is independent of the median filter output history. There is another type of median filtering which is recursive. For a recursive median filter with window length \(N=2K+1\), the output is defined as (Yin et al. 1996):

$$Y\left( n \right)=MED\left[ {Y\left( {n - K} \right),Y\left( {n - K+1} \right) \ldots ,Y\left( {n - 1} \right),X\left( n \right), \ldots ,X\left( {n+K} \right)} \right]$$
(6)

This recursion process is a type of feedback that reduces noise more efficiently.
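Both filter types of Eqs. 5 and 6 can be sketched as follows; the edge padding is an assumption, since the text does not specify boundary handling:

```python
import numpy as np

def median_filter(x, K=1):
    """Non-recursive median filter (Eq. 5): the output at n is the
    median of the 2K+1 *input* samples centered at n."""
    x = np.asarray(x, dtype=float)
    padded = np.pad(x, K, mode="edge")   # edge padding is an assumption
    return np.array([np.median(padded[n:n + 2 * K + 1])
                     for n in range(len(x))])

def recursive_median_filter(x, K=1):
    """Recursive median filter (Eq. 6): past *outputs* replace past
    inputs in the window, giving stronger smoothing."""
    x = np.asarray(x, dtype=float)
    y = np.pad(x, K, mode="edge")
    for n in range(K, K + len(x)):
        # y[n-K:n] already holds outputs; y[n:n+K+1] still holds inputs.
        y[n] = np.median(y[n - K:n + K + 1])
    return y[K:K + len(x)]

# An isolated spike (impulsive noise) is removed by both filters.
x = [0.0, 0.0, 9.0, 0.0, 0.0]
```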

5 Simulation results and discussion

Simulation experiments have been carried out on five patients from the MIT database (patients 1, 8, 11, 14, 20), with 148.6133 h of recordings containing 31 seizures (http://physionet.org/pn6/chbmit/). To better understand the steps of the proposed approach, we display some results for patient 20. First, we estimate the PDFs of the normal, ictal, and pre-ictal 10-s segments. Examples of these PDFs are shown in Figs. 4, 5, and 6 for normal, ictal, and pre-ictal segments of patient 20. Three PDFs of three randomly-selected segments of each type are shown in the figures. The histograms are estimated for the log values of the amplitudes to cover the large dynamic range of the signals. It is clear from these figures that each bin has different values from segment to segment; hence, it is necessary to consider each bin value as a random variable across segments and determine its PDF.

Fig. 4
figure 4

PDFs of the amplitude, derivative, local mean, local variance, and median for three randomly selected normal segments. a Amplitude distribution, b derivative distribution, c local mean distribution, d local variance distribution, e median distribution

Fig. 5
figure 5

PDFs of the amplitude, derivative, local mean, local variance, and median for three randomly selected ictal segments. a Amplitude distribution, b derivative distribution, c local mean distribution, d local variance distribution, e median distribution

Fig. 6
figure 6

PDFs of the amplitude, derivative, local mean, local variance, and median for three randomly selected pre-ictal segments. a Amplitude distribution, b derivative distribution, c local mean distribution, d local variance distribution, e median distribution

Our main objective is to distinguish between normal and pre-ictal activities for early seizure prediction. Hence, we have estimated the PDF of each bin from the normal and pre-ictal PDFs over all segments. If we can distinguish between these PDFs with a certain threshold for some selected bins, we can carry out a prediction process with these bins and their corresponding channels. Towards this objective, we set predefined prediction and false-alarm probability thresholds of 70% and 30%, respectively, for patient 20, for instance. The histogram bins from the channels that satisfy these two constraints are selected to create a prediction matrix for each patient.

Table 2 illustrates five rows of the prediction matrix obtained for patient 20. The first column in this matrix represents the feature type: amplitude (1), derivative (2), local mean (3), local variance (4), and median (5). The second column represents the index of the selected channel, ranging from 1 to 23 for patient 20. The third column represents the index of the selected bin from the histogram of the feature in the first column, ranging from 1 to 9. The fourth and fifth columns represent the prediction and false-alarm probabilities, respectively, achieved with the selected bin from the selected channel and feature. The prediction probability must be > 70%, while the false-alarm probability must be < 30%. The sixth column represents the estimated threshold value, to which we compare the corresponding bin value from each incoming segment of the same channel in the testing phase. The last column has a value of 1 if an incoming bin value must be greater than or equal to the threshold, and 0 otherwise.
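The testing-phase use of the prediction matrix can be sketched as follows; the matrix values, array layout, and helper name are hypothetical illustrations of the column description above, not the actual entries of Table 2:

```python
import numpy as np

# Hypothetical prediction-matrix rows, following the column layout
# described in the text:
# [feature_type, channel, bin, P_pred, P_fa, threshold, direction]
# direction 1: "bin value >= threshold" votes pre-ictal;
# direction 0: "bin value <  threshold" votes pre-ictal.
prediction_matrix = np.array([
    # feat  chan  bin  P_pred  P_fa  thresh  dir
    [1,     3,    7,   0.82,   0.21, 0.05,   1],
    [4,     3,    9,   0.75,   0.18, 0.12,   1],
    [2,     10,   1,   0.71,   0.25, 0.30,   0],
])

def discrimination_count(matrix, pdfs):
    """Sum the binary votes of all matrix rows for one segment.

    `pdfs[feature, channel, bin]` holds the 9-bin PDFs of every
    feature of every channel of an incoming 10-s segment
    (1-based indices in the matrix, hence the -1 offsets)."""
    count = 0
    for feat, ch, b, _, _, thresh, direction in matrix:
        value = pdfs[int(feat) - 1, int(ch) - 1, int(b) - 1]
        vote = value >= thresh if direction == 1 else value < thresh
        count += int(vote)
    return count

# Toy PDF array (5 features x 23 channels x 9 bins), nonzero only
# at the bins the matrix rows inspect.
pdfs = np.zeros((5, 23, 9))
pdfs[0, 2, 6] = 0.10   # above 0.05 -> votes pre-ictal
pdfs[3, 2, 8] = 0.05   # below 0.12 -> no vote
pdfs[1, 9, 0] = 0.10   # below 0.30 with direction 0 -> votes pre-ictal
count = discrimination_count(prediction_matrix, pdfs)
```

The accumulated count per segment is the quantity that is subsequently smoothed and thresholded.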

Table 2 Six rows of the prediction matrix of patient 20

Figure 7 illustrates the PDFs of the bins in Table 2. It is clear from the figures and the table that the selected threshold is not exactly the intersection point of the two curves. It is selected as the value that maximizes the prediction probability while maintaining the false-alarm probability below 30%. In other words, we accept more false alarms in order to maximize the prediction probability, relying on the fact that multiple decisions will be combined for each signal segment, which in turn reduces the false alarms. A binary decision is taken for each row of the prediction matrix, and an accumulative sum is estimated for each 10-s incoming segment in the testing phase to classify it as pre-ictal or not. A moving average filter is used to refine the results, as shown in Fig. 8, because a decision about a certain activity is not taken from a single signal segment; multiple segments are required for this decision, and hence the moving average process is appropriate. The discrimination count on the vertical axis is compared with a selected threshold to determine the pre-ictal regions (Fig. 9).
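The decision-fusion and moving-average refinement can be sketched as follows; the window length, the alarm threshold, and the toy count sequence are assumptions for illustration:

```python
import numpy as np

def smooth_counts(counts, win=6):
    """Moving-average refinement of per-segment discrimination
    counts: a pre-ictal alarm needs several consecutive segments
    to agree, not a single one. The window length is an assumption
    (6 segments = 1 min of 10-s segments)."""
    kernel = np.ones(win) / win
    return np.convolve(counts, kernel, mode="same")

# Toy per-segment counts: an isolated spurious vote, then a
# sustained pre-ictal run, then quiet again.
counts = np.array([0, 1, 0, 8, 9, 8, 9, 8, 0, 1, 0, 0], dtype=float)
smoothed = smooth_counts(counts)
alarm = smoothed >= 4.0        # threshold on the smoothed count
```

Isolated votes are attenuated by the averaging, while the sustained run keeps the smoothed count above the threshold, which is how the false alarms are reduced.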

Fig. 7
figure 7

PDFs of selected bins in Table 2

Fig. 8
figure 8

Variation of the discrimination count with time for the selected five patients. a Patient 1, b patient 8, c patient 11, d patient 14, e patient 20

Fig. 9
figure 9

Variation of the discrimination count with time for three patients (8, 14 and 20) with DST Compression technique. a Patient 8, b patient 14, c patient 20

In the simulation experiments, we have tested three different prediction horizons of 30, 60, and 90 min. In addition, a 15-min post-seizure horizon has been adopted in the interpretation of the results. The obtained results for the five MIT patients are given in Tables 3, 4, 5 and 6. From these results, it is clear that long prediction horizons are preferable to short ones in terms of both prediction and false-alarm rates. In addition, the moving-average strategy contributes to the reduction of false alarms in the interpretation of the seizure prediction results.

Table 3 Summary of the prediction results (without compression)
Table 4 Prediction results for patient 8 with DST compression
Table 5 Prediction results for patient 14 with DST compression
Table 6 Prediction results for patient 20 with DST compression

6 Conclusions

This paper presented a statistical time-domain approach for EEG channel selection and seizure prediction, which depends on estimating the PDFs of the signals and of pre-processed versions of them. The approach is of a multi-channel nature, and it depends on pre-defined constraints on the required prediction and false-alarm probabilities. Decision fusion and moving-average post-processing steps are utilized to reduce false alarms and to make robust decisions regarding signal activities. The proposed approach has been tested for different prediction horizons. It achieved prediction rates of 90.32%, 93.55%, and 93.55% for prediction horizons of 30, 60, and 90 min, respectively, with false-alarm rates of 0.148/h, 0.074/h, and 0.054/h, respectively. The average prediction times were 22.63 min, 34.25 min, and 40.96 min for the 30, 60, and 90 min horizons, respectively. These results reveal that the proposed EEG seizure prediction approach can be appropriately used in a mobile application for epilepsy patients and caregivers.