1 Introduction

The incidence of infantile spasms (ISS), which is a severe infantile epilepsy syndrome, occurs between 2 and 3.5 per 10,000 live births [12, 21]. The seizure semiology in ISS is an epileptic spasm with contraction of muscles of the neck, trunk, and extremities and often occurs in clusters. If left untreated, there is risk for intractable seizures throughout life as well as greater intellectual disability. Given that ISS was described 174 years ago, assessment, diagnosis, and management for the disorder are still very enigmatic, and physicians have not yet reached consensus on objective protocols for diagnosing ISS [20]. The diagnosis of ISS is based on both the semiology of the seizure and the EEG pattern of hypsarrhythmia (HYPS); however, variations of HYPS can make the diagnosis of ISS challenging.

HYPS is characterized by a chaotic background that has high amplitude with multifocal discharges. The interrater reliability of HYPS in ISS has proven to be poor, and quantification of discharges from EEG readings remains challenging and subjective [10]. In particular, an expert electroencephalographer has to interpret an EEG by inspecting and approximating the characteristics of HYPS subjectively rather than through objective quantification. Due to the complex nature of these signals, even experienced EEG readers tend to interpret HYPS differently, which can have serious implications in the treatment of the infant [10, 1]. In order to address such a limitation, the objective of the present study is to develop a novel method that could assist clinicians to successfully identify and localize the epileptic discharges associated with ISS in HYPS. Quantification of the features of HYPS would make the EEG interpretation more reliable and objective, which could potentially improve management and ultimately the success of the prescribed treatment.

Several algorithms have been developed to detect epileptic discharges during epilepsy. The methods based on template matching techniques require an EEG reader to pre-specify spike characteristics such as amplitude and duration of the discharges to design epileptic spike templates. Once those values are defined, the algorithm searches in the time domain for waveforms that match the template in order to locate epileptic discharges [11, 15, 17]. The methods based on mimetic techniques are also time domain-based where use a pre-specified characteristics, such as, amplitude, slope, and duration to locate spike [13,10,, 16, 25]. The methods based on time–frequency techniques transform the EEG signal from the time domain to the joint time–frequency domain and perform the spike detection in the transformed domain. For example, in [19] the authors use wavelet transform to highlight the spikes in the scalogram domain, where the higher frequencies represent a high temporal resolution and is suitable to spike detection. However, the aforementioned methods have been developed for epileptic discharge detection in EEG signals associated with other types of epilepsy, but not in the presence of HYPS. Given the chaotic appearance of EEG during HYPS (see Fig. 1), there is a need for a novel method, which is able to detect epileptic discharges with multiple foci and varying morphologies associated with ISS in HYPS. The proposed algorithm applies a high-resolution time–frequency (TF) representation known as matching pursuit TF domain to transform the EEG signals into TF domain. It then uses a matrix decomposition method and a novel TF feature extraction method to characterize and locate epileptiform discharges associated with ISS in the presence of HYPS. In a previous work, we developed a novel method to detect such epileptiform discharges during HYPS [24]. The approach consisted of four stages: First, construct the time–frequency domain (TFD) of the EEG recording using matching pursuit TFD (MP-TFD). Second, decompose the TFD matrix into two submatrices (i.e., spectral components in \(\mathbf{W }\) and corresponding temporal components in \(\mathbf{H }\)) using nonnegative matrix factorization (NMF). Third, extract a spectral feature from every decomposed spectral component in W, and fourth, compare this spectral feature with a preset threshold and if larger then use the corresponding temporal component to locate the epileptiform discharges. The algorithm was successful in detecting the spikes, but the false positive rate was relatively high. In the present paper, we further expanded our algorithm to address this limitation. The algorithm was successful in detecting the spikes, but the false positive rate was relatively high. In the present paper, we further extended our algorithm to address this limitation. The technical contributions of the present work include: (1) Selected NMF model order using a Bayesian NMF method based on automatic relevance determination instead of selecting an empirical model order. (2) Extracted several temporal features from the decomposed temporal components in \(\mathbf{H }\) in addition to the feature from the spectral component, and (3) trained a classifier based on support vector machine (SVM) and employed it to detect the epileptiform discharges. In Sect. 2, we presented our novel TF feature extraction and classification algorithm for epileptiform discharge detection. The proposed method was evaluated on a dataset of patients with ISS as reported in Sect. 3. The results were discussed in Sect. 4 and the paper was concluded in Sect. 5.

Fig. 1
figure 1

A representative bipolar montage EEG recording exhibiting HYPS. The EEG appears highly disorganized with high amplitude waveforms and contains multifocal discharges with variable morphologies

2 Materials and methods

2.1 Dataset and preprocessing

The EEG recordings from five infants (4–9 months old) with ISS were used to evaluate the proposed spike detection algorithm in the presence of HYPS. Subject consent was obtained through the Infantile Spasms Registry and Genetic Studies via a protocol approved by the University of Rochesters Research Subjects Review Board. A 5-min section of awake EEG was selected for each subject. All EEGs were recorded based on the international standard 10-20 system with sampling rates of 256 {2}, 500 {1}, and 512 {2} samples per second where {.} indicates the number of patients. The recording EEGs were imported to Persyst EEG software (Persyst, San Diego, CA) for artifact reduction and then were imported into MATLAB and bandpass filtered (0.5–30 Hz) for further analysis. All the epilepticform discharges were manually marked by an epileptologist.

2.2 Construct time–frequency domain

Time–frequency domain (TFD) provides a two-dimensional energy representation of a signal in terms of its temporal and spectral content. However, the time and frequency resolution of a TFD determines the successful representation of the epileptiform discharges. For example, short-time Fourier transform (STFT), which is the most common TFD, may not present the transient waveforms associated with the epileptiform discharges of interest. The main reason for such a behavior is the time and frequency resolution tarde-off in STFT, which means that if we reduce the window size in an STFT, the frequency resolution will be decreased and vice versa [18]. Therefore, in this work, a high-resolution TFD known as the matching pursuit TFD (MP-TFD) is employed. The MP-TFD provides a high-resolution representation of the temporal and spectral information and is suitable for highly non-stationary data [8, 5].

The implementation of MP-TFD for an input EEG signal, x(t), consists of two stages. In stage one, the signal, x(t), is iteratively decomposed over an overcomplete dictionary of TF atoms as shown in Eq. (1).

$$\begin{aligned} x(t) = \sum _{i=0}^{I-1}{\left\langle R_{x}^{i},g_{\gamma i} \right\rangle }g_{\gamma i}(t)+R_{x}^{I}, \end{aligned}$$
(1)

where \({\left\langle R_{x}^{i},g_{\gamma i} \right\rangle }\) is the expansion coefficient on a TF atom \(g_{\gamma i}(t)\) at ith iteration and \(R_{x}^{I}\) is a residue signal after I iterations. At every iteration, i, the input signal is correlated with all the possible atoms from the redundant dictionary, \({D}= \left\{ g_{\gamma _r}(t)\right\} _{r=1:R}\) and the atom with the maximum correlation magnitude is selected as \(g_{\gamma i}(t)\). In this work, we selected a redundant dictionary of TF Gabor atoms as described in the following equation:

$$\begin{aligned} g_{\gamma _r} = \frac{1}{\sqrt{s_{r}}} g\left( \frac{t-p_{r}}{s_{r}}\right) e^{j(2\pi f_{r}t+\phi _{r})}, \end{aligned}$$
(2)

where \(\gamma _{r}\) represents the TF decomposition parameters (\(s_{r},p_{r},f_{r},\phi _{r}\)) denoted as scale factor, translation, frequency modulation and phase, respectively, and g in our work is the Gabor function [18]:

$$\begin{aligned} g = 2^{({1}/{4})}e^{-\pi t^2}. \end{aligned}$$
(3)

In addition to the high TF resolution of the MP-TFD that will be achieved using Gabor atoms, if large enough numbers of iterations are used in the MP decomposition step, then most of the coherent energy of the signal will be modeled by the Gabor atoms. The residue term can be easily ignored as it contains the non-coherent noise of the EEG signal, x(t).

The second stage of the MP-TFD implementation consists of summing the Wigner Ville distribution (WVD) [4, 6] of each decomposed Gabor atom as shown in the following equation:

$$\begin{aligned} \mathbf MP = \sum _{i=0}^{I-1}{\left| {\left\langle R_{x}^{i},g_{\gamma i} \right\rangle }\right| }^{2}{\mathbf{W }}_{g_{\gamma i}}(t,f), \end{aligned}$$
(4)

where \(\mathbf{W }_{g_{\gamma i}}(t,f)\) is the WVD of each MP selected Gabor atom \(g_{\gamma i}(t)\), and \(\mathbf MP \) is the MP-TFD constructed for the EEG signal, x(t). It is known that WVD provides the highest possible TF resolution for a mono-component signal [6]. Since Gabor atoms \(g_{\gamma }(t)\) are also mono-components, the constructed MP-TFD will inherits the high TF resolution of the WVD and can provide a suitable representation domain for analyzing the epileptic spikes of EEG signals during Hyps.

2.3 TF matrix decomposition

There are various techniques for the dimensionality reduction in the TFD data. One effective approach is the application of a matrix decomposition on the TFD to determine a low-rank factorization of the TFD dataset into low-dimensional components, which are suitable for feature extraction and classification applications. There are several matrix decomposition methods including singular value decomposition (SVD), principle component analysis (PCA), and nonnegative matrix factorization (NMF). However, only NMF enforces the non-negativity of the decomposed components and provides decomposed bases with nonnegative elements, which can result in a natural interpretation for the TFD data. As shown in the previous studies [19,20,21,22,2], NMF has been successfully employed in the TF feature extraction applications. It decomposes the nonnegative TFD data into two nonnegative factors denoted as temporal basis matrix (i.e., \(\mathbf{H }\)) and spectral basis matrix (i.e., \(\mathbf{W }\)), which correspond to the temporal and spectral structure of the TFD data, respectively. The following equation describes the matrix decomposition by NMF:

$$\begin{aligned} \mathbf MP _{F\times N} \approx \mathbf{W }_{F\times K} \cdot \mathbf{H }_{K\times N} \approx \sum _{k=1}^{K}w_{k}h^{T}_{k}, \end{aligned}$$
(5)

where the non-negativity of \(\mathbf{W }\) and \(\mathbf{H }\) is the decomposition constraint, F denotes the number of elements in the frequency domain, N denotes the length of signal in time domain, and the K (\(<< \text {min}(F,N)\)) is the model order that dictates the number of decomposed bases, and \([\cdot ]^{T}\) denotes transposition of the bases \({h_k}\).

2.3.1 Model estimation

Selection of the NMF model order, K, is usually performed arbitrary or by trial and error; however, determining an effective value for K (i.e., \(K_{\text {eff}}\)) could avoid overfitting and underfitting of the NMF decomposition to the TFD data. In this study, we employed a Baysian NMF method based on automatic relevance determination (ARD) [22] to select the order of the EEG TFD decomposition. In this probabilistic ARD model, each \(k\)th spectral bases of \(\mathbf{W }\) and its corresponding temporal bases from \(\mathbf{H }\) are related together through a common scale parameter or relevance weight denoted as \(\lambda _k\). If \(\lambda _k\) is small, it means that those bases are driven to zero or decoupled, leading to a more sparse model and smaller \(K_{\text {eff}}\). The parameter, \(\lambda _k\), is obtained from an inverse Gamma prior as shown below:

$$\begin{aligned} p({\lambda _k}| a, b) = \frac{b^a}{\Gamma (a)} \lambda _k^{-(a+1)} \text {exp}(-\frac{b}{\lambda _k}), \end{aligned}$$
(6)

where a and b, which are fixed for all k, denote shape and scale hyperparameters for \(\mathbf{W }\) and \(\mathbf{H }\) models when exponential priors (\({\mathcal {E}}\)) are selected as described below:

$$\begin{aligned}&p(h_{k}^T|\lambda _k) = {\mathcal {E}}(h_{k}^T|\lambda _k),\end{aligned}$$
(7a)
$$\begin{aligned}&\quad p(w_{k}|\lambda _k) = {\mathcal {E}}(w_{k}|\lambda _k) \end{aligned}$$
(7b)

where, given \(u \in \{w, h_{k}^T\}\), \({\mathcal {E}}(u|\lambda _k) = \frac{1}{\lambda _k}\text {exp}(-\frac{u}{\lambda _k})\) if \(u \ge 0\). Otherwise, \({\mathcal {E}}(u|\lambda _k) = 0\) for \(u < 0\).

The Baysian NMF enforces the following objective function:

$$\begin{aligned} \begin{aligned} C(\mathbf{W },\mathbf{H },\varvec{\lambda })&= - \text {log}\, \,p(\mathbf{W },\mathbf{H },\varvec{\lambda }|\mathbf{V }) \\&= D(\mathbf{V }|\mathbf{W }\mathbf{H }) + c \sum _{k=1}^{K}\frac{1}{\lambda _k}(f(w_k) + f(h_k) + b) + \\&\quad \,\, c\,\text {log}\,\lambda _k + \text {const}, \end{aligned} \end{aligned}$$
(8)

where D(.|.) is the KL-divergence operator, \(f(w_k)\) and \(f(h_k)\) are model functions, \(f(u) = ||u||_{1}\), where \(u \in \{w_k,h_{k}^T\}\), \(c = F+N+a+1\), and \(\varvec{\lambda }= (\lambda _1, \lambda _2,\ldots ,\lambda _K)\). The first term in Eq. (8) minimizes the error in the data fitting, and the second and third terms regularize the fit to the common scale parameter, \(\lambda _k\). If \(\lambda _k\) is large, the second term is suppressed while the other term is increased. This inverse proportion aims to prune irrelevant components out of the model, causing it to become sparse. If \(\mathbf{W }\) and \(\mathbf{H }\) are assumed to be known, Eq. (8) can be used to find parameter \(\varvec{\lambda }\) by optimizing that cost function with respect to \(\varvec{\lambda }\), and then by keeping \(\varvec{\lambda }\) known, to solve for \(\mathbf{W }\) and \(\mathbf{H }\) by optimizing the following cost function:

$$\begin{aligned} \begin{aligned} C(\mathbf{W },\mathbf{H })&= \frac{1}{\phi } D(\mathbf{V }|\mathbf{W }\mathbf{H }) + \\&\quad c \sum _{k=1}^{K}\text {log}(f(w_k) + f(h_k) + b) + \text {const}, \end{aligned} \end{aligned}$$
(9)

where \(\text {const} = Kc(1-\text {log}c)\). This process starts with initial values of K (\(<K_{\text {eff}}\)), \(\mathbf{W }\), and \(\mathbf{H }\), and will be repeated iteratively as shown in Algorithm 1 until there is not a significant change (<tol) in the value of the common scale parameter, \(\varvec{\lambda }_k\). In this paper, majorization–minimization [9] optimization method was used to solve for \(\mathbf{W }\) and \(\mathbf{H }\) from Eq. (9).

figure e

After the algorithm is converged, \(K_{\text {eff}}\) is calculated as the number of components that the ratio of \(\frac{\lambda _k - \frac{b}{c}}{\frac{b}{c}}\) is strictly larger than the threshold tol. This step is shown in the following equation:

$$\begin{aligned} K_{\text {eff}} = |\{k \in \{1,\ldots ,K\}: \frac{\lambda _k - \frac{b}{c}}{\frac{b}{c}} > tol\}|. \end{aligned}$$
(10)

2.3.2 Feature extraction algorithm

The MP-TFD of an EEG signal is decomposed to its TF spectral and temporal bases, (i.e., \(w_{k}\) and \(h_{k}\), respectively) using the NMF Eq. (5) and \(K_{\text {eff}}\) calculated in Eq. (10). As shown in Fig. 2a, b, each temporal basis may consist of multiple semi-Gaussians; for example, the temporal basis in Fig. 2b has two temporal semi-Gaussians. Each temporal semi-Gaussian indicates the presence of its corresponding spectral basis in time. Our objective is to identify the temporal semi-Gaussians that represent epileptic spikes and to make this happen, we extract a set of features from each temporal semi-Gaussian such that they are discriminative between the epileptic spikes and the rest of EEG signals. Since the epileptic spikes are generally characterized by their high frequency contents as well as their short intervals in the time domain, we propose five features where three (i.e., A\(_{h}\), C\(_{h}\), and B\(_{h}\)) are from the temporal semi-Gaussian and the other two are from the corresponding spectral basis (i.e., MF\(_{w}\) and A\(_{w}\)). For example, there are two temporal semi-Gaussians in the decomposed TF spectral and temporal bases shown in Fig. 2b. Hence, two feature bases are extracted where each feature vector contains five feature elements denoted as MF\(_{w}\), A\(_{w}\), A\(_{h}\), C\(_{h}\), and B\(_{h}\) and are extracted as follows:

Spectral features The spectral mean frequency (MF) of each spectral basis, \({w_{k}}\), is calculated as one of the features as shown below:

$$\begin{aligned} \text {MF}(w_{k}) = \frac{\sum _{f=1}^{F}fw_{k}(f)}{\sum _{f=1}^{F}w_{k}(f)} f_s, \end{aligned}$$
(11)

where MF(\({w_{k}}\)) is the mean frequency value of spectral basis \(w_{k}\), k \(\in \) [1, 2,..., \(K_{\text {eff}}\)], F is the number of frequency samples in \(w_{k}\), and \(f_s\) is the sampling frequency of the EEG recording. The other spectral feature, A\(_{w}\), is extracted as the area under the spectral basis. To calculate this feature, first the basis is normalized to its maximum magnitude value and then the area under the normalized basis is calculated at shown in Fig. 2a.

The MF values provide a measure of the frequency content of the decomposed spectral basis. If the MF value of a spectral basis is high, it means that basis represents high frequency content of the MP-TFD. Since epileptic spikes contain high frequency structures, it is expected that MF feature can separate spike basis from the rest of EEG structure. The area under the normalized basis indicates a measure of the spread of the energy over frequency domain. The epileptic spikes are specified by a spread of energy over frequency, so the area feature is designed to distinguish the spectral basis with a large spread as potential epileptic spikes.

Temporal features For every temporal semi-Gaussian from the temporal bases, \(h_k\), three features are extracted: the area under the plot, A\(_{h}\), the height or maximum value, C\(_{h}\), and the duration in time, B\(_{h}\). Figure 2b demonstrates these three features for a temporal basis with two semi-Gaussians. The epileptic spikes tend to be high bursts of energy over a short period of time so larger values of A\(_{h}\) and C\(_{h}\) with a small value of B\(_{h}\) is expected to differentiate spike-related semi-Gaussians from the rest of EEG structure.

As explained above for every temporal semi-Gaussian in the temporal bases five TF feature elements are extracted from the temporal basis and its corresponding spectral basis. Hence, the number of the extracted TF feature vectors is equal to the number of semi-Gaussians in the temporal basis. For example, if an NMF with \(K_{\text {eff}}=3\) is applied to a 10-s duration of an EEG signal and the number of semi-Gaussians in the temporal basis are 2, 1, and 4, a total of seven (i.e., 2 + 1 + 4) TF feature vectors will be extracted from the 10 s-signal, where each feature vector contains five feature elements.

Fig. 2
figure 2

Proposed TF feature extractions and the corresponding feature vectors are shown in this figure. Two feature vectors (F\(_1\) and F\(_2\)) are extracted from the spectral and temporal basis shown in (a) and (b), respectively, as shown in (c). The thick-dashed vertical line on the first temporal semi-Gaussianis associated with a spike marked by the epiletologist. Hence feature vector F\(_1\) is labeled as class “1” while the other feacture vector is labeled as “0”

2.3.3 Classification

Once the TF feature vectors are extracted, we use support vector machine (SVM) for the classification of the epileptic spikes. First, each extracted TF feature is labeled as ‘1’ and ‘0,’ where ‘1’ indicates the presence of an epileptic spike at the center of the corresponding temporal semi-Gaussian. For example, in Fig. 2b, the red dashed line indicates the presence of an epileptic spike as was marked by an epileptologist. Hence, the label of the first feature vector was assigned to ‘1’ while the other label was set to ‘0.’ Then, SVM hyperplanes are trained to separate the two classes in the feature space, which will later be used to classify any new data. In general, SVM integrates two main concepts. The first concept determines maximization of an optimum hyperplane, which maximizes the distance to the closest data point on each side. The other concept applies a kernel function to map the feature vectors from lower to higher dimensional spaces in order to improve the separation between the two classes. In this study, we used radial basis function (RBF) kernel, which empirically showed to be suitable for locating epileptic spikes in the given EEG dataset. The details of the classification parameters are explained in the next section.

3 Results

For each subjects, two EEG channels, \(P_{4}\) and \(O_{2}\), were selected for our further epileptic spike detection analysis. First, each 5-min EEG recording was divided into thirty 10-s segments. Each segment was transformed to MP-TFD with \(F=512\) and \(I=1000\) iterations. The model order for the NMF algorithm (i.e., \(K_{\text {eff}}\)) was estimated using the ARD method explained in the methods section. The model order, K, was initialized 30, the maximum number of iteration was selected to be 3500, and \({\text {tol}} = 10^{-6}\). The hyperparameter parameter a of the exponential priors was selected to be any of the values from {0.5, 100, 500, 1000} [22], and \(K_{\text {eff}}\) was determined for each value of a. Figure 3 shows the range of the estimated \(K_{\text {eff}}\) for a, and as can be seen in this figure when the hyperparameter parameter a increases, \(K_{\text {eff}}\) increases; however, the change is not significant. Based on this analysis, we selected the average of all the estimated model orders (i.e., 9.32 \(\approx \) 10) as the \(K_{\text {eff}}\) in this study. Hence, \(K=10\) is selected for the model order of the NMF in Eq. (5).

Fig. 3
figure 3

Results of \(K_{\text {eff}}\) versus the hyperparameter parameter a where \(a = \{0.5, 100, 500, 1000\}\). The average \(K_{\text {eff}}\) is 9.32 \(\approx \) 10

NMF decomposed 10 spectral and temporal bases from each 10-s EEG segment, and TF features were extracted as explained in Sect. 2. The ‘1’ and ‘0’ labels were assigned according to the epileptologist’s markers of the epileptic discharges. We applied a statistical test to identify the significance of each extracted feature. All the features were statistically significant (p value \({<}0.0001\)). The plots in in Fig. 4 show the normalized feature values for the non-epileptic spikes (NS) and epileptic spikes (S). As can be observed from this figure, the features provide a distinction between the non-epileptic vs. epileptic discharges.

Fig. 4
figure 4

Plot below shows the normalized features for non-epileptic spike (NS) and epileptic spike (S) feature vectors

The extracted TF features were fed to an SVM classifier with RBF kernel with parameter \(\sigma =\) {0.5, 1, 5, 10}. The TF features were partitioned into 80% for training and 20% for testing from each class. Once the parameters of the hyperplane are learned using the training dataset, the SVM classifier is used to classify the test data. The performance is evaluated by obtaining the receiver operating characteristic (ROC) curve and calculating the area under curve (AUC). The larger the AUC is, the better is the performance of the proposed method for localizing the epileptic spikes. A cross-validation was performed with repeating the testing and classification process 10 times where each time used a different selection for the training and testing set. The average and standard deviation values of the accuracy (Acc), precision (Prec), sensitivity (Sens), specificity (Spec), and AUC for the 10 experiments are reported in Table 1. The best and worst AUC values (i.e., 98.51 and 90.38%, respectively) were achieved using \(\sigma =\) 10 and \(\sigma =\) 0.5, respectively. Figure 5 shows the ROC plots for the best and worst performances. As it can be seen, the RBF kernel with parameter of \(\sigma =\) 10 shows the highest performance and was selected in the proposed study.

Fig. 5
figure 5

ROC plots of the best and worst performances with AUC values of 90.38 and 98.51%, respectively

Table 1 Classification performance of the proposed TF feature extraction algorithm

We demonstrated the performance of our algorithm in case of one 10-s EEG segment as shown in Fig. 6. The temporal and its corresponding MP-TFD representations are shown in Fig. 6a, b, respectively. The vertical dashed lines indicate six epileptic spikes marked by the epileptologist. We applied NMF with \(K_{\text {eff}}\) =10 to the MP-TFD representation in Fig. 6b and decomposed 10 spectral and temporal bases \([w_1,w_2,\ldots , w_{10}]\) and \([h^T_1,h^T_2,\ldots , h^T_{10}]\), respectively. The decomposed spectral and temporal bases are shown in Fig. 6c, d, respectively. For visualization purposes, the spectral bases were rearranged from the high to low mean frequency value, MF(\({w_{k}}\)), where k \(\in \) [1, 2,..., 10]; for example, in Fig. 6c the 4th spectral basis had the highest mean frequency value while the 5th spectral basis had the lowest value. The temporal bases in Fig. 6d were also rearranged accordingly. We extracted a feature vector (with 5 feature elements) from every semi-Gaussian in the temporal bases and the corresponding spectral bases and applied the trained SVM classifier to classify each feature vector. The center of the semi-Gaussians that their corresponding feature vectors were classified as epileptic spikes were identified as the location of the epileptic spikes. In the example shown in Fig. 6, eleven temporal semi-Gaussians were classified as the epileptic spikes. Only the 4th and 2nd bases had feature vectors, which were classified as epileptic spikes and the remaining bases were classified as non-epileptic spike signals. The 4th and 2nd temporal bases are zoomed out and shown in Fig. 6e. The temporal semi-Gaussians that were classified as epileptic spikes are marked by the stars and the algorithm reported the centers of those temporal semi-Gaussians (i.e., the red stars in Fig. 6e) as the locations of the epileptic spikes in the EEG signal. In this example, all the six epileptic spikes were successfully located by the algorithm.

Fig. 6
figure 6

a A 10-s EEG signal, b the MP-TFD representation of the EEG signal, c decomposed spectral bases, which are ordered from high to low spectral mean frequency value, d the corresponding temporal bases, e the temporal semi-Gaussians, which were classified as epileptic discharges are indicated by the stars. The vertical dashed lines in (ac) are the epileptic spikes as marked by the epileptologist. The stars in (e) show the detected epileptic spikes by our developed algorithm

Table 2 reports the successful localization of epileptic spikes as true positive (TP) and the false localization as false positive (FP) and the clinically available software based on template matching techniques. As can be seen the developed method successfully located the epileptic spikes with 98% accuracy and had a false localization of 7%, which was a significant improvement to the existing clinical software. The proposed method also offers a plausible improvement over our previously proposed method [24], which had only a TP of 86% and FP of 53%.

Table 2 The true positive (TP) and false positive (FP) of the proposed method and the clinically available software

4 Discussion

The diagnosis of infantile spasms is based on the semiology of the seizure and the EEG background characterized during HYPS. However, given the chaotic appearance of EEG during HYPS the detection of epileptic discharges with multiple foci and varying morphologies associated with ISS in HYPS is challenging. We developed a novel TF feature extraction and classification algorithm to automatically localize the ISS relevant electrical abnormalities known as epileptic discharges (or spikes) from EEG signals in HYPS. We applied our algorithm to a dataset with a duration of 25 min and a total of 233 epileptic discharges as marked by an electrophysiologist. Our major finding was that a high-resolution TF representation based on the MP-TFD can be used to successfully determine the location of ISS relevant spikes. The epileptic spikes tended to be high bursts of energies over a short period of time so in time–frequency, they were appeared as high frequency contents with short durations, which helped us to detect them from the rest of EEG structure. The algorithm successfully located 228 of the epileptic discharges and had only a false positive of 7% (16 non-epileptic discharges). Compared to the previous work in [24] with TP of 86% and FP of 53%, our developed method provides a significant improvement in FP. There are two main reasons for such an improvement: (1) The previous work relies only on spectral features while the present work extracts several temporal features and can provide a better representation of the temporal structure of the epileptic discharges. Since the epileptic discharges are mainly transients with high mean frequency value, the temporal features can successfully differentiate between high spectral energies over a short period of time (i.e., epileptic) vs. the ones that spread over longer period of time (i.e., non-epileptic). (2) Also, the previous method was an unsupervised classifier while the present one is a supervised method. It is known that the supervised methods tend to have a higher classification accuracy vs. unsupervised ones.

We applied available clinical software to the EEG dataset, but the method was only able to detect 9 of the epileptic discharges. The main reason is that clinical software generally uses the temporal characteristics of EEG, which are appropriate for adult EEGs without the presence of HYPS. However, our algorithm is able to locate epileptic discharges even in the presence of the chaotic background of EEG during HYPS. The proposed automatic spike localization algorithm has the potential to address the need to successfully localize the epileptiform discharges of ISS from long-term EEG recordings and provide a quantitative and objective assessment of the relevant electrical abnormality.

5 Conclusions

In this paper, we developed a novel TF feature extraction algorithm for localization of epileptic discharges in HYPS EEGs in children with ISS. The proposed algorithm which was based on MP-TFD and NMF matrix decomposition was combined with SVM classification method to identify the EEG waveforms with spikes from the signal. The MP-TFD provided a high-resolution representation in TF domain, which was able to capture the transient characteristics of epileptic discharges. NMF algorithm was used to adaptively decompose the MP-TFD signal representations to the main spectral and temporal bases for a further feature extraction step. The optimal model order for the number of decompositions in the NMF was determined using a probabilistic automatic relevance determination approach. We proposed five novel TF features from both the decomposed spectral and temporal vectors and used SVM to classify them. The evaluation was performed on a dataset with ISS patients, and the results were compared with respect to an electrophysiologist’s identification of spikes. The results showed an average true positive and false positive percentage of 98 and 7%, respectively, with an average AUC of \(98.15 \pm 0.30\), which was significantly higher than the results from the clinically available software. The proposed algorithm could potentially be used by the clinicians to guide the treatment and avoid the catastrophic long-term outcome of ISS.