Introduction

Nowadays, multi-channel and multivariate neurophysiological signals, such as the electroencephalogram (EEG), electrocorticogram (ECoG), magnetoencephalogram (MEG), and functional magnetic resonance imaging (fMRI), have been widely used in clinical medicine (e.g., Cao et al. 2002, 2003; Cao 2006; Chen et al. 2007). Applying signal processing and statistical tools to biomedical problems has become increasingly popular (Akay 2001); applications include signal detection and extraction, denoising, image enhancement, disease diagnosis, and disease classification.

Brain death, briefly speaking, refers to the complete, irreversible, and permanent loss of all brain and brainstem functions. Brain death implies the termination of a human’s life; correspondingly, the diagnosis of brain death is very important (Ad hoc committee of the Harvard medical school to examine the definition of brain 1968). Although there remain some social disagreements and differing diagnostic criteria in clinical practice around the world (Wijdicks 2002), some standard tests are widely used, such as the apnea test and the brainstem function examination. Notably, it is commonly agreed that EEG might serve as an auxiliary and useful tool in the confirmatory tests, for both adults and children (Wijdicks 1995; Taylor 1997; Schneider 1989). Typically, an isoelectric EEG recording is required for at least 30 min, and the observation may last 3–24 h (Wijdicks 2002); a positive EEG response suggests that the brain is still functioning. Consequently, a patient in deep coma might show some EEG electrical activity, whereas a brain-dead patient will not.

Because EEG recordings are easily accessible and safe, EEG is widely recommended in clinical practice in many countries. However, the downside of EEG is that its significance for evaluating comatose states of the brain is limited by the fact that the outcome is often not determined by the brain condition itself. For instance, the EEG examination of a patient who has received anesthetics or other central nervous system (CNS) depressant drugs might be misleading (Niedermeyer 1991). On the other hand, as criticized by some medical doctors (Pallis and MacGillivray 1980), EEG recordings might also be corrupted by artifacts or various sources of noise interference; therefore, the potential value of EEG has often been underestimated. Despite these criticisms, there is no doubt that a systematic and quantitative study of EEG measurements would be invaluable in neurology and clinical medicine (Buchner and Schuchardt 1990). It is our belief that if the EEG examination is reliable and its analysis results are informative, it can provide a simple and risk-free diagnostic tool in the intensive care unit (ICU) of the hospital.

A brain death diagnosis is often made according to precise criteria following a well-defined procedure. Since the process of brain death determination usually takes a long time and involves certain risks (e.g., removing the breathing machine during the apnea test), a practical yet safe method would be desirable for pre-testing the patient’s brain-state status. In this paper, we present an empirical study of real-life EEG recordings from patients in different comatose states; we are particularly interested in the differences between two groups: deep coma and brain death. When referring to “brain death” here, it may be more cautious and accurate to use the term “quasi-brain-death” or “brain-death syndrome” (at least at the time of the EEG examination), because we are referring to the situation in which the brain death diagnosis was made at an early stage (not the same as the EEG confirmatory test) and was judged independently by two medical doctors or physicians. However, depending on the specific scenario, more tests may need to follow before the final clinical decision is reached (see Fig. 1).

Fig. 1

Left: the portable NeuroScan system and the electrode layout. Right: the experimental protocol procedure

The objective of this paper is to study several statistical methods for EEG analysis, with the aim of using EEG to aid bedside or ambulatory monitoring and diagnosis. Both signal processing (qualitative analysis) and quantitative analysis were applied to EEG recordings collected in the hospital from adult patients. Although the data at this stage are still limited and the analysis remains preliminary, we believe such an empirical study on the data available to date is still valuable. The main motivation of this study is to apply statistical and signal-processing tools to quantitative EEG (qEEG) analysis (especially in this specific medical field), which might reveal findings of interest for medical practice. Hence, the paper is written as more technique-oriented than physiology-oriented. To the best of our knowledge, very few qualitative and quantitative statistical analyses have been conducted in this biomedical field, particularly with EEG recordings (e.g., Lin et al. 2005; Wennervirta et al. 2007; Chen and Cao 2007). This might be partially due to the fact that the topic of qEEG study for brain death diagnosis is still under debate in clinical practice (e.g., Pallis and MacGillivray 1980); in the meantime, EEG data recorded in the real field (such as in the ICU of a hospital) are difficult (if not impossible) for most researchers to access, since different countries may have distinct regulations regarding the access to or use of such confidential data.

A brief background of brain death

Brain death is strictly defined both medically and legally (Ad hoc committee of the Harvard medical school to examine the definition of brain 1968; Taylor 1997) as the irreversible cessation of all brain and brainstem functions. Specifically, the brainstem controls basic functions essential to survival, such as breathing and heart rate. Nowadays, despite differences in clinical practice across countries (Wijdicks 2002), the standard diagnostic procedure depends on three cardinal neurological features: coma, absent brainstem reflexes, and apnea (Ad hoc committee of the Harvard medical school to examine the definition of brain 1968).

Because complete brain death implies the irreversibility of brain function cessation and excludes the possibility of recovery of any cerebral and brainstem functions, the irreversibility of coma was emphasized in the report of the ad hoc committee of the Harvard Medical School (Ad hoc committee of the Harvard medical school to examine the definition of brain 1968). However, the Harvard criterion was presented in a narrative rather than an algorithmic form. Even today, the Harvard criterion of brain death is not fully agreed upon and remains controversial (Niedermeyer 1991). The brainstem is the lower portion of the brain between the cerebrum and the spinal cord, which controls breathing, swallowing, seeing, hearing, and other vital functions. The examination of brainstem functions in clinical practice may be sophisticated and vary in practice (e.g., pupillary response to light, fixed or dilated pupils, corneal reflex, gag reflex, cough reflex, irrigating the ears with cold water, presenting painful stimuli, etc.). The examination of the absence of spinal reflexes also includes tests of ocular movement, facial sensation and facial motor response, and pharyngeal and tracheal reflexes. In clinical practice, many physicians request additional confirmatory tests before announcing brain death. The two most common confirmatory tests are the EEG and the cerebral blood flow (CBF) study. Compared to the CBF test, the EEG test is much simpler and is therefore well recommended in practice (Niedermeyer 1991). In spite of the shortcomings discussed earlier, EEG has still proved invaluable in the evaluation of brain death. Most shortcomings involve technical concerns about artifacts or conceptual misunderstandings such as brainstem death; however, genuine cerebral EEG waves exclude brain death by definition (Niedermeyer 1991). Moreover, the technical problem of artifacts can be alleviated by advanced signal processing methods, which we also address in this paper.

Experimental data and recording protocol

The EEG measurements in the present study were collected at the Shanghai Huashan Hospital, affiliated with Fudan University, Shanghai, China. The EEG data were recorded directly at the bedside of patients in the ICU of the Huashan Hospital, where the level of environmental noise can be fairly high. The EEG recording machine was a portable NeuroScan ESI-64 system (El Paso, TX). Depending on the operator’s need, the NeuroScan device can be supported by either DC or AC power. In the system, a total of nine electrodes were placed on the forehead of the patient lying on the bed, mainly covering the non- or least hairy area of the scalp. Specifically, six channels were placed at Fp1, Fp2, F3, F4, F7, and F8, according to the standard 10/20 system; two electrodes connecting the two ears were used as the reference, namely (A1+A2)/2; the additional channel, GND, served as the ground (see Fig. 1 for an illustration). The sampling rate of the EEG was 1,000 Hz, and the electrode resistances were kept under 8,000 ohm. During the clinical measurements, no gel or other conductive paste was used in any session of EEG recording.

A total of 35 adult patients were examined using EEG from June 2004 to March 2006, with ages ranging from 17 to 85 years. Because the health conditions of the patients varied, we obtained different kinds of EEG recordings for different patients. In particular, four categories are distinguished here:

  • The subject was recorded only in one session.

  • The same subject was recorded in several sessions within the same day.

  • The same subject was recorded in several sessions on different days without a status change.

  • The same subject was recorded in several sessions on different days with a status change (e.g., from coma to brain death, or from coma to awake recovery).

Notably, all subjects appeared to be in deep coma (some were prejudged as brain dead by two physicians) before the EEG recordings. Patients were all lying in bed with eyes closed during the measurements. Correspondingly, no ocular or muscle artifacts were observed. On some occasions, the heartbeat rhythm could be observed in specific patients. In the current experimental recordings, only DC power was used, so the interference from power-line noise is somewhat minimized (compared to AC power).

In China, there were still no legal regulations or instructions regarding brain death diagnosis at the time of data collection. In our case, the medical classification between coma and (quasi) brain death was determined independently by two experienced physicians or medical doctors based on continuous observation and several typical tests (e.g., pupil light response, brainstem reflex tests). The EEG recordings were supervised by one physician (a neurologist) and operated by either a medical doctor or medical staff.

The experimental protocol was approved by the local ethics committee of the hospital, and all recorded data were used with the permission of the patients’ families. Although we recorded 47 sessions of EEG measurements from 35 patients in total, not all recordings were of equally good quality (some were extremely noisy, some measurements had poor fidelity due to technical reasons, and in some cases recordings from a few channels were missing). The EEG data used in the present study were carefully scrutinized, and only 32 patients were included; the statistics of the selected patients are summarized in Table 1.

Table 1 The summarized list of patients under study (C denotes coma; D denotes brain death; F denotes female; M denotes male; N/A denotes not available; δ band: 1–4 Hz; θ band: 4–8 Hz; α band: 8–12 Hz; β band: 13–30 Hz)

Signal processing: independent component analysis and spectrum analysis

Upon obtaining the raw EEG measurements, no specific preprocessing was applied prior to the qualitative evaluation, for two reasons. First, because the noise is usually broadband, it is difficult to apply any standard filtering technique (except for the notch filter). Second, because the recording conditions differed across subjects, it was our intention to investigate the robustness of the proposed statistical methods regardless of the level of environmental noise. Hence, all the signal processing tools employed here were applied to the raw recordings (but with relatively “clean” EEG traces selected according to human visual inspection).

Independent component analysis

Independent component analysis (ICA) is a powerful signal processing tool for blindly separating mutually independent sources (Cichocki and Amari 2002). Various ICA methods have been widely used in biomedical fields for data analysis, such as for EEG, MEG, or fMRI (e.g., Cao et al. 2002; Makeig et al. 2002; Calhoun et al. 2002; Pockett et al. 2007). Without going into much detail, we briefly describe a robust ICA method developed in Cao et al. (2003), which first applies a robust prewhitening procedure and then uses a parameterized t-distribution density model to separate a mixture of sub-Gaussian and super-Gaussian signals.

The observed multi-channel signals are assumed to be generated by a probabilistic generative model

$$ \mathbf{x}(t)= \mathbf{A}\mathbf{s}(t)+\boldsymbol{\epsilon}(t), $$
(1)

where t denotes the discrete-time index; the vector \(\mathbf{x}(t)=[x_1(t),\ldots,x_m(t)]^T\in\mathbb{R}^m\) denotes the observed multi-channel signals measured at the electrodes at time t; \(\mathbf{s}(t)=[s_1(t),\ldots,s_n(t)]^T\in\mathbb{R}^n\) denotes a set of independent, hidden “source” components of interest, all assumed to have zero mean and unit variance; and \(\boldsymbol{\epsilon}(t)\in\mathbb{R}^m\) denotes the additive uncorrelated white noise that corrupts the measurements, which is also assumed to have zero mean. The mixing matrix \(\mathbf{A}=\{a_{ij}\}\in\mathbb{R}^{m\times n}\) can be thought of as modeling the mixing or scattering effect between the sources and the sensors (electrodes) on the scalp. In this paper, we assume m = n = 6 for simplicity.

Let ℓ denote the number of data samples in time; then Eq. (1) can be written in matrix form:

$$ \mathbf{X}_{(m\times \ell)}=\mathbf{A}_{(m\times n)}\mathbf{S}_{(n\times \ell)}+\boldsymbol{\Xi}_{(m\times \ell)}. $$
(2)

Provided \(\mathbb{E}[\mathbf{s}(t)\mathbf{s}^T(t)]=\mathbf{I}\) and the sample size \(\ell\) is sufficiently large, the covariance matrix of \(\mathbf{x}(t)\) can be estimated by

$$ \mathbf{C}=\mathbf{A}\mathbf{A}^T+\boldsymbol{\Psi}, $$
(3)

where \(\mathbf{C}=\mathbf{X}\mathbf{X}^T/\ell\) and \(\boldsymbol{\Psi}=\boldsymbol{\Xi}\boldsymbol{\Xi}^T/\ell\) is a diagonal matrix. For convenience, we assume that \(\mathbf{X}\) has been divided by \(\sqrt{\ell}\) so that the covariance matrix is given by \(\mathbf{C}=\mathbf{X}\mathbf{X}^T\).

When \(\boldsymbol{\Psi}\) cannot be ignored in the model, we may employ the following cost function for optimization:

$$ L(\mathbf{A},\boldsymbol{\Psi})=\mathrm{tr}\left([\mathbf{A}\mathbf{A}^T-(\mathbf{C}-\boldsymbol{\Psi})] [\mathbf{A}\mathbf{A}^T-(\mathbf{C}-\boldsymbol{\Psi})]^T\right), $$

where \(\mathrm{tr}(\cdot)\) denotes the trace operator. In order to minimize the above cost function, we use the following iterative estimation:

$$ \hat{\mathbf{A}}=\mathbf{U}_n\boldsymbol{\Sigma}_n^{1/2}, $$
(4)
$$ \hat{\boldsymbol{\Psi}}=\mathrm{diag}\left(\mathbf{C}-\hat{\mathbf{A}} \hat{\mathbf{A}}^T \right), $$
(5)

where \(\boldsymbol{\Sigma}_n\) is a diagonal matrix whose elements are the n largest eigenvalues obtained from diagonalizing the matrix \(\mathbf{C}\): \(\mathbf{C}=\mathbf{U}\boldsymbol{\Sigma}\mathbf{U}^T\), and the columns of the matrix \(\mathbf{U}_n\) are the corresponding eigenvectors.

Upon iterative optimization and convergence to stable solutions \(\hat{\mathbf{A}}\) and \(\hat{\boldsymbol{\Psi}}\), we first estimate the prewhitening matrix, denoted by \(\mathbf{Q}\), as follows:

$$ \mathbf{Q} = \left(\hat{\mathbf{A}}^T\hat{\boldsymbol{\Psi}}^{-1}\hat{\mathbf{A}}\right)^{-1} \hat{\mathbf{A}}^T\hat{\boldsymbol{\Psi}}^{-1}, $$
(6)

from which the whitened signal is given by z = Qx.

Next, we aim to find a demixing matrix, denoted by W, such that y = Wz will recover the independent source signals s (subject to scale and permutation ambiguities). Specifically, the following iterative learning rule is employed to estimate the demixing matrix:

$$ \Delta\mathbf{W}(t) = \eta[\mathbf{I}-\psi(\mathbf{y}(t)) \mathbf{y}^T(t)]\mathbf{W}(t), $$
(7)

where η is a small positive learning-rate parameter, and ψ(·) denotes the score function, which can be derived from a t-distribution probability density model (see Cao et al. 2003; Cichocki and Amari 2002 for mathematical details).
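
To make the prewhitening-plus-separation procedure concrete, the following Python/NumPy sketch implements the two stages described above under a few stated assumptions: the eigendecomposition in Eq. (4) is applied to C − Ψ̂ at each iteration (which minimizes the stated cost for fixed Ψ̂), Eq. (7) is applied in batch form by averaging over samples, and a generic tanh score function stands in for the parameterized t-distribution score of Cao et al. (2003). The function names are ours.

```python
import numpy as np

def robust_prewhitening(X, n_sources, n_iter=100):
    """Estimate the mixing matrix A and the diagonal noise covariance Psi by
    iteratively minimizing ||A A^T - (C - Psi)||_F^2 (Eqs. 4-5), then return
    the prewhitening matrix Q of Eq. (6) and the whitened signals Z = Q X.
    X has shape (m_channels, n_samples)."""
    m, ell = X.shape
    X = X - X.mean(axis=1, keepdims=True)
    C = X @ X.T / ell                       # sample covariance matrix
    psi = np.zeros(m)                       # diagonal noise variances
    for _ in range(n_iter):
        # eigendecomposition of the noise-adjusted covariance C - Psi
        w, U = np.linalg.eigh(C - np.diag(psi))
        idx = np.argsort(w)[::-1][:n_sources]
        A = U[:, idx] * np.sqrt(np.clip(w[idx], 1e-12, None))
        psi = np.clip(np.diag(C - A @ A.T), 1e-6, None)
    psi_inv = np.diag(1.0 / psi)
    Q = np.linalg.inv(A.T @ psi_inv @ A) @ A.T @ psi_inv
    return Q, Q @ X

def natural_gradient_ica(Z, eta=0.01, n_epochs=200):
    """Estimate a demixing matrix W so that Y = W Z has independent rows,
    using the batch form of Eq. (7); tanh is used here as a generic score
    function in place of the parameterized t-distribution score."""
    n, ell = Z.shape
    W = np.eye(n)
    for _ in range(n_epochs):
        Y = W @ Z
        W += eta * (np.eye(n) - np.tanh(Y) @ Y.T / ell) @ W
    return W

# Usage on a (6 x n_samples) array of raw EEG:
# Q, Z = robust_prewhitening(X, n_sources=6)
# W = natural_gradient_ica(Z)
# components = W @ Z
```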

Fourier analysis and time-frequency analysis

After separating the independent components, we conduct standard Fourier spectrum analysis to estimate the power spectra of the individual independent components (see Fig. 2 for an illustration). Notably, the amplitudes of the separated components as well as their power spectra carry no physically meaningful units, since the outputs of ICA all have a scaling indeterminacy. From the power spectra, we can empirically determine or evaluate whether the components contain EEG brain waves.

Fig. 2

The raw EEG traces (5 s) and the estimated 6 independent components as well as their corresponding power spectra. The arrows indicate the typical alpha waves of the extracted components. Note that here the amplitudes of the signals and power spectra have arbitrary units

For a closer examination, we also resort to time-frequency analysis tools, such as the Wigner-Ville distribution (WVD) (Cohen 1995), to visualize the ongoing temporal signals in the time-frequency plane. Compared to the one-dimensional power spectrum, the two-dimensional time-frequency map may more clearly reveal the time-varying spectral information of the specific signal of interest. See Fig. 3 for an illustrative example.
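
As a concrete example of this spectral evaluation step, the short sketch below computes the Welch power spectrum of an extracted component and a simple time-frequency map. Since the Wigner-Ville distribution is not available in SciPy, a short-time Fourier spectrogram is used here as a stand-in, and the window lengths are illustrative choices rather than the settings used in the paper.

```python
import numpy as np
from scipy import signal

FS = 1000  # sampling rate of the recordings (Hz)

def power_spectrum(x, fs=FS):
    """Welch power spectral density of one extracted component."""
    return signal.welch(x, fs=fs, nperseg=1024)

def time_frequency_map(x, fs=FS):
    """Short-time Fourier spectrogram of the component: frequencies, times,
    and the power in each time-frequency bin."""
    return signal.spectrogram(x, fs=fs, nperseg=256, noverlap=192)
```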

Fig. 3

Two traces of extracted alpha waves and the associated time-frequency map (calculated from WVD)

Evaluation

Since EEG measures the “smeared” ensemble activity of synchronous firings from millions of neurons in the brain, it indicates the specific activity in which the brain is engaged and also provides some hints about consciousness or comatose status. Provided the brain is not completely dead, it is highly likely that some brain waves can be extracted from the “EEG” measurements. In our experimental analysis, we are particularly interested in the upper theta (6–8 Hz) and alpha (8–12 Hz) waves. This is mainly because, first, the theta waves are strong during internal focus, meditation, and spiritual awareness; they relate to a subconscious status reflecting the state between wakefulness and sleep, while the alpha waves are responsible for mental coordination and self-controlled relaxation and are believed to bridge the conscious to the subconscious state (Niedermeyer 1991). Second, the low-frequency brain waves (such as the delta waves, 0.1–4 Hz), which typically occur in deep sleep and in some abnormal processes (e.g., experiences of an “empathy state”), are classed as “slow” activity, but the delta rhythm is difficult to distinguish from other slow non-EEG signal sources. Finally, it is very rare to observe high-frequency brain waves, such as the beta (12–30 Hz) or gamma (>30 Hz) bands, considering that they are more relevant to high-level cognitive tasks, which seemed nearly impossible for all comatose patients. In our experiments, we were able to extract some brain waves (evaluated by Fourier and time-frequency analysis) for the patients in deep coma; in contrast, the signal spectra from the brain death group appeared to be white. The results are summarized in Table 1.

After loading specific raw EEG recordings (within a temporal window of 5 s duration), blind separation was achieved by the robust ICA algorithm described above, which has been demonstrated to perform quite well for both simulated and real-life MEG signals (Cao et al. 2003) and proved to be rather robust to noise interference compared to other ICA algorithms in the literature (e.g., Cichocki and Amari 2002), such as the FastICA algorithm or the JADE algorithm. However, a detailed comparison of our algorithm with other ICA algorithms is beyond the focus of the current paper. It should be noted that, given the long-duration EEG recordings, it is not the case that brain (alpha or theta) waves can always be observed or extracted. In our experimental procedure, we first conducted a human visual inspection of the raw EEG traces and then conducted a moving-window-based BSS procedure, followed by Fourier analysis and time-frequency analysis.

For each coma patient, we selected some representative segments of EEG measurements (5 s long), all of which contained EEG theta waves but might or might not contain alpha waves. These data were further used for the later quantitative analysis and comparison with the quasi-brain-death patients.

Quantitative analysis

After the preliminary clinical diagnosis and the signal processing analysis of the EEG recordings, the patients can be categorized into two groups: the deep coma group and the brain death group. To evaluate the quantitative differences between the two patient groups, qEEG analysis was further employed. The goal of the quantitative analysis is to discover informative features of the EEG signals that are useful in discriminating between these two groups (deep coma vs. brain death) and to further evaluate their statistical significance.

Relative power ratio

First, we compute a simple statistic based on standard Fourier analysis. Specifically, we define the relative power ratio (RPR) as follows:

$$ \hbox{RPR}=\frac{\theta+\alpha+\beta \quad (4\hbox{--}30\,\hbox{Hz})} {\hbox{total power} (1\hbox{--}30\,\hbox{Hz})}, $$

where θ, α, and β denote the spectral power in the theta, alpha, and beta bands, respectively. Here the relative power (ratio) is preferred to the absolute power of a single spectral band because the latter directly depends on the signal amplitude and is therefore also dependent on the scaling of the signal after signal processing (such as filtering or ICA). The reason we exclude the low-frequency component (1–4 Hz) from the numerator is that there always exist non-EEG slow waves in the recorded signals (including white noise), which are more difficult to distinguish based merely on the power spectrum. For each subject, we computed the RPR values from all 6 channels and reported only the maximum value (the reason for this is to emphasize the contribution from brain wave components: the presence of any brain wave rhythm makes the ratio high). A comparison was further made between the subjects from the two groups. It was our intention to investigate whether this simple relative power statistic can reveal any statistical difference consistent with the qualitative observations. We applied one-way ANOVA (analysis of variance) as well as the Mann–Whitney test (also known as the Wilcoxon rank-sum test) to the RPR statistics of the two groups. The ANOVA is a parametric test (assuming that both group samples are Gaussian distributed) that compares the means of the two groups and returns the P-value for the null hypothesis that the two groups have equal means, whereas the Mann–Whitney test is nonparametric (assuming that the two group samples have similarly shaped distributions) and tests the null hypothesis that the two groups have equal medians. In our experiments, statistical significance was found with both tests on the selected EEG data, and the null hypotheses were rejected (i.e., H = 1). The quantitative results are summarized in Table 2.
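
A minimal sketch of the RPR computation and the two statistical tests follows, assuming the band powers are obtained by integrating a Welch periodogram and that `rpr_coma` and `rpr_bd` hold the per-subject maximum RPR values of the two groups (both names are hypothetical).

```python
import numpy as np
from scipy import signal, stats

def relative_power_ratio(x, fs=1000):
    """RPR = (power in 4-30 Hz) / (total power in 1-30 Hz), estimated by
    integrating the Welch periodogram over the two bands."""
    f, pxx = signal.welch(x, fs=fs, nperseg=2048)
    def band_power(lo, hi):
        mask = (f >= lo) & (f < hi)
        return np.trapz(pxx[mask], f[mask])
    return band_power(4, 30) / band_power(1, 30)

# rpr_coma, rpr_bd: 1-D arrays of per-subject maximum RPR values (hypothetical)
# F, p_anova = stats.f_oneway(rpr_coma, rpr_bd)       # one-way ANOVA
# U, p_mw = stats.mannwhitneyu(rpr_coma, rpr_bd)      # Mann-Whitney test
```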

Table 2 Results of statistical tests on the maximum value of relative power ratio (RPR) for two groups: coma versus brain death

Quantitative complexity measures

The nonlinear dynamics of the brain can, to some extent, be characterized by its outputs, where the EEG measurements may be treated as random time series observed from a complex system (i.e., the functioning brain). Presumably, a functioning brain and a dead brain would exhibit different behavior in terms of their outputs and therefore have different degrees of “complexity”. To characterize the stochastic nature of the system, many stochastic complexity measures have been proposed or developed in the literature for analyzing neurophysiological signals (e.g., Gonzalez Andino et al. 2000; Gu et al. 2003; Hornero et al. 2006; Goldberger et al. 2002; Lin et al. 2005; Papadelis et al. 2007; Wennervirta et al. 2007). In our qEEG analysis, four types of quantitative measures are investigated:

  1. The approximate entropy (ApEn) (Pincus 1991), which is a quantity that measures the regularity or predictability of a random signal or time series.

  2. The time delay-embedded normalized singular spectrum entropy (NSSE) (Roberts et al. 1998), which is a complexity measure arising from calculating the singular spectrum of a delay-embedded time series.

  3. The C 0 complexity (Chen et al. 2000; Gu et al. 2003), which is a complexity measure based on spectrum analysis.

  4. The α-exponent based on detrended fluctuation analysis (DFA) (Peng et al. 1994; Little et al. 2006), which estimates the fractal scaling exponent. Notably, α = 1 indicates 1/f noise and long-range correlation; α = 0.5 indicates white noise; and α = 1.5 indicates Brownian noise (a minimal estimation sketch follows this list).
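
For illustration, the following sketch estimates the DFA α-exponent in the standard way: integrate the mean-removed signal, detrend it linearly in boxes of increasing size, and fit the log-log slope of the fluctuation curve. The box sizes are illustrative defaults, not the setup of Little et al. (2006).

```python
import numpy as np

def dfa_alpha(x, scales=None):
    """Detrended fluctuation analysis: returns the scaling exponent alpha
    (0.5 ~ white noise, 1.0 ~ 1/f noise, 1.5 ~ Brownian noise)."""
    x = np.asarray(x, dtype=float)
    y = np.cumsum(x - x.mean())                     # integrated profile
    if scales is None:
        scales = np.unique(np.logspace(
            np.log10(16), np.log10(len(x) // 4), 20).astype(int))
    flucts = []
    for n in scales:
        n_seg = len(y) // n
        segs = y[:n_seg * n].reshape(n_seg, n)
        t = np.arange(n)
        rms = []
        for seg in segs:
            coef = np.polyfit(t, seg, 1)            # linear detrending per box
            rms.append(np.sqrt(np.mean((seg - np.polyval(coef, t)) ** 2)))
        flucts.append(np.mean(rms))
    alpha, _ = np.polyfit(np.log(scales), np.log(flucts), 1)
    return alpha
```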

All of these complexity indices provide a quantitative metric for the consciousness status of the brain. It is worth pointing out several properties of these quantitative measures:

  • All four measures are strictly invariant to the scaling of the signal (hence independent of the signal’s power).

  • The NSSE and C 0 complexity are both nonnegative and bounded by 1.

  • The fractal exponent may characterize the long-range correlation behavior of a random signal. The α-exponent obtained from the DFA method should in principle be consistent with the β-exponent obtained from power spectrum analysis (Kaspar and Schuster 1987), but the DFA method has been claimed to be more accurate (Goldberger et al. 2002).

Specifically, the parameter setup and calculation of the above complexity measures in our experiments are as follows:

  • For ApEn, we chose m = 2 and r = 0.25 (consistent with the notations in Akay (2001)) throughout the experiments.

  • For NSSE, we chose m = 10 (that corresponds to 10 ms for 1,000 Hz sampling frequency) as the time delay parameter.

  • For ApEn, NSSE, and C 0 complexity, we evenly divided the EEG signals into several (say, 10) segments, computed the corresponding quantitative value in each segment, and then averaged these values to obtain the mean statistic as the final outcome (a minimal sketch of this segment-averaging procedure follows this list).

  • For α-exponent, we used the recommended setup from Little et al. (2006).
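
The sketch below illustrates the ApEn calculation and the segment-averaging procedure, assuming the tolerance r = 0.25 is interpreted as a fraction of the signal's standard deviation (a common convention, though not stated explicitly above); the helper names are ours.

```python
import numpy as np

def approximate_entropy(x, m=2, r_factor=0.25):
    """ApEn(m, r) with tolerance r = r_factor * std(x)."""
    x = np.asarray(x, dtype=float)
    r = r_factor * x.std()
    def phi(m):
        n = len(x) - m + 1
        templates = np.array([x[i:i + m] for i in range(n)])
        counts = np.array([
            np.sum(np.max(np.abs(templates - tpl), axis=1) <= r)
            for tpl in templates])                  # Chebyshev-distance matches
        return np.mean(np.log(counts / n))
    return phi(m) - phi(m + 1)

def segment_mean(x, measure, n_segments=10, **kwargs):
    """Split the signal into equal segments, apply the measure to each one,
    and return the mean value, as described in the list above."""
    segments = np.array_split(np.asarray(x, dtype=float), n_segments)
    return float(np.mean([measure(s, **kwargs) for s in segments]))

# apen = segment_mean(eeg_channel, approximate_entropy, m=2, r_factor=0.25)
```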

Upon obtaining the quantitative results from the four complexity measures, statistical tests were further applied to evaluate their statistical significance. Specifically, the Mann–Whitney test was applied to these quantitative measures of the two groups (coma vs. brain death) for each electrode channel. For most subjects, 6 recorded channels were available for analysis. For a few subjects, only 4 channels were recorded because of technical problems during the measurements; in such cases, these subjects were excluded when we analyzed the corresponding channels.

We applied qEEG analysis (followed by statistical tests) to both the raw EEG signals and their bandpass-filtered version (between 0.5 and 100 Hz). The bandpass filtering operation was aimed at reducing the effect of potential low-frequency artifacts (slow waves <1 Hz, such as myoclonic jerks (Niedermeyer 1991)) and high-frequency non-EEG noise. For the raw EEG data, the box plot results are shown in Fig. 4. In each box plot, the box has three lines at the lower quartile (25th percentile), upper quartile (75th percentile), and median (middle line) values. The distance between the 25th and 75th percentiles is the interquartile range (IQR). The whiskers are lines extending from each end of the box; the lower (or upper) whisker lies 1.5 IQR below (or above) the 25th (or 75th) percentile. Outliers (labeled with the marker ‘+’) are data samples with values beyond the range of the whiskers. In addition, the overall quantitative results are summarized in Table 3. As seen from the table, statistical tests show significant differences in all complexity measures and all channels for the raw EEG data. For the filtered EEG data, significant differences between the two groups are still found in all or the majority of channels for all complexity measures.
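
A minimal example of the bandpass-filtering step, implemented here as a zero-phase Butterworth filter; the filter type and order are illustrative choices, since they are not specified above.

```python
from scipy import signal

def bandpass_filter(x, fs=1000, low=0.5, high=100.0, order=4):
    """Zero-phase Butterworth bandpass filter between 0.5 and 100 Hz."""
    sos = signal.butter(order, [low, high], btype='bandpass', fs=fs, output='sos')
    return signal.sosfiltfilt(sos, x)
```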

Fig. 4

Box plot of four quantitative statistics (for 6 channels) between the coma and brain death groups

Table 3 Summary of quantitative statistics applied to the raw and filtered EEG data for two groups: coma (C) versus brain death (D). For the P-value column, * means P < 0.05 and ** means P < 0.01, and other numerical values show non-significance from the Mann–Whitney test

Interpretation and visualization

The recorded EEG signals can be viewed as the multivariate time series observed from a dynamical system (i.e., the human brain). In order to analyze the characteristics of the dynamical system, we may use various statistical measures (such as the entropy, fractal dimension, etc.) to quantify the complexity or regularity of the time series. In time series analysis, this is a rather well-studied research field (Akay 2001).

One important aspect regarding the regularity of a time series is the so-called self-similarity. Natural objects or real-life physical signals often have such a “fractal-like” feature. Typically, the self-similarity is accompanied by a long-range correlation behavior: \(C(\tau) \sim \tau^{-\gamma}\). Because the power spectral density is simply the Fourier transform of the autocorrelation function, we have \(S(f) \sim 1/f^{\beta}\) (where β = 1 − γ). An illustration of the long-range correlation and log-log power spectrum of a segment of EEG signal is presented in Fig. 5. Notably, the scaling α-exponent is also related to the slope parameter β (Kaspar and Schuster 1987), and both can be used to calculate the fractal dimension of a self-similar signal.
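
As an illustration of the 1/f^β relationship, the sketch below estimates β from the slope of a log-log fit to the Welch power spectrum; the fitting band is an illustrative choice.

```python
import numpy as np
from scipy import signal

def spectral_slope(x, fs=1000, fmin=1.0, fmax=30.0):
    """Estimate beta in S(f) ~ 1/f^beta from a log-log linear fit of the
    Welch power spectral density over [fmin, fmax] Hz."""
    f, pxx = signal.welch(x, fs=fs, nperseg=4096)
    mask = (f >= fmin) & (f <= fmax)
    slope, _ = np.polyfit(np.log(f[mask]), np.log(pxx[mask]), 1)
    return -slope
```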

Fig. 5

Left panel: an illustration of self-similarity of one-channel raw EEG signal (20, 5, and 1 s) from a coma patient. Right panel: the autocorrelation function and log-log power spectrum calculated from the 20 s EEG signal

Another important aspect regarding the regularity or complexity of a random signal is the entropy, provided the time series is viewed as a random process. The more complex (or less regular) a random signal is, the greater its entropy. Although many entropy and complexity measures have been proposed in the literature, we mainly focus on three measures (ApEn, NSSE, and C 0 complexity) in this paper. In light of the results of our qEEG analysis, it is worth commenting on several observations regarding these statistical measures:

  • The complexity of a time series, as measured by ApEn and C 0 complexity, is lower in the coma group than in the brain death group. It should be emphasized that, for the brain death group, the signals we actually analyzed are not human EEG signals (otherwise the patient would not be called brain dead), but rather some non-EEG activities (either background noise or artifacts).

  • The NSSE can be viewed as a stochastic complexity measure. Generally, if the eigen-spectrum (or singular spectrum) of a time delay-embedded signal is flat (as for white noise), then the signal is expected to have a greater entropy value. On the contrary, if the time-embedded signals are highly correlated, then a lower entropy value is obtained from the non-flat singular spectrum. As observed, the coma group has a lower NSSE value (both median and mean) than the brain death group.

It seems from our study that the entropy measures are quite robust in distinguishing between the coma and brain death patients, which is also in agreement with the findings reported in Wennervirta et al. (2007).

Next, we seek a statistical tool for feature extraction and dimensionality reduction, which further leads to data visualization in a lower-dimensional space. The most popular method for dimensionality reduction is principal component analysis (PCA), which attempts to find the projection directions that have the maximum variance. However, the standard PCA method is limited by its linear model assumption. If the features are nonlinearly correlated, PCA will fail to reveal the inherent structure of the data. In this case, we may resort to nonlinear statistical methods. The two statistical tools we employ here for visualization are linear PCA and kernel PCA (KPCA) (Schölkopf et al. 1998). The KPCA method can be viewed as a nonlinear generalization of linear PCA. By virtue of the so-called “kernel trick”, the linear PCA method can be extended to kernel-based nonlinear dimensionality-reduction or feature-extraction methods. This is done by projecting the data into a high- or even infinite-dimensional feature space, where the inner product of the feature space is induced by a positive definite kernel (Schölkopf and Smola 2002).

We arranged all quantitative features of the first 5 channels (of all subjects) into an augmented (feature-by-subject) matrix and then conducted the PCA analysis. The dimensionality reduction results are illustrated in Fig. 6. As seen, the two classes (coma vs. brain death) are quite clearly separated, except for a few (about 5) subjects. It is also observed that KPCA did not bring an additional discrimination advantage compared to linear PCA (their results are quite similar), indicating that the correlations between the extracted features are roughly linear.
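
A minimal sketch of the two projections used for visualization, written with scikit-learn's PCA and KernelPCA and the third-order polynomial kernel K(x_i, x_j) = (1 + x_i^T x_j)^3 shown in Fig. 6; `features` denotes a (subject × feature) matrix of the complexity measures and is a hypothetical name.

```python
from sklearn.decomposition import PCA, KernelPCA

def project_2d(features):
    """Project the (subject x feature) matrix onto two components with
    linear PCA and with kernel PCA using the third-order polynomial kernel
    K(x_i, x_j) = (1 + x_i^T x_j)^3."""
    linear = PCA(n_components=2).fit_transform(features)
    kpca = KernelPCA(n_components=2, kernel='poly', degree=3,
                     gamma=1.0, coef0=1.0)
    nonlinear = kpca.fit_transform(features)
    return linear, nonlinear
```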

Fig. 6

Visualization of the first two dominant projected features in a two-dimensional space (circle: coma group; x-mark: brain death group). (a) Linear PCA. (b) kernel PCA with a third-order polynomial kernel \(K({\mathbf{x}}_i,{\mathbf{x}}_j)=(1+{\mathbf{x}}_i^T{\mathbf{x}}_j)^3\)

Classification

Upon computing the four complexity measures for the EEG signals of each channel, we obtained 6 × 4 = 24 features in total for each subject. To further extract uncorrelated features, we used linear PCA for dimensionality reduction. With the features at hand for the two groups, we then fed them into a linear or nonlinear binary classifier, such as Fisher linear discriminant analysis (LDA) or the support vector machine (SVM) (Schölkopf and Smola 2002). Specifically, the SVM is a nonlinear classifier that is known to have good generalization ability by maximizing the margin.

Let \(\ell\) denote the total number of samples; we further define three performance indices:

$$ \hbox{MIS} =\frac{\hbox{FP}\,+\,\hbox{FN}}{\hbox{Total}}, \quad\hbox{SEN} =\frac{\hbox{TP}}{\hbox{TP}\,+\,\hbox{FN}},\quad \hbox{SPE} =1-\frac{\hbox{FP}}{\hbox{FP}\,+\,\hbox{TN}} $$

where the above nomenclature follows: false positive (FP, type I error), false negative (FN, type II error), true positive (TP), true negative (TN), sensitivity (SEN), and specificity (SPE). In addition, it is informative to compute the receiver operating characteristic (ROC) curve, which is a graphical illustration of the relation between the specificity (1 − SPE on the abscissa) and the sensitivity (SEN on the ordinate) of the binary classifier. The area under the ROC curve (AUROC) reveals the overall accuracy of the classifier (with a value of 1 indicating perfect performance and 0.5 indicating random guessing). In our experiments, since the available data set is rather small, thus far we have only tested the classifier’s accuracy using a leave-one-out cross-validation procedure (i.e., using \(\ell-1\) samples for training and the remaining sample for testing, and repeating the procedure over the whole data set). The average leave-one-out misclassification (MIS) rate was 9.2% for the SVM and 11.3% for LDA. For the SVM, we used a Gaussian kernel function with a kernel width of 0.1 (chosen by cross-validation). The optimal AUROC value we obtained was 0.852 with the nonlinear SVM classifier; see Fig. 7 for an illustration of the ROC curves. The results are summarized in Table 4. In addition, we also compared the classification performance using the raw features without PCA dimensionality reduction: the MIS results from the SVM were similar, while the performance of LDA was slightly worse than that with PCA feature reduction. This is probably because LDA is a linear classifier whereas the SVM is a nonlinear classifier, and the latter is less sensitive to the number of linearly correlated features.
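
A sketch of the leave-one-out evaluation of the two classifiers using scikit-learn. Note that the mapping between the Gaussian kernel width quoted above and scikit-learn's `gamma` parameter is not one-to-one, so `gamma=0.1` below is only an illustrative stand-in, and the function name is ours.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import LeaveOneOut
from sklearn.metrics import roc_auc_score

def leave_one_out_eval(features, labels):
    """Leave-one-out cross-validation of an RBF-kernel SVM and of LDA.
    Returns the two misclassification rates and the SVM AUROC."""
    features = np.asarray(features)
    labels = np.asarray(labels)                     # 0 = coma, 1 = brain death
    svm_pred, svm_score, lda_pred = [], [], []
    for train, test in LeaveOneOut().split(features):
        svm = SVC(kernel='rbf', gamma=0.1, probability=True)
        svm.fit(features[train], labels[train])
        svm_pred.append(svm.predict(features[test])[0])
        svm_score.append(svm.predict_proba(features[test])[0, 1])
        lda = LinearDiscriminantAnalysis().fit(features[train], labels[train])
        lda_pred.append(lda.predict(features[test])[0])
    mis_svm = np.mean(np.array(svm_pred) != labels)
    mis_lda = np.mean(np.array(lda_pred) != labels)
    auroc = roc_auc_score(labels, svm_score)
    return mis_svm, mis_lda, auroc
```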

Fig. 7

Comparison of the ROC curves of two classifiers

Table 4 Summary of classification results on the misclassification (MIS) and AUROC indices. The MIS performance is based on the leave-one-out cross-validation procedure

Subject-wise case study

In this section, we focus on individual subjects and investigate two interesting clinical cases. These two cases represent two different changes in the consciousness state of the brain: one from deep coma to awake recovery, and the other from deep coma to brain death. For these two subjects, we have relatively more recording sessions, which also provides more opportunity for an in-depth analysis.

From deep coma to awake recovery

The first subject (corresponding to patient C1 in Table 1) is an 18-year-old male patient (SJ) with a primary cerebral disease, who was admitted to the hospital on May 20, 2004 and later diagnosed with virulent meningitis. After one month of hospitalization, on June 11, 2004, the patient lost consciousness and remained in a deep coma state. On the examination day, his pupils were dilated, and a respiratory machine was used. The patient was completely unresponsive to external visual, auditory, and tactile stimuli, and was incapable of any communication. Although the patient’s symptoms were very similar to a brain-death case, EEG analysis indicated that the patient still had physiological brain activity. In fact, after that day the patient regained consciousness little by little. On August 31, 2004, the patient was able to respond to simple questions, and was later released from the hospital.

The EEG recordings available for this subject include three sessions (measured at different times on June 11), each about 5 min long. In order to compute the time evolution of the quantitative measures, we applied a moving overlapping window (with 10 s duration and half-window overlap) to each channel’s recording. Furthermore, in order to reduce the effect of non-EEG artifacts, we also filtered the windowed signal within [0.5, 100] Hz before using it in the qEEG analysis. We then analyzed each quantitative measure in the 3 sessions for all 6 channels. The box plot statistics are shown in Fig. 8. As seen from the figure, some quantitative (median) values are rather stable (e.g., channels Fp1, F3, and F7), since these three recording sessions were obtained on the same day.
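
A small helper for the moving overlapping window used here (10 s duration, half-window overlap), sketched in Python; `approximate_entropy` and `filtered_channel` refer to the earlier sketches and are illustrative names.

```python
def sliding_windows(x, fs=1000, win_sec=10, overlap=0.5):
    """Yield successive windows of win_sec seconds with the given fractional
    overlap (0.5 = half-window overlap)."""
    win = int(win_sec * fs)
    step = max(1, int(win * (1 - overlap)))
    for start in range(0, len(x) - win + 1, step):
        yield x[start:start + win]

# Per-window ApEn trace for one bandpass-filtered channel:
# trace = [approximate_entropy(w) for w in sliding_windows(filtered_channel)]
```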

Fig. 8

Box plot statistics of four quantitative measures for 6 channels in 3 recording sessions (subject SJ)

From deep coma to brain death

The second subject (corresponding to patients C8 and D1 in Table 1) is a 17-year-old female patient (ZJ) with viral encephalitis. This subject suffered from difficulty in breathing, and a respiratory machine had been used in the ICU since her admission to the hospital on March 14, 2005. On March 16, 2005, the patient was in a deep coma state with dilated pupils, but was found to have a very weak visual response. On the same day, the EEG was recorded at the patient’s bedside. Three sessions of EEG measurements were recorded, each approximately 5 min long.

On March 22, 2005, the patient was also found to be completely unresponsive to external visual, auditory, and tactile stimuli, and her pupils had lost the light response. On the same day, two physicians made the diagnosis of (quasi) brain death. The EEG examination was then applied to the patient. Three sessions of EEG measurements were recorded, each about 5 min long.

In total, the EEG recordings from the 2 days comprise six sessions, with a total duration of about 30 min. Again, we applied a moving overlapping window (with 10 s duration and half-window overlap) to the recorded signals, followed by bandpass filtering (within [0.5, 100] Hz). For each session, we calculated the quantitative statistics for 6 channels in a total of 60 temporal windows. The comparison of the mean and SEM (standard error of the mean) statistics for the 6 sessions (over the two days) is given in Fig. 9. As shown in the figure, we can clearly observe a “mode shift” between these two days. Specifically, the mean values of ApEn, NSSE, and C 0 complexity increased from March 16, 2005 to March 22, 2005, while the mean value of the α-exponent decreased. This phenomenon is again consistent with the earlier qEEG analysis results (Fig. 4) comparing the coma group and the brain death group. In addition, it was observed that for all four statistical measures, the SEMs of the measurements were greater on March 22 than on March 16. It should be pointed out that although Fig. 9 only presents the averaged statistics over the 6 channels, similar trends are also observed in each individual channel. Due to space limitations, we cannot show the temporal evolution traces of the quantitative measures for all channels here. As a demonstration, Fig. 10 shows the temporal evolution of the four complexity measures for channel F4. Each point in these plots is calculated using a shifted overlapping 10-s window. The purpose of this is to observe the variation within specific sessions and to see whether there is any median shift between the two days. The statistical (Mann–Whitney) tests again show that the median statistics of all complexity measures differ between the two days.

Fig. 9

Comparison of four quantitative measure statistics (mean ± SEM) between two days (March 16 and March 22, 2005) based on six recording sessions (subject ZJ). Each statistic value is calculated from 6 channels and 60 overlapping 10 s-duration windows

Fig. 10

The temporal evolution of the four complexity estimates for channel F4 (subject ZJ). Each point corresponds to the median statistic calculated from a moving non-overlapping temporal window (with 10 s duration). In each subplot, two lines (one for March 16 and the other for March 22) represent the median values from 180 data points (from 3 sessions of each day). The P-value (of Mann–Whitney test) for rejecting the null hypothesis that the medians of the data points from two days are equal is also shown in the title of each subplot. H = 1 indicates the null hypothesis can be rejected at the 5% level

Discussion

Robustness of the quantitative measures

In this paper, we have proposed some signal processing methods and several complexity measures for qEEG analysis. One key factor in qEEG analysis is the robustness of these statistical measures. Most importantly, the quantitative measures should be somewhat robust to the presence of non-EEG sources (such as noise, artifacts, or power interference). In addition, the EEG signal is known to be highly non-stationary, and therefore the values obtained from the complexity measures vary quickly over time (e.g., see Fig. 10). As observed in our empirical qEEG analysis, the proposed complexity measures are somewhat robust to noise, and they are all invariant to the amplitude scaling of the signal. When monitoring the temporal evolution of these quantitative measures (as done in the subject-wise case study), we also found that the median statistics of these measures are relatively robust to potential artifacts in the measurements. Therefore, these measures are arguably reliable for real-life applications.

Online implementation in clinical practice

In this study, we present a practical procedure for EEG analysis for the clinical pre-testing of brain death. The proposed EEG examination procedure (Fig. 1) can be applied at the patient’s bedside using a small number of electrodes. Our signal processing method can be used to reduce the power of additive noise and to decompose or separate the brain and interference signals.

In terms of clinical utility, we believe that real-field EEG analysis would provide the medical doctor or physician with valuable cues about the ongoing activities of the brain. Hence, our proposed method might potentially be used as a diagnostic and prognostic tool in clinical practice. In the meantime, new biomedical devices have been developed to help collect high-fidelity EEG signals in critical care settings (Litscher 1999). It is noted that most of our algorithmic components (such as the Fourier transform or standard matrix decompositions) can be implemented efficiently in real time, using LabVIEW or MATLAB running on a laptop. Although our results are still empirical (given the limited measurements available thus far) and solid confirmation of our claims requires further investigation and more data analysis, the work reported here can be viewed as a first step towards that final goal.

Future study

We are planning to collect more real-field EEG data for more in-depth data analysis. In addition, we are examining methods to distinguish the low-frequency components of EEG signals from their surrogate signals (with the same Fourier magnitude but randomly shuffled phase). This is important before applying any quantitative measures to evaluate the bona fide EEG signals. We are also attempting to explore several nonlinear methods and higher-order statistics to overcome the limitation of standard Fourier analysis, which is rooted in second-order statistics. The significance of the prospective method can be tested by Monte Carlo analysis. In addition, advanced machine learning methods, such as ensemble classifier methods (Dietterich and Bakiri 1995), can be used to further improve the classification performance, especially in the case of a small data sample set.

In conclusion, we believe that signal processing and machine learning tools for qEEG analysis can shed light on real-time medical diagnosis in clinical practice, and they present a challenging research direction.