1 Introduction

1.1 Evoked Potentials

Evoked potentials are signals that appear embedded in the electroencephalographic (EEG) signal after a given stimulus is presented to the subject, and they have very weak amplitudes (on the order of 0.1–100 µV) [1]. The EEG signal is considered the main source of noise [1, 2], but other interferences and noise sources also contaminate the recording and make the evoked potentials difficult to detect, such as artifacts originating inside the body, in the environment, and in the sensors and electrodes. The sources related to the electrodes and sensors used for the recording deserve special attention [3]. When the recording is done using a single channel, these interferences cannot be suppressed using linear combinations of channels, as in ICA or other linear techniques.

To recover these low-amplitude evoked potentials, the most common technique performs a Coherent Average (described in more detail in the next section) of a large number of responses to the stimulus. The EEG signal includes slow drifts [3,4,5] from the electrode-gel-skin interfaces. These drifts can affect the result of the Coherent Average by inducing a reproducible pattern that does not exist. Generally, these drifts can be treated with high-pass filters, but these in turn introduce other types of artifacts: new morphological forms appear that depend on the cutoff frequency, the order, and the type of filter used. Another problem is the size of the signal analysis window: for very small windows, some trends cannot be eliminated even if high-pass filters are used.

1.2 Coherent Average

The Coherent Average can be computed from the ensemble matrix P that is formed with the set of evoked responses [6,7,8,9,10,11], as shown in Eq. (1):

$$ P_{ij} = \left[ {\begin{array}{*{20}l} {p_{11} } \hfill & \ldots \hfill & {p_{1N} } \hfill \\ \vdots \hfill & \ddots \hfill & \vdots \hfill \\ {p_{M1} } \hfill & \ldots \hfill & {p_{MN} } \hfill \\ \end{array} } \right],\,1 \le i \le M,\,1 \le j \le N $$
(1)

Here, each row \( p_{i} \) of P, the response to the i-th stimulus, is assumed to be the sum of the deterministic (constant) component of the signal, or evoked response, s plus a random noise \( r_{i} \) that is asynchronous with the stimulus. The model for each of the M responses is given by Eq. (2).

$$ p_{i} = s + r_{i} $$
(2)

where the deterministic component s is given by Eq. (3):

$$ s = s\left( n \right), \, 1 \le n \le N $$
(3)

and the noise \( r_{i} \) is given by Eq. (4):

$$ r_{i} = \left[ {\varGamma_{i} \left( 1 \right)\varGamma_{i} \left( 2 \right) \ldots \, \varGamma_{i} \left( N \right)} \right] $$
(4)

In the model given by Eqs. (1)–(4), N is the number of samples that compose the epoch, and Γ(n), the noise at sample n, is assumed to be stationary and normal, with zero mean [12,13,14,15,16,17,18,19]. Consequently, the noise variance must be fixed and equal across all responses. The Coherent Average (CA), also known as the arithmetic mean, is a simple and direct method to estimate the deterministic component s; the estimate is denoted \( \hat{s} \) (Eq. 5):

$$ \hat{s}\left( n \right) = \frac{1}{M}\sum\limits_{i = 1}^{M} {p_{i} \left( n \right)} $$
(5)
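As a minimal sketch of Eq. (5), the following Python (NumPy) snippet averages the rows of an ensemble matrix. The function name `coherent_average`, the synthetic waveform and the noise level are illustrative assumptions and are not taken from the study; the noise is additionally assumed to be independent from one epoch to the next.

```python
import numpy as np

def coherent_average(P):
    """Estimate s_hat(n) as the mean over the M rows (epochs) of the M x N matrix P."""
    P = np.asarray(P, dtype=float)
    return P.mean(axis=0)

# Synthetic ensemble (assumed values, for illustration only)
rng = np.random.default_rng(0)
M, N = 2000, 200                                # epochs and samples per epoch
n = np.arange(N)
s = 0.5 * np.sin(2 * np.pi * 3 * n / N)         # deterministic component s(n)
P = s + rng.normal(0.0, 5.0, size=(M, N))       # p_i = s + r_i, Eq. (2)

s_hat = coherent_average(P)
residual_rms = np.sqrt(np.mean((s_hat - s) ** 2))
```

Under these assumptions the residual noise in the estimate decreases roughly as 1/√M, so `residual_rms` is expected to be near 5/√2000 ≈ 0.11 even though the single-epoch noise is ten times larger than the signal amplitude.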

In some Coherent Averaging applications, this \( {\hat{s}} \) can then be used to extract the noise part from each \( p_{i} \) and obtain an estimate of the signal-to-noise ratio of the ensemble, from which the possible biases produced by the number of responses M and the amplitude variability in s have been removed [20, 21].

If the noise components \( r_{i} \) of the individual responses present marked trends, the estimated signal \( \hat{s} \) can show changes in its morphology (given the very low amplitude of the s component), and values that are important for diagnosis, such as the amplitudes of the individual components of the evoked potential, can be distorted.

1.3 Detrending

One solution to the problems caused by high-pass filtering is to perform detrending. Detrending consists of removing means, offsets, or linear trends from regularly sampled time-domain signals. It can be performed by fitting a smooth function to the data, for example a low-order polynomial [3, 22,23,24,25], and subtracting it from the data in order to eliminate slow fluctuations. Other models can be used for the same purpose [3, 4, 26].

Detrending assumes a model of the signal that must be flexible enough to fit the existing trend, yet rigid enough that it does not absorb fluctuations of interest. Choosing the parameters (e.g. the polynomial order) is therefore a critical step. Simple trends are easily removed with low-order polynomials or with the first terms of a Fourier series. The unwanted trend can be assumed to contain fewer oscillations than the waveform of the evoked potentials, which in turn contains fewer oscillations than the noise, so a general guideline is to keep the order of the detrending model low enough that it does not fit the signal of interest.

In this paper, we propose to select the model and the order that best fit brainstem evoked potentials in order to eliminate existing trends and thereby improve the quality of the coherent average. Quality measures commonly used to validate the estimate in Eq. (5) were chosen to evaluate the results.

2 Methods

2.1 Data

The database used in this study consists of Transient Auditory Evoked Potentials recorded in 39 neonatal patients between 1 and 3 months of age born in Hospital Materno Ramón González Coro, in Havana, Cuba [27]. The signals were recorded with an AUDIX electro-audiometer. A click stimulus with a duration of 0.1 ms was delivered at different intensities (100, 80, 70, 60, 30 dBnHL and 0 dBpSPL) via insert earphones (EarTone3A) [28, 29]. Ag/AgCl dry electrodes were used, fixed with electrolytic paste on the forehead (positive), ipsilateral mastoid (negative) and contralateral mastoid (ground). The impedance values were kept below 5 kΩ. The sampling frequency was 13.3 kHz, and the analysis windows used to form the ensemble matrix P (Eq. 1) and calculate the coherent average were approximately 15 ms long, that is, about 200 samples per window (N = 200). From this database, only records obtained at 100 dBnHL (78 signals) were used, for which specialists confirmed that a response was present. These signals were chosen in order to guarantee the maximum values of the quality measures for this database. The signal was filtered in the analog domain with a band-pass filter with cut-off frequencies of 20 and 2000 Hz. Although, as noted above, filtering can introduce artifacts, it is necessary for this type of signal (EEG). The limitation imposed by the size of the analysis window (15 ms) persists, because the window is much shorter than the period corresponding to the lower cut-off frequency of the filter, 1/(20 Hz) = 50 ms.

According to [30], there are up to 8 oscillations with clinical value in the first 15 ms of the auditory evoked potentials, an aspect of relevance when choosing the order of the detrending model.

2.2 Models for Detrending Considered

Polynomial Model

Polynomial models for curves are given by Eq. (6).

$$ y = \sum\limits_{i = 1}^{n + 1} {p_{i} x^{n + 1 - i} } $$
(6)

where n + 1 is the order of the polynomial and n is its degree. The order gives the number of coefficients to be fitted, and the degree gives the highest power of the predictor variable. For instance, a third-degree (cubic) polynomial is given by:

$$ y = p_{1} x^{3} + p_{2} x^{2} + p_{3} x + p_{4} $$
(7)

Polynomials are often used when a simple empirical model is required. The main advantages of polynomial fits include reasonable flexibility for data that is not too complicated, and they are linear, which means the fitting process is simple. The main disadvantage is that high-degree fits can become unstable. Additionally, polynomials of any degree can provide a good fit within the data range, but can diverge wildly outside that range. Therefore, caution must be exercised when extrapolating with polynomials.

A polynomial of degree n can adapt to trends showing up to n − 1 local extrema, which in turn implies a maximum of (n − 1)/2 full oscillations in the trend. As mentioned, there are up to 8 oscillations of clinical relevance within the considered duration of the auditory evoked potentials. To keep the maximum number of detrended oscillations below half of these useful oscillations, we evaluated polynomial models of degree 0 to 8 (for brevity, these are referred to as orders 0 to 8 in what follows). Order zero corresponds to the classical procedure of removing the DC level, while the 1st-order polynomial corresponds to linear detrending.
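The polynomial detrending described above can be sketched in Python (NumPy) as follows. This is an illustrative least-squares implementation, not the code used in the study: the function name `detrend_polynomial` and the rescaling of the abscissa to [−1, 1] (done only to improve numerical conditioning at higher degrees) are assumptions.

```python
import numpy as np

def detrend_polynomial(P, degree):
    """Subtract a least-squares polynomial of the given degree from each epoch (row) of P."""
    P = np.asarray(P, dtype=float)
    N = P.shape[1]
    x = np.linspace(-1.0, 1.0, N)                       # scaled abscissa (assumption)
    # Vandermonde matrix with columns x^0, x^1, ..., x^degree
    V = np.polynomial.polynomial.polyvander(x, degree)
    # Solve all M least-squares problems at once; coeffs has shape (degree + 1, M)
    coeffs, *_ = np.linalg.lstsq(V, P.T, rcond=None)
    trend = (V @ coeffs).T                              # fitted trend, one row per epoch
    return P - trend
```

With `degree=0` the function removes the mean (DC level) of each epoch, and with `degree=1` it performs linear detrending; each epoch is detrended before the rows are averaged.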

Fourier Series Models

The Fourier series is a sum of sine and cosine functions that describes a periodic signal. It can be represented in either trigonometric or exponential form; the trigonometric form is used here:

$$ y = a_{0} + \sum\limits_{i = 1}^{n} {\left[ {a_{i} \cos (iwx) + b_{i} \sin (iwx)} \right]} $$
(8)

where a0 models a constant (intercept) term in the data and is associated with the i = 0 cosine term, w is the fundamental frequency of the signal, and n is the number of terms (harmonics) in the series. In this case, we evaluated Fourier models up to n = 8 to match the number of polynomial models considered, even though at that order the number of oscillations modeled can equal the number of oscillations of clinical interest.
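A minimal Python (NumPy) sketch of detrending with the model in Eq. (8) is given below. The text does not state how w was obtained, so this sketch assumes a fixed fundamental frequency w = 2π/N (one period per analysis window), which makes the fit a linear least-squares problem; if w were treated as a free parameter, a nonlinear fit would be needed instead. The function name `detrend_fourier` is hypothetical.

```python
import numpy as np

def detrend_fourier(P, n_harmonics, w=None):
    """Subtract a least-squares Fourier series with n_harmonics terms from each row of P."""
    P = np.asarray(P, dtype=float)
    N = P.shape[1]
    x = np.arange(N)
    if w is None:
        w = 2.0 * np.pi / N          # assumption: fundamental period equals the window
    # Design matrix: constant term a_0, then cos/sin pairs for harmonics 1..n
    cols = [np.ones(N)]
    for i in range(1, n_harmonics + 1):
        cols.append(np.cos(i * w * x))
        cols.append(np.sin(i * w * x))
    A = np.column_stack(cols)
    coeffs, *_ = np.linalg.lstsq(A, P.T, rcond=None)    # shape (2n + 1, M)
    return P - (A @ coeffs).T
```

Under this assumption, a model with n harmonics can absorb up to n full oscillations within the window, which is why high-order Fourier models risk removing part of the response itself.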

2.3 Quality Measures Used

Correlation Coefficient Ratio

The correlation coefficient ratio (CCR) is a statistic that reflects the replicability between two sub-averages \( \hat{s}_{1} \) and \( \hat{s}_{2} \), computed as follows:

$$ CCR = \frac{{\sum\limits_{i = 1}^{NM} {\hat{s}_{1} \left( i \right)\hat{s}_{2} \left( i \right)} }}{{\sqrt {\sum\limits_{i = 1}^{NM} {\hat{s}_{1}^{2} \left( i \right)} \sum\limits_{i = 1}^{NM} {\hat{s}_{2}^{2} \left( i \right)} } }} $$
(9)

According to the Audiology Assessment Protocol in [31], for a window of interest of 10 ms the value of CCR must be greater than 0.7 between two sub-averages obtained with 2000 epochs each.
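As an illustration, the CCR can be computed in Python (NumPy) as the normalized cross-product of two sub-averages, with the sums of squares of each sub-average in the denominator. This is a sketch under assumptions: the split into odd-numbered and even-numbered epochs and the function names `sub_averages` and `ccr` are illustrative choices, and no mean is subtracted from the sub-averages before the products are formed.

```python
import numpy as np

def sub_averages(P):
    """Return the averages of the odd-numbered and even-numbered epochs (1-based) of P."""
    P = np.asarray(P, dtype=float)
    s1 = P[0::2].mean(axis=0)        # epochs 1, 3, 5, ...
    s2 = P[1::2].mean(axis=0)        # epochs 2, 4, 6, ...
    return s1, s2

def ccr(s1, s2):
    """Normalized cross-product of two sub-averages (1 means identical shape)."""
    s1 = np.asarray(s1, dtype=float)
    s2 = np.asarray(s2, dtype=float)
    return np.sum(s1 * s2) / np.sqrt(np.sum(s1 ** 2) * np.sum(s2 ** 2))
```

The value is bounded between −1 and 1 and is insensitive to a common scaling of either sub-average, so it measures agreement in waveform shape rather than in amplitude.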

Standard Deviation Rate

The standard deviation rate (SDR) is a signal-to-noise ratio defined as

$$ SDR = \text{var} (\hat{s})/\text{var} (\theta ) $$
(10)

where var(\( \hat{s} \)) is the variance of the estimated signal and var(\( \theta \)) is the residual noise variance. The residual noise is estimated as the difference between the even sub-average and the odd sub-average.

$$ \theta = \hat{s}_{1} - \hat{s}_{2} $$
(11)

The standards [31] suggest values of SDR > 1 to guarantee the presence of a response.
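Equations (10) and (11) can be sketched in Python (NumPy) as shown below. The function name `sdr` is hypothetical, and two details are assumptions because the text does not specify them: the variances are taken over the N samples of the analysis window, and the sub-averages are formed from the odd-numbered and even-numbered epochs.

```python
import numpy as np

def sdr(P):
    """Ratio of the variance of the coherent average to the variance of the residual noise."""
    P = np.asarray(P, dtype=float)
    s_hat = P.mean(axis=0)               # coherent average, Eq. (5)
    s_odd = P[0::2].mean(axis=0)         # sub-average of epochs 1, 3, 5, ...
    s_even = P[1::2].mean(axis=0)        # sub-average of epochs 2, 4, 6, ...
    theta = s_odd - s_even               # residual noise, Eq. (11); the sign does not affect the variance
    return np.var(s_hat) / np.var(theta)
```

Because the deterministic component cancels in the difference of the two sub-averages, `theta` contains only noise, while `s_hat` contains the response plus the averaged noise.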

3 Results

To evaluate the results, a Friedman test was performed for each quality measure, comparing the values obtained with each of the detrending models over the 78 signals. In all cases, the test yielded p < 0.05, which suggests that there are significant differences between at least two models. In order to identify the models between which the differences existed, a post-hoc test was performed using the Bonferroni method. Figure 1 shows the results obtained for both quality measures, with CCR in the left panel and SDR in the right panel. Non-overlapping segments indicate significant differences. Poly 7 performs consistently as the best method across both measures, with Poly 5 ranking second in both.
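The rank-based comparison described above can be illustrated with a short Python (NumPy) sketch that computes the Friedman statistic and the average rank of each model. It is not the analysis code of the study: the function name `friedman_statistic` is hypothetical, no tie correction is applied (ties within a signal are assumed absent), and neither the p-value nor the Bonferroni post-hoc comparison is implemented.

```python
import numpy as np

def friedman_statistic(X):
    """X has one row per signal and one column per model; returns (Q, mean_ranks)."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    # Rank the k models within each signal (1 = smallest value); assumes no ties
    ranks = X.argsort(axis=1).argsort(axis=1) + 1.0
    mean_ranks = ranks.mean(axis=0)
    # Friedman chi-square statistic expressed through the average ranks
    Q = 12.0 * n / (k * (k + 1)) * np.sum((mean_ranks - (k + 1) / 2.0) ** 2)
    return Q, mean_ranks
```

For a sufficiently large number of signals, Q can be compared against a chi-square distribution with k − 1 degrees of freedom to obtain a p-value, and the average ranks are the quantities that a rank plot of the models would display.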

Fig. 1. Differences between the average ranks of the different models when evaluating the CCR (left) and the SDR (right). Grayed-out models have rank confidence intervals overlapping with that of the best model (Poly 7).

There is a clear tendency for the Fourier models to deteriorate as they approach the 8th order, which could be explained by a reduction in the amplitude of the recovered response as the model becomes increasingly able to fit \( \hat{s} \). Fig. 2 shows an example of the recovered \( \hat{s} \) using the evaluated methods for one of the subjects.

Fig. 2. Auditory evoked potentials recovered using the detrending methods for one subject.

Figure 2 shows the average potential obtained for one subject using the different detrending models. Most polynomial and Fourier models improve the resulting signal compared with subtraction of the DC level (the standard procedure), in agreement with the results shown in Fig. 1. The smaller amplitudes of \( \hat{s} \) for the higher-order Fourier approaches are also visible.

4 Conclusions

Adequate detrending can improve the detection of auditory evoked potentials according to recommended quality measures. It yields individual responses that are better suited to coherent averaging. Although the best results were obtained here for a polynomial model of order 7, a lower order (e.g. 5) can be considered as an option, given the interest in avoiding a fit to the oscillations of clinical interest. Future work should include an analysis of the variance of the noise that remains after subtracting the trends.