Introduction

The auditory steady-state response (ASSR) is an evoked brain response that can be used to objectively predict hearing thresholds in individuals who are unable or unwilling to cooperate during conventional behavioral testing (Sininger et al. 2018; Michel and Jorgensen 2017). The ASSR is usually elicited by presenting amplitude-modulated (AM) tones, and it is characterized by an increase in energy at the modulating frequency in the electroencephalogram (EEG) (Kuwada et al. 1986). The presence or absence of an ASSR can be determined statistically by objective response detectors (ORDs), which are usually applied in the frequency domain by means of the discrete Fourier transform (DFT) (Picton et al. 2003). The performance of an ORD depends on the signal-to-noise ratio (SNR) of the signal as well as on the number of data windows used in the DFT estimation. Moreover, the statistical threshold is obtained from the detector’s sampling distribution under the null hypothesis of no response. The magnitude-squared coherence (MSC) is an ORD frequently used in the literature to detect ASSRs (Dobie and Wilson 1989; Felix et al. 2005).

In practice, the DFT of a finite-length sinusoidal signal shows an additional set of non-zero bins when compared to the ideal Dirac delta distribution, which means that the energy that would be confined to a single bin is spread across neighboring bins (the sidelobes). This escape of energy in the spectrum is called spectral leakage (SL) (Breitenbach 1999). The distortion of the spectrum due to leakage may compromise the statistical significance of the ORD.
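The effect can be illustrated with a minimal sketch that evaluates the DFT of a sinusoid at a few bins. The epoch length of 64 samples and the choice of 8 versus 8.5 cycles per epoch are illustrative assumptions, not values from this study.

```python
import cmath
import math


def dft_bin(x, n):
    """Return the DFT of sequence x evaluated at integer bin n."""
    L = len(x)
    return sum(x[k] * cmath.exp(-2j * math.pi * n * k / L) for k in range(L))


def tone(cycles, L):
    """Unit-amplitude sinusoid with `cycles` periods in L samples."""
    return [math.sin(2 * math.pi * cycles * k / L) for k in range(L)]


L = 64                     # illustrative epoch length (assumption)
coherent = tone(8.0, L)    # integer number of cycles: energy stays in bin 8
leaky = tone(8.5, L)       # non-integer number of cycles: energy spreads

for name, x in (("coherent", coherent), ("leaky", leaky)):
    # Magnitudes normalized so that a coherent unit sinusoid gives 1 in its bin.
    mags = [abs(dft_bin(x, n)) / (L / 2) for n in (6, 7, 8, 9, 10)]
    print(name, [round(m, 3) for m in mags])
```

With an integer number of cycles, only the bin of the tone is non-zero; with a fractional number of cycles, the neighboring bins also receive energy.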

Spectral leakage is a problem in several research areas that rely on Fourier spectral analysis, such as ASSR detection (Felix et al. 2005), radar (Smith 2018), and harmonics estimation (Wu and Wang 2015). SL poses two potential problems: power leaking out of the frequency of interest and noise leaking into it. Coherent sampling is a widely used technique for avoiding the first problem; it consists of adjusting the frequency of interest so that the analyzed time interval contains an integer number of cycles. Many strategies have been proposed for cases in which coherent sampling cannot be achieved, such as (a) windowing (Raze et al. 2005), which consists of multiplying the signal by a tapered window in the time domain; (b) four-parameter sine wave fitting (FPSW) (Chen 2010), an algorithm that estimates the sine wave parameters; (c) the interpolating discrete Fourier transform (IpDFT) (Wu and Wang 2015; Diao and Meng 2015), an interpolation algorithm that refines the DFT result; (d) the 2-D FFT method (Sudani et al. 2011); and (e) filter bank methods (Rebai et al. 2004). For the ASSR, coherent sampling can be achieved by adjusting the stimulus modulating frequency such that each epoch of data contains an integer number of cycles (Xi and Chicaro 1996). Alternatively, the window length can be adjusted to equal an integer number of periods of the modulation frequency. However, coherent sampling cannot prevent the background EEG noise and external sources present during data acquisition, especially the power-line frequency, from leaking into the frequency of interest. Windowing and filtering are preprocessing techniques used for this purpose in EEG signals (Motamedi-Fakhr et al. 2014). Thus, this work aims at evaluating different signal preprocessing procedures designed to mitigate the effects of SL when applied prior to the MSC in the detection of ASSRs.

Theoretical background

Magnitude-squared coherence (MSC)

The coherence estimate between a deterministic, periodic input signal x[k] and the output signal y[k] of a linear system depends only on the output signal and is given by (Dobie and Wilson 1989):

$$ \mathrm{M}\hat{\mathrm{S}}\mathrm{C}(f)=\frac{{\left|{\sum}_{i=1}^M{Y}_i(f)\right|}^2}{M{\sum}_{i=1}^M{\left|{Y}_i(f)\right|}^2}, $$
(1)

where “^” denotes estimation, Yi(f) is the DFT of the i-th epoch of y[k], f is the frequency of the input signal, and M is the number of epochs used in the calculation.

In order to use this function as an ORD, the associated critical value must be obtained. It constitutes a threshold above which one may assume that a response has occurred. Critical values are commonly obtained from the inverse cumulative distribution function of the detector under the null hypothesis (H0) of no response. Under the null hypothesis, y[k] is assumed to be Gaussian noise. Thus, the distribution of the MSC under H0 is given by (Miranda de Sá 2004):

$$ {\left.\mathrm{M}\hat{\mathrm{S}}\mathrm{C}(f)\right|}_{H_0}\kern0.5em \sim {\beta}_{\left(1,M-1\right)}, $$
(2)

where β(1, M − 1) is the beta distribution with shape parameters 1 and M − 1. Accordingly, the detection threshold is given by (Miranda de Sá and Infantosi 2007):

$$ {\mathrm{MSC}}_{\mathrm{crit}}=1-{\alpha}^{\frac{1}{M-1}}, $$
(3)

where α is the given significance level. Thus, the ASSR is detected when \( \mathrm{M}\hat{\mathrm{S}}\mathrm{C}(f)>{\mathrm{MSC}}_{\mathrm{crit}} \).
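Equations 1 and 3 can be sketched in a few lines. The function names `dft_bin`, `msc`, and `msc_crit` are hypothetical helpers introduced only for illustration, and the frequency is expressed as an integer DFT bin index rather than in hertz.

```python
import cmath
import math


def dft_bin(x, n):
    """Return the DFT of sequence x evaluated at integer bin n."""
    L = len(x)
    return sum(x[k] * cmath.exp(-2j * math.pi * n * k / L) for k in range(L))


def msc(epochs, n):
    """Eq. 1: MSC estimate at bin n from a list of M equal-length epochs."""
    Y = [dft_bin(e, n) for e in epochs]
    M = len(Y)
    den = M * sum(abs(y) ** 2 for y in Y)
    return abs(sum(Y)) ** 2 / den if den > 0 else 0.0


def msc_crit(alpha, M):
    """Eq. 3: critical value for M epochs at significance level alpha."""
    return 1.0 - alpha ** (1.0 / (M - 1))


def detected(epochs, n, alpha=0.01):
    """Decision rule: a response is detected when the MSC exceeds MSC_crit."""
    return msc(epochs, n) > msc_crit(alpha, len(epochs))
```

Identical epochs give an MSC of 1 (perfect phase locking), while epochs whose phases cancel give an MSC of 0, which is the behavior expected from the ratio in Equation 1.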

Materials and methods

The experiments were performed in a soundproof booth located in the Interdisciplinary Center for Signal Analysis (NIAS) at the Federal University of Viçosa (UFV). The study was conducted on 5 adults with normal hearing (age range 21–29 years). Each subject participated in 5 sessions, each consisting of an EEG recording during AM auditory stimulation, according to the protocol approved by the Local Ethics Committee. The subjects were instructed to sit comfortably, keep their eyes closed, and not fall asleep during the exam.

Stimuli

The AM stimuli were created by multiplying a sine wave at the carrier frequency (fc) by a sinusoidal envelope at the modulating frequency (fm). For a maximum amplitude A and modulation depth λ, these stimuli are given by (John et al. 1998):

$$ x(t)=\frac{A\sin \left(2\pi {f}_ct\right)\left(\lambda \sin \left(2\pi {f}_mt\right)+1\right)}{1+\lambda }. $$
(4)

A modulation depth of 100% was used, and the amplitude was calibrated to 70 dB SPL. The carrier frequency was 1000 Hz, and the modulation frequency was fixed at 37.5 Hz so that exactly 64 cycles fit in an epoch of 1024 points, in accordance with the coherent sampling criterion (the sampling rate was 600 Hz). The stimuli were generated digitally with CD quality and presented monaurally to the right ear through a shielded cable coupled to an E-A-R Tone 5A insert earphone (Aero Technologies).
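The coherent sampling arithmetic and Equation 4 can be checked with a short sketch. The audio sampling rate of 44.1 kHz is an assumption standing in for "CD quality" (the text does not state the rate), and `am_tone` is a hypothetical helper name.

```python
import math

FS_EEG = 600.0   # EEG sampling rate in Hz (from the text)
L = 1024         # samples per epoch (from the text)
CYCLES = 64      # modulation cycles per epoch (from the text)

# Coherent sampling: choose fm so an epoch holds an integer number of cycles.
fm = CYCLES * FS_EEG / L            # 37.5 Hz
bin_spacing = FS_EEG / L            # DFT frequency resolution in Hz
fm_bin = round(fm / bin_spacing)    # DFT bin index holding the response


def am_tone(t, fc=1000.0, fm=fm, amp=1.0, depth=1.0):
    """Eq. 4: AM tone at time t (s); depth is the modulation depth lambda."""
    carrier = math.sin(2 * math.pi * fc * t)
    envelope = depth * math.sin(2 * math.pi * fm * t) + 1.0
    return amp * carrier * envelope / (1.0 + depth)


# Assumed audio rate for "CD quality"; not stated in the text.
FS_AUDIO = 44100.0
n_audio = int(FS_AUDIO * L / FS_EEG)   # audio samples spanning one EEG epoch
stimulus = [am_tone(k / FS_AUDIO) for k in range(n_audio)]
```

The division by 1 + λ keeps the peak of the stimulus at the amplitude A regardless of the modulation depth.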

EEG data

The BrainNet BNT 36 electroencephalograph (Lynx Tecnologia, Brazil) was used for EEG acquisition. The acquisition settings were a 100 Hz low-pass filter, a 0.1 Hz high-pass filter, and a sampling frequency of 600 Hz. Gold-plated electrodes with a 10 mm diameter were connected to the signal amplifier and placed on the scalp with the aid of an electrolytic gel.

The electrode positions were defined according to the International 10-20 System, with the reference at electrode Cz and the ground at Fpz, in the following derivations: F7, T3, T5, Fp1, F3, C3, P3, O1, F8, T4, T6, Fp2, F4, C4, P4, O2, Fz, Oz, Pz, A1, and A2.

EEG bipolar derivations

Bipolar derivations are formed from the potential difference between two scalp positions. Here, the number of available bipolar derivations is the number of pairwise combinations of the 22 electrode positions, i.e., 231. Since each of the 5 participants repeated the recording procedure 5 times, 5775 bipolar derivation signals were available for analysis. Each recording lasted about 1 min and 23 s, yielding signals with 49 windows of 1024 samples. Since the ASSR has a different intensity in each bipolar derivation, each derivation has a different SNR. In other words, this procedure allowed the SL preprocessing techniques to be applied to 5775 EEG signals with different SNR levels, which strengthens the statistical support for the results.
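The counts above follow from simple combinatorics, as the sketch below shows. The `recording` dictionary layout and the `bipolar` helper are hypothetical; the only assumption about the data is that every channel is measured against the common Cz reference, so Cz itself is an all-zero signal.

```python
from itertools import combinations

# 21 recorded channels plus the Cz reference give 22 electrode positions.
positions = ["F7", "T3", "T5", "Fp1", "F3", "C3", "P3", "O1", "F8", "T4", "T6",
             "Fp2", "F4", "C4", "P4", "O2", "Fz", "Oz", "Pz", "A1", "A2", "Cz"]

pairs = list(combinations(positions, 2))   # all unordered pairs
n_signals = len(pairs) * 5 * 5             # 5 subjects x 5 sessions
duration_s = 49 * 1024 / 600.0             # 49 epochs of 1024 samples at 600 Hz


def bipolar(recording, a, b):
    """Bipolar derivation a-b from referential signals.

    `recording` maps a position name to its list of samples measured against
    Cz (hypothetical layout); recording["Cz"] is expected to be all zeros.
    """
    return [p - q for p, q in zip(recording[a], recording[b])]
```

Because the common reference cancels in the subtraction, each pair yields a signal that depends only on the two chosen positions.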

Dealing with SL

Windowing

A well-known technique to minimize the effects of SL is to weight the data with a window function (Harris 1978). In this case, the DFT is estimated by

$$ Y(f)={\sum}_{k=0}^{L-1}w\left[k\right]y\left[k\right]{e}^{- jk2\pi f}, $$
(5)

where w[k] is the window function applied in the estimation of the DFT and L is the number of points in the data epoch. When w[k] is a rectangular window, Equation 5 reduces to the standard form of the DFT. All window functions available in MATLAB® 2015a were tested with their default parameters: Rectangular, Tukey, Bartlett, Triangular, Bartlett-Hann, Gaussian, Hamming, Hanning, Blackman, Bohman, Blackman-Harris, Chebyshev, Nuttall, Parzen, Taylor, Flat-Top, and Kaiser.
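A minimal sketch of Equation 5 is given below for one tapered window. The symmetric Hann window is written out by hand rather than taken from MATLAB, and the 60 Hz interference is a hypothetical stand-in for power-line noise (the text does not state the mains frequency); it completes 102.4 cycles per epoch and therefore leaks into the bin of the modulation frequency (bin 64).

```python
import cmath
import math


def hann(L):
    """Symmetric Hann window of length L."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * k / (L - 1)) for k in range(L)]


def windowed_dft_bin(y, w, n):
    """Eq. 5 evaluated at bin n, i.e., at normalized frequency f = n / L."""
    L = len(y)
    return sum(w[k] * y[k] * cmath.exp(-2j * math.pi * n * k / L)
               for k in range(L))


FS, L = 600.0, 1024
# Hypothetical 60 Hz interference: 102.4 cycles per epoch, so not coherent.
hum = [math.sin(2 * math.pi * 60.0 * k / FS) for k in range(L)]

rect = [1.0] * L
taper = hann(L)
# Leakage into bin 64, normalized by the sum of the window coefficients.
leak_rect = abs(windowed_dft_bin(hum, rect, 64)) / sum(rect)
leak_hann = abs(windowed_dft_bin(hum, taper, 64)) / sum(taper)
print(leak_rect, leak_hann)
```

The tapered window trades a wider main lobe for sidelobes that decay much faster, so interference far from the bin of interest contributes far less than with the rectangular window.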

Digital filters

Another way to circumvent SL in ASSRs is to apply a band-pass filter centered at the modulating frequency. The most common IIR filters (Butterworth, type 1 Chebyshev, and Elliptic) were implemented with orders 2, 4, 6, 8, and 10, all with a bandwidth of 1.17 Hz. This bandwidth was chosen to correspond to the frequency interval between the two bins adjacent to the modulating frequency. The type 2 Chebyshev filter was also tested, but its results are not shown in this work since its performance was always inferior.
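As an illustration of the simplest case only, the sketch below designs a second-order Butterworth band-pass filter with the bilinear transform and applies it with a difference equation. It assumes that "order" refers to the overall order of the band-pass filter; higher orders and the Chebyshev and Elliptic families would normally come from a filter design library, and the function names here are hypothetical.

```python
import cmath
import math


def butter_bandpass_order2(f_lo, f_hi, fs):
    """Second-order Butterworth band-pass via the bilinear transform.

    Returns (b, a) with a[0] == 1 for
    H(z) = (b0 + b1 z^-1 + b2 z^-2) / (1 + a1 z^-1 + a2 z^-2).
    """
    w_lo = math.tan(math.pi * f_lo / fs)   # pre-warped band edges
    w_hi = math.tan(math.pi * f_hi / fs)
    bw = w_hi - w_lo
    w0_sq = w_lo * w_hi
    a0 = 1.0 + bw + w0_sq
    b = [bw / a0, 0.0, -bw / a0]
    a = [1.0, 2.0 * (w0_sq - 1.0) / a0, (1.0 - bw + w0_sq) / a0]
    return b, a


def gain(b, a, f, fs):
    """Magnitude response |H| at frequency f in Hz."""
    z1 = cmath.exp(-2j * math.pi * f / fs)
    num = b[0] + b[1] * z1 + b[2] * z1 * z1
    den = a[0] + a[1] * z1 + a[2] * z1 * z1
    return abs(num / den)


def iir_filter(b, a, x):
    """Apply the biquad to x with a direct-form I difference equation."""
    y = []
    x1 = x2 = y1 = y2 = 0.0
    for xn in x:
        yn = b[0] * xn + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        x2, x1 = x1, xn
        y2, y1 = y1, yn
        y.append(yn)
    return y


FS, L, FM = 600.0, 1024, 37.5
half_bw = FS / L   # one bin on each side of fm, about 1.17 Hz in total
b, a = butter_bandpass_order2(FM - half_bw, FM + half_bw, FS)
```

With such a narrow band the poles sit very close to the unit circle, so the filter has a long transient, which is consistent with discarding the first epoch after filtering.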

A band-pass FIR filter of order 1024 was also applied; its recursive form was adapted from Yu et al. (2011) as follows:

$$ H(z)=\frac{2}{L}\frac{1-C{z}^{-1}-{z}^{-L}+C{z}^{-\left(L+1\right)}}{1-2C{z}^{-1}+{z}^{-2}}, $$
(6)

where C is a constant, given by:

$$ C=\cos \left(\frac{2\pi N}{L}\right), $$
(7)

where N is the frequency bin associated with the center frequency of the filter and L = 1024. Despite its high order, this filter is as easy to implement for online applications as the low-order IIR filters.
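The recursive filter can be sketched as a difference equation and checked against a direct convolution. The sketch assumes that the impulse response is the truncated cosine h[k] = (2/L) cos(2πNk/L) for k = 0, ..., L − 1, whose standard z-transform has the denominator 1 − 2Cz⁻¹ + z⁻²; the function names are hypothetical.

```python
import math


def _at(seq, i):
    """Sample i of seq, or zero before the start of the sequence."""
    return seq[i] if i >= 0 else 0.0


def recursive_cosine_fir(x, N, L):
    """Recursive form of the FIR filter h[k] = (2/L) cos(2 pi N k / L)."""
    C = math.cos(2 * math.pi * N / L)
    y = []
    for n in range(len(x)):
        # Numerator (1 - C z^-1)(1 - z^-L) applied to the input.
        fir = (_at(x, n) - C * _at(x, n - 1)
               - _at(x, n - L) + C * _at(x, n - L - 1))
        # Resonator feedback with poles on the unit circle at bin N.
        y.append(2.0 * C * _at(y, n - 1) - _at(y, n - 2) + (2.0 / L) * fir)
    return y


def direct_cosine_fir(x, N, L):
    """Same filter by direct convolution, used to check the recursion."""
    h = [(2.0 / L) * math.cos(2 * math.pi * N * k / L) for k in range(L)]
    return [sum(h[k] * x[n - k] for k in range(min(L, n + 1)))
            for n in range(len(x))]
```

The recursion needs only a handful of multiplications per sample regardless of L, which is what makes it attractive for online use. Because the poles lie on the unit circle and are cancelled only by the numerator zeros, rounding errors can accumulate over very long runs.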

Monte Carlo simulations

The critical value in Equation 3 was derived for the MSC with rectangular windows (Miranda de Sá 2007). However, preprocessing may alter the statistics of the detector. Thus, Monte Carlo simulations were performed with one million realizations of Gaussian noise. Each noise realization was subjected to the specific preprocessing, and then the MSC was applied. From the resulting one million MSC values, the critical values were obtained as the percentile associated with the significance level of the test. In order to avoid the transient effect of the filters, the first epoch was always discarded.
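The procedure above can be sketched as follows. This is a reduced-scale illustration, not the simulation code used in the study: the function name, the `preprocess` callback, and the default seed are hypothetical, and far fewer realizations than one million are practical in pure Python.

```python
import cmath
import math
import random


def simulate_msc_crit(alpha, M, L, n_bin, runs, preprocess=None, seed=0):
    """Monte Carlo critical value of the MSC under Gaussian white noise.

    `preprocess`, if given, maps the raw noise (a list of M * L samples) to
    the signal that is split into epochs; it may window, filter, or drop the
    first epoch, in which case fewer than M epochs enter the MSC.
    """
    rng = random.Random(seed)
    twiddle = [cmath.exp(-2j * math.pi * n_bin * k / L) for k in range(L)]
    values = []
    for _ in range(runs):
        noise = [rng.gauss(0.0, 1.0) for _ in range(M * L)]
        if preprocess is not None:
            noise = preprocess(noise)
        n_epochs = len(noise) // L
        Y = [sum(noise[i * L + k] * twiddle[k] for k in range(L))
             for i in range(n_epochs)]
        den = n_epochs * sum(abs(y) ** 2 for y in Y)
        values.append(abs(sum(Y)) ** 2 / den)
    values.sort()
    # Empirical (1 - alpha) quantile of the simulated null distribution.
    idx = min(runs - 1, int(math.ceil((1.0 - alpha) * runs)) - 1)
    return values[idx]
```

For a filter, `preprocess` could apply the filter and return the samples after the first L, so that the simulated critical value reflects the same epochs that enter the detector.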

Performance measurements

Three performance measures were calculated in order to compare the effect of the preprocessing techniques on the detection of ASSRs using the MSC: the area under the ROC curve (AUC), the detection rate, and the false positive rate. These measures were calculated considering all available EEG bipolar derivations. The detection rate was calculated as the percentage of the 5775 signals in which the MSC detected an ASSR. The false positive rate was calculated as the detection rate in the 20 bins neighboring the bin of the modulation frequency (10 above and 10 below); for this purpose, the filters were re-centered at each of the 20 frequencies. The detection rate and the false positive rate were obtained at significance levels of 0.01 and 0.05. The AUC was estimated by varying the significance level in steps of 0.01.
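A minimal sketch of these three measures is shown below. It assumes that the MSC values at the modulation bin and at the neighboring bins are already available as plain lists, that the theoretical critical value of Equation 3 applies (with filtering, simulated critical values would be substituted), and that the ROC curve is integrated with the trapezoidal rule; the function names are hypothetical.

```python
def msc_crit(alpha, M):
    """Eq. 3: theoretical critical value for M epochs."""
    return 1.0 - alpha ** (1.0 / (M - 1))


def rate(msc_values, crit):
    """Fraction of MSC values exceeding the critical value."""
    return sum(v > crit for v in msc_values) / len(msc_values)


def roc_auc(signal_msc, neighbour_msc, M, n_steps=100):
    """Area under the ROC curve traced by sweeping the significance level.

    signal_msc: MSC values at the modulation bin (detection rate).
    neighbour_msc: MSC values at neighboring bins (false positive rate).
    """
    points = []
    for i in range(n_steps + 1):
        crit = msc_crit(i / n_steps, M)   # alpha from 0 to 1 in equal steps
        points.append((rate(neighbour_msc, crit), rate(signal_msc, crit)))
    # Trapezoidal integration of detection rate over false positive rate.
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))
```

An AUC of 0.5 corresponds to a detector that responds to the modulation bin no more often than to its neighbors, and values closer to 1 indicate better separation.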

Statistical differences in the performance measures, specifically in the detection rates, were analyzed with McNemar’s test.
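For paired detection outcomes, McNemar’s test can be sketched as below. The text does not say whether a continuity correction was applied, so the `continuity` flag is an assumption, and the chi-square approximation with one degree of freedom is used rather than an exact binomial test.

```python
import math


def mcnemar(detected_a, detected_b, continuity=True):
    """McNemar's test on paired detection outcomes (lists of booleans).

    Returns (statistic, p_value) using the chi-square approximation.
    """
    # Discordant pairs: detected by one technique but not by the other.
    b = sum(1 for p, q in zip(detected_a, detected_b) if p and not q)
    c = sum(1 for p, q in zip(detected_a, detected_b) if q and not p)
    if b + c == 0:
        return 0.0, 1.0
    diff = abs(b - c) - (1.0 if continuity else 0.0)
    stat = max(diff, 0.0) ** 2 / (b + c)
    # Survival function of the chi-square distribution with 1 dof.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value
```

Only the signals on which the two techniques disagree enter the statistic, which suits this design because every preprocessing technique is applied to the same 5775 signals.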

Results

Critical values

Table 1 shows the critical values obtained through Monte Carlo simulation for different window functions and the difference found when compared to the theoretical ones obtained using Equation 3. All these critical values were obtained for M = 49.

Table 1 Critical values for MSC obtained through Monte Carlo simulation at a significance level of 0.01 for different window functions

It can be noted that the difference between the theoretical and simulated critical values for rectangular windowing is on the order of 0.14%. Since all other differences are in the same range, one may infer that the windowing methods do not notably alter the critical value estimation.

Table 2 shows the critical values obtained through Monte Carlo simulation for different filters at a significance level of 0.01 and the difference in relation to the theoretical critical value. Since the first epoch of the signal was discarded when the MSC was calculated with filtering, the critical values were obtained for 48 windows.

Table 2 Critical values for MSC obtained through Monte Carlo simulation at a significance level of 0.01 for different filters

Most of the filters led to larger differences between the simulated and theoretical critical values. Since 0.14% can be considered the order of magnitude of the discrepancy due to the Monte Carlo simulation in this case (refer to the result for the rectangular window in Table 1), it can be inferred that filtering alters the statistics of the MSC. Therefore, the critical values in Table 2 must be used to calculate the performance measures when filtering is applied, while the theoretical critical values can be used when windowing is the preprocessing step.

MSC performance

Table 3 shows the performance of the MSC for different window functions using the EEG data, ordered according to the AUC. It can be noted that the FP rate was lower than the significance level for rectangular windowing. This may be due to the non-Gaussianity, non-stationarity, or non-whiteness of the background EEG.

Table 3 MSC performance with different windowing for real EEG data

In comparison with the results from rectangular windowing, which is the standard way to estimate the MSC, all other windowing methods led to improvements in the AUC and detection rate. The low detection rate was expected because of the low SNR of some EEG bipolar derivations. Further, the false positive rates remained close to the significance level of the tests, as expected. Regardless of the significance level of the test, the Tukey window achieved the best performance.

The AUC obtained when the filters were applied is shown in Table 4. There was no statistical difference among the AUC values obtained with the different filters. Furthermore, filtering always led to a significantly higher AUC than any windowing method.

Table 4 MSC performance with different filtering for real EEG data

Table 4 also shows the detection and false positive rates for different filters at significance levels of 0.01 and 0.05. The filters were sorted according to the detection rate at a significance level of 0.01. The ‘*’ indicates the detection rates that were significantly different (p < 0.05) from the highest one, which is indicated by ‘+’.

The false positive rates remained around the significance level of the test. Regarding the detection rates at a significance level of 0.01, the best result was obtained by the 8th order Elliptic filter. The performance of this filter was not significantly different from that of the 6th and 8th order Butterworth filters and the 8th order Chebyshev filter. Regarding the detection rate at a significance level of 0.05, the results revealed that any of the filters could be used to minimize the SL, with the exception of the 10th order Chebyshev and the 2nd order Elliptic.

Discussion

This study aimed at investigating the effect of different preprocessing techniques for mitigating SL, namely windowing and filtering, applied to the detection of ASSRs in the EEG using the MSC. The effect of these techniques on the critical values, which constitute the thresholds for statistical inference with the MSC, was studied both theoretically and with Monte Carlo simulations. In addition, the AUC, the detection rate, and the false positive rate were estimated over all available EEG bipolar derivations in order to compare the performance of these preprocessing techniques.

Considering that the standard way of applying the MSC is with rectangular windowing, Tables 1 and 2 permit the conclusion that the critical values are immune to windowing but not to filtering. This was expected, since filtering creates a dependency between epochs while windowing does not. Thus, care must be taken when band-pass filtering is applied before the MSC. In this work, this issue was overcome by using Monte Carlo simulations to obtain simulated critical values.

In terms of AUC, both preprocessing techniques improved the performance of the MSC when compared to the standard MSC. However, the filtering techniques always obtained higher AUC values, as can be seen in Tables 3 and 4. Thus, it may be concluded that filtering is superior to windowing for SL mitigation in ASSRs.

Since the MSC is a constant false alarm rate (CFAR) detector (Kay 1998), the false positive rate of this detector is expected to remain around the significance level of the test. The results for rectangular windowing, which is the standard way to estimate the MSC, show that the presence of noise concentrated around a specific frequency in the EEG significantly changes the null hypothesis distribution, since the false alarm rate was smaller than the significance level. The results in Tables 3 and 4 indicate that the preprocessing techniques are more robust to such noise and did not significantly change the expected FP rate.

Finally, Table 4 indicates that 8th order IIR filtering is recommended for dealing with SL in ASSRs, since these filters increased the AUC and the detection rates at both significance levels (0.01 and 0.05). For the best case at α = 0.01 (8th order Elliptic filter), the detection rate increased by 214.8%. At α = 0.05, the best case (8th order Chebyshev) showed an increase of 145.7% in the detection rate. These percentage changes were calculated relative to the results obtained with rectangular windowing and no filter.

Conclusion

In summary, this work has presented a comparison of preprocessing techniques, specifically filtering and windowing, applied to EEG signals in order to mitigate the effect of SL in the detection of ASSRs. This work may serve as a guide for researchers who face problems with SL when using the MSC as an objective response detector. Our results showed that filtering is a good option, but care must be taken regarding the changes in critical values.