1 Introduction

Genetic predisposition, perinatal and postnatal factors, and the intrauterine environment all play a part in the development of childhood hearing impairment. Because several classification systems exist, the definition of hearing loss or deficit varies. Exposure to spoken language is essential in early development. As a result, children with undiagnosed hearing loss, including mild and unilateral deficits, may experience significant delays in speech development as well as psychological and behavioral disorders, which can affect their social and academic skills and their overall development [15, 18, 22].

The Auditory Brainstem Response (ABR) is a routinely used tool for evaluating neonatal auditory function and has long been considered the gold standard for neonatal diagnostic evaluation during the first six months of life. ABR is also used to detect hearing loss in adults, where repeated trials are averaged to obtain a meaningful waveform [1]. To find the lowest stimulus level that evokes a discernible response, the electroencephalogram (EEG) is sampled and averaged, and ABR waveforms are recorded for stimuli of various intensities [13, 17]. Noise interference from equipment, circuits, or power sources can distort the morphology of the ABR and reduce its accuracy, which makes wave V difficult to identify. Skilled clinical interpretation is required to identify the presence of the ABR and to differentiate it from physiologic noise; unfortunately, subjective interpretation and the possibility of human error undermine what should be an objective physiologic measure [18]. For this reason, researchers such as Wang et al. [23] proposed techniques to make hearing screening more feasible. They suggested using a Kalman filter to adaptively extract the response from noisy ABR recordings, reducing the number of trials required. Their work shows that, with proper denoising using a Kalman filter, fewer sweeps were required to obtain reliable ABR waveforms. In a related study, [4] introduced a wavelet-based algorithm known as cyclic shift tree denoising (CSTD), which converges faster to the underlying ABR waveform than conventional averaging methods [10, 21]. The CSTD method is computationally stable and algorithmically simple, which motivated its use in this study.

2 Materials and Methods

Click, tonal, or chirp stimuli can be used in ABR measurements [20]. Although the click stimulus is widely used, the responses obtained do not come from the whole cochlea but are thought to originate from the basal regions (2–4 kHz) [12]. The chirp stimulus was therefore developed to stimulate the entire cochlea simultaneously and provide effective neural synchronisation. A chirp is a quick sweep through frequencies, either from low to high or from high to low. Studies report that, because the chirp follows a specific sequence from low to high frequency, the whole cochlea is stimulated at the same time and ABR waves of larger amplitude are obtained than with clicks [6, 7, 11]. For each participant, the audio chirp stimulus was generated on a personal computer, which simultaneously processed and recorded the ABR signals. Triggering was achieved by connecting the audio output to both the trigger box and the g.PAH Programmable Attenuator Headphone buffer (Guger Technologies, Austria). The trigger box receives the computer’s audio signal and converts it to a square signal (the trigger signal), which is used as a reference point for framing the EEG signal according to the presented stimulus. At the same time, the g.PAH attenuator passes the signal to the headphones, so that the participant hears the stimulus every time it is played on the computer. The device is powered from a 100–240 V supply at 50–60 Hz; a bandpass filter with cutoffs of 100 Hz and 1500 Hz was therefore applied to the recorded signal. The ABR signal was recorded at a stimulus rate of 20 chirps per second, with a sampling rate of 19.2 kHz and 24-bit resolution. The sampling rate was set empirically and is consistent with previous studies, which used rates of around 20 kHz. The stimulus intensity ranged from 30 to 60 dBnHL in increments of 10 dB.
The EEG data were then recorded and averaged using sliding windows of 2000 frames. Each recorded epoch starts at the onset of the trigger signal and spans 40 ms (768 samples) for a chirp and 20 ms (384 samples) for a click. The processing steps were implemented in MATLAB. Figure 1 shows the experimental setup and the devices used to collect the ABR signals.
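
The framing and averaging steps described above can be sketched as follows. This is a minimal, dependency-free Python illustration, not the MATLAB implementation used in the study; the function names (`epoch_length`, `extract_epochs`, `average_epochs`) and the representation of trigger onsets as sample indices are assumptions made for the example.

```python
def epoch_length(fs_hz, window_ms):
    """Number of samples in a window of window_ms at sampling rate fs_hz."""
    return round(fs_hz * window_ms / 1000)


def extract_epochs(signal, trigger_onsets, n_samples):
    """Cut one epoch per trigger onset (a sample index, assumed format).

    Epochs that would run past the end of the recording are dropped.
    """
    return [signal[t:t + n_samples] for t in trigger_onsets
            if t + n_samples <= len(signal)]


def average_epochs(epochs):
    """Sample-by-sample mean across epochs (conventional averaging)."""
    if not epochs:
        raise ValueError("no complete epochs to average")
    return [sum(column) / len(epochs) for column in zip(*epochs)]


# Window lengths at the 19.2 kHz sampling rate reported in the text
chirp_len = epoch_length(19200, 40)   # 768 samples
click_len = epoch_length(19200, 20)   # 384 samples
```

At 19.2 kHz, the 40 ms and 20 ms windows correspond to 768 and 384 samples, which matches the epoch lengths stated in the text.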

Fig. 1. Hardware system setup with g.tec equipment, including the g.USBamp, g.PAH, trigger box, and sound card.

2.1 Wavelets Methods

The two wavelet denoising methods subjected to performance evaluation are as follows:

  • Wavelet Kalman Filter (WKF)

  • Cyclic Shift Tree Denoising (CSTD)

Figure 2 shows the flow of the two wavelet-based denoising methods: (a) the Wavelet Kalman filter approach and (b) Cyclic Shift Tree Denoising.

Fig. 2. Flowchart of (a) the Wavelet Kalman filter approach and (b) Cyclic Shift Tree Denoising.

A detailed description of the WKF method can be found in [19], and of the CSTD method in [4]; this section covers only the modifications. Because the ABR closely resembles the synthesis wavelet in morphology, the biorthogonal 5.5 wavelet was used, as proposed by [8]. Using the discrete wavelet transform (DWT), the signal was split into high-pass (HP) and low-pass (LP) components, which correspond to the detail and approximation components, respectively. The signal was decomposed to level 5, which covers the ABR’s main frequency components at 200, 500, and 900 Hz [8].
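
As a rough check on the choice of five levels, the nominal frequency band of each sub-band can be computed from the sampling rate. The sketch below assumes a dyadic decomposition with ideal half-band filters, which real biorthogonal filters only approximate, and uses the 19.2 kHz sampling rate from Section 2; the function names `dwt_bands` and `band_of` are hypothetical.

```python
def dwt_bands(fs_hz, levels):
    """Nominal (name, low_hz, high_hz) bands of a dyadic DWT.

    Detail level j nominally spans fs / 2**(j + 1) to fs / 2**j; the final
    approximation spans 0 to fs / 2**(levels + 1). Ideal half-band filters
    are assumed.
    """
    bands = [(f"D{j}", fs_hz / 2 ** (j + 1), fs_hz / 2 ** j)
             for j in range(1, levels + 1)]
    bands.append((f"A{levels}", 0.0, fs_hz / 2 ** (levels + 1)))
    return bands


def band_of(freq_hz, bands):
    """Name of the nominal band that contains freq_hz, or None."""
    for name, low, high in bands:
        if low <= freq_hz < high:
            return name
    return None


bands = dwt_bands(19200, 5)
for f in (200, 500, 900):
    print(f, "Hz ->", band_of(f, bands))
```

Under these assumptions, 200 Hz falls in the level-5 approximation (0 to 300 Hz), 500 Hz in the level-5 detail (300 to 600 Hz), and 900 Hz in the level-4 detail (600 to 1200 Hz), so a five-level decomposition separates the three components.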

2.2 Wavelet Kalman Filter Approach

The DWT is the discrete form of the continuous wavelet transform (CWT), obtained by restricting the scale and translation to the discrete values \(s = 2^j\) and \(u = n(2^j)\), where j and n are integers, such that,

$$\begin{aligned} S_{2^j} f(n)=\sum _{k \in Z} h_{k} S_{2^{j-1}} f\left( n-2^{j-1} k\right) \end{aligned}$$
(1)
$$\begin{aligned} W_{2^j} f(n)=\sum _{k \in Z} g_{k} S_{2^{j-1}} f\left( n-2^{j-1} k\right) \end{aligned}$$
(2)

where \(S_{2^j} f (n)\) are the approximation coefficients (the low-frequency sub-band), \(W_{2^j} f (n)\) are the detail coefficients (the high-frequency sub-band), \(j,\;n,\; k\; \in Z\), where Z is the set of all integers, and \(h_k\) and \(g_k\) are the low-pass and high-pass filter coefficients, respectively. This study used wavelet coefficients in the observation model, as suggested by [16], to reduce the state dimension, as shown in Fig. 2(a). This approach was introduced to overcome the high computational complexity caused by a high-dimensional state vector. The biorthogonal 5.5 wavelet was chosen, as suggested in [16, 24], because of the similarity between the morphology of the ABR and the synthesis wavelet. The improved signal was then decomposed with the DWT to obtain the coefficients for the Kalman filter. Algorithm 1 summarizes the recursive Kalman filter procedure, where H and C are reshaping matrices. The ABR estimation model was assumed to be a linear additive model, following [5]. The minimum number of epochs sufficient to detect wave V was chosen based on the experimental results. The Kalman filter is a data fusion algorithm with small computational requirements and elegant recursive properties, and it is the optimal estimator for one-dimensional linear systems with Gaussian error statistics. It is typically used to smooth noisy data and to estimate a parameter of interest. It is also widely used in positioning system receivers, phase-locked loops in radio equipment, smoothing the output of laptop trackpads, and more.
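
To illustrate the recursive structure, the following sketch applies a scalar Kalman filter to successive noisy observations of one wavelet coefficient, one observation per sweep. It is a simplified stand-in for Algorithm 1, not a reproduction of it: it assumes a random-walk state model with additive Gaussian observation noise, omits the reshaping matrices H and C, and uses hypothetical parameter names (`r` for the observation noise variance, `q` for the process noise variance, `x0` and `p0` for the initial estimate and its variance).

```python
def kalman_scalar(observations, r, q=0.0, x0=0.0, p0=1e6):
    """Recursively estimate a slowly varying scalar from noisy observations.

    Model (assumed): x_k = x_{k-1} + w_k,  z_k = x_k + v_k,
    with Var(w_k) = q and Var(v_k) = r.
    Returns the list of running estimates and the final error variance.
    """
    x, p = x0, p0
    estimates = []
    for z in observations:
        p = p + q              # predict: uncertainty grows by process noise
        k = p / (p + r)        # Kalman gain
        x = x + k * (z - x)    # correct the estimate with the innovation
        p = (1.0 - k) * p      # updated error variance
        estimates.append(x)
    return estimates, p


# Two noisy observations of the same coefficient (illustrative values)
estimates, variance = kalman_scalar([1.0, 3.0], r=1.0)
```

With `q = 0` and a large initial variance, the estimate approaches the running mean of the observations, so conventional averaging appears as a special case; a non-zero `q` lets the filter track changes between sweeps.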

2.3 Cyclic Shift Tree Denoising

The flowchart in Fig. 2(b) shows the CSTD method. An array of wavelet coefficients was created by performing a DWT on each frame, with the frames kept in their original successive order. The final average was created by linearly averaging all the frames, and the scale threshold was applied to obtain a smooth ABR signal. The array of wavelet coefficients was then denoised using CSTD, with each reordering of the frames denoised in a distinct way. All the differently denoised reorderings of the frames were linearly averaged to create a sequence of N frames [3, 9]. Time-domain samples were obtained by linearly averaging the N frames to generate one frame of wavelet coefficients and then reconstructing this averaged frame. This estimation technique converges to the underlying signal faster than linear averaging [2, 8]. It does, however, have a number of drawbacks. It cannot be applied to a single frame of data, because it requires multiple measurements of the same signal. The number of initial frames must be a power of two, and the signal being estimated must remain constant between frames.
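
The general idea of tree averaging over cyclic reorderings can be sketched as below. This is a simplified illustration and not the published CSTD algorithm of [4]: the frames are assumed to already hold wavelet coefficients, hard thresholding is assumed, and the threshold schedule (dividing the initial threshold `thr0` by the square root of 2 at each tree level) is an assumption chosen because averaging two frames of uncorrelated noise reduces the noise standard deviation by that factor. All function names are hypothetical.

```python
def hard_threshold(coeffs, thr):
    """Zero every coefficient whose magnitude does not exceed thr."""
    return [c if abs(c) > thr else 0.0 for c in coeffs]


def tree_denoise(frames, thr0):
    """Average adjacent frames pairwise, thresholding after each tree level."""
    level = [list(f) for f in frames]
    depth = 0
    while len(level) > 1:
        depth += 1
        thr = thr0 / 2 ** (depth / 2)   # assumed schedule, see lead-in
        level = [
            hard_threshold([(a + b) / 2 for a, b in zip(level[i], level[i + 1])], thr)
            for i in range(0, len(level), 2)
        ]
    return level[0]


def cyclic_shift_tree_denoise(frames, thr0):
    """Run the tree on every cyclic shift of the frame order and average."""
    n = len(frames)
    if n < 2 or n & (n - 1):
        raise ValueError("number of frames must be a power of two (>= 2)")
    results = [tree_denoise(frames[s:] + frames[:s], thr0) for s in range(n)]
    return [sum(column) / n for column in zip(*results)]


# Four frames of two coefficients: a stable component plus small noise
frames = [[1.0, 0.2], [1.0, -0.2], [1.0, 0.1], [1.0, -0.1]]
denoised = cyclic_shift_tree_denoise(frames, thr0=0.3)
```

The power-of-two check mirrors the constraint stated above, and the pairwise averaging only makes sense if the underlying signal is the same in every frame.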

Algorithm 1. Recursive procedure of the Kalman filter.

3 Results and Discussion

Wave V amplitudes and latencies were identified as the key components. Wave V was traced within the acceptance zone of latencies to determine whether it was present. For each subject, the average of 2048 recorded epochs was used to calculate the latency. In addition, data contaminated with noise interference were analyzed using the newly proposed modified approach, MCSTD. Female subjects showed earlier latencies and larger amplitudes than male subjects. Table 1 shows the average latency values of 11 normal adult subjects for the chirp stimulus at 2048 sweeps.

Table 1. The average latency values of wave V ABR for adult subjects
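
Tracing wave V within an acceptance zone of latencies can be illustrated with a simple peak search. The sketch below is an assumption-laden example rather than the procedure used in the study: the latency window is a caller-supplied hypothetical parameter, the peak is taken as the largest sample value relative to the baseline (clinical practice often measures peak-to-trough amplitude instead), and the plausibility range of 0.1 to 1.0 µV is the ABR amplitude range quoted in Section 3.1.

```python
def pick_wave_v(avg_uv, fs_hz, window_ms, amp_range_uv=(0.1, 1.0)):
    """Locate the largest peak of an averaged ABR inside a latency window.

    avg_uv     averaged epoch in microvolts, sample 0 at stimulus onset
    window_ms  (start, end) of the acceptance zone in ms (hypothetical values)
    Returns (latency_ms, amplitude_uv, plausible).
    """
    start = int(window_ms[0] * fs_hz / 1000)
    stop = min(len(avg_uv), int(window_ms[1] * fs_hz / 1000) + 1)
    if start >= stop:
        raise ValueError("latency window lies outside the epoch")
    idx = max(range(start, stop), key=lambda i: avg_uv[i])
    amplitude = avg_uv[idx]
    latency_ms = 1000 * idx / fs_hz
    plausible = amp_range_uv[0] <= amplitude <= amp_range_uv[1]
    return latency_ms, amplitude, plausible


# Toy epoch sampled at 1 kHz with a single 0.5 µV peak at 6 ms
toy = [0.0] * 10
toy[6] = 0.5
latency, amplitude, ok = pick_wave_v(toy, fs_hz=1000, window_ms=(4, 8))
```

A peak whose amplitude falls outside the plausible range is flagged rather than discarded, which leaves the final decision to the reviewer.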

The inter-wave intervals are shorter for females; the effect is negligible for wave I and more pronounced for later waves. Females have a smaller head size and less brain volume than males. Inter-wave latencies become shorter when the distance between the generators of the successive waves is shorter, and the amplitude becomes larger when the recording electrode is relatively closer to the wave generator. Figure 3 compares the female and male signals.

Fig. 3. ABR signal for female and male at 60 dB, 2048 sweeps.

Fig. 4. ABR signal for female and male at 60 dB, 2048 sweeps.

3.1 Selection of the Minimum Number of Epochs

The signal was analyzed using the averaging, KF, CSTD, and MCSTD approaches. The data were recorded at four intensity levels: 60, 50, 40, and 30 dBnHL. Recognition of wave V across intensity levels was related to its latency and to the frequency content of each stimulus. Corona-Strauss et al. [5] stated that latency increases as the frequency content of the chirp decreases. Moreover, responses evoked by higher intensity levels have shorter latencies than responses evoked by lower intensity levels: decreasing the intensity level lengthens the latency and reduces the amplitude.

Figure 4(a) and (b) show the results of wave V detection at 16 and 8 sweeps. As the number of sweeps decreased, the signal amplitude increased. According to [14, 16], the ABR signal amplitude lies between 0.1 and 1.0 \(\upmu \text {V}\); however, at 16 and 8 sweeps the amplitude exceeded 1 \(\upmu \text {V}\), as indicated by the straight red line. In addition, wave V detection failed at 16 and 8 sweeps, so these sweep counts are not reliable for detecting wave V. The goal of this research is to determine the minimum number of trials needed to detect the presence of wave V while denoising the contaminated signals. The performance of each method was assessed by comparing the morphology of the averaged data. Each recorded signal was averaged sweep by sweep over 2048, 1024, 512, 256, 128, 64, and 32 sweeps. When the signal is averaged over 2048 epochs, all techniques provide an accurate estimate of the ABR signal. When the number of epochs is reduced, however, wave V identification differs between techniques, and MCSTD performs well compared with the other approaches.
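
The trade-off between sweep count, acquisition time, and residual noise can be made concrete with a short calculation. The sketch assumes the stimulus rate of 20 chirps per second from Section 2, no rejected sweeps, and noise that is uncorrelated from sweep to sweep, in which case the residual noise after conventional averaging scales as one over the square root of the number of sweeps. The function names are hypothetical.

```python
import math


def acquisition_time_s(n_sweeps, rate_hz=20):
    """Recording time for n_sweeps at a fixed stimulus rate, in seconds."""
    return n_sweeps / rate_hz


def residual_noise_ratio(n_ref, n_sweeps):
    """Residual noise after averaging n_sweeps, relative to averaging n_ref.

    Assumes noise uncorrelated between sweeps (1 / sqrt(N) scaling).
    """
    return math.sqrt(n_ref / n_sweeps)


for n in (2048, 1024, 512, 256, 128, 64, 32, 16, 8):
    print(n, "sweeps:", acquisition_time_s(n), "s, noise x",
          round(residual_noise_ratio(2048, n), 2))
```

Under these assumptions, 2048 sweeps take 102.4 s per intensity level, whereas 32 sweeps take 1.6 s. Averaging only 8 sweeps leaves about 16 times more residual noise than averaging 2048, which is consistent with the amplitudes above 1 \(\upmu \text {V}\) observed at 16 and 8 sweeps.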

4 Conclusion

Analyzing EEG signals is very challenging because of their high complexity, low SNR, non-linearity, and non-stationarity. ABR is currently the most reliable method for hearing screening; however, noise interference from the environment, equipment, and subject movement can make wave V detection difficult. The aim of this study is to reduce screening time by using a better signal processing approach that reduces the number of sweeps and lowers the effects of noise interference. This study focused on detecting wave V in normal adults by applying KF and CSTD. The results show that, considering the SNR and RMSE, WKF provides better detection of wave V with a reduced number of epochs. Further work should be carried out to evaluate the performance of Modified Cyclic Shift Tree Denoising.