Introduction

In the field of auditory neurophysiology, the auditory brainstem response (ABR) has been extensively studied and is regarded as a popular objective test for various clinical applications. For decades, it has been used for estimating behavioral hearing thresholds in both children [1] and adults [2]. Its clinical usefulness has expanded to other important applications, including site-of-lesion testing [3], intra-operative monitoring [4], diagnosis of vestibular schwannoma [5] and Meniere’s disease [6], and evaluation of cochlear implant candidacy [7]. More recently, ABRs evoked by speech stimuli have been reported, and the outcomes are promising [8, 9].

Unlike the conventional click-evoked ABR, the speech-evoked ABR (speech-ABR) is recorded by repetitively presenting a speech syllable such as /da/ [10]. The typical speech-ABR waveform consists of specific peaks with distinct features: onset (V and A), consonant-to-vowel transition (C), sustained (D, E and F) and offset (O) [11]. The onset and sustained responses of speech-ABR are similar to the click-evoked ABR and the frequency following response (FFR), respectively [10]. By using speech stimuli, the possible mechanisms by which temporal and spectral features are processed within the auditory brainstem can be revealed [12]. For instance, the period of the fundamental frequency of the evoking stimulus is reflected in the inter-peak latencies of D, E and F [10]. Speech-ABR has been studied in various disorders, including learning disability [13], dyslexia [14, 15], autism [16] and others. Speech-ABR has also been reported to be a reliable tool for documenting the effectiveness of auditory training [15].

In a normal population, the effects of gender on speech-ABR have been clearly demonstrated [11]. The influence of ethnicity on speech-ABR outcomes, however, has not been well studied. Before speech-ABR testing can be applied widely in multiracial countries, it is essential to rule out any ethnicity effect on speech-ABR results. In Malaysia, an Asian country, Malay and Chinese are the main ethnic groups. The present study, therefore, aimed to compare speech-ABR outcomes between Malay and Chinese subjects. In addition, it was of interest to compare the speech-ABR results of the Asian subjects in the current study with the Caucasian data reported by Krizman et al. [11].

Materials and methods

Participants

Thirty healthy subjects, consisting of 15 Malay (mean age 22.3 ± 1.6 years) and 15 Chinese subjects (mean age 23.1 ± 1.5 years), participated voluntarily in this comparative study. To control for the gender effect, only male subjects were recruited. All of them had normal hearing bilaterally (hearing thresholds of less than 25 dB HL from 250 to 8000 Hz), were right-handed and had no history of hearing-related disorders. Prior to data collection, informed consent was obtained from all participants included in the study. All procedures performed in this study were approved by the Human Ethics Committee of Universiti Sains Malaysia, in accordance with the 1964 Declaration of Helsinki and its later amendments.

Stimuli and recording of speech-ABR

The speech-ABR recordings took place in a soundproof room within the Audiology Clinic, Universiti Sains Malaysia. A two-channel Biologic Navigator Pro AEP system (Natus Medical Inc., Mundelein, USA) was used to record speech-ABRs. The stimulus was a 40-ms speech syllable /da/ with five formants, which was provided by the AEP system (Fig. 1). This syllable consists of an initial noise burst and a formant transition between the consonant (/d/) and the steady-state vowel (/a/). Over the duration of the stimulus, the fundamental frequency (F0) and the first three formants (F1, F2 and F3) vary linearly: F0 from 103 to 125 Hz, F1 from 220 to 720 Hz, F2 from 1700 to 1240 Hz and F3 from 2580 to 2500 Hz. The higher formants, F4 and F5, are constant at 3600 and 4500 Hz, respectively.
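The linear trajectories described above can be sketched in a few lines of Python. This is an illustrative sketch only: the start and end frequencies are taken from the text, while the function names (linear_track, formants_at, period_ms) and the assumption that each parameter changes uniformly across the full 40 ms are ours, not a description of how the AEP system synthesizes the stimulus.

```python
# Start and end values (Hz) of each parameter over the 40-ms /da/ stimulus,
# as stated in the text. F4 and F5 are constant.
STIMULUS_MS = 40.0
TRACKS_HZ = {
    "F0": (103.0, 125.0),
    "F1": (220.0, 720.0),
    "F2": (1700.0, 1240.0),
    "F3": (2580.0, 2500.0),
    "F4": (3600.0, 3600.0),
    "F5": (4500.0, 4500.0),
}


def linear_track(start_hz, end_hz, t_ms, duration_ms=STIMULUS_MS):
    """Linearly interpolate a frequency at time t_ms.

    Assumption: the change is uniform over the whole stimulus duration.
    """
    if not 0.0 <= t_ms <= duration_ms:
        raise ValueError("t_ms must lie within the stimulus duration")
    return start_hz + (end_hz - start_hz) * (t_ms / duration_ms)


def formants_at(t_ms):
    """Return F0 and F1-F5 (Hz) at time t_ms under the linear assumption."""
    return {name: linear_track(lo, hi, t_ms) for name, (lo, hi) in TRACKS_HZ.items()}


def period_ms(freq_hz):
    """Period in milliseconds of a component at freq_hz."""
    return 1000.0 / freq_hz


if __name__ == "__main__":
    for t in (0.0, 20.0, 40.0):
        print(t, formants_at(t))
    # F0 period at stimulus onset and offset
    print(period_ms(103.0), period_ms(125.0))
```

Under these assumptions, the F0 period shortens from about 9.7 ms at stimulus onset to 8.0 ms at offset, which is the time scale that the inter-peak latencies of D, E and F are reported to reflect [10].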

Fig. 1 The waveform of the 40-ms syllable /da/ used in the present study

Three scalp electrodes were placed on the subject’s head: non-inverting on the vertex, inverting on the right mastoid and ground on the forehead. Electrode impedance was kept below 5 kΩ throughout the measurements.

Before the testing began, proper instructions were given to the subjects. After the headphones were placed, the stimulus was presented monaurally to the subject’s right ear at 80 dB nHL. Because of the laterality effect of speech-ABR, only the right ear was tested [10]. The stimulus rate was 10.9/s with 3584 sweeps. The epoch time was set at 74.67 ms (including a 10-ms pre-stimulus period). The acquired responses were amplified 100,000 times and band-pass filtered at 100–1500 Hz. To ensure waveform replicability, the recording was repeated twice for each trial. During the testing, the subjects lay comfortably on the provided bed. Breaks were given between trials or as requested by the subjects.
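As a rough arithmetic check of the acquisition parameters above, the following sketch derives the interval between stimulus onsets, the post-stimulus window and the stimulation time for one run of sweeps. The function and key names are illustrative, and the calculation assumes that no sweeps are rejected and that stimuli are delivered at exactly the nominal rate.

```python
RATE_PER_S = 10.9     # stimulus presentation rate (from the text)
SWEEPS = 3584         # sweeps per recording (from the text)
EPOCH_MS = 74.67      # total epoch, including the pre-stimulus period
PRESTIM_MS = 10.0     # pre-stimulus period


def timing_summary(rate_per_s=RATE_PER_S, sweeps=SWEEPS,
                   epoch_ms=EPOCH_MS, prestim_ms=PRESTIM_MS):
    """Derive basic timing quantities from the nominal settings."""
    onset_interval_ms = 1000.0 / rate_per_s   # time between stimulus onsets
    post_stim_ms = epoch_ms - prestim_ms      # window after stimulus onset
    run_s = sweeps / rate_per_s               # stimulation time for one run
    return {
        "onset_interval_ms": onset_interval_ms,
        "post_stim_ms": post_stim_ms,
        "run_s": run_s,
        "run_min": run_s / 60.0,
        # True when one epoch is no longer than the onset-to-onset interval
        "epoch_fits": epoch_ms <= onset_interval_ms,
    }


if __name__ == "__main__":
    for key, value in timing_summary().items():
        print(key, value)
```

With the stated values, stimulus onsets are about 91.7 ms apart, so the 74.67-ms epoch is shorter than the interval between consecutive onsets, and one run of 3584 sweeps corresponds to roughly 329 s (about 5.5 min) of stimulation.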

Data analysis

For peak picking, the present study followed the criteria used by Krizman et al. [11]. That is, the peaks of the speech-ABR waveforms were marked by the first author and then verified by the second author. For each subject, amplitude and latency values of the speech-ABR peaks (V, A, C, D, E, F and O) were computed. Speech-ABR composite onset measures (V/A duration, V/A amplitude and V/A slope) were also recorded. The data were then analyzed using descriptive and inferential statistics. Mean and standard deviation (SD) values were reported as applicable. The Kolmogorov–Smirnov test was used to check data normality, and the Levene test was then employed to determine whether the data had equal variances. Subsequently, independent t tests were used to compare the speech-ABR results between Malay and Chinese subjects. Lastly, one-sample t tests were conducted to compare the speech-ABR data from the present study with the speech-ABR findings for male subjects (n = 38) reported by Krizman et al. [11]. Resultant p values of less than 0.05 were considered statistically significant. All data analyses were carried out with SPSS software version 20 (SPSS Inc, Chicago, IL, USA).
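The sketch below illustrates, under stated assumptions, how the composite onset measures and a between-group comparison could be computed. The definitions used here (V/A duration as the latency of A minus the latency of V, V/A amplitude as the amplitude of V minus the amplitude of A, and V/A slope as their ratio) are a convention assumed for illustration; the text does not spell them out, and sign conventions vary between reports. The pooled-variance t statistic corresponds to the equal-variance independent t test. The analyses in the study were run in SPSS, and all numeric values below are hypothetical.

```python
import math
from statistics import mean, variance


def onset_composites(v_lat_ms, v_amp_uv, a_lat_ms, a_amp_uv):
    """Composite onset measures from peaks V and A (assumed definitions)."""
    duration_ms = a_lat_ms - v_lat_ms     # V/A duration
    if duration_ms <= 0:
        raise ValueError("peak A must occur after peak V")
    amplitude_uv = v_amp_uv - a_amp_uv    # V/A amplitude (peak to trough)
    return {
        "duration_ms": duration_ms,
        "amplitude_uv": amplitude_uv,
        "slope_uv_per_ms": amplitude_uv / duration_ms,  # V/A slope
    }


def pooled_t(group_a, group_b):
    """Independent-samples t statistic with pooled (equal) variances.

    Returns (t, df) with df = n_a + n_b - 2.
    """
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / df
    se = math.sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    return (mean(group_a) - mean(group_b)) / se, df


if __name__ == "__main__":
    # Hypothetical peak values for one subject (latency in ms, amplitude in uV)
    print(onset_composites(6.6, 0.12, 7.5, -0.20))

    # Hypothetical V/A slopes for two small groups
    group_1 = [0.33, 0.36, 0.31, 0.38, 0.35]
    group_2 = [0.34, 0.32, 0.37, 0.36, 0.33]
    t, df = pooled_t(group_1, group_2)
    print("t =", t, "df =", df)
```

With 15 subjects per group, as in the present study, the same function would return df = 28.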

Results

Speech-ABR waveforms were successfully recorded from all subjects. Figure 2 shows the speech-ABR waveform for a representative normal Malay subject. The onset (V and A), transition (C), sustained (D, E and F) and offset (O) peaks of speech-ABR are clearly shown. The speech-ABR data (amplitudes and latencies of each peak, as well as the composite onset measures) were found to be normally distributed with equal variances (p > 0.05). Parametric statistical tests could therefore be used for the subsequent analyses. Table 1 shows the mean and standard deviation of the speech-ABR results for the Malay and Chinese groups. At first glance, the mean speech-ABR peak amplitudes, peak latencies and composite onset measures are descriptively similar for both groups. As shown in Table 1, the independent t tests revealed no significant differences in any speech-ABR result between Malay and Chinese subjects (p > 0.05).

Fig. 2 Speech-ABR waveforms for the right ear of a representative Malay subject

Table 1 Mean and standard deviation of speech-ABR peak amplitudes, peak latencies and composite onset measures for Malay and Chinese participants

Since the speech-ABR results were not statistically different between the two ethnic groups, the data were pooled for the subsequent analysis (n = 30). Table 2 shows the pooled speech-ABR data and those obtained in the study of Krizman et al. [11]. The one-sample t tests showed that the mean amplitudes for peaks V, A, E, F and O were significantly higher in the present study than in Krizman et al.’s study (p < 0.05). On the other hand, no significant difference in mean amplitude was found between the two studies for peak D [t(29) = 1.93, p = 0.063]. For the latency analyses, all speech-ABR peaks in the present study showed significantly shorter mean latencies than those from the study of Krizman et al. (p < 0.05). For the composite onset measures, the present study produced significantly higher V/A amplitudes [t(29) = 4.108, p < 0.001] and steeper V/A slopes [t(29) = 4.671, p < 0.001] than Krizman et al.’s study. No significant difference was found between the two studies for V/A duration [t(29) = 1.689, p = 0.102].
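For readers who wish to check the reported statistics, the sketch below computes a one-sample t statistic from summary values and a two-sided p value for a given t and degrees of freedom (df = n - 1 = 29 for the pooled sample). It integrates the Student t density numerically with Simpson's rule using only the Python standard library. The function names are illustrative, and this is an approximation for checking purposes, not the SPSS procedure used in the study.

```python
import math


def one_sample_t(sample_mean, sample_sd, n, reference_mean):
    """t statistic for comparing a sample mean with a fixed reference value."""
    return (sample_mean - reference_mean) / (sample_sd / math.sqrt(n))


def t_pdf(x, df):
    """Probability density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t, df, steps=2000):
    """Two-sided p value via Simpson's rule on [0, |t|] (steps must be even)."""
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3          # P(0 < T < |t|)
    return 1 - 2 * area           # P(|T| > |t|)


if __name__ == "__main__":
    df = 30 - 1  # pooled sample of 30 subjects
    for t_value in (1.93, 1.689, 4.108, 4.671):
        print(t_value, two_sided_p(t_value, df))
```

Under this approximation, t(29) = 1.93 and t(29) = 1.689 give two-sided p values close to the reported 0.063 and 0.102, and the two larger t values give p values below 0.001.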

Table 2 Descriptive analyses and one-sample t-test outcomes when speech-ABR data in the present study are compared with the data of Krizman et al. [11]

Discussions

Recall that the main aim of the present study was to determine the influence of ethnicity on speech-ABR outcomes in Asian adults. In the current study, speech-ABR waveforms were successfully recorded from all subjects. However, peak C was omitted from the analysis due to its low detectability (present in only 80% of subjects). The poor detectability of peak C has also been acknowledged in previous studies [10, 17]. Compared with the other speech-ABR peaks, peak C has the lowest amplitude, resulting in the poorest signal-to-noise ratio [11].

When the speech-ABR results of Malay and Chinese subjects were compared, the differences were not statistically significant. That is, no clear ethnicity effect on speech-ABR was found in the present study. This suggests that the speech-ABR results (peak amplitudes, peak latencies and composite onset measures) for the two ethnic groups are essentially similar. The anatomical similarities between these ethnic groups might have contributed to the non-significant results [18]. It is well known that the amplitudes and latencies of ABRs are influenced by anatomical factors, particularly head diameter and cochlear length [19, 20]. Since the present study only recruited male subjects, the cochlear size factor might be of least importance. Both the Malay and Chinese groups are of Asian origin, and the head size differences between them are negligible [18].

When the speech-ABR data in the present study were compared with the corresponding data in the study of Krizman et al. [11], most of the statistical comparisons were significant. Again, anatomical factors might contribute to these outcomes. Specifically, Asian men have smaller body and head sizes than Caucasian men, resulting in higher amplitudes and earlier latencies of the waveform [20, 21]. Furthermore, since the syllable /da/ has a flat tone and is common in the Malay [22], Chinese [23] and English [24] languages, the superior speech-ABR outcomes in Asian subjects are unlikely to be due to the stimulus or to language experience. On the other hand, if syllables with different rhymes are used as the stimuli, the most robust speech-ABR waveforms might be found in Chinese subjects, as Chinese is a tonal language. This possibility, nevertheless, is subject to further research. For peak D amplitude, the non-significant result is perhaps due to data variability. The high variability of ABR peak amplitudes, caused by fluctuations in background electroencephalography (EEG) activity, has been well acknowledged [25, 26].

It is also worth noting that even though the sample size of the present study was smaller than that of Krizman et al. [11], the standard deviation of each speech-ABR result was reasonably small and comparable (Table 2). This suggests that the speech-ABR data obtained in the current study are adequate for the statistical comparisons performed. In addition, these data may serve as preliminary normative data for Asian adults in future applications.

Conclusion

The present study represents the first effort to determine the influence of ethnicity on speech-ABR among Asian adults. No ethnicity effect was found, suggesting that the speech-ABR results (amplitudes and latencies of onset, sustained and offset responses, as well as composite onset measures) for Malay and Chinese subjects are essentially similar. In this regard, ethnicity-specific normative data for speech-ABR may not be necessary when Malay and Chinese subjects are tested, at least at the current stage. On the other hand, normative speech-ABR data for Asian adults are clearly required, as the speech-ABR findings of Asian males are significantly different from the Caucasian data. Nevertheless, the data from the present study are only applicable to male subjects. Future studies should perhaps focus on determining speech-ABR outcomes in subjects of different genders and should include other ethnic groups. Lastly, future large-scale studies are warranted to further support the findings obtained in the present study.