Introduction

Parallel to the well-studied auditory afferent pathways, the auditory efferent system projects in a top-down manner through the corticofugal tract and ultimately targets the cochlea via the olivocochlear bundle. The medial olivocochlear (MOC) efferents, which originate from the medial superior olivary complex, innervate outer hair cells (OHCs), modify cochlear mechanics, and alter the output of the cochlea and consequently, the auditory nerve. Among the proposed functional roles of the olivocochlear efferents are the protection against acoustic trauma and the facilitation of speech perception in noise (see Guinan 2006; Robles and Delano 2008 for recent reviews).

Clinical interest in this area is increasing as faulty efferent control of cochlear mechanics has recently been demonstrated in individuals with various disorders such as learning disabilities and auditory neuropathy (Hood et al. 2003; Garinis et al. 2008). Current methods that evaluate the efferent modulation of the cochlea often lead to great variability in the direction and magnitude of the response (Siegel and Kim 1982; Moulin et al. 1993b; Williams and Brown 1997; Sun 2008b), or are contaminated by other reflexes such as the middle-ear muscle (MEM) reflex (Whitehead et al. 1991; Burns et al. 1993). Hence, a complete understanding of the efferent modulation of human cochlear functions not only helps elucidate the physiological basis of this phenomenon but also enables potential development of clinical tools for the assessment of the efferent pathway and its role in various pathologies.

The effects of MOC activation on cochlear mechanics and auditory nerve activity have been extensively studied in laboratory animals (Galambos 1956; Fex 1962; Mountain 1980; Siegel and Kim 1982; Murugasu and Russell 1996). The fundamental effect of MOC activation is a reduction in basilar membrane motion (Murugasu and Russell 1996; Dolan et al. 1997) due to a decrease in the gain of the cochlear amplifier. Reduced output from the cochlea then results in a reduction in auditory nerve activity (Galambos 1956; Fex 1962; Wiederhold and Kiang 1970). The effects of these efferent-induced changes in cochlear function have also been observed in otoacoustic emissions measured in the ear canal (see Mountain 1980; Siegel and Kim 1982 for initial reports).

The effects of MOC activation have been observed on two distinct time scales: a fast effect that builds up and decays on a scale of 10–100 ms (Wiederhold and Kiang 1970; Cooper and Guinan 2003) and a slow effect on a scale of 10–100 s (Sridhar et al. 1995; Cooper and Guinan 2003). At the cellular level, the fast effect is mediated by an increase of the OHC conductance via calcium-activated potassium channels (Housley and Ashmore 1991; Fuchs and Murrow 1992; Kakehata et al. 1993; Erostegui et al. 1994; Evans 1996; Oliver et al. 2000; Yoshida et al. 2001), which reduces the gain of the cochlear amplifier (Dallos 1992). The slow effect, on the other hand, is hypothesized to be accompanied by both an increase of the OHC conductance and a decrease of the OHC’s axial stiffness (Sridhar et al. 1995; Cooper and Guinan 2003).

Knowledge of efferent physiology and function cannot easily be transferred between species because of anatomical differences (Robles and Delano 2008) as well as different degrees of interplay with other systems such as the MEM reflex. In cats, guinea pigs, and mice, the preponderance of crossed MOC fibers (two to three times that of uncrossed MOC fibers; Robles and Delano 2008) is consistent with experimental data showing the ipsilateral MOC reflex to be two to three times stronger than the contralateral reflex (Robertson and Gummer 1985; Liberman and Brown 1986; Gifford and Guinan 1987; Brown 1989; Maison et al. 2003). In humans, however, the difference between the ipsilateral and the contralateral MOC reflex is much smaller (Guinan et al. 2003; Hood et al. 2003; Lilaonitkul and Guinan 2009) and the exact ratio between crossed and uncrossed MOC fibers is unknown. The degree of interplay between the MOC and the MEM reflexes is also species-dependent. In cats, post-onset adaptation and contralateral suppression of distortion product otoacoustic emissions (DPOAEs) were preserved when middle-ear muscles were severed, but were greatly reduced after olivocochlear bundle sectioning (Liberman et al. 1996). In rats, however, the MEM reflex is the main contributor to post-onset adaptation and contralateral suppression of DPOAEs, which virtually disappear after middle-ear muscle sectioning (Relkin et al. 2005). These differences highlight the necessity for species-specific knowledge about the physiology of the MOC efferent system and its modulatory effect on the cochlea.

In humans, where direct and invasive measurements of either basilar membrane mechanics or auditory nerve activity are not feasible, otoacoustic emissions (OAEs) provide an indirect but noninvasive probe to evaluate the effects of the MOC efferents on cochlear mechanics. In the vast majority of experiments in human subjects, the MOC efferents are stimulated acoustically to examine changes in evoked emissions, such as transient evoked OAEs (TEOAEs), distortion product OAEs (DPOAEs), and stimulus frequency OAEs (SFOAEs) (Veuillet et al. 1991; Moulin et al. 1993a; Guinan et al. 2003). Such acoustic stimulation of the MOC efferents can be conducted in the ear in which the OAE is being monitored (ipsilateral), in the opposite ear (contralateral), or in both (bilateral). The effect on the OAE is largest with bilateral stimulation (Guinan et al. 2003; Hood et al. 2003; Lilaonitkul and Guinan 2009), but both bilateral and ipsilateral stimulation are problematic because of two-tone suppression in the monitored ear. With contralateral acoustic stimulation (CAS), contamination of the observation by two-tone suppression is avoided. However, all methods of acoustic stimulation risk contamination due to the activation of the MEM reflex. Finally, the emission-evoking stimulus could itself elicit efferent activity and interact with the otoacoustic emission (Guinan et al. 2003). In this regard, Guinan et al. (2003) argued that SFOAEs provide a better assay to evaluate the MOC pathway, as the higher total energy present in the stimuli used to record TEOAEs and DPOAEs would be more likely to evoke MOC activity. Here, we extend this argument by examining the effects of efferent activation on spontaneous otoacoustic emissions (SOAEs), a type of OAE recorded without the use of an external stimulus.

Because broadband noise is an effective activator of both the MOC efferent activity and the MEM reflex (Norman and Thornton 1993; Maison et al. 2000; Velenovsky and Glattke 2002), the possibility of contamination from the MEM reflex is ever present and vexatious, demanding in-depth treatment. The range of temporal characteristics of the MOC and MEM reflexes, and the overlap between them makes it difficult to isolate their influence on OAEs (Goodman and Keefe 2006). As has been discussed above, the relative contribution of the MEM reflex to OAE shifts has been found to be species-dependent in laboratory animals. While the stapedius muscle can be sectioned in animal experiments, thereby allowing the parsing of contributions from the MEM and MOC reflexes, this luxury is not available in most experiments involving human subjects. Although it is difficult to ascertain the relative roles of the MOC and MEM reflexes in humans, an approximate estimate can be obtained from groups with specific pathologies that cause the attenuation/eradication of one or the other of these reflexes. In one such published work, significantly reduced suppression of TEOAE by CAS was demonstrated in a group of patients who had undergone vestibular nerve neurotomy (severed crossed MOC efferents) (Giraud et al. 1995). In contrast, contralateral suppression of TEOAE in another group of subjects diagnosed with Bell’s palsy (abolition of stapedial reflex due to paralysis of the facial nerve) was comparable to that of a control group. This led the authors to argue that the MEM reflex plays a relatively insignificant role in human subjects. However, the tensor tympani muscle was intact and active in the subjects with Bell’s palsy thereby allowing the possibility of the middle ear having influenced these results.

An acceptable approach in experiments with human subjects would be to measure the MEM reflex threshold in individual subjects. This would at least grant the experimenter the knowledge that when the MOC elicitor is higher in level than the MEM reflex threshold, the outcome was possibly “helped” by the MEM reflex. Complications taint this approach as well. First, the clinical MEM reflex thresholds vary with the sensitivity of the instrumentation and the testing procedures (Gelfand 1984). Moreover, thresholds measured using clinical instruments have been consistently found to be substantially higher than those measured using a variety of experimental techniques (Neumann et al. 1996; Feeney et al. 2003; Guinan et al. 2003; Muller-Wehlau et al. 2005; Goodman and Keefe 2006). However, most studies using contralateral noise elicitors have not controlled for the MEM reflex stringently, leaving their results difficult to interpret.

Efferent modulation of SOAEs has not been studied as extensively in the literature as other evoked OAEs. Still, activation of the MOC efferents has been shown to have a consistent effect on SOAEs. CAS with tones or broadband noise as the elicitor of MOC efferents has been shown to reduce SOAE amplitude and increase SOAE frequency (Mott et al. 1989; Harrison and Burns 1993; Smurzynski et al. 2002). The time course of CAS-induced changes in SOAE amplitude and frequency has been quantitatively characterized at the onset of CAS (Smurzynski et al. 2002). However, the temporal behavior of SOAEs during CAS or at CAS offset has not been characterized with sufficient resolution. Harrison and Burns (1993) observed a general release from MOC-induced shifts in both SOAE amplitude and frequency over a 40-s period. Nonetheless, the coarse temporal resolution of their measurement and analysis did not allow accurate description of the time course of events.

In this article, we have attempted to fully characterize the temporal behavior of magnitude and frequency shifts of SOAEs at the onset of, during, and at the offset of four levels of contralateral broadband noise. To achieve both high temporal and high spectral resolution, we have utilized the Choi–Williams distribution, a member of Cohen’s class of time–frequency distributions (Cohen 1995). In the execution of our experiment, we have attempted to detect the MEM reflex threshold by monitoring the magnitude and phase of a pure tone at two frequencies while stimulating the contralateral ear with broadband noise over a range of levels.

Methods

Subjects

Six female participants (S1–S6) between the ages of 18 and 30 years were recruited. All had clinically normal hearing sensitivity (20 dB HL or better at octave frequencies from 250 to 8,000 Hz, measured with the Interacoustics Audio Traveller AA220) in both ears and at least one SOAE greater than 10 dB SPL (re 20 μPa; SOAE spectra shown in Figure 1). Subjects were compensated for their participation, and the Northwestern University Institutional Review Board approved all procedures. Measurements were made in a sound-treated audiological test booth.

FIG. 1

SOAE spectra for all six subjects (S1–S6). The SOAEs marked by arrows were examined in detail for this study.

Signal generation and recording

Experimental stimuli were generated and digitized by a MOTU 828 MKII input/output converter controlled by a Macintosh computer via firewire. Signals were amplified by a custom-built amplifier and presented to the subjects’ ears using MB Quart 13.01 HX drivers. Signals were recorded in participants’ ears by an Etymotic Research ER-10B+ microphone, amplified by a pre-amplifier (20 dB gain), digitized by the MOTU and eventually stored on the Macintosh computer for analysis.

Measuring MEM reflex

MEM reflex thresholds were measured in each subject using the AA220 (standard probe tone at 226 Hz, with a contralateral wideband noise as the reflex elicitor). These results along with those obtained by a novel experimental procedure (described below) are displayed in Table 1. In addition to the measurement of MEM reflex thresholds using a clinical device, we made a set of parallel measurements by monitoring the magnitude and phase of a probe tone in the test ear while the contralateral ear was stimulated by a broadband noise. A contralateral broadband (white) noise (100–10,000 Hz, between 50 and 90 dB SPL in 10 dB steps) was presented as the MEM reflex elicitor. A probe tone of either 602 or 226 Hz at 60 dB SPL was monitored in the test ear using an Etymotic Research ER-10B+ microphone. In each run, the probe tone was on for 4 s and the contralateral elicitor was presented for 500 ms between 1 and 1.5 s (Fig. 2). A total of eight runs were recorded for each elicitor level and probe tone frequency. Averaged data for each elicitor level and probe tone frequency were passed through an adaptive least-squares fit filter (Long and Talmadge 1997) to estimate the magnitude and phase of the probe tone. The MEM reflex threshold was determined based on a time-locked change in the probe tone magnitude and phase in response to the contralateral noise. The lowest contralateral elicitor level that induced the characteristic pattern (described below) of the MEM reflex was taken as the threshold.
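The probe-tone monitoring described above can be illustrated with a minimal sketch. It uses an ordinary least-squares fit of a sinusoid at the known probe frequency over consecutive fixed windows, which is a simplified stand-in for the adaptive least-squares fit filter of Long and Talmadge (1997), not a reproduction of it. The function names, the window length, and the sampling rate in the usage note are assumptions made for illustration only.

```python
import math

def fit_tone(samples, fs, freq, offset=0):
    """Least-squares fit of a*cos(w*n) + b*sin(w*n) to `samples`.

    `offset` is the absolute index of the first sample, so that phases from
    different windows share one time reference.
    Returns (magnitude, phase in radians) of A*cos(w*n + phase).
    """
    w = 2.0 * math.pi * freq / fs
    scc = scs = sss = syc = sys_ = 0.0
    for i, y in enumerate(samples):
        c = math.cos(w * (i + offset))
        s = math.sin(w * (i + offset))
        scc += c * c
        scs += c * s
        sss += s * s
        syc += y * c
        sys_ += y * s
    # Solve the 2x2 normal equations for the cosine and sine weights.
    det = scc * sss - scs * scs
    a = (syc * sss - sys_ * scs) / det
    b = (sys_ * scc - syc * scs) / det
    # a*cos + b*sin = A*cos(w*n + phi) with a = A*cos(phi), b = -A*sin(phi).
    return math.hypot(a, b), math.atan2(-b, a)

def track_tone(samples, fs, freq, win):
    """Magnitude and phase in consecutive non-overlapping windows of `win` samples.

    Returns a list of (window start time in s, magnitude, phase).
    """
    out = []
    for start in range(0, len(samples) - win + 1, win):
        mag, ph = fit_tone(samples[start:start + win], fs, freq, offset=start)
        out.append((start / fs, mag, ph))
    return out
```

In use, each window's magnitude and phase would be normalized to the pre-elicitor baseline, and a change time-locked to the 1–1.5 s elicitor window would be sought. A hypothetical call such as `track_tone(recording, 8000.0, 226.0, 400)` yields one estimate every 50 ms.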

TABLE 1 MEM reflex thresholds obtained with broadband noise in the contralateral ears of all six subjects using clinical and experimental methods

FIG. 2

Representative results of the experimental method of measuring the MEM reflex threshold (subject 6). The magnitude and phase (both normalized to baseline) of a probe tone (226 or 602 Hz at 60 dB SPL) were monitored with CAS between 50 and 90 dB SPL presented from 1 to 1.5 s (black bar on the bottom left). A time-locked increase in the magnitude of the total pressure at 226 Hz was considered a sign of the MEM reflex. The reflex threshold in this subject was determined to be 70 dB SPL. Interestingly, for CAS above the experimental threshold of 70 dB SPL, the phase of the 226-Hz probe showed a lag, and the phase of the 602-Hz probe turned from a lead to a lag. The magnitude of the 602-Hz probe always showed a decrease, but it exhibited a biphasic pattern (a decrease followed by an increase) for CAS of 90 dB SPL.

It was assumed that the 226-Hz probe tone is only sensitive to the acoustic impedance change of the middle ear induced by the MEM reflex because SFOAEs are extremely difficult to measure at this low-frequency range (Goodman and Keefe 2006). The choice of the 602-Hz probe tone was based on Feeney et al. (2003), who showed that the greatest change in reflectance due to the MEM reflex was around this frequency. A positive change in the magnitude of the total pressure of the 226-Hz probe tone, time-locked to the contralateral noise, was taken as an indicator of a MEM reflex. Results obtained from subject 6 are displayed in Figure 2, with results for the 226- and 602-Hz probe tones represented in the left and right columns, respectively. Contralateral noise ranged from 50 to 90 dB SPL in 10-dB steps. Traces representing pressure and phase measurements are displayed for each probe tone. Note the increase in magnitude of the 226-Hz probe tone for a noise level of 70 dB SPL. A reduction in the phase of the 226-Hz probe tone was always observed with an increase in magnitude. Thus, 70 dB SPL was the designated MEM reflex threshold in this subject. As demonstrated in Figure 2, the changes in the magnitude and phase of the 602-Hz probe were more complex. A decrease in magnitude and an increase in phase were commonly observed at CAS levels below the MEM reflex threshold. Above the MEM reflex threshold, a general reduction in the phase of the 602-Hz probe was observed. However, complex patterns were often observed in both magnitude and phase at noise levels above the MEM reflex threshold, possibly indicating the interplay between the MEM reflex and the MOC reflex, which alters the SFOAE at the probe frequency. These observations, though interesting, are not the main focus of this paper. MEM reflex threshold estimates using the experimental method were consistently and significantly lower than those obtained using the AA220. We therefore chose to be guided by the MEM reflex threshold obtained by our experimental protocol.

SOAE measurements

A broadband noise (100–10,000 Hz) was presented in the contralateral ear while the SOAEs (without any stimulation) were monitored in the test ear. In each run, SOAEs were monitored for 35 s and the contralateral noise elicitor was turned on at 5 s and off at 25 s (Fig. 3). There was a 10-s break between runs. Four contralateral noise levels (45, 55, 65, 75 dB SPL) were interleaved and 20 runs were averaged for each noise level. Subject 4 had a fairly low MEM reflex threshold as measured by our experimental assay and thus was not exposed to the highest noise level of 75 dB SPL.
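The run structure described above can be written down as a short sketch that builds an interleaved list of elicitor levels and adds up the recording time. The round-robin ordering is an assumption, since the text states only that the levels were interleaved; the constant and function names are likewise hypothetical.

```python
# Protocol constants taken from the text.
LEVELS_DB_SPL = (45, 55, 65, 75)   # contralateral noise levels
RUNS_PER_LEVEL = 20                # runs averaged per level
RUN_S = 35                         # duration of one run, s
BREAK_S = 10                       # pause between runs, s
CAS_ON_S, CAS_OFF_S = 5, 25        # elicitor gating times within a run, s

def build_schedule(levels=LEVELS_DB_SPL, runs_per_level=RUNS_PER_LEVEL):
    """Return one elicitor level per run, cycling through `levels`.

    Round-robin order is assumed for illustration; the actual
    interleaving order is not specified in the text.
    """
    return [lvl for _ in range(runs_per_level) for lvl in levels]

def session_seconds(n_runs):
    """Total time for n_runs runs with a break between consecutive runs."""
    return n_runs * RUN_S + (n_runs - 1) * BREAK_S
```

Under these assumptions, a subject tested at all four levels completes 80 runs, and a subject tested at three levels, as for subject 4, completes 60, which `build_schedule(levels=(45, 55, 65))` reproduces.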

FIG. 3

An illustration of the experimental design for studying the effects of CAS on SOAEs. Contralateral noise was presented at levels between 45 and 75 dB SPL, in 10-dB steps, during a time window between 5 and 25 s while the selected SOAE was monitored for 35 s.

Participants’ SOAEs were also measured without contralateral stimulation before and after the test.

Analysis

A time–frequency analysis using the Choi–Williams distribution was applied in MATLAB to examine the shifts in SOAE magnitude and frequency with both fine temporal and fine spectral resolution. The Choi–Williams distribution is a member of Cohen’s class of time–frequency distributions (Cohen 1995); it renders the energy density (in arbitrary units) of a signal s(t) on a time–frequency plane and is fundamentally different from the spectrogram. Cohen’s class has the following general form:

$$ C\left( t,\omega \right) = \frac{1}{4\pi^2}\int \int \int s^{*}\left( u - \frac{1}{2}\tau \right) s\left( u + \frac{1}{2}\tau \right) \phi\left( \theta,\tau \right) e^{-j\theta t - j\tau\omega + j\theta u}\, du\, d\tau\, d\theta $$
(1)

where C(t,ω) is the energy density at time t and frequency ω. Since the energy density can be written as the Fourier transform of the deterministic local autocorrelation function R_t(τ), we have

$$ C\left( t,\omega \right) = \frac{1}{2\pi}\int R_t\left( \tau \right) e^{-j\tau\omega}\, d\tau $$
(2)

Comparing Eqs. 1 and 2 gives the deterministic generalized local autocorrelation function R_t(τ):

$$ R_t\left( \tau \right) = \frac{1}{2\pi}\int \int s^{*}\left( u - \frac{1}{2}\tau \right) s\left( u + \frac{1}{2}\tau \right) \phi\left( \theta,\tau \right) e^{j\theta\left( u - t \right)}\, d\theta\, du $$
(3)

where τ is the delay from time u in the autocorrelation function and θ is an intermediate variable without a direct physical correlate. φ(θ,τ) is the kernel function, which mitigates the inherent artifacts (cross-terms) of Cohen’s class of distributions. Different kernels φ(θ,τ) yield different members of Cohen’s class. The Choi–Williams distribution has a kernel φ(θ,τ) as follows:

$$ \phi\left( \theta,\tau \right) = e^{-\theta^2\tau^2/\sigma} $$
(4)

where σ is an adjustable parameter that is used to optimize the results. A typical example of shifts in SOAE magnitude and frequency obtained from the Choi–Williams distribution is shown in Figure 4. Specifically, we examined the SOAE shifts at the onset, in the presence, at the offset and after the offset of CAS. The shifts were fitted with a one-exponential curve (Eqs. 5 and 6).
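Returning briefly to the kernel of Eq. 4, its role can be made concrete with a small numerical sketch. The standard Choi–Williams kernel is exp(−θ²τ²/σ): it equals 1 along the axes θ = 0 and τ = 0 and decays away from them, which is how cross-terms are attenuated, and a smaller σ gives stronger attenuation. The function name and the sample values below are illustrative assumptions, not the settings used in the study.

```python
import math

def cw_kernel(theta, tau, sigma):
    """Choi-Williams kernel phi(theta, tau) = exp(-(theta*tau)^2 / sigma)."""
    return math.exp(-((theta * tau) ** 2) / sigma)

def kernel_row(theta, taus, sigma):
    """Kernel values along a line of constant theta (illustrative helper)."""
    return [cw_kernel(theta, tau, sigma) for tau in taus]
```

On either axis the kernel is exactly 1, so that part of the ambiguity-domain content is passed unchanged; off the axes, `cw_kernel(2.0, 3.0, 1.0)` is far smaller than `cw_kernel(2.0, 3.0, 10.0)`, reflecting the trade-off that the adjustable parameter σ controls.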

$$ y(t) = y_0 + k e^{-t/\tau} $$
(5)
$$ y(t) = y_1 + y_2\left( 1 - e^{-t/\tau} \right) $$
(6)

where y(t) is the magnitude or frequency change as a function of time t, y_0 (Eq. 5) or y_1 + y_2 (Eq. 6) is the asymptotic value of y, k is a constant, and τ is the time constant of the exponential function.
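One way to fit such a curve can be sketched as follows. For a fixed τ, the model y = y_0 + k·exp(−t/τ) is linear in y_0 and k, so the sketch scans a list of candidate time constants and solves a simple linear regression at each. This grid search is an assumption chosen for transparency; the fitting routine actually used in the study is not described. A curve of the form y_1 + y_2(1 − exp(−t/τ)), with y_1 + y_2 as its asymptote, belongs to the same family with y_0 = y_1 + y_2 and k = −y_2.

```python
import math

def fit_exponential(t, y, taus):
    """Fit y = y0 + k*exp(-t/tau), choosing tau from the candidates `taus`.

    For each candidate tau, y0 and k follow from ordinary linear
    regression of y on x = exp(-t/tau); the tau with the smallest sum
    of squared errors wins. Returns (y0, k, tau).
    """
    n = len(t)
    best = None
    for tau in taus:
        x = [math.exp(-ti / tau) for ti in t]
        mx = sum(x) / n
        my = sum(y) / n
        sxx = sum((xi - mx) ** 2 for xi in x)
        sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
        k = sxy / sxx
        y0 = my - k * mx
        sse = sum((yi - (y0 + k * xi)) ** 2 for xi, yi in zip(x, y))
        if best is None or sse < best[0]:
            best = (sse, y0, k, tau)
    _, y0, k, tau = best
    return y0, k, tau

def log_grid(lo, hi, n):
    """n logarithmically spaced candidate time constants from lo to hi."""
    step = (math.log(hi) - math.log(lo)) / (n - 1)
    return [math.exp(math.log(lo) + i * step) for i in range(n)]
```

The time axis should be measured from the start of the stage being fitted (for example, from CAS onset), and the resolution of the recovered τ is limited by the spacing of the candidate grid.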

FIG. 4

A 2-D illustration of typical shifts in SOAE frequency and magnitude, obtained from the Choi–Williams distribution analysis. During CAS, the SOAE frequency was elevated (upwards shift on the plot) while magnitude was reduced (change from warmer to cooler colors.)

Results

Six relatively large SOAEs (greater than 10 dB SPL) from six normal-hearing, healthy young female ears were examined. The examined SOAEs were between 876 and 2,764 Hz in frequency, and between 21 and 36 dB in signal-to-noise ratio (Fig. 1). As can be observed in the panels of Figure 1, each of our subjects had multiple SOAEs in the ear studied. We chose the largest SOAE for detailed examination. The level and frequency of the SOAE were monitored for a period of 35 s, while the contralateral ear was stimulated with a broadband noise in the time window between 5 and 25 s. The typical effect of CAS on the frequency and magnitude of an SOAE is illustrated on a time–frequency plane in Figure 4. During CAS, the SOAE frequency was elevated (upwards shift in the plot) and the magnitude was reduced (change from warmer to cooler colors). We discuss the details of the CAS-induced changes in SOAE magnitude and frequency below.

Effect of contralateral noise on SOAE magnitude

The magnitude of each of the six SOAEs studied exhibited a rapid reduction at CAS onset, a gradual release from reduction (adaptation) in the presence of CAS, a rapid overshoot at CAS offset, and a gradual return to baseline (recovery) (Fig. 5). The magnitude of the SOAE was quickly reduced as contralateral noise was gated on at 5 s. Over almost the entire 20-s CAS window, the SOAE magnitude was gradually and partially released from the reduced state. At 25 s when contralateral noise was gated off, the SOAE magnitude quickly recovered towards the baseline but usually overshot the baseline for a brief period of time, returning to baseline over the next few seconds. The reduction, adaptation (release from initial magnitude reduction), and the overshoot after CAS offset were more prominent in some subjects than others. Two exemplars of these effects are displayed in Figure 5 from subjects 2 and 6, in the left and right panels, respectively. At baseline, the examined SOAE in subject 2 was 16.9 dB SPL at 876 Hz. In the other example, the baseline was 15.6 dB SPL at 2,764 Hz for the examined SOAE in subject 6. Each trace in each panel represents the effect of CAS at a given level (marked on the traces in dB SPL). As seen in these examples, the reduction in magnitude generally increased with increasing CAS level. This trend was common in most subjects. In some cases, as in the left panel of Figure 5, the SOAE was reduced into the noise floor (black dashed line) at some CAS levels (65 and 75 dB SPL in this subject).

FIG. 5

Two examples of SOAE magnitude shift as a function of CAS level are presented in the left and right panels. Each trace represents change in SOAE magnitude as a function of time for CAS levels between 45 and 75 dB SPL. The black dashed line is the noise floor. The reduction of the noise floor during CAS was an artifact of the Choi–Williams analysis which relates to cross-terms. In both subjects, the MEM reflex threshold was 70 dB SPL, as estimated by our experimental method.

The MEM reflex thresholds of both subjects in Figure 5 were estimated to be 70 dB SPL by our experimental method. Thus, we attribute the adaptation and overshoot in SOAE magnitude to the MOC reflex for CAS levels of 45, 55, and 65 dB SPL, and to mixed effects of the MOC and MEM reflexes for the CAS level of 75 dB SPL.

Close examination in Figure 5 reveals a small reduction in the noise floor during CAS. This is due to the reduction of cross-terms, a known artifact inherent to all Cohen class distributions. These cross-terms are visible in Figure 4 as “ripples” flanking the SOAE signal band. The magnitude of the cross-terms is reduced during CAS activation in Figure 4, and the analog of this reduction is the lowering of the noise floor in Figure 5.

In an attempt to fully characterize all stages of CAS-induced changes to an SOAE, three states were defined for comparison. Baseline was defined as the average magnitude in the 1-s window before CAS onset. The reduced state was defined as the average SOAE magnitude in a 1-s window before CAS offset. Finally, the post CAS peak was defined as the highest recorded magnitude in a 2-s time window following CAS offset. From these reference points, shift was computed as the deviation of the reduced state from the baseline. Similarly, overshoot was computed as the deviation of the post CAS peak from the baseline. A schematic illustration of these computations in subject 2 is presented in the leftmost column of Figure 6. The top and bottom rows exhibit SOAE behaviors due to CAS of 45 and 65 dB SPL, respectively. Different stages of CAS-induced changes of the same SOAE are highlighted in the remaining columns of Figure 6. Each stage was fitted with a one-exponential curve to quantify the temporal characteristics of CAS-induced initial reduction, adaptation, overshoot, and return to baseline which typically follows the overshoot. The thick red lines in each of the last four columns of Figure 6 represent these fits (see Eqs. 5 and 6 in methods). Shift, overshoot, as well as time constants for various stages depicted in Figure 6 are presented later for all subjects.
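The three reference states and the two derived measures defined above can be expressed as a minimal sketch operating on a magnitude trace sampled at times `t`. The function names are hypothetical, and the default gating times follow the protocol (CAS on at 5 s, off at 25 s). The sketch covers magnitude only; for frequency, where the overshoot is in the negative direction, the minimum rather than the maximum of the post-CAS window would presumably be taken.

```python
def mean_in(t, y, t0, t1):
    """Mean of y over samples with t0 <= t < t1."""
    vals = [yi for ti, yi in zip(t, y) if t0 <= ti < t1]
    return sum(vals) / len(vals)

def shift_and_overshoot(t, y, cas_on=5.0, cas_off=25.0):
    """Return (shift, overshoot) of an SOAE magnitude trace.

    baseline      : mean over the 1-s window before CAS onset
    reduced state : mean over the 1-s window before CAS offset
    post-CAS peak : maximum over the 2-s window after CAS offset
    shift         = reduced state - baseline
    overshoot     = post-CAS peak - baseline
    """
    baseline = mean_in(t, y, cas_on - 1.0, cas_on)
    reduced = mean_in(t, y, cas_off - 1.0, cas_off)
    peak = max(yi for ti, yi in zip(t, y) if cas_off <= ti < cas_off + 2.0)
    return reduced - baseline, peak - baseline
```

Both outputs are in whatever unit the trace uses, so a reduction during CAS appears as a negative shift and an overshoot above baseline as a positive value.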

FIG. 6

An example (subject 2) of the time course of SOAE magnitude shifts with CAS at 45 (upper row) or 65 dB SPL (lower row). The quantification of shift and overshoot in magnitude is illustrated in the upper left panel. Shifts at CAS onset, during the presence of CAS (adaptation), at CAS offset and after CAS offset (recovery) are illustrated in columns two through five. The shifts were fitted with a one-exponential function (thick red lines, Eqs. 5 and 6). Note that the SOAE was suppressed into the noise floor after the onset of CAS at 65 dB SPL, thereby affecting the time constant of the fitted curve. The time constant of adaptation for this condition was therefore not included in group data (Fig. 10).

Effect of contralateral noise on SOAE frequency

Figures 7 and 8 are analogs of Figures 5 and 6, but show changes in SOAE frequency rather than magnitude. As was observed in the case of SOAE magnitude, SOAE frequency demonstrated a rapid change (elevation in frequency) at CAS onset, a gradual release from elevation (adaptation) in the presence of CAS, a rapid overshoot at CAS offset, and a gradual return to the baseline (recovery). Two examples from subjects 1 and 3 are presented in Figure 7. At baseline, the examined SOAE in subject 1 (Fig. 7, left panel) was 13 dB SPL at 1,426 Hz. The SOAE examined in subject 3 was 22.4 dB SPL at 2,026 Hz (Fig. 7, right panel). In both cases, the change in SOAE frequency was generally bigger with increasing CAS level. However, the frequency elevation appears to saturate for CAS levels greater than 55 dB SPL in subject 3 (Fig. 7, right panel). The SOAE was reduced to the noise floor immediately after the onset of CAS at 75 dB SPL in subject 1, thereby preventing estimation of its nominal frequency (marked by arrow in Fig. 7, left panel). As in the case of magnitude changes, SOAE frequency elevation, adaptation, and overshoot varied among subjects. Group data are presented later in the paper.

FIG. 7

Two examples of shifts in SOAE frequency as a function of time for different CAS levels. In the left panel, the onset of the contralateral noise at 75 dB SPL suppressed the SOAE into the noise floor, resulting in a lack of nominal SOAE frequency and a gap in the trace (marked by the arrow).

FIG. 8

An example (subject 3) of the time course of SOAE frequency shifts with contralateral noise at 45 (upper row) or 65 dB SPL (lower row), respectively. The quantification of frequency shift and overshoot is illustrated in the upper left panel. Shifts at CAS onset, during the presence of CAS (adaptation), at CAS offset and after CAS offset (recovery) are illustrated in columns two through five. The shifts were fitted with a one-exponential function (thick red lines, Eqs. 5 and 6).

The MEM reflex threshold was estimated to be 70 dB SPL for both subjects 1 and 3. Thus, we attribute the changes in SOAE frequency to the MOC reflex for CAS levels up to 65 dB SPL, and to a combination of MOC and MEM reflexes for CAS at 75 dB SPL.

Figure 8 illustrates the quantification of the frequency shift and overshoot in subject 3. Fits of one-exponential curves to the various stages of CAS-induced change in SOAE frequency are depicted for subject 3 in the last four columns of Figure 8. The top and bottom rows of Figure 8 represent results for CAS levels of 45 and 65 dB SPL, respectively. Group data for shift, overshoot, and time constants for the various stages are presented below.

Group data

In Figure 9, shifts (left column) and overshoot (right column) in SOAE magnitude (top row) and frequency (bottom row) from all subjects (color coded) as a function of CAS level are presented. In each panel, the data points following a dashed line are likely to be affected by the MEM reflex. For example, the MEM reflex threshold was estimated to be at 50 dB SPL for subject 4. Hence, all data points from this subject are connected using dashed lines. In contrast, all other subjects had MEM reflex thresholds at 70 dB SPL. Thus, only the data point obtained with CAS of 75 dB SPL follows a dashed line.

FIG. 9

CAS-induced shift (left column) and overshoot (right column) in both magnitude (upper row) and frequency (lower row) as a function of CAS level. The shift in magnitude was converted to dB re: baseline. Dashed lines indicate potential contamination from the MEM reflex given the experimental MEM reflex thresholds (see text and Table 1). Overall, as CAS level increases, shifts and overshoots in both magnitude and frequency increase. SOAEs from subjects 4 and 5 were completely suppressed into the noise floor by all levels of CAS. Hence, their magnitude shifts were invariant as a function of CAS level (open symbols) and their data are excluded from the bottom left panel.

As CAS level increased, both magnitude shift (reduction) and frequency shift (elevation) increased (Fig. 9, left column). SOAEs from subjects 4 and 5 were fully reduced into the noise floor by all CAS levels, and the noise floor level was arbitrarily taken as the SOAE magnitude during CAS. As a result, their growth curves for magnitude shift did not change as a function of CAS level, and their nominal SOAE frequencies were undeterminable (data not shown in bottom left panel). We chose to visually separate subjects 4 and 5 in the top left panel of Figure 9 by using open symbols. Notice that the magnitude is presented in dB re: baseline. The arbitrary magnitude in the Choi–Williams distribution represents energy density whose square root is roughly on the same scale as sound pressure. The non-local nature of the Choi–Williams distribution prevents a strict definition of this relationship. Thus, estimation of dB change on the SPL scale should be performed on the square root of the output of the Choi–Williams distribution. The overshoot is in the positive direction for magnitude and in the negative direction for frequency, and is generally greater for higher CAS levels (Fig. 9, right column).
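The conversion to dB re: baseline described above can be sketched in a few lines. Taking the square root of the Choi–Williams energy density as a pressure-like amplitude, the level change is 20·log10 of the amplitude ratio, which is the same as 10·log10 of the energy-density ratio. The function name is hypothetical, and, as noted in the text, the relationship to sound pressure is only approximate.

```python
import math

def db_re_baseline(energy, baseline_energy):
    """Level change in dB relative to baseline.

    The square root of the Choi-Williams energy density is treated as a
    pressure-like amplitude, so the result equals
    10*log10(energy / baseline_energy).
    """
    amplitude = math.sqrt(energy)
    baseline_amplitude = math.sqrt(baseline_energy)
    return 20.0 * math.log10(amplitude / baseline_amplitude)
```

For example, an energy density that falls to one quarter of its baseline value corresponds to a reduction of about 6 dB on this scale.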

By fitting one-exponential curves to SOAE data (Figs. 6 and 8), we also quantified the time constants of SOAE shifts in both magnitude and frequency for CAS onset (τ_onset), adaptation (τ_adaptation), offset (τ_offset), and recovery (τ_recovery). Time constants are displayed in Figure 10 as a function of CAS level. The trends in the time-constant growth functions are less clear than those in SOAE shift/overshoot growth functions (Fig. 9). Nevertheless, for both SOAE magnitude and frequency shifts, the onset time constants (τ_onset) decrease as the CAS level increases (Fig. 10, left column). Subject 2 showed a frequency shift of less than 1 Hz; thus, our frequency resolution did not allow us to fit the one-exponential curve for this subject. SOAEs from subjects 4 and 5 were completely reduced into the noise floor by CAS, leaving the SOAE frequency indeterminable. Therefore, these three subjects were excluded from the time constant estimation for SOAE frequency changes (Fig. 10, lower row), except for recovery data for subject 4. No general patterns were observed for adaptation, offset, and recovery in either magnitude or frequency.

FIG. 10

Time constant τ of the fitted curves of SOAE shifts as a function of CAS level. Onset, adaptation, offset, and recovery time constants are presented in columns one through four. Time constants of shifts in magnitude and frequency are presented in the upper and lower rows, respectively. Dashed lines indicate potential contamination from the MEM reflex given the experimental MEM reflex thresholds (Table 1). Subject 2 demonstrated a frequency shift of less than 1 Hz. The SOAEs from subjects 4 and 5 were suppressed into the noise floor for all CAS levels. Thus, the time constants of the frequency shift from these three subjects are not available except for the recovery stage. Overall, at CAS onset there is a negative slope as a function of CAS level. The trends in other panels are less prominent, especially for adaptation and recovery due to their low fitting coefficients. Onset time constants obtained from the Wigner–Ville distribution by Smurzynski et al. (2002) are marked by black open circles (median τ) and horizontal bars (range of τ).

Discussion

The principal focus of the experiments presented here was to examine the time course of changes in magnitude and frequency of SOAEs in response to CAS, which are arguably mediated through the MOC efferents. We attempted to carefully characterize the influence of the MEM reflex on our results. The use of SOAEs to characterize the effects of the MOC system on human cochlear mechanics may be advantageous because no external evoking stimulus, which could itself activate the MOC, is required.

In agreement with previous studies, we observed not only a reduction in SOAE magnitude and elevation in SOAE frequency due to CAS, but also an adaptation following the shift at CAS onset for both SOAE magnitude and frequency (Mott et al. 1989; Harrison and Burns 1993). Although some of the general trends observed here were also observed in previous reports, the time course of these events was not fully characterized due to the coarse temporal resolution of the analysis tools employed. Smurzynski et al. (2002) overcame the challenge of achieving fine temporal as well as spectral resolution by using a modified Wigner–Ville distribution, a member of Cohen’s class of distribution functions, to examine changes in SOAE magnitude and frequency. They reported the onset time constants of fitted one-exponential curves for both magnitude and frequency changes of SOAEs due to broadband CAS at 65 dB SPL. Note the general agreement of these data with those of the current study in the left column of Figure 10. Here, we extended the findings of the previous work by characterizing the time course of all stages of CAS-induced changes in SOAE magnitude and frequency. In particular, we have identified an overshoot after CAS offset for both SOAE magnitude and frequency. Furthermore, by examining CAS-induced changes over a range of CAS levels, we have attempted to document the growth pattern of changes in each stage.

The CAS-induced reduction in OAE magnitude is observed in other OAE types, such as DPOAEs (Liberman et al. 1996), SFOAEs (Guinan et al. 2003), and TEOAEs (Liberman 1989; Collet et al. 1990). The changes observed in OAEs due to CAS are arguably mediated via the MOC which attenuates the gain of the cochlear amplifier. This reduction in the gain of the cochlear amplifier is directly reflected in the reduction of basilar membrane vibration due to MOC activation (Murugasu and Russell 1996).

The mechanism underlying the CAS-induced elevation in SOAE frequency has not been extensively discussed in the literature. Mott et al. (1989) proposed that the MOC activity increases OHC membrane conductance thereby changing the phase of the feedback force provided by OHCs and altering the intrinsic tuning of the emission generator. It should be noted that the OHCs themselves were not required to be tuned in the Mott et al. (1989) model as they serve merely as the power source of the SOAE generator. In another model of SOAE generation based on cochlear standing-waves, SOAEs are generated by a population of OHCs, and gain sufficient magnitude to escape the cochlea due to appropriate phase accumulation following multiple reflections between the region of generation and the cochlear boundary with the middle ear (Shera 2003). Tuning of the emission generator thus depends on the round-trip phase accumulation of the cochlear traveling wave. The so-called fast effect of MOC activity has been shown to cause a phase lead in basilar membrane vibration (Murugasu and Russell 1996; Cooper and Guinan 2003). Such a phase change would move the ‘in-phase’ coherent reflection region in Shera’s cochlear “resonant cavity” model towards the basal end, hence increasing SOAE frequencies.

Adaptation during the presence of noise

We observed a partial release (adaptation) from the initial change in SOAE magnitude and frequency after CAS onset. Such adaptation has been reported previously (Mott et al. 1989; Harrison and Burns 1993), but its exact time course has not been documented due to limitations of analysis techniques. Similar adaptation patterns due to MOC activation, while on different time scales, have been reported in both compound action potential (CAP) and basilar membrane vibration recordings in experimental animals (Sridhar et al. 1995; Cooper and Guinan 2003).

The term adaptation has been used in connection with DPOAEs and the MOC with a different meaning. The magnitude of DPOAEs gradually decreases after stimulus onset even in the absence of CAS (Liberman et al. 1996; Kim et al. 2001; Kujawa and Liberman 2001; Relkin et al. 2005). In contrast, the term adaptation is used in this paper to signify a release from MOC-induced changes in SOAEs.

Decay of the MEM reflex could be suspected as the cause of the SOAE adaptation observed during CAS (Mott et al. 1989). However, in agreement with Mott et al. (1989), we discount the role of MEM reflex decay because adaptation was observed well below our best estimates of the MEM reflex threshold.

Another postulation is that the observed SOAE adaptation is synaptic in origin. As is well known, the main neurotransmitter of the MOC pathway is acetylcholine (Eybalin 1993). The receptor of acetylcholine in the auditory efferent system is a special nicotinic acetylcholine receptor composed of α9 and α10 subunits (Elgoyhen et al. 1994; Elgoyhen et al. 2001). Continuous activation of the MOC efferent may lead to depression of neurotransmitter release due to a depletion of presynaptic vesicles that are ready to be released, thereby causing a reduction in postsynaptic current amplitude and weakened synaptic strength (Zucker and Regehr 2002).

Yet another contributor could be the desensitization of the postsynaptic nicotinic acetylcholine receptors (Elgoyhen et al. 1994; Elgoyhen et al. 2001). Elgoyhen et al. (2001) expressed α9/α10 receptors in Xenopus oocytes and found an immediate decrease of the recorded current in the presence of continuous extracellular acetylcholine.

A similar pattern of adaptation is also observed in experiments where SOAEs are suppressed by an ipsilateral external tone (Murphy et al. 1995). However, these changes occur on a much faster time scale and are arguably due to the mechanical interplay between the target SOAE, the external tone, and another SOAE in frequency proximity.

We fitted one-exponential curves to the adaptation traces (see Figs. 6, 8, and 10). However, no consistent trends in the time constant of adaptation were observed as a function of CAS level.

Overshoot at the offset of contralateral stimulation

We have reported an overshoot for both SOAE magnitude and frequency at the offset of contralateral noise. The overshoot in SOAE magnitude reported in Murphy et al. (1995) at the offset of an external tone in the ipsilateral ear was explained by the suppression from a neighboring higher-frequency SOAE which responds more slowly to the external tone. This model of differential responses of two neighboring SOAEs cannot explain our results where changes in SOAE were evoked by contralateral stimulation. We observed prominent overshoot in both SOAE magnitude and frequency in subjects where no SOAEs of any significance were measurable near the target SOAE. We postulate that other mechanisms account for the SOAE overshoot observed in this study.

Following exposure to a loud ipsilateral low-frequency tone, a bounce in the recovery of hearing thresholds, as well as a concomitant bounce in otoacoustic emissions and cochlear microphonics, has been reported previously in the literature (Hirsh and Ward 1952; Kemp 1986; Kirk et al. 1997; Kirk and Patuzzi 1997; Kevanishvili et al. 2006). This bounce, on a time scale of minutes, has been identified in different types of OAEs, including SOAEs, TEOAEs, and DPOAEs. Kirk et al. (1997) suggested that the shift of the operating point on the OHC transfer function, rather than neural activity, is responsible for this bounce. However, it is uncertain whether this mechanism accounts for the bounce phenomenon reported here, which is on a much faster time scale of seconds.

Similar overshoots have been observed in SFOAEs, TEOAEs as well as the slow MOC effect on CAP and sound-evoked basilar membrane vibration, following either CAS or electrical stimulation of the MOC efferents (Sridhar et al. 1995; Cooper and Guinan 2003; Backus and Guinan 2006; Zyl et al. 2009). The overshoot in CAP, sound-evoked basilar membrane vibration and TEOAEs, which builds up and decays over hundreds of seconds, is significantly different from the overshoot observed in SOAEs and SFOAEs, which builds up and decays in a few seconds.

Time constants compared with previous studies

We fitted various stages of the CAS-induced changes in SOAE with one-exponential functions and found great variation across subjects and CAS levels (44 ms < τ onset < 782 ms, 10.3 s < τ adaptation < 30.7 s, 52 ms < τ offset < 299 ms, 1.9 s < τ recovery < 17.5 s). These values of τ onset are comparable to those reported by Smurzynski et al. (2002). Unfortunately, they did not document the time constants of other stages of SOAE shift. Nonetheless, comparison of our data with changes in evoked OAEs is possible. The time course of DPOAE onset adaptation (without CAS) fitted with a two-exponential function has been reported in the literature (Kim et al. 2001; Bassim et al. 2003). The median values of the fast and slow time constants were 69 ms and 1.51 s in Kim et al. (2001), and 72.9 ms and 2.1 s in Bassim et al. (2003). The onset and offset time constants of DPOAE suppression by contralateral noise were similar to those of DPOAE onset adaptation (Bassim et al. 2003). These fast time constants are comparable to the faster subset of τ onset and τ offset values reported here for SOAEs. Unlike our results, no DPOAE adaptation (gradual increase in amplitude) during CAS or overshoot after CAS offset was observed in either the Kim et al. (2001) or the Bassim et al. (2003) report. Backus and Guinan (2006) have reported onset and offset time constants of 277 ± 62 ms and 159 ± 54 ms, respectively, for changes in SFOAE level with CAS. These values overlap with the slower subset of onset and offset time constants reported here for SOAEs. Additionally, Backus and Guinan (2006) identified an SFOAE overshoot following CAS but not adaptation during CAS.
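The comparisons above can be restated as a simple containment check. The Python sketch below uses only the values quoted in this section; the variable and function names are hypothetical, and treating mean ± SD as an interval is a simplifying assumption made for illustration.

```python
# SOAE time-constant ranges from the current study (ms).
soae_range_ms = {"onset": (44.0, 782.0), "offset": (52.0, 299.0)}

# SFOAE time constants from Backus and Guinan (2006), mean and SD (ms).
sfoae_mean_sd_ms = {"onset": (277.0, 62.0), "offset": (159.0, 54.0)}

# Median fast time constants of DPOAE onset adaptation (ms).
dpoae_fast_ms = {"Kim et al. (2001)": 69.0, "Bassim et al. (2003)": 72.9}

def interval_inside(inner, outer):
    """True if the interval inner = (lo, hi) lies entirely within outer."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]

# Does mean +/- SD of the SFOAE values fall inside the SOAE ranges?
sfoae_inside = {}
for stage, (mean, sd) in sfoae_mean_sd_ms.items():
    sfoae_inside[stage] = interval_inside((mean - sd, mean + sd),
                                          soae_range_ms[stage])

# Do the DPOAE fast time constants fall inside both SOAE ranges?
dpoae_inside = {
    study: all(lo <= tau <= hi for lo, hi in soae_range_ms.values())
    for study, tau in dpoae_fast_ms.items()
}
```

This check only confirms that the published values lie within the ranges observed here; it says nothing about how the individual SOAE time constants are distributed within those ranges.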

Influence of MEM reflex

The possible influence of the MEM reflex in experiments investigating MOC functions or OAEs has been a source of concern. Using an alternative approach, we were able to detect the MEM reflex at lower stimulation levels than those detectable using a commercial impedance audiometer. Subject 4 had an MEM reflex threshold at 50 dB SPL, and data from this subject are clearly separated from the remaining data pool (Fig. 9, top row). The remaining subjects had MEM reflex thresholds around 70 dB SPL, just below our highest CAS level. We did not observe any dramatic change of slope in the growth functions of magnitude and frequency shift or overshoot (Fig. 9) as CAS increased beyond the supposed MEM reflex threshold in these subjects. Middle-ear muscle activation has been associated with a dramatic change of slope in some studies (Guinan et al. 2003) but not others (Sun 2008a). The lack of dramatic slope change in our results could mean that all effects shown here are dominated by either the MEM reflex or the MOC reflex, or that these two effects are qualitatively continuous.

The MEM reflex threshold measured with our method ranged from 50 to 70 dB SPL. These results were consistently lower than those obtained with the commercial middle-ear analyzer (Interacoustics Audio Traveller AA220). Two previous papers have developed SFOAE-based techniques to measure the MEM reflex threshold (Guinan et al. 2003; Goodman and Keefe 2006). In both of these publications, the lowest elicitor level shown to evoke the MEM reflex was between 55 and 60 dB SPL. A comparison between laboratory methods is perhaps premature given the small sample sizes of these reports, including ours. However, these laboratory methods are unequivocally more sensitive to the MEM reflex than a clinical middle-ear analyzer.