Introduction

Changes in human cochlear function throughout the lifespan have not been thoroughly characterized. Here, we examine cochlear function beginning in the neonatal period and continuing through senescence, and speculate as to the mechanistic underpinnings of lifetime changes in peripheral auditory system function. In developing the framework presented here, maturation and aging are viewed as a continuum to the extent possible. The value of this approach is in the maintenance of a continuous thread between early and late changes in peripheral function while elucidating patterns unique to each epoch. While distortion product otoacoustic emissions (DPOAEs) have previously been applied as a window into cochlear maturation (e.g., Abdala 1998, 2000; Abdala et al. 2007) and, to a lesser extent, aging (e.g., Lonsbury-Martin et al. 1991; Dorn et al. 1998), the present study included an age continuum spanning nearly eight decades, considered current OAE generation theory and evaluated independent changes in magnitude and phase of dual DPOAE components.

Maturation

A substantive body of work has described cochlear function in the human neonate, most of it applying OAE methodology; spontaneous OAEs (SOAEs) are present in infants with at least adultlike prevalence, are more numerous per ear, and are observed at higher frequencies compared to adults (Strickland et al. 1985; Burns et al. 1992; Morlet et al. 1995; Abdala 1996). The prevalence of evoked OAEs is also adultlike in infants (Bonfils et al. 1989). However, click-evoked (CE) and stimulus frequency (SF) OAEs, both classified as reflection-source emissions (Shera and Guinan 1999), are higher in level in infant ears and throughout childhood (Norton and Widen 1990; Prieve 1992; Kalluri et al. 2011). Consistent with this finding, the reflection component of the DPOAE (i.e., the energy backscattered from 2f1−f2) is more robust in newborn than adult ears (Abdala and Dhar 2010).

Distortion emissions are non-adultlike in infants. DPOAE levels tend to be slightly higher in newborns for low- and mid-frequency tones (Lasky et al. 1992; Smurzynski et al. 1993), and newborn input/output functions show saturation at elevated primary tone levels. Both of these observations can be accounted for by conductive factors: inefficient forward transmission through the neonatal middle ear and a functional boost in OAE level during reverse transmission due to smaller neonatal ear canal area (Abdala and Keefe 2006; Keefe and Abdala 2007). OAE indices of cochlear tuning have also suggested immaturities in frequency resolution (Brown et al. 1995; Abdala 1998). Infant DPOAE ipsilateral suppression tuning curves at f2 = 1.5 and 6 kHz are sharper than those measured in adults through at least 6 months of age. The non-adultlike suppression tuning at 6 kHz seems to be accounted for by middle ear inefficiencies combined with standing waves in the adult ear canal (Keefe and Abdala 2011); however, age differences in DPOAE suppression tuning at 1.5 kHz are not as easily resolved because standing waves are not intrusive in this frequency range.

Most recently, studies of DPOAE phase in newborns reported immaturities at low frequencies (Abdala et al. 2011a). The DPOAE phase-gradient delay is constant over much of the frequency range when the frequency ratio between the two primaries (f2/f1) is fixed. Phase invariance under these measurement conditions is thought to reflect the approximate scaling invariance of cochlear mechanics. DPOAE phase deviates from invariance in the apical half of the mammalian cochlea, signaling a break in cochlear scaling (Shera et al. 2000; Shera and Guinan 2003; Dhar et al. 2011). While these recent papers have specifically focused on DPOAE phase characteristics in the apical half of the cochlea, interest in exploring the relationship between DPOAE phase and cochlear mechanics is not new (e.g., Brown et al. 1994; Kimberley et al. 1993). Our recent work indicated that the deviation from phase invariance in the apical cochlea is exaggerated in newborns, suggesting a possible immaturity in basilar membrane motion at birth (Abdala et al. 2011b). Middle ear inefficiencies cannot easily explain the age differences in apical phase because DPOAE phase is relatively insensitive to the stimulus level (Abdala et al. 2011a).

Aging

In the latter decades of life, a functional decline is evident mainly exemplified by increasing hearing thresholds, deterioration in signal perception (especially speech-in-noise), and degradation of temporal and spectral processing (see Gordon-Salant 2005 for a review). There is some evidence that this degradation begins in the pre-senescent auditory system (Grose et al. 2006). The cochlear contribution to this decline is likely rooted in the loss of sensory cells, strial degeneration (and associated changes in the endocochlear potential), and the loss of spiral ganglion neurons (Schuknecht 1955). This original system of classification has maintained support and has been extended in recent years (e.g., Ohlemiller 2004). Some groups have emphasized the role of a degraded endocochlear potential as the primary trigger for various aging-related changes in the cochlea (Schmiedt et al. 2002; Mills and Schmiedt 2004; Lang et al. 2010). Others have observed aging-related neural degeneration in the absence of substantive cochlear hair cell loss in human temporal bones (Makary et al. 2011).

OAEs are affected by aging. Fewer SOAEs and lower CEOAE levels have been observed in aging ears (Collet et al. 1989; Kuroda 2007) as well as reduced DPOAE amplitude (Lonsbury-Martin et al. 1991; Dorn et al. 1998). At least one report failed to see any age effect on DPOAE fine structure (He and Schmiedt 1996). The strong co-variation of hearing threshold and age has posed a vexing problem in determining aging influences on cochlear function in humans (e.g., Oeken et al. 2000; Uchida et al. 2008; Hoth et al. 2010). When strict control for audiometric threshold has been employed in aged adults, the isolated effects of aging on the OAE have been difficult to observe (Stover and Norton 1993).

DPOAE components

Models of OAE generation have seen significant development over the last decade, with the consensus view espousing distinct generation mechanisms for different emission types (Talmadge et al. 1998; Shera and Guinan 1999). This model is supported by a decade of experimental results (Talmadge et al. 1999; Knight and Kemp 2000, 2001; Kalluri and Shera 2001, 2007). The DPOAE measured in the ear canal with moderate level stimulus tones is understood to be a mixed response including a nonlinear distortion component generated at the overlap of the traveling waves evoked by the primary tones, f1 and f2, and a component produced by backscattering of wavelets due to coherent reflection from impedance perturbations along the cochlear partition (Zweig and Shera 1995). The DPOAE measured in the ear canal is a vector sum of these two components, which are distinguishable by their signature phase behavior. The phase of the distortion component remains approximately invariant over much of the frequency range. In contrast, the phase of the DPOAE reflection component changes substantially as a function of frequency, resembling the phase behavior of SFOAEs. The interaction between these two DPOAE components results in a semi-periodic pattern of peaks and valleys, i.e., fine structure, measured at the ear canal microphone.

The impact of pathology, natural variability, maturation, and aging on DPOAE components and their generation mechanisms has not been well defined. Existing evidence suggests that the reflection component is more vulnerable to ototoxic insult (Rao and Long 2011) and is preferentially modulated by the medial olivocochlear system (Abdala et al. 2009; Deeter et al. 2009; Henin et al. 2011). DPOAE fine structure is altered by hazardous noise exposure, also indicative of distinct component behavior (Engdahl and Kemp 1996). Maturation has shown effects on fine structure, though less consistently: newborns generally show deeper fine structure and narrower spacing (Dhar and Abdala 2007; Abdala and Dhar 2010). The present study examines the independent maturation and aging trajectories of DPOAE distortion and reflection components and the cochlear properties they elucidate, from birth through senescence.

Methods

Subjects

Subjects included 156 individuals in seven age groups: premature newborn, term newborn, older infant, teen, young adult, middle-aged adult, and older adult. Subject information is presented in Table 1. With the exception of the older-infant group in which right ears were exclusively tested, test ear was chosen in a pseudo-random fashion unless one ear had markedly higher DPOAE levels. The infant age groups were chosen to cover the developmental period most likely to show residual cochlear immaturities. Studies of post-mortem fetal tissue suggest that cochlear morphology is nearly adultlike around the perinatal period (Pujol et al. 1998); therefore, testing older children would serve to track the longer-lasting middle ear immaturities, which was not the aim of this study.

TABLE 1 Demographic information and numbers of subjects in each age group

Average audiometric thresholds between 0.25 and 8 kHz ±1 standard deviation are displayed for each non-infant age group in Figure 1A. Error bars are offset for visualization purposes. Figure 1B displays individual hearing thresholds from the older adult group. Eight of the elderly subjects had some thresholds that fell outside of the normal range (>25 dB HL). For the purposes of this study, we did not isolate aging from associated threshold elevation. We considered aging in older adults as a composite variable inclusive of related presbycusis and other non-controlled factors such as lifetime exposure to ubiquitous noise and common ototoxins.

FIG. 1.
A Mean audiometric thresholds for teen, young adult, middle aged, and older adult age groups. Error bars represent ±1 standard deviation of the mean and are offset for visualization purposes. B Individual audiometric thresholds (gray) for older adults with mean superimposed.

Protocol

Newborns were tested at the Infant Auditory Research Laboratory within the University of Southern California (USC)–Los Angeles County Medical Center in the Neonatology Unit. All newborns passed a click-evoked auditory brainstem response screening conducted at 35 dB nHL. Following informed consent, newborns were transported from the mother's room or the Neonatal Intensive Care Unit to the lab for data collection. OAEs were measured in a sound-attenuating isolette (Eckels ABC-100) providing between 25 and 40 dB of attenuation. Infant testing was always conducted in pairs; one tester attended to the infant throughout the entire test, watching for movement and probe slippage while the other tester implemented the data collection program. When movement or noise was detected, a sweep was manually stopped, rejected in its entirety, and re-initiated. DPOAE testing of older infants was conducted at a satellite laboratory at the University of Washington (UW). Their middle ear function was evaluated with a GSI Tympstar middle ear analyzer, and hearing was screened with click-evoked OAEs on the day of test. Data were collected during natural sleep within a double-walled audiometric test booth.

All data from teen and middle-aged subjects were collected at the House Research Institute (HRI), whereas data from young adult and elderly subjects were collected at either HRI or Northwestern University (NU). Laboratories across centers were comparable in hardware and software, having been set up by one of the two authors of this report. Similarly, protocols were matched across labs, and all testers received the same training. Data analysis was conducted at HRI. Teen and adult subjects were awake during testing and seated comfortably in a padded armchair within an IAC sound-attenuated double-walled booth. Hearing thresholds were established via air conduction using a standard Hughson–Westlake audiometric procedure between 0.25 and 8 kHz, followed by a tympanogram.

Primary tones, f1 and f2, were presented at 65 and 55 dB SPL (L1 and L2, respectively) with a fixed f2/f1 ratio of 1.22. Tones were logarithmically swept upward in frequency at 8 s per octave between 0.5 and 4 kHz for a total of 24 s/sweep with a 2-s interval between sweeps (software developed by C. Talmadge). Sixteen sweeps were recorded for all teen/adult subjects and in a small subset of newborns. The sweeps were averaged into two separate buffers of eight sweeps each. Comparison between these two averages provided an index of test–retest reliability for all but the older infants, for whom it was not possible to record additional sweeps.
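The sweep timing above can be checked with a short sketch. It assumes that the 0.5–4 kHz range refers to f2 and that the distortion product of interest is 2f1−f2; the function and constant names are illustrative and are not taken from the acquisition software.

```python
import math

F2_START_HZ = 500.0        # assumed: the swept range is expressed in f2
F2_END_HZ = 4000.0
SECONDS_PER_OCTAVE = 8.0   # sweep rate stated in the protocol
F2_F1_RATIO = 1.22         # fixed primary-tone ratio stated in the protocol


def sweep_duration_s(f_start=F2_START_HZ, f_end=F2_END_HZ,
                     rate=SECONDS_PER_OCTAVE):
    """Duration of one logarithmic sweep: octaves spanned times seconds per octave."""
    return math.log2(f_end / f_start) * rate


def frequencies_at(t_s):
    """Return (f1, f2, fdp) in Hz at time t_s seconds into an upward log sweep."""
    f2 = F2_START_HZ * 2.0 ** (t_s / SECONDS_PER_OCTAVE)
    f1 = f2 / F2_F1_RATIO
    fdp = 2.0 * f1 - f2    # cubic distortion product frequency
    return f1, f2, fdp
```

Three octaves at 8 s per octave give the 24-s sweep reported above.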

Signal processing and instrumentation

DPOAEs were recorded using a Macintosh laptop controlling a MOTU 828 Mk II audio device (44.1 kHz, 24 bit). The output of the MOTU was appropriately amplified and fed to either MB Quartz 13.01 HX drivers (NU) or Etymotic Research-ER2 tube phones (HRI, UW). The output of the drivers was coupled to the subjects' ears through the sound tubes of an Etymotic Research ER10B+ probe microphone assembly.

DPOAE level and phase estimates were obtained using a least-squares-fit (LSF) algorithm every 2–3 Hz at low frequencies (0.5–1 kHz frequency range) and every 9–12 Hz at high frequencies (3–4 kHz range) for a total of ~550 individual data points. In this implementation of the LSF technique, models for the stimulus tones and DPOAE of interest are created. Signal components are then fitted to these models to minimize the sum of squared errors between the model and the data (Long et al. 2008). The noise floor was similarly estimated after phase-inverting every alternate sweep window. The phase was unwrapped by sequentially subtracting 360° from all points beyond identifiable discontinuities. The final estimate of DPOAE phase was computed by subtracting 2ϕ1−ϕ2 (where ϕ1 and ϕ2 are the phases of f1 and f2) from ϕdp (the extracted phase at 2f1−f2). Data were treated with a cleaning regimen prior to analysis. The median of every three consecutive points was calculated and compared to the noise estimate at the corresponding frequency to determine SNR. If SNR was <6 dB, the data point was eliminated. The use of a three-point median reference to evaluate SNR was implemented to avoid, when possible, eliminating points at level minima, which would prevent accurate quantification of DPOAE fine structure depth and spacing.
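The cleaning step can be illustrated with a minimal sketch. It assumes a centered three-point window (the point and its two neighbors) and a shortened two-point window at the ends of the record; the text does not specify either choice, and the function name is hypothetical.

```python
from statistics import median


def clean_by_median_snr(level_db, noise_db, min_snr_db=6.0):
    """Return the indices of data points retained by the SNR criterion.

    Each point is judged by the median level of a three-point window
    centered on it (an assumption), not by its own level, so an isolated
    fine-structure minimum flanked by robust points is not discarded.
    """
    kept = []
    n = len(level_db)
    for i in range(n):
        window = level_db[max(0, i - 1):min(n, i + 2)]
        snr_db = median(window) - noise_db[i]
        if snr_db >= min_snr_db:
            kept.append(i)
    return kept
```

With levels of 10, 10, 2, 10, and 10 dB over a 0-dB noise floor, the 2-dB minimum is retained because the median of its window is 10 dB, which is the behavior the three-point reference was intended to produce.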

Calibration

Calibrated stimuli were delivered to each subject after compensating for the depth of probe insertion (Lee et al. 2012). The depth of probe insertion was determined using the half-wave resonance peak, after the ear canal response to a slow chirp was normalized to that recorded in a 50-ft. long copper plumbing tube approximating the diameter of an average adult ear canal. The spectrum recorded in the essentially reflection-free tube represents the frequency response of the sound source. The spectrum recorded in the ear canal represents the frequency response of the sound source and ear canal. Thus, the normalization process reveals the frequency response of the ear canal for any given insertion and allows an accurate estimate of the first half-wave resonance and, hence, of insertion depth. The pressure response recorded for a matched insertion depth in a Bruel & Kjaer 4157 ear simulator (IEC 60318-4) was used to compensate the frequency response of the sound sources and to approximate the desired SPL across frequency at the tympanic membrane. In newborns, a priori measurement of the half-wave resonance in 20 infant ears provided the reference (9.6 kHz); corresponding depth insertion compensation was applied equally to all neonates.
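As a rough illustration of how a half-wave resonance maps to a length, the sketch below treats the residual ear canal between the probe and the tympanic membrane as a uniform tube closed at both ends, for which the first half-wave resonance occurs at f = c/(2L). The tube model, the speed of sound, and the function name are simplifying assumptions and do not reproduce the compensation procedure of Lee et al. (2012).

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # assumed value for air; not stated in the text


def half_wave_length_mm(resonance_hz, c=SPEED_OF_SOUND_M_PER_S):
    """Tube length in mm whose first half-wave resonance falls at resonance_hz.

    Assumes an idealized uniform tube closed at both ends: L = c / (2 f).
    """
    return c / (2.0 * resonance_hz) * 1000.0
```

Under these assumptions, the 9.6-kHz neonatal reference corresponds to a residual length of roughly 18 mm.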

Analysis

Mixed DPOAE

The approximately 500 DPOAE level values were binned for analyses and displayed by calculating the mean of all data points falling within third-octave intervals at center frequencies of 561, 707, 891, 1122, 1414, 1781, 2244, 2828, and 3563 Hz. DPOAE fine structure was quantified with a custom algorithm (Abdala and Dhar 2010). Maxima and minima were identified based on the first and second derivatives of the DPOAE level function and the relationship between them. Data points where the first derivative was equal to zero were identified as extrema and then further classified as a maximum or minimum based on the second derivative being negative or positive, respectively. Fine-structure depth for each period was computed as 20 × log10(Pmax/Pavg(min)), where Pmax is the DPOAE amplitude at a maximum, and Pavg(min) is the average DPOAE amplitude of the preceding and following minima. Fine structure periods with a depth of 2.5 dB or greater were accepted. The frequency spacing between adjacent minima was normalized to the center frequency of the fine structure period by computing f/Δf, where f is the geometric mean between two adjacent minima, and Δf is the frequency separation between them. Only spacing ratio estimates <25 were accepted. Estimates of fine structure prevalence, spacing, and depth were also averaged into third-octave intervals. (Note: The fine structure features of the older infant group were not analyzed because there were too few oscillations with adequate SNR available in several of the third-octave intervals.)
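The depth and spacing definitions above can be written out directly. The sketch below assumes that amplitudes are supplied in linear pressure units and that the extrema have already been located; the function names are illustrative and do not come from the custom algorithm.

```python
import math


def fine_structure_depth_db(p_max, p_min_before, p_min_after):
    """Depth of one period: 20*log10(Pmax / Pavg(min)), amplitudes in linear units."""
    p_avg_min = 0.5 * (p_min_before + p_min_after)
    return 20.0 * math.log10(p_max / p_avg_min)


def spacing_ratio(f_min_lo_hz, f_min_hi_hz):
    """Normalized spacing f/Δf between two adjacent minima."""
    f_center = math.sqrt(f_min_lo_hz * f_min_hi_hz)  # geometric mean
    delta_f = f_min_hi_hz - f_min_lo_hz
    return f_center / delta_f


def accept_period(depth_db, ratio, min_depth_db=2.5, max_ratio=25.0):
    """Apply the acceptance criteria stated in the text."""
    return depth_db >= min_depth_db and ratio < max_ratio
```

For example, a maximum with twice the amplitude of its flanking minima has a depth of about 6 dB, and minima at 1000 and 1080 Hz give a spacing ratio of about 13.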

The DPOAE phase measured at the microphone in the ear canal was quantified in two ways: (1) phase-gradient delays were calculated as the negative of the slope of phase and plotted as a function of third-octave center frequency, and (2) individual DPOAE phase-frequency functions were fit with a one-knot spline model (SPSS ver 18.0). Spline modeling approximates a curvilinear relationship with a series of linear fits. The spline model looks for junctions in the data set that indicate significant change and labels them as knots. The location of the knot is determined computationally by the best fit to the data and not specified a priori. The spline analysis was used to determine the frequency at which DPOAE phase transitions from invariant to sloping, denoting the putative apical-basal cochlear demarcation and signaling a break from scaling symmetry (Abdala et al. 2011b; Dhar et al. 2011). The spline model also provided estimates of phase slope (cycles per kilohertz) above and below the break frequency.
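The phase-gradient delay calculation can be sketched as follows. The sketch assumes unwrapped phase in cycles and frequency in Hz, so that the negative slope is a delay in seconds, and it estimates the slope by an ordinary least-squares line through the points in a bin; the fitting method and the function name are assumptions.

```python
def phase_gradient_delay_ms(freq_hz, phase_cycles):
    """Negative slope of unwrapped phase (cycles) versus frequency (Hz), in ms."""
    n = len(freq_hz)
    mean_f = sum(freq_hz) / n
    mean_p = sum(phase_cycles) / n
    sxy = sum((f - mean_f) * (p - mean_p)
              for f, p in zip(freq_hz, phase_cycles))
    sxx = sum((f - mean_f) ** 2 for f in freq_hz)
    slope_cycles_per_hz = sxy / sxx
    return -slope_cycles_per_hz * 1000.0  # seconds to milliseconds
```

A phase that falls by 0.2 cycles for every 100 Hz therefore corresponds to a delay of 2 ms, whereas a flat phase-frequency function yields a delay of zero.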

Unmixed DPOAE: components

MATLAB-based software (C. Talmadge and P. Luo) was used to separate the DPOAE distortion- and reflection-source components based on their respective phase-gradient delays. During inverse fast Fourier transformation (IFFT), the DPOAE complex pressure measured in the frequency domain was multiplied by a moving Hann window in overlapping 50-Hz steps. The length of the Hann window was adjusted on a logarithmic scale in close approximation to the cochlear frequency map (Greenwood 1990) and ranged from 400 to 930 Hz. Rectangular time-domain filters were applied to each window to extract the target DPOAE component. A search range of −2 to 10 ms was applied to window the short-latency (distortion) component and 3–15 ms for the long-latency (reflection) component. The time-domain filters were centered around the maximum in each window and were 6.25 ms in duration at the lowest frequencies, 3.23 ms at the highest. The filtered windows of data were then transformed back to the frequency domain by FFT and the level and phase of the distortion and reflection components reconstructed. Data segments equal to half of the length of the analysis window were eliminated at low- and high-frequency boundaries to remove edge effects inherent in the time-windowing process.

IFFT-derived magnitude for distortion and reflection components was averaged into third-octave bins. Additionally, a difference score was calculated by subtracting reflection from distortion levels, to provide a metric of the relationship between components. Although the low-level reflection component is vulnerable to noise intrusion, frequency-by-frequency comparison of component magnitude or phase with the original noise floor recorded in the ear canal is not appropriate (e.g., Knight and Kemp 2001). The following steps were taken to mitigate the effects of noise on the reflection component: (1) Mean noise floor from the original ear canal recording was calculated for each age group separately (NF1–7, where 1–7 = age group); (2) The grand mean noise floor was calculated for low- (<1.5 kHz, NFg,lo) and high-frequency segments (>1.5 kHz, NFg,hi); and (3) Any component magnitude (P1–7) where P[1–7] ≤ NF[1–7] was considered a no response and assigned a value equal to NFg,[lo,hi].
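Step (3) of this procedure can be sketched as below. The sketch assumes that the group mean noise floor is available at each analysis frequency and assigns a value falling exactly at 1.5 kHz to the high-frequency segment; both choices, and the function name, are illustrative.

```python
def substitute_no_response(levels_db, freqs_hz, group_noise_db,
                           grand_noise_lo_db, grand_noise_hi_db,
                           split_hz=1500.0):
    """Replace component levels at or below the group noise floor.

    A level that does not exceed its group's mean noise floor is treated
    as a no response and set to the grand mean noise floor of its
    frequency segment (low or high), as described in the text.
    """
    out = []
    for level, f, nf in zip(levels_db, freqs_hz, group_noise_db):
        if level <= nf:
            out.append(grand_noise_lo_db if f < split_hz else grand_noise_hi_db)
        else:
            out.append(level)
    return out
```

Substituting a floor value rather than deleting the point keeps the lowest-level observations in the mean.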

This process was implemented to reduce bias on reflection-component level estimates. If reflection level values with inadequate SNR had simply been eliminated, mean magnitude estimates would have been artificially elevated because the lowest level exemplars would have been deleted. As the presence of no response was most common for the reflection component and the noisiest groups (i.e., infant and elderly), it follows that these conditions would have been uniquely biased toward higher reflection levels. For these reasons, the no response was assigned a value that would impact measures of central tendency appropriately and eliminate this bias.

The group distortion- and reflection-component phase-frequency functions were fit with locally linear loess trend lines to capture global phase trends (Cleveland 1993). Loess is a form of locally weighted scatter plot smoothing that is a modern version of classical linear and nonlinear least squares regression. Simple local models of linear and nonlinear least squares regression are fitted to localized subsets of the data, and adjacent fits are joined to create the overall fit. The loess fit essentially gives the deterministic portion of the variation in a data set without having to make any presumptions about the global fitting model a priori. To reduce non-meaningful phase variance, functions with phase values that were not within ±0.5 cycles of zero at 0.7 kHz were shifted up or down by one cycle.

Estimates of phase slope for the low-level reflection component were also susceptible to the effects of noise and could be inadvertently influenced by the initial cleaning routine, which can create gaps in the function. These gaps could bias phase slope estimates and produce unwrapping ambiguities. Therefore, loess trend lines were fit to non-cleaned component phase data only. Phase was determined to be reliable by the following process: Component phase was re-calculated using an alternative sequence of frequencies. If the two estimates (the original and recalculated phase-frequency function) replicated, the function was considered reliable. If the phase was adversely impacted by noise, the two functions diverged, in which case either the entire phase-frequency function was eliminated or the start frequency was modified. Phase-frequency functions from one newborn, one older infant, one young adult, two middle-aged adults, and three older adults were eliminated using this reliability probe, resulting in a total of eight discarded phase functions (out of 156). Discontinuities in reflection-component phase remained in some subject data, but these were typically associated with deep minima in DPOAE level fine structure; they included half-cycle shifts that were randomly upward or downward, exerting little bias on the overall trend. Reflection-component phase-frequency functions were also quantified with a phase accumulation index derived by subtracting the initial phase at 0.7 kHz from phase at 3.6 kHz. This is a non-frequency specific metric that gauges the total accumulation of phase cycles across the entire frequency range.
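The phase accumulation index amounts to a single subtraction on the unwrapped phase-frequency function. The sketch below assumes phase in cycles and takes the sampled points nearest to the two reference frequencies; the nearest-point lookup and the function names are assumptions.

```python
def nearest_index(freqs_hz, target_hz):
    """Index of the sampled frequency closest to target_hz."""
    return min(range(len(freqs_hz)), key=lambda i: abs(freqs_hz[i] - target_hz))


def phase_accumulation_cycles(freqs_hz, phase_cycles,
                              f_lo_hz=700.0, f_hi_hz=3600.0):
    """Unwrapped phase at the upper reference minus phase at the lower one."""
    i_lo = nearest_index(freqs_hz, f_lo_hz)
    i_hi = nearest_index(freqs_hz, f_hi_hz)
    return phase_cycles[i_hi] - phase_cycles[i_lo]
```

A more negative index indicates more total phase accumulated across the range.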

Statistical analyses

A linear mixed effects model was used to test for age and frequency effects on DPOAE indices whenever a repeated factor (frequency) was variably present (SPSS ver. 18.0). When data were present across all frequencies, a repeated measures ANOVA was conducted. If there was a main effect of age, post hoc one-way ANOVAs were conducted separately for low- (<1.5 kHz) and high-frequency (>1.5 kHz) segments. This was followed by pairwise age comparisons (least significant difference t tests); notable age sub-groups or clusters were described. An alpha level of 0.05 was applied to all statistical analyses.

Results

Mixed DPOAE

SNR was calculated by subtracting mean noise floor values from DPOAE level values at each center frequency for every subject. Figure 2 shows individual SNR (gray lines) across frequency, with the group mean (thick line) superimposed. DPOAE SNR summed across frequency was as follows: 20, 25, and 22 dB in the premature newborn, term newborn, and older infant groups, respectively. The mean SNR was 27 dB in both teen and young adult groups, and 26 and 21 dB in middle-aged and older adult groups, respectively. The test–retest reliability of two DPOAE amplitude measures taken within the same session at fine structure peaks was assessed for between 50 % and 70 % of the data in each age group as a gross estimate of response stability. The mean variance between observations and 95 % confidence intervals were as follows: combined newborn group (premature + term) = 0.52 dB (±0.1), teen = 0.32 dB (±0.2), young adult = 0.36 dB (±0.1), middle-aged adult = 0.32 dB (±0.1), and older adult = 0.85 dB (±0.17). DPOAE level estimates within subject varied less than 1 dB for all age groups. Older infants did not have repeated measures available for test–retest analysis.

FIG. 2
DPOAE signal-to-noise ratio (SNR) across frequency for individual subjects in each age group (gray lines) with the mean SNR superimposed; age group is denoted by color and by panel label, A–G.

DPOAE level

A repeated measures ANOVA was used to assess the effects of age on DPOAE level. Main effects of age (f = 107, p < 0.0001) and frequency (f = 37.2, p < 0.0001) were significant as was the interaction (f = 3.7, p < 0.0001). Post hoc ANOVAs confirmed that the age effect was present in both low- (<1.5 kHz) and high-frequency (>1.5 kHz) segments (f = 16.3, p < 0.0001; f = 24.2, p < 0.0001 respectively). As is evident from Figure 3 and elucidated by paired age contrasts, DPOAE level was singularly reduced by between 5 and 15 dB in the older adult group across the entire frequency range (i.e., the older adults showed significantly lower DPOAE levels when paired with all other age groups). Also notable is that older infants exhibited significantly higher DPOAE levels when paired with all other age groups in the high-frequency interval only.

FIG. 3
Mean DPOAE levels averaged into third-octave bins for seven age groups. Age is denoted by color and symbol. Error bars represent 95 % CIs and are offset for easy visualization.

DPOAE fine structure

Figure 4 displays a representative example of DPOAE fine structure for one subject from each age group. A mixed effects model was applied to DPOAE fine structure. Fine structure features were analyzed in six age groups (excluding older infants for reasons noted in the “Methods” section). Older adults and newborns exhibited the highest number of fine structure periods per third octave (mean = 1.5–1.7). There was a significant effect of age (f = 3.44, p = 0.004) and frequency (f = 30.52, p < 0.0001) on the number of fine structure oscillations and an interaction (f = 1.64, p = 0.017). Post hoc age ANOVAs conducted for low- and high-frequency segments separately showed significant age effects on both segments (low: f = 4.48, p = 0.001; high: f = 3.18, p = 0.01). Paired age comparisons indicated that teens had significantly fewer fine structure periods than newborn, middle-aged, and older adult groups across the frequency range. Older adults showed a significantly greater number of fine structure periods per third-octave interval than all age groups for the low-frequency segment. The mean overall spacing ratio for DPOAE fine structure (f/Δf) was 13.1. Spacing did not show a main effect of age, though it became narrower with increasing frequency (f = 69.4, p < 0.0001). The overall mean depth of DPOAE fine structure was 4 dB, but oscillations were as deep as 17 dB in some individuals. Fine structure depth showed no age or frequency effects.

FIG. 4
DPOAE level fine structure from one representative subject for each of seven age groups. DPOAE level was measured at between 400 and 550 frequencies across the three-octave frequency range in frequency intervals ranging from 9 to 12 Hz.

DPOAE phase-gradient delay

The DPOAE phase-gradient delay, displayed in Figure 5 binned into third-octave intervals, was analyzed with a repeated measures ANOVA. Delays beyond two standard deviations of the mean were considered outliers and were eliminated, mostly impacting the 561-Hz center frequency. For most subjects, DPOAE phase delays were measurable with adequate SNR at only eight of the nine center frequencies (707–3563 Hz). Consistent with past reports, at mid-to-high frequencies, the phase-gradient delay was relatively invariant. At low-mid frequencies, phase slope steepened, producing nonzero delays of between 1.5 and 2.5 ms depending on age group. There was a significant effect of age (f = 17.57, p < 0.0001) and frequency (f = 377, p < 0.0001) on DPOAE delay as well as an interaction (f = 3.22, p = 0.001). Post hoc analyses confirmed a significant effect of age in the low-frequency interval only (f = 10.36, p < 0.0001). Pairwise age comparisons showed that the low-frequency phase delay was significantly prolonged in premature and term newborns relative to all other age groups. No other age pairs showed significant differences. As noted in Figure 5, the newborn phase delay was prolonged by ~1 ms compared to young adult delays at the lowest center frequencies.

FIG. 5
Mean DPOAE phase-gradient delay averaged into third-octave bins for seven age groups. Age is denoted by color and symbol. Error bars represent 95 % CIs and are offset for easy visualization.

Each individual phase-frequency function, comprising approximately 400 to 550 data points, was fit with a spline model. The model identified a knot in each function denoting a change in phase trajectory, and calculated the phase slope of segments below and above the knot frequency. These features were tested for age effects with one-way ANOVAs. As shown in Figure 6A, the mean knot or break frequency was centered between 1.4 and 1.6 kHz and was not statistically different among groups. Age effects were noted on low- (<1.5 kHz) and high-frequency (>1.5 kHz) slope of phase (f = 9.91, p < 0.0001; f = 3.25, p = 0.003, respectively). Figure 6B shows the systematic reduction of phase slope (commensurate with a shortening of phase-gradient delay) as a function of increasing age through young adulthood, where it appeared to plateau. Paired age comparisons defined two sub-groups for low-frequency phase slope: (1) The three infant groups were comparable, showing the steepest slope, and (2) Young, middle-aged, and older adult groups were comparable, showing the shallowest low-frequency phase slope. The age effect observed on high-frequency phase slope (Fig. 6C) was not consistent with the phase delay analysis (Fig. 5), and when further analyzed, paired age comparisons did not elucidate a systematic trend.

FIG. 6
Three features were derived from a spline model fit to individual DPOAE phase-frequency functions from all subjects: (A) mean “break” frequency denoting where DPOAE phase delay deviates from invariance, (B) mean phase slope of the low-frequency segment (below the break frequency), and (C) mean phase slope of the high-frequency segment (above break frequency). Age groups: PN, TN = premature and term newborn, respectively; OI = older infant; T = teen; YA = young adult; MA = middle-aged adult; and OA = older adult. Error bars represent 95 % CIs.

The unmixed DPOAE

Component magnitude

As detailed in the “Methods” section, component level estimates lacking adequate SNR were classified as no response (NR) and uniformly assigned a number equal to the grand mean noise floor. Only reflection-component data included instances of NR: 6–7 % of infant data, 2–4 % of the teen and young adult data, and 7 % and 11.8 % of the middle-aged and older-adult data, respectively. Mean distortion- and reflection-component levels as a function of frequency are shown in Figure 7A and B with age parameterized.
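The NR assignment rule can be sketched as follows. The SNR criterion is defined in the Methods and is not restated here, so the 6 dB value below is a placeholder assumption, as are the function name `assign_no_response` and the example levels.

```python
import numpy as np

def assign_no_response(level_db, noise_db, snr_criterion_db, grand_mean_noise_db):
    """Replace component levels with inadequate SNR by the grand mean noise floor."""
    level_db = np.asarray(level_db, dtype=float)
    noise_db = np.asarray(noise_db, dtype=float)
    # A point is "no response" when its level does not exceed the noise by the criterion.
    is_nr = (level_db - noise_db) < snr_criterion_db
    cleaned = np.where(is_nr, grand_mean_noise_db, level_db)
    return cleaned, is_nr

# Hypothetical values in dB SPL (not study data).
levels = [5.0, -18.0, -2.0]
noise = [-20.0, -22.0, -15.0]
cleaned, is_nr = assign_no_response(
    levels, noise, snr_criterion_db=6.0, grand_mean_noise_db=float(np.mean(noise))
)
nr_percent = 100.0 * is_nr.mean()  # share of points classified as NR
```

Keeping the NR points, rather than discarding them, retains every frequency in the average but pulls the mean toward the noise floor in groups with many NR points.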

FIG. 7

IFFT-derived (A) distortion- and (B) reflection-component levels averaged into third-octave bins for seven age groups. Age is denoted by color and symbol. Error bars represent 95 % CIs and are offset for easy visualization.

Mean distortion-component levels ranged from −6 to 20 dB SPL as noted in Figure 7A. The age trends were comparable to the trends previously reported in Figure 3 for ear canal DPOAE level. Repeated measures ANOVAs found a significant effect of age (f = 96.85, p < 0.0001) and frequency (f = 39.44, p < 0.0001) on distortion-component level and an interaction (f = 2.75, p < 0.0001). When separated into two frequency segments, both low- and high-frequency intervals showed age effects (f = 16.04, p < 0.0001; f = 19.06, p < 0.0001, respectively). Paired age contrasts confirmed what is apparent to the eye: (1) Older infants had higher levels of distortion than all other age groups (i.e., all paired contrasts were significant) in the high frequencies, and (2) Distortion levels from older adults were reduced relative to all other age groups across the frequency range.

Mean reflection-component level ranged between −21 and 5 dB SPL. There was a significant effect of age (f = 83.48, p < 0.0001) and frequency (f = 70.52, p < 0.0001) on reflection level and an interaction (f = 2.18, p < 0.0001). When separated, both low- and high-frequency segments showed age effects (low: f = 12.7, p < 0.0001; high: f = 20.9, p < 0.0001). Pairwise age comparisons confirmed that (1) Reflection levels were higher in older infants than in all other age groups for frequencies >1.5 kHz, and (2) Comparably reduced reflection levels were observed for the older adult, middle-aged, and teen groups. These results indicate that the overlap of reflection-component levels among age groups was greater than the overlap of distortion-component levels among groups.

The average component difference is displayed as a function of frequency in Figure 8. There were both age (f = 15.82, p < 0.0001) and frequency (f = 12.06, p < 0.0001) effects on this difference metric but no interaction. Most notable in the paired age comparisons is that the older adult group clustered with premature newborns and older infants (i.e., values were not statistically different among these three groups), showing reduced differences between component levels. The reduced difference is consistent with the presence of strong fine structure in the older adult and infant groups (as noted in Fig. 4). DPOAE components, when relatively equivalent in magnitude, yield the most prominent fine structure as near-complete cancellation can be achieved if they sum out of phase.
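The link between component-level difference and fine structure depth can be made concrete with a small sketch. It assumes an idealized two-source model in which the components sum perfectly in phase at fine structure peaks and perfectly out of phase at troughs; the function name `fine_structure_depth_db` and the example differences are hypothetical.

```python
import math

def fine_structure_depth_db(component_difference_db):
    """Peak-to-trough depth (dB) for two components separated by the given level difference."""
    if component_difference_db <= 0:
        raise ValueError("difference must be positive; equal components cancel completely")
    # Amplitude of the weaker component relative to the stronger one.
    ratio = 10.0 ** (-component_difference_db / 20.0)
    peak = 1.0 + ratio    # components sum in phase
    trough = 1.0 - ratio  # components sum out of phase
    return 20.0 * math.log10(peak / trough)

depth_large_difference = fine_structure_depth_db(20.0)  # components far apart in level
depth_small_difference = fine_structure_depth_db(6.0)   # components closer in level
```

Under these assumptions, the depth grows without bound as the level difference approaches zero, which is the near-complete cancellation described above.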

FIG. 8

Mean difference between distortion- and reflection-component levels averaged into third-octave bins for seven age groups. Age is denoted by color and symbol. Error bars represent 95 % CIs and are offset for easy visualization.

Component phase

Figures 9 and 10 display the individual distortion and reflection phase-frequency functions (gray) with loess trend lines superimposed (colored line) for each of the seven age groups. These displays provide an indication of the goodness of the loess fit and the scatter of the individual functions. The loess trend lines for each age group are then shown superimposed in Figure 11 along with the 95 % confidence intervals (CI) for the model prediction. To more effectively visualize age trends, newborns were combined into one group, as were teens and young adults for this display.

FIG. 9

Loess trend lines (thick colored line) superimposed on individual distortion component phase-frequency functions (gray lines). Ninety-five percent CIs for model prediction are displayed as thin lines around each trend line. Age group is denoted by color of the loess line and by panel label, A–G. Starting phase has been normalized to zero.

FIG. 10

Loess trend lines (thick colored line) superimposed on individual reflection component phase-frequency functions (gray lines). Ninety-five percent CIs for model prediction are displayed as thin lines around each trend line. Age group is denoted by color of the loess line and by panel label, A–G. Starting phase has been normalized to zero.

FIG. 11

The loess trend lines fit to age group phase-frequency functions are shown here superimposed for (A) distortion and (B) reflection components separately. Age is denoted by color. Note that newborns (premature and term) were combined into one age group as were teens and young adults for this display. Ninety-five percent CIs for model prediction are displayed as thin lines around each trend line. Starting phase has been normalized to zero.

The loess fits approximated the group mean of the phase curves and provided an overall visual guide for component phase as a function of frequency. The starting phase for all loess lines was normalized to zero. Note in Figure 11A that the phase behavior of the distortion component was similar to the ear canal DPOAE phase-frequency function (see Figs. 5 and 6), consistent with its dominant contribution to the ear canal response. Distortion-component phase was uniformly invariant for all age groups at frequencies above 1.5–2 kHz. At frequencies <1.5 kHz, the combined newborn group showed the steepest phase gradient followed by the older infant group; the combined teen/young adult trend line more closely approximated middle-aged and older adult data. The two oldest age groups showed the shallowest phase slope at frequencies below 1.5 kHz.

The loess fits to reflection-component phase-frequency functions are shown in Figure 11B, and reflection emission phase accumulation is displayed in Figure 12. The trend lines give a rough, global indication of phase slope; however, age differences can best be observed at the highest frequency (3.6 kHz), which provides a metric of total phase accumulation. This metric does not reflect high-frequency age effects per se; rather, it represents the collective accumulation of phase across the entire frequency range. There was an effect of age (f = 2.6, p = 0.02) on phase accumulation. Newborns accumulated ~16 cycles of phase across frequencies, consistent with the steepest phase slope. The older adult group showed reduced phase accumulation (13.7 cycles) relative to other age groups consistent with the shallowest reflection-component phase slope. The only non-systematic age effect was shown for the middle-aged group, which showed steeper phase slope than the teen and young-adult groups.
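The total phase accumulation metric can be sketched as the unwrapped phase lag between the lowest and highest frequencies, expressed in cycles. This is a minimal illustration under stated assumptions: phase is supplied in radians, the frequency sampling is dense enough for unwrapping, and the synthetic constant-delay signal is not study data. The function name `phase_accumulation_cycles` is hypothetical.

```python
import numpy as np

def phase_accumulation_cycles(freq_hz, wrapped_phase_rad):
    """Total phase lag (cycles) accumulated from the lowest to the highest frequency."""
    order = np.argsort(freq_hz)
    # Remove 2*pi jumps so that phase decreases smoothly with frequency.
    unwrapped = np.unwrap(np.asarray(wrapped_phase_rad, dtype=float)[order])
    return float((unwrapped[0] - unwrapped[-1]) / (2.0 * np.pi))

# Synthetic reflection-like phase with a constant 5 ms delay, 0.7 to 3.6 kHz.
freq = np.linspace(700.0, 3600.0, 500)
delay_s = 0.005
wrapped = np.angle(np.exp(-2j * np.pi * freq * delay_s))
cycles = phase_accumulation_cycles(freq, wrapped)
```

For a constant delay, the accumulation equals the delay multiplied by the frequency span, so a longer delay (steeper phase slope) yields more accumulated cycles over the same range.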

FIG. 12

Mean reflection-component phase accumulation calculated between 0.7 and 3.6 kHz for the seven age groups, collapsed into five for display: NB = premature and term newborns combined, OI = older infants (6–8 months), T/YA = combined teen and young adults, MA = middle-aged adults, OA = older adults. Error bars represent 95 % CIs.

The steep reflection-component phase slope in newborns could have been influenced by the immature middle ear. Inefficiencies in the newborn middle ear effectively attenuate forward-going stimulus levels, lessening drive to the infant cochlea by ~15 dB in the high frequencies (Abdala and Keefe 2006). We corrected for this effect by estimating 0.13 cycles of reflection-component phase accumulation for every decibel decrease in stimulus level (correction derived from Abdala et al. 2011a). Assuming a middle ear attenuation effect of 15 dB in newborns, it is possible to account for 1.9 cycles of phase accumulation with this correction factor; thus, it effectively bridges the difference in reflection-component phase slope between neonates and young adults.
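The arithmetic behind this correction is shown below as a short sketch. Both constants are taken from the paragraph above; the linear scaling of phase accumulation with stimulus level is the assumption implied by a per-decibel correction, and the function name `level_correction_cycles` is hypothetical.

```python
CYCLES_PER_DB = 0.13              # reflection-phase accumulation per dB decrease in stimulus level
MIDDLE_EAR_ATTENUATION_DB = 15.0  # assumed newborn forward-transmission loss

def level_correction_cycles(attenuation_db, cycles_per_db=CYCLES_PER_DB):
    """Phase accumulation (cycles) attributable to reduced stimulus drive, assuming linear scaling."""
    return cycles_per_db * attenuation_db

correction = level_correction_cycles(MIDDLE_EAR_ATTENUATION_DB)
```

The product is 1.95 cycles, which corresponds to the 1.9 cycles cited in the text.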

Discussion

The effect of age on a wide array of OAE-based cochlear measures has been examined here with the objective of understanding how the peripheral auditory system changes throughout the human lifespan. The discussion that follows is anchored by four main findings; additional trends are presented within the context of these primary results. The results are then integrated into a cohesive framework of maturation and aging informed by a current view of cochlear physiology and OAE generation theory.

Four primary findings

  1.

    DPOAE and individual distortion and reflection component levels are robust in human infants and, within the age continuum represented here, peak by 6–8 months of age.

    Robust DPOAE levels from premature and term newborns indicate that intermodulation distortion generated at the overlap of f 1, f 2 is adultlike and suggests that the saturating characteristics of outer hair cell (OHC) transduction are appropriately nonlinear early in life. Consistent with this observation, newborn DPOAE input/output (I/O) functions when corrected for outer and middle ear effects are adultlike in configuration (Abdala and Keefe 2006). It appears that the compressive nature of basilar membrane motion, which allows for the coding of a vast dynamic range, is mature at birth.

    Also apparent from component-level data is that newborn ears produce strong reflection. The coherent reflection model of OAE generation (Zweig and Shera 1995; Shera and Guinan 2003) suggests that reflection preferentially gauges cochlear amplifier gain. Strong reflection levels observed in term and prematurely born neonates could be interpreted to indicate mature cochlear amplification early in the third fetal trimester. Consistent with this conjecture, high-frequency DPOAE suppression tuning is mature in premature and term newborns once corrections are made for outer and middle-ear immaturities. Of note, robust reflection from infants combined with near normal (i.e., adultlike) distortion levels reduces the difference between components. Because fine structure is deepest when the two DPOAE components are relatively equal, this component profile (best exemplified by data in Fig. 8) likely accounts for the strong DPOAE fine structure characterizing newborn data (Dhar and Abdala 2007; Abdala and Dhar 2010).

    DPOAE and component levels were highest in 6–8-month-old infants. We attribute this finding to immature forward transmission through the neonatal middle ear, and its subsequent near resolution during the first half-year of life (Abdala and Keefe 2006). By this age, nearly adultlike stimulus levels may be driving the cochlea, resulting in these heightened emission levels. This maturational process is coupled with reduced infant ear canal area that also provides a functional boost to the outgoing DPOAE signal (Keefe and Abdala 2007). Combined, these two factors can account for the most robust emission levels from 6- to 8-month-old infants.

  2.

    DPOAE level decreases with age, distortion-component levels showing the most dramatic reduction in older adults.

    Recent work confirms a link between the strength of cochlear nonlinearity directly assayed with measures of intracochlear pressure and DPOAE level measured in the ear canal (Dong and Olson 2010); hence, reduced DPOAEs in the older adults tested here suggest weakened cochlear nonlinearity. A more linear cochlea has been associated with sensory cell damage (Ruggero and Rich 1991; Ruggero et al. 1996). It is possible that the reduction in nonlinearity is due to the mild amounts of hearing loss in some of the older adults rather than an intrinsic aging process separate from threshold elevation. Though past studies with larger numbers of aged subjects sought to disentangle these influences, a firm conclusion has not been reached (Lonsbury-Martin et al. 1991; Stover and Norton 1993; Dorn et al. 1998).

    The origin of weakened nonlinearity in older adults is not known. OHCs are likely reduced in number due to aging so that fewer are available to manifest this nonlinearity. Recent research has shown contributions to the ear canal DPOAE from sources basal to the f 2 site (Martin et al. 2009). If high-frequency presbycusis reduced the complement of OHCs in the base of the cochlea, one would expect a weakened contribution from these basal sources to the ear canal DPOAE. This weakened contribution may have impacted the greatly reduced distortion levels observed in older ears. Because basal contributions to the DPOAE have been observed primarily in response to high-level primary tones, their impact on the DPOAEs recorded here to moderate-level tones is uncertain. A second possibility is that OHCs are not greatly diminished in number, but their behavior in response to stimulation has become more linear. It is not straightforward to differentiate between these two possibilities based on a remote ear canal response like the OAE. It is also not possible to know if aging produced direct or secondary effects on OHCs via alterations in the endocochlear potential (Schmiedt et al. 2002; Mills and Schmiedt 2004). Alterations in the electrical environment of the cochlea and its consequences on hair cell function and neural health are still being explored (Lang et al. 2010).

    It is important to recall that we treated aging in this study as a composite factor including associated threshold variation; hence, it is not possible to distinguish between the effects of aging per se and hearing loss on emission level. This distinction may be artificial as two-thirds of individuals show measurable hearing loss after 70 years of age (Cruickshanks et al. 1998). One could argue that several additional uncontrolled variables such as lifetime noise exposure, ototoxins, and genetic variation also confound interpretation of aging research. With the small number of elderly subjects in the present study, it was not possible to create sub-groups and attempt to disentangle these relationships.

    The finding of reduced nonlinear distortion combined with more mildly impacted reflection may be salient. It indicates that the two DPOAE components exhibit distinct aging trends and supports independent generation mechanisms for the two emission types. Distortion- and reflection-component emissions manifest differential sensitivity to age-related variation. Several caveats must be considered in this interpretation: (a) The reader will recall that estimates of reflection-component level may have been selectively impacted by our strategy of assigning values (equal to the grand mean noise floor) to data points with inadequate SNR rather than simply eliminating them. However, if this process exerted bias, it reduced reflection-component level estimates in older adults; therefore, it cannot explain reflection levels that are preserved or less reduced than distortion in older adults; (b) The dynamic range of the reflection component over the lifespan is more limited than that of the distortion component, potentially exerting a floor effect earlier in life; (c) The initial nonlinearity at the f 1, f 2 overlap region drives the resulting reflection emission. Hence, the strength of this nonlinearity will influence one's ability to reliably extract reflection from the mixed signal, i.e., they are not truly independent components; and (d) Component estimates are heavily influenced by the signal processing techniques used to separate them. The use of an IFFT and time windowing assumes distinct phase behavior for the two components a priori. However, overlap between the two components in the (phase-derived) time domain has been reported and can make their separation imprecise (Shaffer and Dhar 2006). Recording reflection-source emissions such as SFOAEs, which are less confounded by cochlear nonlinearities, may better explicate cochlear aging effects. These caveats notwithstanding, the component levels recorded here were sensitive enough to detect distinct age trends for separated DPOAE components throughout the human lifespan. Other studies have likewise reported distinct component behavior after salicylate ototoxicity (Rao and Long 2011) and upon activation of the medial efferent system (Abdala et al. 2009; Deeter et al. 2009; Henin et al. 2011).

    An interesting consequence of the differential effect of age on component level is the re-emergence of pronounced DPOAE fine structure in older adults. Recall that fine structure is deepest when the two components are equivalent in magnitude. Because the substantial drop in distortion combined with the relatively less reduced reflection levels minimized the difference between components, fine structure was robust. More numerous fine structure oscillations were observed in older adults; however, in this case, it may not indicate a healthy cochlea. The pristine newborn cochlea also produces strong fine structure but for a different reason: a strong reflection component combined with more typical distortion levels. This observation underscores the importance of DPOAE component separation to understand the origin of changes in cochlear function with age.

    In contrast to earlier work from our joint laboratories where spacing in newborns was reported to be narrower than in adults (Abdala and Dhar 2010), fine structure depth and spacing did not show age effects. Others have similarly observed the lack of an age effect on DPOAE fine structure (He and Schmiedt 1996). This inconsistency is likely due to imprecision in the quantification of DPOAE level fine structure. Though fine structure represents a profile of component mixing and should reflect shifts in the component relationship, it is highly dependent upon the algorithms used for its quantification. Thus, separating and examining the magnitude and phase behavior of individual DPOAE components is more fruitful in understanding cochlear properties.

  3.

    The violation of DPOAE phase invariance (and associated break in cochlear scaling) occurs around the same frequency in all age groups. The degree of break, indicated by a steeper phase gradient, is most pronounced in newborns.

    A wave along the basilar membrane accumulates approximately the same number of cycles regardless of signal frequency (Rhode 1971; Zweig 1976). This is evidence of cochlear scaling symmetry and is apparent in the basal half of the mammalian cochlea. DPOAE phase invariance (with fixed f 2/f 1) provides a gauge of scaling. The break in scale invariance was observed here around 1.5 kHz corresponding roughly to a demarcation between basal and apical halves of the cochlea (Greenwood 1990). This break frequency remains stable for most of the human lifespan. Above this frequency, scaling was evident for all age groups. Even the youngest premature newborns showed DPOAE phase invariance at high frequencies, suggesting that the basal cochlea is scaled by at least the seventh fetal month.

    Below 1.5 kHz, there is a break from phase invariance for all groups, most markedly in newborns where the phase delay is prolonged by 1 ms relative to young adults. The delay decreases (and phase becomes more shallow) with increasing age, reaching an asymptote by young adulthood and remaining stable through middle age and senescence. Though the phase measurements were taken from the mixed ear canal DPOAE, comparable trends were evident for the separated distortion component phase (see Fig. 9). The use of moderate-level primaries and a relatively wide f 2/f 1 ratio produced a DPOAE that was generally dominated by the distortion component. The phase of the ear canal DPOAE follows the phase features of the dominant component (Knight and Kemp 2000, 2001; Dhar et al. 2005).

    The steepening of phase in the apical half of the cochlea signals a departure from scaling invariance (Shera et al. 2000). This secular break could be due to a broadening of filters and/or deviations in the exponential relationship between frequency and place in the apical half of the cochlea. The more marked steepness of apical phase slope in newborns suggests an immaturity in whichever mechanism underlies this break in scaling. Consistent with this hypothesis, DPOAE suppression tuning at 1 kHz is not adultlike in newborns (Abdala 1998). The source of this apical immaturity is not known. Active processes in the cochlea appear to be adultlike by birth. Additionally, the contribution of active processes to signal transduction in the apical cochlea is not clear as this region appears to be more weakly nonlinear in some mammals (Cooper and Rhode 1995). One could speculate that the physical features of the basilar membrane (i.e., stiffness gradient and mass) might be immature in newborns, producing a protracted maturational time course for motion in the apical half of the human cochlea.

    It is important to consider the impact of the middle ear in this speculation. Prolonged phase-gradient delays cannot be easily explained by conductive factors as distortion emission phase is relatively insensitive to stimulus level over a 20-dB range (Abdala et al. 2011a); therefore, lower levels driving the newborn cochlea cannot account for prolonged DPOAE delays. It is possible that the immature newborn middle ear also produces a non-adultlike delay. In adults, the round-trip middle ear delay is estimated to be 0.2 ms averaged across frequency (Puria 2003). The newborn middle-ear delay would need to be five times longer than the adult's to account for the prolonged DPOAE phase-gradient delays observed below 1.5 kHz.

  4.

    The reflection component shows the steepest phase slope (longest delays) in newborns and shallowest phase slope in older adults.

    SFOAE phase-gradient delays and cochlear tuning co-vary in a number of mammalian models such as cat and guinea pig (Shera and Guinan 2003). Additionally, estimates of tuning based on reflection emission delays correspond well to psychoacoustic measures in humans (Shera et al. 2010; Bentsen et al. 2011). In general, long delays (steep phase slope) suggest sharper mechanical tuning. These associations have been tested with a narrow set of parameters and have not been comprehensively probed in humans. Here, the phase slope of reflection-component emissions was steeper in newborns than in all other age groups and shallowest for the older adults (Fig. 11). If the relationship between tuning and reflection emission delays holds at the probe levels presented here, the data suggest that newborns have narrow cochlear tuning (most probably due to middle ear inefficiencies) and that older adults have broadened tuning. Estimating the “probe” level for reflection emission generation is not straightforward when recording DPOAEs. The apical-going portion of the DPOAE generated at the overlap of f 1 and f 2 serves as the probe producing reflection at the DP place. Unfortunately, estimating its level or strength is complex. Although it seems intuitive to use the DPOAE level measured in the ear canal as an index, it is colored by age-dependent reverse transmission as well as other factors. The difference in reflection emission phase accumulation between elderly and young adult subjects reported here is subtle, and in the absence of perceptual measures of tuning, its functional significance is not known.

    It is curious that middle-aged adults show steeper reflection emission phase slope than teens and young adults. One could conjecture that the younger groups, collectively between 13 and 25 years old, have been exposed to higher levels of noise than their middle-aged counterparts were at comparable ages. Certainly, personal listening devices were used earlier and more widely by the younger demographic. This may be one contributing factor, and generational differences in early noise exposure warrant further attention. Although this interpretation is admittedly speculative, DPOAE phase behavior may be indicating an earlier change in tuning than in sensitivity.

An integrated framework

Four primary findings have been highlighted, two of which can be explained by immaturities in middle ear function. The peaking of the ear canal DPOAE and component levels above 1.5 kHz in the older infants is likely due to gains in efficient transmission of stimulus energy into the cochlea during the period between birth and approximately 7 months of age. Similarly, the steeper reflection component phase in newborns can be attributed to reduced stimulus drive to the cochlea due to inefficiencies in forward transmission through the newborn middle ear. Reflection emission phase is stimulus level dependent (Mauermann and Kollmeier 2004; Schairer et al. 2006); the lower stimulus levels at the newborn cochlea could produce steeper reflection emission phase slope and presumably, narrower cochlear tuning.

The other two highlighted findings can be attributed to processes that are likely cochlear in nature. One example is the prolonged DPOAE and distortion-component delay in newborns at frequencies below 1.5 kHz, which is consistent with past work (Abdala et al. 2011b). This delay shortens (i.e., phase slope becomes shallower) through the teen years, suggesting that the cochlear properties producing the break from scaling symmetry are immature at birth but reach a stable state by young adulthood. This immaturity in distortion-component low-frequency phase, unlike reflection-emission phase, cannot be easily attributed to middle ear inefficiencies. We have suggested a speculative mechanism involving immaturity in the physical features of the basilar membrane at birth; however, this hypothesis requires further study.

Observations attributed to aging are also likely to be cochlear in nature. These include the preferential reduction of distortion component levels and the shallower reflection phase slope. These changes appear to reflect ongoing degradation of cochlear function, perhaps starting as early as the teenage years in some cases and manifesting later in life for other processes. Some of the cochlear mechanisms that might contribute to this aging-related decline, such as loss of OHCs and reductions in the endocochlear potential, have been posited in the preceding section though they are difficult to specify with OAE methodology. Interpreted within the dual mechanism and coherent reflection models, the results suggest that the cochlea becomes more linear with aging, and frequency resolution is degraded. Though these results reflect a composite aging process, including mild threshold elevation in some subjects, the conclusions are consistent with perceptual studies reporting broadened psychophysical tuning curves, decreased temporal resolution and speech recognition in noise, as well as a reduced dynamic range in older adults (see Gordon-Salant 2005 for review; Grose et al. 2006; Harris et al. 2010).

The reader will note that although the steepened apical distortion phase slope in newborns and the shallow reflection phase slope in older adults are both argued to be cochlear in nature, these two age effects are distinct. The underlying cochlear properties gauged by each component are thought to differ (Shera and Guinan 1999). In this case, distortion-component phase signals a break from scale invariance in the apical half of the cochlea, whereas reflection-component phase provides a rough indicator of cochlear tuning. Under this framework, it is not surprising that maturation and aging produced independent effects on each DPOAE component.

In summary, the results suggest a complex symphony of intertwined processes related to maturation and aging of the peripheral auditory system. Viewing these cochlear changes as an age continuum allowed, to some extent, for the disentangling of factors related to immaturity from those imposed by aging. The complexity of age-related change in the cochlea is most evident, and one hopes most effectively interpreted, when both magnitude and phase characteristics of the DPOAE are considered and when individual DPOAE components are assessed to target distinct cochlear properties.