INTRODUCTION

Human presbyacusis (i.e., age-related hearing loss) is a multifactorial condition that involves the effects of age and environmental exposures that contribute to cochlear pathology (Gates and Mills 2005; Schmiedt 2010; Keithley 2020; Kobrina et al. 2020; Eckert et al. 2021). Presbyacusis is most often associated with two primary forms of cochlear pathology, which led to the categorization of audiograms as metabolic, sensory, or both (Schmiedt 2010; Dubno et al. 2013). However, distinct categories do not specify the magnitude of metabolic and sensory components in an audiogram, which is important given that hearing loss in many older adults likely reflects the effects of multiple cochlear pathologies. A more quantitative characterization of the two primary components of age-related hearing loss in individuals is critical for understanding the underlying mechanisms of human presbyacusis and for development of effective treatments (Kujawa and Liberman 2019). Therefore, a novel approach was developed to estimate the magnitude of the metabolic and sensory components represented in an individual audiogram, which was validated using retrospective datasets from animal models, human temporal bones, and older adults.

Animal models of presbyacusis and histopathologic studies of human temporal bones provide evidence that metabolic declines occur with increasing age, independent of noise exposure. The endocochlear potential (EP) is an electrochemical gradient that is generated and maintained by the stria vascularis to power outer hair cell (OHC) motility, which is critical to hearing sensitivity for low-level sounds. Thus, EP declines can reduce cochlear amplification, which results in varying degrees of hearing loss across frequency (Schmiedt et al. 2002; Mills et al. 2006; Lang et al. 2010). Atrophy in the stria vascularis and spiral ligament of the cochlea (Schulte and Schmiedt 1992) and other strial pathology is strongly associated with a lower EP (Gratton and Schulte 1995; Gratton et al. 1996). Reduced EP can increase thresholds by ~20 dB at lower frequencies and by ~40–60 dB at higher frequencies (Sewell 1984; Ruggero and Rich 1991; Cooper and Rhode 1997; Hemmert et al. 2000; Robles and Ruggero 2001; Schmiedt et al. 2002), which produces gradually sloping audiograms with poorer thresholds at higher compared to lower frequencies (Schmiedt 2010). This pattern of pure-tone thresholds as a function of frequency is also common in large audiogram databases (Lee et al. 2005; Allen and Eddins 2010; ISO 2017; Parthasarathy et al. 2020). Histopathology studies in human temporal bones suggest that strial pathology is predominant among older adults (Bredberg 1968; Schuknecht and Gacek 1993; Nelson and Hinojosa 2006; Nadol 2010; Wu et al. 2020).

A different pattern of hearing loss is associated with OHC damage and losses, which can occur at any age and typically results from significant exposure to noise or ototoxic drugs. Sensory declines often involve damage or losses of more susceptible basal OHCs (Sha et al. 2001), producing higher thresholds at high frequencies and minimal changes to low-frequency thresholds (Schuknecht and Neff 1952; Prosen et al. 1990; Liberman et al. 2002; Schmiedt et al. 2002; Schmiedt 2010). Abrupt and substantial high-frequency losses that spare low-frequency thresholds have been proposed to reflect OHC loss (Nadol 2010; Schmiedt 2010; Dubno et al. 2013). With regard to sensory loss, because OHCs are more susceptible to noise-related damage and loss than inner hair cells (IHCs; Hu 2012), noise-related damage to IHCs (Liberman and Dodds 1984; Sayles and Heinz 2017) is assumed to have a negligible contribution to thresholds (e.g., Moore 2007; Nadol 2010). Strial atrophy and OHC losses can also occur together, contributing to a pattern of hearing loss that combines effects from both pathologies (Johnsson and Hawkins 1972; Lang et al. 2010; Kobrina et al. 2020; Wu et al. 2020).

Classifying audiograms into presbyacusis categories was a first step toward the goal of relating audiograms of older adults to underlying pathologies, based on the premise that distinct pathologies produce different audiogram configurations (Schuknecht et al. 1974; Schuknecht and Gacek 1993; Schmiedt 2010; Dubno et al. 2013). In the original classifications, the metabolic phenotype was characterized by thresholds ≥ 20 dB HL at lower frequencies and gradual threshold increases above 1–2 kHz. The sensory phenotype included relatively normal thresholds at lower frequencies and thresholds > 40 dB HL in the higher frequencies that increased more steeply with increasing frequency. The metabolic + sensory phenotype showed a combination of moderately elevated thresholds at lower frequencies and steeply sloping thresholds at higher frequencies. Recently, classifying audiograms led to evidence of significant phenotype differences in age, sex, and noise history (Dubno et al. 2013), longitudinal changes in hearing (Vaden et al. 2017), and otoacoustic emissions (Vaden et al. 2018). Nonetheless, one major limitation of discrete phenotypes (i.e., classifying audiograms as predominantly metabolic or sensory) is that these classifications provide no information about the magnitude of the two primary components of age-related hearing loss for an individual audiogram. Given the likelihood that hearing loss in older adults reflects multiple pathologies, it is important to know the magnitude of any metabolic or sensory component within an audiogram.

The current study reports on the development and validation of a novel approach to estimate the magnitude of metabolic and sensory hearing loss (in dB) based on the audiogram, to build on the earlier phenotype categories but now provide quantitative information on phenotype components. Metabolic and sensory estimates were validated with retrospective datasets collected from gerbils, older humans, and histopathology reports from human temporal bone donations. The results suggest that audiograms include information about distinct cochlear pathologies (Schuknecht and Gacek 1993; Nadol 2010; Schmiedt 2010; Dubno et al. 2013; Vaden et al. 2017; Eckert et al. 2021). Our findings also suggest that the characterization of hearing loss based on metabolic and sensory estimates is enhanced for well-approximated audiograms, which may be more common in community-based research samples than clinical samples with more heterogeneous pathologies.

MATERIALS AND METHODS

Overview of the Model and Validation Methods

First, we describe the model and four steps used to derive metabolic and sensory hearing loss estimates for gerbils and humans. Next, the metabolic and sensory estimates were validated based on data from gerbils with well-characterized strial and/or OHC pathologies. In addition to magnitude estimates of metabolic and sensory components, “predicted error” (i.e., line fit error) was calculated to represent goodness-of-fit for an individual audiogram. High predicted error was associated with other types of hearing loss (e.g., “cookie-bite” or “corner” audiograms) and was used to exclude such cases from analyses. Simulation tests were then performed to examine accuracy, precision, and bias in the hearing loss component estimates, and how these related to predicted error. Metabolic and sensory estimates were then used to characterize hearing loss components in association with cross-sectional and longitudinal age differences, sex differences, positive noise history reports, and speech recognition in noise for older adults. Finally, we determined the extent to which cochlear pathology from case reports of older temporal bone donors from the National Temporal Bone Registry (NTBR; Merchant et al. 1993, 2008) related to differences in metabolic and sensory estimates.

Profile-Based Model for Estimating Metabolic and Sensory Hearing Loss

Metabolic and sensory estimates were calculated based on hearing threshold measurements that were obtained from both gerbils and older adults. Metabolic and sensory estimates represent the extent of threshold changes consistent with the effects of strial and OHC declines, respectively. Because the estimates were calculated similarly for gerbil and human datasets, validating the sensitivity of the estimates to environmental exposures and cochlear pathology in gerbils could support their interpretation in humans. Four calculation steps are described below for the gerbil and human datasets.

Step 1: Selecting Cases to Form a Metabolic Group and Sensory Group for Gerbils and Older Adults

Cases from retrospective gerbil datasets were assigned to metabolic, sensory, and control groups based on age and experimental conditions. The metabolic group included old gerbils reared in quiet acoustic conditions (“quiet-aged,” N = 19) and young gerbils chronically exposed to furosemide (“furosemide-exposed,” N = 25). Furosemide is a loop-diuretic that temporarily lowers the EP in young gerbils so that it is similar to the reduced EP in old gerbils. Under these conditions, compound action potential (CAP) threshold shifts in young gerbils were found to be similar to CAP threshold shifts in old gerbils (Schmiedt et al. 2002; Lang et al. 2010). The sensory group included old gerbils that were chronically exposed to noise (“noise-aged,” N = 5). The control group included data from young gerbil ears that were not exposed to furosemide or noise (N = 15). A CAP-based measure of hearing loss was calculated for the gerbils (Fig. 1A), which is similar to human audiograms (Fig. 1B) that represent pure-tone thresholds (in dB HL) relative to young normal-hearing adults (American National Standards Institute 2010; Steinberg et al. 1940). Specifically, CAP thresholds, which assess summated auditory nerve activity, were measured across frequency for individual gerbil ears. Gerbil CAP threshold shifts were then calculated relative to young controls in each study, by subtracting the average CAP thresholds of the young control ears from the CAP thresholds of each ear. More details on the sources of retrospective data and published studies are presented later in the “Gerbil Data and Statistical Analyses” section.
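The threshold-shift calculation described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function name, the four test frequencies, and all threshold values are hypothetical and are not taken from the gerbil datasets.

```python
from statistics import mean

def threshold_shifts(ear_thresholds, control_ears):
    """Subtract the young-control average threshold at each frequency."""
    n_freq = len(ear_thresholds)
    # Average the control ears frequency by frequency.
    control_mean = [mean(ear[i] for ear in control_ears) for i in range(n_freq)]
    return [t - c for t, c in zip(ear_thresholds, control_mean)]

# Hypothetical CAP thresholds (dB SPL) at four made-up test frequencies.
controls = [[20.0, 15.0, 10.0, 18.0], [24.0, 17.0, 12.0, 22.0]]
aged_ear = [40.0, 36.0, 41.0, 60.0]
shifts = threshold_shifts(aged_ear, controls)
print(shifts)
```

The resulting shifts play the same role for gerbils that dB HL thresholds play for human audiograms: both express hearing loss relative to a young, normal-hearing reference.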

Fig. 1.

Average threshold shifts and thresholds were used to create metabolic and sensory hearing loss profiles for gerbils and older adult humans, respectively. Group averaged CAP threshold shifts (A) and pure-tone thresholds (B) are shown with standard error of the mean (SEM) error bars. Profiles derived from the group-averaged threshold shifts and thresholds are shown in (C, D); note the change in y-axis scale. The metabolic profile was computed by scaling the quiet-aged group average CAP threshold shifts. The sensory profile was adjusted for possible metabolic losses in the noise-aged gerbils (C). The metabolic and sensory profiles derived from older adult humans (D) were similarly derived by scaling group average thresholds. Each hearing loss profile consisted of weights with an across-frequency average value = 1 in arbitrary units (AUs), shown on the y-axes (C, D). The profile weights were designed to be multiplied by a constant to estimate the extent of metabolic or sensory losses that best approximated hearing thresholds.

Cases from the retrospective human dataset of audiograms from older adults were assigned to metabolic (N = 106; mean age ± SD = 71.6 ± 6.6 years) and sensory (N = 296; mean age ± SD = 67.7 ± 5.9 years) groups based on matching labels from the expert raters and the machine learning algorithm used in the original phenotype classifications (Dubno et al. 2013; Vaden et al. 2017, 2018). This human audiogram selection prioritized cases with predominantly metabolic or sensory phenotypes.

Step 2: Computing Average Metabolic and Sensory Hearing Loss

Average gerbil CAP threshold shifts and average human audiograms were separately calculated within each hearing loss group, which revealed distinct patterns of hearing loss across frequency (Fig. 1A, B). Within the metabolic group of gerbils, the quiet-aged and furosemide-exposed gerbils showed elevated CAP thresholds across frequency, with thresholds gradually increasing above 2 kHz (Fig. 1A). Within the sensory group, the noise-aged gerbils exhibited a distinct pattern of hearing loss that was consistent with a combination of age-related metabolic losses typical of the quiet-aged and furosemide-exposed gerbils and sensory losses that reflected the characteristics of the noise used in that study (Schmiedt et al. 1990).

Step 3: Creating Scaled Metabolic and Sensory Hearing Loss Profiles

Scaled “hearing loss profiles” that represented the relative distribution of hearing loss across frequency were derived from the average hearing losses within the metabolic and sensory groups. Each group’s average profile was then scaled by dividing hearing loss at each frequency by the across-frequency average, so that the scaled across-frequency average = 1 dB (Fig. 1C, D). The metabolic and sensory profiles may be understood as scale replicas of each group’s hearing loss, which would perfectly fit the group average thresholds if each profile was multiplied by some value. As expected (Schmiedt et al. 2002), the scaled values in the quiet-aged profile were highly correlated with those from the furosemide-based profile [r(10) = 0.93, P = 9.87 × 10^−6]. Accordingly, the results for gerbils only include metabolic estimates based on the quiet-aged metabolic profile.
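As a minimal sketch of the scaling in Step 3, the Python snippet below divides a group-average hearing loss by its across-frequency mean. The function name and the example values are hypothetical; only the scaling rule comes from the text.

```python
def scale_profile(group_average):
    """Divide each value by the across-frequency mean so the profile averages 1."""
    avg = sum(group_average) / len(group_average)
    if avg == 0:
        raise ValueError("group average must be non-zero to be scaled")
    return [value / avg for value in group_average]

# Hypothetical group-average hearing loss (dB) at four made-up frequencies.
metabolic_average = [20.0, 20.0, 30.0, 50.0]
metabolic_profile = scale_profile(metabolic_average)

# Multiplying the profile by the original mean (30 dB) recovers the group average.
recovered = [30.0 * weight for weight in metabolic_profile]
print(metabolic_profile, recovered)
```

This illustrates the "scale replica" idea: a single multiplier applied to the profile reproduces the group-average thresholds.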

Because the sensory group included only noise-aged gerbils, the gerbil sensory profile was adjusted for possible age-related metabolic losses. Noise-aged gerbils appeared to show a combined pattern of metabolic and sensory hearing loss, with elevated lower frequency thresholds common for quiet-aged gerbils in addition to a clear noise-related component. Thus, the metabolic profile was subtracted from the noise-aged profile, then the positive differences were rescaled so their average = 1 dB (Fig. 1C). This adjustment produced a sensory profile that emphasized noise-related shifts and more closely reflected the spectral characteristics of the noise used in that study (Schmiedt et al. 1990).
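One possible reading of this adjustment is sketched below in Python. It assumes that negative differences are set to zero and that the rescaling uses the average across all frequencies; the text does not spell out either detail, and the profile values are hypothetical.

```python
def adjust_sensory_profile(noise_aged_profile, metabolic_profile):
    """Subtract the metabolic profile, keep positive differences, rescale to mean 1."""
    # Assumption: negative differences are clipped to zero rather than dropped.
    diffs = [max(n - m, 0.0) for n, m in zip(noise_aged_profile, metabolic_profile)]
    avg = sum(diffs) / len(diffs)
    if avg == 0:
        raise ValueError("no positive differences remain after subtraction")
    return [d / avg for d in diffs]

# Hypothetical scaled profiles at four made-up frequencies.
noise_aged = [0.5, 1.5, 1.5, 0.5]
metabolic = [0.6, 0.8, 1.0, 1.6]
sensory_profile = adjust_sensory_profile(noise_aged, metabolic)
print(sensory_profile)
```

In this toy example the adjusted profile is zero where the noise-aged profile does not exceed the metabolic profile, so its weight concentrates on the middle frequencies.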

Step 4: Fitting Audiograms with the Metabolic and Sensory Profiles

The final step for calculating estimates of metabolic and sensory components determined the combination of metabolic and sensory profiles that best approximated hearing thresholds for an individual ear (CAP threshold shifts: Fig. 2A–C; audiograms: Fig. 2D–F). The goal of this step was to account for the configuration of hearing thresholds in an individual audiogram, whether those reflected a predominantly metabolic or sensory pattern, or a combination. The metabolic and sensory profiles were each multiplied by a test value of 0 dB or higher (i.e., estimate), then the products were summed to approximate thresholds from an individual ear. Metabolic and sensory estimates were optimized by minimizing the root mean square (RMS) difference between the modelled and observed hearing thresholds (i.e., predicted error), based on an optimization algorithm with a lower-bound set to zero for each estimate (Byrd et al. 1995). Because a small sample of gerbils was used to derive profiles and calculate estimates, a leave-one-out approach was used to limit model over-fitting for each gerbil. That is, each gerbil was temporarily set aside to re-compute profiles, before estimating the losses for that case. A similar leave-one-out approach was used for a cross-sectional dataset that included older adult human audiograms (described below) to estimate each subject’s sensory and metabolic hearing loss independently from the audiograms used to calculate the hearing loss profiles. In summary, the metabolic and sensory estimates represent the contribution of each profile in approximating hearing losses measured from an ear, and predicted error represents the quality of that approximation.
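The fitting step can be illustrated with a short Python sketch. Note the substitution: instead of the bounded quasi-Newton optimizer cited above, this sketch solves the same two-parameter, non-negative least-squares problem in closed form by comparing the unconstrained solution with the solutions on each boundary. Minimizing the sum of squared differences also minimizes the RMS difference. The function name and all numbers are hypothetical.

```python
import math

def fit_components(thresholds, metabolic_profile, sensory_profile):
    """Return (metabolic_dB, sensory_dB, predicted_error) for one ear."""
    y, m, s = thresholds, metabolic_profile, sensory_profile
    mm = sum(a * a for a in m)
    ss = sum(b * b for b in s)
    ms = sum(a * b for a, b in zip(m, s))
    my = sum(a * c for a, c in zip(m, y))
    sy = sum(b * c for b, c in zip(s, y))

    def rms_error(met, sen):
        sq = [(met * a + sen * b - c) ** 2 for a, b, c in zip(m, s, y)]
        return math.sqrt(sum(sq) / len(sq))

    # Boundary candidates: both zero, metabolic only, sensory only.
    candidates = [(0.0, 0.0)]
    if mm > 0:
        candidates.append((max(my / mm, 0.0), 0.0))
    if ss > 0:
        candidates.append((0.0, max(sy / ss, 0.0)))
    # Interior candidate: unconstrained least squares, kept only if non-negative.
    det = mm * ss - ms * ms
    if det > 1e-12:
        met = (my * ss - sy * ms) / det
        sen = (sy * mm - my * ms) / det
        if met >= 0 and sen >= 0:
            candidates.append((met, sen))
    best = min(candidates, key=lambda pair: rms_error(*pair))
    return best[0], best[1], rms_error(*best)

# Hypothetical profiles (each averages 1) and a made-up audiogram (dB HL).
metabolic_profile = [0.6, 0.8, 1.0, 1.6]
sensory_profile = [0.0, 0.4, 1.2, 2.4]
audiogram = [12.0, 20.0, 32.0, 56.0]
print(fit_components(audiogram, metabolic_profile, sensory_profile))
```

The example audiogram was built as 20 times the metabolic profile plus 10 times the sensory profile, so the fit should return estimates close to 20 dB and 10 dB with predicted error near zero.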

Fig. 2.

To illustrate how hearing losses were fitted by the profiles, metabolic and sensory estimates are shown in 3 dB increments (colored lines) that approximated an individual gerbil’s CAP threshold shifts (black circles) selected from each group (A–C). The human profiles were similarly used to fit metabolic and sensory estimates to older adult audiograms (D–F). The examples shown appeared to have predominantly metabolic losses (D) or sensory losses (E), and a combination (F). Note the distinct threshold configurations (phenotypes) in D and E, even though both had an average threshold = 28 dB HL across frequency.

Gerbil Data and Statistical Analyses

Threshold shifts from gerbils in two age groups (younger or older) that underwent specific exposures (none, noise, furosemide) were used to estimate metabolic and sensory components in cases with well-characterized strial and/or OHC pathologies. Briefly, the noise-aged gerbils are comparable to older adults exposed to noise throughout their lifespan, the quiet-aged gerbils were reared in acoustically controlled conditions that may not be comparable to humans in industrialized societies, and the furosemide-exposed young gerbils also raised in a low-noise environment were used to study how lowering EP causes hearing loss similar to the quiet-aged gerbils. Each dataset was collected from a previously published study conducted by researchers within our research group at MUSC. These studies included data read from plots with software for 19 gerbil ears and electronic records for 45 gerbil ears (Schmiedt et al. 1990, 2002; Tarnowski et al. 1991; Lang et al. 2010). The “quiet-aged” data were collected from 19 older gerbils (14 were ages 33 to 36 months; 5 were ages 36 to 37 months) reared in a low-noise environment (Schmiedt et al. 1990, 2002; Tarnowski et al. 1991). The “noise-aged” data were collected from 5 gerbils (ages 24 to 36 months) exposed to noise (85 dBA, 0.5 to 4.0 kHz) for two years (Schmiedt et al. 1990). The furosemide-exposed data were collected from 25 young gerbils (5 were ages 4 to 8 months; 20 were ages 3 to 6 months) chronically infused with furosemide for 2 to 28 days with a surgically implanted pump (Schmiedt et al. 2002; Lang et al. 2010). The control group included data collected from untreated ears in 15 of the young gerbils (ages 3 to 6 months) from the furosemide-exposed group (Schmiedt et al. 2002; Lang et al. 2010). The CAP threshold shifts were calculated by subtracting average thresholds from young controls for the quiet-aged gerbils, noise-aged gerbils, furosemide-exposed young gerbils, and young control gerbils (Fig. 1A). 
There were EP measurements available for 5 quiet-aged gerbils and 25 furosemide-exposed young gerbils.

Differences in metabolic and sensory estimates were tested across the four gerbil groups using ANOVA and Tukey tests: quiet-aged (N = 19), noise-aged (N = 5), furosemide-exposed young gerbils (N = 25), and young control (N = 15). Metabolic estimates were predicted to be higher for the quiet-aged, noise-aged, and furosemide-exposed gerbils compared to young control gerbil ears, and sensory estimates were predicted to be higher for the noise-aged gerbil group compared to the other groups. Correlation tests were then performed to test the predictions that lower EP measurements are associated with higher metabolic estimates but not sensory estimates.

Simulation-Based Parameter Recovery Tests

Simulation-based parameter recovery tests were performed to determine the accuracy, precision, and bias for metabolic and sensory estimates, and to confirm that audiograms well-fit by the model (i.e., predicted error < 15 dB) generate accurate estimates. The simulation generated 1,000 audiograms for each combination of hearing loss parameters: metabolic = 0, 2, … 74 dB, sensory = 0, 2, … 74 dB, and simulated error = 0, 2, … 24 dB, to produce all possible audiograms with pure-tone thresholds within the −10 to 120 dB HL range. The metabolic and sensory profiles were each multiplied by the corresponding parameter value, then summed together to create a simulated audiogram with known values for metabolic and sensory hearing loss (i.e., “ground truth”). Each of the 1,000 simulated audiograms received a different random noise pattern as its simulated error, which was unrelated to the metabolic or sensory profile. The simulated errors were produced that way because predicted error estimates reflect variance in audiometric thresholds that is not explained by the fitted metabolic and sensory estimates (i.e., line fit error). Simulated error was RMS scaled to levels between 0 and 20 dB, before it was added to an audiogram across frequencies. Audiograms with thresholds outside of the −10 to 120 dB HL range were excluded.
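A single simulated audiogram could be generated along the following lines. This Python sketch assumes Gaussian random noise that is not explicitly orthogonalized against the profiles; the noise distribution, the function name, and the profile values are assumptions for illustration, not details reported in the text.

```python
import math
import random

def simulate_audiogram(metabolic_db, sensory_db, error_rms_db,
                       metabolic_profile, sensory_profile, rng):
    """Build one audiogram with known parameters plus RMS-scaled random error."""
    base = [metabolic_db * m + sensory_db * s
            for m, s in zip(metabolic_profile, sensory_profile)]
    noise = [rng.gauss(0.0, 1.0) for _ in base]  # assumed Gaussian noise
    noise_rms = math.sqrt(sum(n * n for n in noise) / len(noise))
    if noise_rms > 0:
        # Rescale the noise so that its RMS equals the requested error level.
        noise = [n * error_rms_db / noise_rms for n in noise]
    simulated = [b + n for b, n in zip(base, noise)]
    # Exclude audiograms with thresholds outside the -10 to 120 dB HL range.
    if min(simulated) < -10.0 or max(simulated) > 120.0:
        return None
    return simulated

# Hypothetical profiles (each averages 1) at four made-up frequencies.
metabolic_profile = [0.6, 0.8, 1.0, 1.6]
sensory_profile = [0.0, 0.4, 1.2, 2.4]
rng = random.Random(0)  # seeded for repeatability
print(simulate_audiogram(20.0, 10.0, 6.0, metabolic_profile, sensory_profile, rng))
```

Repeating this call over a grid of parameter values, many times per combination, would give a set of audiograms with known "ground truth" to pass through the fitting step.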

Metabolic and sensory estimates and predicted error were calculated for each simulated audiogram, then the simulation results were binned into 2 dB increments of metabolic and sensory estimates, and predicted error (i.e., cells). This organization for the results was used to summarize accuracy, precision, and bias for human audiograms that may have similar estimates, but with unknown underlying parameters. A total of 7,142,645 simulated audiograms were included in the analyses, based on sufficient numbers of observations within each cell (at least 2,750). Parameter recovery error was calculated to examine accuracy, precision, and bias by subtracting the metabolic and sensory parameters used to generate a simulated audiogram from each of its derived estimates. Accuracy was determined by calculating the mean absolute metabolic and sensory parameter recovery error for each simulated audiogram, which was averaged for each cell. Precision was determined by calculating standard deviation for the same binned, mean absolute error values, to characterize their variability. Estimation bias was determined based on the average signed metabolic or sensory parameter recovery error, to determine the degree to which each parameter was consistently over-estimated or under-estimated for each cell. Lower averages and standard deviations for the mean absolute parameter recovery error would indicate higher accuracy and precision, respectively, and the bias estimates would characterize whether either parameter was over-estimated or under-estimated by this approach.
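The three summary measures can be sketched for a single cell as follows. This Python example assumes the sample standard deviation is used for precision, which the text does not specify; the function name and the example values are hypothetical.

```python
from statistics import mean, stdev

def summarize_cell(cases):
    """Summarize one cell of (true_met, true_sen, est_met, est_sen) tuples."""
    # Parameter recovery error: estimate minus the generating parameter.
    met_err = [est_m - true_m for true_m, _, est_m, _ in cases]
    sen_err = [est_s - true_s for _, true_s, _, est_s in cases]
    # Mean absolute error across the two components, one value per audiogram.
    abs_err = [(abs(m) + abs(s)) / 2 for m, s in zip(met_err, sen_err)]
    return {
        "accuracy": mean(abs_err),        # lower is more accurate
        "precision": stdev(abs_err),      # assumed sample SD; lower is more precise
        "bias_metabolic": mean(met_err),  # positive means over-estimated
        "bias_sensory": mean(sen_err),
    }

# Hypothetical cell with three simulated audiograms.
cell = [(20, 10, 22, 9), (20, 10, 18, 12), (22, 8, 22, 8)]
summary = summarize_cell(cell)
print(summary)
```

Because accuracy uses absolute errors while bias uses signed errors, a cell can show zero bias and still have non-zero mean absolute error, as in the metabolic component of this toy example.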

Characterizing Metabolic and Sensory Components in Audiograms from Older Adults

Audiograms from older adults were used to characterize associations between the estimates of metabolic and sensory components and age, noise history, sex, and a measure of speech recognition in noise. Estimated losses were characterized for retrospective datasets from the MUSC longitudinal study of age-related hearing loss: (1) a large cross-sectional sample of audiograms measured at baseline (Dubno et al. 2013), (2) a smaller sample of audiograms measured longitudinally (Vaden et al. 2017), and (3) a large cross-sectional sample of speech recognition in noise scores measured at baseline from a subset of participants who provided audiograms.

Cross-Sectional Audiograms and Analyses

The cross-sectional audiogram dataset included 1,728 average audiograms from 865 study participants [480 females and 385 males, ages 50.4 to 89.0+ years; mean age ± SD = 69.9 ± 7.2 years of age]. Because multiple audiograms were collected for each participant over multiple study visits, average pure-tone thresholds were computed for each ear based on an average of 2.9 audiograms collected during their first year in the study. Additional information on the questionnaires, procedures, equipment, and recruitment for this sample is detailed in the original publication (Dubno et al. 2013). After metabolic and sensory estimates were calculated for each average audiogram, 1,634 observations were selected for analysis (94.6% of the original sample), with 94 cases excluded based on predicted error ≥ 15 dB. A regression analysis was used to test the extent to which each estimate related to age and sex (Dubno et al. 2013). A generalized linear mixed model with a random subject effect term was used for these tests, with a base-model that can be expressed as an R equation: Metabolic = Age × Sex + Subject.

Longitudinal Audiogram Collection and Analyses

The longitudinal dataset included 1,826 average audiograms collected from 343 older adult participants [198 F and 145 M; age at first visit: 51.2 to 85.7 years; mean age ± SD = 68.4 ± 6.3 years; number of visits = 2.7 ± 0.9; length of study participation = 6.5 ± 4.2 years]. Average pure-tone thresholds for each time point were calculated from audiograms collected from a cluster of 3–6 visits in a single year, which were repeated at 2–3 year intervals. Responses to a noise history questionnaire (Dubno et al. 2013) indicated that 46.6% of the participants had a positive noise history. Additional information on the original study is published in Vaden et al. (2017). Data from 334 participants were selected for analysis (97.4% of the original sample; 659 ears) with two or more time points, after 56 audiograms were excluded based on predicted error ≥ 15 dB. For the audiograms collected longitudinally, age- and sex-related differences were also tested for metabolic or sensory estimates with regression (Vaden et al. 2017). A generalized linear mixed model with a random effect term for each ear was used, for example: Metabolic = Age × Sex + Ear. Separate regression tests were performed to determine if sensory estimates increase with positive noise history, consistent with earlier observations (Dubno et al. 2013; Vaden et al. 2017): Sensory = NoiseHistory + Sex + Ear. Random effect terms for individual ears were used to control for ear-level differences in these regression models.

Speech Recognition in Noise Scores and Analyses

The speech recognition dataset included average audiograms and low-context scores from the Speech Perception in Noise test (SPIN; Kalikow et al. 1977) for 1,188 participants [658 F and 530 M; age range: 50.0 to 89.0+ years; mean age ± SD = 68.3 ± 8.0 years]. Selecting cases based on predicted error < 15 dB reduced the dataset to 1,161 participants (94.9%), after excluding 121 poorly fit audiograms. Each SPIN sentence was presented at 50 dB sensation level relative to the calculated babble threshold for each participant; babble was presented to achieve a +8 dB signal-to-noise ratio (SNR; Bilger 1984; Bilger et al. 1984). For participants who reported that, under these conditions, SPIN sentences were uncomfortably loud, the presentation level was reduced to a tolerable level while maintaining the +8 dB SNR. Recognition scores for low-context SPIN sentences were first predicted based on importance-weighted audibility using a modified version of the Articulation Index (AI), which estimates speech audibility based on weighted pure-tone thresholds, and presentation levels and spectral characteristics for the speech and noise (Dubno et al. 2008). An adjusted SPIN score (i.e., a difference value) was calculated for each participant by subtracting the AI-predicted SPIN score from the observed score. A regression analysis was used to characterize the extent to which metabolic and sensory estimates affect the AI-adjusted SPIN scores, based on a generalized linear mixed model (GLMM) with a random subject effect term, specified in an R equation: SPIN_obs-pred = Metabolic + Sensory + Age × Sex + Subject.

Histopathology Case Reports and Statistical Analyses

Histopathology case reports from the NTBR (Merchant et al. 1993, 2008) were used to characterize metabolic and sensory estimates in relation to cochlear pathologies. The NTBR case reports were selected from a large-scale searchable database with detailed histology reports that often include audiograms and describe the condition of the stria vascularis and spiral ligament, IHC and OHC losses, and neural losses. Metabolic and sensory estimates were calculated for audiograms that were transcribed from the digital case records by another research lab (see Acknowledgements). Most of the audiograms were not suitable for analysis based on predicted error equal to or above the 15 dB cutoff, which was justified based on our simulation results. The NTBR database is not limited to age-related hearing loss, and often includes more extensive hearing loss than is typical for community-based samples (e.g., Gates et al. 1990; Cruickshanks et al. 1998; Dubno et al. 2013). This was reflected in median predicted error = 13.6 dB for NTBR audiograms, suggesting that these were poorly fit by the modeled hearing loss compared to older adult audiograms from the cross-sectional and longitudinal samples [median predicted error = 7.26 and 7.04 dB, respectively].

Metabolic and sensory estimates were used in combination with low predicted error to select groups of cases that potentially included strial and/or OHC pathology, such that each group of audiograms represented predominantly metabolic or sensory hearing losses, or a combination, based on estimates (Table 1) rather than phenotype categories as defined in Dubno et al. (2013). Cases were excluded based on reports with no noted pathology (N = 13), poorly preserved or prepared samples that could mask pathology (N = 12), and pathology unrelated to presbyacusis (N = 23). Unrelated pathologies included the following: otosclerosis, schwannoma, Ménière’s disease, lymphoma infiltrates, Paget’s disease, mitochondrial dysfunction, severe endolymphatic hydrops, paraneoplastic encephalitis, metastatic adenocarcinoma infiltration, and fat-wire prosthesis. In total, thirty-seven ears were analyzed [16 F, mean age at audiogram ± SD = 68.4 ± 8.2 years, mean age at death = 74.0 ± 9.0 years]. The NTBR case numbers are listed individually for each group in Table 2.

Table 1 Histopathology groups defined by estimates
Table 2 Histopathology cases in each group

A general linear model regression analysis was used to test the prediction that presence of strial, OHC, or neural pathology was associated with metabolic or sensory estimates, which can be expressed in an R equation: Metabolic = Strial + OHC + Neural. Additional regression tests were performed to determine the extent to which strial pathology increased with age, and whether OHC losses differed by sex.

Statistical Software and Associated Packages

All of the statistical analyses were performed using the R Statistics Project (3.6.2). The R-package: digitize (0.0.4) was used for reading data from plots. For the generation of hearing loss profiles from labelled human audiograms, R-packages: nlme (3.1-142) and MASS (7.3-51.4) were used to classify and select representative audiograms for the profiles. Metabolic and sensory estimates were generated by fitting profiles to hearing thresholds using the R-base (3.6.2) function for optimization with lower bound: optim (L-BFGS-B method). Generalized linear mixed models were fitted using the R-packages: lme4 (1.1-21) and lmerTest (3.1-2). Comparisons of correlations based on dependent groups were performed using the R-package: cocor (1.1-3).

RESULTS

Metabolic and Sensory Estimates for Gerbils

The first step was to verify that hearing loss profiles accurately estimated metabolic and sensory components of age-related hearing loss from CAP threshold shifts measured in gerbils with well-characterized strial and/or OHC pathologies (Schmiedt et al. 1990, 2002; Tarnowski et al. 1991; Lang et al. 2010). Significant differences in the metabolic estimates were observed across groups [F(3,60) = 18.56, P = 1.23 × 10^−8], with significantly lower metabolic estimates for young controls compared to the other groups [Tukey tests: T ≥ 5.22, P < 0.001]. The metabolic estimates were not significantly different between the quiet-aged, noise-aged, and furosemide-exposed groups [Tukey tests: P ≥ 0.73]. The sensory estimates were also significantly different across groups [F(3,60) = 13.57, P = 7.29 × 10^−7], with significantly higher sensory estimates for noise-aged gerbils than the other groups [Tukey tests: T ≥ 5.43, P < 0.001]. Sensory estimates were not significantly different between the quiet-aged, furosemide-exposed, and young control groups [Tukey tests: P ≥ 0.36]. Figure 3A, B show the estimates of the metabolic and sensory components for each group.

Fig. 3.

Significant differences were observed for the metabolic estimates A and sensory estimates B, based on gerbil CAP threshold shifts. A Metabolic estimates (dB) were significantly higher for each group compared to the controls, and were not significantly different between quiet-aged, furosemide-exposed, and noise-aged gerbils. B Sensory estimates were significantly higher for the noise-aged gerbils than for the other three groups. C Lower EP was significantly associated with higher metabolic estimates, following a 0.5 dB/mV slope, shallower than 1 dB/mV for 16 kHz CAP threshold shifts (Schmiedt et al. 2002). Because of low-frequency contributions, the metabolic estimates were smaller than high-frequency CAP threshold shifts despite their high correlation [r(62) = 0.95, P = 2.2 × 10−16].

Lower EP was significantly related to higher metabolic estimates (Fig. 3C); this significant relationship remained when adjusted for sensory estimates [r(25) = −0.76, P = 4.1 × 10−6; partial r(25) = −0.65, P = 0.0002]. Lower EP was also significantly associated with higher sensory estimates, but this association was not significant after adjusting for metabolic estimates [r(25) = −0.51, P = 0.006; partial r(25) = 0.02, P = 0.92]. These results suggest that EP was related to metabolic estimates, whereas the association between EP and sensory estimates depended on metabolic differences.
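The adjustment described above can be illustrated with the standard first-order partial correlation, which removes the linear effect of a third variable from a pairwise correlation. This is a sketch under the assumption that the reported partial r values correspond to this textbook formula; the exact procedure used in the study is not stated here, and the function names are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_r(x, y, z):
    """Correlation between x and y after removing the linear effect of z."""
    r_xy = pearson_r(x, y)
    r_xz = pearson_r(x, z)
    r_yz = pearson_r(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
```

With x as EP, y as the sensory estimate, and z as the metabolic estimate, a partial correlation near zero alongside a sizeable pairwise correlation is the pattern reported for the sensory estimates.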

Simulation Results

The simulation results characterized how accuracy, precision, and bias varied for the metabolic and sensory estimates based on measures derived from parameter recovery error. First, lower predicted error was related to smaller parameter recovery errors (Fig. 4A), such that more explained variance in an audiogram was associated with estimates that more accurately represented the simulation parameters for metabolic and sensory components. Example audiograms show that well-fit audiograms had threshold configurations that were more typical for older adults, based on our observations (Fig. 4B), than poorly fit audiograms (Fig. 4C).

Fig. 4.

Parameter recovery error from simulated audiograms was used to characterize the accuracy, precision, and bias of the metabolic and sensory estimates. Audiograms that were well-approximated by the model estimates (i.e., lowest predicted error) showed the highest accuracy (i.e., lowest mean absolute error; A). Example audiograms and simulated audiograms are shown for predicted error = 5 dB B and predicted error = 20 dB C. Simulated audiograms that were well-fit by modeled hearing loss (predicted error = 0 to 14 dB) compared to poorly fit audiograms (predicted error = 16 to 20 dB) demonstrated higher accuracy (lower mean absolute error; D, G), better precision (lower SD error; E, H), and lower estimation bias (lower average metabolic and sensory parameter recovery errors; F, I). The small, superimposed circles in D and E show increasing observations with larger radii (N = 0 to 120), suggesting that the distribution of estimates derived from human audiograms in the current study have high accuracy and precision. The bias plots represent the metabolic and sensory estimates with acute triangles, which point in the direction of estimation error to indicate if each estimate on average was lower or higher than the parameter (F, I). The directional key shows how the plot represents bias in terms of overestimated ( +) or underestimated ( −) parameters (M, S). An identical color scale is used for each of the plot panels, to show the average error, variance, and bias across metabolic and sensory estimates.

On average, the well-fit simulated audiograms with predicted error < 15 dB showed higher accuracy (Fig. 4D), better precision (Fig. 4E), and less biased estimates (Fig. 4F) compared to the poorly fit simulated audiograms (Fig. 4G–I). Accuracy was also characterized by the average shared variance between parameters and estimates (R-square), which naturally increases with smaller parameter recovery errors. Well-fit audiograms had estimates that were strongly correlated with (or identical to) simulation parameters, with 75 to 99 % shared variance for metabolic estimates and 77 to 99 % for sensory estimates, when calculated separately for predicted error = 0, 2, … 14 dB. The poorly fit simulated audiograms showed 65 to 70 % shared variance for metabolic estimates and 62 to 72 % for sensory estimates, calculated separately for predicted error = 16, 18, 20 dB. These results suggest that a predicted error cutoff of 15 dB can be used to limit parameter recovery error in the metabolic and sensory estimates, and that lower predicted error was generally related to better estimates.

Simulated audiograms that represented a nearly equal combination of sensory and metabolic components often resulted in slightly under-estimated parameters, while combinations of sensory and metabolic estimates that were unequal could over-estimate the larger parameter and under-estimate the smaller parameter (Fig. 4F, I). This tendency was seen for metabolic and sensory estimates alike, which suggests that the algorithm does not exhibit a consistent bias for metabolic or sensory estimates. Superimposing the estimates from the human audiogram datasets analyzed in the following sections (dark outlines in Fig. 4D, E) suggested that their estimates would have high accuracy and precision. More generally, these results suggest that well-fit audiograms can produce estimates of metabolic and sensory components with high accuracy, high precision, and low bias, which supports the predicted error criterion used for the current study.
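The three summary measures used in the simulations (accuracy as mean absolute error, precision as the SD of the error, and bias as the average signed error) can be computed from paired simulation parameters and recovered estimates. The sketch below is illustrative only; the function name and the example values are hypothetical, and the sign convention (estimate minus parameter, so positive values indicate overestimation) is an assumption chosen to match the bias key described for Fig. 4.

```python
import statistics

def recovery_metrics(true_values, estimates):
    """Summarize parameter recovery error (estimate minus true value, in dB)."""
    errors = [est - true for true, est in zip(true_values, estimates)]
    return {
        "accuracy_mae": statistics.fmean(abs(err) for err in errors),  # lower is more accurate
        "precision_sd": statistics.stdev(errors),                      # lower is more precise
        "bias_mean": statistics.fmean(errors),                         # > 0 means overestimated
    }

# Hypothetical simulated metabolic parameters and the estimates recovered from them.
metrics = recovery_metrics([10, 20, 30], [12, 18, 33])
```

In this toy example the errors are 2, −2, and 3 dB, so the estimates are overestimated by 1 dB on average even though individual errors go in both directions.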

Metabolic and Sensory Estimates from Older Adult Audiograms

Estimates of metabolic and sensory components were next validated with the large cross-sectional sample of audiograms from older adults (Dubno et al. 2013). The results (Fig. 5A, B) showed that metabolic estimates were significantly higher with increasing participant age [Z = 12.10, P < 2 × 10−16] and were not significantly different for males and females [Z = −1.39, P = 0.17]. The age-sex interaction did not affect model fit for the metabolic estimates [P = 0.94]. The sensory estimates from the cross-sectional data showed a significant age-sex interaction [Z = 2.94, P = 0.0034]. Follow-up tests showed a slightly larger sex difference in sensory estimates for younger participants [below median age: Z = −11.74, P = 2 × 10−16] than for older participants [above median age: Z = −7.23, P = 2.33 × 10−12]. Additional post hoc tests showed significantly lower sensory estimates with increasing age for males [Z = −2.97, P = 0.0032] and no significant age effect on sensory estimates for females [P = 0.20]. This unexpected decrease in sensory estimates with increasing age for males is discussed later. A significant correlation was observed across left and right ears for the metabolic estimates [r(790) = 0.85, P < 2.2 × 10−16], which was significantly higher [Z = 4.45, P = 4.35 × 10−6] than for the sensory estimates [r(790) = 0.80, P < 2.2 × 10−16], consistent with previous observations of agreement in phenotype classifications across left and right ears (Dubno et al. 2013; Vaden et al. 2017).

Fig. 5.

Age differences in the metabolic and sensory estimates were examined in a cross-sectional data set, which included 1634 audiograms (N = 842 participants; Dubno et al. 2013). The metabolic estimates were significantly higher for older participants A, whereas sensory estimates were not consistently related to age differences B. Each point in A represents a metabolic estimate and corresponds to a sensory estimate in B.

Longitudinal Changes in Metabolic and Sensory Estimates

Results from the sample of audiograms from older adults collected longitudinally (Vaden et al. 2017) showed age-related changes in the metabolic estimates (Fig. 6A) and inconsistent changes in the sensory estimates (Fig. 6B). A significant age-sex interaction was observed for metabolic estimates [Z = 3.23, P = 0.0013]. Metabolic estimates increased with age at a faster rate for females [Z = 23.75, P = 2 × 10−16] than for males [Z = 19.41, P = 2 × 10−16]. A significant age-sex interaction was also observed for the sensory estimates [Z = −5.53, P = 3.7 × 10−8], with no age effect for females [P = 0.22] and a significant age-related decrease for males [Z = −5.77, P = 1.3 × 10−8]. Participants who reported a positive noise history showed significantly higher sensory estimates [Z = 8.59, P = 2 × 10−16] but not metabolic estimates [Z = −0.16, P = 0.88], consistent with the original results (Vaden et al. 2017). However, noise history-related differences in the sensory estimates were non-significant [P = 0.26] when sex was included in the regression, reflecting large sex differences with 80 % of males and 23 % of females reporting positive noise histories (Dubno et al. 2013; Vaden et al. 2017). This suggests that the sex-related differences in sensory estimates reflect differences in noise exposure histories.

Fig. 6.

Metabolic estimates derived from longitudinally measured audiograms were significantly higher with increasing age A, whereas sensory estimates were less consistently related to age B. Individual ears (lines) were shaded for visual contrast. C Metabolic and sensory changes (average estimate differences in dB per year) are represented by arrows, which show that increased metabolic estimates were sometimes accompanied by sensory decreases (N = 227 ears, upper left quadrant). Less often, sensory estimate increases were accompanied by metabolic decreases (N = 80 ears, bottom right). Metabolic and sensory changes that appeared to trade off also showed increases at approximately twice the rate of decreases, on average. This suggested that negative associations between metabolic and sensory changes reflect different factors, with sensory hearing loss sometimes obscured by rapid metabolic changes and vice versa. Moreover, metabolic increases without an attendant sensory change (N = 180 ears) and sensory increases without metabolic change (N = 66 ears) indicate that these estimates can change independently of one another. D Audiograms from one ear measured over 9 years show an example of combined losses (indicated with purple color), in which metabolic losses obscured a sensory loss (legend).

In theory, sensory hearing loss should not decrease with increasing age for males, despite what the significant effects in the two datasets suggest. Instead, high-frequency sensory losses may be concealed by the high-frequency component of metabolic losses, which grows with increasing age. This would explain the patterns observed in the cross-sectional and longitudinal results. Indeed, longitudinal increases in metabolic estimates were sometimes associated with decreasing sensory estimates for individual ears and vice versa (Fig. 6C). Audiograms from an individual case followed this hypothesized pattern over a decade (Fig. 6D), with initially high sensory estimates that decreased from 30 to 9 dB while metabolic estimates increased from 9 to 43 dB. These observations suggest that sensory estimates may be more accurate for individuals with lower metabolic estimates, whereas combined patterns with high metabolic estimates could underestimate sensory losses.
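As a worked illustration of the trade-off pattern, the sketch below converts the first and last estimates for the individual case (Fig. 6D) into average rates of change and labels the direction of each. It is a simplification: it uses only the endpoints rather than all audiograms, it takes the 9-year span from the Fig. 6 caption, and the function names are hypothetical.

```python
def change_per_year(first_db, last_db, years):
    """Average change in an estimate (dB per year) between two time points."""
    return (last_db - first_db) / years

def describe_change(metabolic_rate, sensory_rate):
    """Label the direction of metabolic and sensory change for one ear."""
    def direction(rate):
        if rate > 0:
            return "increase"
        if rate < 0:
            return "decrease"
        return "no change"
    return f"metabolic {direction(metabolic_rate)}, sensory {direction(sensory_rate)}"

# Values reported for the individual case: metabolic 9 to 43 dB, sensory 30 to 9 dB.
metabolic_rate = change_per_year(9, 43, 9)   # about +3.8 dB per year
sensory_rate = change_per_year(30, 9, 9)     # about -2.3 dB per year
label = describe_change(metabolic_rate, sensory_rate)
```

Under these assumptions the metabolic estimate rises faster than the sensory estimate falls, which places this ear in the metabolic-increase, sensory-decrease quadrant described for Fig. 6C.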

Speech Recognition in Noise

The AI-adjusted SPIN scores were most strongly related to the metabolic estimate (Fig. 7A, B). AI-adjusted SPIN scores were significantly poorer for older adults with higher metabolic estimates [Z = −15.92, P = 2 × 10−16], higher sensory estimates [Z = −5.75, P = 1.10 × 10−8], older age [Z = −4.16, P = 3.54 × 10−5], for males compared to females [Z = −2.31, P = 0.021], and there was a significant interaction between age and sex [Z = −2.37, P = 0.018], based on a regression model that included all of these factors. The significant age-sex interaction reflected significantly poorer AI-adjusted SPIN scores with increasing age for females [Z = −5.00, P = 7.90 × 10−7], but not males [Z = −0.14, P = 0.89]. Effect size estimates from the regression model showed that AI-adjusted SPIN scores were 0.53 % lower with each 1 dB increase in the metabolic estimate and 0.24 % lower with the sensory estimate. A comparison of the correlations between the AI-adjusted SPIN scores and each estimate confirmed that the metabolic estimates were significantly more predictive of poorer speech recognition than the sensory estimates [Z = −15.40, P = 8.68 × 10−54]. Lower observed speech recognition scores in noise (i.e., not corrected for audibility differences) were related to increased metabolic estimates [Z = −36.80, P = 2 × 10−6] and sensory estimates [Z = −23.37, P = 2 × 10−6], which simply confirms that speech recognition is poorer with increasing hearing loss.
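The reported effect sizes can be read as slopes, as in the following arithmetic sketch. It counts only the metabolic and sensory terms and ignores the age, sex, and interaction terms of the regression model, so it is not a prediction tool. The function name and the example estimates of 40 dB and 10 dB are hypothetical, and the sketch assumes the 0.24 % sensory effect is also per 1 dB.

```python
# Slopes reported in the text: change in AI-adjusted SPIN score (%) per 1 dB.
METABOLIC_SLOPE = -0.53
SENSORY_SLOPE = -0.24

def spin_shift(metabolic_db, sensory_db):
    """Model-implied change in AI-adjusted SPIN score (%), counting only the
    metabolic and sensory terms of the regression."""
    return METABOLIC_SLOPE * metabolic_db + SENSORY_SLOPE * sensory_db

# Hypothetical listener with a 40 dB metabolic and a 10 dB sensory estimate.
shift = spin_shift(40, 10)  # -21.2 from the metabolic term, -2.4 from the sensory term
```

For this hypothetical listener, the metabolic term accounts for most of the implied deficit, which reflects both the steeper slope and the larger estimate.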

Fig. 7.

Higher metabolic estimates were significantly associated with poorer than expected low-context SPIN scores, after adjusting for AI-weighted audibility differences A, whereas a weaker association was seen for sensory estimates in B. In each panel, the horizontal line at zero represents cases in which observed scores matched AI-predicted scores; data points below the line represent scores that are poorer than predicted based on AI-weighted speech audibility.

Metabolic and Sensory Estimates from Histopathology Case Reports

Human histopathology case reports from the NTBR (Merchant et al. 1993, 2008) were used to determine the extent to which metabolic and sensory estimates were associated with strial, OHC, and neural pathologies. Cochleas from older donors were significantly more likely to show strial pathology [T(33) = 2.49, P = 0.019], but not OHC or neural pathology [P > 0.61]. Strial, OHC, and neural pathology were not significantly different for male and female donors [P > 0.37]. The interval between audiogram collection and death was not significantly related to pathology [mean = 6.1 ± 5.4 years; P > 0.49], nor was the interval between death and collection of the temporal bones [median = 14.5 h; P > 0.40].

As shown in Fig. 8A, metabolic estimates were significantly higher for cochleas with strial pathology [T(33) = 2.39, P = 0.022], but not for those with OHC [T(33) = −1.48, P = 0.15] or neural pathology [T(33) = 0.74, P = 0.46]. Sensory estimates were significantly higher with OHC pathology [T(33) = 1.91, one-tailed P = 0.032] and with the absence of strial pathology [T(33) = −2.11, P = 0.043], and were not significantly related to neural pathology [T(33) = −0.15, P = 0.88]. Combined OHC and strial pathology was associated with lower sensory estimates than OHC pathology in isolation (Fig. 8B). Figure 8C shows different audiogram patterns when the cases were grouped by pathology. Metabolic and sensory estimates are shown with histology images for one case with predominantly strial pathology (Fig. 9A, B) and another case with predominantly OHC pathology (Fig. 9C, D), from two published reports (Schuknecht 1964; Peng and Linthicum 2016).

Fig. 8.

Metabolic and sensory estimates were differentially associated with reports of strial and OHC pathology (N = 37 ears) in the NTBR. Metabolic estimates were significantly higher in cases with strial pathology A, and sensory estimates were significantly higher with OHC pathology B. C Audiograms grouped by pathology showed distinct patterns of age-related hearing loss. Note that cochlear pathologies were unevenly distributed because cases were selected based on estimates, independently of pathology. The largest number of cochleas included (from left to right panels) both strial and OHC pathology, strial only, OHC only, then neither (i.e., neural).

Fig. 9.

A-B Audiograms were collected longitudinally (right ear) from a temporal bone donor at ages 63, 71, and 75 years (filled circles). The metabolic estimates increased from 39 to 51 dB over 12 years (blue lines). Near-complete strial atrophy in the middle-turn of the cochlea was noted, consistent with a selective metabolic loss (arrow and denoised image adapted from Peng and Linthicum 2016). C-D Predominantly sensory loss (red lines) was observed for a right ear audiogram from a donor 70 years of age (Schuknecht 1964). The NTBR case report noted OHC losses from 0 to 10 mm of the base (red arrow; Case # 204718, NTBR). Note that the histology images were contrast-adjusted B and arrows were added for clarity B, D, as the original images are published and accessible elsewhere.

DISCUSSION

The current approach for estimating metabolic and sensory components of age-related hearing loss was initially validated based on retrospective analyses of CAP threshold shifts and EP values from gerbils with known strial and/or OHC pathologies. Metabolic and sensory estimates were shown to differentiate losses for these groups of gerbils, providing support for the profile-based model approach. Retrospective datasets of audiograms from older adult humans provided additional validation for metabolic estimates that consistently increased with age, and sensory estimates that were higher for males than females and reflected differences in their noise exposure histories. Finally, the histopathology dataset linked cochlear pathology to differences in the metabolic and sensory estimates based on audiograms from temporal bone donors. Together, the results support a multifactorial view of age-related hearing loss with metabolic and sensory components that are both reflected in audiograms from older adults.

Retrospective gerbil thresholds coupled with well-characterized cochlear pathologies provided an initial validation for the current approach to estimating metabolic and sensory components of age-related hearing loss. Consistent with the original observations for CAP thresholds (Schmiedt et al. 2002; Lang et al. 2010; Schmiedt 2010), quiet-aged, noise-aged, and furosemide-exposed gerbils showed significantly higher metabolic estimates than younger controls. Although noise-aged gerbils showed significantly higher sensory estimates than each of the other groups, their metabolic estimates were similar to those of quiet-aged and furosemide-exposed gerbils. This result was particularly important, because it showed that the noise-aged gerbils had a combination of sensory hearing loss related to noise exposure and age-related metabolic hearing loss. Higher metabolic estimates were significantly associated with lower EP and dissociated from sensory estimates, in agreement with predictions of the effects of strial dysfunction (Schmiedt 2010). These CAP threshold shifts and EP values from gerbils provided an ideal validation for profile-based estimates, in cases where diet, drugs, noise-exposure, and other environmental factors were well controlled. Limitations for the retrospective gerbil data analyses included the small sample size (N = 5) and 12-month age differences for the noise-aged gerbils, both of which reflected limited availability. However, these limitations did not seem to affect sensitivity to age-related differences in metabolic estimates, EP associations, or noise-related effects on sensory estimates. As discussed in the next section, the larger audiogram datasets collected from older humans provided more statistical power.

Profile-based estimates were used to quantify metabolic and sensory losses for older adults. Consistent with the original studies, the retrospective analyses showed higher metabolic estimates with increasing age, and higher sensory estimates for males than females (Dubno et al. 2013; Vaden et al. 2017). Moreover, the participants who reported positive noise histories had significantly higher sensory estimates but not metabolic estimates. Sex-related differences in self-reported noise history suggest that occupational and recreational activity differences may account for higher sensory estimates among older males compared to females. As discussed in more detail below, the metabolic estimates were also significantly associated with poorer than expected speech recognition in noise, based on audibility-adjusted SPIN scores. Together, the results from older adults suggest that audiogram-derived estimates are sensitive to different components of age-related hearing loss with distinct consequences, including for speech recognition.

Speech recognition is poorer with increasing age and hearing loss, even after controlling for audibility differences (e.g., Dubno et al. 1984), although different metabolic and sensory effects have not been previously characterized. Metabolic hearing loss could be predicted to impact speech recognition (Schmiedt 2010), based on loss or inactivity of low-spontaneous rate (low-SR) auditory nerve fibers that process suprathreshold sounds (e.g., speech in noise) in aged gerbils (Schmiedt et al. 1996) and chronic furosemide-exposed gerbils (Lang et al. 2010). Because metabolic hearing losses in older adult audiograms are also often more extensive than sensory losses in terms of threshold elevation and the range of frequencies affected, there could be more extensive indirect or downstream effects on speech recognition scores, beyond reduced audibility. Alternatively, sensory losses appear to affect auditory nerve activity that encodes critical acoustic information for speech to a larger extent than metabolic losses (Henry et al. 2019) and noise-related functional changes have been hypothesized to affect speech recognition in noise (e.g., Schuknecht and Gacek 1993; Kujawa and Liberman 2009).

Using differences between AI-predicted scores and observed scores is a principled and broadly accepted approach to study age-related differences in speech recognition that are not attributed to the effects of reduced speech audibility. This is important because differences in speech audibility explain, by far, the largest portion of the variance in speech recognition scores (e.g., Shub et al. 2020). Indeed, the unadjusted SPIN scores were significantly poorer with increasing metabolic and sensory estimates, although this simply confirmed that speech was less audible with more hearing loss. More interestingly, AI-adjusted scores were significantly poorer when the metabolic estimates exceeded 20 dB and, to a lesser degree, the adjusted SPIN scores were also lower with increased sensory hearing loss. These associations with AI-adjusted SPIN scores could be affected by limited sensitivity to small metabolic or sensory hearing losses in certain cases, as indicated by the simulation results (Fig. 4F). This result provides evidence that suggests metabolic declines (and not only the threshold elevation and reduced audibility related to those declines) affect speech recognition, as predicted by Schmiedt (2010). Moreover, this finding provides a potential explanation for why amplification strategies such as hearing aids, which are often fit according to the audiogram but without consideration of metabolic/sensory components, provide such a variable benefit for older adults with speech recognition difficulties.

Broader Implications of the Current Approach

The metabolic and sensory estimates provide quantitative metrics for components of age-related hearing loss, in contrast to phenotype categories (Dubno et al. 2013), which is an advancement in the study of cochlear pathologies. The audiogram-derived estimates provide several methodological improvements over phenotype categories. First, continuous estimates can improve statistical sensitivity over phenotype categories, better accounting for variance in other continuous variables (e.g., age). Second, continuous estimates can increase statistical power because every audiogram has a metabolic and sensory estimate, which contrasts with mutually exclusive phenotype categories that may also include unbalanced sample sizes. Third, the magnitude of losses can be compared for audiograms that show a consistent categorical pattern (e.g., small versus large metabolic losses). Finally, because most older adults appear to have combined metabolic and sensory components (Dubno et al. 2013; Vaden et al. 2017; Parthasarathy et al. 2020), estimating their combination can provide more specific information for the majority of older adults.

Metabolic and sensory estimates can improve specificity for characterizing age-related hearing loss, compared to categorical phenotypes. However, cases with high metabolic estimates warrant some caution because their gradually sloping audiograms may obscure relatively smaller sensory losses reflected in steeply sloping audiograms (see Lebo and Reddell 1972 for related logic; Vaden et al. 2017 for examples). Metabolic and sensory hearing loss are both believed to limit cochlear amplification and reduce sensitivity, particularly for high-frequency sounds, which could introduce ambiguity for deriving estimates from audiograms with high-frequency losses. For example, a sensory hearing loss could be obscured by metabolic declines at an older age that additionally affect low frequencies. Indeed, several observations from the current study suggest that increasing metabolic losses may limit detection of sensory losses. Because the simulation results indicated that the model produces less biased estimates when predicted error < 15 dB, these interactions could reflect an age-related progression of metabolic hearing loss that can overtake sensory hearing loss over a period of many years.

In addition to age-related hearing loss, audiograms can reflect hearing losses resulting from heredity, genetic disorders, acoustic trauma, exposure to ototoxic drugs, and other diseases and conditions not related to aging. In other words, an older adult audiogram may not exclusively reflect metabolic or sensory components of age-related hearing loss. Audiograms from older adults may involve patterns of hearing loss that are not well-approximated by any combination of metabolic and sensory estimates, resulting in a poor model fit for their pure-tone thresholds (i.e., high predicted error). Predicted error less than 15 dB was associated with audiograms that are typical for age-related hearing loss in community-based research samples (e.g., Cruickshanks et al. 1998; Dubno et al. 2013), which provided the original rationale for this value as an inclusion criterion for the statistical analyses.

To facilitate the use of this approach by researchers and clinicians with large audiogram datasets from more heterogeneous populations, simulation tests were used to identify conditions that affect the accuracy, precision, and bias for metabolic and sensory estimates. A high level of predicted error, as suggested by its name, was shown to negatively impact all three measures, suggesting that caution is required for poorly approximated audiograms. The estimation bias results suggest that predominantly metabolic or sensory losses could be overestimated, resulting in an underestimation of the relatively smaller sensory or metabolic parameters, especially with higher predicted error. Conversely, simulated audiograms with lower predicted error demonstrated more accurate, more precise, and less biased estimates, providing additional reasons to exclude poorly fit audiograms that often appear atypical for age-related hearing loss (e.g., “cookie-bite” audiograms). Higher or lower predicted error cutoffs may be appropriate depending on the goals and sample composition of future research studies.
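Applying a predicted error cutoff amounts to a simple filter over fitted audiograms, sketched below. The record layout, the field name predicted_error_db, and the function name are hypothetical, and the predicted error values are assumed to have been computed already by the fitting procedure; only the 15 dB default comes from the text.

```python
def split_by_predicted_error(records, cutoff_db=15.0):
    """Separate fitted audiograms into those below the predicted error cutoff
    (kept for analysis) and those at or above it (excluded as poorly fit)."""
    kept = [r for r in records if r["predicted_error_db"] < cutoff_db]
    excluded = [r for r in records if r["predicted_error_db"] >= cutoff_db]
    return kept, excluded

# Hypothetical fitted audiograms with precomputed predicted error (dB).
fits = [
    {"ear": "A", "metabolic": 32.0, "sensory": 5.0, "predicted_error_db": 4.2},
    {"ear": "B", "metabolic": 10.0, "sensory": 28.0, "predicted_error_db": 15.0},
    {"ear": "C", "metabolic": 0.0, "sensory": 12.0, "predicted_error_db": 19.5},
]
kept, excluded = split_by_predicted_error(fits)
```

Keeping the cutoff as a parameter makes it straightforward to rerun an analysis with a stricter or more lenient criterion, in line with the suggestion that the appropriate cutoff may depend on the goals and sample composition of a study.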

One limitation of the parameter recovery test was that simulated error was added across frequencies to each simulated audiogram to approximate a specific amount of predicted error, which might not align with predicted error patterns in the audiogram datasets. Nonetheless, the simulation results aligned realistically with findings from our experiments that some sensory hearing losses may be misattributed to the metabolic estimate, and vice versa. The simulation results suggest that different methods may be required for cases with audiograms that are poorly fit by the metabolic and sensory profiles, which often have predicted error above 15 dB. This could be especially true for audiograms in clinical datasets (e.g., Parthasarathy et al. 2020; Wu et al. 2020), which appear more variable than audiograms collected from community-based older adult research participants (e.g., Cruickshanks et al. 1998; Dubno et al. 2013).

Histopathology reports were also used to examine how cochlear pathology related to metabolic and sensory estimates in humans. We observed that metabolic estimates were significantly higher for cochleas with strial pathology, and sensory estimates were significantly higher with OHC pathology. Strial pathology appeared to obscure sensory hearing loss, as shown earlier. Conversely, several cases in the sensory group had mixed OHC and strial pathologies despite metabolic estimates = 0 dB, consistent with the “battery theory” prediction that OHC pathology can offset the effects of strial declines on thresholds by reducing EP load (as proposed in Schmiedt 1993, 1996; and further developed in Eckert et al. 2021). Important limitations included sample size and case reports from different institutions with varied protocols. The sample size was limited because broader search parameters tended to increase data rejection for unrelated pathologies or no pathology noted in the histopathology reports. Nonetheless, these findings suggest metabolic and sensory estimates may reflect distinct cochlear pathologies in age-related hearing loss in humans.

The battery model of age-related hearing loss proposes a physiological source for interactions between strial and OHC pathologies on hearing thresholds (Schmiedt 1993; Eckert et al. 2021). Strial and OHC pathologies are hypothesized to interact in a homeostatic fashion: OHC losses reduce the draw on strial current, which allows near-normal EP despite strial atrophy. This prediction is supported by the high-frequency threshold shifts of noise-aged gerbils (Fig. 1A) and sensory cases with OHC and strial pathology (Fig. 8A, B). These observations also suggest that noise-related OHC losses may explain a lack of correspondence in strial data and hearing thresholds (Wu et al. 2020). Although the current model did not take interactions between metabolic and sensory components into account for the gerbil or human datasets, such considerations grounded in battery theory might enhance its performance.

We note that EP declines in metabolic hearing loss can affect IHC function, albeit on a smaller scale than their effects on OHC function. This reflects long-standing observations that OHC loss affects the cochlear microphonic about 20 dB more than equivalent IHC loss, suggesting the OHCs draw substantially more electrical current than IHCs (Dallos and Cheatham 1976). Effects of age-related metabolic declines on IHC function are most likely seen in the loss of activity of the low-SR auditory nerve fiber population (Schmiedt et al. 1996), based on direct EP reduction with chronic furosemide application (Lang et al. 2010). Lowered activity is hypothesized to result from IHCs not being depolarized sufficiently due to reduced EP, which results in higher thresholds to excite the low-SR synapses on the IHC wall. Thus, metabolic losses affect thresholds for high-SR fibers via the OHC amplifier, whereas thresholds of low-SR fibers are affected via the IHCs. Because the audiogram represents only sensitive, high-SR fiber thresholds (Liberman 1978; Schmiedt 1989), only the OHC effects are predicted to be observed in pure-tone thresholds, while the predicted suprathreshold IHC effects are shallower amplitude-intensity functions for the CAP and auditory brainstem responses (ABR). Indeed, differential effects that are consistent with those predictions have been observed in ABR and CAP threshold data from gerbil models (Mills and Schmiedt 2004) as well as neural sound encoding in chinchilla models (Henry et al. 2019) of sensory and metabolic hearing loss. Together, these observations suggest that both hair cell populations are affected by age-related metabolic declines (reduced EP), but with different physiological and behavioral results (OHC: thresholds, IHC: suprathreshold function).

The current approach used retrospective audiogram datasets that were collected at conventional audiometric frequencies for humans (0.25, 0.5, 1, 2, 3, 4, 6, 8 kHz), and CAP threshold shifts measured consistently across frequency from animal models. Although the code we used can calculate metabolic and sensory estimates for human audiograms despite missing thresholds, its accuracy and precision may not generalize to audiograms with too many missing thresholds. Because the gerbil hearing loss profiles were developed as an initial test for the profile-based estimates and a small number of gerbils were used for the sensory profile, we recommend against using the hearing loss profiles calculated here for other species or frequency ranges. Nonetheless, a similar approach could be appropriately used to derive profiles and calculate estimates for other species, to the extent that distinct hearing loss patterns can be manipulated and measured.
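To make the missing-threshold caveat concrete, the following is a minimal sketch of how a two-component fit can skip unmeasured frequencies. It is not the code used in this study: the profile values are hypothetical placeholders rather than the profiles derived here, the non-negative least-squares fit is a stand-in for the profile-based estimation, and the function name `estimate_components` and the `min_points` cutoff are assumptions made for illustration.

```python
import numpy as np

# Conventional audiometric frequencies (kHz) named in the text.
FREQS_KHZ = np.array([0.25, 0.5, 1, 2, 3, 4, 6, 8])

# HYPOTHETICAL placeholder profiles (dB of threshold shift per unit estimate).
# These are illustrative shapes only, not the profiles derived in this study.
METABOLIC_PROFILE = np.array([0.40, 0.40, 0.45, 0.60, 0.75, 0.85, 0.95, 1.00])
SENSORY_PROFILE = np.array([0.00, 0.00, 0.05, 0.30, 0.70, 1.00, 0.90, 0.80])


def estimate_components(thresholds_db, min_points=4):
    """Fit non-negative metabolic and sensory weights to one audiogram.

    Missing thresholds are passed as NaN and are left out of the fit.
    Returns (metabolic, sensory, rmse_db) computed on observed frequencies.
    """
    y = np.asarray(thresholds_db, dtype=float)
    observed = ~np.isnan(y)
    if observed.sum() < min_points:
        # Assumed guard: with too few points the two weights are poorly constrained.
        raise ValueError("too many missing thresholds for a stable estimate")

    A = np.column_stack([METABOLIC_PROFILE, SENSORY_PROFILE])[observed]
    b = y[observed]

    def rmse(coef):
        return float(np.sqrt(np.mean((A @ coef - b) ** 2)))

    # Two-variable non-negative least squares: the optimum is either the
    # unconstrained solution (if feasible) or lies on a boundary where one
    # weight is zero, so it is enough to compare these candidates.
    candidates = [np.zeros(2)]
    full, *_ = np.linalg.lstsq(A, b, rcond=None)
    if np.all(full >= 0):
        candidates.append(full)
    for j in range(2):
        coef = np.zeros(2)
        coef[j] = max(0.0, float(A[:, j] @ b) / float(A[:, j] @ A[:, j]))
        candidates.append(coef)

    best = min(candidates, key=rmse)
    return float(best[0]), float(best[1]), rmse(best)


# Usage: a synthetic audiogram with two unmeasured frequencies (3 and 6 kHz).
audiogram = 30 * METABOLIC_PROFILE + 20 * SENSORY_PROFILE
audiogram[[4, 6]] = np.nan
metabolic, sensory, error_db = estimate_components(audiogram)
```

In this sketch each missing threshold removes one row from the fit, so the estimates rest on fewer observations as more frequencies are unmeasured, which is one way to see why accuracy may degrade for sparse audiograms.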

Finally, we note that the predicted error measure from our model may be used to limit outlier audiograms in clinical and occupational audiogram datasets (e.g., Masterson et al. 2013; Cantuaria et al. 2020), which often include more heterogeneous hearing loss compared to community-based samples (e.g., Gates et al. 1990; Cruickshanks et al. 1998; Dubno et al. 2013). The metabolic and sensory estimates may be useful in research applications, such as assessing the potential efficacy of future treatments for cochlear pathologies (Kujawa and Liberman 2019). As those authors noted, OHC regeneration would not remediate a metabolic hearing loss. Moreover, the battery model predicts that improved OHC function could unmask a metabolic hearing loss, due to increased current draw on the EP. EP maintenance and OHC function are both required for good hearing, so either component may contribute to age-related hearing loss (Schmiedt 1993; Eckert et al. 2021).
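One way an error measure could be used to screen a dataset is sketched below. The cutoff rule (median plus a multiple of the scaled median absolute deviation) is an assumption chosen for illustration and is not a criterion proposed in this study; the function name `flag_outlier_audiograms`, the parameter `k`, and the example error values are likewise hypothetical.

```python
import numpy as np


def flag_outlier_audiograms(fit_errors_db, k=3.0):
    """Flag audiograms whose model fit error is unusually large.

    fit_errors_db: one error value (dB) per audiogram, e.g., an RMS residual.
    k: assumed multiplier on the robust spread; larger k flags fewer audiograms.
    Returns (boolean flags, cutoff in dB).
    """
    errors = np.asarray(fit_errors_db, dtype=float)
    median = np.median(errors)
    # Median absolute deviation, scaled by 1.4826 so that it approximates
    # the standard deviation when errors are normally distributed.
    mad = np.median(np.abs(errors - median))
    cutoff = median + k * 1.4826 * mad
    return errors > cutoff, float(cutoff)


# Usage with made-up error values: only the last audiogram is flagged.
flags, cutoff_db = flag_outlier_audiograms([2.0, 3.0, 2.5, 3.5, 2.0, 15.0])
```

A robust cutoff is used here because a heterogeneous clinical sample could inflate a mean-and-standard-deviation rule; any cutoff would need to be validated against the dataset in question.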

CONCLUSIONS AND FUTURE DIRECTIONS

The consequences of metabolic and sensory hearing loss may differ, such as poorer speech recognition in association with age and metabolic declines, or neural encoding changes associated with sensory losses (e.g., Henry et al. 2019). Such observations reinforce the importance of understanding differing underlying mechanisms of age-related hearing loss. Amplification and rehabilitation strategies could potentially be different depending on the extent of metabolic or sensory losses. More research is needed before hypotheses related to individualized interventions can be developed, particularly given the evidence concerning interactions between metabolic and sensory hearing losses that was described earlier.

Although the audiogram cannot fully define the underlying pathologies of the inner ear, the current findings support the conclusion that audiograms include information about cochlear pathologies that contribute to age-related hearing loss. Higher metabolic estimates derived from gerbil and human thresholds were associated with increasing age, poorer speech recognition, and strial pathology. Higher sensory estimates were shown for males, individuals with a positive noise history, and cases with OHC pathology. Metabolic and sensory hearing loss appeared to interact, and might be differentiated more accurately in future models that include interactions. The results reinforce a multifactorial view of presbyacusis with distinct metabolic and sensory components that reflect underlying cochlear pathologies.