11.1 Introduction

According to the Administration on Aging, the number of adults 60 years and older will increase in the next 10 years from 57 million to more than 75 million. One consequence of this increase in the older population is a greater prevalence of communication problems associated with decreased hearing. Decreased hearing in older adults may lead to social isolation, depression, and reduced cognitive function (Heine and Browning 2002; Lin et al. 2011a). For this reason, increased efforts are being directed toward understanding the neural mechanisms that underlie the communication difficulties associated with aging and hearing loss with the aim of implementing appropriate evaluation and management strategies to offset some of these declines.

Although aging is associated with a decline in peripheral hearing thresholds, additional hearing difficulties may arise from declines in central auditory processing. Suprathreshold deficits in temporal processing have been documented for both speech and nonspeech stimuli. Older adults have more difficulty than young adults when detecting changes in the temporal cues of speech that may distinguish one word from another, such as voice-onset time or silence duration (Gordon-Salant et al. 2008). Furthermore, older adults have poorer temporal resolution (e.g., gap detection or duration discrimination) compared to younger adults (Fitzgibbons and Gordon-Salant 1994; Harris et al. 2010). Precise encoding of the temporal features of speech is necessary for accurate perception in noisy or reverberant environments in which the inherent redundancy of speech may be reduced.

Animal models have demonstrated possible neural mechanisms for these temporal processing deficits. A loss of auditory nerve fibers would lead to a reduction in the neural synchrony that is required for precisely timed representation of auditory stimuli. Older animals or humans may experience a degree of auditory neuropathy that has variable effects on speech perception. A selective loss of low-spontaneous-rate auditory nerve fibers is reflected in lower auditory brainstem response Wave I amplitudes in an aging mouse model (Sergeyenko et al. 2013). Other factors, such as delayed neural recovery (Walton et al. 1998) and changes in the balance of excitatory and inhibitory neurotransmission (Caspary et al. 2008), may contribute to the observed age-related changes in temporal resolution.

In both animal and human models, the frequency-following response (FFR), a scalp-recorded far-field response arising primarily from the midbrain, has been used to assess age-related or hearing-related changes in auditory temporal processing. The FFR is well suited to assess temporal encoding as it preserves the temporal and spectral features of the stimulus with remarkable precision. From a clinical perspective, differences on the order of fractions of milliseconds may be clinically significant, indicating possible auditory-based impairments in children with learning disabilities (White-Schwoch et al. 2015), in older adults with speech perception difficulties (Anderson et al. 2013b), and in other populations. These populations may be exhibiting some degree of auditory neuropathy, as even young adults with normal hearing exhibit variability in behavioral and electrophysiological measures of temporal coding (Bharadwaj et al. 2015), suggesting that suprathreshold declines in auditory function may be observed in the presence of normal cochlear function.

Hearing aids are the primary intervention for older adults with hearing difficulties, but increased audibility may not improve auditory temporal processing deficits associated with age-related auditory neuropathy. Increased understanding of the nature of the neural mechanisms underlying these deficits may lead to improved assessment and management strategies.

This chapter first provides a brief summary of the behavioral and physiological literature examining the nature of deficits in auditory function associated with aging and hearing loss. The effects of aging and hearing loss on the FFR are then reviewed. The chapter ends with a discussion of how knowledge of the effects of aging and hearing loss on subcortical neural processing of sound can inform assessment and remediation strategies used in clinical management of older individuals who are experiencing hearing difficulties.

11.2 Perceptual Declines Associated with Aging and Hearing Loss

11.2.1 Aging

Older adults typically report that they can understand what others are saying in quiet settings but that they have difficulty hearing in noisy backgrounds. A similar observation is often made by individuals with auditory neuropathy, suggesting that decreased neural synchrony contributes to age-related decreases in perception. Behavioral evidence of age-related auditory temporal processing deficits has been found for a variety of perceptual tasks. For example, young adults’ perception of speech that has been temporally jittered is equivalent to older adults’ perception of normal speech, suggesting that jitter associated with age-related dyssynchrony may account for speech perception difficulties in older adults (Pichora-Fuller et al. 2007). Older adults also exhibit perceptual deficits for time-compressed speech (Wingfield et al. 1999; Gordon-Salant et al. 2007) and for reverberant speech (Halling and Humes 2000).

Poorer performance on nonspeech tasks of temporal processing also supports the idea that decreased temporal resolution contributes to poor speech-in-noise performance in older adults. A signal waveform consists of two temporal components: the temporal envelope corresponds to slow variations in amplitude and the temporal fine structure (TFS) corresponds to the rapid oscillations in the signal that carry the envelope. Older adults exhibit decreased sensitivity to temporal envelope and TFS cues compared to young adults in tasks using tonal stimuli, and this sensitivity to temporal cues relates to identification of consonants and sentences presented in two-talker babble (Füllgrabe et al. 2014). The existence of age-related deficits in speech perception that are distinct from decreases in peripheral hearing or cognition has been debated (Humes et al. 2012), but the deficits found in the Füllgrabe et al. (2014) study were observed in older adults who had hearing thresholds matched to those of the younger adults. Furthermore, the correlation between age and TFS sensitivity remained even after controlling for the effect of cognition, suggesting the existence of central presbycusis that may arise from multiple levels of the auditory system (refer to Gordon-Salant et al. (2010) for a comprehensive review of aging effects on the auditory system).
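The envelope and TFS distinction described above can be written compactly. The formulation below is one common formalization based on the Hilbert transform; it is offered as an illustrative assumption, since the studies cited here may define or extract the two components differently. The symbols e(t), φ(t), f_c, f_m, and m are introduced only for this illustration.

```latex
% Analytic-signal decomposition of a real waveform x(t)
x(t) = e(t)\,\cos\phi(t), \qquad
e(t) = \bigl|\, x(t) + i\,\mathcal{H}\{x\}(t) \,\bigr|, \qquad
\phi(t) = \arg\bigl[\, x(t) + i\,\mathcal{H}\{x\}(t) \,\bigr]

% Worked instance: a sinusoidally amplitude-modulated (SAM) tone
x_{\mathrm{SAM}}(t) = \bigl[\, 1 + m \sin(2\pi f_m t) \,\bigr]\,\sin(2\pi f_c t),
\qquad 0 \le m \le 1,\quad f_m \ll f_c
```

Here e(t) is the temporal envelope and cos φ(t) is the fine structure. In the SAM instance, the bracketed term varying at the slow rate f_m is the envelope, and the carrier oscillating at f_c is the TFS.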

11.2.2 Hearing Loss

Aging may be a confounding variable when evaluating hearing loss effects on perception of TFS, as individuals with hearing loss are often older than individuals with normal hearing (Lorenzi et al. 2006). To circumvent this problem, studies have used age and hearing loss as continuous variables to evaluate independent effects on TFS sensitivity. For example, King et al. (2014) found orthogonal effects of aging and sensorineural hearing loss on the discrimination of interaural phase differences (IPD) in a group of adults who had a wide range of age and hearing levels. While the TFS thresholds and envelope IPD thresholds increased with age, sensorineural hearing loss appeared to affect the TFS but not the envelope thresholds. In another study, sensitivity to TFS was evaluated using monaural, bilateral, and binaural gap duration discrimination tasks in a group of adults with a wide range of ages, most of whom had fairly normal hearing thresholds (Gallun et al. 2014). They found that age and hearing loss had independent effects on performance across tasks, but age had a greater influence on monaural than binaural performance, while hearing had a greater influence on binaural than monaural performance.

Overall, these studies support the idea that temporal processing deficits in older adults contribute to the difficulties of hearing in noise. Furthermore, either aging or declines in hearing may be associated with these deficits, especially for tasks involving sensitivity to TFS. Finally, as noted by Gallun et al. (2014), a substantial amount of variability in performance cannot be predicted by age or hearing loss, suggesting that the remaining variability may be accounted for, at least in part, by neurodegeneration in the auditory nerve and throughout the central auditory system or by cognitive factors.

11.3 Neuroanatomical Changes Associated with Aging and Hearing Loss

Age-related hearing loss (presbycusis) may be associated with a loss of outer hair cells (Schuknecht 1964), a reduction in the endocochlear potential (Schmiedt et al. 2002; Ohlemiller et al. 2006), and a loss of auditory nerve fibers (Felder and Schrott-Fischer 1995; Lang et al. 2010). While these pathologies may produce an actual loss of audiometric thresholds, recent attention has focused on other age-related neural changes that may result in decreased performance on speech perceptual tasks in the presence of normal thresholds. For example, cochlear synaptic and neural degeneration were found in an aging mouse model (CBA/CaJ) prior to a loss of outer hair cells (Sergeyenko et al. 2013). Decreased auditory brainstem response (ABR) amplitudes in these mice suggest an auditory neuropathy that could lead to speech perception impairments, especially in noise.

Auditory processing deficits may also arise from degeneration at higher levels of the auditory system. Caspary and colleagues have documented changes in the balance of inhibitory and excitatory neurotransmission in the brainstem (Caspary et al. 2006), midbrain (Caspary et al. 1995), and auditory cortex (Hughes et al. 2010). These changes may lead to decreased ability to process rapidly changing temporal speech components and subsequent impairments in perception (as reviewed in Caspary et al. 2008). For example, the ability to detect gaps represents a dimension of temporal resolution that is important for accurate perception of certain speech contrasts, such as consonants that differ in voice-onset time (tie versus die).

In a chinchilla model, noise-induced hearing loss (NIHL) leads to enhanced neural coding of the temporal envelope of sinusoidally amplitude-modulated (SAM) tones presented in quiet in the auditory nerve and inferior colliculus (Kale and Heinz 2010; Zhong et al. 2014). Conversely, in responses to SAM tones presented in noise, coding of the TFS is actually reduced in chinchillas with NIHL (Henry and Heinz 2012). Heinz and colleagues surmise that this pattern of NIHL-induced enhancement of the envelope and reduction of the TFS (at least in noise) may arise from a number of different mechanisms, including outer hair cell dysfunction or an increase in the excitability of auditory neurons due to reduced neural input. This change in excitability was also noted in a gerbil model of sensorineural hearing loss (SNHL) that found increased amplitude but decreased frequency of excitatory postsynaptic currents in thalamocortical slices (Kotak et al. 2005). These studies suggest the existence of homeostatic mechanisms in the central auditory system that may serve to increase central gain to offset loss of sensory input (Zhong et al. 2014). This change in central gain may offer an explanation for why older adults with hearing loss often report that the volume is loud enough or too loud but that the clarity is reduced. Exaggerated amplitude fluctuations may lead to a sensation of loudness, but clarity may be diminished due to inadequate representation of the TFS.

In humans, imaging studies using cortical evoked potentials or magnetoencephalography have demonstrated an exaggerated enhancement of responses to auditory stimuli associated with hearing loss (Tremblay et al. 2003; Alain et al. 2014) and with aging (Soros et al. 2009; Alain et al. 2012), further evidence of a central gain mechanism that compensates for a loss of sensory input. Older adults also draw on cognitive resources to compensate for hearing difficulties, especially in noise. In fMRI studies of speech-in-noise perception, older adults show reduced activation of the auditory cortex but increased activation of prefrontal areas related to working memory and attention compared to younger adults (Wong et al. 2009). Furthermore, in older adults, the volume of the left pars triangularis and the thickness of the left superior frontal gyrus predict performance on speech-in-noise tasks (Wong et al. 2010). Although the participants in the Wong et al. studies had clinically normal hearing thresholds (≤25 dB HL), the reduced auditory cortex activation may have been affected by subclinical age-related loss of peripheral sensitivity and by age-related deficits in central processing independent of ear health. Even mild to moderate declines in hearing sensitivity are associated with reduced gray matter volume in the auditory cortices in older adults (Peelle et al. 2011). These results provide a neural basis for the increasing role of cognition in speech perception performance that has been found in behavioral investigations of aging effects on speech understanding (Schneider and Pichora-Fuller 2000; Tun et al. 2002).

11.4 Aging Effects on the FFR

11.4.1 Aging Effects in Animals

Because aging affects temporal precision of neural speech encoding, the FFR is well suited to evaluate temporal processing deficits associated with aging. Bartlett and colleagues conducted a series of studies on aging effects on the ABR and FFR in Fischer 344 rats, the results of which provide a neurophysiological basis for psychophysical findings in humans. Using far-field recordings, they compared amplitude-modulated following responses (AMFRs) in younger versus older rats and found comparable AMFR amplitudes between the groups in the mid-frequency range (181–512 Hz), but higher amplitudes at low and high modulation frequencies in younger rats compared to older rats (Parthasarathy et al. 2010), consistent with human studies showing age-related declines in envelope detection at low (He et al. 2008) and high (Grose et al. 2009) modulation frequencies. In the same study, Bartlett and colleagues (Parthasarathy et al. 2010) evaluated the effects of wideband noise on AMFR amplitudes. Interestingly, they found that moderate levels of background noise resulted in significant reductions in AMFR amplitudes in the younger but not the older rats at low frequencies, but at the higher frequencies this pattern was reversed and the older rats had a greater noise-induced decline in AMFR amplitude than the younger rats. A subsequent study compared responses in younger and older rats when amplitude and frequency modulation depths were varied (Parthasarathy and Bartlett 2011) and found an age-related reduction in response amplitudes at the lower, but not the higher, modulation depths (Fig. 11.1). They also found that the older rats had reduced precision of envelope shape coding, suggesting a loss of the ability to sustain neural firing. They surmised that decreased inhibitory neurotransmission associated with aging leads to a reduction in the precision of temporal processing that was demonstrated in these studies.

Fig. 11.1

(A, B) Responses comparing young rodents (black graph, A) and aged rodents (grey graph, B) show clear phase locking to amplitude-modulated tones. Dashed lines indicate stimulus offset. (C, D) Age-related differences in response amplitudes are present for both low frequencies (C) and high frequencies (D), but these differences are more apparent for smaller amplitude modulation depths. Solid lines indicate responses above mean modulation detection threshold. *p < 0.05. (Adapted from Parthasarathy and Bartlett 2011, with permission from Elsevier)

Because current clinical testing uses the ABR rather than the FFR, it would be important to establish if the FFR provides information regarding auditory processing beyond what is represented in the ABR. Parthasarathy et al. (2014) compared ABRs and FFRs in a rodent model of aging and found age-related differences in ABR thresholds and amplitudes and in FFR phase locking capability. Interestingly, they found significant correlations between ABR and FFR amplitudes in the young rodents but not in the old rodents, suggesting that these measures provide information about different aspects of neurophysiological sound processing, and that the relationships among these measures change with age.

11.4.2 Aging Effects in Humans

One might expect that the reduced precision of temporal coding in the FFR found in aging animal models would also be found in humans. To evaluate age-related effects on temporal precision in humans, FFRs to steady-state tones and dynamic frequency sweeps were compared in younger and older adults (Clinard et al. 2010; Clinard and Cotter 2015). Response amplitudes elicited by steady-state tones of relatively high frequency (~1000 Hz) decrease with age (Clinard et al. 2010), and this age-related decrease in amplitude also occurs for lower frequency sweeps (beginning or ending at 400 Hz) that rapidly rise or fall in frequency at rates from 1333 Hz/sec to 6667 Hz/sec (Clinard and Cotter 2015). Response amplitudes to speech syllables are also affected by age, particularly the onset and offset regions after controlling for the effects of hearing (Vander Werff and Burns 2011; Clinard and Tremblay 2013). These studies used a 40-ms [da] syllable that contains a rapidly changing formant transition without a steady-state vowel region. Anderson et al. (2012) recorded responses to a 170-ms [da] in younger and older adults to compare the effects of aging on a speech syllable containing both formant transition and steady-state vowel regions. They found smaller amplitudes and reduced phase locking for both the transition and steady-state regions in the time and frequency domains, but the effects were more pronounced in the steady-state region.

A follow-up study compared FFRs to the vowel [a] and the syllable [da] to determine if age-related delays in peak latencies were due to an inability to phase lock to the rapidly changing formant transition in the syllable [da] (Presacco et al. 2015). They replicated the Anderson et al. (2012) finding of delayed peak latencies specific to the formant transition in the [da]. They also found that in young adults, peak latencies were earlier for the [da] than the [a], as expected given that the high-frequency stop consonant burst in the [da] would be encoded earlier than the [a] due to cochlear tonotopicity. However, these peak timing differences between syllables were not found in the FFRs of older adults. They concluded that the lack of peak latency differences between syllables in the older adults was likely due to decreased hearing in the high frequencies, even though this group had clinically normal hearing. These findings and those of Vander Werff and Burns (2011) speak to the importance of accounting for group differences in high-frequency thresholds, even when those differences are slight. An important but unexpected finding in Presacco et al. (2015) was a marked reduction in sustained phase locking to the vowel [a] in older adults that was not observed in the younger adults (Fig. 11.2). These results are in line with those of Parthasarathy and Bartlett (2011), which showed age-related changes in the precision of envelope coding. The loss of sustained phase locking may arise from a number of changes associated with aging. For example, a loss of auditory nerve fibers may lead to an inability to sustain neural firing, as may be seen with VIIIth nerve tumors (Lidén and Korsan-Bengtsen 1973). Prolonged response recovery times may also change the shape of the neural response (Walton et al. 1998).

Fig. 11.2

(A, B) Average responses to a 170-ms vowel /a/ are displayed for younger (A, red, n=15) and older (B, black, n=15) human adults. (C, D) Phase-locking factor (PLF) to the vowel /a/ in the same younger (C) and older adults (D). Note the dramatic reduction in response amplitude and in phase locking after ~110 ms in older adults (this region is indicated by the light green rectangle). (Adapted from Presacco et al. 2015, with permission from Wolters Kluwer Health, Inc.)

The Anderson et al. (2012) study also assessed trial-to-trial consistency and found that older adults had poorer response consistency than young adults for both transition and steady-state response regions. They surmised that poorer response consistency in older adults may be a neural correlate of temporal jitter that contributes to poorer speech perception in older adults (Pichora-Fuller et al. 2007). Mamo et al. (2015) tested this idea by applying different degrees of jitter to a speech syllable and recording responses to these jittered syllables in younger and older adults. They compared effects of jitter on the envelope and TFS components of speech by presenting the [da] in alternating polarities. Adding responses to the two polarities emphasizes the envelope component and minimizes the fine structure, while subtracting the responses has the opposite effect (Aiken and Picton 2008). Even a mild degree of jitter produced a significant decrease in response amplitudes to the envelope in the younger adults, whereas no reduction was seen in the older adults (Fig. 11.3). In response to the temporal fine structure, the mild jitter condition resulted in a dramatic reduction in harmonic representation in the young adults to the extent that their responses in the mild jitter condition were equivalent to the responses of older adults in the non-jittered condition. Again, older adults’ responses did not show a reduction in amplitude with jitter, presumably because a loss of neural synchrony has already introduced jitter into the responses of older adults.
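The add-and-subtract logic for alternating polarities can be illustrated with a toy simulation. The sketch below is not the analysis pipeline of Mamo et al. (2015) or Aiken and Picton (2008). It assumes an idealized response made of one envelope-following component that is unchanged by stimulus polarity and one fine-structure component that inverts with polarity. The sampling rate, the frequencies, the amplitudes, and the helper names `simulated_response` and `amplitude_at` are all hypothetical.

```python
import math

FS = 10000      # sampling rate in Hz (hypothetical)
F0 = 100.0      # envelope-following frequency in Hz (hypothetical)
F_TFS = 700.0   # fine-structure frequency in Hz (hypothetical)
N = 1000        # 100 ms of samples, a whole number of cycles for both frequencies


def simulated_response(polarity):
    """Toy response: the envelope part ignores polarity, the TFS part flips with it."""
    out = []
    for n in range(N):
        t = n / FS
        env_part = 1.0 * math.sin(2 * math.pi * F0 * t)
        tfs_part = 0.5 * math.sin(2 * math.pi * F_TFS * t)
        out.append(env_part + polarity * tfs_part)
    return out


def amplitude_at(signal, freq):
    """Single-frequency DFT magnitude, scaled to the amplitude of a sinusoid."""
    re = sum(s * math.cos(2 * math.pi * freq * n / FS) for n, s in enumerate(signal))
    im = sum(s * math.sin(2 * math.pi * freq * n / FS) for n, s in enumerate(signal))
    return 2 * math.hypot(re, im) / len(signal)


pos = simulated_response(+1)   # response to the original polarity
neg = simulated_response(-1)   # response to the inverted polarity

# Adding cancels the polarity-sensitive part; subtracting cancels the invariant part.
ffr_env = [(p + q) / 2 for p, q in zip(pos, neg)]
ffr_tfs = [(p - q) / 2 for p, q in zip(pos, neg)]

print("added:      F0 =", round(amplitude_at(ffr_env, F0), 3),
      " TFS =", round(amplitude_at(ffr_env, F_TFS), 3))
print("subtracted: F0 =", round(amplitude_at(ffr_tfs, F0), 3),
      " TFS =", round(amplitude_at(ffr_tfs, F_TFS), 3))
```

In real recordings the separation is less clean than in this idealized case, because nonlinear distortion and noise contribute to both the added and the subtracted waveforms.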

Fig. 11.3

Average responses to a 170-ms [da] syllable are displayed for younger and older human adults. (A, B) The spectral amplitude of the F0 was significantly reduced in the mild jitter condition (B) compared to the no jitter condition (A) in young adults (A, n = 22; B, n = 21). The harmonics were essentially unaffected. (C, D) A similar reduction in F0 amplitude was not seen in the mild jitter (D) compared to the no jitter condition (C) in older adults, presumably because their responses are already affected by neural jitter (C, n = 22; D, n = 7). open circles, F0; asterisks, second harmonic; brackets, data distribution with top and bottom dash indicating top and bottom quartiles and middle dash indicating the median; FFRenv, frequency-following response to the envelope; mild (0.25), mild jitter; none, no jitter. (Adapted from Mamo et al. 2015, with permission from Elsevier)

The FFR may also be used to increase understanding of the mechanisms contributing to cognitive functions, such as selective attention. Although they did not find age differences in behavioral measures of selective attention, Ruggles et al. (2011) found that the impact of reverberation on selective attention increases with age. They analyzed FFR phase locking to both the stimulus temporal envelope and TFS. When comparing relationships between phase locking and selective attention measures, they found that performance in middle-aged listeners appears to rely on encoding of TFS, whereas performance in young listeners is predicted by encoding of the stimulus envelope. The authors concluded that because effects of reverberation are greater for the TFS than for the envelope, selective attention in younger listeners, who rely primarily on envelope cues, will be affected to a lesser extent than in older listeners, who rely primarily on fine structure cues (see Shinn-Cunningham, Varghese, Wang, and Bharadwaj, Chap. 7 for a more thorough review of the FFR role in spatial hearing and selective attention).

These findings are supported by a recent study relating word intelligibility assessed in different degrees of reverberation to envelope and fine structure components of the FFR in older adults (Fujihira and Shiraishi 2015). This study found that representation of the fine structure (harmonics corresponding to the first formant), but not the envelope, predicted word intelligibility in conditions of mild to moderate reverberation. Although subcortical representation of fine structure degrades to a greater extent with age than the envelope (Anderson et al. 2012; Mamo et al. 2015), these findings support the idea that older adults rely on TFS components across perceptual tasks.

11.4.3 Neural Correlates of Perceptual Deficits

A number of studies have used the FFR to investigate neural correlates of clinical impairments associated with aging. Older adults who have clinically normal hearing thresholds are known to experience more trouble understanding speech in background noise than younger adults (Dubno et al. 1984; Souza et al. 2007), suggesting deficits in central auditory processing or decreased cognitive function (CHABA 1988), but the existence of central presbycusis as an isolated entity remains controversial (Humes et al. 2012). The FFR may be useful for evaluating central presbycusis as it does not place cognitive demands on the participant.

Two recent studies used the FFR to evaluate the neural basis of speech-in-noise impairments in older adults. The first study divided older adults (ages 60–73 years) into groups of higher and lower performance on the Hearing-in-Noise Test (HINT; Nilsson et al. 1994) and compared FFRs in response to a 170-ms [da] syllable presented in quiet and in six-talker babble (Anderson et al. 2011). They found that the group with better HINT scores had larger response amplitudes and more robust representation of the fundamental frequency (F0) than the group with poorer HINT scores. They cross-correlated responses obtained in quiet with responses obtained in babble noise and found a strong positive correlation between response correlation values and HINT performance, suggesting that the robustness of subcortical speech representation is a factor in successful hearing in background noise (Fig. 11.4).
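A quiet-to-noise response correlation of the kind described above can be sketched as follows. This is a simplified illustration that assumes a zero-lag Pearson correlation between two equal-length waveforms; the exact procedure used by Anderson et al. (2011), including any lag search or transformation of the correlation values, is not reproduced here. The toy waveforms, the degradation levels, and the names `pearson_r` and `response_in_noise` are hypothetical.

```python
import math
import random


def pearson_r(x, y):
    """Zero-lag Pearson correlation between two equal-length waveforms."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("waveforms must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


FS = 10000              # sampling rate in Hz (hypothetical)
F0 = 100.0              # fundamental frequency of the toy response in Hz (hypothetical)
rng = random.Random(0)  # seeded so the sketch is repeatable

# Toy "response in quiet": 170 ms of a sinusoid at F0.
quiet = [math.sin(2 * math.pi * F0 * n / FS) for n in range(1700)]


def response_in_noise(degradation_sd):
    """Quiet response plus Gaussian degradation; a larger sd mimics less robust encoding."""
    return [q + rng.gauss(0.0, degradation_sd) for q in quiet]


r_robust = pearson_r(quiet, response_in_noise(0.2))    # listener whose response resists noise
r_degraded = pearson_r(quiet, response_in_noise(1.5))  # listener whose response is disrupted

print("robust encoder:   r =", round(r_robust, 2))
print("degraded encoder: r =", round(r_degraded, 2))
```

A higher correlation indicates that the response recorded in noise retains more of the shape of the response recorded in quiet, which is the sense in which the text refers to robust subcortical representation.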

Fig. 11.4

(A, B) Individual response waveforms to a 170-ms [da] presented in quiet (gray) and in six-talker babble (black) from individuals who have good (A) or poor (B) scores on the HINT (Hearing-in-Noise Test). For better visualization, the figure zooms in on the first 70 ms (onset and transition). The consonant transition is degraded by noise to a greater extent in the poor speech-in-noise performer (B). (C) Responses obtained in quiet were correlated with responses obtained in noise. Higher correlation values related to lower speech-in-noise thresholds. r, correlation coefficient; SNR, signal-to-noise ratio. (Adapted from Anderson et al. 2011, with permission from Wolters Kluwer Health, Inc.)

In terms of clinical relevance, this information might be useful as a counseling tool to help the patient understand why listening in noise might be so challenging. A follow-up study was performed to determine if the FFR would explain more of the variance in older adults’ own perception of their speech-in-noise ability than traditional clinical measures (Anderson et al. 2013b). This study recorded the FFR using a 40-ms [da] syllable in a group of 111 middle-aged to older-aged adults (ages 45–78 yrs) who had audiometric profiles ranging from normal to mild to moderate sensorineural hearing loss. The protocol was designed to be clinically feasible, and the 40-ms [da] was chosen because the testing time is approximately 20 min. They used the Speech, Spatial, and Qualities of Hearing Scale (SSQ; Gatehouse and Noble 2004) to assess self-reported speech-in-noise performance and the Quick Speech-in-Noise test (QuickSIN™; Killion et al. 2004) to assess performance in a clinical setting. Using a step-wise multiple linear regression, they found that hearing thresholds and QuickSIN™ scores predicted 15% of the variance in SSQ scores, and timing measures of the FFR (onset slope, morphology, and offset latency) predicted an additional 15%. They concluded that the FFR provides information about speech-in-noise performance beyond what is obtained using the current audiological protocol, and that it may be useful in the assessment and management of patients presenting with hearing difficulties (see Bidelman, Chap. 8 for more information on the FFR and communication in challenging environments).

Although the previously mentioned studies have found relationships among FFR and clinical speech-in-noise measures, mixed results have been obtained in studies comparing behavioral performance and the FFR using the same stimuli. In studies of frequency discrimination in young adults, periodicity strength of the FFR relates to F0 difference limens (Krishnan et al. 2012; Smalt et al. 2012); however, in a study including older adults, Clinard et al. (2010) found age-related deficits in pitch discrimination and FFR representation of the same tones, but these measures were not predictive of each other. Because frequency discrimination performance in older adults is likely to be affected by elevated hearing thresholds, Marmel et al. (2013) investigated the relationship between FFR phase locking and frequency discrimination across a range of age and hearing thresholds to evaluate respective contributions to this relationship. They found that both FFR phase locking and hearing thresholds predicted frequency discrimination performance, while age was not a significant factor. Interestingly, they found that age, but not hearing thresholds, was related to FFR phase locking. Because age-related changes in peripheral hearing thresholds will be seen even in older adults with “clinically normal” hearing thresholds, this study underscores the need to consider the contributions of both age and hearing thresholds when investigating relationships among neural and behavioral measures of auditory performance (Carcagno and Plack, Chap. 4).

11.5 Hearing Loss Effects on the FFR

11.5.1 Hearing Loss Effects in Animals

Early investigations of hearing loss effects on perception compared young normal-hearing individuals with older individuals with hearing loss, thus introducing an aging confound. Animal models of noise-induced hearing loss (NIHL) provide one means of eliminating that confound. Heinz and colleagues have conducted a series of experiments to evaluate effects of NIHL on neural coding of the temporal envelope and fine structure in chinchillas. In the first experiment, responses from auditory nerve fibers were recorded in response to SAM tones or to single-formant stimuli in chinchillas that had normal hearing or had been exposed to narrowband noise levels sufficient to produce a threshold shift of at least 20 dB on the ABR (Kale and Heinz 2010). The strength of envelope coding was enhanced in noise-exposed fibers compared to coding in normal-hearing fibers, especially those with higher ABR thresholds, but there was no reduction in the coding of fine structure. This initial study presented stimuli in quiet conditions only. In a follow-up study, Henry and Heinz (2012) recorded spike trains in response to tones presented in quiet and in three levels of Gaussian noise in chinchillas with and without NIHL to determine if the presence of noise would cause a degradation in the coding of fine structure. They found no differences in vector strength of phase locking to tones in the quiet condition, but with increasing levels of noise, vector strength decreased in the NIHL chinchillas compared to the normal-hearing (NH) chinchillas. Finally, Zhong et al. (2014) used scalp recordings to evaluate noise effects on neural coding in more central structures of the auditory midbrain. They found that noise exposure resulted in an increase in envelope response amplitudes to SAM tones in both quiet and noise conditions (Fig. 11.5).
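Vector strength, the phase-locking measure mentioned above, is conventionally defined as the length of the mean resultant vector of spike phases relative to the stimulus period, ranging from 0 (no phase locking) to 1 (every spike at the same phase). The sketch below computes it on simulated spike times. It is a minimal illustration under assumed parameters, not the analysis of Henry and Heinz (2012); the tone frequency, jitter value, spike counts, and the function name `vector_strength` are hypothetical.

```python
import cmath
import math
import random


def vector_strength(spike_times, freq):
    """Length of the mean resultant vector of spike phases at frequency freq (Hz)."""
    if not spike_times:
        raise ValueError("at least one spike time is required")
    total = sum(cmath.exp(2j * math.pi * freq * t) for t in spike_times)
    return abs(total) / len(spike_times)


FREQ = 500.0            # tone frequency in Hz (hypothetical); period = 2 ms
N_SPIKES = 500
rng = random.Random(1)  # seeded so the sketch is repeatable

# One spike per cycle at exactly the same phase: ideal phase locking.
perfect = [k / FREQ for k in range(N_SPIKES)]

# Same spikes with 0.2 ms Gaussian timing jitter: degraded but present phase locking.
jittered = [k / FREQ + rng.gauss(0.0, 0.0002) for k in range(N_SPIKES)]

# Spikes at random times over one second: no systematic relation to stimulus phase.
unlocked = [rng.uniform(0.0, 1.0) for _ in range(N_SPIKES)]

vs_perfect = vector_strength(perfect, FREQ)
vs_jittered = vector_strength(jittered, FREQ)
vs_unlocked = vector_strength(unlocked, FREQ)

print("perfect:", round(vs_perfect, 3))
print("jittered:", round(vs_jittered, 3))
print("unlocked:", round(vs_unlocked, 3))
```

Increasing the timing jitter lowers vector strength toward the value obtained for unlocked spikes, which is the pattern one would expect when background noise or neural dyssynchrony degrades fine-structure coding.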

Fig. 11.5

(A, B) Considerable overlap is noted in response amplitudes to the temporal envelope of SAM (sinusoidally amplitude-modulated) tones presented in quiet and in three levels of Gaussian noise between chinchillas with and without NIHL (noise-induced hearing loss), possibly reflecting differing degrees of NIHL. Open symbols and error bars correspond to means and standard deviations, respectively. The dashed line represents the noise floor. (C, D) Recordings were obtained in seven animals before and after noise exposure and changes in envelope response amplitude were greater in animals that had greater noise-induced shifts in hearing thresholds. Thick grey lines represent the predicted relationship between ABR (auditory brainstem response) threshold and response amplitude. Although recordings were also obtained to 1 and 2 kHz carrier frequencies, the greatest effects were obtained to the higher frequencies, which are displayed here. ENV, envelope (Adapted from Zhong et al. 2014, with permission from Elsevier Limited)

Zhong et al. (2014) surmised that envelope enhancement associated with hearing impairment may arise from both peripheral and central noise-induced changes. Outer hair cell dysfunction or impairment of high-threshold auditory nerve fibers would lead to steeper input-output functions and enhanced response amplitudes for suprathreshold input levels. Alternatively, results may reflect increased central gain due to homeostatic regulation of excitatory and inhibitory synapses following reduced sensory input (Chambers et al. 2016). Based on these findings, Heinz and colleagues suggest that the enhancement of envelope information at the expense of TFS may contribute to speech perception difficulties in individuals with hearing loss, as the heightened envelope cues may distract the listener from the fine details required for accurate speech discrimination.

Recent attention has been focused on the damage produced by moderate levels of noise exposure that results in cochlear neuropathy—a loss of auditory nerve fibers without concomitant outer hair cell damage (Kujawa and Liberman 2009; Lin, Furman, et al. 2011). Given that this type of auditory dysfunction is not reflected in audiometric threshold or otoacoustic emission testing, a clinical measure is needed that would be sensitive to cochlear neuropathy. Wave I amplitude of the ABR may reflect a reduction of auditory nerve fibers, but high variability may reduce its clinical efficacy. Shaheen et al. (2015) assessed effects of moderate noise exposure on FFRs to SAM tones and ABRs to tone pips in mice. While ABR amplitude and FFR amplitude and phase locking were reduced in noise-exposed mice, the changes in the FFR were more robust with reduced variability, suggesting that the FFR may serve as an efficacious measure of noise-evoked auditory neuropathy in the clinic.

11.5.2 Hearing Loss Effects in Humans

Anderson et al. (2013a) investigated the effects of hearing loss in humans using FFRs to a 40 ms [da] syllable presented binaurally in quiet and noise. To reduce audibility effects, they created amplified waveforms based on individual hearing loss using the National Acoustic Laboratories-Revised (NAL-R) algorithm (Byrne and Dillon 1986) and presented the [da] syllable in both amplified and unamplified conditions. To minimize effects of aging, they compared two groups of older adults who were matched in age: one group with normal audiometric thresholds and one group with mild to moderate sensorineural hearing loss. Similar to the Kale and Heinz (2010) study, they found that the response amplitude to the envelope was larger in the group with hearing loss than in the group with normal hearing in both aided and unaided conditions, especially in noise.

The initial study did not find differences in fine structure representation, but a follow-up training study comprising a larger number of participants (58 in the follow-up study versus 30 in the initial study) found that spectral amplitudes of the TFS were smaller in the noise condition in hearing-impaired individuals than in normal-hearing individuals (Anderson et al. 2013c) (Fig. 11.6). Because the results of the follow-up study were consistent with Henry and Heinz (2012), the initial lack of significant findings in the first study may have been due to insufficient power. These results support studies demonstrating perceptual deficits for TFS cues associated with hearing loss (King et al. 2014) that may be contributing to deficits in speech perception (Lorenzi et al. 2006; Füllgrabe et al. 2014).
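The spectral amplitudes compared between groups are, in essence, the magnitudes of the averaged response at the F0 and at individual harmonics. As a rough illustration, and not a reproduction of the analysis pipeline used in the studies cited, the Python sketch below estimates the amplitude at a single frequency with a one-bin discrete Fourier transform. The sampling rate, the 100-Hz F0, the component amplitudes, and the function name are all hypothetical.

```python
import cmath
import math
from typing import List, Sequence


def spectral_amplitude(signal: Sequence[float], fs_hz: float, freq_hz: float) -> float:
    """Amplitude of the sinusoidal component at freq_hz (single-bin DFT)."""
    n = len(signal)
    if n == 0:
        raise ValueError("signal is empty")
    acc = sum(
        x * cmath.exp(-2j * math.pi * freq_hz * i / fs_hz)
        for i, x in enumerate(signal)
    )
    # Factor of 2 converts the one-sided bin magnitude to a sinusoid amplitude.
    return 2.0 * abs(acc) / n


# Hypothetical averaged response: 0.1 s sampled at 2 kHz, containing a strong
# component at an assumed F0 of 100 Hz and a weaker one at the fourth harmonic.
FS_HZ = 2000.0
F0_HZ = 100.0
response: List[float] = [
    1.0 * math.sin(2 * math.pi * F0_HZ * i / FS_HZ)
    + 0.25 * math.sin(2 * math.pi * 4 * F0_HZ * i / FS_HZ)
    for i in range(200)
]

amp_f0 = spectral_amplitude(response, FS_HZ, F0_HZ)
amp_h2 = spectral_amplitude(response, FS_HZ, 2 * F0_HZ)
amp_h4 = spectral_amplitude(response, FS_HZ, 4 * F0_HZ)
```

Because the analysis window here holds a whole number of cycles of each component, the estimates recover the synthetic amplitudes closely; with real recordings, windowing and noise-floor estimation would also need to be considered.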

Fig. 11.6

FFRs obtained in human older adults with and without hearing loss (matched in age) to a 40-ms [da] presented in pink noise (+10 dB signal-to-noise ratio). Hearing-impaired adults have greater representation of the envelope in the pitch-dominated frequencies (F0 and second harmonic, H2), whereas normal-hearing adults have greater representation of the fine structure in the frequency region corresponding to the first formant. H3–H6, harmonics; *p < 0.05, **p < 0.01. (Adapted from Anderson et al. 2013, with permission from Frontiers)

The Anderson et al. studies of hearing loss effects used an individually amplified speech stimulus to minimize effects of audibility. Ananthakrishnan et al. (2015) employed a different approach to equate audibility by obtaining FFRs at four different presentation levels and by comparing NH and HI individuals at equal sensation levels. They used a relatively low frequency vowel (/u/) with the first two formants well below 1000 Hz. In contrast with previous findings (Kale and Heinz 2010; Anderson et al. 2013a), they found degradation of both the envelope and the TFS in the HI individuals. They attributed these differences to milder degrees of hearing loss in their study and to differences in compensating for hearing loss. However, it should be noted that the HI group was older than the NH group. Although the effects they found persisted even after they controlled for age, it is important to minimize aging effects by matching groups on this variable to the extent possible in humans. Overall, this study highlights the benefits of using multiple stimulation levels when evaluating the effects of hearing loss.

11.5.3 Neural Correlates of Performance

Using synthesized stop consonants on a /ba/-/da/-/ga/ continuum, Plyler and Ananthanarayan (2001) evaluated effects of hearing loss on identification performance and accuracy of FFR encoding of the second formant transition at different presentation levels. They found that although the FFR spectral peak shifts toward the higher frequencies as the second formant transition rises over time in the normal-hearing group, this shift was substantially reduced in the group with hearing loss, suggesting that reduced hearing sensitivity may degrade phase locking. Furthermore, wider critical bands and reduced frequency selectivity in the HI group may lead to a broad dispersion of FFR peaks. Although there was no correlation between behavioral performance and FFR representation, the hearing-impaired individuals tended to have reduced identification and degraded FFRs, suggesting a relationship in at least some of the HI individuals.

The Plyler and Ananthanarayan (2001) study used a broad range of ages and, therefore, interpretation of their findings is likely to be confounded by aging differences between the NH and HI groups. An alternate approach would be to use hearing level as a continuous variable within age groups. Bidelman et al. (2014) used this approach in a study that evaluated both FFR and cortical-evoked responses to a five-step /u/ to /a/ continuum of synthetic vowels that differed in the first formant frequency, and they compared neural responses to categorical perception of these vowels. Better behavioral performance was related to larger F1 magnitudes in the FFR but to reduced N1-P2 magnitudes in the cortical response across stimuli and groups. When investigating orthogonal effects of aging versus hearing loss, they found that greater levels of hearing loss were related to weaker subcortical pitch salience and larger cortical N1-P2 magnitudes, but age did not correlate with subcortical pitch salience or F1 encoding. However, both hearing loss and aging were associated with stronger cortical responses, an over-enhancement that has been observed in other studies (Tremblay et al. 2003; Alain et al. 2014). In older adults, smaller FFR magnitudes were related to larger cortical magnitudes, suggesting that weakened subcortical encoding may contribute to exaggerated cortical responses associated with a down regulation of inhibitory neurotransmission (Turner et al. 2005). Because these patterns were not seen in younger adults, the authors concluded that there is greater redundancy between levels of the auditory system in older adults to compensate for deficient encoding associated with aging and hearing loss. This diminished encoding is a factor in impaired perception on the behavioral categorization task (Fig. 11.7).

Fig. 11.7

Correlations among categorical perception, brainstem first-formant magnitude (F1 mag), and cortical magnitudes (N1-P2) in response to vowels varying on a continuum of the first formant are displayed separately for young and older adults. In older adults only, brainstem magnitudes significantly correlated with cortical magnitudes and with categorical perception. In addition, hearing thresholds were negatively correlated with brainstem magnitudes but positively correlated with cortical magnitudes. HL, age-related hearing loss; *p < 0.05, **p < 0.01. (Reprinted from Bidelman et al. 2014, with permission from Elsevier Limited)

11.6 Clinical Implications

11.6.1 Amplification

Because the FFR reflects auditory processing, it may prove to be a useful tool for evaluating the benefits of hearing aid amplification. The current recommendation for most individuals with mild to moderate sensorineural hearing loss is the use of hearing aids, but less than 25% of people who would benefit from hearing aids actually use them (Kochkin 2010). The current standard of audiologic care recommends real-ear measurements to verify that hearing aids are providing appropriate levels of amplification for the hearing loss, but this approach does not provide any information about how sound is processed beyond the tympanic membrane. A number of questionnaires can be used to validate the success of fitting the hearing aid, but these questionnaires may be affected by personality factors and may not reveal the root cause of dissatisfaction with hearing aids. Because digital technology provides a great deal of flexibility in fitting the hearing aid, audiologists often turn to changes in the software to adjust high or low frequency settings without knowing how these changes affect the accuracy of neural speech encoding.

To address some of these issues, a clinical instrument was developed to ensure audibility of speech consonants using cortical evoked potentials (HEARLab™; Munro et al. 2011). This instrument may be useful for assessing infants and individuals who are hard to test, but it may be less useful in a cooperative child or adult who can verify audibility using a behavioral procedure. Furthermore, verification of audibility does not ensure that temporal or spectral components of speech are being accurately encoded.

The traditional ABR to clicks or tone bursts has not been considered a valid approach to the assessment of hearing aids because the transient stimuli that are used for threshold testing would not be compatible with hearing aid time constants. However, the stimuli typically used in FFR testing have durations long enough to exceed the rise and fall times of hearing aid processing. Two studies recently investigated the feasibility of using the FFR to evaluate effects of stimulus level, bandwidth, and amplification in adults with normal hearing and with hearing loss (Easwar et al. 2015a, b). Both studies elicited the FFR with a naturally spoken speech token /susaʃi/ containing low-frequency, mid-frequency, and high-frequency phonemes. To ensure that the protocol was clinically feasible, just 300 sweeps were recorded for each condition and a statistical algorithm was used to determine the probable presence of the response. Bandwidth was evaluated by low-pass filtering the /susaʃi/ token at 1, 2, and 4 kHz. In the initial pilot study with NH adults, increases in level and stimulus bandwidth led to an increase in response amplitudes and in the number of detectable responses. In a follow-up study, experienced hearing-aid users with mild to moderate sensorineural hearing loss underwent the same protocol, but in addition to examining the effects of level and bandwidth, the authors elicited the FFR while the /susaʃi/ token was presented to individually fitted hearing aids through wireless transmission. Again, they found that increases in level and bandwidth and the use of amplification increased the number of detectable responses. Furthermore, speech discrimination scores and sound quality ratings correlated positively with FFR amplitude and detectable responses, suggesting that the FFR might be useful for predicting suprathreshold performance.

Similar to the HEARLab™ system, this previously mentioned protocol was designed to improve verification of the benefits of hearing aids in infants and young children, with a focus on improved audibility. Nevertheless, adult users of hearing aids often report that hearing aids are loud enough to hear conversation, but they have trouble with the clarity of speech. As discussed in Sect. 11.5, loudness may be detrimental to clarity, and it would be worthwhile to understand the factors in subcortical transcription that lead to improved understanding of speech with hearing aids. A better understanding of these factors may lead to adjustments in algorithms for hearing aids or device settings. The feasibility of using the FFR to aid in adjusting the setting for hearing aids was observed in an individual who was encountering hearing aid difficulties (Fig. 11.8). One factor to consider in these recordings is the stimulus artifact produced by hearing aids. One approach to reducing artifact is to use direct audio input or wireless sound transmission (Bellier et al. 2015). Work is underway to explore the ways that the FFR can be used to maximize successful fitting of hearing aids in both pediatric and adult populations.

Fig. 11.8

The FFR may reflect changes in hearing aid settings. Responses to a 170-ms [da] syllable were recorded in a sound field in an older individual wearing a hearing aid with one of two settings. The response amplitude in the time and frequency domains was increased with setting 1 compared to setting 2. (Adapted from Anderson and Kraus 2013, with permission from Hindawi Publishing)

11.6.2 Auditory Training

Through the use of digital technology, the benefits of amplification have improved to a considerable extent. Yet, even if a hearing aid is capable of delivering a perfect signal for an individual hearing loss, amplification will not compensate for declines in spectrotemporal processing associated with aging. For this reason, clinicians should consider including auditory training as part of the management protocol. At this time, although there are studies demonstrating the efficacy of auditory training (Song et al. 2012; Ferguson et al. 2014), there is limited understanding of the kinds of protocols that would be most beneficial. The responses and needs of older adults are highly variable; therefore, a “one size fits all” approach, such as is used in most commercial training packages, will likely have limited benefits for this heterogeneous population.

Because the FFR represents the temporal and spectral characteristics of the speech signal with precise fidelity, it may provide an appropriate tool for demonstrating training benefits. For example, one of the manifestations of age-related decreases in temporal precision is a delay in FFR peak latencies (Vander Werff and Burns 2011; Anderson et al. 2012). A recent study demonstrated that this aging effect can be partially reversed with training. Adaptive auditory-based cognitive training reduced peak latencies and inter-peak variability in the FFR to a speech syllable presented in quiet and in babble noise, and the greatest effects were seen in noise (Anderson et al. 2013d) (Fig. 11.9). Concomitant improvement was seen in speech-in-noise performance, but the changes between the measures were not related, suggesting that different neural mechanisms contributed to perceptual and neural changes.

Fig. 11.9

Training-induced changes in FFR peak latencies for a 170-ms [da] syllable recorded in two-talker babble (+10 dB SNR). In the auditory training group, significant decreases were noted in peak latencies, more so in the region corresponding to the formant transition (30–60 ms) than in the region corresponding to the vowel (60–170 ms). No changes were noted in the active control group (ns = not significant). Error bars, ±1 standard error. *p < 0.05, ***p < 0.001. (Adapted from Anderson et al. 2013d, with permission from Proceedings of the National Academy of Sciences of the USA)

11.6.3 Clinical Use

Because the FFR preserves aspects of the speech stimulus so precisely, analysis of specific features that have degraded representation can inform clinical recommendations. For example, an older adult who has difficulty encoding the consonant-vowel transitions of speech may benefit from training that adaptively expands and contracts these transitions. Algorithms for hearing aids may be adjusted depending on the nature of the impairment. Chasin (2011) has recommended adjusting hearing aid parameters based on phoneme-level, word-level, and sentence-level differences in the individual’s spoken language. These parameters could be similarly adjusted for processing deficits, such as deficient encoding of the F0 or inability to sustain phase locking to long-duration vowels.

Another potential clinical use of the FFR would be to predict who would benefit from certain types of clinical management. For example, response consistency in the FFRs of good readers is higher than that of poor readers (Hornickel and Kraus 2013), and response consistency at pretest predicts gain in phonological awareness after a year of using an assistive listening device during school hours in children with reading impairments (Hornickel et al. 2012). School administrators are much more likely to follow up on recommendations that are tailored to an individual rather than to widespread recommendations that are made to everyone in a group.
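Response consistency is often quantified as the correlation between two sub-averages of the same recording, so that a stable response yields similar waveforms in both halves. The Python sketch below assumes an even/odd split of trials and a Pearson correlation; this is one plausible formulation, not necessarily the exact procedure of the studies cited, and the simulated trials, noise levels, and function names are hypothetical.

```python
import math
import random
from typing import List, Sequence


def mean_waveform(trials: Sequence[Sequence[float]]) -> List[float]:
    """Average a set of equal-length trials sample by sample."""
    n_trials = len(trials)
    return [sum(column) / n_trials for column in zip(*trials)]


def pearson_r(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation between two equal-length waveforms."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0.0 or var_b == 0.0:
        raise ValueError("correlation undefined for a constant waveform")
    return cov / math.sqrt(var_a * var_b)


def response_consistency(trials: Sequence[Sequence[float]]) -> float:
    """Correlate the average of even-numbered trials with that of odd ones."""
    if len(trials) < 2:
        raise ValueError("need at least two trials")
    return pearson_r(mean_waveform(trials[0::2]), mean_waveform(trials[1::2]))


def simulate_trials(noise_sd: float, n_trials: int = 100,
                    n_samples: int = 200, fs_hz: float = 2000.0,
                    seed: int = 0) -> List[List[float]]:
    """Hypothetical data: a 100 Hz response plus Gaussian noise per trial."""
    rng = random.Random(seed)
    return [
        [math.sin(2 * math.pi * 100.0 * i / fs_hz) + rng.gauss(0.0, noise_sd)
         for i in range(n_samples)]
        for _ in range(n_trials)
    ]


r_stable = response_consistency(simulate_trials(noise_sd=0.5))
r_variable = response_consistency(simulate_trials(noise_sd=5.0))
```

In this simulation the low-noise recording gives a correlation close to 1 and the high-noise recording a clearly lower value, which is the sense in which a more consistent response is distinguished from a less consistent one.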

Audiologists have long been aware that two people with identical audiograms may have vastly different experiences when trying to communicate in a noisy environment (Killion and Niquette 2000). Because the FFR reflects both long-term and short-term experiences (Krishnan and Gandour, Chap. 3; Carcagno and Plack, Chap. 4; White-Schwoch and Kraus, Chap. 6), and the long-term consequences of aging and hearing loss on central auditory processing, its use may provide the clinician with a better understanding of the nature of the deficit that is contributing to the patient’s problems with hearing in noise.

11.7 Future Directions

The hearing aid studies cited in Sect. 11.6.1 used the FFR to verify audibility for phonemes containing energy in low to high frequency ranges. Knowledge of the effects of amplification on suprathreshold processing would also be beneficial, both for developers of hearing aid algorithms and for clinicians trying to maximize hearing aid benefits. As mentioned in Sect. 11.5, individuals with hearing loss have an over-representation of the temporal envelope at the expense of the fine structure, especially in noise. It would be useful to determine the specific features of amplification that affect the balance of representation of the envelope and TFS. Modern hearing aids automatically adjust for different environments, but the strategies for this adjustment vary among hearing aid companies. Most hearing aids use some form of nonlinear compression, but time constants and other aspects of compression differ, with some hearing aids having fairly fixed, slow compression time constants, and other hearing aids having an option of slow versus fast speeds. There is evidence supporting the use of slower compression speeds for older individuals or individuals who have reduced cognitive function (Lunner and Sundewall-Thoren 2007; Cox and Xu 2010). It would be useful to determine the effects of varying compression speeds on neural encoding of various speech components in these different populations.
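To make the notion of compression speed concrete, the Python sketch below implements a generic textbook-style compressor stage: a level tracker with separate attack and release time constants, followed by a static gain rule above a compression threshold. It is an assumption-laden illustration rather than any manufacturer's algorithm; the time constants, threshold, ratio, sampling rate, and function names are hypothetical.

```python
import math
from typing import List, Sequence


def track_level(samples: Sequence[float], fs_hz: float,
                attack_s: float, release_s: float) -> List[float]:
    """One-pole level tracker with separate attack and release time constants."""
    attack_coef = math.exp(-1.0 / (attack_s * fs_hz))
    release_coef = math.exp(-1.0 / (release_s * fs_hz))
    level = 0.0
    tracked: List[float] = []
    for x in samples:
        magnitude = abs(x)
        # Rising input uses the attack constant; falling input uses release.
        coef = attack_coef if magnitude > level else release_coef
        level = coef * level + (1.0 - coef) * magnitude
        tracked.append(level)
    return tracked


def compression_gain_db(level: float, threshold_db: float = -20.0,
                        ratio: float = 3.0) -> float:
    """Gain change in dB applied once the tracked level exceeds the threshold."""
    level_db = 20.0 * math.log10(max(level, 1e-12))
    excess_db = max(0.0, level_db - threshold_db)
    return -(1.0 - 1.0 / ratio) * excess_db


# Hypothetical input envelope: 200 ms of loud sound, then 50 ms of soft sound.
FS_HZ = 1000.0
envelope = [1.0] * 200 + [0.1] * 50

fast = track_level(envelope, FS_HZ, attack_s=0.005, release_s=0.010)
slow = track_level(envelope, FS_HZ, attack_s=0.005, release_s=0.500)

# Gain still being applied 50 ms after the loud sound ends.
fast_gain_db = compression_gain_db(fast[-1])
slow_gain_db = compression_gain_db(slow[-1])
```

With the fast release, the tracked level has nearly settled to the soft input 50 ms after the loud segment ends, so little gain reduction remains; with the slow release, the compressor is still attenuating the soft sound. Differences of this kind are what a comparison of neural encoding across compression speeds would be probing.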

Although evidence suggests that a short training program can improve subcortical encoding of speech, more work is needed to determine the specific features of training that provide the most benefit. Because FFR changes may be specific to the training stimuli (Song et al. 2008; Carcagno and Plack 2011), the use of the FFR may inform the investigator of the aspects of training that can be used to achieve certain perceptual benefits. For example, training on speech-in-noise recognition led to enhancement in the F0 in young adults (Song et al. 2012). Because the robustness of F0 appears to be a factor in better speech recognition in noise in young and older adults with normal hearing, perhaps training that adaptively adjusts the signal-to-noise ratio of training stimuli can be particularly effective.

Finally, the FFR is considered a research tool and is not approved for clinical use in the United States. Work is underway to provide normative data and guidance to clinicians regarding the use of the FFR to classify individuals according to specific impairments. In a recent study, a consonant-in-noise score was developed (representing FFR peak latencies, response consistency, and spectral amplitudes) that predicts 68% of the variance in phonological scores in preschool-aged children and correctly classifies school-aged children with or without dyslexia in 69.1% of cases (White-Schwoch et al. 2015). Therefore, the FFR could become a valuable tool in the assessment of children with language-based learning impairments and other populations with communication impairments (Kraus and Anderson 2016; Schochat, Rocha-Muniz, and Filippini, Chap. 9).

11.8 Summary

Studies have demonstrated the FFR’s usefulness in enhancing our understanding of the ways in which aging and hearing loss affect subcortical transcription of speech. Age-related reductions in neural synchrony and subcortical temporal precision are reflected in animal and human FFR studies, with smaller response amplitudes, decreased trial-to-trial consistency, decreased phase locking, and reduced ability to sustain neural firing. These deficits relate to speech perception abilities and may account, in part, for the difficulties older adults experience when trying to understand speech, especially in noisy environments.

Hearing loss effects on the FFR have been more varied, especially in humans, in part due to aging confounds and in part due to differences in strategies for equating audibility. Animal studies of NIHL demonstrate enhanced representation of the temporal envelope but decreased representation of the TFS, especially in noise. These findings were confirmed in human studies but only in response to speech syllables containing a high-frequency transient, stop consonant burst. Degradation in both the envelope and TFS may be found in response to low-frequency vowels. The strength of response magnitude to the first formant of these vowels relates to better categorical perception, suggesting that the FFR may be used as an objective assessment of perception. Disentangling the effects of aging and hearing loss in human studies is problematic, as hearing loss etiologies differ between younger and older individuals. More work is needed to understand the varied effects of hearing loss on the FFR and the ways in which these effects contribute to impaired perception.

Knowledge of changes to the FFR that accompany aging or hearing loss can guide clinical management. Historically, hearing aid algorithms have attempted to compensate for outer hair cell loss by restoring audibility while maintaining comfortable loudness, but recently, the focus has shifted to include cognitive considerations. Knowledge of the specific speech components that are affected by hearing loss or aging, as revealed by the FFR, may also be taken into consideration when developing amplification algorithms. Furthermore, as amplification may not be sufficient to restore degraded temporal processing, auditory training might be used to at least partially restore the deficits revealed by FFR testing in an individual. More work is needed to explore clinical uses and to ascertain the efficacy of FFR use.