7.1 Introduction

Older individuals often find it difficult to communicate, especially in group situations, because they are unable to keep up with the flow of conversation or are too slow in comprehending what they are hearing. These communication difficulties are often exacerbated by negative stereotypes held by their communication partners who often perceive older adults as less competent than they actually are (Ryan et al. 1986). Sometimes, older adults’ communication problems motivate them, often at the prompting of their family and friends, to seek help from hearing specialists (O’Mahoney et al. 1996). Quite often, however, older adults and/or their family members wonder if these comprehension difficulties are a sign of cognitive decline. Such uncertainty on the part of both older adults and their family members with respect to the source of communication difficulties is understandable given that age-related changes in the comprehension of spoken language could be due to age-related changes in hearing, to age-related declines in cognitive functioning, or to interactions between these two levels of processing. To participate effectively in a multitalker conversation, listeners need to do more than simply recognize and repeat speech. They have to keep track of who said what, extract the meaning of each utterance, store it in memory for future use, integrate the incoming information with what each conversational participant has said in the past, and draw on the listener’s own knowledge of the topic under consideration to extract general themes and formulate responses. In other words, effective communication requires not only an intact auditory system but also an intact cognitive system.

Previous chapters have identified a number of age-related changes in cochlear, retrocochlear, and central auditory processing that could interfere with a listener’s ability to understand, memorize, integrate, and recall heard information (Schmiedt, Chapter 2; Canlon, Illing, and Walton, Chapter 3; Ison, Tremblay, and Allen, Chapter 4; Fitzgibbons and Gordon-Salant, Chapter 5). To comprehend spoken language in complex listening situations, a person needs to overcome peripheral (energetic) masking (Humes and Dubno, Chapter 8), parse the auditory scene into different sources of information (e.g., different talkers) to be able to keep track of who said what, focus attention on the target talker, suppress the processing of irrelevant information, and, when appropriate, switch attention from one talker to another. Clearly, a person’s ability to carry out these functions in complex environments will depend not only on the status of that person’s auditory system but also on the status of their cognitive system.

Cochlear pathology could reduce the audibility of speech sounds and increase a listener’s susceptibility to energetic masking, leading to errors in speech perception (see Humes and Dubno, Chapter 8). These errors, in turn, could cascade upward, making it more difficult for listeners to use higher order processes to extract meaning, store information in memory, or perform any of the other necessary cognitive operations. Central auditory deficits (e.g., declines in binaural processing or declines in the synchrony of neural firing at various levels of the auditory system, discussed in Schmiedt, Chapter 2; Canlon, Illing, and Walton, Chapter 3; Eddins and Hall, Chapter 6) will interfere with scene analysis to make it more difficult for the listener to keep track of different auditory sources and to separate streams of information for subsequent processing. At the cognitive level, age-related declines in speed of processing, working memory capacity, and the ability to suppress irrelevant information might make it more difficult for the listener to handle multiple streams of information, rapidly switch attention from one talker to another, and comprehend and store information extracted from speech for later recall.

In other words, to acquire and use the information contained in spoken language requires the smooth and rapid functioning of an integrated system of perceptual and cognitive processes. This chapter begins by introducing some concepts important to an integrated systems approach in studying spoken language comprehension. It then reviews how this integrated system is affected by (1) age-related auditory declines, (2) age-related cognitive declines, and (3) the interaction between age-related auditory and cognitive declines. Finally, it ends by considering how the system functions in different clinical populations (e.g., users of hearing aids and those with some form of dementia).

7.2 An Integrated Approach to Investigating Spoken Language Comprehension

Taking an integrated approach to investigating sources of age-related declines in spoken language comprehension requires the introduction of a number of concepts that, although well known to cognitive scientists, may be less familiar to those in the hearing sciences. These include the notion of “levels of processing” as well as the interrelated concepts of “executive control,” “limited working memory and attentional resources,” and “processing speed.”

7.2.1 Levels of Processing

In their seminal paper, Craik and Lockhart (1972) argued that cognitive functions such as memory flow from a perceptual analysis of the stimulus but that the degree or elaborateness of information processing depends on the task demands. This approach, rather than viewing perception and cognition as separate modules (or boxes in a flow chart), treats the sensory and cognitive systems as an integrated whole in which those processes one calls sensory or perceptual occur relatively early in the processing sequence, whereas those that are labeled cognitive could be considered as elaborations of these early processes. Importantly, it assumes that the depth to which a stimulus is processed will depend on the task demands. To repeat a sentence does not require the same depth of processing that is required, e.g., to decide whether a statement is true or false or to extract the theme from a short lecture. To repeat a sentence, the listener need only process the acoustic stream to the point at which he or she is able to identify and repeat the words, i.e., up to the lexical stage of processing. Understanding and remembering a short lecture requires more than lexical processing; it requires that the listener integrate the successively encountered words with one another and with world knowledge as well as storing a coherent representation of the lecture in memory for later use. Hence, processing of the information carried in the stimulus when the task is to understand and remember what has been heard is likely to be much more elaborate than it would be when the task is simply to repeat a word.

7.2.2 Executive Control

Implicit in the levels of processing approach is the notion of executive control (see Baddeley 1993; Shallice and Burgess 1993; West 1996 for reviews). If tasks require different kinds or different levels of processing, there must be an executive function that organizes and controls how the acoustic stimulus is processed. For example, at a more cognitive level, the executive would identify what the main themes are, separate relevant from irrelevant information, exclude the latter from further processing, decide what should be stored in memory, and marshal and organize the required resources to accomplish these tasks. At a more perceptual level, it may decide that it has to focus attention on the talker to the left of the listener rather than the one on the right. At an even lower level of processing, it might decide that the important information is coming through the auditory filters serving the coding of low-frequency components of the signal. If the manner in which an acoustic stimulus is processed has such flexibility, there must be executive control over stimulus processing. If so, one interesting question is how extensive this control is; i.e., does executive control extend all the way down to very early perceptual processes? A second question of interest is how aging affects executive control at the various levels of information processing during spoken language comprehension.

There is evidence to suggest that aging is associated with declines in the control processes involved in coordinating distinct tasks (see McDowd and Shaw 2000; Verhaeghen et al. 2003 for recent reviews) and switching between tasks (Mayr et al. 2001; Verhaeghen et al. 2005), although the extent to which such age deficits contribute to age declines in spoken language comprehension has been relatively unexplored. There is also evidence to suggest that aging is associated with declines in the inhibitory control mechanisms that ordinarily prevent irrelevant information from interfering with the processing of relevant information (Hasher and Zacks 1988; Zacks and Hasher 1994). Research that has addressed the extent to which an inhibition deficit can account for age-related declines in spoken language comprehension is discussed in Section 7.5.2.

7.2.3 Limited Working Memory Resources

If processing and storage resources were unlimited, then one could imagine an executive function that would simply assemble all of the resources required for any task no matter how difficult the task and how many different resources were required. Moreover, an unlimited resource model would permit several tasks to be conducted in parallel without any performance decrements in any of the tasks. The more likely scenario is that processing and storage resources are limited (e.g., Craik and Byrd 1982; Baddeley 1986). As a result, one might expect that performance on one or more aspects of a task would deteriorate as (1) the complexity of the acoustic scene is increased (e.g., through the addition of competing sound sources), (2) the semantic or syntactic difficulty of the speech material is increased (e.g., switching from a narrative story to a lecture on non-Euclidean geometry), or (3) the task demands are increased (e.g., attempting to answer e-mail while carrying on a phone conversation).

Current information processing models use the term “working memory” to refer to the limited-capacity system that is responsible for the processing and temporary storage of task-relevant information during the performance of everyday cognitive tasks such as language comprehension (Daneman and Carpenter 1980; Baddeley 1986; Daneman and Merikle 1996; Miyake and Shah 1999). In some models, the executive control functions described earlier are part and parcel of the working memory system (see, e.g., “the central executive” component of Baddeley’s (1986) model of working memory). There is considerable evidence to suggest that aging is associated with reductions in working memory resources (Van der Linden et al. 1994, 1999; Bopp and Verhaeghen 2005). The extent to which a working memory deficit can account for age-related declines in spoken language comprehension and how hearing deficits can alter the operation of the working memory system are discussed in Section 7.5.1.

7.2.4 Speed of Processing

Older adults are 1.5 to 2 times slower than younger adults at performing even the simplest of tasks, such as pressing a button in response to a tone or deciding whether two stimuli are perceptually alike (Cerella 1990). Given the ubiquity of age-related slowing, it is not surprising that one of the most dominant theories among aging researchers is that a generalized slowing in brain function with age is associated with most, if not all, of the age-related declines in performance on complex cognitive tasks such as problem solving, reasoning, and language comprehension (Salthouse 1996; for a meta-analysis, see Verhaeghen and Salthouse 1997). According to this theory, older adults would find it difficult to understand someone who is talking rapidly or to follow a conversation when there are multiple overlapping talkers because the rate of flow of information approaches or exceeds the maximum rate that can be accommodated by the cognitive processes involved in language comprehension (Wingfield 1996). The extent to which processing speed deficits can account for age-related declines in spoken language comprehension is discussed in Section 7.5.3.

7.2.5 Evaluating How Age Affects the Comprehension of Spoken Language

An integrated model of how listeners process speech and other complex acoustic stimuli is one in which the labels “sensory” and “cognitive” are simply convenient ways of referring to the earlier and later stages of a processing system in which the level or depth of processing depends on task demands (for a related model, see Wingfield and Tun 2007). Moreover, it is assumed that there is an executive function capable of marshalling and organizing the different resources required when a listener processes speech and that these resources are limited. In taking an integrated approach to understanding how adult aging affects spoken language comprehension, the following questions will be considered. How do age-related changes in sensory processes affect comprehension and memory? How do age-related changes in cognitive mechanisms such as working memory resources, inhibitory control, and processing speed affect comprehension and memory? How do sensory and cognitive processes interact in the context of spoken language comprehension? Section 7.3 begins by reviewing how age-related changes in the early (sensory and perceptual) processes might affect the comprehension and memory of complex auditory signals.

7.3 The Effects of Age-Related Changes in Sensory Processes on the Comprehension of Spoken Language

7.3.1 Listening in Quiet

As earlier chapters have indicated, aging is associated with elevated thresholds (especially in the high-frequency region, see Fitzgibbons and Gordon-Salant, Chapter 5), losses in spectral and temporal acuity (see Fitzgibbons and Gordon-Salant, Chapter 5), and possible losses of neural synchrony in the auditory pathways (see Schmiedt, Chapter 2; Canlon, Illing, and Walton, Chapter 3). Provided that these losses are not too severe and that the signal level is adequate, they have little or no effect on simple speech recognition tasks in quiet. However, even though word recognition accuracy measures may be at ceiling in such situations, there is no guarantee that all of the individual speech sounds or words are easily identified or that listening is effortless. For example, high-frequency hearing losses will lead to errors in identifying isolated phonemes in quiet (e.g., van Rooij and Plomp 1992; Humes 1996). However, when these phonemes are embedded in a sentence, the listener is able to make use of sentential context to correct such errors. Hence, if the listener did not clearly hear the final phoneme (was the last word “risk” or “wrist”?), she or he could use semantic context to eliminate the ambiguity. In other words, the listener can use his or her knowledge of the language to enhance phoneme or word identification. Moreover, there is some evidence that older adults are at least as good as, if not better than, younger adults at using sentential context to reduce ambiguity (Pichora-Fuller et al. 1995; Dubno et al. 2000; Sheldon et al. 2008b; Pichora-Fuller 2008).

Such use of sentence context would be an example of how top-down cognitive-level resources can be deployed to enhance or support lower-level perceptual processes. However, within our current model, the allocation of higher-level processes to support phoneme or word identification could reduce the pool of resources available for higher-order tasks. Hence, although older adults in the early stages of presbycusis might be able to perform as well as younger adults when there is sufficient contextual support and minimal task demands, age-related losses in hearing could potentially lead to age differences in performance on more demanding tasks. For example, older listeners are able to identify 100% of the high-context sentence-final words of the revised speech perception in noise (R-SPIN) test (Pichora-Fuller et al. 1995) at moderate-to-high signal-to-noise ratios (SNRs) but are unable to remember detailed information or deduce and report themes as well as younger adults when listening to a short lecture in a quiet background (Schneider et al. 2000). Part of the reason why performance might be poorer in the latter than in the former situation may be that older adults, because they are more likely to suffer from peripheral auditory processing deficits, need to engage higher-order cognitive processes for word recognition more frequently than younger adults, thereby depleting the pool of resources available for integrating information across words, extracting themes, and storing relevant information for later recall.

Of course, it could equally well be the case that the reason why older adults may find it more difficult to comprehend and recall information presented in short lectures is that they are also experiencing cognitive declines in the processes that have to be engaged for comprehension and recall of lecture material or that the executive is not as efficient in marshalling the resources required for this task. How one can assess the relative contributions of these two sources (age-related hearing declines versus cognitive declines) is considered in Section 7.6.

7.3.2 Listening in Noise

Although hearing loss can account for most of the speech-recognition problems experienced by healthy older adults in quiet (see Humes and Dubno, Chapter 8), the elevated thresholds and reduced spectral acuity associated with presbycusis can only account for part of the difficulties that older adults experience in noisy situations. This has led a number of investigators to examine the potential contribution of age-related declines in temporal acuity to the speech-recognition problems of older adults in noisy situations.

Temporal cues relevant to speech processing have been described at three main levels (Greenberg 1996): subsegmental (phonetic), segmental (phonemic), and suprasegmental (syllabic and lexicosyntactic). Subsegmental fine-structure cues include periodicity cues based on the fundamental frequency and harmonic structure of the voice. Some types of segmental information are provided by local gap and duration cues in the envelope that contribute to phoneme identification (e.g., presence of a stop consonant, voice onset time). Suprasegmental cues, such as amplitude fluctuations in the region of 3 to 20 Hz, convey prosodic information related to the rate and rhythm of speech and support both syntactic and lexical processing of the information in the speech signal. For example, Shannon et al. (1995), in their classic study of noise-vocoded speech, showed that speech could still be recognized when most of the segmental and subsegmental information in the speech signal was removed by (1) breaking the speech signal into a relatively small number of frequency bands, (2) extracting the amplitude envelope in each band, and (3) using these amplitude envelopes to modulate bands of noise whose bandwidths were identical to those in (1). Shannon et al. were able to show that speech recognition was possible even when as few as two to four bands were used in vocoding. This study clearly demonstrates that listeners can use the envelope characteristics of the speech signal in different spectral regions for word recognition. Recently, Sheldon et al. (2008a) have shown that good-hearing older adults need a larger number of frequency bands in the vocoded speech to perform as well as younger adults in a speech-recognition task. Hence, there is some evidence to indicate that at least some older adults are beginning to experience difficulties at this level of temporal processing.
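
The three vocoding steps described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the procedure of Shannon et al. (1995): the second-order band-pass filters, the logarithmically spaced band edges, the 30-Hz envelope smoothing, the uniform noise carrier, and all function and parameter names are choices made here for the sake of the example.

```python
import math
import random

def bandpass(signal, lo, hi, fs):
    """Second-order (biquad) band-pass filter with a passband of roughly lo..hi Hz."""
    f0 = math.sqrt(lo * hi)            # geometric center frequency of the band
    q = f0 / (hi - lo)                 # quality factor from the bandwidth
    w0 = 2 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2 * q)
    b0, b2 = alpha, -alpha             # b1 is zero for this filter type
    a0, a1, a2 = 1 + alpha, -2 * math.cos(w0), 1 - alpha
    out, x1, x2, y1, y2 = [], 0.0, 0.0, 0.0, 0.0
    for x in signal:
        y = (b0 * x + b2 * x2 - a1 * y1 - a2 * y2) / a0
        out.append(y)
        x2, x1, y2, y1 = x1, x, y1, y
    return out

def envelope(signal, fs, cutoff=30.0):
    """Amplitude envelope: full-wave rectification, then one-pole low-pass smoothing."""
    a = 1 - math.exp(-2 * math.pi * cutoff / fs)
    out, y = [], 0.0
    for x in signal:
        y += a * (abs(x) - y)
        out.append(y)
    return out

def noise_vocode(signal, fs, n_bands=4, f_lo=100.0, f_hi=4000.0, seed=0):
    """Replace the fine structure in each band with envelope-modulated noise."""
    rng = random.Random(seed)
    # Assumed band layout: n_bands logarithmically spaced bands from f_lo to f_hi.
    edges = [f_lo * (f_hi / f_lo) ** (k / n_bands) for k in range(n_bands + 1)]
    out = [0.0] * len(signal)
    for lo, hi in zip(edges[:-1], edges[1:]):
        # Steps (1) and (2): isolate the band and extract its amplitude envelope.
        env = envelope(bandpass(signal, lo, hi, fs), fs)
        # Step (3): modulate noise restricted to the same band, then sum the bands.
        noise = [rng.uniform(-1.0, 1.0) for _ in signal]
        carrier = bandpass(noise, lo, hi, fs)
        for n, (e, c) in enumerate(zip(env, carrier)):
            out[n] += e * c
    return out
```

Calling `noise_vocode(samples, fs, n_bands=2)` on a list of samples (with `f_hi` below half the sampling rate) yields a signal that preserves the slow amplitude fluctuations in each band while discarding the periodicity and fine-structure cues, which is the manipulation that makes the number of bands a useful experimental variable.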

Over the past few decades, a large number of studies (see Fitzgibbons and Gordon-Salant, Chapter 5) have shown that older adults find it more difficult to detect gaps, to discriminate between different gap durations, or to detect a change in the duration of a sound. Losses in temporal acuity at this level have obvious implications for phoneme recognition. Reduced gap discrimination ability could lead to problems detecting stop consonants, and loss of sensitivity to vowel duration removes an allophonic cue to vowel identity. Hence, there is good reason to believe that losses in temporal acuity at a segmental level would make speech recognition more difficult for older than for younger adults.

Finally, a few studies are beginning to suggest that some older adults may be experiencing losses in neural synchrony (e.g., Boettcher et al. 1996; Mills et al. 2006). It has long been known that the firing pattern in primary auditory afferents is phase locked to the signal, with the degree of phase locking decreasing as frequency increases (see Schmiedt, Chapter 2). Age-related losses in synchrony would, for example, make it more difficult for an older adult to identify a talker or discriminate between talkers based on their characteristic fundamental frequency and/or to track that talker’s voice in a complex auditory scene. Recently, Pichora-Fuller et al. (2007) were able to reduce young adults’ performance on the R-SPIN test to that characteristic of older adults by artificially increasing the degree of asynchrony in the speech signal, thereby mimicking a loss of neural synchrony. Hence, losses in neural synchrony could make it more difficult for older than for younger adults to comprehend and remember speech, especially in difficult or noisy situations.

7.4 Effects of Age-Related Changes in More Central Auditory Processes on the Comprehension of Spoken Language

To fully comprehend the auditory scene, listeners have to locate and perceptually segregate the sound sources in their environment (auditory scene analysis; Bregman 1990) so that they can focus their attention on target sources and ignore or suppress the processing of information from irrelevant sources. This is especially difficult to accomplish in reverberant environments because listeners in such situations typically receive not only the direct wave from each sound source but also myriad reflections off environmental surfaces. To successfully parse the auditory scene in these environments, the auditory system has to be able to recognize when, for example, a waveform arriving at one ear is a filtered and time-delayed version of the same waveform that arrived at the other ear a few milliseconds earlier, so that the information available in both waves can be fused into a single source and distinguished from other sound sources (see Eddins and Hall, Chapter 6).

The ability to successfully parse auditory scenes is influenced by a number of factors. First, the greater the spectral differences among sources, the easier it will be to segregate them. Brungart (2001) has shown that it is easier to segregate one talker from another when the two talkers differ substantially with respect to their fundamental frequencies and other acoustic features of their voices. Second, differences in harmonic structure between two sound sources can facilitate source segregation. For example, changing the frequency of one of the harmonics of a complex tone leads to its emergence as a second auditory event (Alain et al. 2001, 2006). Third, spatially separating sound sources also will lead to improved source separation (see Freyman et al. 1999). In general, the more two sounds differ with respect to their acoustic properties and the greater their separation in space, the easier it is to perceptually segregate them.

Once the auditory scene has been parsed into its component sound sources, listeners find it easier to focus their attention on the target talker and disregard or suppress information from competing talkers. Otherwise, the intrusion of information from irrelevant talkers might interfere with the target talker’s message. Such interference is often referred to as perceptual or informational masking (see Freyman et al. 1999; Schneider et al. 2007).

Several cognitive theories suggest that older adults might be more susceptible to intrusions from irrelevant or distracting stimuli than younger adults because of age-related changes in cognitive functioning (e.g., the inhibitory deficit hypothesis; Hasher and Zacks 1988; Hasher et al. 1999). If older adults are more susceptible to distraction, then even if they are able to segregate sources as well as younger adults, they may still benefit less from cues (such as spatial separation) that release younger listeners from informational masking. Alternatively, signal degradation that results from a deteriorating sensory system may make it more difficult to perceptually segregate the target talker from irrelevant talkers, thereby leading to a greater degree of informational masking in older listeners (also discussed in Humes and Dubno, Chapter 8).

Finally, successful scene analysis could be adversely affected by age-related processing limitations in central auditory processes (Snyder and Alain 2007; Canlon, Illing, and Walton, Chapter 3). For example, an age-related diminution in the ability to fuse left ear and right ear correlated signals could significantly reduce the ability of an older person to parse an auditory scene. This, in turn, would lead to greater interference by competing talkers, i.e., to a greater degree of informational masking. This section focuses first on how age-related changes in more central auditory processes might hamper perceptual segregation of sound sources. Then it explores the effects of age on source segregation and informational masking.

7.4.1 Effects of Age on Processing Capacity and Executive Control Over Auditory Processes

Spoken language comprehension requires the integration of information coming from different auditory channels. For example, to segregate and correctly identify two simultaneously spoken vowels that differ in fundamental frequency, the listener groups together all frequencies that are harmonically related. The ability to do this could be limited by age-related declines in the capacity of the auditory system to process and integrate information from several auditory filters. In addition, age-related declines in the ability to control the gain in these auditory channels or to control other aspects of auditory processing could also lead to age-related comprehension problems in situations where there are multiple talkers and/or other sound sources. What is known about how age affects auditory channel capacities and influences how information processing is controlled in these channels is discussed in this section.

7.4.1.1 Age Differences in Channel Capacity

Miller (1956) showed that the ability of listeners to identify stimuli varying along a single dimension is limited by that dimension’s bandwidth or channel capacity. Murphy et al. (2006b) measured the channel capacities of younger and older adults by having them identify pure tones that differed only in intensity by pressing a button corresponding to the intensity of the tone (an absolute identification paradigm). If older adults were less able to distinguish intensities because of cochlear degeneration, then their ability to identify the tones would have been poorer than that of younger adults. To avoid possible confounds arising from such age-related sensory declines, the intensity differences between adjacent tonal intensities were always large enough that they were nearly perfectly discriminable to both younger and older adults. When the adjacent tones were perfectly discriminable, there were no age differences in the accuracy with which individuals were able to identify among two to eight pure tones. Hence, with respect to stimulus intensity, it does not appear that there is any diminution in channel capacity with age.
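
In absolute identification experiments of the kind analyzed by Miller (1956), channel capacity is usually expressed as the information transmitted, in bits, between the stimulus presented and the response given. The following sketch shows how that quantity can be computed from a stimulus-by-response confusion matrix. It is an illustration only and not the analysis reported by Murphy et al. (2006b); the function names and the example matrices are hypothetical.

```python
import math

def transmitted_information(confusions):
    """Information transmitted (bits) for a stimulus-by-response confusion matrix.

    confusions[i][j] is the number of trials on which stimulus i was
    presented and response j was given.
    """
    total = sum(sum(row) for row in confusions)
    row_sums = [sum(row) for row in confusions]
    col_sums = [sum(col) for col in zip(*confusions)]
    bits = 0.0
    for i, row in enumerate(confusions):
        for j, count in enumerate(row):
            if count == 0:
                continue  # empty cells contribute nothing to the sum
            p_xy = count / total
            # p(x, y) / (p(x) * p(y)) written in terms of raw counts
            bits += p_xy * math.log2(count * total / (row_sums[i] * col_sums[j]))
    return bits

def perfect_matrix(n, trials_per_stimulus=10):
    """Hypothetical data: every one of n stimuli is always identified correctly."""
    return [[trials_per_stimulus if i == j else 0 for j in range(n)]
            for i in range(n)]
```

With error-free identification of eight equally likely tones, `transmitted_information(perfect_matrix(8))` equals log2(8) = 3 bits, whereas responses that are unrelated to the stimuli transmit 0 bits. Equal accuracy in younger and older listeners across set sizes therefore implies equal transmitted information for the intensity dimension.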

There is some evidence suggesting that the bandwidth and/or the processing strategy differs between younger and older adults when they are asked to identify the temporal duration of pure tones by pressing a button corresponding to the duration. Specifically, McCormack et al. (2002) reported that older adults were less accurate than were younger adults in identifying pure tones that varied only in duration. However, the tonal durations used by these investigators were such that some of the pairs of adjacent stimuli were likely to be below the discrimination thresholds of older adults (Bergeson et al. 2001). Hence, the poorer performance of older adults in identifying tones differing only in duration could have reflected the inability of older adults to distinguish between pairs of adjacent stimuli rather than age-related diminution in the channel capacity for stimulus duration. Further studies are needed to see if there are significant reductions in the bandwidth of this and other kinds of auditory channels.

7.4.1.2 Age-Related Differences in Top-Down Control Over Auditory Gain

A number of studies have documented the existence of a nonlinear cochlear amplifier (see Nobili et al. 1998; Schmiedt, Chapter 2). This mechanism is thought to amplify low-intensity sounds and compress high-intensity sounds (Dallos 1997; Robles and Ruggero 2001). Parker et al. (2002) and Gordon and Schneider (2007) have presented evidence that this nonlinear amplifier is under top-down control and argued that the gain of the amplifier is set to maximize discriminability and to protect the auditory system from overload. Specifically, these investigators argued that the degree of compression imposed on a stimulus was modulated by the participant’s expectations concerning how intense the stimulus to be presented might be. In other words, the listener’s expectations appeared to control, in a top-down fashion, the gain of the nonlinear amplifier. Hence, loss of top-down control over this nonlinear amplifier could limit the ability of the auditory system to function over a wide range of amplitudes and could lead to declines in discriminability. Because there is a widespread loss of outer hair cells and changes in endocochlear potentials in aging, it is possible that there is a loss in gain control with age. Nevertheless, Murphy et al. (2006b) did not find any age-related changes in top-down control over this system in older adults with good hearing. Hence, it appears that top-down control is preserved in older adults with good hearing.

7.4.1.3 Age-Related Changes in Automatic Versus Controlled Processing

Although age-related decrements in early auditory processing should lead to an impoverished representation of the signal, it is always possible that the perception of the signal can be enhanced and performance improved by exerting compensatory top-down control over how information is processed. For example, attention could be focused on the signal’s frequency components (auditory attention bands; Dai et al. 1991) or spatial position (e.g., Mondor et al. 1998; Boehnke and Phillips 1999), thereby enhancing signal quality. Hence, when hearing becomes difficult, either because of poor acoustics or because of hearing loss, one would expect listeners to rely more heavily on controlled processing to compensate for poor signal quality. Accordingly, one might expect older adults to more frequently engage in controlled processing than younger adults because of age-related deficiencies in the automatic (bottom-up) processing of speech in competing noise.

Evidence in support of this notion comes from a study by Alain et al. (2004) that compared the mismatched negativity (MMN) wave elicited when young, middle-aged, and older adults listened for a deviant stimulus in an oddball paradigm. The standard stimulus was a tone pip; the deviant stimulus was a gap in an otherwise continuous tone pip. Event-related potentials (ERPs) were collected under two conditions: (1) when the listeners were instructed to ignore the auditory stimulus and perform a concomitant visual serial-choice reaction-time task (passive listening) and (2) when the listeners were instructed to pay attention to the stimulus and to respond as quickly and as accurately as possible by pressing a button whenever they heard a stimulus with a silent gap (active listening). Figure 7.1 shows average MMN waves in the passive listening condition for young, middle-aged, and old adults to a deviant stimulus that could be easily detected under active listening conditions. Specifically, the gap duration of the deviant stimulus presented to an individual in this passive listening condition was the shortest gap duration that produced a hit rate for that individual in the active listening condition that was within 2% of the asymptotic value of her or his psychometric function relating hit rate to gap duration. Figure 7.1 shows that a significant MMN in the passive listening condition was observed for younger adults when the deviant stimulus was one that they could easily detect in the active listening condition. In contrast to the findings of a clear MMN for younger adults, in middle-aged and older adults, stimuli that were clearly detectable in active listening conditions failed to produce an MMN in the passive listening condition. The lack of a clear MMN in all but younger adults in this passive listening condition indicates an age-related decrement in automatic processing.
However, under active listening conditions, in which listeners presumably engage in controlled processing, the same near-threshold stimuli were readily detected by all age groups. This pattern of results suggests that older adults are able to use top-down processing to compensate when they are actively listening to detect the auditory signal, whereas younger adults apparently do not need to engage such processes. Although such compensation during active listening may overcome age-related problems in passive listening, the increase in effort required to listen in this way may deplete limited cognitive resources in older adults (Pichora-Fuller 2006).
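The rule used to choose each listener's deviant gap can be sketched in a few lines. This is only an illustration of the selection criterion described above, not the procedure used by Alain et al. (2004): the function name, the example gap durations and hit rates, and the reading of "within 2%" as two percentage points are all assumptions.

```python
def select_deviant_gap(gaps_ms, hit_rates_pct, asymptote_pct, tolerance_pct=2):
    """Return the shortest gap whose hit rate lies within `tolerance_pct`
    percentage points of the asymptote, or None if no gap qualifies."""
    candidates = [
        gap
        for gap, hit in zip(gaps_ms, hit_rates_pct)
        if abs(asymptote_pct - hit) <= tolerance_pct
    ]
    return min(candidates) if candidates else None


# Hypothetical active-listening data for one listener (not from the study).
example_gaps_ms = [2, 4, 6, 8, 10, 12]
example_hits_pct = [10, 45, 80, 96, 98, 99]
chosen_gap = select_deviant_gap(example_gaps_ms, example_hits_pct, asymptote_pct=99)
print(chosen_gap)
```

Selecting the gap per listener in this way equates detectability across individuals in the active condition, so that any group difference in the passive condition cannot be attributed to the deviant simply being harder to hear.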

Fig. 7.1

Mismatched negativity waves for young, middle-aged, and older adults in the passive listening condition. The gap size in the deviant stimulus was the shortest gap duration that produced a hit rate within 2% of the asymptotic value reached by the psychometric function relating hit rate to gap detection from the active listening condition. Even though all deviant stimuli were approximately equally detectable in the active listening condition, a significant mismatched negativity (MMN) was only observed for young listeners in the passive condition. (Adapted from Alain et al. 2004 with permission of the author.)

7.4.2 Effects of Age on Source Segregation

The most comprehensive study to date of age-related changes in the ability to segregate two competing speech sources was conducted by Humes et al. (2006). These investigators pitted two sentences from the coordinate response measure (CRM; Bolia et al. 2000) against each other. The sentences were of the form “ready (call sign), go to (color and number) now” and were played simultaneously. The listener was instructed to attend to the sentence containing the designated call sign and to report the color and number associated with that call sign. The call sign to be attended was presented before or after the two CRM sentences were played. For example, in the before condition, the call sign (e.g., “baron”) was presented for three seconds before the two sentences were played (e.g., “ready baron, go to red eight now” vs. “ready charlie, go to green one now”). In the after condition, the call sign was presented for three seconds immediately after the two sentences were played. In both conditions, the correct response for this example is “red eight.” Note that the after condition places a greater load on working memory because the listener must retain the information from both sentences until the call sign is presented, whereas in the before condition, listeners can focus their attention on the sentence containing the call sign and either inhibit the processing of the competing sentence or delete its content from working memory. Hence, as seen in Section 7.5.1, according to the prevailing view concerning age-related changes in working memory, age differences should be larger in the after condition than in the before condition.

All sentences were spectrally shaped to adjust for any hearing loss that a participant might be experiencing. This manipulation significantly reduces the probability that any age-related differences in performance on this task are due to age-related cochlear pathologies. The two CRM sentences used in a trial could also be presented to the same ear or to different ears and be spoken by talkers of the same or different genders. Hence, there were four conditions and two age groups. Figure 7.2 depicts the rationalized arcsine transform of the percentage of correct identifications in the before condition. Three features of these data should be noted. First, for both age groups, performance is lowest for same gender sentences presented monaurally; i.e., accuracy is lowest when there are no spatial or gender cues to support source segregation. Second, for both age groups, switching from monaural to dichotic presentations produces a larger improvement in performance for same gender than for different gender talkers. Thus the beneficial effect of spatial separation is larger when there are fewer cues of other sorts to support source segregation. Third, the only significant age difference occurred when the presentation was monaural but the two talkers differed in gender. This suggests that younger adults benefit more than older adults from differences in voice characteristics. However, this age difference in sensitivity to voice characteristics only becomes evident in the absence of other cues to support source segregation.
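For readers unfamiliar with the rationalized arcsine transform, the following sketch shows the form commonly attributed to Studebaker (1985). The chapter does not state which variant Humes et al. (2006) applied or how many items were scored per condition, so the formula choice and the example counts are assumptions made for illustration.

```python
import math


def rau(correct, total):
    """Rationalized arcsine units for `correct` responses out of `total` items
    (Studebaker 1985 form; assumed here, not confirmed by the chapter)."""
    theta = math.asin(math.sqrt(correct / (total + 1))) + math.asin(
        math.sqrt((correct + 1) / (total + 1))
    )
    return (146.0 / math.pi) * theta - 23.0


# Hypothetical scores: mid-range values stay close to percent correct,
# while scores near 0% or 100% are stretched beyond that range.
for n_correct in (0, 50, 100):
    print(n_correct, round(rau(n_correct, 100), 1))
```

The transform is roughly linear in percent correct over the middle of the scale and expands the extremes, which stabilizes the variance of scores near floor and ceiling before they are compared statistically.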

Fig. 7.2

Transformed percentage correct scores (means ± SE) for the coordinate response measure (CRM) target for each of 8 listening conditions when the call sign was presented before the target and competing sentence. RAU, rationalized arcsine unit. *Significant difference between the young and old listeners (p < 0.01). (Adapted from Humes et al. 2006 with permission of the author.)

Figure 7.3 presents the equivalent data in the after condition. As was the case for the before condition, the worst performance is observed when the cues for source segregation are minimal (same gender, monaural presentations) and the effect of spatial separation is larger for same gender than for different gender speakers. Here, however, there are significant age effects in three of the four conditions, suggesting that older adults find it more difficult than do younger adults to retain information in working memory until the call sign cue is presented.

Fig. 7.3

Transformed percentage correct scores (means ± SE) for the CRM target for each of 8 listening conditions when the call sign was presented after the target and competing sentence. *Significant difference between the young and old listeners (p < 0.01). (Adapted from Humes et al. 2006 with permission of the author.)

The results of this experiment indicate that both younger and older adults experience a great deal of difficulty in attending to the target sentence when the cues supporting source segregation are minimal (same gender for target and competitor, monaural presentation). When a gender cue is included (but the presentation is monaural), younger adults benefit more from this cue than do older adults. One possibility is that age-related losses in temporal synchrony make it more difficult for older adults to discriminate between two voices based on their fundamental frequency and harmonic structure. An explanation of this sort implies that the failure to be able to segregate the voices based on gender reflects age-related deficiencies in the earlier stages of auditory processing rather than a failure in the later stages of processing involving attention and/or working memory.

Finally, as might be expected from studies of working memory, age differences appear to be greater when the call sign is given after CRM sentence presentation as opposed to before it. Note, however, that age differences in performance in the before case under dichotic listening conditions may have been limited by ceiling effects in the young (see Fig. 7.2). Hence, older adults appear to be able to perform as well as younger adults on speech-recognition tasks, provided that there are sufficient cues to source segregation and that the working memory requirements of the task do not exceed their working memory capacity.

In a recent study, Singh et al. (2008) explored the effects of location certainty in younger and older listeners in a paradigm developed by Kidd et al. (2005). In one condition of their study, three sentences were presented simultaneously from three different “perceived” locations (perceived location of a sentence in the sound field was manipulated using the “precedence” effect; see Eddins and Hall, Chapter 6). Again, the task was to identify the color and number associated with a particular call sign. The effects of two within-subject factors were explored: (1) temporal position of the call sign (before or after the three sentences were presented) and (2) probability (100%, 80%, 60%, 33%) of the target sentence appearing at the central location (announced before a block of trials). Interestingly, although there was a main effect of age, the effects of location certainty and prior knowledge of the target’s call sign were identical for younger and older adults. Because Singh et al. did not compensate for possible cochlear declines, the most likely reason for the poorer performance of older adults was an age-related decline in peripheral auditory processing. The lack of any age interaction suggests that both younger and older adults benefit equally from prior knowledge of target location and target call sign. Hence, although younger adults may outperform older adults when there is competing speech, older adults appear in some cases to be able to utilize information as effectively as younger adults, suggesting that the ways in which listeners handle competing auditory information streams may be relatively unaffected by age.

However, before reaching the conclusion that older adults may be as adept at source segregation as younger adults once age-related changes in cochlear processes have been taken into account, it would be prudent to consider how age might affect other factors that are known to facilitate auditory scene analysis and source segregation.

7.4.2.1 Source Segregation Based on Harmonic Structure

The auditory system tends to interpret frequencies that are harmonically related as belonging to a single source. Hence, age-related changes in the ability to determine whether a single frequency or a group of frequencies are harmonically related to another group of frequencies might be expected to affect scene analysis. Alain et al. (2001) tested the ability of younger and older adults to detect a mistuned harmonic. They found that older adults required a greater degree of mistuning than younger adults to detect a change, with the age difference greater for shorter than for longer sounds. Hence, older adults would presumably require a greater separation in fundamental frequency (and, correspondingly, in harmonic structures) to segregate two voices. Indeed, when two vowels are presented simultaneously, the ability of listeners to correctly identify both vowels increases with the separation between the fundamental frequencies of the two vowels (e.g., Summerfield and Assmann 1991; de Cheveigné 1997; Summers and Leek 1998). A recent study found that older adults benefit less than younger adults from differences in fundamental frequency between two concurrent voices, pointing to age differences in bottom-up auditory processes (Vongpaisal and Pichora-Fuller 2007). Interestingly, an examination of the pattern of brain activity in a different group of participants performing the same task indicated that there was top-down engagement of compensatory cognitive processes in older adults compared with younger adults (Snyder and Alain 2005). Hence, it appears that older adults are less sensitive to differences in the harmonic structure of voices and engage compensatory mechanisms to partially offset age-related peripheral deficits. This would help explain why, in the Humes et al. (2006) study, younger adults profited more than older adults from gender differences between the target and competitor CRM sentences.

7.4.2.2 Source Segregation Based on Attentional Focus

Another possible reason why older adults might not be as efficient as younger adults in using the acoustic cues associated with gender differences to segregate voices or in capitalizing on differences in fundamental frequency when identifying simultaneously presented vowels is that they might not be able to focus their attention on specific frequency regions as well as younger adults. Studies of auditory attentional focus (e.g., Dai et al. 1991; Hafter and Schlauch 1991) have shown that younger adults are capable of focusing their attention on a specific frequency region if they are expecting a target to have energy in that region. This capacity is usually demonstrated using a probe-frequency paradigm. In this paradigm, the listener is asked to detect a pure tone of a known frequency in a background of bandlimited Gaussian noise. Typically, the intensity of the pure tone (e.g., 1 kHz) is adjusted so that the listener is performing at a targeted degree of accuracy (usually between 85 and 90% correct). The experimenter then occasionally replaces the target tone (e.g., 1 kHz) with a probe tone of another frequency but at the same SPL (for instance, a 980-Hz tone), and then computes how accurate listeners were in detecting the probe tone. Typically, performance on those trials containing a probe decreases with increasing separation between the probe tone’s frequency and the target tone’s frequency.

Because older adults are often thought to have attention deficits, one might expect their attention bands to be broader than those of younger adults. A possible physiological basis for a broader attentional focus has come from the work of Scharf et al. (1994, 1997), who found that individuals who have had their olivocochlear bundle severed do not demonstrate attentional selectivity. The olivocochlear bundle is believed to be important in allowing for the top-down control of the micromechanical properties of the cochlea and may play a significant role in detecting signals in noise (Pickles 1988). Because the olivocochlear bundle synapses mostly with outer hair cells and because outer hair cells are known to suffer widespread damage with age and/or noise exposure over the lifespan (Willott 1991), one might expect a broadening of attentional focus in older adults. However, Ison et al. (2002) and Murphy et al. (2006b) have shown that the attentional filter has the same bandwidth in younger and older adults. Figure 7.4 plots the percentage of tones correctly detected as a function of the difference in frequency between the probe and target tones for younger and older adults. Clearly, the greater the separation between the probe tone’s frequency and the frequency of the target tone, the less likely it is that the probe tone will be detected. Note also that the bandwidth of this “attentional” filter is quite narrow. As Figure 7.4 shows, a probe tone that differs in frequency from the target tone by as little as 100 Hz simply is not detected (the detection rate for a probe tone whose frequency is 100 Hz less than that of the target tone is the same as the probability of reporting a tone when no tone was presented). Hence, both younger and older adults are equally capable of narrowing their attentional focus to a very small region on the basilar membrane (approximately the size of a critical band).

Fig. 7.4

Percent detection (means ± SE) for the signal and for off-frequency probes at the same level that deviated from the signal by ±25 to 100 Hz in younger and older listeners. Also shown are the false alarm rates (responding signal present when no signal was presented). (Adapted from Ison et al. 2002 with permission of the author.)

7.4.2.3 Source Segregation Based on Spatial Separation

Perhaps one of the most important cues to source segregation is spatial separation. Consider the case in which there is a single talker in a noisy background, with the location of both the talker and the noise source being to the listener’s left. When both talker and noise are colocated, the SNR in each ear will be approximately the same. However, if the talker moves to the right of the listener, with the noise remaining on the left, there will be a significant increase in the SNR in the right ear, which could readily facilitate source segregation. In addition, the change in target position could lead to interaural timing differences that might unmask the talker’s voice. Hence, spatial separation should be a powerful cue to source segregation.

Because a number of studies have indicated that older adults are less sensitive to binaural cues than are younger adults (see Eddins and Hall, Chapter 6), one might expect that older adults would be less able to use such cues to achieve the same degree of source segregation as younger adults. Hence, age-related changes in binaural processing could lead to poorer source segregation. This issue is addressed in Section 7.4.3 on informational masking.

7.4.2.4 Source Segregation Based on Prior Knowledge

Listening to a conversation in a noisy environment is much easier when the listener is familiar with the topic of conversation. The most likely reason for this is that a priori knowledge of the topic creates expectations about the semantic content and linguistic structure of the unfolding speech signal that facilitate processing. In particular, if listeners miss some of the words or phrases because the listening situation is difficult, they may be able to recover the lost information from the context provided by the parts of the conversation they have heard and knowledge stored in long-term memory. However, it also is possible that such knowledge helps the listener to focus attention on the relevant voice. For example, if the topic of the conversation of interest is the impending marriage between Emily and Tom and the listener perceives the following sentence fragment “the bridesmaids will,” it is quite likely that this voice is a relevant part of the conversation. Hence, it would make sense for listeners to focus their attention on this auditory stream.

Knowledge and expectations regarding aspects of the unfolding speech signal can also be used to advantage by listeners. Such benefit is illustrated in a study by Freyman et al. (2004) showing that the final word of a masked phrase was recognized better if listeners had already heard the initial part of a phrase than if they were hearing the phrase for the first time. Hence, partial knowledge of the sentence improved recognition of the sentence-final word. It is possible that listeners derived some benefit from priming because they acquired knowledge of the acoustical properties of the initial portion of the utterance; however, the same reduction in informational masking was obtained when the participant read the prime instead of listening to it. Clearly, knowledge of the words alone is sufficient to lead to an improvement in performance, independent of the modality in which that knowledge is acquired. Another possibility is that knowing part of the sentence helps the listener to identify and focus in on the target stream. Hence, it is reasonable to hypothesize that a listener in a complex acoustic environment is capable of using knowledge about the nature of conversation to identify and focus in on the talkers participating in a discussion. It would be very interesting to determine whether older adults are as good as are younger adults in using this kind of prior knowledge to achieve source segregation. To our knowledge, there are no published studies in this area.

7.4.2.5 Source Segregation Based on Other Aspects of the Acoustic Scene

Bregman (1990) has argued that the auditory system will capitalize on virtually any features of the auditory scene that will aid in source segregation. One additional cue for source segregation that should be mentioned is “auditory image size.” Freyman et al. (1999) demonstrated the usefulness of this cue in a study in which they compared performance when both masker and target were presented over the same frontal loudspeaker to performance when the target was presented over the frontal loudspeaker but the masker was presented over both the frontal loudspeaker and 4 ms later over a loudspeaker located to the listener’s right. Note that in the latter condition, the masker was perceived to be frontally located because of the precedence effect (see Section 7.4.3). The target sentences were repeated with much greater accuracy in the latter condition. Even though the location of all images in both conditions remained in the frontal position, the image of the target in the latter condition was more compact than that of the masker, whereas the masker and target had the same degree of compactness when both were presented only over the frontal loudspeaker. This comparison suggests that differences in the compactness of target and masker will improve speech recognition, presumably because it enables a listener to more accurately parse the auditory scene into two different sound sources. How age might alter the effectiveness of this cue to source segregation is not currently known.

In general, there are likely to be a number of acoustic features that could be used to achieve source segregation. How source segregation enhances spoken language comprehension, i.e., how it helps to unmask the speech signal, is considered next.

7.4.3 Effects of Age on Informational Masking

Before discussing the effects of age on informational masking, it is important to distinguish informational masking from energetic masking. When the SNR is low in a spectral region, the energy in competing sound sources can simply overwhelm (mask) the energy in the signal, making it difficult for the listener to extract the target signal from the noise background. This kind of masking is often referred to as “energetic” or “peripheral” masking, and it has been studied extensively (see, e.g., Plomp and Mimpen 1979). In contrast, “informational” or “perceptual” masking occurs when competing signals and background noises interfere with speech recognition at more central auditory- and/or cognitive-processing levels. For example, consider the case in which someone is attempting to attend to one person who is talking when there are two other people who are also talking. To understand what is being said in this situation, the listener must either (1) focus attention on one stream and suppress the other streams of information or (2) attempt to simultaneously process more than one stream at a time. If it becomes difficult for the listener to inhibit the processing of irrelevant information or to simultaneously process more than two information streams, the listener may require a higher SNR for speech recognition and comprehension than would be required if the maskers were acoustically matched nonspeech maskers. Clearly, any factor that facilitates source segregation should make it easier to either (1) focus attention on one source and ignore or suppress the processing of information from the other sources or (2) simultaneously process more than one source at a time. Hence, source segregation should facilitate the release from informational masking more than it would facilitate release from energetic masking.

Many competing sound sources are likely to give rise to some combination of both energetic and informational masking. A common approach to determining what portion of total masking is due to informational versus energetic masking is to compare performance for speech and nonspeech maskers in conditions manipulating auditory or cognitive factors. For instance, spatial separation of the target and masker is one factor that reduces speech-on-speech masking. This release from masking could be due to a reduction in peripheral (energetic) masking and/or a reduction in the amount of interference produced at more central (cognitive) levels. To determine how much of the release from masking is due to the binaural advantage in overcoming acoustical masking versus improved attentional focus in overcoming distraction, the release in speech-on-speech masking seen in conditions of spatial separation may be compared with release when the masker is an unmodulated speech-spectrum noise. It is assumed that there will be equivalent energetic masking by the speech and spectrally matched speech-spectrum maskers. Because speech-spectrum noise is unlikely to initiate any competing phonetic, semantic, or linguistic processing, it should not interfere with speech processing at these more central levels. Therefore, if the release from masking due to spatial separation is larger when the masker is speech than when the masker is speech-spectrum noise, one can infer that the manipulation is effective in reducing interference at more central levels, i.e., in reducing informational masking. Presumably spatial separation is beneficial in reducing energetic masking for both the speech and speech-spectrum noise maskers because interaural cues enable the listener to better segregate the target and masker. 
In addition, presumably spatial separation is beneficial in reducing informational masking because it enables the listener to better isolate the target speech signal from competing speech in the auditory scene.
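The comparison just described can be written as simple arithmetic on speech-reception thresholds. The sketch below is one straightforward reading of that logic rather than a method taken from any particular study; the function names and the threshold values (in dB SNR) are hypothetical.

```python
def spatial_release(threshold_colocated_db, threshold_separated_db):
    """Release from masking: how much lower (better) the threshold SNR becomes
    when target and masker are spatially separated."""
    return threshold_colocated_db - threshold_separated_db


def informational_release(speech_colocated_db, speech_separated_db,
                          noise_colocated_db, noise_separated_db):
    """Extra release obtained with a speech masker over a speech-spectrum
    noise masker, taken as an estimate of release from informational masking.
    Assumes both maskers produce equivalent energetic masking."""
    total_release = spatial_release(speech_colocated_db, speech_separated_db)
    energetic_release = spatial_release(noise_colocated_db, noise_separated_db)
    return total_release - energetic_release


# Hypothetical thresholds in dB SNR (illustrative values only).
extra_release_db = informational_release(
    speech_colocated_db=-2.0, speech_separated_db=-8.0,
    noise_colocated_db=-6.0, noise_separated_db=-8.0,
)
print(extra_release_db)
```

With these made-up numbers, separation yields 6 dB of release for the speech masker but only 2 dB for the noise masker, so about 4 dB of the benefit would be attributed to reduced interference at more central levels.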

An important question is the degree to which age-related changes in either peripheral or central processes reduce the effectiveness of source segregation (however achieved) in providing relief from informational masking. Given that age-related losses in peripheral auditory functioning are likely to reduce the effectiveness of cues to source segregation, one might expect to find less of a release from informational masking in older than in younger adults.

An alternative explanation for possible age differences in release from informational masking could be that older adults are less able than younger adults to benefit from source segregation once it has been achieved because of declines in cognitive capacity. To directly test cognitive theories of age-related declines in spoken language comprehension and bypass age-related differences in sensory processing, Li et al. (2004) used the paradigm developed by Freyman et al. (1999) in which younger and older adults were asked to repeat meaningless target sentences (e.g., “A rose could paint the fish.”) presented in either a noise background (energetic masker) or a background in which two other people were also speaking nonsense sentences (informational masker). The target and masker were perceived as coming from the same spatial location or from different spatial locations. Rather than actually changing the physical location of two loudspeakers to achieve the perception that the target and masker were spatially separated, the paradigm capitalizes on the precedence phenomenon to change the perceived locations. In the precedence paradigm, all stimuli were presented over each of two loudspeakers. If a signal is presented simultaneously from both loudspeakers, it is perceived to be located centrally; however, if the signal presented from one loudspeaker leads the same signal presented from a second loudspeaker, the listener perceives that there is only a single source located at the position of the loudspeaker from which the leading sound was presented (see Zurek 1987 for a review). In the experiment of Li et al. (2004), the target sentence was presented over both loudspeakers, with the right speaker leading the left by 3 ms so that the target sentence was always perceived as coming from the right.
The masker was presented either in the same fashion as the target or with the lag between the presentation of the masker from one loudspeaker relative to the other changed so that the masker was perceived as originating from a different location. Because changing the perceived location in this way does not change the acoustic stimulation at either ear in any significant way (see the Appendix in Li et al.), the amount of energetic masking should not change in any significant way with perceived spatial separation. Nevertheless, changing the perceived location of the masker should provide a release from informational masking if it facilitates segregation of the target and masker.
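The lead-lag manipulation can be illustrated with a minimal sketch that builds a two-loudspeaker version of a signal in which one channel leads the other. This is not the stimulus-generation code used by Li et al. (2004); the function name, the 48-kHz sample rate, and the toy signal are assumptions chosen only to make the 3-ms lead concrete.

```python
def precedence_pair(signal, lead_ms, sample_rate_hz, leading="right"):
    """Return (left, right) channels carrying the same signal, with the
    `leading` channel starting `lead_ms` milliseconds before the other."""
    delay_samples = round(lead_ms * sample_rate_hz / 1000)
    silence = [0.0] * delay_samples
    lead_channel = list(signal) + silence  # padded at the end to equal length
    lag_channel = silence + list(signal)   # onset delayed by `delay_samples`
    if leading == "right":
        return lag_channel, lead_channel
    return lead_channel, lag_channel


# Toy two-sample "signal" with the right loudspeaker leading by 3 ms
# at a hypothetical 48-kHz sample rate (144 samples of delay).
left, right = precedence_pair([1.0, 0.5], lead_ms=3, sample_rate_hz=48000)
print(len(left), left.index(1.0), right.index(1.0))
```

Because both loudspeakers emit the same waveform and differ only in onset time, swapping which channel leads changes where the sound is heard without materially changing the energy reaching the ears, which is the property the paradigm relies on.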

Li et al. (2004) found that both younger and older adults benefited equally from spatial separation when spatial separation was induced using the precedence effect. Interestingly, the only age difference in their experiment was that older adults required a 2.8-dB higher SNR to perform equivalently to younger adults in all conditions. Hence, once age-related differences in peripheral auditory processing are controlled for, the two age groups appear to benefit by the same amount from spatial separation of target and masker.

The fact that older adults in this experiment needed a higher SNR to perform equivalently to younger adults suggests that it should be possible to compensate for a number of age-related deficits in processing speech by improving the SNR for older adults either by improvements in the acoustic environment or by the appropriate use of noise-reduction algorithms in assistive listening technologies.

7.4.4 Tentative Conclusions Concerning Age-Related Changes in More Central Auditory Processes

The available studies of how age-related changes in more central auditory processes might affect spoken language comprehension suggest that older adults with clinically normal audiometric thresholds throughout most of the speech range may be as good as younger adults at source segregation, scene analysis, and release from informational masking once the effects of subclinical age-related deficits in lower-level processing are taken into account. Moreover, there do not appear to be any significant declines in these good-hearing older adults with respect to the bandwidths of auditory channels, ability to focus attention, and/or top-down control over auditory processing. In contrast, a number of findings do point to age-related declines in the automatic processing of near-threshold stimuli. Importantly, it seems that older adults may make more extensive use of controlled top-down processing to compensate during listening. Thus it appears that many central auditory processes are preserved in aging but that they may play a more extensive compensatory role because of age-related declines in lower-level (peripheral or brainstem) auditory processing.

7.5 Effects of Age-Related Changes in Cognitive Processes on Comprehension of Spoken Language

Although there is a fairly large body of research aimed at investigating the extent to which age-related changes in cognitive mechanisms account for age-related declines in language processing (for reviews, see Light 1990; Kemper 1992; Stine 1995; Johnson 2003), it is important to keep in mind that most of the research has focused on the comprehension and recall of written discourse rather than spoken discourse (although see, e.g., Tun et al. 1991; Wingfield and Stine 1992; Titone et al. 2000). As will be seen, there is evidence to suggest that declines in working memory capacity, inhibitory control, and processing speed play a role in the effects of aging on language processing. However, there are vigorous debates as to which of the three plays the primary role (for a review, see Van der Linden et al. 1999).

7.5.1 Working Memory

In most contemporary models of language comprehension, working memory represents “the critical bottleneck in which signals are decoded, concepts are activated, linguistic constituents are parsed, thematic roles are assigned and coherence among text-based ideas is sought” (Stine et al. 1995, p. 1). Consequently, it is not surprising that age-related declines in language comprehension are frequently attributed to age-related declines in working memory capacity. The working memory deficit hypothesis is well supported in the literature. Older adults perform more poorly than their younger counterparts on tasks that assess the combined processing and storage capacity of working memory (see Bopp and Verhaeghen 2005 for a meta-analysis), and these age-related working memory span differences account for a significant proportion of the age-related variance on written and spoken language comprehension tasks (e.g., Van der Linden et al. 1999; Brébion 2003; DeDe et al. 2004).

7.5.2 Inhibitory Control

Some researchers prefer to attribute age-related declines in language comprehension to age-related declines in the ability to inhibit the processing of irrelevant stimuli. According to the inhibition-deficit hypothesis, aging is associated with reduced inhibitory mechanisms for suppressing the activation of goal-irrelevant information (Hasher and Zacks 1988), allowing interfering information to intrude into working memory or preventing no longer relevant information from being purged from working memory. The irrelevant information squanders working memory resources and disrupts the processing of goal-relevant information (Hasher et al. 1999), thereby impairing the reader’s or listener’s ability to construct a coherent representation of the discourse. Support for the inhibition-deficit hypothesis comes from several studies that have shown that measures of inhibition efficiency (e.g., interference on the Stroop color-word task) appear to mediate age-related differences in written language comprehension (Zacks and Hasher 1994; Kwong See and Ryan 1995; Van der Linden et al. 1999). However, not all the reported data are consistent with the inhibition-deficit hypothesis and even the consistent data are compatible with alternative interpretations (see Burke 1997 for a critical review).

7.5.3 Processing Speed

Finally, in all models of language comprehension, the processes involved in the construction of a complete and coherent discourse representation are assumed to be time-consuming. Consequently, it is not surprising that age-related slowing is frequently viewed as the primary contributor to age-related declines in language comprehension. The processing-speed hypothesis is well supported in the literature for both written and spoken language comprehension (e.g., Cohen 1979; Wingfield et al. 1985; Tun et al. 1992; Stine and Hindman 1994; Stine et al. 1995). Studies of reading comprehension have shown that age differences in text memory are much larger when reading is experimenter-paced rather than self-paced (Verhaeghen et al. 1993; Stine-Morrow et al. 2001; Johnson 2003). Fine-grained analyses of self-paced reading times suggest that older adults need to allocate more processing time to new information (Stine et al. 1995) and propositionally dense sentences (Stine and Hindman 1994) than do their younger counterparts.

The contribution of speed of processing to spoken language comprehension has been studied by comparing the performance of younger and older adults when speech is artificially speeded (Wingfield et al. 1985; Gordon-Salant and Fitzgibbons 1993, 1997, 1999; Wingfield 1996), and the typical finding has been that comprehension declines more rapidly for older adults than for younger adults as speech rate increases. Although such a finding is consistent with a slowing hypothesis, there is another possible reason why older adults find it more difficult to handle rapid rates of speech. Speeding speech, in addition to increasing the rate of flow of information, also tends to degrade and/or distort consonant phonemes in the speech signal (Gordon-Salant and Fitzgibbons 1999; Wingfield et al. 1999). Therefore, older adults may be more affected by speeding because their auditory systems are less able to handle these distortions than are the auditory systems of younger adults. Indeed, recent studies have found that if speech is speeded in a way that minimizes the adverse effects of speed-induced acoustic distortions on the auditory systems of older adults, increasing the rate of speech has the same effect on speech recognition (Schneider et al. 2005) and spoken language comprehension (Gordon et al. 2009) in younger and older adults. These results support the view that auditory decline rather than cognitive slowing may be responsible for older adults’ poorer performance in speeded-speech conditions.

7.5.4 Tentative Conclusions Concerning Cognitive Mediators of Age-Related Changes in Spoken Language Comprehension

Although there is a significant body of evidence to suggest that deficits in working memory capacity, inhibition control, and processing speed could all contribute to age-related differences in spoken language comprehension performance, there is conflicting evidence concerning the relative contributions of the three factors. Most researchers acknowledge that these three indices of processing efficiency are interdependent, but the debate continues as to which of the three plays the primary role in accounting for age-related declines in comprehension. For example, Kwong See and Ryan (1995) have argued that the influence of working memory is secondary to the influences of inhibition control and processing speed. On the other hand, Van der Linden et al. (1999) argue that age-related reductions in processing speed and resistance to interference have an indirect influence on comprehension that is mediated by reductions in working memory capacity. Unraveling the relative and independent contributions of these cognitive mechanisms remains a tricky enterprise.

7.6 Auditory-Cognitive Interactions

Equally tricky is the task of investigating the complex interactions between the aging auditory and cognitive systems and how these interactions contribute to the speech-understanding difficulties of older listeners (e.g., van Rooij and Plomp 1992; Humes 1996; Schneider et al. 2000; Murphy et al. 2006a; George et al. 2007). In this section, correlational and experimental approaches to investigating these complex auditory-cognitive interactions are described.

Several studies have used a correlational approach to investigate the relative contributions of auditory and cognitive factors to speech-understanding difficulties in older adults (e.g., van Rooij and Plomp 1992; Humes 1996). To assess auditory and cognitive competence, younger and older listeners were administered tests of basic auditory abilities (e.g., pure-tone sensitivity, frequency discrimination, and duration discrimination) and basic cognitive function (e.g., digit span, Wechsler Adult Intelligence Scale-Revised). Scores on these auditory and cognitive tests were then correlated with performance on a number of tests of speech recognition in which listeners were required to detect, discriminate, or identify nonsense syllables, phonemes, spondees, isolated words, words presented in sentences, or whole sentences in quiet and noise. The best single predictor of word and sentence recognition across the studies was the listener’s pure-tone threshold function. Most of the cognitive measures correlated poorly with speech recognition. Results such as these led Humes (1996) to conclude that auditory declines are primarily responsible for age-related declines in speech-understanding performance. As provocative as these findings are, there is always the concern about making causal inferences from correlational designs. Moreover, it is possible that these particular correlational studies underestimated the contribution of cognitive factors because (1) the particular choice of cognitive measures may not have been the most appropriate and (2) it is unlikely that simple speech detection and discrimination tests fully engage the linguistic and cognitive processes that operate in everyday listening situations.

Some of the limitations found in correlational studies can be redressed by taking an experimental approach that controls for age-related hearing differences and uses more natural listening tasks. For example, Schneider et al. (2000) approximated more naturalistic listening conditions in the laboratory by having participants listen to complex single-talker discourse in quiet or in a background of conversational noise (12-talker babble), conditions that would be similar to attending a 10- to 15-minute lecture with an audience that is either very attentive or a lot less so. The methodology involved presenting the monologues and noise under identical physical conditions to the younger and older listeners, which is the typical approach in cognitive aging research, or adjusting the listening conditions to compensate for the poorer hearing abilities of the older listeners. When the younger and older adults listened to the monologues under identical stimulus conditions (passages were presented at the same sound pressure level to all participants and the noise, when present, was identical for all participants), the older adults provided fewer correct responses to questions about the discourse than did the younger adults. One might be inclined to attribute the negative age difference in this study to declines in cognitive mechanisms such as working memory capacity, inhibition control, or processing speed. However, the notion that age differences were due primarily to cognitive factors was challenged by the results of a second experiment that adjusted the listening situation to make it equally difficult for both young and old adults to identify individual words. In conditions in which it was equally difficult for young and old to recognize individual words, age-related differences in comprehension and recall of the monologues were largely eliminated. 
The latter finding suggested that the speech-understanding difficulties of older adults may be largely a consequence of age-related auditory declines rather than age-related cognitive-linguistic declines. Presumably, perceptual declines in older adults result in inadequate or error-prone representations of external events. These inadequacies and errors at the perceptual level then cascade upward and lead to errors in comprehension (see also McCoy et al. 2005).

Of course, natural listening situations do not simply involve listening to a single talker in a noisy background. Murphy et al. (2006a) investigated potential interactions between perceptual and cognitive demands on central resources by asking younger and older adults to comprehend and remember details from two-person conversations when there were varying degrees of spatial separation between the two talkers. In this study, younger and older adults listened to two-person plays against a background of 12-talker babble. The voices of the two actors were presented either over the same central loudspeaker (colocation condition) or over separate loudspeakers located 45° to the left and the right of the listener (spatial separation condition). In addition, in both conditions, 12-talker babble could be presented over the central loudspeaker only. The SNR in this situation was individually adjusted so that all individuals, both young and old, were equally able to recognize individual words presented over the left, right, or central loudspeakers when these words were unsupported by context. Thus, in both conditions, younger and older adults were tested in conditions in which they performed equivalently well with respect to word recognition.

Figure 7.5 plots the percentage of detailed information correctly recalled as a function of noise level for both younger and older adults when the two voices were spatially separate versus colocated. When spatial position cues are absent (Talker 1, Talker 2, and babble played over the central loudspeaker), younger and older adults performed equivalently, suggesting that younger and older adults are equally adept at executing the cognitive processes that are required for comprehension, memory, and recall in this task. However, older adults were not as good as younger adults in the same task when the two voices were spatially separate from each other and from the source of the babble.

Fig. 7.5 Percentage of detailed information correctly recalled for younger and older adult listeners in quiet and in noise for two conditions of spatial separation between the voices. (Adapted from Murphy et al. 2006a with permission of the author.)

One possible explanation of this result is that perceived spatial segregation in older adults is not as robust and stable as it is in younger adults. Alternatively, because adding spatial separation to the auditory scene could increase the cognitive load (by requiring listeners to switch attention between spatial positions), older listeners might find it more difficult than younger listeners to handle the increased cognitive demands because of resource limitations. In general, because complex listening tasks (such as comprehending a conversation in a noisy environment with competing talkers) require the smooth integration of a number of perceptual and cognitive components, one is more likely to observe age-related deficits in spoken language comprehension in such situations.

At the beginning of this section, it was suggested that the study of auditory-cognitive interactions using more ecologically valid stimuli is a tricky business. So far it appears that when the auditory scene is rather simple (all sound sources originating from the same location) and once one controls for age differences in hearing (by making individual words equally difficult for younger and older adults to recognize), age differences in word recognition and in comprehension of spoken discourse tend to disappear. However, when the auditory scene becomes more complex (spatial separation between talkers and masking noise), age differences emerge, presumably because the task of integrating information coming from two different spatial locations while suppressing the processing of information from a third location places additional demands on working memory and attentional resources. Hence, to the extent that processing resources are more limited in older than in younger adults, one would expect to find that age differences increase as the complexity of the auditory scene increases.

The interpretation above implies that increasing the complexity of the auditory scene either indirectly or directly draws on some of the resources that are involved in spoken language comprehension. A recent study (Heinrich et al. 2008) on the effects of noise on memory suggests that this is indeed the case. Previous studies (Rabbitt 1968, 1991; Murphy et al. 2000) have shown that a continuous background noise or babble (12 people talking simultaneously) affects memory even when the words are clearly audible. Heinrich et al. (2008), in a series of studies (using young adults) designed to determine why background babble interfered with memory consolidation, concluded that listeners were attending to the auditory stream to facilitate the extraction of the signal from the background babble, thereby diverting attentional resources from the task of memory consolidation. Thus the results of this study support the notion that the diversion of attentional resources to the task of parsing the auditory scene leaves fewer resources available for higher-level processing of the information in the signal.

In conclusion, these studies indicate a rather complex pattern of interdependency between peripheral and central processing of speech. Age-related changes in peripheral processing degrade the speech signal and impede auditory scene analysis. When the listening situation is simple (all sound sources emanating from the same location) and the processing demands light (e.g., speech recognition, processing a monologue), it is possible to show that once these age-related peripheral auditory declines are taken into account, age differences in performance become minimal or disappear altogether. Some evidence has also been presented that central attentional resources may be required, in some instances, either to parse the auditory scene or to make effective use of the information in the parsed scene. This draw on resources, in turn, may deplete the pool of resources available for the processing of language. Hence, if older adults have a smaller pool of available resources (e.g., a reduced working memory capacity), one would expect negative age differences to emerge as auditory scenes become complex. Alternatively, one might expect such age differences to emerge when the cognitive demands placed on the individual are increased (e.g., listening to a lecture while responding to e-mail). In all cases, it appears that, in the presence of age-related declines in lower-level auditory processing, there is, correspondingly, a greater need to deploy central processing resources to compensate for these lower-level auditory processing deficiencies. This, in turn, leaves fewer resources available for the processing of spoken language. Conversely, if there are age-related declines in higher-level processes, this could have an adverse effect on auditory scene analysis and signal extraction. The limited number of studies currently available suggests that higher-level, more central auditory processes (attention bands, channel capacity, etc.) may be spared in healthy older adults.
Moreover, age-related declines in the cognitive-level processes deemed essential for language comprehension do not appear to have any substantial effect on performance when the auditory scene is simple and the task requirements are not excessively demanding. Importantly, age differences begin to emerge as the auditory scene becomes more complex and/or the comprehension task becomes more demanding.

7.7 Auditory-Cognitive Interactions in Different Populations

The argument has been advanced that age differences in speech recognition and spoken language comprehension in a population of healthy and cognitively intact individuals are largely due to age-related changes in auditory processes. The available evidence also suggests that age differences are more likely to emerge as the auditory scene becomes more complex and/or the comprehension task becomes more demanding because listeners in such situations will have to depend more and more on the top-down deployment of attentional resources to compensate for adverse listening situations and/or subclinical declines in auditory processing. Moreover, because hearing does not decline at a uniform rate in all individuals, one would also expect larger variations in performance in older than in younger individuals, with some older individuals performing as well as their younger counterparts. Finally, one might expect both the extent of the decline in speech recognition and comprehension and the kind of top-down compensatory mechanisms engaged when listening becomes difficult to differ depending on the specific nature of the auditory or cognitive decline (Pichora-Fuller 2007). In general, regardless of age, any listener will shift from automatic processing of incoming speech information to more effortful or controlled processing when the listening conditions or task demands become sufficiently challenging. The key questions are when this breaking point occurs (see Rönnberg et al. 2008 for a possible model of this process) and what the consequences are of switching from automatic to controlled processing in older populations with hearing and/or cognitive losses.

Previous studies have shown that older adults with good audiograms require an SNR that is ∼3 dB higher than that needed by their younger counterparts to perform equivalently on tests of speech recognition and comprehension. Presumably, the additional 3 dB are needed to overcome the effects of subclinical age-related auditory declines. Older adults with relatively good hearing, however, are not representative of the general population because their linguistic (e.g., vocabulary) and cognitive (e.g., working memory spans) abilities as well as their general health status are likely to be much better than the average or median of older adults in the general population. Such high-performing older adults may excel in using compensatory processing to an extent that is beyond the capabilities of other older adults in the general population. Hence, speech recognition and comprehension are likely to be more drastically affected in older adults with hearing losses or with declining cognitive capacities. The remainder of the chapter considers how auditory-processing deficits associated with presbycusis may interact with cognition during spoken language comprehension in populations of older adults who have clinically significant levels of hearing loss and/or who may not be as cognitively competent as those older adults typically studied in the laboratory. Before doing so, however, it is important to note that competence at the two levels of processing (sensory and cognitive) is strongly linked in older adults.

In a seminal study, Lindenberger and Baltes (1994) reported a very strong link between sensory and cognitive functioning in a large-scale correlational study of adults aged 70 to 103 years (the Berlin Aging Study). Specifically, basic measures of hearing sensitivity and visual acuity accounted for 93.1% of the age-related variance in cognitive functioning (for a review, see Schneider and Pichora-Fuller 2000). This strong linkage, irrespective of reasons for its existence, is quite likely to affect the nature of the interaction between the auditory and cognitive systems in any individual with either auditory or cognitive problems. The Berlin group proposed four hypotheses concerning possible explanations for the powerful intersystem connections between perception and cognition in aging: (1) the declines are symptomatic of widespread neural degeneration (common cause hypothesis); (2) cognitive declines result in perceptual declines (cognitive load on perception hypothesis); (3) perceptual declines result in long-term cognitive declines (deprivation hypothesis); or (4) impoverished perceptual input results in compromised cognitive performance (information degradation hypothesis). A number of studies have been conducted to evaluate the contributions of each of these explanations to the linkage between perception and cognition in aging (for a review, see Gallacher 2005), with many of the laboratory studies described earlier providing evidence in support of the information degradation hypothesis.
Nevertheless, when dealing with an aging individual, it could be that the nature of the interaction between hearing and cognition in speech processing has been altered because of widespread neural degeneration, because cognitive declines have led to inefficient sensory processing (e.g., lack of top-down control over perceptual processes), because long-term sensory deprivation has led to cognitive deterioration, or because information degradation has compromised cognitive performance. Whatever the reason, it is likely that the pattern of auditory-cognitive interaction will differ depending on the specific subpopulation of older adults being studied.
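The phrase "age-related variance accounted for" by sensory measures can be made concrete with a small worked example. The sketch below is only an illustration under stated assumptions: the data are synthetic, the function names `pearson` and `shared_age_variance` are hypothetical, and the calculation is a generic commonality-style decomposition for two predictors, not necessarily the analysis used in the Berlin Aging Study. It asks how much of the variance in a cognitive score that is predictable from age is also predictable from a sensory measure.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def shared_age_variance(age, sensory, cognition):
    """Proportion of the age-related variance in cognition shared with the sensory measure."""
    r_ca = pearson(cognition, age)
    r_cs = pearson(cognition, sensory)
    r_as = pearson(age, sensory)
    # R^2 for cognition regressed on age and the sensory measure together
    # (standard two-predictor formula expressed in terms of correlations).
    r2_both = (r_ca**2 + r_cs**2 - 2 * r_ca * r_cs * r_as) / (1 - r_as**2)
    # Variance that age explains over and above the sensory measure.
    unique_age = r2_both - r_cs**2
    # Remaining age-related variance is shared with the sensory measure.
    return (r_ca**2 - unique_age) / r_ca**2


# Synthetic data (an assumption for illustration, NOT values from any study):
# age lowers the sensory score, and cognition depends on age only through
# the sensory score.
rng = random.Random(0)
age = [rng.uniform(70, 103) for _ in range(500)]
sensory = [-0.05 * (a - 70) + rng.gauss(0, 0.3) for a in age]
cognition = [s + rng.gauss(0, 0.5) for s in sensory]

share = shared_age_variance(age, sensory, cognition)
print(f"Share of age-related variance carried by the sensory measure: {share:.1%}")
```

Because the synthetic cognitive score depends on age only through the sensory score, nearly all of the age-related variance is shared with the sensory measure. A high share of this kind does not by itself discriminate among the four hypotheses listed above, since a common cause, deprivation, or information degradation could each produce it.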

In addition, “presbycusis” is a catchall term that refers to hearing loss in an older adult with no known specific cause (e.g., disease, trauma). Recently, researchers have begun to distinguish among a number of different subtypes of presbycusis (see Schmiedt, Chapter 2). Therefore, it is likely that the interplay between auditory and cognitive factors will vary with the particular subtype. Better diagnosis of various subtypes of presbycusis should reduce the apparent heterogeneity seen in older groups tested in the lab and in the clinic. Furthermore, if the nature of the auditory-cognitive interactions in spoken language comprehension can be determined for each of the different subtypes, rehabilitation strategies could be developed that are specifically tailored to the abilities and potentials of older adults. In the meantime, without being able to differentiate patterns that might be associated with different subtypes of presbycusis, one should proceed with caution in considering the literature regarding the interactions between auditory and cognitive processing in older adults with clinically significant hearing impairment, including research regarding the experiences of older adults using hearing aids or other assistive technology.

7.7.1 Interaction of Auditory and Cognitive Factors in Older Listeners with Hearing Loss

In studies where simple signals and tasks are used to measure speech understanding, audiometric thresholds explain more of the variance than is explained by cognitive variables (see Humes and Dubno, Chapter 8). However, as the listening situation and/or the task becomes more difficult, cognitive factors appear to account for more of the variance in performance of older adults (for a review, see Humes 2007). For example, in the study concerning source segregation described in detail in Section 7.4.2 (Humes et al. 2006), in which amplification was used to compensate for hearing loss, individual differences within the older group with hearing loss were associated with a measure of working memory (digit span) when the call sign was presented after the CRM sentences were played (a condition that placed a high load on working memory) but not when the call sign was presented before the sentences were played (low load on working memory). By way of contrast, performance on the CRM tasks did not correlate significantly with high-frequency hearing loss in either the low or the high working memory load condition. The latter result is not too surprising because amplification was used to compensate for high-frequency hearing losses. The former result, however, supports the notion that cognitive factors, such as working memory load, play an increasingly important role as the demands of the task increase.

7.7.2 Interventions

Over the last half century, those with sensorineural hearing loss, especially presbycusics, have been considered to present special problems as candidates for audiological rehabilitation (Davis and Silverman 1970, pp. 321-322). When treating older adults, clinicians must frequently reckon with poor health and the gradual failure of other faculties, particularly vision, which is quite helpful as a supplement to hearing. Not only do older people often become very dependent on others, but they may also find themselves unable to “keep up with the times.” In addition, many of them live alone or with children who “have their own lives to lead.” All these factors may lead to tensions, fears, and anxiety, which may discourage the use of new technologies (Czaja et al. 2006). Together, they might lead to less successful interventions and hearing aid outcomes in older than in younger adults.

Tremendous technological advances have resulted in present-day hearing aids being more effective than past technologies in overcoming the peripheral auditory deficits associated with sensorineural hearing loss. However, overcoming the central auditory-processing and cognitive-processing problems that affect older listeners is an ongoing topic of research concerning hearing aid design (Edwards 2007). Furthermore, in addition to needed improvements in hearing aid technology, older adults continue to require other rehabilitation approaches that are tailored to their specific needs and ecologies (e.g., Kiessling et al. 2003; Kricos 2006; Willott and Schacht, Chapter 10). Some of the issues that can modulate the effectiveness of interventions in older populations with hearing losses are considered in Section 7.7.2.

7.7.2.1 Hearing Aids

A typical finding is that there may be a decade or more delay between when a person becomes aware of hearing problems and when the first hearing aid is obtained (e.g., Hétu 1996). The stigma and negative stereotypes associated with wearing a hearing aid and being old, along with other psychological and social factors, seem to conspire to promote the denial of hearing loss, prolong the period before rehabilitation is sought, and foster the rejection and discontinuation of hearing aid use (Kochkin 1993; Garstecki and Erler 1998). Consequently, the average first-time hearing aid user has already reached retirement age (see Cruickshanks, Zhan, and Zhong, Chapter 9).

Not surprisingly, the rate of hearing aid use is even lower in those with dementia than it is in those with normal cognitive function (Cohen-Mansfield and Infeld 2006). Indeed, those with dementia who have no prior hearing aid experience are often considered to be unlikely candidates for a first-time hearing aid fitting. Those with dementia who learned to use a hearing aid before suffering cognitive loss need increasing support from others to maintain hearing aid use as their general health and cognitive abilities continue to deteriorate (Hoek et al. 1997; Jennings 2005). Therefore, it seems that early audiological intervention is worthwhile not only because of the immediate benefits to the older person with hearing loss but also because early intervention may alleviate some of the negative consequences of further age-related declines in auditory and cognitive abilities. The importance of early intervention for age-related hearing loss has led to the development of new rehabilitative strategies that, in addition to providing a hearing aid, also include a health promotion component and new training techniques that are tailored to fit the needs of the individual (Chisolm et al. 2003; Gates and Mills 2005; Kricos 2006).

A basic hope today is that audiological intervention and rehabilitation in older adults will not only improve auditory function and spoken language comprehension but also increase competence in performing activities of daily living (e.g., Brennan et al. 2005) and result in improvements in an individual’s overall quality of life (e.g., Stark and Hickson 2004; Chia et al. 2007; Chisolm et al. 2007). Indeed, on the basis of a large-scale longitudinal study of a comprehensive set of outcome measures administered to older adults who had been fit with hearing aids, Humes (2003) concluded that improvements could be demonstrated in three distinct categories: speech perception, hearing aid use, and satisfaction. There is also some evidence that sensory-focused interventions, such as hearing aid fitting (or cataract surgery to correct vision), result in significant posttreatment improvements on cognitive measures administered in the same modality as the sensory intervention (Mulrow et al. 1990; Van Boxtel et al. 2000; Valentijn et al. 2005). However, improvements after hearing aid fitting have not been seen on cognitive tests administered using visual stimuli (Tesch-Romer 1997; van Hooren et al. 2005). It is worth noting that within-modality improvements provide further evidence for the information degradation hypothesis.

Whether the strong correlation between auditory and cognitive functioning is explained in terms of the common-cause hypothesis, the information-degradation hypothesis, or a combination of these hypotheses, another emerging assumption motivating rehabilitative approaches is that compensation for sensory loss may slow or attenuate the symptoms of cognitive decline (Wahl and Heyl 2003). In turn, both perceptual and cognitive declines have been linked to emotional and psychosocial problems and even longevity (e.g., Appolonio et al. 1996; Cacciatore et al. 1999). Thus it seems that audiological rehabilitation may have positive consequences for cognitive and social function in older communicators (e.g., Mulrow et al. 1992).

Another perspective on the connection between hearing aid use and cognition that has gained recent attention among rehabilitative audiologists is that cognitive function may influence the suitability of candidates for particular types of hearing aid processing. In recent landmark studies, it has been shown that an individual’s cognitive capacity is significantly related to the degree to which he or she benefits from more demanding types of signal-processing algorithms used in hearing aids when more demanding speech and background signals are used to test performance (Gatehouse et al. 2003, 2006; Lunner 2003; Lunner and Sundewall-Thorén 2007). Importantly, traditional measures of hearing such as pure-tone average account for most of the variance in speech understanding when the signal-processing and background signals are less challenging, but cognitive measures account for most of the variance in performance when the listening situation is more challenging. The connection between cognitive ability and candidacy for different types of hearing aids has compelled audiologists to begin to consider which cognitive measures could be incorporated into audiological practice and how such measures could be used to either guide hearing aid fitting or to evaluate outcomes of rehabilitation (see Pichora-Fuller and Singh 2006).

7.7.2.2 The Need for Comprehensive Rehabilitation

Given the inevitable failure of technology to solve all communication problems experienced by an older adult with hearing loss, a more comprehensive approach to rehabilitation is required, especially for older adults who have central auditory and cognitive declines in addition to cochlear hearing loss. Moreover, communication difficulties due to hearing loss (cochlear or central) are often exacerbated by the attitudes and behavior of communication partners and by the context within which communication takes place (Pichora-Fuller and Robertson 1997). Therefore, comprehensive rehabilitation would necessarily include training and counseling components that would improve the communicative behavior of both the impaired listener and his or her communicative partners and the ecology or situations in which they communicate (Pichora-Fuller and Schow 2007). Specifically, intervention may be required to enable the person with hearing loss to cope with their reactions to communication difficulties, to develop self-efficacy, and to use contextual information and conversational strategies to circumvent or compensate for the poor quality of perceptual input. Intervention may also be required to enable the person’s communication partners to support them in a variety of ways including producing more easily understood speech and language. In addition, it may be important to optimize environmental factors such as reducing ambient noise and distraction or improving lighting. The effectiveness of these forms of intervention has been convincingly documented (e.g., Hickson and Worrall 2003; Kramer et al. 2005; Boothroyd 2007).

In general, the value of a more comprehensive approach to audiological rehabilitation has received renewed attention as the importance of the interactions between auditory and cognitive factors has become better understood (see Willott and Schacht, Chapter 10).

7.7.3 Interaction of Auditory and Cognitive Factors in Older Listeners with Cognitive Loss

7.7.3.1 Prevalence

As with hearing loss, reports on the prevalence of cognitive impairment vary depending on many research parameters. For Alzheimer’s disease, prevalence estimates range from 1% at age 65 years to 75% at age 90 years (Hy and Keller 2000). In a sample of over 1,800 adults living in the community who participated in the Canadian Study of Health and Aging (Ebly et al. 1994), the prevalence of dementia increased with age, with ∼15% being affected in the age group from 75 to 84 years, 23% in the group aged 85 to 89 years, 40% in those 90 to 94 years, and 58% in those 95 years and older. Probable or possible Alzheimer’s disease accounted for 75% of the cases of dementia, and vascular etiology alone accounted for another 13%. Of course, these figures would be much higher in the segment of the population who are living in residential care.

7.7.3.2 Correlations Between Sensory and Cognitive Impairment in Population Studies

Hearing impairment is associated with cognitive function even when the sample is deemed to be clinically normal on cognitive screening tests such as the Mini-Mental State Examination (MMSE) (Golding et al. 2006). Hence, there is an obvious practical need for taking sensory function into account when assessing cognitive function (e.g., Uhlmann et al. 1989b). Hearing impairment may even contribute to or accelerate clinically significant cognitive decline, and it has been suggested that sensory intervention could slow cognitive decline (Peters et al. 1988; Wahl and Heyl 2003). In a large-scale study, Uhlmann et al. (1989a) examined the relationship between hearing loss and cognitive decline in 100 people with and 100 people without Alzheimer’s type dementia. (The two groups were matched with respect to age, sex, and education.) These investigators found that cognitive function was correlated with hearing loss in both the demented and nondemented groups, with the prevalence of hearing loss being greater in the demented group. These results led the investigators to identify hearing loss (especially in the moderate-to-severe range) as a risk factor for Alzheimer’s type dementia. In a large-scale, multicenter longitudinal study of older women who were participating in research concerning osteoporotic fractures, combined vision and hearing loss was associated with the greatest odds for cognitive decline on the MMSE and functional decline in five everyday activities over a period of four years (Lin et al. 2004). In a smaller longitudinal study of patients with dementia of various etiologies, cognitive decline in Alzheimer’s patients with impaired hearing over a period of about one year was more rapid than in Alzheimer’s patients with relatively good hearing (Peters et al. 1988).

It is noteworthy that in the studies investigating the relationships between cognitive impairment and sensory impairment, the criterion for vision impairment is typically based on measures of corrected vision; however, the criterion for hearing impairment is usually based on unaided audiometric thresholds. This may be a consequence of a relatively small percentage of participants who are hearing aid users and the reluctance on the part of the researchers to obtain performance measures from them. In addition, only a handful of studies have included tests of central auditory processing. Those that have included such tests have indicated that people with probable Alzheimer’s disease performed worse on the tests of central auditory processing even though they were matched to a nondemented control group with respect to age, gender, and pure-tone average (Strouse et al. 1995). In a recent large-scale longitudinal study, speech tests, which were employed to measure central auditory dysfunction, proved to be predictive of the likelihood of developing Alzheimer’s disease even after the contribution of audiometric sensitivity had been taken into account (Gates et al. 2002). Further work is clearly needed to identify the nature of the linkage between hearing losses of various types and cognitive declines and to determine the effectiveness of audiological interventions in stemming the tide of cognitive decline in the population.

7.7.3.3 Intervention

The literature suggests that providing comprehensive audiological rehabilitation may be especially important for older adults with cognitive impairment. As cognitive impairment progresses, more responsibility for communication will shift from the listener with hearing loss to the communication partners, and it will become even more important to prevent communication problems by optimizing communication contexts rather than simply relying on technology to overcome auditory problems. Importantly, hearing aid use should not be precluded by dementia. Indeed, a reduced rate of decline in MMSE scores over a six-month period has been documented after intervention with hearing aids (Allen et al. 2003). Use of hearing aids has also been related to reduced caregiver-identified problem behaviors in patients with Alzheimer’s disease living in the community (Palmer et al. 1999).

7.8 Summary and Recommendations for Future Research

7.8.1 Summary

Spoken language comprehension requires the smooth and rapid functioning of an integrated system of perceptual and cognitive processes. Studies of high-functioning younger and older adults with good hearing have shown that the pattern of interdependency between lower-level and higher-level processing of speech can be quite complex. It also appears that any factor (such as background noise or competing speech) that degrades the speech signal will place a greater processing burden on the higher-level components involved in speech processing and most likely will engage working memory and other attentional processes as the listener attempts to process impoverished or distorted speech signals. Conversely, any cognitive-level demands (such as trying to compose an e-mail while listening over the telephone) will also lead to poorer spoken language comprehension because they divide attention between two tasks. The importance of these interactions has been demonstrated in highly selected experimental samples of younger and older adults. However, relatively little is known about the nature of these sensory-cognitive interactions in the general population where there is a much broader range of both auditory and cognitive abilities. Because of the strong linkages that exist between auditory and cognitive functioning, one might expect to find more complex and varied patterns of sensory-cognitive interactions in the general population than those found in high-functioning, good-hearing older adults.

7.8.2 Recommendations for Future Research

First, systematic laboratory research is needed to more fully understand the nature of the complex interactions that occur between the auditory and cognitive processes involved in spoken language comprehension. This could be done in two parts: (1) continuing to explore interactions in high-functioning older adults and (2) extending these laboratory studies to include specific impaired subpopulations having different degrees and types of both auditory and cognitive impairments. Special attention should be paid to individuals with different subtypes of presbycusis, different subtypes of cognitive impairment, or combinations of more precisely subtyped auditory and cognitive impairments.

Second, more research is needed on how to translate experimental findings into clinical practice and models of health service delivery. Experimental research has advanced our knowledge of interactions between auditory and cognitive factors and developed experimental tools to measure various aspects of auditory and cognitive processing. Research is needed on how to develop cost-effective, clinically feasible versions of these experimental techniques. Once this new generation of auditory and cognitive measures has been developed, the measures could be used to devise better and more individually tailored interventions.