Introduction

Listeners can improve their sensitivities to basic acoustic features with practice. Such perceptual learning has been documented on a number of auditory tasks for young, college-aged, listeners with normal hearing (YNH; for review see Wright and Zhang 2009a, b). However, far less is known about the influence of auditory training in the largest population seeking treatment in audiology clinics—older listeners with hearing impairment (OHI). While perceptual learning has been observed in OHI listeners especially on speech perception tasks (e.g., Walden et al. 1981; Sweetow and Sabes 2006; Burk and Humes 2007, 2008), how the learning in this population compares to that of YNH listeners given the same training has received little attention. Thus, it is not known whether perceptual learning itself differs between OHI and YNH listeners. Any differences in response to the same training regimen between these populations would indicate that the processing mediating perceptual improvement differs in some way between them, and that conclusions about perceptual learning that arise from studying one population do not necessarily apply to the other. Therefore, we trained OHI listeners on a basic auditory task (spectral modulation detection) and compared their learning patterns to those we previously reported for YNH listeners given the same training (Sabin et al. 2012b).

The clearest evidence that a sensory disorder itself can influence perceptual learning comes from recent reports in which the effects of the same multisession training regimen were compared between young adults with and without amblyopia. Amblyopia is characterized by an inability to perceive fine spatial details despite a physically normal eye (Attebo et al. 1998). In these investigations, participants practiced detecting sinusoidal gratings at a particular spatial frequency (Huang et al. 2008, 2009) or practiced discriminating between spatial frequencies (Astle et al. 2010). In comparison to individuals with normal vision, learning in amblyotes had a larger magnitude (Polat et al. 2004; Chen et al. 2008), a longer time course (Li et al. 2008), and generalized to a broader range of untrained spatial frequencies (Huang et al. 2008, 2009; Astle et al. 2010).

Here, we compared the response to training between OHI and YNH listeners on auditory spectral modulation detection, the auditory analog of the detection task that the amblyotes practiced. Listeners had to distinguish a noise with a flat spectral envelope from one with a spectral envelope that had a sinusoidal shape on a logarithmic frequency axis (Eddins and Bero 2007). We selected this task because it requires the perception of sinusoidal patterns of activity distributed across the sensory periphery, as does the visual detection of a sinusoidal grating practiced by the amblyotes, and because the learning patterns on this auditory task are similar to those on the corresponding visual task in young adults with normal sensory processing. For such observers, for both tasks, when learning occurs, it unfolds over multiple training sessions and does not generalize to untrained spectral modulation/spatial frequencies (auditory, Sabin et al. 2012a, b; visual: Sowden et al. 2002). Specifically, we trained a group of OHI listeners on a single spectral modulation detection condition, 720 trials/session for seven daily sessions. Before and after this training phase, we evaluated performance on the trained condition, on the detection of two untrained spectral modulation frequencies, and on a ripple reversal task (another measure of spectral modulation perception; Henry et al. 2005). A separate group of OHI controls only participated in the pre- and post-tests. We compared the results of the OHI listeners to those we previously reported for YNH listeners who practiced detecting spectral modulation using the same training regimen (Sabin et al. 2012b). There were qualitative differences in both learning and generalization between these two populations, indicating that the processing underlying perceptual improvement can differ between OHI and YNH listeners.

Method

Listeners

Sixteen participants (eight female) between 56 and 82 years of age were recruited from the Northwestern University Audiology Clinic. All listeners had bilateral sensorineural hearing loss ranging from moderate to profound (see Table 1). The hearing loss was symmetric between the two ears in all but one listener (listener C7). All listeners reported no previous experience with psychoacoustic tests, gave informed consent, and were financially compensated for their participation. All procedures were approved by the Institutional Review Board at Northwestern University.

TABLE 1 Listener audiograms

Overview

The participants were divided into a group of trained listeners (n = 8) and a separate group of controls (n = 8). The trained listeners participated in an initial screening, a pre-test, a training phase, and a post-test. During the screening, pure-tone air and bone-conduction thresholds were measured at frequencies from 250 to 8,000 Hz and the uncomfortable listening level was determined. In the pre-test, performance was evaluated on three spectral modulation detection conditions (1, 2, and 4 cyc/oct) as well as on a ripple-reversal task. Vowel and consonant identification in noise were also measured during this session in subsets of the trained listeners and controls, but there were methodological problems during the collection of those data, so they are not reported here. The training phase consisted of seven daily practice sessions (each approximately 1 h in length) in which thresholds were measured repeatedly on a single spectral modulation detection condition (2 cyc/oct). The post-test followed the training phase and was identical to the pre-test. The order of the conditions in the pre- and post-tests was randomized across listeners, but held constant between the pre- and post-tests for each individual listener. The pre-test and first day of training were conducted on consecutive days as were the final day of training and the post-test. The controls participated in all of the same stages, except for the training phase. Thus, any difference between the trained listeners and controls can be attributed to the training phase. Pre- and post-tests were separated by an average of 17.0 days for the trained listeners and 18.8 days for the controls.

Tasks and procedures

Thresholds were estimated using an adaptive three-alternative, forced-choice procedure with feedback. On a given trial, three intervals were presented in random order. One interval contained a signal stimulus and two contained a reference stimulus. In the spectral modulation detection task, the spectrum of the signal stimulus was sinusoidally modulated and the spectrum of the reference stimulus was flat (Fig. 1A). In the ripple-reversal task, the spectra of the signal and reference stimuli were both modulated at the same frequency with a rectified sinusoid, but the phase of the spectral modulation in the reference stimulus was shifted by 180 ° relative to that in the signal stimulus (Fig. 1B). Thus, the locations of the peaks and valleys were interchanged between the two stimuli. Listeners indicated which of the three intervals contained the signal stimulus by using a computer mouse to click on a visual display. Visual feedback indicating whether the response was correct or incorrect was provided after every trial throughout the experiment.

FIG. 1

Tasks. Schematic diagrams of the stimuli used in the two tested tasks. A In the spectral modulation detection task, listeners had to distinguish a noise with a sinusoidal spectral shape over a logarithmic frequency axis (solid line signal) from one with a flat spectrum (dotted line reference). The modulation depth was varied adaptively to determine the spectral modulation detection threshold. The noise ranged from 400 to 3,200 Hz. The spectral modulation frequency of the displayed stimulus is 2 cyc/oct (the trained condition). B In the ripple reversal task, listeners had to distinguish a stimulus with a full wave rectified sinusoidal shape on a logarithmic frequency axis (solid line signal) from one in which the peaks and valleys were interchanged with those of the signal stimulus (dotted line reference). The spectral modulation frequency was varied adaptively to determine the ripple reversal threshold. The stimulus ranged from 100 to 5,000 Hz. The spectral modulation frequency of the displayed stimulus is 5.6 cyc/oct.

Spectral modulation detection thresholds were estimated by adaptively adjusting the modulation depth of the signal (peak to valley difference in dB) using a three-down/one-up rule. The modulation depth was decreased after every three consecutive correct responses and was increased after each incorrect response. This procedure converged on the 79.4 % correct point on the psychometric function (Levitt 1971). The modulation depths on those trials on which the depth reversed from decreasing to increasing or vice versa, due to the listener’s history of responses, are referred to as reversals. The depth was initially 20 dB and was adjusted in steps of 2 dB until the third reversal; subsequent steps were 0.4 dB. In each block of 60 trials, the first three reversals were discarded and the modulation depths at the largest remaining even number of reversals (the useable reversals) were averaged and taken as the spectral modulation detection threshold. This procedure followed that used by Sabin et al. (2012b).
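The adaptive rule described above can be sketched in a few lines of Python. This is an illustration only, not the MATLAB software used in the experiment. The function names, the 0 dB floor on the modulation depth, the choice to drop the earliest usable reversal when their number is odd, and the simulated listener are all assumptions that the text does not specify.

```python
import math
import random


def run_block(respond, n_trials=60, start_db=20.0,
              big_step=2.0, small_step=0.4, n_discard=3):
    """Three-down/one-up track on modulation depth (dB).

    respond(depth_db) -> True if the (real or simulated) listener is correct.
    Returns (threshold_db or None, usable_reversals).
    """
    depth = start_db
    run = 0            # consecutive correct responses since the last change
    direction = 0      # -1 after a decrease, +1 after an increase, 0 at start
    reversals = []     # depths at which the track changed direction
    for _ in range(n_trials):
        if respond(depth):
            run += 1
            if run < 3:
                continue           # depth only changes after three in a row
            new_direction = -1
        else:
            new_direction = +1
        run = 0
        if direction and new_direction != direction:
            reversals.append(depth)
        direction = new_direction
        # 2-dB steps until the third reversal, 0.4-dB steps afterwards
        step = big_step if len(reversals) < 3 else small_step
        depth = max(depth + new_direction * step, 0.0)   # floor is an assumption
    usable = reversals[n_discard:]
    if len(usable) % 2:            # keep the largest even number of reversals
        usable = usable[1:]        # which one to drop is an assumption
    threshold = sum(usable) / len(usable) if usable else None
    return threshold, usable


def simulated_listener(true_threshold_db, slope=1.0, rng=random):
    """Hypothetical 3AFC listener: logistic detection with 1/3 guessing rate."""
    def respond(depth_db):
        p_detect = 1.0 / (1.0 + math.exp(-slope * (depth_db - true_threshold_db)))
        return rng.random() < 1.0 / 3.0 + (2.0 / 3.0) * p_detect
    return respond
```

A three-down/one-up track settles where the probability of three consecutive correct responses equals 0.5, that is, p³ = 0.5 and p ≈ 0.794, which is the convergence point cited from Levitt (1971).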

Ripple reversal thresholds were estimated by adaptively adjusting the spectral modulation frequency using a two-down/one-up rule. The spectral modulation frequency was increased after every two consecutive correct responses and was decreased after each incorrect response. This procedure converged on the 70.7 % correct point on the psychometric function (Levitt 1971). On any given trial, the signal and the reference had the same spectral modulation frequency. The frequency was initially 1.414 cyc/oct and was adjusted in one half octave steps throughout the adaptive track. All other aspects of the adaptive procedure were the same as for spectral modulation detection, except that the threshold was computed by taking the geometric, rather than the arithmetic, mean of the useable reversals. This procedure was based on that used by Henry et al. (2005).
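The two features that distinguish this track from the detection track, the multiplicative half-octave step and the geometric mean, can be illustrated with a short Python sketch. The function names are hypothetical, and the sketch covers only the update rule and the averaging, not the full block logic.

```python
import math

HALF_OCTAVE = 2.0 ** 0.5   # one half-octave step is a factor of sqrt(2)


def next_ripple_density(density, correct, run):
    """Two-down/one-up update on spectral modulation frequency (cyc/oct).

    `run` is the number of consecutive correct responses before this trial.
    Returns (new_density, new_run).
    """
    if correct:
        run += 1
        if run == 2:
            return density * HALF_OCTAVE, 0   # harder: more cycles per octave
        return density, run
    return density / HALF_OCTAVE, 0           # easier: fewer cycles per octave


def geometric_mean(values):
    """Mean computed on a log axis, matching the multiplicative step size."""
    return math.exp(sum(math.log(v) for v in values) / len(values))
```

Because each step multiplies or divides the spectral modulation frequency by a constant factor, the reversals are equally spaced on a logarithmic axis, which is why a geometric mean is the natural average here. The two-down/one-up rule converges where p² = 0.5, that is, p ≈ 0.707.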

For both tasks, blocks that contained fewer than seven useable reversals (5.3 % of all blocks) or single trials that were longer than 20 s (from the first observation interval through the response, 1.1 % of all blocks) were excluded from analysis. We also removed from analysis blocks for which the threshold estimate was more than 2 standard deviations higher than the mean of all the estimates from all the listeners on that condition (4.4 % of all blocks).
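As a rough illustration, these exclusion rules could be expressed as follows. The sketch assumes a sample (n − 1) standard deviation and a single, non-iterative outlier pass, neither of which is stated in the text; the function names are hypothetical.

```python
from statistics import mean, stdev


def keep_block(n_usable_reversals, longest_trial_s):
    """Block-level screen: enough usable reversals and no overly long trial."""
    return n_usable_reversals >= 7 and longest_trial_s <= 20.0


def drop_high_outliers(thresholds, n_sd=2.0):
    """Remove estimates more than n_sd standard deviations above the mean.

    `thresholds` holds all estimates from all listeners on one condition.
    The sample standard deviation is an assumption.
    """
    cutoff = mean(thresholds) + n_sd * stdev(thresholds)
    return [t for t in thresholds if t <= cutoff]
```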

The uncomfortable listening level was determined using a modified version of the contour test of loudness perception (Cox et al. 1997). On each trial, the listener was presented with a flat spectrum noise spanning 400–3,200 Hz—the reference used in the spectral modulation detection tasks. The listener had to select which of the seven loudness categories, ranging from “inaudible” to “uncomfortably loud,” best described that presentation. On the first trial, the noise was presented at 50 dB SPL. On each subsequent trial, the presentation level increased by a random amount ranging from 2 to 5 dB until the listener selected the “uncomfortably loud” category. Each listener repeated this procedure three times. The average of the three sound pressure levels that were characterized as uncomfortably loud was taken as the uncomfortable listening level.
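The ascending procedure can be summarized in a brief Python sketch. The callback `is_uncomfortable` stands in for the listener's category judgment and is hypothetical, as is the assumption that the level increments were drawn from a continuous uniform distribution.

```python
import random


def ascending_run(is_uncomfortable, start_db=50.0, rng=random):
    """Raise the level by a random 2-5 dB per trial until rated uncomfortable."""
    level = start_db
    while not is_uncomfortable(level):
        level += rng.uniform(2.0, 5.0)   # continuous increments are an assumption
    return level


def uncomfortable_listening_level(is_uncomfortable, n_runs=3, rng=random):
    """Average the terminating level over repeated ascending runs."""
    runs = [ascending_run(is_uncomfortable, rng=rng) for _ in range(n_runs)]
    return sum(runs) / n_runs
```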

Stimulus synthesis and presentation

For stimuli in the spectral modulation detection task (as in Eddins and Bero 2007), an 8,192-point buffer was first filled with a sinusoid computed on a log2 frequency axis with the appropriate spectral modulation frequency (1, 2, or 4 cyc/oct) and modulation depth (expressed in dB). The depth of the sinusoid was varied based on the listener’s performance in the adaptive procedure. The phase of the sinusoid was randomly drawn from a uniform distribution spanning 0 to 2π. The sinusoid was first multiplied by an equivalently sized buffer filled with randomly drawn numbers from a Gaussian distribution, and then multiplied by the magnitude response of a Butterworth filter (−32 dB/octave) with cutoff frequencies at 400 and 3,200 Hz. The resulting magnitude response was combined with a random phase spectrum and the real inverse Fourier transform was computed. Once in the time domain, the sound was shaped by a 100-ms amplitude envelope with 10-ms raised cosine on/off ramps. All stimuli were scaled to have the same RMS amplitude.
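The first step of this synthesis, the sinusoidal spectral envelope on a log2 frequency axis, can be illustrated with the following Python sketch. It covers only the envelope and not the noise multiplication, filtering, or inverse transform. The function names, the 400 Hz reference for the log axis, and the symmetric placement of the modulation about 0 dB are assumptions.

```python
import math


def envelope_db(freq_hz, density_cyc_per_oct, depth_db,
                phase_rad=0.0, f_ref_hz=400.0):
    """Spectral envelope (dB) at one audio frequency.

    depth_db is the peak-to-valley difference, so the sinusoid swings
    +/- depth_db / 2 about 0 dB (symmetric placement is an assumption).
    """
    octaves = math.log2(freq_hz / f_ref_hz)   # position on the log2 axis
    cycles = density_cyc_per_oct * octaves
    return 0.5 * depth_db * math.sin(2.0 * math.pi * cycles + phase_rad)


def envelope_gain(freq_hz, density_cyc_per_oct, depth_db, phase_rad=0.0):
    """Linear amplitude factor corresponding to envelope_db."""
    return 10.0 ** (envelope_db(freq_hz, density_cyc_per_oct,
                                depth_db, phase_rad) / 20.0)
```

For the trained condition (2 cyc/oct), the envelope repeats every half octave, so a 400–3,200 Hz carrier spans six full cycles of the modulation.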

For the ripple reversal task (as in Henry et al. 2005), each stimulus comprised 200 sinusoids that had random phases and were evenly spaced along a log2 frequency axis from 100 to 5,000 Hz. The amplitudes of the individual components were shaped by a full-wave rectified sinusoid that was defined across a log2 frequency axis (spectral frequency in cycles per octave) and had a modulation depth of 30 dB. The frequency of that sinusoid was varied based on the listener’s performance in the adaptive procedure. Its phase was chosen randomly for the signal stimulus and always had the opposite phase for the references. Note that others have shaped the spectrum using a sinusoid without rectification (e.g., Supin et al. 1999), which would lead to better thresholds than with a full-wave rectified sinusoid at the same spectral modulation frequency. To approximate the long-term speech spectrum, an overall spectral tilt of −6 dB/oct was applied to the frequencies above 750 Hz. Each stimulus was 500 ms in duration including 150-ms raised cosine on/off ramps, and was scaled to have the same RMS amplitude.
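Two of the fixed properties of this stimulus, the log-spaced component frequencies and the spectral tilt, can be written out as a short Python sketch. The function names are hypothetical, the inclusion of both endpoints (100 and 5,000 Hz) is an assumption, and the rectified envelope itself is deliberately left out.

```python
import math


def component_frequencies(n=200, f_lo=100.0, f_hi=5000.0):
    """n frequencies evenly spaced on a log2 axis, endpoints included."""
    ratio = f_hi / f_lo
    return [f_lo * ratio ** (i / (n - 1)) for i in range(n)]


def tilt_db(freq_hz, corner_hz=750.0, slope_db_per_oct=-6.0):
    """Spectral tilt applied only to components above the corner frequency."""
    if freq_hz <= corner_hz:
        return 0.0
    return slope_db_per_oct * math.log2(freq_hz / corner_hz)
```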

To help prevent the listeners from basing their decisions on the use of local intensity cues (comparing the intensity at a single audio frequency across intervals), we randomized the presentation level of each stimulus in both tasks. The presentation levels were randomly drawn from a uniform distribution spanning 15 dB, where the top of that range was 5 dB below the individual’s uncomfortable listening level. Each stimulus was synthesized before each trial.
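The level rove can be stated compactly. In the sketch below, the function name is hypothetical and the draw is assumed to be independent for every stimulus.

```python
import random


def roved_level_db(uncomfortable_level_db, rng=random):
    """Presentation level drawn uniformly from a 15-dB range.

    The top of the range sits 5 dB below the listener's uncomfortable
    listening level, so levels span (UCL - 20) to (UCL - 5) dB SPL.
    """
    top = uncomfortable_level_db - 5.0
    return rng.uniform(top - 15.0, top)
```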

All stimuli were presented using custom software written in MATLAB. For some listeners, the stimuli were presented through a 16-bit digital-to-analog converter (Tucker-Davis Technologies DD1) followed by an anti-aliasing filter with a 16-kHz cutoff frequency (TDT FT6-2), a programmable attenuator (TDT PA4), a sound mixer (TDT SM3), and a headphone driver (TDT HB6). For others, the stimuli were presented through a two-channel USB sound card (Edirol UA-25) and a headphone amplifier (Crown D75). The stimuli were always presented through both earpieces of Sennheiser HD265 circumaural headphones. Listeners were tested in a sound-attenuated room.

Analyses

Our primary assessment of the influence of training on spectral modulation detection focused on comparisons between improvements of the trained listeners and those of the controls. Toward this end, we first computed, separately for each condition, a two-group (trained vs. control) × two-time (pre- vs. post-test) analysis of variance (ANOVA) using time as a repeated measure. A significant time × group interaction would indicate an effect of the training. We confirmed that the datasets did not violate the assumptions of normality or homogeneity of variance. We also attempted to account for any differences in learning due to differences in starting threshold by computing, separately for each condition, a two-group (trained vs. control) analysis of covariance (ANCOVA) using the pre-test threshold as the covariate and the post-test threshold as the dependent variable. We confirmed that each dataset did not violate the assumption of homogeneity of regression and had a linear relationship between the dependent variable and the covariate. There were no cases in which the statistical conclusions differed between ANOVA and ANCOVA. This could be due to the fact that there were no significant group-level differences in starting performance on any condition (all p > 0.20).

To facilitate visual comparisons between groups, in most of the figures we plot post-test or training-day thresholds after adjusting for their relationship to pre-test thresholds. We made this adjustment, separately for each condition, using the procedure that underlies ANCOVA (Eq. 1) where pre-test threshold was the covariate (X) and post-test (or training session) threshold was the dependent variable (Y).

$$ \left( Y - Y' \right) = \left( Y - \mathrm{GM}_y \right) - B_{y \cdot x} \left( X - \mathrm{GM}_x \right) \qquad (1) $$

The adjusted post-test threshold for each individual (Y − Y′) was computed by taking that listener’s difference from the average post-test threshold (Y − GM_y) and subtracting from it that listener’s difference from the across-group (combined trained and control) average pre-test threshold (X − GM_x), weighted by the slope of the line relating pre- to post-test thresholds (B_y·x).
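A minimal Python sketch of this adjustment is given below. It assumes that the slope is the pooled within-group regression slope, as is conventional in ANCOVA; the text describes it only as the slope of the line relating pre- to post-test thresholds, so that choice, like the function names, is an assumption.

```python
from statistics import mean


def pooled_within_slope(groups):
    """Pooled within-group slope of post-test (Y) on pre-test (X).

    groups: one list of (pre, post) pairs per group (trained, control).
    """
    sxy = sxx = 0.0
    for pairs in groups:
        mx = mean(x for x, _ in pairs)
        my = mean(y for _, y in pairs)
        sxy += sum((x - mx) * (y - my) for x, y in pairs)
        sxx += sum((x - mx) ** 2 for x, _ in pairs)
    return sxy / sxx


def adjusted_scores(groups):
    """Apply (Y - GM_y) - B * (X - GM_x) to every listener, group by group."""
    everyone = [pair for pairs in groups for pair in pairs]
    gm_x = mean(x for x, _ in everyone)   # grand mean pre-test threshold
    gm_y = mean(y for _, y in everyone)   # grand mean post-test threshold
    b = pooled_within_slope(groups)
    return [[(y - gm_y) - b * (x - gm_x) for x, y in pairs]
            for pairs in groups]
```

With made-up data in which every trained listener ends 2 dB below a control who started at the same threshold, the adjusted scores collapse onto a single value per group, which is the sense in which the adjustment removes the dependence on starting performance.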

Results

Trained condition

The trained listeners improved on the trained spectral modulation detection condition over the course of multiple sessions, but the controls improved by a similar amount (adjusted values, Figs. 2A and 4A (left panel); raw data, Fig. 4C (left panel)). The magnitudes of improvement between the pre- and post-tests for the trained listeners (filled squares, T(7) = 2.9, p = 0.024) and controls (open circles, T(7) = 2.6, p = 0.036) did not differ according to either the interaction term of a two-group (trained vs. control) × two-session (pre vs. post) ANOVA using session as a repeated measure (F(1,14) = 0.1, p = 0.8) or the effect of group in an ANCOVA using pre-test performance as a covariate (F(1,13) = 0.8, p = 0.39). The improvements were, however, more consistent across the individual trained listeners (Fig. 2B, lines) than the controls (circles). The post-test thresholds of three (dashed circles) of the eight controls were more than 2 standard deviations higher than the mean post-test threshold of the trained listeners, while none of the thresholds of the individual trained listeners were this extreme. Removing the data of the three aberrant controls did not change any of the statistical conclusions arising from the between-group comparisons of pre- to post-test improvement on either the trained or untrained conditions. Neither age nor severity of hearing loss (pure tone average) was predictive of pre-test performance (all r² < 0.09, all p > 0.25) or of the amount of improvement (all r² < 0.009, all p > 0.72) on the trained condition, computed on the combined set of trained and control listeners.

FIG. 2

Learning curves. A The group average spectral modulation detection threshold on the trained condition (2 cyc/oct) as a function of testing session for the trained listeners (filled squares, n = 8) and controls (open circles, n = 8). Values are adjusted using individual differences in pre-test performance as a covariate (Eq. 1). The schematic illustrates the trained stimulus, with audiofrequency on the abscissa and magnitude on the ordinate (axes not shown; see Fig. 1). Error bars 1 standard error of the mean. B Individual performance for the trained listeners (lines) and controls (circles). Five of the controls improved between the pre- and post-tests (solid-line circles) while three others did not (dashed circles). Note that while the data are plotted on a linear scale, the regression analyses of performance across sessions were computed using log10 of the session number. Trained listeners improved gradually over the training phase, but controls improved by a comparable amount.

Despite the similar improvement by the trained listeners and controls, the trained listeners actually showed a clear learning curve over the training phase. On the group level, this training phase improvement is indicated by a significant negative slope of a single line fitted to the pre-test-adjusted daily mean thresholds for each trained listener over the log10 of the session number, computed across all sessions except the pre-test (r = −0.36, p = 0.004; note that the data are plotted on a linear rather than a log10 scale in Fig. 2). This gradual improvement is further confirmed on the individual level where nearly all (seven of eight) trained listeners improved over these sessions, as indicated by a significant negative slope of a regression line fitted to each adjusted threshold estimate over the log10 of the session number (all p < 0.048). To help determine when these improvements emerged, we examined how performance changed within training sessions by computing, for each trained listener on each training session, the means of the first three and of the last three threshold estimates (Fig. 3). We evaluated within-session performance using a two-time (first vs. last) × seven-session (all training days) ANCOVA using both time and session as repeated measures and the pre-test threshold as the covariate. While there was no main effect of time (F(1,42) = 0.4, p = 0.53), there was a main effect of session (F(6,42) = 5.8, p < 0.0001), and a time × session interaction (F(6,42) = 4.0, p = 0.003). This interaction arose because performance improved within the first training session (T(7) = 4.7, p = 0.002), but not within any other sessions (all p > 0.14). However, even when the entire first day of training is excluded from analysis, the slope of the remaining learning curve is still significantly negative (r = −0.31, p = 0.001). Therefore, the learning of the trained listeners comprised both a fast within-session improvement during the first training session and a more gradual improvement over the other sessions.
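The learning-curve analysis, a straight-line fit of threshold against the log10 of the session number, can be sketched as follows. The function name and the session numbering (1 for the first session entered into the fit) are assumptions, and the sketch returns only the slope and correlation coefficient, not the associated p value.

```python
import math
from statistics import mean


def log_session_fit(sessions, thresholds):
    """Least-squares fit of threshold on log10(session number).

    Returns (slope, r). A negative slope indicates improvement when
    lower thresholds mean better performance.
    """
    xs = [math.log10(s) for s in sessions]
    mx, my = mean(xs), mean(thresholds)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, thresholds))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in thresholds)
    return sxy / sxx, sxy / math.sqrt(sxx * syy)
```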

FIG. 3

Within-session performance. The group-average spectral modulation detection thresholds toward the beginning and end of each training session. The averages of the first three (left square) and last three (right square) adjusted threshold estimates from the same training session are connected by a line. Values are adjusted using individual differences in pre-test performance as a covariate. Error bars 1 standard error of the mean. Trained listeners showed a rapid within-session improvement during the first training session and a gradual improvement across the remaining sessions.

The apparently contradictory result of training-phase learning despite comparable improvement by the trained listeners and controls is at least partially clarified by evidence that the influence of the pre-test emerged over multiple days. To examine the time course of the influence of the pre-test, we compared the performance of the controls at the post-test to that of the trained listeners on the first training day, because at these points, the only prior experience for both groups came from the pre-test. Three performance differences between the two groups suggest that the influence of the pre-test took more than 1 day to fully emerge. First, the post-test thresholds of the subset of controls who learned (all controls except the three aberrant listeners described above) were lower than those of the trained listeners on the first training day (F(1,10) = 13.2, p = 0.005). Second, while the trained listeners showed no improvement between the pre-test and the beginning (first three estimates) of the first day of training (T(7) = −0.09, p = 0.92), the controls improved between the pre- and post-tests (see above). Third, the trained listeners improved during the first day of training (r = −0.37, p < 0.001), but the controls showed no improvement during the post-test. Specifically, the controls did not improve during the post-testing of the trained condition itself (r = −0.08, p = 0.67). They also did not improve over the course of the entire post-test, either when thresholds were expressed as raw values (r = −0.13, p = 0.21) or when they were z-score normalized on a condition-by-condition basis to the post-test performance of all controls on that condition (r = −0.14, p = 0.17). Therefore, at least a portion of the training-phase learning shown by the trained listeners may simply reflect this implied multiday influence of the pre-test.

Untrained conditions

While the contribution of the multiple-session practice to improvements on the trained condition is unclear, analyses of the untrained conditions suggest that the training phase did affect performance. Pre-test-adjusted values are plotted in Fig. 4A and raw data are plotted in Fig. 4C. For the detection of the untrained lower spectral modulation frequency (1 cyc/oct; Fig. 4A and C, middle column), the trained listeners improved significantly more than controls (ANOVA: F(1,12) = 4.9, p = 0.047; ANCOVA: F(1,11) = 6.0, p = 0.03). This between-group difference suggests that the training phase itself led to some improvement. However, this improvement was specific to the lower untrained spectral modulation frequency. The improvements of the trained listeners and controls were not distinguishable from each other for the detection of the higher untrained spectral modulation frequency (4 cyc/oct; adjusted scores: Fig. 4A, right column; raw scores: Fig. 4C, right panel) or for the ripple reversal task (adjusted scores: Fig. 4B; raw scores: Fig. 4D; ANOVA: all p > 0.43; ANCOVA: all p > 0.41). Supporting the likelihood that these listeners are representative of the larger population of listeners with hearing loss, the average pre-training ripple reversal threshold here (1.50 cyc/oct) was similar to that reported in a prior examination of 36 listeners with hearing loss (1.77 cyc/oct; Henry et al. 2005).

FIG. 4

Pre- and post-tests. A Adjusted post-test spectral modulation detection thresholds for each tested condition for the trained listeners (squares) and controls (circles). Results are shown as group averages (filled symbols) and for each individual listener (unfilled symbols). As in Fig. 2, dashed circles represent the individual controls who did not improve on the 2 cyc/oct condition. Values are adjusted using individual differences in pre-test performance as a covariate (Eq. 1). For these conditions, lower values indicate better performance. Performance was evaluated for the trained (left column) and untrained (middle and right columns) spectral modulation frequencies. A schematic of the tested stimulus is displayed above each column. The dashed lines represent the mean pre-training performance across groups, and the horizontal boxes represent the 95 % confidence interval of the adjusted post-test performance of the controls. Error bars 1 standard error of the mean. Asterisks indicate a significant (p < 0.05) difference between trained listeners and controls based on an analysis of covariance using pre-test performance as a covariate. B As in A, but for the ripple-reversal condition. For this condition, higher values indicate better performance. C Raw (unadjusted) pre-test performance (x-axis) and pre-test minus post-test improvement (y-axis) is plotted for each listener in the trained (filled squares) and control (open circles) groups. Each panel is a different spectral modulation frequency. D As in C, but for the ripple reversal condition. Trained listeners learned more than controls on the untrained 1 cyc/oct condition. Both groups improved by similar amounts on the trained 2 cyc/oct and the untrained 4 cyc/oct conditions, and neither group improved on the untrained task.

FIG. 5

Comparison to young normal hearing listeners. A–D Adjusted learning curves, plotted as in Fig. 2. Data are shown for A–C three sets of trained YNH listeners (filled symbols) and corresponding YNH controls (open symbols; data from Sabin et al. 2012b) as well as for D the older listeners with hearing impairment (OHI; replotted from Fig. 2). E–H Summary of training-induced learning across all conditions. Each bar indicates the eta-squared effect size of a two-group (trained vs. control) ANCOVA, with pre-test performance as the covariate, computed separately for each condition in each experiment (indicated by x-axis labels). Larger bars indicate larger differences between trained and control listeners, and asterisks indicate significant differences (p < 0.05). The key result is that the pattern of improvement in the OHI listeners differs from those of each of the groups of YNH listeners.

Finally, participation in the pre-test itself led to improvements that were restricted to a subset of conditions, suggesting that these improvements did not arise solely from procedural learning. Of the four conditions, exposure to the pre-test induced learning in only two. Trained listeners and controls improved significantly and equivalently between the pre- and post-tests on the trained condition (2 cyc/oct; see above) and on the detection of the higher untrained spectral modulation frequency (4 cyc/oct; Fig. 4A, right column, ANOVA main effect of session: F(1,12) = 6.0, p = 0.03). The similar improvements shown by the two groups for these two conditions suggest that, for both groups, this learning arose from participation in the pre-test, because prior to the post-test this was the only experience shared by the two groups. In contrast, for the detection of the lower untrained spectral modulation frequency (1 cyc/oct), the controls did not improve (T(6) = 0.05, p = 0.96), even though improvements on this condition were possible, as demonstrated by the trained listeners, and neither group improved on the ripple-reversal task (T test, all p > 0.2; ANOVA, F(1,13) = 3.1, p = 0.1). Thus, the influence of the pre-test, as observed at the post-test, was restricted to detection of the higher spectral modulation frequencies (2 and 4 cyc/oct). This specificity implies that the improvements arising from participation in the pre-test do not simply reflect the learning of procedural aspects that were common to all of the conditions or even of those that were common only to the detection task.

Discussion

The goal of this investigation was to assess the extent to which perceptual learning differs between older listeners with hearing impairment and younger, college-aged, listeners with normal hearing. Toward this end, we trained OHI listeners on a spectral modulation detection task so that we could compare their results to those we previously reported for YNH listeners who participated in the same training regimen. Below we identify four differences in the response to perceptual training between OHI and YNH listeners, present a potential account for these differences, consider the potential distinct influences of age and hearing loss on these differences, and discuss the practical implications of these results.

Differences in the response to training between OHI and YNH listeners

We compared the learning patterns on spectral modulation detection for the current OHI listeners to those we previously reported for YNH listeners who participated in the same training regimen (summarized in Fig. 5; Sabin et al. 2012b). Like the OHI listeners, the YNH listeners practiced a single spectral modulation detection condition ∼1 h/day for 7 days and were tested on multiple conditions before and after the training phase. Different groups of YNH listeners practiced either the same condition that was trained here (Fig. 5C and G; 2 cyc/oct spectral modulation spanning 400–3,200 Hz) or one of two conditions with a lower spectral–modulation frequency and a lower carrier–frequency range (0.5 cyc/oct, 200–1,600 Hz (Fig. 5A and E), or 1 cyc/oct 200–1,600 Hz (Fig. 5B and F)). The influence of this spectral modulation detection training differed between the OHI and YNH listeners in four notable respects.

First, when given multiple-session training on the same condition (2 cyc/oct), the OHI listeners improved over the training phase (Fig. 5D), while the YNH listeners did not (Fig. 5C). Thus, it appears that the YNH, but not the OHI, listeners were already at optimal performance on this condition prior to training. However, YNH listeners did show training-phase improvement on a different condition (0.5 cyc/oct; Fig. 5A). This improvement was similar in time course and magnitude to that of the OHI listeners, demonstrating that gradual learning on spectral modulation detection is not unique to OHI listeners.

Second, for those cases in which the trained listeners improved gradually (OHI trained at 2 cyc/oct and YNH trained at 0.5 cyc/oct), how that learning compared to the improvement of controls differed between the YNH and OHI populations. The YNH-trained listeners learned more on their trained condition than did matched controls who received no training (Fig. 5A), indicating that the training itself led to learning. In contrast, even though the trained OHI listeners improved over the course of the training phase, the OHI controls improved by a similar amount, suggesting that the learning in this population was driven primarily by the pre-test. Thus, the training-phase learning in these two populations appears to have arisen from different components of the same experimental protocol.

Third, the improvement of the controls appeared to emerge over a longer time course in OHI than in YNH listeners. The controls did not participate in the training phase, so any improvements between the pre- and post-tests in this group can be attributed to an influence of the pre-test. We inferred the time course of this influence by comparing the performance of the controls at the post-test (∼18 days after the pre-test) to that of the trained group on the first training session (1 day after the pre-test), because in both cases, the only previous experience came from the pre-test. By these analyses, for the YNH listeners, the pre-test either led to modest improvements that emerged in full by the following day and were then maintained (0.5 cyc/oct; Fig. 5A) or lost (1 cyc/oct; Fig. 5B) by the post-test, or did not lead to improvement at all (2 cyc/oct; Fig. 5C; Sabin et al. 2012b). On the other hand, for the OHI listeners, participation in the pre-test led to marked improvement on the trained condition (2 cyc/oct; Fig. 5D) that seemed to take multiple sessions to fully emerge (see Results). Thus, the two populations also differed in how exposure to the pre-test influenced their performance.

Fourth, the influence of practice generalized to untrained spectral modulation frequencies in the OHI, but not the YNH, listeners. Across all three groups of YNH-trained listeners there were no cases in which trained listeners learned more than controls on an untrained spectral modulation frequency (Fig. 5E–G); training-induced learning was specific to the trained condition. In contrast, the OHI-trained listeners learned more than controls on an untrained lower spectral modulation frequency (1 cyc/oct; Fig. 5H, left bar), despite not having done so on their trained condition (2 cyc/oct; Fig. 5H, middle bar). Thus, it appears that the influence of performing the training task generalized more broadly in OHI than in YNH listeners. It is also possible that simple exposure to the stimuli during the training sessions, rather than performance of the task itself, led to this generalization.

A potential account for these learning differences

One possible unifying account for these population differences is that learning consolidated more slowly, and that training modified an aspect of processing that had broader tuning to spectral modulation frequency, in OHI than YNH listeners. It is generally held that after training has ceased there is an extended period of consolidation during which the new and fragile learning becomes more stable (for review, see McGaugh 2000; Walker and Stickgold 2004). This process of consolidation is also thought to underlie improvements that emerge well after the end of a practice session (e.g., Karni and Sagi 1993; Korman et al. 2003; Ortiz and Wright 2010). In this context, the evidence that the influence of the pre-test took multiple days to fully emerge in OHI listeners, but only a single day in YNH listeners, suggests that consolidation proceeded over a slower time course in OHI than in YNH listeners. A difference in consolidation rate could account for the population differences in how the learning in the trained listeners compared to that in the controls. In YNH listeners, the gradual learning on the trained condition (in listeners trained at 0.5 cyc/oct) could have arisen from multiple short (∼1 day) consolidation periods initiated by each training session, leading to a gradual learning curve and ultimately to greater learning than controls. However, because the OHI-trained listeners showed comparable learning to controls, the gradual learning curve on the trained condition could, at an extreme, simply reflect the multiple session consolidation initiated by the pre-test, rather than an influence of the training phase. A more moderate option is that the learning curve of the OHI trained listeners reflects the influences of both the pre-test and the training phase. Evidence that the training itself did have some effect on this population comes from the observations that the trained listeners generalized to an untrained condition (1 cyc/oct, Fig. 4) and showed more consistent learning than controls (Fig. 2B). Even so, these influences of the training could have consolidated over a slower than normal time course. Similar arguments hold if, instead of consolidating the same modifications at different rates, training in the two populations modified different aspects of processing that normally consolidate at different rates. Different consolidation time courses have been reported for learning different aspects of the same condition in YNH listeners (Ortiz and Wright 2010).

When the influence of practice on one condition affects performance on another (generalization), it is taken as evidence that training modified processing that is engaged by both conditions (e.g., Ahissar and Hochstein 2004; Wright et al. 2010). The observed population difference in the breadth of generalization therefore implies that whatever was modified by training had broader tuning in OHI than YNH listeners. One possibility is that the training-induced modification involved the neural circuitry underlying the filtering of stimuli into their component spectral modulation frequencies (e.g., Chi et al. 2005). Behavioral support for filters tuned to particular spectral modulation frequencies in YNH listeners comes from spectral-modulation-frequency-specific effects of masking (Saoji and Eddins 2007), adaptation (Eddins and Harwell 2002), and training (Sabin et al. 2012b), while physiological support comes from observations of individual neurons in auditory cortex with bandpass tuning to a particular spectral modulation frequency (e.g., Schreiner and Calhoun 1994; Shamma et al. 1995; Versnel et al. 1995; Kowalski et al. 1996). The training could have modified these filters directly or modified a more central decision maker that receives input from these filters (for discussions of these views, see Dosher and Lu 1998; Ahissar and Hochstein 2004). Either way, this possibility requires that these filters had broader tuning in OHI than YNH listeners, at least at the post-test.

Potential contributions of age and hearing loss

Both age and hearing status differed between the YNH and OHI populations, so both factors could have contributed to the observed population differences. While the present data do not allow us to separate the influences of these factors, comparisons to the limited literature regarding their isolated influences on the response to perceptual training raise the possibility that age primarily affected performance on the trained condition, while hearing loss affected generalization. For the trained condition, the performance pattern differences between the populations described here were quite similar to those between groups of older and younger listeners with normal hearing (ONH and YNH) who received multisession training on auditory temporal interval discrimination: Preliminary analyses of those data suggest that ONH listeners who started poorly improved over the training phase, but that matched controls improved as much, whereas YNH listeners given the same training learned more than controls (Marrone et al. 2010). Thus, age alone could have led to the current population differences on the trained condition. Age has also been shown to have an effect on perceptual learning even when the starting performance is similar across groups (adolescents vs young adults; younger vs older adults: Marrone et al. 2010; adolescents vs young adults: Huyck and Wright 2011). However, age may not affect perceptual learning in all cases. Younger and older participants with different starting performance showed qualitatively similar learning on two basic visual tasks when given the same training regimen (Ball and Sekuler 1986; Andersen et al. 2010). We note, though, that this learning has not been compared between groups differing in age but not starting performance on the same condition.

For the untrained conditions, the current population differences in the breadth of generalization resemble those described in the introduction between young adults with and without amblyopia—a visual sensory disorder. After practicing the detection of a sinusoidal grating at a single spatial frequency (the visual analog of the current trained task) or the discrimination of that frequency, young adults with amblyopia generalized their learning to a broader range of untrained spatial frequencies than did adults with normal vision (Huang et al. 2008, 2009; Astle et al. 2010). This pattern matches the present broader generalization across spectral modulation frequency in OHI than YNH listeners. Thus, hearing loss alone could have led to the current population differences in generalization.

Practical implications

Overall, this investigation adds to others (e.g., Peelle and Wingfield 2005; Huyck and Wright 2011) indicating that conclusions about perceptual learning are not easily applied across populations. Given the same training regimen, different populations can learn differently. It appears then that the development of clinical training regimens should focus primarily on the target population. For OHI listeners, the long time course of the pre-test-induced improvement observed here indicates that brief periods of practice on some conditions can have an extended influence in this population. This time course could be harnessed in a clinical training regimen by providing training sessions that are widely spaced in time. Doing so could decrease the overall time and effort spent by both the clinician and the patient. Further, the improvement of the OHI listeners on an untrained condition following multisession practice on a single trained condition demonstrates that generalization, the goal of most clinical training regimens, can occur in this population. Finally, this work indicates more specifically that the detection of spectral modulation in OHI listeners can improve with practice. Training in this population therefore might aid real-world skills that are limited by spectral modulation detection such as sound localization in the vertical plane (Macpherson and Middlebrooks 2003; Qian and Eddins 2008) or speech perception (Litvak et al. 2007).