INTRODUCTION

Loss of cochlear hair cells leads to sensorineural hearing loss (SNHL). Bypassing the lost hair cells, cochlear implants (CIs) electrically stimulate the spiral ganglion cells (SGCs), whose axons form the auditory nerve. Proper functioning of CIs thus depends on the presence of a healthy and sufficiently large population of SGCs that is able to transmit the encoded auditory information to the brainstem. However, animal studies have shown that SGCs become smaller and degenerate as a consequence of hair cell loss (Ylikoski et al. 1974; Spoendlin 1975; Webster and Webster 1981; Versnel et al. 2007). This degeneration has been associated with discontinued neurotrophic support from the organ of Corti (Ernfors et al. 1995; Fritzsch et al. 1999; Zilberstein et al. 2012). Administration of exogenous neurotrophic factors has prevented SGC degeneration in multiple animal models of SNHL (for a review, see Ramekers et al. 2012); SGCs are larger and more numerous after neurotrophic treatment than in untreated controls (Ernfors et al. 1996; Richardson et al. 2005; Shepherd et al. 2005; Glueckert et al. 2008; Agterberg et al. 2008, 2009; Leake et al. 2011; van Loon et al. 2013).

In human CI users, a relation between the number of surviving SGCs and CI performance has not been found (Fayad and Linthicum 2006; Xu et al. 2012), suggesting that criteria for a well-functioning interface for CIs comprise more than mere numerical quantification. So far, attempts to assess the functional consequence of SGC degeneration have mainly involved characterization of electrophysiological activity at the single-unit or population level in animal models of SNHL. Findings on changes in excitation thresholds after deafening have been inconsistent. In single-neuron studies, excitation thresholds decrease about as often as they increase (reviewed in Sly et al. 2007). Excitation thresholds for electrically evoked auditory brainstem responses (eABRs) have been reported to become elevated after deafening (Shepherd and Javel 1997; Maruyama et al. 2008; Fransson et al. 2010), but also to remain unchanged (Agterberg et al. 2009). For electrically evoked compound action potentials (eCAPs), excitation thresholds have been found to be higher after deafening (Stypulkowski and van den Honert 1984; Prado-Guitierrez et al. 2006). Being consistently smaller after deafening, response amplitudes seem to reflect function better than threshold (Hall 1990; Shepherd and Javel 1997; Agterberg et al. 2009). However, the amplitude largely depends on factors such as the distance between the stimulation electrode and the excitable tissue, the impedance between the two, and, likewise, on the distance and impedance between the excited neural tissue and the recording electrode (Grill et al. 2009).

To overcome these unpredictable and often confounding factors, Prado-Guitierrez et al. (2006) employed a method in which the difference in excitability between two stimulus pulse shapes is used as a quantitative functional measure of SGC health. The magnitude of the change in excitability brought about by either a prolonged inter-phase gap (IPG) or phase duration (PD) correlated with the number of surviving SGCs. Increasing PD leads to higher excitability because the amount of charge that is displaced toward or away from the electrode increases proportionally when phase area (PD × current level) is increased (Shepherd and Javel 1999). The beneficial effect of an IPG increase was recognized soon after the introduction of biphasic stimulation since it partly abolishes the increase in threshold that arises when adding a hyperpolarizing phase to a depolarizing monophasic pulse (van den Honert and Mortimer 1979). Besides the effect on electrophysiological thresholds, an IPG increase in cochlear implant stimulation results in an increase in loudness perception (McKay and Henshall 2003) as well as a decrease in behavioral thresholds (Carlyon et al. 2005).

The aim of this study was (1) to provide a comprehensive characterization of eCAP input–output properties in normal-hearing and deafened guinea pigs, (2) to evaluate the effect of changing both IPG and PD on these properties, and (3) to correlate the magnitude of this effect to quantified histological measures of neurodegeneration. Finding electrophysiological correlates for the amount of degeneration of the SGC population may provide an important tool for clinical diagnostics and prognosis in cochlear implant recipients.

METHODS

Animals and Experimental Design

Eighteen female albino guinea pigs (Dunkin Hartley; 250–350 g) were obtained from Harlan Laboratories (Horst, The Netherlands) and kept under standard laboratory conditions (food and water ad libitum; lights on between 7:00 am and 7:00 pm; temperature 21 °C; humidity 60 %).

One group served as normal-hearing controls (NH; N = 6), while the remaining guinea pigs were exposed to ototoxic treatment either two (2WD; N = 6) or six (6WD; N = 6) weeks prior to acute electrophysiological measurements. Immediately after these experiments, the animals were sacrificed and their cochleas processed for histological analysis. All surgical and experimental procedures were approved by the Animal Care and Use Committee of Utrecht University (DEC 2010.I.08.103).

Deafening Procedure

Animals were anesthetized by intramuscular injection of dexmedetomidine (Dexdomitor®; 0.25 mg/kg) and ketamine (Narketan®; 40 mg/kg), and acoustically evoked auditory brainstem responses (aABRs) were recorded to measure hearing thresholds. Ototoxic treatment involved subcutaneous injection of kanamycin (Sigma-Aldrich, St. Louis, MO, USA; 400 mg/kg) and subsequent infusion of furosemide (Centrafarm, Etten-Leur, The Netherlands; 100 mg/kg) into the external jugular vein, which has been shown to eliminate the great majority of both inner and outer hair cells (Versnel et al. 2007; West et al. 1973). Postoperatively, the animals were injected subcutaneously with the non-ototoxic antibiotic enrofloxacin (Baytril®, 5 mg/kg) and the anti-inflammatory drug carprofen (Rimadyl®, 5 mg/kg).

Implantation

For the acute experiments, anesthesia was initiated with Hypnorm® (Vetapharma; 0.5 ml/kg i.m.), followed by a gas mixture of 2 % isoflurane evaporated in O2 and N2O (1:2), delivered by a mouth cap. The skull was exposed and four transcranial screws were placed for ABR and eCAP recordings (one 1 cm posterior to bregma, one 2 cm anterior to bregma, and two 1 cm bilateral to bregma). The animals were subsequently tracheostomized and artificially ventilated (Amsterdam infant ventilator mk3, Hoekloos, Schiedam, The Netherlands) with a gas mixture of O2 and N2O (1:2) and 1–1.5 % isoflurane (45–50 cycles/min respiration rate, 1.8–2.0 kPa) throughout the experiment. The animals were placed on a 37 °C heating pad. Their heart rate, O2 consumption, and expiratory CO2 were monitored, and, based on those readings, anesthetic settings were adjusted if needed. Every 2 h, 2 ml of lactated Ringer’s solution was injected subcutaneously to avoid dehydration. Via a retro-auricular approach, the bulla was opened to expose the cochlea. A 0.5-mm cochleostomy was then drilled in the basal turn, within 1 mm from the round window, through which the electrode array was inserted into the scala tympani (see below).

Auditory Brainstem Responses

Prior to ototoxic treatment, aABRs were recorded using subcutaneously positioned needle electrodes (active electrode behind the right pinna; reference electrode on the skull, rostral to the brain; ground electrode in hind limb). During the acute experiments, the posterior transcranial screw was used as active electrode, the anterior screw as reference electrode, and a subcutaneous needle electrode in the hind limb as ground. Broadband acoustic clicks (20 μs monophasic rectangular pulses; inter-stimulus interval 99 ms) were synthesized and attenuated using a TDT3 system [modules RP2, PA5 (2×) and HB7; Tucker-Davis Technologies, Alachua, FL, USA], and presented in free field using a Blaupunkt speaker (PCxb352; 4 Ω; 30 W). The responses were differentially amplified using a Princeton Applied Research (Oak Ridge, TN, USA) 5113 pre-amplifier (amplification ×5,000; band pass filter 0.1–10 kHz), digitized by the TDT3 system (100 kHz sampling rate, 24-bit sigma-delta converter), and stored on a PC for off-line analysis. Hearing thresholds were obtained by starting at approximately 110 dB peak equivalent SPL and decreasing the sound level in steps of 10 dB until the response had disappeared. The threshold was then defined as the interpolated sound level at which the aABR N1–P2 peak was 0.3 μV.
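
As an illustration of this threshold criterion, the following Python sketch interpolates the sound level at which the N1–P2 amplitude reaches 0.3 μV. It is a minimal sketch and not the analysis code used in this study: the function name, the choice of linear interpolation between the two bracketing sound levels, and any amplitudes passed to it are assumptions made for illustration only.

```python
def interpolate_threshold(levels_db, amplitudes_uv, criterion_uv=0.3):
    """Sound level (dB) at which the N1-P2 amplitude crosses criterion_uv.

    levels_db must be in descending order, as in a descending threshold
    search. Linear interpolation between the two bracketing levels is an
    assumption of this sketch. Returns None if the criterion is never crossed.
    """
    pairs = list(zip(levels_db, amplitudes_uv))
    for (upper_level, upper_amp), (lower_level, lower_amp) in zip(pairs, pairs[1:]):
        # The criterion lies between two consecutive measurements.
        if upper_amp >= criterion_uv > lower_amp:
            fraction = (criterion_uv - lower_amp) / (upper_amp - lower_amp)
            return lower_level + fraction * (upper_level - lower_level)
    return None
```

With responses measured in 10-dB steps, the returned value lies between the lowest level with a response of at least 0.3 μV and the next lower level.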

Compound Action Potentials

Both stimulation and recording of eCAPs were done in monopolar configuration, using the transcranial screw to the right of bregma as return electrode for stimulation and the one to the left of bregma as reference electrode for recording. eCAPs were recorded with a custom-made four-contact electrode array connected to a MED-EL PULSARCI 100 cochlear implant (MED-EL GmbH, Innsbruck, Austria). The electrode array was inserted up to 4 mm into the cochleostomy, so that all four contacts, evenly spaced over the first 3 mm of the array, were in the scala tympani. For all data presented here, the most apical contact (5 mm from the round window) was used for stimulation and the most basal one (2 mm from the round window) for recording. Electrode impedance ranged from 3.4 to 4.6 kΩ (4.1 kΩ on average), so that in all cases, given the maximum implant output voltage of 6.1 V in our set-up, a 1,200-μA current was within the implant’s current range. The implant was controlled by a PC via a Research Interface Box 2 (RIB2; Department of Ion Physics and Applied Physics, University of Innsbruck, Innsbruck, Austria) and a National Instruments data acquisition card (PCI-6533; National Instruments, Austin, TX, USA). eCAP data were sampled using adaptive sigma-delta modulation at 1.2 MHz (Neustetter et al. 2012). Stimulation and recording paradigms were created in MATLAB (version 7.11.0; Mathworks, Natick, MA, USA).

Biphasic current pulses were presented with alternating polarity to reduce stimulation artifact (see Fig. 1A), and the responses to 50 pairs of these stimuli were averaged. We chose alternating stimulation rather than a masker-probe paradigm since it is thought that by doing so, a larger portion of the SGC population is recruited (Westen et al. 2011). The resulting waveforms did not seem to contain significant remaining stimulation artifact (e.g., see sub-threshold traces in Fig. 1B, C). In addition, the absence of significant stimulation artifact can be deduced from the absence of linear increases in the N1–P2 amplitude sub-threshold or supra-saturation in the input–output curves. Therefore, further techniques such as scaled template subtraction (Miller et al. 1998) were not applied. The phase duration (PD) and the inter-phase gap (IPG) of the current pulses were varied (PD = 20–50 μs in steps of 5 μs; IPG = 2.1, 10, 20, or 30 μs; see Fig. 1A). Each of the 7 × 4 = 28 possible pulse shapes was presented at 10 current levels, which were adjusted for PD, such that for each PD the charge ranged from 2.4 to 24 nC: e.g., pulses with 25 μs PD had a maximum stimulation level of 960 current units (cu; 1 cu ≈ 1 μA; linearity of this relationship was verified via control measurements), while 50 μs/phase pulses had a maximum stimulation level of 480 cu. The response recorded at the lowest, sub-threshold stimulation level was subtracted from the other recordings in order to reduce measurement onset artifacts. Resulting eCAPs are exemplified in Figure 1B, C. The eCAP amplitude was defined as the difference in voltage between the N1 and the P2 peak (first positive peak following N1), and was plotted against current level. Input–output functions thus obtained were fitted with a Boltzmann sigmoid, using the Levenberg–Marquardt algorithm (Marquardt 1963; R² was between 0.95 and 1, with an average of 0.996):

FIG. 1

A Schematic of the electric pulse stimuli used in this study. To reduce stimulation artifact, alternating polarity stimuli were applied. The biphasic pulses were applied with IPGs of 2.1, 10, 20, and 30 μs, for each PD (20 to 50 μs in 5-μs steps). B, C Examples of eCAPs at various current levels recorded in a normal-hearing animal (B) and in a 6-weeks-deaf animal (C). Stimulation conditions were identical in both cases (PD 50 μs; IPG 30 μs) and stimulation amplitude is given for each trace. D Input–output function derived from eCAP N1–P2 amplitude in (B); the solid line represents the fitted Boltzmann curve (see Eq. 1). The dashed lines indicate the eCAP characteristics that were derived from the Boltzmann equation (see “METHODS” section). IPG inter-phase gap, PD phase duration.

$$ {V}_{\mathrm{eCAP}}=A+\frac{B}{1+{e}^{-\frac{I-C}{D}}}, $$
(1)

where V is amplitude in microvolts (μV), I is stimulus current in microamperes (μA), and A–D are fitting parameters. The parameters derived from the fitting are the outcome variables (see also Fig. 1D) that were used to assess differences between pulse shapes and between animal groups, and that were related to SGC packing density and perikaryal area. These outcome variables are maximum N1–P2 amplitude (defined by B), current level to achieve 50 % of the maximum N1–P2 amplitude (defined by C), slope at C (defined by B/4D), threshold (defined by C − 2D, the current level at which the tangent to the curve at C crosses A), and dynamic range (defined by 4D). The N1 peak latency, averaged over the three highest current levels, was analyzed in addition to these input–output characteristics. Note that the slope B/4D equals the ratio of the amplitude B and the dynamic range 4D; therefore, these three variables are interrelated. The noise level (defined by A) has not been used as an outcome variable. All analyses were performed using custom-made software in MATLAB.
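
For reference, the definitions of slope, threshold, and dynamic range follow directly from Eq. 1. The short derivation below is a sketch that uses only the symbols of Eq. 1, plus the tangent V_tan, which is introduced here for illustration. Differentiating Eq. 1 with respect to I gives

$$ \frac{d{V}_{\mathrm{eCAP}}}{dI}=\frac{B}{D}\cdot \frac{{e}^{-\frac{I-C}{D}}}{{\left(1+{e}^{-\frac{I-C}{D}}\right)}^2},\kern2em {\left.\frac{d{V}_{\mathrm{eCAP}}}{dI}\right|}_{I=C}=\frac{B}{4D}, $$

so that the tangent to the curve at I = C, where the amplitude equals A + B/2, is

$$ {V}_{\mathrm{tan}}(I)=A+\frac{B}{2}+\frac{B}{4D}\left(I-C\right). $$

This tangent equals A at I = C − 2D (threshold) and equals A + B at I = C + 2D, so that the current range it spans between noise level and saturation is 4D (dynamic range).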

Histology

After completion of the experiment, all animals were sacrificed and their right cochleas were harvested. Processing and analysis were performed as previously described by van Loon et al. (2013). Intra-labyrinthine cochlear fixation was achieved with a fixative of 3 % glutaraldehyde, 2 % formaldehyde, 1 % acrolein, and 2.5 % DMSO in a 0.08 M sodium cacodylate buffer. The cochleas were then decalcified, post-fixed, and embedded in Spurr’s low-viscosity resin. After dividing the cochleas into two halves along a standardized midmodiolar plane, they were re-embedded in fresh resin. From each cochlea, five semi-thin (1 μm) sections were cut at 30-μm intervals (to ensure that consecutive sections could not contain the same cell), which were subsequently stained with 1 % methylene blue and 1 % azure B in 1 % borax.

Using a Leica DC300F digital camera mounted on a Leica DMRA light microscope and a 40× oil immersion lens (Leica Microsystems GmbH, Wetzlar, Germany), pictures of each transection of Rosenthal’s canal (two basal, two middle, and three apical transections) as well as the organ of Corti were obtained. Within each transection of Rosenthal’s canal, the number of type-I SGCs was counted and packing density was averaged across all five sections. In one of the five sections, the average perikaryal area was determined for SGCs with a visible nucleus using ImageJ (version 1.42q; National Institutes of Health, Bethesda, MD, USA). Subsequently, both packing density and perikaryal area were averaged over cochlear turns. Since the likelihood of detecting an individual SGC depends on its perikaryal size, the average packing density was corrected for perikaryal size as previously described (Coggeshall and Lekan 1996; van Loon et al. 2013). Hair cell counts included hair cells with a nucleus, a cilia bundle, or a clear cochlear hair-cell-like outline.

Statistical Analysis

Differences in aABR threshold shifts between the 2WD and 6WD groups were evaluated with an independent samples t test. Changes in SGC packing density and perikaryal area after ototoxic treatment were assessed with a one-way analysis of variance (ANOVA) and post hoc Bonferroni. Pearson’s correlation coefficient was used to assess the correlation between maximum eCAP amplitude and SGC packing density. The effects of PD and IPG on eCAP characteristics (amplitude, slope, threshold, dynamic range, level50%, and latency) were assessed with two-way repeated measures ANOVA; Greenhouse–Geisser correction was applied when the assumption of sphericity was violated. Correlations between differences in eCAP characteristics in response to PD or IPG changes and histological findings (SGC packing density and SGC perikaryal area) were evaluated using multiple linear regression. All statistical tests were performed with SPSS 20.0 for Windows (IBM, Armonk, NY, USA), except for R² values in Figures 8 and 9, which were obtained using the Statistics toolbox in MATLAB.

RESULTS

Deafening

aABR N1–P2 peak threshold shifts after deafening ranged from 59 to 82 dB with the exception of one animal in the 2WD group (27 dB shift). This animal was therefore excluded from analysis with respect to group averages, but not from the multiple regression analyses. Average threshold shifts per group were 61 dB for 2WD and 74 dB for 6WD, and did not significantly differ [t (9) = 2.0; P = 0.079].

Ototoxic treatment resulted in significant reduction of both inner (IHCs) and outer hair cells (OHCs). One hundred percent of IHCs were present in the NH group, and this was reduced to 45 % and 13 % in the 2WD and 6WD groups, respectively. Average OHC presence decreased from 96 % in NH to 37 % in 2WD and 18 % in 6WD animals. OHC loss was more pronounced in the basal half of the cochlea (17 % remaining in 2WD and 10 % remaining in 6WD) than in the apical half (48 % and 25 %, respectively). For IHCs, this gradient was smaller: 38 % for 2WD and 8 % for 6WD in the basal half, and 50 % and 17 %, respectively, in the apical half.

For each group, a typical transection of Rosenthal’s canal is depicted in Figure 2. The packing density of SGCs was lower in the 2-weeks-deafened group (Fig. 2B) than in the normal-hearing group (Fig. 2A), and lower still in the 6-weeks-deaf group (Fig. 2C). Figure 3A shows the quantification of numerical SGC packing density averaged across animals. SGC packing density significantly decreased after deafening [one-way ANOVA: F (2,14) = 86.3; P < 0.001]. Post hoc Bonferroni showed that SGC packing density was significantly different between groups (P < 0.001 for each pair). In normal-hearing animals, packing density was similar across turns. The decrease in packing density after deafening was most pronounced in the basal turn and smallest in the apical turn. The average perikaryal area (shown in Fig. 3B) also significantly decreased after deafening [one-way ANOVA: F (2,14) = 14.1, P < 0.001; post hoc Bonferroni: NH–2WD, P = 0.013; NH–6WD, P < 0.001; 2WD–6WD, P = 0.41]. For normal-hearing animals, the perikaryal area was similar across turns. The decrease in cell size after deafening was slightly larger toward the apex. In Figure 3C, SGC perikaryal area is plotted against packing density for each animal separately. Although SGC degeneration was clearly accompanied by a decrease in cell size, a large variation can be observed within experimental groups. Note that the 2WD animal with a small threshold shift has a normal packing density but an SGC size between normal and the average size found in 2WD animals.
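
The degrees of freedom reported in this section can be checked against the group sizes (N = 6 for NH and 6WD, N = 5 for 2WD after exclusion of one animal from the group averages). The worked check below assumes a pooled-variance t test and a standard one-way ANOVA with k = 3 groups; these details are inferred from the reported values rather than stated explicitly, and the symbols n, k, and N are introduced here for illustration:

$$ {\mathrm{df}}_t={n}_{2\mathrm{WD}}+{n}_{6\mathrm{WD}}-2=5+6-2=9, $$

$$ {\mathrm{df}}_{\mathrm{between}}=k-1=2,\kern2em {\mathrm{df}}_{\mathrm{within}}=N-k=17-3=14. $$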

FIG. 2

Examples of transections of Rosenthal’s canal (lower middle turn) containing the SGCs in a normal-hearing cochlea (A), a cochlea 2 weeks after deafening (B), and a cochlea 6 weeks after deafening (C).

FIG. 3

Quantification of histological SGC analysis. Group averages are shown for SGC numerical packing density (A) and SGC perikaryal area (B), both first averaged across all cochlear turns per animal. C SGC perikaryal area as function of SGC packing density for each animal separately. The blue-crossed red symbol indicates the 2WD animal excluded from the group averages. NH normal-hearing, 2WD 2 weeks deaf, 6WD 6 weeks deaf, SGC spiral ganglion cell. *P < 0.05, ***P < 0.001; N = 6 for NH and 6WD, N = 5 for 2WD; error bars represent SEM.

eCAP Waveforms

The examples of eCAP recordings in Figure 1B (normal hearing) and 1C (6-weeks deaf) show that the N1–P2 peak amplitude is smaller for the deafened animal, while the eCAP threshold is roughly similar (see first obvious response at the same stimulation level). The P2 peak was not as pronounced in deafened animals as it was in normal-hearing animals, but its approximate location was evident in all instances.

Effect of Pulse Shape Alterations on eCAP Characteristics

In Figure 4, N1–P2 peak amplitudes for a single (normal-hearing) animal are plotted as a function of current (left column) and charge (right column) for different pulse shapes. The beneficial effect of increasing IPG on stimulation efficacy (i.e., larger amplitude and lower threshold) becomes apparent when comparing Figure 4A and B (2.1 μs IPG) with 4C and 4D (30 μs IPG), respectively. Increasing PD also enhanced stimulation efficacy (see 4A, C), but only partly so when eCAP amplitude is plotted against charge (i.e., phase area) instead of current level (i.e., phase amplitude; compare 4A and 4C with 4B and 4D, respectively). Specifically, the effect of PD is abolished for 30 μs IPG (4D), but not for 2.1 μs IPG (4B). In other words, for 30 μs IPG the pulse shape (the combination of phase duration and phase amplitude) does not influence the response amplitude as long as the phase area (charge level) is the same, whereas for 2.1 μs IPG the pulse shape is relevant, since short phases elicit lower maximum eCAP amplitudes.
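
The conversion between phase amplitude (current) and phase area (charge) that underlies the two columns of Figure 4 can be sketched as follows. This is an illustrative Python fragment, not the software used in this study; the function and variable names are hypothetical, and 1 cu is taken as 1 μA, as stated in the “METHODS” section.

```python
def charge_nc(current_ua, pd_us):
    """Charge per phase in nC: current (uA) times phase duration (us) is pC."""
    return current_ua * pd_us / 1000.0


def current_ua_for_charge(charge, pd_us):
    """Current in uA needed to deliver `charge` (nC) within one phase of pd_us."""
    return charge * 1000.0 / pd_us


# Phase durations used in the study: 20 to 50 us in steps of 5 us.
PHASE_DURATIONS_US = list(range(20, 55, 5))

# Maximum current per PD for the stated maximum charge of 24 nC per phase.
max_current = {pd: current_ua_for_charge(24.0, pd) for pd in PHASE_DURATIONS_US}
```

For a 25-μs PD, the 24-nC maximum corresponds to 960 μA, and for a 50-μs PD to 480 μA, in agreement with the stimulation levels given in the “METHODS” section.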

FIG. 4

Input–output curves from a single normal-hearing animal. Separate lines represent responses to pulses with different PD. The upper two plots show the same data for pulses with a 2.1-μs IPG, but plotted against stimulation phase amplitude (μA) in (A) and against phase area (nC) in (B). The lower two plots show the same input–output curves for 30-μs IPG pulses either plotted as function of phase amplitude (μA) (C) or as function of phase area (nC) (D). IPG inter-phase gap, PD phase duration.

For each animal, the input–output functions as shown in Figure 4 are fitted according to Eq. 1 (see “METHODS” section). The five input–output characteristics that are derived from these fits (amplitude, slope, threshold, dynamic range, and level50%) and N1 latency are averaged per group and plotted as function of IPG in Figure 5 for 20-μs (first and third column) and 50-μs PD (second and fourth column). In Figure 6, these eCAP characteristics are plotted as function of PD, for either 2.1 μs IPG (first and third column) or 30 μs IPG (second and fourth column). F and P values resulting from two-way repeated measures ANOVA, with IPG and PD as within factors and group as between factor, are given in Table 1.
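
How the five input–output characteristics follow from the four fitted parameters of Eq. 1 can be sketched in a few lines of Python. This is a minimal illustration and not the custom MATLAB software used in this study; the function names and dictionary keys are hypothetical, and the fitting step itself (Levenberg–Marquardt) is not reproduced here.

```python
import math


def boltzmann(current_ua, a, b, c, d):
    """Eq. 1: eCAP amplitude (uV) as a function of stimulus current (uA)."""
    return a + b / (1.0 + math.exp(-(current_ua - c) / d))


def ecap_characteristics(a, b, c, d):
    """Input-output characteristics derived from the fitting parameters A-D.

    The noise level A is accepted for completeness but, as in the text,
    is not returned as an outcome variable.
    """
    return {
        "amplitude": b,            # maximum N1-P2 amplitude
        "level50": c,              # current at 50 % of the maximum amplitude
        "slope": b / (4.0 * d),    # slope of the curve at I = C
        "threshold": c - 2.0 * d,  # where the tangent at C crosses A
        "dynamic_range": 4.0 * d,  # current range spanned by that tangent
    }
```

Because the slope equals the amplitude divided by the dynamic range, any two of these three values determine the third, which is the interrelation noted in the “METHODS” section.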

FIG. 5

eCAP characteristics averaged per group and plotted as function of IPG. PD is 20 μs in the first and third column, and 50 μs in the second and fourth. A, B amplitude; C, D slope; E, F threshold; G, H dynamic range; I, J level50%; K, L latency. IPG inter-phase gap, PD phase duration, NH normal-hearing, 2WD 2 weeks deaf, 6WD 6 weeks deaf. For each plot N = 6 for NH and 6WD, N = 5 for 2WD; error bars represent SEM.

FIG. 6

eCAP characteristics averaged per group and plotted as function of PD. IPG is 2.1 μs in the first and third column, and 30 μs in the second and fourth. A, B amplitude; C, D slope; E, F threshold; G, H dynamic range; I, J level50%; K, L latency. IPG inter-phase gap, PD phase duration, NH normal-hearing, 2WD 2 weeks deaf, 6WD 6 weeks deaf. For each plot N = 6 for NH and 6WD, N = 5 for 2WD; error bars represent SEM.

TABLE 1 Results from the two-way repeated measures ANOVAs, applied separately for each eCAP characteristic

The maximum eCAP amplitude (Figs. 5A, B and 6A, B) and the slope (related to the amplitude; Figs. 5C, D and 6C, D) decreased markedly with time after deafening. Accordingly, for all 28 pulse shapes there was a significant correlation between maximum amplitude and SGC packing density (R² between 0.49 and 0.71; P < 0.01 for all pulse shapes), and between slope and SGC packing density (R² between 0.48 and 0.67; P < 0.01 for all pulse shapes). For all groups, the maximum amplitude increased similarly with increasing IPG (Fig. 5A, B) and with increasing PD (Fig. 6A, B). For the slope, dependence on IPG varied among groups, as was indicated by a significant interaction. The slope increased markedly with IPG for the NH group, while this increase was less pronounced for the 2WD and smaller still for the 6WD group (Fig. 5C, D). Changes in slope with increasing PD did not differ among groups (Fig. 6C, D). Note that the overall charge was kept constant as described in the previous section. eCAP threshold decreased with an increase in both IPG (Fig. 5E, F) and PD (Fig. 6E), while neither overall threshold levels nor dependence on IPG or PD were significantly different between groups. The dynamic range was similar for NH and 2WD animals, but noticeably wider for the 6WD animals (Figs. 5G, H and 6G, H). In addition, dependence on IPG was significantly different among groups: whereas it hardly changed with IPG for the NH and 2WD groups, there was an apparent increase in dynamic range for the 6WD group (Fig. 5G, H). Level50% (the current level at which the eCAP amplitude is half of the maximum amplitude) decreased significantly with increasing IPG (Fig. 5I, J), but did not significantly vary with PD (Fig. 6I, J). Although there was no significant overall difference in level50% among groups, a significant interaction effect with IPG indicates that dependence on IPG varied: the decrease with increasing IPG is smaller for 6WD than for the other groups. The effect of IPG is consistently larger on threshold than on level50% (Fig. 5E, F and I, J); this means that the amount of charge needed to equalize eCAP amplitude is larger at threshold level than at the half-maximum eCAP amplitude. Latency of the N1 peak was shorter for the 6-weeks-deaf animals than for the other two groups (Figs. 5K, L and 6K, L). Dependence on PD was similar for all groups (Fig. 6K, L), while N1 latency increased with increasing IPG for both deafened groups, but not for the NH group (Fig. 5K, L). Finally, for each of the six eCAP characteristics we found a significant interaction of IPG and PD (Table 1), a feature that is well illustrated by comparison of Figure 4B and D.

In summary, virtually all eCAP characteristics were dependent on both IPG and PD, and most of these characteristics changed after deafening. The change with PD was similar for all groups, whereas with increasing IPG the slope, dynamic range, level50%, and latency of the eCAP were affected differently after deafening (demonstrated by significant interactions between IPG and group; see Table 1). In other words, it was possible to distinguish between groups by varying IPG, but not by varying PD.

Correlation Between Changes in Response to Pulse Shape Alterations and Histology

The differences in eCAP characteristics (amplitude, slope, threshold, dynamic range, level50%, and latency) that emerge when comparing responses to pulses with the shortest (2.1 μs) and the longest IPG (30 μs), as well as responses to the shortest (20 μs) and longest PD (50 μs), were subsequently put in a regression model with SGC packing density and perikaryal area (averaged over all cochlear turns) as predictor variables. Since interaction effects with group were exclusively found for changes in IPG (see Table 1), our emphasis is on changes in eCAP characteristics in response to IPG variation. Upon PD variation, SGC packing density and perikaryal area only significantly predicted a change in slope (R² = 0.50, P = 0.008), a change in threshold (R² = 0.40, P = 0.021) and a change in dynamic range (R² = 0.43, P = 0.015), all three only for an IPG of 2.1 μs.

In Table 2, the R² and P values derived from multiple regression analyses are given for changes in eCAP characteristics in response to increased IPG for the shortest and longest PD. The observation that either histological measure can show a significant partial contribution demonstrates that both play a considerable functional role. The corresponding scatter plots are shown in Figure 7; changes in response to IPG increase are plotted against SGC packing density (on the horizontal axis), while the marker size is proportional to the perikaryal area.
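
The type of model behind these R² values, a linear regression with two predictors, can be sketched in plain Python as follows. This is an illustrative sketch only: the analyses in this study were run in SPSS and MATLAB, the function names are hypothetical, significance testing is omitted, and no data from this study are reproduced. Here x1 and x2 stand for SGC packing density and perikaryal area per animal, and y for the change in an eCAP characteristic with increased IPG.

```python
def solve_linear_system(matrix, rhs):
    """Solve a small square system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    solution = [0.0] * n
    for row in range(n - 1, -1, -1):
        known = sum(aug[row][k] * solution[k] for k in range(row + 1, n))
        solution[row] = (aug[row][n] - known) / aug[row][row]
    return solution


def fit_two_predictor_regression(x1, x2, y):
    """Least-squares fit of y = b0 + b1*x1 + b2*x2; returns ([b0, b1, b2], R^2).

    Assumes that y is not constant and that x1 and x2 are not collinear.
    """
    rows = [[1.0, u, v] for u, v in zip(x1, x2)]
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(3)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((t - f) ** 2 for t, f in zip(y, fitted))
    ss_tot = sum((t - mean_y) ** 2 for t in y)
    return beta, 1.0 - ss_res / ss_tot
```

The returned R² is the fraction of variation in y explained jointly by both predictors, which corresponds to the quantity reported per eCAP characteristic.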

TABLE 2 Multiple regression analyses
FIG. 7

Differences in eCAP characteristics when IPG is increased from 2.1 to 30 μs plotted as function of numerical SGC packing density for 20-μs (first and third column) and 50-μs PD (second and fourth column). Symbols represent individual animals; symbol size is proportional to the average SGC perikaryal area for that animal. Both histological predictors are averages from the basal, middle, and apical turns. Solid lines are regression lines with SGC packing density as single predictor, shown here for purely visual purposes. R² values represent the amount of variation that can be explained by both SGC packing density and perikaryal area. The blue-crossed red symbol indicates the 2WD animal excluded from the group averages. PD phase duration, SGC spiral ganglion cell; *P < 0.05; **P < 0.01; ***P < 0.001. N = 18 for (A)–(B) and (E)–(L); N = 17 for (C) and (D).

The four eCAP measures that showed a significant interaction effect with group (slope, dynamic range, level50%, and latency) also showed significant correlations with SGC packing density and perikaryal area (Fig. 7C, D, G, I–L). In addition, the regression model significantly predicts threshold shifts for 20-μs PD (Fig. 7E). The absence of a significant interaction effect between IPG and group for eCAP amplitude (see Table 1) is paralleled by a lack of predictive value for SGC packing density or perikaryal area with respect to amplitude change with increasing IPG (Table 2; Fig. 7A, B).

The latency shift after IPG increase significantly correlated with SGC degeneration for both PDs (Table 2, Fig. 7K, L). Interestingly, the latency shift we found in the 6WD animals (22 μs on average) roughly matched the increase in IPG duration (30.0–2.1 = 27.9 μs). Moreover, the extrapolated trend line for packing density predicts a latency shift of approximately 30 μs for both PDs for a hypothetical packing density of 0, while the latency shift is close to zero for high packing densities.

In Figure 8, R² values for the same multiple linear regression analyses are plotted for all seven PDs. Whereas the predictors in the aforementioned regression models are histological measures obtained by averaging across all cochlear turns, in this figure the ability to predict changes in eCAP characteristics with increased IPG was assessed per cochlear turn (basal, middle, and apical). In general, the results did not appear to vary substantially among turns, suggesting that both histological measures correlated considerably over cochlear turns, and/or that neural elements from all turns contributed to the eCAP response. (Note that the electrode array was placed up to 5 mm from the round window, so that both the stimulation and the recording electrode were in the basal turn.)

FIG. 8

R² values derived from multiple linear regression analyses with SGC packing density and perikaryal area as predictor variables (determined separately for each cochlear turn) and change with IPG increase as dependent variable. Filled circles represent statistically significant R² values (P < 0.05). PD phase duration.

The results in Figure 8 are similar to those shown in Figure 7. The predictive power of SGC packing density and perikaryal area for the change in maximum amplitude with increased IPG remained mostly non-significant (Fig. 8A; filled circles indicate statistically significant values). For slope, the regression model significantly predicted changes with IPG, although the R² decreases slightly toward longer PD (Fig. 8B). The same gradient was observed for dynamic range (Fig. 8D) and, more distinctly, for threshold (Fig. 8C). The R² values for the change in level50% (Fig. 8E) provide a different perspective on the data shown in Figure 7I, J: the predictive power of the regression model is statistically significant only for the extreme PDs. For all PDs, at least 50 % of the variation in the latency shift brought about by increased IPG is explained by the two histological predictors (Fig. 8F).

Figure 9 shows that the high predictability for latency as shown in Figure 8F is restricted to the three highest charge levels used, as the R² rapidly drops to insignificant levels below 20 nC. This effect, illustrated here for 20, 35, and 50 μs, is present for all PDs applied.

FIG. 9

R² values derived from multiple linear regression analyses with SGC packing density and perikaryal area as predictor variables and change of latency with IPG increase as dependent variable. R² values are statistically significant (filled circles; P < 0.05) only for the higher stimulation levels. PD, phase duration.

DISCUSSION

In this study, we have evaluated the effect of PD and IPG on eCAP characteristics in normal-hearing and deafened guinea pigs. The effect of increasing PD in most cases roughly resembled increasing IPG: the maximum amplitude, slope, dynamic range, and latency increased, while threshold and level50% decreased. A strong interaction effect for all six characteristics furthermore suggests that both act via a similar mechanism since, in general, the effect of increasing IPG was smaller when PD was long and vice versa—the effect of increasing PD was smaller when IPG was long (Figs. 5 and 6). Probably, the beneficial effect of an increase in either lies in the temporal separation of action potential initiation and hyperpolarization (van den Honert and Mortimer 1979). However, in contrast to PD, increasing the IPG has a differential effect on various eCAP characteristics with respect to experimental groups, and the effect size often correlates well with histological measures for SGC degeneration. Based on these findings, variation of IPG, and not PD, in a clinical setting possibly makes a suitable diagnostic tool for the evaluation of the auditory nerve in CI users.

The Effect of Deafness on eCAPs

The gradual loss of SGCs after ototoxic treatment was reflected by a gradual decrease in eCAP amplitude (Figs. 5A, B and 6A, B). With a decrease in amplitude, the slope of the input–output function became lower as well, fairly proportionally. We did not observe significant changes in excitation threshold, although there was a small but consistent increase for the 2WD group followed by a decrease for the 6WD group for virtually all applied pulse shapes (Figs. 5E, F and 6E, F). A similar pattern was observed for the N1 latency, in which case group differences were statistically significant. Together, these results are in agreement with the theory that the site of excitation shifts to a more central location, and that loss of tissue may lead to lower impedances of the electrode–tissue interface (Stypulkowski and van den Honert 1984; Frijns et al. 1996; Shepherd and Javel 1997). The stimulation level at which the half-maximum eCAP amplitude is reached (level50%) did not significantly change (Figs. 5I, J and 6I, J), although again there was a small but consistent increase after deafening. A wider dynamic range in deafened animals probably reflects the more heterogeneous disposition of the SGC population with respect to excitation thresholds, given that the dynamic range of single neurons appears to remain unaltered (Sly et al. 2007).

Phase Duration

With increasing PD, firing thresholds for single fibers (Shepherd and Javel 1999), eABRs (Miller et al. 1995) and eCAPs (Prado-Guitierrez et al. 2006; this study), and psychophysical detection level thresholds (Miller et al. 1999) become lower when current level is kept constant. The main reason for this is that the amount of displaced charge (PD times current level) increases, which is why we chose to express stimulation level in terms of charge rather than current. Accordingly, for 30-μs IPG the threshold expressed in charge was equal for all PDs (Figs. 4C, D and 6F). For short-IPG pulses, the threshold became lower for longer PDs (Figs. 4A, B and 6E), suggesting more efficient recruitment with longer pulses. This latter finding seems to contradict single-fiber data from Shepherd and Javel (1999) and psychophysical data from Moon et al. (1993), who both found an increase in threshold with PD for PDs longer than 50 μs. Two different mechanisms with opposite effects but operating at different PD ranges may explain how our data are complementary to the literature data. On the one hand, in a pulse with short (<50 μs) PD the second (hyperpolarizing) phase may be so close to the onset of the first phase that it abolishes excitation initiation at low levels, thereby increasing threshold. Increasing PD to 50 μs (Fig. 6E) then reduces threshold by moving the onset of the hyperpolarizing phase further away from that of the depolarizing phase, as does increasing IPG (compare Fig. 6E and F). Indeed, by separating both phases with a 40-μs IPG, Moon et al. (1993) observed similar psychophysical detection thresholds for PDs shorter than 50 μs. On the other hand, a longer PD (>50 μs) may introduce significant influence of the leaky integrator, which means that more charge is needed to compensate membrane leakage over the length of the phase duration (Moon et al. 1993).
In that respect, a stronger “leaky” effect for the deafened groups might be expected since the demyelination associated with SGC degeneration (Agterberg et al. 2008) causes leakier cell membranes (Koles and Rasminsky 1972; Spoendlin 1984). Instead, we did not find differences between healthy and degenerating SGC populations. This might be explained by the relatively short PD we used, which may have been too short for significant charge leakage to occur. Note that longer PDs would require an increase in blanking time, which would interfere with the N1 peak. Alternatively, the effect of demyelination on membrane leakage might not be that substantial.
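The leaky-integrator argument can be made concrete with a minimal sketch based on the classical strength-duration relation for a passive (RC) membrane driven by a single rectangular phase. This is a simplified textbook model and not the analysis used in this study: it ignores the hyperpolarizing second phase, and the rheobase and membrane time constant values are hypothetical.

```python
import math


def threshold_charge_nc(pd_us, rheobase_ua, tau_us):
    """Threshold charge (nC) for one rectangular phase on a leaky membrane.

    Threshold current follows I_th = I_rheobase / (1 - exp(-PD / tau)),
    and the displaced charge is PD times current level.
    """
    threshold_current_ua = rheobase_ua / (1.0 - math.exp(-pd_us / tau_us))
    # uA * us = pC; multiply by 1e-3 to express the result in nC.
    return threshold_current_ua * pd_us * 1e-3


def charge_ratio(tau_us, pd_short_us=20.0, pd_long_us=50.0):
    """Threshold charge at the long PD relative to the short PD.

    The ratio does not depend on the rheobase, which cancels out.
    """
    long_q = threshold_charge_nc(pd_long_us, 1.0, tau_us)
    short_q = threshold_charge_nc(pd_short_us, 1.0, tau_us)
    return long_q / short_q


# Hypothetical parameters: rheobase 100 uA, time constant 100 us.
charges = [threshold_charge_nc(pd, 100.0, 100.0) for pd in (20, 50, 100, 200)]
```

In this model the threshold charge rises monotonically with PD, and the rise is steeper for a shorter time constant (a leakier membrane), which is the effect one might have expected for the deafened groups.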

Inter-Phase Gap

The potential of an IPG to decrease excitation threshold has been demonstrated in various neural tissues and at various levels of neural processing (van den Honert and Mortimer 1979; Shepherd and Javel 1999; McKay and Henshall 2003; Carlyon et al. 2005; Prado-Guitierrez et al. 2006; Cappaert et al. 2013). In the present study, this potentiating effect of an IPG is confirmed unambiguously for threshold (Fig. 5E, F) and level50% (Fig. 5I, J). Interestingly, this effect is more pronounced for threshold than for level50% (Fig. 5E, F and I, J), which is consistent with findings on loudness perception in CI users: McKay and Henshall (2003) reported a larger effect of IPG on threshold than on the loudest comfortable level. In addition, as is shown in Table 1, the effect of IPG on amplitude, slope, dynamic range, and latency is highly significant as well.

When comparing Figs. 5 and 6, it appears that IPG and PD have roughly similar effects on all six eCAP characteristics. However, the key difference between the two is that the effect size with a change in IPG, but not PD, highly depends on the physiological state of the cochlea. This is demonstrated by the significant interaction between IPG and group for slope, dynamic range, level50%, and latency and the lack of interactions between PD and group (Table 1). A smaller increase in slope (Fig. 5C, D) and a smaller decrease in level50% (Fig. 5I, J) indicate that the increased excitability brought about by an increased IPG is lower for the 6WD animals. More strikingly, the dynamic range remains mostly unchanged for the NH and 2WD groups, while it increases significantly for the 6WD animals (Fig. 5G, H). Probably the SGCs that are additionally recruited by the increased IPG have higher excitation thresholds, thereby enhancing the dynamic range. Given that this increase in dynamic range only occurs for deafened animals, these high individual thresholds may correspond to degenerating SGCs, or SGCs with a degenerated myelin sheath (Koles and Rasminsky 1972; Spoendlin 1984).

Both cathodic and anodic currents can initiate action potentials in extracellular auditory nerve stimulation (reviewed in Macherey et al. 2008). Shepherd and Javel (1999) showed that when the second phase of a biphasic pulse is the excitatory one, the fiber response is delayed by the IPG duration as expected (see Fig. 10C, D). In the present study, since alternating stimulation was applied for artifact reduction, we therefore expected the latency to increase with half the increase in IPG duration (14 μs; see Fig. 10). Instead, for 50-μs PD, the increase was negligible for NH animals, 14 μs for 2WD animals, and 22 μs for 6WD animals (Fig. 7L). The pre-processed eCAPs in response to the cathodic-first and anodic-first pulses were often too heavily obscured by the stimulation artifact to reliably discern which pulse polarity contributed most to the averaged eCAP. However, based on several less-obscured examples from normal-hearing animals (i.e., with large eCAP amplitudes), we gathered that, in the normal-hearing group, cathodic-first pulses contributed to a substantially greater extent. The observation that in this group the latency was not delayed when IPG was increased implies that it was the first (i.e., cathodic) phase that most potently excited the SGCs (Fig. 10A, B). We hypothesize that the latency shift we observed with IPG increase for deafened animals, but not for the normal-hearing controls, may therefore have resulted from a shift in excitation phase preference from cathodic-first in NH to cathodic-second in 6WD animals: an increased IPG will not influence the eCAP latency when the majority of all responses is evoked by the cathodic phase of the cathodic-first pulse (Fig. 10A, B), while the latency shift will be more than half the IPG increase when the majority is evoked by the cathodic phase of the anodic-first pulse (Fig. 10C, D).
Note that because we applied alternating polarity stimulation, this reasoning would equally pertain if the anodic rather than the cathodic phase (as is assumed here) were the most effective one. In a similar fashion, for the sake of simplicity, we here assume that also for deafened animals, as for the normal-hearing animals, the cathodic phase is the most effective one. This assumption may be incorrect, as pointed out by a modeling study by Rattay et al. (2001) showing that SGC degeneration caused a change from lower thresholds for cathodic monophasic pulses to lower thresholds for anodic pulses. While the effect for biphasic pulses was much weaker, it would imply a shift from cathodic excitation by the cathodic-first pulse in NH animals (Fig. 10A, B) to anodic excitation by the same pulse in 6WD animals. The expected latency shifts with increasing IPG in this case would remain as observed.

FIG. 10

The effect of IPG increase in case of alternating stimulation. For cathodic-first stimulus polarity (A, B), a latency shift is not expected since the excitatory phase (shaded area) remains unchanged. For anodic-first pulses (C, D), the latency shift relative to the stimulus onset is proportional to the IPG duration. In case of equal contribution of both pulse polarities, a latency shift of 14 μs is expected. The black bar indicates the time between stimulus onset and termination of the excitatory phase.
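The expected latency shift can be worked through numerically. The sketch below assumes, as a simplification that is not part of the original analysis, that the latency of the averaged eCAP shifts linearly with the fraction of the response evoked by the second phase of the pulse. The function name is hypothetical; the IPG values of 2.1 and 30 μs are those used in this study.

```python
def expected_latency_shift_us(ipg_short_us, ipg_long_us, fraction_second_phase):
    """Expected latency shift (us) of the averaged eCAP after an IPG increase.

    fraction_second_phase: share of the response evoked by the second phase
    of the biphasic pulse (0 = first phase only, 1 = second phase only).
    Assumes linear mixing of the two contributions.
    """
    if not 0.0 <= fraction_second_phase <= 1.0:
        raise ValueError("fraction_second_phase must lie between 0 and 1")
    return fraction_second_phase * (ipg_long_us - ipg_short_us)


# Equal contribution of both pulse polarities: half the IPG increase.
equal_contribution = expected_latency_shift_us(2.1, 30.0, 0.5)   # about 14 us
# Response dominated by the first phase: no shift expected.
first_phase_only = expected_latency_shift_us(2.1, 30.0, 0.0)
```

Under this linear assumption, a shift above half the IPG increase corresponds to a fraction above 0.5, i.e., a response dominated by the second phase.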

In case of the 20-μs PD (Fig. 7K), the hypothesis concerning the shift in excitation phase holds as well, with the exception that the latency decreased with increased IPG for NH animals (see negative latency shifts, as opposed to no shift, in Fig. 7K). Probably, a 20-μs cathodic phase almost immediately followed by the anodic hyperpolarizing phase is too short for the majority of SGCs to initiate a successful action potential, so that action potential initiation gradually shifts from the cathodic phase of the anodic-first pulse to the cathodic phase of the cathodic-first pulse as IPG increases, resulting in a shorter latency.

In addition to explaining the latency shift, this hypothesis can explain the limited potential of an increased IPG to enhance excitability (increase in slope and decrease in level50%; Fig. 5C, D and I, J) in the 6WD animals relative to the NH controls. If indeed the majority of SGCs in NH animals responds to the cathodic phase of the cathodic-first pulse rather than to that of the anodic-first pulse, these SGCs may benefit substantially from the delay of the second (anodic) phase, which otherwise can annihilate the successful initiation of action potentials (van den Honert and Mortimer 1979). If, then, in deafened animals the cathodic phase of the anodic-first pulse is the preferred exciter, the function of the IPG is fundamentally different as it separates the excitatory phase from the preceding hyperpolarizing phase rather than from the subsequent hyperpolarization. In the latter case, the IPG is arguably of less importance to action potential initiation.

This hypothesized shift in excitation phase is partly based on visual inspection of recordings in response to individual polarities, and largely inferred from the objective analysis of averaged data. For a more direct test of this hypothesis, a masker-probe paradigm for artifact reduction might have been better suited, since artifact-free recordings in response to both individual stimulus polarities would then have been available. We chose alternating stimulation since a larger portion of the SGC population is thought to be recruited than with a masker-probe paradigm (Westen et al. 2011).

Correlations with Histology

Both the maximum eCAP amplitude and the slope of the input–output function correlated significantly with SGC packing density. This is in agreement with findings by Hall (1990), who found a correlation between the amplitude of the eABR P1 peak and the number of spiral ganglion cells in rats. Although maximum amplitude and slope may be reliable and straightforward measures for the estimation of the number of SGCs in animals, they have restrictions when translated to clinical practice. First, factors such as the position of the electrode array (angle, depth, location of insertion), possible fibrosis or ossification, and a wide range of different available electrode arrays affect the eCAP amplitude and thereby also the slope. Second, the current range in CI users is much more limited than in (anesthetized) laboratory animals, so that the maximum eCAP amplitude often cannot be attained. The eCAP slope as we have defined it is the steepest slope in the input–output curve (at level50%, see Fig. 1D) and can only be reasonably estimated when current levels above level50% have been applied, which depends on maximum current level. The alternative in clinical settings is to apply linear fits through all supra-threshold data points (Kim et al. 2010), which, taking into account the non-linear nature of the input–output function, makes the slope dependent on the maximum current level. Hence, we have explored the possibility of using more indirect within-subject differential eCAP measures, which inherently suffer less from the aforementioned limitations, to predict the neuronal status. However, most of the relative measures still depend on the use of high current levels (second limitation), and it cannot be ruled out that any of the factors mentioned under the first limitation to some extent still influence the effect size of pulse shape variation.
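The dependence of a linear-fit slope on the maximum current level can be illustrated with a minimal sketch. It assumes a logistic input–output function with hypothetical parameters in arbitrary units; the actual curve shape and fitting procedure used in this study may differ.

```python
import math


def sigmoid_amplitude(level, max_amplitude=1.0, level50=10.0, steepness=2.0):
    """Assumed logistic input-output function (arbitrary units)."""
    return max_amplitude / (1.0 + math.exp(-(level - level50) / steepness))


def steepest_slope(max_amplitude=1.0, steepness=2.0):
    """Slope of the logistic curve at level50, where it is steepest."""
    return max_amplitude / (4.0 * steepness)


def linear_fit_slope(levels, amplitudes):
    """Least-squares slope of a straight line through the data points."""
    n = len(levels)
    mx = sum(levels) / n
    my = sum(amplitudes) / n
    sxx = sum((x - mx) ** 2 for x in levels)
    sxy = sum((x - mx) * (y - my) for x, y in zip(levels, amplitudes))
    return sxy / sxx


def slope_for_max_level(max_level, first_level=4):
    """Linear-fit slope through all points from first_level up to max_level.

    first_level stands in for the lowest supra-threshold level (assumed).
    """
    levels = list(range(first_level, max_level + 1))
    amplitudes = [sigmoid_amplitude(x) for x in levels]
    return linear_fit_slope(levels, amplitudes)


# Same underlying curve, three different maximum stimulation levels.
slope_low = slope_for_max_level(8)    # maximum level below level50
slope_mid = slope_for_max_level(12)   # maximum level just above level50
slope_high = slope_for_max_level(20)  # maximum level near saturation
```

For one and the same underlying curve, the three linear-fit slopes differ from each other and all stay below the steepest slope at level50%, which is why a within-subject differential measure is attractive.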

We did not find any significant interaction effects between PD and group (Table 1), and accordingly we found hardly any predictive power for either SGC packing density or perikaryal area for changes in eCAP characteristics with changed PD. Specifically, in contrast to Prado-Guitierrez et al. (2006), we did not find a significant correlation between a change in level50% with increasing PD and SGC packing density. The discrepancy with their findings most likely resulted from a difference in PD range (20–50 μs versus 104–208 μs), but could also be due to eCAP versus eABR recordings, or to monopolar versus bipolar stimulation. For PDs beyond the range we used, the necessary blanking time would obscure the N1 peak. This limitation can be circumvented by recording eABRs instead of eCAPs, as Prado-Guitierrez et al. did, but for eCAP recordings the PD range cannot be widened substantially, which all but disqualifies PD as a suitable predictor for neural health using currently available CI telemetry.

In contrast, the increase in IPG (from 2.1 to 30 μs) did result in a series of significant correlations (Fig. 7). Specifically, the change in level50% (Fig. 7I, J) is highly similar to the results Prado-Guitierrez et al. obtained with IPG variation: the slopes of the trend lines are 0.8 and 1.0 dB per 1,000 cells/mm², respectively.

Whereas Prado-Guitierrez et al. (2006) used only SGC packing density as histological measure to relate to their electrophysiological measures, we added perikaryal area as second histological measure to cover both the degree of SGC loss and the state of the surviving SGCs. The two measures are obviously related, although, as is shown in Figure 3C, variability within groups is substantial. Moreover, it is hypothesized that SGC loss and shrinkage may result from separate subcellular mechanisms (van Loon et al. 2013). SGC soma size affects functionality directly since it is highly (positively) correlated with membrane capacitance (Limón et al. 2005). Furthermore, a decrease in cell size may result from intracellular changes due to lost neurotrophic support (van Loon et al. 2013), which could cause a change in electrical activity as well. The possible effect of SGC packing density on SGC functionality is less straightforward. Less densely packed cells could result in a change in the current path from electrode to the neural substrate, resulting in lower thresholds and shorter latency as discussed above.

Considerations for Clinical Application

The present study has demonstrated that, by increasing the IPG of the stimulus, it is possible to predict the magnitude of a change in eCAP response for up to 80 % with histological measures for SGC degeneration (Fig. 8). Thus, the magnitude of an IPG-induced change in eCAP morphology is a fairly accurate predictor for neural health. By means of the CI’s telemetry function in human CI users, eCAPs can be evoked and recorded using the same parameters as used in this study. Although there are some critical differences between animal and human cochleas (e.g., with respect to myelination, anatomy, and excitation; Frijns et al. 2001; Abbas and Miller 2004), this method may prove to be suitable for clinical purposes. During cochlear implantation, intraoperatively acquired knowledge of the state of the auditory nerve may inform the choice to apply personalized supplementary neuroprotective treatment. Furthermore, these measures may have instant prognostic value for CI performance, and the state of the auditory nerve can be monitored over time. Since this method is independent of the highly variable absolute amplitude, it can easily be used for comparison between subjects.

In this light, it should be noted that in human CI users a correlation between a change in eCAP sensitivity (level50%) and speech perception has not been found (Kim et al. 2010). However, we have shown that this measure only marginally correlates with neural health, and that the predictive power furthermore depends on the PD that is used (Fig. 8E). Based on the present findings, the most likely candidates would be slope, dynamic range, and latency. Of these candidates, latency not only has the most predictive power (Fig. 8F), it is also the only characteristic for which it is sufficient to use only one current level, instead of having to construct an input–output curve. This will considerably reduce the duration of the measurement session, thereby making it more suitable for clinical purposes than the other characteristics. However, it is important to note that the predictive power of latency exists only for near-saturation stimulus levels (above 20 nC; see Fig. 9). This is not a result of less accurate N1 peak assessment, which might be expected at lower stimulus levels, since the within-animal variance of N1 peaks is similar for all stimulus levels above 12 nC (data not shown). Possibly, at medium stimulus levels the preference for pulse polarity or even for excitation phase polarity is less pronounced than at near-saturation levels. If this is the case, the IPG effect on latency obviously becomes less straightforward.

If this restriction to high-level stimulation applies to human CI users as well, high-level stimulation would be required for all eCAP characteristics used in the present study to predict neural status, which could be a considerable limitation to clinical applicability. Instead of using a single eCAP characteristic, employing a combination of all three identified candidates may therefore be crucial to develop a reliable objective measure for auditory nerve status in CI users.