Introduction

Electrocochleography (ECochG) measures the auditory-evoked potentials originating from the inner ear and distal portion of the cochlear nerve. First discovered by Wever and Bray (1930) while recording from the auditory nerve fibers (ANFs) in the cat, its traditional use has been the assessment of hearing thresholds and the objective diagnosis of endolymphatic hydrops (see Eggermont, 2017; Hornibrook, 2017 for reviews). However, recent applications encompass measurements of the auditory function before, during, and after cochlear implantation or lateral skull base surgery (Harris et al. 2017; Fontenot et al. 2019; Trecca et al. 2020a, b; Weder et al. 2021), as well as detailed diagnosis of the site of lesion in patients affected by auditory neuropathy spectrum disorder (ANSD) (McMahon et al. 2008; Iseli and Gibson, 2010; Santarelli, 2010; Santarelli et al. 2015; Riggs et al. 2017) and the evaluation of auditory dysfunction related to synaptopathy and hidden hearing loss (Liberman et al. 2016; Grant et al. 2020). ECochG potentials include sensory potentials, namely, the cochlear microphonic (CM) from the outer hair cells (OHCs) and inner hair cells (IHCs), and neural potentials, such as the compound action potential (CAP) and the auditory nerve neurophonic (ANN). Lastly, the summating potential (SP) to tones is an offset of the baseline that persists for the duration of the tone, and to clicks it is seen as a rising edge prior to the onset of the CAP. An increased SP has proven to be a reliable indicator of endolymphatic hydrops (Gibson et al. 1977; Gibson 1991) and an increase in the SP is also an ECochG indicator for cochlear synaptopathy (Liberman et al. 2016; Grant et al. 2020). A change in polarity of the SP during insertion of a cochlear implant may be indicative of electrode position (Helmstaedter et al. 2017).

Thus, changes in the size and polarity of the SP appear to be useful indications of cochlear function. However, to fully exploit its various uses, a complete understanding of its sources is necessary. The SP is often considered to arise entirely from hair cells, with the contribution from IHCs occurring at lower thresholds than OHCs (Zheng et al. 1997; Durrant et al. 1998). However, other reports show a neural component as well, because the SP changes after application of neurotoxins, such as TTX, CNQX, or kainic acid (van Emst et al. 1995; Sellick et al. 2003; Pappa et al. 2019), and can be reduced at high stimulus rates, which would not be expected for a purely hair cell potential (Kennedy et al. 2017). Each of the sources can vary in polarity depending on the location of the recording site and as a function of frequency and intensity. From the round window of the gerbil, pharmacological studies show that OHCs provide a negative polarity (relative to neck muscle), while the IHC and neural components are positive. The sizes of the contribution from each source vary across frequency and intensity such that the overall polarity could be either positive or negative. Like the hair cells, the neural component is also a mixture of sources, composed of both a spiking component based on action potentials and a dendritic component derived from the summed EPSPs within the postsynaptic terminals (Dolan et al. 1989; McMahon and Patuzzi, 2002; McMahon et al. 2008; Santarelli et al. 2008). The purpose of this study was to further delineate the neural contributions to the SP from these two components. To do this, a pharmacological model was developed where the spiking components could first be removed by TTX, a sodium channel blocker that prevents action potentials but not EPSPs, and then kainic acid, which removes the postsynaptic terminal entirely through excitotoxicity (Dolan et al. 1989). By combining these methods with our previous model for selectively removing OHCs (Pappa et al. 2019), the contributions of OHCs, IHCs, and the two neural components could be isolated and studied across a range of frequencies and intensities.

Methods

Animal Model

A total of 27 male Mongolian gerbils (Meriones unguiculatus), weighing between 60 and 80 g, were included in this study. The sex was restricted to males because the numbers per group were too small for stratification by sex. The gerbils were obtained from Charles River Laboratories (Wilmington, MA, USA). All animal protocols were conducted in accordance with the National Institutes of Health Guide for the Care and Use of Laboratory Animals (National Research Council 2011). All experiments were reviewed and approved by the Institutional Animal Care and Use Committee at the University of North Carolina at Chapel Hill.

Experimental Design

The design of the experiment is shown in Fig. 1. Gerbils with two hearing conditions were used based on ototoxin exposure: untreated, ostensibly normal hearing (NH) animals, or treated animals where OHCs were removed with a combination of furosemide and kanamycin (FK). Normal hearing in the untreated animals was confirmed by comparing the CM magnitude across frequency with a series of 24 animals from a previous study (Pappa et al. 2019). In addition, in all NH cases, the thresholds to the CM and CAP were within 15 dB of 0 dB SPL at 4 kHz. All animals then went through an acute recording experiment where ECochG was performed prior to introduction of TTX, after TTX, and again after kainic acid. In this way a variety of waveform subtractions could be made to isolate contributions from the different sources. In Fig. 1, the elements contributing prior to and after the treatments are indicated. Contributions could be determined either directly, as in the case of IHCs (which are the only elements remaining after treatment with FK and kainic acid), or by subtraction to obtain the OHC and neural contributions. It is important to note that the isolation of OHCs is only available across animals with the two hearing conditions. The neural components are available as within-animal subtractions for each animal group.
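The subtraction logic of this design can be sketched in a few lines of Python. This is a minimal illustration, not the analysis code used in the study: the function name `isolate_components`, the dictionary keys, and all numerical values are hypothetical, and the real subtractions were applied to waveforms and grand averages rather than to single numbers.

```python
def isolate_components(nh, fk):
    """Isolate SP sources from values recorded at each step.

    nh, fk: dicts with keys 'pre', 'post_ttx', 'post_ka' holding the SP
    (here a single hypothetical value in microvolts) for one
    frequency/intensity combination in each hearing condition.
    """
    return {
        # Within-group subtractions: TTX removes the spiking part,
        # KA then removes the remaining dendritic part.
        "spiking_nh": nh["pre"] - nh["post_ttx"],
        "dendritic_nh": nh["post_ttx"] - nh["post_ka"],
        "spiking_fk": fk["pre"] - fk["post_ttx"],
        "dendritic_fk": fk["post_ttx"] - fk["post_ka"],
        # Direct measure: only IHCs remain after FK, TTX, and KA.
        "ihc": fk["post_ka"],
        # Across-group subtraction of the pre-treatment responses.
        "ohc": nh["pre"] - fk["pre"],
    }


# Hypothetical values chosen only to illustrate the arithmetic.
nh = {"pre": -2.0, "post_ttx": -6.0, "post_ka": -4.0}
fk = {"pre": 8.0, "post_ttx": 5.0, "post_ka": 6.5}
components = isolate_components(nh, fk)
```

Because the OHC term compares different animals, it is only meaningful on group averages, whereas the neural terms can be computed within each animal.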

Fig. 1

Experimental design to isolate sources of the SP. The two hearing conditions (NH and FK) are shown at the top (columns), steps at left (rows) indicate when recordings were taken, and arrows (both down and across) indicate subtractions to isolate particular sources. Colored text indicates where a source could be isolated

Production of FK Animals

The FK animals were injected with 200 mg/kg kanamycin (SQ) and 100 mg/kg furosemide (IP), dosages that were shown previously to remove OHCs from all but the apical parts of the cochlea and to leave IHCs intact (Pappa et al. 2019). Survival time was 7–10 days. The effects of the FK treatment are variable, with greater or lesser amounts of the apical OHCs remaining, but the treatment reliably results in OHC losses from the base to at least the 2-kHz region of the cochlea. An example of the hair cell distribution remaining after FK treatment is shown in Fig. 2.

Fig. 2

Confocal image of the basilar membrane from an FK animal immunolabeled for myosin VIIA, showing OHCs and IHCs. OHCs are present in the apex (insert, lower right) but not in the base (insert, upper right). IHCs are present throughout. The white bar marks the point where surviving outer hair cells begin to appear, which is about the 1 kHz point in the gerbil cochlea (Muller 1996; Hutson et al. 2021). Scale = 300 µm

Surgery and Electrocochleography (ECochG) Recording

Animals were anesthetized with an intraperitoneal injection of urethane (1.5 g/kg) in combination with Nembutal (10 mg/kg). Once the anesthetic plane was reached, the fur over the bulla was shaved and a subcutaneous injection (0.1 cc) of a local anesthetic (Lidocaine) was delivered under the skin overlying the bulla. Each animal also received a subcutaneous injection of sterile saline (0.1–0.2 cc) and 0.05 mg/kg atropine to retard mucus secretions in the animal’s airway. The animal’s head was placed in a head-holder, and body temperature was maintained at 37 °C with a custom-made hot-water circulating heating pad and a rectal thermometer. After pinna removal and retraction of the thin muscle overlying the bulla, a small hole was made in the bulla sufficient to view and gain access to the round window for ECochG recording.

Stimulation and recording for ECochG were controlled by a Biologic Navigator Pro (Natus Medical Inc., San Carlos, CA). Recording was through a stainless-steel facial nerve monitor probe (Neurosign, Magstim Co., Wales, UK) placed in the round window niche and used as the non-inverting input. Inverting and ground inputs were from needle electrodes over the contralateral mastoid and tail, respectively. Sound delivery was through an Etymotic ER-3b speaker (Elk Grove Village, IL, USA) with the sound tube sealed in the ear canal. Acoustic stimuli were tone bursts and clicks, alternating in condensation and rarefaction phases, with 100 repetitions to each phase, across a range of intensities from 0 to 90 dB SPL. Tone bursts were presented at 2, 3, 4, 6, and 8 kHz. Calibration was done in the ear canal through a probe tube attached to a ¼-inch microphone and measuring amplifier (model 3982, B&K, Belgium). ECochG recordings were taken using a 0.1 Hz high pass filter and low pass filters of 5–15 kHz depending on stimulus frequency. The stimulation rate was 11.1 cycles per second.

Recordings in NH and FK animals were taken before and after application of tetrodotoxin (TTX) to remove the contribution of the spiking neural elements and kainic acid (KA) to remove both spiking and dendritic elements. Concentration of TTX used was 0.015 mM, and KA was 100 mM in artificial perilymph consisting of (in mM): 127.5 NaCl, 3.5 KCl, 25 NaHCO3, 1.3 CaCl2, 1.2 MgCl2, 0.75 NaH2PO4, and 11 glucose, and pH adjusted to 7.3 with HCl (Mikulec et al. 2009). After each application of neurotoxin there was a 1-h wait for the drug to diffuse through the round window and cochlea before the ECochG measurement. In NH control animals, artificial perilymph was used without the addition of KA or TTX.

Data Analysis

The digitized data was analyzed through custom routines in MATLAB (R2021a, Mathworks, Natick, MA). The ECochG data was used to extract the SP from the sum of the responses to each stimulus phase. The SP is more visually apparent in the summed responses because summing removes hair cell and neural components of the responses that change with phase. For tones, the summed waveform was then further smoothed with a triangular filter whose number of points equaled the number contained in 1 cycle of the stimulus frequency. The SP was measured as the average of a 1 ms time window that was within a few ms prior to stimulus offset, i.e., at a point where the response had reached steady state and was thus beyond the time where perturbations associated with the stimulus onset, such as the CAP, might affect the SP measures.
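The steps in this paragraph (sum the two phases, smooth over one stimulus cycle, average a 1 ms window before offset) can be sketched as follows. This is an illustrative Python sketch, not the MATLAB routines used in the study. The function names, the 2 ms gap between the measurement window and stimulus offset, the edge handling of the filter, and the synthetic test signal are all assumptions.

```python
import math


def triangular_kernel(n_points):
    """Normalized triangular weights of length n_points."""
    center = (n_points - 1) / 2.0
    half = (n_points + 1) / 2.0
    weights = [half - abs(i - center) for i in range(n_points)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth(signal, kernel):
    """Centered weighted average; edges renormalize over available samples."""
    n, offset = len(signal), (len(kernel) - 1) // 2
    out = []
    for i in range(n):
        acc = wsum = 0.0
        for j, w in enumerate(kernel):
            idx = i + j - offset
            if 0 <= idx < n:
                acc += w * signal[idx]
                wsum += w
        out.append(acc / wsum)
    return out


def extract_sp(condensation, rarefaction, fs_hz, stim_freq_hz,
               offset_s, gap_s=0.002, window_s=0.001):
    """SP from the summed phases (gap_s before offset is an assumed value)."""
    # Summing the two phases cancels components that invert with phase (CM).
    summed = [c + r for c, r in zip(condensation, rarefaction)]
    # Triangular filter spanning one cycle of the stimulus frequency.
    n_cycle = max(1, round(fs_hz / stim_freq_hz))
    smoothed = smooth(summed, triangular_kernel(n_cycle))
    # Average a 1 ms window ending gap_s before stimulus offset.
    end = int(round((offset_s - gap_s) * fs_hz))
    start = end - int(round(window_s * fs_hz))
    window = smoothed[start:end]
    return sum(window) / len(window)


# Synthetic example: an 8 kHz "CM" riding on a 1.5-unit baseline shift.
fs, freq, n_total, n_tone, dc = 40000.0, 8000.0, 480, 400, 1.5
cond, rare = [], []
for i in range(n_total):
    on = i < n_tone
    cm = math.sin(2 * math.pi * freq * i / fs) if on else 0.0
    cond.append((dc if on else 0.0) + cm)
    rare.append((dc if on else 0.0) - cm)
sp_estimate = extract_sp(cond, rare, fs, freq, offset_s=n_tone / fs)
```

Because the phases are summed rather than averaged in this sketch, the estimate is twice the baseline shift present in each single-phase response.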

The protocol was used on 27 gerbils. A factor affecting yield of successful experiments is the effect of time after anesthesia on the responses. That is, there was generally some decline during the experiment, and since later responses are subtracted from earlier ones to isolate the components (Fig. 1), the results will be affected by the degree to which the responses change as a function of time rather than the effects of the TTX or KA. The criterion used to assess the effect of time was the change in the CM, which, because it is produced by hair cells, should not be affected by the removal of the neural components. The frequencies used (2–8 kHz) are above the range of neural phase-locking that produces the auditory nerve neurophonic in the gerbil (Henry 1995; He et al. 2012; Forgues et al. 2014; Fontenot et al. 2017), so the CM should be the only AC component in the steady-state response. For inclusion in the analysis, the CM needed to remain within 2 dB of the range at the start of the preceding step, so the total decline for the two steps had to be less than 4 dB. In the 15 cases included, the total decline in the CM ranged from 1.1 to 3.4 dB with a mean of 2.3 ± 0.9 dB (standard deviation).
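The inclusion criterion can be expressed as a simple check on the CM decline, in dB, across the two treatment steps. The Python sketch below is illustrative only: the function names and the amplitude values are hypothetical, and treating the limit as inclusive (a decline of exactly 2 dB passes) is an assumption.

```python
import math


def decline_db(before, after):
    """Decline in dB from 'before' to 'after' (positive means a drop)."""
    return 20.0 * math.log10(before / after)


def passes_cm_criterion(cm_pre, cm_post_ttx, cm_post_ka, step_limit_db=2.0):
    """Check that the CM stayed within the limit at each step.

    Returns (included, total_decline_db). CM inputs are linear amplitudes
    measured before treatment, after TTX, and after KA.
    """
    step1 = decline_db(cm_pre, cm_post_ttx)
    step2 = decline_db(cm_post_ttx, cm_post_ka)
    included = abs(step1) <= step_limit_db and abs(step2) <= step_limit_db
    return included, step1 + step2


# Hypothetical CM amplitudes (arbitrary linear units).
stable_ok, stable_total = passes_cm_criterion(100.0, 89.1, 75.0)
drift_ok, drift_total = passes_cm_criterion(100.0, 70.0, 65.0)
```

In the first hypothetical case the CM falls by about 1.0 dB and then 1.5 dB, so the case is retained with a total decline of about 2.5 dB; in the second, the first step alone exceeds 2 dB and the case is excluded.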

Histological Analysis

At the end of the recording session, animals were euthanized with an overdose of sodium pentobarbital, and the cochleae were removed “en bloc” and immersed in 4% paraformaldehyde in 0.1 M phosphate buffer for at least 24 h. The cochlea was then decalcified for 2–3 days in 10% EDTA. At that time the otic capsule was removed, and the basilar membrane was carefully dissected away from the lateral wall and modiolus (see Hutson et al. Fig. 3) and prepared for histological examination. The location of surviving hair cells in FK-treated animals was charted after staining the basilar membrane with iron hematoxylin, and hair cells were counted in 250 μm increments using a Zeiss Axioscope light microscope and 40 × objective (Carl Zeiss, Thornwood, NY). Alternatively, the basilar membrane was immunostained for the detection of hair cells. The membrane was rinsed in phosphate-buffered saline (PBS), blocked for 2 h in 5% normal donkey serum with 1% Triton X 100 in PBS, transferred to primary antibody solution overnight (Myosin VIIa at 1:200 in blocker solution; Proteus Biosciences; 25–6790, Ramona, CA, RRID:AB_10015251), rinsed 3 × 15 min in PBS, followed by a fluorescently tagged secondary antibody for 2 h (1:500; Thermo-Fisher; A10042, Waltham, MA, RRID:AB_2534017), washed in PBS, and mounted between coverslips. Fluorescent material was imaged with a Zeiss 700 scanning laser confocal microscope (Carl Zeiss Microscopy GmbH, Jena, Germany) at × 5 magnification; the images were viewed and analyzed with Fiji for ImageJ (Schindelin et al. 2012).

Fig. 3

Examples of responses to the different treatments for a single frequency/intensity combination (8000 Hz, 80 dB SPL) and data analysis to extract the neural spiking and dendritic contributions to the SP. A Grand average of the alternating phases for 5 normal hearing (NH) animals. The point where the SP was measured (bar) became more negative after TTX, and then more positive after KA. B Grand average data for 5 animals with outer hair cells removed with FK. The directional changes in the SP polarity after the TTX and KA were the same as NH animals. C and D Spiking and dendritic components derived after subtractions of the data in A and B, respectively. E and F Pattern of the SP changes for each of the 5 animals making up the grand averages in A and B. The pattern was similar in 9 of the 10 animals, and only 1 FK animal was different

Statistical Analysis

Within-animal effects were studied with linear mixed models in SPSS (v28, IBM, Armonk, NY) to isolate significant effects of the treatments where the random factor was the animal number and the fixed factors were frequency, intensity, and the three treatment conditions, i.e., pre- and post-TTX and then post-KA. Animals in the experimental groups were randomized, i.e., two animals were obtained at a time and one was assigned to the NH arm and one to FK, until the desired sample of 5 animals/group was reached. The NH control group was run after the experimental groups were completed. Data collection at the time of ECochG was automatic so no blinding to condition was performed. Histological analysis was done without knowledge of physiological results.
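One way to write the model described above is given below. This is a sketch under stated assumptions rather than the exact SPSS specification: treating frequency, level, and treatment as categorical factors, the choice of two-way interaction terms, and the symbols themselves are assumptions made for illustration.

```latex
\mathrm{SP}_{a,f,\ell,t} = \beta_0 + \alpha_f + \gamma_\ell + \tau_t
  + (\alpha\gamma)_{f\ell} + (\alpha\tau)_{ft} + (\gamma\tau)_{\ell t}
  + u_a + \varepsilon_{a,f,\ell,t},
\qquad u_a \sim \mathcal{N}(0, \sigma_u^2),\quad
\varepsilon_{a,f,\ell,t} \sim \mathcal{N}(0, \sigma^2)
```

Here $a$ indexes the animal (random intercept $u_a$), $f$ the frequency, $\ell$ the level, and $t$ the treatment condition (pre-TTX, post-TTX, post-KA).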

Results

Example of the Treatment Effects at a Single Frequency and Intensity

Examples of “grand average” responses for five animals each in the two initial hearing conditions (NH and FK; Fig. 3) show that for this frequency and intensity (8000 Hz at 80 dB SPL), the average effect of the TTX was to remove the CAP, as expected, and to remove a positive polarity component of the SP, driving it more negative by comparison (Fig. 3A, B). The effect of the KA was to remove a negative polarity contribution to the SP, so it again became more positive, although less than the original waveform. The remaining waveform, after both treatments, represents the combination of the OHCs and IHCs for the NH animals, and IHCs only for the FK animals. These results show that, after subtraction, the spiking and dendritic components were opposite in polarity (Fig. 3C, D). For both parts of the neural component, there was adaptation in terms of at least a partial return to baseline by the measurement point (black bar in A and B). The variability across animals, shown in Fig. 3E, F, indicates that the “v-shaped” trend was observed after the treatments in 5/5 cases for the NH animals and 4/5 for the FK. Note that for the NH animals, the pre-treatment SP could be either positive or negative while for the FK animals, it was always positive, due to the lack of a negative polarity contribution from OHCs (Pappa et al. 2019). Also, as a reference for the relative sensation levels of the stimuli, the thresholds for the FK animals to the CM at 4 kHz were approximately 30 dB higher than for the NH animals, and the maximum responses of the CM were proportionally lower as well.

Statistical Comparisons

Overall, the effects of TTX and KA were significant compared to the within-animal, pre-treatment baseline (Tables 1 and 2). Using the SP as the output variable and animal ID as the random variable, linear mixed models for each animal group showed significant fixed effects of frequency, level, and treatment (pre-TTX, post-TTX, and post-KA). The interaction of frequency and level was also significant for both groups. The NH group showed significant interactions of treatment with frequency and level, but the FK animals did not. The interaction in the NH animals suggests a relationship between the treatments and the presence of OHCs. That is, the presence of the negative polarity from the OHCs causes the pre-treatment SP to vary greatly in both magnitude and polarity as a function of frequency and intensity, in contrast to the FK animals where it is always positive. As expected in control animals (Table 3), there were significant main effects of level and frequency, and a significant interaction between level and frequency, but the main effect of treatment was not significant. There were some missing conditions for 3000 Hz for one animal each in the FK and NH control animals due to a programming error.

Table 1 Mixed linear model results for NH animals (n = 5375 conditions)
Table 2 Mixed linear model results for FK animals (n = 5213 conditions)
Table 3 Mixed linear model results for control NH animals (n = 5357 conditions)

Distributions of Spiking and Dendritic Contributions to the SP Across Frequency and Intensity

In Fig. 4A and B, we show the spiking and dendritic components across all frequencies and intensities tested for the NH and FK animals, respectively. The graph is presented as a function of level, with frequency as a parameter. In each case, the spiking part of the neural component (filled symbols, solid lines) had a positive polarity, while the polarity of the dendritic component (open symbols, dashed lines) was negative. The magnitudes of the components increased with intensity in general, and there was relatively little effect of frequency, but there were exceptions to both (e.g., NH to 2000 Hz and FK to 6000 Hz). The 2 kHz responses deviated substantially from those at the other frequencies for both hearing conditions, suggesting that the action of the neurotoxins and/or ototoxic treatments may have been less effective towards the apex.

Fig. 4

Neural spiking and dendritic components for the two animal groups across frequency and intensities. A NH animals. B FK animals. Error bars are standard deviation. Solid symbols and lines = neural spiking, open symbols and dashed lines = neural dendritic

From the experimental design (Fig. 1), each of four components that contribute to the SP (OHC, IHC, neural dendritic, and neural spiking) can be isolated based on particular subtractions, or in the case of IHCs determined directly in the FK animals. The results in terms of grand averages for each component at one frequency/intensity combination are shown in Fig. 5. The IHCs and OHCs from this independent data set have large and opposite polarities, as reported previously (Pappa et al. 2019), while the neural spiking and dendritic components also have opposite polarities but with smaller magnitudes. The OHC curve in this case was isolated from pretreatment NH and FK animals (see Fig. 1).

Fig. 5

Grand averages of the isolated IHC, OHC, and neural spiking and dendritic sources at one frequency/intensity combination (8000 Hz, 80 dB SPL)

Responses to Clicks and Time Course of the Different Potentials

Examples of the response to clicks for the different treatments and two hearing conditions are shown in Fig. 6. For the NH case, the click responses to condensation and rarefaction phase stimuli at 50 dB SPL (Fig. 6A) show a slight asymmetry in the first cycle of response (compare arrows at negative and positive points on the y-axis). This asymmetry is responsible for the first part of the SP in the sum (Fig. 6B) and persists through the TTX and KA treatments, indicating it is derived from hair cells. Subtracting the curves as before yields the neural spiking and dendritic components (Fig. 6C).

Fig. 6

Responses to clicks. A–C Example from an NH animal. D–F Example from an FK animal. In each case, the top panels (A and D) show the responses to condensation and rarefaction phase stimuli, and the arrows indicate the negative and positive peaks of the first response. The middle panels (B and E) are the sum of the two phases before and after each treatment, and the bottom panels (C and F) are the spiking and dendritic components determined by the subtractions as in Fig. 1. The arrowheads in E and F show a slight change in the early response with the different treatments and in the subtractions; this is presumably hair cell responses changing over time rather than a direct effect of the neurotoxins

For the FK case, at 60 dB SPL, because the responses from OHCs have been removed, the first cycle of deflections is from IHCs only (Fig. 6D). The asymmetry is much greater than when OHCs are present, and in the opposite direction. In the sum curve (Fig. 6E), the initial deflection is positive rather than negative and is also relatively, although not completely, stable after each treatment (arrowhead). After subtraction (Fig. 6F), the isolated potentials show that the peak of the dendritic response occurs slightly after the negative peak of the CAP and then returns to baseline rather than reaching steady state. The initial peak associated with the spiking component (arrowhead) is an artifact of the subtractions due to the small changes after the treatments in Fig. 6E.

To observe the time course for all four components of the SP, the grand average results to an 8 kHz tone (Fig. 7A) and to clicks (Fig. 7B) are shown on an expanded time axis and with the OHC response flipped to a positive polarity to better appreciate the relative timing of each event. The first event to both stimuli was the response of OHCs which rose sharply, followed by the IHCs which, to both stimuli, had a shallower slope. The differences in onsets and time constants between OHCs and IHCs across frequencies and intensities were reported previously for tones (Pappa et al. 2019) and the pattern shown here is consistent with that report. The neural components followed in time, with the large spiking component (the CAP) rising quickly and the smaller dendritic component having a slower time course, as seen in the examples in Fig. 5. As noted previously, the peak of the dendritic potential occurred after the large deflection of the CAP. The contributions of the components reached a steady state to tones, but clicks returned to baseline with time.

Fig. 7

Grand average response to tones and clicks on an expanded scale to see the time courses of the responses. A The onset portion of the grand average responses in Fig. 6. The sequence in time is outer hair cells, then inner, then the neural dendritic and spiking components close together. B Responses to clicks. The time sequence of the components is similar, but the components do not reach steady state

Discussion

The main result was that the neural contribution to the SP consists of spiking and dendritic components, so that together with the OHCs and IHCs, there are at least four components that combine to produce the SP. Like IHCs and OHCs, the two neural components have opposite polarity when recorded at the round window, seen both in NH animals and in animals where the OHCs were selectively removed with ototoxins. The study of the SP has a long history as one of the more complex potentials produced in response to sound, and a clinical history as a sensitive and specific indication for Meniere’s disease (Gibson 1991; Ferraro and Durrant 2006; Iseli and Gibson 2010; Hornibrook 2017). Other uses, such as an indicator for cochlear synaptopathy (Liberman et al. 2016; Mepani et al. 2020) and a monitor for tonotopic position during cochlear insertions (Helmstaedter et al. 2017), are also proposed. In this study we continued to dissect the complex nature of the SP and will consider this complexity in relation to its possible clinical applications.

Technical Considerations and Limitations

A key issue with subtractions as in Fig. 1 is potential declines over the time of the experiment that are unrelated to the neurotoxins. Controls for this were to monitor the CM, which should not be affected by the neurotoxins, and to do sham experiments with artificial perilymph only.

The frequency range used was limited because the equipment and recording configuration used clinical equipment identical to that used for similar recordings from the round window of cochlear implant subjects recorded intraoperatively (e.g., Fontenot et al. 2019; Pappa et al. 2019). This limited the upper frequency to 8 kHz. The lower bound of 2 kHz was because the FK treatment reliably removed OHCs to about this range, and complete removal of OHCs is required to isolate the IHC response and determine the OHC contribution by subtraction. However, the data in Fig. 4 indicates that for the 2 kHz stimuli, the neurotoxin and/or ototoxin diffusion may be incomplete at more apical CF regions. The intensity range was limited to 60–90 dB SPL for NH animals and 70–90 dB SPL for FK animals. The high intensity limit was arbitrary, but it seems unlikely that the results would be much altered if higher intensities were used. At lower intensities, the SP was often too small for subtractions to be reliable.

The neurotoxins should not affect the hair cell stereociliary responses that produce the CM. However, the removal of the synapse with KA may affect the basolateral surface, which can in turn have an effect on the production of an SP (described more fully below). Importantly, the TTX should not have an effect on the hair cells since the synapse remains intact and only action potentials are prevented, so an effect on the SP from blocking spiking activity is evident.

Neural Components of the SP

Initial studies of the SP suggested a possible neural origin (Davis et al. 1950; Kupperman, 1966), but over time, these were discounted for a purely hair cell origin based on asymmetries in the CM primarily of OHCs (Dallos et al. 1972; Dallos 1973; Dallos and Cheatham 1976). However, asymmetries are greater in IHCs than OHCs (see Fig. 6) and have opposite polarities, at least as recorded at the round window. Possible reasons for the polarity difference between OHCs and IHCs were discussed in our previous report and include an operating point on opposite sides of the 50% channel open/closed position at rest, or differences in the location of the “center of gravity” of the generators (Pappa et al. 2019). In humans, the SP varies in polarity and magnitude across frequency and intensity or with hearing condition, suggesting similar interactions of inputs from sources with different polarities (Dauman et al. 1988; Ferraro et al. 1994; Pappa et al. 2019).

Previous studies also showed effects of neurotoxins on the SP (Dolan et al. 1989; van Emst et al. 1995; Sellick et al. 2003) but this neural contribution was not systematically explored. By combining the use of neurotoxins with animal models with and without OHCs, we were able to isolate OHC, IHC, and two neural contributions over the frequency and intensity ranges used in the current study. Here, we show that the spiking part of the SP provides a larger positive polarity contribution than previously realized (Pappa et al. 2019), because it is opposed by a negative polarity dendritic contribution.

A difference between the spiking and dendritic contributions to the SP to tones that may impact their polarity is their site of origin. The dendritic potential resides in the terminals next to IHCs, and thus the distribution of active fibers is likely to follow that of the IHCs. In contrast to the local origin of the dendritic potential, the origin of the CAP may be more central (Brown and Patuzzi 2010). Dolan et al. (1989) using the same pharmacology in NH guinea pigs reported that SP recordings from scala vestibuli had reversed polarities relative to scala tympani for the hair cell potential, but not for either of the neural components. This expected result shows that the neural potentials do not reverse in phase relative to the motion of the basilar membrane in the different compartments.

The SP and Endolymphatic Hydrops

The most significant clinical use of the SP is based on its increase in subjects with confirmed Meniere’s disease. The basis for the increase is thought to be the pressure from endolymphatic hydrops, the pathophysiological correlate of Meniere’s disease, compressing the space between the tectorial and basilar membranes, resulting in an increased asymmetry in the OHC input/output function and a larger SP (Gibson 1991). The increase is seen both with clicks (Ferraro et al. 1983; Ferraro and Tibbils 1999; Ferraro and Durrant 2006) and tones (Gibson 1991; Iseli and Gibson 2010; Hornibrook 2017). However, according to our results, the increased SP to tones (i.e., becoming more negative in our recording configuration) could also occur through diminished IHC or spiking neural activity. Since hearing loss is typical with Meniere’s disease, some effects on these elements are expected. For clicks, an increase could also be due to increased output from OHCs, decreased output from IHCs, or an increased dendritic component. The spiking component is not likely to be involved because the CAP dominates the earliest part of the spiking response.

The SP and Cochlear Synaptopathy

Recent articles have suggested that an increased SP can also be a correlate of cochlear synaptopathy, or loss of neural responses without concomitant losses in hair cells (Liberman et al. 2016; Grant et al. 2020; Hancock et al. 2021). The correlation was observed by comparing SP to clicks between the best and worst performers on a battery of word tests. Yet, because of the complexity of the SP, there could be several explanations for the correlation. The initial rise in response is the same in both groups, and this rise can be attributed to the OHCs (note that the recording direction is reversed between most research and clinical studies). OHC function also appeared similar between the groups based on distortion product otoacoustic emissions. After the initial rise, there is an inflection with a slope change, which is reduced in the best performance group and increased in the worst performance group. During this brief period prior to the onset of the CAP, active sources of the SP other than OHCs are the IHCs and the neural dendritic component. A reduced neural dendritic response would be expected with cochlear synaptopathy, but since it has the same polarity as the OHCs, a reduction in the dendritic response cannot account for an increased slope in the worst performers. In addition, a change in the IHC transduction process does not seem likely. However, an increase in IHC resting potential would reduce the contribution to the SP, and a possible correlate of a loss of synapses due to excitotoxicity could be a less tight basolateral surface. Thus, a reasonable hypothesis for the increase in the SP in the worst performing group is a less asymmetric operating point in IHCs, due to a higher resting potential, compared to the best performers.

The SP as a Monitor During Cochlear Implantation

Recently it has been suggested that the tuning of the SP to frequencies as a function of position can indicate the tonotopic region of a particular contact (Helmstaedter et al. 2017). All of the SPs reported in that study had positive polarities and were in response to relatively high frequencies. The tuning shown for the SP was greater than that of the CM or CAP. The difference in asymmetry between OHCs and IHCs makes it likely that the dipoles for IHCs are relatively more basal than for OHCs, so caution must be used in interpreting the SP data. A study of the SP in guinea pig assessed its properties at four points along the cochlea from base to apex as a function of frequency and intensity (Honrubia and Ward 1969). They noted systematic changes in the SP that were consistent with a wide spread of excitation, i.e., the peak SP and CM to 3000 Hz and 80 dB SPL was in the basal turn (turn 1), rather than near its CF region in turn 3. Thus, using the SP during a CI insertion, where typically high intensities are used and time is limited, might be challenging, but as a post-insertion measure for identifying position of contacts, the SP could be useful.

A difficulty in using the SP for human CI subjects is that in most subjects, it is quite small, at least at the round window (Riggs et al. 2017). However, in some CI subjects, the SP is substantial and a variety of morphologies is seen: from large and negative in Meniere’s and ANSD subjects, to small potentials when associated with a CAP, and to positive polarities with and without a CAP where IHC and/or spiking components may dominate (Riggs et al. 2017). Thus, the variety of SP morphologies is consistent with a plurality of sources.

Conclusion

This study used pharmacology with round window ECochG in gerbils to isolate four sources of the SP, including OHCs, IHCs, and two from the auditory nerve, one related to synaptic potentials (neural dendritic) and one to firing of action potentials (neural spiking). The sources vary by time course and polarity, and to some degree, by frequency and intensity, so that any final SP is a complex mixture of the sources. Thus, interpreting SP changes in clinical situations requires consideration of the complex interactions involved.