Introduction

The auditory system’s neural pathway consists of both afferents and efferents functioning in a complex multi-layer circuitry. While afferents are reasonably well understood, less is known about efferents despite their being implicated in a diverse set of roles. One such efferent is the medial olivocochlear (MOC) bundle, which projects from the superior olivary complex onto the bases of the outer hair cells (OHC) of the cochlea (Warr and Guinan 1979). When stimulated, MOC efferents directly inhibit OHC activity and reduce cochlear amplifier gain (Guinan 2006; Guinan and Gifford 1988; Wiederhold and Kiang 1970). These efferent effects consequently shift rate-level (RL) functions at the auditory nerve (AN) to higher stimulus levels (a rightward shift), as shown by Guinan and Gifford (1988) for responses of cat neurons to pure tones with electrical efferent stimulation.

Further up the neural pathway, however, MOC efferent effects become more complex and varied. Seluakumaran et al. (2008) investigated MOC efferent effects on single neuron recordings of the Inferior Colliculus (IC) in the auditory midbrain by means of electrical stimulation. The mentioned study was conducted on anaesthetized guinea pigs and found three main categories of MOC effects across a variety of typical rate-level responses for pure-tone stimulus. The most common MOC-induced effect (44–55%) was shifts of the rate-level (RL) response to higher stimulus levels which are similar to those seen at the AN. The second category of recorded effects was inhibitory—mainly strong reductions in overall firing rate (~ 28%). A minority of recordings (11–14%) showed the third excitatory type of effect where firing rate was enhanced at mid and high levels of stimulus. A strong motivation to understand the effects of MOC stimulation is its association with adjusting the dynamic range of neurons (Guinan 2006). At the AN, shifts to higher stimulus levels alter the dynamic range and consequently the neuron’s sensitivity to different stimulus levels. This can potentially allow for a neuron to code stimuli at levels where it would previously be saturated. At the IC, the plethora of combinations of effects and RL subtypes provide even more possibilities to adjust the dynamic range of neurons.

Little is known as to how MOC efferent effects at the periphery translate to such varied responses. In particular, the enhancing effect at the IC is interesting as it differs from the inhibitory nature of MOC-induced effects seen at lower levels of the auditory pathway (cochlear amplifier gain reduction; reduction of the AN response at low and mid stimulus levels). While the effects observed in the study were induced under less natural conditions (electrical stimulation; anaesthetized; pure-tone stimulus), understanding them is the logical first step in expanding the body of knowledge regarding MOC efferent effects at higher levels of the auditory system. The various effects observed are clues to which information cues the auditory system is processing and how it does so through the MOC. This is useful for understanding MOC efferents in more natural situations (i.e., complex stimuli like speech and sound-evoked MOC activity).

It is thought that the diverse effects are a reflection of complex ascending circuitry of excitatory and inhibitory inputs innervating the IC. Seluakumaran et al. (2008) discussed this and illustrated with example circuitry how relative effects of MOC on the excitatory and inhibitory components can produce these effects. For example, within a circuitry involving inhibition from off-frequency neurons, the enhancement effect can be shown if the inhibitory component of the IC neuron is inhibited (disinhibition) relative to the excitatory component by the MOC. The variety of MOC-induced effects and even IC responses by themselves (tuning, monotonicity, etc.) can be explained by the vast possibilities of neuron innervation. However, the role of intrinsic properties within the neuron structure (relating to EPSP and IPSP) cannot be discounted. Varying combinations of excitatory and inhibitory time courses within a neuron model have been shown to account for the range of amplitude modulation rate tuning characteristic of IC cells (Nelson and Carney 2004). Likewise, the range of MOC-induced effects may be a result of these intrinsic interactions rather than across frequency convergences.

To test this hypothesis, a combination of computational models from different levels of the auditory system was used to replicate the parameters used in the Seluakumaran study. The well-established AN model by Zilany et al. (2009, 2014) was utilized as a front end for the Mao and Carney (2015) IC model. The IC model is a later version of the Nelson and Carney (2004) same frequency inhibition–excitation (SFIE) model which assumes innervation from the same characteristic frequency (CF) neuron for both excitatory and inhibitory components. This means that the SFIE model does not involve across frequency interactions. Approximately half of IC cells have band-pass tuning to amplitude modulated tone and function like physiological ‘modulation filters’ (Krishna and Semple 2000; Nelson and Carney 2007). The SFIE model employed in this study also models this behavior (Mao and Carney 2015; Nelson and Carney 2004). While there seems to be a mismatch by simulating a pure-tone stimulus experiment using a modulation filter IC model, we hypothesize that the modulations from onsets, offsets, and phase locking features of the neural response to pure tone may be sufficiently relevant in this setup. By extension, we predict that variations in these features across innervating AN fibers may partly be driving the diverse effects of MOC activation at the IC.

Methodology

Overview

To study the mechanisms behind the effects of MOC efferent stimulation at the IC, a combination of mathematical models of the AN and IC was used. The combined model was used to simulate neural responses for pure-tone stimulus with and without MOC stimulation as in Seluakumaran et al. (2008). For responses at the AN, another study of Guinan and Gifford (1988) was used for comparison. A wide range of model parameters (i.e., characteristic frequency (CF), spontaneous rate (SR) type, and best modulation frequencies (BMF); the modulation frequency eliciting the largest rate response) was simulated.

The combined model involved two major stages corresponding to significant points in the auditory pathway. A phenomenological model of the AN (Zilany et al. 2014, 2009) was employed as the front end; its output was then used to simulate IC responses at the second stage. The responses for midbrain IC neurons were simulated using versions of the same frequency inhibition–excitation (SFIE) model proposed by Mao and Carney (2015) and Carney et al. (2015). The details of the models and parameters used are covered in the following sections.

AN model

A well-established model of the AN (Zilany et al. 2014) was used to simulate responses of the AN fiber with and without MOC efferent stimulation. This model incorporated the many cochlear non-linearities, including two-tone suppression, compression, nonlinear tuning, and best frequency shifts with level. The responses produced from this model were previously validated by various physiological data (Bruce et al. 2003; Carney 1993; Zhang et al. 2001; Zilany and Bruce 2006, 2007). The architecture of the model consisted of a middle-ear filter followed by three parallel filter paths—C1, C2 signal paths and the control path. Many non-linearities are controlled by parameters in the control path, and this mechanism provided a simple way to manipulate non-linearity effects. The input required of the model was an instantaneous pressure waveform (Pascals) sampled at 100 kHz. To facilitate comparison, the stimuli used were pure tones (100 ms with 1 ms rise/fall time) similar to those in Seluakumaran et al. (2008). Pure tones of varying CFs (250 Hz–20 kHz) were tested to simulate AN response in the form of time-varying discharge rates (spikes/sec) for individual low, medium, and high SR fibers.
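The stimulus described above can be sketched as follows. This is only an illustration of the stated parameters (100 ms tone, 1 ms rise/fall, 100 kHz sampling, pressure in pascals); the raised-cosine ramp shape, the function name `pure_tone`, and the interpretation of level as RMS dB SPL re 20 µPa are assumptions, not details taken from the study.

```python
import math

FS = 100_000      # sampling rate in Hz, as required by the AN model
P_REF = 20e-6     # reference pressure for dB SPL, in pascals


def pure_tone(freq_hz, level_db_spl, dur_s=0.100, ramp_s=0.001, fs=FS):
    """Return a gated pure tone as a list of instantaneous pressures (Pa)."""
    n = int(round(dur_s * fs))
    n_ramp = int(round(ramp_s * fs))
    # Peak amplitude of a sinusoid whose RMS pressure matches the target level.
    amp = math.sqrt(2.0) * P_REF * 10.0 ** (level_db_spl / 20.0)
    tone = []
    for i in range(n):
        # Raised-cosine onset and offset gates (assumed ramp shape).
        if i < n_ramp:
            gate = 0.5 * (1.0 - math.cos(math.pi * i / n_ramp))
        elif i >= n - n_ramp:
            gate = 0.5 * (1.0 - math.cos(math.pi * (n - 1 - i) / n_ramp))
        else:
            gate = 1.0
        tone.append(amp * gate * math.sin(2.0 * math.pi * freq_hz * i / fs))
    return tone
```

A rate-level function would then be obtained by calling such a generator once per stimulus level and passing each waveform to the AN model.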

Several studies have successfully modeled MOC efferent stimulation on AN neurons (Brown et al. 2010; Ferry and Meddis 2007; Zilany and Bruce 2006) by adding an attenuator onto the basilar membrane component of their model framework. While originally designed to model the effect of hearing impairment and OHC damage, varying the cohc parameter of the model has been shown to function as an attenuator and represents the effects of gain reduction from MOC stimulation (Chintanpalli et al. 2012; Jennings et al. 2011; Smalt et al. 2014). The cohc parameter (values 0 to 1) controls OHC gain where the value ‘1’ indicates full OHC functionality and ‘0’ indicates complete OHC impairment. Changes to the value of cohc most notably result in shifting the rate-level function to higher levels (rightward on a typical graph), in a similar manner to MOC stimulation.

Here, a similar approach was adopted to simulate electrical MOC efferent stimulation. While other studies (Chintanpalli et al. 2012; Smalt et al. 2014) involved MOC reflex from both ipsilateral and contralateral stimulation, this study did not require this parameter. It must be noted, however, that MOC reflex is naturally stimulated by sound stimulus feedback rather than electrically. Simulation at this stage used a simple estimation of MOC effect intensity. A measure of the rightward shift using ∆L, which was introduced by Guinan and Gifford (1988), is useful to this end. Based on ∆L values obtained from Guinan and Gifford (1988), a standard cohc value of 0.4 was chosen and used across CFs and SR to model MOC-stimulated neurons. The standard cohc value produced shifts toward higher intensities in the RL response consistent with the average ∆L of 11.2–14.2 dB for fibers of different SR (Guinan and Gifford 1988).

The AN model used in this study was developed with parameters based on the published data from cats where possible. For application to human hearing, some parameters of the most recent version of this model have been adjusted to better match human anatomy and physiology (Ibrahim and Bruce 2010). These modifications include the middle-ear filtering (Pascal et al. 1998), the cochlear place-frequency map (Greenwood 1990), and the sharpness of cochlear frequency tuning (Shera et al. 2002). The human version was utilized in tandem with the cat model to obtain greater depth of insight during analysis of responses. Since the differences between the models are known and controlled, consequent differences in MOC effects can be more easily attributed. It must be noted that cat data for the AN are available for comparison, but primary IC data used for validation are from studies conducted on guinea pigs. The main structure and mechanism involved in the model (especially above the AN) are the same for both species versions. As such, broad patterns of similarity are expected across models. At the same time, it is hypothesized that finer details like effect magnitudes or CF-specific responses would be difficult to compare due to these differences.

IC modeling

The Mao and Carney (2015) SFIE model was employed in this study with minor adjustments made. The strengths of this model are its simplicity and ease of use. Essentially, the model was implemented as a sixth-order band-pass filter (cascading three second-order band-pass filters) with a Q value of 1. This filter can variably be centered on the best modulation frequency (BMF; the modulation frequency eliciting the largest rate response) which characterizes the IC neuron being simulated. To model IC neurons, the sixth-order band-pass filter is applied directly onto the AN discharge rate output response to obtain the IC synapse rate. This differs slightly from the implementation in Mao and Carney (2015) where an additional SFIE ventral cochlear nucleus (VCN) stage links the AN and the filter. The omission of the VCN stage did not significantly affect the IC as the function of the VCN in the model was to increase the synchrony to the stimulus envelope (Mao and Carney 2015; Salimi et al. 2017).
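As a rough illustration of the filtering step described above, the sketch below cascades three identical second-order band-pass sections (Q = 1) centered on the BMF and applies them to an AN rate waveform. The coefficient formulas are the widely used "Audio EQ Cookbook" biquad design, swapped in here as an assumption; the study's actual filter design, the function names, and the optional half-wave rectification are not taken from the source.

```python
import math


def bandpass_biquad(f0_hz, q, fs):
    """Second-order band-pass coefficients with unity gain at f0_hz."""
    w0 = 2.0 * math.pi * f0_hz / fs
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b = (alpha / a0, 0.0, -alpha / a0)
    a = (1.0, -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0)
    return b, a


def apply_biquad(x, b, a):
    """Filter the sequence x with one biquad section (direct form I)."""
    y = []
    x1 = x2 = y1 = y2 = 0.0
    for xn in x:
        yn = b[0] * xn + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        x2, x1 = x1, xn
        y2, y1 = y1, yn
        y.append(yn)
    return y


def bp_ic_response(an_rate, bmf_hz, fs, q=1.0, rectify=True):
    """Sixth-order band-pass: three cascaded second-order sections at the BMF."""
    b, a = bandpass_biquad(bmf_hz, q, fs)
    y = list(an_rate)
    for _ in range(3):
        y = apply_biquad(y, b, a)
    # Assumption: negative values are clipped so the output reads as a rate.
    return [max(0.0, v) for v in y] if rectify else y
```

Because a band-pass filter rejects the constant (DC) part of its input, a sketch like this responds mainly to the onset and offset of a sustained pure-tone response, which matches the intuition developed later in the text.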

The Mao and Carney model is a simplification of the Nelson and Carney (2004) model, intended to control the center frequencies and the BMFs of IC simulation more efficiently. It is necessary, however, to understand the mechanisms of the older model and its development in order to interpret and analyze the simulated results, which is the interest of this study. Nelson and Carney (2004) introduced a two-layer cascade model consisting of the VCN and the IC, with both layers incorporating the same frequency inhibition–excitation (SFIE) mechanism. This means that the VCN model is inhibited and excited by an AN fiber of the same CF, and the IC stage receives input from the VCN in a similar way. Originally, the Nelson and Carney (2004) model was developed to model the amplitude modulation tuning seen in physiological recordings of the IC. The modeling study demonstrated how the SFIE mechanism accomplishes this by varying the time courses of the excitatory and inhibitory postsynaptic influences on the IC cell: temporal interactions between the excitatory and inhibitory components result in an overlap between stimulus phase-locked peaks and inhibitory ‘valleys’ for a specific BMF. It is useful to know that faster acting inputs give rise to higher BMFs and slower inputs to lower BMFs, as BMF is used here as a more systematic variable for the simulation.

In 2015, the IC model was extended to simulate low-pass band-reject (LPBR) variants of IC neurons, which are separate from the band-pass (BP) variants that were previously modeled (Carney et al. 2015). The LPBR type cells were modeled using the BP-cell output as an inhibitory input to the excitatory cochlear nucleus input. Here, consistent with the BP variant model used, inhibitory input was unchanged, but the excitatory component was taken from the AN. Figure 1 illustrates the flow, from input stimulus to implementation, of the IC models described above.

Fig. 1

The schematic shows the flow of simulation. Pure tone stimulus is first fed to the AN model (Zilany et al. 2014). At the AN model, MOC stimulation is modeled by lowered cohc parameter (0.4). The output of the AN model is used as an input to two different types of IC cells. Band-pass (BP) IC cells are implemented using a sixth-order modulation filter tuned to a specific BMF. Band-reject (BR) IC responses are derived using the original AN model response as an excitatory component and the BP IC model as an inhibitory component. The bottom pane shows the simulated RL responses of each IC cell model type with varying BMF tunings

Using the outputs from the AN model as inputs into this model, a wide range of different BMF responses for both BP and LPBR cells were simulated for each input. The AN response for CF corresponding to the pure tone was used as both the excitatory and inhibitory inputs of the IC model. Systematically, responses were simulated for each combination of input across a BMF range of 1 Hz – 150 Hz (logarithmically spaced). The selected BMF range was consistent with physiology (Krishna and Semple 2000) and model limits.
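The systematic sweep described above can be pictured as a parameter grid. In the sketch below only the BMF range (1 Hz to 150 Hz, logarithmically spaced), the SR types, the two cell types, and the with/without-MOC conditions come from the text; the number of BMF steps and the particular CF values are hypothetical placeholders.

```python
from itertools import product


def log_spaced(f_min, f_max, n):
    """Return n values from f_min to f_max with a constant ratio between neighbours."""
    ratio = (f_max / f_min) ** (1.0 / (n - 1))
    return [f_min * ratio ** k for k in range(n)]


# BMF range from the text; the number of steps (16) is a hypothetical choice.
BMFS_HZ = log_spaced(1.0, 150.0, 16)
# Hypothetical subset of CFs within the 250 Hz to 20 kHz range used for the AN model.
CFS_HZ = [250.0, 500.0, 1000.0, 2000.0, 5000.0, 10000.0, 20000.0]
SR_TYPES = ["low", "medium", "high"]
CELL_TYPES = ["BP", "LPBR"]
MOC_STATES = [False, True]  # without and with simulated MOC stimulation

# One simulation per (CF, SR type, cell type, BMF, MOC state) combination.
GRID = list(product(CFS_HZ, SR_TYPES, CELL_TYPES, BMFS_HZ, MOC_STATES))
```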

Simulations from the model were first used to match the corresponding responses reported in the physiological study by Guinan and Gifford (1988) (AN) and Seluakumaran et al. (2008) (IC). Example responses reported in the study were matched according to the CF reported. Unknown characteristics of the fiber (i.e., SR, BMF, cell type, etc.) were systematically varied to find the parameter for best fit. A comparison of model responses to individual fiber responses and overall RL response types was done to evaluate the output of the model. Following that, an extensive simulation of a wide range of parameters, as summarized in Table 1, was done for later analysis.

Table 1 Summary of parameters and range used in model simulation

Data analysis

Rate-level (RL) responses of IC neurons are classified into two major types: monotonic and non-monotonic. The typical neural response, where firing rate increases with stimulus level, is termed ‘monotonic’. There are two subtypes of monotonic responses, differentiated by the rate response at higher stimulus levels: the rate of one subtype saturates at a point (saturating), while the other continues to increase steadily (non-saturating). The response of IC neurons where discharge rates decrease at higher sound intensities is termed ‘non-monotonic’. To be distinguished from random fluctuations, non-monotonic responses require a reduction of more than 25% from the maximum discharge rate (Rees and Palmer 1988). Non-monotonic responses also have two subtypes. If the rate of a non-monotonic RL response reduces to its original SR, it is classified as ‘complete’; if it does not, it is classified as ‘partial’. In the context of level coding, non-monotonic neurons are interesting as they respond most strongly at a certain stimulus level and more weakly at lower and higher intensities. This has led to non-monotonic neurons being described as tuned to a best level. In addition, neurons of this RL type can continue to code changes in intensity past their maximum rate, unlike their monotonic counterparts, which saturate. The emergence of non-monotonic neurons at stages above the AN adds a different dynamic to analyzing responses like those at the IC.
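The classification rules above can be sketched in a few lines. The 25% criterion follows the text; the tolerance used to decide that a response has returned to its SR (`complete_tol`) and the function name are hypothetical, and the saturating versus non-saturating split is left out because the text gives no numeric criterion for it.

```python
def classify_rl(rates, spont_rate, nm_criterion=0.25, complete_tol=0.05):
    """Classify an RL curve given rates ordered from lowest to highest level."""
    peak = max(rates)
    last = rates[-1]  # rate at the highest stimulus level
    if peak <= spont_rate:
        return "no driven response"
    # Fractional reduction from the maximum discharge rate.
    drop = (peak - last) / peak
    if drop <= nm_criterion:
        return "monotonic"
    # Assumption: "returns to SR" means within complete_tol of the driven range.
    if last - spont_rate <= complete_tol * (peak - spont_rate):
        return "complete non-monotonic"
    return "partial non-monotonic"
```

For example, a curve that peaks at 100 spikes/s and ends at 50 spikes/s with an SR of 0 would be labeled partial non-monotonic under these assumptions.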

A measure of monotonicity would be useful for analyzing the different RL responses. One such measure is the monotonicity index (MI). Watkins and Barbour (2011) defined MI as follows:

$$\text{MI} = \frac{\text{rate}_{\text{maxlevel}} - \text{rate}_{\text{spontaneous}}}{\max(\text{rate}) - \text{rate}_{\text{spontaneous}}}.$$
(1)

The maximum level rate is the rate of the fiber at the highest sound intensity, which is distinct from the maximum rate in the RL response. In this study, however, a related measure of non-monotonicity (NM) was used. It is a measure of the percentage of reduction of discharge rate from the maximum rate and is defined as follows:

$$\text{NM} = \frac{\max(\text{rate}) - \text{rate}_{\text{maxlevel}}}{\max(\text{rate}) - \text{rate}_{\text{spontaneous}}}.$$
(2)

The measure of non-monotonicity is related to the MI in that the sum of the two for a RL response will be equal to 1. This measure was used instead of the MI to highlight the non-monotonic responses within the simulated results.
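A minimal sketch of the two measures, assuming the RL curve is given as a list of rates ordered from the lowest to the highest stimulus level (so the last entry is the rate at maximum level). The function names are illustrative only.

```python
def monotonicity_index(rates, spont_rate):
    """MI: driven rate at the highest level relative to the maximum driven rate."""
    driven_max = max(rates) - spont_rate
    if driven_max <= 0:
        raise ValueError("maximum rate must exceed the spontaneous rate")
    return (rates[-1] - spont_rate) / driven_max


def non_monotonicity(rates, spont_rate):
    """NM: reduction from the maximum rate relative to the maximum driven rate."""
    driven_max = max(rates) - spont_rate
    if driven_max <= 0:
        raise ValueError("maximum rate must exceed the spontaneous rate")
    return (max(rates) - rates[-1]) / driven_max
```

Since the two numerators add up to the shared denominator, MI + NM = 1 for any RL curve; a purely monotonic curve therefore has NM = 0 and MI = 1.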

Another metric introduced in this study is saturation (S). In the context of this study, S measures how saturated a neuron is based solely on its pattern of response. The change in S was used to evaluate the effect of MOC stimulation on how saturated the response of the neuron is. The advantage of this metric is that it can be used for both monotonic and non-monotonic RL types. It is based on the perspective that a change from a saturating monotonic response to a non-monotonic type is a reduction in saturation. S is defined as follows:

$$\text{Saturation}\ (S) = \frac{\text{rate}_{\text{average}} - \text{rate}_{\text{spontaneous}}}{\max(\text{rate}) - \text{rate}_{\text{spontaneous}}}.$$
(3)

The idea behind the measure S is to find the proportion of space a RL curve occupies in comparison to the area which it would occupy as a square (fully saturated). The more saturated a curve is, the greater its area by proportion. Here, the average rate subtracted by the base (SR) is used to represent the area of the curve. Accordingly, the maximum rate subtracted by the SR represents the fully saturated area. Figure 2b helps to illustrate how curves increase in saturation from non-saturating monotonic to saturating monotonic and decreases as it becomes non-monotonic. In this study, S values of normal IC responses were subtracted from those of MOC-stimulated responses which are shown in Fig. 4a bottom panel.
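The S metric can be sketched as below, assuming the RL curve is sampled at equally spaced stimulus levels so that the mean rate stands in for the area under the curve. The function names and the toy curves in the usage note are illustrative only.

```python
def saturation(rates, spont_rate):
    """S: mean driven rate as a fraction of the maximum driven rate."""
    driven_max = max(rates) - spont_rate
    if driven_max <= 0:
        raise ValueError("maximum rate must exceed the spontaneous rate")
    average = sum(rates) / len(rates)
    return (average - spont_rate) / driven_max


def saturation_change(rates_normal, rates_moc, spont_rate):
    """Change in S: MOC-stimulated value minus the normal (unstimulated) value."""
    return saturation(rates_moc, spont_rate) - saturation(rates_normal, spont_rate)
```

With an SR of 0, a linear ramp such as `[0, 25, 50, 75, 100]` gives S = 0.5, a saturating curve such as `[0, 50, 100, 100, 100]` gives S = 0.7, and a non-monotonic curve such as `[0, 50, 100, 75, 50]` gives S = 0.55.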

Fig. 2

a Definitions of rates used for determining non-monotonicity (NM) and saturation (S). The maximum rate is differentiated from the rate at maximum level in non-monotonic RL responses. b Depicts normalized monotonic RL curves (in blue) of increasing S value as labeled. Non-monotonic RL curves (green) have higher S values and are viewed as more saturated. c Examples of RL responses simulated using the AN-IC model. Three of the four observed RL response types are replicated through simulation—monotonic unsaturated, monotonic saturating, and partial non-monotonic. Complete non-monotonic RL response was not successfully replicated using this model

To understand the other diverse aspects of MOC stimulation, a systematic approach was used to quantify them. Based on the description of effects by Seluakumaran et al. (2008), there are two general categorizations of effects which we can make: first are shifts to higher stimulus intensities (∆L) like those introduced in Guinan and Gifford (1988) and the second are changes in maximum rate termed as enhancement (E). The maximum rate is the highest spike rate in the RL curve. The difference between the maximum rate of an IC neuron after MOC stimulation and before is E and is referred to as a percentage of the max rate. A positive E would mean that MOC stimulation enhanced firing rates, whereas a negative E would mean reduction. In the case of regular shifts to high intensities, E would be close to 0. To account for non-monotonic IC neurons, ∆L values on the other hand involve first obtaining the midway point, Lmax/2. Lmax/2 is defined as the lowest stimulus level at which the RL reaches half of its maximum rate. The difference between IC neuron’s Lmax/2 after and before MOC stimulation (in dB) is ∆L. By quantifying these values for the vast range of IC neurons being simulated, the effect of MOC stimulation can be thoroughly looked at.
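The two effect measures can be sketched as follows. Two details are assumptions rather than statements from the text: E is expressed as a percentage of the maximum rate before MOC stimulation, and Lmax/2 is located by linear interpolation between the sampled levels. Function names are illustrative.

```python
def enhancement_pct(rates_normal, rates_moc):
    """E: change in maximum rate with MOC stimulation, in percent.

    Assumption: normalized by the maximum rate without MOC stimulation.
    """
    return 100.0 * (max(rates_moc) - max(rates_normal)) / max(rates_normal)


def level_at_half_max(levels_db, rates):
    """Lmax/2: lowest level at which the RL curve reaches half its maximum rate."""
    half = max(rates) / 2.0
    for i, rate in enumerate(rates):
        if rate >= half:
            if i == 0:
                return levels_db[0]
            # Linear interpolation between the two bracketing levels (assumed).
            prev = rates[i - 1]
            frac = (half - prev) / (rate - prev)
            return levels_db[i - 1] + frac * (levels_db[i] - levels_db[i - 1])
    raise ValueError("RL curve never reaches half of its maximum rate")


def delta_l(levels_db, rates_normal, rates_moc):
    """Shift of Lmax/2 (dB): value after MOC stimulation minus value before."""
    return level_at_half_max(levels_db, rates_moc) - level_at_half_max(levels_db, rates_normal)
```

For a curve that is simply displaced by 10 dB toward higher levels without any change in its maximum rate, this sketch yields ∆L = 10 dB and E = 0, matching the "regular shift" case described above.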

Results

Matching physiological data

The first goal in this study was to match simulated responses with example responses reported in the study by Seluakumaran et al. (2008). Model predictions showed a good fit and captured the overall RL response type. As seen in Fig. 3, different neuron RL types and MOC effects could be predicted through an appropriate combination of cell type, spontaneous rate, and BMF for the specific CF fibers of the example responses. One limitation of the model simulations is a considerably weaker enhancing effect than in the empirical data (Fig. 3c, f). While this is true for the specific CFs, larger enhancing effects can be obtained without this constraint. Another interesting observation is the constant (with and without MOC) best-level coding response in Fig. 3c, which the model did not replicate. Despite the general fit of simulated responses, responses from physiological recordings tend to have a higher threshold when responding to pure tones at the IC. These differences, however, could be due to species differences between guinea pigs (physiological data) and cats (model) or to other types of circuitry. Overall, however, there was a good fit between model responses and available data.

Fig. 3

Guinea Pig physiological data from Seluakumaran et al. (2008) are matched with cat model predictions for neurons of the same CF. Neuron types were limited to either BP or LPBR cells with BMF range between 1 and 150 Hz and low-, mid-, or high-corresponding AN fiber SR. The combination of model predictions in the figure are listed as follows: a LPBR cell, BMF 8 Hz, Mid AN SR; b LPBR cell, BMF 64 Hz, Low AN SR; c BP cell, BMF 48 Hz, High AN SR; d BP cell, BMF 128 Hz, High AN SR; e BP cell, BMF 132 Hz, Mid AN SR; f BP cell, BMF 48 Hz, High AN SR

RL response patterns: monotonicity and saturation

The IC cells simulated across a wide range of parameters produced a variety of different RL types consistent with physiological data. Figure 2c exemplifies the different responses obtained in this study. Three of the four types of RL responses observed in physiological recordings of the IC (monotonic saturating, monotonic unsaturated, and partial non-monotonic) were replicated in the model IC responses; the exception was the complete non-monotonic subtype. These RL subtypes are also shown in Fig. 3 (monotonic saturating: e; monotonic unsaturated: a, b; partial non-monotonic: c, d, f). Using a simple SFIE neural circuitry, most of the RL response types were reproducible. The BP IC cells showed a variety of RL responses, but the LPBR cells were uniformly monotonic and produced either saturating or non-saturating variants. In addition, the LPBR cell types had little-to-no variation, even across the subtypes of monotonic RL responses: they produced monotonic saturating responses when innervated by high SR AN fibers and monotonic unsaturated responses with medium and low SR fibers. In spite of having more inhibition within their circuitry, LPBR cells likely produce only monotonic responses because the inhibitory influence is very weak. If, as hypothesized, BP IC cells respond well only to onsets and offsets of pure-tone sound, then their influence as the inhibitory component could be minimal under a pure-tone paradigm.

NM measurements of BP IC cell responses revealed certain regions of CF/BMF combinations with non-monotonic or monotonic tendencies (Fig. 4a, top panel). In particular, non-monotonic responses concentrated at regions of higher BMF, specifically for high CFs (beginning at 3 kHz) and some very low CFs (125–250 Hz). In addition, non-monotonicity increased with increasing BMFs in these CFs. These patterns, however, were specific to model cells innervated by high SR fibers. In contrast, low-to-medium SR innervated cells exclusively resulted in monotonic RL responses, similar to LPBR cell types.

Fig. 4

a The measure of non-monotonicity (NM) is mapped across the combinations of CF and BMF for fibers of different SR for BP IC cells. The difference in saturation (S) of the response patterns with and without MOC stimulation reveals an overall decrease in saturation, which is larger at higher CFs. This is the same for all SR fiber types. b The effect of MOC efferent stimulation at the level of the IC was measured (percentage enhancement) and mapped across the different combinations of CF and BMF. Positive enhancement is observed particularly in mid-CF ranges (750 Hz–2 kHz), while negative values (i.e., reduction) are observed elsewhere. Simulation using the human model produces much stronger enhancement in those regions in comparison to the cat model

The highest NM values produced by model cells were slightly below 0.5 and were far from being complete non-monotonic RL responses (value of 1). The majority of CF/BMF combinations resulted in more monotonic responses and even non-monotonic responses did not have very big reductions in discharge rates.

The bottom panel of Fig. 4a shows a universal decrease in saturation values when MOC was stimulated. This reduction in saturation was larger at higher CF regions (values up to −0.10) and was the same across different SR fibers. Even low and medium SR fibers, whose responses were universally monotonic and would typically not be considered saturated, had reductions in their S values. This is related to the shifts to higher stimulus levels introduced by MOC stimulation, which the S metric accounts for.

MOC efferent stimulation effects

Percentage enhancement values mapped for different CF/BMF combinations in the cat model are discussed here. MOC stimulation enhanced or reduced the firing rate of model IC cells by up to 15%. Most notable in the mapping was a region of positive enhancement (Fig. 4b). At mid-to-high BMFs (> 60 Hz) for neurons with CFs of 750 Hz–2 kHz, MOC stimulation enhanced the firing rate of model IC neurons. This trend was mostly consistent for all SR fiber innervations. Areas of significant reduction were limited to higher CFs (> 5 kHz), especially for high and low SR fiber innervation.

In human model cells, the pattern of regions of enhancement and reduction remained consistent. Interestingly, however, the region of significant enhancement in human model cells was at a higher CF region (1–4 kHz). In addition, the enhancement was visibly sharper and stronger.

While changes in maximum rate get complex at the IC level, shifts of RL to higher levels (∆L) from the AN are largely preserved. With a cohc value of 0.4, ∆L ranged from 1 to 16 dB at the IC and were CF-dependent (bigger shifts at higher CFs). This was similar to empirical data from Guinan and Gifford (1988) and corresponding AN simulations.

Overall, simulation using the model resulted in responses that were a good fit with the physiological data. Overall trends of RL shapes and MOC effects were replicated. One key finding from the analysis using NM and S metrics was the increasing non-monotonic nature of IC neurons with increasing BMF and CF for BP cells. LPBR cells, however, were uniformly monotonic. While the rightward shift effect (CF-dependent) of MOC stimulation in the AN seemed to carry forward without much modification at the IC, enhancing or reductive effects were more combination-specific. The observed rates of central CF regions were enhanced under MOC stimulation, and they were even more pronounced in human cells compared to cat cells.

Discussion

As shown by Seluakumaran et al. (2008), the effects of MOC stimulation at the level of the auditory midbrain differed from those at the periphery. It was previously thought that a complex innervation pattern in the ascending auditory pathway was necessary to explain the varied MOC efferent-stimulated responses. In particular, the enhancement effect observed at the IC requires a more complex explanation as MOC effects at the periphery are inhibitory in nature. As mentioned, Seluakumaran et al. (2008) explained that the suppressive effects of the periphery were directed towards the inhibitory component of the ascending pathway, causing a disinhibition and, consequently, an enhancement in the responses at the level of the IC. For this to be possible, the suppressive effects on the excitatory and inhibitory components must differ, as offered by off-frequency inhibition, for example. However, within the SFIE model employed in this study, the excitatory and inhibitory inputs are identical. Still, the simulation of the IC model responses in this study shows that the simple SFIE circuit can replicate all the known MOC effects across RL subtypes (shifts, reduction, and enhancement). The main limitation of the model was not in replicating MOC effects but in simulating the complete non-monotonic RL subtype. The results from the model showed that a simple neural circuitry is sufficient to account for varied MOC stimulation responses at the midbrain. These results suggest that a more complex central interaction is likely not essential for obtaining the varied effects, but could be required to obtain complete non-monotonic RL responses.

Non-monotonic tendencies: role of CF and BMF tuning

Mao and Carney (2015) simplified the function of the IC, and identified it as a form of modulation filtering arising from the interactions of the excitatory and inhibitory time constants. The predominance of non-monotonic neurons at higher CFs can be explained using this simplification. Pure-tone stimuli do not have much modulation which IC neurons can respond to. The typical AN responds to pure tone with an onset spike before a tonic continuous train of neuron spikes. Each segment of the typical response would generally respond to a specific BMF range. First, the tonic portion corresponds to very low BMFs in the range of 1–20 Hz. The onset portion and the end of the spike train in contrast is much sharper and corresponds to higher BMF tunings.

As stimulus intensity increases, the onset spike of the AN responds more strongly and results in a monotonic increase in overall rate. This response was true for the majority of neurons. Non-monotonicity within IC responses could arise from differing effects of stimulus intensity for neurons. For AN neurons with high CFs, the onset spike was more prominent as observed from simulated responses (Fig. 5a). In cases of higher CFs where the onset spike was already strong at lower stimulus levels, the onset spike often invoked a refractory period (example shown in Fig. 5a green box) disrupting the tonic portion of the response at higher stimulus levels. The presence of the refractory period in the AN response changes the shape of the onset spike, which corresponds to different BMFs. As a result, IC neurons tuned to modulation frequencies most sensitive to the onset spike at lower stimulus levels would experience reduction in rates when the shape of the onset spike changes with higher intensities. This results in non-monotonic RL responses like the IC neuron demonstrated in Fig. 5a.

Fig. 5

The simulated responses at the level of the AN and the IC (specific BMFs) for pure tones of different frequencies presented at CF. a Highlights the change in onset spike at the AN with increasing intensity (30 dB vs 90 dB) for higher CF neurons (5 kHz shown). A clear refractory period develops (shown within green box), which lowers the rate of response at specific BMFs (BMF 160 Hz shown here). b Showcases the disruption of periodicity (highlighted within green box) of low CF (250 Hz) AN fiber responses due to the effects of saturation at higher intensities (60 dB vs 90 dB). This reduces the responses of specifically tuned IC neurons (128 Hz shown here) and gives rise to non-monotonic responses. c Shows the effect of MOC efferent stimulation on the responses of AN fibers with different CFs (500 Hz, 1.2 kHz, and 5 kHz) and IC neurons (BMF 128 Hz) that were specifically tuned to pure tones at 60 dB. The red lines, which indicate MOC-activated responses compared to those without MOC (blue), highlight how MOC sharpens transient portions of the stimulus, particularly at the stimulus offset. Due to the combination of MOC effect strength (stronger at higher CFs) and offset responses across CF (faster at higher CFs), only mid-range CFs have sufficient contrast to produce a significantly enhanced rate at the IC neurons (green box)

The case for non-monotonic neurons with low CFs was different. The onset spike for low CF neurons was hardly prominent. Instead, neurons with low CFs had prolonged tonic spike trains, which were highly phase locked due to the periodicity of low-frequency, pure-tone stimuli. As such, IC neurons tuned to the specific low BMFs responded well to low CF AN neurons. At higher stimulus levels, however, when the AN neuron began to saturate, the periodicity of the AN spikes gradually became distorted. The effect of saturation on the AN responses is clearly seen in Fig. 5b. As a result, the rate of IC neurons at higher levels also decreased, resulting in a non-monotonic RL response.
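How saturation can distort periodicity may be sketched with a toy calculation. This is an illustration under stated assumptions, not the AN model used here: the rate waveform is a half-wave rectified 250 Hz tone, saturation is stood in for by a hard limit at `max_rate`, and `sat_db` (the level at which the peak drive just reaches that limit) is a hypothetical parameter. Synchrony to the tone frequency is measured with vector strength.

```python
import math

def an_rate(level_db, freq=250.0, dur=0.04, dt=1e-5, sat_db=70.0, max_rate=300.0):
    """Toy AN rate waveform: half-wave rectified tone, hard-limited at max_rate.

    sat_db and max_rate are hypothetical values chosen for illustration.
    """
    gain = max_rate * 10.0 ** ((level_db - sat_db) / 20.0)
    n = int(round(dur / dt))  # 0.04 s covers exactly 10 cycles at 250 Hz
    return [min(max_rate, gain * max(0.0, math.sin(2 * math.pi * freq * i * dt)))
            for i in range(n)]

def vector_strength(rate, freq, dt):
    """Synchrony of a rate waveform to `freq` (0 = none, 1 = perfect)."""
    re = sum(r * math.cos(2 * math.pi * freq * i * dt) for i, r in enumerate(rate))
    im = sum(r * math.sin(2 * math.pi * freq * i * dt) for i, r in enumerate(rate))
    total = sum(rate)
    return math.hypot(re, im) / total if total > 0 else 0.0

vs_60 = vector_strength(an_rate(60.0), 250.0, 1e-5)  # below the limit
vs_90 = vector_strength(an_rate(90.0), 250.0, 1e-5)  # heavily clipped
print(round(vs_60, 3), round(vs_90, 3))
```

Under these assumptions the clipped 90 dB waveform is closer to a square wave and is less synchronized to the tone than the unclipped 60 dB waveform, which is the kind of periodicity loss that would reduce the drive to an IC neuron tuned to that periodicity.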

Effects of MOC stimulation

When MOC efferent stimulation was simulated for the IC cells, shifts like those seen at the level of the AN were observed. This was expected and is consistent with the current knowledge that there are more MOC projections at middle and basal areas of the cochlea, corresponding to AN fibers tuned to higher frequencies (Guinan et al. 1984; Liberman et al. 1990; Wiederhold and Kiang 1970). It is likely that the relative magnitude of the shifts was preserved from the periphery through to the midbrain.

As discussed earlier, enhancing effects are not as simple to explain given that complex innervation patterns were not part of the model. IC cells functioning like a filter indicate that fluctuating and transient portions of stimulus would be most relevant to the IC. An explanation can then be derived for these effects by examining MOC influence on the onset and offset of the AN response. For all simulated AN fibers, the decay of neuronal spikes at the end of the tone stimulus was much faster when MOC was stimulated. This quicker decay has been indicated for AN responses to clicks in physiological studies (Guinan et al. 2012). This faster decay at stimulus offset can result in enhancement of IC responses for neurons which are tuned to specific BMFs that favor the changes from MOC activation. This is demonstrated in Fig. 5c and also shows why enhancement was only seen in mid-range CF neurons. While MOC effects are generally stronger at high CFs, the contrast of rate of decay at the offset favors mid-range CF neurons. As Fig. 5c also shows, the rate of AN response decay across CF (0.5 kHz, 1.2 kHz, and 5 kHz shown) was already faster at higher CF neurons even without MOC activation. MOC influence quickens this decay even more, but the difference of rate of decay for high CF neurons did not produce a sufficiently large effect to enhance overall rate of IC responses, since the rate of decay was already fast to begin with. The case of low CF neurons is similar despite a slow offset decay without MOC stimulation. The MOC-induced difference was not sufficient, because the effect of MOC activation was much weaker. Mid-CF neurons, however, provided an ideal combination of response at stimulus offset (decay rate) and contrast in MOC inhibition effects; this combination produced IC neurons with enhancement when MOC is stimulated. 
The critical difference in decay rate for enhancement was only achieved above a certain stimulus intensity, however, and is only reflected in neurons tuned to specific BMFs. MOC stimulation still inhibited responses at lower intensities and in neurons of other tunings, which results in shifts or reductions of the IC RL curve. The higher magnitude of the enhancement effect in simulated human cells, compared to cats, lends credibility to this explanation. The sharper tuning of human auditory filters should contribute to sharper contrast for MOC effects due to the specificity of tuning of each fiber.

It is significant that reductive MOC activation effects at the AN can translate to a complex representation at the IC. The implication is that existing cues from a single channel are potentially responsible for such IC responses, rather than solely cross-channel interactions. What seems to drive much of the variety in IC RL responses, as well as in MOC effects, are the transient portions of the stimulus, which in a pure-tone paradigm are the onset and offset. It would be interesting then to see the effects of MOC on more complex stimuli with more transients, such as amplitude-modulated tones or speech. For the purpose of discussion, Fig. 6 shows the response of the AN-IC model to a 3 kHz carrier tone modulated by a 60 Hz sine wave. The transient portions of the AN response were not pronounced without MOC activation. It appears that these cues were drowned out by neuron saturation, even at 60 dB SPL. Here, it is interesting to note the link between the reduction in saturation by MOC, as shown in Fig. 4a (bottom), and its relevance to improving the contrast of transient stimuli. MOC seems to have shifted the operating range of AN neurons such that the contrast of the response was enhanced. This effect of MOC on the transient portions of the stimulus gives insight into how MOC could be involved in processing more realistic and complex stimuli such as speech, where fluctuating portions of the signal are prominent, key features. Given that the fluctuating parts of speech are very important for encoding information (Fant 1973), MOC could play a prominent role in the process of understanding speech. Its potential relevance to real-life situations with high neuron activity (competing sounds, wide spectral content, or high stimulus levels), where shifts in range would be useful, is also of note.
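The link between a shifted operating range and improved modulation contrast can be sketched with a static rate-level function. This is a simplified illustration, not the AN model used in this study: the sigmoid shape, its midpoint and slope, the modulation depth of 0.8, and the 20 dB shift standing in for MOC activation are all assumed values.

```python
import math

def rate_level(level_db, midpoint_db, slope_db=5.0, max_rate=200.0):
    """Sigmoidal rate-level function (spikes/s) that saturates above the midpoint."""
    return max_rate / (1.0 + math.exp(-(level_db - midpoint_db) / slope_db))

def am_envelope_db(carrier_db, depth, fm, t):
    """Instantaneous envelope level (dB) of a sinusoidally amplitude-modulated tone."""
    return carrier_db + 20.0 * math.log10(1.0 + depth * math.sin(2 * math.pi * fm * t))

def modulation_contrast(rates):
    """Michelson-style contrast of a rate waveform: (max - min) / (max + min)."""
    hi, lo = max(rates), min(rates)
    return (hi - lo) / (hi + lo)

# 100 ms of a 60 dB carrier modulated at 60 Hz (depth 0.8 is an assumption).
times = [i * 1e-4 for i in range(1000)]
levels = [am_envelope_db(60.0, 0.8, 60.0, t) for t in times]

# Without MOC the assumed midpoint is 30 dB, so the fiber is saturated at 60 dB.
contrast_no_moc = modulation_contrast(
    [rate_level(level, midpoint_db=30.0) for level in levels])

# MOC activation is modeled only as a hypothetical 20 dB rightward shift.
contrast_moc = modulation_contrast(
    [rate_level(level, midpoint_db=50.0) for level in levels])

print(contrast_no_moc, contrast_moc)
```

With these assumed values the unshifted function sits on its saturated plateau and barely follows the envelope, while the shifted function places the envelope on its sloping portion and yields a much larger contrast. A static nonlinearity ignores adaptation and other AN dynamics, so the sketch captures only the operating-range argument.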

Fig. 6

A modulated signal (3 kHz carrier at 60 dB) (a) and the responses to it of AN (b) and IC neurons (c BMF 32 Hz and d BMF 128 Hz) with and without MOC stimulation. When MOC is not activated (blue), the modulating portions of the stimulus are not pronounced in the AN response and are drowned out by the effects of saturation. MOC activation (red) significantly enhances the contrast in the modulating portions represented in the IC responses

Conclusion

This study demonstrated that the variety of MOC-induced effects at the level of the IC can be modeled using an SFIE model without invoking across-frequency interactions, which were previously thought to be necessary. Interactions across frequency may instead be more pertinent to producing IC neurons with complete non-monotonic RL responses, which could not be simulated using the model. While varied, the effects of MOC stimulation were shown to reduce the saturation of neurons across all the combinations. The variations in these effects under a pure-tone paradigm are related to differences in neuron responses to the onset and offset portions of the stimulus. This implies that stimulus transients may be a main feature with regard to the role of MOC in the auditory system.