Introduction

Detecting novel, unexpected stimuli in an ever-changing environment is critical for an animal’s survival, as such deviations may indicate events of behavioral importance. Given this, an essential task for the brain is to detect salient, novel information while adapting to, and ignoring, irrelevant background noise. At the neuronal level, adaptation is often defined as decreased neuronal responsiveness to repetitive stimulation, which occurs across species and sensory modalities. Adaptation to repeated sounds while maintaining responsiveness to novel ones is known as stimulus-specific adaptation (SSA) and is regarded as one of the mechanisms that give rise to novelty or change detection (Ulanovsky et al. 2003; Yu et al. 2009). Considerable attention has been given to studying SSA with two tones in the frequency domain along the auditory pathway, including the auditory cortex (Ulanovsky et al. 2003; Szymanski et al. 2009; Von Der Behrens et al. 2009; Antunes et al. 2010; Farley et al. 2010; Taaseh et al. 2011; Fishman and Steinschneider 2012; Nieto-Diego and Malmierca 2016; Wang et al. 2019), medial geniculate body (Anderson et al. 2009; Yu et al. 2009; Antunes et al. 2010; Bäuerle et al. 2011; Richardson et al. 2013; Antunes and Malmierca 2014; Duque et al. 2014; Rui et al. 2018), and inferior colliculus (Malmierca et al. 2009; Zhao et al. 2011; Ayala and Malmierca 2012; Ayala et al. 2012; Pérez-González et al. 2012; Anderson and Malmierca 2013; Duque et al. 2016). However, little work has been devoted to complex sounds (Nelken et al. 2013; Klein et al. 2014), and no systematic assessment of SSA has been done with ecologically relevant, natural sounds. It therefore remains unknown whether SSA is preserved for natural sounds.

In addition to pure tones, SSA has also been studied with manipulations of sound intensity (Duque et al. 2016), auditory-motion direction (Wasmuht et al. 2017), and close-field spatial cues (Reches and Gutfreund 2008; Xu et al. 2014). Only one study, conducted in the thalamic reticular nucleus, has examined spatial locations in free-field natural environments (Xu et al. 2017). Spatial perception guides attention and improves the detection, segregation and recognition of sounds (Bregman 1990; Roman et al. 2003), which makes detection of a novel sound location an important task. However, how spatial SSA is represented in the auditory cortex of awake subjects remains unclear.

In natural environments, a novel auditory object usually occurs with multiple, congruent deviance cues such as location, spectrotemporal characteristics, and intensity. Most existing research has focused on only one type of cue. However, one study in owls showed that congruent bimodal deviant stimuli evoked a stronger response than a unimodal deviant stimulus, which was characterized as an enhancive effect (Reches et al. 2010), suggesting that the ongoing task of detecting unexpected events benefited from the integration of visual and auditory information. An important question that remains to be addressed is whether SSA can integrate different types of cues within a single modality (e.g., auditory identity and location).

In the work presented here, we recorded single-unit (SU) activity in the auditory cortex (AC) of awake rats using three different types of oddball paradigms with two natural sounds presented at two locations. The purpose of this study was threefold: (1) to characterize SSA using the two natural sounds, (2) to probe SSA in the spatial domain, and (3) most importantly, to explore the integration in SSA between sound location and sound identity. Our results indicate that the AC exhibited a small SSA for the natural sounds and a comparatively strong SSA for spatial locations. A deviant sound at a deviant location evoked a stronger response than a stimulus that was deviant in only one of these cues.

Methods

Subjects and surgery

Eight adult male Wistar rats (280–360 g) with clean external ears were used in the present study. Briefly, each animal was anesthetized with pentobarbital sodium (40 mg/kg); atropine sulfate (0.05 mg/kg, s.c.) was administered 15 min prior to anesthesia to inhibit tracheal secretion. A local anesthetic (xylocaine, 2%) was liberally applied to the wound to reduce pain. A head fixation bar was then implanted on top of the skull with dental cement and six titanium screws. The animal was allowed to recover for 3 days post-op, after which it was habituated to head fixation with a customized apparatus using our standard procedure (Schwarz et al. 2010). Habituation lasted at least 10 days, after which the animal underwent a craniotomy to expose the auditory cortex (3–6 mm posterior to bregma). All anesthetic and analgesic conditions were identical to those of the first surgery. Afterwards, two-component silicone gel (Kwik-Sil™, World Precision Instruments) was applied to seal the craniotomy and protect the cortex. Extracellular recordings began 3 days after the craniotomy. Animal weight was monitored daily to ensure that all subjects were in good physical condition. All animal procedures were approved by the Animal Subjects Ethics Committees of Zhejiang University.

Recording

To obtain well-isolated single units (SUs), tungsten microelectrodes (A-M Systems, Sequim, WA) with high impedances (10–12 MΩ) were used. On each recording day, a single electrode was used and aligned to a fixed metal point mounted in the bone, so that the coordinates were kept consistent across recording days throughout the experiment. This technique enabled us to reconstruct a physiological map containing many penetrations. For all recordings, electrodes were positioned with a stepping-motor microdrive, which was controlled from outside the soundproofed room, and inserted into the AC according to a standard rat brain atlas (Paxinos and Watson 2005). The signal recorded by each microelectrode was amplified, filtered (300 Hz–5 kHz) and stored using TDT software (OpenEX, TDT). Rigorous spike sorting was performed offline using template matching implemented in Spike 2 (CED, Cambridge, England). Recorded action potentials were considered to belong to a single unit only when all spike waveforms were identical and clearly separable from other, smaller units and from background noise. All recordings presented in the current study were made from the primary auditory cortex (A1) and anterior auditory field (AAF) in the left hemisphere, identified by the tonotopic gradient of characteristic frequencies (CF) (Horikawa et al. 1988; Profant et al. 2013).

Acoustic stimulation

Three natural sounds were recorded with a ¼″ condenser microphone (Brüel & Kjær 4954, Nærum, Denmark) and a PHOTON/RT analyzer (Brüel & Kjær) at a 96 kHz sampling rate. Two sounds were rat vocalizations (S1 and S3) and the other was knocking on a door (S2) (Fig. 1a). To minimize background noise, each vocalization was recorded while a single rat was in a cage located in the soundproof room. The recorded sounds were played through magnetic speakers (MF1, TDT) with TDT systems at a 96 kHz sampling rate. The mean sound pressure level of each sound was calibrated and set to 55.5 dB SPL at the center of the rat’s head, and the peak levels were 66, 76, and 70 dB SPL for S1, S2, and S3, respectively. In addition to these three natural sounds, we also presented tone bursts. In summary, the three stimulus protocols used in this study were as follows:

Fig. 1

Natural sound stimulation and experimental setup. a The power spectra of the three natural sounds used in the experiment. S1: one rat vocalization; S2: door knocking; S3: another rat vocalization. b Two natural sounds (S1 in red; S2 in blue) were presented at two locations (indicated by the directions of the cartoon speakers) and formed 12 blocks of stimuli in three different oddball paradigms: identity oddball (a–d); spatial oddball (e–h); and integrative oddball (i–l). The 12 blocks of stimuli were randomly presented at an inter-block interval of 3.5 s. The bottom schema shows the timing of each block, with black indicating sound and white indicating silence. c Schema of the sound comparison oddball. Four sounds (S1 in red; S2 in blue; Tone 1 in black; Tone 2 in gray) were presented at the contralateral site and formed two oddball paradigms (identity oddball with S1 and S2; pure tone oddball with Tones 1 and 2) with four blocks of stimuli. The four blocks were randomly presented at an inter-block interval of 3.5 s

  1. Frequency screening procedure. To determine the frequency response area (FRA), tones were presented in random order at the right (contralateral) position (90°), with five repetitions at each combination of frequency (0.5–48 kHz in 26 logarithmic steps) and intensity (5–65 dB SPL in 10-dB steps). Tones were 100 ms long with a 5-ms rise/fall time and an inter-stimulus interval of 300 ms.

  2. Integrative oddball paradigm (IOP). We used two positions (contralateral and ipsilateral sites) and two natural sounds (S1 and S2 in Fig. 1a) to design the oddball paradigm (Fig. 1b). Three different oddball paradigms were developed with 12 blocks of stimuli: identity oddball, spatial oddball and integrative oddball. Each block consisted of seven successive sounds with a 300-ms inter-stimulus interval (ISI). The first six sounds were identical and presented at the same location, while the seventh sound was either a deviant sound at the same location (Fig. 1b, a–d; identity oddball), the same sound at the deviant location (Fig. 1b, e–h; spatial oddball), or a deviant sound at the deviant location (Fig. 1b, i–l; integrative oddball). The seventh stimulus was followed by a period of silence (1600 ms) (Fig. 1b). The 12 blocks were randomly presented. These two natural sounds were abundant in the laboratory environment and mimicked two common scenarios in the oddball paradigm: (1) the rat was surrounded by repetitive rat vocalizations in the colony when sudden door knocking occurred, and (2) a sudden rat vocalization occurred during a period of repetitive door knocking. This paradigm was designed with three goals in mind: (1) to examine natural sound SSA; (2) to probe spatial SSA in the AC; and (3) to explore neuronal responses to integrative deviance cues.

  3. Sound comparison oddball (SCO). This protocol was similar to IOP, but stimuli were presented only at the contralateral position. Two kinds of oddball paradigms were included: (1) the identity oddball with the two natural sounds and (2) the traditional oddball with two pure tones (Fig. 1c). The frequencies of the two tones were chosen to be in the vicinity of the neuronal characteristic frequency with a 0.53-octave separation, and the tones were 100 ms long with a 5-ms rise/fall time. As in IOP, the blocks of these two oddball paradigms were randomly presented. This protocol was designed to explore SSA response differences between the natural sounds and the pure tones.
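The block structure of the IOP described in the list above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the stimulus-generation code used in the study: the labels `"contra"` and `"ipsi"`, the helper names `other`, `make_block` and `make_iop_blocks`, and the dictionary layout are all hypothetical.

```python
import random
from itertools import product

SOUNDS = ("S1", "S2")        # the two natural sounds
SITES = ("contra", "ipsi")   # hypothetical labels for the two speaker locations
PARADIGMS = ("identity", "spatial", "integrative")


def other(value, options):
    """Return the element of a two-item tuple that is not `value`."""
    return options[1] if value == options[0] else options[0]


def make_block(std_sound, std_site, paradigm):
    """Six identical standards followed by one deviant as the seventh stimulus."""
    change_sound = paradigm in ("identity", "integrative")
    change_site = paradigm in ("spatial", "integrative")
    dev_sound = other(std_sound, SOUNDS) if change_sound else std_sound
    dev_site = other(std_site, SITES) if change_site else std_site
    return [(std_sound, std_site)] * 6 + [(dev_sound, dev_site)]


def make_iop_blocks(seed=None):
    """All 12 blocks (3 paradigms x 2 sounds x 2 sites) in random order."""
    blocks = [
        {"paradigm": p, "stimuli": make_block(sound, site, p)}
        for p, sound, site in product(PARADIGMS, SOUNDS, SITES)
    ]
    random.Random(seed).shuffle(blocks)
    return blocks


blocks = make_iop_blocks(seed=0)
for block in blocks[:3]:
    print(block["paradigm"], block["stimuli"][5], "->", block["stimuli"][6])
```

Each printed line shows the sixth (standard) and seventh (deviant) stimulus of a block, which are the two positions compared in the analysis.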

In addition to the two very different stimuli (S1 and S2), we also presented two vocalizations to an additional 75 neurons with SCO, in which S2 was replaced by S3 (Fig. 1a).

Data analysis

For both identity and spatial oddball paradigms, we employed two SSA indices identical to those in previously published work (Ulanovsky et al. 2003; Yu et al. 2009; Antunes et al. 2010). We computed a stimulus-specific index (SI_i) using the formula SI_i = [d_i − s_i]/[d_i + s_i], with s_i and d_i indicating the standard and deviant responses to stimulus i, respectively. In the blocks (Fig. 1b), the response to the sixth stimulus was taken as the standard response and the response to the seventh as the deviant response. The common SSA index (CSI) was calculated as the second SSA index according to the formula CSI = [d_1 + d_2 − s_1 − s_2]/[d_1 + s_1 + d_2 + s_2]. The CSI describes the degree of SSA: it is 1 when adaptation is complete (i.e., no response to the standard and a significant response to the deviant) and 0 when there is no adaptation (i.e., the responses to the standard and the deviant are equal). To compare the double and single deviant responses, we defined two factors for each neuron: DII (double-identity index), which compares the responses to the double deviant stimuli with those to the identity deviant stimuli, and DSI (double-spatial index), which compares the responses to the double deviant stimuli with those to the spatial deviant stimuli. The formulas were defined as DII = [D(sti_1) + D(sti_2) + D(sti_3) + D(sti_4) − I(sti_1) − I(sti_2) − I(sti_3) − I(sti_4)]/4 and DSI = [D(sti_1) + D(sti_2) + D(sti_3) + D(sti_4) − S(sti_1) − S(sti_2) − S(sti_3) − S(sti_4)]/4, with D(sti_i), I(sti_i), and S(sti_i) being the normalized responses to the four kinds of stimuli [sti_i (i = 1, 2, 3, 4): S1 at the contralateral site; S1 at the ipsilateral site; S2 at the contralateral site; S2 at the ipsilateral site] when presented as the double deviant stimuli (integrative oddball), identity deviant stimuli (identity oddball), and spatial deviant stimuli (spatial oddball), respectively.
Normalization was based on the non-adapted (or less-adapted) response to the first stimulus in a block: the response to each stimulus was divided by the average response to the same stimulus at the same location when it was presented as the first stimulus in the blocks (Fig. 1b). Positive DII and DSI values indicate that the double deviant stimuli evoked a stronger response than did the single deviant stimuli.
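The indices defined above can be written out as a short Python sketch. The function names (`si`, `csi`, `double_deviant_index`) and the example numbers are hypothetical, and returning NaN when the summed response is zero is an assumption, since the text does not say how silent neurons were handled.

```python
import math


def si(d, s):
    """Stimulus-specific index for one stimulus: (d - s) / (d + s)."""
    total = d + s
    # Assumption: the index is undefined (NaN) when there is no response at all.
    return (d - s) / total if total else math.nan


def csi(d1, s1, d2, s2):
    """Common SSA index over both stimuli of an oddball pair."""
    total = d1 + s1 + d2 + s2
    return (d1 + d2 - s1 - s2) / total if total else math.nan


def double_deviant_index(double, single):
    """Mean difference between normalized double- and single-deviant responses.

    `double` and `single` each hold four normalized responses, one per stimulus
    (S1 contra, S1 ipsi, S2 contra, S2 ipsi). Passing identity-deviant responses
    as `single` gives DII; passing spatial-deviant responses gives DSI.
    """
    if len(double) != 4 or len(single) != 4:
        raise ValueError("expected four responses per condition")
    return sum(dd - sd for dd, sd in zip(double, single)) / 4


# Made-up spike counts, for illustration only.
print(si(12, 8))            # deviant 12 spikes, standard 8 spikes
print(csi(12, 8, 10, 10))   # SSA for one sound, none for the other
print(double_deviant_index([1.0, 0.9, 1.1, 1.0], [0.9, 0.8, 0.9, 1.0]))
```

With these made-up numbers the three values come out at about 0.2, 0.1 and 0.1; positive values mean the deviant (or double deviant) response is the larger one.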

Spike counts were always computed in a window from 0 to 100 ms relative to the onset of the auditory stimulus. For all PSTHs, the bin width was 5 ms. Statistical comparisons between standard and deviant responses were performed with the Wilcoxon rank-sum test, and comparisons across neurons between conditions (e.g., natural sounds vs. pure tones) with paired t tests.
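As a minimal sketch of the counting step, the following Python computes a spike count in the 0 to 100 ms window and a PSTH with 5-ms bins. It assumes spike times are given in milliseconds relative to stimulus onset and that the window is half-open; the function names, the PSTH window and the example spike times are hypothetical, as the text specifies only the counting window and the bin width.

```python
def spike_count(spike_times_ms, start=0.0, stop=100.0):
    """Number of spikes in [start, stop) ms relative to stimulus onset."""
    return sum(start <= t < stop for t in spike_times_ms)


def psth(trials, start=0.0, stop=100.0, bin_ms=5.0):
    """Trial-averaged firing rate (spikes/s) in consecutive bins of `bin_ms`."""
    n_bins = int(round((stop - start) / bin_ms))
    counts = [0] * n_bins
    for trial in trials:
        for t in trial:
            if start <= t < stop:
                index = min(int((t - start) // bin_ms), n_bins - 1)
                counts[index] += 1
    # Convert summed counts to a rate: divide by trials and bin width in seconds.
    scale = len(trials) * bin_ms / 1000.0
    return [c / scale for c in counts]


# Two made-up trials of spike times (ms after stimulus onset).
trials = [[3.0, 12.5, 101.0], [4.9, 50.0]]
counts = [spike_count(trial) for trial in trials]
rates = psth(trials)
print(counts)
print(rates[:3])
```

The per-trial counts are the quantities that would then be compared between the sixth and seventh stimulus, for example with a rank-sum test from a statistics library.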

Results

We recorded 111 single units (SUs) in the auditory cortex with IOP. A further 111 SUs were recorded with SCO (S1 and S2), and 75 additional SUs were recorded with SCO in which S2 was replaced by S3. We first describe SSA properties for sound identity and location individually before examining their integrative SSA properties.

SSA for the natural sounds

As has been shown previously, AC neurons displayed SSA with pure tones in both anesthetized and awake subjects (Ulanovsky et al. 2003; Szymanski et al. 2009; Von Der Behrens et al. 2009; Antunes et al. 2010; Farley et al. 2010; Taaseh et al. 2011; Nieto-Diego and Malmierca 2016). In the present study, we first examined whether AC neurons had SSA with natural sounds. Two kinds of natural sounds were used in the current study (Fig. 1a), one being a rat vocalization (S1) and the other being door knocking (S2). With those two natural sounds, the four blocks of stimuli in the identity oddball were presented at two locations (Fig. 1b, a–d) and were randomly interspersed in the other eight different blocks of stimuli in the spatial (Fig. 1b, e–h) and integrative oddballs (Fig. 1b, i–l).

An example neuron illustrating the response to the natural sounds in the identity oddball is shown in Fig. 2. The raster plots show responses to the sounds at the contralateral and ipsilateral sites when the sounds were presented as the deviant (top row of Fig. 2a, b) and standard (middle row of Fig. 2a, b) stimuli. Here, the deviant stimulus refers to the seventh stimulus in the block and the standard stimulus to the sixth (Fig. 1b). At the contralateral site, the neuron showed no significant difference between deviant and standard responses for sound S1 (p = 0.21, Wilcoxon rank-sum test, left column of Fig. 2a), but a much stronger response to sound S2 when it was presented as the deviant stimulus than as the standard stimulus (compare raster graphs in Fig. 2a, top two rows to the right). The peri-stimulus time histograms (PSTHs) confirmed a significant difference between deviant and standard responses for S2 (bottom row on the right of Fig. 2a, p = 0.021, Wilcoxon rank-sum test). At the ipsilateral location, by contrast, a significant difference between deviant and standard responses was found for both natural sounds (Fig. 2b, S1: p = 0.023; S2: p = 0.038, Wilcoxon rank-sum test).

Fig. 2

Responses of AC neurons to the natural sounds presented in identity oddball. a, b Responses of one representative AC SU to the natural sounds in the identity oddball at the contralateral (a) and ipsilateral sites (b). Raster displays showing response to the two natural sounds when presented as the deviant stimulus (black color in the top row) and the standard stimulus (gray color in the middle row). The sound identity and location are indicated in the title of each plot. The PSTHs (bottom row) showing the deviant (30 trials) and standard responses (30 trials). *p < 0.05, n.s. not significant, Wilcoxon rank-sum test. A small inset on the top row between a and b shows all spikes including deviant and standard responses at both contralateral and ipsilateral sites. c Scatter plots of SI2 (sound S2) versus SI1 (sound S1) for all neurons (n = 111). Filled colored points correspond to cases in which Wilcoxon rank-sum test shows significant difference for both sounds (p < 0.05, deviant response vs. standard response). d Distribution of CSI at the contralateral and ipsilateral sites, arrows illustrate means

The population results from awake subjects (111 SUs) are summarized in Fig. 2c, d. In the scatter plots of the sound-specific index SI1 (sound S1) versus SI2 (sound S2), most values were close to 0, implying little or no SSA in most recorded neurons. More specifically, as evidenced in the upper right quadrant, only 4 of 111 neurons showed significant SSA at the contralateral site and 2 of 111 at the ipsilateral site (Fig. 2c). CSI averages were 0.079 and 0.072 for the contralateral and ipsilateral locations, respectively (Fig. 2d). While small, these values were significantly greater than 0 (p = 8.97e−24 for the contralateral location and p = 4.76e−15 for the ipsilateral location, t test). This indicated that, on average, deviant responses were stronger than standard responses and, more importantly, that SSA existed at the population level for the natural sounds. These results contrasted with previous work using pure tone oddballs in anesthetized subjects, which showed much greater SSA (Ulanovsky et al. 2003; Szymanski et al. 2009; Von Der Behrens et al. 2009; Antunes et al. 2010; Farley et al. 2010; Taaseh et al. 2011; Nieto-Diego and Malmierca 2016). To explore whether the discrepancy could arise from the different sounds used (i.e., the natural sounds used here versus the pure tones used in previous studies), we recorded an additional 111 SUs with SCO.

SSA comparison between the natural sounds and pure tones

In SCO, two kinds of oddball paradigms (pure tones and the natural sounds) were randomly presented at the contralateral site only (Fig. 1c). We then compared the SSA response between those two oddball paradigms within the same recording. In one example neuron, firing rates were similar between the standard and deviant responses for both S1 (left column in Fig. 3a, p = 0.35, Wilcoxon rank-sum test) and S2 (right column in Fig. 3a, p = 0.075, Wilcoxon rank-sum test). The same neuron did not show any SSA for the pure tones either (Fig. 3b, f1 = 19,169 Hz, p = 0.44; f2 = 27,603 Hz, p = 0.41, Wilcoxon rank-sum test). The population results are summarized in Fig. 3c; the mean CSI was 0.086 and 0.087 for the natural sounds and pure tones, respectively. Though small, both CSIs were significantly greater than zero (p = 1.22e−23 for the natural sounds and p = 1.19e−19 for pure tones, t test, Fig. 3c), suggesting that significant SSA exists at the population level. No significant difference was detected between the two kinds of stimulation (p = 0.96, paired t test). Interestingly, the CSIs for the two kinds of oddball paradigms showed a weak positive correlation (Fig. 3d, r = 0.19, p = 0.04, Pearson correlation), suggesting that CSI reflected some neuronal property that was present regardless of stimulus type.

Fig. 3

Comparison of AC neurons in responses to the natural sounds and pure tones in oddball paradigms. a, b Responses of one representative AC SU to the natural sounds (S1 and S2) in the identity oddball and two tones (f1 = 19,169 Hz and f2 = 27,603 Hz) in the traditional oddball at the contralateral site. Raster displays showing responses to the sounds when presented as the deviant stimulus (black color in the top row) and the standard stimulus (gray color in the middle row). The sound identity is indicated in the title of each plot. The PSTHs (bottom row) showing the deviant (30 trials) and standard responses (30 trials). n.s. not significant, Wilcoxon rank-sum test. A small inset on the top row between a and b shows all spikes including deviant and standard responses to both the natural sounds and the pure tones. c Distribution of common stimulus-specific index (CSI) for the two natural sounds (top row) and the pure tones (bottom row). Arrows illustrate means. d Correlation in CSIs between pure tones and the natural sounds

We also recorded another 75 neurons with SCO, in which S2 was replaced by S3 (Fig. 1a). The example neuron showed similar responses to the deviant and standard stimuli for the natural sounds (Fig. 4a, S1: p = 0.24, S3: p = 0.69, Wilcoxon rank-sum test) and the pure tones (Fig. 4b, f1 = 13,312 Hz, p = 0.37; f2 = 19,169 Hz, p = 0.64, Wilcoxon rank-sum test). The mean CSI of the population data was also very small: 0.026 and 0.069 for the natural sounds and pure tones, respectively. Both CSIs were significantly greater than zero (Fig. 4c, p = 0.0055 for the natural sounds and p = 6.55e−13 for pure tones, t test). Interestingly, a significant difference was detected between the two kinds of oddball paradigms (p = 8.29e−4, paired t test): most data points were above the unity line in the scatter plot of the CSI of pure tones versus the CSI of the natural sounds (Fig. 4d).

Fig. 4

Comparison of AC neurons in responses to another pair of natural sounds and pure tones in oddball paradigms. a, b Responses of one representative AC neuron to the natural sounds (S1 and S3) in the identity oddball and two tones (f1 = 13,312 Hz and f2 = 19,169 Hz) in the traditional oddball at the contralateral site. Raster displays showing responses to the sounds when presented as the deviant stimulus (black color in the top row) and the standard stimulus (gray color in the middle row). The sound identity is indicated in the title of each plot. The PSTHs (bottom row) showing the deviant (30 trials) and standard responses (30 trials). n.s. not significant, Wilcoxon rank-sum test. A small inset on the top row between a and b shows all spikes including deviant and standard responses to both the natural sounds and the pure tones. c Distribution of CSI for the two natural sounds (top row) and the pure tones (bottom row). Arrows illustrate means. d Relationship in CSIs between pure tones and the natural sounds

Spatial SSA for the natural sounds

After characterizing the SSA response to both the natural sounds and pure tones, we explored whether AC neurons were sensitive to the statistical occurrence of sound locations. The natural sounds were presented repeatedly at two locations, with one serving as the standard location and the other as the deviant location (Fig. 1b, e–h). In Fig. 5a, the raster plots show the responses of an example neuron to sound S1 at each location when that location served as the deviant (top row of Fig. 5a) and as the standard (middle row of Fig. 5a) location. The neuronal response was much stronger at the deviant location than at the standard location (compare graphs in Fig. 5a, top with middle). The PSTHs (bottom row of Fig. 5a) also demonstrated a significant difference between the deviant and standard responses for both locations (contralateral site: p = 7.1e−5; ipsilateral site: p = 0.00012, Wilcoxon rank-sum test). With sound S2, a significant difference between deviant and standard responses was detected only for the contralateral site (right column of Fig. 5b, p = 0.0009, Wilcoxon rank-sum test) and not for the ipsilateral site (left column of Fig. 5b, p = 0.83, Wilcoxon rank-sum test), which differed from the result for sound S1 (Fig. 5a). This finding indicated that spatial SSA was sound-context dependent. The spatial CSIs of the neuron were 0.12 and 0.084 for sounds S1 and S2, respectively.

Fig. 5

Responses of AC neurons to the natural sounds presented in the spatial oddball. a, b Responses of one representative AC SU to the natural sounds in the spatial oddball with sound S1 (a) and S2 (b). Raster displays showing responses to the two natural sounds when presented at the deviant location (black color in the top row) and at the standard location (gray color in the middle row). The sound identity and location are indicated in the title of each plot. The PSTHs (bottom row) showing the deviant (30 trials) and standard responses (30 trials). **p < 0.01, n.s. not significant, Wilcoxon rank-sum test. A small inset on the top row between a and b shows all spikes including deviant and standard responses to both S1 and S2. c Scatter plots of SI2 (contralateral site) versus SI1 (ipsilateral site) for two sounds (left: S1; right: S2). Filled colored points correspond to cases in which Wilcoxon rank-sum test shows significant difference for both locations (p < 0.05, deviant response vs. standard response). d CSI distribution for two sounds (top: S1; bottom: S2) respectively, arrows illustrate means

The population results are summarized in Fig. 5c, d. In the scatter plots of the location-specific index SI1 (ipsilateral site) versus SI2 (contralateral site), a large share of the values were positive and fell within the upper right quadrant for both sounds (63.1% for S1 in the left column; 48.6% for S2 in the right column, Fig. 5c). The average CSI was 0.12 and 0.084 for sounds S1 and S2, respectively (Fig. 5d). For both sounds, CSIs were significantly higher than zero (S1: p = 7.8e−23, S2: p = 2.97e−18, t test). This indicated that, on average, the deviant stimulus elicited a stronger response than the standard stimulus, and thus that spatial SSA exists in the AC neuronal population. Spatial CSI for S1 was significantly greater than that for S2 (Fig. 5d, p = 1.03e−4, paired t test), suggesting that spatial SSA was sound-context dependent. In the same neuronal population, the spatial CSI of S1 was also significantly greater than the CSI of the identity oddball at both locations (contralateral site: p = 4.48e−7, ipsilateral site: p = 1.9e−6, t test, compare with Fig. 2d), suggesting that spatial SSA may be more robust than identity SSA. Meanwhile, the spatial CSIs of the two sounds were positively correlated (Pearson’s correlation, r = 0.48, p = 1.1e−7), indicating that spatial SSA was tolerant to the change of sound.

Integrative SSA with the natural sounds

Having characterized identity and spatial SSA separately with the natural sounds, we next probed how AC neurons integrated different cues during adaptation. Each natural sound was presented as a double deviant stimulus, that is, the deviant sound at the deviant location (Fig. 1b, i–l). Figure 6a–d shows the responses of one example neuron to the four kinds of stimuli when they were presented as double deviant stimuli and as single deviant stimuli (S1 at ipsilateral site in Fig. 6a, S2 at ipsilateral site in Fig. 6b, S1 at contralateral site in Fig. 6c, S2 at contralateral site in Fig. 6d). In the comparison between double deviant (DD) and single identity deviant (SID) responses, the double deviant response was stronger than the single deviant response in all four situations, although the difference did not reach significance in Fig. 6a (DD in integrative oddball vs SID in identity oddball, p = 0.57 in Fig. 6a, p = 0.026 in Fig. 6b, p = 0.025 in Fig. 6c, p = 2.8e−5 in Fig. 6d, Wilcoxon rank-sum test), indicating that there was an integrative enhancement at both spatial locations. When we compared the double deviant response with the single spatial deviant (SSD) response, the former was significantly stronger than the latter, but only for sound S2 (DD in integrative oddball vs SSD in spatial oddball, p = 0.017 in Fig. 6b, p = 0.0039 in Fig. 6d, Wilcoxon rank-sum test). This was not the case for sound S1 (DD in integrative oddball vs SSD in spatial oddball, p = 0.79 in Fig. 6a, p = 0.77 in Fig. 6c, Wilcoxon rank-sum test), which indicated that integrative enhancement was dependent on sound identity.

Fig. 6

Deviant responses of AC neurons to the natural sounds presented in the integrative oddball. a–d Histograms showing deviant responses of one representative AC neuron to the same stimulus as single identity deviant stimuli in identity oddball, single spatial deviant stimuli in spatial oddball, or double deviant stimuli in integrative oddball for four different stimuli: S1 at ipsilateral site (a); S2 at ipsilateral site (b); S1 at contralateral site (c); S2 at contralateral site (d). *p < 0.05, **p < 0.01, n.s. not significant, Wilcoxon rank-sum test. e–h Population data showing comparison in deviant response between double deviant condition and single deviant condition. Filled colored points correspond to cases in which the t test shows significant difference (p < 0.05)

A summary of the results from all neurons (n = 111) is shown in Fig. 6e–h. In the scatter plots of the double deviant response versus the single identity deviant response (Fig. 6e, f), most data points were above the unity line for both the ipsilateral and contralateral sites (Fig. 6e: p = 1.9e−21 and Fig. 6f: p = 2.1e−6, paired t test), indicating that double deviant cues helped deviant identity detection. Only 23 points were significant (comparing DD in integrative oddball with SID in identity oddball for each neuron, p < 0.05, t test) and above the unity line at the ipsilateral location (filled yellow dots, Fig. 6e). In comparison, there were 62 such points at the contralateral location (Fig. 6f) in the same neuronal population. This difference indicated that the observed integrative effect (1) depended on spatial location and (2) favored the contralateral site.

When the double deviant response and the spatial deviant response were compared (Fig. 6g, h), there was no significant difference in the neuronal population (n = 111; p = 0.11, paired t test) for sound S1 (Fig. 6g). However, the double deviant responses for sound S2 were significantly stronger than the single spatial deviant responses in the neuronal population (Fig. 6h: p = 8.4e−18, paired t test). Of the 111 points, 28 showed an enhanced integrative effect for sound S2, whereas only seven significant points lay above the unity line for sound S1. These results indicated that the integrative effect was also sound-context dependent and favored sound S2 in our study.

To better understand the integrative effect in the neuronal population, we defined two factors for each neuron, DII and DSI, to characterize the strength of the integrative effect over identity information and spatial information, respectively. DII and DSI were taken as the average differences in normalized response between the double deviant response and the single deviant response for the four stimuli in IOP (see "Methods"). The average DII and DSI values were 0.10 and 0.055, respectively, and both factors were significantly greater than zero (DII: p = 3.0e−17, DSI: p = 9.4e−14, t test, Fig. 7a, b). This indicated that, on average, the double deviant responses were stronger than the single deviant responses in the recorded AC population. Interestingly, DII was greater than DSI (p = 5.5e−6, t test), suggesting that the integrative enhancive effect was stronger relative to identity deviance than relative to spatial deviance.

Fig. 7
Distribution of integrative effect across neuronal populations. a, b Distribution of DII (a) and DSI (b) for each neuron (n = 111). The arrows indicate means

Response dynamics in identity, spatial, and integrative SSAs

To study the time course of the responses in the three oddball paradigms in the recorded neuronal population, we examined how the response changed over the seven stimulations (Fig. 1b). To compensate for the response diversity within the neuronal population, we normalized each response to the first-trial response, which was a non-adapted (or less-adapted) response. Specifically, the response to each stimulus was divided by the average response to the same stimulus at the same location when it was presented as the first stimulation. The responses in the three oddballs were statistically indistinguishable for the first six stimulations (Fig. 8, F(2,1997) = 0.02, p = 0.98, ANOVA) and differed for the last trial, with the double deviant response being the largest (p = 5.9e−17 for DD in integrative oddball vs. SID in identity oddball and p = 1.9e−13 for DD in integrative oddball vs. SSD in spatial oddball, paired t test). One interesting finding was that the normalized responses to the double deviant stimuli were not significantly different from the response to the first trial (p = 0.24, paired t test), suggesting an enhancement effect rather than simple adaptation, because the first stimulation was less adapted after a long period of silence.
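The first-trial normalization described above can be sketched as follows. This is an illustrative reading of the procedure, not the authors' code: the function name `normalized_time_course`, the string keys such as "S1@contra", and all response values are hypothetical, and each block is assumed to contain seven (stimulus, response) pairs.

```python
from statistics import mean


def normalized_time_course(blocks, first_trial_baseline):
    """Average normalized response at each of the seven positions.

    blocks: list of blocks; each block is a list of (stimulus_key, response)
        pairs, one per stimulation in the sequence.
    first_trial_baseline: maps stimulus_key (same sound at the same location)
        to its average response when presented as the first stimulation.
    """
    n_positions = len(blocks[0])
    if any(len(block) != n_positions for block in blocks):
        raise ValueError("all blocks must have the same number of stimulations")
    course = []
    for pos in range(n_positions):
        # Divide each response by the baseline of its own stimulus, then
        # average across blocks.
        course.append(mean(block[pos][1] / first_trial_baseline[block[pos][0]]
                           for block in blocks))
    return course


# Made-up data: six standards followed by one deviant in each block.
baseline = {"S1@contra": 20.0, "S2@ipsi": 10.0}
block_a = [("S1@contra", r) for r in (20.0, 12.0, 10.0, 10.0, 8.0, 8.0)]
block_a.append(("S2@ipsi", 9.0))
block_b = [("S2@ipsi", r) for r in (10.0, 6.0, 5.0, 5.0, 4.0, 4.0)]
block_b.append(("S1@contra", 22.0))
course = normalized_time_course([block_a, block_b], baseline)
```

In this toy example the normalized response declines over the six standards and returns to about 1 for the deviant, which is the kind of pattern the comparison with the first-trial response is meant to detect.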

Fig. 8
Time course of IOP. Time course of the normalized population neuronal response in IOP, sorted by the three oddballs (black: identity oddball; blue: spatial oddball; red: integrative oddball). The normalization was made according to the response to the first trial

Comparison between A1 and AAF

As all recordings in the current study were made from the core auditory cortex, including A1 and AAF (Fig. 9a), identified by the tonotopic gradient of CF (Horikawa et al. 1988; Profant et al. 2013), we also compared these two areas for the different oddball stimulations. In one example subject, no difference was detected between A1 and AAF in the CSIs of pure tones (p = 0.38, Wilcoxon rank-sum test, Fig. 9b) or the natural sounds (p = 0.35, Wilcoxon rank-sum test, Fig. 9c). At the population level, the CSI in A1 was marginally greater than that in AAF only for the identity oddball with the pair of natural sounds S1 and S2 (p = 0.03, Table 1). No significant difference was detected in the spatial oddball, the pure tone oddball, or the integrative oddball (Table 1). As only a small difference between AAF and A1 was detected, we did not analyze the two areas separately, but pooled all the data in the preceding analyses.

Fig. 9
Topographic distribution of SSA throughout the core auditory cortex in one example subject. a CF distribution throughout the cortical field in one example rat with green dots indicating recording locations. b, c CSI distributions of pure tones (b) and the natural sounds (S1 and S2) (c) throughout the cortical field of the same rat

Table 1 Comparison between AAF and A1

Discussion

Neurons in the AC showed a weak SSA for the two pairs of the natural sounds and pure tones and a stronger spatial SSA for rat vocalizations (S1). When the sound was presented as the deviant sound at the deviant location, responses were more robust than the responses to the same stimulus when it had been presented as a single deviant stimulus.

SSA with spectral cue

Using pure tones, we demonstrated that AC neurons showed small SSA (Figs. 3c, 4c), which was similar to previous results in awake animals (Von Der Behrens et al. 2009; Farley et al. 2010; Klein et al. 2014), but different from the results in anesthetized animals (Ulanovsky et al. 2003; Taaseh et al. 2011; Nieto-Diego and Malmierca 2016; Wang et al. 2019) and from two recent studies in awake subjects (Parras et al. 2017; Polterovich et al. 2018). As different oddball paradigms are used in different laboratories, it is difficult to make direct comparisons. Previous studies suggested that high SSA could arise from a short ISI, a high ratio of standard to deviant sound occurrences, or a large difference between sounds (Ulanovsky et al. 2003; Anderson et al. 2009; Malmierca et al. 2009; Yu et al. 2009; Antunes et al. 2010; Duque et al. 2016). In our research, we used a very short ISI (300 ms), a comparatively high ratio (6:1), and a large frequency difference for pure tones (0.53 octave). For the spectral difference between the natural sounds, we specifically chose two pairs: a high-contrast pair (S1 and S2) and a low-contrast pair (S1 and S3). Even for the high-contrast pair, the observed SSA was very close to zero, with a mean CSI of 0.086 (Fig. 3c). Taken together, we are confident that the small SSA cannot be attributed to any of the oddball settings we selected. Meanwhile, our recordings were mainly focused on the core auditory cortex. In anesthetized subjects, the non-primary fields showed stronger SSA than the core AC (Nieto-Diego and Malmierca 2016); thus, we do not exclude the possibility that SSA with spectral cues could be strong in the non-primary auditory cortex of awake subjects.

In the present study, we also investigated SSA in AC using two pairs of natural sounds, and our data clearly demonstrated small SSA. Klein et al. (2014) conducted the only systematic research on SSA using complex sounds beyond pure tones. Their work revealed that SSA with frequency-modulated (FM) tones was smaller than that observed with pure tones, similar to our finding that the CSI of the vocalization pair was much smaller than that of pure tones (Fig. 4c). Both results suggest that complex sounds may evoke smaller SSA than pure tones and indicate that frequency enrichment may reduce SSA strength. Meanwhile, we also noticed that the high-contrast pair (S1 and S2) evoked a comparatively stronger SSA than the low-contrast pair (S1 and S3) (Figs. 3, 4), similar to the results with pure tones (Ulanovsky et al. 2003; Anderson et al. 2009; Malmierca et al. 2009; Yu et al. 2009; Antunes et al. 2010; Duque et al. 2016). We also acknowledge that only two pairs of natural sounds were considered in our study, which may limit the generalizability of these findings to other natural sounds. Future research will need to explore a wider variety of natural sounds and precisely characterize the relationship between SSA and the spectral components within such sounds.

SSA with spatial cue

SSA has been mainly found in the frequency domain along the auditory pathway (see Malmierca et al. 2014; Khouri and Nelken 2015, for review) and has never been addressed in the spatial domain in AC. Using spatial oddball paradigms with pure tones, Xu et al. (2017) found that neurons in the thalamic reticular nucleus exhibited strong spatial SSA, which increased the neuronal spatial sensitivity for deviant locations. To the best of our knowledge, the results presented here are the first findings indicating that AC neurons could exhibit SSA for spatial locations with natural sounds in awake subjects (Fig. 5). As spatial perception may guide attention and improve the detection, segregation, and recognition of sounds (Bregman 1990; Roman et al. 2003), and the AC has been shown to play an important role in spatial perception (Rodriguez-Nodal and Bajo-Lorenzana 2012), spatial SSA in AC may prepare animals for detecting salient locations in a natural scene, which is vital for sound processing and for animal survival.

The spatial SSA shares many traits with frequency SSA, such as the dependency on the ISI and on the difference between the two stimuli in the oddball (Xu et al. 2017). However, we noticed several dissimilarities between spatial SSA and identity SSA with the natural sounds. Firstly, the CSI in the spatial SSA of the natural sound S1 (top row of Fig. 5d) was comparatively stronger than that seen in the identity SSA (Figs. 2d, 3c, 4c); meanwhile, in the scatter plots of SI1 versus SI2 for the same neuronal population (n = 111), only 4 (contralateral site) and 2 (ipsilateral site) neurons showed significant SSA in the first quadrant for the identity oddball (Fig. 2c), while 18 (S1) and 7 (S2) neurons showed significant SSA for the spatial oddball (Fig. 5c). These results suggest that spatial SSA may be stronger in AC than identity SSA. Secondly, the spatial deviant response in the spatial oddball was significantly stronger than the identity deviant response in the identity oddball (Fig. 8), which implies that deviant locations may be more salient than deviant sounds, at least for the stimulation used in the current research. Lastly, unlike the responses in the identity oddball, where CSIs at the contralateral and ipsilateral locations are similar (Fig. 2d), the responses in the spatial oddball depend on the sound; specifically, the sound S1 (rat vocalization) elicited stronger spatial SSA than the sound S2 (knocking on a door) (Fig. 5d). This result suggests that spatial adaptation is dependent on sound identity, echoing the recent finding that spatial processing is frequency specific (Sollini et al. 2017).
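For readers less familiar with the indices compared above, the following Python sketch shows how SI1, SI2, and CSI are conventionally computed in the SSA literature (Ulanovsky et al. 2003). It assumes that the present study follows these conventional definitions; the definitions actually used are those given in "Methods". The function names and response values are hypothetical.

```python
def si(deviant, standard):
    """Stimulus-specific index for one stimulus (conventional definition)."""
    total = deviant + standard
    if total == 0:
        raise ValueError("SI is undefined when both responses are zero")
    return (deviant - standard) / total


def csi(d1, s1, d2, s2):
    """Common SSA index over both stimuli of the oddball pair.

    d_i and s_i are the responses to stimulus i as deviant and as standard.
    """
    total = d1 + d2 + s1 + s2
    if total == 0:
        raise ValueError("CSI is undefined when all responses are zero")
    return (d1 + d2 - s1 - s2) / total


# Made-up responses (spikes/s) for one neuron, for illustration only.
d1, s1, d2, s2 = 12.0, 8.0, 11.0, 9.0
si1, si2 = si(d1, s1), si(d2, s2)
common = csi(d1, s1, d2, s2)
# A neuron falls in the first quadrant of the SI1 versus SI2 scatter plot
# when both indices are positive.
in_first_quadrant = si1 > 0 and si2 > 0
```

Under these definitions, all three indices range from −1 to 1 for non-negative responses, and positive values indicate a stronger response to a stimulus when it is the deviant than when it is the standard.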

Integrative SSA with both spectral and spatial cues

A deviant event always occurs with multiple congruent deviant cues, both within and across modalities. How, then, does the brain integrate this abundance of deviance cues, each of which has different attributes? One particular line of work in sensory research has provided insight into this question. Reches et al. (2010) found that congruent visual and auditory bimodal stimuli evoked stronger SSA in the forebrain of anesthetized owls than a visual stimulus alone. In the present study, we demonstrated that SSA benefitted from the integration of different cues, even within the same modality. The congruent, double deviant sound elicited a stronger response than the identity deviant stimulus or the spatial deviant stimulus did in AC neuronal populations (Fig. 8). The integrative effect described here provides the first example of cue integration within the same modality with ecologically natural stimuli. Moreover, the current research demonstrated that the population neuronal response to the double deviant stimuli was similar to the response to the first stimulation in the block, which is less adapted (Fig. 8). This result suggests that the integrative effect may overcome the adaptation and enhance the neuronal response. The integrative enhancement described here probably provides a neural mechanism for the interaction between deviance detection and cue integration, which is worthy of future investigation with a strong control (i.e., the first stimulation in the block after a longer silence period or a deviant-alone stimulus).

This integrative effect was dependent on both location (Fig. 6e, f) and sound context (Fig. 6g, h). In the current research, the integrative effect favored sound S2 (Fig. 6g, h) and the contralateral site (Fig. 6e, f). Future work on the interaction between spatial and identity (mainly spectral) information processing will have to investigate how and why this dependence occurs. We defined two factors, DII and DSI, to characterize the strength of the integrative effect by comparing the double deviant response with the single deviant response. DII was greater than DSI, suggesting that the integrative effect was not the same for the two cues and favored sound identity in our research (Fig. 7). Most of the DII and DSI values clustered toward the right side of the x axis (i.e., the positive values in Fig. 7), suggesting that the integrative effect could facilitate the responses to both identity and spatial deviant cues.