Introduction

Auditory adaptation manifests as a temporary modification of hearing sensitivity during prolonged sound exposure. The functional significance of auditory adaptation is assumed to be an adjustment of the dynamic range of hearing to the most typical sound level of the auditory environment. An obvious effect of adaptation is a reduction of neuronal activity and of responses to test stimuli during the prolonged presentation of an adapting sound. In laboratory animals, it was found that the responses of auditory nerve fibers (Kiang et al. 1965; Smith and Zwislocki 1975; Smith 1977, 1979; Westerman and Smith 1984; Chimento and Schreiner 1991) as well as the cochlear action potential (Eggermont and Spoor 1973) decrease during the presentation of long auditory stimuli. A shift in the dynamic range of the auditory system toward the current sound level improves the coding of signals around this level (Dean et al. 2008; Wen et al. 2009, 2012). Because the dynamic range is adjusted to the current mean level of environmental sounds, high differential sensitivity can be maintained within a wide range of sound levels (more than 100 dB), even though the dynamic range of individual auditory neurons can, as a rule, be as narrow as 20–30 dB (Viemeister 1988).

Not much is known about adaptation in the auditory system of cetaceans, which possess unique characteristics with regard to sensitivity, frequency range, frequency tuning, and temporal resolution (Au 1993; Supin et al. 2001; Au and Hastings 2008). Investigation of auditory adaptation in subjects with unique hearing capabilities may provide more knowledge about the mechanisms underlying those capabilities. Additionally, investigation of adaptation properties in cetaceans is important for evaluating the negative impact of man-made noises on the behavior and physiology of aquatic mammals, because adaptation to loud noise may reduce responses to target signals. Numerous studies have examined the long postexposure effect of loud noise, which is known as temporary threshold shift (TTS) (reviewed by Finneran 2015); however, only a few studies have described properties of auditory adaptation in a whale (Popov et al. 2016, 2018), and these cannot be considered to provide comprehensive knowledge of the adaptation effect. In particular, little is known about how auditory adaptation in cetaceans depends on the characteristics of the adapting sounds. In the studies by Popov et al. (2016, 2018), the auditory adaptation was produced by signals of the same type as the test signal; specifically, the signals were trains of short tone pips. Under natural conditions, auditory adaptation may be expected to result from the action of a wide variety of sound types. It is not known how various sound types produce auditory adaptation in cetaceans.

The goal of the present study was to investigate how the auditory adaptation in a cetacean, i.e., a beluga whale, depends on the temporal characteristics of the adapting sounds. To accomplish this goal, we used the evoked potential technique. Specifically, we recorded auditory evoked potentials (AEPs) to sound stimuli in an adapting background.

Materials and methods

Subject and facilities

The subject was a 3-year-old female beluga whale, Delphinapterus leucas, with a body length of 250 cm and a body mass of 250 kg that was kept at the Utrish Marine Station of the Russian Academy of Sciences on the Black Sea coast. Her hearing thresholds were measured before the present study using the same evoked potential audiometry as used in previous studies with other belugas (Klishin et al. 2000; Popov and Supin 2009; Popov et al. 2013). Within a range of 32–54 kHz, the thresholds ranged from 45 to 48 dB re 1 µPa; at a frequency of 64 kHz (the tested frequency in the present study), the threshold was 60 dB re 1 µPa. These thresholds were assessed as indicative of normal hearing.

The animal was housed in a round sea water tank that was 6 m in diameter and 1.7 m in depth. The care and use of the animal were in compliance with the Guidelines of the Russian Ministry of Education and Science on the use of animals in biomedical research.

During experimentation, the water level in the pool was lowered to 60 cm. The animal was supported by a stretcher so that the dorsal part of the body and the blowhole were above the water surface. The stretcher was transparent to sounds (made of fish net). The animal was not anesthetized. The transducer that played sounds was immersed in the water at a depth of 30 cm and was placed 1 m in front of the animal’s head.

Signals

During a data acquisition trial, the subject was exposed to a continuous succession of two types of alternating signals, which are referred to below as adapting and test signals. Each adapting signal lasted 128 ms, and each test signal lasted 16 ms (Fig. 1a). Thus, a period containing one adapting and one test signal lasted 144 ms, with the adapting signal occupying 89% of the period. With this percentage, we assumed that the adaptation level of the auditory system was primarily determined by the adapting signal. The adapting signal was either a tone pip train of a 64-kHz carrier or a 64-kHz pure tone; in baseline trials, there was no adapting signal. Each data acquisition trial contained 4000 presentations of pairs of the adapting and test signals. Thus, the overall duration of every trial was 576 s.
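The timing figures quoted above follow directly from the two signal durations and the number of presentations. A minimal sketch of that arithmetic (the variable names are illustrative and not taken from the study software):

```python
# Durations and counts as stated in the text.
ADAPT_MS = 128      # adapting signal duration, ms
TEST_MS = 16        # test signal duration, ms
N_PERIODS = 4000    # adapting/test pairs per trial

period_ms = ADAPT_MS + TEST_MS              # one adapting + one test signal
adapt_fraction = ADAPT_MS / period_ms       # share of the period taken by the adapting signal
trial_s = N_PERIODS * period_ms / 1000      # total trial duration, s

print(f"period = {period_ms} ms, adapting share = {adapt_fraction:.1%}, trial = {trial_s} s")
```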

Fig. 1

Timing and waveforms of the signals. a Timing diagram of the presentation of adapting and test signals. An arbitrarily chosen segment containing 3 of 4000 cycles of adapting and test signals is presented. Note the constant level of adapting signals and cycle-by-cycle variation of the levels of test signals. b–d Envelopes of adapting signals in a 16-ms segment presented on an extended time scale; in a, the extended segment is marked by dashed lines. b Pip train of 1-kHz pip rate, 0.0625-ms ERD. c Pip train of 0.125-kHz pip rate, 0.5-ms ERD. d Pure tone. e Envelope of the test signal on an extended time scale; in a, the extended segment is marked by dashed lines. f, g Test signal waveform in a 1-ms segment on an extended time scale; in e, the extended segment is marked by dashed lines. a–e Schematic presentations of the signal envelope; f electronic pip waveform; g acoustic pip waveform (delayed relative to the electronic waveform by 0.67 ms)

For pip train adapting signals, each pip was enveloped by one cycle of a cosine function. The cosine cycle constituted 0.125 of the pip repetition interval. With this pip duration, its duty cycle was 0.0625, and the rise–fall time was 0.0625 of the pip repetition interval. With the duty cycle of 0.0625, the root-mean-square (RMS) level of a pip train was − 15 dB re. peak level (Fig. 1b, c).

For adapting signals, five pip rates were used: 0.0625, 0.125, 0.25, 0.5, and 1 kHz. At these rates, the parameters of the signals are presented in Table 1 where different combinations of parameters are arbitrarily designated from (1) to (5). Figure 1b, c show envelopes of adapting signals with pip rates of 1 kHz (Fig. 1b) and 0.125 kHz (Fig. 1c). For pure tone adapting signals, the RMS level was − 3 dB re. peak level (Fig. 1d).
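The pip train parameters and RMS levels described above can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: it assumes that the equivalent rectangular duration (ERD) of a one-cycle cosine envelope is half the cycle length and that the train RMS is computed as if each pip were a rectangular tone burst of ERD length. The function and key names are hypothetical.

```python
import math

CYCLE_FRACTION = 0.125                        # cosine cycle / pip repetition interval (from the text)
PIP_RATES_KHZ = [0.0625, 0.125, 0.25, 0.5, 1.0]


def pip_train_parameters(rate_khz):
    """Derive the timing parameters of a pip train from its pip rate."""
    interval_ms = 1.0 / rate_khz              # pip repetition interval
    cycle_ms = CYCLE_FRACTION * interval_ms   # full length of the cosine envelope
    erd_ms = cycle_ms / 2.0                   # assumed ERD: half of the cosine cycle
    return {
        "rate_khz": rate_khz,
        "interval_ms": interval_ms,
        "cycle_ms": cycle_ms,
        "erd_ms": erd_ms,
        "rise_fall_ms": cycle_ms / 2.0,       # rise and fall each take half a cycle
        "duty_cycle": erd_ms / interval_ms,
    }


def rms_re_peak_db(duty_cycle):
    """RMS level re. peak for a sinusoid that is on for a fraction duty_cycle of the time."""
    # The mean square of a unit-peak sinusoid is 0.5; gating scales it by the duty cycle.
    return 10.0 * math.log10(0.5 * duty_cycle)


for rate in PIP_RATES_KHZ:
    p = pip_train_parameters(rate)
    print(p, round(rms_re_peak_db(p["duty_cycle"]), 1))

print("pure tone:", round(rms_re_peak_db(1.0), 1))
```

Under these assumptions every pip rate yields the same duty cycle, so all five trains share one RMS-to-peak ratio and differ only in their temporal structure.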

Table 1 Parameters of pip train adapting signals

The test signal was a tone pip train of a 64-kHz carrier (Fig. 1e). The pip rate was 1 kHz. Tone pips were enveloped by one cycle of a cosine function. The cycle duration was 0.125 ms, so pip ERD was 0.0625 ms, and each pip contained eight carrier cycles. The duty cycle of the signal was 0.0625, so the RMS level of the train was − 15 dB re. peak level.
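As an illustration of the pip shape described above, the following sketch builds one tone pip with a one-cycle cosine envelope and checks its ERD and number of carrier cycles. The sample rate is an assumed value chosen only for illustration, and `cosine_pip` is a hypothetical helper, not part of the original LabVIEW program.

```python
import math

CARRIER_HZ = 64_000        # carrier frequency (from the text)
CYCLE_S = 0.125e-3         # duration of the cosine envelope cycle (from the text)
SAMPLE_RATE_HZ = 512_000   # assumed synthesis rate, for illustration only


def cosine_pip(carrier_hz, cycle_s, sample_rate_hz):
    """Return (envelope, waveform) of one tone pip with a one-cycle cosine envelope."""
    n = round(cycle_s * sample_rate_hz)
    envelope = [0.5 * (1.0 - math.cos(2.0 * math.pi * i / n)) for i in range(n)]
    waveform = [
        e * math.sin(2.0 * math.pi * carrier_hz * i / sample_rate_hz)
        for i, e in enumerate(envelope)
    ]
    return envelope, waveform


envelope, waveform = cosine_pip(CARRIER_HZ, CYCLE_S, SAMPLE_RATE_HZ)

# ERD: duration of a rectangular pip with the same peak and the same envelope area.
erd_s = sum(envelope) / max(envelope) / SAMPLE_RATE_HZ
carrier_cycles = CARRIER_HZ * CYCLE_S

print(f"ERD = {erd_s * 1e3:.4f} ms, carrier cycles per pip = {carrier_cycles:.1f}")
```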

The acoustic pip waveform (Fig. 1g) was slightly distorted, in particular, prolonged compared to the electronic signal (Fig. 1f) because of the frequency response of the transducer and reflections from the tank bottom and water surface. The prolongation was around 0.1 ms, so it could not substantially influence the signal level and duty cycle.

During a data acquisition trial, the adapting signal level was kept constant, whereas test signal levels varied pseudorandomly across periods. The variation was within a 35-dB range with 5-dB steps, i.e., the range contained eight levels. During a trial containing 4000 signal presentations, each of the 8 test signal levels was presented 500 times on average. The reason for within-trial random variation of the test signal levels was to equalize the contribution of test signals to adaptation and to make the adaptation equal for all test signal levels independently of signal position within the 576-s trial.
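The level randomization can be sketched as follows. This is a sketch under the assumption that each period draws its test level independently and uniformly from the eight levels, which is consistent with "500 times on average" but is not stated explicitly in the text. The function name, the seed, and the 65-dB starting level are illustrative.

```python
import random
from collections import Counter


def draw_test_levels(lowest_db, n_periods=4000, n_levels=8, step_db=5, seed=0):
    """Draw one test signal level per period from n_levels equally spaced levels."""
    levels = [lowest_db + step_db * k for k in range(n_levels)]
    rng = random.Random(seed)   # fixed seed only to make the sketch reproducible
    return [rng.choice(levels) for _ in range(n_periods)]


sequence = draw_test_levels(lowest_db=65)
counts = Counter(sequence)

# Each level should occur roughly 4000 / 8 = 500 times.
for level in sorted(counts):
    print(level, counts[level])
```

Because the levels are interleaved within one trial, every level is sampled under the same adaptation state rather than at a particular stage of the 576-s exposure.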

Both adapting and test signals were digitally synthesized by a standard personal computer at a sampling rate of 512 kHz, using a custom-made program (virtual instrument) designed with LabVIEW software (National Instruments, Austin, TX). The synthesized signals were digital-to-analog converted by an NI DAQ-6251 acquisition board (National Instruments). To amplify and attenuate the adapting and test signals, a custom-made amplifier attenuator with a 200-kHz passband and 50-Ohm output impedance was used. Using the amplifier attenuator, the 35-dB range of the in-trial variation of test signal levels could be shifted from 65–100 dB sound pressure level (SPL) to 95–130 dB SPL. Adapting signals varied within a range of 65–140 dB SPL. The combination of the adapting and test signals was played through a B&K 8104 transducer (Bruel & Kjaer, Naerum, Denmark).

Acoustic measurements

The SPL of the signals was specified in dB RMS re 1 µPa. The RMS of pip trains was calculated over the whole signal, including both the pips and the inter-pip intervals. Because of temporal summation within the auditory system, this integrated RMS provides a good fit for the tone stimulus data (Supin and Popov 2007). SPL was monitored before and after the experiments by positioning a calibrated receiving hydrophone (B&K 8103, Bruel & Kjaer) in front of the animal’s head. Despite the sound reflections within the tank, local sound levels in front of the animal’s head varied within a range of only ± 2.5 dB.

Evoked potential recording

Brain potentials were picked up through surface F-E5G 10-mm gold-plated disk electrodes (Grass Technologies, Warwick, RI). The active electrode was positioned at the vertex, 7 cm behind the blowhole and above the water surface. The reference electrode was positioned at the back above the water surface. Brain potentials were fed through shielded cables to a LP511 brain potential amplifier (Grass Technologies) with an 80-dB gain and a frequency passband of 100–3000 Hz. Amplified brain potentials were digitized at a sampling rate of 16 kHz with a 16-bit analog-to-digital converter, which was one of the A/D channels of the NI DAQ-6251 acquisition board. The digitized signals were processed using a custom-made program (Virtual Instrument) that was designed using LabVIEW software.

Online data processing involved coherent averaging of the brain potential responses. For averaging, 25-ms epochs time-locked to the test signals were extracted from the brain potential record (the epoch onset coincided with the test signal onset). The epochs were sorted into eight bins according to the eight signal levels presented in each trial. For each signal level, the original records were averaged. Thus, in each trial, eight simultaneously collected averaged responses to test signals differing by 5-dB steps were obtained (Fig. 2).
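The sorting and averaging step can be sketched as below. This is a minimal illustration, not the original LabVIEW implementation: `average_by_level` is a hypothetical helper, the record is a plain list of samples, and onsets are given as sample indices (at a 16-kHz sampling rate, a 25-ms epoch would be 400 samples).

```python
def average_by_level(record, onsets, levels, epoch_len):
    """Average stimulus-locked epochs separately for each test signal level.

    record    -- digitized brain potential samples
    onsets    -- sample index of each test signal onset
    levels    -- test signal level (dB) presented at each onset
    epoch_len -- epoch length in samples
    """
    sums = {}
    counts = {}
    for onset, level in zip(onsets, levels):
        epoch = record[onset:onset + epoch_len]
        if len(epoch) < epoch_len:
            continue                      # skip an incomplete epoch at the end of the record
        acc = sums.setdefault(level, [0.0] * epoch_len)
        for i, value in enumerate(epoch):
            acc[i] += value
        counts[level] = counts.get(level, 0) + 1
    return {level: [s / counts[level] for s in acc] for level, acc in sums.items()}


# Toy usage with made-up numbers: two epochs at 70 dB, one at 75 dB.
toy_record = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
averages = average_by_level(toy_record, onsets=[0, 4, 8], levels=[70, 70, 75], epoch_len=2)
print(averages)
```

Averaging epochs that share the same onset alignment keeps the stimulus-locked response while reducing activity that is not time-locked to the test signal.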

Fig. 2

RFR waveforms (a) and spectra (b) at various test signal levels. The eight averaged records were obtained in one baseline (no adapting signal) trial. Signal SPLs (dB re 1 µPa) are indicated near the record waveforms and spectra; TS test signal envelope. Vertical dashed line in b marks the 1-kHz spectral peak

The subsequent offline processing quantified a rate following response (RFR) evoked by the rhythmic pip train test signal. For that processing, a 16-ms segment of each averaged record (from the 6th to the 22nd ms after stimulus onset) containing RFR to the test pip train signal was Fourier transformed to obtain the response frequency spectrum. The amplitude of the 1-kHz spectral peak was considered a measure of RFR amplitude. A record was considered response present if the amplitude of the 1-kHz spectrum peak was at least twice as high as amplitudes of spectrum components within a frequency range from 0.75 to 1.25 kHz. In such cases, the 1-kHz peak amplitudes were plotted as a function of the test signal level. Otherwise, the record was considered response absent, and the peak amplitude was not plotted.
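A minimal sketch of this quantification is given below, assuming the 16-kHz sampling rate of the recording, so that the 16-ms segment holds 256 samples and the spectral resolution is 62.5 Hz. The criterion is interpreted here as "the 1-kHz peak is at least twice the largest other component between 0.75 and 1.25 kHz"; this reading, and the function names, are assumptions.

```python
import cmath
import math


def bin_amplitude(segment, k):
    """Amplitude of DFT bin k, scaled so that a unit-amplitude sinusoid gives 1.0."""
    n = len(segment)
    acc = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(segment))
    return 2.0 * abs(acc) / n


def rfr_amplitude(averaged, sample_rate_hz=16_000, start_ms=6, length_ms=16,
                  rate_hz=1000, band_hz=(750, 1250)):
    """Return (peak amplitude at rate_hz, response-present flag) for one averaged record."""
    start = round(start_ms * sample_rate_hz / 1000)
    n = round(length_ms * sample_rate_hz / 1000)
    segment = averaged[start:start + n]
    resolution_hz = sample_rate_hz / n
    k_peak = round(rate_hz / resolution_hz)
    k_low = math.ceil(band_hz[0] / resolution_hz)
    k_high = math.floor(band_hz[1] / resolution_hz)
    peak = bin_amplitude(segment, k_peak)
    others = [bin_amplitude(segment, k) for k in range(k_low, k_high + 1) if k != k_peak]
    present = peak > 0 and peak >= 2.0 * max(others)
    return peak, present


# Synthetic 25-ms records (400 samples): a clean 1-kHz response, and one
# contaminated by a strong 875-Hz component inside the comparison band.
clean = [math.sin(2 * math.pi * 1000 * i / 16_000) for i in range(400)]
noisy = [c + 0.8 * math.sin(2 * math.pi * 875 * i / 16_000) for i, c in enumerate(clean)]
print(rfr_amplitude(clean), rfr_amplitude(noisy))
```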

Statistics

For each type and level of adapting signals, measurements were repeated three times. The results of the measurements were averaged offline and presented as the means and standard deviations.

Results

Rate following response (RFR) to test signals

The test signals elicited an evoked potential complex, as shown in Fig. 2a. The figure presents a family of eight records obtained in one baseline trial (without an adapting signal). Each of the records displayed RFR as a series of rhythmic waves with a 1-kHz frequency. The frequency spectrum of a segment of the record from the 6th to the 22nd ms displayed a definite peak at a frequency of 1 kHz (Fig. 2b). Its amplitude depended on the test signal level within the 35-dB range of within-trial level variation; the spectrum peak was maximal at a signal SPL of 110 dB and met the response presence criterion down to a level of 80 dB SPL. At a signal SPL of 75 dB, the spectrum peak approached the record noise, so this record was considered response absent.

RFR amplitude reduction by adapting signals

When a trial included both adapting and test signals, the RFR amplitude was reduced compared to baseline (no adapting signal) conditions. This effect depended on the adapting signal type. Pip train adapting signals produced deeper RFR reduction than a pure tone of the same SPL. This difference is evident when both adapting and test signals have equal or close SPLs. This case is demonstrated in Fig. 3, which presents a trial with equal SPLs for both the adapting and test signals. In the baseline trial with no adapting signal, the test signal evoked robust RFR (1). When the adapting signal was a 1-kHz pip train (see the signal envelope in Fig. 1b), i.e., the adapting and test signals composed a continuous succession of pips with a rate of 1 kHz, RFR amplitude was several times lower than that of the baseline record (2). When the adapting signal was a pure tone (see Fig. 1d), RFR reduction relative to the baseline was negligible (3).

Fig. 3

An example of adaptation effects of different adapting signals. 1—baseline (no adapting signal) RFR to a test signal of 105 dB SPL; 2—an adapting signal with a pip train of 1-kHz pip rate, 105 dB SPL; 3—an adapting signal as a pure tone of 105 dB SPL; 4—test signal envelope

Dependence of the adaptation effect on adapting signal type

Figure 3 demonstrated that pip train adapting signals produced a deeper reduction in RFR than a pure tone of the same SPL. In more detail, this difference is presented in Fig. 4 as RFR amplitude dependencies on test signal level. These dependencies are presented for test signal levels of 70 dB SPL and higher, because at a signal level of 65 dB SPL RFR was absent according to the adopted criterion (see “Materials and methods”). RFR amplitude monotonically depended on the test signal level (the higher the level, the higher the amplitude). The presence of an adapting signal shifted the amplitude-level function to higher test signal levels.

Fig. 4

Amplitude-level functions for various adapting signals. Adapting signals: a Pip train of 1-kHz pip rate. b Pure tone. c Pip train of 0.125-kHz pip rate. Adapting signal levels are presented in the legends in dB SPL

With the adapting signal as a pip train with a 1-kHz rate (see the signal envelope in Fig. 1b), the shift appeared at an adapting signal level as low as 80 dB SPL; at a level of 125 dB, the function shifted upward by approximately 40 dB (Fig. 4a).

The adapting signal as a pure tone (see Fig. 1d) also resulted in an upward shift of the amplitude-level function. However, this shift was much smaller than in the previous case. Adapting signal levels up to 110 dB SPL resulted in no noticeable shift. A noticeable shift appeared at an adapting signal level of 115 dB SPL, and at the maximum adapting level of 125 dB, the shift was as small as 15 dB (Fig. 4b).

Above, the SPLs of both pip trains and pure tones were presented as RMS values (see “Acoustic measurements”). The validity of equalizing pip trains and pure tones by RMS is not obvious because, at equal RMS, pip trains with a 0.0625 duty cycle have peak amplitudes 12 dB higher than those of pure tones, as seen by comparing Fig. 1b–d. The influence of this difference on adaptation effects was not known in advance. So, for a valid comparison, the effects of adapting pip trains at various pip rates and various pip durations were investigated while keeping the duty cycle constant. With equal duty cycles, all pip trains had equal ratios of peak-to-RMS levels, so when equalized by RMS levels, they were equal in all other energy parameters and differed only in the temporal structure.

Figure 4c presents the effect of an adapting signal as a pip train with the 0.125-kHz pip rate (see Fig. 1c). This adapting signal produced a shift in the amplitude-level function that was smaller than the shift produced by the 1-kHz pip rate (Fig. 4a) and close to the shift produced by the pure tone (Fig. 4b).

All the data obtained with the pip train adapting signals of various pip rates are presented in Fig. 5. The figure presents the test signal levels that produced RFR of a criterion amplitude as a function of the adapting signal level. As a criterion, 0.5 µV was chosen as a value close to the midpoint of the dynamic range for baseline RFR. The figure demonstrates the following regularities.

Fig. 5

Test signal levels producing 0.5-µV RFR as a function of the adapting signal level. Pip rates (kHz) in adapting signals are indicated in the legend; tone—pure tone adapting signal

  • The higher the adapting signal level, the higher the test signal level that evoked RFR of the criterion amplitude. This regularity demonstrated a routine adaptation effect.

  • The lower the pip rate in the adapting signal, the higher the adapting signal level needed to reduce RFR to the criterion value, i.e., the weaker the adaptation effect. The pip train of 0.0625-kHz pip rate produced an effect that was almost as low as that of the pure tone.
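The criterion levels plotted in Fig. 5 can be read off each amplitude-level function. The text does not state how the 0.5-µV crossing was located, so the sketch below simply assumes linear interpolation between adjacent measured points; `criterion_level` is a hypothetical helper and the data are made-up numbers used only to show the mechanics.

```python
def criterion_level(levels_db, amplitudes_uv, criterion_uv=0.5):
    """Test signal level at which the amplitude-level function first reaches the criterion.

    Uses linear interpolation between adjacent points (an assumption, not the
    documented procedure). Returns None if the criterion is never crossed.
    """
    points = list(zip(levels_db, amplitudes_uv))
    for (l0, a0), (l1, a1) in zip(points, points[1:]):
        if a1 > a0 and a0 <= criterion_uv <= a1:
            return l0 + (criterion_uv - a0) * (l1 - l0) / (a1 - a0)
    return None


# Hypothetical amplitude-level functions (µV) at test levels of 70-85 dB SPL.
test_levels = [70, 75, 80, 85]
baseline = [0.125, 0.25, 0.75, 1.0]
adapted = [0.0, 0.125, 0.25, 0.75]

shift_db = criterion_level(test_levels, adapted) - criterion_level(test_levels, baseline)
print(f"criterion-level shift = {shift_db} dB")
```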

Discussion

Manifestation of the adaptation

In agreement with previous observations (Popov et al. 2016, 2018), the present study showed that the adapting sounds presented between test signals reduce evoked potential responses in a beluga whale. This effect may be considered a manifestation of auditory adaptation.

In the present study, the duration of every adaptation trial (576 s) far exceeded the time constants of the adaptation found by Popov et al. (2016, 2018), and the test signals of various levels were randomly distributed within the trial. Thus, the data obtained herein may be assumed to characterize conditions approaching stabilized adaptation.

The data presented in this study reveal some properties of this effect. Specifically, adaptation appeared as a shift of the function relating the RFR amplitude to the test signal level (hereafter referred to as the amplitude-level function) along the level axis. Therefore, the adaptation influenced both low- and high-amplitude responses to a comparable extent. A similar effect has been described in auditory nerve fibers as a dynamic range adaptation (Wen et al. 2009, 2012). This manner of adaptation implies decreased sensitivity (increased thresholds) of the auditory system; as a result, particular sensation levels of the test signal are achieved at increased SPLs, which appear as a shift of the amplitude-level function.

Levels of the auditory system subjected to adaptation

In laboratory animals, dynamic range adaptation has been described in the auditory midbrain (Dean et al. 2005) and the auditory cortex (Watkins and Barbour 2008). However, a similar adaptation has been described in the auditory nerve (Wen et al. 2009, 2012). Therefore, the adaptation observed at higher levels of the auditory system may reflect processes in the auditory periphery. Indeed, several models address depletion and subsequent restoration of the transmitter in peripheral synaptic levels of the auditory system (Eggermont 1985; Meddis 1988; Hewitt and Meddis 1991; Zilany and Carney 2010). A similar suggestion is applicable to the experiments described in this study. Non-invasively recorded RFR in odontocetes is a rhythmic sequence of evoked potentials, and the main component is the auditory brainstem response (ABR). The highest waves of this response in odontocetes mainly reflect midbrain activity (Supin et al. 2001). Therefore, the observed adaptation event may originate either at the auditory midbrain or at lower levels of the auditory system.

Adaptation effects of “fast” and “slow” adapting signals

In the experiments described above, the adaptation effect substantially depended on the temporal parameters of the adapting signal: “fast” signals (short pips of high rate) produced stronger effects than “slow” signals (long pips with a low rate or pure tones). The pure tone produced the least adaptation effect.

For the pure tone, the comparison with pip trains might be ambiguous because of the different peak-to-RMS ratios: 3 dB for a pure tone and 15 dB for pip trains of the duty cycle used in the present study. Based on a previous study (Supin and Popov 2007), we assumed that the long-term RMS of pip trains is an appropriate metric for such a comparison. However, an ambiguity of a few decibels could not be excluded. In this regard, the “fast” and “slow” pip train signals were more indicative. They had equal duty cycles and, consequently, equal peak-to-RMS ratios, yet they produced different adaptation effects. This difference must therefore have depended on the temporal and corresponding spectral differences between the signals.

Fast signals with wide frequency spectra synchronously trigger neurons within a wide range of sound frequency representations, which produces an intensive response. In contrast, slow signals trigger neurons less synchronously and within a narrower range of frequency representations. For the auditory system of odontocetes, it has been demonstrated with the evoked potential technique that sound signals with a short rise–fall time evoke higher auditory evoked potentials (AEPs) than signals with a long rise–fall time. In particular, short cosine envelope tone pips are more effective than long cosine envelope pips (Supin and Popov 2007). Additionally, the wider the frequency band of a signal, the more effectively it evokes AEPs in odontocetes (Popov and Supin 2001). When the signal was a sinusoidally amplitude-modulated tone, the highest AEP appeared at modulation rates as high as 600–1000 Hz and decreased in amplitude at lower modulation rates, i.e., at slower rise–falls (Supin and Popov 1995). The different efficiencies of fast (wide-band) and slow (narrow-band) adapting signals indicate that this difference not only manifests in AEP amplitude but also influences the activity associated with auditory adaptation.

Additionally, the efficiency of an adapting signal may depend on the relationship of its frequency band to the frequency band of the test signal. When the adapting signal was a train of fast pips, its frequency spectrum covered the same frequency band as that of the test signal; in contrast, an adapting signal consisting of a train of slow pips or a pure tone had a frequency band narrower than that of the test signal (Fig. 6). These frequency relationships may result in a lower efficiency of slow adapting signals compared to fast signals.

Fig. 6

Frequency spectra of pip trains (acoustic signals) with a 0.0625 duty cycle and of various pip rates and pure tone. Pip rates are indicated next to the spectra; T pure tone
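The relation between pip rate and spectral width can be estimated from the envelope alone. A one-cycle cosine envelope of duration T has the shape of a Hann window, whose spectral main lobe is 4/T wide between its first nulls; this is a textbook property, not a value taken from the measured spectra in Fig. 6. The sketch below applies it to the five pip rates, with the cosine cycle set to 0.125 of the pip repetition interval as in the Methods. The function name is hypothetical.

```python
PIP_RATES_KHZ = [0.0625, 0.125, 0.25, 0.5, 1.0]
CYCLE_FRACTION = 0.125   # cosine cycle / pip repetition interval (from the Methods)


def main_lobe_width_khz(rate_khz, cycle_fraction=CYCLE_FRACTION):
    """Null-to-null main-lobe width of a single cosine-envelope pip, in kHz."""
    cycle_ms = cycle_fraction / rate_khz   # pip envelope duration T, ms
    return 4.0 / cycle_ms                  # Hann main lobe spans 4/T (1/ms = kHz)


for rate in PIP_RATES_KHZ:
    print(f"pip rate {rate} kHz: main lobe about {main_lobe_width_khz(rate)} kHz wide")
```

On this estimate, the 1-kHz adapting train has the same envelope bandwidth as the 1-kHz test train, whereas the slowest train is 16 times narrower, in line with the qualitative picture described for Fig. 6.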