Introduction

Barn owls are well known for their exquisite sound-localization capabilities (Konishi 1973; Takahashi 2010; Wagner et al. 2013; Grothe 2018). Despite its much smaller skull, this animal can localize sound sources with a precision resembling that of humans (Blauert 1997; Bala et al. 2003). Broadband noise is the most effective stimulus for barn owls (henceforth also the “owl”) in their quest to capture prey (Konishi 1973; Singheiser et al. 2010; Takahashi 2010). This is reflected in neurons at the endpoints of the auditory pathway, the external nucleus of the inferior colliculus in the midbrain and the auditory arcopallium in the forebrain (Vonderschen and Wagner 2009), that integrate information over a wide frequency range.

The main cues that have been identified so far to underlie sound localization in the owl are the interaural time difference (ITD) and interaural level difference (ILD) (Moiseff and Konishi 1981; Poganiatz et al. 2001; Egnor 2001; Poganiatz and Wagner 2001; Bala et al. 2003; Krumm et al. 2019). Most studies have focused on the use of broadband stimuli. These studies have shown that barn owls are able to localize sound sources with a precision of 3°–4° (Bala et al. 2003; Krumm et al. 2019), corresponding to about 10 µs (von Campenhausen and Wagner 2006) or 4 dB (Bala, pers. comm.). Testing of frequency-specific cues revealed some role of frequency in sound localization in the owl as well. For example, Cazettes et al. (2014, 2016) demonstrated a correlation between the frequency tuning of neurons and spatial reliability of interaural phase difference, the phase equivalent of ITD. Likewise, Kettler et al. (2017) demonstrated that ILD helps to disambiguate frequency-specific timing information. Despite these contributions, it has remained unclear whether behaving barn owls and neurons in the optic tectum of these birds are able to discriminate positions in space characterized by equal broadband ITD and ILD. Such a discrimination capability would be useful, because prey typically produces broadband sounds.

We tackled this question by making use of head-related transfer functions (HRTFs). HRTFs contain the complete spatial information about a sound-source location (Wightman and Kistler 1989a; for a review see Blauert 1997; for the owl see also Keller et al. 1998; Poganiatz et al. 2001; von Campenhausen and Wagner 2006). We focused on such positions at which broadband ITDs and ILDs were ambiguous. We reasoned that if owls have more information available than carried by broadband ITD and ILD, the birds should be able to distinguish between positions at which the broadband ITDs and ILDs coincide up to the known behavioral resolution limits of these parameters. A discrimination of positions characterized by equal broadband ITD and ILD would show in differences in response latencies and/or the endpoints of head saccades in behavior or in different spike rates in electrophysiological experiments.

Using HRTF stimuli in behavioral and electrophysiological experiments, we demonstrate here that the owls have more information available and use more information for sound localization than contained in broadband ITD and ILD.

Materials and methods

Data were obtained from 8 adult barn owls (Tyto furcata pratincola) of either sex (5 females, 3 males) from the colony at the Institute of Biology II at RWTH Aachen University (Aachen, Germany). The procedures conformed to NIH principles of animal care and were approved by local authorization committees (LANUV).

Anesthesia

The protocol was the same as described in Ferger et al. (2018). Briefly, on the day before surgery or a recording session and on the day of an experiment, the owls were weighed and their health status was scored. Owls were sedated by an intramuscular injection of diazepam (1 mg/kg body weight; Ratiopharm GmbH, Ulm, Germany). After 30 min, anesthesia was induced by an intramuscular injection of ketamine (20–40 mg/kg; Ceva Tiergesundheit GmbH, Düsseldorf, Germany). Injections of atropine sulfate (0.05 mg/kg, intraperitoneal; Braun Melsungen AG, Melsungen, Germany) and buprenorphine (0.3 mg/kg, intramuscular; Reckitt Benckiser Healthcare (UK) Ltd., Hull, UK) were administered to reduce salivation and pain, respectively. Furthermore, ketamine and diazepam were given at intervals of 1–2 h as needed by the individual to maintain anesthesia. At the end of a recording session, the electrode was removed and the brain surface covered with an antibiotic ointment (Neomycinsulfat, Schur Pharmazeutika GmbH & Co. KG, Düsseldorf, Germany). The owls received another injection of buprenorphine before suture of the scalp at the end of a session and, after surgery, a dose of Carprofen (4–6 mg/kg; Pfizer GmbH, Berlin, Germany) for post-operative analgesia. The owls were then brought to a recovery box with access to food. After full recovery, usually overnight, the owls were returned to their home cage.

Initial surgery

A bar made of aluminum or brass was implanted on the skull of the owls under anesthesia before experimental sessions started. The bar allowed fixation of the anesthetized animals during recordings of head-related acoustic information and during electrophysiological measurements, and served to attach the earphone frame and sensors during behavioral experiments.

Experimental design

Birds were stimulated either from free-field speakers or with earphones in a virtual auditory world. The virtual auditory world was created through HRTFs. Behavioral reactions and responses of neurons were recorded as detailed below.

Head-related transfer functions

Measurements and analysis have been described in detail in Poganiatz et al. (2001) and in von Campenhausen and Wagner (2006). Therefore, we present only a short overview here. Measurements were performed in a sound-attenuating chamber (IAC 403A, Industrial Acoustics, Niederkrüchten, Germany). A broadband loudspeaker (MacAudio ML-103E) was positioned along a custom-built semicircular track (90 cm radius). The track defined a three-dimensional spherical coordinate system with 0° azimuth and 0° elevation directly in front of the owl. Speaker rotation was possible from − 160 (left) to + 160 (right) degrees in azimuth and + 80 (up) to − 70 (down) degrees in elevation. Data were taken in steps of 10° for both coordinates. This resulted in measurements at 528 positions. The center of the owl's head was fixed in the center of the track, so that the midpoint between the ear canals was as close to the center of the sphere as possible. Audio hardware [Tucker-Davis Technologies Inc., Alachua, FL, USA (System II)] was used for sound delivery and recording. Transfer characteristics of the ears were measured with sound sweeps covering a broad frequency range (20 Hz–6 kHz, logarithmically rising, 500 ms duration, 5 ms rise/fall time, 5 repetitions). We fitted two microphones (Sennheiser KE4 21122) with a flexible probe tube (outer diameter 1.4 mm, inner diameter 1.2 mm, length 25 mm). The tubes were inserted 10 mm into the ear canal, so close to the eardrums that a veridical signal resulted (Keller et al. 1998; Poganiatz et al. 2001). The microphone signal was amplified, filtered (20 kHz cutoff frequency, 60 dB roll-off within 3 kHz) and recorded for 510 ms, beginning with the start of the stimulus. Digital-to-analog and analog-to-digital sampling were done with the same frequency (100 kHz), sampling depth (16 bit), and a synchronized clock. A reference measurement without the owl in place was also performed to remove influences of the setup on our data. For the reference measurement, the microphones with the tubes were positioned in the center of the setup, arranged in parallel at a horizontal distance of 1 cm and pointed toward 0° azimuth and 0° elevation.

To obtain the direction-specific transfer function for a given position, the discrete Fourier transform (DFT, calculated with the MATLAB function for discrete fast Fourier transform, FFT) of the recorded signal was divided by the DFT of the reference measurement. The inverse Fourier transform of that quotient gave the impulse response for that position at each ear, the so-called head-related impulse response (HRIR). Broadband interaural time difference (ITD) was derived from a cross-correlation of the right and left parts of the HRIR at any given position. Note that the sampling frequency of 100 kHz limited the time resolution to 10 µs. Broadband interaural level difference (ILD) was calculated by subtracting the left from the right HRIR power in decibels. The ITD was positive for right-ear leading sounds. The ILD was positive for right-ear louder sounds.
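The processing chain described above can be illustrated with a minimal Python sketch. It is not the original MATLAB analysis: the function names, the use of NumPy, and the synthetic test signals are assumptions made for illustration, and the band-limiting applied to the broadband cues is omitted.

```python
import numpy as np

FS = 100_000  # sampling rate from the text (Hz); one sample = 10 µs


def hrir_from_recording(recorded, reference):
    """Divide the DFT of the in-ear recording by the DFT of the
    reference recording and transform the quotient back to time."""
    n = len(recorded)
    quotient = np.fft.rfft(recorded, n) / np.fft.rfft(reference, n)
    return np.fft.irfft(quotient, n)


def broadband_itd_us(hrir_left, hrir_right, fs=FS):
    """ITD in µs from the peak of the cross-correlation.
    Positive values mean that the right ear leads."""
    xcorr = np.correlate(hrir_left, hrir_right, mode="full")
    # Lag (in samples) by which the left signal trails the right one.
    lag = np.argmax(xcorr) - (len(hrir_right) - 1)
    return lag / fs * 1e6


def broadband_ild_db(hrir_left, hrir_right):
    """ILD in dB: right power minus left power on a decibel scale.
    Positive values mean that the right ear is louder."""
    power_left = np.sum(np.square(hrir_left))
    power_right = np.sum(np.square(hrir_right))
    return 10.0 * np.log10(power_right / power_left)


# Synthetic check with hypothetical signals (not owl recordings).
rng = np.random.default_rng(0)
reference = rng.standard_normal(512)
recorded_left = np.roll(reference, 5)         # left ear: 5-sample delay
recorded_right = 2.0 * np.roll(reference, 3)  # right ear: 3-sample delay, doubled amplitude

hrir_left = hrir_from_recording(recorded_left, reference)
hrir_right = hrir_from_recording(recorded_right, reference)
itd_demo = broadband_itd_us(hrir_left, hrir_right)
ild_demo = broadband_ild_db(hrir_left, hrir_right)
```

In this synthetic case the right-ear signal leads by two samples and has twice the amplitude, so the sketch recovers an ITD of about 20 µs and an ILD of about 6 dB, both positive under the sign conventions given above.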

The frequency bandwidth of the analysis was adjusted to the hearing curve of the owl (Dyson et al. 1998). For broadband ITDs and ILDs, we analyzed the frequency band (0.4–9 kHz) in which the behavioral threshold was close to its minimum. After broadband ITDs and ILDs had been calculated, pairs or even larger groups of ambiguous positions were determined. For positions to be termed ambiguous, their broadband ITDs and broadband ILDs were required to agree within ± 10 µs and ± 1 dB, respectively. Since ITDs and ILDs are conditionally independent (Fischer and Pena 2017), an influence of the ITD on the resolution of ILD or vice versa is not expected. In this way, we found about 200–700 pairs and triplets of ambiguous positions per owl within the acoustic space. In the second step, we selected ambiguous positions for use in experiments. These positions were preferentially located in the frontal hemisphere and were chosen to be different from the reference positions we also used in testing (see next section). All pairs or triplets of ambiguous positions selected in this way were within the limits just described (Table 1). The test positions were also chosen so that the fixation endpoints were expected to be far enough apart (typically at least 20°) to allow a conclusion about discrimination of the different positions by the owl. We also checked the absolute level at the selected positions and corrected for level, if necessary, so that the levels of the stimuli presented at the ambiguous positions were equal and could not serve as a cue for the owl. We then used these ambiguous pairs or triplets of positions as a tool for our studies.
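The search for ambiguous positions can be sketched as a pairwise comparison of the broadband cues. The following Python fragment is a minimal illustration under stated assumptions: the function name and the dictionary layout are hypothetical, the first two entries use the values reported for owl R in the Results, and the third entry is invented to show a rejected position.

```python
from itertools import combinations

ITD_TOL_US = 10.0  # tolerance from the text (µs)
ILD_TOL_DB = 1.0   # tolerance from the text (dB)


def find_ambiguous_pairs(cues, itd_tol=ITD_TOL_US, ild_tol=ILD_TOL_DB):
    """cues maps (azimuth, elevation) -> (broadband ITD in µs, broadband ILD in dB).
    Returns all position pairs whose cues agree within both tolerances."""
    pairs = []
    for (pos_a, (itd_a, ild_a)), (pos_b, (itd_b, ild_b)) in combinations(cues.items(), 2):
        if abs(itd_a - itd_b) <= itd_tol and abs(ild_a - ild_b) <= ild_tol:
            pairs.append((pos_a, pos_b))
    return pairs


demo_cues = {
    (20, 10): (60, 5),  # owl R, values quoted in the Results
    (40, 40): (70, 5),  # owl R, values quoted in the Results
    (0, 0): (0, 0),     # hypothetical position, added only for illustration
}
demo_pairs = find_ambiguous_pairs(demo_cues)
```

Triplets would follow from pairs that share a position. The additional selection criteria named above (frontal hemisphere, expected endpoint separation, level correction) are not part of this sketch.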

Table 1 Statistics on localization performance of ambiguous positions

Behavior

The tests with two male barn owls and one female barn owl (owls A, B, T) took place in an acoustic chamber similar to that used for the HRTF measurements. For stimulation, we generated broadband noise (300–15,000 Hz) and converted it to an analog signal by a TDT DA3-4 digital/analog converter (Tucker-Davis Technologies, Alachua, Florida, USA). A TDT F6 device was included to prevent aliasing. During the behavioral experiments, the owl sat on a perch in the center of the chamber with the head free to move and the legs loosely tied to the perch with falconry jesses. Two infrared cameras, one above and one in front of the owl, monitored the general behavior. A mechanical food dispenser provided food rewards. A red light-emitting diode (LED) was placed in front of the owl at 0° azimuth and 0° elevation and 1 m distance. Before the experiments, the animals were trained to fixate the LED for a few seconds. During experiments, the owl initialized a trial by fixating the position of the LED within a spatial window adjusted to the individual with a typical size of ± 7° in azimuth and ± 15° in elevation. During fixation, the LED was automatically switched on for a variable waiting time of 1–2.5 s. After this period, the LED was automatically switched off, and the stimulus was presented either via loudspeakers (Visaton VRS 8) or via earphones (Philips SHE2550) which were mounted on a custom-built frame. Stimuli had a duration of 100 ms with on and off ramps of 5 ms. Free-field sessions were performed separately from earphone sessions. In the former case, only the sensor of the head tracking system was attached to the aluminum bar. When hearing a sound, owls turn their head toward the perceived sound source. In the experiments, when the stimulus came from an “easy” position (see below), a bird was rewarded if its head turn hit a target window. When the stimulus came from a position that belonged to a pair or triplet of ambiguous positions, the owl was always rewarded, independently of the turning amplitude and direction, to avoid introducing a bias for one or more positions. In accordance with earlier experiences (e.g. Kettler et al. 2017), the target window was typically set to ± 7° in azimuth (“Azi”) and ± 15° in elevation (“Ele”) around the positions of the free-field or the virtual sources, but was adjusted to the stimulus situation and the animal if necessary. To keep the owl under stimulus control, care was taken that the daily reward rate did not drop below 70%.

In both free-field (FF) and virtual space (V), tests were conducted at several "easy" or reference positions [− 40° in azimuth, 0° in elevation (abbreviated: − 40/0); 40/0; − 40/30; 30/30; 40/30; − 40/− 30; − 40/− 20; 40/− 30; 50/− 30] (Wagner 1993; Poganiatz et al. 2001). For each owl, four or more reference positions were selected as there were four or more positions from the list of pairs or triplets of ambiguous positions. Care was taken that there was a left/right balance in source positions. At each position, at least 32 data points were collected.

Many parameters were extracted from the head-turning data. The most important parameters for this study were the endpoints of the head turns and response latency. Another, more general, parameter is localization error. Localization error is the difference between the position of the source and the endpoint of a head turn. The sign of the azimuthal localization error was positive if the endpoint of a head turn was closer to the front than the source. For the elevational component, the localization error was positive if the endpoint of the head turn was below the source. These parameters were determined as described earlier (Hausmann et al. 2009).

Electrophysiology

Standard electrophysiological techniques were used to record extracellularly from multi and single units in the optic tectum (OT). Recording was accomplished with a single tungsten electrode with an impedance around 20 MΩ. The optic tectum is the endpoint of the reflexive auditory pathway and contains a map of ITD and ILD (Knudsen 1982). While ITD tuning changes mainly along the antero-posterior axis, ILD tuning changes mainly along the dorso-ventral extent. The coordinates of the map were used in the positioning of the electrodes to find a location that represented a position that corresponded to an ambiguous pair or triplet as closely as possible.

When a suitable location had been found, first a basic characterization was obtained. Auditory stimuli were presented dichotically via earphones. Stimuli typically consisted of noise (typically 0.1–25 kHz or low-pass-filtered at 25 kHz) having a level of about 20 dB above response threshold. The duration of the stimuli was 100 ms with on and off ramps of 5 ms. Stimuli were presented in random order and typically repeated 5 times per parameter. The sensitivity of a unit to ITD ("ITD curve", ILD constant), to ILD ("ILD curve", typically at the ITD that evoked the highest response, the best ITD), to frequency (1/3 octave wide noises around the center frequencies, steps: 1/3 octave from 500 to 12,000 Hz at the best ITD and best ILD), to level (binaural and/or monaural at the best ITD and best ILD), and to virtual azimuth ("Azi curve", − 160° to 160° in steps of 10° at constant elevation) was examined. The physiological characteristics (sensitivity to coarse visual stimuli, best ITD, reduction of the side peaks compared with the peak at the best ITD ("side-peak suppression"), shape of ILD tuning curve, shape of frequency tuning curve, and the response behavior with monaural stimulation of the left or right ear) were the criteria for the decision whether a unit was located in the OT or not (for more information see Wagner et al. 2007).

In a second step, recordings with 100 stimulus repetitions per position were obtained for the selected ambiguous pair(s). Furthermore, if possible, Azi curves at as many elevations as possible were recorded. Ideally, this resulted in 16 such curves covering the complete acoustic space as represented in the HRTFs. In the following, we call the resulting map of spike rates by combining the 16 Azi curves of a unit a "spikemap".

Statistical analysis

Behavioral data were analyzed with respect to differences of the turning amplitudes with a Kolmogorov–Smirnov test for 2 samples and 2 dimensions. Electrophysiological and behavioral data were tested for differences in responses at the ambiguous positions with a Mann–Whitney U test. Furthermore, spikemaps were analyzed with respect to all ambiguous pairs of positions found in a particular bird. For this analysis, the mean responses as measured in the azimuthal tuning curves at the different elevations were used. The mean response at ambiguous position 2 of an ambiguous pair was plotted as a function of the mean response at position 1 of that pair in a scatter plot. The underlying assumption was that at ambiguous positions the response should be equal, apart from neural noise, if the neurons did not represent any other cue than broadband ITDs and ILDs in their responses. In other words, the correlation between the responses at the two positions of an ambiguous pair should be high. Therefore, the square of the correlation coefficient was used as a measure of the variance explained by the above-mentioned assumption. We tested how much the arbitrary assignment of the two positions of a pair to P1 and P2 influenced the correlation coefficient, and found little influence (not more than 0.02 in correlation). A similar analysis was carried out as a control after randomizing the indices of the arrays formed by all ambiguous pairs and containing the responses at both positions P1 and P2 with the MATLAB function randperm. This randomization destroys the correct assignment of responses to an ambiguous pair. After randomizing the indices, the responses were plotted as before. The prediction for this latter analysis was that the correlation should be close to zero if no spurious dependencies between the responses at those positions were present. As a further control, we also calculated the mean of the absolute values of the correlation coefficients obtained after randomization of indices.
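The correlation analysis and its shuffle control can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the original MATLAB code: the helper names are hypothetical, the pairing is broken by permuting one of the two response arrays (standing in for randperm), the number of shuffles is arbitrary, and the response values are invented.

```python
import random
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def shuffled_control(resp_p1, resp_p2, n_shuffles=1000, seed=0):
    """Mean absolute correlation after destroying the pair assignment
    by repeatedly permuting the responses at P2."""
    rng = random.Random(seed)
    shuffled = list(resp_p2)
    total = 0.0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        total += abs(pearson_r(resp_p1, shuffled))
    return total / n_shuffles


# Hypothetical mean responses (spikes per stimulus) for 50 ambiguous pairs.
resp_p1 = [float(i) for i in range(50)]
resp_p2 = [r + (3.0 if i % 2 else -3.0) for i, r in enumerate(resp_p1)]

r_squared = pearson_r(resp_p1, resp_p2) ** 2   # variance explained by the pairing
control = shuffled_control(resp_p1, resp_p2)   # expected to be much smaller
```

If responses depended only on broadband ITD and ILD, `r_squared` for the correctly paired data would be high, while the shuffled control would stay near the level expected from chance alone.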

Results

As mentioned above, the data presented here were collected from 8 barn owls. Behavioral data from 3 birds will be shown below, while the electrophysiological data stem from 5 owls. To reach our goal we first had to generate HRTFs from in-ear recordings and derive ambiguous positions from them. These data will be introduced first.

Head-related transfer functions

We recorded individual head-related impulse responses (HRIRs) of all eight owls and calculated individual HRTFs and individual distributions of broadband ITDs and ILDs. The general structure of the HRTFs was similar for all owls and consistent with earlier reports (Poganiatz et al. 2001; von Campenhausen and Wagner 2006; Hausmann et al. 2009). To explain our procedures, we show typical data from three birds in the following: one bird that participated in electrophysiological experiments (owl R, Fig. 1a), and two birds that were used for behavioral experiments [owls A (Fig. 1b) and T (Fig. 1c)]. The following description holds for the data obtained from all birds. As may be seen in Fig. 1, the ITD was negative for negative azimuths and positive for positive azimuths. The broadband ITD spanned a range of about 600 µs, with the extreme values of the ITD lying close to − 300 and 300 µs, respectively. Iso-ITD lines were almost vertical, i.e. occurred at a constant azimuth, for elevations from about − 40° to 40°, especially in the frontal 60° of azimuth. For extreme values of elevation, iso-ITD lines changed direction. For example, the 50 µs-ITD line in owl R (light yellow dashed line in Fig. 1a) bent in a clockwise direction for elevations larger than 60°, while the − 50 µs-ITD line (green dashed line in Fig. 1a) bent in a counter-clockwise direction for those elevations. The maximum and minimum of the ITD lay in the posterior hemisphere. The ILD pattern was more complex. Maximal values occurred at several positions. The typical frontal maximum and minimum occurred at 30/30 (azimuth/elevation, in degrees) and − 40/− 30, respectively (Fig. 1a). The ILD gradient was steepest around 0° in azimuth and 0° in elevation. These characteristics may be seen in a similar way also in the plots shown in Fig. 1b, c, and were also observed in the HRTFs of the other 5 owls.

Fig. 1

Distribution of broadband ITD and ILD. The distribution of the broadband ITD (dashed lines) and the broadband ILD (solid lines) is shown for three owls (a owl R, b owl A, c owl T). The colored pairs of dots show the ambiguous positions used for testing

Ambiguous positions

Ambiguous positions were defined as those pairs or triplets of positions whose broadband ITDs and broadband ILDs differed by no more than ± 10 µs and ± 1 dB. The distributions of broadband ITDs and broadband ILDs as shown in Fig. 1 were the basis for finding ambiguous positions. Thirty-four to sixty-six percent of the 528 locations sampled did not have an ambiguous partner. In other words, ambiguous locations were frequently present in the acoustic space of barn owls, and could cause problems in sound localization for the owl if it used only broadband ITD and ILD for sound localization. We show 7 pairs of ambiguous positions in Fig. 1, one for owl R (Fig. 1a), two for owl A (Fig. 1b), and four for owl T (Fig. 1c).

For example, one pair of ambiguous positions in owl R, an owl that was used in electrophysiological experiments, was found at 20/10 and 40/40 (blue filled circles in Fig. 1a). Note that at these positions the ITDs differed by 10 µs. The ITD was 60 µs at 20/10 and 70 µs at 40/40. The ILDs were equal (5 dB) (Fig. 1a). Among the ambiguous positions used in the behavioral experiments, one pair in owl A was found at − 100/50 and − 50/0 (black filled circles in Fig. 1b). At these two positions, the broadband ITD was − 140 and − 130 µs, respectively. By contrast, the broadband ILD was equal at the two positions (− 7 dB). The same held for the position pair − 80/− 50 and − 50/10 in owl T (black filled circles in Fig. 1c, Table 1). At these positions, the broadband ITD was − 150 and − 140 µs, respectively. The broadband ILD was − 3 dB at both positions (Fig. 1c, Table 1). There were further cases in which the ILD was equal (owl B V P1 and P2; see also Table 1). In other cases, the ITD was equal [owl B FF P7 (equal to V P2), FF P8 (equal to V P3), and V P1] (Table 1). Note that equal with respect to ITD here means equal within the computational accuracy limited by digital sampling (100 kHz). ITD was equal for all positions in the triplet (owl B V P1, V P2, V P3). By contrast, only V P1 and V P2 were characterized by the same ILD, while V P3 had a different ILD (Table 1). All in all, the samples that were used for behavioral tests included all the possible combinations: (a) ITD and ILD equal; (b) ILD equal, but ITD different by ± 10 µs; (c) ITD equal, but ILD within ± 1 dB; (d) both ITD and ILD within the limits (see Table 1).

This selection of ambiguous pairs or triplets should make it impossible for the owls to localize sound from these positions solely based on broadband ITD and broadband ILD. However, other (so far unknown) cues may still be available. To find out more about the possible availability of such other cues, we analyzed the amplitude and phase spectra at ambiguous positions in a next step. The amplitude and phase spectra at ambiguous positions showed a rich structure. This structure contained additional information that the owls could utilize in behavioral experiments or which could influence neuronal responses. We use the two typical examples shown in Figs. 2, 3 to describe our procedures. The first is the position pair P1 (− 100/50) and P2 (− 50/0) of owl A (black dots in Figs. 1b, 2a–e). Here, the amplitude spectrum at point P1 showed a reduction of the amplitude in the right input compared with the left input (Fig. 2a). This is consistent with the negative broadband ILD (− 7 dB, see Table 1). While the amplitude spectrum of the left input was flat (variation < 5 dB) from about 2 to 9 kHz, the right input showed decreases in amplitude around 2.6 and 5.4 kHz (Fig. 2a). The sound from the second point of this pair [P2 (− 50/0)] was also attenuated more in the right than the left ear. A slightly larger decrease in amplitude than in point P1 was observed in the right input around 2.8 kHz, while the amplitude spectrum was flat from 4 to 8 kHz for both the left and right inputs (Fig. 2b). If the ILD spectra, the differences of the amplitude spectra (right amplitude spectrum minus left amplitude spectrum), were calculated, frequency-specific ILDs were observed around 3 kHz and around 5.5 kHz (Fig. 2e). The ILD spectra, thus, revealed the decreases in amplitude, specifically the broad indentation around 5.5 kHz in the ILD spectrum of P1 (Fig. 2e). By contrast, the phase spectra did not show conspicuous deviations from linearity as expected for a constant delay (Fig. 2c, d).
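The ILD spectrum used here, the right amplitude spectrum minus the left amplitude spectrum in decibels, can be sketched as follows. The function names and the use of NumPy are assumptions for illustration, and the impulse responses in the example are artificial.

```python
import numpy as np

FS = 100_000  # sampling rate from the Methods (Hz)


def amplitude_spectrum_db(hrir, fs=FS):
    """Frequencies (Hz) and amplitude spectrum (dB) of one impulse response."""
    spectrum = np.fft.rfft(hrir)
    freqs = np.fft.rfftfreq(len(hrir), d=1.0 / fs)
    return freqs, 20.0 * np.log10(np.abs(spectrum))


def ild_spectrum_db(hrir_left, hrir_right, fs=FS):
    """Frequency-specific ILD: right minus left amplitude spectrum (dB)."""
    freqs, left_db = amplitude_spectrum_db(hrir_left, fs)
    _, right_db = amplitude_spectrum_db(hrir_right, fs)
    return freqs, right_db - left_db


# Artificial example: the right input is the left input at half amplitude.
left = np.zeros(64)
left[0] = 1.0
right = 0.5 * left
freqs_demo, ild_demo_spectrum = ild_spectrum_db(left, right)
```

In this artificial case the ILD spectrum is flat at about − 6 dB. Measured HRIRs would instead show the frequency-dependent structure described in the text, and spectral zeros in real data would need to be guarded against before taking the logarithm.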

Fig. 2

Spectral analysis at ambiguous position pairs. a–e Data from owl A and positions P1 [azimuth: − 100°; elevation: 50° (− 100/50)] and P2 (− 50/0). a Amplitude spectra related to P1: left (A-P1-L) and right (A-P1-R) inputs. b Amplitude spectra related to P2: left (A-P2-L) and right (A-P2-R) inputs. c Phase spectra related to P1: left (A-P1-L) and right (A-P1-R) inputs. d Phase spectra related to P2: left (A-P2-L) and right (A-P2-R) inputs. e ILD spectra related to P1 (A-P1-ILD) and P2 (A-P2-ILD). f–j Data from owl T and positions P3 (− 50/− 50) and P4 (− 30/10). The equivalent data to a–e are shown. Note the rich structure of the functions that contains spectral information apart from broadband ILD and ITD

Fig. 3

Difference spectra and measures of resolution. a–f ITD, g–l ILD. Shown are the difference spectra of ITD or ILD (thick black lines), the behavioral resolution for ITD or ILD (light grey rectangle), and the variability of ITD and ILD as measured by Fischer and Pena (2017) for the two points belonging to the position pair in different shades of grey. e, f, k, l also show the range of frequencies (indicated by the double arrows and the text "frequency tuning") for which the unit responded significantly. a, g Owl A, P1 and P2 (see Fig. 2a–e, Table 1). b, h Owl B, P7 and P8 free-field (FF), equivalent to P2 and P3 virtual stimulation (V) (see Table 1). c, i Owl B, P6 and P7 virtual stimulation (see Table 1). d, j Owl T, P3 and P4 (see Fig. 2f–j, Table 1). e, k Owl R, P1 and P2 used in the electrophysiological recordings shown in Fig. 9. f, l Owl Q, P1 and P2 used in the electrophysiological recordings shown in Fig. 10. Interpretation: If the solid black line takes values outside the grey areas, the owl should be able to use the information at these frequencies to discriminate between the two points of an otherwise ambiguous pair of positions

Similar considerations hold for the second example that is presented in more depth, point P3 (− 50/− 50) and point P4 (− 30/10) in owl T (Fig. 2f–j; see also Fig. 1c). Here, a decrease in amplitude, reminiscent of a broad "notch", is conspicuous for the right input from P3 around 4.2 kHz (Fig. 2f) (since notches are typically very narrow in frequency, we use the term notch in quotation marks). This "notch" extended over a frequency range of about 2 kHz. A shallower decrease in amplitude occurred in the right input from P4 at 3 kHz (Fig. 2g). The left input did not show such decreases in amplitude (Fig. 2f, g). The ILD spectra show the "notches" between 2.6 and 5.6 kHz for P3 and at 3 kHz for position P4 (Fig. 2j). Again, the phase spectra were close to linear (Fig. 2h, i). Decreases in amplitude of 10 or more decibels were often observed in ambiguous pairs, mainly for peripheral positions.

The ILD spectra exhibited conspicuous frequency-dependent changes (Fig. 2e, j). For a possible discrimination by the owl or its neurons, it is, however, not only important that such a rich structure of information exists in the inputs from the two sides and the ILD or ITD spectra. It is even more important that differences in the spectra are present between the points belonging to a position pair, and that these differences are above the resolution limits and the variability of the parameters as measured at these positions. To examine this, differences of the ILD and ITD spectra were computed (Fig. 3). We then applied several criteria to tackle the question whether an owl might make use of the differences in the spectra (gray areas in the plots of Fig. 3). Remember that positive ITDs are associated with turning to the right. Thus, a positive ITD difference should result in a positive difference of the mean azimuthal turning amplitudes. By contrast, negative ITD differences should result in a negative difference of the mean azimuthal turning amplitudes. A similar consideration holds for ILD, where positive ILD differences should typically result in a positive difference of the mean elevational turning amplitudes and negative ILD differences should typically result in a negative difference of the mean elevational turning amplitudes.

We explain our procedures with the data shown in Fig. 3a, g. The subtraction of the ILD spectrum at P1 in owl A from the ILD spectrum at P2 resulted in the ILD difference spectrum (Fig. 3g). The same was done for the phase spectra. Here we went one step further. We computed the frequency-specific ITD differences from the IPD spectra by dividing the IPD by the corresponding frequency (Fig. 3a). The ITD difference varied between about 0 and 40 µs in the frequency range analyzed (1–9 kHz) (thick solid line in Fig. 3a). Remember that the calculated broadband ITD difference was 10 µs (P2–P1) in this case, while the broadband ILD difference was 0 dB (Table 1). Therefore, the gray areas are centered around 10 µs (Fig. 3a) and 0 dB (Fig. 3g). To estimate whether the owl could use spectral variations for a discrimination of the two positions, we set those in relation to the known behavioral resolution limits. Bala et al. (2003) measured a minimum audible angle (MAA) of 3° in azimuthal direction near zero azimuth and elevation in the barn owl, while Krumm et al. (2019) determined an MAA of 4°. Knudsen et al. (1979), working with a double pole coordinate system, measured a decrease of resolution from frontal to peripheral auditory space so that MAA increased by 64% in their best performing owl from 10° to 30° in azimuth, by 172% to 50°, and by 384% to 70°. For elevation, the increase was 105% from 10° to 30°, 236% to 50°, and 905% to 70°. One degree in azimuth corresponds to approximately 2.5 µs (von Campenhausen and Wagner 2006). Thus, a value of ITD resolution of ± 20 µs resulted for point − 50/0 in owl A (Fig. 3a, rectangle with the lightest gray). Note that in Fig. 3a (and also in Fig. 3b–f) we used a conservative approach and only plotted the value for the behavioral resolution of that point of a position pair that was closer to 0° azimuth. This point showed a higher resolution than the other point that was further away from zero azimuth. Less is known about the behavioral resolution of ILD. Bala (pers. comm.) measured a resolution of 4 dB in frontal space. We used this value throughout in our plots (Fig. 3g–l) as a conservative measure, because also for ILD we would expect a decrease in resolution for peripheral positions. We also used data of ITD and ILD variability [for details see Fischer and Pena (2017)] to obtain an estimate of ITD and ILD variability at the positions used in the experiments. These data appear in Fig. 3 as the other two gray areas. The variability data for the two locations are plotted so that the limits are as visible as possible. For example, at position − 50/0 in owl A, ITD variability decreased from about ± 80 µs at 2 kHz (there were no data below 2 kHz) to about ± 20 µs at 8 kHz (there were no data above 8 kHz) (darkest grey area in Fig. 3a). ITD variability was larger for the second point, but also decreased from 2 to 8 kHz, reaching values of about 35 µs at 8 kHz (Fig. 3a). The solid thick line in Fig. 3a represents the spectral changes of the ITD difference that we measured for this position pair. Since this line did not extend outside the gray areas, we concluded that the owl should not be able to make use of spectral information contained in the ITD to discriminate between the two positions. For the ILD we proceeded in the same way, showing the behavioral resolution of ± 4 dB as a light gray rectangle (Fig. 3g), and the ILD variability at the two points as the two areas of different grays (Fig. 3g). The spectral components of the ILD difference are outside the gray areas from 5.4 to 6 kHz and above 8.8 kHz. For the latter frequency range, no variability data were available. Therefore, the conclusion was that owl A could have used the narrow range from 5.4 to 6 kHz and/or perhaps frequencies above 8.8 kHz to discriminate between the two positions.
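The arithmetic behind the ± 20 µs limit and the "outside the gray area" criterion can be made explicit with a short Python sketch. It rests on assumptions that are not stated in this form in the text: the frontal MAA of 3° is scaled by the relative increases reported by Knudsen et al. (1979), the IPD is taken in radians (hence the factor 2π), only the fixed behavioral limit is tested while the frequency-dependent variability areas are ignored, and the example ILD differences are invented.

```python
from math import pi

MAA_FRONTAL_DEG = 3.0  # Bala et al. (2003), near 0 deg azimuth
US_PER_DEG = 2.5       # von Campenhausen and Wagner (2006)
# Relative MAA increase with azimuth (Knudsen et al. 1979, best owl)
MAA_INCREASE = {10: 0.0, 30: 0.64, 50: 1.72, 70: 3.84}


def itd_resolution_us(azimuth_deg):
    """Behavioral ITD resolution (µs) at one of the tabulated azimuths."""
    factor = 1.0 + MAA_INCREASE[abs(azimuth_deg)]
    return MAA_FRONTAL_DEG * factor * US_PER_DEG


def itd_from_ipd_us(ipd_rad, freq_hz):
    """Frequency-specific ITD (µs) from an IPD given in radians."""
    return ipd_rad / (2.0 * pi * freq_hz) * 1e6


def bands_outside_limits(freqs_khz, diff, center, half_width):
    """Frequency ranges in which a difference spectrum leaves the
    interval center ± half_width."""
    bands, start, prev = [], None, None
    for f, d in zip(freqs_khz, diff):
        outside = abs(d - center) > half_width
        if outside and start is None:
            start = f
        elif not outside and start is not None:
            bands.append((start, prev))
            start = None
        prev = f
    if start is not None:
        bands.append((start, prev))
    return bands


resolution_50 = itd_resolution_us(-50)  # 3 deg * 2.72 * 2.5 µs/deg

# Invented ILD difference spectrum (dB) tested against the ± 4 dB limit.
freqs_khz = [5.0, 5.2, 5.4, 5.6, 5.8, 6.0, 6.2]
ild_diff = [1.0, 2.0, 5.0, 6.0, 5.0, 4.5, 3.0]
usable = bands_outside_limits(freqs_khz, ild_diff, center=0.0, half_width=4.0)
```

Under these assumptions the resolution at 50° azimuth comes out at about 20.4 µs, in line with the ± 20 µs quoted above, and the invented ILD differences leave the ± 4 dB band between 5.4 and 6.0 kHz.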

For the other examples, similar predictions hold. For example, in owl B the ITD difference did not take values outside the gray areas for the pair P3–P2 (V) [equivalent to P8–P7 (FF)], if we assume that ITD variability remained high below 2 kHz (Fig. 3b). By contrast, the ILD difference exceeded the range marked by the gray areas from 3.8 to 4.8 kHz and above 7.6 kHz (Fig. 3h). Thus, the ILD difference spectrum for this pair indicated that spectral components of ILD might be sufficiently different to allow a discrimination of the positions belonging to this pair. A similar conclusion was reached for the position pairs P7–P6 (V) in owl B (Fig. 3c, i), P4–P3 in owl T (Fig. 3d, j), P2–P1 in owl R (Fig. 3e, k), and P2–P1 in owl Q (Fig. 3f, l). The two latter pairs were used in electrophysiological experiments. Therefore, the range of frequencies for which significant responses were measured ("frequency tuning") is also indicated in the plots (Fig. 3e, f, k, l).

Figure 4 summarizes the results of the HRTF analysis graphically. Sufficient deviation is available in 4 of the 11 cases in the frequency range between 2 and 8 kHz for the ITD (Fig. 4a). However, the frequency regions for which this information is available scatter widely. Sufficient deviations occur in all 11 cases for the ILD. Again, the frequency regions scatter widely between the cases, but in 9 of 11 cases 5.5 kHz is part of the frequency regions that show sufficient deviations. Frequencies above 7 kHz also show sufficient deviations in 7 of the 11 cases for the ILD. In summary, information was available in all of the ILD difference spectra. The birds could use this information in behavioral tests to discriminate between positions in ambiguous pairs.

Fig. 4 Summary of the results of the spectral analysis of the HRTFs. All 11 cases that were used in behavioral tests are shown as indicated by the names on the left side of the figure. a ITD, b ILD. Note that in more than 50% of the cases sufficient ITD deviations are missing, while these are present for all cases for the ILD

Behavior

Owls A (324 trials), B (674 trials), and T (1499 trials) participated in behavioral experiments. During the testing phase, all trials collected on a given day were included in the analysis if the response latency was between 50 and 500 ms. These limits were chosen based on earlier experience. Trials with latencies shorter than 50 ms suggested that the owl had started to rotate the head before the test stimulus appeared (Wagner 1993), while trials with latencies longer than 500 ms suggested that the owl was not motivated (Hausmann et al. 2009). In total, 48 of the 2497 trials (1.92%) were excluded from the analysis (16 in owl A, 15 in owl B and 17 in owl T).
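The inclusion criterion and the exclusion rate can be written down compactly. The Python sketch below is only an illustration: the function name is hypothetical, and treating the 50 and 500 ms limits as inclusive is an assumption, since the text does not state how trials exactly at the limits were handled. The trial counts are those given above.

```python
def keep_trial(latency_ms, lower_ms=50.0, upper_ms=500.0):
    """True if a response latency lies inside the accepted window (limits assumed inclusive)."""
    return lower_ms <= latency_ms <= upper_ms


# Trial counts per owl as reported in the text
total_trials = {"A": 324, "B": 674, "T": 1499}
excluded_trials = {"A": 16, "B": 15, "T": 17}

n_total = sum(total_trials.values())
n_excluded = sum(excluded_trials.values())
percent_excluded = 100.0 * n_excluded / n_total

# Invented latencies (ms): premature, accepted, accepted, unmotivated
kept = [t for t in (30.0, 120.0, 480.0, 650.0) if keep_trial(t)]
```

With the reported counts this gives 48 of 2497 trials, i.e. about 1.92%.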

In the experiments, the owls had to localize two different types of sources. On the one hand, there were sources at positions that had been tested in earlier experiments (Wagner 1993; Poganiatz et al. 2001; Kettler et al. 2017), and that had turned out to be easily localizable by the birds. In the following, we refer to these easily localizable positions as reference positions. On the other hand, there were the stimuli originating from the ambiguous positions with pairwise equal broadband ITD and ILD. With respect to the exclusion of trials from the analysis, there was no difference between trials directed to reference positions (26 trials were excluded) and trials directed to ambiguous positions (22 trials were excluded).

We first describe the head-turning behavior of the owls and show typical examples of head turns directed to sounds coming from different positions. The traces of head position with turns directed towards reference positions are drawn as solid lines (Fig. 5a, d), while the traces towards ambiguous positions are drawn as dotted lines (Fig. 5b, c, e, f). Each trace starts with a fixation phase (colored lines to the left of the black vertical lines in Fig. 5). The head orientation during this phase is called the starting point of a head turn. The stimulus is played at time zero, corresponding to the black vertical line. After a short latency, typically shorter than 200 ms, the owl turned its head in the direction of the perceived sound source. The head first accelerated, reached a maximum in velocity and then homed in on an endpoint where the head was held fixed for a short moment. Note that the traces from the starting direction to the first endpoints were all smooth without sudden changes in direction, indicating that the first endpoints were all open loop. The sound sources in the four examples shown in Fig. 5a were located at − 40°, − 40°, 30° and 50° in azimuth. The head turns all had sufficient amplitudes, so that the owl hit the target windows in all cases; the windows are indicated by the "I" at the right border of the subfigures (see also Materials and Methods). For example, the head turn underlying the fixation of the source at 50° in azimuth was directed to the right, was larger in amplitude, and homed in on a different endpoint than the head turn initiated by the owl to localize the source at 30° in azimuth. Similar considerations hold for the turns shown in Fig. 5b, d, e. The owls had more problems with reacting properly to the elevational coordinate of a sound source (Fig. 5c, f). Often, the owls did not turn high enough. In these cases, the predesigned localization windows for elevation were increased (compare the localization windows in Fig. 5b, e with those in Fig. 5c, f). Sometimes the owl did not hit the predesigned window with the first turn. In such cases, the owl often added one or more correction turns without a second stimulus being played. In all cases, only the first turn was used for the analysis of the behavior as shown in Figs. 6, 7 and 8 and Tables 1 and 2. There were also trials in which the owl missed the localization window (Fig. 5f, red and green).

Fig. 5 Examples of head turns. a Azimuthal component of head turns of owl B to reference positions in free field. b, c Head turns of owl B to ambiguous positions in free field: b azimuthal component, c corresponding elevational component. d Azimuthal component of head turns of owl T to reference positions in virtual space. e, f Head turns of owl T to ambiguous positions in virtual space: e azimuthal component, f corresponding elevational component. The "I"-shaped bars at the right borders of the subfigures indicate the localization windows. The black vertical line at time zero corresponds to the onset of the stimulus. Note that the traces were truncated when the bird reached the target

Fig. 6 Localization of free-field sources by owl B. The differently colored crosses indicate the different target positions. The endpoints of the head turn to a given position after the first saccade are shown by the dots. a Scatterplots. The different colors correspond to different stimulus positions (see color bar and numbers with the reference positions labelled in italics). b Ellipses showing mean and standard error of endpoints of head turns. Localization of reference positions without an ambiguous partner (solid lines), reference positions with an ambiguous partner (filled ellipses) and ambiguous positions (dotted lines) is shown. We performed tests at 4 reference positions [numbers represent degrees; the upper or first number always refers to the azimuthal position, the lower or second number to the elevational position: P1: − 40/30 (mint green), P2: − 40/0 (dark blue), P3: 50/− 30 (red), P4: 30/30 (light green)] and 2 ambiguous pairs [P5 and P6 at − 50/40 (orange) and − 50/0 (magenta), P7 and P8 at 50/50 (dark red) and 20/10 (blue)]. Note that there are distinct clouds of different colors, indicating that the bird could discriminate by its head turning between all the different positions. For statistics, see Table 1

Fig. 7 Localization of virtual stimuli by owl T. Same presentation of data as in Fig. 6. Only the localization of the ambiguous pairs is shown: pair 1: P1 [− 80/− 50 (magenta)] and P2 [− 50/10 (dark blue)]; pair 2: P3 [− 50/− 50 (orange)] and P4 [− 30/10 (mint green)]; pair 3: P5 [40/20 (green)] and P6 [70/40 (blue)]; pair 4: P7 [50/20 (red)] and P8 [90/40 (dark red)]. The statistical analysis demonstrated that the owl discriminated successfully between P1 and P2, P5 and P6, and P7 and P8, but not between P3 and P4

Fig. 8 Cumulative distribution of response latencies. Data from owl T. The responses to the different positions are indicated by different colors and include both reference (solid lines) and ambiguous (dashed lines) positions. Note that the data overlap, demonstrating that the owl could localize the ambiguous positions with the same response latency as it could localize the reference positions

Table 2 Relation between deviation in HRTFs and behavioral difference

All turns were analyzed in the way just described. To reduce the data and present the important aspects, the elevational and azimuthal coordinates of the first endpoints of the head turns were extracted. These endpoints were plotted in elevation vs azimuth scatter diagrams (Figs. 6, 7). The data shown in Fig. 6 are from owl B, recorded with free-field stimulation, and show the endpoints of the turns to both reference and ambiguous positions (see also Table 1). The data of turns to the different positions are coded by different colors. For example, the endpoints of the turns to the reference position 50/− 30 scatter around that position (see red dots in Fig. 6a). By contrast, the turns to the reference position 30/30 ended too low in elevation, but were also scattered over a narrow spatial extent (see light green dots in Fig. 6a), so that the two clouds of dots were clearly separated. This is seen even more clearly in Fig. 6b, which shows the mean and standard error drawn as an ellipse. A similar observation was made for the turns to reference positions on the left side: − 40/30 (mint green dots in Fig. 6a, b) and − 40/0 (dark blue dots in Fig. 6a, b). These results indicated that the bird could discriminate sounds from the different reference positions.

With this bird, we tested two pairs of ambiguous positions in free field. One pair was located on the left side with position P7FF (P7 used in free-field stimulation; see Table 1) at − 50/0 (magenta) and position P8FF at − 50/40 (orange). The other pair was located on the right side with position P5FF at 20/10 (light blue) and position P6FF at 50/50 (dark red). The endpoints of the turns belonging to these different positions again scattered around distinct centers (Fig. 6a), and the clouds of dots were clearly separated, as may be seen by comparing the cloud consisting of the magenta dots, corresponding to position − 50/0, with the cloud consisting of the orange dots, corresponding to position − 50/40 (Fig. 6a). The same holds for the light blue and dark red clouds of dots, corresponding to positions 20/10 and 50/50, respectively (Fig. 6a). The plot in Fig. 6b, showing the means and standard errors, clearly demonstrates discrimination of the ambiguous pairs.

While the owl clearly discriminated between positions belonging to ambiguous pairs, the scatter plots suggested that neither precision nor accuracy of the turning responses was high. This also held for the other two owls. In addition, left–right or up–down biases, or biases in responses to reference or ambiguous positions, might occur. To look deeper into these issues, we analyzed the responses with respect to possible biases in turns toward the left and right sides, both for the azimuthal and elevational components of the head turns. When the responses of all conditions and all owls were lumped together, there was no left–right bias in the data (azimuthal localization error for stimuli from the left side, mean value ± standard deviation: 4.7° ± 14.8° undershooting; for stimuli from the right side: 0.03° ± 8.1°; Mann–Whitney U test: N = 40, U = 184, z score = 0.42, p = 0.674). There was also no difference in elevational undershooting for stimuli from the two sides (undershooting for stimuli from the left side: 16.8° ± 17.1°, for stimuli from the right side: 18.8° ± 12.2°; Mann–Whitney U test: N = 40, U = 195.5, z score = 0.11, p = 0.912). This also held if the data of each owl were analyzed separately. We further checked whether responses to stimuli from ambiguous and reference positions differed. We noticed during the analysis that some reference positions had ambiguous partners. Reference positions with ambiguous partners were excluded from the further analysis. The comparison of turns to ambiguous and reference positions showed a difference in elevational turning (undershooting for stimuli from ambiguous positions: 20.0° ± 18.5°, for stimuli from reference positions: 12.8° ± 10.1°; Mann–Whitney U test: N = 34, U = 79, z score = 2.12, p = 0.034). With respect to azimuth, the owls turned too far by 2.2° ± 5.6° for stimuli from reference positions, while they undershot the target by 6.9° ± 14.8° for stimuli from ambiguous positions. This difference was also statistically significant (Mann–Whitney U test: N = 34, U = 72.5, z score = 2.35, p = 0.019). Thus, the owls performed slightly better when fixating stimuli from reference positions than from ambiguous positions. We claim here that this difference does not have a decisive influence on the most important question for this work (see also Discussion), which was whether the scattered endpoints of the head turns suggested that the owl discriminated between the different source positions, specifically between positions belonging to an ambiguous pair. This was indeed the case, as demonstrated by the statistical analysis. Specifically, the owl discriminated between the ambiguous positions at a statistical level of p < 0.001 (Table 1). With respect to predictions derived from the difference spectra (Fig. 3b, h), the more positive elevational turning angles for sounds from P8FF compared with P7FF (Table 1) were consistent with the positive ILD differences for frequencies around 4 kHz (Fig. 3h, Table 2). By contrast, the negative ILD differences in the high-frequency range (Fig. 3h, Table 2) were not reflected in the turning angles (Table 1).
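For readers who wish to retrace how a U value and z score of the kind reported above are obtained, the following Python sketch implements a Mann–Whitney U test with the normal approximation. It is a generic illustration, not the software used for the analysis: the function name is hypothetical, no tie or continuity correction is applied (an assumption), and the localization errors in the example are invented.

```python
import math


def mann_whitney_u(sample_a, sample_b):
    """Two-sided Mann-Whitney U test, normal approximation, no tie or continuity correction."""
    pooled = sorted([(v, 0) for v in sample_a] + [(v, 1) for v in sample_b])
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        # extend j over a run of tied values and assign them their average rank
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    n_a, n_b = len(sample_a), len(sample_b)
    rank_sum_a = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u_a = rank_sum_a - n_a * (n_a + 1) / 2.0
    u = min(u_a, n_a * n_b - u_a)
    mean_u = n_a * n_b / 2.0
    sd_u = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12.0)
    z = (mean_u - u) / sd_u                # non-negative by construction
    p = math.erfc(z / math.sqrt(2.0))      # two-sided p value
    return u, z, p


# Invented undershoot values (degrees) for two groups of positions
u_stat, z_score, p_value = mann_whitney_u([1.0, 3.0, 5.0, 7.0], [2.0, 4.0, 6.0, 8.0])
```

For real data with ties or small samples, an exact test or a tie-corrected variance would be preferable to this approximation.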

Owl A also discriminated the ambiguous positions in the two position pairs tested (Table 1). This is interesting, because the analysis shown in Fig. 3a, g suggested that sufficient information was available only in a narrow range in the ILD difference spectrum. The deviation in the narrow range was in the positive direction (Fig. 3g), but the owl’s turns to P2 ended lower in elevation than the turns to P1. This means that here the positive ILD difference did not lead to higher endpoints of the elevational components of the head turns as typically expected. Likewise, we measured positive ITD differences for P2–P1 that were, however, within the resolution limit (Fig. 3a). By contrast, the azimuthal turning amplitude for turns to P2 was more negative than the azimuthal turning amplitude to P1 (Table 1). This meant that owl A could also not have used the ITD difference for discriminating between P2 and P1. Thus, the conclusion was that the owl must have used information other than that contained in either the broadband ITD and ILD or the spectral components of ITD and ILD to discriminate between P2 and P1.

These observations suggested to us that the owls could discriminate between positions characterized by the same broadband ITD/ILD combination. In some cases, the birds could have used spectral ILD information, in other cases it remained unclear to us what information the owls did use (for a deeper analysis of the relation between deviations in HRTFs and differences in behavior see below). One might argue that the birds could move their heads during the free-field fixation task and, thus, generate additional cues or change the ITD/ILD pattern by their active movement so that the free-field data might not be conclusive. We tested for this possibility by restricting the head position and orientation at sound presentation to values below 3°, which is the resolution in ITD in the barn owl reported by Bala et al. (2003). The result of this analysis was the same as the result with the whole data set.

To collect further evidence that the owls can discriminate positions characterized by the same broadband ITD and ILD, we also performed experiments in virtual auditory space where the stimuli were presented via earphones. In this case, the owl cannot change the ITD and ILD by active movement. Figure 7 shows the data from owl T. The owl discriminated three of four ambiguous pairs (Fig. 7, Table 1). It was able to discriminate the ambiguous pairs − 80/− 50 vs − 50/10, 40/20 vs 70/40, and 50/20 vs 90/40, but not the pair − 50/− 50 vs − 30/10 (see Table 1). It is unclear why the bird could not discriminate the positions belonging to the latter pair, because for this pair, as for the other three pairs, information that the owl could have used was available in the ILD difference spectra (Fig. 3j, see also Figs. 2f–j, 4b, Table 2). The reference positions closest to − 50/− 50 and − 30/10 tested in this bird were at − 40/− 20, − 40/0, and − 40/30. The bird could discriminate all of these positions, and it also discriminated these positions from positions − 50/− 50 and − 30/10. We collected data in virtual auditory space with a second bird (owl B), from which we had already obtained data in free field (see Fig. 6). As with free-field stimulation, the bird discriminated all pairs and one triplet of ambiguous positions (Table 1). The behavioral data of owl B are consistent with the owl exploiting the larger ILD differences for sounds from P3V compared with sounds from P2V around 4 kHz, but do not reflect the negative ILD differences above 7 kHz (Figs. 3i, 4b, Table 2). The turning behavior of owl B to sounds from P7V and P6V is compatible with the positive ILD differences from about 3.5 to 6 kHz, but not with the negative ILD differences above about 7 kHz (Fig. 3i). Thus, the data from two birds suggest that with virtual stimulation the birds could also discriminate most (7 out of 8) of the pairs and triplets with ambiguous information on broadband ITD and ILD.

Table 2 relates the behavioral differences to the deviations in the HRTFs for the 13 ambiguous pairs used in the behavioral tests and summarizes the data. We consider the joint effects of ITD and ILD on the behavioral performance. As quantified in Table 1, 12 of the 13 behavioral tests yielded a significant behavioral difference. In one test, such a difference was missing. Nevertheless, in this case, as in all other cases, a sufficient deviation in the HRTFs existed. In other words, there was no case without a sufficient deviation in the HRTFs (Table 2). In the case in which a behavioral difference was missing (T_P4–P3), a sufficient deviation in the ITD spectrum was missing, and the deviation in the ILD difference spectrum was in both directions (Fig. 3d, j). Looking further at Table 2, the deviation in the HRTF and the difference in the behavioral tests had the same direction in 4 of 12 cases. In other words, the deviation in the HRTF was consistent with the difference in behavior in these cases. In one case, the deviation in the HRTF was in the opposite direction to the difference in behavior (A_P2–P1). Thus, the deviations in the HRTF could not explain the behavioral differences in this case in a straightforward way. In 7 cases, a deviation in the HRTF was present in both directions, making it difficult to rate the behavioral difference. The owls could have used the deviations in the frequency region around 5.5 kHz in most of these cases, but then they would have needed to neglect the deviations at frequencies above 7 kHz in the majority of cases.

In our opinion, the behavioral data demonstrate that barn owls can discriminate by the head-turning amplitudes between ambiguous positions as defined by the same combination of broadband ITD and ILD, both in free field and with virtual stimuli. Furthermore, these observations suggested to us that 4 of 13 cases could be explained in a straightforward way by the assumption that the owls used spectral ILD cues in the discrimination. Seven further cases might be consistent with such a hypothesis if the owl used one frequency region and neglected a second frequency region in its behavior (read more on this issue in the Discussion). Two behavioral cases could not be explained by the use of either broadband or spectral components of ITD and ILD: in one case no behavioral difference was observed although sufficient deviations in the HRTFs were present, while in the other case the predictions from the deviations in the HRTFs were in the opposite direction to the observed behavioral differences.

Earlier data had shown that even if the turning amplitude does not indicate problems in discrimination, increased response latency might reveal that localization may be more difficult with some stimuli (Poganiatz et al. 2001). We checked whether the latencies of the turns toward the ambiguous positions were different from the latencies toward the reference positions. As may be seen in Fig. 8 for owl T, the cumulative latency data for these two classes of turns overlap. An equivalent latency distribution was observed in the other owls and the other conditions as well (data not shown). This demonstrated that the owls localized the ambiguous positions with the same response latencies as the reference positions. In other words, there was no hint from response latency that it was more difficult for the owls to localize positions that belong to ambiguous pairs or triplets than reference positions. One might argue that for head turns with a latency shorter than the duration of the stimulus, the task was not open loop, and thus the owls might have used the information obtained in the first milliseconds of a turn to discriminate between the sources. We do not think that this can explain the data, because there were only very few head turns with a response latency below 100 ms, we only used the first turn of the owls for our analysis, and we did not observe sudden shifts in direction during the turning behavior (see Fig. 5).

At this point, we wondered what the neural correlate for the discrimination between ambiguous positions might be. The optic tectum (OT) contains maps of auditory and visual space (Knudsen 1982). Moreover, focal electrical stimulation at a site in the OT that represents position x in sensory space causes head turns to position x (du Lac and Knudsen 1990). Therefore, in a next step we recorded from neurons in the optic tectum and tested the complete auditory space, and specifically those locations that represented ambiguous positions.

Electrophysiology

We recorded from 42 single units in the left and right OTs and tested 54 pairs of ambiguous positions. We also collected enough data to compute 16 spikemaps and compared responses at all ambiguous positions available in these spikemaps. The data are from 5 adult barn owls, 4 females and 1 male. All data presented and analyzed below are from single units. Before we present the specific responses to ambiguous positions, we show basic responses that helped us to characterize the neurons and also to physiologically confirm the recording within the optic tectum. In the following, we first present data of two typical single units in some detail (Figs. 9, 10), and then summarize the responses in a reduced way (Fig. 11).

Fig. 9 Example of responses of a tectal unit. The cell (Unit 125) was recorded in the left optic tectum of owl R. a ITD curve (black solid line, recorded with broadband noise as stimulus at an ILD of 0 dB) and virtual-azimuth curve (dashed orange line, recorded with broadband noise as stimulus at a virtual elevation of 0°). Azimuth was scaled by a factor of 2.5 to show the close correspondence between the two tuning curves. b ILD curve recorded at an ITD of 60 µs. c Interpolated elevational-azimuthal response profile of the cell for 16 virtual-azimuth curves recorded from − 70° to 80° in elevation in steps of 10° of elevation and azimuth. The white dots named x1 and x2 correspond to the pair of ambiguous positions that was tested in more depth (see d). d Responses at the pair of ambiguous positions (20/10 and 40/40) to 100 repetitions of the stimulus. The responses are statistically different. e Responses at pairs of ambiguous positions. The responses occurring at all pairs of ambiguous positions (number N = 225) are plotted in a scatter diagram. f Responses at randomly selected pairs including the same data set as used in e

Fig. 10 Second example of responses of a tectal unit. The same plots as in Fig. 9 are shown. The unit (#375) was recorded in owl Q in the right optic tectum. Stimulus was broadband noise. a The ITD curve (black solid line) was recorded at an ILD of 0 dB. The azimuth curve (dashed orange line) was recorded at an elevation of − 30°. b The ILD curve was recorded at an ITD of − 120 µs. c Full spikemap. d Responses at positions that belong to a pair of ambiguous positions [shown as white dots x1 (− 90/− 50) and x2 (− 50/− 10) in c]. e Responses at pairs of ambiguous positions. The responses occurring at all pairs of ambiguous positions (number N = 231) are plotted in a scatter diagram. f Responses at randomly selected pairs. Note that this unit is tuned to different parameters, but otherwise responded similarly to the other unit

Fig. 11 Distribution of correlation coefficients. The 16 data points stem from the 16 neurons for which a spikemap was available. The correlation coefficients underlying the hypothesis that the responses at pairs of equal broadband ITD and ILD should be equal are plotted on the x axis and the number of occurrences on the y axis. Note the positive correlations, but also the broad distribution

The responses shown in Fig. 9 stem from a single unit recorded in the left optic tectum of owl R (see Fig. 1a for a distribution of ITDs and ILDs). Figure 9a (solid black line) shows the response of the unit when ITD was varied. The largest response occurred at 60/90 µs. For this reason, we call this response peak the main peak. Smaller response peaks, called side peaks, were present to the left and right of the main peak. The peak at 270 µs was only slightly lower than the main peak, while there was much variability in the responses for negative ITDs. The dashed orange curve shows responses recorded when virtual azimuth was varied, with the azimuthal coordinate multiplied by a factor of 2.5 (see axis on top of Fig. 1a), a typical factor for conversion of azimuth into ITD in the barn owl (von Campenhausen and Wagner 2006). Note the close correspondence of the peaks in the two curves (Fig. 9a). This was typical for all cells (see also Fig. 10a for a second example). We also tested for the influence of a variation in ILD at a constant ITD (60 µs in this case). The resulting ILD curve showed a broad sensitivity reaching from − 8 to 16 dB and a decreasing response at the borders (Fig. 9b).

The spikemap of this unit showed sensitivity for slightly positive azimuths in the lower hemisphere (Fig. 9c). A side peak that extended from negative to positive elevations occurred for azimuths around 100°. The second region of high response at elevations from 40° to 60° at 150° of azimuth corresponds well with the bending of the ITD contour lines representing ITDs corresponding to the main peak (see also Fig. 1a). The response for negative azimuths was low for all elevations. The sensitivity for negative elevations fitted well with the recording position, which was the deepest in the given dorso-ventral penetration. We tested one pair of ambiguous positions in that unit (white dots in Fig. 9c). Stimulation at the virtual positions x1 and x2 yielded low to middle-sized responses, although their equal broadband ITDs (at 60 µs) and ILDs (at 5 dB) (Fig. 1a) corresponded well with the high response areas in the ITD (Fig. 9a) and ILD (Fig. 9b) curves. More importantly, the responses at these two positions were clearly different (Fig. 9d). A Mann–Whitney U test yielded a p value of less than 0.00001 (100 stimulus repetitions, z score 9.18836). This is consistent with the spikemap, in which position x1 lies in a region of higher responses than position x2 (Fig. 9c). Since we often did not hit the highest response areas of a unit with the selected position pairs, we applied another measure to test whether broadband ITDs and ILDs explain the response behavior of the cells: we plotted the responses at all pairs of ambiguous positions in a scatter diagram (Fig. 9e). This made the analysis independent of a possible bias in the selected pairs of ambiguous positions. As already mentioned in Materials and Methods, the reasoning was that if the two cues explained the data, the correlation for pairs of ambiguous positions should be close to 1. In fact, the correlation coefficient, based on 225 data points, was 0.26 in this single unit, and thus clearly significantly positive (p < 0.001). The variance explained, the square of the correlation coefficient, was low (7%). We checked whether spurious correlations in the spiking might explain this effect by randomizing the spiking data with respect to one position of an ambiguous pair. This reduced the correlation to close to 0 (Fig. 9f, mean of correlation coefficient with 100 randomizations: − 0.01; mean of absolute correlation coefficients with 100 randomizations: ≤ 0.06). This result suggested to us that the sensitivity to broadband ITD and ILD alone explained approximately 7% of the variance in the spiking data of this unit.
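The pairwise correlation and its randomization control can be illustrated with a short Python sketch. It is not the analysis code used here: the function names are hypothetical, the responses are synthetic random numbers rather than spike counts, and shuffling one member of each pair is assumed to correspond to the randomization described above.

```python
import math
import random


def pearson_r(a, b):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def shuffled_control(a, b, n_shuffles=100, seed=0):
    """Mean r and mean |r| after repeatedly shuffling one member of each pair."""
    rng = random.Random(seed)
    shuffled = list(b)
    values = []
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        values.append(pearson_r(a, shuffled))
    mean_r = sum(values) / len(values)
    mean_abs_r = sum(abs(v) for v in values) / len(values)
    return mean_r, mean_abs_r


# Synthetic responses at 225 pairs sharing a weak common component
rng = random.Random(1)
resp_first = [rng.gauss(0.0, 1.0) for _ in range(225)]
resp_second = [0.3 * x + rng.gauss(0.0, 1.0) for x in resp_first]

r_pairs = pearson_r(resp_first, resp_second)
control_mean, control_mean_abs = shuffled_control(resp_first, resp_second)

# Variance explained for the coefficients reported in the text
explained_unit_125 = 0.26 ** 2   # about 0.07
explained_unit_375 = 0.63 ** 2   # about 0.40
```

Squaring the reported coefficients gives about 7% and 40% of variance explained, the values quoted for the two example units.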

The responses shown in Fig. 10 are from another single unit in another owl, owl Q, and were recorded in the right optic tectum. The unit responded well to several positions in the left hemisphere, while the responses to virtual stimulation from right acoustic space were low or absent. The main peak was around − 60° in azimuth and − 30° in elevation. Side peaks occurred around − 130° to − 150° in azimuth (Fig. 10c). The peak in the ITD curve was located at − 120 µs, with a tail of elevated responses extending to − 180 µs (Fig. 10a). The peak in the azimuth curve occurred at − 60° and overlapped with the ITD peak if a stretch of 2.5 was applied to the azimuthal coordinate (Fig. 10a). The ILD curve (Fig. 10b) closely resembled the classic ILD curves (Olsen et al. 1989) and exhibited a peak for slightly negative dB values. The responses to 100 repetitions of the stimulus at the selected positions, although weak, were statistically different, with stimulation from position x2 at − 50/− 10 eliciting a higher response than stimulation from position x1 at − 90/− 50 (Mann–Whitney U test, N = 100, p = 0.00044, z score = − 3.50504, Fig. 10d). This is consistent with the spikemap, where position x2 appears closer to the high-activity region around − 60/− 30 than position x1 (Fig. 10c). When the responses at all 231 pairs of ambiguous positions were taken into account, the correlation coefficient measuring the relation between pairs of ambiguous positions was 0.63, suggesting that the ITD/ILD model could explain about 40% of the variance (Fig. 10e). Again, the correlation between randomly drawn positions was checked as a control, and this correlation was close to zero (correlation coefficient averaged over 100 repetitions of random drawing: 0.01; mean of absolute correlation coefficients over 100 repetitions of random drawing: ≤ 0.06), indicating that spurious correlations cannot explain the observation made at ambiguous positions (Fig. 10f).

While Figs. 9 and 10 show examples of our analysis, in a next step we summarized our results taking into account responses of the whole sample. When the results of the testing at all selected pairs of ambiguous positions were considered, the responses were not different in about 60% of the pairs, while in 40% the responses differed. This suggested to us that the two cues could explain the data in slightly more than half of the cases. As mentioned above, this result may include some bias because of the nonrandom selection of pairs of ambiguous positions and also because of the weak responses we often encountered. Such biases are reduced or do not occur if all pairs of ambiguous positions available in the spikemaps are taken into account. Figure 11 shows the data from the 16 units for which we obtained spikemaps. The correlation coefficients representing data at ambiguous positions in Fig. 11 scattered widely, but were clearly positive in all cases. The mean value ± standard deviation of the correlation coefficient for the 16 cases was 0.42 ± 0.25. The mean of the 16 squared single correlation coefficients was 0.24 ± 0.23. By contrast, the correlation coefficients representing data for randomly drawn positions were always close to zero. Each correlation coefficient obtained from a random-pair assignment was lower than the corresponding correlation coefficient obtained from ambiguous-pair matching. The difference in the correlation coefficients between ambiguous and random pairs was highly statistically significant (Mann–Whitney U test, N = 16, p < 0.00001, z score 4.80534). These analyses suggested that the responses at pairs of ambiguous positions were more similar than responses at random pairs. However, the variance that could be explained by the correlation scattered widely and the mean was only about 24%.
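The mean explained variance (about 24%) is larger than the square of the mean correlation coefficient (0.42 squared is about 18%). The two summary values are nevertheless mutually consistent, because the mean of squares equals the squared mean plus the variance. The short Python check below illustrates this; the variable names are hypothetical, and whether the reported standard deviation is a population or a sample estimate is an assumption, so both variants are shown.

```python
n_units = 16
mean_r = 0.42   # mean correlation coefficient reported in the text
sd_r = 0.25     # standard deviation reported in the text

squared_mean = mean_r ** 2

# mean(r^2) = mean(r)^2 + population variance of r
mean_sq_if_population_sd = squared_mean + sd_r ** 2
# if 0.25 is a sample SD (n - 1 denominator), rescale to the population variance
mean_sq_if_sample_sd = squared_mean + sd_r ** 2 * (n_units - 1) / n_units
```

Both variants give about 0.23 to 0.24, close to the reported mean of the squared coefficients.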

Overall, the electrophysiological data corroborate the behavioral data in the sense that tectal responses carry more information than is contained in broadband ITD and ILD.

Discussion

Barn owls use more information for sound localization than is contained in the broadband ITD and ILD, and neurons in the optic tectum represent more information than broadband ITD and ILD. In the following, we discuss these findings with respect to our approach and the use of different cues in sound localization, and we speculate about the information the owls might have used in our experiments.

Beyond cues in studying sound localization

For a long time, HRTFs have proven to be a powerful tool to study sound localization (Wightman and Kistler 1989b; Keller et al. 1998; Spezio et al. 2000; Sterbing et al. 2003; Koka et al. 2008; Slee and Young 2010; Keating et al. 2013; van Opstal et al. 2017). HRTFs veridically represent the complete acoustic information present at the eardrum. They even include information about pinna position (Young et al. 1996) or reflect changes during development (Anbuhl et al. 2017). Manipulation of HRTFs allows testing with stimuli that do not exist in nature (Poganiatz et al. 2001; Egnor 2001; Poganiatz and Wagner 2001; Bremen et al. 2007; Hausmann et al. 2009; Keating et al. 2013; Kettler et al. 2017). Keller et al. (1998) demonstrated for the barn owl that HRTFs measured about 5 mm from the eardrum sufficiently represent sound parameters. Here we utilized individual HRTFs to ask whether broadband ITD and ILD are the only source of information that the barn owl uses in finding the position of a sound source. We did not tackle the problem from the cue side by manipulating ITD and ILD as is often done (e.g., Wood et al. 2019), but with a somewhat reverse approach. Our basic stimulus contained the complete information available at the eardrum. We then selected positions at which the information carried by specific cues was ambiguous. In our case, these cues were the broadband ITD and the broadband ILD, which are considered the main cues influencing sound localization in the barn owl (Moiseff 1989a, 1989b; Keller et al. 1998; Poganiatz et al. 2001; Egnor 2001; Hausmann et al. 2009; Kettler et al. 2017). We searched for and found, for each bird individually, pairs of positions in acoustic space where the differences in broadband ITD and ILD were within the resolution limit of the species. We also manipulated the HRTFs to eliminate the influence of stimulus level. We observed that it was slightly more difficult for the owls to fixate ambiguous positions than reference positions. It is unclear whether this difference is real or due to the selection of positions, with reference positions more centrally located than ambiguous positions. In any case, this difference should not, in our opinion, distract from the most important finding of the work, which is that the owls use more information than is available in the broadband ITD and ILD. In an analogous approach, Majdak et al. (2013) reduced spectral cues in HRTFs to show that training can improve sound localization. Van Opstal et al. (2017) reconstructed spectral cues and found a remarkable resemblance to the idiosyncratic HRTFs of their listeners. This approach is similar to ours in that it tries to find the information present at the eardrum; it is different in that it starts with ripples and arrives at HRTFs, while we started with HRTFs. We argue that approaches similar to those of Majdak et al. (2013), van Opstal et al. (2017), and ours might reveal more about sound-localization capabilities in other species as well.

The spectral analysis demonstrated that frequency-specific information was still available to the owls for discriminating position pairs with equal broadband ITD and ILD in the stimuli. Specifically, spectral components of ILD often extended beyond the limits of behavioral resolution (Knudsen et al. 1979; Bala et al. 2003; Bala, pers. comm.) and variability (Fischer and Pena 2017). The behavioral data we obtained showed mixed results. About one third of the cases are consistent with the hypothesis that the owls used frequency-specific information. About one sixth of the cases cannot be explained with this hypothesis in a straightforward way. For the rest, it is necessary to assume that the owls used specific frequency regions and neglected others. We are not aware of a study that has tested this. However, owls are known to be excellent at frequency discrimination (Quine and Konishi 1974). Thus, it seems feasible that the owls could have neglected some frequency regions while using others. Furthermore, lateral positions are represented by low frequencies in the external nucleus of the inferior colliculus (Cazettes et al. 2014). This might make it unnecessary to postulate an active suppression of information in the high-frequency region by the owl. Nevertheless, two cases remain that cannot be explained by such a hypothesis. There was also much variability in the specific frequency regions that might have been used by the owl. One possibility that could resolve many of these uncertainties would be that the birds applied a kind of template matching. In that case, however, one would have expected that owl T would have been able to discriminate not only points P2/P1, P6/P5 and P8/P7, but also P4/P3. Template matching would require that the birds know the spectral signatures at the different locations and use these for discrimination. Barn owls are able to learn to discriminate sounds (Konishi and Kenuk 1975) and exhibit adaptive plasticity in the sound-localization pathway (de Bello and Knudsen 2004). This speculation might be tested by manipulating the spectra at the positions of an ambiguous pair, for example by flattening the spectra.

Of course, one could also study further cues. This would certainly lead to new insights, but we believe that it might be more promising to develop new theoretical measures based on HRTFs and to test these in combination with the new stimulus proposed here. One could include such new measures in the concerted action to better understand binaural processing (Dietz et al. 2018) to test whether they really improve our understanding of sound localization.

Beyond broadband ITD and ILD

The work presented here contributes specifically to the notion that frequency-specific ITD and ILD information is physiologically and behaviorally relevant in the barn owl. This should also help to refine the stimulus design of future studies, because many studies conducted in the past did not consider frequency specificity (but see Arthur 2004). Other species also use more information than is contained in broadband ITD and ILD for sound localization. In humans, frequency-specific notches are a good example of such an additional cue (Alves-Pinto et al. 2014; Middlebrooks 2015). A similar conjecture was made, amongst others, for the cat (Rice et al. 1992), the marmoset monkey (Slee and Young 2010), the guinea pig (Anbuhl et al. 2017), and the chinchilla (Koka et al. 2011). Of course, spectral information in a more general sense also influences sound localization (Blauert 1997). By manipulating spectral information, Wood et al. (2019) came to a conclusion similar to ours. The influence of notches has not been studied in barn-owl sound localization so far. However, an influence of frequency that improves the sound-localization capabilities of these birds has been demonstrated (Cazettes et al. 2014, 2016, 2018). Head shape has been shown to induce sound modifications that depend on the elevation of the source (Schnyder et al. 2014). Moreover, ferrets can be trained to improve localization in certain frequency bands (Keating et al. 2014). This implies that cognitive components influence sound localization in animals as well. The underestimation of target position, observed in this study and elsewhere (Wagner 1993; Poganiatz et al. 2001; Hausmann et al. 2009), was explained by a Bayesian approach as a bias for frontal space (Fischer and Pena 2011; Cazettes et al. 2018).

Neural representation of acoustic space

In the barn owl, we have a special situation, because there are neurons, the space-specific neurons, that not only represent well-defined positions, but are also arranged in a map of auditory space (Knudsen and Konishi 1978; Knudsen 1982). Focal electrical stimulation in the map of auditory space elicits head turns to the locations represented by the stimulated site (du Lac and Knudsen 1990). The removal of the spatial information by focal lesioning leads to focal deficits in sound localization (Wagner 1993). Thus, it seemed logical to examine responses of space-specific neurons in the space map of the optic tectum. These neurons are sensitive to ITD and ILD (Olsen et al. 1989). We reasoned that if these neurons represent auditory space, they should represent a location, and not cues, although cues may contain much information about a given position. There were already data showing that neurons in the external nucleus of the inferior colliculus that project to the optic tectum are sensitive to frequency-specific ILDs (Euston and Takahashi 2002; Spezio and Takahashi 2003; Arthur 2004). However, a dynamic adaptation to frequency-specific inputs, as would be required if the owls weigh frequency differently for different pairs of ambiguous positions, has not been demonstrated in these neurons so far. We found that neurons in the optic tectum also represent more information in their response than is contained in broadband ITD and ILD. This held both for the responses at the selected ambiguous pairs and for the responses at all ambiguous pairs as tested in the correlation analysis. We regard the latter data as more reliable than the former. The correlation analyses suggested that the responses at the ambiguous pairs explained only about 24% of the variance. One reason may be that these neurons, albeit broadly tuned to frequency (Brainard et al. 1992), nevertheless show some variation in the best frequency with the preferred sound location, and this was shown to be related to cue reliability (Cazettes et al. 2014, 2016). It would be interesting to examine this property in the future.

Euston and Takahashi (2002) investigated the conversion from spectrum to space by fixing ITD in HRTF-based stimuli. While these authors observed major contributions of broadband ILD to the responses, they, and Spezio and Takahashi (2003), also reported frequency-specific ILD influences. This was also seen by Arthur (2004). We would expect this to occur also in the neurons of the optic tectum.

Behavior will typically not be driven by the responses of single neurons. Therefore, population responses may yield a closer correlation between space and neural activity. This has already been implied by decoding data (e.g., Day and Delgutte 2013) and was examined in the owl by Fischer and Pena (2017). We refrained from such an analysis here, because we feared that too much information would be destroyed by averaging over spikemaps that are characterized by different centers of the main peaks, different frequencies and frequency bandwidths, and different spike rates. Nevertheless, the single-unit data and the behavioral data presented here are consistent in that they both suggest that the barn owl has more information available than broadband ITD and ILD to localize a sound source. This had not been demonstrated before for the barn owl. Finally, the data and conclusions presented here reveal previously untested analogies between mammalian and avian sound localization that may enable future cross-species comparisons or knowledge transfers.