Introduction

Intact auditory cortex is essential for normal sound localization. Bilateral lesions of the auditory cortex result in profound deficits in localization behavior, and contralesional localization deficits arguably are the most conspicuous symptoms of unilateral auditory cortex lesions. No physiological study has shown a point-to-point cortical map of auditory space. Instead, available evidence suggests that sound-source locations are represented by widely distributed populations of broadly sensitive neurons. Spatial receptive fields in cats, ferrets, and monkeys, with and without general anesthesia, typically span more than 180 ° of azimuth. Nevertheless, response patterns of neurons vary as a function of sound location within those broad receptive fields such that the response pattern of an individual neuron can distinguish multiple locations. Several empirical tests of location estimation based on responses of small ensembles of broadly tuned neurons have demonstrated performance approaching that of behaving animals. These topics have been reviewed by King and Middlebrooks (2011).

Recent behavioral and physiological studies have identified regions of the auditory cortex that appear to have distinct roles in sound localization. Lomber et al. employed cortical cooling to inactivate individually 19 cortical fields in cats and found sound localization deficits associated with only four auditory fields: the primary auditory cortex (A1), the dorsal zone (DZ), the posterior auditory field (PAF), and the anterior ectosylvian sulcus area (AES). Unilateral inactivation of PAF or AES, or combined inactivation of A1 and DZ, reduced localization accuracy to around chance levels (Malhotra et al., 2004), whereas selective inactivation of A1 or DZ produced more limited deficits, which differed between fields (Malhotra et al., 2008). Similarly, physiological recordings in α-chloralose-anesthetized cats revealed differences in spatial sensitivity among neurons in fields A1, DZ, and PAF (Stecker et al., 2003; 2005b). In those studies, A1 neurons demonstrated relatively broad spatial sensitivity, with sensitivity broadening even further as sound levels increased. First-spike latencies in A1 were rather insensitive to sound locations. In contrast, DZ and PAF showed somewhat more restricted, level-invariant, spatial sensitivity and showed prominent location-dependent changes in first-spike latency. The majority of units in DZ and PAF responded best to far-contralateral locations, although DZ also exhibited a population of ipsilaterally sensitive neurons. The far-contralateral bias in sensitivity in DZ seen under α-chloralose anesthesia conflicted with expectations based on an earlier study that used barbiturate anesthesia (Middlebrooks and Zook, 1983).

Our studies of field A1 in awake conditions have replicated the generally broad spatial sensitivity seen in the anesthetized preparation and have also demonstrated a greater variety of temporal firing patterns and of spatial sensitivity (Mickey and Middlebrooks, 2003; Lee and Middlebrooks, 2011). Notably, the spatial sensitivity of many A1 neurons sharpens when cats engage in a listening task and sharpens further when the task requires sound localization (Lee and Middlebrooks, 2011).

The present study explored functional differences among fields A1, DZ, and PAF in the absence of anesthesia and evaluated task-dependent functional dynamics. Unit responses were recorded while cats were awake but off task and while they performed listening tasks that did or did not require localization. Awake conditions emphasized some differences among fields that had been seen under anesthesia and unmasked other unanticipated properties. The results suggest a model consisting of an interconnected network of cortical fields, with A1 showing relatively weak spatial sensitivity in off-task conditions but adapting to task demands, DZ showing an overrepresentation of locations around the frontal midline region, and PAF exhibiting a relatively uniform representation of contralateral space.

Materials and methods

Overview

Data from cortical fields DZ and PAF were obtained from the same five purpose-bred cats that yielded recordings from A1 reported previously (Lee and Middlebrooks, 2011). We also present here previously unpublished data from those A1 recordings for the purpose of comparison with DZ and PAF. Each animal was trained to perform two listening tasks and then was implanted with multiple 16-site chronic recording arrays. Extracellular spike recordings were made while the animal performed each of the two tasks and also when it was not actively engaged in a task (the “Idle” condition). Procedures for behavioral training, implantation of chronic recording electrode arrays, and spike recording during various behavioral conditions were identical to those reported previously and will be only summarized here. All animal procedures were conducted at the University of Michigan and were approved by the University of Michigan Committee on the Use and Care of Animals.

Experimental apparatus

Experiments were conducted in a sound-attenuating chamber that was lined with acoustic foam. Digital sound synthesis and data acquisition were performed using instruments from Tucker-Davis Technologies (Alachua, FL, USA) and custom MATLAB software (Mathworks, Natick, MA, USA) running on a Windows-based personal computer. Sounds were presented under free-field conditions from small loudspeakers, calibrated to flatten and equalize their frequency responses (Zhou et al., 1992). The loudspeakers were positioned on a horizontal hoop, 1.2 m in radius, in 20 ° increments of azimuth from contralateral 180 ° to ipsilateral 160 °. A vertical arc, 1.1 m in radius, held speakers in 20 ° increments of elevation; the vertical arc could be rotated about the vertical axis from left 50 ° to right 50 °. The cat sat or stood on a small platform centered in the arrays of loudspeakers.

Behavioral training

Each cat learned two behavioral tasks: “Periodicity Detection” and “Localization.” In all conditions, repeated nontarget probe bursts probed the spatial sensitivity of neurons. Those probe noise bursts were independent Gaussian samples, 80 or 150 ms in duration, 30 or 50 dB SPL in level, presented at onset-to-onset intervals of 1.25 s, jittered by 0.2 s, varying in azimuth from burst to burst. Probe sound sources were located in the horizontal plane at 0 ° elevation and varied through 360 ° of azimuth in 20 ° steps in random order. Cats were trained to press and hold a pedal, and probe stimuli were presented while the pedal was held. The 10- to 20-s hold period, comprising approximately eight to 16 probe sounds, was terminated by presentation of a target sound. If the cat released the pedal within 1.5 s following the target onset, the trial was scored as a Hit and the cat received a food reward. Early pedal release was scored as a False Alarm and triggered a 2-s timeout period. Late (or no) release was scored as a Miss and no food was delivered.

The two tasks differed in the nature of the targets. The Periodicity Detection target was a 200/s click train, 80 or 150 ms in duration, which was presented from randomly varying azimuths and elevations. The Localization target was a noise burst, identical to the probe stimuli except for its location. The Localization target was presented from elevations 40 to 80 ° above the horizontal plane with the azimuth of the vertical hoop varying daily in a range of contralateral 50 ° to ipsilateral 50 ° azimuth. The high-elevation Localization targets also were presented in randomly selected trials in the Periodicity Detection condition, but responses to those stimuli were not included in the analysis. A third task condition, “Idle,” was defined by an absence of key pressing. Idle periods occurred interspersed with periods of task performance (38 % of Idle blocks) or near the end of a session when the cat was satiated (62 %). Movements of the head and body indicated that the cats were awake during Idle periods. Behavioral sessions lasted ~1.5 h and were conducted once or twice daily for each cat. Cats usually performed one or more blocks of each task in each session in an order that was varied from session to session.

The three task conditions placed differing demands on the cats. In the Idle condition, there was no contingency between the sound stimulus and reinforcement. In the Periodicity Detection condition, the cat was required to listen and evaluate sounds, but the source location was irrelevant. In the Localization condition, the cat was forced to evaluate the location of each sound in order to detect the elevated target. This localization was accomplished covertly, in that the cats typically did not make orienting movements of the head or external ears towards the sounds.

Device implantation

After training, each cat was implanted with a skull fixture and with recording electrode arrays; implantation was performed under aseptic conditions in an approved surgical suite. The skull fixture provided attachment points for the recording headstage and for the head-position tracker.

The recording arrays (NeuroNexus, Ann Arbor, MI, USA) were single-shank silicon substrate devices each having 16 recording sites spaced at 100 or 150 μm along the shank; these arrays were similar to those used in our previous acute experiments, differing in that each had a flexible silicon ribbon cable that led to a skull-mounted connector. Two to four arrays were placed during each surgical procedure, three to ten procedures per cat. Field PAF was located on the posterior bank of the posterior ectosylvian sulcus and was clearly delimited from A1 and DZ by the surface landmarks. Field A1 was located in the middle of the ectosylvian gyrus, and DZ was located dorsal to A1 on the ventral bank of the suprasylvian sulcus. The placements of arrays in particular cortical areas were confirmed by intraoperative unit recording through the arrays under conditions of isoflurane anesthesia. In particular, DZ units were distinguished from A1 units by the DZ units' characteristic broader frequency tuning biased toward high frequencies and by their longer latencies: A1 units typically responded briskly with first-spike latencies <20 ms, whereas DZ latencies tended to be longer and more variable (Middlebrooks and Zook, 1983; Stecker et al., 2005b). We attempted to place A1 and DZ arrays near the estimated centers of those fields. For that reason, we probably undersampled the region of the A1/DZ border. Unit responses varied along multisite recording arrays. We treated all recordings from a given array as being from the same cortical field. That is, all units recorded at a DZ array placement were counted as DZ units even if, for instance, one or more of them showed responses more typical of A1 units. As in our previous acute experiments (Stecker et al., 2003; 2005b), the recording arrays in field A1 were oriented roughly parallel to cortical anatomical columns, whereas those arrays that were placed in DZ and PAF on the banks of sulci tended to cross cortical columns.
Two cats received array placements in both right and left hemispheres, whereas arrays were restricted to the right hemisphere in the other three cats; in the illustrations, stimulus locations are shown as contra- or ipsilateral relative to the side of the recording site. Across the five cats, complete data from single or multiple units were recorded at a total of 70 A1 sites on 13 of the 16-site probes, 103 DZ sites on 12 probes, and 223 PAF sites on 21 probes. “Complete data” means that data were obtained from all three behavioral conditions in one session, with each condition tested with an average of ~30 probe noise bursts per stimulus location; the average number of noise bursts tested per location ranged across all units and conditions from 6.4 to 65.6. All comparisons between behavioral conditions were made within a single recording session. Characteristic frequencies of units in awake conditions often were ambiguous, in contrast to the sharp frequency tuning usually encountered under anesthetized conditions, but those characteristic frequencies could be measured ranging from 2 to 22 kHz.

Physiological recording

Behavioral conditions during physiological recording were identical to those during training, except that a headstage and a head tracker receiver were mounted on the skull fixture. The neural waveforms on the 16 sites on each recording array were recorded simultaneously, amplified, digitized, and stored on the computer disk for offline analysis. The orientation of the cat’s (unrestrained) head was recorded at the beginning and the end of each sound burst using an electromagnetic tracking system (Polhemus FASTRAK, Colchester, VT, USA). Offline, the recorded head orientations were combined with loudspeaker locations to express each stimulus location in head-centered coordinates (Mickey and Middlebrooks, 2003). Head-centered stimulus azimuths (θ) were quantized into eighteen 20 °-wide bins, centered at contralateral 170 ° to ipsilateral 170 ° at 20 ° intervals; θ is the angle formed by the sound source, the center of the animal’s head, and the animal’s midsagittal plane, irrespective of the sound-source elevation. The loudspeakers at 80 and 100 ° were the most lateral that were tested, falling precisely on the edges of the contralateral and ipsilateral 90 ° sample bins. Any sideways roll of the animal’s head decreased the azimuth angle with respect to the midsagittal plane, thereby moving an 80 or 100 ° sound source out of the 90 ° bin. For that reason, there were very few stimuli in the head-centered azimuth bins at contralateral and ipsilateral 90 °, and we were forced to eliminate those bins from the analysis, leaving 16 bins of head-centered azimuth.
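The azimuth binning described above can be illustrated with a minimal Python sketch. It is not the authors' MATLAB code; the sign convention (contralateral negative, ipsilateral positive) and the function name `bin_azimuth` are assumptions made here for illustration only.

```python
import numpy as np

# 18 bin centers, 20 deg apart: -170 (contralateral) ... +170 (ipsilateral).
# The sign convention is an assumption for this sketch.
BIN_CENTERS = np.arange(-170, 171, 20)

def bin_azimuth(theta_deg):
    """Return the center of the 20-deg bin containing theta_deg, or None
    for the two bins centered at +/-90 deg, which were dropped."""
    # Wrap into [-180, 180) so that any input angle is handled.
    theta = (theta_deg + 180.0) % 360.0 - 180.0
    idx = int(np.floor((theta + 180.0) / 20.0))  # bin index 0..17
    center = int(BIN_CENTERS[idx])
    return None if abs(center) == 90 else center
```

Of the 18 bin centers, 16 remain after the two lateral bins are dropped, which matches the 16 bins of head-centered azimuth used in the analysis.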

We recorded unit activity from cortical fields A1, PAF and DZ under three behavioral conditions: Idle, Periodicity Detection, and Localization. In all conditions, the cat was exposed to series of probe sounds consisting of 80- or 150-ms broadband noise bursts presented from varying azimuth locations in the horizontal plane. The probe noise bursts were used to investigate the spatial sensitivity of cortical units. The physiological recordings reported here reflect responses only to the broadband probe sounds, not to the target sounds.

The possible influence of pinna movements on neural spatial sensitivity was a concern. Video monitoring of the cats, however, indicated that pinna movements were minimal during recording sessions, consistent with our previous observations (Mickey and Middlebrooks, 2003). There was no indication of orientation of the head and pinnae to the probe sound bursts, which were presented at 1- to 1.5-s intervals. Moreover, significant sharpening, broadening, and/or no change in spatial sensitivity could be recorded from a set of units recorded simultaneously, which could not easily be attributed to any single change in pinna position.

Data analysis

Data analysis employed custom scripts written for MATLAB. Statistical tests used the MATLAB Statistics Toolbox. Critical values for multiple comparison were adjusted using Tukey’s least-significant difference procedure.

Extracellular action potentials (spikes) were identified offline from the stored neural waveforms using custom software based on principal component analysis of spike shape (Furukawa et al., 2000; Stecker et al., 2003). We encountered well-isolated single units and, more often, spikes from multiple unresolved units. Single- and multiunit recordings showed similar sensitivity to stimulus location and to behavioral conditions. Consistent with our previous reports and those of others, we refer to both as “unit activity.” Spike times were expressed relative to the sound onset at the loudspeaker; therefore, latencies include 3.5 ms of acoustic travel time.

The mean spike counts of units relative to stimulus time and location are represented by two-dimensional poststimulus time histograms (PSTHs), as in Figures 1, 2, 3, and 6, which show spike rate averaged across all trials as a function of head-centered azimuth and of time relative to stimulus onset. The maximum mean spike rates of these multiunit responses and the range of the number of trials at each stimulus location are given for each PSTH in the corresponding legend. Spikes were assigned to 5-ms bins, and PSTHs were smoothed in the time dimension with a three-point Hanning window. White gaps crossing the plots correspond to the bins centered at contra- and ipsilateral 90 °, which were excluded from analysis. The spatial sensitivity of units was quantified by rate–azimuth functions, R(θ), which plotted mean spike counts (R) within three time windows after stimulus onset as a function of head-centered azimuth, θ. The three time windows were the “onset window” (10 to 40 ms after stimulus onset), the “long-latency window” (41 to 80 ms or 41 to 150 ms for 80- or 150-ms duration stimuli, respectively), and the “offset window” (81 to 130 ms or 151 to 200 ms after stimulus onset for 80- or 150-ms durations). We also computed mean spike counts across the “entire recording duration”: 10 to 160 ms or 10 to 230 ms for 80- or 150-ms durations.
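As an illustration of the binning, smoothing, and windowing just described, the following Python sketch computes a smoothed PSTH and per-window mean spike counts for one stimulus location. It is a sketch under stated assumptions, not the analysis code used in the study: the three-point Hanning weights are taken to be (0.25, 0.5, 0.25), window limits are treated as inclusive, and the names `smoothed_psth`, `window_counts`, and `WINDOWS` are hypothetical.

```python
import numpy as np

BIN_MS = 5.0                          # PSTH bin width given in the text
HANN3 = np.array([0.25, 0.5, 0.25])   # assumed three-point Hanning weights

# Analysis windows (ms after stimulus onset), keyed by stimulus duration.
WINDOWS = {
    80:  {"onset": (10, 40), "long_latency": (41, 80),  "offset": (81, 130)},
    150: {"onset": (10, 40), "long_latency": (41, 150), "offset": (151, 200)},
}

def smoothed_psth(spike_times_ms, n_trials, t_max_ms=250.0):
    """Mean spike rate (spikes/s) in 5-ms bins, smoothed along time."""
    edges = np.arange(0.0, t_max_ms + BIN_MS, BIN_MS)
    counts, _ = np.histogram(spike_times_ms, bins=edges)
    rate = counts / n_trials / (BIN_MS / 1000.0)
    return np.convolve(rate, HANN3, mode="same")

def window_counts(spike_times_ms, n_trials, duration_ms):
    """Mean spikes per trial in each analysis window (limits inclusive)."""
    t = np.asarray(spike_times_ms, dtype=float)
    return {name: np.count_nonzero((t >= lo) & (t <= hi)) / n_trials
            for name, (lo, hi) in WINDOWS[duration_ms].items()}
```

Here `spike_times_ms` pools the spike times from all trials at one location. A rate–azimuth function for a given window is then the sequence of that window's mean counts across the 16 azimuth bins.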

Fig. 1

Poststimulus time histograms (PSTHs) showing normalized mean spike rates (colors) as a function of time (horizontal axis) and head-centered stimulus azimuth (vertical axis). In each plot, the thin white lines at the bottom of the plots indicate the 80- or 150-ms stimulus duration. White gaps crossing the plots correspond to spatial bins centered at ipsilateral and contralateral 90 °, which were omitted from analysis. These PSTHs represent units studied in the Idle condition. A–C Three units recorded in field A1. Maximum mean multiunit spike rates were 10.7, 11.4, and 5.7 spikes/s based on 22–43, 17–28, and 13–34 trials at each location in A, B, and C, respectively. D–F Three units recorded in field DZ. Maximum mean multiunit spike rates were 7.0, 37.9, and 13.5 spikes/s based on 18–27, 21–44, and 29–55 trials at each location in D, E, and F, respectively.

Fig. 2

A–G PSTHs of seven units recorded in field PAF in the Idle condition. Plot conventions are the same as in Figure 1. Arrows and numbers along the right axes of the panels indicate the azimuths of the best-area centroids, expressed as ipsilateral (i) or contralateral (c) relative to the recording site. Maximum mean multiunit spike rates were 31.9, 8.6, 14.7, 9.2, 16.8, 39.3, and 24.1 spikes/s based on 36–68, 16–38, 13–34, 14–28, 13–34, 14–28, and 20–32 trials at each location in A–G, respectively. H, I Best-area centroids of 18 (H) or 14 (I) units recorded along each of two probe placements in field PAF oriented approximately perpendicular to the cortical surface. NC (no centroid) indicates that no centroid could be computed because a unit’s spike rate did not vary sufficiently across locations. The ticks on the depth axis indicate intervals of 0.1 mm along the recording probes, although the specific depths in the cortex were not verified histologically.

Fig. 3

PSTHs of four units recorded in field PAF. Plot conventions are the same as in Fig. 1. A, B Responses of two units in the Idle condition showing suppressive responses to sounds. Maximum mean multiunit spike rates were 36.8 and 5.0 spikes/s based on 19–31 and 18–56 trials per location in A and B, respectively. C, D Responses of two units in the Localization condition recorded at sites separated by 600 μm along one probe placement. The units show complementary suppressive (C) and excitatory (D) responses with similar spatial preferences. Maximum mean multiunit spike rates were 12.3 and 13.7 spikes/s based on 21–65 and 21–65 trials at each location in C and D, respectively.

We classified PSTHs as being “onset-dominant,” meaning that they contained a reliable response only in the onset window, or “complex,” meaning that responses were present in both onset and long-latency windows; other PSTHs showed primarily inhibition or suppression. Among units showing primarily excitatory responses to sounds, the distinction between onset and complex PSTHs was made first by determining the stimulus location corresponding to the peak of the rate–azimuth function, computed across the entire recording duration. If the response in the long-latency window to that stimulus location was ≥20 % (for the 80-ms stimulus) or 50 % (for the 150-ms stimulus) of the response across the entire recording duration, that PSTH was classified as complex; otherwise, it was onset-dominant.
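The onset-dominant versus complex criterion can be written compactly. The Python sketch below assumes that the inputs are the mean spike counts at the peak location of the rate–azimuth function; the function name is hypothetical.

```python
def classify_excitatory_psth(long_latency_count, entire_count, duration_ms):
    """Label an excitatory PSTH 'complex' or 'onset-dominant'.

    Both counts are mean spike counts at the peak location of the
    rate-azimuth function computed over the entire recording duration.
    """
    # Criterion fraction from the text: 20 % (80-ms) or 50 % (150-ms stimuli).
    criterion = {80: 0.20, 150: 0.50}[duration_ms]
    if entire_count <= 0:
        raise ValueError("no response at the peak location")
    ratio = long_latency_count / entire_count
    return "complex" if ratio >= criterion else "onset-dominant"
```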

Rate–azimuth functions for Periodicity Detection and Localization conditions were compiled only from responses to probe stimuli on Hit trials in which the response key was depressed and held until the cat correctly released the key in response to a subsequent presentation of a target. Responses to probe stimuli that were followed by early or late releases of the key (False Alarms or Misses, respectively) were excluded from analysis. Rate–azimuth functions in the Idle condition were compiled from responses to probe stimuli during periods in which the key was not pressed. For some units, the rate–azimuth functions for one condition were combined from more than one block of trials of the same condition during the same experimental session.

The modulation depth of spike rate by stimulus location was defined as 100 × (R_max − R_min) / R_max, where R_max and R_min were the maximum and minimum mean spike rates in a rate–azimuth function. “Best areas” for responses excited by sounds were the stimulus regions that produced maximum responses of units. Best areas for responses that were suppressed by sounds were the stimulus regions that produced minimum responses. We represented best-area locations by spike count-weighted centroids, which were computed only in cases in which the modulation depth was >50 %. For units that were excited by sounds, a peak was defined as the set of one or more contiguous locations around the maximum of the rate–azimuth function at which the response exceeded 0.75 × R_max, plus the two flanking locations. For units that were suppressed by sounds, a trough was defined as the set of responses at one or more contiguous locations around the minimum of the rate–azimuth function that were lower than R_min + 0.25 × (R_max − R_min), plus the two flanking locations. The centroid of each best area was given by the orientation of the spike count-weighted vector sum of responses within the peak or trough.
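A minimal Python sketch of the modulation depth and of the best-area centroid for an excitatory unit follows. It is an illustration under assumptions rather than the authors' implementation: the azimuth bins are treated as a circular sequence in the order supplied (so that contiguity ignores the two omitted 90 ° bins), and the function names are hypothetical. The trough case for suppressed units is not shown.

```python
import numpy as np

def modulation_depth(rates):
    """100 * (R_max - R_min) / R_max for one rate-azimuth function."""
    r = np.asarray(rates, dtype=float)
    if r.max() <= 0:
        return 0.0
    return 100.0 * (r.max() - r.min()) / r.max()

def best_area_centroid(az_deg, rates, frac=0.75):
    """Centroid (deg) of the peak of an excitatory rate-azimuth function,
    or None when the modulation depth is not above 50 %."""
    az = np.asarray(az_deg, dtype=float)
    r = np.asarray(rates, dtype=float)
    if modulation_depth(r) <= 50.0:
        return None
    n = len(r)
    thresh = frac * r.max()
    peak = int(np.argmax(r))
    # Grow the peak in both directions while responses exceed threshold.
    # A depth above 50 % guarantees a sub-threshold bin, so both loops stop.
    hi = peak
    while r[(hi + 1) % n] > thresh:
        hi += 1
    lo = peak
    while r[(lo - 1) % n] > thresh:
        lo -= 1
    # Add the two flanking locations; unique() guards against overlap.
    idx = np.unique(np.arange(lo - 1, hi + 2) % n)
    # Orientation of the spike count-weighted vector sum.
    vec = np.sum(r[idx] * np.exp(1j * np.deg2rad(az[idx])))
    return float(np.rad2deg(np.angle(vec)))
```

Using a vector sum rather than a linear weighted mean keeps the centroid well defined for best areas that straddle the rear midline, where azimuth wraps from contralateral 170 ° to ipsilateral 170 °.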

The width of spatial sensitivity of each unit was represented by the width of its equivalent rectangular receptive field (ERRF). The ERRF was computed by integrating the area under the rate–azimuth function and reshaping to form a rectangle of equivalent peak rate and area (see Supplementary Fig. 1 in Lee and Middlebrooks, 2011). The ERRF width was adopted as the spatial tuning metric because it reflected both the breadth of tuning and the depth of location-dependent spike rate modulation. Also, because the ERRF was computed from responses to all stimulus locations, it was less sensitive to trial-by-trial response variance than would be a metric that used a particular response criterion, such as tuning width at half-maximal response. A rate–azimuth function typically contained a peak that was substantially narrower than the ERRF width and tails that were broader. We used a bootstrap procedure (Efron and Tibshirani, 1991) to estimate the trial-by-trial variation in ERRF widths of individual units. Each bootstrapped ERRF width was computed from a rate–azimuth function formed from the mean of a sample of spike rates at each azimuth, sampled randomly with replacement. The bootstrap sample size for each unit was the mean number of trials across locations for that unit. Comparisons of ERRF widths between pairs of task conditions were made by forming a receiver operating characteristic (ROC) curve (Green and Swets, 1966) based on 1,000 such computations for each condition. The area under the ROC curve yielded the proportion of trials in which the ERRF width was narrower (i.e., sharper tuning) or broader in a particular task condition. A proportion of 0.76 (corresponding to a discrimination index of 1.0) was used as the criterion indicating that an individual unit showed significant sharpening or broadening.
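The ERRF width, its bootstrap distribution, and the ROC comparison can be sketched as follows. This Python example is illustrative only and makes several assumptions that are not in the text: the area under the rate–azimuth function is approximated by a sum over uniform 20 ° bins, no spontaneous rate is subtracted, and all function names are hypothetical.

```python
import numpy as np

def errf_width(rates, bin_deg=20.0):
    """Width (deg) of the rectangle having the same area and peak rate
    as the rate-azimuth function (uniform bins assumed)."""
    r = np.asarray(rates, dtype=float)
    if r.max() <= 0:
        return float("nan")
    return bin_deg * r.sum() / r.max()

def bootstrap_errf(trials_by_az, n_boot=1000, rng=None):
    """Bootstrap ERRF widths. trials_by_az[k] holds the single-trial
    spike rates (or counts) at the k-th azimuth."""
    rng = np.random.default_rng() if rng is None else rng
    # Sample size: mean number of trials across locations for this unit.
    n = int(round(np.mean([len(t) for t in trials_by_az])))
    widths = np.empty(n_boot)
    for b in range(n_boot):
        means = [rng.choice(np.asarray(t, dtype=float), size=n).mean()
                 for t in trials_by_az]
        widths[b] = errf_width(means)
    return widths

def roc_area(widths_a, widths_b):
    """Area under the ROC curve: P(a < b) + 0.5 * P(a == b)."""
    a = np.asarray(widths_a, dtype=float)[:, None]
    b = np.asarray(widths_b, dtype=float)[None, :]
    return float(np.mean(a < b) + 0.5 * np.mean(a == b))
```

With these definitions, a value of `roc_area(widths_a, widths_b)` of at least 0.76 would indicate that condition a yielded significantly narrower ERRFs than condition b. For two equal-variance normal distributions, an area of about 0.76 corresponds to a discrimination index of 1.0, since Φ(1/√2) ≈ 0.76.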

First-spike latencies for each unit at each stimulus location were given by the across-trial geometric mean of the latency of the first spike in each trial that fell ≥10 ms after stimulus onset. Trials that failed to elicit at least one spike were omitted from this computation. The across-location ranges of first-spike latencies were given by the differences between the longest and shortest mean first-spike latencies across all stimulus locations. Our presentation of latencies is consistent with that in our previous reports (Stecker et al., 2003; 2005b) in that latencies of individual units are given by geometric means, whereas the central tendencies of distributions across units are given by medians.
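The latency computation can be sketched in Python as below. The function names are hypothetical, and the example assumes that each trial is supplied as a list of spike times in milliseconds relative to sound onset at the loudspeaker.

```python
import numpy as np

def first_spike_latency(trials_ms, min_latency_ms=10.0):
    """Across-trial geometric mean of first-spike latency at one location.
    Trials with no spike at or after min_latency_ms are omitted."""
    firsts = []
    for spikes in trials_ms:
        s = np.asarray(spikes, dtype=float)
        s = s[s >= min_latency_ms]
        if s.size:
            firsts.append(s.min())
    if not firsts:
        return float("nan")
    return float(np.exp(np.mean(np.log(firsts))))

def latency_range(latency_by_location):
    """Longest minus shortest mean first-spike latency across locations."""
    v = np.asarray(latency_by_location, dtype=float)
    return float(np.nanmax(v) - np.nanmin(v))
```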

For the purpose of comparing spike rates and first-spike latencies among behavioral conditions, we defined “preferred” and “least preferred” locations for each unit. First, we smoothed rate–azimuth functions in each condition with a three-point Hanning window. Next, across-condition rate–azimuth functions were formed by averaging across Idle, Periodicity Detection, and Localization conditions. Finally, the preferred and least preferred locations were given by the locations producing maxima and minima, respectively, of the across-condition rate–azimuth functions. For each unit, the same preferred and least preferred locations were used across all three behavioral conditions.
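A short Python sketch of this procedure follows. It assumes smoothing weights of (0.25, 0.5, 0.25) applied circularly across the azimuth bins; neither the weights nor the treatment of the end bins is specified in the text, and the function names are hypothetical.

```python
import numpy as np

def smooth3(rates):
    """Three-point smoothing (assumed weights 0.25, 0.5, 0.25) with
    circular wrap-around across azimuth."""
    r = np.asarray(rates, dtype=float)
    return 0.25 * np.roll(r, 1) + 0.5 * r + 0.25 * np.roll(r, -1)

def preferred_locations(az_deg, rates_by_condition):
    """Return (preferred, least preferred) azimuths from the mean of the
    smoothed rate-azimuth functions across behavioral conditions."""
    mean_fn = np.mean([smooth3(r) for r in rates_by_condition], axis=0)
    return az_deg[int(np.argmax(mean_fn))], az_deg[int(np.argmin(mean_fn))]
```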

We adapted methods from our previous studies to quantify the accuracy with which spike firing patterns of individual units and ensembles of units could signal sound-source locations (Furukawa and Middlebrooks, 2001; Mickey and Middlebrooks, 2003; Stecker et al., 2003); the classification scheme used here differed from those in our previous studies in that it used more limited temporal resolution and in that it yielded mean localization errors as a function of target location. Classifications of responses were based on three-element vectors comprising spike counts within onset, long-latency, and offset time windows (defined above). For each unit: (1) We divided trials randomly into two equal-sized pools, A and B. (2) For each location θ_A or θ_B, we drew randomly with replacement eight three-element response vectors and averaged them to form R_A(θ_A) and R_B(θ_B). (3) For each target location, θ_A, we computed the Euclidean distance from R_A(θ_A) to R_B(θ_B) for each θ_B and recorded the θ_B that minimized the Euclidean distance, which was the “estimated location.” (4) We recorded the absolute error (in degrees) between the target and the estimated location. Steps 1 through 4 were repeated 200 times (with 200 different random pools A and B), and absolute errors for each target location were averaged across the 200 repetitions. The result was a plot of mean absolute error versus target location. To quantify location signaling by ensembles of four or 16 units, we drew randomly with replacement samples of responses from four or 16 units and concatenated their three-element response vectors. Classification (i.e., steps 1 through 4 described above) was then based on the resulting 12- or 48-element vectors. Sampling of ensembles and the classification analysis described above were performed 100 times for ensembles of four and 100 times for ensembles of 16 units.
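Steps 1 through 4 can be illustrated with the following Python sketch. It is a minimal reading of the procedure under stated assumptions, not the authors' code: the split into pools A and B is done separately at each location, absolute errors are wrapped to at most 180 °, and the function names are hypothetical.

```python
import numpy as np

def circ_abs_diff(a_deg, b_deg):
    """Absolute angular difference in degrees, wrapped to [0, 180]."""
    d = np.abs(np.asarray(a_deg, float) - np.asarray(b_deg, float)) % 360.0
    return np.minimum(d, 360.0 - d)

def mean_abs_error(az_deg, responses, n_rep=200, n_draw=8, rng=None):
    """Mean absolute localization error for each target location.

    responses[k] is an (n_trials, n_features) array of response vectors
    at az_deg[k]; n_features is 3 for a single unit and 12 or 48 for
    concatenated ensembles. At least two trials per location are needed.
    """
    rng = np.random.default_rng() if rng is None else rng
    az = np.asarray(az_deg, dtype=float)
    errors = np.zeros(len(az))
    for _ in range(n_rep):
        mean_a, mean_b = [], []
        for resp in responses:
            resp = np.asarray(resp, dtype=float)
            # Step 1: split trials randomly into equal-sized pools A and B.
            order = rng.permutation(len(resp))
            half = len(resp) // 2
            pool_a = resp[order[:half]]
            pool_b = resp[order[half:2 * half]]
            # Step 2: average n_draw vectors drawn with replacement.
            mean_a.append(pool_a[rng.integers(half, size=n_draw)].mean(axis=0))
            mean_b.append(pool_b[rng.integers(half, size=n_draw)].mean(axis=0))
        ra, rb = np.array(mean_a), np.array(mean_b)
        # Step 3: Euclidean distance from each R_A(theta_A) to every R_B(theta_B).
        dist = np.linalg.norm(ra[:, None, :] - rb[None, :, :], axis=2)
        estimated = az[np.argmin(dist, axis=1)]
        # Step 4: absolute error between target and estimated location.
        errors += circ_abs_diff(az, estimated)
    return errors / n_rep
```

Because the function is indifferent to the length of the response vectors, the same sketch covers the ensemble case once the three-element vectors of the sampled units have been concatenated.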

Results

Characteristics of spatial selectivity and firing patterns across three cortical fields

Units in fields A1, DZ, and PAF of awake cats exhibited a greater variety of temporal firing patterns and spatial sensitivity than has been observed under anesthetized conditions. We first consider those properties recorded in the Idle condition. Three representative examples of A1 units are shown in Figure 1A–C. As reported in our previous study (Lee and Middlebrooks, 2011), the majority of units recorded in A1 (56 %) responded primarily with excitatory responses to the onsets of noise bursts. In the Idle condition, the spatial sensitivity of these onset-dominant units was broad, with the spatial response area typically occupying more than a hemifield (Fig. 1A). Another 33 % of A1 units showed complex temporal firing patterns, consisting of an onset response followed by a period of suppression followed by one or more additional bursts of spikes (Fig. 1B); the percentages of onset-dominant and complex units reported here differ slightly from those in the previous paper because we adopted a quantitative criterion in the present work that could be applied to all three cortical fields. In the Idle condition, the units with complex firing patterns usually exhibited sharper spatial tuning than that exhibited by onset units. The long-latency portion of the response consistently showed spatial tuning that was as sharp as or, often, sharper than that of the onset burst. The remaining 11 % of units in our A1 sample showed prominent offset responses (as in Fig. 1C) or suppression of spontaneous activity (not illustrated). Spatial sensitivity of the suppression and/or offset units was broad.

Temporal response patterns of many units recorded from DZ were largely similar to those in A1. Like the majority of units in A1, the responses of 54 % of units in DZ were dominated by excitatory responses to stimulus onset, with little or no long-latency activity. About half of the units showing onset-dominant responses were like most units in A1 in that their spatial sensitivity was broad or favored nonfrontal locations. The other half of onset-dominant units in DZ differed from those in A1 in that they showed relatively sharp spatial tuning centered near the frontal midline (Fig. 1D). As in A1, a small percentage (5 %) of DZ units showed either suppression of the spontaneous activity or predominantly offset responses. The remaining 41 % of DZ units exhibited complex temporal firing patterns and complex spatial sensitivity. The example in Figure 1E showed an onset response that was broadly tuned followed by a long-latency response exhibiting much sharper spatial tuning restricted to near the frontal midline. In this example, the long-latency response to midline stimuli persisted for ~100 ms after the stimulus offset. Another unit (Fig. 1F) showed an onset response tuned to frontal locations followed by a long-latency response having similar spatial sensitivity. The examples in Figure 1E and F are representatives of the 41 % of DZ units showing complex temporal firing patterns in that the long-latency portion of the response typically was as or more sharply tuned than was the onset portion and in that the long-latency response typically favored near-frontal midline locations.

Units recorded in PAF exhibited an even greater variety of temporal response patterns and spatial selectivity than was observed in A1 and DZ. Only 30 % of PAF units showed onset-dominant excitatory responses like those of the majority of A1 and DZ units. Most of those units showed rather broad spatial sensitivity. A larger population of PAF units, 46 %, showed complex temporal firing patterns containing onset and long-latency components. About a third of those (complex pattern) units were rather insensitive to stimulus location. The other two-thirds had sharper spatial sensitivity. Unlike the sharply tuned units in DZ, however, best-area centroids of PAF units were uniformly distributed across contralateral and, to a lesser degree, ipsilateral space. In Figure 2A–G, we show PSTHs of seven units that are representative of the PAF units that showed spatially restricted complex responses, recorded from four animals during Idle conditions; the panels are arranged according to the ipsilateral to contralateral locations of their best-area centroids. The example in Figure 2A was a unit that favored stimuli from 19 ° azimuth in the right hemifield, which was ipsilateral to the recording site. The unit in Figure 2B favored stimuli near the frontal midline. Units in Figure 2C to F had best-area centroids ranging from contralateral 30 ° to 151 ° azimuth. The example in Figure 2G responded best to stimuli falling in the ipsi- and contralateral 170 ° bins, indicating a best-area centroid located near the rear midline. Although the examples shown here exhibited a variety of temporal firing patterns, most of them had some combination of onset and long-latency response components. Usually the long-latency component showed sharper spatial tuning than did the onset component (Fig. 2C–F).

The relatively uniform distribution of best-area centroids among the PAF units having complex temporal firing patterns raised the possibility that those units might represent a special population constituting a topographical map of auditory space. Our experimental design was not optimized for systematic cortical mapping. Nevertheless, we often encountered nearby units that had widely separated centroids, and widely separated units sometimes had similar centroids, both of which are inconsistent with the presence of a topographic map of space. Those characteristics are evident in Figure 2H–I, which summarize the best-area centroids of units encountered along two multisite probe placements in PAF that were oriented approximately perpendicular to the cortical surface. Each placement encompassed a large range of best-area centroids that varied erratically as a function of recording location, suggesting the absence of location-specific cortical columns. Moreover, there was no indication of particular best-area centroids associated with specific cortical depths.

Some 30 % of units recorded in PAF showed spontaneous activity that was suppressed by sounds or responded primarily after sound offset. The suppression typically was tonic, sometimes outlasting the duration of the sound. The units represented in Figure 3A and B had high spontaneous activity that was suppressed by the sound stimulus. That suppression could be spatially selective, as in Figure 3A, or quite broad, as in Figure 3B. Figure 3C and D represent a pair of units that is representative of three pairs of nearby units that showed complementary excitation and inhibition; the illustrated units were recorded simultaneously at sites separated by 600 μm during Localization conditions. The unit in Figure 3C showed suppression that was restricted to contralateral sounds. In contrast, the unit in 3D showed tonic excitation with spatial sensitivity corresponding to the region of suppressive sounds in the nearby unit. About 1/4 of the units exhibiting inhibitory responses to sounds (6 % of all PAF units) also had onset or long-latency components to their PSTHs. The characteristics of spatial sensitivity of units in all three cortical fields along with their temporal firing patterns are summarized in Table 1.

TABLE 1 Characteristics of spatial sensitivity and temporal firing patterns in the Idle condition in all three cortical fields

Quantitative comparison of spatial sensitivity among three cortical fields

We quantified the spatial sensitivity of units by the centroids of their best areas and by the widths of their ERRFs (defined in “Materials and methods”). These metrics were based on spike rates averaged over the entire recording duration in order to capture the diversity of response patterns and spatial sensitivity among different cortical neurons; we also considered separately the spatial sensitivity of onset and long-latency components of units having complex PSTHs. Spatial tuning metrics presented in this section were all based on recordings in the Idle condition. Widths of ERRFs were computed for all units. The centroids of best areas were computed only for units that showed ≥50 % modulation of spike rate as a function of stimulus location; other units are labeled in the figures with “NC” for “No Centroid”.
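The two tuning metrics can be illustrated with a minimal sketch. It assumes that the ERRF width is the area under a unit's rate–azimuth function divided by its peak rate, and that modulation depth is (maximum − minimum)/maximum; both are assumptions made for illustration, since the formal definitions are given in “Materials and methods”. The function names, the 20 ° bin width, and the example rates are hypothetical.

```python
def errf_width(rates, bin_width_deg=20.0):
    """Width (deg) of a rectangle with the same area and peak as the
    rate-azimuth function (assumed definition of ERRF width)."""
    peak = max(rates)
    if peak <= 0:
        raise ValueError("unit has no driven response")
    return sum(rates) * bin_width_deg / peak


def modulation_depth(rates):
    """Fractional modulation of spike rate by location (assumed definition)."""
    peak = max(rates)
    return (peak - min(rates)) / peak


def has_centroid(rates, criterion=0.5):
    """True if the unit meets the >=50 % modulation criterion for a centroid."""
    return modulation_depth(rates) >= criterion


# Hypothetical mean spike rates at 18 azimuths spaced 20 degrees apart
flat_unit = [10.0] * 18                   # responds equally everywhere
tuned_unit = [8.0, 4.0] + [0.0] * 16      # responds at only two locations

print(errf_width(flat_unit), has_centroid(flat_unit))
print(errf_width(tuned_unit), has_centroid(tuned_unit))
```

Under these assumptions, a unit that responds equally at all locations has the maximum possible width and would be labeled “NC”, whereas a unit responding at only a few locations has a narrow ERRF and a measurable centroid.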

We compare in Figure 4 the distributions of best-area centroids across three cortical fields. In each panel, symbols are ranked by centroid. The three rows of panels represent, respectively, all units, onset-dominant units, and units having complex PSTHs. The first three columns represent units in fields A1, DZ, and PAF, respectively, that showed excitation in response to sound. Panel D shows centroids for suppression of PAF units; the numbers of suppressed units sampled in A1 and DZ were too small to yield meaningful plots. The top row of panels illustrates that field DZ exhibited a relatively large proportion of centroids near the frontal midline (indicated by the increased slope in the progression of symbols through ±45 ° azimuth), that field PAF exhibited a more uniform distribution of centroids throughout the contralateral hemifield (indicated by the relatively uniform slope), and that field A1 was intermediate between DZ and PAF. Of the units having measurable centroids (i.e., modulation depth ≥50 %), area DZ had 58.5 % of its centroids in the frontal quadrant of space (i.e., between contra- and ipsilateral 45 °) and 23.2 % in the contralateral quadrant (i.e., between contralateral 135 and 45 °), more than twice as many frontal as contralateral quadrant centroids. In contrast, area PAF had only 35.8 % frontal and 36.7 % contralateral quadrant centroids, a roughly equal distribution between the two quadrants. The distributions of centroids of all excitatory units (i.e., Fig. 4A, B, and C) were compared using a two-sample Kolmogorov–Smirnov test. The difference between areas DZ and PAF was significant (for N = 98 DZ units and 169 PAF units, K = 0.28, p = 0.00069). Differences between A1 and DZ and between A1 and PAF were not significant (p > 0.05).
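For readers unfamiliar with the two-sample Kolmogorov–Smirnov comparison, the sketch below computes the test statistic, which is the largest vertical distance between the empirical cumulative distributions of two samples. It is an illustration only: the function name and the centroid values are hypothetical, the p value is not computed, and we assume that the reported K corresponds to this statistic.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Largest absolute difference between two empirical CDFs."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    largest = 0.0
    # The difference between step-function CDFs can only change at data values
    for value in sorted(set(a) | set(b)):
        cdf_a = bisect_right(a, value) / len(a)
        cdf_b = bisect_right(b, value) / len(b)
        largest = max(largest, abs(cdf_a - cdf_b))
    return largest


# Hypothetical best-area centroids (deg azimuth; positive = contralateral)
frontal_field = [-30, -10, 0, 10, 20]
lateral_field = [40, 60, 80, 100]

print(ks_statistic(frontal_field, lateral_field))   # non-overlapping samples
print(ks_statistic(frontal_field, frontal_field))   # identical samples
```

The statistic ranges from 0 for identical distributions to 1 for samples that do not overlap at all.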

Fig. 4

Distributions of centroid locations. Each unit is represented by a symbol indicating the best-area centroid. Units for which no centroid (NC) could be computed are represented by symbols placed at the right edge of each plot. Units are ordered by best-area centroids. Horizontal dashed lines indicate the percentages of the populations in each condition having centroids within 45 ° of the frontal midline. The various panels show units recorded in various fields and showing various response patterns. A, B, C All units in fields A1, DZ, and PAF having any PSTH component showing excitation to sounds. D All units in field PAF showing primarily suppression or an offset response to sounds. E, F, G Units in fields A1, DZ, and PAF showing an onset response and little or no long-latency response. H, I, J Units in fields A1, DZ, and PAF having PSTHs containing onset and long-latency responses. The number of units (N) represented in each panel is indicated.

The differences among fields were magnified by examination of just the units having complex PSTHs (Fig. 4H, I, J). Among complex units, PAF again showed approximately equal percentages of frontal and contralateral quadrant centroids: 37.6 % frontal and 35.3 % contralateral, whereas the imbalance of frontal and contralateral centroids increased in DZ, to 74.3 % frontal and 11.4 % contralateral units. Among the complex units, there were significant differences in the distributions of centroids between fields A1 and DZ (for N = 23 A1 units and 42 DZ units, K = 0.63, p = 0.000036) and between fields DZ and PAF (N = 42 DZ units and 103 PAF units; K = 0.44, p = 0.000089); the difference between A1 and PAF was not significant (K = 0.21, p = 0.43). In a comparison of units having onset-only (Fig. 4E, F, G) or complex (Fig. 4H, I, J) PSTHs, areas A1 and DZ both showed significant differences in the distribution of centroids (A1: N = 39 onset and 23 complex units, K = 0.63, p = 0.00020; DZ: N = 56 onset and 42 complex units, K = 0.30, p = 0.038); there was no significant difference in PAF (p = 0.99). The major difference between onset and complex units in A1 was an increase among the complex units in the percentage of units having measurable centroids and a greater number of contralateral quadrant centroids, whereas in DZ, there was an increase among the complex units in the percentage of units with frontal centroids. Among just the complex units, there was no systematic difference in the locations of centroids computed from onset versus long-latency responses (data not shown; K = 0.087 to 0.189, p = 0.41 to 0.88, depending on cortical field).

Among PAF units that showed spontaneous activity that was suppressed by sound, we computed best-area centroids from the minima of rate–azimuth functions (Fig. 4D, see “Materials and methods”); 67 % of suppressed units showed measurable best areas. Unlike PAF units with excitatory responses, best-area centroids of suppressive responses in PAF were restricted largely to the frontal hemifield, distributed fairly uniformly from ipsilateral 90 ° to contralateral 90 °.

The breadth of spatial sensitivity in the three cortical fields was quantified by the widths of their ERRFs. The widths of ERRFs reflected both the spatial extent of responses and the depth of location-related modulation of spike rates. Across all excitatory units, widths of ERRFs computed across full recording durations averaged 217.5, 204.1, and 211.1 ° in fields A1, DZ, and PAF, respectively. Widths of ERRFs in DZ were somewhat smaller than in A1 and PAF, but that only approached statistical significance (Analysis of variance (ANOVA), F (2,326) = 2.74, p = 0.066). For comparison with previous studies, the widths of rate versus azimuth functions at half-maximal responses averaged 266.1 ° in A1, 234.7 ° in DZ, and 249.0 ° in PAF.

In Figure 5, we compare the relationship between each unit’s centroid location and its ERRF width. The top row of panels represents every excitatory unit, with values computed across full recording durations. In the Idle condition, most A1 units had ERRFs broader than one hemifield, and there was no indication of a particular region of sharper spatial tuning (Fig. 5A). Units in DZ and PAF having narrower than average ERRFs similarly were widely distributed, although there was a tendency for the most sharply tuned units in areas DZ and PAF to show centroids near the frontal midline (Fig. 5B and C). The bottom three rows of panels in Figure 5 represent onset-dominant units (Fig. 5D–F), ERRF widths and centroids computed from just the onset responses of units with complex PSTHs (Fig. 5G–I), and corresponding values computed from just the long-latency responses of the same complex units (Fig. 5J–L). Computed across full recording durations, ERRF widths averaged 224.4, 209.4, and 231.6 ° for onset-dominant units in A1, DZ, and PAF, respectively, and averaged 205.9, 197.1, and 198.0 ° for units with complex PSTHs in A1, DZ, and PAF, respectively. Complex units had significantly narrower ERRFs than did onset-only units (two-way ANOVA: F (1,325) = 40.4, p < 0.00001 for the main effect of unit type and F (2,325) = 3.94, p = 0.020 for main effect of cortical field). Among the units with complex PSTHs, the long-latency components of responses showed consistently narrower spatial tuning than did onset components (two-way ANOVA: F (1, 324) = 16.9, p = 0.000051 for PST component, F (2, 324) = 3.23, p = 0.041 for cortical field); one DZ unit and seven PAF units had robust long-latency responses but inconsistent onset responses and, therefore, were excluded from the comparison of onset and long-latency ERRF widths. Mean ERRF widths were 171.5, 173.9, and 182.4 ° for the onset components in areas A1, DZ, and PAF, respectively, compared to 159.0, 148.3, and 167.2 °, respectively, for the long-latency components.

Fig. 5

Breadth of spatial tuning represented by widths of equivalent rectangular receptive fields (ERRFs). The ERRF width of each unit is shown as a function of its centroid azimuth. The horizontal dashed line in each panel indicates the mean ERRF width. The top row of panels (A, B, and C) represents ERRF widths computed from full recording durations (indicated “All PST”) from all units that showed an excitatory response to sounds. The second row represents units having onset-dominant responses. The third and fourth rows represent only the units that had complex PSTHs containing long-latency components. G, H, and I show ERRF widths computed from only the onset responses, whereas J, K, and L show ERRF widths computed from only the long-latency (indicated as “Late”) responses. Among the complex units, one DZ unit and seven PAF units had robust long-latency responses but inconsistent onset responses. Those units are represented in K and L but not H and I.

Task-dependent modulation of spatial sensitivity in A1, DZ, and PAF

We have shown previously that the spatial sensitivity of the onset responses of many units in field A1 sharpens during task performance (Lee and Middlebrooks, 2011). Task-dependent sharpening of onset response spatial tuning was also observed in DZ and PAF units. An example from DZ is shown in Figure 6A–C. In the Idle condition, that unit fired a transient burst of spikes predominantly within the first 40 ms after the stimulus onset in response to sounds throughout the contralateral hemifield (Fig. 6A). When the animal engaged in the Periodicity Detection task, however, the unit’s onset responses became restricted to stimuli around the frontal midline (Fig. 6B). That midline tuning also was evident during the Localization condition (Fig. 6C); the overall response magnitude was somewhat reduced in the Localization condition in this example.

Fig. 6

Task-dependent modulation of response pattern and spatial sensitivity in DZ and PAF. Each row of PSTHs represents data from one unit studied in three behavioral conditions during one recording session. Left, middle, and right columns of panels represent the Idle, Periodicity Detection, and Localization conditions, respectively. The color map is equalized across the three task conditions for each unit such that any particular color indicates the same spike density (spikes per time and location bin) across the three panels in each row. A–F Two DZ units. Maximum mean multiunit spike rates were 6.7, 9.2, 5.1, 37.9, 37.2, and 35.7 spikes/s based on 23–47, 20–47, 9–34, 21–44, 20–50, and 23–52 trials at each location. G–L Two PAF units. Maximum mean multiunit spike rates were 25.6, 32.9, 27.6, 17.8, 37.2, and 18.5 spikes/s based on 29–60, 30–60, 29–63, 29–48, 28–52, and 24–46 trials at each location. Plot conventions as in Fig. 1.

As shown in the previous section, many DZ units in the Idle condition exhibited reliable long-latency responses with somewhat restricted spatial sensitivity for stimuli near the frontal midline. Responses of a tonically firing DZ unit are shown in Figure 6D–F in the three behavioral conditions. In the Idle condition, this unit showed a broadly tuned onset response followed by a long-latency response driven only by the stimuli near the frontal midline. The long-latency activity was somewhat greater in magnitude and duration in the behavioral conditions (Periodicity Detection and Localization, Fig. 6E and F), whereas the selectivity for the frontal locations remained.

The task dependence of PAF unit responses was similar to that observed in DZ. Figure 6G–I and J–L illustrate PSTHs across three conditions for two PAF units that exhibited location-specific long-latency responses. The PAF unit in Figure 6G, in the Idle condition, exhibited a brisk onset response showing little spatial sensitivity followed by a long-latency response favoring sounds around the front and rear midline. In the Periodicity Detection condition (Fig. 6H), the spatial tuning of the onset response sharpened to exclude some ipsilateral locations, and the midline-tuned long-latency response increased in magnitude. That spatial sensitivity was maintained in the Localization condition (Fig. 6I), with some reduction in onset response magnitude. The unit in Figure 6J–L maintained selectivity for contralateral sounds, with an increase in the long-latency component of the response in the on-task conditions.

We quantified the task dependence of spatial sensitivity by computing the locations of best-area centroids and the widths of ERRFs in Idle and Localization conditions. In all three cortical fields that were studied, it most often was the onset response that showed the greatest change in spatial selectivity between behavioral conditions. Among just the units that had complex (i.e., onset and long latency) PSTHs, a two-way ANOVA showed a robust main effect of behavioral task on ERRF width of onset responses (F (2,475) = 8.14, p = 0.00033), but no significant main effect on ERRF widths of long-latency responses (F (2,475) = 0.95, p = 0.39). For that reason, and for the reason that many units had onset responses but no long-latency responses, we compared the task dependence of centroid locations and ERRF widths based just on spike counts within the first 40 ms of the stimulus onset; we excluded from this analysis one DZ unit and seven PAF units that had robust long-latency responses but inconsistent onset responses. Best-area centroids showed no systematic changes in location across behavioral conditions. That is demonstrated in Figure 7 by the tendency of data to cluster around the positive diagonals in plots of best-area centroid locations in Idle and Localization conditions. There were no significant task-dependent changes in the overall distributions of centroids in any of the three cortical fields studied (two-sample Kolmogorov–Smirnov test, p = 0.30–0.84, K = 0.11–0.14, depending on field).

Fig. 7

Comparison of locations of best-area centroids in Idle versus Localization conditions. Centroids were computed from onset spike counts, falling between 10 and 40 ms after stimulus onset. Vertical and horizontal lines indicate loci of centroids within 45 ° of the frontal midline. Panels indicate data from field A1, DZ, or PAF, as indicated. NC indicates units for which no centroid was computed because spike rates showed less than 50 % modulation by stimulus location.

In contrast to the absence of task dependence of best-area centroids, we observed a quantitative task dependence of the breadth of spatial sensitivity. The distributions of ERRF widths across behavioral conditions are shown in Figure 8. In each of the three fields, the distribution of onset ERRF widths showed significant sharpening when the animal was engaged in the behavioral tasks (Periodicity Detection and Localization) compared to the Idle condition (ANOVA across Idle, Periodicity Detection, and Localization conditions for A1: F (2,61) = 9.11, p = 0.00021; DZ: F (2,96) = 9.4, p = 0.00012; PAF: F (2,161) = 16.9, p < 0.00001). Figure 8 indicates the p values for pairwise comparisons after adjustment for multiple comparisons (Tukey’s least significant difference procedure). Mean ERRF widths were 186.2, 174.4, and 168.9 ° for Idle, Periodicity Detection, and Localization conditions, respectively, in A1; 173.4, 157.2, and 161.7 ° for those conditions in DZ; and 191.1, 179.7, and 176.2 ° for those conditions in PAF. Overall, widths were significantly narrower in both on-task conditions than in the Idle condition in all three fields. Differences in ERRF widths between Periodicity Detection and Localization conditions were not significant in any of the three cortical fields (p > 0.05).

Fig. 8

Distributions of onset ERRF widths in Idle, Periodicity Detection, and Localization conditions, as indicated. In the box representing each condition, horizontal lines represent 25th, 50th, and 75th percentiles, and symbols outside the boxes represent data outside the middle two quartiles. Each panel represents data from A1, DZ, or PAF, as indicated.

There was considerable variation among units in the task dependence of their spatial sensitivity. We wished to estimate the percentages of units that showed statistically significant task-dependent sharpening or broadening of their ERRFs. For that reason, we utilized a bootstrap procedure to evaluate the trial-by-trial variation in rate–azimuth functions of individual units in the three cortical fields and thereby to test the significance of sharpening or broadening of the mean ERRF widths. In all three fields, nearly half of the units showed significant changes in ERRF widths, with more units sharpening than broadening their ERRFs in on-task (Periodicity Detection and Localization) compared to Idle conditions (Fig. 9). The strongest contrast was between Idle and Localization: 44 %, 35 %, and 31 % of units in A1, DZ, and PAF, respectively, showed significant sharpening of ERRFs compared to 5 %, 11 %, and 9 % that showed significant broadening. In a comparison between the Periodicity Detection and Localization conditions, only A1 and PAF showed more units that significantly sharpened than broadened their spatial sensitivity (i.e., middle bars in each panel; significantly sharpened: A1, 24 %; PAF, 22 %; significantly broadened: A1, 10 %; PAF, 14 %).
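One way such a bootstrap test could be organized is sketched below. This is not the procedure used in the study, whose details are not given in this section; it is a generic percentile bootstrap that resamples trials with replacement at each azimuth, recomputes the ERRF width (assumed here to be area divided by peak), and asks whether the confidence interval of the width change between two conditions excludes zero. All names, parameter values, and spike counts are hypothetical.

```python
import random


def errf_width(rates, bin_width_deg=20.0):
    """Assumed ERRF width: area under the rate-azimuth function / peak."""
    peak = max(rates)
    if peak <= 0:
        raise ValueError("resampled unit has no driven response")
    return sum(rates) * bin_width_deg / peak


def resampled_width(trials_by_azimuth, rng):
    """ERRF width from one bootstrap resample of trials at each azimuth."""
    rates = []
    for counts in trials_by_azimuth:
        draw = [rng.choice(counts) for _ in counts]  # sample with replacement
        rates.append(sum(draw) / len(draw))
    return errf_width(rates)


def width_change_interval(idle, task, n_boot=1000, alpha=0.05, seed=0):
    """Percentile interval for (task width - idle width), in degrees."""
    rng = random.Random(seed)
    diffs = sorted(
        resampled_width(task, rng) - resampled_width(idle, rng)
        for _ in range(n_boot)
    )
    lower = diffs[int(n_boot * alpha / 2)]
    upper = diffs[int(n_boot * (1 - alpha / 2)) - 1]
    return lower, upper


# Hypothetical spike counts: 18 azimuths, 4 trials each
idle_trials = [[4, 5, 6, 5] for _ in range(18)]                        # broad
task_trials = [[4, 5, 6, 5]] * 2 + [[0, 1, 0, 0] for _ in range(16)]   # sharp

lower, upper = width_change_interval(idle_trials, task_trials, n_boot=200)
print(lower, upper)
```

In this sketch, an interval lying entirely below zero would count as significant sharpening and one lying entirely above zero as significant broadening.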

Fig. 9

Percentage of units that showed significant sharpening (light bar above the origin) or broadening (dark bar below) of spatial tuning between pairs of behavioral conditions, as indicated. Each panel represents data from field A1, DZ, or PAF, as indicated.

Task-dependent suppression and facilitation

Our previous study (Lee and Middlebrooks, 2011) demonstrated that the narrowing of onset ERRF in A1 during behavioral conditions resulted from the increased suppression of responses to the least-preferred stimuli (i.e., stimulus locations eliciting the lowest spike counts) rather than from an enhancement of responses to the preferred stimuli (stimulus locations eliciting the highest spike counts). The task dependence observed in the present recordings from DZ and PAF, in contrast, exhibited both suppressive and facilitatory influences on spatial selectivity. In Figure 10, we compare onset spike rates between Idle and Localization conditions for the three cortical fields and for stimuli at preferred and least-preferred locations; data are restricted to the subset of neurons that showed significant task-dependent narrowing of their onset ERRF widths. As in our previous observations in field A1, onset responses of DZ and PAF neurons to preferred stimuli showed no significant difference between Idle and Localization conditions (Fig. 10A–C; Wilcoxon paired signed-rank test, z = 0.94, 0.06, and 0.81, p = 0.35, 0.95, and 0.42, for fields A1, DZ, and PAF, respectively), whereas onset responses in all three fields to least-preferred stimuli were significantly suppressed (Fig. 10D–F, z = 4.1, 3.7, and 4.3, p = 0.000037, 0.00019, and 0.000021, for fields A1, DZ, and PAF, respectively).

Fig. 10

A–C Onset spike rates elicited by stimuli at preferred locations in Localization (vertical axis) versus Idle (horizontal axis) conditions. Symbols represent only the units that were excited by sounds and that showed significant task-dependent sharpening of ERRFs, as counted in Figure 9. Panels represent data from A1, DZ, or PAF, as indicated. D–F Same as A–C, for spike counts elicited by stimuli at least-preferred locations. p values are from a Wilcoxon paired signed-rank test.

About 30 % of the DZ and the PAF units had reliable long-latency responses. For those units, the long-latency activity usually was more spatially sensitive than the onset response. The spatial selectivity of the long-latency responses for these units usually did not change across conditions. The magnitudes of the long-latency responses, however, tended to increase during on-task conditions. Spike rates in PAF averaged 39.9 % higher in Localization compared to Idle conditions (Fig. 11B; Wilcoxon paired signed-rank test, z = 4.2, p = 0.000030); there was only a mean 9 %, nonsignificant, increase in long-latency firing in DZ (Fig. 11A; z = 1.5, p = 0.13).

Fig. 11

Long-latency spike rates elicited by stimuli at preferred locations in Localization (vertical axis) versus Idle (horizontal axis) conditions. Symbols represent all the DZ and PAF units that had reliable long-latency responses. p values are from a Wilcoxon paired signed-rank test.

First-spike latencies

In a previous study using anesthetized conditions, we found that mean first-spike latencies computed across all stimulus locations were considerably longer in PAF and DZ than in A1; median values of the distributions were 28.8, 22.0, and 17.6 ms in PAF, DZ, and A1, respectively (Stecker et al., 2003; 2005b). In the present study, in awake Idle conditions, computed across all stimulus locations, first-spike latencies in DZ and PAF again were significantly longer than in A1 (p < 0.001, Wilcoxon rank-sum test). Compared to the anesthetized condition, however, A1 and DZ latencies were longer (DZ: 27.9 ms; A1: 23.7 ms), and latencies in PAF were slightly shorter (28.4 ms) in awake conditions. As a result, the contrast in latencies among A1, DZ, and PAF was smaller than observed under anesthetized conditions. The contrast in latencies among A1, DZ, and PAF was even smaller when comparing just the first-spike latencies at preferred locations: in the Idle condition, such latencies were 18.8, 20.6, and 21.1 ms, a difference of only 2.3 ms between A1 and PAF.

First-spike latencies in all three fields varied markedly as a function of stimulus location, generally from short to long at stimulus locations eliciting high to low spike counts, respectively. The across-location ranges of median first-spike latencies all were broader in the awake Idle condition (A1: 15.1 ms; DZ: 15.3 ms; PAF: 16.0 ms) than in the anesthetized condition (A1: 3.1 ms; DZ: 8.4 ms; PAF: 10.6 ms; Stecker et al., 2003; 2005b). The across-location ranges of latency were more similar among fields in the awake condition (all 15.1 to 16.0 ms) compared to the anesthetized condition in which A1 units showed much narrower ranges of latency (3.1 ms) than did units in DZ and PAF (8.4 and 10.6 ms).

Median first-spike latencies of units in all three fields were significantly longer for the preferred stimuli during the behavioral conditions (Kruskal–Wallis, A1: χ2 (2,183) = 7.05, p = 0.029; DZ: χ2 (2,290) = 12.6, p = 0.0019; PAF: χ2 (2,484) = 9.71, p = 0.0078). Median first-spike latencies for preferred locations in each field for Idle, Periodicity Detection, and Localization conditions, respectively, averaged: A1: 18.8, 20.6, and 20.1 ms (Fig. 12A); DZ: 20.6, 21.8, and 22.1 ms (Fig. 12B); and PAF: 21.1, 22.3, and 22.4 ms (Fig. 12C). In each field, latencies for either on-task condition were significantly longer than for the off-task condition (pairwise comparison with least-significant difference adjustment: p < 0.05 in A1, p < 0.005 in DZ, and p < 0.01 in PAF), but there was no difference in latencies between the two on-task conditions (p > 0.05 for each field).

Fig. 12

Distributions of median values of first-spike latencies computed at the preferred locations for units in fields A1, DZ, and PAF, as indicated. Colors indicate Idle (black), Periodicity Detection (blue), and Localization (red) conditions. Symbols indicate mean values for each field.

Location coding by individual units and ensembles of units

Fields DZ and PAF differed markedly in the distributions of the best-area centroids of units, whereas both DZ and PAF showed some sharpening of spatial tuning when the animals were engaged in auditory tasks. Those observations led to the hypotheses that DZ and PAF differ in the regions of space in which units signal sound-source locations most accurately and that location signaling is more accurate when animals are engaged in a listening task. We developed a simple classification procedure to assess the accuracy with which DZ and PAF units could signal sound-source locations (as described in “Materials and methods”). In both DZ and PAF, the sharpest spatial tuning was exhibited by units showing complex temporal response patterns. For that reason, we restricted analysis of localization accuracy to units that showed reliable spike activity during the time window 40–80 ms after stimulus onset during Localization tasks. Also, because of the great range of spatial tuning widths among units, we restricted analysis to units with ERRF widths narrower than the median computed across all units in each field and to units for which data were collected for greater than or equal to eight trials at each azimuth in all three behavioral conditions in the same recording session; that amounted to 24 units in DZ and 47 units in PAF.

Mean errors of location estimates by individual units and by randomly drawn ensembles of four or 16 units are shown in Figure 13. Averaged across all stimulus locations in the Idle condition, mean errors by individual units in both cortical fields were around 80 °, which is little better than the 90 ° mean error expected for random chance performance. Errors decreased, however, when information was combined across units. In DZ, across locations, errors averaged 79.6 ° for individual units (Fig. 13A), 62.8 ° for ensembles of four units (Fig. 13B), and 41.0 ° for ensembles of 16 units (Fig. 13C). Mean errors were somewhat larger in PAF, averaging 80.7 ° for individual units (Fig. 13D; not significantly different from DZ; ANOVA; F (1,1134) = 1.35, p = 0.25), 69.7 ° for ensembles of four (Fig. 13E, larger than DZ; F (1,3198) = 131.9, p < 0.00001), and 51.3 ° for ensembles of 16 (Fig. 13F, larger than DZ; F (1,3198) = 415.8, p < 0.00001).
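The 90 ° chance level can be checked with a short calculation. The sketch assumes 18 location bins spaced 20 ° apart around the full circle (an assumption based on the 20 ° resolution and the ±170 ° bins mentioned in the text) and a guesser that picks each bin with equal probability; the function names are hypothetical.

```python
def angular_error(target_deg, estimate_deg):
    """Unsigned angular distance between two azimuths, wrapped to 0-180 deg."""
    difference = abs(target_deg - estimate_deg) % 360
    return min(difference, 360 - difference)


def chance_mean_error(locations):
    """Mean error when every location is guessed equally often for every target."""
    n = len(locations)
    total = sum(angular_error(t, g) for t in locations for g in locations)
    return total / (n * n)


# Assumed bin centers: -170, -150, ..., 150, 170 degrees (18 bins)
locations = list(range(-170, 171, 20))

print(len(locations), chance_mean_error(locations))
```

For any even number of equally spaced locations around the circle, the possible errors are spread symmetrically between 0 ° and 180 °, so uniform guessing yields a mean error of 90 °.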

Fig. 13

Mean errors of location estimates based on spike patterns of DZ (A to C) and PAF (D to F) units, selected as described in the text. Line colors indicate behavioral conditions, as indicated. Panels indicate results from individual units (A, D) or randomly formed ensembles of four (B, E) or 16 (C, F) units. Error bars represent the standard error of the mean across estimations from 24 (A) or 47 (D) individual units or from 100 randomly selected ensembles (B, C, E, F).

Differences between cortical fields were more striking in regard to the patterns of errors as a function of stimulus location. Field DZ consistently showed the most accurate localization (i.e., the smallest errors) for stimuli located just contralateral to the frontal midline (Fig. 13A, B, and C). For ensembles of 16, the smallest mean errors were only 19.5 ° which, given the 20 ° resolution of our analysis, indicates that more than half of the location estimates fell within one bin of the correct location. The location dependence of errors in PAF was largely orthogonal to that in DZ. Localization by PAF units and ensembles showed the largest errors for stimuli near the front and rear midline, and localization was most accurate for far-lateral stimuli (Fig. 13D, E, and F). We tested for a similarity between fields DZ and PAF in the location dependence of mean errors by computing a correlation coefficient between the two vectors of mean error versus location. A coefficient near 1 would have indicated similar spatial dependence, and a coefficient near −1 would have indicated complementary (i.e., opposite) dependence. Instead, the coefficients were near zero, indicating that we could not reject the null hypothesis that location dependence of mean errors was orthogonal between DZ and PAF (individual units: r(1,14) = 0.048, p = 0.86; ensembles of four: r(1,14) = −0.14, p = 0.62; ensembles of 16: r(1,14) = −0.104, p = 0.70).

Localization errors decreased when animals were engaged in the behavioral tasks. In DZ, errors by ensembles of 16 units, averaged across all locations, decreased from the Idle condition (41.0 °) and the Periodicity Detection condition (41.3 °) to the Localization condition (38.9 °). A greater effect of task was demonstrated by a two-way ANOVA that also considered the effect of stimulus location (ensembles of 16: main effect of task, F (2,4782) = 29.4, p < 0.00001; pairwise comparison, Idle versus either on-task condition: p < 0.001, Periodicity Detection versus Localization, p > 0.05). No significant task dependence in DZ was observed for individual units (F (2,1134) = 0.36, p = 0.696) or for ensembles of four (F (2,4782) = 0.88, p = 0.41). The decrease in mean errors of location estimates was somewhat greater in PAF. Errors by ensembles of 16 units, averaged across all locations, decreased from the Idle condition (51.3 °) to Periodicity Detection (45.1 °) to Localization (42.8 °). In the two-way ANOVA, for ensembles of 16, all pairwise combinations of Idle, Periodicity Detection, and Localization were significant (p < 0.0001 after least-significant difference adjustment; main effect of task F (2, 4782) = 269.1, p < 0.00001). In PAF, individual units and ensembles of four units also showed significant task-dependent reductions in errors (individual units: F (2,2238) = 4.0, p = 0.018, Idle versus either on-task condition, p > 0.05; ensembles of four: F (2,4782) = 58.4, p < 0.00001, pairwise comparison of Idle versus either on-task condition: p < 0.0001, comparison of Periodicity Detection and Localization, p < 0.005). Ensembles of 16 PAF units showed the greatest task-dependent decreases in errors for locations around the frontal and rear midlines. Errors were smallest across all task conditions for far lateral stimuli, with errors around 26 ° for ensembles of 16.

Discussion

Spatial sensitivity varies with anesthetic and task conditions

Most neurons in the present awake conditions showed fairly broad spatial sensitivity, consistent with previous studies of cats, ferrets, and nonhuman primates in anesthetized and awake conditions (reviewed by King and Middlebrooks, 2011; also Zhou and Wang, 2012). Fewer no-centroid units were seen in awake animals, but most neurons responded to sounds throughout ≥180 ° of azimuth even when cats were on task. Consistent with essentially all previous reports, there was no indication of a point-to-point map of space in A1, DZ, or PAF.

Most spike patterns in anesthetized conditions are limited to a ~10–30-ms burst of spikes at stimulus onset. In contrast, many neurons in awake conditions showed tonic firing throughout the stimulus duration. The spatial sensitivity of long-latency components of complex spike patterns was as sharp as, or often sharper than, that of onset components, especially in DZ and PAF. That observation accords with results from awake marmosets showing that tonic responses show greater selectivity for preferred stimuli than do onset responses (Wang et al., 2005). Also, all three fields showed examples of stimulus-driven suppression of spontaneous activity or of robust offset responses.

Location preferences in awake cats differed substantially from those in anesthetized preparations. All three fields showed more units tuned to frontal and ipsilateral locations than those seen in α-chloralose-anesthetized preparations (Stecker et al., 2003; 2005b), which typically show a bias toward far-contralateral tuning. Contrary to the α-chloralose results, an early study of DZ in barbiturate-anesthetized cats showed the majority of units in DZ responding best (or only) to approximately equal sound levels at the two ears (Middlebrooks and Zook, 1983), which predicted near-midline spatial tuning. Barbiturate and α-chloralose anesthetics both potentiate inhibition, although apparently through different binding sites (e.g., Garrett and Gan, 1998). A difference in the neuronal loci of anesthetic effects on inhibition might account for the differences in azimuth tuning seen between barbiturate and α-chloralose anesthesia conditions.

The analysis of location signaling by neural ensembles (Fig. 13) demonstrated orthogonal patterns between DZ and PAF in the regions of greatest localization accuracy. Performance by DZ was best for near-midline locations, whereas PAF showed the greatest localization accuracy for lateral locations. Our analysis was limited by use of ensembles of no more than 16 units and by use of 20 ° spatial bins. Given those limitations, it is remarkable that the smallest errors in our estimates averaged within a factor of two of errors in cats’ behavioral localization judgments (e.g., May and Huang, 1996; Tollin et al., 2005). We assume that the superior accuracy observed in localization behavior reflects the activity of more than 16 neurons and probably involves a more optimal analysis of temporal firing patterns than the simple three-time-point pattern recognition scheme that we devised. Also, our analysis incorporated the responses of all the units with complex temporal responses that exhibited ERRFs narrower than the median across the entire sample in each field. The ‘lower envelope principle’ proposed by Barlow (1972; recently reviewed by Phillips et al., 2012) asserts that perceptual sensitivity corresponds to that of the most sensitive neurons. A stricter criterion for units to include in our analysis of localization accuracy might have yielded even greater accuracy, although the analysis might have suffered some loss of statistical power due to a limited sample size.

Spatial sensitivity of a substantial minority of neurons in fields A1, DZ, and PAF varied significantly with behavioral condition. Among those neurons, the most common effect was a sharpening of the spatial tuning of the onset response when the animal was on task, especially during the Localization task. Long-latency portions of responses typically showed little or no task-dependent sharpening. Long-latency responses, however, tended to show sharper spatial tuning in all conditions than did onset responses, and many units, particularly in PAF, showed increased long-latency spike rates in on-task conditions. For those reasons, long-latency responses contributed to an overall sharpening of spatial tuning in on-task conditions. That sharpening of tuning was accompanied by an increase in the accuracy with which neural populations could signal sound locations (e.g., Fig. 13).

First-spike latencies in all three fields were longer in on-task compared to off-task conditions, but there was no significant difference in first-spike latencies between the two task conditions. One might speculate that a greater latency effect would be observed under conditions of a more demanding task. That would be analogous to the observations from animal psychophysical studies showing longer reaction times for more demanding stimulus conditions (e.g., Stebbins, 1966; May et al., 2009).

We are aware of few studies that tested effects of task condition on stimulus specificity in the auditory cortex. One good example is the study by Fritz et al. (2003) in the ferret. In those experiments, the frequency tuning of neurons in response to probe sounds tended to shift in on-task conditions to favor responses to the frequency of a target tone. That study differed from ours in that a stimulus (frequency) had a particular target value and neural tuning shifted toward that value. In contrast, our task provided no particular target azimuth and the responses of a substantial minority of neurons demonstrated an overall sharpening of azimuth specificity in the on-task condition. Scott et al. (2007) demonstrated influences of behavior on cortical responses to a spatial cue, interaural phase difference, but about equal numbers of units exhibited increases or decreases in stimulus specificity in on-task compared to off-task conditions. Woods et al. (2006) recorded from auditory cortex in monkeys while the monkeys performed a task that required evaluation of sound locations. There was no comparison of on- versus off-task conditions in that report, however, and similar results were obtained from two trained monkeys compared to a third monkey that could not be trained to perform the task. Zhou and Wang (2012) measured auditory spatial sensitivity in awake, idle, marmosets. The breadth of sensitivity encountered in that study was comparable to that observed in our Idle conditions. That is, about half of the neurons responded with ≥62.5 % of maximal activity to about half of the tested locations, all of which were in the front half of space. The ERRFs of those neurons necessarily would have been broader than their “best areas,” which were computed using a 62.5 % of maximum rate criterion.
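The 62.5 %-of-maximum criterion mentioned above can be made concrete with a short sketch. It assumes that a unit's spatial tuning is summarized as one mean spike rate per tested location. The function name `fraction_above_criterion` and the example rates are hypothetical, and the sketch is not the procedure of Zhou and Wang (2012) in detail.

```python
def fraction_above_criterion(rates, criterion=0.625):
    """Fraction of tested locations whose rate is >= criterion * peak rate."""
    if not rates:
        raise ValueError("need at least one rate")
    peak = max(rates)
    if peak <= 0:
        raise ValueError("peak rate must be positive")
    threshold = criterion * peak
    return sum(1 for r in rates if r >= threshold) / len(rates)


# Hypothetical mean spike rates (spikes/s) at eight tested locations.
example_rates = [2.0, 5.0, 8.0, 10.0, 9.0, 6.0, 3.0, 1.0]
best_area_fraction = fraction_above_criterion(example_rates)
```

With these made-up rates the threshold is 6.25 spikes/s, so three of the eight locations meet the criterion. A lower criterion would admit more locations, which is why a receptive field defined more liberally would be broader than a best area computed from the same responses.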

Cortical representation of sound location

The present physiological results complement the behavioral work by Lomber and colleagues (Malhotra et al., 2004; 2008) that demonstrates characteristic localization deficits resulting from deactivation of particular cortical fields. These results, taken together, lead us to propose a model of cortical spatial representation in which fields A1, DZ, and PAF each make distinctive contributions to spatial hearing. Field A1 receives robust thalamic inputs, especially from the ventral division of the medial geniculate body, and sends feedforward connections to DZ, PAF, and other auditory cortices (e.g., Lee and Winer, 2008a,b). Studies of field A1 in behaving animals demonstrate task-dependent modulation of responses to sound location (Lee and Middlebrooks, 2011 and the present results), frequency (Fritz et al., 2003; 2005), and frequency sequences (Scheich et al., 2007). Restricted inactivation of A1 that spared DZ reduced the accuracy of localization of contralateral targets to around 45 % correct, compared to control levels of >90 % and to around chance levels of 16.7 % when both A1 and DZ were inactivated simultaneously. During restricted A1 inactivation, most errors were ≤30 ° in magnitude, generally restricted to the correct sound hemifield (Malhotra et al., 2008). These observations suggest that A1 is a multifunction auditory processor that adapts to task demands. Its role in spatial hearing might be primarily as input source and modulator for fields that are more specifically spatial.

Field DZ is distinguished by the large percentage of units tuned to near-midline locations and by the superior accuracy with which its spike patterns can identify near-midline targets. Deactivation of DZ results in fewer localization errors than are produced by A1 or PAF deactivation (performance during DZ inactivation was around 60 % correct), but those errors tend to be large, often ≥45 °, extending into the incorrect sound hemifield (Malhotra et al., 2008). One interpretation might be that DZ is responsible for localization of near-midline targets. Alternatively, it might be that the role of DZ in spatial hearing is not in localization per se. Rather, the tuning of DZ units for locations in the general area of gaze in front of the animal might help in isolating signals of interest from competing sounds.

Of the three fields studied here, PAF exhibits the most uniform distribution of best-area centroids throughout the contralateral hemifield, including front and rear locations. Unilateral deactivation of PAF results in profound contralateral localization deficits (Malhotra et al., 2004). These observations suggest that PAF might have a principal role in sound localization. A difficulty with that argument is that populations of PAF neurons show their greatest location signaling accuracy for lateral regions of space, whereas psychophysical localization accuracy in cats (May and Huang, 1996; Tollin et al., 2005) and humans (e.g., Makous and Middlebrooks, 1990; Carlile et al., 1997) is greatest around the frontal midline. It might be that PAF scans regions that are remote from frontal attention, cooperating with another cortical field, such as DZ, to provide high-acuity localization near the midline.

Several authors have proposed models in which the locations of sounds are represented by the opposing activity of populations of neurons having right- or left-hemifield spatial preferences (e.g., McAlpine and Grothe, 2003; Phillips, 2008; Salminen et al., 2009). We have argued that both right- and left-tuned neurons would need to reside in each cortical hemisphere to account for the contralesional deficits (and ipsilesional survival of function) that accompany unilateral cortical lesions or inactivation (Jenkins and Merzenich, 1984; Malhotra et al., 2004; Stecker et al., 2005a). The presence of substantial numbers of ipsilaterally (as well as contralaterally) tuned units in all three fields in awake cats supports such a model. Recently, the Phillips group (Dingle et al., 2010; 2012) has presented human psychophysical results consistent with the presence of a third, frontally tuned, spatial channel. The many frontally tuned units observed in the present study might constitute such a channel in cats.

The Nelken group has described a gradient of auditory spatial selectivity in the multisensory area AES in halothane-anesthetized cats (Las et al., 2008). Posterior AES (pAES) generally showed more units having frontal or ipsilateral spatial tuning than did anterior AES (aAES). Area AES receives a robust projection from DZ (Lee and Winer, 2008b), and it is tempting to think that frontally tuned neurons in pAES might inherit their tuning from the frontally tuned neurons in DZ. Area AES, as a whole, is distinguished from other auditory cortical areas by its multisensory responses (e.g., Wallace et al., 1992) and by its descending projections to the superior colliculus (Meredith and Clemo, 1989), which is important for reflexive movements to sounds. Inactivation of AES results in profound deficits in localization of contralateral sounds (Malhotra et al., 2004). One might think of DZ and pAES as having somewhat differing roles in frontal hearing, with DZ being more perceptual and pAES being more motor. At present, there is no evidence to draw analogies between aAES and PAF. Las et al. only tested sound locations in the frontal ±75 ° of azimuth, so one cannot say whether aAES shares with PAF the property of fairly uniform spatial representation throughout front and rear locations in the contralateral hemifield.

The Recanzone group has compared spatial sensitivity among cortical fields in awake monkeys (Woods et al., 2006; Miller and Recanzone, 2009). On average, neurons in caudal belt fields showed sharper spatial tuning than did neurons in core or rostral belt fields. In a measure of location signaling by spike counts, the Caudal Lateral (CL) field showed the smallest errors averaged across all locations, which those authors took as evidence for a more important role of CL in localization. We note that CL neurons in that report signaled lateral targets accurately but showed relatively poor accuracy around the midline. In contrast, neurons in the Rostral core (R) provided superior accuracy near the frontal midline, where psychophysical acuity is sharpest. It is difficult to draw homologies between specific cortical fields in primates and cats. Nevertheless, it is interesting to note the analogies in spatial sensitivity between monkey CL and cat PAF, which signal lateral locations accurately, and between monkey R and cat DZ, which show the greatest accuracy around the frontal midline. It remains for future research to determine whether DZ in the cat and/or R in the primate function jointly with PAF and/or CL as a system for sound localization or whether the frontal tuning of DZ and/or R serves to facilitate detection and recognition of sounds of interest within a complex auditory scene.