6.1 Introduction

Echolocating bats and dolphins project sounds into their surroundings and listen to the returning echoes to detect and identify objects. These animals must deal with a potentially wide range of acoustic interference that is dependent on the amount of clutter, or the distribution of extraneous objects, in the environment. For both dolphins and bats, the ability to detect and resolve targets of interest is due to the intricacies of the sound projection and echo reception systems in association with sophisticated neural processing. The present chapter offers an integrated view of selected research findings regarding the principal purpose of wideband biosonar: the localization and classification of targets based on accurate determination of the delay and spectrum of echoes.

6.1.1 Limitations on Comparisons Between Dolphins and Bats

It is difficult to articulate a single, comprehensive account, or even a reasonably specific unifying hypothesis, about how echolocation “works.” One reason is that different species of bats and odontocetes use a number of different systems. Few species of either bat or odontocete have been sufficiently studied to yield a realistic, mechanistic assessment of even one type of echolocation system. When different kinds of experiments are carried out by different investigators using different species, the inevitable consequence is a description of a fictional sonar system in a fictional, composite animal. Extracting broadly applicable principles from this diversity is not feasible until individual systems have been explored to unambiguously reveal their inner mechanisms and their relation to environmental demands and prior evolutionary pressures. Here, we highlight information derived from new research on echolocation by bottlenose dolphins (Tursiops truncatus) and big brown bats (Eptesicus fuscus). Both species transmit wideband biosonar sounds, with most of their energy in the range of 20 kHz to about 110–130 kHz. Studies of echolocation in bats have developed along somewhat different lines than research on cetaceans, largely because of practical considerations: the difficulty inherent in observing these animals using their sonar in natural conditions, plus the availability, suitability, and expense of equipment and facilities required for laboratory studies. Nevertheless, a relatively wide range of experiments has been conducted on both species to further the goal of understanding the mechanisms of echolocation.

6.2 Target Detection and the Operating Range of Echolocation in Relation to the Emission Patterns of Broadcast Signals

Identifying relations between the structure and pattern of biosonar emissions and performance in behavioral tests of target localization and perception is necessary to understand how echoes are processed in bats and dolphins. Figure 6.1 illustrates the spectrogram, spectrum, and autocorrelation function for a typical FM biosonar sound emitted by a big brown bat and an echolocation click emitted by a bottlenose dolphin. For big brown bats in most conditions, the duration of FM pulses ranges from a maximum of 20–25 ms when flying in open spaces to about 1–3 ms when flying in vegetation or clutter (Petrites et al. 2009; Aytekin et al. 2010; Hiryu et al. 2010; Moss and Surlykke 2010). When the bat is about to land on a surface or capture a flying insect, broadcast durations shorten further to 0.3–0.5 ms. The most salient characteristics of these sounds are their wide frequency band (extending from roughly 20 kHz at the low end to 100–110 kHz at the high end, to achieve a total bandwidth of 80–90 kHz), their downward FM sweeps, and their multiple harmonics (FM1, FM2, FM3). In contrast, bottlenose dolphins produce much shorter signals, referred to as clicks. Echolocation clicks have durations between 40 and 70 μs in laboratory conditions (Au et al. 1974; Capus et al. 2007) and 10–23 μs in free-ranging conditions (Akamatsu et al. 1998; Wahlberg et al. 2011), although differences in data collection and in the means of calculating durations may explain some of the observed differences between laboratory and free-ranging conditions. Dolphin echolocation clicks can have a wide frequency bandwidth (>85 kHz) with energy commonly between 25 and 130 kHz, but the dolphin can also control the spectral content of the click and capitalize on narrower band signals (Houser et al. 1999; Muller et al. 2008). Because of the transient nature of the dolphin echolocation signal, significantly less control of duration is observed relative to that of the big brown bat. Nevertheless, both big brown bats and dolphins change characteristics of their broadcasts in a multidimensional adaptation to the prevailing acoustic conditions.

Fig. 6.1

Three different signal representations for (a) big brown bat and (b) bottlenose dolphin wideband echolocation sounds. Plots show the spectrogram (time–frequency), spectrum (frequency), and autocorrelation (time) representations for a representative biosonar signal. The very compressed autocorrelation functions illustrate how sharply echoes originating from ensonification with these sounds can be localized in delay, or target range (Simmons and Stein 1980)

Bats and dolphins produce FM pulses or clicks, respectively, in temporal sequences that result in a stream of echoes returning to the animal’s ears (Fig. 6.2). These sequences of pulses or clicks, often called “trains,” are important because each successive echo received from a target increases the amount of information about the object available to the animal. In both species, the sensitivity or probability of detection gets worse as the number of received echoes decreases (Altes et al. 2003; Surlykke 2004). In the dolphin, each successive echo received reduces the signal-to-noise ratio (SNR) required at the detection threshold, suggesting that the dolphin potentially conforms to a summation or integration receiver model (Au 1993). In the bat, a very different pattern emerges: as the number of echoes increases from one to three, sensitivity remains poor, but when a minimum of four to seven echoes are available, sensitivity abruptly improves (Surlykke 2004).

Fig. 6.2

Natural acoustic behavior of a big brown bat during aerial interception. (a) Time series of biosonar broadcasts emitted by a big brown bat during an interception maneuver (Simmons 2005). Labels indicate the approximate segments of the search, approach, and terminal stages. Spectrograms of (b) shallow-sweeping “quasi-CF” search-stage sound, (c) first broad-sweeping FM sound following initial detection, (d) broad-sweeping FM sound emitted during active approach, (e) series of seven successive FM sounds emitted during terminal approach (“buzz”), and (f) sequence of sounds emitted during the transition into the terminal stage. Note the progression of shortened pulse durations and the decrease in interpulse intervals as the bat moves nearer to the insect

During flight or swimming, the bat or dolphin aims its head or steers the broadcast beam to ensonify the target of immediate interest. The receiving beams are such that the focus of reception is from echoes arriving from straight ahead. However, objects off to the sides still are ensonified relatively strongly and the echoes they return are picked up by the ears with considerable sensitivity (although the ear on the same side as one of these objects does receive a stronger version of the echoes). The raw transmitted and received beams thus do not directly segregate the target’s echoes from surrounding clutter.

The target strength is a representation of how strongly an object reflects sound. Because target strength is calculated as the logarithm of the ratio of the intensity of the reflected echo to the intensity of the incident sound, it is often presented in decibels (dB). For the bat, small objects such as flying insects have target strengths of roughly −10 to −40 dB depending on their size. In contrast, objects comprising clutter, such as leaves, have target strengths that depend more on their perpendicular orientation to the incident sound path than size; they reflect specular echoes with little intrinsic target-related attenuation other than that due to distance or direction because their dimensions are so much larger than the incident wavelengths. Thus, when bats fly through vegetation or close to the ground, they receive numerous, very strong echoes from the parts of the scene located at close range, even from parts that are off the beam axis because the transmitted beam is quite broad. Moreover, as bats move relative to their surroundings (typically at velocities of several meters per second), most of the clutter fluctuates in strength because different elements of the scene shift in and out of the necessary perpendicular, specular orientation. Surprisingly, more distant parts of the scene still can return strong echoes because the array of reflecting elements acts as an extended surface—at greater distances more and more reflections are recruited into the incident beam and add together to create an overall backscatter that declines only very gradually with range.
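The decibel definition above can be written out as a short calculation. The sketch below is illustrative only: the function names are hypothetical, and it assumes the conventional factor of 10 for intensity ratios.

```python
import math

def target_strength_db(reflected_intensity, incident_intensity):
    """Target strength in dB: 10 * log10(reflected / incident intensity)."""
    if reflected_intensity <= 0 or incident_intensity <= 0:
        raise ValueError("intensities must be positive")
    return 10.0 * math.log10(reflected_intensity / incident_intensity)

def intensity_ratio(target_strength):
    """Inverse mapping: dB value back to a reflected/incident intensity ratio."""
    return 10.0 ** (target_strength / 10.0)

# An insect at the weak end of the range quoted in the text (-40 dB)
# returns one ten-thousandth of the incident intensity.
weak_insect_ratio = intensity_ratio(-40.0)
```

On this scale, the −10 to −40 dB span quoted for insects corresponds to a thousandfold range in reflected intensity.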

Bottlenose dolphins feed on prey much larger than those of bats, with dimensions often larger than the incident wavelength, and with a more complicated echo structure that fluctuates depending on aspect angle for tonal signals (Au et al. 2007). Despite these challenges, dolphins often forage in highly reverberant shallow waters and locate prey hiding among sea grass and beneath sand or mud. The broadband structure of the bottlenose dolphin click reduces the inherent fluctuations in echo target strength (based on energy), allowing the dolphin to maintain a relatively consistent echo target strength regardless of angle (Au et al. 2007). This allows for high levels of accuracy, with dolphins detecting targets with echoes less than 1 dB greater than the background clutter echoes (Au and Turl 1983).

Both big brown bats and bottlenose dolphins change the interval between sonar broadcasts (interpulse intervals [IPI for bats] or interclick interval [ICI for dolphins]) in response to prevailing acoustic conditions and variation in target distance. The echo stream is the series of echoes that returns to a bat or dolphin after they have ensonified the environment. The IPI or ICI defines the portion of the echo stream that returns to the bat or dolphin before the production of another pulse or click. The duration of the echo stream (ESD) following the production of a pulse or click is critical to resolving target echoes from clutter both for the bat and the dolphin. If the bat or dolphin emits a second sound before all of the first sound’s echoes have arrived, uncertainty arises about which broadcast is responsible for which echoes. This pulse-echo ambiguity occurs whenever successive ESDs overlap.
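The overlap condition for pulse-echo ambiguity can be sketched as a simple check on emission times. This is a minimal illustration, assuming a single fixed ESD for every broadcast; the function name and the example times are hypothetical.

```python
def ambiguous_pairs(emission_times_ms, esd_ms):
    """Return index pairs (i, i + 1) where broadcast i + 1 is emitted
    before the echo stream of broadcast i has finished arriving."""
    pairs = []
    for i in range(len(emission_times_ms) - 1):
        interval = emission_times_ms[i + 1] - emission_times_ms[i]  # IPI or ICI
        if interval < esd_ms:
            pairs.append((i, i + 1))
    return pairs

# Hypothetical example: echoes keep arriving for 60 ms after each broadcast.
clean = ambiguous_pairs([0.0, 100.0, 200.0], esd_ms=60.0)      # no overlap
overlapping = ambiguous_pairs([0.0, 30.0, 130.0], esd_ms=60.0)  # first pair overlaps
```

In a real scene the ESD would differ between broadcasts as the animal moves, so a fuller treatment would pass one ESD per emission.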

Echoes from targets at different distances return at different delays (5.8 ms/m in air and 1.3 ms/m in sea water), creating an equivalence between objects arrayed in space and echoes arrayed in time. The outer limit for this equivalence, the operating range of biosonar, is determined by propagation losses, the time delays themselves, and the degree of sensitivity of the hearing apparatus. For a point target, such as an insect, the spreading loss at all frequencies, combined across the outward-bound and inward-bound paths, varies with range as 1/d², where d is the distance. For a planar target, such as leaf clutter, it varies as 1/d. As distance increases, echo strength declines according to these terms, but added to geometric spreading is absorption by the medium through which the sound travels, which is frequency dependent. Absorption discriminates sharply against higher frequencies in both air and water, but in water it is much smaller relative to the spreading loss. For higher-frequency in-air broadcasts, detection ranges will be shorter than 5–10 m even in the best conditions of very intense broadcasts and large reflecting surfaces (Fig. 6.3).
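The delay and loss terms in this paragraph can be combined in a rough calculation. The sketch below makes several assumptions that are not in the text: nominal sound speeds of 343 m/s (air) and 1,500 m/s (sea water), the 1/d² and 1/d terms read as pressure-amplitude ratios (40 and 20 dB per tenfold increase in distance), and a single absorption coefficient in dB/m supplied by the caller and applied over the two-way path beyond the reference distance.

```python
import math

SOUND_SPEED_AIR_M_S = 343.0        # assumed nominal value
SOUND_SPEED_SEAWATER_M_S = 1500.0  # assumed nominal value

def delay_ms_per_meter(sound_speed_m_s):
    """Two-way echo delay per meter of target range, in milliseconds."""
    return 2.0 / sound_speed_m_s * 1000.0

def echo_attenuation_db(distance_m, ref_distance_m, absorption_db_per_m, planar=False):
    """Echo level relative to the broadcast measured at ref_distance_m.

    Point target: amplitude falls as 1/d^2 (40 * log10).
    Planar target: amplitude falls as 1/d (20 * log10).
    """
    slope = 20.0 if planar else 40.0
    spreading = slope * math.log10(distance_m / ref_distance_m)
    absorption = 2.0 * absorption_db_per_m * (distance_m - ref_distance_m)
    return -(spreading + absorption)

air_delay = delay_ms_per_meter(SOUND_SPEED_AIR_M_S)         # close to 5.8 ms/m
water_delay = delay_ms_per_meter(SOUND_SPEED_SEAWATER_M_S)  # close to 1.3 ms/m
```

With absorption set to zero, a point target at ten times the reference distance is 40 dB down and a planar target is 20 dB down. That difference is why large surfaces remain detectable much farther away than insects.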

Fig. 6.3

Operating range of echolocation for bats (top) and dolphins (bottom). Solid black curves trace the reduction in strength of echoes returned by an ideal point target (i.e., 0 dB target strength) assuming spherical spreading losses separately for the broadcast and the echo (accumulating as 1/d²), plus absorption at selected frequencies. Solid gray curves trace the reduction in strength incurred by echoes returned by an ideal planar target (i.e., 0 dB target strength) assuming spherical spreading losses together for the broadcast and the echo (accumulating as 1/d overall), plus atmospheric absorption at selected frequencies. Top (bats): Data on echo strength at different distances are presented as echo attenuation relative to the broadcast (0 dB) recorded at a distance 10 cm from the bat’s open mouth. The gray curves give the likely maximum operating range for the strongest possible reflecting source, while the black curves give the likely operating range for an insect-like small reflector. Upper left: Gray circles show target strengths for 4.8 and 19.1 mm diameter spheres (Simmons and Chen 1989). Black circles and vertical lines show the spread of target strengths for insects having wing lengths of 2.6–3.1 mm, 5–5 mm, and 8–9 mm (Houston et al. 2004). Lower right: Gray circles show measured big brown bat distances of detection for spheres (Kick 1982). Horizontal black bar shows approximate spread of big brown bat distances of detection for different-sized insects. Bottom (dolphins): Data on echo strength are presented as echo attenuation relative to the broadcast (0 dB) recorded at 1.0 m from the dolphin’s rostrum. Upper left: Gray circles show target strengths for 2.54 and 7.62 cm spheres (Au and Snyder 1980; Murchison 1980). Black circles show the spread of target strengths for fish of lengths 20–26 cm (Au et al. 2007). Lower right: Gray circles show the measured bottlenose dolphin distances of detection for spheres (Au and Snyder 1980; Murchison 1980)

Bats progress through three stages during foraging: the search stage; the approach stage, where the bat has found and approaches its prey; and the terminal stage, which is the short period starting just before prey capture that is typified by increased rates of pulse production (i.e., “buzzing”). In the big brown bat, the IPIs recorded in the field seem surprisingly long (90–180 ms; Fig. 6.4) during the search stage (before about 1,000 ms relative to the onset of the terminal stage buzz; Moss and Surlykke 2010). If the bat is waiting for reception of all audible echoes from various parts of the whole scene (e.g., trees in the distance or the ground) before emitting the next sound, the IPIs indicate that large sources of reflections that define the boundaries of the space must be detectable at long ranges of 15–30 m (see Fig. 6.3). For big brown bats using frequencies of 25–35 kHz in FM1 of shallow-sweeping sounds or the tail-end frequencies of broader-sweeping sounds, echoes of 0–10 dB SPL from point targets are detectable at distances of 3–5 m for small spheres (Kick 1982). The corresponding echo delays are about 18–30 ms. At these same broadcast levels, planar targets could plausibly be detected at distances of 20–25 m. The corresponding echo delays are 120–150 ms. The long IPIs used in open-area searches thus appear designed to accommodate the return of echoes from large background surfaces at long ranges (Schnitzler et al. 2003; Moss and Surlykke 2010).
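The link between search-stage IPIs and the 15–30 m range can be checked with a one-line conversion. The sketch below assumes a nominal sound speed in air of 343 m/s; the function name is hypothetical.

```python
def max_unambiguous_range_m(ipi_ms, sound_speed_m_s=343.0):
    """Farthest range whose echo returns before the next broadcast is emitted."""
    return sound_speed_m_s * (ipi_ms / 1000.0) / 2.0

# Search-stage IPIs quoted in the text: 90-180 ms.
short_ipi_range = max_unambiguous_range_m(90.0)   # a little over 15 m
long_ipi_range = max_unambiguous_range_m(180.0)   # a little under 31 m
```

Both values fall on the 15–30 m span given in the text for large background reflectors. This is consistent with the interpretation that search-stage IPIs leave room for echoes from distant surfaces.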

Fig. 6.4

Interpulse intervals (IPIs) for big brown bats during open-space interceptions. Data points show IPIs for biosonar sounds emitted during 19 separate aerial captures of June beetles (see Fig. 6.2 and Simmons 2005). The data are plotted with respect to the transition into the terminal stage, indicated by the first IPI consistently shorter than 20 ms (0 origin of horizontal axis)

To avoid pulse-echo ambiguity in open-area searches, big brown bats wait until all echoes have returned before emitting the next sound. However, when flying in dense clutter where the demands of obstacle avoidance become paramount, big brown bats shorten their IPIs by emitting sounds in strobe-group pairs or triplets (two or three sounds emitted close together followed by a longer interval between pairs) so that streams of echoes from successive sounds overlap and pulse-echo ambiguity does occur (Petrites et al. 2009). In this difficult situation, the bats make subtle changes in the frequencies of their broadcasts—alternately increasing and decreasing the tail-end frequencies of the sounds comprising the strobe groups to allow their echoes to be distinguished (Hiryu et al. 2010).

Another interesting feature of the IPIs shown in Fig. 6.4 is that the bats make an unexpectedly abrupt transition from the approach stage to the terminal buzz. In the field recordings, the bats appear to jump from emitting sounds at intervals mostly longer than 25–30 ms to mostly shorter than 15 ms in the terminal buzz, with practically no IPIs in the intervening region. This unexpected gap in the IPI distributions suggests that, even when the bat is actively engaged in tracking the target, it keeps emitting sounds at long enough intervals to preserve some space for echoes that arrive from background objects farther away than the insect. Broadcast durations, however, do continue to track the steadily declining distance to the insect, confirming that a true, acoustically specified approach stage is in progress. Laboratory experiments have confirmed that big brown bats distinguish between regulation of broadcast duration by the distance to targets of interest and regulation of IPIs by the larger scale of the surrounding space (Saillant et al. 2007; Petrites et al. 2009; Aytekin et al. 2010).

Bottlenose dolphins in both field and laboratory settings produce clicks that typically vary according to target range and with ICIs long enough to prevent overlap of successive ESDs (Au et al. 1974; Jensen et al. 2009). Work by Penner (1988) demonstrated that the generation of ICIs was conscious and that dolphins produce ICIs consistent with the expectation of target location, that is, randomization of target distances showed an influence of prior target location on ICIs and an increase in detection error rates relative to targets presented at a constant distance. For bottlenose dolphins producing clicks with a center frequency of 75 kHz, echoes from point targets with target strengths of –28 to –41 dB are detectable at distances of 74–113 m for small spheres (Au and Snyder 1980; Murchison 1980). The corresponding echo delays are 96–147 ms, and fall within the maximum ICI value of 462 ms recorded in the field (Jensen et al. 2009). At these same broadcast levels, planar targets could plausibly be detected at distances of 1,147–1,260 m with corresponding echo delays of 1,491–1,638 ms. Unlike the field recordings of foraging bats, most field recordings of bottlenose dolphins are obtained by encouraging the animal to focus its attention on an array of hydrophones that is hanging perpendicular to the ocean floor. Further, on-axis clicks are typically recorded when the animal is oriented perpendicular to the array and thus parallel to the ocean floor. In such a configuration, the dolphin is essentially swimming in a clutter-free zone and would have no need to produce echolocation signals with ICIs long enough to process echoes from large, distant surfaces.
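The delay figures in this paragraph follow from the 1.3 ms/m conversion for sea water given earlier. The sketch below reproduces them and compares each with the maximum field ICI of 462 ms; the function names are hypothetical.

```python
DELAY_MS_PER_M_SEAWATER = 1.3  # two-way delay per meter, as quoted in the text
MAX_FIELD_ICI_MS = 462.0       # maximum ICI reported by Jensen et al. (2009)

def echo_delay_ms(range_m, ms_per_m=DELAY_MS_PER_M_SEAWATER):
    """Two-way echo delay for a target at range_m."""
    return range_m * ms_per_m

def fits_within_ici(range_m, ici_ms=MAX_FIELD_ICI_MS):
    """True if the echo from range_m returns before the next click."""
    return echo_delay_ms(range_m) <= ici_ms

sphere_delays = [round(echo_delay_ms(r)) for r in (74, 113)]      # point targets
planar_delays = [round(echo_delay_ms(r)) for r in (1147, 1260)]   # planar targets
```

The sphere delays fit inside the longest recorded ICI, whereas the hypothetical planar-target delays exceed it several times over. This matches the argument that the recorded dolphins were not waiting for echoes from distant surfaces.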

Dolphins, like the big brown bat, significantly decrease the ICI in the final approaches to a target or prey item; that is, they also demonstrate a terminal buzz. However, more interesting may be the behavior that occurs when targets are at long range. At target distances greater than approximately 100–200 m, dolphins have been observed to produce packets of clicks and have been shown capable of detecting targets and changes in target echoes at distances of up to 800 m, provided the echo level was sufficiently high for detection (Ivanov 2004; Finneran 2013). Because the number of clicks per packet (approximately 4–10) does not significantly vary according to changes in target strength or echo level, the production of packets appears to be related mostly to the target distance (Finneran 2013). The production of packets should result in pulse-echo ambiguity, but the duration between packets follows the same pattern as that of individual clicks at shorter ranges; namely, the next packet is not produced until the echoes from all clicks within the previous packet have been received by the dolphin. Thus, the duration between packets is likely used to resolve the pulse-echo ambiguity. Why packets are produced is unknown, although their occurrence may signify a limitation in the delay that can occur between echoes if the dolphins indeed utilize multi-echo processing (Au et al. 1988; Altes et al. 2003).

6.3 Perception of Target Range from Echo Delay

Having developed a means for estimating the overall operating range of echolocation, it is of interest to determine how effectively bats and dolphins can locate targets along this range dimension. For measuring the delay of echoes, a sonar system uses the time of the broadcast as a reference, or a trigger signal, and reception of an echo culminates in registration of the elapsed time since the trigger occurred. Assuming that the receiver has arbitrarily precise knowledge of this reference time, the accuracy of delay determination depends on the nature of the broadcast signal, in particular, its bandwidth. The availability of more frequencies equals sharper determination of delay, which means that wideband sonar signals are especially suited to precisely determining target range from echo delay. Indeed, it has been suggested that there is no purpose for bats to emit wideband FM sonar sounds unless they exploit the bandwidth by internally dechirping the echoes—removing the frequency modulation to minimize the duration of the sound, which then maximizes the accuracy of echo delay estimates (Glaser 1974).

The theoretical influence of the broadcast signal’s composition on the accuracy of target ranging is portrayed by the cross correlation function between echoes and broadcasts. (The autocorrelation function in Fig. 6.1 is equivalent to the cross correlation function when the echo is a delayed, attenuated replica of the broadcast, as is the case for the reflection from a point target, which provides one reflective surface, or glint, at close range. The shape of this function displays the intrinsic timing accuracy of the signal.) The example of a big brown bat sound in Fig. 6.1 has a broad bandwidth (approximately 80 kHz), which gives it a tightly compressed autocorrelation function, with little spread in time around its central peak (i.e., few side peaks and those of lower amplitude than the central peak). The total time span, including the main peak and the most prominent side peaks, is about 150 μs, which corresponds to about 2–3 cm in target range. The central peak alone is 7–8 μs wide, corresponding to slightly more than 1 mm in target range, and its very tip is even narrower, 1 μs or less, corresponding to a fraction of a millimeter. Autocorrelation functions of bottlenose dolphin signals have characteristics similar to those of bats (Fig. 6.1). The dolphin’s broad bandwidth signal (approximately 85 kHz) produces even tighter autocorrelation functions of total duration around 100 μs, which corresponds to 3–4 cm in target range. The central peak is even narrower, around 6–7 μs, corresponding to slightly more than 2 mm in target range. To make a crude but useful first approximation, the core question about how echolocation works revolves around which of these time scales embodied in the cross correlation function best describes the animal’s acuity for perceiving echo delay.
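The relation between bandwidth and the width of the autocorrelation function can be illustrated numerically. The sketch below is a simplified stand-in, not a model of a real bat sound. It assumes a single-harmonic linear FM sweep with a flat envelope, compares a hypothetical wideband sweep (100 to 20 kHz) with a narrowband one (40 to 30 kHz), and measures how far from zero lag the autocorrelation magnitude still reaches half of its peak. All names and parameter values are illustrative.

```python
import math

def linear_fm_sweep(f_start_hz, f_end_hz, duration_s, sample_rate_hz):
    """Real-valued linear FM sweep, unit amplitude, no harmonics."""
    n = int(round(duration_s * sample_rate_hz))
    sweep_rate = (f_end_hz - f_start_hz) / duration_s  # Hz per second
    samples = []
    for i in range(n):
        t = i / sample_rate_hz
        phase = 2.0 * math.pi * (f_start_hz * t + 0.5 * sweep_rate * t * t)
        samples.append(math.sin(phase))
    return samples

def autocorrelation(x, max_lag):
    """Autocorrelation for lags 0..max_lag, normalized so lag 0 equals 1."""
    energy = sum(v * v for v in x)
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag)) / energy
            for lag in range(max_lag + 1)]

def last_lag_above(acf, level=0.5):
    """Largest lag (in samples) at which |acf| still reaches the given level."""
    return max(lag for lag, v in enumerate(acf) if abs(v) >= level)

def delay_to_range_mm(delay_us, sound_speed_m_s=343.0):
    """Convert a two-way delay in microseconds to target range in millimeters."""
    return delay_us * 1e-6 * sound_speed_m_s / 2.0 * 1000.0

def compare_sweeps(sample_rate_hz=1_000_000, duration_s=0.002, max_lag=200):
    """Half-height extent (microseconds) for the wideband and narrowband sweeps."""
    wide = linear_fm_sweep(100e3, 20e3, duration_s, sample_rate_hz)
    narrow = linear_fm_sweep(40e3, 30e3, duration_s, sample_rate_hz)
    us_per_sample = 1e6 / sample_rate_hz
    return (last_lag_above(autocorrelation(wide, max_lag)) * us_per_sample,
            last_lag_above(autocorrelation(narrow, max_lag)) * us_per_sample)
```

Calling `compare_sweeps()` returns a smaller extent for the wideband sweep than for the narrowband one, which is the qualitative point of the paragraph. `delay_to_range_mm(150.0)` gives roughly 26 mm in air, in line with the 2–3 cm quoted for the bat.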

The accuracy, bias, and precision of echo delay perception are visualized in Fig. 6.5a. The accuracy of a range estimate refers to the agreement between the estimated value of target range and the target’s true, or objective, distance. Any error in range accuracy reflects a bias in the delay-estimation process. If the bat creates a separate delay estimate for each broadcast-echo pair, and then emits a series of sounds so that multiple estimates accumulate, the width of the distribution of these estimates is the precision of echo delay estimation. Note that precision refers to the variability of delay estimates around a mean estimate for all of the echoes; it is not a statement about whether the bat experiences a bias away from the objective delay. If precision is high enough (i.e., the measurement distribution is narrow enough), the presence of a bias can be detected in the measurements, but if precision is low, any bias would go unnoticed owing to the excessive variability.
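The distinction between bias and precision can be made concrete with a small summary function. The sketch below uses invented delay estimates, not measured data. The rule that a bias counts as detectable when it exceeds twice the standard error of the mean is an assumed rule of thumb, not something stated in the text.

```python
import statistics

def summarize_delay_estimates(estimates_us, true_delay_us):
    """Bias (mean error) and precision (spread) of repeated delay estimates."""
    mean_estimate = statistics.mean(estimates_us)
    bias = mean_estimate - true_delay_us          # systematic offset
    precision = statistics.stdev(estimates_us)    # width of the distribution
    standard_error = precision / len(estimates_us) ** 0.5
    return {
        "bias_us": bias,
        "precision_us": precision,
        # Assumed rule of thumb: bias is detectable if it exceeds 2 standard errors.
        "bias_detectable": abs(bias) > 2.0 * standard_error,
    }

# Hypothetical estimates (microseconds) for a target whose true delay is 3200 us.
summary = summarize_delay_estimates(
    [3200.5, 3201.0, 3201.5, 3200.0, 3202.0], true_delay_us=3200.0
)
```

Widening the spread of the same five estimates while keeping their mean fixed would leave the bias unchanged but eventually make it undetectable, as the paragraph describes.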

Fig. 6.5

The measurement process for target ranging. (a) Accuracy, bias, and precision of range estimation using echo delay. See text for details. (b) Psychophysical experiments on the bat’s perception of range assess accuracy and precision indirectly, by determining resolution. This can be done by giving the bat two separate objects at the same time situated in different directions at different ranges and training it to choose one (the rewarded stimulus object at the correct range, S+) over the other (the unrewarded object at an incorrect range, S−), or by presenting the bat with one object at a time and training it to respond to the object presented at the rewarded range (S+) and not to the object presented at other ranges (S−). The bat’s limit of resolution is determined by reducing the difference in range between S+ and S−

Psychophysical experiments allow for controlled measurements of echolocation delay precision by training animals to respond to echoes that arrive at different delays and measuring the smallest change in delay the animal can perceive. Figure 6.5b illustrates what is meant by delay resolution, which is what psychophysical experiments measure. This procedural alternative to the sensorimotor approach seems straightforward enough, but it turns out not to be so simple in practice. Numerous psychophysical experiments have been carried out to estimate the bat’s target range or echo delay resolution (Simmons and Grinnell 1988; Moss and Schnitzler 1995). Such experiments often use targets that return one reflection for each incident broadcast; that is, each target has one glint. However, some of these experiments even estimate the bat’s two-point resolution (Fig. 6.5b), which is a special case in which the subject is presented two different ranges within the same target so that the object returns two reflections for every incident broadcast. This object, which is a two-glint object, is used to determine the smallest range separation between the glints for which the object still is perceived as containing two separate glints.

Psychophysical experiments investigating echolocation delay fall into two types: two-choice discrimination of echoes delivered at different delays for a series of broadcasts emitted by the bat over a time span of several seconds, and detection of echoes that jitter in their delay from one broadcast to the next. Their methods differ most obviously in the time scale for the presentation of the stimuli. The two-choice method gives the bat several seconds to examine the echoes at each of two delays to determine which echoes arrive at the correct delay; the jitter method presents the bat with alternating examples of both delays in just a few tens of milliseconds.

Figure 6.6 illustrates the two-alternative forced-choice method (the “two-choice” discrimination procedure) most commonly employed to measure the bat’s echo delay resolution with single-glint echoes (Moss and Schnitzler 1995). In this example, the bat sits on an elevated Y-shaped platform while its broadcasts are picked up by microphones (m), delayed electronically, and then returned from loudspeakers (s) as echoes at a particular delay that simulates a target at a particular range. The bat is trained to respond by moving forward toward the loudspeaker that returns the rewarded stimuli (S+), here shown as being presented at a fixed delay of 3.2 ms to simulate a target at a range of 55 cm. The bat should not approach the loudspeaker that returns the unrewarded stimuli (S−), which is presented at a series of different delays (illustrated schematically as proceeding from 1 to 10 in the diagram) that bracket the fixed delay of S+ (e.g., 3.5 ms down to 3.1 ms compared to 3.2 ms). The presentation of S+ and S− on the left or right is alternated randomly, and the bat’s task is to locate S+ at the rewarded delay in the presence of S− at a series of different delays. Insets show spectrograms for the FM broadcast followed by either S+ (upper) or by S− (lower). The example shows S− at a longer delay than S+ (dashed curves in S− spectrogram show delay of S+). As the delay of S− is changed from one value to the next, the hypothetical performance curve (S+ to S− delay difference vs. % errors) traces the masking effect of S− on S+, or the region where the delays of S+ and S− appear indistinguishable to the bat. The peak in the error curve marks the delay of S− that equals the perceived delay of S+, and the shape of the curve traces the bat’s representation of the delay—ideally, its width should be the bat’s delay accuracy.

Fig. 6.6

The two-choice delay discrimination psychophysical test to determine the bat’s echo delay accuracy, with diagram showing possible outcomes from the test (m microphones, s loudspeakers, S+ rewarded stimuli, S− unrewarded stimuli). The letters a–d correspond to possible outcomes of procedures using single-glint and multiple-glint echoes. See text for details

The two-choice experiment arrives at an estimate for the bat’s resolution for echo delay by gradually reducing the difference in delay between S+ and S−. Figure 6.6 shows schematically how the difference in delay changes (in steps 1–10) so that, across many trials, S− is delivered at a series of delays that bracket the delay of S+. The bat’s performance changes according to the size of the delay difference: it makes more errors in its choices when S+ and S− have similar delays, up to chance performance at 50% errors. At chance, the bat cannot distinguish between the delays. For the simplest two-choice experiment, the stimuli both consist of a single echo for each broadcast (i.e., a simulated single-glint target): one echo from the microphone-loudspeaker channel set to the delay specified for S+, and the other echo from the channel set to the delay of S− (left or right, changed randomly). The shape of the error curve on the diagram in Fig. 6.6 for values of the S+ to S− delay difference traces the region where the bat perceives the two stimulus delays to be the same. As an index of the delay-discrimination threshold, the half-width of the error curve half-way down from its peak (at 25% errors) is taken to be the bat’s echo delay resolution.
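The threshold criterion described above (the half-width of the error curve at 25% errors) can be read off a measured curve by linear interpolation. The sketch below is one simple way to do this under stated assumptions: the data points are invented for illustration, and only one side of the error curve is used, with delay differences sorted from the peak outward.

```python
def half_width_at_level(delay_diffs_us, error_pcts, level=25.0):
    """Delay difference at which the error rate first falls to `level`.

    delay_diffs_us: increasing S+ to S- delay differences, starting at the peak.
    error_pcts: percentage of errors at each delay difference.
    Returns None if the curve never drops to the criterion level.
    """
    points = list(zip(delay_diffs_us, error_pcts))
    for (d1, e1), (d2, e2) in zip(points, points[1:]):
        if e1 >= level > e2:
            # Linear interpolation between the two bracketing points.
            return d1 + (e1 - level) / (e1 - e2) * (d2 - d1)
    return None

# Hypothetical error curve: chance (50%) at zero difference, falling with separation.
threshold_us = half_width_at_level(
    [0.0, 25.0, 50.0, 100.0, 200.0],
    [50.0, 40.0, 30.0, 20.0, 5.0],
)
```

For these invented points the criterion is crossed between 50 and 100 μs, giving an interpolated threshold of 75 μs. Real psychometric data would usually call for curve fitting rather than straight-line interpolation.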

Figure 6.6a, b shows two extreme examples of alternate outcomes (hypothetical error curves) for the two-choice tests based on the shape of the cross correlation function of echoes (see Fig. 6.1). The examples assume single-glint echoes. In one extreme, the bat’s intrinsic accuracy is hypothesized to be of the order of 100 μs (width of gray shaded curve in Fig. 6.6a). This corresponds to a target range accuracy of about a centimeter or two, and also to the width of the entire cross correlation function from Fig. 6.1, including the central peak and its side peaks. In the other extreme, the bat’s intrinsic accuracy is hypothesized to be in the region of 1 μs or better (width of gray shaded curve in Fig. 6.6b), which corresponds to a target range accuracy of a fraction of a millimeter, and also to the width at the tip of the central peak in the cross correlation function from Fig. 6.1. This finding is satisfyingly similar to the precision of about 200 μs (3.4 cm) estimated from the bat’s vocal sensorimotor control of broadcast duration and its reaching out to seize the target. One caveat, however, is that in a typical trial of a two-choice experiment the bat emits a series of broadcasts while scanning left and right to examine the two simulated objects. The size of these left-to-right head movements amounts to several centimeters, which changes the distance from the bat to the microphones and loudspeakers by up to 1–2 cm. This, in turn, changes the actual delay of the echoes reaching the bat’s ears, by up to 50–100 μs. Therefore there remains a question as to whether the two-choice method has an intrinsic limitation that might conceal greater delay precision in the bat.

The next two examples in Fig. 6.6c, d are for S+ echoes simulating a multiple-glint target that returns two or three distinct reflections at small delay separations. The spectra are modified by interference nulls distributed at fixed frequency intervals across the spectrum, which are determined by the time separations of the glint reflections. Figure 6.6c shows a broad error peak, 50–100 μs wide, representing relatively low delay acuity coupled with an additional perceptual quality related to the spectral “coloration” from the nulls. In this case, there should not be an error peak at all because the bat can distinguish S+ from S− by the presence of this spectral coloration, even when the delay difference itself is small. Figure 6.6d shows a narrow error peak representing high range accuracy coupled with additional error peaks that mark delays where S− corresponds in delay to one or two additional glints in S+. The presence of the additional error peaks signifies high delay resolution and represents the conversion of spectral nulls into time separation estimates by the bat. The curve in Fig. 6.6c implies that the bat perceives target glint structure in terms of the echo spectrum as a dimension orthogonal to echo delay (this dimension would prevent the error curve from having a peak), while the curve in Fig. 6.6d implies perception of target glint structure in terms of the distances to the glints within the target.

Experiments with detection of jitter in echo delay suggest the big brown bat’s intrinsic delay precision might be as small as 10–20 ns (Simmons et al. 1990). This extraordinarily acute, psychophysically measured delay resolution implies very high delay precision for the bat. Nevertheless, however implausible it seems, a resolution of 10–20 ns is not an impossible result from an information-theoretic perspective; it is achievable given the bandwidth and signal-to-noise ratio of the bat’s broadcasts (Sanderson and Simmons 2002). Because neural responses at various levels of the big brown bat’s auditory system exhibit latency variability typically of hundreds of microseconds (Ferragamo et al. 1998; Valentine and Moss 1998; Sanderson and Simmons 2000), the bat’s delay-perception mechanism clearly is distributed across populations of neurons and thus quite possibly does not have a conventional architecture (Simmons 2012).
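
The information-theoretic argument can be illustrated with the standard matched-filter (Woodward) bound on delay accuracy for an ideal coherent receiver, σ = 1/(2π·B_rms·d), where B_rms is the root-mean-square bandwidth and d = √(2E/N0). This bound is a textbook result brought in here for illustration; the chapter does not state which formula the cited study used, and the bandwidth and signal-to-noise values below are assumptions chosen only to show the order of magnitude.

```python
import math


def delay_accuracy_s(rms_bandwidth_hz, energy_snr_db):
    """Woodward/Cramer-Rao bound on delay standard deviation for an ideal
    coherent (matched-filter) receiver.

    energy_snr_db is 10*log10(2E/N0), an assumed input quantity.
    """
    d = math.sqrt(10.0 ** (energy_snr_db / 10.0))  # d = sqrt(2E/N0)
    return 1.0 / (2.0 * math.pi * rms_bandwidth_hz * d)


# Assumed illustrative values: 50 kHz rms bandwidth, 2E/N0 = 50 dB.
sigma_ns = delay_accuracy_s(50e3, 50.0) * 1e9
```

With these assumed inputs the bound is close to 10 ns, which shows that nanosecond-scale delay accuracy is not excluded in principle for a wideband signal at a high signal-to-noise ratio.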

6.4 Distortions of Perception for Target Range by Flying Bats

The occurrence of amplitude-latency trading, the physiological effect that causes a shift of response timing as a function of stimulus amplitude, introduces a bias in the bat’s estimate of echo delay (Fig. 6.5a). In big brown bats, latencies of neuronal responses evoked by FM sounds or tone bursts in the auditory system, particularly the inferior colliculus, become longer by about 15 μs when stimulus amplitude is decreased by 1 dB (Simmons et al. 1990; Burkard and Moss 1994; Ma and Suga 2008). This affects the estimate’s accuracy quite apart from the precision with which delay is perceived (variation around the mean perceived delay). Because the bat perceives echo delay in relation to the broadcasts (more precisely, the neural responses evoked by echoes in relation to the neural responses evoked by broadcasts; Simmons 2012; Simmons and Gaudette 2012), the accuracy of the bat’s perception of delay depends on the latencies of these responses being the same for both the broadcasts and the echoes. Both the broadcast and the echo undergo changes in amplitude during their propagation from the bat’s larynx (Suthers 2004) to the inner ear (Veselka et al. 2010; Simmons and Gaudette 2012), and there is no easy way to estimate their equivalence as the proximal stimuli for perception of delay.
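
To give a sense of scale, the trading ratio of about 15 μs per dB can be converted into an apparent range shift. This is a rough sketch that assumes the latency change is linear in dB and a sound speed of 343 m/s; the function names are hypothetical.

```python
TRADING_US_PER_DB = 15.0     # from the text: about 15 us longer latency per dB decrease
SPEED_OF_SOUND_AIR = 343.0   # m/s; assumed value


def latency_shift_us(amplitude_change_db, slope=TRADING_US_PER_DB):
    """Latency change for an amplitude change in dB (assumed linear).

    A negative change (weaker stimulus) gives a positive shift (longer latency).
    """
    return -slope * amplitude_change_db


def apparent_range_shift_cm(amplitude_change_db, c=SPEED_OF_SOUND_AIR):
    """Range shift equivalent to the latency change, using two-way travel."""
    return c * latency_shift_us(amplitude_change_db) * 1e-6 / 2.0 * 100.0


shift_us = latency_shift_us(-10.0)          # 10 dB weaker: 150 us longer latency
shift_cm = apparent_range_shift_cm(-10.0)   # equivalent to about 2.6 cm of range
```

A 10 dB mismatch between the effective broadcast and echo levels would therefore correspond to a timing offset of about 150 μs, far larger than the microsecond-scale precision discussed in Sect. 6.3.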

Concentrating just on changes in echo amplitude relative to broadcast amplitude during an interception maneuver would introduce a bias of underestimating the target’s range, and this bias increases as the distance from the bat to the target increases. Such a bias toward a progressively increasing underestimation of the target’s range could easily compromise the bat’s ability to coordinate its actions during interception. There is evidence that the big brown bat largely avoids experiencing this bias, however. As target range shortens from at least 1–1.5 m down to about 0.2 m, the bat actively compensates for the expected 15–19 dB increase in echo strength due to declining target range by raising its echo detection thresholds (i.e., decreasing its echo detection sensitivity) by roughly the same amount (Kick and Simmons 1984; Simmons et al. 1992). This appears to be achieved as a consequence of the contraction of the bat’s middle ear muscles synchronized to each vocalization. The relaxation of each contraction following the emission leads to progressively improving echo detection sensitivity along a track of 11–12 dB of improvement per doubling of echo delay from the moment of emission out to about 6–10 ms. The maximum attenuation achieved by the bat’s middle ear muscles is about 30–35 dB, which implies a zone of target ranges extending over a factor of 6–7 times for which the actual strength of echoes reaching the inner ear is stabilized. The bat’s peripheral auditory system intervenes between the target and the inner ear to regulate echo strength in a manner that cancels out the expected accumulation of bias in perception of target range so that the bat perceives more nearly the target’s actual range during its approach to capture.
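
The link between the maximum middle ear attenuation and the span of stabilized ranges can be checked with simple arithmetic. The sketch assumes that sensitivity recovers by a fixed number of decibels per doubling of echo delay over the whole span; the function name is hypothetical.

```python
def stabilized_range_factor(max_attenuation_db, db_per_doubling):
    """Ratio of farthest to nearest range over which recovery from the
    middle ear contraction can offset the change in echo strength,
    assuming a constant recovery slope per doubling of delay."""
    doublings = max_attenuation_db / db_per_doubling
    return 2.0 ** doublings


low = stabilized_range_factor(30.0, 12.0)    # about 5.7
mid = stabilized_range_factor(32.5, 11.5)    # about 7.1
high = stabilized_range_factor(35.0, 11.0)   # about 9.1
```

Across the quoted ranges of 30–35 dB and 11–12 dB per doubling, this simple model gives factors from about 5.7 to about 9, bracketing the factor of 6–7 given in the text.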

The foregoing discussion of bias in perception of target range caused by amplitude changes in echoes that lead to amplitude-latency trading raises a broader question about what “accuracy” really means for biosonar. There are other sources of range bias, and the “big picture” reveals that the bat’s perception of target range floats on a sea of variability that belies the notion of real accuracy (Holderied et al. 2008). As considered in Sect. 6.3, this makes the distinction between accuracy and precision very important. Big brown bats are in flight in nearly all natural situations, so they are in constant motion while emitting sonar sounds and receiving echoes. Owing to the bat’s forward progress, the echo’s delay will be shorter than it should be for the bat’s initial position (when the process of determining delay was activated). If the whole auditory process is taken as the bat’s gauge for target range, the perceived range will be biased nearer than what was the true range at the moment when the broadcast was produced. This bias toward perceiving a too-short delay depends on the overall distance to the target. If the target is closer than 3 m, the bat will have traveled a shorter distance by the time the echo is received, and the reduction in echo delay consequent on the bat’s forward movement will be smaller, so the range bias will be smaller. Figure 6.7 illustrates two major sources of bias for perception of echo delay and target range that depend on the bat’s flight. There are four targets depicted in this diagram, at increasingly longer ranges and larger shortening biases (1, 2, 3, 4).

Fig. 6.7

Factors that affect the accuracy of target range determination (horizontal axis) by a flying bat. Four possible locations of the target are shown—at ranges 1, 2, 3, or 4 incremental “steps.” The gray peaks show the bat’s perceived image of the target at each range; dashed black lines over or near the gray peaks indicate the target’s objective position at positions 1, 2, 3, or 4. The width of the gray peaks stands for the accuracy of the estimation process, which is blurred by the bat’s continuous motion, while the difference in location between the gray peaks and the black dashed lines stands for systematic errors that occur as a result of this same forward motion. The magnitude and direction of both Doppler errors and displacement errors are demonstrated by the black arrows. See text for details

However, there is another source of range bias, and it is in the opposite direction. Apart from displacement in flight, first, the broadcasts that impinge on the target and, second, the echoes that return to impinge on the bat’s ears undergo Doppler shifts at a magnitude directly related to the bat’s flight velocity. In terms of frequencies, higher approach velocities mean larger upward Doppler shifts. The big brown bat’s sonar sounds are not single frequencies, however, but FM sweeps (Fig. 6.1), so the Doppler shift strictly is a compression of the FM waveform that shortens its duration while raising its frequencies. Because the FM signals always sweep downward, the upward Doppler shift in the FM sweeps results in a lengthening of the time that elapses between the occurrence of a given frequency in the broadcast and the occurrence of that same frequency in the echo. This causes the approaching bat to perceive Doppler-shifted echoes as biased toward a longer delay than is true for the target’s actual range, which adds a Doppler error toward a longer range to its overall range bias (rightward arrows in Fig. 6.7). Unlike the shortening bias from displacement errors, which increase in size as target range increases, Doppler errors are a lengthening bias that is the same for all target ranges (Fig. 6.7). The total range bias, the result of the two effects, is the sum of the displacement and Doppler errors. For targets close to the bat, the Doppler error dominates the displacement error, and the target is perceived as being farther away than it really is when the broadcast is sent out (Fig. 6.7, distance step 1). As range increases, the displacement error increases, gradually overcoming the Doppler error so that the target is perceived as being increasingly nearer than its true range at the moment of the broadcast (Fig. 6.7, distance steps 3 and 4).
At one particular range, the Doppler and displacement biases are equal but in opposite directions; they cancel each other out so the bias is zero (Fig. 6.7, distance step 2). This is the “distance of focus,” where the target’s range is correctly determined relative to the moment the broadcast is sent out (Holderied et al. 2008).

For representative bat sounds, the range bias varies in size from a lengthening of 1–2 cm for near targets to as much as 3–4 cm of shortening for far targets. Added to both of these biases is the effect of the target’s direction in relation to the direction of the bat’s flight. The sizes of the Doppler error and the displacement error both decrease for targets located to the side of the bat’s velocity vector by an amount proportional to the cosine of the angle of offset. The resultant bias thus also decreases. A plot of the overall range bias for targets located at different distances and in different directions shows a “force-field” of range-bias vectors that decrease in size as the offside angle of the target increases (Holderied et al. 2008). In the plane of range and offside angles (range and cross range), the distance of focus for correct estimation of range becomes a curve that extends to the bat’s left and right. For perception of range, locations closer to the bat than the distance of focus are associated with a lengthening bias while locations farther away are associated with a shortening bias.
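
The interplay of the two flight-related biases can be sketched with a first-order model. The model is an assumption made for illustration and is not the formulation of Holderied et al. (2008): the broadcast is treated as a linear downward FM sweep, so that range-Doppler coupling gives a constant lengthening of v·f_c·T/B, and the displacement error is derived from the bat closing on the target at speed v during the echo's travel. Flight speed, sweep duration, center frequency, and bandwidth are hypothetical example values.

```python
import math

C_AIR = 343.0  # m/s; assumed speed of sound in air


def closing_speed(speed_ms, angle_deg):
    """Component of flight speed toward a target offset by angle_deg."""
    return speed_ms * math.cos(math.radians(angle_deg))


def displacement_bias_m(range_m, speed_ms, angle_deg=0.0, c=C_AIR):
    """Shortening bias (negative): echo delay is 2r/(c+v) instead of 2r/c."""
    v = closing_speed(speed_ms, angle_deg)
    return -range_m * v / (c + v)


def doppler_bias_m(speed_ms, sweep_s, center_hz, bandwidth_hz, angle_deg=0.0):
    """Lengthening bias (positive) for a linear downward FM sweep (assumed)."""
    v = closing_speed(speed_ms, angle_deg)
    return v * center_hz * sweep_s / bandwidth_hz


def total_bias_m(range_m, speed_ms, sweep_s, center_hz, bandwidth_hz, angle_deg=0.0):
    """Sum of the displacement and Doppler errors."""
    return (displacement_bias_m(range_m, speed_ms, angle_deg)
            + doppler_bias_m(speed_ms, sweep_s, center_hz, bandwidth_hz, angle_deg))


def distance_of_focus_m(speed_ms, sweep_s, center_hz, bandwidth_hz,
                        angle_deg=0.0, c=C_AIR):
    """Range at which the two biases cancel in this simplified model."""
    v = closing_speed(speed_ms, angle_deg)
    return (c + v) * center_hz * sweep_s / bandwidth_hz


# Hypothetical example: 5 m/s flight, 3 ms sweep, 50 kHz center, 50 kHz bandwidth.
params = dict(speed_ms=5.0, sweep_s=3e-3, center_hz=50e3, bandwidth_hz=50e3)
near_cm = total_bias_m(0.2, **params) * 100.0   # positive: lengthening
far_cm = total_bias_m(3.0, **params) * 100.0    # negative: shortening
focus_m = distance_of_focus_m(**params)         # about 1 m
```

With these example values the model gives a lengthening of roughly 1 cm at 0.2 m and a shortening of nearly 3 cm at 3 m, the same scale as the biases quoted above. In this approximation the distance of focus is proportional to sweep duration, so shortening the broadcast moves the focus toward the bat.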

The bat appears to compensate for the amplitude-latency bias by stabilizing the amplitude of echoes at different delays, and it may also mitigate the range bias associated with its forward motion. Bats change the duration and sweep rate of their broadcasts as they fly nearer to prey in aerial interceptions (Fig. 6.2). As they do this, the distance of focus (equal Doppler and displacement errors; Fig. 6.7) comes nearer to the bat, as well, which raises the possibility that the bat takes advantage of the adaptability of its sounds to shift the distance of focus, too (Holderied et al. 2008). The question is whether the bat manipulates the distance of focus to keep either the target itself or some region near the target at the distance of focus. In that case, the adaptive changes in broadcasts, as shown in Fig. 6.2, may accomplish more in perceptual terms than merely matching the duration of broadcasts to echo delay so that overlap does not occur and extending the sonar operating range (“field of view”) to accommodate background clutter (Schnitzler et al. 2003). This is a difficult hypothesis to test, but it merits the effort because range bias has a potentially adverse effect on the perception of targets.

6.5 Perception of Target Shape: Echo Spectra and Glint Delays

The arrangement of objects along the distance axis is the target “scene,” which is represented by the corresponding stream of echoes returning to the bat or dolphin. The target scene spreads out to the animal’s left and right. Within the full-spectrum zone any changes in echo amplitude and spectrum can reasonably be attributed to the target itself. These include changes in overall echo amplitude from one moment to the next due to fluctuating target strength, and changes in the echo spectrum due to interference between multiple reflections from the target’s glints. Additional modifications to returning echoes are dependent on the target’s location in the echolocation beam. Echoes generated off of the main response axis of the echolocation beam undergo low-pass filtering because directionality is narrower at high frequencies than low frequencies, and the target’s shape also affects the echo spectra. For small objects, such as flying insects (bats) or fish (dolphins), which consist of two or more prominent, closely spaced glints, such spectral patterning is the acoustic manifestation of target shape and size (Imaizumi et al. 2008; Au et al. 2009; Matsuo et al. 2009).

Virtually all materials have far greater acoustic impedance than the medium of air, so echoes consist of specular reflections from surfaces and points, which make target geometry the predominant object-related information carried by returning echoes. The situation is more complicated for dolphins, which may feed on species that have swim bladders as well as those that do not. The impedance of the tissues of fish is close to that of sea water, leaving the swim bladder of prey species that contain them as the prominent source of signal backscatter and target geometry (Fig. 6.8; Au et al. 2007). However, dolphins and other species of echolocating odontocetes may also feed on flatfish, squid, and other animals that lack gas-containing structures. In all cases, other ancillary structures (e.g., fins, mantles) and target geometry contribute to the fine echo structure and aspect dependence (Au et al. 2009).

Fig. 6.8

Variation in target echoes for a simulated bottlenose dolphin click. (a) The echo from a mullet, a common prey item for the bottlenose dolphin, is highly variable owing to the air-filled swim bladder and fluctuates depending on target geometry. The angles listed correspond to the orientation of the incident signal with respect to the fish body (data from Au et al. 2009). (b) The echo from a solid stainless steel sphere yields three reflections separated in time (waveform), which result in spectral notches in the frequency domain. If the echoes arrive within the integration time of the auditory system, the dolphin may use these spectral notches for target discrimination (data from Muller et al. 2007)

The spacing of glints has important implications for an echolocating animal’s identification of the target from which the glint reflections originate. If two echoes arrive closer together than the integration time of the auditory system, they merge together to create a single spectrogram with interference notches located at specific frequencies determined by the time separation of the reflections (Fig. 6.8). If the frequencies of these notches are known, the underlying time separation can be estimated. In psychophysical tests, big brown bats actually perceive the arrival times of closely spaced reflections, which they infer from the frequencies of the interference notches. This process amounts to deconvolution; in bats, it is effective for determining two-glint separations from 10 μs to about 300 μs, and even down to 2 μs. This degree of resolution is possible because the bat has nearly perfect knowledge of the transmitted signal (it hears the sound at the moment of emission) and can work backward from its internal replica of the broadcast to determine the pattern of reflections required to produce a given interference pattern in echoes. Psychophysical tests with dolphins suggest similar capabilities and potentially similar underlying processes (i.e., deconvolution based on knowledge of the transmitted signal). Bottlenose dolphins have a temporal integration time of approximately 265 μs (Vel’min and Dubrovskii 1976; Moore et al. 1984; Au et al. 1988), yet bottlenose dolphins can determine glint separations as small as 75 μs. The resolution of closely spaced glints is, however, dependent on the relationship between the glint interval and the amplitude of the respective glints (Helweg et al. 2003). In both systems, the limitations of the deconvolution process are crucial for understanding how the echolocating animals cope with clutter.
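
The statement that the time separation can be estimated from known notch frequencies rests on a simple reciprocal relation for two reflections: adjacent notches are spaced 1/Δt apart. The sketch below shows only this arithmetic; it is not a model of the animal's neural processing, and the function name and notch values are hypothetical.

```python
def separation_from_notches_us(notch_khz):
    """Estimate a two-glint time separation (microseconds) from the
    frequencies (kHz) of adjacent interference notches."""
    freqs = sorted(notch_khz)
    if len(freqs) < 2:
        raise ValueError("need at least two notch frequencies")
    spacings = [hi - lo for lo, hi in zip(freqs, freqs[1:])]
    mean_spacing_khz = sum(spacings) / len(spacings)
    # 1 / kHz gives milliseconds; multiply by 1000 for microseconds.
    return 1000.0 / mean_spacing_khz


# Hypothetical notches observed at 30, 50, 70, and 90 kHz (20 kHz apart).
estimated_us = separation_from_notches_us([30.0, 50.0, 70.0, 90.0])
```

Notches spaced 20 kHz apart imply a separation of 50 μs between the two reflections.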

The wide bandwidth of the big brown bat (75–80 kHz) and bottlenose dolphin (85+ kHz) biosonar sounds allows them to form images that precisely depict the delay of echoes on a scale finer than 1–5 μs, which corresponds to range precision of less than a millimeter. The bat’s delay precision depends on receiving the full broadcast bandwidth in echoes, but it still is very acute, far smaller than the different range biases that affect the accuracy of target ranging on a scale up to several centimeters. Why does the big brown bat have such high broadcast bandwidth and high delay precision? One reason appears to be incorporation of shape into the range images of targets (Simmons et al. 1995; Neuweiler 2000). Small targets such as flying insects are relatively simple objects in acoustic terms. A typical insect consists of two or three prominent body parts (the target’s glints, e.g., head, wings, abdomen) that each reflect a discrete replica of the incident sonar sound (Simmons and Chen 1989; Moss and Zagaeski 1994). Big brown bats prey upon flying insects mostly with dimensions up to about 1–3 cm, so the largest delay separation between reflections will be roughly 60–180 μs. Across insect aspect angles, flight postures, and wing positions, the majority of delay separations between glint reflections distribute between 5–10 μs and 50–100 μs, with periodic transient excursions to longer separations at particular points in the insect’s wing beat cycle. During approach, the bat’s broadcasts are 2–10 ms long (Fig. 6.2), while the glint reflections are only a few tens of microseconds apart. Echoes returning from insects thus contain two or three reflections that arrive at such small delay differences the reflections overlap almost completely. They add together to reinforce or cancel at different frequencies.

Frequencies and frequency spacing of interference nulls in the echo spectrum depend on the time separation of the glint reflections, but typical values range from 50 kHz null separations for a 20 μs two-glint delay separation down to 10 kHz null spacings for a 100-μs delay separation. These nulls are recognized by neurons in the big brown bat’s auditory system that are tuned to frequencies from 15 to 100 kHz with tuning widths from about 1–2 kHz to 10–12 kHz (Simmons 2012). When target shape is taken into account, the wide span of frequencies in the broadcast spectrum (Fig. 6.1) seems well adapted for registering the shape of targets from the pattern of regularly spaced nulls in the spectrum caused by interference between reflections from the target’s glints (Simmons 2012; Simmons and Gaudette 2012). If the target is located close to the bat, no further away than about 1 m, and on the axis of the broadcast beam (Masters et al. 1985; Ghose et al. 2006), only these nulls will affect the echo spectrum significantly because atmospheric absorption has not yet accumulated enough to lowpass filter the echoes by more than a few decibels (Fig. 6.3). Also, in most cases, the targets of interest to aerial-feeding bats have dimensions that are enough larger than the broadcast (i.e., incident) wavelengths to avoid Rayleigh scattering, which would lead to highpass filtering of echoes (Houston et al. 2004). (However, if the target is farther away than 1–2 m, or off the acoustic axis of the broadcast beam, then the echo spectrum manifests lowpass filtering caused by the frequency dependence of the directional beam and by the increased atmospheric absorption at higher frequencies. This effect is considered later in this section.)
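
The relation between glint separation and null spacing can be reproduced with a two-path interference model in which the echo is the sum of one reflection and a delayed copy. This is a minimal sketch assuming two reflections of equal polarity; the 20–100 kHz band limits and the function names are assumptions for illustration.

```python
import cmath
import math


def two_glint_gain_db(freq_hz, separation_s, second_amp=1.0):
    """Gain of the transfer function 1 + a*exp(-j*2*pi*f*dt) in dB."""
    h = 1.0 + second_amp * cmath.exp(-2j * math.pi * freq_hz * separation_s)
    return 20.0 * math.log10(max(abs(h), 1e-12))  # floor avoids log of zero


def null_frequencies_khz(separation_us, f_lo_khz=20.0, f_hi_khz=100.0):
    """Null frequencies (2k+1)/(2*dt) that fall inside the assumed band."""
    nulls = []
    k = 0
    while True:
        f_khz = (2 * k + 1) * 1000.0 / (2.0 * separation_us)
        if f_khz > f_hi_khz:
            break
        if f_khz >= f_lo_khz:
            nulls.append(f_khz)
        k += 1
    return nulls


nulls_20us = null_frequencies_khz(20.0)     # nulls 50 kHz apart
nulls_100us = null_frequencies_khz(100.0)   # nulls 10 kHz apart
```

For a 20 μs separation the nulls in this band fall at 25 and 75 kHz, and for 100 μs they fall every 10 kHz from 25 to 95 kHz, matching the spacings given above. Between nulls, the two equal reflections reinforce by about 6 dB.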

The contour plot in Fig. 6.9 shows a vertical stack of 31 horizontal spectral-difference slices derived from a sequence of echoes recorded from a fluttering moth (data from Moss and Zagaeski 1994). In effect, these are slices of the moth’s transfer function frozen as an acoustic snapshot at the moment the incident sound (a bat-like FM signal) impinged on the moth to capture its instantaneous wing posture. The contour plot thus displays variations in the transfer function over time. The mean spectra of three such echoes are shown at the bottom of Fig. 6.9. This transfer function contains a deep notch at 55–70 kHz as well as noticeable ripple spaced at intervals of about 15 kHz. The pattern of notches reveals interference between reflections from different parts of the moth, most likely between the two wings and another prominent body part such as the abdomen.

Fig. 6.9

Relative strengths of a series of 126 echoes from a tethered fluttering moth (data from Moss and Zagaeski 1994). (Top) The regular horizontal striping of moth echo spectra in the contour plot reflects the insect’s periodic fluttering wingbeats across successive incident sounds. Contours are plotted with 0 dB as the maximum level that occurs across all spectra so that only spectral coloration is displayed. The first harmonic of the bat’s sounds (FM1 approximately 23–50 kHz) arrives relatively intact in echoes, with minimal losses resulting from different acoustic effects. Higher harmonics especially are significantly affected by target orientations and wingbeat postures. (Bottom) Average spectra of three echoes

For a typical insect, the echo as a whole consists of one, two, or three discrete replicas of the incident sound arriving very close together in time according to the size of the target. Thus, a target with linear dimensions of less than 2 cm returns reflections at separations of 0 to about 100 μs depending on its orientation and the attitude of its wings in the wing-beat cycle. For example, in the case of two reflections arriving at a time separation of 50 μs, the spectrum has interference notches spaced at frequencies 20 kHz apart. Each of these reflections of course undergoes attenuation and lowpass filtering on the journey back to the bat’s ears, but it is the pattern of the notches that distinguishes the target. Echoes from targets located closer than 1–2 m of range and within the 10° width of the full-spectrum beam arrive to convey spectral effects specific to the target’s size and shape without significant lowpass filtering owing to the target’s location. When it does occur, lowpass filtering due to distance or direction is qualitatively different from the pattern of interference notches related to shape. Consequently, the effects of the target’s location in the beam readily can be segregated from the effects of the target’s intrinsic reflectivity. Moreover, the bat actively participates in segregation of location from identity for any particular target of interest. By pointing its head and ears at the target, and so tracking the target’s movements with the sonar version of gaze implied by its imaging beam, the bat nulls out fluctuations in positional lowpass filtering and keeps echoes as close as possible to the full original broadcast spectrum of FM1 and FM2, leaving only the effects of the target’s geometry to be perceived.

6.6 Summary

The principal purpose of wideband sonar is the localization and classification of targets. Considerable variation in the echolocation system of bats and odontocetes exists between and within groups (i.e., different species) and prevents a unified model of biosonar, yet the fundamental problem of target localization through echo delay and target identification through echo spectra is a commonality. Within certain target distances, bats and dolphins resolve pulse-echo ambiguity by producing pulses or clicks at intervals that permit the return of desired target echoes from a biosonar emission before a subsequent emission. In both systems, this pattern dissolves during the terminal phase of prey capture when emissions occur at shorter time intervals than that which corresponds to target range. In dolphins, the production of click packets may also occur at target distances >100 m. This phenomenon may permit integration of echo information across multiple echoes when delays between echo returns from a distant target would otherwise limit the integration process. The accuracy and precision of echo delay perception are acute in both bats and dolphins owing in part to the broadband nature of the biosonar emission. However, at least in bats, variations on the conventional neural architecture due to the distribution of delay perception across populations of neurons may contribute to delay resolution. Target identification through echolocation requires the discrimination of target glints, a deconvolution process that is crucial to resolving target shape. Bats and dolphins demonstrate an ability to resolve spectral notches associated with glint separations on the order of tens of microseconds or less. The resolution of the deconvolution process permits differentiation of glints at spatial scales substantially smaller than the size of the target, regardless of the medium in which echolocation is used (i.e., air vs. water).

Though diverse echolocation strategies and mechanisms exist that match the diversity of habitats in which animals live, it can be concluded that bats and dolphins both have evolved sophisticated approaches to echo delay resolution and spectral processing that permit desired targets (e.g., prey) to be differentiated from clutter within the acoustic scene of their respective environments. However, a complete picture of the mechanisms associated with any natural biosonar system, particularly at the level of neural processing, remains elusive. Continued investigations into the mechanisms engaged by individual systems, both bat and dolphin, will be required before a more complete understanding is obtained of how biosonar “works.”