Introduction

Both African (Loxodonta africana) and Asian (Elephas maximus) elephants produce high-amplitude [103 and 90 dB sound pressure level (SPL) at 5 m, respectively], low-frequency vocalizations in the range of 20 Hz (Poole et al. 1988; Payne et al. 1986), and Asian elephants are capable of detecting acoustic signals at 20 Hz when received above 60 dB SPL (Heffner and Heffner 1982). Although the threshold of hearing sensitivity for the African elephant is not yet known, it is reasonable to expect that it is similar to that of the Asian elephant, given the similar ear anatomy and vocalization frequency range. African elephants have also been shown to respond to acoustic playbacks of their own low-frequency vocalizations at a distance of up to 2 km (Langbauer et al. 1991). When African and Asian elephants vocalize with these low-frequency rumbles, correlates of these vocalizations propagate in the ground (O’Connell-Rodwell et al. 2000a, 2001; Gunther et al. 2004).

The use of seismic cues for assessing the environment in the context of prey detection, predator avoidance, and communication has been documented over short distances in a diverse range of arthropods, amphibians, reptiles, and small rodent species (see O’Connell-Rodwell et al. 2000a for review). If elephants can detect and interpret the seismic replicates of their vocalizations as signals, they may be communicating and sensing the environment in a modality not previously considered.

Working under the hypothesis that the natural behavior of wild elephants would change significantly if they detected meaningful seismic stimuli, we played back seismic replicates of low-frequency elephant vocalizations to wild herds of African elephants in their natural environment in Etosha National Park and monitored their behavior for significant changes.

To demonstrate that elephants respond to seismic stimuli, we first chose to play back an elephant vocalization that had the greatest potential for a behavioral response. We then generated this vocalization seismically with enough amplitude to be detected by an elephant. Finally, the seismic signal had to be monitored to ensure that there was no coupling of the signal with the air.

Material and methods

Study area

Data for these studies were collected in June and July 2002. We conducted our studies at a remote water hole (Mushara) in Etosha National Park, Namibia, for three reasons. First, large numbers of elephant herds visit this water hole on a consistent basis. Second, it is closed to the public and far from the main tourist road. Third, because the substrate in Etosha National Park is relatively homogeneous calcrete, seismic signals may be more easily transmitted and detected under these geologic conditions, which makes the site ideal for this study; elephants may therefore be more likely to use such signals in this environment. In addition, to maximize the probability of eliciting a significant behavioral response in the elephants, we created seismic replicates of prerecorded acoustic alarm calls (Fig. 1a) that we knew would cause dramatic behavioral changes when played to elephants in the study area as airborne signals (O’Connell-Rodwell et al. 2000b).

Fig. 1

a Acoustic recording of an alarm call recorded from a breeding herd at Mushara water hole in the presence of lions in July 1994. This alarm call was played back to three different breeding herds at this site as well as ten bulls in 1994 (O’Connell-Rodwell et al. 2000b) and again to two breeding herds and two groups of bulls in 2002, resulting in elephants fleeing the area. The call was played through a low-frequency speaker and recorded at 10 m. The linear-weighted peak was 95 dB SPL at a peak frequency centered around 200 Hz with a bandwidth of approximately 200 Hz. The spectrogram was generated using Praat V.4.1.14 (sampling rate 44.1 kHz, FFT 16,384, Gaussian window, gray-scale dynamic range 50 dB). b Second harmonic of alarm calls played seismically and recorded from a geophone (NG) and microphone (NM) 10 m from the seismic source during seismic playbacks. The noise in both traces is due to elephants drinking at the water hole. The spectrogram was generated using Praat V.4.1.14 (sampling rate 44.1 kHz, FFT 16,384, Gaussian window, gray-scale dynamic range 50 dB).

Seismic signal

The playback signal was filtered in Cool Edit Pro V1.2 using a 15th-order Butterworth low-pass filter (cutoff frequency 60 Hz) to remove the higher harmonics, which previous experiments had shown to have a greater tendency to couple with the air (unpublished data). The filtered alarm calls were then transmitted to the elephants as isolated seismic signals through six 50-W Aura Base Shaker (20–80 Hz) seismic transmitters (Fig. 1b), buried 30 m from the water hole at a depth of 1 m below the surface (see Fig. 2a for setup). A Dell Inspiron 3700 laptop with a Digigram Vxpocket sound card provided the signal source for the seismic transmitters. An Audiotek SS 400.2, 400-W amplifier was then used to raise the amplitude of the signals to peak presentation velocities ranging from 0.44 to 1.1 μm/s root mean square (RMS) at 10 m from the elephants, approximating the velocity of a naturally occurring seismic elephant rumble (40–60 Hz) at this distance (mean 0.276 μm/s; O’Connell-Rodwell et al. 2000a), so that the signal would be sufficiently above background level at the position of the elephants (25–30 m). These seismic signals were played (Fig. 1b, top trace) to 11 breeding herds over the course of a 3-week period. A microphone/geophone sensor pair was also placed at each of two distances, 10 and 90 m from the shakers [near microphone (NM)/near geophone (NG) and remote microphone (RM)/remote geophone (RG), respectively], to monitor signal integrity.

Fig. 2

a Experiment setup at Mushara water hole, Etosha National Park, Namibia, showing the location of the shakers in relation to the water hole (30 m), NG, and NM (10 m), as well as the RG and RM (90 m) and the position of the observation tower. b The spectral response of NG and RG at the calibration level, which is 14.6 dB higher than the presentation level. From these, the parameters $\alpha = 0.05$/m and $L_0 = 25.5 \times 10^{-6}$ m$^{3/2}$/s for the model of Eq. (1) were calculated for shaker-induced motions, resulting in a signal level calculation of −13.5 dB for trial #5. The background activity level was similarly calculated to be −15.7 dB, with corresponding model parameters of $\alpha = 0.018$/m and $L_0 = 4 \times 10^{-7}$ m$^{3/2}$/s. Therefore, the SNR was 2.2 dB. Both signal and noise level calculations are at 1 body length, or 5 m, from the water hole.

Experimental design

Trials took place between 4 p.m. and 12 a.m. with an average temperature of 16.4°C and average wind speed of 4.6 km/h. Each trial began after breeding herds had had sufficient time to drink (2–5 min) and consisted of a 5-min control period followed by 3 min of playback, during which a 15-s string of alarm calls was played each minute, followed by 45 s of silence. Elephants were then monitored for 4 min after the trial or until they left the water hole.

Data collection

All data were collected in real time in notebooks as well as on videotape for confirmation as needed and for later analysis. Nighttime trials were observed through low-light binoculars and generation 1 and 2 night vision devices (Zeiss 8×56, Bushnell 7×50 binoculars; Rigel Optics 3200 Generation 1 Night Vision goggles with a 3.6× lens conversion, Voyager 2nd Generation Night Vision Monocular with 1.5× magnification). A night vision lens was attached to the video recorder to document these trials. Herd size and composition (adult, three fourths, one half, one fourth, and babies; sizes are relative to adult females) were noted for each trial. This information was used to identify the herds being tested and to confirm that they were indeed independent groups. The direction in which herds arrived at and departed from the water hole was also noted.

During the trials, we monitored four classes of behavioral data: herd spacing, vigilant behaviors (smelling, visual orientation toward source of stimulus, freezing, rumbling, foot lifting, and scanning), body orientation relative to the seismic signal source, and duration of time spent at the water hole. For a control on the duration of time spent at the water hole, we also measured the duration of time that 16 breeding herds spent at the water hole when no seismic signals were played back. The observer noting real-time behaviors did not know the onset and timing of the trials.

A Neumann KM131 microphone with a Sound Devices MM1 preamplifier was placed directly above each of the two 4.5-Hz Mark Products geophones to test for evidence of any acoustic correlate of our seismic playback signals (Fig. 1b). Recordings were made on a TASCAM 2 channel digital recorder. Background noise activity was measured during trials prior to turning on the shakers.

Data analysis

Behavioral analyses were based on real-time data collection in the field, and body orientation analysis was based on video analysis. Individual occurrences of each type of behavior were noted for each trial period and summed. This number was divided by the number of animals present in the herd and then divided by the minutes in that period to give a measure of occurrences per animal per minute. Another observer recorded the inter-individual spacing of the herd every 15 s during the trial by noting the most common spacing between pairs of animals. This was scored in three categories: less than 1 body length apart (0), 1 body length apart (1), and greater than 1 body length apart (2). The average spacing score was then calculated for each period (control and post-playback) by summing the recorded scores (0, 1, or 2) and dividing by the number of scores recorded for that period. The duration of time spent at the water hole was measured from the time of arrival of the whole herd at the water hole until the time that the first individual left the water hole. To standardize the duration of time spent at the water hole (because the start of the playback trial varied from 2 to 5 min after the herds’ arrival), we subtracted the time elapsed before the onset of playback from the duration for each trial. For the controls, since there was no playback, we subtracted the average playback onset time (183 s).
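The two behavioral measures described above reduce to simple arithmetic, sketched here in Python. The function names and all input values are hypothetical and chosen only for illustration; they are not data from the trials.

```python
def rate_per_animal_per_minute(occurrences, herd_size, minutes):
    """Summed occurrences of a behavior, per animal, per minute of the period."""
    if herd_size <= 0 or minutes <= 0:
        raise ValueError("herd_size and minutes must be positive")
    return occurrences / herd_size / minutes


def mean_spacing_score(scores):
    """Mean of the spacing codes (0, 1, or 2) recorded every 15 s in one period."""
    if not scores:
        raise ValueError("no spacing scores recorded")
    if any(s not in (0, 1, 2) for s in scores):
        raise ValueError("spacing scores must be 0, 1, or 2")
    return sum(scores) / len(scores)


# Hypothetical example: 12 vigilance events, a herd of 8, a 5-min control period.
example_rate = rate_per_animal_per_minute(12, 8, 5)

# Hypothetical example: eight consecutive 15-s spacing scores.
example_spacing = mean_spacing_score([2, 2, 1, 1, 0, 1, 2, 1])

print(example_rate, example_spacing)
```

Normalizing by herd size and period length makes the control and playback periods comparable even though they differ in duration and the herds differ in size.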

We predicted that the inter-individual spacing between pre-playback and post-playback periods would decrease and that vigilance behaviors would increase due to an elevation in defensive behaviors in reaction to the nature of the seismic stimulus. We therefore used a directional Wilcoxon signed rank test to evaluate these variables (as the direction of the outcome was predictable a priori).
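As a consistency check on the directional test, the sketch below computes a one-tailed p-value from a signed-rank sum using the normal approximation with a 0.5 continuity correction. This is an assumption for illustration: the software and the approximation actually used are not stated here, and the formula ignores ties. The inputs are the rank sums and sample sizes reported in the Results.

```python
import math


def signed_rank_one_tailed_p(w, n):
    """One-tailed p for a Wilcoxon signed-rank sum w from n nonzero differences.

    Normal approximation with a 0.5 continuity correction; the tail is the one
    on the same side of the null mean as w. Assumes no tied ranks.
    """
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = max(abs(w - mean) - 0.5, 0.0) / sd
    return 0.5 * math.erfc(z / math.sqrt(2))


# Rank sums and n values as reported in the Results.
p_spacing = signed_rank_one_tailed_p(26.0, 7)
p_vigilance = signed_rank_one_tailed_p(9.0, 8)
print(round(p_spacing, 3), round(p_vigilance, 3))
```

Under these assumptions the approximation gives values close to the reported p=0.026 and p=0.117; an exact permutation calculation would differ slightly at such small sample sizes.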

An analysis of covariance (ANCOVA) was used to determine if the occurrence of a seismic playback had an effect on the time an elephant herd spent at the water hole. We also included potential confounding variables such as herd size in the model.

Video analyses were conducted on the seismic playback trials to compare the body orientation of elephants before and during stimulus presentation. Videotapes were digitized and analyzed (Noldus Observer software) for body orientation (north, south, east, or west). The video analysis team was blind to the purpose, details, and timing of the experiment.

Video data were collected only on adult elephants (defined as any individual whose size was at least 75% of the size of the largest elephant in the group). Because the entire group was filmed on one camera, elephants were distant in the camera’s view and often obscured by one another; orientation was determined by a suite of cues, including the direction of trunk and ears, leg locations, and the overall trajectory of the spine from the head to the base of the tail. Each trial was coded using instantaneous scan samples every 10 s. Data were subsequently split into the control period (5 min) before playbacks began and the experimental period (3 min). Means were calculated for the average proportion of elephants facing each direction per interval for the control and experimental periods. A chi-square test was performed to determine whether elephants were randomly oriented before and during the presentation of the seismic stimulus.
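A chi-square test of random orientation across four compass directions has 3 degrees of freedom, for which the upper-tail probability has a closed form. The Python sketch below uses that closed form to recompute p-values from the chi-square statistics reported in the Results. The helper `chi2_uniform` and its example counts are hypothetical, and the exact form of the data entered into the original test (counts versus mean proportions) is not assumed here.

```python
import math


def chi2_sf_df3(x):
    """Upper-tail probability of a chi-square variable with 3 degrees of freedom."""
    if x < 0:
        raise ValueError("chi-square statistic must be non-negative")
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


def chi2_uniform(counts):
    """Goodness-of-fit statistic against equal expected counts in every category."""
    expected = sum(counts) / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)


# Statistics as reported in the Results (before and during playback).
p_before = chi2_sf_df3(2.18)
p_during = chi2_sf_df3(12.88)
print(round(p_before, 3), round(p_during, 3))

# Hypothetical counts of elephants facing north, south, east, and west.
print(chi2_uniform([4, 3, 5, 8]))
```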

Signal calibration

To calculate the presentation levels of the seismic signals and to confirm that there was no signal in the air during seismic presentations, the field-recorded geophone and microphone signals for each trial were transferred from the digital audiotape (DAT) recorder and calibrated as shown below.

Geophone and microphone calibrations were performed a few degrees higher than the average field trial temperature (20 vs 16°C), and the equipment was set up in an identical manner to that in the field, albeit with shorter cables. The microphone was calibrated using a Bruel & Kjaer (type 4320) calibrator. The calibrator tone frequency of 1 kHz, at 94 dB SPL (re 20 μPa), was recorded onto a TASCAM DAT (digital) and converted to computer .wav files via a Digigram Vxpocket interface using Cool Edit Pro. Spectrum analyses were performed using customized software. A scaling factor of 6.83 Pa/full-scale resulted in 94 dB SPL for the calibration tone. Because the microphone sensitivity is flat to within 2 dB in the 20-Hz to 20-kHz region, the calibration at 1 kHz is assumed valid in the 20-Hz to 20-kHz region. Below 20 Hz, the microphone sensitivity decreases.
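The microphone calibration rests on the standard conversion between dB SPL (re 20 μPa) and RMS pressure. The Python sketch below shows that conversion and the fraction of full scale that the 94-dB calibration tone would occupy given the 6.83 Pa/full-scale factor. Treating the full-scale factor as an RMS quantity is an assumption made for illustration, and the function names are hypothetical.

```python
import math

P_REF_PA = 20e-6        # reference pressure for dB SPL, 20 micropascals
FULL_SCALE_PA = 6.83    # scaling factor from the text, Pa per full scale


def spl_to_pascal(db_spl):
    """RMS pressure in pascals for a level in dB SPL re 20 uPa."""
    return P_REF_PA * 10 ** (db_spl / 20)


def pascal_to_spl(p_rms):
    """Level in dB SPL re 20 uPa for an RMS pressure in pascals."""
    return 20 * math.log10(p_rms / P_REF_PA)


tone_pa = spl_to_pascal(94.0)                    # close to 1 Pa RMS
fraction_of_full_scale = tone_pa / FULL_SCALE_PA  # assumes an RMS full-scale value
print(tone_pa, fraction_of_full_scale)
```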

The geophone was calibrated by comparing its response to an accelerometer. The geophone and a Bruel & Kjaer accelerometer (type 4381) were mounted within 9 cm on a 91×54×0.5-cm solid aluminum table, mounted with a Buttkicker 5- to 200-Hz shaker below, and powered by a Buttkicker 400-W preamplifier. The geophone signal was simultaneously fed into a TASCAM DAT recorder, while the accelerometer signal was fed into a Bruel & Kjaer charge amplifier (type 2635), selected to convert accelerations to velocities. The output of the charge amplifier was the input to the SYSid (Version 7.0) measurement and analysis system (Puria et al. 1993). A tone at 100 Hz was synchronously generated by SYSid to vibrate the table. The DAT signal was transferred to a computer, and spectrum analyses were performed. A scaling factor of 423 μm/s RMS/full-scale resulted in the same geophone level as that from the accelerometer. The geophone is expected to be flat to within 0.6 dB in the 10- to 100-Hz frequency region.

The silent periods between signals within the 15-s playback were deleted (i.e., 4–7 s in Fig. 1b), and the 10-s segment from each signal was combined across the three playbacks, resulting in a 30-s continuous signal that was spectrally averaged. The analysis consisted of windowing 16,384 points (a window length of 372 ms), taking the fast Fourier transform (FFT) of each window, with a corresponding frequency resolution of 2.7 Hz, and squaring the result to convert to energy. In a 30-s period, about 80 windowed energy buffers were averaged at each frequency point. The averaged energy components were integrated across approximately one-sixth-octave band frequencies following Heffner and Heffner (1982). Additionally, all energy components were summed to obtain the decibel SPL as a function of time.
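The window length, frequency resolution, and number of averaged buffers quoted above follow directly from the FFT size and the sampling rate. The short Python check below assumes a 44.1-kHz sampling rate (the rate quoted elsewhere in the paper for the recordings) and non-overlapping windows; both are stated as assumptions rather than confirmed analysis settings.

```python
SAMPLE_RATE_HZ = 44_100   # assumed recording sample rate
N_FFT = 16_384            # points per analysis window
SIGNAL_S = 30.0           # length of the concatenated playback signal

window_s = N_FFT / SAMPLE_RATE_HZ        # duration of one window in seconds
resolution_hz = SAMPLE_RATE_HZ / N_FFT   # spacing between FFT frequency bins
n_windows = int(SIGNAL_S / window_s)     # whole non-overlapping windows in 30 s

print(round(window_s * 1000), round(resolution_hz, 1), n_windows)
```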

To ensure that the seismic signal was detectable at 25 m from the shakers, we used the standard surface wave attenuation rate model (White 1965),

$$L = L_{0} \cdot \frac{e^{-\alpha r}}{\sqrt{r}} \qquad (1)$$

to calculate attenuation based on levels that were used to calibrate the signal (14.6 dB higher than presentation level) (see Fig. 2b). The limitation of our shaker system was such that, at the presentation level (a level low enough that there was no air coupling), the signal was not strong enough to be detectable at the 90-m geophone. Equation (1) relates the velocity level $L$ (m/s) at distance $r$ to a source signal level $L_0$ and propagation constant $\alpha$. Using Eq. (1) and the recorded signal at 10 and 90 m from the source, we were able to calculate $L_0$ and $\alpha$. At calibration levels, the signal was detectable in geophones at 10 and 90 m away from the source (13.8 and −30.5 dB re 1 μm/s at 45 Hz, respectively); thus, we calculated a presentation level for trial #5 of −13.5 dB re 1 μm/s at 25 m (1 body length, or 5 m from the water hole), given the calculated $\alpha = 0.05$/m and $L_0 = 25.5 \times 10^{-6}$ m$^{3/2}$/s (see Fig. 2b). The high-level calibration signal was not used in trials because acoustic coupling was detected in the microphone at 10 m from the source.
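With levels at two distances, Eq. (1) can be solved in closed form for the two unknowns: taking the log of the ratio of the two levels isolates α, and substituting back gives L0. The Python sketch below applies this to the calibration levels quoted above (13.8 and −30.5 dB re 1 μm/s at 10 and 90 m). The function names are hypothetical, and a two-point solution is assumed here; the fitting procedure actually used may have differed.

```python
import math

REF_M_PER_S = 1e-6  # 1 um/s, the dB reference for velocity levels


def db_to_velocity(db):
    """Velocity in m/s for a level in dB re 1 um/s."""
    return REF_M_PER_S * 10 ** (db / 20)


def velocity_to_db(v):
    """Level in dB re 1 um/s for a velocity in m/s."""
    return 20 * math.log10(v / REF_M_PER_S)


def model_level(l0, alpha, r):
    """Eq. (1): velocity at distance r (m) for source level l0 and constant alpha."""
    return l0 * math.exp(-alpha * r) / math.sqrt(r)


def fit_two_point(r1, v1, r2, v2):
    """Solve Eq. (1) for (l0, alpha) from velocities at two distances."""
    alpha = (math.log(v1 / v2) - 0.5 * math.log(r2 / r1)) / (r2 - r1)
    l0 = v1 * math.sqrt(r1) * math.exp(alpha * r1)
    return l0, alpha


l0, alpha = fit_two_point(10, db_to_velocity(13.8), 90, db_to_velocity(-30.5))
print(l0, alpha)  # close to 25.5e-6 m^(3/2)/s and 0.05 /m

# Background parameters from the Fig. 2b caption, evaluated 5 m from the source.
noise_db = velocity_to_db(model_level(4e-7, 0.018, 5))
print(round(noise_db, 1))
```

Under these assumptions the recovered parameters agree with the quoted α=0.05/m and L0=25.5×10−6 m3/2/s, and the background parameters reproduce the −15.7 dB level given in the Fig. 2b caption.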

As an additional confirmation that there was no airborne coupling in any of the trials, we ran a series of matched filters on the microphone recordings during our seismic playback trials, using the playback signal recorded in the NG to create the matched filter. The matched filtering was done in Matlab 7.0 after the recordings were down-sampled from 44,100 to 1,000 Hz. The matched filter output was displayed as a spectrogram for visual inspection. All spectrograms were calculated using an FFT of 512 and a Hanning window with 98% overlap. There was no evidence of coupling in any of the trials (data not shown).

Results

The seismic signals were successfully played back to 11 different elephant breeding herds over a 3-week period. Three trials were excluded from the herd spacing, vigilance behavior, and body orientation analyses because either another herd arrived or the herd left before the start of the post-playback period.

In response to seismic playback trials, we found that breeding herds aggregated into significantly denser groups (Wilcoxon=26.0, n=8, n=7 for test, p=0.026; see Fig. 3) when compared with individual spacing within the herd prior to the presentation of the stimulus; this closer spacing is considered to be a defensive behavior (McComb et al. 2001). Vigilant behaviors increased after the playbacks but not significantly (Wilcoxon=9.0, n=8, p=0.117; Fig. 3). The body orientation of individual elephants within a herd changed significantly after exposure to the seismic stimulus. Prior to the playbacks, elephants were oriented randomly ($\chi^2$=2.18, df=3, p=0.536), but after presentation of the stimulus, they were no longer oriented randomly ($\chi^2$=12.88, df=3, p=0.005) but rather primarily (40%) to the west. To confirm that this measure was not simply an indication of the direction of departure, the directions of arrival and departure were noted for all experimental groups. There were no instances of herds arriving from the west and only one instance of a herd departing toward the west (data not shown).

Fig. 3

Responses of breeding herds to seismic playbacks of warning calls (means and standard deviations): inter-individual spacing, engagement in vigilant behaviors, and duration of time spent at the water hole. The changes in herd spacing and in the duration of time at the water hole were significant; vigilant behavior increased but not significantly.

The duration of time that herds spent at the water hole while exposed to the seismic alarm calls (n=11, mean herd size 19.18, range 3–35; Fig. 3) was significantly less than the duration of time spent at the water hole by herds that were not exposed to the seismic stimulus (n=16, mean herd size 18.5, range 6–32; Fig. 3; $F_{1,24}$=8.11, p=0.009). The effect of herd size on the duration of time at the water hole was not significant ($F_{1,24}$=3.74, p=0.065). There was no interaction between herd size and treatment ($F_{1,23}$=0.02, p=0.889), and herd size was therefore excluded from the final ANCOVA model.

The signal detected in all four sensors (NG/NM, RG/RM) prior to and while the seismic shakers were active was measured for all eight playback trials. By using the standard surface wave attenuation rate model [Eq. (1)], we were able to estimate the signal strength and noise at the location of the elephants. It was assumed that the noise was produced by a single source at the water hole (i.e., the elephants), as their footfalls create seismic noise, while the signal was created at the shakers 30 m from the water hole (Table 1). When averaged across trials, the mean geophone velocity increased from −20.1 to −3.1 dB re 1 μm/s for an overall signal-to-noise ratio (SNR) of 17 dB during the seismic playback in the 40- to 60-Hz frequency region at 10 m. For representative trial levels, see Fig. 4a (trial #5).

Table 1 SNR calculations for all eight trials at 10 m, 90 m and at the elephant distance of 25 m from the seismic source
Fig. 4

a The spectral response of near (NG) and remote (RG) geophones averaged over 30 s directly before (background activity) and while shakers were on for each of the three seismic playbacks (from trial #5 which is representative of the eight trials). b Spectral response of the near and remote microphones (NM/RM) in trial #5 averaged over 10 s directly before and during the first seismic playback. Also shown is the elephant minimum audible field (MAF) at the threshold of hearing (Heffner and Heffner 1982). c Spectral response of the NM in trial #5 averaged over 10 s directly before (1-Pre), during (2-During), and post (3-Post) playback, showing an increase in noise between 100 and 500 Hz upon initiation of playback and continuing after playback, most likely due to increased elephant activity in response to seismic stimuli

At the location of the elephants, the SNR is expected to be significantly lower due to a decrease in seismic signal level and an increase in the background activity level. When averaged across all eight trials, the mean seismic signal level is −13.6 dB re 1 μm/s, and the mean background activity level is −12 dB re 1 μm/s, for an overall SNR at the location of the elephants of −1.6 dB. There was no detectable acoustic signal from the shakers in the microphones in the frequency range we played back (Fig. 4b). In some trials, there was a slight increase in acoustic activity in the 100- to 500-Hz frequency range during presentation, most likely due to increased animal activity in response to the seismic stimuli, as this noise persisted following the presentation of the 15-s seismic signal (Fig. 4c). In the 20-Hz to 1.5-kHz range, the acoustic signal at both locations was typically near the threshold of hearing for Asian elephants (Fig. 4b,c).
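Because both levels are expressed in dB re 1 μm/s, each SNR quoted in this section is simply the difference between the signal and background levels. The Python sketch below reproduces the two reported values and converts the SNR at the elephants' location to a linear amplitude ratio; the function names are hypothetical.

```python
def snr_db(signal_db, noise_db):
    """SNR in dB when signal and noise share the same dB reference."""
    return signal_db - noise_db


def db_to_amplitude_ratio(db):
    """Linear amplitude (velocity) ratio corresponding to a level difference in dB."""
    return 10 ** (db / 20)


snr_near = snr_db(-3.1, -20.1)        # mean levels at the 10-m geophone
snr_elephants = snr_db(-13.6, -12.0)  # mean levels at the elephants' location
print(round(snr_near, 1), round(snr_elephants, 1))
print(round(db_to_amplitude_ratio(snr_elephants), 2))
```

An SNR of −1.6 dB corresponds to a signal velocity amplitude of roughly 0.83 times the background amplitude, so the signal and background were of comparable magnitude at that location.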

Discussion

In this study, we played back seismic replicates of elephant alarm vocalizations to wild elephant herds in their natural environment in Etosha National Park, Namibia, and observed significant behavioral changes in all eight of the herds tested. Elephants responded to the seismic stimuli presented in this study by clumping into tighter groups, changing their orientation from a random orientation to perpendicular to the playback source, and by leaving the water hole sooner than the control herds that were not presented with seismic stimuli.

Although the seismic alarm call playbacks did not elicit the same dramatic response as when they were transmitted acoustically, they nevertheless elicited significant defensive behaviors (clumping, orienting in a nonrandom direction, and leaving the water hole early). Since breeding herds actively protect their young (Dublin 1983), alarm calls from breeding herds would be of particular relevance. In this case, receiving a seismic signal, but not its acoustic counterpart, may have been interpreted by the elephants as a signal originating from a distant source, which could explain why their responses were not as dramatic as when they detected the acoustic counterpart of the signal. Another possible explanation for the responses being less dramatic is that the elephants may have detected the signal but may not have been able to interpret it correctly. While elephants in this study appeared to respond appropriately to the meaning of the antipredator signals by taking defensive postures, orienting, and leaving, the study was not designed to assess whether elephants can discriminate meaning from seismic stimuli, and thus, this is yet to be determined.

Humans are able to detect a signal with an SNR of −2 dB, below 200 Hz (Zwicker and Feldtkeller 1999), and it is reasonable to assume that elephants have at least this level of signal detection capability and probably better at the lower frequencies (see Gunther et al. 2004 for review). With regard to the strength of the seismic stimuli, the average SNR was low at the location of the elephants (mean −1.6 dB), but, since elephants were positioned in a range of anywhere from 10 to 25 m away from the source, at least some elephants would have been in a detectable range.

While a mechanism for seismic detection in elephants has not yet been elucidated, one possible pathway of detection would be through bone conduction to the elephants’ ears via the toes, as the hypertrophied malleus of the elephant middle ear is well suited to facilitate bone-conducted seismic cues (Reuter et al. 1998). It has also been suggested that seismic signals could be detected through as-yet-undescribed somatosensory structures in the feet (O’Connell et al. 1999) or by using an organ, such as the trunk, that has an extraordinary number of Pacinian and Meissner’s corpuscles sensitive to low-frequency vibrations (Rasmussen and Munger 1996) and could be placed on the ground at appropriate times (unpublished data).

This study provides the first demonstration that a large mammal can detect seismically propagated playbacks of conspecific vocalizations. Limitations in the technology needed to recreate seismic signals artificially in the field meant that we had to keep the distance between our seismic source and the elephants within about 30 m. The range over which a natural seismic signal created by an elephant might be detected, however, has yet to be determined empirically. Our most recent field studies, supplemented by modeling, indicated that elephants might be able to detect seismic vocalizations as far as 2 km away from their source (Gunther et al. 2004), while previous models predicted a maximum possible range of 16 km, depending on soil transmission characteristics (O’Connell-Rodwell et al. 2000a). Seismic cues could supplement acoustic information, replace acoustic information under poor atmospheric conditions, or, under ideal seismic conditions, even extend the elephant’s range of communication.

If elephants can monitor both the airborne and seismic components of other elephants’ vocalizations simultaneously, they could also receive information regarding the distance and bearing of the vocalizing animal. Seismic signals most often travel at a different velocity than their acoustic counterparts, which produces a differential time-of-arrival cue that increases with distance from the source and maximizes the potential for determination of range and bearing of the vocalizing individual. At two of our study sites, we measured a 17-m acoustic wavelength and a 12-m seismic wavelength at 20 Hz, which indicates ideal conditions for signal localization, with the 12-m wave having a more discrete phase angle and thus providing the feet or the ears with better information for locating the source of the signal.
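The differential time-of-arrival cue can be made concrete from the measured wavelengths. Since speed is wavelength times frequency, 17 m and 12 m at 20 Hz imply about 340 m/s in air and 240 m/s in the ground. The Python sketch below computes the resulting delay between the two arrivals and inverts it to recover range. It assumes constant, non-dispersive propagation speeds along a straight path, which is a simplification, and the function names are hypothetical.

```python
FREQ_HZ = 20.0
ACOUSTIC_WAVELENGTH_M = 17.0
SEISMIC_WAVELENGTH_M = 12.0

# Speed = wavelength x frequency.
C_AIR = ACOUSTIC_WAVELENGTH_M * FREQ_HZ     # 340 m/s
C_GROUND = SEISMIC_WAVELENGTH_M * FREQ_HZ   # 240 m/s


def arrival_delay_s(distance_m):
    """Seconds by which the seismic arrival lags the acoustic arrival."""
    return distance_m / C_GROUND - distance_m / C_AIR


def range_from_delay_m(delay_s):
    """Source distance implied by a measured seismic-minus-acoustic delay."""
    return delay_s / (1.0 / C_GROUND - 1.0 / C_AIR)


for d in (100, 500, 2000):
    print(d, round(arrival_delay_s(d), 3))
```

Under these assumptions the delay grows linearly with distance, reaching about 2.45 s at 2 km, which is why the cue becomes more informative for more distant callers.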

Intraspecific eavesdropping on acoustic, visual, or chemical cues to facilitate the detection of relevant signals has been described as an advantageous strategy in a variety of contexts. Examples include object assessment by dolphins (Xitco and Roitblat 1996), food location by whales (Madsen et al. 2002), territoriality in birds (Peake et al. 2002; Dabelsteen et al. 1997), combat between fighting fish (Oliveira et al. 1998), predation by bats (Heller 1995), and predation and parasitization in arthropods (Stowe et al. 1995). Seismic eavesdropping is also thought to be a successful strategy for predation on termites by the golden mole (Narins et al. 1997), on kangaroo rats by snakes (Randall and Matocq 1997), and in territoriality, predator avoidance and mate finding (Cokl and Virant-Doberlet 2003), predation (Sturzl et al. 2000), or parasitization (Fischer et al. 2001) in many arthropod species. Eavesdropping by keeping an “ear” to the ground could pay off for the elephant as well in the context of localizing a caller and estimating distance from the caller, coordinating herd movements, mate finding, predator avoidance, or detecting substrate changes to facilitate finding water or possibly even changes in weather patterns and seismic disturbances.