Introduction

Animals continuously adjust their behaviour in response to environmental stimuli (Dall et al. 2005). The information gathered from these stimuli can be encoded through acoustic, visual, tactile or chemical signals (Bradbury and Vehrencamp 1998). In communication networks, animals receive intended signals produced by individual emitters and respond adaptively, so that the interaction benefits both the emitter and the receiver (Smith and Harper 2003). Animals can also gain information through eavesdropping, i.e., by either seeking or opportunistically detecting conspecific or heterospecific unintended signals (Dall 2005; Valone 2007). Eavesdroppers can thus benefit by making behavioural adjustments that optimise their on-going fitness-enhancing activities (e.g., facilitating mate choice or foraging efficiency) (Blanchet et al. 2010).

In predator–prey and competition systems, detecting cues emitted by each other can be particularly important for both parties. Predators, for instance, can use signals produced by their prey to facilitate their localization (Clark 2004). Conversely, by detecting predator cues, prey species can perceive an immediate risk of predation and may decide to interrupt a fitness-enhancing activity to switch to an adaptive anti-predator response (Sündermann et al. 2008; Curé et al. 2015). In competition systems, eavesdropping on foraging cues released by a conspecific or heterospecific food competitor represents an opportunity for eavesdroppers to locate feeding sites at reduced costs (Voigt-Heucke et al. 2016).

Sound is the signal that propagates most efficiently over large distances and in various types of environments (Urick 1983). Therefore, species that are particularly mobile, communicate at distance and/or live in dark or obstructed environments rely strongly on acoustics to fulfil key life functions and behaviours. Sound is of particular importance in the marine realm. Cetaceans use sound as a primary sensory modality to communicate, orientate and locate prey (Tyack 1998; Tyack and Clark 2000; Deecke 2006; Hoelzel 2009).

Cetaceans belong to a complex trophic network in which predation and competition interactions occur at various levels. Therefore, they represent interesting model species for studying acoustic eavesdropping. Experimental methods to study acoustic communication and eavesdropping usually involve playback experiments, i.e., presenting acoustic stimuli to animals and monitoring their behavioural responses (McGregor 1992; Magrath et al. 2015). Eavesdropping in cetaceans has been poorly studied relative to terrestrial animals. Most studies have focussed on heterospecific cetacean dyads, especially predator–prey systems, demonstrating the importance of sound eavesdropping in the mediation of interspecific interactions (Fish and Vania 1971; Cummings and Thompson 1971; Tyack et al. 2011; Curé et al. 2013, 2015, 2019; Bowers et al. 2018; Benti et al. 2021). Regarding acoustically mediated interactions within species, some pioneering playback studies were conducted on the humpback whale (Tyack 1981, 1983; Mobley et al. 1988), the right whale (Clark and Clark 1980; Parks 2003) and the killer whale (Filatova et al. 2011), and showed that the detection of conspecific sounds could trigger behavioural responses. Since then, there has been a significant advance in the understanding of acoustic communication systems, particularly in the bottlenose dolphin (King and Janik 2013; King and McGregor 2016). However, there remains a knowledge gap regarding the potential use of acoustic eavesdropping within cetacean species, and its role in mediating social behavioural responses and interactions between conspecific individuals.

Here, we hypothesize that cetaceans can eavesdrop on conspecific sounds produced in different socio-behavioural contexts, to identify and anticipate potential forthcoming threatening or beneficial situations. To test this hypothesis, playback experiments were conducted on free-ranging Risso’s dolphins (Grampus griseus). The Risso’s dolphin produces known context-specific sounds: echolocation clicks and buzzes while foraging (Visser et al. 2021) and social sounds termed whistles, burst pulses and whistle-burst pulses (Corkeron and Van Parijs 2001; Neves 2013; Arranz et al. 2016). They have a complex social organisation in which females with calves usually remain apart from male groups, likely to protect their offspring from potential harassing behaviours from males (Hartman et al. 2008; FV pers. obs.). Males form long-term associations with other males and observations of agonistic behaviours between male groups suggest they can potentially compete for territorial resources or access to females (Hartman et al. 2020). Based on these observations, dolphins were exposed to three acoustic stimuli recorded in different behavioural contexts, that were expected to elicit contrasting horizontal movement responses: (1) conspecific foraging sounds potentially indicating the presence of prey, providing an attractive dinner bell signal, (2) male social sounds simulating a potential risk of agonistic interaction, and (3) female-calf social sounds likely representing no risk.

Methods

General protocol

Experiments with free-ranging Risso’s dolphins (Grampus griseus) were conducted off Terceira island, Azores (Portugal) during July/August of 2015, 2016, 2018 and 2019. Once a group of Risso’s dolphins was sighted (hereafter named focal group), a tagged animal (suction-cup attached DTAG, Johnson and Tyack 2003), if present, or a recognizable animal in the group was identified as the focal individual. Individuals were tracked visually (2015 and 2016) or using an unmanned aerial system (UAS; DJI Phantom 4 PRO; 2018 and 2019) in pre- and dur-periods. The pre-period was the period immediately preceding the start of the sound playback and had the same duration as the playback; the dur-period was the period of sound exposure (n = 21 playbacks; playback duration mean ± s.d. = 12 ± 3.5 min). Experiments were performed in the absence of other boats in the area.

Sex class assignment and tracking

Risso’s dolphins usually form groups mainly composed of males or females and calves, or mixed associations (Hartman et al. 2008). The focal group was defined following Visser (2014). The age- and sex-class of each individual of the focal group was defined using: (1) its sighting history (photo-identification catalogue, 2014-present), (2) the consistent presence/absence of a paired calf, and (3) published classification criteria such as body coloration and size (Hartman et al. 2015). Calves were defined as individuals of 75% or less the size of their associated adults, surfacing in close synchrony for at least 50% of observation time (Hartman et al. 2008). Adults paired with a calf on at least two observation days were identified as females. Adult males are larger and have lighter coloration than adult females (Hartman 2018). Larger individuals with a typical male coloration, i.e., white, a consistent association to other adults (part of stable group) and never recorded associated with a calf, were identified as adult males. Sub-adults and adults never sighted paired with a calf and without clear male coloration were classified as ‘unsexed’. At the focal group level, groups were classified as male or female group if all, or all but one or two individuals, were adult males, or adult females (with or without calves), respectively. Groups with individuals that could not be sexed and/or with both males and females, were classified as mixed groups.

Focal group size and composition were recorded during pre- and dur-periods using visual or UAS-based tracking. The UAS (DJI Phantom 4 PRO) recorded video of the focal group (4k resolution) while maintaining a height over the group of 20–30 m, at which we expected no disturbance (Christiansen et al. 2016; Torres et al. 2018). Hence, the frame size remained stable throughout the experiment and allowed for comparative analysis of the number of individuals in the frame between the pre- and dur-periods.

The horizontal path of the focal individual was used as a proxy of the horizontal response of the focal group. During visual tracking, the distance and bearing of the focal animal in relation to the research vessel location were collected at 2-min intervals or at first sighting following a longer dive (following Visser 2014). The geographic positions of the focal animal were then calculated using the GPS positions of the vessel during the tracking. The UAS-based tracking consisted of recording the exact location of the UAS and its height above sea level using its GPS and a custom-built laser. These UAS data allowed the GPS positions of the focal animal to be calculated at one record per second (Table S1) using semi-automated tracking software (open source, Kinovea version 0.8.15).
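The conversion from a vessel GPS fix plus a distance and bearing to an animal position can be sketched as follows. This is a minimal illustration only, not the procedure actually used: it assumes a spherical Earth, a bearing expressed in degrees clockwise from true north (a bearing measured relative to the vessel heading would first need the heading added), and a hypothetical function name `animal_position`.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius; spherical-Earth assumption


def animal_position(vessel_lat, vessel_lon, bearing_deg, distance_m):
    """Return (lat, lon) in degrees of the point distance_m away from the
    vessel along bearing_deg (degrees clockwise from true north)."""
    lat1 = math.radians(vessel_lat)
    lon1 = math.radians(vessel_lon)
    brg = math.radians(bearing_deg)
    ang = distance_m / EARTH_RADIUS_M  # angular distance travelled on the sphere

    # Standard great-circle "destination point" formula.
    lat2 = math.asin(
        math.sin(lat1) * math.cos(ang)
        + math.cos(lat1) * math.sin(ang) * math.cos(brg)
    )
    lon2 = lon1 + math.atan2(
        math.sin(brg) * math.sin(ang) * math.cos(lat1),
        math.cos(ang) - math.sin(lat1) * math.sin(lat2),
    )
    return math.degrees(lat2), math.degrees(lon2)
```

At the ranges involved here (a few hundred metres to about 1 km), a flat-Earth approximation would give practically the same result; the spherical formula is used only because it stays valid at any range.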

Sound playback experiments

To exclude potential behavioural responses to tagging, playback experiments started after a minimum recovery period of 30 min following the end of any tagging effort. Sounds were broadcast from an underwater omnidirectional loudspeaker (in 2015: Lubell LL9642T, frequency response 0.2–20 kHz; from 2016 to 2019: Oceanears DRS-12 underwater speaker, frequency response 0.2–110 kHz) deployed at 7–8 m depth from a 6–8 m rigid-hull inflatable boat (engine off), hereafter called the playback vessel, using a TASCAM DR-680MKII player connected to a SONY XM-N502 amplifier. The frequency responses of the loudspeakers overlapped with the main energy frequency spectrum of Risso’s dolphin vocalizations (between 1 and 20 kHz, Fig. 1). To ensure that sounds were faithfully played by the system without any distortion, playback sounds were monitored using a calibrated Bruel & Kjaer 8105 hydrophone (frequency response 0.1–160 000 Hz, sensitivity: −205 dB re 1 V/μPa) placed at 1 m from the speaker and recorded on a TASCAM DR40 recorder (sampling frequency: 96 kHz, resolution: 16 bit).

Fig. 1

Spectrograms and oscillograms representing a sound sample for each of the three stimulus types, foraging (Fo), social female-calf (SF) and social male (SM). Each of these samples illustrates the typical sounds composing the corresponding stimulus: click series and buzzes for Fo, whistle, burst pulses and whistle-burst pulse for SF and SM. Spectrograms were made using Seewave package of R software version 1.2.1335 (Sueur et al. 2008). Hanning window; FFT window size: 1024 points; overlap 75%

Acoustic stimuli were composed of natural sound sequences recorded by DTAGs (Johnson and Tyack 2003) deployed on different animals in previous years. Non-vocal sounds (e.g., breathing sounds, rubbing, bubble noise, background noise) were removed from the sound sequences using Adobe Audition (version 2.0). Recordings of conspecific sounds originated from female or male groups (as defined above). The foraging or socializing behavioural context of the recordings was determined from the presence of echolocation clicks and buzzes produced by the tagged dolphin during typical foraging dive cycles, as recorded on the tag (foraging; Arranz et al. 2016; Jensen et al. 2020), or from visual observations at the sea surface reporting social interaction associated with the production of social sounds (socializing). Three types of conspecific acoustic stimuli were prepared and broadcast in random order (Fig. 1). The “foraging stimulus” (Fo) comprised natural sequences of echolocation clicks and terminal buzzes (fast production of successive clicks) produced in a feeding context (i.e., during foraging dives). The “social female-calf” stimulus (SF) comprised natural sequences of social calls produced during social interactions in groups of females and calves, which included frequency-modulated whistles, broadband burst pulses and whistle-burst pulses (Corkeron and Van Parijs 2001; Neves 2013). The “social male” stimulus (SM) comprised natural sequences of social calls produced during social interaction in male groups, which included whistles, burst pulses and/or whistle-burst pulses. Three versions, each from a different DTAG recording, were prepared for each stimulus type to avoid pseudo-replication (McGregor 1992). All stimuli were prepared using natural sound sequences recorded from individuals in the study area (Azores), except one of the three versions of the SM stimulus, which was composed from recordings made in California (USA).

Acoustic stimuli had a sound pressure level ranging from 126 to 136 dB re 1 µPa at 1 m within the maximum spectral energy frequency band (1–20 kHz), thus not exceeding the natural level of Risso’s dolphin sounds (Madsen et al. 2004). Using the source level and the distance between the source and the animal at the start of playback (mean ± s.d.: 685 ± 268 m; range: 299–1198 m), the sound level received by the focal individual was estimated (mean ± s.d.: 74 ± 4 dB re 1 µPa; n = 22 playback experiments, range = 68–91 dB re 1 µPa) (Fig. S1). Given the hearing sensitivity of Risso’s dolphins (threshold at approximately 70 dB SPL within their best hearing frequency range 5–80 kHz; Nachtigall 2005; Mooney et al. 2012), acoustic stimuli were likely in the audible range of the species at the start of playback. Experiments for which the estimated received level was below the hearing sensitivity of Risso’s dolphins were excluded from analyses (n = 1, Fig. S1).
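A received level can be estimated from a source level and a range as sketched below. The propagation model is not stated in the text, so this sketch assumes simple spherical spreading (transmission loss of 20 log10(r) dB) with no absorption; the function name `received_level` is hypothetical.

```python
import math


def received_level(source_level_db, distance_m):
    """Estimate received level (dB re 1 µPa) at distance_m from a source whose
    level is given in dB re 1 µPa at 1 m.

    Assumption: spherical spreading only, TL = 20 * log10(r); absorption and
    boundary effects are ignored.
    """
    if distance_m <= 0:
        raise ValueError("distance_m must be positive")
    transmission_loss = 20.0 * math.log10(distance_m)
    return source_level_db - transmission_loss


# Source levels of 126 and 136 dB at the mean start-of-playback range (685 m).
low = received_level(126.0, 685.0)
high = received_level(136.0, 685.0)
```

Under this assumption, the two values come out at about 69 and 79 dB re 1 µPa, which brackets the reported mean received level of 74 dB re 1 µPa.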

A focal animal and its group were randomly exposed to 1–3 playback stimuli (Table S1) of 12 ± 3.5 min each (mean ± s.d., n = 21 playbacks), on the same day, with a minimum of 30 min recovery time between the successive playbacks. The playback vessel was positioned ahead of and to the side of the focal animal’s projected path, to be able to detect a potential approach or avoidance response to the playback sound source location (Fig. 2).

Fig. 2

Quantification of the horizontal movement response for two Risso’s dolphins exposed to foraging (Example 1) and social male stimuli (Example 2). Panels A–C correspond, respectively, to the three successive steps (1, 2 and 3) for the calculation of the coefficient of reaction. Panel A illustrates the calculation of the actual reaction score at tmax. The horizontal trajectory of the focal individual (continuous line) was obtained either by visual tracking (Example 1) or using an unmanned aerial system (UAS-based tracking) (Example 2). Bold line corresponds to the playback trajectory, which is the trajectory of the focal individual during the sound exposure. The “sound source” arrow represents the natural drift of the playback vessel during the sound exposure. The no-change trajectory is represented by the dashed arrow and corresponds to the theoretical trajectory if the focal animal had kept its initial direction during the playback. The tmax point is the time at which the individual exhibits its maximal reaction. Panel B shows the calculation of the theoretical attraction score at tmax. The maximal attraction trajectory is represented by the dashed line and corresponds to the theoretical trajectory if the focal animal had kept a continuous heading towards the sound source during the playback. Panel C represents the calculation of the coefficient of reaction. Example 1: the animal is closer to the maximal attraction trajectory than to the no-change trajectory, the coefficient of reaction is positive, and the response is thus considered an attraction response. Example 2: the animal has a course away from the maximal attraction trajectory, the coefficient of reaction is negative, and the response is thus considered an avoidance response

Quantification of changes in horizontal movement

To investigate the animal’s movement response to a sound playback, changes in the direction of horizontal movement of the focal animal were quantified by calculating a coefficient of reaction for each playback experiment. Similar to the movement reaction score developed by Curé et al. (2012), the coefficient of reaction developed in this study aimed at quantifying a potential horizontal attraction or avoidance response to playbacks. It compares the real horizontal trajectory of the focal animal during the playback, named the playback trajectory, with a theoretical no-change trajectory, i.e., the trajectory the animal would have followed if it had kept its initial direction of movement during the playback. The main difference from the approach of Curé et al. (2012) is that our coefficient of reaction was based on the animal’s position at its maximal response (maximal attraction or avoidance) over the course of the playback, rather than on its position at the end of the playback. The no-change trajectory is a reasonable null expectation because these animals usually keep a straight course and only sporadically change their direction of movement when they switch activity (pers. obs.). Therefore, if the animal did not react to the playback, a no-change trajectory was expected. The no-change trajectory corresponded to the projected path of the focal animal over the playback period using the real speed recorded during playback and the average direction of horizontal movement exhibited in the pre-period.
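One way to construct a no-change trajectory is sketched below. It is an illustration under stated assumptions rather than the authors' implementation: positions are taken to be already projected to local planar coordinates in metres (x east, y north), the average pre-period direction is taken to be the circular mean of the step headings, and the "real speed recorded during playback" is applied by advancing the same path length as the actual track at each time step. The function names are hypothetical.

```python
import math


def mean_heading(track):
    """Circular mean of the step headings (radians, counter-clockwise from the
    x axis) along a list of (x, y) positions."""
    sum_x = sum_y = 0.0
    for (x0, y0), (x1, y1) in zip(track, track[1:]):
        step = math.hypot(x1 - x0, y1 - y0)
        if step > 0:  # ignore repeated positions
            sum_x += (x1 - x0) / step
            sum_y += (y1 - y0) / step
    return math.atan2(sum_y, sum_x)


def no_change_trajectory(pre_track, playback_track):
    """Project the animal along its mean pre-period heading, advancing at each
    step by the distance it actually travelled during the playback."""
    heading = mean_heading(pre_track)
    x_start, y_start = playback_track[0]
    projected = [(x_start, y_start)]
    travelled = 0.0
    for p0, p1 in zip(playback_track, playback_track[1:]):
        travelled += math.hypot(p1[0] - p0[0], p1[1] - p0[1])
        projected.append(
            (x_start + travelled * math.cos(heading),
             y_start + travelled * math.sin(heading))
        )
    return projected
```

For an animal heading north in the pre-period that turns east during playback, the returned positions keep moving north at the playback speed, one position per playback fix.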

The calculation of the coefficient of reaction was based on three steps: (1) calculation of the actual reaction score, (2) calculation of the theoretical attraction score, and (3) calculation of the coefficient of reaction.

Step 1: Calculation of the actual reaction score

The following three parameters were calculated for each position of the focal dolphin collected during a playback trial: (1) the distance between the sound source and the theoretical dolphin position on the no-change trajectory, (2) the distance between the sound source and the actual dolphin position on the playback trajectory, and (3) the difference between these two distances, named the actual reaction score. Thus, for a given position, if the focal animal was closer to the sound source than it would have been if it had kept its initial course, the actual reaction score had a positive value and was classified as an attraction. By contrast, if the animal was further away from the sound source than it would have been on the no-change trajectory, the actual reaction score had a negative value and was classified as an avoidance response. Then, among all actual reaction scores calculated for the position data points composing the playback, the score with the largest magnitude (either positive or negative) was retained. This score was used to define the time tmax at which the maximal change of horizontal movement occurred.
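Step 1 can be sketched as follows, assuming time-aligned lists of planar (x, y) positions in metres for the sound source, the actual playback trajectory and the no-change trajectory. The function names are hypothetical, and "maximum value (either positive or negative)" is interpreted here as the largest absolute value.

```python
import math


def actual_reaction_scores(source, actual, no_change):
    """Per-position score: distance(source, no-change position) minus
    distance(source, actual position). Positive means the animal is closer to
    the source than expected (attraction); negative means avoidance."""
    return [
        math.dist(s, n) - math.dist(s, a)
        for s, a, n in zip(source, actual, no_change)
    ]


def maximal_reaction(scores):
    """Return (index of tmax, score at tmax), where tmax is the position with
    the largest absolute actual reaction score."""
    i_max = max(range(len(scores)), key=lambda k: abs(scores[k]))
    return i_max, scores[i_max]
```

With a source fixed at (100, 0), a no-change track running north from the origin and an actual track turning towards the source, every score after the start is positive and the retained score is the one furthest from zero.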

Step 2: Calculation of the theoretical attraction score at tmax

The maximal attraction trajectory was modelled as the path the focal animal would have followed if it had kept a continuous heading towards the drifting playback vessel from the start to the end of the playback, at the same speed as that recorded during the playback. The theoretical attraction score at tmax was then calculated as the difference between the distance from the sound source at tmax to the theoretical dolphin position on the no-change trajectory (i.e., theoretical no-change position at tmax) and the distance from the sound source at tmax to the theoretical dolphin position on the maximal attraction trajectory (i.e., theoretical maximal attraction position at tmax) (Fig. 2B).
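Step 2 can be sketched as a simple pursuit of the drifting source. This is an illustration under assumptions that are not specified in the text: positions are planar (x, y) coordinates in metres, the animal re-aims at the source position at the start of each step, it advances by the distance actually travelled in that step, and it stops at the source rather than overshooting. The function names are hypothetical.

```python
import math


def max_attraction_trajectory(playback_track, source_track):
    """Positions the animal would occupy if it headed straight for the
    (drifting) source at every step, moving as far as it actually moved."""
    x, y = playback_track[0]
    trajectory = [(x, y)]
    for k in range(1, len(playback_track)):
        step = math.dist(playback_track[k - 1], playback_track[k])
        sx, sy = source_track[k - 1]      # aim at the source as seen at step start
        gap = math.hypot(sx - x, sy - y)
        move = min(step, gap)             # assumption: do not overshoot the source
        if gap > 0:
            x += move * (sx - x) / gap
            y += move * (sy - y) / gap
        trajectory.append((x, y))
    return trajectory


def theoretical_attraction_score(source_pos, no_change_pos, max_attraction_pos):
    """Distance(source, no-change position) minus distance(source, maximal
    attraction position), all evaluated at tmax."""
    return math.dist(source_pos, no_change_pos) - math.dist(source_pos, max_attraction_pos)
```

The score is the largest reduction in distance to the source that the animal could have achieved by tmax at its observed speed, which is what makes it a natural scaling term for the actual reaction score.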

Step 3: Calculation of the coefficient of reaction

Variation in the position of the sound source and changes in trajectory prior to the start of the experiment produced variability in the geometry of the experiments. To render the coefficient of reaction independent from the experiment geometry, it was calculated as the ratio of the actual reaction score at tmax over the theoretical attraction score, for each playback experiment.

A coefficient of reaction with a positive or negative value indicated that the maximal movement response over the playback course was, respectively, an approach or an avoidance response (Fig. 2C).
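The three steps can be summarised compactly as follows. The notation is introduced here for illustration only and does not appear in the original description: S(t) is the sound source position, P(t) the actual position of the focal animal, P_nc(t) its position on the no-change trajectory, P_att(t) its position on the maximal attraction trajectory, and d(·,·) the horizontal distance between two positions.

```latex
\begin{align}
R(t) &= d\bigl(S(t),\, P_{\mathrm{nc}}(t)\bigr) - d\bigl(S(t),\, P(t)\bigr)
  && \text{(actual reaction score)} \\
t_{\max} &= \arg\max_{t}\, \lvert R(t) \rvert
  && \text{(time of maximal reaction)} \\
A(t_{\max}) &= d\bigl(S(t_{\max}),\, P_{\mathrm{nc}}(t_{\max})\bigr)
  - d\bigl(S(t_{\max}),\, P_{\mathrm{att}}(t_{\max})\bigr)
  && \text{(theoretical attraction score)} \\
C &= \frac{R(t_{\max})}{A(t_{\max})}
  && \text{(coefficient of reaction)}
\end{align}
```

As long as A(tmax) is positive, i.e., the maximal attraction position is closer to the source than the no-change position, the sign of C follows the sign of R(tmax): positive for an approach and negative for an avoidance.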

Statistical analysis

To investigate whether Risso’s dolphins discriminated between the three acoustic stimuli, Generalized Estimating Equation (GEE) models (Liang and Zeger 1986) were used. The effects of three predictors Stimulus (Fo, SF, SM), Year (animals tested in 2015, 2016, 2018 or in 2019) and playback Order (first, second or third playback) were tested on the response variable coefficient of reaction. To validate that the focal individual response was representative of the group response, a second model was fitted to test whether group size changed between the pre- and dur-periods. No change in group size indicated that group members remained together during the experiment phase. The GEE models included the focal group identity as a blocking unit to account for repeated measures. The coefficient of reaction and the group size were modelled as Gaussian response variables. A Jackknife variance estimator was used to avoid biases induced by the small sample size (Paik 1988). The model selection was based on a backward selection using p-values given by an ANOVA (Wald test) model: the factor with the highest non-significant p-value (> 0.05) was removed and the GEE model was refitted (Curé et al. 2012). Assumptions for the normality of the residuals and the homogeneity of variances were verified before running the analysis. GEE pairwise comparisons were performed with a Bonferroni correction applied at p-value < 0.025. All statistical analyses were conducted using the software R (package geepack v1.3-1) (Carey et al. 2012).
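The backward selection loop described above can be sketched independently of the model-fitting library. The analysis itself was run in R with geepack; the sketch below is not that code. It takes a hypothetical callback `fit_pvalues`, which is assumed to refit the model with the given predictors and return one Wald p-value per predictor.

```python
def backward_selection(predictors, fit_pvalues, alpha=0.05):
    """Repeatedly drop the predictor with the highest non-significant p-value
    and refit, until every remaining predictor has p <= alpha."""
    kept = list(predictors)
    while kept:
        pvals = fit_pvalues(kept)                     # refit with current predictors
        worst = max(kept, key=lambda name: pvals[name])
        if pvals[worst] <= alpha:                     # everything left is significant
            break
        kept.remove(worst)
    return kept


def example_fit(names):
    """Stand-in for a real model fit. The p-values are made up for
    illustration only and are not results from the study."""
    made_up = {"Stimulus": 0.01, "Year": 0.40, "Order": 0.70}
    return {name: made_up[name] for name in names}


selected = backward_selection(["Stimulus", "Year", "Order"], example_fit)
```

In a real analysis the p-values change at every refit, which is why the callback is called inside the loop rather than once up front.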

Results

A total of 14 Risso’s dolphin focal individuals and associated groups were tracked and exposed to one (n = 7 groups), two (n = 5) or three (n = 2) playbacks (Table S1). Focal groups included six male groups, two female groups and seven mixed groups (Fig. 3, Table S1).

Fig. 3

Risso’s dolphin avoidance and attraction responses to conspecific sounds. Data represent the coefficients of reaction (mean ± s.e.m.) for the change in horizontal movement of focal animals exposed to foraging sounds, social female-calf sounds and social male sounds. Coefficient of reaction > 0 indicates attraction, coefficient of reaction < 0 indicates avoidance. Each focal individual is represented by a symbol

During playback, groups maintained the same group size (mean ± s.d. = 7 ± 4; n = 14 groups) as prior to the playback (no significant change in number of individuals between pre- and dur-periods; ANOVA: χ² = 0.40; p-value = 0.53). Group composition was also verified throughout the experiments from both the visual and UAS-based tracking. Hence, the movement trajectory observed at the individual level was coordinated with that of the group members.

Risso’s dolphins significantly changed their movement trajectory in response to the playbacks. The best-fitting GEE model showed a significant effect of the factor Stimulus on the degree and direction of a change in movement (coefficient of reaction; ANOVA: χ² = 12.2; p-value = 0.002). Neither playback order nor year of exposure was retained in the best-fitting GEE model. During six out of seven social female-calf playbacks and six out of seven foraging playbacks, male and mixed focal groups changed their initial direction of horizontal movement to approach the sound source (Fig. 3; Table S1).

In contrast, during the majority of social male playbacks (five out of seven), both male and female groups, as well as mixed groups, changed their horizontal course away from the sound source. One male group (gg18_197) and one female group (gg16_168) showed an attraction to the social male stimulus.

Individuals exposed to multiple playbacks reacted consistently. One mixed group exposed to two different versions of the foraging stimulus (gg15_191) showed a strong approach response to both playbacks. Two out of three groups tested with both foraging and social female-calf stimuli approached the sound source during both playbacks. The mixed group exposed to the foraging stimulus and then to the social male stimulus (gg16_171) approached the sound source in response to the first and showed an avoidance response to the second.

The avoidance response induced by the social male stimulus was significantly different from the approach response exhibited during the foraging stimulus (mean (s.e.m.): −33.0 (18.4) vs. 51.2 (16.1); GEE: estimate ± s.d. = 84.2 ± 29.8, p-value = 0.005; Fig. 3) and during the social female-calf stimulus (51.7 (14.7); GEE: 84.7 ± 29.7, p-value = 0.004; Fig. 3). There was no difference between the attraction responses to the foraging and social female-calf playbacks (GEE: 0.5 ± 21.2, p-value = 0.98).

Discussion

Our study presents the first evidence that cetaceans can eavesdrop on conspecific sounds and adjust their behaviour according to the perceived behavioural context. Risso’s dolphins (Grampus griseus) exposed to foraging or social female-calf sounds approached the sound source location. In contrast, they avoided the location of male social sound playbacks.

Eavesdropping on perceived threatening stimuli can be used by animals to anticipate a potentially costly interaction and to adjust their trajectory accordingly to avoid the detected threat (Valone 2007; Magrath et al. 2015). The horizontal avoidance response to male social calls suggests that Risso’s dolphins may perceive the presence of conspecific males as a potential threat. Risso’s dolphins in California showed a horizontal avoidance response to killer whale sound playbacks simulating the nearby presence of potential predators (Bowers et al. 2018). Further anecdotal support is provided by an avoidance response to one playback of killer whale sounds (this study, playback #2 of gg16_171 not included in the analysis, Table S1). These observations support our finding that Risso’s dolphins exhibit an avoidance strategy in response to a potential perceived threat. Risso’s dolphins may thus eavesdrop on the social sounds of conspecific males to anticipate and avoid a potential agonistic interaction. One identified male group exposed to male social sounds avoided the playback location, potentially to anticipate forthcoming agonistic interactions with other males (Hartman et al. 2008). Avoidance of males by females is also in accordance with natural observations of females and calves usually remaining apart from male groups (Hartman et al. 2008, 2015). In addition, the avoidance response to male social sounds was observed in response to both sounds recorded in the study area and unfamiliar sounds recorded in California. This indicates that both stimuli are likely perceived as a potential threat regardless of their familiarity (Deecke et al. 2002).

Conversely, the attraction response to foraging sounds of conspecifics supports the dinner bell effect hypothesis. Studies in bat species have demonstrated that eavesdropping on foraging sounds such as echolocation signals may facilitate the location of a food patch for eavesdroppers (Voigt-Heucke et al. 2016). Thus, echolocation signals produced by Risso’s dolphins while foraging could be perceived as a feeding opportunity by female or male eavesdroppers, who may then initiate food searching behaviours, characterized by an approach toward the location of the foraging signals, deep dives and the production of click series (Arranz et al. 2016). Further experiments with tagged animals are needed to increase sample size, to explore more behavioural metrics, e.g., echolocation and diving behaviour, and to confirm the dinner bell effect hypothesis.

The attraction response to the social female-calf sounds could be explained by two main hypotheses, depending on the sex of receivers (McGregor 1992). As reported among a large number of taxa, if the receiving group is composed of males, social sounds produced by females could be perceived as a potential mating opportunity (Mountjoy and Lemon 1991; Connor and Krützen 2015). In contrast, if the receiving group is composed of females, the calf calls contained in the social female-calf stimulus may stimulate other females for alloparental care, as shown in terrestrial mammals (Lee 1987; Lingle and Riede 2014).

Compared to the method used by Curé et al. (2012) to quantify the movement response, the coefficient of reaction developed in the present study has the advantage of taking into account potential multiple changes in the direction of horizontal movement occurring during the playback, as commonly observed in responding Risso’s dolphins. It can also be applied when the sound source position relative to the position of the focal individual varies across experiments, as this coefficient is independent of the experimental geometry (i.e., distance and angle of the sound source from the dolphin path at playback onset). The reaction score was not impacted by the different temporal resolutions of visual vs. UAS-based horizontal tracking methods (Table S1), which suggests that, for this species, a sampling rate of one position every 2 min was sufficient to identify horizontal movement responses.

A strong added capability of UAS-based tracking is the unprecedented continuous monitoring and quantification of the behaviour of the animal and its group members, including below the surface. In particular, it was possible to record that animals often turned their head towards the sound source during playbacks. This indicated that they could hear the sound stimuli and, moreover, that they could orientate towards the sound source, potentially for sound localization (Movie S1). As expected, we did not observe any evidence of behavioural changes associated with the use of the UAS-tracking system. A recent study has also shown that a UAS flying over Risso’s dolphins at a lower height than the one we applied (7–15 m vs. 20–30 m) did not have any effect on the animals’ behaviour (Hartman et al. 2020).

Conclusions

By developing a coefficient for quantifying horizontal movement response, we showed that Risso’s dolphins eavesdrop on conspecific sounds. Sound eavesdropping allowed individuals to identify different foraging and social contexts and to adjust their behaviour accordingly, choosing to avoid or approach nearby conspecifics. Hence, Risso’s dolphins employ acoustic eavesdropping to mediate intra-specific interactions.