Introduction

Recent studies of sexual vocal communication in mammals have highlighted how spectral acoustic components encode information on caller phenotypes (Briefer et al. 2010; Charlton et al. 2009; Sanvito et al. 2007; Taylor and Reby 2010) and how receivers attend to this acoustic variation when assessing competitors (Reby et al. 2005) or choosing mating partners (Charlton et al. 2007a; Reby et al. 2010). However, in species where more than one call type is given in the same context, it is interesting to examine the different effects that these call types have on receivers and also to investigate how the occurrence of one call type affects the response of receivers to the other(s). Work on primates has shown that combinations of alarm call types can affect receiver responses, with one call type functioning as a semantic modifier (Zuberbuhler 2002). However, to our knowledge, no studies of mammal vocal sexual communication have compared the function of different call types nor investigated the effect of their co-occurrence on receivers.

Red deer (Cervus elaphus) stags have evolved two clearly distinct calls that are specific to male rutting behaviour (Fig. 1), the common roar and the harsh roar (Reby and McComb 2003a). Common roars are produced at high rates during the breeding season (Clutton-Brock et al. 1982) and have functional roles in male–male competition (Clutton-Brock and Albon 1979; Reby et al. 2005) and female mate choice contexts (Charlton et al. 2007a; McComb 1991; Reby et al. 2010). While harsh roars are given more rarely than common roars (representing less than 20% of roars: Reby and Charlton, personal observation), they are typically produced after intense herding of females and/or during vocal contests with other males (Reby and McComb 2003b; Reby et al. 2005), indicating that they also function in both inter- and intra-sexual communication.

Fig. 1 Spectrogram of a bout of common roars (a) followed by a bout of harsh roars (b) from the same red deer stag

Common roars are given in short series (called bouts) and are characterised by relatively slow amplitude onsets and offsets, a mostly harmonic structure and periodic source, and extensive formant modulation corresponding to vocal tract lengthening during the course of the vocalisation (Fitch and Reby 2001). Acoustic investigations have shown that the minimum formant frequencies of red deer roars contain honest information on body size (Reby and McComb 2003a), and playback experiments of resynthesised roars have shown that receivers pay attention to this information when assessing competitors (Reby et al. 2005) and choosing mating partners (Charlton et al. 2007a). In contrast, harsh roars are given in bouts of shorter calls characterised by abrupt amplitude onsets and offsets (conferring a staccato quality), deterministic chaos (episodes of non-random noise produced by chaotic vocal fold vibration) and very little formant modulation, as the vocal tract is fully extended before the vocalisation commences (Reby and McComb 2003a). Together, these characteristics are likely to make this call type much more conspicuous to receivers than the common roar. In particular, the noisy, chaotic acoustic structure of male red deer harsh roars is likely to make these calls difficult to habituate to (Blumstein and Récapet 2009; Fitch et al. 2002; Owren and Rendall 2001), indicating that they may function to engage the attention of receivers that have become habituated to the high rates of common roars delivered during the breeding season.

In this experimental study, we investigate whether red deer hinds respond more strongly to bouts of harsh roars than to bouts of common roars and whether the production of a harsh roar bout affects hinds’ attention to subsequent bouts of common roars from the same stag. To this end, we played back a series of four common roar bouts followed by a harsh roar bout, then subsequently four more common roar bouts from the same male. Based on previous theoretical predictions (Fitch et al. 2002; Owren and Rendall 2001) and on the results of recent playback studies on other mammals (Blumstein and Récapet 2009; Townsend and Manser 2011), we expected harsh roars to evoke heightened responses in hinds: specifically, we predicted that hinds should respond more strongly to the harsh roar bout than to the first common roar bout and that they should respond more strongly to the common roar bouts given after the harsh roar bout than to those given before.

Materials and methods

Study site and animals

The research was conducted at the Institut National de la Recherche Agronomique (I.N.R.A.) Redon experimental deer farm, Clermont-Ferrand, France, during the 2005 autumn breeding season. We tested 20 red deer hinds (of Scottish origin) aged between 10 and 15 years (mean = 12.6).

Playback stimuli

The bouts of common roars and harsh roars used to construct the playback sequences originated from recordings made between 1976 and 2001 of five free-ranging adult red deer stags (recorded by DR and K. McComb on the Isle of Rum) and six farmed adult red deer stags of Scottish origin (recorded in New Zealand by DR). Recordings were made with Sennheiser MKH 816 and MKH 416 microphones linked to a Uher 4200 Report Monitor open reel, a Marantz CP 230 cassette recorder or a HHB PDR 1000 professional DAT recorder. Recordings were captured or digitised at a 44.1 kHz sampling rate and 16-bit amplitude resolution.

Each playback sequence consisted of nine different bouts of roars originating from the same stag (recorded by the same researcher, using the same professional equipment) and separated by 20-s intervals of silence. The sequences included four bouts of common roars (CR1-4), followed by one bout of harsh roars (HR) and four bouts of common roars (CR5-8); see Electronic Supplementary Material 1. Such sequences reflect the natural pattern of red deer roaring, with harsh roars interspersed between the more frequently produced common roars (Reby and McComb 2003b).

Common roar bouts were composed of two to three (average ± SD = 2.60 ± 0.54) common roars and lasted between 2.48 and 8.72 s (average ± SD = 5.46 ± 1.51 s). Harsh roar bouts were composed of three to eleven typically shorter harsh roars (average ± SD = 6.18 ± 2.30) and lasted between 2.20 and 6.77 s (average ± SD = 4.20 ± 1.43 s). Bouts were selected from the best quality exemplars and randomly ordered in the sequence. Two playback sequences were created for each of the 11 stag exemplars by alternating the series of four common roar bouts that came before and after the harsh roar bout, resulting in 22 unique playback sequences. We did not attempt to standardise bout duration or any other spectral parameters, in order to preserve the natural variability of common and harsh roar bouts. The amplitude of all bouts was normalised to 99% peak.
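
The two preparation steps described above (peak normalisation and the creation of two sequences per stag) can be sketched in Python as follows. This is only an illustrative sketch, not the procedure actually used: the function names are hypothetical, bouts are represented as plain lists of samples or as labels, and "alternating" is read here as swapping the two blocks of four common roar bouts around the harsh roar bout, which is an assumption.

```python
from typing import List, Sequence


def normalise_peak(samples: Sequence[float], target: float = 0.99) -> List[float]:
    """Scale a bout so that its largest absolute sample equals `target` of full scale."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return list(samples)  # silent input: nothing to scale
    gain = target / peak
    return [s * gain for s in samples]


def build_sequences(common_bouts: List[str], harsh_bout: str) -> List[List[str]]:
    """Return two playback orders for one stag (assumed reading of the text):
    the two blocks of four common roar bouts swap places around the harsh roar bout."""
    if len(common_bouts) != 8:
        raise ValueError("expected eight common roar bouts")
    first, second = common_bouts[:4], common_bouts[4:]
    return [first + [harsh_bout] + second, second + [harsh_bout] + first]
```

Applied to each of the 11 stags, `build_sequences` would yield the 22 sequences mentioned above; the 20-s silent intervals between bouts are not modelled here.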

Playback experiments

The playback experiments were conducted between 9 and 11 am. Hinds were brought into an experimental enclosure containing a 1.5-m-long grain-filled trough, positioned 20 m away from the speaker and perpendicular to the speaker-hind axis. In order to standardise their position and facilitate the coding of head movements, females were positioned behind the trough and facing the loudspeaker. Two females that failed to stabilise in this position were excluded from the experiment. All other females remained in this position throughout the playback trials. The females had not been fed for at least 12 h and were provided with enough grain to last throughout the playback sequences in order to prevent dishabituation occurring due to a lack of food. To ensure a standardised context at the onset of each playback experiment, the playback sequences were only initiated when the subjects were feeding with their head down. The playback sequences were presented using an Anchor Audio Liberty 6000HIC loudspeaker at 105 dB peak SPL at 1 m from the source (determined using a Radio Shack Sound Level Meter set for C-weighted fast response). The speaker was placed at a height of 1.5 m from the ground, 20 m away from the females’ position at playback onset and hidden behind a semi-opaque farm gate (2 m high by 4 m wide).

Behavioural analysis

Female behaviour was videotaped during the experimental period using a Sony video camcorder (model DCR/PC120E), and the video sequences were analysed frame-by-frame (frame = 0.04 s) using Gamebreaker 7.0.121 (SportsTec, Sydney, Australia). We measured the duration of the first look given towards the playback source. Looking was defined as starting when the hind raised or turned her head towards the speaker, having previously faced down (feeding) or away, and ended when the head moved away from the playback source. Looks were considered to be towards the playback source when the subject’s head was oriented within 180 degrees of the speaker position. Video analyses were carried out by BC and DR. The behavioural responses of four randomly selected females (20% of the subjects) were then double-coded by an independent observer (Dr Megan Wyman), and an overall agreement of 99.2% was achieved on the 36 corresponding looking duration measures (98.51%, N = 18, excluding zero durations).
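
As a rough illustration of how a frame-by-frame coding translates into a looking duration, the sketch below converts coded onset and offset frames into seconds using the 0.04-s frame duration given above. The function name and the convention that the offset frame is the first frame in which the head has moved away are assumptions made for the example.

```python
FRAME_DURATION_S = 0.04  # duration of one video frame, as given in the text


def look_duration(onset_frame: int, offset_frame: int) -> float:
    """Seconds elapsed from the onset frame up to, but not including, the offset frame."""
    if offset_frame < onset_frame:
        raise ValueError("offset frame precedes onset frame")
    return (offset_frame - onset_frame) * FRAME_DURATION_S
```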

Statistical analyses

In order to investigate the effect of roar type (common versus harsh roar) on female behavioural responses, we compared the duration of their looks given in response to the first bout of common roars (lookCR1) and the bout of harsh roars (lookHR). In order to characterise the overall response of females to the bouts of common roars before and after the presentation of the harsh roar bout, we averaged the looking duration across the common roar bouts occurring before the harsh roar bout (lookCR1-4) and across the common roar bouts occurring after the harsh roar bout (lookCR5-8). Because we played sequences from 10 exemplars to 18 different hinds, 8 exemplar sequences were played to two different hinds and 2 exemplar sequences were played to only one hind. In order to avoid pseudo-replication, we followed Kroodsma et al. (2001) and averaged the two hinds' looking responses to obtain a single looking response value for all the exemplars that were used twice. Our data points, therefore, consist of exemplars rather than individuals (Wiley 2003), and our sample size is 10. Since the data were normally distributed (Shapiro–Wilk: all P > 0.05), paired t-tests were used to compare lookCR1 with lookHR, and lookCR1-4 with lookCR5-8. All statistical analyses were performed using SPSS 16.0 for Mac OSX. Significance levels were set at 0.05, and two-tailed probability values are quoted.
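
The exemplar-level averaging and the paired comparison described above can be sketched as follows. The analyses themselves were run in SPSS; this Python sketch only illustrates the logic, and the function names and input layout (one tuple per hind) are hypothetical. It returns the t statistic and degrees of freedom only; a two-tailed P value would additionally require the t distribution.

```python
import math
from collections import defaultdict
from statistics import mean, stdev


def average_by_exemplar(trials):
    """trials: iterable of (exemplar_id, look_a, look_b), one tuple per hind.
    Returns one (look_a, look_b) pair per exemplar, averaging hinds that shared it."""
    grouped = defaultdict(list)
    for exemplar, look_a, look_b in trials:
        grouped[exemplar].append((look_a, look_b))
    return {
        exemplar: (mean(a for a, _ in pairs), mean(b for _, b in pairs))
        for exemplar, pairs in grouped.items()
    }


def paired_t(pairs):
    """Paired t statistic and degrees of freedom for a list of (a, b) pairs."""
    diffs = [b - a for a, b in pairs]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1
```

With 10 exemplars, `paired_t(list(average_by_exemplar(trials).values()))` would give a statistic on 9 degrees of freedom, matching the design described here.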

Results

Females looked significantly longer towards the speaker in response to the harsh roar bout (HR: Mean ± SE = 8.50 ± 1.24 s) than they did in response to the first common roar bout (CR1: Mean ± SE = 4.90 ± 1.19 s, t(9) = 3.974, P = 0.003, see Fig. 2a). In addition, when we considered the mean looking duration of females to all four common roar bouts presented before and after the harsh roar bout, females looked significantly longer towards the speaker in response to the four common roar bouts played after the harsh roar (CR5-8: Mean ± SE = 4.50 ± 0.55 s) than they did in response to the four common roar bouts given before the harsh roar (CR1-4: Mean ± SE = 3.75 ± 0.59 s, t(9) = 2.930, P = 0.017, see Fig. 2b). Female responses to the individual roar bouts (CR1-8 and HR) are given in the Electronic Supplementary Material 1.

Fig. 2 Error bar charts showing means ± SE of hind looking responses to: (a) the first common roar bout (CR1) versus the harsh roar bout (HR), and (b) the four common roar bouts given before the harsh roar (CR1-4) versus the four common roar bouts given after the harsh roar (CR5-8). *P < 0.05; **P < 0.005; P-values are from two-tailed t-tests; N = 10

Discussion

Here, we show that female red deer give stronger looking responses to the playback of male harsh roar bouts than to the playback of male common roar bouts. The looking duration in response to the harsh roar bout was more than twice the average of the looking responses to the first four bouts of common roars and substantially longer than the looking response to the first common roar bout. This indicates that females not only dishabituate but also show a much stronger reaction to the harsh roar bout than they do to common roar bouts. We also found that the broadcast of the harsh roar bout increased the level of response to subsequent bouts of common roars. This is in clear contrast to previous experimental studies in which red deer hinds typically habituated to repeated playbacks of common roars from the same male (Charlton et al. 2007b; Reby et al. 2001). Accordingly, we suggest that harsh roar bouts have an attention-grabbing and dishabituation function in this species’ inter-sexual communication.

Harsh-sounding calls often function to recruit attention or prevent habituation in other mammals (Manser 2001; Slocombe and Zuberbuhler 2007) and are also known to be particularly evocative to humans (Belin et al. 2008). Indeed, the deterministic chaos that characterises the harsh roars of red deer stags, as well as the noisy screams or cries present in other mammal vocal repertoires (Blumstein et al. 2008; Facchini et al. 2005; Gouzoules et al. 1984; Tokuda et al. 2002), may constitute an adaptation to maintain the attention of listeners (Fitch et al. 2002; Owren and Rendall 2001). Because red deer harsh roars are characterised by broadband spectra and produced with a fully extended vocal tract, they typically have more salient and lower formant frequencies than common roars (Reby and McComb 2003a) and may, therefore, be more attractive to hinds (Charlton et al. 2007a, 2008). However, because we used naturally occurring calls to preserve the ecological validity of our experiment, we cannot determine the relative contribution of the deterministic chaos, staccato temporal structure, or maximally lowered and salient formant frequencies to the observed increases in female attention to these calls and subsequent common roars. Future experiments using resynthesis techniques could investigate this by independently controlling the variation of these factors.

The function of harsh roars as dishabituators is consistent with the observation that while common roars are produced at very high rates by male red deer during the breeding season, bouts of harsh roars are far less frequent (Clutton-Brock et al. 1982). If harsh roars trigger stronger reactions in hinds because of intrinsic acoustic properties, we may predict that males would be selected to only produce this type of roar. However, harsh roar bouts clearly require more effort to produce than common roars (they are given with a more fully tilted neck and more fully extended vocal tract: Reby and McComb 2003a) and, therefore, are likely to be more energetically costly to produce. Moreover, harsh roars do not clearly advertise the vocaliser’s fundamental frequency, a parameter that has been shown to influence behaviour in oestrous females (Reby et al. 2010). Together, these factors may constrain the frequency at which bouts of harsh roars are emitted and hence enhance their attention-grabbing and dishabituation functions. It is interesting to note that both fallow deer (Dama dama) and Corsican deer (Cervus elaphus corsicanus) males also give distinct harsh versions of their sexual calls in high arousal contexts (Kidjo et al. 2008; McElligott and Hayden 2001; Vannoni and McElligott 2007). These observations call for a systematic investigation into the occurrence and function of harsh versions of sexual calls across a wide range of mammals, in both inter- and intra-sexual contexts.

The diversification (size/complexity) of mammal vocal repertoires has been attributed to selection pressures resulting from complex social interactions accompanying increases in group size (McComb and Semple 2005) or to the evolution of functional referentiality in anti-predator calls (Blumstein 1999; Ouattara et al. 2009; Zuberbuhler 2002) or food calls (Bugnyar et al. 2001; Evans and Evans 1999; Slocombe and Zuberbuhler 2005). While sexual selection has been proposed as the main mechanism behind the evolution of increased vocal complexity in bird songs (Catchpole and Slater 1995), there is little evidence of equivalent complexity in mammal sexual signals, with a few possible exceptions (for example gibbons: Cowlishaw 1996), presumably because most terrestrial mammals lack vocal learning abilities (Fitch 2000). The results presented here indicate that selection for soliciting and maintaining receivers’ attention in sexual contexts may also lead to the emergence of discrete call types and thereby contribute to the diversification of mammal vocal repertoires. They also emphasise the importance of considering discrete call types and their possible interactions, when investigating the function of mammal sexual signals.

This work follows the Association for the Study of Animal Behaviour guidelines for the use of animals in research and was conducted under authorisation A37801 of the French Ministry of Agriculture.