Introduction

Vocal signals play a key role in most biological functions, including reproduction (Catchpole & Slater, 2003; Delgado, 2006), predation avoidance (Macedonia & Evans, 1993; Scheumann et al., 2007; Zuberbühler, 2009), sociality (Radford & Ridley, 2008; Waser, 1975), and intergroup competition (Byrne & da Cunha, 2006; de Kort et al., 2009; Ramanankirahina et al., 2016). Although the selective advantage of these signals is usually evident, it often is unclear why some species have evolved larger repertoires for the same functions than others and why some acoustic structures prevail over others (Endler, 1992; Leighton & Birmingham, 2020; Wilkins et al., 2013).

Three factors seem to play a key role in the evolution of animal vocal signals: habitat structure, predation, and sociality (Catchpole & Slater, 1995; Freeberg et al., 2012; Waser & Brown, 1986; Zuberbühler & Jenny, 2002). First, habitat can influence the structure and use of vocal signals. For example, visually dense habitats generally favour acoustic communication (Marler, 1967), with propagation properties and local “soundscapes” having a direct impact on signal evolution (Brown & Waser, 1988; Marler, 1967; Marten & Marler, 1977; Waser & Brown, 1986). Depending on the proximity of the targeted recipient (close, long-distance), different signal structures are favoured to maximise the transmission efficacy and minimise the costs imposed by unintended overhearers (Dabelsteen et al., 1998; Ruxton, 2009; Waser & Waser, 1977).

Second, predation is generally thought to enhance signal diversification, both to inform conspecifics (Blumstein, 1999a, 1999b; Furrer & Manser, 2009; Macedonia & Evans, 1993) and to affect predators (Shelley & Blumstein, 2005; Zuberbühler et al., 1997). An important factor here is whether signallers can actively interfere with a predator’s hunting technique, either by communicating or by minimising detection. They can do so either through behavioural adaptations (e.g., altering or inhibiting signal production) or by evolving signal structures that are difficult to detect (e.g., “seeet” alarms of passerines; Jones & Hill, 2001; McGraw et al., 2007; Morisaka & Connor, 2007; Ruxton, 2009; Wilson & Hare, 2004). The same predator fauna can sometimes lead to different evolutionary outcomes, even in closely related prey species. For instance, predation by coyotes (Canis latrans) has affected two closely related deer species differently, due to basic differences in anti-predator behaviour (Lingle, 2001). Although of similar size, white-tailed deer (Odocoileus virginianus) flee from coyotes while mule deer (O. hemionus) fight back. As a result, natural selection appears to have favoured larger, more cohesive groups in mule than in white-tailed deer (Lingle, 2001), with further evolutionary consequences for their communication behaviour.

Finally, sociality favours signal evolution with increasing types and numbers of social interactions (Freeberg et al., 2012; Houdelier et al., 2012; McComb & Semple, 2005). Species living in complex societies (e.g., multimale, multifemale groups) are likely to encounter a more diverse range of social problems than species living in simple societies (e.g., solitary species), and this again is thought to affect signal evolution (Bouchet et al., 2013; Kroodsma, 1977; Manser et al., 2014; McComb & Semple, 2005; Rebout et al., 2020). In the social domain, one source of diversification is whether it is advantageous for a signaller to encode individual identity. There is a wealth of evidence that animals from various taxa can recognise each other by their calls (Aubin & Jouventin, 2002; Briseño-Jaramillo et al., 2014; Kondo & Watanabe, 2009; Müller & Manser, 2008; Rendall et al., 1996). Generally speaking, calls given in social interactions convey identity better than calls that require urgent actions, such as alarm calls (Bouchet et al., 2013; Hasiniaina et al., 2020; Leliveld et al., 2011). Call types often vary across the repertoire in terms of their potential for identity coding (PIC). For example, in female Campbell’s monkeys (Cercopithecus campbelli), short, repetitive alarm and threat calls had the lowest PIC, trilled social calls had intermediate PIC, and combined contact calls had the highest PIC (Lemasson & Hausberger, 2011), which reflected their primary need to convey information about the caller’s identity (Coye et al., 2018).

Zimmermann and colleagues have argued that, to understand the evolution of vocal behaviour, it is essential to take into account the separate impact of a species’ phylogenetic history, its local ecology, and its current social system (Hasiniaina et al., 2018, 2020). With a research programme based on broad-scale species comparisons, they showed the complex interplay between ecology, predation, and phylogeny in the evolution of vocal behaviour in Malagasy mouse lemurs (Hasiniaina et al., 2018). This and other studies on primates confirmed that, across species, vocal repertoires consist of limited collections of acoustically fixed signals, with closely related species having more similar repertoires than more distantly related species, both in terms of call structure and function (Gautier, 1988; Geissmann, 1984; Hasiniaina et al., 2020; Ord & Garcia-Porta, 2012). However, the picture may not be that clear-cut, and exploring the repertoires of closely related species remains a useful endeavour for two reasons. First, there can sometimes be surprising levels of variation within closely related taxa. For instance, the repertoire sizes of lemuriforms vary from 5 to 22 calls, with no clear phylogenetic patterns (Zimmermann, 2017). Second, primate communication can sometimes be remarkably flexible within species, such that closely related species differ considerably due to species differences in flexible rather than basic repertoire size (Bouchet et al., 2013; Coye et al., 2017; Gustison et al., 2012; Ouattara, Lemasson, et al., 2009a).

While the evolution of the vocal behaviour of adult males has already been investigated in guenons (Arnold & Zuberbühler, 2006; Keenan et al., 2013; Ouattara, Lemasson, et al., 2009a; Zuberbühler, 2000a, 2004), less is known about the communication of females and their offspring. However, female repertoires are usually larger and contain calls with more diverse functions than those of males (Candiotti et al., 2012a; Coye et al., 2018; Lemasson & Hausberger, 2011; Ouattara, Lemasson, et al., 2009a, 2009b; Zuberbühler et al., 1997). Among existing studies, data are available for adult females of two closely related guenon species, Campbell’s and Diana monkeys (Cercopithecus diana) (Candiotti et al., 2012a, 2012b; Lemasson & Hausberger, 2011; Ouattara, Lemasson, et al., 2009b; Zuberbühler et al., 1997). These two species are part of a rich primate fauna of the Upper Guinean forests, including six other species (lesser spot-nosed monkeys Cercopithecus petaurista, putty-nosed monkeys C. nictitans, olive colobus Procolobus verus, red colobus P. badius, black-and-white colobus Colobus polykomos, and sooty mangabeys Cercocebus atys). The region has experienced drastic climate-related changes over the past millennia, with a major dry period and substantially reduced and fragmented forests some 18,000 years ago (Hamilton & Taylor, 1991), which has led to a complex migration history. As a result, the current primate species occupy distinct niches within the same habitat, presumably to minimise feeding competition, but frequently form poly-specific associations to maximise anti-predator benefits (Buzzard, 2006a; McGraw & Zuberbühler, 2008; Noë & Bshary, 1997).

Campbell’s and Diana monkeys are similar in many ways (Table I). They share the same habitat and the same predators (crowned eagles Stephanoaetus coronatus, leopards Panthera pardus, chimpanzees Pan troglodytes, humans Homo sapiens, and large vipers), and both forage for fruits, flowers, and insects (although in differing proportions). The species have similar home range sizes and group densities, with sometimes overlapping territories (Buzzard & Eckardt, 2007). They often form poly-specific groups (Buzzard, 2006a, 2006b) and have the same group composition (Candiotti et al., 2015), i.e., single-male, multifemale groups with several females and their offspring. Males of each group are spatially and socially peripheral but highly active in antipredator behaviour, whereas the females are the philopatric sex and form the social core of the groups. The males also have a vocal repertoire distinct from that of females, mainly consisting of a few alarm calls (Gautier, 1988; McGraw et al., 2007; Ouattara, Lemasson, et al., 2009b; Zuberbühler, 2000a, 2000b). Finally, both Campbell’s and Diana monkey females recognise each other through their contact calls (Coye et al., 2016; Lemasson et al., 2005), suggesting that calls convey identity markers. So far, PIC analyses have only been conducted with Campbell’s monkeys (Lemasson & Hausberger, 2011), showing that the arched component (i.e., tonal, frequency-modulated vocal unit with an ascending phase and a descending phase; Fig. 1) of the vocal combinations functions to convey identity to varying degrees.

Table I Summary of socioecological characteristics of Campbell’s and Diana monkeys
Fig. 1

Distinct acoustic elements found in the vocal repertoire of female Campbell’s and Diana monkeys. Acoustic structures are listed regardless of their use in single-element or combined calls. In Campbell’s monkeys, arches homologous to Ab/Af calls of Diana monkeys are only found in calls composed of an SH unit with an arch (CHb, CHf). We produced spectrograms using Audacity 3.0.2, with default settings (Algorithm = Frequencies, window = Hann, window size = 1024 samples; minimum frequency displayed 0 Hz, maximum frequency displayed 8000 Hz). Data for Campbell’s monkeys are taken from Lemasson and Hausberger (2011), collected on 6 captive adult females in Paimpont, France, in 2000. Data for Diana monkeys are taken from Candiotti et al. (2012a), collected on 19 wild adult females in Taï National Park, Côte d’Ivoire, in 2009-2010.

Although Diana and Campbell’s monkeys resemble each other in many features, with a shared common ancestor some 6 million years ago (Perelman et al., 2011), they differ in many key aspects. First, Campbell’s monkeys live in smaller (mean = 9.3 individuals) and more cohesive groups (<25 m group spread) than Diana monkeys (23.5 individuals, which often spread over 25 to 50 m) (Buzzard & Eckardt, 2007, data for two groups of each species). Second, intergroup encounters are 10 times more frequent in Diana than in Campbell’s monkeys, although group densities are similar for the two species (Table I). Intragroup social interactions also are more frequent in Diana monkeys, in which females maintain strong bonds and often form coalitions (as opposed to the moderately strong bonds formed by female Campbell’s monkeys; Buzzard, 2004). Third, Diana monkeys are conspicuous in their visual appearance and acoustic behaviour, larger than Campbell’s monkeys and boisterous in their locomotion with frequent running and leaping (McGraw, 1998), whereas Campbell’s monkeys are much harder to find due to their cryptic colouration and quiet locomotion (McGraw et al., 2007). Fourth, Campbell’s monkeys are among the smallest diurnal primates in West African forests and often are displaced by other species when foraging (Buzzard, 2006a; McGraw et al., 2007). In contrast, Diana monkeys occupy a central place in the Taï primate community with several other primate species actively seeking associations with them and following them through their home range (e.g., red colobus: Piliocolobus badius; Noë & Bshary, 1997). Fifth, Diana monkeys are sometimes considered as forest “sentinels,” because they detect danger faster and from greater distances than the other species (McGraw & Zuberbühler, 2008; Noë & Bshary, 1997; Wolters & Zuberbühler, 2003).
Sixth, the two species differ in their antipredator strategies: Diana monkeys follow a strategy of active signalling when they detect leopards or eagles (Uster & Zuberbühler, 2001; Zuberbühler et al., 1997), whereas Campbell’s monkeys seek to avoid detection (McGraw et al., 2007). Finally, while Diana monkeys forage mostly in the top canopy layers (>20 m), Campbell’s monkeys spend up to 50% of their time in the lowest forest canopy layers (i.e., 0-5 m) (Buzzard, 2006b; McGraw et al., 2007), where they are more exposed to predators. In particular, forest leopards and chimpanzees are highly specialised in hunting primates and both predators exert considerable pressure on the monkeys (Bshary, 2007; Jenny & Zuberbühler, 2005; McGraw et al., 2007; Zuberbühler et al., 1999; Zuberbühler & Jenny, 2002). In addition, the crowned eagles of Taï Forest pursue a sit-and-wait strategy when hunting monkeys, anticipating the travelling path of a group and attacking them from within the forest canopy (Shultz, 2007). Overall, this suggests that foraging in the lower forest strata is more dangerous than foraging in the open upper forest strata, which are less accessible to all primate predators.

In this study, we were interested in the relative importance of general phylogenetic and specific socioecological factors in the evolution of primate vocal behaviour. We combined published data on the vocal repertoires of the two species with new data to compare their acoustic diversity, use of single and combined calls, and their potential to convey identity. In line with the phylogenetic inertia hypothesis and given the phylogenetic relatedness between the two species, we predicted similarities in vocal repertoires, particularly in terms of identity coding (conveyed by the arched element of contact calls: Candiotti et al., 2012a). Specifically, we predicted that contact call structure (i.e., the arch-shaped, frequency-modulated part of the call) and function (maintaining contact, signalling identity) are conserved in these two species. However, given their different ecological niches, we also predicted differences in call use, call combinations, call rates, and call functions. In particular, we predicted that Diana monkeys make more use of call combinations (due to their larger groups) and produce more frequent and more conspicuous calls than Campbell’s monkeys due to differences in relative predation pressure.

Methods

Shared and Idiosyncratic Vocal Units, Call Function, Vocal Combinations, and Call Rates

To compare the vocal behaviour of the two species, we reviewed published data on vocal combinations, contextual use, and call rates (Table II). Most of the published data were collected in Taï National Park, Côte d’Ivoire, but one study included data from Tiwai Island, Sierra Leone (Oates et al., 1990), and two included data from a captive group in France (Lemasson et al., 2005; Lemasson & Hausberger, 2011). Some studies involved an experimental paradigm (Coye et al., 2015, 2016; Lemasson et al., 2005; Zuberbühler, 2000a, 2000b), but most studies relied on observational data. Data collection protocols varied between studies and included regular scan sampling (Buzzard, 2006b; Buzzard & Eckardt, 2007; McGraw, 1998; Wolters & Zuberbühler, 2003), transects (Oates et al., 1990), all-day group follows (Buzzard, 2006a), and focal sampling of individually known subjects (Candiotti et al., 2012a, 2012b, 2015; Lemasson et al., 2005; Ouattara, Zuberbühler, et al., 2009).

Table II Comparison of Campbell’s and Diana monkeys’ acoustic repertoire and call use

We report the number of vocal units shared between the two species, the number that occur in only one species (idiosyncratic vocal units), call function, vocal combinations, and call rates.

New Data Collection

We collected new data in Taï National Park, a tropical evergreen lowland forest in the south-western part of Côte d’Ivoire (5° 20′ –6° 10′ N; 6° 50′ –7° 25′ W). Taï Forest is one of the largest relatively intact segments of the ancestral Upper Guinean Forest belt. It has an estimated surface of more than 5,300 km² (Office Ivoirien des Parcs et Réserves, 2006) and consists of dense ombrophilous vegetation with a continuous 40-60 m high canopy and emergent trees (Kolongo et al., 2006; Riezebos et al., 1994). The climate is characterised by stable temperatures over the year and an alternation of dry and wet seasons (Korstjens, 2001).

We recorded habituated females using focal sampling between 8 am and 5 pm, several days per week. We recorded Diana monkeys between January 2013 and September 2014 using a Sennheiser K6/ME66 directional microphone and a Marantz PMD660 solid-state recorder (sampling rate, 44.1 kHz; resolution, 16 bits), and Campbell’s monkeys between August 2006 and February 2007 using a Sony TCD D100 stereo cassette recorder and a Sennheiser ME88 microphone.

Comparing Identity Markers between Species

To compare the potential to convey identity in Campbell’s and Diana monkey full-arched calls (CHf and LAf respectively, i.e., contact calls with a full arch, as opposed to “broken arches,” in which the “top of the arch” is not uttered by the individuals; Fig. 1), we used automated classification with artificial neural networks (ANNs), based on a supervised machine learning procedure developed for guenon calls (Mielke & Zuberbühler, 2013). For each caller, we trained the ANN using a set of call exemplars before testing classification performance on new calls from the same caller. We ran the analyses separately for each species to compare results with chance levels and with each other. We used a set of high-quality recordings from three females from each species. Training sets consisted of 19-28 calls per individual (mean ± SE: 23.0 ± 1.6 calls) selected for their quality (low background noise and no overlap with other calls or human speech). We applied a low-pass filter at 12,000 Hz to eliminate high-frequency sounds, particularly from cicadas. We extracted the Mel-Frequency Cepstral Coefficients (MFCCs) from each call (Mielke & Zuberbühler, 2013). The general principle of MFCC extraction is to slice the signal into sections (i.e., frames) small enough to be statistically stationary. Each frame is then multiplied with a Hamming window and the Fast Fourier Transform (FFT) is computed. The resulting spectra are subsequently mel-scaled (the spectrum’s frequency axis is transformed from Hertz scale into mel scale using filter banks) and the MFCCs are calculated by applying a discrete cosine transform to the energy from the frequency band filters (Logan, 2000). We then used the extracted MFCCs to train 15 identical ANNs per species. We built ANNs using the cascade forward architecture (cascadeforwardnet) neural network in Matlab software.
The ANNs consisted of an input layer of 448 neurons (= number of MFCCs extracted per call), a hidden layer with only two neurons (to prevent overfitting) and an output layer whose size corresponded to the distinct classification outputs possible (i.e., 3 corresponding to the 3 individuals per species). We used the “trainbr” training function of Matlab (Bayesian regularization backpropagation training function), with a maximum of 1,000 epochs (i.e., training iterations). We also used two complementary Input-Output processing functions: “mapminmax” (which normalizes inputs and targets between −1 and +1) and “mapstd” (which standardizes inputs and targets to have zero mean and unity variance). To determine when to stop the training, we measured network performance using the mean squared errors (“mse” performance function in Matlab®), with normalization set to its standard value (i.e., normalizing errors between −2 and +2). Following training, we tested the ANN’s performance using 24 calls that were not in the training set (4 calls from each subject, in each species). To maximize classification efficiency, we repeated the training and testing procedures on 15 identical ANNs (for each species) whose results we then averaged to obtain the final classification result.
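The MFCC extraction steps described above (framing, Hamming window, Fourier transform, mel filter bank, discrete cosine transform) can be illustrated with a minimal Python sketch. This is not the authors' Matlab pipeline: the frame size, hop, number of filters, and number of coefficients are illustrative assumptions, the log-compression of filter energies is the conventional step (the text does not mention it explicitly), and a naive DFT stands in for the FFT so that the sketch needs only the standard library.

```python
import cmath
import math

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def frames(signal, size, hop):
    """Slice the signal into overlapping frames of `size` samples."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, hop)]

def power_spectrum(frame):
    """Apply a Hamming window, then return the one-sided power spectrum (naive DFT)."""
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        s = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(windowed))
        spectrum.append(abs(s) ** 2 / n)
    return spectrum

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose edges are equally spaced on the mel scale."""
    top = hz_to_mel(sample_rate / 2)
    edges_hz = [mel_to_hz(top * i / (n_filters + 1)) for i in range(n_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * f / sample_rate)) for f in edges_hz]
    bank = []
    for m in range(1, n_filters + 1):
        lo, mid, hi = bins[m - 1], bins[m], bins[m + 1]
        row = [0.0] * (n_fft // 2 + 1)
        for k in range(lo, mid):          # rising slope
            row[k] = (k - lo) / (mid - lo)
        for k in range(mid, hi):          # falling slope
            row[k] = (hi - k) / (hi - mid)
        bank.append(row)
    return bank

def dct2(values, n_coeffs):
    """Unnormalised DCT-II, keeping the first `n_coeffs` coefficients."""
    n = len(values)
    return [sum(v * math.cos(math.pi * k * (i + 0.5) / n)
                for i, v in enumerate(values))
            for k in range(n_coeffs)]

def mfcc(signal, sample_rate, frame_size=256, hop=128, n_filters=20, n_coeffs=13):
    """Return one list of MFCCs per frame (all parameter defaults are assumptions)."""
    bank = mel_filterbank(n_filters, frame_size, sample_rate)
    coeffs = []
    for frame in frames(signal, frame_size, hop):
        spec = power_spectrum(frame)
        energies = [sum(w * p for w, p in zip(row, spec)) for row in bank]
        log_energies = [math.log(max(e, 1e-10)) for e in energies]  # avoid log(0)
        coeffs.append(dct2(log_energies, n_coeffs))
    return coeffs

# Synthetic 1 kHz tone standing in for a recorded call (hypothetical input).
sr = 8000
tone = [math.sin(2 * math.pi * 1000 * t / sr) for t in range(1024)]
coeffs = mfcc(tone, sr)
print(len(coeffs), len(coeffs[0]))  # 7 frames, 13 coefficients each
```

Concatenating the per-frame coefficients of a call yields a single feature vector of fixed length, which is the kind of input the 448-neuron input layer would receive; how the 448 values break down into frames and coefficients is not stated in the text.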

Ethical Note

Ethics approval was given by the University of St Andrews (School of Psychology) Ethics Board; the research protocol was authorized in Côte d’Ivoire by the Minister of Scientific Research and the ‘Office Ivoirien des Parcs et Réserves’ (OIPR). This observational study does not raise major issues regarding animal welfare.

Results

Shared and Idiosyncratic Vocal Units and Call Function

Females of each species produced eight distinct acoustic units, six of which were shared between species. The shared units consisted of two repetitive structures given during threats (Campbell’s: RRC; Diana: Brrr) and mild alert (Campbell’s: RRA1; Diana: R), two trill-based structures given in relaxed social contexts (Campbell’s: ST/SH; Diana: H/L), and two arch-shaped structures used to remain in contact (Campbell’s: CHf/CHb; Diana: Af/Ab) (Fig. 1).

The remaining four acoustic units were each present in only one species. These idiosyncratic units consisted of variations of shared call types, two for each species (Fig. 1; Table II). Interestingly, although all idiosyncratic calls functioned as alarm calls, the respective source calls were different between species. While in Campbell’s monkeys they resembled the short, repetitive units (notably RRA1), in Diana monkeys they resembled the tonal arched units (Af, Ab). In Campbell’s monkeys, the idiosyncratic units were given to eagles and leopards (RRA3 and RRA4, respectively). They were used in addition to the general alert (RRA1) and were distinguishable by the number and structure of repetitive units (Ouattara, Zuberbühler, et al., 2009) (Fig. 1). We found no counterpart of RRA3 and RRA4 in the female Diana monkey vocal repertoire.

In Diana monkeys, the idiosyncratic units (Alk, W) also served as alarm calls, but these calls originated from the arched contact calls (Af; Fig. 1; Coye et al., 2015; Zuberbühler et al., 1997) with no structural equivalent in the Campbell’s monkey repertoire. Alk resembled an arched call whose lower frequencies were truncated and whose top was sharper, and W was composed of a short, high-pitched, arch-shaped note preceding an Alk-like truncated arch (Fig. 1) (Candiotti, 2012; Coye et al., 2015; Zuberbühler et al., 1997).

Vocal Combinations

Females of both species combined vocal units in similar ways, by assembling nonarched units with full or broken arches (Fig. 2). While both species used their arched units to form combinations, Diana monkeys produced four arched structures (two shared: Af, Ab; two idiosyncratic: Alk, W), and Campbell’s monkeys produced two (CHb, CHf) (Table II). In addition, Diana monkeys used their four arched units both singly and in combination with high- and low-pitched trills (H and L) or repetitive alarm calls (R) in both social and alarm contexts (Fig. 3).

Fig. 2

Schematic trees representing the vocal repertoires of (a) Campbell’s and (b) Diana monkeys. On both plots, the line entitled “Single unit” shows calls consisting of one call unit only, the line entitled “Combined calls” shows combined calls, composed of two units. We plotted simple calls onto the same tree when presenting close acoustic structures. Vocal units composing combined calls are indicated by arrows with dashed lines. Shadings show the general function of calls, with green for socio-positive contact calls, yellow for socio-negative calls (threat, mild alarm) and red for alarm calls. Orange shows combination of calls from different functional categories (mixed calls).

Fig. 3

(a) Mean call rate per hour for distinct call types in Campbell’s (grey) and Diana monkeys (black). Error bars show standard deviation. (b) Radar plot representing the percentage of total calls given by Campbell’s (grey) and Diana monkeys (black). Calls presented include: high-pitched trills (ST/H), low-pitched trills (SH/L), broken arches (alone or combined: CHb/Ab and any X-Ab combination in Diana monkeys), and full arches (alone or combined: CHf/Af and any X-Af combination in Diana monkeys). Data for Figure 3 are taken from Candiotti et al. (2012a) on 19 wild Diana monkeys in Taï National Park, Côte d’Ivoire, collected in 2009 and 2010 and from Coye et al. (2018) on 10 wild adult female Campbell’s monkeys in Taï National Park, Côte d’Ivoire, in 2006 and 2007.

As a result of their higher propensity to combine calls, the female Diana monkey repertoire consisted of 16 calls, i.e., 8 noncombined calls (Brrr, R, L, H, Af, Ab, Alk, and W) and 8 combined calls (L-Af, L-Ab, H-Af, H-Ab, R-Af, R-Ab, R-Alk, and R-W), whereas the Campbell’s monkey repertoire consisted of only eight calls. This is because in Campbell’s monkeys, arched units were always produced in combination, never as single calls, and only with low-pitched trills (SH), resulting in only two combined calls (CHb and CHf), which serve as contact calls, and six noncombined calls (RRC, RRA1, RRA3, RRA4, SH, ST; Table II).
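The repertoire counts above follow directly from the combination rules, which a short Python sketch makes explicit. The call labels are taken from the text; the dictionary of rules and the helper function are illustrative constructs, not part of the original analysis.

```python
# Noncombined calls as listed in the text.
diana_single = ["Brrr", "R", "L", "H", "Af", "Ab", "Alk", "W"]
campbell_single = ["RRC", "RRA1", "RRA3", "RRA4", "SH", "ST"]

# Diana monkeys: each first unit and the arched units it is reported to precede.
diana_rules = {
    "L": ["Af", "Ab"],
    "H": ["Af", "Ab"],
    "R": ["Af", "Ab", "Alk", "W"],
}

# Campbell's monkeys: arches occur only after SH, giving two named combined calls.
campbell_combined = ["CHb", "CHf"]

def combine(rules):
    """Expand first-unit/arch rules into hyphenated combined-call labels."""
    return [f"{first}-{arch}" for first, arches in rules.items() for arch in arches]

diana_combined = combine(diana_rules)
diana_repertoire = diana_single + diana_combined
campbell_repertoire = campbell_single + campbell_combined

print(diana_combined)
print(len(diana_repertoire), len(campbell_repertoire))  # 16 8
```

Counting this way also shows where the difference arises: the two species have the same number of basic units once the Campbell’s arches are counted as units, and the gap comes from Diana monkeys using arches singly and pairing them with three different first units.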

Call Rates

Diana monkeys were 4.5 times more vocal than Campbell’s monkeys in terms of contact call rates (Table II). Rates of both single and combined contact calls were higher in Diana than Campbell’s monkeys (Fig. 3). However, Campbell’s monkeys emitted two call types at higher rates: cryptic SH calls (homologous to the Diana L call; Fig. 1; Table II) and alarm calls (RRA / R). In addition, while Campbell’s monkeys mainly produced broken arches (79% CHb calls), Diana monkeys produced mainly full arches (72% LAf calls, homologous to Campbell’s CHf calls; Fig. 3).

Conveying Individual Identity

The results of machine learning showed high levels of individual differences in Campbell’s CHf and Diana’s LAf contact calls (91.7% accurate classification in both species; chance level: 33.3%), suggesting equivalent power to convey identity.
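The chance level and the averaged classification score can be reproduced with simple arithmetic, sketched below in Python. The sketch assumes one reading of the procedure, namely that each of the 15 networks is scored on the 12 test calls per species and the scores are then averaged. The per-network counts are hypothetical placeholders chosen only so that the arithmetic matches the reported 91.7%; the source does not report them.

```python
from statistics import mean

def chance_level(n_classes):
    """Expected accuracy when guessing uniformly among equally likely classes."""
    return 1.0 / n_classes

def ensemble_accuracy(correct_counts, n_test_calls):
    """Mean proportion of correctly classified test calls across networks."""
    return mean(c / n_test_calls for c in correct_counts)

n_individuals = 3                  # three females per species (from the text)
n_test_calls = 4 * n_individuals   # four held-out calls per female = 12

# HYPOTHETICAL per-network results: 11 of 12 calls correct in each of 15 networks.
correct_counts = [11] * 15

accuracy = ensemble_accuracy(correct_counts, n_test_calls)
print(f"chance = {chance_level(n_individuals):.1%}, accuracy = {accuracy:.1%}")
# chance = 33.3%, accuracy = 91.7%
```

With only 12 test calls per species, a single call changes the score by about 8 percentage points, which is one concrete reason to interpret the similarity between species cautiously.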

Discussion

We found that females in two closely related, sympatric forest primates, Diana and Campbell’s monkeys, each produced eight basic vocal units, six of which were shared and two of which were idiosyncratic to each species, suggesting similar articulatory capacities caused by shared phylogeny. Both species produced arched structures that functioned as contact calls and main carriers of identity. Our machine-learning-based analyses suggested that this occurred to similar extents in both species, although the results need to be considered with caution given the small sample sizes. Finally, females of both species produced combined calls consisting of one arched vocal unit that follows a nonarched unit.

We also found a number of species differences, most likely caused by adaptations to their respective niches, particularly differences in predation pressure. Campbell’s monkeys are very cryptic, both in terms of visual appearance and in their vocal and nonvocal behaviour, and live in small, cohesive groups. Diana monkeys, in contrast, live in large, spread-out groups with individuals relying on vocalisations to remain in contact and warn each other about danger (Uster & Zuberbühler, 2001; Zuberbühler et al., 1997). Both species produced two idiosyncratic alarm calls whose structures differed strikingly. Although both species produced call combinations, Diana monkeys used this feature more, producing four times as many combined call types as Campbell’s monkeys (eight vs. two; Table II). Diana monkeys also used combined calls in a greater range of contexts, including alarm and social contexts, whereas Campbell’s monkeys’ combined calls functioned only as contact calls. Finally, the two species differed in the rate of call production across call types. Campbell’s monkeys were less vocal and favoured the quieter, broken arched calls, whereas Diana monkeys preferred the full arched calls and used them at high rates.

Overall, our results show that, even in species with limited articulatory capacities, primate vocal behaviour can evolve rapidly in response to environmental pressures, partly due to flexible use of existing vocal units. Predation appears to play a main role, as both species possessed idiosyncratic call units in this context, consistent with their respective anti-predator strategies. The Diana monkeys’ idiosyncratic calls (the sharp arches W and Alk) are amongst the most conspicuous calls in the forest, while the Campbell’s monkeys’ idiosyncratic calls are short repetitive structures that (for humans) are difficult to detect (RRA1, RRA3, and RRA4). We found no counterpart of RRA3 and RRA4 in the female Diana monkey vocal repertoire, which suggests that these calls were either lost by Diana monkeys or emerged recently in Campbell’s monkeys. Interestingly, another call type with a similar structure (RRA2) was documented in the repertoire of captive Campbell’s monkeys and was produced to signal the arrival of an unfamiliar human in the facility (Ouattara, Zuberbühler, et al., 2009).

Another source of flexibility that we identified concerned the ability to use distinct call types flexibly and to combine existing vocal units. First, both species used the more detectable full-arched calls depending on context. For instance, female Campbell’s monkeys used a single-unit call (SH) and two combined calls (CHb, CHf) to establish and maintain contact (Coye et al., 2018). The single-unit call is the quietest and least perceptible, due to its low-pitched, quavered structure. Females produced this call when predation risk was high and when they were not associated with other primate species (Coye et al., 2018). The two combined calls (CHb and CHf) were more audible and given in nonpredatory contexts to maintain contact, with the full-arched call (CHf) mainly given during vocal exchanges (Coye et al., 2018). Female Diana monkeys followed a similar pattern: calls with full arches were used in contexts in which signalling identity was important, e.g., at territory borders where encounters with neighbours were likely (Candiotti et al., 2012a). Second, although both species use combined calls, Diana monkeys do so to a much greater extent and in a diverse range of contexts. In particular, female Diana monkeys combined not only low-pitched trills (L) but also high-pitched trills (H) and repetitive alarm calls (R) with full and broken arches (Af, Ab). In previous work, we showed that the first unit (H, L, or R) conveys the caller’s perceived valence of an event (positive, neutral, negative context) while the arch contains the caller’s identity (Candiotti et al., 2012a). In a playback study, listeners perceived changes to either the first unit (e.g., replacing L with R) or the arch (i.e., identity) and reacted differently to each, suggesting that both units contributed to the overall meaning (Coye et al., 2016).
Interestingly, the Diana monkeys’ two idiosyncratic arched units (Alk, W) were only seen in combination with the repetitive alarm call (R), which generated a novel alarm call (Candiotti, 2012; Coye et al., 2015). In Campbell’s monkeys, combined calls functioned to convey individual identity although this appeared to be in trade-off with minimising detection. In Diana monkeys, pressure from ground predation is low due to their upper forest canopy niche, which may have enabled them to exploit the potential for combinations to a fuller extent. Our findings align with theoretical work predicting that vocal combinations may emerge as an alternative strategy to acoustic diversification in species facing the need for a larger vocal repertoire (Nowak & Komarova, 2001).

There is consensus in the literature that social and vocal complexity coevolve (Aubin & Jouventin, 2002; Blumstein, 1999b, 2003; Houdelier et al., 2012; Kroodsma, 1977; Mathevon et al., 2003; Pollard & Blumstein, 2012; Wilkinson, 2003). This conclusion is based on comparative studies of vocal repertoire sizes, although it often is unclear how to accurately determine repertoire size. Our results show that, in both species, vocal units can be part of vocal combinations, with sometimes distinct functions. These combinations can greatly increase the repertoire size as is the case in Diana monkeys. Furthermore, we identified another source of variation: the flexibility of call use, which further increases the effective repertoire size.

Some studies have adopted an alternative approach to comparing the size of the repertoire, instead assessing the complexity of repertoires using indicators such as the presence of identity-rich structures (Bouchet et al., 2013), vocal combinations (Manser et al., 2014), or gradation between call types (Rebout et al., 2020). Again, sociality appears to be a main evolutionary driver. For example, across different mongoose species, repetition of vocal units was generally present, but only obligate social species produced combinations of calls (Collier et al., 2020; Manser et al., 2014). Similarly, across three species of primates (Campbell’s monkeys, De Brazza’s monkeys (C. neglectus), and red-capped mangabeys (Cercocebus torquatus)), call rates and vocal combinations increased with social complexity (single-male, single-female with their offspring; single-male, multifemale with their offspring; multimale, multifemale; Bouchet et al., 2013). In line with these observations, Diana monkeys have higher rates of social interactions, more differentiated intragroup social relations, and more frequent intergroup encounters than Campbell’s monkeys (Table I), and a correspondingly larger and more complex vocal repertoire.

Conclusions

We found that two closely related primate species, adapted to different ecological niches within the same habitat, have correspondingly adapted vocal systems in call structure, production patterns, total effective repertoire size (partly caused by vocal combinations), and functional diversity of calls. We found several homologous vocal units, consistent with phylogenetic inertia, but both predation and social complexity seem to play a major role in the evolutionary divergence of vocal repertoires in these two species. Predation is particularly interesting as it can both increase the repertoire size and, if pressure is too large, inhibit the evolution of vocal combinations. Social complexity generally appears to favour diversification, especially in the form of combinations of call units. Future research on other species and taxa is required to test these conclusions at a larger scale than this comparison of two species.