Introduction

Non-human primates frequently use vocalizations to communicate with conspecifics (Todt et al. 2012; Zuberbühler et al. 1997). For arboreal group-living species, vocalizations may be especially important because transmitting visual signals can be limited by vegetation (Altmann 1967; Waser and Waser 1977). The role played by vocal communication can reflect peculiarities in a single signal of a species’ vocal repertoire that are prerequisites for species and individual recognition (Waser 1977; Snowdon and Cleveland 1980; Rendall et al. 1996; Gamba et al. 2012a, b). Thus, understanding a species’ vocal repertoire is important because it provides researchers with insights into an animal’s social behavior and sets the basis for a broad range of comparative studies (Fischer and Hammerschmidt 2002). For instance, knowledge of the vocal repertoire allows studying the contextual occurrence of specific vocalizations, determining the role of vocal individuality in regulating social interactions within a species, and it is crucial to decoding the biological relevance of communication (Gamba and Giacoma 2005; Favaro et al. 2014).

Studies of vocal behavior can also reveal specific adaptations of an animal’s communication signal to its environment (Morton 1975; Wiley and Richards 1978; Richards and Wiley 1980; Forrest 1994; Ziegler et al. 2011). According to the Principle of Acoustic Allometry for terrestrial mammals, the larger the animal, the lower the fundamental frequency of its calls (Charlton and Reby 2016). Moreover, the Acoustic Adaptation Hypothesis predicts that open habitats favor the propagation of high-frequency calls with wider bandwidths, distinct frequency modulation and shorter notes, compared to habitats with complex vegetational structures (Ey and Fischer 2009).

Acoustic communication in the odd-nosed colobines, a group of monkeys comprising the genera Rhinopithecus, Pygathrix, Nasalis, and Simias, is relatively poorly understood. Snub-nosed monkeys (genus Rhinopithecus) comprise five species of large and unusual leaf monkeys found in forests of central and western China and northern Vietnam. They are colobine monkeys with a broad, short face with wide-set eyes and a short, flat nose with forward-facing nostrils. Snub-nosed monkeys are unique in their biology, increasingly endangered, and they show morphophysiological adaptations to a wide range of habitats and climates unusual for primates, such as temperate and high-altitude forests (Liedigk et al. 2012; Yanqing et al. 2020). Previous studies have reported that communication in these primates is universally characterized by the presence of high-pitched signals (Li et al. 1993; Grüter 2003; Srivathsan and Meier 2011; Erb et al. 2013; Röper et al. 2014).

The observations by one of us (IR) that the relatively large Guizhou snub-nosed monkey (Rhinopithecus brelichi) (a) produces an extremely high-pitched call, and (b) that it often calls from 1.5 to 5.0 m above the ground in its natural habitat were the impetus and inspiration for the present study. We therefore sought to compare the call frequency of R. brelichi with those of other primates, and carry out acoustic playback studies in the field to determine the effect of the animals’ natural habitat on the attenuation and degradation of both the high-pitched calls (HPC) and the low-pitched (UHM) calls.

Materials and methods

Study animals

The Guizhou snub-nosed monkey (R. brelichi; females: mean: 7.8 kg; Kirkpatrick and Grueter 2010; males: mean: 14.5 kg, min 13.3 kg, max 15.8 kg, n = 4; Bleisch et al. 1993; Xie et al. 1982) is a member of the odd-nosed colobine group with a species distribution limited to Fanjingshan National Nature Reserve (27°46′ − 28°01′ N; 108°45′ − 108°48′ E) in Guizhou Province, People’s Republic of China (Kirkpatrick 1998; Yang et al. 2002). The global population of this species numbers approximately 750 individuals (Yang et al. 2002). They inhabit habitats from 1000 to 2200 m above sea level (hereafter, a.s.l.) but feed and travel in the forest canopy and on the ground (Yang et al. 2002; Niu et al. 2014). R. brelichi exhibits a daily altitudinal movement pattern; the monkey sleeps at lower but forages at higher elevations (see Niu et al. 2010). Previous studies have shown that R. brelichi may prefer to range in the mixed evergreen and deciduous broadleaf forest; frequently, the monkeys were observed in areas between 1350 and 1870 m a.s.l. (Yang et al. 2002; Niu et al. 2010). Here, the majority of their diet comprised 4–5 species of deciduous trees (Bleisch and Xie 1998), which are leafless from October to April in the following year (Tan et al. unpublished data).

Study site

In Fanjingshan, the vegetation between 1300 and 2200 m a.s.l. consists of an assemblage of evergreen and deciduous broadleaf trees (Zhu and Yang 1990; Yang et al. 2002, 2010). Because these areas contain the majority of the plants in its diet, R. brelichi is more likely to occur in this altitudinal range (Bleisch and Xie 1998; Yang et al. 2002). The field study site was located in the northeast of Fanjingshan National Nature Reserve, in the Yaogaoping area, where the Guizhou snub-nosed monkey has been most frequently observed. The study area comprises both primary and secondary forest. For this study, we conducted acoustic playback experiments along five transmission transects, located between 1619 and 1710 m a.s.l. At these altitudes, the primary forest contains mixed evergreen and deciduous broadleaf trees (Zhu and Yang 1990). The crown of woody trees in the lower layer consisted of species such as Rhododendron spp. and Camellia spp. and could be 3–8 m in height (Yang et al. 2002). The woody trees in the secondary forest were usually below 5 m in height and mainly consisted of Dendrobenthamia japonica and Litsea elongata. The forest floor vegetation mainly comprised bamboo communities (e.g., Fargesia sp. and Indocalamus sp.) in the primary forest, with some surface shrubs (e.g., Litsea sp.) and grasses in the secondary forest (Yang et al. 2002; Niu et al. 2014).

We recorded vocalizations from captive individuals at the Wildlife Rescue Center of Fanjingshan National Nature Reserve, in Panxi. We focused attention on five individuals (two adult males, two adult females, a subadult female) housed in two separate enclosures. All subjects were maintained on a natural light/dark diel cycle. We recorded monkey calls from November 26th to December 13th, 2009, between 07:00 h and 10:30 h and from 13:30 h to 17:00 h.

Acoustic recordings and vocalization analysis

Vocalizations of the Guizhou snub-nosed monkey were recorded using a high-resolution digital recorder (Sound Devices, model 702) equipped with a directional microphone (G.R.A.S. 40 BE). To increase the number of vocalizations we could record, we used both all-occurrence and focal animal sampling methods (Altmann 1974). We recorded spontaneously occurring vocalizations without playbacks, at a maximum distance of 10 m from the vocalizing animal. After a preliminary qualitative analysis of the entire recorded dataset, we selected and saved into separate files those recordings in which vocalizations could be quantitatively analyzed.

We submitted a set of 274 UHMs, 125 HPCs, and 2448 segments of environmental noise recordings, from which snub-nosed monkey vocalizations were specifically excluded, to a one-third-octave band spectrum analysis. We traced a line representing the average energy in the one-third-octave bands for the UHMs, the HPCs, and the noise, and then computed the area between the spectral profile of each vocalization type and that of the noise as a metric of how salient each call type would be to the monkeys.
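The area-above-noise metric described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the use of exact base-2 band centers referenced to 1 kHz, the trapezoidal integration, the choice to count only bands in which the call exceeds the noise, and all band levels are hypothetical and are not taken from the analysis itself.

```python
import math

def third_octave_centers(f_min, f_max):
    """Exact base-2 one-third-octave center frequencies (Hz), referenced to 1 kHz."""
    centers = []
    n = math.ceil(3 * math.log2(f_min / 1000.0))
    while True:
        fc = 1000.0 * 2.0 ** (n / 3.0)
        if fc > f_max:
            break
        centers.append(fc)
        n += 1
    return centers

def area_above_noise(freqs, call_db, noise_db, log_axis=False):
    """Trapezoidal area between a call profile and a noise profile.

    Assumption: bands in which the call lies below the noise contribute zero.
    With log_axis=True the frequency axis is log10(Hz) instead of Hz.
    """
    x = [math.log10(f) for f in freqs] if log_axis else list(freqs)
    excess = [max(c - n, 0.0) for c, n in zip(call_db, noise_db)]
    area = 0.0
    for i in range(1, len(x)):
        area += 0.5 * (excess[i - 1] + excess[i]) * (x[i] - x[i - 1])
    return area

# Hypothetical band levels (dB), for illustration only; not data from this study.
freqs = third_octave_centers(500.0, 16000.0)
noise = [45.0 - 1.5 * i for i in range(len(freqs))]  # noise falling with frequency
call = [30.0 + 2.0 * i for i in range(len(freqs))]   # call energy rising with frequency
print(area_above_noise(freqs, call, noise))
print(area_above_noise(freqs, call, noise, log_axis=True))
```

Computing the same area on a linear and on a logarithmic frequency axis shows why the ratio between two call types depends on the axis used: a linear axis gives far more weight to the wide high-frequency bands.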

Environmentally related adaptations of vocal signals may become evident when considering the body length versus the call fundamental frequency (Garcia et al. 2017; Garcia and Ravignani 2020). We selected a set of six maximally spaced vocalizations (sensu Bowling et al. 2017) from the vocal repertoire of the Guizhou snub-nosed monkey and plotted the mean of their f0s (f06) against the f06 values for various primate species (Bowling et al. 2017; Fig. 2g).

Acoustic playback experiments: attenuation

The transmission experiments were conducted from 26 August to 6 December 2013. During playback experiments, diurnal humidity and temperature ranged from 20 to 76% and 5–28 °C, respectively. Within the Fanjingshan National Nature Reserve, we transmitted the test sequence along five 32-m transects, all in relatively flat areas without significant physical obstacles between the loudspeaker and the microphones, and located within an elevational range spanning about 90 m. We carefully selected five transmission transects with similar vegetation structures. We restricted the transects to the deciduous/semi-deciduous zone, with a maximum underbrush vegetation height of 1.4 m. We performed transects in low underbrush vegetation (about 0.5–0.7 m) across the 32 m whenever possible. We referred to “through the forest floor” when we recorded the transmitted signal at a height lower than the maximum underbrush vegetation (namely, at 0.5 m), and “above the forest floor” when the microphone we used for the recordings was higher than the underbrush (namely, at 1.5 m).

Sound attenuation experiments were performed within and above the forest underbrush in the natural habitat of the Guizhou snub-nosed monkey using high-quality recordings of both the UHM and the HPC vocal types as playback stimuli. Playback stimuli were emitted using Avisoft-SASLab Pro (Avisoft Bioacoustics, Version 5.2) from a PC laptop and delivered to the loudspeaker (Avisoft Bioacoustics, UltraSoundGate Player BL Light) placed at 1.5 m above the ground level. To standardize the speaker’s output level, before each session, we emitted a 5-kHz pure tone (30-s duration) with a root-mean-square (RMS) level of 90 dB SPL measured with a sound level meter [MiteK MK5350, using time weighting Fast (τ = 125 ms)] placed 0.5 m directly in front of the loudspeaker.

Along each of the transects, we recorded the playback signals with a directional microphone (G.R.A.S., 40 BE) at distances of 4, 8, 16, and 32 m from the loudspeaker (Avisoft Bioacoustics, UltraSoundGate Player BL Light). After the calibration procedure, and during each recording session, the level of the loudspeaker was kept constant at 90 dB SPL. Experiments were performed between 08:30 h and 18:30 h.

For the attenuation experiments, the call broadcast sequence was recorded on channel 1 of a high-resolution digital audio recorder (Sound Devices, model 702) equipped with two directional microphones (G.R.A.S. 40 BE) placed on a stand at 0.50 m and 1.50 m above the ground, at four recording stations located along a linear transect at 4 m, 8 m, 16 m, and 32 m from the loudspeaker stand (Fig. 1). We adjusted the microphone input level for each recording to be the maximum possible to utilize the full dynamic range of the recorder and still avoid input overload. Then, for each microphone at each recording station, we broadcast [using a Multi-Track Linear PCM Recorder (Olympus LS-100)] a pure calibration tone (5 kHz) at 0.10 m from the microphones, and we recorded its SPL using a portable sound level meter (Mitek MK5350) placed at the microphone tip. This signal, recorded on channel 2 of the high-resolution digital audio recorder, was used during the acoustic analyses as a calibration reference to evaluate the absolute SPL of each UHM and HPC call recorded on channel 1 of the same recorder. For all recordings, the sampling rate was 96 kHz; each recording was stored as a .wav file. Following Castellano et al. (2003) and Halfwerk et al. (2016), we calculated the absolute dB SPL of the broadcast signal using the formula

$$L_{\text{s}} = L_{\text{c}} + 20 \log_{10}\left( \frac{\sqrt{R_{\text{s}}^{2} - R_{\text{n}}^{2}}}{R_{\text{c}}} \right)$$
Fig. 1

Setup for the acoustic attenuation and degradation experiments. A schematic view of the experiment layout showing the loudspeaker and microphone placements on their respective stands. All distances are in meters

where Ls is the absolute dB SPL of the signal; Lc is the absolute dB SPL of the calibration tone measured with the sound pressure level meter; Rc is the digital RMS of the pure calibration tone; Rs is the digital RMS of the recorded sound; and Rn is the digital RMS of the environmental noise. For the reference lines in Fig. 4a, we used the formula

$$\text{SPL}_{\text{s}} = \text{SPL}_{\text{LS}} - 20 \log_{10}\left( \frac{d_{2}}{d_{1}} \right)$$

where SPLs is the sound pressure level (in dB SPL) at point 2; SPLLS is the sound pressure level at point 1 (0.1 m from the loudspeaker); d1 is the distance from the sound source to point 1 (0.1 m); and d2 is the distance from the sound source to point 2 (4, 8, 16, or 32 m).
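The two formulas above translate directly into code. The following Python sketch is illustrative only: the function names are hypothetical, and the numerical inputs are made-up values chosen to show the arithmetic, not measurements from the experiments.

```python
import math

def absolute_spl(l_c, r_s, r_n, r_c):
    """Absolute level of a recorded signal (dB SPL).

    l_c: dB SPL of the calibration tone; r_s, r_n, r_c: digital RMS of the
    recorded sound, the environmental noise, and the calibration tone.
    """
    if r_s <= r_n:
        raise ValueError("signal RMS must exceed noise RMS")
    return l_c + 20.0 * math.log10(math.sqrt(r_s ** 2 - r_n ** 2) / r_c)

def expected_spl(spl_ls, d2, d1=0.1):
    """Level expected at distance d2 (m) from geometric spreading alone,
    given the level spl_ls measured at distance d1 (m) from the source."""
    return spl_ls - 20.0 * math.log10(d2 / d1)

# Hypothetical inputs, for illustration only.
print(absolute_spl(l_c=90.0, r_s=0.05, r_n=0.01, r_c=0.2))
for d in (4.0, 8.0, 16.0, 32.0):
    print(d, expected_spl(spl_ls=100.0, d2=d))
```

Each doubling of distance in the second function lowers the expected level by 20 log10(2), about 6 dB, which is the slope of the geometric-spreading reference line.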

Because the loudspeaker was fixed on its stand at 1.50 m above the ground, the distance from it to the tip of the higher microphone was less than the distance to the tip of the lower microphone for all stand separations. It follows that due to the path length difference from the speaker to each microphone, the theoretical signal attenuation will be less at the high microphone than at the low microphone. However, this difference diminishes with increased speaker-microphone separation. Thus, in our analysis, we have only considered those speaker-microphone separations for which the difference in the theoretical attenuation to the two microphones is less than 1 dB, i.e., 4, 8, 16, and 32 m.
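The size of this path-length effect can be checked with a short calculation. The sketch below assumes pure geometric (spherical) spreading and treats the stand separation as the horizontal distance between loudspeaker and microphone; the heights are those given in the text, while the function and constant names are hypothetical.

```python
import math

SPEAKER_HEIGHT = 1.5                     # m above the ground
MIC_HEIGHTS = {"high": 1.5, "low": 0.5}  # m above the ground

def path_length(horizontal, mic_height, speaker_height=SPEAKER_HEIGHT):
    """Straight-line distance (m) from the loudspeaker to a microphone tip."""
    return math.hypot(horizontal, speaker_height - mic_height)

def spreading_difference_db(horizontal):
    """Extra geometric-spreading loss (dB) at the low microphone
    relative to the high microphone for a given stand separation (m)."""
    high = path_length(horizontal, MIC_HEIGHTS["high"])
    low = path_length(horizontal, MIC_HEIGHTS["low"])
    return 20.0 * math.log10(low / high)

for d in (4, 8, 16, 32):
    print(f"{d:>2} m: {spreading_difference_db(d):.3f} dB")
```

Under these assumptions the difference is about 0.26 dB at 4 m and shrinks further at each larger separation, so all four recording distances fall well within the 1 dB criterion.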

Acoustic playback experiments: degradation

We also performed sound degradation experiments within and above the forest underbrush in the natural habitat of the Guizhou snub-nosed monkey using high-quality recordings of both the UHM and the HPC vocal types as playback stimuli. The setup for the degradation playback experiments was identical to that for the attenuation playback experiments (see Fig. 1). For these studies, we also recorded the signals at distances of 4, 8, 16, and 32 m from the speaker. We performed digital cross-correlations between waveforms of both HPC and UHM calls recorded at distances of 8, 16, and 32 m and the same calls recorded at 4 m (reference signal) using Praat (Boersma 2001). The maximum cross-correlation coefficient at each distance was a metric for sound similarity; using these allowed us to determine signal degradation as a function of the distance from the source. We present these coefficients as values between 0 and 1; the higher the value, the greater the similarity between the sounds, thus less degradation.
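The similarity metric described above can be illustrated with a minimal pure-Python sketch. It is not the Praat procedure itself: the normalization by the square root of the product of the two signal energies, and the use of the absolute value so that the coefficient lies between 0 and 1, are assumptions made for this example, and the function name and waveforms are hypothetical.

```python
import math

def max_normalized_xcorr(ref, test):
    """Peak absolute cross-correlation over all lags, normalized by the
    signal energies so that identical waveforms (up to gain and delay) give 1."""
    e_ref = sum(v * v for v in ref)
    e_test = sum(v * v for v in test)
    if e_ref == 0 or e_test == 0:
        raise ValueError("signals must not be silent")
    norm = math.sqrt(e_ref * e_test)
    best = 0.0
    for lag in range(-(len(test) - 1), len(ref)):
        total = 0.0
        for i, t in enumerate(test):
            j = i + lag
            if 0 <= j < len(ref):
                total += ref[j] * t
        best = max(best, abs(total) / norm)
    return best

# Hypothetical waveforms, for illustration only.
reference = [0.0, 1.0, 2.0, 1.0, 0.0, 0.0]   # stands in for the 4 m recording
delayed = [0.0, 0.0, 0.5, 1.0, 0.5, 0.0]     # quieter, delayed, same shape
distorted = [0.0, 0.0, 0.5, 0.2, 0.9, 0.3]   # shape altered in transmission
print(max_normalized_xcorr(reference, delayed))
print(max_normalized_xcorr(reference, distorted))
```

Because the coefficient is normalized, a signal that is merely quieter or delayed still scores close to 1; only changes in waveform shape lower it, which is what separates degradation from attenuation.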

Statistical analysis

We ran the General Linear Mixed Models (GLMMs) using the lme4 package in R (R Core Team 2015, version 3.2.0; Bates et al. 2015). The model we used to investigate attenuation variation across distances included the log-transformed SPL of the recorded signal as the response variable, the vocal type (UHM, or High-Pitched Call) and the microphone at which the signal was recorded (high at height 1.5 m, or low at height 0.5 m) as fixed factors. We used z-transformed average humidity, average temperature, and distance (including four values: 4, 8, 16, 32 m) as covariates. We used transmission transect as a random factor and also added all other necessary random slopes (Barr et al. 2013), namely microphone height, vocal type, humidity, temperature, and speaker–mic distance within transmission transect. For the degradation analysis, the same fixed and random factors were used, but in this case, the cross-correlation value was deemed to be the response.
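For readers unfamiliar with the z-transformation applied to the covariates, the following Python sketch shows the operation on the four distance values. It assumes the sample standard deviation (n − 1 denominator) is used, as in common statistical software; the text does not state which form was applied, and the function name is hypothetical.

```python
import statistics

def z_transform(values):
    """Center values on their mean and scale by the sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD (n - 1 denominator): an assumption
    return [(v - mean) / sd for v in values]

distances = [4, 8, 16, 32]  # m, the four speaker-microphone separations
print(z_transform(distances))
```

After the transformation each covariate has a mean of 0 and a standard deviation of 1, which puts humidity, temperature, and distance on a comparable scale in the model.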

We verified the assumptions that the residuals were homogeneous and normally distributed by checking the qqplot and the distribution of the residuals plotted against the fitted values using a function written by R. Mundry (Estienne et al. 2017). We excluded collinearity among predictors by examining the variance inflation factors (vif function in the car package; Fox and Weisberg 2011). We compared each model against a null model (Forstmeier and Schielzeth 2011) comprising the random factors exclusively using a likelihood-ratio test (Anova with argument test “Chisq”; Dobson 2002). If the model differed from the null model, we calculated the p values for each predictor using the R-function “drop1” (Barr et al. 2013). We used a multiple contrast package (multcomp in R) to perform all pairwise comparisons for the levels of each factor with a Tukey test (Bretz et al. 2010). We reported estimate, standard error (S.E.), z- and adjusted p values for the Tukey tests that we used to identify a significant effect of distance on the attenuation and degradation of the vocalizations.

Results

Call description

Rhinopithecus brelichi (Fig. 2a) produces a high-pitched call (HPC, Fig. 2b, d), which serves as a distress vocalization often made by individuals exhibiting an elevated emotional state in response to fear (separation or rejection from the mother) or hunger. We observed wild and captive animals emitting this vocalization while either dangling from or moving through tree branches at heights from 1.5 to 5.0 m. Adults and juveniles emitted this call, which was usually directed to individuals at close range within the social unit. In captivity, we have observed that conspecifics at a distance of 25–45 m could detect the HPC and reacted by emitting the same call type. The fundamental frequencies of these signals are surprisingly high, given the large size of these animals (Fig. 2g), and can exceed 15 kHz in juveniles of R. brelichi and R. roxellana (Fig. 3). Moreover, adults of both sexes produce the UHM call, a low-frequency contact call with a fundamental frequency of 0.8 kHz (Fig. 2c, e) that functions to maintain group cohesion in this species. This vocalization is used between members of the same unit and between members of different units in the same troop; it signals the location of the caller and often elicits the same response from other members. Individuals can either be stationary (during resting or feeding) or moving while making the call (IR, CLT, pers. observations).

Fig. 2

Vocalization analyses, expected call frequencies and high-pitched calls. a Guizhou snub-nosed monkey (Rhinopithecus brelichi) in its natural habitat (Photo: Duoying Cui, with permission). b Waveform of a high-pitched call. X-axis same as d. c Waveform of an UHM call. X-axis same as e. d Sound spectrograms of a high-pitched call from an adult female and e an UHM call emitted by an adult male. Vocal signals in b–e were recorded at a captive facility in Fanjingshan National Nature Reserve (Guizhou Province, China). f Average profiles of one-third-octave band spectra for low-noise recordings of the high-pitched calls (red) and UHM (blue). The gray area shows the average one-third-octave band spectra of environmental noise in the home range of R. brelichi. Using this linear scale, the HPC area above the noise is 5.6 times the UHM area. If these data were plotted on a log scale (not shown), the HPC area above the noise would be 1.6 times the UHM area. g Log–log plot of average body length vs. average fundamental frequency (‘f06’) for 41 primate species using data from Supplementary Table S3 of Bowling et al. 2017. The dashed line shows the ordinary least squares (OLS) regression from the original analysis (see Supplementary Table S2 of Bowling et al. 2017 for details). Primate families are color-coded; colored circles are the data for the original 41 species, the triangle represents the value for R. brelichi. The inset shows a box plot of the vertical distances between each data point above the regression line to the regression line for the original 21 species; for comparison, the triangle indicates this distance for R. brelichi

Fig. 3

a Waveforms and b sound spectrograms of high-frequency calls emitted by juvenile odd-nosed monkeys: a Rhinopithecus brelichi juvenile (left) and an R. roxellana infant (right). X-axis in a same as b. Vocal signals were recorded in captivity at the Fanjingshan National Nature Reserve (Guizhou Province, China) and Beijing Zoo (Beijing, China), respectively. Spectrograms were generated using Praat with the following parameters: frequency range: 0–48 kHz; maximum: 100 dB/Hz; dynamic range: 40 dB; pre-emphasis: 6.0 dB/Oct; dynamic compression: 0.0. Photos by K. Niu (R. brelichi adult male) and C. L. Tan (R. roxellana adult male)

The Guizhou snub-nosed monkey vocal repertoire comprises several call types, including the high-pitched call (HPC)—a markedly FM emission (N = 152; fundamental frequency (hereafter, f0) = 6765 ± 175 Hz; Fig. 2b, d)—and the relatively low-pitched UHM (contact) call (N = 591; f0 = 788 ± 163 Hz; Fig. 2c, e). The area of the HPC spectrum above the ambient noise level was 598% greater than the corresponding area of the UHM call (Fig. 2f).

The f06 value of R. brelichi is clearly shifted upwards, away from the regression line through the data for the other primate species (Fig. 2g). In fact, the HPC of R. brelichi is produced at frequencies more than 2 octaves higher than the average calculated for the calls of the other 21 primate species above the regression line (Fig. 2g inset).

Attenuation of the transmitted vocalizations

As expected, the model showed that the sound amplitude decreased significantly with distance (GLMM, p < 0.001; see Fig. 4a and Table 1). The model also indicated that the attenuation of UHM and HPC calls differed significantly (GLMM, p < 0.001, Fig. 4a), and that the test sounds reaching the low microphone attenuated significantly more than those reaching the high microphone (GLMM, p < 0.001, Fig. 4b). We did not find a significant effect of temperature (GLMM, p = 0.081) or humidity (GLMM, p = 0.108) on the attenuation of the transmitted sounds.

Fig. 4

Changes in the attenuation of the HPC and UHM calls during the transmission experiments. a The amplitude of the transmitted HPC and UHM decreased significantly with distance both when recorded by the high (first two panels) or the low microphone (last two panels). Reference line shows the expected attenuation (geometric spreading of sound, i.e., 6 dB/doubling of distance) with increasing distance from the sound source. b Variation in the attenuation of the HPC and UHM calls during the transmission experiments. Signals recorded at the low microphone were significantly more attenuated than those at the high microphone. Higher values of dB SPL indicate less attenuation. Excess attenuation of HPC and UHM calls is provided in Table S1, Supplementary Information. c Variation in the degradation of the HPC and UHM calls during the transmission experiments. These plots show the cross-correlation indices between the emitted signals recorded at 8 m, 16 m, and 32 m, and the emitted reference signal recorded at 4 m. Higher cross-correlation indices indicate lower degradation (the highest cross-correlation index is 1, meaning no degradation). The cross-correlation index of the transmitted HPC (orange curves) and UHM (blue curves) decreased significantly with distance. Signals recorded at the high microphone were significantly less degraded than those at the low microphone

Table 1 (ATTENUATION) Influence of the fixed factors on log-transformed SPL (dB); results of the reduced model (full vs. null: chi-sq = 63.475, df = 5, p < 0.001)

Degradation of the transmitted vocalizations

The degradation of R. brelichi calls increased with distance from the loudspeaker (GLMM, p < 0.001; see Fig. 4c and Table 2). The model indicated that UHM calls and HPC calls degraded similarly (GLMM, p = 0.278) and that the signals recorded at the high microphone were significantly less degraded than those at the low microphone (GLMM, p = 0.012, Fig. 4c). Neither temperature (GLMM, p = 0.969) nor humidity (GLMM, p = 0.279) had a significant effect on the degradation of the transmitted sounds.

Table 2 (DEGRADATION) Influence of the fixed factors on the cross-correlation index; results of the reduced model (full vs. null: chi-sq = 23.850, df = 5, p < 0.001)

In summary, we observed that when broadcast through the Guizhou snub-nosed monkey’s natural forest habitat: (1) the high-pitched calls (HPCs) attenuated more over distance than the low-pitched UHM calls (p < 0.001, Fig. 4a, b); (2) the degradation of HPCs and UHM calls did not differ (p = 0.278, Fig. 4c); and calls recorded through the understory suffered (3) significantly more attenuation (p < 0.001) and (4) significantly more degradation (p = 0.012) than calls recorded above the understory, at 1.5 m above the ground.

Discussion

In this study, we investigated the relation between the Guizhou snub-nosed monkey vocalizations and the local environmental noise. As expected, the results from the transmission experiments showed that the effect of attenuation and degradation on signal propagation was greater (1) with increasing distance from the source along the transect; and (2) for signals recorded with the microphone through the forest floor vegetation relative to those recorded above the forest floor.

As expected from previous research on acoustic propagation through forest habitats, we found that transmission of high-pitched vocalizations suffers more attenuation than that of low-pitched calls (Wiley 2015). In our study, attenuation of low-pitched (“UHM”) calls was significantly less than that of high-pitched calls (p < 0.001) over the distances tested.

Given the absence of an audiogram for any snub-nosed monkey species, we searched for evidence that the Guizhou snub-nosed monkey can respond to high-frequency sounds. We observed that juveniles of this species produce the HPC to maintain contact with their mothers. Since the HPC emitted by juveniles of both R. brelichi and a congeneric species, R. roxellana, contain fundamental frequencies > 20 kHz (Fig. 3; Fan et al. 2018), it is likely that adult females of these species are capable of detecting these frequencies. Nevertheless, audiograms of these species are needed to definitively confirm their hearing ranges.

As in the case of the Guizhou snub-nosed monkey, where dense vegetation prevents the use of visual cues, exploiting a wider frequency span can dramatically increase the informational repertoire of a species. Our high-frequency propagation results are particularly interesting when compared with the observations of other species living in China (Holman and Seale 1991; Narins et al. 2004; Feng et al. 2006; Shen et al. 2011). These studies hypothesized that the presence of high-pitched calls may be the result of selective pressure to avoid masking by the wideband, predominantly low-frequency environmental noise (Feng et al. 2006). Our results are consistent with this apparent behavioral adaptation for successful communication. Indeed, we observed snub-nosed monkeys often emitting HPCs from 1.5 to 5.0 m above the ground, thus avoiding excess attenuation. This behavior is often found in birds and insects emitting high-frequency signals (Brenowitz 1982; Arak and Eiriksson 1992; Römer and Lewald 1992). Our results suggest that by uncoupling its vocal output from its size, the Guizhou snub-nosed monkey is able to produce its broad-spectrum HPC, thereby increasing the frequency range over which the animal may more effectively communicate in its natural habitat. This, coupled with the concomitant behavior of moving above the understory to emit this call, may provide a particularly salient signal for receivers, maximizing potential transmission and efficacy.

Our findings emerging from the comparison between Guizhou snub-nosed monkey vocalization profiles and environmental noise profiles support the Acoustic Adaptation Hypothesis. Indeed, we observed that the energy of the HPC was concentrated in the high-frequency bands that are less affected by environmental noise. Our results are in agreement with the model proposed by Charlton et al. (2019) that suggested that animals living in forests have greater hearing sensitivity for high frequencies and produce vocalizations showing higher frequency components. Indeed, vocalizations and sensory systems in forest mammals are likely to have coevolved and this is likely the case of the Guizhou snub-nosed monkey. Our results suggest that the Guizhou snub-nosed monkey represents an example of allometric escape (Tonini et al. 2020) in which its vocal output is uncoupled from the animal’s size (its expected position on the Size-Call Frequency allometry curve, Fig. 2g). This is consistent with a recent study showing that primate larynges exhibit a pattern of wide deviation from expected allometry with body size (Bowling et al. 2020). Large size variation in primate larynges has been used, for example, to explain the large laryngeal apparatus in howler monkeys (Alouatta sp.) as an adaptation for using low-frequency signals for long-distance communication through the forest (Bowling et al. 2020).

This feature enables this primate to raise its f0 and widen the bandwidth of its high-pitched call to increase its local signal-to-noise ratio to more effectively communicate in its natural environment. Previous studies showed that being able to produce calls containing frequencies that effectively exploit the frequency range available for communication can be critical for primates and other mammals to provide conspecifics with essential information (e.g., threats, territorial occupation; Clarke et al. 2006; Furrer and Manser 2009; Torti et al. 2017).

We have shown that the use of high-frequency signals (HPC) by the Guizhou snub-nosed monkey—a relatively large primate—may also be adaptive in certain forests. Future studies will need to investigate the morphophysiological bases of phonation in Rhinopithecus to verify if its high-frequency vocal output is the result of laryngeal neoteny as recently described for the bonobo (Grawunder et al. 2018), adaptations of the nasopharyngeal cavity, or a combination of these with additional factors.