Introduction

Learning is a prerequisite for the proper development of human language, both in terms of its production and comprehension. While human infants can produce a variety of nonverbal utterances from birth on, they normally proceed through a phase of repetitive vocal utterances —‘babbling’— before they begin to utter their first words (Scheiner and Hammerschmidt In press). From a comparative perspective, this raises questions of whether and to what extent experience plays a role in the acquisition of the species-typical vocal repertoire. Paradoxically, learning does not seem to be a prerequisite for the development of vocal production in young primates (Fischer 2002; Egnor and Hauser 2004). In contrast, our closest living relatives have to learn the appropriate responses to and hence the meaning of their conspecific vocalizations (Seyfarth and Cheney 1980; Fischer et al. 2001b; Fischer 2004).

Although there is general agreement that the structure of vocalizations in nonhuman primates is largely genetically determined, developmental modifications occur nonetheless. They can be related to growth in body size and to hormonal changes during puberty, which may affect the usage and structure of certain signals, specifically those that play a role in the mating context (Semple and McComb 2000; Kitchen et al. 2003; Fischer et al. 2004). Notably, hormonal variations linked to age and sex may affect body mass, and in turn, certain acoustic features.

We aimed to examine which acoustic parameters vary with age and sex of the caller, on a large sample of one type of vocalization uttered by subjects from both sexes and across all age classes. We studied chacma baboon (Papio ursinus) clear calls, also termed lost calls and contact barks (Cheney et al. 1996; Rendall et al. 2000; Fischer et al. 2001a, 2002). The baboons uttered clear calls from a very young age, and equivalent calls occur in males and females. Therefore, a comparison of acoustic features is possible across age classes and between sexes. The calls are often uttered in bouts, as in many other species, by individuals that are apparently at risk of losing contact with the rest of the group or that are separated from particular individuals (Byrne 1981; Waser 1982; Masataka and Symmes 1986; Masataka 1989; Boinski 1991; Norcross and Newman 1993; Cheney et al. 1996; Schrader and Todt 1993; Hammerschmidt et al. 2000; Rendall et al. 2000; Fischer et al. 2000, 2001a,b, 2002; Miller et al. 2004). Clear calls of baboons are tonal and harmonically rich (Figure 1). Female calls grade into alarm barks, which are harsher (Fischer et al. 2001a). Like males, females may also utter calls with two parts, the second one of lower amplitude (Fischer et al. 2002). Male clear calls take the form of wahoo barks, but they occur less frequently than female clear calls (J. Fischer, personal observation). Contact wahoo barks are part of a graded continuum of wahoo variants also given in contest situations and in alarm context (alarm barks are harsher variants of contact wahoos; Fischer et al. 2002).

Fig. 1

Spectrograms of clear calls of (a) female and (b) male chacma baboons. From left to right: infant, subadult, and adult (fast Fourier transform resolution: 1,024 points; sampling frequency: 16 kHz; time resolution: 2 ms; time overlap: 96.87%; Hamming window). The graph represents the amplitude distribution (darkness of shading) as a function of frequency and time.

We derived our predictions from the mechanisms of sound production and from previous studies. Duration, fundamental frequency, and energy distribution parameters, i.e., peak frequency and distribution of frequency amplitude, should vary with the age and sex of the caller. With increasing age, calls should become longer, have a lower fundamental frequency, and have their energy concentrated at lower frequencies. Calls of males should show the same characteristics relative to those of females. In contrast, the structure of the call, in this case mainly the course of the fundamental frequency, which allows receivers to identify the call type, is fixed from birth (Winter et al. 1973; Jürgens and Ploog 1981; Hammerschmidt et al. 2001; Owren et al. 2003; reviewed in Fischer 2002). Thus, we expected no age- or sex-related variation in acoustic parameters related to structure. We also aimed to determine the age at which sexual differences in acoustic signals emerge. We hypothesized that these differences should emerge around puberty, when sexual size dimorphism appears because males experience an additional growth spurt compared with females (Johnson 2003).

Materials and Methods

Subjects

We studied a troop of wild chacma baboons in the Moremi Wildlife Reserve in the Okavango Delta, Botswana. The group size ranged from 79 to 84 individuals during the recording periods. Because the group had been under observation for more than 20 years, the matrilineal relatedness of all natal individuals was known and all group members were individually identifiable. The group was fully habituated to human observers on foot, so recordings at close range were possible (Cheney et al. 2004).

Data Collection

Fischer and Hammerschmidt recorded calls between February and November in 1998 and between January and April in 1999 (and in March 2005 for one call), with a Sennheiser directional microphone (K6 power module and ME66 recording head with MZW66 pro windscreen) and a Sony WM TCD-100 DAT recorder on digital audio tapes (Maxell DM120, Soundman Sm-120, and Sony DT120) with a sampling frequency of 44.1 kHz. They spoke the context and the identity of the caller onto the tape during data collection. We transferred calls of sufficient quality, i.e., with a good signal-to-noise ratio, not overloaded, without background noise, and not affected by wind, from the digital audio tapes to a Toshiba Satellite laptop connected to a USB audio interface (Edirol Audio Capture UA5). We used the software Avisoft SASLab Pro 4.3 (R. Specht, Berlin, Germany), with a sampling frequency of 44.1 kHz and a 16-bit resolution, and worked in mono format.

Call Selection

As a first step, we obtained 808 clear calls of high quality from 48 individuals. In addition, we recorded 18 calls with some background noise (insects, leaves, wind) from 10 other individuals. For the latter calls, we did not calculate parameters related to energy distribution (dominant frequency bands, distribution of frequency amplitude) because energy levels might be affected by the additional energy of background noise. Therefore, we included them only in the analysis of parameters dealing with duration, fundamental frequency, structure, and peak frequency. We included only 1 call per individual in the analysis. For individuals with several available calls, we chose 1 call at random using a random function of Excel (Microsoft Office XP Professional 2002).

We ended up with two data sets. Data set A contained 48 clear calls of excellent quality from 48 different individuals for the analysis of the distribution of frequency amplitude and the first dominant frequency band. Data set B comprised 58 calls from 58 individuals (the 48 previous calls and 10 additional ones selected among those disturbed by some background noise) for the analysis of duration, fundamental frequency, structure-related parameters, and peak frequency.
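The random choice of one call per individual can be sketched in a few lines of Python. This is only an illustration of the selection step, not the Excel procedure actually used; the function name, the individual and call identifiers, and the seed are hypothetical.

```python
import random
from collections import defaultdict

def pick_one_call_per_individual(calls, seed=None):
    """Return {individual: call_id}, drawing one call at random per individual.

    calls: iterable of (individual, call_id) pairs (hypothetical identifiers).
    """
    rng = random.Random(seed)
    by_individual = defaultdict(list)
    for individual, call_id in calls:
        by_individual[individual].append(call_id)
    # Iterate in sorted order so that a fixed seed gives a reproducible draw.
    return {ind: rng.choice(by_individual[ind]) for ind in sorted(by_individual)}

# Hypothetical recordings, for illustration only.
calls = [
    ("F01", "c1"), ("F01", "c2"), ("F01", "c3"),
    ("M07", "c4"),
    ("J12", "c5"), ("J12", "c6"),
]
selected = pick_one_call_per_individual(calls, seed=1)
```

Each individual contributes exactly one call to `selected`, which keeps the calls in the data set statistically independent at the level of the caller.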

We assigned each individual to an age class (Table I). The first age class included individuals that had not reached puberty, i.e., between birth and 2 yr of age (0–104 wk). There were too few male subjects around 3 yr of age to perform a meaningful statistical analysis; hence we excluded them. Individuals that had just completed puberty, i.e., subadults between 4 and 6 yr of age (157–312 wk), constituted the second age class. They were not fully grown, and the sexual dimorphism in body size and mass was not very pronounced. The last age class included individuals that were fully grown and presented a clear sexual dimorphism in body size and mass. They were adults >6 yr of age (≥313 wk).
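As a minimal sketch, the week boundaries given above can be written as a lookup function. The label "prepubertal" for the first age class is a hypothetical name, and returning `None` for every individual between 105 and 156 wk is an assumption: the text states only that the few males around 3 yr of age were excluded.

```python
def age_class(age_weeks):
    """Map an age in weeks to an age class; None if the age falls in no class."""
    if age_weeks < 0:
        raise ValueError("age must be non-negative")
    if age_weeks <= 104:
        return "prepubertal"  # hypothetical label for the first age class
    if age_weeks <= 156:
        return None           # assumption: ages around 3 yr are left unclassified
    if age_weeks <= 312:
        return "subadult"
    return "adult"            # 313 wk and older

# Ages at and around the class boundaries stated in the text.
classes = [age_class(w) for w in (0, 104, 130, 157, 312, 313)]
```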

Table I Distribution of individuals for each age and sex class for the reduced data set with noise-free calls only (A), and the full data set including calls with some background noise (B)

Acoustic Analysis

We converted the sampling frequency of the calls via the software Avisoft SASLab Pro 4.3 (R. Specht, Berlin, Germany) from 44.1 kHz to 8 kHz, which is twice the maximum frequency of interest for us (Beeman 1998) according to previous findings about baboon clear calls (Fischer et al. 2001a, 2002). We used a frequency resolution of 1,024 points for the fast Fourier transform analysis and worked with the Hamming window. For the temporal resolution, we worked with an overlap of 96.87%, i.e., with 4-ms intervals.
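The relation between sampling frequency, transform length, overlap, and time step can be checked with a short calculation. The sketch below assumes that the analysis window is as long as the transform (1,024 samples) and that the step between successive windows is rounded to a whole number of samples; the function name is hypothetical.

```python
def spectrogram_resolution(sampling_hz, fft_points, overlap):
    """Derive spectrogram resolution figures from the analysis settings.

    Assumes the window length equals fft_points and the step between
    successive windows is rounded to a whole number of samples.
    """
    hop_samples = round(fft_points * (1.0 - overlap))
    return {
        "nyquist_hz": sampling_hz / 2,            # highest frequency represented
        "bin_width_hz": sampling_hz / fft_points, # spacing of frequency bins
        "hop_samples": hop_samples,
        "time_step_ms": 1000.0 * hop_samples / sampling_hz,
    }

analysis = spectrogram_resolution(8000, 1024, 0.9687)   # settings used for the analysis
figure = spectrogram_resolution(16000, 1024, 0.9687)    # settings given for Fig. 1
```

Under these assumptions, the 8-kHz setting gives a step of 32 samples, i.e., 4 ms, with frequency bins about 7.8 Hz wide and an upper limit of 4 kHz, whereas the 16-kHz setting of Fig. 1 gives the 2-ms time resolution stated in its legend.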

We used a custom software program (LMA 2005), developed by Hammerschmidt, to calculate a suite of acoustic parameters (Schrader and Hammerschmidt 1997). In accordance with our predictions, we calculated 19 different acoustic variables (Table II). We used an interactive macro to calculate parameters linked to fundamental frequency and structure. In addition, we determined the duration, peak frequency parameters, and the global distribution of amplitude in the frequency spectrum. We did not calculate the formant frequencies because the structure of calls of young baboons with a high fundamental frequency and few harmonics does not allow for a reliable determination of the filter function of the vocal tract. Moreover, in previous analyses, we faced a poor observer reliability for the determination of the higher formant frequencies (Fischer et al. 2004). In addition, the lower formants were well represented by variables reflecting the amplitude distribution in the frequency spectrum (distribution of frequency amplitude, dominant frequency band).

Table II Definition of the acoustic parameters used for the statistical analysis

Statistical Analysis

First, we conducted a principal components analysis (rotation method: Varimax with Kaiser normalization; rotation converged in six iterations) to reduce the number of acoustic parameters to a set of uncorrelated variables. The analysis identified four principal components (Table III). The first was related to the energy distribution and the peak frequency, but also described the local modulation. The second, third, and fourth components correlated with the fundamental frequency, the energy distribution in the spectrum at the beginning of the call, and the location of the maximum value of the fundamental frequency, respectively. Call duration did not exhibit high loadings on any of the extracted components. For each component, we selected the variable with the highest correlation with that component. In addition, we selected call duration and local modulation of the fundamental frequency. The local modulation gives an estimate of the stability of sound production and is not related to the peak frequency. We also made different predictions for local modulation and peak frequency.
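The step of choosing one representative variable per component can be sketched as follows. The sketch covers only the selection from a rotated loading matrix, not the principal components analysis or the Varimax rotation itself. The variable names and loadings are invented for illustration and are not the values in Table III, and ranking by absolute loading is an assumption.

```python
def top_variable_per_component(loadings):
    """Return, for each component, the variable with the largest absolute loading.

    loadings: {variable_name: [loading on PC1, loading on PC2, ...]}
    """
    n_components = len(next(iter(loadings.values())))
    selected = []
    for k in range(n_components):
        best = max(loadings, key=lambda var: abs(loadings[var][k]))
        selected.append(best)
    return selected

# Toy loading matrix (hypothetical values, NOT those of Table III).
toy_loadings = {
    "var_a": [0.91, 0.10, 0.05],
    "var_b": [0.85, 0.20, -0.02],
    "var_c": [0.12, -0.88, 0.15],
    "var_d": [0.08, 0.30, 0.79],
}
representatives = top_variable_per_component(toy_loadings)
```

Variables that load highly on no component, as call duration did here, are never returned by such a rule and have to be added by hand.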

Table III Matrix of the composition of the principal components after rotation. Numbers indicate the loading of each variable (correlation) with the respective component. Variables that were selected for further analysis are indicated in bold

We conducted a GLM analysis on each of the six representative acoustic parameters. As we made differential predictions for these variables, we refrained from corrections for multiple testing. We first examined the full model with age and sex as independent variables, and the interaction between age and sex. We considered age and sex as fixed factors, as we worked on three age classes and on the two sexes. For acoustic parameters that showed no significant interaction between age and sex, we recalculated the univariate analyses without the interaction between age and sex. Finally, we compared the linear regressions of the acoustic parameters that presented a significant interaction between age and sex to establish at which age these variables begin to differ significantly between males and females. To do this, we conducted a univariate analysis to compare the mean value of each parameter between sexes for each age class separately. We used SPSS 12.0 for Windows for all statistical analyses.
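The final step, comparing the sexes within a single age class, amounts to a one-way comparison of two group means. The sketch below computes the F statistic and the proportion of explained variance (eta squared) with the Python standard library. It is not the SPSS procedure, it omits the P value because the standard library has no F distribution, and the call durations are hypothetical.

```python
from statistics import mean

def one_way_f(groups):
    """Return (F, eta_squared) for a one-way comparison of group means."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual values around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    eta_squared = ss_between / (ss_between + ss_within)
    return f_value, eta_squared

# Hypothetical call durations (ms) for one age class, for illustration only.
females = [300.0, 320.0, 310.0, 330.0]
males = [380.0, 400.0, 390.0, 410.0]
f_value, eta_squared = one_way_f([females, males])
```

With only two groups, this F value equals the square of the t statistic from an independent-samples t test with pooled variance.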

Results

Age and Sex Influences

The univariate analysis revealed that the duration of the calls and the mean fundamental frequency varied significantly with age (Table IV). With increasing age, the subjects uttered longer calls (Figure 2a) with a lower fundamental frequency (Figure 2b). These two variables also exhibited a significant interaction between age and sex (Table IV), indicating that the profiles of age-related variations differed between the sexes. Males showed a more pronounced variation than females. Thus, in adults, males uttered longer calls with a lower fundamental frequency compared with those of females (Figure 2a, b).

Table IV Explained variance and results of univariate analysis on each parameter showing a significant interaction between age (juveniles, subadults, and adults) and sex
Fig. 2

Mean values and their standard error for four acoustic parameters. (a) Duration of the call; (b) mean value of the fundamental frequency; (c) maximum value of the peak frequency; (d) local modulation of the fundamental frequency. Dashed lines: females; solid lines: males.

The maximum peak frequency, the frequency at which the second quartile of global energy is reached at the beginning of the call, and the local modulation of the fundamental frequency varied significantly with the age of the caller (Table V). With increasing age, the subjects uttered calls with the energy concentrated in lower frequencies (Figure 2c) and with a less modulated fundamental frequency (Figure 2d). The maximum peak frequency and the local modulation of the fundamental frequency varied marginally with the sex of the caller, whereas there was no significant interaction between age and sex (Table V). In the calls of adult males, the energy was mainly located in lower frequencies than in the calls of adult females. Females exhibited a higher local modulation of the fundamental frequency than males in the first two age classes, whereas adult males and females exhibited similar degrees of local modulation (Figure 2d). In contrast, there was no influence of age or sex on the parameter representative of the structure of the call, i.e., the location of the maximum fundamental frequency (Table V).

Table V Explained variance and results of univariate analysis on those parameters that did not show a significant interaction between age and sex

Emergence of Sexual Differences

Differences between sexes in duration and in fundamental frequency became significant in the third age class, with a trend already appearing in the second age class for the fundamental frequency (Table VI).

Table VI Comparisons of the acoustic variables that presented a significant interaction between age and sex. Explained variance, F and P values for differences between the sexes within each of the three age classes are shown

Discussion

Firstly, we found that age and sex influenced acoustic parameters related to duration, fundamental frequency, peak frequency, and local modulation of the fundamental frequency. In contrast, the structure of the call, represented by the location of the maximum value of the fundamental frequency, remained stable with increasing age. Thus, the results met our predictions. Secondly, sexual differences tended to emerge in immature subjects for fundamental frequency parameters, but they were clearly significant only in adults for duration.

The location of the maximum value of the fundamental frequency, as one indicator of the overall structure of the call, did not differ between the sexes or in relation to age. Other variables representing the global structure of the calls that were not included in the present analysis for statistical reasons, for instance the global trend, also showed little variation with age. Our findings are consistent with the view that the global structure of the call is largely genetically fixed and does not change much during ontogeny (Winter et al. 1973; Jürgens and Ploog 1981; Owren et al. 2003). This is not to say that the fine-grained structure of a call is not subject to variation due to motivational state, individual call characteristics, hormonal status, and, as shown here, age and sex.

The age- and sex-related variations in acoustic parameters seem to correspond to the variations expected because of variations in body size, according to the mechanisms of sound production (Fitch and Hauser 1995). Notably, the growth of the whole body leads to a higher lung capacity, which may influence call duration and sound pressure: bigger lungs allow production of longer and louder sounds. The growth of the body also leads to an increase in the length and possibly shape of the vocal tract, and hence affects its resonant and filtering properties. Consequently, larger animals tend to exhibit a lower formant dispersion (Fitch 1997; Pfefferle and Fischer 2006). An increase in body size correlates with a lengthening of the vocal folds, which on average leads to the production of sounds with a lower fundamental frequency (Fitch 1997). Body size correlates negatively with the mean, highest and lowest frequencies of the repertoire across different nonhuman primate genera and also within species, even if some intrageneric and intraspecific exceptions remain (Hauser 1993). Body size obviously acts as a physical constraint in sound production. Such vocal signals, namely index signals, provide honest information about some of the sender’s intrinsic attributes, with relatively low cost for signallers, simply because of the mechanisms of sound production (Fitch 1997; Fitch and Hauser 1995, 2003; Vehrencamp 2000).

Further support for a role of body size comes from the observation that age-related changes appear to have a stronger influence than sex-related changes. This corresponds to the degree of variation in body size between young and adult animals versus between females and males. Furthermore, our results suggest that sexual differences tended to emerge earlier in fundamental frequency than in duration. The emergence of sexual differences in fundamental frequency and duration around 6 yr corresponds to the growth spurt experienced by males at 5½ yr (Johnson 2003). The slightly later emergence of sexual differences in duration is relevant because this parameter is linked to lung capacity and thus to the whole body. It may therefore take longer to achieve a sufficient increase in lung capacity than in vocal fold length. Precise measurements of body and skull size would be needed to clarify this relation.

A slightly earlier emergence of sexual differences in fundamental frequency parameters is also relevant. The larynx grows independently from the rest of the body, because the androgen receptors of the laryngeal cartilages respond to an increase in circulating testosterone (Fitch 1997). Therefore, changes only in body size may be too slow to fully explain variations in acoustic features, especially in fundamental frequency (Snowdon 1988). Thus, changes in hormonal concentrations, especially at puberty, may contribute to sexual differences in some acoustic parameters, but again, this remains to be investigated further. Indeed, new studies on the same group of baboons linking hormonal concentrations and social status have been published recently (Beehner et al. 2006; Bergman et al. 2006; Engh et al. 2006), but the hormonal data for our set of individuals are not available.

While we corroborated the observation that some acoustic parameters tend to vary significantly with body size, this may be true only for large size differences; for instance, between age classes, i.e., when considering infants, juveniles, subadults and adults. Concerning small size differences, as within an age and sex class, the relationship between acoustics and body size might be more complicated. Rendall et al. (2005) found no relationship between fundamental frequency or formant frequencies and body size in human adult females, and only a weak one between body length and formants in human adult males, as did González (2004) in adult humans of both sexes. Collins (2000) also highlighted the fact that fundamental frequency and formants allow a reliable estimation of weight but not of age and height in human adult males. In a previous study, we also found that among baboon adult males, the correlation of the fundamental frequency with shoulder height or weight broke down (Fischer et al. 2004). The relation between specific acoustic variables and their informational value can get even more complicated when other effects of the sound production process, such as social rank, begin to affect the structure of the variable under examination. In particular, the fundamental frequency may increase with increasing call amplitude, as a result of the higher lung pressure. Notably, high-ranking adult male chacma baboons uttered calls with a higher fundamental frequency than those of lower-ranking males, but there is no correlation between rank and body size in adult males (Fischer et al. 2004). Nevertheless, the fundamental frequency of the dominant males was substantially lower than that of subadult males; i.e., they were still clearly recognizable as adults. In sum, acoustic parameters can transmit reliable information about the gross size of a given individual, but may not be suited to distinguish the size of callers that vary by only a few percent.

Although we managed to collect calls from a large sample of individuals, the remaining problem was the imbalance in the number of calls from females and males. Indeed, despite increased sampling efforts for young males, we collected fewer calls from males than from females. Sexual differences in vocal behavior, in relation to sexual differences in global behavior, may represent one possible explanation. In fact, adult females and juveniles account for the majority of clear calls (Cheney et al. 1996). Males utter clear calls at a much lower rate than females (Cheney et al. 1996; Fischer et al. 2002). This might be because young males are more likely than young females to stay with their playmates. Thus, the probability of them being completely isolated from the rest of the group is lower than for young females. In contrast, young females may be less used to staying with playmates. Therefore, when they become separated from the rest of the group, they are more likely to be completely alone and thus may be more aroused. This higher arousal compared with young males may be one possible cause of the difference in calling rate. It may also be the reason why the fundamental frequency is more modulated in the calls of young females than in those of young males. However, this explanation raises the question of why there is no difference between adult females and males, as adult males appear to be less bothered than adult females when they lose contact with the troop (J. Fischer, personal observation). In fact, adult males need to transfer between groups and can often be at the fringes of the group. Thus, they may be better habituated to temporary isolation than females are. Unlike females, males do not become separated from their dependent offspring, one major cause for emitting clear calls in adult females. The causes and consequences of this intriguing difference in the vocal behavior of male and female baboons remain to be further investigated.