Introduction

All known gibbon species produce elaborate, loud, long, and stereotyped patterns of vocalization often referred to as songs (Geissmann 1993). Previous researchers suggested that gibbon song characteristics are useful for assessing systematic relationships at the genus and the species level and for reconstructing gibbon phylogeny (Creel and Preuschoft 1984; Geissmann 1993, 2002a, b; Haimoff 1983; Haimoff 1984a).

Gibbons routinely maintain their territories through loud morning song bouts wherein mates combine their species-specific and often sex-specific vocalizations to produce well-patterned duets (Deputte 1982; Geissmann 1993; Geissmann and Orgeldinger 2000). Species-specific differences in song (structure, the amount of solo singing of either sex, the amount of duetting, and the complexity of vocal coordination) all suggest that the functions of singing differ across gibbon species (Geissmann 1984; Geissmann and Orgeldinger 2000). Earlier published reports suggest that duetting may serve several functions in gibbons, such as territory defense (Geissmann 1984), maintenance of group cohesion (Geissmann and Orgeldinger 2000), and advertisement of pair bonds (Raemaekers and Raemaekers 1985). The song repertoire is notably consistent in structure and organization for each species, and is believed to be largely genetically determined (Brockelman and Schilling 1984; Geissmann 1984; Leighton 1986).

Gibbons are highly specialized, but also very homogeneous, in their anatomy. Some species exhibit sexually dichromatic pelage coloration and undergo striking color changes during their ontogeny. Therefore, reliable species or subspecies identification is often not possible based on pelage coloration criteria alone. Geissmann (2002b) used cladistic methods to compare 3 different types of data (fur coloration, anatomical/morphological data, and vocal data) with respect to their relevance for the reconstruction of gibbon phylogeny. Of the 3, vocal data produced the most reliable phylogeny (Geissmann 2002b). Contemporary taxonomies extend the traits used beyond the purely anatomical ones that form the basis of traditional classification. A species' identity is maintained by selection for compatible reproductive biology at the behavioral, physiological, and anatomical levels (Paterson 1985). In gibbons, species-specific vocalizations may imply discrete breeding patterns. In some cases, anatomically indistinguishable populations may be reproductively isolated by their respective vocalizations.

Crested gibbons (Nomascus) are 1 of 4 main taxonomic groups within the Hylobatidae and are characterized by various morphological, anatomical, karyological, and vocal features (Couturier and Lernould 1991; Garza and Woodruff 1992, 1994; Geissmann 1993, 1995, 2000; Geissmann et al. 2000; Groves 1993, 2001; Schilling 1984; Takacs et al. 2005; Zhang 1997). Five species of crested gibbons are currently recognized (Table I). In terms of singing behavior, crested gibbons exhibit a number of unique characteristics that set them apart from other gibbons (Geissmann 1997; Goustard 1976, 1984; Haimoff 1984b). Song bouts of mated pairs of Nomascus are highly stereotyped and male-dominated, whereas solo songs appear to be produced by nonmated individuals only. In addition, crested gibbons exhibit the highest degree of sex-specificity in their songs, as there is typically no overlap between the sexes in either note repertoire or phrase repertoire (Geissmann 2002a).

Table I Classification of the Hylobatidae, showing scientific, English, and common names based on a preliminary phylogenetic tree of the gibbons, combining trees based on vocal and molecular data

Crested gibbons occur in tropical evergreen and less seasonal parts of semi-evergreen rain forests of Indochina (southern China, Vietnam, Laos, and Cambodia). All crested gibbons occurring in central Vietnam and southern Laos are currently regarded as Nomascus siki (Geissmann 1995; Geissmann et al. 2000; Konrad and Geissmann 2006). Based on a comparison of songs, it has previously been reported that gibbons in a large area in Laos and Vietnam are neither typical Nomascus leucogenys nor Nomascus siki. Fur coloration and vocal data from this area suggested either a large hybrid zone with 1 subspecies gradually replacing the other, or the existence of a previously unrecognized taxon, or a combination of the 2 (Konrad and Geissmann 2006). Thus, Nomascus siki may consist of >1 taxon, a southern and a northern one (Konrad and Geissmann 2006).

I compared the songs of wild northern and southern Nomascus siki to test the hypothesis that vocal differences are explained by geographic and taxonomic affiliations. Specifically, I predicted that northern and southern populations of Nomascus siki would be distinguishable based on vocal characteristics alone.

Methods

Field Methods

The tape recordings of gibbon songs included in the present study originated from 7 areas in Vietnam and Laos (Fig. 1). I use the term geographic population to refer to the area, northern or southern, in which the individual populations occur. The northern geographic population is composed of populations 1–3. The southern geographic population is composed of populations 4–7. Populations are as follows:

1) Nakai Nam Theun National Biodiversity Conservation Area (NBCA), Laos. Recordings were made by Robert Timmins, Peter Davidson, Pham Nhat, and Anthony Stones.

2) Hin Namno National Biodiversity Conservation Area (NBCA), Bua La Pha District, Kham Muam Province, Laos. Recordings were made by Robert Timmins, Peter Davidson, Pham Nhat, and Anthony Stones.

3) Phong Nha Ke Bang National Park (NP), Quang Binh Province, Vietnam (Ruppell 2007). Recordings were made by Julia Ruppell.

4) Recordings originated from 7 communes within Thua Thien Hue Province, Vietnam: Hong Ha, A Roang, Hong Van, Thuong Quang, Houng Nguyen, Thi Tran Phu Loc, and Loc Thuy. Recordings were made by Barney Long in cooperation with Management of Strategic Areas for Integrated Conservation (MOSAIC).

5) Bach Ma National Park (NP), Thua Thien Hue Province, Vietnam (Tallents et al. 2001). Recordings were made by Thomas Geissmann.

6) Xe-Bang Nouan National Biodiversity Conservation Area (NBCA), Laos, and Xe Sap National Biodiversity Conservation Area (NBCA), Laos. Recordings were made by Robert Timmins, Peter Davidson, Pham Nhat, and Anthony Stones.

7) Recordings originated from 5 communes within Quang Nam Province, Vietnam: Phuoc Xuan, Ma Cooih, West La Dee, Phuoc My, and Tabhing. Recordings were made by Barney Long in cooperation with Management of Strategic Areas for Integrated Conservation (MOSAIC).

Fig. 1

Map showing the 7 sites from which I obtained gibbon recordings.

I made sound recordings with a Marantz PMD 660 Flash Recorder and a Rode NTG 1 Directional Condenser Shotgun Microphone. Thomas Geissmann, Barney Long, Robert Timmins, Peter Davidson, Pham Nhat, and Anthony Stones used a SONY WM-D6C cassette recorder with a JVC MZ-7-7 directional microphone or a SONY TC-D5M cassette recorder with a Sennheiser ME80 directional microphone. I digitized the tape recordings at a sampling rate of 22 kHz and a sample size of 16 bits. I generated sonograms (time vs. frequency displays) of the sound material using Raven version 1.2 software (Cornell Laboratory of Ornithology) on a Dell Inspiron M1210.

Nomascus Song Structure

Acoustic terminology follows that proposed by Haimoff (1984a, b; Table II).

Table II Acoustic terms and definitions for gibbon song

Female Song Contributions

Adult female Nomascus produce great call phrases only. The great call begins with oo notes (fa), followed by bark notes (fb) and ends with twitter notes (fc; Fig. 2). In the course of a great call, the female starts with long notes of slowly increasing frequency (oo notes). Note durations and interval durations become continuously shorter and oo notes gradually change to short notes of steeply increasing frequency (bark notes; Figs. 2 and 3). After the climax of the acceleration, bark notes tail off into a twitter (fc).

Fig. 2

Sample sonogram (from Geissmann et al. 2000; Konrad and Geissmann 2006) showing sexual dimorphism in song phrases of the northern white-cheeked crested gibbon. (a) Great call phrase of an adult female. The great call begins with oo notes (fa), followed by bark notes (fb), and ends with twitter notes (fc). (b) Phrases of an adult male begin with booms (ma), followed by staccato notes (mb), and end with a multimodulated phrase (mc). (c) Trio song of an adult pair and their juvenile son. The female sings a great call into the phrases of her mate, who pauses his song after a boom note (ma), and adds a multimodulated phrase (mc) to the end of the female’s great call. During her great call, her juvenile son accompanies the female with a short, great call-like phrase. To facilitate “reading” of this sonogram, the female contributions are artificially lightened and the juvenile phrase is darkened.

Fig. 3

Stylized sonograms (from Konrad and Geissmann 2006) of (a) the female great call phrase. Durations and ranges are measured corresponding with Table IV. Great calls of female crested gibbons have a stereotyped structure and can easily be recognized. (b) The first note and the second note of the male’s multi-modulated phrase, showing the split in different parts, all measurement points and tangents, durations and ranges measured on these notes. The initial and terminal parts of the second note exhibit moderate frequency modulation, whereas the roll part may include several rolls but includes at least one roll in fully developed phrases. Only fundamental frequencies are shown.

Male Song Contributions

Fully developed song phrases of adult male Nomascus typically consist of 3 different note types, i.e., ma, mb, mc (Fig. 2). The boom note (ma) is a very deep note of constant frequency and is produced during inflation of the throat sac. Boom notes are usually produced as single notes, unlike other male notes, which usually occur in a short series (phrases). The staccato notes (mb) are short, relatively monotonally repeated sounds. The most conspicuous part of the male song is the multimodulated phrase (mc). This phrase consists of several notes, which exhibit rapid and steep frequency modulations (Fig. 3). Adult males typically utter a multimodulated phrase (also called a coda) at or shortly after the climax of the female great call phrase.

In the course of a complex song bout, the male phrases are gradually built up. At the beginning of the song bout, the male produces long unmodulated notes that are precursors of the multimodulated phrases. Later in the song bout, the phrases become more and more modulated and boom notes and staccato phrases are added.

In a fully developed duet song bout, the male singer continuously cycles through the 3 types of phrases: boom phrase, staccato phrase, and multimodulated phrase, usually in this order. When the female starts a great call phrase, the male interrupts and at the end of the great call, answers with a coda (also called multimodulated phrase or mc). After that, he resumes cycling through the 3 types of phrases.

Sample Size of Tape-Recorded Gibbon Songs

I analyzed 42 group song bouts from wild, nonhabituated gibbons. The song bouts comprised 173 female phrases (great calls) from 42 different female gibbons and 192 male phrases from 42 different male gibbons (Table III). As the actual distribution of the group territories was unknown and the gibbon groups or individuals were generally out of sight while being recorded, some uncertainty about the actual group identity remains. When in doubt about whether 2 recordings were produced by the same group or by 2 distinct groups, I excluded recordings of inferior quality from the analysis. Owing to low recording quality, and thus low sonogram quality, I excluded some additional recordings from the analysis.

Table III The number of phrases analyzed for each gibbon group

The amount of sound material available for analysis varied considerably among groups. Gibbon males produce fully developed multimodulated phrases only after several minutes of producing simpler, less modulated phrases. I aimed to include only phrases in the analysis that represented the most developed stage of each male gibbon song. Female crested gibbons sing only during great calls whereas males sing during and between each great call. Therefore, song bouts include fewer female phrases than male phrases.

Per tape-recorded individual, I analyzed ≤10 complete and fully developed phrases. If more phrases were available, sonogram quality was the selection criterion. I regarded a male phrase as fully developed if the multimodulated phrase consisted of ≥2 notes in which the second exhibited at least 1 roll. Aborted great calls lack the twitter and usually comprise <5 notes.

Acoustic Analysis

I measured 22 male and 14 female structural parameters of the frequency or time dimension from the sonogram of each phrase selected for analysis (Table IV). To quantify acoustic characteristics of the male and female phrase, I defined 78 variables (Appendix). The reason for calculating so many variables was to adequately describe the complex gibbon song structure without making any assumptions a priori on the importance of the song characteristics.

Table IV Definition of note types, note parts, anchor points, and tangents

Statistical Analysis

I performed statistical analyses on a Dell Inspiron M1210 using SPSS (version 15 for Windows) and R (version 2.9.0). I used linear discriminant analysis (LDA), classification trees, and multidimensional scaling to assess vocal differences between gibbon populations.

Linear Discriminant Analysis

Linear discriminant analysis (LDA) is a parametric multivariate method useful for analyzing population differences. Linear functions of the independent variables, i.e., the song variables, are formed to describe the differences between ≥2 populations. LDA also identifies the relative contribution of a variable, or a set of variables, to the separation of the populations (Rencher 1995). I made use of this quality to estimate which of the 78 variables determined for this study contribute most to discriminating among populations (Appendix). I conducted a stepwise discriminant analysis. In this procedure, a model of discrimination is built up step by step, i.e., variables are included one after the other. At each step, all variables are reviewed and the one that contributes the most to separating the populations is included in the model. This process is repeated until either all variables are included or all redundant variables are excluded. Redundancy among the independent variables is expressed by the tolerance. The tolerance is a measure of the degree of linear association between independent variables. It is used to avoid entering a variable that is a linear combination of a variable already in the model (Norusis 2005).
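The role of the tolerance can be illustrated with a minimal sketch. It assumes the standard definition, in which the tolerance of a candidate variable is 1 minus the squared multiple correlation with the variables already in the model, and it covers only the simplest case of a single variable already in the model, where this reduces to 1 minus the squared Pearson correlation. The function names and the numbers are hypothetical and are not taken from the study's data.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def tolerance_single(candidate, in_model):
    """Tolerance of a candidate variable when exactly one variable is in the model.

    Simplifying assumption: with several variables in the model, the squared
    multiple correlation from a regression would be needed instead of r squared.
    """
    r = pearson_r(candidate, in_model)
    return 1.0 - r * r

# Hypothetical song variable already in the model.
in_model = [1.0, 2.0, 3.0, 4.0]
# A candidate that is an exact linear function of it (fully redundant).
redundant = [2.0 * v + 1.0 for v in in_model]
# A candidate that is uncorrelated with it.
independent = [1.0, -1.0, -1.0, 1.0]

tol_redundant = tolerance_single(redundant, in_model)
tol_independent = tolerance_single(independent, in_model)
print(tol_redundant, tol_independent)
```

A tolerance near 0 marks a candidate that adds no information beyond what is already in the model, whereas a tolerance near 1 marks a candidate that is linearly unrelated to it.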

I employed 2 types of coefficients to assess the relative contribution of each variable that I included in the analysis by the stepwise method to the separation of the populations: the standardized discriminant function coefficients and correlation coefficients between the values of the function and the values of the variables.

A major application of LDA is the predictive classification of cases. Once a model has been finalized and the discriminant functions have been derived, classification shows how well one can predict the group to which a particular case belongs. On the basis of the discriminant functions, cases, i.e., the recorded gibbon groups, are classified into 1 of the populations, i.e., the 7 gibbon populations from which I obtained recordings. Because the actual population of each recorded group is known, it can be compared to the predicted population membership derived from the discriminant functions. The expected percentage that would be correctly classified by chance alone is the size of the largest population divided by the total number of groups, in this case 10 (groups in the largest population) divided by 42 (total groups), which equals 23.8%. One can take the percentage of cases classified correctly as an indicator of the effectiveness of the discriminant function.
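The chance baseline can be reproduced with a short calculation. The function name is hypothetical; the two counts are those given in the text.

```python
def chance_rate(largest_population_size: int, total_groups: int) -> float:
    """Proportion classified correctly if every group were assigned to the largest population."""
    return largest_population_size / total_groups

# 10 groups in the largest population, 42 recorded groups in total (from the text).
baseline = chance_rate(10, 42)
print(f"{baseline:.1%}")  # 23.8%
```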

Classification Trees

LDA is frequently used for classification problems, but researchers rarely consider the statistical limitations and assumptions that the technique requires (Feldesman 2002). Classification trees are a nonparametric alternative that addresses the primary concerns with LDA. I built a classification tree using R’s Recursive Partitioning (RPART) routines (Therneau and Atkinson 1997). Classification trees are built via a process known as binary recursive partitioning, an iterative process of splitting the data into partitions and then splitting each partition further along each branch. The process starts with a data set consisting of preclassified groups in which the dependent variable has a known class or label. In this study, the known classes are populations 1–7. The goal is to build a tree that distinguishes among the classes. To choose the best splitter at a node, the algorithm considers each input field in turn. It sorts each field and considers every possible split; the best split is the one that produces the largest decrease in diversity of the classification label within each partition. The goal of classification trees is to predict or explain responses on a categorical dependent variable. Therefore, the available techniques have much in common with those used in the more traditional LDA. Steinberg and Colla (1997) enumerated the general technical advantages of the classification tree algorithm over parametric techniques such as LDA. The primary advantages of classification trees relevant to this study are that they are nonparametric, require no advance variable selection, are robust to outliers, and can use any combination of categorical and continuous variables (Feldesman 2002; Steinberg and Colla 1997). Classification trees are recommended either as a replacement for LDA or as a supplement (Feldesman 2002).
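The split search at a single node can be sketched as follows. The sketch assumes Gini impurity as the diversity measure, which is one common choice; the text does not state which measure was used in this study. It handles only one continuous variable, and the function names and data values are hypothetical.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_split(values, labels):
    """Return (threshold, impurity_decrease) for the best binary split of one variable."""
    pairs = sorted(zip(values, labels))  # sort cases by the value of the variable
    n = len(pairs)
    parent = gini([label for _, label in pairs])
    best = (None, 0.0)
    for i in range(1, n):
        low, high = pairs[i - 1][0], pairs[i][0]
        if low == high:
            continue  # identical values cannot be separated by a threshold
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / n
        decrease = parent - weighted
        if decrease > best[1]:
            best = ((low + high) / 2, decrease)  # threshold midway between neighbors
    return best

# Hypothetical ranges of start frequencies (Hz) for six groups; not the study's data.
ranges_hz = [200.0, 230.0, 210.0, 540.0, 560.0, 520.0]
regions = ["north", "north", "north", "south", "south", "south"]
threshold, decrease = best_split(ranges_hz, regions)
print(threshold, decrease)
```

A full tree repeats this search over every variable at each node and then recurses into the two resulting partitions.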

Multidimensional Scaling

I used multidimensional scaling plots to visualize vocal distances and vocal variability within gibbon populations. Multidimensional scaling is designed to analyze distance-like data, called dissimilarity data, that indicate the degree of difference or similarity of samples (Norusis 2005) and to display the structure of the data as a geometrical picture. A multidimensional scaling algorithm starts with a matrix of item-item similarities and then assigns a location to each item in a low-dimensional space. On the basis of the multidimensional scaling analysis, the populations are placed on a map (Euclidean Distance Model).
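The idea of recovering locations from a distance matrix can be illustrated with a minimal sketch. It uses classical (Torgerson) scaling as a stand-in and extracts only the first dimension by power iteration; the algorithm implemented in the statistical software used for this study may differ. The function names and the four points are hypothetical.

```python
import math

def euclidean_distances(points):
    """Matrix of pairwise Euclidean distances between points."""
    return [[math.dist(p, q) for q in points] for p in points]

def classical_mds_1d(dist, iterations=100):
    """First coordinate axis of classical scaling for a Euclidean distance matrix."""
    n = len(dist)
    sq = [[d * d for d in row] for row in dist]
    row_mean = [sum(row) / n for row in sq]
    grand_mean = sum(row_mean) / n
    # Double centering turns squared distances into an inner-product matrix B.
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean)
          for j in range(n)] for i in range(n)]
    # Power iteration for the dominant eigenvector of B.
    v = [float(i + 1) for i in range(n)]
    for _ in range(iterations):
        w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(b[i][j] * v[j] for j in range(n)) for i in range(n))
    # Coordinates are the eigenvector scaled by the square root of its eigenvalue.
    return [math.sqrt(eigenvalue) * x for x in v]

# Hypothetical items: two close pairs that lie far apart from each other.
points = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
coords = classical_mds_1d(euclidean_distances(points))
print(coords)
```

The recovered coordinates are determined only up to reflection and translation, so the distances between items are meaningful but the sign and origin of the axis are not.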

Results

Linear Discriminant Analysis

LDA using the 7 predefined populations effectively separates southern and northern Nomascus siki (Fig. 4). The individual populations within each geographic population overlap largely. This means that individual populations within each geographic population have a similar song, but the songs differ between geographic populations. Four variables contribute most to discriminating among populations: 3, 58, 64, and 66 (Appendix). Only the first of these is a male song variable; the others describe the female great call. The other variables were excluded because they are either redundant with the selected variables or do not contribute to separating groups.

Fig. 4

2-Dimensional plot of the canonical discriminant functions.

Variable 3 was the presence or absence of staccato notes. In the northern geographic populations, I heard staccato notes in almost every recording. One sample from Nakai-Nam Theun is shown as an outlier (Fig. 4) because no staccato notes are audible in this sample, in contrast to all other samples from the northern geographic population. This outlier is probably an artifact of the poor quality and the shortness of the corresponding sound-recording and not due to an actual lack of staccato notes. I heard no staccato notes on any recordings from the southern geographic population.

Variable 58 was the range of start frequencies of the female great call phrase. I measured this variable by calculating the range in Hertz from the lowest start frequency (usually the first note) to the highest start frequency (usually the last note) in the great call. The southern geographic population exhibited a greater range of start frequencies (mean range 549.5 Hz) than the northern geographic population (mean range 216.7 Hz).

Variable 64 was the relative duration of bark phase in the female’s great call. I measured this variable by calculating the percentage of the duration of the bark phrase within the duration of the entire great call. The southern geographic population had a longer bark phrase in proportion to the entire great call (mean relative duration was 62% of the great call) compared to the northern geographic population (mean relative duration was 58% of the great call).

Variable 66 was the duration of the first oo note in the female’s great call. The first oo note of the southern geographic population (mean 1.9 s) was longer than the first oo note of the northern geographic population (mean 1.33 s).
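The definitions of variables 58 and 64 translate directly into two small calculations, sketched here. The function names and all measurements are hypothetical illustrations and are not values from the recordings.

```python
def start_frequency_range(start_freqs_hz):
    """Variable 58: highest minus lowest note start frequency in a great call (Hz)."""
    return max(start_freqs_hz) - min(start_freqs_hz)

def relative_bark_duration(bark_duration_s, great_call_duration_s):
    """Variable 64: duration of the bark part as a percentage of the whole great call."""
    return 100.0 * bark_duration_s / great_call_duration_s

# Hypothetical start frequencies (Hz) of successive notes in one great call.
note_starts_hz = [650.0, 700.0, 820.0, 1010.0, 1200.0]
freq_range = start_frequency_range(note_starts_hz)

# Hypothetical durations (s): bark part and entire great call.
bark_percent = relative_bark_duration(6.2, 10.0)
print(freq_range, round(bark_percent, 1))
```

Per-population means, such as those reported above, would then be averages of these per-phrase values.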

Classification predictions correctly classified 54.8% of original grouped cases and 45.2% of cross-validated grouped cases (Table V), indicating a better than chance accuracy in the classification predictions of the LDA. The LDA assigns each population to the correct geographic population (north or south) 100% of the time (Table V).

Table V Classification results of the discriminant analysis. Populations 1–3 are northern Nomascus siki populations, and 4–7 are southern Nomascus siki populations. Gray cells indicate incorrectly predicted group membership, which occurs only among groups of the northern Nomascus siki populations and among groups of the southern Nomascus siki populations, but not between the 2 forms of Nomascus siki

Classification Trees

The results of the RPART classification trees were virtually identical to those of the LDA. The classification trees had a 93% correct classification rate for the 2 geographic populations (north or south). As in LDA, variable 58 (range of start frequencies in the female’s great call) discriminates among populations. One of 2 groups from population 1 (Nakai-Nam Theun NBCA) and 2 of 3 from population 2 (Hin Namno NBCA) belonging to the northern geographic population were incorrectly classified into the southern geographic population. All other 39 groups were classified correctly.
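The reported rate follows from the counts given above, as this short check shows. The variable names are hypothetical; the counts are those stated in the text.

```python
# Misclassified groups per population, as stated in the text.
misclassified = {"population 1": 1, "population 2": 2}
total_groups = 42

n_wrong = sum(misclassified.values())
n_correct = total_groups - n_wrong
correct_rate = n_correct / total_groups
print(n_correct, f"{correct_rate:.0%}")  # 39 93%
```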

The deeper results of the classification tree suggest that there may be some north-south clinal variation in the southern geographic population. The northern geographic population showed no evidence of a cline.

Multidimensional Scaling

The result of the multidimensional scaling analysis closely mirrors that of the LDA and the classification trees (Fig. 5). Dimension 1 separates the 2 geographic populations completely, with no overlap.

Fig. 5

2-Dimensional plot of the results of multidimensional scaling.

Discussion

The results of the present study clearly show that what is currently known as Nomascus siki can be split into 2 distinct geographic populations based on vocal data: 1) northern (Phong Nha Ke Bang NP, Vietnam; Hin Namno NBCA, Laos; and Nakai Nam Theun NBCA, Laos) and 2) southern (Bach Ma NP, Vietnam; Quang Nam Province, Vietnam; Quang Tri Province, Vietnam; and Xe Bang and Xe Sap NBCA, Laos; Fig. 6).

Fig. 6

Representative sonograms of the songs of the 2 geographic populations.

Various factors may drive the observed divergence between geographic populations. First, the differences between groups may be attributed to the past presence of a geographic barrier, such as uncrossable rivers or mountains, that limited dispersal and gene flow between northern and southern populations. After isolation, genetic drift could explain the apparent differences between northern and southern geographic populations. Second, the topography of the northern area is noticeably more mountainous, with more extreme peaks and valleys, than that of the southern area. The song differences may represent localized adaptations to the acoustic properties of the differing terrain.

The evidence of geographically distinct groups makes a compelling argument for separating them at the subspecies level. However, insufficient data are available to determine whether they are separate subspecies. Anecdotally, Konrad and Geissmann (2006) observed morphological differences in the width and bushiness of the male’s white cheek fur between the northern and southern geographic populations. Without genetic data and detailed morphological analyses, it is impossible to reach a taxonomic resolution.

If an intergrade zone exists, as Konrad and Geissmann (2006) proposed, then it can be narrowed down to a relatively small area. Additional tape recordings from Phou Xang He NBCA and Dong Phou Vieng NBCA in Laos, and from southern Quang Binh Province and Quang Tri Province in Vietnam, would help to refine the results further. The analysis of morphological and genetic variability of Vietnamese and Laotian gibbons also provides avenues for further research.