Introduction

The acoustic properties of birdsong, like those of human speech, depend not only on the sound produced by the vocal organ, but also on how it is modified by complex song-related motor patterns of the upper vocal tract. By changing the shape of its vocal tract a bird can control the spectral properties of its song in ways that may facilitate vocal communication, but the means by which birds control their vocal tract filter is still a poorly understood aspect of vocal production.

Birdsong is generated in the avian vocal organ, the syrinx, by a pair of labia at the cranial end of each primary bronchus. Each labial pair oscillates, much like the vocal folds of mammals, when it is adducted by syringeal muscles into the respiratory air stream (Goller and Larsen 1997; Larsen and Goller 1999). Sounds produced in the syrinx are subsequently altered by the suprasyringeal vocal tract filter, which favors the passage of sound near its resonant frequencies (Nowicki 1987; Beckers et al. 2003; Riede et al. 2004; Suthers and Zollinger 2008). Movements of the hyoid skeleton play a major role in tuning the avian vocal tract filter (Fletcher et al. 2006; Riede et al. 2006).

The evolutionary origin of hyoidal control of upper vocal tract movements in birds can be traced back to the respiratory needs of early tetrapods which relied on an oropharyngeal buccal pump to ventilate their lungs. The primitive condition, pure buccal pumping, is found in most air breathing fishes (Brainerd and Owerkowicz 2006), but it also exists as a supplementary breathing pattern during fast locomotion in some reptiles (Owerkowicz et al. 1999). Buccal pumping represents a sophisticated solution for a biomechanical challenge stemming from the fact that muscles only generate force in the direction of shortening. A mechanical construct is necessary that causes an expansion of the oral and pharyngeal cavity in order to generate a negative pressure that sucks in air (or water). The hyoid apparatus provides the lever system that converts muscle shortening into abduction of hyoidal skeletal elements, resulting in an enlargement of the oropharyngeal–esophageal cavity (OEC). The development of thoraco-abdominal aspiration as the primary ventilation system in amniotes removed the respiratory constraints imposed by the buccal pump and allowed the oropharyngeal muscles to assume new tasks. Some of these secondary tasks produce communication signals including visual signals such as the extension of the throat or dewlap in lizards (Font and Rome 1990) and acoustic signals such as the articulation of human speech. The latter involves sophisticated coordination of the hyoid skeleton and hyoidal-pharyngeal muscles in a motor pattern that changes the geometry, and thereby the resonance, of the oral and pharyngeal cavities. Galis (1996) pointed out that structural innovation is essentially associated with four main types of morphological events: acquisition of a new structure; loss of a function by a structure that subsequently becomes available for a new function; duplication of a structure; and functional uncoupling. 
The use of the hyoid skeleton in acoustic communication in various taxa appears to represent an example where the previous function was lost and a new function evolved.

Cineradiography of singing northern cardinals (Cardinalis cardinalis) has shown that they actively control the dimensions of their oropharynx and esophagus so that the primary resonance of the vocal tract tracks the fundamental frequency (f 0) of the song, even when this frequency is modulated over two or more octaves (Fletcher et al. 2006; Riede et al. 2006). Cardinals increase the volume of the oropharyngeal cavity and expand the cranial end of the esophagus at low fundamental frequencies, but reduce the volume of these structures when the fundamental frequency of a syllable increases. By maintaining an inverse relationship between the volume of the OEC and the fundamental frequency, the cardinal keeps a single vocal tract resonance (Fletcher et al. 2006) tuned to the dominant frequency component generated in the syrinx.

Very little is known about the regulation of the avian hyoid motor pattern during song or its relationship to the song control system (e.g., Wild 1997). The hyoid motor pattern that accompanies song is reminiscent of the complex, learned articulatory maneuvers involved in human speech and thus represents another parallel between the production of birdsong and speech. Articulatory maneuvers in humans are acquired through a learning process (Kuhl et al. 2005; Kuhl and Rivera-Gaxiola 2008) before they are used in speech, raising the possibility that avian hyoidal motor patterns might also involve vocal learning.

Here we report on the hyoid motor pattern and its relationship with fundamental frequency in the white-throated sparrow (WTS; Zonotrichia albicollis). We chose the WTS because its song, unlike the strongly frequency modulated songs of the northern cardinal, consists of a series of almost constant frequency whistled notes with relatively little energy in higher harmonics (Fig. 1). The song begins with a few long whistled notes followed by a series of shorter notes. There is a change in the fundamental frequency after the first or second note. In most birds this consists of an upward frequency shift, but in some individuals the shift is downward. The great majority of WTS sing a single song type (Borror and Gunn 1965; Hurly et al. 1990, 1991; Weary and Weisman 1992; Falls and Kopachena 1994).

Fig. 1

A time-series and spectrogram of a descending song from white-throated sparrow 516. Syllables are nearly constant frequency and contain most of their energy in the fundamental

Methods

Subjects

Experiments were performed on four tan morph adult male WTS that were caught as adults. Prior to the experiment, the birds, which had been kept in an aviary on short days, were put on long days (14L:10D) and given subcutaneous implants of testosterone.

Cineradiography and song recording

X-ray imaging was performed with a Series 9800 Mobile C-Arm and 1 k × 1 k Mobile Neurovascular Work Station (OEC Medical Systems, Inc.). This system provides digital cine at 30 pulses/s with a 10-ms pulse width and true 1,000 × 1,000 imaging resolution, allowing digital recording at 30 frames/s. Songs were recorded from each bird as it sang spontaneously in the X-ray beam while sitting in a cage on a single small perch, about 6 cm long. The C-arm was adjusted so that the bird’s head was about 10 cm in front of the image intensifier screen. The probability of obtaining either lateral or frontal views of the bird was increased by orienting the perch so it was either parallel or at a right angle, respectively, to the X-ray beam. The digital signal from the fluoroscope was recorded on a video recorder (Sony GVD-1000 Video Walkman; Mini DV format) together with the sound recorded by a directional microphone (Audio Technica model AT835b) aimed at the bird from a distance of about 0.5 m, as described elsewhere (Riede et al. 2004, 2006).

All data are corrected for a 100-ms delay between the recorded audio and video signals due to the processing time of the X-ray image. The remaining margin of error between the alignment of the vocalization and the X-ray image is estimated not to exceed one-half frame (±17 ms).
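The timing correction can be illustrated with a minimal sketch. It is not the authors' procedure: the function names are hypothetical, and the sign of the correction assumes that the X-ray image lags the sound, so that a frame stored at time t shows an event that was heard 100 ms earlier on the audio track.

```python
FRAME_RATE_HZ = 30.0   # cine frame rate stated in the text
VIDEO_DELAY_S = 0.100  # X-ray image processing delay stated in the text


def frame_to_audio_time(frame_index: int) -> float:
    """Audio-track time (s) of the event shown in a given video frame.

    Assumption: the video lags the audio, so the delay is subtracted.
    """
    recorded_time = frame_index / FRAME_RATE_HZ
    return recorded_time - VIDEO_DELAY_S


def alignment_uncertainty_s() -> float:
    """Residual alignment error of one-half frame period."""
    return 0.5 / FRAME_RATE_HZ


if __name__ == "__main__":
    print(frame_to_audio_time(30))                  # frame stored at 1.0 s
    print(round(alignment_uncertainty_s() * 1000))  # half frame, in ms
```

At 30 frames/s one frame lasts about 33 ms, which is why the residual uncertainty of one-half frame rounds to 17 ms.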

Segments of song recorded on the Mini DV tape were digitized at 30 frames/s (video) and 44.1 kHz (audio) sampling rate (Vegas Video, Sonic Foundry, Madison, WI, USA) and displayed on a computer monitor as individual X-ray frames with their associated sound track. Data points were selected with an on-screen cursor. Acoustic measurements were performed using sound analysis software (PRAAT, version 4.1; www.praat.org). Sound files were high-pass filtered at 500 Hz and low-pass filtered at 20 kHz.

Vocal tract measurements

The changing shape of the upper vocal tract, due to movement of the hyoid apparatus during song, was analyzed from successive X-ray images taken from lateral and frontal views of the bird. Three songs of the same song type were analyzed from each bird in each of these views, except for WTS 527 for which only two songs were available in the frontal position. Movements of the hyoid apparatus, and of the larynx to which it is attached, were quantified in each frame of the X-ray movie during song by measuring two distances in the lateral view (LV and LH) and one in the frontal view (Cornua) (Fig. 2). LV is the distance between the larynx (basihyoid) and the mid-point of the second vertebra. LH is the distance between the larynx and the dorsal edge of the beak–skull transition. ‘Cornua’ refers to the distance between the most ventral points of the two cornua of the hyoid apparatus. Distances were computed (Martins 2001) from the coordinates of two points selected manually in each frame. Only X-ray images with an essentially perfect lateral or frontal view were used. A 1-cm long metal wire positioned at the mid-sagittal level of the bird allowed accurate calibration of distance measurements. Ten repeated measures of each of the same three distances in a single frame had a standard deviation of 0.35 mm. For comparison, the gap between two X-ray opaque metal edges, measured with the same procedure, had a standard deviation of 0.15 mm. The average length of LV, LH and Cornua was calculated for each syllable, using all but the first and the last frame of that syllable. Syllables with a duration <60 ms were excluded from further analysis due to the 30 frames/s sample rate. The Pearson correlation coefficient, R, was used to quantify the relationship between anatomical distances (LV, LH and Cornua) and the fundamental frequency.
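The measurement steps described above can be sketched as follows. This is an illustration under stated assumptions, not the software cited in the text: all function names are hypothetical, point coordinates are assumed to be in pixels, the 1-cm wire is assumed to have been measured in pixels in the same image plane, and syllable duration is assumed to come from the audio track.

```python
import math
from statistics import mean


def calibrated_distance_mm(p1, p2, wire_px, wire_mm=10.0):
    """Distance between two manually selected points, converted from
    pixels to mm using the pixel length of the 1-cm calibration wire."""
    scale_mm_per_px = wire_mm / wire_px
    return math.dist(p1, p2) * scale_mm_per_px


def syllable_mean(values_per_frame, duration_s, min_duration_s=0.060):
    """Mean of a per-frame distance over one syllable, dropping the first
    and last frame. Returns None for syllables shorter than 60 ms or with
    no interior frames."""
    if duration_s < min_duration_s:
        return None
    interior = values_per_frame[1:-1]
    return mean(interior) if interior else None


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Made-up pixel coordinates, for illustration only.
    print(calibrated_distance_mm((0, 0), (3, 4), wire_px=50))
    print(syllable_mean([1.0, 2.0, 3.0, 4.0], duration_s=0.2))
```

In such a scheme, `pearson_r` would be applied to the per-syllable means of LV, LH or Cornua and the corresponding per-syllable f 0 values.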

Fig. 2

Each white-throated sparrow song is accompanied by coordinated movements of the larynx and hyoid cornua that increase the volume of the oropharyngeal–esophageal cavity (OEC) at the beginning of each constant frequency syllable and maintain a relatively constant large OEC volume during the course of each syllable. a Schematic frontal view of bird showing the hyoid cornua and the distance measured between them (green arrow). b Schematic lateral view of upper vocal tract showing dorsoventral movement (LV, red arrow) of larynx measured from the middle of the second cervical vertebra and the craniocaudal movement (LH, blue arrow) of the larynx relative to the dorsal edge of the beak–skull transition. c, d Movements of larynx, cornua and beak during a song are indicated by changes in the distance between the anatomical structures that define the length of each of the vectors measured. c LV, LH and beak gape in successive frames of the X-ray movie (30 fps) during a descending song. The timing of each syllable and its fundamental frequency (f 0) is plotted in black beneath LV. d Movement of cornua and beak gape during an ascending song. The fundamental frequency of the song is plotted in black beneath Cornua. Artwork in a and b by S.A. Zollinger

Beak gape

During most X-ray recording sessions, the bird was also filmed with a digital video camera at 30 fps. Beak gape, the distance between the tips of the maxilla and mandible, was measured from both the X-ray image and the video recording if both were available and of good quality, or from only one of these if it was of good quality and the other was not available. Beak gape was excluded from analysis if the X-ray was not clear and the video was not available. Beak gape was measured using the software described above for ‘Video Analysis’.

Computational model

Fletcher et al. (2006) provided a computational model for calculating the transfer function of a songbird’s vocal tract. We measured the dimensions of the trachea in euthanized male WTS. The volume of the OEC was estimated by two independent methods. Three male WTS were euthanized and casts of the suprasyringeal vocal tract, including the OEC, were obtained by injection with dental impression medium. These casts were used to estimate the volume of the maximally expanded OEC by determining the volume of water each cast displaced in a small graduated cylinder. The volume of the OEC as it changes during song was also computed from X-ray images. A three-dimensional polygonal model was constructed (modeling package 3D-MAX) to closely approximate the outline of the OEC in frontal and lateral views (see Supplementary Video). The dimensions of these outlines were calibrated with a known reference in the mid-sagittal plane and the volume of the three-dimensional polygon was calculated.

Results

Song patterns

Birds 516, 525 and 667 produced descending songs starting with a long syllable of constant f 0 followed by a second long syllable with a constant but lower f 0 (Fig. 1). Bird 527 sang ascending songs in which f 0 of the second syllable was higher than that of the first (Table 1).

Table 1 Fundamental frequency (f 0) and duration of the first three constant frequency syllables in song

Movement of the hyoid apparatus and larynx during song

During song, all four WTS exhibited similar characteristic movements of the larynx and the hyoid apparatus to which it is attached. The onset of each syllable was consistently associated with a prominent abrupt increase of all three distances measured (LV, LH, Cornua) (Fig. 2). For the four birds, the larynx moved a mean distance of 4.6 mm (n = 117 syllables) in the ventral direction (LV) and 4.7 mm (n = 117 syllables) in a caudal direction (LH) (Table 2). The lateral movement of the hyoid cornua increased the mean distance between them by 5.1 mm (n = 55 syllables) (Table 2).

Table 2 Displacement of hyoid measured by three distances during each song syllable

The resulting ventro-caudal displacement of the larynx enlarged the oropharyngeal cavity (birds do not have separate oral and pharyngeal cavities) while the concurrent lateral movement of the cornua expanded the cranial end of the esophagus, resulting in the formation of a single large OEC. This enlarged OEC volume was maintained during the course of the syllable until the hyoid apparatus abruptly returned to its pre-phonatory position at the end of the syllable (see Supplementary Video).

Relationship between movement of hyoid apparatus and fundamental frequency

Quantification of frequency-related changes in the length of the three vectors measured in each X-ray image is difficult for two reasons. First, the frequency difference between notes in the song of individual WTS is small. The f 0 of the lowest and highest notes in the songs of the four WTS we studied differed by only about 0.2–0.3 octave. The difference in the position of the larynx that is needed to adjust the vocal tract resonance to high vs. low frequency notes is therefore also small, especially if the resonance is broadly tuned to the f 0. Second, given an estimated measurement error of approximately 10% for LH, LV and Cornua due to the small size of the WTS, these small movements may be difficult to quantify.

The Pearson correlation coefficients between fundamental frequency and each of the three distances measured in X-ray images are summarized in Table 3 and Fig. 3. The linear regression between LV or LH and f 0 is negative in all four birds, as expected assuming the bird maintains an inverse relationship between the volume of the OEC and f 0. The regression of LV on f 0 is significant in two birds and close to significance in a third (WTS 527). The regression of LH on f 0 is significant in only one bird. The regression for lateral movement of the cornua on f 0 is not significant in any of the birds.
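The significance of a Pearson correlation can be checked from R and n alone. The sketch below assumes the standard two-tailed t test for a correlation coefficient with n − 2 degrees of freedom; the text does not state which test was used, and the function names are hypothetical. The P value is obtained by numerically integrating the Student t density with Simpson's rule, so no statistics library is needed.

```python
import math


def t_statistic(r, n):
    """t = r * sqrt((n - 2) / (1 - r^2)), with n - 2 degrees of freedom."""
    if abs(r) >= 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    return r * math.sqrt((n - 2) / (1.0 - r * r))


def t_pdf(x, df):
    """Probability density of Student's t distribution."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(r, n, steps=2000):
    """Two-tailed P value for H0: no correlation (steps must be even)."""
    df = n - 2
    t = abs(t_statistic(r, n))
    h = t / steps
    # Simpson's rule for the area under the density between 0 and |t|.
    area = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area *= h / 3
    return max(0.0, 1.0 - 2.0 * area)


if __name__ == "__main__":
    # R and n for LV versus f 0, as quoted in the Fig. 3 legend.
    for bird, r, n in [(516, -0.35, 19), (525, -0.37, 48),
                       (527, -0.71, 7), (667, -0.57, 42)]:
        print(bird, round(t_statistic(r, n), 2), two_tailed_p(r, n))
```

Under this assumption, R = −0.71 with n = 7 gives a P value between 0.05 and 0.10, consistent with the value of 0.07 quoted for WTS 527 in the Fig. 3 legend.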

Table 3 Correlations between fundamental frequency and the distance the hyoid apparatus moves

Fig. 3

The movement of LV during song syllables tends to be inversely correlated with the syllable’s fundamental frequency. Increasing the distance the larynx moves from the vertebral column during song increases the volume of the oropharyngeal cavity. Each data point represents the mean length of LV during one syllable minus its mean length during all silent intervals between syllables in the same song. Linear regression for WTS 516: R = −0.35 ± 0.12, P = 0.147, n = 19; WTS 525: R = −0.37 ± 0.12, P = 0.009, n = 48; WTS 527: R = −0.71 ± 0.05, P = 0.07, n = 7; WTS 667: R = −0.57 ± 0.11, P < 0.0001, n = 42

Beak movements

We measured the beak gape and f 0 in each video frame of all syllables in six songs from each bird. Prominent beak movements accompanied most of these syllables, except in the case of WTS 516 which did not open his beak more than a few tenths of a millimeter during most of his syllables. Of the four birds, only WTS 525 had a significant correlation between the mean beak gape and the fundamental frequency of the syllable. In all four birds, the absolute magnitude of the mean gape varied greatly between different syllables having the same constant f 0 (Fig. 4).

Fig. 4

Relationship between the mean beak gape and fundamental frequency of constant frequency WTS song syllables. In many cases mean beak gape varied over a wide range for the same f 0. Beak gape and f 0 were measured in each video frame (30 fps) and the mean gape and f 0 were calculated for each syllable in six songs from each bird. WTS 516: R = 0.277, P > 0.05, n = 47 syllables; WTS 525: R = 0.501, P < 0.01, n = 26 syllables; WTS 527: R = 0.303, P > 0.05, n = 32 syllables; WTS 667: R = 0.011, P > 0.05, n = 69 syllables. Frequency bins are 50 Hz wide. Beak gapes less than about 0.2 mm could not be accurately measured and were assigned a value of 0. This was the case for the majority of WTS 516 syllables, so that data points at zero gape include multiple superimposed syllables

Estimation of suprasyringeal vocal tract resonance

Measurements from casts of the vocal tract gave an average maximum OEC volume for four males of 0.5 ± 0.1 ml. The volume of the OEC during song was also estimated by reconstructing its shape from its outline in each successive frame of the X-ray movie during a syllable. The mean estimated OEC volume for eight syllables, including a high and a low frequency syllable in a song from each of four male WTS, was 0.65 ± 0.39 ml (Figs. 5, 6). The regression line for these data indicates that the volume of the OEC has a significant inverse correlation (R 2 = 0.53; P < 0.05) with the f 0 of the constant frequency syllable (Fig. 6). The OEC tends to be larger during syllables with a low f 0 than it is during syllables with a high f 0.

Fig. 5

The white-throated sparrow typically maintains a nearly constant OEC volume during each constant frequency syllable. In this song from WTS 667, the OEC volume is inversely related to the syllable’s f 0. The increased volume is due in part to a ventro-caudal movement of the larynx, indicated by the increase in distances LV and LH, that enlarges the oropharynx and by expansion of the cranial end of the esophagus (not shown). The volume of the OEC was calculated from its outline in X-ray images, not from LV or LH. The absence of morphological data at about 3 s is due to the bird briefly turning its head

Fig. 6

Inverse relationship between the volume of the oropharyngeal–esophageal cavity and the fundamental frequency. Data points include the highest and lowest constant frequency syllables from each of four white-throated sparrows. R 2 = 0.53; P = 0.040; dashed lines indicate 95% confidence interval

We used a computational acoustic model of the songbird vocal tract (Fletcher et al. 2006) to estimate the resonance curves of the suprasyringeal vocal tract for OEC volumes between 0.2 and 1.4 ml, corresponding to the range of OEC volumes calculated for the eight syllables in Fig. 6. The model predicts a suprasyringeal vocal tract transfer function that is dominated by a major resonance peak that can be tuned to any frequency between about 3 and 8 kHz by adjusting the volume of the OEC to an appropriate value within its experimentally observed range from about 1.4 to 0.2 ml, respectively (Fig. 7).
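The size of this tuning range can be compared with a much simpler idealization. The sketch below is not the Fletcher et al. (2006) model, which also includes the trachea, glottis and beak. It assumes only that the OEC behaves like a Helmholtz resonator with a fixed neck, in which case the resonance frequency scales with the inverse square root of the cavity volume. The function name and the choice of reference point are hypothetical.

```python
import math


def scaled_resonance_khz(f_ref_khz, v_ref_ml, v_ml):
    """Resonance at volume v_ml, given a reference resonance f_ref_khz at
    volume v_ref_ml, assuming f is proportional to 1 / sqrt(V)
    (Helmholtz-resonator idealization with unchanged neck geometry)."""
    return f_ref_khz * math.sqrt(v_ref_ml / v_ml)


if __name__ == "__main__":
    # Reference point taken from the text: about 3 kHz at about 1.4 ml.
    for volume_ml in (1.4, 0.8, 0.4, 0.2):
        print(volume_ml, round(scaled_resonance_khz(3.0, 1.4, volume_ml), 1))
```

Under this assumption, reducing the volume sevenfold from 1.4 to 0.2 ml raises the resonance by a factor of √7 ≈ 2.65, from 3 kHz to about 7.9 kHz, which is close to the upper limit of about 8 kHz predicted by the full model.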

Fig. 7

Predicted resonance curves of the entire suprasyringeal vocal tract for five different volumes of the oropharyngeal–esophageal cavity. By varying only the volume of the oropharyngeal–esophageal cavity between 0.2 and 1.4 ml a white-throated sparrow could adjust the principal resonance peak of its vocal tract to match the fundamental frequency of its constant frequency syllables, thus significantly increasing the amplitude of the song. Arrows indicate predicted resonances of a 36-mm long trachea closed at the syrinx and open at the glottis. The length of the WTS beak is about 11 mm; however, the effective length of the beak decreases with increasing gape. The resonance curves in this figure assume an effective beak length of 2.3 mm, which corresponds to a relatively large beak gape. Based on the Fletcher et al. (2006) computational model of the avian vocal tract

The first tracheal resonance at about 2.2 kHz was about 20 dB below the OEC resonance. A second tracheal resonance at about 6.6 kHz was also present. The frequency of tracheal resonances was not significantly affected by changes in the volume of the OEC. The length of the trachea in three freshly dissected male WTS was 3.4 cm. In these same males, the cast of the trachea, made by injecting dental impression medium into the vocal tract through the mouth, was 3.8 cm long. The 4 mm increase in the length of the cast, compared to the trachea before the injection of the impression medium, suggests a longitudinal elasticity in the trachea that may allow WTS to vary the length of their trachea by about 11%. For all resonance curves we assumed a tracheal length of 36 mm and an effective acoustic length for the beak of 2.3 mm, which is equal to about 20% of its geometrical length and corresponds to a relatively wide beak gape (Fletcher et al. 2006).

Discussion

The oropharyngeal–esophageal motor pattern

In previous experiments we (Riede et al. 2006) have shown that the northern cardinal, whose song contains prominent frequency modulated sweeps, maintains an inverse relationship between the volume of its OEC and the changing fundamental frequency of its song. A computational model of the cardinal vocal tract (Fletcher et al. 2006) indicates that singing cardinals actively adjust the volume of their OEC so that its primary resonance tracks the f 0 of the song.

In this paper we document similar song-related changes in the dimensions of the oropharynx and esophagus of the WTS, a species that is not closely related to the cardinal and whose song consists of constant frequency syllables with almost no frequency modulation. We show that the onset of each of these WTS syllables is accompanied by a rapid ventro-caudal movement of the hyoid apparatus and larynx together with a lateral movement of the hyoid cornua. This movement enlarges the oropharyngeal cavity and cervical esophagus which are maintained in their enlarged state with little variation until the end of the syllable when the larynx and hyoid apparatus return to their pre-phonatory position and the OEC collapses. Most syllables are also accompanied by opening the beak.

Vocal tract acoustics

The source-filter theory

According to the linear source-filter theory of human vowel production (Fant 1960), the time-varying glottal airflow at the sound source is not affected by the resonance filter of the upper vocal tract. Whether the dynamics of the syringeal source in birds are similarly independent of the vocal tract resonance filter is an unresolved issue (Nowicki 1987; Beckers et al. 2003; Elemans et al. 2007). In either case, the formant pattern of vocalizations is determined by the vocal tract as a whole, including the trachea, glottis, OEC and beak, acting as a resonant system.

The trachea and glottis

The first and second tracheal resonances are close to those predicted for a simple tube closed at the syrinx and open at the larynx. Modeling the trachea as a stopped tube, i.e., as a quarter-wave resonator, predicts a first resonance (in moist air at 41°C) at about 2.6 and 2.3 kHz for lengths of 34 and 38 mm, respectively. The second tracheal resonance is predicted to be about 7.7 or 6.9 kHz, respectively, for these two tracheal lengths. Since the f 0 of syllables sung by our four subjects ranged from about 3.1 to 5.5 kHz (Table 1), our calculations suggest that the f 0 of most WTS song lies between the first two tracheal formants, where it is not supported by the trachea’s resonances.
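The stopped-tube estimate can be reproduced with a short calculation. A tube closed at one end and open at the other resonates at odd multiples of c/(4L). The speed of sound used for the calculation is not given in the text; the value of 350 m/s below is an assumption chosen because it reproduces the quoted frequencies, and no end correction is applied. The function name is hypothetical.

```python
SPEED_OF_SOUND_M_S = 350.0  # assumed value; not stated in the text


def stopped_tube_resonances_hz(length_m, n_modes=2, c=SPEED_OF_SOUND_M_S):
    """Resonances of a tube closed at one end and open at the other:
    f_k = (2k - 1) * c / (4 * L), for k = 1, 2, ..."""
    return [(2 * k - 1) * c / (4.0 * length_m) for k in range(1, n_modes + 1)]


if __name__ == "__main__":
    for length_mm in (34, 38):
        f1, f2 = stopped_tube_resonances_hz(length_mm / 1000.0)
        print(length_mm, round(f1 / 1000, 1), round(f2 / 1000, 1))
    # 34 mm -> 2.6 and 7.7 kHz; 38 mm -> 2.3 and 6.9 kHz
```

Because the second resonance of a stopped tube lies at three times the first, the gap between the two tracheal resonances spans the whole 3.1 to 5.5 kHz range of f 0.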

Little is known regarding the ability of songbirds to vary the length of their trachea. Daley and Goller (2004) determined that zebra finches can only change the length of their trachea by a maximum of 3%, equivalent to 1 mm or about a 78-Hz change in the predicted resonance of an equivalent stopped tube. In WTS, the difference between the length of the trachea before and after filling it with impression medium suggests it may be more elastic than that of the zebra finch.
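The link between a 3% length change and a shift of about 78 Hz follows from the stopped-tube relation. As a rough check, assume a first resonance of about 2.6 kHz for the zebra finch trachea. This value is not stated in the text; it is what a stopped tube about 33 mm long (the length implied if 1 mm is 3%) would give for a sound speed near 350 m/s.

```latex
f_1 = \frac{c}{4L}
\quad\Longrightarrow\quad
\frac{\Delta f_1}{f_1} \approx -\frac{\Delta L}{L},
\qquad
|\Delta f_1| \approx 0.03 \times 2.6\ \text{kHz} \approx 78\ \text{Hz}.
```

The same relation implies that the roughly 11% length change suggested for the WTS trachea would shift the first tracheal resonance by a few hundred hertz at most.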

There are various ways in which a WTS might, in theory, adjust its tracheal resonance to match its f 0, but they have not been investigated in WTS. Tracheal length might be altered by contraction of the tracheolateralis or sternotrachealis muscles; by activity of hyoidal muscles that move the larynx; or by postural changes during song, such as tilting the head back. Nevertheless, large changes in tracheal length seem anatomically improbable.

Tracheal resonances might also be varied by controlling the glottal aperture. Constricting the glottis should lower the tracheal resonance. In the preceding calculations we assume a glottal opening 1.5 mm in diameter (tracheal diameter averages 1.6 mm). Our model predicts that reducing the glottal diameter to 0.5 mm will lower the second tracheal formant from about 6.9 to 5.0 kHz, which could coincide with the f 0 of WTS syllables in the high frequency portion of the species’ vocal range. Glottal aperture during phonation has not been measured in songbirds.

The oropharyngeal–esophageal cavity

Several kinds of evidence support the hypothesis that the prominent hyoidal motor pattern we observed in WTS tunes the vocal tract’s primary resonant frequency to match the f 0, which is the dominant frequency component generated in the syrinx.

Our cineradiography data show that each syllable is accompanied by an enlarged OEC which is maintained at an essentially constant volume for the duration of each constant frequency syllable, as expected if the sparrow is attempting to make the resonance of its vocal tract match the constant frequency of its song.

If the WTS adjusts the volume of its OEC to resonate at the f 0, then one would expect an inverse relationship between the ventro-caudal displacement of the larynx (i.e., LV and LH) and the f 0 of each syllable. Since the hyoid cornua appear to control the enlargement of the esophagus, the amplitude of their movement (Cornua) should also be inversely correlated with f 0. Although in all birds, the onset of each syllable was accompanied by changes in LV and LH in directions that enlarged the OEC, the amount of movement showed a significant inverse correlation with f 0 for LV in only two birds and for LH in only one bird.

The cornua of the hyoid apparatus always moved laterally at the onset of phonation, but the amplitude of this lateral movement was not significantly correlated with f 0 in any of the birds. It may be that modulation of OEC resonance for the high and low frequency syllables within the song depends on laryngeal movements that do not include the cervical esophagus. If so, it is in contrast to northern cardinals in which lateral movement of the hyoid cornua is related to the expansion of the cranial end of the esophagus (Riede et al. 2006).

Differences in the mean f 0 between high and low frequency syllables in individual WTS songs are relatively small, however, and require correspondingly small amplitude adjustments in the hyoidal motor program. The absence of a consistently significant inverse relationship between each of these hyoid movements and f 0 may be due to limitations in the accuracy with which we can measure these distances in X-ray images of this small bird. If the vocal tract resonance of WTS is broadly tuned, i.e., has a low Q-value, there may be little advantage in adjusting OEC dimensions for small (<0.5 octave) differences between the f 0 of different syllables within a song. This interpretation is supported by the fact that when the range of f 0 and OEC volumes is increased, by including a song from each bird, there is a significant negative relationship between these variables.

Finally, the resonance curves based on our computational model of the avian vocal tract (Fletcher et al. 2006) (Fig. 7) predict that the range of OEC volumes that we measured in our four subjects (0.2–1.4 ml; Fig. 6) should produce primary resonance peaks close to the bird’s f 0 (5.5–3.1 kHz).

The beak

Sound in the OEC must pass through the beak before it is broadcast into the external environment. Westneat et al. (1993) showed that the fundamental frequency of the first three notes in WTS song is positively correlated with beak gape. We observed a similar correlation between beak gape and f 0 in many of the syllables we recorded, although during some songs (including a majority of those sung by WTS 516) the beak gape was <1 mm. Particularly striking is the lack of a correlation between absolute gape and f 0 across individual WTS. For example, the beak gape of WTS 667 for f 0 between about 2.9 and 4.2 kHz often exceeded the maximum mean gape of the other three birds, including that of WTS 527 at an f 0 as high as 6.5 kHz. It is interesting that although the regression of beak gape on f 0 suggests a positive relationship, the absolute values of gape as a function of frequency differ for each of the four birds. A 4-mm gape on the least squares regression line occurs at about 4.0 kHz for WTS 667, 4.8 kHz for WTS 525 and 5.8 kHz for WTS 527.

The acoustic contribution of beak gape to vocal tract resonance is not well understood. On the one hand, several studies of various species have shown a positive relationship between beak gape and f 0 over at least part of the bird’s frequency range (Westneat et al. 1993; Moriyama and Okanoya 1996; Suthers and Goller 1997; Hoese et al. 2000; Goller et al. 2004; Podos et al. 2004). On the other hand, physical and computational models of the avian vocal tract (Fletcher and Tarnopolsky 1999; Fletcher et al. 2006) predict that the beak’s effective acoustic length rapidly decreases as gape increases from 0 to about 1 or 2 mm, but fail to explain why much larger gapes are typical of most species studied. Some experimental data (Suthers and Goller 1997; Goller et al. 2004; Nelson et al. 2005; Riede et al. 2006) suggest the filter effect of the beak per se is relatively small and limited to high fundamental frequencies. Further research is needed to resolve this dichotomy.

Taken together, our data indicate that WTS, like northern cardinals, adjust the dimensions of their oropharynx and esophagus during song in order to maintain a vocal tract resonance close to the f 0 they are generating in their syrinx. In doing this they, like cardinals, increase the efficiency with which the dominant frequency component produced in the syrinx is transmitted through the suprasyringeal vocal tract. This in turn increases its level in the song. Higher harmonics of f 0 that are not close to a resonance peak will undergo relatively more attenuation in the vocal tract, increasing the tonal purity of the song.