Vocal fold vibration and intraglottal geometry

In order to phonate, the thyroarytenoid, lateral cricoarytenoid, and interarytenoid muscles adduct the folds. These muscles, together with the cricothyroid and strap muscles (among others), set the pre-phonatory conditions, including vocal fold length and tension. When the folds are close enough, air flowing between them produces vibration. This is an example of a phenomenon known in engineering as a flow-structure interaction; a flag waving in the wind is another example. Flow through the glottis modulates the intraglottal pressures, which alter the shape of the glottis. The changed shape produces a different airflow pattern, which in turn modifies the intraglottal pressures, which again change the glottal shape, and so on. To understand the nature of the flow-structure interaction, it is necessary to know the geometry of the glottis, the material properties of the vocal fold, and the intraglottal pressures at different time points in the phonation cycle.

Most of the movement during vibration occurs in the cover, defined as the mucosa and the superficial layer of the lamina propria. The body is defined as both layers of the ligament and the thyroarytenoid muscle. At higher amplitudes, the ligament and muscle can also vibrate. The rhythmic movement of the cover, known as the mucosal wave, can travel in three directions: medial-lateral, inferior-superior, and anterior-posterior. The anterior-posterior wave is less common, but this wave and the medial-lateral wave can be seen during videostroboscopy. The inferior-superior wave produces a convergent glottis during the opening phase and a divergent glottis during the closing phase. In coronal section, the glottis is convergent when the space between the folds is narrower superiorly and wider inferiorly, and divergent when that space is wider superiorly than inferiorly. This vertical mucosal wave can be seen in Fig. 2.1 [1], which shows a coronal section of the glottis and the corresponding flow rate exiting the glottis. Frames 1 and 2 show the converging glottis during opening. Frame 3 shows a straight glottis during maximal opening and frames 5–7 show a diverging glottis during closing. Frames 8–10 show a fully closed glottis.

Fig. 2.1

Glottal flow waveform and corresponding glottal motion. The phases of the vertical mucosal wave are marked on the glottal waveform. (Adapted from Hirano [1], with permission)

Flow Rate and Source of Sound

Flow rate (Q), which is called glottal flow in Fig. 2.1, is defined as the volume of air that exits the glottis per unit time and is equal to velocity times area (Q = va). In the classic source-filter theory, Fant says that the source of sound is the change in flow rate (dQ) per unit time (dt), or dQ/dt, and that the vocal tract acts as a filter that increases the intensity of certain frequencies and decreases that of others [2]. In Fig. 2.1, dQ/dt is the slope of the flow rate versus time curve. The flow rate waveform is usually skewed to the right, meaning that the decline of Q during closing is faster than the increase in Q during opening. The maximum slope of Q during closing is known as the maximum flow declination rate (MFDR). MFDR is highly correlated with acoustic intensity [3]. Since Q is equal to velocity times area, MFDR can be increased by closing the glottis more rapidly or by decelerating the flow exiting the glottis more rapidly. The majority of the acoustic energy is produced during this rapid flow shutoff [4].
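
The relationship between waveform skewing and MFDR can be illustrated numerically. The following is a minimal sketch, not a clinical method: the pulse shape (two half-cosine segments), the function names, and all parameter values (period, open quotient, peak flow) are assumptions chosen only for illustration. It estimates MFDR as the steepest negative finite-difference slope of Q over one cycle.

```python
import math

def glottal_flow(t, period=0.01, open_quotient=0.6, skew=0.7, peak=0.4):
    """Hypothetical flow pulse Q(t) in L/s: half-cosine rise, half-cosine fall, then closed.

    skew is the fraction of the open phase spent opening; skew > 0.5 means
    the pulse is skewed to the right (slow rise, fast fall).
    """
    phase = (t % period) / period
    if phase >= open_quotient:
        return 0.0                      # closed phase: no flow
    x = phase / open_quotient           # position within the open phase, 0..1
    if x < skew:                        # opening segment
        return peak * 0.5 * (1 - math.cos(math.pi * x / skew))
    # closing segment
    return peak * 0.5 * (1 + math.cos(math.pi * (x - skew) / (1 - skew)))

def mfdr(flow, period, n=2000):
    """Steepest decline of Q over one cycle (finite differences), as a positive number."""
    dt = period / n
    q = [flow(i * dt) for i in range(n + 1)]
    slopes = [(q[i + 1] - q[i]) / dt for i in range(n)]
    return -min(slopes)

# Same peak flow and open quotient; only the skewing differs.
mfdr_skewed = mfdr(lambda t: glottal_flow(t, skew=0.7), period=0.01)
mfdr_symmetric = mfdr(lambda t: glottal_flow(t, skew=0.5), period=0.01)
print(mfdr_skewed > mfdr_symmetric)
```

With the peak flow held constant, shortening the closing segment alone raises the estimated MFDR, which is the sense in which a right-skewed waveform is associated with greater acoustic intensity.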

It is important to note that the source of sound, unlike in a loudspeaker, is not due to vibrations alone. Instead, the source of sound is the change in flow rate at the glottal exit. Sound is a pressure wave, that is, a wave of constantly varying pressure. In laminar flow, pressure (P) is equal to the flow rate (Q) times the resistance (R), or P = QR. Thus, a constantly changing flow rate, as seen in Fig. 2.1, will produce a pressure wave. The area change is directly determined by the vibrations and can be seen by videostroboscopy. There is currently no clinical way to measure the velocity, which is one reason why a strobe examination can be normal in a patient with an abnormal voice and vice versa. As Verneuil et al. [5] say: “For example, vocal folds with a normal appearance and no demonstrable physiologic deficit may produce poor vocal quality. Conversely, inflamed irregular vocal folds may produce surprisingly good voice. Information about the glottal energy source is required to improve our understanding of the relationship between laryngeal physiology and acoustics in normal and diseased states.” The term “glottal energy source” refers to the glottal flow rate waveform. The flow rate can be measured clinically by an indirect method that uses a Rothenberg mask placed over the mouth and nose. This method relies on assumptions that have not been validated, and the Rothenberg mask is used mostly in research.

Strictly speaking, the acoustic intensity is not proportional to the amplitude of the mucosal wave but to the rapid closing of the mucosal wave, although these two properties are likely related. However, amplitude is only one factor related to rapid closing. To understand the other factors, one has to understand the patterns and causes of vibration.

In the original aerodynamic-myoelastic theory of phonation, the closing of the glottis is attributed to Bernoulli forces [6]. The Bernoulli law, which assumes inviscid flow, says that pressure falls as velocity rises. The law of conservation of mass in fluid mechanics says that in steady flow, area × velocity is a constant. For example, where a hose is constricted, the area decreases, the velocity increases, and the pressure decreases. During opening the glottis is convergent. At the superior aspect of the glottis, the area is the smallest, the velocity is the highest, and the pressure is the lowest. During closing, the glottis is divergent; at the superior aspect, the area is the largest and the velocity is the lowest. By the Bernoulli law, the pressure should therefore be the highest at the superior aspect of the glottis; however, the opposite is seen in experiments (Table 2.1).
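
The Bernoulli prediction described above can be worked through with a short calculation. This is a sketch under stated assumptions: steady, inviscid, incompressible flow with no separation, an assumed air density of 1.2 kg/m^3, and hypothetical areas, flow rate, and inferior pressure that are not taken from any measurement. The function name is likewise only illustrative.

```python
RHO_AIR = 1.2  # kg/m^3, approximate density of air (assumed value)

def bernoulli_superior_pressure(p_inferior, area_inferior, area_superior, flow_rate):
    """Pressure (Pa) predicted at the superior glottis by Bernoulli plus continuity.

    Continuity: area * velocity = flow_rate at every level of the glottis.
    Bernoulli:  p + 0.5 * rho * v**2 is the same at both levels.
    Units: pressures in Pa, areas in m^2, flow_rate in m^3/s.
    """
    v_inferior = flow_rate / area_inferior
    v_superior = flow_rate / area_superior
    return p_inferior + 0.5 * RHO_AIR * (v_inferior**2 - v_superior**2)

Q = 2e-4            # hypothetical flow rate, m^3/s (200 mL/s)
P_INFERIOR = 500.0  # hypothetical inferior intraglottal pressure, Pa

# Convergent glottis (opening): narrower superiorly, so predicted pressure drops.
convergent = bernoulli_superior_pressure(P_INFERIOR, 1e-5, 5e-6, Q)
# Divergent glottis (closing): wider superiorly, so predicted pressure rises.
divergent = bernoulli_superior_pressure(P_INFERIOR, 5e-6, 1e-5, Q)
print(round(convergent), round(divergent))
```

For the divergent case the calculation predicts a superior pressure higher than the inferior pressure. That is exactly the prediction the text says is contradicted by experiment, so the sketch shows what the Bernoulli law implies, not what happens in the glottis during closing.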

Table 2.1 Description of terms

Intraglottal Pressures

Intraglottal pressures have been measured experimentally in an excised canine larynx using a hemilarynx preparation [7, 8]. The canine larynx is the most similar larynx to the human in terms of anatomy and size. There is not a well-defined ligament in the canine, but the behavior of the mucosal waves is very similar. In the hemilarynx preparation, all tissue is removed from the vocal folds. Then one-half of the thyroid cartilage and the adjacent paraglottic tissue and vocal fold are removed. The remaining vocal process, arytenoid, and anterior thyroid cartilage are sealed to a plexiglass plate. Two 1 mm pressure transducers are placed in the plexiglass plate, one in the superior glottis and one in the inferior glottis. The canine glottis is typically 3 mm high. Measured pressures are shown in Fig. 2.2. The x-axis shows the phase of vibration, where 0° marks the point of opening and one cycle is 360°. The dark lines represent pressures measured in the canine hemilarynx in our lab [7]. The lighter lines are from another team, and the general trends are similar. The pressure on the y-axis is relative to atmospheric pressure. In the superior glottis, the pressure is actually negative during the latter part of closing; this is the opposite of the prediction made by the Bernoulli law. Negative pressure means that the pressure is lower than atmospheric pressure and will cause a suction force.

Fig. 2.2

Measured pressures in the canine hemilarynx model. The top lines are taken from the inferior glottis. The lines with negative pressures are taken from the superior glottis. The phase of vibration is on the x-axis, where one vibration cycle is equivalent to 360°. (From Alipour and Scherer [7], with permission. The dashed lines represent data from that source)

To understand the origin of the negative pressures, one needs to understand the velocity fields in the divergent glottis during closing. Velocity has a magnitude and a direction and is therefore a vector. A picture showing the direction of the vector at selected points is known as a velocity field, and the lines connecting the vectors are known as streamlines. The closer the streamlines are, the higher the velocity and the lower the pressure. Until relatively recently, the velocity fields inside the glottis during vibration had not been measured experimentally, so assumptions were made. The three main assumptions are shown in Fig. 2.3. Figure 2.3a shows the flow staying attached to the wall, which in this case is the medial aspect of the folds. Figure 2.3b shows the flow separating from the wall, a phenomenon known as flow separation. Separation occurs in many physical flows and is known to occur in a divergent duct. Vortices, or areas of rotational motion, normally form between the wall and the separated jet of flow, but vortices are more complicated to model computationally, so an assumption has been made to ignore them. Figure 2.3c shows the vortices. These vortices will produce negative pressures. As previously noted, a negative pressure is lower than atmospheric and will produce a suction force. This suction force helps the glottis close faster. A faster closing will result in an increase in MFDR and, therefore, an increase in acoustic intensity.

Fig. 2.3

Velocity fields in the divergent glottis during closing (see text). (a) Flow follows walls of divergent duct. (b) Flow separates from wall, no vortices. (c) Flow separates from wall—vortices form

Figure 2.4 shows an example of the intraglottal velocity fields between the folds during closing. Along both sides of the glottis, flow can be seen entering and exiting. It can also be seen that the flow separates from the medial surface of the fold. This rotational flow produces the negative pressures measured in the hemilarynx. Even greater negative pressures are produced in the full larynx. As mentioned previously, this negative pressure is not predicted by the Bernoulli equation, because the Bernoulli equation does not apply when there is flow separation. We will refer to this intraglottal rotational motion as flow separation vortices. In physical divergent ducts, flow separation will occur for a divergence angle greater than 7°. As the angle increases up to a certain point, the vortex strength and thus the negative pressure will also increase. This negative pressure creates a suction-like force that helps close the glottis.

Fig. 2.4

Velocity fields in a divergent glottis during closing in an excised canine larynx

Material Properties of the Vocal Fold

One possible reason for the divergent shape during closing lies in the material properties of the fold. Chettri et al. [9] used an indentation test to measure Young’s modulus of the medial surface in the superior and inferior aspects of the fold. A 1 mm probe was used. The probe was displaced various amounts in the lateral direction, so that the fold was locally compressed in a direction perpendicular to its medial surface, and the force for each displacement was recorded. From these measurements, stress-strain curves were calculated. Similar measurements were made in our lab [10]; an example is shown in Fig. 2.5. At low strains, or displacements, the superior edge is about as stiff as the inferior edge. However, as the displacement becomes greater, the inferior edge becomes much stiffer. This means that at similar intraglottal pressures, the superior aspect of the glottis will displace more laterally. The hypothesis for this stiffness gradient is that the insertion of the conus elasticus on the inferior edge produces the increased stiffness, especially as displacement increases and the conus is stretched more. Increasing subglottal pressure will increase displacement; thus it is expected that increasing subglottal pressure will increase the divergence angle, which is what is seen experimentally. A greater divergence angle is associated with stronger vortices and greater negative pressures; this results in a stronger suction force during closing, which produces a higher MFDR and a louder voice.
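
The idea of a stiffness that grows with displacement can be made concrete by taking the local slope of a stress-strain curve (a tangent modulus). The sketch below is only illustrative: the strain and stress values are made-up numbers chosen to mimic the qualitative trend described above, not data from [9] or [10], and the function name is hypothetical.

```python
def tangent_moduli(strain, stress):
    """Slope of the stress-strain curve between consecutive points.

    With stress in kPa and dimensionless strain, each slope is a local
    (tangent) modulus in kPa.
    """
    return [
        (stress[i + 1] - stress[i]) / (strain[i + 1] - strain[i])
        for i in range(len(strain) - 1)
    ]

# Hypothetical values for illustration only (not measured data).
strain = [0.0, 0.1, 0.2, 0.3]
stress_superior = [0.0, 0.5, 1.2, 2.1]   # kPa
stress_inferior = [0.0, 0.5, 1.6, 3.6]   # kPa

superior = tangent_moduli(strain, stress_superior)
inferior = tangent_moduli(strain, stress_inferior)

# Ratio of inferior to superior stiffness in each strain interval.
ratios = [inf / sup for inf, sup in zip(inferior, superior)]
print([round(r, 2) for r in ratios])
```

In this toy data set the two edges have the same tangent modulus in the first strain interval and the ratio grows in later intervals, which is the pattern the text describes as a stiffness gradient that increases with displacement.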

Fig. 2.5

Stress-strain curves for the inferior 1 mm and superior 1 mm of the canine vocal fold. The fold is stiffer inferiorly than superiorly

This difference in elasticity is known as the vertical stiffness gradient and varies as the subglottal pressure varies. Figure 2.6a shows the distance or displacement between the folds as a function of phase for three different subglottal pressures. Since the length of the folds is constant, the displacement is proportional to the area between the folds. At low subglottal pressures, the curve is fairly symmetric. On the other hand, the displacement curves are skewed to the right for moderate and high subglottal pressures. At low subglottal pressures, the divergence angle is minimal, so there are no vortices and therefore no negative pressures to cause additional closing forces. At higher subglottal pressures, there are vortices, and the associated negative pressures cause rapid closing, which skews the area curve to the right. The velocity curves are shown in Fig. 2.6b. The negative pressures in the superior glottis are also shown as dashed lines. Since the low subglottal pressure does not produce a divergent glottis, and therefore no vortices, there is no associated negative pressure. The negative pressure produces an additional pressure gradient between the inferior and superior aspects of the fold. This increased gradient results in increasing velocity. The velocity then decreases suddenly because the folds close rapidly. Both the increase and the decrease contribute to the skewing of the velocity curve. Since flow rate is equal to velocity times area, skewing of both the velocity and area curves will cause skewing of the flow rate curve and an increase in MFDR. Thus, increasing the divergence angle is one way of increasing SPL (sound pressure level, correlated with the perception of loudness) and the energy in the higher harmonics. Opera singers can produce higher SPL at similar subglottal pressures compared to music theater singers, and it has also been shown that opera singers produce a higher divergence angle [11].

Fig. 2.6

(a) Displacement (mm) between folds halfway between the anterior commissure and vocal process at three different subglottal pressures. Note skewing of the wave at moderate and high subglottal pressures. (b) Velocity (m/sec) on the left vertical axis and negative pressures in the superior glottis (cm H2O) for three subglottal pressures. There are no negative pressures for the low subglottal pressure. Note a second peak in velocity associated with the negative pressures for the moderate and high subglottal pressures

Titze [12] notes that in order to sustain vibration, the intraglottal pressures during closing do not have to be negative but they have to be less than the pressures during opening. As previously discussed, two types of forces producing this pressure differential are the positive inferior intraglottal pressures during opening and the negative superior pressures during closing. Two additional forces are due to vocal tract inertance and elastic recoil.

Forces Involved During Vocal Fold Closing

The air in the glottis and vocal tract acts primarily as a mass that is accelerated during opening and decelerated during closing; this phenomenon is known as inertive vocal tract loading. During opening, the air column in the vocal tract and glottis is accelerated, which requires positive intraglottal and glottal exit pressures. During deceleration, the mass continues its forward momentum, causing reduced or negative intraglottal pressures. This effect increases with vocal tract constriction (such as that caused by false fold compression) and does not occur without a vocal tract. The excised canine larynx experiments described previously did not include a vocal tract.

Vocal folds have been modeled as a combination of mass, damper, and spring components. Elastic recoil specifically refers to the spring element. During opening, the vocal fold moves laterally and the spring is compressed. During closing the spring will lengthen due to elastic recoil of a compressed spring. This lengthening causes the fold to move medially. The greater the subglottal pressure, the more the spring will be compressed, and the greater the elastic recoil. However, because of friction, the forces available for closing will always be less than the force required to compress the spring; thus the skewing of the area curve cannot be explained by the elastic recoil forces.
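
The energy argument in this paragraph can be sketched with a single mass-spring-damper released from a displaced position. This is a toy model under stated assumptions, not a vocal fold simulation: the mass, stiffness, and damping values are hypothetical and not fitted to tissue, the function name is illustrative, and the integration uses a simple semi-implicit Euler scheme.

```python
def simulate_recoil(x0, mass=1e-4, stiffness=80.0, damping=0.02,
                    dt=1e-6, steps=20000):
    """Release a spring displaced by x0 (m) and integrate m*x'' + c*x' + k*x = 0.

    Returns the total mechanical energy (J) after each time step.
    All default parameters are hypothetical illustration values.
    """
    x, v = x0, 0.0
    energies = []
    for _ in range(steps):
        a = -(stiffness * x + damping * v) / mass
        v += a * dt   # semi-implicit Euler: update velocity first,
        x += v * dt   # then position using the new velocity
        energies.append(0.5 * stiffness * x * x + 0.5 * mass * v * v)
    return energies

x0 = 1e-3                                # 1 mm initial displacement
stored = 0.5 * 80.0 * x0 ** 2            # energy put into the spring
with_friction = simulate_recoil(x0)
without_friction = simulate_recoil(x0, damping=0.0)

print(with_friction[-1] < stored)        # friction dissipates stored energy
print(abs(without_friction[-1] / stored - 1) < 0.01)
```

With damping present, the energy returned by the spring is less than the energy stored in it, which is why elastic recoil alone cannot make closing faster than opening; only the frictionless case gives back (approximately) everything that was put in.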

Properties of Sound Produced

The sound produced at the glottal exit is composed of a fundamental frequency and integer multiples of the fundamental frequency. These multiples are known as harmonics. Acoustic energy at frequencies other than the fundamental or its harmonics is perceived as noise. This noise is often perceived as roughness or breathiness and can be seen in multiple conditions, including glottic insufficiency, turbulent airflow, and irregular vibrations. Acoustic measures, such as the signal-to-noise ratio, compare the amount of acoustic energy in the harmonics with the energy between the harmonics. The amount of acoustic energy in the higher harmonics will increase as the MFDR increases, and these higher harmonics are important for intelligibility in noise.
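
The distinction between harmonic and inter-harmonic energy can be sketched as follows. This is a simplified stand-in for clinical acoustic measures, not an implementation of any particular one: it assumes the spectrum has already been reduced to a list of (frequency, energy) pairs, and the function names, the 1 Hz tolerance, and the example values are all hypothetical.

```python
import math

def harmonics(f0, count):
    """First `count` harmonic frequencies (Hz) of fundamental f0, including f0 itself."""
    return [f0 * k for k in range(1, count + 1)]

def harmonics_to_noise_db(components, f0, tolerance_hz=1.0):
    """Ratio (dB) of energy at multiples of f0 to energy everywhere else.

    components: iterable of (frequency_hz, energy) pairs.
    """
    harmonic_energy = 0.0
    noise_energy = 0.0
    for freq, energy in components:
        nearest = round(freq / f0) * f0          # closest multiple of f0
        if nearest > 0 and abs(freq - nearest) <= tolerance_hz:
            harmonic_energy += energy
        else:
            noise_energy += energy
    if noise_energy == 0.0:
        return math.inf                          # no inter-harmonic energy at all
    return 10 * math.log10(harmonic_energy / noise_energy)

# Hypothetical spectrum: three harmonics of 120 Hz plus two noise components.
spectrum = [(120, 1.0), (240, 0.5), (300, 0.01), (360, 0.25), (415, 0.005)]
print(harmonics(120, 4))
print(round(harmonics_to_noise_db(spectrum, f0=120), 2))
```

Adding energy between the harmonics, as might happen with turbulent airflow or irregular vibration, lowers the ratio, while adding energy at the harmonics raises it.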

The fundamental frequency will be increased by increasing the length or tension of the vocal fold cover or by decreasing the area. Lengthening the fold has a much greater effect on increasing tension. Cricothyroid activation will lengthen the vocal fold, decrease the area, and increase the tension of the cover, all changes that will increase frequency. The effect of the thyroarytenoid depends on how much of the fold is vibrating. At low amplitudes of vibration, which predominantly involve the cover (which includes the epithelium and superficial lamina propria), thyroarytenoid contraction reduces the length and tension of the cover and will lower the fundamental frequency. If the amplitude is larger and involves the vocalis muscle, contraction of the thyroarytenoid increases the tension of the muscle, and frequency will be increased [13].

The vocal tract has different resonances depending on the size of its cavities and constrictions. Resonance refers to an increase in the acoustic energy of specific harmonics. These resonance peaks are referred to as formants, and vowels and consonants are often associated with specific formants.

Summary

This chapter focuses on two main questions. First, what are the mechanisms for vocal fold vibration? Second, what is the mechanism for sound production? The mechanisms for vibration include positive pressure and Bernoulli forces during opening and elastic recoil and suction forces produced by vortices during closing. Actual measured pressures in the excised canine hemilarynx show that the Bernoulli law does not apply during the closing phase of vibration. As displacement increases, the vocal fold becomes stiffer inferiorly relative to the superior aspect of the folds. Vocal fold vibration does not directly produce sound. Instead the vibration produces changes in the flow rate exiting the glottis. This modulation of flow produces sound, which is then modified by the vocal tract. The majority of the sound is produced during the latter part of closing and can be characterized by the maximum flow declination rate (MFDR). Higher MFDR will produce greater acoustic intensity and more energy in the higher harmonics.