Abstract
Cervical auscultation is a noninvasive technique for the exploration of swallowing and has been used since the 1960s. The aim of our study was to describe how the volume and consistency of the bolus affect swallowing acoustic sound characteristics in healthy subjects. Twenty-three subjects aged from 20 to 59 years were included (13 women and 10 men). A microphone mounted on a stethoscope chest piece, positioned on the skin on the right side in front of the posteroinferior border of the cricoid cartilage, was used; it was connected to a computer for acoustic recordings. Each subject swallowed 3-, 5-, and 10-ml aliquots of water, yogurt, and mashed potato. Each bolus was administered three times, with a period of at least 30 s between swallows. For each recorded sound, the total duration of the sound, the duration of each sound component (SC) (SC1, SC2, and SC3), and the intervals (IT1 and IT2) between the SCs were measured. For all records, the average duration of acoustic measures was calculated. Differences according to the volume and the consistency of the swallowed bolus were assessed using Student’s t test for paired data. We calculated the percentage of recordings that included each SC. We also compared results between men and women using Student’s t test. We successfully interpreted 540 of the 621 (87 %) records. The results indicated that the average total duration of the sound, and especially the average duration of SC2, increased with increasing volume, and that the duration of SC2 was greater for the 10-ml mashed potato bolus than for the 10-ml boluses of the other consistencies. SC2 was present in all of the records.
Introduction
Swallowing is a complicated function that involves the bolus being propelled from the mouth to the stomach. It requires the coordination of 25 pairs of muscles of the oral cavity, pharynx, larynx and esophagus; thus, it is one of the more complex functions of the body [1].
Different techniques are used in the diagnostic investigation of swallowing disorders. Videofluorography is considered the gold standard, but it is an invasive method that also exposes the subject to radiation [2]. Fiberoptic endoscopy has been proposed as a routine minimally invasive method [3]. Endoluminal sonography [4] and manometry can be used, but both require insertion of a probe that, in certain conditions, can alter bolus flow. Other methods such as high-resolution manometry [5], electromyography [6], and kinetic magnetic resonance imaging [7] are currently under evaluation in specialized centers. Cervical auscultation, however, has not been used for routine assessment of swallowing.
Several research teams have shown that the noise made during swallowing carries information that could be used for noninvasive analysis of swallowing [8, 9]. In 1990, Hamlet et al. [10] coupled acoustic recordings to simultaneous videofluoroscopy with the aim of determining the source of these sounds. One acoustic study showed that increasing the volume of the ingested bolus lengthened the overall duration of the sound, whereas increasing the consistency of the bolus decreased the duration [11]. Acoustic analysis also revealed there to be several sound components (SC), including three main components [12]. In a previous study, we coupled the sound of swallowing with videofluoroscopy images. The three sound components could thereby be linked to different stages of the swallowing process [13]. The first component (SC1) was associated with the rise of the larynx, the second (SC2) with the passage of the bolus through the superior esophageal sphincter (SES), and the third (SC3) occurred during the descent and the opening of the larynx. We have also demonstrated that the acoustic signal can be modified after surgery on the laryngeal tract [14].
By reviewing the literature, we found evidence that there are differences in the structure of the pharyngeal swallow according to the volume and consistency of the ingested bolus. As the volume swallowed increases, the length and diameter of the opening of the SES also increase [2, 15]. The same applies to the duration and amplitude of the laryngeal elevation, as measured from the displacement of the hyoid bone [16]. Moreover, as the bolus consistency becomes denser, both laryngeal elevation and the duration of the SES opening decrease and the diameter of the SES aperture increases [17]. Boiron et al. [11], and more recently Eyigör et al. [18], showed that the overall duration of swallowing sounds increases with the volume of the bolus and decreases with consistency.
There has been no previous study of the differences in swallowing sound components according to bolus type and volume. Our hypothesis was that if there were differences in the structure of the pharyngeal swallow according to the volume and consistency of the ingested bolus, this would be reflected in the features of the three swallowing sound components. The main objective of this research was to describe the variations of pharyngeal swallowing sound components (number of components, duration, and intervals) according to the volume and consistency of the bolus swallowed. The secondary objective was to identify consistencies and volumes generating the most informative acoustic signals for clinical investigations of the sounds associated with swallowing.
Materials and Methods
Participants
We enrolled 23 volunteers (10 men and 13 women; average age = 28 ± 10 years; maximum age = 59 years; minimum age = 20 years). We obtained informed consent from each subject after explaining the goal of the study and the risks involved, in accordance with current French legislation. Each volunteer’s medical history was recorded, and in particular we verified that none had a history of any swallowing disorder.
Bolus
Boluses of three different textures were studied: still water at room temperature (dynamic viscosity measured by the Brookfield method: 1 mPa s), unsweetened yogurt (Danone®, dynamic viscosity measured by the Brookfield method: 300 mPa s), and reconstituted mashed potato (Mousline®, 125 g in 250 ml of milk and 500 ml of water at 30 °C; dynamic viscosity measured by the Brookfield method: 50,000 mPa s). Each subject swallowed 3-, 5-, and 10-ml samples of each of the three consistencies, three times each (a total of 27 swallows). Each bolus was prepared in a syringe and then transferred to the buccal cavity. Subjects were asked to swallow it all at once when requested by the examiner during the recording. We allowed a minimum of 30 s between swallows. For their comfort, the subjects fasted for at least 2 h before the recording sessions.
Acquisition Material and Analysis
Each study subject was placed in a seated position. The cervical recording was obtained with a microphone (Electret tie clip microphone, 50–18,000 Hz; Sony, Japan) positioned on the skin on the right side in front of the posteroinferior border of the cricoid cartilage. The microphone was held in place by a perforated elastic band. The microphone was connected to the sound card of a portable computer.
All recordings were analyzed by a single investigator using Cool Edit Pro software (Syntrillium Software Corporation, Phoenix, AZ, USA). The swallowing sounds were analyzed by first listening to all the recordings taken from an individual subject while looking at the displacement of the signal’s cursor. All recordings for which the acoustic swallowing signal could not be distinguished from the background noise or was not clearly heard were discarded. To analyze the entire swallowing sound from its start, we placed the first marker at the point at which the signal diverged from baseline (the initial deflection point), then we counted the SCs in each swallowing sound. To analyze the sound components (SCs) within the swallowing sound, we similarly placed a marker at the beginning and the end of each SC. We called the interval between the end of an SC and the beginning of the next SC “IT.” The SCs were classified as first SC (SC1), second SC (SC2), and third SC (SC3). Consequently, there were two intervals: IT1 between SC1 and SC2 and IT2 between SC2 and SC3.
Parameters (Fig. 1)
We calculated the percentage of all recordings that contained each SC. For each recorded sound, the total duration of the sound, the duration of each SC, and the intervals (IT1 and IT2) between the SCs were measured. For all records, the average duration of acoustic features was calculated and compared according to the volume and the consistency of the bolus swallowed. We also compared measures between men and women.
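As an illustration of how these measures follow from the marker positions, the short Python sketch below derives the total duration, the SC durations, and the intervals from one set of markers. It is not the procedure used in the study (markers were placed manually in Cool Edit Pro); the `SwallowMarkers` container, the function names, and the marker times are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Span = Tuple[float, float]  # (start_ms, end_ms) marker pair for one sound component


@dataclass
class SwallowMarkers:
    """Hypothetical container for the markers placed on one recording."""
    sc1: Optional[Span]  # None when SC1 was not identified
    sc2: Span            # SC2 is treated as always present in this sketch
    sc3: Optional[Span]  # None when SC3 was not identified


def duration(span: Optional[Span]) -> Optional[float]:
    """Duration of a component in ms, or None if the component is absent."""
    return None if span is None else span[1] - span[0]


def measure(m: SwallowMarkers) -> dict:
    """Derive the acoustic measures described in the text from the markers."""
    present = [s for s in (m.sc1, m.sc2, m.sc3) if s is not None]
    return {
        # Total duration: start of the first component to end of the last one.
        "total": present[-1][1] - present[0][0],
        "SC1": duration(m.sc1),
        "SC2": duration(m.sc2),
        "SC3": duration(m.sc3),
        # IT1: end of SC1 to start of SC2; IT2: end of SC2 to start of SC3.
        "IT1": None if m.sc1 is None else m.sc2[0] - m.sc1[1],
        "IT2": None if m.sc3 is None else m.sc3[0] - m.sc2[1],
    }


# Made-up marker times (ms), for illustration only.
example = SwallowMarkers(sc1=(0.0, 60.0), sc2=(110.0, 300.0), sc3=(380.0, 430.0))
result = measure(example)
```

In this sketch an interval is left undefined (`None`) when one of its neighboring components is absent, which is one possible convention; the text does not specify how such recordings were handled.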
Statistics
Normality of distribution was verified by a Kolmogorov–Smirnov test and homogeneity of variances was verified by a Levene test for all parameters. Analysis of total duration with respect to the consistency and the volume of the bolus was carried out by one-factor (gender) repeated-measures ANOVA [3 (consistency: water; yogurt; mashed potato) × 3 (volume: 10; 5; 3 ml)], with the Greenhouse–Geisser correction, followed by Fisher’s LSD post hoc test to compare paired averages. Analysis of SC duration with respect to the consistency and the volume of the bolus was carried out by one-factor (gender) repeated-measures ANOVA [3 (consistency: water; yogurt; mashed potato) × 3 (volume: 10; 5; 3 ml) × 3 (SC duration: SC1; SC2; SC3)], with the Greenhouse–Geisser correction, followed by Fisher’s LSD post hoc test to compare paired averages. Analysis of the SC interval with respect to the consistency and the volume of the bolus was carried out by one-factor (gender) repeated-measures ANOVA [3 (consistency: water; yogurt; mashed potato) × 3 (volume: 10; 5; 3 ml) × 2 (interval: IT1; IT2)], with the Greenhouse–Geisser correction, followed by Fisher’s LSD post hoc test to compare paired averages. If no effect of gender was found, analyses were continued without this factor.
Results
Population
Twenty-three subjects were included in this study, yielding 621 acoustic recordings. For reasons related to the development of the acquisition technique (see the Discussion section), 13 % of the recordings were excluded from the analysis, leaving 540 records. The mean age of the subjects was 28 ± 10 years, the mean weight was 61 ± 10 kg, the mean height was 169 ± 10 cm, and the mean body mass index (BMI) was 21 ± 2 kg/m² (Table 1). Both weight and height were significantly different between women and men (p < 0.0001 for both tests). There was no significant difference between the two groups for age or BMI.
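The recording counts can be checked with a few lines of arithmetic. The sketch below assumes only the figures given in the text (23 subjects, 27 swallows per subject, 540 analyzed records); the variable names are illustrative.

```python
subjects = 23
swallows_per_subject = 3 * 3 * 3  # volumes x consistencies x repetitions = 27

total = subjects * swallows_per_subject            # expected number of recordings
analyzed = 540                                     # records kept for analysis
excluded = total - analyzed                        # records left out

excluded_pct = round(100 * excluded / total)       # share excluded, in percent
analyzed_pct = round(100 * analyzed / total)       # share analyzed, in percent

# How many complete subjects the excluded recordings would correspond to.
subjects_equivalent = excluded / swallows_per_subject
```

The 81 excluded recordings equal the full set of 27 recordings for three subjects. This is an arithmetic observation rather than something the text states explicitly, although it is consistent with the difficulties described in the Discussion.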
Acoustic Results
No significant difference was found between men and women for any of the measures studied (total duration: F(1,18) = 0.11, p = 0.74; SC duration: F(1,7) = 0.03, p = 0.86; interval duration: F(1,7) = 1.15, p = 0.32). The percentages of the recordings containing the various SCs were as follows: 100 % for SC2, 81 % for SC1, and 77 % for SC3. These values were independent of bolus type.
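The percentage of recordings containing each SC is a simple proportion. A minimal Python sketch is given below; the `component_prevalence` function and the four example records are hypothetical and do not reproduce the study data.

```python
from typing import Dict, List


def component_prevalence(records: List[Dict[str, bool]]) -> Dict[str, float]:
    """Percentage of recordings in which each sound component was identified."""
    if not records:
        raise ValueError("at least one recording is required")
    names = ("SC1", "SC2", "SC3")
    n = len(records)
    return {name: 100.0 * sum(r[name] for r in records) / n for name in names}


# Made-up presence flags for four recordings, for illustration only.
records = [
    {"SC1": True, "SC2": True, "SC3": True},
    {"SC1": True, "SC2": True, "SC3": False},
    {"SC1": False, "SC2": True, "SC3": True},
    {"SC1": True, "SC2": True, "SC3": True},
]
prevalence = component_prevalence(records)
```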
For the total duration of the sound, there was a significant effect of bolus volume (F(2,38) = 9.11; p < 0.0001), with a significant difference between boluses of 10 ml and those of 5 and 3 ml (p = 0.018, 10 vs. 5 ml for water; p = 0.008, 10 vs. 3 ml for water; p = 0.018, 10 vs. 5 ml for yogurt; p = 0.001, 10 vs. 3 ml for yogurt; p = 0.04, 10 vs. 5 ml for mashed potato; p = 0.001, 10 vs. 3 ml for mashed potato). The mean total duration of the sound was 515 ± 217 ms for boluses of 10 ml, 441 ± 150 ms for boluses of 5 ml, and 411 ± 155 ms for boluses of 3 ml (Fig. 2). No effect of consistency was found (F(2,38) = 0.54; p = 0.59), nor any interaction between consistency and volume (F(4,76) = 0.96; p = 0.9).
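The degrees of freedom reported for the total duration can be related to the design with the standard, uncorrected formulas for a two-way within-subject ANOVA. In the sketch below, the number of subjects (20) is an inference from the reported values and is not stated in the text; the function name is hypothetical, and a Greenhouse–Geisser correction would generally yield smaller, non-integer values.

```python
def rm_anova_df(n_subjects: int, levels_a: int, levels_b: int) -> dict:
    """Uncorrected degrees of freedom (effect, error) for an A x B
    within-subject design, plus a two-group between-subject factor."""
    a = levels_a - 1
    b = levels_b - 1
    s = n_subjects - 1
    return {
        "A": (a, a * s),                   # main effect of factor A
        "B": (b, b * s),                   # main effect of factor B
        "AxB": (a * b, a * b * s),         # interaction
        "between": (1, n_subjects - 2),    # two-group factor such as gender
    }


# 3 consistencies x 3 volumes; 20 subjects is an assumption, not a reported figure.
df = rm_anova_df(n_subjects=20, levels_a=3, levels_b=3)
```

Under this assumption the main effects have (2, 38) degrees of freedom, the interaction (4, 76), and the gender comparison (1, 18), which agrees with the values reported for the total duration.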
The duration of SC2 increased with bolus volume, independent of the consistency of the bolus (p = 0.0002, 10 vs. 5 ml for water; p = 0.02, 5 vs. 3 ml for water; p = 0.009, 10 vs. 5 ml for yogurt; p = 0.04, 5 vs. 3 ml for yogurt; p = 0.046, 10 vs. 5 ml for mashed potato; p = 0.05, 5 vs. 3 ml for mashed potato). Similarly, the duration of SC2 was significantly greater for the 10-ml mashed potato bolus than for the 10-ml water or yogurt bolus (p = 0.01 and 0.03, respectively). There was no detectable difference between yogurt and water whatever the bolus volume or between mashed potato and the two other textures for boluses of 5 and 3 ml (Fig. 3).
The values of SC1 and SC3 and the intervals IT1 and IT2 did not differ significantly with respect to the bolus.
Discussion
We found that the volume and consistency of the bolus swallowed affected the acoustic parameters. In particular, the total duration of the sound increased with increasing volume, and the duration of SC2 was longer for the foodstuff with the thickest consistency (mashed potato) at the largest volume. We confirmed that SC2 is the sound component that is always present in the sound of swallowing and that the duration of SC2 depends on the bolus volume and consistency.
Development of the Acoustic Technique
The acoustic recording technique used was the same as that developed by Morinière et al. [13]. The position of the microphone was that defined by Takahashi et al. [19]. However, the recordings from the first three subjects could not be used because of their poor quality and the presence of many sound artifacts associated with movement of the microphone during swallowing. We therefore mounted the microphone on a stethoscope chest piece, as described by Boiron et al. [20], and the resulting recordings were of satisfactory quality and reproducible.
Limitations of the Technique
Analysis of the acoustic data requires 1 h per subject, which is too long for the technique to be of routine use. We are working on the development of automated analysis software to reduce this time. Cervical auscultation explores only the pharyngeal phase of swallowing, the second of its three phases, and is therefore not sufficient for a complete study of swallowing disorders. The absence of morphological data is also a limiting factor of the technique.
Number of Subjects
Thirteen women and ten men were included in the study, allowing comparison between sexes; there was no significant difference between these two groups with respect to age or BMI. The mean age of the subjects included in the analysis was 28 ± 10 years. It was therefore a young population, and informative age stratification was not possible.
Bolus Volume and Consistency
We chose bolus volumes of 5 and 10 ml because these volumes have been used in many acoustic studies of swallowing. A bolus volume of 3 ml was also used to establish reference values for a volume suitable for patients who may be able to swallow only small amounts. Reconstituted dried mashed potato, yogurt, and water were used to constitute boluses of different consistencies. They provided a range of textures that could be used on a regular basis and that do not pose a swallowing problem in patients. In addition, water is frequently used in acoustic studies, yogurt was used by Boiron et al. [11], and mashed potato by Youmans and Stierwalt [21], which gave us several points for comparison. Each subject swallowed three boluses of each volume and each consistency, as in previously published acoustic studies. Morinière et al. [13] estimated that five records are necessary for the rigorous characterization of the sounds. Our analysis of the first subjects showed that the acoustic recordings were of good quality and reproducible; we therefore decided that three recordings for each volume and consistency were sufficient. Boluses were administered by syringe because this allowed appropriate mixing and control of the volume administered. This approach was also used by Boiron et al. [11], Morinière et al. [13], and Youmans and Stierwalt [21]. However, it has the disadvantage that the small diameter of the syringe’s nozzle changes the consistency of the foodstuff administered, in some cases making it more fluid. Eyigör et al. [18] used a graduated glass or spoon to avoid this disadvantage.
Acoustic Results
The percentages of recordings that contained the various sound components were as follows: 81 % for SC1, 100 % for SC2, and 77 % for SC3. The bolus type had no influence on these values. These results are comparable to those reported by Morinière et al. [14] (81 % for SC1, 100 % for SC2, and 81 % for SC3). They can be explained by the differences in duration and intensity of the SCs. Indeed, as described by Morinière et al. [14], SC2 is the longest and most intense sound component, while SC3 is shorter and less intense. SC1 and SC3 may therefore be present but go undetected by the examiner because of their short duration or low intensity.
The mean total duration of sound was 515 ± 217 ms for a bolus of 10 ml, 441 ± 150 ms for a bolus of 5 ml, and 411 ± 155 ms for a bolus of 3 ml. These times are consistent with the values reported by Takahashi et al. [22] and Cichero et al. [23] but are considerably shorter than those found by Perlman et al. [24], Boiron et al. [11], Morinière et al. [13], and Youmans and Stierwalt [21] (Table 2). Youmans and Stierwalt [21] suggested that these differences are a consequence of the ages of the populations tested; indeed, the sound of swallowing lasts longer in older subjects [18, 21].
Like Boiron et al. [11] and Youmans and Stierwalt [21], we found that increasing the volume swallowed led to an increase in the duration of the sound: the sound of swallowing lasted longer for boluses of 10 ml than for 5- and 3-ml boluses. The absence of difference between the 5- and 3-ml boluses may be due to the small difference in volume between the two.
We did not find any difference in the total duration of the sound with respect to the consistency of the bolus, unlike the findings reported in the two studies cited above [11, 21]. The absence of difference may have been due to the mode of administration of the bolus. Indeed, as described above, mashed potato and yogurt may have been made more fluid by passage through the syringe, such that their consistency became closer to that of water.
We found that the total duration of the sound was not significantly higher for men than for women. This result is in agreement with the study by Cichero et al. [23] which involved swallowing juice, and the study of Youmans and Stierwalt [21]. By contrast, Lebel et al. [25] and Takahashi et al. [22] found that the duration was shorter for women than men using volumes of water of 13 and 5 ml.
The duration of SC2 increased with bolus volume for all three bolus consistencies. The duration of SC2 was significantly longer for 10 ml of mashed potato than for 10 ml of water or yogurt. These results are consistent with those for the total duration of the sound in the studies cited above. SC2 corresponds to the passage of the bolus through the SES. The result can be explained by the fact that increasing bolus volume and consistency lengthens both the SES opening and the passage of the bolus through the SES. This finding can be used for the diagnosis of swallowing disorders due to insufficient SES opening, such as Zenker’s diverticulum, neurological disorders including Parkinson’s disease, and post-radiation stricture.
The analysis of the duration of SC2 is more informative than that of the total duration because it is more discriminative. Indeed, SC2 varies with the consistency of the bolus while the total duration of the sound does not. This can be explained by the fact that the total duration of the sound also includes SC1 and SC3 which are independent of swallowed bolus. This fact may hide a significant difference caused by an increase in the SES opening. In the future, it seems more appropriate to consider only SC2 rather than the total duration of sound or the two other SCs.
We did not find any difference between yogurt and water whatever the volume, or between mashed potato and the other two textures for boluses of 5 and 3 ml. The absence of difference between yogurt and water may have been due to the mode of administration of the bolus through the syringe, as discussed above. Thus, it can be concluded that water and yogurt, as boluses of 5 and 3 ml, provide the same acoustic information. It is not necessary to multiply these measures. A meal of mashed potato and water for volumes of 10 and 5 ml seems sufficient to obtain pertinent acoustic results. In our future studies we will focus on these boluses that are the most informative.
The average lengths of SC1, SC2, SC3, IT1, and IT2 for 10 ml found in our study agreed with those of our previous study [13].
Conclusion
The total duration of the sound of swallowing, and, in particular, the duration of the second sound component (SC2), depends on the bolus. The differences are mainly between the thickest-consistency bolus and the two other consistencies, and between the largest bolus (10 ml) and the two other volumes. This result was obtained in a small population of healthy subjects. It must be confirmed in a larger population in order to assess variations between subjects due, in particular, to the degree of SES strength or the diameter of the pharynx.
SC2 is the most characteristic element in the sound of swallowing, being both the most prevalent component (present in 100 % of recordings) and the most sensitive to variations in the nature of the bolus. It can be used for the study of swallowing disorders due to SES dysfunction.
To extend this study, we are currently working on the design of a typical or reference meal to be used in an investigation whose goal is to exploit noninvasive, acoustic swallowing analyses to establish a method for detecting an early warning of swallowing disorders.
References
1. Jean A. Brain stem control of swallowing: neuronal network and cellular mechanisms. Physiol Rev. 2001;81:929–69.
2. Cook IJ, Dodds WJ, Dantas RO, Massey B, Kern MK, Lang IM, Brasseur JG, Hogan WJ. Opening mechanisms of the human upper esophageal sphincter. Am J Physiol. 1989;257:748–59.
3. Langmore SE, Schatz K, Olsen N. Fiberoptic endoscopic examination of swallowing safety. Dysphagia. 1988;2:216–9.
4. Miller LS, Dai Q, Sweitzer BA, Thangada V, Kim JK, Thomas B, Parkman H, Soliman AM. Evaluation of the upper esophageal sphincter (UES) using simultaneous high-resolution endoluminal sonography (HRES) and manometry. Dig Dis Sci. 2004;49:703–9.
5. Mielens JD, Hoffman MR, Ciucci MR, Jiang JJ, McCulloch TM. Automated analysis of pharyngeal pressure data obtained with high resolution manometry. Dysphagia. 2011;26:3–12.
6. Ertekin C, Aydogdu I. Electromyography of human cricopharyngeal muscle of the upper esophageal sphincter. Muscle Nerve. 2002;26:729–39.
7. Albiter M, Petrow P, Kolb F, Bretagne E, Luboinski B, Sigal R. [Swallowing study with kinetic MRI using a single shot fast spin echo sequence in healthy volunteers and patients treated for head and neck cancer]. J Radiol. 2003;84:311–6 [in French].
8. Lear CS, Flanagan JB, Moorrees CF. The frequency of deglutition in man. Arch Oral Biol. 1965;10:83–99.
9. Mackoviak RD, Brenman HS, Friedman MH. Acoustic profile of deglutition. Proc Soc Exp Biol Med. 1967;123:1149–52.
10. Hamlet JA, Nelson RJ, Patterson RL. Interpreting the sound of swallowing: fluid flow through the cricopharyngeus. Ann Otol Rhinol Laryngol. 1990;99:749–52.
11. Boiron M, Rouleau P, Metman EH. Exploration of pharyngeal swallowing by audiosignal recording. Dysphagia. 1997;12:86–92.
12. Morinière S, Beutter P, Boiron M. Sound component duration of healthy human pharyngoesophageal swallowing: a gender comparison study. Dysphagia. 2006;21:175–82.
13. Morinière S, Boiron M, Alison D, Makris P, Beutter P. Origin of the sound components during pharyngeal swallowing in normal subjects. Dysphagia. 2008;23:267–73.
14. Morinière S, Boiron M, Brunereau L, Beutter P, Patat F. Pharyngeal swallowing sound profile assessed after partial and total laryngectomy. Dysphagia. 2011;26:366–73.
15. Kahrilas PJ, Lin S, Chen J, Logemann JA. Oropharyngeal accommodation to swallow volume. Gastroenterology. 1996;111:297–306.
16. Ghosh SK, Pandolfino JE, Zhang Q, Jarosz A, Kahrilas PJ. Deglutitive upper esophageal sphincter relaxation: a study of 75 volunteer subjects using solid-state high-resolution manometry. Am J Physiol Gastrointest Liver Physiol. 2006;291:525–31.
17. Dodds WJ, Stewart ET, Logemann JA. Physiology and radiology of the normal oral and pharyngeal phases of swallowing. AJR Am J Roentgenol. 1990;154:953–63.
18. Eyigor S, Perlman A, He X. Effects of age, gender, bolus volume and viscosity on acoustic signals of normal swallowing. Turk J Phys Med Rehabil. 2007;53:94–9.
19. Takahashi K, Groher ME, Michi K. Methodology for detecting swallowing sounds. Dysphagia. 1994;9:54–62.
20. Boiron M, Benchellal Z, Huten N. Study of swallowing sound at the esophagogastric junction before and after fundoplication. J Gastrointest Surg. 2009;13:1570–6.
21. Youmans SR, Stierwalt JA. Normal swallowing acoustics across age, gender, bolus viscosity and bolus volume. Dysphagia. 2011;26:374–84.
22. Takahashi K, Groher ME, Michi K. Symmetry and reproducibility of swallowing sounds. Dysphagia. 1994;9:168–73.
23. Cichero JA, Murdoch BE. Acoustic signature of the normal swallow: characterization by age, gender, and bolus volume. Ann Otol Rhinol Laryngol. 2002;111:623–32.
24. Perlman AL, Ettema SL, Barkmeier J. Respiratory and acoustic signal associated with bolus passage during swallowing. Dysphagia. 2000;15:89–94.
25. Lebel D, Parel C, Thouvenot J. Exploration de la déglutition à partir de son signal sonore. Arch Int Physiol Biochim. 1990;98:75–86.
Conflict of interest
The authors have no conflicts of interest to disclose.
Cite this article
Hammoudi, K., Boiron, M., Hernandez, N. et al. Acoustic Study of Pharyngeal Swallowing as a Function of the Volume and Consistency of the Bolus. Dysphagia 29, 468–474 (2014). https://doi.org/10.1007/s00455-014-9529-6
DOI: https://doi.org/10.1007/s00455-014-9529-6