Aerodigestive tract disorders and their sequelae in infants are increasing as a result of advances in intensive care practices [1]. Reliable methods to evaluate swallowing and airway protection in small infants are scarce and can be limited by subject noncooperation, respiratory distress, risk of aspiration, movement, and need for anesthesia. Direct flexible nasolaryngoscopy (NLS) is valuable in visualizing the anatomy of the upper aerodigestive tract and assessing glottal movement. However, it has limitations. For example, functional assessment of the glottis during the act of deglutition can be brief, and the invasive nature of the endoscopic procedure can evoke a pharyngeal gag reflex that is not present in a more physiologic state. Some infants may not tolerate this procedure because of small nostril size or compromised respiratory status; furthermore, complications such as nosebleeds, vasovagal events, laryngospasm, and aspiration have been reported [2,3].

During evaluation of swallowing, as in FEES (fiberoptic endoscopic evaluation of swallowing) in adults and children, abnormalities of the larynx or hypopharynx are visualized briefly during swallowing [4,5]. Also, during endoscopy or barium swallows in adults, glottal function during deglutition is visualized briefly [6,7]. All these brief procedures require subject cooperation and restraint and can be stressful to infants and children.

Laryngeal ultrasound in adults for the evaluation of structural problems has been described previously [8–10]. Moreover, Friedman et al. [11] established the role of laryngeal ultrasound in delineating vocal fold movement in a lamb model as well as in infants and children with vocal cord palsy. Garel et al. [12] evaluated the glottis using ultrasonography (USG), with prior knowledge of endoscopic findings in a nonblinded manner, and found USG to be useful in the evaluation of the inner structures of the larynx and its dynamic function. Vats et al. [13] showed good correlation by observers blinded to one of the modalities with ultrasound and direct laryngeal examination in children investigated for vocal cord palsy under anesthesia. Furthermore, use of general anesthetics alters the dynamic function of the larynx and limits the evaluation of sensory-motor aspects of laryngeal adductor reflexes [14]. None of the above-mentioned studies have reported concurrent correlation between the modalities tested, nor the temporal changes in glottal motion.

A reproducible and quantifiable change in vocal cord/glottal motion characteristics may permit evaluation of glottal reflexes in conditions affecting the aerodigestive tract. Therefore, the aim of this study was to correlate sequential glottal motion characteristics using concurrent NLS and USG of the glottis in children to validate the use of noninvasive USG in quantifying glottal kinematics. We tested the hypothesis that the duration of glottal closure time as an index of laryngeal adduction can be determined accurately and reliably using USG.

Subjects and Methods

Subjects

Subject characteristics are given in Table 1. Ten subjects (5 males, 5 females, age = 4.5 months to 7.1 years, mean = 2.8 years, median = 2 years) who were undergoing elective outpatient follow-up NLS as part of standard clinical care participated in this study. These subjects were evaluated for dysphagia (n = 2), hoarseness (n = 2), laryngomalacia (n = 3), and stridor (n = 3). Five subjects (50%) had a diagnosis of gastroesophageal reflux disease (GERD). Informed consent was obtained from parents before the study. The institutional review board of the Columbus Children’s Hospital/Children’s Research Institute approved this protocol, and complete confidentiality and HIPAA guidelines were followed.

Table 1. Subject characteristics

Flexible Nasolaryngoscopy

NLS was performed as part of routine evaluation of deglutition, aspiration, GERD, and vocal fold movement. Subjects were seated on the parent’s lap and were restrained as necessary by nursing personnel during the NLS. Vocal cord closure was recognized during deglutition or phonation (older subjects followed instructions to say “aah” and “eeh,” and younger ones cried). The procedure was brief, and no anesthesia was used in this ambulatory procedure, which was performed as part of standard clinical care. A flexible fiberoptic nasolaryngoscope (outer diameter = 2.5 mm) was used, and video images were obtained with a single-chip video camera (Storz # 11101 SK, # 11101 RP) coupled to the scope eyepiece and recorded in real time. Anatomy of the entire upper aerodigestive tract, including the nasal airway, nasopharynx, oropharynx, hypopharynx, and larynx, was visualized.

Concurrent Ultrasonography of the Glottis

USG was performed at the same time as NLS, using an Acuson Sequoia 512 (Siemens, Mountain View, CA) equipped with a 15L8 (7.0–14.0 MHz) linear array transducer or a Philips HDI 5000 ultrasound system (Bothell, WA) with a 12–5 MHz linear array transducer. Both are high-frequency broadband linear transducers that provide very similar performance in terms of spatial, contrast, and temporal resolution, such that there was no meaningful difference in the quality of the images obtained. The choice of transducer was based on availability. The ultrasound transducer was placed on the anterior neck using gel as an acoustic coupling medium (Fig. 1). Transverse imaging was performed at the level of the vocal cords, such that the anterior commissure of the vocal folds was visible anteriorly, and the arytenoid cartilages were visible posteriorly. This allowed evaluation of vocal cord/glottic closure. The recorded images had a reference scale incorporated into them to allow subsequent quantitative measurements of motion. Depth of penetration varied with subjects (15–30 mm) and was adjusted in real time to optimize image resolution.

Fig. 1. Schematic diagram of the positioning of the nasolaryngoscope and the ultrasound transducer in a subject.

Synchronization and Integration of NLS and USG Signals in Real Time

We used Medical Measurements Systems (MMS) software v8.3j, Build 1104 for the study (MMS, Dover, NH). Specifically, the digitization methods included the use of Matrox Meteor-II PCI, Compact PCI®, or PC/104-Plus™ form factor, which captures from NTSC, PAL, RS-170, and CCIR video sources, up to 12 video inputs, operating with a 32-bit/33-MHz PCI bus-master, and allowing real-time transfer to system or VGA memory, with extensive on-board buffering for reliable capture.

The video output signals derived from concurrent USG and NLS were integrated and synchronized using the Meteor-II video card (MMS) into real-time cine loops of 1 min. This system operates on a Windows 2000 platform and uses a Dell workstation with a 2.7-GHz Pentium 4 processor and dedicated 532 MB of memory. Using this technique, images were acquired at 30 frames/s, so the interval between two consecutive frames was 33 ms. This temporal resolution was sufficient to correlate temporal glottal movements between the two modalities. During analysis of the cine loops, the software allowed the examiner to magnify and analyze images frame by frame, with each frame representing a still image averaged over a 33-ms duration.

Sonographic Glottal Landmarks

To establish landmarks of glottal closure with USG, we imaged a fresh human infant cadaver neck to visualize a still-closed glottis (Fig. 2). Concurrently, NLS was also performed. The sonographic anatomy of glottal closure was confirmed, and these images were used as a reference to define complete glottal closure in subjects.

Fig. 2. Concurrent ultrasonographic and nasolaryngoscopic images of a still glottis in a neonatal cadaver defining complete adduction of the vocal folds. This is easily recognized centrally as an inverted T formed by the vocal folds. Arytenoid cartilage (AC), thyroid cartilage (TC), and adduction (Ad) of the vocal folds (VF) are recognized.

Data Analysis

All authors of this article were thoroughly trained in laryngeal and aerodigestive anatomy as part of their clinical training and experience. Initially, three observers (AG, ES, and GJW) visualized the NLS images and identified glottal closure. Next, two observers (AG and ES) played each cine loop frame by frame and correlated USG closure of the glottis with NLS images. Later, two observers (SRJ and BDC) were blinded to the NLS images (covered) and individually scored the glottal closure image frames using the cadaver image as a reference. The time from the onset of glottal adduction to complete adduction, the duration of complete adduction, and the time from the onset of abduction to complete abduction were measured in milliseconds.
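The frame-by-frame timing described above can be sketched in Python. This is a minimal illustration, not the software used in the study: it assumes that each glottal event is marked by four frame indices named after the abbreviations in Fig. 5, and the function names and example frame numbers are hypothetical. Only the 30 frames/s capture rate comes from the text.

```python
FRAME_RATE_HZ = 30  # capture rate stated in the text (about 33 ms per frame)


def frames_to_ms(n_frames, fps=FRAME_RATE_HZ):
    """Convert a count of video frames to a duration in milliseconds."""
    return n_frames * 1000.0 / fps


def glottal_phase_durations(vc_ad_o, vc_ad_max, vc_ab_o, vc_ab_max,
                            fps=FRAME_RATE_HZ):
    """Durations (ms) of the three phases of one glottal closure event.

    Arguments are frame indices (hypothetical annotation scheme):
      vc_ad_o   - onset of vocal cord adduction
      vc_ad_max - first frame of complete adduction
      vc_ab_o   - onset of abduction (end of complete closure)
      vc_ab_max - first frame of complete abduction
    """
    if not vc_ad_o <= vc_ad_max <= vc_ab_o <= vc_ab_max:
        raise ValueError("frame indices must be in chronological order")
    return {
        "adduction_ms": frames_to_ms(vc_ad_max - vc_ad_o, fps),
        "closure_ms": frames_to_ms(vc_ab_o - vc_ad_max, fps),
        "abduction_ms": frames_to_ms(vc_ab_max - vc_ab_o, fps),
    }


# Hypothetical event: adduction starts at frame 100, closure lasts 9 frames.
example = glottal_phase_durations(100, 106, 115, 121)
print(example)
```

Because durations are derived from frame counts, any measurement is resolved only to the nearest frame, that is, to about 33 ms at this capture rate.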

Statistical Analysis

Agreement rates for glottal closure between the blinded observers (SRJ and BDC) and the unblinded observers were computed, together with exact binomial confidence intervals for the probability of missing a 33-ms glottal closure frame. Statistical analysis was done using Stata software (Stata Corp., College Station, TX). Agreement rates are described as percentages. Durations (s) of glottal abduction and adduction are reported as mean ± SD.

Results

General Observations

The entire procedure was very brief, 1–2 min. The otolaryngologist (GJW) observed the anatomy of the aerodigestive tract, spontaneous swallowing, presence of hypopharyngeal secretions, and spontaneous vocal fold movement with breathing, phonation, and deglutition. Vocal cord motion was observed and there were no structural anomalies. Subjects tolerated concurrent USG of the glottis without any additional discomfort. A total of 12 cine loops (10,800 total frames) of about 1-min duration each were reviewed. Forty sequences of complete vocal cord abduction and adduction were recognized by NLS and USG in 770 frames from 10 subjects. The complete adduction and abduction of the vocal cords were identified as shown in Figures 2 and 3. The sequence of abduction to adduction to abduction was analyzed in the same way in all subjects and is shown in Figure 4. However, the duration of each step varied between events and between patients. Figure 5 shows the average time taken for the different steps in a glottal closure event (n = 10 subjects, pooled data).

Fig. 3. Laryngeal anatomy and definitions of vocal fold abduction with USG and NLS. On the USG, abduction (Ab) is recognized as an inverted V formed by the vocal folds. TC = thyroid cartilage; VF = vocal folds; TR = trachea; AF = aryepiglottic fold; EPI = epiglottis.

Fig. 4. A representative sequence of concurrent NLS and USG defining sequential changes in glottal opening, closure, and reopening in the same subject. During complete glottal abduction, the inverted V is recognized, and during complete adduction, an inverted T is noted. The periods of adduction onset to complete adduction, duration of adduction, and return to complete abduction can be measured. NLS images are inverted to facilitate comparison. VF = vocal folds; TC = thyroid cartilage.

Fig. 5. Measurement of sequential time intervals for the onset of vocal cord adduction, duration of complete vocal cord closure, and onset of maximal abduction from 40 sequences of glottal closure evaluated from 10 subjects. The duration (s, mean ± SD) was similar with both modalities (USG and NLS). Vc-Ad-O = vocal cord adduction onset; Vc-Ad-Max = vocal cord adduction maximum; Vc-Ab-O = vocal cord abduction onset; Vc-Ab-Max = vocal cord abduction maximum.

Concordance

Of the 770 frames, 292 depicted glottal closure, as determined by the two unblinded observers (AG and ES). Investigators 1 (SRJ) and 2 (BDC), who were blinded to the NLS images, identified the USG-determined glottal closure with 99.3% and 100% accuracy, respectively. Thus, the agreement rates between the blinded investigators and the unblinded observers were excellent and reproducible.
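The agreement rate can be illustrated with a short Python sketch. It assumes per-frame boolean labels (closure or no closure) and treats the unblinded scoring as the reference. The function name and data layout are hypothetical, and the count of two missed frames is an inference from the reported 99.3% of 292 frames, not a figure stated in the text.

```python
def agreement_rate(reference, observer):
    """Percentage of reference closure frames also scored as closure.

    reference, observer: equal-length sequences of booleans, one per frame
    (True = glottal closure). Only frames that the reference marks as
    closure count toward the denominator.
    """
    if len(reference) != len(observer):
        raise ValueError("both observers must score the same frames")
    closure_idx = [i for i, is_closed in enumerate(reference) if is_closed]
    if not closure_idx:
        raise ValueError("reference contains no closure frames")
    hits = sum(1 for i in closure_idx if observer[i])
    return 100.0 * hits / len(closure_idx)


# 770 analyzed frames, 292 of them closure frames (from the text).
reference = [True] * 292 + [False] * (770 - 292)

# ASSUMPTION: the blinded observer missed two closure frames.
observer_1 = list(reference)
observer_1[0] = False
observer_1[1] = False

rate_1 = agreement_rate(reference, observer_1)
rate_2 = agreement_rate(reference, reference)  # observer with no misses
print(round(rate_1, 1), round(rate_2, 1))
```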

Exact Binomial Confidence Intervals

With investigator 1, the probability of missing a glottal closure frame lasting 33 ms was 0.007, and the 95% CI ranged from 0.0008 to 0.024. With investigator 2, a one-sided confidence interval was computed because this observer did not miss any images, and the upper limit of the one-sided 97.5% confidence interval was 0.03.
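An exact binomial (Clopper-Pearson) interval of this kind can be sketched in pure Python by numerically inverting the binomial distribution. This is an illustration rather than the Stata routine used in the study, and the helper names are hypothetical. The input of 2 missed frames out of 292 is an assumption inferred from the reported 99.3% accuracy; the denominator behind each reported interval is not stated explicitly.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))


def _solve(f, target, increasing, iters=100):
    """Bisection on [0, 1] for f(p) = target, f monotone in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if (f(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_binomial_ci(k, n, alpha=0.05):
    """Clopper-Pearson interval for a proportion with k events in n trials.

    With k == 0 the lower limit is 0 and the upper limit equals the
    one-sided (1 - alpha/2) bound.
    """
    if k == 0:
        lower = 0.0
    else:
        # Smallest p for which P(X >= k) reaches alpha/2.
        lower = _solve(lambda p: 1 - binom_cdf(k - 1, n, p), alpha / 2, True)
    if k == n:
        upper = 1.0
    else:
        # Largest p for which P(X <= k) still reaches alpha/2.
        upper = _solve(lambda p: binom_cdf(k, n, p), alpha / 2, False)
    return lower, upper


# ASSUMPTION: 2 missed closure frames out of 292 for investigator 1.
ci_low, ci_high = exact_binomial_ci(2, 292)
print(f"miss rate = {2 / 292:.3f}, 95% CI = ({ci_low:.4f}, {ci_high:.4f})")
```

Under that assumption the limits come out close to the reported 0.0008 and 0.024. The same function covers an observer with no misses, but the resulting upper limit depends on the number of frames scored, so it is not reproduced here.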

Temporal Changes in Glottal Abduction and Adduction

Among the 40 sequences that showed glottal closure, temporal changes in glottal abduction and adduction were observed with both modalities (NLS and USG) and were similar (Fig. 4). The time from maximal abduction to maximal adduction, the duration of adduction, and the time taken to return to maximal abduction were quantified. The data were similar with both modalities (Fig. 5).

Discussion

Applying a unique approach, we correlated the laryngeal anatomy as shown by concurrent use of NLS and USG of glottis in infants and children. The novelty lies in the direct temporal correlation between the two modalities, and an ability to quantify the duration of glottal closure in a noninvasive manner. To our knowledge this is the first article that reports such a strong correlation, with many potentially useful applications in the evaluation of laryngeal adductor reflexes.

Invasive procedures contribute to stress and anxiety and are less favored by children and/or their parents. Unsedated endoscopic procedures can permit short structural evaluations and are dependent on greater subject cooperation, which is likely in adults [4,7] but not in infants or children [11–13]. Though sedation and anesthesia make the procedures more tolerable, their use may compromise the functional evaluation of the upper airway, particularly the laryngeal adduction or abduction during swallowing or phonation [11]. Alternatively, the respiratory drive may be altered and thus may not offer evaluation under basal physiologic states. Furthermore, under these conditions, aerodigestive reflexes may be diminished and thus not be reliably assessed in infants.

NLS can be advantageous in yielding greater anatomic detail of the aerodigestive tract [11]. In our study, both methods offered accurate evaluation of glottal motion based on the data on sequential changes in glottal adduction and abduction. Furthermore, both methods can be viewed on tape for later analysis. However, USG offers many advantages in other respects. Nostril size and respiratory effort do not impact the evaluation. USG does not require sedation or anesthesia and allows for prolonged recording in a physiologic setting. Noninvasive methods minimize stress and anxiety in subjects or parents. In our study, evaluation by blinded observers eliminated potential bias, and concordance with unblinded observers supported the reliability of USG for this evaluation. Concomitant evaluation of pharyngeal wall motion during deglutition may also be possible with USG [15].

Evaluation of glottal closure using endoscopy concurrent with esophageal manometry has been accomplished in adults [7]. Despite the use of smaller-size scopes for a limited duration (1–2 min), as was done in this study, endoscopy methods concurrent with manometry for evaluating aerodigestive reflexes are not practical in neonates or infants with dysphagia or aerodigestive compromise. However, pharyngo-esophageal manometry studies to evaluate dysphagia and esophageal motility in neonates have been performed [16–18], and use of USG in conjunction with pharyngo-esophageal motility studies may offer reliable evaluation of aerodigestive reflexes.

One may speculate about the variability in time differences with glottal motion. This variability may correlate with the subject’s age, clinical condition, GERD, respiratory rate, or the methods used to evoke laryngeal adduction and cannot be explained in this study. Our principal aim was to develop a temporal correlation between the two modalities. Therefore, cry versus phonation will be of little consequence in defining glottal closure, because we are evaluating glottic motion using the two modalities concurrently. In future studies, the USG methods will be useful in evaluating glottal kinematics noninvasively. For example, glottal closure time is a measure of laryngeal adduction. This measure is valuable in the evaluation of subjects with vocal cord paralysis, vocal cord function after surgery, evaluation of deglutition or speech, assessment of glottal closure in the neurologically impaired, and evaluation of stridor.

Neither the number of study infants, nor their age, nor the measurement of sequential changes during the acts of phonation or deglutition should be construed as a limitation. The study was not intended to discern differences in the data between different age groups or between different acts that are associated with glottal closure (phonation vs. deglutition); rather, we simply evaluated temporal correlation between the two modalities to validate a noninvasive method to quantify glottal kinematics.

In summary, we have shown a strong concordance between NLS and USG in evaluating glottal motion. By performing concurrent evaluation, we have validated the USG technique aimed at quantifying the parameters of glottal motion. Because of its noninvasiveness, wide availability, high fidelity, and concordance with NLS, USG should permit prolonged evaluation of the aerodigestive tract with ease. Apart from visualizing vocal fold impairment, USG of the glottis can have other potential clinical applications in the evaluation of aerodigestive reflexes in dysphagic infants when combined with pharyngo-esophageal manometry.