Introduction

Awareness of the importance of alterations in laryngopharyngeal mechanosensitivity (LPMS) has increased with recent discoveries of the role of these alterations in complications of chronic aspiration [1,2,3,4], dysphagia [1, 2, 5], obstructive sleep apnea (OSA) [6], chronic cough hypersensitivity syndrome [7, 8], irritable larynx syndrome [7, 9], and gastroesophageal reflux disease (GERD) [10, 11], among others. In these conditions, the spectrum of alterations in LPMS ranges from hypo- to hypersensitivity states, which highlights the importance of quantifying and objectively evaluating LPMS. However, the current commercially available device for assessing LPMS is designed only for evaluations of the laryngeal adductor reflex and psychophysical surface sensitivity [12] and could have limited reliability [13].

Hammer attempted to correct the reliability problems of the commercial device by developing a new device with improved stimulus reliability. Hammer noted the need for future work to improve control of the endoscope-to-target distance [14]. Additionally, larger studies including a full spectrum of conditions potentially affecting LPMS are required for Hammer’s device [14, 15].

To standardize the endoscope-to-target distance, as well as other factors potentially affecting stimulus intensity on the target, such as the angle and site of stimulus impact, a team of physicians, engineers, and a physicist recently developed a laryngopharyngeal endoscopic esthesiometer and rangefinder (LPEER) composed of a high-precision air pulse generator, an endoscopic laser rangefinder, and a polar grid [16, 17]. The specifications of this LPEER enable examiners to measure not only the laryngeal adductor reflex threshold (LART) and surface sensitivity but also the cough reflex threshold (CRT) and gag reflex threshold (GRT) [16, 17].

The importance of a valid method for measuring the LPMS is highlighted by the large burden of the conditions associated with LPMS impairment. Aspiration increases the risk of pneumonia, a condition responsible for six million years of life lost (YLLs) every year in developed countries [3, 18, 19]. In OSA, LPMS compromise correlates with disease severity [20] and is likely involved in an impaired capacity of the central nervous system to detect upper airway narrowing and to respond by increasing airway dilator muscle tone to prevent airway collapse [6]. OSA may be responsible for 41 million YLLs per year in developed countries (calculated using the population-attributable fraction of 0.16 from a meta-analysis by Wang et al. [21] and the number of YLLs due to all-cause mortality of 256 million [19]). In addition, cough hypersensitivity syndrome, which causes cough reflex hypersensitivity [22], may have a prevalence of approximately 6% [23] and causes a significant impairment in health-related quality of life [24].
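The OSA figure quoted above is the product of the population-attributable fraction and the all-cause YLLs. A minimal sketch of that arithmetic follows; the function name `attributable_ylls` is introduced here for illustration only and does not come from the cited sources.

```python
def attributable_ylls(paf: float, total_ylls_millions: float) -> float:
    """Years of life lost (in millions) attributable to a condition.

    paf: population-attributable fraction, between 0 and 1.
    total_ylls_millions: all-cause YLLs, in millions.
    """
    if not 0.0 <= paf <= 1.0:
        raise ValueError("PAF must lie between 0 and 1")
    return paf * total_ylls_millions


# Values quoted in the text: PAF of 0.16 [21] and 256 million all-cause YLLs [19].
osa_ylls = attributable_ylls(0.16, 256.0)
print(round(osa_ylls, 2))  # 40.96, reported in the text as approximately 41 million
```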

There are new experimental rehabilitation interventions aimed at improving laryngopharyngeal deficits associated with hypo- or hypersensitivity states [7, 25, 26]. However, a reproducible and valid method of measuring LPMS is necessary to quantify the effects of such interventions. A preliminary LPEER exploratory study [16] and a study conducted on stroke dysphagic and control subjects [17], corresponding to phases I–II of the diagnostic research classification proposed by Sackett and Haynes [15, 27], showed high reliability and ability to differentiate between subjects with aspiration and controls [17]. Phase I and II diagnostic test studies, which compare the test results in normal and sick people, risk overestimating the test’s accuracy and require further studies in cohorts of patients with clinical abnormalities and a clinical spectrum similar to future candidates for the test (phase III diagnostic research studies) [28,29,30]. The preliminary results obtained with the LPEER justify a study in a cohort of patients with a wider spectrum of abnormalities [15, 31].

In this work, we performed an accuracy study of the LPEER in a prospective cohort of patients with suspected dysphagia or conditions potentially affecting airway sensitivity (phase III diagnostic research study). We evaluated the capacity of the alterations in LPMS measured by the LPEER to discriminate severe alterations in swallowing safety, defined as a penetration–aspiration scale (PAS) score ≥7, and we proposed cut-off thresholds to rate the degree of compromise.

Methods

Subjects and Type of Study

This was an accuracy study performed in a prospectively and consecutively recruited cohort of patients in two dysphagia clinics at tertiary care university hospitals. Participants were recruited from patients referred with dysphagia symptoms if they fulfilled the study criteria and consented to participate. The inclusion criteria were an age of 18 years or older and the presence of symptoms or risk factors for oropharyngeal dysphagia. Some volunteers with upper airway symptoms but without dysphagia symptoms were also allowed to participate in the study as a part of the reference group. The exclusion criteria were respiratory failure, bleeding diathesis, and anticoagulation. The criteria for removing a recruited subject from the study were any degree of epistaxis or severe discomfort during the test. We set a low threshold for removing a patient from the study (e.g., minor epistaxis) to maintain the study at minimal risk. We enrolled subjects from 30 December 2013 through 19 September 2014.

The Institutional Review Board at each recruitment center approved the study, which was conducted in accordance with good clinical practices, the Helsinki Declaration and national regulations. All of the participants received oral and written explanations and provided written informed consent.

Interventions

The subjects underwent a standardized clinical evaluation performed by a speech–language pathologist (SLP) with 7 years of experience in deglutition alterations. The evaluation included questions about dysphagia symptoms and risk factors. The subjects also received a physical examination oriented to detect speech, language, and deglutition alterations. Afterwards, the subjects underwent a fiberoptic endoscopic evaluation of swallowing with sensory test (FEESST) in a seated or semi-recumbent position with a modification of the standard technique described by Aviv [17, 32]. The modification consisted of including determinations of the CRT and GRT, in addition to the LART, to enable a comprehensive evaluation of LPMS [17]. All of the reflex thresholds were explored using the LPEER, a device that is composed of a high-precision air pulse generator, an endoscopic laser rangefinder, and a polar grid and is designed to work coupled to a conventional fiberoptic endoscope. The LPEER was developed and manufactured at the Laboratories of the Schools of Medicine and Engineering of the Universities of Navarra and La Sabana by a team of medical doctors, engineers, and a physicist, who were associated with the above universities [16]. FEESST was performed with the participation of a pulmonologist with 9 years of experience in FEESST, as well as an SLP, a clinical assistant, and a technical assistant.

To perform FEESST, the pulmonologist used a pediatric fiberoptic bronchoscope with a 1.2 mm working channel (Pentax FB-10 V, Pentax of America, Montvale, NJ, USA), the LPEER, a light source (Pentax LH-150PC, Pentax of America), a head camera for capturing endoscopic images, a video system (Pentax PSV 4000, Pentax of America), and a computer (Samsung RV420 Core i5, Samsung Electronics, Suwon, South Korea) for image processing and recording the test [16].

The pulmonologist evaluated the LART with a series of 10 air pulses of 100 ms at decreasing intensities from 0.7 to 0.04 mN. CRT and GRT required stimuli of greater intensity and were assessed with a series of 10 air pulses of 1000 ms at increasing intensities from 0.8 to 16.5 mN. The pulmonologist measured each reflex threshold at least twice on each side of the laryngopharyngeal tract: the LART and CRT were measured on the laryngeal mucosa at a point between the corniculate and cuneiform cartilages, and the GRT was explored at the lateral wall of the pharynx at a point lateral to the epiglottis. We determined these points in our first exploratory study [16] because we observed that the corresponding reflexes were elicited more consistently at these points. The reflex threshold corresponded to the lowest-intensity stimulus capable of triggering the corresponding reflex (the median value of all the measurements for each reflex threshold was selected as the threshold). A protocol detailing the technique and the sites of stimulation of the laryngopharyngeal mucosa for every reflex has been published previously [17]. During the test, the stimuli were identified by numbers instead of the stimulus intensity; these numbers were blindly replaced by the corresponding intensity in mN after the study ended. We chose to quantify the stimulus intensity in mN because we found in a previous work [16] that the morphological characteristics of the air pulses prevented reliable measurements using the traditional system of measurement [12, 14, 16, 33]. However, we established the equivalence of our method of measuring the intensity of air pulses to the previous method; these results were published with the protocol described above [17].
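The threshold rule described above (lowest intensity that triggers the reflex in each series, then the median across repeated series) can be sketched as follows. This is an illustration, not the study's software: the function names, the data layout, and the example intensities are assumptions, and the text does not state how series without any response were handled, so the sketch simply skips them.

```python
from statistics import median
from typing import Dict, List, Optional


def series_threshold(responses: Dict[float, bool]) -> Optional[float]:
    """Lowest stimulus intensity (mN) that triggered the reflex in one series.

    `responses` maps each delivered intensity to whether the reflex was observed.
    Returns None when no pulse in the series elicited the reflex.
    """
    triggered = [intensity for intensity, seen in responses.items() if seen]
    return min(triggered) if triggered else None


def reflex_threshold(all_series: List[Dict[float, bool]]) -> Optional[float]:
    """Median of the per-series thresholds for one reflex.

    Assumption: series with no response are skipped (not specified in the text).
    """
    values = [t for t in (series_threshold(s) for s in all_series) if t is not None]
    return median(values) if values else None


# Made-up LART series (intensities in mN), e.g. two per side; not study data.
lart_series = [
    {0.7: True, 0.5: True, 0.3: True, 0.2: False, 0.1: False},
    {0.7: True, 0.5: True, 0.3: False, 0.2: False, 0.1: False},
    {0.7: True, 0.5: True, 0.3: True, 0.2: False, 0.1: False},
    {0.7: True, 0.5: True, 0.3: True, 0.2: True, 0.1: False},
]
lart = reflex_threshold(lart_series)
print(lart)
```

Using the median rather than the minimum makes the summary less sensitive to a single spurious response in one series.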

After the sensory evaluation, the subjects underwent a standard fiberoptic endoscopic evaluation of swallowing (FEES), including an evaluation of deglutition during dry swallows and eating, which was considered the reference standard [34, 35]. The SLP administered food of four different consistencies to the patient. The food consistencies were prepared by the SLP using cow’s milk yogurt, water, graham crackers, commercial food thickener, and green food coloring to obtain thick fluids, semi-solids (purees), solids (graham cracker), and thin fluids [17].

The pulmonologist and the SLP looked for alterations in swallowing safety and efficiency. Residues were defined as the presence of material on the laryngopharyngeal tract after swallowing, premature spillage as the premature passage of material from the mouth to pharynx, penetration as the passage of material into the laryngeal vestibule, and aspiration as the passage of material below the vocal cords (into the trachea). Alterations observed during the swallowing evaluation were rated by consensus between the pulmonologist and the SLP, including the application of the eight-point PAS and the dysphagia severity scale (DSS) [17, 36, 37].

At the end of the FEESST, the clinical assistant asked the patients about symptoms and discomfort experienced during the test. A standard form was used that included the presence or absence of symptoms and their severity using a scale from 0, indicating the absence of a symptom, to 10, indicating a symptom of the maximum intensity ever experienced by the patient.

The clinical assistant immediately registered all clinical, FEESST, and adverse event information using standard forms.

Statistical Analysis

Quantitative variables were evaluated with the Shapiro–Wilk test for normality and were analyzed accordingly, including the calculation of 95% confidence intervals (95% CIs). Because most quantitative variables had an asymmetric distribution, we used the two-tailed Mann–Whitney U test to compare groups and Spearman’s ρ correlation coefficient (SCC) to evaluate monotonic relationships. We considered P < 0.05 (two-tailed) statistically significant. However, when multiple hypothesis tests were performed in the same family of tests, we applied the Bonferroni correction: the significance level was divided by the number of comparisons in that family (e.g., for three hypothesis tests in a family, the corrected significance level was set at 0.05/3, that is, 0.017).
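The Bonferroni rule described above amounts to dividing the nominal significance level by the number of tests in the family. A minimal sketch follows; the function names and the example P values are illustrative and are not taken from the study data.

```python
from typing import List


def bonferroni_alpha(alpha: float, n_tests: int) -> float:
    """Corrected significance level for a family of n_tests hypothesis tests."""
    if n_tests < 1:
        raise ValueError("a family must contain at least one test")
    return alpha / n_tests


def significant_after_correction(p_values: List[float], alpha: float = 0.05) -> List[bool]:
    """Flag which P values in one family fall below the corrected level."""
    threshold = bonferroni_alpha(alpha, len(p_values))
    return [p < threshold for p in p_values]


# Three tests in one family, as in the example given in the text.
print(round(bonferroni_alpha(0.05, 3), 3))  # 0.017

# Hypothetical P values for illustration only.
flags = significant_after_correction([0.0005, 0.03, 0.016])
print(flags)  # [True, False, True]
```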

The discriminative capacity of each reflex and of some combinations of reflexes was evaluated by plotting ROC curves and calculating the area under the curve (AUC). Their 95% CI was calculated using the binomial exact method, and their significance was tested against the null hypothesis of an AUC = 0.5. We selected cut-off points for each reflex to classify patients according to the degree of compromise in swallowing safety according to the distributions of medians and ROC curves.
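For readers unfamiliar with the AUC, it equals the probability that a randomly chosen patient with the target condition has a higher reflex threshold than a randomly chosen patient without it, with ties counted as one half (the normalized Mann–Whitney U statistic). The sketch below computes it directly from that definition. It is not the MedCalc or SPSS procedure used in the study, and the threshold values are invented for illustration.

```python
from typing import Sequence


def roc_auc(positives: Sequence[float], negatives: Sequence[float]) -> float:
    """AUC from pairwise comparisons of test values.

    positives: thresholds of patients with the target condition (e.g. PAS >= 7).
    negatives: thresholds of patients without it.
    Higher values are assumed to indicate the condition; ties count one half.
    """
    if not positives or not negatives:
        raise ValueError("both groups must contain at least one value")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Invented LART sums (mN), for illustration only.
with_condition = [0.9, 1.4, 0.5, 1.1]
without_condition = [0.2, 0.3, 0.5, 0.6, 0.25]
auc = roc_auc(with_condition, without_condition)
print(auc)
```

An AUC of 0.5 corresponds to no discrimination, which is the null hypothesis tested in the analysis described above.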

We calculated that the required sample size was 117 subjects using an equation proposed by Machin et al. [38] for ROC curves, with an estimated proportion of sick subjects of 0.3, a sensitivity of 0.9, a specificity of 0.7, and a 95% CI width of 0.08 per side.

The software used for the statistical analysis included Microsoft Excel 2007 (Microsoft Corporation, Redmond, WA, USA), MedCalc, version 14.12.0 (MedCalc Software bvba, Ostend, Belgium; http://www.medcalc.org; 2014), and IBM-SPSS Statistics software, version 20 (Armonk, NY, USA).

Results

Enrollment Flowchart

We assessed 142 subjects, of whom 2 met the exclusion criteria, 12 declined to participate, 8 exhibited continuous laryngopharyngeal movements that impeded reflex determination, and 2 were withdrawn due to minor adverse events. We finally included 118 subjects in the study (Fig. 1).

Fig. 1 Enrollment flowchart

Baseline Characteristics

The mean age of the patients was 55.7 years. The sexes were equally represented. The most frequent underlying diseases were stroke, GERD, and neurodegenerative diseases. Of the 118 patients, 93 (79%) had some degree of oropharyngeal dysphagia, with the severity ranging from 1 to 8 on the PAS (Table 1).

Table 1 Baseline characteristics of the entire cohort

Median Threshold Values by Alterations in Swallowing Safety

Patients with alterations in swallowing safety of thin liquids had a dose–response gradient in the LART, ranging from a median sum of right and left LART of 0.42 mN for those with residues to 1.39 mN for those with thin liquid aspiration. P values for thin liquid penetration and aspiration (P < 0.001) were in the range of statistical significance after applying the Bonferroni correction (P < 0.017) compared to the reference group (normal FEES). This dose–response gradient was also observed, albeit less clearly, with other consistencies and reflex thresholds. Patients with pharyngeal residues and without penetration or aspiration had lower degrees of reflex threshold compromise; compared to the reference group, this difference reached statistical significance (P < 0.017) for most reflex thresholds but did not reach statistical significance in the LART and GRT for some food consistencies. However, more severe alterations in swallowing safety, such as penetration and aspiration, were associated with important increases in LART, CRT, and GRT, which were clinically (4-fold greater) and statistically significant compared with those of the reference group (see detailed information in Table 2).

Table 2 Relationships between reflex thresholds and alterations in swallowing safety during FEESST

This dose–response gradient was also supported by finding a positive correlation between the PAS and the LART (SCC 0.47; 95% CI 0.32–0.60; P < 0.001), CRT (SCC 0.46; 95% CI 0.30–0.59; P < 0.001), and GRT (SCC 0.34; 95% CI 0.17–0.49; P = 0.002).
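As a reminder of what the SCC measures, Spearman's ρ is the Pearson correlation computed on ranks, with tied observations (common on an ordinal scale such as the PAS) assigned their average rank. The sketch below implements that definition in plain Python. It does not reproduce the confidence intervals reported above, and the paired values are invented for illustration.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current group of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho: Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(x), average_ranks(y))


# Invented PAS scores and LART values (mN), for illustration only.
pas_scores = [1, 1, 3, 5, 7, 8]
lart_values = [0.10, 0.20, 0.15, 0.40, 0.60, 0.50]
rho = spearman_rho(pas_scores, lart_values)
print(round(rho, 2))
```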

Sensory Thresholds by Condition

Grouping the patients according to the presence of conditions that may affect the sensory thresholds of the laryngopharyngeal tract, we found that patients with DSS scores >2 had LART 1.6-fold higher, CRT 4-fold higher, and GRT 1.4-fold higher than the reference group (normal FEES, all P values <0.005, Table 3). A small group of patients with upper airway hypersensitivity-irritable larynx syndrome showed a trend toward higher LART and lower CRT and GRT than the reference group. Patients with GERD showed similar thresholds to the reference group (Table 3).

Table 3 Reflex thresholds by selected conditions

Discriminative Capacity of Laryngopharyngeal Reflex Thresholds for Severe Dysphagia

An analysis of the ROC curves and AUC values to assess the discriminative capacity of laryngopharyngeal reflexes to differentiate patients with a PAS ≥7 (aspiration without ejection of aspirated material) on at least two different food consistencies revealed AUC values ranging from 0.79 to 0.83 for LART and CRT (all P < 0.0001, Fig. 2; Table 4). The sum of LART and CRT was associated with the largest AUC (AUC 0.86, P < 0.0001). GRT had the lowest performance, with an AUC value of 0.72 (P < 0.0001). The cut-off points with the best balance of sensitivity and specificity for detecting a PAS ≥7 and their respective likelihood ratios and predictive values (for a population with a prevalence of PAS ≥7 on at least two different food consistencies of 13%) are shown in Table 5. LART had the best performance in terms of positive and negative likelihood ratios.
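Likelihood ratios depend only on sensitivity and specificity, whereas predictive values also depend on prevalence, which is why the prevalence of 13% is stated above. The sketch below shows the standard formulas. The sensitivity and specificity used in the example (0.80 each) are placeholders chosen for illustration and are not the values of Table 5; only the 13% prevalence comes from the text.

```python
from typing import Dict


def diagnostic_summary(sensitivity: float, specificity: float, prevalence: float) -> Dict[str, float]:
    """Likelihood ratios and predictive values for a dichotomized test."""
    for name, value in (("sensitivity", sensitivity), ("specificity", specificity), ("prevalence", prevalence)):
        if not 0.0 < value < 1.0:
            raise ValueError(f"{name} must lie strictly between 0 and 1")
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    return {
        "lr_positive": sensitivity / (1.0 - specificity),
        "lr_negative": (1.0 - sensitivity) / specificity,
        "ppv": true_pos / (true_pos + false_pos),
        "npv": true_neg / (true_neg + false_neg),
    }


# Placeholder sensitivity and specificity; prevalence of 13% as stated in the text.
summary = diagnostic_summary(sensitivity=0.80, specificity=0.80, prevalence=0.13)
for key, value in summary.items():
    print(f"{key}: {value:.2f}")
```

With a prevalence this low, even a test with balanced sensitivity and specificity yields a modest positive predictive value and a high negative predictive value.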

Fig. 2 ROC curves of the reflex thresholds to discriminate penetration–aspiration scale ≥7 on at least two different consistencies: (a) sum of right and left LART, (b) sum of right and left CRT, (c) sum of right and left GRT, and (d) sum of LART and CRT

Table 4 Area under the curve–ROC to discriminate PAS ≥7 by reflex
Table 5 Selected cut-off values to discriminate PAS ≥7 on at least two different food consistencies by reflex

Considering the median values and their 95% CIs observed in alterations of swallowing safety of increasing severity (pharyngeal residues, penetration, and aspiration), we propose the following cut-off points for grading the severity of LART compromise: mild compromise, from 0.2 to 0.3 mN; moderate compromise, from 0.3 to 0.4 mN; and severe compromise, greater than 0.4 mN. For CRT, the cut-off points proposed for grading the severity of compromise are: mild compromise, from 8 to 12 mN; moderate compromise, from 12 to 14 mN; and severe compromise, greater than 14 mN (Table 6). Due to the wide distribution of GRT in this cohort of patients, we do not propose cut-off points for grading the compromise of this reflex.
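The proposed grading can be expressed as a simple lookup. The sketch below is illustrative: the function and constant names are ours, values below the mild range are labeled "below mild range" because the text does not name that category, and a value falling exactly on a shared boundary (e.g., 0.3 mN) is assigned to the lower grade, which is an assumption the text does not settle.

```python
from typing import Tuple

# Cut-off points in mN as proposed in the text: (mild from, moderate from, severe above).
LART_CUTS: Tuple[float, float, float] = (0.2, 0.3, 0.4)
CRT_CUTS: Tuple[float, float, float] = (8.0, 12.0, 14.0)


def grade_compromise(threshold_mn: float, cuts: Tuple[float, float, float]) -> str:
    """Grade the severity of reflex threshold compromise.

    Assumption: a value exactly on a shared boundary takes the lower grade.
    """
    mild_from, moderate_from, severe_above = cuts
    if threshold_mn > severe_above:
        return "severe"
    if threshold_mn > moderate_from:
        return "moderate"
    if threshold_mn >= mild_from:
        return "mild"
    return "below mild range"


print(grade_compromise(0.35, LART_CUTS))  # moderate
print(grade_compromise(15.0, CRT_CUTS))   # severe
```

No grading is sketched for GRT because no cut-off points are proposed for that reflex.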

Table 6 Cut-off point values for grading the severity of LART and CRT compromise

Adverse Events

Two patients withdrew from the study because of minor adverse events (one patient experienced epistaxis lasting less than 2 min and one patient reported severe nose discomfort caused by the endoscope). In general, the test was well tolerated; most patients reported mild to moderate discomfort. No patients required observation in the examination room for more than 5 min, referral to the emergency room, or hospitalization (Table 7).

Table 7 Adverse events during FEESST

Discussion

We assessed the accuracy of LPMS evaluations performed using an LPEER in a cohort of patients with varied underlying diagnoses and a dysphagia prevalence and severity comparable to what is likely observed in many dysphagia clinics. In this cohort, we found a strong association between disturbances in laryngopharyngeal reflexes and alterations in swallowing safety, with a dose–response gradient and a very good discriminative capacity for LART and CRT, as shown by the ROC curves and AUC analysis [39].

The discriminative capacity of GRT was lower than that of the other reflexes but remained acceptable for studying dysphagic patients [39]. This finding is consistent with other reports showing a lower discriminative capacity for the gag reflex in this type of patient [40]. However, our findings do not exclude the potential benefits of GRT exploration in patients with upper airway hypersensitivity conditions, such as irritable larynx syndrome [9] and amyotrophic lateral sclerosis [41], because the distribution of values at the low-threshold extreme may be narrower than that found here. Given the absence of other available methods for objective and quantitative measurements of this reflex and our finding of acceptable validity for GRT exploration, we think that it would be worthwhile to conduct future validation studies for GRT in patients with conditions associated with upper airway hypersensitivity.

Our findings, in combination with those of the LPEER reliability study [17], support the use of LPEER for the evaluation of LPMS through assessments of LART, CRT, and GRT in dysphagic and non-dysphagic subjects; however, these findings require validation in independent populations. The use of LPEER can help to clarify the interactions between motor and sensory disturbances observed in dysphagia, especially those with neurogenic origins. LPEER may also improve the selection of patients with impaired swallowing safety for percutaneous endoscopic gastrostomy (PEG), considering not only their motor and safety disturbances but also their sensory evaluation. Aviv reported preliminary data concerning the potential utility of sensory test results for preventing pneumonia in stroke patients [32]. The proper selection of patients for PEG would allow those who do not benefit from it to avoid unnecessary exposure to its potential complications and those who do benefit from it to receive it opportunely [42].

Aviv previously reported a combined effect of sensory and motor disturbances on alterations in swallowing safety [5], and we found a strong association between sensory disturbances and alterations in swallowing safety. Our findings and those of Aviv are consistent with previous reports of the effects of sensory alterations on motor disorders of limb muscles in patients who have experienced stroke, and they support the importance of sensory feedback on motor function [43].

Another potential utilization of LPMS testing using LPEER is the objective assessment of therapeutic interventions aimed at improving sensory abnormalities in dysphagic patients [25, 26]. LPMS evaluations in these patients may help confirm how much of their improvement is due to sensory recovery. This information would also help in perfecting such interventions.

Interventions that have been developed to treat laryngopharyngeal hypersensitivity conditions [7] may also benefit from the objective assessment of LPMS to determine the proportion of their effect that is mediated through improvements in the hypersensitivity state. These interventions could then be adjusted to improve their effectiveness.

The comprehensive exploration of LPMS, particularly the cough reflex, may also facilitate a better understanding of the mechanisms of cough under a variety of conditions, such as chronic cough hypersensitivity syndrome, irritable larynx, and persistent cough after viral infections [24, 44, 45]. Of particular interest is our finding of a trend towards a lower threshold for cough in patients with irritable larynx syndrome despite the small number of patients with this diagnosis. This finding is consistent with previous studies that found hypersensitivity to chemical stimuli in this condition [9, 45, 46], which merits further exploration in future studies of patients with chronic cough hypersensitivity and irritable larynx syndrome.

The LPEER may also prove to be a useful tool for better understanding the pathophysiology of OSA and the possible interactions between motor and sensory disturbances in the generation of obstructive airway events in patients suffering from OSA [6]. Sensory-improving therapies are likely to be explored in these patients, and these studies would benefit from objective, quantitative evaluations of LPMS.

In a previous study, we found normal values of 0.14 mN for LART, 4.4 mN for CRT, and 11.9 mN for GRT [17]. This normal LART value equals 2.5 mmHg in Aviv’s method of measurement, as stated in the previous LPEER study and its supplementary appendix [17], and is consistent with Aviv’s findings in healthy subjects, as well as with Grushka’s findings of a mechanosensitivity threshold of 0.15 mN in the most sensitive part of the tongue [12, 47]. These values are also consistent with our proposed cut-off points for grading LART and CRT based on their values according to the severity of swallowing abnormalities (Table 6). We propose that these cut-off points are linked to the severity of swallowing abnormalities detected on FEES instead of the normal or Gaussian distribution because our threshold values did not fit the Gaussian distribution (as with many other quantitative diagnostic tests) [15] and because it is more relevant and useful for the clinician to have test values that are linked to the target disorder (in our case dysphagia) [15]. We did not propose cut-off points for GRT because this reflex had a wide distribution with very high thresholds even in patients with normal FEES.

To the best of our knowledge, this study represents the most extensive accuracy evaluation of LPMS. However, our results and cut-off points require validation in at least one independent population with similar characteristics and in patients with other conditions potentially affecting LPMS, such as OSA and laryngeal hypersensitivity. We hope that our findings will be helpful for incorporating objective LPMS testing into the study, diagnosis and management of patients with conditions associated with alterations in LPMS.

Although we did not administer any anesthetics, the LPMS test with the LPEER was well tolerated. It produced only mild to moderate discomfort, and no adverse events occurred that required referral to the emergency room or hospital admission. We removed two patients from the study because of minor adverse effects: one had mild, self-limited epistaxis and the other had severe nasal discomfort. These adverse effects would likely not warrant suspending the test in real clinical scenarios; we removed the two patients to maintain the study at minimal risk.

Conclusion

LPEER has good discriminative capacity for sensory evaluation of the laryngopharyngeal tract in dysphagic patients. Sensory compromise appears to play a key role in the development of alterations in swallowing safety, and its assessment would further the current understanding of the mechanism of dysphagia and may improve evaluations of the efficacy of interventions aimed at recovering sensory deficits.