Introduction

Open partial horizontal laryngectomies (OPHLs) are conservative surgical techniques for the treatment of selected laryngeal carcinomas. They have extended the use of partial laryngectomy to laryngeal cancers up to anterior T4a tumours, with a better quality of life (QOL) than total laryngectomy [1]. Preservation of the main laryngeal functions (respiration, phonation and swallowing) is achieved by maintaining one or both functioning crico-arytenoid unit(s) with the corresponding arytenoid(s) and the intact recurrent laryngeal nerve(s) of the same side. According to the inferior limit of surgical resection, OPHLs are divided into type I (supraglottic laryngectomy), type II (supracricoid laryngectomy) and type III (supratracheal laryngectomy) [2]. Moreover, two variants of OPHL type II and type III exist: (1) types IIa and IIIa, in which the epiglottis is preserved; (2) types IIb and IIIb, in which the resection involves the epiglottis.

Swallowing is always affected in the first weeks after OPHL, but it spontaneously recovers during the first 6 postoperative months [3,4,5,6,7]. Indeed, the vast majority of patients achieve an unrestricted diet between the sixth month and the first year following surgery [4, 8,9,10,11,12]. Nevertheless, a certain degree of chronic aspiration, especially with liquids, is often still detected in the long term. The rate of aspiration pneumonia is low [4, 5, 11,12,13,14,15]. Despite this low rate, the onset of aspiration pneumonia has been found to negatively influence post-treatment survival in patients with head and neck cancer [16, 17]. Therefore, assessing the signs of dysphagia that expose the patient to the risk of aspiration pneumonia (namely penetration and aspiration) is of utmost importance both to evaluate treatment outcomes and to make an accurate prognosis, together with other factors such as medical, nutritional and oral status.

A standard for assessing swallowing function after OPHL has not yet been established. A review of the literature has shown great variability in the measures used to investigate swallowing functional outcomes after OPHL [18]. Concerning the assessment of penetration and aspiration, many studies use clinical scales (e.g., the Leipzig [19] and Pearson [20] scales) to infer the presence of penetration and aspiration from signs observable during the clinical assessment of swallowing (i.e., cough, throat clearing). However, as penetration and aspiration occur silently in part of the OPHL population because of reduced sensitivity [12, 21, 22], the use of these scales may underestimate their rate. Other studies have investigated the presence of penetration and aspiration through an instrumental assessment, especially videofluoroscopy and fiberoptic endoscopic evaluation of swallowing (FEES). In these studies, non-validated outcome scales (4- or 5-point ordinal scales) and scales validated only for normal anatomy (the penetration–aspiration scale, PAS [23]) have been used to score the presence and the degree of penetration and aspiration. The heterogeneity of the methods applied to assess penetration and aspiration in the OPHL population compromises the possibility of comparing and combining results from different studies.

The PAS is a widely used ordinal scale, rating on 8 points the severity of penetration and aspiration. The score is assigned based on three variables:

  1. The anatomical depth of bolus invasion into the airway.

  2. The presence of a response to the inhaled material.

  3. The efficacy of the “ejection” of the inhaled material.

First introduced by Rosenbek and colleagues in 1996 for application to videofluoroscopy, the PAS has become widely used as a common standard for the interpretation of both videofluoroscopy and FEES. Although some shortcomings of the scale have more recently been pointed out [24], the PAS is still considered a valuable tool. The anatomical depth of airway invasion is assessed based on the position of the bolus: into the larynx above the vocal folds; into the larynx to the level of the vocal folds; or below the vocal folds. In FEES, the entrance of the laryngeal vestibule is marked by the epiglottis, anteriorly, and the arytenoids, posteriorly. OPHLs significantly change the anatomy of the larynx, as the thyroid cartilage, in some cases the epiglottis (in OPHL type I and in type b of OPHL types II and III) and the vocal folds (in OPHL types II and III) are removed; therefore, the PAS landmarks for the entrance into the laryngeal vestibule (epiglottis and arytenoids) and for the level of the vocal folds are not present. Consequently, the original version of the PAS cannot be applied to the modified laryngeal configuration during FEES.

Therefore, the study aimed to adapt the PAS to the altered anatomy after OPHLs as observed during FEES and to test the reliability of the resulting OPHL-PAS. The hypothesis was that the OPHL-PAS is a reliable scale to assess penetration and aspiration with FEES in this population. If the reliability of the OPHL-PAS is demonstrated, its application in both clinical and research practice may provide a common language for swallowing experts, allowing results from different studies on similar populations to be combined to improve statistical power and the outcomes of different treatments to be compared.

Materials and Methods

This cross-sectional study was carried out according to the Declaration of Helsinki and it was previously approved by the Institutional Review Board of the Luigi Sacco Hospital. The study is a secondary analysis of a larger study on long-term functional outcomes after OPHL. All subjects enrolled in the study gave their written informed consent; all data were collected prospectively between 1 October 2012 and 31 October 2013.

Adaptation of the PAS to OPHL

A working group composed of a phoniatrician and a speech and language therapist (SLT), each with at least 5 years of experience in the interpretation of FEES in patients who underwent an OPHL, adapted the PAS to the altered anatomy of the OPHL. In particular, two landmarks needed to be identified: the entry of the laryngeal vestibule (for OPHL type I and types IIb–IIIb) and the neoglottis (for OPHL types II and III). The landmarks identified for the entry of the laryngeal vestibule depended on the type of surgery:

  • Type I The scar of the pexy (Fig. 1).

    Fig. 1 Entry of the laryngeal vestibule in OPHL type I without (a) and with (b) the landmark. 1. Valleculae; 2. arytenoids; 3. vocal folds; 4. pyriform sinus; 5. posterior pharyngeal wall

  • Type IIb and Type IIIb The line where the arytenoid(s) contact(s) the base of tongue during phonation (Fig. 2). The ideal line is firstly identified during a phonation task before the swallowing trials, and then, the entrance of the bolus in the laryngeal vestibule is assessed in the post-swallow configuration of the larynx.

    Fig. 2 Entry of the laryngeal vestibule in OPHL type IIb and type IIIb without (a) and with (b) the landmark (1. phonation configuration, 2. post-swallow configuration). 1. Uvula; 2. base of tongue; 3. arytenoids; 4. laryngeal vestibule; 5. neoglottis; 6. pyriform sinus; 7. posterior pharyngeal wall

The neoglottis was identified for OPHL type II (a and b) and type III (a and b) at the level of the scar of the pexy (Fig. 3).

Fig. 3 Neoglottis in OPHL type II and type III without (a) and with (b) the landmark. 1. Laryngeal vestibule; 2. arytenoids; 3. pyriform sinus; 4. posterior pharyngeal wall

Table 1 compares the original PAS to the OPHL-PAS, according to the OPHL type.

Table 1 Original PAS and OPHL-PAS

Population

Patients were selected from a database of 1081 patients who underwent OPHL at the Department of Otorhinolaryngology of the Martini Hospital of Turin and of the Civil Hospital of Vittorio Veneto for the primary study of functional outcomes after OPHLs. Selection criteria were OPHL, no evidence of disease (NED) at the last follow-up, preservation of respiration and speech, no enteral tube feeding (i.e., absence of a percutaneous endoscopic gastrostomy [PEG] or nasogastric tube [NGT]), absence of a tracheostomy, no salvage total laryngectomy performed and at least 6 months of follow-up with FEES. All patients underwent the same preoperative and postoperative assessment and management as described by Rizzotto et al. [25].

Patients from the database were screened for the inclusion criteria and afterwards stratified based on the OPHL type (I, II and III). A unique identification number was assigned to each patient and random numbers were generated to select 90 patients. The patients were divided as follows: 27 patients underwent OPHL type I, 31 patients underwent OPHL type II and 32 patients underwent OPHL type III. After random selection of the patients, the surgery type was checked to verify the presence of at least one patient for each surgical option (type a, type b, extended to one arytenoid (+ ARY), extended to one piriform sinus (+ PIR), extended to the base of tongue (+ BOT), extended to one crico-arytenoid unit (+ CAU)). Of the 90 patients, 85 (94.4%) were male and 5 (5.6%) were female. Median age was 64 years (range 40–85). Clinical and treatment characteristics of the sample are reported in Table 2. The median time from surgery to FEES was 38.5 months (range 6–191).
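The stratified random selection described above can be illustrated with a minimal sketch. It assumes that the number of patients drawn per OPHL type was fixed at the reported counts (27, 31 and 32); the identification numbers, the pool sizes, the seed and the function name are hypothetical and are not taken from the study database.

```python
import random

# Number of selected patients per OPHL type, as reported in the text.
QUOTAS = {"I": 27, "II": 31, "III": 32}

def stratified_sample(eligible, quotas, seed=0):
    """Draw a simple random sample within each OPHL-type stratum.

    eligible: dict mapping OPHL type -> list of patient identification numbers
    quotas:   dict mapping OPHL type -> number of patients to draw
    seed:     hypothetical seed, used only to make the sketch reproducible
    """
    rng = random.Random(seed)
    return {t: sorted(rng.sample(ids, quotas[t])) for t, ids in eligible.items()}

# Hypothetical eligible pool: these identification numbers are invented.
eligible = {
    "I": list(range(1000, 1100)),
    "II": list(range(2000, 2150)),
    "III": list(range(3000, 3120)),
}
selected = stratified_sample(eligible, QUOTAS)
total = sum(len(ids) for ids in selected.values())
print(total)  # 90
```

Sampling without replacement within each stratum guarantees that no patient is selected twice and that the per-type counts match the quotas.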

Table 2 Frequency distribution of clinical and treatment characteristics of the patients’ sample

Fiberoptic Endoscopic Evaluation of Swallowing

FEES was conducted using an Olympus Evis Exera II 18 endoscopy system and an Olympus ENF VQ trans-nasal flexible endoscope (Olympus Corporation, Tokyo, Japan); each FEES was video-recorded. Swallowing of liquids, semisolids and solids was assessed using blue-dyed water at room temperature, puddings and crackers, respectively. A 5 cc bolus was given to each participant three times for liquids and for semisolids, while three trials with a quarter of an 8 g cracker each were carried out for testing solids.

Inter-rater and intra-rater Assessment

FEES recordings collected during the primary study on functional outcome after OPHLs were renamed, randomized and assessed by two independent SLTs who were not involved in the scale adaptation. The SLTs had at least 4 years of experience in the field of swallowing and attended a specific 4-h training session on scoring with the OPHL-PAS. The SLTs were aware of the type of OPHL (type I, II or III) performed on each patient. Afterwards, all videos were renamed, randomized and re-assessed by both SLTs at least 15 days after the first video analysis. If necessary, the raters could view the video frame by frame.

For each FEES recording, the raters were asked to record:

  • OPHL-PAS scores: one OPHL-PAS score for each consistency. The score was assigned after all the trials of a given consistency had been viewed, and the worst OPHL-PAS score observed for that consistency was recorded.

  • Number of views: the number of views the raters required to assign an OPHL-PAS score was recorded for each consistency. A new view was counted every time the rater replayed the whole video or only part of it.

  • Difficulty rating: for each FEES the raters judged the perceived difficulty in identifying the landmark for the entry of the laryngeal vestibule and the neoglottis, when applicable. The difficulty was rated on a visual analogue scale (VAS) ranging from 0 (extremely easy) to 10 (extremely difficult).
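The rule of keeping the worst score per consistency can be sketched as follows. The sketch assumes that, as in the original PAS, a higher score indicates more severe airway invasion, so that the worst score is the maximum; the function name and the example ratings are hypothetical.

```python
def worst_score_per_consistency(trial_scores):
    """Return the worst (highest) OPHL-PAS score observed for each consistency.

    trial_scores: dict mapping consistency -> list of per-trial scores (1-8).
    Consistencies with no trials (e.g. solids not tested) are omitted.
    """
    return {c: max(scores) for c, scores in trial_scores.items() if scores}

# Hypothetical ratings for one FEES recording, three trials per consistency.
trials = {"liquid": [1, 3, 2], "semisolid": [1, 1, 1], "solid": [2, 1, 1]}
summary = worst_score_per_consistency(trials)
print(summary)  # {'liquid': 3, 'semisolid': 1, 'solid': 2}
```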

Statistical Analysis

Statistical analysis was performed with the IBM SPSS Statistics 24.0® package for Windows (SPSS Inc, Chicago, IL). Inter- and intra-rater agreement for the OPHL-PAS was assessed through the non-weighted Cohen’s kappa, firstly for the whole sample and then separated for each OPHL type (I–II–III) and consistency (liquid–semisolid–solid). According to the value of Cohen’s kappa, the level of agreement was considered: none for 0 ≤ k ≤ 0.20, minimal for 0.21 ≤ k ≤ 0.39, weak for 0.40 ≤ k ≤ 0.59, moderate for 0.60 ≤ k ≤ 0.79, strong for 0.80 ≤ k ≤ 0.90, and almost perfect for k > 0.90 [26].
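For illustration, the non-weighted Cohen’s kappa and the interpretation bands listed above can be sketched in plain Python. This is not the SPSS procedure used in the study. Two choices in the sketch are assumptions: values that fall in the gaps between the printed bounds (e.g., between 0.20 and 0.21) are assigned to the lower band, and negative values are labelled "none". The example ratings are hypothetical.

```python
from collections import Counter

def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two equally long sequences of ratings."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed proportion of exact agreement.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance from the marginal score frequencies.
    count_a, count_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when expected agreement is 1")
    return (observed - expected) / (1.0 - expected)

def agreement_level(kappa):
    """Map kappa to the bands used in the text (gap values go to the lower band)."""
    if kappa < 0.21:
        return "none"
    if kappa < 0.40:
        return "minimal"
    if kappa < 0.60:
        return "weak"
    if kappa < 0.80:
        return "moderate"
    if kappa <= 0.90:
        return "strong"
    return "almost perfect"

# Hypothetical scores from two raters for ten swallows.
rater_1 = [1, 1, 2, 3, 1, 8, 2, 1, 1, 3]
rater_2 = [1, 1, 2, 3, 1, 8, 3, 1, 1, 3]
k = cohen_kappa(rater_1, rater_2)
print(round(k, 3), agreement_level(k))  # 0.848 strong
```

Because the kappa is unweighted, a disagreement of one level counts as much as a disagreement of several levels; only exact matches contribute to the observed agreement.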

The number of views was compared among different OPHL types using the Kruskal–Wallis test. When the test was significant, a post hoc analysis was conducted with a Bonferroni correction for multiple comparisons. The difficulty rating for the identification of the neoglottis was compared between patients with OPHL type II and patients with OPHL type III. The Mann–Whitney U test was used because the VAS distribution violated the assumption of normality according to the Kolmogorov–Smirnov test. No comparison among different OPHL types was conducted for the difficulty rating in identifying the entry of the laryngeal vestibule because of the small number of patients with OPHL types IIb and IIIb. Statistical significance was set at p < 0.05.
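As a rough illustration of the Kruskal–Wallis comparison across the three OPHL types, the sketch below computes the tie-corrected H statistic and its chi-square approximation in plain Python. It is restricted to exactly three groups, for which the chi-square survival function with 2 degrees of freedom reduces to exp(−H/2). Whether SPSS reported asymptotic or exact p values is not stated in the text, so the asymptotic form is an assumption, and the view counts are hypothetical.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    tie_sizes = []
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j are 0-based
        for pos in range(i, j + 1):
            ranks[order[pos]] = mean_rank
        tie_sizes.append(j - i + 1)
        i = j + 1
    return ranks, tie_sizes

def kruskal_wallis_three_groups(g1, g2, g3):
    """Return (H, p) for three independent groups, with tie correction."""
    groups = [g1, g2, g3]
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks, tie_sizes = average_ranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * total - 3 * (n + 1)
    correction = 1 - sum(t ** 3 - t for t in tie_sizes) / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all observations are identical")
    h /= correction
    p = math.exp(-h / 2)  # chi-square survival function, 2 degrees of freedom
    return h, p

def bonferroni(p_value, n_comparisons):
    """Bonferroni-adjusted p value, capped at 1."""
    return min(1.0, p_value * n_comparisons)

# Hypothetical numbers of views per patient for OPHL types I, II and III.
views_i = [1, 1, 1, 2, 1, 2]
views_ii = [2, 1, 2, 2, 3, 1]
views_iii = [2, 3, 2, 4, 2, 3]
h_stat, p_value = kruskal_wallis_three_groups(views_i, views_ii, views_iii)
print(h_stat, p_value)
```

With three OPHL types there are three pairwise post hoc comparisons, so each pairwise p value would be passed to `bonferroni(p, 3)` before being compared with 0.05.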

Results

Overall, each rater assessed 801 swallows (270 swallows with liquids, 270 with semisolids and 261 with solids) and assigned 267 PAS scores (90 for liquids, 90 for semisolids and 87 for solids). Three patients were not assessed with the solid bolus because it was considered highly unsafe by the clinician performing the FEES based on the performance on previous trials. For all the OPHL types, the most frequent score of the OPHL-PAS was 1, while the least frequent score was 6. The majority of the patients scored from 1 to 3 at the OPHL-PAS.
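The totals reported above follow directly from the protocol (three trials per consistency, with three patients not tested with solids), as the short check below shows. The constant names are illustrative.

```python
PATIENTS = 90
TRIALS_PER_CONSISTENCY = 3
NOT_TESTED_WITH_SOLIDS = 3  # patients for whom the solid bolus was judged unsafe

# Swallows assessed by each rater, per consistency.
swallows = {
    "liquid": PATIENTS * TRIALS_PER_CONSISTENCY,
    "semisolid": PATIENTS * TRIALS_PER_CONSISTENCY,
    "solid": (PATIENTS - NOT_TESTED_WITH_SOLIDS) * TRIALS_PER_CONSISTENCY,
}
# One score per patient and consistency (the worst of the three trials).
scores = {c: n // TRIALS_PER_CONSISTENCY for c, n in swallows.items()}

print(swallows)  # {'liquid': 270, 'semisolid': 270, 'solid': 261}
print(sum(swallows.values()), sum(scores.values()))  # 801 267
```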

Inter-rater Agreement

An overall inter-rater agreement of k = 0.863 was found for the OPHL-PAS. In particular, inter-rater agreement was k = 0.924 for OPHL type I, k = 0.865 for OPHL type II and k = 0.808 for OPHL type III. Table 3 shows the distribution of the scores between the two raters. Perfect agreement was achieved in 240/267 (89.9%) cases; the scores differed by 1 level in 19/267 (7.1%) cases and by 2 levels in 8/267 (3.0%) cases. In 11 cases, the disagreement between the 2 raters led to a change in the category of the depth of airway invasion (no airway invasion vs. penetration vs. aspiration). Values of inter-rater agreement for different consistencies are reported in Table 4.
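A breakdown of this kind (size of the disagreement and changes in the depth-of-invasion category) could be tabulated from paired scores as sketched below. The grouping of scores into categories follows the original PAS (1 = no airway invasion, 2–5 = penetration, 6–8 = aspiration) and is assumed to carry over to the OPHL-PAS; the paired scores are hypothetical, and only the three counts out of 267 are taken from the text.

```python
from collections import Counter

def invasion_category(score):
    """Map a PAS-type score (1-8) to the depth-of-invasion category."""
    if not 1 <= score <= 8:
        raise ValueError("score must be between 1 and 8")
    if score == 1:
        return "no airway invasion"
    return "penetration" if score <= 5 else "aspiration"

def disagreement_summary(scores_a, scores_b):
    """Count absolute score differences and changes of invasion category."""
    pairs = list(zip(scores_a, scores_b))
    diffs = Counter(abs(a - b) for a, b in pairs)
    category_changes = sum(
        invasion_category(a) != invasion_category(b) for a, b in pairs
    )
    return diffs, category_changes

# Hypothetical paired scores from two raters.
rater_1 = [1, 2, 5, 6, 8, 1]
rater_2 = [1, 3, 6, 6, 7, 3]
diffs, changes = disagreement_summary(rater_1, rater_2)
print(dict(diffs), changes)

# Percentages for the counts reported in the text (out of 267 scores).
reported = {0: 240, 1: 19, 2: 8}
shares = {d: round(100 * c / 267, 1) for d, c in reported.items()}
print(shares)  # {0: 89.9, 1: 7.1, 2: 3.0}
```

A one-level disagreement can still change the category (e.g., 5 vs. 6), which is why category changes are counted separately from the size of the difference.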

Table 3 Inter-rater agreement: distribution of the OPHL-PAS among different raters
Table 4 Inter-rater agreement: values of Cohen’s kappa for each consistency

Intra-rater Agreement

Overall, intra-rater agreement for the OPHL-PAS was k = 0.854. Intra-rater agreement was k = 0.914 for OPHL type I, k = 0.790 for OPHL type II and k = 0.858 for OPHL type III. The distribution of the scores between the first and the second assessment by rater 1 is reported in Table 5. Perfect agreement was achieved in 481/534 (90%) cases; the scores differed by 1 level in 39/534 (7.3%) cases and by 2 levels in 12/534 (2.2%) cases. In 19 cases, the disagreement between the 2 assessments led to a change in the category of the depth of airway invasion (no airway invasion vs. penetration vs. aspiration). Values of intra-rater agreement for different consistencies are reported in Table 6.

Table 5 Intra-rater agreement: distribution of the OPHL-PAS among the 1st and the 2nd assessment
Table 6 Intra-rater agreement: values of Cohen’s kappa for each consistency

Number of Views

The number of views required to assign an OPHL-PAS score differed among OPHL types (p = 0.004). In particular, OPHL type III (median 2, interquartile range 1.25) required significantly more views than OPHL type I (median 1, interquartile range 1) (p = 0.030). No difference was found between OPHL type I and type II (median 2, interquartile range 1) (p = 0.265), or between OPHL type II and type III (p = 0.281). All OPHL type I patients required at most 3 views, 1 patient with OPHL type II required more than 3 views, and 5 patients with OPHL type III required more than 3 views.

Difficulty Rating

The difficulty rating for the identification of the neoglottis was perceived as significantly lower in patients who underwent OPHL type II (median VAS 1.4, interquartile range 3.27) compared to patients who underwent an OPHL type III (median VAS 3.83, interquartile range 4.5), as shown by the Mann–Whitney U test (p = 0.010).

Concerning the entry of the laryngeal vestibule, median VAS was 1.7 (interquartile range 1.93) for OPHL type I and 1 (interquartile range 2.38) for OPHL types IIb and IIIb.

Discussion

The PAS was adapted to the altered anatomy of patients who underwent an OPHL, as assessed using FEES. The OPHL-PAS showed strong to almost perfect overall intra- and inter-rater agreement. The OPHL-PAS represents the first validated scale to assess lower airway invasion during FEES that has been specifically tested on patients with OPHL.

The PAS was originally developed for videofluoroscopy [23]. However, studies have demonstrated its applicability and reliability in FEES [27, 28]. FEES was also found to provide inter- and intra-rater reliability comparable to that of videofluoroscopy, although the two procedures are not interchangeable because PAS scores within the same individual differ systematically depending on the instrumental assessment used [29]. In the present study, the OPHL-PAS was applied to FEES as it allows direct visualization of the laryngeal anatomy altered by the surgical resection. The necessity and possibility of adapting the PAS to the OPHL anatomy during videofluoroscopy is beyond the scope of the present study and should be further investigated.

The OPHL-PAS exhibited an inter-rater agreement of k = 0.863 and an intra-rater agreement of k = 0.854. These levels of agreement are similar to those previously reported in the literature for the PAS in FEES. Colodny and colleagues found an inter-rater agreement ranging from 64.6 to 74.7% and an intra-rater agreement ranging from 78.5 to 91.1% depending on the rater [27]. In 2007, Kelly et al. reported an inter-rater agreement of k = 0.64 and an intra-rater agreement of k = 0.73 [29]. Using the intraclass correlation coefficient (ICC), the study by Butler and colleagues showed an inter-rater reliability of ICC = 0.85 and an intra-rater reliability of ICC = 0.94 [28]. Therefore, the reliability of the OPHL-PAS seems to be comparable to the reliability of the PAS in FEES. Slight differences in the reliability of the PAS among the studies may result from several factors, such as the clinical experience of the raters, the training, the number of views for each video and the retest interval [27, 28, 30]. Moreover, the different statistical methods used in the studies do not allow a direct comparison of the results, as different concepts were tested. Indeed, in the present study, the non-weighted Cohen’s kappa was used to test exact agreement among raters. The ICC, used in the study by Butler and colleagues, is an index of reliability, which addresses not only the level of agreement but also the degree of correlation between measurements.

Inter- and intra-rater agreement was satisfactory across the different OPHL types. However, OPHL type I showed slightly higher levels of agreement (k > 0.90). Moreover, OPHL type III required the highest number of views to assign a score and was perceived as more difficult than OPHL type II for the identification of anatomical landmarks. Therefore, as expected, the more extensive the surgical resection and reconstruction, the more difficult the application of the OPHL-PAS, even though its reliability remains strong. These results support the importance of specific training for clinicians and researchers applying the OPHL-PAS, focused both on the identification of the signs of laryngeal penetration and aspiration and on the modified laryngeal anatomy following the different OPHL types. Inter- and intra-rater agreement was also satisfactory across different consistencies; this finding is important as it shows that the OPHL-PAS can be reliably applied with boluses of different rheological characteristics, which further supports its application in clinical practice and research.

The interpretation of the PAS and of other visuo-perceptual ordinal variables is affected by experience and training [31, 32]. In this study, the two raters were SLTs with at least 4 years of experience in the field of dysphagia who underwent a specific 4-h training on the anatomical changes after OPHL for the application of the OPHL-PAS. Therefore, based on the present results, a strong inter- and intra-rater agreement of the OPHL-PAS can be achieved by adequately trained clinicians. Understanding the impact of experience and training on the level of agreement requires further investigation.

OPHL-PAS scores showed a positively skewed distribution. The score of 1 was assigned in about 50% of the cases, and scores 1 to 3 represented about 80% of the scores. The score distribution of the OPHL-PAS reflects the results of the literature on long-term swallowing outcomes after OPHL. Indeed, other studies have shown that over 50% of the patients who underwent an OPHL showed no penetration or aspiration [18, 33] and that, when penetration occurred, material was effectively ejected from the laryngeal vestibule [13]. However, the distribution of the scores may have been influenced by the long interval between surgery and FEES in the study sample, which allowed the recovery of swallowing safety in the majority of the patients. Application of the OPHL-PAS to patients shortly after surgery would probably be associated with more severe OPHL-PAS scores. Scores 4 and 6 were the least represented in the included sample. Similarly, these scores are rarely reported in the literature when the original PAS is applied to patients with dysphagia [23, 29, 34,35,36]. Because of their rarity, the clinical value of these levels has been questioned by Steele and Grace-Martin [24]. Concerns have been raised about the ability of clinicians to distinguish levels 4 and 6 from similar or adjacent levels on the PAS, as well as about their clinical relevance.

Along with the discussion on the scores’ frequency among different PAS levels, other considerations should be contemplated on the use of the PAS to assess penetration and aspiration in patients with dysphagia [24]. Firstly, a debate on the ordinality of the scale is currently ongoing. A survey on the relative severity of the levels of the PAS showed that clinicians were uncertain on how to rank the severity of levels 3 and 5 when compared to levels 4 and 6, respectively [37]. Secondly, the scale does not consider the frequency of the lower airway’s invasion and the amount of inhaled material. Finally, the lack of a standard procedure on how the PAS should be applied (e.g., after the 1st swallowing act or at the end of swallowing status, using the worst score or the mean or the mode) makes the interpretation of the results from the studies difficult. On the contrary, strengths of the PAS are the widespread use easing the communication among clinicians and researchers, the availability of data on validity and reliability, and the ability to provide information guiding the clinicians in drawing inferences on the sensory and motor integrity of different regions of the pharynx and larynx.

Anatomical landmarks for the adaptation of the PAS to the altered anatomy following OPHL were defined. The entry of the laryngeal vestibule in OPHL type I and the neoglottis in OPHL types II and III were identified in the scar of the pexy. However, no specific existing anatomical point but only an ideal limit (“the line of contact between the arytenoid(s) and the base of the tongue during phonation”) could be identified for the entry of the laryngeal vestibule in OPHL types IIb and IIIb. In the study, only seven patients underwent an OPHL type IIb or IIIb. This was because the study represents a secondary analysis of a larger study on functional outcomes after OPHLs, and OPHL types IIb and IIIb are performed more rarely than OPHL types IIa and IIIa in our caseload, as previously reported [7]. The small number of patients with OPHL types IIb and IIIb represents a limitation of the study, which should be overcome in future studies. However, 4/7 patients showed penetration and/or aspiration, allowing the limit to be tested, and the raters judged it generally easy to identify.

Other limitations of the study include the heterogeneous frequency of scores among the different OPHL-PAS levels, which suggests the need to stratify recruited patients by the severity of laryngeal penetration and aspiration, and the low number of raters. Future studies should include patients with a short-term follow-up after surgery and a larger sample of raters with different levels of expertise, to test the influence of these variables on the reliability of the OPHL-PAS.

Conclusions

The OPHL-PAS is a reliable scale to assess lower airway invasion during swallowing in patients with OPHL using FEES, when applied by trained clinicians. The study represents a first attempt to define standard tools to assess swallowing functional outcome in this population, in order to provide a common language among clinicians and ease the comparison of results from different studies. The necessity and possibility of adapting the PAS to the laryngeal anatomy following an OPHL in videofluoroscopy should be investigated in future studies.