Introduction

Salivary gland fine needle aspiration (FNA) has a well-established role in the evaluation of salivary gland lesions [4, 8, 12, 21, 25]. It is a cost-effective and minimally invasive triage tool that prevents patients with non-neoplastic lesions from undergoing surgery. However, the interpretation of salivary gland cytology can be challenging due to tumor diversity, morphological overlap between benign and malignant entities, and technical limitations including low cellularity in cystic lesions [4, 8, 11, 12, 21, 25, 30]. The diversity of salivary gland neoplasms is well known and described in the most recent WHO classification, which lists 31 different epithelial neoplasms to date [6]. Moreover, until recently the absence of a standardized cytologic classification system or a uniform reporting system made the diagnostic process and associated clinical management even more challenging [21, 25]. A standardization effort started in 2015 and led to the development and publication of the first evidence-based tiered diagnostic framework, known as the Milan system for reporting salivary gland cytopathology (MSRSGC) [9]. This tiered classification is based on seven diagnostic categories, each associated with its own risk of malignancy (ROM) and recommendations for patient management [1, 9].

A survey preceding the development of the MSRSGC highlighted that the majority of institutions used air-dried or alcohol-fixed smears in the evaluation of aspiration specimens [22]. Only 25% of international respondents used additional cell block preparations for ancillary testing [22]. The primary and sole use of cell blocks as the main processing method for salivary gland FNA was not examined and is not mentioned in the MSRSGC guidelines [9]. Rather, the MSRSGC makes use of a combination of air-dried and alcohol-fixed direct smears, which may also be supplemented by liquid-based preparations [13], as the mainstay for salivary gland FNA diagnosis. The use of cell blocks is only recommended for selected cases for which ancillary tests including immunohistochemical stains and molecular studies would be needed [9]. Since its implementation in 2017, the MSRSGC has been further validated through multiple reviews and published single-center experiences, all based on the use of direct smears and/or liquid-based cytology (reviewed in Ref. 11) [13, 15, 16, 17, 27, 29, 31, 32]. As additional literature is published, it will be possible to further refine various aspects of the MSRSGC, with a second version expected in 2023 [7].

In contrast to most international institutions, the pathology department of the Jewish General Hospital in Montreal, Canada, uses cell blocks as the main preparation technique for salivary gland FNAs. Historically, there was no dedicated cytopathologist at the Jewish General Hospital. The head and neck pathologist at the time had to handle both cytology and pathology specimens from the otolaryngology department and was more comfortable with FNAs processed as cell blocks. The technique has been retained over the years because it appeared to perform well.

The diagnostic performance of the MSRSGC has been only scarcely reported in institutions using the cell block technique for FNA specimens [2]. However, as these FNA samples are fixed using 10% neutral buffered formalin solution instead of alcohol, diagnostic accuracy of salivary gland cytopathology can be improved by the use of ancillary studies on these cell blocks (possible without additional laboratory validation) [9]. We conducted a retrospective study to evaluate and further validate the MSRSGC based on our experience with the sole use of cell blocks, to evaluate the ROM on follow-up histopathology, and to investigate possible advantages and disadvantages of this processing method.

Materials and methods

The STARD guidelines for reporting diagnostic accuracy studies were used [3].

Study design and participants

We conducted a retrospective single-institutional cohort study of consecutive patients with parotid masses treated at the Jewish General Hospital (JGH) in Montreal, Canada between January 1, 2018, and June 30, 2021. The study was approved by the local Research Ethics Committee. Ethical guidelines were followed, and all patients provided informed consent. Eligibility criteria included previously untreated patients over 18 years of age at diagnosis with a parotid mass that underwent FNA. Only patients with parotid gland tumors were included. Tumors from the submandibular, sublingual and minor salivary glands were excluded.

Ultrasound-guided FNAs were performed by a single experienced board-certified otolaryngologist (AM) with 21G needles. Each parotid lesion was aspirated twice, with six to eight passes for each FNA. Each FNA sample was sent to the Pathology department in a separate conical formalin-filled tube. All surgeries were performed by two board-certified head and neck surgeons (AM and MPH). All cytology specimens were fixed using a 10% neutral buffered formalin solution and processed into cell blocks using the HistoGel™ method, which is the main preparation technique for salivary gland FNAs at the pathology department of the JGH.

Test methods

Cytological diagnoses were classified prospectively according to the MSRSGC categories as: non-diagnostic (I), non-neoplastic (II), atypia of undetermined significance (AUS, III), neoplasm: benign (IVa), salivary gland neoplasm of uncertain malignant potential (SUMP, IVb), suspicious for malignancy (SFM, V), or malignant (VI) [9]. All cytology samples and (where available) the corresponding surgical pathology specimens were assessed by two board-certified experienced head and neck pathologists (LF and MPP).

Medical records of all patients were examined to obtain detailed demographic and clinical data: age, gender, laterality, size of the lesion (cm), location (parotid), MSRSGC category, specific cytologic diagnosis, use of ancillary studies including immunohistochemistry (IHC) on cell blocks, surgical follow-up, final histologic diagnosis, presence or absence of malignancy, clinical follow-up and length of follow-up period. The histological follow-up of excisional specimens (when available) was used to calculate the ROM for each diagnostic category in the MSRSGC.

Statistical analysis

The distribution of cases according to MSRSGC diagnostic categories was calculated in cross-sectional tables. Individual ROM for each diagnostic category of the MSRSGC was calculated. The ROM based on surgical follow-up used the final pathological analysis of the surgical specimen, while the ROM based on clinical follow-up was based on subsequent clinical and/or radiological evaluation. The relative tumor size was compared between the Milan system categories using the Kruskal–Wallis test. Pairwise comparisons were also made, and the Bonferroni correction was applied for multiple comparisons. Statistical analyses were performed using SPSS® 27.0.0.0 software (IBM©, Armonk, NY, USA). A P value less than 0.05 was considered to indicate statistical significance.
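As a minimal illustration of the ROM arithmetic described above, the following Python sketch computes the ROM of a category as the percentage of malignant final diagnoses among the cases with follow-up. The function name is a hypothetical helper and is not part of the SPSS workflow used in the study; the example counts (3 malignant among 14 Milan I cases with surgical follow-up) are taken from the Results.

```python
def risk_of_malignancy(malignant: int, followed_up: int) -> float:
    """Return the ROM of one category as a percentage (one decimal place).

    Hypothetical helper for illustration only; the study itself used SPSS.
    """
    if followed_up <= 0:
        raise ValueError("ROM is undefined when no case has follow-up")
    if not 0 <= malignant <= followed_up:
        raise ValueError("malignant count must lie between 0 and followed_up")
    return round(100.0 * malignant / followed_up, 1)


# Milan I, surgical follow-up: 3 malignant among 14 resected cases.
milan_i_surgical_rom = risk_of_malignancy(3, 14)
print(f"Milan I ROM (surgical follow-up): {milan_i_surgical_rom}%")
```

The same function would apply to the ROM based on clinical follow-up, with the denominator replaced by the number of cases that had clinical and/or radiological follow-up.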

Results

Participants

A total of 230 parotid gland FNAs in a total of 221 patients, consisting of 113 (51.1%) males and 108 (48.9%) females, were evaluated. The mean age of the cohort was 62 years (standard deviation, SD of 15). The relative follow-up category distribution and ROM for each Milan category are shown in Table 1. The median follow-up time was 6 months (interquartile range, IQR 3-12).

Table 1 Clinical characteristics of each Milan category

Test results

The distribution according to the MSRSGC categories was as follows: 65 (28.3%) were non-diagnostic (Milan I), 21 (9.1%) were non-neoplastic (Milan II), 20 cases (8.7%) were AUS (Milan III), 87 (37.8%) were neoplasm: benign (Milan IVa), 14 (6.1%) were SUMP (Milan IVb), four cases (1.7%) were SFM (Milan V), and 19 cases (8.3%) were malignant (Milan VI) (Table 1).
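The percentages above can be checked with a short Python sketch that uses only the category counts reported in this paragraph; the variable names are illustrative assumptions.

```python
# Number of FNAs per Milan category, as reported in the text above.
counts = {"I": 65, "II": 21, "III": 20, "IVa": 87, "IVb": 14, "V": 4, "VI": 19}

total = sum(counts.values())  # total number of FNAs evaluated

# Share of each category, as a percentage rounded to one decimal place.
percentages = {cat: round(100.0 * n / total, 1) for cat, n in counts.items()}

for cat, pct in percentages.items():
    print(f"Milan {cat}: {counts[cat]}/{total} ({pct}%)")
```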

A repeat FNA was performed in 34/65 (52.3%) of Milan I patients, 3/20 (15.0%) of Milan III patients and 1/4 (25.0%) of Milan V patients. As summarized in suppl. Table 1, Milan I patients most often obtained a second Milan I result, three Milan III patients had second results equally distributed between Milan categories I, III and VI, while a single Milan V patient obtained a malignant Milan VI category on the repeat FNA.

The mean size of the clinically and/or radiologically detected lesions was 2.2 cm (SD 1.1), and lesion size differed significantly among the Milan categories (Kruskal–Wallis test, P=0.024) (Suppl. Figure 1). In pairwise comparisons between individual categories, lesion size differed significantly only between the Milan I and Milan IVa categories (adjusted P=0.030). No other pairwise comparison showed a statistically significant size difference (adjusted P>0.05).
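For readers unfamiliar with the Bonferroni adjustment behind the adjusted P values, the sketch below shows the arithmetic in Python: each raw P value is multiplied by the number of comparisons and capped at 1. It assumes that all seven Milan categories entered the pairwise comparison (21 pairs); the raw P values are hypothetical and are not data from this study.

```python
from math import comb


def bonferroni_adjust(p_values, n_comparisons):
    """Multiply each raw P value by the number of comparisons, capped at 1."""
    return [min(1.0, p * n_comparisons) for p in p_values]


n_categories = 7                  # Milan I, II, III, IVa, IVb, V, VI
n_pairs = comb(n_categories, 2)   # number of distinct category pairs

# Hypothetical raw P values for three pairwise comparisons (not study data).
raw_p = [0.001, 0.010, 0.200]
adjusted_p = bonferroni_adjust(raw_p, n_pairs)

for raw, adj in zip(raw_p, adjusted_p):
    print(f"raw P={raw:.3f} -> adjusted P={adj:.3f}")
```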

The ROM for each Milan category is shown in Table 1. The ROM based on surgical follow-up for the non-diagnostic, non-neoplastic, AUS, neoplasm: benign, SUMP, SFM and malignant categories was 21.4%, 0%, 50%, 0%, 30%, 100% and 100%, respectively. The ROM based on clinical follow-up for these categories was 7.3%, 0%, 37.3%, 0%, 27.3%, 100% and 100%, respectively.

The final histological diagnosis of cases with a surgical follow-up is shown in Table 2. In the Milan I category, samples were often parotid cysts (4/14 (28.5%)) or small lymph nodes (3/14 (21.4%)), but some proved to be benign (4/14 (28.5%)) or malignant (3/14 (21.4%)) neoplasms. Lesions that remained in the Milan I category on repeat FNA were either small (less than 1 cm) or benign parotid cysts. Among patients who obtained a Milan I category result on a repeat FNA, none showed malignancy upon clinical follow-up. Importantly, there was no false negative case in the Milan IVa category (i.e., a benign case on FNA that would be diagnosed as malignant upon surgical excision).

Table 2 Final histological diagnosis for cases with surgical follow-up

Similarly, there was no false positive case in the Milan V and Milan VI categories (i.e., suspicious for malignancy or malignant on FNA that would be diagnosed as benign upon surgical excision). Selected representative cases in Milan categories III, IVb and VI are illustrated in Figs. 1, 2, 3.

Fig. 1

A salivary gland FNA case of secretory carcinoma (A-D). Tumor cells have abundant, eosinophilic, granular cytoplasm and round nuclei (A: FFPE cell block, H&E). The main differential diagnosis includes a zymogen-poor acinic cell carcinoma and several oncocytic neoplasms, including an oncocytoma. Without ancillary studies, this case would likely be placed into the SUMP category (Milan IVb). By IHC, tumor cells in the cell block material are positive for S-100 (B), mammaglobin (C) and pan-TRK (D), supporting the diagnosis of secretory carcinoma and placing the specimen into a definitive Malignant category (Milan VI), without the need to demonstrate the specific ETV6 gene rearrangement by FISH or NGS

Fig. 2

A salivary gland FNA case of low-grade mucoepidermoid carcinoma diagnosed as AUS (Milan III) on cytology. The aspirate contained only mucinous cyst contents with foamy histiocytes but no epithelial component (A: FFPE cell block, H&E). According to the Milan system, these cases are classified as AUS (Milan III) rather than negative or non-diagnostic given the significant risk of low-grade MEC, especially in the parotid gland. A mucicarmine stain confirms the presence of mucin (B). The resection specimen shows a well demarcated cystic mass filled with mucin (C, H&E); on higher magnification, the cystic lining contains bland-looking mucinous cells and squamoid to intermediate cells, corresponding to a low-grade mucoepidermoid carcinoma (D, H&E)

Fig. 3

A salivary gland FNA case of basal cell adenoma diagnosed as SUMP (Milan IVb) on cytology. The aspirate shows a basaloid neoplasm with a tubulo-trabecular growth pattern and cellular stroma (A: FFPE cell block, H&E). Dual immunostain of p63 (brown nuclear staining) and CK7 (red cytoplasmic staining) confirms the presence of a biphasic (basal/luminal) phenotype (B). The main differential diagnoses included basal cell adenoma and basal cell adenocarcinoma, which cannot be accurately distinguished on aspirate material. As a result, these cases are classified as SUMP, basaloid neoplasm (Milan IVb) according to the Milan system. The surgical follow-up shows an encapsulated noninvasive cellular basaloid neoplasm consistent with a basal cell adenoma (C). On higher magnification (D), basaloid cells show peripheral palisading, typical of basal cell adenoma

Ancillary IHC studies were performed in 48 FNA cases (20.9%), including 7/65 (10.7%) of Milan I, 2/21 (9.5%) of Milan II, 7/20 (35.0%) of Milan III, 9/87 (10.3%) of Milan IVa, 7/14 (50%) of Milan IVb, 3/4 (75%) of Milan V, and 13/19 (68.4%) of Milan VI. The immunostains performed included: CK5/6, CK7, CK8/18, p40, p63, DOG-1, AE1/AE3, CD10, CD20, CD45, CD68, CD117, SMA, PLAG1, SOX-10, MYB, pan-TRK, GATA3, beta-catenin, S100, androgen receptor, and Her2. The combination of stains was selected based on the cytomorphology and differential diagnosis. The list and purpose of ancillary studies used in each of the Milan category IVb, V and VI cases are presented in supplemental Table 2. Neither fluorescence in situ hybridization (FISH) nor next-generation sequencing (NGS) was performed during the time frame of this study.

Discussion

The diagnostic accuracy of salivary gland FNA depends on several pre- and post-analytical factors, including the prevalence and distribution of salivary gland tumors (SGTs) in a given population, the FNA technique (guided by manual palpation vs ultrasound-guided), the FNA operator’s experience, the use of rapid on-site evaluation (ROSE), the quality of the cytologic preparation, the diagnostic experience of the pathologist, the use of reporting terminology, characteristics of the SGT (e.g., solid vs cystic), and the use of ancillary studies [21, 25]. The results of our study show that the diagnostic accuracy of salivary gland FNA using only formalin-fixed paraffin-embedded (FFPE) cell block material is excellent. Within the limited time frame of this study, there were no false negative cases in the Milan IVa (neoplasm: benign) category, and no false positive cases in the Milan V (suspicious for malignancy) or Milan VI (malignant) categories of the MSRSGC. Furthermore, the accuracy in specifically diagnosing both common SGTs (such as pleomorphic adenoma (PA) and Warthin tumor (WT)) and less common SGTs (such as acinic cell carcinoma (AciCC), secretory carcinoma (SC) and salivary duct carcinoma (SDC)) is also very high. The use of ancillary studies including IHC was rarely needed for the diagnosis of PA and WT since key diagnostic morphological features, including the characteristic chondromyxoid matrix of PA and the bilayered oncocytic epithelium of WT, were usually well recognized on the cell block material, akin to conventional cytologic preparations. In contrast, IHC was performed to diagnose AciCC, SC and SDC, using a panel of routine immunostains that included various epithelial and myoepithelial markers, CD117, DOG-1, S100, SOX10, mammaglobin, androgen receptor and Her2; newer immunostains such as PLAG-1, MYB and pan-TRK were also used and served as indirect surrogate markers of specific genetic/molecular alterations of SGTs [5, 9, 14, 19, 20].

SGTs are well known to show significant heterogeneity and morphological overlap, precluding a definitive diagnosis of some entities based on cytology alone [4, 8, 11, 12, 21, 25, 30]. When used to specifically subtype a neoplasm, the accuracy of salivary gland FNA depends upon the specific entity, ranging from 48% to 94% [25]. For many of the uncommon low-grade SGTs, FNA lacks the specificity to precisely classify the tumor subtype based on cytomorphology alone. In addition, although most high-grade carcinomas are easily recognized as malignant, the clinically important distinction between primary high-grade carcinomas including SDC and metastatic carcinoma can be problematic. When cytomorphology reaches its diagnostic limits, IHC, being inexpensive and widely available, is a useful ancillary tool for reaching a definitive diagnosis in most cases by confirming the cell lineage and/or determining the site of origin of a malignancy. While IHC can be performed on any form of cytologic preparation, including FFPE cell blocks, cytospins, smears and liquid-based preparations, the use of FFPE cell blocks is the preferred method in most institutions, as it has the advantage over other cytologic preparations of producing several nearly identical sections on which an IHC panel can then be applied [5, 9, 14, 19, 20]. In addition, all immunomarkers are already validated on FFPE tissue and no additional separate validation is needed. For diagnostic purposes, the use of a panel of immunostains is often necessary, as most lack sufficient specificity and sensitivity and are expressed in multiple SGTs [5, 9, 14, 19, 20]. Therefore, in addition to cytomorphology, the choice of immunopanels should be tailored according to the clinical context, the patient’s medical history and imaging features.

Over the last decade, the importance of ancillary studies for improving diagnostic accuracy of salivary gland FNA within the framework of the MSRSGC has been strongly emphasized [5, 9, 14, 19, 20]. Several new immunomarkers have been developed and can be very useful to narrow the list of differential diagnoses or to favor a specific entity when cytomorphology alone is not sufficient [5, 9, 14, 19, 20]. Currently, several antibodies are available to identify protein surrogates of specific genetic alterations which are overexpressed in a subset of SGTs, including MYB (MYB-NFIB fusion in adenoid cystic carcinoma), PLAG-1 (PLAG-1 rearrangement in PA and carcinomas-ex-PA), and more recently pan-TRK and NR4A3 expression for SC and AciCC, respectively [5, 9, 14, 19, 20]. In general, these so-called molecular immunomarkers are more sensitive but less specific than their corresponding genetic alterations, which can be demonstrated by various methods such as FISH or NGS [5, 9, 14, 19, 20]. Recently, some institutions have developed their own comprehensive customizable NGS SGT-specific panel to detect specific gene alterations, including mutations, fusions and RNA gene expression alterations, in order to facilitate the diagnosis and classification of SGTs [10, 23, 24]. For example, the SalvGlandDx panel is an all-in-one RNA-based NGS panel suitable for the detection of mutations, fusions and gene expression levels of 27 genes involved in SGTs [10]. This promising approach covers most of the common molecular alterations of SGTs in a single test and can be reliably performed on FFPE cell block specimens. However, NGS does not yet have a routine diagnostic role in salivary gland cytopathology as it is still onerous and not widely available [14].

With the continuous discovery of additional specific genetic alterations in SGTs and the increasing availability of diagnostic markers that can be applied on FNA material, the diagnostic accuracy of salivary gland FNAs has the potential to be significantly improved. In this setting, FFPE cell block material can provide easy access to additional IHC and/or molecular tests for cases with diagnostic difficulty. In addition to IHC and molecular tests, histochemical stains can also be readily performed on cell blocks, for example, highlighting intra- and/or extracellular mucin or glycogen content of a subset of SGTs including mucoepidermoid carcinoma and SC [8, 9].

To the best of our knowledge, only one previous study, by Behaeghe et al. from Belgium, has demonstrated the feasibility and diagnostic accuracy of using only cell blocks for salivary gland FNAs [2]. In that study, the overall accuracy, sensitivity, specificity, positive predictive value and negative predictive value were 92.9%, 75.9%, 97.9%, 91.7% and 95%, respectively, which is slightly lower than in the current study; however, Behaeghe et al. had a larger number of cases (359 specimens vs 230 in our study) [2]. The ROM in their study based on surgical follow-up for the non-diagnostic, non-neoplastic, AUS, neoplasm: benign, SUMP, SFM and malignant categories was 13.8%, 14.2%, 30%, 6.3%, 20.8%, 60% and 100%, respectively. In comparison, the ROM based on surgical follow-up in our study for the non-diagnostic, non-neoplastic, AUS, neoplasm: benign, SUMP, SFM and malignant categories was 21.4%, 0%, 50%, 0%, 30%, 100% and 100%, respectively. The small discrepancies in the number of cases and ROM per category between these studies can be attributed to institutional variability in practice patterns including pre-FNA diagnostics, sampling technique, experience of the pathologist, and the use of ancillary techniques mentioned above. While some differences exist, the ROMs in both studies are within the reported range of other studies that have retrospectively applied the MSRSGC to their cases using conventional cytologic preparations [13, 15, 16, 17, 27, 29, 31, 32]. Importantly, the estimated ROM calculated in such studies, including ours, is an overestimation of the actual ROM, due to the impact of selection bias [18]. Nodules that undergo surgical resection are more likely to have suspicious pre-operative clinico-radiological findings, increasing the likelihood of malignancy regardless of the FNA diagnosis.
Therefore, we also calculated the ROM based on the clinical follow-up, which is more accurate, despite the sometimes limited follow-up period that was available at the time of this study. The ROM based on the clinical follow-up for the non-diagnostic, non-neoplastic, AUS, neoplasm: benign, SUMP, SFM and malignant categories were 7.3%, 0%, 37.3%, 0%, 27.3%, 100% and 100%, respectively. As expected, the ROM based on the clinical follow-up for non-diagnostic, AUS and SUMP are lower than the ROM based on surgical follow-up due to this selection bias.
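For reference, the accuracy measures quoted above for the study by Behaeghe et al. follow the standard definitions based on a two-by-two table of cytologic versus histologic diagnoses. The Python sketch below illustrates these definitions only; the function name is an assumption, and the counts are hypothetical and do not correspond to data from either study.

```python
def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard accuracy measures from a 2x2 table (all counts must be > 0
    in the relevant denominators; no zero-division handling in this sketch)."""
    return {
        "sensitivity": tp / (tp + fn),   # malignant cases called malignant
        "specificity": tn / (tn + fp),   # benign cases called benign
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Hypothetical counts for illustration only (not data from either study).
example = diagnostic_metrics(tp=20, fp=0, tn=80, fn=5)
for name, value in example.items():
    print(f"{name}: {100 * value:.1f}%")
```

Because such measures are computed only from cases with histologic confirmation, they are subject to the same selection bias as the ROM based on surgical follow-up.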

Despite its high diagnostic accuracy, salivary gland FNA using only cell blocks may have limitations. The rates of the non-diagnostic category in our study and in the one from Behaeghe et al. were 25% and 33%, respectively [2]. These rates are higher than the recommended upper limit of 10% in the MSRSGC but are still within the range of salivary FNA studies that have used the MSRSGC criteria with traditional cytologic preparations (0-50% with a mean value of 16.9%) [13, 15, 16, 17, 27, 29, 31, 32]. Besides the aspiration technique and the processing technique itself, a contributing factor could be the absence of rapid on-site evaluation (ROSE) for these cases, which allows for immediate assessment of adequacy but requires cytologic smears, as well as additional time and workforce organization between the clinical and pathology teams. Using ROSE, cytology smears and cell blocks can be used in a complementary fashion, with triage of adequate samples for ancillary studies [28].

As ROSE is not performed at our institution due to workflow and time constraints, two FNAs are always performed on each lesion, with 6-8 passes for each FNA. This increases the chance of obtaining quality material for diagnosis in most cases, even though repeat sampling may be required for a minority. In a case with a strong clinical suspicion of lymphoma, one additional FNA sample in RPMI medium is also sent for fluorescence-activated cell sorting flow cytometry studies [28].

The preparation of FNA material using the cell block technique represents a diluted version of a core biopsy, which is equivalent to microbiopsies or minicores. Many of the benefits of FNAB over core needle biopsy are lost without ROSE and aspirate smears. In a setting where ROSE is not possible due to workflow and time constraints, the cell block technique has the advantage of retaining some of the benefits of core needle biopsy. The literature suggests a superior sample adequacy and diagnostic accuracy of core needle biopsies over FNAC, especially for malignant salivary gland tumors [26]. Using the cell block technique with a fine needle, the resulting FNAB has the advantage of avoiding the potential drawbacks of a larger, more invasive core needle biopsy (such as tumor seeding and local complications including hematoma, pain and facial nerve injury), while retaining some of its advantages, as discussed above.

The main limitations of the current study are its retrospective nature and the lack of systematic histopathologic surgical excision data for all aspirated masses. We do, however, report the ROM based on the surgical vs the clinical follow-up separately. In contrast to many studies, the cases were diagnosed prospectively according to the MSRSGC. Finally, our study had relatively few cases for some of the MSRSGC categories (e.g., Milan IVb and Milan V), which may have led to falsely elevated or depressed ROM.

In conclusion, the results of this study contribute to the further validation of the recent MSRSGC by providing the risk of malignancy for the different Milan system diagnostic categories at our institution. The study also further validates the use of FFPE cell block material for salivary gland cytopathology at a time when various ancillary studies, including molecular testing, are becoming more essential for the diagnosis and optimal management of patients with salivary gland tumors.