Introduction

Parotid gland tumors account for about 80–85% of all salivary gland tumors, and approximately 75% of them are benign, especially pleomorphic adenomas [1, 2]. Since the surgical approaches for benign and malignant parotid tumors are different, accurate differentiation of these tumors is important. A superficial parotidectomy or extracapsular dissection is recommended for most benign tumors in the superficial lobe, whereas a conservative parotidectomy is selected for benign tumors in the deep lobe or the inferior portion of the parotid gland [3, 4]. In contrast, a conservative parotidectomy with preservation of facial nerve function is recommended for low-grade malignant parotid tumors in the superficial lobe, and a total parotidectomy is selected for high-grade malignant parotid tumors [2, 5].

Because the diagnostic performance of computed tomography and ultrasound, including elastography, has been disappointing [6, 7], preoperative fine-needle aspiration (FNA) cytology is usually performed for the initial evaluation of parotid gland tumors [8,9,10]. FNA offers the advantages of minimal invasiveness and a low risk of complications; however, it shows variable sensitivity and specificity in differentiating benign from malignant lesions and has a relatively high rate of non-diagnostic results [10].

Many studies have assessed MRI findings of parotid gland tumors, and the signal intensity on T2-weighted imaging (T2WI), T2 mapping, the tumor capsule, the time-intensity curve (TIC) pattern, apparent diffusion coefficient (ADC) values derived from diffusion-weighted imaging (DWI), and intra-voxel incoherent motion diffusion-weighted imaging (IVIM-DWI) parameters have been reported to be useful parameters [11,12,13,14,15,16]. However, in semi-quantitative and quantitative analyses of the TIC on dynamic contrast-enhanced (DCE)-MRI and of DWI, signal intensity and ADC values cannot be considered reliable indicators, because inter-study differences in results are relatively large and standardization is still lacking [17, 18].

Permeability imaging or quantitative DCE-MRI is a functional MRI modality that involves repeated acquisition of images of the same lesion and analyses of the TICs of the tumor and feeding artery to yield various parameters related to perfusion and permeability within the tumor. It has been reported to be useful in the differentiation of head and neck tumors [19] as well as tumors in other organs [20,21,22]. Kato et al [23] reported that arterial spin labeling (ASL) could differentiate Warthin’s tumor from pleomorphic adenoma and malignant tumors but could not differentiate pleomorphic adenomas and malignant tumors.

Multiparametric quantitative analyses of MR parameters, including quantitative DCE-MRI and DWI, in the characterization of parotid tumors have recently been reported using a combination of pseudo-continuous arterial spin labeling (pCASL) and DWI [24] or the TIC pattern with the extracellular extravascular volume fraction (Ve) based on quantitative DCE-MRI [25]. These studies showed the superior diagnostic performance of multiparametric analysis; however, they did not validate their results in another cohort [24, 25].

Thus, the purpose of this study was to determine the added value of permeability MRI to T2WI, semi-quantitative analysis of DCE-MRI, and IVIM-DWI in the characterization of parotid gland tumors.

Materials and methods

Patients

Our retrospective study was approved by the institutional review board, and the requirement for informed consent was waived. A computer search of the radiological reports and surgical records at our institution identified 124 patients who underwent both DCE-MRI and IVIM-MRI between July 2015 and September 2017. Finally, a total of 61 patients (mean age, 58.5 years; age range, 15–89 years) with parotid gland tumors were enrolled in the clinical and MR data analyses. All patients underwent examinations with DCE-MRI and IVIM-DWI at our hospital. A flow diagram summarizing the initial candidates and each exclusion procedure is shown in Fig. 1.

Fig. 1

Flow diagram of patient selection and exclusion criteria for the initial study

Histological diagnoses

Histopathologic diagnoses were made on the basis of the findings in specimens obtained by surgical resection (n = 53) or US-guided biopsy (n = 8). Table 1 shows the distribution of histological diagnoses for the 61 parotid gland tumors.

Table 1 Histological diagnoses of parotid gland tumors

MR imaging technique

MR examinations were performed within 2 weeks before surgery or biopsy in all patients. Patients were examined by using a 3.0-T MR unit (Achieva 3.0T TX, Philips Healthcare) with a 16-channel phased-array neurovascular coil. First, precontrast transverse T1-weighted spin-echo images (see Supplementary material 1 for details) and transverse T2-weighted fast spin-echo images without fat suppression (see Supplementary material 2 for details) were obtained. IVIM-DWI was performed by using a multisection spin-echo single-shot echo-planar sequence in the transverse plane (see Supplementary material 3 for details). Sensitizing diffusion gradients were applied sequentially in the x, y, and z directions with b values of 0, 50, 500, and 1000 s/mm², and ADC maps were generated. Tumors were identified on the transverse T1-weighted MR images, and 10 sections, including the section containing the maximum diameter of the tumor, were selected for DCE-MRI. A precontrast 3D-T1 fast field echo (FFE) sequence (see Supplementary material 4 for details) was obtained first. Thereafter, gadobutrol (Gadovist, Bayer HealthCare Pharmaceuticals) was administered intravenously at a rate of 2 mL/s (total dose, 0.1 mmol per kg of body weight) by using a power injector (Sonic Shot 50, Nemoto Kyorindo Co., Ltd.), followed by a 20-mL saline flush. DCE-MRI using the 3D-T1 FFE sequence (with the same parameters as those for precontrast 3D-T1 FFE, but with a flip angle of 15°) was performed before and then every 3.5 s after contrast material administration, for 95 acquisitions over a total of 333 s. Finally, post-contrast transverse T1-weighted spin-echo imaging using the same parameters as precontrast T1-weighted imaging (T1WI) was performed.

Data processing and MRI analysis

Morphological analyses on static MR images

First, the maximal diameters of the tumors were measured on axial T1WI. The presence of a capsule on T2WI was also analyzed. A capsule was defined as a low-intensity thin rim (2 mm or less) surrounding the mass. We assigned grade 2 when the capsule was identified along the entire perimeter of the tumor, grade 1 when the capsule was visible over half or more, but not all, of the perimeter of the tumor, and grade 0 when the capsule was visible over less than half the perimeter of the tumor.

The images were independently interpreted by 2 authors (28 and 22 years of experience in head and neck MRI) who were blinded to the clinical information and histopathological results; a consensus interpretation was reached in cases of disagreement.

Semi-quantitative analysis in DCE-MRI

The same 2 authors who performed the morphological analyses on static MR images independently and manually drew regions of interest (ROIs) for signal intensity measurements, avoiding the cystic parts of the tumors. Each observer independently selected the level of the maximum diameter of the tumor, set ROIs on the post-contrast T1WI, and copied and pasted them onto the other images. The range and mean of the ROI sizes in the initial study were 36–1408 mm² and 322 mm² for observer 1 and 39–1432 mm² and 329 mm² for observer 2; those in the validation study were 45–956 mm² and 300 mm² for observer 1 and 48–972 mm² and 308 mm² for observer 2. When two or more ROIs had to be set to avoid cystic parts within the tumor, we set multiple circular ROIs with a diameter equal to or larger than 6 mm and then selected the area with the highest maximal enhancement among them. The formula used to calculate maximal enhancement is given in Supplementary material 5. The average signal intensity within the ROI was then plotted against time, and TICs were constructed. Based on the parameters derived from TICs previously reported in an MRI-based study of parotid gland tumors [13], the time to peak and the washout ratio (WR) were defined with the equations given in Supplementary material 6. The TIC patterns were categorized as follows: persistent pattern, in which the time to peak was more than 120 s; plateau pattern, in which the time to peak was 120 s or less with a low WR (< 30%); and washout pattern, in which the time to peak was 120 s or less with a high WR (≥ 30%).
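The three-way TIC categorization can be written as a short rule. The following is a minimal sketch, assuming the time to peak (in seconds) and the WR (in percent) have already been computed from the equations in the supplementary material; the function name `classify_tic` is hypothetical and is not part of the study's software.

```python
def classify_tic(time_to_peak_s: float, washout_ratio_pct: float) -> str:
    """Categorize a time-intensity curve with the thresholds stated in the text.

    time_to_peak_s and washout_ratio_pct are assumed to be precomputed;
    their defining equations are in the supplementary material.
    """
    if time_to_peak_s > 120:
        # Time to peak longer than 120 s, regardless of washout
        return "persistent"
    if washout_ratio_pct < 30:
        # Early peak (120 s or less) with a low washout ratio
        return "plateau"
    # Early peak (120 s or less) with a washout ratio of 30% or more
    return "washout"


if __name__ == "__main__":
    for ttp, wr in [(150, 50), (120, 29.9), (90, 30)]:
        print(ttp, wr, classify_tic(ttp, wr))
```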

Quantitative analysis in DCE-MRI

All quantitative analyses in DCE-MRI were performed using a vendor-provided software program (IntelliSpace Portal Ver. 6.0, Philips Healthcare). This system is based on a Tofts model, by which pixel-based parameter maps are generated from TICs [26]. For the quantitative analysis, the same 2 authors independently copied the ROI from the semi-quantitative analysis and pasted it onto each permeability map to obtain Ktrans, kep, Ve, plasma fraction, and area under the curve values. The 2 authors also set an ROI of the same size in the contralateral normal parotid gland to obtain Ktrans, kep, Ve, plasma fraction, and area under the curve values of normal parotid gland tissue. The contrast relaxivity was set to 3.9 s−1/mmol. For arterial input function (AIF) correction, an ROI was set in the external carotid artery on the affected side (Supplementary Fig. a), and each AIF curve was judged as showing either a pulsed input pattern or a non-pulsed input pattern. AIF correction was adopted only when at least one of the 49 AIF curves showed the pulsed input pattern (Supplementary Fig. b, yellow curve).
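For orientation, the relationship among the permeability parameters can be written in the commonly used extended Tofts form. This is a general textbook expression rather than a description of the vendor software, whose exact implementation is not stated here; the symbols C_t (tissue contrast concentration), C_p (plasma concentration taken from the AIF), and v_p (plasma fraction) are notation assumed for this sketch.

```latex
C_t(t) = v_p\, C_p(t) + K^{\mathrm{trans}} \int_0^{t} C_p(\tau)\, e^{-k_{ep}\,(t-\tau)}\, d\tau,
\qquad
k_{ep} = \frac{K^{\mathrm{trans}}}{v_e}
```

In this form, kep is the rate constant for contrast medium returning from the extravascular extracellular space to the plasma, so a larger kep corresponds to a faster decline of the tissue curve after its peak.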

Quantitative analysis of IVIM images

For measurements of the ADC and IVIM parameters (D and f), pixel-based ADC and D maps were obtained at a commercial workstation (Synapse Vincent, Fujifilm), and the same 2 authors independently placed the ROIs within the same areas as those used for the TIC construction. ROIs for TIC construction and ADC measurement were chosen such that they avoided obvious cystic areas. A cystic area was defined as a lesion area that showed high intensity on T2WI and was not enhanced on contrast-enhanced MR images. The 2 authors also set an ROI of the same size in the contralateral normal parotid gland to obtain IVIM parameters of normal parotid gland tissue. In addition, the same authors measured the signal intensity of the tumor and that of the adjacent muscle on T2-weighted images, setting the ROIs within the same areas as those used for the TIC construction, and calculated the signal intensity (SI) ratio (lesion/muscle). The 2 authors also set ROIs of the same size in the contralateral normal parotid gland and the adjacent muscle to obtain the SI ratio (lesion/muscle) of normal parotid gland tissue. In this study, T2-weighted images were obtained using a parallel imaging technique, and various degrees of sensitivity correction were used for each receiver coil. Therefore, we used the SI ratio (lesion/muscle), not the absolute SI, on T2-weighted images.
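To illustrate how ADC, D, and f relate to the acquired b values, the following is a minimal sketch of a segmented IVIM estimate from ROI-averaged signals. It assumes the standard biexponential IVIM model and assumes that the perfusion component is negligible at b = 500 and 1000 s/mm²; the workstation's actual fitting method is not described in the text, and all function names and signal values here are hypothetical.

```python
import math


def ivim_signal(b, s0, f, d, d_star):
    """Biexponential IVIM model: S(b) = S0 * (f*exp(-b*D*) + (1-f)*exp(-b*D))."""
    return s0 * (f * math.exp(-b * d_star) + (1.0 - f) * math.exp(-b * d))


def ivim_segmented(s0, s500, s1000):
    """Return (ADC, D, f) from mean signals at b = 0, 500, and 1000 s/mm^2.

    Assumption: at b >= 500 s/mm^2 only the slow (tissue) diffusion
    component contributes to the signal.
    """
    # Monoexponential ADC from the lowest and highest b values
    adc = math.log(s0 / s1000) / 1000.0
    # Slope of the log signal between the two high b values gives D
    d = math.log(s500 / s1000) / (1000.0 - 500.0)
    # Extrapolate the slow component back to b = 0; the shortfall is f
    s_intercept = s500 * math.exp(500.0 * d)
    f = 1.0 - s_intercept / s0
    return adc, d, f


if __name__ == "__main__":
    # Synthetic signals generated with illustrative (not measured) parameters
    s0, s500, s1000 = (ivim_signal(b, 1000.0, 0.10, 1.0e-3, 50e-3) for b in (0, 500, 1000))
    adc, d, f = ivim_segmented(s0, s500, s1000)
    print(f"ADC={adc:.2e} mm^2/s, D={d:.2e} mm^2/s, f={f:.3f}")
```

Because the ADC absorbs the perfusion-related signal loss at low b values, it is larger than D whenever f is greater than zero.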

Statistical analysis

Inter-observer agreement in the measurement of each quantitative MR parameter was assessed by using Spearman’s rank correlation coefficient and Bland-Altman plots. Limits of agreement given by 95% confidence intervals (CIs) were calculated by using Bland-Altman analysis [27]. We also calculated the intra-class correlation coefficients (ICCs). Both Spearman’s correlation coefficients and ICCs were interpreted as follows: r/ICC < 0.2, poor correlation/agreement; r/ICC = 0.2–0.4, fair correlation/agreement; r/ICC = 0.41–0.6, moderate correlation/agreement; r/ICC = 0.61–0.8, good correlation/agreement; and r/ICC ≥ 0.81, excellent correlation/agreement. For categorical variables, such as the presence of a capsule and the TIC pattern, inter-observer agreement was evaluated by using linearly weighted κ statistics.
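As an illustration of the Bland-Altman step, the sketch below computes the bias and limits of agreement for paired measurements from 2 observers. It assumes the conventional definition (mean difference ± 1.96 standard deviations of the differences); the measurement values are invented for the example and are not study data.

```python
import statistics


def bland_altman_limits(obs1, obs2):
    """Return (bias, lower limit, upper limit) for paired measurements.

    Assumes the conventional limits of agreement:
    mean difference +/- 1.96 * sample SD of the differences.
    """
    diffs = [a - b for a, b in zip(obs1, obs2)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


if __name__ == "__main__":
    # Hypothetical paired readings (e.g., D values, x10^-3 mm^2/s)
    observer1 = [1.0, 2.0, 3.0, 4.0]
    observer2 = [1.1, 1.9, 3.2, 3.8]
    bias, lower, upper = bland_altman_limits(observer1, observer2)
    print(f"bias={bias:.3f}, limits of agreement=({lower:.3f}, {upper:.3f})")
```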

In the present study, all statistical analyses were performed using non-parametric statistical tests because a normal distribution of the continuous variables (age, maximal diameter of the tumor, SI ratio (lesion/muscle), ADC, D, f, D*, Ktrans, kep, Ve, Vp) could not be assumed. We analyzed age, sex, presence of facial nerve palsy, maximal diameter, presence of a capsule, lesion-to-muscle SI ratio, TIC pattern, IVIM parameters (ADC, D, f, and D*), and permeability MRI parameters (Ktrans, kep, Ve, Vp). First, we performed univariate analysis among 4 groups (pleomorphic adenomas, Warthin’s tumors, other benign tumors, and malignant tumors) using Kruskal-Wallis and Steel-Dwass tests for multigroup comparisons. The level of significance in the univariate analysis that was used for inclusion in the multivariate analysis model was less than 0.05/6 = 0.008. In addition, receiver operating characteristic (ROC) curves were used to determine the optimal cutoff levels of the significant parameters to enable differentiation of specific pairs of histological types. Second, we performed multivariate analyses with logistic regression. These multivariate analyses were also performed in 4 subgroups. Because these subgroup analyses could increase type I error, we applied the Bonferroni correction for multiple comparisons and set the corrected significance level to less than 0.05/6 = 0.008. Third, we performed decision tree analysis using the significant parameters obtained after multivariate analysis. Finally, using McNemar’s test, we compared the diagnostic accuracy of the TIC pattern alone versus the TIC pattern plus IVIM, and of the TIC pattern plus IVIM versus the combination of the TIC pattern, IVIM, and permeability MRI. Because these comparisons among three diagnostic approaches could also increase type I error, we applied the Bonferroni correction for multiple comparisons and set the corrected significance level to less than 0.05/3 = 0.017. Computer software programs (JMP Pro 13.0.0, SAS; and MedCalc version 18, MedCalc Software) were used to analyze the raw data.
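The corrected significance levels and the paired comparison can be sketched as follows. The Bonferroni arithmetic follows the text directly; the McNemar function uses the exact (binomial) two-sided form, which is an assumption because the text does not state whether the exact or the chi-square version was used. The discordant-pair counts in the example are hypothetical.

```python
from math import comb


def bonferroni_alpha(alpha: float, n_comparisons: int) -> float:
    """Bonferroni-corrected significance level."""
    return alpha / n_comparisons


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p value.

    b: cases correct with method A only; c: cases correct with method B only.
    Concordant pairs do not enter the test.
    """
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    # Binomial tail probability under the null (p = 0.5), doubled for two sides
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)


if __name__ == "__main__":
    print(round(bonferroni_alpha(0.05, 6), 3))  # threshold for 6 comparisons
    print(round(bonferroni_alpha(0.05, 3), 3))  # threshold for 3 comparisons
    # Hypothetical discordant pairs: 9 improved, 1 worsened
    print(mcnemar_exact(9, 1))
```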

Validation study

To test the diagnostic value of the proposed diagnostic flow diagram, we applied it to 60 parotid gland tumors in 60 consecutive adult patients who underwent T2WI, DCE-MRI, and IVIM-DWI between October 2017 and December 2018. The technique used to visualize and evaluate the tumors and the exclusion criteria were otherwise the same as those used in the initial part of the study. A flow diagram summarizing the candidates of the validation study and each exclusion procedure is shown in Fig. 2.

Fig. 2

Flow diagram of patient selection and exclusion criteria for the validation study

The sensitivity, specificity, positive predictive value, negative predictive value, and accuracy of the proposed classifications derived from the initial study results for differentiation of parotid gland tumors were examined. A tumor classified as malignant by the decision tree analysis and confirmed as malignant histologically was regarded as a true-positive result.

Results

Inter-observer agreements

Spearman’s rank correlation coefficients between the 2 observers showed good to excellent correlation in the evaluation of quantitative parameters (see Supplementary material 7 for details). The limits of agreement in Bland-Altman plots of inter-observer variability in the measurement of quantitative parameters are given in Supplementary material 8. The ICCs for the tumor assessment showed excellent agreement except for f, which showed fair agreement (see Supplementary material 10 for details). In addition, the ICCs for the normal parotid gland assessment showed excellent agreement except for f, which showed moderate agreement (see Supplementary material 9 for details).

For the inter-observer agreement on the categorical variable of TIC pattern, the κ value for the 2 observers was in the almost perfect range (0.98).

Clinical data evaluation

Table 2 shows the distribution of clinical data among the 4 histological groups (pleomorphic adenomas, Warthin’s tumors, other benign tumors, and malignant tumors). Multivariate analysis showed that there was no significant difference among the four histologic types in terms of age, sex, and facial nerve palsy.

Table 2 Distribution of clinical data and MR characteristics among the 4 histological diagnoses

MR image evaluation

Table 2 shows the distribution of MR characteristics among the 4 histological diagnosis groups. Univariate analysis showed significant differences among the 4 histologic types in the grade of the peritumoral capsule, SI ratio (lesion/muscle) on T2WI, TIC pattern, ADC, D, kep, Ve, and Vp. Multivariate analysis revealed 3 significant features, i.e., the TIC pattern, the IVIM parameter D, and the permeability parameter kep. Figure 3 a and b show scatter plots of the D value and kep, respectively, with cutoff values among the 4 histologic subtypes. Figure 4 shows a flow diagram derived from a decision tree analysis combining the TIC pattern, D values, and kep values of the lesions to differentiate the 4 histological types of parotid gland tumors based on the results of the initial study. It involves the TIC pattern in the first step, the D value in the second step, and the kep value in the last step. Regarding the AIF correction, we could select the best pulsed input pattern among the 49 AIF curves and use manual AIF correction in all patients in both the initial and validation study groups.

Fig. 3

a Scatter plot of D values among the 4 histologic subtypes. The black circles show the D value of each case. The red squares and error bars represent the median value and the 25th–75th percentiles of the D value in each histologic type, respectively. The cutoff value to differentiate between pleomorphic adenomas and the other 3 histologic types is 1.22 (× 10⁻³ mm²/s), that between other benign tumors and malignant tumors/Warthin’s tumors is 1.06 (× 10⁻³ mm²/s), and that between malignant tumors and Warthin’s tumors is 0.89 (× 10⁻³ mm²/s). b Scatter plot of kep values among the 4 histologic subtypes. The black circles show the kep value of each case. The red squares and error bars represent the median value and the 25th–75th percentiles of the kep value in each histologic type, respectively. The cutoff value to differentiate between Warthin’s tumors and the other 3 histologic types is 581 (× 10⁻³/min), and that between malignant tumors and other benign tumors/pleomorphic adenomas is 321 (× 10⁻³/min)

Fig. 4

Decision-tree algorithm for differential diagnosis of a parotid tumor. The first row shows the classifications according to the TIC pattern, the second row shows those according to the D values, and the bottom row shows those according to the kep values. The blue steps show the benign tumor diagnoses, and the pink steps show the malignant tumor diagnoses

Evaluation of diagnostic value

Table 3 summarizes the diagnostic value of TIC patterns alone, the combination of TIC patterns and D values, and the combination of TIC patterns, D values, and kep values. When we assigned a diagnosis on the basis of TIC patterns alone and regarded persistent and washout TIC pattern lesions as benign tumors and plateau TIC pattern lesions as malignant tumors, the sensitivity, specificity, accuracy, positive predictive value, and negative predictive value were 50% (6/12), 69% (34/49), 66% (40/61), 29% (6/21), and 85% (34/40), respectively. When each parameter was used alone, the sensitivity, specificity, and accuracy were 50% (6/12), 69% (34/49), and 66% (40/61) for the TIC pattern; 25% (3/12), 86% (42/49), and 74% (45/61) for D; and 75% (9/12), 82% (40/49), and 80% (49/61) for kep, respectively. When we applied D values to TIC patterns and modified the diagnosis with reference to the D values, the sensitivity, specificity, accuracy, positive predictive value, and negative predictive value were 67% (8/12), 82% (40/49), 79% (48/61), 53% (8/15), and 87% (40/46), respectively. The use of D value thresholds improved the diagnostic values, although the p value did not reach the corrected significance level (p = 0.013). When we applied kep values to TIC patterns plus D values and modified the diagnosis with reference to the kep values, the sensitivity, specificity, accuracy, positive predictive value, and negative predictive value were 92% (11/12), 94% (46/49), 93% (57/61), 79% (11/14), and 98% (46/47), respectively. The use of kep value thresholds significantly improved the diagnostic values further (p = 0.0078). Two representative cases are shown in Figs. 5 and 6. These cases could be correctly diagnosed because of permeability imaging.

Table 3 Comparison of diagnostic accuracy to differentiate benign tumors from malignant parotid tumors

Fig. 5

Pleomorphic adenoma of the right parotid gland in a 66-year-old woman. a Transverse T2-weighted MR image shows a heterogeneously hyperintense mass with a complete capsule (arrow) in the right parotid gland. b TIC pattern derived from DCE-MRI shows a plateau pattern. c The D map derived from IVIM-DWI shows a mass with slightly low signal intensity (arrow, D = 0.78). d The kep map derived from DCE-MRI shows a low kep value (264) of the tumor (arrow). The plateau TIC pattern and the low D value, which is smaller than the cutoff value (1.06), indicate a malignant tumor; however, the low kep value (264), which is smaller than the cutoff value (321), finally leads to a diagnosis of a benign tumor. This is a representative case in which the addition of permeability imaging was useful

Fig. 6

Acinic cell carcinoma of the right parotid gland in a 73-year-old man. a Transverse T2-weighted MR image shows a heterogeneously hyperintense mass with an incomplete capsule (arrow) in the right parotid gland. b TIC pattern derived from DCE-MRI shows a plateau pattern. c The D map derived from IVIM-DWI shows a mass with slightly high signal intensity (arrow, D = 1.15). d The kep map derived from DCE-MRI shows a slightly high kep value (398) of the tumor (arrow). The plateau TIC pattern indicates a malignant tumor; however, the relatively high D value (1.15), which is larger than the cutoff value (1.06), indicates a benign tumor; the high kep value (398), which is larger than the cutoff value (321), finally leads to the correct diagnosis of a malignant tumor
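The reasoning in the two representative cases (Figs. 5 and 6) can be traced with a small sketch. It covers only the plateau-TIC situation described in those captions and is not the full decision tree of Fig. 4, whose complete structure is not reproduced in the text. The function name and the handling of values exactly equal to a cutoff are assumptions.

```python
D_CUTOFF = 1.06    # x10^-3 mm^2/s, cutoff quoted in the figure captions
KEP_CUTOFF = 321   # x10^-3 /min, cutoff quoted in the figure captions


def plateau_case_impressions(d_value: float, kep_value: float):
    """Return (impression after the D step, final impression after the kep step)
    for a lesion with a plateau TIC pattern, as described for Figs. 5 and 6.

    Assumption: values exactly at a cutoff are treated as benign here;
    the text does not specify this boundary.
    """
    after_d = "malignant" if d_value < D_CUTOFF else "benign"
    after_kep = "malignant" if kep_value > KEP_CUTOFF else "benign"
    return after_d, after_kep


if __name__ == "__main__":
    print(plateau_case_impressions(0.78, 264))  # Fig. 5, pleomorphic adenoma
    print(plateau_case_impressions(1.15, 398))  # Fig. 6, acinic cell carcinoma
```

In both cases the impression after the D step is reversed by the kep step, which is the sense in which permeability imaging contributed to the correct diagnosis.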

Validation study results

Tables 1 and 3 also show the results of the validation study involving 60 tumors in 60 patients. When we applied the diagnostic algorithm combining the TIC pattern and the D and kep values, the sensitivity, specificity, accuracy, positive predictive value, and negative predictive value were 88% (14/16), 93% (41/44), 92% (55/60), 82% (14/17), and 95% (41/43), respectively (Table 3). The use of kep value thresholds improved the diagnostic values, although the p value did not reach the corrected significance level (p = 0.03).
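The validation percentages follow from a 2 × 2 table with malignancy as the positive class. The sketch below recomputes them from the counts implied by the fractions above (14 true positives, 2 false negatives, 41 true negatives, 3 false positives); the function name is hypothetical.

```python
def diagnostic_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Standard diagnostic indices with malignancy as the positive class."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


if __name__ == "__main__":
    # Counts implied by the validation fractions: 14/16, 41/44, 14/17, 41/43
    metrics = diagnostic_metrics(tp=14, fn=2, tn=41, fp=3)
    for name, value in metrics.items():
        print(f"{name}: {value:.1%}")
```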

Discussion

Our study findings showed that the addition of permeability MRI to semi-quantitative TIC analysis and IVIM imaging could improve diagnostic accuracy in the characterization of parotid gland tumors. The addition of kep values is particularly useful when differentiating benign parotid gland tumors with a plateau TIC pattern and a low D value and malignant tumors with a washout TIC pattern with a low D value, which were respectively misdiagnosed as malignant and benign tumors without permeability MRI.

Semi-quantitative analyses in DCE-MRI could be affected by the total dose and injection rate of the contrast medium, input arterial flow, and patient hemodynamics including the cardiac ejection fraction [17]. Perfusion MRI can be corrected by using the AIF, and it can provide additional physiological information related to the permeability of the capillary membrane, the capillary vascular space, and the extravascular extracellular space [18]. Therefore, we hypothesized that permeability MRI might provide unique information that could not be obtained from semi-quantitative analysis of TIC patterns or from diffusion imaging.

The only significant parameter derived from perfusion MRI in this study was kep, which represents a rate constant between the extravascular extracellular space and blood plasma. We speculated that these results are similar to those for the semi-quantitative TIC pattern; high kep values corresponded to a high WR, and low kep values corresponded to a low WR. Using this approach, it may be possible to analyze the movement of contrast medium between the extravascular extracellular space and blood plasma more accurately.

There have been several reports on the characterization of parotid gland tumors using functional MRI data such as the semi-quantitative and quantitative analyses of DCE-MRI, ADC values, IVIM parameters, pCASL parameter, and T2 mapping. Yabuuchi et al [14] reported that a persistent or flat TIC pattern on dynamic contrast-enhanced MR images indicates benign disease, but there is added value from including the ADC value in the evaluation of tumors that show a plateau or washout TIC pattern. Sumi et al [15] reported that IVIM parameters including f, D, and D* values could differentiate benign from malignant salivary gland tumors. As for the utility of perfusion imaging in the characterization of parotid gland tumors, Xu et al [25] recently reported that TIC pattern in combination with the Ve based on quantitative DCE-MRI could achieve the best diagnostic performance in the differential diagnosis of parotid tumors, and Kato et al [23] reported that ASL could differentiate Warthin’s tumors from pleomorphic adenomas and malignant tumors. Razek [24] recently reported that multiparametric MR imaging using pCASL and DWI was useful for differentiating parotid tumors. These studies certainly showed the superior diagnostic performance of multiparametric analysis; however, they did not validate their results using another cohort [24, 25].

This study has some limitations. First, the number of malignant lesions was relatively small. Therefore, the distribution of histologic types among the malignant tumors might have affected the results. Moreover, the small sample size of each subgroup might have reduced the statistical power. The low rates of malignancy (initial study, 20%; validation study, 27%) in our study might have worked to the disadvantage of sensitivity and to the advantage of specificity, because we defined malignancy as a true-positive case. Second, the number of b values in IVIM imaging was limited to shorten the total imaging time. Third, it is well known that volumetric analysis avoids the possibility of sampling error that is inherent in 2D evaluations; however, we could not perform a 3D VOI approach, because it was not available in our institute. Nevertheless, to our knowledge, this is the first report on the added value of using a combination of semi-quantitative TIC analysis, IVIM-MRI, and permeability MRI using decision tree analysis based on multiparametric analysis of 3 parameters for characterization of parotid gland tumors.

In conclusion, permeability MRI offers added value over IVIM-MRI and semi-quantitative TIC analysis of DCE-MRI in the characterization of parotid gland tumors.