Introduction

Modern high-resolution imaging increasingly confronts radiologists and clinicians with small incidental findings such as intraductal papillary mucinous neoplasms (IPMN) of the pancreas. Once assumed to be a rare pancreatic tumour, IPMN is now one of the most frequently diagnosed non-inflammatory pancreatic tumours, with a high presumed incidence [1].

IPMN are considered to be premalignant, primarily non-invasive mucin-producing tumours, which may progress to invasive carcinomas. The stages of progression are classified into subtypes, defined by the World Health Organization (WHO) as low-grade dysplastic (LGD), moderate-grade dysplastic (MGD), high-grade dysplastic (HGD) IPMN, and intraductal papillary mucinous carcinoma (IPMC) [2]. Invasive IPMN can have a prognosis as poor as that of primary pancreatic ductal adenocarcinoma [38]. The fact that LGD IPMN, the premalignant “benign” precursor, is visible on radiologic imaging as a result of mucin production provides the unique opportunity of preventing invasive cancer by resection in patients with IPMN harbouring moderate dysplasia or high-grade dysplasia (HGD IPMN) [9]. In this context, the accurate diagnosis of non-invasive “benign” vs. early malignant IPMN (HGD and IPMC), and the identification of “benign” forms (LGD and MGD) with high likelihood of malignant transformation (HGD), are crucial. The decision thus lies between resection, with its high associated complication rates, and watchful waiting, with the possibility of overlooking early invasive IPMN at a curable stage.

Several indicative imaging criteria, such as cyst size, mural nodules, and main duct involvement, have been identified in studies in the past [3, 6, 7, 10–15]. The findings were first summarized in 2006 [12] and updated in 2012 in the international consensus guidelines for the management of IPMN and MCN of the pancreas, which introduced high-risk stigmata and worrisome features to manage the indication and extent of surgical resection [13]. High-risk stigmata include main duct involvement and dilatation ≥10 mm, solid components, and the presence of mural nodules; worrisome features include size ≥30 mm and an accentuated main pancreatic duct (5–9 mm). Recent publications have proffered some discussion regarding a more aggressive management of so-called Sendai-negative IPMN, as these may contain already-malignant components despite their benign aspect on cross-sectional imaging in a significant number of patients [6, 16–19]. The 2012 revised guidelines address many issues raised in previous publications and assert the necessity of a preoperative radiological diagnosis in preoperative management and planning [13]. Nonetheless, clinical decision-making in patients with IPMN remains difficult, even with the amended guidelines, and still greatly relies on morphological surrogate parameters derived from radiological imaging studies. Therefore, we investigated the actual diagnostic performance and stratification of the different subtypes of IPMN based on these imaging criteria in a collective of histopathologically confirmed IPMN that were resected at our institution.

Methods

Patients

In this retrospective study, all patients who fulfilled the inclusion criteria were identified from institutional databases for the period from 2001 through 2014. Inclusion criteria were as follows: digitally archived multi-detector computed tomography (CT) and/or magnetic resonance imaging (MRI) of the pancreas (at least biphasic contrast-enhanced imaging for both modalities and standard unenhanced T1- and T2-weighted (w) sequences with and without fat suppression (FS) for MRI), and subsequent (maximum three months after imaging) complete resection of the ductal lesion with confirmed histopathological diagnosis of IPMN. Exclusion criteria were history or findings of chronic pancreatitis such as pseudocysts and calcifications. The institutional review board approved this retrospective study.

The database search revealed a total of 58 patients (23 female, 35 male, mean age ± standard deviation [SD] 64 ± 12.2 years) who fulfilled the inclusion criteria. The median interval between imaging and surgery was 30 days (25th percentile, six days; 75th percentile, 44 days).

Imaging techniques

The 58 patients received a total of 83 CT (n = 42) and MRI (n = 41) examinations. CT examinations were obtained with 4- (n = 5), 8- (n = 4), 16- (n = 27), and 64-row (n = 6) CT devices, according to the institutional standard protocols for abdominal CT and pancreatic imaging, which evolved over time and differed among scanner types. All examinations were performed with intravenous contrast agent (injected dose, 100–140 ml; iodine concentration, 350–400 mg/ml; flow rates, 2–4 ml/sec [automatic injection]; saline flush, 40 ml), and consisted of an arterial phase (20–40 seconds effective delay, performed in 86 % of cases), a portal venous phase (40–70 seconds delay, performed in 78 % of cases), and a venous phase (70–90 seconds delay, performed in 70 % of cases). Additional unenhanced scans were acquired in 29 % of cases. The minimum reconstructed slice thickness available was 1.00–3.75 mm.

All MRI examinations were performed with a 1.5 T magnet (n = 41) using phased-array surface coils. The examination protocols comprised T2w standard 2D sequences, with and without FS; and T1w unenhanced 2D sequences, with and without FS. Intravenous application of gadolinium-based extracellular contrast agents (0.1–0.2 mmol/kg body weight, according to manufacturer recommendation for the different types of contrast agent; manual or automatic injection at approximately 1–2 ml/sec flow rate, followed by 40 ml saline flush) was followed by at least two contrast phases (2D or 3D T1w sequences with FS; arterial, 75 %; portal venous, 89 %; venous phase, 96 %). Magnetic resonance cholangiopancreatography (MRCP) was performed in 39 (95 %) of all MRI examinations with single-shot thick slab and/or 3D sequences.

Image analysis

Image analysis was performed at dedicated viewing workstations (Centricity PACS RA1000, GE Medical Solutions, Fairfield, CT, USA). Three observers (O1–O3), all radiologists with expertise in abdominal imaging (7, 8, and 10 years of experience), reviewed all CT and MRI images separately in randomized order, and were blinded to clinical, histopathological, and surgical data. They were asked to identify IPMN-based lesions and to rate the identified ductal lesion according to the previously published Sendai criteria [13] as either “benign” or “malignant”, and also to indicate how certain they were of their diagnosis during the blinded read based on a 10-point scale (−5, very likely benign; −1, possibly benign; 5, very likely malignant, 1, possibly malignant). In addition, the radiologists were asked to determine the histological entity of the identified IPMN lesion as either “benign” (LGD IPMN or MGD IPMN) or “malignant” (HGD-IPMN, IPMC, or solid carcinoma (CA) arising from IPMN). IPMN with infiltrative solid components >10 mm arising from the cystic ductal lesion and invading the parenchyma (i.e., blurring of the lesion–pancreas interface, invasion of peripancreatic tissue, and adenocarcinoma-type hypoattenuation in contrast-enhanced imaging) was defined as solid CA. This distinction was made to visually discriminate early invasive IPMN (IPMC) from clear-cut invasive IPMN with gross solid components, with the intent to diagnostically discriminate less overtly malignant IPMN from premalignant, non-invasive IPMN (i.e., LGD, MGD and HGD IPMN). 
This 10 mm threshold was used to investigate the diagnostic accuracy of imaging criteria and to account for the fact that invasive carcinoma arising in IPMN is a distinct histopathological entity, with a potentially different outcome and prognosis compared to non-IPMN-related pancreatic adenocarcinoma, depending on the microscopic subtype of IPMN (i.e., gastric, oncocytic, intestinal, pancreatobiliary) [20] and the stage of the disease at the time of presentation [21, 22]. Histopathology served as the reference standard: LGD IPMN and MGD IPMN were defined as “benign”; HGD IPMN and IPMC were defined as “malignant” [2].

In a subsequent consensus read by all observers, descriptive lesion analysis was performed with the following parameters: maximum ductal lesion diameter, presence of mural nodules, main duct involvement according to Zhang et al. [15], solid components as described above, and contrast uptake.

In addition, the image quality of scans was determined with regard to image noise (minor, major, or non-diagnostic) and movement (or other) artefacts (minor, major, or non-diagnostic) on a three-point scale (1, good [minor image noise and motion artefacts]; 2, sufficient [at least one major and no non-diagnostic category]; or 3, poor/non-diagnostic [non-diagnostic image due to noise and/or artefacts]). The quality of contrast-enhanced dynamic imaging was assessed according to predefined criteria of phase-appropriate enhancement, and was also graded on a three-point scale (1, good [phase-appropriate enhancement]; 2, sufficient [timing slightly off]; or 3, poor/non-diagnostic [phase-inappropriate enhancement]).
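The three-point image quality scale described above can be read as a simple decision rule. The following Python lines are a minimal sketch of that rule as we understand it from the text; the function name `image_quality_grade` and the string labels are illustrative assumptions and were not part of the study workflow.

```python
def image_quality_grade(noise, artefacts):
    """Map the two per-scan ratings to the three-point quality scale.

    Both arguments are one of "minor", "major", or "non-diagnostic"
    (labels chosen here for illustration only).
    Returns 1 (good), 2 (sufficient), or 3 (poor/non-diagnostic).
    """
    allowed = {"minor", "major", "non-diagnostic"}
    if noise not in allowed or artefacts not in allowed:
        raise ValueError("rating must be 'minor', 'major', or 'non-diagnostic'")

    ratings = (noise, artefacts)
    if "non-diagnostic" in ratings:
        return 3  # any non-diagnostic category makes the scan poor
    if "major" in ratings:
        return 2  # at least one major and no non-diagnostic category
    return 1      # minor noise and minor artefacts


print(image_quality_grade("minor", "major"))
```

A scan with minor noise and major motion artefacts would thus be graded as sufficient (2) under this reading.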

Statistics

Statistical analysis was performed using IBM SPSS software (release 19.0.0.1; SPSS Inc., IBM, Armonk, NY) and the R software (version 2.15.3, R Foundation for Statistical Computing, Vienna, Austria). Diagnostic parameters (sensitivity, specificity, and accuracy) were calculated using standard formulas. Sensitivities and specificities were tested according to the method described by Bennett [23]. For analysis of interobserver variability, kappa statistics were used (Cohen’s and Fleiss’ kappa coefficient). Owing to the small sample size, a normal distribution was not assumed, and non-parametric tests were therefore performed. The comparison of paired proportions was performed using McNemar’s Chi-squared test with continuity correction. Unpaired proportions were compared using Fisher’s exact test. To compare the medians of two groups, we used the Mann–Whitney U test. Significance was defined by p values of less than 0.05. With regard to prediction of the various evaluated parameters, a univariate logistic regression model was used. Odds ratio (OR) estimates, confidence intervals, and p values were calculated using the exact logistic regression model provided by the R software. Estimated receiver-operating characteristic (ROC) curves were generated for evaluated parameters. The point on the ROC curve with the minimum distance to the ideal point (0 % false-positive rate, 100 % true-positive rate) was defined as the optimal cutoff value.
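For illustration, the standard formulas for the diagnostic parameters and the minimum-distance rule for the optimal ROC cutoff can be sketched in a few lines of Python. This is not the SPSS or R code used in the study; the function names are hypothetical, the rule "value ≥ threshold counts as positive" is an assumption, and the sketch presumes that both benign and malignant cases are present.

```python
import math


def diagnostic_parameters(tp, fp, tn, fn):
    """Sensitivity, specificity, and accuracy from a 2x2 table."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return sensitivity, specificity, accuracy


def optimal_cutoff(values, is_malignant):
    """Threshold whose ROC point lies closest to (FPR = 0, TPR = 1).

    A case is called positive when its value is >= the threshold
    (assumption). Every observed value is tried as a candidate
    threshold; the first one with the smallest distance is returned.
    """
    best_distance = None
    best_threshold = None
    for threshold in sorted(set(values)):
        tp = fp = tn = fn = 0
        for value, malignant in zip(values, is_malignant):
            positive = value >= threshold
            if positive and malignant:
                tp += 1
            elif positive and not malignant:
                fp += 1
            elif not positive and malignant:
                fn += 1
            else:
                tn += 1
        sensitivity, specificity, _ = diagnostic_parameters(tp, fp, tn, fn)
        # Euclidean distance of (FPR, TPR) from the ideal corner (0, 1)
        distance = math.hypot(1.0 - specificity, 1.0 - sensitivity)
        if best_distance is None or distance < best_distance:
            best_distance = distance
            best_threshold = threshold
    return best_threshold
```

Applied to lesion diameters in millimetres together with the histopathological reference standard, `optimal_cutoff` would return a size threshold of the kind reported in the Results; ties between candidate thresholds are resolved here in favour of the smaller value, which is a design choice of this sketch only.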

Results

Evaluated lesions

In the 58 patients, a total of 60 ductal lesions with conclusive histopathological assessment were identified and evaluated. Forty-one benign (LGD IPMN, n = 20; MGD IPMN, n = 21) and 19 malignant (HGD IPMN, n = 3; IPMC, n = 6; solid CA, n = 10) ductal lesions were identified histopathologically (six were main duct [MD] IPMN, 37 were branch duct [BD] IPMN, and 17 were mixed type IPMN).

Image quality

The overall quality of images was rated as good or sufficient (CT: 98 % good, 2 % sufficient; MRI: 95 % good, 5 % sufficient).

The quality of contrast-enhanced CT images (contrast phases missing, n=27) was good in 85 % for the arterial, 84 % for portal venous, and 91 % for the venous phase; and sufficient in 12 % for the arterial, 13 % for portal venous, and 9 % for venous phase. Poor quality was found in 3 % for the arterial and 3 % for portal venous phase.

The quality of contrast-enhanced MRI images (contrast phases missing, n=13) was good in 96 % for the arterial, 94 % for portal venous, and 91 % for the venous phase; and sufficient in 4 % for the arterial, 4 % for portal venous, and 8 % for the venous phase. Poor quality was found in 2 % for the portal venous phase and 1 % for the venous phase.

Morphologic findings

All morphologic findings were rated in consensus among the three observers. Details on size, main duct involvement, and the presence of nodules for benign and malignant IPMN and each histological diagnosis are summarized in Table 1.

Table 1 shows the different histological entities and the distribution of mean size and standard deviation, the number of target ductal lesions with mural nodules, and main duct (MD) involvement (MD >6 mm) for “malignant” and “benign” IPMN, as well as for each histological subgroup. Significance in the prediction of the subtype in univariate analysis is indicated by asterisks (*p < 0.05; **p < 0.001).

Regarding size, ROC analysis revealed an area under the curve (AUC) of 0.70 (p = 0.02). The univariate analysis with a ductal lesion size ≥30 mm (optimal cutoff value derived from the ROC analysis) rendered moderate accuracy (71 %), with low sensitivity (56 %) and moderate specificity (78 %). In the multivariate analysis model including size and “benign” vs. “malignant” IPMN as variables, the OR was 1.03 (CI, 1.00–1.07; p = 0.049) for “malignant” IPMN (Fig. 1; Table 1). ROC analysis of each subtype showed a significant correlation of ductal lesion size with solid CA (p = 0.01), with accuracy of 75 %, sensitivity of 77 %, and specificity of 63 % (optimal cutoff of 41 mm). No significant correlation between histological entity and size was shown for LGD IPMN (p = 0.22), MGD IPMN (p = 0.24), HGD IPMN (p = 0.61), or IPMC (p = 0.24).

Fig. 1 summarizes the percentage of cases per histological entity with respect to morphological imaging criteria. One point was given per positive criterion. Imaging criteria: size ≥30 mm, nodules present, main duct involvement (MD >6 mm). Note the increase of cases with ≥2 points from LGD IPMN to solid CA.

The presence of mural nodules was highly significant for “malignant” IPMN in univariate and logistic regression analysis with regard to differentiation between “benign” and “malignant” IPMN (p < 0.001 [univariate]; p < 0.001 [log. regression]). Univariate analysis showed no significance in the prediction of LGD, MGD, or HGD IPMN (p > 0.05 with true OR > 1), respectively, while nodules were significant for IPMC and solid CA (p = 0.001; p < 0.001). The multivariate analysis model including nodules and “benign” vs. “malignant” IPMN as variables rendered an OR of 23.3 (CI, 5.3–103; p < 0.001) for malignant IPMN. The ROC analysis of nodule size on MRI showed a tendency towards significance (p = 0.09), with an AUC of 0.85 and optimal cutoff value of 6 mm. The same analysis of nodule size on CT did not show a tendency towards significance (p = 0.50; AUC, 0.70).

All ductal lesions identified as solid CA (n = 10) with solid invasive mass-like ductal lesions >10 mm were histologically found to be invasive carcinoma on the grounds of IPMN (p < 0.001).

Main duct involvement—defined as a main duct diameter of greater than 6 mm—was not shown to be significant with regard to differentiation between “benign” and “malignant” IPMN (p = 0.57) (see also Figs. 4 and 5). The ROC analysis of main duct diameter for CT and MRI rendered small AUC for both modalities (0.61 and 0.50, respectively, with p > 0.05 for both modalities). The multivariate analysis model including main duct involvement and “benign” vs. “malignant” IPMN as variables did not show significant results (p > 0.05).

Multivariate analysis of histological subtypes, specifically MGD IPMN and HGD IPMN, with size and nodules did not render significant associations. Due to the small number of HGD IPMN (n = 3), no further multivariate analysis was performed for this histological entity.

Figure 1 summarizes all cases with respect to histology and points per positive morphologic criterion as defined in consensus by all observers; i.e., 0 points describes a ductal lesion smaller than 30 mm without discernable nodules and a main duct diameter <6 mm, while 3 points describe a ductal lesion ≥30 mm, and nodules and main duct dilatation >6 mm.
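The point score underlying Figure 1 can be written down as a short Python sketch. The function name `morphology_points` is hypothetical, and the handling of a main duct diameter of exactly 6 mm (counted here as negative) is an assumption, since the text only specifies <6 mm and >6 mm.

```python
def morphology_points(size_mm, has_nodules, main_duct_mm):
    """One point per positive morphologic criterion (0 to 3 points).

    Criteria as used for Figure 1:
      - maximum ductal lesion diameter >= 30 mm
      - mural nodules present
      - main duct diameter > 6 mm (exactly 6 mm is treated as
        negative here, which is an assumption of this sketch)
    """
    points = 0
    if size_mm >= 30:
        points += 1
    if has_nodules:
        points += 1
    if main_duct_mm > 6:
        points += 1
    return points


# Small branch duct lesion as in Fig. 3 (a-c): 19 mm, no nodules, main duct 1 mm
print(morphology_points(19, False, 1))
```

With these rules, the 19 mm lesion without nodules and with a 1 mm main duct shown in Fig. 3 (a–c) receives 0 points.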

Observer agreement and confidence

The overall accuracy of differentiation between “benign” and “malignant” for all observers was 87 % (sensitivity, 93 %; specificity, 83 %) for CT and 92 % (sensitivity, 92 %; specificity, 92 %) for MRI, with good interobserver agreement (κ, 0.86–0.95).

The overall accuracy of differentiation among the histological subtypes was 85 % (sensitivity, 100 %; specificity, 78 %) for CT and 91 % (sensitivity, 92 %; specificity, 89 %) for MRI for all three observers, with good interobserver agreement (κ, 0.84–0.95).

Observer confidence was best for “malignant” IPMN, especially IPMC and solid CA (Fig. 2; Table 2). Among cases with IPMC and solid CA, none of the observers rated any ductal lesion as benign in both CT and MRI examinations. There was a high percentage of false “malignant” ratings for the LGD IPMN (CT/MRI, 28 %/10 %) and MGD IPMN groups (CT/MRI, 18 %/13 %), as well as a high percentage of false “benign” ratings in the HGD IPMN groups (CT/MRI, 50 %/100 %). Overall, the observers were more confident and more precise with their diagnosis by MRI, with the exception of the small number (n = 3) of HGD IPMN (Fig. 2).

Fig. 2. Observer confidence (CT vs. MRI) with respect to diagnosis (benign vs. malignant) on the y-axis compared to the actual histological subtype on the x-axis. Observer confidence ranges from 1, very uncertain, to 5, very certain of diagnosis. Negative values indicate “benign” ductal lesions (LGD and MGD IPMN); positive values indicate “malignant” ductal lesions (HGD IPMN, IPMC, and solid CA). Note that observer confidence is best for IPMC and solid CA, whereas LGD IPMN, MGD IPMN, and HGD IPMN show increased rates of falsely identified ductal lesions and scattering of confidence values.

Table 2 shows each observer (O1–O3) and their corresponding histological readings for each ductal lesion in CT and MRI. The bold values indicate the correct diagnosis. Note that the percentage of correctly diagnosed histological subtypes is best for “malignant” IPMN (i.e., IPMC and solid CA) with MRI.

Subgroup analysis

In order to evaluate the problematic CT/MRI-based differentiation of the transitional subtypes, the obvious cases of malignant transformation with solid components >10 mm were excluded, i.e., IPMN with solid CA (n = 10). This resulted in an overall average accuracy of subtype identification of 83 % (sensitivity, 83 %; specificity, 83 %) for CT and 90 % (sensitivity, 83 %; specificity, 92 %) for MRI, with moderate to good interobserver agreement (κ, 0.78–0.93).

With the intention to also identify candidates for possible preventive resection, we analysed this subgroup for radiological differentiation of LGD IPMN (n = 20) vs. MGD IPMN, HGD IPMN, or IPMC (n = 30). This resulted in a drop in accuracy, with 67 % for CT (sensitivity, 39 %; specificity, 85 %) and 75 % for MRI (sensitivity, 59 %; specificity, 89 %); with poor interobserver agreement (κ, 0.39–0.72). Figures 3, 4, and 5 illustrate the difficulty of correctly differentiating LGD IPMN from MGD IPMN, HGD IPMN, and IPMC.

Fig. 3. Panels (a–c) show a small branch duct LGD IPMN (19 mm, main duct 1 mm). CT (portal venous phase, a) shows a small cystic ductal lesion with no apparent solid nodules or enlarged main duct. The ductal lesion was correctly rated as benign with high confidence by all three observers (−4, −5, −5). On T2w MRI (b), the ductal lesion was also rated as benign, with slightly higher confidence (all rated as −5). MRCP (c) shows at least two additional, smaller LGD IPMN in the head and tail. In contrast, panels (d–f) show a larger LGD IPMN (27 mm in the axial plane, main duct diameter 3 mm); the white arrowhead indicates possible nodules. On CT (portal venous phase, d), the ductal lesion was falsely rated as malignant with high confidence by two of the three observers (O2 and O3; 4 and 5, respectively), while one observer (O1) rated the ductal lesion as “benign”, although with very low confidence (−1). On T2w MRI (e) and MRCP (f), all three observers rated the ductal lesion as benign with medium confidence (−3, −4, −4) in the absence of any mural nodules.

Fig. 4. MRCP (a) of a histologically confirmed MGD IPMN with gross involvement of the main duct (10 mm), duct margins indicated by arrowheads. T2w MRI (b) of the same ductal lesion with no apparent solid nodules. This ductal lesion was correctly rated as “benign” with low confidence (−1) by only one observer (O1). The other observers (O2 and O3) rated the ductal lesion as “malignant” with low confidence (2 and 3, respectively).

Fig. 5. Contrast-enhanced CT (a) of a histologically confirmed IPMC (axial diameter 45 mm) with no main duct dilatation (not depicted) and a solid nodule (ca. 3 mm; white arrowheads). All three observers correctly identified this ductal lesion as “malignant” with high confidence (all rated as 4). T1w contrast-enhanced MRI (b) of a different histologically confirmed IPMC (axial diameter 37 mm) with a 10 mm mural nodule (arrowheads). This ductal lesion was also correctly identified as “malignant” with high confidence by all observers (5, 5, and 5).

In a further subgroup analysis of patients who underwent both preoperative CT and MRI (n = 26), the results with respect to overall accuracy for the differentiation of “benign” and “malignant” were better compared to those for the entire collective, and superior for MRI; accuracy was 91 % for CT (sensitivity, 100 %; specificity, 85 %) and 96 % (sensitivity 100 %; specificity 93 %) for MRI.

After the exclusion of pure MD IPMN, a further subgroup analysis was performed. In terms of size in this subset, the ROC analysis revealed slightly improved significance (p = 0.008 vs. p = 0.01), with identical AUC and optimal cutoff. The univariate analysis with a ductal lesion size ≥30 mm showed slightly improved accuracy (64 % vs. 61 %), identical sensitivity (88 % vs. 88 %), and slightly improved specificity (53 % vs. 50 %). ROC analysis of each subtype in this subset showed a significant correlation between size and solid CA (p = 0.01 vs. p = 0.02), with slightly improved accuracy (79 % vs. 75 %), identical sensitivity (62 % vs. 62 %), and slightly improved specificity (82 % vs. 78 %).

Regarding the presence of mural nodules in this subset, univariate analysis showed improved significance in the prediction of IPMC (p = 0.002 vs. p = 0.008) and similar significance for solid CA (p < 0.001 vs. p < 0.001). The presence of solid components was also similarly significant (p < 0.05).

With respect to main duct involvement (MD >6 mm) in this subset, a tendency towards significance was shown in differentiation between “benign” and “malignant” IPMN (p = 0.29 vs. p = 0.57). In addition, p values were improved for the prediction of IPMC with main duct dilatation >6 mm, with p = 0.11 vs. p = 0.20 for the entire collective.

Diagnostic accuracy and observer confidence in this subset of cases were not significantly different from those of the entire collective as described above.

Discussion

There remains some discussion about the surgical management of patients diagnosed with IPMN. Therapeutic decision-making, and consequently outcome and long-term survival, relies heavily on radiological imaging and is dependent upon the grade of dysplasia of the treated ductal lesion [3, 19, 24–29]. The recommendation for initial resection of IPMN radiologically suspicious for dedifferentiation (MD involvement, nodules, size, progression under surveillance) is undisputed by most authors for surgically fit patients [5, 7, 12, 13, 16–18, 27, 30–33]. Some authors recommend an aggressive surgical approach in light of findings that revealed a 24.6 % rate of at least high-grade dysplasia in a collective of resected radiologically unsuspicious branch duct IPMN (<30 mm, no nodules; “Sendai-negative”) [16], and that noted approximately 30 % overall malignant progression of all branch duct IPMN [34]. Hardacre et al. cite the poor long-term survival rates in patients with invasive carcinoma arising from IPMN, even after resection [17]. In contrast, other studies have shown that patients with small branch duct lesions under active surveillance have higher overall non-pancreatic cancer-related mortality compared to patients with IPMC [18] and low progression rates upon follow-up [24, 35, 36], with a five-year risk for HGD IPMN or IPMC ranging from 10 % to 15 %. Considering the surgical risk, preventive resection of premalignant forms of IPMN remains controversial, with overall morbidity after resection ranging from 30 % to 60 %, even in experienced centres; complications include high rates of endocrine and exocrine insufficiency, and mortality rates of up to 5 % in patients who underwent partial or total pancreatectomy [37].

The accurate identification of the transitional IPMN subtype prior to surgery is critical for any consideration of preventive or curative surgery of pancreatic IPMN, and is almost impossible with imaging alone. The subtle changes of the IPMN along the sequence of dedifferentiation cannot be diagnosed radiologically, as they represent microscopic histopathological alterations, and so radiologic imaging depends upon surrogate parameters (i.e., size, main duct involvement, mural nodules, solid components), which are more or less indicative of the underlying entity. With increasing degrees of dysplasia, we observed an increasing number of positive morphological criteria (size and nodules), which have been shown to be indicative of malignancy within IPMN in many studies in the past. In our study, with an optimal cutoff of ductal lesion size (≥30 mm), accuracy was moderate (61 %) and sensitivity was high (88 %), while specificity was low (50 %). These findings are consistent with other studies, in which a maximum ductal lesion diameter of greater than 2.5–4.2 cm was found to be an independent predictor of malignancy [11, 12, 33]. In other studies, however, ductal lesion size was not found to be predictive [10, 26]. Also in line with previous studies, the presence of mural nodules alone was significant in the prediction of malignancy in general (p = 0.007) [4, 10–12, 14, 15, 21, 25, 31, 33, 38–41]. In our study, nodules were significant for IPMC and solid CA but not for HGD IPMN. There were, however, only three cases of HGD IPMN in our collective, which may very likely have affected our results. Previous studies in endoscopic ultrasound and MRI have also found mural nodule size to be predictive of invasiveness [7, 42], and we found that all solid CA identified as such had solid infiltrating components.

In contrast to the results of many other publications, in the present analysis, MD involvement alone was not found to be a significant predictor for malignancy [5, 10, 13, 15, 25, 30, 33]. This is likely due to the fact that there were only six MD IPMN in our study, and to our small study collective. One reason that main duct dilatation is linked to malignancy in so many other studies may be that main duct dilatation is more likely to cause symptoms, which have been reported to be associated with higher rates of malignancy [43]. To account for the fact that MD IPMN cases are typically treated surgically [13], and for a possible bias of main duct IPMN in this collective, we also performed all statistical analyses after the exclusion of pure MD IPMN, but found no significant differences in the statistical analysis. In accordance with many previous studies [9–13, 32, 38], we were able to show good overall accuracy of 86 % for CT and 92 % for MRI in the differentiation between “benign” and “malignant” IPMN. This was also confirmed in our subgroup analysis of patients with both CT and MRI, with overall diagnostic accuracy of 96 % for MRI vs. 91 % for CT. Observer confidence with MRI was superior to that with CT in the identification of “benign” vs. “malignant” IPMN. To address the two possible hypothetical clinical settings (curative and preventive surgery), further subgroup analyses were performed. First, we excluded IPMN with infiltrating solid mass-like components >10 mm (solid CA). The remaining ductal lesions were analysed for the identification of malignant forms of IPMN prompting curative surgery, vs. “benign” IPMN, which can be considered for watchful waiting. With this approach, we were able to achieve good accuracy in the differentiation of HGD and IPMC from LGD and MGD for both CT (83 %) and MRI (90 %), with good interobserver agreement (κ 0.78–0.93).

To evaluate the reliability of radiologic imaging for the identification of potential candidates for curative and preventive surgery, a further subgroup analysis was performed in order to identify and differentiate IPMN with MGD, HGD, and early invasive IPMC vs. IPMN with LGD (i.e., preventive surgical candidates vs. candidates for watchful waiting). In this analysis, we were able to demonstrate a significant reduction in accuracy for both CT and MRI, to 67 % and 75 %, respectively, which underscores the difficulty in establishing a correct radiological diagnosis of IPMN with LGD from more progressed forms of dysplasia.

In our collective, the overall accuracy for MRI was higher for differentiation between “benign” and “malignant” IPMN as well as among histological subtypes (accuracy 92 % [MRI] vs. 86 % [CT] and 91 % [MRI] vs. 85 % [CT], respectively). These findings were further substantiated in the differentiation of LGD IPMN from MGD IPMN, HGD IPMN, and IPMC (accuracy 75 % for MRI, 67 % for CT). We were also able to show a tendency towards significance in the prediction of malignancy with increasing size of mural nodules in MRI, whereas no such correlation was shown for CT. The overall superiority of MRI in the diagnosis of IPMN has been demonstrated in previous studies and larger collectives, and can be explained by the improved possibility of imaging cyst connection to the pancreatic main duct with MRCP and the option to amend imaging with DWI [13, 14, 40, 44–46].

Recent publications have indicated that diffusion-weighted imaging (DWI), especially in combination with MRCP, can aid in the identification of benign IPMN [46, 47], as well as help to differentiate IPMN from mucinous cystic neoplasms [44]. However, all authors note that the low spatial resolution of DWI is a limiting factor in ductal lesions smaller than 10 mm.

There are several limitations with regard to our study. Due to the retrospective study design and the long inclusion period, various MRI and CT devices and examination protocols were used, and no DWI was performed. To partially offset this drawback, distinct quality assessment was performed for all examinations with respect to image quality, image noise, contrast timing, and motion artefacts. Furthermore, only patients with resected IPMN were included in this study. In considering “benign” IPMN with LGD and MGD, the sample is likely not representative in view of the many small cystic pancreatic ductal lesions identified on CT and MRI that are not primarily resected. The poor accuracy in the diagnosis of IPMN with LGD vs. the other transitional subtypes is very likely also attributable to this fact. Apart from the total number of patients, which is a common problem in single-centre studies, some subgroups (e.g., HGD IPMN and pure main duct IPMN) are too small to allow for an informative statistical analysis.

The data presented illustrate the limited power of radiological imaging alone in distinguishing between IPMN with LGD and transitional subtypes with MGD, HGD, and early invasive IPMC. These findings indicate the need for additional clinical and paraclinical parameters for correct diagnosis. Progress has been made with endoscopic ultrasound (EUS) and fine-needle aspiration, and recent studies have investigated tumour markers, mutation patterns in cytology [48–51], and DNA indices [52]. These are promising exemplary tools to be integrated alongside radiologic imaging in the primary diagnostic algorithm of this unique ductal lesion entity, and could help to bridge remaining diagnostic gaps.

In summary, invasive IPMN (IPMC and solid CA) can be identified with high confidence and sensitivity using CT and MRI. The diagnostic problem that remains, however, is the accurate radiological differentiation of premalignant and non-invasive subtypes of IPMN with the use of standard imaging techniques alone.