Introduction

The Bosniak Classification of Cystic Renal Masses version 2019 (v2019) proposes several changes to the Original Classification [1]. Among the many proposed revisions, one of the most substantial is the formal incorporation of MRI [1]. The original Bosniak classification, which was developed by Dr. Morton Bosniak in 1986, was based entirely on and intended only to be applied to CT [2]. Only later was the Original Classification validated when applied to MRI [3], but MRI was never formally added to the original Bosniak Classification. Differences between the two imaging modalities, primarily relating to the improved soft tissue resolution and tissue characterization with better depiction of enhancement and pseudoenhancement [4,5,6] with MRI, necessitated formal inclusion of MRI into the revised classification system [7].

Although, for the most part, the original Bosniak Classification performs similarly when applied to cystic masses imaged at CT and MRI, important differences are well known. Israel et al. first noted that MRI showed more septa, thicker walls or septa and better depicted enhancement when compared to CT in nearly 20% of the cystic masses they studied, which resulted in a Bosniak Class upgrade in 10% of cases [3]. This result was later confirmed in a 2017 follow-up study which also showed a trend toward higher Bosniak Class assignment on MRI compared to CT using the original Bosniak Classification [8].

With formal incorporation of MRI and new quantitative definitions for imaging features such as septa number, septal and wall thickness and protrusions, it is unknown if a systematic bias toward upward Class assignment on MRI persists when using Bosniak v2019. To our knowledge, only one previous study has explored this effect. In the 2020 study by Tse et al., the original and v2019 Bosniak systems were compared in cystic masses evaluated with both CT and MRI. The authors demonstrated that differences in Class occurred for both CT and MRI but there was no statistically significant Class change by modality [9]. MRI depicted a higher number of septa; however, wall and septa thickness and protrusions did not differ systematically comparing CT with MRI [9]. The observation by Tse et al. that Bosniak v2019 reduces or eliminates the systematic bias present in the Original Classification when comparing CT and MRI can be considered strong evidence supporting adoption of the revised v2019 classification in modern clinical practice [10]; however, it requires validation. The purpose of this study was therefore to evaluate the original and v2019 Bosniak Classification of Cystic Renal Masses among cystic masses imaged with both CT and MRI and to explore differences that occur between imaging modalities in cystic masses assessed by both Classification systems.

Materials and methods

Patients

With institutional review board approval, we queried our Picture Archiving and Communication System for the terms ‘Bosniak II/2, Bosniak IIF/2F, Bosniak 3/III and 4/IV’ under the search filters ‘CT’ and ‘MRI’. After identifying 669 masses, we cross-referenced our pathology database and determined that 96 masses had a histopathological diagnosis. To achieve a more balanced distribution of cystic masses which included lower Class (e.g., 2 and 2F) masses, 20 consecutive cystic masses assigned an Original Bosniak Class 2 and 20 consecutive cystic masses assigned an Original Bosniak Class 2F in the radiology report, all imaged with both CT and MRI, were retrieved over the same time period.

A fellowship-trained abdominal radiologist with 10 years of post-fellowship experience (R1, NS) and expertise in genitourinary imaging, in particular cystic renal masses, independently reviewed the 96 histologically confirmed masses blinded to the histopathological diagnosis and demographic data. The radiologist also evaluated the 40 provisional Bosniak Class 2 and 2F cystic masses, also blinded to patient demographic features and the original report. The radiologist was provided only with the location of the lesions. Therefore, the initial dataset consisted of 136 potential masses. Twenty-three of the histologically confirmed masses were excluded for the following reasons: solid composition (> 25% enhancing internal elements) N = 16 [1], patient with an underlying genetic syndrome predisposing to renal cell carcinoma (RCC) N = 1 [1], incomplete CT or MRI examination N = 5 [1] or, for CT, the Bosniak v2019 Class could not be definitively assigned and MRI was required N = 1 [1]. Of the remaining 73 masses, 35 were imaged with both CT and MRI. From the 40 provisional Bosniak Class 2 and 2F masses, 15 consecutive Bosniak v2019 Class 2 and 15 Class 2F cystic masses were included to enrich the dataset. The other ten masses were excluded because of incomplete CT (N = 2), incomplete MRI (N = 1), downgrade to Bosniak Class 1 (N = 2) and upgrade to Bosniak Class 3 (N = 5). Patient inclusion and exclusion criteria are summarized in Fig. 1.

Fig. 1

Flow diagram illustrating patient inclusion and exclusion criteria for the present study

In total, 65 cystic masses were imaged with CT and MRI (Fig. 1). All imaging was performed at a single institution between 2009 and 2019. Histopathological diagnosis was reviewed by an experienced genitourinary pathologist (TF) who confirmed the diagnosis. Diagnosis was established by 18-gauge core needle biopsy in 11.4% (4/35) or nephrectomy in 88.6% (31/35) of masses. The time interval between CT or MRI and pathology was 202 ± 208 days with no interval treatment in any patient. Mean patient age was 63 ± 13 years and 66.2% (43/65) of patients were male. Mean cystic mass size was 38.4 ± 26.8 (range 7 to 146) mm. Mean time difference between CT and MRI was 189 ± 187 days. There were 71.4% (25/35) malignant masses (16 clear cell renal cell carcinoma [RCC], 6 papillary RCC, 1 chromophobe RCC, 1 mixed conventional clear cell and clear cell papillary RCC, 1 collecting duct carcinoma) and 28.6% (10/35) benign or low malignant potential masses (2 multilocular cystic renal neoplasm of low malignant potential, 2 mixed epithelial and stromal tumor [MEST], 2 benign multiloculated cysts, 2 benign cystic nephromas, 1 simple epithelial cyst, 1 benign tissue with fibrin and chronic changes). The 35 histologically confirmed cystic renal masses were evaluated previously in a study evaluating the definitions and quantitative thresholds for cystic renal masses proposed in Bosniak v2019; however, the current objective of comparing CT and MRI was not studied. For the 30 included Bosniak 2 and 2F cystic masses, imaging follow-up showing stability of at least 5 years was available (an upper time limit proposed as a marker of benignity [1, 11]).

Imaging technique

All patients underwent multi-detector (16–256 channel) CT or 1.5–3 T MRI performed within a single referral center or from peripheral referral sites using the same imaging protocol with similar imaging parameters for renal mass CT or MRI. The details of institutional renal mass CT or MRI examinations are provided in supplementary Tables 1 and 2.

Imaging assessment

Three fellowship-trained abdominal radiologists with 1, 1 and 2 years of post-fellowship experience (R2, R3, R4 = SA, JM, HO) independently evaluated all cystic masses. Radiologists were blinded to the histopathological diagnosis, patient demographic features and the original report but were provided with the location of the lesion. Radiologists were provided with a presentation summarizing the original and v2019 Bosniak Classification systems which highlighted key changes in Bosniak v2019 taken from the original article by Silverman et al. [1]. Radiologists were instructed to first assign the original Bosniak Class and then assign the Bosniak v2019 Class for each mass. Bosniak Classes were recorded for each mass on both CT and MRI. The CT was evaluated first and the MRI second, after Bosniak Classes were assigned for CT. In addition to Class, radiologists also evaluated, first on CT and then on MRI, the presence and number of septa (predefined range of 1 minimum to 10 maximum), wall and septa thickness, and the presence and size of protrusions as defined in Bosniak v2019 [1]. Discrepancies were resolved through consensus with the fourth expert radiologist (R1). After the first round of interpretations, the three radiologists (R2, R3, R4), still blinded to the diagnoses and to their own and other readers' original evaluations, independently re-evaluated all cases after a minimum 4-week washout period to determine intra-observer agreement.

Statistical analysis

Data were tabulated for the three radiologists and for consensus interpretations. Comparisons between CT and MRI for Class assignment were performed using the Wilcoxon signed-rank test. Individual Bosniak v2019 imaging features were compared between CT and MRI using paired t-tests and Fisher’s exact test. A p value < 0.05 was considered statistically significant. Inter-observer and intra-observer agreement were determined by Cohen’s kappa statistic where: 0–0.20 is slight agreement, 0.21–0.40 is fair agreement, 0.41–0.60 is moderate agreement, 0.61–0.80 is substantial agreement and 0.81–1.00 is almost perfect agreement. For consensus interpretation, 2 × 2 tables were constructed to determine the diagnostic accuracy of Original and v2019 Bosniak Classification compared to ground truth for both CT and MRI. A threshold of Class 2F or higher indicated a positive test result and a diagnosis of cancer on pathology indicated a true positive result. A threshold of Class 2 or lower indicated a negative test result, and a diagnosis of benign disease on pathology or 5-year stability for Class 2 and 2F cysts indicated a true negative result. Statistical analysis was performed using Stata v15.1 (StataCorp, College Station, TX, USA).
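As an illustration of the agreement statistic described above, the following is a minimal Python sketch of unweighted Cohen's kappa for two sets of ratings, together with a lookup of the agreement bands listed in this section. It is not the Stata code used in the study. The function names, the handling of kappa values below 0 and the example Classes are assumptions made for illustration only, and the sketch covers two raters (how values were combined across three readers is not specified here).

```python
from collections import Counter

def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two equal-length lists of categorical ratings."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed agreement: fraction of masses given the same Class by both raters.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of the marginal frequencies, summed over Classes.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1 - expected)

def interpret_kappa(kappa):
    """Map kappa to the agreement bands listed in the Statistical analysis section."""
    if kappa < 0:
        return "below chance"  # assumption: this band is not listed in the article
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"

# Illustrative (made-up) Bosniak Classes for eight masses, not study data.
ct_classes = ["2", "2", "2F", "2F", "3", "3", "4", "4"]
mri_classes = ["2", "2F", "2F", "2F", "3", "4", "4", "4"]
kappa = cohen_kappa(ct_classes, mri_classes)
print(round(kappa, 2), interpret_kappa(kappa))
```

In this toy example, observed agreement is 6/8 and chance agreement is 0.25, giving a kappa of about 0.67, which falls in the substantial band.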

Results

A summary of original and v2019 Bosniak Classes assigned after consensus review by CT and MRI is provided in Table 1. There was 70.8% agreement (kappa = 0.60) between Classes assigned on CT and MRI for the Original Bosniak Classification and 72.3% agreement (kappa = 0.63) for Classes assigned on CT and MRI using Bosniak v2019. The difference in Class assignment was statistically significant only for Bosniak v2019 (p = 0.146 and p = 0.006, respectively). A breakdown of Class differences assigned by CT and MRI is provided in Table 2. For the original Bosniak Classification, two Class 4 masses assigned with CT were downgraded to Class 2F with MRI. Both of these masses were downgraded due to the presence of pseudoenhancement on CT which simulated enhancing tissue (Fig. 2). Four masses were downgraded from Class 2F on CT to Class 2 on MRI. Otherwise, there was a greater number of upgraded masses when comparing MRI to CT, with two masses assigned Class 2 and Class 2F that were upgraded to Class 3 and seven masses that were upgraded from Class 2 to Class 2F (Figs. 3 and 4). For Bosniak v2019, the same two masses were downgraded from Class 4 on CT to Class 2F with MRI and three other masses were downgraded from Class 2F to Class 2. MRI otherwise upgraded ten masses from Class 2 on CT to Class 2F (Figs. 3 and 4). Among pathologically confirmed masses, there was no difference in Bosniak Class assigned by CT or MRI using the original or v2019 classifications.

Table 1 Comparison of consensus original Bosniak and Bosniak version 2019 Classes assigned by CT and MRI in 65 cystic masses
Table 2 Cross-tabulation of original and version 2019 Bosniak Classification of 65 cystic renal masses by CT and MRI after consensus review
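The kind of cross-tabulation summarized in Table 2 can be sketched in a few lines of Python. The snippet below is an illustrative sketch only: the function name, the ordinal ranking dictionary and the example Classes are assumptions introduced here and do not reproduce the study data or analysis code.

```python
# Ordinal order of Bosniak Classes, lowest to highest.
RANK = {"1": 0, "2": 1, "2F": 2, "3": 3, "4": 4}

def compare_modalities(ct_classes, mri_classes):
    """Cross-tabulate paired CT and MRI Classes and count changes in each direction."""
    if len(ct_classes) != len(mri_classes):
        raise ValueError("CT and MRI lists must describe the same masses")
    summary = {"agree": 0, "upgraded_on_mri": 0, "downgraded_on_mri": 0}
    crosstab = {}  # (CT Class, MRI Class) -> number of masses
    for ct, mri in zip(ct_classes, mri_classes):
        crosstab[(ct, mri)] = crosstab.get((ct, mri), 0) + 1
        if RANK[mri] > RANK[ct]:
            summary["upgraded_on_mri"] += 1
        elif RANK[mri] < RANK[ct]:
            summary["downgraded_on_mri"] += 1
        else:
            summary["agree"] += 1
    return summary, crosstab

# Illustrative (made-up) paired Classes for eight masses, not study data.
ct_example = ["2", "2", "2", "2F", "2F", "3", "4", "4"]
mri_example = ["2", "2F", "2F", "2", "2F", "3", "2F", "4"]
summary, crosstab = compare_modalities(ct_example, mri_example)
percent_agreement = 100 * summary["agree"] / len(ct_example)
print(summary, percent_agreement)
```

Each key of the cross-tabulation is a (CT Class, MRI Class) pair, so off-diagonal entries such as ("2", "2F") correspond to masses upgraded on MRI.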
Fig. 2

50-Year-old patient with left lower pole cystic renal mass. Lower portion of a 20 mm cystic renal mass shows an apparent solid enhancing component which is homogeneous and of soft tissue attenuation (40 Hounsfield Units [HU]) on axial unenhanced CT image (A), increasing to 92 HU on nephrographic phase-enhanced CT image (B). Given the suspected solid soft tissue component, the mass was classified using the original and version 2019 Bosniak Classification systems as Class 4. Surgical management was deferred and active surveillance performed. 6-month follow-up MRI shows the solid portion of the cystic mass identified on CT as mainly cystic with only a minimally thickened (3 mm) wall (white arrow) on axial nephrographic phase-enhanced MRI (C). The cystic mass was downgraded from Bosniak Class 4 to Class 2F using both the original and version 2019 classification systems

Fig. 3

52-Year-old patient with right lower pole 54 mm cystic mass detected at liver protocol CT performed for incidental liver mass (not shown). Axial portal-venous phase-enhanced CT image (A) shows the mass has a single thin incomplete septation (arrow). The mass was classified as Original and version 2019 Class 2. Axial nephrographic phase-enhanced MRI (B) performed 3 months later shows more septa within the cystic mass (arrows). The cystic mass was classified with the Original Classification as Class 3 due to ‘measurable enhancement’ and with the revised Bosniak Classification of Cystic Renal Masses as Class 2F due to the presence of many (≥ 4) smooth and thin (1–2 mm) and minimally thick (3 mm) septa

Fig. 4

75-Year-old patient with incidental right lower pole 32 mm cystic mass detected on ultrasound. Axial nephrographic phase-enhanced CT image (A) shows the mass has a single thin incomplete septation (arrow). The mass was classified as Original and version 2019 Class 2. Axial nephrographic phase-enhanced MRI (B) shows more septa within the cystic mass (arrows), with many (≥ 4) smooth thin (1–2 mm) septa. The cystic mass was classified as Original and version 2019 Class 2F

A summary of individual features evaluated by Bosniak v2019 definitions on both CT and MRI is provided in Table 3. A higher number of septa was identified with MRI (4 ± 4 [0–10]) compared to CT (2 ± 3 [0–10], p < 0.001). There was no difference in measurement of septal or wall thickness between CT and MRI (p = 0.855 and 0.067, respectively). A higher number of protrusions was identified in cystic masses with MRI compared to CT (p = 0.034) but with no difference in size of protrusions measured with either modality (p = 0.467).

Table 3 Comparison of individual imaging features, as defined in Bosniak version 2019 after consensus review, for cystic renal masses evaluated by CT and MRI in 65 cystic masses

Diagnostic accuracy was tabulated for Original and v2019 Bosniak Classifications for both CT and MRI and results are summarized in Table 4. For both CT and MRI, Bosniak v2019 had higher specificity with maintained sensitivity and higher overall accuracy compared to the original Bosniak Classification.

Table 4 Diagnostic accuracy of the original Bosniak Classification and Bosniak version 2019 for both CT and MRI using consensus interpretation scores
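To make the 2 × 2 calculation behind Table 4 concrete, the following minimal Python sketch derives sensitivity, specificity and overall accuracy from assigned Classes and a reference standard, using the Class 2F or higher threshold for a positive test described in the Statistical analysis section. The function name, the ranking dictionary and the example values are assumptions for illustration and are not the study data.

```python
# Ordinal order of Bosniak Classes, lowest to highest.
RANK = {"1": 0, "2": 1, "2F": 2, "3": 3, "4": 4}

def diagnostic_accuracy(classes, malignant, threshold="2F"):
    """Build a 2 x 2 table and return sensitivity, specificity and accuracy.

    classes   -- assigned Bosniak Class per mass
    malignant -- reference standard per mass (True = cancer on pathology)
    threshold -- lowest Class counted as a positive test result
    """
    if len(classes) != len(malignant):
        raise ValueError("classes and reference standard must have equal length")
    tp = fp = tn = fn = 0
    for cls, is_malignant in zip(classes, malignant):
        positive = RANK[cls] >= RANK[threshold]
        if positive and is_malignant:
            tp += 1
        elif positive and not is_malignant:
            fp += 1
        elif not positive and is_malignant:
            fn += 1
        else:
            tn += 1
    return {
        "tp": tp, "fp": fp, "tn": tn, "fn": fn,
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
        "accuracy": (tp + tn) / len(classes) if classes else float("nan"),
    }

# Illustrative (made-up) consensus Classes and outcomes for eight masses, not study data.
example_classes = ["2", "2", "2F", "2F", "3", "3", "4", "4"]
example_malignant = [False, False, False, True, False, True, True, True]
result = diagnostic_accuracy(example_classes, example_malignant)
print(result["sensitivity"], result["specificity"], result["accuracy"])
```

Moving the `threshold` argument (for example to "3") shows how the trade-off between sensitivity and specificity depends on which Class is treated as a positive result.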

Inter-observer agreement for Class assignment for the 3 radiologists is summarized in Table 5. Levels of agreement among the 3 readers were similar to modestly improved when comparing the Original Classification (kappa = 0.35 CT, 0.37 MRI) with Bosniak v2019 (kappa = 0.44 CT, 0.39 MRI). Intra-observer agreement was fair to substantial (kappa = 0.22–0.69 for CT, 0.27–0.71 for MRI) for the Original Classification and not different from Bosniak v2019 (kappa = 0.31–0.66 for CT, 0.38–0.64 for MRI).

Table 5 Inter-observer agreement (Cohen’s kappa statistic) for original Bosniak and Bosniak version 2019 classification of 65 cystic masses evaluated by CT and MRI among 3 radiologists

Discussion

This study compared imaging features and Class assignment on CT and MRI using the original Bosniak Classification and the revised Bosniak v2019. We showed similar results comparing the Original Classification and Bosniak v2019, with agreement in overall Class assigned by CT and MRI in both systems of approximately 70%. Among discrepant cases, although there were differences in classification in either direction (i.e., upgrade and downgrade), MRI most frequently upgraded cystic masses assigned Class 2 by CT to Class 2F in both systems. Our results indicate that although MRI is formally incorporated into Bosniak v2019, this change did not significantly alter differences in imaging features evaluated or overall Class assigned when comparing evaluation by CT or MRI. Improved depiction of septa, protrusions and enhancement results, for the most part, in a trend toward upgrade of Class 2 to 2F, although MRI may also downgrade a smaller proportion of cystic masses including those with pseudoenhancement on CT. Bosniak v2019 had similar-to-slightly higher inter-observer agreement and improved specificity with maintained sensitivity and higher overall accuracy for both CT and MRI compared to the Original Classification.

Studies directly comparing Bosniak Classification of cystic masses evaluated by CT and MRI are limited; however, the conclusion that MRI tends to upgrade the Bosniak Class of cystic masses compared to CT is well established [6, 7, 12]. The difference is speculated to be due to improved soft tissue resolution, tissue contrast and, in older literature, the possibility of thinner reconstruction intervals and multiplanar reformatted images with MRI. In the original study by Israel et al., MRI showed more septa, increased wall or septal thickness and better depiction of definitive enhancement (due to better depiction of soft tissue elements on MRI) [3]. These differences resulted in upgrade in approximately 10% of cystic masses [3]. These results were validated in 2017 [8] and, to our knowledge, have not been formally studied since.

Only one study to date has explored the impact of the revised Bosniak v2019 on differences in cystic masses evaluated by CT and MRI. In the 2020 study by Tse et al., the original and v2019 Bosniak Classes were compared in cystic masses evaluated with both CT and MRI. MRI depicted a higher number of septa; however, wall and septa thickness and protrusions did not differ systematically comparing CT with MRI [9]. The authors demonstrated that differences in Class occurred for both CT and MRI but without any statistically significant Class change by modality [9]. Our re-analysis of this effect showed similar but somewhat disparate results. In our study, there was also an increased number of septa depicted with MRI but no difference in wall or septal thickness comparing CT and MRI. We noted that MRI depicted more protrusions than CT and also identified pseudoenhancement on CT in two masses (enabling downgrade from Bosniak 4 assigned by CT to Bosniak 2F with MRI). Moreover, although there were differences in Class occurring in both directions (i.e., upwards and downwards) comparing CT and MRI, a greater number of masses were assigned to Class 2F and 3 in the original Bosniak Classification and to Class 2F in version 2019. Though Tse et al. showed no trend toward systematic Class change by modality, the masses that were upgraded by MRI compared to CT were also upgraded on the basis of increased septa, septa/wall thickness and protrusions detected on MRI [9]. In terms of diagnostic accuracy, Bosniak v2019 had improved specificity with maintained sensitivity and higher overall accuracy for both CT and MRI compared to the Original Classification. This is compatible with what other investigators have shown to date when comparing accuracy of the two classification systems [9, 13, 14].

Inter-observer agreement reported in our study is concordant with several recent studies reported in the literature which describe similar or modest improvements in agreement when using the revised v2019 compared to the Original Classification [13,14,15]. Intra-observer agreement did not differ between readers when comparing the original and v2019 systems. We reported conventional kappa values to simplify presentation of results across 3 readers and since we were more interested in comparison between systems rather than absolute values. The use of a weighted kappa would be expected to result in higher absolute kappa values; however, it is more controversial when averaging across greater than 2 readers [16].

Our study has limitations. The number of patients and cystic masses included is relatively small, particularly for those with histopathological confirmation, but similar to what has been reported previously in the literature [3, 9]. The necessity for both CT and MRI in the same patient, performed within a relatively close time interval, and the trend toward active surveillance of not only Bosniak Class 2F but also Class 3 and in some cases Class 4 cystic masses [1, 17,18,19] explain the challenge in obtaining a larger sample size from a single institution. Use of a multi-institutional approach may improve case numbers and the robustness of assessment. None of the Bosniak Class 2 and only some of the Class 2F cystic masses in the study underwent histopathological confirmation, which is a necessary limitation in studies evaluating the Bosniak Classification given that the vast majority of Class 2 masses are benign and very few Class 2F masses are malignant or radiologically progress on follow-up [11, 20]. The time interval between imaging by CT and MRI and pathology could contribute to differences in classification if cyst morphology changed between imaging studies. Differences in hardware and software of CT and MRI systems used in our study could have contributed to Bosniak Class assignment disparities given the wide time period of the study. A prior study by Rosenkrantz et al. demonstrated that Bosniak Class may be influenced by imaging at 1.5 or 3 T, with a tendency to upgrade cyst complexity at higher field strength, which highlights the importance of both hardware and software considerations when evaluating and re-evaluating cystic masses [21]. The interpretation strategy, in which CT was evaluated before MRI and original Bosniak before Bosniak v2019, may have biased our results; however, it was chosen to simplify the readout scheme, since readers evaluated the dataset twice to determine inter- and intra-observer agreement, and thereby to reduce learning effect between readouts by having only two rather than a greater number of readouts.

In conclusion, this study demonstrates that despite formal incorporation of MRI into the Bosniak Classification of Cystic Renal Masses version 2019, there remain persistent differences in imaging features and Bosniak Class assigned to cystic masses imaged with CT and MRI. MRI depicts more septa and protrusions compared to CT and is more accurate for evaluating the presence or absence of enhancement; however, there is no difference in degree of wall or septa thickness or protrusion size when evaluated with CT or MRI. There was a similar degree of agreement in overall Bosniak Class between CT and MRI for the original and version 2019 systems, occurring in over 2/3 of cases. For the discrepant cases, differences between CT and MRI resulted in both upgrading and downgrading of Bosniak Class; however, Bosniak v2019 had similar-to-slightly improved inter-observer agreement for both CT and MRI with improved specificity, maintained sensitivity, and higher overall accuracy for both CT and MRI compared to the Original Classification.