Introduction

In 2019, the Bosniak Classification of Cystic Renal Masses version 2019 (v2019) was introduced, proposing several changes to the current classification [1]. Chief among these are strict definitions for imaging features of cystic renal masses, including number of septa, septa and wall thickness, and septa and wall protrusions (termed “irregularity” and “nodule”), which aim to improve inter-observer agreement both when evaluating a particular feature of a cystic renal mass and when assigning its Bosniak v2019 class [1]. Preliminary studies have shown modestly improved inter-observer agreement for overall classification with Bosniak v2019 compared to the original classification [2, 3].

Proposed changes in Bosniak v2019 also aim to emphasize specificity in the diagnosis of cystic renal malignancy [1]. The proposed definitions are derived from terms used in the original classification but also largely based upon expert opinion [1]. The proposed definitions and threshold values of measurements described for each definition lack formal validation. While preliminary data suggest the Bosniak v2019 classes do achieve improved specificity when compared to the original classification [2, 3], individual features should be tested to determine if they can be optimized or simplified. The purpose of this study was therefore to evaluate Bosniak v2019 compared to the original classification and to formally evaluate definitions proposed in Bosniak v2019 for wall and septa features in cystic renal masses with reference to histopathology.

Materials and methods

Patients

With institutional review board approval, we queried our Picture Archiving and Communication System (PACS) for the terms “Bosniak IIF/2F, Bosniak 3/III and 4/IV” under the search filters “CT” and “MRI.” After identifying 669 masses, we cross-referenced our pathology database and determined that 96 masses had a histopathological diagnosis. A fellowship-trained abdominal radiologist with 10 years of post-fellowship experience in abdominal MRI and CT (N.S.), familiar with the original and v2019 Bosniak Classification systems, independently reviewed all 96 masses. This radiologist was blinded to the histopathological diagnosis, patient demographic features, and the original report, and was given only the location of the lesion, which two unblinded radiology residents (J.H.Y., J.C.) extracted from the radiology and pathology reports. Twenty-three masses were excluded because of solid composition (> 25% enhancing internal elements; N = 16) [1], an underlying genetic syndrome predisposing to renal cell carcinoma (RCC) (N = 1) [1], an incomplete CT or MRI examination (N = 5) [1], or, for CT, a Bosniak v2019 class that could not be definitively assigned without MRI (N = 1) [1]. Patient inclusion and exclusion criteria are summarized in Fig. 1.

Fig. 1

Flow diagram depicting patient selection and inclusion and exclusion criteria for the present study. 1Picture Archiving and Communication System

In total, 73 cystic (≤ 25% solid elements) masses in 73 patients imaged with CT (N = 28) or both CT and MRI (N = 56) and with histological confirmation were included between the dates 2009 and 2019. Histopathological diagnosis was reviewed by an experienced genitourinary pathologist (TAF) who confirmed the diagnosis. The diagnosis was established by nephrectomy in 83.6% (61/73) or 18-gauge core-needle biopsy in 16.4% (12/73) of the masses. Mean lesion size (determined by the maximum long-axis diameter) was 45 ± 30 (range 8 to 160) mm. Mean patient age was 60 ± 13 (range 26–90) years and there were 45 male patients. There were 78.1% (57/73) malignant masses (42 clear cell renal cell carcinoma [RCC], 10 papillary RCC, 3 chromophobe RCC, 1 mixed conventional clear cell and clear cell papillary RCC, 1 collecting duct carcinoma), and 21.9% (16/73) benign or low malignant masses (4 multilocular cystic renal neoplasm of low malignant potential, 3 mixed epithelial and stromal tumor [MEST], 1 oncocytoma, 6 benign multiloculated cysts, 1 simple epithelial cyst, 1 benign tissue).

Imaging technique

All patients underwent multi-detector (16–256-channel) CT or 1.5–3-T MRI performed at a single referral center or at peripheral referral sites using the same imaging protocol with similar imaging parameters for renal mass CT or MRI. The details of institutional renal mass CT or MRI examinations are provided in supplementary Tables 1 and 2. The time interval between CT or MRI and pathology was 152 ± 182 (range 26 to 1079) days, with no interval treatment in any patient.

Imaging assessment

Three fellowship-trained junior abdominal radiologists with 1, 1, and 2 years of post-fellowship experience (J.M., H.O., S.A.) independently evaluated cystic masses. Radiologists were blinded to the histopathological diagnosis, patient demographic features, and the original report but provided with the location of the lesion. Radiologists evaluated each mass using standard institutional PACS (McKesson Radiology Station version 12.3.0, McKesson Corporation). CT images were viewed in soft tissue windows, level 40 and width 400 Hounsfield units. Radiologists were provided with a presentation summarizing the original and v2019 Bosniak Classification systems which highlighted key changes in Bosniak v2019 with emphasis on definitions and threshold measurements for the septa, wall, and protrusions. Radiologists independently assigned the original Bosniak classification. Radiologists were also instructed to record individual features as proposed in Bosniak v2019 as follows: number of enhancing septa (1–10; > 10 septa were coded as a maximum value of 10 which was determined a priori based on our own experience in a series of test cases which showed that accuracy for counting septa became extremely challenging when > 10), thickness of the wall and septa (mm), presence of septa or wall protrusions, angulation (acute or obtuse) of protrusion to the underlying wall or septa, and size of protrusions (mm) where protrusions ≤ 3 mm with obtuse angles are termed “irregularity” while those measuring ≥ 4 mm or with acute margins are termed “nodule” [1]. The method of measurement was derived from the Bosniak v2019 recommendations, and measurements were only performed on enhanced CT or MR images [1]. After the individual features were recorded, the Bosniak v2019 class was assigned for each mass.
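The protrusion definitions above can be written as a small decision rule. The following is a minimal sketch, not part of the study's methods; the function name `classify_protrusion` is hypothetical, and it assumes protrusion size is recorded in whole millimetres, so that no value falls between the ≤ 3 mm and ≥ 4 mm thresholds.

```python
def classify_protrusion(size_mm: float, angle: str) -> str:
    """Label a wall or septal protrusion using the Bosniak v2019 terms.

    Assumption: size_mm is a whole-millimetre measurement, so an obtuse
    protrusion is either <= 3 mm ("irregularity") or >= 4 mm ("nodule").
    """
    if angle not in ("acute", "obtuse"):
        raise ValueError("angle must be 'acute' or 'obtuse'")
    # Any protrusion with acute margins is a nodule, regardless of size.
    if angle == "acute":
        return "nodule"
    # Obtuse protrusions are split by size.
    return "nodule" if size_mm >= 4 else "irregularity"


# Hypothetical example measurements
print(classify_protrusion(3, "obtuse"))  # irregularity
print(classify_protrusion(5, "obtuse"))  # nodule
print(classify_protrusion(3, "acute"))   # nodule
```

The acute-angle check comes first because the angle criterion overrides size: a 3-mm protrusion with acute margins is still termed a nodule.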

Statistical analysis

Data were tabulated for the three radiologists, and discrepancies were resolved by consensus including the fourth senior radiologist. Inter-observer agreement was determined by Cohen’s kappa statistic for the original and v2019 overall class, and by Cohen’s kappa or Bland-Altman analysis for subjective features and quantitative measurements in v2019, respectively. Consensus interpretation data were used to compare overall class assignment and the proportion of malignancy within a particular class using the original and v2019 classifications, and also to compare individual imaging features described in v2019 to pathological diagnosis. The proportions of benign and malignant masses within a particular class, and among masses described as having an “irregularity” or “nodule,” were compared using 95% confidence intervals (CI). Tests of association were performed for “nodule” angle using the chi-square test. For quantitative variables (e.g., wall or septa thickness, number of septa, protrusion size), tests of association were performed using the rank sum test after a skew test revealed a non-Gaussian distribution of the data (p < 0.001–0.01). Spearman correlation was performed for septa number compared to pathological diagnosis. Empiric receiver-operator-characteristic (ROC) curves were generated, and the optimal cut point to diagnose malignancy was derived, for statistically significant features, using the method described by Youden. For patients with both CT and MRI, MRI data were used due to improved depiction of enhancement and of enhancing septa [4,5,6]. Statistical analysis was performed using Stata v15.1 (StataCorp).
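As an illustration of the agreement statistic named above, the sketch below computes Cohen’s kappa for one pair of readers. It is not the study’s Stata code; the function name and the example class assignments are hypothetical, and it assumes agreement among three readers is summarized pairwise.

```python
from collections import Counter


def cohen_kappa(reader_a, reader_b):
    """Cohen's kappa for two readers rating the same masses."""
    if len(reader_a) != len(reader_b) or not reader_a:
        raise ValueError("readers must rate the same non-empty set of masses")
    n = len(reader_a)
    # Observed agreement: fraction of masses given the same class.
    p_observed = sum(a == b for a, b in zip(reader_a, reader_b)) / n
    # Chance agreement: product of each reader's marginal class frequencies.
    count_a, count_b = Counter(reader_a), Counter(reader_b)
    p_chance = sum(count_a[c] * count_b[c] for c in count_a) / n**2
    if p_chance == 1:  # both readers used a single identical class throughout
        return 1.0
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical class assignments for four masses
reader_1 = ["IIF", "III", "IV", "III"]
reader_2 = ["IIF", "III", "III", "III"]
print(round(cohen_kappa(reader_1, reader_2), 3))  # 0.556
```

In this toy example the readers agree on three of four masses (75%), but agreement expected by chance is 43.75%, so kappa is (0.75 − 0.4375) / (1 − 0.4375) ≈ 0.56.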

Results

The overall Bosniak version 2019 class stratified by pathological diagnosis for the 73 cystic masses for the three radiologists and after consensus review is provided in Table 1. A comparison of the original Bosniak classification and Bosniak Classification v2019 overall class assignment and proportion of malignancy within each class after consensus interpretation is provided in Table 2. There was an overall increase in the number of masses classified as Class IIF using version 2019 due to an overall decrease in the number of masses classified as Class III compared to the original classification. However, there was no difference in the proportion of benign or malignant masses within each class comparing the original and v2019 classification systems (substantial overlap in 95% CI). Inter-observer agreement for the original Bosniak classification and Bosniak version 2019 classification overall and stratified by consensus v2019 class assigned is summarized in Table 3. Overall, there was a slight improvement in agreement between readers comparing v2019 to the original classification with a higher agreement among higher v2019 consensus classes.

Table 1 Bosniak version 2019 classification of histologically confirmed cystic renal masses among 3 radiologists and after consensus review compared to histopathology
Table 2 Comparison of overall class and proportion of benign and malignant masses (with 95% confidence interval [CI]) within each class using original Bosniak Classification and Bosniak version 2019 Classification of Cystic Renal Masses using consensus data from four radiologists
Table 3 Comparison of overall inter-observer agreement comparing 3 blinded radiologists using the original Bosniak Classification and Bosniak Classification of Cystic Renal Masses Version 2019 overall and stratified by version 2019 class assigned after consensus review (numbers represent Cohen’s kappa value)

There were 84.9% (62/73) cystic masses with internal septa. After consensus interpretation, the mean number of septa was 7 ± 4 (range 1–10). The limits of agreement in the number of septa between radiologists ranged from 0.7 to 1.6 (95% CI 0–2.3). There was no association between septal number and malignancy (7 ± 4 for benign masses versus 6 ± 4 for malignant masses, p = 0.89) and no correlation between septa number and malignancy (rho = − 0.02, p = 0.89) (Fig. 2). Considering consensus data of cystic masses with ≥ 4 septa which were smooth (no protrusion) and thin or minimally thickened (≤ 3 mm), Class IIF, there were 60% (9/15) malignant masses.
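For readers unfamiliar with limits of agreement, the sketch below shows the conventional Bland-Altman calculation for two readers: the mean difference (bias) plus or minus 1.96 standard deviations of the paired differences. The function name and the septa counts are hypothetical, and the exact reporting convention behind the ranges quoted above is not specified in the text.

```python
from statistics import mean, stdev


def bland_altman_limits(reader_a, reader_b):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    diffs = [a - b for a, b in zip(reader_a, reader_b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical septa counts recorded by two readers for four masses
bias, lower, upper = bland_altman_limits([1, 2, 3, 4], [1, 3, 3, 5])
print(bias, round(lower, 2), round(upper, 2))  # -0.5 -1.63 0.63
```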

Fig. 2

Fifty-seven-year-old male with a 5.9-cm cystic mass in the anterior interpolar region of the left kidney. The axial enhanced MR image shows the mass (thick white arrow) with two interconnecting thin (≤ 3 mm) smooth enhancing septa (arrowhead). Multiple other similar septa were present elsewhere in the mass at other levels (not shown). The mass is classified as Bosniak v2019 Class IIF due to the presence of many (≥ 4) thin smooth septa. The mass was resected with clear cell renal cell carcinoma diagnosis at pathology.

After consensus interpretation, mean wall and septa thicknesses were 3 ± 3 (range 1–14) mm and 3 ± 2 (range 1–10) mm, respectively. The limits of agreement for wall and septa thickness measurements between radiologists ranged from 0.2 to 0.7 mm (95% CI 0–1.1) and from 0.0 to 0.3 mm (95% CI 0–0.8), respectively. There was an association between increased wall thickness and malignancy (mean thickness 1.6 ± 1.8 mm for benign masses versus 3.0 ± 3.0 mm for malignant masses, p = 0.03) (Fig. 3). Malignant masses also showed thicker septa (3.3 ± 2.3 mm) compared to benign masses (1.6 ± 1.7 mm), but the difference was not significant (p = 0.20). The areas under the receiver-operator-characteristic (ROC) curve to diagnose malignancy by wall and septa thickness were 0.66 (95% CI 0.54–0.79) and 0.61 (95% CI 0.45–0.78), respectively. As determined by the method described by Youden, the optimal cut point which maximized accuracy for diagnosis of malignancy was ≥ 3 mm (sensitivity 33.3%, specificity 86.7%) for wall thickness and ≥ 3 mm (sensitivity 53%, specificity 73%) for septa thickness. The sensitivity and specificity for various other size thresholds for wall and septa thickness to diagnose malignancy are summarized in Table 4.
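To make the cut-point selection concrete, the sketch below scans candidate thresholds and keeps the one with the largest Youden J statistic (sensitivity + specificity − 1). It is an illustration only; the function name and the thickness values are hypothetical and are not the study data.

```python
def youden_cut_point(values, is_malignant):
    """Return (threshold, J) maximizing J for the rule 'value >= threshold'."""
    positives = [v for v, m in zip(values, is_malignant) if m]
    negatives = [v for v, m in zip(values, is_malignant) if not m]
    best_threshold, best_j = None, float("-inf")
    for threshold in sorted(set(values)):
        sensitivity = sum(v >= threshold for v in positives) / len(positives)
        specificity = sum(v < threshold for v in negatives) / len(negatives)
        j = sensitivity + specificity - 1
        if j > best_j:  # ties keep the lowest threshold
            best_threshold, best_j = threshold, j
    return best_threshold, best_j


# Hypothetical wall thicknesses (mm): four malignant, then four benign masses
thickness = [3, 3, 4, 5, 1, 1, 2, 3]
malignant = [True, True, True, True, False, False, False, False]
print(youden_cut_point(thickness, malignant))  # (3, 0.75)
```

Because J weights sensitivity and specificity equally, a classification that prioritizes specificity may reasonably prefer a higher threshold than the one this procedure returns.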

Fig. 3

a Thirty-eight-year-old male with a 3.4-cm cystic mass in the right kidney. Axial enhanced CT image shows the mass (thick white arrow) with a smooth thin (0–1 mm) wall and a single thin (1–2 mm) septation. There is an obtuse protrusion arising from the septa which measures 3 mm (arrowhead) termed an irregularity. The mass is classified as Bosniak v2019 Class III. The mass was resected with clear cell renal cell carcinoma diagnosis at pathology. b Fifty-seven-year-old male with a 16.0-cm mass in the right kidney. Axial enhanced CT image shows the mass (thick white arrow) with a thick (6 mm noted by white line) smooth wall. There is a thick (4 mm) incomplete septum (thin arrow). The mass is classified as Bosniak v2019 Class III. The mass was resected with clear cell renal cell carcinoma diagnosis at pathology

Table 4 Accuracy, sensitivity, and specificity for diagnosis of malignancy in renal cystic masses by wall and septal thickness

After consensus review, 35.6% (26/73) of the masses showed projections with obtuse margins measuring ≤ 3 mm in size (termed “irregularity”) and 39.7% (29/73) showed projections with obtuse margins measuring ≥ 4 mm or projections with acute margins of any size, termed “nodule” (Fig. 4). There were 13 cystic masses with both “irregularity” and “nodule”; after excluding these cases, there were 13 cystic masses where “irregularity” was the highest Bosniak v2019 feature. Among these 13 cystic masses, 23.1% (3/13, 95% CI 5.0–53.8%) were benign and 76.9% (10/13, 95% CI 46.2–94.9%) were malignant. In cystic masses with the highest Bosniak v2019 feature being a “nodule,” the proportion of benign masses was 10.3% (3/29, 95% CI 2.2–27.4%) and that of malignant masses was 89.7% (26/29, 95% CI 72.7–97.8%). Inter-observer agreement for diagnosis of “irregularity” and “nodule” was slight to moderate (kappa = 0.20–0.44 and kappa = 0.31–0.45, respectively). The angle of the nodule was not associated with malignant diagnosis (p = 0.27). Malignancy was diagnosed in 94.4% (17/18; 95% CI 72.7–99.9%) of masses with “nodule” defined as a protrusion with obtuse angles measuring ≥ 4 mm in size, compared to 81.8% (9/11; 95% CI 48.2–97.7%) of masses with “nodule” defined as a protrusion of any size with acute angles. Inter-observer agreement for the nodule angle was poor to moderate (kappa = 0.16–0.43), and the limits of agreement for the nodule size were 1.7–1.9 mm (95% CI 0–3.7).
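The width of these confidence intervals reflects the small subgroup sizes. The sketch below computes an exact (Clopper-Pearson) binomial interval by bisection on the binomial distribution. It assumes the reported intervals are exact binomial intervals, which the text does not state; the function names are hypothetical.

```python
from math import comb


def prob_at_least(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p); increases with p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def solve_increasing(func, target, tol=1e-10):
    """Bisection on [0, 1] for an increasing function of p."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if func(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_ci(k, n, alpha=0.05):
    """Clopper-Pearson interval for k events observed in n masses."""
    # Lower bound: the p at which P(X >= k) equals alpha / 2.
    lower = 0.0 if k == 0 else solve_increasing(
        lambda p: prob_at_least(k, n, p), alpha / 2)
    # Upper bound: the p at which P(X <= k) equals alpha / 2,
    # i.e. P(X >= k + 1) equals 1 - alpha / 2.
    upper = 1.0 if k == n else solve_increasing(
        lambda p: prob_at_least(k + 1, n, p), 1 - alpha / 2)
    return lower, upper


lower, upper = exact_ci(3, 13)  # 3 benign masses among 13
print(f"{lower:.3f} to {upper:.3f}")
```

Under that assumption, 3 of 13 gives an interval of roughly 5% to 54%, in line with the 5.0–53.8% reported for benign masses in the “irregularity” group.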

Fig. 4

a Sixty-year-old female with a 4.3-cm cystic mass in the left kidney. Axial enhanced CT image shows the mass arising from the lower pole of the left kidney (thick white arrow). The mass displays multiple Bosniak v2019 features. There is a smooth thick (5 mm) wall (black line). There are multiple variably thickened septa with irregularity (protrusions with obtuse margins measuring ≤ 3 mm in size), for instance arising from the posterior wall (arrowhead). There is a nodule (black arrow) arising from one of the septa, presenting as a protrusion measuring 5 mm with obtuse margins to the underlying septa. When multiple Bosniak v2019 features are present, the highest feature defines the class. Therefore, the presence of the nodule indicates Class IV. The mass was resected with clear cell renal cell carcinoma diagnosis at pathology. b Sixty-three-year-old male with a 10.4-cm cystic mass in the right kidney. Axial enhanced CT image shows the mass (thick white arrow) with two protrusions. There is an 18-mm protrusion with an acute angle to the underlying wall from which it arises (thin white arrow), termed a nodule. There is a 3-mm protrusion at acute angles (black arrowhead) to the underlying wall from which it arises, which, despite its size, is termed a nodule due to the angle it forms with the underlying wall. The mass is classified as Bosniak Class IV. The mass was resected, with a diagnosis of clear cell renal cell carcinoma confirmed at pathology.

Discussion

In this study, we evaluated the recently revised Bosniak version 2019 compared to the original classification and, within version 2019, the newly proposed imaging features and their association with pathological diagnosis. Overall, we noted an increase in the proportion of Class IIF cystic masses when using version 2019, with a higher proportion of malignancy in Bosniak version 2019 Class IIF compared to the original classification but with overlapping 95% confidence intervals. There was slight improvement in inter-observer agreement comparing version 2019 to the original classification. There was no association between septa number and malignancy, which supports the Bosniak v2019 inclusion of cystic masses with many (≥ 4) septa as Class IIF. Walls and septa were both thicker in malignant than in benign masses, with the optimal cut point which maximized the accuracy of diagnosis determined to be ≥ 3 mm for both wall thickness and septa thickness. The proportion of malignancy overlapped between masses which showed wall or septa protrusions with obtuse margins ≤ 3 mm in size (termed “irregularity”) and those which showed protrusions with obtuse margins ≥ 4 mm in size or with acute margins of any size (termed “nodule”).

The proportion of benign versus malignant cystic masses by Bosniak v2019 class in our study is similar to what has been reported in recent studies evaluating the new system [2, 3]. Malignant masses comprised 50% of Class II, 60% of Class IIF, 78.5% of Class III, and 89% of Class IV masses. There was no difference in the proportion of malignancy comparing individual classes assigned with the original or version 2019 in our study. For Class II and IIF, malignancy rates are substantially higher in our study than rates from the literature which evaluated the original classification (~ 0% for Class II and ~ 10% for Class IIF) [7]. The differences can be explained by our biased sample, which necessitated pathological confirmation. This bias undoubtedly skews the proportion of malignancy, and these proportions cannot be considered accurate given the vast number of benign Class II and IIF masses which did not undergo pathological confirmation over the same time period. Nevertheless, we noted a higher proportion of Class IIF masses assigned using version 2019 compared to the original classification. The higher number of cancers among Class IIF when using version 2019 has also been shown in other recent studies [2, 3] and may be a byproduct of a proportion of Class III masses which are now shifted downwards to Class IIF. The Bosniak v2019 Classification system proposes to improve specificity of diagnosis in higher (Class III and Class IV) masses to prevent unnecessary treatment of benign cystic masses, and one of the consequences of this aim may be a shift of some malignant Class III masses into Class IIF [1]. This observation will require further study.

The inter-observer agreement for the v2019 classification in our study among junior-level fellowship-trained radiologists was slight to moderate, which is on the lower end of the spectrum of recently published data regarding overall agreement of the new system [2, 3, 8]. This is likely explained by our dataset, which consisted only of histologically confirmed masses, mainly Bosniak Class IIF–IV. Including a more balanced grouping of Bosniak classes, with Class I and II cystic masses, would almost certainly increase agreement. Within our study, inter-observer agreement was slightly improved with version 2019 compared to the original system, which is also consistent with recent reports [2, 3].

Our data suggest that an increasing number of septa was not associated or correlated with malignant diagnosis, which supports the inclusion of masses with many (≥ 4) septa which are otherwise smooth and thin (≤ 2 mm) or minimally thick (3 mm) as Class IIF. There was an association between increasing wall thickness and malignancy with a similar trend in septa thickness. The optimal cut point which maximized accuracy according to the method described by Youden [9] was lower than proposed thresholds in the Bosniak v2019 classification, ≥ 3 mm in our study versus ≥ 4 mm in Bosniak v2019 [1]. It is important to note that Youden’s method maximizes the sum of sensitivity and specificity, giving equal weight to each [9]. Since the goal of Bosniak v2019 is to maximize specificity while maintaining an acceptable sensitivity, the 4 mm threshold proposed in Bosniak v2019 may still be appropriate.

Considering masses which showed wall or septa protrusions, our results demonstrate a lower proportion of malignant masses among those which showed only “irregularity” (i.e., protrusions with obtuse margins measuring ≤ 3 mm in size) compared to those which showed “nodule” (i.e., protrusions with obtuse margins measuring ≥ 4 mm in size or with acute margins measuring any size) but with overlapping 95% confidence intervals. In a 2020 study by Tse et al, the authors report a prevalence of malignancy of 71–85% for masses with “irregularity,” 71–85% for masses with “nodule” showing obtuse angles, and 87–95% for masses with “nodule” showing acute angles [10]. The authors do not report 95% confidence intervals, but given the small sample these likely overlap, as in our study. In our study, there was no significant association between the angle of the protrusion and malignant diagnosis. The World Health Organization classification of renal tumors differentiates between multilocular cystic neoplasm of low malignant potential and conventional clear cell RCC by the presence of any “expansile mass” with no reference to size or the angle the mass makes to the underlying septa or wall from which it originates [11]. Our preliminary data, in conjunction with the data from Tse et al [10], suggest that differentiating wall/septa protrusions as “irregularity” versus “nodule” may not influence the probability of malignancy. However, the distinction may yet be useful, as Tse et al note a higher proportion of aggressive tumors in Bosniak v2019 Class IV compared to Class III masses [10].

Our study has limitations. We restricted our analysis to cystic masses with histological confirmation, and this undoubtedly biases the overall proportion of malignancy to a higher number since masses which were surveilled and did not undergo histological confirmation were not included in the present study. Since the study population consists of only histologically confirmed masses, there are relatively few Bosniak Class IIF cystic masses and very few Bosniak Class II masses. Data evaluating the proportion of malignancy in the original Bosniak Class II cystic masses show a proportion of malignancy approaching zero [7], but are lacking for Bosniak v2019 Class II masses. Future studies including a broader representation of the Bosniak classes would be optimal. We included masses with both CT and MRI, which could be viewed as a limitation since there are differences in imaging features comparing CT and MRI in cystic renal masses [6]; however, the inclusion of both CT and MRI is reflective of clinical practice and there was no bias towards malignancy or feature classification comparing modalities in our study. A recent study comparing CT and MRI assessment of cystic renal masses also showed no systematic bias towards classification comparing one modality to the other [12]. Our study included patients with imaging over a 10-year time span; although renal CT and MRI protocols did not differ during the study period at our institution, differences in CT and MRI technology during this time period could also be considered a limitation.

In conclusion, this study evaluated cystic renal masses using the recently revised Bosniak version 2019. Compared to the original classification, version 2019 results in a higher proportion of Class IIF masses and fewer Class III assignments, with a higher number of malignant masses in Class IIF, noting overlapping 95% confidence intervals. Inter-observer agreement was slightly improved with version 2019. Our results indicate that there is no association between the number of septa and malignancy, which supports the inclusion of cystic masses with many (≥ 4) thin, smooth septa as Class IIF. There was an association between increasing wall thickness and malignancy, with a similar but non-significant trend for septa thickness. Our study demonstrated that a lower threshold of ≥ 3 mm, compared to ≥ 4 mm proposed in Bosniak v2019, for both wall and septa thickness optimized accuracy for diagnosis of malignancy. Lastly, there was an overlap in the proportion of malignant masses with wall or septa “irregularity” compared to wall or septa “nodule.” Further study of the Bosniak classification revisions is warranted to explore differences in wall/septa protrusions characterized by size and angulation.