Introduction

Among incidentalomas detected at abdominal imaging, renal masses are common [1, 2]. Though most solid (> 25% enhancing tissue [3]) renal masses are malignant, small (≤ 4 cm) solid masses can be benign in up to 20% of instances [4, 5]. Clear-cell renal cell carcinoma (ccRCC) is the most common RCC subtype and is also aggressive. In patients managed with active surveillance (AS), ccRCC is the most common cause of disease progression and metastasis (though occurring rarely) [6]. In current clinical practice, diagnosis of ccRCC among small solid renal masses is only possible by pathological analysis after extirpative therapy or biopsy. Pre-operative histological diagnosis would be desirable to both (1) reduce benign resection rates and (2) better inform AS management decisions [7]. Yet, renal mass biopsy is underutilized [8]. Renal mass biopsy is an additional diagnostic procedure that may delay treatment, is invasive (median complication rate of 8%) [9], and can be non-diagnostic in up to 20% of masses [10].

Dr. Pedrosa and colleagues have developed a clear-cell likelihood score (ccLS) system, which is a 5-tiered Likert scale that estimates the likelihood that a solid renal mass is a ccRCC using multiparametric MRI [11]. The ccLS can diagnose ccRCC with moderate accuracy and a high negative predictive value, and has been shown to have substantial inter-observer agreement [12,13,14]. A limitation of the ccLS is that renal masses are more frequently evaluated by CT than by MRI, owing to their incidental depiction on examinations performed for other reasons (most frequently CT), shorter examination times of CT, lower cost of CT, and increased accessibility to CT [15]. A predictive CT algorithm for diagnosis of ccRCC has been recently published [16] and could serve as an alternative to the ccLS in patients without MRI. The algorithm was recently validated by Lemieux and colleagues, who showed a high sensitivity and a high negative predictive value for diagnosis of ccRCC and high specificity for diagnosis of papillary (p) RCC [17].

Any imaging algorithm employed to diagnose renal masses, including the ccLS and CT score, should help guide management decisions and potentially improve patient selection for renal mass biopsy. The CT score may therefore be useful to avoid biopsy in pRCC (due to its high specificity) and to select patients with possible ccRCC for biopsy (due to its high sensitivity). Avoiding biopsy entirely with non-invasive imaging is ideal; however, both CT and MRI suffer from an inability to reliably differentiate ccRCC from oncocytic tumors (e.g., chromophobe RCC and oncocytoma) [18]. In addition, neither the CT score nor the ccLS determines the grade of ccRCC, which is an important consideration since low-grade ccRCC may also be amenable to surveillance at small sizes [7]. This study therefore aims to evaluate the CT-based algorithm for diagnosis of ccRCC, and secondarily for diagnosis of pRCC, among small (≤ 4 cm) solid renal masses that were diagnosed by renal mass biopsy.

Methods

Patients

This retrospective study received institutional review board (IRB) approval with a waiver of the requirement for written patient informed consent. A fellowship-trained genitourinary radiologist (N.S.) and fellowship-trained genitourinary pathologist (T.A.F.), both with 11 years of post-training experience, searched the institutional pathology database and PACS (Enterprise Radiology Software, Change Healthcare Corporation) from January 2014 to January 2021 for renal masses in adult patients (age ≥ 18 years) that were imaged by CT prior to renal mass biopsy, yielding a pathologic diagnosis. This search identified 314 patients. Patients were excluded for the following reasons: CT did not use a complete renal protocol or biopsy was performed > 12 months after CT (n = 156); size of resected mass (based on mean of axial short-axis measurement, axial long-axis measurement, and coronal measurement) greater than 4 cm on CT (n = 107). Therefore, the final study population consisted of 51 patients with a total of 51 biopsy-confirmed small (≤ 4 cm) renal masses evaluated by multiparametric CT. Figure 1 shows the flow of patient selection.

Fig. 1

Flow diagram depicting patient inclusion and exclusion criteria for the study

All CT examinations were performed for characterization or staging of renal masses. All included masses had a circumscribed (i.e., noninfiltrative) appearance on CT, and no patient had suspected T3 disease or metastases on CT. The mean interval between CT and biopsy was 4 ± 4 months (range, 1–12 months). No patient received intervening therapy between the CT and surgical resection. Institutional fellowship-trained genitourinary pathologists established the histological diagnosis for each mass as part of clinical care by pathologic evaluation of the specimens following partial or total nephrectomy using the WHO criteria for renal tumors [19]. In cases where biopsy was inconclusive or non-diagnostic, repeat biopsy or pathology after surgery was used to establish the final diagnosis. ccRCCs were assigned an International Society of Urogenital Pathology (ISUP) nuclear grade. Histologic diagnoses were retrieved from the pathology reports.

CT protocol

CT examinations were performed using a variety of multi-detector computed tomography (MDCT) scanners with 64 to 320 channels. Examinations used a triphasic renal-mass CT protocol that comprised an unenhanced acquisition of the kidneys, a corticomedullary-phase acquisition of the kidneys, and a nephrographic-phase acquisition of the entire abdomen. The timing of the corticomedullary phase was determined by bolus tracking using a circular ROI placed in the abdominal aorta at the level of the diaphragm, with acquisition 15 s after reaching an attenuation threshold of 100 HU (corresponding with approximately 30–40 s after injection). The nephrographic phase was acquired 80 s after the corticomedullary acquisition (approximately 110–120 s after injection). Typical scan parameters for patients examined on 64-channel systems included 120 kVp, automatic tube current modulation using a standard deviation (SD) of 15 for noise, and tube rotation of 0.5 s. Images were reconstructed in the axial plane with a slice thickness of at least 2.5 mm, with additional generation of sagittal and coronal multi-planar reformatted images. Reconstruction used iterative reconstruction algorithms that were selected to be as similar as possible across vendors. Patients were generally administered 105 mL of non-ionic contrast material at a fixed IV rate of 3.5 mL/s using a power injector followed by a saline flush. The renal-mass CT protocol was compliant with recommendations for renal mass evaluation from the Society of Abdominal Radiology Disease Focused Panel on RCC [3].

Radiologist interpretation

Two fellowship-trained radiologists with 12 and 17 years of post-fellowship experience and an abdominal radiology fellow (C.W., R.H., F.E.) independently reviewed the CT images. The radiologists were informed of the location of the mass in each patient but were blinded to pathology results.

To derive the CT score for each mass, the radiologists first independently assigned a heterogeneity score using a 5-point Likert scale (1 = completely homogeneous; 2 = mostly homogeneous; 3 = mixed areas of homogeneous and heterogeneous components, also described as mixed heterogeneity; 4 = mostly heterogeneous; 5 = completely heterogeneous), as described in the study by Al Nasibi et al [16]. Heterogeneity was assessed on the corticomedullary phase, as selected by Al Nasibi et al [16], because previous work has shown that texture features in renal masses show the greatest variation among histologic diagnoses in the corticomedullary phase [20].

The radiologists then recorded the corticomedullary phase attenuation ratio of the mass relative to the renal cortex as described in the study by Al Nasibi et al [16]. For homogeneous masses, a region of interest (ROI) encompassed approximately two-thirds of the mass’s area (based on subjective visual assessment), avoiding the mass’s edges where it interfaced with renal parenchyma, renal sinus, or retroperitoneal fat. For heterogeneous masses, the ROI encompassed approximately two-thirds of the mass’s most hyperattenuating area (based on subjective visual assessment) and was required to measure at least 5 mm in diameter. An ROI was placed on renal cortex ipsilateral to the mass on corticomedullary-phase images (typical ROI size, 5 mm), using the same axial slice as was used for measuring the mass attenuation (Figs. 2 and 3). The corticomedullary-phase attenuation ratio of the renal mass was then calculated by dividing the attenuation of the mass by the attenuation of the renal cortex, and the absolute value was converted into a score from 1 to 3, where 1 = mild (< 0.40), 2 = moderate (0.40–0.75), and 3 = intense (> 0.75), similar to the three-tiered stratification of the corticomedullary-phase enhancement ratio in the ccLS system used for MRI [13].
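The ratio and its three-tiered score described above can be illustrated with a minimal Python sketch. The function names and the Hounsfield unit values in the usage note are hypothetical and chosen only for illustration; the thresholds are those stated in the text, with the boundary values 0.40 and 0.75 assigned to the moderate tier.

```python
def attenuation_ratio(mass_hu: float, cortex_hu: float) -> float:
    """Absolute mass-to-cortex attenuation ratio on the corticomedullary phase."""
    if cortex_hu == 0:
        raise ValueError("cortex attenuation must be non-zero")
    return abs(mass_hu / cortex_hu)


def enhancement_score(ratio: float) -> int:
    """Map the attenuation ratio to the 1-3 score used by the CT algorithm."""
    if ratio < 0.40:
        return 1  # mild
    if ratio <= 0.75:
        return 2  # moderate
    return 3      # intense


# Hypothetical ROI measurements (HU), for illustration only.
mild_example = enhancement_score(attenuation_ratio(60.0, 200.0))
intense_example = enhancement_score(attenuation_ratio(150.0, 180.0))
```

With these illustrative values, a mass measuring 60 HU against a cortex of 200 HU (ratio 0.30) falls in the mild tier, and a mass measuring 150 HU against a cortex of 180 HU (ratio of about 0.83) falls in the intense tier.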

Fig. 2

A 69-year-old female patient with ISUP grade 2 clear-cell RCC measuring 25 mm in the left kidney. Axial CT images in the (a) unenhanced, (b) corticomedullary, and (c) nephrographic phases at the same level depict the mass (arrows). The mass shown is mostly heterogeneous and shows intense enhancement in b. Circular regions of interest (ROI) were placed in the most enhancing component of the mass and in the ipsilateral renal cortex (circles, b), to determine the mass-to-cortex corticomedullary attenuation ratio. Heterogeneity scores were assigned subjectively using a 5-point Likert-type scale. These two features were used to inform development of the CT score. In this patient, the mass showed avid enhancement (mass-to-cortex corticomedullary enhancement ratio > 0.75) and was considered mostly heterogeneous (heterogeneity score, 4) by both radiologists, resulting in CT score 5

Fig. 3

A 59-year-old male patient with papillary RCC measuring 19 mm in the left kidney. Axial (a) unenhanced, (b) corticomedullary, and (c) nephrographic phase contrast-enhanced CT images depict the mass as having mild enhancement (mass-to-cortex corticomedullary attenuation ratio < 0.4) and mostly homogeneous as rated by both radiologists. The CT score is 1

The CT score

The CT score is a 5-tiered algorithm which uses first the heterogeneity and then the mass-to-renal cortex attenuation ratio (Appendix 1). Each radiologist recorded a CT score for each mass. Prior to recording the data, the radiologists were provided with a short presentation depicting the imaging features and how to apply them within the diagnostic algorithm. Two examples of application of the CT score using the corticomedullary attenuation ratio and heterogeneity score are provided in Figs. 2 and 3.

Proposed revisions to the CT score

Since the most common false positive diagnosis for ccRCC at multiphase CT is an oncocytoma/oncocytic neoplasm [16], we explored whether the use of segmental enhancement inversion (SEI) could further improve performance of the CT scoring system. SEI has been incorporated into the ccLS system [21] as a feature which downgrades tumors into lower ccLS categories since it has been described in oncocytomas [22]. Radiologists therefore compared the corticomedullary phase and nephrographic phase images for each tumor and recorded whether SEI was present (areas within the tumor with a “flip-flop” pattern of enhancement: one area washes-out (de-enhances) and another area washes-in (progressively enhances)) as described by Kim et al [22]. Prior to recording SEI, the radiologists reviewed the diagnostic criteria for SEI and several illustrative examples. SEI results were adapted to the original CT score using a proposed revision to the CT score by Eldehimi et al [23] and revised CT scores were recorded for each patient and each radiologist. In this revision, the presence of SEI downgrades a CT score 4 tumor by one category [23]. A renal mass corticomedullary to nephrographic phase attenuation ratio > 1.5 was also proposed to upgrade CT score 3 or 4 tumors by one category [23]. Nephrographic phase attenuation was measured using the copy-paste function so that the ROIs matched those used in the corticomedullary phase.
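The two adjustments in the revision by Eldehimi et al [23] can be sketched as a small Python function. This is an illustration under stated assumptions rather than the published implementation: the function name is hypothetical, and the text does not specify what happens when both rules apply to the same CT score 4 tumor. The sketch assumes that both rules are evaluated against the original score, so they cancel in that case.

```python
def revised_ct_score(ct_score: int, sei_present: bool,
                     cm_to_neph_ratio: float) -> int:
    """Apply the SEI downgrade and the attenuation-ratio upgrade to a CT score.

    Assumption (not stated in the text): both rules are evaluated against the
    original score, so they offset each other when both apply to a score of 4.
    """
    if not 1 <= ct_score <= 5:
        raise ValueError("CT score must be between 1 and 5")
    delta = 0
    # SEI downgrades a CT score 4 tumor by one category.
    if sei_present and ct_score == 4:
        delta -= 1
    # A corticomedullary-to-nephrographic attenuation ratio > 1.5 upgrades
    # CT score 3 or 4 tumors by one category.
    if cm_to_neph_ratio > 1.5 and ct_score in (3, 4):
        delta += 1
    return ct_score + delta
```

Scores of 1, 2, and 5 pass through unchanged, because neither rule is described as applying to them.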

Lemieux et al noted a similar performance of the CT score in their study, when excluding tumor heterogeneity and using only corticomedullary phase attenuation ratio, also improving inter-observer agreement [17]. Accordingly, we also calculated CT scores using the proposal by Lemieux et al [17], excluding heterogeneity data, for each tumor and each radiologist.

Statistical analysis

The number of ccRCC, pRCC, and other renal masses was tabulated. The area under the receiver operating characteristic curve (AUC), sensitivity, specificity, and positive predictive value (PPV) for diagnosis of ccRCC using a CT score of ≥ 4 and the negative predictive value (NPV) for CT score of ≤ 2 were calculated as described in the study by Al Nasibi et al [16] and also using the revised proposals by Eldehimi et al [23] and Lemieux et al [17]. Comparisons were conducted using 95% confidence intervals (CIs), where overlap in 95% CI indicated no significant difference. The diagnostic accuracy of the CT score for pRCC, using a CT score ≤ 2, was also calculated. Diagnostic accuracy of the CT score for diagnosis of ccRCC and pRCC was derived overall using fixed effects logistic regression modelling controlling for the radiologist. The inter-observer agreement for CT score was assessed using Fleiss weighted kappa. Statistical analysis was performed using Stata BE version 17.0 (StataCorp).
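The thresholds described above can be illustrated with a minimal Python sketch that computes crude, single-reader estimates from raw counts. It does not reproduce the fixed effects logistic regression used in the study, and the function names and toy data are hypothetical.

```python
import math
from typing import Sequence, Tuple


def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else math.nan


def rule_in_metrics(scores: Sequence[int], is_ccrcc: Sequence[bool],
                    threshold: int = 4) -> Tuple[float, float, float]:
    """Sensitivity, specificity, and PPV for ccRCC at CT score >= threshold."""
    pairs = list(zip(scores, is_ccrcc))
    tp = sum(1 for s, y in pairs if s >= threshold and y)
    fp = sum(1 for s, y in pairs if s >= threshold and not y)
    fn = sum(1 for s, y in pairs if s < threshold and y)
    tn = sum(1 for s, y in pairs if s < threshold and not y)
    return _ratio(tp, tp + fn), _ratio(tn, tn + fp), _ratio(tp, tp + fp)


def rule_out_npv(scores: Sequence[int], is_ccrcc: Sequence[bool],
                 threshold: int = 2) -> float:
    """NPV for excluding ccRCC among masses with CT score <= threshold."""
    low = [y for s, y in zip(scores, is_ccrcc) if s <= threshold]
    return _ratio(sum(1 for y in low if not y), len(low))


def intervals_overlap(a: Tuple[float, float], b: Tuple[float, float]) -> bool:
    """True when two (lower, upper) confidence intervals overlap."""
    return a[0] <= b[1] and b[0] <= a[1]


# Toy data for illustration only (not study data).
toy_scores = [5, 4, 4, 2, 1, 3, 4, 1]
toy_truth = [True, True, False, False, False, True, True, False]
sensitivity, specificity, ppv = rule_in_metrics(toy_scores, toy_truth)
npv = rule_out_npv(toy_scores, toy_truth)
```

The NPV is deliberately computed at a different threshold from the other metrics, mirroring the text: masses with an indeterminate score of 3 count as test-negative for the rule-in threshold but are not included in the rule-out group.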

Results

A summary of patient demographic features and histologic diagnoses is provided in Table 1. A total of 51% (26/51) of masses were ccRCC, and 49% (25/51) had other histologic diagnoses. Among the remaining masses, 44% (11/25) were benign (9 oncocytoma, 2 fat-poor angiomyolipoma), and 56% (14/25) were malignant or had malignant potential. The most common malignancy in the study sample after ccRCC was papillary RCC (N = 8) (Table 1). In total, 8% (4/51) of masses had an initial non-diagnostic renal mass biopsy (2 clear-cell RCC, 2 papillary RCC), and the diagnosis was later established by repeat biopsy or nephrectomy. In the other 92% (47/51) of masses, a diagnosis of renal mass subtype was established at initial biopsy, and among masses which later underwent surgery, there was no change in mass subtype after surgical pathology.

Table 1 Patient demographic and renal mass characteristics

Age did not differ between patients with ccRCC and those with other histologic diagnoses (68 ± 11 years vs 61 ± 12 years, respectively; p = 0.37). Sex distribution was also similar: 65% (17/26) of patients with ccRCC were male compared to 64% (16/25) of patients with other histologic diagnoses (p = 0.63). Mean mass size also did not differ between ccRCC and other histologic diagnoses (28 ± 8 [range 15–40] mm for ccRCC and 25 ± 8 [range 13–40] mm for other masses; p = 0.20).

For diagnosis of ccRCC, using fixed effects logistic regression modeling, AUC, sensitivity, specificity, and positive predictive value (PPV) were 0.69 (95% confidence interval 0.61–0.76), 78% (68–86%), 59% (46–71%), and 67% (54–79%), respectively. CT score ≤ 2 had a negative predictive value of 97% (92–99%) to exclude diagnosis of ccRCC. For diagnosis of pRCC, AUC, sensitivity, specificity, and PPV were 0.89 (0.81–0.98), 81% (58–94%), 98% (93–99%), and 85% (62–97%), respectively. Inter-observer agreement for CT scoring was moderate (Fleiss weighted kappa = 0.52 [range 0.46–0.60]).

A summary of the individual radiologists’ performance using the CT score for diagnosis of ccRCC and pRCC is provided in Tables 2 and 3. Results using the revisions proposed by Eldehimi et al and Lemieux et al are presented in Table 4. There was no difference in AUC comparing the original CT score to either proposed revision (overlapping 95% CI). The sensitivity was highest using the revision proposed by Eldehimi et al, but at the cost of lower specificity, and the worst performing system was the proposed revision by Lemieux et al; however, differences were not significant (overlapping 95% CI).

Table 2 Diagnostic accuracy of the CT score system for diagnosis of clear-cell renal cell carcinoma derived by the CT score (1–5) for three individual radiologists and overall in 51 small (≤ 4 cm) solid renal masses diagnosed by renal mass biopsy
Table 3 Diagnostic accuracy of the CT score system for diagnosis of papillary renal cell carcinoma derived by the CT score (1–5) for three individual radiologists and overall in 51 small (≤ 4 cm) solid renal masses diagnosed by renal mass biopsy
Table 4 Diagnostic accuracy of the original CT score system, the revision proposed by Eldehimi et al. [23] and Lemieux et al. [17] for diagnosis of clear-cell renal cell carcinoma for three individual radiologists and overall in 51 small (≤ 4 cm) solid renal masses diagnosed by renal mass biopsy

Considering the 4 masses with initially non-diagnostic renal mass biopsy, the CT score predicted the final diagnosis in all 4 masses (100% accuracy) for both radiologists. That is, both clear-cell RCC had CT score ≥ 4 and both papillary RCC had CT score ≤ 2.

Discussion

This study aimed to validate a recently proposed multiparametric CT algorithm in a consecutive cohort of small (≤ 4 cm) solid renal masses undergoing renal mass biopsy. The purpose of the CT scoring system was originally to differentiate clear-cell RCC from other histologic diagnoses among solid renal masses measuring ≤ 4 cm; however, investigators have shown a secondary benefit, which is high accuracy for diagnosis of papillary RCC [16, 17]. In this study, the CT scoring system had moderate agreement across three radiologists. A CT score ≥ 4 had high sensitivity for diagnosis of ccRCC, and CT score ≤ 2 had high NPV for exclusion of ccRCC. Moreover, a CT score ≤ 2 was highly accurate for diagnosis of papillary RCC. The CT score may represent a useful clinical tool when evaluating solid renal masses ≤ 4 cm by renal-protocol CT due to its ability to identify clear-cell tumors among small (≤ 4 cm) solid renal masses with high sensitivity and to diagnose papillary RCC with high specificity. This may help treating physicians better inform biopsy decisions, potentially avoiding biopsy in papillary RCCs, while performing biopsy in masses suspected to represent clear-cell RCC, establishing diagnosis and grade.

A CT score ≥ 4 achieved moderate to high sensitivity for diagnosis of ccRCC. In the original study proposing the CT score algorithm, sensitivity was 71% for one reader and 42% for the other reader [24]. In a subsequent study by Lemieux et al, a sensitivity of 74% was reported, which is comparable to our results [17]. The PPV in our study was slightly lower than in the initial description of the CT scoring system [16], where PPVs were 71% and 79%, but again similar to Lemieux et al, who report a pooled PPV of 59% [17]. Comparatively, the multi-parametric MRI ccLS has shown a similar sensitivity of 75% but higher PPV of 76% in a multi-center study by Schieda et al [14]. In another study, with more experienced radiologists, PPV of the ccLS for diagnosis of ccRCC increased to 87%, indicating that there is likely a positive learning effect when applying the system [25]. Whether a similar positive effect on the PPV of the CT score can be achieved with increased reader experience is unknown and will require further assessment.

We evaluated recently proposed revisions to the CT score published by Eldehimi et al and Lemieux et al [17, 23]. The ccLS system uses segmental enhancement inversion (SEI) as a feature to help diagnose oncocytoma in category 4 tumors [21]. In our study, the proposed modifications by Eldehimi et al, which include adding SEI, did not alter overall accuracy and, although sensitivity was higher, this was at the cost of lower specificity. Similarly, the modifications proposed by Lemieux et al [17], namely to remove tumor heterogeneity and focus only on the tumor corticomedullary attenuation ratio, did not alter overall AUC but had the lowest performance of the three systems tested.

A CT score ≤ 2 had a high NPV in our study and this threshold was highly accurate for diagnosis of papillary RCC. These results are similar to the study by Al Nasibi et al where an NPV of 85–88% was reported [16]. At MRI, the ccLS also has shown a high NPV for score ≤ 2 tumors with a reported rate of 88% in the multicenter study by Schieda et al [14]. The specificity for diagnosis of pRCC, when CT score is ≤ 2, compares favorably to results by Al Nasibi et al and Lemieux et al [17, 24]. Imaging diagnosis of pRCC has become more important recently, since pathological distinction of type 1 and type 2 pRCC is no longer performed in the 2022 WHO classification of renal tumors [26].

Inter-observer agreement in our study was moderate, comparable to the studies by Al Nasibi et al and Lemieux et al [16, 17], but lower than reported using the MRI ccLS system [14]. This difference can be expected given the use of subjective assessment of tumor texture, which is a dominant feature in the algorithm [17]. Even with training and experienced radiologists, subjective assessment of tumor texture is likely to remain a feature with at best moderate agreement, and future iterations of the algorithm might aim to improve agreement through use of more quantitative techniques such as texture analysis. Although Lemieux et al [17] report improved inter-observer agreement when eliminating tumor heterogeneity from the CT scoring system, in our study, this strategy resulted in the lowest diagnostic accuracy (although the difference was not significant).

This study has limitations. The sample consisted only of masses which underwent renal mass biopsy for histologic diagnosis of mass subtype. It is possible that renal mass biopsy diagnosis could be different compared to diagnosis after surgical resection; however, diagnostic accuracy of biopsy for renal mass subtype is high [9] in diagnostic specimens, and in our study, in masses with a biopsy diagnosis of subtype that underwent surgery, subtype pathology was concordant in all cases. A population of masses undergoing renal mass biopsy could be biased, particularly at our institution, where renal mass biopsy is not universally applied to all cT1a renal masses, and might include a higher proportion of oncocytic neoplasms. Therefore, the natural incidence of cT1a renal masses is not represented in our study. The resulting lower prevalence of ccRCC might be expected to lower performance of the CT score in our sample; conversely, diagnostic performance of the CT scoring system should be expected to be higher in a population considering all consecutive cT1a renal masses due to a higher prevalence of ccRCC [4]. Renal-protocol CT was performed according to Society of Abdominal Radiology Disease Focused Panel on RCC recommendations, although using different CT models. This variability in equipment could have negatively influenced the observed performance of the derived CT score.

In conclusion, in this study, a CT scoring system applied for diagnosis of clear-cell RCC among solid renal masses measuring ≤ 4 cm had moderate inter-observer agreement, high sensitivity and NPV for diagnosis of ccRCC, and high specificity for diagnosis of papillary RCC. These results could inform biopsy decisions in clinical practice, potentially avoiding biopsy in tumors diagnosed as papillary RCC and concentrating biopsy on suspected clear-cell RCC to establish diagnosis and histological grade, thereby better informing management decisions.