Introduction

Renal cell carcinoma (RCC) is one of the most common cancers worldwide [1]. Clear cell RCC (ccRCC) accounts for 70–80% of all RCCs [2]. Tumor pathologic nuclear grade has been shown to be one of the most important prognostic factors for ccRCC patients [3]. The World Health Organization/International Society of Urological Pathology (WHO/ISUP) classification is the most widely adopted grading system for ccRCC [4]. Well-differentiated and poorly differentiated ccRCC differ in biological behavior, including proliferative pattern, metastatic potential, and sensitivity to molecular targeted therapy or immunotherapy. In a previous study, the 10-year cancer-specific survival rates of ccRCC patients with nuclear grades 1 to 4 were 89%, 84%, 46%, and 15%, respectively [5]. Patients with high-grade ccRCC have an elevated risk of postoperative recurrence [6]. Moreover, many noninvasive therapeutic strategies, including radiofrequency ablation, cryoablation, and active surveillance, have been applied in clinical practice for ccRCC patients. These strategies require appropriate selection criteria, such as pathological grade. However, pathological grading is usually based on postoperative specimens, and preoperative methods for predicting pathologic nuclear grade remain suboptimal. Therefore, accurate preoperative prediction of ccRCC pathologic grade is still urgently needed and could inform clinical decision-making and prognosis evaluation.

Radiomics analysis, a recently developed high-throughput approach for extracting large numbers of features from medical images, has attracted increasing attention in recent years [7, 8]. With its strengths in feature recognition and data integration, radiomics analysis can support the construction of predictive models and guide clinical practice [9]. Radiomics features extracted from computed tomography (CT), magnetic resonance imaging (MRI), and positron emission tomography-computed tomography (PET-CT) have been successfully applied in cancer diagnosis, assessment of prognosis, and prediction of therapy response [10,11,12]. Indeed, some studies have found that the nuclear grade of ccRCC can be predicted by radiomics signatures [13,14,15]. However, these signatures were developed with a limited number of radiomics features or without validation, which restricts their clinical usefulness. Moreover, clinical risk factors, which could improve predictive accuracy, were overlooked in the development of these models.

In this study, we aimed to develop and validate a radiomics nomogram, combining a radiomics signature and clinical risk factors, for preoperative prediction of nuclear grade in patients with ccRCC.

Materials and methods

Patients

Two hundred ten RCC patients from the 2019 Kidney and Kidney Tumor Segmentation Challenge (C4KC-KiTS) dataset and 237 RCC patients from the Cancer Genome Atlas Kidney Renal Clear Cell Carcinoma (TCGA-KIRC) dataset were retrieved from the Cancer Imaging Archive (TCIA) [16, 17]. Patient characteristics, including age, gender, pathologic grade, tumor size, T stage, and body mass index, were obtained from TCIA. Pathologic grade was dichotomized into low grade (grades 1–2) and high grade (grades 3–4). Informed consent was not required since TCIA data contained no personal identifying information.

The recruitment pathway for patients in this study is shown in Supporting Fig. 1. In total, 258 ccRCC patients were enrolled, including 143 patients in the training set and 115 patients in the validation set. The inclusion criteria were as follows: (a) pathologically diagnosed ccRCC, (b) preoperative contrast-enhanced CT, and (c) complete imaging and clinical information. The exclusion criteria were as follows: (a) preoperative chemotherapy or radiotherapy and (b) CT images of insufficient quality.

Region of interest segmentation and feature extraction

Arterial-phase CT images of ccRCC patients were retrieved for radiomics analysis. The regions of interest (ROIs) of renal cancer were segmented slice by slice with 3D Slicer software (available at https://www.slicer.org/). In total, 1316 radiomics features were extracted for each ccRCC patient. Detailed information about feature extraction is presented in Supporting Method 1.

Radiomics feature selection

The least absolute shrinkage and selection operator (LASSO) regression algorithm, which is well suited to regression on high-dimensional data, was used to select radiomics features. Spearman rank correlation was used to test the association between the selected features.
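
To illustrate how the L1 penalty performs feature selection, the following minimal sketch implements LASSO by cyclic coordinate descent with soft-thresholding. It is a simplified illustration and not the analysis code of this study: it uses a squared-error loss rather than a logistic loss, omits the intercept (features and outcome are assumed to be centered), and the function names soft_threshold and lasso_coordinate_descent are hypothetical.

```python
def soft_threshold(rho, lam):
    """Shrink rho toward zero by lam; values within [-lam, lam] become 0."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimize (1/2n)*||y - X b||^2 + lam*||b||_1 by cyclic coordinate descent.

    X is a list of rows (one row per patient), y is a list of outcomes.
    Assumes centered columns and outcome, so no intercept is fitted.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X b while b is all zeros
    for _ in range(n_iter):
        for j in range(p):
            col = [row[j] for row in X]
            z = sum(v * v for v in col) / n
            if z == 0.0:
                continue  # constant-zero feature carries no information
            # Covariance of feature j with the residual that excludes feature j
            rho = sum(v * (r + v * beta[j]) for v, r in zip(col, residual)) / n
            new_beta = soft_threshold(rho, lam) / z
            delta = new_beta - beta[j]
            if delta != 0.0:
                residual = [r - v * delta for v, r in zip(col, residual)]
                beta[j] = new_beta
    return beta
```

With the toy design X = [[1, 1], [-1, 1], [1, -1], [-1, -1]] and y = [2, -2, 2, -2], a penalty of lam = 0.5 shrinks the first coefficient from 2 to 1.5 and sets the second, uninformative coefficient exactly to zero, which is the behavior that allows LASSO to discard features.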

Radiomics signature construction and performance assessment

Features with nonzero coefficients were selected to build the radiomics signature. The radiomics score was calculated for each ccRCC patient as a linear combination of the selected features weighted by their respective LASSO coefficients. The association between radiomics score and nuclear grade was explored with the Wilcoxon rank-sum test. The discrimination performance of the radiomics signature was assessed by the area under the receiver operating characteristic (ROC) curve (AUC) in both the training and validation sets.
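
The two calculations described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used in this study: the function names and the dictionary layout are hypothetical, the intercept term is an assumption (it defaults to zero), and the AUC is computed from its rank-based definition, which is the Wilcoxon rank-sum (Mann-Whitney) statistic divided by the number of high-grade/low-grade pairs.

```python
def radiomics_score(features, coefficients, intercept=0.0):
    """Linear combination of selected features weighted by their coefficients.

    features:     dict mapping feature name -> value for one patient
    coefficients: dict mapping selected feature name -> nonzero coefficient
    """
    return intercept + sum(
        coefficients[name] * features[name] for name in coefficients
    )


def auc(scores, labels):
    """Probability that a high-grade case (label 1) scores above a
    low-grade case (label 0); ties count as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in pos
        for q in neg
    )
    return wins / (len(pos) * len(neg))
```

An AUC of 0.5 corresponds to a score that does not separate the two grades, and an AUC of 1.0 to perfect separation.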

Radiomics nomogram construction

Multivariable logistic regression analysis was performed to identify independent predictors among the radiomics signature and the candidate clinical predictors, including age, gender, body mass index, CT-determined tumor size, and CT-determined T stage. Variance inflation factors of the predictors were calculated for collinearity diagnosis. The ROC curve and calibration curve were plotted to assess the performance of the radiomics nomogram in the training set.
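
For readers unfamiliar with the collinearity diagnosis, the sketch below computes variance inflation factors (VIFs) as the diagonal of the inverse of the predictor correlation matrix, which is equivalent to 1/(1 − R²) from regressing each predictor on the others. It is an illustration only: all function names are hypothetical, predictors are assumed to be numeric and non-constant, and the handling of categorical predictors in the R packages used for this study may differ.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("constant predictor has no defined correlation")
    return cov / (var_x * var_y) ** 0.5


def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting."""
    n = len(matrix)
    aug = [
        list(row) + [1.0 if i == j else 0.0 for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("predictors are perfectly collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def variance_inflation_factors(columns):
    """columns: list of predictor columns; returns one VIF per predictor."""
    corr = [[pearson(a, b) for b in columns] for a in columns]
    inverse = invert(corr)
    return [inverse[i][i] for i in range(len(columns))]
```

A VIF of 1 indicates a predictor that is uncorrelated with the others, and larger values indicate increasing collinearity.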

Validation of the radiomics nomogram

The performance of the CT-based nomogram was validated in an independent validation cohort. The AUC was calculated to assess the predictive performance in the validation set. The calibration curve was plotted to assess the agreement between predicted and observed outcomes in the validation set.
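
The idea behind a calibration curve can be sketched with a simple grouped version: patients are sorted by predicted probability, split into equally sized groups, and the mean predicted probability in each group is compared with the observed fraction of high-grade tumors. This is a simplified illustration with a hypothetical function name; the smoothing or resampling applied by the R packages used in this study is not reproduced here.

```python
def calibration_bins(probabilities, outcomes, n_bins=5):
    """Return (mean predicted probability, observed event rate, group size)
    for each of n_bins groups of patients ordered by predicted probability.

    outcomes are coded 1 for high grade and 0 for low grade.
    """
    pairs = sorted(zip(probabilities, outcomes))
    n = len(pairs)
    bins = []
    for b in range(n_bins):
        chunk = pairs[b * n // n_bins:(b + 1) * n // n_bins]
        if not chunk:
            continue  # fewer patients than groups
        mean_predicted = sum(p for p, _ in chunk) / len(chunk)
        observed_rate = sum(y for _, y in chunk) / len(chunk)
        bins.append((mean_predicted, observed_rate, len(chunk)))
    return bins
```

For a well-calibrated model, the points (mean predicted probability, observed rate) lie close to the 45-degree line.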

Clinical usefulness

Decision curve analysis was performed to assess the clinical usefulness of the CT-based nomogram by calculating the net benefits at different threshold probabilities in the training set and validation set.
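
The net benefit underlying decision curve analysis can be written as TP/n − (FP/n) × pt/(1 − pt), where pt is the threshold probability, TP and FP are the numbers of true-positive and false-positive classifications at that threshold, and n is the number of patients. The following minimal sketch, with hypothetical function names, computes this quantity for a model and for the treat-all strategy; the treat-none strategy has a net benefit of zero by definition.

```python
def net_benefit(probabilities, outcomes, threshold):
    """Net benefit of treating patients whose predicted probability
    is at or above the threshold probability."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(outcomes)
    tp = sum(1 for p, y in zip(probabilities, outcomes)
             if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(probabilities, outcomes)
             if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1.0 - threshold)


def net_benefit_treat_all(outcomes, threshold):
    """Net benefit when every patient is treated as high grade."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    prevalence = sum(outcomes) / len(outcomes)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)
```

Plotting these quantities over a grid of threshold probabilities yields the decision curve; a model is useful at a given threshold when its net benefit exceeds both the treat-all value and zero.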

Statistical analysis

All statistical analyses were performed with R software, version 4.0.0 (The R Foundation for Statistical Computing; http://www.r-project.org). The detailed R packages used in this study are listed in Supporting Method 2. A p value < 0.05 was considered to be statistically significant.

Results

Patient characteristics

The study flowchart is shown in Fig. 1. The characteristics of ccRCC patients are listed in Table 1. There were 143 ccRCC patients in the training set (97 low grade and 46 high grade) and 115 ccRCC patients in the validation set (40 low grade and 75 high grade). No significant differences in age or gender were found.

Fig. 1 Study flowchart and radiomics workflow

Table 1 Characteristics of ccRCC patients in the training set and validation set

Radiomics feature selection

Figure 1 shows the radiomics workflow of this study. In total, 1316 radiomics features were extracted from the arterial-phase CT images of each ccRCC patient. Twenty of these features were selected to construct the radiomics signature using the LASSO logistic regression algorithm (Fig. 2a, b). Spearman rank correlations among the selected features were low, indicating that they contribute complementary information to the radiomics signature (Fig. 2c). The selected features and corresponding coefficients are presented in Supporting Table 1.
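
As an illustration of the correlation measure used here, Spearman rank correlation is the Pearson correlation of the ranks of two features, with tied values sharing the mean of their ranks. The sketch below uses hypothetical function names and assumes that neither feature is constant; it is not the code used in this study.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation between two equal-length sequences."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant feature has no defined rank correlation")
    return cov / (var_x * var_y) ** 0.5
```

Applying spearman to every pair of selected features gives a correlation matrix of the kind summarized in Fig. 2c.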

Fig. 2 Radiomics feature selection with the LASSO logistic regression model. a The tuning parameter (λ) was selected by 10-fold cross-validation using the minimum criterion. The binomial deviance is plotted versus log(λ). The upper x-axis indicates the average number of radiomics features. The lower x-axis indicates the log(λ) value. The optimal λ value of 0.045, with log(λ) = −0.310, was selected. b LASSO coefficient profiles of the 1316 radiomics features. The upper x-axis indicates the average number of radiomics features. The lower x-axis indicates the log(λ) value. The dotted vertical line is drawn at log(λ) = −0.310. c Spearman rank correlation among the 20 features selected by the LASSO logistic regression method in the training set

Performance of radiomics signature

There was a significant difference in radiomics scores between low-grade and high-grade ccRCC patients in the training set (p < 0.001) (Fig. 3a). The radiomics signature yielded a favorable AUC (0.914; 95% CI, 0.868–0.961) in the training set (Fig. 3b). The optimal radiomics score cutoff for grade classification was −0.855, based on the maximum Youden index in the training set. A significant association between radiomics score and grade was also found in the validation set (p < 0.001) (Fig. 3c). The radiomics signature achieved an AUC of 0.846 (95% CI, 0.771–0.921) in the validation set (Fig. 3d).
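
The Youden index at a given cutoff is J = sensitivity + specificity − 1, and the optimal cutoff is the one that maximizes J. A minimal sketch of this search is given below. The function name is hypothetical, every observed score is tried as a candidate cutoff, and patients with a score at or above the cutoff are classified as high grade; the ROC software used in this study may choose candidate cutoffs differently (for example, midpoints between adjacent scores).

```python
def youden_cutoff(scores, labels):
    """Return (cutoff, J) maximizing J = sensitivity + specificity - 1.

    labels are coded 1 for high grade and 0 for low grade.
    The lowest cutoff is returned when several cutoffs share the maximum J.
    """
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = sum(1 for y in labels if y == 0)
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")
    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < cutoff and y == 0)
        j = tp / n_pos + tn / n_neg - 1.0
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j
```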

Fig. 3 The performance of the radiomics signature for prediction of nuclear grades. a Radiomics score of ccRCC patients in the training set. b ROC curve of the radiomics signature in the training set. c Radiomics score of ccRCC patients in the validation set. d ROC curve of the radiomics signature in the validation set

Radiomics nomogram construction

Multivariable logistic regression was performed to identify independent predictors of ccRCC nuclear grade (Table 2). The radiomics signature and CT-determined T stage were identified as independent predictors. The variance inflation factors of the predictors ranged from 1.05 to 1.40, indicating no substantial collinearity. A comprehensive model incorporating the radiomics signature and CT-determined T stage was established and presented as a nomogram (Fig. 4a).
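
A nomogram of this kind is a graphical representation of a logistic regression model, so the predicted probability of high-grade ccRCC can also be obtained directly from the logistic function. The sketch below illustrates this under stated assumptions: the function name is hypothetical, the intercept and coefficients are placeholders to be supplied by the user (the fitted values are those reported in Table 2 and are not reproduced here), and CT-determined T stage is coded 0 for cT1-cT2 and 1 for cT3-cT4, as in Fig. 4a.

```python
import math


def predict_high_grade_probability(rad_score, ct_t_stage,
                                   intercept, beta_rad, beta_stage):
    """Probability of high-grade ccRCC from a two-predictor logistic model.

    rad_score:  radiomics score of the patient
    ct_t_stage: 0 for cT1-cT2, 1 for cT3-cT4
    intercept, beta_rad, beta_stage: placeholder model coefficients
    """
    if ct_t_stage not in (0, 1):
        raise ValueError("ct_t_stage must be 0 (cT1-cT2) or 1 (cT3-cT4)")
    linear_predictor = (intercept
                        + beta_rad * rad_score
                        + beta_stage * ct_t_stage)
    return 1.0 / (1.0 + math.exp(-linear_predictor))
```

With positive coefficients, a higher radiomics score or a higher CT-determined T stage increases the predicted probability of high-grade disease.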

Table 2 Multivariable logistic regression analysis of the radiomics score and clinical risk factors in the training set

Fig. 4 Development of the radiomics nomogram and its performance. a The CT-based nomogram was constructed with the radiomics signature and CT-determined T stage in the training set (0, cT1-cT2; 1, cT3-cT4). b ROC curve of the CT-based nomogram in the training set. c Calibration curve of the CT-based nomogram in the training set

Performance of the radiomics nomogram

The CT-based nomogram achieved favorable predictive performance in the training set, with an AUC of 0.929 (95% CI, 0.886–0.972) (Fig. 4b). The calibration curve of the CT-based nomogram showed good agreement between predicted and observed outcomes in the training set (Fig. 4c).

Validation of the radiomics nomogram

The AUC of the CT-based nomogram in the validation set was 0.876 (95% CI, 0.812–0.939), indicating satisfactory discrimination (Fig. 5a). Good calibration was also observed in the validation set (Fig. 5b). In addition, ccRCC patients in the validation set were divided into a high-risk group and a low-risk group according to the nomogram. Kaplan-Meier (K-M) survival analysis showed a significant difference between the high-risk and low-risk groups (p = 0.0028), suggesting that the CT-based nomogram has prognostic value (Supporting Fig. 2).

Fig. 5 Validation of the radiomics nomogram. a ROC curve of the CT-based nomogram in the validation set. b Calibration curve of the CT-based nomogram in the validation set

Clinical usefulness of the radiomics nomogram

We then performed decision curve analysis to assess the clinical usefulness of the radiomics nomogram. The decision curves showed that the radiomics nomogram achieved more net benefit than either the treat-all or the treat-none strategy across the full range of threshold probabilities in both the training and validation sets (Fig. 6a, b) (Supporting Fig. 3).

Fig. 6 Decision curve analysis (DCA) of the CT-based nomogram. DCA comparing the net benefit of the CT-based nomogram versus the radiomics signature or CT-determined T stage in the training set (a) and validation set (b). The x-axis indicates the threshold probability. The y-axis indicates the net benefit

Discussion

We developed and validated a noninvasive CT-based nomogram to preoperatively predict nuclear grades in ccRCC. This CT-based nomogram incorporated a 20-feature radiomics signature and CT-determined T stage. Our study demonstrated that this CT-based nomogram achieved favorable performance and might serve as a noninvasive tool for ccRCC grading and clinical decision-making.

Previous studies demonstrated that molecular and clinical characteristics of tumors could be reflected in medical images [18,19,20]. Radiomics features extracted from medical images can reflect the heterogeneity of tumors and be applied in tumor grading [21,22,23]. Many previous studies extracted features only from original images. Cui et al. extracted 276 features from CT images and built a CT-based model with an accuracy of 0.79 [15]. In this study, we extracted features from original images, wavelet-filtered images, and Laplacian of Gaussian (LoG)-filtered images. A total of 1316 radiomics features were extracted from the arterial-phase CT images of each ccRCC patient. These features were reduced to 20 potential predictors using the LASSO logistic regression method. Marker panels composed of multiple markers have been increasingly applied to improve diagnostic or predictive performance in recent years [24, 25]. For instance, a 20-gene model was identified for prediction of the pathological node status in patients with bladder cancer [26]. A 35-gene signature was constructed to predict vascular invasion in hepatocellular carcinoma [27]. Following this approach, we developed a radiomics signature with 20 selected features, which exhibited favorable discrimination with an AUC of 0.914 in the training set. Similarly, the radiomics signature showed satisfactory performance in ccRCC grading with an AUC of 0.846 in the validation set.

We then considered clinical risk factors, which can provide complementary information and improve the predictive accuracy of the model [28, 29]. Several radiomics models have been built to predict nuclear grades of ccRCC [13,14,15]. These previous studies focused on radiomics features and overlooked clinical risk factors. Shu et al. developed a combined CT-based model with an AUC of 0.82 [13]. In this study, we performed multivariable logistic regression and identified the radiomics signature and CT-determined T stage as independent predictors. We therefore constructed an easy-to-use CT-based nomogram based on the multivariable logistic regression model. The CT-based nomogram could effectively identify high-grade ccRCC patients before surgery, which might facilitate personalized treatment decisions and improve long-term prognosis. The AUC of the CT-based nomogram was 0.929 and 0.876 in the training and validation sets, respectively, indicating greater predictive accuracy than either the radiomics signature or CT-determined T stage alone.

Another important consideration for a clinical predictive model is its clinical usefulness. Decision curve analysis was performed to assess the overall net benefit of the CT-based nomogram [30]. The results indicated that the nomogram is a better predictive tool than CT-determined T stage for clinical practice in both the training and validation sets. Moreover, both predictors incorporated in the CT-based nomogram are available from routine contrast-enhanced CT. Therefore, the CT-based nomogram may serve as a reliable, noninvasive, and easily accessible tool to preoperatively differentiate high-grade from low-grade ccRCC.

There are several limitations to this study. Firstly, this is a retrospective study, and a prospective randomized clinical trial is needed to obtain high-level evidence for clinical application. Secondly, manual tumor segmentation was time-consuming, and automatic segmentation methods should be developed in the future. Lastly, genomic and proteomic information was not considered in this study. Genomic and proteomic classifiers have been shown to be effective predictive tools in ccRCC. Integrating radiomics with genomic and proteomic classifiers has great potential to improve the performance of prediction tools.

In conclusion, we developed and validated a CT-based nomogram that incorporated a radiomics signature and CT-determined T stage to preoperatively predict ccRCC grades. It may serve as an efficient, noninvasive, and convenient tool to assist clinical decision-making for ccRCC patients.