Renal cell carcinoma (RCC) is one of the most common malignant tumors of the kidney.1 Among RCCs, clear cell RCC (ccRCC) is the most common pathological subtype, with a relatively poor prognosis.2,3

In the latest World Health Organization (WHO) classification of tumors of the urinary system and male genital organs, a pathological nuclear grading system, the WHO/International Society of Urological Pathology (WHO/ISUP) grading system, was established.4 This grading system regards nuclear grade as an important prognostic predictor for ccRCC.5,6,7 Compared with low-grade ccRCCs, high-grade ccRCCs have a higher growth rate and a poorer prognosis.8,9,10 Nuclear grade can therefore reflect the aggressiveness of ccRCC11; however, it can only be confirmed by pathological examination of biopsy or surgical tissue, both of which are invasive.12

Radiomics is a non-invasive examination approach that has been widely used in diagnosis and treatment outcome evaluations in oncology.13,14,15,16,17 Through deep mining the quantitative radiomic features from medical images, radiomics can select key features that reflect clinical or pathological information of the tumors, to assist diagnosis and prognosis.18,19,20,21 Thus, the radiomics method provides a potential tool for non-invasive nuclear grading.

In this study, we developed a radiomic approach to analyze the association between preoperative CT images and nuclear grades of ccRCC.

Materials

Patients

This retrospective study was performed in accordance with the Declaration of Helsinki and was approved by the ethical committee of two hospitals. The requirement for informed consent was waived.

The inclusion criteria were (1) patients with pathologically confirmed ccRCC; (2) patients who underwent dynamic enhanced kidney CT scans before surgery; and (3) patients whose nuclear grades were available from the pathology reports. The exclusion criteria were (1) the CT scan was performed more than 1 week before surgery; and (2) the CT slice thickness was not 5 mm.

Finally, we collected 247 eligible ccRCC patients from center A (Wuxi People’s Hospital) and divided them into a training set (n = 124) and an internal test set (n = 123). Furthermore, an external test set (n = 73) was enrolled from center B (Changzhou No. 2 People’s Hospital). Details of the three sets are shown in Table 1.

Table 1 Clinical information of the ccRCC patients in all sets

Clinical Characteristics and Nuclear Grade

As shown in Table 1, the preoperative clinical characteristics of patients included sex, age, history of chronic diseases (hypertension and tumor history), chief complaint (gross hematuria or lumbar discomfort), presence of urinary occult blood, renal function, and T staging (determined by the radiologist from a CT scan, according to the American Joint Committee on Cancer [AJCC] guidelines).22

All pathological sections of patients’ surgical specimens were reviewed by two independent histopathologists to estimate the nuclear grades. According to the 2016 WHO/ISUP grading system, the nuclear grades were defined as grades 1, 2, 3, and 4 on pathologic images at ×100 and ×400 magnification.4 Where necessary, discrepancies were resolved by a third histopathologist.

Computed Tomography Image Acquisition

The CT scans in center A were performed using a Sensation-64 scanner (SOMATOM Definition; Siemens, Munich, Germany), with parameters that included 120 kVp, automatic mAs, a pitch of 0.55, a scan field of view of 360 mm, a pixel matrix size of 512 × 512, and a reconstructed slice thickness of 5 mm. The CT scans in center B were performed using a Sensation-64 scanner (SOMATOM Force; Siemens, Germany), with parameters that included 120 kVp, automatic mAs, a pitch of 0.80, a scan field of view of 360 mm, a pixel matrix size of 512 × 512, and a reconstructed slice thickness of 5 mm.

During scanning, a non-enhanced CT scan was acquired, followed by post-contrast enhanced scans. A non-ionic iodinated contrast material, Ioversol (320 mgI/mL), was injected intravenously with an automatic high-pressure injector at a flow rate of 3 mL/s, followed by a 30 mL saline chaser. The total dosage of the contrast material was 1.5 mL/kg. After the start of the injection, a corticomedullary phase (CMP) scan was performed at 30 s and a nephrographic phase (NP) scan was performed at 70 s. The scan area ranged from the dome of the liver to the symphysis pubis.

Tumor Volume of Interest Segmentation

Manual segmentation of the three-dimensional (3D) volume of interest (VOI) of the tumor was conducted using ITK-SNAP software (version 3.6.0; http://www.itksnap.org) on unenhanced, CMP, and NP CT images. The VOIs were segmented by a radiologist with more than 5 years’ experience in ccRCC diagnosis. In addition, 30 patients were randomly selected and segmented by another radiologist to construct a test–retest set and to calculate the intraclass correlation coefficient (ICC) of the radiomic features.23
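As an illustration of the test–retest reproducibility measure, the following is a minimal sketch of an ICC computation for one feature measured on the same patients from two segmentations. The text does not state which ICC form was used; the two-way random-effects, absolute-agreement, single-rater form, ICC(2,1), is assumed here, and the feature names and values are hypothetical toy data, not study data.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    ratings: list of rows, one per patient; each row holds the feature value
    obtained from each rater's segmentation. The ICC form is an assumption.
    """
    n = len(ratings)        # number of patients
    k = len(ratings[0])     # number of raters (segmentations)
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares of the two-way ANOVA decomposition
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)               # between-patient mean square
    msc = ss_cols / (k - 1)               # between-rater mean square
    mse = ss_err / ((n - 1) * (k - 1))    # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Hypothetical toy values: 4 patients x 2 segmentations per feature
toy_features = {
    "stable_feature": [[10.0, 10.2], [12.0, 11.9], [15.0, 15.1], [20.0, 19.8]],
    "unstable_feature": [[10.0, 14.0], [12.0, 9.0], [15.0, 11.0], [20.0, 16.0]],
}

# Keep only features whose ICC exceeds the 0.75 threshold
robust = [name for name, r in toy_features.items() if icc_2_1(r) > 0.75]
```

In this toy example only the first feature passes the threshold; with real data, the same filter would be applied to every extracted feature.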

Radiomic Feature Extraction

This study used the radiomic features proposed by Aerts et al.24 and Lambin et al.25 We extracted four types of 3D features in the VOI: (1) intensity features, i.e. the first-order statistics features; (2) textural features; (3) shape features, i.e. size, maximum diameter, and maximum section of the tumor; and (4) wavelet features, i.e. intensity and texture features from wavelet decomposition of the original images. For all features, we used the normalization function to obtain the relative value instead of the overall resize. The features were continuous variables and no binning was performed. Note that the variations of a feature between different CT phases were also used as candidate features. In total, we extracted 647 features per phase, 647 variation features per two phases, and 3882 features in total per patient (the process is shown in Fig. 1).
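The feature count above can be checked with a short sketch. It assumes that a variation feature is the simple difference of the same feature between two phases (the text does not define the variation formula) and that all three phase pairs are used, which is what the stated total of 3882 implies. The feature name and values in the toy dictionary are hypothetical.

```python
from itertools import combinations

PHASES = ("unenhanced", "CMP", "NP")
FEATURES_PER_PHASE = 647  # number stated in the text


def variation_features(features_by_phase):
    """Difference of each feature between every pair of phases.

    Using a plain difference is an assumption made for illustration.
    """
    variations = {}
    for early, late in combinations(PHASES, 2):
        for name, early_value in features_by_phase[early].items():
            late_value = features_by_phase[late][name]
            variations[f"{name}:{late}-{early}"] = late_value - early_value
    return variations


# 3 single-phase sets plus 3 phase-pair variation sets, 647 features each
phase_pairs = list(combinations(PHASES, 2))
total_features = FEATURES_PER_PHASE * (len(PHASES) + len(phase_pairs))

# Hypothetical toy input with a single feature per phase
toy = {
    "unenhanced": {"entropy": 3.0},
    "CMP": {"entropy": 4.5},
    "NP": {"entropy": 4.0},
}
toy_variations = variation_features(toy)
```

With three phases and three phase pairs, the arithmetic gives 647 × 6 = 3882 candidate features per patient, matching the total reported above.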

Fig. 1

Radiomics workflow diagram. VOI volume of interest, AUC area under the curve

Model Building and Evaluation

First, we used the test–retest data (n = 30), segmented by different radiologists, to evaluate the reproducibility of features between different segmentations. We preserved features with an ICC value > 0.75. Second, we used the maximizing independent classification information (MICI) criterion function26 to select the features associated with nuclear grades. On this basis, we used the recursive feature elimination with cross-validation (RFECV) method and the random forest (RF) classifier to find the best radiomic features for nuclear grading.27 Finally, the selected features were combined with clinical characteristics to generate a final combined model. For comparison, we also built a clinical model using only clinical characteristics and a radiomic model using only radiomic features.
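The RFECV step with an RF classifier can be sketched with scikit-learn as follows. This is a minimal illustration on synthetic data, not the study pipeline: the data set, the number of trees, the number of folds, the scoring metric, and the minimum number of features are all assumptions, and the MICI pre-selection step is not reproduced.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFECV
from sklearn.model_selection import StratifiedKFold

# Synthetic stand-in for the candidate features and 4-class grades (NOT study data)
X, y = make_classification(
    n_samples=120,
    n_features=12,
    n_informative=4,
    n_redundant=2,
    n_classes=4,
    n_clusters_per_class=1,
    random_state=0,
)

# Random forest provides the feature importances that drive the elimination
forest = RandomForestClassifier(n_estimators=50, random_state=0)

# Remove one feature per iteration; cross-validation picks the best subset size
selector = RFECV(
    estimator=forest,
    step=1,
    cv=StratifiedKFold(n_splits=3),
    scoring="accuracy",
    min_features_to_select=2,
)
selector.fit(X, y)

selected_mask = selector.support_   # boolean mask over the 12 candidate features
n_selected = selector.n_features_   # size of the cross-validated best subset
X_selected = selector.transform(X)  # data restricted to the selected features
```

The selected columns (`X_selected`) could then be concatenated with encoded clinical characteristics to train a combined classifier, in the spirit of the final model described above.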

The ability of radiomic features and models to categorize nuclear grades was evaluated by the area under the curve (AUC) of the receiver operating characteristic (ROC) curve.
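For a four-grade problem, one common way to report per-grade AUC values is a one-vs-rest scheme, in which each grade in turn is treated as the positive class. The sketch below assumes that scheme (the text does not state it explicitly) and uses the rank interpretation of AUC: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The grades and predicted probabilities are hypothetical toy values.

```python
def binary_auc(labels, scores):
    """AUC as P(score of a positive > score of a negative); ties count 0.5."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        return None  # undefined when one of the two classes is absent
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def one_vs_rest_auc(grades, prob_rows, classes=(1, 2, 3, 4)):
    """Per-grade AUC, treating each grade as positive against all others."""
    result = {}
    for idx, grade in enumerate(classes):
        labels = [1 if g == grade else 0 for g in grades]
        scores = [row[idx] for row in prob_rows]
        result[grade] = binary_auc(labels, scores)
    return result


# Hypothetical toy data: true grades and predicted probabilities for grades 1-4
grades = [1, 1, 2, 2, 3, 3]
probs = [
    [0.7, 0.2, 0.1, 0.0],
    [0.4, 0.5, 0.1, 0.0],
    [0.3, 0.6, 0.1, 0.0],
    [0.5, 0.3, 0.2, 0.0],
    [0.1, 0.2, 0.6, 0.1],
    [0.2, 0.2, 0.5, 0.1],
]
aucs = one_vs_rest_auc(grades, probs)
```

In the toy data no grade 4 case is present, so the grade 4 AUC is undefined and returned as `None`; a test set that contains no cases of a given grade yields no AUC for that grade in the same way.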

Statistical Analyses

All statistical analyses were performed using Python version 3.6; the RFECV feature selection, RF classifier, and AUC calculation used the Python scikit-learn package.28,29

Results

Based on the test–retest data set, we obtained 2668 robust radiomic features with an ICC > 0.75 (Fig. 2). A total of 46 features determined by the MICI method were selected as candidates for classification, which included the maximum, minimum, median, and range values of the first three-order wavelet decompositions in the three phases of CT images, and the variations of texture features and entropy between the three phases.

Fig. 2

Stable radiomic feature selection using test–retest set

Based on both the RFECV method and the RF model assessment, we obtained the most effective 11 features to build a radiomic model. Selected features included (1) texture features of CMP: gray-level co-occurrence matrix (GLCM) after wavelet decomposition, and gray level size zone matrix (GLSZM); (2) texture features of the NP: entropy and GLCM after wavelet decomposition; (3) variation of texture features between the CMP and the non-enhanced phase: the energy and neighborhood gray tone difference matrix (NGTDM); and (4) variation of texture features between the NP and the non-enhanced phase: the GLCM and NGTDM. The clustered map of the features in the radiomic model is shown in Fig. 3.

Fig. 3

Radiomic feature cluster diagram. ‘a’ and ‘v’ represent the features extracted in the corticomedullary phase and nephrographic phase CT, respectively; ‘an’ and ‘vn’ represent the variation of a feature between the corticomedullary phase and the non-enhanced phase, and between the nephrographic phase and the non-enhanced phase, respectively; and ‘Coif1’ means that the feature is decomposed by the first-order wavelet. CT computed tomography

The ROC curves of the radiomic model, clinical model, and combined model are shown in Fig. 4. In the internal test set, the AUC values of the radiomic model for the four-class classification reached 0.71, 0.73, 0.81, and 0.80 for grades 1 to 4, respectively. The clinical model (including sex, age, T stage, hypertension, chief complaint, and urine occult blood) showed high performance in predicting grade 4, with an AUC of 0.88 in the internal test set, indicating that the clinical factors were highly discriminative for grade 4. Both the radiomic and clinical models performed worse in the external test set. Combining radiomic features and clinical characteristics, the final combined model had AUCs of 0.77, 0.75, 0.79, and 0.85 in the internal test set. The AUCs of the combined model in the external test set reached 0.75, 0.68, and 0.73 (there were no grade 4 cases in the external test set). We found that adding clinical characteristics to the radiomic model helped improve performance. Compared with the radiomic and clinical models, the combined model performed better and was more robust.

Fig. 4

ROC curves of different models on the internal test set and external test set. ROC receiver operating characteristic, AUC area under the curve

As shown in Fig. 5a, the mean values of the clinical characteristics for each grade in the internal test set showed that nuclear grade 4 had the worst conditions (T stage, hypertension, urine occult blood); hence, the clinical model was effective in discriminating grade 4 in our study. Figure 5b shows the mean distribution of the final radiomic features used in the radiomic and combined models.

Fig. 5

Analysis of the effective features. a Distribution of the mean values of different clinical features. b Distribution of the 11 radiomic features. Note: sex (0: female; 1: male), age (divided by 100), T stage (0: T1A; 1: T1B; 2: T2; 3: T3; 4: T4), hypertension (0: without hypertension; 1: with hypertension), chief complaint (1: hematuria; 2: lumbar discomfort; 3: other), urine occult blood (0: without urine occult blood; 1: with urine occult blood)

Discussion

In this multicenter study, we investigated the relationship between the nuclear grades of ccRCC and radiomic features in non-enhanced phase, CMP, and NP CT images, and found that the radiomic features in all CT phases were associated with nuclear grades. The combination of radiomic features and clinical characteristics discriminated nuclear grades well, and it performed better than the radiomic and clinical models alone.

We found that the texture features and perplexity of the tumor VOI had significant variation in different nuclear grades. The WHO/ISUP grading system defined tumor grades 1–3 based on nucleolar prominence, and defined grade 4 based on the presence of pronounced nuclear pleomorphism, tumor giant cells, and/or rhabdoid and/or sarcomatoid differentiation. The proportion of nucleoli, nuclear pleomorphism, tumor giant cells, and rhabdoid or sarcomatoid differentiation under microscope reflected cell density and tumor heterogeneity, which could be associated with texture features and perplexity of the tumor in the CT images.

Ding et al. performed a preliminary study on nuclear grading.30 Compared with their work, we used the latest urological pathology (WHO/ISUP) grading system standard (4-level grading) instead of the Fuhrman grading system (2-level grading). In terms of method, our study used CMP and NP CT images, which were not included in the work of Ding et al. Moreover, the wavelet transform and variation of feature between different phases were used in our study to generate more features.

Our study had some limitations. First, although this study was performed at two centers, the sample size was relatively small. Further validation of our model on large-scale data sets is necessary. Second, we used a CT slice thickness of 5 mm and did not consider other thicknesses. Third, this was a retrospective study and further prospective validation is needed.

Conclusions

This study proposed a non-invasive method based on preoperative CT images and radiomics methods for nuclear grading in ccRCC, which could assist clinicians in diagnosis and treatment decision making.