Introduction

Nephroblastoma/Wilms tumor represents the most common pediatric renal tumor, accounting for more than 90% of all kidney tumors in children [1]. In Europe, it is treated per the guidelines of the International Society of Pediatric Oncology (SIOP) with pre-operative chemotherapy to induce tumor shrinkage and facilitate subsequent surgical resection. Further treatment is based on pathological staging and the exact nephroblastoma subtype on histopathological work-up, which is characterized by the response to the neoadjuvant chemotherapy. Thus, adjuvant treatment ranges from follow-up only in patients with low-risk tumors and SIOP stage I, up to intensive adjuvant chemotherapy and radiation in patients with diffuse anaplasia or residual blastemic cells on histopathology [2]. Since initial biopsy of the tumor is not recommended, the ability to non-invasively characterize the histopathological tumor subtype before therapy (i.e., predict the response to treatment) would be of great clinical value, as it might allow for more personalized treatment.

Imaging plays a crucial role in detection and staging of pediatric renal tumors in the SIOP approach, in which the low risk of inadvertently treating other rare pediatric renal tumors mistakenly thought to be nephroblastomas is accepted [3]. While MRI is the modality of choice in most cases, the radiological assessment generally relies on the depiction of the tumor on ‘anatomical’ sequences, which allow for the determination of stage and possible infiltration of the renal vein or surrounding structures but do not provide any information on the underlying tumor physiology. Diffusion-weighted imaging (DWI) is an MRI technique that allows quantification of the “Brownian motion” of water molecules through calculation of the apparent diffusion coefficient (ADC). As the local microenvironment is known to differ significantly between tumorous and healthy tissue due to the higher cell count and changed tissue architecture in tumors, DWI has become part of the standard work-up in a variety of tumors, including renal tumors in adults [4]. While a single mean or median ADC value is usually measured in routine clinical practice, recent research in non-pediatric tumor patients has indicated that a more refined, histogram-based analysis of ADC values [5] or even a mathematical assessment of the inherent tumor ‘texture’ might better account for the known spatial intra-tumor heterogeneity and thus improve the differentiation between tumor subtypes or different grades of tumor aggressiveness [6].

To date, only a small number of studies have investigated the use of DWI in Wilms tumor [7,8,9,10,11,12], and given the overall rarity of the disease, these studies have understandably been limited by small sample sizes. However, their preliminary results have been promising: For example, they have shown a correlation between tumor cell density and ADC values in Wilms tumor [8] and possible differences in histogram ADC values between different subtypes [7], thus providing incentive to further assess the value of DWI in pediatric patients with renal tumors.

The purpose of this retrospective multi-center study was therefore to comprehensively assess the value of diffusion-weighted MRI in the pre-therapeutic evaluation of pediatric renal tumors, including the identification of histological subtypes of nephroblastoma with known adverse outcomes.

Materials and methods

Patients

The institutional review board approved this retrospective, multi-center study and waived the requirement for informed consent. A retrospective search was performed on the central database of the SIOP 2001 trial as of June 2016 for pediatric patients with renal tumors and available MRI examinations. A total of 68 centers participated in the SIOP 2001 trial. This initial search yielded 724 patients, for 250 of whom a pre-therapeutic MRI with a diffusion-weighted sequence was available. Some patients had to be excluded, for example, for starting chemotherapy before MRI or having upfront surgery because of symptoms (Fig. 1). Because patients with rare pediatric kidney tumors (adenoma, clear cell sarcoma, completely necrotic nephroblastoma, mesoblastic nephroma, renal cell carcinoma, and rhabdoid tumor of the kidney) were very few in number (n ≤ 3 in each subgroup), they were excluded from all statistical analyses. The final cohort therefore consisted of 130 pediatric renal tumors in 122 patients.

Fig. 1

Flow chart detailing the patient inclusion process

All imaging examinations and clinical information were anonymized. The standard of reference for the histopathological subtype was surgical histopathology of the resected specimen. The presence of metastases at the time of diagnosis was based on the clinical assessment; the local pathological SIOP stage was extracted from the pathology report. Pre-operative chemotherapy according to the design of the SIOP 2001 trial consisted of 4 weeks of dactinomycin/vincristine (stage I–III) or 6 weeks of dactinomycin/vincristine/doxorubicin in stage IV patients [2].

MR imaging and analysis

All examinations were performed at a field strength of 1.5 (n = 102) or 3 Tesla (n = 20) using a dedicated MRI protocol that included a diffusion-weighted sequence (echo-planar imaging [EPI] sequence, repetition time [TR] = 928–34792 ms, echo time [TE] = 46–139 ms, matrix of 96 × 78–226 × 200; 17–41 cm field of view; 2–9-mm slice thickness). ADC maps were generated voxel-wise using a monoexponential model and the lowest and highest available b values (mean max. b value: 750 s/mm², range 250–1000 s/mm²). Of the 130 tumors included, 111 were imaged on scanners manufactured by Siemens Healthineers (Erlangen, Germany), 18 on scanners manufactured by Philips Medical Systems (Amsterdam, Netherlands), and 1 on a scanner manufactured by General Electric Healthcare (Boston, USA).
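To illustrate the monoexponential model, the ADC of a single voxel can be obtained from the signal intensities at the lowest and highest b values as ADC = ln(S_low / S_high) / (b_high − b_low). The following Python sketch is not the software used in this study; the function names and the example signal values are hypothetical and serve only to show the calculation.

```python
import math

def adc_two_point(s_low, s_high, b_low, b_high):
    """Monoexponential ADC (in mm^2/s) from two signal intensities.

    Assumes S(b) = S0 * exp(-b * ADC), with b values given in s/mm^2.
    Returns None where the signal is non-positive (e.g., background voxels).
    """
    if s_low <= 0 or s_high <= 0 or b_high <= b_low:
        return None
    return math.log(s_low / s_high) / (b_high - b_low)

def adc_map(img_low, img_high, b_low, b_high):
    """Apply the two-point formula voxel-wise to two equally shaped 2D lists."""
    return [
        [adc_two_point(lo, hi, b_low, b_high) for lo, hi in zip(row_lo, row_hi)]
        for row_lo, row_hi in zip(img_low, img_high)
    ]

# Hypothetical 2 x 2 slice acquired at b = 0 and b = 800 s/mm^2
signal_b0 = [[1000.0, 900.0], [0.0, 800.0]]
signal_b800 = [[400.0, 300.0], [0.0, 500.0]]
adc = adc_map(signal_b0, signal_b800, 0, 800)
```

Because only two b values enter the calculation, the resulting ADC depends on which b values were available at each center, which is one source of the technical variability between examinations.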

Two readers (AMH and AL, with more than 5 years and more than 3 years of experience in interpreting genitourinary MR images, respectively), blinded to all histopathological and clinical patient information, independently identified each tumor by using all available MRI sequences to localize the tumor and its extent. Then, using ImageJ (version 1.47m, National Institutes of Health, Bethesda, MD, USA) [13], they each volumetrically assessed the tumor on diffusion-weighted images by drawing a region of interest (ROI) around the entire tumor on every slice. Care was taken not to include any surrounding tissue. In patients with multiple visible tumors in a single kidney (n = 9; e.g., in cases of nephroblastomatosis), the largest lesion was assessed. Bilateral tumors were found in eight patients and were analyzed separately. The data from these ROIs were then analyzed using in-house software written in Matlab (Mathworks Inc., Natick, MA, USA), which calculated the corresponding ADC and texture analysis values for each tumor on a voxel-wise basis. Histogram analysis included median, 12.5th percentile, 25th percentile, and 75th percentile as well as skewness and kurtosis of the distribution of tumor voxels within each volume (see Figs. 2a, b and 3a, b for examples). Texture analysis was based on Haralick texture features. In general, texture analysis aims to quantify the spatial dependence of tumor voxels and thereby, e.g., tumor heterogeneity or homogeneity (see Supplementary Appendix 1 for details on how the parameters were defined and calculated). The following metrics were computed: contrast, correlation, energy, entropy, homogeneity, dissimilarity, and sum of entropy [14].
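The histogram parameters can be illustrated with a short Python sketch. It is not the in-house Matlab software; the percentile interpolation rule, the moment-based (population) definitions of skewness and kurtosis, and the example voxel values are assumptions made for illustration only.

```python
import math

def percentile(values, q):
    """q-th percentile (0-100), linear interpolation between order statistics."""
    xs = sorted(values)
    pos = (len(xs) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)

def histogram_features(adc_values):
    """Histogram descriptors of all ADC values inside one tumor volume."""
    n = len(adc_values)
    mean = sum(adc_values) / n
    # Central moments of the voxel distribution
    m2 = sum((x - mean) ** 2 for x in adc_values) / n
    m3 = sum((x - mean) ** 3 for x in adc_values) / n
    m4 = sum((x - mean) ** 4 for x in adc_values) / n
    return {
        "median": percentile(adc_values, 50),
        "p12.5": percentile(adc_values, 12.5),
        "p25": percentile(adc_values, 25),
        "p75": percentile(adc_values, 75),
        # > 0: long tail toward high ADC values; < 0: tail toward low values
        "skewness": m3 / m2 ** 1.5 if m2 > 0 else float("nan"),
        # Non-excess kurtosis (a normal distribution gives 3)
        "kurtosis": m4 / m2 ** 2 if m2 > 0 else float("nan"),
    }

# Hypothetical ADC values (x 10^-3 mm^2/s) of the voxels in one ROI
roi_voxels = [0.62, 0.71, 0.75, 0.80, 0.84, 0.91, 1.10, 1.45, 1.90, 2.40]
features = histogram_features(roi_voxels)
```

In this hypothetical ROI, most voxels have low ADC values and a few have high values, so the skewness is positive.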

Fig. 2

a, b Representative ADC map of a blastemal nephroblastoma (female patient, 11 years old) and the corresponding histogram

Fig. 3

a, b Representative ADC map of a diffuse anaplastic nephroblastoma (female patient, 4 years old) and the corresponding histogram

Statistics

For descriptive analysis, mean, standard deviation, median, and quartiles for continuous variables, as well as absolute and relative frequencies for categorical data were calculated. Inter-reader agreement was examined via the intra-class correlation coefficient (ICC).
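The specific ICC model is not stated here. As an illustration only, the following Python sketch computes one common variant, the two-way random-effects, absolute-agreement, single-measurement form (often written ICC(2,1)), from the mean squares of a two-way analysis of variance. The choice of this variant, the function name, and the example readings are assumptions.

```python
def icc_2_1(ratings):
    """ICC(2,1) from a table with one row per tumor and one column per reader."""
    n = len(ratings)        # number of tumors
    k = len(ratings[0])     # number of readers
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares of the two-way ANOVA without replication
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)              # between-tumor mean square
    ms_cols = ss_cols / (k - 1)              # between-reader mean square
    ms_err = ss_err / ((n - 1) * (k - 1))    # residual mean square

    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )

# Hypothetical median ADC values (x 10^-3 mm^2/s): [reader 1, reader 2]
median_adc = [[1.10, 1.14], [0.85, 0.80], [1.42, 1.47], [0.96, 1.01], [1.25, 1.20]]
agreement = icc_2_1(median_adc)
```

Values close to 1 indicate that almost all of the variance stems from differences between tumors rather than from differences between readers.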

For further explorative analysis, Mann–Whitney U-tests were performed to determine whether any non-normally distributed continuous outcomes differed between two independent subgroups. The area under the nonparametric receiver operating characteristic (ROC) curve (AUC) was calculated for each parameter. Bootstrapped 95% confidence intervals of the AUC were calculated with 2000 stratified replications. Two cutoff values were derived separately for each reader: one maximizing the Youden index and one maximizing the sensitivity of the ROC curve. Sensitivity and specificity at each cutoff point were estimated. Given the hypothesis-generating purposes of this study, all analyses were explorative and no adjustments for multiple testing were performed. In this context, an association was deemed significant if the p value was < 0.05. All analyses were performed using SPSS V23 (IBM Inc., Armonk, USA) and R (version 3.4.4, The R Foundation for Statistical Computing; package pROC).
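The ROC analysis can be sketched in a few lines of Python. The sketch does not reproduce the pROC implementation used in this study; the function names, the percentile form of the bootstrap interval, the convention that higher values indicate the positive group, and the example data are assumptions. For parameters in which lower values indicate the positive group, the values can be negated first.

```python
import random

def auc(pos, neg):
    """Nonparametric AUC: probability that a positive case has a higher value
    than a negative case (ties count one half); equals U / (n_pos * n_neg)."""
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def youden_cutoff(pos, neg):
    """Cutoff maximizing J = sensitivity + specificity - 1,
    calling a case positive when its value is >= cutoff."""
    best = None
    for c in sorted(set(pos) | set(neg)):
        sens = sum(p >= c for p in pos) / len(pos)
        spec = sum(n < c for n in neg) / len(neg)
        j = sens + spec - 1
        if best is None or j > best[0]:
            best = (j, c, sens, spec)
    return best[1], best[2], best[3]

def bootstrap_auc_ci(pos, neg, n_boot=2000, alpha=0.05, seed=0):
    """Percentile confidence interval from a stratified bootstrap:
    positives and negatives are resampled separately, with replacement."""
    rng = random.Random(seed)
    stats = sorted(
        auc(rng.choices(pos, k=len(pos)), rng.choices(neg, k=len(neg)))
        for _ in range(n_boot)
    )
    lo_idx = int(n_boot * alpha / 2)
    return stats[lo_idx], stats[n_boot - lo_idx - 1]

# Hypothetical parameter values for two groups
group_pos = [1.8, 1.2, 1.5, 0.9, 1.1]
group_neg = [0.4, 0.7, 1.0, 0.3, 0.6, 1.3]

area = auc(group_pos, group_neg)
cutoff, sensitivity, specificity = youden_cutoff(group_pos, group_neg)
ci_low, ci_high = bootstrap_auc_ci(group_pos, group_neg)
```

Stratified resampling keeps the number of cases in each group fixed in every replication, which matters when one group is small.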

Results

Patient and tumor characteristics

Detailed patient and tumor characteristics are given in Table 1. Mean age of the patients in the study population was 49 months (SD 35 months), with 48% of all patients being female and 52% male.

Table 1 Patient and tumor characteristics (“Local SIOP stage” and “Metastases at time of diagnosis” include nephroblastoma only)

Inter-reader agreement

Inter-reader agreement was excellent, with ICCs ranging from 0.717 (kurtosis) to 0.975 (homogeneity). The ICC values are presented with their 95% confidence limits in Supplementary Table 1. Given the excellent inter-reader agreement, only the values for reader 1 are shown in the subsequent tables (Tables 2, 3), but values for reader 2 are given in the text where applicable.

Table 2 MRI parameters for reader 1 by SIOP stage, presence of metastases at time of diagnosis, and histopathological tumor subtype including p-values (median [range], unit for ADC values: × 10⁻³ mm²/s)
Table 3 MRI parameters for reader 1 by SIOP stage, presence of metastases at time of diagnosis, and histopathological tumor subtype including p-values (median [range])

Associations between MRI parameters and stage

Patients with locally aggressive tumor growth (SIOP 3 stage) or with metastatic disease at the time of diagnosis had significantly lower tumor ADC values in the 12.5th percentile compared to patients with less advanced disease; this significance extended to both readers (see Tables 2 and 4 for reader 1; reader 2: SIOP 1 vs. 3: p = 0.017, sensitivity: 0.52, specificity: 0.83; SIOP 2 vs. 3: p = 0.013, sensitivity: 0.87, specificity: 0.52; M1 vs. M0 at diagnosis: p = 0.028, sensitivity: 0.5, specificity: 0.81). In addition, the parameter energy from texture analysis differed significantly between patients with and without initial presence of metastases for both readers (reader 1: p = 0.01; reader 2: p = 0.028), while sum of entropy differed significantly between these two groups for reader 1 only (p = 0.032, reader 2: p = 0.077).

Table 4 Sensitivity, specificity, and area under the curve (AUC, with 95% confidence intervals) of MRI parameters in the differentiation between SIOP stages, presence of metastases at time of diagnosis, and histopathological tumor subtypes (unit for ADC values: × 10⁻³ mm²/s; R1 = reader 1, R2 = reader 2)

Associations between MRI parameters and histopathological tumor subtypes

When texture analysis values were compared between benign nephroblastomatosis and malignant nephroblastoma (all subtypes combined), contrast and homogeneity were found to differ significantly between the two groups for reader 1 (p = 0.027 and p = 0.018, respectively) and reader 2 (p = 0.007 and p = 0.045, respectively; see Table 4 and Fig. 4a); however, differences in entropy (median [range] reader 2: nephroblastomatosis, 3.02 [1.21–3.62] vs. other tumors, 3.46 [0.52–4.21]; p = 0.008) and sum of entropy (reader 2: nephroblastomatosis, 1.93 [0.62–2.84] vs. other tumors, 2.8 [0.43–3.23]; p < 0.001) only reached statistical significance for reader 2.

Fig. 4

a–c ROC curves for the MRI parameters found helpful in the distinction between nephroblastomatosis (a), blastemal subtype of nephroblastoma (b) and diffuse anaplastic subtype of nephroblastoma (c) and other pediatric renal tumors

In tumors of the aggressive blastemal subtype, the skewness of the distribution of ADC values for all voxels was found to be significantly higher than that in other tumors for both readers (reader 1: see Table 2; median [range] for reader 2: blastemal, 1.26 [−0.41–2.4] vs. other tumors, 0.64 [−1.75–2.9]; p = 0.022); in diffuse anaplastic tumors, ADC values for the 75th percentile were found to be higher than in other tumors for both readers (reader 1: see Table 2; reader 2: diffuse anaplastic, 1.55 [1.33–2.88] × 10⁻³ mm²/s vs. other tumors, 1.31 [0.17–2.69] × 10⁻³ mm²/s; p = 0.042).

As both blastemal and diffuse anaplastic tumors represent aggressive tumors, we also calculated the best specificity at a maximal sensitivity of 100%, which corresponds to a clinical setting in which no aggressive tumor should be missed (see also Table 4). For blastemal subtypes, the parameter skewness showed a maximal specificity of 0.52 (reader 1) and 0.13 (reader 2), whereas ADC values for the 75th percentile demonstrated a maximal specificity of 0.58 (reader 1) and 0.52 (reader 2) in diffuse anaplastic tumors (see Fig. 4b, c).
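The best specificity at 100% sensitivity follows directly from the lowest parameter value observed among the aggressive tumors, as the following Python sketch shows. It assumes that higher values indicate the aggressive subtype (as reported for skewness and the 75th percentile); the function name and the example values are hypothetical.

```python
def specificity_at_full_sensitivity(pos, neg):
    """Best specificity achievable while still detecting every positive case,
    calling a case positive when its value is >= cutoff."""
    # The highest cutoff that still flags all positives is their minimum value
    cutoff = min(pos)
    specificity = sum(n < cutoff for n in neg) / len(neg)
    return cutoff, specificity

# Hypothetical skewness values
aggressive = [0.9, 1.4, 2.1]
other = [-0.5, 0.2, 0.7, 1.0, 1.6]
cutoff, spec = specificity_at_full_sensitivity(aggressive, other)
```

In this example, the cutoff is 0.9 and three of the five other tumors fall below it, giving a specificity of 0.6. A single aggressive tumor with an unusually low value lowers the cutoff for the whole group, which helps explain why this measure can differ markedly between readers.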

Compared to other tumors, epithelial nephroblastoma demonstrated significantly higher entropy values (reader 1: p = 0.029, reader 2: p = 0.024) and stromal nephroblastoma showed significantly higher median ADC values (reader 1: p = 0.002, reader 2: p = 0.019) as well as significant differences in the parameters contrast and dissimilarity for both readers (reader 1: p = 0.018 and p = 0.04, respectively; reader 2: p = 0.009 and p = 0.027, respectively). In addition, ADC values in the 25th and 75th percentiles were found to be significantly higher in tumors of the stromal subtype than in other tumors for reader 1 only (Table 2).

Discussion

The treatment of nephroblastoma differs between the USA and Europe, with pre-operative chemotherapy and subsequent tumor resection being favored in Europe per guidelines of the International Society of Pediatric Oncology (SIOP), while upfront surgery with adjuvant chemotherapy is recommended by the Children’s Oncology Group (COG) in the USA. Despite these differences, insights gained through coordinated group studies have led to increased survival rates of more than 90% in these patients for both approaches. In the SIOP approach, the decision to perform neoadjuvant chemotherapy is solely based on the finding of a renal tumor consistent with nephroblastoma on imaging, as pre-therapeutic biopsy is not recommended [15]. Further risk assessment is performed after surgical tumor resection by grading the amount of changes induced by the neoadjuvant treatment, thus characterizing the tumor as one of several nephroblastoma subtypes with differing clinical behavior. The ability to correctly identify different renal tumors (in particular, the benign nephroblastomatosis) on imaging or to predict tumor response to neoadjuvant chemotherapy in nephroblastoma before the start of treatment would be of great clinical value, as it could allow for better risk stratification and consequently more personalized treatment approaches. Diffusion-weighted MRI is of particular interest in this context, as it aims to quantify the local tumor microenvironment by assessing the diffusibility of water molecules and has been shown to be of value for differentiating between tumor subtypes of adult renal cell carcinoma [4]. However, recent reports [16] highlight the importance of intra-tumor heterogeneity in nephroblastoma, and therefore, more complex analytical methods such as histogram or texture analysis might be warranted to better characterize tumor aggressiveness and assess prognosis.

Our findings suggest that histogram and/or textural analysis of DWI data could allow for a more refined staging approach: For example, for both readers, median tumor ADC values in the 12.5th percentile were lower for patients with locally aggressive and infiltrative tumor growth (i.e., local SIOP 3 stage [17]) or with metastases at the time of diagnosis than for all other patients (the median ADC of all tumor voxels did not differ significantly between these groups). In addition, the energy parameter from texture analysis of the whole tumor volume differed between patients with M1 status and those with M0 status for both readers. Energy reflects the homogeneity of the grayscale distribution of images, with higher values in cases of an unequal distribution of values across the co-occurrence matrix. Energy values were higher in patients with metastases at the time of diagnosis, thus quantifying the greater heterogeneity of aggressive tumors. This finding highlights the influence of intra-tumor heterogeneity in nephroblastoma [16], as aggressive tumor cells may compose only a small percentage of the whole tumor volume (thus contributing little to a mean or median parameter value) while still having a decisive influence on patient prognosis.

As the indication for neoadjuvant chemotherapy is currently based solely on the finding of a renal tumor on imaging that is thought to be consistent with a nephroblastoma, the correct identification of benign nephroblastomatosis would help in assigning the correct treatment following radiological diagnosis. We found that the parameters contrast and homogeneity differed significantly between nephroblastomatosis and all other renal tumors and therefore might be of value in differentiating among these entities. The parameter homogeneity reflects the homogeneity of image textures in a tumor and thus reaches higher values in more homogeneous tumors. In our study, homogeneity values of benign nephroblastomatosis were found to be lower than those of nephroblastoma, which is surprising given that visually assessed homogeneity is thought to be a typical imaging feature of nephroblastomatosis. The reason for this may lie in the fact that nephroblastomatosis tends to be smaller at time of diagnosis, so that even small regressive changes can have a significant influence on the whole tumor volume. In addition, while visually assessed homogeneity is based on the appearance of the tumor “as a whole,” homogeneity as assessed by texture analysis is a statistical measure based on mathematical models. While texture analysis allows for a more observer-independent assessment of tumor heterogeneity, it is influenced by technical parameters, e.g., the spatial scale (i.e., the number of pixels in the proximity of a single pixel that are evaluated to derive a single measure) and as such does not necessarily correlate with the radiologist’s visual assessment, as was shown in a recent study in lung cancer [18].
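The dependence of texture parameters on the spatial scale can be made concrete with a minimal Python sketch of a gray-level co-occurrence matrix. The parameter definitions used in this study are given in Supplementary Appendix 1 and may differ; the formulas below are common textbook forms, and the function names, the restriction to horizontal pixel pairs, and the toy images are assumptions.

```python
from collections import Counter

def glcm(image, distance=1):
    """Symmetric, normalized co-occurrence matrix for horizontal pixel pairs
    separated by `distance` pixels; `image` holds quantized gray levels."""
    counts = Counter()
    for row in image:
        for x in range(len(row) - distance):
            a, b = row[x], row[x + distance]
            counts[(a, b)] += 1
            counts[(b, a)] += 1   # count each pair in both directions
    total = sum(counts.values())
    return {pair: c / total for pair, c in counts.items()}

def texture_features(p):
    """Three co-occurrence features in commonly used forms."""
    return {
        "contrast": sum(v * (i - j) ** 2 for (i, j), v in p.items()),
        "energy": sum(v * v for v in p.values()),
        "homogeneity": sum(v / (1 + (i - j) ** 2) for (i, j), v in p.items()),
    }

# Hypothetical 4 x 4 images quantized to two gray levels (0 and 1)
uniform = [[1, 1, 1, 1]] * 4
stripes = [[0, 1, 0, 1]] * 4   # alternating columns

flat = texture_features(glcm(uniform, distance=1))
near = texture_features(glcm(stripes, distance=1))
far = texture_features(glcm(stripes, distance=2))
```

For the striped image, homogeneity is 0.5 when neighboring pixels are compared (distance 1) but 1.0 when every second pixel is compared (distance 2), although the image itself is unchanged. The same lesion can therefore appear heterogeneous or homogeneous depending on the scale at which the texture is evaluated.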

It would also be beneficial to be able to identify the most aggressive tumor subtypes, namely the blastemal and diffuse anaplastic types, on pre-treatment imaging, as patients with these tumors tend to have poorer prognoses and therefore could benefit, for example, from intensified neoadjuvant treatment [2, 19]. While median ADC values did not differ significantly between the different tumor entities in our study (a result that aligns well with the preliminary results of Littooij et al. [10] and Meeus et al. [12]), we found that, for both readers, the skewness of the distribution of ADC values for all tumor voxels was significantly higher for the aggressive blastemal subtype than for all other renal tumor types assessed, and ADC values of the 75th percentile were significantly higher in diffuse anaplastic tumors, possibly due to a greater amount of necrosis and regressive changes in these aggressive tumors. We also found higher ADC values for stromal subtypes compared to all other renal tumors assessed, which is consistent with the results of Hales et al. [7], who reported comparably high mean ADC values for this subtype.

Our study provides a comprehensive assessment of the value of DWI in the largest cohort of patients with nephroblastoma to date. Though substantial overlaps in parameter values between different tumor groups limit the current clinical applicability of our findings, given the promising nature of our results, our hypotheses merit further investigation in a prospective trial, possibly including additional emerging MRI techniques (e.g., arterial spin-labeling or chemical-shift imaging).

Our study had a number of limitations: Despite the relatively large size of our whole patient cohort, the numbers of patients in some subgroups were low, thus limiting the generalizability of our results. Because the investigation was conducted as an exploratory study, no adjustment for multiple testing was performed. Furthermore, given our multi-center retrospective approach, we could not account for technical differences between the MRI examinations (for example, differing vendors and b values) and had to rely on local centers to ensure patient compliance with the chemotherapeutic regimen and its documentation. Therefore, we think that our findings are of an exploratory nature and still require verification in a separate cohort using a homogeneous imaging protocol. Also, the number of neoadjuvant chemotherapy cycles might have differed for patients presenting with metastatic or bilateral disease; however, in the interests of maximizing the study population, we did not exclude these patients from later analysis. In addition, as nephroblastomas also occur in infants, some of the tumors we analyzed were of small volume, which might have increased the variability of our results and led to the inclusion of outliers.

In conclusion, our study demonstrated that parameters from diffusion-weighted MR imaging might have the potential to be valuable in risk assessment (i.e., identification of patients with locally advanced or metastasized tumors and differentiation of tumor types) in pediatric patients with renal tumors. The ability to identify benign nephroblastomatosis as well as aggressive subtypes of nephroblastoma known to have poor prognoses on pre-treatment DWI could enable more personalized treatment in the future.