Introduction

Radiomics involve quantitative image analysis aided by computer vision with an emerging role in medicine and oncology as noninvasive biomarkers [1]. The high-dimensional imaging data can be linked to a diverse range of biological and clinical endpoints, including histological diagnosis, molecular characteristics, treatment response, survival outcomes, and toxicity estimation [2]. Magnetic resonance imaging (MRI) is considered the gold standard imaging modality in central nervous system (CNS) tumors due to superior anatomical representation of intracranial structures compared to other imaging modalities as well as functional information obtained from advanced sequences like perfusion, diffusion, spectroscopy [3]. Radiomics research has been widely applied in CNS tumors including gliomas in characterizing tumor histology (astrocytoma vs. oligodendroglioma), tumor grade (low grade vs. high grade), risk-stratification, and quantitative response assessment [4, 5].

Glioblastoma (GBM) is a grade 4 glial neoplasm, associated with poor prognosis with a median survival of 15 months [6]. Following maximal safe resection, patients are treated with radiotherapy (RT) along with concurrent temozolomide. Radiation target volumes constitute gross tumor volume (GTV) and the clinical target volume (CTV) accounting for macroscopic residual disease along with surgical cavity and microscopic infiltrative disease in the peritumoral region, respectively [7]. Following RT completion, patients are treated with adjuvant temozolomide and serial interval MRIs are performed to monitor disease status. As a part of RT planning, patients undergo computed tomography (CT) for dose computation purposes and MRI for delineation of target volumes and organs at risk. A significant body of work has been undertaken utilizing pre-operative MRI for patients with GBM, demonstrating the feasibility of prognostication and identifying different subsets of patients with variable outcomes [4]. However, limited data is available using radiomics on post-surgery or pre-radiation imaging, which can tentatively provide important insights into biological behavior. In this study, we have undertaken exploratory analysis to investigate the impact of quantitative features obtained from RT planning MRI, as well as the impact across different surgical resection subsets.

Methods

Patient selection

Patients with newly diagnosed IDH-wildtype GBM treated with RT at Sunnybrook Health Sciences Centre, Toronto, between January 2014 and December 2018 were considered eligible for the retrospective study. The institutional ethics committee approved the study, and the requirement for informed consent was waived. Patients who had T1-weighted gadolinium contrast-enhanced (T1-CE), T2-weighted FLAIR (T2-FLAIR), and apparent diffusion coefficient (ADC) MRI sequences and completed the scheduled course of RT were considered eligible for the study. Patients with prior RT, missing survival information, missing or motion artifact corrupted images were excluded.

Radiation planning contour extraction and image preprocessing

Patients were treated with RT and chemotherapy following surgery according to standard protocols [6, 8]. Following consultation and decision for RT, planning CT and MRI was performed for all patients. Radiation planning MRIs were acquired on GE Signa HDxT 1.5 T MRI scanners (GE Medical Systems, WI, USA) or Philips Ingenia 1.5 T systems (Philips Medical, WI, USA). The slice thickness for T1-CE, T2-FLAIR, and ADC was 1, 2, and 5 mm, respectively. The RT target volumes were drawn by the responsible radiation oncologists. GTV encompassed the enhancing residual disease on the T1-CE sequence and the surgical cavity. CTV included a 1.5 cm expansion beyond the GTV typically encompassing the peritumoral region (PTR) identified as hyperintense region on T2-FLAIR, duly edited from anatomical barriers like the falx, tentorium cerebelli, and bone. A planning target volume (PTV) of 5 mm was used. All patients were treated on a linear accelerator device using intensity-modulated RT with image guidance. Typical dose prescriptions included 60 Gy in 30 fractions over 6 weeks in patients less than 65 years or 40 Gy in 15 fractions over 3 weeks in elderly patients or patients with poor performance status. The decision regarding concurrent and adjuvant temozolomide was taken by the responsible neuro-oncologist and radiation oncologist.

Contours for the GTV and CTV that were planned on fused CT/T1-CE images in Pinnacle (Philips Medical, WI, USA) were exported into Matlab R2018b (The Mathworks, Inc., MA, USA), where the CT-based contours were transformed to the T1-CE reference frame using the CT/T1-CE registration information. Images were skull stripped using HD-BET [9], and then the T2-FLAIR and ADC volumes were rigidly registered to the corresponding T1-CE volumes using the FMIRB Software Library (FSL) tool FLIRT [10,11,12]. The contours and images were resampled to a uniform resolution of 1 mm3, N4 bias field corrected [13, 14] and intensity normalized by histogram matching [15,16,17] to the T1-CE and T2-FLAIR images acquired on the GE scanners. Finally, the CTV contours were manually refined using ITK-SNAP [18] (http://www.itksnap.org) to omit the ventricles, prominent sulci, and the skull where applicable, and the GTV was subtracted from the CTV to generate two disjoint contours.

Feature extraction

Radiomic features were extracted from the GTV and modified CTV for each of the three MRI sequences. Fixed bin width (FBW) quantization was used to discretize pixel intensities within each contour. To determine the FBW for each modality and contour type, the minimum and maximum intensities were measured, and the FBW was selected as the maximum width that produced bin counts greater than or equal to 30. Feature extraction was performed using PyRadiomics [19] software V2.2.0. The feature set included the following: 18 first-order statistical features; 22 gray level co-occurrence matrix features; 16 gray level size zone features; 16 gray level run length matrix features; 5 neighboring gray tone difference matrix features; and 14 gray level dependence matrix features. Ninety-one features were extracted for the GTV and CTV for each of the three MR modalities, resulting in a total of 546 radiomic features per patient. A detailed description of the features can be found on the PyRadiomics website (https://pyradiomics.readthedocs.io). A schematic of the data processing is shown in Fig. 1.

Fig. 1
figure 1

Workflow for image preprocessing and feature extraction

Statistical analysis

Risk modelling

The statistical analysis was performed using R V4.0.3 [20] (http://www.R-project.org). Four independent models were considered: one including all patients and one for each of the three surgical treatment subgroups i.e., gross total resection, subtotal resection, and biopsy. Internal validation was performed using leave-one-out cross-validation. First, a patient was left out, and the radiomic features from the remaining patients were shifted and scaled to zero mean and unit standard deviation using the caret [21] package function “PreProcess”, and then the shifting/scaling parameters were applied to the test patient. Using the package glmnet [22, 23], the radiomics signatures were constructed by least absolute shrinkage and selection operator (LASSO) Cox regression. To reduce potential overfitting, the regularization weight λ was optimized to minimize the tenfold cross-validation error on the training set. The fitted LASSO Cox model was then applied to the training and test patients and radiomics risk scores were derived by a linear combination of the features weighed by their model coefficients. The median of the training patients’ radiomics risk scores was used as a threshold to assign the test patient to either low or high risk. Using the survival [24, 25] package, Cox proportional hazard models were fit to the clinical features of the training data, which included age, GTV volume, and surgical treatment, and linear predictions were made on the test patient to produce a clinical risk score. Additionally, a combined radiomics and clinical Cox model was fit on the training data and applied to the test patient to create a combined risk score. All modeling steps were repeated from scratch for each left-out patient. A flowchart of the leave-one-out cross-validation procedure is shown in Supplementary Fig. 1.

Evaluation of predictive accuracy

Cross-validated Kaplan–Meier curves for high/low overall survival (OS) risk were constructed using the survival package function “survfit.” The split into binary risk groups (high and low) were obtained using the major split as obtained from the algorithm. The date of surgery was considered as baseline for the survival analysis. To assign statistical significance to the log-rank tests, the permutation distributions of the log-rank statistics were obtained by randomly permuting the correspondence between radiomic features and OS and repeating the entire leave-one-out cross-validation procedure 100 times. To test the hypothesis of predicting OS using radiomic features, the proportion of permuted log-rank statistics greater than or equal to the un-permuted statistic was taken as the significance level [26]. A weighted log-rank test was used (G-rho rank test, rho = 1). The cross-validated time-dependent area under the receiver operator characteristic curves for the radiomics, clinical, and combined risk scores were generated using the package risksetROC [27], and the integrated areas under the time-dependent curves (iAUC) were evaluated with the function “risksetAUC.” To evaluate the incremental value of radiomics to clinical in OS prediction, the null hypothesis that radiomics are independent of OS and clinical variables was tested by obtaining the permutation distribution of the iAUCs. In this case, the correspondence between radiomic features and OS were randomly permuted while clinical variables were left un-permuted and the leave-one-out cross validation procedure was repeated 100 times. The difference in iAUC measures between the combined and clinical models was used as the test metric. The proportion of iAUC differences with permuted radiomic features greater than or equal to the un-permuted iAUC differences were taken as the significance level. Kendall’s τb was used to test the association between the predicted radiomic risk scores and surgical treatment. Normality of the radiomics risk scores was assessed with the Shapiro–Wilk test and a one-way ANOVA or Kurskal-Wallis test for risk scores by surgical treatment was used where appropriate. p-values < 0.05 were taken as significant.

Results

Clinical characteristics

A total of 235 patients were included in the study. Patients undergoing gross total resection (GTR), subtotal resection (STR), and biopsy were 24%, 50%, and 26% of the cohort, respectively. The median interval between surgery and planning MRI was 19 days. The six week long RT course was delivered to 56% of patients. Concurrent and adjuvant temozolomide was used in 91% and 65% of patients, respectively. The median OS for all patients was 13 months. The detailed clinical and treatment characteristics for all patients as well as the surgical treatment subgroups are listed in Table 1.

Table 1 Clinical and treatment details for patients included in the study

Radiomics risk stratification

The cross-validated Kaplan–Meier survival curves for classifying patients into low- or high-risk groups according to the radiomics signatures for each of the four patient sub-groupings are shown in Fig. 2. The radiomics signatures derived from the inclusion of all surgery subtypes stratified the patients into low-risk and high-risk groups with G-rho permutation test p-value < 0.01. For the GTR and STR models, the radiomics signature could not stratify risk groups with statistical significance. In the biopsy subgroup model, the radiomics signature stratified patients into low-risk and high-risk groups (p = 0.04). A boxplot of the predicted radiomics risk scores as well stratified by surgical treatment are shown in Fig. 3, showing significantly higher risk in patients with a higher residual disease post-surgery (biopsy > STR > GTR).

Fig. 2
figure 2

Kaplan–Meier plots for the binary risk-stratified groups based on the radiomics modeling for all the patients and individual surgical subgroups

Fig. 3
figure 3

Radiomics risk score across the three surgical subgroups

Survival prediction

The cross-validated time-dependent area under the receiver operator characteristic curves of the radiomic, clinical, and combined models for survival prediction in the patient subgroups are shown in Fig. 4. For the prediction models including all patients, the iAUCs of the radiomics signature was 0.593; the clinical model was 0.644; and the combined radiomics and clinical model was 0.632. The iAUC difference (iAUCcombined–iAUCclinical) was − 0.012. For the GTR subgroup, the iAUCs of the radiomics signature were 0.576; the clinical model was 0.614; and the combined model was 0.604, with an iAUC difference of − 0.01. For the STR subgroup, the iAUC of the radiomics signature was 0.540; the clinical model was 0.538; and the combined model was 0.523. The iAUC difference was − 0.015. Finally, for the biopsy subgroup, the iAUC of the radiomics signature was 0.613; the clinical model was 0.522; and the combined model was 0.632. The iAUC difference was 0.11 (permutation test p < 0.01).

Fig. 4
figure 4

Integrated area under curve values with time as the dependent variable (X-axis) for all the patients and individual surgical subgroups

Analysis of the radiomics risk scores derived from the inclusion of all patients showed a moderate to strong association of radiomics risk with surgical treatment (τb = − 0.343, p ≪ 0.01). The Kruskal–Wallis test revealed a significant difference in the predicted radiomics risks among surgical treatment groups (p ≪ 0.001) and a large effect size (η2 = 0.188). Post-hoc multiple pairwise comparisons using Wilcoxon’s test with Bonferroni p-value adjustment revealed significant differences between the biopsy and STR groups (p < 0.01); between the biopsy and GTR groups (p ≪ 0.01); as well as between the STR and GTR groups (p ≪ 0.01). A G-rho rank test of OS for the surgical treatment groups showed a significant difference in OS (p ≪ 0.01). The survival plot for OS curves stratified by surgical treatment is shown in Supplementary Fig. 2.

Discussion

Computational analysis techniques have enabled medical imaging to cross the boundaries of traditional contribution in diagnostics and serve as noninvasive biomarkers [28]. Since MRI has an integral role at different stages of management of CNS tumors, quantitative analysis provides an excellent opportunity for integration with clinical factors to improve prognostication as well as potential treatment modifications. In this study, we have undertaken radiomic analysis of three MRI sequences extracted from the standard radiation target volumes performed following surgery as a part of RT planning to predict overall survival.

Radiation oncology treatment planning and delivery are deeply intertwined with various morphological and functional imaging modalities needed for dosimetry computation, delineation of target volumes and avoidance structures, and guiding treatment delivery. Although cone-beam CT images during RT have been analyzed in different malignancies like lung or head-neck primaries [2930], there is a dearth of studies regarding the use of pre-treatment RT volumes and corresponding imaging data in radiomic analysis. As the general principle for RT is the presence of macroscopic tumors within GTV and microscopic disease within the CTV, a significant amount of biological information is expected to be contained within the target volumes. It is reasonable to hypothesize that the GTV is likely to include more information than CTV volumes when RT is used as the primary modality rather than an adjuvant treatment. Also, in such a scenario, the imaging parameters obtained following surgery might turn out challenging with postoperative changes (e.g., gliosis, blood products in the brain), causing obfuscation of underlying biological information. In patients with GBM, although radiation is considered in the adjuvant setting, the surgical principle includes maximal safe resection, with macroscopic residual disease often left behind in proximity to eloquent areas. Therefore, we had considered performing analysis according to the surgical subgroups to understand the impact of residual disease. Also, it is well known the presence of microscopic infiltrative disease in the PTR surrounding the enhancing disease in patients with GBM, which is included within the CTV. Previous work has shown the clinical utility of radiomic analysis of PTR in GBM, with Prasanna et al. demonstrating quantitative features extracted from the PTR region alone using pre-operative MRI was able to stratify patients with different survival outcomes [31]. In our previous work, we have demonstrated voxel-based mapping to identify areas with infiltrative disease in the PTR, which correlated with areas of future recurrence [32]. Also, the distinctive MRI characteristics of the GBM PTR (constituting microscopic disease and edema) as opposed to low grade glioma (composed of infiltrative tumor cells) have been demonstrated using radiomics approach [33]. In the current study, we have not analyzed the PTV which is used to encompass the setup uncertainties during treatment delivery. The impact of inclusion of imaging features from the PTV as well as the penumbra region and dose-spillage beyond the target volume needs to be addressed in future studies.

In the present study, the radiomic features extracted from the radiation target volumes were able to stratify the survival outcomes into two groups for all the 235 patients (p < 0.01) and biopsy subgroups (0.04). In the other two surgical subgroups (GTR and STR), risk-stratification was not possible, which can be justified by the absence of enough biological information contained with the target volumes (due to the removal of the major bulk of macroscopic disease). When the clinical features were included in the radiomics model, the combined model performed significantly better (iAUC = 0.632) as compared to the radiomics model (iAUC = 0.613) or clinical model (iAUC = 0.522) alone. The findings from our study are quite similar to the radiomics study of pre-operative MRI, where the iAUC for radiomics model alone for overall survival and progression-free survival was 0.652 and 0.590, respectively, which improved to some extent with the use of clinical features [34]. The radiomics risk score was significantly better in the biopsy group than the other two subgroups indicating the presence of macroscopic tumor or core (corresponding to GTV) was the driver of the classifiers rather than the microscopic disease or the PTR (corresponding to CTV) in the postoperative setting. One important consideration here is the CTV in the majority of instances was not only limited to the T2 hyperintensity but also included the normal signal contained within the isotropic expansion from GTV edges, which could potentially weaken the relevant imaging information.

It may be worthwhile to pursue future studies to limit the feature extraction to only areas of altered FLAIR signal intensity within the CTV volume. A recently published phase 1/2 trial by Azoulay et al. used a small CTV margin of 5 mm with hypofractionated RT resulting in equivalent clinical outcomes in GBM, challenging the traditional concept of larger margins and possibly reducing toxicity [35]. The generation of efficient imaging biomarkers from RT planning MRI can pave the way towards precision radiation oncology workflow tailoring treatment protocols individually, such as treatment-escalation in high-risk groups. The advent of MR-LINAC in clinical practice has generated an opportunity to generate and adopt imaging biomarkers not only before treatment initiation but also to study the temporal changes during the course of RT. The prospective UNITED trial investigating the role of MR-LINAC guided small-margin adaptive RT (clinicaltrials.gov identifier NCT04726397) provides an opportunity to evaluate changes in radiomic features throughout a course of RT, which may provide additional insight in terms of survival prediction. Another limitation was the exclusion of MGMT from the clinical model, due to unavailability in a proportion of patients, which is otherwise known to strongly influence the survival outcomes. The dosimetric parameters and their impact on outcomes were not studied in the current work. Here we primarily investigated the utility of biological information contained within the target volumes, which can be mined through quantitative analysis. The dosimetric details (like cumulative dose, fractionation, dosimetric heterogeneity) can be incorporated in future studies along with the development of integrated "dosiomics" and "radiomics" models to explore if better results can be achieved from the combined models.”

Conclusion

Radiomic features from radiation planning T1 CE, T2-FLAIR, and ADC MRI sequences corresponding to the GTV and CTV can be used for predicting survival in patients with GBM, particularly in the subgroup undergoing biopsy. The iAUC of 0.613 was achieved for a radiomics-based classification for predicting OS in the biopsy subgroup, which improved to 0.632 with the inclusion of clinical features.