Introduction

Due to their poor prognosis and their disastrous impact on patients’ quality of life and cognitive function, malignant brain tumors are among the most feared types of cancer [1]. Glioblastomas (GBMs) are the most common type of primary malignant brain tumor, with a median survival of 12–15 months with standard treatment, which includes maximal safe surgical resection followed by radiation therapy and concomitant and adjuvant chemotherapy [1,2,3]. However, patients’ time to recurrence and overall survival vary, suggesting that therapy planning must become more individualized. To this end, research has focused on identifying prognostic factors in GBM patients [1, 4, 5].

Young age and high preoperative Karnofsky performance status (KPS) are confirmed predictors of good prognosis [5,6,7,8,9,10,11]. Donato et al. reported that age > 60 years is a poor prognostic factor [6]. A retrospective study of 416 GBM cases identified young age and high preoperative KPS as favorable factors in patient survival [8]. Magnetic resonance imaging (MRI) plays an important role in preoperative evaluation and in monitoring the response to treatment in GBM patients [7, 8, 11,12,13,14,15]. Volumetric features measured on preoperative imaging, including tumor volume at initial diagnosis, have been identified as independent prognostic factors [11, 16,17,18,19]. Iliadis et al. reported that higher preoperative enhancing tumor volume was associated with worse survival [16]. A higher volume of preoperative peritumoral edema/invasion was correlated with poor overall survival (OS) [17]. Zinn et al. demonstrated that combining preoperative enhancing tumor volume with age and KPS resulted in a robust prognostic model [19]. Postoperatively, the extent of resection (EOR), as defined by the residual enhancing tumor component, is associated with survival [7, 8, 14], with an EOR of > 98% of the enhancing tumor demonstrating a better prognosis [8].

The aforementioned studies illustrate the predictive value of MRI-based features. The current definition of gross total or subtotal resection is based on the enhancing portion of the tumor; thus, even in cases of gross total resection, the non-enhancing edema/invasion component of the tumor remains unresected [20]. Furthermore, 80% of recurrent GBMs occur within the peritumoral T2/fluid-attenuated inversion recovery (FLAIR) hyperintense non-enhancing component of GBM (≤ 2 cm from the primary tumor field), indicating that this FLAIR hyperintense portion, characterized by a mixture of edema and tumor-infiltrating cells, plays an important role in symptoms and tumor progression [21,22,23]. Multiple studies have investigated the volume of preoperative peritumoral edema/invasion as a prognostic factor for OS [11, 15, 17, 18, 20, 24]. However, there is insufficient knowledge about the independent contribution of the postoperative residual volume of edema/invasion to patient outcome. To date, only one retrospective analysis, by Grabowski et al., has evaluated the impact of the postoperative residual volume of edema/invasion on patient outcome, using a single-institution patient cohort [14].

The objective of this study was to determine whether the postoperative residual non-enhancing volume (PRNV) is predictive of GBM patient outcome using a multi-institutional patient cohort. For this purpose, we retrospectively analyzed 98 patients from The University of Texas MD Anderson Cancer Center (Houston, Texas), and tested the final predictive model in a cohort of 37 patients obtained from The Cancer Genome Atlas (TCGA).

Materials and methods

Statement of ethics approval

This HIPAA-compliant retrospective study was approved by The University of Texas MD Anderson Cancer Center review board. All necessary approvals, authorizations, human subjects assurances, and informed consent documents were obtained.

Patient population

In this study, we retrospectively analyzed 134 patients with newly diagnosed GBM; all patients who had available postoperative MRI studies were included. The training cohort comprised 98 patients with newly diagnosed GBM from The University of Texas MD Anderson Cancer Center (MDACC), who were included from 2005 to 2014. The validation cohort comprised 37 patients with newly diagnosed GBM obtained from TCGA (http://cancergenome.nih.gov), with corresponding MRI data from The Cancer Imaging Archive (TCIA) (http://www.cancerimagingarchive.net). Patients’ age, sex, and KPS score were recorded. For both the training and validation cohorts, histopathologic confirmation and postoperative MRI studies [at least a FLAIR image and a post-gadolinium contrast T1-weighted image (T1WI)] were available. All MR images were acquired using typical clinical sequences within the first 72 h after the operation. The training and validation cohorts were compared using the chi-square test for categorical variables, the two-tailed t-test for continuous variables, the Wilcoxon rank-sum test for KPS, and the log-rank test for OS.

Image analysis

For image analysis and segmentation, we used “3D Slicer” (version 4.3.1, http://www.slicer.org/), open-source software for medical image visualization and post-processing [25, 26]. Prior to manual image segmentation, the post-contrast T1WI and FLAIR images from each patient were registered to each other using affine registration (12 degrees of freedom). Affine registration is the standard method of choice for registering anatomical images acquired from the same patient at the same time point; it allows for correction of the patient’s motion between different sequences without deforming the brain.

The segmented images were reviewed in consensus by two neuroradiologists (R.R.C., 9 years of experience, and A.E., 5 years of experience) who were blinded to clinical data. The PRNV was defined as the FLAIR hyperintense portion of the tumor that was not enhancing on post-contrast T1WI. Necrosis within a residual enhancing tumor was evaluated using the post-gadolinium contrast T1WI and was defined as a region that did not enhance or that showed diminished enhancement. Enhancement was identified in most cases studied. According to the literature, approximately 30% of postoperative MRI scans show surgically-induced enhancement within the first 72 h [27]. We were able to differentiate nodular enhancement from surgically-induced reactive enhancement on the basis of its radiographic appearance on the post-gadolinium contrast T1WI; peripheral enhancement around the resection cavity reflects postoperative granulation or scar tissue (surgically-induced reactive enhancement), whereas larger areas of mass-like enhancement reflect residual tumor (nodular enhancement) (Fig. 1) [28].

Fig. 1 Representative magnetic resonance imaging scans and volume segmentation. Segmentation of postoperative residual non-enhancing component (blue) and residual enhancement (yellow). Left and middle panel: post-gadolinium contrast T1-weighted. Right panel: fluid attenuated inversion recovery (FLAIR) image

Finally, all volumes were calculated by multiplying the number of voxels within the outlined region by the volume of a single voxel (Fig. 1).
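The volume calculation described above can be sketched in a few lines. This is a minimal illustration, not the 3D Slicer implementation: the function name, the nested-list mask layout, and the toy voxel spacing are assumptions made for the example.

```python
from math import prod


def segmented_volume_cm3(mask, voxel_spacing_mm):
    """Return the volume of a binary 3D mask in cubic centimetres.

    mask: nested lists indexed [slice][row][column]; nonzero means the voxel
        lies inside the outlined region (hypothetical layout).
    voxel_spacing_mm: voxel edge lengths in millimetres, one per axis.
    """
    voxel_count = sum(1 for plane in mask for row in plane for value in row if value)
    voxel_volume_mm3 = prod(voxel_spacing_mm)
    return voxel_count * voxel_volume_mm3 / 1000.0  # 1 cm3 = 1000 mm3


# Hypothetical 2 x 2 x 3 label map with five voxels inside the region.
toy_mask = [
    [[1, 1, 0], [0, 1, 0]],
    [[0, 0, 0], [1, 1, 0]],
]
toy_volume = segmented_volume_cm3(toy_mask, (1.0, 1.0, 5.0))
```

With five labeled voxels of 1 × 1 × 5 mm each, the toy region measures 25 mm3, or 0.025 cm3.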

Statistical analysis

We examined the association between PRNV and OS. OS was calculated from the time of surgery to the time of death. Patients who had no entry for the time of death but had last follow-up data were considered alive and were censored at the time of last follow-up. We did not evaluate the association between PRNV and progression-free survival due to insufficient annotation of progression-free survival data.
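The censoring rule above maps each patient to a pair of survival time and event indicator. The following sketch shows one way to encode it; the function name, the 365.25-day year, and the example dates are assumptions for illustration and are not taken from the study data.

```python
from datetime import date

DAYS_PER_YEAR = 365.25  # assumed conversion factor


def overall_survival(surgery, death=None, last_follow_up=None):
    """Return (time_in_years, event) for one patient.

    event is 1 when death was observed and 0 when the patient is censored
    at the last follow-up.
    """
    if death is not None:
        end, event = death, 1
    elif last_follow_up is not None:
        end, event = last_follow_up, 0
    else:
        raise ValueError("need a death date or a last follow-up date")
    if end < surgery:
        raise ValueError("end date precedes surgery date")
    return (end - surgery).days / DAYS_PER_YEAR, event


# Hypothetical patients, invented for illustration.
observed = overall_survival(date(2010, 1, 1), death=date(2011, 1, 1))
censored = overall_survival(date(2012, 6, 1), last_follow_up=date(2013, 6, 1))
```

A patient with a recorded death contributes an event; a patient with only a last follow-up date contributes follow-up time but no event.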

Factors potentially associated with OS were assessed, singly and together, by fitting Cox proportional hazards models; a backward elimination procedure was used to identify the final model. The performance metrics included the estimated hazard ratio (HR), 95% confidence limits on HR, and p-values for the significance of HR [29]. Although stepwise methods, such as backward elimination, are widely used to determine the final model, their limitations should be acknowledged: (i) they rely only on the significance level, and (ii) excluded factors cannot be re-entered in the final model, so only a subset of possible models is evaluated [30].
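The control flow of backward elimination, including limitation (ii), can be sketched as follows. This is an illustration under stated assumptions: the removal threshold of 0.05 is assumed (the text does not give the criterion used in SAS), and `fit_pvalues` is a hypothetical callable standing in for a Cox model fit, replaced here by a stub with invented p-values.

```python
def backward_elimination(candidates, fit_pvalues, alpha=0.05):
    """Drop the least significant factor until every remaining p-value <= alpha.

    candidates: list of factor names to start from.
    fit_pvalues: callable that refits the model on a list of names and
        returns {name: p_value}; a hypothetical stand-in for a Cox model fit.
    alpha: assumed removal threshold.
    """
    kept = list(candidates)
    while kept:
        pvalues = fit_pvalues(kept)
        worst = max(kept, key=lambda name: pvalues[name])
        if pvalues[worst] <= alpha:
            break
        kept.remove(worst)  # a removed factor is never reconsidered
    return kept


# Hypothetical stub: fixed p-values invented for illustration only.
def toy_fit(names):
    table = {"age": 0.004, "prnv": 0.03, "kps": 0.20, "sex": 0.60}
    return {name: table[name] for name in names}


final_model = backward_elimination(["age", "prnv", "kps", "sex"], toy_fit)
```

In a real fit the p-values change each time a factor is dropped, which is why the model is refitted inside the loop; the stub ignores this for brevity.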

Recursive partitioning analysis (RPA) was used to identify prognostic groups; the classification and regression trees (CRT) partitioning technique was selected [31, 32]. OS curves were calculated using the Kaplan–Meier method. A log-rank test was used to compare survival curves between patient groups; the performance metric was the p-value of the test. All tests were two-sided, and p-values were corrected for multiple comparisons using the false discovery rate (FDR) approach. Corrected p-values (p-corrected) of 0.05 or less were considered statistically significant.
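The text does not state which FDR procedure was applied. As an illustration only, the sketch below assumes the widely used Benjamini-Hochberg step-up procedure; the function name and the raw p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


raw = [0.001, 0.02, 0.03, 0.5]  # hypothetical raw p-values
corrected = benjamini_hochberg(raw)
significant = [p <= 0.05 for p in corrected]
```

Each raw p-value is scaled by the number of tests divided by its rank, and a factor is called significant when its corrected value is 0.05 or less, matching the threshold stated above.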

Summary statistics and survival analyses were carried out using SAS software, version 9.3 (SAS Institute, Cary, NC). RPA and plotting were carried out using R software, version 3.1.1 (R Foundation, Vienna, Austria; rpart package version 4.1-11).

Results

Patient demographics

A total of 134 patients were analyzed in this study (97 patients from MDACC and 37 patients from TCGA). Their characteristics are shown in Table 1. All patients had undergone surgical resection and tissue diagnosis. The training cohort (MDACC) included 61 men and 36 women who were aged 21–84 years at initial diagnosis (average age, 60.0 years; standard deviation, 13.1 years), and the validation cohort (TCGA) included 22 men and 15 women aged 18–80 years (average age, 55.2 years; standard deviation, 15.1 years). On the basis of postoperative MR images, 40 patients (41.2%) had nodular enhancement and 5 (5.2%) had residual necrotic tissue at MDACC, and 18 patients (48.6%) had nodular enhancement and 3 (8.1%) had residual necrotic tissue in TCGA. Age and sex did not differ significantly between MDACC and TCGA, whereas the median KPS score was significantly higher in the MDACC cohort (median KPS score 90 vs 80 in MDACC and TCGA, respectively; p = 0.0003).

Table 1 Summary of patient demographics and MR characteristics

Statistical analysis

At the time of analysis, in the training cohort, 71 patients (73.2%) had died and 26 patients (26.8%) were either still alive or had been lost to follow-up and thus were categorized as censored. The median OS was 1.54 years (95% confidence interval 1.95–2.85 years). In the validation cohort, 31 patients (84%) had died, with a median OS of 0.97 years (95% confidence interval 0.55–1.73 years). Median survival in the MDACC cohort was significantly longer than in the TCGA cohort, as indicated by the results of the log-rank test (Table 1; p = 0.0003).

High PRNV is predictive of poor survival

To assess whether volumetric data extracted from the MR images and other clinical parameters are independent prognostic factors for OS in GBM patients who have undergone surgery, we performed univariate and multivariate analyses using the Cox proportional hazards regression model. In the training cohort, our univariate results showed that older age (HR 1.032, 95% CI 1.013–1.052; p = 0.0009), increased PRNV (HR 1.072, 95% CI 1.020–1.127; p = 0.0094), and decreased KPS (HR 0.800, 95% CI 0.658–0.970; p = 0.0257) were significantly associated with poor prognosis (Table 2). The backward elimination process resulted in a model with two significant prognostic factors, age and PRNV. In the final model, with each additional year of age, the HR increased by 3.1% (p-corrected = 0.006, multivariate Cox regression analysis) (Table 3). Similarly, with an increase in PRNV of 10 cm3, the HR increased by 5.1% (p-corrected = 0.046, multivariate Cox regression analysis) (Table 3).
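The relationship between a hazard ratio, the percentage change in hazard, and the confidence interval can be checked with a short calculation. The sketch below uses the univariate age estimate reported above (HR 1.032, 95% CI 1.013–1.052); the helper names are hypothetical, and the interval is assumed to be a Wald interval, which is symmetric on the log scale.

```python
import math


def percent_change(hazard_ratio):
    """Percent change in hazard implied by a hazard ratio."""
    return (hazard_ratio - 1.0) * 100.0


def rescale(hazard_ratio, units):
    """Hazard ratio for a change of `units` in the covariate."""
    return hazard_ratio ** units


def log_scale_summary(lower, upper):
    """Recover log(HR) and its standard error from a 95% Wald interval."""
    log_hr = (math.log(lower) + math.log(upper)) / 2.0
    std_err = (math.log(upper) - math.log(lower)) / (2.0 * 1.96)
    return log_hr, std_err


# Univariate age estimate from the text: HR 1.032, 95% CI 1.013-1.052.
age_percent = percent_change(1.032)   # about 3.2% higher hazard per year
age_decade = rescale(1.032, 10)       # about 1.37 per 10 years of age
log_hr, std_err = log_scale_summary(1.013, 1.052)
age_from_ci = math.exp(log_hr)        # close to the reported 1.032
```

Because the effect is multiplicative, a hazard ratio per unit is raised to the power of the number of units rather than multiplied by it, so ten years of age correspond to roughly a 37% higher hazard, not 32%.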

Table 2 Univariate Cox proportional hazards model for overall survival using training cohort (MDACC)
Table 3 Multivariate Cox proportional hazards model for overall survival using training cohort (MDACC)

We used the TCGA cohort to validate the final model obtained using the training cohort (Table 4). The validation cohort replicated the significant results obtained from the MDACC cohort. Our results showed that both age (HR 1.034, 95% CI 1.007–1.063; p-corrected = 0.022; multivariate Cox regression analysis) and PRNV (HR 1.127, 95% CI 1.051–1.195; p-corrected = 0.002; multivariate Cox regression analysis) were independent predictors of poor prognosis in GBM patients who underwent tumor resection (Table 4).

Table 4 Multivariate Cox proportional hazards model for overall survival using validation cohort (TCGA)

Subsequently, we performed RPA to identify a cut-off value for PRNV that can separate patients into high- versus low-risk groups. In the 98 patients in the training cohort, we identified 70.2 cm3 as the cut-off value. As shown in Fig. 2a, patients with PRNV > 70.2 cm3 had significantly shorter OS durations than did those with PRNV < 70.2 cm3 (median OS, 1.22 vs 1.69 years, respectively); GBM patients with low PRNV (< 70.2 cm3) had a significant survival benefit (5.6 months; p = 0.0037, log-rank test). We used the same cut-off value derived from the training cohort to evaluate the validation cohort. The cut-off value divided the patients into two groups: a low-risk group (median survival, 1.64 years; 95% confidence interval 0.88–2.25 years) and a high-risk group (median survival, 0.31 year; 95% confidence interval 0.10–0.64 years). There was a significant difference in the OS curves (p = 0.0089, log-rank test) (Fig. 2b).
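The group comparison above rests on splitting patients at the cut-off and reading the median OS off each Kaplan–Meier curve; the reported benefit follows from 1.69 − 1.22 = 0.47 years, or about 5.6 months. The sketch below illustrates the mechanics with a plain-Python Kaplan–Meier estimator. The patient triples are invented for illustration, the function names are hypothetical, and assigning a value exactly at the cut-off to the high group is an assumption, since the text only defines the groups with strict inequalities.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time."""
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        leaving = sum(1 for ti in times if ti == t)
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # deaths and censored patients leave the risk set
    return curve


def median_survival(times, events):
    """First time at which the Kaplan-Meier estimate drops to 0.5 or below."""
    for t, s in kaplan_meier(times, events):
        if s <= 0.5:
            return t
    return None  # median not reached


CUTOFF_CM3 = 70.2  # cut-off reported for the training cohort

# Hypothetical (PRNV in cm3, OS in years, event) triples, invented for illustration.
patients = [
    (12.0, 2.4, 1), (35.5, 1.9, 0), (50.1, 1.7, 1), (64.0, 3.0, 1),
    (75.3, 0.4, 1), (90.8, 1.1, 1), (110.2, 0.9, 0), (130.0, 1.3, 1),
]
low = [(t, e) for v, t, e in patients if v < CUTOFF_CM3]
high = [(t, e) for v, t, e in patients if v >= CUTOFF_CM3]  # tie handling assumed
median_low = median_survival(*zip(*low))
median_high = median_survival(*zip(*high))
benefit_months = (median_low - median_high) * 12
```

Censored patients reduce the number at risk without lowering the curve, which is why the median can differ from the simple median of observed death times.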

Fig. 2 a Kaplan–Meier analysis survival curves for high versus low postoperative residual non-enhancing volume (PRNV) in the training cohort (MDACC) (low PRNV: PRNV < 70.2 cm3; high PRNV: PRNV > 70.2 cm3). Patients with low PRNV (right, solid line, n = 55, event = 36) had significantly longer overall survival than did those with high PRNV (left, dotted line, n = 43, event = 36). The log-rank p-value was 0.0032. b Kaplan–Meier analysis survival curve for high versus low PRNV using validation cohort (TCGA). Patients with low PRNV (right, solid line, n = 23, event = 20) had significantly longer overall survival than did those with high PRNV (left, dotted line, n = 14, event = 11). The log-rank p-value was 0.0089

Discussion

In this study, we demonstrated that the PRNV plays a significant role in OS of GBM patients. By examining 135 patients from two independent datasets (training cohort: 98 patients; validation cohort: 37 patients), we identified a cut-off value for PRNV that classifies patients into high and low survival groups: GBM patients with low PRNV (< 70.2 cm3) had a significant survival benefit (5.6 months; p = 0.0037). Results from the multivariate Cox proportional hazards model analysis show that with an increase in PRNV of 10 cm3, the HR increased by 5.1% (p-corrected = 0.046). Similarly, with each additional year of age, the HR increased by 3.1% (p-corrected = 0.006). These findings were replicated in the validation cohort (higher PRNV: HR 1.127, p-corrected = 0.002; older age: HR 1.034, p-corrected = 0.022). None of the other parameters examined in this study were significantly associated with OS. Taken together, these analyses indicate that the PRNV can provide meaningful information regarding survival of GBM patients.

The current standard surgical treatment for patients who present with GBM involves maximal safe resection of the enhancing portion of the tumor. EOR, categorized as gross-total or subtotal resection, is defined by the extent of removal of the enhancing portion of GBM; thus, the enhancing portion has been the focus of most research. Gross-total resection has long been recognized as a favorable prognostic factor [8, 20, 33]. However, even in these cases, recurrence invariably occurs. GBM recurrence primarily occurs in the area neighboring the primary tumor (local recurrence) [23, 34, 35], which argues for further examination of the peritumoral non-enhancing hyperintense FLAIR area. The peritumoral non-enhancing hyperintense FLAIR portion of the tumor, composed of a mixture of edema and tumor cellular invasion, harbors infiltrating tumor cells [21]. With regard to the non-enhancing hyperintense FLAIR component, the literature has primarily focused on the prognostic impact of the preoperative volume of the hyperintense FLAIR component rather than on the postoperative residual volume [12, 18].

Surprisingly, before our study, the impact of postoperative residual T2/FLAIR volume had been directly evaluated in only one retrospective analysis [14]. Grabowski et al. reviewed a total of 128 patients and analyzed survival outcome according to preoperative and postoperative MRI measures, including postoperative residual T2/FLAIR volume. Postoperative residual T2/FLAIR volume achieved statistical significance in univariate analysis but not in multivariate analysis (p = 0.10); nevertheless, the observed trend was in line with our findings [14]. Recently, Li et al. reported that resection beyond the contrast-enhancing area results in a better prognosis [36]. Our results confirm this finding (Tables 2, 3) and further suggest that the absolute residual volume of the non-enhancing component of GBM stratifies patients into high and low survival groups (Fig. 2). The prognostic value of PRNV is in line with our knowledge that infiltrating tumor cells are present inside the PRNV region; both tumor cells infiltrating deeper into the brain parenchyma and increased secretion of vascular endothelial growth factor cause a higher T2/FLAIR volume (edema/invasion), which in turn leads to more mass effect [22, 37].

Advanced age at the time of diagnosis and decreased functional status were also significantly negatively associated with OS (Tables 2, 3). These findings are in agreement with those of previous studies examining prognostic factors for long-term survival [5,6,7,8,9,10]. On the basis of the results of our univariate analysis, younger age and higher KPS at presentation are associated with improved outcome (HR 1.032 for increasing age; HR 0.800 for increasing KPS). Furthermore, the results of our multivariate Cox regression analysis confirmed that combining PRNV with age could further improve final predictions (Table 3).

Several studies have highlighted that incomplete resection of the enhancing portion of GBM is an unfavorable prognostic factor [7, 14]. Contrary to our expectations, we did not find statistical prognostic significance of residual nodular enhancement, although there was a slight trend towards an unfavorable prognosis (Table 2). We anticipate that this difference arises from the fact that we examined the actual residual volume rather than percentage of resection.

A lower volume of necrosis has been reported as a favorable preoperative survival factor [38]. This finding was not replicated in our study (Table 2); however, this is likely related to our having focused on postoperative imaging features rather than on preoperative features. A stronger explanation for the discordance is that the latter study had more women than men. A recent study found that distinct sex-specific molecular mechanisms drive tumor necrosis and are concordant with differences in survival [39]; women with high versus low necrosis demonstrate statistically significant differences in survival; this difference in survival is not seen in men.

There are certain limitations to our study. First, because of its retrospective nature, there are inherent difficulties in choosing a homogeneous group in which all of the confounding parameters are controlled; for instance, although all patients received chemotherapy and radiotherapy, we expect variations in the dose and duration. However, this inherent limitation is well known and applies to all retrospective studies [40, 41]. Second, because the study was retrospective, important information, such as molecular and genetic characteristics, was not available for review. For example, the effect of IDH1/2 mutations, which are known to be associated with longer OS, was not investigated in this work because IDH1/2 status was not available for our study population. Additionally, tumor location and how it affected the neurosurgeon’s decision regarding the volume to resect were not evaluated. In addition, the EOR and its association with the actual residual tumor volume were not investigated. Finally, for ethical reasons, patients could not be randomly assigned to treatment; a study that deliberately randomized patients to total, subtotal, or partial resection would be unethical. Nevertheless, further studies are needed to determine the exact relationship between pre- and postoperative non-enhancing hyperintense tumor volume. These studies are underway by our research group.

This study showed that a high PRNV is predictive of poor OS. The PRNV, along with the well-studied presence of residual tumor, can serve as prognostic biomarkers that are useable in clinical practice and assist in identifying at-risk patients immediately after surgery. Our results may lead to the conclusion that expanding safe resection, if not adjacent to eloquent brain area, beyond the enhancing tumor border in the area of FLAIR abnormality may improve OS and reduce the risk of recurrence. Studies with larger sample sizes and multicenter participation, which are underway at our institution, are needed to further evaluate the relationship between PRNV and patient mortality, morbidity, and OS.