Introduction

The current standard of care for patients with glioblastoma is the Stupp protocol [1], which consists of three parts: the maximum safe surgical resection of the tumor, followed by radiotherapy (RT) with concomitant and adjuvant administration of temozolomide (TMZ). Because of the infiltrative and aggressive nature of glioblastoma, the tumor usually cannot be completely resected. The residual tumor likely contributes to the inevitability of disease progression [2].

Pseudoprogression (PsP) is a treatment effect, probably involving tissue injury and inflammation, that presents as an area of increasing contrast enhancement on MR imaging and mimics early progressive disease (ePD). It occurs exclusively with RT and adjuvant TMZ, with a reported prevalence between 5.5 and 31% [3].

PsP typically occurs within 3 months of RT but may take up to 6 months to manifest [3]; the increased enhancement usually subsides or remains stable over the ensuing months. PsP does not represent tumor progression and may be associated with a better prognosis [3]. Because of the clinical and economic ramifications of unnecessarily escalating treatment, it is important to reliably distinguish PsP from ePD. Unfortunately, conventional MR imaging and diagnostic criteria, such as the response assessment in neuro-oncology (RANO) criteria, fail to prospectively distinguish between them [3, 4].

The signal intensity (SI) of fluid in the resection cavity on fluid-attenuated inversion recovery (FLAIR) MR images has been reported as an accurate method to differentiate ePD from PsP. Two previous studies assessed the diagnostic accuracy of FLAIR SI in the resection cavities of gliomas to predict and detect ePD; both studies achieved a specificity of 100% with moderate sensitivities of 57% [5] and 34% [6]. The patients included in these studies were treated before the introduction of the Stupp protocol and therefore probably did not include any cases of PsP. Given the challenge of identifying PsP and its unique features, we felt it important to assess whether increased FLAIR SI within the resection cavity is still 100% specific for ePD. Two recent studies have evaluated this sign in patients treated with the Stupp protocol, but neither focused on PsP and none of the reported false positives were attributed to PsP [7, 8]. Numerous paradigms for magnetic resonance imaging (MRI) have been evaluated to distinguish ePD from PsP, with some success. These advanced MRI biomarkers of PsP (compared with ePD) include reduced relative cerebral blood flow on MR perfusion imaging [9], low choline (among other characteristics) on MR spectroscopy [10], and elevated apparent diffusion coefficients on diffusion-weighted imaging [11]. However, increased FLAIR SI in the resection cavity remains incompletely explored as a diagnostic biomarker.

We hypothesized that the tissue changes underlying PsP might also alter the FLAIR SI in resection cavities. We therefore tested the diagnostic accuracy of increased SI within resection cavities on FLAIR MR images for differentiating ePD from PsP in patients with glioblastoma treated with the Stupp protocol.

Methods

Study design and participants

We obtained approval for this retrospective study from our Institutional Review Board. To compare ePD and PsP, we required that patients had a histologically confirmed diagnosis of glioblastoma [12], had undergone surgical resection under the Stupp protocol, and had not received potentially confounding chemotherapy within 6 months of RT. JPA and LAP reviewed 335 consecutive cases and identified 122 patients who had received a standard RT dose of 60 Gy and standard doses of TMZ per the Stupp protocol and had an MRI appearance of either ePD or PsP. Of those, 16 had histologic confirmation of disease and another 9 had long-term clinical follow-up (survival of at least 3 years without change in therapy or re-resection, indicating that the original process was PsP). All included cases of ePD were confirmed with a histologic diagnosis based on repeated resection. The selection criteria used in the study are summarized in Fig. 1.

Fig. 1 Subject selection flowchart

Because ePD versus PsP may not be clear at each exam—in fact, it usually becomes apparent only after some time elapses post surgery—we evaluated individual examinations. For each subject, we included all MRI examinations performed within 6 months of completing RT and before any confounding event (additional neurosurgery, additional RT, new chemotherapy, or collapse of the tumor resection cavity). This resulted in the inclusion of 33 examinations from the ePD group and 37 examinations from the PsP group.

MRI protocol

FLAIR images of the brain were acquired at 1.5 or 3.0 T using a GE (GE Medical Systems, Waukesha, WI) or Siemens (Siemens Medical Solutions, Malvern, PA) MRI scanner. Our practice schedules patients so that each patient is always scanned at the same field strength. The FLAIR sequence parameters were TR, TE, and TI of 11,000, 140, and 2,600 ms, respectively. Imaging was performed in the axial plane with 4-mm-thick contiguous slices acquired in an interleaved fashion.

Image measurements

Two preprocessing steps were applied to all the FLAIR images. The first step included N4 bias correction [13] to correct for field heterogeneity. The second step was gray level standardization of the images. To account for gray level differences, all the images were normalized based on a region of interest (ROI) of uninvolved white matter placed contralaterally to the tumor. The images were then scaled such that the mean value of this white matter ROI was 1.00.
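The gray level standardization step can be sketched as follows. This is a minimal illustration rather than the code used in the study: the image is represented as a nested list of intensities, the ROI as a list of (row, column) pixel coordinates, the function name `normalize_to_white_matter` is hypothetical, and all numbers are made up. N4 bias correction is assumed to have been applied beforehand.

```python
from statistics import mean


def normalize_to_white_matter(image, wm_roi):
    """Scale a 2D image so the mean of the white matter ROI equals 1.00.

    image  : list of rows, each a list of numeric intensities
    wm_roi : list of (row, col) pixel coordinates inside the ROI
    """
    wm_mean = mean(image[r][c] for r, c in wm_roi)
    if wm_mean <= 0:
        raise ValueError("white matter ROI mean must be positive")
    # Divide every pixel by the ROI mean; relative contrast is unchanged.
    return [[value / wm_mean for value in row] for row in image]


# Toy 3 x 3 "slice" with made-up intensities (not study data).
image = [
    [200.0, 400.0, 100.0],
    [220.0, 800.0, 120.0],
    [180.0, 600.0, 110.0],
]
wm_roi = [(0, 0), (1, 0), (2, 0)]  # hypothetical white matter pixels
normalized = normalize_to_white_matter(image, wm_roi)
```

After this scaling, intensities from different examinations are expressed in multiples of the uninvolved white matter signal, which is what makes them comparable across scanners and sessions.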

Regions of interest

ROIs within the resection cavities were created for all exams using MRIcron, a semi-automated ROI drawing tool [14]. The ROIs typically measured 2 × 2 cm, unless the resection cavity was too small to accommodate that size. The ROIs were drawn without knowledge of whether patients were from the PsP or the ePD group, or of the time interval between commencement of RT and the MR examinations. See Fig. 2.

Fig. 2 FLAIR MR image demonstrating selected regions of interest (ROIs) from a 49-year-old female. ROI 1, resection cavity; ROI 2, cerebrospinal fluid; ROI 3, contralateral healthy white matter

The SI of the resection cavity was calculated as the mean value of the resection cavity ROI divided by the mean value of a manually placed cerebrospinal fluid (CSF) ROI.
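A small sketch of this ratio is given below, under the same illustrative assumptions as any toy example: images are nested lists, ROIs are lists of (row, column) coordinates, and the helper names `roi_mean` and `cavity_si_ratio` are hypothetical, not taken from the study.

```python
from statistics import mean


def roi_mean(image, roi):
    """Mean intensity over a list of (row, col) pixel coordinates."""
    return mean(image[r][c] for r, c in roi)


def cavity_si_ratio(image, cavity_roi, csf_roi):
    """Resection cavity SI expressed as a ratio to the CSF SI."""
    csf_mean = roi_mean(image, csf_roi)
    if csf_mean <= 0:
        raise ValueError("CSF ROI mean must be positive")
    return roi_mean(image, cavity_roi) / csf_mean


# Toy normalized slice with made-up values (not study data).
image = [
    [1.0, 1.4, 0.5],
    [1.0, 1.6, 0.5],
]
cavity_roi = [(0, 1), (1, 1)]  # hypothetical resection cavity pixels
csf_roi = [(0, 2), (1, 2)]     # hypothetical CSF pixels
ratio = cavity_si_ratio(image, cavity_roi, csf_roi)  # cavity mean / CSF mean
```

A ratio near 1 would mean the cavity fluid looks like CSF on FLAIR; larger values mean the cavity fluid is brighter than CSF.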

Statistical methods

The area under the ROC curve (AUROC) was used to evaluate FLAIR SI for its ability to differentiate PsP from ePD. The sensitivity and specificity at the optimal threshold obtained from the ROC curve were also calculated. The optimal threshold was selected as the point closest to (0, 1), which would be perfect sensitivity and specificity.
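The following sketch illustrates the two quantities described above on made-up data. It is not the statistical software used in the study. It assumes that ePD is the positive class (label 1) and PsP the negative class (label 0), that an examination is called positive when its SI ratio is at or above the threshold, and that candidate thresholds are the observed values; the function names are hypothetical. The AUROC is computed through its rank (Mann-Whitney) interpretation: the probability that a randomly chosen ePD examination has a higher SI ratio than a randomly chosen PsP examination, with ties counted as one half.

```python
from math import hypot


def auroc(scores, labels):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def closest_to_corner(scores, labels):
    """Return (threshold, sensitivity, specificity) for the ROC point
    nearest to (0, 1), i.e. false-positive rate 0 and sensitivity 1."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best = None
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= t)
        fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= t)
        sens = tp / n_pos
        spec = 1 - fp / n_neg
        dist = hypot(1 - spec, 1 - sens)  # distance to the (0, 1) corner
        if best is None or dist < best[0]:
            best = (dist, t, sens, spec)
    return best[1:]


# Made-up SI ratios (not study data): label 1 = ePD, label 0 = PsP.
scores = [1.2, 2.0, 2.2, 2.5, 3.1, 2.8, 3.5, 4.0, 4.6]
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1]

area = auroc(scores, labels)
threshold, sensitivity, specificity = closest_to_corner(scores, labels)
```

Choosing the point closest to (0, 1) weights sensitivity and specificity equally; other criteria, such as maximizing the Youden index, can select a different threshold on the same curve.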

Results

The ePD group consisted of five females and six males with a mean age of 59.18 ± 7.74 (standard deviation [SD]) years, and the PsP group consisted of eight females and six males with a mean age of 53.54 ± 8.38 (SD) years. The selection criteria we applied resulted in 3.18 ± 0.98 (mean ± SD) examinations included per patient in the ePD group, resulting in a mean total MRI follow-up duration of 206.9 ± 55.7 days (SD), and 3.00 ± 0.87 examinations per patient in the PsP group, resulting in a mean total MRI follow-up duration of 203.1 ± 50.2 days. The delay from the start of RT to the first examination was 76.21 ± 21.13 days (mean ± SD) in the ePD group and 77.36 ± 22.64 days in the PsP group. The time between examinations was 62.87 ± 37.98 days in the ePD group and 52.61 ± 22.10 days in the PsP group. Twenty-seven (27) of the examinations (8 of 25 subjects) were performed at 3T, with the remaining performed at 1.5T. Three of the 11 ePD examinations were performed at 3T, and 8 of the 14 PsP were at 3T. None of these differences was statistically significant.

The computed ROC yielded an AUROC of 0.788 (95% confidence interval [CI] 0.686–0.873). The optimal threshold of differentiation was 2.925. At this threshold, 37 of 70 (53%) examinations had FLAIR SI in the tumor resection cavity greater than 2.925. Of these, 26 were from the ePD group and 11 were from the PsP group. The remaining 33 (47%) examinations were below the threshold. Of these, 7 were from the ePD group and 26 from the PsP group (see Figs. 3 and 4). The corresponding sensitivity and specificity at the optimal cutoff were 0.818 and 0.694, respectively. See Fig. 5.

Fig. 3 FLAIR MR images illustrating signal intensities above the optimal threshold intensity ratio of 2.925 within the resection cavity for a 50-year-old male with confirmed early progressive disease (a) compared with a 54-year-old male with confirmed pseudoprogression (b)

Fig. 4 FLAIR MR images illustrating signal intensities below the optimal threshold intensity ratio of 2.925 within the resection cavity for a 61-year-old female with confirmed early progressive disease (a) compared with a 59-year-old male with confirmed pseudoprogression (b)

Fig. 5 Receiver operating characteristic curves for the combined analysis of all cases of pseudoprogression (PsP) (a), subgroup analysis of the biopsy-confirmed PsP cases (b), and the subgroup analysis of the clinically diagnosed PsP cases (c)

Subgroup analysis of biopsy-confirmed cases of PsP yielded an AUROC of 0.856 (95% CI 0.769–0.932), a sensitivity of 0.667, and a specificity of 0.957 at an optimal threshold of 4.147. When inclusion was restricted to clinically confirmed cases of PsP, subgroup analysis resulted in an AUROC of 0.667 (95% CI 0.490–0.819), a sensitivity of 0.788, and a specificity of 0.615 at an optimal threshold of 2.805. Results are summarized in Table 1.

Table 1 Summary of main results including area under the curve (AUC) with the 95% confidence intervals (CIs), sensitivity, specificity, and number of exams above and below the optimal threshold of differentiation for all subjects (11 biopsy-confirmed early progressive disease [ePD] cases with 14 biopsy-confirmed and clinically diagnosed pseudoprogression [PsP] cases) and for the biopsy-confirmed subgroup analysis (11 biopsy-confirmed ePD cases with 5 biopsy-confirmed PsP cases)

Discussion

We found lower accuracy of the increased FLAIR SI sign for differentiating ePD and PsP in glioblastoma patients treated with the Stupp protocol than previously described. In our study group, this sign achieved a sensitivity of 82% and a specificity of 69% for detecting ePD. These results demonstrate that FLAIR SI within resection cavities may be increased in both ePD and PsP. In those patients with biopsy-proven diagnoses, the specificity of this sign was much higher, but we caution that this is a smaller population with few cases of PsP with which to evaluate specificity. The long-term follow-up of PsP subjects who did not undergo biopsy allows us to reasonably presume that there was no recurrent tumor, even when the FLAIR SI in the resection cavity was increased. Therefore, our results suggest that the high specificity reported in the past is lost when patients are treated with the Stupp protocol.

A limitation of our study is that we do not have biopsy confirmation for all subjects; the number of PsP subjects in the biopsy group is too small to ascribe the high specificity found in their cases to detectable SI differences. This likely occurred because only those subjects with a strong clinical case for progression underwent biopsy, while those suspected to have PsP were managed with close observation and, thus, their lesions were infrequently biopsied. Lack of consistency in our reference standard is also a limitation, as the methods may differ in certainty of diagnosis. We could have had more subjects in the study, but only by including those with less clear tumor status; for example, those who were converted to other agents and survived a long period may have had recurrence but may also have had PsP. Additional limitations of the present report are the retrospective nature of the analysis and the inclusion of MRI examinations acquired at both 1.5 and 3.0 T field strengths.

Both earlier groups of investigators, reporting data from the era predating the use of TMZ, noted an extremely high specificity of 100% for increased FLAIR SI in detecting ePD [5, 6]. These reports of the high specificity of increasing FLAIR SI in patients with progressing glioblastoma are of great potential value, particularly given the new therapies that make differentiation between ePD and PsP more challenging.

Two recent studies reported data from patients treated with the Stupp protocol [7, 8]. The first reported a specificity of 80%, but only had five negative cases, and the one false positive was not attributed to PsP [7]. The second study reported a specificity of 70.6% and attributed all false positives to bleeding or infection [8]. From this available evidence, the effect of PsP on FLAIR SI within the resection cavity remained unclear. Our findings contribute new knowledge of the accuracy of this sign by demonstrating that PsP also has increased FLAIR SI.

Diminished specificity in the era of the Stupp protocol also provides valuable insights into the mechanism of increased FLAIR SI within brain tumor resection cavities. The supposition Winterstein et al. made was that the cause of the increasing signal was protein trapping and that this trapping happened only when tumors were growing [6]. Protein trapping should not be affected by TMZ and the Stupp protocol (which was not standard practice at the time the data for [6] were collected). Because increased FLAIR SI occurs mostly in contrast-enhancing lesions, whether ePD or PsP, we suggest that the mechanism may be related to leakage of proteins from surrounding brain tissue into the resection cavity. It is not as simple as a leaking blood-brain barrier, as patients with stable contrast-enhancing tumors should also exhibit this finding, as confirmed by Winterstein et al. [5]. However, the more active exudation of plasma, which also produces a mass effect, might explain the difference.

Differentiating ePD from PsP is a critical problem that impacts therapeutic options. The fact that FLAIR SI is no longer a specific sign of progression places more demands on the use of advanced MRI for this task. Perfusion MRI has been heavily studied for the task of identifying ePD [15, 16].

While advanced MRI techniques have shown promise for distinguishing ePD and PsP, they are not perfect predictors. Further improvement in these techniques is needed.

Conclusion

An imaging biomarker that separates true disease progression from treatment effect is needed. Accurate diagnosis of true progression is important to ensure this group receives the best available treatment, while accurate diagnosis of pseudoprogression may prevent unnecessary treatment and treatment-related adverse effects. We had hoped that the FLAIR signal in the resection cavity would continue to be a valuable biomarker in patients treated with the Stupp protocol, but our study suggests this is not the case. While increased FLAIR SI in the resection cavity of a glioblastoma is highly specific for disease progression in a patient not treated with the Stupp protocol, the specificity is substantially decreased when the Stupp protocol is employed. This makes the use of more advanced imaging methods, such as perfusion and diffusion MRI, critical for the assessment of tumor response in glioblastoma.