Introduction

Gliomas are the most common type of primary brain tumour in adults with an annual incidence of 4–5/100,000 people. For newly diagnosed glioblastoma (GBM), the most common and malignant of the gliomas, no treatment has yet been shown to be more effective than maximal safe surgical resection followed by chemoradiation and adjuvant chemotherapy with temozolomide [1]. There has been only a modest improvement in outcome over the last 10 years with 3-year survival increasing from about 4% in 1999–2000 to 10% in 2009–2010 [2].

Almost all GBMs will recur and no therapies currently exist that significantly prolong survival at relapse. Progress has been limited both by the paucity of effective treatments and by the inherent difficulties in assessing treatment response in clinical trials. This review will discuss the advances and limitations of quantitative MRI techniques currently being implemented into clinical practice as well as emerging techniques that will become a standard part of clinical trial design over the next decade.

Challenges in monitoring treatment response with conventional MRI

It is essential that imaging is sufficiently accessible, technically reproducible and biologically meaningful to guide treatment decisions and assessment of efficacy of new treatments. Image evaluation with conventional MRI is limited to the assessment of disease bulk and morphology; it lacks information regarding the viability of residual tumour, crucial for response monitoring and the distinction between treatment effects on the brain and underlying disease status.

Although tumour dimensions are measurable within the millimetre range, microscopic infiltration remains occult on conventional imaging. Contrast enhancement is not a specific sign of active tumour, nor do all high-grade gliomas enhance [3]. Whilst harmonisation of imaging protocols can improve comparability, certain variations in technical parameters or patient position are difficult to completely eradicate, and can lead to overdiagnosis of disease progression [4, 5••]. Volumetric imaging is desirable, but may not be universally available, due to equipment or time constraints.

Pseudoprogression

Chemoradiation and radiotherapy of GBM can lead to an increase in tumour volume, oedema and enhancement, known as pseudoprogression (PsP), shortly after completion of treatment, which is often difficult to distinguish from progressive tumour [6, 7]. PsP occurs in 20–30% of patients and is defined as the appearance of new enhancing lesions and/or oedema in the absence of progressive disease (PD). These changes are followed by subsequent improvement or stabilisation without further treatment [8,9,10]. The precise mechanism remains poorly understood. Most cases peak within 3 months of completing chemoradiation, although some present well beyond the 12-week timeline [8]. PsP is thought to be due to a pronounced local tissue reaction involving an inflammatory component, oedema and abnormal vessel permeability. A variety of structural features have proven to be unreliable predictors of PsP, with the exception of subependymal enhancement, which according to one study may represent a relatively specific (93.3%) sign for GBM progression [6]. Studies have shown that there is a higher incidence of PsP in tumours with methylated MGMT and an association with increased survival, perhaps representing an active inflammatory response against the tumour [10,11,12]. PsP is not exclusive to treated GBM and has also been reported following radiotherapy for low-grade gliomas [13].

Pseudoresponse

Pseudoresponse is the radiological phenomenon whereby anti-angiogenic agents, e.g. bevacizumab, can produce early (within 1–2 days) and dramatic reduction of tumour enhancement and peri-tumoural oedema, as a result of reduced vascular permeability to contrast agents, known as vascular normalisation, rather than a true anti-tumour effect [14, 15] (Fig. 1). A subset of patients that initially experience reduction in tumour contrast enhancement subsequently develop non-enhancing tumour progression, best visualised on T2/FLAIR sequences [16, 17].

Fig. 1

Multiparametric MRI of recurrent left frontotemporal GBM, treated with bevacizumab showing early pseudoresponse and vascular normalisation. Each column shows a different imaging biomarker derived from MRI at four timepoints. Column 1 shows long echo time MRSI with a high choline peak which remains stable up to 3 months after treatment. This is consistent with the absence of significant tumour cell kill. Column 2 shows a heterogeneous axial T2W image showing some reduction in the central cystic component with little change in mass effect. Columns 3 and 4 show DWI at b1000 and ADC images. There is no evidence of tumour cell kill. Vascular normalisation would be expected to decrease the extravascular space and so to reduce ADC values but this is not readily evident. Column 5 shows the axial T1W + Gadolinium in the equilibrium phase showing significant reduction of contrast enhancement which appears at the day 10 scan and is maintained for 3 months consistent with reductions in BBB permeability (pseudoresponse). Columns 6 and 7 show perfusion weighted imaging (rCBF and rCBV) showing an early reduction of rCBF and rCBV around the cyst accompanied by increases anteriorly consistent with vascular normalisation. (Figure kindly provided by Prof. Anwar Padhani, Paul Strickland Scanner Centre, Mount Vernon Cancer Centre, UK)

RANO criteria

Therapeutic trials use progression-free survival (PFS) as a surrogate for overall survival (OS), based on radiographic assessment of post-contrast T1-weighted and non-contrast T2-weighted/FLAIR MRI. Because MRI parameters are used to define PFS, phenomena such as pseudoprogression and pseudoresponse can confound assessments of treatment response and lead to erroneous conclusions about the presumed efficacy or lack of efficacy of treatments. In recognition of this, an international working party was convened in 2008 to update the response assessment criteria for high-grade gliomas and produced a paper in 2010 known as the RANO criteria (Response Assessment in Neuro-oncology) (Table 1) [18]. These criteria were refined around the complexities of imaging following the advent of chemoradiotherapy for newly diagnosed GBM and anti-angiogenic agents for recurrent GBM. Prior to this, imaging response assessment tools used the Macdonald criteria [19], which originated in the era of cytotoxic chemotherapy where radiographic findings directly reflected anti-tumour effect.

Table 1 Summary of RANO response criteria (adapted from Wen et al. 2010 [18])

Limitations of RANO criteria

In clinical practice, the RANO criteria allow patients with progressive radiographic findings to continue current therapy pending follow-up imaging: true progression is defined as occurring no sooner than 3 months following completion of chemoradiation, unless there is (i) new enhancement outside the main radiation field or (ii) pathological confirmation of unequivocal tumour progression.

Despite these considerations, the RANO criteria still rely on two-dimensional measurements of enhancing tissue and are operator dependent. These measurements can be challenging when the enhancement has ill-defined margins and/or an irregular shape. Single centre studies have shown promise in using semi-quantitative MR measures such as apparent diffusion coefficient ratios [20], choline, creatine and N-acetylaspartate ratios from MR spectroscopy [21] and perfusion imaging [22] for differentiating PsP from progressive disease (PD) but these techniques have not yet been incorporated into response criteria, as they are not universally available, are difficult to reproduce and increase the overall cost of the imaging examination. They are unlikely to gain widespread acceptance until they have been validated in clinical trials.
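The two-dimensional measurement on which the criteria rely can be illustrated with a short sketch. It covers only the enhancing-lesion component: the full criteria also take T2/FLAIR change, corticosteroid dose, clinical status and new lesions into account, none of which is modelled here. The function names are hypothetical, and the 50% decrease and 25% increase thresholds are quoted from the published criteria [18] and should be checked against Table 1.

```python
def sum_of_products(lesions):
    """Sum of products of perpendicular diameters (mm^2).

    `lesions` is a list of (longest_diameter_mm, perpendicular_diameter_mm).
    """
    return sum(longest * perpendicular for longest, perpendicular in lesions)


def percent_change(reference, current):
    """Signed percentage change of `current` relative to `reference`."""
    if reference <= 0:
        raise ValueError("reference measurement must be positive")
    return 100.0 * (current - reference) / reference


def enhancing_lesion_range(baseline, nadir, current):
    """Simplified label for the enhancing-lesion measurement only.

    Increase is judged against the smallest earlier measurement (nadir),
    decrease against baseline. This is an illustrative simplification.
    """
    if percent_change(nadir, current) >= 25.0:
        return "progression range"
    if percent_change(baseline, current) <= -50.0:
        return "partial response range"
    return "stable range"


# Hypothetical measurements of a single enhancing lesion at three visits.
baseline = sum_of_products([(30.0, 20.0)])   # 600 mm^2
follow_up = sum_of_products([(20.0, 14.0)])  # 280 mm^2
later = sum_of_products([(24.0, 16.0)])      # 384 mm^2

label_follow_up = enhancing_lesion_range(baseline, nadir=follow_up, current=follow_up)
label_later = enhancing_lesion_range(baseline, nadir=follow_up, current=later)
```

Because each product depends on where the reader places two calipers, small differences in the chosen diameters of an irregular lesion can move a case across a threshold, which is the operator dependence noted above.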

There are several therapeutic trials currently underway comparing conventional imaging with advanced MRI techniques to improve treatment response criteria. In the paediatric high-grade glioma population, the HERBY trial is comparing conventional MRI with perfusion/diffusion MRI for response assessment while investigating the practicality of obtaining high quality diffusion/perfusion scans in a multi-centre setting [23]. The recent publication of updated response assessment criteria for the evaluation of patients undergoing immunotherapy (iRANO) [24] demonstrates that imaging response criteria will continually evolve as newer treatments are developed and more sensitive MRI surrogates of tumour burden are evaluated (see below).

Diffusion weighted imaging

Diffusion weighted MRI (DWI) examines the movement of water molecules in brain parenchyma. Any area in which water diffusion is focally impaired (‘restricted’) is displayed as bright signal on the diffusion-weighted image. By mathematically removing unwanted T2 effects, the apparent diffusion coefficient (ADC) map is generated; restriction is confirmed only if the same location shows dark signal on this map. ADC signal in glioma is partly influenced by cellularity, making DWI a technique of great interest for both grading and detecting disease recurrence [25, 26].
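As a minimal sketch of how an ADC value is obtained for a single voxel, the widely used mono-exponential model S(b) = S0 · exp(−b · ADC) can be inverted from two acquisitions. Because both acquisitions share the same T2 weighting, that term cancels in the signal ratio. The function name and the example signal values are hypothetical.

```python
import math


def adc_from_two_b_values(s_low, s_high, b_low=0.0, b_high=1000.0):
    """Mono-exponential ADC estimate (mm^2/s) for one voxel.

    Assumes S(b) = S0 * exp(-b * ADC), with b values in s/mm^2.
    """
    if s_low <= 0 or s_high <= 0:
        raise ValueError("signals must be positive")
    if b_high <= b_low:
        raise ValueError("b_high must exceed b_low")
    return math.log(s_low / s_high) / (b_high - b_low)


# Hypothetical voxel: simulate the b1000 signal for a known ADC, then recover it.
s0 = 1000.0
true_adc = 0.8e-3  # mm^2/s
s1000 = s0 * math.exp(-1000.0 * true_adc)
adc = adc_from_two_b_values(s0, s1000)
```

A voxel with strongly attenuated b1000 signal yields a high ADC, whereas restricted diffusion preserves the b1000 signal and yields a low ADC, which is why bright DWI signal should coincide with dark signal on the ADC map.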

Numerous studies have reported a difference in the ADC values of PsP and PD [27,28,29], but variable ADC thresholds with no clear cutoff have made it difficult to extrapolate beyond a single institution level. It has been proposed that ADC histogram analysis, specifically the 5th percentile value of the cumulative ADC histogram, may be a better predictor of PsP versus PD (with 90% sensitivity and specificity) [30]. At low diffusion gradient strengths (‘b values’), perfusion effects can neutralise the reduced ADC signal of cellular neoplasms. Therefore, obtaining histograms from high b value (b3000) imaging may improve diagnostic power, and for this the 5th percentile of the cumulative ADC histogram achieved superior accuracy compared to the standard b1000 value [31•].
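The histogram metric described above can be sketched as follows. This is an illustration only: the interpolation rule between order statistics is an assumption, the cited studies may have binned their histograms differently, and the ADC values are hypothetical.

```python
def percentile(values, q):
    """q-th percentile (0-100), linearly interpolated between order statistics."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0.0 <= q <= 100.0:
        raise ValueError("q must lie between 0 and 100")
    ordered = sorted(values)
    position = (len(ordered) - 1) * q / 100.0
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


# Hypothetical ADC values (x 10^-3 mm^2/s) from voxels inside an enhancing lesion.
roi_adc = [0.5 + 0.05 * i for i in range(21)]  # 0.50, 0.55, ..., 1.50
adc_p5 = percentile(roi_adc, 5.0)
adc_mean = sum(roi_adc) / len(roi_adc)
```

The 5th percentile summarises the most diffusion-restricted voxels of the lesion, so it is less diluted by oedema or necrosis than the mean ADC of the whole region.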

Diffusion tensor imaging (DTI) gathers data from a greater number (6–256) of diffusion directions than DWI (3 directions). Although used mainly for presurgical tractography, DTI has also been applied to the distinction between PsP and PD [32], but has not been widely validated for treatment monitoring. The DWI in widespread clinical use assumes a Gaussian (normal) distribution of diffusion, as would be the case in free water, and is therefore subject to inaccuracies. Diffusion kurtosis imaging (DKI) is an innovative imaging method that can be calculated from DTI data and describes the deviation of diffusion values in tissue from the Gaussian curve. There is evidence that DKI and related mathematical models can yield information on glioma cellularity and on tissue heterogeneity with a diagnostic precision superior to standard DWI and DTI [33]. The application of DKI in assessing treatment response is discussed below.

In respect of pseudoresponse, enlarging areas of diffusion restriction corresponding to non-enhancing tumour, occurring with anti-VEGF treatment, have long represented a diagnostic dilemma because it was unclear if these represent viable tumour or tissue necrosis due to treatment response. Two comprehensive studies correlating histology with areas of low ADC signal changes have shown that these areas were consistent with aggressive recurrent GBM, either throughout or as a tumour rim. This result was matched by poor outcomes in all subjects [34•, 35].

Perfusion weighted imaging

Perfusion weighted imaging (PWI) exploits the neoangiogenic properties of proliferating gliomas, and is able to identify areas of high-grade tumour with high accuracy [36]. PWI can be performed in a number of ways; the main techniques in clinical practice are dynamic susceptibility-weighted contrast-enhanced MRI (DSC), dynamic contrast-enhanced MRI (DCE) and arterial spin labelling (ASL).

Dynamic susceptibility contrast

DSC represents the most commonly used modality, with gradient echo (T2*) images obtained at rapid intervals during the first pass phase of a gadolinium bolus injection. Relative cerebral blood volume (rCBV) has become the most validated perfusion parameter and reliably correlates with tumour grade, vascularity and focal anaplastic transformation [37, 38]. Using a threshold value of 1.75, DSC has high (95%) sensitivity but relatively low (70%) specificity in distinguishing LGG from HGG, as certain LGG, particularly oligodendrogliomas, may contain areas of high rCBV [39].
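A minimal sketch of how rCBV is commonly derived from a DSC signal–time curve: the signal drop is converted to a change in relaxation rate, ΔR2*(t) = −ln(S(t)/S_baseline)/TE, the area under that curve is taken as proportional to blood volume, and the lesion value is normalised to a reference region such as contralateral white matter. Contrast leakage correction and arterial input functions are deliberately omitted, and the function names, echo time, sampling interval and signal values are all assumptions.

```python
import math


def delta_r2_star(signal, n_baseline, te):
    """Convert a T2*-weighted signal-time curve to delta R2* (1/s)."""
    baseline = sum(signal[:n_baseline]) / n_baseline
    return [-math.log(s / baseline) / te for s in signal]


def area_under_curve(curve, dt):
    """Trapezoidal integration of a uniformly sampled curve."""
    return sum((a + b) / 2.0 * dt for a, b in zip(curve, curve[1:]))


def relative_cbv(lesion_signal, reference_signal, n_baseline=3, te=0.03, dt=1.5):
    """Lesion-to-reference ratio of the area under delta R2* (no leakage correction)."""
    lesion = area_under_curve(delta_r2_star(lesion_signal, n_baseline, te), dt)
    reference = area_under_curve(delta_r2_star(reference_signal, n_baseline, te), dt)
    return lesion / reference


# Hypothetical first-pass curves: the lesion has twice the reference delta R2*.
te = 0.03  # echo time in seconds (assumed)
reference_dr2 = [0.0, 0.0, 0.0, 2.0, 6.0, 4.0, 1.0, 0.0]
reference_signal = [100.0 * math.exp(-te * r) for r in reference_dr2]
lesion_signal = [100.0 * math.exp(-te * 2.0 * r) for r in reference_dr2]

rcbv = relative_cbv(lesion_signal, reference_signal, te=te)
above_quoted_threshold = rcbv > 1.75  # threshold quoted in the text [39]
```

Because the ratio depends on the choice of reference region, the baseline points and the integration window, the same lesion can yield different rCBV values under different processing pipelines.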

DSC studies have consistently demonstrated high accuracy (>90%) in differentiating between PsP and PD [40,41,42]. In a systematic review summarising the evidence for DSC for the PsP/PD indication, the main issue of defining a ‘universal’ rCBV threshold was highlighted [43]. In a recent meta-analysis, the high accuracy of DSC was confirmed (pooled sensitivities and specificities for rCBVmean and rCBVmax in the region of 90%) but only in individual studies with highly variable cutoff values [44••]. Contrast dose, bolus timing and vendor software may all influence DSC signal, probably to a greater extent than interoperator variation, and are challenging to harmonise between institutions [22, 45].
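The dependence of reported accuracy on the chosen cutoff can be illustrated with a toy calculation. The rCBV values below are invented for illustration and do not come from any of the cited studies.

```python
def sensitivity_specificity(positive_values, negative_values, threshold):
    """Sensitivity and specificity when values >= threshold are called positive."""
    true_positives = sum(1 for v in positive_values if v >= threshold)
    true_negatives = sum(1 for v in negative_values if v < threshold)
    return (true_positives / len(positive_values),
            true_negatives / len(negative_values))


# Hypothetical rCBV values for confirmed PD (positive) and PsP (negative) lesions.
pd_rcbv = [2.4, 3.1, 1.9, 2.8, 1.6]
psp_rcbv = [1.2, 1.5, 2.0, 0.9, 1.4]

sens_high, spec_high = sensitivity_specificity(pd_rcbv, psp_rcbv, threshold=1.8)
sens_low, spec_low = sensitivity_specificity(pd_rcbv, psp_rcbv, threshold=1.5)
```

In this toy cohort, lowering the cutoff from 1.8 to 1.5 raises sensitivity at the expense of specificity. A cutoff optimised in one cohort therefore need not transfer to another acquired with a different contrast dose, bolus timing or software.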

Because of these considerations and the fact that it is not practical to biopsy all new lesions with an elevated rCBV, repeat PWI at a short interval can help clarify the situation: indeed, longitudinal trends in rCBV could be more useful than absolute rCBV in distinguishing PsP from PD [22]. Again, a multiparametric approach might be considered here. For example, Wang et al. found high accuracy (area under the curve/AUC 0.91) for a model combining DSC with DTI, which permitted distinction not only between PsP and PD but also between PD and mixed response [46].

The complexity of defining an rCBV threshold for PsP/PD distinction is exacerbated in patients receiving anti-angiogenic drugs, although PWI could still be useful for treatment monitoring [47]. In one study, pretreatment perfusion appeared to be the principal determinant of likely response to bevacizumab in GBM, whereby tumours with higher rCBV values retained this feature during treatment despite a greater anti-angiogenic effect of the drug [48].

Dynamic contrast enhancement

Dynamic contrast enhancement (DCE), or permeability imaging, is a T1-based perfusion technique, which can be a useful adjunct for lesions in which susceptibility artefact from blood products or calcification prevents DSC quantification, or where rCBV values are indeterminate [49]. The pharmacokinetic modelling required for DCE quantification is challenging, whereby a diversity of methods of imaging and post-processing analysis are described in the literature, some with an unclear physiological basis [44••]. This has led to its less widespread use in daily practice. In individual studies, DCE parameters (most frequently the volume transfer constant ‘K trans’) have shown good correlation with tumour malignant potential and for the distinction between PsP and PD: DCE results to date include a higher K trans at baseline as a predictor of worse PFS and OS [48] and a higher K trans for recurrent glioma compared to radiation necrosis [50]. A combination of K trans and plasma volume (Vp) enabled the distinction between PsP and PD with 85% sensitivity and 79% specificity [51••]. Too few studies are available to carry out meta-analysis [44••].
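To make the role of K trans and Vp concrete, the sketch below evaluates the extended Tofts model, one commonly used pharmacokinetic model: Ct(t) = Vp · Cp(t) + K trans · ∫ Cp(τ) · exp(−kep · (t − τ)) dτ, with kep = K trans / ve. It is an assumption that this model is representative; the cited studies may have used other models. The extravascular extracellular volume fraction ve, the plasma curve and all parameter values are hypothetical.

```python
import math


def extended_tofts(cp, dt, ktrans, ve, vp):
    """Tissue concentration curve under the extended Tofts model.

    cp     : plasma concentration samples (mM), uniformly spaced
    dt     : sampling interval (min)
    ktrans : volume transfer constant (1/min)
    ve     : extravascular extracellular volume fraction
    vp     : plasma volume fraction
    """
    kep = ktrans / ve
    ct = []
    for i in range(len(cp)):
        # Rectangle-rule approximation of the convolution integral up to sample i.
        conv = sum(cp[j] * math.exp(-kep * (i - j) * dt) for j in range(i + 1)) * dt
        ct.append(vp * cp[i] + ktrans * conv)
    return ct


# Hypothetical plasma curve sampled every 0.1 min.
cp = [0.0, 5.0, 3.0, 2.0, 1.5, 1.0]
ct_low = extended_tofts(cp, dt=0.1, ktrans=0.05, ve=0.3, vp=0.05)
ct_high = extended_tofts(cp, dt=0.1, ktrans=0.20, ve=0.3, vp=0.05)
ct_no_leak = extended_tofts(cp, dt=0.1, ktrans=0.0, ve=0.3, vp=0.05)
```

With K trans set to zero the tissue curve reduces to the intravascular term Vp · Cp(t), while a higher K trans adds progressively more extravasated contrast. In practice the parameters are estimated by fitting this curve to measured data, which is where the choice of model, arterial input function and fitting method introduces the variability described above.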

Arterial spin labelling

ASL uses magnetically labelled blood and has the advantage of not requiring a gadolinium contrast injection. Because of its low signal to noise ratio (SNR) and need for specialist post-processing, it has not yet featured much in brain tumour clinical imaging [43].

MR spectroscopy

MR spectroscopy (MRS) examines the distribution of chemical metabolites within a chosen volume of brain tissue. It can be performed either as a single voxel technique (SVS) to obtain average metabolite values within the region of interest or through simultaneous analysis of several voxels (multi-voxel spectroscopy/MVS, chemical shift imaging/CSI). SVS is quick and readily available as a standard on most clinical MRI systems, but cannot capture spatial tissue heterogeneity, whereas multi-voxel techniques are more demanding in terms of preparatory steps and data post-processing. Ample evidence exists for the value of MRS in the distinction of glioma from non-neoplastic conditions and for glioma grading [51••], whereby the best validated ratios indicative of tumour are Cho/NAA (choline/N-acetyl-aspartate) and Cho/Cr (choline/creatine) ratios.

The utility of MRS in post-therapeutic imaging is less well validated. A single study analysis seeking to define a threshold for MRS and ADC in distinguishing PD from PsP reported a sensitivity and specificity >95%, but included fewer than 40 patients [52]. Other investigators observed a significant reduction in metabolite ratios (Cho/NAA and Cho/Cr) in 26 HGG patients comparing baseline to 2 months after radiotherapy (RT) which was associated with a worse outcome, possibly explained through reduced cell death (and thus choline liberation) in non-responders [53, 54]. A recent meta-analysis of 455 patients with suspected glioma recurrence after radiotherapy concluded that MRS alone has moderate diagnostic performance in differentiating glioma recurrence from radiation necrosis using Cho/Cr and Cho/NAA metabolite ratios, and strongly recommended its use only in combination with other advanced imaging technologies [55].

Multiparametric imaging

A multiparametric imaging approach may aid PsP/PD distinction. In one study, combining DWI, perfusion and spectroscopy data improved the diagnostic accuracy (from 84–87% for the single modalities to 93%) [56]. A multiparametric (structural, DTI and perfusion MRI) machine learning technique has been developed which can predict sites and severity of future GBM recurrence on preoperative images with >90% sensitivity and specificity [57]. Machine learning studies could be revolutionary, but require computational ‘big data modelling’, which at present confines them to engineering-supported research environments.

Emerging techniques

1. Novel MRI sequences

i. Advanced diffusion techniques

DWI examines the movement of water molecules in biological systems, whereby clinical DWI assumes a normal (‘Gaussian’) distribution of diffusion values, as would be the case in free water. In reality, the complex intracellular and extracellular in vivo environment causes the diffusion of water molecules to deviate significantly from this pattern. DKI is an attempt to account for this variation by providing a more accurate model of diffusion and capturing the true diffusion behaviour as a marker of tissue heterogeneity [58], whereby steep (‘leptokurtic’) curves have been associated with high-grade glioma. Several studies have shown that DKI can differentiate between glioma grades [59, 60]. A recent study investigating DKI and the molecular profile of gliomas has shown that normalised mean kurtosis is significantly lower in tumours with IDH1/2 mutation than in IDH1/2 wild-type tumours [61].
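The deviation from Gaussian diffusion can be written with the standard kurtosis signal model, ln S(b) = ln S0 − b·D + (1/6)·b²·D²·K, where K = 0 recovers the mono-exponential case. The sketch below recovers D and K for one voxel and one direction from three b values in closed form. Clinical DKI fits many b values and directions, so this is an illustration only, with hypothetical function names and parameter values.

```python
import math


def dki_signal(s0, b, d, k):
    """Signal predicted by the diffusion kurtosis model for one direction."""
    return s0 * math.exp(-b * d + (b * d) ** 2 * k / 6.0)


def fit_dki_three_points(s0, s1, s2, b1, b2):
    """Closed-form D and K from signals at b = 0, b1 and b2 (s/mm^2).

    Uses ln(S/S0) / b = -D + c * b, where c = D^2 * K / 6.
    """
    y1 = math.log(s1 / s0) / b1
    y2 = math.log(s2 / s0) / b2
    c = (y2 - y1) / (b2 - b1)
    d = c * b1 - y1
    k = 6.0 * c / d ** 2
    return d, k


# Hypothetical voxel: simulate signals for known D and K, then recover them.
true_d, true_k = 1.0e-3, 0.8
s0 = 1000.0
s1 = dki_signal(s0, 1000.0, true_d, true_k)
s2 = dki_signal(s0, 2000.0, true_d, true_k)
d_fit, k_fit = fit_dki_three_points(s0, s1, s2, 1000.0, 2000.0)

# A mono-exponential fit at b1000 alone underestimates D when K > 0.
adc_b1000 = math.log(s0 / s1) / 1000.0
```

The last line shows why the Gaussian assumption is a source of inaccuracy: with positive kurtosis the signal decays more slowly at high b values, so a mono-exponential ADC absorbs the kurtosis term and falls below the true D.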

Attempts at more accurate modelling of diffusion in tumour tissue have led to the development of a novel technique called VERDICT (Vascular, Extracellular and Restricted Diffusion for Cytometry in Tumours), which fits a model of three primary water components, (i) intravascular, (ii) extracellular-extravascular space (EES) and (iii) intracellular, to DW-MRI datasets acquired with various diffusion times and diffusion weightings [62]. The model allows the extraction of features such as cell size and has recently been applied to gliomas at our institution in a feasibility study. It shows promise in mapping glioma microstructure, which has important implications for presurgical planning and assessing treatment response.

ii. Chemical exchange saturation transfer

Chemical exchange saturation transfer (CEST) MRI is sensitive to the chemical exchange between exchangeable protons on functional metabolite groups (e.g. hydroxyls, amides, amines). The advantages of this technique are that endogenous proteins are used to create imaging contrast rather than exogenous contrast agents and that solute metabolites can be studied at very small concentrations (below what can be visualised with MRS). The majority of CEST imaging studies in GBM have used the amide group to generate CEST contrast, showing that amide proton transfer (APT) can stratify patients by tumour grade [63] and can differentiate between radiation necrosis and active tumour [64•]. A recent study has targeted the amine protons of glutamine for higher CEST contrast and has provided a novel imaging biomarker for mapping regions of low pH within tumour demonstrating a shorter time to progression for acidic lesions compared to tumours with relatively low acidity [65].
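CEST contrast is often quantified with the magnetisation transfer ratio asymmetry, MTRasym(Δω) = [S(−Δω) − S(+Δω)] / S0, which compares the saturated signal on either side of the water resonance. For APT the amide offset is conventionally taken as +3.5 ppm. The sketch below assumes this metric. The cited studies may have used other quantification schemes, and the signal values are hypothetical.

```python
def mtr_asymmetry(z_spectrum, s0, offset_ppm):
    """MTR asymmetry at a given frequency offset.

    z_spectrum : dict mapping saturation offset (ppm) to measured signal
    s0         : unsaturated reference signal
    offset_ppm : positive offset of the exchanging proton pool (ppm)
    """
    if s0 <= 0:
        raise ValueError("s0 must be positive")
    return (z_spectrum[-offset_ppm] - z_spectrum[offset_ppm]) / s0


# Hypothetical saturated signals at -3.5 and +3.5 ppm for one voxel.
z_spectrum = {-3.5: 620.0, 3.5: 590.0}
apt_asymmetry = mtr_asymmetry(z_spectrum, s0=1000.0, offset_ppm=3.5)
```

Saturation transferred from amide protons lowers the signal at +3.5 ppm relative to −3.5 ppm, so a larger positive asymmetry is read as a higher APT effect.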

Creatine has also been investigated for CEST signal and a recent report has shown reduced creatine CEST contrast in tumours compared to normal brain in mouse glioma models that may be able to differentiate tumours of different aggressiveness [66]. Finally, D-glucose has been used as a potential biodegradable MR contrast agent for imaging glucose uptake in tumours and the first report in human glioma patients has recently been published [67]. This group demonstrated that dynamic glucose enhanced imaging is safe and feasible in humans with signal changes due to glucose uptake in vessels, the brain and tumour areas. Further work to establish this GlucoCEST technique is underway which may provide comparable information to FDG-PET.

iii. Hyperpolarised MRI

Whilst conventional proton MRS can inform on steady state levels of cellular metabolites, the emergence of hyperpolarised carbon MRS has also enabled imaging of metabolic fluxes in real time [68]. This technique has been used in early imaging of response to targeted therapies and chemotherapy in GBM. A decrease in the hyperpolarised lactate-to-pyruvate ratio has been observed in vivo in orthotopic GBM models treated with everolimus [69] and temozolomide [70]. The first-in-human hyperpolarised brain MRI study, currently underway at UCSF, aims to detect abnormal tumour metabolism in patients and to differentiate treatment-related changes from tumour recurrence (dana.org).

2. Radiomics

Radiomics refers to the extraction and analysis of large amounts of advanced quantitative imaging data with high throughput from conventional MRI sequences for correlation with clinical outcomes [71]. Standardised semantic feature sets such as the Visually Accessible Rembrandt Images (VASARI) are widely used in GBM radiomic studies and can be reproducibly scored by radiologists (wiki.cancerimagingarchive.net). This feature set comprises 30 semantic features including tumour location, proportion of enhancing tumour and definition of the enhancing margin and has been shown to predict survival and molecular subtype in GBMs [72•]. For radiogenomic studies in GBM, computational features have been derived from conventional MRI sequences (T2W/FLAIR/T1W+C) that correlate with the VASARI semantic feature set [73] and with mRNA expression [74], molecular subtype and survival [75]. In a recent study, radiomic features computed from MRIs of treatment-naïve GBMs were able to distinguish patients with long-term and short-term survival [76]. When applied to the distinction between radiation necrosis and tumour recurrence, texture features which emphasise edge-related differences were able to discriminate between the two. In one study, a machine classifier was superior to radiologists in identifying tumour necrosis in 11 primary and 4 metastatic brain tumour cases [77].

This technique provides reliable non-invasive quantitative measurements from routinely acquired MR brain imaging that could readily be applied in clinical practice to assist radiologists. The challenge for use of radiomics in clinical practice, particularly for multi-centre trials, includes the standardisation of image acquisition protocols across different sites, accurate image-registration and accurate tumour segmentation. Future studies incorporating advanced MRI sequences (such as DWI and MRS) and PET data into radiological datasets for more sensitive feature extraction may provide improved imaging response criteria in future GBM treatment trials.

Assessing treatment response to immunotherapy

There is considerable emerging interest in immune therapies in cancer and a number of phase III trials are underway with vaccines and immune checkpoint inhibitors for GBM. A recent report from our institution showed that the combination of ipilimumab (a monoclonal antibody against cytotoxic T-lymphocyte antigen 4) with bevacizumab was associated with a partial response in 31% of patients and stable disease in another 31%, by RANO criteria [78].

Studies of immunotherapies in other solid tumours have revealed specific issues associated with the radiological assessment of treatment response, reflecting delayed responses or therapy-induced inflammation. In particular, clinical improvement and prolonged survival with visible tumour regression can still occur following initial apparent progression or appearance of new lesions. It has therefore become necessary to refine the RANO criteria for patients undergoing immunotherapy, and iRANO criteria have now been published [24] to provide recommendations for management of patients with early progressive changes seen on imaging after initiation of immunotherapy. These changes can be caused by tumour growth that precedes the development of an anti-tumour immune response or by pseudoprogression associated with an inflammatory immune infiltrate. In such cases, early progressive imaging changes do not necessarily predict a poor outcome. The iRANO criteria therefore permit continuation of treatment in patients who are clinically stable, even if they are progressing radiologically, so that true tumour progression can be confirmed on follow-up imaging.

Conclusions

Despite the plethora of advanced imaging techniques available in most neuro-oncology centres, the assessment of treatment response in clinical trials is still hampered by a number of factors including lack of standardised protocols, the level of post-processing analysis required and the lack of consensus over the most reliable and reproducible imaging modality. Imaging biomarkers must be meaningfully translatable across centres and provide reliable surrogate endpoints for multi-institutional studies and clinical trials.

Further studies are urgently needed to establish reproducibility and applicability of current semi-validated quantitative diffusion and perfusion measures across multiple sites and MRI systems, which in turn will lead to optimisation of patient management and future multi-centre trials.