Gliomas are the most common primary tumors of the central nervous system, accounting for about 45% of all intracranial tumors [1]. Pathologically, gliomas are divided into low-grade gliomas (LGGs, WHO grades I–II) and high-grade gliomas (HGGs, WHO grades III–IV) according to their histological and molecular features [2]. Surgical resection combined with radiotherapy and chemotherapy is still the basic treatment for gliomas. LGGs grow slowly and generally have a favorable prognosis [3], whereas HGGs are more aggressive, with a 5-year relative survival rate of 15–58% for anaplastic astrocytomas and 6–22% for glioblastomas, depending on age at diagnosis and various other prognostic factors [4]. Therefore, accurate preoperative assessment of the pathological grade of gliomas is of great clinical significance for determining the extent of surgical resection and improving patient survival.

During glioma progression, the tumor microstructure (tumor cell density, cell proliferation activity, microvessel density, etc.) changes markedly, reflecting changes in the histopathological characteristics of the tumor [5]. Conventional morphological magnetic resonance imaging (MRI) can estimate the degree of histopathological differentiation based on cytotoxic edema, hemorrhage, necrosis, heterogeneity of signal intensity, and the degree and extent of enhancement. However, studies have shown that the enhancement of gliomas is not completely consistent with tumor grade [6]. Roy et al. [7] reported that the sensitivity of conventional MRI for differentiating high-grade from low-grade gliomas ranged from 55.1 to 83.3%.

With the continuous development of MRI technology, multi-modal MRI has been used to evaluate the biological characteristics of gliomas from different perspectives and has potential value in glioma grading. Among these techniques, arterial spin labeling (ASL) and dynamic susceptibility contrast-enhanced perfusion imaging (DSC-MRI) are perfusion-weighted imaging techniques, while diffusion kurtosis imaging (DKI) is a diffusion magnetic resonance imaging technique. Quantitative parameters such as relative cerebral blood flow (rCBF) in ASL, relative cerebral blood volume (rCBV) in DSC-MRI and mean kurtosis (MK) in DKI are receiving increasing attention in the preoperative grading of gliomas. Compared with low-grade gliomas, high-grade gliomas have a more abundant blood supply, so hemodynamic perfusion parameters increase significantly [8]. Cellular pleomorphism and nuclear polymorphism are more marked in high-grade gliomas than in low-grade gliomas, and the parameters associated with water molecular diffusion are also larger [9].

Most previous studies used either perfusion imaging or diffusion imaging alone to investigate glioma grading, and earlier meta-analyses focused on diagnostic accuracy; meta-analyses of the quantitative parameters of these imaging methods are lacking. Because individual studies had small sample sizes and incomplete parameters, the reliability and repeatability of these techniques remain unclear. Therefore, we conducted a comprehensive, large-sample meta-analysis to resolve the conflicting findings of different studies and to evaluate the diagnostic performance of quantitative perfusion and diffusion parameters in glioma grading.

Materials and methods

Literature retrieval

A thorough search was performed for literature published from 2005 to 2019 on ASL, DSC-MRI or DKI in the grading of cerebral gliomas, using PubMed, Embase, Web of Science, CBM, China National Knowledge Infrastructure (CNKI) and the Wanfang Database. The English search keywords were (astrocytoma or glioblastoma or glioma tumor or astrocytic tumor or gliomas or oligodendroglioma or oligodendroglial tumor) and (DKI or Diffusional Kurtosis or Kurtosis Imaging or kurtosis or DSC-MRI or Dynamic susceptibility contrast-enhanced MRI or Dynamic Susceptibility Contrast or DSC or rCBV or rCBF or ASL or arterial spin-labeling or perfusion or Continuous ASL perfusion or PASL or 3DpCASL or three-dimensional pseudo-continuous arterial spin labeling). To avoid missing relevant documents, electronic and manual searches were combined.

Literature inclusion and exclusion criteria

Inclusion criteria

(1) ASL, DSC-MRI or DKI was used to differentiate gliomas of different grades; (2) at least one of the quantitative parameters rCBF, rCBV and MK could be extracted or calculated from the study; (3) only pathologically confirmed diagnoses were included; (4) all subtypes of gliomas were eligible; (5) the fourfold table values of the diagnostic test (true-positive, true-negative, false-positive and false-negative counts) could be obtained directly or indirectly; (6) the quality evaluation score of the study was at least 9, since high-quality studies are the basis for a reliable meta-analysis.

Exclusion criteria

(1) animal experiments (e.g., in rats); (2) unpublished conference abstracts, comments, and duplicate publications or studies; (3) similar studies by the same author; (4) lack of key data; (5) use of other imaging methods (such as CT or PET).

Data extraction from literature

The basic information extracted included the first author’s name, country, year of publication, patient age, tumor grade, number of cases, scanner type and field strength, journal, methods, sequence and so on. Diagnostic information included sensitivity, specificity, the fourfold table, and the ROC curve with its area under the curve (AUC). If this information could not be obtained directly, it was calculated in RevMan 5.3 [10] from the numbers of HGG and LGG cases and the sensitivity and specificity reported in the study. For articles reporting the sample size together with the median, extrema or quartiles, the methods of Luo et al. [11] and Wan et al. [12] were applied to estimate the sample mean and standard deviation [13, 14].
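The conversion from median-based summaries to a mean and standard deviation can be illustrated with a minimal Python sketch. The formulas below are the range-based and quartile-based estimators as they are commonly quoted from Luo et al. and Wan et al.; they should be checked against the original papers before use, and the function names and example numbers are hypothetical.

```python
from statistics import NormalDist

_NORMAL = NormalDist()  # standard normal, used for the inverse CDF


def mean_from_range(n, minimum, median, maximum):
    """Estimate the mean from min, median and max (Luo et al. style weighting)."""
    w = 4.0 / (4.0 + n ** 0.75)
    return w * (minimum + maximum) / 2.0 + (1.0 - w) * median


def sd_from_range(n, minimum, maximum):
    """Estimate the SD from the range (Wan et al. style)."""
    z = _NORMAL.inv_cdf((n - 0.375) / (n + 0.25))
    return (maximum - minimum) / (2.0 * z)


def mean_from_quartiles(n, q1, median, q3):
    """Estimate the mean from the quartiles and median (Luo et al. style weighting)."""
    w = 0.7 + 0.39 / n
    return w * (q1 + q3) / 2.0 + (1.0 - w) * median


def sd_from_quartiles(n, q1, q3):
    """Estimate the SD from the interquartile range (Wan et al. style)."""
    z = _NORMAL.inv_cdf((0.75 * n - 0.125) / (n + 0.25))
    return (q3 - q1) / (2.0 * z)


# Hypothetical study: 25 patients, rCBF min 1.0, median 3.0, max 5.0
example_mean = mean_from_range(25, 1.0, 3.0, 5.0)
example_sd = sd_from_range(25, 1.0, 5.0)
```

For symmetric summaries such as the hypothetical example, the estimated mean coincides with the median; the estimators differ from the median only when the reported summary is skewed.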

Quality evaluation

Two researchers independently screened the titles and abstracts of the retrieved literature, read the full text of articles that might meet the inclusion criteria, and finally determined whether to include them. Any disagreements, especially on quality assessment, were resolved by discussion with a third senior clinician. All selected studies had been previously published, so ethical review and approval and patient consent were not required.

The Quality Assessment of Diagnostic Accuracy Studies 2 (QUADAS-2) tool recommended by the Cochrane Collaboration was adopted as the evaluation criterion [15]. QUADAS-2 covers the following key domains: patient selection, index test, reference standard, and flow and timing. Each domain was assessed for risk of bias, with signaling questions (yes/no/unclear) to assist the judgment. Each signaling question answered "yes" added 1 point to the score.

Data analysis

Heterogeneity test

Heterogeneity caused by differences in study design, patient age and sex, pathological subtype and other variables is a critical factor influencing the accuracy of the results. The existence of a non-threshold effect was tested with the Q-test and the I2 statistic in RevMan 5.3. I2 < 50% was taken to indicate insignificant heterogeneity, in which case a fixed-effect model was used to pool the statistics; I2 ≥ 50% was taken to indicate substantial heterogeneity, in which case a random-effects model was used. The significance level of the Q-test was P < 0.05.
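The decision rule above can be sketched in a few lines of Python. This is a minimal illustration of Cochran's Q and I2 with inverse-variance weights, not a reproduction of RevMan's internals; the function names and the example effect sizes are hypothetical.

```python
def cochran_q_and_i2(effects, variances):
    """Return (Q, degrees of freedom, I2 in percent) for study-level effects."""
    weights = [1.0 / v for v in variances]          # inverse-variance weights
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    # I2 is the share of variation beyond chance, truncated at zero
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, df, i2


def choose_model(i2_percent, threshold=50.0):
    """Apply the 50% rule stated in the text."""
    return "random-effects" if i2_percent >= threshold else "fixed-effect"


# Hypothetical effect sizes and variances for three studies
q, df, i2 = cochran_q_and_i2([0.2, 1.5, 2.8], [0.04, 0.04, 0.04])
model = choose_model(i2)
```

With these made-up inputs the studies disagree strongly, so the rule selects the random-effects model.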

Meta-analysis

RevMan 5.3 (Cochrane Collaboration, Oxford, UK) was used to calculate the effect size and its 95% CI. The pooled sensitivity, specificity, positive likelihood ratio, negative likelihood ratio, diagnostic odds ratio (DOR) and AUC, with their 95% CIs, were calculated in Stata 13.1, and summary receiver operating characteristic (SROC) curves were constructed.
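For a single study, the accuracy measures listed above follow directly from the fourfold table. The Python sketch below shows the standard definitions; it does not reproduce the pooled estimates, which come from a meta-analytic model. The function name, the 0.5 continuity correction for zero cells and the example counts are assumptions made for illustration.

```python
def diagnostic_metrics(tp, fp, fn, tn, correction=0.5):
    """Sensitivity, specificity, likelihood ratios and DOR from a 2x2 table."""
    if 0 in (tp, fp, fn, tn):
        # Assumed continuity correction so that ratios stay finite
        tp, fp, fn, tn = (x + correction for x in (tp, fp, fn, tn))
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_pos": lr_pos,
        "lr_neg": lr_neg,
        "dor": lr_pos / lr_neg,  # equals (tp * tn) / (fp * fn)
    }


# Hypothetical study: 50 HGGs (45 detected) and 40 LGGs (32 correctly negative)
metrics = diagnostic_metrics(tp=45, fp=8, fn=5, tn=32)
```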

Publication bias

Publication bias was evaluated with Deeks’ funnel plot in Stata 13.1. P > 0.1 was taken to indicate no publication bias.
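As a rough illustration of what a Deeks-type funnel plot asymmetry test does, the Python sketch below regresses the log diagnostic odds ratio on the inverse square root of the effective sample size, weighted by the effective sample size. It is a simplified sketch of the commonly described procedure, not the Stata routine used in this study; the function name, the 0.5 correction applied to every cell and the example tables are assumptions. It returns the slope and its t statistic; converting the t statistic to a P value requires a t distribution, which is left out here.

```python
import math


def deeks_regression(tables):
    """tables: list of (tp, fp, fn, tn). Returns (slope, se, t_stat, dof)."""
    xs, ys, ws = [], [], []
    for tp, fp, fn, tn in tables:
        # Assumed 0.5 correction on every cell to avoid log(0)
        a, b, c, d = (v + 0.5 for v in (tp, fp, fn, tn))
        ln_dor = math.log((a * d) / (b * c))
        n_diseased, n_healthy = tp + fn, fp + tn
        ess = 4.0 * n_diseased * n_healthy / (n_diseased + n_healthy)
        xs.append(1.0 / math.sqrt(ess))
        ys.append(ln_dor)
        ws.append(ess)
    sw = sum(ws)
    x_bar = sum(w * x for w, x in zip(ws, xs)) / sw
    y_bar = sum(w * y for w, y in zip(ws, ys)) / sw
    sxx = sum(w * (x - x_bar) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - x_bar) * (y - y_bar) for w, x, y in zip(ws, xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    dof = len(xs) - 2
    resid = sum(w * (y - intercept - slope * x) ** 2 for w, x, y in zip(ws, xs, ys))
    se = math.sqrt(resid / dof / sxx)
    return slope, se, slope / se, dof


# Hypothetical fourfold tables from four studies
slope, se, t_stat, dof = deeks_regression(
    [(45, 8, 5, 32), (30, 5, 6, 20), (60, 10, 4, 50), (18, 6, 4, 12)]
)
```

A slope close to zero relative to its standard error corresponds to a symmetric funnel plot, which is the pattern interpreted as no evidence of publication bias.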

Sensitivity analysis

The stability of the pooled results was evaluated by removing each study in turn and recalculating the pooled effect of the remaining studies.
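The leave-one-out procedure can be sketched as follows. The pooling step uses the DerSimonian-Laird random-effects estimator, which is an assumption about the model rather than something stated in the text; the function names and the example data are hypothetical.

```python
import math


def pool_random_effects(effects, variances):
    """DerSimonian-Laird pooling. Returns (pooled effect, standard error, I2 %)."""
    w = [1.0 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * e for wi, e in zip(w, effects)) / sw
    q = sum(wi * (e - fixed) ** 2 for wi, e in zip(w, effects))
    df = len(effects) - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0   # between-study variance
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * e for wi, e in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return pooled, se, i2


def leave_one_out(effects, variances):
    """Re-pool after dropping each study; returns (dropped index, pooled, se, I2)."""
    results = []
    for i in range(len(effects)):
        rest_e = effects[:i] + effects[i + 1:]
        rest_v = variances[:i] + variances[i + 1:]
        results.append((i, *pool_random_effects(rest_e, rest_v)))
    return results


# Hypothetical data: the fourth study is an outlier
effects = [1.2, 1.4, 1.3, 3.5]
variances = [0.05, 0.05, 0.05, 0.05]
full = pool_random_effects(effects, variances)
loo = leave_one_out(effects, variances)
```

In this made-up example, I2 is above 90% with all four studies and drops to 0% once the outlying fourth study is removed, which is the kind of pattern a sensitivity analysis is meant to expose.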

Results

Literature retrieval results

Fifty-four studies were selected for inclusion after reading the full text, of which 43 were in English and 11 were in Chinese. Patients included both adults and children. The studies were conducted in the following countries: China (n = 24), India (n = 2), Italy (n = 3), Spain (n = 1), Turkey (n = 1), Sweden (n = 2), Japan (n = 3), Norway (n = 1), the United States (n = 5), Canada (n = 1), Korea (n = 2), France (n = 1), Germany (n = 4), Denmark (n = 1), Belgium (n = 1), Brazil (n = 1) and Australia (n = 1). Seven studies reported two methods. Of the studies providing quantitative data for the continuous-variable forest plots, 20 used ASL, 22 used DSC-MRI and 15 used DKI. Of the studies providing fourfold table data for the meta-analysis of diagnostic tests, 19 used ASL, 19 used DSC-MRI and 16 used DKI. The flowchart of the retrieval process is presented in Fig. 1. The basic information of the included literature is presented in Table 1.

Fig. 1 Flow chart of literature screening and identification process

Table 1 Basic information of included studies

Analysis

rCBF in ASL

Twenty studies assessing the difference in rCBF between HGGs and LGGs were included. The heterogeneity test showed χ2 = 66.79, I2 = 72%, P < 0.001, indicating substantial heterogeneity. Therefore, the random-effects model was applied to estimate the pooled difference in rCBF, which was 1.45 (95% CI: 1.12, 1.77), P < 0.001 (Fig. 2).

Fig. 2 Forest plot of mean difference in rCBF between HGGs and LGGs in ASL. Positive results were observed between HGGs and LGGs

rCBV in DSC-MRI

Twenty-two studies assessing the difference in rCBV between HGGs and LGGs were included. The heterogeneity test showed χ2 = 74.23, I2 = 72%, P < 0.001, indicating substantial heterogeneity. Therefore, the random-effects model was applied to estimate the pooled difference in rCBV, which was 1.37 (95% CI: 1.08, 1.66), P < 0.001 (Fig. 3).

Fig. 3 Forest plot of mean difference in rCBV between HGGs and LGGs in DSC-MRI. Positive results were observed between HGGs and LGGs

MK in DKI

Fifteen studies assessing the difference in MK between HGGs and LGGs were included. The heterogeneity test showed χ2 = 46.39, I2 = 70%, P < 0.001, indicating substantial heterogeneity. Therefore, the random-effects model was applied to estimate the pooled difference in MK, which was 1.57 (95% CI: 1.21, 1.93), P < 0.001 (Fig. 4).

Fig. 4 Forest plot of mean difference in MK between HGGs and LGGs in DKI. Positive results were observed between HGGs and LGGs
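The I2 values reported for the three parameters can be cross-checked from the reported χ2 (Cochran's Q) values and the number of studies, using the standard relation I2 = (Q − df)/Q with df equal to the number of studies minus one. The short Python sketch below performs this check; the function and variable names are illustrative.

```python
def i2_from_q(q, n_studies):
    """I2 in percent from Cochran's Q and the number of pooled studies."""
    df = n_studies - 1
    return max(0.0, (q - df) / q) * 100.0


# (chi-square, number of studies) as reported in the three subsections above
reported = {
    "rCBF (ASL)": (66.79, 20),
    "rCBV (DSC-MRI)": (74.23, 22),
    "MK (DKI)": (46.39, 15),
}
i2_values = {name: round(i2_from_q(q, k)) for name, (q, k) in reported.items()}
```

The rounded results (72, 72 and 70) agree with the I2 values given in the text.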

Diagnostic value

Sensitivity, specificity, positive likelihood ratio, negative likelihood ratio, diagnostic odds ratio (DOR) and area under the curve were summarized for the studies providing fourfold tables (Table 2). rCBF had the highest DOR, 71 (31, 163). The SROC curves showed that rCBF had the highest AUC, 0.95 (0.93, 0.97), followed by MK, 0.93 (0.91, 0.95), and rCBV, 0.91 (0.89, 0.94) (Fig. 5).

Table 2 The values of rCBF, rCBV and MK

Fig. 5 SROC curve for each parameter in the grading of cerebral gliomas. A ASL, B DSC-MRI, C DKI

Gliomas account for about 45% of all intracranial tumors [63]. The Fagan nomograms of rCBF, rCBV and MK in the grading of gliomas are shown in Fig. 6. From a pre-test probability of 45%, the post-test probabilities for rCBF, rCBV and MK increased to 88, 80 and 83%, respectively. The DOR of rCBF was 71 (31, 163), indicating high pooled diagnostic accuracy.

Fig. 6 Fagan nomogram for each parameter in the grading of cerebral gliomas. A ASL, B DSC-MRI, C DKI
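The arithmetic behind a Fagan nomogram is a conversion from pre-test probability to odds, multiplication by the likelihood ratio, and conversion back to a probability. The Python sketch below applies it to rCBV, using the sensitivity and specificity quoted in the Discussion (92% and 81%) to approximate the positive likelihood ratio; the pooled likelihood ratio in Table 2 may differ slightly from this approximation, and the function name is illustrative.

```python
def post_test_probability(pre_test_prob, likelihood_ratio):
    """Bayes' rule in odds form, as read off a Fagan nomogram."""
    pre_odds = pre_test_prob / (1.0 - pre_test_prob)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


sensitivity, specificity = 0.92, 0.81           # rCBV values quoted in the Discussion
lr_pos = sensitivity / (1.0 - specificity)      # approximate positive likelihood ratio
post = post_test_probability(0.45, lr_pos)      # pre-test probability of 45%
```

The result is close to 0.80, consistent with the 80% post-test probability reported for rCBV.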

Meta-regression

The results of the meta-regression are shown in Table 3. Among the five covariates in the ASL studies, region, year of study, number of patients and QUADAS-2 score all contributed significantly to heterogeneity, whereas field strength did not. Among the six covariates in the DSC-MRI studies, including region, year of study, number of patients, field strength and QUADAS-2 score, none had a significant impact on heterogeneity. Among the five covariates in the DKI studies, only region had a significant impact on heterogeneity; year of study, age of patients, number of patients and QUADAS-2 score did not.

Table 3 Meta-regression

Subgroup analysis

Subgroup analysis was successively carried out according to the region and technique in ASL, the region and magnetic resonance field strength in DSC-MRI, and the region in DKI. The results of subgroup analysis are shown in Table 4.

Table 4 Subgroup analysis

Publication bias

Deeks’ test was used to evaluate publication bias for the studies containing fourfold tables. P > 0.1 was taken to indicate no publication bias. Nineteen ASL studies, 19 DSC-MRI studies and 16 DKI studies were eligible for Deeks’ test. Deeks’ funnel plots (Fig. 7) showed no significant publication bias in any group (P = 0.85, P = 0.45 and P = 0.12 for the ASL, DSC-MRI and DKI groups, respectively).

Fig. 7 Funnel plot of publication bias. (a) ASL group; (b) DSC-MRI group; (c) DKI group

Sensitivity analysis

Sensitivity analysis is a key method for assessing heterogeneity and publication bias. We removed each study in turn and calculated the pooled effect of the remaining studies. Comparing this with the pooled effect of all included studies shows the influence of each individual study on the pooled effect. No individual study significantly changed the pooled values of rCBF and rCBV. However, the MK study of Delgado et al. [53] had a significant influence on heterogeneity and publication bias: removing it reduced I2 from 70% to 54% (calculated with RevMan 5.3).

Discussion

This meta-analysis showed that the pooled rCBF, rCBV and MK of HGGs were higher than those of LGGs, and the differences were statistically significant. The specificity of rCBF was the highest among all parameters, suggesting that rCBF has the lowest rate of misdiagnosis. The sensitivity of rCBV was the highest, suggesting that rCBV has the lowest rate of missed diagnosis. The meta-regression showed that many factors contributed to the heterogeneity of the ASL studies, while the DSC-MRI and DKI studies were relatively stable. Although all three MRI techniques included in this study can be applied to grade gliomas, the DOR suggested that rCBF in ASL had the highest diagnostic accuracy.

DSC-MRI perfusion imaging uses an exogenous contrast agent and relies on the acquisition of T2* images. DSC-MRI detects changes in the MR signal as the contrast agent passes through the blood vessels, so the haemodynamic parameter rCBV can indicate microvascular properties such as vascular flow [8]. Compared with LGGs, HGGs have a more abundant blood supply; therefore, rCBV increases significantly, which is consistent with the findings of Winkler et al. [68]. Awasthi et al. [36] observed that microvessel density (MVD) and the positive expression of vascular endothelial growth factor (VEGF) correlated significantly with the pathological grade of gliomas and with the rCBV value. Although the range of rCBV values reported in the literature varies among glioma types, most researchers observed higher rCBV in HGGs [69]. In this meta-analysis, rCBV discriminated HGGs from LGGs with a sensitivity of 92% and a specificity of 81%.

ASL is a completely non-invasive MRI technique that measures blood flow by using magnetically labeled water protons in arterial blood as an endogenous tracer. It is not affected by the integrity of the blood-brain barrier and therefore accurately captures the microcirculation of gliomas, reflecting tumor angiogenesis, so that the glioma grade can be assessed more accurately [70, 71]. rCBF has been widely used to discriminate between LGGs and HGGs. Although ASL suffers from a low signal-to-noise ratio and sensitivity to motion, Cebeci and Luh et al. reported a strong correlation between ASL-derived CBF values and DSC-derived CBF values in brain tumours [17, 72]. Several studies have shown that rCBF from ASL is a robust parameter for grading gliomas, making ASL an alternative to DSC-MRI [73,74,75].

Diffusion kurtosis imaging (DKI), first proposed by Professor Jensen of New York University in 2005, is a technique for exploring the non-Gaussian diffusion of water molecules [76, 77]. It was proposed to characterise the complicated water diffusion in biological tissues more accurately. The most commonly used DKI parameter is mean kurtosis (MK), which provides additional information about tumour heterogeneity. Cellular pleomorphism and nuclear polymorphism are more marked in HGGs than in LGGs. The proliferation of interstitial vessels is also more abundant in HGGs, and thus the MK value is higher [9]. Some studies indicated that MK was higher in HGGs. Raab et al. [61] found that the AUC of MK was 92.3% for differentiating HGGs from LGGs, which is in strong agreement with the findings of this meta-analysis.

Heterogeneity is common in meta-analysis. After excluding the study of Falk Delgado et al., the heterogeneity of MK decreased from 70 to 54%. Since there is moderate heterogeneity in this meta-analysis, clinical decisions based on these results should be made cautiously. Heterogeneity may be caused by the following: (1) imbalance in the distribution of HGGs and LGGs, for instance, grade I gliomas were not studied in some research, which resulted in bias in case selection; (2) different experimental conditions set by researchers, such as different instrument models, parameter settings and post-processing methods; (3) regional heterogeneity resulting from the inclusion of literature from different countries and regions; (4) heterogeneous placement of the region of interest (ROI) and the reference region in the different studies, which may have affected the results.

The main limitations of this study are: (1) this study focused only on the diagnostic value of ASL, DSC-MRI and DKI in distinguishing LGGs from HGGs; their role in follow-up and in each specific pathological grade of gliomas was not discussed; (2) only research published in Chinese and English was included, and the sample size was relatively small; (3) most studies used the WHO classification system without molecular genomic information.

Conclusion

The quantitative parameters rCBF in ASL, rCBV in DSC-MRI and MK in DKI showed excellent diagnostic performance for differentiating HGGs from LGGs. rCBF is a robust parameter for grading gliomas, with an AUC of 0.95, making ASL an alternative to DSC-MRI or DKI.