Introduction

Cervical compressive myelopathy (CCM), a common progressive and degenerative cervical disease, is mainly caused by cervical spondylotic myelopathy (CSM) or ossification of the posterior longitudinal ligament (OPLL). At present, the standard radiological modality for diagnosing CCM is magnetic resonance imaging (MRI), which can not only reveal the degree of spinal cord compression but also depict signal intensity changes in the spinal cord [1,2,3]. The incidence of increased signal intensity (ISI) on T2-weighted MRI is 18.8–65.0% among patients with clinically suspected CCM and 40.8–97.1% in those treated with surgery [4]. ISI reflects various intramedullary lesions such as edema, inflammation, myelomalacia, or gliosis [5]. Although ISI has been studied extensively, its significance remains controversial [6,7,8,9,10].

Currently, multi-grade classification of ISI according to the degree of signal intensity receives more emphasis than the arbitrary binary classification based on the presence or absence of ISI. Various classification methods of ISI have been used in clinical studies, and the most common one was reported by Yukawa et al. [11]: on sagittal T2 images, ISI of the spinal cord at the narrowest level was classified into grade 0, none; grade 1, light (obscure); and grade 2, intense (bright). Some authors reported that, using this classification method, the grade of ISI can be a predictor of surgical outcomes, and that patients with the greatest ISI would be predicted to have the worst prognosis [11,12,13,14]. However, Machino et al. [15] recently conducted a large-scale prospective cohort study and found that this ISI grading classification was not associated with the preoperative severity of myelopathy or with outcomes. Hence, there is no consensus on the predictive value of this ISI grading classification.

To the best of our knowledge, the association between the three-grade classification of T2-weighted ISI and the severity of myelopathy as well as surgical outcomes in patients with CCM has not been systematically evaluated. Therefore, to obtain the best available evidence, a meta-analysis was conducted to compare both the preoperative and postoperative conditions among patients with different grades of ISI.

Materials and methods

Search strategy and data sources

This systematic review was conducted according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement, which is intended to ensure transparent and complete reporting of systematic reviews and meta-analyses [16]. A comprehensive search was performed in three databases (PubMed, Embase, and the Cochrane Library) for literature that used the three-grade classification of T2-weighted ISI (grade 0, none; grade 1, light or obscure; and grade 2, intense or bright) in patients with CCM. The following key words were used: (1) signal intensity OR T2-weighted MR images OR magnetic resonance imaging OR MRI; (2) cervical myelopathy OR degenerative cervical myelopathy OR DCM OR cervical spondylotic myelopathy OR CSM OR ossification of posterior longitudinal ligament OR OPLL OR hypertrophy of the ligamentum flavum OR ossification of the ligamentum flavum; (1) AND (2). All databases were searched up to June 2018. Duplicates were removed, and titles, abstracts, and article contents were screened to determine whether the inclusion/exclusion criteria were met. The reference lists of all retrieved studies were also reviewed to identify potentially relevant studies. All identified articles were assessed by two reviewers independently according to the inclusion and exclusion criteria.

Inclusion criteria

Only studies fulfilling the following inclusion criteria were considered: (1) randomized or non-randomized controlled trials; (2) patients aged 18 or above; (3) patients who were diagnosed with CCM; (4) high-resolution MRI with 1.5 T or higher MRI machines; (5) Yukawa’s classification criteria of ISI; (6) the primary outcome measurements: functional Japanese Orthopedic Association (JOA) or modified JOA (mJOA) score, neurological recovery rate; (7) the secondary outcome measurements: preoperative C2-C7 lordotic angle and preoperative range of motion (ROM).

Exclusion criteria

The exclusion criteria were as follows: (1) patients with cervical myelopathy caused by trauma, infection, or tumor; (2) patients with a previous history of cervical surgery; and (3) case reports, animal research, and review articles.

Data extraction

After searching the electronic databases, Endnote X7 was applied to store all the citations of identified articles. The data relating to the primary and secondary outcomes in all included studies were extracted by two independent reviewers and cross-checked by a third reviewer for data accuracy. Discrepancies were resolved by discussion to reach a consensus. Non-statistical data extracted from the eligible studies included (1) first author, (2) year of publication, (3) location in which the study was performed, (4) number of patients, (5) study design, (6) follow-up periods, (7) patient demographics, and (8) surgical approaches. Statistical data were also extracted, including (9) preoperative and postoperative JOA/mJOA score, (10) neurological recovery rate, (11) preoperative C2–C7 lordotic angle, and (12) preoperative ROM.

Methodological quality assessment and publication bias

The methodological quality of the observational studies was assessed using the Newcastle–Ottawa scale (NOS) [17], as recommended by the Cochrane Collaboration. This scale provides specific criteria to assess the study selection (four items), comparability (two items), and ascertainment of exposure/outcome (three items). The “high”-quality items were scored with an asterisk and the maximum score was nine. A final score of ≥ 6 was considered to represent high quality. If any randomized controlled trials (RCTs) were identified, their risk of bias would be assessed with the Cochrane Collaboration’s tool. Two researchers reviewed and scored each of the studies independently. Any discrepancies were addressed by reevaluation to reach a consensus.
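The scoring rule described above can be sketched in a few lines of Python. This is a minimal illustration only: the class name `NosRating` and its fields are hypothetical, the per-domain caps (4, 2, and 3 stars) follow the item counts given in the text, and the ratings used in the usage note are invented for demonstration rather than taken from any included study.

```python
from dataclasses import dataclass

# Maximum stars per Newcastle-Ottawa domain, following the item counts in the text.
NOS_MAX = {"selection": 4, "comparability": 2, "outcome": 3}
HIGH_QUALITY_THRESHOLD = 6  # a total of >= 6 stars is treated as high quality


@dataclass
class NosRating:
    """Stars awarded to one study (hypothetical container, for illustration)."""
    selection: int
    comparability: int
    outcome: int

    def total(self) -> int:
        # Validate each domain against its cap before summing.
        for domain, cap in NOS_MAX.items():
            stars = getattr(self, domain)
            if not 0 <= stars <= cap:
                raise ValueError(f"{domain} must be between 0 and {cap}, got {stars}")
        return self.selection + self.comparability + self.outcome

    def is_high_quality(self) -> bool:
        return self.total() >= HIGH_QUALITY_THRESHOLD
```

For example, an invented rating of `NosRating(selection=3, comparability=1, outcome=2)` totals 6 stars and would be classed as high quality under this threshold.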

Publication bias was assessed qualitatively using a funnel plot. Significant asymmetry indicates potential publication bias, which may have affected the validity of presented results.

Data analysis

All statistical analyses were conducted using Review Manager 5.3 (Copenhagen: The Nordic Cochrane Centre, The Cochrane Collaboration). To estimate the pooled effects, the weighted mean difference (WMD) and its 95% confidence interval (CI) were calculated because all the outcomes in this review were continuous. Heterogeneity was assessed by the chi-square test and quantified by calculating the I2 statistic. Random-effects or fixed-effects models were used depending on the heterogeneity. When P < 0.10 and I2 > 50%, the heterogeneity was considered statistically significant and a random-effects model was used; otherwise, a fixed-effects model was used.
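The pooling procedure described above can be illustrated with a short Python sketch. It is not the Review Manager implementation. It assumes inverse-variance weighting, Cochran's Q for the chi-square heterogeneity test, and the DerSimonian–Laird estimator of the between-study variance (τ2) for the random-effects model. The function names `chi2_sf` and `pool_wmd` are hypothetical, and the inputs are per-study mean differences with their standard errors.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    t = x / 2.0
    if df % 2 == 0:
        a, tail = 1.0, math.exp(-t)               # closed form for df = 2
    else:
        a, tail = 0.5, math.erfc(math.sqrt(t))    # closed form for df = 1
    # Step up two degrees of freedom at a time until df is reached.
    while a < df / 2.0:
        tail += t ** a * math.exp(-t) / math.gamma(a + 1.0)
        a += 1.0
    return tail


def pool_wmd(effects, std_errors):
    """Pool per-study mean differences and choose the model by heterogeneity."""
    k = len(effects)
    if k < 2 or k != len(std_errors):
        raise ValueError("need at least two studies with matching inputs")

    # Fixed-effect (inverse-variance) estimate.
    w = [1.0 / se ** 2 for se in std_errors]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w

    # Cochran's Q, its P value, and I2 in percent.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1
    i2 = 100.0 * max(0.0, (q - df) / q) if q > 0 else 0.0
    p_het = chi2_sf(q, df)

    # Decision rule from the text: random effects when P < 0.10 and I2 > 50%.
    tau2 = 0.0
    model = "fixed"
    if p_het < 0.10 and i2 > 50.0:
        model = "random"
        c = sum_w - sum(wi ** 2 for wi in w) / sum_w
        tau2 = max(0.0, (q - df) / c)             # DerSimonian-Laird (assumed)
        w = [1.0 / (se ** 2 + tau2) for se in std_errors]
        sum_w = sum(w)

    wmd = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    se_pooled = math.sqrt(1.0 / sum_w)
    return {
        "model": model, "wmd": wmd,
        "ci": (wmd - 1.96 * se_pooled, wmd + 1.96 * se_pooled),
        "q": q, "df": df, "p_het": p_het, "i2": i2, "tau2": tau2,
    }
```

With invented inputs such as `pool_wmd([1.0, 1.2, 0.8], [0.2, 0.2, 0.2])`, heterogeneity is low and the fixed-effects estimate is returned; strongly divergent studies switch the function to the random-effects branch.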

Results

Literature review

Figure 1 summarizes the review flowchart in accordance with the PRISMA statement. The initial search identified 1412 articles in PubMed, 1863 articles in Embase, and 79 articles in the Cochrane Library. After exclusion of duplicate articles, 2197 potential articles remained. Of these, 2139 articles were excluded on the basis of titles and abstracts. The remaining 58 articles were eligible for full-text review. Finally, eight studies [11,12,13,14,15,18,19,20] were included in the meta-analysis based on our inclusion and exclusion criteria.
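The counts in this selection process can be checked with simple arithmetic. The short Python sketch below uses only the figures reported in the paragraph above. The number of duplicates removed and the number of articles excluded at full-text review are not stated in the text; they are derived here by subtraction and should be read as such.

```python
# Counts reported in the text.
identified = {"PubMed": 1412, "Embase": 1863, "Cochrane Library": 79}
after_deduplication = 2197
excluded_on_title_abstract = 2139
included = 8

# Derived quantities (obtained by subtraction, not reported directly).
total_identified = sum(identified.values())
duplicates_removed = total_identified - after_deduplication
full_text_reviewed = after_deduplication - excluded_on_title_abstract
excluded_at_full_text = full_text_reviewed - included

print(total_identified, duplicates_removed, full_text_reviewed, excluded_at_full_text)
```

The full-text review count obtained this way (58) agrees with the figure reported in the text.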

Fig. 1

PRISMA flow chart of studies included in our systematic review

Quality assessment and baseline characteristics

All eight studies [11,12,13,14,15,18,19,20] included in this meta-analysis were non-randomized controlled trials: five were prospective cohort studies and three were retrospective studies. The eight included studies contained 1101 patients in total, among whom 355 patients were classified into grade 0, 414 into grade 1, and 332 into grade 2. The baseline characteristics of the eight studies are summarized in Table 1, and their clinical outcomes are summarized in Supplementary Table 1. The quality of included studies was assessed by NOS. As shown in Table 2, the quality scores of these studies ranged from 6 to 8 (four studies scored 8 points, three studies scored 7 points, and one study scored 6 points), indicating that all were of high quality (Newcastle–Ottawa score ≥ 6).

Table 1 Characteristics of included studies
Table 2 Quality assessment of included studies based on the Newcastle–Ottawa scale

Clinical outcome

Preoperative JOA/mJOA score

There were eight studies (n = 1101 patients; 355 in grade 0 group, 414 in grade 1 group, and 332 in grade 2 group) that provided preoperative JOA/mJOA score with mean ± standard deviation. The preoperative JOA/mJOA score in grade 0 group was higher than other grades and the differences were both statistically significant (grade 0 vs. grade 1: P < 0.001, WMD 1.09 [0.76, 1.42], heterogeneity: χ2 = 7.76, df = 7, P = 0.35, I2 = 10%, Fig. 2a; grade 0 vs. grade 2: P < 0.001, WMD 1.65 [1.27, 2.04], heterogeneity: χ2 = 10.76, df = 7, P = 0.15, I2 = 35%, Fig. 2b). The preoperative JOA/mJOA score in grade 1 group was also statistically significantly higher than grade 2 group (grade 1 vs. grade 2: P = 0.002, WMD 0.60 [0.21, 0.99], heterogeneity: χ2 = 12.25, df = 7, P = 0.09, I2 = 43%, Fig. 2c).
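As a rough consistency check, the test statistic and P value of each comparison can be approximated from the reported WMD and its 95% CI. The Python sketch below assumes a normal approximation, with the CI computed as WMD ± 1.96 × SE. The function name `z_and_p_from_ci` is hypothetical, and because the published values are rounded to two decimals, the reconstructed P value is only approximate.

```python
import math

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal distribution


def z_and_p_from_ci(wmd, lower, upper):
    """Approximate SE, z statistic, and two-sided P value from a WMD and its 95% CI."""
    se = (upper - lower) / (2.0 * Z_95)
    z = wmd / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return se, z, p


# Grade 1 vs. grade 2, preoperative JOA/mJOA score, as reported in the text.
se, z, p = z_and_p_from_ci(0.60, 0.21, 0.99)
print(f"SE = {se:.3f}, z = {z:.2f}, P = {p:.4f}")
```

For the grade 1 vs. grade 2 comparison, the reconstructed P value is of the same order as the reported P = 0.002; the small discrepancy is expected given the rounding of the inputs.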

Fig. 2

Forest plots of the weighted mean difference and their 95% confidence interval (CI) of all studies investigating the preoperative JOA/mJOA score of patients in the subgroups of grade 0, grade 1, and grade 2. a Grade 0 vs grade 1. b Grade 0 vs grade 2. c Grade 1 vs grade 2

Postoperative JOA/mJOA score

There were five studies (n = 837 patients; 254 in grade 0 group, 311 in grade 1 group and 272 in grade 2 group) that provided postoperative JOA/mJOA score with mean ± standard deviation. The differences between postoperative JOA/mJOA score in grade 0 group and the other two groups were both statistically significant (grade 0 vs. grade 1: P < 0.001, WMD 0.98 [0.67, 1.29], heterogeneity: χ2 = 2.53, df = 4, P = 0.64, I2 = 0%, Fig. 3a; grade 0 vs. grade 2: P < 0.001, WMD 2.01 [1.16, 2.87], heterogeneity: τ2 = 0.61, χ2 = 14.10, df = 4, P = 0.007, I2 = 72%, Fig. 3b). The postoperative JOA/mJOA score in grade 1 group was higher than grade 2 group (grade 1 vs. grade 2: P = 0.002, WMD 1.12 [0.42, 1.82], heterogeneity: τ2 = 0.38, χ2 = 10.67, df = 4, P = 0.03, I2 = 62%, Fig. 3c).

Fig. 3

Forest plots of the weighted mean difference and their 95% confidence interval (CI) of all studies investigating the postoperative JOA/mJOA score of patients in the subgroups of grade 0, grade 1 and grade 2. a Grade 0 vs grade 1. b Grade 0 vs grade 2. c Grade 1 vs grade 2

Recovery rate

A total of six studies (n = 984 patients; 317 in grade 0 group, 371 in grade 1 group and 296 in grade 2 group) provided neurological recovery rate with mean ± standard deviation. In these articles, the definition of recovery rate was consistent and was calculated using the formula suggested by Hirabayashi et al. [21] [recovery rate = (postoperative JOA score − preoperative JOA score) / (17 − preoperative JOA score) × 100%]. The neurological recovery rate in the grade 0 group was significantly higher than in the other grades (grade 0 vs. grade 1: P < 0.001, WMD 9.05 [5.20, 12.90], heterogeneity: χ2 = 3.37, df = 5, P = 0.64, I2 = 0%, Fig. 4a; grade 0 vs. grade 2: P < 0.001, WMD 20.52 [12.46, 28.57], heterogeneity: τ2 = 67.19, χ2 = 17.12, df = 5, P = 0.004, I2 = 71%, Fig. 4b). The neurological recovery rate in grade 1 group was also significantly higher than in the grade 2 group (grade 1 vs. grade 2: P = 0.002, WMD 11.70 [4.43, 18.97], heterogeneity: τ2 = 54.48, χ2 = 16.88, df = 5, P = 0.005, I2 = 70%, Fig. 4c).
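The Hirabayashi recovery rate is the achieved gain in JOA score expressed as a percentage of the maximum possible gain. A minimal Python sketch follows; the function name `recovery_rate` and the `full_score` parameter are hypothetical, with the default of 17 taken from the formula above, and the scores in the usage note are invented for illustration.

```python
JOA_FULL_SCORE = 17  # maximum JOA score used in the formula above


def recovery_rate(pre_score, post_score, full_score=JOA_FULL_SCORE):
    """Neurological recovery rate in percent (Hirabayashi formula)."""
    if pre_score >= full_score:
        # The denominator would be zero or negative: no room for improvement.
        raise ValueError("preoperative score must be below the full score")
    return (post_score - pre_score) / (full_score - pre_score) * 100.0


# Invented example: a patient improving from 10 to 14 points.
print(round(recovery_rate(10, 14), 1))
```

In this invented example the patient gains 4 of a possible 7 points, which corresponds to a recovery rate of about 57%.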

Fig. 4

Forest plots of the weighted mean difference and their 95% confidence interval (CI) of all studies investigating the recovery rate of patients in the subgroups of grade 0, grade 1, and grade 2. a Grade 0 vs grade 1. b Grade 0 vs grade 2. c Grade 1 vs grade 2

Preoperative C2-C7 lordotic angle

Four studies (n = 806 patients; 285 in grade 0 group, 287 in grade 1 group and 234 in grade 2 group) measured the preoperative C2-C7 lordotic angle with mean ± standard deviation. Forest plot is in Online Resource 1 (Supplementary Fig. 1). No statistically significant difference was observed in any pairwise comparison (grade 0 vs. grade 1: P = 0.21 > 0.05, WMD − 1.02 [− 2.62, 0.58], heterogeneity: χ2 = 4.66, df = 3, P = 0.20, I2 = 36%, Supplementary Fig. 1A; grade 0 vs. grade 2: P = 0.38 > 0.05, WMD − 0.73 [− 2.35, 0.89], heterogeneity: χ2 = 0.28, df = 3, P = 0.96, I2 = 0%, Supplementary Fig. 1B; grade 1 vs. grade 2: P = 0.52 > 0.05, WMD 0.57 [− 1.14, 2.27], heterogeneity: χ2 = 1.66, df = 3, P = 0.65, I2 = 0%, Supplementary Fig. 1C), indicating that the preoperative C2–C7 lordotic angle was similar among three groups.

Preoperative ROM

Three studies (n = 695 patients; 255 in grade 0 group, 237 in grade 1 group and 203 in grade 2 group) provided preoperative ROM with mean ± standard deviation. The forest plots are in Online Resource 1 (Supplementary Fig. 2). The differences in preoperative ROM were not statistically significant in any pairwise comparison (grade 0 vs. grade 1: P = 0.61 > 0.05, WMD − 1.40 [− 6.87, 4.06], heterogeneity: τ2 = 17.77, χ2 = 9.11, df = 2, P = 0.01, I2 = 78%, Supplementary Fig. 2A; grade 0 vs. grade 2: P = 0.38 > 0.05, WMD 0.90 [− 1.13, 2.94], heterogeneity: χ2 = 4.03, df = 2, P = 0.13, I2 = 50%, Supplementary Fig. 2B; grade 1 vs. grade 2: P = 0.49 > 0.05, WMD − 0.69 [− 2.66, 1.27], heterogeneity: χ2 = 1.05, df = 2, P = 0.59, I2 = 0%, Supplementary Fig. 2C).

Publication bias

Publication bias was assessed by funnel plots (Fig. 5). The plots were generally symmetrical, suggesting no obvious publication bias in this meta-analysis.

Fig. 5

Funnel plots of mean differences and standard errors to assess publication bias. a For preoperative JOA/mJOA score of grade 0 vs grade 1. b For preoperative JOA/mJOA score of grade 0 vs grade 2. c For preoperative JOA/mJOA score of grade 1 vs grade 2. d For postoperative JOA/mJOA score of grade 0 vs grade 1. e For postoperative JOA/mJOA score of grade 0 vs grade 2. f For postoperative JOA/mJOA score of grade 1 vs grade 2. g For neurological recovery rate of grade 0 vs grade 1. h For neurological recovery rate of grade 0 vs grade 2. i For neurological recovery rate of grade 1 vs grade 2

Discussion

ISI on T2-weighted MRI is often observed in patients with CCM [15]. The role of ISI has been extensively studied with regard to its histopathological significance and its effects on the severity of myelopathy and surgical outcomes. ISI reflects a wide range of spinal cord lesions, from mild to severe changes, and is considered a product of necrosis owing to mechanical compression and venous infarction [22, 23].

Many researchers have investigated the association between the presence of ISI and surgical outcomes. Some reported that patients with ISI on T2-weighted MRI had a poor prognosis after surgery [6, 7, 24], whereas others found no such relationship [9, 10, 25]. Although the majority of researchers focused on the presence of ISI on T2-weighted MRI, they did not grade the degree of ISI.

Recently, some studies have classified ISI according to the degree of signal intensity and studied its effects on clinical outcomes. Despite variations in the field strength and performance of MR scanners, sagittal T2-weighted MR images from 1.5 T or higher MRI machines had sufficient resolution for classifying ISI. Various classification criteria of ISI exist, but none has been uniformly adopted. Mehalic et al. [24] classified ISI into five grades, from grade 0 (none) to grade 4 (very intense). However, owing to its complexity and subjectivity, only one other study used this classification method [26]. Yukawa et al. [11] proposed a simpler grading scale (grade 0, none; grade 1, light or obscure; grade 2, intense or bright) in patients with CCM, and this three-grade classification method of ISI was subsequently adopted in many studies. Some authors reported that the three-grade classification of ISI could be a predictor of surgical outcomes [11, 12, 18]; however, others found no significant relationship between ISI grades and surgical outcomes [15, 27, 28]. To date, no agreement has been reached on the predictive ability of this ISI grading scale. Therefore, we conducted a meta-analysis to systematically evaluate how this three-grade ISI classification method relates to the neurological condition and surgical outcomes in patients with CCM.

To evaluate the severity of myelopathy, we extracted the JOA/mJOA score from the original studies for the final meta-analysis, because the JOA/mJOA score is a major tool in the assessment of cervical myelopathy with good reliability and validity [29]. As shown in the pooled effects of preoperative JOA/mJOA score, the value in the grade 0 group was the highest among the three groups, followed by that in the grade 1 group, and the value in the grade 2 group was the lowest. These findings indicated that this three-grade classification of ISI could reflect the severity of cervical myelopathy well, and that the ISI grade was negatively correlated with preoperative neurological status. In the meta-analysis of postoperative JOA/mJOA score, the same trend was observed, indicating that this ISI grading scale was also negatively related to postoperative neurological status. Previous data suggest that light ISI may reflect mild neuropathologic alterations in the spinal cord with greater recuperative potential, whereas intense ISI may reflect severe alterations with less recuperative potential [30]. In our meta-analysis, we also found that a higher ISI grade was accompanied by more severe myelopathy.

To assess the surgical outcome, the neurological recovery rate was used for meta-analysis. The recovery rate in the grade 0 group was the best among the three groups, followed by that of the grade 1 group, and the rate in the grade 2 group was the lowest. These findings further confirmed that this ISI grading scale was negatively associated with surgical outcomes, which is of great clinical significance as timely surgery is important for CCM patients. When patients show a tendency toward increasing ISI grades, early surgical intervention may achieve better outcomes. Although ISI grades may correlate with the severity of myelopathy, clinical decisions should be made on the basis of clinical and radiological presentation.

In addition, we evaluated the preoperative radiological parameters across ISI grades. There was no significant difference in the preoperative C2–C7 lordotic angle or ROM among the three grades; that is, the three-grade classification of ISI was not associated with the preoperative alignment or range of motion of the cervical spine.

Our meta-analysis has certain limitations. Firstly, all studies included were non-randomized controlled trials, so the level of evidence might not be high enough. Secondly, some clinical data were not reported in some studies, which might be a cause of heterogeneity. Thirdly, the number of included studies was limited, so the funnel plots might not be reliable enough for the investigation of potential publication bias. Finally, all studies included in the meta-analysis focused on the evaluation of neurological function but did not assess the patients' quality of life.

Conclusion

In conclusion, our meta-analysis suggests that the three-grade classification of ISI on T2-weighted MRI can reflect the severity of myelopathy and surgical outcomes in patients with CCM. A higher ISI grade is correlated with more severe myelopathy and poorer surgical outcomes. Overall, the three-grade classification of ISI is instructive and should be used universally.