Introduction

A number of scales are available to assess functional disability in patients with cervical spondylotic myelopathy (CSM) [4, 8–12]. The two most commonly used are the Nurick grade and the modified Japanese Orthopaedic Association (mJOA) scoring system (Tables 1, 2). It is unclear whether these two functional scoring systems, especially the lower limb component of the mJOA (llmJOA) and the Nurick grade, provide similar evaluations of patients with CSM. Both the Nurick grade and the llmJOA assess ambulatory status. Thus, it would be rational to presume that the llmJOA score would correlate with the Nurick grade better than the total mJOA (tmJOA) score does. If the correlation between the Nurick grade and the tmJOA or the llmJOA were perfect, then perhaps the Nurick grading system would be redundant. On the other hand, if the two instruments complement each other, then it would be justified to use both of them in a given patient. Only a few studies have compared Nurick grading and mJOA score in evaluating functional disability and outcome in patients with CSM [12]. Even these have evaluated the correlation in a small number of patients or in patients undergoing different types of decompressive surgery.

Table 1 Nurick grades [7]
Table 2 Modified Japanese Orthopaedic Association scoring system [1]

We performed a retrospective analysis of prospectively collected data of 93 patients who underwent uninstrumented central cervical corpectomy (CC) for CSM, in an attempt to determine the correlation between Nurick grade and mJOA both before and after surgery.

Clinical materials and methods

We prospectively gathered data on 93 patients with CSM who underwent uninstrumented CC with autologous bone grafting between 1998 and 2008. There were 87 men and 6 women. The mean age (±SD) of the patients was 48.4 ± 9.3 years (range 31–69 years). The mean duration of symptoms (±SD) was 22.9 ± 33.5 months (range 1–180 months). The mean duration of follow-up (±SD) was 18.9 ± 12.4 months (range 7–77 months). The first follow-up was between 6 and 18 months following surgery. The preoperative and postoperative Nurick grade, tmJOA score and llmJOA score of each patient were documented.

An uninstrumented CC using either iliac bone graft or a fibular graft (for 3-level CC) was performed. All patients except those undergoing 3-level CC were mobilized in the immediate postoperative period with a Philadelphia collar that was prescribed for 6 months. Patients who underwent 3-level CC with fibular grafts were mobilized within a few days to a week postoperatively. The surgical technique has been described in previous publications [8].

Preoperative tmJOA score based grading of severity

The severity of functional disability was graded based on the preoperative tmJOA score as (1) mild, tmJOA score >12; (2) moderate, tmJOA score 9–12; and (3) severe, tmJOA score <9.
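The three cut-offs above can be written as a small lookup. This is only an illustrative sketch: the function name is hypothetical, and the 0–18 range check is an assumption based on the maximum tmJOA score of 18 used in the recovery-rate formula, not something stated in this section.

```python
def severity_from_tmjoa(tmjoa: int) -> str:
    """Return the severity group for a preoperative total mJOA score."""
    # Assumed valid range: 0 (worst) to 18 (normal).
    if not 0 <= tmjoa <= 18:
        raise ValueError("tmJOA score must lie between 0 and 18")
    if tmjoa > 12:
        return "mild"      # tmJOA score >12
    if tmjoa >= 9:
        return "moderate"  # tmJOA score 9-12
    return "severe"        # tmJOA score <9


if __name__ == "__main__":
    for score in (6, 9, 12, 13, 17):
        print(score, severity_from_tmjoa(score))
```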

Statistical analysis

The pre- and postoperative Nurick grade and tmJOA score and llmJOA score were compared and the correlation between the Nurick grade and mJOA scores was analysed. After entering the data into an Excel spreadsheet, analysis was done using commercially available software (SPSS version 16.0, SPSS Inc.). The Wilcoxon signed-rank test was used to analyse whether the postoperative changes in Nurick grade, tmJOA scores and llmJOA scores were significant. Postoperative recovery rates were calculated using the following formulae [4, 9]:

$$ \begin{aligned} {\text{NGRR}} & = ({\text{preoperative Nurick grade}}-{\text{follow-up Nurick grade}}) \times 100/{\text{preoperative Nurick grade}} \\ {\text{tmJOARR}} & = ({\text{follow-up tmJOA}}-{\text{preoperative tmJOA}}) \times 100/(18 - {\text{preoperative tmJOA}}) \\ {\text{llmJOARR}} & = ({\text{follow-up llmJOA}}-{\text{preoperative llmJOA}}) \times 100/(7 - {\text{preoperative llmJOA}}) \end{aligned} $$
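As a worked illustration of the three recovery-rate formulae, the following minimal sketch computes them for a single hypothetical patient. The function names are illustrative, and returning `None` when the denominator is zero (preoperative Nurick grade 0, or a preoperative mJOA score already at its maximum) is an assumption; the text does not state how such cases were handled.

```python
from typing import Optional

TMJOA_MAX = 18   # maximum total mJOA score used in the formula
LLMJOA_MAX = 7   # maximum lower limb mJOA score used in the formula


def nurick_recovery_rate(pre: int, follow_up: int) -> Optional[float]:
    """NGRR = (pre - follow-up) x 100 / pre; a lower Nurick grade is better."""
    if pre == 0:
        return None  # assumption: rate undefined when there is no deficit
    return (pre - follow_up) * 100 / pre


def mjoa_recovery_rate(pre: int, follow_up: int, max_score: int) -> Optional[float]:
    """mJOARR = (follow-up - pre) x 100 / (max - pre); a higher score is better."""
    if pre == max_score:
        return None  # assumption: rate undefined when the score is already maximal
    return (follow_up - pre) * 100 / (max_score - pre)


if __name__ == "__main__":
    # Hypothetical patient, not taken from the study data.
    print(nurick_recovery_rate(3, 2))               # one-grade improvement from 3
    print(mjoa_recovery_rate(12, 16, TMJOA_MAX))    # tmJOA 12 -> 16
    print(mjoa_recovery_rate(4, 6, LLMJOA_MAX))     # llmJOA 4 -> 6
```

For this hypothetical patient the Nurick recovery rate is one third (about 33%), while both mJOA recovery rates are two thirds (about 67%), which shows how the same clinical change can yield different rates on the two scales.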

Spearman’s rho was used to determine the strength of association between Nurick grade, tmJOA and llmJOA scores, changes in these scores after surgery and the respective recovery rates. Scatter plots were drawn for categorical change in Nurick grade following surgery against corresponding tmJOA and llmJOA recovery rate as well as their categorical change. A p value of < 0.001 was taken to indicate statistical significance.
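For readers unfamiliar with Spearman’s rho, the sketch below shows one standard way to compute it: rank both variables (tied values share the average rank, which matters for ordinal scales with few levels) and take the Pearson correlation of the ranks. This is a plain-Python illustration, not the SPSS procedure used in the study, and the example values are invented.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho as the Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / (var_x * var_y) ** 0.5


if __name__ == "__main__":
    # Invented recovery rates for five hypothetical patients.
    ngrr = [0.0, 33.3, 33.3, 50.0, 100.0]
    llmjoarr = [0.0, 25.0, 50.0, 66.7, 100.0]
    print(round(spearman_rho(ngrr, llmjoarr), 3))
```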

Results

Correlation between Nurick grade and mJOA scores (Table 3)

The mean preoperative Nurick grade (±SD) was 3.3 ± 0.99 (range 0–5). The mean preoperative tmJOA score (±SD) was 11.97 ± 2.6 (range 6–17). The mean preoperative llmJOA score (±SD) was 4.2 ± 1.4 (range 2–7). The correlation between Nurick grade and llmJOA (Spearman’s ρ 0.901) was better than with tmJOA (0.846) preoperatively.

Table 3 Preoperative functional scores: Correlation between tmJOA and llmJOA scores and Nurick grades

The mean follow-up Nurick grade (±SD) was 1.8 ± 1.0 (range 0–4). The mean follow-up tmJOA score (±SD) was 15.8 ± 2.2 (range 9–18). The mean follow-up llmJOA score (±SD) was 5.8 ± 1.3 (range 3–7). The correlation between llmJOA and Nurick grades (Spearman’s ρ 0.886) was better than with tmJOA (0.862) at follow-up.

Correlation between NGRR and mJOARR

The mean Nurick grade recovery rate (±SD) was 45.8 ± 27.7 (range 0–100). The mean tmJOA recovery rate (±SD) was 65.2 ± 31.1 (range 0–100). The mean llmJOA recovery rate (±SD) was 59.9 ± 38.4 (range 0–100). The correlation between Nurick grade recovery rate and llmJOA recovery rate (Spearman’s ρ 0.840) was better than with tmJOA recovery rate (0.793) (Figs. 1, 2).

Fig. 1 Scatter plot of total mJOA recovery rate versus Nurick grade recovery rate

Fig. 2 Scatter plot of lower limb mJOA recovery rate versus Nurick grade recovery rate

Correlation in different grades of myelopathy

Across all grades of myelopathy, llmJOA scores correlated better than tmJOA scores with the preoperative Nurick grade, the follow-up Nurick grade and the change in Nurick grade after surgery. However, in the mild myelopathy group, preoperative tmJOA scores correlated better with Nurick grade than the llmJOA scores did. Overall, the correlation between mJOA scores and Nurick grades, and between their recovery rates, was better for the moderate myelopathy group than for the mild or severe myelopathy groups (Table 4).

Table 4 Correlation between Nurick grade and mJOA scores, changes in them and their recovery rates in different grades of myelopathy (Spearman’s rank correlation coefficient)

Correlation between changes in Nurick grade and mJOA scores

The percentage of patients who improved in their Nurick grade, tmJOA and llmJOA scores was 83.8 (78/93), 94.6 (88/93) and 78.5 (73/93), respectively.

The correlation between change in tmJOA and llmJOA scores and the Nurick grade at follow-up is shown in Table 5. Categorical change in Nurick grade correlated better with change in the llmJOA score (Spearman’s ρ 0.737) than with change in the tmJOA score (0.679). No patient worsened on any of the scales at follow-up. The degree of agreement between outcome assessed at follow-up by Nurick grade and llmJOA (88.2%) was also better than that between Nurick grade and tmJOA (87%) (Tables 6, 7).

Table 5 Correlation between change in tmJOA and llmJOA scores and Nurick grade at follow-up
Table 6 Outcome by Nurick grade and tmJOA score (n = 93)
Table 7 Outcome by Nurick grade and llmJOA score (n = 93)

Discrepancy between changes in Nurick grade and mJOA scores

Among the patients with discordant results, those who improved in Nurick grade without improving in llmJOA (8/11 patients) outnumbered those in whom an improvement in llmJOA score did not translate into a similar change in the Nurick grade (3/11 patients) (Table 7). Of the 8 patients with no improvement in llmJOA, 6 had improved from Nurick grade 3 to Nurick grade 2 (the preoperative llmJOA score was 6 in one patient, 5 in three and 4 in the other two), and 2 had improved from Nurick grade 1 to 0. These two patients had no dysfunction on llmJOA preoperatively (score of 7).
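The agreement figure for Nurick grade versus llmJOA can be cross-checked from the counts reported in the text alone (93 patients, 78 improved by Nurick grade, 73 improved by llmJOA, 8 and 3 discordant). The sketch below reconstructs the implied 2 × 2 cross-tabulation and the percentage agreement; it is derived arithmetic under those counts, not a reproduction of Table 7, and the function names are illustrative.

```python
from typing import Dict


def two_by_two(n_total: int, improved_a: int, improved_b: int,
               a_only: int, b_only: int) -> Dict[str, int]:
    """Rebuild a 2x2 improved/not-improved table from marginal and discordant counts."""
    both = improved_a - a_only
    if both != improved_b - b_only:
        raise ValueError("marginal and discordant counts are inconsistent")
    neither = n_total - both - a_only - b_only
    if both < 0 or neither < 0:
        raise ValueError("counts imply a negative cell")
    return {"both": both, "a_only": a_only, "b_only": b_only, "neither": neither}


def percent_agreement(table: Dict[str, int]) -> float:
    """Share of patients classified the same way by both scales."""
    n = sum(table.values())
    return (table["both"] + table["neither"]) * 100 / n


if __name__ == "__main__":
    # a = Nurick grade, b = llmJOA; counts taken from the Results text.
    table = two_by_two(n_total=93, improved_a=78, improved_b=73, a_only=8, b_only=3)
    print(table)
    print(round(percent_agreement(table), 1))
```

Under these counts, 70 patients improved on both scales and 12 on neither, giving agreement in 82 of 93 patients (88.2%) and disagreement in 11 (11.8%), consistent with the figures quoted in the text.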

Discussion

Previous reports of comparison between Nurick grade and mJOA scores

Though Nurick grade and mJOA score represent different functional capabilities, few studies have performed an in-depth analysis and comparison of the domains assessed by these two scoring systems. Several authors report results of decompressive surgery using either Nurick grade or mJOA score [2, 3, 5, 12]. Only a few have reported patients’ functional status using both functional scales [12].

In a retrospective study of 43 patients with CSM who underwent anterior decompression, Vitztum et al. [12] showed that there was good correlation between the preoperative and postoperative scores using the Nurick scale and JOA scoring system. This finding is similar to that of our study. However, a wide variation was noted in the outcome following surgery with only 33% of patients improving their Nurick grade whereas 81% had improved their JOA score. The authors attributed this difference to an improvement in upper limb and sensory function which is not reflected in the Nurick score. Interestingly, in our study, the variation in percentage of patients who showed an improvement in the Nurick grade, tmJOA and llmJOA scores was less, ranging from 78.5 to 94.6% for llmJOA and tmJOA scores, respectively, with the percentage of patients determined to have improved their Nurick grade being in between at 83.8%. Although Vitztum et al. [12] found a significant difference in the mean postoperative recovery rates assessed by the Nurick grade (23%) and JOA scoring system (37%), the variation was less than that noted for the percentage of patients showing improvement in the different functional scales. Hence, they concluded that to enable comparison across studies that have used CSM specific functional scales, it is more appropriate to report the mean recovery rate rather than the proportion of patients showing an improvement in the functional scores. The authors commented that Nurick scale is less sensitive to improvement following surgery and recommended the continued use of multiple scoring systems in assessing outcome following surgery in patients with CSM. A detailed analysis of the various components of individual scores (such as the llmJOA scores), however, was lacking in their study. Our results show that the variation in the mean recovery rates was less than that of the proportion of patients showing an improvement in the different functional grading systems.

Causes of discordance

The lack of significant correlation between preoperative llmJOA scores and Nurick grade in the mild myelopathy group can be explained by the impact of the disability in the lower limbs on the occupation which would have been reflected in a more severe Nurick grade in those whose employment was affected even with mild lower limb functional impairment. Similarly, the lack of correlation between preoperative tmJOA and Nurick grade in the severe myelopathy group is probably due to the poor upper limb or bladder function or sensory impairment in the setting of a moderate lower limb functional impairment, which resulted in low JOA scores but not in poor Nurick grades.

In our series, there was disagreement in assessment of outcome at follow-up between the Nurick grade and tmJOA in 12.9% (12 out of 93) of patients and between the Nurick grade and llmJOA score in 11.8% (11 out of 93) of patients. The discrepancy between Nurick grade and tmJOA is expected, as the functional domains assessed by the Nurick grade and mJOA vary significantly. King et al. [6], in their study validating SF-36 in CSM, concluded that the Nurick score and the leg score of the mJOA assess similar domains. If Nurick grade and llmJOA both assess ambulatory function, there should be minimal or no discrepancy between the scores obtained using the two scales at follow-up. The only flaw in this argument is the fact that Nurick grade assesses the employability of a patient along with ambulatory status. An improvement in ambulatory function may not be sufficient for a person to regain his/her ability to go to work. Hence, a positive change in llmJOA score may not be reflected in a similar change in the employment status as evidenced by change in the Nurick grade. Theoretically, this should result in more patients showing an improvement in llmJOA at follow-up than in their Nurick grades. However, our results were contrary to this proposition.

The only explanation for the improvement from a preoperative Nurick grade 3 to a follow-up Nurick grade 2 in 6 patients, without a corresponding improvement in the llmJOA, is that these patients returned to work in spite of no improvement in their ambulatory status.

Disconnect between employment and ambulation

Nurick grade was devised primarily to assess the ambulatory status of a patient and reflect its effect on employment. The premise was that an improvement in ambulatory status should translate into a similar change in the employment status of a person with CSM. This, however, does not hold true in our subset of patients, as evidenced by the discordance between changes in llmJOA and Nurick grade at follow-up. The reasons for this apparent discrepancy may be manifold. One possibility is that significant improvement in lower limb function may not be necessary to remain employed in certain occupations. An example would be a grocery shop owner who only needs to sit at the cash counter and manage it without having to move around. The other reason would be the nature of compensation in the absence of work. If a patient does not get any unemployment benefits, as happens in most developing countries, his economic needs may force him to work in spite of less than ideal ambulatory function.

In other words, though Nurick grade can indicate whether a person is employable or not, it does not indicate whether an improvement from Nurick grade 3 to 2 is secondary to improvement in the ambulatory status. We are not suggesting that Nurick grade should no longer be used in outcome assessment, but that it should be supplemented with other functional scoring systems in assessment of outcome following decompressive surgery. The main advantages of the Nurick grading system are its simplicity and ease of use. It is also unique in providing a gross idea of the ambulatory and employment status of a patient at ‘one glance’ that cannot be acquired by any of the so-called “composite, all inclusive” scales for CSM. Although both Nurick grading and llmJOA score test ambulatory function, there are separate domains that are assessed by these scoring systems. Thus, it would be appropriate to use both these scoring systems in the assessment of preoperative disability and postoperative outcome in patients with CSM. The nature of employment, the consequences of losing the job and the provision of compensation or lack thereof play a complex role in the decision of a patient to return to his/her employment after surgery. Also, the ability to continue daily practices such as squatting and sitting cross-legged affects the outcome from the patients’ point of view. We need to evolve scoring systems that take into consideration these culture-specific needs.

Utility of QOL instruments in CSM

More recently, it has been advocated that assessment of patients with a chronic disease such as CSM should be more comprehensive than reflecting purely on the physical impact of the disease. There has been a trend to study outcomes following surgery using generic scales for quality of life (QOL) and to compare them with disease-specific scales such as the Nurick grading system and the mJOA. We have previously compared the Nurick grading system with more comprehensive scales in patients who underwent CC for CSM [9, 10]. Rajshekhar and Muliyil [9] compared a simple scale, namely the ‘patient perceived outcome score’ (PPOS), with NGRR. Though there was good correlation between the two, there was disagreement between them in 13.5% of patients. Thakar et al. [10] compared the Nurick grade with the 36-Item Short Form Health Survey (SF-36) and WHOQOL-Bref in patients undergoing uninstrumented CC for CSM. There was a good correlation between Nurick grade and the physical component summary (PCS) score of SF-36 and the physical domain of the WHOQOL-Bref, but not with the overall SF-36 and WHOQOL-Bref scores. This again emphasizes the need for two separate scales: a disease-specific scale like Nurick grade for objective assessment of disease severity and a generic scale for multi-dimensional assessment of Health Related Quality-of-Life (HRQOL) [10]. Nurick grade in itself is inadequate for complete assessment and so is mJOA score, as evidenced by disagreement between the respective recovery rates in the present study. However, because they are simple and easy to administer, these scores are unlikely to be replaced in clinical practice by more comprehensive HRQOL scales such as SF-36 or WHOQOL-Bref.

Conclusion

Although Nurick grade and lower limb mJOA had good correlation at follow-up evaluation after surgery, there was disagreement in 11.8% (11 out of 93) of patients. The correlation was better in patients with moderate myelopathy than in those with mild or severe myelopathy. The disagreement between the scores in some patients suggests that Nurick grade and mJOA scores assess separate domains of functionality. As both are disease-specific scales, we should continue to incorporate both Nurick scale and mJOA score in evaluation of patients with CSM until we evolve a comprehensive scoring system that reflects all aspects of function in a patient.